If we compile with NVCC against a virtual architecture (PTX code) instead of a real GPU, the program can run on any GPU of that compute capability or later, but the application pays a delay at startup, as described on page 25 of the NVCC manual shipped with CUDA 7:
CUDA_Compiler_Driver_NVCC.pdf
The manual is located in the following directory: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\doc\pdf
“By specifying a virtual code architecture instead of a real GPU, nvcc postpones the assembly of PTX code until application runtime, at which the target GPU is exactly known. For instance, the command below allows generation of exactly matching GPU binary code, when the application is launched on an sm_20 or later architecture.

    nvcc x.cu --gpu-architecture=compute_20 --gpu-code=compute_20

The disadvantage of just in time compilation is increased application startup delay, but this can be alleviated by letting the CUDA driver use a compilation cache (refer to "Section 3.1.1.2. Just-in-Time Compilation" of CUDA C Programming Guide) which is persistent over multiple runs of the applications.”
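
As a rough illustration of the trade-off described in the quote, the commands below contrast a PTX-only build with a build that also embeds machine code for one real architecture. The file name x.cu and the compute_20 target come from the quoted manual. Adding sm_20 to --gpu-code is an assumption made for the sake of the example, not something these notes prescribe.

```shell
# PTX only (the command from the quoted manual): the CUDA driver
# JIT-compiles the PTX for the installed GPU when the application starts.
nvcc x.cu --gpu-architecture=compute_20 --gpu-code=compute_20

# Assumed variant: embed sm_20 machine code alongside the PTX.
# A GPU matching sm_20 can load the binary without JIT compilation,
# while later architectures still fall back to JIT-compiling the PTX.
nvcc x.cu --gpu-architecture=compute_20 --gpu-code=sm_20,compute_20
```

The second form trades a larger executable for a shorter startup on the architectures listed explicitly. GPUs that are not listed still depend on the JIT path and on the driver's compilation cache.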