c++ - CUDA kernel and printf strange behaviour


I wrote some simple kernel code, trying to manipulate the elements of a one-dimensional array:

    #include <stdio.h>

    __global__ void loop(double *x, int cellsnum, int varnum, const double constant1)
    {
        int idx = threadIdx.x + blockDim.x * blockIdx.x;
        int i = (idx + 1) * varnum;
        double exp1, exp2, exp3, exp4;

        if (idx < cellsnum - 2) {
            exp1 = double(0.5) * (x[i + 6 + varnum] + x[i + 6]) + x[i + 10];
            exp2 = double(0.5) * (x[i + 8 + varnum] + x[i + 8]);

            if (i == 0) {
                printf("%e %e", exp1, exp2);
            }

            exp3 = x[i + 11] - constant1 * (exp1 * exp2) / x[i + 5];
            exp4 = constant1 * (x[i + 9] * exp1 - x[i + 9 - varnum] * exp2) / x[i + 5];

            // Writes back into the same array that other threads read from.
            x[i + 12] = exp3 + exp4;
        }
    }

    // Trailing underscore: wrapper intended to be callable from Fortran.
    extern "C" void cudacalc_(double *a, int *n1, int *n2, double *n3)
    {
        int cells_num = *n1;
        int var_num = *n2;
        double constant1 = *n3;

        loop<<<1, cells_num>>>(a, cells_num, var_num, constant1);
    }

But it doesn't work if I comment out this piece of code:

    if (i == 0) { printf("%e %e", exp1, exp2); }

even when the variable i is greater than zero. Commenting out these lines produces NaN in the x array. I'm trying to run this code compiled with the -arch sm_20 flag on a Tesla GPU. Maybe someone can help me with this issue?

This kernel has the opportunity for a race condition, because the kernel code is both reading from x and writing to x with no synchronization or protection.

The simplest way to fix this is to separate the output statement so that it writes to a different array:

    xo[i+12]=exp3+exp4;
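As a minimal sketch of that suggestion, the kernel below keeps the arithmetic from the question but reads only from `x` and writes only to a second buffer `xo`. The kernel name `loop_out_of_place`, the host wrapper `run_loop_out_of_place`, and the `CUDA_CHECK` macro are hypothetical names introduced for illustration; the Fortran-callable `cudacalc_` wrapper from the question is not reproduced, and the caller is assumed to pass an array large enough for every index the kernel touches.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Same arithmetic as the kernel in the question, but all reads come from x
// and the only write goes to xo, so no thread can read a value that another
// thread has already overwritten.
__global__ void loop_out_of_place(const double *x, double *xo, int cellsnum,
                                  int varnum, double constant1)
{
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    int i = (idx + 1) * varnum;

    if (idx < cellsnum - 2) {
        double exp1 = 0.5 * (x[i + 6 + varnum] + x[i + 6]) + x[i + 10];
        double exp2 = 0.5 * (x[i + 8 + varnum] + x[i + 8]);
        double exp3 = x[i + 11] - constant1 * (exp1 * exp2) / x[i + 5];
        double exp4 = constant1 * (x[i + 9] * exp1 - x[i + 9 - varnum] * exp2) / x[i + 5];
        xo[i + 12] = exp3 + exp4;
    }
}

// Hypothetical host wrapper: copies x to the device, runs the kernel once,
// and returns the contents of xo. Assumes x is large enough for every index
// used above and that cellsnum fits in a single thread block.
std::vector<double> run_loop_out_of_place(const std::vector<double> &x,
                                          int cellsnum, int varnum,
                                          double constant1)
{
    size_t bytes = x.size() * sizeof(double);
    double *d_x = nullptr;
    double *d_xo = nullptr;
    CUDA_CHECK(cudaMalloc(&d_x, bytes));
    CUDA_CHECK(cudaMalloc(&d_xo, bytes));
    CUDA_CHECK(cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice));
    // Start xo as a copy of x so elements the kernel never writes keep their values.
    CUDA_CHECK(cudaMemcpy(d_xo, d_x, bytes, cudaMemcpyDeviceToDevice));

    loop_out_of_place<<<1, cellsnum>>>(d_x, d_xo, cellsnum, varnum, constant1);
    CUDA_CHECK(cudaGetLastError());       // launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // execution errors

    std::vector<double> out(x.size());
    CUDA_CHECK(cudaMemcpy(out.data(), d_xo, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_x));
    CUDA_CHECK(cudaFree(d_xo));
    return out;
}
```

Calling `run_loop_out_of_place(x, cellsnum, varnum, constant1)` from host code returns a copy of the input in which only the `i+12` elements have been updated. The single-block launch mirrors the question, so `cellsnum` must not exceed the device's maximum number of threads per block.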

cuda-memcheck can help check for race conditions within a kernel. Use cuda-memcheck --help to find the specific racecheck options.

