c++ - CUDA kernel and printf strange behaviour
I wrote some simple kernel code that manipulates the elements of a one-dimensional array:
#include "stdio.h"

__global__ void loop(double *x, int cellsnum, int varnum, const double constant1)
{
    // Global thread index; i is the offset of this thread's cell in x.
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    int i = (idx + 1) * varnum;
    double exp1, exp2, exp3, exp4;

    if (idx < cellsnum - 2)
    {
        exp1 = double(0.5) * (x[i + 6 + varnum] + x[i + 6]) + x[i + 10];
        exp2 = double(0.5) * (x[i + 8 + varnum] + x[i + 8]);

        if (i == 0)
        {
            printf("%e %e", exp1, exp2);
        }

        exp3 = x[i + 11] - constant1 * (exp1 * exp2) / x[i + 5];
        exp4 = constant1 * (x[i + 9] * exp1 - x[i + 9 - varnum] * exp2) / x[i + 5];

        // Result is written back into the same array that is read above.
        x[i + 12] = exp3 + exp4;
    }
}

// Host wrapper with C linkage; all arguments are passed by pointer.
extern "C" void cudacalc_(double *a, int *n1, int *n2, double *n3)
{
    int cells_num = *n1;
    int var_num = *n2;
    double constant1 = *n3;

    loop<<<1, cells_num>>>(a, cells_num, var_num, constant1);
}
But it doesn't work if I comment out this piece of code:
if(i==0) { printf("%e %e",exp1,exp2) ; }
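even though the variable i is greater than zero, so the printf never actually runs. With these lines commented out, the code produces NaN values in the x array. I'm running the code compiled with the -arch sm_20 flag on a Tesla GPU. Can anyone help me with this issue?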
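This kernel has the opportunity for a race condition, because the kernel code is both reading from x and writing to x with no synchronization or protection. One thread may read an element before or after another thread has overwritten it, so the result can depend on thread scheduling.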
The simplest way to fix this is to separate the output statement so that it writes to a different array:
xo[i+12]=exp3+exp4;
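As a rough sketch of that suggestion, the program below repeats the arithmetic from the question but takes a read-only input x and a separate output xo. The kernel name loop_separate_out, the sizes (cellsnum = 64, varnum = 13), the value of constant1, and the fill pattern are all assumptions chosen only so that the example runs on its own and every index stays in bounds; they are not taken from the original code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Same arithmetic as the kernel in the question, but x is only read and the
// result goes to a separate array xo, so no thread reads a value that another
// thread may be writing. (loop_separate_out is a hypothetical name.)
__global__ void loop_separate_out(const double *x, double *xo, int cellsnum,
                                  int varnum, double constant1)
{
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    int i = (idx + 1) * varnum;

    if (idx < cellsnum - 2) {
        double exp1 = 0.5 * (x[i + 6 + varnum] + x[i + 6]) + x[i + 10];
        double exp2 = 0.5 * (x[i + 8 + varnum] + x[i + 8]);
        double exp3 = x[i + 11] - constant1 * (exp1 * exp2) / x[i + 5];
        double exp4 = constant1 * (x[i + 9] * exp1 - x[i + 9 - varnum] * exp2) / x[i + 5];
        xo[i + 12] = exp3 + exp4;
    }
}

int main()
{
    const int cellsnum = 64;         // assumed number of cells
    const int varnum = 13;           // assumed variables per cell (offsets 0..12)
    const double constant1 = 0.1;    // assumed value
    const size_t n = size_t(cellsnum) * varnum;
    const size_t bytes = n * sizeof(double);

    // Arbitrary nonzero input so the division by x[i + 5] is well defined.
    std::vector<double> h(n);
    for (size_t k = 0; k < n; ++k) h[k] = 1.0 + 0.001 * double(k);

    double *d_x = nullptr, *d_xo = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_xo, bytes);
    cudaMemcpy(d_x, h.data(), bytes, cudaMemcpyHostToDevice);
    // Start the output as a copy of the input so untouched entries keep their values.
    cudaMemcpy(d_xo, d_x, bytes, cudaMemcpyDeviceToDevice);

    loop_separate_out<<<1, cellsnum>>>(d_x, d_xo, cellsnum, varnum, constant1);
    cudaError_t err = cudaGetLastError();              // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h.data(), d_xo, bytes, cudaMemcpyDeviceToHost);
    printf("xo[%d] = %e\n", varnum + 12, h[varnum + 12]);

    cudaFree(d_x);
    cudaFree(d_xo);
    return 0;
}
```

The cost of this approach is a second device array of the same size, and the caller has to read the results from xo (or swap the two pointers) rather than from x.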
cuda-memcheck can check for race conditions within a kernel. Use cuda-memcheck --help to find the specific racecheck options.