cuda - What dimensions should the cuRAND initialization kernel have?
I am working on a program in which there are 2 main kernels.
Because of the impact on performance, each kernel has its own dimensions, so I have 2 different block and grid sizes (whose values cannot be known at compile time).
Both kernels need to use the cuRAND library, so before they run a third kernel is launched to initialize the cuRAND state on the device.
My question comes when I need to choose the dimensions of this kernel.
Let's say I have kernels 1 and 2:
block_size_1 = 256
grid_size_1 = 10
block_size_2 = 512
grid_size_2 = 2
For the cuRAND initialization kernel, should I use the largest sizes (10*512), or the highest number of threads (10*256)?
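Pick the biggest kernel size, meaning the largest total thread count (block size times grid size), because that is the maximum number of cuRAND generators you'll use. With the sizes in the question that is 10*256 = 2560 states, not 10*512. You can easily evaluate the size you need using something like: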
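    #include <ctime>
    #include <cuda_runtime.h>
    #include <curand_kernel.h>

    // One generator state per thread; the thread index is used as the sequence number.
    __global__ void setup_curand(curandState *state, unsigned long seed)
    {
        int id = threadIdx.x;
        curand_init(seed, id, 0, &state[id]);
    }

    // block_size1, grid_size1, block_size2 and grid_size2 are assumed to be set elsewhere at run time.
    __host__ int fun()
    {
        curandState *randState;
        // Largest total thread count of the two kernels.
        int myCurandSize = (block_size1 * grid_size1 > block_size2 * grid_size2)
                               ? block_size1 * grid_size1
                               : block_size2 * grid_size2;

        cudaError_t error = cudaMalloc((void **)&randState, myCurandSize * sizeof(curandState));
        if (error == cudaErrorMemoryAllocation) {
            cudaDeviceReset();
            return 1;
        }

        // Single-block launch: only valid while myCurandSize fits in one block (see the edit below).
        setup_curand<<<1, myCurandSize>>>(randState, unsigned(time(NULL)));

        // ... launch kernel 1 and kernel 2 here ...

        // Don't forget to free the space.
        cudaFree(randState);
        return 0;
    }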
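Edit: this assumes block_size * grid_size does not go over the maximum number of threads per block. Otherwise, you can do the same while keeping a grid and block dimension as well, launching the same total number of threads with setup_curand<<<x, y>>>(...); in that case the kernel has to compute id as blockIdx.x * blockDim.x + threadIdx.x instead of threadIdx.x.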
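As a rough illustration of the multi-block variant mentioned in the edit, here is a minimal sketch. The helper names (states_needed, blocks_for, use_curand, run_example), the fixed seed and the 256-thread block used for initialization are assumptions made for this example only and do not come from the original post. The idea is that the initialization kernel does not have to copy either kernel's launch shape: it only has to cover the largest total thread count, and each consumer kernel then indexes the state array with its own global thread id.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Hypothetical helper: number of states needed to cover both launches.
__host__ int states_needed(int block1, int grid1, int block2, int grid2)
{
    int n1 = block1 * grid1;
    int n2 = block2 * grid2;
    return n1 > n2 ? n1 : n2;
}

// Hypothetical helper: blocks required to cover n threads (rounded up).
__host__ int blocks_for(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

// Initializes one state per global thread id; the bounds check lets the
// launch shape be chosen independently of kernels 1 and 2.
__global__ void setup_curand(curandState *state, unsigned long long seed, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n)
        curand_init(seed, id, 0, &state[id]);
}

// Stand-in for kernel 1 or kernel 2: draws one uniform number per thread.
__global__ void use_curand(curandState *state, float *out, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id >= n)
        return;
    curandState local = state[id];    // work on a register copy
    out[id] = curand_uniform(&local);
    state[id] = local;                // save the advanced state for later launches
}

// Runs the whole sequence with the sizes from the question; returns 0 on success.
int run_example()
{
    const int block_size_1 = 256, grid_size_1 = 10;
    const int block_size_2 = 512, grid_size_2 = 2;
    const int n = states_needed(block_size_1, grid_size_1, block_size_2, grid_size_2);

    curandState *d_state = nullptr;
    float *d_out = nullptr;
    if (cudaMalloc(&d_state, n * sizeof(curandState)) != cudaSuccess)
        return 1;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_state);
        return 1;
    }

    // Arbitrary block size for initialization; the grid is sized to cover n states.
    const int init_block = 256;
    setup_curand<<<blocks_for(n, init_block), init_block>>>(d_state, 1234ULL, n);

    // Each kernel keeps its own launch shape and touches only the states it needs.
    use_curand<<<grid_size_1, block_size_1>>>(d_state, d_out, block_size_1 * grid_size_1);
    use_curand<<<grid_size_2, block_size_2>>>(d_state, d_out, block_size_2 * grid_size_2);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    cudaFree(d_out);
    cudaFree(d_state);
    return err == cudaSuccess ? 0 : 1;
}
```

Calling run_example() from main is enough to try it. Kernel 2 only uses the first 1024 of the 2560 states here, which is harmless: the extra states just stay at whatever point kernel 1 left them.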