How to measure the overhead of a kernel launch in CUDA
I want to measure the overhead of a kernel launch in CUDA.
I understand that various parameters affect this overhead. I am interested in the following two: the number of threads created and the size of the data being copied.
I am doing this mainly to measure the advantage of using managed memory, which was introduced in CUDA 6.0. I will update this question with the code I develop and with input from the comments. Thanks!
Measuring the overhead of a kernel launch in CUDA is dealt with in Section 6.1.1 of "The CUDA Handbook" by N. Wilt. The basic idea is to launch an empty kernel. Here is a sample code snippet:
    #include <stdio.h>

    // Empty kernel: any time measured is launch overhead, not computation.
    __global__ void emptyKernel() { }

    int main() {
        const int n = 100000;
        float time, cumulative_time = 0.f;
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        for (int i = 0; i < n; i++) {
            cudaEventRecord(start, 0);
            emptyKernel<<<1,1>>>();
            cudaEventRecord(stop, 0);
            cudaEventSynchronize(stop);  // wait for the kernel to finish
            cudaEventElapsedTime(&time, start, stop);
            cumulative_time = cumulative_time + time;
        }

        printf("Kernel launch overhead time: %3.5f ms \n", cumulative_time / n);
        return 0;
    }
On my laptop with a GeForce GT540M card, the kernel launch overhead is 0.00245 ms.
If you want to check how this time depends on the number of threads launched, just change the kernel launch configuration <<<*,*>>>. It appears that the timing does not change significantly with the number of threads launched, which is consistent with the book's statement that most of that time is spent in the driver.
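To make the check on the launch configuration concrete, here is a minimal sketch that repeats the empty-kernel measurement for several <<<blocks, threads>>> settings. The helper name averageLaunchMs, the list of configurations, the iteration count and the warm-up launch are my own assumptions for illustration, not taken from the book. The largest configuration assumes a device that allows 1024 threads per block.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void emptyKernel() { }

// Hypothetical helper: average time in ms of one empty-kernel launch
// for the given configuration. Returns a negative value on a CUDA error.
static float averageLaunchMs(int blocks, int threadsPerBlock, int iterations)
{
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    // Warm-up launch (an assumption) so one-time setup cost is not timed.
    emptyKernel<<<blocks, threadsPerBlock>>>();
    cudaDeviceSynchronize();

    float total = 0.0f;
    for (int i = 0; i < iterations; i++) {
        float ms = 0.0f;
        cudaEventRecord(start, 0);
        emptyKernel<<<blocks, threadsPerBlock>>>();
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
        total += ms;
    }

    // Catch invalid launch configurations or other errors.
    cudaError_t err = cudaGetLastError();
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return (err == cudaSuccess) ? total / iterations : -1.0f;
}

int main()
{
    const int iterations = 10000;
    // {blocks, threads per block}; values chosen only for illustration.
    const int configs[][2] = { {1, 1}, {1, 32}, {1, 256},
                               {16, 256}, {256, 256}, {1024, 1024} };
    const int numConfigs = sizeof(configs) / sizeof(configs[0]);

    for (int c = 0; c < numConfigs; c++) {
        float ms = averageLaunchMs(configs[c][0], configs[c][1], iterations);
        printf("<<<%d, %d>>>: %.5f ms per launch\n",
               configs[c][0], configs[c][1], ms);
    }
    return 0;
}
```

Compile with nvcc and compare the printed averages. A negative value means a CUDA call failed, for example because the configuration is not supported by the device. For very large grids the figure may also start to include some execution time of the empty kernel itself, so treat those rows with care.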