How to measure the overhead of a kernel launch in CUDA




I want to measure the overhead of a kernel launch in CUDA.

I understand that various parameters impact this overhead. I am interested in the following:

- the number of threads created
- the size of the data being copied

I am doing this to measure the advantage of using managed memory, which was introduced in CUDA 6.0. I will update this question with code as I develop it and based on the comments. Thanks!

How to measure the overhead of a kernel launch in CUDA is dealt with in Section 6.1.1 of "The CUDA Handbook" by N. Wilt. The basic idea is to launch an empty kernel. Here is a sample code snippet:

#include <stdio.h>

// Empty kernel: any measured time is launch overhead, not computation.
__global__ void emptyKernel() { }

int main() {
    const int N = 100000;
    float time, cumulative_time = 0.f;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    for (int i = 0; i < N; i++) {
        cudaEventRecord(start, 0);
        emptyKernel<<<1,1>>>();
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);   // wait until the stop event has completed
        cudaEventElapsedTime(&time, start, stop);   // elapsed time in ms
        cumulative_time = cumulative_time + time;
    }

    printf("Kernel launch overhead time: %3.5f ms \n", cumulative_time / N);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}

On my laptop with a GeForce GT540M card, the kernel launch overhead is 0.00245 ms.

If you want to check the dependence of this time on the number of threads launched, just change the kernel launch configuration <<<*,*>>>. It appears that the timing does not change significantly with the number of threads launched, which is consistent with the book's statement that most of that time is spent in the driver.
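To try several launch configurations in one run, the same event-based timing can be wrapped in a helper and called in a loop. The following is only a minimal sketch: the helper name `averageLaunchMs`, the iteration count, and the block and thread counts are assumptions chosen for illustration, and the warm-up launch is an addition that is not in the snippet above.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Empty kernel: any measured time is launch overhead, not computation.
__global__ void emptyKernel() { }

// Hypothetical helper: average time in ms of `iterations` empty launches
// using the launch configuration <<<blocks, threads>>>.
static float averageLaunchMs(int blocks, int threads, int iterations)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time initialization is not counted (an assumption
    // of this sketch, not part of the original measurement).
    emptyKernel<<<blocks, threads>>>();
    cudaDeviceSynchronize();

    float total = 0.f;
    for (int i = 0; i < iterations; i++) {
        float ms = 0.f;
        cudaEventRecord(start, 0);
        emptyKernel<<<blocks, threads>>>();
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
        total += ms;
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return total / iterations;
}

int main()
{
    const int iterations = 10000;                    // assumed sample count
    const int blockCounts[]  = {1, 16, 256};         // assumed grid sizes
    const int threadCounts[] = {1, 32, 256, 1024};   // assumed block sizes

    for (int b = 0; b < 3; b++) {
        for (int t = 0; t < 4; t++) {
            float ms = averageLaunchMs(blockCounts[b], threadCounts[t], iterations);
            printf("<<<%d,%d>>>: %3.5f ms\n", blockCounts[b], threadCounts[t], ms);
        }
    }
    return 0;
}
```

Compile it with `nvcc` and compare the printed averages across configurations. The absolute numbers will depend on the GPU, the driver and the operating system, so they should be read as relative indications only.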

cuda
