How to measure overhead of a kernel launch in CUDA

I want to measure the overhead of a kernel launch in CUDA.

I understand that there are various parameters that impact this overhead. I am interested in the following:

Number of threads created

Size of the data being copied

I am doing this to measure the advantage of using managed memory, which was introduced in CUDA 6.0. I will update this question with the code I develop and with input from the comments. Thanks!

How to measure the overhead of a kernel launch in CUDA is dealt with in Section 6.1.1 of "The CUDA Handbook" by N. Wilt. The basic idea is to launch an empty kernel. Here is a sample code snippet:

    #include <stdio.h>

    // Empty kernel: any measured time is launch overhead, not useful work.
    __global__ void emptykernel() { }

    int main() {

        const int n = 100000;

        float time, cumulative_time = 0.f;
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        for (int i = 0; i < n; i++) {

            // Time a single launch with events recorded on the default stream.
            cudaEventRecord(start, 0);
            emptykernel<<<1,1>>>();
            cudaEventRecord(stop, 0);
            cudaEventSynchronize(stop);
            cudaEventElapsedTime(&time, start, stop);
            cumulative_time = cumulative_time + time;

        }

        // Average over all n launches.
        printf("kernel launch overhead time:  %3.5f ms \n", cumulative_time / n);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        return 0;
    }

On my laptop's GeForce GT540M card, the kernel launch overhead is 0.00245 ms.

If you want to check how this time depends on the number of threads launched, just change the kernel launch configuration <<<*,*>>>. It appears that the timing does not change with the number of threads launched, which is consistent with the statement in the book that most of that time is spent in the driver.
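As a rough sketch of that experiment, the event-based timing can be wrapped in a helper and swept over several launch configurations. The helper name averageLaunchMs, the warm-up launch, and the particular block and thread counts are my own assumptions for illustration; they are not from the book. Note that the events also capture the GPU-side execution of the empty kernel, so very large grids may add some time that is not pure launch overhead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void emptykernel() { }

// Hypothetical helper: average time (ms) of one empty-kernel launch
// for a given <<<blocks, threads>>> configuration, over n launches.
static float averageLaunchMs(int blocks, int threads, int n) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch (an assumption of this sketch) so that one-time
    // initialization cost is not counted in the average.
    emptykernel<<<blocks, threads>>>();
    cudaDeviceSynchronize();

    float total = 0.f;
    for (int i = 0; i < n; i++) {
        float ms = 0.f;
        cudaEventRecord(start, 0);
        emptykernel<<<blocks, threads>>>();
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);
        cudaEventElapsedTime(&ms, start, stop);
        total += ms;
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return total / n;
}

int main() {
    const int n = 10000;
    // Illustrative configurations only; adjust to the limits of your device.
    const int blockCounts[]  = {1, 16, 256};
    const int threadCounts[] = {1, 32, 256, 1024};

    for (int b = 0; b < 3; b++) {
        for (int t = 0; t < 4; t++) {
            float ms = averageLaunchMs(blockCounts[b], threadCounts[t], n);
            printf("<<<%4d,%5d>>>  average launch time: %3.5f ms\n",
                   blockCounts[b], threadCounts[t], ms);
        }
    }
    return 0;
}
```

Compiled with nvcc, this prints one average per configuration, which makes it easy to see whether the numbers stay roughly flat on a given card.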
