Profiling: Add GPU Time column to -log_view, stop censoring host time when GPUs are used
Before this MR the alternatives were:
- -log_view_gpu_time 0: most timings are censored, even those that do not measure events sensitive to the asynchronous nature of GPU computations, or
- -log_view_gpu_time 1: lots of event synchronizations are inserted into the program, slowing it down (in Nsight I see event synchronizations on the order of a millisecond, which is often longer than the event being timed).
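To illustrate the trade-off between the two options, here is a minimal CPU-only analogy. It is not PETSc or CUDA code: a background thread stands in for asynchronous GPU work, and the names fake_kernel, time_launch_only, time_with_sync, and the 0.2 s duration are all assumptions made for illustration. It shows that a host timer around an asynchronous launch sees only the launch cost, and that capturing the real duration requires a synchronization that stalls the host.

```python
import threading
import time

KERNEL_SECONDS = 0.2  # pretend device work; arbitrary value for illustration


def fake_kernel():
    """Stand-in for device work that runs independently of the host thread."""
    time.sleep(KERNEL_SECONDS)


def time_launch_only():
    """Host timer around an asynchronous launch, with no synchronization."""
    start = time.perf_counter()
    worker = threading.Thread(target=fake_kernel)
    worker.start()  # returns once the thread has started, not when the work is done
    elapsed = time.perf_counter() - start
    worker.join()  # clean up outside the timed region
    return elapsed


def time_with_sync():
    """Host timer that waits for the work to finish before stopping."""
    start = time.perf_counter()
    worker = threading.Thread(target=fake_kernel)
    worker.start()
    worker.join()  # plays the role of an event synchronization
    return time.perf_counter() - start


launch_only = time_launch_only()
synced = time_with_sync()
print(f"launch only: {launch_only:.4f} s, with sync: {synced:.4f} s")
```

In this analogy the unsynchronized timer is cheap but does not reflect the work itself, while the synchronized timer is accurate but blocks the host for the full duration. On a real GPU the synchronization also carries its own overhead.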
This MR adds a GPU Time (s) column to -log_view. These times are censored unless -log_view_gpu_time is passed. The timings in the Time (sec) columns are no longer censored.
This MR also adds documentation to the manual about interpreting -log_view output with GPUs.