Converting a CPU image to a GPU image takes too long

qiang-zhang-neu · March 17, 2020, 3:13am

In my project, I need to convert an itk::Image to an itk::GPUImage. However, I find that the conversion takes too long (about 3 s) in a Release build.

The code is:

		start1 = clock();
		CPUTOGPUImageFilterPointer cpuToGpuImageFilter = CPUTOGPUImageFilterType::New();
		cpuToGpuImageFilter->SetInput(m_cpuMovingImage);
		cpuToGpuImageFilter->UpdateLargestPossibleRegion();
		m_gpuMovingImage = cpuToGpuImageFilter->GetOutput();
		end1 = clock();
		std::cout << "cpu to gpu image time:" << (double)(end1 - start1) / CLOCKS_PER_SEC << std::endl;

And the information in console is:

cpu to gpu image time:3.441

Why would it cost so much time? Or is there anything wrong with my code?

matt.mccormick · March 17, 2020, 3:02pm

CPU-to-GPU transfer is costly, but that is more than expected.

It is worth investigating your hardware and the CPU-GPU bus speed.

simon.rit · March 17, 2020, 8:41pm

I’m interested in your CPUTOGPUImageFilterType. Can you give me a pointer to the corresponding filter? I can’t find it in ITK. For CUDA, we do the conversion manually:

github.com/SimonRit/RTK/blob/master/examples/FirstReconstruction/FirstCudaReconstruction.py#L67-L70

projections.SetPixelContainer(rei.GetOutput().GetPixelContainer())
projections.CopyInformation(rei.GetOutput())
projections.SetBufferedRegion(rei.GetOutput().GetBufferedRegion())
projections.SetRequestedRegion(rei.GetOutput().GetRequestedRegion())

This costs almost nothing, but the image is not on the GPU yet. I think it would be a better idea to put these operations in a filter, and I’d like to know how it’s done in the one you used.

qiang-zhang-neu · March 18, 2020, 10:01am

Actually, it is a cast filter:

	using CPUTOGPUImageFilterType = itk::CastImageFilter<CPUImageType, GPUImageType>;
	using CPUTOGPUImageFilterPointer = typename CPUTOGPUImageFilterType::Pointer;

qiang-zhang-neu · March 18, 2020, 10:04am

However, when I convert an array vector to a GPU image, it only costs about 0.05 s. If I converted the CPU image to an array vector and then converted the array vector to a GPU image, I think the total time would be less than 0.1 s (I haven’t tested it).

I just find it very weird. Why would converting a CPU image to a GPU image cost so much time? What operations does it perform?

simon.rit · March 18, 2020, 10:29am

What are the CPUImageType and the GPUImageType? I don’t think there is anything specific to the GPU in the cast image filter, so I think it’s not related to the GPUImage type. The solution I suggested above is much faster since there is no data copy, whereas the cast does a data copy.
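
To illustrate the difference between copying and sharing a pixel buffer, here is a minimal sketch using only the C++ standard library. It is not ITK code: `ToyImage`, `castCopy`, and `shareBuffer` are hypothetical names invented for this example, and the `std::shared_ptr` merely stands in for a reference-counted pixel container.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an image: metadata plus a reference-counted
// pixel buffer. This is NOT an ITK class.
template <typename TPixel>
struct ToyImage
{
  std::size_t width = 0;
  std::size_t height = 0;
  std::shared_ptr<std::vector<TPixel>> pixels;
};

// Cast-style conversion: allocates a new buffer and converts every pixel,
// so the cost grows with the number of pixels.
template <typename TOut, typename TIn>
ToyImage<TOut> castCopy(const ToyImage<TIn> & in)
{
  assert(in.pixels != nullptr);
  ToyImage<TOut> out;
  out.width = in.width;
  out.height = in.height;
  out.pixels = std::make_shared<std::vector<TOut>>(in.pixels->begin(), in.pixels->end());
  return out;
}

// Sharing conversion: copies only the metadata and the buffer handle,
// so the cost does not depend on the number of pixels.
template <typename TPixel>
ToyImage<TPixel> shareBuffer(const ToyImage<TPixel> & in)
{
  assert(in.pixels != nullptr);
  ToyImage<TPixel> out = in; // copies width, height, and the shared_ptr, not the pixels
  return out;
}
```

After `shareBuffer`, both images point at the same buffer, so a write through one is visible through the other. After `castCopy`, the two buffers are independent.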

dzenanz · March 18, 2020, 12:31pm

You can step through the code to find out.
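
Besides stepping through in a debugger, timing the same operation several times can help narrow down where the time goes. The following is a minimal sketch, assuming only the C++ standard library; `timeRuns` is a hypothetical helper name, not part of ITK.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `repeats` times and returns the elapsed
// wall-clock seconds of each run, measured with std::chrono::steady_clock.
template <typename TWork>
std::vector<double> timeRuns(TWork && work, int repeats)
{
  assert(repeats > 0);
  std::vector<double> seconds;
  seconds.reserve(static_cast<std::size_t>(repeats));
  for (int i = 0; i < repeats; ++i)
  {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    seconds.push_back(std::chrono::duration<double>(stop - start).count());
  }
  return seconds;
}
```

One way to use it would be to wrap the filter construction and update from the first post in a lambda and pass it to `timeRuns` with a few repeats. If only the first run is slow, the cost is likely some one-time setup; if every run is slow, it is more likely the conversion itself.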

qiang-zhang-neu · March 20, 2020, 1:18am

using CPUImageType = itk::Image<PixelType, Dimension>;
using GPUImageType = itk::GPUImage<PixelType, Dimension>;

Thanks, I have tested the SetPixelContainer method, and it really makes the program faster than the cast filter.