Converting a CPU image to a GPU image takes too long

In my project, I need to convert an itk::Image to an itk::GPUImage. However, the conversion takes too long (about 3 s) in a release build.

The code is:

		// Time the whole conversion, including filter construction.
		start1 = clock();
		CPUTOGPUImageFilterPointer cpuToGpuImageFilter = CPUTOGPUImageFilterType::New();
		cpuToGpuImageFilter->SetInput(m_cpuMovingImage);
		// Run the cast filter over the full image.
		cpuToGpuImageFilter->UpdateLargestPossibleRegion();
		m_gpuMovingImage = cpuToGpuImageFilter->GetOutput();
		end1 = clock();
		std::cout << "cpu to gpu image time:" << static_cast<double>(end1 - start1) / CLOCKS_PER_SEC << std::endl;

The console output is:

cpu to gpu image time:3.441

Why does it take so much time? Is there anything wrong with my code?

CPU-to-GPU transfer is costly, but that is more than expected.

It is worth investigating your hardware and the CPU-GPU bus speed.
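
To narrow down where the time goes, it may help to time each step separately with a monotonic wall clock. Note that std::clock is specified to report processor time, which is not always the same as elapsed time. The sketch below is only an illustration: TimeSeconds is a hypothetical helper, not part of ITK.

```cpp
#include <chrono>

// Hypothetical helper (not part of ITK): runs `work` once and returns the
// elapsed wall-clock time in seconds, measured with a monotonic clock.
template <typename Callable>
double TimeSeconds(Callable && work)
{
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

For example, wrapping only the UpdateLargestPossibleRegion() call in a lambda passed to TimeSeconds would separate the update itself from filter construction and any one-time setup.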

I’m interested in your CPUTOGPUImageFilterType. Can you give me a pointer to the corresponding filter? I can’t find it in ITK. For CUDA, we do the conversion manually


which costs almost nothing, but the image is not on the GPU yet. I think it’s a better idea to put these operations in a filter, and I’d like to know how it’s done in the one you used.

Actually, it is a cast filter:

	using CPUTOGPUImageFilterType = itk::CastImageFilter<CPUImageType, GPUImageType>;
	using CPUTOGPUImageFilterPointer = typename CPUTOGPUImageFilterType::Pointer;

However, when I convert an array vector to a GPU image, it only costs about 0.05 s. If I converted the CPU image to an array vector and then converted the array vector to a GPU image, I think the total time would be under 0.1 s (I haven’t tested it).

I just think it is very weird. Why would converting a CPU image to a GPU image take so much time? What operations does it perform?

What are the CPUImageType and the GPUImageType? I don’t think there is anything specific to the GPU in the cast image filter, so I think it’s not related to the GPUImage type. The solution I suggested above is much faster since there is no data copy, whereas the cast does a data copy.
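
The difference between the two approaches can be illustrated with a toy model. This is only a sketch in plain C++: Buffer, ToyImage, ConvertByCopy and ConvertByShare are hypothetical names, not ITK classes or functions, and the real filters do more than this.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for an image's pixel container; not an ITK class.
using Buffer = std::vector<float>;

// Hypothetical minimal "image" that only holds a shared pixel buffer.
struct ToyImage
{
  std::shared_ptr<Buffer> pixels;
};

// Cast-style conversion: allocates a new buffer and copies every pixel,
// so the cost grows with the image size.
ToyImage ConvertByCopy(const ToyImage & input)
{
  ToyImage output;
  output.pixels = std::make_shared<Buffer>(*input.pixels);
  return output;
}

// Container-sharing conversion: the output points at the same buffer,
// so the cost does not depend on the image size.
ToyImage ConvertByShare(const ToyImage & input)
{
  ToyImage output;
  output.pixels = input.pixels;
  return output;
}
```

In this model, ConvertByShare corresponds to handing over the existing pixel container, while ConvertByCopy corresponds to what a cast between two image types has to do.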

You can step through the code to find out.

	using CPUImageType = itk::Image<PixelType, Dimension>;
	using GPUImageType = itk::GPUImage<PixelType, Dimension>;

Thanks, I have tested the SetPixelContainer method, and it really makes the program faster than the cast filter.