Is it possible to use the GPU for the resample filter in SimpleITK on Windows?

Is it possible to use the GPU for image resampling in SimpleITK or ITK?

I checked the previous post, where one of the answers suggested using “itk-elastic-opencl”, but it seems to work only on Linux.

Is there any other way to accelerate the resample filter on Windows?

My final goal is to get a resampling result in less than 0.5 s.
Is that possible?

@TY_Park Not ITK specifically, but you could check out clEsperanto, which offers GPU-accelerated image processing across languages and platforms. It works in Python, among other languages, so I expect it could be combined with ITK/SimpleITK via NumPy.

Hello @TY_Park,

The ResampleImageFilter is only implemented for the CPU, using a multi-threaded approach. To improve its performance, you can increase the number of threads via the SetNumberOfThreads method.

Note that increasing the number of threads does not always reduce the run time (it depends on data size, thread context-switching overhead, etc.).

GPUs can do linear interpolation on 3D images extremely quickly, but it is only helpful for your project if all of these are true:

  • linear interpolation is sufficient (if you may need to zoom into the image significantly, linear is not sufficient; you need higher-order interpolation to avoid diamond-shaped artifacts)
  • your complete image fits into GPU memory (if you need to process the image piece by piece, moving each piece in and out of GPU memory, then the performance gain is lost)
  • you have the rest of your processing and display pipeline implemented on the GPU (if you need to move the processed data back to CPU memory for further processing, then send it back to the GPU for display, etc., the performance gain of using a GPU for resampling may become insignificant)
  • you don’t need a long-term, multi-platform solution (you would need to implement custom shaders, but OpenGL and OpenCL are already being phased out, especially on macOS; Vulkan is still immature and not widely supported yet; Metal is macOS-only; CUDA is NVIDIA-only)

If all these limitations are acceptable for your project, then it may make sense to implement GPU-based interpolation, as it can be much faster. However, for most applications, a well-optimized multi-threaded CPU implementation is a better choice overall, because it is fast enough and you don’t need to deal with any of the limitations described above.
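
For the memory condition in the list above, a quick back-of-the-envelope check can be sketched as follows. It only counts the input and output volumes; the function names, the example sizes, and the 4 GiB budget are assumptions for illustration, and a real GPU needs extra headroom for other buffers.

```python
import math


def image_bytes(size, bytes_per_voxel):
    """Raw memory footprint of a volume with the given voxel dimensions."""
    return math.prod(size) * bytes_per_voxel


def fits_on_gpu(input_size, output_size, bytes_per_voxel, gpu_budget_bytes):
    """Return (fits, needed_bytes) for holding the input and output volumes together."""
    needed = image_bytes(input_size, bytes_per_voxel) + image_bytes(output_size, bytes_per_voxel)
    return needed <= gpu_budget_bytes, needed


# Assumed example: a 512^3 float32 volume upsampled to 1024^3, with a 4 GiB budget.
fits, needed = fits_on_gpu((512, 512, 512), (1024, 1024, 1024), 4, 4 * 1024**3)
print(f"needs {needed / 1024**3:.1f} GiB, fits: {fits}")  # needs 4.5 GiB, fits: False
```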
