Hi.
I am using RTK for iterative reconstruction. Everything works except for running on the GPU. The code compiles cleanly and runs, but the GPU-enabled programs are slower than their CPU counterparts. nvtop shows the process listed on the GPU, but GPU utilization stays at 0% while the CPU is heavily loaded. This happens with my own RTK code as well as with the applications that ship with RTK, such as rtkfdk, admmtv, conjugategradient, etc. I did a clean build of ITK with the flags ITK_USE_GPU and RTK_USE_CUDA enabled. Below is a list of system details (for reference) and the display from nvtop. Any pointers or advice would be appreciated.
Ross
OS: Ubuntu 22.04
Graphics card: NVIDIA RTX 500
Driver Version: 535.183.01
CUDA Version: 12.2