Hi,
I ran itkGPUDemonsRegistrationFilterTest via ITKGPUPDEDeformableRegistrationTestDriver, and the GPU registration does not seem to work. The GPU result is almost identical to the moving image, whereas the CPU result looks good. I used the following arguments, which point to the .png files in the ExternalData folder.
itkGPUDemonsRegistrationFilterTest 2 500 PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaFix.png PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaMov.png PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaOut.png
My environment:
ITK 5.0.1
CMake 3.16.4
CUDA 10.2
Windows 10
The following options are enabled in the CMake configuration:
BUILD_EXAMPLES
BUILD_SHARED_LIBS
BUILD_TESTING
ITK_USE_GPU
I would appreciate any advice.
Thank you,
Kenji
Log:
Starting GPU Demons
Platform : NVIDIA CUDA
Total # of platform : 2
Platform 0 : NVIDIA CUDA
Platform 1 : Intel(R) OpenCL
GeForce GTX 1070
Maximum Work Item Sizes : { 1024, 1024, 64 }
Maximum Work Group Size : 1024
Alignment in bits of the base address : 4096
Smallest alignment in bytes for any data type : 128
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
Defines: #define DIM_2
#define BUFPIXELTYPE float
#define OUTPIXELTYPE float
#define PIXELDIM 2
Defines: #define DIM_2
#define OUTPIXELTYPE float
Defines: #define DIM_2
#define IMGPIXELTYPE float
#define BUFPIXELTYPE float
#define OUTPIXELTYPE float
Defines: #define blockSize 128
#define nIsPow2 1
#define T int
Defines: #define blockSize 128
#define nIsPow2 1
#define T float
Defines: #define blockSize 128
#define nIsPow2 1
#define T float
Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Registration\GPUPDEDeformable\test\itkGPUDemonsRegistrationFilterTest.cxx, line 66
Object (0000009EEECFF110): Progress: 0 Iter: 0 Metric: 1.79769e+308 RMSChange: 0
Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Registration\GPUPDEDeformable\test\itkGPUDemonsRegistrationFilterTest.cxx, line 66
Object (0000009EEECFF110): Progress: 0.002 Iter: 1 Metric: 592.802 RMSChange: 0.327976
GPU InitTime in seconds = 0.0194838
GPU ComputeUpdateTime in seconds = 0.00176787
GPU ApplyUpdateTime in seconds = 7.29561e-05
GPU SmoothFieldTime in seconds = 0.00223136
Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkObject.cxx, line 609
Object (0000009EEECFF110): Destructing!
WARNING: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkLightObject.cxx, line 206
LightObject (0000009EEECFF110): Trying to delete object with non-zero reference count.
Finished GPU Demons
Starting CPU Demons
WARNING: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkLightObject.cxx, line 206
LightObject (0000009EEECFF170): Trying to delete object with non-zero reference count.
Finished CPU Demons
Average GPU registration time in seconds = 0.794806
Average CPU registration time in seconds = 14.9423
Maximum displacement difference = 6.18324
Average displacement difference = 2.25719
Test failed
D:\workspace_itk\InsightToolkit-5.0.1-bin7\bin\Debug\ITKGPUPDEDeformableRegistrationTestDriver.exe (process 31472) exited with code 1.
To automatically close the console when debugging stops, enable Tools->Options->Debugging->Automatically close the console when debugging stops.
Press any key to close this window . . .
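For anyone reading the last lines of the log: below is a minimal sketch of how figures like "Maximum displacement difference" and "Average displacement difference" could be computed from two displacement fields. It assumes the difference is the per-pixel Euclidean distance between the GPU and CPU displacement vectors; the function name and the sample vectors are hypothetical, and this is not the test's actual code or pass/fail threshold.

```python
import math


def displacement_differences(field_a, field_b):
    """Return (max, mean) Euclidean distance between paired displacement vectors.

    Assumption: both fields are flat sequences with one vector per pixel,
    listed in the same pixel order.
    """
    if len(field_a) != len(field_b) or not field_a:
        raise ValueError("fields must be non-empty and have the same number of vectors")
    norms = [math.dist(a, b) for a, b in zip(field_a, field_b)]
    return max(norms), sum(norms) / len(norms)


# Hypothetical 2-D displacement vectors, one (dx, dy) pair per pixel.
gpu_field = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
cpu_field = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0)]

max_diff, avg_diff = displacement_differences(gpu_field, cpu_field)
print(f"Maximum displacement difference = {max_diff}")
print(f"Average displacement difference = {avg_diff}")
```

Under that assumption, an average difference of about 2.26 pixels together with a GPU output that resembles the moving image would be consistent with the GPU field staying close to zero while the CPU field converges.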