itkGPUDemonsRegistrationFilterTest fails

Hi,

I ran itkGPUDemonsRegistrationFilterTest via ITKGPUPDEDeformableRegistrationTestDriver, and the GPU registration does not seem to work. The GPU result is almost identical to the moving image, even though the CPU result looks good. I used the following arguments, which point to the .png files in the external data folder.

itkGPUDemonsRegistrationFilterTest 2 500 PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaFix.png PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaMov.png PATH\TO\InsightToolkit-5.0.1-bin7\ExternalData\Modules\Registration\GPUPDEDeformable\test\Input\LenaOut.png

My environment is…
ITK 5.0.1
CMake 3.16.4
CUDA 10.2
Windows 10

The following options are enabled in the CMake configuration:
BUILD_EXAMPLES
BUILD_SHARED_LIBS
BUILD_TESTING
ITK_USE_GPU

I would appreciate any advice.
Thank you,

Kenji

log

Starting GPU Demons
Platform : NVIDIA CUDA
Total # of platform : 2
Platform 0 : NVIDIA CUDA
Platform 1 : Intel® OpenCL
GeForce GTX 1070
Maximum Work Item Sizes : { 1024, 1024, 64 }
Maximum Work Group Size : 1024
Alignment in bits of the base address : 4096
Smallest alignment in bytes for any data type : 128
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
Defines: #define DIM_2
#define BUFPIXELTYPE float
#define OUTPIXELTYPE float
#define PIXELDIM 2

Defines: #define DIM_2
#define OUTPIXELTYPE float

Defines: #define DIM_2
#define IMGPIXELTYPE float
#define BUFPIXELTYPE float
#define OUTPIXELTYPE float

Defines: #define blockSize 128
#define nIsPow2 1
#define T int

Defines: #define blockSize 128
#define nIsPow2 1
#define T float

Defines: #define blockSize 128
#define nIsPow2 1
#define T float

Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Registration\GPUPDEDeformable\test\itkGPUDemonsRegistrationFilterTest.cxx, line 66
Object (0000009EEECFF110): Progress: 0 Iter: 0 Metric: 1.79769e+308 RMSChange: 0

Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Registration\GPUPDEDeformable\test\itkGPUDemonsRegistrationFilterTest.cxx, line 66
Object (0000009EEECFF110): Progress: 0.002 Iter: 1 Metric: 592.802 RMSChange: 0.327976

GPU InitTime in seconds = 0.0194838
GPU ComputeUpdateTime in seconds = 0.00176787
GPU ApplyUpdateTime in seconds = 7.29561e-05
GPU SmoothFieldTime in seconds = 0.00223136
Debug: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkObject.cxx, line 609
Object (0000009EEECFF110): Destructing!

WARNING: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkLightObject.cxx, line 206
LightObject (0000009EEECFF110): Trying to delete object with non-zero reference count.

Finished GPU Demons

Starting CPU Demons
WARNING: In D:\workspace_itk\InsightToolkit-5.0.1\Modules\Core\Common\src\itkLightObject.cxx, line 206
LightObject (0000009EEECFF170): Trying to delete object with non-zero reference count.

Finished CPU Demons
Average GPU registration time in seconds = 0.794806
Average CPU registration time in seconds = 14.9423
Maximum displacement difference = 6.18324
Average displacement difference = 2.25719
Test failed

D:\workspace_itk\InsightToolkit-5.0.1-bin7\bin\Debug\ITKGPUPDEDeformableRegistrationTestDriver.exe (process 31472) exited with code 1.
To automatically close the console when debugging stops, enable Tools->Options->Debugging->Automatically close the console when debugging stops.
Press any key to close this window . . .

Hi

I found that m_RMSChange in GPUFiniteDifferenceImageFilter stays at 0.0, which terminates the registration early. The value is instead written to m_RMSChange in FiniteDifferenceImageFilter by the ApplyUpdate method of GPUDemonsRegistrationFilter.

I added the following in GPUFiniteDifferenceImageFilter.h. I am not sure this is the best solution, but the GPU registration worked.

/** Set/Get the maximum error allowed in the solution. This may not be
defined for all solvers and its meaning may change with the application. */
itkSetMacro( MaximumRMSError, double );
itkGetConstReferenceMacro( MaximumRMSError, double );

/** Set/Get the root mean squared change of the previous iteration. May not
be used by all solvers. */
itkSetMacro( RMSChange, double );
itkGetConstReferenceMacro( RMSChange, double );

Thank you,
Kenji


Hi Kenji,

Welcome to the ITK community!

Thanks for discussing your issues and posting the follow-up.

Could your change be submitted as a patch?

I will.

Kenji
