[SimpleITK] Registration of non-medical multi-modal images

Hi everyone,

I am starting to learn how to use the toolkit for image registration.
I have a pair of thermal-visual images that need to be registered. Each image has a different dimensions and with a 90% of the scene is overlapping.

The content below is the structure of each image:

Reference Optical image:
origin: (0.0, 0.0)
size: (1936, 1216)
spacing: (0.26458333333333334, 0.26458333333333334)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1
######################################################
Thermal image:
origin: (0.0, 0.0)
size: (640, 480)
spacing: (0.26458333333333334, 0.26458333333333334)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1

To perform the registration, I am following the notebook 05_basic_registration and I am not achieving good results. First of all, I read the images as float32 and then I create a initial transform as shown below.

With the initial transform, It’s possible to note that the entire moving image is mapped at the center of the fixed image. I don’t know why. For the second step, I am using the same code as the tutorial with minor changes, as follows:

registration_method = sitk.ImageRegistrationMethod()

# Similarity metric settings.
registration_method.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
registration_method.SetMetricSamplingStrategy(registration_method.RANDOM)
registration_method.SetMetricSamplingPercentage(1)

registration_method.SetInterpolator(sitk.sitkLinear)

# Optimizer settings.
registration_method.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=500, convergenceMinimumValue=1e-6, convergenceWindowSize=100)
# registration_method.SetOptimizerScalesFromPhysicalShift()

# Setup for the multi-resolution framework.            
# registration_method.SetShrinkFactorsPerLevel(shrinkFactors = [4,2,1])
# registration_method.SetSmoothingSigmasPerLevel(smoothingSigmas=[2,1,0])
# registration_method.SmoothingSigmasAreSpecifiedInPhysicalUnitsOn()

# Don't optimize in-place, we would possibly like to run this cell multiple times.
registration_method.SetInitialTransform(initial_transform, inPlace=False)

# Connect all of the observers so that we can perform plotting during registration.
registration_method.AddCommand(sitk.sitkStartEvent, rgui.start_plot)
registration_method.AddCommand(sitk.sitkEndEvent, rgui.end_plot)
registration_method.AddCommand(sitk.sitkMultiResolutionIterationEvent, rgui.update_multires_iterations) 
registration_method.AddCommand(sitk.sitkIterationEvent, lambda: rgui.plot_values(registration_method))

final_transform = registration_method.Execute(fixed_image_2, moving_image)

# Always check the reason optimization terminated.
print('Final metric value: {0}'.format(registration_method.GetMetricValue()))
print('Optimizer\'s stopping condition, {0}'.format(registration_method.GetOptimizerStopConditionDescription()))

Can I perform such kind of image registration with sampleITK? If yes, what could be wrong in my example?

I was able to achieve good results using the Matlab image registration toolkit with Mattes Mutual Information and with the same parameters.

Thanks in advance,
Paulo

Your 640x480 image should have 3x bigger spacing than your 1936x1216 image, because they have similar physical extent. Maybe keep everything the same, but add right after reading from disk: smallImage.SetSpacing(0.8,0.8); (or something similar to this) to force a proper spacing.

By the pixel physical extent, you mean the real world dimensions?

In this case, each pixel, in the 640x480 image, should be mapping proportionally the scene in the same way as the 1936x1216 image, right?
If my assumption is correct, to find the correct spacing, I would have to make some calculations based on the camera intrinsics parameters and lens to find the exact pixel physical extent, right?

After setting the spacing as you suggested my initial transform was better nut the registration stills very poorly.

If you do this right, you can constrain the registration to only use translation or translation + rotation, which is easier to do and should probably lead to better results.

This is a step that we are trying to avoid due to project specifications and also the complexity of calibrating a thermal camera.

I will try to find the correct spacing and see the results.

Using approximately correct spacing + rigid transform should but a lot better than what you have shown in the initial post.

Well… After @dzenanz suggestions of fixing the image spacing, a got better initial transformations. Nevertheless, I still didn’t get any good registration.

I will explain what I have been trying to do step-by-step with some code snippets.

Step 1) Read the images as float32 and correct the spacing as suggested.

Fixed and moving images parameters after correction:

Reference Fixed image:
origin: (0.0, 0.0)
size: (1936, 1216)
spacing: (0.26458333333333334, 0.26458333333333334)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1

Moving image:
origin: (0.0, 0.0)
size: (640, 480)
spacing: (0.8, 0.6701)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1

Step 2) Created a centered initial transform using Similarity2DTransform, AffineTransform or Euler2DTransform.

initial_transform = sitk.CenteredTransformInitializer(fixed_image, moving_image,
                                              sitk.Similarity2DTransform(),
                                              sitk.CenteredTransformInitializerFilter.GEOMETRY)

gui.RegistrationPointDataAquisition(fixed_image, moving_image, figure_size=(10,4), known_transformation=initial_transform);

With the RegistrationPointDataAquisition, I verified the initial registration with some landmarks. The points were near the correct position. In this case, I don’t there is a need for some heavy computation to find the registration transform.

Step 3) Registration

As the images are multi-modality, I am using the MattesMutualInformation method. I tried using the regular and random strategy and increasing/decreasing the number of histogram bins but, all without success. Additionally, I had to increase converge window size to stop early termination. Please, see the code snippet below.

registration_method = sitk.ImageRegistrationMethod()
# Similarity metric settings.
registration_method.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
registration_method.SetMetricSamplingStrategy(registration_method.RANDOM)
registration_method.SetMetricSamplingPercentage(1)
registration_method.SetInterpolator(sitk.sitkLinear)

# Optimizer settings.
registration_method.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=1000, convergenceMinimumValue=1e-6, convergenceWindowSize=200)

registration_method.SetInitialTransform(initial_transform, inPlace=False)
final_transform = registration_method.Execute(roi_fixed_image, roi_moving_image)

The results from the code snippet above weren’t good at all as the following pictures demonstrate.
I haven’t got any run that the registration algorithm converged.

Final metric value: -0.24377469376639058
Optimizer’s stopping condition, GradientDescentOptimizerv4Template: Convergence checker passed at iteration 250.

I created a lot of landmarks to see where my moving images are going and consequently my ROI.

As I didn’t get good results using the full scene, I decided to crop both images into regions of interest (ROI).

Step 4 Extract the image ROI

Screenshot from 2021-02-11 13-53-25

Fixed image ROI:
origin: (0.0, 0.0)
size: (491, 902)
spacing: (0.26458333333333334, 0.26458333333333334)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1

Moving image ROI:
origin: (0.0, 0.0)
size: (178, 386)
spacing: (0.8, 0.6701)
direction: (1.0, 0.0, 0.0, 1.0)
pixel type: 32-bit float
number of pixel components: 1

Step 5 Experiment with multiple registration methods on the ROI image

The next step is the same as presented in steps 2 and 3 but, at this time, using the ROI images.

For the initial transformation, the ROI images seem to be off by scaling, translation and, rotation.

I tried executing the registration algorithm using 3 different initial transforms. I will present the result below for each one.

Similarity2DTransform

AffineTransform

Euler2DTransform

Neither of the tested initial transformations had success. I wasn’t able to get any signal that the method was going to converge.

Step 6) Manual registration

I performed a manual registration with OpenCV with pretty good accuracy. By selecting 3 pairs of points, I was able to find an Affine matrix with the function getAffineTransformation. The mentioned matrix did a good registration for almost the entire scene. Then, I tried to replicate the same code with simpleITK. I used the function LandmarkBasedTransformInitializer to create an affine transformation from the selected points.

From this approach, I got the same result as I got with OpenCV. I tried to use this matrix as an initialization transform and used the same algorithm presented in step 3. It didn’t show any considerable improvement. Here is the image of the registration from the 3 points.

Selecting 3 points to perform the manual registration

Registration result

Sorry for the long post.
Could someone give me some hints of how can I achieve the same result from the manual registration using steps 2 and 3? What can I be doing wrong or what am I missing?

Any help is welcome :slight_smile:

Thanks,
Paulo

Hello @PauloFavero,

OK, some observations:

  1. Step 1 looks good.
  2. Step 2: You are using a similarity transformation, you should use an AffineTransform. Rationale: working with two “video” cameras, geometric model is a pinhole camera. The relevant transformation model is a homography - planar projective mapping.
  3. Step 3 problem is here: registration_method.SetOptimizerScalesFromPhysicalShift()
    is commented out or removed, that will make or break your registration. It needs to be there for most registration tasks.
    I would also use inPlace=True just out of convenience.

Hello @zivy,

Thanks for your help.

I have tried the AffineTransformation on step 5 and, it wasn’t providing good results. Your suggestion to use the SetOptimizerScalesFromPhysicalShift helped a lot in terms of convergence. The metric value x iterations graphic is way better after calling this method.

I noticed that the initial transform plays a fundamental role in the registration success. The key to getting everything working was to set the initial transformation as the identity and also setting the convergenceWindowSize to 50.

I am getting good results using the NCC (Normalized Cross-Correlation) or the MattesMutualInformation methods.