Hi @zivy ,
Thanks very much for your detailed response and code. I tried running the code you provided and it seems that the only difference is that while I set up the optimizer with:
regMethod.SetOptimizerAsGradientDescent(
learningRate=1.0,
numberOfIterations=500,
estimateLearningRate=regMethod.Once
)
you set it up with:
regMethod.SetOptimizerAsGradientDescent(
learningRate=1.0,
numberOfIterations=100,
convergenceMinimumValue=1e-6,
convergenceWindowSize=10
)
(I can see that I had not defined the values of my variables learningRate and numIters in my original post, but I've included them explicitly in the snippet above.)
The values you set for convergenceMinimumValue and convergenceWindowSize are the defaults, so my method effectively used the same values, and likewise my value for estimateLearningRate is .Once (the default). So, apart from the maximum number of iterations (500 v 100), as far as I can tell our methods are identical.
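Spelling the call out with the defaults made explicit, both setups amount to the following (as far as I can tell from the current SimpleITK defaults; only numberOfIterations differs):

regMethod.SetOptimizerAsGradientDescent(
    learningRate=1.0,
    numberOfIterations=500,  # 100 in your version
    convergenceMinimumValue=1e-6,  # default
    convergenceWindowSize=10,  # default
    estimateLearningRate=regMethod.Once  # default
)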
I cropped the moving (CT) image rather than the fixed (MR) image (perhaps that's what you meant as well?). I did so by eye, selecting min/max indices that roughly crop movIm to fixIm's extent:
movImCropped = movIm[25:290, 115:390, 198:250]
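(A less manual alternative, which should give roughly the same box provided the two volumes' axes are roughly aligned, would be to map fixIm's corner voxels into movIm's index space. The helper below is only a sketch along those lines, not what I actually ran:)

def crop_to_fixed_extent(movIm, fixIm):
    # Physical positions of two opposite corners of fixIm.
    sz = fixIm.GetSize()
    p0 = fixIm.TransformIndexToPhysicalPoint((0, 0, 0))
    p1 = fixIm.TransformIndexToPhysicalPoint((sz[0] - 1, sz[1] - 1, sz[2] - 1))
    # Corresponding indices in movIm, clamped to its extent
    # (assumes near axis-aligned direction cosines for both volumes).
    i0 = movIm.TransformPhysicalPointToIndex(p0)
    i1 = movIm.TransformPhysicalPointToIndex(p1)
    lo = [max(0, min(a, b)) for a, b in zip(i0, i1)]
    hi = [min(s - 1, max(a, b)) for a, b, s in zip(i0, i1, movIm.GetSize())]
    return movIm[lo[0]:hi[0] + 1, lo[1]:hi[1] + 1, lo[2]:hi[2] + 1]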
Here is a plot of two frames from fixIm and movImCropped, as well as fixIm and movIm for comparison.
It depends on how we subjectively define a good/poor/bad/terrible registration, but based on 10 repeated runs of registering movIm to fixIm, and of movImCropped to fixIm, I would summarise my results as follows:
movIm to fixIm => 1 decent result, 9 poor/bad results
movImCropped to fixIm => 4 decent, 4 poor/bad, 2 terrible
The results are here if you're interested.
I fully accept that 10 runs is not a large enough number to draw any firm conclusions, but I find it surprising that the registrations of movImCropped were just as bad as, and in some cases much worse than, those of movIm.
I also tried implementing the ITKv4 framework:
optimizedTx = sitk.AffineTransform(3)

regMethod = sitk.ImageRegistrationMethod()
# ITKv4 style: the landmark-based initial transform is applied to the moving
# image, and only optimizedTx is optimized.
regMethod.SetMovingInitialTransform(initialTx)
regMethod.SetInitialTransform(optimizedTx, inPlace=False)
# ... (same metric, optimizer, multi-res, interpolator settings as before)

optimizedTx = regMethod.Execute(fixIm, movIm)

# Compose the optimized and initial transforms for resampling.
finalTx = sitk.CompositeTransform(optimizedTx)
finalTx.AddTransform(initialTx)
but those results were no better (I seem to recall getting successful results using the ITKv4 framework for landmark-initialized registrations, but perhaps the datasets were different).
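For what it's worth, I then apply the composed transform along these lines to bring the CT onto the MR grid (a minimal sketch, assuming finalTx from the snippet above):

regIm = sitk.Resample(movIm, fixIm, finalTx, sitk.sitkLinear, 0.0, movIm.GetPixelID())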
I played around with samplingPercentage and came to some surprising results. Using samplingPercentage = 0.5 (10 repeated runs), about 50% of the runs had terrible results, 40% were “bad” and 10% “decent”. Using samplingPercentage = 0.05 (10 repeated runs), 55% were decent and 45% were bad (none were terrible). And using samplingPercentage = 0.01 (10 repeated runs), 80% were decent and 20% were poor (none were terrible).
I accept that there's vagueness in “decent”, “poor”, “bad” and “terrible”; not only have I not defined how I assign these scores, I also didn't assess them in a rigorous or consistent way. Rather, I made very quick subjective assessments. Nonetheless, there seemed to be an inverse relationship between the sampling percentage and the quality of the registration.
So I decided to repeat the 1% runs, this time with 50 repeated runs: 66% were decent, 24% were poor, 10% were bad, and none were as bad as the “terrible” results from the runs using 50% sampling.
These results seem counter-intuitive to me: with more (randomly selected) samples, I would expect a greater likelihood that samples fall within actual structure as well as noise, rather than in noise alone. But perhaps that assumption is wrong.
Can anyone comment on my findings and explain why I might have found that more samples result in worse registrations?
A final point:
I got much better (and more consistent) registrations using BSplines with LandmarkBasedTransformInitializer than using an affine transform with LandmarkBasedTransformInitializer.
The BSpline method uses SetOptimizerAsLBFGS2 (when using scale factors) or SetOptimizerAsLBFGSB (when not), as opposed to SetOptimizerAsGradientDescent for the affine registration. But they both use the same metric: SetMetricAsMattesMutualInformation.
(I found that for some datasets the BSpline registrations worked well enough with a 5% samplingPercentage, where the affine registrations needed 50%.)
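For context, the BSpline route looks roughly like this (a sketch from memory; the landmark lists fixedPts/movingPts, the control-point count and the optimizer settings shown are illustrative rather than exactly what I used):

# Landmark-based initialization of a BSpline transform.
bsplineTx = sitk.LandmarkBasedTransformInitializer(
    sitk.BSplineTransform(3), fixedPts, movingPts,
    referenceImage=fixIm, numberOfControlPoints=8
)

regMethod = sitk.ImageRegistrationMethod()
regMethod.SetMetricAsMattesMutualInformation()
regMethod.SetMetricSamplingStrategy(regMethod.RANDOM)
regMethod.SetMetricSamplingPercentage(0.05)
regMethod.SetInterpolator(sitk.sitkLinear)
regMethod.SetInitialTransform(bsplineTx, inPlace=True)
# LBFGSB when not using scale factors (LBFGS2 when using them).
regMethod.SetOptimizerAsLBFGSB(gradientConvergenceTolerance=1e-5, numberOfIterations=100)
bsplineTxFinal = regMethod.Execute(fixIm, movIm)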
Given that the metric used for the affine and BSpline registrations was the same, why should the BSpline perform so much better? Might it be down to the optimizer rather than the metric?