ImageError is the count of pixel-wise differences above the intensity threshold (2 by default, I think). The rest are just regular statistics of those differences; in this case they are all equal due to a sample size of 1.
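If memory serves, the test driver’s comparison boils down to itk::Testing::ComparisonImageFilter; here is a minimal sketch of what it computes (file names are made up, and 2.0 is the default threshold as far as I recall):

```cpp
// Minimal sketch of the comparison the regression test performs
// (file names are hypothetical; thresholds are the ones I remember).
#include "itkImage.h"
#include "itkImageFileReader.h"
#include "itkTestingComparisonImageFilter.h"
#include <iostream>

int main()
{
  using ImageType = itk::Image<float, 2>;

  auto baseline = itk::ImageFileReader<ImageType>::New();
  baseline->SetFileName("baseline.mha");   // hypothetical baseline image
  auto test = itk::ImageFileReader<ImageType>::New();
  test->SetFileName("testOutput.mha");     // hypothetical test output

  using DiffType = itk::Testing::ComparisonImageFilter<ImageType, ImageType>;
  auto diff = DiffType::New();
  diff->SetValidInput(baseline->GetOutput());
  diff->SetTestInput(test->GetOutput());
  diff->SetDifferenceThreshold(2.0); // per-pixel intensity tolerance
  diff->SetToleranceRadius(0);       // no spatial slack
  diff->UpdateLargestPossibleRegion();

  // "ImageError": count of pixels differing by more than the threshold
  std::cout << "Pixels over threshold: "
            << diff->GetNumberOfPixelsWithDifferences() << "\n"
            << "Mean difference: " << diff->GetMeanDifference() << "\n";
  return 0;
}
```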
Sadly, the most likely reason for the failure is minor numerical instability, which is quite hard to trace and fix in a cross-platform, CPU-model-independent manner. You are welcome to give it a try yourself!
The CurvatureAnisotropicDiffusionImageFilterTest and LaplacianSharpeningImageFilterTest differences do not look significant to me.
I think I have seen similar resampling differences with that test. As I recall, this happens with the nearest-neighbor interpolator, which is numerically problematic when the expected value is close to a rounding boundary. This Discourse thread may shed some light:
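To make the rounding-boundary problem concrete, here is a tiny standalone illustration (not actual ITK interpolator code, whose rounding rule may differ, but the failure mode is similar):

```cpp
// Why nearest-neighbor interpolation is fragile near rounding boundaries:
// a one-ULP difference in the computed continuous index selects a
// different pixel entirely.
#include <cmath>
#include <cstdio>

int main()
{
  const double exact    = 2.5;                 // the "true" continuous index
  const double offByUlp = 2.4999999999999996;  // same value after a tiny,
                                               // platform-dependent rounding error
  std::printf("%ld vs %ld\n", std::lround(exact), std::lround(offByUlp));
  // Prints "3 vs 2": a full one-pixel shift, which the regression
  // comparison then reports as a large ImageError.
  return 0;
}
```

A one-ULP change in the computed continuous index picks a different neighbor, and a whole-pixel shift easily exceeds the intensity threshold.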
Bradley, I’ll look more closely at that discussion thread, but maybe the test results indeed depend on the number of threads: my two bots where the tests fail have 12-16 cores, whereas the ones where they pass have 2-4 cores.
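One quick way to test that hypothesis would be to pin ITK to a single thread and re-run the failing tests (a sketch assuming ITK 5’s itk::MultiThreaderBase; setting the ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS environment variable before running ctest should have a similar effect):

```cpp
// Experiment: pin ITK to one thread, then re-run the failing filter.
// If the output then matches the baseline on the 12-16 core bots,
// threading order is likely the culprit.
#include "itkMultiThreaderBase.h"

int main()
{
  itk::MultiThreaderBase::SetGlobalDefaultNumberOfThreads(1);
  // ... run the failing filter / test code here ...
  return 0;
}
```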
Tonight, I’ve set Module_ITKVtkGlue=0 on Rogue7 Debug to see if that makes the tests pass.
I still find it odd that VtkGlue changes things… is there code that conditionally uses VTK vs some other ITK code?
So the reverse experiment, turning VtkGlue off, indeed made the tests pass.
So, in summary, on both Rogue7 and Rogue17:
- setting VtkGlue ON seems to make those 3 tests reliably fail;
- setting VtkGlue OFF seems to make those 3 tests reliably pass;
- my bots use VTK master, built just before building ITK master;
- building as Debug vs Release doesn’t matter;
- building with AppleClang vs regular Clang doesn’t matter.
If it’s a numerical stability issue, I don’t see what VtkGlue has to do with it. I only looked quickly, but the test doesn’t even seem to use any VTK.
As I don’t use those classes, and work on ITK on my employer’s time/dime, I won’t be diving into this myself, but I could try any suggestions anyone has…