ResampleImageFilter non-determinism with threading

(Matt McCormick) #21

Great sleuthing @Simon! :male_detective:

Yes, we should call TransformIndexToPhysicalPoint to obtain more consistent results. Would you like to submit a patch?

(Simon Alexander) #22

@matt.mccormick - sure, I haven’t done that for a while and I think your tools have changed. I’ll read up on the current process !

(Bradley Lowekamp) #23

Before we jump to dropping this optimization, we need to evaluate the impact of it on performance. We may just want an option.

I am uncomfortable that the number of threads effect the result. Hopefully, there is some underlying numeric issue which could be improved to fix this. Preferring to use a proper rounding over traction may yield improved results.

(Simon Alexander) #24

@blowekamp agreed on all counts, I think a patch is the right way to start, then we can evaluate impact.

There are a number of things we can try but probably want the baseline of pointwise anyway, so we can compare the resulting drift. At the very least the multiple transforms I’m applying above could be collapsed into one operation.

By the way I will be tied up for a couple of days with other things anyway, but will look at a patch and a proper test case.

(Bradley Lowekamp) #25

I am going to take a stab a an implementation.

The propose implementation of doing the full transform involves 3 affine transformation; it essentially removes the optimized linear transform path.

I propose that for each scan-line the starting and ending index of the largest possible region be transformed into the input domain as a continuous index. Then for each pixel we simply linearly interpolate between these two transformed points. This methods is free from accumulation error, and independent of how the region is split.

(Simon Alexander) #26

@blowekamp I’m not actual proposing the above fix, it’s just a demonstration of how much drift there is but as you note removes the optimization. Apologies if I was not clear.

You’ve come up with essentially the same approach I was going to try (that was one of the “a number of things we can try”), since I noted the current implementation is sensitive to the delta computed for the first two voxels which is then used for all scanlines. At the very least an interpolation distributes the error evenly over the scanline which is better than current situation, and if this is done per scanline it wont’ be affected buy the chunking.

If you have time to do it I’m happy to leave it to you, otherwise I’ll have a look at it when I have a bit of spare time. If it matters, I have the code roughed out already for it, but not tested.

(Bradley Lowekamp) #27

I’m getting the implementation done quickly, but if you are able to work on the evaluation and performance that would be wonderful.

(Simon Alexander) #28

Sure. interpolated method removes any difference in thread count for me here, at least on these inputs. For what it’s worth it also reduced the disagreement between recomputing the input points through transform and the accumulated input point by about an order of magnitude (average case) on this input. For what it’s worth just recalculating delta per scanline does also, but that cant’ be as numerically stable as the interpolation method.

I’ll think about if there is a useful test case we can add that will stress this harder.

(Bradley Lowekamp) #29

The next thing to do is write a performance test so we can track the performance of the ResampleFilter, evaluate the impact of this change:

Please find my patch here:

I did get the following test failure locally:

2153: Test command: /scratch/blowekamp/build/ITK/bin/ITKRegistrationCommonTestDriver "itkRecursiveMultiResolutionPyramidImageFilterTest" "Resample"
2153: Test timeout computed to be: 1200
2153: false
2153: Run RecursiveMultiResolutionPyramidImageFilter in standalone mode with progress
2153: RecursiveMultiResolutionPyramidImageFilter (0x268a360)
2153:   RTTI typeinfo:   itk::RecursiveMultiResolutionPyramidImageFilter<itk::Image<short, 3u>, itk::Image<float, 3u> >
2153:   Reference Count: 2
2153:   Modified Time: 81
2153:   Debug: Off
2153:   Object Name: 
2153:   Observers: 
2153:     ProgressEvent(SimpleMemberCommand)
2153:   Inputs: 
2153:     Primary: (0x268a0d0) *
2153:   Indexed Inputs: 
2153:     0: Primary (0x268a0d0)
2153:   Required Input Names: Primary
2153:   NumberOfRequiredInputs: 1
2153:   Outputs: 
2153:     Primary: (0x2699890)
2153:     _1: (0x2699ba0)
2153:     _2: (0x2699ef0)
2153:     _3: (0x269a330)
2153:   Indexed Outputs: 
2153:     0: Primary (0x2699890)
2153:     1: _1 (0x2699ba0)
2153:     2: _2 (0x2699ef0)
2153:     3: _3 (0x269a330)
2153:   NumberOfRequiredOutputs: 4
2153:   Number Of Threads: 88
2153:   ReleaseDataFlag: Off
2153:   ReleaseDataBeforeUpdateFlag: Off
2153:   AbortGenerateData: Off
2153:   Progress: 0
2153:   Multithreader: 
2153:     RTTI typeinfo:   itk::PoolMultiThreader
2153:     Reference Count: 1
2153:     Modified Time: 42
2153:     Debug: Off
2153:     Object Name: 
2153:     Observers: 
2153:       none
2153:   CoordinateTolerance: 1e-06
2153:   DirectionTolerance: 1e-06
2153:   MaximumError: 0.1
2153:   No. levels: 4
2153:   Schedule: 
2153: [8, 4, 2]
2153: [4, 2, 1]
2153: [2, 1, 1]
2153: [1, 1, 1]
2153: Use ShrinkImageFilter= 0
2153: Progress 0
2153: Progress 0.25
2153: Progress 0.5
2153: Progress 0.75
2153: Run ImagePyramid with streamer
2153: Compare standalone and streamed outputs
2153: Streamed output is different!!!
2153: Test failed.
2/2 Test #2153: itkRecursiveMultiResolutionPyramidImageFilterWithResampleFilterTest ....***Failed    0.16 sec

(Simon Alexander) #30

I added some comments to your patch but they are all in “Draft” status.

Not sure if this is a permissions issue or I am just missing something on the interface.

(Bradley Lowekamp) #31

You need to hit the reply button there in the top middle, to make a review, and “post” it.

(Simon Alexander) #32

gotcha, done now.

(Simon Alexander) #33

hope you’ll excuse chattiness of review comments - i’m not terribly familiar with your conventions.

(Simon Alexander) #34

By the way, attempting to pull the benchmarking module through cmake runs into:

fatal: ‘submodule’ appears to be a git command, but we were not
able to execute it. Maybe git-submodule is broken?

CMake Error at CMake/ITKModuleRemote.cmake:32 (message):
Failed to init submodules in:
Call Stack (most recent call first):
CMake/ITKModuleRemote.cmake:107 (_git_clone)
CMake/ITKModuleRemote.cmake:153 (_fetch_with_git)
Modules/Remote/PerformanceBenchmarking.remote.cmake:2 (itk_fetch_module)
Modules/Remote/CMakeLists.txt:6 (include)

Is this a known issue? Something to do with my windows & git config perhaps?

(Bradley Lowekamp) #35

In the CMake configuration check which git executable is referenced.

(Simon Alexander) #36

Ah, cheers, that was it. Cmake is finding the wrong git.exe for some reason.