Why ResampleImageFilter is slow?

(Matt McCormick) #21

Well done, @grothausmann.roman!

Good idea to exclude itk::SpecialCoordinatesImage.

We can switch over the interpolator type to apply the appropriate padding, i.e. 1 for nearest neighbor, 2 for linear, etc. It can be applied with PadByRadius followed by Crop with the LargestPossibleRegion of the input.

As was done with itk::WarpImageFilter, we should check all corners, instead of just the “lower-left” and “upper-right” corners. Consider this case, for example:


The lack of a regard in WarpImageFilter for the interpolator could be a bug. We should recommend using itk::ResampleImageFilter in the WarpImageFilter documentation and even consider deprecating it if ResampleImageFilter can do the job of WarpImageFilter. We should focus our efforts on optimizing and maintaining ResampleImageFilter

(grothausmann.roman) #22

Thanks @matt.mccormick for your feedback, I agree and will try to get that realized next week.

(Niels Dekker) #23

Triggered by this thread, I also had a closer look at itk::ResampleImageFilter, and found a possible way to improve its performance. Please review my patch: PERF: Made ResampleImageFilter::CastPixelWithBoundsChecking faster

My colleague Denis (@denis.p.shamonin) tested the patch with the benchmark he is currently working on, and observed more than 15% reduction of the runtime duration of a ResampleImageFilter::Update() call, for 3D images.

With 2D images we observed even more than 20% reduction of the runtime duration, e.g., by running the following example, using Visual Studio 2015 x64 Release:

#include <itkResampleImageFilter.h>
#include <chrono>
#include <iostream>

int main()
  using ImageType = itk::Image<double>;
  const auto image = ImageType::New();
  const ImageType::SizeType size = { { 25000, 25000 } };
  const ImageType::RegionType region{ size };

  const auto resampleImageFilter =
      itk::ResampleImageFilter<ImageType, ImageType>::New();

  using namespace std::chrono;
  const auto timePointBeforeUpdate = high_resolution_clock::now();

  const auto timePointAfterUpdate = high_resolution_clock::now();
  const auto durationSeconds =
    duration_cast<duration<double>>(timePointAfterUpdate - timePointBeforeUpdate);

  std::cout << "Duration: " << durationSeconds.count() << " seconds" << std::endl;

So please check http://review.source.kitware.com/#/c/23696

(Matt McCormick) #24

This is awesome! :guitar::clap:

Do you think we can improve performance further SIMD operations via an interface like xsimd?

(Denis P. Shamonin) #25

Hi @dzenanz, @matt.mccormick, @blowekamp

I’ve created the benchmarking for the ITK ResampleImageFilter: Resample benchmark

The resample benchmark includes combinations of image dimensions, interpolators, extrapolators, transforms and their combinations and etc. Therefore I’ve added some common registrations scenarios for the benchmarking in the CMakeLists using the add_resample_benchmark macro.

You can still check if you would like to benchmark different scenarios.

In general benchmark is complete and has all the things I want to know about its performance. The documentation included, it is self contained and I would like to keep it this way.

Please check the Pull request

Regards, Denis

(Matt McCormick) #26

Awesome, @denis.p.shamonin! Looks fantastic.

(Simon Warfield) #27

That looks great!
Did you come to any conclusion about what is slow and what is fast ?

I read the code, but I didn’t see if you published any benchmarks using it ?

(grothausmann.roman) #28

Many thanks @dzenanz for this hint. For just exercising this test I run ctest -R ResampleImageTest2 (as described here: https://itk.org/pipermail/insight-users/2014-January/049528.html), is that still correct?
For the simple case of just scaling as in the ResampleImageTests my changes to the itkResampleImageFilter do yield the same results when using streaming but for my own 3D test data the third axis is extended by one slice. So I guess I should extend the new tests into at least 3D. Is there any test for itkResampleImageFilter that testes its performance on 3D input?

(grothausmann.roman) #29

Folowing @dzenanz idea to reuse ResampleImageTest2 I created a streaming version of it: ResampleImageTest2s. In commit

of branch test_RSI-SDI the verification that streaming was used leads to the result that it was not, i.e. Streamed pipeline was executed 1 times which was not the expected number 8 of times.
Is the chosen approach wrong?

(Matt McCormick) #30

@grothausmann.roman we may need to write to a MetaImage file, .mha extension, since this is one of the file output formats that supports streaming.

(grothausmann.roman) #31

Many thanks @matt.mccormick for that hint, had not thought about this in the heat of the action;-) It seems that was the problem, but I have a hard time to get the newly created MHAs into the TestingData “system”. The newest instructions I could find seem to be these:
but how to test the tests before doing any uploading? Basically I am stuck at how to populate Testing/Data/Baseline/BasicFilters/ based on the commited MD5-file (manually created) and the (manually copied) e.g. ExternalData/Objects/MD5/758cd1949c26ff52d03d8036c5c629de.

(Dženan Zukić) #32

I think that newest instructions are here. The link you gave are instructions for creating a release. Those are old instructions ITK maintainers have been writing for themselves - the releases are done infrequently and humans are forgetful creatures :smiley:

(grothausmann.roman) #33

Trying to use the code of ImageAlgorithm::EnlargeRegionOverBox for this, I am facing the problem that an #include "itkIdentityTransform.h" in itkImageAlgorithm.hxx leads to a fatal error: itkIdentityTransform.h: No such file or directory. Why is it not found? Are itkImageAlgorithm.hxx and itkIdentityTransform.h in different “levels”?

(grothausmann.roman) #34

Many thanks @dzenanz, got it worked out with those instructions, at least locally. Oddly, in the current state the non-linear transform test case also apparently succeeds in streaming, which I was expecting not to be the case. Is monitor->VerifyAllInputCanStream(1) not the way to test if streaming failed? It reports:
Streamed pipeline was executed 8 times which was not the expected number 1 of times.

(Dženan Zukić) #35

itkIdentityTransform.h is in module ITKTransform, whereas itkImageAlgorithm.hxx is in ITKCommon. As Common is the most basic module, it cannot depend on other modules - all the other modules depend on it. However, Common’s tests can, and do depend on ITKTransform. Yes, you have to take care of such things when changing low level plumbing :smiley:

(grothausmann.roman) #36

I’m trying to create a test that compares the output of ResampleImageFilter without streaming with the result when streaming is employed. To avoid streaming limitations of a specific file format reader, I create and use a ramp image. However, setting the number of stream divisions to 1 leads to the error
PipelineMonitorImageFilter (0x55b0de6a2740): The input filter's BufferedRegion is not contained by LargestPossibleRegion
Setting it to 8 divisions then leads to the error:
Streamed pipeline was executed 0 times which was not the expected number 8 of times.
What could be the problem here?
How to test that streaming did not take place?

PS: In itkWarpImageFilterTest2.cxx the filter is executed twice (with a different displacement field) but the outputs of each run are not disconnected from the pipeline (DisconnectPipeline), so to my understanding result1 and result2 point to the very same output and will always match. Is that intended?

(Bradley Lowekamp) #37

I would suggest compiling in Debug mode, and turning PipelineMonitorImageFilter->DebugOn(). This will enable the monitor filter to print a bunch of debugging information to help diagnose the problem. You should also look at the source for the error message.

Looking at the code in your test, it appears that you are only changing the number of streaming divisions. You are not modifying the PipelineMonitorImageFilter, nor the ResampleImageFilter. Since both these filters have produced outputs, and the modify time of the output is new then the filters, neither on of these filters needs to execute.

(grothausmann.roman) #38

Many thanks @blowekamp for your quick reply and for pointing out the source for the second problem. What would be the best way to enforce re-execution of the ResampleImageFilter without modifying any of its processing settings? Basically I want it to execute exactly as in the first run except that it should be exercised with streaming.

(Bradley Lowekamp) #39

Just update the modified time with a call to itk::Object::Modified().

(grothausmann.roman) #40

Many thanks Brad for that hint, which did the trick. However, for numStreamDiv= 8 the pipeline is now executed only 1 time but not multiple times.
Also, I’m still stuck on how to successfully not stream with the monitor-streamer pipeline without an image reader. I rebased my tests onto v4.13.1 (daab8e5e5) to make sure my changes to ResampleImageFilter do not cause the problem, but even with the plain ResampleImageFilter of v4.13.1 I get The input filter's BufferedRegion is not contained by LargestPossibleRegion and the RequestedRegion size is empty [0, 0].

Is it expected that the monitor-streamer pipeline does not work without a reader-source that can stream?

The only test that I could find that uses SetNumberOfStreamDivisions( 1 ) uses a writer (not the StreamingImageFilter) and checks for success without PipelineMonitorImageFilter (itkVTKImageIOStreamTest.cxx).