LabelStatisticsImageFilter floating point exception

Hello,

Line #170 in the LabelStatisticsImageFilter implementation here:

is causing a floating point exception and crashing my application. I think the reason is a numerically unstable calculation of the variance, which results in a negative value being passed to std::sqrt(). In my case, I have a signed 16-bit image with a uniform intensity of -100 over a large background region, whose variance the filter computes as approximately -2e-12.

Can anyone else confirm this is a bug? If so, maybe the direct formula (based on the definition of variance) could be adopted, as discussed in the link below?
https://www.johndcook.com/blog/2008/09/28/theoretical-explanation-for-numerical-results/

Is there another filter that can provide a workaround for calculating the standard deviation while this issue is being resolved?

–Fijoy

This filter was recently updated to use Kahan Summation which should improve the accuracy of the computation.

What is the size of your data? How big is your image, and how many pixels are -100?

Also, are you using the current(-ish) ITKv5 master or the prior implementation of the filter from ITKv4?

I'm using ITKv4, 4.10.1 to be specific. My image is 512x512x236. Most of it (60838487 pixels) is background. This means the code is computing the variance as (608384870000 - (-6083848700 * -6083848700) / 60838487) / (60838487 - 1.0).

Kahan Summation will not resolve this issue, will it? I think the use of the formula above is the issue.
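To illustrate the cancellation, here is a standalone sketch with plain doubles (not the filter code) that plugs in the numbers above: the one-pass sum-of-squares formula lands just below zero, while the definition-based two-pass formula stays at exactly zero for constant data.

#include <cmath>
#include <cstdio>

int main()
{
  const long long n = 60838487;  // background pixel count from above
  const double    v = -100.0;    // constant background intensity

  // Accumulate the sum and sum-of-squares as the filter does. For this
  // constant input both accumulations happen to be exact in double precision.
  double sum = 0.0;
  double sumOfSquares = 0.0;
  for (long long i = 0; i < n; ++i)
  {
    sum += v;
    sumOfSquares += v * v;
  }

  // One-pass formula: var = (sumSq - sum*sum/n) / (n - 1).
  // sum*sum (~3.7e19) cannot be represented exactly, and the cancellation
  // leaves a tiny negative residue (about -2e-12 here).
  const double varOnePass = (sumOfSquares - sum * sum / n) / (n - 1.0);

  // Two-pass (definition-based) formula: never negative in exact arithmetic,
  // and exactly zero for constant data.
  const double mean = sum / n;
  double       acc = 0.0;
  for (long long i = 0; i < n; ++i)
  {
    acc += (v - mean) * (v - mean);
  }
  const double varTwoPass = acc / (n - 1.0);

  std::printf("one-pass variance: %g\n", varOnePass);            // slightly negative
  std::printf("two-pass variance: %g\n", varTwoPass);            // 0
  std::printf("sqrt(one-pass):    %g\n", std::sqrt(varOnePass)); // NaN; a crash if FP trapping is enabled
}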


You are right, that squaring and dividing is brutal for numerical stability. I implemented the Kahan summation to improve the accuracy of the mean. Let me look at the Welford method…
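For reference, Welford's online update looks roughly like the following (a minimal standalone sketch with illustrative names, not the ITK implementation):

struct RunningStats
{
  long long count = 0;
  double    mean = 0.0;
  double    m2 = 0.0; // running sum of squared deviations from the current mean

  void Add(double x)
  {
    ++count;
    const double delta = x - mean;
    mean += delta / count;
    m2 += delta * (x - mean); // note: uses the *updated* mean
  }

  double SampleVariance() const
  {
    return (count > 1) ? m2 / (count - 1) : 0.0;
  }
};

For constant input m2 stays exactly zero (each contribution delta * (x - mean) vanishes), so the variance can never go negative.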


Can you please create an issue on GitHub with this information?

Thanks.

Please find a fix for this sort of negative value here:

Thank you for the link to the Welford method of computing variance. However, ITK uses a parallel divide-and-conquer approach to compute the variance across multiple threads. It is not clear to me how this recurrence-based method can be adapted to chunked data.
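For what it's worth, the standard way to combine per-chunk partial statistics pairwise (as in Chan et al.) keeps (count, mean, M2) per chunk; the following is a sketch of the idea only, not the ITK code:

struct ChunkStats
{
  long long count = 0;
  double    mean = 0.0;
  double    m2 = 0.0; // sum of squared deviations within the chunk
};

// Combine two partial results computed independently (e.g. by two threads).
ChunkStats Merge(const ChunkStats & a, const ChunkStats & b)
{
  ChunkStats out;
  out.count = a.count + b.count;
  if (out.count == 0)
  {
    return out;
  }
  const double delta = b.mean - a.mean;
  out.mean = a.mean + delta * static_cast<double>(b.count) / static_cast<double>(out.count);
  out.m2 = a.m2 + b.m2 +
           delta * delta * static_cast<double>(a.count) * static_cast<double>(b.count) /
             static_cast<double>(out.count);
  return out;
}

Each thread can run the Welford recurrence on its own chunk, the partial results can then be merged in any order, and the sample variance at the end is m2 / (count - 1).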

Thank you, Bradley. Sorry I couldn't get back right away and create a GitHub issue due to some deadlines at work.

I hope I addressed your issue.

Would you be able to share the data set which produced the issue? It would be good to be able to test that the issue is addressed.

The data set is not mine (or even my company's) to share, unfortunately. It should be very easy to reproduce, though, by creating a large image with a constant intensity value of -100 (or some such number) and running the LabelStatisticsImageFilter on it.

The filter requires two inputs: an intensity image and a label or segmentation image. What should the label image be?

And what do you mean by large…

Create a 512x512x250 image with mostly (at least 60 million) background pixels of constant intensity value -100, and a small foreground region. Run it through itk::LabelMap to create a label map of the image, and pass the image and the label map to the statistics filter.
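Something along these lines should reproduce it; as a simplification (my assumption, not exactly the steps above) it fills a label image with a single label value of 1 everywhere instead of going through itk::LabelMap:

#include "itkImage.h"
#include "itkLabelStatisticsImageFilter.h"
#include <iostream>

int main()
{
  using IntensityImage = itk::Image<short, 3>;
  using LabelImage = itk::Image<unsigned char, 3>;

  IntensityImage::SizeType size = {{512, 512, 250}};

  auto intensity = IntensityImage::New();
  intensity->SetRegions(size);
  intensity->Allocate();
  intensity->FillBuffer(-100); // large constant-valued background

  auto labels = LabelImage::New();
  labels->SetRegions(size);
  labels->Allocate();
  labels->FillBuffer(1); // one label covering the whole image

  using StatsFilter = itk::LabelStatisticsImageFilter<IntensityImage, LabelImage>;
  auto stats = StatsFilter::New();
  stats->SetInput(intensity);
  stats->SetLabelInput(labels);
  stats->Update();

  std::cout << "variance: " << stats->GetVariance(1) << std::endl; // expected 0, may come out tiny and negative
  std::cout << "sigma:    " << stats->GetSigma(1) << std::endl;    // sqrt of that variance
  return 0;
}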

I was curious about this statement, so I had a look at the file: ITK/Modules/Filtering/ImageStatistics/include/itkLabelStatisticsImageFilter.hxx at master · InsightSoftwareConsortium/ITK · GitHub

Unless I very much misunderstand things, I'm not seeing Kahan summation. Am I misreading this?

It looks like it was the StatisticsImageFilter which had the improved summation algorithm:

Perhaps the LabelStatisticsImageFilter could benefit from it too?
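In rough terms that change amounts to swapping the plain double accumulator for itk::CompensatedSummation and reading the result back with GetSum(); a sketch of the idea only, not the actual filter code:

#include "itkCompensatedSummation.h"
#include <vector>

double CompensatedSum(const std::vector<double> & values)
{
  // Plain version, prone to absorbing small contributions:
  //   double sum = 0.0;
  //   for (double v : values) { sum += v; }

  itk::CompensatedSummation<double> sum; // starts at zero
  for (const double v : values)
  {
    sum += v; // carries a running correction term alongside the sum
  }
  return sum.GetSum();
}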

Hello @blowekamp! I thought I'd have a go at adding CompensatedSummation to LabelStatisticsImageFilter. I haven't quite succeeded yet, so I could use some advice.

The first thing I did was to find test data that would break a naive double sum but work with compensated summation. Using the Hacker News post "Here is a complete simplified Kahan summation test and indeed it works with -O3 ..." as inspiration, I created the following test. As the comment notes, it fails when using a plain "double sum" (each 1e-20 increment is below half an ulp of 1.0, so it gets absorbed and the naive sum stays at 1.0), but it succeeds 100% of the time (on my machine) using CompensatedSummation as written.

// Demonstration: fails with a plain double accumulator, passes with CompensatedSummation
#include <gtest/gtest.h>

#include "itkCompensatedSummation.h"

TEST(StatisticsImageFilter, test0)
{
  // double sum = 1.0;                    // naive accumulator: every 1e-20 is absorbed
  itk::CompensatedSummation<double> sum;  // starts at zero
  sum += 1.0;
  for (int i = 0; i < 1000000; ++i)
  {
    sum += 1e-20;
  }

  // Exact sum is 1 + 1,000,000 * 1e-20 = 1 + 1e-14
  EXPECT_EQ(sum.GetSum(), 1 + 1e-14);
}

Next, I used the same data in a one megapixel image and wrote a StatisticsImageFilter test [1] – which also works. Third, I created a corresponding test using LabelStatisticsImageFilter [2] – which, pleasingly, does fail.

Finally I changed LabelStatisticsImageFilter to use CompensatedSummation – but my new test still fails. I'm not sure whether I've made a boo-boo in the code or in the test. I like to think that the test is sufficiently simple to be bulletproof, but maddeningly: of the two tests in [2], the sum always fails but "mean" occasionally passes. So I may be expecting too much precision?

Full changes here: Comparing InsightSoftwareConsortium:master...smr99:master · InsightSoftwareConsortium/ITK · GitHub

-Steve

[1] ITK/itkStatisticsImageFilterGTest.cxx at master · smr99/ITK · GitHub
[2] ITK/itkLabelStatisticsImageFilterGTest.cxx at master · smr99/ITK · GitHub

Hello,

Thank you for working on this :+1:

The code changes to the filter look correct. Does the precision of the results change with the changes? One line that I am suspicious of is the following:
m_sumtest_sum = 1 + 1e-14 - 1e-20; // 1 + 999,999 * 1e-20
I believe addition and subtraction are applied left to right here, so it would be "(1 + 1e-14) - 1e-20", which is right at the limit of double precision. You may get a more accurate expected value by computing "1 + (1e-14 - 1e-20)".

Please consider making a PR to ITK with the improvement.


Adding the parentheses does not change the outcome on my machine.

Also, note that the same test is running "successfully" for the StatisticsImageFilter. This suggests to me that there is some difference in how the two filters compute the sum – but I cannot see what could make the difference.

One thing I have noticed is that the result does vary across runs. When the test fails, the output is something like

Expected equality of these values:
  filter->GetSum(1)
    Which is: 1.0000000000000122
  m_sumtest_sum
    Which is: 1.00000000000001

However, the least significant figures in the "GetSum" result (122, above) do vary across runs; e.g. 131, 138, etc. I don't understand what might account for variability in the test results. There's nothing deliberately random in the test code. Can you think of something that would bring randomness? Does the pipeline randomize the image decomposition during streaming? I guess even if not, the order of merging the sub-problems might vary. But even if that were the case – why doesn't the same phenomenon appear in the StatisticsImageFilter test case?

-Steve

P.S. My intention is indeed to create a PR once I can get a working set of tests.

If you have built ITK with the TBB multithreader, that will likely cause different parallel decompositions from run to run. I don't know why it doesn't affect StatisticsImageFilter the same way.

The way the data is merged between threads is not deterministic:

This will be true for TBB or the default threader.

You may want to set the number of threads to 1 to remove this variability. With a single thread the "merge" code is also not used, which will help narrow down where the difference from the expected results is located.
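For example, assuming the ITKv5 API (ITKv4 used SetNumberOfThreads instead of SetNumberOfWorkUnits):

#include "itkMultiThreaderBase.h"

// Per filter: restrict to a single work unit so no merge step runs.
// (filter is the LabelStatisticsImageFilter instance under test.)
filter->SetNumberOfWorkUnits(1);

// Or globally, before any filters run:
itk::MultiThreaderBase::SetGlobalDefaultNumberOfThreads(1);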


Using 1 thread does make the problem go away. So there must be some issue with the merging.

To investigate whether a simple re-ordering of the terms in the sum might play a role, I changed "test0" to put the '1' term last:

TEST(StatisticsImageFilter, test0)
{
  itk::CompensatedSummation<double> sum;  // starts at zero
  for (int i = 0; i < 1000000; ++i)
  {
    sum += 1e-20;
    if (i == 999999)
    {
      sum += 1.0;  // add the '1' term last
    }
  }

  EXPECT_EQ(sum.GetSum(), 1 + 1e-14);
}

The test still passes. So whatever the problem with the merge, it shouldn't be simply because of re-ordering the terms in the sum.

The other weird thing that bothers me: the computed sum is larger than expected; e.g.

Expected equality of these values:
  filter->GetSum(1)
    Which is: 1.0000000000000124
  m_sumtest_sum
    Which is: 1.00000000000001

If there were some issue in the summation, I'd expect the error to be truncation towards zero and thus a smaller value, not a larger one? I don't think this is a rounding issue: one ulp at this magnitude is 2**-52, i.e. 1 + 2**-52 = 1.0000000000000002, so the result (above) of ...124 is off by roughly a dozen ulps.
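A quick standalone check of those ulp figures:

#include <cmath>
#include <cstdio>
#include <limits>

int main()
{
  const double eps = std::numeric_limits<double>::epsilon();      // 2^-52, one ulp at 1.0
  std::printf("1 + ulp(1) = %.17g\n", std::nextafter(1.0, 2.0));  // 1.0000000000000002
  std::printf("error ~ %.1f ulp\n",
              (1.0000000000000124 - 1.00000000000001) / eps);     // about a dozen ulps
}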

Another data point: I removed the '1' from the sum test, so that all the million entries are 1e-20. The sum is then consistently 1e-14 as expected (or close enough; it is something like 9.9999999999999905e-15). Even with merging.

Just in case someone goes looking for my code: I moved it to a new branch GitHub - smr99/ITK at topic-label-stats-compensated