Strange behavior of itk::RegularExpressionSeriesFileNames

Hello,

I came across a strange behavior of itk::RegularExpressionSeriesFileNames. Either I am missing something, or there is a bug somewhere.

I have the following directory structure:

projections
├── projections0000.mhd
├── projections0000.raw
├── projections0001.mhd
├── projections0001.raw
├── projections0002.mhd
├── projections0002.raw
├── ...
├── projections0359.mhd
└── projections0359.raw
projections_cpp
├── projections0000.mhd
├── projections0000.raw
├── projections0001.mhd
├── projections0001.raw
├── projections0002.mhd
├── projections0002.raw
├── ...
├── projections0359.mhd
└── projections0359.raw

Then, I run the following code:

#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>  // std::cout

#include <itkRegularExpressionSeriesFileNames.h>

int main(int argc, char * argv[])
{
  itk::RegularExpressionSeriesFileNames::Pointer names = itk::RegularExpressionSeriesFileNames::New();
  names->SetDirectory("projections");
  names->SetNumericSort(false);
  names->SetRegularExpression(".mhd");
  names->SetSubMatch(0);

  for(auto filename : names->GetFileNames()){
    std::cout << filename << std::endl;
  }

  std::cout << "-------" << std::endl;

  itk::RegularExpressionSeriesFileNames::Pointer names2 = itk::RegularExpressionSeriesFileNames::New();
  names2->SetDirectory("projections_cpp");
  names2->SetNumericSort(false);
  names2->SetRegularExpression(".mhd");
  names2->SetSubMatch(0);

  for(auto filename : names2->GetFileNames()){
    std::cout << filename << std::endl;
  }

  return EXIT_SUCCESS;
}

I expect the two itk::RegularExpressionSeriesFileNames objects to return the file names in the same alphabetical order. However, I get some strange differences, as illustrated below:

I understand that in that specific case, one should rather use a numeric sort. But still, shouldn’t the order be exactly the same in the two cases?

I believe that the problem, if there is one, lies somewhere around those lines of code: ITK/Modules/IO/ImageBase/src/itkRegularExpressionSeriesFileNames.cxx at commit 9dc398562a0ecfa792efbb6744e166f0c0e0d069 in InsightSoftwareConsortium/ITK on GitHub.

Let me know what you think. Thanks!

Maybe the order is dependent on the file system? You did SetNumericSort(false).

Your regular expression is “.mhd”. The ‘.’ matches any single character, and it is followed by “mhd”. The “match” will therefore be the same for each file, so the sort has nothing to distinguish them and the results come back in the default ordering from the file system.

Try “.*mhd” for the regular expression.
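To make the point above concrete, here is a minimal sketch of what each pattern actually matches on these file names. It uses std::regex from the standard library as a stand-in, on the assumption that it treats these two simple patterns the same way as the regex engine inside ITK; MatchedKey is a hypothetical helper, not an ITK function.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text matched by `pattern` inside `filename`
// (the whole match, i.e. sub-match 0), or an empty string if nothing matches.
// std::regex is used here as a stand-in for ITK's own regex engine.
std::string MatchedKey(const std::string & filename, const std::string & pattern)
{
  std::smatch match;
  if (std::regex_search(filename, match, std::regex(pattern)))
  {
    return match[0].str();
  }
  return std::string();
}
```

With “.mhd”, MatchedKey returns “.mhd” for projections0000.mhd and for projections0359.mhd alike, so every file has an identical sort key. With “.*mhd”, it returns the full file name, which gives the alphabetic sort something to compare.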

I think you don’t have a submatch, which is why the alphabetic ordering does not work. I would try “(.*).mhd”, where the parentheses indicate the submatch location.
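As a rough illustration of sorting on a parenthesized submatch, the sketch below keys each file name on a capture group and sorts on that key. It is not ITK's implementation: SortBySubMatch is a hypothetical helper, std::regex stands in for ITK's regex engine, and std::stable_sort is chosen deliberately so that files with identical keys visibly keep their input order.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ITK): keeps the names that match `pattern`
// and returns them sorted alphabetically by the chosen sub-match.
// Sub-match 0 is the whole match; sub-match 1 is the first parenthesized group.
std::vector<std::string> SortBySubMatch(const std::vector<std::string> & names,
                                        const std::string & pattern,
                                        std::size_t subMatch)
{
  using KeyedName = std::pair<std::string, std::string>; // (sort key, file name)

  const std::regex re(pattern);
  std::vector<KeyedName> keyed;
  for (const std::string & name : names)
  {
    std::smatch match;
    if (std::regex_search(name, match, re) && subMatch < match.size())
    {
      keyed.emplace_back(match[subMatch].str(), name);
    }
  }

  // Compare keys only. With identical keys, stable_sort leaves the input
  // order untouched, which mimics "whatever order the file system returned".
  std::stable_sort(keyed.begin(), keyed.end(),
                   [](const KeyedName & a, const KeyedName & b) { return a.first < b.first; });

  std::vector<std::string> sorted;
  for (const KeyedName & entry : keyed)
  {
    sorted.push_back(entry.second);
  }
  return sorted;
}
```

In this sketch, calling SortBySubMatch(names, "(.*).mhd", 1) gives the same order no matter how names was ordered on input, whereas SortBySubMatch(names, ".mhd", 0) simply echoes the input order. In the original program, the corresponding change would presumably be the new pattern together with SetSubMatch(1), assuming sub-match 1 selects the parenthesized group there as well.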
