GDCMSeriesFileNames fails to read folder with non ascii characters

antoine.letouzey · October 31, 2019, 3:15pm

I’m having trouble with code that worked just fine for years. I’m trying to open a DICOM dataset from Japan and my code cannot see any files in the series folder.

The code is (with placeholders for seriesUID and patient name):

... 
auto seriesUID = "1.2.3.4";
const std::string seriesPath = "F:\\Data\\Images\\Pat_MAXXXXXX MIXXXX=丸山___=マル___\\Study_a1753f72\\Series_10b36a87";

typedef itk::ImageSeriesReader<itk::Image<short, 3>> t_ImageSeriesReader;
t_ImageSeriesReader::Pointer imageSeriesReader = t_ImageSeriesReader::New();

typedef itk::GDCMImageIO t_ImageIOType;
t_ImageIOType::Pointer imageIO = t_ImageIOType::New();
imageSeriesReader->SetImageIO(imageIO);

typedef itk::GDCMSeriesFileNames t_NamesGeneratorType;
t_NamesGeneratorType::Pointer namesGenerator = t_NamesGeneratorType::New();
namesGenerator->SetDirectory(seriesPath);
imageSeriesReader->ReleaseDataBeforeUpdateFlagOn();
imageSeriesReader->AbortGenerateDataOff();

imageSeriesReader->SetFileNames(namesGenerator->GetFileNames(seriesUID));
...

On the last line, GetFileNames() returns an empty list, but I can see that there are multiple files in the folder.
When I rename the folder to remove the Japanese characters, loading works fine.

Is there something that I am missing here? I am using (and kinda stuck with) ITK 4.13.

nathanm · March 10, 2021, 4:35pm

I have exactly the problem described here: I’m trying to open a DICOM dataset (in my case with Chinese characters in the directory) and GDCMSeriesFileNames won’t find the files in the folder.

When I remove the Chinese characters it works properly, but modifying the directory to remove non-ASCII characters is not an option for me. Any idea on this issue? I’m using the latest ITK, 5.2.

Thanks,
Nathan

matt.mccormick · March 10, 2021, 4:50pm

Hi @nathanm ,

Perhaps setting the locale in itk::GDCMSeriesFileNames, similar to the linked issue, addresses the problem?

nathanm · March 10, 2021, 5:02pm

Hi @matt.mccormick, thank you for the prompt reply.

I checked the link you provided and it seems pretty much related. How should I try this? Is it a branch of ITK to be merged with the bug fix? Or how do I set the locale?

Nathan

matt.mccormick · March 10, 2021, 5:25pm

Hi Nathan,

To try a fix, set up a repository for contributing to ITK and a local build (more information can be found in The ITK Software Guide). Then, as the patch did in itkGDCMImageIO.cxx, add std::locale currentLocale = std::locale::global(std::locale::classic()); before and std::locale::global(currentLocale); after the relevant blocks in itkGDCMSeriesFileNames.cxx.
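The save-and-restore pattern described above can be sketched as a small RAII helper. This is only an illustration using the standard library: ScopedClassicLocale and NameOfLocaleInsideGuard are hypothetical names, not part of ITK or GDCM, and whether this change actually resolves the non-ASCII directory problem is not established here.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (not part of ITK): switches the global C++ locale to
// the classic "C" locale and restores the previous one when it goes out of scope.
class ScopedClassicLocale
{
public:
  ScopedClassicLocale()
    : m_Previous(std::locale::global(std::locale::classic())) // global() returns the old locale
  {}
  ~ScopedClassicLocale() { std::locale::global(m_Previous); }

  ScopedClassicLocale(const ScopedClassicLocale &) = delete;
  ScopedClassicLocale & operator=(const ScopedClassicLocale &) = delete;

private:
  std::locale m_Previous;
};

// Stand-in for a "relevant block": reports the name of the global locale
// that is active while the guard is alive.
std::string NameOfLocaleInsideGuard()
{
  ScopedClassicLocale guard;
  return std::locale().name(); // a default-constructed locale copies the global one
}
```

Compared with the two explicit statements, the guard also restores the locale if the guarded block exits early or throws.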

Thanks,
Matt

lassoan · March 10, 2021, 8:03pm

In 3D Slicer, we switched to UTF-8 application code page and it solved all problems related to non-ASCII characters (except passing non-ASCII text via command-line arguments on Windows). On Linux and macOS UTF-8 code page should be the default, while on Windows you can enable this in the application manifest.

nathanm · March 11, 2021, 1:59pm

Hi @matt.mccormick,

I tried a similar fix to the one you linked, with no success so far…

Hi @lassoan,

In order to enable this application manifest in Windows (WindowsApplicationUseUtf8.manifest), what’s the procedure to follow exactly? I’m using Visual Studio C++ to create a dynamic library (.dll).

Thank you,
Nathan

lassoan · November 14, 2021, 11:48pm

It cannot be set in a DLL; it must be set in the application. See how it is done in CTK at the link that I provided above.
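Since the setting belongs to the application, a library can at most check at run time whether its host process uses the UTF-8 code page. A minimal sketch, assuming the Win32 GetACP() call and the CP_UTF8 constant from windows.h; ProcessUsesUtf8CodePage is a hypothetical name, and the non-Windows branch simply assumes UTF-8, following the earlier remark about Linux and macOS defaults.

```cpp
#ifdef _WIN32
#  include <windows.h>
#endif

// Hypothetical helper: returns true if the active code page of the current
// process is UTF-8 (for example, enabled through the application manifest).
bool ProcessUsesUtf8CodePage()
{
#ifdef _WIN32
  return GetACP() == CP_UTF8; // CP_UTF8 is code page 65001
#else
  // Assumption: on Linux and macOS the narrow-character encoding is treated
  // as UTF-8 and is not actually queried here.
  return true;
#endif
}
```

A DLL could call this once at load time and log a warning if it returns false, instead of failing silently on non-ASCII paths.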

codeling · June 4, 2024, 12:29pm

Sorry to revive such an old topic, but I wanted to remark here that your post, @lassoan, on adding the manifest was most helpful in finally getting proper UTF-8 support on Windows! Thanks a lot!

The only minor nuisance remaining now (in Debug mode) is a number of debug assertion failures, which sometimes (interestingly, not always) occur when Unicode characters are present in a file name, coming from the CanReadFile method of NiftiImageIO:

Debug Assertion Failed!
...
File: minkernel\crts\ucrt\src\appcrt\convert\isctype.cpp
Line: 36

Expression: c >= -1 && c <= 255

Stack trace:

 	ucrtbased.dll...
 	...
	ITKIONIFTI-5.4.dll!make_lowercase(char * str) Line 3327	C
	ITKIONIFTI-5.4.dll!nifti_find_file_extension(const char * name) Line 2637	C
	ITKIONIFTI-5.4.dll!nifti_validfilename(const char * fname) Line 2567	C
	ITKIONIFTI-5.4.dll!is_nifti_file(const char * hname) Line 3471	C
	ITKIONIFTI-5.4.dll!itk::NiftiImageIO::CanReadFile(const char * FileNameToRead) Line 887	C++
	ITKIOImageBase-5.4.dll!itk::ImageIOFactory::CreateImageIO(const char * path, itk::CommonEnums::IOFileMode mode) Line 55	C++

But we can easily live with those.

lassoan · June 5, 2024, 5:51pm

The error seems to be due to a bug in niftilib - in how it prepares input data for isupper or tolower. Since you can already reproduce the problem on your computer, the best would be if you could fix it and submit a pull request to ITK.
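For reference, the usual way to avoid this class of assertion is to convert each char to unsigned char before passing it to a character-classification or case-conversion function. The following is a minimal sketch of that idiom, not niftilib's actual code; ToLowerBytes is a hypothetical name.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not niftilib's make_lowercase): lowercases a string
// byte by byte without passing negative values to std::tolower.
std::string ToLowerBytes(std::string text)
{
  for (char & c : text)
  {
    // Where char is signed, bytes of UTF-8 multi-byte sequences are negative.
    // Passing them directly to std::tolower is undefined behavior and trips
    // the "c >= -1 && c <= 255" check in the MSVC debug runtime.
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  return text;
}
```

In the default "C" locale this changes only A-Z, so the bytes of non-ASCII characters in a file name pass through untouched while an extension such as .NII is still lowercased.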

seanm · June 14, 2024, 4:30pm

lassoan: "The error seems to be due to a bug in niftilib - in how it prepares input data for isupper or tolower"

If it’s a nifti_clib bug, it would be best to fix it upstream, or at least create an issue there.

Sean

lassoan · June 16, 2024, 7:21pm

Right, I just did not want to scare @codeling with this extra task, but the ultimate solution is definitely to fix this bug upstream, in NIFTI-Imaging/nifti_clib (C libraries for NIFTI support) on GitHub.

@codeling it may be worth submitting a bug report to nifti_clib. Maybe the developers will fix it for you quickly, and then you can ask the ITK developers to update ITK’s implementation.

codeling · June 17, 2024, 8:19pm

I am sorry, but I currently have time neither to look into this further nor to write a proper bug report, and as I said, for me it’s far from being a critical issue. I just wanted to share my findings!

seanm · June 17, 2024, 9:30pm

I took 5 minutes to write up a ticket: "Unicode filename problems on Windows because of isupper() and tolower()" (Issue #189, NIFTI-Imaging/nifti_clib on GitHub).