Reading DICOM series with SimpleITK and file sorting

I am trying to read DICOM series using SimpleITK. I have a folder containing many DICOM series, of which I only need a few. Luckily, I know the prefix of the filenames, so I can select the proper files.

However, reader.SetFileNames(fns) does not sort the DICOM slices correctly. For that I would need to use sitk.ImageSeriesReader.GetGDCMSeriesFileNames(data_directory, series_ID), which would read through my data folder a couple of times. That is quite inefficient.

So what is the appropriate way to tell the ImageSeriesReader to sort my DICOM files?

Hi @jonasteuwen,

Not sure if just reading the meta-data for all files is that expensive. You can try the following two options:

  1. Standard SimpleITK.
  2. Python acrobatics + standard SimpleITK.

Option 1:

import SimpleITK as sitk

# A file name that belongs to the series we want to read
file_name = '1.dcm'
data_directory = '.'

# Read the file's meta-information without reading bulk pixel data
file_reader = sitk.ImageFileReader()
file_reader.SetFileName(file_name)
file_reader.ReadImageInformation()

# Get the sorted file names. This opens all files in the directory and reads
# their meta-information without reading the bulk pixel data
series_ID = file_reader.GetMetaData('0020|000e')
sorted_file_names = sitk.ImageSeriesReader.GetGDCMSeriesFileNames(data_directory, series_ID)

# Read the bulk pixel data
img = sitk.ReadImage(sorted_file_names)

Option 2:

import tempfile
import os
from random import shuffle

# Continues from Option 1 (uses sitk and sorted_file_names from there).
# Our list of known file names, just a shuffled version of the
# sorted tuple above
known_file_names = list(sorted_file_names)
shuffle(known_file_names)

# Create the file reader and get the series_ID
file_reader = sitk.ImageFileReader()
file_reader.SetFileName(known_file_names[0])
file_reader.ReadImageInformation()
series_ID = file_reader.GetMetaData('0020|000e')

# Create a temp directory and symlink all of the files, 
# then read from there.
with tempfile.TemporaryDirectory() as tmpdir_name:
    # Create symbolic links to the original files from our tmpdir
    for f in known_file_names:
        os.symlink(os.path.abspath(f), os.path.join(tmpdir_name, os.path.basename(f)))
    # Now get the sorted list of file names
    sorted_file_names = sitk.ImageSeriesReader.GetGDCMSeriesFileNames(tmpdir_name, series_ID)
    img = sitk.ReadImage(sorted_file_names)

Hope this helps. Please update the thread with which option you chose.

Dear Zivy,

Thanks for the nice solutions. I will go for Option 2, as these directories are really large. However, it is a little bit hacky. Would it be possible to add a sort_fns=True parameter to the GetGDCMSeriesFileNames function? That would be a SimpleITK-only solution.

Hi @jonasteuwen,

We will likely not modify this in SimpleITK as you have a somewhat unique setting (knowing the files belonging to the series without opening all of the files in a directory).
The previous solution with the tempdir may be too acrobatic, though it leaves the sorting to the underlying library.

Below is option 3, where we do the sorting ourselves:

from random import shuffle

known_file_names = list(sorted_file_names)
shuffle(known_file_names)

file_reader = sitk.ImageFileReader()
file_names_and_image_position = []
for f in known_file_names:
    file_reader.SetFileName(f)
    file_reader.ReadImageInformation()
    # Get the z coordinate from Image Position (Patient), tag 0020|0032
    z = float(file_reader.GetMetaData('0020|0032').split('\\')[2])
    file_names_and_image_position.append((f, z))
# Sort the list according to z
file_names_and_image_position.sort(key=lambda x: x[1])
z_sorted_file_names, _ = zip(*file_names_and_image_position)

img = sitk.ReadImage(z_sorted_file_names)
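
Note that sorting on the z coordinate alone assumes roughly axial slices. For sagittal, coronal, or oblique acquisitions, a common alternative is to project Image Position (Patient) onto the slice normal, computed as the cross product of the row and column direction cosines in Image Orientation (Patient), tag 0020|0037. Below is a minimal sketch of that idea, assuming every file carries both tags. The helper names slice_location and sort_by_slice_location are made up for illustration and are not part of SimpleITK:

```python
def slice_location(position, orientation):
    """Project Image Position (Patient) onto the slice normal.

    Both arguments are the raw DICOM strings, with values separated
    by backslashes, as returned by GetMetaData.
    """
    px, py, pz = (float(v) for v in position.split('\\'))
    rx, ry, rz, cx, cy, cz = (float(v) for v in orientation.split('\\'))
    # Slice normal = row direction x column direction (cross product)
    nx = ry * cz - rz * cy
    ny = rz * cx - rx * cz
    nz = rx * cy - ry * cx
    # Dot product of the position with the normal
    return px * nx + py * ny + pz * nz


def sort_by_slice_location(file_names):
    """Return file_names ordered along the slice normal (hypothetical helper)."""
    # Imported here so slice_location can be used without SimpleITK installed
    import SimpleITK as sitk

    file_reader = sitk.ImageFileReader()
    keyed = []
    for f in file_names:
        file_reader.SetFileName(f)
        file_reader.ReadImageInformation()  # meta-information only, no pixel data
        location = slice_location(file_reader.GetMetaData('0020|0032'),
                                  file_reader.GetMetaData('0020|0037'))
        keyed.append((location, f))
    keyed.sort(key=lambda item: item[0])
    return [f for _, f in keyed]
```

With this, sitk.ReadImage(sort_by_slice_location(known_file_names)) would replace the z-based sort. For axial series the two orderings should agree, since the normal is then along z.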

Hi @zivy,

Thank you for the final solution and your efforts. This seems like the most elegant way to do it with the current library.

Best,
Jonas

I tried option 1, but there is an error!

Hello @Aya_Hassan,

Welcome to SimpleITK!

When posting questions, it is best to describe the issue in detail and not post a screenshot from the IDE in which you are working.

In this case we see that you get a very specific error saying that the file could not be read because it doesn’t exist. A relative file name is resolved against your current working directory, and since your script can be run from any directory, the file often cannot be found there. It is better to provide the full path to the file, which doesn’t depend on your current working directory.
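
As an illustration, here is a small sketch that builds an absolute path from a known data directory and fails early with a clearer message if the file is missing. The function name resolve_dicom_path is hypothetical, and the data directory is whatever folder actually holds your files:

```python
import os


def resolve_dicom_path(file_name, data_directory):
    """Return the absolute path of file_name inside data_directory.

    Raises FileNotFoundError, including the current working directory
    in the message, if the file does not exist.
    """
    path = os.path.abspath(os.path.join(data_directory, file_name))
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"No such file: {path} (current working directory: {os.getcwd()})")
    return path
```

The result can then be passed to file_reader.SetFileName(...) in Option 1, so the script no longer depends on where it is launched from.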
