How to handle duplicate DICOM slices using SimpleITK?

mattwarkentin · June 24, 2020, 9:33pm

Hi,

This isn’t strictly a SimpleITK issue, but I am wondering how to handle such an issue using SimpleITK. I have been given axial CT from a colleague (stored in DICOM format), and it contains 146 slices of dimension 512x512.

When I load each slice, I notice there are several slices that have the exact same origin coordinate in physical space. In other words, these are duplicate slices.

First, can anyone offer any insight into why a DICOM series might contain duplicate slices? This isn’t a one-off occurrence: among the thousands of CTs I have, it happens every so often.

Second, is there a rigorous way to handle such an occurrence? Currently, my approach has been to read in each .dcm file independently, extract the origins with GetOrigin(), order by origin values, identify duplicates, and randomly select one slice among the duplicates. This leaves me with a new list of files that contains only unique slices. I can then use this set of files to read in the whole series (and write out to NRRD for other purposes).
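The workflow described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a vetted tool: the helper names (`read_origin`, `drop_duplicate_origins`, `write_unique_series`) and the rounding tolerance are made up for illustration, slices are assumed to be axial so that sorting by the z component of the origin is meaningful, and the first file at each position is kept (instead of a random one) so the result is reproducible. SimpleITK is only needed by the two functions that touch files.

```python
def read_origin(path):
    """Return the slice origin (x, y, z) from the file header without loading pixels."""
    import SimpleITK as sitk  # only the file-reading helpers need SimpleITK

    reader = sitk.ImageFileReader()
    reader.SetFileName(path)
    reader.ReadImageInformation()
    return reader.GetOrigin()


def drop_duplicate_origins(files_with_origins, decimals=3):
    """Sort (path, origin) pairs by z, then keep the first path at each origin.

    Origins are rounded to `decimals` places (an assumed tolerance) before comparison.
    """
    seen = set()
    unique_paths = []
    ordered = sorted(files_with_origins, key=lambda item: (item[1][2], item[0]))
    for path, origin in ordered:
        key = tuple(round(c, decimals) for c in origin)
        if key not in seen:
            seen.add(key)
            unique_paths.append(path)
    return unique_paths


def write_unique_series(dicom_files, output_path):
    """Read the de-duplicated slices as one volume and write it out (e.g. to .nrrd)."""
    import SimpleITK as sitk

    pairs = [(f, read_origin(f)) for f in dicom_files]
    image = sitk.ReadImage(drop_duplicate_origins(pairs))
    sitk.WriteImage(image, output_path)
    return image
```

Because the files are handed to `sitk.ReadImage` already sorted and without repeats, the reader sees a stack with no zero gaps between neighbouring slices.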

I am under the impression that SimpleITK uses the difference in the Z-dimension for the first two slices (in physical order) to determine the slice thickness/spacing. So when the first slice is duplicated, the difference is zero, and SimpleITK defaults to unit-spacing.
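To see where a zero difference would show up, one can list the z-gaps between consecutive slices once their origins are known. The sketch below is plain Python and does not reproduce SimpleITK's internal logic; `z_gaps` and `zero_gap_indices` are hypothetical helper names, the tolerance is an assumed value, and the positions are invented.

```python
def z_gaps(z_positions):
    """Return the gaps between consecutive slices after sorting by z."""
    ordered = sorted(z_positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def zero_gap_indices(z_positions, tol=1e-6):
    """Indices (into the sorted stack) of slices that repeat the previous position."""
    return [i + 1 for i, gap in enumerate(z_gaps(z_positions)) if abs(gap) <= tol]


# Made-up positions: the first slice is duplicated, so the first gap is zero.
positions = [0.0, 0.0, 2.0, 4.0, 6.0]
print(z_gaps(positions))            # [0.0, 2.0, 2.0, 2.0]
print(zero_gap_indices(positions))  # [1]
```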

When the duplicate slices are anywhere else in the stack, it just breaks SimpleITK from ordering the series properly, and the result is an image with slices ordered according to file name, rather than slice location.

I would really appreciate any insights. Thanks.

zivy · June 25, 2020, 12:15am

Hello @mattwarkentin,

These are truly strange datasets. Are the spatially duplicate images really duplicates? They occupy the same region in space and their intensity content is the same? If the intensity content is different then you have two scans that for some reason share study-series IDs which is worrisome as you need to use some other mechanism to separate them.

Have you tried loading them into some non-ITK based viewer? I wonder if OsiriX is able to separate the scans.

Possibly look into the acquisition time tag (‘0008|0032’). I expect it to be different for the duplicate images. If it isn’t then I’m really suspicious of the data source.
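A rough way to run this check with SimpleITK is sketched below. The function names `read_tag` and `same_position_groups` are hypothetical, and origins are compared exactly, which assumes the duplicates really do share identical coordinates. `read_tag` returns `None` when a file does not carry the tag, so a missing acquisition time does not raise an error.

```python
def read_tag(path, key="0008|0032"):
    """Return the value of a DICOM tag, or None if the file does not carry it."""
    import SimpleITK as sitk  # only needed when actually reading files

    reader = sitk.ImageFileReader()
    reader.SetFileName(path)
    reader.ReadImageInformation()
    if not reader.HasMetaDataKey(key):
        return None
    return reader.GetMetaData(key).strip()


def same_position_groups(rows):
    """Group (path, origin, tag_value) rows by origin; keep groups with more than one file."""
    groups = {}
    for path, origin, value in rows:
        groups.setdefault(tuple(origin), []).append((path, value))
    return {origin: members for origin, members in groups.items() if len(members) > 1}
```

Each returned group lists the files that share a position next to their tag values, so differing acquisition times within a group stand out immediately.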

lassoan · June 25, 2020, 1:36am

Most likely the extra slices are localizers. They are really annoying and not straightforward to filter out. Unfortunately, these series are valid DICOM data sets, so every reader must recognize and handle them. What we ended up doing in 3D Slicer is to split the series based on image type and by default not load a sub-series that contains a single slice if there is another subseries that contains many slices. See this discussion for details: https://discourse.slicer.org/t/problems-during-dicom-import/9060/31.
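The heuristic described in this post can be illustrated with a few lines of Python. This is only a sketch of the idea, not 3D Slicer's actual code: the function names are hypothetical, the image type is assumed to be available as a string per file (for example the value of tag 0008|0008), and the `many` threshold is an arbitrary assumption.

```python
from collections import defaultdict


def split_by_image_type(slices):
    """Group (path, image_type) pairs into sub-series keyed by image type."""
    subseries = defaultdict(list)
    for path, image_type in slices:
        subseries[image_type].append(path)
    return dict(subseries)


def drop_lone_slices(subseries, many=2):
    """Drop single-slice sub-series when some other sub-series has at least `many` slices."""
    has_large = any(len(paths) >= many for paths in subseries.values())
    if not has_large:
        return subseries
    return {key: paths for key, paths in subseries.items() if len(paths) > 1}
```

A series that consists of one slice only is left alone, since there is no larger sub-series to prefer over it.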

There are lots of other issues (tilted-gantry acquisitions, non-uniform slice spacing, time sequences, and of course all the vendor-specific DICOM implementation errors) that need to be addressed if you want to have a robust DICOM reader that can reconstruct 3D volumes. I think ITK aims to handle the most common cases, report probable geometry errors, and let application-level logic resolve them. If you want to be able to read a wide range of DICOM images, I would recommend these tools:

3D Slicer: nice GUI and Python interface for DICOM import/export of all kinds of DICOM data objects (3D/4D images, segmentations, spatial registration objects, structured reports, RT structure sets, plans, dose maps, etc.). Uses ITK for reading images: DICOM files are sorted and filtered, passed to ITK’s DICOM reader, then post-processed (to handle tilted-gantry acquisitions etc.)
dcm2niix: very robust command-line tool for converting DICOM series to 3D/4D nifti or nrrd images, handles many special cases and common DICOM implementation errors

mattwarkentin · June 25, 2020, 5:28pm

Thank you both for your responses.

To add to the oddity, there are six slices in duplicate (rather than just 1, which maybe would point toward a localizer), resulting in 12 total slices with 6 different origins.

@zivy Yes, I checked each of the 6 pairs of duplicates, and both slices in each pair carry the same origin and intensity data. So they are exact duplicates. I also tried to inspect the DICOM tag '0008|0032' and it was not available in these DICOM files.

@lassoan Thanks for the suggestions. I would be surprised if these were scouts/localizers, since within a patient’s directory there are already separate sub-directories for localizers and full axial CTs. So I would be surprised if localizers found themselves in the full CT sub-directory. I do actually use Slicer quite extensively for visualization. When loading the DICOM series mentioned above in Slicer, it does load (sort of) but produces a dialog which says:

Warnings detected during load. Examine data in Advanced mode for details. Load Anyway?

And the warning in Advanced mode is the following:

Images are not equally spaced (a difference of -2 vs 2 in spacings was detected). If loaded image appears distorted, enable ‘Acquisition geometry regularization’ in Application settings / DICOM / DICOMScalarVolumePlugin. Please use caution.

Perhaps you could speak to what might cause such a warning. In any case, when I load anyway, the image looks fine, and the number of slices in the Volume module is the number of DICOM files (146). If there are localizers present, wouldn’t they be left out according to the heuristic described above?

lassoan · June 25, 2020, 5:43pm

“Localizer” may refer either to scout scans done before the full scans (to set the field of view and imaging parameters) or to localizer slices that are intermixed with the real image slices of a series.

I don’t know what your series contain. If you can share an example then I can have a look.

3D Slicer correctly loads an image with varying slice spacing or duplicate slices if “Acquisition geometry regularization” is enabled in application settings (I think it is enabled by default in recent Slicer-4.11 versions).

mattwarkentin · June 25, 2020, 5:52pm

Hmm, thanks. I didn’t know that was possible.

I will anonymize these DICOM files so that I can share them. Thanks for your help.

mattwarkentin · June 25, 2020, 6:48pm

@lassoan DICOM directory can be downloaded from here.

lassoan · June 26, 2020, 2:26pm

The example data was useful. I’ve found that frames that are in the same position contain exactly the same image pixel data and metadata except SOPInstanceUID, StudyInstanceUID, and SeriesInstanceUID values.

The difference in StudyInstanceUID and SeriesInstanceUID must be the result of anonymization, because otherwise each frame would be loaded as a separate series. Can you please check if in the original (non-anonymized) files:

is SOPInstanceUID value the same in slices that are in the same position?
are there any pixel or metadata differences between slices that are in the same position?

mattwarkentin · June 26, 2020, 7:35pm

Thank you for looking into this, Andras. I can confirm that the 6 pairs of slices in the same position do have the exact same pixel data in the original file set. In my copy of the data, the SOPInstanceUID values are not the same for the duplicate slices. However, the SeriesInstanceUID and StudyInstanceUID are the same in my copy of the data.

I then compared ALL the DICOM header meta-data and the only fields that varied between the duplicate slices were SOPInstanceUID and MediaStorageSOPInstanceUID, which carried the same values.
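A header comparison of this kind might look like the following with SimpleITK. It is a sketch, not the script that was actually used in this thread: `read_metadata` and `differing_keys` are hypothetical names, and only the tags that SimpleITK exposes through its metadata dictionary are compared.

```python
def read_metadata(path):
    """Return every DICOM tag SimpleITK exposes for one file as a {key: value} dict."""
    import SimpleITK as sitk  # only needed when actually reading files

    reader = sitk.ImageFileReader()
    reader.SetFileName(path)
    reader.LoadPrivateTagsOn()
    reader.ReadImageInformation()
    return {key: reader.GetMetaData(key) for key in reader.GetMetaDataKeys()}


def differing_keys(meta_a, meta_b):
    """Return the sorted tag keys whose values differ or that appear in only one header."""
    all_keys = meta_a.keys() | meta_b.keys()
    return sorted(k for k in all_keys if meta_a.get(k) != meta_b.get(k))
```

Calling `differing_keys(read_metadata(file_a), read_metadata(file_b))` on a pair of slices at the same position yields the list of tags to inspect by hand.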

However, I should point out that by the time the data got to me, they had already been through one anonymization, so even though they have different SOPInstanceUID values in my original data, that could be the consequence of the first anonymization, which was not done by me.

lassoan · June 26, 2020, 7:55pm

Having the exact same slice with two different SOPInstanceUID is obviously unnecessary, but probably still valid DICOM. Having these slices in the series makes the spacing between slices non-uniform (there are a couple of slices with 0 spacing between them), therefore it causes problems for readers that require uniform slice spacing. 3D Slicer should read these images correctly if “Acquisition geometry regularization” is enabled.

If you only work with a limited data set then you may consider writing a short Python script that parses the files and recognizes and fixes this specific error. However, if you keep getting images from various sources all the time then the best is to use tools that are prepared to deal with odd cases.
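One possible shape for such a script is sketched below. It treats a slice as a duplicate only if both its origin and a hash of its pixel data match an earlier slice, which is stricter than comparing positions alone. The names `slice_fingerprint` and `find_exact_duplicates` are hypothetical, SHA-256 is an arbitrary choice of hash, and `sitk.GetArrayFromImage` requires NumPy to be installed.

```python
import hashlib


def slice_fingerprint(path):
    """Return (origin, sha256 hex digest of the pixel bytes) for one DICOM slice."""
    import SimpleITK as sitk  # only needed when actually reading files

    image = sitk.ReadImage(path)
    pixels = sitk.GetArrayFromImage(image).tobytes()
    return image.GetOrigin(), hashlib.sha256(pixels).hexdigest()


def find_exact_duplicates(records):
    """Return paths that repeat the (origin, digest) of an earlier record."""
    seen = set()
    duplicates = []
    for path, origin, digest in records:
        key = (tuple(origin), digest)
        if key in seen:
            duplicates.append(path)
        else:
            seen.add(key)
    return duplicates
```

The paths returned by `find_exact_duplicates` can then be excluded from the file list before the series is read. Slices that share a position but differ in content are deliberately left in place, since those point to a different problem.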

mattwarkentin · June 26, 2020, 8:50pm

Thanks again for all of your help. I really appreciate it.

Out of curiosity, what does Slicer do to handle this issue such that it can display it properly? It doesn’t have to be a nuanced explanation, but it does lead into my next issue.

My clinical colleague produced a segmentation mask for an object in this image using a Slicer extension (and also saved a fiducial for the centre of mass). When saving the mask, the dimensions are the same as the series including the duplicate slices (i.e. 146 slices rather than 140 unique slices). So when I removed the duplicate slices using SimpleITK, the mask no longer properly aligns with the new image when loaded in Slicer. Do I need to resample the mask? Do I need to ask him to segment it over again?

If I recall correctly, when loading the fiducial, the markup no longer appears at the COM where it was originally defined. Whatever cure I ultimately use to fix the duplicate slice issue will need to be propagated to the masks and fiducials. I would love any insight.

lassoan · June 26, 2020, 10:08pm

Slicer creates a warping transform that transforms each slice to its correct location. Depending on Slicer’s version, this warping was or was not enabled by default (if it was not, then you got a warning when you loaded the image). After the image is loaded, probably the best option is to harden the transform on the image (this resamples the image using the warping transform).

If you import the DICOM volume with “Acquisition geometry regularization” enabled and apply the generated regularization transform to your segmentation then it should fix the distortion in the segmentation (should match the 140-slice DICOM image).

mattwarkentin · June 29, 2020, 4:10pm

I am using Slicer 4.10.2 at the moment. I checked the Edit > Application Settings > DICOM and Acquisition geometry regularization is set to default (none).

A warning was produced when loading this series, but when I chose to load anyway, the series has 146 slices. It looks like the duplicate slices are just included in the series. I can see I am “moving” through the image via the Data Probe coordinates, but sometimes the same image is shown two slices in a row where the duplicates exist.

I reset Slicer with Acquisition geometry regularization enabled and after hardening the transform to the image, the resulting volume did indeed have only 140 slices. When I harden that transform on the fiducial markup and segmentation mask it seems to properly align with the new transformed image.