Get anatomical metadata from dicom in Python

I have dicom images converted to nifti format. I need to find if the images are of carotid or coronary vessels from the metadata. I have searched and am convinced that these metadata are not available in Nifti.header.

Are there ways to find that in dicoms using any python package? I can get the metadata using some C++ tools but need to implement the same in python.

Hello @banikr,

Indeed, with nifti you lose all of the rich DICOM header information.

Using SimpleITK (>=2.1.0) and your original DICOM image:

import SimpleITK as sitk
image = sitk.ReadImage('my_dicom_image.dcm')

# ignore leading/trailing whitespace and upper/lower case
if image['0018|0015'].strip().lower() == 'carotid':
  print('carotid')

Hi, I have got the following error with codes:

casePath = r'/media/banikr2/CAP_Exam_Data2/OrganizedFolder2/dicoms/im0-1.3.6.1.4.1.5962'
dcmItk = dicomread(casePath) #, imgitk=True)
print(dcmItk.GetSize())
if dcmItk['0018|0015'].strip().lower()== 'carotid':
  print('carotid')
(512, 512, 559)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-372160a971c2> in <module>
      2 dcmItk = dicomread(casePath) #, imgitk=True)
      3 print(dcmItk.GetSize())
----> 4 if dcmItk['0018|0015'].strip().lower()== 'carotid':
      5   print('carotid')

~/miniconda3/envs/carotid36/lib/python3.6/site-packages/SimpleITK/SimpleITK.py in __getitem__(self, idx)
   3724 
   3725         if (len(idx) > dim):
-> 3726            raise IndexError("too many indices for image")
   3727 
   3728     # All the indices are integers just return GetPixel value

IndexError: too many indices for image

Another question: does the tag 0018|0015 hold the anatomical metadata for all dicoms?

Additional info: the dicom files neither have a .dcm extension nor are stored as a single file. Rather, each image is stored as multiple series/slices.
The dicomread function uses the following code to read the image given the case/folder path:

reader = sitk.ImageSeriesReader()
dicom_names = reader.GetGDCMSeriesFileNames(casePath)
reader.SetFileNames(dicom_names)
image = reader.Execute()

Hello @banikr,

  1. The file extension doesn’t matter as you called the reader’s GetGDCMSeriesFileNames which will try all files and return the files associated with the first DICOM series it encounters.
  2. The tag, 0018|0015 corresponds to BodyPartExamined, which is defined by the DICOM standard.
  3. To see which DICOM tags were loaded: image.GetMetaDataKeys().
  4. You are likely using a SimpleITK version <2.1.0, hence the error (see original response above which specifically says version >=2.1.0). For older versions the syntax is image.GetMetaData('0018|0015').strip().lower()
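Points 3 and 4 can be combined into a small lookup that does not raise when a tag is absent. This is only a sketch: the names get_tag and is_body_part are made up for illustration, and it relies on nothing beyond the Image methods HasMetaDataKey and GetMetaData, which are also available in versions older than 2.1.0.

```python
def get_tag(image, key, default=None):
    """Return the stripped value of a metadata key on a SimpleITK image.

    Falls back to `default` when the key is not in the image's
    metadata dictionary, instead of letting GetMetaData raise.
    """
    if image.HasMetaDataKey(key):
        return image.GetMetaData(key).strip()
    return default


def is_body_part(image, name):
    """Case-insensitive comparison against BodyPartExamined (0018|0015)."""
    value = get_tag(image, "0018|0015", default="")
    return value.lower() == name.lower()
```

With an image returned by sitk.ReadImage, is_body_part(image, 'carotid') would be False both when the tag holds another value and when the tag is missing altogether.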

>> SimpleITK.__version__
'2.0.2'

>> image = dicomread(dcmPath, True)
>> image.GetSize()
(512, 512, 240)

Now following the lower sitk version syntax:

image.GetMetaData('0018|0015').strip().lower()

This results in the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-35-d30ec7128239> in <module>
----> 1 image.GetMetaData('0018|0015').strip().lower()

~/miniconda3/envs/carotid36/lib/python3.6/site-packages/SimpleITK/SimpleITK.py in GetMetaData(self, key)
   3123     def GetMetaData(self, key):
   3124         r"""GetMetaData(Image self, std::string const & key) -> std::string"""
-> 3125         return _SimpleITK.Image_GetMetaData(self, key)
   3126 
   3127     def SetMetaData(self, key, value):

RuntimeError: Exception thrown in SimpleITK Image_GetMetaData: /tmp/SimpleITK-build/ITK/Modules/Core/Common/src/itkMetaDataDictionary.cxx:77:
itk::ERROR: Key '0018|0015' does not exist 

I think when reading dicom images by sitk, it doesn’t read and store all the metadata.
Again, image.GetMetaDataKeys() returns an empty tuple: ()

The next thing I am gonna try is to install 2.1.0 version of sitk and check again.

Can you dump all the meta data keys that are in the image? This line ought to do it:

print(image.GetMetaDataKeys())

Or can you share an example image?
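To see the values as well as the keys, each key can be paired with its value. A minimal sketch, assuming only the Image methods GetMetaDataKeys and GetMetaData; the function name format_metadata is made up for illustration.

```python
def format_metadata(image):
    """Return one 'key : value' line per entry in the image's metadata dictionary."""
    return [
        "{} : {}".format(key, image.GetMetaData(key))
        for key in image.GetMetaDataKeys()
    ]


# Usage, assuming SimpleITK is installed and the path is a single DICOM file:
#   import SimpleITK as sitk
#   image = sitk.ReadImage('my_dicom_image.dcm')
#   print("\n".join(format_metadata(image)))
```

An empty list here means the image carries no metadata dictionary entries at all, which is a different situation from one particular tag being absent.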

Hi, print also gives empty parentheses.
The data/.dcm is shared here.

I’m not seeing the key ‘0018|0015’ in that image. Here’s the metadata dictionary that I see:

0008|0008 : ORIGINAL\PRIMARY\CARDIAC_CTA
0008|0016 : 1.2.840.10008.5.1.4.1.1.2.1
0008|0018 : 1.3.6.1.4.1.19291.2.1.3.11689618230218138555609087104
0008|0020 : 20130205
0008|0021 : 20130205
0008|0022 : 20130205
0008|0023 : 20130205
0008|002a : 20130205131533.872
0008|0030 : 131102.000
0008|0031 : 131453.077
0008|0032 : 131529.230
0008|0033 : 131533.872
0008|0050 : CT5000089169
0008|0060 : CT
0008|0061 : CT
0008|0070 : TOSHIBA
0008|0080 : NIH Clinical Center
0008|0090 : CHEN^MARCUS
0008|1010 : 1C705A
0008|1030 : CT Non-Cardiac Finding
0008|103e : HALF 75% 0.97s Cardiac 0.5 CE
0008|1040 : Radiology CT4
0008|1090 : Aquilion ONE
0008|2111 : LossLess compression with JPEG 2K, compression ratio 3.59
0008|9205 : MONOCHROME
0008|9206 : VOLUME
0008|9207 : NONE
0010|0010 : unnamed
0010|0020 : PACI0061_V1
0010|0021 :
0010|0030 : 19620130
0010|0040 : F
0010|1010 : 051Y
0018|1000 : 1CB1262006
0018|1020 : V4.93ER004
0018|1030 : Cardiac Subtraction - Chen
0018|1081 :
0018|1082 :
0018|1083 :
0018|1084 :
0018|5100 : FFS
0018|9004 : PRODUCT
0018|9037 : RETROSPECTIVE
0018|9070 : 973
0018|9073 : 4919
0018|9085 : ECG
0018|9169 : RR_INTERVAL
0020|000d : 1.3.6.1.4.1.19291.2.1.1.11689618230218138555609086992
0020|000e : 1.3.6.1.4.1.19291.2.1.2.11689618230218138555609087103
0020|0010 : 4000123328
0020|0011 : 8
0020|0012 : 6
0020|0013 : 1
0020|0020 : L\P
0020|0052 : 1.2.392.200036.9116.2.5.1.37.2423320044.1360037467.302306
0020|1040 :
0020|1041 : 60.00
0020|1208 : 1983
0020|4000 : CTA\HALF
0028|0002 : 1
0028|0004 : MONOCHROME2
0028|0008 : 240
0028|0010 : 512
0028|0011 : 512
0028|0100 : 16
0028|0101 : 16
0028|0102 : 15
0028|0103 : 1
0028|2110 : 00
0032|1060 : CT Non-Cardiac Finding
0040|0002 : 20130205
0040|0003 : 130207.000
0040|0004 : 20130205
0040|0005 : 133207.000
0040|0007 : CT Non-Cardiac Finding
0040|0009 : 5000089169
0040|0244 : 20130205
0040|0245 : 131102.000
0040|0253 : 1022
0040|1001 : 4000123328
2050|0020 : IDENTITY
ITK_original_direction : [UNKNOWN_PRINT_CHARACTERISTICS]

I see, then the metadata is not available inherently.
Are there reasons for image.GetMetaDataKeys() not printing anything?
How did you read the metadata? I am reading dicoms using dicomread shared in previous replies.

I’m not sure why your metadata dictionary is not printing anything. You can see the script I used here:

I ran it with the ‘-v’ flag.


It is a multi-frame IOD, so only a very small part of the DICOM attributes is visible: the ‘metadata dictionary’ contains only top-level tags. In this particular case there is a related sequence in the shared groups, but it is empty.

Edit:
P.S. BTW, anatomy description attributes are “Type 3”, optional.

Edit:
There is also one close private tag, BTW
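Nested sequences like the one mentioned above are not reachable through the flat metadata dictionary, but they can be walked with a DICOM-level library. The sketch below assumes pydicom, which was not used in this thread, and assumes the anatomy would be stored under the standard keywords SharedFunctionalGroupsSequence > FrameAnatomySequence > AnatomicRegionSequence > CodeMeaning. The function names are made up for illustration. Because these attributes are optional, an empty result is a normal outcome.

```python
def describe_anatomy(ds):
    """Collect anatomy hints from a pydicom-style dataset as a list of strings."""
    found = []

    # Top-level Body Part Examined (0018,0015); optional, so it may be absent.
    body_part = getattr(ds, "BodyPartExamined", None)
    if body_part:
        found.append(str(body_part).strip())

    # Enhanced multi-frame objects keep anatomy in nested sequences.
    for group in getattr(ds, "SharedFunctionalGroupsSequence", []):
        for anatomy in getattr(group, "FrameAnatomySequence", []):
            for region in getattr(anatomy, "AnatomicRegionSequence", []):
                meaning = getattr(region, "CodeMeaning", None)
                if meaning:
                    found.append(str(meaning).strip())
    return found


def describe_anatomy_from_file(path):
    """Read only the header of a DICOM file and report anatomy hints."""
    import pydicom  # assumption: pydicom is installed

    # stop_before_pixels skips the (possibly compressed) pixel data.
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return describe_anatomy(ds)
```

For the file shared in this thread the result would be expected to be empty, given that the top-level tag is missing and the related sequence is reported as empty.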

I can read the metadata using the following itk functions, given a casePath_ (the name of the dicom directory):

imageType = itk.Image[itk.F, 3]
# print(imageType)
reader = itk.ImageSeriesReader[imageType].New()
dicomIO = itk.GDCMImageIO.New() # GDCMImageIO is not available in sitk
dicomFN = itk.GDCMSeriesFileNames.New()
reader.SetImageIO(dicomIO)
dicomFN.SetUseSeriesDetails(True)
dicomFN.SetDirectory(casePath_)
uids = dicomFN.GetSeriesUIDs()
fnames = dicomFN.GetFileNames(uids[0])
reader.SetFileNames(fnames)
reader.Update()
# image = reader.GetOutput()
# ##------------------------------------------------------------------------------##
metaData = dicomIO.GetMetaDataDictionary()
# ## now you can access the meta data using indexing by key.
# ## The key we are concerned with is the body site key '0018|0015'
print(metaData['0018|0015'])

Is there an equivalent in sitk for reading the metadata?
What I figured out is that sitk, unlike itk, has no GDCMImageIO, so maybe sitk cannot read all the metadata that itk can.

Hello,
sitk.ReadImage only works when the dicom image is saved in a single file with a .dcm extension. But many dicom images are stored as multiple slices in a single directory. How do you suggest reading the metadata of those dicoms?

I use the following function to read both single .dcm and multiple sliced dicom images.

import os
import SimpleITK as sitk

def dicomread(casePath, imgitk=None):
    # imgitk is not used in the body; it is kept so existing calls keep working.
    if not os.path.isdir(casePath):
        # Stop here instead of printing and then reading from a bad path.
        raise NotADirectoryError("Not a directory: " + str(casePath))
    reader = sitk.ImageSeriesReader()
    dicom_names = reader.GetGDCMSeriesFileNames(casePath)
    reader.SetFileNames(dicom_names)
    image = reader.Execute() # type --> 'SimpleITK.SimpleITK.Image'
    # A 4D image holding a single volume is reduced to 3D.
    if image.GetDimension() == 4 and image.GetSize()[3] == 1:
        image = image[:, :, :, 0]
    return image

Would it be possible to add the metadata part you mentioned inside/with this function?

Hello @banikr,

The reader in your code contains all of the metadata. Please see this usage example on read-the-docs, the relevant function is GetMetaData.

Highly recommend that you spend some time skimming the SimpleITK documentation, specifically the examples on read the docs and the Jupyter notebooks to better familiarize yourself with the toolkit.
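As a rough sketch of that advice applied to the dicomread function above: the series reader keeps the per-slice dictionaries only when asked to, and the tags are then queried on the reader by slice index rather than on the returned image. The function name dicomread_with_tags and its keys parameter are made up for illustration, and SimpleITK is imported inside the function only to keep the sketch self-contained.

```python
def dicomread_with_tags(case_path, keys=("0018|0015",)):
    """Read a DICOM series; return (image, {key: value}) taken from the first slice."""
    import SimpleITK as sitk  # assumption: SimpleITK is installed

    reader = sitk.ImageSeriesReader()
    file_names = reader.GetGDCMSeriesFileNames(case_path)
    if not file_names:
        raise ValueError("No DICOM series found in " + repr(case_path))
    reader.SetFileNames(file_names)

    # Without these calls the reader does not keep the per-slice metadata.
    reader.MetaDataDictionaryArrayUpdateOn()
    reader.LoadPrivateTagsOn()
    image = reader.Execute()

    tags = {}
    for key in keys:
        # Slice 0 is used as a representative; absent keys are simply skipped.
        if reader.HasMetaDataKey(0, key):
            tags[key] = reader.GetMetaData(0, key).strip()
    return image, tags
```

A missing '0018|0015' entry in the returned dictionary then means the tag is not in the file, as with the image shared earlier in the thread, and not that the reader dropped it.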
