How to understand the coordinate system of dicom

I am confuzed in the understanding of dicom coordinate system.

I display dicom information with SimpleITK:

import SimpleITK as sitk

path1 = 'C:\\Users\\Administrator\\Desktop\\medspro\\Data\\dce1\\IMG-0001-00001.dcm'
path2 = 'C:\\Users\\Administrator\\Desktop\\medspro\\Data\\dce1\\IMG-0001-00080.dcm'


# path1 = 'C:\\Users\\Administrator\\Desktop\\medspro\\Data\\Chang Cheng\\TOF\\IM_0174'
# path2 = 'C:\\Users\\Administrator\\Desktop\\medspro\\Data\\Chang Cheng\\TOF\\IM_0293'

img1 = sitk.ReadImage(path1)
print('img 1---------------------------------------')
print('instance number: ', img1.GetMetaData('0020|0013'))
print('direction: ', img1.GetDirection())
print('position: ', img1.GetOrigin())

img2 = sitk.ReadImage(path2)
print('img 2---------------------------------------')
print('instance number: ', img2.GetMetaData('0020|0013'))
print('direction: ', img2.GetDirection())
print('position: ', img2.GetOrigin())

Please have a look for the information of two dicom files which come from a DCE sequence:

img 1---------------------------------------
instance number:  1 
direction:  (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)
position:  (-226.074, -211.921, 69.1556)
img 2---------------------------------------
instance number:  80
direction:  (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)
position:  (-226.074, -211.921, -128.344)

The direction shows the z-direction is [0, 0, 1]. From the instance change from 1 to 80, the z-value of patient position decrease from 69 to -128.

For another two dicom files from a TOF sequence:

img 1---------------------------------------
instance number:  1 
direction:  (0.9984669087122222, 0.0, -0.055351894336681175, 0.0, 1.0000000000000002, 0.0, 0.055351894336681175, 0.0, 0.9984669087122222)
position:  (-107.45459007851, -165.12128686904, 81.0415926597073)
img 2---------------------------------------
instance number:  120 
direction:  (0.9984669087122222, 0.0, -0.055351894336681175, 0.0, 1.0000000000000002, 0.0, 0.055351894336681175, 0.0, 0.9984669087122222)
position:  (-112.06539940468, -165.12128686904, 164.213925423379)

The z-direction is nearly [0, 0, 1]. From the instance change from 1 to 120, the z-value of patient position increase from 81 to 164.

For the above two situation, the z-direction is both about [0, 0, 1], with the instance number increases, why the z-value of patient position is increased in TOF sequence but decreased in DCE sequence?

Any suggestion is appreciated~~~

The instance number does not matter for determining 3D positions. The acquisition system is free to assign instance numbers and choose IJK voxel coordinate system axes any way it wants.

Only the physical coordinates (in case of DICOM: patient LPS) provide relevant spatial position information. You can compute physical coordinates from IJK voxel indices using the image origin, spacing, and axis directions.

1 Like

In my understanding, if I have the origin for the image with “1” instance number (p1), and I want to calculate the origin for the image with “2” instance number (p2), the calculation should be:

p2 = p1 + spacing*z_direction

Am I correct?

The z_direction can be obtained from the Image Position (Patient): 0020:0032. The SimpleITK function image.GetDirection() can also return the z_direction. In the above two examples, z_direction is both about [0, 0, 1]. Thus, z_direction(2) = 1, and we have:

p2_z = p1_z + spacing*z_direction(2)
-->
p2_z = p1_z + spacing

In this way, p2_z > p1_z because spacing always is bigger than 0.

However, in the example of DCE data, p2_z < p1_z, and it really make me confuzed.

Is my understanding p2 = p1 + spacing*z_direction wrong?

Yes, this is wrong. Instance numbers should usually be ignored.

Each slice has it’s own 3D position. Slices should be ordered according to that (image position patient) when assembling them into a 3D volume. That’s what GDCMSeriesFileNames does, see DICOM read example.

1 Like

Thank you very much for your kindly reply.

My understanding is:

let’s say there are N files (img1, img2, ..., imgN), in which the instance number are 1, 2, ..., N, and the image position patient are p1, p2, ..., pN. To assembling them into a 3D volume,

  1. if pN = p1 + spacing*N*z_direction, the origin point for the generated 3D volume is p1, and the generated 3D volume should be image that image = np.array([img1, img2, ..., imgN]):

  2. if p1 = pN + spacing*N*z_direction, the origin point for the generated 3D volume is pN and the generated 3D volume should be image that image = np.array([imgN, img_N-1, ..., img1]).

Is my understanding correct?

Yes.

1 Like

Thank you very much~~~

What you describe is often true, but not always.

First of all, DICOM does not provide spacing, because the spacing between slices may vary (e.g., spacing may be larger in areas that are farther from the main region of interest). When the slices are ordered then the spacing between each neighbor must be checked and if it is constant then that value can be used as spacing between slices, otherwise interpolation is needed to reconstruct a 3D image.

DICOM does not provide Z_direction, because it may vary across the slices (e.g., in case of rotational acquisition of cardiac MRI). Also note that slice normal may be different from the vector that connects the origin of a slice to the origin of the next slice (e.g., tilted-gantry acquisition of brain CTs).

Also, instance number may not be always match the spatial order of slices and may not be contiguous, so you must not rely on it in any way when you reconstruct a volume. Getting instance number from physical position is not always available because slices may intersect and so there are positions that belong to many instances.

You may also need to group the slices before you can reconstruct a 3D volume from them. A 4D volume, such as a time sequence is stored in DICOM as a list of overlapping slices (many slices being located at the same physical position). You need to find out what is the DICOM tag that splits the series into reasonable sets of non-overlapping slices. The grouping tag may be acquisition time, content time, trigger time, repetition time, and about a dozen more - which one is used depends on the image modality, acquisition protocol, and vendor implementation.

If you use ITK’s DICOM sorter then it performs checks while computing the origin, spacing, and axis directions and it throws an exception if they vary across the series. If they vary then you can choose to exclude the image from analysis or reconstruct a volume from it by grouping and interpolation (e.g., 3D Slicer can reconstruct a volume from slices with arbitrary spacing and orientation, even if they intersect or there are large gaps between them, and it can also automatically group slices to load them as a 4D volume).

1 Like

@lassoan Thank you for your kindly reply. And I have another two questions:

Getting instance number from physical position is not always available because slices may intersect

I met a MR sery which include both axial and sagittal scan. In this situation, how to order order those image? The patient position can not be used to decide the order.

The grouping tag may be acquisition time, content time, trigger time, repetition time, and about a dozen more - which one is used depends on the image modality, acquisition protocol, and vendor implementation.

Which tag is used in ITK to sort 4D dynamic sery?

Usually the axial and sagittal slices are in two separate series, but if not then it’s not an issue, you can simply split the series based on image orientation first and ignore the orientation that seems less useful.

It is chosen completely arbitrarily by the device manufacturer, so you need to inspect your frames and see.

In 3D Slicer we implemented an automated system for this, which tries dozens of possible interpretations and uses the one that is most likely to be correct (and the user can override this choice).

Alternatively, you can try dcm2niix, which can convert some 4D DICOM images to nrrd or nifti format. 4D images in these formats can be easily loaded into 3D Slicer, Python, ITK-Snap, etc.

1 Like

Do you mean that we need to firstly get the manufacturer information, and then use the specifically tag? It may be a huge work.

I mean you need to inspect the tags that the image series contains to see which splits them into meaningful subsets.

1 Like