Memory won't release when loading images in Python

I’m fairly new to the SimpleITK API and I’m using it right now to load a CT and PET image, and resample the PET to the same coordinate system and size as the CT (for a deep learning algorithm).

Right now my code is:

import SimpleITK as sitk

reader = sitk.ImageSeriesReader()
# loading CT
dicom_names = reader.GetGDCMSeriesFileNames(pathToCTDicomFiles)
reader.SetFileNames(dicom_names)
imageCT = reader.Execute()
# Loading PET
dicom_names = reader.GetGDCMSeriesFileNames(pathToPETDicomFiles)
reader.SetFileNames(dicom_names)
imagePET = reader.Execute()
# Resampling the PET image
pet_resampled = sitk.GetArrayFromImage(sitk.Resample(imagePET, imageCT))

The problem is that each time I loop over this part for a new image, the memory usage increases until the program crashes after 2 or 3 iterations. I run the code in Spyder 4.1 with Python 3.7 on Ubuntu 18.04 with 32 GB of RAM.

I have tried using del and gc.collect(), but it does nothing in this case.

Any suggestions for better use of the API are also very welcome, e.g. loading only the metadata for the CT or something like that.

Hello @malteekj,

Welcome to SimpleITK!

Generally speaking, your code looks fine and you should not be running out of memory. Since your example did not include the loop itself, I ran the following and did not experience the behavior you describe (memory usage growing with the number of iterations):

reader = sitk.ImageSeriesReader()

for i in range(10):
    dicom_names = reader.GetGDCMSeriesFileNames(pathToCTDicomFiles)
    reader.SetFileNames(dicom_names)
    imageCT = reader.Execute()

    dicom_names = reader.GetGDCMSeriesFileNames(pathToPETDicomFiles)
    reader.SetFileNames(dicom_names)
    imagePET = reader.Execute()

    pet_resampled = sitk.GetArrayFromImage(sitk.Resample(imagePET, imageCT))

Are you writing the pet_resampled numpy array to disk after each iteration or are you adding them into a list or some other data structure that is constantly growing? I suspect the issue is there and not in the resampling code itself.

One other observation: your code assumes that the PET and CT files are each in their own directory and that each directory contains a single image series. Because you did not specify a series ID, the reader uses the first series it finds in each directory. If a directory held more than one series, you would have to select a specific one.

Hi,
Thanks for the answer!

I just overwrite all the variables each iteration, so nothing should be stored. Can I provide anything that might help indicate where the problem is? I don’t run into these problems of releasing memory when using e.g. pydicom.

Hello @malteekj,

Did you experience the memory issue when running the code I provided? If not, then please provide a more complete version of your code so that we can identify why it exhibits memory issues and the code above doesn’t. Once we identify the difference, it will likely be the culprit.

I had exactly the same problem and tried to fix it the same way you did.

In my case the problem was the SimpleITK version on my Debian subsystem. I upgraded from 2.0.2 to 2.1.0 (latest) and the problem disappeared.

You have probably resolved your issue already, but maybe this comment will help someone else.

I’m using version 2.4 and I’m having the same problem! I want to keep only one slice out of a 4D fMRI dataset, but it’s impossible because I run out of RAM…

import numpy as np
import SimpleITK as sitk

def get_image_data(path):
    data = sitk.ReadImage(path)
    image_data = np.asarray(sitk.GetArrayFromImage(data), dtype=np.float32)
    image_data = image_data[50]
    image_data = image_data[:,:,45]
    return image_data

Loop:
    data_list.append(get_image_data(root + "/" + file))

Is there any way to free the memory?

Hello @Chris,

The way your code is set up, the reference count of the SimpleITK image is decremented when get_image_data returns, and the Python garbage collector is then expected to reclaim the memory. There is not much you can do about this because it is an automated process.

A slight change to the code will significantly reduce the memory footprint. GetArrayFromImage creates a copy of the pixel data, and asarray creates yet another copy if the dtype of the ndarray returned from GetArrayFromImage isn’t already float32. To reduce the memory footprint, extract the desired slice from the SimpleITK image (xy_slice = data[:,:,z_index, time_index]) and then get the numpy array from it using GetArrayFromImage.

Depending on the file format in which your data is stored, you may be able to limit the reading to the desired slice without loading the whole image. Please see the Jupyter notebook section titled Streaming Image IO. I highly recommend skimming all the notebooks in the toolkit’s notebook repository; you will find generally useful information there.

The problem is not with retaining the SimpleITK objects, but with retaining the numpy arrays.
See Indexing on ndarrays — NumPy v2.1 Manual – look for the Note about numpy slicing.

The way your code is set up, the numpy array for the whole image is kept in memory by the slice, because a basic numpy slice is a view into the original array rather than a copy. Changing the last line before the return to something like

     image_data = image_data[:,:,45].copy()

should improve things significantly.
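
The view-versus-copy behavior can be checked with numpy alone. This is a small illustration in which the array volume is a stand-in for the full array returned by GetArrayFromImage, with an arbitrary shape chosen for the example:

```python
import numpy as np

# Stand-in for a full volume returned by GetArrayFromImage (arbitrary shape).
volume = np.zeros((40, 64, 64), dtype=np.float32)

view = volume[:, :, 45]            # basic slicing returns a view, not new data
copied = volume[:, :, 45].copy()   # independent array holding only the slice

# The view references the full array through its .base attribute, so the
# full volume cannot be freed while the view is still alive.
print("view keeps the volume alive:", view.base is volume)
print("copy owns its own data:", copied.base is None)
print("bytes in volume vs. copy:", volume.nbytes, copied.nbytes)
```

Appending view-like slices to a list therefore keeps every full volume in memory, while appending copies keeps only the slices.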

An additional improvement would be to specify the region to be read to the SimpleITK reader. Here is an example:
https://simpleitk.readthedocs.io/en/master/link_AdvancedImageReading_docs.html
