Faster brightness/contrast augmentation?

Floowey · June 30, 2022, 9:06am

Hello,
I intend to alter the contrast/brightness of my images in my PyTorch Dataset pipeline using the sitk.AdaptiveHistogramEqualization method to make my neural network more robust against such differences. However, I found that it decreases my step rate from 5 it/s to just 1 it/s, which in turn increases my training time from a few days to weeks, which I can’t afford. Am I perhaps misusing the method, or is there a different, faster alternative?

The dataset looks like this

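from random import uniform  # assumed source of uniform(); the original post omits imports

import SimpleITK as sitk
from torch.utils.data import Dataset


class CTChangeContrastDataset(Dataset):
    def __init__(self, dataset: Dataset):
        self.data = dataset

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        im, label = self.data[idx]
        # Draw new equalization parameters on every fetch.
        rndAlpha = uniform(0,1)
        rndBeta = uniform(0,1)
        # The filter is re-run each time an item is loaded.
        new_im = sitk.AdaptiveHistogramEqualization(im, alpha=rndAlpha, beta=rndBeta)
        return new_im, label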

dzenanz · June 30, 2022, 9:50am

You are re-doing the equalization each time an image is fetched. Of course it is slow. All fancy augmentations are slow. You might consider doing this augmentation with some probability, e.g. 10%. That should allow your NN to learn it and be resistant to it, but would not be overly computationally expensive.
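A minimal sketch of the "augment with some probability" idea, not code from anyone in this thread. The class name `RandomlyAugmentedDataset` and the `augment`, `p` and `rng` parameters are hypothetical. It is written as a plain class with `__len__` and `__getitem__`; in the pipeline above it would presumably subclass torch.utils.data.Dataset, and `augment` would wrap the sitk.AdaptiveHistogramEqualization call.

```python
import random


class RandomlyAugmentedDataset:
    """Wrap an indexable dataset and apply `augment` to a fraction `p` of fetches."""

    def __init__(self, dataset, augment, p=0.1, rng=None):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.data = dataset
        self.augment = augment  # hypothetical callable: image -> image
        self.p = p
        # A dedicated generator makes the behaviour reproducible when seeded.
        self.rng = rng if rng is not None else random.Random()

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        im, label = self.data[idx]
        # rng.random() is uniform on [0, 1), so this branch runs with probability p.
        if self.rng.random() < self.p:
            im = self.augment(im)
        return im, label
```

With p=0.1 the expensive call runs on roughly one fetch in ten, so the average per-item cost of the augmentation drops by about the same factor.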

zivy · June 30, 2022, 12:56pm

Hello @Floowey,

Deterministic, computationally complex augmentations are best done using a caching mechanism so that they are only performed once during training. For example, see the MONAI implementations for PersistentDataset and CacheDataset.
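To illustrate the caching idea in its simplest form, here is a hedged in-memory sketch. It is not MONAI's implementation of PersistentDataset or CacheDataset; the class name `CachedTransformDataset` and the `transform` parameter are made up for this example.

```python
class CachedTransformDataset:
    """Apply a deterministic `transform` once per index and reuse the result."""

    def __init__(self, dataset, transform):
        self.data = dataset
        self.transform = transform  # hypothetical callable: image -> image
        self._cache = {}            # idx -> (transformed image, label)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Compute on first access only; later fetches hit the cache.
        if idx not in self._cache:
            im, label = self.data[idx]
            self._cache[idx] = (self.transform(im), label)
        return self._cache[idx]
```

Two caveats with a sketch like this: if the transform draws random parameters (as the alpha/beta in the original post do), caching freezes whatever was drawn on the first fetch, and with multiple DataLoader worker processes each worker would hold its own copy of the cache.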

blowekamp · June 30, 2022, 1:20pm

For the size of images used for deep learning it should not be that time consuming. How long does the AdaptiveHistogramEqualizationFilter alone take to run? 4 seconds seems way too long.

What is your image size, pixel type and version of SimpleITK/ITK?

The current implementation uses a moving histogram, which should be fairly efficient.

Floowey · June 30, 2022, 1:39pm

Thanks for your replies. Simply not doing it in every step is totally reasonable.

@zivy thank you, I will take a look at those. I’ll do a big part of my data preparation before training already, but maybe this could improve it too - gotta hope my machine can handle it.

@blowekamp My images are 35x35x35 floats and I’m using SITK version 2.1.1. One call takes on the order of 0.33-0.40 seconds. I’m sorry if I didn’t say it clearly, but the numbers I gave are the iterations/steps per second, which is what decreases. One step loads 4 images.

blowekamp · June 30, 2022, 1:56pm

Not sure if that timing is just the Eq step, but this is what I am getting locally:


In [12]: img = sitk.Image([35,35,35], sitk.sitkFloat32)

In [13]: img = sitk.Noise(img)

In [14]: %timeit -n 1 out = sitk.AdaptiveHistogramEqualization(img, alpha=1, beta=1, radius=(3,3,3))
4.78 ms ± 693 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

Floowey · June 30, 2022, 3:40pm

I couldn’t quite manage to reproduce that timeit in Visual Studio. I timed only the AdaptiveHistogramEqualization call, but I noticed I get float64 from a previous normalization step. I doubt that it would make that much of a difference, but I’ll change it anyway. Thank you!

For now I’ll wait and see whether I get satisfying results just not doing it every time.
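For timing a single call outside IPython, where %timeit is not available, the standard-library timeit module can be used directly. The helper below is only a sketch; the name `time_call` and its parameters are hypothetical, and the workload is a stand-in so the snippet runs without SimpleITK.

```python
import timeit


def time_call(fn, repeats=7, number=1):
    """Return (best, mean) seconds per call of `fn` over `repeats` runs of `number` calls."""
    totals = timeit.repeat(fn, repeat=repeats, number=number)
    per_call = [t / number for t in totals]
    return min(per_call), sum(per_call) / len(per_call)


# Stand-in workload. In the thread's setting this could instead be, e.g.:
# time_call(lambda: sitk.AdaptiveHistogramEqualization(img, alpha=1, beta=1, radius=(3, 3, 3)))
best, mean = time_call(lambda: sum(range(10_000)))
print(f"best {best * 1e3:.3f} ms, mean {mean * 1e3:.3f} ms")
```

To compare pixel types, the same helper could be run once on the float64 image and once on a copy converted with sitk.Cast(img, sitk.sitkFloat32).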