GPU filter (anisotropic diffusion) pipeline - memory allocation issue (itk-4.7.0)

Dear all,

I’m not using the newest version of ITK (4.7.0) because I have an older code framework for which an upgrade to a more recent ITK version is only planned for the beginning of next year.
Still, I hope you can help me with the following issue related to the older ITK version.

I’m trying to denoise a series of 2D images using the GPU-accelerated ITK anisotropic diffusion image filter. Here is an extracted code fragment that shows what I’m doing:

{
    for (int i = 0; i < 1000; ++i)
    {
      typedef itk::GPUImage<float, 3> ImageType;

      // Read the input image from disk.
      typedef itk::ImageFileReader<ImageType> ReaderType;
      ReaderType::Pointer r = ReaderType::New();
      r->SetFileName("c:\\temp\\float.mhd");
      r->Update();
      ImageType::Pointer itki = r->GetOutput();

      // Set up the GPU-accelerated anisotropic diffusion filter.
      typedef itk::GPUGradientAnisotropicDiffusionImageFilter<ImageType, ImageType> FilterType;
      FilterType::Pointer f = FilterType::New();
      f->SetInput(itki);
      f->SetNumberOfIterations(10);
      f->SetTimeStep(0.03);
      f->SetConductanceParameter(3.0);
      f->SetUseImageSpacing(true);

      try
      {
        std::cout << "denoising #" << (i + 1) << std::endl;
        f->Update();
      }
      catch (itk::ExceptionObject &e)
      {
        std::cerr << "ERROR: " << e << std::endl;
        continue;
      }

      // Write the denoised result back to disk.
      typedef itk::ImageFileWriter<ImageType> WriterType;
      WriterType::Pointer w = WriterType::New();
      w->SetInput(f->GetOutput());
      w->SetFileName("c:\\temp\\denoised.mhd");
      w->Update();
    }
}

The issue is that after several hundred images, I get the following error:

ERROR:
itk::ExceptionObject (0000004525B09970)
Location: "unknown"
File: C:\dev-libs\itk-4.7.0dcmtk\src\Modules\Core\GPUCommon\src\itkGPUDataManager.cxx
Line: 127
Description: OpenCL Error : CL_MEM_OBJECT_ALLOCATION_FAILURE

So I assume that the implicit destruction of the smart pointers for the filter (f) and the image (itki), as well as the associated readers/writers, at the end of each loop iteration does not deallocate the GPU memory that was used for filtering.
If I watch the GPU memory graph in the Windows 10 Task Manager while running the example above, I also see GPU memory usage growing and growing, with no deallocation at any point.
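
For what it’s worth, explicitly resetting the smart pointers at the end of the loop body, which runs the same destructors just a bit earlier, makes no difference in a quick sketch like this:

      // Drop all references explicitly so the reader, filter and image
      // objects are destroyed right here instead of at the end of scope.
      w = nullptr;
      f = nullptr;
      itki = nullptr;
      r = nullptr;
      // The GPU memory graph in Task Manager keeps growing regardless.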

Can you please help me and explain what I have to do to make ITK (the associated manager objects or similar) release the GPU memory at the end of each iteration of the for loop?
I played around with decrementing the reference counters of the filter and with the dirty flags of the image, but this did not work out.
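Concretely, those experiments looked roughly like this sketch (GetGPUDataManager(), SetGPUBufferDirty(), and Initialize() are the itkGPUImage.h / itkGPUDataManager.h calls I am assuming here):

      // Attempts at the end of each iteration; none of them released the buffers:
      itki->GetGPUDataManager()->SetGPUBufferDirty(); // invalidate the GPU copy
      itki->GetGPUDataManager()->Initialize();        // hoping this releases the OpenCL buffer
      f->GetOutput()->GetGPUDataManager()->Initialize();
      // f->UnRegister(); // manually decrementing the filter's reference count (did not help either)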

Any supportive input is very much appreciated!

Thanks,
Phil

This might be the same bug as 1887. It likely involves some low-level GPU memory management plumbing. If you provide a runnable variant of the above code fragment, it should help in fixing it.

Note, however, that ITK is community-supported right now; there is no dedicated funding. We are prioritizing higher-impact bugs and easier-to-fix bugs.
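
For example, a self-contained reproducer along these lines would do. It is only a sketch: the synthetic 128-cubed volume and its uniform content are placeholders for the c:\temp\float.mhd input, the writer is omitted to keep it minimal, and it assumes an ITK 4.7 build with the GPU modules and OpenCL enabled:

    #include "itkGPUImage.h"
    #include "itkGPUGradientAnisotropicDiffusionImageFilter.h"
    #include <iostream>

    int main()
    {
      typedef itk::GPUImage<float, 3> ImageType;

      for (int i = 0; i < 1000; ++i)
      {
        // Create a synthetic input volume in place of reading float.mhd.
        ImageType::Pointer itki = ImageType::New();
        ImageType::SizeType size;
        size.Fill(128); // placeholder size
        itki->SetRegions(size);
        itki->Allocate();
        itki->FillBuffer(1.0f); // placeholder content

        // Same filter setup as in the original fragment.
        typedef itk::GPUGradientAnisotropicDiffusionImageFilter<ImageType, ImageType> FilterType;
        FilterType::Pointer f = FilterType::New();
        f->SetInput(itki);
        f->SetNumberOfIterations(10);
        f->SetTimeStep(0.03);
        f->SetConductanceParameter(3.0);
        f->SetUseImageSpacing(true);

        try
        {
          std::cout << "denoising #" << (i + 1) << std::endl;
          f->Update();
        }
        catch (itk::ExceptionObject &e)
        {
          std::cerr << "ERROR: " << e << std::endl;
          continue;
        }
      }
      return 0;
    }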