I have seen that Facebook (Niebler et al.) recently published an implementation of executors: libunifex.
I heard about executors from @matt.mccormick, and I have been thinking about the steps ITK would need to take to adopt this approach, which could make accelerators (GPUs) easier to use.
In my view, the current `GPUImage` base class is a great example of plumbing OpenCL kernels into the ITK infrastructure, but having to reimplement each filter for the GPU in a separate class seems cumbersome. There have been few contributions of new GPU filters since its inception.
Executors seem to be the C++ way to plumb accelerators in alongside the CPU in a scalable way.
I wonder, though, what the optimal architecture for implementing this in ITK would be.
Right now ITK has single-threaded code with optional multithreading.
I guess one way would be to use a `GenerateData`-like function, with the option to call accelerators: `execute(single_thread)`, `execute(multi_thread)`, `execute(openCL)`.
So I am guessing the work to transfer data and set up options for the accelerators would need to happen when the executor is initialized. That could be done once in a base class, leaving each filter to provide functions specific to the different accelerators.
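To make the idea a bit more concrete, here is a rough sketch of what such a dispatch could look like, using only the standard library. It is not libunifex and not existing ITK code: the tag types `single_thread_t` and `multi_thread_t`, the `FilterBase` class, and the `ScaleFilter` example are all hypothetical names chosen for illustration. The base class owns the shared setup and picks a backend from the tag, while the filter only supplies the per-range work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical execution tags, mirroring execute(single_thread) and
// execute(multi_thread) from the post. They are not libunifex or ITK types.
struct single_thread_t {};
struct multi_thread_t { unsigned workers = 2; };

// Hypothetical base class: shared setup is written once here, and the
// backend is selected by overload resolution on the tag type.
template <typename Derived>
class FilterBase {
public:
  template <typename Executor>
  std::vector<float> execute(Executor ex, const std::vector<float>& input) {
    std::vector<float> output(input.size());  // common allocation step
    run(ex, input, output);
    return output;
  }

private:
  // CPU, one thread: process the whole buffer in a single call.
  void run(single_thread_t, const std::vector<float>& in,
           std::vector<float>& out) {
    self().process_range(in, out, 0, in.size());
  }

  // CPU, several threads: split the buffer into contiguous chunks.
  void run(multi_thread_t ex, const std::vector<float>& in,
           std::vector<float>& out) {
    const std::size_t n = in.size();
    const std::size_t workers = std::max<std::size_t>(1, ex.workers);
    const std::size_t chunk = (n + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
      const std::size_t end = std::min(n, begin + chunk);
      pool.emplace_back([this, &in, &out, begin, end] {
        self().process_range(in, out, begin, end);
      });
    }
    for (auto& t : pool) t.join();
  }

  // An OpenCL tag would add one more run() overload here, doing the
  // host/device transfers and the kernel launch (not implemented).

  Derived& self() { return static_cast<Derived&>(*this); }
};

// Hypothetical filter: it only describes the per-element work.
class ScaleFilter : public FilterBase<ScaleFilter> {
public:
  explicit ScaleFilter(float factor) : factor_(factor) {}

  void process_range(const std::vector<float>& in, std::vector<float>& out,
                     std::size_t begin, std::size_t end) const {
    assert(end <= in.size() && out.size() == in.size());
    for (std::size_t i = begin; i < end; ++i) out[i] = factor_ * in[i];
  }

private:
  float factor_;
};
```

With this, `ScaleFilter(2.0f).execute(multi_thread_t{4}, pixels)` and `ScaleFilter(2.0f).execute(single_thread_t{}, pixels)` run the same filter code on different backends. A real design would presumably replace the tags with actual executor or scheduler objects, but the split between base class and filter would stay similar.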
I am just testing the waters, but it seems an interesting approach that would be standardized in C++ (23?) and is already usable from C++17 with this third-party library.