I am currently working on a patch that speeds up the ANTs Atropos application. So far I have used OpenMP to implement the acceleration. The suggested code changes and the initial discussion can be found in PR https://github.com/ANTsX/ANTs/pull/1452
In essence, the change parallelizes a loop using the OpenMP omp parallel pragma in C++.
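For context, the general shape of such a change is sketched below. This is only an illustration and not the Atropos code: ScaleAndRoot and its per-element computation are hypothetical stand-ins, and the sketch assumes the work-sharing form of the pragma (parallel for) with independent loop iterations.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-voxel work; not the actual Atropos loop.
std::vector<double> ScaleAndRoot(const std::vector<double> & input, double scale)
{
  std::vector<double> output(input.size());
  const long long n = static_cast<long long>(input.size());

  // Each iteration reads input[i] and writes only output[i], so the
  // iterations are independent and can be distributed across threads.
  // If OpenMP is not enabled at compile time, the pragma is ignored and
  // the loop simply runs serially.
#pragma omp parallel for
  for (long long i = 0; i < n; ++i)
  {
    output[i] = std::sqrt(scale * input[i]);
  }
  return output;
}
```

The key property is that no iteration touches data written by another one; the same property is needed for any ITK-based replacement.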
I have been trying to wrap my head around the ITK multi-threading framework, but I cannot seem to find a way to convert my current OpenMP-accelerated code https://github.com/ANTsX/ANTs/commit/4c7f75a3f4a56e22387bf09097b983515ad854a5 into code that uses the ITK building blocks described in the documentation.
If anyone has any ideas, suggestions, or resources such as examples or tutorials, it would be of great help.
Hi @dzenanz. Thank you for your reply. I really appreciate your help and the pointers to the code examples. I have tried my best to think of ways to apply the examples to my problem, but I still fail to see how. As far as I can see, the examples that you linked parallelize an iteration over an integer index, and when I try to apply this, it seems hard to build a general solution that works for multiple dimensionalities.
Maybe a parallelized version of the iterators, or some sort of wrapper function, exists and can be applied?
I’m trying to replace/accelerate the loop iteration:
I have been looking at the ParallelizeImageRegion construct as used in useparallelizeimageregion. Maybe this construct can be used? However, for this construct I also fail to see how the code should be put together (how ParallelizeImageRegion should be used together with the ConstNeighborhoodIterator and ImageRegionIterator).
Do you have any ideas or thoughts about whether, and how, this sort of acceleration of the ConstNeighborhoodIterator and ImageRegionIterator can be achieved using the ITK primitives?
For parallelizing region iteration, ParallelizeImageRegion is the right choice. In essence, the old code goes into the lambda function of ParallelizeImageRegion, and it iterates over the lambda’s parameter (partial region) instead of the whole region. Some other changes might be needed to enable parallel computation. Here is a simpler example:
Here is a non-trivial example:
and the refactoring change from the old to the new, parallel implementation:
As this is my first go at using the ITK multithreading primitives, I'm not entirely sure that my implementation follows ITK best practice. If you could glance over these changes and comment on the implementation that I have made, it would be of great value. Any suggestions for improvements are very welcome.