RegistrationV4 and BsplineTransform Parallelization

Hello everyone.

As I mentioned in my previous posts, I am working on a deformable registration pipeline.
I am running my tests on a supercomputer and noticed that the code I wrote using the ITK classes ItkRegistrationV4 and ItkBsplineTransform does not run in parallel.

How can I make it parallel? Is there a built-in way to do that in ITK?

Registration code in ITK is parallel. The problem might be something peculiar to the supercomputer's setup. You can change the number of threads to see how that affects speed. Take a look at the code which initializes the maximum number of threads:


Some speed improvement is expected if you distribute a registration task to run on more cores, but as you slice the problem into smaller pieces, the relative overhead increases. At some point adding more cores will not improve the performance anymore but will start to slow things down.
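
A rough way to see where that turning point lies on a given machine is to time the same run at several thread counts. The sketch below assumes the executable honors the ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS environment variable (it comes up further down in this thread). The thread counts and the `sh -c 'true'` stand-in command are placeholders chosen for illustration, not part of anyone's actual setup.

```shell
# Time one run per thread count and collect the results.
# Assumption: the thread counts below are arbitrary; pick values that suit your machine.
timings=""
for n in 1 2 4 8; do
  start=$(date +%s)
  # Stand-in command so the sketch runs anywhere.
  # Replace `sh -c 'true'` with your registration executable and its arguments.
  ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$n sh -c 'true'
  end=$(date +%s)
  # date +%s has one-second resolution, which is enough for long registrations.
  timings="$timings$n threads: $((end - start)) s
"
done
printf '%s' "$timings"
```

If the times stop dropping (or start rising) beyond some thread count, that is probably the point where the overhead dominates.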

If you want hundreds of cores to work efficiently in parallel on a single registration task, then you may need to implement new registration algorithms, or at least redesign the implementation for your special computing architecture.

If you do batch processing, then you can achieve high efficiency by running multiple independent registration tasks in parallel (each independent registration running on one or a few threads).
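
The batch approach could look roughly like the sketch below. It assumes an xargs that supports the -P option (GNU and BSD versions do) and uses the ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS environment variable to pin each job to one thread. The case names, the job count, and the `echo` stand-in are hypothetical; swap in your own inputs and registration command.

```shell
# How many registrations to run at the same time.
# Assumption: 4 is arbitrary; a value near the number of cores is a common starting point.
jobs=4

# Hypothetical case identifiers, one per line; each line becomes one independent job.
# Every job runs with ITK limited to a single thread.
results=$(printf '%s\n' case1 case2 case3 case4 case5 case6 |
  xargs -P "$jobs" -I{} env ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1 \
    sh -c 'echo "$1 threads=$ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS"' _ {} |
  sort)

# The stand-in only reports what each job saw; a real job would run the registration.
printf '%s\n' "$results"
```

Because the jobs are independent, they finish in no particular order, which is why the output is sorted before printing.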


All right. I partially got it.
What I still did not get is HOW can I change the NUMBER OF THREADS?
Or how can I distribute my registration task over more cores?
Do I have to include a specific class? Wrap my code in some other loop?

I have two computers: one with 7 cores, the other with 64 cores.

My code right now is basically the DeformableRegistration8.cxx example.

Thank you very much!

Try invoking in bash like this:

ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1 ./DeformableRegistration8 /path/to/input1 /path/...

This sets the environment variable ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS only for the duration of the command that follows; your shell session itself is not changed.
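
To illustrate the scoping, here is a small sketch that uses `sh -c` as a stand-in for the registration executable (the stand-in and the value 4 are only for illustration). The child process sees the variable when it is given as a prefix on the command line, and the next command does not.

```shell
# Start from a clean state so the result does not depend on the current session.
unset ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS

# Prefix form: the variable exists only in the environment of this one command.
during=$(ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=4 \
  sh -c 'echo "${ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS:-unset}"')

# The next command no longer sees it.
after=$(sh -c 'echo "${ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS:-unset}"')

echo "during: $during"
echo "after: $after"
```

If you would rather set it once for every command in the session, `export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=4` should do that until the shell exits.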