I wanted to ask about the interaction between Python parallelism and SimpleITK. I’ve successfully wrapped the calls I need from SimpleITK in concurrent.futures parallel calls, but I’m not sure whether SimpleITK should be called from a ThreadPool or a ProcessPool. I am aware of the need to control SimpleITK/ITK’s internal threading when I do this.
ITK by default will identify the number of cores and use that for internal thread parallelism.
Use multi-threading if it is a light task. I’d say registration is a heavy task, so I would use process-level parallelism. In either case, multi-threaded or multi-process, you need to consider both the number of threads you want ITK to use and the number of threads/processes you allocate on top of that for the registrations running in parallel. If you do not, you will overwhelm the system: running multiple registrations in parallel, each with multi-threading inside it, requires more cores than exist. It will then actually take longer to complete the work than just running the registrations one after the other. This is a balancing act between num_process*num_threads_per_process and the number of cores: the product should not exceed the core count.
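To make the balancing act concrete, here is a minimal sketch of splitting a core budget. The function name and its parameters are made up for illustration; the only rule it encodes is that processes times ITK threads per process should stay within the number of cores.

```python
def split_core_budget(total_cores, itk_threads_per_process=1):
    """Return (num_processes, itk_threads_per_process) such that
    num_processes * itk_threads_per_process <= total_cores.

    Hypothetical helper for illustration, not part of ITK or SimpleITK.
    """
    if total_cores < 1 or itk_threads_per_process < 1:
        raise ValueError("both arguments must be at least 1")
    # Never ask for more ITK threads per process than there are cores.
    threads = min(itk_threads_per_process, total_cores)
    # Integer division leaves leftover cores idle rather than oversubscribing.
    processes = max(1, total_cores // threads)
    return processes, threads
```

For example, with 16 cores and 4 ITK threads per registration this gives 4 parallel registrations; with 1 ITK thread per registration it gives 16.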
Multiple single-core CPUs, a cluster:
Run a single registration at a time on each node and let ITK use as many threads as it wants. This assumes that ITK detects the number of cores correctly. I think it does, but I am not sure. I usually provide the number of cores as a parameter and split it between ITK threading and the parallel processes.
Automatically detecting the number of cores available to you may not be trivial when working on a shared resource (e.g. SLURM). The CPU has 128 cores but you are allocated 16 for your work (if I remember correctly os.cpu_count() will return 128 and not 16, but not sure about this).
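If you do want to detect the count yourself, a rough sketch follows. It assumes the job was submitted with --cpus-per-task, in which case SLURM sets SLURM_CPUS_PER_TASK, and otherwise falls back to the CPU affinity mask, which is Linux-only. The Python documentation notes that os.cpu_count() reports the CPUs in the system rather than the ones the current process may use. The function name is made up for illustration.

```python
import os


def available_cores(env=None):
    """Best-effort guess at the number of cores this process may use.

    Hypothetical helper; pass a dict as env to override os.environ.
    """
    env = os.environ if env is None else env
    # Set by SLURM only when --cpus-per-task was requested for the job.
    value = env.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    # Linux only: the set of CPUs this process is allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Last resort: all CPUs in the machine (cpu_count may return None).
    return os.cpu_count() or 1
```

Passing the number explicitly as a parameter, as described above, remains the safer option when you know your allocation.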
Thanks for the info. I’ve already been playing around with ITK threads vs. a concurrent.futures ProcessPool, and so far I’ve found that ProcessPool parallelization is far more effective than the native multithreading in ITK.
My question was mostly about the underlying ITK/SimpleITK code, and if it should be run under a ProcessPool or ThreadPool.
I can report empirically that it should be run under a ProcessPool.
The simplest way to do parallelism is to go purely task-based: set ITK’s global default number of threads to 1 and the number of tasks to the number of CPUs available.
SimpleITK can be run effectively with either a ProcessPool or a ThreadPool, as it releases the Python GIL, which allows threaded parallelism.
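The pure task-based setup could look roughly like the sketch below. It uses a pool initializer so that every worker process sets ITK's global default to one thread. The names use_single_itk_thread, image_mean and run_in_processes are made up for illustration, and image_mean is only a placeholder for the real work (e.g. a registration).

```python
from concurrent.futures import ProcessPoolExecutor


def use_single_itk_thread():
    """Pool initializer: runs once in every worker process."""
    import SimpleITK as sitk  # imported in the worker process
    sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(1)


def image_mean(path):
    """Hypothetical placeholder task; substitute the real per-image work."""
    import SimpleITK as sitk
    image = sitk.ReadImage(path)
    stats = sitk.StatisticsImageFilter()
    stats.Execute(image)
    return stats.GetMean()


def run_in_processes(task, items, n_workers):
    """Run task(item) for each item, one single-threaded ITK process per worker."""
    items = list(items)
    if not items:
        return []  # nothing to do, so do not start a pool
    with ProcessPoolExecutor(max_workers=n_workers,
                             initializer=use_single_itk_thread) as pool:
        # map() returns the results in the same order as items.
        return list(pool.map(task, items))
```

Call it from under an if __name__ == "__main__": guard, for example run_in_processes(image_mean, paths, n_workers=16) with your own list of paths, so that start methods which re-import the main module do not re-run the script.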
If you want to combine ITK parallelism with task-based parallelism, you can also try both the “PLATFORM” and the “POOL” multi-threaders. “PLATFORM” will create per-task sets of threads, while “POOL” will create a persistent pool of threads that is shared within the current process.
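A sketch of switching the multi-threader from Python, assuming a SimpleITK 2.x build where ProcessObject exposes SetGlobalDefaultThreader and GetGlobalDefaultThreader (please check against your installed version). The wrapper function and the VALID_THREADERS tuple are made up for illustration and only cover the two names mentioned here.

```python
VALID_THREADERS = ("PLATFORM", "POOL")  # the two names discussed in this thread


def configure_itk_threading(threader, threads_per_task):
    """Select the ITK multi-threader and thread count for the current process.

    Hypothetical wrapper; returns the threader name ITK reports afterwards.
    """
    if threader not in VALID_THREADERS:
        raise ValueError(f"threader must be one of {VALID_THREADERS}")
    if threads_per_task < 1:
        raise ValueError("threads_per_task must be at least 1")
    import SimpleITK as sitk  # imported late so the checks above run without it
    sitk.ProcessObject.SetGlobalDefaultThreader(threader)
    sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(threads_per_task)
    return sitk.ProcessObject.GetGlobalDefaultThreader()
```

With a ProcessPoolExecutor this can be passed as initializer=configure_itk_threading with initargs=("POOL", 4), so each worker process is configured once; timing both threaders on your own workload is the only way to know which one wins.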
ITK and SimpleITK will automatically detect the allocated number of CPU slots in a SLURM environment. This is a detail that can generally be skipped.
On a side note, I have pondered whether a single-thread multi-threader for ITK would be useful when integrating SimpleITK/ITK into an application/platform that is already parallel.