Threading refactoring renames


(Dženan Zukić) #1

In this patch (part of another threading refactoring pass), @blowekamp suggested renaming threads to workers, partly to distinguish them from the old concept of threads and partly to adopt standard thread-pool terminology. Are there any strong opinions about this? @hjmjohnson @mstaring @matt.mccormick @fbudin @warfield @phcerdan @Niels_Dekker


(Bradley Lowekamp) #2

It looked like you were proposing to replace the “NumberOfThreads” with the “NumberOfWorkUnits”:
http://review.source.kitware.com/#/c/23489/2/Modules/Core/Common/include/itkProcessObject.h

I believe both concepts are still useful.


(Matt McCormick) #3

Both NumberOfThreads and NumberOfWorkUnits are useful concepts, and it will be fantastic to have them separated in ITK 5.

As explained by @benoit and @warfield in this thread: Multi-threader refactoring, the separation of these two is critical for load balancing, which in turn is critical for performance.

It will be a huge long term benefit to the toolkit if we:

  • Have algorithms specify their jobs in terms of work units instead of threads.
  • Have a maximum number of threads specified on the itk::MultiThreaderBase object. Since the number of threads can fluctuate throughout execution, and since the threading backends in general sometimes do not want to spawn too many threads, a user-specified maximum is better than an exact value.

(Simon Warfield) #4

I think a convenient way to address the issue of how to control the number of threads available to the program is provided by the TBB task_scheduler_init API. This API controls the number of threads used by the task scheduler, the stack size for worker threads, and when the scheduler is created and destroyed.

This is separate and distinct from the details of providing jobs/tasks to the workpile.

In our example filters, we allow a grainsize so that jobs are generated (by spatial decomposition) in numbers far exceeding the thread count, which enables efficient scheduling of jobs onto threads and effective dynamic load balancing. We made an effort to ensure that achieving this efficiency did not require knowing which thread was running where at any time.

Some considerations are described here:

https://software.intel.com/en-us/node/506296

https://software.intel.com/en-us/blogs/2011/04/09/tbb-initialization-termination-and-resource-management-details-juicy-and-gory

https://software.intel.com/en-us/blogs/2010/12/28/tbb-30-and-processor-affinity


(Dženan Zukić) #5

In this proposed patch, I am doing that.