Thread safety of itk::ObjectFactoryBase::Initialize()

Hi all,

In testing my own app with clang thread sanitizer (TSan) I hit a race condition with ITK. I reduced it to many threads running this code:

typedef itk::Image<float, 2> MyTypedef;
MyTypedef::New();

The issue is that two threads end up in itk::ObjectFactoryBase::Initialize() simultaneously and race on the m_PimplGlobals->m_Initialized flag.

Should I be able to do this? i.e. is ITK missing some mutex (or other synchronization) here?

Or is it my responsibility to first “prime” ObjectFactoryBase somehow? Doing so seems to fix my issue. If I create a dummy itk::Image on the main thread before spawning all my other threads, then TSan no longer complains.

Interestingly, I tried to create a reproducer test case by hijacking the existing itkImageAlgorithmCopyTest2 test, but initially I could not repro. I eventually determined this is because itkTestDriverIncludeRequiredIOFactories.cxx basically “primes” ObjectFactoryBase on the main thread when it calls RegisterRequiredFactories.

Cheers,

Sean

I remember @fbudin last worked on that. I thought that solution was to create one global singleton, and put all singleton-like globals there. I don’t know whether ObjectFactoryBase is part of that, and if not whether it could or should be.

Would it be sufficient to declare ObjectFactoryBasePrivate::m_Initialized as std::atomic_bool? Instead of bool, at:

From C++11, the initialization of static local variables (declared locally within a function) is by definition thread-safe, even without explicitly using atomic types and mutexes. This “magic statics” feature might also be helpful…

@Niels_Dekker The origin of the PimplGlobals is related to system differences with how duplicate “global” variables are resolved at run time when they are contained in multiple dynamically loaded libraries. I don’t recall ( and may not have been aware ) all the details now, but it is quite a tricky issue!

1 Like

@seanm Looking at the code, there are several levels of calls which are not concurrent thread safe. The Singleton object, the initialization of m_PimplGlobals, missing unique lock on the initialization code.

So a lock around the initialization code may some the problem you are encountering (most of the time), IMHO there is a good bit of work todo to get it robust.

Yes I think this is the best ( easiest ) thing to do right now.

@Niels_Dekker I think the atomic operations would be very useful to make the SingletonIndex class concurrent thread safe without the use of mutex. If we assume there is a limited number of global singletons stored in the index ( which is very reasonable assumption ), then the SingletonData type could be changed from a dynamic std:map, to some type of hash map where we can set a fix size at compile time.

Yeah, I started tweaking some of the locking, but TSan just keeps finding additional races. I clearly should not be using this code in multiple threads this way, at least not yet.

What would be the best way to prime ObjectFactoryBase then? From the public methods, maybe calling ReHash() would do?

Meanwhile, I’m setting up a nightly TSan build that will be on cdash every day.

And I have a few patches coming…

Sean

2 Likes