Highly concurrent SimpleITK program written in fsharp crashes a lot, and may return inconsistent images

Hi all,

We are having serious problems with the VoxLogicA tool, which uses SimpleITK in a highly concurrent manner; the problems manifest themselves only on a recent machine with a 16-core Intel Core i7 processor and 32GB of RAM.

The tool allocates plenty of images as the result of computing thousands of tasks concurrently; images are never de-allocated (which is indeed a missing feature). However, in the case study we are working on, the tool currently uses about 6GB of resident memory, so I would not expect it to have run out of RAM. Anyway:

Problem 1) Sometimes we get incorrect results and, inspecting the intermediate images, we find images that clearly are an overlap of other computed images, as if the image buffer for some calculation had been reused. I do understand this may be related to “lazy” behaviour in SimpleITK (of which I have no knowledge except a remark by @blowekamp in another thread); and I do understand that I need to use “MakeUnique”, but I don’t really know how. I’ve tried to add “MakeUnique” before every image creation and also before accessing internal image buffers; the latter is probably not necessary if done at creation time? However, the program has become slower, and problem 2 has become much more prominent.

Problem 2) We frequently get a “System.AccessViolationException: Attempted to read or write protected memory” in “Invoke” which, I believe, should be the invocation of external functions.

I still hope all of this is just due to insufficient memory, and therefore to missing garbage collection in VoxLogicA, but shouldn’t I get some other form of exception in that case?

Hello,

It sounds like you are developing an interesting environment. If you could narrow down the issue to a small reproducible example that could be used as a test, that would enable us to track down the problem.

Please try the latest development version of SimpleITK, whose compiled binaries use ITK 5.0, to see if there is any change in behavior. With ITKv4 the “Modified Time Stamp” was only 32 bits; it was finally changed to 64 bits in ITKv5. While I have not heard of this issue with ITK itself, it has come up several times with SimpleITK, causing filters not to execute properly.
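
For illustration only, here is one way a 32-bit modification counter could make an "is my input newer than my last run?" check go wrong, assuming the counter wraps around. This is a generic C++ sketch, not ITK's pipeline code; NeedsUpdate, Advance and DemonstrateWraparound are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Generic staleness check: re-run only if the input was modified after the
// last execution. Templated on the counter width to compare 32 vs 64 bits.
template <typename Counter>
bool NeedsUpdate(Counter input_modified, Counter last_executed) {
    return input_modified > last_executed;
}

// Advance a modification counter by `steps` ticks; the cast truncates the
// result to the counter's width, which models wraparound for narrow counters.
template <typename Counter>
Counter Advance(Counter now, std::uint64_t steps) {
    return static_cast<Counter>(now + steps);
}

inline void DemonstrateWraparound() {
    // 32-bit counter close to its maximum (2^32 - 6).
    const std::uint32_t last32 = 4294967290u;
    const std::uint32_t mod32 = Advance<std::uint32_t>(last32, 10);  // wraps to 4
    assert(!NeedsUpdate(mod32, last32));  // the newer modification is missed

    // Same scenario with a 64-bit counter: no wraparound, update is detected.
    const std::uint64_t last64 = 4294967290ull;
    const std::uint64_t mod64 = Advance<std::uint64_t>(last64, 10);
    assert(NeedsUpdate(mod64, last64));
}
```

A highly concurrent program that creates and modifies many objects would reach such a limit much sooner than a typical single-threaded script, which is why the ITK 5.0 binaries are worth trying.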

The C++ “lazy” copying is heavily tested and has proven to be reliable, efficient and stable. There may be an issue with the CSharp wrapping and how it manages memory. It may be specific to how you are using it, so producing a small example of the issue is important.

Hi, thanks for your reply. If this turns out to be a bug in SimpleITK I will try to create a minimal example, but I’m still convinced it may be that I am doing something wrong in the first place. Here are some questions:

  1. how do I use “makeUnique” and when is it necessary to do that?

  2. can I disable lazy copying for testing? What is it designed for?

  3. I do not expect that lazy copying on its own has bugs, but rather that it may postpone some checks and cause odd failures. For instance:

  4. how do I check if memory allocation failed after “new Image(img)”, “VectorIndexSelectionCast(img)” or “Cast(img)”? It occurs to me now that I never explicitly check for failure of these functions. I expect that an exception is raised in .NET land, but that may be a wrong assumption. With lazy copying, maybe they don’t fail on their own? I’m very suspicious here.

Here is an update: I wrote a simple program that spawns 1024 processes, each one applying a threshold and then calling connected components on a loaded image. This runs out of memory very quickly (as expected). On the machine in question, this happens at 22GB RSS, which is much higher than what we’ve been observing in our issue. On the other hand, when that happens, no exception is raised in .NET: the program is killed with signal 9.

Another remark: I wrote in my first post that I get an Access Violation in Invoke, and wondered what “Invoke” is; that’s just the dynamic invocation of methods, which VoxLogicA uses internally, so the “Invoke” bit is not really interesting for the problem in question.

Yet another question: @blowekamp, I have already promised twice that I would investigate how to publish a SimpleITK package on NuGet, but this has not happened yet. In the meantime, maybe you could consider publishing the pre-compiled CSharp binaries for OSX and Linux as well, together with the Python binaries? You pointed me to the development binaries, but I only have Linux on the machine in question. For now I will compile from source.

There is the sitk::Image::MakeUnique method, which can be called to ensure there are no other underlying copies of the itk::Image. This may be necessary when directly accessing the buffer of an Image. If you are accessing the buffer of a sitk::Image, then no other process should have that sitk::Image. For the safest access to the buffer, I’d recommend creating a (lazy) copy of the sitk::Image and then calling MakeUnique on it.
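
To illustrate what the "lazy copy, then MakeUnique" pattern amounts to, here is a minimal C++ sketch. It is a toy model built on std::shared_ptr, not SimpleITK's actual implementation; LazyImage, SharesStorageWith, Buffer and PrivateCopyForWriting are hypothetical names made up for this example, and the sketch itself is single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy stand-in for a lazily copied image; NOT SimpleITK's implementation.
class LazyImage {
public:
    explicit LazyImage(std::size_t n, std::uint8_t fill = 0)
        : pixels_(std::make_shared<std::vector<std::uint8_t>>(n, fill)) {}

    // Copying only shares the pixel storage: this is the "lazy" copy.
    LazyImage(const LazyImage&) = default;
    LazyImage& operator=(const LazyImage&) = default;

    // Deep-copy the pixels if any other handle still shares them.
    void MakeUnique() {
        if (pixels_.use_count() > 1) {
            pixels_ = std::make_shared<std::vector<std::uint8_t>>(*pixels_);
        }
    }

    bool SharesStorageWith(const LazyImage& other) const {
        return pixels_ == other.pixels_;
    }

    // Writable access: only safe once this handle owns its pixels.
    std::uint8_t* Buffer() {
        assert(pixels_.use_count() == 1 && "call MakeUnique first");
        return pixels_->data();
    }

    // Read-only access: never changes ownership.
    const std::uint8_t* Buffer() const { return pixels_->data(); }

private:
    std::shared_ptr<std::vector<std::uint8_t>> pixels_;
};

// The recommended pattern: lazy copy, then MakeUnique, then write.
inline LazyImage PrivateCopyForWriting(const LazyImage& source) {
    LazyImage copy = source;  // cheap: shares storage with source
    copy.MakeUnique();        // now owns its own pixels
    return copy;
}
```

In this model, `LazyImage c = PrivateCopyForWriting(a);` yields a handle whose buffer can be written without affecting `a` or any other lazy copy of it.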

Disabling lazy copying is not implemented. It is designed so that the “sitk::Image” object can be returned by value instead of by pointer. The implementation is quite robust and thoroughly tested. The only trickiness is maintaining safe ownership when accessing the raw buffer.

The expected behavior in ITK C++ when memory is exhausted is that an exception is thrown. SimpleITK has configured SWIG to throw a C# exception when a C++ exception is encountered. The statement new Image(img) will just create the “lazy” copy. The other functions you mention create a new image and return it.

Thanks again for your interest here. Creating portable binaries that meet a language community’s expectations and requirements is an involved task. Experience with and knowledge of these requirements are needed to move forward. Contributions and pull requests to help with this issue are welcome.

@blowekamp I’ve finally managed to produce a really minimal example demonstrating at least one of my issues (hopefully the only one). Here it is:

W.r.t. the “portable binaries” issue, I just think that, since you publish the CSharp DLLs for Windows, you may want to consider publishing the same for Linux and OSX as well. I’ve always compiled those myself for the three systems, and it amounted to just blindly following the instructions you provided. I can definitely provide help in setting up the process for Linux, whereas I have no real expertise on OSX. I’m also considering setting up a cross-compilation toolchain in VoxLogicA just for SimpleITK.

Thanks again for creating the minimal example to reproduce the problem.

There are more comments in the Github issue.

To summarize the source of the bug: SimpleITK’s GetBufferAs... method has a const and a non-const overload in C++. The non-const overload calls sitk::Image::MakeUnique, which is not thread-safe. The .NET runtime underlying C#, F#, etc. does not have a concept similar to C++'s const objects, so only the modifying non-const GetBufferAs... method exists in .NET languages. This was the source of the concurrent threading issue.
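
A minimal C++ sketch of such an overload pair may make the difference concrete. ToyImage is a hypothetical stand-in rather than SimpleITK code, and the call counter exists only to show which overload ran; GetBufferAsFloat mirrors the naming of the real methods, but the bodies here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical miniature of the const / non-const overload pair.
class ToyImage {
public:
    explicit ToyImage(std::size_t n)
        : data_(std::make_shared<std::vector<float>>(n, 0.0f)) {}

    // Non-const overload: calls MakeUnique, so it may replace the storage.
    // This mutation is what is unsafe when several threads call it at once.
    float* GetBufferAsFloat() {
        MakeUnique();
        return data_->data();
    }

    // Const overload: read-only, never touches ownership.
    const float* GetBufferAsFloat() const { return data_->data(); }

    int make_unique_calls() const { return make_unique_calls_; }

private:
    void MakeUnique() {
        ++make_unique_calls_;
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<float>>(*data_);
        }
    }

    std::shared_ptr<std::vector<float>> data_;
    int make_unique_calls_ = 0;
};

// Reading through a const reference selects the const overload.
inline float ReadFirstPixel(const ToyImage& img) {
    return img.GetBufferAsFloat()[0];
}
```

Calling through a const reference never reaches MakeUnique, while the same call on a non-const object always does. A wrapper that exposes only the second form sends every reader, including concurrent ones, through the mutating path.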

Moving forward with the SimpleITK interface, we will be adding a lower-level GetBuffer method that explicitly indicates whether the image should be made unique. This may return a void pointer.

Thanks a lot for the proposal! Keep me posted, I will test the fix as soon as it is available.

Vincenzo

Here is the PR, which adds “GetConstBufferAs…” methods just for CSharp/.NET:
