Add a parallel compression method to NRRD and/or MetaImage?

(Dženan Zukić) #21

I was getting ready to add zstd to ITK, in order to enable its use in MetaIO and NRRD. But before doing so I decided to compile it locally and try it with my own example. Unfortunately, it turns out the multi-threaded support is an advanced and experimental feature at the moment. I have asked a question about when that might change. Let’s see what they answer.

In light of this, I also quickly checked out LZ4. The situation there does not really seem better with respect to multi-threading.

(Dženan Zukić) #22

They recently decided to stabilize the multithreading API, so I entered an issue to track this.

(Andras Lasso) #23

Considering that the nrrd user community needs several major features (as shown by the current discussion and a similar one here) and is ready to contribute, but there seems to be no efficient way of doing so in the teem repository (not on GitHub, contains lots of irrelevant features, not modern C++, etc.), I would suggest creating a new nrrd library.

If everyone is OK with that, we could start from teem’s nrrd, add tests, make obvious cleanups (e.g., leverage C++11), and add new features (random file access, faster zip compression, etc.).

Which GitHub organization should we use?

What should be the repository name? NrrdIO, NrrdTools, QuickNrrd, SuperNrrd, …?

@hjmjohnson @pieper @jcfr

(Hans Johnson) #24

My preference would be the InsightSoftwareConsortium.

FYI: I just worked on reviving the long-stagnant NIFTI library as well. While this effort is non-trivial, it will hopefully make long-term maintenance easier.


(Steve Pieper) #25

@lassoan, thanks for keeping this moving.

I’d vote for the name CppNrrd. Also, a new top-level organization makes sense (NrrdTools is good).

This could become a central place to host nrrd libraries in other languages, like matlab, python and javascript. Reference documentation of the format and sample data could be shared by all these implementations instead of having them all over the place like they are now.

I’d also vote for starting with a very specific conformance statement of which nrrd use cases will be supported, and have tests and example data for each of them.

Off the top of my head, I’d suggest the new C++ implementation needs to support at least the following features in order to be a viable replacement for the current library:

  • scalar/vector/tensor volumes, 2D, 3D, 3D+t
  • dwi/dti extensions (gradient tables, measurement frames…)
  • .seg.nrrd
  • .nhdr and data files

This shouldn’t be hard since for the most part they just expose the raw buffers and headers. Adding parallel compression and other new features would be great.
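To illustrate how little is needed to expose the header and the raw buffer, here is a minimal sketch of an NRRD header reader. It assumes the usual text layout (a magic line such as `NRRD0004`, `field: value` lines, `key:=value` pairs, `#` comments, and a blank line before the data). The function name `parse_nrrd_header` and the sample file contents are hypothetical, and the sketch does not interpret field values, handle detached data files, or decode compressed payloads.

```python
import io


def parse_nrrd_header(stream):
    """Read an NRRD header from a binary stream.

    Returns (magic, fields, keyvalues) and leaves the stream positioned
    at the first byte of the data (for attached headers).
    """
    magic = stream.readline().decode("ascii").rstrip("\r\n")
    if not magic.startswith("NRRD"):
        raise ValueError(f"not an NRRD file: {magic!r}")

    fields, keyvalues = {}, {}
    while True:
        raw = stream.readline()
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        if raw == b"" or line == "":
            break  # blank line (or end of a detached .nhdr) ends the header
        if line.startswith("#"):
            continue  # comment line
        i_field = line.find(": ")
        i_kv = line.find(":=")
        if i_kv != -1 and (i_field == -1 or i_kv < i_field):
            key, value = line.split(":=", 1)
            keyvalues[key] = value
        elif i_field != -1:
            name, value = line.split(": ", 1)
            fields[name.strip().lower()] = value.strip()
        else:
            raise ValueError(f"unrecognized header line: {line!r}")
    return magic, fields, keyvalues


# Hypothetical 2x2x1 volume of 16-bit voxels with an attached header.
sample = (
    b"NRRD0004\n"
    b"# written by a hypothetical example\n"
    b"type: short\n"
    b"dimension: 3\n"
    b"sizes: 2 2 1\n"
    b"encoding: raw\n"
    b"endian: little\n"
    b"modality:=DWMRI\n"
    b"\n"
    + bytes(8)
)

stream = io.BytesIO(sample)
magic, fields, keyvalues = parse_nrrd_header(stream)
payload = stream.read()  # raw voxel buffer that follows the header
print(magic, fields["sizes"], keyvalues, len(payload))
```

A conformance test suite could start from small files like this one, with one sample per supported use case in the list above.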

Just to note: replacing hard-to-maintain legacy nrrd code makes sense, and nrrd is a good lightweight option for many purposes, but personally I’ll put more effort into better DICOM support for many of the same use cases.

(Andras Lasso) #26

It would be great if we could store image data in DICOM files using high-speed compression/decompression and random access. Maybe what we learn from a modern NRRD implementation (and hopefully even its source code) can be used for DICOM files in the future.

CppNrrd could work, but it might not be fair to claim such a general name. I often find this a real issue in Python, where random people create a pyXYZ library and publish it on PyPI. For example, there is already a pynrrd implementation - if we don’t add any distinctive word then we could run into a name conflict if we want to release a Python version.

(Steve Pieper) #27

Good point - if we choose to support only a subset of nrrd (or a superset?) then maybe we should make that explicit in the naming. And as a magic number or version in the header.

(Chris Rorden) #28

I think this thread is related to this other issue. Since I support NRRD without using your code, I would ask that these changes become part of the formal NRRD specification and are done with full awareness that they will break compatibility with older tools. This may well be worth it - zstd is very fast and is gaining widespread traction. However, using pigz would provide faster compression without breaking compatibility and pre-filtered zstd (e.g. blosc) might be much better suited for the nature of NRRD data than pure zstd.
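As a rough illustration of why parallel gzip can stay compatible with existing readers, the sketch below compresses fixed-size chunks on separate threads and concatenates the results as independent gzip members, which the gzip format permits. This is an assumption-laden simplification and does not claim to mirror how pigz works internally. The function name `parallel_gzip`, the chunk size, and the worker count are hypothetical choices, and whether a particular NRRD reader accepts multi-member gzip streams would need to be tested.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data, chunk_size=1 << 20, workers=4, level=6):
    """Compress chunks independently and join them as gzip members."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so the members line up correctly.
        members = pool.map(
            lambda chunk: gzip.compress(chunk, compresslevel=level), chunks
        )
        return b"".join(members)


# 1 MiB stand-in for an image buffer (highly repetitive on purpose).
voxels = bytes(range(256)) * 4096
packed = parallel_gzip(voxels, chunk_size=256 * 1024)

# A standard gzip decoder reads the concatenated members back as one stream.
restored = gzip.decompress(packed)
print(len(voxels), len(packed), restored == voxels)
```

The trade-off is a slightly worse compression ratio, because each chunk starts with an empty dictionary, in exchange for output that an ordinary gzip decoder can still read.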