I have used the TIFF format for viewing pyramidal (tiled) 2-D images, which helps when the data can't all be held in memory at once. Is there an equivalent or similar approach for really large image volumes, or collections of image volumes?
I suppose the easiest thing would be to bisect the image and only look at a sub-volume, but I didn't know if there is a way to do this in another fashion. I have a high-resolution industrial micro-CT image set, and when I export it from Fiji to NRRD format it is a 'mere' 14 GB file.
A lot of the volume is hollow in the middle and isn't relevant to the part I am interested in segmenting. What should I be looking at, or looking for, to help deal with such large image sets?
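A toy sketch of that crop-to-the-interesting-part idea, using NumPy on a small synthetic volume (the threshold and shapes here are made up for illustration): compute the bounding box of the voxels above an assumed air threshold, then slice the array down to it.

```python
import numpy as np

# Synthetic stand-in for a CT volume: mostly air (0) with a dense block inside.
vol = np.zeros((64, 64, 64), dtype=np.float32)
vol[10:50, 12:40, 8:30] = 1000.0  # the material we care about

# Bounding box of voxels above an (assumed) air threshold.
mask = vol > 100.0
nz = np.nonzero(mask)
lo = [int(idx.min()) for idx in nz]
hi = [int(idx.max()) + 1 for idx in nz]

cropped = vol[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
print(cropped.shape)  # much smaller than the full volume
```

On real data you would compute the bounding box once (e.g. on a downsampled copy) and then only ever load and process the cropped region.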
Also, this is just peanuts compared to what I ultimately want to do, which is stitch several of these together, although I guess I only need the stitched segmentation and not the original images themselves.
Any thoughts or opinions welcome.
Good questions - I’m also looking at working with some big volume datasets and there aren’t a lot of good answers (yet?).
In the microscopy community people have been working on OME-TIFF and its next-generation successors based on Zarr and N5, and those would be worth investigating.
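For intuition about why chunked formats like Zarr and N5 help: the volume is stored as a grid of independently compressed blocks, so a viewer only has to read the blocks that intersect its region of interest. A toy sketch of the chunk-indexing arithmetic (the chunk size and coordinates are arbitrary):

```python
import numpy as np

chunk = np.array([64, 64, 64])          # assumed block size of the store
roi_start = np.array([100, 200, 50])    # region of interest, voxel coords
roi_stop = np.array([180, 260, 120])    # exclusive upper bound

# Which blocks does the ROI touch?
first = roi_start // chunk
last = (roi_stop - 1) // chunk
n_chunks = int(np.prod(last - first + 1))
print(n_chunks)  # only this many blocks need to be read and decompressed
```

A 14 GB volume at 64³ blocks is tens of thousands of blocks, but a typical viewing or processing window only touches a handful of them at a time.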
Some of us are also pursuing the DICOM whole slide imaging format for multi-frame, pyramidal, multi-channel pathology images. It's been 2D so far, but because it's DICOM the same encodings can be used for multi-frame 3D. One advantage of this approach is that you get frame-level access via DICOMweb.
I have been impressed with the speed of file formats that are nearly raw, like NRRD and MHA. They provide fast random access (for streaming IO) and blazing burst speeds when combined with SSDs.
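The "nearly raw" point can be illustrated with `numpy.memmap`: because the voxel layout on disk matches the layout in memory, pulling out a sub-region is just a seek plus a read, with no decompression. A minimal sketch (the shape and file name are only illustrative):

```python
import os
import tempfile
import numpy as np

# Write a raw volume to disk, standing in for the data portion of an NRRD/MHA.
shape = (50, 60, 70)
path = os.path.join(tempfile.mkdtemp(), "vol.raw")
np.arange(np.prod(shape), dtype=np.float32).reshape(shape).tofile(path)

# Random access without loading the whole file: map it, then slice it.
vol = np.memmap(path, dtype=np.float32, mode="r", shape=shape)
slab = np.array(vol[20:25, :, :])  # only these pages are actually read
print(slab.shape)
```

This is essentially what streaming readers do under the hood for uncompressed NRRD/MHA files.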
My general workflow is to keep the files compressed on the network share, then decompress them to a local SSD in a nearly raw format. I am registering numerous 10-30+ GB volumes to one another, and across the whole workflow I spend more time on file format conversion and I/O than on the registration algorithm itself. This works for a batch-style workflow, but not for on-demand web viewing.
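The decompress-to-local-SSD step in that workflow is just a streamed copy; here is a stdlib sketch, assuming the shared file is gzip-compressed (all paths are hypothetical stand-ins):

```python
import gzip
import os
import shutil
import tempfile

# Stand-ins for the network share and the local SSD scratch space.
scratch = tempfile.mkdtemp()
shared = os.path.join(scratch, "volume.raw.gz")
local = os.path.join(scratch, "volume.raw")

# Fake a compressed volume on the "share".
payload = b"\x00" * 1_000_000
with gzip.open(shared, "wb") as f:
    f.write(payload)

# Streamed decompression to the local "SSD": constant memory use,
# regardless of how large the volume is.
with gzip.open(shared, "rb") as src, open(local, "wb") as dst:
    shutil.copyfileobj(src, dst)

print(os.path.getsize(local))
```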
In microscopy, 5D XYZTC images are ubiquitous. To improve SimpleITK's usefulness for microscopy we have extended the image class and IO to directly support up to 5D images by default. Additionally, the extract and paste filters support arbitrary input and output dimensions, enabling, for example, extraction of a 2D image from a 5D volume, with paste providing the inverse operation.
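The extract/paste semantics can be illustrated with plain NumPy arrays (SimpleITK's actual filters are Extract and Paste; this sketch just shows the round trip on a small 5D array):

```python
import numpy as np

# A small 5D microscopy-style image: (channel, time, z, y, x).
img5 = np.random.default_rng(0).integers(0, 255, size=(2, 3, 4, 5, 6))

# "Extract": pull a single 2D plane (channel 1, time 2, z-slice 0) out of 5D.
plane = img5[1, 2, 0].copy()
print(plane.shape)  # (5, 6)

# "Paste": write a processed 2D result back into the 5D image at the same spot.
img5[1, 2, 0] = np.zeros_like(plane)  # e.g. a segmentation placeholder
```

The point of arbitrary input/output dimensions is exactly this: you can process a lower-dimensional piece and paste the result back without ever reshaping the full 5D image.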
I was looking at Vaa3D and Neuroglancer and am curious whether those provide some of the infrastructure to handle these image sizes? It looks like they use some form of tiled volume format.
What if I want to segment a portion of the image, or perform image processing, without loading the whole thing into memory?
I probably just need a bigger computer with more memory…
The ITK Python wrapping supports Xarray DataArray images as of 5.1.0; these can be converted to and from Zarr stores, which are chunked, compressed, and well suited to multi-dimensional, hierarchical data.