It appears that long double is now only supported by MeshIO, not by ImageIO. Honestly I don’t really have a use case for long double, and I realize that its size is very platform dependent: typically either 64 bits, 80 bits, or 128 bits. Is long double still important to other users?
I’m asking, just because I’m considering some refactoring, to share code between ImageIO and MeshIO. Specifically, MeshIOBase::MapComponentType and ImageIOBase::MapPixelType are very similar, but the first one supports long double, and the second one does not!
long double was interesting with IA32 and older architectures in x64 series, because 80-bit float was a (the?) native type. With AMD64 that disappeared, which has 64-bit native float.
As removals are easier, I have a mild preference for adding LDOUBLE to ImageIO.
The concept of having the ImageIO express pixel component types as non-portable types is problematic. Files are potable between systems, these descriptive types are not. IMHO describing the component type as “long double” is not descriptive and will not be well defined.
For comparison of a portable specification, consider the ZARR dtype spec:
Thanks to you both @dzenanz and @blowekamp So basically, there are three possible options:
Let ImageIO and MeshIO both support long double
Drop long double for both ImageIO and MeshIO
Let MeshIO support long double, while ImageIO does not (current situation)
Could it be that long double is more relevant for MeshIO than it is for ImageIO, for fundamental reasons?
By the way, we could of course also add fixed width floating point types, FLOAT80 and FLOAT128 to the enum class IOComponent, but then they should be processed in a platform/compiler dependent way. (Possibly throw an exception when the platform does not have a floating point type of that size.)
Does “long double” for ImageIO provide anything meaningful? Does any Image file format support “long double”?
Also how would FLOAT80 be stores as in a buffer? 3-bytes with poor alignment? 4-bytes?
I don’t see a need to for this and you state that you don’t have a use case either. There are details here that could be done incorrectly that would cause more problems if an implementation with a well defined use case was required.
I do not know if the MeshIO is usable or functional with long double. A quick grep though the IO code seems to indicate that there are a large number of occurrences of LDOUBLE in many of the MeshIO implementation so this would indicate good support for it at one point.
The ImageIOBase::IOComponentEnum is already itk::IOComponentEnum which contains LDOUBLE in the enum. So it looks like this enum is already valid if ImageIO classes.
In that context, defining the behavior or LDOUBLE for imageIO seems more reasonable. I hope that all the ImageIO implementations have correct handling of the “default” case for this enum.
Ideally I would like to move the enum class { UCHAR, CHAR, ..., LDOUBLE }, as well as the type-to-enum-value map (Map...Type<T>) to “Core/Common”, so that they can be shared between the ImageIO and MeshIO. What do you think?
So I would then just propose a common ctype-to-enum-value map.
Note that there appears another difference between MeshIOBase and ImageIOBase. MeshIOBase::MapComponentType unconditionally maps char to CHAR, whereas ImageIOBase::MapPixelType maps char to either CHAR or UCHAR, depending on the signedness of char: if char is signed, it is mapped to CHAR, otherwise to UCHAR. This difference should also be resolved, when introducing a common ctype-to-enum-value map.
Currently IOComponent has distinct enum values for two char types: CHAR and UCHAR. I think it would be easier if we would have an extra enum value for SIGNED_CHAR, because in C++ char and signed char are distinct type. And then unconditionally map char to CHAR, and signed char to SIGNED_CHAR, right? Or would that break too much legacy code?
It sounds like you are thinking the CHAR enum is describing the char C type ( which is incorrect). The actual usage of it is to describe a signed char/8-bit type. Changing the definition of the CHAR enum would not be good. Perhaps adding an alias of “SIGNED_CHAR == CHAR” would be descriptive and useful.
Also 8-bit integers are only signed or unsigned. The three states of C’s “char” types is an artifact of C and not data. The ImageIO interface really should be describing what is stored on the disk and not the state of types and sizes at programming language level, IMHO. That is to say I have a long term frustration with this interface not providing fixed width integer types.
But the CHAR enum type ( for ImageIO ) currently must represents signed 8-bit data. There is not other way to represent as a component type, so it much be signed:
(Those later cases look like they can be conveniently updated with the new fixed types. )
And yes, 8-bit signed data is real and common with certain file formats. Java does not have an 8-bit unsigned data type. So for Java developer signed 8-bit data/images are very natural.
FYI, This pull request of mine is somewhat related:
Please note this pull request in it current state is only a style PR. It does not address the different behavior between MeshIO and ImageIO regarding the type mapping of CHAR and LDOUBLE (long double). So it may be processed independently.
I see, SimpleITK’s sitkImageReaderBase maps IOComponentEnum::CHAR to int8_t (which is a signed integer type, by definition), whereas ITK’s itk::ImageFileReader maps IOComponentEnum::CHAR to char (which might be an unsigned integer type):