ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed

I’m trying to help my doctor daughter by using machine learning to find tumors in her osteosarcoma histology slides. The H&E staining is poor — quite low but visible to humans – compared to that of the public domain data (Univ. Texas) I’ve trained a CNN on. I’d like to use ITK color normalization to bring her data from Harvard’s Boston Children’s Hospital (BCH data) to have coloring similar to the Univ. Texas data.

When I substitute a typical BCH data image for the input image in your demo (https://www.kitware.com/new-insight-toolkit-itk-module-for-structure-preserving-color-normalization/), I get the error

RuntimeError: …/…/…/include/itkStructurePreservingColorNormalizationFilter.hxx:199:
ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed

Could you please suggest what I might try, if anything, to make ITK H&E color normalization work on these BCH images?

Many thanks,
Jon

@Lee_Newberg

Hi – I’m happy to get some sort of response from a very knowledgeable person, but if there was some text sent, I guess I don’t know how to access it. Sorry.

-Jon

Retired physics/EE prof spent nearly 20 years at U Colorado, Boulder teaching graduate students, research in optical computing systems. Before that, 10 years at Bell Labs involved in C/C++, and before that, 10 years as high energy particle physicist/accelerator physicist at Fermi Lab. Stanford UG, PhD in particle physics from Tufts U. Spent last year learning enough Python and machine learning techniques to train networks using public domain data to classify tumors in pediatric osteosarcoma H&E histology images. The idea is to apply machine learning to my daughter’s medical practice where such a tool working on real data would be medically useful and valuable.

Hi – I tried to fill out a profile, but a web site seems to be required. Since, as an independent researcher, I don’t have a supported project with any institution with a web site, I don’t have a web site to provide. So no profile, sorry. Instead, I added a short summary of my career since my 1970 PhD in a reply; it is also below.

< =================================
Retired physics/EE prof spent nearly 20 years at U Colorado, Boulder teaching graduate students, research in optical computing systems. Before that, 10 years at Bell Labs involved in C/C++, and before that, 10 years as high energy particle physicist/accelerator physicist at Fermi Lab. Stanford UG, PhD in particle physics from Tufts U. Spent last year learning enough Python and machine learning techniques to train networks using public domain data to classify tumors in pediatric osteosarcoma H&E histology images. The idea is to apply machine learning to my daughter’s medical practice where such a tool working on real data would be medically useful and valuable.
================================= >

Jon R. Sauer, retired prof
jon.sauer@gmail.com
Acton, MA, USA 01720
+1 303.579.3009

I apologize for the trouble that you are having with the code. If there aren’t HIPAA issues or similar, could you provide the image and code that are leading to the error … or provide another example that leads to the same problem?

Hi Lee,

The image included below is a typical “tile” (small section of a much larger whole slide image of Boston Children’s Hospital (BCH) H&E stained osteosarcoma resected bone) of the data I’d like to apply structure-preserving color normalization to. These tiles are nearly all under-stained compared to the reference image used in the example code or the well-curated public domain images I trained a CNN on.

What I was hoping to do is use the ITK Color Normalization facility to make the BCH data more like the well-stained training data or the example reference image, which are very similar.

The code below is taken from
https://github.com/InsightSoftwareConsortium/ITKColorNormalization/blob/master/examples/ITKColorNormalization.ipynb

with a few slight modifications to optionally run on my iTerm command line environment instead of the original Jupyter notebook (“JRS=True”) or use a BCH tile (“BCH=True”).

I hope it won’t be too hard for you to reproduce my problems.

The diffs from the example code are in “diffs.txt” below. The BCH image is in “Test_1_0810.png”. The actual code used is in “xx.py”. The errors are in “errs.txt”.

diffs.txt (1.99 KB)

xx.py (3.78 KB)

errs.txt (904 Bytes)

Hi Lee,

I’m hoping you are able to carve out some time from your busy schedule to help me handle under-stained H&E images compared to the well-curated, well-stained images used in our ML training. See email sent 01/31/2022 11:35am EST.

I’m now visiting my daughter, Dr. Nadine Sauer SantaCruz (head of pediatric neuro-oncology at one of the two large hospitals in Maine). We will be working on the computing needed for our small proposal to use machine learning to label tumor regions in pediatric osteosarcoma whole slide images. The training data for the CNN used is public domain from U Texas*. The problem is that the real patient data we have access to from a collaborator seems under-stained compared to the reference image in your example or the U Texas data. We would like to use your color normalization tool to bring the H&E colors to be like your reference image or equivalently the U Texas data. I think such normalization needs to be routinely applied even for data within the same institution.

From your Kitware bio (https://www.kitware.com/lee-newberg/), it certainly looks like there are a few parallels in our careers with your career path (much abbreviated):

MIT Physics BS → PhD Computer Science → biological sequence analysis with NIH/NSF/DoD grants → Kitware (ML medical image analysis tools, also grant-supported).

A somewhat longer version of my career is below†.

So we both find ourselves involved in medical image analysis using machine learning.

Let me know if you got the email referenced above with the details you requested.

Many thanks,
Jon

† Stanford Physics BS → PhD particle physics → experimental physics data analysis (Fortran, Cray machines) → Bell Labs (Unix/C/C++) → teaching MS students Unix/C/C++ for 19+ years (even without formal CS training, but courses introductory, not heavily theoretical) → founded company to sequence DNA using instrumented nanopores (https://patents.google.com/patent/US6413792B1/en) → machine learning (Python/TensorFlow after retirement) → image analysis for tumor classification as a physician tool (the intended target, no official status or funding yet but a pending small proposal with verbal acceptance with my daughter)

Hi Lee,

Vahadane et al., the originators of SPCN, provide their MATLAB implementation code in

https://github.com/abhishekvahadane/CodeRelease_ColorNormalization.git

as I’m sure you know.

My MATLAB experience is minimal and, after retirement from the university, I no longer have access
to proprietary code packages like MATLAB or Mathematica. However, for someone well-versed in the
UNIX/C++ and Python worlds, it initially seems straightforward to convert Vahadane’s SPCN code to
Python, if time-consuming.

Before I embark on the expense in dollars and time in this translation, which I hope then to be able
to make work for my application (not guaranteed by any means), perhaps you could let me know whether
it is likely or unlikely that you will be able to respond to my earlier query about using your ITK
SPCN package to handle quite faded H&E images? I know you may indeed be in the midst of some other
fire-suppression effort or important project deadline and so unavailable.

I believe I need SPCN as the CNN model to be used is trained on well-curated H&E images, but the images
available from the hospital patients have much different staining characteristics. I’m pretty sure
the trained CNN model needs similarly-colored images to make correct inferences.

Thanks,

Cheers,
Jon

Yes, you’ve hit the nail on the head … I have other priorities at the moment. I hope to take a look by the end of the week, but I can’t promise even that. :frowning:

The ITK SPCN code is indeed based on the Vahadane et al. method, but where the latter starts with a random solution and then optimizes it, the ITK code uses non-negative matrix factorization to start from a plausible seed.
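
The factorization being discussed can be pictured with a small sketch. It assumes stain separation is posed as V ≈ W·H on optical densities (rows are pixels, columns are R, G, B), and it uses generic multiplicative-update NMF from a random start. This is neither the ITK code nor Vahadane's implementation; the function names are hypothetical, and the point is only to show what a "seed" for W and H would replace.

```python
import numpy as np


def optical_density(rgb, background=255.0):
    """Flatten an RGB tile to (pixels, 3) optical densities (Beer-Lambert)."""
    rgb = np.asarray(rgb, dtype=np.float64).reshape(-1, 3)
    return -np.log10(np.clip(rgb, 1.0, background) / background)


def nmf(V, rank=2, iterations=200, seed=0):
    """Generic NMF, V ~ W @ H, via multiplicative updates from a RANDOM start.

    Illustrative only: a method that begins from a plausible seed would
    substitute its own initial W and H for the random ones drawn here.
    """
    V = np.asarray(V, dtype=np.float64)
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3   # per-pixel stain amounts
    H = rng.random((rank, V.shape[1])) + 1e-3   # per-stain colour vectors
    eps = 1e-12                                 # guards against division by zero
    for _ in range(iterations):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

With rank 2, the two rows of H play the role of hematoxylin and eosin colour vectors. From a random start, which row ends up as which stain is arbitrary.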

Regardless, if you want to plunge in and do some coding yourself, what you propose is one possibility. Another possibility is to make changes to the ITK code and submit a pull request. Familiarity with git and GitHub would be a big plus in this regard: https://github.com/InsightSoftwareConsortium/ITKColorNormalization

Hi Lee,

Many thanks for your reply.

At the moment, I only have a slight acquaintance with git/GitHub, but I clearly should learn to use those facilities better. So I think I’ll try your second suggestion (looking at your github code). Basically, I’d like to find out if there is anything I can do to improve the situation after getting the error

ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed.

when under-stained inputs are used.

Basically, I’m hoping that, with the code available, I can watch how it gets to that error and maybe figure out a way to change something to get past that point. So far, a few changes of input images and reference images have not succeeded.

Cheers,
Jon

I have tried running your xx.py script but am not able to reproduce the error; the code works for me. Admittedly, I am using current “master branch” sources (i.e., the bleeding edge code that hasn’t even been released yet) for ITK and ITKColorNormalization, so perhaps that has something to do with it. I can version and release the latter, which may help.

What versions of ITK and ITKColorNormalization are you running? Although it shouldn’t lead to the error that you are seeing … what version of Python are you using?
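
One way to collect the requested version information is a short script run in the same environment as the failing code. This is a minimal sketch: the distribution names `itk` and `itk-spcn` are assumptions about how the packages were installed (the thread itself only mentions `pip install itk`), and `environment_report` is a hypothetical helper name.

```python
import platform


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        from importlib import metadata  # standard library from Python 3.8
    except ImportError:
        try:
            import pkg_resources  # fallback for Python 3.7 (ships with setuptools)
            return pkg_resources.get_distribution(dist_name).version
        except Exception:
            return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("itk", "itk-spcn", "numpy")):
    """Build a plain-text summary of Python, platform, and package versions."""
    lines = [
        "python " + platform.python_version(),
        "platform " + platform.platform(),
    ]
    for name in dist_names:
        lines.append("{} {}".format(name, installed_version(name) or "not installed"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Pasting the printed lines into a reply covers all three version questions at once.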

Hi Lee,

Good heavens — I missed your email yesterday until now, too involved trying to understand what is not working for me. I had thought you might not be able to get to this with your other commitments.

I’ll send you a more extensive set of details in a bit.

Many thanks.

Cheers,
Jon

addendum: sometimes the error ends with
. . .

RuntimeError: …/…/…/include/itkStructurePreservingColorNormalizationFilter.hxx:158:
ITK ERROR: The image to be normalized could not be processed; does it have white, blue, and pink pixels?

I think these inputs are just too faded; perhaps the pixel intensities need to be boosted (intelligently) somehow before ITK’s color normalization can work as intended?
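
One candidate for "intelligent" boosting is to scale the optical density rather than the raw intensities, so that white background stays white while stained pixels darken. The sketch below assumes 8-bit RGB tiles with a background value of 255. `boost_stain` and its `gain` parameter are hypothetical, this is not part of ITK, and whether it cures either error above is untested.

```python
import numpy as np


def boost_stain(rgb, gain=2.0, background=255.0):
    """Multiply each pixel's optical density by `gain` (assumed preprocessing).

    Beer-Lambert: optical density = -log10(I / I0). Scaling it by `gain`
    is equivalent to I' = I0 * (I / I0) ** gain for every channel.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    # Clip to avoid log10(0) on pure-black pixels.
    transmitted = np.clip(rgb, 1.0, background) / background
    optical_density = -np.log10(transmitted)
    boosted = background * np.power(10.0, -gain * optical_density)
    return np.clip(np.rint(boosted), 0, 255).astype(np.uint8)
```

A gain of 1.0 leaves the tile unchanged; larger gains deepen the stain colours without shifting their hue direction in optical-density space. The boosted tile would then be passed to the normalization filter in place of the original.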

Cheers,
Jon

Hi Lee —

Attached are a Python script (zz.py) and a directory (tmp) containing 4 faded H&E inputs typical of those I’d like to apply color normalization to (inp0.npy, inp1.npy, inp2.npy, inp3.npy) and 3 reference inputs from a public database of the same type of cancer (ref0.npy, ref1.npy, ref2.npy).

The script optionally accepts command line arguments to specify paths to input and reference images but defaults to the files from your demo (‘Easy1.png’ and ‘Hard.png’). Assuming a UNIX-like terminal window:

export REF=tmp/ref0.npy; export INP=tmp/inp0.npy   # set env vars REF and INP
python3 zz.py -REF $REF -INP $INP                  # execute script with optional args
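
The command-line handling described above could look roughly like this. It is a sketch, not the contents of zz.py, which is not shown in the thread. The option names `-REF` and `-INP` come from the command above; treating ‘Easy1.png’ as the default input and ‘Hard.png’ as the default reference is an assumption, and `is_numpy_file` is a hypothetical helper.

```python
import argparse


def parse_args(argv=None):
    """Parse optional -INP / -REF paths, falling back to the demo files."""
    parser = argparse.ArgumentParser(
        description="Run H&E color normalization on one input/reference pair."
    )
    # Assumed defaults: demo input and demo reference, in that order.
    parser.add_argument("-INP", default="Easy1.png",
                        help="image to be normalized (.png or .npy)")
    parser.add_argument("-REF", default="Hard.png",
                        help="reference image supplying the target colors")
    return parser.parse_args(argv)


def is_numpy_file(path):
    """True when the path names a saved NumPy array rather than an image."""
    return path.lower().endswith(".npy")


if __name__ == "__main__":
    args = parse_args()
    print("input:", args.INP, "| reference:", args.REF)
```

Branching on `is_numpy_file` lets one script accept both the demo PNGs and the .npy tiles.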

All 12 combinations of the 4 inputs and 3 reference images fail, either with an error or with an all-pink output, as below:

ref,inp  result
0,0      RuntimeError: …/…/…/include/itkStructurePreservingColorNormalizationFilter.hxx:199:
         ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed
0,1      seems to change all pixels to shades of pink
0,2      same as 0,1
0,3      same as 0,1

1,0      RuntimeError: …/…/…/include/itkStructurePreservingColorNormalizationFilter.hxx:199:
         ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed
1,1      seems to change all pixels to shades of pink
1,2      same as 1,1
1,3      same as 1,1

2,0      RuntimeError: …/…/…/include/itkStructurePreservingColorNormalizationFilter.hxx:199:
         ITK ERROR: Hematoxylin and Eosin are getting mixed up; failed
2,1      seems to change all pixels to shades of pink
2,2      same as 2,1
2,3      same as 2,1
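
A grid like the one above can be produced mechanically rather than by hand. In this sketch, `normalize` is a placeholder for whatever function wraps the ITK filter call; it and `run_grid` are hypothetical names. Note that "ok" only means no exception was raised, so an all-pink output would still have to be spotted by eye.

```python
import itertools


def run_grid(normalize, references, inputs):
    """Try every (reference, input) pair; return {(ref_idx, inp_idx): outcome}."""
    results = {}
    pairs = itertools.product(enumerate(references), enumerate(inputs))
    for (r, ref), (i, inp) in pairs:
        try:
            normalize(inp, ref)
            results[(r, i)] = "ok"  # no exception; output quality not checked
        except RuntimeError as err:
            # Keep only the last line of the message, e.g. the "ITK ERROR: ..." text.
            lines = str(err).strip().splitlines()
            results[(r, i)] = "RuntimeError: " + (lines[-1] if lines else "")
    return results


def format_grid(results):
    """Render the outcomes as 'ref,inp  result' lines in index order."""
    rows = ["ref,inp  result"]
    for (r, i) in sorted(results):
        rows.append("{},{}      {}".format(r, i, results[(r, i)]))
    return "\n".join(rows)
```

Calling `print(format_grid(run_grid(normalize, refs, inps)))` with three references and four inputs yields twelve rows.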

My environment is Python 3.7.6 on macOS 12.1, on a MacBook Pro laptop (15”, 2016). ITK and ITKColorNormalization are just as downloaded via ‘$ pip install itk’.

Sorry, some extraneous diagnostic prints are left in zz.py, but the final prints are relevant.

Cheers,
Jon

ref0.npy (127 KB)

ref1.npy (127 KB)

ref2.npy (127 KB)

(Attachment inp0.npy is missing)

Hi Lee,

The input tiles are too big for ITK to accept (~3.2 Mb, but the limit is 3.0 Mb), so I will generate much smaller patches.

Using the smaller patches, say 258x258 pixels, will again try out ITK’s color normalization and send you the results.
Will take a few hours, sorry.
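
Cutting a large tile into fixed-size patches is a short slicing exercise. This sketch assumes each tile is a NumPy array of shape (height, width, 3) and simply drops any partial strip at the right and bottom edges. `split_into_patches` is a hypothetical helper name; the default of 258 is the patch size mentioned above.

```python
import numpy as np


def split_into_patches(tile, size=258):
    """Return non-overlapping size x size patches, row by row.

    Edge regions narrower than `size` are discarded (assumed acceptable here).
    """
    tile = np.asarray(tile)
    rows = tile.shape[0] // size
    cols = tile.shape[1] // size
    return [
        tile[r * size:(r + 1) * size, c * size:(c + 1) * size]
        for r in range(rows)
        for c in range(cols)
    ]
```

Each patch can then be written with `np.save` to produce files small enough to attach.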

Cheers,
Jon

Hi Lee —

Trash the zz.py I sent you; too many diagnostics were left in, and it tries to use input files that are too big to send you.

But I’ll continue to use the 3 reference files ref0.npy, ref1.npy, and ref2.npy.

Sorry,
Jon

I see the three files, ref[012].npy, which presumably are image files for the references, but no corresponding “input” files to be color normalized to those references. So, I think that I am still waiting on you (to address that the original input files were too large). If instead, you are waiting on me, please let me know. Thanks --Lee.

P.S. The ref[012].npy are the only files that I have from this past weekend’s messages. If you are sending an update to xx.py or anything else, I will need those files too.