Dear all,
I am considering the processing of a big binary image. A first processing step is to transform it into a second binary image, whose connected components I wish to retain.
To reduce disk usage and to allow later parallel processing of blocks for semi-local feature computation, I am considering storing one representative per CC together with the second binary image, as opposed to the 64-bit connected-component labeling of this second image.

This leads to the following question, which is representative of what I will have to do within the blocks.

Question: Let A be a boolean array, and B a 64-bit array labeling the connected components of A>0.
Let R be a complete list of representative pixels for the components, indexed by their label in B. What is an efficient way to recover B from (A, R)?
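
To make the question concrete, here is a minimal sketch of one possible approach, not necessarily the most efficient one: relabel A with scipy.ndimage.label, then map the arbitrary temporary labels to the original ones by reading the temporary label at each representative. The function name recover_labels, the layout of R (one row of pixel coordinates per component, row k holding the representative of label k+1) and the toy arrays are assumptions made for illustration. It also assumes the connectivity passed as structure is the one that was used to build B.

```python
import numpy as np
from scipy import ndimage as ndi


def recover_labels(A, R, structure=None):
    """Rebuild the 64-bit labeling B from the mask A and representatives R.

    Assumed layout (hypothetical): R has shape (n, A.ndim) and R[k] is a
    pixel of the component whose label in B is k + 1.
    """
    A = np.asarray(A, dtype=bool)
    R = np.asarray(R)

    # Temporary labeling: same components as B, but arbitrary label values.
    tmp, n = ndi.label(A, structure=structure)

    # Temporary label found under each representative.
    tmp_at_R = tmp[tuple(R.T)]

    # Sanity checks: exactly one representative per component, all inside A.
    if n != len(R) or np.any(tmp_at_R == 0) or np.unique(tmp_at_R).size != n:
        raise ValueError("R is not a complete set of representatives for A")

    # Lookup table from temporary label to original label (0 stays 0).
    lut = np.zeros(n + 1, dtype=np.int64)
    lut[tmp_at_R] = np.arange(1, n + 1, dtype=np.int64)
    return lut[tmp]


# Toy example with 4-connectivity (the default structure in 2D).
A = np.array([[1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1],
              [0, 0, 0, 1, 0],
              [1, 0, 0, 0, 0]], dtype=bool)
B_true = np.array([[2, 2, 0, 0, 0],
                   [2, 0, 0, 3, 3],
                   [0, 0, 0, 3, 0],
                   [1, 0, 0, 0, 0]], dtype=np.int64)
R = np.array([[3, 0], [0, 1], [2, 3]])  # representatives of labels 1, 2, 3

B = recover_labels(A, R)
```

The cost is one labeling pass plus one table lookup per pixel. The temporary labeling is the main extra memory, and it only needs to exist for the block being processed.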

I am thinking of using a watershed transform, as implemented in, say, skimage, seeded with the elements of R.
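
A minimal sketch of what I have in mind, assuming skimage.segmentation.watershed is run on a constant image so that it only has to flood each component of the mask A from its single seed. The function name recover_labels_watershed, the layout of R (row k holds the representative of label k+1) and the toy arrays are assumptions for illustration. I cast the result to 64 bits explicitly because I have not checked which integer type the function returns.

```python
import numpy as np
from skimage.segmentation import watershed


def recover_labels_watershed(A, R):
    """Flood each component of A from its representative.

    Assumed layout (hypothetical): R has shape (n, A.ndim) and R[k] is a
    pixel of the component whose label in B is k + 1.
    """
    A = np.asarray(A, dtype=bool)
    R = np.asarray(R)

    # One seed per component, carrying the label to propagate.
    markers = np.zeros(A.shape, dtype=np.int64)
    markers[tuple(R.T)] = np.arange(1, len(R) + 1, dtype=np.int64)

    # Constant "landscape": there are no ridges, so the flooding is limited
    # only by the mask. connectivity=1 is 4-connectivity in 2D and should
    # match the connectivity that was used to build B.
    flat = np.zeros(A.shape, dtype=np.uint8)
    B = watershed(flat, markers=markers, mask=A, connectivity=1)
    return B.astype(np.int64)


# Toy example: three components, labels deliberately not in scan order.
A = np.array([[1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1],
              [0, 0, 0, 1, 0],
              [1, 0, 0, 0, 0]], dtype=bool)
B_true = np.array([[2, 2, 0, 0, 0],
                   [2, 0, 0, 3, 3],
                   [0, 0, 0, 3, 0],
                   [1, 0, 0, 0, 0]], dtype=np.int64)
R = np.array([[3, 0], [0, 1], [2, 3]])  # representatives of labels 1, 2, 3

B_ws = recover_labels_watershed(A, R)
```

Since each component contains exactly one seed, no two floods ever meet, so the result should not depend on the order in which pixels are visited.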

Regarding my specific question, the idea is that you no longer have B (you forgot it), but you know exactly one pixel in each blob (a representative), you know the array B != 0 (namely A), and you want to recover B efficiently.

I used the term representative because the partition in connected components defines an equivalence relation on the locus A>0.

Thank you for the references, these are indeed interesting for the overall question of efficient storage of labeling.