Remote Module Grading/Ranking System Needed

jhlegarreta · February 21, 2020, 9:04am

Thanks for the monumental work you have done lately to update the remotes Hans. And both of you for discussing these aspects.

In the past I invested some time in trying to improve the remote module infrastructure to comply with the recommended practices in the ITKModuleTemplate, their coverage, keeping up to date with the core ITK coding style changes, creating (together with some folks in the community) some scripts in ITK/Utilities/Maintenance to be able to apply a given change to all remotes, etc.

I have not had that time since a while, though. I am sorry.

Some have still not transitioned to Azure pipelines, and as Hans says, some have not built successfully for a while.

Having read Hans’ introduction, one of the first things that I realize is that we made an effort to transfer the repositories to the InsightSoftwareConsortium organization. This was motivated because very frequently we had a few issues with the upstream repositories (e.g. did not have merge rights, etc.), or the original contributors and maintainers did no longer have the time to maintain their repository. This somehow means that the maintenance burden is put back to the core developers or to the community in a broader sense. I think this is a chicken and egg problem.

Another of the issues had to do with keeping the modules up-to-date with the latest ITK release, but at the same time providing maintenance for prior releases or other ITK build options (e.g. ITKv4 vs. ITKv5, CMake flags such as ITK_LEGACY_REMOVE to On/Off, etc.), or allowing the modules to be used both within the ITK source tree and outside. Both these certainly require work to create the appropriate CI builds, tagging the repositories, etc.

IMHO, scoring the remotes would still demand an automated system based on the criteria we may come up with, including the ones proposed in this thread.

A few thoughts:

In my mind, a proxy of the number of users would be the number of clones or the number of PyPI installs, rather than the number of stars.
The accuracy or the fact that it has been validated I guess it comes from the amount and quality testing (i.e. does it use just synthetic testing data?, can we add real imaging data?, etc.) or from its use/citation in scientific publications (this last may be hard to track unless and even though a citation system like Zenodo is used?) beyond the IJ.
IMHO the robustness is determined by the CI. Additionally, if I am not mistaken the Python ecosystem has tools to automatically report failures from users that use the package.
Can we automatically determine/measure whether all methods are documented?

And Hans’ points are also interesting/necessary.

IMHO it is unfeasible that this is done manually by any core developer.

And finally, when a remote has been around for some time and is at compliance level 0 or 1, when is the time to integrate it into ITK proper? In the ITK Software Guide, section 9.7.2 Procedure for Adding a Remote Module, p. 239, it was agreed to say that

After the Remote Module has experienced sufficient testing, and community members express broad interest in the contribution, the submitter can then move the contribution into the ITK repository via GitHub code review.

But I think we did not succeed in doing this for a few of the remotes that have been around for some time (which would ease some of the issues above).

May be we should try to define what is sufficient and broad interest (e.g. number of pip installs?) as well using the criteria you both proposed.