CPANTS update

There are tons of blogs and articles trying to find an answer on How to measure software quality. As this is a rather difficult question and we don't really have a good tool for measuring it Michael Schwern - one of the key persons in the Perl QA team - came up with the idea to measure Kwalitee instead.

He even created a tool called CPANTS that measures Kwalitee. The tool is now maintained by Thomas Klausner with my occasional additions to the code.

CPANTS has currently 48 metrics. Some are direct such as has_test others are indirect such as is_prereq or some of the recent ones such as distributed_by_debian.

is_prereq indicates if the module is used by some other module of some other author on CPAN. Using this as a quality indicator uses the same argument as in the academics, where the important metric is not how many articles have you written but rather how many other articles references yours and how important those articles are. Google uses the same idea in ranking pages.

I assume the same is true on CPAN. CPAN authors are usually picky. So if a CPAN author uses module X that probably means modules X is good. Or at least unavoidable.

In a similar way we have added a set of new metrics:

distributed_by_debian indicates if the module has been included in Debian. Inclusion in a Linux distribution likely means there is a need for that module in the user base. In return that probably means the module is, well, usable. (Currently there are only 790 modules listed as such due to some fragmentation in the way Debian maintains Perl modules but I hope this will be soon resolved.)
latest_version_distributed_by_debian indicates if the version included in Debian is the latest from CPAN. Obviously this indicator has some problems. A module that has frequent updates will fail this metric most of the time as the Debian people take their time upgrading the module. After all they want to make sure that they only include good versions of a module. So take this metric with appropriate caution.
has_no_bugs_reported_in_debian Debian has its own bug reporting system, separate from RT used by CPAN authors. It is quite possible that there is a bug in the version distributed by Debian that has already been fixed on CPAN but on the long run, modules without open bug reports are likely to be of better quality for some meaning of quality.
has_no_patches_in_debian As the Debian maintainer see themselves responsible for providing high quality software to their users sometimes they fix bugs in the software they distribute. These fixes are called patches. Normally these patches should be integrated back to the main version of the module on CPAN and then the Debian maintainer should be able to update the version they distribute by a unpatched version.
Still as long as there are patches for a module it means there is a bug somewhere in the module on CPAN.

In addition there is a new experimental metric called

uses_test_nowarnings.
Most of the CPAN modules come with plenty of tests. The CPAN Testers report any test failure which is really cool. Still sometime I encounter modules that while they pass their test suit they also generate warnings during the test execution.
These warnings are probably never reported as most people run the tests in an unattended mode.
For now it seems that the best way to catch and report these warnings is by using [dist://Test::NoWarnings] in every test script. So this new metric checks if all the test script use it.

The metrics concerting Debian server two purposes. On one hand they can be used as indicators for the kwalitee of a module. On the other hand providing an easy way for CPAN authors to monitor what happens to their module can help resolving issues encountered in Debian.

We started with Debian but I really hope soon we'll be able to hook up other distributions too such as FreeBSD and Fedora

Comments

related on module-authors list

perl-qa list

related on perl-qa list