Measuring Copying of Java Archives

Tetsuya Kanda, Daniel Morales German, Takashi Ishio, Katsuro Inoue


Copying an entire library is one of the major forms of reuse in software development.
In Java, a single library archive file often contains other libraries it depends on,
but users of the library are rarely aware of these inner libraries.
Because library reuse is a black-box practice, developers may combine several libraries
without knowing that each of them independently bundles the same library.
As a result, a library may end up containing several copies of a library it reuses.
In this research, we measured copying of jar archives in the Maven Central Repository, a collection of open source Java libraries.
Our results show that about 14% of top-level jar files are reused in other jar files
and that some of them are duplicated within a single jar file.
We also found that some libraries contain two or more different versions of the same library.
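
To make the notion of a duplicated inner library concrete, the following is a minimal sketch of how such duplicates could be detected. It is not the procedure used in this study: it assumes that inner libraries are stored as nested .jar entries and treats two inner jars as copies only when their bytes are identical, so it misses libraries that were unpacked or repackaged, as well as different versions of the same library. The function names find_inner_jars and duplicated_inner_jars are hypothetical.

```python
import hashlib
import io
import zipfile
from collections import defaultdict


def find_inner_jars(jar_bytes, prefix=""):
    """Yield (path, sha256) for every .jar entry nested in a jar, recursively.

    Assumption for illustration: inner libraries are stored as nested .jar
    entries, and a jar is an ordinary zip archive.
    """
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".jar"):
                continue
            data = archive.read(name)
            path = prefix + name
            yield path, hashlib.sha256(data).hexdigest()
            # Descend into the inner jar to find jars nested more deeply.
            if zipfile.is_zipfile(io.BytesIO(data)):
                yield from find_inner_jars(data, prefix=path + "!/")


def duplicated_inner_jars(jar_bytes):
    """Map each content hash to the inner-jar paths that share it (2 or more)."""
    groups = defaultdict(list)
    for path, digest in find_inner_jars(jar_bytes):
        groups[digest].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Given the bytes of a top-level jar file (for example, read with open(path, "rb").read()), duplicated_inner_jars returns one entry per inner jar that occurs more than once, with nested locations written as outer.jar!/inner.jar.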

Hosted By Universitätsbibliothek TU Berlin.