About Metrics for Clone Detection

Thierry Lavoie, Ettore Merlo

Abstract


Clone detectors rely on similarity and dissimilarity measures to identify cloned fragments. The choice of a specific distance function in a clone detector is, to some extent, arbitrary. However, with deeper knowledge of similarity measures, this choice can be constrained so that the function has properties that help improve the scalability and quality of tools. This paper presents some results, insights, and questions about similarity and dissimilarity measures, including a somewhat counter-intuitive result on the cosine distance.
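
As an illustration only, the following minimal Python sketch shows why the properties of a distance function matter. It is not taken from the paper and is not necessarily the counter-intuitive result the abstract refers to. It assumes code fragments are represented as token-count vectors (the tokens and the names `cosine_distance` and `angular_distance` are hypothetical) and checks a well-known fact: the cosine distance, defined as 1 minus the cosine similarity, can violate the triangle inequality, whereas the angle between the vectors does not.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two token-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an empty fragment")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """1 - cosine similarity; a dissimilarity, not a metric in general."""
    return 1.0 - cosine_similarity(a, b)


def angular_distance(a, b):
    """Angle in radians between the two vectors."""
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, cosine_similarity(a, b)))
    return math.acos(cos)


# Hypothetical fragments, reduced to token counts (assumed representation).
x = Counter({"if": 1})
y = Counter({"if": 1, "for": 1})
z = Counter({"for": 1})

# Triangle inequality would require d(x, z) <= d(x, y) + d(y, z).
direct = cosine_distance(x, z)
detour = cosine_distance(x, y) + cosine_distance(y, z)
print("cosine  direct:", direct, "detour:", detour)

direct_angle = angular_distance(x, z)
detour_angle = angular_distance(x, y) + angular_distance(y, z)
print("angular direct:", direct_angle, "detour:", detour_angle)
```

In this example the direct cosine distance between `x` and `z` is larger than the detour through `y`, so the triangle inequality fails. Index structures that prune candidates using the triangle inequality (one route to scalability) therefore cannot be applied to this form of cosine distance without care.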


DOI: http://dx.doi.org/10.14279/tuj.eceasst.63.923

DOI (PDF): http://dx.doi.org/10.14279/tuj.eceasst.63.923.915

Hosted By Universitätsbibliothek TU Berlin.