Literature/1993/Ellis


 * http://works.bepress.com/furner/8

Authors

 * University of Sheffield

Abstract
Describes the use of a variety of similarity coefficients in the measurement of the degree of similarity between objects that contain textual information, such as documents, paragraphs, index terms or queries. The work is intended as a preliminary to future investigation of the calculations involved in measuring the degree of similarity between structured objects that may be represented in graph-theoretic forms. Discusses the role of similarity coefficients in text retrieval in terms of: document-query similarity; document-document similarity; co-citation analysis; term-term similarity; and the similarity between sets of judgements, such as relevance judgements. Describes several methods for expressing the formulae used to define similarity coefficients and compares their attributes. Concludes with details of the characteristics of similarity coefficients: equivalence and monotonicity; consideration of negative matches; geometric analyses; and the meaning of correlation coefficients.
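As a rough illustration of the kind of coefficient the abstract refers to, the sketch below computes four widely used similarity coefficients for two objects represented as sets of index terms. It is not taken from the paper: the function names, the example term sets, and the `universe` vocabulary are all hypothetical, and the binary (set-based) representation is an assumption made for brevity. The counts follow the usual 2x2 contingency notation: `a` terms shared by both objects, `b` and `c` terms unique to each, and `d` terms absent from both (the "negative matches").

```python
from math import sqrt

def contingency(x, y, universe):
    """Return (a, b, c, d) for two term sets drawn from a shared vocabulary."""
    a = len(x & y)               # terms present in both objects
    b = len(x - y)               # terms present only in x
    c = len(y - x)               # terms present only in y
    d = len(universe - (x | y))  # terms absent from both (negative matches)
    return a, b, c, d

def jaccard(a, b, c, d):
    denom = a + b + c
    return a / denom if denom else 0.0

def dice(a, b, c, d):
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0

def cosine(a, b, c, d):
    denom = sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0

def simple_matching(a, b, c, d):
    # The only coefficient here that counts negative matches (d).
    denom = a + b + c + d
    return (a + d) / denom if denom else 0.0

# Hypothetical example data, not from the paper.
doc1 = {"similarity", "coefficient", "text", "retrieval"}
doc2 = {"similarity", "coefficient", "graph"}
universe = doc1 | doc2 | {"citation", "query", "index"}

counts = contingency(doc1, doc2, universe)  # (2, 2, 1, 3)
for name, fn in [("Jaccard", jaccard), ("Dice", dice),
                 ("Cosine", cosine), ("Simple matching", simple_matching)]:
    print(f"{name}: {fn(*counts):.3f}")
```

Two of the characteristics listed in the abstract show up even in this small case. Dice and Jaccard are monotonic with one another, since Dice = 2J / (1 + J), so they always rank pairs of objects in the same order. The simple matching coefficient, by contrast, changes when unrelated terms are added to `universe`, because it counts negative matches, while the other three do not.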