Returning to the bag-of-words example, we can use the notion of angle to measure how close two documents are to each other.
Given two documents and a pre-defined list of words (the dictionary), we can compute for each document the vector of frequencies of the dictionary words as they appear in that document; call these vectors [latex]x, y[/latex]. The angle between the two vectors is a widely used measure of closeness (similarity) between documents: the smaller the angle, the more similar the documents' word usage.
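As an illustration, the computation can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes the usual definition of the angle [latex]\theta[/latex] between two non-zero vectors, [latex]\cos\theta = x^Ty / (\|x\|_2 \|y\|_2)[/latex], and it assumes that documents are split into words on whitespace after lower-casing. The function names `frequency_vector` and `angle_between` and the sample dictionary are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def frequency_vector(text, dictionary):
    """Count how often each dictionary word occurs in the text.

    Assumption: words are separated by whitespace and compared
    case-insensitively. Raw counts are used; dividing by the document
    length would not change the angle, since the angle is unaffected
    by positive scaling of either vector.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in dictionary]


def angle_between(x, y):
    """Return the angle (in radians) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)


# Hypothetical dictionary and documents, for illustration only.
dictionary = ["cat", "dog", "fish"]
x = frequency_vector("cat cat dog", dictionary)
y = frequency_vector("fish fish", dictionary)
theta = angle_between(x, y)
```

Here the two sample documents share no dictionary words, so their frequency vectors are orthogonal and the angle is [latex]\pi/2[/latex]. Two documents whose frequency vectors are positive multiples of each other would instead give an angle of zero.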
See also: