"

Returning to the bag-of-words example, we can use the notion of angle to measure how close two documents are to each other.

Given two documents and a pre-defined list of words appearing in them (the dictionary), we can compute the vectors [latex]x, y[/latex] of word frequencies, one for each document. The angle between the two vectors is a widely used measure of closeness (similarity) between documents: the smaller the angle, the more similar the word usage in the two documents.
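
As an illustration, the angle [latex]\theta[/latex] between two non-zero vectors satisfies [latex]\cos \theta = \dfrac{x^T y}{\|x\|_2 \, \|y\|_2}[/latex]. The following Python sketch applies this formula to two toy documents. The function names `frequency_vector` and `angle_between`, the whitespace tokenization, and the sample dictionary and documents are assumptions made for this example only; they are not taken from the text.

```python
import math
from collections import Counter


def frequency_vector(document, dictionary):
    """Relative frequency of each dictionary word in the document.

    Assumption: words are separated by whitespace and compared in lower case.
    """
    words = document.lower().split()
    if not words:
        return [0.0] * len(dictionary)
    counts = Counter(words)
    return [counts[word] / len(words) for word in dictionary]


def angle_between(x, y):
    """Angle in radians between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against rounding error before calling acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)


# Hypothetical dictionary and documents, chosen only for illustration.
dictionary = ["vector", "matrix", "angle"]
doc_a = "vector angle vector"
doc_b = "matrix matrix angle"

x = frequency_vector(doc_a, dictionary)
y = frequency_vector(doc_b, dictionary)
theta = angle_between(x, y)
print(f"cos(theta) = {math.cos(theta):.3f}, theta = {math.degrees(theta):.1f} degrees")
```

Because frequency vectors have non-negative entries, the angle always lies between 0 and 90 degrees: 0 when the two vectors point in the same direction, and 90 degrees when the documents share no dictionary word.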

License

This work (Đại số tuyến tính by Tony Tin) is free of known copyright restrictions.