Reading

The following material is optional, but interested readers are encouraged to explore it.

Google Ngram Viewer

This site provides information on the Google Ngram Viewer, also known as Google Books Ngram Viewer, which is one of the most widely used n-gram tools in scientific research and humanities scholarship.

View: Google Ngram Viewer

N-grams Data

This site provides 2-, 3-, 4-, and 5-grams based on the publicly available Corpus of Contemporary American English (COCA).  It supports a number of different queries, including noun + noun sequences, verb + “the” + noun sequences, three-word strings with a preposition in the middle position, and two-word strings in which either word has a specific beginning or ending.

View: ngrams.info


Big Data for the Humanities Using Google Ngrams: Discovering Hidden Patterns of Conceptual Trends

Shai Ophir

Read: Big Data for the Humanities Using Google Ngrams: Discovering Hidden Patterns of Conceptual Trends


Digital Humanities, Big Data, and Ngrams

Claude S. Fischer

June 30, 2013

Read: Digital Humanities, Big Data, and Ngrams


Understanding Word N-grams and N-gram Probability in Natural Language Processing

Sunny Srinidhi

Nov 27, 2019

Read: Understanding Word N-grams and N-gram Probability in Natural Language Processing


From DataFrame to N-Grams

Ednalyn C. De Dios

May 22, 2020

Read: From DataFrame to N-Grams


Python Code

The code used to generate the bigram visualization displayed in this section is discussed in the next course.  Interested readers may refer to the script Bigram_Visualization_Example.py, which can be modified to calculate and display unigrams and trigrams.  A Jupyter Notebook version (Bigram_Visualization_Example.ipynb) is also available.
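For readers who would like a quick preview before opening the script, the following is a minimal sketch of how counting might be generalized from bigrams to unigrams and trigrams.  It is not the code from Bigram_Visualization_Example.py.  The function names tokenize and ngram_counts, the sample sentence, and the simple regular-expression tokenizer are all assumptions made for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes.

    This is a deliberately simple tokenizer (an assumption for this sketch);
    real projects often use a dedicated NLP library instead.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count every sequence of n consecutive tokens.

    zip() over n shifted copies of the token list yields each n-gram
    as a tuple, e.g. ("the", "cat") for n = 2.
    """
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


# Hypothetical sample text, used only to demonstrate the functions.
sample = "the cat sat on the mat and the cat slept"
tokens = tokenize(sample)

# Changing n switches between unigrams (1), bigrams (2), and trigrams (3).
for n in (1, 2, 3):
    print(n, ngram_counts(tokens, n).most_common(2))
```

Because the n-gram size is an ordinary parameter, the same counting function serves all three cases; only the value of n changes.  The resulting counts could then be passed to whatever plotting approach the course script uses.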


License

Contemporary Digital Humanities Copyright © 2022 by Mark P. Wachowiak is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
