Reading

The following material is optional, but interested readers are encouraged to explore it.

Google Ngram Viewer

This site provides information on the Google Ngram Viewer, also known as the Google Books Ngram Viewer, which is one of the most widely used n-gram tools in scientific research and humanities scholarship.

View: Google Ngram Viewer

 

N-grams Data

This site provides 2-, 3-, 4-, and 5-grams based on the publicly available Corpus of Contemporary American English (COCA).  The site supports a number of different queries, including noun + noun sequences, verb + “the” + noun sequences, three-word strings with a preposition in the middle position, and two-word strings in which either word has a specific beginning or ending.

View: N-grams Data

 

Big Data for the Humanities Using Google Ngrams.  Discovering Hidden Patterns of Conceptual Trends

Shai Ophir

View: Big Data for the Humanities Using Google Ngrams.  Discovering Hidden Patterns of Conceptual Trends

 

Digital Humanities, Big Data, and Ngrams

Claude S. Fischer

June 30, 2013

View: Digital Humanities, Big Data, and Ngrams

 

Understanding Word N-grams and N-gram Probability in Natural Language Processing

Sunny Srinidhi

Nov 27, 2019

Here and Here

 

From DataFrame to N-Grams

A quick-start guide to creating and visualizing n-gram ranking using nltk for natural language processing.

Ednalyn C. De Dios

May 22, 2020

View: From DataFrame to N-Grams

 

 

PROGRAMMING INFORMATION: The following websites contain useful information about many of the Python and library functions employed in the examples in this section.

Data Frame Manipulation (Pandas)

Here and Here

 

Natural Language Processing Toolkit (NLTK)

nltk.org

 

Plotly Functions

Text and Annotation

 

Range Slider (Date/Time Selector)

 

Time Series Plots

 

Hover Text

 

ABC Headline News Corpus (CSV file)

 

 

Python Code

 

This section requires the following Python code:

N-Gram_Visualization_Example.py (Jupyter Notebook N-Gram_Visualization_Example.ipynb) and the data file abcnews-date-text-bydate-yrmo.csv.

Bigram_Visualization_Example.py (Jupyter Notebook Bigram_Visualization_Example.ipynb) and the following data files:

abcnews-date-text-bydate-yrmo.csv, abcnews-date-text-bydate.csv, and abcnews-date-text.csv.
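As a rough orientation before opening these scripts, the core idea of counting and ranking n-grams can be sketched with the Python standard library alone. This is not the code in the files listed above: the function names `ngrams` and `rank_ngrams` are hypothetical, the headlines are made up rather than taken from the ABC corpus, and the whitespace tokenization is a deliberate simplification of what a toolkit such as NLTK offers.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a list of tokens."""
    # Zip the token list against copies of itself shifted by 1, 2, ..., n-1.
    return list(zip(*(tokens[i:] for i in range(n))))


def rank_ngrams(texts, n=2, top=3):
    """Count n-grams across several texts and return the most frequent ones."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenization (assumption)
        counts.update(ngrams(tokens, n))
    return counts.most_common(top)


# Made-up headlines for illustration only (not taken from the ABC corpus).
headlines = [
    "city council approves new budget",
    "city council rejects new stadium plan",
    "new budget passes after city council vote",
]

# Print the highest-ranked bigrams with their counts.
for gram, count in rank_ngrams(headlines, n=2, top=3):
    print(" ".join(gram), count)
```

Changing `n` to 3 ranks trigrams instead. Counting within each headline separately keeps n-grams from spanning two unrelated headlines.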

License

Digital Humanities Tools and Techniques II Copyright © 2022 by Mark Wachowiak, Ph.D. is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
