Reading

The material on the following site is important and should be read either before or after studying this section.

Principal Component Analysis Explained Visually

Victor Powell (with text by Lewis Lehe)

This short web article provides a basic introduction to PCA illustrated with graphical examples.  It is highly recommended for its intuitive yet thorough presentation, which emphasizes the “why” of PCA and how it can be employed for data exploration.

The following material is optional.  However, interested readers are encouraged to peruse it.

Principal Component Analysis (PCA) in Python

Aditya Sharma

January 1, 2020

This web article provides a thorough, intuitive introduction to principal component analysis (PCA) without complex mathematics.  The main concepts of PCA are illustrated through Python examples that use scikit-learn library functions and popular data sets.
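As a rough illustration of the kind of workflow that article covers, the following minimal sketch applies PCA with scikit-learn.  The choice of the Iris data set, the standardization step, and the variable names are assumptions made here for illustration; they are not necessarily what the article itself uses.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the small Iris data set bundled with scikit-learn stands in
# for the "popular data sets" mentioned above (150 samples, 4 features).
X = load_iris().data

# PCA is sensitive to feature scale, so standardize each feature first.
X_scaled = StandardScaler().fit_transform(X)

# Project the four standardized features onto the first two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each retained component.
explained = pca.explained_variance_ratio_

print(X_2d.shape)
print(explained)
```

The `explained_variance_ratio_` attribute gives a quick sense of how much information is kept when the data are reduced to two dimensions for plotting.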

 

Clustering with Scikit-Learn in Python

Thomas Jurczyk

September 29, 2021

This web article provides a thorough demonstration of k-means clustering applied to data on Greco-Roman authors of the ancient world.  Principal component analysis is then used to further analyze the results.  The example is implemented with the scikit-learn package in Python, and many code snippets are presented.  Mathematical details and more advanced techniques are also provided, which the reader may skip.  For the purposes of the present discussion, the most useful parts of the article are its treatment of k-means clustering and principal component analysis, its explanation of the application at the intersection of literary studies and classical studies, and its instructive Python code.
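The general pattern described above, clustering first and then using PCA to inspect the clusters in two dimensions, can be sketched as follows.  This is only a sketch under stated assumptions: the data are synthetic (generated with `make_blobs`), not the ancient authors data, and the number of clusters and all variable names are hypothetical choices made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data with 6 features stands in for a real data set.
X, _ = make_blobs(n_samples=200, n_features=6, centers=3, random_state=0)

# Standardize so that no single feature dominates the distance computations.
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full 6-dimensional feature space.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Reduce to two principal components purely to inspect the clusters.
coords = PCA(n_components=2).fit_transform(X_scaled)

# Summarize each cluster by its size and its mean position in PCA space.
for k in range(3):
    members = coords[labels == k]
    print(k, len(members), members.mean(axis=0))
```

Note that the clustering is done on the full feature set and PCA is applied afterwards only as a way of viewing the result; other orderings are possible, and the article should be consulted for the approach it actually takes.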

 

The following website may be used for reference.

PCA with Scikit Learn in Python

 

Python Code

This section uses the following Python code:

K-Means_Ancient_Authors_Example.py (Jupyter Notebook K-Means_Ancient_Authors_Example.ipynb) and the data file DNP_ancient_authors.csv.

PCA_tSNE_Example.py (Jupyter Notebook PCA_tSNE_Example.ipynb).

K-Means_tSNE_Example.py (Jupyter Notebook K-Means_tSNE_Example.ipynb).

Note: Because the data generated are random, the visualizations may differ from those shown in the text.
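To obtain the same random data on every run, a random seed can be fixed.  The following minimal sketch shows the idea with NumPy; the function `make_data` and its `seed` parameter are hypothetical and are not taken from the scripts listed above, which may generate their data differently.

```python
import numpy as np

def make_data(seed=None):
    """Return a small array of normally distributed random values.

    Hypothetical helper: a fixed seed gives identical data on every call,
    while seed=None gives fresh random data each time.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(size=(5, 2))

a = make_data(seed=42)
b = make_data(seed=42)

# The two arrays are identical because the same seed was used for both.
print(np.array_equal(a, b))
```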


License

Digital Humanities Tools and Techniques II Copyright © 2022 by Mark Wachowiak, Ph.D. is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
