Appendix
This project is made possible with funding by the Government of Ontario and through eCampusOntario’s support of the Virtual Learning Strategy.
To learn more about the Virtual Learning Strategy visit: https://vls.ecampusontario.ca.
Nipissing University sits on the territory of Nipissing First Nation, the territory of the Anishnabek, within lands protected by the Robinson Huron Treaty of 1850. We are grateful to be able to live and learn on these lands with all our relations.
PYTHON AND R TUTORIALS USING JUPYTER NOTEBOOKS
This course includes an interactive Python statistics tutorial (PythonStatisticsTutorial.ipynb) in the form of a Jupyter Notebook. Jupyter (https://jupyter.org/) is a free, open-source, interactive web tool, or “notebook”, in which text, code from a variety of programming languages, output, explanations, and multimedia resources can be combined into a single document that is presented through a web browser (Perkel, 2018). The notebooks facilitate interactive data exploration, where learners can execute code, observe the results, modify and experiment with the code, and engage in an “iterative conversation” between scholars, learners, computations, and data (Perkel, 2018).
In addition to the tutorials, several Jupyter notebooks are provided to supplement the Python and R code presented in this course.
Although many of the code examples in this course have corresponding Jupyter notebooks, the code is also supplied as Python and R files so that it can be used without Jupyter.
Although Jupyter notebooks are the preferred method for using the tutorials, as they integrate text, code, and interactivity, the tutorials do not have to be run in notebooks. If the instructor or learner encounters difficulties running the notebooks, or decides not to use them, the corresponding Python and R code is available as *.py and *.R files, respectively, which can be run in an appropriate interface for each language. The Python code (*.py files) can be run in interfaces such as IDLE or Spyder, or from the command line. The R code (*.R files) can be run in interfaces such as RGui and RStudio.
INSTALLING JUPYTER
Installation instructions for Jupyter are found at https://jupyter.org/install. For Python, install the classic Jupyter Notebook with pip:
pip install notebook
To run the notebook from the command line (e.g., in Windows, accessed through cmd):
jupyter notebook
This command opens the Jupyter Notebook interface in the user’s browser.
When first working with Jupyter, it is easiest to run the above command in the directory (folder) that contains the Jupyter notebook files (files with the .ipynb extension). It is also easiest to keep all data files used by the notebooks in that directory. However, the user can navigate to any directory from the main Jupyter interface that opens in the browser after the jupyter notebook command is entered on the command line.
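As a quick check before launching Jupyter, the following minimal Python sketch lists the notebook files in a directory. The function name list_notebooks is a hypothetical helper introduced here for illustration; it is not part of the course distribution.

```python
from pathlib import Path


def list_notebooks(directory="."):
    """Return the sorted names of all *.ipynb files found directly in `directory`."""
    # Hypothetical helper for illustration; not part of the course distribution.
    return sorted(p.name for p in Path(directory).glob("*.ipynb"))


if __name__ == "__main__":
    # "." is the current working directory, i.e., the directory from which
    # the `jupyter notebook` command would be run.
    notebooks = list_notebooks(".")
    if notebooks:
        print("Notebooks found:")
        for name in notebooks:
            print("  " + name)
    else:
        print("No .ipynb files found in this directory.")
```

If no notebooks are reported, the command line is probably in a different directory from the one that holds the course notebooks.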
Jupyter supports both Python and R. To set up Jupyter for R, install the IRkernel package from the R command line (e.g., from the command line in the RGui interface):
install.packages('IRkernel')
After the IRkernel package has been installed, run the following command from the R command line to make the kernel available to Jupyter:
IRkernel::installspec(user = FALSE)
See the document Package ‘IRkernel’ on https://cran.r-project.org/web/packages/IRkernel/IRkernel.pdf for additional information.
INTERACTING WITH THE JUPYTER NOTEBOOKS
Jupyter Notebooks provide the user with flexibility to interact with the code in different ways. For instance, users may choose to run the entire notebook by selecting Cell from the menu, and then selecting Run All. The user may then modify the code, add new cells, and experiment with the code.
DATA USED BY THE PYTHON AND R SCRIPTS AND JUPYTER NOTEBOOKS
Many of the Python and R scripts and Jupyter notebooks require data files, which are supplied in this distribution. The data may be stored in any directory (folder), but the file path in the corresponding code (Python, R, and Jupyter notebooks) must then be adjusted to match. The Python and R code defaults to the Data\ directory; that is, the code expects to find its data in a Data subdirectory within the directory where the code executes. For instance, if the Python code is placed in a directory named C:\DIGI2306\Python, the data would be placed into the C:\DIGI2306\Python\Data directory. For R, the directories would be, for example, C:\DIGI2306\R and C:\DIGI2306\R\Data.
For the Jupyter Notebooks, the default is for data to be located in the same location as the notebooks. For instance, if the Jupyter notebooks were placed into the directory C:\DIGI2306\Jupyter, then the data files would be placed there too.
The file path in the code can be modified as necessary. The code contains commented sections indicating where the path(s) should be changed.
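The directory conventions described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names DATA_DIR and data_path and the file name example.csv are hypothetical and may not match the variable names or files used in the course code.

```python
from pathlib import Path

# Default layout for the Python scripts described above: a "Data"
# subdirectory inside the directory the code runs from.
# DATA_DIR is a hypothetical name chosen for this sketch.
DATA_DIR = Path("Data")

# For the Jupyter notebooks, the default is the notebook's own directory:
# DATA_DIR = Path(".")

# To use a different location, point DATA_DIR at it instead, e.g.:
# DATA_DIR = Path(r"C:\DIGI2306\Python\Data")


def data_path(filename, data_dir=DATA_DIR):
    """Build the full path to a data file inside the data directory."""
    return Path(data_dir) / filename


if __name__ == "__main__":
    # "example.csv" is a placeholder, not a file from the course distribution.
    path = data_path("example.csv")
    print(path, "exists" if path.exists() else "not found")
```

Keeping the directory in a single variable means that only one line has to change if the data files are moved.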
JUPYTER NOTEBOOKS AVAILABLE FOR THE COURSES IN THIS CERTIFICATE
The following Jupyter notebooks are available for the first three courses in this certificate (DIGI 2016, DIGI 2316, and DIGI 3017).
Bigram_Visualization_Example.ipynb
Colours_Example.ipynb
GenderedPerspectives_Visualization.ipynb
GenreTree_Example.ipynb
GIS_Density_Mapping_Example.ipynb
K-Means_Ancient_Authors_Example.ipynb
K-Means_Example.ipynb
K-Means_tSNE_Example.ipynb
N-Gram_Visualization_Example.ipynb
PCA_tSNE_Example.ipynb
PythonStatisticsTutorial.ipynb
PythonTutorial.ipynb
Regression_Example.ipynb
RTutorial.ipynb
Sentences_KMeans_Example.ipynb
SocialNetworks_GIS_Example.ipynb
SocialNetwork_Visualization_Example.ipynb
Sunburst_Example.ipynb
TextAnalysis_Example.ipynb
TF-IDF_Example.ipynb
Visualizations_Matplotlib_Plotly_Example.ipynb
WordCloud_Example_2.ipynb
REFERENCES
Perkel, J. M. (2018). Why Jupyter is data scientists’ computational notebook of choice. Nature, 563(7732), 145-147.