The “Defining the Digital Humanities” Genre
As discussed above, definitions of the “digital humanities” abound, and some are canonical, such as the widely cited 2010 definition and description by Matthew G. Kirschenbaum (M. G. Kirschenbaum, 2016). Genre is an important object of study in literary studies, itself a large subfield of the digital humanities, and defining the digital humanities has become so popular and controversial an activity that it has given rise to a genre of its own, devoted specifically to digital humanities definitions. The literature on what the digital humanities is/are (its “welcoming”, or “pull”, aspects) and is/are not (its “gate-keeping”, or “push”, aspects) makes up a large part of this new “Defining the Digital Humanities” genre.
One way to investigate this genre is topic modeling, a technique that is itself widely employed in the digital humanities. A topic model is a statistical model that infers the topics present in a corpus (a large collection or body of documents). The topics are not known beforehand. They assist in the analysis of semantic structures that are embedded in text, and topic modeling is also useful for discovering the abstract topics that are important to, and defining for, a particular genre. Topic modeling identifies large “clusters” of words that belong to each topic. By studying these clusters, scholars can assign names, or labels, to the topics, and use them to research key semantic questions about the texts. In recent scholarship on the new “Defining the Digital Humanities” genre, researchers identified 334 definitions (Callaway et al., 2020). They collected and curated a corpus of digital humanities definitions in the English language and included 15 different metadata fields (data about the data) recording information such as the department, career stage, and institution of the author of each definition (Callaway et al., 2020). The researchers performed topic modeling with software written in the R programming language, and subsequently analyzed the results through word cloud (tag cloud) visualizations and statistical methods. However, according to the researchers, the most useful results were obtained by analyzing the metadata (Callaway et al., 2020).
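To make the mechanics of topic modeling concrete, the following is a minimal sketch in Python rather than the R software used by Callaway et al. (2020). It assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling, which is one common kind of topic model; the source does not state which model the researchers used. The function names `fit_lda` and `top_words`, the hyperparameter values, and the six toy documents are all hypothetical illustrations and are not drawn from the researchers' corpus.

```python
import random
from collections import Counter

# Hypothetical toy "definitions", invented for illustration only.
TOY_DOCS = [
    "programming code students job field".split(),
    "students learn programming code tools".split(),
    "code job field scholars programming".split(),
    "literary text reading criticism literature".split(),
    "reading literature text literary novels".split(),
    "criticism text reading literary scholars".split(),
]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts. alpha and beta are assumed smoothing values.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token count per topic
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per doc
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assignments.append(zs)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Weight of topic k: how common k is in this document
                # times how strongly k favors this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1

    return topic_word, doc_topic


def top_words(topic_word, n=5):
    """Return up to n highest-count words for each topic."""
    return [
        [w for w, count in counts.most_common(n) if count > 0]
        for counts in topic_word
    ]


topic_word, doc_topic = fit_lda(TOY_DOCS, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"Topic {k}: {', '.join(words)}")
```

The printed word lists correspond to the “clusters” described above: the model supplies only numbered topics, and assigning a human-readable label to each one remains an interpretive step for the scholar. With a corpus this small the clusters are not guaranteed to separate cleanly, and a real study would more likely rely on an established topic modeling library than on a hand-written sampler.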
From the results of this investigation, the researchers analyzed four topics in detail. Some of the words included in the topics were: “job, field, scholars, students, programming”, “values, community, open, collaboration, openness”, “literary, reading, text, literature, criticism”, and “race, projects, gender, women, studies”. The topics represented by these words were respectively designated as “Code”, “Community”, “Distant Reading”, and “Diversity and Inclusion”.
The researchers arrived at some interesting conclusions. The multitude of definitions offsets the “push”, or “gate-keeping”, mechanism of the digital humanities (barriers to entering or progressing in the field, such as a lack of computer programming or other technical knowledge). However, this same definitional variety can be disorienting to aspiring digital humanities scholars. The researchers detected possible gender trends in the topics classified as “Distant Reading” and “Diversity and Inclusion”, suggesting that gender differences may influence which topics a definition emphasizes. The authors also found that the definitions in the corpus reflected gender and class imbalances, as evidenced by the large number of definitions written by male academics. The researchers conclude that the task of defining the digital humanities is unfinished, and that the activity continues unabated (Callaway et al., 2020).