Archives, Museums, and Other GLAM Institutions
Archives enable access to documents, books, articles, manuscripts, letters and correspondence, artwork, and other assets that have inestimable value for humanities research. They are clearly a major concern of galleries, libraries, archives, and museums – known as GLAM institutions. Because they preserve records of people, places, things, and events, archives are particularly salient for historical scholarship.
As museums, libraries, and archives are considered the retainers of some form of collective memory, they are often called memory institutions to emphasize the linkages between these organizations, increasingly enabled by digital media and network technology (Stainforth, 2016). Consequently, there is a symbiotic relationship between cultural heritage, conceptions of memory, and digital technology. Large-scale digitization projects, such as Europeana Common Culture, provide a sustainable aggregation framework and environment for cultural data, consisting of a virtual “sandbox” for cultural data processing, a data aggregator, and a 3D viewer that facilitates incorporating 3D resources into the project from URLs. In this way, museums, libraries, and archives become memory institutions that preserve cultural heritage by leveraging the capabilities of digital technology (Stainforth, 2016).
There is currently a disconnect between digital humanities research and technological advances in born digital preservation, including archival science research. Driven primarily by galleries, libraries, archives, and museums (GLAM institutions), the science of archiving digital materials and digital archives is highly developed. However, legal, ethical, and copyright considerations, among other factors, make digital humanists reluctant to fully incorporate digital archives into their work (Ries & Palkó, 2019).
The gap between the two communities can be closed, or at least reduced, with further improvements from the archiving community. For instance, born digital archives need interoperable standards and standardized practices and workflows. Such standards are being developed in the archiving community. Digital archives also require advanced application programming interfaces (APIs), which must be accessible to humanities scholars. GLAM institutions must also continue to develop their born digital collections to address the concerns of various user communities, improve preservation methods and curation workflows and practices, and improve access for researchers and other stakeholders (Ries & Palkó, 2019).
Of special relevance to the digital humanities are collections of digitized materials and the concepts of digital archives and web archives. Some archives in digital format consist of digitized copies of entire archival collections. The Clara Barton Papers, physically residing at the U.S. Library of Congress in Washington, D.C., are available in their entirety online. Both the contents of the collection and the folders in which they are located were digitized. The online, digitized collection is presented through an interface that emulates the boxes and folders as they exist in the physical collection, providing access to the collection in the manner in which the archivists arranged and designed it. The interface contextualizes the digitized collection, and therefore this archive functions as a “digital surrogate”, or computer-enabled stand-in, for the physical collection (Owens & Padilla, 2021).
However, the concept of a digital archive has a slightly different inflection. According to Trevor Owens, Head of Digital Content Management at the Library of Congress, digital archives “…refer to born digital materials processed as part of a more traditional notion of an archive”, and “…digital archives hang together as ‘a conscious weaving together of different representational media.’” Recall that born digital refers to materials that were originally conceived and created digitally, in contrast to analog – non-digital – materials that were subsequently digitized. Examples of born digital materials include email, documents created with word processing or text editing software, websites, databases, images from mobile devices, and social media data. Computer code can also be considered a born digital source (Owens & Padilla, 2021). Additionally, the born digital historical record has long been an integral part of the digital humanities. Digital humanists Matthew Kirschenbaum and Jane Winters urge humanities researchers in general to accept and to embrace this record itself as a primary source, as it has been in media history, historical bibliography, and textual scholarship (Ries & Palkó, 2019).
Because it contains born digital material, a digital archive is considered to be a subset, or a component, of a physical archive. Standards and practices for acquiring, aggregating, processing, and preserving born digital archival material are still being developed, and consequently the quality and consistency of archive accessibility and descriptions are uneven across digital archives (Owens & Padilla, 2021).
Hypertext and hypermedia – which extend hypertext to allow the linking to and provision of multimedia, such as audio, video, or graphics – are distinguishing characteristics of digital archives.
Web archives comprise a particular type of born digital archive, but their sources are collected and organized differently than digitized copies of archival collections. Web archives are often created using data obtained with web crawlers, specialized programs that browse the web in a specified, systematic manner to get all available rendered content of a webpage, its associated files, and possibly the other pages that link to it (Owens & Padilla, 2021). Web archiving is a process in which parts of the World Wide Web are collected to preserve those components in archives for future use. A typical example is the Wayback Machine, a service of the Internet Archive, a not-for-profit digital library whose aim is to archive the entire Web, including archived copies of currently defunct web pages. Web archives are generally publicly available. They are sometimes referred to as internet archives. Web archiving is a complex task, and specialized file formats have been developed for this purpose. One example is the Web ARChive (WARC) file format, which facilitates combining and aggregating digital materials, possibly in different formats, with related information and metadata.
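The systematic browsing performed by a web crawler can be illustrated with a minimal sketch in Python. It is not the code of any particular archiving tool. It assumes a breadth-first strategy, and the names `LinkExtractor`, `crawl`, `fetch`, and `SITE` are hypothetical, introduced only for illustration. To keep the example self-contained, the "web" is a small in-memory dictionary rather than pages retrieved over HTTP.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from `seed`; returns {url: html} for fetched pages.

    `fetch` is a caller-supplied (hypothetical) function that returns the
    HTML for a URL, or None if the page cannot be retrieved.
    """
    queue = deque([seed])
    seen = {seed}
    captured = {}
    while queue and len(captured) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # defunct or unreachable page: nothing to capture
        captured[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links against the current page, drop #fragments.
            target, _ = urldefrag(urljoin(url, href))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return captured


# Hypothetical three-page site held in memory instead of fetched over HTTP.
SITE = {
    "http://example.org/": '<a href="/a.html">A</a> <a href="b.html#top">B</a>',
    "http://example.org/a.html": '<a href="/">home</a>',
    "http://example.org/b.html": '<a href="/missing.html">gone</a>',
}

pages = crawl("http://example.org/", SITE.get)
print(sorted(pages))
```

A production crawler would additionally restrict itself to a defined scope, respect robots.txt and rate limits, and capture associated files such as images and stylesheets. The sketch only shows the core loop of fetching a page, extracting its links, and queuing pages not yet seen.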
Webrecorder provides open source projects and tools for capturing interactive web sites and generating collections of WARC files. Many of these tools are written in Python. Webrecorder adopts a Web archiving approach that diverges from the more comprehensive web crawling paradigm, and supports micro-analysis and a more fine-grained approach to harvesting Web data.
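To make the idea of a WARC file more concrete, the following Python sketch writes and reads back two simplified records using only the standard library. It is not Webrecorder's code. It assumes the basic record layout of WARC 1.0 (a version line, named header fields, a blank line, the content block, and two closing line breaks) and uses "resource" records for simplicity. The function names `warc_record` and `read_records` are hypothetical. Real captures typically store full HTTP responses and are usually compressed.

```python
import io
import uuid
from datetime import datetime, timezone


def warc_record(target_uri, payload, content_type="text/html"):
    """Serialize one simplified WARC 'resource' record as bytes."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def read_records(stream):
    """Yield (headers, payload) for each record in a well-formed stream."""
    while True:
        version = stream.readline()
        if not version:
            break  # end of file
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b""):
                break  # blank line separates headers from the content block
            name, _, value = line.decode("utf-8").rstrip("\r\n").partition(": ")
            headers[name] = value
        # The block is read by length, so it may contain any bytes.
        payload = stream.read(int(headers["Content-Length"]))
        stream.read(4)  # skip the two CRLFs that close the record
        yield headers, payload


# Two materials in different formats aggregated into one archive stream.
archive = io.BytesIO()
archive.write(warc_record("http://example.org/", b"<html>hello</html>"))
archive.write(
    warc_record("http://example.org/logo.png", b"\x89PNG\r\n\x1a\n", "image/png")
)
archive.seek(0)

records = list(read_records(archive))
for headers, payload in records:
    print(headers["WARC-Target-URI"], headers["Content-Type"], len(payload))
```

Each record carries its own metadata (capture date, source URI, content type) alongside the captured bytes, which is how a single file can hold heterogeneous materials together with the information needed to interpret them. For actual research use, an established WARC library is preferable to hand-written parsing.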
Another category of archives of digital material is collections of user-generated born digital primary sources. This category encapsulates the participatory nature of the Web, where users can upload content, share content, provide commentary, and “crowdsource” material. User-generated born digital archives are recognized by historians, as evidenced in the September 11 Digital Archive, which was “crowdsourced” from born digital materials focusing on a specific topic (Owens & Padilla, 2021).
For the purposes of many scholars in the digital humanities, digital archives are collections or groupings of digitized materials that were not originally in digital form, and that are generally available online. Many of the original materials are print publications, and were originally located in different geographic locations or in different physical repositories. The purpose of the archive is to advance and to support a goal or set of goals, and therefore digital repositories generally have a theme or consist of related subject matter. The Rossetti Archive and the William Blake Archive are two examples of archives that are used primarily for scholarly purposes. Because archives contain heterogeneous materials, such as multiple drafts of manuscripts and multiple editions, they are of interest to digital humanists for interpretive scholarship, and for exploring conflicting viewpoints on a particular question. The importance of archives to humanities work has led some observers to call for closer collaborations between humanists and archivists (Poole, 2017).