Archives and Databases

Archives have a complex relationship to databases.  A database is an integrated computer structure that stores data.  Databases emerged because of the ubiquity of data.  Data in any undertaking or enterprise are abundant and can be collected or generated from anywhere; in other words, they are ubiquitous.  Data must also be persistent: they must remain continuously available to users.  To this end, databases make data persistent, shareable, and secure.  The value of databases is evident when data are transformed into information, which provides context to the data and is the foundation of insights.  Data simply represent raw, unprocessed facts.  In themselves, they have no semantic content.  They are, however, the raw materials from which information is built (Coronel & Morris, 2018).

Databases are often shared among multiple users.  A database consists of end-user data, or raw facts of interest to the user of the database, and metadata, or “data about data”.  Metadata describe the characteristics of the data and the relationships among them, and so facilitate integration and management of the data in the database.  Databases, which are normally stored as a collection of different types of files, are managed and made accessible through specialized software known as a database management system (DBMS).  A DBMS is a complex system of programs that organizes the data, manages the database structure, and controls access to the data contained in the database.  It is an intermediary between users and the database.  It facilitates sharing, integration, and access of data, manages the data robustly and efficiently, and provides an integrated view of the data.  Its goal is to facilitate decision-making and improve productivity, where the definition of productivity depends on the specific application to which the DBMS is applied (Coronel & Morris, 2018).

DBMSs can also be queried through specialized functions.  That is, they can retrieve data based on user requests and “answer” questions about the data and the relationships among the data, allowing information and, ultimately, knowledge and insights to be obtained.  Most DBMSs provide languages that allow the user to formulate and execute different types of queries.  Usually, database queries programmed in database languages are incorporated into a graphical user interface or web form that hides the complexities of formulating queries from users.  DBMSs and queries will be discussed in detail in a subsequent section.
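To make the distinction between end-user data, metadata, and queries concrete, the following is a minimal sketch that uses SQLite, a small DBMS bundled with Python, as a stand-in. The `letters` table, its columns, and its rows are hypothetical and were invented purely for illustration; they do not come from any real archive.

```python
import sqlite3

# An in-memory database; the table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")

# Defining the table creates metadata: the DBMS records column names and types.
conn.execute(
    """
    CREATE TABLE letters (
        id INTEGER PRIMARY KEY,
        author TEXT NOT NULL,
        recipient TEXT NOT NULL,
        year INTEGER NOT NULL
    )
    """
)

# End-user data: the raw facts of interest to the user of the database.
conn.executemany(
    "INSERT INTO letters (author, recipient, year) VALUES (?, ?, ?)",
    [
        ("A. Smith", "B. Jones", 1891),
        ("A. Smith", "C. Brown", 1893),
        ("B. Jones", "A. Smith", 1892),
    ],
)

# Metadata ("data about data"): ask the DBMS to describe the table itself.
# Each row returned by PRAGMA table_info holds the column name at index 1.
columns = [row[1] for row in conn.execute("PRAGMA table_info(letters)")]

# A query: how many letters did each author write?
letters_per_author = conn.execute(
    "SELECT author, COUNT(*) FROM letters GROUP BY author ORDER BY author"
).fetchall()

print(columns)
print(letters_per_author)
```

The query returns counts, not meaning: it reports how many letters each author wrote, but interpreting what that pattern signifies is left to the user.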

 

The ability to query a database is a powerful tool for discovering new relationships.  However, as already stated, databases and the data in them do not provide semantics, and, as with archives, the gleaned relationships cannot be easily interpreted by database users.  Consequently, there is a concern that reliance on databases in the digital humanities diverts attention from argumentation to what can be queried from the DBMS (Poole, 2017).

 

Because of their utility and effectiveness, databases and DBMSs are considered by some scholars to be the successors to archives.  Databases and archives are both pivotal in the digital humanities.  Some archives incorporate databases to address some of the deficiencies and problems associated with standard archives.  Through interfaces that hide DBMS complexities from users, databases in archives increase and enhance accessibility to scholars and other users.  They also allow documents from multiple archives to be compared and contrasted to gain new insights and to establish new connections (Poole, 2017).

 

Digitization of analog archives, digital archives, web archives, and user-generated born-digital archives raise new questions about sources, and about sources treated as “data” in their digital instantiations.  Data, which are represented in binary code in digital realizations, allow both “macroscopic” and “microscopic” analysis.  The former refers to the investigation of data across a large number of sources, potentially millions.  Computational approaches such as text mining, image analysis, and machine learning techniques are applied at the macroscopic level.  The latter involves more detailed analysis and engagement.  For instance, close reading can be facilitated through digitally enabled concordances and patterns occurring in search terms (Owens & Padilla, 2021).
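The two scales of analysis can be illustrated with a short sketch. It assumes a tiny hypothetical corpus of two invented texts; the function names `tokenize`, `term_counts`, and `concordance` are illustrative choices, not part of any established tool. Counting terms across all sources stands in for the macroscopic view, while a simple keyword-in-context concordance stands in for the microscopic one.

```python
import re
from collections import Counter

# Hypothetical mini-corpus; the source names and texts are invented.
corpus = {
    "letter_1": "The archive holds letters. The archive is open to scholars.",
    "letter_2": "Scholars query the database to find letters in the archive.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(corpus):
    """Macroscopic view: count each term across every source."""
    counts = Counter()
    for text in corpus.values():
        counts.update(tokenize(text))
    return counts


def concordance(corpus, term, width=2):
    """Microscopic view: list each occurrence of a term with its neighbours."""
    hits = []
    for source, text in corpus.items():
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == term:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((source, left, token, right))
    return hits


counts = term_counts(corpus)
hits = concordance(corpus, "letters")

print(counts.most_common(3))
for source, left, token, right in hits:
    print(f"{source}: {left} [{token}] {right}")
```

With a real collection the same counting step would run over thousands or millions of sources, whereas the concordance returns the reader to individual passages for close reading.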


License

Contemporary Digital Humanities Copyright © 2022 by Mark P. Wachowiak is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
