Blockchain Technology for Digital Archives
Blockchain is a technology in which a distributed digital record, or ledger, is maintained so that records of transactions are verifiable and securely preserved. Records, known as blocks, are added to the growing ledger using cryptographic techniques. The ledger itself is distributed over a peer-to-peer network, and blocks are added to the blockchain by nodes on this network. The blocks in the ledger may contain any type of data. These records are immutable, or unchangeable, owing to the way blocks are added. Each new block contains within itself a cryptographic code, called a hash, of the block that precedes it. Altering an earlier block would therefore change its hash and break the link to every block that follows. A hash is a mathematical transformation of data by a function designed so that it is exceedingly difficult to recover the original data from its hash. (Note: cryptography is a highly developed area of computer science and applied mathematics. Because of its importance in computer and network security, and because e-commerce and electronic transactions are ubiquitous, it is an active research area. However, cryptography is also quite complex, and is therefore beyond the scope of the current discussion.) The new block also contains a timestamp, or numerical representation of the time and date at which the block was created, and the transaction data, usually represented by a tree data structure. An obvious application area is the management and maintenance of financial records. Blockchain technology is employed in cryptocurrencies, or digital currencies that use their own system of “coins”, to record transactions.
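The linking of blocks described above can be illustrated with a minimal sketch in Python. It assumes SHA-256 as the hash function and a block made of only an index, a timestamp, a text payload, and the hash of the preceding block. The names Block, digest, and append_block are hypothetical and chosen for illustration; real blockchains add consensus, networking, and digital signatures that are omitted here.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    index: int
    timestamp: float   # seconds since the Unix epoch
    data: str          # any payload; here a short text record
    prev_hash: str     # hash of the block that precedes this one

    def digest(self) -> str:
        # Serialize the fields in a fixed order, then hash with SHA-256.
        payload = json.dumps(
            [self.index, self.timestamp, self.data, self.prev_hash]
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Block], data: str) -> Block:
    # The first block has no predecessor, so a placeholder of zeros is
    # used (an assumption of this sketch).
    prev_hash = chain[-1].digest() if chain else "0" * 64
    block = Block(len(chain), time.time(), data, prev_hash)
    chain.append(block)
    return block


chain: List[Block] = []
append_block(chain, "accession record: painting A")
append_block(chain, "accession record: manuscript B")

# The second block stores the hash of the first, which links the two.
linked = chain[1].prev_hash == chain[0].digest()
print(linked)
```

Because each block's own hash covers its prev_hash field, the hash of the latest block depends indirectly on every block before it.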
There are different types of blockchains, classified by which users can access the blockchain and by the features provided. Two important blockchain platforms are Ethereum and Hyperledger. Ethereum is a decentralized, open source blockchain (its source code is publicly available). It is a permissionless blockchain, meaning that any user can participate in it. Each block contains data that identifies the block immediately preceding it. A node that executes the sequence of operations in a transaction changes the state of the Ethereum accounts, which consists of the balances and other information. The state is maintained on each node in a data structure known as a Merkle tree, which contains hash values. Ethereum also has its own programming language, named Solidity. Hyperledger (specifically Hyperledger Fabric, the framework in which Hyperledger applications are built) is, like Ethereum, an open source blockchain. Its goal is to advance cross-industry blockchain technologies for enterprise solutions, and to facilitate collaboration and transparency in blockchain development. Unlike Ethereum, Hyperledger Fabric blockchains are usually permissioned: users must be authenticated and authorized before entering the network (Duca et al., 2020).
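To show how a tree of hash values can summarize many records in a single value, the following sketch computes the root of a plain binary Merkle tree. It is an illustration under stated assumptions, not Ethereum's implementation: Ethereum's state uses a more elaborate variant (a Merkle Patricia trie), and the choices made here (SHA-256, duplicating the last hash when a level has an odd number of entries, and the function name merkle_root) are assumptions of this sketch.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> str:
    """Return the hex root hash of a binary Merkle tree over the leaves."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Bottom level: the hash of each record.
    level = [sha256(leaf) for leaf in leaves]
    # Hash adjacent pairs repeatedly until a single hash remains.
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention for odd levels
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


records = [b"tx-1", b"tx-2", b"tx-3"]
root = merkle_root(records)

# Changing any single record changes the root.
tampered_root = merkle_root([b"tx-1", b"tx-2", b"tx-X"])
print(root != tampered_root)
```

The root acts as a compact fingerprint: two nodes that hold the same records compute the same root, so comparing one short value is enough to detect a difference anywhere in the data.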
Blockchains may be a viable future technology for digital archives. They are immutable: a block cannot be changed without detection, mainly because each block contains the cryptographic hash of the previous block to which it is appended. Blockchains are distributed on a peer-to-peer network and have high availability. One of the main characteristics of blockchains is their security, which gives them a decided advantage for archival use. Personal information is kept anonymous in the blockchain. Blockchains can be used to track cultural artifacts. They make it possible to record credit for, and ownership of, data in the digital archive (Hedges et al., 2019). They also constitute a data repository that is publicly accessible. Their disadvantages include limited scalability and efficiency, as blockchain algorithms can be computationally costly (Duca et al., 2020).
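The immutability property can be made concrete with a small, simplified sketch that checks whether every block still carries the correct hash of its predecessor. The dictionary layout of a block, the placeholder used for the first block, and the function names block_hash, build_chain, and first_broken_link are all hypothetical and serve only as illustration.

```python
import hashlib
from typing import Dict, List, Optional

GENESIS_PREV = "0" * 64  # assumed placeholder for the first block


def block_hash(block: Dict[str, str]) -> str:
    material = (block["prev_hash"] + "|" + block["data"]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def build_chain(records: List[str]) -> List[Dict[str, str]]:
    chain: List[Dict[str, str]] = []
    prev = GENESIS_PREV
    for record in records:
        block = {"prev_hash": prev, "data": record}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: List[Dict[str, str]]) -> Optional[int]:
    """Index of the first block whose prev_hash no longer matches, else None."""
    expected = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["prev_hash"] != expected:
            return i
        expected = block_hash(block)
    return None


chain = build_chain(["deed 1", "deed 2", "deed 3"])
intact = first_broken_link(chain)

# Forge the first record; the block that follows it no longer links up.
chain[0]["data"] = "forged deed"
broken_at = first_broken_link(chain)
print(intact, broken_at)
```

In this sketch a change to the very last block would go unnoticed, because no later block refers to it. In a distributed setting that gap is covered by comparing the latest hash with the copies held by other nodes.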
In addition to finance and cryptocurrency, blockchain technology is increasingly used in many research areas, including the sciences. It also offers potential benefits to the digital humanities. Cultural heritage, and particularly archives containing tangible heritage data, is an area where blockchains may be useful. For instance, a potential application is storing digital archives of artworks. As discussed in a previous section, digital archives, like physical archives, degrade over time, and digital archives in particular are subject to obsolescence when the computer hardware on which they are stored, the storage media themselves, the operating system that runs on the hardware, or the software used to create, maintain, access, or update the archive becomes obsolete. An example of such obsolescence is archived data stored on 3½-inch, 5¼-inch, or even older 8-inch floppy disks. Not only is it difficult to find hardware that reads these disks, but the physical media themselves, like tape storage, are subject to bit rot and other physical decay. Consequently, long-term preservation that ensures the continued accessibility of digital artworks is needed in digital archives (Duca et al., 2020).
Another advantage of blockchain technology is that blockchains are replicated registries, which can be used to retrieve archival information that has been lost due to, for example, natural or man-made disasters. Furthermore, because blockchains feature strong security mechanisms, information in the blockchain cannot be erased or modified. This is beneficial if a physical work of art is stolen, as the related data remains available and could be used for identification purposes and for detecting counterfeits (Duca et al., 2020).
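As a rough illustration of recovery from replicas, the sketch below restores a lost ledger from the copies held by surviving nodes, choosing the copy on which most of them agree. This simple majority vote is an assumption made for illustration and stands in for the consensus protocols that real blockchains use; the node names, the sample records, and the functions fingerprint and recover are likewise hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional


def fingerprint(ledger: List[str]) -> str:
    # Hash each record first so that record boundaries affect the result.
    h = hashlib.sha256()
    for record in ledger:
        h.update(hashlib.sha256(record.encode("utf-8")).digest())
    return h.hexdigest()


def recover(replicas: Dict[str, Optional[List[str]]]) -> List[str]:
    """Return the ledger copy held by the largest group of surviving nodes."""
    surviving = [copy for copy in replicas.values() if copy is not None]
    if not surviving:
        raise RuntimeError("no replica survived")
    counts = Counter(fingerprint(copy) for copy in surviving)
    winner, _ = counts.most_common(1)[0]
    return next(copy for copy in surviving if fingerprint(copy) == winner)


ledger = ["statue: bronze, 42 cm", "canvas: oil, 1890"]
replicas = {
    "node-a": None,              # copy lost in a disaster
    "node-b": list(ledger),
    "node-c": list(ledger),
    "node-d": ["statue: bronze, 42 cm", "canvas: acrylic, 1990"],  # altered
}
restored = recover(replicas)
print(restored == ledger)
```

The same fingerprints could serve the identification use mentioned above: a description offered for a recovered artwork can be hashed and compared with the value preserved in the ledger.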
Research on using blockchain technology for cultural heritage digital archives has proposed specific technical requirements for storing digital archives on blockchains, as well as a potential blockchain architecture for this purpose. The two main blockchain platforms, Ethereum and Hyperledger Fabric, were also compared. Blockchains were found to be a promising technology for addressing many outstanding issues in digital archives, especially archives for preserving cultural heritage (Duca et al., 2020).