Analog And Digital
Analog-to-digital conversion is an important topic in the digital humanities, as analog (non-digital or non-electronic) media are often digitized, unlike new media, which are generally “born digital”. Sounds and images require converting naturally analog representations to digital representations. Without digressing too much into the physics of sound, some basic concepts are nonetheless important. Sound is a signal represented physically by “waves”. Mathematically, these waves are functions of time. They are characterized by an amplitude, the height of the wave at a specific instant, and a period, the length of time until the wave pattern repeats. Frequency is often used to characterize sound waves and is simply the reciprocal of the period: frequency = 1 / period, the number of periods per unit of time. To be useful in the digital humanities, these sound signals, or waves, must be digitized. Recall that digitization is converting a continuous, analog signal to a digital form. Because the signal is continuous, however, it cannot be precisely or perfectly represented in digital form. The signal must undergo sampling, in which the wave's value is recorded at fixed, discrete intervals. The sound is then reproduced by approximating the original signal from these samples.
The quality of the sampled signal is determined by the sampling rate, or the number of samples in a unit of time, usually a second. The higher the sampling rate, the more accurately the sound wave can be represented by its samples, and, consequently, the sound will be of higher quality. For compact discs, the standard sampling rate is 44.1 kHz (kilohertz), or 44,100 samples per second.
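The ideas above can be sketched in a few lines of Python. The example below records the values of a pure sine wave (a hypothetical 440 Hz tone, so period = 1/440 s) at the CD sampling rate of 44.1 kHz; the function name and tone are illustrative choices, not part of any standard library.

```python
import math

def sample_sine(frequency_hz, duration_s, sampling_rate_hz):
    """Sampling: record the wave's value at fixed, discrete intervals."""
    n_samples = int(duration_s * sampling_rate_hz)
    # Sample n is taken at time t = n / sampling_rate_hz seconds.
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(n_samples)]

# A 440 Hz tone sampled for one second at the CD rate of 44.1 kHz:
samples = sample_sine(440, 1.0, 44_100)
print(len(samples))  # 44100 samples for one second of audio
```

With the sampling rate (44,100 samples per second) far above the tone's frequency, each period of the wave is captured by roughly a hundred samples, so the digital approximation is very close to the continuous signal.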
Sampling digitizes the sound signal on its time dimension. However, the other dimension, the amplitude of the signal, must also be digitized. Digitization of the amplitude, or of each sample, is known as quantization, and is measured by bit depth, or the number of bits per sample. A higher number of bits indicates that the amplitude can be represented with greater accuracy.
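Quantization can be illustrated with a minimal sketch: mapping an amplitude in the range [-1.0, 1.0] onto one of the 2^bit-depth discrete levels. The function below is a simplified illustration (real audio formats use signed integer conventions), written for this section, not taken from a library.

```python
def quantize(sample, bit_depth):
    """Quantization: map an amplitude in [-1.0, 1.0] to one of
    2**bit_depth integer levels."""
    levels = 2 ** bit_depth
    # Scale the amplitude to [0, levels - 1], round to the nearest
    # level, and clamp to guard against out-of-range input.
    index = round((sample + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, index))

# With a 16-bit depth (the CD standard), amplitudes map to 65,536 levels:
print(quantize(-1.0, 16))  # 0     -- the lowest level
print(quantize(1.0, 16))   # 65535 -- the highest level
```

Doubling the bit depth squares the number of available levels, which is why higher bit depths represent amplitude with much greater accuracy.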
For images, the related concept is image sampling, in which the colour or intensity of the image is recorded at fixed, discrete intervals. This digitization is performed in two dimensions – the rows and columns of the 2D image. Picture elements, more commonly known as pixels, are the individual recorded samples. The samples are generally encoded using the RGB colour model briefly described above.
The most common digital paradigm for images is raster graphics, in which the image is represented and stored as a 2D grid of pixel values.
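A raster image can be sketched as a nested list: rows of pixels, each pixel an (R, G, B) triple with each channel in the range 0–255. The tiny 2×2 image below uses hypothetical pixel values chosen purely for illustration.

```python
# A 2x2 raster image as a 2D grid of RGB pixel values (0-255 per channel).
image = [
    [(255, 0, 0), (0, 255, 0)],      # row 0: a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # row 1: a blue pixel, a white pixel
]

rows, cols = len(image), len(image[0])
r, g, b = image[0][1]          # the sample at row 0, column 1
print(rows, cols, (r, g, b))   # 2 2 (0, 255, 0)
```

Indexing by row and then column mirrors how the image was sampled: first along one spatial dimension, then the other.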
The basics of image representation and processing will be discussed in more detail in subsequent sections.
In the digitized (or born-digital) media employed in present-day digital humanities work, the question of data size becomes relevant. As digital archives, images, sound files, videos, and new media can become quite large, efforts are made to represent these media with a smaller number of numeric values. The process of reducing the overall storage required by media is called data compression, or simply compression: storing data in a reduced-size form to lower space and time requirements. There are two main compression paradigms. In lossless compression, the data can be perfectly restored and reconstructed, as all samples of the media are stored exactly as they were digitized. In contrast, in lossy compression, the data cannot be perfectly reconstructed, as data are removed or transformed into a smaller (and consequently lower-quality) form during compression. However, using computational implementations of advanced mathematical algorithms, lossy compression can yield large space and time savings without noticeably degrading the quality or fidelity of the reconstruction.
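One of the simplest lossless schemes, run-length encoding, makes the paradigm concrete: runs of repeated values are stored as (value, count) pairs, and decoding reverses the process exactly. This is a minimal sketch for illustration, not the algorithm used by any particular file format.

```python
def rle_encode(data):
    """Run-length encoding: store each run of repeated values
    as a (value, count) pair."""
    encoded = []
    for value in data:
        if encoded and encoded[-1][0] == value:
            encoded[-1] = (value, encoded[-1][1] + 1)
        else:
            encoded.append((value, 1))
    return encoded

def rle_decode(encoded):
    """Expand each (value, count) pair back into its run --
    no information is lost."""
    return [value for value, count in encoded for _ in range(count)]

# A row of pixel intensities with long runs compresses well:
pixels = [0, 0, 0, 0, 255, 255, 0, 0, 0]
packed = rle_encode(pixels)
print(packed)                        # [(0, 4), (255, 2), (0, 3)]
assert rle_decode(packed) == pixels  # perfect reconstruction: lossless
```

A lossy scheme would instead discard or approximate some of the original values, trading fidelity for a smaller representation.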
Computers were originally designed to use binary, and continue to use it, because such “bistable” systems are reliable. At some point, below the abstractions enabled by computer programs and software, the underlying hardware of the device must be in one of two states: current can be on or off, but not both simultaneously, and a magnetic field points in one of two directions. Such concepts are crucial to the hardware components that constitute computers. Transistors and solid-state switches change their on/off state when power is supplied on a control line. Transistors are extremely small, and billions of them can reside on a single chip. Such miniaturization is the main reason present-day computers are so powerful: it allows them to perform calculations at extremely high speeds, such as GFlops (gigaflops, on the order of a billion floating-point operations per second) or even TFlops (teraflops, on the order of a trillion floating-point operations per second), and makes it possible for these devices to address, store, retrieve, and process gigabytes of data.