2 The Journey of Physical and Digital Assets

Digital Asset Creation

Digital asset creation is the process of converting physical records into a digital format. This usually involves scanning the physical records to create a digital image and then converting that image into a format a computer can use. It can also include 3D scanning an object to produce a digital 3D model.

The key steps in digital asset creation, as presented in the introduction of this book, are preparation, scanning, conversion, indexing, and quality control.

There are a number of considerations when creating digital assets, including:

  • The type of item being digitised: Different types of items will require different scanning and conversion techniques. For example, photographs will need to be scanned at a higher resolution than text-based documents. Bound documents, such as books, may need to be unbound before they can be scanned using a high-speed scanner. Also consider microfilm, microfiche, historical artifacts, archaeological and paleontological finds, voice recordings, digital documents, emails and more.
  • The level of security required: If the documents contain sensitive information, additional security measures will need to be in place. These may include encrypting the digital assets, storing them in a secure location, and controlling who has access to them.
  • The budget for the project: Digital asset creation can be a costly process, so it is important to have a clear budget in place before starting the project.

Benefits of Digital Asset Creation

  • Enhanced accessibility and collaboration: Multiple users can access digital documents simultaneously, regardless of their physical location.
  • Reduced storage costs: Digital assets can be stored electronically, which can free up valuable physical space and save on storage costs.
  • Improved document search capabilities: Digital assets can be easily searched and retrieved using keywords and metadata.
  • Enhanced document security: Digital assets can be protected using encryption and password protection.

There are a number of different software programs and services available to help with digital asset creation. Choosing the right software and service provider is essential for the success of your project.

In the rest of the chapter we will examine the complete journey of an asset to be digitized.


Intake and Initial Assessment – Receiving

This stage covers receiving and evaluating physical assets for digitization. The flow below covers multiple steps, but not every project will follow it exactly; workflows differ depending on the assets and their preservation needs.

The journey of the digital asset begins when it is handed over to the digitization facility through an Input Channel. In the business world, organizations utilize various input channels to transfer their assets for digitization. These channels include traditional physical methods like mail and courier delivery services, as well as digital methods such as email attachments, fax transmissions, and web form submissions. Media conversion projects, such as digitizing microfilm or microfiche, typically follow these same business channels, with materials being delivered through secure courier services or direct delivery to the facility.

When clients send sensitive assets or assets that require special care, digitization companies are often required to confirm when each item was received and who received it. This maintains a chain of custody, which we will discuss in further detail in a later topic. The confirmation may be sent through electronic systems, mail notifications or email.
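
The receipt confirmation described above can be pictured as an entry in an append-only log. The following is a minimal Python sketch rather than a production system; the `CustodyEvent` fields, the `record_receipt` helper and the sample values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEvent:
    """One entry in a chain-of-custody log (hypothetical field names)."""
    asset_id: str
    action: str          # e.g. "received"
    handled_by: str      # operator who took possession
    channel: str         # input channel, e.g. "courier"
    timestamp: datetime


def record_receipt(log, asset_id, handled_by, channel, when=None):
    """Append a 'received' event and return the notice text for the client."""
    when = when or datetime.now(timezone.utc)
    event = CustodyEvent(asset_id, "received", handled_by, channel, when)
    log.append(event)
    return (f"Asset {event.asset_id} received by {event.handled_by} "
            f"via {event.channel} at {event.timestamp.isoformat()}")


custody_log = []
notice = record_receipt(
    custody_log, "BOX-0001", "J. Smith", "courier",
    when=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(notice)
```

Because each event is immutable (`frozen=True`) and events are only ever appended, the log keeps a record of who handled the item and when, which is the essence of a chain of custody.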

The receiving operator is also responsible for opening the packaging on any shipped items and checking for hazards.


Preparation for Digitization

During this step, assets are prepared for digitization. What this involves varies widely with the type of asset and the sector of the digitization industry.

Prep for Artifact Assets

These items may be fragile and require special care. The preparations may involve:

  • Laying out paintings and photos
  • Supporting 3D items
  • Cleaning assets to ensure accurate digitization

Prep for Paper Assets

Prep for paper items usually involves ensuring all data is accurately captured and nothing is covered. It is also important to remove anything that could lead to issues within a scanner or cause a double feed (when a scanner scans two documents at once) to occur.

  • Removing staples
  • Repairing documents
  • Moving sticky notes to blank sheets
  • Unfolding dog ears
  • Removing book bindings or scanning all book pages individually
  • Unfolding and laying out large format images for scanning

Prep for Digital Assets and Removable Media

It may seem that digital assets need little preparation. However, there are still risks that need to be addressed before any file is loaded into a system. Some common considerations:

  • Virus scans, run on computers that are not connected to the network
  • Conversion of filetypes

Prep for Sensitive Materials

Digitizing sensitive materials goes far beyond simple scanning – it’s about safeguarding treasures, secrets, and history. Whether handling a centuries-old manuscript or private records, proper preparation can mean the difference between preservation and disaster. Preparing sensitive items for digitization resembles planning a high-security museum move. You need to protect the physical items themselves – fragile documents may require repair work, from mending torn pages to stabilizing delicate bindings. Like handling precious artifacts, specialized equipment and protective gear become essential tools in the digitization process.

Security is paramount when handling confidential information. Each sensitive document must be treated as a classified file, requiring secure environments, careful tracking, and robust encryption once digitized. This extends beyond good practice into legal requirements, particularly when handling personal data protected by regulations.

Historical and cultural items demand special attention. Digitizing rare books or artwork means creating digital copies that could outlast the originals. Success requires collaboration with conservators and advanced imaging techniques that capture every detail while protecting the original piece.

Success in this process hinges on meticulous organization, from creating comprehensive inventories to addressing seemingly minor but vital details like removing paperclips and staples. Through this careful attention to detail, what might begin as a simple scanning task evolves into a professional digitization project that both preserves critical information and ensures regulatory compliance for years to come.


The Scanning Process

Digitization is the process of converting analog or physical assets into digital, computer-readable files for preservation and ease of access. It is carried out using a variety of equipment, including scanners and cameras. In this section we will review some of the most common equipment.

Scanners

Scanners are devices that use light and a sensor to capture images of flat assets such as paper documents. They can be split into three categories:

  • Desktop scanners: small enough to fit on a desk. These are the scanners you are most likely to have at home.
  • Department scanners: the scanners often seen in offices and libraries.
  • Production scanners: large-scale industrial scanners that can handle large format media and thousands of pages per hour.

Joggers are devices which shake pages to ensure they are flush with each other and aligned properly. This way when they are put through the scanner there is less chance of a double feed or issue occurring.

Image: 400 Paper Jogger. Martin Yale Industries. [Photo] https://martinyale.com/paper-handling-products/400-paper-jogger/

Envelope Openers are industrial machines used to quickly open envelopes.

Image: IM-16C Mail Opener. Quadient. [Photo]  https://mail.quadient.com/en/mail-openers/im-16

Depending on the project, other digitization equipment may be used, such as 3D scanners and microphones. We will explore specialized scanning in another part of the course.


Conversion

Conversion is a critical step in the digitization process that transforms physical assets into dynamic, usable digital assets. The aim of conversion is to make information accessible and usable. In the context of digital asset creation, this means converting analogue or legacy digital formats into modern digital formats that can be used by computers and accessed by users.

The specific steps involved in the conversion process will vary depending on the type of file being converted.

Document Conversion

Document conversion usually refers to converting physical documents into digital formats. This involves scanning the documents to create digital images and then using optical character recognition (OCR) software to make the text in the images machine-readable. The images are then usually saved in a widely compatible format, such as PDF.

For example, a company might scan their paper-based customer records to create digital images, use OCR to make the text searchable, and then save the images as PDFs. This would allow them to store the records electronically, reduce storage costs, and make the information in the records easier to search and retrieve.

Microfilm Conversion

Microfilm is a type of analogue storage that was commonly used in the 20th century to store large volumes of documents. Microfilm conversion involves scanning the microfilm reels to create digital images of the documents. These images can then be converted to standard digital formats such as PDF or TIFF and indexed.

Audio and Video Conversion

Audio and video conversion involves converting analogue recordings or legacy digital audio and video files into modern digital formats. This can be necessary to ensure that the content can be played on modern devices, or to improve the quality of the files.

For example, a library might convert their collection of vinyl records to digital audio files, such as MP3s, to make them accessible to a wider audience. The UK Parliament has a dedicated team that digitises their audio-visual collection to ensure ongoing accessibility.

Email Conversion

Email conversion can involve a number of different processes, such as:

  • Migrating emails from one email server to another.
  • Converting emails to a different format.
  • Archiving emails.

Email conversion can be necessary for a number of reasons, such as to improve email management, to ensure compliance with regulations, or to preserve emails for historical purposes.

Considerations When Converting Digital Assets

There are a number of considerations when converting digital assets, including:

  • The quality of the original source material. If the source material is of poor quality, the converted digital asset will also be of poor quality. For example, if a document was poorly photographed when the microfilm was created, the resulting digital image will also be of poor quality.
  • The desired file format. The file format chosen will depend on how the digital asset will be used. For example, a PDF might be suitable for a document that will be shared online, while a TIFF might be suitable for a photograph that will be printed.
  • The level of security required. If the digital asset contains sensitive information, it will need to be protected with appropriate security measures. This could include encrypting the file or password-protecting it.
  • The budget for the project. Conversion can be a costly process, especially if it involves a large volume of data or complex file formats. It is therefore important to have a clear budget in place before starting the project.

Quality Control

Quality control is essential at every stage of the conversion process to ensure that the resulting digital assets are accurate, complete, and usable. This may involve checking the quality of the scanned images, the accuracy of the OCR, and the completeness of the indexing. For example, a company that is digitizing its medical records will need to make sure that the scanned images are clear and legible, that the OCR has accurately captured all of the text, and that the records are correctly indexed so they can be easily retrieved.
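
The three checks named above (image quality, OCR accuracy and index completeness) can be sketched as a small rule set. This is an illustrative Python sketch only: the thresholds, the required field names and the record layout are assumptions, not values taken from any standard or from this book.

```python
REQUIRED_INDEX_FIELDS = ("patient_id", "document_date")  # hypothetical fields
MIN_DPI = 300               # assumed minimum scan resolution
MIN_OCR_CONFIDENCE = 0.90   # assumed minimum OCR confidence (0.0 to 1.0)


def quality_issues(record):
    """Return a list of human-readable QC problems for one scanned record."""
    issues = []
    if record.get("dpi", 0) < MIN_DPI:
        issues.append("image resolution below minimum")
    if record.get("ocr_confidence", 0.0) < MIN_OCR_CONFIDENCE:
        issues.append("OCR confidence too low; needs human review")
    index = record.get("index", {})
    missing = [f for f in REQUIRED_INDEX_FIELDS if not index.get(f)]
    if missing:
        issues.append("missing index fields: " + ", ".join(missing))
    return issues


good = {"dpi": 300, "ocr_confidence": 0.97,
        "index": {"patient_id": "P-100", "document_date": "2023-05-01"}}
bad = {"dpi": 150, "ocr_confidence": 0.62, "index": {"patient_id": "P-101"}}

print(quality_issues(good))  # an empty list means the record passes
print(quality_issues(bad))
```

Returning a list of problems, rather than a single pass/fail flag, lets a reviewer see exactly which stage of the process needs to be repeated.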

You will learn more about this stage in the next chapter. In the meantime, watch this video to learn about what it means to consider the quality of video records.

 

Reference: Kingston Technology. HD, 4K, 8K? TV and Camera Video Resolutions Explained – DIY in 5 Ep 70 [Video]. YouTube. https://youtu.be/aIygY9Kv3bA?si=KxKoxayIrulw8iQu (3:53 min)


Metadata, Indexing and Classification: The Backbone of Digital Organization

Effective digital asset management requires sophisticated systems and methodologies to organise, retrieve, and utilise digital content. The creation of digital assets, through processes like scanning or digital photography, is only the first step in a comprehensive digital transformation strategy. To fully harness the power of these assets, organisations need to consider how these assets will be organised, stored, and retrieved. This is where metadata frameworks, classification methodologies, and indexing systems become essential, especially within the context of a robust Digital Asset Management (DAM) system.

Metadata’s Role in Digital Asset Creation

Metadata, in the context of digital asset creation, serves as the initial layer of information that provides context and meaning to the newly created asset. Even as the digital asset is being generated, metadata can be embedded, either automatically or through manual input, to ensure its proper identification and future searchability.

Metadata should be useful to the client, so capture the fields that make their assets easy to search for and work with. Examples mentioned in this chapter include the scanning date, the operator name, and names or email addresses that appear in a document.

Assigning metadata during the creation phase streamlines the process of integrating the asset into the DAM system. This front-loaded approach ensures that the asset is readily identifiable and can be easily categorised and indexed for future retrieval.
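
To make the idea of front-loaded metadata concrete, here is a minimal Python sketch of a metadata record that is created at scan time and enriched later. The `AssetMetadata` class, its field names and the `matches` helper are hypothetical; a real DAM system will define its own schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class AssetMetadata:
    # Captured automatically when the asset is created
    asset_id: str
    scan_date: date
    operator: str
    file_format: str
    # Added later by an indexer or an extraction tool
    descriptive: dict = field(default_factory=dict)


def matches(meta, **criteria):
    """True when every criterion equals the corresponding metadata value."""
    flat = {**asdict(meta), **meta.descriptive}
    return all(flat.get(key) == value for key, value in criteria.items())


record = AssetMetadata("DOC-0042", date(2024, 3, 1), "operator01", "PDF")
record.descriptive["last_name"] = "Nguyen"
record.descriptive["first_name"] = "An"

print(matches(record, last_name="Nguyen", file_format="PDF"))
print(matches(record, last_name="Smith"))
```

The same record answers both technical questions (which format, who scanned it) and descriptive ones (whose document it is), which is what makes the asset searchable once it reaches the DAM system.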

How does metadata get assigned to assets?

Metadata can be assigned to assets automatically or manually. For example, documents may be analysed by AI to identify key information such as first and last names or email addresses. Alternatively, metadata may need to be added by hand during a process called indexing. As technology advances, more and more indexing can be done by machines and confirmed by humans. We look at indexing in more detail later in this chapter.

Tagging and Its Role

TIP: Tags on social media are a form of metadata, allowing us to search for posts containing those tags.

Tagging involves attaching descriptive labels or keywords to content. This process is crucial for organizing and finding data in unstructured environments, such as social media or informal digital archives. However, without a standardized approach, tagging can become inconsistent, leading to inefficiencies. In a professional environment, tags are often predefined to maintain order.
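
A predefined tag list can be enforced with a few lines of code. The sketch below is an assumption-laden illustration: the `ALLOWED_TAGS` vocabulary and the `normalise_tags` helper are invented for this example.

```python
ALLOWED_TAGS = {"invoice", "contract", "photograph", "correspondence"}  # hypothetical


def normalise_tags(raw_tags):
    """Split free-form tags into accepted (predefined) and rejected ones."""
    accepted, rejected = [], []
    for tag in raw_tags:
        cleaned = tag.strip().lower().lstrip("#")
        if cleaned in ALLOWED_TAGS:
            if cleaned not in accepted:   # avoid duplicate tags
                accepted.append(cleaned)
        else:
            rejected.append(tag)          # keep the original for review
    return accepted, rejected


accepted, rejected = normalise_tags(["Invoice", " #invoice", "Invoices", "contract"])
print(accepted)
print(rejected)
```

Here "Invoice" and " #invoice" collapse into one standard tag, while "Invoices" is rejected because it is not in the predefined list. This is the kind of inconsistency that free-form tagging tends to produce.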

Classification Methodologies

Classification methodologies provide systematic approaches to organizing digital assets based on defined criteria and relationships. Hierarchical classification creates structured relationships between broader and narrower concepts, while faceted classification enables multi-dimensional organization through independent characteristic sets. Taxonomic classification employs standardized terminology and controlled vocabularies to ensure consistent asset categorization across the organization.
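
The difference between hierarchical and faceted classification is easier to see with a small example. In this Python sketch each asset carries both a hierarchical path and a set of independent facets; the sample assets, category names and helper functions are all hypothetical.

```python
assets = [
    {"id": "A1", "path": ("Records", "Finance", "Invoices"),
     "facets": {"year": 2021, "format": "PDF", "department": "Finance"}},
    {"id": "A2", "path": ("Records", "Finance", "Receipts"),
     "facets": {"year": 2022, "format": "TIFF", "department": "Finance"}},
    {"id": "A3", "path": ("Records", "HR", "Contracts"),
     "facets": {"year": 2022, "format": "PDF", "department": "HR"}},
]


def under(assets, *prefix):
    """Hierarchical lookup: everything beneath a broader category."""
    return [a["id"] for a in assets if a["path"][:len(prefix)] == prefix]


def with_facets(assets, **wanted):
    """Faceted lookup: combine any independent characteristics."""
    return [a["id"] for a in assets
            if all(a["facets"].get(k) == v for k, v in wanted.items())]


print(under(assets, "Records", "Finance"))
print(with_facets(assets, year=2022, format="PDF"))
```

The hierarchical query can only follow the tree from broad to narrow, whereas the faceted query can combine year and format even though those two characteristics sit in no particular order.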

The Importance of Classification Schemes

Classification schemes play a critical role in the digitization industry for several reasons:

  • Enhanced Searchability: Organized data helps users find information faster, reducing time spent searching for specific items.
  • Consistency: Standardized classification reduces confusion, ensuring everyone follows the same rules when categorizing content.
  • Scalability: As organizations grow, having a robust classification scheme allows for the seamless integration of new content.
  • Compliance: For industries with strict regulations, such as healthcare or finance, proper classification ensures sensitive information is handled correctly.

Indexing: Bridging Metadata and Classification

Indexing is the process of applying the chosen classification scheme to the digital assets, using the metadata to assign each item to the appropriate category. Think of it as the process of creating the library’s card catalogue – using the information about each book (metadata) to determine where it belongs on the shelves (classification scheme). Indexing enables efficient retrieval of information because users can search for specific criteria (e.g., author, date range, keywords) and the system can quickly locate the relevant assets.
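
The card catalogue analogy maps onto a simple data structure, often called an inverted index, that records which assets carry each metadata value. The following Python sketch assumes a tiny in-memory catalogue with made-up document IDs and fields; real systems typically use a database or search engine for this.

```python
from collections import defaultdict

catalogue = {
    "DOC-1": {"author": "Lee", "year": 1998, "keywords": ["survey", "harbour"]},
    "DOC-2": {"author": "Lee", "year": 2004, "keywords": ["harbour"]},
    "DOC-3": {"author": "Okafor", "year": 2004, "keywords": ["census"]},
}


def build_index(catalogue):
    """Map each (field, value) pair to the set of asset IDs that carry it."""
    index = defaultdict(set)
    for asset_id, meta in catalogue.items():
        for field_name, value in meta.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                index[(field_name, v)].add(asset_id)
    return index


def search(index, **criteria):
    """Return the IDs that satisfy every criterion, sorted for stable output."""
    sets = [index.get((k, v), set()) for k, v in criteria.items()]
    return sorted(set.intersection(*sets)) if sets else []


index = build_index(catalogue)
print(search(index, author="Lee", keywords="harbour"))
print(search(index, year=2004))
```

Building the index once means a search never has to open every asset; it only looks up the requested values and intersects the resulting sets.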

Several methods can be used for indexing:

  • Manual Indexing: Involves human input of metadata and category assignments, which can be time-consuming and prone to errors.
  • Automated Indexing: Uses optical character recognition (OCR) or other software tools to extract metadata and assign categories automatically. This can be faster and more efficient but may require human review to ensure accuracy.
  • Workflow-Based Indexing: Integrates metadata assignment into the existing workflows of an organisation. For example, a document scanning process might automatically add metadata like scanning date and operator name.

Effective indexing requires a well-defined classification scheme and accurate metadata. The more consistent and descriptive the metadata, the more effective the indexing will be, and the easier it will be for users to find the information they need. In this sense, indexing systems form the operational bridge between metadata frameworks and classification schemes: information is organized using predefined fields or attributes, which enables complex searches and efficient retrieval, particularly across large datasets.

Properly indexed content has transformed how many sectors handle information: universities use it to organize research materials, businesses rely on it for document management, and hospitals depend on it for organizing patient records and medical images.

As our digital content continues to grow rapidly, effective indexing becomes increasingly important for keeping information accessible and manageable across all fields. Just as a well-organized library makes finding books easier, digital indexing helps us navigate the vast amount of digital information created through digitization projects.

Indexing/Data Entry

Indexing also refers to the job of extracting information from documents or assets and entering it into metadata fields. This job is usually completed or reviewed by human workers. For example, an indexer may read a first name on a document and type it into the “first name” field in the indexing software. Have you ever renamed a folder on your computer so you could find it more easily? If so, you have performed a task very similar to indexing.

Values that cannot be accurately extracted by computers, or that require human confirmation, are usually processed by indexers. These can include names, numbers, email addresses, descriptions of media and more.

Automatic/Computer-Assigned

Automatically assigned values are captured as a document is processed or moved from one stage of processing to another. Computers excel at capturing accurate dates, times and usernames, and at calculating distribution dates for physical items. Values captured this way can include the receiving date and time, who received the item, when the item was provided to the client, the destruction date and more.
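
As a rough illustration of computer-assigned values, the Python sketch below stamps a newly received item with its receiving time and operator and derives a destruction date from a retention period. The 90-day period, the `stamp_receipt` helper and the field names are assumptions made for this example; actual retention periods come from the client's requirements.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # hypothetical retention period for the physical item


def stamp_receipt(asset_id, username, received_at=None):
    """Build the automatically captured values for a newly received item."""
    received_at = received_at or datetime.now(timezone.utc)
    destruction = received_at.date() + timedelta(days=RETENTION_DAYS)
    return {
        "asset_id": asset_id,
        "received_at": received_at,
        "received_by": username,
        "destruction_date": destruction,
    }


auto = stamp_receipt("BOX-0007", "operator01",
                     received_at=datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc))
print(auto["destruction_date"])
```

None of these values needs a human to type anything, which is why they are usually left to the system rather than to an indexer.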

OCR and Full Text

Optical Character Recognition (OCR) is a process through which the text in a document image is converted to machine-readable text. The process is carried out by computer software, traditionally using pattern matching: the software holds a stored set of letter shapes and compares them, pixel by pixel, with the letters on the document, then proposes the most likely letter. Alternatively, OCR can work by feature extraction, identifying key features of each letter (such as lines, loops and intersections) to recognise it.
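
The pattern matching approach can be demonstrated with a toy example. The Python sketch below compares a scanned glyph with three stored letter shapes, pixel by pixel, and proposes the closest one. The 3x5 bitmaps and function names are invented for illustration; real OCR engines work on much larger images and are far more sophisticated.

```python
TEMPLATES = {  # tiny 3x5 bitmaps, '#' = ink; hypothetical stored letter shapes
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def score(glyph, template):
    """Fraction of pixels that agree between a scanned glyph and a template."""
    pairs = [(g, t) for grow, trow in zip(glyph, template)
             for g, t in zip(grow, trow)]
    return sum(g == t for g, t in pairs) / len(pairs)


def recognise(glyph):
    """Propose the most likely letter and its match score."""
    best = max(TEMPLATES, key=lambda letter: score(glyph, TEMPLATES[letter]))
    return best, score(glyph, TEMPLATES[best])


# A noisy scan of "T": one stray pixel at the start of the second row.
scanned = ["###", "##.", ".#.", ".#.", ".#."]
letter, confidence = recognise(scanned)
print(letter, round(confidence, 2))
```

The match score doubles as a confidence value: a low score is a signal that the proposed letter should be checked by a human indexer.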

TIP: If you are able to search or highlight the text on a PDF, it is very likely the PDF was put through OCR.

After Processing

After processing, once the client has access to their digital assets, metadata can still be applied. Users can edit metadata to correct it or to update documents as changes occur, and the storage system may record actions taken on documents or deletion information. Just because a document has finished being processed does not mean metadata is no longer being applied!

AI

As technology advances, the world of digitization advances with it. AI, though a controversial topic, has been very useful in digitization. It can assist indexers by identifying document text more accurately. It can also be used to flag possible threats to clients by examining many documents for threatening language in a matter of minutes, which can help to save lives.

It is always important with any new tool to evaluate it thoroughly and consider its applications.

Real-World Indexing Examples

  • Walmart: Uses AI-driven indexing to track customer behavior and personalize recommendations.
  • Google Scholar: Uses conceptual indexing to connect users with relevant research papers based on the context of their searches.

Summary – Classification schemes, Metadata, Tagging, and Indexing

Understanding classification schemes, metadata, tagging, and indexing is foundational for anyone starting in the digitization industry. Together, these elements provide essential context that turns stored data into information that’s easy to navigate, understand, and use effectively. By mastering them, you are setting yourself up to manage and navigate the vast digital landscapes you’ll encounter in your career.


Upload Stage – Digital Asset Management Architecture

Once the asset has been digitized and metadata has been applied to it, it can be hosted online in long-term storage. But where are these assets stored? They are stored in Digital Asset Management systems (DAMs): large repositories of digital assets with features that let users find assets and perform actions on them. These repositories can host large amounts of data and offer strong security as well as powerful search and collaboration features.

DAM architecture incorporates storage infrastructure, processing capabilities for asset transformation and workflow automation, comprehensive security frameworks, and integration layers for connecting with enterprise systems. This foundation supports specialized implementations including Document Management Systems (DMS), Case Management Systems, and Document Portals.

Specialized DAM Implementations

Document Management Systems focus on document lifecycle management, implementing advanced classification, retention management, and workflow automation. These systems excel in environments requiring sophisticated document control and compliance monitoring.

Case Management Systems extend beyond basic document handling to support complex business processes. They incorporate process modeling, decision support, analytics, and knowledge base management. These systems are particularly valuable in legal, healthcare, and government sectors where case-based work requires comprehensive document organization and process management.

Document Portals emphasize secure external access and collaboration. They implement zero-trust security architectures, context-aware access control, and digital signature integration. Modern portal implementations support real-time collaboration while maintaining strict security protocols and compliance requirements.

Implementation Considerations

Successful DAM implementation requires careful evaluation of technical architecture, integration requirements, and governance frameworks. Organizations must consider scalability needs, performance optimization, and high availability designs while developing comprehensive integration strategies for authentication systems and enterprise applications. Governance frameworks establish policies for metadata standards, classification schemes, and retention schedules, ensuring consistent asset management practices and regulatory compliance.


Chapter Conclusion

The digitization of assets represents a transformative bridge between the physical and digital worlds, fundamentally changing how society interacts with and preserves information. Through this chapter, we’ve explored the comprehensive process that enables this transformation—from the initial receipt and preparation of materials, through the technical aspects of scanning and conversion, to the crucial steps of metadata assignment and quality control. Each stage builds upon the previous one, demanding precision and attention to detail to ensure the integrity of digitized assets.

Understanding this process extends far beyond the technical act of converting physical items into digital formats. We’ve examined practical challenges like managing scan quality, handling fragile materials, and ensuring metadata accuracy, alongside the advanced tools that make this work possible—from Optical Character Recognition to Digital Asset Management Systems. These components work together to create a robust framework for preserving and managing digital assets in the modern era.

The implications of digitization reach across all sectors of society. Whether enabling access to cultural treasures, optimizing business processes, or creating searchable databases for researchers, this work shapes how humanity interacts with information. For students and professionals entering fields related to information management, digital preservation, or archival work, these concepts represent essential knowledge that will only grow in importance as our digital dependence expands.

As we look ahead, this foundational knowledge becomes a springboard for advancing the field of digitization. The technical expertise and strategic understanding gained here transcend mere procedural requirements—they represent the tools that will shape how future generations access, preserve, and interact with information. In an era of rapid technological evolution, the digitization field stands at the intersection of innovation and preservation, offering endless possibilities to bridge our cultural heritage with emerging digital frontiers. Those who master these principles are positioned not just to participate in this transformation, but to lead it, ensuring that our shared knowledge remains accessible and relevant for generations to come.

Key Takeaways

  • Digital asset creation follows a systematic five-step process: preparation, scanning, conversion, indexing, and quality control.
  • Each type of material (paper, artifacts, digital media) requires specific preparation procedures and handling protocols to ensure successful digitization.
  • Metadata serves three critical purposes: describing the asset’s content, defining its structure, and managing access rights.
  • Quality control must be integrated at every stage of digitization rather than treated as a final step.
  • Common scanning issues (scan lines, double feeds, poor quality) can be prevented through proper preparation and resolved through standardized troubleshooting procedures.
  • Digital Asset Management (DAM) systems require careful consideration of security, accessibility, and long-term preservation needs.
  • The conversion process must match the original source material’s format while considering the end user’s needs and technical requirements.

License

Foundations in Digitization Copyright © by marklamontagne is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.