41 Accessible Video and Audio

Considerations for alternative sensory modalities:

If content you are sharing does not already have captions, consider the following:

Video type and captioning options:

Type of video: Videos that are shared
Options:
  • Replace videos with others that have good captions
  • Ask the owner of the video (e.g. YouTube) to caption their content

Type of video: Videos that you create / own
Options:
  • Upload your content to a platform that allows for auto-generated captions and edit where needed

When creating video content, where possible:

  • Consider writing a script. This will help with auto-captioning accuracy, improve recording efficiency, and can double as a transcript.
  • Consider including in your script audio descriptions of what is visually taking place on the screen. This is called Interpreted Description and can reduce the need for described video accommodations.


Common Video Hosting Platforms at McMaster and Caption Solutions


Circulating Audio-Only Information?

For the distribution of .mp3 / audio files, the most essential alternative sensory modality that one can include is:

  • A transcript!
    • (A plain-text version of the speech or audio in an audio recording)
  • Consistent feedback from Blind and Deaf learners, as well as learners for whom English is not a first language, indicates that access to a video or audio transcript is not only preferred: it offers access to materials in a way that video and audio alone cannot.


EDI Considerations for Relying on Automatic Speech Recognition (ASR)

  • Feedback from the McMaster community indicates that the consistency of ASR results (e.g. closed captions and transcripts) depends on the following factors:
    • Speaking in a “non-native” English accent (even when a person’s first or second language is an English dialect, e.g. Chinese English and Indian English).
    • Using complex vocabulary (e.g. STEM and Humanities).
    • Speaking with a speech impairment.
  • We want to acknowledge here that these experiences are real and valid.
    • ASR currently depends on linguistic / language datasets to “train” the AI technology to recognize variation in both accent and vocabulary.
    • Due to histories of intersectional oppression and colonization, datasets containing rich and varied examples of accents, speech impairments and even specific vocabulary usages are limited in comparison to “native” English-speaking datasets.
    • For these reasons (and this explanation is overly simplified), those embodying the above experiences will face different and additional barriers than those who do not (e.g. needing to depend more heavily on scripted content to upload to a video’s captioning interface).


Helpful Resources for Audio and Video

  1. Use YouTube's auto-sync function to convert a script into a caption file.
  2. Convert caption files from SRT to VTT.
  3. Convert caption files from VTT to SRT.
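
For readers who would rather convert caption files locally, the two conversions listed above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the linked tools: the function names `srt_to_vtt` and `vtt_to_srt` are hypothetical, and the sketch assumes simple caption files (plain cues, no styling blocks that need to be preserved). The main differences it handles are the `WEBVTT` header, the comma versus period before the milliseconds, and the cue numbers that SRT requires.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)
# Matches the start of a VTT timing line; the hours part is optional in VTT.
VTT_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3}) --> ((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT caption text to WebVTT (assumes simple cues only)."""
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # VTT uses a period before the milliseconds; SRT uses a comma.
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # SRT cue numbers are left in place; VTT accepts them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"


def _with_hours(stamp: str) -> str:
    """Pad a VTT 'mm:ss' stamp to 'hh:mm:ss', since SRT always includes hours."""
    return stamp if stamp.count(":") == 2 else "00:" + stamp


def vtt_to_srt(vtt_text: str) -> str:
    """Convert WebVTT caption text to SRT (assumes simple cues only)."""
    blocks = re.split(r"\n{2,}", vtt_text.replace("\r\n", "\n").strip())
    cues = []
    for block in blocks:
        lines = block.split("\n")
        # Blocks without a timing line (the WEBVTT header, NOTE, STYLE) are skipped.
        for i, line in enumerate(lines):
            match = VTT_TIMING.match(line)
            if match:
                start = f"{_with_hours(match.group(1))},{match.group(2)}"
                end = f"{_with_hours(match.group(3))},{match.group(4)}"
                text = "\n".join(lines[i + 1:])
                # SRT cues are numbered sequentially starting at 1.
                cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
                break
    return "\n\n".join(cues) + "\n"
```

To use the sketch, read a caption file as UTF-8 text, pass its contents to the relevant function, and write the result to a new file with the matching extension. Cue settings that can follow a VTT timestamp (for example, positioning) are dropped in the VTT to SRT direction, so review the output before publishing it.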



Accessible Digital Content Training Copyright © by Jessica Blackwood and Kate Brown. All Rights Reserved.
