Accessible Web Video and Audio

If content you are sharing on your MacSite does not already have captions, consider the following:

Type of Video: Videos that are shared
Options:
- Upload your content to a platform that allows for auto-generated captions and edit manually where needed.

Type of Video: Videos that you create / own
Options:
- Consider writing a script: this will help with auto-captioning accuracy, improve recording efficiency, and can double as a transcript.
- Consider including in your script audio descriptions of what is visually taking place on the screen. This is called Integrated Description and can reduce the need for described video accommodations.

When distributing .mp3 or other audio files, the most essential alternative sensory modality you can include is:

A transcript!
- (A plain-text version of the speech or audio in an audio recording)
Blind and Deaf learners, as well as learners for whom English is not a first language, consistently report that having access to a video or audio transcript is not only preferred: it offers access to materials in a way that video and audio alone cannot.
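
Where edited captions can be exported as a caption file, a plain-text transcript can be derived from them rather than typed out separately. The following is a minimal sketch of one way to do this, assuming the captions are in WebVTT (.vtt) format with the usual layout (a WEBVTT header, optional cue identifiers, and timing lines containing "-->"). The function name vtt_to_transcript and the sample captions are hypothetical and used for illustration only.

```python
import re

# Matches WebVTT inline tags such as <v Speaker> or <i>, which are dropped.
TAG_PATTERN = re.compile(r"<[^>]+>")


def vtt_to_transcript(vtt_text: str) -> str:
    """Return the caption text of a WebVTT file as plain text (hypothetical helper)."""
    lines = []
    # Blocks (header, notes, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        block_lines = block.splitlines()
        # Only blocks with a timing line are cues; the header and notes are skipped.
        timing_index = next(
            (i for i, line in enumerate(block_lines) if "-->" in line), None
        )
        if timing_index is None:
            continue
        # Everything after the timing line is caption text.
        for line in block_lines[timing_index + 1:]:
            text = TAG_PATTERN.sub("", line).strip()
            if text:
                lines.append(text)
    return "\n".join(lines)


# Illustrative sample captions (not from a real recording).
sample = """WEBVTT

1
00:00:00.000 --> 00:00:04.000
<v Instructor>Welcome to the course.

2
00:00:04.000 --> 00:00:08.000
The slide shows a bar chart
of enrolment by year.
"""

print(vtt_to_transcript(sample))
```

A transcript produced this way is only as accurate as the captions it comes from, so it is best generated after the captions have been manually edited.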

Feedback from the McMaster community indicates that the accuracy of automatic speech recognition (ASR) results (e.g., closed captions and transcripts) varies depending on the following factors:
- Speaking in a “non-native” English accent (even when a person’s first or second language is an English dialect, e.g., Chinese English and Indian English).
- Using complex vocabulary (e.g., in STEM and the Humanities).
- Speaking with a speech impairment.
We want to acknowledge here that these experiences are real and valid.
- ASR currently depends on linguistic / language datasets to “train” the AI technology to recognize variation in both accent and vocabulary.
- Due to histories of intersectional oppression and colonization, datasets containing rich and varied examples of accents, speech impairments, and even specific vocabulary usages are limited in comparison to “native” English-speaking datasets.
- As a result (and this explanation is overly simplified), those embodying the above experiences will face different and additional barriers than those who do not (e.g., needing to depend more heavily on scripted content to upload to a video’s captioning interface).