If your site has a single embedded product demo, an explainer video, or a founder’s welcome message, you have already taken on three separate accessibility obligations — and most teams only know about one of them. Captions, transcripts, and audio description are not synonyms. WCAG treats them as distinct requirements for distinct audiences, and a video can pass on one while failing the other two. This guide breaks down exactly what each does, where it lives in the standard, and the tooling that makes it practical.

The three jobs, and who they serve

Start with the core distinction, because everything else follows from it. Captions make the audio accessible. Audio description makes the visuals accessible. A transcript makes the whole thing available as text.

  • Captions serve people who are Deaf or hard of hearing. Per W3C’s caption guidance, captions are “a text version of the speech and non-speech audio information needed to understand the content” — that includes who is speaking and meaningful sound effects, not just dialogue.
  • Audio description serves people who are blind or have low vision. It’s narration added during natural pauses to describe important on-screen action, text shown on screen, and scene changes — the visual information someone can’t see.
  • Transcripts serve everyone who prefers or needs text: people who are both Deaf and blind reading via a braille display, people on a slow connection, and anyone who’d rather skim than watch.

A common trap: assuming subtitles count as captions. They don’t. W3C notes that subtitles typically translate dialogue into another language and omit non-speech sound, so a foreign-language subtitle track alone won’t satisfy the captions requirement.

What WCAG 2.1 actually requires

The relevant guideline is WCAG 2.1 Guideline 1.2, Time-based Media. Here’s the map, by conformance level, for the case most sites care about — prerecorded video that has sound (cross-checked against WebAIM’s WCAG 2 checklist):

Success Criterion | What it requires | Level
1.2.2 Captions (Prerecorded) | Synchronized captions for the audio | A
1.2.3 Audio Description or Media Alternative | Either a full text alternative or audio description | A
1.2.5 Audio Description (Prerecorded) | Standalone audio description (no transcript substitute) | AA
1.2.4 Captions (Live) | Real-time captions for live streams | AA

Two things worth pulling out. First, at Level A you get a choice under 1.2.3 — transcript or audio description — but the moment you target Level AA, 1.2.5 makes audio description mandatory regardless. Most organizations aim for AA, so plan for description from the start. Second, audio-only and video-only content have their own rule: 1.2.1 (Level A) requires a transcript for a podcast-style audio file, and a transcript or audio track for a silent video clip.

Because targeting Level AA is the practical baseline for most ADA website compliance efforts, the working checklist for a typical talking-head or demo video is: captions, audio description, and — ideally — a transcript that wraps both together.
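
The rules in this section can be sketched as a small lookup. This is only an illustration of the mapping described here for prerecorded media, not an official WCAG tool. The function name `required_alternatives` and the media labels are hypothetical, and live streams (1.2.4) are left out.

```python
def required_alternatives(media: str, level: str) -> list[str]:
    """Return the Guideline 1.2 items discussed above for prerecorded media.

    media: "audio-only", "video-only", or "video-with-audio" (hypothetical labels)
    level: "A" or "AA"
    """
    if level not in ("A", "AA"):
        raise ValueError(f"unsupported level: {level!r}")

    if media == "audio-only":
        # 1.2.1 (Level A): a podcast-style file needs a transcript.
        return ["1.2.1: transcript"]
    if media == "video-only":
        # 1.2.1 (Level A): a silent clip needs a transcript or an audio track.
        return ["1.2.1: transcript or audio track"]
    if media == "video-with-audio":
        needs = [
            "1.2.2: captions",
            "1.2.3: audio description or full text alternative",
        ]
        if level == "AA":
            # 1.2.5 removes the transcript-only option at Level AA.
            needs.append("1.2.5: audio description")
        return needs

    raise ValueError(f"unknown media type: {media!r}")


if __name__ == "__main__":
    for item in required_alternatives("video-with-audio", "AA"):
        print(item)
```

Running it for a typical demo video at Level AA lists captions, the 1.2.3 choice, and audio description, which matches the working checklist above.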

Captions: the part you can mostly automate (carefully)

Captions are the most achievable of the three, but “achievable” isn’t “automatic.” W3C is blunt that auto-generated captions “do not meet user needs or accessibility requirements unless they are confirmed to be fully accurate” — and offers a memorable example where automatic captioning turned “not preheat” into “know to preheat,” reversing a recipe’s meaning entirely.

The realistic workflow: let a tool generate a first draft, then edit it. Auto-captioning gets you most of the way; a human fixes the homophones, the proper nouns, and the punctuation, and adds speaker labels and sound cues like [door slams]. The standard delivery format on the web is WebVTT (.vtt), with SRT and TTML as alternatives. Many caption editors export all three, and most will also export a plain-text transcript once your captions are clean, which gives you a head start on a second requirement at little extra cost.
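
To make the format concrete, here is a minimal sketch of what an edited caption file looks like when written out as WebVTT, plus the plain-text export mentioned above. It assumes the cues have already been reviewed by a human. The `Cue` class and the helper names are hypothetical and not taken from any particular caption editor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    start: float                    # seconds from the start of the video
    end: float                      # seconds from the start of the video
    text: str                       # speech, or a sound cue such as "[door slams]"
    speaker: Optional[str] = None   # None for non-speech sound cues


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as the hh:mm:ss.ttt timestamp WebVTT expects."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(cues: list) -> str:
    """Serialize cues to a WebVTT document, using voice spans for speaker labels."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{vtt_timestamp(cue.start)} --> {vtt_timestamp(cue.end)}")
        if cue.speaker:
            lines.append(f"<v {cue.speaker}>{cue.text}</v>")
        else:
            lines.append(cue.text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


def to_transcript(cues: list) -> str:
    """Flatten the same cues into a basic plain-text transcript."""
    return "\n".join(
        f"{cue.speaker}: {cue.text}" if cue.speaker else cue.text for cue in cues
    )


if __name__ == "__main__":
    cues = [
        Cue(0.0, 3.5, "Welcome to the product tour.", speaker="Dana"),
        Cue(3.5, 5.0, "[door slams]"),
    ]
    print(to_webvtt(cues))
    print(to_transcript(cues))
```

The same cue list feeds both outputs, which is why a clean caption file makes the basic transcript nearly free.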

Audio description: the one teams forget

Audio description is where sites most often fall short, because it requires thinking about what the camera shows that the soundtrack never says. If your demo video shows a cursor moving to a button while the narrator says only “click here,” a blind viewer is lost. Description fills those gaps in the natural pauses between dialogue.

Two practical routes exist. You can produce a separate described audio track (more work, sometimes a second video version), or you can lean on the transcript. Per W3C, a descriptive transcript can substitute for audio description at Level A (under 1.2.3) and is “easy and inexpensive to make” by combining your edited captions with notes on the visual content. At Level AA, 1.2.5 still calls for audio description whenever the visuals carry information the soundtrack does not. For many small-business videos, writing scripts with description in mind, so the narrator says what’s on screen, avoids the problem before it starts.
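
If you do go the separate-track route, a rough first step is finding where description could fit. The sketch below scans caption timings for pauses between speech. The function name `find_description_gaps` and the 2-second minimum are assumptions for illustration; a human describer still decides what to say and whether a given pause is usable.

```python
def find_description_gaps(speech_cues, min_gap=2.0, video_end=None):
    """Return (gap_start, gap_end) pairs where no one is speaking.

    speech_cues: iterable of (start, end) times in seconds for spoken cues
    min_gap:     shortest pause worth considering (assumed value, tune per video)
    video_end:   optional total duration, to catch a pause after the last line
    """
    gaps = []
    cursor = 0.0  # end of the latest speech seen so far
    for start, end in sorted(speech_cues):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping cues
    if video_end is not None and video_end - cursor >= min_gap:
        gaps.append((cursor, video_end))
    return gaps


if __name__ == "__main__":
    cues = [(1.0, 4.0), (4.5, 9.0), (12.0, 15.0)]
    for gap_start, gap_end in find_description_gaps(cues, video_end=20.0):
        print(f"possible description slot: {gap_start:.1f}s to {gap_end:.1f}s")
```

If a video has few or no usable pauses, that is a signal to revise the script so the narration covers the visuals directly.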

Transcripts: the cheapest accessibility win you have

A basic transcript is the speech and non-speech audio as text. A descriptive transcript adds the visual information too, and W3C recommends going descriptive because it serves the widest audience — including people who are both Deaf and blind — and can stand in for audio description. The build process is three steps: get the text (export it from your captions), format it with speaker names and visual notes, and publish it on the page near the video.
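
The three-step build can be sketched as a merge of two timed lists: the caption text and your own notes on what appears on screen. This is a minimal illustration under assumed inputs. The function name `build_descriptive_transcript` and the "[Visual: ...]" notation are hypothetical choices, not a W3C-mandated format.

```python
def build_descriptive_transcript(speech, visuals):
    """Interleave speech and visual notes in time order.

    speech:  list of (time_seconds, speaker, text) taken from edited captions
    visuals: list of (time_seconds, note) written by a person watching the video
    """
    # The middle value breaks ties: a visual note at the same time goes first.
    entries = [(time, 1, f"{speaker}: {text}") for time, speaker, text in speech]
    entries += [(time, 0, f"[Visual: {note}]") for time, note in visuals]
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return "\n".join(line for _, _, line in entries)


if __name__ == "__main__":
    speech = [
        (0.0, "Dana", "Welcome to the demo."),
        (6.0, "Dana", "Now save your work."),
    ]
    visuals = [(4.0, "Cursor clicks the blue Export button")]
    print(build_descriptive_transcript(speech, visuals))
```

The output is plain text you can paste onto the page near the video, with speaker names and visual notes already in reading order.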

Transcripts also pay off beyond compliance. Search engines can index the text, so a published transcript can help the page’s SEO as well as its accessibility. It is the same curb-cut effect that runs through all of this.

Why an overlay won’t do this for you

It’s worth saying plainly: no accessibility widget can caption your video, write its description, or transcribe its audio. Those tasks require a human to listen, watch, and write. This is one reason overlay tools fall short of real conformance and why uncaptioned media remains a frequent target in ADA website lawsuits. At Curbcut we handle media as part of hands-on accessibility remediation — editing real caption files, writing real transcripts, and validating against WCAG success criteria — not bolting on a script that pretends to.

This article is general information, not legal advice; consult a qualified attorney about your specific obligations. If you want to know which pages on your site have uncaptioned or undescribed video, start with a free scan.