Accessibility Testing
What happens here
Accessibility testing is the validation activity that verifies the build against the accessibility commitments named in the signed NFR set — most commonly WCAG 2.2 Level AA, sometimes a higher bar (AAA) or a regulator-specific bar (Section 508, EN 301 549). It runs alongside the other QA activities — functional and regression, performance, security — and gates UAT and launch on systems where accessibility is a contractual or regulatory commitment.
The activity has three named layers, each catching issues the others miss:
- Automated tooling. axe (axe-core, axe DevTools), Lighthouse, Pa11y, WAVE — runs in CI, in browser DevTools, and as part of the engagement’s test pipeline. Catches roughly 30–40% of WCAG issues — missing alt text, insufficient colour contrast, unlabeled form fields, ARIA misuse, heading-order issues. Cheap to run, cheap to scale, never sufficient on its own; a minimal check at this layer is sketched after this list.
- Manual audit. A trained accessibility specialist (sometimes external) walks the system against the WCAG success criteria — keyboard-only navigation, screen-reader compatibility, focus management, error handling, time-based content, motion, cognitive accessibility. Catches the issues automation cannot perceive: whether the keyboard order is logical, whether ARIA actually maps to user intent, whether the system is usable by someone who cannot see the screen.
- Assistive-technology testing. The system is exercised with the assistive technology real users use — VoiceOver on macOS/iOS, NVDA and JAWS on Windows, TalkBack on Android, switch controls, voice control, browser zoom at 400%, system-level magnifier. The fidelity matters: passing axe and passing manual audit does not guarantee the system is usable on the technology users actually deploy.
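To make the automated layer concrete, here is a minimal sketch of a page-level axe scan using Playwright and @axe-core/playwright. The URL, tag selection, and test name are illustrative assumptions, not engagement specifics.

```ts
// Minimal page-level scan with axe-core via Playwright.
// The target URL and tag selection are illustrative assumptions.
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('home page has no machine-detectable A/AA violations', async ({ page }) => {
  await page.goto('https://example.com/'); // placeholder URL

  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
    .analyze();

  // Automated tooling catches only the machine-detectable subset (~30–40%);
  // a clean result here is necessary, not sufficient.
  expect(results.violations).toEqual([]);
});
```

A failing run lists the violated rule ids and the offending nodes, which feed directly into the QA triage and remediation log described below.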
Accessibility commitments in the SOW dictate the depth. Engagements with no accessibility commitment may run automated tooling only and accept its coverage gaps. Engagements with WCAG 2.2 AA in the SOW require automated + manual audit at minimum. Engagements in regulated sectors (US federal — Section 508; EU — EN 301 549 / EAA; UK public sector — PSBAR) require all three layers plus a Voluntary Product Accessibility Template (VPAT) or accessibility conformance report. The contractual stakes increased materially in 2025: the European Accessibility Act (EAA) came into force on 28 June 2025, expanding accessibility obligations to private-sector commerce, banking, transport, and consumer-facing platforms across the EU. US ADA case law continues to expand digital-accessibility liability for any system serving US users.
The output is a documented accessibility report — automated scan results, manual audit findings, AT testing notes, and a remediation log — plus the artefacts that travel with the system: the VPAT (where applicable), the named accessibility statement (a public-facing artefact required by some regulations), and the operational handoff to maintenance of the accessibility-regression checks that the retainer commits to running.
Best practices
Treat accessibility as a build-time concern, not a QA-cycle discovery. Accessibility issues found in the QA cycle that should have been caught at requirements, design, or development time are far more expensive to fix — every fix risks a visual or behavioural regression somewhere else, and the team is fighting deadline pressure when remediation should be straightforward. The discipline starts at Requirements & Design — accessibility is a named NFR, the design specification covers contrast/keyboard/focus/error patterns, the frontend development page treats accessibility as part of the build target. QA’s job is to verify, not to discover, the accessibility posture of the system.
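A hedged sketch of what “part of the build target” can look like at the component level, assuming a Jest setup with jest-axe; the markup and test name are illustrative stand-ins for a real component.

```ts
// Component-level accessibility check run in the ordinary unit-test suite,
// so issues surface during development rather than in the QA cycle.
// The rendered markup below is an illustrative stand-in for a real component.
import { axe, toHaveNoViolations } from 'jest-axe';

expect.extend(toHaveNoViolations);

test('newsletter signup form renders without axe violations', async () => {
  const html = `
    <form>
      <label for="email">Email address</label>
      <input id="email" type="email" autocomplete="email" />
      <button type="submit">Subscribe</button>
    </form>
  `;

  expect(await axe(html)).toHaveNoViolations();
});
```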
Run automated checks in CI, not as a manual cycle activity. axe-core (or similar) runs on every PR build, blocks merge on regressions of the established baseline, and reports the deltas — the same way security scans and the test suite gate merge. Engagements that run automated accessibility tools manually during the QA cycle catch the issues weeks after they were introduced; engagements that gate merge on automated accessibility regressions catch them in minutes. Cheap, mechanical, high signal.
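One way to gate merge on regressions of the established baseline (the accessibility budget) is to fail only on rule ids that are not already in a committed baseline file. This is a sketch of that comparison under assumed names: the file path, field names, and helper function are not a prescribed pipeline.

```ts
// Sketch of a baseline ("accessibility budget") gate: new axe rule ids fail
// the build; known ids in the committed baseline are tolerated until remediated.
// The baseline path, field names, and results shape are illustrative assumptions.
import { readFileSync } from 'node:fs';

interface ViolationLike { id: string }                 // simplified axe result shape
interface ResultsLike { violations: ViolationLike[] }

interface Baseline {
  tolerated: string[]; // rule ids accepted at the documented baseline
}

export function newViolations(
  results: ResultsLike,
  baselinePath = 'a11y-baseline.json'
): string[] {
  const baseline: Baseline = JSON.parse(readFileSync(baselinePath, 'utf8'));
  return results.violations
    .map((v) => v.id)
    .filter((id) => !baseline.tolerated.includes(id));
}

// In the CI test, after an axe scan: expect(newViolations(results)).toEqual([]);
```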
Pair automated tooling with manual audit on a defined cadence. Automated tooling catches 30–40%; the rest needs a trained eye. The manual audit cadence is engagement-shaped: a pre-launch full audit on every engagement with an accessibility commitment, a delta audit at every major release post-launch (under the retainer’s operational scope). For engagements over £200k or systems serving regulated audiences, the audit is run by a specialist (internal or third-party) with documented WCAG expertise rather than by a generalist QA engineer.
Test on the assistive technology users actually deploy. A WCAG-AA-conformant system that nobody has tested with NVDA, VoiceOver, or TalkBack is a system passing the spec on paper while failing real users. The discipline is mechanical: every accessibility cycle includes screen-reader testing on at least one desktop AT and one mobile AT, keyboard-only walkthroughs of every critical user flow, and a 400% zoom check on every key page. Engagements that skip AT testing find that the production complaints surfacing after launch were predictable failures of the untested AT path.
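A sketch, assuming Playwright, of the mechanical parts of this discipline: a keyboard-only walk of one flow and a reflow check approximating 400% zoom (WCAG 1.4.10 treats a 320 CSS px wide viewport as the reflow target). URLs and selectors are illustrative assumptions; screen-reader passes with NVDA, VoiceOver, or TalkBack remain a human activity.

```ts
// Keyboard-only walkthrough and 400% zoom (reflow) checks with Playwright.
// URLs and selectors are illustrative assumptions; screen-reader testing
// still has to be done by a person on real assistive technology.
import { test, expect } from '@playwright/test';

test('checkout flow is reachable by keyboard alone', async ({ page }) => {
  await page.goto('https://example.com/checkout'); // placeholder URL

  await page.keyboard.press('Tab');
  await expect(page.locator(':focus')).toBeVisible(); // focus indicator present

  // Walk forward until the primary action receives focus, bounded to avoid looping forever.
  for (let i = 0; i < 25; i++) {
    if (await page.locator('button#place-order:focus').count()) break;
    await page.keyboard.press('Tab');
  }
  await page.keyboard.press('Enter');
  await expect(page).toHaveURL(/confirmation/);
});

test('key page reflows at the 400% zoom equivalent', async ({ page }) => {
  // 1280px-wide content at 400% zoom is roughly a 320 CSS px viewport (WCAG 1.4.10).
  await page.setViewportSize({ width: 320, height: 640 });
  await page.goto('https://example.com/'); // placeholder URL

  const overflows = await page.evaluate(
    () => document.documentElement.scrollWidth > document.documentElement.clientWidth
  );
  expect(overflows).toBe(false); // no two-dimensional scrolling for text content
});
```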
Document the conformance posture, including what is not conformant. The accessibility report is honest about gaps — every WCAG success criterion that has not been verified, every known limitation, every accepted-residual issue with rationale. The honesty matters because over-claiming conformance is itself a regulatory and reputational risk; under-claiming conformance underprices the work the engagement actually delivered. The VPAT format (originally US-federal but now widely adopted) is the standard documentation form: each criterion marked Supports, Partially Supports, Does Not Support, or Not Applicable, with a remarks column.
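Sketching the documentation form rather than any tool: a minimal typed shape for per-criterion entries in the VPAT’s vocabulary. The field names and example values are assumptions; the VPAT template itself is a document, not a schema.

```ts
// Minimal, illustrative shape for recording per-criterion conformance in the
// VPAT's vocabulary. Field names are assumptions, not the official template.
type ConformanceLevel =
  | 'Supports'
  | 'Partially Supports'
  | 'Does Not Support'
  | 'Not Applicable';

interface CriterionEntry {
  criterion: string;          // e.g. '1.4.3 Contrast (Minimum)'
  level: ConformanceLevel;
  remarks: string;            // evidence, known limitations, or rationale
  verifiedBy: 'automated' | 'manual' | 'assistive technology';
}

const example: CriterionEntry = {
  criterion: '2.4.7 Focus Visible',
  level: 'Partially Supports',
  remarks: 'Custom date picker suppresses the focus ring; tracked in the remediation log.',
  verifiedBy: 'manual',
};
```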
Plan accessibility regression testing into the retainer. Every change to the system — new feature, dependency upgrade, design refresh — risks accessibility regression. The retainer’s operational scope names the accessibility regression activity: automated checks on every release, manual delta audit on major releases, full re-audit on the cadence the engagement requires (typically annually for AA conformance, more often for AAA or regulated targets). Engagements that hand off without a documented accessibility-regression plan deliver a system that conforms at handoff and degrades unobserved.
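A hedged sketch of what the documented regression plan can look like when captured as configuration rather than prose; the cadence values, field names, and triggers are assumptions to be replaced by what the engagement actually commits to.

```ts
// Illustrative shape for the retainer's accessibility-regression commitment.
// All names, triggers, and cadence values are assumptions, not a prescribed standard.
interface AccessibilityRegressionPlan {
  automatedChecks: { tool: string; trigger: 'every-pr' | 'every-release' };
  manualDeltaAudit: { trigger: 'major-release'; auditor: 'internal' | 'third-party' };
  fullReAudit: { cadenceMonths: number; earlyTrigger?: string };
}

const plan: AccessibilityRegressionPlan = {
  automatedChecks: { tool: 'axe-core', trigger: 'every-release' },
  manualDeltaAudit: { trigger: 'major-release', auditor: 'internal' },
  fullReAudit: { cadenceMonths: 12, earlyTrigger: 'design refresh or framework upgrade' },
};
```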
Desired outcomes
By the end of accessibility testing, the engagement has:
- A documented accessibility test strategy tied to the signed accessibility NFR (typically WCAG 2.2 AA or a stated higher/regulator-specific bar), with traceability from each WCAG success criterion to the verification method (automated, manual, AT)
- Automated accessibility checks running in CI, gating merge on regressions, with the established baseline documented as the engagement’s accessibility budget
- A completed pre-launch manual audit run by a competent reviewer, with findings logged, triaged into bug/enhancement/requirements-clarification per the QA triage discipline, and remediated or formally deferred with client sign-off
- A completed AT testing pass covering at least one desktop screen reader, one mobile screen reader, keyboard-only navigation of every critical flow, and 400% zoom verification of every key page
- An accessibility report combining all three layers’ results, signed off by the client as evidence of the conformance posture at launch
- A VPAT or accessibility conformance report (where the engagement requires one) as a procurement artefact
- A draft accessibility statement for the client to publish on the live site, covering conformance level, audit method, known limitations, and the issue-report contact route
- A documented retainer commitment for accessibility regression testing, with named cadence, named tooling, and the trigger for full re-audit
What the industry does
Automated-only vs. mixed-method vs. specialist-led accessibility shops. Automated-only agencies run axe or Lighthouse and treat their pass as accessibility coverage. Trade-off: cheap, fast, catches the obvious; misses 60–70% of WCAG issues, produces systems that pass the scan and fail real users. Common in agencies whose clients have not contractually committed to accessibility and treat it as a polish concern. Mixed-method agencies pair automated tooling with manual audit, typically led by an internal QA engineer with accessibility training. Trade-off: catches most WCAG issues, defensible documentation, requires accessibility expertise on the QA bench. The modern default in agencies whose engagements include accessibility commitments. Specialist-led agencies engage a third-party accessibility specialist (Deque, TPGi, Level Access, others) for audits, with the agency’s QA running the automated and regression layers. Trade-off: highest assurance, third-party signature defensible in regulated settings, expensive on rates. Common in regulated work and in agencies whose engagements include procurement-mandated accessibility audits.
Build-time-integrated vs. cycle-time-discovery cultures. Build-time agencies treat accessibility as a frontend development concern — components are built accessible from the start, automated checks run on every PR, design reviews catch accessibility issues before code is written. Trade-off: highest cost upfront, lowest cost overall, requires accessibility-fluent engineers and designers. Cycle-time agencies build first and audit later — accessibility issues surface during the QA cycle and remediation is the engagement’s accessibility activity. Trade-off: lowest cost upfront if no issues exist, highest cost if many issues exist (because remediation late in the engagement risks regressions); produces unpredictable engagement economics. Build-time-integrated dominates in modern agencies serving accessibility-committed clients and in agencies operating in regulated EU/UK markets; cycle-time-discovery survives in agencies whose engagements treat accessibility as a procurement-checkbox commitment rather than a design-and-build commitment.
Overlay vs. native-build cultures. Overlay agencies use third-party overlays (AccessiBe, UserWay, AudioEye, EqualWeb) to “fix” accessibility at runtime via injected JavaScript. Trade-off: appears cheap and fast, surfaces marketing-visible accessibility-statement language, almost always produces worse real-user outcomes than no overlay at all and has been the basis for multiple US ADA lawsuits in 2024–2025. Native-build agencies refuse overlays as architectural malpractice and build accessibility into the system itself. Trade-off: higher upfront cost, dramatically better real-user outcomes, defensible against accessibility litigation. The accessibility community (NFB, WebAIM, Deque, accessibility consultants broadly) is unanimous against overlays as a substitute for native accessibility; the regulatory environment is increasingly aligned with that position. Overlay agencies survive in commodity-engagement work where the client has not yet been advised against them; native-build is the modern professional default.