Closed Captioning Best Practices for State Governments

State governments have many opportunities around video content, and are taking advantage of them. Meetings, from committees to interim task force debriefings, can be streamed to expand reach and participation across a broader community. However, presenting this content raises the question of accessibility, and with it the inclusion of closed captions.

Late in 2016, the U.S. Department of Justice was looking to revise the Americans with Disabilities Act Title II regulations. One possible outcome was a set of requirements for making the services, programs or activities that state and local governments offer to the public via the Web accessible. This expectation stemmed from the 2010 update, which stated: “The Department intends to engage in additional rulemaking in the near future addressing accessibility in these areas and others, including next generation 9–1–1 and accessibility of Web sites operated by covered public entities and public accommodations.” However, this was not reflected in the 2016 update, leaving no clear timetable for when captions might be required. That said, some states have already adopted their own regulations requiring captions for online video on state web sites.

For those paving the path toward captioning now, before a requirement becomes law, this article presents closed captioning best practices for state governments. These cover formatting and judgment decisions, along with ways to scale the creation of captions for both live and on-demand content.

If you would like to learn more on this topic, be sure to download this white paper on AI Closed Captioning Services for Local and State Governments.

Offer synchronized captions

For someone who has trouble hearing, it can be frustrating to watch content when the captions are a phrase or more behind. It is distracting to the point where a viewer may find it easier to turn the captions off or mute the audio in order to better comprehend the content. Consequently, having captions that are synced with the audio is immensely important. In fact, as it pertains to media, it’s one of the four criteria listed in the FCC guidelines for closed captioning:

“Synchronous: Captions must coincide with their corresponding spoken words and sounds to the greatest extent possible and must be displayed on the screen at a speed that can be read by viewers.”

As a result, a lot of attention should be paid to timing captions appropriately. A general rule of thumb is that as someone starts to speak, the caption associated with their passage or dialogue should appear on screen.
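As a rough illustration of that rule of thumb, the sketch below writes caption cues whose start time is the moment the speaker begins. It assumes the captions are delivered as WebVTT, a common web caption format that this article does not itself name, and the function names and sample timings are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Build a WebVTT document from (speech_start, speech_end, text) tuples.

    Each cue starts exactly when the speaker starts, so the caption
    appears on screen as the words are spoken.
    """
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical timings, in seconds from the start of the recording.
print(to_webvtt([(0.0, 2.0, "Good morning."), (2.4, 5.1, "The committee will come to order.")]))
```

Where the speech start times come from (a human captioner or a speech recognition service) is outside the scope of this sketch.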

Have natural feeling durations

Deciding how long a caption stays on screen can be tricky. To serve its purpose, a caption must be displayed on screen at a speed that can be read by viewers. How long a caption should stay on screen varies with the amount of text included. A helpful strategy is to set a minimum duration, which should be greater than 1 second. In fact, even a caption that is essentially one word, like “Okay”, should stay on screen for 2 seconds.

Keep in mind that the viewer will likely be reading the captions while trying to keep up with the visuals on screen, so their attention will be divided. As a result, add some buffer to how long captions stay on screen. A good practice here is to place a reasonable amount of text into a single or double line of captions. The idea is that someone can more quickly read a short sentence, and still focus on the action, than they can a rapid succession of smaller captions. Don’t make captions too long, though. They shouldn’t be placed too close to the edge of the screen or take up too much of the screen real estate. Captions should also use a large font size, so this is not a suggestion to shrink the text to accommodate more words.

All of this said, captions shouldn’t remain on screen too long either. For example, if someone makes a quick comment and then is silent for 20 seconds, the caption shouldn’t remain on screen for that whole time. Doing so would not accurately depict the content for someone who is deaf or hard of hearing, who might assume that the person spoke for longer than they did.
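The duration guidance in this section can be sketched as a small function. The 2 second floor comes from the advice above; the reading rate and the upper cap are assumptions chosen for illustration, not values from any guideline cited here, and should be tuned to your own content.

```python
MIN_DURATION = 2.0       # seconds; the floor suggested above, even for "Okay"
CHARS_PER_SECOND = 15.0  # assumed reading rate, not taken from this article
MAX_LINGER = 6.0         # assumed cap so a caption does not sit through a long silence


def display_window(start, text, next_start=None):
    """Return (start, end) in seconds for one caption.

    The caption stays up long enough to read, never less than MIN_DURATION
    and never more than MAX_LINGER. If the next caption begins sooner,
    this one ends when the next begins, even if that cuts into the floor.
    """
    reading_time = len(text) / CHARS_PER_SECOND
    duration = min(max(reading_time, MIN_DURATION), MAX_LINGER)
    end = start + duration
    if next_start is not None:
        end = min(end, next_start)  # do not overlap the next caption
    return start, end


print(display_window(10.0, "Okay"))  # a one-word caption still gets 2 seconds
```

With these assumed values, a quick comment followed by 20 seconds of silence would clear from the screen after at most 6 seconds rather than lingering until the next speaker.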

Place a background or border around captions

Captions presented in a single color with no background or border create a legibility issue. All-white captions might be very clear during darker scenes, but indecipherable during lighter sequences. Similarly, black captions will work on lighter sequences, but won’t work for darker ones.

The ideal approach is to have a border or background around the captions. This will make sure the captions are clear to read, regardless of what’s happening in a scene. An ideal and often default setting for this is white text on a background that is either solid black or black with a faint hint of transparency. Going off the beaten path and choosing a different combination probably isn’t wise, but if you do, make sure it passes a color contrast check so that it remains legible.
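One way to run such a color contrast check is with the contrast ratio formula from the W3C Web Content Accessibility Guidelines (WCAG). The article does not say which check to use, so treating WCAG as the yardstick, and 4.5:1 (its AA threshold for normal-size text) as the pass mark, is an assumption. The function names below are hypothetical.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colors) up to 21 (black and white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_contrast_check(foreground, background, threshold=4.5):
    """True if the text/background pair meets the assumed 4.5:1 threshold."""
    return contrast_ratio(foreground, background) >= threshold


WHITE, BLACK, YELLOW = (255, 255, 255), (0, 0, 0), (255, 255, 0)
print(passes_contrast_check(WHITE, BLACK))   # default white-on-black combination
print(passes_contrast_check(WHITE, YELLOW))  # a combination that is hard to read
```

Note that this only checks the text against its own background box. It does not account for a partly transparent box letting the video show through.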

Err on the side of verbatim

“If it’s in the audio, it should be in the captions” is a good rule of thumb. Someone who is hard of hearing shouldn’t feel “cheated” by getting a censored or dumbed-down version of what’s actually being said. There are exceptions that call for judgment, though. For example, if someone notably stutters, starting their presentation with “Well… umm… uhhhh… you… ummm…”, that could be something that’s not fully represented in the captions.
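For on-demand content where an editor decides that hesitations like these should be trimmed, the cleanup can be sketched with a regular expression. The pattern below is an assumption covering only "um" and "uh" style fillers; it is not an exhaustive list, and a person should review the result, since the default advice remains to stay verbatim.

```python
import re

# Assumed filler pattern: "um", "umm", "uh", "uhhh" and similar, plus any
# punctuation or spaces that trail them. Deliberately narrow, not exhaustive.
FILLER = re.compile(r"\b(?:u+m+|u+h+)\b[\s,.…]*", re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Remove simple spoken fillers and tidy up the leftover spacing."""
    cleaned = FILLER.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("Umm, thank you all, uh, for coming."))
```

Words that merely contain these letters, such as "umbrella" or "human", are left alone because the pattern only matches whole words.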

On the topic of verbatim, this includes strong language. Simply put, if it slipped into the audio track, and it certainly could if a committee meeting was open to the public, it should be represented in the captions. If the word is obscured in the captions, but not in the actual audio, it creates a disconnect. As a result, leaving the strong language captioned as spoken is likely the optimal approach. In fact, taking a page from media sources, leaving this in is part of the subtitle guidelines from the BBC.

Be cost effective through using AI

Automated approaches for captioning video content have existed for some time, but traditionally have not been reliable. The appeal of automating what can be a costly and time-consuming process is obvious, though. Thankfully, IBM has worked to progress this technology, offering something that is far improved over traditional automated solutions.

The process involves utilizing IBM Watson and with it the benefits of integrating artificial intelligence. The service begins the same way as many others, using ASR (Automatic Speech Recognition) to receive audio and convert it to a machine-readable format, in this case text that can be used for captions. However, it’s able to tackle previous shortcomings of automated speech-to-text processes by being trainable. For state governments this can be invaluable: the AI can be trained on the names of state officials, as just one of many examples. By expanding both vocabulary and relevant, hyper-localized context, the AI-infused solution is able to offer something that state governments can use more cost effectively. In addition, this technology can be used for both live and on-demand applications through either IBM Watson Captioning Live or IBM Watson Captioning.

To learn more about how AI, and IBM Watson in particular, is changing captioning, see this closed captioning software solution page.


For state governments using video but looking to adopt captions into their content, the notion of generating captions might first seem daunting, both in terms of time and expense. That said, there are newer, more accurate automated approaches out there to help manage this activity. What’s more, ones like IBM Watson Captioning are built with the best practices listed in this article in mind. This includes a focus on synchronization while also having natural feeling durations through the inclusion of a smart layout algorithm, which automatically segments caption cues at natural breaking points for readability. In addition, the captions can be set up to have a black box surrounding them for legibility. On the topic of verbatim, the AI will strive to capture everything that is said. For on-demand content, someone can also manually edit what is generated, removing “umms” for example if this is desired.

To learn more about how IBM Watson Captioning can work for your government, schedule a demo.