Captioning at the Speed of Live for Accessible TV

When news breaks, television news crews do what they do best: hustle to the scene to get the word out quickly, accurately and often under daunting conditions.

Their work has enormous impact: Even in a new era of instant access to digital news on the Internet, television remains a go-to resource. The September 2017 State of the News survey by the Pew Research Center found that more people get their news from television than from any other source. What’s more, Pew found that most of those TV news viewers get their news from their local TV stations and their companion websites.

Understanding the scope and social impact of TV news helps to explain why it’s disappointing to news directors and station managers that coverage isn’t always accurate and available for a significant share of the audience – people who rely on written text, not spoken language, to know what’s happening. This article covers why accessible TV matters, even for live television content, and how advances in automation driven by AI (artificial intelligence) are making it possible.

For more depth on using AI for captions, download the white paper Captioning Goes Cognitive, which covers some of the solutions available from IBM Watson Media.

The importance of accessible TV

Approximately 20% of U.S. adults report having some level of hearing loss, according to the non-profit advocacy group Hearing Loss Association of America. That share represents the broad contours of the U.S. marketplace for captioned television programming. Millions of viewers regularly use captioning to comprehend what’s being said while images appear on the screen. Especially during moments of crisis or urgent importance, being able to quickly and accurately communicate with these individuals is critical. In the U.S. congressional report tied to the FCC’s 1997 Closed Captioning Report and Order, lawmakers spelled out the objective that “all Americans ultimately have access to video services and programs, particularly as video programming becomes an increasingly important part of the home, school, and workplace.”

That reality is well understood in the video business. But solving the problem is difficult. The craft of applying live captioning to television has vexed the TV and video industries for decades, literally since captioning first came on the scene in 1972 when the Boston TV station WGBH-TV aired a captioned version of the cooking program “The French Chef.”

Imperfection fades away

To be sure, significant investment has been applied to make live TV captioning a better experience for viewers and for news teams alike. But best-effort initiatives uniformly have fallen short of the ideal – accurate, live, contextually relevant captions that keep up with the flow of news coverage.

The welcome news is that these imperfections are on the edge of a fade-out. New breakthroughs that align automated speech recognition applications with artificial intelligence are producing powerful results for station groups, news teams, meteorologists and anybody else who aspires to improve live-captioning experiences.

The efficiency argument is easily understood. The speech-to-text translation accomplished by Watson Captioning Live typically puts spoken words on the screen within no more than three seconds of their occurrence. For fast-changing events and breaking news reports, timing is critical. Watson Captioning Live was developed to address the urgency of the medium.

Accuracy also is paramount to broadcasters. Hosted on IBM Cloud, Watson Captioning Live is trained several times a day on each local TV station’s market-specific terminology so that the solution can be as accurate as possible in identifying local landmarks, events, and even names of local personalities and politicians. This benefits stations, which are constantly looking for new ways to optimize the local viewer experience.

One attribute of news gathering/reporting that’s especially well-suited to AI-powered captioning is the large bank of existing video material stations possess. These rich archives present ready-made instruction manuals, in a sense. This existing material is ingested and evaluated by the Watson Captioning solution ahead of time, making it optimally prepared for the moment when a station flips the switch on live captioning. But it doesn’t stop there: By paying attention to context, to repetition of phrases and to external environmental inputs, Watson does what attentive human beings do: It learns.

This attribute is part of what distinguishes Watson Captioning Live from the broader automated speech recognition translation category. As many a station manager or news director can attest, even the most discriminating of speech conversion systems stumbles over oddities of language, resulting in on-screen experiences that reflect poorly, if unfairly, on the broadcaster.

A key result is the ability to communicate naturally, with intelligent interpretation that captures not just the correct words, but their surrounding context. The names of cities, the proper spelling of streets and the correct presentation of the mayor’s name, for instance, are common tripping points for manual captioning systems. So are the ways words are used. The difference between a legislative bill (lower case) and the “Bill” in a popular athlete’s name (upper case) is just one of thousands of examples of how the AI power of Watson sets the viewing experience apart. So does the ability to learn over time by paying attention to language settings, meaning and intent, helping to overcome limitations that have long frustrated viewers who use captioning.

The presentation is especially striking for viewers who have depended on captioning for some time. The difference in quality of the on-screen presentation can be almost palpable: In an instant, viewers may recognize they are experiencing a notable improvement in the respect stations convey for those with hearing loss. For too long the live captioning experience has at least on occasion strayed into the realm of regulatory obligation, not net benefit to viewers. And in some cases, poor captioning work has provoked storms of unwanted social media controversy, as exemplified by some recent high-profile captioning missteps that have occurred in the digital video marketplace.

Short- and long-term benefits of captioning

The elevation of viewer perceptions is the central ambition that’s driving much of the broadcast industry’s interest in evaluating new technology solutions for live captioning. But it’s not just the on-screen experience that can improve based on the marriage of cognition and captioning. Watson Captioning Live delivers important behind-the-scenes attributes that serve larger economic and strategic goals of stations, networks and TV news organizations. These include:

Enhanced archival access:
Not to be overlooked is the important side benefit of enabling a catalogued archive of news and other content. The material ingested by Watson can be enriched with improved metadata, helping editors construct granular, detailed searches of archived video and retrieve results swiftly. The improved archive/search functionality alone can create an economic benefit independent of the captioning process itself.
Improved content evaluation:
Stations that have implemented Watson Captioning Live also benefit from having a first-ever, holistic view of the content they produce and air. Being able to count, catalogue and compare newscasts, weather reports and other original productions helps arm news directors and station professionals with important insights about how they’re fulfilling mandates around content, subjects, timeliness and more.
Smarter use of human resources:
A common item on the TV station “wish list” is more time, especially for talented producers and editors whose daily workloads can rapidly become clogged with time-sapping manual searches and archive management. Watson Captioning Live helps replace what are commonly laborious manual processes with more efficient ways to gather and present information.

Summary

Improving the on-screen experience while extending the reach and accessibility of live programming – and netting internal process benefits in the meantime – is an ambitious objective. But it’s attainable with the right application of resources. Going forward, near-flawless live captioning has a chance to emerge – finally – as a staple of the broadcast television experience. Given that the industry has been trying since at least 1972, it’s about time.

Want to put smart to work with closed captioning software that generates automated captions for your live, televised content? Request a demo.