Barrier-Free WEBTOON: The First Step to Enhancing Content Inclusiveness
# Enhancing Accessibility for the Visually Impaired Through AI-Based Alternative Text
We at WEBTOON ENTERTAINMENT began developing an automatic alternative text feature in the second half of 2021 so that all users, including those with visual impairments, can enjoy the content WEBTOON provides. The result is a technology that generates alternative text using AI, which we introduced as the “Barrier-Free WEBTOON” beta service in January 2023. It made web comics accessible to visually impaired readers, marking a first for the industry. At launch, the beta added alternative text to approximately 180,000 episodes from completed and ongoing series, and we are continuing to improve the service based on user feedback. The beta is currently available only in Korean, but we plan to provide an English service by the first quarter of 2024.
# The Process of Barrier-Free WEBTOON
To make web comics accessible to visually impaired readers, image information must first be converted into text, and that text must then be converted into audio. We focused on the first conversion: finding a way to rapidly turn the images of thousands of completed works, plus the hundreds of new episodes added every week, into text. After more than a year of research, we developed the technology for providing “AI-based automatic alternative text.” The second conversion, from text to audio, is handled by the screen readers built into smartphone operating systems, such as “VoiceOver” or “TalkBack.”
Automatic alternative text in WEBTOON is produced in three key steps: (1) image segmentation, (2) dialogue area detection, and (3) text extraction and dialogue ordering. The two technologies behind these steps are “OCR (Optical Character Recognition)” and “WEBTOON Object Detection.”
Most works on WEBTOON are presented in a long vertical-scrolling format, so the image for each episode is very large. The image must therefore be segmented before dialogue areas can be detected and text extracted. It is crucial that the segmentation does not cut through dialogues, speech bubbles, or panel lines, because damaging them would make it impossible to arrange the dialogues in the correct order. To address this, WEBTOON developed WEBTOON Object Detection, which detects elements such as panels, speech bubbles, and dialogue areas.
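The constraint on segmentation (cut only where no detected element would be split) can be illustrated with a minimal sketch. It assumes a detector has already returned the vertical pixel span of each panel, speech bubble, and dialogue area. The function name `choose_cut_points`, the `max_chunk` parameter, and the greedy strategy are illustrative assumptions, not WEBTOON's actual implementation.

```python
from typing import List, Tuple

# (top, bottom) rows of one detected element, in pixels. Hypothetical format.
Span = Tuple[int, int]


def is_safe_cut(y: int, spans: List[Span]) -> bool:
    """A cut at row y is safe if it is not strictly inside any element."""
    return all(not (top < y < bottom) for top, bottom in spans)


def choose_cut_points(height: int, spans: List[Span], max_chunk: int) -> List[int]:
    """Greedily pick rows at which to split a tall episode image.

    Each chunk is at most max_chunk rows tall. For every cut, the function
    starts at the largest allowed row and walks upward to the nearest row
    that does not pass through a detected element. If no such row exists
    (an element taller than max_chunk), it falls back to a forced cut.
    """
    cuts: List[int] = []
    start = 0
    while height - start > max_chunk:
        target = start + max_chunk
        cut = target  # forced cut, used only if no safe row is found
        for y in range(target, start, -1):
            if is_safe_cut(y, spans):
                cut = y
                break
        cuts.append(cut)
        start = cut
    return cuts


# A 3000-pixel-tall image with two speech bubbles near the naive cut rows.
bubble_spans = [(900, 1100), (1900, 2050)]
cut_rows = choose_cut_points(3000, bubble_spans, max_chunk=1000)
print(cut_rows)
```

In this example the naive first cut at row 1000 would slice the first bubble, so the sketch moves the cut up to the bubble's top edge instead. A production system would likely also weigh panel gutters and chunk-size balance, which this sketch ignores.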
After the image segmentation process, WEBTOON uses WEBTOON Object Detection to detect dialogue areas. Because a webtoon's narrative should flow smoothly without interruption, it is crucial that readers can accurately follow the order of speech bubbles and distinguish the text inside speech bubbles from the text outside them. If text embedded in the image background, such as a building sign, is mistakenly included in the alternative text, it can disrupt the narrative. Generic OCR alone might also read words from different speech bubbles and panels as a single sentence, producing an incorrect dialogue order. To reduce such errors, we use WEBTOON Object Detection to detect dialogue areas first.
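The two failure modes mentioned here (background text leaking in, and words from separate bubbles merging) can be illustrated with a small sketch. Note that it is a simplified variant: the source detects dialogue areas before running OCR, whereas this sketch applies the same constraint by filtering generic OCR output against already-detected areas. The `Box` and `Word` types and the `group_words_by_area` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def center(self) -> Tuple[float, float]:
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

    def contains_point(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """One word as a generic OCR engine might report it."""
    text: str
    box: Box


def group_words_by_area(words: List[Word], areas: List[Box]) -> List[List[str]]:
    """Assign each OCR word to the dialogue area containing its center.

    Words whose center lies in no dialogue area (for example, a sign in the
    background art) are dropped. Words in different areas stay in separate
    groups, so they are never merged into one sentence.
    """
    grouped: List[List[str]] = [[] for _ in areas]
    for word in words:
        cx, cy = word.box.center()
        for index, area in enumerate(areas):
            if area.contains_point(cx, cy):
                grouped[index].append(word.text)
                break
    return grouped


dialogue_areas = [Box(0, 0, 100, 50), Box(200, 0, 300, 50)]
ocr_words = [
    Word("Hello", Box(10, 10, 40, 30)),
    Word("there", Box(50, 10, 90, 30)),
    Word("CAFE", Box(120, 200, 180, 220)),  # background sign, outside any area
    Word("Hi", Box(210, 10, 240, 30)),
]
print(group_words_by_area(ocr_words, dialogue_areas))
```

Here "CAFE" is discarded because it falls outside every dialogue area, and "Hi" stays separate from "Hello there" even though all three words sit on the same horizontal line.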
After detecting dialogue areas using WEBTOON Object Detection, the text is extracted through OCR. Lastly, the positioning information of the detected speech bubbles and panels is utilized to determine the order of dialogue, resulting in the finalized alternative text.
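As a rough illustration of how position information could yield a dialogue order, the sketch below sorts speech bubbles first by the panel that contains them and then by their position inside that panel. The top-to-bottom, left-to-right reading order, the handling of bubbles that sit outside every panel, and the names `Box`, `Bubble`, and `order_dialogue` are all assumptions made for this example; the source does not describe the actual ordering rules.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains_center_of(self, other: "Box") -> bool:
        cx = (other.left + other.right) / 2
        cy = (other.top + other.bottom) / 2
        return self.left <= cx <= self.right and self.top <= cy <= self.bottom


@dataclass(frozen=True)
class Bubble:
    """A detected speech bubble together with its OCR text."""
    text: str
    box: Box


def order_dialogue(bubbles: List[Bubble], panels: List[Box]) -> List[str]:
    """Return bubble texts in an assumed reading order.

    Bubbles are ordered by the position of their enclosing panel (top, then
    left), and within a panel by their own position. A bubble that lies in
    no panel is placed according to its own box.
    """
    def sort_key(bubble: Bubble) -> Tuple[float, float, float, float]:
        anchor = next(
            (panel for panel in panels if panel.contains_center_of(bubble.box)),
            bubble.box,  # fallback for bubbles between or outside panels
        )
        return (anchor.top, anchor.left, bubble.box.top, bubble.box.left)

    return [bubble.text for bubble in sorted(bubbles, key=sort_key)]


panels = [Box(0, 0, 400, 300), Box(0, 320, 400, 600)]
bubbles = [
    Bubble("Come in.", Box(30, 340, 180, 400)),       # second panel
    Bubble("It's me.", Box(220, 120, 380, 180)),      # first panel, lower
    Bubble("(a knock)", Box(100, 302, 300, 318)),     # gutter between panels
    Bubble("Who's there?", Box(20, 20, 150, 80)),     # first panel, upper
]
for line in order_dialogue(bubbles, panels):
    print(line)
```

Grouping by panel first keeps a bubble low in one panel from being read after a bubble high in the next panel when panels sit side by side, which a plain sort on bubble position would get wrong.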
The current version of our automatic alternative text generation technology is still in its early stages. We will continue to challenge ourselves and conduct research to create alternative text that delivers the story as faithfully as possible. Our long-term goal is to develop a speaker detection function that determines which character is speaking each line of dialogue. We will keep working until no user is excluded from the WEBTOON ecosystem.