Why Eye Tracking Technology Makes Subtitling More Effective

First, let’s explore how eye tracking is actually measured. An eye tracking assessment is usually done in a lab, under controlled conditions, with specialized equipment, and in several stages. As with any scientific assessment, the goal is to collect data that is reliable and actionable, with immediate practical applications such as subtitling.

The first stage consists of informing the people whose eye movements will be measured about the purpose of the assessment. This includes how the eye tracking procedure is done and any potential risks or discomfort. Because these assessments are typically treated as part of research on human subjects, people must provide consent before the procedure begins.


The eye tracking equipment used for these assessments can be a head-mounted device, a remote eye tracker mounted on a monitor, or a virtual reality headset with integrated eye tracking. The device is connected to a computer.

Controlled conditions in a lab environment include things like adjusting lighting and blocking distractions to ensure accurate data collection. People whose gaze is being tracked may be standing, sitting or lying down, depending on the specific experimental setup.

How Eye Tracking is Done

Once the assessment is in progress, the following happens:

  • People are asked to follow a series of visual targets or stimuli on a screen or within their field of view
  • The eye tracking device records their eye movements and uses this data to establish a baseline (a reference value) for subsequent measurements
  • The eye tracking device is calibrated as needed to ensure that it accurately maps the person's gaze to the visual content (stimuli) presented

The types of visual content participants are exposed to include images, videos, websites, or real-world scenarios, depending on the goal of the research. For subtitling research, the eye tracker continuously records reading patterns, pupil dilation, and other relevant metrics, such as cognitive load.

In some cases, researchers may ask people to perform specific tasks while their eye movements are monitored, like interacting with a user interface. During those tasks, various eye movement metrics are calculated, including fixation duration (the time a person’s gaze is sustained on a focal point), saccade amplitude (the distance covered by a rapid eye movement that shifts the gaze from one focal point to another), pupil dilation, and blink rate. These metrics provide insights into cognitive load, emotional responses, and decision-making processes.
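To make the fixation metric concrete, here is a minimal sketch of dispersion-based fixation detection (the I-DT approach), one common way raw gaze samples are turned into fixations. The sample data, pixel thresholds, and minimum duration below are illustrative assumptions, not values from any particular study or device.

```python
def detect_fixations(samples, max_dispersion=25.0, min_duration=0.1):
    """samples: list of (timestamp_s, x_px, y_px) gaze points.
    Returns a list of (start_s, end_s, centroid_x, centroid_y) fixations."""
    fixations = []
    i = 0
    while i < len(samples):
        j = i
        # Grow the window while the gaze stays within the dispersion threshold.
        while j + 1 < len(samples):
            window = samples[i:j + 2]
            xs = [p[1] for p in window]
            ys = [p[2] for p in window]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        duration = samples[j][0] - samples[i][0]
        if duration >= min_duration:
            window = samples[i:j + 1]
            cx = sum(p[1] for p in window) / len(window)
            cy = sum(p[2] for p in window) / len(window)
            fixations.append((samples[i][0], samples[j][0], cx, cy))
            i = j + 1
        else:
            i += 1  # too short to count as a fixation; skip this sample
    return fixations

# Simulated 60 Hz samples: steady gaze on one word, then a saccade to another.
samples = [(t / 60, 100 + (t % 3), 200) for t in range(12)]
samples += [(t / 60, 400, 200) for t in range(12, 24)]
print(detect_fixations(samples))
```

Real eye tracking software implements far more robust versions of this (handling noise, blinks, and sampling gaps), but the core idea, grouping nearby gaze samples into fixations and timing them, is the same.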

When the eye tracking session is complete and data collection is over, the recorded data is used to create “gaze maps,” which are visual representations of where people looked and for how long. This is how eye tracking technology can help reveal areas of interest, attention distribution (more on that below), and potential difficulties.
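A gaze map is, at its simplest, a grid that accumulates dwell time per screen region. The sketch below illustrates the idea; the screen size, grid resolution, and fixation data are assumed values for demonstration.

```python
def gaze_map(fixations, screen_w=1920, screen_h=1080, cols=4, rows=3):
    """fixations: list of (x_px, y_px, duration_s).
    Returns a rows x cols grid of total dwell time (seconds) per cell."""
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, dur in fixations:
        c = min(int(x / screen_w * cols), cols - 1)
        r = min(int(y / screen_h * rows), rows - 1)
        grid[r][c] += dur
    return grid

# Most dwell time lands in the bottom-centre cells, where subtitles usually sit.
fixations = [(900, 1000, 0.35), (1000, 1010, 0.28), (960, 400, 0.15)]
for row in gaze_map(fixations):
    print(["%.2f" % v for v in row])
```

Production tools render the same aggregation as a smoothed heatmap overlaid on the stimulus, but the underlying data structure is this kind of dwell-time grid.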

Researchers then use statistical methods to compare the compiled data across participants, the research conditions, or time points to identify significant patterns and differences, and draw conclusions.

Using Eye Tracking Technology to Improve Subtitles

There are several ways in which data collected through eye tracking technology provides useful information for subtitle generation. For example, eye tracking reveals how viewers read text on screen: the order in which they read words, how long they spend on each word, and how often they jump back to re-read earlier text. These viewer reading patterns can help determine subtitle readability and timing.
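The “jumping back to re-read” behavior is known as a regression, and it is straightforward to count once fixations have been extracted. This is a hypothetical sketch assuming a left-to-right script; the pixel threshold and sample data are illustrative.

```python
def count_regressions(fixation_xs, min_jump=30):
    """fixation_xs: ordered x-coordinates of fixations along one subtitle line.
    A regression is a leftward jump larger than min_jump pixels."""
    return sum(1 for a, b in zip(fixation_xs, fixation_xs[1:])
               if a - b > min_jump)

# Reader moves rightward, re-reads once (260 -> 140), then continues.
xs = [100, 180, 260, 140, 220, 300, 380]
print(count_regressions(xs))  # 1
```

A high regression count on a subtitle is one signal that its wording, speed, or segmentation may be causing difficulty.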


Eye tracking technology also helps researchers understand other factors useful for subtitling. One of those factors is “attention distribution,” which shows how viewers divide their attention between subtitles and other visual content. Attention distribution information helps engineers make better decisions about subtitle placement and format.

Another key factor is “cognitive load,” the mental effort a person expends while processing what they see. To estimate it, scientists analyze gaze patterns, i.e., the successive positions of the eyes as people read. Human eyes do not just passively scan the surrounding environment; eye movement is closely linked to attention and to the mental effort of understanding and processing information. The metrics mentioned above, such as pupil dilation, fixation duration, and blink rate, also help measure cognitive load and inform decisions such as subtitling speed.
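As a rough illustration of how those metrics might be combined, here is a naive, relative cognitive-load proxy. The baseline values and the equal weighting are assumptions made for demonstration only; real studies use validated models, not this formula.

```python
def load_index(pupil_mm, mean_fixation_s, blinks_per_min,
               baseline=(3.0, 0.20, 15.0)):
    """Higher pupil dilation and longer fixations suggest more effort,
    while blink rate tends to drop under load. Returns a unitless score;
    a viewer exactly at baseline scores 3.0."""
    p0, f0, b0 = baseline
    return (pupil_mm / p0) + (mean_fixation_s / f0) + (b0 / max(blinks_per_min, 1))

relaxed = load_index(3.0, 0.20, 15.0)   # at baseline -> 3.0
strained = load_index(3.6, 0.30, 10.0)  # dilated pupils, long fixations, fewer blinks
print(relaxed, strained)
```

The point is not the specific formula but the workflow: several eye-derived signals are normalized against a baseline and combined into a comparative measure across subtitle conditions.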

Combining the data obtained for subtitle readability and timing, attention distribution, and cognitive load allows researchers to find a comfortable viewing balance or validate decisions such as displaying subtitles at a faster rate or changing the font color.

For multilingual subtitling, achieving a balance is a big deal. For example, when a target language expands by 20-35% compared to the source language, it is generally accepted that straight translation will not work. Expanded target text instead undergoes adaptation and multiple rounds of additional adjustments to account for readability, time on screen, and other nuances.
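One standard check in that adaptation workflow is reading speed in characters per second (CPS): when a translated line expands but its time on screen stays the same, CPS can exceed a comfortable limit. The sketch below assumes a 17 CPS ceiling, in line with common industry guidelines, though the exact limit varies by language and audience.

```python
def needs_adaptation(text, in_s, out_s, max_cps=17.0):
    """in_s/out_s: subtitle in-cue and out-cue times in seconds.
    Returns (cps, flag); flag is True when the subtitle reads too fast."""
    duration = out_s - in_s
    cps = len(text.replace("\n", "")) / duration
    return cps, cps > max_cps

source = "Please wait here."                     # 17 characters
target = "Bitte warten Sie einen Moment hier."   # expanded translation, 35 characters
print(needs_adaptation(source, 0.0, 1.5))        # comfortably readable
print(needs_adaptation(target, 0.0, 1.5))        # flagged: needs condensing or retiming
```

Eye tracking data is what grounds the `max_cps` threshold itself: if viewers' gaze shows rushed reading or skipped words at a given speed, the limit is too high for that language pair.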

In that context, eye tracking data can reveal whether viewers are struggling to keep up with subtitles due to excessive text, complex vocabulary, or a lack of synchronization between the audio and the text on screen. This data can inform decisions about how much text to include in subtitles for which target languages, and how to simplify the language as needed to reduce cognitive load.

It is known that viewers process subtitles in different languages differently, and eye tracking has helped establish the differences in reading patterns between native-language and foreign-language subtitles.

Eye tracking has also been shown to help assess how viewers process subtitles that are broken up into different lines or segments. This can help optimize the placement of line breaks for easier reading and understanding, particularly in fast-paced scenes or with longer sentences.
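Line-break placement can itself be automated and then validated with gaze data. Here is an illustrative sketch that splits a long subtitle into at most two lines, preferring a balanced break; the 42-character limit reflects common per-line guidelines but is an assumption here.

```python
def break_subtitle(text, max_chars=42):
    """Split text into one or two lines, each within max_chars,
    choosing the break point that balances line lengths."""
    if len(text) <= max_chars:
        return [text]
    words = text.split()
    best, best_diff = None, None
    for i in range(1, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        if len(top) <= max_chars and len(bottom) <= max_chars:
            diff = abs(len(top) - len(bottom))
            if best is None or diff < best_diff:
                best, best_diff = [top, bottom], diff
    return best or [text]  # fall back to one overlong line if no split fits

line = "The quick brown fox jumps over the lazy dog near the riverbank"
print(break_subtitle(line))
```

A more subtitle-aware version would also avoid breaking inside syntactic units (for example, between an article and its noun), which is exactly the kind of rule eye tracking studies on segmentation help justify.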

Analyzing viewers' gaze patterns in relation to the video content helps researchers to fine-tune the timing and synchronization of subtitles. This in turn ensures that translated subtitles appear and disappear when it is most opportune, minimizing the need for viewers to switch between subtitles and other visual elements.

Eye tracking studies can also reveal variations in mental effort among different viewers, based on factors such as language proficiency, familiarity with subtitles, and cognitive abilities. This helps audio-visual production teams to personalize subtitling options that can accommodate individual needs and preferences.

Measuring this mental effort also has important uses for subtitles in applications such as learning interfaces. The resulting data can help user interface designers improve performance and reduce errors when making those interfaces accessible to the hearing impaired or to speakers of other languages.

Henni Paulsen

Henni Paulsen is a language localization operations and technology expert with over two decades of experience in executive and consulting roles across diverse industries. She specializes in fit-for-purpose localization strategies, best practices, and standards. Henni also helps organizations learn about technology implementations that enable and drive international business growth.