In user interface design and in advertising, knowing where people look is important. After all, you want to make sure your audience sees the most important areas of a screen or an ad. In UI design, knowing the path of users’ visual attention adds to what we can learn from their engagement with a system, beyond the usual observation of mouse and keyboard actions and thinking-aloud protocols. Eye tracking and saliency modeling are two very different techniques that can provide answers, each with its own pros and cons.
Eye Tracking: An Empirical Approach to User Behavior
Eye tracking records where the eyes rest (“fixations”) and how they move from one place to the next (“saccades”). Technically, a camera system tracks the user’s pupils, which is why glasses, hard contact lenses and small pupils can make the pupil hard to detect. The high-cost version uses a high-resolution infrared camera. The low-cost version uses a simple webcam, which tends to have lower resolution. Resolution determines the accuracy of the tracking: with high enough resolution, tracking works even for very small objects that a person is looking at.
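To make the distinction between fixations and saccades concrete, here is a minimal sketch of one common way to separate them: labeling each gaze sample by how fast the gaze moved since the previous sample. The function name, the sample format and the threshold value are assumptions for illustration, not the method of any particular eye tracker.

```python
from math import hypot


def classify_samples(samples, velocity_threshold=100.0):
    """Label each gaze sample as 'fixation' or 'saccade'.

    samples: list of (t_seconds, x_px, y_px) tuples, sorted by time.
    velocity_threshold: speed in px/second above which movement counts as a
        saccade. The default is a hypothetical value; real setups tune it.
    """
    if not samples:
        return []
    # The first sample has no predecessor, so we assume it is a fixation.
    labels = ["fixation"]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        # Gaze speed between two consecutive samples, in px/second.
        speed = hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0
        labels.append("saccade" if speed > velocity_threshold else "fixation")
    return labels
```

Consecutive samples labeled as fixations can then be grouped into a single fixation with a position and a duration, which is the unit that later analyses usually work with.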
Eye tracking is empirical: the data and insights are derived directly from user behavior. Without the person verbalizing their thoughts, we cannot know why they look at certain areas. But asking them to share what they are thinking (the “thinking aloud” protocol) would influence their viewing patterns and therefore the results. In principle, eye tracking provides valid results for as long as it runs, whether that is half a second or two minutes. It can tell us where users look in sequence, which screen areas are more prominent than others, or whether users see the call-to-action button on the screen.
Eye tracking can be conducted in-house or outsourced to professional UX services. When it is outsourced, no equipment needs to be purchased and no staff have to be trained to operate the equipment and produce results. The results can be visualized in different ways, one of which is a heatmap that highlights the areas drawing the most visual attention.
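As a rough illustration of how a heatmap can be derived from fixation data, the sketch below sums fixation durations per grid cell. The function names, the input format and the cell size are assumptions made for this example; commercial tools typically apply smoothing and color mapping on top of something like this.

```python
def fixation_heatmap(fixations, width, height, cell=100):
    """Sum fixation durations per grid cell.

    fixations: list of (x_px, y_px, duration_ms) tuples (hypothetical format).
    width, height: screen size in pixels.
    cell: edge length of one square grid cell in pixels.
    Returns a list of rows; each value is the total dwell time in that cell.
    """
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, duration in fixations:
        # Ignore fixations that fall outside the screen.
        if 0 <= x < width and 0 <= y < height:
            grid[int(y // cell)][int(x // cell)] += duration
    return grid


def hottest_cell(grid):
    """Return (row, col) of the cell with the highest total dwell time."""
    return max(
        ((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
        key=lambda rc: grid[rc[0]][rc[1]],
    )
```

Checking whether the cell that contains the call-to-action button is among the hottest cells is one simple way to read such a grid.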
Saliency Modeling: An Algorithmic Simulation of Fixations and Saccades
Saliency modeling addresses the same questions as eye tracking, but does so algorithmically by simulating a person’s fixations and saccades. The underlying model estimates the saliency of screen areas from the visual properties of their elements, including:
- Contrast: The higher the contrast between a particular element and its background, the higher its saliency.
- Position: Elements that are central on the screen have a higher saliency than elements on the side.
- Size: Large elements draw more visual attention than smaller elements.
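The three properties above can be combined into a toy score, purely to show how such factors might interact. This is not the algorithm of any real saliency tool: the `Element` class, the `toy_saliency` function, the weights and the assumption that contrast is already available as a value between 0 and 1 are all invented for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Element:
    name: str
    x: float         # left edge in px
    y: float         # top edge in px
    w: float         # width in px
    h: float         # height in px
    contrast: float  # 0..1 against the background (assumed precomputed)


def toy_saliency(el, screen_w, screen_h, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of contrast, centrality and relative size (all 0..1).

    The weights are hypothetical and chosen only for this example.
    """
    w_contrast, w_position, w_size = weights
    # Centrality: 1 at the screen center, 0 at a screen corner.
    cx, cy = el.x + el.w / 2, el.y + el.h / 2
    max_dist = hypot(screen_w / 2, screen_h / 2)
    centrality = 1 - hypot(cx - screen_w / 2, cy - screen_h / 2) / max_dist
    # Size: fraction of the screen area the element covers.
    size = min(1.0, (el.w * el.h) / (screen_w * screen_h))
    return w_contrast * el.contrast + w_position * centrality + w_size * size
```

With equal contrast, a button in the middle of the screen scores higher than a small link in the corner, which matches the intuition behind the list above.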
Validation studies have reported high correlations between results derived from saliency modeling and from eye tracking. However, the resolution of saliency modeling is lower than that of infrared eye tracking.
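One simple way to compare a predicted saliency map with an observed eye-tracking heatmap is a cell-by-cell Pearson correlation. The sketch below assumes both maps are grids of the same shape; it illustrates the idea only and is not necessarily the metric the validation studies used.

```python
from math import sqrt


def flatten(grid):
    """Turn a list of rows into one flat list of cell values."""
    return [value for row in grid for value in row]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / sqrt(var_a * var_b)


# Hypothetical 2x2 maps: predicted saliency (0..1) and observed dwell time (ms).
predicted = [[0.9, 0.2], [0.1, 0.0]]
observed = [[500, 120], [60, 10]]
r = pearson(flatten(predicted), flatten(observed))
```

A value of `r` close to 1 means the model ranks screen areas much like real viewers did; the numbers in the two example maps are made up.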
Saliency modeling is not empirical. It requires no test persons, which makes this technique cheaper and faster than eye tracking.
A critical limitation is that the model only covers pre-attentive visual attention, which is the short time when a person first looks at something without consciously thinking about what it is. That duration is only about 250 milliseconds (1/4 of a second), so saliency modeling is not valid beyond that time range.
There are open-source code repositories as well as commercial services that can be used for saliency modeling. Results can be presented in the same way as in eye tracking.
Comparing Eye Tracking and Saliency Modeling
Both eye tracking and saliency modeling can answer questions like:
- What areas of a screen draw the most visual attention?
- In what sequence does a user look at various screen elements?
- What screen areas are more salient than others?
- Do users perceive those screen elements that we want them to see?
Neither technique can answer the question of why people look at certain screen areas. This is comparable to web analytics, where we also know only what users do, not why. Consequently, eye tracking, saliency modeling and web analytics should be complemented with other user research techniques such as usability testing, where test users share their thoughts and sentiments about a product.
The crucial difference between the two techniques is that eye tracking has no upper time limit, while saliency modeling cannot make valid statements beyond the 250-millisecond mark. This has important consequences. Take the advertisement example again: applied to the original ad, both techniques yield similar results (top row in the image). Yet if the image is changed so that the woman looks at the product, viewers follow her gaze to the product, which eye tracking captures but the saliency model does not (bottom row in the image).
Here are more differences between the techniques.
Eye tracking:
- Requires test persons
- High resolution with infrared
- Requires hardware and software
- Valid throughout exposure
- Works in real-time, dynamic contexts
Saliency modeling:
- Does not require test persons
- Only valid within first 250 ms of exposure
- Works off still images
Which Technique Suits Your Organization?
Which technique is better suited for you? It depends on your situation. How much budget and time do you have available? Is this a one-time exercise or something you'll do on a regular basis? How accurate do you need the results to be?
Whatever your choice, do not use these techniques in isolation. Instead, combine them with other UX methods such as usability testing or focus groups, which can shed light on why users look at certain screen areas and elements.