Abstract: Digitization is currently permeating everyday processes, forcing casual computer users to become acquainted with unfamiliar tools. To avoid overstraining these users, interfaces must be simplified and reduced to the functionality and content that are relevant to the individual user. Gaze-contingent systems therefore monitor viewing behavior during natural system interactions to predict relevant interface elements. The prediction performance depends strongly on the underlying features and algorithm, especially when the interface consists of dynamic elements such as videos. In this paper, we conduct two studies with a total of 233 subjects in which we record the viewers’ gaze while they watch videos. We then compare the quality of preference predictions for video elements obtained by majority voting with that obtained by machine learning. Our results indicate that (1) majority voting can predict preferences with an accuracy of up to 73% (66%) for two (four) elements, (2) machine learning improves the accuracy to 82% (74%), (3) prediction accuracy depends on the strength of the user’s preference for an element, and (4) we can rank preferences for individual elements.
Authors: Melanie Heck (University of Mannheim, Germany), Janick Edinger (University of Hamburg, Germany), Jonathan Bünemann (University of Mannheim, Germany), Christian Becker (University of Mannheim, Germany)