The availability of digital cameras and the possibility of taking photos at no cost have led to a growing number of digital photos online and on private computers. The sheer amount of data calls for approaches that support users in managing their photos. Because the automatic understanding of photo content is still an unsolved task, metadata is needed to support management tasks such as search and photo work such as the creation of photo books. This metadata either describes the depicted scene textually or records how good or interesting a photo is.
This thesis investigates an approach for creating metadata without additional effort for the user. Eye tracking data is used to measure human visual attention, and this attention is analyzed in order to derive metadata. The gaze paths of users working with photos are recorded, for example, while they search for photos or simply view photo collections.
Eye tracking hardware has developed rapidly in recent years. Falling prices for sensor hardware such as cameras and growing competition in the eye tracker market are driving prices down and improving usability. It can be assumed that eye tracking technology will soon be available in everyday devices such as laptops or mobile phones. Exploiting data that is recorded in the background while users perform everyday tasks with photos has great potential to generate information without additional effort for the users.
The first part of this work deals with the labeling of image regions by means of gaze data in order to describe the depicted scenes in detail. Labeling takes place by assigning object names to specific photo regions. In total, three experiments were conducted to investigate the quality of these assignments in different contexts. In the first experiment, users decided whether a given object could be seen in a photo and indicated their decision by pressing a button. In the second experiment, participants searched for specific photos in an image search application. In the third experiment, gaze data was collected from users playing a game in which they classified photos according to given categories. The results of the experiments showed that gaze-based region labeling outperforms baseline approaches in various contexts.
In the second part, the most important photos in a collection are identified by means of visual attention in order to create individual photo selections. Users freely viewed the photos of a collection without any specific instruction on what to fixate, while their gaze paths were recorded. Comparing gaze-based and baseline photo selections with manually created selections shows the value of eye tracking data for identifying important photos.
In the analysis of the data, the characteristics of gaze data have to be considered, for example, its inaccuracy and ambiguity. Aggregating the gaze data collected from several users is one suggested approach for dealing with this kind of data.
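The idea of aggregating gaze data from several users to identify important photos can be illustrated with a minimal sketch. It is not the method used in the thesis: the function name `select_photos`, the input format, and the choice of normalized total fixation duration as the attention measure are all assumptions made for illustration.

```python
from collections import defaultdict


def select_photos(fixations_by_user, k):
    """Return the ids of the k photos with the highest aggregated attention.

    fixations_by_user maps a user id to a list of (photo_id, duration_ms)
    fixations. This input format is a hypothetical one chosen for the sketch.
    """
    scores = defaultdict(float)
    for fixations in fixations_by_user.values():
        # Sum this user's fixation durations per photo.
        per_photo = defaultdict(float)
        for photo_id, duration_ms in fixations:
            per_photo[photo_id] += duration_ms
        total = sum(per_photo.values())
        if total == 0:
            continue  # this user contributed no usable gaze data
        # Normalize per user so that slow and fast viewers carry equal weight.
        for photo_id, duration in per_photo.items():
            scores[photo_id] += duration / total
    # Highest score first; ties are broken by photo id for a stable result.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:k]
```

Normalizing within each user before summing is one simple way to keep a single long viewing session from dominating the aggregate; other weighting schemes are equally plausible.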
The results of the experiments show the value of gaze data as a source of information. Gaze data makes it possible to benefit from human abilities in areas where algorithms still do not perform satisfactorily.
Towards Improving the Understanding of Image Semantics by Gaze-based Tag-to-Region Assignments
(2011)
Eye trackers have been used in the past to identify visual foci in images, find task-related image regions, or localize affective regions in images. However, they have not been used for identifying specific objects in images. In this paper, we investigate whether tags describing specific objects can be assigned to the image regions showing these objects by analyzing the users' gaze paths. To this end, we conducted an experiment with 20 subjects, each viewing 50 image-tag pairs. We compared the tag-to-region assignments for nine existing and four new fixation measures. In addition, we investigated the impact of extending region boundaries, weighting small image regions, and varying the number of subjects viewing the images. The paper shows that a tag-to-region assignment with an accuracy of 67% can be achieved by using gaze information. In addition, we show that multiple regions on the same image can be differentiated with an accuracy of 38%.
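A tag-to-region assignment of the kind described above could be sketched as follows. The abstract does not specify the thirteen fixation measures or how boundaries are extended and small regions weighted, so everything here is an assumption: rectangular regions, total fixation duration as the single measure, a fixed pixel `margin` for boundary extension, and a linear `small_region_bonus`. The names `Region` and `assign_tag` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned image region (an assumption; real regions may be polygons)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def area(self):
        return (self.right - self.left) * (self.bottom - self.top)

    def contains(self, x, y, margin=0.0):
        # The margin extends the region boundary on all four sides.
        return (self.left - margin <= x <= self.right + margin
                and self.top - margin <= y <= self.bottom + margin)


def assign_tag(regions, fixations, margin=0.0, small_region_bonus=0.0):
    """Return the name of the region that attracted the most gaze, or None.

    fixations is an iterable of (x, y, duration_ms) tuples, pooled over all
    subjects who viewed the image together with the tag.
    """
    fixations = list(fixations)
    max_area = max((r.area() for r in regions), default=0.0)
    best_name, best_score = None, 0.0
    for region in regions:
        # Illustrative measure: total fixation duration inside the region.
        duration = sum(d for x, y, d in fixations
                       if region.contains(x, y, margin))
        # Optionally favor small regions, which are harder to fixate precisely.
        weight = 1.0
        if max_area > 0:
            weight += small_region_bonus * (1.0 - region.area() / max_area)
        score = duration * weight
        if score > best_score:
            best_name, best_score = region.name, score
    return best_name
```

For example, with a large "sky" region and a small "dog" region, a positive `small_region_bonus` can tip the assignment toward the dog even when the sky received slightly more total fixation time.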