In the context of augmented reality, tracking denotes a collection of methods for obtaining the position and orientation (pose) of a user. Combined with suitable display techniques, the pose ensures that graphical information is overlaid correctly onto the perceived reality. Precise estimates of the camera pose are obtained by image processing methods, which usually analyze the pixels of an image and extract features that can be recognized across the image sequence. However, these methods do not take the process of image synthesis into account, or do so only in a very simplified way. In contrast, the class of model-based methods assumes a given 3D model of the observed scene. Based on the model data, features can be identified in order to establish correspondences in the camera image. From these feature correspondences the camera pose is calculated. An interesting approach is the analysis-by-synthesis strategy, which considers the computer graphics rendering process in order to extend the knowledge about the model with information from image synthesis and other environment variables.
In this thesis, the components of a tracking system are identified, and it is analyzed to what extent information about the model, the rendering process and the environment can contribute to these components in order to improve the tracking process using analysis-by-synthesis. In particular, by using knowledge such as topological information, lighting or perspective, feature synthesis and correspondence finding should lead to visually unambiguous features that can be predicted and evaluated for their suitability for stable tracking of the camera pose.
The aim of this thesis is to realize markerless tracking following the analysis-by-synthesis approach while dispensing with feature-based methods. The image of a camera and a synthetic image of the scene are to be altered and aligned by means of stylization techniques so that, for the given camera image, the rendered image that most exactly reproduces the real camera pose can be identified from a selection of rendered images. Combinations of similarity measures and visualizations are examined in order to achieve the best possible comparability of the images, which is intended to increase robustness against tracking errors.
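
The selection step described above, picking from a set of rendered images the one that best matches the camera image, can be illustrated with a minimal sketch. It is not the method of the thesis: normalized cross-correlation is assumed here as one possible similarity measure, images are simplified to flat lists of grayscale values, and the names `ncc`, `best_pose` and the pose labels are hypothetical. The stylization step is omitted.

```python
from math import sqrt


def ncc(a, b):
    """Normalized cross-correlation of two equally sized grayscale images.

    Images are flat lists of intensities (an assumption made for brevity).
    Returns a value in [-1, 1]; 0.0 if either image has no contrast.
    """
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_pose(camera_image, candidates):
    """Return (pose, score) of the rendered candidate most similar to the camera image.

    candidates: list of (pose, rendered_image) pairs, where each pose is the
    hypothetical camera pose that was used to render the image.
    """
    scored = [(pose, ncc(camera_image, image)) for pose, image in candidates]
    return max(scored, key=lambda item: item[1])


# Toy data: a 2x2 camera image and three renderings from hypothetical poses.
camera = [0, 10, 20, 30]
candidates = [
    ("pose_a", [30, 20, 10, 0]),   # inverted intensities
    ("pose_b", [1, 11, 19, 31]),   # nearly identical to the camera image
    ("pose_c", [5, 5, 5, 5]),      # uniform image without contrast
]
pose, score = best_pose(camera, candidates)
```

In a real system the candidate poses would be sampled around the previous pose estimate, and both images would first be stylized so that the chosen similarity measure compares like with like.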