Institut für Computervisualistik
This thesis explores a 3D object detection and pose estimation approach based on the point pair features method presented by Drost et al. [Dro+10]. While pose estimation methods have improved considerably, pose estimation remains a crucial problem in the field of computer vision. In this work, we implemented a program that takes point cloud scenes as input and returns the detected objects with their estimated poses. The program covers a full object detection pipeline: during an offline phase, it processes 3D models, extracts their point pair features and creates a global descriptor from them. During an online phase, the same features are extracted from a point cloud scene and matched to the model features. After the voting scheme, potential poses of the object are retrieved. The poses are then clustered and post-processed to deliver a final result. The program was tested using simulated and real data. We evaluate these tests and present the final results by discussing the achieved accuracy of the detections and the estimated poses.
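As a rough illustration of the offline phase described above, the following minimal sketch computes the four-dimensional point pair feature (distance between two oriented points and three angles) and stores quantized features in a hash table. It is not the thesis's program: the function names, the brute-force pairing and the quantization steps `dist_step` and `angle_step` are assumptions chosen for this example.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def _angle(a, b):
    """Angle in [0, pi] between two non-zero 3D vectors."""
    cos = sum(x * y for x, y in zip(a, b)) / (_norm(a) * _norm(b))
    return math.acos(max(-1.0, min(1.0, cos)))

def point_pair_feature(p1, n1, p2, n2):
    """Return (|d|, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1."""
    d = _sub(p2, p1)
    return (_norm(d), _angle(n1, d), _angle(n2, d), _angle(n1, n2))

def quantize(feature, dist_step, angle_step):
    """Discretize a feature so that similar point pairs share one hash key."""
    dist, a1, a2, a3 = feature
    return (int(dist / dist_step), int(a1 / angle_step),
            int(a2 / angle_step), int(a3 / angle_step))

def build_model_table(points, normals, dist_step=0.05,
                      angle_step=math.radians(12)):
    """Map each quantized feature to the ordered model point pairs producing it.

    The step sizes are illustrative assumptions, not values from the thesis.
    """
    table = {}
    for i in range(len(points)):
        for j in range(len(points)):
            if i == j:
                continue
            feature = point_pair_feature(points[i], normals[i],
                                         points[j], normals[j])
            key = quantize(feature, dist_step, angle_step)
            table.setdefault(key, []).append((i, j))
    return table
```

In the online phase, scene point pairs would be quantized in the same way and looked up in this table to cast votes for candidate poses; that voting step is not shown here.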
Augmented reality (AR) applications typically extend the user's view of the real world with virtual objects.
In recent years, AR has gained increasing popularity and attention, which has led to improvements in the required technologies. AR has become available to almost everyone.
Researchers have made great progress towards the goal of believable AR, in which the real and virtual worlds are combined seamlessly.
They mainly focus on issues like tracking, display technologies and user interaction, and give little attention to visual and physical coherence when real and virtual objects are combined. For example, virtual objects should not only respond to the user's input; they should also interact with real objects. Generally, AR becomes more believable and realistic if virtual objects appear fixed or anchored in the real scene, appear indistinguishable from the real scene, and respond to any changes within it.
This thesis examines three challenges in the field of computer vision to meet the goal of a believable combined world in which virtual objects appear and behave like real objects.
Firstly, the thesis concentrates on the well-known tracking and registration problem. The tracking and registration challenge is discussed and an approach is presented to estimate the position and viewpoint of the user so that virtual objects appear fixed in the real world. Appearance-based line models, which keep only relevant edges for tracking purposes, enable absolute registration in the real world and provide robust tracking. On the one hand, there is no need to spend much time creating suitable models manually. On the other hand, the tracking can deal with changes within the object or the scene to be tracked. Experiments have shown that the use of appearance-based line models improves the robustness, accuracy and re-initialization speed of the tracking process.
Secondly, the thesis deals with the subject of reconstructing the surface of a real environment and presents an algorithm to optimize an ongoing surface reconstruction. A complete 3D surface reconstruction of the target scene offers new possibilities for creating more realistic AR applications. Several interactions between real and virtual objects, such as collision and occlusions, can be handled with physical correctness. Whereas previous methods focused on improving surface reconstructions offline after a capturing step, the presented method de-noises, extends and fills holes during the capturing process. Thus, users can explore an unknown environment without any preparation tasks such as moving around and scanning the scene, and without having to deal with the underlying technology in advance. In experiments, the approach provided realistic results where known surfaces were extended and filled in plausibly for different surface types.
Finally, the thesis focuses on handling occlusions between the real and virtual worlds more realistically, by re-interpreting the occlusion challenge as an alpha matting problem. The presented method overcomes limitations in state-of-the-art methods by estimating a blending coefficient per pixel of the rendered virtual scene, instead of calculating only their visibility. In several experiments and comparisons with other methods, occlusion handling through alpha matting worked robustly and overcame limitations of low-cost sensor data; it also outperformed previous work in terms of quality, realism and practical applicability.
The method can deal with noisy depth data and yields realistic results in regions where foreground and background are not strictly separable (e.g. caused by fuzzy objects or motion blur).
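The difference between a per-pixel blending coefficient and a plain visibility test can be sketched as follows. This is a minimal illustration of standard alpha compositing only; how the thesis estimates the coefficient (the matting step itself) is not shown, and all function names are hypothetical.

```python
def composite_pixel(virtual_rgb, real_rgb, alpha):
    """Blend a virtual pixel over a camera pixel.

    alpha = 1.0 shows only the virtual pixel, alpha = 0.0 only the real one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * v + (1.0 - alpha) * r
                 for v, r in zip(virtual_rgb, real_rgb))

def binary_visibility(virtual_depth, real_depth):
    """Hard depth test: the virtual pixel is fully visible or fully hidden."""
    return 1.0 if virtual_depth < real_depth else 0.0

def composite_image(virtual, real, alphas):
    """Apply composite_pixel to images given as nested lists of RGB tuples."""
    return [[composite_pixel(v, r, a) for v, r, a in zip(vrow, rrow, arow)]
            for vrow, rrow, arow in zip(virtual, real, alphas)]
```

With `binary_visibility`, a pixel on a fuzzy or motion-blurred boundary is forced to 0 or 1; a fractional coefficient lets such a pixel show a mixture of both worlds.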
The development of a game engine is considered a non-trivial problem [3]. The architecture of such simulation software must be able to manage large amounts of simulation objects in real-time while dealing with “crosscutting concerns” [3, p. 36] between subsystems. The use of object oriented paradigms to model simulation objects in class hierarchies has been reported as incompatible with constantly changing demands during game development [2, p. 9], resulting in anti-patterns and eventual, messy refactoring [13].
Alternative architectures using data oriented paradigms revolving around object composition and aggregation have been proposed as a result. [13, 9, 1, 11]
This thesis describes the development of such an architecture with the explicit goals to be simple, inherently compatible with data oriented design, and to make reasoning about performance characteristics possible. Concepts are formally defined to help analyze the problem and evaluate results. A functional implementation of the architecture is presented together with use cases common to simulation software.
Statistical Shape Models (SSMs) are one of the most successful tools in 3D image analysis and especially medical image segmentation. By modeling the variability of a population of training shapes, the statistical information inherent in such data is used for automatic interpretation of new images. However, building a high-quality SSM requires manually generated ground truth data from clinical experts. Unfortunately, the acquisition of such data is a time-consuming, error-prone and subjective process. Due to this effort, the majority of SSMs are based on a limited set of ground truth training data, which makes the models less statistically meaningful. On the other hand, image data itself is abundant in clinics from daily routine. In this work, methods for automatically constructing a reliable SSM without the need for manual image interpretation by experts are proposed. Thus, the training data is assumed to be the result of any segmentation algorithm or may originate from other sources, e.g. non-expert manual delineations. Depending on the algorithm, the output segmentations will contain errors to a higher or lower degree. In order to account for these errors, areas with a low probability of being a boundary should be excluded from the training of the SSM. To this end, the probabilities are estimated with the help of image-based approaches. By including many shape variations, the corrupted parts can be statistically reconstructed. Two approaches for reconstruction are proposed: an imputation method and Weighted Robust Principal Component Analysis (WRPCA). This allows the inclusion of many data sets from clinical routine, covering far more shape variations. To assess the quality of the models, which are robust against erroneous training shapes, an evaluation compares their generalization ability and specificity to those of a model built from ground truth data.
The results show that WRPCA in particular is a powerful tool for handling corrupted parts and yields reasonable models of higher quality than the initial segmentations.
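For readers unfamiliar with SSMs, the sketch below shows the basic idea with ordinary PCA on clean data: each training shape is a flat list of landmark coordinates, the model consists of the mean shape plus the dominant direction of variation, and new shapes are synthesized along that direction. It deliberately does not implement the imputation method or WRPCA; the power iteration and all names are assumptions made for this example.

```python
import math

def mean_shape(shapes):
    """Average the training shapes coordinate by coordinate."""
    n = len(shapes)
    return [sum(s[k] for s in shapes) / n for k in range(len(shapes[0]))]

def first_mode(shapes, iterations=200):
    """Return (mean, unit mode, variance) of the dominant shape variation.

    Uses power iteration on the sample covariance without forming the matrix.
    Assumes the start vector is not orthogonal to the dominant mode.
    """
    mean = mean_shape(shapes)
    centered = [[x - m for x, m in zip(s, mean)] for s in shapes]
    dim = len(mean)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [0.0] * dim
        for c in centered:
            proj = sum(a * b for a, b in zip(c, v))
            for k in range(dim):
                w[k] += proj * c[k]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variation in the training set
            break
        v = [x / norm for x in w]
    projections = [sum(a * b for a, b in zip(c, v)) for c in centered]
    variance = sum(p * p for p in projections) / (len(shapes) - 1)
    return mean, v, variance

def synthesize(mean, mode, b):
    """Generate a shape as mean + b * mode."""
    return [m + b * p for m, p in zip(mean, mode)]
```

A full model would keep several modes and limit each coefficient `b` to a few standard deviations of its mode so that only plausible shapes are generated.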
Six and Gimmler have identified concrete capabilities that enable users to use the Internet in a competent way. Their media competence model can be used for the didactical design of media usage in secondary schools. However, the special challenge of security awareness is not addressed by the model. In this paper, the important dimension of risk and risk assessment is introduced into the model. This is especially relevant for risks concerning the protection of personal data and privacy. This paper applies the method of IT risk analysis in order to select those dimensions of the Six/Gimmler media competence model that are appropriate to describe privacy-aware Internet usage. Privacy-risk-aware decisions for or against Internet usage are made visible by the trust model of Mayer et al. The privacy extension of the competence model will lead to a measurement of the existing privacy awareness in secondary schools, which, in turn, can serve as the basis for a didactically well-reasoned design of Informatics modules in secondary schools. This paper provides the privacy-extended competence model, while empirical measurement and module design are planned for further research activities.
The following thesis analyses the functionality and programming capabilities of compute shaders. For this purpose, chapter 2 gives an introduction to compute shaders by showing how they work and how they can be programmed. In addition, the interaction of compute shaders and OpenGL 4.3 is shown through two introductory examples. Chapter 3 describes an N-body simulation that has been implemented in order to show the computational power of compute shaders and the use of shared memory. Chapter 4 then shows how compute shaders can be used for physical simulations and where problems may arise. Chapter 5 describes a specially conceived and implemented algorithm for detecting lines in images and compares it with the Hough transform. Lastly, a final conclusion is drawn in chapter 6.
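To make the N-body workload concrete, here is a minimal CPU sketch of the direct-summation step that such a simulation typically parallelizes. It is not the thesis's shader code; the gravitational constant `g`, the `softening` term and the integration scheme are assumptions chosen for this example.

```python
def accelerations(positions, masses, g=1.0, softening=1e-3):
    """Direct O(n^2) summation: each body accumulates the pull of all others."""
    result = []
    for i, pi in enumerate(positions):
        ax = ay = az = 0.0
        for j, pj in enumerate(positions):
            if i == j:
                continue
            dx, dy, dz = pj[0] - pi[0], pj[1] - pi[1], pj[2] - pi[2]
            # Softening avoids a division by zero when two bodies coincide.
            dist_sq = dx * dx + dy * dy + dz * dz + softening * softening
            scale = g * masses[j] / (dist_sq * dist_sq ** 0.5)
            ax += dx * scale
            ay += dy * scale
            az += dz * scale
        result.append((ax, ay, az))
    return result

def step(positions, velocities, masses, dt):
    """Semi-implicit Euler: update velocities first, then positions."""
    acc = accelerations(positions, masses)
    new_velocities = [tuple(v + a * dt for v, a in zip(vel, a_i))
                      for vel, a_i in zip(velocities, acc)]
    new_positions = [tuple(p + v * dt for p, v in zip(pos, vel))
                     for pos, vel in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

In a compute shader, the outer loop over `i` would usually become one invocation per body, and a work group would commonly stage blocks of positions in shared memory so that each position is fetched from global memory only once per group.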
This work covers techniques for interactive and physically based rendering of hair for computer generated imagery (CGI). To this end, techniques for the simulation and approximation of the interaction of light with hair are derived and presented. Furthermore, it is described how hair, despite such computationally expensive algorithms, can be rendered interactively. Techniques for computing the shadowing in hair as well as approaches to render hair as transparent geometry are also presented. A main focus of this work is the DBK-Buffer, which was conceived, implemented and evaluated. Using the DBK-Buffer, it is possible to render thousands of hairs as transparent geometry without being dependent on either the newest GPU hardware generation or a great amount of video memory. Moreover, a comprehensive evaluation of all the techniques described was conducted with respect to visual quality, performance and memory requirements. This revealed that hair can be rendered physically based at interactive or even real-time frame rates.
Computed tomography (CT) and magnetic resonance imaging (MRI) in the medical area deliver huge amounts of data, which doctors have to handle in a short time. These data can be visualised efficiently with direct volume rendering. Consequently, most direct volume rendering applications on the market are specialised in medical tasks or integrated in medical visualisation environments. Highly evolved applications for tasks like diagnosis or surgery simulation are available in this area. In recent years, however, another area has been making increasing use of computed tomography. Companies like phoenix|x-ray, founded in 1999, produce CT scanners especially dedicated to industrial applications like non-destructive material testing (NDT). Of course, an application like NDT has different demands on the visualisation than a typical medical application. For example, a typical task for non-destructive testing would be to highlight air inclusions (pores) in a casting. These inclusions usually cover a very small area and are very hard to classify based only on their density value, as this would also highlight the air around the casting. This thesis presents multiple approaches to improve the rendering of industrial CT data, most of them based on higher-dimensional transfer functions. To this end, the existing volume renderer application of VRVis was extended with a user interface to create such transfer functions, and existing render modes were adapted to profit from the new transfer functions. These approaches are especially suited to improve the visualisation of surfaces and material boundaries as well as pores. The resulting renderings make it very easy to identify these features while preserving interactive frame rates.
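One common example of a higher-dimensional transfer function uses the gradient magnitude as a second axis next to density, which makes material boundaries selectable. The sketch below illustrates only this general idea; it is not the VRVis implementation, and the thresholds, colours and function names are assumptions chosen for this example.

```python
def gradient_magnitude(volume, x, y, z):
    """Central-difference gradient magnitude at an interior voxel.

    The volume is a nested list indexed as volume[z][y][x].
    """
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    return (gx * gx + gy * gy + gz * gz) ** 0.5

def transfer_function_2d(density, grad_mag,
                         density_range=(0.4, 0.6), grad_threshold=0.2):
    """Map (density, gradient magnitude) to an RGBA tuple.

    Only voxels in the density range with a strong gradient, i.e. likely
    material boundaries, become visible; everything else is transparent.
    """
    low, high = density_range
    if low <= density <= high and grad_mag >= grad_threshold:
        return (1.0, 0.5, 0.0, 0.8)
    return (0.0, 0.0, 0.0, 0.0)
```

A one-dimensional transfer function would assign the same colour to every voxel of a given density; the second axis allows homogeneous regions and boundaries of the same density to be treated differently.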
This paper introduces Vocville, a casual online game for learning vocabulary. I am creating this application for my master's thesis as part of my studies as a "Computervisualist" (computer vision) at the University of Koblenz-Landau. The application is an online browser game based on the idea of the highly successful Facebook game FarmVille. The application is separated into two parts: a Grails application manages a database which holds the game objects such as vocabulary, and a Flex/Flash application generates the actual game from these data. The user can create their own home with everything in it. To create an object, the user has to give the correct translation of the object they want to create several times. After every query, the user has to wait a certain amount of time before being queried again. When the correct answer has been given a sufficient number of times, the object is built. After building one object, the user is allowed to build others. After building enough objects in one area (e.g. a room or a street), the user can activate other areas by translating all the vocabulary of the previous area. Users can also interact with other users by adding them as neighbors and then visiting their homes or sending them gifts, for which they have to fill in the correct word in a given sentence.