Ray tracing acceleration through dedicated data structures has long been an important topic in computer graphics. In general, two different approaches are proposed: spatial and directional acceleration structures. The thesis at hand presents an innovative combined approach of these two areas, which enables a further acceleration of the tracing process of rays. State-of-the-art spatial data structures are used as base structures and enhanced by precomputed directional visibility information based on a sophisticated abstraction concept of shafts within an original structure, the Line Space.
In the course of the work, novel approaches for the precomputed visibility information are proposed: a binary value that indicates whether a shaft is empty or non-empty as well as a single candidate approximating the actual surface as a representative candidate. It is shown how the binary value is used in a simple but effective empty space skipping technique, which allows a performance gain in ray tracing of up to 40% compared to the pure base data structure, regardless of the spatial structure that is actually used. In addition, it is shown that this binary visibility information provides a fast technique for calculating soft shadows and ambient occlusion based on blocker approximations. Although the results contain a certain inaccuracy error, which is also presented and discussed, it is shown that a further tracing acceleration of up to 300% compared to the base structure is achieved. As an extension of this approach, the representative candidate precomputation is demonstrated, which is used to accelerate the indirect lighting computation, resulting in a significant performance gain at the expense of image errors. Finally, techniques based on two-stage structures and a usage heuristic are proposed and evaluated. These reduce memory consumption and approximation errors while maintaining the performance gain and also enabling further possibilities with object instancing and rigid transformations.
All performance and memory values as well as the approximation errors are measured, presented and discussed. Overall, the Line Space is shown to result in a considerable improvement in ray tracing performance at the cost of higher memory consumption and possible approximation errors. The presented findings thus demonstrate the capability of the combined approach and enable further possibilities for future work.
Research has shown that people recognize personality, gender, inner states and many other items of information simply by observing human motion. Expressive human motion therefore seems to be a valuable non-verbal communication channel. In the quest for more believable characters in virtual three-dimensional simulations, a great amount of visual realism has been achieved during the last decades. However, while interacting with synthetic characters in real-time simulations, human users often still sense an unnatural stiffness. This disturbance in believability is generally caused by a lack of human behavior simulation. Expressive motions, which convey personality and emotional states, can be of great help in creating more plausible and life-like characters. This thesis explores the feasibility of automatically generating emotionally expressive animations from given neutral character motions. Such research is required since common animation methods, such as manual modeling or motion capturing techniques, are too costly to create all possible variations of motions needed for interactive character behavior. To investigate how emotions influence human motion, relevant literature from various research fields was reviewed and certain motion rules and features were extracted. These movement domains were validated in a motion analysis and implemented in an exemplary system capable of automating the expression of angry, sad and happy states in a virtual character through its body language. Finally, the results were evaluated in a user test.
Molecular dynamics (MD) as a field of molecular modelling has great potential to revolutionize our knowledge and understanding of complex macromolecular structures. Its field of application is huge, ranging from computational chemistry and biology through material sciences to computer-aided drug design. On the one hand, this thesis provides insights into the underlying physical concepts of molecular dynamics simulations and how they are applied in the MD algorithm, and also briefly illustrates different approaches, for instance the molecular mechanics and molecular quantum mechanics approaches.
On the other hand, a custom all-atom MD algorithm is implemented, using a simplified version of the molecular mechanics based AMBER force field published by Cornell et al. (1995). This simulation algorithm is then used to show, by the example of oxytocin, how the individual energy terms of a force field function. It was observed that applying the bond stretch forces alone caused the molecule to be compacted first in certain regions and then as a whole, and that the molecule moved with increasing flexibility as more energy terms were added.
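As a rough illustration of the bond stretch term mentioned above, the following minimal sketch evaluates a harmonic bond energy of the form E = k (r - r0)^2 together with the forces it exerts on the two bonded atoms. The function name `bond_stretch` and the numeric values for `k` and `r0` are assumptions chosen for this example and are not taken from the thesis.

```python
import math

def bond_stretch(p1, p2, k, r0):
    """Harmonic bond term E = k * (r - r0)**2 and the forces on both atoms.

    k and r0 are hypothetical parameters; force fields tabulate them
    per bond type.
    """
    d = [b - a for a, b in zip(p1, p2)]      # vector from atom 1 to atom 2
    r = math.sqrt(sum(c * c for c in d))     # current bond length
    energy = k * (r - r0) ** 2
    # dE/dr = 2k(r - r0); a stretched bond pulls atom 1 towards atom 2
    scale = 2.0 * k * (r - r0) / r
    f1 = [scale * c for c in d]
    f2 = [-c for c in f1]                    # equal and opposite force
    return energy, f1, f2

# Illustrative values only: a bond stretched from 1.5 to 2.0 length units.
energy, f1, f2 = bond_stretch((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), k=300.0, r0=1.5)
```

With only this term active, every stretched bond pulls its two atoms together and every compressed bond pushes them apart, which fits the observation that bond stretch forces alone drive the molecule towards its equilibrium bond lengths.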
The mitral valve is one of the four valves in the human heart. It is located in the left heart chamber and its function is to control the blood flow from the left atrium to the left ventricle. Pathologies can lead to malfunctions of the valve so that blood can flow back to the atrium. Patients with a faulty mitral valve function may suffer from fatigue and chest pain. The functionality can be surgically restored, which is often a long and exhaustive intervention. Thorough planning is necessary to ensure a safe and effective surgery. This can be supported by creating pre-operative segmentations of the mitral valve. A post-operative analysis can determine the success of an intervention. This work will combine existing and new ideas to propose a new approach to (semi-)automatically create such valve models. The manual part can guarantee a high quality model and reliability, whereas the automatic part contributes to saving valuable labour time.
The main contributions of the automatic algorithm are an estimated semantic separation of the two leaflets of the mitral valve and an optimization process that is capable of finding a coaptation line and area between the leaflets. The segmentation method can perform a fully automatic segmentation of the mitral leaflets if the annulus ring is already given. The intermediate steps of this process will be integrated into a manual segmentation method so that a user can guide the whole procedure. The quality of the valve models generated by the method proposed in this work will be measured by comparing them to completely manually segmented models. This will show that commonly used methods to measure the quality of a segmentation are too general and do not suffice to reflect the real quality of a model. Consequently, the work at hand will introduce a set of measurements that can qualify a mitral valve segmentation in more detail and with respect to anatomical landmarks. Besides the intra-operative support for a surgeon, a segmented mitral valve provides additional benefits. The ability to obtain and objectively describe the valve anatomy for a specific patient may be the basis for future medical research in this field, and automation allows large data sets to be processed with reduced expert dependency. Further, simulation methods that use the segmented models as input may predict the outcome of a surgery.
Bio-medical data comes in various shapes and with different representations. Domain experts use such data for analysis or diagnosis, during research or clinical applications. As the opportunities to obtain or to simulate bio-medical data become more complex and productive, the experts face the problem of data overflow. Providing a reduced, uncluttered representation of data that maintains the data’s features of interest falls into the area of Data Abstraction. Via abstraction, undesired features are filtered out to give space, in terms of the cognitive and visual load of the viewer, to more interesting features, which are therefore accentuated. To address this challenge, the dissertation at hand will investigate methods that deal with Data Abstraction in the fields of liver vasculature, molecular and cardiac visualization. Advanced visualization techniques will be applied for this purpose. This usually requires some pre-processing of the data, which will also be covered by this work. Data Abstraction itself can be implemented in various ways. The morphology of a surface may be maintained, while abstracting its visual cues. Alternatively, the morphology may be changed to a more comprehensible and tangible representation. Further, spatial or temporal dimensions of a complex data set may be projected to a lower-dimensional space in order to facilitate processing of the data. This thesis will tackle these challenges and therefore provide an overview of Data Abstraction in the bio-medical field, and associated challenges, opportunities and solutions.
Object recognition is a well-investigated area in image-based computer vision and several methods have been developed. Approaches based on Implicit Shape Models, which separate objects into fundamental visual object parts and spatial relationships between the individual parts, have recently become popular for recognizing objects in 2D images. This knowledge is then used to identify unknown object instances. However, since the emergence of affordable depth cameras like Microsoft Kinect, recognizing unknown objects in 3D point clouds has become an increasingly important task. In the context of indoor robot vision, an algorithm is developed that extends existing methods based on Implicit Shape Model approaches to the task of 3D object recognition.
On the recognition of human activities and the evaluation of its imitation by robotic systems
(2023)
This thesis addresses the problem of action recognition through the analysis of human motion and the benchmarking of its imitation by robotic systems.
For action recognition, we focus on approaches that generalize well across different sensor modalities. We transform multivariate signal streams from various sensors to a common image representation. The action recognition problem on sequential multivariate signal streams can then be reduced to an image classification task for which we utilize recent advances in machine learning. We demonstrate the broad applicability of our approaches formulated as a supervised classification task for action recognition, a semi-supervised classification task for one-shot action recognition, modality fusion and temporal action segmentation.
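The idea of turning a multivariate signal stream into an image can be sketched as follows. This is only a minimal illustration under stated assumptions: each channel becomes one image row and values are min-max scaled to the range 0..255. The function name `signals_to_image` and the scaling scheme are hypothetical and do not claim to reproduce the representation used in the thesis.

```python
def signals_to_image(streams):
    """Map a multivariate signal (one list per channel) to rows of 0..255 pixels.

    Per-channel min-max scaling is an assumption made for this sketch.
    """
    image = []
    for channel in streams:
        lo, hi = min(channel), max(channel)
        span = (hi - lo) or 1.0   # flat channels would otherwise divide by zero
        image.append([round(255 * (v - lo) / span) for v in channel])
    return image

# Two channels with four time steps each; the second channel is constant.
image = signals_to_image([[0.0, 1.0, 2.0, 3.0], [2.0, 2.0, 2.0, 2.0]])
```

Once every modality is expressed as such a grid of pixel values, the same image classifier can in principle be trained on any of them, which is what makes the representation independent of the sensor.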
For action classification, we use an EfficientNet Convolutional Neural Network (CNN) model to classify the image representations of various data modalities. Further, we present approaches for filtering and the fusion of various modalities on a representation level. We extend the approach to be applicable for semi-supervised classification and train a metric-learning model that encodes action similarity. During training, the encoder optimizes the distances in embedding space for self-, positive- and negative-pair similarities. The resulting encoder allows estimating action similarity by calculating distances in embedding space. At training time, no action classes from the test set are used.
Graph Convolutional Networks (GCNs) generalize the concept of CNNs to non-Euclidean data structures and have shown great success for action recognition when operating directly on spatio-temporal sequences like skeleton sequences. GCNs have recently shown state-of-the-art performance for skeleton-based action recognition but are currently widely neglected as the foundation for the fusion of various sensor modalities. We propose incorporating additional modalities, like inertial measurements or RGB features, into a skeleton graph through fusion on two different dimensionality levels. On a channel dimension, modalities are fused by introducing additional node attributes. On a spatial dimension, additional nodes are incorporated into the skeleton graph.
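The two fusion levels described above can be illustrated on a toy skeleton graph. The sketch below is a simplified assumption of how such a fusion might look on plain Python lists; the helper names `fuse_channel` and `fuse_spatial`, the three-joint skeleton and the inertial values are all hypothetical.

```python
def fuse_channel(node_features, extra_features):
    """Channel-level fusion: append extra attributes to each existing node."""
    return [list(f) + list(e) for f, e in zip(node_features, extra_features)]

def fuse_spatial(node_features, edges, new_node, attach_to):
    """Spatial-level fusion: add a new node and connect it to an existing joint."""
    nodes = [list(f) for f in node_features] + [list(new_node)]
    new_index = len(node_features)
    return nodes, list(edges) + [(attach_to, new_index)]

# Hypothetical skeleton: three joints with (x, y, z) positions, two bones.
joints = [[0.0, 1.0, 0.0], [0.0, 0.5, 0.0], [0.2, 0.5, 0.0]]
bones = [(0, 1), (1, 2)]

# Channel level: one extra inertial value per joint becomes a fourth attribute.
channel_fused = fuse_channel(joints, [[0.1], [0.0], [0.3]])

# Spatial level: an inertial sensor becomes a fourth node attached to joint 2.
nodes, graph_edges = fuse_spatial(joints, bones, [0.3, 0.0, 9.8], attach_to=2)
```

Channel-level fusion keeps the graph topology and widens the feature vector per node, whereas spatial-level fusion keeps the feature width and grows the graph.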
Transformer models showed excellent performance in the analysis of sequential data. We formulate the temporal action segmentation task as an object detection task and use a detection transformer model on our proposed motion image representations. Experiments for our action recognition related approaches are executed on large-scale publicly available datasets. Our approaches for action recognition for various modalities, action recognition by fusion of various modalities, and one-shot action recognition demonstrate state-of-the-art results on some datasets.
Finally, we present a hybrid imitation learning benchmark. The benchmark consists of a dataset, metrics, and a simulator integration. The dataset contains RGB-D image sequences of humans performing movements and executing manipulation tasks, as well as the corresponding ground truth. The RGB-D camera is calibrated against a motion-capturing system, and the resulting sequences serve as input for imitation learning approaches. The resulting policy is then executed in the simulated environment on different robots. We propose two metrics to assess the quality of the imitation. The trajectory metric gives insights into how close the execution was to the demonstration. The effect metric describes how closely the final state of the execution matches the final state of the demonstration. The Simitate benchmark can improve the comparability of imitation learning approaches.
This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, for the participation at the RoboCup@Home 2017 in Nagoya, Japan. A special focus is put on novel system components and the open source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.
This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, for the participation at the RoboCup@Home 2016 in Leipzig, Germany. A special focus is put on novel system components and the open source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.
In this thesis we present an approach to track an RGB-D camera in 6DOF and construct 3D maps. We first acquire, register and synchronize RGB and depth images. After preprocessing we extract FAST features and match them between two consecutive frames. By depth projection we regain the z-value for the inlier correspondences. Afterwards we estimate the camera motion by least-squares 3D point set alignment of the correspondence set. This local motion estimate is incrementally applied to a global transformation. Additionally, we present methods to build maps based on point cloud data acquired by an RGB-D camera. For map creation we use the OctoMap framework and optionally create a colored point cloud map. The system is evaluated with the widespread RGB-D benchmark.
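The step of incrementally applying local motion estimates to a global transformation can be sketched with 4x4 homogeneous matrices. The example below is a minimal illustration, assuming that each frame-to-frame estimate is right-multiplied onto the accumulated pose; the helper names are hypothetical, and pure translations are used only to keep the numbers easy to follow.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Homogeneous transform for a pure translation (rotation left as identity)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# Start at the origin and accumulate two frame-to-frame estimates.
global_pose = translation(0.0, 0.0, 0.0)
for local_motion in (translation(0.1, 0.0, 0.0), translation(0.0, 0.0, 0.25)):
    global_pose = matmul4(global_pose, local_motion)

# The last column of the upper three rows holds the accumulated camera position.
position = [row[3] for row in global_pose[:3]]
```

Because every new estimate is chained onto all previous ones, small errors in the local estimates accumulate over time, which is one reason such systems are evaluated against benchmark trajectories.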
This work describes a novel software tool for visualizing anatomical segmentations of medical images. It was developed as part of a bachelor's thesis project, with a view to supporting research into automatic anatomical brain image segmentation. The tool builds on a widely-used visualization approach for 3D image volumes, where sections in orthogonal directions are rendered on screen as 2D images. It implements novel display modes that solve common problems with conventional viewer programs. In particular, it features a double-contour display mode to aid the user's spatial orientation in the image, as well as modes for comparing two competing segmentation labels pertaining to one and the same anatomical region. The tool was developed as an extension to an existing open-source software suite for medical image processing. The visualization modes are, however, suitable for implementation in the context of other viewer programs that follow a similar rendering approach.
The modified code can be found here: soundray.org/mm-segmentation-visualization.tar.gz.
In recent years, head-mounted displays (HMDs) and their ability to create virtual realities comparable with the real world have moved further into the focus of press coverage and consumers. The reason for this lies in constant improvements in available computing power, the miniaturisation of components and constantly shrinking power consumption. These trends originate in the general technical progress driven by advancements in the smartphone sector. This gives more people than ever access to the components required to create these virtual realities. At the same time, however, there is only limited research that uses the current generation of HMDs, especially research comparing the virtual and the real world. The approach of this thesis is to look into the process of navigating both real and virtual spaces while using modern hardware and software. One key area is spatial and peripheral perception, without which it would be difficult to navigate a given space. The influence of prior real and virtual experiences on these will be another key aspect. The final area of focus is the influence on the emotional state and how it compares to the real world. To research these influences, an experiment using the Oculus Rift DK2 HMD will be conducted in which subjects are guided through a real space as well as a virtual model of it. Data will be gathered quantitatively by means of surveys. Finally, the findings will be discussed based on a statistical evaluation. In these tests, the perception of distances and room size will be compared, along with how it changes depending on the current reality. Furthermore, the influence of prior spatial activities in both the real and the virtual world will be looked into. Lastly, it will be checked how real these virtual worlds are and whether they are sufficiently sophisticated to trigger the same emotional responses as the real world.
The Material Point Method (MPM) has proven to be a very capable simulation method in computer graphics that is able to model materials that were previously very challenging to animate [1, 2]. Beyond the simulation of single materials, the simulation of multiple materials that interact with each other introduces new challenges. This is the focus of this thesis. It will be shown that the self-collision capabilities of the MPM can naturally handle multiple materials interacting in the same scene on a collision basis, even if the materials use distinct constitutive models. This is then extended by porous interaction of materials as in [3], which also integrates easily with MPM. It will furthermore be shown that regular single-grid MPM can be viewed as a subset of this multi-grid approach, meaning that its behavior can also be achieved if multiple grids are used. The porous interaction is generalized to arbitrary materials and freely changeable material interaction terms, yielding a flexible, user-controllable framework that is independent of specific constitutive models. The framework is implemented on the GPU in a straightforward and simple way and takes advantage of the rasterization pipeline to resolve write conflicts, resulting in a portable implementation with wide hardware support, unlike other approaches such as [4].
Tracking is an integral part of many modern applications, especially in areas like autonomous systems and Augmented Reality. There is a wide array of approaches to tracking. One that has only recently become a subject of research is the use of Neural Networks. In the scope of this master's thesis, an application will be developed that uses such a Neural Network for the tracking process. This also requires the creation of training data as well as the creation and training of a Neural Network. Subsequently, the usage of Neural Networks for tracking will be analyzed and evaluated with regard to several aspects. The quality of the tracking for different degrees of freedom will be checked, as well as the impact of the Neural Network on the application's performance. Additionally, the amount of required training data, the influence of the network architecture and the importance of providing depth data as part of the network's input are investigated. This should provide an insight into how relevant this approach could be for adoption in future products.
We introduce linear expressions for unrestricted dags (directed acyclic graphs) and finite deterministic and nondeterministic automata operating on them. Those dag automata are a conservative extension of the T_{u,u}-automata of Courcelle on unranked, unordered trees and forests. Several examples of dag languages acceptable and not acceptable by dag automata and some closure properties are given.
Constituent parsing attempts to extract syntactic structure from a sentence. Such parsing systems are helpful in many NLP applications such as grammar checking, question answering, and information extraction. This thesis is about implementing a constituent parser for the German language using neural networks. In the past, recurrent neural networks have been used to build parsers as well as many other NLP applications. In this work, self-attention neural network modules are used intensively to understand sentences effectively. With multilayered self-attention networks, constituent parsing achieves an F1 score of 93.68%. This is improved even further by using both character and word embeddings as a representation of the input. An F1 score of 94.10% was the best achieved by the constituent parser using only the dataset provided. With the help of external datasets such as German Wikipedia, pre-trained ELMo models are used along with self-attention networks, achieving an F1 score of 95.87%.
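For readers unfamiliar with the metric, the F1 score of a constituent parser is commonly computed over labeled brackets, that is, over (label, start, end) spans shared by the predicted and the gold tree. The sketch below shows this computation on a toy example. The function name `bracket_f1` and the spans are assumptions for illustration, and the evaluation setup used in the thesis may differ in its details.

```python
def bracket_f1(gold, predicted):
    """Labeled bracket F1 over sets of (label, start, end) constituents."""
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)   # share of predicted spans that are correct
    recall = matched / len(gold)           # share of gold spans that were found
    return 2 * precision * recall / (precision + recall)

# Hypothetical spans for a four-token sentence; one predicted label is wrong.
gold = {("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4), ("NP", 3, 4)}
pred = {("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4), ("PP", 3, 4)}
score = bracket_f1(gold, pred)
```

Here three of four spans match in both directions, so precision, recall and F1 all come out at 0.75.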
Texture-based text detection in digital images using wavelet features and support vector machines
(2010)
In this bachelor thesis a new texture-based approach for the detection of text in digital images is presented. The procedure can essentially be divided into two main tasks, the detection of text blocks and the detection of individual words, whereby the individual words are extracted from the detected text blocks. Roughly speaking, the developed method uses multiple support vector machines, which classify candidate text regions of an image as real text regions or not, using wavelet-based features. In the process, the candidate text regions are defined by edge projections with different orientations. The results of the approach are the X/Y coordinates, width and height of rectangular image regions that contain individual words. This knowledge can be processed further, for example by optical character recognition software, to obtain the important and useful text information.
This thesis focuses on the utilization of modern graphics hardware (GPU) for visualization and computation purposes, especially of volumetric data from medical imaging. The considerable increase in raw computing power in recent years has turned commodity systems into high-performance workstations. In combination with the direct rendering capabilities of graphics hardware, "visual computing" and "computational steering" approaches on large data sets have become feasible. In this regard several example applications and concepts such as the "ray textures" have been developed and are discussed in detail. As the amount of data to be processed and visualized is steadily increasing, memory and bandwidth limitations require compact representations of the data. While the compression of image data has been investigated extensively in the past, the thesis addresses possibilities of performing computations directly on the compressed data. Therefore, different categories of algorithms are identified and represented in the wavelet domain. By using special variants of the compressed format, efficient implementations of essential image processing algorithms are possible and demonstrate the potential of the approach. From the technical perspective, the GPU-based framework "Cascada" has been developed in the course of this thesis. The introduction of object-oriented concepts to shader programming, as well as a hierarchical representation of computation and/or visualization procedures led to a simplified utilization of graphics hardware while maintaining competitive performance. This is shown with different implementations throughout the contributions, as well as two clinical projects in the field of diagnosis assistance. On the one hand the semi-automatic segmentation of low-resolution MRI data sets of the human liver is evaluated. On the other hand different possibilities in assessing abdominal aortic aneurysms are discussed; both projects make use of graphics hardware. 
In addition, "Cascada" provides extensions towards recent general-purpose programming architectures and a modular design for future developments.
The goal of this minor thesis is to integrate a robotic arm into an existing robotics software stack. A robot built on top of this stack should be able to participate successfully in the RoboCup@Home league. The robot Lisa (Lisa is a service android) needs to manipulate objects, lifting them from shelves or handing them to people. Up to now, the only possibility to do this was a small gripper attached to the robot platform. A "Katana Linux Robot" from the Swiss manufacturer Neuronics has been added to the robot for this thesis. This arm needs driver software and a path planner, so that the arm can reach its goal object "intelligently", avoiding obstacles and creating smooth, natural motions.
While Virtual Reality has been around for decades, it has gained new life in recent years. The release of the first consumer hardware devices allows fully immersive and affordable VR for the user at home. This availability led to a new focus of research on technical problems as well as psychological effects. The concepts of presence, which describes the feeling of being in the virtual place, and body ownership, as well as their impact, have been central topics in research for a long time and are still not fully understood.
To enable further research in the area of Mixed Reality, we want to introduce a framework that integrates the user's body and surroundings into a visually coherent virtual environment. As the first of two main aspects, we want to merge real and virtual objects into a shared environment in such a way that they are no longer visually distinguishable. To achieve this, the main focus is not on high graphical fidelity but on a simplified representation of reality. The essential question is: what level of visual realism is necessary to create a believable mixed reality environment that induces a sense of presence in the user? The second aspect considers the integration of virtual persons. Can characters be recorded and replayed in such a way that they are perceived as believable entities of the world and therefore act as a part of the user's environment?
The purpose of this thesis was the development of a framework called Mixed Reality Embodiment Platform. This initial system implements fundamental functionalities to be used as a basis for future extensions to the framework. We also provide a first application that enables user studies to evaluate the framework and contribute to the aforementioned research questions.