Refine
Year of publication
Document Type
- Master's Thesis (16)
- Bachelor Thesis (12)
- Doctoral Thesis (9)
- Part of Periodical (7)
- Diploma Thesis (5)
- Study Thesis (2)
- Conference Proceedings (1)
Language
- English (52) (remove)
Has Fulltext
- yes (52) (remove)
Keywords
- virtual reality (3)
- Bildverarbeitung (2)
- Computer Graphics (2)
- Computergraphik (2)
- Graphik (2)
- Line Space (2)
- OpenGL (2)
- Volumen-Rendering (2)
- tracking (2)
- Acceleration Structures (1)
- Action Recognition (1)
- Action Segmentation (1)
- Adobe Flex (1)
- Automatische Klassifikation (1)
- Avatar (1)
- Bildanalyse (1)
- Bildsegmentierung (1)
- Blickpunktabhängig (1)
- C++ (1)
- Casual Games (1)
- Coloskopie (1)
- Compute Shader (1)
- Computer Vision (1)
- Computer assisted communication (1)
- Computeranimation (1)
- Computertomografie (1)
- Computervisualistik (1)
- DTI (1)
- Darmpolyp (1)
- Data compression (1)
- Datenkompression (1)
- Deep Metric Learning (1)
- Diagnoseunterstützung (1)
- Diagnosis assistance (1)
- Diffusionsbildgebung (1)
- Digitale Bilder (1)
- ECSA (1)
- Entity Component System Architecture (1)
- Fabric Simulation (1)
- Facebook Application (1)
- Fiber Tracking (1)
- GPU (1)
- Gefäßanalyse (1)
- Gefühl (1)
- Gehirn (1)
- Grafikkarte (1)
- Grafikprogrammierung (1)
- Grails (1)
- Grails 1.2 (1)
- Graphicsprogramming (1)
- Graphik-Hardware (1)
- Human motion (1)
- IceCube (1)
- Image Processing (1)
- Image Understanding (1)
- Imitation Learning (1)
- Industrial-CT (1)
- Informatik (1)
- Inpainting-Verfahren (1)
- Konturfindung (1)
- Leichte Sprache (1)
- Linespace (1)
- Maschinelles Lernen (1)
- Material Point Method (1)
- MeVisLab (1)
- Merkmalsdetektion (1)
- Mitral Valve (1)
- Mitralklappe (1)
- Motion Capturing (1)
- Multidimensional (1)
- Multimodal Action Recognition (1)
- Multimodal Medical Image Analysis Cochlea Spine Non-rigid Registration Segmentation ITK VTK 3D Slicer CT MRI CBCT (1)
- Multiple Object Tracking (1)
- N-Body Simulation (1)
- N-Körper Simulation (1)
- Neutino (1)
- Objektentfernung (1)
- One-Shot Action Recognition (1)
- OpenGL Shading Language (1)
- Pattern Recognition (1)
- Pfadplanung (1)
- Physiksimulation (1)
- Programmierung (1)
- Random Finite Sets (1)
- Raytracing (1)
- Reflections (1)
- Reflektionen (1)
- Rendering (1)
- RoboCup (1)
- Robotik (1)
- Robust Principal Component Analysis (1)
- Sand (1)
- Schnee (1)
- Segmentation (1)
- Segmentierung (1)
- Shader (1)
- Social Games (1)
- Software Engineering (1)
- Specular (1)
- Statistical Shape Model (1)
- Stoffsimulation (1)
- Text (1)
- Texterkennung (1)
- Tracking-System (1)
- Transferfunction (1)
- Transferfunktion (1)
- Ultraschall (1)
- Ultrasound (1)
- Unterwasser-Pipeline (1)
- Unterwasserfahrzeug (1)
- Unterwasserkabel (1)
- VIACOBI (1)
- Vascular analysis (1)
- Virtual characters (1)
- Virtuelle Realität (1)
- Vocabulary Trainer (1)
- Volume Hatching (1)
- Wavelet (1)
- directed acyclic graphs (1)
- finite state automata (1)
- image processing (1)
- image warping (1)
- leap motion (1)
- machine learning (1)
- media competence model (1)
- multidimensional (1)
- natural language generation (1)
- path planning (1)
- performance optimization (1)
- plain language (1)
- privacy and personal data (1)
- privacy competence model (1)
- regular dag languages (1)
- risk (1)
- robotics (1)
- scaffolded writing (1)
- scene analysis (1)
- security awareness (1)
- stereoscopic rendering (1)
- volume rendering (1)
- warp divergence (1)
Institute
- Institut für Computervisualistik (52) (remove)
With the appearance of modern virtual reality (VR) headsets on the consumer market, there has been the biggest boom in the history of VR technology. Naturally, this was accompanied by an increasing focus on the problems of current VR hardware. Especially the control in VR has always been a complex topic.
One possible solution is the Leap Motion, a hand tracking device that was initially developed for desktop use, but with the last major software update it can be attached to standard VR headsets. This device allows very precise tracking of the user’s hands and fingers and their replication in the virtual world.
The aim of this work is to design virtual user interfaces that can be operated with the Leap Motion to provide a natural method of interaction between the user and the VR environment. After that, subject tests are performed to evaluate their performance and compare them to traditional VR controllers.
This thesis addresses the automated identification and localization of a time-varying number of objects in a stream of sensor data. The problem is challenging due to its combinatorial nature: If the number of objects is unknown, the number of possible object trajectories grows exponentially with the number of observations. Random finite sets are a relatively new theory that has been developed to derive at principled and efficient approximations. It is based around set-valued random variables that contain an unknown number of elements which appear in arbitrary order and are themselves random. While extensively studied in theory, random finite sets have not yet become a leading paradigm in practical computer vision and robotics applications. This thesis explores random finite sets in visual tracking applications. The first method developed in this thesis combines set-valued recursive filtering with global optimization. The problem is approached in a min-cost flow network formulation, which has become a standard inference framework for multiple object tracking due to its efficiency and optimality. A main limitation of this formulation is a restriction to unary and pairwise cost terms. This circumstance makes integration of higher-order motion models challenging. The method developed in this thesis approaches this limitation by application of a Probability Hypothesis Density filter. The Probability Hypothesis Density filter was the first practically implemented state estimator based on random finite sets. It circumvents the combinatorial nature of data association itself by propagation of an object density measure that can be computed efficiently, without maintaining explicit trajectory hypotheses. In this work, the filter recursion is used to augment measurements with an additional hidden kinematic state to be used for construction of more informed flow network cost terms, e.g., based on linear motion models. The method is evaluated on public benchmarks where a considerate improvement is achieved compared to network flow formulations that are based on static features alone, such as distance between detections and appearance similarity. A second part of this thesis focuses on the related task of detecting and tracking a single robot operator in crowded environments. Different from the conventional multiple object tracking scenario, the tracked individual can leave the scene and later reappear after a longer period of absence. Therefore, a re-identification component is required that picks up the track on reentrance. Based on random finite sets, the Bernoulli filter is an optimal Bayes filter that provides a natural representation for this type of problem. In this work, it is shown how the Bernoulli filter can be combined with a Probability Hypothesis Density filter to track operator and non-operators simultaneously. The method is evaluated on a publicly available multiple object tracking dataset as well as on custom sequences that are specific to the targeted application. Experiments show reliable tracking in crowded scenes and robust re-identification after long term occlusion. Finally, a third part of this thesis focuses on appearance modeling as an essential aspect of any method that is applied to visual object tracking scenarios. Therefore, a feature representation that is robust to pose variations and changing lighting conditions is learned offline, before the actual tracking application. This thesis proposes a joint classification and metric learning objective where a deep convolutional neural network is trained to identify the individuals in the training set. At test time, the final classification layer can be stripped from the network and appearance similarity can be queried using cosine distance in representation space. This framework represents an alternative to direct metric learning objectives that have required sophisticated pair or triplet sampling strategies in the past. The method is evaluated on two large scale person re-identification datasets where competitive results are achieved overall. In particular, the proposed method better generalizes to the test set compared to a network trained with the well-established triplet loss.
The automatic detection of position and orientation of subsea cables and pipelines in camera images enables underwater vehicles to make autonomous inspections. Plants like algae growing on top and nearby cables and pipelines however complicate their visual detection: the determination of the position via border detection followed by line extraction often fails. Probabilistic approaches are here superior to deterministic approaches. Through modeling probabilities it is possible to make assumptions on the state of the system even if the number of extracted features is small. This work introduces a new tracking system for cable/pipeline following in image sequences which is based on particle filters. Extensive experiments on realistic underwater videos show robustness and performance of this approach and demonstrate advantages over previous works.
Deformable Snow Rendering
(2019)
Accurate snow simulation is key to capture snow's iconic visuals. Intricate
methods exist that attempt to grasp snow behaviour in a holistic manner. Computational complexity prevents them from reaching real-time performance. This thesis presents three techniques making use of the GPU that focus on the deformation of a snow surface in real-time. The approaches are examined by their ability to scale with an increasing number of deformation actors and their visual portrayal of snow deformation. The findings indicate that the approaches maintain real-time performance well into several hundred individual deformation actors. However, these approaches each have their individual restrictions handicapping the visual results. An experimental approach is to combine the techniques at reduced deformation actor count to benefit from the detailed, merged deformation pattern.
This thesis evaluates automated techniques to remove objects from an image and proposed several modifications for the specific application of removing a colour checker from structure dominated images. The selection of approaches covers the main research field of image inpainting as well as an approach used in medical image processing. Their results are investigated to disclose their applicability to removing objects from structure-intense images. The advantages and disadvantages discovered in the process are then used to propose several modifications for an adapted inpainting approach suitable for removing the colour checker.
Since the invention of U-net architecture in 2015, convolutional networks based on its encoder-decoder approach significantly improved results in image analysis challenges. It has been proven that such architectures can also be successfully applied in different domains by winning numerous championships in recent years. Also, the transfer learning technique created an opportunity to push state-of-the-art benchmarks to a higher level. Using this approach is beneficial for the medical domain, as collecting datasets is generally a difficult and expensive process.
In this thesis, we address the task of semantic segmentation with Deep Learning and make three main contributions and release experimental results that have practical value for medical imaging.
First, we evaluate the performance of four neural network architectures on the dataset of the cervical spine MRI scans. Second, we use transfer learning from models trained on the Imagenet dataset and compare it to randomly initialized networks. Third, we evaluate models trained on the bias field corrected and raw MRI data. All code to reproduce results is publicly available online.
Leichte Sprache (LS, easy-to-read German) is a simplified variety of German. It is used to provide barrier-free texts for a broad spectrum of people, including lowliterate individuals with learning difficulties, intellectual or developmental disabilities (IDD) and/or complex communication needs (CCN). In general, LS authors are proficient in standard German and do not belong to the aforementioned group of people. Our goal is to empower the latter to participate in written discourse themselves. This requires a special writing system whose linguistic support and ergonomic software design meet the target group’s specific needs. We present EasyTalk a system profoundly based on natural language processing (NLP) for assistive writing in an extended variant of LS (ELS). EasyTalk provides users with a personal vocabulary underpinned with customizable communication symbols and supports in writing at their individual level of proficiency through interactive user guidance. The system minimizes the grammatical knowledge needed to produce correct and coherent complex contents by intuitively formulating linguistic decisions. It provides easy dialogs for selecting options from a natural-language paraphrase generator, which provides context-sensitive suggestions for sentence components and correctly inflected word forms. In addition, EasyTalk reminds users to add text elements that enhance text comprehensibility in terms of audience design (e.g., time and place of an event) and improve text coherence (e.g., explicit connectors to express discourse-relations). To tailor the system to the needs of the target group, the development of EasyTalk followed the principles of human-centered design (HCD). Accordingly, we matured the system in iterative development cycles, combined with purposeful evaluations of specific aspects conducted with expert groups from the fields of CCN, LS, and IT, as well as L2 learners of the German language. In a final case study, members of the target audience tested the system in free writing sessions. The study confirmed that adults with IDD and/or CCN who have low reading, writing, and computer skills can write their own personal texts in ELS using EasyTalk. The positive feedback from all tests inspires future long-term studies with EasyTalk and further development of this prototypical system, such as the implementation of a so-called Schreibwerkstatt (writing workshop)
Tractography on HARDI data
(2011)
Diffusion weighted imaging is an important modality in clinical imaging and the only possibility to gain insight into the human brain noninvasively and in-vivo. The applications of this imaging technique are diversified. It is used to study the brain, its structure, development and the functionality of the different areas. Further, important fields of application are neurosurgical planning, examinations of pathologies, investigation of Alzheimer-, strokes, and multiple sclerosis. This thesis gives a brief introduction to MRI and diffusion MRI. Based on this, the mostly used data representation in diffusion MRI in clinical imaging, the diffusion tensor, is introduced. As the diffusion tensor suffers from severe limitations new techniques subsumed under the term HARDI (high angular resolution diffusion imaging) are introduced and discussed in detail. Further, an extensive introduction to tractography, approaches that aim at reconstructing neuronal fibers, is given. Based on the knowledge fromthe theoretical part established tractography algorithms are redesigned to handle HARDI data and, thus, improve the reconstruction of neuronal fibers. Among these algorithms, a novel approach is presented that successfully reconstructs fibers on phantom data as well as on human brain data. Further, a novel global classification approach is presented to cluster voxels according to their diffusion properties.
In this thesis, the performance of the IceCube projects photon propagation
code (clsim) is optimized. The process of GPU code analysis and perfor-
mance optimization is described in detail. When run on the same hard-
ware, the new version achieves a speedup of about 3x over the original
implementation. Comparing the unmodified code on hardware currently
used by IceCube (NVIDIA GTX 1080) against the optimized version run on
a recent GPU (NVIDIA A100) a speedup of about 9.23x is observed. All
changes made to the code are shown and their performance impact as well
as the implications for simulation accuracy are discussed individually.
The approach taken for optimization is then generalized into a recipe.
Programmers can use it as a guide, when approaching large and complex
GPU programs. In addition, the per warp job-queue, a design pattern used
for load balancing among threads in a CUDA thread block, is discussed in
detail.
Real-time graphics applications are tending to get more realistic and approximate real world illumination gets more reasonable due to improvement of graphics hardware. Using a wide variation of algorithms and ideas, graphics processing units (GPU) can simulate complex lighting situations rendering computer generated imagery with complicated effects such as shadows, refraction and reflection of light. Particularly, reflections are an improvement of realism, because they make shiny materials, e.g. brushed metals, wet surfaces like puddles or polished floors, appear more realistic and reveal information of their properties such as roughness and reflectance. Moreover, reflections can get more complex, depending on the view: a wet surface like a street during rain for example will reflect lights depending on the distance of the viewer, resulting in more streaky reflection, which will look more stretched, if the viewer is locatedrnfarther away from the light source. This bachelor thesis aims to give an overview of the state-of-the-art in terms of rendering reflections. Understanding light is a basic need to understand reflections and therefore a physical model of light and its reflection will be covered in section 2, followed by the motivational section 2.2, that will give visual appealing examples for reflections from the real world and the media. Coming to rendering techniques, first, the main principle will be explained in section 3 followed by a short general view of a wide variety of approaches that try to generate correct reflections in section 4. This thesis will describe the implementation of three major algorithms, that produce plausible local reflections. Therefore, the developed framework is described in section 5, then three major algorithms will be covered, that are common methods in most current game and graphics engines: Screen space reflections (SSR), parallax-corrected cube mapping (PCCM) and billboard reflections (BBR). After describing their functional principle, they will be analysed of their visual quality and the possibilities of their real-time application. Finally they will be compared to each other to investigate the advantages and disadvantages over each other. In conclusion, the gained experiences will be described by summarizing advantages and disadvantages of each technique and giving suggestions for improvements. A short perspective will be given, trying to create a view of upcoming real-time rendering techniques for the creation of reflections as specular effects.
Artificial neural networks is a popular field of research in artificial intelli-
gence. The increasing size and complexity of huge models entail certain
problems. The lack of transparency of the inner workings of a neural net-
work makes it difficult to choose efficient architectures for different tasks.
It proves to be challenging to solve these problems, and with a lack of in-
sightful representations of neural networks, this state of affairs becomes
entrenched. With these difficulties in mind a novel 3D visualization tech-
nique is introduced. Attributes for trained neural networks are estimated
by utilizing established methods from the area of neural network optimiza-
tion. Batch normalization is used with fine-tuning and feature extraction to
estimate the importance of different parts of the neural network. A combi-
nation of the importance values with various methods like edge bundling,
ray tracing, 3D impostor and a special transparency technique results in a
3D model representing a neural network. The validity of the extracted im-
portance estimations is demonstrated and the potential of the developed
visualization is explored.
We present a non-linear camera pose estimator, which is able to handle a combined input of point and line feature correspondences. For three or more correspondences, the estimator works on any arbitrary number and choice of the feature type, which provides an estimation of the pose on a preferably small and flexible amount of 2D-3D correspondences. We also give an analysis of different minimization techniques, parametrizations of the pose data, and of error measurements between 2D and 3D data. These will be tested for the usage of point features, lines and the combination case. The result shows the most stable and fast working non-linear parameter set for pose estimation in model-based tracking.
While Virtual Reality has been around for decades it gained new life in recent years. The release of the first consumer hardware devices allows fully immersive and affordable VR for the user at home. This availability lead to a new focus of research on technical problems as well as psychological effects. The concepts of presence, describing the feeling of being in the virtual place, body ownership and their impact are central topics in research for a long time and still not fully understood.
To enable further research in the area of Mixed Reality, we want to introduce a framework that integrates the users body and surroundings inside a visual coherent virtual environment. As one of two main aspects we want to merge real and virtual objects to a shared environment in a way such that they are no longer visually distinguishable. To achieve this the main focus is not supposed to be on a high graphical fidelity but on a simplified representation of reality. The essential question is, what level of visual realism is necessary to create a believable mixed reality environment that induces a sense of presence in the user? The second aspect considers the integration of virtual persons. Can characters be recorded and replayed in a way such that they are perceived as believable entities of the world and therefore act as a part of the users environment?
The purpose of this thesis was the development of a framework called Mixed Reality Embodiment Platform. This inital system implements fundamental functionalities to be used as a basis for future extensions to the framework. We also provide a first application that enables user studies to evaluate the framework and contribute to aforementioned research questions.
The goal of this minor thesis is to integrate a robotic arm into an existing robotics software. A robot built on top of this stack should be able to participate successfully RoboCup @Home league. The robot Lisa (Lisa is a service android) needs to manipulate objects, lifting them from shelves or handing them to people. Up to now, the only possibility to do this was a small gripper attached to the robot platform. A "Katana Linux Robot" of Swiss manufacturer Neuronics has been added to the robot for this thesis. This arm needs a driver software and path planner, so that the arm can reach its goal object "intelligently", avoiding obstacles and creating smooth, natural motions.
This thesis focuses on the utilization of modern graphics hardware (GPU) for visualization and computation purposes, especially of volumetric data from medical imaging. The considerable increase in raw computing power in recent years has turned commodity systems into high-performance workstations. In combination with the direct rendering capabilities of graphics hardware, "visual computing" and "computational steering" approaches on large data sets have become feasible. In this regard several example applications and concepts such as the "ray textures" have been developed and are discussed in detail. As the amount of data to be processed and visualized is steadily increasing, memory and bandwidth limitations require compact representations of the data. While the compression of image data has been investigated extensively in the past, the thesis addresses possibilities of performing computations directly on the compressed data. Therefore, different categories of algorithms are identified and represented in the wavelet domain. By using special variants of the compressed format, efficient implementations of essential image processing algorithms are possible and demonstrate the potential of the approach. From the technical perspective, the GPU-based framework "Cascada" has been developed in the course of this thesis. The introduction of object-oriented concepts to shader programming, as well as a hierarchical representation of computation and/or visualization procedures led to a simplified utilization of graphics hardware while maintaining competitive performance. This is shown with different implementations throughout the contributions, as well as two clinical projects in the field of diagnosis assistance. On the one hand the semi-automatic segmentation of low-resolution MRI data sets of the human liver is evaluated. On the other hand different possibilities in assessing abdominal aortic aneurysms are discussed; both projects make use of graphics hardware. In addition, "Cascada" provides extensions towards recent general-purpose programming architectures and a modular design for future developments.
Texture-based text detection in digital images using wavelet features and support vector machines
(2010)
In this bachelor thesis a new texture-based approach for the detection of text in digital images is presented. The procedure can be essentially divided into two main tasks, in detection of text blocks and detection of individual words, whereby the individual words are extracted from the detected text blocks. Roughly, the developed method acts with multiple support vector machines, which classify possible text regions of an image into real text regions, using wavelet-based features. In the process the possible text regions are defifined by edge projections with diσerent orientations. The results of the approach are X/Y coordinates, width and height of rectangular regions of an image, which contains individual words. This knowledge can be further processed, for example by an optical character recognition software to get the important and useful text information.
Constituent parsing attempts to extract syntactic structure from a sentence. These parsing systems are helpful in many NLP applications such as grammar checking, question answering, and information extraction. This thesis work is about implementing a constituent parser for German language using neural networks. Over the past, recurrent neural networks have been used in building a parser and also many NLP applications. In this, self-attention neural network modules are used intensively to understand sentences effectively. With multilayered self-attention networks, constituent parsing achieves 93.68% F1 score. This is improved even further by using both character and word embeddings as a representation of the input. An F1 score of 94.10% was the best achieved by constituent parser using only the dataset provided. With the help of external datasets such as German Wikipedia, pre-trained ELMo models are used along with self-attention networks achieving 95.87% F1 score.
We introduce linear expressions for unrestricted dags (directed acyclic graphs) and finite deterministic and nondeterministic automata operating on them. Those dag automata are a conservative extension of the Tu,u-automata of Courcelle on unranked, unordered trees and forests. Several examples of dag languages acceptable and not acceptable by dag automata and some closure properties are given.
Tracking is an integral part of many modern applications, especially in areas like autonomous systems and Augmented Reality. For performing tracking there are a wide array of approaches. One that has become a subject of research just recently is the utilization of Neural Networks. In the scope of this master thesis an application will be developed which uses such a Neural Network for the tracking process. This also requires the creation of training data as well as the creation and training of a Neural Network. Subsequently the usage of Neural Networks for tracking will be analyzed and evaluated. This includes several aspects. The quality of the tracking for different degrees of freedom will be checked as well as the the impact of the Neural Network on the applications performance. Additionally the amount of required training data is investigated, the influence of the network architecture and the importance of providing depth data as part of the networks input. This should provide an insight into how relevant this approach could be for its adoption in future products.
The Material Point Method (MPM) has proven to be a very capable simulation method in computer graphics that is able to model materials that were previously very challenging to animate [1, 2]. Apart from simulating singular materials, the simulation of multiple materials that interact with each other introduces new challenges. This is the focus of this thesis. It will be shown that the self-collision capabilities of the MPM can naturally handle multiple materials interacting in the same scene on a collision basis, even if the materials use distinct constitutive models. This is then extended by porous interaction of materials as in[3], which also integrates easily with MPM.It will furthermore be shown that regular single-grid MPM can be viewed as a subset of this multi-grid approach, meaning that its behavior can also be achieved if multiple grids are used. The porous interaction is generalized to arbitrary materials and freely changeable material interaction terms, yielding a flexible, user-controllable framework that is independent of specific constitutive models. The framework is implemented on the GPU in a straightforward and simple way and takes advantage of the rasterization pipeline to resolve write-conflicts, resulting in a portable implementation with wide hardware support, unlike other approaches such as [4].