OPUS 4 | Search

Study on Data Placement Strategies in Distributed RDF Stores (2020)

Janke, Daniel

The distributed setting of RDF stores in the cloud poses many challenges. One such challenge is how the data placement on the compute nodes can be optimized to improve query performance. To address this challenge, several evaluations in the literature have investigated the effects of existing data placement strategies on query performance. A common drawback of these evaluations is that it is unclear whether the observed behaviors were caused by the data placement strategies (if different RDF stores were evaluated as a whole) or reflect the behavior in distributed RDF stores (if cloud processing frameworks like Hadoop MapReduce were used for the evaluation). To overcome these limitations, this thesis develops a novel benchmarking methodology for data placement strategies that uses a data-placement-strategy-independent distributed RDF store to analyze the effect of the data placement strategies on query performance. With this evaluation methodology, the frequently used data placement strategies have been evaluated. This evaluation challenged the commonly held belief that data placement strategies that emphasize local computation, such as minimal edge-cut cover, lead to faster query executions. The results indicate that queries with a high workload may be executed faster on hash-based data placement strategies than on, e.g., minimal edge-cut covers. The analysis of the additional measurements indicates that vertical parallelization (i.e., a well-distributed workload) may be more important than horizontal containment (i.e., minimal data transport) for efficient query processing. Moreover, to find a data placement strategy with a high vertical parallelization, the thesis tests the hypothesis that collocating small connected triple sets on the same compute node while balancing the number of triples stored on the different compute nodes leads to a high vertical parallelization. Specifically, the thesis proposes two such data placement strategies. The first strategy, called overpartitioned minimal edge-cut cover, was found in the literature, and the second strategy is the newly developed molecule hash cover. The evaluation revealed a balanced query workload and a high horizontal containment, which lead to a high vertical parallelization. As a result, these strategies showed a better query performance than the frequently used data placement strategies.
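As a rough illustration of what a hash-based data placement strategy does, the following minimal sketch assigns each triple to a compute node by hashing its subject. It is not the implementation evaluated in the thesis: the toy triples, the function names (`node_for`, `hash_placement`, `load_per_node`, `cut_triples`) and the choice of SHA-256 are all assumptions made for this example. The count of triples whose subject and object hash to different nodes is only a crude proxy for data transport, not the thesis's measure of horizontal containment.

```python
import hashlib

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:worksAt", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:koblenz"),
]


def node_for(resource: str, num_nodes: int) -> int:
    """Map a resource to a compute node with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(resource.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def hash_placement(triples, num_nodes):
    """Assign each triple to the node chosen by hashing its subject."""
    placement = {node: [] for node in range(num_nodes)}
    for triple in triples:
        placement[node_for(triple[0], num_nodes)].append(triple)
    return placement


def load_per_node(placement):
    """Number of triples stored on each node (a simple balance indicator)."""
    return {node: len(stored) for node, stored in placement.items()}


def cut_triples(triples, num_nodes):
    """Count triples whose subject and object hash to different nodes."""
    return sum(
        1
        for subject, _, obj in triples
        if node_for(subject, num_nodes) != node_for(obj, num_nodes)
    )


if __name__ == "__main__":
    placement = hash_placement(TRIPLES, num_nodes=3)
    print("load per node:", load_per_node(placement))
    print("cut triples:", cut_triples(TRIPLES, num_nodes=3))
```

A stable hash from `hashlib` is used instead of Python's built-in `hash`, which is randomized per process for strings and would place the same triple on different nodes across runs.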

Summative Evaluation facettierter Suche und Exploration auf mobilen Endgeräten [Summative evaluation of faceted search and exploration on mobile devices] (2012)

Schneider, Mark

Faceted search makes it possible to explore large, unfamiliar datasets easily and in a targeted way. When implementing applications for smartphones, one has to take into account that, in contrast to desktop applications, only a smaller screen and limited means of interaction between user and smartphone are available. These limitations can negatively affect the usability of an application. FaThumb and MobileFacets are two mobile applications that implement and use faceted search, but only MobileFacets is designed for current smartphones with touchscreens. FaThumb, however, offers a novel facet navigation, which this thesis reimplements for current smartphones as MFacets. In addition, this thesis conducts a summative evaluation comparing the two applications, MFacets and MobileFacets, with respect to usability and presents the analyzed results.

Techniques for optimized reasoning in description logic knowledge bases (2016)

Schon, Claudia

One of the main goals of the artificial intelligence community is to create machines able to reason with dynamically changing knowledge. To achieve this goal, a multitude of different problems have to be solved, many of which have been addressed in the various sub-disciplines of artificial intelligence, like automated reasoning and machine learning. The thesis at hand focuses on the automated reasoning aspects of these problems and addresses two of the problems which have to be overcome to reach the aforementioned goal, namely 1. the fact that reasoning in logical knowledge bases is intractable and 2. the fact that applying changes to formalized knowledge can easily introduce inconsistencies, which leads to unwanted results in most scenarios. To ease the intractability of logical reasoning, I suggest adapting a technique called knowledge compilation, known from propositional logic, to description logic knowledge bases. The basic idea of this technique is to compile the given knowledge base into a normal form which allows queries to be answered efficiently. This compilation step is very expensive but has to be performed only once, and as soon as its result is used to answer many queries, the expensive compilation step pays off. In the thesis at hand, I develop a normal form, called linkless normal form, suitable for knowledge compilation for description logic knowledge bases. From a computational point of view, the linkless normal form has very nice properties, which are introduced in this thesis. For the second problem, I focus on changes occurring on the instance level of description logic knowledge bases. I introduce three change operators of interest for these knowledge bases, namely deletion and insertion of assertions as well as repair of inconsistent instance bases. These change operators are defined such that in all three cases, the resulting knowledge base is ensured to be consistent and the changes made to the knowledge base are minimal. This allows us to preserve as much of the original knowledge base as possible. Furthermore, I show how these changes can be applied by using a transformation of the knowledge base. For both issues, I suggest adapting techniques successfully used in other logics to obtain promising methods for description logic knowledge bases.

Time series influences in political communication (2019)

Thesing, Tobias

Current political issues are often reflected in social media discussions, gathering politicians and voters on common platforms. As these can affect the public perception of politics, the inner dynamics and backgrounds of such debates are of great scientific interest. This thesis treats user-generated messages from an up-to-date dataset of considerable relevance as time series and applies a topic-based analysis of inspiration and agenda setting to them. The Institute for Web Science and Technologies of the University of Koblenz-Landau has collected Twitter data generated beforehand by candidates of the European Parliament Election 2019. This work processes and analyzes the dataset for various properties, while focusing on the influence of politicians and media on online debates. An algorithm to cluster tweets into topical threads is introduced. Subsequently, Sequential Association Rules are mined, yielding a wide array of potential influence relations between both actors and topics. The elaborated methodology can be configured with different parameters and is extensible in functionality and scope of application.

Uncertainty and inconsistency in knowledge representation (2016)

Thimm, Matthias

This habilitation thesis collects works addressing several challenges in handling uncertainty and inconsistency in knowledge representation. In particular, this thesis contains works which introduce quantitative uncertainty based on probability theory into abstract argumentation frameworks. The formal semantics of this extension is investigated and its application for strategic argumentation in agent dialogues is discussed. Moreover, both the computational aspects and the meaningfulness of approaches to analyze inconsistencies, in classical logics as well as in logics for uncertain reasoning, are investigated. Finally, this thesis addresses the implementation challenges for various kinds of knowledge representation formalisms employing any notion of inconsistency tolerance or uncertainty.

Warum Wer Wen kennt. Eine themenspezifische Auswertung sozialer Netzwerke [Why who knows whom. A topic-specific analysis of social networks] (2008)

Henkes, René

Social networks play an ever greater role in today's world. On the Internet, a new Web 2.0 application emerges almost daily. It is therefore becoming increasingly important to understand the processes in social networks and to be able to simulate them for research purposes. Since all common social networks today operate only in one dimension, this diploma thesis deals with multidimensional social networks. Multidimensional social networks make it possible to define different types of relationships. For example, two actors can not only be in a "knows" relationship; this relationship type could also be subdivided into various subtypes, e.g., actor A "is a colleague of" actor B or actor C "is the spouse of" actor D. In this way, any number of entirely different relationship types can exist side by side. The thesis addresses the question of the extent to which the properties of one-dimensional social networks also hold for multidimensional ones. To find out, existing metrics are developed further. These metrics were developed for one-dimensional social networks and can now also be used to assess multidimensional social networks. A central question here is how well people who have something to say to each other find one another. To obtain results that are as exact as possible, it is necessary to use real data. These are obtained from a Web 2.0 project in which users post links on various topics (see Chapter 4). The first practical step of this thesis is therefore to read in the social network and to simulate on it a communication between two persons with similar topic areas. The results of the simulation are then evaluated using the previously developed metrics.

“Did I say something wrong?” A word-level analysis of Wikipedia articles for deletion discussions (2016)

Ruster, Michael

This thesis aims to gain linguistic insights at the word level into written discussions. The distinction between messages that benefit discussions and those that disrupt them played a particular role. One focus was on determining whether I-messages and you-messages are characteristic of the two kinds of communication. Over the years, these messages have become established recommendations for successful communication. Their attributed effect has been confirmed several times, but always in smaller studies. Therefore, this thesis developed a fully automatic procedure for creating an annotated dataset, using the articles for deletion discussions of the English Wikipedia and the list of blocked users. Discussion messages were labeled as either beneficial or harmful to a constructive course of discussion. This dataset was then used in a binary classification to determine characteristic words for the two kinds of communication. It was also examined whether function words (also known as synsemantic words) such as pronouns or conjunctions allow a decision about the kind of communication of a message. You-messages were identified as harmful in the analyses conducted, in line with their attributed negative effect on communication. Contrary to the positive effect attributed to I-messages, a harmful effect was found for these as well. No clear statement about the relevance of function words could be made based on the results. No further characteristic words could be identified. The results indicate that a different model could potentially represent textual discussions better.
