Folksonomies are Web 2.0 platforms where users share resources with each other and assign keywords (called tags) to categorize and organize them. Different folksonomies support different types of resources, such as websites (Delicious), images (Flickr), and videos (YouTube). Folksonomies are easy to use and thus attract millions of users, but this ease of use also brings problems. This thesis addresses three of these problems and proposes solutions for them. The first problem occurs when users search for relevant resources in folksonomies: often they cannot find all relevant resources because they do not know which tags are relevant. The second problem is assigning tags to resources. Although many folksonomies (like Delicious) recommend tags for resources, others (like Flickr) do not; tag recommendation helps users tag their resources easily. The third problem is that tags and resources lack semantics, which leads, for example, to ambiguous tags. Tags lack semantics because they are freely chosen keywords. Automatically identifying the semantics of tags and resources helps reduce the problems that arise from this freedom in choosing tags. This thesis proposes methods that exploit semantics to address the problems of search, tag recommendation, and the identification of tag semantics. The semantics are discovered from a variety of sources: web search engines, online social communities, and the co-occurrences of tags. Using different sources for discovering semantics reduces the effort needed to build systems that solve these problems. The proposed methods are evaluated on a large-scale data set.
The evaluation results suggest that it is possible to exploit the semantics for improving search, recommendation of tags, and automatic identification of the semantics of tags and resources.
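As a rough illustration of how tag co-occurrences can serve as a source for tag recommendation, the following minimal Python sketch counts how often tags appear together on the same resource and suggests the tags that co-occur most often with those already assigned. It is not the method of the thesis: the function names `build_cooccurrence` and `recommend_tags`, the scoring by summed counts, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(posts):
    """Count how often each pair of tags is assigned to the same resource."""
    cooc = defaultdict(Counter)
    for tags in posts:
        # Use each distinct tag once per resource and count every pair both ways.
        for a, b in combinations(sorted(set(tags)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend_tags(cooc, given_tags, k=3):
    """Rank candidate tags by their summed co-occurrence with the given tags."""
    scores = Counter()
    for tag in given_tags:
        for other, count in cooc.get(tag, {}).items():
            if other not in given_tags:
                scores[other] += count
    # Sort by score (descending), then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


# Hypothetical toy data: each entry is the tag set of one resource.
posts = [
    ["python", "programming", "tutorial"],
    ["python", "programming"],
    ["python", "snake", "animal"],
    ["programming", "java"],
]
cooc = build_cooccurrence(posts)
print(recommend_tags(cooc, ["python"]))
```

The toy data also hints at the ambiguity problem mentioned in the abstract: "python" co-occurs with both programming-related and animal-related tags, so raw counts alone do not say which sense a user means.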
The main focus of this thesis is on integrating a Search Term Recommender (STR) module in a way that is sound from a software-ergonomics perspective, using usability tests and a targeted look at the state of the art in interaction design for value-added retrieval services. In her diploma thesis, Daniela Holl (Holl, 2009) developed a prototype Search Term Recommender module using the MindServer software and explained the advantages of a search term recommender for handling the remaining vagueness between the user and the controlled vocabulary. After developing a working prototype, she conducted an empirical study of the results it delivered and of their quality. This largely ensured that the prototype functions correctly. The purpose of the Search Term Recommender is to suggest to the user only those terms that have not already been handled by term transformations of the heterogeneity service. Above all, however, the focus is on supporting targeted searches for specific data in order to satisfy the information need. The user is to be supported both visually in formulating the search query and in filtering out the relevant results in the result display. Since only machine-generated and internal data have so far been used for test comparisons, the priority of this thesis lies in investigating dialog design and user interaction with the Search Term Recommender. The emphasis was a comprehensive evaluation of design prototypes and (paper) mockups using usability-engineering methods directly with users, with regard to the feasibility and usability of the Search Term Recommender.
The search for scientific literature in scientific information systems is a discipline at the intersection of information retrieval and digital libraries. Recent user studies show two typical weaknesses of the classical IR model: the ranking of retrieved and possibly relevant documents, and the language problem during the query formulation phase. At the same time, traditional retrieval systems that rely primarily on textual document and query features have been stagnating for years, as can be observed in IR evaluation campaigns such as TREC or CLEF. Alternative approaches to overcome these two problems are therefore needed. Two different search support systems are presented in this work and evaluated in a lab evaluation using the IR test collections GIRT and iSearch with 150 and 65 topics, respectively. These two systems are (1) a query expansion based on the analysis of co-occurrences of document attributes and (2) a ranking mechanism that applies informetric analysis of the productivity of information producers in the information production process. Both systems were compared to a baseline system using the Solr search engine. Both methods showed positive effects when applying additional document attributes such as author names, ISSN codes and controlled terms. The query expansion showed an improvement in precision (bpref +12%) and in recall (R +22%).
The alternative ranking methods were able to compete with the baseline for author names and ISSN codes and beat the baseline when using controlled terms (MAP +14%). A clear negative influence was seen when using entities such as publishers or locations. Both methods generated a substantially different ordering of the result set, measured using Kendall's tau. So, in addition to improved relevance in the result list, the user gets a new and different view of the document set. Query expansion using author names, ISSN codes and thesaurus terms showed the great potential that lies within the rich metadata sets of digital library systems. The proposed ranking methods could outperform standard relevance ranking methods after they were filtered by the existence of a so-called power law. This showed that the proposed ranking methods cannot be applied universally but require specific frequency distributions in the metadata. A connection between the underlying informetric laws of Bradford, Lotka and Zipf is made clear. The evaluated methods were implemented as interactive search support systems that can be used in an interactive prototype and in the social science digital library system Sowiport. In addition, the methods can be adapted to other systems and environments using a free software framework and a web API.
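The abstract reports that the two methods produced a substantially different ordering of the result set, "measured using Kendall". Assuming this refers to Kendall's tau rank correlation coefficient, the following minimal Python sketch shows how two rankings of the same documents can be compared. The function name `kendall_tau`, the tau-a variant without ties, and the document identifiers are assumptions made for this example and are not taken from the thesis.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau-a for two rankings of the same items, without ties.

    Returns 1.0 for identical orderings, -1.0 for exactly reversed ones,
    and values near 0 when the orderings are largely unrelated.
    """
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    if len(set(ranking_a)) != n or len(ranking_b) != n:
        raise ValueError("rankings must have equal length and no duplicates")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("both rankings must contain the same items")

    position_b = {item: index for index, item in enumerate(ranking_b)}
    concordant = discordant = 0
    # Every pair (x, y) drawn here has x ranked before y in ranking_a.
    for x, y in combinations(ranking_a, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical document identifiers for a baseline and an alternative ranking.
baseline = ["d1", "d2", "d3", "d4"]
alternative = ["d2", "d1", "d4", "d3"]
print(round(kendall_tau(baseline, alternative), 3))
```

A low or negative value indicates that the alternative ranking reorders the result set substantially compared with the baseline, which is the kind of difference the abstract describes.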
The amount of information on the Web is constantly increasing, and a wide variety of information is available, such as news, encyclopedia articles, statistics, survey data, stock information, events and bibliographies. This information is characterized by heterogeneity in aspects such as information type, modality, structure, granularity and quality, and by its distributed nature. The two primary techniques by which users look for information on the Web are (1) using Web search engines and (2) browsing the links between information. The dominant mode of information presentation is static, in the form of text, images and graphics. Interactive visualizations offer a number of advantages for the presentation and exploration of heterogeneous information on the Web: (1) they provide different representations for different, very large and complex types of information, and (2) large amounts of data can be explored interactively using their attributes, which can support and expand the cognition process of the user. So far, interactive visualizations are still not an integral part of the search process on the Web. The technical standards and interaction paradigms needed to make interactive visualization usable by the masses are being introduced only slowly by standardization organizations. This work examines how interactive visualizations can be used for the linking and search process of heterogeneous information on the Web. Based on principles from the areas of information retrieval (IR), information visualization and information processing, a model is created that extends the existing structural models of information visualization with two new processes: (1) linking of information in visualizations and (2) searching, browsing and filtering based on glyphs. The Vizgr toolkit implements the developed model in a web application.
In four different application scenarios, aspects of the model are instantiated and either evaluated in user tests or examined by examples.