Year of publication: 2011
Document Type: Doctoral Thesis, Master's Thesis
Keywords: Data Mining
Institute: Fachbereich 4, Institut für Informatik
Identifying reusable legacy code that can implement SOA services is still an open research issue. This master's thesis presents an approach to identifying legacy code for service implementation based on dynamic analysis and the application of data mining techniques. As part of the SOAMIG project, code execution traces were mapped to business processes. Because dynamic analyses generate a large number of traces, the traces must be post-processed in order to provide useful information. For this master's thesis, two data mining techniques, cluster analysis and link analysis, were applied to the traces. First tests on a Java/Swing legacy system provided good results compared to an expert allocation of legacy code.
Folksonomies are Web 2.0 platforms where users share resources with each other. Users can also assign keywords (called tags) to the resources in order to categorize and organize them. Different folksonomies support many types of resources, such as websites (Delicious), images (Flickr), and videos (YouTube). Folksonomies are easy to use and thus attract millions of users, but this ease of use also brings problems. This thesis addresses three problems of folksonomies and proposes solutions for them. The first problem occurs when users search for relevant resources in folksonomies. Often, users are not able to find all relevant resources because they do not know which tags are relevant. The second problem is assigning tags to resources. Although many folksonomies (like Delicious) recommend tags for resources, others (like Flickr) do not recommend any tags. Tag recommendation helps users to tag their resources easily. The third problem is that tags and resources lack semantics, which leads, for example, to ambiguous tags. Tags lack semantics because they are freely chosen keywords. Automatically identifying the semantics of tags and resources helps to reduce the problems that arise from the users' freedom in choosing tags. This thesis proposes methods that exploit semantics to address the problems of search, tag recommendation, and the identification of tag semantics. The semantics are discovered from a variety of sources. In this thesis, we exploit web search engines, online social communities, and the co-occurrences of tags as sources of semantics. Using different sources for discovering semantics reduces the effort required to build systems that solve the problems mentioned above. This thesis evaluates the proposed methods on a large-scale data set. The evaluation results suggest that it is possible to exploit the semantics to improve search, tag recommendation, and the automatic identification of the semantics of tags and resources.
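To illustrate how tag co-occurrences can serve as a source of semantics for tag recommendation, the following is a minimal sketch and not the method evaluated in the thesis. It assumes that each post is simply a list of tags; the function names `build_cooccurrence` and `recommend_tags` and the toy data are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(posts):
    """Count how often each pair of tags is assigned to the same resource."""
    cooc = defaultdict(Counter)
    for tags in posts:
        # Use a set so a tag repeated within one post is counted once.
        for a, b in combinations(sorted(set(tags)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend_tags(cooc, given_tags, k=3):
    """Suggest up to k tags that most often co-occur with the given tags."""
    given = set(given_tags)
    scores = Counter()
    for tag in given:
        for other, count in cooc.get(tag, {}).items():
            if other not in given:
                scores[other] += count
    # Sort by score (descending), then alphabetically to break ties.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


# Hypothetical toy data: each inner list holds the tags of one resource.
posts = [
    ["python", "programming", "tutorial"],
    ["python", "programming"],
    ["python", "snake", "animal"],
    ["snake", "animal", "zoo"],
]

cooc = build_cooccurrence(posts)
print(recommend_tags(cooc, ["python"]))
```

In this toy data the tag "python" co-occurs with both programming-related and animal-related tags, which is the kind of ambiguity that arises from freely chosen keywords. Supplying a second tag, for example `recommend_tags(cooc, ["python", "snake"])`, shifts the suggestions toward one sense.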