Refine
Document Type
- Master's Thesis (7)
- Doctoral Thesis (3)
- Bachelor Thesis (1)
- Diploma Thesis (1)
Keywords
- API (1)
- API analysis (1)
- API-Analyse (1)
- Ad-hoc-Netz (1)
- Algorithmische Geometrie (1)
- Beaconless (1)
- Befahrbarkeit (1)
- Developer profiling (1)
- Dimension 3 (1)
- Distributed Algorithm (1)
- Drahtloses Sensorsystem (1)
- Drahtloses vermachtes Netz (1)
- Ebener Graph (1)
- Echtzeit-Interaktion (1)
- Entwickler Profil (1)
- Gelände (1)
- Geographic routing (1)
- Geometric spanner (1)
- Graph (1)
- Hindernis (1)
- Java. Programmiersprache (1)
- Klassifikation (1)
- Knowledge Engineering (1)
- Komplexität / Algorithmus (1)
- Laser (1)
- Local algorithm (1)
- MSR (1)
- Meteor (1)
- Mining Software Repositories (1)
- Nachbarschaftsgraph (1)
- Netzwerktopologie (1)
- Nutzerzufriedenheit (1)
- Planar graphs (1)
- Quasi unit disk graph (1)
- Reactive algorithm (1)
- Roboter (1)
- Routing (1)
- Software Engineering (1)
- Software Repositories (1)
- Straßenzustand (1)
- Technology Acceptance Model (1)
- Unit disk graph (1)
- Verteilter Algorithmus (1)
- Wahrscheinlichkeitsrechnung (1)
- Wireless sensor network (1)
- Zusammenhängender Graph (1)
- e-service quality (1)
Institute
- Institut für Informatik (12) (remove)
This thesis presents an analysis of API usage in a large corpus of Java software retrieved from the open source repositories hosted at SourceForge. Most larger software projects use software libraries, which offer a public "application programming interface" or API as an interface for the programmer. In order to facilitate the transition between different APIs, there are emerging research projects in the field of automated API migration. However, there is a lack of basic statistical background information about in-the-wild usage of APIs as such measurements have, until now, only been done on rather small corpora. We thus present an analysis method suitable for measurements with large corpora. First, we create a corpus of open source projects hosted on SourceForge, as well as a corpus of software libraries. Then, all projects in the corpus are compiled with an instrumented compiler. We use a compiler plugin for javac that gives detailed information about every method created by the compiler. This information is stored in a database and analyzed.
The publication of open source software aims to support the reuse, the distribution and the general utilization of software. This can only be enabled by the correct usage of open source software licenses. Therefore associations provide a multitude of open source software licenses with different features, of which a developer can choose, to regulate the interaction with his software. Those licenses are the core theme of this thesis.
After an extensive literature research, two general research questions are elaborated in detail. First, a license usage analysis of licenses in the open source sector is applied, to identify current trends and statistics. This includes questions concerning the distribution of licenses, the consistency in their usage, their association over a period of time and their publication.
Afterwards the recommendation of licenses for specific projects is investigated. Therefore, a recommendation logic is presented, which includes several influences on a suitable license choice, to generate an at most applicable recommendation. Besides the exact features of a license of which a user can choose, different methods of ranking the recommendation results are proposed. This is based on the examination of the current situation of open source licensing and license suggestion. Finally, the logic is evaluated on the exemplary use-case of the 101companies project.
Reactive local algorithms are distributed algorithms which suit the needs of battery-powered, large-scale wireless ad hoc and sensor networks particularly well. By avoiding both unnecessary wireless transmissions and proactive maintenance of neighborhood tables (i.e., beaconing), such algorithms minimize communication load and overhead, and scale well with increasing network size. This way, resources such as bandwidth and energy are saved, and the probability of message collisions is reduced, which leads to an increase in the packet reception ratio and a decrease of latencies.
Currently, the two main application areas of this algorithm type are geographic routing and topology control, in particular the construction of a node's adjacency in a connected, planar representation of the network graph. Geographic routing enables wireless multi-hop communication in the absence of any network infrastructure, based on geographic node positions. The construction of planar topologies is a requirement for efficient, local solutions for a variety of algorithmic problems.
This thesis contributes to reactive algorithm research in two ways, on an abstract level, as well as by the introduction of novel algorithms:
For the very first time, reactive algorithms are considered as a whole and as an individual research area. A comprehensive survey of the literature is given which lists and classifies known algorithms, techniques, and application domains. Moreover, the mathematical concept of O- and Omega-reactive local topology control is introduced. This concept unambiguously distinguishes reactive from conventional, beacon-based, topology control algorithms, serves as a taxonomy for existing and prospective algorithms of this kind, and facilitates in-depth investigations of the principal power of the reactive approach, beyond analysis of concrete algorithms.
Novel reactive local topology control and geographic routing algorithms are introduced under both the unit disk and quasi unit disk graph model. These algorithms compute a node's local view on connected, planar, constant stretch Euclidean and topological spanners of the underlying network graph and route messages reactively on these spanners while guaranteeing the messages' delivery. All previously known algorithms are either not reactive, or do not provide constant Euclidean and topological stretch properties. A particularly important partial result of this work is that the partial Delaunay triangulation (PDT) is a constant stretch Euclidean spanner for the unit disk graph.
To conclude, this thesis provides a basis for structured and substantial research in this field and shows the reactive approach to be a powerful tool for algorithm design in wireless ad hoc and sensor networking.
The identification of experts for a specific technology or framework produces a large benefit for collaborative software projects. Hence it reduces the communication overhead that is required to identify an expert on the fly. Therefore this thesis describes a tool and approach that can be used to identify an expert that has a specific skill-set. It will mainly focus on the skills and expertise of developers that use the Django framework. By adding more rules to our framework that approach could easily be extended for different technologies or frameworks. The paper will close with a case study on an open source project.
One task of executives and project managers in IT companies or departments is to hire suitable developers and to assign them to suitable problems. In this paper, we propose a new technique that directly leverages previous work experience of developers in a systematic manner. Existing evidence for developer expertise based on the version history of existing projects is analyzed. More specifically, we analyze the commits to a repository in terms of affected API usage. On these grounds, we associate APIs with developers and thus we assess API experience of developers. In transitive closure, we also assess programming domain experience.
This thesis proposes the use of MSR (Mining Software Repositories) techniques to identify software developers with exclusive expertise about specific APIs and programming domains in software repositories. A pilot Tool for finding such
“Islands of Knowledge” in Node.js projects is presented and applied in a case study to the 180 most popular npm packages. It is found that on average each package has 2.3 Islands of Knowledge, which is possibly explained by the finding that npm packages tend to have only one main contributor. In a survey, the maintainers of 50 packages are contacted and asked for opinions on the results produced by the Tool. Together with their responses, this thesis reports on experiences made with the pilot Tool and how future iterations could produce even more accurate statements about programming expertise distribution in developer teams.
Modern software projects are composed of several software languages, software technologies and different kind of artifacts. Therefore, the understanding of the software project at hand, including the semantic links between the different parts, becomes a difficult challenge for a developer. One approach to attack this issue is to document the software project with the help of a linguistic architecture. This kind of architecture can be described with the help of the MegaL ontology. A remaining challenge is the creation of it since it requires different kind of skills. Therefore, this paper proposes an approach for the automatic extraction of a linguistic architecture. The open source framework Apache Jena, which is focusing on semantic web technologies like RDF and OWL, is used to define custom rules that are capable to infer new knowledge based on the defined or already extracted RDF triples. The complete approach is tested in a case study on ten different open source projects. The aim of the case study is to extract a linguistic architecture that is describing the use of Hibernate in the selected projects. In the end, the result is evaluated with the help of different metrics. The evaluation is performed with the help of an internal and external approach.
Using semantic data from general-purpose programming languages does not provide the unified experience one would want for such an application. Static error checking is lacking, especially with regards to static typing of the data. Based on the previous work of λ-DL, which integrates semantic queries and concepts as types into a typed λ-calculus, this work takes its ideas a step further to meld them into a real-world programming language. This thesis explores how λ-DL's features can be extended and integrated into an existing language, researches an appropriate extension mechanism and produces Semantics4J, a JastAdd-based Java language semantic data extension for type-safe OWL programming, together with examples of its usage.
Software systems are often developed as a set of variants to meet diverse requirements. Two common approaches to this are "clone-and-owning" and software product lines. Both approaches have advantages and disadvantages. In previous work we and collaborators proposed an idea which combines both approaches to manage variants, similarities, and cloning by using a virtual platform and cloning-related operators.
In this thesis, we present an approach for aggregating essential metadata to enable a propagate operator, which implements a form of change propagation. For this we have developed a system to annotate code similarities which were extracted throughout the history of a software repository. The annotations express similarity maintenance tasks, which can then either be executed automatically by propagate or have to be performed manually by the user. In this work we outline the automated metadata extraction process and the system for annotating similarities; we explain how the implemented system can be integrated into the workflow of an existing version control system (Git); and, finally, we present a case study using the 101haskell corpus of variants.
The aim of this thesis was to develop and to evaluate a method, which enables the utilization of traditional dialog marketing tools through the web. For this purpose, a prototype of a website with "extended real-time interaction (eEI)" capabilities has been implemented and tested. The prototype was evaluated by a methodology based on the five-dimensional "e-service quality" measure after Gwo-Guang Lee und Hsiu-Fen Lin. The Foundation of the "e-service quality" measure is the SERVQUAL-Model. A statistical analysis of the user study results showed a significant correlation between eEI and user satisfaction. Before the actual realization of eEI, the "Technology Acceptance Model" after Fred D. Davis was used to investigate currently used real-time interaction systems.
Empirical studies in software engineering use software repositories as data sources to understand software development. Repository data is either used to answer questions that guide the decision-making in the software development, or to provide tools that help with practical aspects of developers’ everyday work. Studies are classified into the field of Empirical Software Engineering (ESE), and more specifically into Mining Software Repositories (MSR). Studies working with repository data often focus on their results. Results are statements or tools, derived from the data, that help with practical aspects of software development. This thesis focuses on the methods and high order methods used to produce such results. In particular, we focus on incremental methods to scale the processing of repositories, declarative methods to compose a heterogeneous analysis, and high order methods used to reason about threats to methods operating on repositories. We summarize this as technical and methodological improvements. We contribute the improvements to methods and high-order methods in the context of MSR/ESE to produce future empirical results more effectively. We contribute the following improvements. We propose a method to improve the scalability of functions that abstract over repositories with high revision count in a theoretically founded way. We use insights on abstract algebra and program incrementalization to define a core interface of highorder functions that compute scalable static abstractions of a repository with many revisions. We evaluate the scalability of our method by benchmarks, comparing a prototype with available competitors in MSR/ESE. We propose a method to improve the definition of functions that abstract over a repository with a heterogeneous technology stack, by using concepts from declarative logic programming and combining them with ideas on megamodeling and linguistic architecture. We reproduce existing ideas on declarative logic programming with languages close to Datalog, coming from architecture recovery, source code querying, and static program analysis, and transfer them from the analysis of a homogeneous to a heterogeneous technology stack. We provide a prove-of-concept of such method in a case study. We propose a high-order method to improve the disambiguation of threats to methods used in MSR/ESE. We focus on a better disambiguation of threats, operationalizing reasoning about them, and making the implications to a valid data analysis methodology explicit, by using simulations. We encourage researchers to accomplish their work by implementing ‘fake’ simulations of their MSR/ESE scenarios, to operationalize relevant insights about alternative plausible results, negative results, potential threats and the used data analysis methodologies. We prove that such way of simulation based testing contributes to the disambiguation of threats in published MSR/ESE research.