Business rules have become an important tool for ensuring compliance in business processes. However, a collection of business rules can contain various conflicting elements, which can lead to a violation of the compliance to be achieved. These conflicting elements are therefore a kind of inconsistency, or quasi-inconsistency, in the business rule base. The goal of this thesis is to investigate how such quasi-inconsistencies in business rules can be detected and analyzed. To this end, we develop a comprehensive library that allows results from the scientific field of inconsistency measurement to be applied to business rule formalisms that are actually used in practice.
Web application testing is an active research area. Garousi et al. did a systematic mapping study and classified 79 papers published between 2000-2011. However, there seems to be a lack of information exchange between the scientific community and tool developers.
This thesis systematically analyzes the field of functional, system level web application testing tools. 194 candidate tools were collected in the tool search and screened, with 23 tools being selected as foundation of this thesis. These 23 tools were systematically used to generate a feature model of the domain. The methodology to support this is an additional contribution of this thesis. It processes end user documentation of tools belonging to an examined domain and creates a feature model. The feature model gives an overview over the existing features, their alternatives and their distribution. It can be used to identify trends and problems, extraordinary features, help decision making of tool purchase or guide scientists how to focus research.
The development of a pan-European public E-Procurement system is an important target of the European Union to enhance the efficiency, transparency and competitiveness of public procurement procedures conducted within the European single market. A great obstacle for cross-border electronic procurement is the heterogeneity of national procurement systems in terms of technical, organizational and legal differences. To overcome this obstacle the European Commission funds several initiatives that contribute to the aim of achieving interoperability for pan-European public procurement. Pan European Public Procurement OnLine (PEPPOL) is one of these initiatives that aims at piloting an interoperable pan-European E-Procurement solution to support businesses and public purchasing entities from different member states to conduct their procurement processes electronically.
As interoperability and inter-connection of distributed heterogeneous information systems are the major requirements in the European procurement domain, and the VCD sub-domain in particular, service-oriented architecture (SOA) seems to provide a promising approach to realize such an architecture, as it promotes loose coupling and interoperability. This master thesis therefore discusses the SOA approach and how its concepts, methodologies and technologies can be used for the development of interoperable IT systems for electronic public procurement. This discussion is enhanced through a practical application of the discussed SOA methodologies by conceptualizing and prototyping of a sub-system derived from the overall system domain of the Virtual Company Dossier. For that purpose, important aspects of interoperability and related standards and technologies will be examined and put into the context of public electronic procurement. Furthermore, the paradigm behind SOA will be discussed, including the derivation of a top-down development methodology for service-oriented systems.
With the emergence of current generation head-mounted displays (HMDs), virtual reality (VR) is regaining much interest in the field of medical imaging and diagnosis. Room-scale exploration of CT or MRI data in virtual reality feels like an intuitive application. However in VR retaining a high frame rate is more critical than for conventional user interaction seated in front of a screen. There is strong scientific evidence suggesting that low frame rates and high latency have a strong influence on the appearance of cybersickness. This thesis explores two practical approaches to overcome the high computational cost of volume rendering for virtual reality. One lies within the exploitation of coherency properties of the especially costly stereoscopic rendering setup. The main contribution is the development and evaluation of a novel acceleration technique for stereoscopic GPU ray casting. Additionally, an asynchronous rendering approach is pursued to minimize the amount of latency in the system. A selection of image warping techniques has been implemented and evaluated methodically, assessing the applicability for VR volume rendering.
Advanced Auditing of Inconsistencies in Declarative Process Models using Clustering Algorithms
(2021)
To keep an organization's business process compliant, it is essential to ensure a consistent process. Whether a process is consistent depends on the business rules of that process. If the process adheres to these business rules, then the process is compliant and efficient. For huge processes, this is quite a challenge. An inconsistency in a process can very quickly lead to a non-functional process, which is a severe problem for organizations. This thesis presents a novel auditing approach for handling inconsistencies from a post-execution perspective. The tool identifies the run-time inconsistencies and visualizes them in heatmaps. These plots aim to help modelers observe the most problematic constraints and make the right remodeling decisions. Modelers can set many variables in the tool to obtain different heatmap representations that help them grasp all perspectives of the problem. The heatmap sorts and shows the run-time inconsistency patterns, so that the modeler can decide which constraints are highly problematic and should be remodeled. The tool can be applied to real-life data sets in a reasonable run-time.
Despite the inception of new technologies at a breakneck pace, many analytics projects fail mainly due to the use of incompatible development methodologies. As big data analytics projects are different from software development projects, the methodologies used in software development projects could not be applied in the same fashion to analytics projects. The traditional agile project management approaches to the projects do not consider the complexities involved in the analytics. In this thesis, the challenges involved in generalizing the application of agile methodologies will be evaluated, and some suitable agile frameworks which are more compatible with the analytics project will be explored and recommended. The standard practices and approaches which are currently applied in the industry for analytics projects will be discussed concerning enablers and success factors for agile adaption. In the end, after the comprehensive discussion and analysis of the problem and complexities, a framework will be recommended that copes best with the discussed challenges and complexities and is generally well suited for the most data-intensive analytics projects.
Software systems are often developed as a set of variants to meet diverse requirements. Two common approaches to this are "clone-and-own" and software product lines. Both approaches have advantages and disadvantages. In previous work, we and our collaborators proposed an idea that combines both approaches to manage variants, similarities, and cloning by using a virtual platform and cloning-related operators.
In this thesis, we present an approach for aggregating essential metadata to enable a propagate operator, which implements a form of change propagation. For this we have developed a system to annotate code similarities which were extracted throughout the history of a software repository. The annotations express similarity maintenance tasks, which can then either be executed automatically by propagate or have to be performed manually by the user. In this work we outline the automated metadata extraction process and the system for annotating similarities; we explain how the implemented system can be integrated into the workflow of an existing version control system (Git); and, finally, we present a case study using the 101haskell corpus of variants.
This study explores which factors influence entrepreneurial intention among students in the construction industry, specifically students of Hanoi Construction University and Hanoi Architecture University, and examines the impact of those factors. Based on the findings, the study also suggests some solutions for entrepreneurship in the construction field in Vietnam as a starting point for future study. The theory of planned behavior is used as the theoretical framework for this study. Both qualitative and quantitative methods are employed. The questionnaire will be conducted among students of the two universities mentioned above. Then, an exploratory factor analysis (EFA) will be performed to test the validity of the constructs. The research findings identify the factors influencing entrepreneurial intention and their impact, and propose some solutions to improve entrepreneurship in the construction field in Vietnam.
The purpose of this thesis is to explore the sentiment distributions of Wikipedia concepts.
We analyse the sentiment of the entire English Wikipedia corpus, which includes 5,669,867 articles and 1,906,375 talks, by using a lexicon-based method with four different lexicons.
Also, we explore the sentiment distributions from a time perspective using the sentiment scores obtained from our selected corpus. The results obtained have been compared not only between articles and talks but also among four lexicons: OL, MPQA, LIWC, and ANEW.
Our findings show that among the four lexicons, MPQA has the highest sensitivity and ANEW has the lowest sensitivity to emotional expressions. Wikipedia articles show more sentiments than talks according to OL, MPQA, and LIWC, whereas Wikipedia talks show more sentiments than articles according to ANEW. Besides, the sentiment has a trend regarding time series, and each lexicon has its own bias regarding text describing different things.
Moreover, our research provides three interactive widgets for visualising sentiment distributions for Wikipedia concepts regarding the time and geolocation attributes of concepts.
Since the invention of U-net architecture in 2015, convolutional networks based on its encoder-decoder approach significantly improved results in image analysis challenges. It has been proven that such architectures can also be successfully applied in different domains by winning numerous championships in recent years. Also, the transfer learning technique created an opportunity to push state-of-the-art benchmarks to a higher level. Using this approach is beneficial for the medical domain, as collecting datasets is generally a difficult and expensive process.
In this thesis, we address the task of semantic segmentation with Deep Learning and make three main contributions and release experimental results that have practical value for medical imaging.
First, we evaluate the performance of four neural network architectures on the dataset of the cervical spine MRI scans. Second, we use transfer learning from models trained on the Imagenet dataset and compare it to randomly initialized networks. Third, we evaluate models trained on the bias field corrected and raw MRI data. All code to reproduce results is publicly available online.
The erosion of the closed innovation paradigm in conjunction with increasing competitive pressure has boosted the interest of both researchers and organizations in open innovation. Despite such rising interest, several companies remain reluctant to open their organizational boundaries to practice open innovation. Among the many reasons for such reservation are the pertinent complexity of transitioning toward open innovation and a lack of understanding of the procedures required for such endeavors. Hence, this thesis sets out to investigate how organizations can open their boundaries to successfully transition from closed to open innovation by analyzing the current literature on open innovation. In doing so, the transitional procedures are structured and classified into a model comprising three phases, namely unfreezing, moving, and institutionalizing of changes. Procedures of the unfreezing phase lay the foundation for a successful transition to open innovation, while procedures of the moving phase depict how the change occurs. Finally, procedures of the institutionalizing phase contribute to the sustainability of the transition by employing governance mechanisms and performance measures. Additionally, the individual procedures are characterized along with their corresponding barriers and critical success factors. As a result of this structured depiction of the transition process, a guideline is derived. This guideline includes the commonly employed actions of successful practitioners of open innovation, which may serve as a baseline for interested parties of the paradigm. With the derivation of the guideline and concise depiction of the individual transitional phases, this thesis consequently reduces the overall complexity and increases the comprehensibility of the transition and its implications for organizations.
The purpose of this thesis is to focus on the critical research challenges and topics surrounding UI/UX design principles, with an emphasis on cross-cultural concepts from the perspective of e-learning platforms. To this end, we first consider the cultural dimensions based on the Hofstede framework with the aim of identifying important cultural values. As the second aim of the research, a set of criteria known as Nielsen's usability heuristics facilitates the detection of usability problems in user interface (UI) design. The usability heuristics comprise ten variables that influence the interaction between the user and a product or system. By examining these topics more closely, we will be able to uncover a matrix of relationships between Nielsen's heuristic evaluation and Geert Hofstede's cultural framework. Finally, we discuss the potential of cultural values to influence user interfaces for e-learning platforms. Indeed, there are some features in e-learning platforms that are less discussed because of culture, although they can be integrated into the platforms very practically.
The thesis develops and evaluates a hypothetical model of the factors that influence user acceptance of weblog technology. Previous acceptance studies are reviewed, and the various models employed are discussed. The eventual model is based on the technology acceptance model (TAM) by Davis et al. It conceptualizes and operationalizes a quantitative survey conducted by means of an online questionnaire, strictly from a user perspective. Finally, it is tested and validated by applying methods of data analysis.
This thesis focuses on approximate inference in assumption-based argumentation frameworks. Argumentation provides a significant idea in the computerization of theoretical and practical reasoning in AI. And it has a close connection with AI, engaging in arguments to perform scientific reasoning. The fundamental approach in this field is abstract argumentation frameworks developed by Dung. Assumption-based argumentation can be regarded as an instance of abstract argumentation with structured arguments. When facing a large scale of data, a challenge of reasoning in assumption-based argumentation is how to construct arguments and resolve attacks over a given claim with minimal cost of computation and acceptable accuracy at the same time. This thesis proposes and investigates approximate methods that randomly select and construct samples of frameworks based on graphical dispute derivations to solve this problem. The presented approach aims to improve reasoning performance and get an acceptable trade-off between computational time and accuracy. The evaluation shows that for reasoning in assumption-based argumentation, in general, the running time is reduced with the cost of slightly low accuracy by randomly sampling and constructing inference rules for potential arguments over a query.
Assessing ChatGPT’s Performance in Analyzing Students’ Sentiments: A Case Study in Course Feedback
(2024)
The emergence of large language models (LLMs) like ChatGPT has impacted fields such as education, transforming natural language processing (NLP) tasks like sentiment analysis. Transformers form the foundation of LLMs, with BERT, XLNet, and GPT as key examples. ChatGPT, developed by OpenAI, is a state-of-the-art model and its ability in natural language tasks makes it a potential tool in sentiment analysis. This thesis reviews current sentiment analysis methods and examines ChatGPT’s ability to analyze sentiments across three labels (Negative, Neutral, Positive) and five labels (Very Negative, Negative, Neutral, Positive, Very Positive) on a dataset of student course reviews. Its performance is compared with fine tuned state-of-the-art models like BERT, XLNet, bart-large-mnli, and RoBERTa-large-mnli using quantitative metrics. With the help of 7 prompting techniques which are ways to instruct ChatGPT, this work also analyzed how well it understands complex linguistic nuances in the given texts using qualitative metrics. BERT and XLNet outperform ChatGPT mainly due to their bidirectional nature, which allows them to understand the full context of a sentence, not just left to right. This, combined with fine-tuning, helps them capture patterns and nuances better. ChatGPT, as a general purpose, open-domain model, processes text unidirectionally, which can limit its context understanding. Despite this, ChatGPT performed comparably to XLNet and BERT in three-label scenarios and outperformed others. Fine-tuned models excelled in five label cases. Moreover, it has shown impressive knowledge of the language. Chain-of-Thought (CoT) was the most effective technique for prompting with step by step instructions. ChatGPT showed promising performance in correctness, consistency, relevance, and robustness, except for detecting Irony. As education evolves with diverse learning environments, effective feedback analysis becomes increasingly valuable. 
Addressing ChatGPT’s limitations and leveraging its strengths could enhance personalized learning through better sentiment analysis.
This thesis analyzes the online attention towards scientists and their research topics. The studies compare the attention dynamics towards the winners of important scientific prizes with scientists who did not receive a prize. Web signals such as Wikipedia page views, Wikipedia edits, and Google Trends were used as a proxy for online attention. One study focused on the time between the creation of the article about a scientist and the articles about their research topics. It was discovered that articles about research topics were created closer in time to the articles of prize winners than to those of scientists who did not receive a prize. One possible explanation could be that the research topics are more closely related to the scientist who got an award. This supports the view that scientists who received the prize introduced the topics to the public. Another study considered the public attention trends towards the related research topics before and after a page of a scientist was created. It was observed that after a page about a scientist was created, research topics of prize winners received more attention than the topics of scientists who did not receive a prize. Furthermore, it was demonstrated that Nobel Prize winners get a lower amount of attention before receiving the prize than the potential nominees from the list of Citation Laureates of Thomson Reuters. Also, their popularity declines faster after receiving it. It was also shown that it is difficult to predict the prize winners based on the attention dynamics towards them.
Mobile payment has been a payment option in the market for a long time now and was predicted to become a widely used payment method. However, over the years, the market penetration rate of mPayments has been relatively low, despite it having all characteristics required of a convenient payment method. The primary reason for this has been cited as a lack of customer acceptance, caused mainly by the lack of perceived security on the part of the end-user. Although biometric authentication is not a new technology, it is experiencing a revival in the light of the present-day terror threats and increased security requirements in various industries. The application of biometric authentication in mPayments is analysed here and a suitable biometric authentication method for use with mPayments is recommended. The issue of enrolment and the human and technical factors to be considered are discussed, and the STOF business model is applied to a BiMoP (biometric mPayment) application.
Blockchain in Healthcare
(2020)
The underlying characteristics of blockchain can facilitate data provenance, data integrity, data security, and data management. It has the potential to transform the healthcare sector. Since the introduction of Bitcoin in the fintech industry, blockchain technology has been gaining a lot of traction, and its purpose is not limited to finance. This thesis highlights the inner workings of blockchain technology and its application areas with possible existing solutions. Blockchain could lay the path for a new revolution in conventional healthcare systems. We present how individual sectors within the healthcare industry could use blockchain and which solutions exist. We also present our own concept, based on the Hyperledger framework, to improve the existing paper-based prescription management system. The results of this work suggest that healthcare can benefit from blockchain technology, which brings new ways in which patients can be treated.
The growing numbers of breeding rooks (Corvus frugilegus) in the city of Landau (Rhineland-Palatinate, Germany) increase the potential for conflict between rooks and humans, which is mainly associated with noise and faeces. Therefore, the aim of this work is a better understanding of the breeding tree selection of the rook in order to develop options for action and management in the future.
Part I of this thesis provides general background information on the rook and includes mapping of the rookeries in the Anterior Palatinate and South Palatinate including Landau in the year 2020. That mapping revealed that the number of rural colonies has decreased, while the number of urban colonies has increased in the study area in the last few years. In line with current literature, tree species and tree size were important criteria for breeding tree selection. However, the mapping showed that additional factors must be important as well.
Therefore, as rooks seem to often breed along traffic axes, Part II of this thesis examines how temperature, artificial light and noise, which are all linked to traffic axes, affect the breeding tree selection of the rook in the city of Landau. The following three hypotheses are developed: (1) manually selected breeding trees (Bm) have a warmer microclimate than manually selected non-breeding trees (Nm) or randomly selected non-breeding trees (Nr), (2) Bm are exposed to a higher light level than Nm or Nr and (3) Bm are exposed to a higher noise level than Nm or Nr. To test these hypotheses, 15 Bm, 13 Nm and 16 Nr are investigated.
The results show that Bm were exposed to more noise than both types of non-breeding trees (μBm, noise = 36.52481 dB, μNm, noise = 31.27229 dB, μNr, noise = 29.17417 dB) where the difference between Bm and Nr was significant. In addition, there was a tendency for Bm to be exposed to less light (μBm, light = 0.356 lx) than Nm (μNm, light = 0.4107692 lx) and significantly less light than Nr (μNr, light = 1.995 lx), while temperature did not differ between the groups (μBm, temp = 16.90549 °C, μNm, temp = 16.93118 °C, μNr, temp = 17.28639 °C).
This study shows for the first time that rooks prefer trees which are exposed to low light levels and high noise levels, i.e. more intense traffic noise, for breeding. It can only be speculated that the cause of this is lower enemy pressure at such sites. The fact that temperature does not seem to have any influence on breeding tree selection may be due to only small temperature differences at nest height, which might be compensated by breeding behaviour. Consequently, in the long term one management approach could be to divert traffic from inner-city areas, especially schools and hospitals, to bypasses. If tree genera suitable for rooks, such as plane trees, are planted along the bypasses, those sites could provide suitable alternative habitats to inner-city breeding locations, which become less attractive for breeding due to noise reduction. In the short term in addition to locally implemented repellent measures the most effective approach is to strengthen rook acceptance among the population. However, further research is needed to verify the results of this thesis and to gain further insights into rook breeding site selection in order to develop effective management measures.
This thesis is devoted to the topic of challenges and solutions for human resources management (HRM) in international organizations. The aim is to investigate methodological approaches to the assessment of HRM challenges and solutions, to apply them in practice, and to develop ways of improving the HRM of a particular enterprise. The practical research question investigated is “Is the Ongoing Professional Development – Strategic HRM (OPD-SHRM) model a better solution for HRM system of PrJSC “Philip Morris Ukraine”?”
To achieve the aim of this work and to answer the research question, we have studied theoretical approaches to explaining and assessing HRM in section 1, analyzed HRM system of an international enterprise in section 2, and then synthesized theory and practice to find intersection points in section 3.
Research findings indicate that the main challenge of HRM is to balance between individual and organizational interests. Implementation of OPD-SHRM is one of the solutions. Switching focus from satisfaction towards success will bring both tangible and intangible benefits for individuals and organization. In case of PrJSC “Philip Morris Ukraine”, the maximum forecasted increase is 330% in net profit, 350% in labor productivity, and 26% in Employee Development and Engagement Index.
Challenges of Implementing Innovation Strategies at Large Organizations: A case of Lotte Group
(2023)
For many decades, one of the most important focuses of research has been determining whether there is a correlation between the size of an organization and its level of innovation. Unlike small companies, large companies often have well-established structures that are hard to change, and change management, especially related to innovation, seems to be much more difficult. Nevertheless, there are many examples that prove the opposite. Some large organizations, such as Apple and Amazon, consistently show great innovation efforts and keep changing in a positive way. Therefore, the aim of this thesis is to discuss how large organizations can implement innovation despite having many drawbacks compared to SMEs. Using a qualitative research approach, the researcher explored essential information on the innovation strategies that large companies use in order to innovate and on how they can overcome existing challenges, by studying the working process of Lotte Group, one of the biggest companies in Korea.
Coordination and awareness mechanisms are important in systems for Computer-Supported Cooperative Work (CSCW) and traditional groupware systems. They have been a key focus of research into collaborative groupware and its capability to enable people to efficiently collaborate and coordinate work. Until now, no classification of the mechanisms has been undertaken to identify commonalities and differences in coordination and awareness mechanisms and to show their significance in collaborative environments. In addition, there has been little investigation of coordination and awareness mechanisms in new forms of groupware such as socially enabled Enterprise Collaboration Systems (ECS). Indeed, both in science and in practice, ECS incorporating social software have become increasingly important. Based on the combination of traditional groupware and social software, ECS also include coordination and awareness mechanisms that may simplify collaboration, but these have not yet been investigated.
Therefore, the aim of this thesis is to identify coordination and awareness mechanisms in the academic literature to provide a general overview of examples of those mechanisms. Additionally, this thesis aims to classify the mechanism examples. Based on an in-depth literature analysis, concepts described in the literature are chosen and applied with the intention of analysing the mechanisms and arriving at a classification. Based on the classification of the identified mechanisms, their commonalities and differences are examined and described to gain a better understanding of them. For illustration purposes, examples of coordination and awareness mechanisms and their application are portrayed. The mechanism examples refer to the classification groups derived. The selection of the mechanisms for the visualization is based on significant differences in their functionality. Subsequently, the selected mechanisms, which are based more on traditional groupware, are checked to a limited extent to determine whether they can be found in socially enabled ECS. The collaborative platform IBM Connections serves as a practical example of ECS incorporating social software. IBM Connections is used at the University of Koblenz to run the platform "UniConnect". On the platform, it is investigated which of the mechanism examples identified in the literature are applied in IBM Connections and which additional mechanisms are created by users. This work is the first step in the study of coordination and awareness mechanisms in socially enabled ECS. In addition, new mechanisms are expected to be detected, since the social factor in collaborative work is new.
The purpose of this thesis is to collect and examine examples of coordination and awareness mechanisms in the literature in order to analyse them. Additionally, the purpose is to provide a first overview of mechanisms and to classify them by investigating their commonalities. Besides this, the thesis should give an incentive for further investigations of coordination and awareness mechanisms in socially integrated ECS.
Commonsense reasoning can be seen as a process of identifying dependencies amongst events and actions. Understanding the circumstances surrounding these events requires background knowledge with sufficient breadth to cover a wide variety of domains. In the recent decades, there has been a lot of work in extracting commonsense knowledge, a number of these projects provide their collected data as semantic networks such as ConceptNet and CausalNet. In this thesis, we attempt to undertake the Choice Of Plausible Alternatives (COPA) challenge, a problem set with 1000 questions written in multiple-choice format with a premise and two alternative choices for each question. Our approach differs from previous work by using shortest paths between concepts in a causal graph with the edge weight as causality metric. We use CausalNet as primary network and implement a few design choices to explore the strengths and drawbacks of this approach, and propose an extension using ConceptNet by leveraging its commonsense knowledge base.
In recent years, head-mounted displays (HMDs) and their ability to create virtual realities comparable to the real world have moved further into the focus of press coverage and consumers. The reason for this lies in constant improvements in available computing power, the miniaturisation of components and constantly shrinking power consumption. These trends originate in the general technical progress driven by advancements made in the smartphone sector. This gives more people than ever access to the components required to create these virtual realities. At the same time, however, there is only limited research that uses the current generation of HMDs, especially research comparing the virtual and the real world. The approach of this thesis is to look into the process of navigating both real and virtual spaces while using modern hardware and software. One of the key areas is spatial and peripheral perception, without which it would be difficult to navigate a given space. The influence of prior real and virtual experiences on this perception is another key aspect. The final area of focus is the influence on the emotional state and how it compares to the real world. To research these influences, an experiment using the Oculus Rift DK2 HMD will be held in which subjects are guided through a real space as well as a virtual model of it. Data will be gathered quantitatively by means of surveys. Finally, the findings will be discussed based on a statistical evaluation. During these tests, the perception of distances and room size will be compared, along with how it changes depending on the current reality. Furthermore, the influence of prior spatial activities in both the real and the virtual world will be looked into. Lastly, it will be checked how real these virtual worlds are and whether they are sufficiently sophisticated to trigger the same emotional responses as the real world.
Large amounts of qualitative data make the use of computer-assisted methods for their analysis inevitable. In this thesis, Text Mining as an interdisciplinary approach, as well as the methods established in the empirical social sciences for analyzing written utterances, are introduced. On this basis, a process for extracting concept networks from texts is outlined and the possibilities of utilizing natural language processing methods within it are highlighted. The core of this process is text processing, whose execution requires software solutions that support manual as well as automated work. The requirements to be met by these solutions are presented against the background of the initiating project GLODERS, which is devoted to investigating extortion racket systems as part of the global financial system, and their fulfilment by the two most preeminent candidates is reviewed. The gap between theory and practical application is closed by a prototypical application of the method to a data set of the research project utilizing the two given software solutions.
Constituent parsing attempts to extract syntactic structure from a sentence. Such parsing systems are helpful in many NLP applications such as grammar checking, question answering, and information extraction. This thesis is about implementing a constituent parser for the German language using neural networks. In the past, recurrent neural networks have been used to build parsers as well as many other NLP applications. In this work, self-attention neural network modules are used intensively to understand sentences effectively. With multilayered self-attention networks, constituent parsing achieves a 93.68% F1 score. This is improved even further by using both character and word embeddings as a representation of the input. An F1 score of 94.10% was the best achieved by the constituent parser using only the dataset provided. With the help of external datasets such as German Wikipedia, pre-trained ELMo models are used along with self-attention networks, achieving a 95.87% F1 score.
Code package managers like Cabal track dependencies between packages. But packages rarely use the functionality that their dependencies provide. This leads to unnecessary compilation of unused parts and to speculative conflicts between package versions where there are no conflicts. In two case studies we show how relevant these two problems are. We then describe how we could avoid them by tracking dependencies not between packages but between individual code fragments.
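The difference between package-level and fragment-level tracking can be sketched as follows. The fragment names, the dependency table and the helper functions are hypothetical and exist only for illustration; the sketch is in Python rather than Haskell and does not reproduce the mechanism proposed in the thesis.

```python
# Hypothetical fragment-level dependency table: "package.fragment" -> fragments it uses.
FRAGMENT_DEPS = {
    "app.main": ["text.pack", "json.encode"],
    "json.encode": ["text.pack"],
    "json.decode": ["text.unpack", "parser.run"],
    "text.pack": [],
    "text.unpack": [],
    "parser.run": [],
}


def reachable_fragments(deps, root):
    """Return every fragment transitively used by root, including root itself."""
    seen = set()
    stack = [root]
    while stack:
        fragment = stack.pop()
        if fragment in seen:
            continue
        seen.add(fragment)
        stack.extend(deps.get(fragment, []))
    return seen


def package_of(fragment):
    """Package name is assumed to be the part before the first dot."""
    return fragment.split(".", 1)[0]


needed = reachable_fragments(FRAGMENT_DEPS, "app.main")
needed_packages = {package_of(fragment) for fragment in needed}
print(sorted(needed))           # fragments that actually have to be compiled
print(sorted(needed_packages))  # packages that are actually touched
```

In this toy example, package-level tracking would pull in `parser` because `json` depends on it, whereas fragment-level tracking shows that `app.main` never reaches `json.decode` and therefore needs neither `parser.run` nor any particular version of `parser`.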
This thesis explores and examines the effectiveness and efficacy of traditional machine learning (ML), advanced neural network (NN) and state-of-the-art deep learning (DL) models for identifying mental distress indicators in social media discourse on Reddit and Twitter, as these platforms are heavily used by teenagers. Different NLP vectorization techniques such as TF-IDF, Word2Vec, GloVe, and BERT embeddings are employed with ML models such as Decision Tree (DT), Random Forest (RF), Logistic Regression (LR) and Support Vector Machine (SVM), followed by NN models such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM), to methodically analyse their impact as feature representations for the models. DL models such as BERT, DistilBERT, MentalRoBERTa and MentalBERT are fine-tuned end to end for the classification task. This thesis also compares different text preprocessing techniques such as tokenization, stopword removal and lemmatization to assess their impact on model performance. Systematic experiments with different configurations of vectorization and preprocessing techniques across the different model types and categories have been carried out to find the most effective configurations and to gauge the models' strengths, limitations, and capability to detect and interpret mental distress indicators in text. The analysis of the results reveals that the MentalBERT DL model significantly outperformed all other model types and categories: its specific pretraining on mental health data and rigorous end-to-end fine-tuning gave it an edge in detecting nuanced linguistic mental distress indicators in a complex, context-dependent text corpus. These insights underline the high potential of ML and NLP technologies for developing complex AI systems for intervention in the domain of mental health analysis.
This thesis lays the foundation for and gives direction to future work, demonstrating the need for a collaborative approach among experts from different domains and for exploring next-generation large language models in order to develop robust and clinically approved mental health AI systems.
The World Wide Web (WWW) has become a very important communication channel. Its usage has grown steadily over the years. Website owners have been interested in identifying user behaviour ever since Tim Berners-Lee developed the first web browser in 1990. But as the influence of the online channel today eclipses all other media, the interest in monitoring website usage and user activities has intensified as well. Gathering and analysing data about the usage of websites can help to understand customer behaviour, improve services and potentially increase profit.
It is further essential for ensuring effective website design and management, efficient mass customization and effective marketing. Web Analytics (WA) is the area addressing these considerations. However, changing technologies and evolving Web Analytics methods and processes present a challenge to organisations starting Web Analytics programmes. Because they lack resources in various areas and run other types of websites, small and medium-sized enterprises (SMEs) and non-profit organisations in particular struggle to operate WA effectively.
This research project aims to identify the existing gap between theory, tool possibilities and business needs for undertaking Web Analytics programmes. The topic was therefore examined from three different perspectives: the academic literature, Web Analytics tools and an interpretative case study. The researcher used an action research approach to investigate Web Analytics, presenting a holistic overview and identifying the gaps that exist. The outcome of this research project is an overall framework that provides guidance for SMEs operating information websites on how to proceed in a Web Analytics programme.
Magnetic resonance (MR) tomography is an imaging method that is used to expose the structure and function of tissues and organs in the human body for medical diagnosis. Diffusion-weighted (DW) imaging is a specific MR imaging technique that enables us to gain insight into the connectivity of white matter pathways noninvasively and in vivo. It allows predictions to be made about the structure and integrity of those connections. In clinical routine, this modality is applied in the planning phase of neurosurgical operations, such as tumor resections. This is especially helpful if the lesion is deeply seated in a functionally important area, where there is a risk of damage. This work reviews the concepts of MR imaging and DW imaging. Generally, at the current resolution of diffusion-weighted data, single white matter axons cannot be resolved. The captured signal instead describes whole fiber bundles. Besides this, different complex fiber configurations, such as crossings, splittings and fannings, often occur in a single voxel. For this reason, the main goal is to assist tractography algorithms, which are often confounded in such complex regions. Tractography is a method which uses local information to reconstruct global connectivities, i.e. fiber tracts. In the course of this thesis, existing reconstruction methods such as diffusion tensor imaging (DTI) and q-ball imaging (QBI) are evaluated on synthetically generated data and real human brain data, and the amount of valuable information provided by the individual reconstruction methods and their corresponding limitations are investigated. The output of QBI is the orientation distribution function (ODF), whose local maxima coincide with the underlying fiber architecture. We determine those local maxima. Furthermore, we propose a new voxel-based classification scheme based on diffusion tensor metrics.
The main contribution of this work is the combination of voxel-based classification, local maxima from the ODF and global information from a voxel neighborhood, which leads to the development of a global classifier. This classifier validates the detected ODF maxima and enhances them with neighborhood information. Hence, specific asymmetric fibrous architectures can be determined. The outputs of the global classifier are potential tracking directions. Subsequently, a fiber tractography algorithm is designed that integrates along the potential tracking directions and is able to reproduce splitting fiber tracts.
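As an illustration of what a diffusion tensor metric looks like, the following sketch computes fractional anisotropy (FA), a standard scalar derived from the three eigenvalues of a diffusion tensor. The abstract does not state which metrics the classification scheme relies on, so the choice of FA here is an assumption, and the function name is hypothetical.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """FA from the three diffusion tensor eigenvalues.

    0 means isotropic diffusion, 1 means diffusion along a single axis.
    """
    denom = l1 * l1 + l2 * l2 + l3 * l3
    if denom == 0:
        return 0.0  # no diffusion signal at all
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    return math.sqrt(0.5 * num / denom)


# Example eigenvalues (arbitrary units, invented for illustration).
print(fractional_anisotropy(1.0, 1.0, 1.0))  # isotropic voxel
print(fractional_anisotropy(1.7, 0.3, 0.2))  # strongly directional voxel
```

A voxel with a single coherent fiber bundle tends to have a high FA, whereas crossings or fannings lower it, which is why such metrics are a natural input for a voxel-based classifier.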
The Internet of Things (IoT) is a network of addressable, physical objects that contain embedded sensing, communication and actuating technologies to sense and interact with their environment (Geschickter 2015). Like every novel paradigm, the IoT sparks interest throughout all domains both in theory and practice, resulting in the development of systems pushing technology to its limits. These limits become apparent when having to manage an increasing number of Things across various contexts. A plethora of IoT architecture proposals have been developed and prototype products, such as IoT platforms, have been introduced. However, each of these architectures and products applies its very own interpretation of an IoT architecture and its individual components, so that the IoT is currently more an Intranet of Things than an Internet of Things (Zorzi et al. 2010). Thus, this thesis aims to develop a common understanding of the elements forming an IoT architecture and to provide high-level specifications in the form of a Holistic IoT Architecture Framework.
Design Science Research (DSR) is used in this thesis to develop the architecture framework based on the pertinent literature. The development of the Holistic IoT Architecture Framework includes the identification of two new IoT Architecture Perspectives that became apparent during the analysis of the IoT architecture proposals identified in the extant literature. While applying these novel perspectives, the need for a new component for the architecture framework, which was merely implicitly mentioned in the literature, became obvious as well. The components of various IoT architecture proposals as well as the novel component, the Thing Management System, were combined, consolidated and related to each other to develop the Holistic IoT Architecture Framework. Subsequently, it was shown that the specifications of the architecture framework are suitable to guide the implementation of a prototype.
This contribution provides a common understanding of the basic building blocks, actors and relations of an IoT architecture.
Digital Transformation Maturity of Vietnam Aviation Industry: The Effect of Organizational Readiness
(2023)
The paper studies digital transformation maturity in the context of the aviation industry in Vietnam. Digital transformation can mean enhancing existing processes, finding new opportunities within existing business domains, or finding new opportunities outside existing business domains. In the post-Covid-19 era, digital transformation will play a vital role in the recovery, with digital technology supporting the communication and implementation of new projects or changes.
Digital transformation and digital transformation maturity are sometimes used interchangeably, but they are two different concepts. This paper explains the differences and applies digital transformation maturity as a scale for digital transformation.
Due to the lack of empirical research on the relationship between digital transformation maturity and organizational readiness, the study explores four components of organizational readiness: digital leadership, digital culture, digital capabilities, and digital partnering.
We present the conceptual and technological foundations of a distributed natural language interface employing a graph-based parsing approach. The parsing model developed in this thesis generates a semantic representation of a natural language query in a 3-staged, transition-based process using probabilistic patterns. The semantic representation of a natural language query is modeled in terms of a graph, which represents entities as nodes connected by edges representing relations between entities. The presented system architecture provides the concept of a natural language interface that is both independent in terms of the included vocabularies for parsing the syntax and semantics of the input query, as well as the knowledge sources that are consulted for retrieving search results. This functionality is achieved by modularizing the system's components, addressing external data sources by flexible modules which can be modified at runtime. We evaluate the system's performance by testing the accuracy of the syntactic parser, the precision of the retrieved search results as well as the speed of the prototype.
The internet is becoming more and more important in daily life. Fundamental changes can be observed in the private sector as well as in the public sector. In the course of this, the active involvement of citizens in planning political procedures is increasingly supported electronically. The expectations culminate in the assumption that information and communication technology (ICT) can enhance civic participation and reduce disenchantment with politics. Out of these expectations, many e-participation projects were initiated in Germany. Initiatives were established, e.g. the "Initiative eParticipation", which gave policy makers and administrations many incentives to strengthen decision-making processes with internet-supported participation practices. This thesis consists of two major parts. In the first part, definitions of the essential terms are presented. The position of e-participation within the dimension of e-business is pointed out. In order to explain e-participation, the basics of classical offline participation are presented. It will be shown that a change is in progress, and not only because of the deployment of ICT. Subsequently, a framework to characterize e-participation is presented. The European Union is encouraging the implementation of e-participation, so the city of Koblenz should be no exception. But what is the current situation in Koblenz? To answer this question, the status quo was examined with the help of a survey among the citizens of Koblenz, which was developed, conducted and evaluated. This is the second major part of this thesis.
Development of a Path-Following Control Method for a Model Vehicle with a Semi-Trailer (Entwicklung eines Regelungsverfahrens zur Pfadverfolgung für ein Modellfahrzeug mit Sattelanhänger)
(2009)
Besides the progressive automation of internal goods traffic, there is another important area that should be considered: the carriage of goods in selected external areas. The use of driverless trucks in logistics centers can improve economic efficiency. In particular, this requires precise control procedures that keep trucks on predetermined paths. The general aim of this work is the adaptation and evaluation of a path-following control method for articulated vehicles. The differences in kinematic behavior between trucks with a one-axle trailer and semi-trailer vehicles are emphasized. Additionally, the characteristic kinematic properties of semi-trailers are considered for the adaptation of the control procedure. This control procedure was initially designed for trucks with a one-axle trailer. It must work for both forward and backward movements. The control process is integrated as a closed component into the control software of the model vehicle. To this end, the geometry of the model vehicle is specified and the possible special cases of the control process are identified. The work also documents the most relevant software components of the implemented control process.
One task of executives and project managers in IT companies or departments is to hire suitable developers and to assign them to suitable problems. In this paper, we propose a new technique that directly leverages previous work experience of developers in a systematic manner. Existing evidence for developer expertise based on the version history of existing projects is analyzed. More specifically, we analyze the commits to a repository in terms of affected API usage. On these grounds, we associate APIs with developers and thus we assess API experience of developers. In transitive closure, we also assess programming domain experience.
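The association of developers with APIs can be sketched as a simple aggregation over commits. The commit records, the mapping from APIs to programming domains and the function names below are hypothetical; real input would have to be extracted from a version history, and the paper's actual analysis may weight commits differently.

```python
from collections import Counter, defaultdict

# Hypothetical commit records: (developer, APIs whose usage the commit affects).
COMMITS = [
    ("alice", ["java.util", "javax.swing"]),
    ("bob", ["java.sql"]),
    ("alice", ["javax.swing"]),
    ("bob", ["java.util", "java.sql"]),
]

# Hypothetical mapping from API to programming domain.
API_DOMAIN = {
    "javax.swing": "GUI",
    "java.sql": "Database",
    "java.util": "Collections",
}


def api_experience(commits):
    """Count, per developer, how many commits touched each API."""
    experience = defaultdict(Counter)
    for developer, apis in commits:
        experience[developer].update(apis)
    return experience


def domain_experience(experience, api_domain):
    """Lift API experience to programming domains via the API-to-domain mapping."""
    domains = defaultdict(Counter)
    for developer, counts in experience.items():
        for api, n in counts.items():
            domains[developer][api_domain.get(api, "Unknown")] += n
    return domains


by_api = api_experience(COMMITS)
by_domain = domain_experience(by_api, API_DOMAIN)
print(dict(by_api["alice"]))
print(dict(by_domain["bob"]))
```

Counting commits per API is the simplest possible experience measure; the step from APIs to domains mirrors the "transitive" assessment of programming domain experience mentioned above.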
Exploring Academic Perspectives: Sentiments and Discourse on ChatGPT Adoption in Higher Education
(2024)
Artificial intelligence (AI) is becoming more widely used in a number of industries, including the field of education. Applications of AI are becoming crucial for schools and universities, whether for automated evaluation, smart educational systems, individualized learning, or staff support. ChatGPT, an AI-based chatbot, offers coherent and helpful replies based on analyzing large volumes of data. Integrating ChatGPT, a sophisticated Natural Language Processing (NLP) tool developed by OpenAI, into higher education has sparked significant interest and debate. Since the technology has already been adopted by many students and teachers, this study analyzes the sentiments expressed on university websites regarding ChatGPT integration into education by creating a comprehensive sentiment analysis framework using a Hierarchical Residual RSigELU Attention Network (HR-RAN). The proposed framework addresses several challenges in sentiment analysis, such as capturing fine-grained sentiment nuances, including contextual information, and handling complex language expressions in university review data. The methodology involves several steps, including data collection from various educational websites, blogs, and news platforms. The data is preprocessed to handle emoticons, URLs, and tags, and sarcastic text is then detected and removed using the eXtreme Learning Hyperband Network (XLHN). Sentences are then grouped based on similarity, and topics are modeled using the Non-negative Term-Document Matrix Factorization (NTDMF) approach. Lexico-semantic, lexico-structural, and numerical features are extracted. Dependency parsing and coreference resolution are performed to analyze grammatical structures and understand semantic relationships. Word embedding uses the Word2Vec model to capture semantic relationships between words.
The preprocessed text and extracted features are inputted into the HR-RAN classifier to categorize sentiments as positive, negative, or neutral. The sentiment analysis results indicate that 74.8% of the sentiments towards ChatGPT in higher education are neutral, 21.5% are positive, and only 3.7% are negative. This suggests a predominant neutrality among users, with a significant portion expressing positive views and a very small percentage holding negative opinions. Additionally, the analysis reveals regional variations, with Canada showing the highest number of sentiments, predominantly neutral, followed by Germany, the UK, and the USA. The sentiment analysis results are evaluated based on various metrics, such as accuracy, precision, recall, F-measure, and specificity. Results indicate that the proposed framework outperforms conventional sentiment analysis models. The HR-RAN technique achieved a precision of 98.98%, recall of 99.23%, F-measure of 99.10%, accuracy of 98.88%, and specificity of 98.31%. Additionally, word clouds are generated to visually represent the most common terms within positive, neutral, and negative sentiments, providing a clear and immediate understanding of the key themes in the data. These findings can inform educators, administrators, and developers about the benefits and challenges of integrating ChatGPT into educational settings, guiding improvements in educational practices and AI tool development.
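For readers unfamiliar with the evaluation metrics named above, the following sketch shows their standard definitions for a single class treated one-vs-rest. The confusion-matrix counts are invented, the function names are hypothetical, and how the study aggregates the metrics over the three sentiment classes is not stated in the abstract.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from true/false positive/negative counts of one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "specificity": tn / (tn + fp),
    }


def f_measure(precision, recall):
    """F-measure (F1) as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Invented counts, for illustration only.
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))

# Harmonic mean of the precision and recall reported in the abstract.
print(round(100 * f_measure(0.9898, 0.9923), 2))
```

The harmonic mean of the reported precision (98.98%) and recall (99.23%) rounds to 99.10%, which matches the reported F-measure.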
We examine the systematic underrecognition of female scientists (Matilda effect) by exploring the citation network of papers published in the American Physical Society (APS) journals. Our analysis shows that articles written by men (first author, last author and dominant gender of authors) receive more citations than similar articles written by women (first author, last author and dominant gender of authors) after controlling for the journal of publication, year of publication and content of the publication. The statistical significance of the overlap between the lists of references was used as the measure of similarity between articles in our analysis. In addition, we found that men are less likely to cite articles written by women and women are less likely to cite articles written by men. Because the majority of authors who published in APS journals are male (85%), this pattern leads to articles written by men receiving more citations than similar articles written by women. We also observed that the Matilda effect is reduced when articles are published in journals with the highest impact factors. In other words, people's evaluation of articles published in these journals is not significantly affected by the gender of the authors. Finally, we suggest a method that can be applied by editors of academic journals to reduce the evaluation bias to some extent. Editors can identify missing citations using our proposed method to complete bibliographies. This policy can reduce the evaluation bias because we observed that papers written by female scholars (first author, last author, the dominant gender of authors) miss more citations than articles written by male scholars (first author, last author, the dominant gender of authors).
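One common way to quantify the statistical significance of an overlap between two reference lists is a hypergeometric tail probability. The sketch below assumes this test, a fixed pool of citable papers, and hypothetical parameter names; the abstract does not state which test was actually used.

```python
from math import comb


def overlap_p_value(pool_size, refs_a, refs_b, shared):
    """P(overlap >= shared) under random citing.

    Assumes article B picks refs_b references uniformly at random from a pool
    of pool_size papers, refs_a of which are cited by article A.
    """
    upper = min(refs_a, refs_b)
    total = comb(pool_size, refs_b)
    tail = sum(
        comb(refs_a, k) * comb(pool_size - refs_a, refs_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


# Toy numbers, for illustration only: 3 shared references out of 3 each, pool of 10.
print(overlap_p_value(pool_size=10, refs_a=3, refs_b=3, shared=3))
```

A small p-value means the two articles share more references than chance would explain, so they can be treated as similar in content.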
On-screen interactive presentations have recently gained immense popularity in the domain of attentive interfaces. These attentive screens adapt their behavior according to the user's visual attention. This thesis aims to introduce an application that enables these attentive interfaces to change their behavior according not just to gaze data but also to facial features and expressions. The modern era requires new ways of communication and publication for advertising. Ads need to be more specific to people's interests, age, and gender. When advertising, it is important to get a reaction from the user, but not every user is interested in providing feedback. In such a context, more advanced techniques are required that collect users' feedback effortlessly. The main problem this thesis intends to resolve is to apply advanced techniques of gaze and face recognition to collect data about users' reactions to different ads being played on interactive screens. We aim to create an application that enables attentive screens to detect a person's facial features, expressions, and eye gaze. With eye gaze data we can determine interests, and with facial features, age and gender can be determined. All this information will help in optimizing the advertisements.
Most social media platforms allow users to freely express their opinions, feelings, and beliefs. However, in recent years the growing propagation of hate speech, offensive language, racism and sexism on social media outlets has drawn attention from individuals, companies, and researchers. Today, sexism both online and offline, in different forms including blatant, covert, and subtle language, is a common phenomenon in society. A notable amount of work has been done on identifying sexist content and computationally detecting sexism that exists online. Although previous efforts have mostly used people's activities on social media platforms such as Twitter as a public and helpful source for collecting data, they neglect the fact that the method of gathering sexist tweets could be biased towards the initial search terms. Moreover, some forms of sexism could be missed, since some tweets which contain offensive language could be misclassified as hate speech. Further, in existing hate speech corpora, sexist tweets mostly express hostile sexism, and the other forms of sexism which also appear online were to some degree disregarded. Besides, the manual creation of labeled datasets, which relies on users to report offensive comments and on tremendous effort by human annotators, is not only a costly and time-consuming process, but it also raises the risk of introducing discrimination through biased judgment.
This thesis generates a novel dataset of sexist and non-sexist statements, constructed via "UnSexistifyIt", an online web-based game that incentivizes players to make minimal modifications to a sexist statement with the goal of turning it into a non-sexist statement and convincing other players that the modified statement is non-sexist. The game applies the methodology of "Game With A Purpose" to generate data as a side effect of playing the game and also employs gamification and crowdsourcing techniques to enhance non-game contexts. When voluntary participants play the game, they help to produce non-sexist statements, which can reduce the cost of generating a new corpus. This work explores how diverse individual beliefs concerning sexism are. Further, the results of this work highlight the impact of various linguistic features and content attributes on sexist language detection. Finally, this thesis could help to expand our understanding of the syntactic and semantic structure of sexist and non-sexist content, provides insights for building a probabilistic classifier that assigns single sentences to sexist or non-sexist classes, and may help to find a potential ground truth for such a classifier.
The purpose of this master thesis is to enable the robot Lisa to process complex commands and extract the necessary information in order to perform a complex task as a sequence of smaller tasks. This is intended to be achieved by improving the understanding that Lisa has of her environment by adding semantics to the maps that she builds. The complex command itself is expected to be already parsed. Therefore, the way the input is processed to become a parsed command is out of the scope of this work. The maps that Lisa builds will be improved by the addition of semantic annotations that can include any kind of information that might be useful for the performance of generic tasks. This can include (but is not necessarily limited to) hierarchical classifications of locations, objects and surfaces. The processing of the command, together with some information about the environment, shall trigger the performance of a sequence of actions. These actions are expected to be included in Lisa's currently implemented tasks and will rely on the existing modules that perform them.
Nevertheless, the aim of this work is not only to be able to use currently implemented tasks in a more complex sequence of actions but also to make it easier to add new tasks to the complex commands that Lisa can perform.
Ontologies are valuable tools for knowledge representation and important building blocks of the Semantic Web. They are not static and can change over time. Changing an ontology can be necessary for various reasons: the domain that is represented by an ontology can change, or an ontology is reused and must be adapted to the new context. In addition, modeling errors could have been introduced into the ontology which must be found and removed. The non-triviality of the change process has led to the emergence of ontology change as a field of research in its own right. The removal of knowledge from ontologies is an important aspect of this change process, because even the addition of new knowledge to an ontology potentially requires the removal of older, conflicting knowledge. Such a removal must be performed in a well-considered way. A naïve change of concepts within the ontology can easily remove other, unrelated knowledge or alter the semantics of concepts in an unintended way [2]. For these reasons, this thesis introduces a formal operator for the fine-grained retraction of knowledge from EL concepts which is partially based on the postulates for belief set contraction and belief base contraction [3, 4, 5] and the work of Suchanek et al. [6]. For this, a short introduction to ontologies and OWL 2 is given and the problem of ontology change is explained. It is then argued why a formal operator can support this process and why the Description Logic EL provides a good starting point for the development of such an operator. After this, a general introduction to Description Logic is given. This includes its history, an overview of its applications and common reasoning tasks in this logic. Following this, the logic EL is defined. In a next step, related work is examined and it is shown why the recovery postulate and the relevance postulate cannot be naïvely employed in the development of an operator that removes knowledge from EL concepts.
Following this, the requirements for the operator are formulated and properties are given which are mainly based on the postulates for belief set and belief base contraction. Additional properties are developed which make up for the non-applicability of the recovery and relevance postulates. After this, a formal definition of the operator is given and it is shown that the operator is applicable to the task of a fine-grained removal of knowledge from EL concepts. In a next step, it is proven that the operator fulfills all the previously defined properties. It is then demonstrated how the operator can be combined with laconic justifications [7] to assist a human ontology editor by automatically removing unwanted consequences from an ontology. Building on this, a plugin for the ontology editor Protégé is introduced that is based on algorithms that were derived from the formal definition of the operator. The content of this work is then summarized and a final conclusion is drawn. The thesis closes with an outlook on possible future work.
Identifying reusable legacy code able to implement SOA services is still an open research issue. This master thesis presents an approach to identify legacy code for service implementation based on dynamic analysis and the application of data mining techniques.
As part of the SOAMIG project, code execution traces were mapped to business processes. Due to the large number of traces generated by dynamic analyses, the traces must be post-processed in order to provide useful information.
For this master thesis, two data mining techniques, cluster analysis and link analysis, were applied to the traces. First tests on a Java/Swing legacy system provided good results compared to an expert allocation of legacy code.
Our work identifies fine-grained edits in the context of their neighbouring tokens in Wikipedia articles. We cluster these edits according to the similarity of their neighbouring context, which we encode into a vector space using word vectors. We evaluate the clusters returned by our algorithm on extrinsic and intrinsic metrics and compare them with previous work. We also analyse the relation between extrinsic and intrinsic measurements of fine-grained edit tokens.
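The idea of encoding the neighbouring context of an edit with word vectors can be sketched as follows. This is only an illustration under stated assumptions: averaging the vectors of the context tokens and comparing contexts by cosine similarity are assumptions of this sketch, not necessarily the method of the thesis, and the toy two-dimensional vectors, `context_vector` and `cosine` are hypothetical.

```python
import math

def context_vector(tokens, vectors):
    """Average the word vectors of the context tokens (assumed encoding)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # no token of the context has a vector
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors, invented for this example only.
toy_vectors = {
    "born": [1.0, 0.0],
    "died": [0.9, 0.1],
    "in": [0.5, 0.5],
    "scored": [0.0, 1.0],
    "goals": [0.1, 0.9],
}

ctx_a = context_vector(["born", "in"], toy_vectors)
ctx_b = context_vector(["died", "in"], toy_vectors)
ctx_c = context_vector(["scored", "goals"], toy_vectors)

# Edits whose contexts are close in vector space are candidates for one cluster.
print(cosine(ctx_a, ctx_b) > cosine(ctx_a, ctx_c))
```

Any standard clustering algorithm could then be run on such context vectors; which one the thesis uses is not stated in the abstract.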
Implementation of Agile Software Development Methodology in a Company – Why? Challenges? Benefits?
(2019)
The software development industry is evolving day by day. The introduction of agile software development methodologies brought a tremendous structural change to companies. Agile transformation offers numerous opportunities and benefits to both existing and newly founded companies. Along with these benefits, agile conversion also brings many unforeseen challenges. New entrants have the advantage of being flexible and able to cope with environmental, consumer, and cultural changes, whereas existing companies are bound to rigid structures.
The goal of this research is to gain deep insight into agile software development methodology, the Agile Manifesto, and the principles behind it. It also examines the prerequisites a company must know before implementing agile software development, the benefits a company can achieve by doing so, and the significant challenges it can face during agile implementation.
The research objectives of this study help to generate well-motivated research questions. These research questions cover the cultural aspects of company agility, the values and principles of agile, and the benefits and challenges of agile implementation. The project management triangle shows how benefits in cost, time, and quality can be achieved by implementing agile methodologies. Six significant areas are explored, which show the different challenges a company can face when implementing an agile software development methodology. Finally, after the in-depth systematic literature review, a conclusion is drawn, followed by open topics for future work and recommendations on implementing agile software development methodology in a company.
Belief revision is the subarea of knowledge representation which studies the dynamics of the epistemic states of an agent. In the classical AGM approach, contraction, as part of belief revision, deals with the removal of beliefs from knowledge bases. This master's thesis presents the study and the implementation of concept contraction in the Description Logic EL. Concept contraction deals with the following situation: given two concepts C and D, and assuming that C is subsumed by D, how can concept C be changed so that it is no longer subsumed by D but remains as similar as possible to the original C? This approach to belief change differs from other related work because it deals with contraction at the level of concepts and not T-Boxes and A-Boxes in general. The main contribution of the thesis is the implementation of concept contraction. The implementation provides insight into the complexity of contraction in EL, which is tractable since the main inference task in EL is also tractable. The implementation consists of the design of five algorithms that are necessary for concept contraction. The algorithms are described, illustrated with examples, and analyzed in terms of time complexity. Furthermore, we propose a new approach for a selection function, adapted to concept contraction. The selection function uses metadata about the concepts in order to select the best from an input set. The metadata is modeled in a framework that we have designed, based on standard metadata frameworks. As an important part of concept contraction, the selection function is responsible for selecting the best concepts, that is, those that are as similar as possible to concept C. Lastly, we have successfully implemented concept contraction in Python, and the results are promising.
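To illustrate the subsumption relation that concept contraction has to break, the following minimal sketch checks whether C is subsumed by D for EL concepts with respect to an empty T-Box, using the standard structural comparison of conjuncts. It is not one of the five algorithms of the thesis; the `Concept` class, the `subsumed_by` function and the example concepts are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """An EL concept: a conjunction of atomic names and existential restrictions.

    The empty conjunction stands for the top concept.
    """
    atoms: frozenset = frozenset()
    exists: tuple = ()  # pairs (role_name, Concept)

def subsumed_by(c, d):
    """Return True if c is subsumed by d, assuming an empty T-Box."""
    # Every atomic conjunct of d must also occur in c.
    if not d.atoms <= c.atoms:
        return False
    # Every restriction of d must be matched by a restriction of c on the
    # same role whose filler is subsumed by the filler in d.
    return all(
        any(r_c == r_d and subsumed_by(f_c, f_d) for r_c, f_c in c.exists)
        for r_d, f_d in d.exists
    )

# C = Person ⊓ ∃hasChild.(Person ⊓ Doctor)
C = Concept(
    frozenset({"Person"}),
    (("hasChild", Concept(frozenset({"Person", "Doctor"}))),),
)
# D = ∃hasChild.Doctor
D = Concept(exists=(("hasChild", Concept(frozenset({"Doctor"}))),))
# One possible weakening of C: Person ⊓ ∃hasChild.Person
C_weak = Concept(
    frozenset({"Person"}),
    (("hasChild", Concept(frozenset({"Person"}))),),
)

print(subsumed_by(C, D))       # C is subsumed by D
print(subsumed_by(C_weak, D))  # the weakened concept is not
print(subsumed_by(C, C_weak))  # and it only generalizes C
```

In this toy case `C_weak` drops only the `Doctor` conjunct inside the filler, so it is a plausible contraction candidate; choosing among several such candidates is the role of the selection function described above.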
The content aggregator platform Reddit has established itself as one of the most popular websites in the world. However, scientific research on Reddit is hindered because Reddit allows (and even encourages) user anonymity, i.e., user profiles do not contain personal information such as gender. Inferring the gender of users at large scale could enable the analysis of gender-specific areas of interest, reactions to events, and behavioral patterns. In this direction, this thesis suggests a machine learning approach for estimating the gender of Reddit users. By exploiting specific conventions in parts of the website, we obtain a ground truth of more than 190 million comments by labeled users. This data is then used to train machine learning classifiers, which in turn are used to gain insights into the gender balance of particular subreddits and the platform in general. Comparing a variety of classification algorithms, we find that a character-level convolutional neural network achieves an F1 score of 82.3% on the task of predicting the gender of a user from their comments. The score surpasses the 85% mark for frequent users with more than 50 comments. Furthermore, we discover that female users are less active on Reddit: on average, they write fewer comments and post in fewer subreddits than male users.
The output of eye tracking Web usability studies can be visualized for analysts as screenshots of the Web pages overlaid with their gaze data. However, the screenshot visualizations are found to be corrupted whenever fixations are recorded on fixed Web page elements at different scroll positions. The gaze data are not gathered on the fixed elements that were fixated; rather, they are scattered across the scroll positions at which they were recorded. This problem motivated us to find an approach that links gaze data to their intended fixed elements and gathers them in one position on the screenshot. The approach builds upon the concept of creating the screenshot during the recording session, where images of the viewport are captured at visited scroll positions and finally stitched into one Web page screenshot. Additionally, the fixed elements in the Web page are identified and linked to their fixations. For the evaluation, we compared the interpretation of our enhanced screenshot against the video visualization, which overcomes the problem. The results revealed that both visualizations deliver equally accurate interpretations. However, interpreting the visualizations of eye tracking Web usability studies using the enhanced screenshots outperforms the video visualizations in terms of speed and places lower temporal demand on the interpreters.
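The coordinate problem described above can be sketched as follows. This is a simplified illustration, not the implementation of the thesis: it assumes vertical scrolling only, fixed elements given as rectangles in viewport coordinates, and a stitched screenshot in which each fixed element is drawn once at a single anchor scroll position. `Rect`, `to_screenshot_coords` and `anchor_scroll_y` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    """Bounding box of a fixed element in viewport coordinates (pixels)."""
    left: float
    top: float
    width: float
    height: float

    def contains(self, x, y):
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)

def to_screenshot_coords(x, y, scroll_y, fixed_rects, anchor_scroll_y=0.0):
    """Map a fixation from viewport to screenshot coordinates.

    Fixations on a fixed element are gathered at the position where that
    element is assumed to be drawn in the screenshot (anchor_scroll_y);
    all other fixations are shifted by the scroll offset at recording time.
    """
    for rect in fixed_rects:
        if rect.contains(x, y):
            return (x, y + anchor_scroll_y)
    return (x, y + scroll_y)

# A fixed header occupying the top 60 pixels of a 1000 pixel wide viewport.
header = Rect(left=0, top=0, width=1000, height=60)

# Recorded after scrolling down 800 pixels: one fixation on the header,
# one on ordinary page content.
on_header = to_screenshot_coords(100, 30, 800, [header])
on_content = to_screenshot_coords(100, 300, 800, [header])
print(on_header, on_content)
```

Without the fixed-element check, the fixation on the header would land 800 pixels further down the screenshot, on content the participant never looked at, which is the scattering effect the abstract describes.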