Das Suchergebnis hat sich seit Ihrer Suchanfrage verändert. Eventuell werden Dokumente in anderer Reihenfolge angezeigt.
  • Treffer 3 von 49
Zurück zur Trefferliste

Subjectivity Detection through Quotation Analysis in Wikipedia

  • This Master Thesis is an exploratory research to determine whether it is feasible to construct a subjectivity lexicon using Wikipedia. The key hypothesis is that that all quotes in Wikipedia are subjective and all regular text are objective. The degree of subjectivity of a word, also known as ''Quote Score'' is determined based on the ratio of word frequency in quotations to its frequency outside quotations. The proportion of words in the English Wikipedia which are within quotations is found to be much smaller as compared to those which are not in quotes, resulting in a right-skewed distribution and low mean value of Quote Scores. The methodology used to generate the subjectivity lexicon from text corpus in English Wikipedia is designed in such a way that it can be scaled and reused to produce similar subjectivity lexica of other languages. This is achieved by abstaining from domain and language-specific methods, apart from using only readily-available English dictionary packages to detect and exclude stopwords and non-English words in the Wikipedia text corpus. The subjectivity lexicon generated from English Wikipedia is compared against other lexica; namely MPQA and SentiWordNet. It is found that words which are strongly subjective tend to have high Quote Scores in the subjectivity lexicon generated from English Wikipedia. There is a large observable difference between distribution of Quote Scores for words classified as strongly subjective versus distribution of Quote Scores for words classified as weakly subjective and objective. However, weakly subjective and objective words cannot be differentiated clearly based on Quote Score. In addition to that, a questionnaire is commissioned as an exploratory approach to investigate whether subjectivity lexicon generated from Wikipedia could be used to extend the coverage of words of existing lexica.

Volltext Dateien herunterladen

Metadaten exportieren

Weitere Dienste

Teilen auf Twitter Suche bei Google Scholar
Metadaten
Verfasserangaben:Pang Wayne Wee
URN:urn:nbn:de:kola-16474
Gutachter:Claudia Wagner, Fabian Flöck
Dokumentart:Masterarbeit
Sprache:Englisch
Datum der Fertigstellung:29.05.2018
Datum der Veröffentlichung:29.06.2018
Veröffentlichende Institution:Universität Koblenz, Universitätsbibliothek
Titel verleihende Institution:Universität Koblenz, Fachbereich 4
Datum der Abschlussprüfung:19.06.2018
Datum der Freischaltung:29.06.2018
Seitenzahl:ix, 36
Institute:Fachbereich 4 / Institute for Web Science and Technologies
Lizenz (Deutsch):License LogoEs gilt das deutsche Urheberrecht: § 53 UrhG