Filtern
Schlagworte
- Linked Data Modeling (1)
- Linked Open Data (1)
- Schema Information (1)
- Support System (1)
- Vocabulary Mapping (1)
Institut
Various best practices and principles guide an ontology engineer when modeling Linked Data. The choice of appropriate vocabularies is one essential aspect in the guidelines, as it leads to better interpretation, querying, and consumption of the data by Linked Data applications and users.
In this paper, we present the various types of support features for an ontology engineer to model a Linked Data dataset, discuss existing tools and services with respect to these support features, and propose LOVER: a novel approach to support the ontology engineer in modeling a Linked Data dataset. We demonstrate that none of the existing tools and services incorporate all types of supporting features and illustrate the concept of LOVER, which supports the engineer by recommending appropriate classes and properties from existing and actively used vocabularies. Hereby, the recommendations are made on the basis of an iterative multimodal search. LOVER uses different, orthogonal information sources for finding terms, e.g. based on a best string match or schema information on other datasets published in the Linked Open Data cloud. We describe LOVER's recommendation mechanism in general and illustrate it alongrna real-life example from the social sciences domain.
Schema information about resources in the Linked Open Data (LOD) cloud can be provided in a twofold way: it can be explicitly defined by attaching RDF types to the resources. Or it is provided implicitly via the definition of the resources´ properties.
In this paper, we analyze the correlation between the two sources of schema information. To this end, we have extracted schema information regarding the types and properties defined in two datasets of different size. One dataset is a LOD crawl from TimBL- FOAF profile (11 Mio. triple) and the second is an extract from the Billion Triples Challenge 2011 dataset (500 Mio. triple). We have conducted an in depth analysis and have computed various entropy measures as well as the mutual information encoded in this two manifestations of schema information.
Our analysis provides insights into the information encoded in the different schema characteristics. It shows that a schema based on either types or properties alone will capture only about 75% of the information contained in the data. From these observations, we derive conclusions about the design of future schemas for LOD.