Faculdade de Letras da Universidade do Porto - OCS, 15th INTERNATIONAL ISKO CONFERENCE

Ingo Frank

Last modified: 2017-12-18


The poster presents the implementation of a phenomenon-based knowledge organization system in the context of the ongoing design of a discovery system for our institutional research data repository. Besides meeting the requirements of interdisciplinary information retrieval, the system has to overcome the limitations and shortcomings of existing metadata standards for research data management and library catalogues. To demonstrate how these issues can be handled, I will focus on the prospects and opportunities of knowledge organization for interdisciplinary research through ontology-based modeling and Linked Data technology. We want to achieve interoperability between institutional research data management, the institutional research information system and the institute's library catalogue. The aim is to integrate information about research objects, methods, data and related publications, and thereby improve information access for interdisciplinary researchers.

We pursue these objectives in two steps: first, the integration of information; second, the enrichment of the integrated information with facets for interdisciplinary knowledge organization.

Bibliographic data is available for integration from the B3Kat catalogue. The ISF ontology of our VIVO research information system provides an opportunity to interlink research data and bibliographic data. Shortcomings of existing metadata standards, such as missing or inconsistent use of authority files and inconsistent usage of controlled vocabularies, are addressed in VIVO by enforcing role-based modeling of authorship, affiliation, etc., with authors, organizations, etc. identified via linked ORCID or GND records. We use CKAN as the datastore for our institutional research data repository because of its support for the DCAT vocabulary, its RDF and Linked Data features and some useful view extensions. In addition to the Map View and Geospatial View extensions for georeferenced maps (from our GeoPortOst project) and spatial research data (like historical census data), we use CKAN's da|ra extension, which supports the DOI registration service (also used in our LaMBDa data portal).

To build the faceted classification system, phenomena, methods, theories or paradigms are selected from three sources: the JEL classification system for the discipline of economics, subsets of the GND subject headings converted into SKOS for the discipline of history, and the European Thesaurus on International Relations and Area Studies for conflict studies.

The integrated records are superimposed with facets that classify research objects, the methods used and theories as subjects of publications and research data. The simplest solution would be to use a separate subject classification property for each facet. As SKOS does not support facets, i.e. the combination of concepts, n-ary relations would be needed for classifications that combine multiple facets.
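The "one property per facet" option can be illustrated with a minimal sketch. All property names (`ex:researchObject`, `ex:method`, `ex:theory`), record identifiers and concepts below are hypothetical and not taken from the poster; plain tuples stand in for RDF triples.

```python
from collections import defaultdict

# Hypothetical facet properties; the names are illustrative assumptions.
FACET_PROPERTIES = {
    "ex:researchObject": "research object",
    "ex:method": "method",
    "ex:theory": "theory",
}

# Made-up (subject, predicate, object) triples standing in for integrated records.
TRIPLES = [
    ("ex:dataset1", "ex:researchObject", "ex:migration"),
    ("ex:dataset1", "ex:method", "ex:censusAnalysis"),
    ("ex:article1", "ex:researchObject", "ex:migration"),
    ("ex:article1", "ex:theory", "ex:pushPullModel"),
]


def facets_of(record, triples):
    """Group the concepts assigned to one record by facet name."""
    result = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == record and predicate in FACET_PROPERTIES:
            result[FACET_PROPERTIES[predicate]].append(obj)
    return dict(result)


def select(triples, constraints):
    """Return records that carry every requested (facet property, concept) pair."""
    assigned = defaultdict(set)
    for subject, predicate, obj in triples:
        assigned[subject].add((predicate, obj))
    return sorted(
        record
        for record, pairs in assigned.items()
        if all(pair in pairs for pair in constraints.items())
    )


print(facets_of("ex:dataset1", TRIPLES))
print(select(TRIPLES, {"ex:researchObject": "ex:migration"}))
```

Each facet is an independent property here, so the sketch cannot express that a particular method was applied to a particular research object within one record. That limitation is what motivates n-ary relations or a process model.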

As an alternative to subject classification properties and n-ary relations, the NeDiMAH Methods Ontology (NeMO) was selected to model the digital research process more precisely. Comparable to phenomenon-based classification, NeMO distinguishes between research goals and activities, research techniques, and research objects.

The phenomenon-based faceted classification system lays the groundwork for the navigation and search system. The ontology-based information architecture meets the requirements of interdisciplinary researchers to find relevant information within the institutional research data repository. It also allows complex queries via SPARQL and integration with other repositories via Linked Data. The latter is planned in a large DFG-funded research infrastructure project in cooperation with the Bavarian State Library and other Leibniz research institutes as partners.
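As a rough illustration of what a faceted SPARQL query could look like, the following sketch assembles a query string from facet constraints. The `ex:` namespace, the property names and the concept names are assumptions made for this example; the poster does not specify them. The helper expects trusted input and performs no escaping.

```python
# Hypothetical namespace; the vocabulary actually used is not named in the abstract.
PREFIX = "PREFIX ex: <http://example.org/>"


def build_facet_query(constraints):
    """Build a SPARQL SELECT requiring every (property, concept) pair to hold."""
    patterns = [
        f"  ?record {prop} {concept} ."
        for prop, concept in sorted(constraints.items())
    ]
    return "\n".join([PREFIX, "SELECT DISTINCT ?record WHERE {", *patterns, "}"])


query = build_facet_query(
    {"ex:researchObject": "ex:census", "ex:method": "ex:regression"}
)
print(query)
```

The resulting string could be sent to any SPARQL endpoint that exposes the integrated records. Each additional facet constraint adds one triple pattern and therefore narrows the result set.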

Finally, an outlook shows preliminary results of prototypical automatic faceted classification based on the KEA keyphrase extraction algorithm. KEA's SKOS support is used to select matching entries from the faceted classification system. This controlled indexing approach tries to find information about research objects, methods, theories and data in the abstracts of the corresponding research articles.
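The idea of controlled indexing can be sketched with a deliberately simplified stand-in. The code below is not KEA, which learns to rank candidate phrases; it only performs a whole-phrase lookup of vocabulary labels in an abstract. The mini-vocabulary, concept identifiers and sample sentence are hypothetical.

```python
import re

# Hypothetical mini-vocabulary: concept id -> (facet, labels such as SKOS
# preferred and alternative labels). Not taken from the poster.
VOCABULARY = {
    "ex:census": ("research object", ["census", "population census"]),
    "ex:regression": ("method", ["regression analysis", "regression"]),
    "ex:institutionalism": ("theory", ["institutionalism"]),
}


def index_abstract(abstract, vocabulary):
    """Return {facet: sorted concept ids} for labels found as whole phrases."""
    text = abstract.lower()
    hits = {}
    for concept, (facet, labels) in vocabulary.items():
        for label in labels:
            pattern = r"\b" + re.escape(label.lower()) + r"\b"
            if re.search(pattern, text):
                hits.setdefault(facet, set()).add(concept)
                break  # one matching label is enough for this concept
    return {facet: sorted(ids) for facet, ids in hits.items()}


sample = "We apply regression analysis to historical census data."
print(index_abstract(sample, VOCABULARY))
```

Restricting matches to vocabulary labels keeps the index terms consistent across records, at the cost of missing concepts that an abstract expresses in wording absent from the vocabulary.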