Faculdade de Letras da Universidade do Porto - OCS, 15th INTERNATIONAL ISKO CONFERENCE

Ya-Ning Chen

Last modified: 2018-06-20


Traditionally, Machine-Readable Cataloguing (MARC) has served to interchange records across information systems within the library community. Although libraries have created and maintained a vast volume of MARC records, nearly all of them are isolated from the web and cannot be found by search engines such as Google. With the application of the semantic web, the Functional Requirements for Bibliographic Records (FRBR) has been regarded as a conceptual reference model (CRM) for the bibliographic universe. Many projects and studies have shown that FRBR is feasible as a bibliographic ontology for the semantic web; however, the library community still lacks common agreement on best practice for implementing and using FRBR.

In recent years, Linked Open Data (LOD) has become the approach libraries prefer for converting MARC-based legacy records into part of the semantic web. Following the principles of LOD, library catalog records can be sliced into LOD and then aggregated with external resources and their contexts on the web. As LOD has matured, official documents released by W3C (Hyland et al., 2017; Hyland & Villazón-Terrazas, 2011) have come to serve as useful best practices for authoring and publishing LOD. It is therefore of interest to know how these W3C best practices can be customized for library-oriented LOD. In addition, the entities of FRBR and of Functional Requirements for Authority Data (FRAD), together with the relationships between them, have been implemented as classes and properties in the RDA Registry with permanent namespaces and identifiers. How the FRBR and FRAD classes and relationships defined in the RDA Registry can be employed as a CRM in a workflow that transforms existing MARC records into LOD therefore deserves investigation. Practical issues that arise in transforming MARC records into LOD, such as deduplication of LOD-based data and collaboration on a global scale, also need to be investigated.

In this study, four bibliographic MARC records of ‘Pride and Prejudice’ by Jane Austen were selected from the WebPAC of National Taiwan University Library (http://tulips.ntu.edu.tw) as the study sample. Of the four MARC records, one was in English and the other three were in Chinese. The study also covered the hidden relationships embedded between hierarchical bibliographic MARC records, such as translation, version and reproduction, as well as the relationships between entities in FRBR Groups 1 and 2. Three components provided by Hyland and Villazón-Terrazas (2011) were chosen as the basis for developing the workflow for transforming MARC into LOD and for examining the related issues (i.e., LOD deduplication and collaboration) for library-oriented LOD. The three components are modeling, naming with URIs, and reusing existing vocabularies.
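To make two of the three components more concrete, the following is a minimal sketch of naming with URIs and modeling a translation relationship between an English and a Chinese expression of the same work. It is not the study's implementation: the base namespace `http://example.org/ntu/`, the helper `mint_uri`, the local identifiers, and the `ex:` property names are all hypothetical placeholders, and a real implementation would reuse the corresponding RDA Registry terms instead.

```python
# Hypothetical base namespace for locally minted URIs (an assumption, not from the study).
BASE = "http://example.org/ntu/"


def mint_uri(entity_type: str, local_id: str) -> str:
    """Naming with URIs: build a URI from an entity type and a local identifier."""
    return f"{BASE}{entity_type}/{local_id}"


# Modeling: one work realized by two expressions (identifiers are illustrative).
work = mint_uri("work", "pride-and-prejudice")
english = mint_uri("expression", "pride-and-prejudice-en")
chinese = mint_uri("expression", "pride-and-prejudice-zh")

# Triples as (subject, predicate, object). The "ex:" predicates are placeholders
# standing in for properties that would be selected from an existing vocabulary.
triples = [
    (english, "ex:realizes", work),
    (chinese, "ex:realizes", work),
    (chinese, "ex:isTranslationOf", english),
]


def related(subject: str, predicate: str) -> list[str]:
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Calling `related(chinese, "ex:isTranslationOf")` returns a list holding only the English expression's URI, which is the kind of relationship that stays implicit when the two editions exist only as separate MARC records.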

As a result, a workflow for transforming MARC records into LOD was developed, consisting of the following steps: changing MARC tags into semantic labels with their data; selecting the labels significant for LOD; mapping the selected labels to semantically equivalent classes of the RDA Registry; selecting appropriate properties of the RDA Registry to build the relationships between classes; adding instances to classes to validate the appropriateness of the classes and their relationships; and reusing existing value vocabularies from the LC Linked Data Service (including subject headings, genre/form, carriers and name authority), VIAF (including corporate names and geographic names) and TGN. Deduplication of LOD is a difficult task for knowledge organization (KO) professionals, because the granularity of KO has changed from document-centric records to link-centric data. KO professionals therefore need to carry out a series of knowledge verification steps to distinguish one piece of LOD from another, rather than relying on a URI alone, which carries insufficient information. Although LOD is a useful basis for KO collaboration because it allows various external resources to be aggregated seamlessly, the foundation of collaborative LOD still lies in deduplication and related issues, such as the mapping and selection principles for reusing existing metadata element sets, value vocabularies and library datasets.
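The first steps of the workflow above, and the point that deduplication cannot rest on URIs alone, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the study's code: the MARC tags are standard MARC 21 tags, but the semantic labels, the selection of significant labels, the `ex:` classes and properties (placeholders for RDA Registry terms), and the function names are all hypothetical.

```python
# Step 1: MARC tag -> semantic label (standard MARC 21 tags; labels are illustrative).
TAG_LABELS = {
    "100": "author",
    "245": "title",
    "041": "language",
    "300": "physical_description",
}

# Step 2: labels judged significant for LOD (a hypothetical selection).
SELECTED = {"author", "title", "language"}

# Steps 3-4: label -> (class, property). The "ex:" names are placeholders
# standing in for classes and properties chosen from the RDA Registry.
LABEL_MAPPING = {
    "author": ("ex:Person", "ex:hasAuthor"),
    "title": ("ex:Work", "ex:hasTitle"),
    "language": ("ex:Expression", "ex:hasLanguage"),
}


def marc_to_triples(record_uri, fields):
    """Turn a {tag: value} MARC record into (subject, property, value) triples."""
    triples = []
    for tag, value in fields.items():
        label = TAG_LABELS.get(tag)
        if label not in SELECTED:
            continue  # unknown tag or label not selected for LOD
        _cls, prop = LABEL_MAPPING[label]
        triples.append((record_uri, prop, value))
    return triples


def same_resource_candidate(triples_a, triples_b):
    """Flag two descriptions as possible duplicates by comparing their
    normalized property values; the subject URIs are ignored on purpose."""
    def describe(triples):
        return {p: o.strip().lower() for _s, p, o in triples}
    a, b = describe(triples_a), describe(triples_b)
    return bool(a) and a == b


# Illustrative records: two describe the same English text under different URIs.
rec1 = marc_to_triples("http://example.org/rec/1",
                       {"100": "Austen, Jane", "245": "Pride and prejudice",
                        "041": "eng", "300": "1 volume"})
rec2 = marc_to_triples("http://example.org/rec/2",
                       {"100": "austen, jane ", "245": "Pride and Prejudice",
                        "041": "eng"})
rec3 = marc_to_triples("http://example.org/rec/3",
                       {"100": "Austen, Jane", "245": "Pride and prejudice",
                        "041": "chi"})
```

Here `same_resource_candidate(rec1, rec2)` is true even though the two URIs differ, while `rec1` and `rec3` are kept apart by their language values. Exact matching on normalized strings is a deliberately naive choice for the sketch; the verification the abstract describes would need richer evidence than this.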

LOD is a good approach to putting “think globally, act locally” into practice for KO. Although the RDA Registry has paved the way towards LOD for libraries by combining RDA with the members of the FRBR family, adopting a new approach to KO such as LOD is also a paradigm shift. If LOD-based KO is to be distributed and aggregated seamlessly over the Internet in the future, new policies and guidelines will be essential for reaching common agreement on deduplication, collaboration, mapping and selection for library-oriented LOD.



Hyland, B., Atemezing, G.A. and Villazón-Terrazas, B. (2017), “Best practices for publishing linked data”, available at: https://dvcs.w3.org/hg/gld/raw-file/cb6dde2928e7/bp/index.html (accessed June 08, 2017)

Hyland, B. and Villazón-Terrazas, B. (2011), “Linked data cookbook: cookbook for open government linked data”, available at: https://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook (accessed June 08, 2017)