UMLS-SKOS: A Semantic Web Framework for Representation of Biomedical Terminological Knowledge using Simple Knowledge Organization System (SKOS)

Parsa Mirhaji, MD, PhD

The University of Texas Health Science Center at Houston




SKOS is a Semantic Web framework for representing thesauri, classification schemes, subject heading systems, controlled vocabularies, and taxonomies. It enables novel ways of representing terminological knowledge and its linkage with domain knowledge in an unambiguous, reusable, and encapsulated fashion within computer applications. According to the National Library of Medicine, the UMLS Knowledge Source (UMLS-KS) integrates and distributes key terminology, classification and coding standards, and associated resources to promote the creation of more effective and interoperable biomedical information systems and services “that behave as if they ‘understand’ the meaning of the language of biomedicine and health”. However, the information representation model currently used by UMLS-KS is not conducive to computer programs retrieving the ‘meaning’ of biomedical terms, concepts, and their relationships effectively and interpreting it automatically and unambiguously.

In this presentation we propose using the Simple Knowledge Organization System (SKOS) as an alternative way to represent the body of knowledge incorporated in the UMLS-KS within the framework of Semantic Web technologies. We also introduce our conceptualization of a transformation algorithm that produces an SKOS representation of the UMLS-KS. This representation integrates the UMLS-Semantic Network and the UMLS-Metathesaurus, complete with all its source vocabularies, as a unified body of knowledge, along with the information needed to trace or segregate content by provenance and governance. Our proposal and method are based on the idea that a formal and explicit representation of any body of knowledge enables its unambiguous and precise interpretation by automated computer programs. The consequences of such an undertaking would be at least threefold: 1) the ability to automatically check for inconsistencies and errors within a large and complex body of knowledge; 2) automated information interpretation, integration, and discovery; and 3) better information sharing, repurposing and reuse (adoption), and extension of the knowledge base within a distributed and collaborative community of researchers. We submit that UMLS-KS is no exception and may benefit from all of these advantages if represented fully in a formal representation language. Using SKOS in combination with the transformation algorithm introduced in this presentation is our first step in that direction. We explain our conceptualization of the algorithm, the problems we encountered and how we addressed them, and give a brief gap analysis to outline the road ahead. At the end we also present several use cases from our laboratories at the School of Health Information Sciences that use this artifact.
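
To make the idea of a concept-level SKOS representation more concrete, the following is a minimal Python sketch of how one Metathesaurus-style record might be serialized as Turtle. It is not the transformation algorithm described in the presentation. The `Atom` and `ConceptRecord` types, the `to_turtle` function, the `http://example.org/umls/` namespace, and the record values are all hypothetical. Using `skos:broader` to attach a semantic type and `dct:source` to record the contributing vocabularies are assumptions made only for illustration; the actual mapping may differ.

```python
from dataclasses import dataclass, field

SKOS = "http://www.w3.org/2004/02/skos/core#"   # W3C SKOS core namespace
DCT = "http://purl.org/dc/terms/"               # Dublin Core terms namespace
BASE = "http://example.org/umls/"               # hypothetical namespace, not an NLM URI


@dataclass
class Atom:
    """One label contributed by one source vocabulary (hypothetical type)."""
    label: str
    source: str            # placeholder source vocabulary abbreviation
    preferred: bool = False


@dataclass
class ConceptRecord:
    """A concept identifier, one semantic type, and its labels (hypothetical type)."""
    cui: str
    semantic_type: str
    atoms: list = field(default_factory=list)


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(record):
    """Serialize one record as a skos:Concept in Turtle."""
    if not record.atoms:
        raise ValueError("a concept needs at least one label")
    # SKOS allows at most one prefLabel per language, so pick exactly one.
    preferred = next((a for a in record.atoms if a.preferred), record.atoms[0])
    pairs = [
        ("a", "skos:Concept"),
        ("skos:notation", f'"{escape(record.cui)}"'),
        ("skos:inScheme", f"<{BASE}scheme/metathesaurus>"),
        # Assumption: the semantic type is modeled as a broader concept.
        ("skos:broader", f"<{BASE}semantic-type/{record.semantic_type}>"),
        ("skos:prefLabel", f'"{escape(preferred.label)}"@en'),
    ]
    seen = {preferred.label}
    for atom in record.atoms:
        if atom.label not in seen:          # skip duplicate label strings
            seen.add(atom.label)
            pairs.append(("skos:altLabel", f'"{escape(atom.label)}"@en'))
    # Assumption: provenance is recorded per concept, one dct:source per vocabulary.
    for source in sorted({a.source for a in record.atoms}):
        pairs.append(("dct:source", f"<{BASE}source/{source}>"))
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in pairs)
    prefixes = f"@prefix skos: <{SKOS}> .\n@prefix dct: <{DCT}> .\n\n"
    return f"{prefixes}<{BASE}concept/{record.cui}>\n    {body} .\n"


if __name__ == "__main__":
    # Placeholder identifiers and labels, not taken from a UMLS release.
    example = ConceptRecord(
        cui="C0000000",
        semantic_type="T000",
        atoms=[Atom("Heart", "SRC_A", preferred=True), Atom("Cardiac organ", "SRC_B")],
    )
    print(to_turtle(example))
```

The sketch records provenance only at the concept level. Tracing each individual label back to its source vocabulary would need a richer structure, for example SKOS-XL label resources, which is not shown here.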