Abstraction networks to support comprehension and auditing for terminologies




Biomedical terminologies are large and complex, and by their nature error-prone. Hence, quality assurance (QA), or auditing, should be an inherent part of the terminology life cycle. The importance of QA is increasing with the spread of terminology use in biomedical research and in the health care industry. Terminology QA is especially critical given the current ARRA initiative to enhance the meaningful use of EMRs in health care. Domain-expert terminology editors who can perform QA work require multidisciplinary training and are a scarce resource. This scarcity, combined with budget constraints on terminology maintenance, makes manual auditing of a complete terminology impractical. However, the value of using terminologies in the biomedical industry depends on the correctness and accuracy of the knowledge they encode.
To overcome this shortage of trained terminology experts, part of the terminology QA work needs to be automated. However, most terminology errors require an expert editor's review. One approach to this problem is to develop automatic computational techniques that identify concepts and relationships with a high probability of error. Directing editors' efforts toward reviewing such concepts and relationships will increase their QA productivity. Abstraction networks for terminologies have been shown to support such automatic computational techniques.
Abstraction networks are compact "semantic" networks that summarize the knowledge in a much larger terminology. They provide the orientation to the content and structure of a terminology that editors and users alike urgently need. In this presentation we will explore various kinds of abstraction networks and their properties. Some of these properties will be shown to provide improved orientation and QA support. We will demonstrate how abstraction networks help identify groups of concepts with a high likelihood of errors. Examples of abstraction networks for the UMLS, SNOMED, the NCIt and the Columbia MED will be presented and used to illustrate their support for terminology QA work.


Dr. Perl received his PhD in Computer Science from the Weizmann Institute of Science in Israel in 1975. He is currently a Professor of Computer Science at the New Jersey Institute of Technology (NJIT). He has published over 80 journal papers and 60 conference papers in top Computer Science and Medical Informatics journals and conferences. Dr. Perl has been working in the field of “medical terminologies” for the last 18 years, following his earlier research career in the theory of algorithms. Dr. Perl was awarded the Harlan Perlis NJIT Research Award in 1996 for his earlier research, and the NJIT College of Computing Sciences Excellence in Research Award in 2008 for his research in biomedical terminologies.

The NJIT Terminology Research Group, upgraded in 2007 to the Structural Analysis of Biomedical Ontologies Center (SABOC), has established a strong research record in the design of structural techniques for partitioning, abstraction and quality assurance of medical terminologies such as the Columbia MED, the UMLS, the NCIt and SNOMED. Many publications in top Medical Informatics journals and conferences have validated the major research theme of Dr. Perl and SABOC, namely that structural analysis of terminologies can automatically pinpoint areas with a high likelihood of inconsistencies, errors and needed enhancements.