Progress on the ODIE Toolkit

Rebecca Crowley, MD, MS

University of Pittsburgh School of Medicine





A few months ago, we quietly released ODIE 1.0 on the NCBO GForge site. ODIE is an application that lets researchers (1) analyze text corpora by mapping text to concepts in BioPortal ontologies, and (2) identify candidate concepts within text corpora that could be considered for addition to an existing ontology. The ODIE project has made several significant advances this year, including (1) a complete redesign to use the UIMA platform, (2) integration of the Mayo cTAKES pipeline into ODIE, (3) addition of a simple co-reference resolution algorithm as a UIMA resource, and (4) addition of statistical as well as linguistic methods for discovering candidate concepts. In addition to the software, we've been developing co-reference corpora and testing a variety of existing methods for ontology enrichment, using domain experts and ontology developers as judges. This talk will demonstrate the latest release and sample some of the year's research findings. I will also preview the upcoming release, ODIE 1.1, which includes improved installation, visualization, placement of candidate concepts, and export of proposals as OWL files.