UBERON:Main Page

From NCBO Wiki
Revision as of 15:37, 12 January 2009 by Cjm (talk | contribs) (Status and Availability)

Uberon is a multi-species anatomy ontology created to facilitate comparison of phenotypes across multiple species and for use in defining GO biological process terms.

The first iteration of Uberon was generated semi-automatically from the union of existing species-centric anatomy ontologies. As such, it contained many errors and biological falsehoods. The guiding principles deliberately err on the side of generating false positives in query results.

In addition, Uberon has so far received only a fairly light amount of curation. New versions of species-centric AOs are periodically aligned against Uberon and incorporated.

Status and Availability

This is an alpha release; Uberon still contains many mistakes.

Continued Development

We hope that there will eventually be resources for a biologically sound multi-species anatomy ontology - when this arrives Uberon will have served its purpose and can disappear into the night.

In the meantime, Uberon can serve as a strawman ontology, and as a useful source of empirical results illustrating the need for a properly curated multi-species anatomy ontology with highly specific classes.

Relationship to other ontologies

Uberon contains xref tags pointing to other ontology terms. These can be interpreted as reverse is_a links: the referenced external term is_a the Uberon class that carries the xref.


MIAA

MIAA is undoubtedly better than Uberon in that it is manually curated. However, Uberon has more specific and granular classes than MIAA (Uberon has 2000, MIAA 400). Uberon also attempts to employ is_a, part_of and developmental relations in the same manner as species-specific ontologies (sometimes wrongly) - it attempts (not entirely successfully) to be an ontology. MIAA is more of a terminology.

Uberon subsumes MIAA (MIAA was one of the inputs), and includes xrefs to MIAA IDs.


CARO

CARO represents very general upper level types. Uberon extends CARO.

species-centric anatomy ontologies

Uberon links to these via xrefs. In addition, there is a separate mapping file that provides is_a links between ssAO IDs and Uberon IDs. This is to facilitate subsumption-based reasoning (e.g. queries for uberon:lower_jaw should return ma:lower_jaw).
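As a rough illustration of the subsumption-based query described above, the following minimal Python sketch follows is_a links downward from a query term. The mapping entries are hypothetical: only uberon:lower_jaw and ma:lower_jaw come from the example in the text, the other IDs are invented for illustration, and the real mapping file format is not shown here.

```python
from collections import defaultdict

# Hypothetical (child, parent) is_a links, as a mapping file might provide them.
# Only uberon:lower_jaw and ma:lower_jaw appear in the text; the rest are made up.
MAPPING_IS_A = [
    ("ma:lower_jaw", "uberon:lower_jaw"),
    ("zfa:lower_jaw", "uberon:lower_jaw"),
    ("uberon:lower_jaw", "uberon:jaw"),
]


def build_children(is_a_links):
    """Index the is_a links by parent so they can be walked downward."""
    children = defaultdict(set)
    for child, parent in is_a_links:
        children[parent].add(child)
    return children


def subsumed_terms(term, children):
    """Return the query term plus every term reachable via reverse is_a links."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


children = build_children(MAPPING_IS_A)
print(sorted(subsumed_terms("uberon:lower_jaw", children)))
```

Annotations attached to any of the returned IDs would then count as hits for the Uberon query, which is how a query on the generic class can surface species-specific results.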


Uberon was constructed around analogy rather than homology, purely as a matter of expediency. In fact Uberon may even contain classes that represent groupings that are not even analogous (these are errors and should be removed).

Uberon has utility despite not having a formal phylogenetic representation. Here is why:

What Uberon does do that is really important is that it allows biologists to search for analogy too. A lot of things that are considered analogous are really homologous, just in ways or at levels of granularity that aren't fully understood. Later on we should separate homology from analogy, but I think there should always be some kind of uber ontology that allows searching for analogy. So while parts of Uberon may get a formal homology treatment, there are parts that will remain useful purely as analogous groupings. In fact, I see this as a process of gradual reclassification in terms of homology as we learn.

If nothing else, Uberon can help people consider where we want to go from here.

Use Cases

Currently used in OBD. See http://www.berkeleybop.org/obd

See for example: lower jaw annotations

A query for "lower jaw" in Uberon returns mouse genes, zebrafish genes and human genes that are somehow implicated in phenotypes of the lower jaw. This query also uses MP-XP.

Uberon may also be used to make GO xps, also for facilitating analysis in OBD. See biological_process_xp_anatomy


We subdivide ontologies into those that are species-centric (scAOs) and those that are generic (gAOs). An example of an scAO is the Foundational Model of Anatomy (FMA), which is human-centric. Some scAOs may be applicable further up the taxonomic hierarchy, above the species level - for example, the adult mouse anatomy (MA) [TODO: check with MGI exactly how far up it is applicable and valid]. The Gene Ontology cellular component ontology is an example of a gAO at the subcellular level, applicable to prokaryotes and eukaryotes. The OBO Cell Ontology (CL) is a gAO that represents different kinds of cells across a variety of phyla. Of course, gAOs may contain classes that are only applicable to certain taxonomic subsets. Some ontologies are 'in between' - for example the TAO is an anatomy ontology applicable to teleosts (bony fish), which is more general than the zebrafish anatomy ontology (ZFA) yet more specific than a true gAO.

We tend to find that scAOs sit at the gross anatomical level (presumably at least in part due to the higher cross-species diversity at this level). Some scAOs also delve into cellular and subcellular territory (e.g. ZFA and FBbt). CARO is a very general gAO that defines a set of high level classes to be used across AOs. MIAA is a gAO that defines a minimal set of anatomy terms to be used in microarray annotation.

Our approach is homology-neutral. We seek to group classes from multiple anatomical ontologies regardless of whether the relation between the anatomical entities (AEs) is one of homology or analogy (e.g. convergent evolution). This approach is driven by pragmatism rather than biology - we hope that future efforts will extend this with a more phylogenetically valid approach as outlined in [REF:CARO], and described at the multi-species anatomy ontology meeting [REF].

We first sought to collate all sources of putative homology and analogy between the AEs in different species-centric anatomy ontologies (scAOs). Sometimes these came from the ontologies themselves, in the form of xref tags in the underlying obo file. The xref tag has no fixed semantics, but is by convention used to link scAOs to gAOs - see for example OBO Mappings. For the mouse-human mappings we relied on [REF:Bodenreider et al]. In other cases we had to create our own mappings by running the Obol S3 algorithm (simple synonym and stemming), which performs basic text-based matching.
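To give a feel for what synonym-and-stemming matching involves, here is a deliberately crude Python sketch. It is not the Obol S3 algorithm, whose details are not given here; the stemmer, the function names and the term IDs are all assumptions made for illustration.

```python
import re


def crude_stem(word):
    """Strip a trailing plural 's'; a toy stand-in for a real stemmer."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def normalise(label):
    """Lower-case a label, split it into tokens and stem each token."""
    tokens = re.findall(r"[a-z0-9]+", label.lower())
    return " ".join(crude_stem(token) for token in tokens)


def candidate_mappings(terms_a, terms_b):
    """Pair terms whose names or synonyms normalise to the same string.

    Each argument maps a term ID to a list of labels (name plus synonyms).
    """
    index = {}
    for term_id, labels in terms_b.items():
        for label in labels:
            index.setdefault(normalise(label), set()).add(term_id)
    pairs = set()
    for term_id, labels in terms_a.items():
        for label in labels:
            for match in index.get(normalise(label), ()):
                pairs.add((term_id, match))
    return pairs


# Hypothetical term IDs and labels, not taken from the real ontologies.
mouse = {"ma:1": ["lower jaw", "mandible"]}
fish = {"zfa:1": ["Lower jaws"], "zfa:2": ["pectoral fin"]}
print(candidate_mappings(mouse, fish))
```

Matching of this kind is purely lexical, so it will pair terms that share a name without sharing a meaning, which is one reason the resulting mappings need curation.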

We then sought to eliminate as many non-isomorphic (i.e. not 1-1) pairwise mappings as possible. This was done on an ad-hoc basis by a non-expert, with occasional consultation with experts. This aspect could use further curation.

We collected grouping classes based on mappings. We used an extremely promiscuous grouping algorithm, finding the maximally self-connected sets. This undoubtedly results in some meaningless classes. Each grouping class was given an ID and placed in an ontology called UBERON. This ontology includes is_a links incoming from the external AOs, and the superset of all synonyms and definitions from the external AOs.
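One plausible reading of "maximally self-connected sets" is the connected components of the graph formed by the pairwise mappings. The sketch below computes those components with a union-find structure; this interpretation, the function name and the example IDs are assumptions, not a description of the code actually used.

```python
def group_by_mappings(pairs):
    """Group term IDs into connected components of the mapping graph."""
    parent = {}

    def find(term):
        # Each term starts as its own root; path halving keeps trees shallow.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for term in parent:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical mappings: note there is no direct ma/zfa lower jaw mapping.
mappings = [
    ("ma:lower_jaw", "fma:mandible"),
    ("fma:mandible", "zfa:lower_jaw"),
    ("ma:eye", "zfa:eye"),
]
for members in group_by_mappings(mappings):
    print(members)
```

The example shows why this kind of grouping is promiscuous: ma:lower_jaw and zfa:lower_jaw end up in the same class purely because each is mapped to fma:mandible, so a single bad mapping can merge two otherwise unrelated groups.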

Finally we added links to the ontology based on links in the external AOs. Again, we were extremely liberal in what we accepted: if an external AO contained a link (X,R,Y) (asserted or logically entailed) we also created a link (X',R,Y'), where X<->X' and Y<->Y' are mappings. This ultra-liberal policy undoubtedly creates invalid links. The goal is to create the maximal set for now, and curate this in future. Three cyclic links were generated; these were removed manually.
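The link-lifting rule and the check for cycles can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the "U:" grouping class IDs and the example links are hypothetical, and the sketch only handles asserted links, not logically entailed ones.

```python
def propagate_links(external_links, mapping):
    """Lift each external link (X, R, Y) to (X', R, Y') when both ends are mapped.

    mapping is a dict from external term ID to grouping class ID.
    """
    lifted = set()
    for x, relation, y in external_links:
        if x in mapping and y in mapping:
            lifted.add((mapping[x], relation, mapping[y]))
    return lifted


def has_cycle(links, relation):
    """Return True if the links of one relation type contain a directed cycle."""
    graph = {}
    for x, rel, y in links:
        if rel == relation:
            graph.setdefault(x, set()).add(y)
            graph.setdefault(y, set())
    white, grey, black = 0, 1, 2
    colour = dict.fromkeys(graph, white)

    def visit(node):
        # Depth-first search; reaching a grey node means we looped back.
        colour[node] = grey
        for successor in graph[node]:
            if colour[successor] == grey:
                return True
            if colour[successor] == white and visit(successor):
                return True
        colour[node] = black
        return False

    return any(colour[node] == white and visit(node) for node in graph)


# Hypothetical external links and mappings, for illustration only.
external = [
    ("ma:tooth", "part_of", "ma:lower_jaw"),
    ("zfa:lower_jaw", "part_of", "zfa:head"),
]
mapping = {
    "ma:tooth": "U:tooth",
    "ma:lower_jaw": "U:lower_jaw",
    "zfa:lower_jaw": "U:lower_jaw",
    "zfa:head": "U:head",
}
lifted = propagate_links(external, mapping)
print(sorted(lifted))
print(has_cycle(lifted, "part_of"))
```

Because two external terms can map to the same grouping class, lifting can produce a self-link (X',R,X') or a longer loop even when every source ontology is acyclic, so a check of this kind is a natural companion to the liberal lifting policy.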