Precise Biochemical Knowledge Starting with Structure-Based Criteria for Molecular Identity

Michel Dumontier

Carleton University


Biochemical ontologies aim to capture and represent biochemical entities, and the relations that exist between them, in an accurate and precise manner. A fundamental starting point is the use of identifiers that precisely and uniquely identify a biochemical entity, whether it be a substance, a quality or a biological process. Yet our current approach to generating such identifiers is often haphazard and incomplete. This prevents us from accurately integrating knowledge and also leads to underspecification of that knowledge.

This talk aims to initiate a discussion on plausible structure-based strategies for biochemical identity, with the ultimate goal of generating identifiers automatically and independently of any particular curator or database, whether for a whole molecule or for some part thereof (e.g. residues, collections of residues, atoms, collections of atoms, functional groups). With structure-based identifiers in hand, we will be in a position to accurately capture specific biochemical knowledge, such as how a set of residues in a binding site is involved in a chemical reaction, including the fact that a key nitrogen atom must first be deprotonated. Such identifiers will thus enhance our current representation of biochemical knowledge and make it fundamentally more useful.
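
As a starting point for that discussion, the following minimal sketch illustrates one way an identifier could be derived from structure alone, for a whole molecule or for a subset of its atoms. It is an assumption made for illustration, not a method proposed in the talk: the function names (`atom_invariants`, `structure_id`), the `mol:`/`part:` prefixes and the choice of SHA-256 are all hypothetical, and the Morgan-style neighbourhood refinement used here is not a true graph canonicalization (it ignores stereochemistry and can fail to separate some non-isomorphic structures).

```python
import hashlib


def _digest(text):
    """Hash a string to a hexadecimal SHA-256 digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def atom_invariants(atoms, bonds):
    """Return one hash per atom that depends only on its structural environment.

    atoms: list of atom labels, e.g. ["C", "N", "H"] (hypothetical input format)
    bonds: list of (i, j, order) tuples indexing into `atoms`
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbours[i].append((j, order))
        neighbours[j].append((i, order))

    # Start from the atom label alone, then repeatedly fold in the sorted
    # invariants of the bonded neighbours (Morgan-style refinement).
    inv = [_digest(label) for label in atoms]
    for _ in range(len(atoms)):
        inv = [
            _digest(
                inv[i]
                + "|"
                + ",".join(sorted(f"{order}:{inv[j]}" for j, order in neighbours[i]))
            )
            for i in range(len(atoms))
        ]
    return inv


def structure_id(atoms, bonds, atom_subset=None):
    """Identifier for a molecule, or for a subset of its atoms within it."""
    inv = atom_invariants(atoms, bonds)
    # Sorting makes the result independent of the input atom numbering.
    whole = "mol:" + _digest(",".join(sorted(inv)))[:16]
    if atom_subset is None:
        return whole
    part = _digest(",".join(sorted(inv[i] for i in atom_subset)))[:16]
    return whole + "/part:" + part


# Methylamine with explicit hydrogens, written with two different atom orders.
amine_a = (["C", "N", "H", "H", "H", "H", "H"],
           [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (1, 5, 1), (1, 6, 1)])
amine_b = (["N", "H", "H", "C", "H", "H", "H"],
           [(0, 3, 1), (0, 1, 1), (0, 2, 1), (3, 4, 1), (3, 5, 1), (3, 6, 1)])
# The same skeleton with one hydrogen removed from the nitrogen.
deprotonated = (["C", "N", "H", "H", "H", "H"],
                [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (1, 5, 1)])

print(structure_id(*amine_a))
print(structure_id(*amine_a, atom_subset=[1]))  # the nitrogen atom only
```

Under these assumptions, the two atom orderings of the same molecule yield the same identifier, while removing a hydrogen from the nitrogen changes both the molecule-level identifier and the identifier of the nitrogen atom itself, which is the kind of distinction needed to state that a particular atom must be deprotonated before a reaction can proceed.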