Updates on the NCBO Driving Biomedical Project “Ontology-Based Annotation of Biomedical Time-Series Data”

WEBEX RECORDING: https://stanford.webex.com/stanford/lsr.php?AT=pb&SP=MC&rID=42938592&rKe...


An electrocardiogram (ECG) records the time-varying electrical potential generated by the electrical activity of the heart. It is one of the most common measurements made in cardiovascular clinical research. ECG signals are usually measured by placing electrodes on the torso in a 3- or 12-lead configuration and recording short segments of electrical activity (~10 s). ECG signals may also be measured with a portable device known as a Holter monitor to obtain longer recordings (~24 hours).
Features of the ECG waveform provide important information on the electrical activity of the heart. The P-wave represents electrical activation of the atria. The QRS complex is generated as a wave of electrical activity propagates throughout the cardiac ventricles, producing mechanical contraction. The ST segment reflects the amount of time the ventricles remain electrically depolarized. The T-wave is generated as the myocardium recovers from electrical excitation. Changes in the properties of these ECG features are important indicators of abnormal heart activity. For example, prolongation of the QRS complex duration reflects slowed conduction in the heart. Prolongation of the QT interval is now known to be an important biomarker for risk of arrhythmia. Despite the diagnostic value of these data, there are no open, non-proprietary tools for their storage and dissemination. ECG data are collected and stored in proprietary formats using “heart stations”. Data stored in heart stations are difficult, and sometimes impossible, for users to access directly. Therefore, in most clinical studies, the only ECG data retained are values computed from the ECG time-series by analysis software integrated into the heart station (e.g., QT interval variability), images (JPEG, PNG, etc.) of a few cycles of the ECG, or even paper recordings.
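To make the interval measurements mentioned above concrete, here is a minimal Python sketch of how a QRS duration and a QT interval could be derived once the landmark points of a beat have been marked in the digital time-series. The `BeatFiducials` class, its field names, and the 500 Hz sampling rate are illustrative assumptions, not part of the project's software or of any heart-station format.

```python
from dataclasses import dataclass


@dataclass
class BeatFiducials:
    """Sample indices of landmark points in one cardiac cycle (hypothetical layout)."""
    qrs_onset: int   # first sample of the QRS complex
    qrs_offset: int  # last sample of the QRS complex
    t_end: int       # sample at which the T-wave ends


def interval_ms(start_sample: int, end_sample: int, sampling_rate_hz: float) -> float:
    """Convert a span between two sample indices into milliseconds."""
    if end_sample < start_sample:
        raise ValueError("end_sample must not precede start_sample")
    return (end_sample - start_sample) * 1000.0 / sampling_rate_hz


def qrs_duration_ms(beat: BeatFiducials, sampling_rate_hz: float) -> float:
    """QRS duration: QRS onset to QRS offset."""
    return interval_ms(beat.qrs_onset, beat.qrs_offset, sampling_rate_hz)


def qt_interval_ms(beat: BeatFiducials, sampling_rate_hz: float) -> float:
    """QT interval: QRS onset to the end of the T-wave."""
    return interval_ms(beat.qrs_onset, beat.t_end, sampling_rate_hz)


# Made-up fiducial points for a single beat sampled at an assumed 500 Hz.
beat = BeatFiducials(qrs_onset=100, qrs_offset=150, t_end=300)
print(qrs_duration_ms(beat, 500.0))  # 100.0
print(qt_interval_ms(beat, 500.0))   # 400.0
```

Measurements like these can be recomputed, or computed differently, only when the primary time-series is retained. That is not possible when a study keeps just the derived values or images.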
The CardioVascular Research Grid Project is developing tools for extracting ECG data from heart stations and storing them in digital form so that all the primary ECG time-series data collected in clinical research studies can be saved, analyzed using open-source software, and shared with others. In an NCBO Driving Biomedical Project, we are going beyond this work to develop an ontology for describing ECG data collection protocols, instrumentation, waveform features, and values calculated from the ECG signal. We are developing a web interface, using the Google Web Toolkit (GWT), through which users can search for specific ECG data sets in data services, download these data from the services, visualize the data, select features of interest in the ECG waveform, annotate those features by accessing concepts in the ECG Ontology stored in BioPortal, and then store the annotated data back into data services. This infrastructure now makes it possible to save and disseminate carefully annotated primary digital ECG data sets. Annotation makes it possible for others to re-analyze the data in ways not intended in the original study. It also supports the integration of ECG primary and derived data across studies.
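As a rough illustration of the annotation step described above, the following Python sketch shows one way a selected waveform segment could be tied to an ontology concept and serialized for storage. It is not the project's actual data model: the `WaveformAnnotation` class, its fields, the record name, and the concept identifiers (`ECG:EXAMPLE_QRS`, `ECG:EXAMPLE_T`) are hypothetical placeholders, not real ECG Ontology or BioPortal identifiers.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class WaveformAnnotation:
    """Links a time span on one ECG lead to an ontology concept (hypothetical schema)."""
    record_id: str      # identifier of the ECG recording in a data service (assumed)
    lead: str           # lead on which the feature was selected, e.g. "II"
    start_s: float      # start of the selected segment, in seconds
    end_s: float        # end of the selected segment, in seconds
    concept_id: str     # placeholder ontology term identifier
    concept_label: str  # human-readable label of the concept

    def __post_init__(self) -> None:
        if self.end_s < self.start_s:
            raise ValueError("end_s must not precede start_s")

    def duration_s(self) -> float:
        return self.end_s - self.start_s


def find_by_concept(annotations: List[WaveformAnnotation],
                    concept_id: str) -> List[WaveformAnnotation]:
    """Return the annotations tagged with a given concept identifier."""
    return [a for a in annotations if a.concept_id == concept_id]


def to_json(annotations: List[WaveformAnnotation]) -> str:
    """Serialize annotations so they can be stored alongside the primary data."""
    return json.dumps([asdict(a) for a in annotations], indent=2)


# Made-up annotations on a made-up recording.
annotations = [
    WaveformAnnotation("rec-001", "II", 0.20, 0.30, "ECG:EXAMPLE_QRS", "QRS complex"),
    WaveformAnnotation("rec-001", "II", 0.40, 0.60, "ECG:EXAMPLE_T", "T wave"),
]
print(to_json(find_by_concept(annotations, "ECG:EXAMPLE_QRS")))
```

Storing a concept identifier rather than free text is what lets features be matched across studies, since every annotation of the same feature points to the same ontology term.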
In this presentation, we will demonstrate these tools. We will describe: a) recent extensions to the ECG Ontology, including the addition of new concepts provided by the user community; b) integration of the Basic Formal Ontology; c) extensions to the web interface made using the GWT; and d) application of these tools in what are now nine large-scale NHLBI-funded clinical research projects.