Synapse: A Precompetitive Space for Integrative Genomics

Mike Kellen and Stephen Friend, Sage Bionetworks

The past two decades have seen exponential growth in the technical ability to generate genetic and biomolecular data, fueled by advances in measurement technologies.  However, with a few exceptions, these data have failed to improve the prevention or treatment of common human disease.  A fundamental reason for this discrepancy between data generation and clinical improvement is the immaturity of the analytical techniques needed to interpret these new data types meaningfully.  As with any new field, analytical methodologies need to be iteratively developed and refined.  The difficulty of accessing, understanding, and reusing data, analysis methods, or models of disease across multiple labs with complementary fields of expertise is a major barrier to the effective interpretation of genomic data today.  Additionally, much of the data relevant to a particular research question is spread among multiple public and private repositories.  Sage Bionetworks' mission is to catalyze a cultural transition from the traditional single-lab, single-company, and single-therapy research paradigm to a model founded on broad precompetitive collaboration on the analysis of large-scale biological data.  In pursuit of this mission, we are working with academic and corporate partners in several important areas of human health, including cancer, metabolic diseases, and neurodegenerative diseases.  Our focus on data obtained from human clinical trials ensures that the work is directly applicable to advances in patient care.

The technology component of Sage Bionetworks' solution strategy is Synapse, an informatics platform for open, data-driven, collaborative research.  In this DBP we will focus on the integration of NCBO technology within Synapse to support a variety of research in clinical genomics.  Synapse consists of a web portal, a set of web services, and integrations with scientific data analysis tools.  The Synapse Portal will allow scientists to interact and share data, models, and analysis methods, both in the context of specific research projects and broadly across otherwise disparate projects.  The portal is organized around projects, which any scientist can create and invite collaborators to join.  These online workspaces then serve as the glue that helps teams of researchers collaborate to solve complex scientific analysis problems.  Synapse uses a web-service-based architecture into which different client applications can be integrated to support a growing and diverse set of users over time.  Synapse services support annotating, querying, and updating data, analysis code, and models, as well as controlling access to these resources.  They also track the provenance of multi-step analysis procedures executed across multiple analysis tools.  Synapse currently supports the R/Bioconductor statistical environment, allowing users to track analyses performed with published packages or custom scripts.
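
To make the idea of provenance tracking concrete, the following is a minimal illustrative sketch in Python.  It is not the Synapse API: the names ProvenanceLog, Step, record, producer_of, and lineage, as well as the example artifact names, are hypothetical and chosen only for illustration.  The sketch assumes that each analysis step records the tool used, its inputs, and its outputs, so that the chain of steps behind any result can be reconstructed later, even when the steps ran in different tools.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    """One analysis step: a tool consumed some inputs and produced some outputs."""
    name: str
    tool: str
    inputs: tuple
    outputs: tuple


@dataclass
class ProvenanceLog:
    """Hypothetical in-memory record of analysis steps (not the Synapse API)."""
    steps: list = field(default_factory=list)

    def record(self, name, tool, inputs, outputs):
        """Append one step to the log and return it."""
        step = Step(name, tool, tuple(inputs), tuple(outputs))
        self.steps.append(step)
        return step

    def producer_of(self, artifact):
        """Return the step that produced `artifact`, or None for raw inputs."""
        for step in self.steps:
            if artifact in step.outputs:
                return step
        return None

    def lineage(self, artifact):
        """Return the steps that contributed to `artifact`, upstream steps first."""
        ordered, seen = [], set()

        def visit(item):
            step = self.producer_of(item)
            if step is None or step.name in seen:
                return
            seen.add(step.name)
            # Walk the inputs first so that upstream steps are listed earlier.
            for parent in step.inputs:
                visit(parent)
            ordered.append(step)

        visit(artifact)
        return ordered


# Illustrative two-step analysis spanning two different tools.
log = ProvenanceLog()
log.record("normalize", "R/Bioconductor",
           ["raw_expression"], ["normalized_expression"])
log.record("fit_model", "custom script",
           ["normalized_expression", "clinical_traits"], ["disease_model"])

for step in log.lineage("disease_model"):
    print(f"{step.name} ({step.tool}): "
          f"{', '.join(step.inputs)} -> {', '.join(step.outputs)}")
```

Recording inputs and outputs per step, rather than a single linear script history, is one simple design choice that lets a collaborator start from a shared model and walk back to the data and methods that produced it.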