NIH taps new partners to build commons with petabytes of biomedical data

A consortium of public and private developers will build out analytics tools, interoperability standards for researchers.
By Mike Miliard
02:05 PM

The National Institutes of Health has chosen Cambridge, Massachusetts-based Seven Bridges, which develops cloud-based analytics tools, to help support the pilot phase of the NIH Data Commons.

Building on its experience developing the NCI Cancer Genomics Cloud, Seven Bridges will lead a public-private partnership comprising Elsevier, UK-based Repositive Ltd. and the Boston VA Research Institute, which helped create the Million Veteran Program, the world's largest genomic database.

Together, those members will enable access to petabytes of additional data: a million-plus indexed datasets from the Repositive Platform, Elsevier's Mendeley data hub, and the VA's GenHub Ecosystem.

NIH's aim is to create a biomedical data discovery and computing environment using FAIR data sharing principles: findable, accessible, interoperable, reusable.

The consortium, called FAIR4CURES, will work in the overall NIH Data Commons pilot to build a full-stack solution bringing together data from a variety of research environments into a single ecosystem to enable discovery, access, and computation.

Seven Bridges' cloud infrastructure for biomedical data analysis includes AWS, Google, and local computer storage solutions. As the company leads the pilot, it will also incorporate interoperability standards, such as Common Workflow Language, to help speed research and open-source development, officials said.

The company will also develop APIs to link biomedical data from the Cancer Genomics Cloud and Gabriella Miller Kids First Data Center to additional NIH datasets, such as those from Trans-Omics for Precision Medicine, Genotype-Tissue Expression and the Model Organism Databases.

"The NIH Data Commons promises to transform the way public biomedical data is stored and analyzed," said Seven Bridges CEO Brandi Davis-Dusenbery. "An effort of this scale has never been tried before and its focus on interoperable data accessibility answers the call to break open data silos, setting new standards for healthcare research."

Twitter: @MikeMiliardHITN