NIH showcases informatics researchers as new open source ventures launch

By Anthony Brino
10:04 AM

After breakthroughs in the 1990s drew the National Institutes of Health's interest to bioinformatics, the National Centers for Biomedical Computing were created with the goal of advancing the field by leaps and bounds, because IT systems hadn’t quite caught up to molecular biology.

The nine centers were founded through the 2000s, and with the advent of new data processing and visualization tools, there's been "an explosion of knowledge" in biomedical research, said Brian Athey, from the University of Michigan Medical School’s National Center for Integrative Biomedical Informatics (NCIBI).

When NCIBI was created in 2005, “we had some kind of architecture of the genome, but it was still pretty new,” Athey said. NCIBI and other centers have since developed software and applications that have led to findings in areas such as prostate cancer progression, organ-specific complications of type 1 and 2 diabetes, bipolar disorder, the metabolomics of obesity and dozens of other biological problem areas. As the field evolves, so do the research institutions. NCIBI, Stanford's oncology center and Harvard's I2B2 code set project are moving to the tranSMART project, an international, open source consortium with funding from the European Union, pharmaceutical companies and other organizations.

More a data warehouse than a research factory, the tranSMART project carries on the broader efforts of NCIBI and others to develop methods for mining genetic and metabolomic knowledge along with data from electronic health records and clinical databases, with the goal of probing the molecular basis of diseases and treatments.

“It’s kind of an end-to-end molecular to phenotype, genotype to phenotype, phenotype to genotype analysis platform that’s being utilized quite significantly by the pharmaceutical industry,” said Athey, who also teaches psychiatry and internal medicine. “Pharma feels that they need all this, but that they shouldn’t develop 10 different platforms for 10 different pharma companies.” In addition to other health organizations, the FDA has been interested in using pharmacogenomic and toxicogenomic data to study the impacts of some drugs.

In January 2012, tranSMART was released publicly under the open source GPLv3 license. It offers a data repository that includes demographics, clinical observations, clinical trial outcomes and adverse events, as well as biomarker data such as gene expression, pharmacodynamic markers, metabolomics data and proteomics data.

NCIBI contributed several interactive tools to tranSMART, including genetic- and metabolomic-specific search functions that link genes and compounds with supporting publications. The genetic data tool has been used more than 1 million times in the last year, Athey said, especially by researchers in China. Another tool can visualize the interactions of genes, enzymes, reactions and compounds in human metabolomics pathways.

The tranSMART project draws on both public and private funding sources, with grants from the E.U. for service delivery, platform development and analytics, and from the U.S. for cloud delivery platforms, standards and interoperability. It’s "a modern way to sustain these kinds of efforts," Athey said. Several of the NIH's bioinformatics centers suffered from declining federal funding and too little industry funding, Athey said, which is why some are winding down or merging.

Even so, he added, the efforts produced several very productive open source software projects and datasets that have led to a number of novel applications and discoveries. NCIBI developed plug-ins that work with Cytoscape, an open source analysis and visualization platform for molecular networks and systems. Cytoscape, developed by a consortium that included the Institute for Systems Biology and Agilent Technologies, has been used and cited for many findings, including the metabolomics of spruce tree hardening during winter, new therapies for drug-resistant malaria and genetic patterns of tumor suppression in ovarian cancer.

With the ability to generate and study genetic and molecular data, the search for molecular-based disease treatments has “gotten to be a much more interesting, difficult and timely a problem,” said Athey, who in the 1980s explained a model for the structure of chromatin, the DNA and protein mixture in a cell nucleus.

(Photo credit: University of Michigan)