‘Big data’ promises big yield for healthcare

By Bernie Monegain
02:02 PM

White House makes good on PCAST advice

WASHINGTON – Among the projects that will benefit from the government’s new “big data” initiative are the National Institutes of Health’s “1000 Genomes Project Data Available on Cloud” and “Core Techniques and Technologies for Advancing Big Data Science & Engineering,” which the NIH is undertaking with the National Science Foundation.

Healthcare stands to reap big rewards from the $200 million initiative launched March 29 by the Obama Administration.

“If US healthcare were to use big data creatively and effectively to drive efficiency and quality, the sector could create more than $300 billion in value every year,” according to a recent report from the McKinsey Global Institute. Two-thirds of that would be in the form of reducing U.S. healthcare expenditure by about 8 percent, researchers note.

The Obama Administration’s “Big Data Research and Development Initiative” promises to “extract knowledge and insights from large and complex collections of digital data” to help address the nation’s most pressing challenges.

It’s a measure the President’s Council of Advisors on Science and Technology (PCAST) recommended in a December 2010 report, calling for high-risk/high-reward research.

At the top of PCAST’s list of priorities: health information technology. PCAST called on the president to “make possible comprehensive lifelong multi-source health records for individuals; enable both professionals and the public to obtain and act on health knowledge from diverse and varied sources as part of an interoperable health IT ecosystem; and provide appropriate information, tools, and assistive technologies that empower individuals to take charge of their own health and healthcare to reduce its cost.”

The advisory group called for going “well beyond the current national program to adopt electronic health records.”

“In the same way that past federal investments in information technology R&D led to dramatic advances in supercomputing and the creation of the Internet, the initiative we are launching today promises to transform our ability to use big data for scientific discovery, environmental and biomedical research, education and national security,” said John P. Holdren, assistant to the president and director of the White House Office of Science and Technology Policy.

Six federal departments and agencies announced more than $200 million in new commitments that, together, promise to improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data, officials said.

The initiative aims to:

• Advance state-of-the-art core technologies needed to collect, store, preserve, manage, analyze and share huge quantities of data;
• Harness these technologies to accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning; and
• Expand the workforce needed to develop and use big data technologies.

Holdren said the initiative responds to recommendations from the President’s Council of Advisors on Science and Technology, which last year concluded that the federal government is under-investing in technologies related to big data. In response, OSTP launched a senior steering group on big data to coordinate and expand the government’s investments in this area.

The first wave of agency commitments to support the initiative includes:

National Science Foundation and the National Institutes of Health – "Core Techniques and Technologies for Advancing Big Data Science & Engineering." NIH is particularly interested in imaging, molecular, cellular, electrophysiological, chemical, behavioral, epidemiological, clinical and other data sets related to health and disease.

National Science Foundation – In addition to funding the big data solicitation, and keeping with its focus on basic research, NSF is implementing a comprehensive, long-term strategy that includes new methods to derive knowledge from data; infrastructure to manage, curate and serve data to communities; and new approaches to education and workforce development.

National Institutes of Health – "1000 Genomes Project Data Available on Cloud." The world’s largest set of data on human genetic variation – produced by the international 1000 Genomes Project – is now freely available on the Amazon Web Services (AWS) cloud. At 200 terabytes – the equivalent of 16 million file cabinets filled with text, or more than 30,000 standard DVDs – the current 1000 Genomes Project data set is a prime example of big data, where data sets become so massive that few researchers have the computing power to make best use of them. AWS is storing the 1000 Genomes Project as a publicly available data set for free, and researchers will pay only for the computing services that they use.

Department of Defense – "Data to Decisions." The Department of Defense is investing approximately $250 million annually (with $60 million available for new research projects) across the military departments to harness and utilize massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own.

Department of Energy – "Scientific Discovery Through Advanced Computing." The Department of Energy will provide $25 million in funding to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute.

US Geological Survey – "Big Data for Earth System Science." USGS is announcing the latest awardees for grants it issues through its John Wesley Powell Center for Analysis and Synthesis. The Center catalyzes innovative thinking in Earth system science by providing scientists a place and time for in-depth analysis, state-of-the-art computing capabilities, and collaborative tools for making sense of huge data sets.