EHRs are holding troves of genomic data. Too bad it's not always easily usable

In some cases, more data is needed to draw conclusions, and in others, the tech just isn’t there yet, says John Quackenbush, a computational biologist at Harvard's Dana-Farber Cancer Institute.
By Mike Miliard
10:46 AM

John Quackenbush believes data is the currency for all scientific advancement, driving conclusions and even improving the practice of medicine.

“But the challenge is that when we really look at what data is available, often the data is incomplete or inaccessible or not in a readily usable format,” he said.

Quackenbush, professor of computational biology and bioinformatics at the Harvard Medical School-affiliated Dana-Farber Cancer Institute, worked on the Human Genome Project and did a two-year stint at Stanford University exploring the intersection of genomics and computational biology. So his focus is primarily on the large and fast-growing galaxy of genetic data. But his words hold true for any type of clinical data, in electronic health records and beyond.

At the HIMSS Big Data and Healthcare Analytics Forum on Oct. 23, Quackenbush's keynote address will explore ways data can be made more accessible, usable and valuable for improving care. It's a prospect, you've probably noticed, that's not as easy or clear-cut as it might sound.

"Our ability to make relevant inferences from the genome is still limited," said Quackenbush. "And what it's limited by is that the genome sequence by itself isn't enough. We need to know something about the health and health status of each individual whose genome is sequenced if we ever want to get to the point where we can draw meaningful conclusions."

In other words, he said, "the real challenge we face moving forward is not a dearth of data but instead a wealth of data with incomplete metadata."

Add to that the fact that healthcare's tools and techniques still aren't always up to the task of sifting through the mounds of digital data we've amassed, and the challenges become even more acute.

"There's a lot of useful data in EHRs, but the technology has not kept up with what we really need," said Quackenbush. "There was a really big push to implement them, but part of the challenge is they're very useful for some things but not very useful for everything."

Consider genomics: As Beth Israel Deaconess Medical Center CIO John Halamka, MD, has joked, most EHRs relay precision medicine data using a "highly interoperable standard for such material called 'PDF.'"

As Nephi Walton, MD, assistant professor of genomic medicine at Geisinger Health System, has pointed out, whole genome sequencing typically generates about 100-200 gigabytes of data, which is then distilled down. The rest of that data is "not totally thrown away, it's still at the lab," said Walton, but "essentially we're throwing it away – in large part based on the fact that we don't have a place to put it or an easy way to use it."

Genomics, of course, is highly specialized, data-rich and complex. But even on the more basic level of day-to-day clinical care, Quackenbush says IT systems are limited.

He tells the story of a time not long ago when he was struck by a spell of intense dizziness. The room was spinning. He could barely make it to the phone. He eventually made it to the ED, where his vertigo, thankfully, was traced to loose calcium crystals in his inner ear, a condition that clears up on its own.

But Quackenbush arrived at his office the next morning to a phone call from his primary care physician, who wanted to follow up on what he thought had been a cardiovascular episode.

"The hospital had run an EKG so what they did was annotate me as someone who had a cardiovascular event so they could get reimbursed for running this test," said Quackenbush. "That's symptomatic of the challenge of EHRs. They're designed to help the hospital in its main enterprise, which is getting paid for the services they render. It's a tool for tracking patients for reimbursement, more than medical care."

Similar challenges arise when analytics tools are applied to data without strategic thought about how they're put to work, and what hospitals hope to accomplish.

"With a lot of analytics, people apply it blindly without thinking, is this the appropriate tool?" said Quackenbush. "I can use a wrench to loosen a bolt. Or I can use a wrench to pound in a nail, which is not the best use of the wrench. Either one of those applications doesn't make it a good or bad tool. It's a good tool for what it was designed to be used for."

Consider artificial intelligence: "There's been a tremendous amount of excitement about AI and machine learning," he said. "There's also been a good amount of hype. I work in a domain where some of my colleagues use machine learning tools in an emerging field we call radiomics: You look at quantitative images, look at CT scans and you can extract quantitative features. We can use those to make predictions, we can use them as biomarkers."

"Machine learning tools are exquisitely good at things like tumor segmentation, and the reason is you have a simple yes or no answer," said Quackenbush. "It performs so well because you could get 10,000 images that a really highly qualified radiologist has gone through and circled the tumor. You have training sets, you have data, you have truths."

Indeed, machine learning can often outperform humans, and when radiologists use the tools correctly, there's "tremendous promise for applications like that."

On the other hand, sometimes machine learning isn't the miracle cure some expect it to be, either because it's being used for the wrong problem, or because it's being fed inadequate information.

Cognitive computing stumbles when "we don't have the right data to train an algorithm," said Quackenbush. "There's an interesting discussion. You don't have a system that's learning from the data to make predictions about what the best therapy is, you have physicians interpreting clinical guidelines and papers to make associations that the machine can then carry forward.

"Why does machine learning fail? It's because we don't have data on thousands of patients, all of whom have the same mutation," he said. "If we had all that data we might have the ability to train a really robust algorithm. But we don't. And so absent that data, using machine learning in this context just isn't using the right tool for the right job."

Twitter: @MikeMiliardHITN