International project doubles genomic variation catalog in cloud database

By Anthony Brino
09:47 AM

The 1000 Genomes Project, an international public-private consortium, has doubled the number of sequenced genomes in its open source catalog of human genomic variations. Researchers say the expanded catalog will greatly aid medical efforts to find the genetic underpinnings of diseases and to target treatments to individuals.

Funded in part by the National Institutes of Health, with researchers from the U.S., Britain, China, Germany and Canada, the 1000 Genomes Project is trying to catalog variants in the human genome that occur in at least 1 out of 50 people, in an effort to identify both common and rare genomic variations. So far the project has mapped the genomes of 1,092 people from 14 populations in Europe, East Asia, sub-Saharan Africa and the Americas.

The findings, published in the October 31 issue of Nature, “provide deeper insights about the presence and pattern of variants in different people's genomes, which is critical information for studying the genomic basis of human disease,” said Eric Green, MD, director of the U.S. National Human Genome Research Institute.

The project has produced an integrated haplotype map, highlighting parts of the genome often associated with disease susceptibility, drug response and reaction to environmental factors. Researchers identified 38 million single-nucleotide polymorphisms, the single-base differences in DNA that account for most population variation. They also mapped structural differences, identifying 1.4 million small DNA base deletions or insertions and 14,000 large DNA base deletions.

Making all of this available in an open source, cloud-based database should expand opportunities for current research and treatment and could help with the commercialization of individual genomic analysis.

There’s something of a biotech race right now to offer individual genome sequencing for $1,000. Currently, a California-based company, Complete Genomics, can do it for around $5,000.

The goal is to genetically tailor pharmaceuticals for individuals and for groups of people with similar genes. Of the dozens of treatments available for a particular cancer, some work for some people and not for others. In trying to understand genetically how certain people respond to certain treatments, researchers have been limited by a lack of large genomic databases for comparing individuals to broader populations.


The 1000 Genomes Project cost $120 million and plans to eventually study 2,500 people from 26 populations. The project recently made its datasets available via the cloud with Amazon Web Services, in the hope that researchers will use the data in their own work and produce innovations.

Aside from medical research, the datasets are also helping trace the evolution of genomic variations across the globe. "I view this project as a Lewis and Clark expedition to the interior of the human genome," said Stephen Sherry, chief of reference collections at the National Center for Biotechnology Information’s Information Engineering Branch. "We knew the outlines and contours (of the genome). Now, we’re trying to document all the fine details such as the rivers and tributaries."