May 2, 2008

Next-generation DNA sequencer aids research

Faces of determination — Robert Woodhall, David Sexton and Shawn Levy, Ph.D., left to right,
provide information to researchers using the next-generation DNA sequencer. (photo illustration by Neil Brake)

David Sexton, Robert Woodhall and Shawn Levy, Ph.D., left to right, with the new Illumina Genome Analyzer. (photo by Neil Brake)

Just a year ago, deciphering the sequence of a single person's genome — around 3 billion “letters,” or bases, long — cost millions of dollars and took several months. But rapid advances in genome sequencing technology are making this process much faster and cheaper.

Vanderbilt Medical Center has now entered this “next generation” of DNA sequencing with the recent acquisition of a new sequencing instrument, called the Illumina Genome Analyzer, and the establishment of the Genome Technology Core (GTC).

The new sequencing technology will be provided by the GTC, a joint effort of three existing Vanderbilt cores — the Vanderbilt Microarray Shared Resource (VMSR), the DNA Sequencing Facility, and the Computational Genomics Core — and directed by Dan Roden, M.D., assistant vice chancellor for Personalized Medicine.

The new instrument will boost the Medical Center's DNA sequencing capacity by several orders of magnitude, expanding scientific opportunities and enhancing competitiveness for extramural funding.

While Vanderbilt does not plan to decode entire human genomes with the new technology, next-generation sequencing will be particularly useful for sequencing smaller bits of the genome to identify gene variations associated with disease risk and drug response, said Roden.

“So the question is not ‘what is a human genome or viral genome,’ but ‘what are the differences among those?’”

“We will now be able to look at large collections of patients with well-defined disease susceptibilities and drug responses and ask whether there are genetic variations that explain those differences,” Roden said.

Vanderbilt has traditionally performed DNA sequencing by the standard “Sanger” method, which can sequence about 50 million base pairs in one year.

The new technology, dubbed “next-generation DNA sequencing,” works differently from the Sanger method: it reads shorter DNA fragments, but many more of them at once. The Illumina instrument can produce more than 1 billion base pairs in just three days.
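The size of that gap can be checked with a little arithmetic. The sketch below is illustrative only: it takes the two throughput figures quoted above at face value and assumes, hypothetically, that the Illumina instrument runs back to back all year with no downtime, which a real core facility would not achieve. The constant and function names are made up for this example.

```python
import math

# Figures quoted in the article, treated here as round numbers.
SANGER_BP_PER_YEAR = 50_000_000        # "about 50 million base pairs in one year"
ILLUMINA_BP_PER_RUN = 1_000_000_000    # "more than 1 billion base pairs"
ILLUMINA_RUN_DAYS = 3                  # "in just three days"

# Assumption (not from the article): runs are scheduled back to back all year.
DAYS_PER_YEAR = 365


def annual_throughput(bp_per_run, run_days, days_per_year=DAYS_PER_YEAR):
    """Base pairs per year if runs of `run_days` days are repeated nonstop."""
    return bp_per_run * days_per_year / run_days


illumina_bp_per_year = annual_throughput(ILLUMINA_BP_PER_RUN, ILLUMINA_RUN_DAYS)

# How many times more sequence per year than the quoted Sanger figure.
speedup = illumina_bp_per_year / SANGER_BP_PER_YEAR

# The same ratio expressed in orders of magnitude (powers of ten).
orders_of_magnitude = math.log10(speedup)

# Years of Sanger sequencing needed to match one three-day Illumina run.
sanger_years_per_run = ILLUMINA_BP_PER_RUN / SANGER_BP_PER_YEAR

print(f"Illumina, nonstop: {illumina_bp_per_year:.3g} bp/year")
print(f"Ratio to Sanger:   {speedup:.0f}x")
print(f"Orders of magnitude: {orders_of_magnitude:.1f}")
print(f"Sanger years per Illumina run: {sanger_years_per_run:.0f}")
```

Under those assumptions, the instrument's yearly output comes to roughly 2,400 times the Sanger figure, a little over three orders of magnitude, and a single three-day run matches about 20 years of Sanger output.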

The difference between the old and next-generation technologies is like “the difference between a water fountain and a fire hose,” said Shawn Levy, Ph.D., director of the VMSR. “Next-generation technologies provide unprecedented amounts of genomic information, but projects that only require a ‘drink’ of that information will be overwhelmed by the new methods, just as analyzing huge areas of the genome requires a volume of data that standard sequencing cannot deliver efficiently.”

The new methodology will not replace Sanger sequencing, noted Alfred George Jr., M.D., director of Vanderbilt's DNA Sequencing Facility, but it offers an entirely new scale of projects that Vanderbilt investigators can now perform in-house.

For example, he said, an investigator doing candidate gene resequencing (that is, looking for sequence variations in disease-associated genes) can potentially analyze dozens, hundreds and maybe even thousands of genes rather than one gene at a time. That investigator can also probably have 100 genes resequenced with the new method for about the same cost as a single gene using Sanger sequencing.

Next-generation sequencing will also be able to analyze entire viral or bacterial genomes and help determine, for example, why certain strains of bacteria are more dangerous than others.

Timothy Cover, M.D., professor of Medicine, plans to study how the genetic makeup of Helicobacter pylori — a bacterium that resides in the stomachs of nearly half of all humans — might impact the development of gastric cancer and peptic ulcer disease.

“Until recently, most of our knowledge about H. pylori genetic diversity has been based on analysis of individual genes,” said Cover.

“With the availability of next-generation sequencing technology, it will be feasible to analyze entire H. pylori genomes, which will allow us to gain important new insights into geographic diversity among H. pylori strains, and may ultimately help us understand why certain diseases (such as gastric cancer) develop in some H. pylori-infected persons but not others.”

The increased capacity to generate DNA sequence presents a number of computational challenges, however.

“Runs on these machines create an enormous amount of data, and storage can be an issue. And assembling the reads and aligning them to a reference sequence can be a challenge,” said Marylyn Ritchie, Ph.D., director of the Computational Genomics Core.

The informatics group has engineered a “pipeline” from the new instrument to Vanderbilt's supercomputer, the VAMPIRE Linux Cluster, at the Advanced Computing Center for Research and Education (ACCRE). The group also hopes to develop new algorithms to help process and analyze the massive amounts of data obtained by next-generation sequencing.

For more information about the Genome Technology Core and the Illumina Genome Analyzer, contact Christie.ingram@vanderbilt.edu.