For Lawrence Livermore researchers involved in the Human Genome Project, gene hunting is like standing in front of a mountain, shovel in hand, knowing that somewhere amid tons of rock lies the mother lode.
The search has been going on for years, but it has recently accelerated to a new level, noted Linda Ashworth, a Lawrence Livermore biomedical scientist working in the Laboratory's Human Genome Center. "In 1992, about 80% of our effort was devoted to generating road maps for specific chromosomes or regions on a chromosome [1]. Now, about 70% of our effort goes towards sequencing DNA and furthering sequencing technology."
Sequencing involves determining the exact order of the individual chemical building blocks, or bases, that form DNA. The four chemical bases--commonly abbreviated as A, G, C, and T--bind together to create base pairs that are the "business end" of the DNA molecule. (See figure above.)
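The pairing of bases can be made concrete in a few lines of Python. This is a minimal sketch that assumes the standard pairing rule (A with T, G with C), which the text above does not spell out; the function names are illustrative only.

```python
# Standard base-pairing rule (assumed here; not stated in the article): A-T and G-C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the base that pairs with each base of the strand, in order."""
    return "".join(PAIRS[base] for base in strand)

def reverse_complement(strand):
    """Return the partner strand as it would be read in its own direction."""
    return complement(strand)[::-1]

print(complement("AGCT"))          # partner bases, position by position
print(reverse_complement("AAGC"))  # partner strand read in the opposite direction
```

Because each base has exactly one partner, knowing the order of bases on one strand is enough to know the other.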
After researchers sequence a piece of DNA, they search for the special strings of sequence that form genes. The ultimate goal of the worldwide Human Genome Project is to find all the genes in the DNA sequence and develop tools for using this information in the study of human biology and medicine. Major benefits will be a better understanding of genetic diseases and better treatments for them.

To the Hunt
It is a hunt of gigantic magnitude, a bit like chopping away at Mount Everest with a pick and shovel. Genes range in size from 1,000 base pairs (bp) to over 1,000,000 bp. The smallest human chromosome (21) contains approximately 45 million bp; the largest (chromosome 1) has approximately 250 million. The entire human genome contains about 3 billion bp. As of mid-August 1996, about half of one percent of the human genome had been sequenced worldwide in chunks of 15,000 bp or longer. "Maybe six times that amount has been sequenced in smaller pieces, which are useful for diagnostic purposes," according to Jane Lamerdin, one of the Center's researchers.
To put things in perspective, there are perhaps 100,000 human genes scattered throughout the chromosomes, interspersed with non-gene material. "Chromosome 19, the one we're focusing on here at Livermore, has about 2% of the total DNA, so we're estimating as many as 2,000 genes," said Ashworth. "We have a handle on about 400, so there are a lot left to find." (See the box below.)
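The figures quoted in this section can be checked with simple arithmetic. The sketch below uses only numbers from the text; the variable names are illustrative, and the chromosome 19 estimate assumes, as the quoted estimate implicitly does, that genes are spread across chromosomes in proportion to their share of the DNA.

```python
# Approximate figures quoted in the article (as of 1996).
GENOME_BP = 3_000_000_000   # about 3 billion base pairs in the human genome
TOTAL_GENES = 100_000       # "perhaps 100,000 human genes"
CHR19_PERCENT = 2           # chromosome 19 holds about 2% of the total DNA
CHR19_KNOWN_GENES = 400     # genes the Center had "a handle on"

# Half of one percent of the genome, sequenced in chunks of 15,000 bp or longer.
long_chunks_bp = GENOME_BP // 200
# "Maybe six times that amount" sequenced in smaller pieces.
small_pieces_bp = 6 * long_chunks_bp

# Chromosome 19 estimates, assuming genes are distributed in proportion to DNA.
chr19_bp = GENOME_BP * CHR19_PERCENT // 100
chr19_genes = TOTAL_GENES * CHR19_PERCENT // 100
chr19_genes_left = chr19_genes - CHR19_KNOWN_GENES

print(f"Sequenced in long chunks:   {long_chunks_bp:,} bp")
print(f"Sequenced in small pieces:  {small_pieces_bp:,} bp")
print(f"Chromosome 19 size:         {chr19_bp:,} bp")
print(f"Chromosome 19 genes (est.): {chr19_genes:,}")
print(f"Still to be found:          {chr19_genes_left:,}")
```

On these numbers, roughly 15 million bp had been sequenced in long chunks, and about 1,600 of the estimated 2,000 genes on chromosome 19 remained to be found.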


Genes at Livermore

The difference between weapons testing and gene hunting may seem enormous, but their connection relates to how the study of biology became an integral part of the Laboratory's work. Livermore's first biomedical program was chartered in 1963 to study the radiation dose to humans of isotopes in the environment; a natural extension was to explore how radiation and chemicals interact with human genetic material to produce cancers, mutations, and other adverse biological effects.
In the last 20 years, advances in microbiology, biochemistry, genetics, and bioengineering gave rise to the field of biotechnology. Recent advances in genetic-engineering technologies then made it possible to examine and sequence DNA faster and more efficiently than ever imagined. The Laboratory was well positioned to take advantage of this new field, which combines the disciplines of biology, genetics, engineering, and computer science. Pulling from other Laboratory organizations, the Biology and Biotechnology Research Program called on engineers, physicists, and computer scientists to join biologists to help solve the mystery of the human genome.
In 1987, researchers at Lawrence Livermore began studying all of chromosome 19. This project grew out of research on three genes, each involved in the repair of DNA damaged by radiation or chemicals. As the Laboratory became known worldwide for its work on this chromosome, other researchers, hunting for genes thought to be somewhere on this chromosome, contacted the Laboratory and international collaborations were formed. Such collaborations discovered, for example, the genes for myotonic dystrophy (a late-onset genetic disease causing muscle atrophy) and a form of dwarfism called pseudoachondroplasia.
In 1990, the Department of Energy and the National Institutes of Health formed the joint Human Genome Project. The long-term goal of this 15-year project is to decipher the DNA of the entire human genome. Three DOE national laboratories--Lawrence Livermore, Los Alamos, and Lawrence Berkeley--are DOE centers for this project, while NIH supports eight facilities involved in this work.





High Technology to the Rescue
Advances in genetic-engineering technologies over the past decade are what have made it possible to even contemplate sequencing the entire genome.
Not so long ago, sequencing 40,000 bp was considered a worthy multiyear thesis project for a Ph.D. student. Livermore's Center now sequences this amount in less than a week using the Center's integrated system that sequences and tracks the DNA fragments being studied.
The best current technology allows researchers to sequence about 1,000 bp along a single stretch of DNA. Most facilities sequence in chunks of only 300 bp. The Laboratory's Genome Center currently sequences about 700 to 800 bp along the DNA. "We're entering the era of production sequencing," said Lamerdin. "A lot of the up-front work has been automated. There's no more manual pipetting, for instance. We have robots to do that." (See the photo below.)





To sequence a section of DNA, researchers first use special enzymes that act as biological "scissors" to cut the DNA at specific points into smaller fragments. They then clone these fragments, making hundreds of identical copies of each. To sequence a fragment, they run four nearly identical reactions using that DNA as a template, with the four bases chemically labeled with four different fluorescent dyes.
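The "scissors" step can be sketched in code. The example below is an illustration only: it assumes the well-known enzyme EcoRI, which recognizes the sequence GAATTC and cuts just after the G. The article does not say which enzymes the Center uses, and the function and parameter names are hypothetical. Only one strand is modeled.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a DNA string at every occurrence of a recognition site.

    The defaults model EcoRI (site GAATTC, cut after the first base).
    This is an assumed example enzyme, not one named in the article.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset          # position of the cut within the string
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)   # look for the next recognition site
    fragments.append(dna[start:])       # whatever remains after the last cut
    return fragments

pieces = digest("AAGAATTCCCGAATTCTT")   # made-up sequence with two sites
print(pieces)
```

Joining the fragments back together in order reproduces the original string, which is a quick sanity check on the cutting logic.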
A laser scans the reaction products, exciting the fluorochromes, and a computer captures and stores the resulting fluorescent signals. (See the photo on p. 26.) Software automatically determines the order of bases from the four-color data. The Center has 13 of these sequencing machines, each capable of reading more than 25,000 bases a day. Additional software actually hunts for particular A, G, C, and T combinations that mark the beginnings and endings of genes.
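Two of the software steps described above can be illustrated with a deliberately simplified sketch. The first function picks the brightest of four dye signals at each position; the second scans for the simplest gene signals, the standard start codon (ATG) and stop codons (TAA, TAG, TGA). Both are assumptions for illustration: the Center's actual software is not described in the text, real base-calling and gene-finding programs are far more elaborate, and the channel order and function names here are hypothetical.

```python
DYE_ORDER = "ACGT"  # hypothetical mapping of the four dye channels to bases

def call_bases(intensities):
    """Call one base per position by taking the strongest of four dye signals."""
    return "".join(
        DYE_ORDER[max(range(4), key=lambda i: row[i])] for row in intensities
    )

START = "ATG"                   # standard start codon
STOPS = {"TAA", "TAG", "TGA"}   # standard stop codons

def find_open_reading_frames(seq, min_codons=2):
    """Return (start, end) index pairs for ATG...stop stretches.

    Scans the three forward reading frames only; min_codons counts the
    codons from the start codon up to, but not including, the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

# Made-up four-channel signals that spell out a short test sequence.
signals = [[0.9 if base == dye else 0.1 for dye in DYE_ORDER] for base in "ATGAAATAG"]
sequence = call_bases(signals)
print(sequence, find_open_reading_frames(sequence))

# Combined daily capacity from the figures in the text (a lower bound).
MACHINES = 13
BASES_PER_MACHINE_PER_DAY = 25_000
print(MACHINES * BASES_PER_MACHINE_PER_DAY, "bases per day")
```

At the quoted rates, the 13 machines together read more than 325,000 bases a day.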
A relational database, developed by Lawrence Livermore computer scientists, keeps track of where each clone is, what has been done to it, who did it, when they did it, and what has yet to be done. "When the sequencing was someone's thesis project, the individual usually kept track of progress in a notebook," explained Lamerdin. "But in this kind of high-throughput environment, we need computers to track the progress of all these pieces and also to help us make decisions. Computational support is a critical element in the success of this project."
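A tracking database of this kind might be organized along the following lines. This is a minimal sketch using Python's built-in sqlite3 module; the table names, column names, and sample rows are all hypothetical and are not taken from the Laboratory's actual system.

```python
import sqlite3

def open_tracking_db():
    """Create an in-memory database with a hypothetical clone-tracking schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clone (
            clone_id TEXT PRIMARY KEY,   -- which clone
            location TEXT NOT NULL       -- where it is
        );
        CREATE TABLE step (
            clone_id TEXT NOT NULL REFERENCES clone(clone_id),
            action   TEXT NOT NULL,      -- what was (or must be) done
            operator TEXT,               -- who did it
            done_on  TEXT                -- when; NULL means not yet done
        );
    """)
    return conn

def pending_steps(conn, clone_id):
    """List the actions still to be done for one clone, in insertion order."""
    rows = conn.execute(
        "SELECT action FROM step "
        "WHERE clone_id = ? AND done_on IS NULL ORDER BY rowid",
        (clone_id,),
    )
    return [row[0] for row in rows]

conn = open_tracking_db()
# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO clone VALUES (?, ?)", ("clone-001", "freezer 2, rack B"))
conn.executemany(
    "INSERT INTO step VALUES (?, ?, ?, ?)",
    [
        ("clone-001", "digest", "operator-a", "1996-08-01"),
        ("clone-001", "sequence", None, None),
    ],
)
print(pending_steps(conn, "clone-001"))
```

Keeping completed and pending work in the same table lets a single query answer both "what has been done" and "what has yet to be done" for any clone.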

After Sequencing
Determining the human genome sequence and finding the genes is really just a first step. "Knowing the bases that make up a gene and where it's located on a chromosome doesn't tell you what the gene does," noted Ashworth. "After sequencing, we still need to determine what proteins the genes produce, and what those proteins do in the cell."
Why bother? First and foremost, genes and their proteins hold the key to unlocking the mysteries of inherited diseases. Once the genetic code for a disease is broken, gene and drug therapies can follow. For example, the gene for cystic fibrosis was discovered in 1989, and while we are still a long way off from "fixing" the gene defect that causes this disease, unraveling the gene's secrets has allowed private industry to deal with one of the major symptoms of cystic fibrosis.
"So, the sequence is really a starting point," said Ashworth. "We still need to know the structure and function of the protein produced by the gene, and how that protein interacts in the environment of the cell. The sequence, you might say, is the detailed map we need to help us find the buried treasure."
Future S&TR highlights will discuss the Center's work on the next-generation sequencing machine and a collaboration to uncover the gene involved in one form of inherited kidney disease.





Key Words: chromosome, DNA sequencing, gene, Human Genome Project.

Reference
1. "The Human Genome Project," Energy & Technology Review, UCRL-52000-92-4/5 (April/May 1992), pp. 29-62.

For further information contact Linda Ashworth (510) 422-5665 (ashworth1@llnl.gov).




Back to November 1996