Arlington, Virginia
December 18, 2003
National Science Foundation news release
Decoding of
a variety of plant genomes could accelerate due to two
complementary methods that remove from
analysis vast stretches of DNA that do not contain genes.
The
approaches, applied jointly in efforts to determine the gene
sequences in maize, are described in the Dec. 19 issue of the
journal Science.
The evaluation of these methods and the assembly of the
resulting sequences were undertaken by two groups led by
researchers from The Institute for
Genomic Research (TIGR) in Rockville, Md., and
Cold Spring Harbor Laboratory
in New York.
The
research was funded by the National
Science Foundation's (NSF) Plant Genome Research Program.
Only about
a quarter of the maize genome codes for genes, and these are
found in small clusters scattered through a mixture of
non-coding DNA and transposons (mobile DNA segments). Two
different methods tested by the TIGR group successfully captured
parts of the maize genome containing genes. The gene-sequences
are of most interest because they provide the specific blueprint
for an organism's development, structure and physiology.
With so
much non-gene sequence to deal with, it has not been feasible to
sequence and assemble the whole maize genome with current
technologies. Thus, it is a major shortcut to capture only the
portion of the maize sequence containing its genes without
having to sequence the entire genome.
"Collecting
the maize genes for sequencing is like panning for gold," said
Jane Silverthorne, program director for NSF's plant genome
program. "Just as gold can be separated from the surrounding
rock because it is denser, maize genes can be separated from the
surrounding DNA by their chemical and sequence properties."
The first
method tested, called methylation filtration, removes sequences
that contain a chemical modification (methylation) found on most
of the repeated sequences and transposons, leaving behind the
proverbial gold of genes. It was developed by a team led by
Robert Martienssen and W. Richard McCombie at Cold Spring Harbor
Laboratory.
The second
method, developed by researchers at the University of Georgia,
removes the repeated sequences by separating the DNA into
"high-copy," gene-poor segments and "low-copy," gene-rich
segments.
Led by
Cathy Whitelaw, the research team at TIGR compared sequences
obtained by the two methods. About one fourth of the genes in
each collection matched known gene sequences. About 35 percent
of the genes were represented in both collections.
Each method
was found to enrich for distinct but complementary regions of
maize's 10-chromosome genome. Combined, the methods could cut
the amount of sequencing necessary to find all of the maize
genes to about one-fourth of what it would take to sequence the
entire genome.
As both
methods yielded short stretches of sequence, a major challenge
was to reassemble these into complete genes. To do this, the
Cold Spring Harbor group lined up the sequence pieces from maize
along the rice genome sequence, a deep draft of which was
completed in 2002 by an international consortium. The
researchers then reassembled selected sets of sequence fragments
into complete genes. This approach will be an important part of
assembling the short pieces of DNA yielded by the two
enrichment methods into complete gene clusters.
According
to Silverthorne, "Together, these findings suggest that
scientists could be able to sift out the approximately 450
million base pairs of DNA containing the genes from the maize
genome and then reassemble the sequence. Such a comprehensive
genomic resource would provide growers and breeders a wealth of
tools to improve maize, as well as other cereal crops."
Other
collaborators in the study included the
Donald Danforth Plant
Science Center and Orion Genomics, LLC,
both of St. Louis, Missouri.
Less is more: New technology captures gene-rich DNA segments
Sequencing key regions speeds genome
research in corn and other important crop species
Cold Spring Harbor Laboratory
news release
Obtaining genome sequence information frequently leads to
breakthroughs in the study of a particular organism. Bringing
agriculturally important plant species into the genomic age is
therefore an important goal. However, because they are typically
larger or much larger than the 3-billion-letter human DNA
sequence and have a high proportion of so-called repetitive DNA
that is difficult to sequence and contains few coding regions or
genes, the genomes of many plants--including most agriculturally
important species--have posed significant challenges to
researchers interested in crop improvement, plant molecular
biology, or genome evolution. A new study by Cold Spring Harbor
Laboratory researchers is a significant step toward overcoming
those challenges.
By applying a method they recently developed that captures
gene-rich regions and excludes the vast majority of repetitive,
gene-poor DNA,
Cold Spring Harbor Laboratory
researchers have now achieved a dramatic shortcut to sequencing
the genes of corn. The approach should provide a similar boost
to the sequencing and comparative analysis of other genomes in a
wide variety of biological, biomedical, and biotechnological
settings.
The study, led by Cold Spring Harbor Laboratory scientists W.
Richard McCombie and Robert Martienssen, is published in the
December 19 issue of Science along with a related study carried
out by researchers at The Institute for Genomic Research in
Rockville, Maryland. A key method used in both studies, called
methylation filtration, was developed in 1999 by McCombie and
Martienssen's groups through work funded by the U.S. Department
of Agriculture.
Methylation filtration relies on the observation that the DNA of
repetitive, gene-poor regions in the corn genome (and other
plant genomes) is modified by a process called methylation,
whose study has been pioneered in part by Martienssen's group.
Methylation filtration takes advantage of this observation to
preferentially capture the unmethylated, gene-rich regions of
the corn genome for subsequent analysis. Indeed, the new study
demonstrates that methylation filtration removes 93% of
repetitive, gene-poor DNA. As a result, the researchers were
able to focus their efforts on the sequencing and analysis of
the gene-rich regions of the corn genome.
"This study establishes that methylation filtration, combined
with other simple techniques, can be used to successfully
recover and properly assemble complete gene sequences from
genomes that are otherwise extraordinarily difficult to
decipher," says McCombie. "Moreover, both studies involved
large-scale tests that validated our initial estimates regarding
how well the procedure would work. Perhaps most importantly,
we've shown that after gene-enriched draft DNA sequences are
obtained, they can be converted into the complete sequence of
the corn genes by using the related, but much smaller rice
genome sequence as a guide. We believe that taking this
short-cut approach has brought us very close to a final
sequence map of the biologically important regions of the corn
genome at a fraction of the cost of other approaches," adds
McCombie.
The rice genome, which is about 1/6 the size of the corn genome,
is being sequenced by an international consortium funded
in the United States by the National Science Foundation and the
U.S. Department of Agriculture. Corn is the most important
agricultural crop in the U.S. Because the genome structures of
wheat, oats, barley, and many other crops are quite similar to
that of corn, the approaches outlined by the new study provide
the means to bring investigations of all of these important
crops into the genomics era.
The study was funded by the
National
Science Foundation Plant Genome Research Program. Dr. Jane
Silverthorne, Director of NSF's Plant Genome Research Program,
says, "The success of this project highlights the importance of
virtual center projects in bringing together the expertise
required to tackle large complex problems in genomics."
Scientists discover way to streamline
analysis of maize genome
Rockville,
Maryland
TIGR news release
Combination of Two Techniques Can Help Identify "Gene Islands"
in the Key Crop
Like tiny
islands in a vast sea, the gene clusters in maize are separated
by wide - and extremely difficult to decipher - expanses of
highly-repetitive DNA. This complex structure has greatly
complicated efforts to sequence the genome of maize, which is
one of the world's most important crops.
In an
effort to streamline the way that researchers identify and
sequence the DNA in those gene-rich islands, scientists at
The Institute for
Genomic Research
and collaborators have discovered that two different approaches
to identifying the non-repetitive regions of the genome together
provide a complementary and cost-effective alternative to
sequencing the entire genomes of complex plants.
In a paper
published in the December 19th issue of the journal
Science, the
researchers found that two independent gene-enrichment
techniques - methylation filtering and High-C0t
selection - target somewhat distinct but overlapping regions of
the genome and therefore could be used together to help identify
nearly all of the genes in maize as well as their genomic
structures.
This
finding is significant because the maize genome, which includes
about 2.5 billion base pairs of DNA, is about 20 times larger
than the first plant genome to be deciphered,
Arabidopsis thaliana,
and nearly six times larger than the rice genome. The reason
that the maize genome is so large is that approximately 80%
consists of families of nearly identical repetitive sequences.
The gene-containing sequences are concentrated in the remaining
20% of the genome.
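The size comparisons above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the approximate figures quoted in this release (2.5 billion base pairs, 80% repetitive, 20 times Arabidopsis, roughly six times rice). The function and variable names are illustrative assumptions, and the derived sizes are only as precise as the rounded figures they come from.

```python
# Approximate figures quoted in the release; all are rounded estimates.
MAIZE_GENOME_BP = 2.5e9      # "about 2.5 billion base pairs"
REPETITIVE_FRACTION = 0.80   # "approximately 80%" repetitive sequence
ARABIDOPSIS_RATIO = 20       # maize is "about 20 times larger"
RICE_RATIO = 6               # maize is "nearly six times larger"


def implied_sizes(maize_bp, repetitive_fraction, arabidopsis_ratio, rice_ratio):
    """Return the sizes (in base pairs) implied by the quoted ratios."""
    return {
        "maize_repetitive_bp": maize_bp * repetitive_fraction,
        "maize_gene_rich_bp": maize_bp * (1 - repetitive_fraction),
        "arabidopsis_bp": maize_bp / arabidopsis_ratio,
        "rice_bp": maize_bp / rice_ratio,
    }


sizes = implied_sizes(
    MAIZE_GENOME_BP, REPETITIVE_FRACTION, ARABIDOPSIS_RATIO, RICE_RATIO
)
for name, bp in sizes.items():
    # Report in megabases (millions of base pairs), rounded.
    print(f"{name}: {bp / 1e6:.0f} Mb")
```

On these figures, the gene-containing 20% of the maize genome comes to roughly 500 million base pairs, which is larger than the implied size of the entire rice genome.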
The
challenge for genomic researchers is to explore the gene-rich
islands without having to negotiate through the sea of
highly-repetitive DNA surrounding them. In the
Science study,
researchers reported on two "filtration" techniques that
separate the gene-rich regions from the gene-poor ones,
providing about a four-fold reduction in the amount of
sequencing necessary to find all of the maize genes.
"A
combination of these techniques may be an excellent method for
sequencing maize as well as other large and complex plant
genomes at a cost far lower than current approaches," says Cathy
A. Whitelaw, the TIGR researcher who led the maize analysis
project and is the first author of the Science paper.
The major
collaborators for the study were the Donald Danforth Plant
Science Center in St. Louis, MO; the University of Georgia's
genetics department in Athens, GA; and Orion Genomics, in St.
Louis. The project was sponsored by the National Science
Foundation's Plant Genome Research Program.
"The
success of this project highlights the importance of virtual
center projects in bringing together the expertise required to
tackle large complex problems in genomics," says Jane
Silverthorne, who leads the NSF's plant genome program.
TIGR
Investigator John Quackenbush, the paper's senior author, says,
"Maize is the single largest food crop in the
United States,
so developing strategies to decode its complex genome is a high
priority. More importantly, the techniques that we have
developed will be useful in the analysis of many other crops
such as soybean whose genomes are also highly repetitive."
The two
filtration techniques - methylation filtering and High-C0t
selection - are not new, but this was the first time that they
were tested together on a major scale, in this case with a
combined total of about 93 million DNA base pairs from the maize
genome.
The
methylation filtering technique excludes hyper-methylated DNA
sequences (a characteristic of highly-repetitive
DNA) by means of bacterial restriction systems that cleave
those areas of the genome. The technique was first developed by
scientists at Cold Spring Harbor Laboratory.
The High-C0t
selection technique, developed by researchers at the
University
of Georgia's genetics department, excludes highly-repetitive
DNA sequences by using a different method that separates
DNA
segments into "low-copy" (High C0t)
and "high-copy" (Low C0t)
sequences, which correspond roughly to gene-rich and gene-poor
sections of the genome, respectively.
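The idea of sorting DNA segments into "low-copy" and "high-copy" classes can be illustrated with a toy computation. High-C0t selection itself is a laboratory procedure, not software, so the Python sketch below is only a loose computational analogy and not the method used in the study. It assumes made-up fragments, a hypothetical function name (split_by_copy_number), and an arbitrary threshold: each fragment is scored by how often its short subsequences (k-mers) recur across the whole collection.

```python
from collections import Counter


def kmers(seq, k):
    """Return every overlapping subsequence of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def split_by_copy_number(fragments, k=4, max_mean_copies=2.0):
    """Toy analogy only: split fragments into low-copy and high-copy sets.

    A fragment is called low-copy when its k-mers occur, on average,
    no more than max_mean_copies times across all fragments. The
    threshold and k are arbitrary illustrative choices.
    """
    counts = Counter(km for frag in fragments for km in kmers(frag, k))
    low_copy, high_copy = [], []
    for frag in fragments:
        frag_kmers = kmers(frag, k)
        if not frag_kmers:
            continue  # fragment shorter than k; nothing to score
        mean_copies = sum(counts[km] for km in frag_kmers) / len(frag_kmers)
        if mean_copies <= max_mean_copies:
            low_copy.append(frag)
        else:
            high_copy.append(frag)
    return low_copy, high_copy


# Made-up data: one repeat present five times, plus two unique fragments.
repeat = "ACGTACGTACGT"
unique = ["TTGACCAGGCTA", "GGATCCTTAGCA"]
fragments = [repeat] * 5 + unique

low_copy, high_copy = split_by_copy_number(fragments)
print("low-copy (gene-rich analogue):", low_copy)
print("high-copy (gene-poor analogue):", len(high_copy), "fragments")
```

In this toy data set the five copies of the repeat fall into the high-copy class and the two unique fragments into the low-copy class, mirroring the rough correspondence between copy number and gene content described above.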
When
researchers analyzed the composition of Simple Sequence Repeats
- short, repetitive segments of two, three or four
DNA bases - recovered from the two techniques, they were able to show that
the filtration methods targeted different regions of the maize
genome.
An analysis
of "genetic markers" - sequences related to the maize genetic
map - reinforced that conclusion and further indicated that
these methods do not have significant biases, as newly-sequenced
regions are evenly distributed across the 10 maize chromosomes.
"While both
of these methods increase the rate of gene identification from
maize genomic sequence, our analysis implies that they have
biases; this suggests that both methods are required to ensure
comprehensive coverage of the maize gene space," says W. Brad
Barbazuk, Ph.D., senior bioinformatics specialist at the
Danforth Center.
TIGR's
President, Claire M. Fraser, calls the maize study an
important step in tackling the genomes of complex plants: "Not
only has this project given us a preview of the structure of the
maize genome, it also has helped us find a rapid and
cost-effective alternative to sequencing the entire genome."
The Institute for Genomic Research
(TIGR) is a not-for-profit research institute based in
Rockville, Maryland. TIGR, which sequenced the first complete
genome of a free-living organism in 1995, has been at the
forefront of the genomic revolution since the institute was
founded in 1992. TIGR conducts research involving the
structural, functional, and comparative analysis of genomes and
gene products in viruses, bacteria, archaea, and eukaryotes.
Danforth Center maize genome pilot
sequencing project results in six-fold reduction of effective
size of maize genome
St. Louis, Missouri
December 19, 2003
Initial Results From NSF-Funded Project May Serve As A Cost
Effective Model For Sequencing Large Complex Genomes
As reported
in the December 19, 2003 issue of Science magazine, the Maize
Genomics Consortium, led by scientists at the
Donald Danforth Plant Science Center, has evaluated and
validated a gene-enrichment strategy for genome sequencing
resulting in a six-fold reduction of the effective size of the
Zea mays (maize or corn) genome while creating a four-fold
increase in the gene identification rate when compared to
standard whole-genome sequencing methods.
The
Maize Genomics Consortium, consisting of The Donald Danforth
Plant Science Center, The Institute for Genomic Research (TIGR),
Purdue University, and Orion Genomics, was awarded a two-year,
$6 million plant genome grant on September 20, 2002 by the
National Science Foundation (NSF) to develop and evaluate
high-throughput and robust strategies to isolate and sequence
maize genes. The two gene-enrichment methods used in the
research published in Science are methyl-filtration and high-Cot
selection.
According to Karel R. Schubert, Ph.D., principal investigator
and vice president of technology management and science
administration, and W. Brad Barbazuk, Ph.D., senior
bioinformatics specialist and assistant domain member, both at
the Donald Danforth Plant Science Center, the overall goal of
the pilot sequencing project in maize is to derive an effective
strategy to sequence the maize genome. To meet this goal, the
Maize Genomics Consortium will generate approximately 800,000
total sequence reads using the methyl-filtration and high-Cot
methods, with the results published in Science describing the
analysis of the first 200,000 sequence reads.
It is a
challenging effort to sequence the maize genome, as its size and
structure preclude using the standard whole-genome methods for
sequence analysis and alignment. At about 2 to 3 billion base
pairs, the maize genome is estimated to be 20 times larger than
Arabidopsis, the first plant genome to be completely sequenced.
However, maize probably has only twice as many genes as
Arabidopsis. The rest of the maize genome is made up of a large
amount of highly repetitive DNA including many mobile DNA
elements. Unlike Arabidopsis genes, the maize genes are not
spaced evenly throughout the genome but instead are clustered in
"islands" floating in a large "sea" of repeat-sequence DNA.
To
sequence these "islands", the Maize Consortium employed two
methods for gene-enrichment, methyl-filtration and high-Cot
selection. The methyl-filtration method was developed at Cold
Spring Harbor Laboratory in Long Island, New York, and has been
exclusively licensed to St. Louis-based Orion Genomics. This
method is based on the finding that highly repetitive DNA is
modified (methylated) while genes are largely free of such
modification. The well-established high-Cot selection method was
applied at Purdue University and exploits the fact that gene
sequences are in relatively low abundance compared with the
large amount of repeated non-genic sequences. These methods
target overlapping, but non-identical fractions of the genome
that are highly enriched for gene sequences.
The Donald Danforth Plant Science Center is a not-for-profit
research institution that was founded in 1998 as the product of
a unique and innovative alliance joining the University of
Illinois at Urbana-Champaign, the Missouri Botanical Garden, the
University of Missouri-Columbia, Monsanto Company, Purdue
University, and Washington University in St. Louis. The mission
of the Danforth Center is to increase understanding of basic
plant biology; to apply new knowledge for the benefit of human
nutrition and health and to improve the sustainability of
agriculture worldwide; to facilitate the rapid development and
commercialization of promising technologies and products; and to
contribute to the education and training of graduate and
postdoctoral students, scientists, and technicians from around
the world.