vassarmicro

 

Metagenomics

Page history last edited by leigh 1 yr ago


 

 The genes all around you!


 


 

Introduction

 

Almost every environment on Earth harbors life, but most of that life is not visible to the naked eye. Traditionally, scientists have tried to culture microbes in the laboratory, but the technique is ineffective: fewer than 1% of microbes can be cultured. A new technique was needed. In the mid-1980s, Norman Pace proposed a revolutionary way to study microbial communities without growing them in a lab (Nicholls 2007). He proposed that the DNA of a community could be analyzed directly from a mixture of organisms sampled from the environment. At the time, however, his ideas outpaced the available technology, which was not developed until the mid-1990s. In 1998, Jo Handelsman and her colleagues named this new branch of biology metagenomics (Nicholls 2007).

 

Through metagenomics, many new species, genes and ecosystems have been discovered in unlikely places, ranging from hot vents deep in the ocean to abandoned battery acid mines. These discoveries have led to a number of medicinal breakthroughs, especially in pharmaceuticals. The technology might eventually reveal how life on our planet began (Nicholls 2007).

 

The techniques used are simple but highly effective. After a sample is collected, its DNA is broken into many smaller pieces and sequenced. A computer then works backwards, reconstructing the sequenced pieces into individual genes or even entire genomes. The reconstructed genomes reveal the genes that were present in the sample (Nicholls 2007).
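To make the reconstruction step concrete, here is a minimal sketch of how overlapping fragments can be stitched back together. It assumes a toy "greedy overlap" strategy with error-free reads; the function names and the `min_len` parameter are illustrative only, and real assemblers are far more sophisticated (they handle sequencing errors, repeats and both DNA strands).

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    # Drop exact duplicates (keeping order) and reads contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]

    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            # Nothing overlaps any more: what is left are the final contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    # Three hypothetical reads taken from the sequence ATGGCGTACGTTAGC.
    fragments = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
    print(greedy_assemble(fragments))
```

Note that no complete template is needed: the overlaps between fragments alone determine how they fit together, which is why fragments must be numerous and must overlap.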

 

 

Methods

 

Metagenomics would not be possible without modern sequencing techniques. Older techniques involved carefully culturing an organism on nutrient-rich media, then isolating its DNA and copying it with PCR before finally sequencing that organism's DNA. The process is now more efficient because of a technique called shotgun Sanger sequencing, which does not use lab-cultured samples. Instead, all of the organisms in a sample are processed concurrently, and the resulting data reflect the percentage of each organism in the sample.
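As a small illustration of how sequence data can reflect the percentage of each organism, the sketch below tallies reads that have already been assigned to an organism. The assignment step itself is assumed to have happened elsewhere, and the labels and counts are made up for the example.

```python
from collections import Counter


def relative_abundance(read_labels):
    """Return the percentage of reads assigned to each organism."""
    counts = Counter(read_labels)
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical assignments for ten reads from one sample.
    labels = ["Prochlorococcus"] * 5 + ["Pelagibacter"] * 3 + ["Synechococcus"] * 2
    for taxon, pct in relative_abundance(labels).items():
        print(f"{taxon}: {pct:.1f}%")
```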

 

 

A large step for microorganisms

 

In the field of microbiology, few people are more famous than J. Craig Venter. His efforts have pushed scientific discovery far ahead of where it stood. Although he is best known for being among the first to sequence the human genome, his research has gone far beyond that.

 

In the spirit of Darwin's travels on the Beagle, Venter began a global expedition. A specially outfitted yacht set sail in 2003 to take ocean samples around the world. On a scale unprecedented in microbiological research, the team set out to evaluate the world's microbial diversity. The team traveled over 32,000 miles, collecting a 200-liter sample every 200 miles. The samples were frozen and shipped to a land-based research center, where they were sequenced using shotgun techniques. The results were astonishing: more than 1.2 million new genes and over 1,800 new species were discovered, many in places formerly thought to be relatively devoid of life (Global Sampling 2007).

 

 

Figure 1: Map showing route and the 41 sample locations of Sorcerer II

 

 

To test techniques before starting this expedition, a similar but shorter expedition was sent to sample a limited area of the Sargasso Sea. The data from that voyage revealed several hundred new proteorhodopsin light-harvesting genes, a discovery that may help reveal their role in energy metabolism under low-nutrient conditions (Rusch 2007).

 

The trans-ocean voyage brought back massive amounts of information, which scientists will be analyzing for years to come. Altogether, there were over 7 million sequencing reads, which came from random clone insert libraries. In all, the analysis reported 6.25 Gbp of generated sequence, 6.4 Gbp of non-redundant sequence, and 6 million contiguous sequences. Large contiguous sequences link genes together into operons, which helps reveal their origin and function. Anonymous sequences are compared with well-studied taxonomic markers, such as 16S rRNA or recA, to identify their taxonomic group. To capture as many matches as possible, sequences were compared using BLAST with a similarity cutoff of only 55%, so that anything remotely similar would be picked up. Fifty-five percent is the lowest similarity at which sequences are still close enough for comparison. Over 70% of all reads aligned to at least one of the 584 known microbial genomes used as references. Most aligned only at low similarity, making them related, but only distantly, to the reference genomes. With a higher similarity cutoff, only 30% were recognizable, which suggests that the unrecognized reads came from species that had not been sequenced before (Rusch 2007).
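The effect of the similarity cutoff can be illustrated with a short sketch. It is a simplification, not the method used in the study: instead of BLAST, it assumes a plain ungapped comparison of each read against every same-length window of a reference, and the sequences, reference name and function names are all hypothetical.

```python
def percent_identity(read: str, ref: str) -> float:
    """Best ungapped identity (%) of `read` against any same-length window of `ref`."""
    if not read or len(read) > len(ref):
        return 0.0
    best = 0
    for start in range(len(ref) - len(read) + 1):
        window = ref[start:start + len(read)]
        matches = sum(1 for x, y in zip(read, window) if x == y)
        best = max(best, matches)
    return 100.0 * best / len(read)


def recruit(reads, references, threshold=55.0):
    """Map each read to the names of references it matches at >= threshold identity."""
    return {
        read: [name for name, seq in references.items()
               if percent_identity(read, seq) >= threshold]
        for read in reads
    }


if __name__ == "__main__":
    references = {"refA": "ATGGCGTACGTTAGC"}        # made-up reference genome
    reads = ["GCGTACGT", "GCGAACGT", "TTTTTTTT"]    # exact, one mismatch, unrelated

    print(recruit(reads, references, threshold=55.0))  # permissive cutoff
    print(recruit(reads, references, threshold=90.0))  # stricter cutoff
```

With the permissive cutoff, both the exact read and the read with one mismatch are recruited to the reference; raising the cutoff keeps only the exact read, mirroring how fewer reads are recognizable at higher similarity.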

 

 

Only genomes from the genera Prochlorococcus, Synechococcus, Pelagibacter, Shewanella, and Burkholderia yielded substantial and uniform recruitment of fragments. The first three together accounted for 50% of all recruited reads. The other 50% came from Shewanella, Burkholderia, and an assortment of other genera. More detailed sequencing included the entire genomes of Pelagibacter ubique HTCC1062, Prochlorococcus marinus MIT9312, and Synechococcus WH8102 (Rusch 2007).

 

The distribution of ribotypes reveals a number of distinct microbial communities, and very few ribotypes appear to be ubiquitous. Ribotypes that appear in numerous samples tend to thrive in similar environments. For example, SAR11 and SAR86 were widespread only in samples taken near the surface. Each environment has distinguishing ribotypes specific to it, and this similarity will allow us to learn more about each microbial community's metabolic needs. Applying what was already known about the microbes found in each aquatic niche, it seems that the communities are affected more by the amount of light and nutrients than by salinity or temperature. Temperature and salinity still have an effect, but a smaller one. A number of new genes were identified in each marine habitat, with phosphorus collection and utilization genes the most prominent. The number found depended on the area sampled, with a correlated difference between samples from the Atlantic Ocean and the Pacific Ocean: the number of phosphorus genes increased as the amount of free phosphorus in the water increased (Rusch 2007).

 

One of the largest questions to come out of the expedition was how subtype variants have coexisted for so long. As communities adapt and change, one dominant form of a bacterium tends to out-compete the other variants. Yet this expedition found many closely related species living in the same environment. A similar phenomenon has been observed in experiments dealing with plankton and has been termed the "Paradox of the Plankton". This question, like many of the findings, could not be fully explored at the time because of limitations in the experiment. To sequence such large amounts of microbial DNA simultaneously, depth had to be sacrificed: only a few hundred base pairs were sequenced for each sample. This lack of depth, and the reliance on fully sequenced genomes, made the project more of a survey than an in-depth study. Future research can build upon what was learned here to discover new and wonderful things (Rusch 2007).

 

 

Applications

 

The relatively new field of metagenomics offers several applications. Within microbiology, researchers use the technique to explore microbial adaptations to extreme environments and to understand the role microbes play in an ecosystem. For example, the acid mine drainage at Iron Mountain in California has a pink biofilm floating on the water. The conditions there are some of the harshest on Earth, with temperatures of 42°C and concentrations of iron, zinc, copper and arsenic high enough to be lethal to people. The pH of the water below the biofilm is between zero and one, and carbon and nitrogen appear only in gaseous forms (Handelsman 2004). The study revealed that Leptospirillum, Sulfobacillus, Acidimicrobium, and an archaeal species, Ferroplasma acidarmanus, dominate the community. The research team found that Leptospirillum III, which represents 10% of the bacterial population, carries out nitrogen fixation (Riesenfeld 2004). Ferroplasma and Leptospirillum both seem to generate energy through iron oxidation. Each species maintains a neutral internal pH, most likely with a proton efflux system, and keeps metals at nontoxic levels by pumping them out of the cell (Handelsman 2004). By using metagenomics to identify the relevant genes, researchers were able to determine some of the species responsible for supporting this ecosystem.

 

Metagenomics can also be applied outside microbiology, in paleogenomics. Poinar’s research team sequenced 28 million base pairs of DNA from a Siberian woolly mammoth. Almost half of those sequences belonged to the mammoth itself. Of the mammoth DNA, 98.5% was identical to that of the African elephant. It may be possible to complete the mammoth’s genome, jumpstarting the field of paleogenomics (Poinar 2006).

 

Metagenomics also has the potential to add to the repertoire of biotechnology. New organic molecules can be discovered, along with the genes that control them. Some of these molecules are antibacterials of interest to medicine. Turbomycin, for example, is one of the first antibiotics that the technique helped identify (Schloss 2003). Microbial molecules could be used in laundry detergents, food applications, agriculture, livestock feed, textile processing and paper processing. Each of these industries is highly lucrative, creating an incentive for research. Furthermore, a new molecule could potentially provide an alternative energy source (Lorenz 2005). Metagenomics could also isolate genes that help eradicate pollution in the environment and provide a tool to counteract bioterrorism (Jurkowski 2007).

 

There are two approaches to isolating molecules useful to biotechnology: sequence-driven analysis and function-driven analysis. The sequence-driven approach relies on conserved DNA sequences, scanning a metagenomic library for clones that contain sequences of interest to the study. To identify these sequences, researchers design hybridization probes or PCR primers. Random sequencing of clones has also been a successful technique. In contrast, the function-driven method identifies clones that express a beneficial trait; sequencing and biochemical analysis then characterize the gene in question. This second approach gives fast results, but it works only if the host cell expresses the gene, which limits its output (Schloss 2003).
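The sequence-driven screen can be pictured as a search of every clone for a known conserved sequence. The sketch below is a computational analogy only, assuming exact matching of a short probe on either DNA strand; the clone names, insert sequences and probe are invented for the example, and laboratory hybridization or PCR tolerates mismatches in ways this toy search does not.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def screen_library(clones: dict, probe: str) -> list:
    """Return names of clones whose insert contains the probe on either strand."""
    rc = reverse_complement(probe)
    return [name for name, insert in clones.items()
            if probe in insert or rc in insert]


if __name__ == "__main__":
    # Hypothetical metagenomic library: clone name -> insert sequence.
    library = {
        "clone1": "TTGACGGATCCAAT",  # carries the probe on the forward strand
        "clone2": "CCCCCCCC",        # no match
        "clone3": "ATTGGATCCGTCAA",  # carries the probe on the reverse strand
    }
    print(screen_library(library, "GACGGA"))
```

A function-driven screen has no equivalent shortcut, since it depends on the host cell actually expressing the gene rather than on the sequence being recognizable.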

 

 

 

 Problems of Metagenomics

 

Metagenomics is almost too effective. The technique yields hundreds of thousands of microbial sequences at once. Sifting through this unprecedented volume of results is a daunting task that will take several years (Nicholls 2007).

 

Furthermore, the software built to analyze sequences was not designed to manage such an immense input of data. According to Oremland, the 1.2 million DNA fragments isolated from the Sargasso Sea overwhelmed GenBank (Oremland 2005). Five percent of GenBank’s sequences now come from the Sargasso Sea, and some of these reconstructed sequences do not exist naturally. To compensate, Venter made the trace files of individual sequences available, allowing GenBank users to confirm the validity of GenBank results. In addition, analysts using GenBank must be aware that a match to a Sargasso Sea sequence may reflect the sheer number of Sargasso sequences rather than actual similarity (Riesenfeld 2004).

 

According to Henry Nicholls, rare microbial species represent the bulk of microbial diversity, perhaps as much as 90%. Mitchell Sogin, director of the Josephine Bay Paul Center in Comparative Microbiology and Evolution at the Marine Biological Laboratory, thinks that the environments Craig Venter sampled hold 100 times as many species as were collected. Some researchers are currently adjusting the metagenomics process to include the underrepresented species (Nicholls 2007).

 

 

Benefits of Metagenomics

 

The culture-independent technique of environmental genomics bypasses the more labor-intensive and limited technique of growing bacterial cultures in the laboratory. For each culture grown, only a single DNA fragment can be sequenced. These are unimpressive results in comparison to metagenomics, which can generate hundreds of thousands of sequences simultaneously. Nor is the ability to grow a microbe in a laboratory a limiting factor in metagenomics. To be plated successfully, a microbe needs to be within its ‘Goldilocks Zone.’ Improper nutrients, temperature, pH balance, salinity, light or oxygen could each be independently responsible for inhibiting laboratory growth of a microbe. Researchers estimate that as many as 99% of all microorganisms cannot be cultured by standard culturing techniques (Riesenfeld 2004). Metagenomics is an invaluable and necessary technological advancement in the field of microbiology. Though it does present new problems, it invites a significantly wider spectrum of research, expanding scientific knowledge, biotechnology and medicine.

 


 

References

 

Global Ocean Sampling Expedition. Rockville, MD: J.Craig Venter Institute, 2007.

 

 

Handelsman, Jo. "Metagenomics: Application of Genomics to Uncultured Microorganisms." Microbiology and Molecular Biology Reviews. 68.4: 669-685. December 2004.

 

 

Jurkowski, Anne, Ann H. Reid, and Jay B. Labov. "Metagenomics: A Call for Bringing a New Science into the Classroom (While It's Still New)." CBE--Life Sciences Education. 6: 260-265. Winter 2007.

 

 

Lorenz, Patrick and Jurgen Eck. "Metagenomics and industrial applications." Nature Reviews Microbiology. 3: 510-516. June 2005.

 

 

Nicholls, Henry. "Welcome to our world." NewScientist. 2596: 44-47, Mar. 17 2007.

 

 

Oremland, Ronald S., et al. "Whither or wither geomicrobiology in the era of 'community metagenomics.'" Nature Reviews Microbiology. (Accessed 6 January 2005). www.nature.com/reviews/micro 10 June 2005.

 

 

Poinar, Hendrik N., et al. "Metagenomics to Paleogenomics: Large-Scale Sequencing of Mammoth DNA." Science. 311.5759: 392-394. 20 Jan 2006.

 

 

Riesenfeld, Christian S., Patrick D. Schloss, and Jo Handelsman. "Metagenomics: Genomic Analysis of Microbial Communities." Annual Reviews. (Accessed 21 April 2008). http://arjournals.annualreviews.org 38: 525-551. 14 July 2004.

 

 

Rusch, Douglas B, et al. "The Sorcerer II Global Ocean Sampling." PLOS Biology 5 (2007): 0398-431. 18 Apr. 2008.

 

 

Schloss, Patrick D. and Jo Handelsman. "Biotechnological prospects from metagenomics." Current Opinion in Biotechnology 14: 303-310. 2003.

Comments (13)

Nick Katz said

at 9:49 pm on Feb 24, 2008

How does the computer know how to reconstruct the DNA sequences into the correct genome? Also, what can you learn about an organism from its genome?


Shirley Shangguan said

at 2:09 pm on Feb 27, 2008

How specifically do you collect DNA samples?

Anonymous said

at 7:02 pm on Feb 27, 2008

If the DNA is broken up into smaller pieces, how can it be reassembled without a complete template as a guide? Also, is the DNA broken up randomly, or only at certain places with restriction enzymes?


Emma Marsh said

at 7:06 pm on Feb 27, 2008

I guess I wasn't logged in for that last one. -Emma

Edem Binka said

at 12:55 am on Feb 28, 2008

How easy is it for existing academic institutions, like Vassar, to utilize metagenomics in the classroom or in the lab?

david esteban said

at 8:36 pm on Feb 28, 2008

I'm probably in a better position to answer Edem's question, so here goes. For the most part, metagenomics is too expensive for small institutions. So much sequencing gets very costly. However, next year, the micro class will be doing a metagenomics project! I recently got a grant to help us do that, but mostly it's thanks to the generosity of a major sequencing center that will do the sequencing for free!

Katrina Mateo said

at 1:32 am on Mar 1, 2008

How accurate is this method? How does it account for DNA repeats and mutations?

Anonymous said

at 12:12 am on Apr 22, 2008

Edem Binka: There is a really cool video on the JCVI website that talks about how the ocean sampling project was conceived and where the research is leading. Follow this link: http://www.jcvi.org/cms/nc/research/projects/gos/video/

Anonymous said

at 3:45 pm on Apr 22, 2008

Shirley: Is metagenomics helping in the field of medicine in terms of treating disease?

Dylan Hershkowitz said

at 9:29 pm on Apr 22, 2008

What will be done with the new genomes that are sequenced? Other than having a greater understanding of the diversity of life on Earth, can this have applications for helping humans?


Emma Marsh said

at 9:39 pm on Apr 22, 2008

Is it useful to use metagenomics to gain information about specific organisms, or just how the community functions as a whole? Although the DNA information is important, how valuable is this without being able to study the actual bacteria? Is it the hope that eventually these newly found species will be culturable?

Adriana said

at 11:30 pm on Apr 23, 2008

If they're using shotgun sequencing on the huge samples they've found, how accurate are the results? It seems that using the shotgun method is inaccurate while only sequencing one organism, so can all these random genes they've found be processed to determine phylogeny/functions/etc.???

Stephen Evans said

at 10:07 am on May 21, 2008

Metagenomics seems like a promising field for research, but I too, am wary of something that employs shotgun sequencing and generates so much data, perhaps through a method that is less than ideal. How useful is it in the short-term to generate hundreds of thousands of sequences that may not be analyzed/fully-sequenced/understood/or even looked at for years? I can see arguments for both sides. Supporting argument: our technology and understanding of genomics is increasing at an exponential rate, and we may be able to comprehend the data we are generating currently far quicker than we first thought. Critical argument: our research endeavors would be better served by focusing in on one or a few topics/sequences which we can fully sequence (not using shotgun sequencing) and then attempt to comprehend.
