Getting Genetics Done: Comparing Sequence Classification Algorithms for Metagenomics

by cupton · April 2, 2013

See on Scoop.it – Virology and Bioinformatics from Virology.ca

Metagenomics is the study of DNA collected from environmental samples (e.g., seawater, soil, acid mine drainage, the human gut, sputum, pus, etc.). While traditional microbial genomics typically means sequencing a pure cultured isolate, metagenomics involves taking a culture-free environmental sample and sequencing a single gene (e.g. the 16S rRNA gene), multiple marker genes, or shotgun sequencing everything in the sample in order to determine what’s there.

A challenge in shotgun metagenomics analysis is the sequence classification problem: i.e., given a sequence, what’s it’s origin? I.e., did this sequence read come from E. coli or some other enteric bacteria? Note that sequence classification does not involve genome assembly – sequence classification is done on unassembled reads. If you could perfectly classify the origin of every sequence read in your sample, you would know exactly what organisms are in your environmental sample and how abundant each one is.

The solution to this problem isn’t simply BLAST’ing every sequence read that comes off your HiSeq 2500 against NCBI nt/nr. The computational cost of this BLAST search would be many times more expensive than the sequencing itself. There are many algorithms for sequence classification. This paper examines a wide range of the available algorithms and software implementations for sequence classification as applied to metagenomic data.

See on gettinggeneticsdone.blogspot.pt

Getting Genetics Done: Comparing Sequence Classification Algorithms for Metagenomics

You may also like...

Leave a Reply Cancel reply

Twitter Timeline

Getting Genetics Done: Comparing Sequence Classification Algorithms for Metagenomics

You may also like...

How hepatitis B virus enters liver cells

Oh, Go Get a Fucking Flu Shot Already!

#Wikipedia needs your $. Make your donation now

Leave a Reply Cancel reply

Twitter Timeline