The MaSuRCA genome assembler

by cupton · October 18, 2013

See on Scoop.it – Virology and Bioinformatics from Virology.ca

Motivation: Second-generation sequencing technologies produce high coverage of the genome by short reads at a low cost, which has prompted development of new assembly methods. In particular, multiple algorithms based on de Bruijn graphs have been shown to be effective for the assembly problem. In this article, we describe a new hybrid approach that has the computational efficiency of de Bruijn graph methods and the flexibility of overlap-based assembly strategies, and which allows variable read lengths while tolerating a significant level of sequencing error. Our method transforms large numbers of paired-end reads into a much smaller number of longer ‘super-reads’. The use of super-reads allows us to assemble combinations of Illumina reads of differing lengths together with longer reads from 454 and Sanger sequencing technologies, making it one of the few assemblers capable of handling such mixtures. We call our system the Maryland Super-Read Celera Assembler (abbreviated MaSuRCA and pronounced ‘mazurka’).
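To make the super-read idea concrete, here is a toy sketch in Python, not the MaSuRCA implementation. It assumes that a read can be extended one base at a time for as long as the k-mers seen in the data allow exactly one continuation, so that many overlapping reads collapse into the same longer sequence. The function names (`kmer_set`, `extend_right`, `extend_left`, `super_reads`) and the example sequence are made up for illustration. The sketch ignores sequencing errors, reverse-complement strands and paired-end information, all of which a real assembler has to handle.

```python
def kmer_set(reads, k):
    """Collect every k-mer that occurs in the reads."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def extend_right(seq, kmers, k):
    """Append bases while exactly one known k-mer continues the last k-1 bases."""
    seen = set()
    while True:
        suffix = seq[-(k - 1):]
        nexts = [b for b in "ACGT" if suffix + b in kmers]
        # Stop at a fork, a dead end, or when we start going around a loop.
        if len(nexts) != 1 or suffix in seen:
            return seq
        seen.add(suffix)
        seq += nexts[0]


def extend_left(seq, kmers, k):
    """Prepend bases while exactly one known k-mer precedes the first k-1 bases."""
    seen = set()
    while True:
        prefix = seq[:k - 1]
        prevs = [b for b in "ACGT" if b + prefix in kmers]
        if len(prevs) != 1 or prefix in seen:
            return seq
        seen.add(prefix)
        seq = prevs[0] + seq


def super_reads(reads, k):
    """Extend every read in both directions and keep the distinct results."""
    kmers = kmer_set(reads, k)
    extended = {extend_left(extend_right(r, kmers, k), kmers, k) for r in reads}
    return sorted(extended)


if __name__ == "__main__":
    # Hypothetical 13-base "genome" sampled as eight overlapping 6-base reads.
    genome = "ATGGCGTACCTGA"
    reads = [genome[i:i + 6] for i in range(8)]
    print(len(reads), "reads ->", super_reads(reads, k=4))
```

In this toy case all eight reads extend to the same sequence, which is the sense in which a large set of short reads shrinks to a much smaller set of longer ones. Where two k-mers compete to continue a sequence, extension simply stops, so ambiguous regions are left for a later assembly stage to resolve.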

See on bioinformatics.oxfordjournals.org
