Whisper: Read sorting allows robust mapping of DNA sequencing data

MOTIVATION: Mapping reads to a reference genome is often the first step in a sequencing data analysis pipeline. The reduction of sequencing costs implies a need for algorithms able to process increasing amounts of generated data in reasonable time. RESULTS: We present Whisper, an accurate and high-performant mapping tool, based on the idea of sorting reads and then mapping them against suffix arrays for the reference genome and its reverse complement. Employing task and data parallelism as well as storing temporary data on disk result in superior time efficiency at reasonable memory requirements. Whisper excels at large NGS read collections, in particular Illumina reads with typical WGS coverage. The experiments with real data indicate that our solution works in about 15% of the time needed by the well-known BWA-MEM and Bowtie2 tools at a comparable accuracy, validated in a variant calling pipeline. AVAILABILITY: Whisper is available for free from https://github.com/refresh-bio/Whisper or http://sun.aei.polsl.pl/REFRESH/Whisper/.See it on Scoop.it, via Viruses, Immunology & Bioinformatics from Virology.uvic.ca
Whisper: Read sorting allows robust mapping of DNA sequencing data
Source: Viral Bioinformatics

You may also like...