Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs
November 09, 2017 · Entered Twilight · Bioinformatics
"Last commit was 7.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .travis.yml, Bcool, README.md, install.sh, src, test
Authors
Antoine Limasset, Jean-Francois Flot, Pierre Peterlongo
arXiv ID
1711.03336
Category
cs.DS: Data Structures & Algorithms
Cross-listed
q-bio.QM
Citations
27
Venue
Bioinformatics
Repository
https://github.com/Malfoy/BCOOL
⭐ 11
Last Checked
4 months ago
Abstract
Motivations: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large data sets or consider reads as mere suites of k-mers, without taking into account their full-length read information.
Results: We propose a new method to correct short reads using de Bruijn graphs, and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance, then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.
Availability and Implementation: The implementation is open source and available at http://github.com/Malfoy/BCOOL under the Affero GPL license.
Contact: Antoine Limasset antoine.limasset@gmail.com & Jean-François Flot jflot@ulb.ac.be & Pierre Peterlongo pierre.peterlongo@inria.fr
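The graph-building and k-mer filtering steps described in the abstract can be illustrated with a toy sketch. This is not Bcool's implementation: the function names (`count_kmers`, `solid_kmers`, `compact_unitigs`), the value of k, and the abundance threshold are all hypothetical, reverse complements are ignored, and neither the unitig-abundance filter nor the read-mapping step is covered. It only shows how dropping low-abundance k-mers removes an error-induced branch so that the remaining k-mers compact into a single unitig.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_abundance):
    """Keep k-mers seen at least min_abundance times (assumed threshold)."""
    return {kmer for kmer, n in counts.items() if n >= min_abundance}


def successors(kmer, kmers):
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in kmers]


def predecessors(kmer, kmers):
    return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in kmers]


def compact_unitigs(kmers):
    """Merge maximal non-branching paths of k-mers into unitig strings."""

    def extend_right(kmer):
        # Extend only if the successor is unique and has no other predecessor.
        succ = successors(kmer, kmers)
        if len(succ) == 1 and len(predecessors(succ[0], kmers)) == 1:
            return succ[0]
        return None

    def extend_left(kmer):
        # Extend only if the predecessor is unique and has no other successor.
        pred = predecessors(kmer, kmers)
        if len(pred) == 1 and len(successors(pred[0], kmers)) == 1:
            return pred[0]
        return None

    unitigs, seen = [], set()
    for kmer in sorted(kmers):
        if kmer in seen:
            continue
        # Walk left to the start of the path (guarding against cycles).
        start, on_path = kmer, {kmer}
        while True:
            prev = extend_left(start)
            if prev is None or prev in on_path:
                break
            start = prev
            on_path.add(start)
        # Walk right from the start, appending one base per k-mer.
        unitig, cur = start, start
        seen.add(cur)
        while True:
            nxt = extend_right(cur)
            if nxt is None or nxt in seen:
                break
            unitig += nxt[-1]
            cur = nxt
            seen.add(cur)
        unitigs.append(unitig)
    return unitigs


# Toy data: three error-free reads and one read with a single substitution.
reads = ["ATGGCGTGCA"] * 3 + ["ATGGCTTGCA"]
counts = count_kmers(reads, k=4)
solid = solid_kmers(counts, min_abundance=2)
unitigs = compact_unitigs(solid)
print(unitigs)
```

On this toy input the unfiltered graph splits into four unitigs around the erroneous base, while the filtered graph compacts to the single unitig `ATGGCGTGCA`. A real corrector would also need canonical k-mers, a memory-efficient graph representation, and the mapping of reads onto the cleaned graph.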
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
👻
Ghosted
Route Planning in Transportation Networks
R.I.P.
👻
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
👻
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
👻
Ghosted
Graph Isomorphism in Quasipolynomial Time