The Cartographer
Popping Bubbles in Pangenome Graphs
October 28, 2024 · Entered Twilight · arXiv.org
Repo contents: .clangd, .gitignore, .gitmodules, .projectile, CMakeLists.txt, LICENSE, README.md, deps, docs, src, test_data, tests
Authors
Njagi Mwaniki, Erik Garrison, Nadia Pisanti
arXiv ID
2410.20932
Category
cs.DS: Data Structures & Algorithms
Cross-listed
q-bio.GN
Citations
3
Venue
arXiv.org
Repository
https://github.com/urbanslug/povu/
⭐ 4
Last Checked
3 months ago
Abstract
In this paper, we introduce flubbles, a new definition of "bubbles" corresponding to variants in a (pan)genome graph $G$. We then give a characterization of flubbles in terms of equivalence classes of cycles in an intermediate data structure that we build from a spanning tree of $G$, which leads to a linear time and space solution for finding all flubbles. Furthermore, we show how a related characterization also allows us to efficiently detect what we define as hairpin inversions: a cycle preceded and followed by the same path in the graph. Since that path is necessarily traversed in both directions, this structure corresponds to an inversion. Finally, inspired by the concept of the Program Structure Tree, introduced fifty years ago to represent the hierarchy of the control structure of a program, we define a tree representing the structure of $G$ in terms of flubbles, the flubble tree, which we also find in linear time. The hierarchy of variants introduced by the flubble tree paves the way for new investigations of (pan)genomic structures and their decomposition for practical analyses. We have implemented our methods in a prototype tool named povu, which we tested on human and yeast data. We show that povu can find flubbles and also output the flubble tree while being as fast as (or faster than) well-established tools that find bubbles, such as vg and BubbleGun. Moreover, we show how, within the same time, povu can find hairpin inversions that, to the best of our knowledge, no other tool is able to find. Our tool is freely available at https://github.com/urbanslug/povu/ under the MIT License.
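The abstract relies on equivalence classes of cycles, the notion behind the Program Structure Tree. As a rough illustration only, the sketch below groups the edges of a toy undirected graph into cycle-equivalence classes by brute force. It is not the paper's linear-time method and does not reproduce the flubble definition or povu's code. It uses the known fact that, in a connected graph with no bridges, two edges lie on exactly the same cycles if and only if removing both disconnects the graph. The toy graph (one bubble between `a` and `d`, closed by a `t`-`s` edge), the function names, and the choice to ignore edge orientation are all assumptions made for this example.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Depth-first search over an undirected edge list."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        cur = stack.pop()
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

def cycle_equivalence_classes(nodes, edges):
    """Brute-force (quadratic) grouping; assumes a connected, bridgeless graph."""
    parent = list(range(len(edges)))  # union-find over edge indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(edges)), 2):
        rest = [e for k, e in enumerate(edges) if k not in (i, j)]
        # Edges i and j are cycle equivalent iff removing both disconnects the graph.
        if not is_connected(nodes, rest):
            parent[find(i)] = find(j)

    classes = {}
    for i, e in enumerate(edges):
        classes.setdefault(find(i), []).append(e)
    return sorted(sorted(c) for c in classes.values())

# Hypothetical toy graph: s-a, then two branches a-b-d and a-c-d, then d-t,
# plus a closing edge t-s so that no edge is a bridge.
nodes = ["s", "a", "b", "c", "d", "t"]
edges = [("s", "a"), ("a", "b"), ("b", "d"), ("a", "c"),
         ("c", "d"), ("d", "t"), ("t", "s")]

for cls in cycle_equivalence_classes(nodes, edges):
    print(cls)
```

In this toy graph the edges entering and leaving the bubble (`s`-`a` and `d`-`t`) end up in the same class as the closing edge, while each branch of the bubble forms its own class. This is the kind of signal a cycle-equivalence approach can use to delimit a bubble-like region.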
Similar Papers
In the same crypt: Data Structures & Algorithms
R.I.P. · 👻 Ghosted · Route Planning in Transportation Networks
R.I.P. · 👻 Ghosted · Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P. · 👻 Ghosted · Hierarchical Clustering: Objective Functions and Algorithms
R.I.P. · 👻 Ghosted · Graph Isomorphism in Quasipolynomial Time