Multiple Genome Analytics Framework: The Case of All SARS-CoV-2 Complete Variants
January 13, 2022 · Declared Dead · Journal of Biotechnology
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors: Konstantinos Xylogiannopoulos
arXiv ID: 2201.05198
Category: q-bio.GN
Cross-listed: cs.DS
Citations: 3
Venue: Journal of Biotechnology
Last Checked: 3 months ago
Abstract
Pattern detection and string matching are fundamental problems in computer science, and the accelerated expansion of bioinformatics and computational biology has made them a core topic for both disciplines. The SARS-CoV-2 pandemic has made such problems more demanding: constant mutations mean that hundreds or thousands of new genome variants are discovered every week, and there is a desperate need for fast and accurate analyses. Computational tools for genomic analyses, such as sequence alignment, are very important, although in most cases they require enormous resources and computational power. The presented Multiple Genome Analytics Framework combines data structures and algorithms, built specifically for text mining and pattern detection, that can help to address several computational biology and bioinformatics problems efficiently and concurrently with minimal resources. A single execution of advanced algorithms, with space and time complexity O(n log n), is enough to acquire knowledge of all repeated patterns that exist in multiple genome sequences, and this information can be used by other meta-algorithms for further meta-analyses. The potential of the proposed framework is demonstrated with the analysis of more than 300,000 SARS-CoV-2 genome sequences and the detection of all repeated patterns of length up to 60 nucleotides in these sequences. These results have been used to answer questions about common patterns among all variants, sequence alignment, detection of palindromes and tandem repeats, genome comparisons across different organisms, detection of polymerase chain reaction primers, etc.
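To make the abstract's central task concrete, here is a minimal Python sketch of what "detecting all repeated patterns up to a given length across multiple sequences" means. It is not the paper's method: no code was found for the paper, so this swaps in naive dictionary counting of substrings, which does not achieve the O(n log n) bound quoted above. The function names `repeated_patterns` and `common_to_all` and the toy sequences are assumptions made purely for illustration.

```python
from collections import defaultdict


def repeated_patterns(sequences, max_len):
    """Return {pattern: [(seq_index, position), ...]} for every substring
    of length 1..max_len that occurs at least twice across all sequences.

    Naive illustration only: it enumerates roughly n * max_len substrings
    for n total characters, so it is far more expensive than the
    O(n log n) approach described in the abstract.
    """
    occurrences = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        for length in range(1, max_len + 1):
            for start in range(len(seq) - length + 1):
                occurrences[seq[start:start + length]].append((seq_index, start))
    # Keep only patterns seen at least twice (within or across sequences).
    return {p: locs for p, locs in occurrences.items() if len(locs) >= 2}


def common_to_all(patterns, n_sequences):
    """Keep only patterns that appear in every sequence at least once."""
    return {
        p: locs
        for p, locs in patterns.items()
        if len({seq_index for seq_index, _ in locs}) == n_sequences
    }


# Toy sequences (hypothetical, not real SARS-CoV-2 data).
genomes = ["ACGTACGA", "TTACGTAA", "GACGTTTC"]
found = repeated_patterns(genomes, max_len=4)
shared = common_to_all(found, len(genomes))
```

In this toy input, `shared` contains "ACGT" because it occurs once in each of the three sequences, which corresponds to the "common patterns among all variants" question the abstract mentions. The positions stored per pattern are the kind of information a follow-up analysis (for example tandem repeat detection) could reuse.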
Similar Papers
In the same crypt: q-bio.GN
R.I.P.
👻
Ghosted
R.I.P.
👻
Ghosted
Accurate Genomic Prediction Of Human Height
R.I.P.
👻
Ghosted
Synergistic Drug Combination Prediction by Integrating Multi-omics Data in Deep Learning Models
Old Age
GateKeeper: A New Hardware Architecture for Accelerating Pre-Alignment in DNA Short Read Mapping
R.I.P.
👻
Ghosted
Tasks, Techniques, and Tools for Genomic Data Visualization
Old Age
Spaced seeds improve k-mer-based metagenomic classification
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted