Training-free Measures Based on Algorithmic Probability Identify High Nucleosome Occupancy in DNA Sequences

August 05, 2017 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Hector Zenil, Peter Minary
arXiv ID: 1708.01751
Category: q-bio.QM
Cross-listed: cs.IT, q-bio.GN
Citations: 0
Venue: arXiv.org
Last Checked: 3 months ago

Abstract

We introduce and study a set of training-free measures, grounded in information theory and algorithmic complexity, and apply them to DNA sequences to assess how well they identify nucleosomal binding sites. We test the measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reproduce the known discrepancies between in vivo and in vitro predictions and show potential to pinpoint (high) nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that complexity indices are informative of nucleosome occupancy. We compare against the gold standard (the Kaplan model) and find similar and complementary results, the main difference being that our sequence-complexity approach requires no training. For example, at high occupancy, complexity-based scores outperform the Kaplan model in predicting binding, a significant advance in predicting the highest nucleosome occupancy with a training-free approach.
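The abstract does not spell out how its measures are computed, so the following is only a rough sketch of what a training-free, sequence-only score could look like. It assumes two generic stand-ins: Shannon entropy over k-mers as the information-theoretic index, and a zlib compression ratio as a crude proxy for algorithmic complexity. The function names, the 147 bp window (roughly one nucleosome length), and the choice of zlib are illustrative assumptions, not the authors' method.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(seq: str) -> float:
    """Compressed size divided by raw size.

    Assumption: zlib is used here only as a rough, computable proxy for
    algorithmic complexity; lower ratios indicate more regular sequences.
    """
    raw = seq.encode("ascii")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


def sliding_scores(seq: str, window: int = 147, step: int = 1, k: int = 2):
    """Score every window of the sequence; no training data is involved.

    Assumption: window=147 approximates one nucleosome length in base pairs.
    Returns a list of (start, entropy, compression_ratio) tuples.
    """
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        scores.append((start, shannon_entropy(chunk, k), compression_ratio(chunk)))
    return scores
```

Each window receives a score computed from the sequence alone, which is the sense in which such measures need no training. Comparing the scores against measured occupancy would be a separate step that this sketch does not cover.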

📜 Similar Papers

In the same crypt — q-bio.QM

R.I.P. 👻 Ghosted

Deep Learning for Identifying Metastatic Breast Cancer

Dayong Wang, Aditya Khosla, ... (+3 more)

q-bio.QM 🏛 arXiv 📚 981 cites 9 years ago

R.I.P. 👻 Ghosted

DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

Ingoo Lee, Jongsoo Keum, Hojung Nam

q-bio.QM 🏛 PLoS Comput. Biol. 📚 522 cites 7 years ago

R.I.P. 👻 Ghosted

ProtVec: A Continuous Distributed Representation of Biological Sequences

Ehsaneddin Asgari, Mohammad R. K. Mofrad

q-bio.QM 🏛 PLoS ONE 📚 440 cites 11 years ago

R.I.P. 👻 Ghosted

A Perspective on Deep Imaging

Ge Wang

q-bio.QM 🏛 IEEE Access 📚 409 cites 9 years ago

R.I.P. 💀 404 Not Found

Deep learning in bioinformatics: introduction, application, and perspective in big data era

Yu Li, Chao Huang, ... (+4 more)

q-bio.QM 🏛 bioRxiv 📚 325 cites 7 years ago

R.I.P. 👻 Ghosted

Data-driven Advice for Applying Machine Learning to Bioinformatics Problems

Randal S. Olson, William La Cava, ... (+3 more)

q-bio.QM 🏛 Pacific Symposium on Biocomputing 📚 279 cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago