Gene expression modelling across multiple cell-lines with MapReduce
July 21, 2015 · Declared Dead · BMC Bioinformatics
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
David M. Budden, Edmund J. Crampin
arXiv ID
1507.05720
Category
q-bio.QM
Cross-listed
cs.DC, q-bio.GN, stat.ML
Citations
2
Venue
BMC Bioinformatics
Last Checked
2 months ago
Abstract
With the wealth of high-throughput sequencing data generated by recent large-scale consortia, predictive gene expression modelling has become an important tool for integrative analysis of transcriptomic and epigenetic data. However, sequencing data-sets are characteristically large, and previous modelling frameworks are typically inefficient and unable to leverage multi-core or distributed processing architectures. In this study, we detail an efficient and parallelised MapReduce implementation of gene expression modelling. We leverage the computational efficiency of this framework to provide an integrative analysis of over fifty histone modification data-sets across a variety of cancerous and non-cancerous cell-lines. Our results demonstrate that the genome-wide relationships between histone modifications and mRNA transcription are lineage, tissue and karyotype-invariant, and that models trained on matched epigenetic/transcriptomic data from non-cancerous cell-lines are able to predict cancerous expression with equivalent genome-wide fidelity.
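The abstract does not say which regression model or MapReduce framework the authors used, and no code is linked. As a rough illustration of how a predictive expression model can be fitted in a MapReduce style, the sketch below assumes ordinary least squares as a stand-in: each map call turns one chunk of genes into partial sums of X^T X and X^T y, the reduce step adds those sums, and the normal equations are solved once at the end. The function names (`map_chunk`, `reduce_partials`, `solve`, `fit`) and the toy data are hypothetical and are not taken from the paper.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: turn one chunk of (features, expression) rows into the
    partial sums X^T X and X^T y. An intercept column is prepended."""
    p = len(chunk[0][0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in chunk:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(a, b):
    """Reduce step: element-wise sum of two partial results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(xtx, xty):
    """Solve the normal equations by Gauss-Jordan elimination with
    partial pivoting. Assumes X^T X is non-singular."""
    n = len(xty)
    aug = [row[:] + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(chunks):
    """Map every chunk, reduce the partial sums, then solve once."""
    xtx, xty = reduce(reduce_partials, map(map_chunk, chunks))
    return solve(xtx, xty)


# Hypothetical toy data: two "histone signal" features per gene, with
# expression generated exactly from y = 1 + 2*h1 + 3*h2.
chunks = [
    [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)],
    [((0.0, 1.0), 4.0), ((1.0, 1.0), 6.0), ((2.0, 1.0), 8.0)],
]
coefficients = fit(chunks)  # [intercept, weight for h1, weight for h2]
```

Because the partial sums are additive, the chunks can be processed on separate cores or machines in any order, which is the property a MapReduce formulation relies on. Here the built-in `map` and `functools.reduce` run serially and only stand in for a distributed runtime.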
Similar Papers
In the same crypt: q-bio.QM
R.I.P.
👻
Ghosted
R.I.P.
👻
Ghosted
GuacaMol: Benchmarking Models for De Novo Molecular Design
R.I.P.
👻
Ghosted
DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences
R.I.P.
👻
Ghosted
ProtVec: A Continuous Distributed Representation of Biological Sequences
R.I.P.
👻
Ghosted
A Perspective on Deep Imaging
R.I.P.
404 Not Found
Deep learning in bioinformatics: introduction, application, and perspective in big data era
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Language Models are Few-Shot Learners
R.I.P.
👻
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
👻
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
👻
Ghosted