Network-based Distance Metric with Application to Discover Disease Subtypes in Cancer
March 01, 2017 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jipeng Qiang, Wei Ding, John Quackenbush, Ping Chen
arXiv ID
1703.01900
Category
q-bio.QM
Cross-listed
cs.IR, q-bio.GN, q-bio.MN
Citations
1
Venue
arXiv.org
Last Checked
3 months ago
Abstract
While we once thought of cancer as a single monolithic disease affecting a specific organ site, we now understand that there are many subtypes of cancer defined by unique patterns of gene mutations. These gene mutational data, which can be obtained more reliably than gene expression data, help to determine how the subtypes develop, evolve, and respond to therapies. Unlike the dense, continuous-valued gene expression data that most existing cancer subtype discovery algorithms use, somatic mutational data are extremely sparse and heterogeneous: fewer than 0.5% of the 20,000 human protein-coding genes are mutated (recorded as discrete 1/0 values), and identical mutated genes are rarely shared by cancer patients. Our focus is to search for cancer subtypes in extremely sparse, high-dimensional gene mutational data with discrete 1 and 0 values using unsupervised learning. We propose a new network-based distance metric. We project each cancer patient's mutational profile onto the gene network structure and measure the distance between two patients using the similarity between genes and between the patients' gene vertices in the network. Experimental results on synthetic and real-world data show that our approach outperforms the top competitors in cancer subtype discovery. Furthermore, our approach can identify cancer subtypes in real cancer data that other clustering algorithms cannot detect.
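The abstract's argument that set-overlap distances break down when patients rarely share mutated genes can be illustrated with a small sketch. This is not the authors' metric (the abstract does not give its formula). It is a hypothetical stand-in that scores gene similarity as 1 / (1 + shortest-path hops) in a toy network and averages best-match similarities between two patients. The network edges, the function names, and the similarity formula are all assumptions made for illustration.

```python
from collections import deque

# Toy undirected gene network. The edges are hypothetical and chosen only
# to illustrate the idea; they are not taken from the paper.
NETWORK = {
    "TP53": {"MDM2", "ATM"},
    "MDM2": {"TP53"},
    "ATM": {"TP53", "BRCA1"},
    "BRCA1": {"ATM"},
    "KRAS": {"BRAF"},
    "BRAF": {"KRAS"},
}


def hop_counts(network, source):
    """Breadth-first search: number of hops from source to each reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        for neighbor in network.get(gene, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[gene] + 1
                queue.append(neighbor)
    return dist


def gene_similarity(network, g1, g2):
    """Assumed similarity: 1 / (1 + hops), or 0 if the genes are disconnected."""
    hops = hop_counts(network, g1).get(g2)
    return 0.0 if hops is None else 1.0 / (1.0 + hops)


def jaccard_distance(a, b):
    """Set-overlap baseline on the sets of mutated genes (the 1-entries)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def network_distance(network, a, b):
    """Assumed patient distance: 1 minus the symmetric average best-match similarity."""
    if not a or not b:
        return 0.0 if a == b else 1.0

    def avg_best(src, dst):
        best = (max(gene_similarity(network, g, h) for h in dst) for g in src)
        return sum(best) / len(src)

    return 1.0 - 0.5 * (avg_best(a, b) + avg_best(b, a))


# Three patients, each with one mutated gene and no gene in common.
p1, p2, p3 = {"TP53"}, {"MDM2"}, {"KRAS"}

print(jaccard_distance(p1, p2), jaccard_distance(p1, p3))
print(network_distance(NETWORK, p1, p2), network_distance(NETWORK, p1, p3))
```

In this toy setting the Jaccard baseline treats all three patients as equally far apart, while the network-aware score places p1 closer to p2 (neighboring genes) than to p3 (a disconnected component). A real analysis would need a curated gene interaction network and the metric defined in the paper itself.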
Similar Papers
In the same crypt: q-bio.QM
R.I.P.
👻
Ghosted
R.I.P.
👻
Ghosted
DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences
R.I.P.
👻
Ghosted
ProtVec: A Continuous Distributed Representation of Biological Sequences
R.I.P.
👻
Ghosted
A Perspective on Deep Imaging
R.I.P.
404 Not Found
Deep learning in bioinformatics: introduction, application, and perspective in big data era
R.I.P.
👻
Ghosted
Data-driven Advice for Applying Machine Learning to Bioinformatics Problems
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted