Publication date: Nov 25, 2019
Lung cancer is one of the most malignant tumors, causing over 1,000,000 deaths each year worldwide. Deep learning has brought success in many domains in recent years. DNA methylation, an epigenetic factor, is used for model training in many studies. There is an opportunity for deep learning methods to analyze the lung cancer epigenetic data to determine their subtypes for appropriate treatment.
Here, we employ variational autoencoders (VAEs), an unsupervised deep learning framework, on 450K DNA methylation data of TCGA-LUAD and TCGA-LUSC to learn latent representations of the DNA methylation landscape. We extract a biologically relevant latent space of LUAD and LUSC samples. It is showed that the bivariate classifiers on the further compressed latent features could classify the subtypes accurately. Through clustering of methylation-based latent space features, we demonstrate that the VAEs can capture differential methylation patterns about subtypes of lung cancer.
VAEs can distinguish the original subtypes from manually mixed methylation data frame with the encoded features of latent space. Further applications about VAEs should focus on fine-grained subtypes identification for precision medicine.
Open Access PDF
Wang, Z. and Wang, Y. Extracting a biologically latent space of lung cancer epigenetics with variational autoencoders. 05839. 2019 BMC Bioinformatics (20):Suppl 18.
|Clustering||Branches of biology|
- Multiplatform biomarker identification using a data-driven approach enables single-sample classification.
- Applications of Bioinformatics in Cancer.