Interpretable genotype-to-phenotype classifiers with performance guarantees.

Publication date: Mar 11, 2019

Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.
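The idea of a rule-based model that stays interpretable can be illustrated with a small sketch. The code below is not the implementation that accompanies this work. It assumes genomes are encoded as binary vectors marking the presence or absence of hypothetical sequence features, and it greedily builds a conjunction of presence/absence rules. The function names, the `penalty` parameter, and the toy data are all illustrative assumptions.

```python
from typing import List, Tuple

# A rule is (feature index, required value): True = presence, False = absence.
Rule = Tuple[int, bool]


def rule_holds(rule: Rule, x: List[int]) -> bool:
    """Return True if example x satisfies the rule."""
    idx, required = rule
    return bool(x[idx]) == required


def fit_conjunction(X: List[List[int]], y: List[int],
                    max_rules: int = 3, penalty: float = 1.0) -> List[Rule]:
    """Greedily build a conjunction of rules (illustrative sketch only).

    Each step picks the rule that rejects the most remaining negative
    examples, minus `penalty` times the positives it wrongly rejects.
    """
    n_features = len(X[0])
    negatives = [i for i, label in enumerate(y) if label == 0]
    positives = [i for i, label in enumerate(y) if label == 1]
    model: List[Rule] = []
    while negatives and len(model) < max_rules:
        best_rule, best_score = None, 0.0
        for idx in range(n_features):
            for required in (True, False):
                rule = (idx, required)
                neg_rejected = sum(not rule_holds(rule, X[i]) for i in negatives)
                pos_rejected = sum(not rule_holds(rule, X[i]) for i in positives)
                score = neg_rejected - penalty * pos_rejected
                if score > best_score:
                    best_rule, best_score = rule, score
        if best_rule is None:  # no rule improves on doing nothing
            break
        model.append(best_rule)
        # Keep only the examples that still pass every rule chosen so far.
        negatives = [i for i in negatives if rule_holds(best_rule, X[i])]
        positives = [i for i in positives if rule_holds(best_rule, X[i])]
    return model


def predict(model: List[Rule], x: List[int]) -> int:
    """Predict 1 only if x satisfies every rule in the conjunction."""
    return int(all(rule_holds(rule, x) for rule in model))


# Toy data (made up): rows are genomes, columns are hypothetical features.
X = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]
y = [1, 1, 0, 0, 0]  # 1 = resistant, 0 = susceptible (illustrative labels)

model = fit_conjunction(X, y)
predictions = [predict(model, x) for x in X]
```

On this toy data the learned model is a single rule, "feature 0 is present", which can be read directly. A short rule list of this kind is the sort of compact model that the abstract describes as highly interpretable.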

Drouin, A., Letarte, G., Raymond, F., Marchand, M., Corbeil, J., and Laviolette, F. Interpretable genotype-to-phenotype classifiers with performance guarantees. 04202. 2019 Sci Rep (9):1.

Concepts: Antibiotics, Antimicrobial Resistance, Genome, Genotype, Great Learning, Memory, Open Source, Phenotype, Precision Medicine, Scalability, Sci, Turnkey

Keywords: Genotype, Learning algorithms, Machine learning, Polymorphism, Phenotype, Cybernetics

Semantics

Type Source Name
pathway BSID Reproduction
gene UNIPROT ENO1
drug DRUGBANK Naproxen
gene UNIPROT RASA1
gene UNIPROT RGS6
gene UNIPROT RNF31
gene UNIPROT TRIM9
drug DRUGBANK Silver
gene UNIPROT ADM
gene UNIPROT SLC17A5
gene UNIPROT INTU
drug DRUGBANK Erythromycin
disease DOID pneumonia
disease MESH pneumonia
drug DRUGBANK Capreomycin
drug DRUGBANK Amikacin
disease MESH extensively drug resistant tuberculosis
gene UNIPROT DUOXA1
gene UNIPROT PLEKHG5
disease MESH infectious diseases
gene UNIPROT CASP8
gene UNIPROT PROC
gene UNIPROT BRD2
disease MESH adverse drug reactions
gene UNIPROT PRPF6
disease MESH classi
gene UNIPROT SPEN
gene UNIPROT CPB2
gene UNIPROT AP3B1
gene UNIPROT HPS1
disease DOID HPs
gene UNIPROT FASTK
drug DRUGBANK Tropicamide
gene UNIPROT MBNL1
gene UNIPROT TNFSF14
drug DRUGBANK Piroxicam
gene UNIPROT BTG3
gene UNIPROT REST
drug DRUGBANK Trifluoro-thiamin phosphate
disease MESH Confusion
drug DRUGBANK Methyl isocyanate
drug DRUGBANK Tobramycin
drug DRUGBANK Ciprofloxacin
drug DRUGBANK Meticillin
drug DRUGBANK Chloramphenicol
drug DRUGBANK Levofloxacin
drug DRUGBANK Moxifloxacin
drug DRUGBANK Azithromycin
drug DRUGBANK Isoniazid
drug DRUGBANK Gentamicin
drug DRUGBANK Clavulanic acid
drug DRUGBANK Amoxicillin
drug DRUGBANK Vancomycin
drug DRUGBANK Imipenem
drug DRUGBANK Spinosad
drug DRUGBANK Aspartame
gene UNIPROT RORC
gene UNIPROT REL
disease DOID cancer
disease MESH cancer
drug DRUGBANK Bleomycin
drug DRUGBANK Pyrazinamide
disease MESH separation
gene UNIPROT SLC26A5
drug DRUGBANK Guanine
disease DOID rrs
disease MESH multi
disease MESH multiple
disease MESH point mutation
gene UNIPROT THOP1
gene UNIPROT CD48
drug DRUGBANK Meropenem
gene UNIPROT TNFRSF11A
drug DRUGBANK Pentaerythritol tetranitrate
gene UNIPROT SSRP1
gene UNIPROT CD36
gene UNIPROT FAT1
drug DRUGBANK Palmitic Acid
gene UNIPROT MERTK
gene UNIPROT GPER1
pathway BSID Tuberculosis
disease DOID tuberculosis
disease MESH tuberculosis
drug DRUGBANK Kanamycin
gene UNIPROT DEPP1
gene UNIPROT GOPC
gene UNIPROT PRUNE1
disease MESH diagnosis
gene UNIPROT ARTN
gene UNIPROT AGRP
gene UNIPROT SET
gene UNIPROT GPR182
gene UNIPROT LITAF
drug DRUGBANK Spectinomycin
gene UNIPROT CARTPT
gene UNIPROT DNMT1
gene UNIPROT CD69
gene UNIPROT CD5L
gene UNIPROT LARGE1
drug DRUGBANK Warfarin
pathway BSID Metabolism
gene UNIPROT GAL
drug DRUGBANK Coenzyme M
gene UNIPROT RXFP2
