Deep Phenotyping for Precision Medicine

The practice of medicine is predicated on discovering commonalities or distinguishing characteristics among patients to inform corresponding treatment. Given a patient grouping (hereafter referred to as a phenotype), clinicians can implement a treatment pathway that accounts for the underlying cause of disease in that phenotype. Traditionally, phenotypes have been discovered through intuition, experience in practice, and advancements in basic science, but these approaches are often heuristic and labor-intensive, and can take decades to produce actionable knowledge. Although our understanding of disease has progressed substantially in the past century, there are still important domains in which our phenotypes are too broad. To accelerate phenotype discovery, researchers have used machine learning to find patterns in electronic health records, but have often been thwarted by missing data, sparsity, and data heterogeneity.

Through our partnership with the U.S. Department of Veterans Affairs (VA), we have access to longitudinal clinical data on over 20 million veterans. Understanding what is unique about each patient requires following the narrative of clinical notes and discerning the finest details in medical images. We apply deep learning to this unstructured clinical data to automatically annotate the notes and images with standard ontologies (e.g. ICD-10, LOINC, SNOMED CT), and then integrate the annotations into graph-based data models whose relationships and associations represent the health of patients. Every patient's data is de-identified and then added to the existing knowledge graph, which is built from aggregate data and represents the co-occurrence, or co-frequency, of relationships across millions of patients' longitudinal records. This facilitates building cohorts for biomarker discovery and personalizing treatment decisions.
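To make the co-occurrence idea concrete, the following is a minimal sketch of how annotated, de-identified records might be aggregated into a weighted graph. It is an illustration under stated assumptions, not a description of our production pipeline: the function name `build_cooccurrence_graph`, the patient keys, and the toy records are all hypothetical, and the codes are used only as example labels.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(patient_codes):
    """Count, for each pair of codes, how many patients have both.

    patient_codes maps a de-identified patient key to the list of
    ontology codes annotated on that patient's record.
    Returns a Counter keyed by (code_a, code_b) with code_a < code_b.
    """
    edge_counts = Counter()
    for codes in patient_codes.values():
        # Deduplicate so each patient contributes at most 1 per pair,
        # and sort so every pair has a single canonical ordering.
        for a, b in combinations(sorted(set(codes)), 2):
            edge_counts[(a, b)] += 1
    return edge_counts


# Hypothetical toy records (illustrative only, not real patient data).
records = {
    "p1": ["ICD10:E11", "LOINC:4548-4", "ICD10:I10"],
    "p2": ["ICD10:E11", "LOINC:4548-4"],
    "p3": ["ICD10:I10", "ICD10:E11"],
}

graph = build_cooccurrence_graph(records)
for (a, b), n in sorted(graph.items()):
    print(f"{a} -- {b}: seen together in {n} patient(s)")
```

Each edge weight here is a simple patient count. A real knowledge graph would likely also carry typed relationships and temporal information, which this sketch leaves out.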

We apply deep learning to clinical and molecular data to define the phenotypic signatures of specific diseases. This allows us to approach phenotyping without a predefined hypothesis or bias, and to discover the clinical and biological associations among genomic, proteomic, metabolomic, lipidomic, and phenotypic factors for disease and treatment. By querying the knowledge graph, we can go beyond traditional phenotypes and cohorts, matching the most important characteristics of a patient or disease to a precise phenotype (e.g. a disease subtype) to drive personalized decision-making.
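As a rough illustration of matching a patient to a phenotype, the sketch below ranks candidate phenotype signatures by their overlap with a patient's annotated characteristics. Jaccard similarity is used here as a simple stand-in for a knowledge-graph query and is an assumption for the example, not the method described above. The names `jaccard` and `rank_phenotypes`, the subtype labels, and the signatures are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_phenotypes(patient_features, phenotype_signatures):
    """Return (phenotype, score) pairs, best match first."""
    scores = {
        name: jaccard(patient_features, signature)
        for name, signature in phenotype_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical signatures and patient (illustrative only).
signatures = {
    "subtype_A": {"ICD10:E11", "LOINC:4548-4", "ICD10:E66"},
    "subtype_B": {"ICD10:E11", "ICD10:I10", "ICD10:N18"},
}
patient = {"ICD10:E11", "ICD10:I10"}

ranking = rank_phenotypes(patient, signatures)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Set overlap treats every characteristic as equally important. Weighting features by how strongly they distinguish a phenotype would be a natural refinement.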
