by Tom Wilemon
The National Cancer Institute (NCI) has funded the development of interactive visual tools that will allow researchers to extract cancer phenotypes from electronic medical records.
Currently, the evaluation of cancer phenotypes (the physical characteristics and clinical presentations of tumors and cells in individual cases) is mostly performed manually. This makes it difficult for researchers to correlate phenotype data with genotype data, which describe the genetic structure of tumors and cancer cells and can be more readily chronicled and accessed. The phenotype initiative uses specially developed software that enables comprehensive longitudinal data processing from various sources.
Jeremy Warner, MD, MS, associate professor of Medicine and Biomedical Informatics, is one of three principal investigators with the initiative, which the NCI has funded with a $4.35 million grant.
The other principal investigators are Harry Hochheiser, PhD, associate professor of Biomedical Informatics and the Intelligent Systems Program at the University of Pittsburgh School of Medicine, and Guergana Savova, PhD, associate professor of Pediatrics at Harvard Medical School and a faculty member of the Harvard Computational Health Informatics Program.
“Natural language processing is an artificial intelligence technique that has been under development for more than 50 years, but it isn’t until very recently that computing power and new ‘deep learning’ approaches have enabled performance that begins to rival human experts,” Warner said. “And unlike a human, a computing system can crank away at medical records all day and all night, processing potentially millions of notes. Our goals are to both continue to develop the underpinnings of the DeepPhe system as well as to encourage its broad implementation.”
The five-year grant was awarded based upon the work of the “Cancer Deep Phenotype Extraction” (DeepPhe) project. DeepPhe combines details from multiple documents to form longitudinal summaries. The tool is being designed for both clinicians and researchers.
The goal is to develop a multiscale visualization tool for interpreting the complex relationships among cancers, tumors, treatments, responses, biomarkers and other attributes.