Tech & Health

February 7, 2019

PheWAS Core helps researchers make sense of electronic health record data

Some biomedical researchers may be unsure about routine electronic health record (EHR) data and how useful it ultimately may prove for drawing meaningful, actionable associations that warrant changes to clinical practice and lead to improved clinical outcomes.

“I think that’s something that we’re still exploring, but with every study that comes out of the core, we’re adding another brick to that argument that these really are important data for improving our care of patients,” said Sara Van Driest, MD, PhD, assistant professor of Pediatrics and Medicine.

She means the Phenotyping and PheWAS Core. Researchers seeking informatics assistance to study Vanderbilt University Medical Center’s vast repositories of electronic health record (EHR) data and patient genotype data can turn to this research core, which launched in November 2016 as a program of the Center for Precision Medicine.

VUMC has electronic health records for some 2.8 million patients and has taken pains to make this mountain of information easy for experts to study, while safeguarding patient privacy. Under a related program called BioVU, VUMC has collected and stored DNA specimens from a subset of its EHR population, some 240,000 patients and counting, with more than 90,000 of these specimens having been genotyped to date.

Lisa Bastarache, MS, Josh Denny, MD, MS, and colleagues are helping researchers study associations among de-identified genotype data and electronic health records data. (photo by John Russell)

De-identified genotype data and EHR data are linked, and to study associations among them, Vanderbilt researchers apply for access through the Institutional Review Board.

In areas such as clinical decision support for gene-drug interactions, research derived from these data repositories is already influencing care at VUMC, said Josh Denny, MD, MS, director of the Center for Precision Medicine and vice president for Personalized Medicine.

“EHR data can be messy but it contains perhaps the single richest source of disease history, drug exposures and their response, and prognosis available for research. Using advanced informatics through our core, we can help researchers go beyond what might be gleaned from demographics, billing codes and other structured data, to draw finer distinctions within the patient population based on unlabeled data, such as information in clinical notes,” Denny said.

Studying biomedicine through an EHR data lens often involves precise automated selection of cases and controls, that is, patient records with and without specific diagnoses or other features of interest, and this is one of the areas where the core has been helping researchers from VUMC and other institutions.
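To make the idea of automated case and control selection concrete, here is a minimal Python sketch. It is not the core's actual algorithm. It assumes a common phenotyping heuristic: a patient counts as a case only if the target diagnosis code appears on at least two distinct dates, a patient with no occurrence counts as a control, and a patient with a single occurrence is set aside as ambiguous. The function name, the threshold and the toy records are all hypothetical.

```python
from datetime import date


def classify(records, target_code, min_case_dates=2):
    """Split patients into cases, controls and excluded (ambiguous) sets.

    records: iterable of (patient_id, diagnosis_code, visit_date) tuples.
    The two-distinct-dates rule is an illustrative assumption.
    """
    dates_by_patient = {}
    for patient_id, code, visit_date in records:
        # Register every patient, even those who never receive the target code.
        dates = dates_by_patient.setdefault(patient_id, set())
        if code == target_code:
            dates.add(visit_date)

    cases, controls, excluded = set(), set(), set()
    for patient_id, dates in dates_by_patient.items():
        if len(dates) >= min_case_dates:
            cases.add(patient_id)
        elif not dates:
            controls.add(patient_id)
        else:
            excluded.add(patient_id)  # one mention only: too uncertain to use
    return cases, controls, excluded


# Made-up records for illustration only.
toy_records = [
    ("P1", "J45", date(2018, 1, 5)),
    ("P1", "J45", date(2018, 3, 9)),
    ("P2", "J45", date(2018, 2, 1)),
    ("P2", "J45", date(2018, 2, 1)),  # same date twice counts once
    ("P3", "E11", date(2018, 4, 2)),
]
cases, controls, excluded = classify(toy_records, "J45")
```

Real phenotyping algorithms typically combine several kinds of evidence, such as medications, lab values and text from clinical notes, rather than relying on billing codes alone.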

The core’s scientific director is Lisa Bastarache, MS, and its operations manager is Janey Wang, MS, MEng.

“Their ability to work with investigators to move from what may at first appear to be a simple clinical question, through the complexities of identifying a research cohort and defining the clinical outcome, is tremendously valuable. They have specific expertise in helping to translate clinical questions into electronic algorithms that make it possible to study large cohorts of patients,” said Van Driest, who has sought the core’s services for her studies of dose-based drug responses in children.

Otolaryngologist and cancer researcher Young Jun Kim, MD, PhD, is interested in questions such as how to predict who will and who won’t respond to immunotherapy.

“One way to approach this is using animal studies and trying new drugs, but the other approach is to study the genetics of responders and non-responders, in the cancer tissue and in the germline. That’s where the core’s services are very valuable,” said Kim, the Barry and Amy Baker Professor of Laryngeal, Head and Neck Research.

Stokes Peebles, MD, studies lung inflammation.

“We wanted to see whether variants in the genes we’re interested in from our mouse studies appear to increase or decrease allergic inflammation and virally induced inflammation in humans,” said Peebles, the Elizabeth and John Murray Professor of Medicine.

In an initial set of PheWAS results, with the core’s help, Peebles found that a drug used for another indication might also serve to treat lung inflammation.

PheWAS is short for phenome-wide association study. Any feature or pattern in an organism might qualify as a phenotype, including states of health and disease, and a phenome is simply the sum of phenotypes to be observed in an individual or species. In its original and simplest form, a PheWAS involves taking a genetic variant of interest and using custom algorithms to scan for associations with International Classification of Diseases codes occurring in the EHR.
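The scan described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the core's software: it treats carrier status as a simple yes/no, tests each diagnosis code with a two-sided Fisher exact test on a 2x2 table, and applies a Bonferroni threshold for the number of codes tested. The function names and the eight-patient dataset are hypothetical, and real analyses usually adjust for covariates such as age and sex.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers having the code.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def phewas(carriers, codes_by_patient):
    """Test every diagnosis code for association with carrier status."""
    all_codes = sorted({c for codes in codes_by_patient.values() for c in codes})
    results = []
    for code in all_codes:
        a = b = c = d = 0
        for patient_id, codes in codes_by_patient.items():
            has_code = code in codes
            if patient_id in carriers:
                a, b = a + has_code, b + (not has_code)
            else:
                c, d = c + has_code, d + (not has_code)
        # Haldane-Anscombe correction (add 0.5) avoids division by zero.
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        results.append((code, odds_ratio, fisher_exact_two_sided(a, b, c, d)))
    threshold = 0.05 / len(all_codes)  # Bonferroni correction
    return sorted(results, key=lambda r: r[2]), threshold


# Made-up data: four carriers (P1-P4) and four non-carriers (P5-P8).
toy_codes = {
    "P1": {"J45"}, "P2": {"J45", "E11"}, "P3": {"J45"}, "P4": {"J45"},
    "P5": {"E11"}, "P6": set(), "P7": {"E11"}, "P8": set(),
}
results, threshold = phewas({"P1", "P2", "P3", "P4"}, toy_codes)
for code, odds_ratio, p_value in results:
    print(code, round(odds_ratio, 2), round(p_value, 4), p_value < threshold)
```

With only eight patients, even a code carried by every carrier and no non-carrier does not clear the corrected threshold, which is one reason studies of this kind depend on very large record sets.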

According to Wang, the core’s turnaround for a set of routine PheWAS results is generally within two weeks. Other services offered by the core include free initial consultation, custom natural language processing, custom PheWAS programming, and LabWAS, which is a scan for associations between a genetic variant of interest and clinical lab results.

Besides the Institutional Review Board, other gatekeepers and overseers of VUMC’s research data repositories include the Vanderbilt Institute for Clinical and Translational Research (VICTR), BioVU and the Medical Center Ethics Committee.

Through its StarBRITE system (employee login required), VICTR offers various tools for data retrieval and analysis. To further aid researchers, the Phenotyping and PheWAS Core also works with other research cores, including VICTR’s Integrated Data Access and Services Core (iDASC).