To understand a protein’s structure is to understand its function, says structural and chemical biologist Jens Meiler, PhD, Distinguished Research Professor of Chemistry.
It can take a PhD student up to five sleep-deprived years to determine the structure of a single protein, and of the roughly 20,000 human proteins, only about 17% are considered to have had their structures determined experimentally to very high accuracy.
Meanwhile, AlphaFold 2, a deep learning program owned by Google’s parent company, can compute a protein’s structure in minutes with an accuracy competitive with experiment.
Meiler, who for decades has used machine learning (ML) to predict protein structure, notes that AlphaFold 2 was trained on a wealth of biomolecular data amassed by experimentalists over two decades, and he acknowledges that the resulting performance is indeed impressive.
“But the real hard problems are problems of limited data,” said Meiler, who in addition to his Vanderbilt faculty appointment holds an Alexander von Humboldt Professorship at Leipzig University in Germany. Meiler is collaborating with other Vanderbilt researchers to advance precision medicine, in which treatment is tailored more closely than ever to individual differences among patients, down to the molecular level.
And he’s using machine learning to do it.
The difference between machine learning and AI
Among scientists and engineers interviewed for this story, research interests range from basic science, where experiment and observation are comparatively unfettered, to drug discovery, which straddles basic science and clinical science, to clinical phenotyping and prediction, where health information policy is apt to impinge upon scientific observation. A phenotype, resulting from the interaction of genetics and the environment, is an observable characteristic of an organism. Clinical phenotyping and prediction characterize patients and research subjects in terms of their health — an area of science where stigma and privacy notably come into play.
ML is a component of artificial intelligence (AI), says computer scientist Bradley Malin, PhD, the Accenture Professor and professor of Biomedical Informatics, whose research broadly explores how to make data accessible for biomedical research.
“If you knew the way the world worked, you wouldn’t need to perform machine learning, because all the rules of how the universe works would be defined,” said Malin, whose team also publishes ML demonstration projects in the clinical data domain. “Machine learning says, ‘I don’t know all of the rules, I don’t know all of their relationships. So, let me try to learn a model for how the world works, and I will use data to drive that learning process.’ Artificial intelligence is this bigger, all-encompassing perspective of how to create and represent intelligent environments.”
ML comprises a zoo of learning models, that is, different sorts of computer programs that can identify meaningful patterns in previously unseen data. Linear regression, logistic regression, support vector machines, decision trees and artificial neural networks figure among the major classes. (ML models are variously derived through supervised, semi-supervised or unsupervised learning — see the sidebar.)
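For a concrete, if toy, sense of the supervised/unsupervised distinction, here is a minimal sketch in Python using the open-source scikit-learn library; the data are synthetic and the models deliberately simple, so nothing here reflects any lab’s actual pipeline.

```python
# Toy sketch of supervised vs. unsupervised learning on synthetic data,
# using the open-source scikit-learn library.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))             # 200 samples, 5 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # a simple hidden rule

# Supervised: labeled examples (X, y) guide the fit.
clf = LogisticRegression().fit(X, y)
print("training accuracy:", clf.score(X, y))

# Unsupervised: no labels; the model hunts for structure on its own.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))
```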
Among biomedical researchers, a surge of new interest in ML dates to around 2010, when breakthroughs brought a decades-old family of neural network models, collectively known as deep learning, into large-scale industrial application for speech recognition. And their interest continued to grow with the advent of the transformer, a deep learning model introduced by a team at Google Brain in 2017.
John McLean, PhD, Stevenson Professor of Chemistry and chair of the department, says that early on he patterned his use of ML for analytical chemistry on neural network models developed for internet commerce. McLean’s research lab houses 10 mass spectrometry platforms, and as he says, mass spectrometry is incredibly fast. The spectrometers that he and his team help to conceptualize, design and build are primarily used to characterize biological samples — tissue, blood, urine. Given a sample containing 10,000 to 80,000 distinct molecules, the spectrometers in this lab can sort the flurry and register each molecule’s relative abundance in a few seconds.
Imagine a Microsoft Word document 833 million pages long written in 60 minutes: that’s how quickly raw biomolecular data can build up in McLean’s lab.
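A quick back-of-envelope check of that figure, assuming roughly 3,000 bytes of plain text per page (the page-size assumption is ours, not McLean’s):

```python
# Back-of-envelope arithmetic only; bytes_per_page is an assumption.
pages = 833_000_000
bytes_per_page = 3_000        # roughly 500 words of plain text
seconds = 60 * 60             # one hour

total_tb = pages * bytes_per_page / 1e12
rate_mb_s = pages * bytes_per_page / seconds / 1e6
print(f"~{total_tb:.1f} TB in an hour, ~{rate_mb_s:.0f} MB per second")
```

On those assumptions, the lab would be accumulating on the order of 2.5 terabytes of raw data per hour, roughly 700 megabytes every second.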
“Without machine learning, there’s no hope of interrogating that data,” McLean said. “We’ve used these tools for the better part of a decade now, where we let the data inform us what is important about itself. Being able to reduce complex data — and I hate to put it this way, but — to infographics, where somebody can actually act on that information really quickly, is kind of our whole reason for using things like AI and machine learning.”
He mentions his work with so-called organ-on-a-chip technology, where cells harvested from donor cadavers are used to simulate organs in miniature. With ML standing by, exposing a simulated liver to a drug is like ringing a bell, he said. “On a molecular basis, we would analyze everything that the liver would secrete, like listening to how the bell rang — some molecules going up, some going down, some having other relationships.”
The lab’s neural net programs wind up producing a type of graph called a heat map, showing clusters of molecules behaving similarly to one another. “In about 24 hours of experiments, we could recapitulate the known literature around how the liver would respond to a drug like acetaminophen. And not only that, but we can see all the molecules that behave just like the ones that were known in the literature but had never been discovered before.
“And we could not do that without machine learning.”
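Clustered heat maps of the kind McLean describes can be approximated with standard open-source tools. Below is a minimal sketch on synthetic abundance data using pandas and seaborn; a real mass spectrometry pipeline involves far more preprocessing, and none of these numbers comes from McLean’s lab.

```python
# Minimal sketch of a clustered heat map over synthetic abundance data.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Rows: 50 hypothetical molecules; columns: abundance at 6 time points.
data = pd.DataFrame(
    rng.normal(size=(50, 6)).cumsum(axis=1),
    columns=[f"t{i}h" for i in range(6)],
)
# clustermap standardizes each row (z_score=0) and groups molecules
# whose abundance trajectories look alike.
sns.clustermap(data, z_score=0, cmap="vlag")
plt.show()
```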
Wet-and-dry lab work
Monoclonal antibodies are biologic drugs used to treat diseases such as cancer, autoimmune disorders and infections. In his research lab, computer scientist and computational biologist Ivelin Georgiev, PhD, explores questions such as how antibodies recognize pathogens and cancers, and based on his findings he uses ML to design vaccines and antibody therapeutics.
“Immunology, virology, microbiology, single cell biology — all of these are areas that have been around for a while, and we still don’t understand a lot of fundamental rules and interactions in these different processes,” said Georgiev, associate professor of Pathology, Microbiology and Immunology. “The goal, at least in my mind, with what we do is to be able to really get into the personalized medicine area, where it’s not just going to be sufficient to generate vaccines or antibodies for general use, but where you actually tailor those drugs to each particular person.”
Jens Meiler agrees that ML is poised to speed not only drug discovery in general, but also the precision drug discovery that Georgiev talks about. Meiler has already begun to help clinical teams exploit therapeutic opportunities newly discernible at the molecular level in individual patients. Working with scientists and clinicians at Vanderbilt-Ingram Cancer Center, he’s helping to pioneer ML-assisted precision cancer therapy: Based on DNA from cancer tissue, Meiler uses ML to predict the structure of mutant proteins, which has helped clinical teams determine which drugs to prescribe.
“We’ve begun doing that on a regular basis for drugs that are already approved, and 10 years from now we’re going to be using AI to engineer the best possible molecule for your specific mutation,” Meiler said.
He stressed that AI’s role in precision medicine is to be only one component of a larger undertaking. “It will not work unless you put it in a pipeline with experts who will use the artificial intelligence outputs as one portion of judgment, subject to double checking and rigorous evaluation. And Vanderbilt is probably one of the three best places in the United States to do this kind of research, having some of the very best researchers on individual proteins targeted by drugs — transporters, ion channels, signaling proteins.”
Like Meiler, other basic scientists appear to value ML for its potential interplay with experimentation.
McLean said, “Machine learning…is beautiful for showing you correlations, but it is not good for showing causation, or showing you why did something happen.”
When viewed by their advocates as stand-ins for theories about how the world works, learning models are often termed “learned” models. When derived from highly complex data, as is the case with neural networks and decision trees, the correlations that drive learning models tend to be obscured. Though they may work gangbusters and far outstrip human capacity, as models of the world they tend to be inscrutable.
Speaking recently at Oxford University in England, Demis Hassabis, PhD, the co-founder and CEO of DeepMind, the company behind AlphaFold 2, suggested that learned models will increasingly come to define biology. “A lot of these emergent and complex phenomena are just too complicated to be described with a few equations. I don’t really see how you can, say, come up with Kepler’s laws of motion … of a cell,” Hassabis said.
Meanwhile, computer scientists are said to be making headway in addressing the so-called explainability problem that attends complex ML.
“It’s completely conceivable that in the near future, those equations that for us are formidable and probably intractable will not be so anymore,” Georgiev said.
André Bastos, PhD, assistant professor of Psychology, uses ML to study cognition, down to the level of individual neurons and neuronal networks.
“Current application of machine learning is perhaps going to give us the ability to categorize different types of cognition, but it won’t give us mechanistic insight,” Bastos said in a recent Vanderbilt webinar on AI and biomedical research. “I think promising work is going to combine the so-called wet lab approach and the dry lab approach.”
That seems to be the emerging paradigm in drug discovery: Wedded to experimental validation, ML now assists the hunt for molecular drug targets, provides target-based virtual screening of potentially therapeutic molecules (in their billions), and closely aids drug design, safety prediction, and safety surveillance for drugs already on the market.
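As one small illustration of what screening can look like in code, here is a minimal sketch of ligand-based virtual screening, a common cousin of the target-based kind mentioned above, using the open-source RDKit toolkit; the query, the three-molecule “library” and the similarity cutoff are all illustrative assumptions.

```python
# Minimal sketch of ligand-based virtual screening with RDKit;
# molecules and similarity cutoff are illustrative only.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

query = Chem.MolFromSmiles("CC(=O)Nc1ccc(O)cc1")         # acetaminophen
library = ["c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCO"]  # toy "library"

query_fp = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)
for smiles in library:
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    sim = DataStructs.TanimotoSimilarity(query_fp, fp)
    if sim > 0.3:  # arbitrary cutoff for this sketch
        print(f"hit: {smiles} (Tanimoto {sim:.2f})")
```

In a production screen the library would hold billions of candidates, and the hits would go on to docking, structure-based scoring and, eventually, the bench.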
On a precipice
ML has proven devilishly good at well-defined vision tasks, and for that reason was thought by one prominent computer scientist to have put at least some physicians out of work by now. “I think that if you work as a radiologist you are like Wile E. Coyote in the cartoon. You’re already over the edge of the cliff, but you haven’t yet looked down,” deep learning pioneer Geoffrey Hinton, PhD, told The New Yorker in 2017. He gave radiologists five, perhaps 10 years to reach obsolescence. He also recounted a talk he had delivered at a medical school, where he suggested the school stop training radiologists.
Five years later, while some radiologists may have begun using commercial AI to assist their work, none has apparently lost a job to a computer.
“Computers replacing doctors, or learned models replacing textbooks and theory, I don’t see that happening,” said informaticist and internal medicine specialist Colin Walsh, MD, MA, associate professor of Biomedical Informatics. “For one thing, we simply never are going to have all the necessary data quantified. As much as we’d like to fantasize devices sensorizing everything in our world, that’s not going to be the reality. Probably ever.”
Meanwhile, recent medical literature has seen a welter of projects (including some contributed by Walsh) showing the power of complex ML for clinical phenotyping/prediction. Walsh attributes this research output to new availability of data — which in large degree is thanks to widespread adoption of electronic health records, or EHRs — and to ML tools having become more democratized. “Anyone reading this article can on their laptop or home computer download software to run machine learning on a data set they may have sitting around on their hard drive,” Walsh said.
The EHR, it should be said, is often characterized by informaticists, statisticians and others as a billing document only thinly disguised as a health record, and researchers have learned to appreciate its vagaries and limitations and to hold it suspect as a source of data for drawing general inferences about human health.
Nevertheless, from an ML perspective the EHR first of all offers a lot of easy pickings: so-called structured data, meaning information that lends itself to tabular representation and thus to computation — lab results, billing codes representing diagnoses and procedures, drug prescriptions.
Other data that could be helpful for clinical phenotyping/prediction is buried in EHR notes written by the clinical team and increasingly by patients themselves.
The transformer, the aforementioned deep learning model, burst into biomedical research in 2018, and one area where it shines is natural language processing, or NLP.
“With the use of machine learning methods, we can now pretty accurately capture all sorts of information that figures in EHR notes,” said computer scientist Cosmin Bejan, PhD, assistant professor of Biomedical Informatics. Bejan has used NLP of EHR text for tasks such as identifying firefighters, homelessness, suicidal behavior and nonprescription drug use. Some of the research reports he has co-authored have established that certain drug exposures correlate with reduced COVID-19 severity; that, compared with other firefighters, those exposed to toxins in the World Trade Center attack developed high rates of mutations associated with blood cancer and cardiovascular disease; that homelessness is among the grossly undercoded social determinants of health in the EHR; and that suicide attempt and suicidal ideation are likewise undercoded.
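To give a flavor of how such text mining looks in code, here is a minimal sketch using the open-source Hugging Face transformers library; the clinical note is invented, the candidate labels are our assumptions, and real EHR pipelines require de-identification, domain-tuned models and extensive validation.

```python
# Minimal sketch of transformer-based NLP over clinical-style text.
# The note is invented; real EHR work needs de-identification and
# rigorous validation before anything like this is trusted.
from transformers import pipeline

# Zero-shot classification can probe notes for concepts that billing
# codes rarely capture, such as housing status.
classifier = pipeline("zero-shot-classification")
note = "Patient reports staying in a shelter; no fixed address at present."
labels = ["homelessness", "substance use", "suicidal ideation"]
result = classifier(note, candidate_labels=labels)
print(list(zip(result["labels"], [round(s, 2) for s in result["scores"]])))
```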
Clinical phenotyping and prediction models are typically tested on historical data; the overwhelming majority never make it to clinical testing for decision support or health care outcomes improvement. Bejan’s colleague, computer scientist You Chen, PhD, assistant professor of Biomedical Informatics, has used ML and longitudinal EHR data to predict, among other things, preterm birth and the timing of hospital discharge. “I’m confident that machine learning will help health care in the near future, but we have a lot of challenges that need to be addressed before that can happen,” said Chen, whose research extends to finding ways to transfer EHR-based predictive models between different health care institutions.
Whenever talk turns to the application of ML in health care, there tends to be no lack of attention to (or trepidation around) what has come to be known in some circles as the fairness problem.
“At least in the clinical domain, you have structural inequities and racism that will permeate society for years to come,” said Malin. When ML is trained for phenotyping/prediction on routine health care data, applying those learning models for clinical decision support or outcomes improvement will carry some risk of perpetuating underlying biases known to haunt society and health care delivery. “Problems with data in the health care domain are not so much data problems as problems with how society is currently structured,” Malin said.
Bringing AI into health care will in part mean engaging with the public over issues of AI fairness. Two large new National Institutes of Health research projects are grappling with fairness from various directions, and Malin is helping to lead parts of both: the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) and Bridge to Artificial Intelligence (Bridge2AI).
From ML to AI
Two researchers interviewed for this story, Walsh and Michael Matheny, MD, MS, MPH, bear the hard-earned distinction of having led development of EHR-based ML predictive models that have reached the clinical testing phase for decision support and patient outcomes improvement.
Matheny, an internal medicine specialist and observational data scientist, built a model to predict risk of acute kidney injury (AKI) following cardiac catheterization. He used logistic regression and some 20 patient features in EHR data, collected from all 76 health care centers of the Veterans Health Administration. Results are pending from a large multisite randomized controlled trial at the VA, where periodic ML-powered reports were issued to cardiac catheterization teams, showing their observed-to-expected AKI rates.
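For readers curious how those two ingredients fit together, here is a minimal sketch on synthetic data: a logistic regression risk model plus an observed-to-expected ratio. Nothing here reflects the VA model itself; the features, outcome and site are all simulated.

```python
# Minimal sketch on synthetic data: a logistic regression risk model
# plus an observed-to-expected (O/E) ratio; not the VA model itself.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 20))              # ~20 simulated patient features
p_true = 1 / (1 + np.exp(-(X[:, 0] - 1.5)))  # hidden risk process
y = rng.binomial(1, p_true)                  # 1 = AKI after the procedure

model = LogisticRegression(max_iter=1000).fit(X, y)

# For one team's recent cases (here, a random subset), compare observed
# AKI events with the count the model expected.
site = rng.choice(len(X), size=500, replace=False)
expected = model.predict_proba(X[site])[:, 1].sum()
observed = y[site].sum()
print(f"observed-to-expected AKI ratio: {observed / expected:.2f}")
```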
“If you’re actually going to impact clinical care, you have to get out of data science and get into the actual art and practice of health care and medicine,” said Matheny, professor of Biomedical Informatics. “In recent years, there has been a real awareness and acknowledgment that the hardest part of deploying AI and ML into health care isn’t the AI and ML, it’s their integration into the workflow and the care and the profit of the institution.”
Starting with 1,537 EHR features, Walsh used a decision tree-based learning model to predict suicidal behavior among adults at VUMC. The model can provide immediate decision support to clinical teams at the start of any return patient encounter. In an observational study in the adult emergency room, among patients flagged in the top 10% for suicide risk by universal face-to-face screening, one in 200 went on to attempt suicide within 30 days; adding the outputs of Walsh’s model to the face-to-face risk evaluation tripled that rate to three in 200. Results are now pending from a pragmatic randomized controlled trial using the model to prompt face-to-face suicide risk screening in three outpatient clinics at VUMC.
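As a schematic of the general pattern, and emphatically not Walsh’s actual model, here is a minimal sketch in which a tree-ensemble classifier scores synthetic encounters and the top decile of predicted risk is flagged for screening.

```python
# Minimal sketch: a tree-ensemble risk model whose top 10% of scores
# flags encounters for screening; synthetic data, not Walsh's model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(10_000, 50))   # 50 features stand in for 1,537
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + X[:, 1] - 4))))

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
risk = model.predict_proba(X)[:, 1]

threshold = np.quantile(risk, 0.90)  # top decile of predicted risk
flagged = risk >= threshold
print(f"{flagged.sum()} of {len(X)} encounters flagged for screening")
```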
“If we’re going to throw a bunch of new predictions at clinicians, we better prove to them that those predictions are actually helpful,” Walsh said. “That’s the most important step. And it’s often not done, because it’s easier to just turn it on and say, ‘Go, I’m helping you, and I hope that it’s better.’”
For truly massive observational studies of health and health care, for pragmatic clinical trials, for clinical decision support and systematic outcomes improvement, the EHR, with all its known flaws, would appear to be the only game in town. Computing power and the understanding of health continue to advance — and the capacity of the EHR to equitably support improvement of health care outcomes might likewise advance, particularly with the addition of genotypes, Twitter profiles and whatever other new sorts of patient data might be in the offing.
Meanwhile, AI’s apparent failure to launch in the health care domain might better be interpreted as warranted caution.