A researcher at Vanderbilt has identified a set of 15 genes that together exhibit a 24-hour gene expression pattern in human blood, constituting a circadian clock biomarker.
The biomarker suggests new ways to study daily rhythms generated in blood and other tissues, and it could eventually lead to more efficient monitoring of these rhythms in patients.
The body’s internal clock is important for health but is often subject to disruption, writes Jacob Hughey, Ph.D., in a study in Genome Medicine.
“Clinically, this sort of approach could eventually help us diagnose and monitor circadian- and sleep-related disorders, and could also help us personalize circadian-related treatments,” Hughey said.
“And for basic research, this approach may help us understand how each individual’s circadian clock works slightly differently.”
An instructor in Biomedical Informatics, Hughey approaches the monitoring of circadian time in blood as a computational task. In supervised machine learning, a computer algorithm learns to make predictions from labeled example data rather than from explicitly programmed rules. In this study, Hughey uses a supervised machine learning method called ZeitZeiger (German for “time revealer”), which he previously developed with colleagues to identify predictive patterns in large periodic data sets.
For a given biological specimen, transcriptome data captures the abundance of messenger RNA from nearly every gene in the genome. Here Hughey trains ZeitZeiger on blood transcriptome time series data gathered in three different human circadian rhythm studies, data made publicly available on the web for reuse. In all, 60 participants were enrolled in the three studies, giving Hughey data from 498 baseline blood samples with which to train his predictor.
Since our circadian clocks are normally synchronized to the world’s diurnal rhythm, for all 60 participants Hughey focuses on time since sunrise. This is the variable he sets out to predict for each individual using a single blood sample.
With the predictor learned by ZeitZeiger, half of all predictions from single blood samples land within 2.1 hours of the true time since sunrise. That is, the predictor’s median absolute error is 2.1 hours.
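To make the 2.1-hour figure concrete, here is a minimal sketch of how a median error on a 24-hour clock might be computed. It is not taken from the study or from ZeitZeiger; the function names and the sample values are hypothetical, chosen only to show that the error has to wrap around midnight (a prediction of 23.5 hours for a true time of 0.5 hours is off by 1 hour, not 23).

```python
from statistics import median

def circular_error_hours(predicted, actual, period=24.0):
    """Smallest absolute difference between two times on a circular clock."""
    diff = abs(predicted - actual) % period
    return min(diff, period - diff)

def median_absolute_error(predicted_times, actual_times, period=24.0):
    """Median of the circular errors; half of all predictions fall within it."""
    errors = [
        circular_error_hours(p, a, period)
        for p, a in zip(predicted_times, actual_times)
    ]
    return median(errors)

# Hypothetical values (hours since sunrise), for illustration only.
predicted = [1.0, 23.5, 12.0, 6.0, 18.0]
actual = [0.5, 0.5, 14.5, 6.0, 15.0]

print(median_absolute_error(predicted, actual))  # 1.0
```

In this made-up example the five errors are 0.5, 1.0, 2.5, 0.0 and 3.0 hours, so the median is 1.0 hour.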
“The biggest surprise is that most of the genes the machine learning algorithm selected to predict time of day are not canonical clock genes. Instead, they’re genes that seem to be outputs of the clock. In other words, most of the genes the algorithm chose are not the gears of the clock, but the hands,” Hughey said.
Accuracy increased considerably when Hughey used multiple blood samples. Also, he tried combining the 15-gene predictor with personalized predictors created for each individual, and accuracy again increased.
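The article does not say how predictions from multiple samples were combined. One plausible approach, sketched here purely as an assumption, is to shift each prediction back by the known time elapsed since the first sample and then take a circular mean, which averages times correctly across midnight. The function names and values below are hypothetical and are not part of ZeitZeiger.

```python
import math

def circular_mean_hours(times, period=24.0):
    """Average times on a circular clock by averaging their unit vectors."""
    angles = [2 * math.pi * t / period for t in times]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    mean_angle = math.atan2(sin_sum, cos_sum)
    return (mean_angle * period / (2 * math.pi)) % period

def combine_predictions(predictions, offsets, period=24.0):
    """Estimate the time of the first sample from several predictions.

    predictions: predicted time (hours) for each sample
    offsets: known hours elapsed between the first sample and each sample
    """
    aligned = [(p - o) % period for p, o in zip(predictions, offsets)]
    return circular_mean_hours(aligned, period)

# Hypothetical example: three samples drawn 0, 4 and 8 hours apart.
estimate = combine_predictions([23.0, 5.0, 9.5], [0.0, 4.0, 8.0])
print(round(estimate, 2))
```

Here the three aligned predictions are 23.0, 1.0 and 1.5 hours, so the combined estimate lands shortly after hour 0 rather than near the arithmetic mean of 8.5 hours, which would be badly wrong.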
Along with drawing several baseline blood samples from subjects, each of the three circadian studies drew additional blood samples after introducing experimental conditions designed to disrupt or perturb circadian rhythms. Hughey proceeds to use his predictor to study what happened in these conditions.
“The predictor detects how the perturbations shift and/or dampen the clock in the blood,” he said.
ZeitZeiger is available for free as open source software.