February 6, 2026

New analytical approach identifies novel risk genes for colorectal cancer

The study advances understanding of risk for colon and rectal cancers and points to targets for developing new treatments.

Vanderbilt Health researchers have developed an integrative analytical framework that links genetic risk variants and transcription factor regulation of gene expression to identify novel and potentially druggable risk genes for colorectal cancer.

Their findings, reported in the journal Nature Communications, advance understanding of key transcription factor-gene regulatory networks that underlie colon and rectal cancer development.

Colorectal cancer is the second most common cause of cancer death in the United States, according to the National Cancer Institute.

Large-scale genome-wide association studies (GWAS) have identified more than 250 loci (DNA regions) and numerous candidate susceptibility genes associated with increased risk for colorectal cancer. But the transcription factors that interact with and regulate these risk variants remain poorly defined.

“Our work moves beyond locus discovery to reveal how genetic variants influence colorectal cancer risk through disrupted transcription factor-gene regulatory networks,” said Xingyi Guo, PhD, associate professor of Medicine in the Division of Epidemiology and associate professor of Biomedical Informatics. “By explicitly modeling transcription factor binding effects and integrating multiple layers of gene regulation — including expression, splicing and polyadenylation — this study provides mechanistic insight into colorectal cancer susceptibility and identifies biologically grounded targets for therapeutic development.”

The researchers jointly modeled transcription factor ChIP-seq datasets alongside GWAS data from 100,204 colorectal cancer cases and 154,587 controls of European and East Asian ancestries. They identified 51 susceptibility transcription factors and cofactor interactions, including vitamin D receptor-associated cofactors, that shape colorectal cancer risk.

They then integrated these regulatory signals with multiancestry transcriptome-wide association study (TWAS) data and identified 222 colorectal cancer risk genes, 95 of them novel. A comprehensive annotation of the risk genes and an analysis of drug-protein interaction databases revealed that nine of the genes, six of them newly reported in this study, are targeted by drugs already approved or in trials for colorectal cancer treatment.

The researchers also experimentally validated oncogenic roles for three of the risk genes.

“The direct integration of transcription factor ChIP-seq data with large-scale GWAS using generalized linear mixed models enables systematic identification of susceptibility transcription factors,” Guo said. “Coupling this with multiancestry TWAS creates a coherent, end-to-end framework that connects genetic variation to regulatory disruption, gene function and disease biology — something that has been difficult to achieve with existing methods. The multiancestry design further strengthens the generalizability and translational relevance of the findings.”

The protein-drug mapping “expands the catalogue of druggable genes and candidate therapeutics and supports the development of precision medicine strategies for colorectal cancer prevention and intervention,” Guo added.

Zhishan Chen, PhD, Wenqiang Song, PhD, and Qing Li, PhD, are co-first authors of the Nature Communications study. The research was primarily supported by National Institutes of Health grants R37CA227130, R01CA269589 and R01CA297582.