Genome-wide association studies (GWAS) have discovered thousands of “spots” in the genome associated with diseases, including cancer, but understanding how genetic changes contribute to disease remains a challenge.

Artificial intelligence deep-learning models, such as Enformer, can predict how DNA changes might affect gene regulation. Because these models are trained on broad datasets, however, they do not capture tissue-specific contexts.

A research team led by Qing Li, PhD, and Xingyi Guo, PhD, at Vanderbilt Health, and Quan Long, PhD, at the University of Calgary, has now developed an AI transfer learning approach to adapt Enformer for breast and prostate cancer. Transfer learning is an AI technique that uses a pretrained model (in this case Enformer) as the starting point for a new task. The researchers retrained Enformer using tissue-specific transcription factor chromatin immunoprecipitation sequencing (ChIP-seq) datasets (275 for breast and 357 for prostate).
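For readers who want a feel for the idea, the toy Python sketch below shows one common form of transfer learning: a pretrained model is kept frozen and only a small new "head" is trained on task-specific examples. It is an illustration only, not the study's method. The `pretrained_trunk` function, the example sequences and their labels are all hypothetical stand-ins, and the researchers may have updated far more of Enformer than a single output layer.

```python
import math

def pretrained_trunk(seq):
    """Hypothetical stand-in for a frozen pretrained model.

    Maps a DNA string to a fixed feature vector (here, base frequencies).
    A real model such as Enformer would produce learned embeddings instead.
    """
    return [seq.count(base) / len(seq) for base in "ACGT"]

def predict(seq, weights, bias):
    """Probability from the new head applied to the frozen features."""
    features = pretrained_trunk(seq)
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train_head(examples, epochs=200, lr=0.5):
    """Fit only the new logistic-regression head; the trunk is never changed."""
    weights, bias = [0.0] * 4, 0.0
    for _ in range(epochs):
        for seq, label in examples:
            features = pretrained_trunk(seq)
            error = predict(seq, weights, bias) - label  # gradient of log loss
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

# Made-up "tissue-specific" training data: 1 = bound, 0 = not bound.
toy_examples = [
    ("GCGCGGCC", 1), ("CCGGCGCG", 1), ("GGCCGCAT", 1),
    ("ATATTAAT", 0), ("TTAATATA", 0), ("AATTATGC", 0),
]
head_weights, head_bias = train_head(toy_examples)
```

After training, `predict("GCGGCCGC", head_weights, head_bias)` should come out above 0.5 and `predict("TATAATTA", head_weights, head_bias)` below it, because the new head has learned the toy pattern on top of features it did not have to learn from scratch.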

With the new models, they computed regulatory scores for millions of GWAS genetic variants and identified those most likely to affect cancer risk. They further linked genes to cancer risk through transcriptome-wide association study analyses and showed that many of the identified genes are important for cancer cell growth and are potential drug targets.
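The article does not say how the regulatory scores were calculated. One widely used approach with sequence models, assumed here purely for illustration, is to compare the model's predictions for the reference and alternate versions of a DNA window and rank variants by the size of the change. In the Python sketch below, `toy_model`, the window sequence and the variant names are hypothetical and are not taken from the study.

```python
def toy_model(seq):
    """Hypothetical stand-in for a tissue-specific model.

    Returns one value per "track": GC fraction and CG dinucleotide density.
    """
    gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
    cg_density = seq.count("CG") / (len(seq) - 1)
    return [gc_fraction, cg_density]

def apply_variant(ref_seq, pos, ref, alt):
    """Return the window with the alternate allele substituted at pos (0-based)."""
    if ref_seq[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return ref_seq[:pos] + alt + ref_seq[pos + len(ref):]

def regulatory_score(ref_seq, pos, ref, alt, model=toy_model):
    """Mean absolute change in model output between alternate and reference."""
    ref_pred = model(ref_seq)
    alt_pred = model(apply_variant(ref_seq, pos, ref, alt))
    return sum(abs(a - r) for a, r in zip(alt_pred, ref_pred)) / len(ref_pred)

# Made-up window and variants: (name, 0-based position, ref allele, alt allele).
window = "ATATACGTATAT"
variants = [("var2", 1, "T", "A"), ("var1", 5, "C", "T")]

# Rank variants from largest to smallest predicted regulatory change.
ranked = sorted(
    variants,
    key=lambda v: regulatory_score(window, v[1], v[2], v[3]),
    reverse=True,
)
```

Here `var1` removes the only CG site in the window and ranks first, while `var2` swaps one A/T base for another and leaves the toy model's output unchanged.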

The study, reported in PLOS Genetics, showed that the transfer learning models outperformed the base model in identifying clinically relevant, disease-associated genes. The approach offers a generalizable framework for tailoring foundation models to disease-relevant contexts.

“Our findings demonstrate how adapting existing models to more disease-relevant data can significantly improve our ability to uncover genes and variants involved in cancer,” the authors stated.

Guo and Li are in the Department of Medicine Division of Epidemiology at Vanderbilt Health. The research was supported in part by a Canada Foundation for Innovation John R. Evans Leaders Fund grant to Long.