Jonathan Irish, Ph.D., and his colleagues have developed a new language, one that can be used to describe and identify cells.
The language — marker enrichment modeling, or MEM — assigns a “MEM label” to cells based on certain features of the cell, usually proteins. Irish and his colleagues hope that MEM, which they recently reported in the journal Nature Methods, will be widely adopted and used to generate a “Who’s Who” database of cell types.
For researchers who study the biology of single cells, advances in technology and instrumentation have made it possible to measure more features of a cell, at a rate of hundreds of cells per second, said Irish, assistant professor of Cancer Biology.
“The advancing technical capabilities are revealing new types of cells that we had no idea existed — cells that combine features in unexpected ways,” Irish said.
However, assigning cell identities from those features still depends on human experts, who draw on prior knowledge of which features distinguish each cell type.
“If you give a computer a bunch of cells and ask, which ones are CD4 T cells, the computer would really struggle,” Irish said. “What’s missing is a database of annotated cell identities and a language to quantitatively say what’s special about each type of cell.
“We’re trying to create that language for humans and computers to say: we’re talking about the same cells.”
MEM labels consist of a series of marker proteins, each with a score ranging from negative 10 to positive 10 that reflects how enriched that protein is in the cell population. The label contains quantitative data (the scores for each component) in addition to serving as a short description of a population of cells, Irish said.
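As a rough illustration of that structure, the Python sketch below models a label as a list of marker names paired with integer scores limited to the range of negative 10 to positive 10. It is not the MEM software and does not compute enrichment from cytometry data. The class name, the function name, the text format of the label and the example scores are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Score range described in the article: negative 10 to positive 10.
SCORE_MIN, SCORE_MAX = -10, 10


@dataclass(frozen=True)
class MarkerScore:
    """One marker and its enrichment score (hypothetical container)."""
    marker: str
    score: int

    def __post_init__(self):
        # Reject scores outside the range the article describes.
        if not SCORE_MIN <= self.score <= SCORE_MAX:
            raise ValueError(f"{self.marker}: score {self.score} is out of range")


def format_label(scores, min_abs=1):
    """Build a short text label, most enriched marker first.

    Markers whose absolute score is below min_abs are left out.
    The output format is an assumption, not the official MEM notation.
    """
    kept = [s for s in scores if abs(s.score) >= min_abs]
    kept.sort(key=lambda s: (-s.score, s.marker))
    return " ".join(f"{s.marker}{s.score:+d}" for s in kept)


# Illustrative, made-up scores for a single cell population.
population = [
    MarkerScore("CD3", 6),
    MarkerScore("CD4", 8),
    MarkerScore("CD8", -7),
    MarkerScore("CD19", 0),
]
print(format_label(population))  # CD4+8 CD3+6 CD8-7
```

Keeping the scores as data and deriving the text from them mirrors the point Irish makes: the same label can be read by a person and processed by a computer.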
Kirsten Diggins, Ph.D., led the development of MEM as part of her graduate dissertation research. In the Nature Methods report, the investigators used MEM to analyze immune cell data from seven published studies using two different cytometry instrumentation platforms. The findings demonstrate that MEM is flexible and works for data measured in different places, at different times and with different technologies.
The researchers also analyzed samples of human glioblastoma (brain) tumors, which include both tumor cells and immune cells. Using only nine features — out of more than 30 measured features — the MEM methodology was able to correctly classify cancer cells and immune cells.
“It’s good at identifying what type of cell you’re looking at with a limited amount of information,” Irish said.
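One way to picture identification from a limited set of features is to compare an unknown population's scores with reference labels, using only a chosen subset of markers. The Python sketch below does this with a root-mean-square difference. The comparison measure, the function names, the reference entries and every score are assumptions made for illustration, not the method or data from the Nature Methods report.

```python
import math


def label_distance(a, b, markers):
    """Root-mean-square difference between two score dicts.

    Only markers that appear in the subset and in both labels are compared.
    """
    shared = [m for m in markers if m in a and m in b]
    if not shared:
        raise ValueError("no shared markers to compare")
    return math.sqrt(sum((a[m] - b[m]) ** 2 for m in shared) / len(shared))


def closest_reference(query, references, markers):
    """Return the name of the reference label nearest to the query."""
    return min(
        references,
        key=lambda name: label_distance(query, references[name], markers),
    )


# Hypothetical reference labels with made-up scores.
references = {
    "CD4 T cell": {"CD3": 6, "CD4": 8, "CD8": -7, "CD19": -5},
    "CD8 T cell": {"CD3": 6, "CD4": -6, "CD8": 8, "CD19": -5},
    "B cell": {"CD3": -6, "CD4": -3, "CD8": -4, "CD19": 9},
}

# An unknown population measured on more markers than are used for matching.
query = {"CD3": 5, "CD4": 7, "CD8": -6, "CD19": -4, "CD45": 2}

subset = ["CD3", "CD4", "CD8"]
print(closest_reference(query, references, subset))  # CD4 T cell
```

In this toy case three markers are enough to separate the reference populations, and the extra measurements are simply ignored.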
Irish hopes that MEM labels will begin to be deposited into a large database of cell types.
“Right now there’s a lot of published cytometry data that just disappears into the literature because there’s no common language or database for it,” Irish said.
Ultimately, the effort is about understanding cell identity as a way to understand the function of healthy and diseased cells, he said. Using unbiased cytometry technologies and MEM algorithmic labeling will reveal unexpected new cell types that may have important functions.
“For cancer and other disease research in particular, if someone finds an unusual cell population, the database we envision will make it possible to ask if anyone else has seen this cell type before” and to move quickly into exploring what role the cell type might have, Irish said.
Other authors of the Nature Methods report include Allison Greenplate, Nalin Leelatian, M.D., and Cara Wogsland. The research was supported by grants from the National Institutes of Health (CA136440, CA199993, CA143231, CA068485), the VICC Ambassadors, a VICC Hematology Helping Hands award and the Vanderbilt International Scholars Program.
MEM software and license are available from VUeInnovations (http://mem.vueinnovations.com) and are free for academic users.