June 12, 2012

Making order out of ordinal data

A new statistical tool developed by Vanderbilt biostatisticians will help medical researchers make sense of a commonly encountered – but hard-to-analyze – type of data.


Stage of disease (mild, moderate, severe) and frequency of drug use (never, yearly, monthly, weekly, daily) are examples of “ordinal” data, a type of data common in medical research. However, statistical methods for analyzing such data are limited.

Associate professors of biostatistics Chun Li and Bryan Shepherd have developed a new approach for analyzing ordinal data. They describe a new “residual” (a measure of how far an observed value is from its expected value) in the June issue of Biometrika. The new approach accounts for the ordered nature of the data without assigning arbitrary scores to categories – and takes into account each category’s position relative to the others.
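
To make the idea of a score-free residual concrete, here is a minimal sketch. It assumes one form that fits the description above: the fitted probability of falling below the observed category minus the fitted probability of falling above it. The function name `ordinal_residual` and the fitted probabilities are hypothetical and chosen only for illustration. This article does not give the formula, so the sketch should not be read as the authors' implementation.

```python
from typing import Sequence


def ordinal_residual(observed: int, probs: Sequence[float]) -> float:
    """Return P(Y < observed) - P(Y > observed) under fitted probabilities.

    probs[k] is the fitted probability of category k, with 0 the lowest
    category. Only the ordering of categories is used, never a numeric score.
    """
    if not 0 <= observed < len(probs):
        raise ValueError("observed category index out of range")
    below = sum(probs[:observed])        # probability mass in lower categories
    above = sum(probs[observed + 1:])    # probability mass in higher categories
    return below - above


# Hypothetical fitted probabilities for (mild, moderate, severe).
fitted = [0.5, 0.3, 0.2]
residuals = [ordinal_residual(k, fitted) for k in range(len(fitted))]

# Average residual when outcomes follow the fitted probabilities.
mean_residual = sum(p * r for p, r in zip(fitted, residuals))
```

Under these assumptions each residual lies between -1 and 1, and the residuals average to zero when outcomes follow the fitted probabilities, which is the behavior expected of a residual. Relabeling the categories with different numeric scores would not change any of the values.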

In collaboration with the Vanderbilt Institute for Global Health, the researchers then demonstrated the residual’s utility by uncovering a non-linear association between age and the odds of having more severe cervical lesions among HIV-infected women from Zambia.

The authors note that improved statistical methods allow researchers to more easily detect meaningful associations with fewer patients, saving money and increasing the ability to make important discoveries.

The research was supported by grants from the National Human Genome Research Institute (HG004517) and the National Institute of Allergy and Infectious Diseases (AI093234) of the National Institutes of Health.