Interpreting ordered logit results via odds ratios

Ordered logit is a logit model, and as with other logit models, you can exponentiate the coefficients and interpret them as multiplicative changes in the odds (odds ratios).

The twist is that the ordered logit model implies a set of underlying cumulative logits, and the coefficient of each explanatory variable is constrained to be the same across all of them. We can therefore choose any of these implied logits for the purposes of interpretation.
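
Written out, a sketch of the model in one common parameterization looks like this. The symbols \(\kappa_j\) for the cutpoints, \(\beta\) for the shared coefficients, and \(J\) for the number of outcome categories are notation introduced here, not taken from the text above:

\[
\log\frac{\Pr(Y > j \mid x)}{\Pr(Y \le j \mid x)} = x'\beta - \kappa_j, \qquad j = 1, \dots, J - 1
\]

Because \(\beta\) carries no \(j\) subscript, \(e^{\beta_k}\) is the odds ratio for a one-unit increase in \(x_k\) at every one of the \(J - 1\) splits. Only the cutpoint \(\kappa_j\) differs from one logit to the next.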

Example

We will use our example from the Wisconsin Longitudinal Study, in which the outcome is self-reported health, recorded as (1) Poor, (2) Fair, (3) Good, (4) Very Good, and (5) Excellent. Our explanatory variables are sex, parental SES (\(\texttt{z_ses57}\)), high school class rank (\(\texttt{z_classrank}\)), and high school test score (\(\texttt{hn1}\)).

Because there are five ordered categories, there are four implied logits here:

Excellent vs. Very Good, Good, Fair, and Poor.
Excellent and Very Good vs. Good, Fair, and Poor.
Excellent, Very Good, and Good vs. Fair and Poor.
Excellent, Very Good, Good, and Fair vs. Poor.

library(tidyverse)

library(haven)
library(MASS)


library(modelsummary)

# Read and prepare data
df <- read_dta("../dta/wlshealth.dta") %>%
  filter(!is.na(health04)) %>%
  mutate(health04 = as_factor(health04)) %>%
  mutate(female = as_factor(female))

# Fit ordered logit model
model <- polr(health04 ~ female + z_ses57 + z_classrank + hn1,
              data = df, Hess = TRUE, method = "logistic")

# Get coefficients and exponentiate to get odds ratios
odds_ratios <- exp(coef(model))
odds_ratios

femalefemale      z_ses57  z_classrank          hn1 
   0.8964429    1.1939309    1.3110659    1.0540730

. ologit health04 i.female z_ses57 z_classrank hn1, or nolog

Ordered logistic regression                             Number of obs =  7,221
                                                        LR chi2(4)    = 304.07
                                                        Prob > chi2   = 0.0000
Log likelihood = -9604.6427                             Pseudo R2     = 0.0156

------------------------------------------------------------------------------
    health04 | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      female |
     female  |   .8964689   .0402101    -2.44   0.015     .8210234   .9788473
     z_ses57 |    1.19393   .0270195     7.83   0.000      1.14213    1.248079
 z_classrank |   1.311031   .0382129     9.29   0.000     1.238234    1.388107
         hn1 |   1.054092   .0303959     1.83   0.068     .9961696    1.115383
-------------+----------------------------------------------------------------
       /cut1 |  -3.860039   .0828935                     -4.022507   -3.697571
       /cut2 |  -2.277788   .0463692                      -2.36867   -2.186906
       /cut3 |  -.5395163   .0344048                     -.6069484   -.4720842
       /cut4 |   1.159183   .0367412                      1.087172    1.231195
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation to odds ratios.
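
The R output above gives only point estimates, while the Stata table also reports 95% intervals. One way to get comparable Wald intervals in R is to build the interval on the logit scale and exponentiate its endpoints. The following is a minimal sketch under stated assumptions: it uses the `housing` data that ships with MASS rather than the WLS file, and the names `fit`, `est`, `se`, and `or_table` are illustrative.

```r
library(MASS)

# Assumption: MASS's built-in `housing` data stand in for the WLS file.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE, method = "logistic")

est <- coef(fit)                          # slope coefficients only (no cutpoints)
se  <- sqrt(diag(vcov(fit)))[names(est)]  # matching standard errors
z   <- qnorm(0.975)                       # critical value for a 95% interval

# Exponentiate the estimate and the interval endpoints from the logit scale
or_table <- cbind(
  odds_ratio = exp(est),
  lower      = exp(est - z * se),
  upper      = exp(est + z * se)
)
round(or_table, 3)
```

The same steps should carry over to the `model` object fitted above by using it in place of `fit`.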

We can interpret the odds ratio for female as:

Net of family background and school performance, women have 10% lower odds of reporting being in Excellent health than men.

But also as:

Net of family background and school performance, women have 10% lower odds of reporting being in Very Good or better health than men.
Net of family background and school performance, women have 10% lower odds of reporting being in Good or better health than men.
Net of family background and school performance, women have 10% lower odds of reporting being in Fair or better health than men.

We can also take the reciprocal of our odds ratio (\(1/0.896 = 1.12\)) and interpret it in terms of the reverse contrast (for example, Poor vs. Excellent, Very Good, Good, or Fair).

Net of family background and school performance, women have 12% higher odds than men of reporting being in Poor health.
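
The percentages in these sentences come from a simple conversion: an odds ratio \(OR\) corresponds to a \(100 \times (OR - 1)\) percent change in the odds, and its reciprocal gives the change for the reverse contrast. A minimal R sketch follows, with the odds ratios copied by hand from the R output above. The object names `pct_change` and `pct_change_reverse` are illustrative.

```r
# Odds ratios copied from the polr output above
odds_ratios <- c(femalefemale = 0.8964429,
                 z_ses57      = 1.1939309,
                 z_classrank  = 1.3110659,
                 hn1          = 1.0540730)

# Percent change in the odds of being in a higher health category
pct_change <- 100 * (odds_ratios - 1)

# Percent change in the odds for the reverse contrast (a lower category)
pct_change_reverse <- 100 * (1 / odds_ratios - 1)

round(cbind(pct_change, pct_change_reverse), 1)
```

Negative values are read as "lower odds" and positive values as "higher odds".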

The same would hold for our continuous variables:

Net of sex, family background, and test score, a standard deviation increase in class rank is associated with 31% higher odds of reporting being in Excellent health.

We could equivalently interpret these results as:

Net of sex, family background, and test score, a standard deviation increase in class rank is associated with 31% higher odds of reporting being in Very Good or better health.
Net of sex, family background, and test score, a standard deviation increase in class rank is associated with 31% higher odds of reporting being in Good or better health.
Net of sex, family background, and test score, a standard deviation increase in class rank is associated with 31% higher odds of reporting being in Fair or better health.

We could interpret the reciprocal (\(1/1.31 = 0.763\)) as:

Net of sex, family background, and test score, a standard deviation increase in class rank is associated with 24% lower odds of reporting being in Poor health.

Obviously, in practice, one would not actually want to provide four different sentences for the same odds ratio. You would pick one. The point here is to highlight that it is the same interpretation no matter which one you pick.
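
That the odds ratio really is identical at every split can be checked numerically. The sketch below does not use the WLS data. It assumes the `housing` data set that ships with MASS (a three-category satisfaction outcome, so two implied logits), and the names `fit`, `profiles`, and `odds_above` are hypothetical. It computes predicted probabilities for two profiles that differ in only one predictor, forms the odds of being above each cutpoint, and compares their ratio with the exponentiated coefficient.

```r
library(MASS)

# Assumption: MASS's built-in `housing` data stand in for the WLS file.
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE, method = "logistic")

# Two hypothetical profiles that differ only in Cont (Low vs. High)
profiles <- data.frame(
  Infl = factor("Low", levels = levels(housing$Infl)),
  Type = factor("Tower", levels = levels(housing$Type)),
  Cont = factor(c("Low", "High"), levels = levels(housing$Cont))
)

# Predicted category probabilities: one row per profile, one column per category
p <- predict(fit, newdata = profiles, type = "probs")

# Odds of being above each cutpoint: P(Y > j) / P(Y <= j)
odds_above <- function(p_row) {
  above <- rev(cumsum(rev(p_row)))[-1]  # P(Y > j) for j = 1, ..., J - 1
  above / (1 - above)
}

# Odds ratio (Cont = High vs. Low) at each of the two splits
or_by_split <- odds_above(p[2, ]) / odds_above(p[1, ])
or_by_split

# The single odds ratio implied by the fitted coefficient
exp(coef(fit)["ContHigh"])
```

Both entries of `or_by_split` should agree with the exponentiated coefficient up to floating-point error, which is the constraint the model imposes by construction rather than something the data were allowed to decide.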

You might find it a little unsettling that all the cumulative logits share the same odds ratio, and therefore the same interpretation. Good!

We might indeed wonder whether the assumption that every cumulative logit has the same coefficients (the proportional odds assumption) actually holds, and this is something we will test later.