Changes in predicted probabilities for multinomial logit

One approach to interpreting multinomial logit results is to examine how outcome probabilities change with the explanatory variables.

Example: GSS parenting values

In the General Social Survey, respondents are presented with a set of attributes and asked which they think is most important for a child to learn to prepare them for life. Alternatives are: (1) to obey; (2) to help others; (3) to be well-liked; (4) to think for oneself; (5) to work hard. Explanatory variables are sex, education, and birth year (minus 1900).

Here are multinomial logit coefficients with the option “to obey” used as the reference category.

Dependencies and data recoding:

library(tidyverse)
library(haven)
library(modelsummary)
library(nnet)
library(marginaleffects)

data <- read_dta("../dta/gss_soc383.dta") %>%
  mutate(ed3cat = factor(ed3cat,
                  levels = c(1, 2, 3),
                  labels = c("HS or less", "Some college", "College grad"))) %>%
  mutate(mostimp = factor(mostimp,
                  levels = c(1, 2, 3, 4, 5),
                  labels = c("Obey", "Help others", "Be popular", "Think for self", "Work hard")))

# Fit the model; "Obey" is the first factor level, so multinom() treats it as the reference category
model <- multinom(mostimp ~ male + ed3cat + cohort1900,
                  data = data, weights = wtssnr, trace = FALSE)

Code used to show model results:

# Model summary table (modelsummary is loaded with the other dependencies)
modelsummary(list("Model estimates" = model),
             stars = TRUE,
             shape = term ~ response,         
             fmt = 3,                          # Round to 3 decimal places
             estimate = "{estimate}{stars}",
             statistic = "({std.error})",
             coef_rename = c(
               "male" = "Male",
               "ed3catSome college" = "Some college", 
               "ed3catCollege grad" = "College grad",
               "cohort1900" = "Birth year - 1900"),
             gof_omit = "DF|Deviance|R2|RMSE|AIC|BIC"
             )

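Model estimates
------------------------------------------------------------------------------------------
                     Help others      Be popular       Think for self   Work hard
------------------------------------------------------------------------------------------
(Intercept)          -1.426***        -3.308***        0.018            -1.838***
                     (0.066)          (0.207)          (0.050)          (0.066)
Male                 -0.060           0.229            -0.255***        0.089*
                     (0.044)          (0.156)          (0.036)          (0.042)
Some college         0.273***         -0.043           0.812***         0.379***
                     (0.055)          (0.201)          (0.044)          (0.051)
College grad         0.975***         0.149            1.746***         1.090***
                     (0.064)          (0.247)          (0.054)          (0.061)
Birth year - 1900    0.021***         -0.002           0.011***         0.030***
                     (0.001)          (0.004)          (0.001)          (0.001)
------------------------------------------------------------------------------------------
Num.Obs.             26819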


------------------------------------------------------------------------------------------
                      Help others     Be well-liked    Think for self         Work hard   
                             b/se              b/se              b/se              b/se   
------------------------------------------------------------------------------------------
Male (vs. female)          -0.048             0.151            -0.235***          0.113** 
                          (0.044)           (0.157)           (0.036)           (0.042)   
Some college                0.262***         -0.108             0.787***          0.348***
                          (0.054)           (0.208)           (0.044)           (0.051)   
BA or above                 0.939***          0.195             1.723***          1.071***
                          (0.064)           (0.244)           (0.054)           (0.061)   
Birth year - 1900           0.020***         -0.003             0.011***          0.029***
                          (0.001)           (0.004)           (0.001)           (0.001)   
Constant                   -1.367***         -3.228***          0.021            -1.778***
                          (0.066)           (0.207)           (0.050)           (0.066)   
------------------------------------------------------------------------------------------

For education, the reference category is having a high school diploma or less. If we look at the coefficients for BA or above (labeled “College grad” in the R output), we can see two things:

1. All the coefficients are positive. Since “to obey” is our base category, this means that, holding sex and birth year constant, having a BA (vs. no more than a high school diploma) is associated with higher odds of each other outcome category relative to “to obey.” Because every other outcome category gains relative to “to obey,” the probability of answering “to obey” must be lower among college graduates than among those with no more than a HS diploma.
2. The largest positive coefficient is for “to think for oneself.” Changing the base category subtracts the new base category's coefficient from every other coefficient, so if “to think for oneself” had been the base category, all the coefficients would have been negative. Compared to “to think for oneself,” then, having a BA is associated with lower odds of each other category. And because every other outcome category loses ground relative to “to think for oneself,” the probability of answering “to think for oneself” must be higher among college graduates than among those with no more than a HS diploma.
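
The change of base category can be checked by hand. The following is a minimal sketch, assuming the rounded “College grad” coefficients from the R table above; the object names are ours, for illustration only.

```r
# "College grad" coefficients copied from the R table above (base category: Obey).
# The base category's own coefficient is 0 by construction.
b_obey_base <- c("Obey" = 0, "Help others" = 0.975, "Be popular" = 0.149,
                 "Think for self" = 1.746, "Work hard" = 1.090)

# Re-express the coefficients with "Think for self" as the base category
# by subtracting its coefficient from every coefficient.
b_think_base <- b_obey_base - b_obey_base[["Think for self"]]
print(round(b_think_base, 3))
```

Every coefficient other than the new base category's (which becomes 0) is negative: -1.746 for “Obey,” -0.771 for “Help others,” -1.597 for “Be popular,” and -0.656 for “Work hard.”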

So, having a BA is associated with a higher probability of saying “to think for oneself” and a lower probability of saying “to obey.” But what about the other three categories? We cannot tell this just from looking at the coefficients. Indeed, the answer may even differ for different values of explanatory variables.
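
To see how the answer can depend on the other explanatory variables, we can turn the coefficients into predicted probabilities for specific profiles. This is a rough sketch, assuming the rounded coefficients from the R table above (so results are approximate); `predict_probs()` and `change_for()` are hypothetical helpers written for this illustration, and the two profiles are arbitrary choices.

```r
# Coefficients from the R table above (rounded to 3 decimals).
# Rows are outcomes, columns are terms; "Obey" is the base, so its row is all zeros.
coefs <- rbind(
  "Obey"           = c( 0,      0,      0,     0,     0),
  "Help others"    = c(-1.426, -0.060,  0.273, 0.975,  0.021),
  "Be popular"     = c(-3.308,  0.229, -0.043, 0.149, -0.002),
  "Think for self" = c( 0.018, -0.255,  0.812, 1.746,  0.011),
  "Work hard"      = c(-1.838,  0.089,  0.379, 1.090,  0.030)
)
colnames(coefs) <- c("intercept", "male", "some_college", "college_grad", "cohort1900")

# Hypothetical helper: predicted probabilities for one covariate profile.
predict_probs <- function(male, some_college, college_grad, cohort1900) {
  x <- c(1, male, some_college, college_grad, cohort1900)
  eta <- drop(coefs %*% x)     # linear predictor for each outcome
  exp(eta) / sum(exp(eta))     # multinomial logit probabilities
}

# Hypothetical helper: change in probabilities from "HS or less" to "College grad".
change_for <- function(male, cohort1900) {
  predict_probs(male, 0, 1, cohort1900) - predict_probs(male, 0, 0, cohort1900)
}

change_woman_1950 <- change_for(male = 0, cohort1900 = 50)
change_man_1990   <- change_for(male = 1, cohort1900 = 90)
print(round(rbind(change_woman_1950, change_man_1990), 3))
```

The two rows need not match: the size of the change for a given outcome depends on where the profile sits on the probability scale, which is why a single coefficient cannot summarize it.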

An approach we have used before is to take the average predicted change over observations.
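
The logic of averaging can be sketched by hand: set everyone's education to one level, predict, set it to the other level, predict again, and average the differences. The sketch below assumes the rounded coefficients from the R table above and uses four made-up respondents (not GSS data, and unweighted); `probs_at()` is a hypothetical helper.

```r
# Coefficients from the R table above (rounded); "Obey" is the base category.
coefs <- rbind(
  "Obey"           = c( 0,      0,      0,     0,     0),
  "Help others"    = c(-1.426, -0.060,  0.273, 0.975,  0.021),
  "Be popular"     = c(-3.308,  0.229, -0.043, 0.149, -0.002),
  "Think for self" = c( 0.018, -0.255,  0.812, 1.746,  0.011),
  "Work hard"      = c(-1.838,  0.089,  0.379, 1.090,  0.030)
)

# Made-up respondents for illustration only.
toy <- data.frame(male = c(0, 1, 0, 1), cohort1900 = c(20, 45, 60, 85))

# Hypothetical helper: predicted probabilities for every row of d,
# with the education dummies set to the same values for everyone.
probs_at <- function(d, some_college, college_grad) {
  X <- cbind(1, d$male, some_college, college_grad, d$cohort1900)
  eta <- X %*% t(coefs)            # rows: respondents, columns: outcomes
  exp(eta) / rowSums(exp(eta))     # each row sums to 1
}

# Everyone as "College grad" minus everyone as "HS or less", averaged over rows.
diffs <- probs_at(toy, 0, 1) - probs_at(toy, 0, 0)
avg_change <- colMeans(diffs)
print(round(avg_change, 3))
```

Each respondent keeps their own sex and birth year in both scenarios, so the average reflects the distribution of the other variables in the data at hand.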

We are going to calculate the average predicted change using the avg_slopes() function in the marginaleffects package.

mfx <- avg_slopes(
  model,
  variables = "ed3cat",  # Specify the variable of interest
  # newdata = "mean",              # Use mean values for other variables
  type = "probs"                 # Get effects on probabilities
) %>%
  arrange(contrast) %>% # arranged by contrast, default is by variable 
  select(group, contrast, estimate, conf.low, conf.high)
mfx


          Group                  Contrast Estimate    2.5 %   97.5 %
 Obey           College grad - HS or less -0.16905 -0.17854 -0.15957
 Help others    College grad - HS or less -0.04009 -0.05022 -0.02996
 Be popular     College grad - HS or less -0.00588 -0.00796 -0.00380
 Think for self College grad - HS or less  0.24537  0.23130  0.25944
 Work hard      College grad - HS or less -0.03035 -0.04150 -0.01919
 Obey           Some college - HS or less -0.09091 -0.10223 -0.07958
 Help others    Some college - HS or less -0.03199 -0.04234 -0.02164
 Be popular     Some college - HS or less -0.00361 -0.00603 -0.00119
 Think for self Some college - HS or less  0.14706  0.13250  0.16163
 Work hard      Some college - HS or less -0.02056 -0.03189 -0.00923

As before, this can be done in Stata using \(\texttt{margins}\) with the \(\texttt{dydx()}\) option.



. margins, dydx(3.ed3cat) 

Average marginal effects                                Number of obs = 26,819
Model VCE: OIM

dy/dx wrt: 3.ed3cat

1._predict: Pr(mostimp==obey), predict(pr outcome(1))
2._predict: Pr(mostimp==help_others), predict(pr outcome(2))
3._predict: Pr(mostimp==popular), predict(pr outcome(3))
4._predict: Pr(mostimp==think_for_self), predict(pr outcome(4))
5._predict: Pr(mostimp==work_hard), predict(pr outcome(5))

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
1.ed3cat     |  (base outcome)
-------------+----------------------------------------------------------------
3.ed3cat     |
    _predict |
          1  |  -.1660654   .0048239   -34.43   0.000      -.17552   -.1566108
          2  |  -.0424209   .0051675    -8.21   0.000     -.052549   -.0322927
          3  |  -.0055009   .0010547    -5.22   0.000    -.0075681   -.0034337
          4  |   .2440681   .0071737    34.02   0.000     .2300079    .2581283
          5  |   -.030081   .0057037    -5.27   0.000      -.04126   -.0189019
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

The rows correspond to the average change in predicted probability for obey, help others, be well-liked, think for oneself, and work hard, respectively.

The largest change is for “think for oneself”, which we can interpret as:

Net of sex and birth year, persons with a BA degree or higher are, on average, 24.4 percentage points more likely than those with only a high school education or less to say that the most important attribute for a child to learn is “to think for oneself.”

The five changes in predicted probability together must add to 0. This makes sense: any increases in the probability of some categories must be exactly offset by decreases in other categories. In our example, the increase in the probability of think-for-oneself is so large that all four of the other categories decrease.
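
As a quick check, we can add up the reported changes for College grad vs. HS or less. The values below are copied from the R and Stata output above; the object names are ours.

```r
# Average changes in predicted probability, College grad vs. HS or less,
# in the order: obey, help others, be well-liked, think for oneself, work hard.
r_estimates     <- c(-0.16905, -0.04009, -0.00588, 0.24537, -0.03035)      # avg_slopes()
stata_estimates <- c(-.1660654, -.0424209, -.0055009, .2440681, -.030081)  # margins

# Both sums should be zero up to rounding in the printed output.
print(sum(r_estimates))
print(sum(stata_estimates))
```

The sums are zero because the five predicted probabilities add to 1 under each education level, so their differences must add to 0.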