Changes in predicted probabilities for multinomial logit
One way of interpreting multinomial logit results is to talk about how the probabilities of the different outcomes change as the explanatory variables change.
Example: GSS parenting values
In the General Social Survey, respondents are presented with a set of attributes and asked which they think is most important for a child to learn to prepare them for life. Alternatives are: (1) to obey; (2) to help others; (3) to be well-liked; (4) to think for oneself; (5) to work hard. Explanatory variables are sex, education, and birth year (minus 1900).
Here are multinomial logit coefficients with the option “to obey” used as the reference category.
# fit model
library(nnet)
model <- multinom(mostimp ~ male + ed3cat + cohort1900,
                  data = data, weights = wtssnr, trace = FALSE)
# Model summary
library(modelsummary)
modelsummary(
  list("Model estimates" = model),
  stars = TRUE,
  shape = term ~ response,
  fmt = 3, # Round to 3 decimal places
  estimate = "{estimate}{stars}",
  statistic = "({std.error})",
  coef_rename = c(
    "male" = "Male",
    "ed3catSome college" = "Some college",
    "ed3catCollege grad" = "College grad",
    "cohort1900" = "Birth year - 1900"
  ),
  gof_omit = "DF|Deviance|R2|RMSE|AIC|BIC"
)
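Model estimates
------------------------------------------------------------------------------------------
                      Help others   Be popular   Think for self   Work hard
------------------------------------------------------------------------------------------
(Intercept)           -1.426***     -3.308***    0.018            -1.838***
                      (0.066)       (0.207)      (0.050)          (0.066)
Male                  -0.060        0.229        -0.255***        0.089*
                      (0.044)       (0.156)      (0.036)          (0.042)
Some college          0.273***      -0.043       0.812***         0.379***
                      (0.055)       (0.201)      (0.044)          (0.051)
College grad          0.975***      0.149        1.746***         1.090***
                      (0.064)       (0.247)      (0.054)          (0.061)
Birth year - 1900     0.021***      -0.002       0.011***         0.030***
                      (0.001)       (0.004)      (0.001)          (0.001)
------------------------------------------------------------------------------------------
Num.Obs.              26819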
------------------------------------------------------------------------------------------
                      Help others   Be well-liked   Think for self   Work hard
                      b/se          b/se            b/se             b/se
------------------------------------------------------------------------------------------
Male (vs. female)     -0.048        0.151           -0.235***        0.113**
                      (0.044)       (0.157)         (0.036)          (0.042)
Some college          0.262***      -0.108          0.787***         0.348***
                      (0.054)       (0.208)         (0.044)          (0.051)
BA or above           0.939***      0.195           1.723***         1.071***
                      (0.064)       (0.244)         (0.054)          (0.061)
Birth year - 1900     0.020***      -0.003          0.011***         0.029***
                      (0.001)       (0.004)         (0.001)          (0.001)
Constant              -1.367***     -3.228***       0.021            -1.778***
                      (0.066)       (0.207)         (0.050)          (0.066)
------------------------------------------------------------------------------------------
For education, the reference category is having a high school diploma or less. If we look at the coefficients for BA or above, we can see two things:
All the coefficients are positive. Since “to obey” is our reference category, this means having a BA (vs. no more than a high school diploma) is associated with an increased probability of each other outcome category vs. “to obey.” Because every other outcome category increases relative to “to obey,” it must be the case that the probability of answering “to obey” is lower among college graduates than among those with no more than a HS diploma.
The largest positive coefficient is for “Think for [one]self.” This means that, had “to think for oneself” been the base category, all the coefficients would have been negative. So, compared to “to think for oneself,” having a BA is associated with a decreased probability of answering each other category. And, because every other outcome category decreases relative to “to think for oneself,” it must be the case that the probability of answering “to think for oneself” is higher among college graduates than among those with no more than a HS diploma.
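The change of base category described above can be checked with a little arithmetic. The sketch below is only an illustration: it copies the rounded “College grad” coefficients from the modelsummary table (with “to obey” fixed at 0 as the reference) and subtracts the “think for self” coefficient from each. The vector and its names are made up for this example, and only the point estimates can be re-based this way; the standard errors would need the refitted model.

```r
# "College grad" coefficients copied (rounded) from the model table above;
# "obey" is the reference category, so its coefficient is 0 by construction.
b_college <- c(
  obey           = 0,
  help_others    = 0.975,
  be_popular     = 0.149,
  think_for_self = 1.746,
  work_hard      = 1.090
)

# Re-express every coefficient relative to "think for self"
# by subtracting that category's coefficient.
b_rebased <- b_college - b_college["think_for_self"]
print(round(b_rebased, 3))
```

Every re-based coefficient is zero or negative, which is the pattern the second point relies on.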
So, having a BA is associated with a higher probability of saying “to think for oneself” and a lower probability of saying “to obey.” But what about the other three categories? This we cannot tell just from looking at the coefficients. Indeed, the answer may even differ for different values of the explanatory variables.
An approach we’ve used before has been to take the average predicted change over observations.
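To make the idea of averaging predicted changes concrete, here is a minimal by-hand sketch. It uses the rounded coefficients from the model table and a small hypothetical sample of covariates (simulated here, not the GSS data, and unweighted), so its numbers will not match the actual results. The object names (coefs, probs_for, and so on) are invented for this illustration.

```r
set.seed(1)

# Rounded coefficients from the model table; "obey" is the reference
# category, so all of its coefficients are 0.
coefs <- rbind(
  obey           = c(0,      0,      0,      0,     0),
  help_others    = c(-1.426, -0.060, 0.273,  0.975, 0.021),
  be_popular     = c(-3.308, 0.229,  -0.043, 0.149, -0.002),
  think_for_self = c(0.018,  -0.255, 0.812,  1.746, 0.011),
  work_hard      = c(-1.838, 0.089,  0.379,  1.090, 0.030)
)
colnames(coefs) <- c("intercept", "male", "some_college",
                     "college_grad", "cohort1900")

# Hypothetical covariate values (an assumption for illustration only).
n <- 1000
male <- rbinom(n, 1, 0.5)
cohort1900 <- sample(0:80, n, replace = TRUE)

# Predicted probabilities for every observation, with education set
# counterfactually to HS or less (0) or college grad (1).
probs_for <- function(college_grad) {
  X <- cbind(1, male, 0, college_grad, cohort1900)
  eta <- X %*% t(coefs)              # n x 5 matrix of linear predictors
  exp(eta) / rowSums(exp(eta))       # multinomial logit probabilities
}

change <- probs_for(1) - probs_for(0)  # one row of changes per observation
print(round(head(change, 3), 3))       # changes differ across observations
avg_change <- colMeans(change)         # average predicted change
print(round(avg_change, 3))
```

Each row of change is the predicted change for one observation, so the size of the change depends on that observation's sex and birth year; the column means summarize those changes in a single number per outcome.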
We are going to calculate the average predicted change using the avg_slopes() function in the marginaleffects package.
library(marginaleffects)
library(dplyr)

mfx <- avg_slopes(
  model,
  variables = "ed3cat", # Specify the variable of interest
  # newdata = "mean", # Use mean values for other variables
  type = "probs" # Get effects on probabilities
) %>%
  arrange(contrast) %>% # arranged by contrast, default is by variable
  select(group, contrast, estimate, conf.low, conf.high)
mfx
Group            Contrast                     Estimate   2.5 %      97.5 %
Obey             College grad - HS or less    -0.16905   -0.17854   -0.15957
Help others      College grad - HS or less    -0.04009   -0.05022   -0.02996
Be popular       College grad - HS or less    -0.00588   -0.00796   -0.00380
Think for self   College grad - HS or less     0.24537    0.23130    0.25944
Work hard        College grad - HS or less    -0.03035   -0.04150   -0.01919
Obey             Some college - HS or less    -0.09091   -0.10223   -0.07958
Help others      Some college - HS or less    -0.03199   -0.04234   -0.02164
Be popular       Some college - HS or less    -0.00361   -0.00603   -0.00119
Think for self   Some college - HS or less     0.14706    0.13250    0.16163
Work hard        Some college - HS or less    -0.02056   -0.03189   -0.00923
As before, this can be done in Stata using \(\texttt{margins}\) with the \(\texttt{dydx()}\) option.
The output above corresponds to the average change in predicted probabilities for obey, help others, be well-liked, think for oneself, and work hard, respectively.
Looking at the results above, we can see that the largest change is for “think for oneself.” We can interpret this as:
Net of sex and birth year, persons with a BA degree or higher are, on average, 24.5 percentage points more likely than those with a high school education or less to say that the most important attribute for a child to learn is “to think for oneself.”
The five changes in predicted probability for a given contrast must add to 0. This makes sense: any increase in the probability of some categories must be exactly offset by decreases in other categories. In our example, the increase in the probability of think-for-oneself is so large that all four of the other categories decrease.
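The adding-up property can be verified directly from the printed estimates. The short check below simply retypes the estimates from the output above into two vectors (the vector names are made up for this example), so the sums are zero only up to the rounding in the printed values.

```r
# Estimates copied from the avg_slopes() output above.
college_grad <- c(obey = -0.16905, help_others = -0.04009,
                  be_popular = -0.00588, think_for_self = 0.24537,
                  work_hard = -0.03035)
some_college <- c(obey = -0.09091, help_others = -0.03199,
                  be_popular = -0.00361, think_for_self = 0.14706,
                  work_hard = -0.02056)

# Within each contrast, the changes should sum to (approximately) zero.
print(round(sum(college_grad), 4))
print(round(sum(some_college), 4))

# The same changes expressed in percentage points.
print(round(100 * college_grad, 1))
```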