Practice: Nonlinear vs. linear probability models

If you complete this exercise using the Quarto file used to generate this page, you should be able to change the format: header above to say format: pdf, and it will then render as a PDF file that includes just the questions and your answers (and not, for example, this text).

Using data of your choosing, you will fit four models with a binary outcome and multiple explanatory variables. For both the linear probability model and the logit model, you will fit both a model that does not include any interaction terms among your explanatory variables (hereafter referred to as a “main effects” model) and a model that does include an interaction term for two of your explanatory variables (the “interaction” model).
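As a rough sketch of the four specifications, with two explanatory variables and notation that is assumed here rather than taken from the exercise ($y$ for the binary outcome, $x_1$ and $x_2$ for the two variables in the interaction, $\beta$ for linear probability coefficients, $\gamma$ for logit coefficients), the models might be written as:

$$
\begin{aligned}
\text{LPM, main effects:} \quad & \Pr(y = 1 \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \\
\text{LPM, interaction:} \quad & \Pr(y = 1 \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 \\
\text{Logit, main effects:} \quad & \log \frac{\Pr(y = 1 \mid x_1, x_2)}{1 - \Pr(y = 1 \mid x_1, x_2)} = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 \\
\text{Logit, interaction:} \quad & \log \frac{\Pr(y = 1 \mid x_1, x_2)}{1 - \Pr(y = 1 \mid x_1, x_2)} = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_1 x_2
\end{aligned}
$$

In the interaction models, $\beta_1$ and $\gamma_1$ describe the relationship between $x_1$ and the outcome when $x_2 = 0$, which is why the meaning of 0 on each variable matters. Your own models will likely include additional explanatory variables beyond these two.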

If you have a continuous variable, it may need to be transformed so that “0” corresponds to a substantively plausible value. For example, you may decide to center the variable so that the mean is 0.
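A minimal sketch of mean-centering, using only the Python standard library. The variable name and the values are hypothetical and stand in for whatever continuous variable you choose.

```python
from statistics import mean

# Hypothetical ages for five respondents (illustrative values only)
ages = [23, 35, 47, 51, 64]

# Subtract the sample mean so that 0 corresponds to the average age
age_mean = mean(ages)
ages_centered = [age - age_mean for age in ages]

print(age_mean)
print(ages_centered)
```

After centering, a value of 0 refers to a respondent at the sample mean, so the coefficient for the other variable in an interaction describes its relationship with the outcome at the mean of this variable rather than at a literal age of 0.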

1. Motivate your example by explaining the variables for which you will be fitting an interaction term and why you think it is substantively plausible that the two may have an interaction. [2]

2. Provide some sort of effective plot or table that shows the relationship between the probability of the outcome and one of the variables in your interaction by the levels of another variable in your interaction. [2=adequate, 3=effective!]

3. Fit the main effects and interaction model using the linear probability model. [1]

4. Interpret the coefficient for the interaction term in this model and the coefficient for at least one of the explanatory variables in the interaction. Note that conveying the meaning of the former may be easier/clearer in the context of the latter. [2]

5. Fit the main effects and interaction model using the logit model. [1]

6. In your example, do the linear probability model and logit model imply the same substantive conclusion one might draw about whether there is an interaction and its direction? Explain. [2]

7. In your example, do the linear probability model and logit model imply the same substantive conclusion one might draw about whether there is an interaction and its direction? Explain. [2]
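To see why the two models can disagree about an interaction, consider a sketch with two binary explanatory variables and no other covariates. In that fully saturated case, the interaction coefficient in the linear probability model equals a difference of differences in the cell proportions, and the interaction coefficient in the logit model equals the same difference of differences in the cell log odds. The cell proportions below are hypothetical, chosen only to illustrate the point, and the names a and b are placeholders.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

# Hypothetical proportions with y = 1 in each cell, keyed by (a, b)
p = {(0, 0): 0.10, (1, 0): 0.20, (0, 1): 0.40, (1, 1): 0.60}

# Difference associated with a when b = 1, minus the same difference when b = 0,
# on the probability scale (the saturated linear probability model)
lpm_interaction = (p[(1, 1)] - p[(0, 1)]) - (p[(1, 0)] - p[(0, 0)])

# The same contrast on the log-odds scale (the saturated logit model)
logit_interaction = (logit(p[(1, 1)]) - logit(p[(0, 1)])) - (
    logit(p[(1, 0)]) - logit(p[(0, 0)])
)

print(round(lpm_interaction, 3))    # about 0.10
print(round(logit_interaction, 3))  # about 0, up to floating-point error
```

With these made-up proportions, the difference associated with a is larger when b = 1 on the probability scale (0.20 versus 0.10), yet the odds ratio for a is 2.25 at both levels of b, so the log-odds interaction is essentially zero. Your own data may or may not show this kind of divergence.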

Using data of your choosing, you will fit two models with a binary outcome and a key explanatory variable. The first model will not contain other variables, and the second will add another variable that you expect will confound, mediate, or otherwise account for part of the relationship between the key explanatory variable and outcome.

8. Motivate your example by explaining what variables you will be looking at and why you think the relationship between the key explanatory variable and the outcome is confounded or mediated by the third variable. [2]

9. Fit your two models. The sample sizes should be the same, and you should impose whatever sample restrictions are necessary to achieve this. One way or another, your output for the two models needs to demonstrate that the sample sizes for the two models are the same. [1]

10. How much does the logit coefficient differ between the two models? Express your answer in terms of a percentage difference from the Model 1 coefficient. (For example, if the Model 1 coefficient were .90 and the Model 2 coefficient were .45, the Model 2 coefficient would be 50% smaller.) [1]

11. Compute the average marginal or discrete change for your key explanatory variable for the two models. [1]

12. How much does the marginal/discrete change differ between the two models? Express your answer in terms of a percentage difference from the Model 1 change. Is it bigger or smaller than the percentage change for the coefficients themselves? Is this what you expected, and why or why not? [2]
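The arithmetic behind the last three questions can be sketched as follows, assuming a continuous key explanatory variable. The function names, data values, and coefficients are all hypothetical: the coefficients are not estimated from the data shown, and in practice you would take both from your fitted models. For a logit model, the average marginal effect of a continuous variable is the mean across observations of the coefficient times p(1 - p), where p is each observation's predicted probability.

```python
import math

def percent_difference(model1_value, model2_value):
    """Percentage difference of the Model 2 value from the Model 1 value."""
    return (model2_value - model1_value) / model1_value * 100.0

def logistic(z):
    """Inverse logit: converts a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effect(slope, linear_predictors):
    """Mean over observations of slope * p * (1 - p) for a logit model."""
    effects = []
    for z in linear_predictors:
        p = logistic(z)
        effects.append(slope * p * (1.0 - p))
    return sum(effects) / len(effects)

# Hypothetical data: key explanatory variable x and added variable z
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
z = [0, 0, 1, 1, 1]

# Hypothetical logit coefficients (assumed for illustration, not estimated)
b0_m1, bx_m1 = -0.50, 0.90                 # Model 1: x only
b0_m2, bx_m2, bz_m2 = -1.00, 0.45, 1.20    # Model 2: x and z

# Linear predictor for each observation under each model
xb_m1 = [b0_m1 + bx_m1 * xi for xi in x]
xb_m2 = [b0_m2 + bx_m2 * xi + bz_m2 * zi for xi, zi in zip(x, z)]

ame_m1 = average_marginal_effect(bx_m1, xb_m1)
ame_m2 = average_marginal_effect(bx_m2, xb_m2)

print("Coefficient change (%):", round(percent_difference(bx_m1, bx_m2), 1))
print("AME change (%):", round(percent_difference(ame_m1, ame_m2), 1))
```

If your key explanatory variable is binary, the analogous quantity is the average discrete change: the mean difference between each observation's predicted probability with the variable set to 1 and with it set to 0.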