Problem Set: Least Squares
Practice
If you complete this exercise using the Quarto file used to generate this page, you can change the format: header above to format: pdf. The document will then render as a PDF file that includes just the questions and your answers (and not, for example, this text).
The following items are premised on you doing some data analysis yourself. For questions 1–3, use an example of your own choosing with a binary explanatory variable and a continuous outcome. For questions 4–6, you will be returning to the models you fit in the “Background” problem set. For questions 7–11, you will introduce a new example.
1. In data of your choosing, fit a linear regression model with a binary explanatory variable (provide the output). [1]
2. Compute the mean of the outcome for each category of the binary explanatory variable. [1]
(Note: you may need to restrict your computation to the same observations used in the model, e.g., using filter in R, to ensure the means correspond to the model’s estimation sample.)
3. Describe how the means computed in question 2 relate to the intercept and coefficient estimated in question 1. [1]
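As a rough illustration of the mechanics behind questions 1 and 2, here is a minimal R sketch. It assumes the built-in mtcars data, with am (0 = automatic, 1 = manual) standing in for a binary explanatory variable and mpg for the continuous outcome. These choices are placeholders, not the example you are expected to use. It also uses base R's model.frame() rather than filter to recover the estimation sample, which is one of several ways to do this.

```r
# Illustrative only: mtcars is a stand-in for data of your own choosing.
fit <- lm(mpg ~ am, data = mtcars)
summary(fit)

# model.frame() returns exactly the rows used to estimate the model,
# so any rows dropped for missing values are excluded from the means too.
est_sample <- model.frame(fit)

# Mean of the outcome within each category of the binary variable.
group_means <- tapply(est_sample$mpg, est_sample$am, mean)
group_means

# Intercept and coefficient, for comparison with the means above.
coef(fit)
```

With your own data, replace mpg and am with your outcome and binary explanatory variable.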
*4. In the “Background” problem set, you fit two models: one with only a key explanatory variable and one that also included a covariate. Fit those two models again and compare their RMSEs. [1]
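One way to compare RMSEs in R is sketched below. The models are placeholders built on the mtcars data (wt as a hypothetical key explanatory variable, hp as a hypothetical covariate), and the rmse() helper is defined here for illustration; it is not a built-in function. It computes the square root of the mean squared residual. If your course defines RMSE with a degrees-of-freedom adjustment, sigma(model) gives that version instead.

```r
# Placeholder models: substitute the two models from your Background problem set.
fit_simple    <- lm(mpg ~ wt, data = mtcars)
fit_covariate <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: root mean squared error of the in-sample residuals.
rmse <- function(model) sqrt(mean(residuals(model)^2))

c(simple = rmse(fit_simple), with_covariate = rmse(fit_covariate))
```

The comparison is only like-for-like if both models are estimated on the same observations.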
For questions 5 and 6, use the model from the Background problem set that included a key explanatory variable and a covariate. Fit a median regression version of that model.
*5. Interpret the coefficient(s) from your median regression model. Do they differ appreciably from the OLS coefficient(s)? [1]
*6. Interpret the difference (or lack thereof) between the median regression and OLS coefficients substantively: why might the relationship between the explanatory variable and the conditional median be the same as, or different from, the relationship between the explanatory variable and the conditional mean? [1]
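If you have not fit a median regression before, the sketch below shows one common approach in R. It assumes the quantreg package is installed (it is not part of base R) and uses its rq() function with tau = 0.5. The mtcars variables are again placeholders for your own key explanatory variable and covariate.

```r
library(quantreg)  # assumption: installed separately, e.g. install.packages("quantreg")

# Placeholder specification: substitute your own outcome, key variable, and covariate.
fit_ols    <- lm(mpg ~ wt + hp, data = mtcars)
fit_median <- rq(mpg ~ wt + hp, tau = 0.5, data = mtcars)  # tau = 0.5 targets the conditional median

# Side-by-side coefficients for comparison.
cbind(ols = coef(fit_ols), median = coef(fit_median))
```

Putting the two sets of coefficients in one table makes it easier to judge whether any difference is appreciable relative to the scale of the outcome.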
For questions 7–11, fit a new linear regression model with three explanatory variables: a binary explanatory variable, a categorical explanatory variable (3+ exclusive categories), and a continuous explanatory variable. Don’t reuse the example from the prior problem set: switch the data, the outcome, or the explanatory variables (or make a more radical change than this).
7. In 1–3 sentences, describe the data you are using in this example, and what relationship you are expecting between the explanatory variables and outcome that motivates the example. [2]
8. Provide the output from the model you have fit. [0]
9. In one sentence, interpret the coefficient for the binary explanatory variable. [1]
10. In one sentence, interpret the coefficient for the categorical explanatory variable. [1]
11. In one sentence, interpret the coefficient for the continuous explanatory variable. [1]
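For the structure of the model in questions 7–11, a minimal R sketch follows. It assumes the mtcars data purely for illustration: am as the binary variable, cyl (4, 6, or 8 cylinders) converted to a factor as the categorical variable, and wt as the continuous variable. Your own example should use different data or variables.

```r
# Illustrative only: one binary, one categorical (3 categories), one continuous predictor.
fit_three <- lm(mpg ~ am + factor(cyl) + wt, data = mtcars)
summary(fit_three)

# A factor with k categories enters the model as k - 1 indicator coefficients,
# each measured relative to the omitted reference category (here, cyl == 4).
coef(fit_three)
```

Because the categorical variable contributes more than one coefficient, state which category is the reference when you interpret it.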