---
title: "Problem Set: Least Squares"
format:
  html:
    css: tufte-pset-adapted.css
editor: source
---

## Practice

::: {.content-visible when-format="html"}

*If you complete this exercise using the Quarto file used to generate this page, you should be able to change the `format:` header above to `format: pdf`, and the file will render as a PDF that includes just the questions and your answers (and not, for example, this text).*

:::

::: {.content-visible when-format="html"}

The following items require you to do some data analysis yourself. For questions 1–3, use an example of your own choosing with a binary explanatory variable and a continuous outcome. For questions 4–6, you will return to the models you fit in the "Background" problem set. For questions 7–11, you will introduce a new example.

:::

::::: {.content-visible when-format="pdf"}

```{r}
## dependencies

## open data file

```

:::::

1. *In data of your choosing, fit a linear regression model with a binary explanatory variable (provide the output).* [1]

::: {.content-visible when-format="pdf"}
```{r}
# code here
```
:::

2. *Compute the mean of the outcome for each category of the binary explanatory variable.* [1]

::: {.content-visible when-format="html"}
(Note: you may need to restrict your computation to the same observations used in the model — e.g., using `filter` in R — to ensure the means correspond to the model's estimation sample.)
:::
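
::: {.content-visible when-format="html"}
A minimal sketch of that restriction, using R's built-in `airquality` data purely for illustration. The variable `hot` and its cutoff of 80 are made up for the example and are not part of the assignment. `model.frame()` returns the rows that `lm()` actually used, so means computed from it share the model's estimation sample.

```r
# Illustrative only: built-in data with a made-up binary variable.
aq <- airquality
aq$hot <- as.integer(aq$Temp >= 80)   # hypothetical binary explanatory variable

# Ozone has missing values, so lm() drops those rows by default.
fit <- lm(Ozone ~ hot, data = aq)

# model.frame() holds exactly the observations used to fit the model.
est <- model.frame(fit)
nrow(aq)    # all rows in the data
nrow(est)   # rows in the estimation sample

# Mean of the outcome within each category, on the estimation sample only.
group_means <- tapply(est$Ozone, est$hot, mean)
group_means
```

In this example, filtering the data to rows where `Ozone` is not missing before computing the means would select the same observations.
:::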

::: {.content-visible when-format="pdf"}
```{r}
# code here
```
:::

3. *Describe how the means computed in question 2 relate to the intercept and coefficient estimated in question 1.* [1]

::: {.content-visible when-format="pdf"}

YOUR ANSWER HERE.

:::

\*4. *In the "Background" problem set, you fit two models: one with only a key explanatory variable and one with a covariate. Fit those two models again and compare the RMSEs.* [1]

::: {.content-visible when-format="pdf"}
```{r}
# code here
```

YOUR ANSWER HERE.

:::
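
::: {.content-visible when-format="html"}
One way the comparison in question 4 might be computed, sketched on the built-in `mtcars` data with placeholder variables. Here RMSE is taken as the square root of the mean squared residual. Some texts divide by the residual degrees of freedom instead (which is what `sigma()` reports), so check which definition your course uses.

```r
# Placeholder models on built-in data; substitute your own variables.
fit_key <- lm(mpg ~ wt, data = mtcars)        # key explanatory variable only
fit_cov <- lm(mpg ~ wt + hp, data = mtcars)   # adds a covariate

# RMSE as the square root of the mean squared residual (assumed definition).
rmse <- function(model) sqrt(mean(residuals(model)^2))

c(key_only = rmse(fit_key), with_covariate = rmse(fit_cov))
```

For the comparison to be meaningful, both models should be fit to the same observations.
:::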

*For questions 5 and 6, use the model from the Background problem set that included a key explanatory variable and a covariate. Fit a median regression version of that model.*
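
::: {.content-visible when-format="html"}
A possible starting point, assuming the `quantreg` package is installed (this problem set does not name a package, so treat that choice as an assumption). Its `rq()` function fits quantile regressions, and `tau = 0.5` targets the conditional median. The `mtcars` variables are placeholders.

```r
# Assumes the quantreg package is installed; the block is skipped otherwise.
if (requireNamespace("quantreg", quietly = TRUE)) {
  # tau = 0.5 requests the conditional median.
  fit_med <- quantreg::rq(mpg ~ wt + hp, tau = 0.5, data = mtcars)
  fit_ols <- lm(mpg ~ wt + hp, data = mtcars)

  # Coefficients side by side for comparison.
  print(cbind(median = coef(fit_med), ols = coef(fit_ols)))
}
```
:::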

\*5. *Interpret the coefficient(s) from your median regression model. Do they differ appreciably from the corresponding OLS coefficient(s)?* [1]

::: {.content-visible when-format="pdf"}
```{r}
# code here
```

YOUR INTERPRETATION HERE.

:::

\*6. *Interpret the difference (or lack thereof) between the median regression and OLS coefficients substantively: why might the relationship between the explanatory variable and the conditional median be the same as, or different from, the relationship between the explanatory variable and the conditional mean?* [1]

::: {.content-visible when-format="pdf"}

YOUR ANSWER HERE.

:::

*For questions 7–11, fit a new linear regression model with three explanatory variables: a binary explanatory variable, a categorical explanatory variable (3+ [exclusive](https://en.wikipedia.org/wiki/Mutual_exclusivity) categories), and a continuous explanatory variable. Don't use the example from the prior problem set: switch the data, the outcome, or the explanatory variables (or switch up more radically than this).*
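
::: {.content-visible when-format="html"}
A sketch of how such a model might be set up in R, using placeholder variables from the built-in `mtcars` data. Wrapping the categorical variable in `factor()` lets `lm()` expand it into indicator variables, with the first level serving as the reference category.

```r
# Placeholder variables from built-in data; substitute your own.
cars <- mtcars
cars$cyl_f <- factor(cars$cyl)   # categorical: 4, 6, or 8 cylinders

# am is binary (0/1), cyl_f is categorical, wt is continuous.
fit <- lm(mpg ~ am + cyl_f + wt, data = cars)

levels(cars$cyl_f)   # the first level is the reference category
names(coef(fit))     # one coefficient per non-reference category
```

If a different reference category makes more sense for your example, `relevel()` can change it before fitting.
:::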

7. *In 1–3 sentences, describe the data you are using in this example, and what relationship you are expecting between the explanatory variables and outcome that motivates the example.* [2]

::: {.content-visible when-format="pdf"}

YOUR ANSWER HERE.

:::

8. *Provide the output from the model you have fit.* [0]

::: {.content-visible when-format="pdf"}
```{r}
# code here
```
:::

9. *In one sentence, interpret the coefficient for the binary explanatory variable.* [1]

::: {.content-visible when-format="pdf"}

YOUR INTERPRETATION HERE.

:::

10. *In one sentence, interpret a coefficient for the categorical explanatory variable (relative to its reference category).* [1]

::: {.content-visible when-format="pdf"}

YOUR INTERPRETATION HERE.

:::

11. *In one sentence, interpret the coefficient for the continuous explanatory variable.* [1]

::: {.content-visible when-format="pdf"}

YOUR INTERPRETATION HERE.

:::
