Regression models for discrete outcomes

These pages have been developed for use in a course that I teach at Stanford, Soc 383.

Course Materials

Linear regression

Background

  1. Basic interpretation of regression coefficients. Stata R
  2. Adjusting for covariates in regression models. Stata R
  3. Predicting the outcome in OLS. Stata R
  4. \(\hat{y}\) and \(\mathbf{x\beta}\). (No code used.)
  5. Logged outcomes in linear regression. Stata R

Concept Comprehension Practice R

Least squares

  1. Least squares. Stata R
  2. The mean and least squares. Stata R
  3. Least squares, the standard deviation, and the RMSE. Stata R
  4. Absolute deviations, the median, and median regression. Stata R

Concept Comprehension Practice R

Binary outcomes

Linear probability model

  1. Binary outcomes R (No Stata code used)
  2. The linear probability model. Stata R
  3. Drawbacks of the linear probability model. Stata R

Concept Comprehension Practice R

The logit model for binary outcomes

  1. Introducing the logit. Stata R
  2. The logit model for binary outcomes. Stata R
  3. Interpreting logit results using the odds ratio. Stata R
  4. Relative risk. R

Concept Comprehension Practice R

Interpreting logit results using predicted probabilities

  1. Computing and comparing predicted probabilities. Stata R
  2. Change in predicted probability for categorical explanatory variable. Stata R
  3. Profile plots of predicted probabilities. Stata R
  4. Change in predicted probability for a continuous explanatory variable. Stata R

Concept Comprehension Practice R

Issues with nonlinear vs linear probability models

  1. Model dependence of statistical interactions for binary outcomes. Stata R
  2. Testing interactions via differences in average change in probabilities. Stata R
  3. Uncorrelated regressors and coefficients from nonlinear probability models. Stata R
  4. Comparing logit/probit coefficients across models. Stata R

Practice R

Maximum likelihood estimation

  1. Fitting the logit model using maximum likelihood estimation. Stata R
  2. Uncertainty and maximum likelihood estimation. Stata R
  3. Model comparison using AIC and BIC. Stata R
  4. Hypothesis testing using likelihood-ratio and Wald tests. R
  5. Pseudo-\(R^2\) measures of model fit. Stata R

Concept Comprehension Practice R

Latent variable approach

  1. Cumulative distribution function of the normal distribution. Stata
  2. Probit regression and the latent variable approach to binary outcomes. Stata R
  3. Key points re: comparing logit and probit. (No code used.)

Problem set

Ordered outcomes

  1. Ordered outcomes. (No code used.)
  2. Ordered probit as a latent variable model. Stata R
  3. Interpreting estimates from the ordered probit model. Stata R
  4. Ordered logit as a model of cumulative log odds. Stata R
  5. Interpreting ordered logit results using odds ratios. Stata R
  6. Interpreting ordered logit results via plots. Stata R
  7. Average discrete change for categorical explanatory variables. Stata R
  8. Testing the parallel regressions assumption. Stata R
  9. The sequential logit model. Stata R

Problem set

Unordered outcomes (aka nominal outcomes)

  1. Unordered outcomes. (No code used.)
  2. Modeling an unordered outcome as a set of binary logits. Stata R
  3. The multinomial logit model. Stata R
  4. Changes in predicted probabilities for multinomial logit. Stata R
  5. Plotting results for multinomial logit as a continuous variable changes. Stata R
  6. Skipped in 2025: Independence of irrelevant alternatives. Stata
  7. Conditional logit model for choice data. R
  8. Conditional logit to fit multinomial logit with alternative-specific variables. Stata R

Concept Comprehension Practice R

Event outcomes (survival analysis)

  1. Introduction to event outcomes. (No code used.)
  2. Survival analysis: key orienting concepts. (No code used.)
  3. Kaplan-Meier curves. Stata R
  4. Skipped in 2025: Proportional hazards vs. accelerated failure-time. (No code used.)
  5. Skipped in 2025: The cumulative hazard. Stata
  6. Skipped in 2025: Exponential regression. Stata R
  7. Introduction to Cox regression as a conditional logit model. R
  8. Cox proportional hazards model. Stata R

Skipped in 2025: Problem set

Count outcomes

  1. The Poisson distribution. Stata R
  2. Poisson regression. Stata R
  3. Negative binomial regression.

Not used

The pages below are not presently used in the course, but I made them and am preserving them here in an effort to keep me from forgetting they exist should they prove handy later.

Out of sequence

  1. Least squares and the normal distribution. Stata R
  2. Normally distributed errors in linear regression. Stata R
  3. Likelihood of error and likelihood of estimates. Stata R

Superseded / Currently cast aside

  1. Are non-normal residuals a problem for the linear probability model? Stata R

Copyrightable portions are CC-BY-3.0 US, with the intended spirit being “I’d love to have you use these in whatever ways it would be useful to you! Just don’t pass my work off as yours, and please cite or give a shout-out when appropriate.”