Regression models for discrete outcomes
These pages have been developed for use in a course that I teach at Stanford, Soc 383.
Course Materials
- Syllabus (Spring 2026)
- Tulaverse package used in class examples
- Folder with .dta datasets used for examples (Note that opening these in R requires the haven package.)
- Key Principles for doing Transparent Social Science
- Refresher: an opportunity to review the background knowledge this course presumes you have from earlier classes or elsewhere.
Linear regression
Background
- Basic interpretation of regression coefficients.
- Adjusting for covariates in regression models.
- Predicting the outcome in OLS.
- \(\hat{y}\) and \(\mathbf{x\beta}\). (No code used.)
- Logged outcomes in linear regression
Least squares
- Least squares.
- The mean and least squares.
- Least squares, the standard deviation, and the RMSE.
- Absolute deviations, the median, and median regression.
Binary outcomes
Linear probability model
- Binary outcomes (No Stata code used)
- The linear probability model.
- Drawbacks of the linear probability model.
The logit model for binary outcomes
- Introducing the logit.
- The logit model for binary outcomes.
- Interpreting logit results using the odds ratio.
- Relative risk.
Interpreting logit results using predicted probabilities
- Computing and comparing predicted probabilities.
- Change in predicted probability for categorical explanatory variable.
- Profile plots of predicted probabilities.
- Change in predicted probability for a continuous explanatory variable.
Issues with nonlinear vs linear probability models
- Model dependence of statistical interactions for binary outcomes
- Testing interactions via differences in average change in probabilities
- Uncorrelated regressors and coefficients from nonlinear probability models
- Comparing logit/probit coefficients across models
Maximum likelihood estimation
- Fitting the logit model using maximum likelihood estimation.
- Uncertainty and maximum likelihood estimation.
- Model comparison using AIC and BIC.
- Hypothesis testing using likelihood-ratio and Wald tests.
- Pseudo-\(R^2\) measures of model fit.
Latent variable approach
- Cumulative distribution function of the normal distribution.
- Probit regression and the latent variable approach to binary outcomes.
- Key points re: comparing logit and probit. (No code used.)
Ordered outcomes
- Ordered outcomes. (No code used.)
- Ordered probit as a latent variable model.
- Interpreting estimates from the ordered probit model.
- Ordered logit as a model of cumulative log odds.
- Interpreting ordered logit results using odds ratios.
- Interpreting ordered logit results via plots.
- Average discrete change for categorical explanatory variables.
- Testing the parallel regressions assumption.
- The sequential logit model.
Unordered outcomes (aka nominal outcomes)
- Unordered outcomes. (No code used.)
- Modeling an unordered outcome as a set of binary logits.
- The multinomial logit model.
- Changes in predicted probabilities for multinomial logit.
- Plotting results for multinomial logit as a continuous variable changes.
- Skipped in 2025: Independence of irrelevant alternatives
- Conditional logit model for choice data.
- Conditional logit to fit multinomial logit with alternative-specific variables.
Event outcomes (survival analysis)
- Introduction to event outcomes. (No code used.)
- Survival analysis: key orienting concepts (No code used.)
- Kaplan-Meier curves.
- Skipped in 2025: Proportional hazards vs. accelerated failure-time. (No code used.)
- Skipped in 2025: The cumulative hazard.
- Skipped in 2025: Exponential regression.
- Introduction to Cox regression as a conditional logit model.
- Cox proportional hazards model.
Skipped in 2025:
Count outcomes
Not used
The pages below are not presently used in the course, but I made them and am preserving them here so that I don't forget they exist, should they prove handy later.
Out of sequence
Superseded / Currently cast aside
Copyrightable portions are CC-BY-3.0 US, with the intended spirit being “I’d love to have you use these in whatever ways it would be useful to you! Just don’t pass my work off as yours, and please cite or give a shout-out when appropriate.”