Regression models for discrete outcomes

These pages have been developed for use in a course that I teach at Stanford, Soc 383.

Course Materials

Syllabus (Spring 2026)
Tulaverse package used in class examples
Folder with .dta datasets used for examples (Note that opening these in R requires the haven package.)
Key Principles for doing Transparent Social Science
Refresher - A chance to review the background knowledge this course presumes you have from earlier classes or elsewhere.

Binary outcomes

Event outcomes (survival analysis)

Introduction to event outcomes. (No code used.)
Survival analysis: key orienting concepts (No code used.)
Kaplan-Meier curves. Stata R
Skipped in 2026: Proportional hazards vs. accelerated failure-time. (No code used.)
Skipped in 2026: The cumulative hazard. Stata
Skipped in 2026: Exponential regression. Stata R
Introduction to Cox regression as a conditional logit model. R
Cox proportional hazards model. Stata R

Problem set

Count outcomes

Not used

The pages below are not presently used in the course, but I made them and am preserving them here in an effort to keep me from forgetting they exist should they prove handy later.

Out of sequence

Superseded / Currently cast aside

Are non-normal residuals a problem for the linear probability model? Stata R

Copyrightable portions are CC-BY-3.0 US, with the intended spirit being “I’d love to have you use these in whatever ways it would be useful to you! Just don’t pass my work off as yours, and please cite or give a shout-out when appropriate.”
