Introduction to Cox regression as a conditional logit model

The most commonly used survival analysis model is the Cox proportional hazards model, also called “Cox regression” or the “Cox model.”

Cox regression is closely related to the conditional logit model. In R, the clogit() function we used to fit the model was part of a package named survival. Also, if you fit a model with clogit() and use summary(), the first lines of the output will make it appear as though you had fit the model using a function called coxph() instead of clogit(). “coxph” refers to “Cox proportional hazards,” and the reason this shows up in clogit() output is that all the clogit() function actually does is provide input to coxph(), which in turn does the heavy lifting of fitting the model.

Here, we are going to walk through an example intended to make the connection between Cox regression and conditional logit clear. “Survival” in our example will mean avoiding being banished on the TV game show The Traitors.

Background: The Traitors

The Traitors is a television show based on a game similar to the parlor games Mafia and Werewolf. The game takes place over a couple of weeks. A few contestants are secretly designated as “traitors,” the rest are “faithful.” Most evenings, the players have a “Round Table” discussion that culminates in a vote in which one player is banished (upon departure, they reveal whether they were a traitor or faithful). Most nights after that, the traitors will meet secretly and decide on a faithful to “murder,” whose death is revealed when they fail to show up for breakfast the next morning.

The basic premise of the Round Tables is that it is the opportunity for faithfuls to ferret out and banish traitors. Nevertheless, most of the time, Round Tables end with a faithful being banished. Then again, this isn’t necessarily surprising, there are many more faithful than traitors, especially at the beginning.

Our question: if a contestant is a traitor, are they at more or less risk of being banished at the Round Table than if they were a faithful?

(Importantly, we are focusing here on banishment, not on being eliminated from the game. Being murdered also eliminates one from the game, but it does not count as the “failure event” we are studying here. Instead, being murdered removes one out of the pool of people at risk of being banished at subsequent Round Tables.)

Data set up for conditional logit

Our earlier example of the conditional logit model involved predicting which nominee would win an Oscar. In our current example, we will be using a person’s status as traitor or faithful to estimate their predicted probability of being banished.

Each row in our dataset is going to be a Round Table and a contestant. In Season 1 of the US version of the Traitors, there were 20 contestants. Three were initially selected as traitors (Cirie, Christian, and Cody). They got to murder a faithful before the first Round Table met (Reza).

At the first Round Table, the person who was banished was a faithful (Geraldine).

All this is represented in the following data from the first Round Table:

Expand to show code for opening and reshaping data
library(tidyverse)
library(tulaverse)
library(readxl)
library(survival)

traitors_data <- read_excel("../rawdata/traitors.xlsx")

# Reshape from wide to long format
traitors_long <- traitors_data %>%
  pivot_longer(
    cols = starts_with(c("status_", "banished_")),
    names_to = c(".value", "round_table"),
    names_pattern = "(status|banished)_(\\d+)",
    names_transform = list(round_table = as.numeric)
  ) %>%
  arrange(name, round_table)
head(traitors_long %>% 
  filter(round_table == 1) %>%
  select(round_table, name, status, banished),
  n=20)
# A tibble: 20 × 4
   round_table name      status   banished
         <dbl> <chr>     <chr>       <dbl>
 1           1 Amanda    faithful        0
 2           1 Andie     faithful        0
 3           1 Anjelica  faithful        0
 4           1 Arie      faithful        0
 5           1 Azra      faithful        0
 6           1 Bam       faithful        0
 7           1 Brandi    faithful        0
 8           1 Christian traitor         0
 9           1 Cirie     traitor         0
10           1 Cody      traitor         0
11           1 Geraldine faithful        1
12           1 Kate      faithful        0
13           1 Kyle      faithful        0
14           1 Michael   faithful        0
15           1 Quentin   faithful        0
16           1 Rachel    faithful        0
17           1 Reza      murdered       NA
18           1 Ryan      faithful        0
19           1 Shelbe    faithful        0
20           1 Stephenie faithful        0

If we look at the variable banished, only two contestants have non-zero values:

  • Geraldine has a value of 1 because she was the player who was banished
  • Reza has a value of NA because he was already murdered

Using the language of survival analysis, Reza was never at risk of banishment because he had already been murdered–and so was out of the risk pool–by the time the first banishment took place.

Our dataset does not just contain data from the first round table, but from all the Round Tables that season. Each Round Table will have fewer players than the last.

Since Geraldine was banished at the first round table, she will not be there for the second. Also, after the Round Table, the traitors met secretly later that night and decided to murder Bam.

At the next evening’s Round Table, there are 17 players, and they voted to banish Brandi, who is also a faithful. All this is reflected in the rows for the second Round Table:

# A tibble: 20 × 4
   round_table name      status   banished
         <dbl> <chr>     <chr>       <dbl>
 1           2 Amanda    faithful        0
 2           2 Andie     faithful        0
 3           2 Anjelica  faithful        0
 4           2 Arie      faithful        0
 5           2 Azra      faithful        0
 6           2 Bam       murdered       NA
 7           2 Brandi    faithful        1
 8           2 Christian traitor         0
 9           2 Cirie     traitor         0
10           2 Cody      traitor         0
11           2 Geraldine banished       NA
12           2 Kate      faithful        0
13           2 Kyle      faithful        0
14           2 Michael   faithful        0
15           2 Quentin   faithful        0
16           2 Rachel    faithful        0
17           2 Reza      murdered       NA
18           2 Ryan      faithful        0
19           2 Shelbe    faithful        0
20           2 Stephenie faithful        0

The last Round Table happens when there are five contestants left, after which the game moves to its finale episode which has its own format. At this last Round Table, another faithful (Kate) is banished.

# A tibble: 5 × 4
  round_table name    status   banished
        <dbl> <chr>   <chr>       <dbl>
1           9 Andie   faithful        0
2           9 Arie    traitor         0
3           9 Cirie   traitor         0
4           9 Kate    faithful        1
5           9 Quentin faithful        0

The remaining 4 contestants survived all the Round Tables without being banished. Each of them will appear nine times in the dataset–once for each Round Table–and their value for banished will never be 1.

Contestants who were murdered also will never have values of 1 for banished, but they will be NA for all the Round Tables after they were eliminated.

The sharp-eyed observer of the previous table might notice that Arie was a faithful at the first Round Table and a traitor at the last Round Table. This is because, on The Traitors, after a traitor is banished there is an opportunity to recruit a new traitor from among the faithful to take their place.

In survival analysis terms, this makes status a time-varying covariate, albeit only for Arie in this season: his value is different for different times in the analysis.

# A tibble: 9 × 4
  round_table name  status   banished
        <dbl> <chr> <chr>       <dbl>
1           1 Arie  faithful        0
2           2 Arie  faithful        0
3           3 Arie  faithful        0
4           4 Arie  faithful        0
5           5 Arie  faithful        0
6           6 Arie  faithful        0
7           7 Arie  traitor         0
8           8 Arie  traitor         0
9           9 Arie  traitor         0

Important to understand about the data here is that all of the Round Tables are included in the same dataset. Ignoring the NA rows, there end up being 103 rows, because the 9 Round Tables include 19+17+15+13+11+9+8+6+5 players. Of these 103 rows, only 9 have positive values for banished, because only one player is banished at each Round Table.

table(traitors_long$banished)

 0  1 
94  9 
# A tibble: 9 × 4
  round_table name      status   banished
        <dbl> <chr>     <chr>       <dbl>
1           1 Geraldine faithful        1
2           2 Brandi    faithful        1
3           3 Michael   faithful        1
4           4 Kyle      faithful        1
5           5 Cody      traitor         1
6           6 Shelbe    faithful        1
7           7 Rachel    faithful        1
8           8 Christian traitor         1
9           9 Kate      faithful        1

Predicted probabilities for conditional logit model

In our model, traitor will be our only explanatory variable, and the outcome is 1 if a player is banished. As with our earlier example of conditional logit, among the players at risk of banishment at round table j the probability that player i will be banished is:

\[ \Pr(y_i = \mathrm{banished}) = \frac {\exp(\mathbf{x}_{ij}\mathbf{\beta})} {\sum_{m=1}^{n}\exp(\mathbf{x}_{mj}\mathbf{\beta})} \] That is, it will be that individual’s \(\exp(\mathbf{x}_{ij}\mathbf{\beta})\) over the summed \(\exp(\mathbf{x}_{ij}\mathbf{\beta})\) for everyone who might be banished at that Round Table.

However, because traitor is a binary variable and the only explanatory variable in our model, all this is very simple:

  • For each of the traitors, \(\exp(\mathbf{x}_{ij}\mathbf{\beta})\) will be \(\exp(\beta_\mathtt{traitor})\)
  • For each of the faithful, \(\exp(\mathbf{x}_{ij}\mathbf{\beta})\) will be \(\exp(0) = 1\)

This does not only simplify the numerator of our predicted probability but also the denominator. We will multiply \(\exp(\mathbf{x}_{ij}\mathbf{\beta})\) by the number of traitors at that Round Table and add to it 1 times the number of faithful:

\[ \begin{align} \Pr(y_i = \mathrm{banished}|x_i = \mathrm{traitor}) & = \frac {\exp(\beta)} {\left[N_{traitors}\times\exp(\beta)\right] + \left[N_{faithful}\times1\right]} \\ \\ \Pr(y_i = \mathrm{banished}|x_i = \mathrm{faithful}) & = \frac {1} {\left[N_{traitors}\times\exp(\beta)\right] + \left[N_{faithful}\times1\right]} \end{align} \]

Fitting as a conditional logit model

In our earlier conditional logit example, we specified Oscar race as a stratum, so probabilities were based on the differences among observations within each race. In this example, our stratum is going to be each Round Table.

library(survival)
# get rid of NA values
traitors_nona <- traitors_long %>%
  drop_na(banished)

model <- clogit(banished ~ status + strata(round_table), data = traitors_nona)
tula(model)
Conditional (fixed-effects) logistic regression
LR chi2(1)     =   0.1751                                Number of obs    = 103
Prob > chi2    =   0.6756                                Number of groups =   9
Log likelihood =  -21.039                                                      
McFadden R-sq  = 0.004145                                                      
───────────────────────────────────────────────────────────────────────────────
banished      │      Coef  Std. Err.          z     P>|z|  [95% Conf  Interval]
───────────────────────────────────────────────────────────────────────────────
statustraitor │    -.3357       .824     -.4074     .6837     -1.951      1.279
───────────────────────────────────────────────────────────────────────────────

Obviously from the p-value we can see we do not have evidence that traitors and faithful differ in their risk of being banished, but then again we only have evidence for one season and nine banishments.

Nevertheless, we can use this \(\beta\) = -.3357 to reinforce our understanding of how the predicted probabilities are calculated.

For the first round table, there were 19 players at risk: 3 traitors and 16 faithful. The denominator of the equation for our predicted probability above is:

$$ \[\begin{align} \left[N_{traitors}\times\exp(\beta)\right] + \left[N_{faithful}\times1\right] & = [3 \times \exp(-.3357)] + [16] \\ & = [3 \times .7148] + [16] \\ & = 18.144 \end{align}\] $$ And so for the traitors and faithful at the first round table:

$$ \[\begin{align} \Pr(y_i = \mathrm{banished}|x_i = \mathrm{traitor}) & = \frac {\exp(-.3357)} {18.144} \\ & = \frac {.7148} {18.144} \\ &= .0393 \\ \\ \Pr(y_i = \mathrm{banished}|x_i = \mathrm{faithful}) & = \frac {1} {18.144} \\ & = .0551 \end{align}\] $$

With 16 faithfuls and 3 traitors, each of the traitors has a predicted probability of about 4% of being banished, while each of the faithfuls has a 5.5% chance.

Interpretation by odds ratio

With the conditional logit model, we exponentiated coefficients and interpreted the result in terms of an odds ratio. We can do that again here.

The exponentiated coefficient is \(\exp(-.3357) = .7148\). We could therefore interpret the result as indicating that:

  • A player who is a traitor has 29% lower odds of being banished at the Round Table than a player who is a faithful.

Interpretation by hazard ratio

The hazard corresponds to the risk of the event happening within an interval. In our example, it is the probability of banishment at a Round Table. As we discussed with logit, the ratio of the predicted probability for a traitor vs. the predicted probability for a faithful is not the odds ratio, it is a relative risk.

However, as we noted when we talked about relative risk, when the baseline probability is small, the relative risk will be close to the odds ratio. We can even see that here, looking at the probabilities we calculated above when the Round Table was 1:

\[ \frac{\Pr(\mathrm{banishment}|\mathrm{traitor})}{\Pr(\mathrm{banishment}|\mathrm{faithful})} = \frac{.0393}{.0551} = .7149 \] Which is almost exactly our odds ratio.

In practice, Cox regression results are interpreted as hazard ratios, not odds ratios. When time is discrete as it is here (discrete Round Table sessions, instead of continuous time), that interpretation does involve an assumption that occurrence of the failure event at a point in time is relatively rare relative to the size of the risk pool. Even in this toy example, this is not a problematic assumption, and typically it would be even less so in real examples.

Spell organization of data

As shown above, we organized the data similarly to what we saw in conditional logit, with each row of the dataset being one Round Table for one individual and an outcome that was 1 for one row within each stratum.

But survival data instead are typically organized by spells. A spell is a time period in which the individual is at risk, and a spell ends when either (a) the event happens; (b) an explanatory variable changes; (c) an individual is no longer at risk; (d) an individual is no longer observed.

For the season of The Traitors examined above, a concise way of providing all the relevant information for the season is as spells.

Expand to show code that opens and recodes data
traitors <- read_csv("../csv/traitors.csv") %>%
  drop_na(banished) %>%
  mutate(first_rt_minus1 = first_rt-1)
Expand to show code that displays spells
traitors %>%
  filter(season == "US1") %>%
  arrange(contestant, first_rt) %>%
  select(contestant, status, first_rt, last_rt, banished)
# A tibble: 20 × 5
   contestant status   first_rt last_rt banished
   <chr>      <chr>       <dbl>   <dbl>    <dbl>
 1 Amanda     faithful        1       4        0
 2 Andie      faithful        1       9        0
 3 Anjelica   faithful        1       5        0
 4 Arie       faithful        1       6        0
 5 Arie       traitor         7       9        0
 6 Azra       faithful        1       2        0
 7 Bam        faithful        1       1        0
 8 Brandi     faithful        1       2        1
 9 Christian  traitor         1       8        1
10 Cirie      traitor         1       9        0
11 Cody       traitor         1       5        1
12 Geraldine  faithful        1       1        1
13 Kate       faithful        1       9        1
14 Kyle       faithful        1       4        1
15 Michael    faithful        1       3        1
16 Quentin    faithful        1       9        0
17 Rachel     faithful        1       7        1
18 Ryan       faithful        1       3        0
19 Shelbe     faithful        1       6        1
20 Stephenie  faithful        1       7        0

The data show the #s of the first and last Round Table the person participated in and whether that last Round Table ended with their being banished.

Because Arie was converted from a faithful to a traitor in the middle of the season, he has two rows, one for Round Tables #1-#6 in which he was a faithful, and one for #7-#9 for which he was a traitor.

We need to tweak this data to fit a Cox model, because a Cox model requires the end of a spell (last_rt) to be later (i.e., greater) than the beginning of a spell (first_rt). Round Tables are discrete periods, and so thinking about somebody’s first and last Round Table makes sense substantively, but to fit the model we just need to subtract one from first_rt. I made a new variable first_rt_minus1 for this purpose.

We fit the model using the coxph() function, which is part of the survival package. Most of the time when one fits a model in R, the only thing to the left of the ~ operator is the outcome variable. With coxph() however, we include three arguments inside a Surv() function: the start time, the end time, and the variable indicating whether the event occurred at the end of the spell.

cox_model <- coxph(
  Surv(first_rt_minus1, last_rt, banished) ~ status,
  data = traitors %>% filter(season == "US1"),
  ties = "efron"  
)
tula(cox_model)
Cox regression / Ties: efron
No. of subjects =      20                                 Number of obs =     20
No. of failures =       9                                 AIC           = 44.078
Time at risk    =       6                                 Concordance   = 0.5426
Log likelihood  = -21.039                                                       
────────────────────────────────────────────────────────────────────────────────
Surv(first_r~) │Haz. Ratio       DMSE          z     P>|z|  [95% Conf  Interval]
────────────────────────────────────────────────────────────────────────────────
statustraitor  │     .7148       .589     -.4074     .6837      .1422      3.594
────────────────────────────────────────────────────────────────────────────────

The estimate for traitor here corresponds to the results already observed for clogit above. We are thus able to obtain the same results with spell-formatted data using coxph() that we were above using clogit() on data that were more clearly choice data.

We can expand our consideration of the traitors data to more seasons: 3 of The Traitors US and 3 of The Traitors UK.

cox_model <- coxph(
  Surv(first_rt_minus1, last_rt, banished) ~ status + strata(season),
  data = traitors,
  ties = "efron"  
)
tula(cox_model)
Cox regression / Ties: efron
No. of subjects =      135                               Number of obs =     135
No. of failures =       62                               AIC           = 299.235
Time at risk    =       73                               Concordance   =  0.5439
Log likelihood  = -148.617                                                      
────────────────────────────────────────────────────────────────────────────────
Surv(first_r~) │Haz. Ratio       DMSE          z     P>|z|  [95% Conf  Interval]
────────────────────────────────────────────────────────────────────────────────
statustraitor  │     1.892      .5126      2.353     .0186      1.112      3.217
────────────────────────────────────────────────────────────────────────────────

When we do this, the sign of the estimate reverses and becomes statistically significant, suggesting that the US Season 1 was an anomaly. Over the six seasons analyzed, being a traitor increases the hazard of being banished. More specifically, a traitor’s hazard of banishment is 1.9 times that of a faithful.

Bonus: Nonlinear effects of time

If you’ve watched many seasons of The Traitors, it seems like the chances of catching a traitor increase as the show continues and players have more information, even beyond the effect of just the ratio of traitors to faithfuls increasing.

In survival analysis terms, we might think the hazard of being a traitor increases with time. We can use the tt() function within coxph() to include time-dependent coefficients. In our case, we will create a term interacting status with the log of analysis time.

cox_model <- coxph(
  Surv(first_rt_minus1, last_rt, banished) ~ status + tt(status=="traitor") + strata(season),
  data = traitors,
  tt = function(x, t, ...) x * log(t),
  ties = "efron"  
)
tula(cox_model)
Cox regression / Ties: efron
No. of subjects =      135                               Number of obs =     135
No. of failures =       62                               AIC           = 295.287
Time at risk    =     3459                               Concordance   =  0.5481
Log likelihood  = -145.644                                                      
────────────────────────────────────────────────────────────────────────────────
Surv(first_r~) │Haz. Ratio       DMSE          z     P>|z|  [95% Conf  Interval]
────────────────────────────────────────────────────────────────────────────────
statustraitor  │     .2538      .2649     -1.314     .1889     .03283      1.962
tt(status ==~) │     3.316      1.882      2.112     .0347       1.09      10.08
────────────────────────────────────────────────────────────────────────────────

The results indicate that, for the first round table:

\[ \begin{align} \beta_\mathrm{traitor, RT1} &= -1.3711 + (1.1987 \times \ln(1)) \\ &= -1.3711 + (1.1987 \times 0) \\ &= -1.3711 \\ \end{align} \] whereas, for the ninth round table:

\[ \begin{align} \beta_\mathrm{traitor, RT9} &= -1.3711 + (1.1987 \times \ln(9)) \\ &= -1.3711 + (1.1987 \times 2.197) \\ &= -1.3711 + (2.634) \\ &= 1.2627 \\ \end{align} \]

We can interpret these results as indicating that at the opening round table, the hazard of being banished is nearly four times higher for faithful than traitors (\(1/\exp(-1.3711) = 3.85\)), whereas for the ninth round table, it has reversed and is \(\exp(1.2627)=\) 3.5 times higher for traitors than faithful.