Conditional logit to fit multinomial logit with alternative-specific variables
In the multinomial logit model, each case in the data has the same set of possible unordered outcomes. If the outcome is a choice, one can think about the multinomial logit model as one in which each individual has the same set of alternatives, and chooses one among them.
A conditional logit model provides a different way of fitting a multinomial logit model. The multinomial logit approach we have already covered is simpler, and there is no reason to use conditional logit for an ordinary multinomial logit model.
However, the conditional logit model is more flexible. In particular, it lets us include information that varies over the possible outcomes.
Here are some examples that illustrate what we mean by alternative-specific information:
The outcome is the mode of transportation an individual takes to work. Alternatives are car, bus, train. Our key explanatory variable is time. We expect that people would prefer transportation that would get them to work in less time. We want to estimate the effect of time, but the time for each alternative varies for each individual.
The outcome is the high school chosen by a student in a district that has eight public high schools and open enrollment among them. Our key explanatory variable is distance: we think that students might prefer high schools that are closer. We want to estimate the effect of distance, but the distance to each of the eight high schools varies for each student.
The dependent variable is which candidate a respondent votes for in a multiparty election with a parliamentary system. For example, Canadian elections (as of 2025) have five major parties: the Liberal Party, the Conservative Party, Bloc Québécois, the New Democratic Party, and the Green Party. Our key explanatory variable is agreement with a party’s issue positions. We can ask a voter’s position on many issues, and combine them with each party’s position on those issues. We want to estimate how much overall agreement across many issues affects vote choice, but agreement with each party varies for each respondent.
When a conditional logit model is used to fit the equivalent of a multinomial logit model with alternative-specific variables, it is sometimes called McFadden’s choice model.
Description of example
The example we will use involves a survey conducted in 1989 among people who had engaged in a fishing trip in Southern California. For each trip, there were four possible options:
Fishing from a beach
Fishing off a pier
Using a commercial charter boat
Using a private boat
For each trip, each of these options has an associated price (in dollars) and an associated quality (in terms of the expected number of target fish caught). These are alternative-specific variables. In addition, the data include a measure of monthly income for the respondent (measured in thousands of dollars), which is a case-specific variable.
Data set-up for conditional logit model
When we have alternative-specific data, we want the data arranged so that each alternative is on a different row, with some id variable identifying which rows belong to the same case. A binary variable should indicate which of the alternatives is the selected outcome.
For the fishing data, there are four alternatives, so each case will have four rows. We will list the first few cases here.
Above, the variable \(\texttt{id}\) identifies the individual cases. The variable \(\texttt{alternative}\) indicates the alternative corresponding to each row. \(\texttt{chosen}\) indicates which alternative was used; for example, for the first observation, the trip was by charter boat.
\(\texttt{price}\) and \(\texttt{quality}\) are alternative-specific variables and vary across alternatives. The variable \(\texttt{income}\) is case-specific and we can see that it is the same within each case.
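As a rough illustration of this layout, the sketch below builds a small long-format data frame in R. The variable names follow the text, but every value is invented for illustration and is not taken from the survey.

```r
# Hypothetical long-format choice data: two cases, four alternatives each.
# All numbers are invented for illustration; they are not the survey values.
alts <- c("beach", "pier", "private boat", "charter boat")
fishing_long <- data.frame(
  id          = rep(1:2, each = 4),
  alternative = factor(rep(alts, times = 2), levels = alts),
  chosen      = c(0, 0, 0, 1,   0, 1, 0, 0),
  price       = c(150, 150, 160, 180,   15, 15, 10, 35),
  quality     = c(0.07, 0.05, 0.50, 0.60,   0.10, 0.05, 0.25, 1.00),
  income      = rep(c(7.1, 1.3), each = 4)
)
print(fishing_long)

# Each case should have exactly one chosen alternative.
chosen_per_case <- tapply(fishing_long$chosen, fishing_long$id, sum)

# A case-specific variable takes a single value within each case.
income_values_per_case <- tapply(fishing_long$income, fishing_long$id,
                                 function(v) length(unique(v)))
```

Checks like the last two are a cheap way to confirm that the data are arranged correctly before fitting the model.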
Stata: rearranging data with alternative-specific values. You may have data with alternative-specific variables in which the information is all in a single row, with different variables containing the different alternative-specific values for a given measure. The \(\texttt{reshape long}\) command is what you use to re-arrange this data so that each row represents a different alternative.
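For R users, a rough analogue of this rearrangement is the base-R \(\texttt{reshape()}\) function. The sketch below assumes hypothetical wide-format column names such as \(\texttt{price.beach}\), and uses invented values with only two alternatives for brevity.

```r
# Hypothetical wide data: one row per case, one column per alternative
# for each alternative-specific measure (values are invented).
wide <- data.frame(
  id            = 1:2,
  income        = c(7.1, 1.3),
  price.beach   = c(150, 15),
  price.pier    = c(155, 12),
  quality.beach = c(0.07, 0.10),
  quality.pier  = c(0.05, 0.04)
)

# Stack the alternative-specific columns so each row is one alternative.
long <- reshape(
  wide,
  direction = "long",
  varying   = list(c("price.beach", "price.pier"),
                   c("quality.beach", "quality.pier")),
  v.names   = c("price", "quality"),
  timevar   = "alternative",
  times     = c("beach", "pier"),
  idvar     = "id"
)
long <- long[order(long$id, long$alternative), ]
rownames(long) <- NULL
print(long)
```

The case-specific variable \(\texttt{income}\) is simply repeated on every row of its case.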
\(u_{ij}\) is the utility of alternative \(j\) for individual \(i\).
The alternative-specific variables are indicated by \(\mathbf{x}_{ij}^{A}\), and, in the simple formulation, each alternative-specific variable will have a \(\mathbf{\beta }\) that does not vary over the alternatives.
The case-specific variables are indicated by \(\mathbf{x}_{i}^{C}\). As in multinomial logit, there will be a base category; each case-specific variable will have coefficients for each other category that is defined in terms of its contrast with the base category.
Because the model is written in terms of a latent utility, there is an error term for each alternative for each individual. These error terms are assumed to be independent of one another.
The predicted probability of category \(m\) being the chosen category is then calculated as:
As in multinomial logit, we can compute a separate \(\mathbf{x\beta}\) for each alternative for each observation. The denominator of the predicted probability is the sum of \(\exp(\mathbf{x\beta})\) over all alternatives, and the numerator for \(\Pr(y=m)\) is the \(\exp(\mathbf{x\beta})\) for alternative \(m\).
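Written out in the notation above, the model might look as follows. This is a sketch under assumptions: \(\boldsymbol{\gamma}_{j}\) is a symbol introduced here for the coefficients on the case-specific variables for alternative \(j\), \(\mathbf{x}_{i}^{C}\) is taken to include a constant so that \(\boldsymbol{\gamma}_{j}\) contains the intercept, \(\varepsilon_{ij}\) is the error term, \(J\) is the number of alternatives, and \(b\) denotes the base category.

\[
u_{ij} = \mathbf{x}_{ij}^{A}\mathbf{\beta} + \mathbf{x}_{i}^{C}\boldsymbol{\gamma}_{j} + \varepsilon_{ij}, \qquad \boldsymbol{\gamma}_{b} = \mathbf{0}
\]

\[
\Pr(y_{i} = m) = \frac{\exp\left(\mathbf{x}_{im}^{A}\mathbf{\beta} + \mathbf{x}_{i}^{C}\boldsymbol{\gamma}_{m}\right)}{\sum_{j=1}^{J}\exp\left(\mathbf{x}_{ij}^{A}\mathbf{\beta} + \mathbf{x}_{i}^{C}\boldsymbol{\gamma}_{j}\right)}
\]

Anything that is constant across alternatives within a case, such as \(\mathbf{x}_{i}^{C}\boldsymbol{\gamma}\) with a single common \(\boldsymbol{\gamma}\), would cancel from the numerator and denominator, which is why case-specific variables need alternative-specific coefficients.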
Call:
coxph(formula = Surv(rep(1, 4728L), chosen) ~ price + quality +
(alt * income) + strata(id), data = data, method = "exact")
n= 4728, number of events= 1182
coef exp(coef) se(coef) z Pr(>|z|)
price -0.025117 0.975196 0.001732 -14.504 < 2e-16 ***
quality 0.357782 1.430154 0.109773 3.259 0.001117 **
altbeach -0.777959 0.459342 0.220494 -3.528 0.000418 ***
altprivate boat -0.250681 0.778271 0.203940 -1.229 0.219000
altcharter boat 0.916406 2.500289 0.207265 4.421 9.81e-06 ***
income NA NA 0.000000 NA NA
altbeach:income 0.127577 1.136073 0.050640 2.519 0.011758 *
altprivate boat:income 0.217017 1.242365 0.050058 4.335 1.46e-05 ***
altcharter boat:income 0.094285 1.098873 0.050060 1.883 0.059640 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
price 0.9752 1.0254 0.9719 0.9785
quality 1.4302 0.6992 1.1533 1.7735
altbeach 0.4593 2.1770 0.2982 0.7077
altprivate boat 0.7783 1.2849 0.5218 1.1607
altcharter boat 2.5003 0.4000 1.6656 3.7533
income NA NA NA NA
altbeach:income 1.1361 0.8802 1.0287 1.2546
altprivate boat:income 1.2424 0.8049 1.1263 1.3704
altcharter boat:income 1.0989 0.9100 0.9962 1.2122
Concordance= 0.757 (se = 0.011 )
Likelihood ratio test= 846.9 on 8 df, p=<2e-16
Wald test = 322.7 on 8 df, p=<2e-16
Score (logrank) test = 596 on 8 df, p=<2e-16
In the output above, you’ll notice the coefficient for income is indicated as “NA.” However, there are different coefficients for income for each of the three alternatives (vs. the base category), just like in multinomial logit.
In the conditional logit model, coefficients are only estimated for terms that vary within a case (in this case, within the same value of id). Income does not vary within a person. However, when we include the interaction terms, we implicitly refer to a term that does vary within a person: income\(\times\)beach, for example, equals the person’s income for their beach row and is 0 for all other rows.
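A small sketch can make this concrete. It builds the design matrix for \(\texttt{alt * income}\) on invented data (two cases with hypothetical incomes) and counts how many distinct values each column takes within a case. The data frame \(\texttt{d}\) and the helper names are assumptions for illustration only.

```r
# Pier is listed first so that it is the base category.
alts <- c("pier", "beach", "private boat", "charter boat")
d <- data.frame(
  id     = rep(1:2, each = 4),
  alt    = factor(rep(alts, times = 2), levels = alts),
  income = rep(c(7.1, 1.3), each = 4)   # hypothetical incomes
)
X <- model.matrix(~ alt * income, data = d)

# Largest number of distinct values any case has in each column.
# A column with only one distinct value per case has no within-case
# variation, so its coefficient cannot be estimated.
distinct_within_case <- apply(X, 2, function(col) {
  max(tapply(col, d$id, function(v) length(unique(v))))
})
print(distinct_within_case)

# The interaction equals income on the beach row and 0 on the other rows.
print(cbind(d, beach_x_income = X[, "altbeach:income"]))
```

The \(\texttt{income}\) column shows a single value per case, while each interaction column shows two (the person's income and zero).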
We will reformat the output to be easier to read:
Code that reformats the output:
library(modelsummary)

coef_rename <- c(
  "price" = "Price of option",
  "quality" = "Expected catch",
  "altbeach:income" = "Income: Beach",
  "altprivate boat:income" = "Income: Private boat",
  "altcharter boat:income" = "Income: Charter boat",
  "altbeach" = "Intercept: Beach",
  "altprivate boat" = "Intercept: Private boat",
  "altcharter boat" = "Intercept: Charter boat"
)

# Create the formatted table with renamed coefficients
modelsummary(
  model,
  stars = TRUE,
  estimate = "{estimate} ({std.error}){stars}",
  statistic = NULL,
  coef_map = coef_rename,  # Apply the coefficient renaming
  include.LogLike = TRUE
)
Model matrix is rank deficient. Parameters `income` were not estimable.
                           (1)
Price of option            -0.025 (0.002)***
Expected catch             0.358 (0.110)**
Income: Beach              0.128 (0.051)*
Income: Private boat       0.217 (0.050)***
Income: Charter boat       0.094 (0.050)+
Intercept: Beach           -0.778 (0.220)***
Intercept: Private boat    -0.251 (0.204)
Intercept: Charter boat    0.916 (0.207)***
Num.Obs.                   4728
AIC                        2446.3
BIC                        2498.0
RMSE                       0.39
In Stata, fitting a conditional logit model for simple choice data is done in two parts. First, the \(\texttt{cmset}\) command is used to specify two things: (1) which variable identifies each case and (2) which variable identifies each alternative within a case. In our example these variables are conveniently named \(\texttt{id}\) and \(\texttt{alternative}\).
. cmset id alternative
Case ID variable: id
Alternatives variable: alternative
Then, we estimate the model using the \(\texttt{cmclogit}\) command, with “pier” as our base category:
The signs of these results mean that the price of an option is negatively associated with it being chosen, while an option’s quality is positively associated with being chosen. Unsurprisingly, customers are attracted to lower prices and higher quality.
Income is positively associated with choosing any option versus fishing off a pier, but most strongly with fishing by private boat.
The intercepts indicate that, net of price and quality and at an income of zero, fishing by charter boat is the most popular choice, while fishing from the beach is the least popular.
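To see how the coefficients combine into predicted probabilities, the sketch below plugs the estimates from the output into the usual logit formula for one hypothetical trip. The prices, catch rates, and income are invented, and \(\texttt{choice\_prob}\) is a helper name introduced here, not part of any package.

```r
# Coefficients copied from the fitted model output (pier is the base category).
b_price   <- -0.025117
b_quality <-  0.357782
intercept <- c("pier" = 0, "beach" = -0.777959,
               "private boat" = -0.250681, "charter boat" = 0.916406)
b_income  <- c("pier" = 0, "beach" = 0.127577,
               "private boat" = 0.217017, "charter boat" = 0.094285)

# Hypothetical helper: predicted probability of each alternative for one case.
choice_prob <- function(price, quality, income) {
  xb <- b_price * price + b_quality * quality + intercept + b_income * income
  exp(xb) / sum(exp(xb))
}

# Hypothetical trip: these prices, catch rates, and income are invented.
price   <- c("pier" = 50, "beach" = 50, "private boat" = 60, "charter boat" = 90)
quality <- c("pier" = 0.1, "beach" = 0.1, "private boat" = 0.5, "charter boat" = 0.8)
income  <- 4   # thousands of dollars per month

prob <- choice_prob(price, quality, income)
print(round(prob, 3))

# Raising the charter price by $10 lowers the charter probability
# and raises the probability of every other alternative.
price_up <- price
price_up["charter boat"] <- price_up["charter boat"] + 10
prob_up <- choice_prob(price_up, quality, income)
print(round(prob_up - prob, 4))
```

Because the probabilities must sum to one, any loss for the charter boat is redistributed across the remaining alternatives.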
Interpreting conditional logit coefficients
Because conditional logit is a logit model, we can exponentiate its coefficients and interpret the results.
Code that provides output with exponentiated coefficients:
library(modelsummary)

coef_rename <- c(
  "price" = "Price of option",
  "quality" = "Expected catch",
  "altbeach:income" = "Income: Beach",
  "altprivate boat:income" = "Income: Private boat",
  "altcharter boat:income" = "Income: Charter boat",
  "altbeach" = "Intercept: Beach",
  "altprivate boat" = "Intercept: Private boat",
  "altcharter boat" = "Intercept: Charter boat"
)

# Create the formatted table with renamed coefficients
modelsummary(
  model,
  exponentiate = TRUE,
  stars = TRUE,
  estimate = "{estimate} ({std.error}){stars}",
  statistic = NULL,
  coef_map = coef_rename,  # Apply the coefficient renaming
  include.LogLike = TRUE
)
Model matrix is rank deficient. Parameters `income` were not estimable.
                           (1)
Price of option            0.975 (0.002)***
Expected catch             1.430 (0.157)**
Income: Beach              1.136 (0.058)*
Income: Private boat       1.242 (0.062)***
Income: Charter boat       1.099 (0.055)+
Intercept: Beach           0.459 (0.101)***
Intercept: Private boat    0.778 (0.159)
Intercept: Charter boat    2.500 (0.518)***
Num.Obs.                   4728
AIC                        2446.3
BIC                        2498.0
RMSE                       0.39
These can be obtained in Stata using the \(\texttt{or}\) option.
. cmclogit chosen price quality, casevar(income) or base(2)
Iteration 0: log likelihood = -1270.0164
Iteration 1: log likelihood = -1217.7258
Iteration 2: log likelihood = -1215.1499
Iteration 3: log likelihood = -1215.1376
Iteration 4: log likelihood = -1215.1376
Conditional logit choice model Number of obs = 4,728
Case ID variable: id Number of cases = 1182
Alternatives variable: alternative Alts per case: min = 4
avg = 4.0
max = 4
Wald chi2(5) = 252.98
Log likelihood = -1215.1376 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
chosen | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
alternative |
price | .9751962 .0016887 -14.50 0.000 .971892 .9785117
quality | 1.430154 .1569927 3.26 0.001 1.153303 1.773462
-------------+----------------------------------------------------------------
beach |
income | 1.136073 .0575302 2.52 0.012 1.02873 1.254615
_cons | .4593424 .1012822 -3.53 0.000 .2981616 .7076546
-------------+----------------------------------------------------------------
pier | (base alternative)
-------------+----------------------------------------------------------------
private_boat |
income | 1.242365 .0621906 4.34 0.000 1.126263 1.370436
_cons | .7782709 .1587202 -1.23 0.219 .5218397 1.160712
-------------+----------------------------------------------------------------
charter_boat |
income | 1.098873 .0550096 1.88 0.060 .996177 1.212157
_cons | 2.500289 .518222 4.42 0.000 1.665582 3.753309
------------------------------------------------------------------------------
Note: Exponentiated coefficients represent odds ratios for alternative-specific variables (first equation) and
relative-risk ratios for case-specific variables.
Note: _cons estimates baseline relative risk for each outcome.
As the note at the bottom of the Stata output indicates, the results are an unusual hybrid of odds ratios (for the alternative-specific variables) and relative risk ratios (for the case-specific variables).
The coefficients for the alternative-specific variables can be interpreted as:
Net of personal income and the catch rate, each additional dollar of price is associated with a 2.5% decrease in the odds of an option being chosen.
Net of personal income and price, each unit increase in the catch rate is associated with a 43% increase in the odds of an option being chosen.
The coefficients for the case-specific variables can be interpreted as:
Net of price and quality, a one-thousand dollar increase in monthly income increases the odds of using a private boat versus fishing from a pier by 24%.
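The percentages in these interpretations come directly from the exponentiated coefficients. The sketch below reproduces the arithmetic for three of the estimates reported above. The vector names are ours, and the standard-error line uses the delta-method approximation (odds ratio times the coefficient's standard error), which appears to match the standard errors reported in the exponentiated tables.

```r
# Coefficients and standard errors copied from the model output.
coefs <- c(price = -0.025117, quality = 0.357782, income_private = 0.217017)
ses   <- c(price =  0.001732, quality = 0.109773, income_private = 0.050058)

odds_ratio <- exp(coefs)               # exponentiated coefficients
pct_change <- 100 * (odds_ratio - 1)   # percent change in the odds
or_se      <- odds_ratio * ses         # delta-method standard error

print(round(cbind(odds_ratio, pct_change, or_se), 4))
```

A coefficient's z statistic and p-value are unchanged by exponentiation, so the significance stars are the same in both tables.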
Consistent with its terminology for multinomial logit, Stata refers to the exponentiated coefficients for case-specific variables as relative risk ratios. As before, we avoid this term because it invites confusion with the more usual use of “relative risk.”
Interpreting conditional logit results via average marginal change
Presently, I have only implemented this for Stata, and have not figured out how to do it in R.
For the case-specific variables, the average marginal change can be obtained in Stata using \(\texttt{margins}\) with the \(\texttt{dydx()}\) option:
. margins, dydx(income)
Average marginal effects Number of obs = 4,728
Model VCE: OIM
Expression: Pr(alternative|1 selected), predict()
dy/dx wrt: income
-------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. z P>|z| [95% conf. interval]
--------------+----------------------------------------------------------------
income |
_outcome |
beach | .0034878 .003541 0.98 0.325 -.0034524 .010428
pier | -.0144069 .004369 -3.30 0.001 -.0229701 -.0058437
private boat | .0266822 .00515 5.18 0.000 .0165885 .0367759
charter boat | -.0157631 .00559 -2.82 0.005 -.0267193 -.0048069
-------------------------------------------------------------------------------
From these results, we can see that, on average, a marginal increase in income increases the probability of fishing by private boat by .027. This increase is offset by decreases in the probability of fishing by charter boat (.016) and of fishing from a pier (.014), while the probability of fishing from the beach changes little (.003, not statistically significant).
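Because the predicted probabilities for a case always sum to one, the marginal changes across the four alternatives must sum to zero. A quick check using the values from the \(\texttt{margins}\) output (the vector name is ours):

```r
# Average marginal effects of income, copied from the margins output.
ame_income <- c("beach"        =  0.0034878,
                "pier"         = -0.0144069,
                "private boat" =  0.0266822,
                "charter boat" = -0.0157631)

# Gains and losses in probability should cancel out.
total_change <- sum(ame_income)
print(total_change)

# The same effects expressed in percentage points.
print(round(100 * ame_income, 2))
```

Reading the effects as a set in this way makes it clear which alternatives gain probability and which lose it.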
For the alternative-specific variables, the Stata output is more complex to interpret. Here are the average marginal effects for price:
In this output, a row is labeled by a pair of alternatives separated by \(\texttt{\#}\). The alternative before the \(\texttt{\#}\) is the alternative for which the price is being increased by the marginal amount. The alternative after the hash is the alternative whose change in predicted probability is being evaluated.
The results in the highlighted rows at the bottom evaluate the changes in the probability for an increase in the price of using a charter boat.
The predicted probability of using a charter boat decreases by .0053, or a little more than half a percentage point. Most of that decrease is offset by an increase in the probability of using a private boat instead, which rises by an average of .0038. The probabilities of using the beach or the pier increase by smaller amounts.