Survival Analysis

Pearls and pitfalls for time-to-event modelling

Author

Dean Marchiori

Published

February 20, 2025

tl;dr

The use of classification algorithms to model binary response data are so ubiquitous there is a risk that in some settings this type of model is inappropriate. In cases when the response data occur over time and subjects may drop out (due to death, loss to follow up etc), traditional methods like logistic regression will not result in accurate estimations. Time to event models like survival analysis may provide a more appropriate way to address this type of data generating system.

The below are some brief notes and resources on typical workflows for time-to-event or survival modelling.

Types of Survival Models

There are several categories of survival models:

1. Non-Parametric: Most commonly the Kaplan-Meier estimator where no assumptions are imposed on either the shape of the survival function nor on the form of the relationship between predictor variables and survival time.

2. Semi-Parametric: The standard approach here is the Cox Proportional Hazards Model (PH Model). The hazard function gives the instantaneous potential of having an event at a time, given survival up to that time. No distributional assumptions are made about the survival times or baseline hazard function. This effectively results in a model that scales a ‘baseline’ hazard function that is never actually estimated, hence the model being a ‘proportional hazards’ model and using only a partial-maximum likelihood estimation method.

3. Parametric: Fully parametric models exist and rely on the analyst specifying the functional form of the survival curve (exponential, Weibull, log-normal etc.).

4. Others: Other specialised methods exists such as AFT models exist and can be selected if there is significant justification or if the more traditional methods fail.

Workflow Notes

If working with tabular binary outcome data for a specific event, start by defining:
- The observed event time of the event: $T_{i}$
- The censoring time (when the subject was last followed up or known to have dropped out, or the end of the observation period): $C_{i}$
The event time can be calculated as $Y_{i} = m i n (T_{i}, C_{i})$ with a binary event indicator ( $δ_{i}$ ):

$δ_{i} = {\begin{array}{lr} 1, & T_{i} \leq C_{i} \\ 0 & T_{i} > C_{i} \end{array}$

A recommended workflow is to start by using a non-parametric approach such as the Kaplan-Meier estimator as a descriptive tool.
For between group comparisons, a group-wise KM-curve can be estimated.
Formal hypothesis testing between groups can be done using a Log-rank test as the typical standard.
A Cox proportional hazards model is a safe choice to enable the inclusion of multiple covariates of interest. It will not measure the underlying baseline hazard but has good properties for interpreting and testing hazard ratios from the model’s coefficients (essentially an analogy to odds ratios in logistic regression).
The Cox PH model has a number of assumptions:
- Proportional Hazards: Across time the relative hazard remains constant across covariate levels. This hypothesis can be tested using the scaled Schoenfeld residuals and should graphically be assessed to ensure there is no slope. A formal test can also be conducted using the cox.zph function in the {survival} package in R.
- Non-Informative Censoring: While this class of model is designed to handle censored data, the cause of censoring should be unrelated to the response and be entirely non-informative.
- Residual diagnostics can also be performed as the primary way to gauge influential observations, possible transformations and issues with non-proportional hazards. They can also be used to test for linearity and additivity in the covariates. There are several types of residuals available (Martingale, Deviance, Score etc.). See here for guidance.
Time dependent covariates (covariates that change after baseline) can also be incorporated but require specialised data re-shaping. See this vignette for a detailed overview first.
If the diagnostic tests indicate a poor fit there are several remedies (see here), or an alternative model should be tried. If this is from class of parametric survival models, some knowledge and testing of the distribution will be required as an explicit input.
Results are typically presented through interpretation of the regression coefficients. The raw coefficients are exponentiated to form a ‘hazard ratio’ which represents the percentage increase(decrease) over(under) the null value of 1 in the instantaneous hazard.
Subject matter experts are more comfortable interpreting increasing values that are associated with poorer survival experience, rather than protective effects.

Below is a basic flow chart for a standard survival analysis:

Introductory tutorials

Survival Analysis in R
An Introduction to Survival Analysis in R with Emily Zabor

Text books

Harrell, F. E. (2001). Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis (Vol. 608). New York: springer.

Hosmer Jr, D. W., Lemeshow, S., & May, S. (2008). Applied survival analysis: regression modeling of time-to-event data (Vol. 618). John Wiley & Sons.

Software

survival R package

ggsurvfit: Flexible Time-to-Event Figures

Looking for a data science consultant? Feel free to get in touch here