An Introduction to Survival Analysis Using Stata

Mario Cleves, William W. Gould, Roberto G. Gutierrez, and Yulia Marchenko

Table of contents

    List of Tables
    List of Figures

    Preface to the Second Edition
    Preface to the Revised Edition
    Preface to the First Edition
    Notation and Typography

  • 1 The problem of survival analysis
    • 1.1 Parametric modeling
    • 1.2 Semiparametric modeling
    • 1.3 Nonparametric analysis
    • 1.4 Linking the three approaches

  • 2 Describing the distribution of failure times
    • 2.1 The survivor and hazard functions
    • 2.2 The quantile function
    • 2.3 Interpreting the cumulative hazard and hazard rate
      • 2.3.1 Interpreting the cumulative hazard
      • 2.3.2 Interpreting the hazard rate
    • 2.4 Means and medians

  • 3 Hazard models
    • 3.1 Parametric models
    • 3.2 Semiparametric models
    • 3.3 Analysis time (time at risk)

  • 4 Censoring and truncation
    • 4.1 Censoring
      • 4.1.1 Right censoring
      • 4.1.2 Interval censoring
      • 4.1.3 Left censoring
    • 4.2 Truncation
      • 4.2.1 Left truncation (delayed entry)
      • 4.2.2 Interval truncation (gaps)
      • 4.2.3 Right truncation

  • 5 Recording survival data
    • 5.1 The desired format
    • 5.2 Other formats
    • 5.3 Example: Wide-form snapshot data

  • 6 Using stset
    • 6.1 A short lesson on dates
    • 6.2 Purposes of the stset command
    • 6.3 The syntax of the stset command
      • 6.3.1 Specifying analysis time
      • 6.3.2 Variables defined by stset
      • 6.3.3 Specifying what constitutes failure
      • 6.3.4 Specifying when subjects exit from the analysis
      • 6.3.5 Specifying when subjects enter the analysis
      • 6.3.6 Specifying the subject-ID variable
      • 6.3.7 Specifying the begin-of-span variable
      • 6.3.8 Convenience options

  • 7 After stset
    • 7.1 Look at stset’s output
    • 7.2 List some of your data
    • 7.3 Use stdescribe
    • 7.4 Use stvary
    • 7.5 Perhaps use stfill
    • 7.6 Example: Hip fracture data

  • 8 Nonparametric analysis
    • 8.1 Inadequacies of standard univariate methods
    • 8.2 The Kaplan–Meier estimator
      • 8.2.1 Calculation
      • 8.2.2 Censoring
      • 8.2.3 Left truncation (delayed entry)
      • 8.2.4 Interval truncation (gaps)
      • 8.2.5 Relationship to the empirical distribution function
      • 8.2.6 Other uses of sts list
      • 8.2.7 Graphing the Kaplan–Meier estimate
    • 8.3 The Nelson–Aalen estimator
    • 8.4 Estimating the hazard function
    • 8.5 Estimating mean and median survival times
    • 8.6 Tests of hypothesis
      • 8.6.1 The log-rank test
      • 8.6.2 The Wilcoxon test
      • 8.6.3 Other tests
      • 8.6.4 Stratified tests

  • 9 The Cox proportional hazards model
    • 9.1 Using stcox
      • 9.1.1 The Cox model has no intercept
      • 9.1.2 Interpreting coefficients
      • 9.1.3 The effect of units on coefficients
      • 9.1.4 Estimating the baseline cumulative hazard and survivor functions
      • 9.1.5 Estimating the baseline hazard function
      • 9.1.6 The effect of units on the baseline functions
    • 9.2 Likelihood calculations
      • 9.2.1 No tied failures
      • 9.2.2 Tied failures
        • The marginal calculation
        • The partial calculation
        • The Breslow approximation
        • The Efron approximation
      • 9.2.3 Summary
    • 9.3 Stratified analysis
      • 9.3.1 Obtaining coefficient estimates
      • 9.3.2 Obtaining estimates of baseline functions
    • 9.4 Cox models with shared frailty
      • 9.4.1 Parameter estimation
      • 9.4.2 Obtaining estimates of baseline functions
    • 9.5 Cox models with survey data
      • 9.5.1 Declaring survey characteristics
      • 9.5.2 Fitting a Cox model with survey data
      • 9.5.3 Some caveats of analyzing survival data from complex survey designs

  • 10 Model building using stcox
    • 10.1 Indicator variables
    • 10.2 Categorical variables
    • 10.3 Continuous variables
      • 10.3.1 Fractional polynomials
    • 10.4 Interactions
    • 10.5 Time-varying variables
      • 10.5.1 Using stcox, tvc() texp()
      • 10.5.2 Using stsplit
    • 10.6 Modeling group effects: fixed-effects, random-effects, stratification, and clustering

  • 11 The Cox model: Diagnostics
    • 11.1 Testing the proportional-hazards assumption
      • 11.1.1 Tests based on reestimation
      • 11.1.2 Test based on Schoenfeld residuals
      • 11.1.3 Graphical methods
    • 11.2 Residuals
      • Reye’s syndrome data
      • 11.2.1 Determining functional form
      • 11.2.2 Goodness of fit
      • 11.2.3 Outliers and influential points

  • 12 Parametric models
    • 12.1 Motivation
    • 12.2 Classes of parametric models
      • 12.2.1 Parametric proportional hazards models
      • 12.2.2 Accelerated failure-time models
      • 12.2.3 Comparing the two parameterizations

  • 13 A survey of parametric regression models in Stata
    • 13.1 The exponential model
      • 13.1.1 Exponential regression in the PH metric
      • 13.1.2 Exponential regression in the AFT metric
    • 13.2 Weibull regression
      • 13.2.1 Weibull regression in the PH metric
        • Fitting null models
      • 13.2.2 Weibull regression in the AFT metric
    • 13.3 Gompertz regression (PH metric)
    • 13.4 Lognormal regression (AFT metric)
    • 13.5 Loglogistic regression (AFT metric)
    • 13.6 Generalized gamma regression (AFT metric)
    • 13.7 Choosing among parametric models
      • 13.7.1 Nested models
      • 13.7.2 Nonnested models

  • 14 Postestimation commands for parametric models
    • 14.1 Use of predict after streg
      • 14.1.1 Predicting the time of failure
      • 14.1.2 Predicting the hazard and related functions
      • 14.1.3 Calculating residuals
    • 14.2 Using stcurve

  • 15 Generalizing the parametric regression model
    • 15.1 Using the ancillary() option
    • 15.2 Stratified models
    • 15.3 Frailty models
      • 15.3.1 Unshared frailty models
      • 15.3.2 Example: Kidney data
      • 15.3.3 Testing for heterogeneity
      • 15.3.4 Shared frailty models

  • 16 Power and sample-size determination for survival analysis
    • 16.1 Estimating sample size
      • 16.1.1 Multiple-myeloma data
      • 16.1.2 Comparing two survivor functions nonparametrically
      • 16.1.3 Comparing two exponential survivor functions
      • 16.1.4 Cox regression models
    • 16.2 Accounting for withdrawal and accrual of subjects
      • 16.2.1 The effect of withdrawal or loss to follow-up
      • 16.2.2 The effect of accrual
      • 16.2.3 Examples
    • 16.3 Estimating power and effect size
    • 16.4 Tabulating or graphing results

    References
    Author index
    Subject index