### Firth correction for logistic, Poisson and Cox regression

Monotone likelihood, or separation, arises in fitting a regression model when the likelihood converges while at least one parameter estimate diverges to plus or minus infinity. Separation occurs primarily in small samples with rare events (or, in survival analysis, substantial censoring of survival times) and several highly predictive covariates. Statistical software packages that rely on the maximum likelihood method cannot deal with this problem appropriately. As a solution, Heinze, Schemper and colleagues proposed applying the bias correction method of Firth (1993) (see references below). Unlike the maximum likelihood method, the Firth correction always leads to finite parameter estimates. In extensive simulation studies, Firth’s correction outperformed maximum likelihood.
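The contrast between the two estimators can be illustrated with a minimal pure-Python sketch. It is not the implementation used in the SAS macros or R packages linked on this page; the function name `fit_logistic`, the toy data and the step cap are assumptions made for illustration. The sketch fits a logistic model with one covariate, either by ordinary maximum likelihood or with the Firth-type modified score, in which each residual y_i - pi_i is augmented by h_i (1/2 - pi_i), where h_i is the i-th diagonal element of the hat matrix.

```python
import math

def fit_logistic(x, y, firth=True, max_iter=100, tol=1e-10, max_step=5.0):
    """Fit logit P(y=1) = b0 + b1*x by Newton-type iterations.

    firth=False solves the ordinary maximum likelihood score equations;
    firth=True adds Firth's term h_i * (1/2 - pi_i) to each residual.
    Returns (b0, b1) after convergence or after max_iter iterations.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        pi = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [p * (1.0 - p) for p in pi]
        # Fisher information I = X'WX for the design columns (1, x)
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse of I
        # Hat matrix diagonal: h_i = w_i * (1, x_i) I^{-1} (1, x_i)'
        h = [wi * (v00 + 2.0 * v01 * xi + v11 * xi * xi) for wi, xi in zip(w, x)]
        # Residuals of the (modified) score equations
        r = [yi - p + (hi * (0.5 - p) if firth else 0.0)
             for yi, p, hi in zip(y, pi, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Update direction I^{-1} U, capped at max_step (an assumed safeguard)
        d0 = v00 * u0 + v01 * u1
        d1 = v01 * u0 + v11 * u1
        biggest = max(abs(d0), abs(d1))
        if biggest > max_step:
            d0, d1 = d0 * max_step / biggest, d1 * max_step / biggest
        b0, b1 = b0 + d0, b1 + d1
        if biggest < tol:
            break
    return b0, b1

# Hypothetical, completely separated data: y = 1 exactly when x > 0.
x = [-3, -2, -1, 1, 2, 3]
y = [0, 0, 0, 1, 1, 1]

ml_10 = fit_logistic(x, y, firth=False, max_iter=10)
ml_20 = fit_logistic(x, y, firth=False, max_iter=20)
firth_fit = fit_logistic(x, y, firth=True)

print("ML slope after 10 and 20 iterations:", ml_10[1], ml_20[1])
print("Firth slope:", firth_fit[1])
```

On these toy data the maximum likelihood slope keeps growing with every additional iteration, whereas the Firth-corrected iterations settle at a finite value. For real analyses the packages linked on this page should be preferred.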

Heinze and Schemper (2001, 2002) proposed combining point estimation with confidence interval estimation based on the profile penalized likelihood. These intervals can reflect the asymmetric shape of the profile penalized likelihood that results when the standard profile likelihood does not have a unique maximum.
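In symbols, the construction can be summarized as follows. This is a sketch in notation chosen here, not taken from the page: the log-likelihood is written as ℓ(β), the Fisher information matrix as I(β), the penalized log-likelihood as ℓ*(β), and γ denotes a candidate value for the coefficient β_r of interest.

```latex
\begin{align}
  \ell^*(\beta) &= \ell(\beta) + \tfrac{1}{2} \log \det I(\beta)
    && \text{(penalized log-likelihood)} \\
  \ell^*_{\mathrm{prof}}(\gamma) &= \max_{\beta \,:\, \beta_r = \gamma} \ell^*(\beta)
    && \text{(profile for } \beta_r\text{)} \\
  \mathrm{CI}_{1-\alpha}(\beta_r) &= \left\{ \gamma :
    2\left[ \ell^*(\hat\beta) - \ell^*_{\mathrm{prof}}(\gamma) \right]
    \le \chi^2_{1,\,1-\alpha} \right\}
    && \text{(confidence interval)}
\end{align}
```

Because the interval limits are found by following the profile rather than by adding and subtracting a multiple of a standard error, they need not be symmetric around the point estimate.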

Heinze and Puhr (2010) extended the method to conditional logistic regression.

Heinze, Ploner and Beyea (2013) proposed a new method to compute confidence intervals after multiple imputation based on combining penalized likelihood profiles.

Puhr, Heinze, Nold, Lusa and Geroldinger (2017) proposed two new modifications of Firth’s correction for logistic regression, FLIC and FLAC. While the standard Firth correction leads to shrinkage in all parameters, including the intercept, and hence produces predictions which are biased towards 0.5, FLIC and FLAC are able to exclude the intercept from shrinkage while maintaining the desirable properties of the Firth correction.
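The intercept correction behind FLIC can be sketched for the simplest possible case, a single binary covariate. The data and the function names `firth_binary` and `flic_intercept` are hypothetical, and the shortcut used for the Firth fit (adding 0.5 to each cell of the 2x2 table) is valid only for this saturated special case. The sketch keeps the Firth slope and re-estimates the intercept by maximum likelihood with the slope term held fixed as an offset, so that the average predicted probability matches the observed event rate.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def firth_binary(x, y):
    """Firth estimates (b0, b1) for one binary covariate.

    Special case only: in the saturated model the Firth fit is
    equivalent to adding 0.5 to each cell of the 2x2 table.
    """
    n0, n1 = x.count(0), x.count(1)
    e0 = sum(yi for xi, yi in zip(x, y) if xi == 0)
    e1 = sum(yi for xi, yi in zip(x, y) if xi == 1)
    p0 = (e0 + 0.5) / (n0 + 1.0)
    p1 = (e1 + 0.5) / (n1 + 1.0)
    b0 = logit(p0)
    return b0, logit(p1) - b0

def flic_intercept(x, y, b1, start, max_iter=100, tol=1e-12):
    """Re-estimate the intercept by ML with b1*x held fixed as an offset."""
    a = start
    for _ in range(max_iter):
        pi = [expit(a + b1 * xi) for xi in x]
        score = sum(yi - p for yi, p in zip(y, pi))
        info = sum(p * (1.0 - p) for p in pi)
        step = score / info  # one-dimensional Newton step
        a += step
        if abs(step) < tol:
            break
    return a

# Hypothetical rare-event data: 10 subjects with x = 0 (no events)
# and 10 subjects with x = 1 (2 events).
x = [0] * 10 + [1] * 10
y = [0] * 10 + [0] * 8 + [1] * 2

b0, b1 = firth_binary(x, y)
a_flic = flic_intercept(x, y, b1, start=b0)

observed_rate = sum(y) / len(y)
mean_firth = sum(expit(b0 + b1 * xi) for xi in x) / len(x)
mean_flic = sum(expit(a_flic + b1 * xi) for xi in x) / len(x)

print("observed event rate:      ", observed_rate)
print("mean prediction, Firth:   ", mean_firth)
print("mean prediction, FLIC:    ", mean_flic)
```

In this example the plain Firth fit predicts on average more events than were observed, which is the bias towards 0.5 described above, while the corrected intercept restores the observed event rate and leaves the slope unchanged. FLAC is based on a different construction and is not shown here.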

Firth’s correction was compared to other methods such as Bayesian Data Augmentation and an increasing sample size strategy by Heinze (2006), Mansournia, Geroldinger, Greenland and Heinze (2018), Sinkovec, Geroldinger and Heinze (2019) and Sinkovec, Heinze, Blagus and Geroldinger (2021).

Firth's correction for Poisson regression, including its modifications FLIC and FLAC, was described, empirically evaluated and compared to Bayesian Data Augmentation and Exact Poisson Regression by Joshi, Geroldinger, Jiricka, Senchaudhuri, Corcoran and Heinze (2021).

Here we link to SAS and R software that can be used to apply the Firth correction to logistic, conditional logistic, Cox and Poisson regression.

SAS software repository on GitHub with macros for all mentioned types of regression:

https://github.com/georgheinze/flicflac

R repository on GitHub for logistic regression (including FLIC and FLAC):

https://github.com/georgheinze/logistf

R repository on GitHub for Cox regression:

https://github.com/georgheinze/coxphf

The R packages are also available on CRAN and can be installed from within R, for example with `install.packages("logistf")` or `install.packages("coxphf")`.

References:

**Firth, D.** (1993): "Bias reduction of maximum likelihood estimates", *Biometrika* 80(1), 27–38 (doi:10.1093/biomet/80.1.27)

**Heinze, G.** (1999): "The application of Firth's procedure to Cox and logistic regression", *Technical Report 10/1999*, Section for Clinical Biometrics, CeMSIIS, Medical University of Vienna

**Heinze, G., Schemper, M.** (2001): "A Solution to the Problem of Monotone Likelihood in Cox Regression", *Biometrics* 57(1), 114–119 (doi:10.1111/j.0006-341x.2001.00114.x)

**Heinze, G., Schemper, M.** (2002): "A Solution to the Problem of Separation in logistic regression", *Statistics in Medicine* 21, 2409–2419 (doi:10.1002/sim.1047)

**Heinze, G., Ploner, M.** (2002): "SAS and SPLUS programs to perform Cox regression without convergence problems", *Computer Methods and Programs in Biomedicine* 67, 217–223 (doi:10.1016/S0169-2607(01)00149-3)

**Heinze, G., Ploner, M.** (2003): "Fixing the nonconvergence bug in logistic regression with SPLUS and SAS", *Computer Methods and Programs in Biomedicine* 71, 181–187 (doi:10.1016/s0169-2607(02)00088-3)

**Heinze, G.** (2006): "A comparative investigation of methods for logistic regression with separated or nearly separated data", *Statistics in Medicine* 25, 4216–4226 (doi:10.1002/sim.2687)

**Heinze, G., Dunkler, D.** (2008): "Avoiding infinite estimates of time-dependent effects in small-sample survival studies", *Statistics in Medicine* 27, 6455–6469 (doi:10.1002/sim.3418)

**Heinze, G., Puhr, R.** (2010): "Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets", *Statistics in Medicine* 29, 770–777 (doi:10.1002/sim.3794)

**Heinze, G., Ploner, M., Beyea, J.** (2013): "Confidence intervals after multiple imputation: combining profile likelihood information from logistic regressions", *Statistics in Medicine* 32, 5062–5076 (doi:10.1002/sim.5899)

**Puhr, R., Heinze, G., Nold, M., Lusa, L., Geroldinger, A.** (2017): "Firth's logistic regression with rare events: accurate effect estimates and predictions?", *Statistics in Medicine* 36(14), 2302–2317 (doi:10.1002/sim.7273)

**Mansournia, M.A., Geroldinger, A., Greenland, S., Heinze, G.** (2018): "Separation in Logistic Regression: Causes, Consequences, and Control", *American Journal of Epidemiology* 187(4), 864–870 (doi:10.1093/aje/kwx299)

**Sinkovec, H., Geroldinger, A., Heinze, G.** (2019): "Bring More Data!-A Good Advice? Removing Separation in Logistic Regression by Increasing Sample Size", *International Journal of Environmental Research and Public Health* 16(23), 4658 (doi:10.3390/ijerph16234658)

**Joshi, A., Geroldinger, A., Jiricka, L., Senchaudhuri, P., Corcoran, C., Heinze, G.** (2021): "Solutions to problems of nonexistence of parameter estimates and sparse data bias in Poisson regression", *Statistical Methods in Medical Research*, early view (doi:10.1177/09622802211065405)

**Sinkovec, H., Heinze, G., Blagus, R., Geroldinger, A.** (2021): "To tune or not to tune, a case study of ridge logistic regression in small or sparse datasets", *BMC Medical Research Methodology* 21(1), 199 (doi:10.1186/s12874-021-01374-y)