70. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V.

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)
07.-11.09.2025
Jena

Meeting Abstract

The Use of Double Poisson Regression for count data in Health and Life Science – a Narrative Review

Sebastian Appelbaum - Department for Psychology and Psychotherapy, Witten/Herdecke University, Witten, Germany
Julia Stronski - Department for Psychology and Psychotherapy, Witten/Herdecke University, Witten, Germany
Uwe Konerding - Otto-Friedrich-Universität Bamberg, Bamberg, Germany
Thomas Ostermann - Department for Psychology and Psychotherapy, Witten/Herdecke University, Witten, Germany

Text

Introduction: Count data such as days in hospital or sick days are common in many areas of everyday life [1]. Unfortunately, such data often have a right-skewed distribution, i.e. lower count values are more common than higher values. Moreover, they are often characterized by over- or under-dispersion, i.e. a variance larger or smaller than the mean, so that established methods of inferential statistics are only applicable to a limited extent. In 1986, Efron introduced the double Poisson distribution to account for this problem [2]. The aim of this work is to examine, by means of a narrative review, how this distribution has been applied in regression analyses in the health-related literature.
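As an illustration of how the double Poisson distribution captures both kinds of dispersion, the following minimal sketch evaluates Efron's density [2], which is proportional to θ^(1/2) · exp(−θμ) · (exp(−y) · y^y / y!) · (eμ/y)^(θy) for y = 0, 1, 2, ... It is not taken from any of the reviewed studies. The function names and the truncation point `y_max` are assumptions made for this example, and the normalizing constant is obtained by numerical summation instead of Efron's closed-form approximation. With θ = 1 the density reduces to the ordinary Poisson distribution; θ < 1 gives over-dispersion and θ > 1 under-dispersion, with a variance of roughly μ/θ.

```python
import math


def double_poisson_log_kernel(y, mu, theta):
    """Log of Efron's unnormalized double Poisson density at count y."""
    # y * log(y) is taken as 0 at y = 0 (limit convention).
    ylogy = y * math.log(y) if y > 0 else 0.0
    return (
        0.5 * math.log(theta) - theta * mu       # theta^(1/2) * exp(-theta * mu)
        - y + ylogy - math.lgamma(y + 1)         # exp(-y) * y^y / y!
        + theta * y * (1.0 + math.log(mu))       # (e * mu / y)^(theta * y), first part
        - theta * ylogy                          # (e * mu / y)^(theta * y), second part
    )


def double_poisson_pmf(mu, theta, y_max=200):
    """Probabilities for y = 0..y_max, normalized by numerical summation.

    y_max is a hypothetical truncation point; it must be large enough
    that the mass beyond it is negligible for the chosen mu and theta.
    """
    kernel = [math.exp(double_poisson_log_kernel(y, mu, theta))
              for y in range(y_max + 1)]
    total = sum(kernel)
    return [k / total for k in kernel]


def mean_and_variance(pmf):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var


if __name__ == "__main__":
    # theta < 1: over-dispersed, theta = 1: Poisson, theta > 1: under-dispersed
    for theta in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(double_poisson_pmf(mu=5.0, theta=theta))
        print(f"theta={theta}: mean={mean:.3f}, variance={var:.3f}")
```

Running the script prints the mean and variance for three values of θ, which should lie close to μ and μ/θ respectively. Software used in practice may parameterize the dispersion differently (for example by the inverse of θ), so the parameter definition should be checked before comparing results.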

Methods: Science Direct, PBSC, PubMed, PsycInfo, PsycArticles, CINAHL and Google Scholar were searched for applications. Purely mathematical treatments were excluded, as were papers that only illustrated the application of Double Poisson regression with use cases or examples. Applications not based on Efron’s approach were also excluded. Two independent reviewers extracted data on authors, year, area of research, type of study, type of count variable, outcome presentation, quality criteria applied, type of dispersion, implementation, and software used.

Results: From a total of 1644 hits, 84 articles were pre-selected, and 13 articles remained after full-text screening. All of these articles were published after 2011. A clinical application, i.e. use of Double Poisson regression to analyse a clinical trial, was found only once, while most of the studies were ecological studies. More than half of the studies originated from Brazil (n=5) or Iran (n=2), and most of them targeted epidemiological research with a focus on infectious diseases. Both over- and under-dispersion were present, and most of the papers used the generalized additive models for location, scale, and shape (GAMLSS) framework [3]. Quality criteria included both the AIC and the BIC.

Conclusion: This narrative review shows that the first steps in applying Efron’s idea of Double Poisson regression to empirical count data have already been taken successfully in a variety of fields in the health and life sciences. Although suggested by Hohberg et al. [4], its application in the evaluation of RCTs is far from being established. This is perhaps also the reason why the presentation of results and the application of quality indicators such as AIC or BIC are still clearly heterogeneous. Approaches that ease the application of Double Poisson regression in clinical research, e.g. by establishing guidelines, should be encouraged.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

[1] Neelon B, O'Malley AJ, Smith VA. Modeling zero-modified count and semicontinuous data in health services research Part 1: background and overview. Statistics in Medicine. 2016;35(27):5070-5093.
[2] Efron B. Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association. 1986;81(395):709-721.
[3] Stasinopoulos DM, Rigby RA. Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software. 2008;23:1-46.
[4] Hohberg M, Pütz P, Kneib T. Treatment effects beyond the mean using distributional regression: Methods and guidance. PLoS ONE. 2020;15(2):e0226514.