Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression

Jacob C. Douma*, James T. Weedon

*Corresponding author for this work

    Research output: Contribution to Journal › Review article › Academic › peer-review

    Abstract

    Proportional data, in which response variables are expressed as percentages or fractions of a whole, are analysed in many subfields of ecology and evolution. The scale-independence of proportions makes them appropriate for analysing many biological phenomena, but statistical analyses are not straightforward, since proportions can only take values from zero to one and their variance is usually not constant across the range of the predictor. Transformations to overcome these problems are often applied, but can lead to biased estimates and difficulties in interpretation. In this paper, we provide an overview of the different types of proportional data and discuss the different analysis strategies available. In particular, we review and discuss the use of promising, but little-used, techniques for analysing continuous (also called non-count-based or non-binomial) proportions (e.g. percent cover, fraction of time spent on an activity): beta and Dirichlet regression, and some of their most important extensions. A major distinction can be made between proportions arising from counts and those arising from continuous measurements. For proportions consisting of two categories, count-based data are best analysed using well-developed techniques such as logistic regression, while continuous proportions can be analysed with beta regression models. In the case of more than two categories, multinomial logistic regression (for count-based data) or Dirichlet regression (for continuous proportions) can be applied. Both beta and Dirichlet regression techniques model proportions on their original scale, which makes statistical inference more straightforward and produces less biased estimates than transformation-based solutions. Extensions to beta regression, such as models for variable dispersion, zero-one augmented data, and mixed-effects designs, have been developed and are reviewed and applied to case studies.
    Finally, we briefly discuss some issues regarding model fitting, inference, and reporting that are particularly relevant to beta and Dirichlet regression. Beta regression and Dirichlet regression overcome some problems inherent in applying classic statistical approaches to proportional data. To facilitate the adoption of these techniques by practitioners in ecology and evolution, we present detailed, annotated demonstration scripts covering all variations of beta and Dirichlet regression discussed in the article, implemented in R, the freely available language for statistical computing.
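
As a rough illustration of why beta regression suits continuous proportions, the sketch below uses the common mean-precision parameterisation of the beta distribution with a logit link. This parameterisation, the function names, and the single-predictor model are assumptions made for illustration; they are not taken from the article's own R scripts. The sketch shows that the variance depends on the mean, and how a log-likelihood for observations strictly between zero and one could be written.

```python
import math

def beta_shapes(mu, phi):
    """Map mean mu (0 < mu < 1) and precision phi (> 0) to shape parameters (a, b)."""
    return mu * phi, (1.0 - mu) * phi

def beta_variance(mu, phi):
    """Variance of a beta variable under the mean-precision parameterisation."""
    return mu * (1.0 - mu) / (1.0 + phi)

def inv_logit(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_loglik(y, x, b0, b1, phi):
    """Log-likelihood of a one-predictor beta regression (illustrative only).

    y: observed proportions, each strictly between 0 and 1
    x: predictor values
    b0, b1: intercept and slope on the logit scale
    phi: precision, assumed constant across observations
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = inv_logit(b0 + b1 * xi)
        a, b = beta_shapes(mu, phi)
        total += (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + (a - 1.0) * math.log(yi) + (b - 1.0) * math.log(1.0 - yi))
    return total

if __name__ == "__main__":
    # The variance shrinks as the mean approaches 0 or 1, for fixed precision.
    for mu in (0.1, 0.5, 0.9):
        print(mu, beta_variance(mu, phi=9.0))
```

In this sketch the log-likelihood is undefined for observations of exactly zero or one, which is the situation the zero-one augmented extensions mentioned in the abstract are meant to handle.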

    Original language: English
    Pages (from-to): 1412-1430
    Number of pages: 19
    Journal: Methods in Ecology and Evolution
    Volume: 10
    Issue number: 9
    Publication status: Published - Sept 2019

    Funding

    J.C.D. and J.T.W. were supported by NWO project nos. 863.14.018 and 016.171.089, respectively. J.T.W. was also partially supported by a postdoctoral fellowship from the Flemish Science Foundation (FWO). We thank Hendrik Poorter for providing the data on the biomass fractions; F. Bongers and R. Bevans for constructive criticism on a draft of this article; M. Joseph, the associate editor, and two anonymous reviewers for valuable feedback on an earlier submitted version; and Zishen Wang for providing a Chinese abstract.

    Funders                                                    Funder number
    Flemish Science Foundation
    Fonds Wetenschappelijk Onderzoek
    Nederlandse Organisatie voor Wetenschappelijk Onderzoek    016.171.089, 863.14.018

      Keywords

      • beta regression
      • Dirichlet regression
      • fractions
      • non-count proportions
      • one augmented
      • proportions
      • transformations
      • zero augmented
