The analysis and forecasting of tennis matches by using a high dimensional dynamic model

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

We propose a high dimensional dynamic model for tennis match results with time varying player-specific abilities for different court surface types. Our statistical model can be treated in a likelihood-based analysis and can handle high dimensional data sets while the number of parameters remains small. In particular, we analyse 17 years of tennis matches for a panel of over 500 players, which leads to more than 2000 dynamic strength levels. We find that time varying player-specific abilities for different court surfaces are of key importance for analysing tennis matches. We further consider several other extensions including player-specific explanatory variables and the match configurations for Grand Slam tournaments. The estimation results can be used to construct rankings of players for different court surface types. We finally show that our proposed model produces accurate forecasts. We provide evidence that our model significantly outperforms existing models in the forecasting of tennis match results.
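The abstract's core idea, surface-specific player strengths that are updated as match results arrive, can be illustrated with a minimal sketch. It is not the authors' specification: it assumes a plain Bradley–Terry (logistic) win probability and a simplified score-driven update in which each strength moves by a step size times the score of the match log-likelihood. The names `win_probability`, `update_strengths`, the step size `alpha`, and the players "A" and "B" are all hypothetical.

```python
import math
from collections import defaultdict


def win_probability(strength_a, strength_b):
    """Bradley-Terry (logistic) probability that player A beats player B."""
    return 1.0 / (1.0 + math.exp(-(strength_a - strength_b)))


def update_strengths(strengths, winner, loser, surface, alpha=0.1):
    """Update surface-specific strengths after one match.

    Returns the pre-match forecast probability that `winner` would win.
    `alpha` is an assumed step size, not a value taken from the paper.
    """
    key_w, key_l = (winner, surface), (loser, surface)
    p = win_probability(strengths[key_w], strengths[key_l])
    # Score of the log-likelihood log(p) with respect to the winner's strength.
    score = 1.0 - p
    strengths[key_w] += alpha * score
    strengths[key_l] -= alpha * score
    return p


# Strengths are keyed by (player, surface) and start at zero.
strengths = defaultdict(float)

# Hypothetical results as (winner, loser, surface) tuples.
matches = [("A", "B", "clay"), ("A", "B", "clay"), ("B", "A", "grass")]
forecasts = [update_strengths(strengths, w, l, s) for w, l, s in matches]
```

Because strengths are stored per surface, the two clay results leave the grass forecast untouched: the third match is still forecast as an even contest. Sorting players by their strength on one surface gives a surface-specific ranking of the kind the abstract mentions.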

Original language: English
Pages (from-to): 1393-1409
Journal: Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume: 182
Issue number: 4
DOIs: 10.1111/rssa.12464
Publication status: Published - 2019

Fingerprint

Forecasting
Dynamic Model
High-dimensional
Time-varying
Tournament
High-dimensional Data
Small Parameter
Statistical Model
Forecast
Likelihood
Ranking
Model
ability
Configuration
Tennis
evidence
time

Keywords

  • Association of Tennis Professionals
  • Bradley–Terry model
  • Logistic regression
  • Maximum likelihood
  • Out-of-sample analysis
  • Player rankings
  • Score-driven model
  • Time varying parameter

Cite this

@article{7d33758a862e40d1adb07d2b2b065902,
title = "The analysis and forecasting of tennis matches by using a high dimensional dynamic model",
abstract = "We propose a high dimensional dynamic model for tennis match results with time varying player-specific abilities for different court surface types. Our statistical model can be treated in a likelihood-based analysis and can handle high dimensional data sets while the number of parameters remains small. In particular, we analyse 17 years of tennis matches for a panel of over 500 players, which leads to more than 2000 dynamic strength levels. We find that time varying player-specific abilities for different court surfaces are of key importance for analysing tennis matches. We further consider several other extensions including player-specific explanatory variables and the match configurations for Grand Slam tournaments. The estimation results can be used to construct rankings of players for different court surface types. We finally show that our proposed model produces accurate forecasts. We provide evidence that our model significantly outperforms existing models in the forecasting of tennis match results.",
keywords = "Association of Tennis Professionals, Bradley–Terry model, Logistic regression, Maximum likelihood, Out-of-sample analysis, Player rankings, Score-driven model, Time varying parameter",
author = "Gorgi, P. and Koopman, S. J. and Lit, R.",
year = "2019",
doi = "10.1111/rssa.12464",
language = "English",
volume = "182",
pages = "1393--1409",
journal = "Journal of the Royal Statistical Society. Series A. Statistics in Society",
issn = "0964-1998",
publisher = "Wiley-Blackwell",
number = "4",
}

The analysis and forecasting of tennis matches by using a high dimensional dynamic model. / Gorgi, P.; Koopman, S. J.; Lit, R.

In: Journal of the Royal Statistical Society. Series A: Statistics in Society, Vol. 182, No. 4, 2019, p. 1393-1409.


TY  - JOUR
T1  - The analysis and forecasting of tennis matches by using a high dimensional dynamic model
AU  - Gorgi, P.
AU  - Koopman, S. J.
AU  - Lit, R.
PY  - 2019
Y1  - 2019
AB - We propose a high dimensional dynamic model for tennis match results with time varying player-specific abilities for different court surface types. Our statistical model can be treated in a likelihood-based analysis and can handle high dimensional data sets while the number of parameters remains small. In particular, we analyse 17 years of tennis matches for a panel of over 500 players, which leads to more than 2000 dynamic strength levels. We find that time varying player-specific abilities for different court surfaces are of key importance for analysing tennis matches. We further consider several other extensions including player-specific explanatory variables and the match configurations for Grand Slam tournaments. The estimation results can be used to construct rankings of players for different court surface types. We finally show that our proposed model produces accurate forecasts. We provide evidence that our model significantly outperforms existing models in the forecasting of tennis match results.

KW  - Association of Tennis Professionals
KW  - Bradley–Terry model
KW  - Logistic regression
KW  - Maximum likelihood
KW  - Out-of-sample analysis
KW  - Player rankings
KW  - Score-driven model
KW  - Time varying parameter
UR  - http://www.scopus.com/inward/record.url?scp=85065201409&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85065201409&partnerID=8YFLogxK
U2  - 10.1111/rssa.12464
DO  - 10.1111/rssa.12464
M3  - Article
VL  - 182
SP  - 1393
EP  - 1409
JO  - Journal of the Royal Statistical Society. Series A. Statistics in Society
JF  - Journal of the Royal Statistical Society. Series A. Statistics in Society
SN  - 0964-1998
IS  - 4
ER  -