Validated inference of smoking habits from blood with a finite DNA methylation marker set

Silvana C E Maas, Dorret I Boomsma, Eco J C de Geus, Gonneke Willemsen, Jenny van Dongen, Manfred Kayser, BIOS Consortium

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Inferring a person's smoking habit and history from blood is relevant for complementing or replacing self-reports in epidemiological and public health research, and for forensic applications. However, a finite DNA methylation marker set and a validated statistical model based on a large dataset are not yet available. Employing 14 epigenome-wide association studies for marker discovery, and using data from six population-based cohorts (N = 3764) for model building, we identified 13 CpGs most suitable for inferring smoking versus non-smoking status from blood with a cumulative Area Under the Curve (AUC) of 0.901. Internal fivefold cross-validation yielded an average AUC of 0.897 ± 0.137, while external model validation in an independent population-based cohort (N = 1608) achieved an AUC of 0.911. These 13 CpGs also provided accurate inference of current (average cross-validation AUC 0.925 ± 0.021, external validation AUC 0.914), former (0.766 ± 0.023, 0.699) and never smoking (0.830 ± 0.019, 0.781) status, allowed inferring pack-years in current smokers (10 pack-years 0.800 ± 0.068, 0.796; 15 pack-years 0.767 ± 0.102, 0.752) and inferring smoking cessation time in former smokers (5 years 0.774 ± 0.024, 0.760; 10 years 0.766 ± 0.033, 0.764; 15 years 0.767 ± 0.020, 0.754). Model application to children revealed highly accurate inference of the true non-smoking status (6 years of age: accuracy 0.994, N = 355; 10 years: 0.994, N = 309), suggesting that prenatal and passive smoking exposure has no impact on model applications in adults. The finite set of DNA methylation markers allows accurate inference of smoking habit, with accuracy comparable to plasma cotinine use, and of smoking history from blood, which we envision becoming useful in epidemiology and public health research, and in medical and forensic applications.
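
To make the reported metrics concrete, the following is a minimal sketch of how an AUC can be computed from predicted scores and how per-fold AUCs can be summarized as "mean ± spread". It is not the authors' code: the function names `auc` and `summarize_folds`, the toy scores, and the per-fold values are all hypothetical, and the sketch assumes that the "±" denotes a sample standard deviation across the five folds, which the abstract does not state.

```python
from statistics import mean, stdev


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Mean and sample standard deviation of per-fold AUCs (assumed summary)."""
    return mean(fold_aucs), stdev(fold_aucs)


# Hypothetical toy data: model scores and true status (1 = smoker, 0 = non-smoker).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(scores, labels)

# Hypothetical AUCs from five cross-validation folds (not values from the study).
fold_aucs = [0.88, 0.91, 0.90, 0.89, 0.92]
avg, sd = summarize_folds(fold_aucs)
print(f"toy AUC = {toy_auc:.3f}; cross-validated AUC = {avg:.3f} ± {sd:.3f}")
```

The pairwise definition is quadratic in sample size and is used here only because it is easy to read; for cohorts of the size described above, a rank-based implementation would be the more practical choice.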

Original language: English
Pages (from-to): 1055-1074
Number of pages: 20
Journal: European Journal of Epidemiology
Volume: 34
Issue number: 11
Publication status: Published - Nov 2019

Funding

The authors are grateful to the participants of the cohorts used: LifeLines (http://lifelines.nl/lifelines-research/general), the Leiden Longevity Study (http://www.leidenlangleven.nl), the Netherlands Twin Registry (http://www.tweelingenregister.org), the Rotterdam studies (http://www.erasmus-epidemiology.nl/research/ergo.htm), the CODAM study (http://www.carimmaastricht.nl/), the PAN study (http://www.alsonderzoek.nl/), the KORA study (https://www.helmholtz-muenchen.de/en/kora/index.html), SHIP-Trend (http://www.medizin.uni-greifswald.de/cm/fv/ship.html), and Generation R (https://www.generationr.nl/). We also thank Dr. Hannah R Elliott for kindly sharing the R script, and Michael Verbiest, Mila Jhamai, Sarah Higgins, Marijn Verkerk and Dr. Lisette Stolk for their help in creating the EWAS database for the RS and the Generation R Study.

H.J. Grabe has received funding from Fresenius Medical Care and speaker’s honoraria as well as travel funds from Fresenius Medical Care, Neuraxpharm and Janssen-Cilag. Other than that, the authors declared no conflict of interest.

This work was performed within the framework of the Biobank-Based Integrative Omics Studies (BIOS) Consortium funded by BBMRI-NL, a research infrastructure financed by the Netherlands Organization for Scientific Research (NWO 184.021.007). This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant agreements No. 633595 (DynaHEALTH) and 733206 (LIFECYCLE). SCEM was supported by a Netherlands Institute for Health Sciences scholarship. AV and MK were supported by the Erasmus MC University Medical Center Rotterdam. AV was additionally supported with an EUR fellowship by Erasmus University Rotterdam. LD received funding from the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 696295; 2017) co-funded by ERA-Net on Biomarkers for Nutrition and Health (ERA HDHL) and ZonMW The Netherlands (No. 529051014; 2017) (ALPHABET project). VWVJ received funding from the Netherlands Organization for Health Research and Development (VIDI 016.136.361) and a Consolidator Grant from the European Research Council (ERC-2014-CoG-648916). MW has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under Grant agreements No. 603288 (SysVasc) and No. 602736 (PAIN-OMICS).

The establishment of the RS EWAS data was funded by the Genetic Laboratory of the Department of Internal Medicine, Erasmus MC, and by the Netherlands Organization for Scientific Research (NWO; Project Number 184021007). The Rotterdam Study is funded by Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. The general design of the Generation R Study is made possible by financial support from the Erasmus MC, the Erasmus University Rotterdam, the Netherlands Organization for Health Research and Development, and the Ministry of Health, Welfare and Sport. The generation and management of the Illumina 450K methylation array data were funded by a grant to VWJ from the Netherlands Genomics Initiative (NGI)/Netherlands Organisation for Scientific Research (NWO) Netherlands Consortium for Healthy Aging (NCHA; Project No. 050-060-810), by funds from the Genetic Laboratory of the Department of Internal Medicine, Erasmus MC, and by a grant from the National Institute of Child Health and Human Development (R01HD068437). CODAM was supported by Grants of the Netherlands Organization for Scientific Research (940–35–034) and the Dutch Diabetes Research Foundation (98.901).

Funding for the NTR was obtained from the Netherlands Organization for Scientific Research (NWO) and The Netherlands Organisation for Health Research and Development (ZonMW) Grants 904-61-090, 985-10-002, 912-10-020, 904-61-193, 480-04-004, 463-06-001, 451-04-034, 400-05-717, Addiction-31160008, 016-115-035, 481-08-011, 056-32-010, Middelgroot-911-09-032, and NWO-Groot 480-15-001/674. The KORA study was initiated and financed by the Helmholtz Zentrum München – German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. SHIP is part of the Community Medicine Research net of the University of Greifswald, Germany, which is funded by the Federal Ministry of Education and Research (Grants No. 01ZZ9603, 01ZZ0103, and 01ZZ0403), the Ministry of Cultural Affairs as well as the Social Ministry of the Federal State of Mecklenburg-West Pomerania, and the network ‘Greifswald Approach to Individualized Medicine (GANI_MED)’ funded by the Federal Ministry of Education and Research (Grant 03IS2061A). DNA methylation data have been supported by the DZHK (Grant 81X3400104). The University of Greifswald is a member of the Caché Campus program of the InterSystems GmbH.

The researchers are independent from the funders. The study sponsors had no role in the study design, data collection, data analysis, interpretation of data, and preparation, review or approval of the manuscript.

Funder: Funder number
BBMRI-NL
Erasmus MC University Medical Center Rotterdam
Federal State of Mecklenburg-West Pomerania: 03IS2061A
Fresenius Medical Care, Neuraxpharm and Janssen-Cilag
Leiden Longevity Study
Ministry of Cultural Affairs
NWO-Groot: 480-15-001/674
Netherlands Consortium for Healthy Aging: 050-060-810
Netherlands Genomics Initiative
Netherlands Institute for Health Sciences
Netherlands Organization for Health Research and Development: VIDI 016.136.361
Netherlands Organization for the Health Research and Development
Research Institute for Diseases in the Elderly
ZonMW The Netherlands: 529051014
National Institute of Child Health and Human Development: 940–35–034, R01HD068437
Deutsches Zentrum für Herz-Kreislaufforschung: 81X3400104
Horizon 2020 Framework Programme: 733206, 633595, 696295
Seventh Framework Programme: 602736, 603288
Fresenius Medical Care North America
European Commission
European Research Council: ERC-2014-CoG-648916
ZonMw: 016-115-035, 463-06-001, 481-08-011, 904-61-090, 904-61-193, 480-04-004, 400-05-717, 451-04-034, 056-32-010, 985-10-002, 912-10-020
Erasmus Universiteit Rotterdam
Bundesministerium für Bildung und Forschung: 01ZZ0403, 01ZZ0103, 01ZZ9603
Ministerie van Volksgezondheid, Welzijn en Sport
Erasmus Medisch Centrum: 184.021.007
Diabetes Fonds: 98.901
Ministerie van Onderwijs, Cultuur en Wetenschap
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Seventh Framework Programme
Deutsches Forschungszentrum für Gesundheit und Umwelt, Helmholtz Zentrum München

    Cohort Studies

    • Netherlands Twin Register (NTR)
