Sandwich corrected standard errors in family-based genome-wide association studies

C.C. Minica, C.V. Dolan, M.M.D. Kampert, D.I. Boomsma, J.M. Vink

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Given the availability of genotype and phenotype data collected in family members, the question arises which estimator ensures the most optimal use of such data in genome-wide scans. Using simulations, we compared the Unweighted Least Squares (ULS) and Maximum Likelihood (ML) procedures. The former is implemented in Plink and uses a sandwich correction to correct the standard errors for model misspecification of ignoring the clustering. The latter is implemented by fast linear mixed procedures and models explicitly the familial resemblance. However, as it commits to a background model limited to additive genetic and unshared environmental effects, it employs a misspecified model for traits with a shared environmental component. We considered the performance of the two procedures in terms of type I and type II error rates, with correct and incorrect model specification in ML. For traits characterized by moderate to large familial resemblance, using an ML procedure with a correctly specified model for the conditional familial covariance matrix should be the strategy of choice. The potential loss in power encountered by the sandwich corrected ULS procedure does not outweigh its computational convenience. Furthermore, the ML procedure was quite robust under model misspecification in the simulated settings and appreciably more powerful than the sandwich corrected ULS procedure. However, to correct for the effects of model misspecification in ML in circumstances other than those considered here, we propose to use a sandwich correction. We show that the sandwich correction can be formulated in terms of the fast ML method.
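To make the sandwich idea in the abstract concrete, here is a minimal sketch of a cluster-robust (sandwich) standard error for an ordinary least squares fit that ignores family clustering. It is an illustration under stated assumptions, not the Plink implementation or the authors' code: the function name `ols_cluster_sandwich`, the toy genotype model, and all simulation settings are hypothetical, and no small-sample correction is applied. The covariance is computed as (X'X)^-1 [sum over families of X_f' e_f e_f' X_f] (X'X)^-1, where e_f are the residuals of family f.

```python
import numpy as np


def ols_cluster_sandwich(X, y, cluster_ids):
    """Return OLS estimates with naive and cluster-robust standard errors.

    Hypothetical helper for illustration; no small-sample correction.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    n, p = X.shape

    # "Bread": inverse of X'X, shared by the naive and sandwich covariances.
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta

    # Naive standard errors assume independent, homoskedastic residuals.
    sigma2 = resid @ resid / (n - p)
    se_naive = np.sqrt(np.diag(sigma2 * bread))

    # "Meat": outer products of the per-family score vectors X_f' e_f.
    meat = np.zeros((p, p))
    for fam_id in np.unique(cluster_ids):
        idx = cluster_ids == fam_id
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)

    se_robust = np.sqrt(np.diag(bread @ meat @ bread))
    return beta, se_naive, se_robust


# Toy data (assumed settings): 1000 families of 4, a genotype score that is
# correlated within families, and a phenotype with a shared family effect.
rng = np.random.default_rng(0)
n_fam, fam_size = 1000, 4
fam = np.repeat(np.arange(n_fam), fam_size)
g = rng.binomial(1, 0.3, n_fam)[fam] + rng.binomial(1, 0.3, n_fam * fam_size)
y = 0.1 * g + rng.normal(size=n_fam)[fam] + rng.normal(size=n_fam * fam_size)
X = np.column_stack([np.ones_like(g, dtype=float), g])

beta, se_naive, se_robust = ols_cluster_sandwich(X, y, fam)
print("SNP effect:", beta[1])
print("naive SE:", se_naive[1], "sandwich SE:", se_robust[1])
```

In a setup like this one, where both genotype and phenotype are correlated within families, the sandwich standard error for the SNP effect should come out larger than the naive one. That gap is the understatement of uncertainty that the correction is meant to repair.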
Original language: English
Pages (from-to): 388-394
Number of pages: 7
Journal: European Journal of Human Genetics
Volume: 23
Issue number: 3
DOI: 10.1038/ejhg.2014.94
Publication status: Published - 2015

Fingerprint

Genome-Wide Association Study
Least-Squares Analysis
Cluster Analysis
Genotype
Genome
Phenotype

Cite this

@article{b7be001f086343839234999283089380,
title = "Sandwich corrected standard errors in family-based genome-wide association studies",
abstract = "Given the availability of genotype and phenotype data collected in family members, the question arises which estimator ensures the most optimal use of such data in genome-wide scans. Using simulations, we compared the Unweighted Least Squares (ULS) and Maximum Likelihood (ML) procedures. The former is implemented in Plink and uses a sandwich correction to correct the standard errors for model misspecification of ignoring the clustering. The latter is implemented by fast linear mixed procedures and models explicitly the familial resemblance. However, as it commits to a background model limited to additive genetic and unshared environmental effects, it employs a misspecified model for traits with a shared environmental component. We considered the performance of the two procedures in terms of type I and type II error rates, with correct and incorrect model specification in ML. For traits characterized by moderate to large familial resemblance, using an ML procedure with a correctly specified model for the conditional familial covariance matrix should be the strategy of choice. The potential loss in power encountered by the sandwich corrected ULS procedure does not outweigh its computational convenience. Furthermore, the ML procedure was quite robust under model misspecification in the simulated settings and appreciably more powerful than the sandwich corrected ULS procedure. However, to correct for the effects of model misspecification in ML in circumstances other than those considered here, we propose to use a sandwich correction. We show that the sandwich correction can be formulated in terms of the fast ML method.",
author = "C.C. Minica and C.V. Dolan and M.M.D. Kampert and D.I. Boomsma and J.M. Vink",
year = "2015",
doi = "10.1038/ejhg.2014.94",
language = "English",
volume = "23",
pages = "388--394",
journal = "European Journal of Human Genetics",
issn = "1018-4813",
publisher = "Nature Publishing Group",
number = "3",
}

Sandwich corrected standard errors in family-based genome-wide association studies. / Minica, C.C.; Dolan, C.V.; Kampert, M.M.D.; Boomsma, D.I.; Vink, J.M.

In: European Journal of Human Genetics, Vol. 23, No. 3, 2015, p. 388-394.

TY - JOUR

T1 - Sandwich corrected standard errors in family-based genome-wide association studies

AU - Minica, C.C.

AU - Dolan, C.V.

AU - Kampert, M.M.D.

AU - Boomsma, D.I.

AU - Vink, J.M.

PY - 2015

Y1 - 2015

AB - Given the availability of genotype and phenotype data collected in family members, the question arises which estimator ensures the most optimal use of such data in genome-wide scans. Using simulations, we compared the Unweighted Least Squares (ULS) and Maximum Likelihood (ML) procedures. The former is implemented in Plink and uses a sandwich correction to correct the standard errors for model misspecification of ignoring the clustering. The latter is implemented by fast linear mixed procedures and models explicitly the familial resemblance. However, as it commits to a background model limited to additive genetic and unshared environmental effects, it employs a misspecified model for traits with a shared environmental component. We considered the performance of the two procedures in terms of type I and type II error rates, with correct and incorrect model specification in ML. For traits characterized by moderate to large familial resemblance, using an ML procedure with a correctly specified model for the conditional familial covariance matrix should be the strategy of choice. The potential loss in power encountered by the sandwich corrected ULS procedure does not outweigh its computational convenience. Furthermore, the ML procedure was quite robust under model misspecification in the simulated settings and appreciably more powerful than the sandwich corrected ULS procedure. However, to correct for the effects of model misspecification in ML in circumstances other than those considered here, we propose to use a sandwich correction. We show that the sandwich correction can be formulated in terms of the fast ML method.

U2 - 10.1038/ejhg.2014.94

DO - 10.1038/ejhg.2014.94

M3 - Article

VL - 23

SP - 388

EP - 394

JO - European Journal of Human Genetics

JF - European Journal of Human Genetics

SN - 1018-4813

IS - 3

ER -