Comparison of Targeted Maximum Likelihood and Shrinkage Estimators of Parameters in Gene Networks

G. Geeven, M.J. van der Laan, M.C.M. de Gunst

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Gene regulatory networks, in which edges between nodes describe interactions between transcription factors (TFs) and their target genes, model regulatory interactions that determine the cell-type and condition-specific expression of genes. Regression methods can be used to identify TF-target gene interactions from gene expression and DNA sequence data. The response variable, i.e. observed gene expression, is modeled as a function of many predictor variables simultaneously. In practice, it is generally not possible to select a single model that clearly achieves the best fit to the observed experimental data and the selected models typically contain overlapping sets of predictor variables. Moreover, parameters that represent the marginal effect of the individual predictors are not always present. In this paper, we use the statistical framework of estimation of variable importance to define variable importance as a parameter of interest and study two different estimators of this parameter in the context of gene regulatory networks. On yeast data we show that the resulting parameter has a biologically appealing interpretation. We apply the proposed methodology on mammalian gene expression data to gain insight into the temporal activity of TFs that underlie gene expression changes in F11 cells in response to Forskolin stimulation. © 2012 De Gruyter. All rights reserved.
Original language: English
Number of pages: 27
Journal: Statistical Applications in Genetics and Molecular Biology
Volume: 11
Issue number: 5
Early online date: 25 Sep 2012
DOI: 10.1515/1544-6115.1728
Publication status: Published - 2012

Fingerprint

Shrinkage Estimator
Gene Networks
Gene Regulatory Networks
Maximum Likelihood Estimator
Maximum likelihood
Gene expression
Genes
Transcription factors
Gene Expression
Transcription Factor
Predictors
Transcription Factors
Gene Regulatory Network
Gene
Interaction
Target
DNA sequences
Cell
Colforsin
Regulator Genes

Cite this

@article{9904fe4cf1c841b78a5702e2833ba5a8,
title = "Comparison of Targeted Maximum Likelihood and Shrinkage Estimators of Parameters in Gene Networks",
abstract = "Gene regulatory networks, in which edges between nodes describe interactions between transcription factors (TFs) and their target genes, model regulatory interactions that determine the cell-type and condition-specific expression of genes. Regression methods can be used to identify TF-target gene interactions from gene expression and DNA sequence data. The response variable, i.e. observed gene expression, is modeled as a function of many predictor variables simultaneously. In practice, it is generally not possible to select a single model that clearly achieves the best fit to the observed experimental data and the selected models typically contain overlapping sets of predictor variables. Moreover, parameters that represent the marginal effect of the individual predictors are not always present. In this paper, we use the statistical framework of estimation of variable importance to define variable importance as a parameter of interest and study two different estimators of this parameter in the context of gene regulatory networks. On yeast data we show that the resulting parameter has a biologically appealing interpretation. We apply the proposed methodology on mammalian gene expression data to gain insight into the temporal activity of TFs that underlie gene expression changes in F11 cells in response to Forskolin stimulation. {\textcopyright} 2012 De Gruyter. All rights reserved.",
author = "Geeven, G. and {van der Laan}, M.J. and {de Gunst}, M.C.M.",
year = "2012",
doi = "10.1515/1544-6115.1728",
language = "English",
volume = "11",
journal = "Statistical Applications in Genetics and Molecular Biology",
issn = "2194-6302",
publisher = "Walter de Gruyter GmbH",
number = "5",
}

Comparison of Targeted Maximum Likelihood and Shrinkage Estimators of Parameters in Gene Networks. / Geeven, G.; van der Laan, M.J.; de Gunst, M.C.M.

In: Statistical Applications in Genetics and Molecular Biology, Vol. 11, No. 5, 2012.

TY - JOUR

T1 - Comparison of Targeted Maximum Likelihood and Shrinkage Estimators of Parameters in Gene Networks

AU - Geeven, G.

AU - van der Laan, M.J.

AU - de Gunst, M.C.M.

PY - 2012

Y1 - 2012

N2 - Gene regulatory networks, in which edges between nodes describe interactions between transcription factors (TFs) and their target genes, model regulatory interactions that determine the cell-type and condition-specific expression of genes. Regression methods can be used to identify TF-target gene interactions from gene expression and DNA sequence data. The response variable, i.e. observed gene expression, is modeled as a function of many predictor variables simultaneously. In practice, it is generally not possible to select a single model that clearly achieves the best fit to the observed experimental data and the selected models typically contain overlapping sets of predictor variables. Moreover, parameters that represent the marginal effect of the individual predictors are not always present. In this paper, we use the statistical framework of estimation of variable importance to define variable importance as a parameter of interest and study two different estimators of this parameter in the context of gene regulatory networks. On yeast data we show that the resulting parameter has a biologically appealing interpretation. We apply the proposed methodology on mammalian gene expression data to gain insight into the temporal activity of TFs that underlie gene expression changes in F11 cells in response to Forskolin stimulation. © 2012 De Gruyter. All rights reserved.

U2 - 10.1515/1544-6115.1728

DO - 10.1515/1544-6115.1728

M3 - Article

VL - 11

JO - Statistical Applications in Genetics and Molecular Biology

JF - Statistical Applications in Genetics and Molecular Biology

SN - 2194-6302

IS - 5

ER -