ConBind: Motif-aware cross-species alignment for the identification of functional transcription factor binding sites

Stefan H. Lelieveld, Judith Schütte, Maurits J J Dijkstra, Punto Bawono, Sarah J. Kinston, Berthold Göttgens, Jaap Heringa, Nicola Bonzanni

Research output: Contribution to Journal / Article / Academic / peer-review

Abstract

Eukaryotic gene expression is regulated by transcription factors (TFs) binding to promoters as well as distal enhancers. TFs recognize short but specific binding sites (TFBSs) located within these promoter and enhancer regions. Functionally relevant TFBSs are often highly conserved during evolution, leaving a strong phylogenetic signal. While multiple sequence alignment (MSA) is a potent tool for detecting this phylogenetic signal, current MSA implementations are optimized to align the maximum number of identical nucleotides. This approach might result in the omission of conserved motifs that contain interchangeable nucleotides, such as the ETS motif (IUPAC code: GGAW). Here, we introduce ConBind, a novel method that enhances the alignment of short motifs even if their mutual sequence similarity is only partial. ConBind improves the identification of conserved TFBSs by increasing the alignment accuracy of TFBS families within orthologous DNA sequences. Functional validation of the Gfi1b +13 enhancer reveals that ConBind identifies additional functionally important ETS binding sites that were missed by all other tested alignment tools. In addition to the analysis of known regulatory regions, our web tool is useful for analyzing TFBSs in previously uncharacterized DNA regions identified through ChIP-sequencing.
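
To illustrate why a degenerate motif such as GGAW can be undervalued by identity-based comparison, the following minimal Python sketch expands an IUPAC motif into a regular expression and scans two toy sequences. It is not ConBind's algorithm; the function names (`motif_to_regex`, `find_sites`, `identity`) and the example sequences are hypothetical and chosen only for illustration.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, motif):
    """Return (start, matched_text) for every forward-strand match of the motif."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


def identity(a, b):
    """Fraction of positions with identical nucleotides in two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical orthologous fragments (toy data, not taken from the paper).
species_a = "TTCAGGAAGTCC"
species_b = "TTCAGGATGTCC"

sites_a = find_sites(species_a, "GGAW")
sites_b = find_sites(species_b, "GGAW")
print(sites_a, sites_b)
print(identity(sites_a[0][1], sites_b[0][1]))
```

Both toy sequences contain a valid GGAW site at the same position, yet the two sites differ at one of their four nucleotides, so a score that rewards only identical nucleotides treats that column as a mismatch even though both variants are instances of the same motif.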

Original language: English
Pages (from-to): e72
Journal: Nucleic Acids Research
Volume: 44
Issue number: 8
DOIs: 10.1093/nar/gkv1518
Publication status: Published - 5 May 2016

Fingerprint

Sequence Alignment
Transcription Factors
Nucleotides
Binding Sites
Nucleic Acid Regulatory Sequences
Genetic Promoter Regions
Gene Expression
DNA
ETS Motif

Cite this

Lelieveld, Stefan H.; Schütte, Judith; Dijkstra, Maurits J J; Bawono, Punto; Kinston, Sarah J.; Göttgens, Berthold; Heringa, Jaap; Bonzanni, Nicola. / ConBind: Motif-aware cross-species alignment for the identification of functional transcription factor binding sites. In: Nucleic Acids Research. 2016; Vol. 44, No. 8. p. e72.
@article{3ecbec512848401495dc747f47fb8052,
title = "ConBind: Motif-aware cross-species alignment for the identification of functional transcription factor binding sites",
author = "Lelieveld, {Stefan H.} and Judith Sch{\"u}tte and Dijkstra, {Maurits J J} and Punto Bawono and Kinston, {Sarah J.} and Berthold G{\"o}ttgens and Jaap Heringa and Nicola Bonzanni",
year = "2016",
month = "5",
day = "5",
doi = "10.1093/nar/gkv1518",
language = "English",
volume = "44",
pages = "e72",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "8",

}

TY - JOUR

T1 - ConBind

T2 - Motif-aware cross-species alignment for the identification of functional transcription factor binding sites

AU - Lelieveld, Stefan H.

AU - Schütte, Judith

AU - Dijkstra, Maurits J J

AU - Bawono, Punto

AU - Kinston, Sarah J.

AU - Göttgens, Berthold

AU - Heringa, Jaap

AU - Bonzanni, Nicola

PY - 2016/5/5

Y1 - 2016/5/5

UR - http://www.scopus.com/inward/record.url?scp=84966277098&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84966277098&partnerID=8YFLogxK

U2 - 10.1093/nar/gkv1518

DO - 10.1093/nar/gkv1518

M3 - Article

VL - 44

SP - e72

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 8

ER -