A Computational Approach to Syntactic Diversity in the Hebrew Bible

Research output: Contribution to Journal › Article › Academic › peer-review


For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly annotated linguistic database of the Hebrew Bible. This contribution describes how the data in this database are created and the methodological principles underlying it. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations.

The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation, and the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed into an open tool, which can be consulted online and downloaded as a package by anyone who wants to use it for more advanced computational analysis of the Hebrew Bible.

A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the allegedly early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus.

Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses.

Such distribution patterns, which can only be discovered with a computational analysis of the full corpus, help to better understand the diachronic development of Classical Hebrew at the intersection of oral and written text transmission.
Original language: English
Pages (from-to): 237–253
Number of pages: 17
Journal: Journal of Biblical Text Research
Publication status: Published - Apr 2019


  • Digital Humanities
  • Bible
  • Hebrew
  • Corpus Linguistics

