A Computational Approach to Syntactic Diversity in the Hebrew Bible

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly-annotated linguistic database of the Hebrew Bible. This contribution describes the processes of data creation of this database and its underlying methodological principles. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations.

The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation as well as the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed to an open tool, which can be consulted online and which can be downloaded as a package for anyone who wants to use it for more advanced computational analysis of the Hebrew Bible.

A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the alleged early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus.

Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses.

Such distribution patterns, which can only be discovered with a computational full corpus analysis, are helpful to get a better understanding of diachronic language development of Classical Hebrew in the intersection of oral and written text transmission.

Original language: English
Pages (from-to): 237–253
Number of pages: 17
Journal: Journal of Biblical Text Research
Volume: 44
Publication status: Published - Apr 2019

Keywords

  • Digital Humanities
  • Bible
  • Hebrew
  • Corpus Linguistics

Cite this

@article{5f04992bc9bc43fe93223ad7a9acff42,
title = "A Computational Approach to Syntactic Diversity in the Hebrew Bible",
abstract = "For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly-annotated linguistic database of the Hebrew Bible. This contribution describes the processes of data creation of this database and its underlying methodological principles. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations. The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation as well as the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed to an open tool, which can be consulted online and which can be downloaded as a package for anyone who wants to use it for more advanced computational analysis of the Hebrew Bible. A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the alleged early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus. Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses. Such distribution patterns, which can only be discovered with a computational full corpus analysis, are helpful to get a better understanding of diachronic language development of Classical Hebrew in the intersection of oral and written text transmission.",
keywords = "Digital Humanities, Bible, Hebrew, Corpus Linguistics",
author = "{van Peursen}, Wido",
year = "2019",
month = "4",
language = "English",
volume = "44",
pages = "237--253",
journal = "Journal of Biblical Text Research",
issn = "1226-5926",

}

A Computational Approach to Syntactic Diversity in the Hebrew Bible. / van Peursen, Wido.

In: Journal of Biblical Text Research, Vol. 44, 04.2019, p. 237–253.


TY - JOUR

T1 - A Computational Approach to Syntactic Diversity in the Hebrew Bible

AU - van Peursen, Wido

PY - 2019/4

Y1 - 2019/4

N2 - For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly-annotated linguistic database of the Hebrew Bible. This contribution describes the processes of data creation of this database and its underlying methodological principles. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations. The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation as well as the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed to an open tool, which can be consulted online and which can be downloaded as a package for anyone who wants to use it for more advanced computational analysis of the Hebrew Bible. A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the alleged early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus. Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses. Such distribution patterns, which can only be discovered with a computational full corpus analysis, are helpful to get a better understanding of diachronic language development of Classical Hebrew in the intersection of oral and written text transmission.

AB - For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly-annotated linguistic database of the Hebrew Bible. This contribution describes the processes of data creation of this database and its underlying methodological principles. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations. The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation as well as the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed to an open tool, which can be consulted online and which can be downloaded as a package for anyone who wants to use it for more advanced computational analysis of the Hebrew Bible. A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the alleged early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus. Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses. Such distribution patterns, which can only be discovered with a computational full corpus analysis, are helpful to get a better understanding of diachronic language development of Classical Hebrew in the intersection of oral and written text transmission.

KW - Digital Humanities

KW - Bible

KW - Hebrew

KW - Corpus Linguistics

M3 - Article

VL - 44

SP - 237

EP - 253

JO - Journal of Biblical Text Research

JF - Journal of Biblical Text Research

SN - 1226-5926

ER -