Abstract
Retractions in scientific publishing are a critical mechanism for self-correction, yet they remain stigmatized, often associated with misconduct rather than honest error. This study investigates how retractions are framed in the informal content of two leading scientific journals, Science and Nature, using computational methods to analyze mentions of retractions from 1967 to 2023. We compare two distinct approaches to text classification — a dictionary-based method and a large language model (LLM), GPT-4o mini — to identify frames of “misconduct” and “honest mistake” in 999 articles. While the dictionary approach offers transparency, it struggles with nuanced language and contextual references. In contrast, the LLM demonstrates superior accuracy (88% vs. 62%) and provides interpretable explanations for its classifications.
Our findings reveal that retractions are predominantly framed as “misconduct”, with mentions of “honest mistakes” decreasing over time, despite ongoing efforts to foster this frame as a way to motivate open data practices and build trust in scientists and science as a self-correcting endeavor. This trend is further underscored by the fact that external contributions use the “honest mistake” frame significantly less often than editorial and journalistic texts. Additionally, the attribution of misconduct is more prevalent in the fields of medicine & health and biology compared to chemistry and physics, reflecting disciplinary differences in how retractions are socially constructed.
This study not only highlights the methodological advantages and limitations of LLM approaches in frame analysis but also introduces their novel application in the social study of sciences, a field where such methods remain underutilized. By leveraging digital tools to analyze scientific discourse, we contribute to a deeper understanding of how retractions are socially constructed and propose pathways for destigmatizing this essential scientific practice.
Original language | English |
---|---|
Publication status | Published - 8 May 2025 |
Event | Measuring Culture: Advancing Computational Approaches in Sociology, Political Science, and Linguistics - Heidelberg Academie der Wissenschaften, Heidelberg, Germany Duration: 8 May 2025 → 9 May 2025 |
Conference
Conference | Measuring Culture |
---|---|
Abbreviated title | Measuring Culture |
Country/Territory | Germany |
City | Heidelberg |
Period | 8/05/25 → 9/05/25 |
Keywords
- Computational Social Sciences
- Framing
- Research Integrity
- Scientific Misconduct
- Natural Language Processing
- Large Language Models