Abstract
Large language models (LLMs) have advanced virtual educators and learners, bridging NLP with AI4Education. Existing work often lacks scalability and fails to leverage diverse, large-scale course content, with limited frameworks for assessing pedagogic quality. To this end, we propose WikiHowAgent, a multi-agent workflow leveraging LLMs to simulate interactive teaching-learning conversations. It integrates teacher and learner agents, an interaction manager, and an evaluator to facilitate procedural learning and assess pedagogic quality. We introduce a dataset of 114,296 teacher-learner conversations grounded in 14,287 tutorials across 17 domains and 727 topics. Our evaluation protocol combines computational and rubric-based metrics with human judgment alignment. Results demonstrate the workflow’s effectiveness in diverse setups, offering insights into LLM capabilities across domains. Our datasets and implementations are fully open-sourced.
| Original language | English |
|---|---|
| Title of host publication | Findings of the Association for Computational Linguistics: EMNLP 2025 |
| Subtitle of host publication | [Proceedings] |
| Editors | Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 2984-2997 |
| Number of pages | 14 |
| ISBN (Electronic) | 9798891763357 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 - Suzhou, China Duration: 4 Nov 2025 → 9 Nov 2025 |
Conference
| Conference | 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 |
|---|---|
| Country/Territory | China |
| City | Suzhou |
| Period | 4/11/25 → 9/11/25 |
Bibliographical note
Publisher Copyright:©2025 Association for Computational Linguistics.
Fingerprint
Dive into the research topics of 'Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment'. Together they form a unique fingerprint.Research output
- 1 Paper
-
Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment
Pei, J., Ye, F., Sun, X., Deng, W., Hindriks, K. & Wang, J., 2025, (Accepted/In press).Research output: Contribution to Conference › Paper › Academic
Open Access
Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver