Skip to main navigation Skip to search Skip to main content

Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment

Research output: Chapter in Book / Report / Conference proceedingConference contributionAcademicpeer-review

Abstract

Large language models (LLMs) have advanced virtual educators and learners, bridging NLP with AI4Education. Existing work often lacks scalability and fails to leverage diverse, large-scale course content, with limited frameworks for assessing pedagogic quality. To this end, we propose WikiHowAgent, a multi-agent workflow leveraging LLMs to simulate interactive teaching-learning conversations. It integrates teacher and learner agents, an interaction manager, and an evaluator to facilitate procedural learning and assess pedagogic quality. We introduce a dataset of 114,296 teacher-learner conversations grounded in 14,287 tutorials across 17 domains and 727 topics. Our evaluation protocol combines computational and rubric-based metrics with human judgment alignment. Results demonstrate the workflow’s effectiveness in diverse setups, offering insights into LLM capabilities across domains. Our datasets and implementations are fully open-sourced.

Original languageEnglish
Title of host publicationFindings of the Association for Computational Linguistics: EMNLP 2025
Subtitle of host publication[Proceedings]
EditorsChristos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
PublisherAssociation for Computational Linguistics (ACL)
Pages2984-2997
Number of pages14
ISBN (Electronic)9798891763357
DOIs
Publication statusPublished - 2025
Event30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 - Suzhou, China
Duration: 4 Nov 20259 Nov 2025

Conference

Conference30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
Country/TerritoryChina
CitySuzhou
Period4/11/259/11/25

Bibliographical note

Publisher Copyright:
©2025 Association for Computational Linguistics.

Fingerprint

Dive into the research topics of 'Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment'. Together they form a unique fingerprint.

Cite this