Speakers
Gerti Pishtari, University for Continuing Education, Krems, Austria
Steven James Moore, Carnegie Mellon University, USA
Behzad Mirzababaei, Know-Center GmbH, Graz, Austria
Start: 15/09/2022 - 10:30
End: 15/09/2022 - 12:30
Address: Room B307
Session: Intelligent Systems and Technological Devices
Chair: Tobias Ley
The Evaluation of One-to-One Initiatives: Exploratory Results from a Systematic Review
Gerti Pishtari [1], Edna Milena Sarmiento-Márquez [2], Kairit Tammets [2] and Jaan Aru [3]
[1] Danube University Krems, Austria; [2] Tallinn University, Estonia; [3] University of Tartu, Estonia
Abstract: While one-to-one initiatives (that equip each student and teacher with digital devices) have been widely implemented, no systematic review has explored how they are being evaluated. The contribution of this paper is twofold. First, we present exploratory insights from a systematic review on the evaluation of one-to-one initiatives. We focus on the relations inside the related research community and explore the relevant research topics that they have considered, through bibliometric network analyses and topic modeling. Second, this paper contributes to existing guidelines about systematic reviews with an example that applies the mentioned analyses before the manual in-depth review of the papers (usually they are applied in parallel, or afterwards). Results depict a fragmented community with few explicit collaborations among the research groups, but one that shares a common body of literature providing good practices that can inform future one-to-one implementations. This community has considered a common set of topics (including the implementation of educational technologies, mobile learning and classroom orchestration). Future evaluations of one-to-one initiatives would benefit from being grounded in pedagogical theories and informed by learning analytics. Our approach enabled us to understand the dynamics of the related community, identify the core literature, and define guiding questions for future qualitative analyses.
📄 Read More: https://link.springer.com/chapter/10.1007/978-3-031-16290-9_23
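The review's analysis scripts are not part of this listing; as a rough illustration of the topic-modeling step mentioned in the abstract, the sketch below fits a small LDA model over placeholder abstracts with scikit-learn. The corpus, topic count, and variable names are assumptions made purely for illustration.

```python
# Illustrative sketch only: LDA topic modeling over paper abstracts,
# loosely mirroring the topic-modeling step described in the abstract.
# The corpus and the number of topics are placeholder assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "one-to-one initiative evaluation with digital devices in classrooms",
    "mobile learning and classroom orchestration with tablets",
    "implementation of educational technologies in schools",
]

# Bag-of-words representation of the abstracts.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(abstracts)

# Fit a small LDA model; n_components is an arbitrary choice here.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words per topic.
terms = vectorizer.get_feature_names_out()
for topic_id, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {topic_id}: {', '.join(top)}")
```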
Assessing the Quality of Student-Generated Short Answer Questions Using GPT-3
Steven Moore, Huy A. Nguyen, Norman Bier, Tanvi Domadia and John Stamper Carnegie Mellon University, USA
Abstract: Generating short answer questions is a popular form of learnersourcing with benefits for both the students’ higher-order thinking and the instructors’ collection of assessment items. However, assessing the quality of the student-generated questions can involve significant efforts from instructors and domain experts. In this work, we investigate the feasibility of leveraging students to generate short answer questions with minimal scaffolding and machine learning models to evaluate the student-generated questions. We had 143 students across 7 online college-level chemistry courses participate in an activity where they were prompted to generate a short answer question regarding the content they were presently learning. Using both human and automatic evaluation methods, we investigated the linguistic and pedagogical quality of these student-generated questions. Our results showed that 32% of the student-generated questions were evaluated by experts as high quality, indicating that they could be added and used in the course in their present condition. Additional expert evaluation identified that 23% of the student-generated questions assessed higher cognitive processes according to Bloom’s Taxonomy. We also identified the strengths and weaknesses of using a state-of-the-art language model, GPT-3, to automatically evaluate the student-generated questions. Our findings suggest that students are relatively capable of generating short answer questions that can be leveraged in their online courses. Based on the evaluation methods, recommendations for leveraging experts and automatic methods are discussed.
📄 Read More: https://link.springer.com/chapter/10.1007/978-3-031-16290-9_18
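The paper's actual prompts and rating rubric are not reproduced in this listing; the sketch below only illustrates the general idea of asking GPT-3 to judge a student-generated short answer question, using the legacy OpenAI Completion endpoint that was available in 2022. The prompt wording, model choice, and output handling are assumptions.

```python
# Illustrative sketch only: asking GPT-3 to rate a student-generated
# short answer question. The prompt, model choice, and output parsing
# are assumptions, not the authors' actual setup.
import os
import openai  # legacy openai-python client (pre-1.0 interface)

openai.api_key = os.environ["OPENAI_API_KEY"]

question = "Why does increasing temperature usually speed up a chemical reaction?"

prompt = (
    "You are reviewing short answer questions written by chemistry students.\n"
    f"Question: {question}\n"
    "Rate the question's quality as HIGH, MEDIUM, or LOW and explain briefly."
)

response = openai.Completion.create(
    model="text-davinci-002",  # a GPT-3 model available in 2022
    prompt=prompt,
    max_tokens=100,
    temperature=0.0,  # deterministic output for evaluation use
)

print(response["choices"][0]["text"].strip())
```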
Towards Generalized Methods for Automatic Question Generation in Educational Domains
Huy A. Nguyen, Shravya Bhat, Steven Moore, Norman Bier and John Stamper Carnegie Mellon University, USA
Abstract: Students learn more from doing activities and practicing their skills on assessments, yet it can be challenging and time-consuming to generate such practice opportunities. In our work, we examine how advances in natural language processing and question generation may help address this issue. In particular, we present a pipeline for generating and evaluating questions from text-based learning materials in an introductory data science course. The pipeline includes applying a text-to-text transformer (T5) question generation model and a concept hierarchy extraction model on the text content, then scoring the generated questions based on their relevance to the extracted key concepts. We further evaluated the question quality with three different approaches: information score, automated rating by a trained model (GPT-3) and manual review by human instructors. Our results showed that the generated questions were rated favorably by all three evaluation methods. We conclude with a discussion of the strengths and weaknesses of the generated questions and outline the next steps towards refining the pipeline and promoting natural language processing research in educational domains.
📄 Read More: https://link.springer.com/chapter/10.1007/978-3-031-16290-9_20
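As a rough sketch of the T5 question-generation step in this pipeline, the example below runs a publicly available text-to-text checkpoint through Hugging Face transformers. The checkpoint name, the highlight-style input format, and the course snippet are assumptions; they are not necessarily what the authors used.

```python
# Illustrative sketch only: generating a question from course text with a
# T5-based seq2seq model via Hugging Face transformers. The checkpoint name
# and the "highlight" input format are assumptions for illustration.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "valhalla/t5-base-qg-hl"  # community QG model; an assumption
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Course text with the intended answer span wrapped in <hl> tokens,
# following this checkpoint's input convention.
text = (
    "generate question: A <hl> histogram <hl> summarizes the distribution "
    "of a numeric variable by grouping values into bins."
)

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```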
Learning to Give a Complete Argument with a Conversational Agent: An Experimental Study in Two Domains of Argumentation
★ Best paper candidate
Behzad Mirzababaei [1] and Viktoria Pammer-Schindler [1,2]
[1] Know-Center GmbH, Austria; [2] Graz University of Technology, Austria
Abstract: This paper reports a between-subjects experiment (treatment group N = 42, control group N = 53) evaluating the effect of a conversational agent that teaches users to give a complete argument. The agent analyses a given argument for whether it contains a claim, a warrant and evidence, which are understood to be essential elements in a good argument. The agent detects which of these elements is missing, and accordingly scaffolds the argument completion. The experiment includes a treatment task (Task 1) in which participants of the treatment group converse with the agent, and two assessment tasks (Tasks 2 and 3) in which both the treatment and the control group answer an argumentative question. We find that in Task 1, 36 out of 42 conversations with the agent are coherent. This indicates good interaction quality. We further find that in Tasks 2 and 3, the treatment group writes a significantly higher percentage of argumentative sentences (Task 2: t(94) = 1.73, p = 0.042; Task 3: t(94) = 1.7, p = 0.045). This shows that participants of the treatment group used the scaffold, taught by the agent in Task 1, outside the tutoring conversation (namely in the assessment Tasks 2 and 3) and across argumentation domains (Task 3 is in a different domain of argumentation than Tasks 1 and 2). The work complements existing research on adaptive and conversational support for teaching argumentation in essays.
📄 Read More: https://link.springer.com/chapter/10.1007/978-3-031-16290-9_16
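The reported comparisons (e.g., Task 2: t(94) = 1.73, p = 0.042) are consistent with one-tailed independent-samples t-tests; the sketch below shows that computation in SciPy on synthetic placeholder data, since the study's data are not part of this listing.

```python
# Illustrative sketch only: a one-tailed independent-samples t-test of the
# kind reported in the abstract (treatment > control in the percentage of
# argumentative sentences). The numbers below are synthetic placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Percentage of argumentative sentences per participant (made-up data,
# matching only the reported group sizes).
treatment = rng.normal(loc=60, scale=15, size=42)  # treatment group, N = 42
control = rng.normal(loc=52, scale=15, size=53)    # control group, N = 53

# One-tailed test of whether the treatment mean exceeds the control mean.
t_stat, p_value = stats.ttest_ind(treatment, control, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```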