It’s been over a quarter century since Dr. Steve Ritter founded Carnegie Learning as a math learning system that adapted to individual students’ level of understanding. Breakthroughs in generative AI (GenAI) have allowed Carnegie Learning to personalize educational content in new ways, with greater precision. But Ritter says they haven’t fundamentally changed the nature of what the learning system is designed to do. “If anything, GenAI has allowed us to further our existing mission of personalization for learners. It’s not so much a departure from the past as a catalyst for carrying out the things that we already want to do.”
As leaders at Amplify, Carnegie Learning, and Discovery Education (formerly DreamBox Learning), we developed pilot projects to explore innovative uses of AI for personalization as part of the Bill & Melinda Gates Foundation’s K-12 AI Pilot Cohort. Our projects aimed to explore whether GenAI holds promise for helping advance personalization for educators and learners. While there were promising results in some cases, the work also shows that there are limits—and potential harms—to the use of current AI capabilities for personalization.
All of our projects used AI for personalization in some way, but none interacted directly with students on learning tasks. The project teams didn’t want to risk chatbots providing inaccurate information, revealing answers when engaging with a learner, or causing other harm to students.
The potential for harm may decline as the technology evolves. But at present, some of the projects saw more promise in approaches that leverage GenAI to deepen existing personalization efforts. Discovery Education used GenAI to pinpoint better learning pathways for students within the company’s existing math learning system. The project aimed to leverage the learner’s time more effectively through more precise recommendations for learning. The team is looking forward to measuring the impact of the improved learning pathways with students this autumn.
Amplify’s project explored how GenAI could help teachers give more effective feedback on students’ responses to math problems. The tool provided feedback on each teacher’s feedback, making recommendations based on research literature about the most effective types of feedback. These projects allowed for more effective personalized learning while avoiding potential harm to students.
Overall, these projects found that current GenAI technologies hold much more promise when they are used to personalize existing systems in ways that wouldn’t have been possible previously. For example, Carnegie Learning aimed to increase students’ sense of math belonging by enabling students to personalize math problems. Research shows that students benefit from math problems that discuss topics they are excited about (perhaps a sports team or video game), but that Black and Latino students’ interests are typically underrepresented in math problems. Early findings from the Carnegie Learning project indicated that students’ sense of belonging increased somewhat among users who engaged with the bot. In addition, preliminary results show an increase in math proficiency.
Crucially, the bot that Carnegie Learning created was used only to generate problems. The bot asked the learner to identify their interest areas and what they would like to see in a math problem, including the names of characters. After six exchanges with the student, the bot created a standards- and curriculum-aligned problem at the student’s level, folding in the student’s interest areas and name preferences. Using student-designated names allowed the system to avoid the perception of bias that can sometimes occur with randomized names. Before the emergence of GenAI tools, it would have been impossible to scale Carnegie Learning’s micro-personalization of problems based on student input.
Similar to Carnegie Learning’s work, Amplify’s project was also deeply rooted in research. The project aimed to use GenAI technology to help support teachers in offering asset-based feedback on students’ math work—a technique that is linked to better math outcomes, especially among historically and systematically excluded students.
GenAI was used only on the teacher-facing side, as a professional learning experience, reducing the risk of harm to students. Amplify’s professional development activity asks teachers to offer feedback on a sample student response to a math problem. The chatbot then gives feedback on the teacher’s feedback, following a series of prompts that Amplify designed to lead the teacher toward more asset-based framing.
About half of the teachers who volunteered for the study found that the tool helped them give students better feedback. Still, significant challenges for professional learning remain. Many teachers wished for more transparency in how the tool operated. Some teachers grew frustrated that the tool always seemed to have more feedback for them, resulting in a feeling that their feedback to students was never quite good enough. In some cases, teachers reported AI hallucinations.
These obstacles are a good reminder of why it is important that the field remain focused on areas of personalization where GenAI can help with tasks that research shows are important for student learning, whether that is teacher feedback, more personalized word problems, or more specific guidance about skills to develop. We also need to be mindful not to cause harm to students or undermine the role of the teacher. These early findings demonstrate GenAI’s potential for personalization, but we need to proceed cautiously.
Be on the lookout for Digital Promise’s full report on the K-12 AI Pilot Cohort, to be published on October 16.