SADIE is a middleware framework designed to make AI-powered data science education tools safer, more accessible, and developmentally appropriate for students.
Summary
Safe and Accessible Data Interactions in Education (SADIE) is a middleware layer designed to ensure safe, instructionally relevant, and accessible student interactions with AI-powered data literacy education tools. As AI tools become increasingly integrated into educational settings, educators and students are voicing concerns about a lack of data privacy, unsafe or unmoderated content, and limited accessibility support. These issues underscore the need for technologies that tailor student-AI interactions to students' academic and developmental needs. We will focus specifically on creating a middle layer customized for data literacy-focused EdTech products. Because many students currently enter postsecondary education and the workforce without the ability to analyze or interpret data (LaMar & Boaler, 2021), safe and accessible LLM-powered agents hold promise for giving students more opportunities to develop important data literacy skills. By leveraging technologies such as generative AI, automatic speech recognition (ASR), and retrieval-augmented generation (RAG), SADIE aims to increase educator confidence in the use of AI for advancing data literacy teaching and learning.
Full description of project work
The Problem
In our past projects with students with disabilities, we witnessed firsthand how difficult it was for many students to meaningfully engage with data. These challenges inspired the initial concept for SADIE as a system that could make datasets and data visualizations more accessible to students with disabilities. As we continued investigating data literacy and accessibility, it became clear that these challenges are widespread and affect students of all abilities. An SRI landscape analysis on AI and middle-school data literacy echoed this need, with teachers and students noting the lack of effective tools for building data analysis skills, especially in science classes.
The challenges teachers and students currently face in building data literacy skills, coupled with research demonstrating that developing data literacy early is essential for closing long-standing skill gaps and building a strong foundation for future data science learning (Bargagliotti et al., 2020), led us to broaden SADIE’s scope to support secondary students with and without disabilities. Our goal for this work is to ensure that every learner has access to high-quality data literacy learning experiences.
Our Idea
As our SADIE research and development progressed, we became aware of several barriers to the adoption and scale of standalone GenAI-enabled EdTech tools:
- Rapid but uneven development in data-literacy EdTech: New AI-enabled tools focused on data literacy are emerging quickly, but most lack the safeguards needed for classroom use.
- Safety and reliability gaps: AI can produce unsafe, inaccurate, or biased responses without appropriate guardrails, leading to teacher and parent mistrust and inconsistent adoption in schools.
- Limited accessibility support: Key features such as screen reader compatibility, alt text, and speech input are often missing or poorly implemented. Students who rely on accessibility tools are disproportionately affected.
- Significant privacy risks: Some tools collect or expose personal data, leaving schools concerned about compliance and risk.
To address these barriers and facilitate future adoption, scale, and sustainability of SADIE, we had to reframe our vision and begin to think of SADIE not just as an EdTech tool for data literacy, but as a tool for safe, accessible, and instructionally relevant student-GenAI interactions.
In our updated conceptual model, SADIE 2.0 would act as a safety and accessibility layer between students and AI. We are exploring three main components for the SADIE middle layer:
- Data literacy content moderation: filter and refine LLM outputs to keep interactions safe, accurate, and constrained to data science learning progressions and data literacy tasks.
- Multimodal accessibility: prompt the LLM to generate accessible data visualizations and high-quality alt text based on best practices; use advanced ASR technologies optimized for young students in bustling classroom environments.
- Data privacy: prevent student PII from being sent to the underlying LLM.
We are also exploring ways to make SADIE compatible with different LLMs so it can be easily integrated into a wide range of EdTech products and so it can eventually scale to other subject areas as well.
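To make the middle-layer idea concrete, the following is a minimal Python sketch of how two of the components above (data privacy and content moderation) might sit between a student and any LLM. It is an illustration under stated assumptions, not SADIE's actual design: the function names, the regular expressions, the keyword allow-list, and the fallback message are all hypothetical, and a real system would need far more robust PII detection and moderation.

```python
import re
from typing import Callable

# Hypothetical patterns; real PII detection needs much broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

# Hypothetical allow-list standing in for a data literacy topic check.
ON_TOPIC_TERMS = {"data", "graph", "chart", "mean", "median", "variable", "trend"}

# Hypothetical redirect shown when a reply drifts off topic.
FALLBACK = "Let's get back to exploring the data. What do you notice in the graph?"


def redact_pii(text: str) -> str:
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def is_on_topic(text: str) -> bool:
    """Return True if the text mentions at least one allow-listed term."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & ON_TOPIC_TERMS)


def middle_layer(student_message: str, llm: Callable[[str], str]) -> str:
    """Redact PII before the model call, then screen the model's reply."""
    safe_prompt = redact_pii(student_message)  # PII never reaches the LLM
    reply = llm(safe_prompt)                   # any model: just a callable
    return reply if is_on_topic(reply) else FALLBACK
```

Because the model is passed in as a plain callable, the same wrapper could in principle sit in front of different LLMs, which is the kind of compatibility described above.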
Who will this impact?
- Students of all abilities will be able to interact with data safely and meaningfully while developing their data literacy skills through LLM-powered EdTech tools.
- Teachers can confidently adopt LLM-powered EdTech tools in their instruction, reducing their workload so they can direct their valuable time toward individualized student instruction.
- EdTech developers can integrate AI more responsibly and seamlessly by utilizing SADIE’s middle layer to manage safety, accessibility, and privacy needs.
Resources
- Podcast: Leveraging Technology to Support Students with Disabilities
- Accessible data framework
Associated fields of research
Associated SRI team members
- Shari Dubos, Principal Education Researcher, SRI Education
- Jennifer Nakamura, Senior Researcher, SRI Education
- Sophia Ouyang, Education Research Associate, SRI Education
ICS team members:
- Aaron Spaulding
- Andy Poggio
- Emre Yilmaz
- Laura Tam
- Ran Chen
- Sarah Bakst



