
Peer-Reviewed Study Measures Real-World Impact of EdTech

By Bogdan Yamkovenko, Head of Efficacy Research
Teachers matter enormously when it comes to how students use learning tools. In real classrooms, students usually don’t decide on their own how much time to spend on a platform like Khan Academy. Those choices are shaped by teachers, school expectations, and the broader learning environment. That reality makes a common research question surprisingly hard to answer: Does using an edtech tool actually cause students to learn more, or is it just correlated with other factors, like strong teaching or extra support at home?
We partnered with researchers at Stanford University and the University of Toronto on a large, multiyear study to get closer to the answer. Using data from hundreds of U.S. school districts, the research examines how changes in Khan Academy usage over three years relate to changes in students' math test scores. The results have been published in Proceedings of the National Academy of Sciences (PNAS).
Why simple usage–outcome correlations are misleading
There is almost always a correlation between using an edtech tool and test scores, and it can arise for a few different reasons. The tool may genuinely help students learn. Stronger students may use these tools more and also score higher on tests. Or a teacher may have modified their instruction to include more edtech in combination with other effective instructional practices. In other words, correlation is noisy: it does not guarantee that the tool is effective.
In general, that noise can come from four sources. The first is stable student differences (e.g., motivation). The second is time-varying, student-related changes (e.g., a student has a tutor this year but didn't last year). The third is stable teacher characteristics (e.g., teaching style). The fourth is time-varying, teacher-related changes (e.g., the teacher received professional development this year but not last year).
Here is how we think about these issues in our study:
Accounting for the real world in our efficacy research
First, because we have multiple observations for each student, we can account for stable student characteristics through student fixed effects. We have done this in previous studies; this time we went further.
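To make this concrete, here is a minimal Python sketch of what student fixed effects accomplish. The column names (student_id, score, usage) and the numbers are hypothetical, not from the study; the point is that demeaning each variable within a student cancels out anything stable about that student.

```python
import pandas as pd

# A minimal sketch, not the study's code: demean outcome and usage
# within each student, so stable traits (motivation, baseline
# ability) drop out and only within-student changes remain.
def within_student_demean(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    out = df.copy()
    for c in cols:
        out[c] = df[c] - df.groupby("student_id")[c].transform("mean")
    return out

# Hypothetical two-year panel: student 2 is stronger overall, but
# after demeaning, both students show the same within-student changes.
df = pd.DataFrame({
    "student_id": [1, 1, 2, 2],
    "score": [50, 60, 80, 90],
    "usage": [5, 15, 10, 20],
})
print(within_student_demean(df, ["score", "usage"]))
```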
Second, instead of looking at each individual student's time spent on Khan Academy, we compute the average time spent on Khan Academy among all of the students in a particular classroom in a given year, leaving that individual student's time out. For example, if a student is part of a 30-student classroom, that student's usage is proxied by the mean usage of the remaining 29 students. Because each student's own exposure to Khan Academy is not directly used, this approach substantially reduces bias arising from individual student characteristics that may change over time. In essence, Khan Academy use is measured at the level of the classroom, not the individual student.
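Here is a minimal pandas sketch of that leave-one-out average. Again, the column names (student_id, classroom_id, year, minutes) are hypothetical; the logic is simply to subtract each student's own minutes from the classroom total before averaging.

```python
import pandas as pd

# A minimal sketch, assuming one row per student per year. For each
# row, the proxy is the classroom-year mean computed over everyone
# except that student. (Single-student classrooms would need special
# handling, since n - 1 is zero there.)
def leave_one_out_mean(df: pd.DataFrame) -> pd.Series:
    grp = df.groupby(["classroom_id", "year"])["minutes"]
    total = grp.transform("sum")    # classroom-year total minutes
    n = grp.transform("count")      # classroom-year size
    return (total - df["minutes"]) / (n - 1)

# Hypothetical 3-student classroom: student 1's proxy is (20 + 30) / 2 = 25.
df = pd.DataFrame({
    "student_id": [1, 2, 3],
    "classroom_id": ["A", "A", "A"],
    "year": [2022, 2022, 2022],
    "minutes": [10.0, 20.0, 30.0],
})
df["loo_usage"] = leave_one_out_mean(df)
print(df)
```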
Third, because we have multiple observations per teacher, we can account for stable teacher characteristics with teacher fixed effects. In effect, we compare each teacher's classroom usage in a given year against that teacher's own average over time, so stable differences between teachers drop out of the relationship with student test scores.
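Putting the pieces together, a regression along these lines can be sketched with statsmodels on a synthetic panel. This illustrates the general two-way fixed-effects setup, not the study's actual specification; every column name and number below is made up.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 40 students observed over 3 years, with a true
# usage effect of 0.1 built in. C(...) adds a dummy (fixed effect)
# for each student, teacher, and year.
rng = np.random.default_rng(0)
n_students, years = 40, [2021, 2022, 2023]
panel = pd.DataFrame({
    "student_id": np.repeat(np.arange(n_students), len(years)),
    "year": np.tile(years, n_students),
})
panel["teacher_id"] = rng.integers(0, 4, len(panel))   # toy teacher assignment
panel["loo_usage"] = rng.uniform(0, 30, len(panel))    # leave-one-out proxy
panel["score"] = 0.1 * panel["loo_usage"] + rng.normal(0, 1, len(panel))

model = smf.ols(
    "score ~ loo_usage + C(student_id) + C(teacher_id) + C(year)",
    data=panel,
).fit()
print(model.params["loo_usage"])  # recovers roughly 0.1 on this toy data
```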
That leaves time-varying factors that could drive both changes in teacher usage and student outcomes. This is the core issue we explore in the study. We argue that teachers likely change their usage because of factors that are not also driving changes in student learning outcomes (e.g., shifts in school policy, receiving new professional development on how to use the tool, and a teacher's own beliefs about the effectiveness of a particular edtech platform). Of course, teachers don't change their classroom practices at random. What matters is whether increases in Khan Academy usage tend to happen in years when something else also changed that would have raised test scores anyway.
We conduct several analyses to test this assumption. First, we look at how similar changes in usage are across teachers in the same school, and we find strong similarities. This suggests that changes in usage are driven more by school-level policies than by individual teacher efforts. Second, we look at the relationship between changes in math usage and reading outcomes and find none, which suggests that these changes do not reflect broader shifts in teaching quality that would affect multiple subjects. Finally, instead of computing a classroom average of Khan Academy usage for each teacher in a given year, we compute the school average and examine how changes in it relate to student test scores. The results are essentially the same as when we use a teacher's classroom average.
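The second of those checks, the placebo on reading outcomes, is easy to illustrate. In this hypothetical sketch, reading scores are generated with no connection to math usage, so the regression should return a coefficient near zero; a real placebo test works the same way but on actual data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# A minimal placebo sketch on synthetic data: reading scores are
# unrelated to math usage by construction, so the usage coefficient
# should be statistically indistinguishable from zero.
rng = np.random.default_rng(1)
panel = pd.DataFrame({
    "student_id": np.repeat(np.arange(40), 3),
    "year": np.tile([2021, 2022, 2023], 40),
})
panel["teacher_id"] = rng.integers(0, 4, len(panel))
panel["loo_usage"] = rng.uniform(0, 30, len(panel))
panel["reading_score"] = rng.normal(0, 1, len(panel))

placebo = smf.ols(
    "reading_score ~ loo_usage + C(student_id) + C(teacher_id) + C(year)",
    data=panel,
).fit()
print(placebo.params["loo_usage"], placebo.pvalues["loo_usage"])
```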
What does this mean? When a teacher increases their use of Khan Academy in the classroom, students tend to see better learning outcomes that year than in the previous one. Likewise, when a student moves from a classroom or school with low Khan Academy usage to one with high usage, the student's test scores improve. Given our results and the robustness checks we conduct, this effect is much more likely to come from practicing on Khan Academy than from other factors like teacher quality, peer effects, or individual student differences.
Why is this study different?
PNAS is one of the top academic journals in the world. It does not typically publish edtech efficacy studies; it publishes rigorous and novel research with implications for policy and the scientific community. Its peer-review process is highly selective, with only a 13% acceptance rate.
This study was published because of its thoughtful and rigorous methodology, impartial approach to the analysis, real-world context, and important finding. The data come from hundreds of districts with roughly 200,000 students across the United States, where Khan Academy use averages 10–15 minutes per week. Even at those levels of usage, which are below our recommended 30 minutes per week, and after applying stringent controls, Khan Academy has a significant effect on learning outcomes.