For Quasi-Experiments on the Efficacy of Edtech Products, It Is a Good Idea to Use Usage Data to Identify Who the Users Are
With edtech products, usage data allow precise measurement of exposure and of whether critical elements of the product were implemented. Providers often specify the amount of exposure or the kind of usage required to make a difference. Furthermore, educators often want to know whether the program has an effect when implemented as intended. Researchers can readily use the data generated by the product (usage metrics) to identify compliant users, or to measure the kind and amount of implementation.
Because researchers generally track product implementation, and because statistical methods allow adjustment for implementation differences, it is possible to estimate the impact on successful implementers, or, technically, on the subset of study participants who complied with treatment. It is very important, however, that the criteria researchers use in setting a threshold be grounded in a model of how the program works. Such a model points, for example, to critical components that can be referenced when specifying compliance. Without a clear rationale for a threshold set in advance, the researcher may appear to be “fishing” for the amount of usage that produces an effect.
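The compliance rule described above can be sketched in code. This is a minimal illustration, not any particular product's pipeline: the field names, the session threshold, and the “core component” flag are all hypothetical stand-ins for criteria a researcher would pre-register based on the program's logic model.

```python
# Minimal sketch: flag "compliers" against a rule specified *before*
# looking at outcomes. Field names and thresholds are illustrative.

# Hypothetical pre-registered rule: at least 20 sessions AND use of
# the product's core component (e.g., its practice module).
MIN_SESSIONS = 20

usage_logs = [
    {"student_id": "s1", "sessions": 31, "used_core_component": True},
    {"student_id": "s2", "sessions": 12, "used_core_component": True},
    {"student_id": "s3", "sessions": 25, "used_core_component": False},
    {"student_id": "s4", "sessions": 40, "used_core_component": True},
]

def is_complier(record):
    """Apply the pre-specified rule; never tune it after seeing outcomes."""
    return record["sessions"] >= MIN_SESSIONS and record["used_core_component"]

compliers = [r["student_id"] for r in usage_logs if is_complier(r)]
print(compliers)  # ['s1', 's4']
```

The point of fixing `MIN_SESSIONS` and the component requirement in advance is exactly the one made above: the same code run with a threshold chosen after inspecting outcomes would be indistinguishable from “fishing.”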
Some researchers reject comparison studies in which the treatment group is identified after product implementation has begun. This objection rests in part on the concern that the subset of users who comply with the suggested amount of usage will get more exposure to the program, and that more exposure will result in a larger effect. This assumes, of course, that the product is effective; otherwise the students and teachers will have been wasting their time and will likely perform worse than the comparison group.
There is also the concern that the “compliers” may differ from the non-compliers (and non-users) on some characteristic that isn’t measured, and that even after controlling for measurable variables (prior achievement, ethnicity, English proficiency, etc.), some personal characteristic could make an otherwise ineffective program effective for them. We reject this concern and take the position that a product’s effectiveness can be strengthened or weakened by many factors. A researcher conducting any matched comparison study can never be certain that there isn’t an unmeasured variable biasing the result. (That’s why the What Works Clearinghouse accepts quasi-experiments only “with reservations.”) However, we believe that as long as the QE controls for the major factors known to affect outcomes, the study can meet the Every Student Succeeds Act requirement that the researcher “controls for selection bias.”
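“Controlling for measurable variables” in a QE is commonly done by including them in a regression alongside the treatment indicator. The following is a minimal sketch of that adjustment using ordinary least squares on synthetic data; the single covariate (prior achievement), the simulated effect size, and all numbers are invented for illustration, not drawn from any study.

```python
import numpy as np

# Minimal sketch of covariate adjustment in a matched-comparison (QE)
# analysis: regress the outcome on a treatment indicator plus a
# measured covariate (prior achievement). All data are synthetic.
rng = np.random.default_rng(0)
n = 200
prior = rng.normal(50, 10, n)                 # prior achievement score
treated = (rng.random(n) < 0.5).astype(float) # 1 = used the product
# Simulated truth: outcome driven by prior achievement plus a
# treatment effect of 3 points, with noise.
outcome = 0.8 * prior + 3.0 * treated + rng.normal(0, 5, n)

# Design matrix: intercept, treatment indicator, covariate.
X = np.column_stack([np.ones(n), treated, prior])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(coef[1])  # adjusted treatment-effect estimate (should be near 3)
```

The adjustment removes the part of the treatment–comparison difference explained by the measured covariate; as the text notes, it cannot rule out bias from variables that were never measured.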
With those caveats, we believe that a QE that identifies users by their compliance with a pre-specified level of usage is a good design. Studies that examine the measurable variables that modify a product’s effectiveness are not only useful for schools in answering their question, “Is the product likely to work in my school?” but also point the developer and product marketer to ways the product can be improved.