
Doing Something Truly Original in the Music of Program Evaluation

Is it possible to do something truly original in science?

How about in Quant evaluations in the social sciences?

The operative word here is "truly". I have in mind contributions that are "outside the box".

I would argue that standard Quant provides limited opportunity for originality. Yet, QuantCrit forces us to dig deep to arrive at original solutions - to reinterpret, reconfigure, and in some cases reinvent Quant approaches.

That is, I contend that QuantCrit asks the kinds of questions that force us to go outside the box of conventional assumptions and develop instrumentation and solutions that are broader and better. Yet, I qualify this by saying (and some will disagree) that doing so does not require us to give up the core assumptions at the foundation of Quant evaluation methods.

I find that developments and originality in jazz music closely parallel what I have in mind in discussing the evolution of genres in Quant evaluations, and what it means to conceive of and address problems and opportunities outside the box. (You can skip this section, and go straight to the final thoughts, but I would love to share my ideas with you here.)

An Analogy for Originality in the Artistry of Herbie Hancock

Last week I took my daughter, Maya, to see the legendary keyboardist Herbie Hancock perform live with Lionel Loueke, Terence Blanchard, and others. CHILLS along my spine is how I would describe it. I found myself fixating on Hancock's hand movements on the keys, and how he swiveled between the grand piano and the KORG synthesizer, and asking: "the improvisation is on-point all the time – how does he know how to go right there?"

Hancock, winner of an Academy Award and 14 Grammys, is a (if not the) major force in the evolution of jazz over the last 60 years, up to the contemporary scene.

His main start was in the 1960s as the pianist in Miles Davis' Second Great Quintet. (When Hancock was dispirited, Davis famously advised him: "don't play the butter notes".) Check out this performance by the band of Wayne Shorter's composition "Footprints" from 1967 – note the symbiosis among the group and Hancock's respectful treatment of the melody.

In the 1970s Hancock developed styles of jazz fusion and funk with the Headhunters (e.g., "Chameleon").

Then in the 1980s Hancock explored electro styles, capped by the song "Rockit" – a smash that straddled jazz, pop, and hip-hop. It featured turntable scratching and became a mainstay for breakdancing (in upper elementary school I co-created a truly amateurish school play that ended in an ensemble "Rockit" dance with the best breakdancers in our school). Here's Hancock's Grammy performance.

Below is a picture of Hancock from the other night with the strapped-on synth popularized through the song "Rockit".

Hancock and Synth

Hancock did plenty more besides what I mention here, but I narrowed his contributions to just a couple to help me make my point.

His direction, especially with funk fusion and "Rockit", ruffled the feathers of more than a few jazz purists. He did not mind. His response was: "I have to be true to myself… it was something that I needed to do… because it takes courage to work outside the box… and yet, that's where the growth lies".

He also recognized that the need for progression was not just to satisfy his creative direction, but to keep the audience listening; that is, for the music, jazz, to stay alive and relevant. If someone asserts that "Rockit" was a betrayal of jazz that sacrilegiously crossed over into pop and hip-hop, I would counter-argue that it opened up the world of jazz to a whole generation of pop listeners (including me). (I recognize similar developments in Robert Glasper's more recent genre-crossing work.)

Hancock is a perfect case study of an artist executing his craft (a) fearlessly, (b) not with the goal of pleasing everyone, (c) with the purpose of connecting with and reaching new audiences, (d) by being open to alternative influences, (e) to achieve a harmonious melodic fusion (moving between his KORG synth and a grand piano), and (f) with constant appreciation of, and reflection on, the roots and fundamentals.

Hancock and Band

Coming Back to the Idea of the Fusion of Quant with QuantCrit in Program Evaluation

Society today presents us with situations that require critical examination of how we use the instruments on which we are trained, and an audit of the effect they have, both intended and unintended. It also requires that we adapt the applications of methods that we have honed for years. The contemporary situation poses the question: How can we expand the range of what we can do with the instruments on which we are trained, given the solutions that society needs today, recognizing that any application has social ramifications? I have in mind the need to prioritize problems of equity and social and racial justice. How do we look past conventional applications that limit the recognition, articulation, and development of solutions to important and vexing problems in society?

Rather than feeling powerless and overwhelmed, the Quant evaluator is very well positioned to do this work. I greatly appreciate the observation by Frances Stage on this point:

"…as quantitative researchers we are uniquely able to find those contradictions and negative assumptions that exist in quantitative research frames"

This is analogous to saying that a dedicated pianist in classic jazz is very well positioned to expand the progressions and reach harmonies that reflect contemporary opportunities, needs, and interests. It may also require the Quant evaluator to expand their arrangements and instrumentation.

As Quant researchers and evaluators, we are most familiar with the "rules of playing" that reinforce "the same old song" needing to be questioned. QuantCrit can give us the momentum to push the limits of our instruments and apply them in new ways.

In making these points I feel a welcome alignment with Hancock's approach: recognizing the need to break free from cliché and convention, to keep meaningful discussion going, to maximize relevance, to get to the core of evaluation purpose, to reach new audiences and seed/facilitate new collaborations.

Over the next year I'll be posting a few creations, and striking in some new directions, with syncopations and chords that try to maneuver around and through the orthodoxy – "switching up" between the "KORG and the baby grand" so to speak.

Please stay tuned.

The Band on Stage

2024-10-15

Classrooms and Districts: Breaking Down Silos in Education Research and Evidence

I just got back from Edsurge’s Fusion conference. The theme, aimed at classroom and school leaders, was personalizing classroom instruction. This is guided by learning science, which includes brain development and the impact of trauma, as well as empathetic caregiving, as Pamela Cantor beautifully explained in her keynote. It also leads to detailed characterizations of learner variability being explored at Digital Promise by Vic Vuchic’s team, which is providing teachers with mappings between classroom goals and tools and strategies that can address learners who vary in background, cognitive skills, and socio-emotional character.

One of the conference tracks that particularly interested me was the workshops and discussions under "Research & Evidence". Here is where I experienced a disconnect between Empirical's policy-oriented research work interpreting ESSA and Fusion's focus on improving the classroom.

  • The Fusion conference is focused at the classroom level, where teachers along with their coaches and school leaders are making decisions about personalizing the instruction to students. They advocate basing decisions on research and evidence from the learning sciences.
  • Our work, also using research and evidence, has been focused on the school district level, where decisions are about procurement and implementation of educational materials, including, for example, the technical infrastructure needed for edtech products.

While the classroom and district levels have different needs and resources and look to different areas of scientific expertise, they need not form conceptual silos. But the differences need to be understood.

Consider the different ways we look at piloting a new product.

  • The Digital Promise edtech pilot framework attempts to move schools toward a more planful approach by getting them to identify and quantify the problem for which the product being piloted could be a solution. Success in the pilot classrooms is evaluated by the teachers, whose detailed understanding of their own classrooms doesn't call for statistical comparisons. Their framework points to tools such as the RCE Coach that can help with the statistics to support local decisions.
  • Our work looks at pilots differently. Pilots are excellent for understanding implementability and classroom acceptance (and working with developers to improve the product), but even with rapid cycle tools, the quantitative outcomes are usually not available in time for local decisions. We are more interested in how data can be accumulated nationally from thousands of pilots so that teachers and administrators can get information on which products are likely to work in their classrooms given their local demographics and resources. This is where review sites like Edsurge product reviews or Noodle's ProcureK12 could be enhanced with evidence about for whom, and under what conditions, the products work best. With over 5,000 edtech products, an initial filter to help choose what a school should pilot will be necessary.

A framework that puts these two approaches together is promulgated in the Every Student Succeeds Act (ESSA). ESSA defines four levels of evidence, based on the strength of the causal inference about whether the product works. More than just a system for rating the scientific rigor of a study, it is a guide to developing a research program with a basis in learning science. The base level says that the program must have a rationale. This brings us back to the Digital Promise edtech pilot framework needing teachers to define their problem. The ESSA level 1 rationale is what the pilot framework calls for. Schools must start thinking through what the problem is that needs to be solved and why a particular product is likely to be a solution. This base level sets up the communication between educators and developers about not just whether the product works in the classroom, but how to improve it.

The next level in ESSA, called "correlational," is considered weak evidence because it shows only that the product has "promise" and is worth studying with a stronger method. However, this level is far more useful than that characterization suggests: it gives developers a way to gather information about which parts of the program are driving student results, and which patterns of usage may be detrimental. Schools can see if there is an amount of usage that maximizes the value of the product (rather than depending solely on the developer's rationale). This level 2 calls for piloting the program and examining quantitative results. To get correlational results, the pilot must have enough students and may require going beyond a single school. This is a reason that we usually look for a district's involvement in a pilot.
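
To make the "amount of usage that maximizes value" idea concrete, here is a minimal sketch, in Python, of the kind of dose-response look a pilot dataset might support. The file name, column names (pre_score, post_score, usage_minutes), and the quadratic usage term are illustrative assumptions, not a prescribed method.

# Hypothetical sketch only: a dose-response look at pilot usage data.
# File name and column names (pre_score, post_score, usage_minutes) are
# illustrative assumptions, as is the quadratic functional form.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pilot_usage.csv")          # one row per student in the pilot
df["usage_sq"] = df["usage_minutes"] ** 2    # allow for diminishing returns

# Regress the outcome on usage (linear + squared), adjusting for the pretest.
model = smf.ols("post_score ~ usage_minutes + usage_sq + pre_score", data=df).fit()
print(model.summary())

b1, b2 = model.params["usage_minutes"], model.params["usage_sq"]
if b2 < 0:
    # Usage level where the fitted curve peaks (interpret only within the
    # range of usage actually observed in the pilot).
    print(f"Predicted outcome peaks near {-b1 / (2 * b2):.0f} minutes of usage")

A correlation like this cannot support a causal claim, but it is exactly the kind of result that tells a developer and a district where to look next.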

The top two levels in the ESSA scheme involve comparisons of students and teachers who use the product to those who do not. These are the levels where it begins to make sense to combine a number of studies of the same product from different districts in a statistical process called meta-analysis so we can start to make generalizations. At these levels, it is very important to look beyond just the comparison of the program group and the control group and gather information on the characteristics of schools, teachers, and students who benefit most (and least) from the product. This is the evidence of most value to product review sites.
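
As one way to picture the "combine a number of studies" step, here is a minimal sketch of a random-effects meta-analysis (DerSimonian-Laird) pooling per-district effect sizes for the same product. The effect sizes and variances below are made-up placeholders, not real results.

# Hypothetical sketch: random-effects meta-analysis (DerSimonian-Laird)
# pooling effect sizes from several district studies of the same product.
# The numbers below are made-up placeholders, not real results.
import numpy as np

effects = np.array([0.12, 0.25, 0.05, 0.30, 0.18])    # per-district effect sizes (e.g., Hedges' g)
variances = np.array([0.02, 0.03, 0.01, 0.04, 0.02])  # their sampling variances

# Fixed-effect weights and the heterogeneity statistic Q.
w = 1.0 / variances
mean_fe = np.sum(w * effects) / np.sum(w)
Q = np.sum(w * (effects - mean_fe) ** 2)
k = len(effects)

# DerSimonian-Laird estimate of the between-district variance tau^2.
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

# Random-effects pooled estimate and its standard error.
w_re = 1.0 / (variances + tau2)
mean_re = np.sum(w_re * effects) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))

print(f"Pooled effect: {mean_re:.3f} (SE {se_re:.3f}), tau^2 = {tau2:.3f}")

A real analysis would extend this with moderator variables (a meta-regression) to ask for whom and under what conditions the product works best – exactly the kind of evidence the review sites mentioned above could surface.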

When it comes to characterizing schools, teachers, and students, the “classroom” and the “district” approach have different, but equally important, needs.

  • The Learner Variability Project has very fine-grained categories that teachers are able to establish for the students in their class.
  • For generalizable evidence, we need characteristics that are routinely collected by the schools. To make data analysis for efficacy studies a common occurrence, we have to avoid expensive surveys and testing of students that are used only for the research. Furthermore, the research community must reach consensus on a limited number of variables that will be used in research. Fortunately, another aspect of ESSA is the broadening of routine data collection for accountability purposes, so that information on improvements in socio-emotional learning or school climate will be usable in studies.

Edsurge and Digital Promise are part of a west coast contingent of researchers, funders, policymakers, and edtech developers that has been discussing these issues. We look forward to continuing this conversation within the framework provided by ESSA. When we look at the ESSA levels as not just vertical but building out from concrete classroom experience to more abstract and general results from thousands of school districts, then learning science and efficacy research are combined. This strengthens our ability to serve all students, teachers, and school leaders.

2018-10-08