How the Q-ODM impact model is a more cost-effective form of the quasi-experimental design (QED)

The Quality-Outcomes Design and Methods (Q-ODM) approach to program evaluation increases the use value of all estimates produced as part of an impact analysis. Put simply: We replace the “no-treatment” counterfactual condition (i.e., children who were not exposed to an afterschool program) with low-implementation conditions (e.g., children who were exposed to lower-quality instructional practices in an afterschool program) in order to describe the impact of optimal implementation on child outcomes (e.g., socio-emotional skill change, equity effects).  Said again: The “control group” in our impact model is any quality profile, subgroup configuration, or pathway (e.g., low-quality practices profile) that is contrasted with an optimal “treatment” group (e.g., high-quality practices profile).[1]

The “Analytic Tools” section of White Paper 3 provides an introductory discussion of Q-ODM impact models for student skill and equity outcomes. Also, check out this UK impact evaluation.

Now, let’s talk about three reasons why our approach is a cost-effective choice for CEOs seeking evidence about impact and equity outcomes:

Lots of Reality-Based Estimates that Analogize to Action. Our point about cost effectiveness is this: Every estimate produced in this impact model is useful. When coupled with QTurn measures, Q-ODM impact estimates are interpretable in terms of specific adult and child behaviors and contexts. This means that there is a direct analogy from meaning encoded in the data to meaningful teacher and student behavior that occurs in the classroom – a direct analogy from data to reality. The data used to identify the lower-quality profile actually identifies the lower-quality settings! The amount of skill change that occurs in the high-quality setting actually demonstrates what’s possible in the program; that is, it sets the benchmark for other programs.

An impact estimate implies subtracting one magnitude from another. What use is a counterfactual estimate if there is no such thing as a counterfactual condition? Doesn’t that just mean that we are subtracting an imaginary quantity from a real one?

Using Natural Groupings to Address Threats to Valid Inference. It’s not just the usefulness of estimates (consequential validity); we argue the approach is also a more valid way to rule out primary threats to the validity of the inference that the treatment caused an effect. The children in the low-quality group are more likely to be similar to the children in the high-quality group for all of the right reasons (i.e., SEL histories) that are missed by most efforts at matching individuals or groups using demographic and education data.

The case that families in one group have more education-relevant resources (e.g., SEL histories) than families in the other group plays out in two ways. When families have unmeasured resources before the child attends, we are talking about selection effects. When families use those unmeasured resources during the program intervention, we are talking about history effects. We argue, and present evidence, that the Q-ODM method better addresses these threats to valid inference about impact than the pernicious and unethical use of race/ethnicity and social-address variables as covariates – pretend “controls” – in linear models.

Capturing Full Information from Small Samples. Our method is designed to detect differences in the ways things go together in the real world, in or around the average expectable environments characterizing human development and socialization (cf. Magnusson, 2003). This in-the-world structure is a constraint on the states that can and cannot occur during development. In the pattern-centered frame, small cell sizes indicate the sensitivity of the approach. Relatively low Ns are not necessarily a problem for the distribution-free statistical tests used in pattern-centered impact analyses.
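As a minimal sketch of what such a distribution-free test can look like, the example below applies Fisher’s exact test to a small, hypothetical cross-tabulation of quality profile (high vs. low) by skill growth (grew vs. did not grow). The counts are invented for illustration and are not QTurn results; the point is only that exact tests remain valid at cell sizes where linear-model assumptions become suspect.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table (counts invented for illustration):
#                         grew   did not grow
# high-quality profile      9          3
# low-quality profile       3          9
table = [[9, 3], [3, 9]]

# Fisher's exact test makes no distributional assumptions and is
# appropriate for the small cell sizes typical of pattern-centered
# subgroups (N = 24 here).
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.3f}")
```

The test simply asks how unlikely the observed clustering of growth within the high-quality profile would be if growth were unrelated to profile, given the table margins, which is exactly the contrast the Q-ODM impact model draws between low- and high-implementation conditions.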

[1] We realize that others would claim that our designs are not QED at all. We delve deeper into the rationales used to disqualify “groups that receive different dosages of a treatment” from being considered “control groups” within the context of experimental design in White Paper 4.

Why are Q-ODM’s Pattern-Centered Methods (PCM) More Realistic and Useful for Evaluators?

Pattern-centered theory and methods (PCM) can be used to tell simple and accurate stories about how real persons grow in real school and afterschool classrooms. Stories about the quality and outcomes (i.e., causes and effects) that are modeled using PCM are particularly useful because they can address questions related to “how” programs and classrooms work and “how much” children grow skills.

Most training for education researchers and evaluators is focused on variable-centered methods (VCM), also called linear statistical methods (regression, the analysis of variance, and structural equation modeling) or the general linear model. VCM are powerful in cases where the causes and effects are similar across individuals and classrooms. In cases where that’s not true – which is most school and afterschool classrooms – VCM designs tend to provide information that means practically nothing about the actual people or contexts involved. Some of the basic issues have been summarized nicely by Todd Rose in the following TEDx presentation: https://youtu.be/4eBmyttcfU4 (“The Myth of Average”), but the critique is not new.

To better illustrate the point, let’s talk about three basic assumptions about the person-reality in afterschool classrooms and how PCM applies:

A person’s socio-emotional skills are most accurately represented as a pattern with multiple skills indicated simultaneously. This is not just about more information from more variables, although that is also a fundamental advantage of pattern-centered methods. The neuroperson is also a “multilevel system” – which is a mouthful but, as detailed in White Paper 1: Different parts of mental skill change for different reasons, on different timelines, and cause different types of behavior! This means different amounts and types of cause are involved in changing any mental skill or behavior. How could the one-variable-at-a-time constraints of VCM ever do an adequate job of representing socio-emotional skill? PCM are uniquely fit for sorting out multilevel causal dynamics so that the full meaning encoded in the data can emerge.

Change in socio-emotional skill is always qualitative, from one pattern to a different pattern at a later time point. Given the multilevel nature of socio-emotional skills, the combination of skill parts is likely to differ at different time points and in different settings. The fact that skills turn into different skills as they change has been an Achilles heel for VCM. Check out the “Analytic Tools” section of White Paper 3 to see how PCM can be applied to (a) identify each individual’s unique pattern of skill parts at different points in time and then (b) compare across those qualitatively different patterns to detect stability, growth, or decline for each individual. When coupled with the sensitivity of optimal skill measures (see White Paper 2), PCM are ideal for describing the how (e.g., an individual child’s movement from one pattern to a subsequent pattern) and how much (e.g., how many children grew) of skills-change over short time periods, such as a semester or school year.
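To make steps (a) and (b) concrete, here is a minimal, hypothetical sketch of the comparison across time points: each child carries a profile label at two time points, and each qualitative transition is classified as growth, stability, or decline. The profile names and their ordering are invented for illustration and are not QTurn’s actual skill categories.

```python
from collections import Counter

# Hypothetical ordering of qualitatively different skill profiles
# (labels invented for illustration only).
ORDER = {"struggling": 0, "emerging": 1, "proficient": 2}

# Hypothetical (time-1 profile, time-2 profile) pairs for ten children.
transitions = [
    ("struggling", "emerging"),  ("emerging", "emerging"),
    ("emerging", "proficient"),  ("proficient", "proficient"),
    ("struggling", "struggling"), ("emerging", "struggling"),
    ("struggling", "emerging"),  ("proficient", "emerging"),
    ("emerging", "proficient"),  ("struggling", "proficient"),
]

def classify(t1, t2):
    """Label one child's qualitative change across the two time points."""
    if ORDER[t2] > ORDER[t1]:
        return "growth"
    if ORDER[t2] < ORDER[t1]:
        return "decline"
    return "stability"

# Count how many children grew, held steady, or declined.
counts = Counter(classify(t1, t2) for t1, t2 in transitions)
print(dict(counts))
```

The per-child classification answers the “how” question (which pattern each child moved from and to), and the tally answers the “how much” question (how many children grew) without ever averaging qualitatively different patterns together.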

The same classroom causes different patterns of change for different subgroups of children. An adage from mid-20th century psychology (Kluckhohn and Murray, 1948, p. 35) is a helpful reminder: Any individual can, for different causal variables, be simultaneously like all others, like some others, or like no others. VCM work only in the first case, where every person experiences a very similar type of cause and effect. Case-study and qualitative methods are preferred in the third case, where the causes and effects may apply only to a single person. PCM are uniquely fit for the second case; that is, where different subgroups of children with different socio-emotional histories have qualitatively different types of responses to the same education settings.

In the end, VCM assumptions about the validity of single variables, the quantitative nature of skill change, and the homogeneity of causal dynamics lead to an impoverished view of reality – and likely a lot of inaccurate conclusions about what to do.

Introduction to White Paper 3

Greetings friends! In this third White Paper, Realist(ic) Evaluation Tools for OST Programs: The Quality-Outcomes Design and Methods (Q-ODM) Toolbox, we extend from the neuroperson framework for socio-emotional skills to a focus on evaluation design and impact evidence. Focusing on the methods used to evaluate out-of-school time (OST) programs and to assess the impact on student skill growth is a critical issue, especially given the ambiguity about impacts from gold-standard evaluations of publicly funded afterschool programs. Are programs producing weak or no effects? Or, are gold-standard designs missing something?

We offer a sequence of evaluation questions that chart the course to realistic evidence about quality and outcomes (i.e., cause and effect, or “how” and “how much”) – evidence that is useful to managers, teachers, coaches, and evaluators. We’ve learned these questions over the past two decades by asking tens of thousands of afterschool, early childhood, and school-day teachers how data and results about their own work work best for them.

Getting the evaluation questions right calls for measurement and analytics tools that:

…reflect the assumption that children have mental skills that are causes of their behavior. These mental skills are conceived of as several different aspects of mental functioning (i.e., schemas, beliefs, & awareness) that exist within every biologically-intact person, enable behavioral skills, and can be assessed, more or less accurately, using properly-aligned measures. When the parts and patterns of skill are reflected in theory and measures, the accuracy and meaningfulness of data about program quality and SEL skill – and all subsequent manipulations and uses of the data – are dramatically improved.

Our thinking is deeply anchored in pattern- and person-centered science. Check out a related blog here: Why are Q-ODM’s Pattern-Centered Methods (PCM) More Realistic and Useful?

Finally, we provide data visualization examples that complete an unbroken chain of encoded meaning, from the observation of students’ socio-emotional skills in an afterschool classroom, to the decoding of the data visualization by an end-user. We’re pleased to share these insights. Cheers!

P.S. For CEOs who need impact evidence: Why are gold-standard designs not as cost-effective as we might think? Elsewhere, we have argued that gold-standard designs for afterschool programs are misspecified models because they lack key moderator and mediator variables (e.g., instructional quality and socio-emotional skills). For example, the large impacts (often equity effects, as predicted by the neuroperson framework) that we typically find for students who start programs with lower socio-emotional skills but who receive high-quality instruction cannot be detected using most gold-standard designs. As a result, it is difficult (or impossible) to analogize from the results of gold-standard designs to the real actions taken by real people; thus, those designs are not very cost effective for improvement or for telling compelling stories about impact. Check out a related blog here: How the Q-ODM impact model is a more cost-effective form of the quasi-experimental design (QED).