30 July 2020

13. Proving impact: Causality, attribution and contribution

 

Summary: Does measuring impact mean that you have to prove that this impact was caused by a specific intervention? If so, how does one establish a causal link? The complexity of interactions within a society can render it difficult to attribute such a link, in other words, to ascribe “a causal link between observed (or expected to be observed) changes and a specific intervention” (OECD, 2002). For this reason, other methods limit themselves to establishing a contributory relationship, wherein the attribution is expressed in more modest terms, such as: “in light of the multiple factors influencing a result, […] the intervention made a noticeable contribution to an observed result” (Mayne, 2012, p. 273).

Causality, attribution and randomized controlled trials

Causality refers to a “link that unites a cause with an effect” (Hutchinson, 2018). For McDavid and Hawthorn, three conditions must be met to establish a causal relationship:

(1) the [intervention] has to precede the observed outcome,

(2) the presence or absence of the [intervention] has to be correlated with the presence or absence of observed outcome, and

(3) there cannot be any other plausible rival explanatory factors that could account for the correlation between the [intervention] and the outcome.

(McDavid & Hawthorn, 2006, p. 25)

The use of a counterfactual makes it possible to establish a causal link by respecting these three conditions (Menzies, 2014).



This approach can be summarized as follows:

The idea of (quasi-) experimental counterfactual analysis is that the situation of a participant group (receiving benefits from/affected by an intervention) is compared over time with the situation of an equivalent comparison group that is not affected by the intervention. (Leeuw & Vaessen, 2009, p. 22)

In practice, this approach is implemented through randomized controlled trials (RCTs). This technique consists of randomly selecting, from an eligible population, an experimental group that will receive an intervention and a control group that will serve as the comparison in order to assess the effect of the intervention (White et al., 2014, p. 1).
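The random selection described above can be illustrated with a short simulation. This is a minimal sketch on invented data, not a description of how any cited study was run: the function name `run_simulated_rct`, the population size, the noise model and the "true effect" of 0.5 are all assumptions made for the example.

```python
import random
from statistics import mean


def run_simulated_rct(population_size=2000, true_effect=0.5, seed=42):
    """Simulate a randomized controlled trial on hypothetical data."""
    rng = random.Random(seed)

    # Hypothetical eligible population, identified by integer IDs.
    eligible = list(range(population_size))

    # Random assignment: shuffle the population, then split it in two.
    rng.shuffle(eligible)
    half = population_size // 2
    experimental, control = eligible[:half], eligible[half:]

    # Simulated outcomes (assumption): random noise around a baseline of 0,
    # plus the assumed true effect for those receiving the intervention.
    outcome = {pid: rng.gauss(0.0, 1.0) for pid in control}
    outcome.update(
        {pid: rng.gauss(0.0, 1.0) + true_effect for pid in experimental}
    )

    # Estimated effect: difference between the two group means.
    estimate = (mean(outcome[p] for p in experimental)
                - mean(outcome[p] for p in control))
    return experimental, control, estimate


experimental, control, estimate = run_simulated_rct()
print(f"Estimated effect: {estimate:.2f} (true effect in the simulation: 0.50)")
```

Because assignment is random, the two groups differ only by chance before the intervention, which is why the difference in means can be read as an estimate of its effect. In a real study the outcomes would of course be measured, not generated.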


The challenge of this approach is precisely to estimate a counterfactual, defined by the OECD (2002, p. 19) as the “situation or condition which hypothetically may prevail for individuals, organizations, or groups were there no intervention,” and which is, hence, not observed because there has been an intervention. This situation must therefore be simulated in some way. A number of techniques are used to try to address this challenge. For more information, see the RCT summary sheet.
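One common way to write this problem down is the potential-outcomes notation. The symbols below are assumptions introduced for illustration and do not come from the sources cited here: for each individual i, Y_i(1) is the outcome with the intervention, Y_i(0) is the outcome without it, and T indicates whether the individual actually received the intervention.

```latex
\begin{align*}
\tau_i &= Y_i(1) - Y_i(0)
  && \text{(individual effect; only one of the two terms is ever observed)} \\
\mathrm{ATE} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \right]
  && \text{(average effect over the eligible population)} \\
\widehat{\mathrm{ATE}} &= \bar{Y}_{T=1} - \bar{Y}_{T=0}
  && \text{(difference between the observed group means)}
\end{align*}
```

The unobserved term is the counterfactual. When assignment to the groups is random, the mean outcome of the control group stands in for the average of Y_i(0), so the difference between the group means can be used as an estimate of the average effect.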



Note that it is possible to attribute an effect to a cause without saying that the effect is 100% attributable to that cause. In this case, one must not only be able to establish a causal link but also to estimate the extent to which the observed outcome is due to a given intervention.

The attribution problem is often referred to as the central problem in impact evaluation. The central question is to what extent changes in outcomes of interest can be attributed to a particular intervention. Attribution refers to both isolating and estimating accurately the particular contribution of an intervention and ensuring that causality runs from the intervention to the outcome. (Leeuw & Vaessen, 2009, p. 21)
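The question of "to what extent" can be made concrete with a small calculation. The sketch below uses a difference-in-differences comparison, a technique not named in the sources above and chosen here only for illustration. It assumes that the comparison group is equivalent to the participant group and would have followed the same trend without the intervention. The function name and the employment rates are hypothetical.

```python
def attributable_change(participants_before, participants_after,
                        comparison_before, comparison_after):
    """Difference-in-differences on group averages (in percentage points)."""
    # Change actually observed among participants.
    observed_change = participants_after - participants_before

    # Change in the comparison group, used as a stand-in for the
    # counterfactual: what would have happened without the intervention.
    counterfactual_change = comparison_after - comparison_before

    # Part of the observed change attributed to the intervention.
    attributed = observed_change - counterfactual_change

    # Share of the observed change that is attributed (undefined if no change).
    share = attributed / observed_change if observed_change != 0 else None
    return observed_change, attributed, share


# Hypothetical employment rates (%), invented for illustration.
observed, attributed, share = attributable_change(20.0, 50.0, 20.0, 35.0)
print(observed, attributed, share)  # prints: 30.0 15.0 0.5
```

In this invented example, a simple before-and-after measurement would credit the program with the full 30-point increase, whereas the comparison group suggests that half of it would have occurred anyway.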


Attributing a credible causal link by means of a (quasi-) experimental study is particularly difficult in areas of social intervention, for both methodological and pragmatic reasons, which are closely related.



From a methodological standpoint, a study that seeks to prove (and often quantify) a causal link must, among other things, meet the following conditions:

  1. Homogeneity of the intervention over time and for each participant. For example, the pathway offered by a work insertion enterprise must be the same from one year to the next and from one participant to the next.
  2. The ability to take into account context, external factors and variations among participants and to isolate treatment effectiveness from these other factors. For example, insofar as participants may receive different levels of support from their families, this variable must be used to nuance the analysis of the performance of the insertion program.
  3. Willingness to focus on a rather reductive definition of the success or performance of the intervention. For example, one must look at whether a participant in a work insertion program has found employment within six months after the end of the program, without taking into account the skills and social network he or she may have developed or the effect of this experience on the participant’s family and friends.

From a pragmatic point of view, the conditions to be respected are so demanding that they are rarely met. The conditions include:

  1. Capacity to implement the study in order to generate the relevant data. This can be very expensive, up to 25% of the total program budget according to Zandniapour and Vicinanza (2013).
  2. Ability to exert a maximum degree of control over the environment in order to avoid numerous external variations. This is nearly impossible outside a laboratory setting. As a result, the findings of even a very good RCT-type study will be difficult to generalize to a context other than the one in which the experiment was conducted (low external validity).
  3. No ethical issues in having some participants benefit from an intervention and not others. This can be a significant barrier when the intervention under consideration is considered a state-guaranteed right (e.g., access to quality health care or education).


In sum, the conditions necessary to attribute a change to an intervention are very difficult to meet. Some consider this to be nothing more than a challenge: they view the RCT, even if never perfectly applied, as a benchmark (hence the expression “gold standard”) against which to compare the rigour of the methods actually applied in practice, such as quasi-experimental designs. Other researchers and practitioners, by contrast, consider certain difficulties to be insurmountable, which is why they oppose an increased use of this technique and way of thinking in the social sector.

 

  • Text by Cupitt (2015)
  • Text by Labrousse (2016) (in French)
  • Text by Ravallion (2018)
  • Text by Leeuw and Vaessen (2009)

 

Contribution analysis and theory of change

To overcome the significant difficulty of attributing an effect to a cause, researchers such as John Mayne of the International Development Research Centre (IDRC) have developed an approach based on contribution analysis (Mayne, 2001). According to Mayne, part of the confusion in these discussions stems from the fact that the notion of causality can be used to describe several types of linkages, which may be necessary, sufficient, both or neither. The following table presents various cases with examples related to the health sector (Mayne, 2012, p. 275).

Thus, while it is relatively simple to attribute causality in contexts where the causal link is necessary, the vast majority of cases studied in the fields of interest to us (related to the social economy) involve causal relationships that are neither necessary nor sufficient. Instead, they are contributory causes. In order to establish a contributory link, with a so-called contribution analysis, Mayne recommends developing a solid theory of change by taking the following steps:

  1. Set out the cause-effect issue to be addressed;
  2. Develop the postulated theory of change and risks to it, including rival explanations;
  3. Gather the existing evidence on the theory of change;
  4. Assemble and assess the contribution claim, and challenges to it;
  5. Seek out additional evidence;
  6. Revise and strengthen the contribution story.

(Mayne, 2012, p. 272)

These steps are described in more detail on the BetterEvaluation website (BetterEvaluation, 2016).



For Smutylo, it is very important not to be obsessed with impact and outcomes, as these are usually beyond the control of the organizations and cannot be reasonably proven:

Outcomes often occur a long way downstream and may not take the form anticipated. Outcomes depend on responsiveness to context specific factors, creating diversity across initiatives. The value and sustainability of outcomes usually depend on the depth and breadth of involvement by many stakeholders. These characteristics make it difficult for external agencies: a) to identify and attribute specific outcomes to specific components of their programs; and b) to aggregate and compare results across initiatives. (Smutylo, 2001, p. 1)

He even supports this argument in a song.

Contribution analysis therefore invites organizations and their evaluators to make greater use of qualitative methods. This discourse was not invented by the IDRC in the early 2000s. In fact, most evaluation practitioners in Quebec, France (Branger et al., 2014; Duclos, 2007) and elsewhere (Scriven, 1998; Patton, 2008) have long advocated participatory and flexible, yet rigorous, evaluation.

While pragmatism is a powerful reason for deciding to use qualitative or mixed methods rather than experimental design (Pluye et al., 2009, p. 123), it is not the only one. Indeed, the qualitative part of an evaluation is the only one that can provide “detailed descriptions of complex phenomena based empirically on a specific context […] a better understanding of the development of complex programs […] and a deeper understanding of why these programs work” (Pluye et al., 2009, p. 125; our translation). In other words, one of the main contributions of evaluative approaches, such as contribution analysis, is to open the “black box” by seeking to explain how an intervention contributes to an effect, rather than simply trying to estimate whether an effect is attributable to a cause (Leeuw, 2012, p. 353).


In conclusion, although most definitions of impact include the notion of attribution, it is often more honest and realistic to speak of a plausible association than to seek to prove, beyond any doubt, a causal link between an action and an effect. The claim then becomes that “in light of the multiple factors influencing a result, the intervention made a noticeable contribution to an observed result” (Mayne, 2012, p. 273).

It should be noted, however, that these two approaches are, in the end, relatively close. A good contribution analysis is consistent with the criteria for demonstrating causality outlined earlier. A theory that explains a causal link is formulated and then confronted with reality through various data-gathering approaches (observation, literature review, survey, interview, etc.). If contextual factors are considered and rival explanations are discarded, we can speak of a plausible contributory link.
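The logic of this last step can be sketched as a simple record-keeping structure. This is only an illustrative aid, assuming hypothetical class and field names; it is not a tool proposed by Mayne or by the other authors cited, and a real contribution analysis rests on judgment about the quality of the evidence, not on a yes/no check.

```python
from dataclasses import dataclass, field


@dataclass
class RivalExplanation:
    description: str
    discarded: bool = False  # True once the evidence rules it out
    evidence: str = ""


@dataclass
class ContributionStory:
    theory_of_change: str
    supporting_evidence: list = field(default_factory=list)
    contextual_factors: list = field(default_factory=list)
    rivals: list = field(default_factory=list)

    def is_plausible(self):
        """True only if evidence supports the theory, context has been
        considered, and every recorded rival explanation was discarded."""
        return (bool(self.supporting_evidence)
                and bool(self.contextual_factors)
                and all(r.discarded for r in self.rivals))


# Hypothetical example based on a work insertion program.
story = ContributionStory(
    theory_of_change="Training and coaching lead participants to employment",
    supporting_evidence=["follow-up interviews with participants"],
    contextual_factors=["level of family support"],
    rivals=[RivalExplanation("Local job market improved on its own")],
)
print(story.is_plausible())  # prints: False (one rival is still open)

story.rivals[0].discarded = True
story.rivals[0].evidence = "regional employment statistics (hypothetical)"
print(story.is_plausible())  # prints: True
```

Keeping rival explanations explicit in this way mirrors the second and fourth steps listed earlier: each challenge to the contribution claim is recorded and must be answered with evidence before the claim is considered plausible.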

In sum, while the nuances between attribution and contribution are important, we should not get lost in this debate because, as pointed out in the Aid Leap blog, each approach recognizes that there are many factors and basically asks the same question: What did the intervention produce and what would have happened without it?

 

Clarify expectations and choose a method adapted to the context

The first section of this web portal says that impact measurement refers to “the activity of assessing the effects (or outcomes) of an intervention.” However, it also states that “techniques to [measure impact] may involve varying degrees of formalization or scientific rigour, ranging from surveys based on the perceptions of a few participants to longitudinal studies with randomly selected control groups (randomized controlled trials).”

Thus, when embarking on an impact measurement process, the challenge is not so much to define what impact measurement is and what it is not, but rather to clarify what you and your stakeholders (directors, members, employees, funders and others) expect of it. It is the match between the chosen approach and these expectations that will determine the success of your initiative.



To guide us in this process, organizations such as the Nesta Foundation have proposed standards of evidence, suggesting, for example, that a study with a control group is better than one that only measures changes between the beginning and end of the intervention, or that many studies with control groups are better than a single one.

The Nesta standards of evidence

While this hierarchy is valid in theory and in a highly controlled scientific context (e.g., in a laboratory), it must be qualified in light of the context and of the expectations that can realistically be formulated with respect to measuring the impact of the social economy. A more realistic version would therefore contain the following cautions and comments.

The Nesta standards of evidence, with comments


The best study is therefore not necessarily the one that uses the largest number of control groups but the one that will be most useful to the organization conducting it. This usefulness depends on the objectives initially set by those implementing the evaluation, as discussed in the section “Why evaluate?”