Experimental Ethnographies of Early Intervention in Child Development
Early intervention in child development has become inseparable from programme evaluation by means of randomised controlled trials (RCTs), following what has been termed the “quantitative turn.” The polarised debate between detractors and promoters of quantitative methods is fading as the idea of interdisciplinarity gains programmatic traction. However, RCTs are still considered the “gold standard” in the evaluation of programme effectiveness, and the application of mixed methods remains limited. The discussion of how different methodologies could be integrated in practice should, thus, be encouraged. In this article I propose three possible forms of integration between ethnography and RCT in the field of early intervention in child development. I argue that such an integration is beneficial for evaluation research and, thus, for the delivery of better early intervention services.
1. Situating Ethnography of Early Intervention
Ethnography of early intervention in child development is a new sub-field of ethnographic research developing at the intersection of the ethnography of early childhood and the ethnography of education. Summing up the process that led researchers in education and childhood to look favourably on ethnography, Allison James wrote, “Ethnography is becoming the new orthodoxy in childhood research” (James 2007, 246). This quote has recently been taken up again in The SAGE Handbook of Early Childhood Research (Farrell, Kagan, and Tisdall 2016, 223), in a chapter on the ethnography of early childhood. That signals how in-depth, long-term participant observation has been increasingly recognised as a relevant method for the study of childhood. However, the significance of the quote might sound less self-evident if one considers that the meanings of both childhood research and ethnography have been changing since their inception as disciplines and fields of investigation.
This is even more so given how these two concepts have been interacting. For example, understanding a child’s identity as resulting from the inculcation of a particular axiology was one of the initial objectives of the so-called culture and personality school established in the United States in the 1930s and 1940s. Following the work of Ruth Benedict, Margaret Mead was among the first ethnographers to look at the influence of culture on the constitution of identity in the early stages of life, with her book Growing Up in New Guinea (1930). Cross-cultural comparison was another thematic interest of the school (Benedict 1949), one that later anthropologists of childhood maintained (Whiting 1963, 1975; LeVine et al. 1994). Recently, amidst a wider array of thematic interests, the way children are inculcated with pre-existing values maintains its foothold in the constellation of anthropological interests (Lancy 2015, 22, 336, 408), usually under the umbrella of the ethnography of education.
Although one of the first ethnographies of childhood that also looked at education is, again, Mead’s account of growing up in Manus, Papua New Guinea, at the time of its publication the anthropological study of education had yet to develop. It flourished mostly in the 1950s, in different academic circles in the USA, Argentina, and Brazil (Anderson-Levitt 2011, 11). In the following two decades anthropological studies of education were published in countries as distant as the USA, Japan, and the UK, contributing to the diversification of the discipline. Research published in different languages and disseminated in different geographic regions resulted in the development of separate traditions, such as the German historical anthropology of education and the anthropology of education in the United States.
With the evolution of the anthropology of childhood into different traditions, the published literature became increasingly specialised. While anthropologists of education concentrate on the ways in which children are moulded into context-specific learning and teaching patterns, anthropologists of early childhood are more concerned with what happens before formal schooling begins. Although the study of preschool education can be traced back to the contextualist (LeVine and LeVine 1966) and ecological (Bronfenbrenner 1979) traditions, the focus on children between 0 and 3 years of age is a relatively recent development. Such a focus gained considerable depth with the study of different models of early education in different cultures, and of the relationship between the delivery of these models and particular communities of application (Dahlberg et al. 1999; Delgado 2009; New 1998, 1999; Tobin et al. 1989, 2009). It is in the context of such a diversity of approaches and thematic fields that the ethnography of early intervention is developing.
Sweet and Appelbaum date the first home visiting programmes back to the end of the nineteenth century. Home visiting programmes seek to respond to the needs of families seen as “at-risk” by means of tailored services and support (Gomby 2015). Samuel Odom and Mark Wolery locate the development of early childhood intervention at the intersection of special education (Safford et al. 1994) and early childhood education (Wolery and Bredekamp 1994). In the United States, the practice of providing children with special education services before negative environmental conditions produce negative ontogenic consequences can be traced back to the Civil Rights movement. Although many of the laws in support of special education were meant to protect the rights of disabled children to receive equal educational opportunities, early intervention was not a service designed specifically or exclusively for children with disabilities. Rather, a much wider definition of possible service users was conceptualised, coinciding with the developmental milestones that any child should be given an opportunity to achieve during the early years.
Different early-years intervals and definitions of milestones emerged from competing psychological and educational theories, resulting in a diversity of approaches to early intervention. Signalling the need for a synthesis, in 2003 Odom and Wolery advanced a Unified Theory of Practice in Early Intervention/Early Childhood Special Education. They wrote that early intervention and early childhood special education is “different from early childhood education in its focus on family-centered services […], individually planned educational programs, and specialized teaching approaches. It differs from school-age special education in its focus on early developmental skills that are precursors for current and later school success and […] its emphasis on family” (Odom and Wolery 2003, 164). In addition, early intervention services strive to enhance and support cognitive development, stimulation and language development, including face-to-face interactions of new-borns and infants with adults, social skills, emotional development, as well as diet, physical wellbeing and motor abilities, although different early intervention initiatives might focus particularly on one or more of these aspects and not the others (Dalli and White 2016).
Taken together, the “rationales for studying – and intervening – in early childhood have shifted considerably over time” (Penn 2016, 475). Following the differentiation of early intervention from early childhood education, the ethnography of early intervention must likewise be distinguished from the ethnography of early childhood (Konstantoni and Kustatscher 2016). Typically, early intervention initiatives connect families, service providers, and larger institutional bodies in the common goal of preventing environmental and, thus, developmental risks for children. The 0-3 years interval, and the fact that early intervention initiatives are often delivered in areas marked by socio-economic disadvantage, considerably delimits the scope of the intervention. That also restricts the focus of the ethnographic observation, although anthropological research tends to connect rather than isolate social phenomena.
While the two fields are distinct, ethnographers of early intervention have one thing in common with ethnographers of early childhood: the intention to use ethnographic research to make early childhood programmes more successful. Such a need emerges in reaction to the limited contribution that ethnography has made to early intervention, perhaps as a result of its relatively recent development. While ethnography has become increasingly important in childhood research, as mentioned above, that has not been reflected in increased status in the early intervention industry. The purpose of this article is to reflect on the reasons why this might be and to propose a few ways to change it.
First, this article presents a pragmatic analysis of the methodological compatibilities between ethnography and randomised controlled trials in the context of early intervention. Second, it suggests why an interdisciplinary partnership between RCT and ethnography is not considered the ‘gold standard’ of evaluation research in early intervention. The tendency to juxtapose qualitative and quantitative methods in education research has been widespread, particularly during the years of the paradigm wars. Although that debate has not entirely faded, it is not the purpose of this article to engage with discussions about the methodological or ethical superiority of either ethnography or RCT, even less so when these concern the application of such methods beyond the field of early intervention. However, it is noteworthy that methodological discussions of this kind are taking place. Recalling them situates this article within an existing and lively debate with potential and actual applications in a diversity of fields, including early intervention but also early childhood education.
Although the relationship between RCT and ethnography in evaluative research has been the subject of some recent theoretical work, that has not happened in the field of early intervention. For example, in the context of adolescent sexual health programmes it has been noted that RCT-based evaluations might be less capable of collecting data on sexual health than participant observation. The self-report surveys and structured interviews used in the vast majority of these programmes do not provide research participants with the dimension of intimate trust that is necessary for highly sensitive information to be disclosed (Michielsen et al. 2010; Palen et al. 2008; Beguy et al. 2009). To take another example, Berwick argued that, in the case of evaluative research in health care and practice more generally, ethnography has “more power to inform about mechanisms and contexts than do RCTs” (Berwick 2008, 1183). While many of these criticisms are solely or mostly intended to highlight the unsuitability of the RCT in contexts where more intimate research methods are deemed necessary, I argue that a more suitable methodology for the evaluation of early intervention, in terms of both effectiveness and process evaluation, is a combination of ethnography and RCT.
The logics underpinning the two research methodologies are, I argue, neatly compatible. Indeed, while RCTs are designed to tell whether an intervention has been effective or not, ethnographers can provide vivid illustrations of why and how that effectiveness, or lack thereof, has been possible. In addition to this collaborative principle, other benefits emerge from the collaboration between RCT experts and ethnographers. The combination of these multiple benefits, when concretised into actual research collaborations, will produce more accurate, deep, and valid findings about individual early intervention initiatives.
In highlighting these benefits, the article proposes that such an interdisciplinary approach should replace the current orthodoxy, in preparation for a comparative theory of early intervention effectiveness, which is currently lacking. It also asks why such a collaborative approach has not been applied yet. One reason why ethnographic methods are not used alongside early intervention initiatives as often as RCTs is that ethnographers and RCT experts are more engaged in ignoring or criticising each other’s methodologies than in discussing how their approaches might be practically compatible. A second reason is that the policy makers who instantiate, encourage, and/or design early intervention initiatives are more inclined to reason in quantitative terms. In addition, they tend to be insufficiently critical of the biases of statistical methods because of the pragmatic reductionism that generally characterises politics. A third reason is that, although the value of interdisciplinary research is increasingly recognised, its relatively recent development means that, in practice, the number of reviewers who can evaluate and award funding to interdisciplinary projects is still inadequate. These three reasons together explain the limited role of ethnography in evaluative research in early intervention.
My argument is supported by both theoretical reflection and ethnographic research. I am currently conducting ethnographic research within Preparing for Life (PFL), an early intervention initiative that pairs new mothers and pregnant women resident in northside Dublin, Ireland, with mentors who influence their behaviour in a way that recalls nudge theory (Thaler and Sunstein 2008), although with a strong emphasis on empathy and social learning theory (Bandura 1977). Approaching a phenomenon like early intervention from an interdisciplinary perspective requires an understanding of the practical challenges of delivering the intervention and measuring its effectiveness. In that sense, my ethnography considers the perspectives and preoccupations of all the stakeholders, including the mothers and the mentors, as well as the policy makers who design and fund the intervention and the researchers who evaluate it.
The originality of this article rests in its programmatic effort to bridge the gap between different methodological approaches. While interdisciplinary research is increasingly popular, RCT experts and ethnographers rarely interact, and even less frequently do they do so for the purpose of mutual understanding and collaboration; mutual criticism and disciplinary diffidence are the norm. On the one hand, such diffidence is partly responsible for the limited engagement of ethnographers in evaluation research. On the other, it has been argued that the privileged position of RCTs in social science research might be undeserved. My argument is much less dismissive and much more constructive. Such an interdisciplinary contribution is new in the field of early intervention, where a collaboration between ethnography and RCT has never been theorised.
2. Situating the Evidence-base of Early Intervention Effectiveness
The rights of children to literacy, mental health, and cognitive development are currently considered as much a global priority as was once the case for prophylactic vaccines against polio, smallpox, or measles. Since primary prevention has been mostly effective against the development of these and other life-threatening diseases, the same kind of logic has been applied to the case of what could be called ‘societal’ diseases, such as psychological, social, and cognitive problems in infancy and lack of preparation to enter the school system (Concha-Eastman 2016). Both the primary prevention and the early intervention endeavour, hence, categorise their fields of application in terms of pathology.
This “therapeutic” attitude (Macintyre 2013, 30-31) at the core of both primary prevention and early intervention is reflected in the epidemiological origin of the standard methodology for the evaluation of their effectiveness, the RCT. That might be considered as a philosophical reason to explain the preference for RCT in early intervention initiatives. In more pragmatic terms, RCTs have become the “gold standard” in evaluative research of early intervention (Stewart-Brown et al. 2011) because governments and agencies prefer to finance only those intervention programmes that are evidenced to produce the best results. If it is possible to isolate the specific factors that made a particular intervention effective, the epidemiological discourse implies, it is also possible to predict the outcomes of an investment of taxpayers’ money. It then becomes possible to deem such investment as good or not.
Such evaluative conclusions rest on methodological grounds. “The main appeal of the RCT comes from its potential to reduce selection bias. Randomization, if done properly, can keep study groups as similar as possible at the outset, so that the investigators can isolate and quantify the effect of the interventions they are studying. No other study design gives us the power to balance unknown prognostic factors at baseline” (Jadad et al. 2008, 29). Research participants are assigned to either a treatment or a control group and, in the case of early intervention in child development, the former group will receive the intervention (in the form of, for example, a number of home visits, parenting classes, and educational materials). The latter group will receive a lower ‘dose’ of the treatment, or no treatment at all. Because the randomisation process, conducted at the outset, neutralises baseline differences between the two groups, all the differences remaining at the end of the experiment will be attributed to the effect of the intervention. Although these differences might in principle be caused by unobserved variables, randomisation, if conducted properly, is assumed to ensure that the only systematic difference between the two groups is that one has received the intervention and the other has not. It thus becomes possible to estimate the magnitude of the treatment effects.
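The difference-in-means logic described above can be sketched in a few lines of Python. This is a purely illustrative simulation with invented numbers, not data from any actual trial: randomisation balances an unobserved baseline factor across the two groups, so the gap between the group means recovers the (assumed) treatment effect.

```python
import random
import statistics

random.seed(42)

# Hypothetical example: 200 families, each with an unobserved baseline
# "parenting score" that the evaluators never measure directly.
baselines = [random.gauss(50, 10) for _ in range(200)]

# Random allocation: shuffling before splitting keeps the two groups
# statistically similar at the outset.
random.shuffle(baselines)
treatment, control = baselines[:100], baselines[100:]

TRUE_EFFECT = 5.0  # the assumed effect of the intervention on the outcome

# Outcome = unobserved baseline + measurement noise (+ effect, if treated)
treated_outcomes = [b + TRUE_EFFECT + random.gauss(0, 2) for b in treatment]
control_outcomes = [b + random.gauss(0, 2) for b in control]

# The difference in group means estimates the average treatment effect,
# even though the baseline factor itself was never observed.
ate_estimate = (statistics.mean(treated_outcomes)
                - statistics.mean(control_outcomes))
print(f"Estimated treatment effect: {ate_estimate:.2f}")
```

Were randomisation removed (for instance, by assigning the 100 families with the highest baseline scores to the treatment group), the same estimator would conflate baseline differences with the treatment effect, which is precisely the selection bias the RCT is designed to avoid.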
Despite the insistence on rigorous methods for programme evaluation, however, researchers have reached “distinctly agnostic conclusions” (Moss 2016, 91) regarding the general effectiveness of early intervention. Some studies suggested that the intended results of many programmes were not achieved (Roberts, Kramer, and Suissa 1996; Blauw‐Hospers and Hadders‐Algra 2005) or were achieved only to a moderate or low degree (Peacock et al. 2013, 1; Van Sluijs, McMinn, and Griffin 2007, 703). There have also been studies suggesting that early intervention was beneficial in some respects (Kendrick et al. 2000; Olds et al. 1986; Feldman et al. 1993; Bruder 1993; Anderson et al. 2003), and studies showing some long-term benefits too (Aronen 1996; Olds 1986). In other studies, however, early intervention initiatives produced results that tended to fade with time (Anderson 2012). The agnostic character of the general conclusions about effectiveness depends partly on the limited comparability of early intervention initiatives and partly on the different types of methods used to evaluate them. Different interventions have different outcomes and, consequently, their evaluative studies have different outcome measures. These differences create theoretical as well as practical challenges that make outcome comparison impossible or inaccurate.
As a consequence, it has been argued that early intervention should be based not only on the best evidence but also on values, with no need for empirical verification of effectiveness (Odom and Wolery 2003, 164). Although ethical values are not measurable, they have been equally important in encouraging the widespread diffusion of early intervention initiatives. Indeed, it is relatively easy to justify the existence and critical importance of these initiatives on the basis of an ethical commitment. It is much harder to do so when the evidence that they deliver the intended results is thin or inconclusive. In this sense, after the hope for a better future was converted into the science of improving one’s life chances, the opposite movement now seems to be taking place: the science of improving life chances relies on the hope that, on the basis of existing though limited knowledge, these initiatives will be beneficial. The debate about the effectiveness of early intervention, hence, is still far from settled, and it is therefore important to discuss ways in which the accuracy of evaluation studies can be improved.
In order to conduct such a discussion, it is possible to begin by considering why the ability of RCTs to evaluate the effectiveness of early intervention programmes has been repeatedly called into question. Among the most common criticisms of RCTs, I identify the following three.
First, although “Random allocation of the participants to different study groups” is considered one of the main strengths of the RCT methodology, because it “increases the potential of a study to be free of allocation bias”, random allocation poses no solution to “other important biases” (Jadad et al. 2008). Among the biases that could be introduced along with an RCT methodology, Jadad et al. listed selection bias, ascertainment bias before and after data collection, and other kinds of bias.
Among the biases relevant for a discussion of the potential benefits of a collaboration between ethnographers and RCT experts, intervention choice bias is of particular interest. Intervention choice bias depends on the kind of intervention that has been selected for a particular population. It occurs when the way the intervention works per se influences the data and the data collection process.
For example, if the intervention is not expected to produce major results in the early stages of the programme, it is necessary to wait for the effects to become larger and thus easier to collect with the RCT data collection routine. In the case of an early intervention programme, that time might be relatively long. Parent participants are expected to take up new parenting practices throughout the programme, and to have changed their patterns of behaviour towards its end. Before such changes become observable, even more time might be necessary. It is therefore of critical importance that the data collection is carried out at the time when the particular kind of intervention is expected to produce the relevant results. Selecting a particular point in time, therefore, depends upon very specific contextual variables.
Second, as mentioned earlier, self-report is considered one of the weaknesses of the RCT methodology. Self-report questionnaires and interviews about a parent’s attitudes to parenting are often biased (Milner and Crouch 1997; Straus et al. 1998). The validity of the measures can be altered by conscious lies, unconscious desires, personal hopes, and so on. In the case of early intervention targeting parenting behaviour, according to Bögels and Melick (2004), mothers tend to describe their own rearing behaviour more positively than their children and partners do.
Nevertheless, self-report is almost ubiquitous in the evaluation of early intervention initiatives (Janus and Offord 2007), particularly because of the supposed absence of researcher interference. In the context of an early intervention programme, biased self-reports concerning maternal behaviour may be corrected with the reports of other informants, including but not limited to children and partners. Still, their views would be just as biased. Essentially, the problem with self-report is that responses to questionnaires are ultimately behaviours themselves, influenced by a multiplicity of factors. It follows that seeking to understand parenting behaviours through answers that are themselves behaviours is like creating two problems in the attempt to solve one.
A third, more general, problem arises from the experimental character of the RCT. The fact that a functional relationship can be established between a treatment and an effect by means of a statistical correlation is not sufficient to conclude that the said treatment causes the said effect. While there might be some kind of relationship between baseline and outcome data for specific measures, there may well be other, perhaps important but unobserved, factors connecting the treatment and the effects.
If these factors are not examined in detail and the statistical relation is accepted as the sole or primary expression of the causative relationship between the treatment and the effect, such a finding would have a limited range of application. In particular, it would not be possible to use it to support the argument that a particular component of the intervention has been effective. The effective element might be locatable in an unobserved factor, one that was perhaps co-existent or co-located with the observed variable, but not captured by the research routine.
That is not to say that the relationship captured by the RCT is necessarily irrelevant. It is to say that it is necessary to exclude other factors as not accountable for the observed effects, and to indicate a precise reason for doing so. For, the “core of the scientific method is not experimentation per se, but the strategy connoted by the phrase plausible rival hypothesis” (D. T. Campbell 1994, ix). A less technical way to express this concept can be found in the following grumble of Sherlock Holmes to Dr. Watson: “How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?” (Doyle 2008, 42).
While it is arguable that a given treatment caused a particular effect, in the absence of a close examination of rival hypotheses it is difficult to verify whether that argument is the most valid one. But the list of plausible rival hypotheses cannot be completed without a thorough examination, over a relatively long period of time, of the context where the programme is delivered. Without such a list of plausible rival hypotheses and alternative explanations, verification is not possible.
These are just three out of many and more subtle criticisms that have been formulated against the RCT methodology. There is an extensive literature on the subject. To quote one of the most popular textbooks in research methods in education, RCTs are deemed to belong to “a discredited view of science as positivism” (Cohen, Manion, and Morrison 2013, 318). Although criticisms of this kind are common in educational research today, it is equally common for RCTs to be used alongside early intervention initiatives. Such a contradiction reveals a discrepancy of standards between the view of the researchers and that of the practitioners (Glasgow, Lichtenstein, and Marcus 2003). The following section suggests a few ways in which this discrepancy can be de-emphasised in favour of a more collaborative attitude between researchers from different methodological backgrounds and other stakeholders in the early intervention sector.
3. Interdisciplinary, Problem-Specific, Collaborative Methods
Interdisciplinarity rests first and foremost on mutual understanding between researchers from different methodological backgrounds. That an RCT can tell us whether a treatment has been effective or not is a principle that the proponents of this method accept on the basis of what they consider logical reasoning and correctness of argumentation.
However, even accepting the logic that underpins the RCT, on its basis it is not possible to understand the process through which one group came to differ from the other. That is because the RCT is not designed to illustrate that process. It relies only on observations conducted before and after the intervention is delivered, rather than on continuous observations throughout the programme. Although in some cases RCTs collect data at several points throughout the intervention, such as at six-month intervals, that is mostly not the case. And even if it were, at-interval surveys are no substitute for long-term participant observation.
Ethnography is constituted precisely by long-term, in-depth participant observation. The logic that convinces ethnographers of the trustworthiness of their method is that any phenomenon taking place within a context is embedded within that context and must be understood as such. In order to do that, it is necessary to simultaneously observe and participate in the everyday business of a group of people for a relatively long time. Within the context of an early intervention initiative, participant observation allows the researcher to describe the process through which changes brought in by an exogenous phenomenon, such as a parenting programme, are converted into actual practices, if at all. While the logic of ethnography cannot tell whether an intervention has been effective in statistical terms, it can illustrate the process that resulted in the outcome of the intervention. In other words, while it cannot measure the extent of the change that occurred, it can seek to explain why and how the intervention worked the way it did.
In sum, the RCT can measure whether the treatment group has changed relative to the control group, but cannot explain why, whereas ethnography can illustrate how that result was achieved, but cannot estimate its extent. In answering these two different yet interrelated sets of questions, the logics underpinning RCT and ethnography appear both distinct and compatible. As such, they can be used to understand interventions much better than if they were applied in isolation.
This is, so far, a purely theoretical argument. It is therefore necessary to explain how the combination of these methodologies can provide some form of practical advantage for the stakeholders of an early intervention initiative. Below, I outline three ways in which the logical compatibility between ethnography and RCT can be concretised.
First, ethnographers can produce vivid descriptions at those stages of the programme in which the kind of data the RCT requires cannot yet be collected with that method. As mentioned in the previous section, the RCT must be tailored to the intervention timeline in order to avoid collecting biased data. At the early stages of early intervention programmes, changes in the practices of recruited participants can be subtle and slow. This is particularly the case when a relationship of trust must develop between the mentor and the families before the programme can be delivered effectively. Changes in parenting skills, and their consequences for child development, might take months, even years, to yield statistical results. Capturing changes with quantitative instruments is thus hard when these effects are statistically small. That can cause a few problems.
One consequence of the difficult detectability of early effects is that it is hard to tell, during the programme, whether participants are taking up the practices that lead to changes. Examples of such practices include eye contact, descriptive praise, and serve and return. If participants are not taking up the practices, if they are doing so only slowly, or if they are taking up practices that the programme was not designed to encourage, practitioners will not know until much later. Consequently, they will not be able to correct their programme at the early stages. Although it is possible to collect interim outcomes at different assessment points in time, most RCTs only collect data at the beginning and at the end of the intervention. Interim data collection is a very expensive measure and thus less common. In the rare cases in which interim data are available, the insights they provide do not necessarily justify the cost.
Since effect sizes would be relatively small, it would not be possible to tell whether, and to what extent, participants are taking up the intended practices. It follows that there is no way to collect that kind of statistical data until the programme outcomes become quantitatively larger. Hence, it will only be possible to correct the programme when the numbers grow, later on in the study-intervention, when a large proportion of time and resources has already been spent. Interventionists who detect low, slow, or absent changes in practices late in the timeline must spend further time and resources correcting the intervention, which inevitably impacts on effectiveness. Since there is no statistical way to provide reliable evidence of early change, another method should be used.
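The claim that small early effects are hard to capture quantitatively can be made concrete with a standard power calculation. The sketch below uses hypothetical numbers, not figures from any actual trial; it estimates how many families per arm a simple two-group comparison would need in order to detect a given standardised effect size, showing that the smaller the early effect, the larger (and costlier) the required sample.

```python
import math
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per arm for a two-sample comparison of means
    (normal approximation), given a standardised effect size (Cohen's d)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance threshold
    z_power = norm.ppf(power)           # desired statistical power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# A "small" early effect (d = 0.2) needs roughly six times as many
# families per arm as a "medium" one (d = 0.5).
print(n_per_group(0.2))  # around 393 families per arm
print(n_per_group(0.5))  # around 63 families per arm
```

Power calculations for cluster-randomised or longitudinal designs are more involved than this two-arm approximation, but the direction of the relationship is the same: halving the detectable effect size roughly quadruples the required sample, which is why subtle early changes typically stay below the statistical radar.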
Tasking ethnographers with observing change at the early stage of the intervention can be helpful here. Ethnography can provide vivid illustrations of individual instances of change at the early stages of the study-intervention, or of the lack thereof. For example, the analysis of a few cases might demonstrate that, in the context of a home visiting programme, pairing parents with mentors of roughly the same age ensures the development of a good relationship only in some cases and not in others. Since a “good” relationship is necessary for the delivery of the programme, this early finding might suggest that a deeper analysis of the participants’ history and personality (Allen 2007, 75) is necessary before they are paired with a particular mentor. Ethnographic descriptions and analyses of how research participants react to the intervention can then be used to identify unwanted circumstances and take redressive action early on.
It is important to note that, if ethnographic observations do suggest that changes in programme delivery are necessary, these changes must be applied homogeneously to all the participants in the given group. Although interventions can be adjusted throughout the programme, every family must receive exactly the same intervention; otherwise, the inner logic of the RCT would not work correctly. If some families in the treatment group receive one form of the intervention and other families in the same group receive a different form, it will not be possible to distinguish which component of the intervention caused which outcome, because when the control group is compared to the treatment group, the latter cannot be considered homogeneous. If, however, the adjustments concern the whole group, it will still be possible to compare the two groups without compromising the logic of the RCT. On the contrary, the application of that logic will be strengthened by its combination with the logic of ethnography, which contributes to delivering the intervention as intended, or to adjusting it where necessary.
Second, in-depth, long-term participant observation can be used to improve the accuracy of survey methods. As mentioned earlier, self-report questionnaires can be a source of bias. In addition to the bias introduced by the research participants in answering questions, the survey itself can be methodologically problematic. When experimental and quasi-experimental approaches lack a prospective qualitative component, surveys only ask questions from the perspective of outsiders. Questions formulated in this way risk resonating very little with the research participants. For this reason, surveys should be informed by qualitative data collected before the RCT starts. On the basis of qualitative interviews and focus groups, it is possible to ask questions that, while remaining relevant to the research agenda of the outsiders, are translated into terms that resonate with the insiders. The more the questions resonate with the research participants, the more they will feel represented by their own answers.
Often it is not necessary to have an ethnographer formulate questions in this way. Translating the research agenda into expressions that can be readily understood by the research participants can be done in consultation with a few local informants before the study begins. Subsequently, a pilot study can usually provide enough information on how the survey draft is understood by the target population. Conducting such a pilot study is particularly important in the case of interventions with populations in contexts that are markedly different from those of the researchers’ background, such as economically disadvantaged peri-urban communities, prisoners, asylum seekers, and populations from different countries and cultures.
While the translation of questionnaires into terms understandable by the target population is a relatively easy and quick task, it is harder and more time-consuming to understand the current concerns of the research participants. It is necessary to ‘grasp their point of view’ on such matters as the dangers connected to local representations of space, the time constraints of their lifestyles, and the quality of their relationships. If the survey assumes relationships with space and time that do not correspond to the participants’ perspectives, the results will be biased. Hence, if these and other local aspects are not studied in some depth before the survey is designed, the RCT will fail to evaluate important aspects of the intervention, aspects that could inform deeper and more accurate illustrations of the correlation between the intervention and its effects.
For example, according to the RCT conducted alongside PFL, the recruited families rated the importance of the mentor-mother relationship “very highly.” My ethnography confirms that the quality of this relationship is a crucial ingredient of effective programme delivery: the relationship has to be “good” for the programme to be delivered effectively. The RCT, though, was not designed to understand its importance, neither from the point of view of ensuring it nor from that of measuring its impact. Although mentors were selected on the grounds of their ability to establish a “good” relationship, no definition of what that means was given in the RCT, nor does the PFL manual or job profile illustrate what a “good” relationship is and how it can be developed. It follows that an essential component of the intervention was entirely overlooked, which substantially limits the researchers’ ability to understand how the intervention worked and which components are responsible for its effectiveness.
If in-depth participant observation is conducted within the concerned group before the survey is designed, it becomes possible to complement the questionnaire with questions that would not otherwise have been deemed relevant. The unexpected research trajectories that pre-RCT ethnographic observation enables depend essentially on the importance given to the perspectives of the participants. While it is important to translate the questions relevant to the outsiders into terms that are meaningful for the insiders, it is equally important to value what is relevant to the insiders and translate it into questions that the outsiders can learn to treat as meaningful.
Third, blending ethnography with the RCT balances the experimental ethos of the latter with the participatory and reflexive ethos of the former. As mentioned above, the genealogy of the RCT can be traced back to the experimental paradigm of epidemiological research. For this reason, its logic depends on a theoretical rigour often so rigid that it can hardly be applied to contexts marked by socio-cultural complexity. All human settings are complex and cannot be controlled as if their parts could be isolated from each other. For example, within a community of interconnected service recipients it is impossible to entirely avoid contamination, a term of clear epidemiological derivation that expresses the unintended transmission of some of the intervention from the treatment to the control group.
The main challenge in applying a methodology rooted in the experimental paradigm to a non-experimental setting is that single causes cannot be methodologically isolated, because they are not isolated from each other in substance. In that sense, an RCT uncompromised by the limits of its context is not possible. Still, the bar is set so high that even the smallest deviation from optimal experimental conditions might be considered to invalidate the RCT as a whole. But since experimental conditions cannot be entirely satisfied in complex human settings, it is necessary to accept less-than-optimal conditions. That, however, can only be done as long as a clear methodology is in place to compensate for the bias that such conditions introduce into the data.
For example, it suffices that one person drops out of the study for the randomisation to be spoiled and the theoretical validity of the experiment to be hampered. There has never been an RCT without dropouts, so this problem concerns all trials, not only those designed alongside early intervention initiatives. Dropouts hamper the reliability of the experiment because it is not possible to compare the baseline data with the data they would have contributed had they remained in the study. Compensating for the data lost to dropouts inevitably requires the introduction of statistical strategies whose choice is largely subjective.
In general, the missing data of the dropouts can be compensated for with a variety of statistical remedies, ranging from limiting the analysis to non-missing data (complete-case analysis) to inverse probability weighting (IPW). However, these strategies can be questioned on methodological grounds. Predicting the missing values in order to compensate for missingness is considered a fallacious approach by some. IPW can compensate for missing data only as long as enough information is available about the entire population to predict the probability of non-missingness. The concept of “enough information,” however, leaves room for interpretation and, again, for the introduction of subjective bias.
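To illustrate both the mechanics of IPW and its dependence on knowing the probability of non-missingness, here is a minimal simulated sketch; the numbers are hypothetical and do not come from any actual trial. A complete-case mean is biased when dropout is more common in one subgroup; weighting each observed case by the inverse of its probability of remaining in the study recovers the population mean, but only because that probability is assumed to be known.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical covariate: 1 = family in a subgroup more likely to drop out.
x = rng.binomial(1, 0.5, n)
# Outcome depends on the covariate; the true population mean is 3.5.
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, n)

# Probability of remaining in the study (non-missingness) depends on x.
p_observed = np.where(x == 1, 0.3, 0.9)
observed = rng.random(n) < p_observed

# Complete-case analysis: biased, because the x = 1 subgroup is under-represented.
cc_mean = y[observed].mean()

# Inverse probability weighting: each observed case stands in for 1/p cases.
weights = 1.0 / p_observed[observed]
ipw_mean = np.sum(weights * y[observed]) / np.sum(weights)

print(round(cc_mean, 2))   # close to 2.75, well below the true mean of 3.5
print(round(ipw_mean, 2))  # close to 3.5
```

The catch the article points to is visible in the sketch: `p_observed` is simply assumed. In a real trial it must be estimated from whatever covariates are available, and whether those covariates constitute “enough information” is exactly the interpretive question raised above.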
If an RCT were to incorporate values that are not quantifiable and rely less on the experimental character of the investigation, these problems might seem less of a concern. For example, rather than using statistics to compensate for the missing data of the dropouts, RCT researchers might task ethnographers with concentrating, in a participative way, on the individual research participants who, according to predictive models, are most likely to leave the study. Ethnographers can talk with these participants without necessarily touching on the issue of abandoning the study, but the very fact of providing them with one-to-one care and attention might in itself decrease the likelihood of their dropping out. More generally, although it is impossible to predict who will actually drop out of the study, spending time in the community and engaging thoroughly with the participants will provide insights into why they might wish to leave the research.
Another way of limiting missingness consists in having the ethnographer spend time with people from the control group. Control group participants are often more likely to disengage because they feel less important than the treatment group, even in a double-blind RCT. As they are members of the same community as the treatment group members, some of them will eventually realise that they are not receiving as much support, or ‘dose’, as other research participants. Still, they are a crucial component of the RCT, even if they might not perceive that to be the case, and if they drop out it is just as difficult for the researchers to evaluate the effectiveness of the intervention accurately as if members of the treatment group had abandoned it. The problem is that their importance is not reflected in the attention the intervention devotes to them. If, however, they were treated as much a source of ethnographic insights as the members of the treatment group, they would arguably understand better how important they are. The ethnographer would engage with them through participant observation, and they would engage more in turn, thereby limiting the number of dropouts.
Also, if employed before the intervention begins, ethnographers can identify reasons why people might consent to take part in a study that they later might wish to leave. In Ireland, for example, this might be related to what the Irish psychiatrist Anthony Clare described in the following anecdote: “Hector Warnes was a Canadian, and he had to adapt a great deal to Ireland. He found the complexity of the Irish character difficult. In particular, he looked at our elliptical way of communicating negative feelings; patients, instead of saying ‘no’, would say ‘of course’, and then go off and not do what it was he wanted them to do. This was true of his colleagues as well, so here we’re talking not of the psychopathology of patients, we’re talking about ways we have of coping with each other and lubricating certain kinds of difficult exchanges” (O’Reilly and Clare 1983, 173).
These are just a few examples of how the methodological compatibility between ethnography and RCT could be translated into the pragmatics of evaluation research. It is clear that both ethnographers and RCT experts would benefit greatly from this interdisciplinary, problem-specific collaboration, in both methodological and theoretical terms. It follows that it is for reasons other than methodology and theory that this kind of collaboration has not yet gained popularity during the early intervention hype. In the following sections I briefly discuss what these other reasons might be.
4. Reasons for a lack
Given the above argument, it is not clear why the “gold standard” is not to design evaluation research in early intervention as a partnership between ethnographers and RCT experts. One way to understand this is to look at the broader debate about interdisciplinarity. The Global Research Council Report 2016 states that, according to the “global literature,” interdisciplinarity “has a key role to play in addressing the grand challenges that society faces” (GRC 2016, 5). Yet the Report recognises that interdisciplinarity remains essentially a good idea rather than a common practice. The reasons are many, complex, and interdependent. Interdisciplinarity is a relatively recent development: although it is possible to identify instances of interdisciplinarity before the 1970s, these were generally related to the exceptional intellect of isolated figures, such as Herodotus, Leonardo, or Leibniz, whose scientific accomplishments did not result from coordinated partnerships of interconnected scholars. Interdisciplinarity of the global kind only started to gain traction in the 1970s; in the social sciences this happened even later.
While scholars in the 1980s were still debating whether qualitative or quantitative methods were better suited to generating evidence (Tashakkori and Teddlie 1998), at the turn of the millennium interest in the dichotomous version of that debate was fading. Scholars became increasingly interested in mixed-methods approaches, and in the past three decades there has been a considerable amount of research and theorising around the benefits and challenges of mixed-methods research. Initially, however, these were mostly isolated attempts, as opposed to the concerted efforts of the last fifteen years. Most recently, we have seen the development of university courses, new scientific journals, and academic conferences entirely dedicated to mixed methods (Small 2011). That seems to signal that the number of scholars interested in mixed methods has been growing.
At the same time, the definitions of different types of data, the means for obtaining them, and their analysis have been changing. As a consequence of these recent developments, a whole new debate of considerable sophistication has emerged, and mixed-methods research has become an established field in its own right (Creswell and Tashakkori 2007; Johnson and Onwuegbuzie 2004). Reflection upon the meaning of mixed methods has led scholars well beyond the simple assertion that mixed methods are, for example, a quantitative survey supplemented by qualitative interviews. Most researchers in the social sciences now mix quantitative and qualitative data all the time (Ragin 2014), producing a diversity of possible combinations.
Combinations might frame a research question in both qualitative and quantitative terms, incorporating the perspectives of the research participants or, on the contrary, maintaining experimental detachment. Recruitment of research participants and informants can be exploratory while a form of probability sampling is conducted at the same time. Data collection might combine classic qualitative methods, such as focus groups and qualitative interviews, with the collection of demographic data and the administration of surveys. Narrative and numerical data do not exclude each other, and thematic and statistical analyses can be conducted in a way that makes findings from one research routine informative for the other.
Following this new focus of interest, the debate shifted from which method is best suited to answer a particular research question to how methodologies should be mixed to achieve a diversity of valuable ends. It has therefore become an accepted principle that clear benefits derive from using a combination of methods, if only because confidence in research findings increases when arguments supported by different approaches are in agreement (Bollen and Paxton 1998). Narrowing down from the social sciences broadly defined to the more specific field of early childhood studies, and further to the study of early intervention, however, we see that evaluative research is still dominated by the RCT rather than by a combination of RCT and ethnography. It appears, therefore, that although the idea of interdisciplinarity is increasingly popular within the social sciences, its practice is still limited in early intervention.
In order to move beyond what might appear to be a tautology, it is possible to find another reason for the limited role of interdisciplinarity in early intervention evaluative research: the lack of familiarity that qualitative and quantitative researchers in this field have with each other’s approaches. Stephen Bell and Peter Aggleton have reflected on the consequences of this lack of familiarity (Bell and Aggleton 2012). In their edited volume Monitoring and Evaluation in Health and Social Development (2016), they argue in favour of ethnographic methods in evaluative research. However, they do not provide an example of how an integration with RCTs can be designed. In the second part of the volume the contributors do address the theme of research design, but a fundamental opposition persists between qualitative and quantitative, emic and etic, deductive and inductive, experimental and exploratory, and so on. In that way, each approach remains fundamentally separate from, rather than integrated with, the other.
Quantitative researchers are all too often caricatured as cold, ultra-rationalistic, machine-like beings who believe that an objective reality can be perfectly known by means of increasingly sophisticated heuristic methods (Saracho 2016, 15). Their approach is considered inappropriate because it is designed in isolation from the context in which it is to be applied and is therefore unsuited to capturing the specificity of that context. As a consequence, their representation of phenomena is sometimes considered not only inaccurate but also ethically questionable, because of the superimposition of an external perspective that homogenises difference and obliterates self-representation (Lincoln et al. 2011). Another negative consequence of experimental detachment is that the research insights it produces are so distant from the everyday lives of practitioners and service recipients that they have little application and social relevance (Foster and Mash 1999).
On the other hand, qualitative approaches like ethnography are often assumed not to be representative of the wider social reality in which a small group is embedded, essentially because they do not rely on methods measuring the extent to which the given group differs in significant ways from the broader population. Another reason for diffidence is that the kind of data ethnographers collect often requires a narrative form of presentation, one that does not lend itself to secondary analysis and interpretation. All in all, while interdisciplinarity gains popularity, at least in some disciplines, the reciprocal diffidence between qualitative and quantitative approaches limits the development of interdisciplinary practices.
While academic debates are often instigated as a means of building some careers and destroying others, the opposition between quantitative and qualitative approaches is perhaps grounded in a genuine attempt to generate more suitable means of understanding social phenomena. However, as long as complex evaluation questions are addressed as if the phenomena they seek to understand could be broken down into qualitative and quantitative components studied in isolation from each other, rather than through an interdisciplinary approach built on a concrete integration of qualitative and quantitative methods, their heuristic reach will remain limited.
Finally, another reason for the limited practice of interdisciplinarity in early intervention relates to the availability and distribution of funding. Agencies are committed to funding, where possible, only programmes that are deemed or expected to be effective according to an evidence-based paradigm. That has the unintended consequence of encouraging a methodological approach “that flattens complex change processes into overly simple causes and effects” (Bell and Aggleton 2016, 4). In contrast, approaches that emphasise complexity and focus on descriptive narrative rather than statistical and representative data are considered less suitable for the evidence-based paradigm.
More generally, according to the GRC, interdisciplinary research receives less funding because peer review procedures are not designed to evaluate interdisciplinary proposals. “This is partly due to a lack of reviewers who understand how to evaluate interdisciplinary research, and the related circular problem that there is a need to expose more reviewers to interdisciplinary projects” (GRC 2016). Similar circular reasons might explain why early intervention initiatives are not designed alongside a research partnership that includes RCTs and ethnography. If reviewers do not see the advantage of, for example, having an ethnographer work in partnership with an RCT expert, they will not value that kind of research.
5. Concluding remarks
In this article I have argued that one of the main reasons why the benefits of a collaboration between RCT experts and ethnographers are not realised is that these benefits are not readily apparent to either scientists or policy makers. However, there are exceptions to this general claim. For example, Carlos Moedas, European Commissioner for Research, Science and Innovation, speaking at the second International Network for Government Science Advice (INGSA) Conference held on 29-30 September 2016 in Brussels, said:
“I would argue that the job of scientific advisor has dramatically changed. I think that the scientific advisor is no longer the expert who provides the answers. That was the business of the past. Let me be provocative. It is not anymore about the answers. It’s about the process of collecting evidence in a multi-disciplinary world. The Scientific Adviser must accept that science does not provide all the answers. People will only accept the answers if they understand the process.”
Moedas’ words seem to suggest that the argument proposed in this article might not be an isolated attempt to formulate a different way of doing evaluative research.
A case in point is provided by Martin Walsh, who examined “Oxfam’s use of interpretive research to deepen the findings of project evaluations based on the use of quantitative survey methods” and claimed that “Oxfam’s experience suggests that the intersubjective and interpretive modes of enquiry that characterize ethnographic research […] can be integrated with experimental approaches in order to generate more effective learning for programmes” (Walsh 2016, 219). It appears, therefore, that it is possible to identify instances of politicians and agencies that understand the value of interdisciplinary evaluation, as opposed to the much-celebrated “gold standard.”
In academia, an instance of this new trend can be found, for example, in a job description recently advertised on the website of The UCD School of Education. The document briefly describes a “mixed methods evaluation of an intervention […] designed to support children’s literacy, their rights and well-being. One researcher will be mainly responsible for managing the day-to-day running of a large-scale, cluster-randomised controlled trial whereas the other researcher will be mainly responsible for a longitudinal in-depth qualitative research study into the everyday lives of a subsample of children and their families.”
If these isolated examples truly belong to a new trend, it will not be long before the programmatic ambition of this article, that of promoting an integration between ethnography and RCT in early intervention, becomes reality.