Personalised interventions for subgroups of children with conduct problems




This is a protocol for a Cochrane Review (Intervention). The objectives are as follows:

To assess the effects of personalised, psychosocial interventions for subgroups of children with conduct problems.


Description of the condition

Conduct problems are a range of antisocial and disruptive behaviours that can be diagnosed as conduct disorder (CD) or oppositional defiant disorder (ODD), with ODD symptoms sometimes acting as a precursor to the onset of the more severe CD symptoms (Frick 2012; Moffitt 2008). CD is characterised by a repetitive and persistent pattern of behaviour in which the basic rights of others or major age-appropriate norms or rules are violated, whereas children with ODD demonstrate defiant behaviour, irritability, and vindictiveness (American Psychiatric Association 2013). Epidemiological studies have identified that between 5% and 10% of children and adolescents have significant problems with conduct and disruptive behaviour (Moffit 2009), making it the most common behavioural and mental health problem in children and young people globally (Collishaw 2004), and the most common reason for referral of young children to child and adolescent mental health services in the UK (NICE 2013).

Conduct problems are an important long-term condition of childhood as, when left untreated, they commonly persist (Murphy 2012). They predict not only the development of antisocial behaviour and substance misuse in adulthood, but also poor educational outcomes and increased physical health burden throughout life (Odgers 2007), and are the most common precursor of adult mental health problems across the spectrum (Copeland 2009; Kim-Cohen 2003). The 2010 Global Burden of Disease Study identified CD as a significant contributor to global years lived with disability (YLD), ranking it the 30th leading cause of nonfatal burden worldwide (Erskine 2014). Conduct disorder is ranked as the fourth leading cause of global YLDs for children aged five to nine years, and second for males in this age group. As well as the impact on the individual child and family, there is also an increased cost to the public purse, with each affected individual being associated with costs around 10 times that of children without the disorder (Murphy 2012). The early treatment and prevention of conduct problems is therefore of tremendous importance.

In recent years, there has been increasing awareness of substantial heterogeneity within conduct problems, so that it is now recognised that there are a number of ‘subgroups’ of children with conduct problems (Frick 2016; Klahr 2014). Variations include those in age-of-onset (Silberg 2015), level of aggression within antisocial behaviour (Loeber 1985), comorbidity with attention deficit hyperactivity disorder (ADHD; Waschbusch 2002), and influence of genetic and environmental factors in relation to level of callous-unemotional (CU) traits or ‘limited prosocial emotion’ (LPE; Viding 2005). These heterogeneous subgroups can exhibit differences in aetiology, developmental trajectory and likely prognosis (Frick 2016; Klahr 2014), with some studies reporting differential treatment outcomes (Hawes 2014; Reyno 2006).

Particular family characteristics have also been identified, both as risk factors for the development of conduct problems and as moderators of treatment effectiveness; for example, maternal mental health (Hutchings 2012) and contact with child protection services (Drugli 2010). Maternal ADHD symptoms have been associated with child ADHD and ODD symptoms (Zisser 2012) and may limit the improvement shown by children with ADHD in response to treatment (Chronis-Tuscano 2011; Sonuga-Barke 2002). It is therefore critical to identify subgroups defined by familial factors in addition to the recognition of heterogeneity on an individual level.

An example of the greater recognition of the importance of such subgroup heterogeneity is the decision by the American Psychiatric Association 2013 to incorporate a new specifier in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) to describe children and young people with conduct problems who present with LPE. LPE is characterised by the presence of two or more of the following criteria over at least 12 months and in multiple relationships and settings: (a) lack of remorse or guilt; (b) callous - lack of empathy; (c) shallow or deficient affect; and (d) lack of concern about performance (American Psychiatric Association 2013; Jambroes 2016). Recent reviews regarding chronic irritability and anger in ODD have also recommended the inclusion of a specific irritability subtype for ODD in the 11th revision of the International Classification of Diseases and Related Health Problems (ICD-11), in response to the greater level of severity and impairment experienced by some children (Evans 2017; Lochman 2014). This subtype will potentially act as an alternative to disruptive mood dysregulation disorder in the DSM-5, with the aim of enabling more accurate identification and treatment of heterogeneity within ODD (Evans 2017; Lochman 2014).

Such differences between subgroups have prompted debate as to whether a ‘one-size-fits-all’ model of intervention, which fails to take account of this heterogeneity, may be limited in its effectiveness. It is hoped, for example, that the addition of the LPE specifier into DSM-5 will encourage more precise diagnosis and the acknowledgement of ‘an emerging subgroup within conduct problems’ thereby promoting more targeted treatment research (American Psychiatric Association 2013).

Description of the intervention

Current recommended interventions

The gold standard, evidence-based intervention for the treatment of conduct problems in children is behavioural parent training (Scott 2009). While parent training programmes are recognised as an effective treatment (Dretzke 2009), personalisation seeks to address possible limitations in the effectiveness of such programmes. Evaluations of even the best parent training programmes estimate that a quarter to a third of families and their children do not benefit (Scott 2009). Parent training also requires substantial commitment and organisation from parents and can be undermined as a treatment due to dropout or failure to engage. Although some recent interventions have sought to trial new methods of delivery that could address particular issues with provision and attendance, such as internet-delivered parent training (Högström 2015; Sourander 2016), there are still inherent difficulties with implementing behavioural parent training. Families with children diagnosed with ODD, CD, or ADHD who are appropriate for behavioural parent training, commonly do not enrol, enrol but never attend treatment, drop out prematurely, or do not fully engage in within-session or between-session skills implementation (Chacko 2012; Fernandez 2011; Peters 2005). A recent review of 262 studies of behavioural parent training found a combined dropout rate of 51%, with 25% not enrolling despite being appropriate for the programme and 26% beginning but not completing the training (Chacko 2016). Limitations in the reach of parent training programmes are therefore a significant problem (NICE 2013; Pilling 2013).

In addition to such limitations in reach and effectiveness, differential outcomes of parent training have been associated with subgroups of children with conduct problems. High CU traits (or LPE) can predict poor outcomes across parent training interventions (Hawes 2014), and there is evidence that poor economic circumstances, marital discord, parental mental health problems, and parental hostility are associated with poorer outcomes (Reyno 2006). Paternal substance abuse and child comorbid anxiety or depression have also been identified as factors predicting poorer outcomes (Beauchaine 2005). However, the evidence in this area is not clear cut and a recent, comprehensive meta-analysis found that a range of family characteristics, which are usually associated with a poorer outcome from parent training, did not in fact moderate a less favourable response (Gardner 2016).

Although parent training is the primary recommended intervention for children with conduct or behavioural problems, other treatments, such as cognitive problem-solving programmes, are recommended (NICE 2013), and cognitive behavioural treatments have been investigated in the treatment of aggressive behaviour in children (Smeets 2015; Sukhodolsky 2004). Meta-analyses of Cognitive Behavioural Therapy (CBT) for aggression in children and adolescents have demonstrated medium effect sizes (Smeets 2015; Sukhodolsky 2004), and have suggested that further research is necessary to determine whether subgroups of individuals with predominantly reactive or proactive aggression may respond differentially to CBT intervention (Smeets 2015). Therefore, additional investigation is vital to clarify whether subgroup classification is associated with differential outcome across available interventions and, if so, whether understanding the underlying reasons for this could potentially lead to the development of more effective treatments.


Differences across developmental pathways and clinical presentations have been identified within the wider diagnostic classification of conduct problems (Frick 2016; Klahr 2014). Recognition of these differences could therefore aid the tailoring or personalisation of interventions to address the specific needs of particular subgroups (Frick 2016). Such personalisation aligns with the recent strategy of the Medical Research Council (MRC) to ‘embrace a stratified medicine approach’ (Medical Research Council 2017). Stratified medicine is described as "identifying groups of people with shared characteristics within or across specific disorders… looking beyond standard diagnostic categories to find new treatments and better ways of using existing treatments". Personalised interventions may therefore include novel treatments, or may involve additional or adjunctive interventions alongside existing standard evidence-based interventions. While the National Institute for Mental Health (NIMH) in the USA has called for mental health researchers to "expand and deepen the focus to personalise intervention research" (Fisher 2015), the science of personalisation in relation to child mental health is a novel field in the early stage of development (Ng 2016; Scott 2016).

How the intervention might work

How personalised interventions might work

Personalised interventions are likely to work by tailoring different aspects of treatment to the needs of particular subgroups of parents and children. It is possible that personalised treatments may include elements of parent training programmes, or supplement existing interventions with additional techniques to address subgroup heterogeneity. For example, a subgroup of children with conduct problems experiencing parental hostility could potentially benefit from a parenting programme tailored to include additional sessions focusing on hostility and offering particular techniques to address this issue. Alternatively, personalised interventions may be entirely novel treatments without any reference to parent training, or adaptations of existing non-parent training based interventions for conduct problems.

Subgroups of children with conduct problems that may benefit from personalisation

An example of a subgroup difference, which could be addressed by a personalised approach to intervention, is that of children who have high versus low CU traits (Frick 2014). Children with low CU traits are more likely to be sensitive to traditional disciplinary strategies employed in parenting programmes (Thapar 2015), whereas children with high CU traits appear genetically vulnerable to antisocial behaviour (Viding 2005) and relatively insensitive to punishment, threat and others’ distress (Pardini 2012). These vulnerabilities may cause insensitivity to certain critical components of traditional behavioural approaches (Hawes 2005), and children with high CU traits may benefit from programmes focusing on the positive dimension of parenting (Muratori 2016). Programmes that have been successful in the treatment of CU traits may contain elements that are more beneficial for this particular subgroup; for example, supporting an increase in parental warmth (e.g. Fast Track Intervention; Pasalich 2016).

Similarly, maternal ADHD symptoms have been associated with poorer parent training outcomes for children with ADHD (Chronis-Tuscano 2011; Sonuga-Barke 2002). Lack of reduction in negative parenting behaviours has been identified as a possible explanation for the relationship between maternal ADHD symptoms and poorer post-treatment child behavioural outcomes (Chronis-Tuscano 2011). The ability of parents to exhibit positive parenting behaviours is of vital importance for behavioural change, and has been shown to act as a protective factor against the development of conduct problems in children with ADHD (Chronis 2007). For these families, management of maternal ADHD symptoms to aid implementation of positive parenting strategies could be beneficial.

Targeting other aspects of parental mental health may also be beneficial; for example, treating maternal depression appears to improve outcomes for children with conduct problems (Hutchings 2012). Further, following evidence suggesting that subgroups of children presenting with emotional dysregulation may differentially respond to parent training programmes, Scott 2012 proposed that it may be worthwhile to pre-screen children prior to allocation of parenting interventions to ensure individual differences are accounted for. Personalised treatments, therefore, may have the potential to improve outcomes by targeting the specific needs of pre-defined subgroups.

Why it is important to do this review

While existing reviews have identified considerable heterogeneity within conduct problems and have investigated differential response to treatment (Gardner 2016; Hawes 2014; Klahr 2014; Shelleby 2014; Wilkinson 2016), to date, there has been no attempt to identify and synthesise the evidence on personalised interventions for subgroups of children with conduct problems. Previous Cochrane reviews focusing on the treatment of conduct problems have evaluated standard group-based parenting programmes for improving emotional and behavioural adjustment in young children (Barlow 2016), improving early-onset conduct problems in children aged three to 12 years (Furlong 2012), and improving conduct problems in older children and adolescents (Woolfenden 2001). This review, therefore, aims to address a gap in the treatment literature by systematically identifying and appraising the evidence for personalised psychosocial treatments for subgroups of children with conduct problems.


To assess the effects of personalised, psychosocial interventions for subgroups of children with conduct problems.


Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs).

Types of participants

Children aged between two and 12 years, in any setting, with conduct problems. This will include children with diagnoses of conduct disorder (CD) and oppositional defiant disorder (ODD), but is not restricted to those with formally diagnosed conditions. We will include participants if they are within a subgroup category. Subgroup categories could include: individuals with a comorbid diagnosis of ADHD or an autistic spectrum condition (ASC), children with high levels of callous and unemotional traits, emotionally dysregulated children, rule-breaking versus aggressive subtypes of conduct problems, children with mentally-ill or addicted parents, children whose parents are experiencing marital conflict, children from low socioeconomic status families, looked-after children, and any other subgroups of children with conduct problems.

Due to differences in associated risk factors and developmental trajectory between child-onset and adolescent-onset conduct problems (Silberg 2015), we will exclude studies that include a proportion of children older than 12 years of age. We will also exclude studies that focus on a geographical area of disadvantage rather than pre-specifying a subgroup of socioeconomically disadvantaged families. This will exclude studies exploring generalisability rather than heterogeneity.

Types of interventions

Any personalised, psychological intervention that can act as an alternative or adjunct treatment, or an adaptation to standard practice for subgroups of children with conduct problems. Standard practice here may refer to parent training programmes or to other recommended interventions for children with conduct problems. We will exclude non-psychosocial/psychological interventions (e.g. pharmacological or dietary intervention).

Relevant comparisons may include no intervention, waiting-list control, standard practice, or a comparison intervention focused on conduct problems.

Types of outcome measures

Primary outcomes
  1. Improvement in child conduct problems or disruptive behaviour, as measured by, for example, the Eyberg Child Behaviour Inventory (ECBI; Eyberg 1978).

  2. Any adverse events, such as emotional or psychological trauma of any kind (for example, an increase in parental anxiety or depression over the course of a parent-focused treatment) or an increase in negative parenting practices, such as shouting or criticism.

Secondary outcomes
  1. Personalised treatment outcomes, relevant to each subgroup (e.g. reduction in ADHD symptoms, as measured by, for example, the Conners Abbreviated Parent/Teacher Rating Scale (CAP/TRS; Conners 1994); reduction in callous-unemotional (CU) traits, as measured by, for example, the Inventory of Callous-Unemotional Traits (ICU; Frick 2004); or maternal depression, as measured by, for example, the Beck Depression Inventory-II (BDI-II; Beck 1996)).

  2. Parenting skills and knowledge, as measured by direct observation or self-report (e.g. Parenting Scale (PS); Arnold 1993).

  3. Family functioning, as measured by, for example, the Family Assessment Device (FAD; Epstein 1983).

  4. Engagement and decreased dropout, as measured by number of sessions attended.

  5. Educational outcomes, as measured by, for example, the items capturing child academic performance from the MacArthur Health Behavior Questionnaire (MacArthur HBQ; Boyce 2002; Essex 2002), or developmental assessments such as the Bayley Scales of Infant and Toddler Development - Third Edition (Bayley-III; Bayley 2006) or the Mullen Scales of Early Learning (MSEL; Mullen 1995).

Primary and secondary outcomes may be measured by child, parent or carer reports, through questionnaires, interviews or observational assessments.

The 'Summary of findings' tables will report on both primary outcomes and the following two secondary outcomes: personalised treatment outcomes and parenting skills and knowledge.

Search methods for identification of studies

Electronic searches

We will identify relevant trials through searching the electronic databases and trials registers listed below.

  1. Cochrane Central Register of Controlled Trials (CENTRAL; current issue) in the Cochrane Library, which includes the Cochrane Developmental, Psychosocial and Learning Problems Group Specialised Register.

  2. MEDLINE Ovid (1946 onwards).

  3. MEDLINE EPub Ahead of Print Ovid (current issue).

  4. MEDLINE In-Process and Other Non-Indexed Citations Ovid (current issue).

  5. Embase Ovid (1974 onwards).

  6. PsycINFO Ovid (1967 onwards).

  7. CINAHLPlus EBSCOhost (Cumulative Index to Nursing and Allied Health Literature; 1937 onwards).

  8. ERIC EBSCOhost (Education Resources Information Center; 1966 onwards).

  9. Conference Proceedings Citation Index - Science Web of Science (CPCI-S; 1990 onwards).

  10. Conference Proceedings Citation Index - Social Science & Humanities Web of Science (CPCI-SS&H; 1990 onwards).

  11. Cochrane Database of Systematic Reviews (CDSR; current issue), part of the Cochrane Library.

  12. Database of Abstracts of Reviews of Effects (DARE; current issue), part of the Cochrane Library.

  13. Epistemonikos (

  14. (

  15. ISRCTN Registry (

We will not limit the search by language or publication date, and will seek translations of any studies of potential relevance.

We will use the search strategy in Appendix 1 to search MEDLINE and will modify it, as necessary, to search the other databases.

Searching other resources

We will examine the reference lists of included studies and relevant review articles to identify further studies (e.g. Bakker 2017; Dretzke 2009; Epstein 2015; Fossum 2009; Kaminski 2008; Lundahl 2006). We will contact authors of the identified RCTs to request further information and contact experts and researchers working in the field in order to search for unpublished and ongoing studies.

Data collection and analysis

Selection of studies

Review authors LF and EK will select studies in a two-phase process. LF will remove duplicate records and obviously irrelevant records based on a preliminary screen of titles. LF and EK will then screen remaining titles and abstracts for eligibility, and will retrieve full-text reports of potentially relevant studies. Next, LF and EK will independently assess the full-text reports for inclusion against the selection criteria (Criteria for considering studies for this review). Any studies that may, on the surface, appear to meet inclusion criteria, but are excluded, will be reported in the ‘Characteristics of excluded studies’ table. Both review authors will resolve any disagreements by discussion. The flow of information throughout the phases of this review will be documented and presented in a flow chart, as described by the PRISMA Statement (Moher 2009).

Data extraction and management

From each eligible study, LF and EK will independently extract information on a number of key characteristics, as described below, using electronic data collection forms. If any differences between the authors arise, these will be discussed and any discrepancies will be addressed. If necessary, we will alter the data collection form, for example, if any categories appear repeatedly irrelevant. This will be assessed throughout the process of data extraction. We will contact authors of studies directly should target information be unreported or unclear, in order to clarify or complete extracted data. We will enter the collected data onto a pre-designed ‘Characteristics of included studies’ table and, within the discussion, discuss the implications of any bias on outcomes or meta-analyses in this review.

  1. General information about the study: title, authors, year of publication, eligibility.

  2. Methods: study design, unit allocation, and duration of the study.

  3. Participants: setting, recruitment method, withdrawal from study, relevant diagnostic details, age, sex, race/ethnicity, further sociodemographic detail, subgroup allocation.

  4. Intervention: considerations and components related to the intervention, including theoretical basis, duration, session frequency, individual or group-based delivery, staff qualifications, outcome measures and scales, economic information, compliance and integrity of delivery.

  5. Outcomes: primary and secondary outcomes and time points considered in the review.

Citations and data will be entered and organised in Review Manager (RevMan) 5 (Review Manager 2014). Where data are unavailable, we will contact the study authors to request the missing information.

Assessment of risk of bias in included studies

LF and EK will independently assess the risk of bias in each included study using the 'Risk of bias' criteria described in the Cochrane Handbook for Systematic Reviews of Interventions (hereafter referred to as the Cochrane Handbook) (Higgins 2011a). For each included study, both review authors will assess each of the domains listed below and assign ratings of low, high or unclear risk of bias (see Table 1 for additional information on rating criteria). They will resolve any disagreements by discussion until they reach consensus, and will record their final decisions in a ‘Risk of bias’ table with a brief rationale for each decision.

Table 1. 'Risk of bias' criteria
'Risk of bias' domain | Criteria for low, high and unclear risk of bias
Sequence generation
  1. Low risk of bias, indicating a sufficiently random allocation method (e.g. using a random number generator, coin toss, or random number table).

  2. High risk of bias, indicating a non-random component in the sequence generation process (e.g. either a systematic approach such as sequence generation by date of birth, or allocation based on judgement such as that of the clinician or preference of the participant).

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Allocation concealment
  1. Low risk of bias, indicating that participants and investigators enrolling participants could not foresee assignment due to use of methods such as sequentially numbered envelopes or central allocation.

  2. High risk of bias, indicating that participants and investigators may be able to foresee assignments and introduce selection bias. For example, allocation based on date of birth or case record number.

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Blinding of participants and personnel
  1. Low risk of bias, indicating that performance is not likely to be influenced by lack of blinding.

  2. High risk of bias, indicating that performance is likely to be influenced by lack of blinding.

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Blinding of outcome assessment
  1. Low risk of bias, indicating that outcome measurement is not likely to be influenced by lack of blinding.

  2. High risk of bias, indicating that outcome assessment is likely to be influenced by lack of blinding.

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Incomplete outcome data
  1. Low risk of bias, indicating any of the following: a) there are no missing outcome data; b) reasons for missing outcome data are unlikely to be related to the true outcome; c) missing outcome data are balanced in numbers across intervention groups; d) the proportion of missing outcomes is not enough to have a clinically relevant impact; or e) missing data have been imputed using appropriate methods.

  2. High risk of bias, indicating any of the following: a) reasons for missing outcome data are likely to be related to the true outcome; b) clinically relevant bias is likely; c) ‘as-treated’ analysis is done with substantial departure from the intervention received at randomisation; d) there is potentially inappropriate application of simple imputation.

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Selective outcome reporting
  1. Low risk of bias, indicating that all of the pre-specified outcomes of the review are included.

  2. High risk of bias, indicating any of the following: a) not all of the study’s pre-specified primary outcomes have been reported; b) the primary outcome is not pre-specified; c) outcomes are reported incompletely so they cannot be entered in a meta-analysis; d) the study does not report key outcomes that would be expected for such a study.

  3. Unclear risk of bias, indicating that insufficient information is available to make a judgement of either low or high risk of bias.

Other sources of bias
  1. Low risk of bias, indicating that the study appears to be free of other sources of bias.

  2. High risk of bias, indicating that there is at least one important risk of bias (e.g. related to specific study design, fraudulent claims, or other issues).

  3. Unclear risk of bias, indicating that there is insufficient information available to make a judgement of either low or high risk of bias.

Sequence generation

We will assess whether the methods used to generate the allocation sequence should have produced comparable groups, and will add a comment to indicate whether the method was likely to have been carried out.

Allocation concealment

We will assess the methods used to conceal the allocation sequence and whether allocation could have been foreseen in advance of, or during, recruitment.

Blinding of participants and personnel

We will assess the possibility of performance bias due to knowledge of the allocated interventions by participants or study personnel.

Blinding of outcome assessment

We will assess the possibility of detection bias due to knowledge of the allocated interventions by outcome assessors.

Incomplete outcome data

We will assess the risk of attrition bias due to the amount, nature or handling of incomplete outcome data.

Selective outcome reporting

We will assess the risk of reporting bias due to selective outcome reporting.

Other sources of bias

We will assess the risk of other causes of bias not covered by the above domains, such as lack of adherence to treatment manual and differences between groups aside from prescribed intervention (e.g. additional services).

Measures of treatment effect

Continuous data

We will analyse continuous data, providing means and standard deviations are available, or there are other ways to measure effect size. We will request additional information from the authors if reports have insufficient data.

Where studies have used the same outcome measures, we will calculate the mean difference (MD). In studies where different measures have been used to assess the impact of the intervention on the same outcome, we will use the standardised mean difference (SMD) with 95% confidence intervals (CIs).
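As a worked illustration of these two effect measures, the following Python sketch computes an MD and an SMD with approximate 95% CIs from group-level summary statistics. It is not part of the protocol's methods. The function names are hypothetical, the SMD is computed as Hedges' adjusted g with a common large-sample variance approximation, and a normal critical value of 1.96 is assumed; review software may use slightly different formulae.

```python
import math


def mean_difference(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Mean difference (group 1 minus group 2) with an approximate 95% CI."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, md - z * se, md + z * se


def standardised_mean_difference(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Hedges' adjusted g with an approximate 95% CI (illustrative formula)."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    # Small-sample correction factor
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, scaled by the correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    se = j * math.sqrt(var_d)
    return g, g - z * se, g + z * se
```

The MD keeps the original scale of the instrument, whereas the SMD expresses the difference in pooled standard deviation units, which is what allows different instruments measuring the same outcome to be combined.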


Dichotomous data

We will calculate risk ratios (RR) with 95% CIs for dichotomous data.
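For illustration only, a risk ratio and its approximate 95% CI can be computed from a two-by-two table as sketched below. The function name is hypothetical, the CI uses the usual log-scale standard error with a normal critical value of 1.96, and the sketch assumes at least one event in each arm (no continuity correction is applied).

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio (treatment vs control) with an approximate 95% CI."""
    if events_t <= 0 or events_c <= 0:
        raise ValueError("this sketch assumes at least one event in each arm")
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper
```

An RR below 1 indicates that the event is less frequent in the treatment arm than in the comparison arm; a CI that excludes 1 corresponds to a statistically significant difference at the 5% level.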

Unit of analysis issues

Cluster-randomised trials

It is possible that we will identify cluster-randomised trials in this search as interventions may be allocated by nurseries or schools. We will take into account the level at which randomisation occurred to determine whether individuals were randomised individually or in groups. We will analyse cluster-randomised trials using the average cluster size and an estimate of the intraclass correlation coefficient (ICC) to adjust sample sizes to the 'effective sample size'. This process will follow the methods described in the Cochrane Handbook (Higgins 2011b). Where an estimate of the ICC is not available, we will contact trial authors to obtain this information. Where this is unavailable, we will use an estimate from a similar trial or a trial with a similar population. We will combine individually randomised RCTs with cluster-RCTs if we consider the designs and interventions as sufficiently similar and the effect of the intervention is unlikely to be influenced by the method of randomisation. We will conduct sensitivity analyses if RCTs have not statistically accounted for clustering (see Sensitivity analysis).
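The effective sample size adjustment can be sketched as follows. This is a minimal illustration with hypothetical function names: the sample size of each arm is divided by the design effect, 1 + (M - 1) × ICC, where M is the average cluster size.

```python
def design_effect(average_cluster_size, icc):
    """Design effect for a cluster-randomised trial: 1 + (M - 1) * ICC."""
    return 1.0 + (average_cluster_size - 1.0) * icc


def effective_sample_size(n, average_cluster_size, icc):
    """Sample size of one arm after adjusting for clustering."""
    return n / design_effect(average_cluster_size, icc)
```

For example, with an average cluster size of 21 and an ICC of 0.05, the design effect is 2, so an arm of 100 children would contribute an effective sample size of 50. For dichotomous outcomes, the number of events would be divided by the same design effect.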

Multiple treatment groups

In the case of multi-arm trials, where each arm comprises an active treatment (for example, treatment-as-usual rather than a no treatment control), we will compare each treatment against each other in pair-wise comparisons. For dichotomous data, we will sum the sample sizes and events across groups. For continuous data, we will combine sample sizes, means and standard deviations in accordance with the Cochrane Handbook (Higgins 2011b).
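Combining two groups into one can be illustrated with the short sketch below. The function names are hypothetical; the continuous case uses the standard formula for pooling the sample sizes, means and standard deviations of two groups into a single group, and the dichotomous case simply sums events and sample sizes.

```python
import math


def combine_continuous(n1, m1, sd1, n2, m2, sd2):
    """Combine two groups' sample sizes, means and SDs into a single group."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Within-group variation plus the variation between the two group means
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)


def combine_dichotomous(events1, n1, events2, n2):
    """Combine two groups by summing events and sample sizes."""
    return events1 + events2, n1 + n2
```

The combined standard deviation is larger than either group's own standard deviation whenever the group means differ, because the difference between means adds to the overall spread. With more than two groups, the function can be applied sequentially.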

Dealing with missing data

We will assess missing data and dropouts or attrition for each study and contact study authors where there are missing or unclear data. We will attempt to retrieve any missing data from authors in addition to numbers, characteristics and reasons for dropout. If missing data are unavailable, we will conduct analyses using available data only, and will note any missing data in the data extraction form and in the 'Risk of bias' table. We will discuss the extent to which missing data are likely to influence the results of the study within the discussion.

Assessment of heterogeneity

We will assess two different types of heterogeneity: clinical (differences in participants, type and intensity of intervention) and statistical. We will assess statistical heterogeneity using a Chi² test (with a P value of < 0.10), with heterogeneity being indicated by a Chi² statistic greater than the degrees of freedom and a small P value, and by visual inspection of forest plots. Heterogeneity will be indicated by limited overlap of studies on the forest plot, or by outliers. We will also use the I² statistic to detect inconsistencies across studies and to determine the approximate proportion of variation that is due to heterogeneity rather than sampling error. We will interpret I² values as follows: 0% to 40% might not be important; 30% to 60% may represent moderate heterogeneity; 50% to 90% may represent substantial heterogeneity; 75% to 100% shows considerable heterogeneity. The strength of evidence for heterogeneity (for example, the P value from the Chi² test) will be accounted for while interpreting I² values. As part of a random-effects meta-analysis, we will report Tau² as an estimate of between-study variability. If there is evidence of heterogeneity, we will discuss possible reasons for it within the Discussion and conduct subgroup analyses (see Subgroup analysis and investigation of heterogeneity).
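The relationship between the Chi-squared heterogeneity statistic (Cochran's Q), its degrees of freedom and the I-squared statistic can be sketched as below. The function name is hypothetical, the inputs are per-study effect estimates with their variances, and the P value is omitted because it would need a chi-squared distribution function from outside the standard library.

```python
def heterogeneity(effects, variances):
    """Return (Q, df, I2 in percent) from study effects and their variances."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of Q in excess of its degrees of freedom, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2
```

For three invented studies with effects 0.1, 0.5 and 0.9 and a common variance of 0.04, Q is 8 on 2 degrees of freedom and I-squared is 75%.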

Assessment of reporting biases

Where an individual meta-analysis contains at least 10 studies, we will construct and visually assess funnel plots for skewness of data. A relationship between effect size and standard error could be due to publication or related biases, or differences between small and large studies. We will assess funnel plot asymmetry using Egger’s test (Egger 1997).
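A minimal sketch of the regression behind Egger's test follows, assuming its usual form: the standardised effect (effect divided by its standard error) is regressed on precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name is hypothetical, and the sketch returns the intercept with its standard error rather than a P value, which would need a t distribution.

```python
import math


def egger_regression(effects, standard_errors):
    """Return (intercept, slope, se_intercept) of Egger's regression."""
    x = [1 / se for se in standard_errors]                     # precision
    y = [e / se for e, se in zip(effects, standard_errors)]    # standardised effect
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance on n - 2 degrees of freedom (needs at least 3 studies).
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept
```

Dividing the intercept by its standard error gives a t statistic on n - 2 degrees of freedom.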

Data synthesis

We will conduct the data synthesis in Review Manager 2014, Cochrane’s meta-analysis software. We will perform meta-analyses where studies have sufficiently similar participants (e.g. those belonging to the same subgroup), comparators, outcomes and time frames within which follow-up assessments were completed. As identified studies are likely to be estimating different but related intervention effects, we will use a random-effects model with an inverse-variance method to calculate weighted mean effect sizes with 95% CIs, and display them with forest plots. If studies are clinically diverse, we may need to consider them separately, and we will not conduct a meta-analysis if we consider studies to have serious reporting or publication biases. Should this be the case, we will instead provide a narrative description of the results.
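For illustration, an inverse-variance random-effects pooling can be sketched as follows. The sketch assumes the DerSimonian and Laird estimator of the between-study variance, which is not named in this protocol, and the function name is hypothetical.

```python
import math


def random_effects_dl(effects, variances, z=1.96):
    """Return (pooled, lower, upper, tau2) under a random-effects model."""
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # Method-of-moments estimate of between-study variance, floored at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each study's own variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - z * se, pooled + z * se, tau2
```

When the estimated between-study variance is zero, the weights reduce to the fixed-effect inverse-variance weights; as it grows, the study weights become more equal and the interval widens.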

'Summary of findings' table

We will summarise the results per comparison in a 'Summary of findings' table, which we will construct using software developed by the GRADE Working Group: GRADEpro GDT 2015. We have decided to include only our primary outcomes (improvement in child conduct problems or disruptive behaviour; and any adverse events), and two of our secondary outcomes (personalised treatment outcomes; parenting skills and knowledge).

We will present the effects of interventions as risk differences (RD) (for absolute effects), and RR (for relative effects) with accompanying 95% CIs. We will indicate the comparisons in the table and may include no intervention, waiting-list control, standard practice (including parent training), or a comparison intervention focused on conduct problems.
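A risk difference with a 95% CI can be sketched as below for a single two-arm comparison. The function name is hypothetical, and the sketch uses the simple normal approximation rather than any procedure specific to GRADEpro GDT.

```python
import math


def risk_difference(events_t, n_t, events_c, n_c, z=1.96):
    """Return (RD, lower, upper) for two independent groups."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    rd = p_t - p_c
    # Standard error from the two binomial variances.
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return rd, rd - z * se, rd + z * se
```

With 10 events among 100 treated children and 20 among 100 controls, the RD is -0.10, that is, 10 fewer events per 100 children.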

Review authors LF and EK will assess the quality of the body of evidence according to the following features: limitations in design and implementation, indirectness of evidence, unexplained heterogeneity or inconsistent results, imprecision of results, and high risk of publication bias. Both review authors will label the quality of evidence as high, moderate, low, or very low, and provide a narrative description of key findings.

Subgroup analysis and investigation of heterogeneity

We may conduct subgroup analyses to further investigate causes of heterogeneity. Possible subgroups may include:

  1. age of participating children across groups (under five years of age; four to eight years of age; nine to 12 years of age);

  2. gender (boys versus girls); and

  3. initial severity of conduct problems.

Sensitivity analysis

We will perform sensitivity analyses to evaluate the impact of study quality on the robustness of the conclusions drawn. Specifically, we will explore potential heterogeneity between studies according to:

  1. missing data (sensitivity analysis will be conducted when we cannot assume that data are missing at random, when attrition is higher than 20%, or where an appropriate intention-to-treat (ITT) analysis has not been undertaken); and

  2. cluster effects (sensitivity analysis will be conducted if cluster-RCTs have not adjusted for clustering).

If necessary, we will conduct additional analyses for any further potential issues identified that may impact the robustness of review findings.


We would like to thank the Managing Editor (Dr Joanne Wilson) of the Cochrane Developmental, Psychosocial and Learning Problems Group for all her helpful suggestions, guidance and advice. We would also like to thank Margaret Anderson (Information Specialist, Cochrane Group) who helped in the construction and development of our search strategy.


Appendix 1. MEDLINE search strategy

1 "Attention Deficit and Disruptive Behavior Disorders"/
2 "disruptive, impulse control, and conduct disorders"/
3 conduct disorder/
4 (conduct$ adj3 (difficult$ or disorder$ or disturb$ or problem$)).tw,kf.
5 child behavior disorders/
6 Social Behavior Disorders/
7 (oppositional adj3 (defian$ or disorder$)).tw,kf.
8 (disrupt$ adj3 (behav$ or disorder$)).tw,kf.
9 (defian$ adj3 (behav$ or disorder$)).tw,kf.
10 (impuls$ adj3 (behav$ or disorder$)).tw,kf.
11 exp Aggression/
12 (aggressiv$ adj2 (behav$ or disorder$)).tw,kf.
13 (behav$ adj2 (agonistic or antisocial or anti-social or challeng$ or disorder$ or disturb$ or externali$ or problem$)).tw,kf.
14 or/1-13
15 Minors/
16 exp child/
17 (boy$1 or child$ or delinquen$ or girl$1 or graders or junior$1 or juvenile$1 or kindergarten or minors or p?ediatric$ or preadolescen$ or prepubert$ or prepubescen$ or preschool$ or preteen$ or pubert$ or pubescen$ or school$ or toddler$).tw,kf.
18 or/15-17
19 14 and 18
20 randomized controlled
21 controlled clinical
22 randomi#ed.ab.
23 placebo.ab.
24 clinical trials as
25 randomly.ab.
26 trial.ti.
27 or/20-26
28 exp animals/ not
29 27 not 28
30 19 and 29

Contributions of authors

LF and EK drafted the protocol. CR provided statistical support for the protocol.
EK is the guarantor for the review.

Declarations of interest

Eilis Kennedy and Chris Roberts are co-applicants on a National Institute for Health Research (NIHR) grant, the long-term objectives of which are to investigate outcomes of parent training and develop personalised interventions for children with conduct problems. The current review represents the first, informative phase of this six-year research project. Lorna French is a member of this research team.

Sources of support

Internal sources

  • Tavistock and Portman NHS Foundation Trust, UK.

    Provided base for review authors.

    The views and opinions do not necessarily reflect those of the NHS or Department of Health. The authors alone are responsible for the views expressed in this publication.

External sources

  • National Institute for Health Research (NIHR), UK.

    This review is being completed as part of a NIHR Programme Grant for Applied Research (PGfAR).

    The views and opinions do not necessarily reflect those of the NIHR. The authors alone are responsible for the views expressed in this publication.