Overall prognosis of preschool autism spectrum disorder diagnoses

This is not the most recent version

Abstract

This is a protocol for a Cochrane Review (Prognosis). The objectives are as follows:

The primary objective of this review is to synthesise the available evidence on the proportion of individuals who have a diagnosis of autism spectrum disorder at baseline and at follow‐up one or more years later.

The secondary objectives of this review are to:

  1. investigate whether there are differences in the proportions of individuals with autism spectrum disorder who maintain a diagnosis at follow‐up dependent on use of the different classification systems (i.e. DSM or ICD criteria) and their revisions; and

  2. investigate the proportion of individuals with autism spectrum disorder who maintain diagnosis at follow‐up in important subgroups of individuals, including those of different ages and those with different language levels (verbal/non‐verbal; standard score ≤ 70 or > 70), IQs (≤ 70 or > 70), adaptive behaviour (standard score ≤ 70 or > 70), and different diagnostic subgroups (Asperger's syndrome/disorder, autistic disorder, childhood autism, PDD‐NOS, atypical autism, PDD and autism spectrum disorder).

We will investigate potential sources of heterogeneity that may impact outcomes such as differences in study participation, study design, length of follow‐up, participant attrition and participant outcome measurement factors. We will use internationally recognised standards for systematic reviews to guide the review.

Background

Description of the condition

Autism spectrum disorder is a complex and heterogeneous neurodevelopmental disorder. It is currently diagnosed by the presence of two core features, which include social communication difficulties, and repetitive and restricted interests and behaviours (APA 2013). A diagnosis of autism spectrum disorder is typically made using criteria from the Diagnostic and Statistical Manual of Mental Disorders (DSM) or the International Classification of Diseases (ICD) (APA 2013; WHO 1992). A number of other neurodevelopmental conditions are associated with autism spectrum disorder, such as speech and language difficulties, intellectual disability, and attention deficit hyperactivity disorder. In the fifth edition of the DSM (DSM‐5) (APA 2013), clinicians are encouraged to identify whether individuals have these aforementioned conditions, in addition to autism spectrum disorder. Severity ratings have also been introduced for each of the two main criteria (i.e. social communication difficulties and restricted and repetitive interests and behaviours), which form the diagnosis of autism spectrum disorder.

In the past two decades there have been many changes in the way individuals with autism spectrum disorder are diagnosed and cared for. While the DSM‐5, published in 2013, now uses the broad term 'autism spectrum disorder' (APA 2013), previous editions of the DSM and the ICD have used different criteria and included diagnostic subgroups based on the individual's profile of symptoms (APA 1980; APA 1994; APA 2000; NCHS 2011) (Table 1).

Table 1. Changes to the classification systems over time.

| Classification system | Year published | Subgroups (as specified in the classification system) |
| --- | --- | --- |
| International Classification of Diseases, Ninth Revision, Clinical Modification (ICD‐9‐CM) | 1975 | Autistic disorder |
| International Classification of Diseases, Tenth Revision (ICD‐10) | 1996 | Childhood autism, Asperger's syndrome, atypical autism, pervasive developmental disorder (PDD) ‐ unspecified |
| Diagnostic and Statistical Manual of Mental Disorders, Third Edition (DSM‐III) | 1980 | PDD: infantile autism, childhood onset PDD and atypical PDD |
| Diagnostic and Statistical Manual of Mental Disorders, Third Edition, Revised (DSM‐III‐R) | 1987 | PDD: autistic disorder, PDD‐not otherwise specified (PDD‐NOS) |
| Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM‐IV) | 1994 to 2000 | Asperger’s disorder, autistic disorder, PDD‐NOS |
| Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM‐IV‐TR) | 2000 to 2013 | Asperger’s disorder, autistic disorder, PDD‐NOS |
| Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM‐5) | 2013 to current | Autism spectrum disorder |

Recently, the US National Institute of Mental Health has reoriented its focus away from diagnostic categories in mental disorders toward use of the Research Domain Criteria (RDoC) framework. This research framework focuses on the dimensions of functioning that underlie human behaviour. This reorientation will not impact already published studies that have used autism spectrum disorder diagnostic labels but may impact the way future studies are structured and the types of outcomes they report on, particularly those studies conducted in the USA.

The diagnostic validity of autism spectrum disorder, both in terms of its diagnostic accuracy and clinical utility, is not well described despite the high prevalence of the disorder (CDC 2014; Kim 2011). Current recommendations on how autism spectrum disorder should be diagnosed include a combination of history, observation and application of DSM or ICD criteria, taking into account the overall abilities of the person and ensuring alternative diagnoses are excluded (NICE 2011; Volkmar 2014; WAADF 2012). A number of assessment tools have been published that can assist with diagnosis. These include the Autism Diagnostic Observation Schedule (ADOS; Lord 2000), the Autism Diagnostic Interview ‐ Revised (ADI‐R; Le Couteur 2003) and the Childhood Autism Rating Scale (CARS; Schopler 1980). It is recommended that input from a medical expert, psychologist and speech pathologist is available; other expertise may be required depending on the strengths and difficulties of each individual. In both practice and research, the diagnostic process often falls short of this recommendation, and is instead based on clinical best estimates and the results of one or more diagnostic instruments. Unfortunately, all methods of diagnosing autism spectrum disorder include an unquantifiable amount of diagnostic inaccuracy, with a higher amount expected for practices that fall short of the current recommendations. As such, the diagnostic stability and long‐term outcome of autism spectrum disorder are entangled and cannot yet be separated.

The clinical pathway for individuals with autism spectrum disorder can be variable, with some individuals showing signs of autism spectrum disorder from as early as 12 months of age and others being described as having typical development followed by a period of developmental regression or loss of previously acquired skills (Landa 2013). To receive a diagnosis of autism spectrum disorder, symptoms must be present from childhood; however, some individuals with more subtle functional impairment may not receive a diagnosis of autism spectrum disorder until the school years or adulthood.

The reported prevalence of autism spectrum disorder has increased over the past two decades (Elsabbagh 2012), with estimates from the USA reporting that 1.5% of children aged eight years have been diagnosed with the condition (CDC 2014). Elsabbagh 2012 reported the global median of prevalence estimates of autism spectrum disorders to be 62/10,000 (0.62%; range 0.01% to 1.89%). Several factors have been proposed to explain the increase in prevalence of autism spectrum disorder, including increased community awareness of the condition, administrative factors (e.g. specific funding for a diagnosis of autism spectrum disorder relative to other conditions), a broadening of the criteria used to diagnose autism spectrum disorder and diagnostic substitution with conditions such as intellectual disability (Fombonne 2009; Hansen 2015; King 2009; Wing 2002a). A study from Sweden reported an increase in the diagnosis of autism spectrum disorder but no change in the prevalence of traits of the type seen with autism spectrum disorder (Lundstrom 2015). Whether there is a true increase in autism spectrum disorder, or whether the increase in the prevalence of the disorder is due to the aforementioned factors, is yet to be determined.

The cause of autism spectrum disorder is not yet known although it is thought to have roots in genetics and brain development. Autism spectrum disorder may have shared aetiological pathways with other neurodevelopmental disabilities, such as intellectual disability, language impairment and attention deficit hyperactivity disorder, and individuals with these conditions may present with overlaps in functioning, behaviour and cognitive deficits (Simonoff 2008). The aforementioned conditions commonly co‐occur with autism spectrum disorder and may exacerbate the symptoms of autism spectrum disorder. No biological markers for autism spectrum disorder have been identified; hence, autism spectrum disorder is diagnosed by behavioural observation.

Many clinicians and families believe that autism spectrum disorder is a lifelong disability; however, there is debate in the current literature regarding the permanence of autism spectrum disorder. Some studies have reported that a significant proportion of previously affected individuals no longer meet the diagnostic criteria for the disorder (Corsello 2013; Daniels 2011; Kleinman 2008; Turner 2007), and have also reported a variety of factors found to influence the stability of the diagnosis. These factors included: age at diagnosis (Daniels 2011; Turner 2007); milder symptoms of autism spectrum disorder (particularly in the social domain) and higher cognitive scores at two years of age (Turner 2007); the diagnosing clinician, region and a history of regression (Daniels 2011); and maturation, type of diagnosis (autistic disorder versus pervasive developmental disorder (PDD) ‐ not otherwise specified (NOS)), amount of intervention and over‐diagnosis at two years of age (Kleinman 2008). Other studies report that autism spectrum disorder can be reliably diagnosed and that few individuals “grow out” of a diagnosis (Barbaro 2016; Guthrie 2013; Jónsdóttir 2007; Ozonoff 2015; Takeda 2005). In early intervention studies, if children no longer fulfil the criteria for autism spectrum disorder at the end of the study, they may be described as having achieved an "optimal outcome" (e.g. Fein 2013; Orinstein 2014).

Individuals are diagnosed with autism spectrum disorder at varying ages and following different early life experiences. The diagnosis conveys problems with an individual's ability and implies that they may face challenges ahead that will require intervention and support in addition to that required by their peers. These challenges might include difficulty with social relationships, poor reading, spelling and communication abilities, behaviour difficulties, higher levels of dependence on others and a poorer quality of life (Howlin 2004; Howlin 2012). The steps that then follow depend on the individual’s problems and abilities, the parents’ or individual's wishes, available and accessible interventions and services, and the trajectory from the time of diagnosis. For example, families may choose to pursue a range of different interventions, from complementary to traditional, and their choices may be influenced by what is readily available or promoted in their community (Goin‐Kochel 2007; Green 2006). What is currently lacking is an evidence base to inform the advice given to parents about what this diagnosis will mean for their child in the short and long term. In part, this is due to inconsistent evidence about the proportion of individuals diagnosed with autism spectrum disorder who still fulfil the diagnosis after a year or more, and about whether prognosis differs by age at diagnosis, diagnostic subgroup or level of intelligence.

Why it is important to do this review

Autism spectrum disorder is a global health issue with far‐reaching impacts for affected individuals, their families and support agencies. Substantial economic and social costs are associated with autism spectrum disorder (Ganz 2007; Horlin 2014), with the cost of supporting an individual with autism spectrum disorder throughout their lifespan estimated to be around US $2.4 million (if they have an intellectual disability) or US $1.4 million (without an intellectual disability) (Beuscher 2014). As such, there are substantial implications for policy makers and service providers with regard to the allocation of resources and the planning of future support needs of individuals with autism spectrum disorder.

Families, individuals with autism spectrum disorder and clinicians require high‐quality, reliable information about what proportion of individuals will continue to have a diagnosis of autism spectrum disorder as a crucial first step in understanding diagnostic validity and prognosis. It is also a vital part of the information that people need as they try to understand their own or their child's strengths and difficulties, and plan for their short‐ and long‐term future. Furthermore, information on the proportion of individuals who retain a diagnosis is important for policy makers and service providers so they can better plan future support needs for individuals.

A review that included a meta‐analysis of eight longitudinal studies found that the proportion of children who were still diagnosed with an autism spectrum disorder at follow‐up was lower if their original diagnosis was PDD‐NOS rather than autistic disorder (Rondeau 2010). This study had a number of methodological limitations, such as only searching one database, and it did not assess risk of bias. Another review, Woolfenden 2012, found that a substantial proportion of children diagnosed with PDD‐NOS either did not have a diagnosis at follow‐up or were diagnosed as having autistic disorder. The proportion that changed diagnosis was higher than many clinicians had anticipated. Variations in the persistence of a diagnosis according to age group, diagnostic subgroup and intelligence quotient (IQ) were also reported. Most studies included in the review were found to be at high risk of bias.

Much work has been published since the time of the last searches for these prior reviews. An update is needed to find whether higher‐quality evidence is now available and to assess whether current information about the proportion of individuals who retain a diagnosis of an autism spectrum disorder at follow‐up is sufficient to inform individually tailored decision making. This Cochrane Review will search a broader range of databases than previous reviews. It will also include a 'Risk of bias' assessment of the included studies.

In this Cochrane Review, we will not be able to definitively determine whether the individual has 'grown out' of autism spectrum disorder due to maturation or intervention, or if the original diagnosis was inaccurate. We will include studies in which diagnostic practices reflect current commonly used standards for research or clinical care and, if possible, we will investigate whether the diagnostic approach or tools (including age‐ or ability‐modified versions of those tools), or both, at baseline, and consistency between tools at baseline and follow‐up, contribute to differences in the proportions of children who retain a diagnosis. This will provide valuable information for clinicians and families about what the outcome will be in an individual with autism spectrum disorder one year or more after the use of these approaches.

Objectives

The primary objective of this review is to synthesise the available evidence on the proportion of individuals who have a diagnosis of autism spectrum disorder at baseline and at follow‐up one or more years later.

The secondary objectives of this review are to:

  1. investigate whether there are differences in the proportions of individuals with autism spectrum disorder who maintain a diagnosis at follow‐up dependent on use of the different classification systems (i.e. DSM or ICD criteria) and their revisions; and

  2. investigate the proportion of individuals with autism spectrum disorder who maintain diagnosis at follow‐up in important subgroups of individuals, including those of different ages and those with different language levels (verbal/non‐verbal; standard score ≤ 70 or > 70), IQs (≤ 70 or > 70), adaptive behaviour (standard score ≤ 70 or > 70), and different diagnostic subgroups (Asperger's syndrome/disorder, autistic disorder, childhood autism, PDD‐NOS, atypical autism, PDD and autism spectrum disorder).

We will investigate potential sources of heterogeneity that may impact outcomes such as differences in study participation, study design, length of follow‐up, participant attrition and participant outcome measurement factors. We will use internationally recognised standards for systematic reviews to guide the review.

Methods

Criteria for considering studies for this review

Types of studies

We will include published reports of prospective and retrospective longitudinal studies investigating the prognosis of autism spectrum disorder that use the same measure to diagnose autism spectrum disorder at baseline and follow‐up. Studies are required to have at least 12 months of follow‐up and contain at least 10 participants. The decision to use 10 as the minimum number of participants was made in conjunction with a statistician and is consistent with prior methods used in other studies (e.g. Magiati 2014). A study may or may not include a comparison group that is being observed over the same time period, the characteristics of which are being assessed in the same manner. We will exclude studies from the review if the follow‐up of children or adults with autism spectrum disorder is incidental to another syndrome and the outcomes are not appropriately measured (i.e. information on diagnosis at follow‐up is not provided).

Types of participants

Participants diagnosed with autism spectrum disorder, PDD, PDD‐NOS, atypical autism, PDD‐unspecified, Asperger's syndrome/disorder, autism, autistic disorder or childhood autism. Diagnosis must have been made using a standardised diagnostic tool (see 'Types of outcome measures' below for eligible tools) or by using established diagnostic criteria (e.g. criteria from the third edition (APA 1980), fourth edition (APA 1994), fourth edition, text revision (APA 2000) and fifth edition (APA 2013) of the DSM (DSM‐III, DSM‐IV, DSM‐IV‐TR and DSM‐5, respectively), or the ninth revision (WHO 1979) and tenth revision (WHO 1992) of the ICD (ICD‐9 and ICD‐10, respectively)). Only children initially diagnosed before the age of six years and followed up before the age of 19 years will be eligible for inclusion. We will include children with a dual diagnosis; for example, a diagnosis of Asperger's syndrome/disorder and attention deficit hyperactivity disorder. We will include children with medical aetiologies such as Fragile X syndrome and tuberous sclerosis only if these medical conditions co‐occur with a diagnosis of autism spectrum disorder and occur infrequently within the sample, rather than being the focus of the sample. We will exclude studies of children who have Rett syndrome.

Types of prognostic factors

We will not be analysing prognostic factors in this review.

Types of outcome measures

This review will focus on diagnostic stability only. There are other important outcomes for individuals with autism spectrum disorder (e.g. adaptive behaviour); however, reporting on these is beyond the scope of this review.

Primary outcome

  1. Proportion of individuals who have a diagnosis of autism spectrum disorder at baseline and at follow‐up one or more years later.

Diagnosis at follow‐up must have been made using DSM or ICD criteria, or a DSM‐ or ICD‐compatible standardised tool. Tools accepted for diagnosis include: the ADI‐R (Le Couteur 2003), CARS (Schopler 1980), ADOS (Lord 2000), Diagnostic Interview for Social and Communication Disorders (DISCO; Wing 2002b), the Gilliam Autism Rating Scale (GARS; Gilliam 1995) and the Developmental, Dimensional and Diagnostic Interview (3di; Skuse 2004). For each tool, individuals must meet the published cut‐off for a diagnosis of autism spectrum disorder. We expect that most studies will present this outcome as a dichotomous variable (i.e. diagnosis or no diagnosis). For studies that present data using continuous measures (e.g. a score on a diagnostic scale), we will analyse the data by computing a dichotomous variable. If studies do not provide the required data we will contact the authors to request it.
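The dichotomisation step described above can be sketched as follows. This is a minimal illustration only, assuming that higher scores indicate more symptoms and that a score at or above the published cut‐off counts as meeting criteria; the function names and the cut‐off value in the usage note are hypothetical and are not taken from any diagnostic tool.

```python
from typing import Optional, Sequence, Tuple


def dichotomise(score: Optional[float], cutoff: float) -> Optional[bool]:
    """Return True if the score meets the cut-off, None if the score is missing."""
    if score is None:
        return None
    # Assumption: meeting the cut-off means score >= cutoff.
    return score >= cutoff


def proportion_retaining(
    scores: Sequence[Optional[float]], cutoff: float
) -> Tuple[float, int]:
    """Proportion of observed participants who still meet the cut-off at follow-up.

    Missing scores are dropped rather than imputed, so the denominator is
    the number of participants with an observed follow-up score.
    """
    observed = [d for d in (dichotomise(s, cutoff) for s in scores) if d is not None]
    if not observed:
        raise ValueError("no observed follow-up scores")
    return sum(observed) / len(observed), len(observed)
```

For example, `proportion_retaining([30, 25, None, 35, 40], 30)` returns `(0.75, 4)`: three of the four observed scores meet the illustrative cut‐off of 30, and the missing score is excluded from the denominator.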

Secondary outcomes

We will assess the outcomes below, as measured by the diagnostic classification system or diagnostic tool used for the primary outcome, providing the data are available separately. When data are available, we will compile and provide a narrative description of the results.

  1. Social communication.

  2. Restricted and repetitive behaviour.

We have included these two secondary outcomes in consideration of the reorientation toward RDoC, with future studies likely to report on diagnostic outcomes in more dimensional ways.

We will group outcome data into three time periods for analysis purposes: short‐term (up to 2 years), medium‐term (2 to 5 years) and long‐term follow‐up (6 to 17 years).
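One possible way of assigning a study's follow‐up duration to these periods is sketched below. The boundary handling is an assumption made for illustration, because the stated ranges share the value of 2 years and leave the interval between 5 and 6 years unassigned: here, exactly 2 years is treated as short‐term and anything above 2 but below 6 years as medium‐term. The function name is hypothetical.

```python
def follow_up_band(years: float) -> str:
    """Assign a follow-up duration (in years) to a time period for analysis.

    Assumed boundaries (not specified in the protocol text):
      1 <= years <= 2  -> short-term
      2 <  years <  6  -> medium-term
      6 <= years <= 17 -> long-term
    """
    if years < 1:
        raise ValueError("follow-up shorter than the one-year minimum")
    if years <= 2:
        return "short-term"
    if years < 6:
        return "medium-term"
    if years <= 17:
        return "long-term"
    raise ValueError("follow-up longer than the 17-year maximum")
```

A different convention for the shared and unassigned boundaries would only require changing the comparison operators.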

We will include all outcomes in a 'GRADE Evidence Profile' table.

Search method for identification of studies

Electronic searches

We will search the following electronic databases:

  1. MEDLINE Ovid (1946 to current);

  2. MEDLINE In‐Process & Other Non‐Indexed Citations Ovid (current issue);

  3. MEDLINE Epub Ahead of Print Ovid (current issue);

  4. Embase Ovid (1980 to current);

  5. CINAHL Plus EBSCOhost (Cumulative Index to Nursing and Allied Health Literature; 1937 to current);

  6. PsycINFO Ovid (1967 to current);

  7. Conference Proceedings Citation Index ‐ Science Web of Science (CPCI‐S; 1990 to current);

  8. Conference Proceedings Citation Index ‐ Social Science & Humanities Web of Science (CPCI‐SS&H; 1990 to current);

  9. Cochrane Database of Systematic Reviews (CDSR; current issue) part of the Cochrane Library;

  10. Database of Abstracts of Reviews of Effects (DARE; current issue) part of the Cochrane Library; and

  11. Epistemonikos (www.epistemonikos.org; all available years).

We will search Ovid MEDLINE using the strategy in Appendix 1, which we will not limit by date of publication or by language. The strategy will be adapted for each of the other databases.

Searching other resources

We will identify additional studies by contacting known experts in the field, and by searching the reference lists of relevant reports identified by the electronic searches, including the reference lists of relevant systematic reviews (e.g. reviews of outcomes linked to diagnosis such as language, epilepsy and mortality). We will also use Web of Science (Thomson Reuters) to perform forward citation searches of any included studies, and will search other relevant sites such as those of the UK National Institute for Health Research, and SciELO (Science Electronic Library Online).

Data collection and analysis

Selection of studies

Two review authors (AB, NA‐U, SW or KW) will independently screen records by title and abstract, removing those that are clearly irrelevant (i.e. those that do not meet the criteria listed above). We will advance to the next stage records that have collected information on diagnosis of autism spectrum disorder, followed individuals for 12 months or more and have a sample size of at least 10. We will then obtain the full text of potentially relevant reports, including those where we consider inclusion criteria to be unclear, for review and, where appropriate, data extraction. At this stage, we will screen reports for type of diagnostic assessment tool used, whether diagnosis was made at baseline or prior to the start of the study and whether cut‐offs for autism spectrum disorder were met on the assessment tools. Disagreements will be resolved by discussion. If the disagreement cannot be resolved a third review author, who was not one of the two initial assessors, will act as an arbiter.

Data extraction and management

Using the spreadsheet in Appendix 2, two review authors (AB, NA‐U, SW or KW) will independently extract data on: participant characteristics, study characteristics, study population type and size, follow‐up period, and diagnostic criteria or diagnostic tools used (or both), diagnosis, study attrition, study outcome and change in diagnosis. They will also collect information on the version of diagnostic tool or classification system used in each study and note whether a different version of a tool/classification system was used at baseline and follow‐up for each study; differences in versions of tools/classification systems could impact study findings. They will also extract clinical information needed for subgroup analyses (autism spectrum disorder diagnostic groups, IQ, age of inception cohort (i.e. mean age of participants when they entered the study)), as well as data on duration of follow‐up as a possible study factor that influences the proportion of individuals who remain diagnosed with autism spectrum disorder at follow‐up. Disagreements will be resolved by discussion. If the disagreement cannot be resolved a third review author, who was not one of the two initial assessors, will act as an arbiter. The types of data likely to be reported in studies include numbers, percentages and/or survival curves.

Assessment of risk of bias in included studies

Two review authors will independently assess the risk of bias in each report by examining study participation, study attrition and outcome measurement. They will code studies as at high, medium, low or unclear risk of bias for each of these features. We have modified this approach from current literature that addresses the assessment of quality in prognostic systematic reviews (Hayden 2006; Hayden 2013). Details regarding the coding of risk of bias are provided in Appendix 3. Each of the three main domains in the 'Risk of bias' assessment (study participation, study attrition and outcome measurement) includes items that are rated as at low, medium or high risk of bias. We will combine these ratings to provide an overall 'Risk of bias' rating for each domain and one overall 'Risk of bias' rating for the study. We will group studies at medium and low risk of bias for sensitivity analysis. As we are not assessing prognostic factors (predictors of outcome) in this review, we will not conduct an analysis of confounders.

If the information required to make an assessment of risk of bias is not available, or it is not possible to extract information about other variables required for sensitivity analyses, we will email the authors of studies published after 2010 to ask for further information, as done in a previous review (see Woolfenden 2012). If the study authors are unwilling or unable to give us additional information, we will document that we have attempted to contact the study authors and mark the risk of bias as unclear. If the minimum necessary information required for inclusion is not available, we will exclude the study from the relevant analyses.

We will use 'Risk of bias' assessments to assess the quality of included studies and for sensitivity analyses.

Measures of association

We will not extract measures of association as we are not analysing prognostic factors in this review.

Unit of analysis issues

We will collect and analyse study level data from studies included in this review. Some studies may use relevant characteristics as eligibility criteria and, as such, report them at the study level; for example, intelligence, age of participants and duration of follow‐up. We will extract the number of individuals with and without a diagnosis of autism spectrum disorder at the time of follow‐up and calculate the percentage, if not presented in the paper. We will also extract these data for subgroup analysis, if presented in this way. The only likely data manipulation we will carry out will be if continuous scores rather than dichotomised categories are presented for diagnostic groups, as described above. If studies have reported data for subgroups (e.g. autistic or autism spectrum disorder; male or female) we will calculate a composite mean score, if this is meaningful. We will do this by conducting a fixed‐effect meta‐analysis of within‐study groups, following the methods described by Borenstein 2009. Individual participant data meta‐analyses are outside the scope of this review.
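The calculation of a study‐level proportion, and the combination of within‐study subgroups by fixed‐effect meta‐analysis, might look like the sketch below. It assumes inverse‐variance weighting on the raw proportion scale with the binomial variance p(1 − p)/n; the protocol does not state the scale on which subgroups will be combined, so this choice, and the function names, are illustrative only.

```python
from typing import List, Tuple


def proportion_with_variance(retained: int, total: int) -> Tuple[float, float]:
    """Proportion retaining a diagnosis and its binomial variance p(1 - p)/n."""
    if total <= 0 or not 0 <= retained <= total:
        raise ValueError("counts must satisfy 0 <= retained <= total and total > 0")
    p = retained / total
    return p, p * (1 - p) / total


def combine_subgroups_fixed(groups: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Fixed-effect (inverse-variance) composite of within-study subgroups.

    `groups` holds (retained, total) counts for each subgroup of one study.
    Returns the composite proportion and its variance.
    """
    sum_w = 0.0
    sum_wp = 0.0
    for retained, total in groups:
        p, var = proportion_with_variance(retained, total)
        if var == 0:
            # A proportion of 0 or 1 has zero estimated variance, so the
            # raw-scale inverse-variance weight is undefined.
            raise ValueError("subgroup proportion of 0 or 1 cannot be weighted")
        weight = 1 / var
        sum_w += weight
        sum_wp += weight * p
    return sum_wp / sum_w, 1 / sum_w
```

For a hypothetical study reporting 18 of 20 participants in one subgroup and 24 of 40 in another, `combine_subgroups_fixed([(18, 20), (24, 40)])` gives a composite proportion of about 0.77.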

Dealing with missing data

We will include studies that follow up individuals with autism spectrum disorder for one or more years after entry, and report the proportion who still have the same diagnosis at follow‐up, even if there are missing data. We will contact study authors to obtain further information on missing data, when necessary. We will describe the missing data in the 'Risk of bias' tables and consider the extent to which these may have impacted the results of our review. We will assess the sensitivity of any primary analyses to missing data using the strategy described in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2011). If authors are unwilling or unable to provide additional information on missing data we will analyse only available data. We will not impute missing data as children with autism spectrum disorder are very heterogeneous. We will document missing data and discuss the possible impact of missing data on each study, in terms of risk of bias, and on the overall review, in terms of quality of evidence.

Assessment of heterogeneity

We will assess clinical heterogeneity by comparing important participant factors at a study level, and methodological heterogeneity by comparing the risk of bias of studies, taking into account study participation, participant attrition and outcome measurement factors across the studies (see Appendix 3). We will assess statistical heterogeneity by inspecting forest plots, and will use the I² and Tau² statistics to estimate the total variation across studies due to heterogeneity. If we find high levels of heterogeneity (I² > 50%) for the primary outcome, we will explore possible sources of heterogeneity using the subgroup and sensitivity analyses described below, as required by our secondary objectives.
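For readers unfamiliar with these statistics, the sketch below shows one common way of computing them from study estimates and their variances: Cochran's Q, I² as the percentage of variation beyond chance, and the DerSimonian‐Laird moment estimator of Tau². The choice of the DerSimonian‐Laird estimator is an assumption for illustration, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def heterogeneity(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (Q, I2 as a percentage, Tau2) for a set of study estimates."""
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching variances")
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect pooled estimate, used as the reference point for Q.
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird moment estimator of the between-study variance.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2
```

With three hypothetical studies, `heterogeneity([0.9, 0.6, 0.75], [0.0045, 0.006, 0.005])` yields an I² above the 50% threshold, which under the plan above would trigger the subgroup and sensitivity analyses.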

Reporting bias

If we are able to pool 10 or more studies, we will examine publication bias and other small study effects, using a funnel plot in Review Manager (RevMan), version 5 (Review Manager 2014). We will check for asymmetry at a 10% level. We will attempt to obtain the results of unpublished studies by contacting study authors. Where this is not possible, and the missing studies are thought to introduce significant bias, we will explore the impact of including such studies in the overall assessment of results using sensitivity analyses.

Data synthesis

Since there is a high likelihood of heterogeneity in this population, we plan to pool the available data using a random‐effects model. If the studies are found to be more homogeneous than expected, we will also analyse the data using a fixed‐effect model. We will use Review Manager 2014 to pool data, perform statistical analyses and generate forest plots, if it is appropriate to combine results.

We will conduct meta‐analyses if data are available from three or more sufficiently homogeneous studies. We will use a random‐effects, generic inverse variance meta‐analysis model in Review Manager 2014. We will summarise the meta‐analysis using the pooled estimate, its 95% confidence interval, and the estimate of between‐study variance using Tau².
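A random‐effects, generic inverse variance analysis of proportions could be sketched as below. Several details are assumptions made for illustration and are not specified in the protocol: proportions are pooled on the logit scale, a 0.5 continuity correction is applied when a count is 0 or equals the sample size, Tau² is estimated with the DerSimonian‐Laird method, and the 95% confidence interval uses a normal approximation. The function names are hypothetical, and Review Manager may differ in detail.

```python
import math
from typing import List, Tuple


def logit_and_se(retained: float, total: float) -> Tuple[float, float]:
    """Logit of the proportion and its standard error."""
    if retained == 0 or retained == total:
        # Assumed continuity correction for proportions of 0 or 1.
        retained, total = retained + 0.5, total + 1
    p = retained / total
    se = math.sqrt(1 / retained + 1 / (total - retained))
    return math.log(p / (1 - p)), se


def random_effects_pool(
    studies: List[Tuple[int, int]]
) -> Tuple[float, float, float, float]:
    """Pool (retained, total) counts; return (proportion, lower, upper, tau2)."""
    if len(studies) < 3:
        raise ValueError("meta-analysis planned only for three or more studies")
    ys, ses = zip(*(logit_and_se(r, n) for r, n in studies))
    w = [1 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(ys) - 1)) / c)
    # Random-effects weights add the between-study variance to each study.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))

    def inv_logit(x: float) -> float:
        return 1 / (1 + math.exp(-x))

    lower = inv_logit(pooled - 1.96 * se_pooled)
    upper = inv_logit(pooled + 1.96 * se_pooled)
    return inv_logit(pooled), lower, upper, tau2
```

Calling `random_effects_pool([(18, 20), (24, 40), (30, 40)])` on three hypothetical studies returns the pooled proportion, the bounds of its 95% confidence interval and Tau², which are the quantities the summary above refers to.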

If it is not appropriate to combine results using meta‐analysis (in the case of heterogeneity or a small number of studies), we will provide a narrative description of the results.

We will use an approach modified from the GRADE framework (Iorio 2015). We will judge and report the overall quality of evidence for all our outcomes using this modified GRADE approach. We will rate the overall strength of evidence considering risk of bias, inconsistency, indirectness, imprecision, publication bias, effect size and dose‐response gradient (see Appendix 4). We will rank evidence as high, moderate, low or very low (see Appendix 5). We will collect information in a 'GRADE Evidence Profile' table (Schünemann 2013; see Appendix 6).

Subgroup and sensitivity analysis

Subgroup analysis

There are a number of potential sources of heterogeneity in studies of individuals with autism spectrum disorder, such as the use of different tools or classification systems to diagnose autism spectrum disorder and different types of participants (e.g. with or without language delay or intellectual disability; level of adaptive behaviour). We will assess the primary outcome only in subgroup analyses. We will examine differences between participant subgroups by visual inspections of confidence intervals for:

  1. age at baseline: < 2 years; 2 to 3 years; 4 to 6 years; 7 to 12 years; 13 to 17 years;

  2. age at follow‐up: 2 to 3 years; 4 to 6 years; 7 to 12 years; 13 to 18 years;

  3. duration of follow‐up: short‐term (up to 2 years), medium‐term (2 to 5 years), and long‐term (6 to 17 years) follow‐up;

  4. decade of publication: 1960 to 1969; 1970 to 1979; 1980 to 1989; 1990 to 1999; 2000 to 2009; 2010 to 2019;

  5. intelligence: mean IQ ≤ 70; mean IQ > 70; or more than 70% of the cohort has IQ ≤ 70;

  6. studies that use the same version of the diagnostic tool or a different version of the diagnostic tool at baseline and follow‐up (e.g. ADOS (Lord 2000) and ADOS‐2 (Lord 2012));

  7. type of diagnostic approach at baseline: multidisciplinary or not multidisciplinary;

  8. adaptive behaviour: mean standard score ≤ 70; mean standard score > 70; or > 70% of the cohort has mean standard score ≤ 70;

  9. language: > 70% verbal; > 70% non‐verbal (i.e. use < 15 words); mean standardised language score < 70; mean standardised language score ≥ 70; or > 70% of the cohort has mean language score < 70; and

  10. diagnostic subgroups: autistic disorder or childhood autism compared with other autism spectrum disorders, including Asperger's syndrome/disorder; atypical autism; and PDD‐NOS.

We will accept non‐overlapping confidence intervals to indicate a statistically significant difference between subgroups. We will conduct analyses using Review Manager 2014.
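The non‐overlap rule can be expressed as a simple check on two confidence intervals, as in the sketch below. The function names are hypothetical, and intervals that merely touch at a bound are treated here as overlapping, which is an assumption. Non‐overlap is a conservative criterion: two subgroups can differ significantly in a formal test even when their intervals overlap.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if two confidence intervals share at least one point."""
    (lower_a, upper_a), (lower_b, upper_b) = a, b
    return lower_a <= upper_b and lower_b <= upper_a


def subgroups_differ(a: Interval, b: Interval) -> bool:
    """Apply the rule: non-overlapping intervals indicate a difference."""
    return not intervals_overlap(a, b)
```

For example, hypothetical subgroup intervals of (0.55, 0.70) and (0.75, 0.90) would be taken to indicate a difference, whereas (0.55, 0.78) and (0.75, 0.90) would not.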

Sensitivity analysis

We will use sensitivity analyses to assess the impact of our decisions made during the review (e.g. inclusion of studies in the review and risk of bias of studies, taking into account recruitment, blinding and outcome measurement factors). This will be achieved by repeating the analyses using an alternative method or assumption, in order to explore the influence of our 'Risk of bias' assessments; for example, by the exclusion of lower‐quality studies (those at high or unclear risk of bias due to study participation, participant attrition or outcome measurement).