Cochrane Database of Systematic Reviews Protocol - Intervention

Ginkgo biloba versus placebo for schizophrenia

Abstract

This is a protocol for a Cochrane Review (Intervention). The objectives are as follows:

To investigate the effects of Ginkgo biloba (or its extracts) for people with schizophrenia or related disorders compared with placebo for outcomes of clinical importance.

Background

Description of the condition

Schizophrenia is a severe mental disorder, characterised by profound disruptions in thinking, affecting language, perception, and the sense of self. It often includes psychotic experiences, such as hearing voices or delusions. Schizophrenia affects more than 21 million people worldwide (WHO 2016), and causes approximately 1.1% of worldwide disability‐adjusted life years (DALYs) (Picchioni 2007). In China, there is a lack of overall schizophrenia epidemiological data, but one local area study, conducted in 1994, showed that lifetime prevalence of schizophrenia in the Chinese population, aged 15 years and over, ranged from 0.426% (rural) to 0.711% (urban) (Si 2015). Schizophrenia typically begins in late adolescence or early adulthood, and can be caused by multiple factors, such as genetic predisposition, perinatal and early childhood stress and environmental stressors (Van Os 2009). The clinical diagnosis of schizophrenia is typically based on a thorough clinical interview by a clinician according to standard diagnostic criteria such as the Diagnostic and Statistical Manual of Mental Disorders – Fifth Edition (DSM‐5) (APA 2013) and the International Classification of Diseases, 10th Edition (ICD‐10) (WHO 1991). These criteria include self‐reported experiences and abnormalities in behaviour, followed by a clinical assessment to exclude other associated medical conditions and psychiatric disorders. Schizophrenia is a treatable disorder (WHO 2016). At present, treatment mainly consists of antipsychotic drugs combined with psychological therapies, social support, and rehabilitation (Owen 2016). Other approaches can also be used, such as Traditional Chinese Medicine (TCM), a system of medicine that originated in China and encompasses characteristics of traditional Chinese philosophy and culture. For thousands of years TCM has been used to treat a range of mental disorders, with a reputation for not causing the adverse effects associated with Western antipsychotics.
TCM has no exact equivalent of the illness that Western cultures consider to be schizophrenia. However, ancient Chinese doctors did use several herbal medicines to treat schizophrenia‐like disorders. Some of these herbs, such as Ginkgo biloba, are now applied clinically for schizophrenia.

See also online resources: Wikipedia: Schizophrenia; WHO: Schizophrenia and Wikipedia: TCM.

Description of the intervention

Ginkgo biloba, known as 银杏 (yínxìng) in China, is an ancient tree and the only surviving species of tree in the division Ginkgophyta. Ginkgo is an endemic Chinese plant now widely planted throughout the world (Flora of China). The leaf and fruit of G biloba have been used as medicine and food for a long time in China; they can be used alone as herbal tea, or mixed with other herbs (JNMC 1986). Drugs made with extract of G biloba (EGb), especially one standardised product, EGb 761, are now used in clinical practice around the world (Clostre 1999).

EGb 761, originated by Dr Willmar Schwabe Pharmaceuticals (schwabepharma.com/) (Anonymous 2003; Jaggy 1997) in the 1990s, is a well‐defined mixture of active compounds extracted from G biloba leaves according to a standardised procedure. It contains approximately 24% flavone glycosides (primarily quercetin, kaempferol and isorhamnetin) and 6% terpene lactones (2.8% to 3.4% ginkgolides A, B and C, and 2.6% to 3.2% bilobalide). Ginkgolide B and bilobalide account for about 0.8% and 3% of the total extract, respectively. Other constituents include proanthocyanidins, glucose, rhamnose, organic acids, D‐glucaric and ginkgolic acids (Anonymous 2003). EGb 761 is applied in treating a wide range of disorders, such as hypertension, coronary heart disease, hyperlipidaemia, diabetes and asthma, and as an adjuvant therapy in cancer (Clostre 1999). Many laboratory and trial reports have focused on the application of EGb in psychiatric medicine (Defeudis 1991). EGb is often used as an adjunct to antipsychotics to treat schizophrenia. It is believed that EGb may enhance the curative effects and reduce the side effects of psychoactive drugs (Chen 2004). Most studies where EGb is used as an intervention for schizophrenia are reported in China, but some studies have been carried out in other countries, such as the USA and Turkey.

See also online resources: Wikipedia: Ginkgo biloba.

How the intervention might work

Owing to its main constituents, EGb shows some typical pharmacological effects of flavonoids and terpene lactones, such as anti‐oxidation and free radical scavenging (Niu 2006), anti‐platelet aggregation and anti‐thrombotic actions (Gong 2006), inhibition of apoptosis (Hong 2007; Yao 2001), and acceleration of nerve impulse transmission and facilitation of synaptic transmission (Chen 2002). G biloba is believed to contribute to the prevention and treatment of brain disorders, such as dementia, memory loss, stress and anxiety; the possible mechanism is a combination of increased cerebral blood flow, improved energy metabolism and clearance of free radicals (Abd‐Elhadya 2013; Liu 2014). Dysfunction of dopaminergic neurotransmission contributes to the genesis of schizophrenia‐like symptoms (Owen 2016; Howes 2016). EGb was also observed to have some beneficial effects on dopaminergic functions at the prefrontal cortex (Beck 2016), the striatum and limbic system (Wu 1995), the nucleus accumbens (Yeh 2015), the paraventricular nucleus and the mesolimbic system (Yeh 2011), and the nigrostriatal pathway (Rojas 2008). These could be the central pharmacological mechanisms of EGb for treating schizophrenia.

Why it is important to do this review

Ginkgo biloba is a classical TCM herb for schizophrenia‐like disorders (Shen 1997). Extract of G biloba, especially EGb 761, is used widely in modern clinical practice (Clostre 1999). We are aware of randomised trials (more than 50 results in a preliminary search) in this area and several published systematic reviews, such as Singh 2010 (which included six randomised controlled trials (RCTs)), Brondino 2013 (which included three RCTs), Chen 2015 (which included eight RCTs) and Zheng 2016 (which included three RCTs involving schizophrenia patients with tardive dyskinesia). This is an important area for which there should be a maintained review that can be updated in the light of new emerging evidence.

Objectives

To investigate the effects of Ginkgo biloba (or its extracts) for people with schizophrenia or related disorders compared with placebo for outcomes of clinical importance.

Methods

Criteria for considering studies for this review

Types of studies

We will include all relevant randomised controlled trials (RCTs). We will enter trials where randomisation is implied in a sensitivity analysis (see Sensitivity analysis). We will exclude quasi‐randomised studies, such as those that allocate intervention by alternate days of the week. Where people are given additional treatments as well as G biloba, we will only include data if the adjunct treatment is evenly distributed between groups and it is only the G biloba versus placebo that is randomised.

Types of participants

Adults, however defined, with schizophrenia or related disorders, including schizophreniform disorder, schizoaffective disorder and delusional disorder, by any means of diagnosis.

We are interested in making sure that information is as relevant as possible to the current care of people with schizophrenia, so aim to highlight the current clinical state clearly (acute, early post‐acute, partial remission, remission), as well as the stage (prodromal, first episode, early illness, persistent), and whether the studies primarily focused on people with particular problems (for example, negative symptoms, treatment‐resistant illnesses).

Types of interventions

1. G biloba alone

G biloba leaves or fruits, or both, used as raw herb (only including trials that employed G biloba alone), extract of G biloba, products of EGb 761, in any dosage form (tablet, capsule, injection), dose or mode of administration (oral or by injection).

For this comparison we will compare this to placebo or no treatment.

2. G biloba added to standard care

G biloba leaves or fruits, or both, used as raw herb (only including trials that employed G biloba alone), extract of G biloba, products of EGb 761, in any dosage form (tablet, capsule, injection), dose or mode of administration (oral or by injection).

Standard care is care given at a stated normal and uniform dose of antipsychotic drugs or other therapeutic approaches, or both.

For this comparison we will compare this to:

a. Placebo or no treatment added to standard care

Both the experimental and control groups should receive the same baseline treatment(s).

b. Placebo or no treatment added to non‐standard care

In rare trials, treatments might be applied differently to each group depending on whether they are allocated to the experimental group or not. For example those allocated to G biloba could also be allocated to receive a half dose of an antipsychotic compared with the placebo group, who would receive a normal dose of their drugs.

Types of outcome measures

We aim to divide all outcomes into short term (less than three months), medium term (three to 12 months) and long term (over one year).

Primary outcomes
1. Global state

1.1 Clinically important change, as defined by each study (in the short term)

2. Mental state

2.1 Clinically important change, as defined by each study (in the short term)

3. Adverse effects/events

3.1 Any serious adverse event/effect (in the short term)

Secondary outcomes
1. Global state

1.1 Clinically important change, as defined by each study (in the medium or long term)
1.2 Any improvement in global state
1.3 Average score/change in global state
1.4 Relapse

2. Mental state

2.1 Clinically important change, as defined by each study (in the medium or long term)
2.2 Any improvement in mental state
2.3 Average score/change in mental state

3. Adverse effects/events

3.1 Death
3.2 Cardiovascular effects
3.3 Genitourinary effects
3.4 Gastrointestinal effects
3.5 Respiratory effects
3.6 Extrapyramidal side effects
3.7 Metabolic
3.8 Any abnormal laboratory tests
3.9 Any other specific adverse effects
3.10 Any serious adverse effect/event (in the medium or long term)
3.11 Average endpoint/change adverse effects/event scale

4. Behaviour

4.1 Any clinically important change, as defined by each study
4.2 Average score/change in behaviour
4.3 Aggression/violence

5. Social functioning

5.1 Clinically important change, as defined by each study
5.2 Any improvement, as defined by each study
5.3 Average score/change in social functioning

6. Quality of life/satisfaction with care for either recipients of care or caregivers

6.1 Clinically important change in quality of life/satisfaction, as defined by each study
6.2 Average score/change in quality of life/satisfaction
6.3 Any change in employment status, as defined by each study

7. Acceptance of treatment

7.1 Accepting treatment
7.2 Average endpoint acceptance score
7.3 Average change in acceptance score

8. Service utilisation outcomes

8.1 Hospital admission
8.2 Days in hospital

9. Economic outcomes

9.1 Costs due to treatment, as defined by each study
9.2 Savings due to treatment, as defined by each study

'Summary of findings' table

We will use the GRADE approach to interpret findings (Schünemann 2011a); and will use the GRADE profiler Guideline Development Tool (GRADEpro GDT 2015) to import data from Review Manager 5 (RevMan) (RevMan 2014) to create 'Summary of findings' tables (Schünemann 2011b). These tables provide outcome‐specific information concerning the overall quality of evidence from each included study in the comparison, the magnitude of effect of the interventions examined, and the sum of available data on all outcomes we rate as important to patient care and decision‐making. We aim to select the following main outcomes for inclusion in the 'Summary of findings' table.

  1. Global state: clinically important change, as defined by each study (in the short term)

  2. Mental state: clinically important change, as defined by each study (in the short term)

  3. Adverse effects: any serious adverse event/effect (in the short term)

  4. Adverse effects: extrapyramidal side effects (in the medium term)

  5. Quality of life/satisfaction with care for either recipients of care or caregivers: clinically important change in quality of life/satisfaction, as defined by each study (in the medium term)

  6. Economic outcomes: costs due to treatment, as defined by each study (in the long term)

Search methods for identification of studies

Electronic searches

Cochrane Schizophrenia’s study‐based register of trials

The information specialist will search the register using the following search strategy:

*Ginkgo biloba* in intervention field of study

In such study‐based registers, searching the major concept retrieves all the synonyms and relevant studies because all the studies have already been organised based on their interventions and linked to the relevant topics.

This register is compiled by systematic searches of major resources (including MEDLINE, Embase, CINAHL, AMED, BIOSIS, PsycINFO, PubMed, and registries of clinical trials) and their monthly updates, handsearches, grey literature, and conference proceedings (see Group’s Module). There are no language, date, document type, or publication status limitations for inclusion of records in the register.

Searching other resources

1. Reference searching

We will inspect references of all included studies for further relevant studies.

2. Personal contact

We will contact the first author of each included study for information regarding unpublished trials. We will note the outcome of this contact in the 'Characteristics of included studies' or 'Characteristics of studies awaiting classification' tables.

Data collection and analysis

Selection of studies

One review author, JX, will inspect citations from the searches and identify relevant abstracts; a second review author, HD, will independently re‐inspect a random 20% sample of these abstracts to ensure reliability of selection. Where disputes arise, we will acquire the full report for more detailed scrutiny. JX will then obtain and inspect full reports of the abstracts or reports meeting the review criteria. HD will re‐inspect a random 20% of these full reports in order to ensure reliability of selection. Where it is not possible to resolve disagreement by discussion, we will attempt to contact the authors of the study concerned for clarification.

Data extraction and management

1. Extraction

Review authors, JX and WFY, will extract data from all included studies. In addition, to ensure reliability, HD will independently extract data from a random sample of these studies, comprising 10% of the total. We will attempt to extract data presented only in graphs and figures whenever possible, but will include such data only if two reviewers independently obtain the same result. If studies are multi‐centre, then where possible we will extract data relevant to each. We will discuss any disagreement and document our decisions. If necessary, we will attempt to contact study authors through an open‐ended request in order to obtain missing information or for clarification. Clive E Adams from The University of Nottingham will help clarify issues regarding any remaining problems and we will document these final decisions.

2. Management
2.1 Forms

We will extract data onto standard, pre‐designed, simple forms.

2.2 Scale‐derived data

We will include continuous data from rating scales only if:

a. the psychometric properties of the measuring instrument have been described in a peer‐reviewed journal (Marshall 2000);
b. the measuring instrument has not been written or modified by one of the trialists for that particular trial; and
c. the instrument is a global assessment of an area of functioning and not sub‐scores which are not, in themselves, validated or shown to be reliable. There are exceptions, however: we will include sub‐scores from mental state scales measuring positive and negative symptoms of schizophrenia.

Ideally the measuring instrument should either be a self‐report or completed by an independent rater or relative (not the therapist). We realise that this is not often reported clearly; in 'Description of studies' we will note if this is the case or not.

2.3 Endpoint versus change data

There are advantages of both endpoint and change data: change data can remove a component of between‐person variability from the analysis; however, calculation of change needs two assessments (baseline and endpoint) that can be difficult to obtain in unstable and difficult‐to‐measure conditions such as schizophrenia. We have decided primarily to use endpoint data, and only use change data if the former are not available. If necessary, we will combine endpoint and change data in the analysis, as we prefer to use mean differences (MDs) rather than standardised mean differences (SMDs) throughout (Deeks 2011).

2.4 Skewed data

Continuous data on clinical and social outcomes are often not normally distributed. To avoid the pitfall of applying parametric tests to non‐parametric data, we will apply the following standards to relevant continuous data before inclusion.

For endpoint data from studies including fewer than 200 participants:

a. when a scale starts from the finite number zero, we will subtract the lowest possible value from the mean, and divide this by the standard deviation. If this value is lower than one, it strongly suggests that the data are skewed and we will exclude these data. If this ratio is higher than one but less than two, there is a suggestion that the data are skewed: we will enter these data in the analysis and test whether their inclusion or exclusion would change the results substantially. Finally, if the ratio is larger than two we will include these data, because it is less likely that they are skewed (Altman 1996; Deeks 2011).

b. if a scale starts from a positive value (such as the Positive and Negative Syndrome Scale (PANSS), which can have values from 30 to 210 (Kay 1986)), we will modify the calculation described above to take the scale starting point into account. In these cases skewed data are present if 2 SD > (S − S min), where S is the mean score and 'S min' is the minimum score.

Please note: we will enter all relevant data from studies of more than 200 participants in the analysis irrespective of the above rules, because skewed data pose less of a problem in large studies. We will also enter all relevant change data, as when continuous data are presented on a scale that includes a possibility of negative values (such as change data), it is difficult to tell whether or not data are skewed.
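The skew rules above can be read as a small decision procedure. The following is a minimal sketch of one such reading, not the review's actual implementation. The function name, the returned labels, and the handling of boundary cases (a ratio of exactly one or two, or a study of exactly 200 participants) are assumptions, because the protocol does not specify them.

```python
def skew_decision(mean, sd, n_participants, scale_min=0.0, is_change_data=False):
    """Return how a continuous result would be handled under the skew rules.

    scale_min is the lowest possible score on the scale (0 for many scales,
    30 for the PANSS total score). Labels and boundary handling are
    illustrative assumptions, not taken from the protocol.
    """
    # Change data and studies of more than 200 participants are entered
    # irrespective of the ratio test.
    if is_change_data or n_participants > 200:
        return "include"
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    # Distance of the mean from the lowest possible score, in SD units.
    ratio = (mean - scale_min) / sd
    if ratio < 1:
        return "exclude"  # strongly suggests skew
    if ratio < 2:
        return "include_and_test_sensitivity"  # suggestion of skew
    return "include"  # skew less likely
```

For example, a PANSS total mean of 60 with an SD of 20 gives (60 − 30) / 20 = 1.5, so the data would be entered and their influence checked.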

2.5 Common measurement

To facilitate comparison between trials we aim, where relevant, to convert variables that can be reported in different metrics, such as days in hospital (mean days per year, per week or per month) to a common metric (e.g. mean days per month).
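As an illustration of such a conversion, the sketch below rescales mean days in hospital to mean days per month. The function name, the unit labels, and the average month length of 365.25 / 12 days are assumptions made for the example. The protocol does not state which conversion factors will be used.

```python
# Assumed average month length in days; the protocol does not specify one.
DAYS_PER_MONTH = 365.25 / 12


def to_days_per_month(value, unit):
    """Convert mean days in hospital reported per week/month/year to per month."""
    factors = {
        "per_month": 1.0,
        "per_year": 1.0 / 12.0,           # 12 months in a year
        "per_week": DAYS_PER_MONTH / 7.0,  # average number of weeks in a month
    }
    if unit not in factors:
        raise ValueError(f"unknown unit: {unit}")
    return value * factors[unit]
```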

2.6 Conversion of continuous to binary

Where possible, we will make efforts to convert outcome measures to dichotomous data. This can be done by identifying cut‐off points on rating scales and dividing participants accordingly into 'clinically improved' or 'not clinically improved'. It is generally assumed that if there is a 50% reduction in a scale‐derived score such as the Brief Psychiatric Rating Scale (BPRS) (Overall 1962), or the PANSS (Kay 1986), this could be considered as a clinically significant response (Leucht 2005a; Leucht 2005b). If data based on these thresholds are not available, we will use the primary cut‐off presented by the study authors.
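A minimal sketch of the 50% reduction cut‐off follows. The function name and parameters are hypothetical. Two choices are assumptions rather than protocol text: a reduction of exactly 50% counts as a response, and the optional `scale_min` argument lets the minimum possible score be subtracted before the percentage is calculated.

```python
def is_clinically_improved(baseline, endpoint, threshold=0.5, scale_min=0.0):
    """Classify one participant as 'clinically improved' or not.

    The proportional reduction is measured from baseline; scale_min is an
    optional adjustment for scales whose lowest score is not zero.
    """
    room_to_improve = baseline - scale_min
    if room_to_improve <= 0:
        raise ValueError("baseline must be above the scale minimum")
    reduction = (baseline - endpoint) / room_to_improve
    return reduction >= threshold
```

With the defaults, a fall from 80 to 40 is a 50% reduction and counts as improved, while a fall from 80 to 50 (37.5%) does not.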

2.7 Direction of graphs

Where possible, we will enter data in such a way that the area to the left of the line of no effect indicates a favourable outcome for G biloba. Where keeping to this makes it impossible to avoid outcome titles with clumsy double‐negatives (e.g. 'not un‐improved') we will report data where the left of the line indicates an unfavourable outcome and note this in the relevant graphs.

Assessment of risk of bias in included studies

Review authors, JX and WFY, will work independently to assess risk of bias using the criteria described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a). This set of criteria is based on evidence of associations between potential overestimation of effect and the level of risk of bias of the article that may be due to aspects of sequence generation, allocation concealment, blinding, incomplete outcome data and selective reporting, or the way in which these 'domains' are reported.

If the raters disagree, we will make the final rating by consensus, with the involvement of another member of the review author group. Where inadequate details of randomisation and other characteristics of trials are provided, we will attempt to contact authors of the studies in order to obtain further information. We will report non‐concurrence in quality assessment, but if disputes arise regarding the category to which a trial is to be allocated, we will resolve this by discussion.

We will note the level of risk of bias in the text of the review, Figure 1, Figure 2, and the 'Summary of findings' table(s).

Measures of treatment effect

1. Binary data

For binary outcomes we will calculate a standard estimation of the risk ratio (RR) and its 95% confidence interval (CI), as it has been shown that RR is more intuitive than odds ratios (Boissel 1999); and that odds ratios tend to be interpreted as RR by clinicians (Deeks 2000). Although the number needed to treat for an additional beneficial outcome (NNTB) and the number needed to treat for an additional harmful outcome (NNTH), with their CIs, are intuitively attractive to clinicians, they are problematic to calculate and interpret in meta‐analyses (Hutton 2009). For binary data presented in the 'Summary of findings' table(s) we will, where possible, calculate illustrative comparative risks.
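For a single two‐arm trial, the risk ratio and its 95% CI can be sketched with the usual large‐sample formula on the log scale. This is a minimal illustration and not the RevMan implementation. The function name is hypothetical, and trials with zero events in an arm would need a continuity correction, which is not handled here.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a Wald-type CI computed on the log scale.

    z=1.96 corresponds to a 95% confidence interval.
    """
    if events_trt <= 0 or events_ctl <= 0:
        raise ValueError("zero-event arms need a continuity correction")
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of ln(RR).
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

For instance, 10 events among 50 participants versus 20 among 50 gives an RR of 0.5 with a 95% CI of roughly 0.26 to 0.96.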

2. Continuous data

For continuous outcomes we will estimate MD between groups. We prefer not to calculate effect size measures (SMD). However if scales of very considerable similarity are used, we will presume that there is a small difference in measurement, and we will calculate effect size and transform the effect back to the units of one or more of the specific instruments.

Unit of analysis issues

1. Cluster trials

Studies increasingly employ 'cluster randomisation' (such as randomisation by clinician or practice), but analysis and pooling of clustered data pose problems. Authors often fail to account for intra‐class correlation in clustered studies, leading to a unit‐of‐analysis error whereby P values are spuriously low, CIs unduly narrow and statistical significance overestimated (Divine 1992). This causes type I errors (Bland 1997; Gulliford 1999).

Where clustering has been incorporated into the analysis of primary studies, we will present these data as if from a non‐cluster randomised study, but adjust for the clustering effect.

Where clustering is not accounted for in primary studies, we will present data in a table, with a '*' symbol to indicate the presence of a probable unit of analysis error. We will seek to contact first authors of studies to obtain intra‐class correlation coefficients for their clustered data and to adjust for this by using accepted methods (Gulliford 1999).

We have sought statistical advice and have been advised that the binary data from cluster trials presented in a report should be divided by a 'design effect'. This is calculated using the mean number of participants per cluster (m) and the intra‐class correlation coefficient (ICC): thus design effect = 1 + (m − 1) * ICC (Donner 2002). If the ICC is not reported we will assume it to be 0.1 (Ukoumunne 1999).
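The design effect adjustment described above can be sketched as follows. The function names are hypothetical, and the default ICC of 0.1 is the fallback value stated in the text. Dividing both the event count and the sample size by the design effect leaves the observed risk unchanged while shrinking the effective sample size.

```python
def design_effect(mean_cluster_size, icc=0.1):
    """Design effect = 1 + (m - 1) * ICC, with ICC defaulting to 0.1."""
    return 1 + (mean_cluster_size - 1) * icc


def adjust_binary_for_clustering(events, total, mean_cluster_size, icc=0.1):
    """Divide events and sample size by the design effect.

    Returns the effective (possibly non-integer) events and sample size.
    """
    deff = design_effect(mean_cluster_size, icc)
    return events / deff, total / deff
```

With a mean of 21 participants per cluster and an ICC of 0.1, the design effect is 1 + 20 × 0.1 = 3, so 30 events among 120 participants would be entered as 10 among 40.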

If cluster studies have been appropriately analysed and intra‐class correlation coefficients and relevant data documented in the report taken into account, synthesis with other studies will be possible using the generic inverse variance technique.

2. Cross‐over trials

A major concern of cross‐over trials is the carry‐over effect. This occurs if an effect (e.g. pharmacological, physiological or psychological) of the treatment in the first phase is carried over to the second phase. As a consequence, participants can differ significantly from their initial state at entry to the second phase, despite a wash‐out phase. For the same reason cross‐over trials are not appropriate if the condition of interest is unstable (Elbourne 2002). As both carry‐over and unstable conditions are very likely in severe mental illness, we will only use data from the first phase of cross‐over studies.

3. Studies with multiple treatment groups

Where a study involves more than two treatment arms, if relevant, we will present the additional treatment arms in comparisons. If data are binary we will simply add these and combine within the two‐by‐two table. If data are continuous we will combine data following the formula in section 7.7.3.8 (Combining groups) of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011b). Where additional treatment arms are not relevant, we will not reproduce these data.
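The Handbook formula for combining two groups of continuous data can be sketched as below. The function name is hypothetical. The result is the mean and SD that would have been obtained had the two arms been analysed as a single group.

```python
import math


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Combine two groups into one (sample size, mean, SD)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Within-group variation plus a term for the difference between means.
    numerator = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (mean1 ** 2 + mean2 ** 2 - 2 * mean1 * mean2)
    )
    sd = math.sqrt(numerator / (n - 1))
    return n, mean, sd
```

As a check, combining the groups {1, 3} and {5, 7} (each with SD √2) returns the mean (4) and sample SD (about 2.58) of the pooled values {1, 3, 5, 7}.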

Dealing with missing data

1. Overall loss of credibility

At some degree of loss of follow‐up, data must lose credibility (Xia 2009). We choose that, for any particular outcome, should more than 50% of data be unaccounted for we will not reproduce these data or use them within analyses. If, however, more than 50% of those in one arm of a study are lost, but the total loss is less than 50%, we will address this within the 'Summary of findings' table(s) by down‐rating quality. Finally, we will also downgrade quality within the 'Summary of findings' table(s) should the loss be 25% to 50% in total.
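These thresholds can be expressed as a short rule, sketched here under stated assumptions. The function name and returned labels are hypothetical. A total loss of exactly 50% is treated as falling in the 25% to 50% band, which the text implies but does not state outright.

```python
def attrition_decision(lost_a, n_a, lost_b, n_b):
    """Apply the loss-to-follow-up thresholds to one outcome of one study."""
    total_loss = (lost_a + lost_b) / (n_a + n_b)
    worst_arm_loss = max(lost_a / n_a, lost_b / n_b)
    if total_loss > 0.5:
        return "exclude"
    # Downgrade if total loss is 25% to 50%, or if one arm alone lost more
    # than half of its participants.
    if total_loss >= 0.25 or worst_arm_loss > 0.5:
        return "include_and_downgrade"
    return "include"
```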

2. Binary

In the case where attrition for a binary outcome is between 0% and 50% and where these data are not clearly described, we will present data on a 'once‐randomised‐always‐analyse' basis (an intention‐to‐treat analysis (ITT)). Those leaving the study early are all assumed to have the same rates of negative outcome as those who completed, with the exception of the outcome of death and adverse effects. For these outcomes the rate of those who stay in the study ‐ in that particular arm of the trial ‐ will be used for those who did not. We will undertake a sensitivity analysis testing how prone the primary outcomes are to change when data only from people who completed the study to that point are compared to the intention‐to‐treat analysis using the above assumptions.

3. Continuous
3.1 Attrition

We will use data where attrition for a continuous outcome is between 0% and 50%, and data only from people who complete the study to that point are reported.

3.2 Standard deviations

If standard deviations (SDs) are not reported, we will try to obtain the missing values from the study authors. If these are not available, where there are missing measures of variance for continuous data, but an exact standard error (SE) and CIs available for group means, and either P value or t value available for differences in mean, we can calculate SDs according to the rules described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011b). When only the SE is reported, SDs are calculated by the formula SD = SE * √(n). Chapters 7.7.3 and 16.1.3 of the Cochrane Handbook for Systematic Reviews of Interventions present detailed formulae for estimating SDs from P, t or F values, CIs, ranges or other statistics (Higgins 2011b). If these formulae do not apply, we will calculate the SDs according to a validated imputation method which is based on the SDs of the other included studies (Furukawa 2006). Although some of these imputation strategies can introduce error, the alternative would be to exclude a given study’s outcome and thus to lose information. Nevertheless, we will examine the validity of the imputations in a sensitivity analysis that excludes imputed values.
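Two of the simpler conversions can be sketched as follows. The function names are hypothetical. The CI version assumes a 95% interval for a single group mean based on the normal distribution, which is only a reasonable approximation in larger samples. Small samples would need the t distribution instead.

```python
import math


def sd_from_se(se, n):
    """SD = SE * sqrt(n) for a single group mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover the SD from a 95% CI of a group mean (large-sample assumption).

    The interval spans 2 * z standard errors, so SE = (upper - lower) / (2 * z).
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)
```

For example, an SE of 2 in a group of 25 participants corresponds to an SD of 2 × 5 = 10.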

3.3 Assumptions about participants who left the trials early or were lost to follow‐up

Various methods are available to account for participants who left the trials early or were lost to follow‐up. Some trials just present the results of study completers; others use the method of last observation carried forward (LOCF); while more recently, methods such as multiple imputation or mixed‐effects models for repeated measurements (MMRM) have become more of a standard. While the latter methods seem to be somewhat better than LOCF (Leon 2006), we feel that the high percentage of participants leaving the studies early and differences between groups in their reasons for doing so are often the core problem in randomised schizophrenia trials. We will therefore not exclude studies based on the statistical approach used. However, by preference we will use the more sophisticated approaches, that is, we prefer MMRM or multiple imputation to LOCF, and we will only present completer analyses if no kind of ITT data are available. Moreover, we will address this issue in the item 'Incomplete outcome data' of the 'Risk of bias' tool.

Assessment of heterogeneity

1. Clinical heterogeneity

We will consider all included studies initially, without seeing comparison data, to judge clinical heterogeneity. We will simply inspect all studies for participants who are clearly outliers or situations that we had not predicted would arise and, where found, discuss such situations or participant groups.

2. Methodological heterogeneity

We will consider all included studies initially, without seeing comparison data, to judge methodological heterogeneity. We will simply inspect all studies for clearly outlying methods which we had not predicted would arise and discuss any such methodological outliers.

3. Statistical heterogeneity
3.1 Visual inspection

We will inspect graphs visually to investigate the possibility of statistical heterogeneity.

3.2 Employing the I² statistic

We will investigate heterogeneity between studies by considering the I² statistic alongside the Chi² P value. The I² statistic provides an estimate of the percentage of variability in effect estimates that is due to heterogeneity rather than chance (Higgins 2003). The importance of the observed value of I² depends on the magnitude and direction of effects as well as the strength of evidence for heterogeneity (e.g. P value from Chi² test, or a confidence interval for I²). We will interpret an I² estimate greater than or equal to 50% and accompanied by a statistically significant Chi² statistic as evidence of substantial heterogeneity (Section 9.5.2 Cochrane Handbook for Systematic Reviews of Interventions) (Deeks 2011). When substantial levels of heterogeneity are found in the primary outcome, we will explore reasons for heterogeneity (see Subgroup analysis and investigation of heterogeneity).
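The I² statistic is derived from the Chi² heterogeneity statistic (Cochran's Q) and its degrees of freedom. The sketch below uses hypothetical function names. The significance level of 0.10 for the Chi² test is an assumption, because the protocol does not state the level it will use, and the P value is taken as an input rather than computed.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """I² = 100% * (Q - df) / Q, truncated at zero, with df = k - 1."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q * 100.0)


def is_substantial(i2, chi2_p_value, alpha=0.10):
    """I² >= 50% together with a significant Chi² test (alpha is assumed)."""
    return i2 >= 50.0 and chi2_p_value < alpha
```

For two studies with effects 0 and 2 and unit variances, Q = 2 on one degree of freedom, giving I² = 50%.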

Assessment of reporting biases

Reporting biases arise when the dissemination of research findings is influenced by the nature and direction of results (Egger 1997). These are described in Section 10 of the Cochrane Handbook for Systematic Reviews of Interventions (Sterne 2011). We are aware that funnel plots may be useful in investigating reporting biases, but are of limited power to detect small‐study effects. We will not use funnel plots for outcomes where there are 10 or fewer studies, or where all studies are of similar size. In other cases, where funnel plots are possible, we will seek statistical advice in their interpretation.

Data synthesis

We understand that there is no settled argument for preferring either the fixed‐effect or the random‐effects model. The random‐effects method incorporates an assumption that the different studies are estimating different, yet related, intervention effects. This often seems true to us, and the random‐effects model takes into account differences between studies, even if there is no statistically significant heterogeneity. There is, however, a disadvantage to the random‐effects model: it puts added weight onto small studies, which are often the most biased ones. Depending on the direction of effect, these studies can either inflate or deflate the effect size. We choose to use a fixed‐effect model for all analyses.
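The shift in weight towards small studies can be seen in a short sketch. It compares each study's share of the total weight under a fixed‐effect inverse‐variance model with its share under a random‐effects model. The DerSimonian‐Laird estimator of the between‐study variance (tau²) is an assumption of this sketch, since the protocol does not name an estimator, and the three studies are hypothetical.

```python
def relative_weights(effects, variances):
    """Return (fixed_share, random_share, tau2) for inverse-variance pooling.

    tau2 uses the DerSimonian-Laird moment estimator (an assumption here).
    """
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    pooled = sum(wi * y for wi, y in zip(w, effects)) / sum_w
    q = sum(wi * (y - pooled) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to every study's variance,
    # which flattens the differences between large and small studies.
    w_random = [1.0 / (v + tau2) for v in variances]
    sum_w_random = sum(w_random)
    fixed_share = [wi / sum_w for wi in w]
    random_share = [wi / sum_w_random for wi in w_random]
    return fixed_share, random_share, tau2


# Hypothetical data: one large, one medium, and one small study.
effects = [0.1, 0.5, 0.9]
variances = [0.01, 0.04, 0.25]  # the small study has the largest variance
fixed_share, random_share, tau2 = relative_weights(effects, variances)
for name, f, r in zip(["large", "medium", "small"], fixed_share, random_share):
    print(f"{name:6s} fixed={f:.1%} random={r:.1%}")
```

Whenever tau² is greater than zero, the smallest study's share of the weight is larger under the random‐effects model than under the fixed‐effect model, which is the disadvantage described above.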

Subgroup analysis and investigation of heterogeneity

1. Subgroup analyses
1.1 Clinical state, stage or problem

We propose to undertake this review and provide an overview of the effects of G biloba for people with schizophrenia in general. No subgroup analyses are anticipated; however, we will report data on subgroups of people in the same clinical state, stage and with similar problems, should they be available.

2. Investigation of heterogeneity

We will report if inconsistency is high. Firstly, we will investigate whether data have been entered correctly. Secondly, if data are correct, we will inspect the graph visually and remove outlying studies successively to see if homogeneity is restored. For this review we have decided that, should homogeneity be restored by removing studies that together contribute no more than 10% of the total weighting of the summary finding, we will present the data. If not, we will not pool these data and will discuss any issues. We know of no supporting research for this 10% cut‐off but are investigating the use of prediction intervals as an alternative to this unsatisfactory state.
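A rough sketch of this procedure follows. Several choices in it are assumptions rather than part of the protocol: visual identification of outliers is replaced by removing the study with the largest contribution to Cochran's Q, "homogeneity restored" is taken to mean I² below 50%, and the data are hypothetical.

```python
def q_contributions(effects, variances):
    """Each study's contribution to Cochran's Q (inverse-variance weights)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return [w * (y - pooled) ** 2 for w, y in zip(weights, effects)]


def i_squared(effects, variances):
    """I2 as a percentage, truncated at zero."""
    q = sum(q_contributions(effects, variances))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0


def remove_outliers(effects, variances, i2_threshold=50.0, max_weight_share=0.10):
    """Remove outlying studies one at a time and apply the 10% weight rule.

    ASSUMPTIONS: the 'outlier' is the study contributing most to Q, and
    homogeneity counts as restored once I2 falls below i2_threshold.
    """
    total_weight = sum(1.0 / v for v in variances)
    kept = list(range(len(effects)))
    removed = []
    while len(kept) > 2:
        ys = [effects[i] for i in kept]
        vs = [variances[i] for i in kept]
        if i_squared(ys, vs) < i2_threshold:
            break
        contrib = q_contributions(ys, vs)
        worst = kept[contrib.index(max(contrib))]
        kept.remove(worst)
        removed.append(worst)
    ys = [effects[i] for i in kept]
    vs = [variances[i] for i in kept]
    restored = i_squared(ys, vs) < i2_threshold
    removed_share = sum(1.0 / variances[i] for i in removed) / total_weight
    return {
        "removed": removed,
        "removed_share": removed_share,
        "pool": restored and removed_share <= max_weight_share,
    }


# Hypothetical data: three consistent large studies and one small outlier.
light_outlier = remove_outliers([0.20, 0.25, 0.22, 1.50], [0.01, 0.01, 0.01, 0.25])
# Hypothetical data: the outlier carries a third of the total weight.
heavy_outlier = remove_outliers([0.20, 0.25, 1.50], [0.01, 0.01, 0.01])
print(light_outlier)
print(heavy_outlier)
```

In the first example the removed study carries under 2% of the total weight, so the remaining data would be presented. In the second, the removed study carries a third of the weight, so the data would not be pooled.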

When unanticipated clinical or methodological heterogeneity is obvious we will simply state hypotheses regarding these for future reviews or versions of this review. We do not anticipate undertaking analyses relating to these.

Sensitivity analysis

1. Implication of randomisation

We aim to include trials in a sensitivity analysis if they are described in some way that implies randomisation. For primary outcomes, if the inclusion of these trials does not result in a substantive difference, they will remain in the analyses. If their inclusion does result in statistically significant differences, we will not add the data from these lower‐quality studies to the results of the higher‐quality trials, but will present these data within a subcategory.

2. Assumptions for lost binary data

Where assumptions have to be made regarding people lost to follow‐up (see Dealing with missing data), we will compare the findings of the primary outcomes when we use our assumption compared with completer data only. If there is a substantial difference, we will report results and discuss them, but continue to employ our assumption.

Where assumptions have to be made regarding missing SD data (see Dealing with missing data), we will undertake a sensitivity analysis for the primary outcomes, comparing the findings obtained with the imputed data against those obtained with completer data only, to test how prone the results are to change. If there is a substantial difference, we will report results and discuss them, but continue to employ our assumption.

3. Risk of bias

We will analyse the effects of excluding trials that are judged to be at high risk of bias across one or more of the 'Risk of bias' domains (implied as randomised with no further details available, allocation concealment, blinding and outcome reporting) for the meta‐analysis of the primary outcome. If the exclusion of trials at high risk of bias does not alter the direction of effect or the precision of the effect estimates substantially, then we will include relevant data from these trials.

4. Imputed values

We will undertake a sensitivity analysis to assess the effects of including data from trials where we use imputed values for ICC in calculating the design effect in cluster‐randomised trials.
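To show why an imputed ICC matters, the sketch below uses the standard design‐effect formula, 1 + (m − 1) × ICC, where m is the mean cluster size, and divides the sample size by it to obtain an effective sample size. The trial size, cluster size, and range of ICC values are hypothetical and are not taken from this protocol.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_participants, mean_cluster_size, icc):
    """Sample size after dividing by the design effect."""
    return n_participants / design_effect(mean_cluster_size, icc)


# Hypothetical cluster trial: 210 participants in clusters of 21 on average.
# The ICC values below are illustrative, not values specified by the protocol.
for icc in (0.01, 0.05, 0.10):
    n_eff = effective_sample_size(210, 21, icc)
    print(f"ICC={icc:.2f} design effect={design_effect(21, icc):.2f} "
          f"effective n={n_eff:.1f}")
```

Because the effective sample size falls quickly as the assumed ICC rises, a trial's weight in the meta‐analysis can depend heavily on the imputed value, which is what this sensitivity analysis is meant to probe.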

If substantial differences are noted in the direction or precision of effect estimates in any of the sensitivity analyses listed above, we will not pool data from the excluded trials with the other trials contributing to the outcome, but will present them separately.

5. Fixed‐effect and random‐effects

We will synthesise data using a fixed‐effect model; however, we will also synthesise data for the primary outcome using a random‐effects model to evaluate whether this alters the significance of the results.