Pharmacological interventions for recurrent abdominal pain in childhood

Alice E Martin; Tamsin V Newlove‐Delgado; Rebecca A Abbott; Alison Bethel; Joanna Thompson‐Coon; Rebecca Whear; Stuart Logan

doi:10.1002/14651858.CD010973.pub2

Version published: 06 March 2017

https://doi.org/10.1002/14651858.CD010973.pub2

Abstract

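Background

Between 4% and 25% of school‐aged children at some stage experience recurrent abdominal pain (RAP) severe enough to interfere with their daily lives. When no clear organic cause is found, these children are usually managed with reassurance and simple measures, and a wide range of drugs has also been recommended for use.

Objectives

To determine the effectiveness of pharmacological interventions for RAP in children of school age.

Search methods

We searched the Cochrane Central Register of Controlled Trials (CENTRAL), Ovid MEDLINE, Embase, and eight other electronic databases up to June 2016. We also searched two trials registers and contacted researchers who had published studies in this field.

Selection criteria

Randomised controlled trials involving children aged five to 18 years with RAP or an abdominal pain‐related functional gastrointestinal disorder, as defined by the Rome III criteria (Rasquin 2006). The interventions were any pharmacological intervention compared to placebo, no treatment, waiting list, or standard care. The primary outcome measures were pain intensity, pain duration or pain frequency, and improvement in pain. The secondary outcome measures were school performance, social or psychological functioning, and quality of daily life.

Data collection and analysis

Two review authors independently screened titles, abstracts, and potentially relevant full‐text reports for eligible studies. Two review authors extracted data and performed a 'Risk of bias' assessment. We used the GRADE approach to rate the overall quality of the evidence. Because the studies were substantially heterogeneous, we judged a meta‐analysis to be inappropriate and have instead provided a narrative summary of the results.

Main results

This review included 16 studies with a total of 1024 participants aged five to 18 years, all of whom were recruited from paediatric outpatient clinics. Studies were conducted in seven countries: seven in the USA, four in Iran, and one each in the UK, Switzerland, Turkey, Sri Lanka, and India. Follow‐up ranged from two weeks to four months. The studies examined the following drugs for RAP: tricyclic antidepressants, antibiotics, 5‐HT4 receptor agonists, antispasmodics, antihistamines, H2 receptor antagonists, serotonin antagonists, selective serotonin re‐uptake inhibitors, a dopamine receptor antagonist, and a hormone. Although some single studies reported that treatment was effective, all of these studies were either small or had methodological weaknesses that carried a substantial risk of bias. None of these 'positive' results has been reproduced in subsequent studies. We judged the evidence of effectiveness to be of low quality. None of the studies reported adverse events.

Authors' conclusions

There is currently no convincing evidence to support the use of drugs to treat RAP in children. Well‐designed clinical trials are needed to evaluate all possible benefits and risks of pharmacological interventions. In practice, if a clinician chooses to use a drug as a 'therapeutic trial', they and the patient need to be aware that RAP is a fluctuating condition and that any 'response' may reflect the natural history of the condition or a placebo effect, rather than drug efficacy.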

Plain language summary

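Drug treatment for recurrent abdominal pain in children

Review question

Do drug treatments improve pain or other symptoms in children with recurrent abdominal pain (RAP)?

Background

Recurrent abdominal pain in children is a term used to describe abdominal pain of unknown cause. The pain is often accompanied by other symptoms, such as diarrhoea or pallor. Some researchers therefore divide pain of unknown cause into different syndromes according to these associated symptoms. Recurrent abdominal pain is common in children, and the underlying causes may differ from child to child.

Study characteristics

We searched the worldwide scientific literature up to June 2016 for studies of drug treatments for children with recurrent abdominal pain. We found 16 studies that met our criteria. The drugs tested included antidepressants, antibiotics, antihistamines, antispasmodics, a dopamine receptor antagonist, and a hormone. Fourteen studies compared drug treatment with placebo, and two compared it with usual care. The trials were conducted in seven countries: seven in the USA, four in Iran, and one each in the UK, Switzerland, Turkey, Sri Lanka, and India. The studies included a total of 1024 children aged five to 18 years. All children were recruited from outpatient clinics. Follow‐up lasted from two weeks to four months.

Key results

The review found no evidence to support the use of drugs to improve symptoms or quality of life in children with RAP. Drugs should therefore be given only as part of a well‐designed clinical trial. If a drug is prescribed for a child with RAP, it is important to remember that RAP changes over time, so any improvement or worsening may reflect the natural course of the condition rather than a response to the drug.

Quality of the evidence

Many of the trials had shortcomings in their design and reporting, so the overall quality of the evidence for drug treatment of RAP is low. The trials with sound methods included few children, and their results have not been replicated.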

Authors' conclusions

Implications for practice

Overall, this review provides only extremely weak evidence for the efficacy of some pharmacological agents in children with RAP. The lack of clear evidence of effectiveness for any drug suggests that there is little reason for their use outside of well‐conducted clinical trials. Clinicians may choose to prescribe drugs to children whose symptoms are severe and who have not responded to simple management. However, when using drugs as a 'therapeutic trial', clinicians need to be aware that RAP is a fluctuating condition and any 'response' may reflect the natural history of the condition or a placebo effect, rather than drug efficacy.

Implications for research

The pathogenesis of RAP in children remains unclear (Hyams 1998). There is an obvious need for further studies to be conducted to elucidate this aetiology. It may be that the complaint of pain is a unifying manifestation for a wide variety of causal pathways and triggers relating to psychological and physical processes. It is unlikely that RAP is a single disease entity. Further trials are therefore needed not only to guide the management of children with RAP, but also to validate the usefulness of suggested classifications (Rasquin 2006).

Summary of findings

Summary of findings for the main comparison. Antispasmodics compared to placebo for recurrent abdominal pain

Patient or population: school‐aged children (5 to 18 years of age) with recurrent abdominal pain
Settings: hospital paediatric outpatient clinics
Intervention: antispasmodic drugs
Comparison: placebo

Outcomes	Illustrative comparative risks* (SD): assumed risk (placebo)	Illustrative comparative risks* (SD): corresponding risk (intervention)	Relative effect (95% CI)	Number of participants (studies)	Quality of the evidence (GRADE)	Comments
Pain duration (mean pain duration, assessed at 4 weeks)	The mean duration of pain in the control group was 6.17 (± 11.61).	The mean duration of pain in the intervention group was 51.6 (± 23.74).	MD ‐25.4 (‐35.5 to ‐15.3)	120 (1)	⊕⊝⊝⊝ Very low¹	Asgarshirazi 2015. No evidence of efficacy
Pain improvement (clinician judged, assessed at 2 weeks)	9 of 21 children in the control group had an improvement in pain.	15 of 21 children in the intervention group had an improvement in pain.	OR 3.33 (0.93 to 12.01)	42 (1)	⊕⊝⊝⊝ Very low²	Kline 2001. No evidence of efficacy
Pain frequency (episodes of pain in 4 weeks, assessed after 4 weeks)	The mean number of episodes of pain in the control group was 21.6 (32.4).	The mean number of episodes of pain in the intervention group was 10.3 (14).	MD 11.3 (2.4 to 20.1)	132 (1)	⊕⊝⊝⊝ Very low³	Narang 2015. No evidence of efficacy
Pain improvement (self reported response to treatment, assessed at 4 weeks)	The response to treatment in the control group was 30.3%.	The response to treatment in the intervention group was 40.6%.	OR 1.6 (0.7 to 3.4)	115 (1)	⊕⊕⊝⊝ Low⁴	Pourmoghaddas 2014. No evidence of efficacy

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; MD: mean difference; OR: odds ratio; SD: standard deviation

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹Downgraded for low methodological quality due to single, small study; risk of bias from incomplete outcome data and differential loss of participants between groups. The placebo differed in preparation and dose timing compared to the intervention drug.
²Downgraded for low methodological quality due to single, small study; risk of bias from selective outcome reporting and short follow‐up.
³Downgraded for low methodological quality due to single, small study; risk of bias from selective outcome reporting and the method of altering the drug doses.
⁴Downgraded for single, small study. Not duplicated.

Background

Description of the condition

This review is an update of a previously published review in the Cochrane Library on 'Pharmacological interventions for recurrent abdominal pain (RAP) and irritable bowel syndrome (IBS) in childhood' (Huertas‐Ceballos 2008a). Recurrent abdominal pain (RAP) is a common problem in paediatric practice. It has been suggested that 4% to 25% of school‐aged children at some stage suffer from RAP that interferes with their activities of daily living (Abu‐Arafeh 1995; Apley 1958; Faull 1986; Williams 1996; Øster 1972). Recurrent abdominal pain is often regarded as a relatively benign condition, but it is important to note the associated morbidity and anxiety it causes for children and carers. The condition is associated with school absences, hospital admissions, and, on occasion, unnecessary surgical intervention (Scharff 1997; Stickler 1979; Størdal 2005; Walker 1998). Symptoms sometimes continue into adulthood (Apley 1975). The abdominal pain is commonly associated with other symptoms, including headaches, recurrent limb pains, pallor, and vomiting (Abu‐Arafeh 1995; Apley 1958; Faull 1986; Hyams 1995; Stickler 1979; Stone 1970; Øster 1972).

It is generally accepted that RAP in children represents a group of functional gastrointestinal disorders with an unclear aetiology. Children suffer either chronic or recurrent gastrointestinal symptoms that are not explained by a structural, biochemical, or inflammatory process. Apley first sought to define the condition in the 1950s, suggesting that the diagnostic label should be based on the presence of at least three episodes of severe abdominal pain (often, but not necessarily, with associated systemic symptoms) over three months, with no established organic cause (Apley 1958). More recently, an international consensus definition with a symptom‐based classification system and specific categories for paediatric presentations has been produced, known as the Rome III criteria (Rasquin 2006). We have used RAP throughout this review as an umbrella term to refer to the five categories included within this classification, which are: functional dyspepsia, irritable bowel syndrome (IBS), abdominal migraine, functional abdominal pain, and functional abdominal pain syndrome. It should be noted that the pain classification for each of the Rome III diagnoses is defined by at least one episode per week for at least two months; this varies from Apley's original definition of RAP (Apley 1958). The Rome classification is not based on known pathophysiological differences between the conditions, but rather on the constellation of clinical features. It remains unclear to what extent separating children into these categories defines groups that are distinct clinical entities likely to respond differently to treatment. Nonetheless, this classification has been welcomed following the historical use of diverse terms, some implying causation. These include abdominal migraine (Bain 1974; Farquar 1956; Hockaday 1992; Symon 1986), abdominal epilepsy (Stowens 1970), the irritable bowel syndrome in childhood (Stone 1970), allergic‐tension‐fatigue syndrome (Sandberg 1973; Speer 1954), neurovegetative dystonia (Peltonen 1970; Rubin 1967), functional gastrointestinal disorder (Drossman 1995), and the irritated colon syndrome (Harvey 1973; Painter 1964).

There is no consensus about which of the numerous proposed causal pathways result in the heterogeneous presentations of chronic abdominal pain, although it has been suggested that physical, emotional, and environmental factors may contribute to the manifestation of unexplained abdominal pain. When considering the diverse proposed mechanisms, it is unsurprising that a variety of treatments have been suggested. The treatment approaches can be grouped into pharmacological, dietary, or psychosocial (psychological, behavioural, or both). Reviews of the effectiveness of psychosocial and dietary interventions for RAP (Huertas‐Ceballos 2008b; Huertas‐Ceballos 2009), published in 2008 and 2009, have been updated as companions to this updated review (Abbott 2017; Newlove‐Delgado in press, respectively).

Description of the intervention

A range of pharmacological treatments have been tried and tested for RAP in childhood: analgesics, dicyclomine (Edwards 1994), pizotifen (Christensen 1995; Symon 1995), herbal extracts (Zhang 1991), and many other drugs (Bain 1974; Worawattanakul 1999). A number of randomised controlled trials have reported on the use of peppermint oil for IBS in adults (Grigoleit 2005), the results of which have been interpreted as suggesting it to be a beneficial intervention. However, an earlier review reached no clear conclusion on efficacy due to poor methodological quality of the included studies (Pittler 1998). Other possible causal factors have been postulated, including food allergies (Poley 1973), reaction to food additives (Anonymous 1984), infectious agents like Helicobacter pylori, and parasitic infestation (Heldenberg 1995; Primelles 1990; Wardhan 1993).

How the intervention might work

The aetiology of abdominal pain‐related functional gastrointestinal disorders is unclear. It has been suggested that visceral hypersensitivity (Di Lorenzo 2001; Van Ginkel 2001), autonomic dysfunction (Good 1995), and gut dysmotility may contribute, which may be initiated by an inflammatory, infective, traumatic, or allergic trigger (Mayer 2002; Milla 2001).

Conventional analgesics have been proposed to work by interrupting these abnormal physiological pain responses, which become pathological. Antispasmodics, including peppermint oil, which has antispasmodic actions (Hills 1991), have been proposed to alter gut dysmotility. Serotonin (5‐hydroxytryptamine) agonists may relieve symptoms by causing vasoconstriction and stimulating the release of other vasoactive substances, thus inhibiting neurogenic inflammation; this has been found in migraine headaches (Goadsby 2000). Antidepressants treat the associated symptoms, and there is evidence of effectiveness in treating IBS in adults (Ruepert 2011).

Why it is important to do this review

Recurrent abdominal pain in children is very common, and in daily clinical practice there is no consensus on which treatments to offer patients. The approach is therefore inconsistent. It was important to do this review to establish if there is evidence for the effectiveness of pharmacological interventions in children with RAP. This review updates a previous review published in 2008 (Huertas‐Ceballos 2008a). Companion reviews addressing the effectiveness of psychosocial interventions (Huertas‐Ceballos 2008b) and dietary interventions (Huertas‐Ceballos 2009) for RAP are also being updated (Abbott 2017; Newlove‐Delgado in press, respectively), so that together they can guide clinicians, patients, and their families in treatment decisions.

Objectives

To determine the effectiveness of pharmacological interventions for RAP in children of school age.

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs).

Types of participants

Children aged five to 18 years with RAP or an abdominal pain‐related functional gastrointestinal disorder, as defined by the Rome III criteria (Rasquin 2006).

Recurrent abdominal pain is defined as at least three episodes of pain interfering with normal activities within a three‐month period. The Rome III criteria recognise five abdominal pain‐related categories: abdominal migraine, irritable bowel syndrome (IBS), functional dyspepsia, functional abdominal pain, and functional abdominal pain syndrome (Rasquin 2006).

Types of interventions

Any pharmacological intervention compared to placebo, no treatment, waiting list, or standard care.

Types of outcome measures

Primary outcomes

Pain intensity
Pain duration or pain frequency
Improvement in pain

As there is no standard method for measuring pain in this condition, studies could have used any validated measurement of pain such as a Likert scale, visual analogue scale, or questionnaire such as the Abdominal Pain Index (Walker 1997), which exists in various versions and formats. The trials could also have used 'the proportion of participants with a significant improvement in pain' as an outcome, as defined by the trial author.

Secondary outcomes

The following were secondary outcomes, as measured by a validated tool.

School performance
Social or psychological functioning
Quality of daily life

Search methods for identification of studies

Electronic searches

We ran the first literature searches in April 2013 and updated them in April 2014, March 2015, and again in June 2016. We searched the electronic databases and trial registers listed below.

Cochrane Central Register of Controlled Trials (CENTRAL; 2016, Issue 5) in the Cochrane Library, which includes the Cochrane Developmental, Psychosocial and Learning Problems Specialised Register (searched 10 June 2016).
Ovid MEDLINE In‐Process & Other Non‐Indexed Citations and Ovid MEDLINE (1946 to current; searched 9 June 2016).
Embase Ovid (1974 to current; searched 9 June 2016).
CINAHL Healthcare Databases Advanced Search (Cumulative Index to Nursing and Allied Health Literature; 1981 to current; searched 9 June 2016).
PsycINFO Ovid (1806 to current; searched 9 June 2016).
ERIC ProQuest (Educational Resources Information Center; 1966 to current; searched 9 June 2016).
BEI ProQuest (British Education Index; 1975 to current; searched 9 June 2016).
ASSIA ProQuest (Applied Social Sciences Index and Abstracts; 1987 to current; searched 9 June 2016).
AMED Healthcare Databases Advanced Search (Allied and Complementary Medicine; 1985 to current; searched 9 June 2016).
LILACS (Latin American and Caribbean Literature in Health Sciences; lilacs.bvsalud.org/en; searched 9 June 2016).
OpenGrey (opengrey.eu; searched 9 June 2016).
ClinicalTrials.gov (clinicaltrials.gov; searched 9 June 2016).
World Health Organization International Clinical Trials Registry Platform (WHO ICTRP; apps.who.int/trialsearch; searched 9 June 2016).

We revised the search terms from the original Cochrane RAP reviews (Huertas‐Ceballos 2008; Huertas‐Ceballos 2009a; Huertas‐Ceballos 2009b); we consequently ran the searches for all available years. We used RCT filters where appropriate and imposed no language limits. We translated any non‐English language studies identified so that they could be screened and considered for inclusion. The search strategies for each database are reported in Appendix 1.

Searching other resources

We used the Science Citation Index (Web of Science) for forward citation searching to identify papers in which the included reports had been cited, and we checked the reference lists of the included reports to identify any additional studies, including any ongoing or unpublished work. We also contacted researchers who have published studies in this field to ask for details of any relevant trials.

Data collection and analysis

Selection of studies

Two review authors (RAA, AB, TVND, or AEM) independently screened the titles and abstracts of studies identified by the search for relevance. We obtained the full‐text reports of any paper that we judged to be potentially suitable for inclusion and then identified studies for inclusion against the Criteria for considering studies for this review. Any disagreements were resolved through discussion with a third review author (JTC).

Data extraction and management

Two review authors (RAA, AB, JTC, TVND, or AEM) extracted data and entered the data into Cochrane's statistical software Review Manager 5 (Review Manager 2014). All review authors used the same data extraction form. We collected the following data.

Study characteristics: number of participating children, inclusion and exclusion criteria, type of intervention and comparison, intervention characteristics (duration, frequency, setting), and number of withdrawals.
Participant characteristics: sex, age, and diagnosis (e.g. RAP or a syndrome defined by the Rome III criteria) (Rasquin 2006).
Outcome measures: measurement of pain and any secondary outcomes measured.

Assessment of risk of bias in included studies

We considered the following domains when assessing the risk of bias of included studies:

selection bias (random sequence generation and allocation concealment);
performance bias (blinding of participants and personnel);
detection bias (blinding of outcome assessment);
attrition bias (incomplete outcome data);
reporting bias (selective outcome reporting); and
other sources of bias. We assessed all included studies for other sources of bias that may have altered the estimate of treatment effect, such as differential loss to follow‐up, whether the data collection tools were valid, whether there was sufficient power in terms of appropriate sample size, whether baseline parameters were similar, and whether data analyses were appropriate.

Two review authors (RAA, AB, JTC, TVND, or AEM) independently assessed each study. We classified the risk of bias as low, high, or unclear based on the methods detailed in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a). See Table 1. We considered a trial as having an overall low risk of bias if most of the above domains were assessed as at low risk of bias. We considered a trial as having an overall high risk of bias if several of the above domains were assessed as at high risk of bias or unclear risk of bias.

Table 1. Assessment of risk of bias in included studies

Domain	Low risk of bias	High risk of bias	Unclear risk of bias
Selection bias
Random sequence generation	If the study details any of the following methods: (1) simple randomisation (such as coin‐tossing, throwing dice, dealing previously shuffled cards, a list of random numbers, or computer‐generated random numbers); or (2) restricted randomisation such as blocked, ideally with varying block sizes or stratified groups, provided that within‐group randomisation is not affected	If the study details no randomisation or an inadequate method such as alternation or assignment based on date of birth, case record number, or date of presentation. These latter methods may be referred to as 'quasi‐random'.	If there is insufficient detail to judge the risk of bias
Allocation concealment	If the study details concealed allocation sequence in sufficient detail to determine that allocations could not have been foreseen in advance of, or during, enrolment	If the study details a method where the allocation is known prior to assignment	If there is insufficient detail to judge the risk of bias
Performance bias
Blinding of participants and personnel	If the study details a method of blinding participants and personnel. Detail would need to be sufficient to show that participants and personnel were unable to identify the therapeutic intervention from the control intervention.	If the methods detail that the participants or study personnel were not blinded to the study medication or placebo	If there is insufficient detail to judge the risk of bias
Detection bias
Blinding of outcome assessment	If the study details a blinded outcome assessment. This may only be possible for outcomes that are externally assessed.	If the outcome assessment is not blinded. We expect this may be unavoidable for self rated outcomes of unblinded interventions.	If there is insufficient detail to judge the risk of bias
Attrition bias
Incomplete outcome data	If the study reports attrition and exclusions, including the numbers in each intervention group (compared with total randomised participants), reasons for attrition or exclusions, and any re‐inclusions; the impact of missing data is not believed to have altered the conclusions; and reasons for the missing data are acceptable	We may judge the risk of attrition bias to be high due to the amount, nature, or handling (such as per‐protocol analysis) of incomplete outcome data.	If there is insufficient detail to judge the risk of bias, e.g. if the number of children randomised to each treatment is not reported
Reporting bias
Selective reporting	If there is complete reporting of all outcome data. This will be determined based on comparison of the protocol and published study, if available.	If the reporting is selective so that some outcome data are not reported	If there is insufficient detail to judge the risk of bias, e.g. protocols are unavailable
Other sources of bias
Other bias	If the study is judged to be at low risk of other potential sources of bias, such as no differential loss to follow‐up or an adequate washout period in cross‐over trials	If there are other sources of bias, such as differential loss to follow‐up or an inadequate washout period in cross‐over trials	If there is insufficient detail to judge the risk of bias

Measures of treatment effect

Continuous data

For continuous data (e.g. number of days of pain), we analysed means and standard deviations (SDs), if these were available or could be calculated and if there was no clear evidence of skewness in the distribution, and presented these with 95% confidence intervals (CIs).

If different scales were used to measure the same clinical outcome, we combined the standardised mean differences (SMDs) across the studies, and presented these with 95% CIs.
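As a worked illustration of the continuous measures described above, the sketch below computes a mean difference (MD) with a normal‐approximation 95% CI and a standardised mean difference (here Cohen's d with a pooled SD) from group summary statistics. The means and SDs are the Narang 2015 pain frequency values from the 'Summary of findings' table. The group sizes (66 per arm), the function names, and the choice of a normal approximation are assumptions made for illustration, so the interval is not expected to reproduce the published one.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Return (MD, lower, upper) for group 1 minus group 2, normal approximation."""
    md = m1 - m2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, md - Z95 * se, md + Z95 * se


def standardised_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Return Cohen's d: the mean difference divided by the pooled SD."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


# Placebo: 21.6 (SD 32.4); antispasmodic: 10.3 (SD 14) episodes in 4 weeks.
# ASSUMPTION: 66 children per arm (the table reports only the total of 132).
md, lower, upper = mean_difference(21.6, 32.4, 66, 10.3, 14.0, 66)
smd = standardised_mean_difference(21.6, 32.4, 66, 10.3, 14.0, 66)
print(f"MD {md:.1f} (95% CI {lower:.1f} to {upper:.1f}); SMD {smd:.2f}")
```

Dividing by the pooled SD is what allows outcomes measured on different scales to be expressed in common units, which is the situation the SMD is intended for.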

Dichotomous data

For dichotomous data (e.g. pain improved, yes or no), we analysed data using odds ratios (ORs) and presented these with 95% CIs.
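To make the odds ratio concrete, the sketch below recomputes the Kline 2001 result from the counts given in the 'Summary of findings' table (15 of 21 children improved with the intervention, 9 of 21 with placebo). It assumes the standard log‐odds (Woolf) confidence interval; the review does not state which interval method was used, and the function name is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio(events_trt, n_trt, events_ctl, n_ctl, level=0.95):
    """Return (OR, lower, upper) using the log-odds (Woolf) interval."""
    a, b = events_trt, n_trt - events_trt  # intervention: improved / not improved
    c, d = events_ctl, n_ctl - events_ctl  # control: improved / not improved
    or_ = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return or_, exp(log(or_) - z * se_log_or), exp(log(or_) + z * se_log_or)


# Kline 2001: 15 of 21 improved on the intervention, 9 of 21 on placebo.
or_, lower, upper = odds_ratio(15, 21, 9, 21)
print(f"OR {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
# prints "OR 3.33 (95% CI 0.93 to 12.01)"
```

Because the interval includes 1, the data are compatible with no difference between groups. The function as written does not handle a zero cell, which would need a continuity correction.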

Unit of analysis issues

We did not apply any methods for unit of analysis issues, as we did not perform a meta‐analysis. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

Dealing with missing data

We first contacted the original investigators to request any missing data, but we received no responses. We did not impute any values. We did not feel it was relevant to carry out a sensitivity analysis, with and without the missing data, as it was not possible to conduct a meta‐analysis. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

We collected the proportions of participants for whom no outcome data were obtained and reported them in the assessment of Risk of bias in included studies. We have also provided this information for each study in the 'Risk of bias' tables, beneath the Characteristics of included studies tables. We explored the potential impact of missing data on the findings of the review in the Discussion section.

Assessment of heterogeneity

We anticipated finding considerable heterogeneity between included studies. We assessed clinical heterogeneity by examining the distribution of relevant participant characteristics (e.g. age, definition of RAP) and study differences (e.g. concealment of randomisation, blinding of outcome assessors, interventions or outcome measures).

We did not use our prescribed methods for assessing statistical heterogeneity as we did not perform a meta‐analysis. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

Assessment of reporting biases

We examined the report of each study to assess for selective outcome reporting. We assessed the study as adequate if it met the following criteria:

the study protocol was available and all of the study's prespecified (primary and secondary) outcomes that were of interest to the review were reported in the prespecified way; or
the study protocol was not available, but it was clear that the published reports included all expected outcomes, including those that were prespecified.

Data synthesis

Given the heterogeneity of drug classes and pain measurements used in the included studies, it was not possible to carry out a meta‐analysis (DerSimonian 1986). We have therefore provided a narrative description of the results. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

Summary of findings

We used the GRADE approach to assess the overall quality of the body of evidence for a specific outcome (Grade Working Group 2013). We used GRADEpro to assess and present the findings in a 'Summary of findings' table (GRADEpro GDT 2015). There were 11 comparisons in total, as we found 11 classes of drugs in the studies. We chose to produce a 'Summary of findings' table for antispasmodic drugs versus placebo (summary of findings Table for the main comparison), as this was the most commonly investigated drug class in the studies, with a total of four studies. We presented pain, the primary outcome for this review, in the 'Summary of findings' table. The measurement of pain varied between studies, as explained in the Types of outcome measures section.

We judged the studies included for each outcome using five criteria: risk of bias, indirectness, inconsistency, imprecision, and publication bias. We used limitations in the design and implementation to assess the overall risk of bias of included studies for each outcome; we downgraded an outcome if the majority of studies had unclear or high risk of bias. We considered the evidence indirect if a population, intervention, or outcome was not of direct interest to the review. We determined inconsistency from the heterogeneity of results; if heterogeneity for an outcome was greater than 70%, we downgraded the quality of the evidence for that outcome. We assessed imprecision by the number of participants included in an outcome and by the CIs, and downgraded an outcome when only a small number of participants could be included in the analysis or the analysis had wide CIs. Finally, we downgraded for publication bias if studies failed to report outcomes in the published manuscript or if there was a suspicion that null findings had not been published or reported (Schünemann 2011, section 12.2.2).

We gave each outcome a quality marking ranging from 'very low' to 'high'.

High quality: "further research is unlikely to change our estimate of effect".
Moderate quality: "further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate".
Low quality: "further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate".
Very low quality: "we are very uncertain about the estimate" (Balshem 2011).
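The grading logic above can be sketched as a simple step‐down scale. This is a simplified illustration, not the GRADEpro implementation: it assumes that evidence from randomised trials starts at 'high' and that each of the five criteria contributes zero, one, or two downgrade steps. The function name and the example judgements are hypothetical.

```python
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(downgrades, start="high"):
    """Lower the starting level by the total downgrade steps, never below 'very low'."""
    index = GRADE_LEVELS.index(start) - sum(downgrades.values())
    return GRADE_LEVELS[max(index, 0)]


# Hypothetical outcome: one step for risk of bias and two for imprecision.
concerns = {
    "risk of bias": 1,
    "indirectness": 0,
    "inconsistency": 0,
    "imprecision": 2,
    "publication bias": 0,
}
print(grade_quality(concerns))  # prints "very low"
```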

Subgroup analysis and investigation of heterogeneity

We used no methods as we did not perform a meta‐analysis. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

Sensitivity analysis

We used no methods as we did not perform a meta‐analysis. For methods archived for future updates of this review, please see Appendix 2 and our protocol (Martin 2014a).

Results

Description of studies

For a full description of the main characteristics of the studies, including details on participants and setting, intervention aspects, and outcome measures, see Characteristics of included studies. See also Characteristics of excluded studies.

Results of the search

For this updated review, we chose to redesign the search strategy in order to include the recognised terms for different types of RAP, as defined by the Rome criteria (Rasquin 2006). Consequently, we ran our searches without date restrictions on each database. The results of the searching and screening process are shown in the PRISMA flow chart (Figure 1). We screened a total of 9649 titles and abstracts, 230 of which we carried forward for further screening at full‐text. We excluded 214 reports and included 16 studies.

Figure 1

Study flow diagram.

Settings

All 16 studies were conducted in hospital paediatric outpatient clinics.

Study duration

The study duration ranged from two weeks (Collins 2011; Kline 2001; Sadeghian 2008) to four months (Symon 1995). See Characteristics of included studies for details of each study.

Interventions

The included studies evaluated the following drug classes: tricyclic antidepressants (two studies) (Bahar 2008; Saps 2009), antibiotics (two studies) (Collins 2011; Heyland 2012), antimuscarinics (one study) (Karabulut 2013), 5‐HT4 receptor agonists (one study) (Khoshoo 2006), antispasmodics (four studies) (Asgarshirazi 2015; Kline 2001; Narang 2015; Pourmoghaddas 2014), selective serotonin re‐uptake inhibitors (one study) (Roohafza 2014), antihistamines (one study) (Sadeghian 2008), H2 receptor antagonists (one study) (See 2001), a serotonin antagonist (one study) (Symon 1995), a dopamine receptor antagonist (one study) (Karunanayake 2015), and a hormone (one study) (Zybach 2016).
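The tally of drug classes can be checked with a short script. The mapping below is transcribed from the list above; the variable names are illustrative. It confirms that the 16 studies fall into 11 drug classes and that antispasmodics were the most frequently studied class.

```python
studies_by_class = {
    "tricyclic antidepressants": ["Bahar 2008", "Saps 2009"],
    "antibiotics": ["Collins 2011", "Heyland 2012"],
    "antimuscarinics": ["Karabulut 2013"],
    "5-HT4 receptor agonists": ["Khoshoo 2006"],
    "antispasmodics": [
        "Asgarshirazi 2015", "Kline 2001", "Narang 2015", "Pourmoghaddas 2014",
    ],
    "selective serotonin re-uptake inhibitors": ["Roohafza 2014"],
    "antihistamines": ["Sadeghian 2008"],
    "H2 receptor antagonists": ["See 2001"],
    "serotonin antagonists": ["Symon 1995"],
    "dopamine receptor antagonists": ["Karunanayake 2015"],
    "hormones": ["Zybach 2016"],
}

n_classes = len(studies_by_class)
n_studies = sum(len(studies) for studies in studies_by_class.values())
most_studied = max(studies_by_class, key=lambda c: len(studies_by_class[c]))
print(n_classes, n_studies, most_studied)  # prints "11 16 antispasmodics"
```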

Study design

Three studies were cross‐over trials (See 2001; Symon 1995; Zybach 2016). One study had two intervention arms (Asgarshirazi 2015); the second intervention is reported in the accompanying review of dietary interventions (Newlove‐Delgado in press). The remaining 12 studies were single‐intervention, placebo‐controlled trials (Bahar 2008; Collins 2011; Heyland 2012; Karabulut 2013; Karunanayake 2015; Khoshoo 2006; Kline 2001; Narang 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; Saps 2009). All studies were randomised.

Outcomes

Primary outcome

All 16 included trials measured pain; the method of measurement varied. Tools included: visual analogue scale (VAS; 0 to 10), mean pain score and number of days of pain. Some trials used the parental or child report of adequate pain relief as the outcome measure.

Secondary outcomes

One study measured school attendance (Narang 2015). One study measured social and psychological functioning (Roohafza 2014); the authors assessed self rated "depression, anxiety, and somatization" scores before and after treatment. Two studies measured quality of life (Bahar 2008; Karunanayake 2015). Other outcomes included: gastrointestinal symptoms scale and global assessment of well‐being. See Characteristics of included studies for details of outcome measures used in studies.

Excluded studies

We examined 230 full‐text reports and excluded 214. Of these, 203 reports were clearly irrelevant. Of the 11 remaining full‐text reports, we excluded four due to ineligible study design (Christensen 1995; Cucchiara 1992; Dehghani 2011; Kaminski 2009), five due to ineligible populations (Di Nardo 2011; Everitt 2010; Lloyd‐Still 1990; Van Outryve 2005; Yadav 1989), and two because of ineligible comparators (Grillage 1990; Xiao 2013). Please see Characteristics of excluded studies.
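The screening and exclusion counts reported above can be cross‐checked with simple arithmetic. The following is a minimal sketch of such a consistency check, using only the numbers given in the text; the variable names are illustrative and are not part of the review's methods.

```python
# Counts as reported in the text of this review.
titles_screened = 9649
full_text_reports = 230
excluded_reports = 214

# Breakdown of the 214 exclusions given under "Excluded studies".
exclusion_reasons = {
    "clearly irrelevant": 203,
    "ineligible study design": 4,
    "ineligible population": 5,
    "ineligible comparator": 2,
}

# Records not carried forward from title and abstract screening.
not_carried_forward = titles_screened - full_text_reports

# Studies remaining after full-text exclusions.
included_studies = full_text_reports - excluded_reports

# The reasons for exclusion should account for every excluded report.
reasons_total = sum(exclusion_reasons.values())

print(f"Not carried forward to full text: {not_carried_forward}")
print(f"Exclusions accounted for: {reasons_total} of {excluded_reports}")
print(f"Included studies: {included_studies}")
```

The totals agree with the text: the stated reasons account for all 214 excluded reports, leaving the 16 included studies.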

Risk of bias in included studies

For a summary, see Figure 2 and Figure 3.

Figure 2

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 3

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Allocation

Of the 16 included studies, we judged seven to be at low risk of bias for random sequence generation (Karunanayake 2015; Khoshoo 2006; Narang 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; Symon 1995), and nine to be at unclear risk, largely because the authors did not explain their method of randomisation (Asgarshirazi 2015; Bahar 2008; Collins 2011; Heyland 2012; Karabulut 2013; Kline 2001; Saps 2009; See 2001; Zybach 2016).

We rated eight studies at low risk of bias because allocation concealment had been achieved (Collins 2011; Narang 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; See 2001; Symon 1995; Zybach 2016), six studies at unclear risk of bias because the authors did not mention allocation concealment (Asgarshirazi 2015; Heyland 2012; Karunanayake 2015; Khoshoo 2006; Kline 2001; Saps 2009), and two studies at high risk of bias because the methods implied that allocation concealment had not been achieved (Bahar 2008; Karabulut 2013).

Blinding

We judged 11 studies with clear blinding of participants to be at low risk of bias (Collins 2011; Heyland 2012; Karunanayake 2015; Narang 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; Saps 2009; See 2001; Symon 1995; Zybach 2016). We judged the risk of performance bias as unclear in three studies (Asgarshirazi 2015; Bahar 2008; Kline 2001), and high in two studies where blinding was not achieved (Karabulut 2013; Khoshoo 2006). These judgements were mirrored in the risk of detection bias, as the primary outcomes were all self or parent reported and therefore relied on blinding.

Incomplete outcome data

Outcome data were largely complete, and we judged 13 of the 16 included studies to have a low risk of attrition bias (Bahar 2008; Collins 2011; Heyland 2012; Khoshoo 2006; Kline 2001; Narang 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; Saps 2009; See 2001; Symon 1995; Zybach 2016). We judged one study to have an unclear risk of attrition bias, as the authors did not report the numerical outcome data or the number of participants completing the study, but rather reported results as a percentage without a total (Karabulut 2013). We judged one further study to have an unclear risk of attrition bias due to insufficient information in the conference abstract and no further information from personal communication with the study authors (Karunanayake 2015). We judged one study to be at a high risk of attrition bias, as there were differential losses to follow‐up between the groups (Asgarshirazi 2015).

Selective reporting

We considered the risk of reporting bias to be unclear in two studies (Collins 2011; Khoshoo 2006), and high in three studies (Bahar 2008; Kline 2001; Narang 2015), because the authors did not report results for all the outcomes mentioned in the methods section. We rated the risk of reporting bias as low in 11 studies (Asgarshirazi 2015; Heyland 2012; Karabulut 2013; Karunanayake 2015; Pourmoghaddas 2014; Roohafza 2014; Sadeghian 2008; Saps 2009; See 2001; Symon 1995; Zybach 2016).

Other potential sources of bias

We judged the risk of other potential sources of bias as high in 10 studies (Asgarshirazi 2015; Bahar 2008; Collins 2011; Karabulut 2013; Narang 2015; Sadeghian 2008; Saps 2009; See 2001; Symon 1995; Zybach 2016), and unclear in four studies (Heyland 2012; Karunanayake 2015; Khoshoo 2006; Kline 2001), due to lack of details on baseline characteristics, lack of power calculations, lack of details on how outcomes were measured, and insufficient washout periods in cross‐over trials. We rated two studies at low risk of other potential sources of bias (Pourmoghaddas 2014; Roohafza 2014).

For each domain above, where there was insufficient information to make a judgement of high or low risk of bias, we wrote to all trial authors for clarification (Abbott 2014a [pers comm]; Abbott 2014b [pers comm]; Martin 2014b [pers comm]; Martin 2014c [pers comm]; Martin 2016a [pers comm]; Martin 2016b [pers comm]; Martin 2016c [pers comm]; Martin 2016d [pers comm]; Martin 2016e [pers comm]; Martin 2016f [pers comm]; Martin 2016g [pers comm]). As we received limited response, we assigned ratings of unclear risk of bias for these studies in these domains.
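The risk of bias graph (Figure 3) presents these judgements as percentages across the 16 included studies. The sketch below shows one way to derive such percentages from the counts stated in the text above. The domain labels and the function name are illustrative assumptions, and the blinding row reflects the statement that detection bias judgements mirrored those for performance bias.

```python
TOTAL_STUDIES = 16

# (low, unclear, high) counts per domain, as stated in the text above.
judgements = {
    "Random sequence generation": (7, 9, 0),
    "Allocation concealment": (8, 6, 2),
    "Blinding (performance and detection)": (11, 3, 2),
    "Incomplete outcome data": (13, 2, 1),
    "Selective reporting": (11, 2, 3),
    "Other potential sources of bias": (2, 4, 10),
}


def as_percentages(counts, total=TOTAL_STUDIES):
    """Convert (low, unclear, high) counts to percentages of all studies."""
    if sum(counts) != total:
        raise ValueError(f"counts {counts} do not sum to {total}")
    return tuple(round(100 * c / total, 1) for c in counts)


percentages = {domain: as_percentages(c) for domain, c in judgements.items()}

for domain, (low, unclear, high) in percentages.items():
    print(f"{domain}: low {low}%, unclear {unclear}%, high {high}%")
```

The check inside `as_percentages` confirms that every domain accounts for all 16 studies before any percentage is calculated.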

Effects of interventions

See: Summary of findings for the main comparison Antispasmodics compared to placebo for recurrent abdominal pain

Comparison 1. Tricyclic antidepressants compared to placebo

Two studies, Bahar 2008 and Saps 2009, evaluated amitriptyline compared to placebo in 123 children with functional gastrointestinal disorders as defined by the Rome II criteria (Rasquin‐Weber 1999).

Bahar 2008 assessed multiple outcomes measuring pain (using a pain‐rating scale and VAS), quality of life, and IBS symptoms for up to 13 weeks. The quality of life tool had been validated in adults (Patrick 1997). The authors reported no significant difference in pain between the two groups, but provided no data to support this. The mean quality of life scores were 109.4 and 127.5 at baseline for the intervention and control groups, respectively. At 13 weeks' follow‐up, they were 126.2 and 129.8 for the intervention and control groups, respectively. The authors reported that a higher proportion of participants in the intervention group than in the control group showed a 15% increase in quality of life at 13 weeks (P = 0.002). This comparison may have been a post‐hoc analysis, as it is not mentioned in the methods of the study, and its clinical significance is unclear. Of note, the two groups had a 14% difference in mean quality of life scores at baseline. The study authors did not provide the standard deviations for these data. We wrote to the study authors to request further information but received no response (Abbott 2014b [pers comm]).

Saps 2009 used author‐defined outcomes to evaluate improvement in pain from amitriptyline compared to placebo at four weeks' follow‐up. They found no difference between the intervention and control groups when assessing the self reported outcomes of "how well did the medication relieve your pain" and "overall how do you feel your problem is". Out of the 46 children in the amitriptyline group, 23 experienced an improvement in pain, compared to 22 out of 44 in the control group (OR 1.05, 95% CI 0.45 to 2.45).

We were unable to perform a meta‐analysis due to the use of different outcome measures and lack of numerical data. The GRADE quality rating for this outcome was very low, because there were only two studies and both had methodological flaws.

Comparison 2. Antibiotics compared to placebo

Two studies, Collins 2011 and Heyland 2012, evaluated antibiotics compared to placebo in 112 children with Rome II criteria for IBS, functional dyspepsia, functional abdominal pain, or abdominal migraine (Rasquin‐Weber 1999), and RAP (Apley 1958).

Collins 2011 assessed the effectiveness of rifaximin compared to placebo. The authors used 10 self reported outcomes on a VAS and a pain questionnaire to evaluate improvement in pain. The 10 self reported outcomes included: incomplete evacuation, abdominal pain, diarrhoea, constipation, urgency to pass stool, passage of mucus, straining, and faecal soiling. The authors stated that there were no differences between the intervention and control groups for any of the outcomes. The authors did not present their data to support this conclusion. We contacted the authors but received no response (Abbott 2014a [pers comm]).

Heyland 2012 also reported the use of antibiotics to treat RAP. They compared co‐trimoxazole to placebo on the mean pain index. This was a 10‐point VAS measuring pain severity for each group. The mean pain index changed from 6.9 pre‐treatment to 4.1 post‐treatment in the intervention group, a mean change of ‐2.9. In the placebo group, the mean pain index changed from 7.4 pre‐treatment to 3.0 post‐treatment, a mean change of ‐4.4. The authors found no difference in scores on the mean change of pain index between the two groups. No raw data were given. We contacted the authors but received no response (Martin 2016c [pers comm]).

We were unable to perform a meta‐analysis due to lack of numerical data. These papers offer no evidence of the effectiveness of antibiotics to treat functional gastrointestinal disorders. The GRADE quality rating for this outcome was very low, due to lack of reliable outcome data to assess precision of treatment effect, and small and single studies for any drug intervention.

Comparison 3. Antimuscarinic drugs compared to usual care

Karabulut 2013 compared trimebutine to usual care in 78 children with IBS as defined by Rome III criteria (Rasquin 2006). The authors reported a self defined outcome of pain improvement, "adequate relief", as assessed by the parents. There was an improvement in 37 out of 39 children treated with trimebutine and in eight out of 39 children treated with usual care (P < 0.0001; OR 71.7, 95% CI 14.2 to 362.7). However, it is important to note that the intervention was not blinded and that the outcome measure was parental assessment, therefore performance and detection bias alone could explain the results. Consequently, the GRADE quality rating was very low, and the results should be interpreted with caution.

Comparison 4. 5‐HT4 agonists compared to usual care

Khoshoo 2006 compared tegaserod with usual care in 48 children with constipation‐predominant IBS. The authors reported 14 out of 21 children with good pain reduction in the treatment group compared with 5 out of 27 children in the control group (P < 0.05 (exact P value not reported in paper); OR 8.8, 95% CI 2.3 to 33.2). This suggests that tegaserod may be effective in relieving abdominal pain in constipation‐predominant IBS. Due to the small number of children participating in the study and the risk of performance and detection bias, this result should be interpreted with caution. Importantly, it is not clear if this was a post‐hoc analysis. The GRADE quality rating for this comparison was therefore very low.
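The odds ratios quoted for Comparisons 3 and 4 can be reproduced from the reported counts. The sketch below assumes a Wald confidence interval on the log odds scale with z = 1.96. The review does not state which method was used, so this is an illustration rather than the authors' calculation, and the function name is hypothetical.

```python
import math


def odds_ratio_wald(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of group A versus group B with a Wald CI on the log scale.

    Returns (odds_ratio, lower_limit, upper_limit).
    """
    a, b = events_a, total_a - events_a  # group A: improved / not improved
    c, d = events_b, total_b - events_b  # group B: improved / not improved
    if 0 in (a, b, c, d):
        raise ValueError("a zero cell needs a continuity correction (not handled)")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return tuple(math.exp(x) for x in (log_or, log_or - z * se, log_or + z * se))


# Counts as reported in the text: children improved / children in each arm.
karabulut = odds_ratio_wald(37, 39, 8, 39)  # trimebutine versus usual care
khoshoo = odds_ratio_wald(14, 21, 5, 27)    # tegaserod versus usual care

for name, (or_, lower, upper) in (("Karabulut 2013", karabulut),
                                  ("Khoshoo 2006", khoshoo)):
    print(f"{name}: OR {or_:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Under these assumptions the results agree with the values quoted above. The very wide intervals reflect the small cell counts, for example only two children without improvement in the trimebutine arm.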

Comparison 5. Antispasmodics compared to placebo

Four trials compared antispasmodics to placebo in 377 children with functional abdominal pain (Asgarshirazi 2015; Kline 2001; Narang 2015; Pourmoghaddas 2014). For a summary of antispasmodics, see summary of findings Table for the main comparison.

Asgarshirazi 2015 compared peppermint oil, placebo and a synbiotic Lactol (containing a probiotic and fructo‐oligosaccharide) in a three‐arm RCT. One hundred and twenty children with functional gastrointestinal disorders based on Rome III criteria were included in the study, but those with abdominal migraine were excluded. Changes in severity, duration, and frequency of pain and any adverse effects were reported at four weeks. The results for the synbiotic intervention are reported in an accompanying dietary review (Newlove‐Delgado in press). Thirty‐four children completed the study and were analysed in the peppermint oil intervention group and 25 in the placebo group. The authors reported the intervention group compared to placebo: mean difference (MD) in pain duration ‐25.4 minutes/day (95% CI ‐35.5 to ‐15.3), MD in frequency of pain ‐1.4 episodes/week (95% CI ‐2.0 to ‐0.8), and MD in severity of pain ‐1.1 on a numerical rating scale (95% CI ‐1.8 to ‐0.4). Notably, this trial was at risk of bias from incomplete outcome data and differential loss of children between groups (38% drop out in the placebo group and 15% in the intervention group). The placebo was different in preparation and dose timing compared to the intervention drug.

Kline 2001 also compared peppermint oil capsules with placebo in 42 children with IBS as defined by the Rome criteria (Rasquin‐Weber 1999). The authors reported that, at two weeks, 15 out of 21 children in the peppermint group had a clinician‐judged improvement in pain compared to 9 out of 21 in the placebo group (OR 3.33, 95% CI 0.93 to 12.01). The authors reported that the 15‐item Gastrointestinal Symptom Rating Scale (Svedlund 1988), which measures frequency, duration, and impact on daily life, showed no difference between groups, but did not provide the data in the study. We contacted the trial authors but received no response (Martin 2016f [pers comm]). The authors reported that daily diaries completed by children showed significantly lower mean pain severity in the peppermint oil group. The authors provided neither data nor an explanation of how the analysis of the daily diaries was carried out, although the P value for this comparison was reported to be less than 0.03. No side effects were reported in either group. This study therefore provides insufficient evidence to support the use of peppermint oil in the treatment of RAP.

Narang 2015 compared drotaverine to placebo in an RCT with 132 children with RAP as defined by Apley (Apley 1958). They assessed children's pain severity and frequency, school attendance, and parental judgement of well‐being (Likert scale) after four weeks. The authors provided no results for pain severity, only pain frequency, reporting the mean number of episodes of pain in four weeks and number of pain‐free days in four weeks. They found a mean of 10.3 (SD = 14) episodes of pain in the 64 children receiving drotaverine and a mean of 21.6 episodes (SD = 32.4) in the 60 children receiving placebo. The MD between the groups was 11.3 fewer episodes of pain with drotaverine (95% CI 2.4 to 20.1). The mean number of pain‐free days was 17.4 (SD = 8.2) in the intervention group and 15.6 (SD = 8.7) in the placebo group (MD 1.8, 95% CI ‐1.2 to 4.8). The authors reported the mean number of school days missed as 0.24 (SD = 0.85) in the drotaverine group and 0.71 (SD = 1.59) in the placebo group. The MD between these two groups was 0.46 fewer days missed with drotaverine (95% CI 0.01 to 0.91).
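A mean difference and its confidence interval can be derived from the group means, standard deviations, and sample sizes reported for Narang 2015. The sketch below assumes a normal approximation with an unpooled standard error and z = 1.96. This is an illustrative assumption, not necessarily the method used in the review, and the function name is hypothetical.

```python
import math


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Difference in means (A minus B) with a normal-approximation CI.

    Returns (mean_difference, lower_limit, upper_limit).
    """
    md = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)  # unpooled standard error
    return md, md - z * se, md + z * se


# Pain-free days over four weeks, as reported in the text for Narang 2015:
# drotaverine 17.4 (SD 8.2), n = 64; placebo 15.6 (SD 8.7), n = 60.
md, lower, upper = mean_difference_ci(17.4, 8.2, 64, 15.6, 8.7, 60)
print(f"MD {md:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Because the interval for pain‐free days spans zero, this outcome on its own does not show a clear difference between drotaverine and placebo.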

Pourmoghaddas 2014 randomised 115 children to mebeverine versus placebo, and found no evidence that mebeverine was effective in treating functional abdominal pain in children. Self reported treatment response rates in the mebeverine and placebo groups were 40.6% and 30.3%, respectively, at four weeks' postintervention (P = 0.469; OR 1.6, 95% CI 0.7 to 3.4) and 54.2% and 41.0%, respectively, at 12 weeks' postintervention (P = 0.416; OR 1.7, 95% CI 0.8 to 3.6). This was an intention‐to‐treat analysis; the authors used last observation carried forward to substitute for missing data. The authors found no difference between the groups in physician‐rated change on the Clinical Global Improvement‐Severity scale (NIMH 1985) at four or 12 weeks' postintervention.

A meta‐analysis of studies evaluating antispasmodic drugs was not possible due to the heterogeneity of the interventions and variation in outcome measures. The overall GRADE quality rating for evidence evaluating this comparison was very low.

Comparison 6. Selective serotonin re‐uptake inhibitors (SSRIs) compared to placebo

Roohafza 2014 compared citalopram and placebo in 115 children with functional abdominal pain as defined by Rome III criteria (Rasquin 2006). The authors found no difference in treatment response rate between the groups at four weeks' postintervention (40.6% citalopram, 30.3% placebo; P = 0.169; OR 1.6, 95% CI 0.7 to 3.4) or at 12 weeks' postintervention (52.5% citalopram, 41.0% placebo; P = 0.148; OR 1.6, 95% CI 0.8 to 3.3). This was an intention‐to‐treat analysis. The authors used last observation carried forward to substitute for missing data. In addition, there were no differences between the two groups on scores for the secondary outcomes of self assessed change in severity of depression, as assessed using the Children's Depression Inventory (Kovacs 1985); anxiety, as assessed using the Revised Children's Manifest Anxiety Scale (Reynolds 1979); or somatization, as assessed using the Children's Somatization Inventory (Walker 2009), at four or 12 weeks' postintervention. The citalopram group experienced more drowsiness (37.2% citalopram, 16.2% placebo; P = 0.025) and dry mouth (44.1% citalopram, 23.2% placebo; P = 0.034) compared to the placebo group. The overall GRADE quality rating for evidence evaluating this comparison was very low.

Comparison 7. Antihistamines compared to placebo

Sadeghian 2008 randomised 29 children with functional abdominal pain as defined by Rome II criteria to cyproheptadine or placebo (Rasquin‐Weber 1999). They reassessed the outcome measures of pain intensity, pain frequency, and global self judgement of improvement in symptoms at two weeks. Pain intensity and pain frequency improved in 9 out of 15 children in the treatment group and 2 out of 14 children in the placebo group (OR 9.0, 95% CI 1.46 to 55.48). While this result suggests effectiveness, it should be interpreted with caution due to the very low GRADE quality rating of the evidence. The sample size was small (leading to imprecision of the treatment effect), the study used non‐validated measurement tools, and the findings have not been reproduced in other studies.

Comparison 8. H2 receptor antagonists compared to placebo

One study, See 2001, compared famotidine versus placebo in a randomised cross‐over trial of 25 children with RAP, as defined by Apley 1958. The authors provided results from the first period of the trial during which 12 children received famotidine and 13 children received placebo. They reported that 8 out of 12 children receiving famotidine in the first period improved compared to 2 out of 13 children receiving placebo in the first period (OR 11.0, 95% CI 1.6 to 75.5). They also reported scores on an "abdominal pain score" (APS), which included three components: pain frequency, pain severity, and a peptic index score, which evaluates a number of symptoms, including nausea, vomiting, and nocturnal waking. This appears not to be a validated tool used in other studies. The authors reported no statistically significant difference in scores between famotidine and placebo, considering all the data from both periods. They reported finding an improvement in APS on famotidine (mean 3.37 (SD ± 3.53)) and on placebo (mean 1.66 (SD ± 2.7)). The MD in improvement on APS was 1.71. We were not able to provide confidence intervals around this difference due to insufficient data from the study authors. This was a cross‐over trial, and we did not impute values for the correlation coefficient. The authors of the previous review were unsuccessful in contacting the trial authors (Huertas‐Ceballos 2008a). The GRADE quality rating was very low due to the lack of primary data to confirm findings, the absence of a washout period between cross‐over of interventions, and the use of an unvalidated tool.

Comparison 9. Serotonin antagonists compared to placebo

Symon 1995 reported the effects of pizotifen versus placebo in a restricted subgroup of 16 children with RAP. Although the children satisfied the standard definition of RAP (Apley 1958), they also had to report associated facial pallor, and either one first‐degree relative or two second‐degree relatives with a history of migraine or throbbing headache to be included in the study. This was a cross‐over study comparing the MD in the number of days on which children reported abdominal pain while taking pizotifen and placebo. The children reported a MD of 8.21 (95% CI 2.93 to 13.48) fewer days of pain while taking the intervention drug. The authors also reported the MD on the "Index of Severity", which was ‐16.21 (95% CI ‐26.51 to ‐5.90), and the MD on the "Index of Misery", which was ‐56.07 (95% CI ‐94.07 to ‐18.07). These appear not to be validated tools, but were judged by the trial authors to measure improvement. The study authors reported P values of 0.005 and 0.007 for these two comparisons, respectively. The authors reported that the trial was stopped early as a result of an interim analysis conducted when their initial supplies of drug preparations reached their expiry dates. They did not provide details of the size of the sample they had originally planned to include in the study. These findings have not been replicated in other studies. The GRADE rating was very low for this comparison.

Comparison 10. Dopamine receptor antagonist compared to placebo

Karunanayake 2015 compared domperidone to placebo in 89 children aged five to 12 years with abdominal pain‐predominant functional gastrointestinal disorders as defined by Rome III criteria (Rasquin 2006). Two primary outcomes were specified: cure and improvement. Cure was abdominal pain less than 25 mm on the VAS and no impact on daily activity. Improvement was pain relief and sense of improvement recorded on the Global Assessment Scale. No further information on how these primary outcomes were defined, when they were measured, or who assessed them was provided. The study is available as a conference abstract, and despite having written to the authors twice (Martin 2016a [pers comm]; Martin 2016d [pers comm]), we have no further published or unpublished information. The authors state that there was no difference in "cure" between the two groups, but provided no data to support this. They state that 37 out of 47 (78.7%) of the children treated with domperidone had "significant improvement" and 25 out of 42 (59.5%) in the placebo group had "significant improvement". This gives an OR for improvement of 2.52 (95% CI 0.99 to 6.39). Due to the lack of information about this outcome measurement and the limited possible 'Risk of bias' assessment, this result should be interpreted with caution. Regarding secondary outcomes, the domperidone group reported a significant reduction in abdominal pain severity (70.84% versus 48.18%) and improvement on the motility index (29.3% versus 8.6%) after intervention. No such difference was seen in improvement of quality of life and family impact. It was also unclear how these secondary outcomes were defined, measured, or who assessed them. The GRADE rating was very low for this comparison.

Comparison 11. Hormone treatment compared to placebo

Zybach 2016 reported the therapeutic effect of melatonin compared to placebo in 12 children aged 11 to 16 years with functional abdominal pain as defined by Rome III criteria (Rasquin 2006). This was a cross‐over study with no washout period. No power calculation was performed and, given the low number of children, the study may have been underpowered. The authors found no difference in pain response reported by those treated with melatonin compared to placebo: OR 0.71 (95% CI 0.14 to 3.58). The authors also reported no change in mean sleep duration: melatonin group 9.9 (SD ± 3.53) hours; placebo group 9.41 (SD ± 2.7) hours. The GRADE rating for this comparison was very low.

Discussion

Summary of main results

Recurrent abdominal pain is an extremely common condition in childhood, and survey data in the USA suggest that many paediatricians use drug treatment for RAP (Edwards 1994). The most striking result of this review is the paucity of good‐quality, placebo‐controlled trials for all of the drugs that have been recommended for use in children with RAP.

In 2006, the paediatric Rome III criteria were devised to classify paediatric functional gastrointestinal disorders (Rasquin 2006). Diagnoses included within this classification comprised five categories defined on the basis of symptom profiles: IBS, functional dyspepsia, functional abdominal pain, abdominal migraine, and functional abdominal pain syndrome. The consistent symptom in each of these profiles is unexplained abdominal pain, which, prior to the development of this classification, was the complaint used to identify patients. The extent to which separating children into these subgroups identifies patients who differ in the psychological or pathophysiological mechanisms underlying their symptoms, or in their response to treatment, remains unclear.

The studies with positive results are either small, single studies that have not been replicated or are larger studies with methodological flaws. Therefore, there is insufficient evidence to recommend any drug treatment.

There was no evidence from Bahar 2008 or Saps 2009 to suggest that amitriptyline is effective in treating RAP. Similarly, there was no evidence that antibiotics have a role in the treatment of RAP (Collins 2011; Heyland 2012). A single study reported that trimebutine (an antimuscarinic drug) was extremely effective in treating RAP (Karabulut 2013), but this was based on reports of symptoms by parents who were aware of whether their children were receiving active treatment. A single, small study with a high risk of bias evaluated tegaserod (a 5‐HT4 agonist) (Khoshoo 2006). The findings of this study suggest that tegaserod may be effective, but further evidence is required before it can be recommended. We found two trials of peppermint oil: one trial showed no clear efficacy, but the small numbers may mean it was underpowered (Kline 2001), and the second trial had key methodological flaws requiring that it be interpreted with caution (Asgarshirazi 2015). There is no current evidence to recommend peppermint oil. Narang 2015 suggested benefit from drotaverine in some of the outcomes measured; for the others, results were either unreported or showed no benefit from the intervention. Pourmoghaddas 2014 found no evidence that mebeverine was effective in treating functional abdominal pain in children. Similarly, Roohafza 2014 compared citalopram and placebo and found no difference in treatment response rate between the groups. Sadeghian 2008 and See 2001 suggested that antihistamines and H2 receptor antagonists, respectively, may be effective in treating RAP. However, these results should be interpreted with caution due to risk of bias in the studies, small sample numbers, and therefore imprecision of estimates. In addition, the results of these single studies have not been reproduced. Symon 1995 reported the effects of pizotifen versus placebo in a subgroup of children with RAP fulfilling the definition of abdominal migraine. In the 14 children studied, the mean number of days of pain was lower while taking pizotifen. The results of this small study, which was stopped early following an interim analysis conducted when the drug supply reached its expiry date, have not been replicated in the last 20 years. Karunanayake 2015 published in abstract form a study of domperidone versus placebo; there was insufficient information on the outcomes measured and the quality of the study to conclude whether domperidone is effective in treating RAP. Zybach 2016 investigated melatonin compared to placebo in a small number of children in a cross‐over trial and found no evidence of efficacy. For a summary of these results, please see summary of findings Table for the main comparison.

Overall completeness and applicability of evidence

This review highlights some issues concerning the overall completeness and applicability of the evidence of benefits and harms of pharmacological interventions for children and adolescents with RAP: the lack of trials conducted in specific subgroups of RAP as defined by the Rome III criteria (Rasquin 2006); the lack of trials assessing the same class of drug; and the lack of sustained intervention and follow‐up beyond the period of intervention.

The majority of studies included children within the broad diagnosis of RAP, which meant that children could be presenting with a variety of RAP classifications such as IBS, functional abdominal pain, or functional dyspepsia. This meant that it was not possible to investigate whether particular classes of drugs benefited particular subgroups of RAP more than other subgroups.

Lastly, most of the interventions were relatively short in duration (two to six weeks), and very few had medium‐ or long‐term follow‐up, which limits the ability to assess whether any benefits are sustained in the long term.

Quality of the evidence

The overall quality of this evidence was low.

Potential biases in the review process

The present systematic review has many strengths. We developed a protocol for this review according to guidance provided in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011b). We published our protocol before we embarked on the review itself (Martin 2014a). We conducted extensive searches of relevant databases and checked forward and backward citations of all included studies. We also contacted authors of included studies for additional data when the presented data were insufficient or data were missing, to maximise our ability to pool data on comparable outcomes within comparable intervention types. Two review authors, working independently, selected trials for inclusion and extracted data. Disagreements were resolved by discussion between team members. We assessed the risk of bias in all trials according to the recommendations provided in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a).

We did not include studies that had a mix of children, adolescents, and young adults when it was not possible to separate the data for children younger than 18 years of age. Likewise, we did not include studies that did not specify recruiting children or adolescents, and which presented mean ages of the population sample exceeding 20 years of age. Both these issues raise the possibility of bias in our review process, as we did not write to these authors asking whether or not they collected data for children younger than 18 years of age. However, we believe this potential bias is not likely to have changed our conclusions.

Agreements and disagreements with other studies or reviews

The previous version of this review, Huertas‐Ceballos 2008a, included only three studies (Kline 2001; See 2001; Symon 1995). This update includes 16 studies and reached a similar conclusion: there is no evidence to support the use of drugs to treat RAP or functional gastrointestinal disorders in children. Another Cochrane review evaluating the effectiveness of antidepressants in pain‐related functional abdominal disorders in children also agrees with this conclusion (Kaminski 2009).

Figure 1

Study flow diagram.

Navigate to figure in ReviewOpen in new tab

Figure 2

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Navigate to figure in ReviewOpen in new tab

Figure 3

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Summary of findings for the main comparison. Antispasmodics compared to placebo for recurrent abdominal pain

Antispasmodics compared to placebo for recurrent abdominal pain
Patient or population: school‐aged children (5 to 18 years of age) with recurrent abdominal pain
Settings: hospital paediatric outpatient clinics
Intervention: antispasmodic drugs
Comparison: placebo
Outcomes	Illustrative comparative risks* (SD)		Relative effect (95% CI)	Number of participants (studies)	Quality of the evidence (GRADE)	Comments
	Assumed risk	Corresponding risk
	Placebo	Intervention
Pain duration (mean pain duration, assessed at 4 weeks)	The mean duration of pain in the control group was 6.17 (± 11.61).	The mean duration of pain in the intervention group was 51.6 (± 23.74).	MD ‐25.4 (‐35.5 to ‐15.3)	120 (1)	⊕⊝⊝⊝ Very low¹	Asgarshirazi 2015 No evidence of efficacy
Pain improvement (clinician judged, assessed at 2 weeks)	9 of 21 children in the control group had an improvement in pain.	15 of 21 children in the intervention group had an improvement in pain.	OR 3.33 (0.93 to 12.01)	42 (1)	⊕⊝⊝⊝ Very low²	Kline 2001 No evidence of efficacy
Pain frequency (episodes of pain in 4 weeks, assessed after 4 weeks)	The mean number of episodes of pain in the control group was 21.6 (32.4).	The mean number of episodes of pain in the intervention group was 10.3 (14).	MD 11.3 (2.4 to 20.1)	132 (1)	⊕⊝⊝⊝ Very low³	Narang 2015 No evidence of efficacy
Pain improvement (self reported response to treatment, assessed at 4 weeks)	The response to treatment in the control group was 30.3%.	The response to treatment in the intervention group was 40.6%.	OR 1.6 (0.7 to 3.4)	115 (1)	⊕⊕⊝⊝ Low⁴	Pourmoghaddas 2014 No evidence of efficacy
*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; MD: mean difference; OR: odds ratio; SD: standard deviation
GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.
¹Downgraded for low methodological quality due to single, small study; risk of bias from incomplete outcome data and differential loss of participants between groups. The placebo differed in preparation and dose timing compared to the intervention drug.
²Downgraded for low methodological quality due to single, small study; risk of bias from selective outcome reporting and short follow‐up.
³Downgraded for low methodological quality due to single, small study; risk of bias from selective outcome reporting and the method of altering the drug doses.
⁴Downgraded for single, small study. Not duplicated.
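The odds ratio reported for Kline 2001 can be recomputed from the counts given in the table (15 of 21 improved with the intervention, 9 of 21 with placebo). The sketch below is illustrative only: it assumes a standard Wald interval on the log odds ratio scale with z = 1.96, which the review does not state as its method, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    Assumption: the interval is built on the log odds ratio scale
    using the usual standard error sqrt(1/a + 1/b + 1/c + 1/d).
    """
    a, b = events_trt, n_trt - events_trt  # intervention: improved, not improved
    c, d = events_ctl, n_ctl - events_ctl  # control: improved, not improved
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the Kline 2001 row of the table above.
or_, lo, hi = odds_ratio_ci(15, 21, 9, 21)
print(f"OR {or_:.2f} ({lo:.2f} to {hi:.2f})")
```

Under these assumptions the result agrees, to two decimal places, with the table entry of OR 3.33 (0.93 to 12.01). The interval includes 1, so the data are compatible with no difference between the groups.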

Table 1. Assessment of risk of bias in included studies

Domain	'Risk of bias' judgement
Domain	Low	High	Unclear
Selection bias
Random sequence generation	If the study details any of the following methods: (1) simple randomisation (such as coin‐tossing, throwing dice, dealing previously shuffled cards, a list of random numbers, or computer‐generated random numbers); or (2) restricted randomisation, such as blocked randomisation (ideally with varying block sizes) or stratified groups, provided that within‐group randomisation is not affected	If the study details no randomisation or an inadequate method such as alternation or assignment based on date of birth, case record number, or date of presentation. These latter methods may be referred to as ‘quasi‐random’.	If there is insufficient detail to judge the risk of bias
Allocation concealment	If the study details concealed allocation sequence in sufficient detail to determine that allocations could not have been foreseen in advance of, or during, enrolment	If the study details a method where the allocation is known prior to assignment	If there is insufficient detail to judge the risk of bias
Performance bias
Blinding of participants and personnel	If the study details a method of blinding participants and personnel. Detail would need to be sufficient to show that participants and personnel were unable to identify the therapeutic intervention from the control intervention.	If the methods detail that the participants or study personnel were not blinded to the study medication or placebo	If there is insufficient detail to judge the risk of bias
Detection bias
Blinding of outcome assessment	If the study details a blinded outcome assessment. This may only be possible for outcomes that are externally assessed.	If the outcome assessment is not blinded. We expect this may be unavoidable for self rated outcomes of unblinded interventions.	If there is insufficient detail to judge the risk of bias
Attrition bias
Incomplete outcome data	If the study reports attrition and exclusions, including the numbers in each intervention group (compared with total randomised participants), reasons for attrition or exclusions, and any re‐inclusions; the impact of missing data is not believed to have altered the conclusions; and reasons for the missing data are acceptable	We may judge the risk of attrition bias to be high due to the amount, nature, or handling (such as per‐protocol analysis) of incomplete outcome data.	If there is insufficient detail to judge the risk of bias, e.g. if the number of children randomised to each treatment is not reported
Reporting bias
Selective reporting	If there is complete reporting of all outcome data. This will be determined based on comparison of the protocol and published study, if available.	If the reporting is selective so that some outcome data are not reported	If there is insufficient detail to judge the risk of bias, e.g. protocols are unavailable
Other sources of bias
Other bias	If the study is judged to be at low risk of other potential sources of bias, such as no differential loss to follow‐up or an adequate washout period in cross‐over trials	If there are other sources of bias, such as differential loss to follow‐up or an inadequate washout period in cross‐over trials	If there is insufficient detail to judge the risk of bias
