Pulmonary rehabilitation for chronic obstructive pulmonary disease

Summary of findings for the main comparison. Rehabilitation versus usual care for chronic obstructive pulmonary disease

Rehabilitation versus usual care for chronic obstructive pulmonary disease
Patient or population: patients with chronic obstructive pulmonary disease
Settings: hospital and community
Intervention: rehabilitation versus usual care
Outcomes	Illustrative comparative effects* (95% CI)		Number of participants (studies)	Quality of the evidence (GRADE)	Comments
	Response on control	Treatment effect
	Usual care	Rehabilitation versus usual care
QoL ‐ Change in CRQ (dyspnoea) CRQ Questionnaire. Scale from 1 to 7 (Higher is better and 0.5 unit is an important difference) Follow‐up: median 12 weeks	Median change = 0 units	Mean QoL ‐ change in CRQ (Dyspnoea) in the intervention groups was 0.79 units higher (0.56 to 1.03 higher)	1283 (19 studies)	⊕⊕⊕⊝ Moderate^1,2,3	Sensitivity analysis from studies at lower risk of bias was similar (MD 0.99, 95% CI 0.64 to 1.34; participants = 384; studies = 5; I² = 34%)
QoL ‐ Change in SGRQ (total) Scale from 0 to 100 (Lower is better and 4 units is an important difference) Follow‐up: median 12 weeks	Median change = 0.42 units	Mean QoL ‐ change in SGRQ (total) in the intervention groups was 6.89 units lower (9.26 to 4.52 lower)	1146 (19 studies)	⊕⊕⊕⊝ Moderate^2,3,4	Sensitivity analysis from studies at lower risk of bias was similar (MD ‐5.15, 95% CI ‐7.95 to ‐2.36; participants = 572; studies = 7; I² = 51%)
Change in maximal exercise (Incremental Shuttle walk test (ISWT)) Distance metres Follow‐up: median 12 weeks	Median change = 1 metre	Mean maximal exercise (incremental shuttle walk test) in the intervention groups was 39.77 metres higher (22.38 to 57.15 higher)	694 (8 studies)	⊕⊕⊕⊝ Moderate^2,3,5
Change in functional exercise capacity (6MWT) Distance metres Follow‐up: median 12 weeks	Median change = 3.4 metres	Mean functional exercise capacity (6MWT) in the intervention groups was 43.93 metres higher (32.64 to 55.21 higher)	1879 (38 studies)	⊕⊝⊝⊝ Very low^2,3,6,7
Change in maximal exercise capacity (cycle ergometer) Workmax (watt) Follow‐up: median 12 weeks	Median change = ‐0.05 watts	Mean maximal exercise capacity (cycle ergometer) in the intervention groups was 6.77 watts higher (1.89 to 11.65 higher)	779 (16 studies)	⊕⊕⊝⊝ Low^2,3,8,9
*The basis for the response on control is the median control group response across studies. CI: confidence interval; MD: mean difference.
GRADE Working Group grades of evidence. High quality: Further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: We are very uncertain about the estimate.
¹ 17 studies reported random sequence generation (1 unclear), 12 reported allocation concealment, 2 did not have allocation concealment and it is unclear in 5 studies. 4 studies did not blind assessors, 11 blinded assessors and 4 were unclear as to assessor blinding. 6 studies had attrition bias greater than 20%.
² Downgraded as there is a high level of heterogeneity within the results. Several factors may impact heterogeneity, including content of the intervention programme, setting of the programme and severity of COPD.
³ Greater than optimal information size (OIS). 95% confidence interval does not include "no effect," nor does the confidence limit cross the MID, so no need to downgrade.
⁴ 18 studies reported random sequence generation (2 unclear), 10 reported allocation concealment, 2 did not have allocation concealment and it is unclear in 7 studies. 3 studies did not blind assessors, 9 blinded assessors and 7 were unclear as to assessor blinding. 7 studies had attrition bias greater than 20%.
⁵ All 8 studies reported random sequence generation, 5 reported allocation concealment and it is unclear in 3 studies. 5 studies had blind assessors with 1 not blinded, and 2 were unclear as to assessor blinding. 4 studies had attrition bias greater than 20%.
⁶ 34 studies reported random sequence generation, 4 were unclear, 20 reported allocation concealment, 3 did not have allocation concealment and it is unclear in 15 studies. 5 studies did not blind assessors, 19 blinded assessors and 13 were unclear as to assessor blinding. 13 studies had attrition bias greater than 20% and 2 were unclear.
⁷ Downgraded as bias indicated for 6‐minute walk test: Egger: bias = 1.24304 (95% CI = 0.183967 to 2.302131; P value 0.0227). Begg‐Mazumdar: Kendall's tau = 0.16074 (P value 0.1601).
⁸ All 16 studies reported random sequence generation, 6 reported allocation concealment, 3 did not have allocation concealment and it is unclear in 7 studies. 2 studies did not blind assessors, 10 blinded assessors and 4 were unclear as to assessor blinding. 4 studies had attrition bias greater than 20%.
⁹ Downgraded as bias indicated for cycle ergometer test: Egger: bias = 1.57164 (95% CI = 0.6053 to 2.337984; P value 0.0036). Begg‐Mazumdar: Kendall's tau = ‐0.2666667 (P value 0.139).

Background

Description of the condition

Chronic obstructive pulmonary disease (COPD) is a multi‐factorial progressive chronic lung disease that causes obstruction in airflow. This obstruction results in persistent and progressive breathlessness, productive coughing, fatigue and recurrent chest infection (GOLD 2014). COPD is also associated with extrapulmonary effects such as muscle wasting, osteopaenia (reduction in protein and mineral content of bone tissue), cardiovascular disease and depression and therefore is now best understood as a systemic disease (Agusti 2003; Agusti 2005). Worldwide, COPD is a major cause of morbidity. It is estimated that 210 million people are living with COPD (Franchi 2009), and it is projected that by the year 2030, COPD will be the third most frequent cause of death globally (WHO 2008). At this time, COPD is an incurable condition that is associated with significant economic costs due to progressive disease severity and frequent hospital admissions and readmissions (GOLD 2014; Guarascio 2013).

Risk factors for COPD are numerous and include genetics, recurrent respiratory infection, low socioeconomic status, exposure to air pollutants, poor nutrition and asthma (Eisner 2010; GOLD 2014). However, smoking is recognised as a major cause of COPD, and the more a person smokes, the more likely he or she is to develop this condition (Forey 2011).

COPD is a heterogeneous condition with marked variation in progression between individuals (Casanova 2011; Nishimura 2013). The initial underlying pathology of COPD is confined to the lungs, and a clinical diagnosis is based on presenting symptoms and confirmation of airflow obstruction with a postbronchodilator spirometry forced expiratory volume in one second/forced vital capacity ratio (FEV₁/FVC) < 0.70 (GOLD 2014). The Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines are usually used to grade the severity of airflow limitation as mild (FEV₁ ≥ 80% predicted: GOLD 1), moderate (50% ≤ FEV₁ < 80% predicted: GOLD 2), severe (30% ≤ FEV₁ < 50% predicted: GOLD 3) or very severe (FEV₁ < 30% predicted: GOLD 4) (GOLD 2014).
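The spirometric thresholds described above can be written as a small lookup. This is only an illustrative sketch: the function name `gold_grade` and its argument names are hypothetical, and the code covers the airflow criteria alone, not the symptom-based part of a clinical diagnosis.

```python
from typing import Optional


def gold_grade(fev1_fvc_ratio: float, fev1_percent_predicted: float) -> Optional[int]:
    """Return the GOLD airflow-limitation grade (1 to 4).

    Returns None when the postbronchodilator FEV1/FVC ratio is not
    below 0.70, i.e. airflow obstruction is not confirmed.
    """
    if fev1_fvc_ratio >= 0.70:
        return None
    if fev1_percent_predicted >= 80:
        return 1  # mild
    if fev1_percent_predicted >= 50:
        return 2  # moderate
    if fev1_percent_predicted >= 30:
        return 3  # severe
    return 4      # very severe
```

For example, a ratio of 0.65 with an FEV₁ of 45% predicted falls in GOLD 3.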

The symptoms of COPD make engagement in physical activity unpleasant as the result of air trapping and increased hyperinflation in the lungs, which result in increased breathlessness due to subsequent inefficient breathing (O'Donnell 2007). Increased breathlessness provokes anxiety, which inevitably leads to further breathlessness, exacerbation of COPD symptoms and panic. This causes a vicious circle whereby any activities that involve physical exertion are avoided, causing muscle de‐conditioning, which further reduces capacity to engage in physical activity (Bourbeau 2007). Physical inactivity is therefore a key predictor of mortality in people with COPD (Garcia‐Aymerich 2006; Spruit 2013; Waschki 2011). Consequently, the joint American Thoracic Society and European Respiratory Society (ATS/ERS) (Spruit 2013) guidelines highlight the importance of exercise in the treatment and management of COPD.

Description of the intervention

Treatment interventions for COPD include smoking cessation, pharmacological and non‐pharmacological therapies and, in specific circumstances, supplemental oxygen, ventilatory support, surgical treatment and palliative care (GOLD 2014). However, best evidence and all current international guidelines ratify the central role of pulmonary rehabilitation in the treatment of people with COPD (GOLD 2014; NICE 2010; Nici 2006; Ries 2007; Spruit 2013).

Pulmonary rehabilitation (PR), which was first defined by the American College of Chest Physicians Committee in 1974, is a proactive approach to minimising COPD symptoms, improving health‐related quality of life (HRQoL) and increasing physical and emotional involvement in everyday life (GOLD 2014; Nici 2006; Ries 2007). The ATS in conjunction with the ERS has published numerous comprehensive statements on PR, with the most recent update in 2013. In the latest update, pulmonary rehabilitation was redefined as a "…comprehensive intervention based on a thorough patient assessment followed by patient tailored therapies that include, but are not limited to, exercise training, education, and behaviour change, designed to improve the physical and psychological condition of people with chronic respiratory disease and to promote the long‐term adherence to health‐enhancing behaviours" (Spruit 2013). This new definition differs from the previous one (2006) in that it focuses on the interdisciplinary and therefore more holistic approach to PR rather than on the previous multi‐disciplinary approach; highlights the importance of behaviour change; and places PR firmly within the concept of integrated care (Spruit 2013).

Depending on culture, healthcare systems and resources, the structure, personnel, content and settings of PR programmes may vary (Nici 2006; Spruit 2013). However, individually tailored exercise training is considered the cornerstone of PR (Nici 2006; Ries 2007; Spruit 2013). In particular, strength, low‐ and high‐intensity training, exercise endurance and upper and lower extremity training are recommended (Nici 2006; Ries 2007; Spruit 2013). In addition to exercise, the typical comprehensive PR programme includes patient assessment, education, psychosocial support and nutritional counselling (ATS 1999; GOLD 2014; Spruit 2013). Pulmonary rehabilitation is typically delivered to groups of patients (rather than to individuals), but no evidence suggests the optimal size of the exercise group. However, the American Association of Cardiovascular and Pulmonary Rehabilitation (AACVPR 2011) recommends a staff‐to‐participant ratio of 1:4, and the British Thoracic Society (British Thoracic Society 2001) a ratio of 1:8. The setting for PR programmes varies; both community‐based (Cambach 1997; Casey 2013; Wijkstra 1994a) and home‐based programmes (Maltais 2008; Viera 2010) are available. However, traditionally, most PR programmes have been hospital based (Bourbeau 2010), with participants attending as in‐patients or on an out‐patient basis.

The optimal duration of programmes, number of sessions offered per week and type of staff required to deliver PR programmes are unclear. Beauchamp 2011 concludes, following a systematic review, that available evidence is insufficient to show the optimal duration of PR programmes for people with COPD. However, a programme duration of at least eight weeks is recommended to attain a substantial effect (Beauchamp 2011). Likewise, the number of times per week that programmes are offered differs; typically hospital‐based out‐patient programmes are offered two or three days per week, and in‐patient programmes are offered over five days (Spruit 2013). The optimal number of sessions required remains unclear. However, the 2006 ATS/ERS guidelines specify three sessions per week or a twice‐weekly supervised and one unsupervised home session (Nici 2006). Finally, key requirements for staff delivering the programme are that they are clinically competent, have the required skills and knowledge, and maintain patient safety (Spruit 2013).

How the intervention might work

Pulmonary rehabilitation seeks to reduce COPD symptoms, reestablish and improve functional ability, enhance participation in everyday life, promote autonomy and improve HRQoL (Spruit 2013). It does this by focusing on the systemic aspects of the disease that are common among patients with COPD (AACVPR 2011). The exercise component of PR increases inspiratory volume and reduces dynamic hyperinflation, both of which reduce dyspnoea when the person is performing tasks (Casaburi 2009). Exercise also increases muscle function, delaying fatigue and resulting in increased exercise tolerance. Meanwhile, the educational component of PR focuses on collaborative self‐management and behaviour change (Spruit 2013). It encompasses providing information and knowledge regarding COPD; building skills such as goal setting, problem solving and decision making; and developing action plans that allow individuals to better recognise and manage the disease (Spruit 2013). The behaviour change element focuses on modifying nutritional intake and smoking patterns; adhering to medication and regular exercise; and utilising effective breathing techniques and energy‐saving strategies (Spruit 2013).

Why it is important to do this review

Review authors undertook the original version of this Cochrane review in 2001 in response to worldwide endorsement of PR as integral to the management of COPD and lack of clear evidence as to the impact of these programmes on HRQoL and exercise tolerance (Lacasse 2001). The review included 23 randomised controlled trials (RCTs), and review authors concluded that PR (exercise training for a minimum of four weeks with or without education and/or psychological support) resulted in statistically significant improvement in HRQoL and modest improvement in exercise capacity (Lacasse 2001). This review was updated in 2006, included 31 RCTs and again reported statistically significant improvement in HRQoL. However, results for both functional and maximal exercise capacity were below the threshold of clinical significance. Lacasse 2006 concluded that further RCTs comparing PR versus usual care for patients with COPD were not needed. Despite this, a large number of RCTs published since 2006 have endorsed the need for this current update. Furthermore, recent RCTs tend to use disease‐specific quality of life indices as primary outcome measures, combined with more refined maximal and functional exercise capacity measurement tools (Curtis 2003; de Torres 2002; Gross 2004; Jones 2003). Consequently, in the current review, we will take a more focused approach to assessment of primary and secondary outcomes. In recent years, wide variation has been noted in the follow‐up assessment times utilised within studies, and this may have an impact on study outcomes. Therefore, in the current review, we will include only assessments completed up to and within three months of completion of the intervention. Also, risk of bias requirements for Cochrane reviews have been altered since the last update; review authors of this current update will ensure that these new requirements are met.
Finally, as a separate systematic review examining the effects of PR following exacerbations of COPD has been undertaken (Puhan 2011(a)), we will exclude from this review studies that commenced within four weeks of an acute exacerbation of COPD.

Objectives

To compare the effects of pulmonary rehabilitation versus usual care on health‐related quality of life and functional and maximal exercise capacity in persons with COPD.

Methods

Criteria for considering studies for this review

Types of studies

All RCTs in which participants are randomly assigned at the individual or cluster level and in which researchers compare the effects of PR versus those of usual care.

Types of participants

We included RCTs in which more than 90% of participants had COPD defined as:

a clinical diagnosis of COPD; and
best recorded forced expiratory volume in one second/forced vital capacity (FEV₁/FVC) ratio of individual participants < 0.7.

We included RCTs in which:

any or all participants were on continuous oxygen.

We excluded RCTs that focused on participants:

who were mechanically ventilated; or
who had an acute exacerbation within four weeks before commencement of the intervention.

Types of interventions

Pulmonary rehabilitation

Any in‐patient, out‐patient, community‐based or home‐based rehabilitation programme of at least four weeks' duration that included exercise therapy with or without any form of education and/or psychological support delivered to patients with exercise limitation attributable to COPD.

We included any exercise therapy that included physical activity considered to be aerobically demanding.

We excluded:

interventions in which the physical activity component was considered to be not aerobically demanding (e.g. respiratory muscle training, breathing exercises, Tai Chi, yoga) (the degree of aerobic demand was assessed for each individual intervention by examining the detailed description of the intervention in identified studies); and
programmes of less than 4 weeks' duration.

Usual care

For the purpose of this review, usual care was defined as conventional care. We excluded trials in which the control group was given education or any form of additional intervention. Participants in the following situations were considered to be in receipt of usual care.

Only verbal advice was given. If the advice was accompanied by additional education provided in any way, for example, by video or by diary, then the study was excluded.
Medication was altered or optimised to what was considered best practice at the start of the trial for all participants.

Types of outcome measures

We considered disease‐specific HRQoL and/or maximal or functional exercise capacity (up to and including three months after the end of the intervention). We defined 'maximal exercise capacity' as the peak capacity measured by an incremental cycle ergometry test. 'Functional exercise capacity' was defined according to the results of timed walk tests (Holland 2014).

Primary outcomes

Disease‐specific health‐related quality of life (HRQoL)

Chronic Respiratory Disease Questionnaire (CRQ).
St. George's Respiratory Questionnaire (SGRQ).

Secondary outcomes

Exercise testing

The classification of exercise testing is divided into functional and maximal exercise groups, which include the following (Holland 2014).

Functional exercise capacity assessments.

- Six‐minute walk test/distance (6MWT/6MWD).
- Incremental shuttle walk test (ISWT).
- Endurance shuttle walk test (ESWT).

Maximal exercise tests.

- Incremental cycle ergometry.

Search methods for identification of studies

Electronic searches

We have detailed in Appendix 1 the search methods used in the previous version of this review. The previously published version included searches up to July 2004. The search period for this update is July 2004 to March 2014.

For the current update, we identified trials from the Cochrane Airways Group Specialised Register (CAGR), which is maintained by the Trials Search Co‐ordinator for the Group. The Register contains trial reports identified through systematic searches of bibliographic databases including the Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE, EMBASE, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Allied and Complementary Medicine Database (AMED) and PsycINFO, and by handsearching of respiratory journals and meeting abstracts (please see Appendix 2 for further details). We searched all records in the CAGR using the search strategy described in Appendix 3.

We also conducted a search of ClinicalTrials.gov (www.ClinicalTrials.gov) and the World Health Organization (WHO) trials portal (www.who.int/ictrp/en/). We searched all databases from their inception to the present, with no restriction on the language of publication. We completed the latest searches in March 2014.

Searching other resources

We reviewed the reference lists of relevant articles and retrieved any potential additional citations. We contacted the authors of studies included in the meta‐analysis and experts in the field of pulmonary rehabilitation to uncover unpublished material. We also included the papers suggested by the study authors contacted.

Data collection and analysis

The methods used in this review were designed in accordance with recommendations provided in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011).

Selection of studies

Two review authors (BMC, DC) independently tested the inclusion criteria and sought clarification on all areas of concern with the wider review team, which included the original author of the review (YL). When the review authors were confident of the clarity of the criteria and their skills, they assessed studies with respect to the identified criteria. The two review authors then independently assessed all citation titles and abstracts. Review authors electronically collated initial decisions with the use of Distiller SR and later with Early Reviewing Software (EROS); they coded each citation as:

included to proceed;
more information needed before inclusion decision;
important article but not to be included in the review; or
excluded (Appendix 4; Appendix 5).

Review authors held a meeting after every 100 reviewed citations during which they resolved disagreements by consensus. They used quadratic weighted Kappa statistics to measure agreement between coders (Kramer 1981). When consensus could not be reached, a third review author (DD) adjudicated. Review authors then retrieved full‐text papers of all potentially eligible studies. Review authors maintained records on all studies that did not meet the inclusion criteria and provided the rationale for their exclusion.
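As an illustration of the agreement statistic mentioned above, the following sketch computes a quadratic weighted kappa for two coders. It assumes the screening codes are mapped to ordered integers 0 to k-1 (for example, the four codes listed above mapped to 0 to 3); that mapping, and the function name, are assumptions for illustration and are not taken from the review.

```python
from typing import Sequence


def quadratic_weighted_kappa(rater_a: Sequence[int],
                             rater_b: Sequence[int],
                             n_categories: int) -> float:
    """Quadratic weighted kappa for two raters using codes 0..n_categories-1."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of citations")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    k = n_categories
    n = len(rater_a)

    # Cross-tabulate the two raters' codes.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the squared distance between codes.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        return 1.0  # both raters used a single identical category throughout
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates complete agreement, and a value near 0 indicates agreement no better than chance.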

Data extraction and management

The lead review author (BMC) extracted data from all original papers identified for inclusion in the meta‐analysis using a developed data extraction form. The other members of the review group (DC, KM, DD, EM) independently extracted data from an equal share of the same studies. Extracted information included the following.

Background characteristics of the research reports.
Characteristics of participants in the study.
The number and distribution of participants who dropped out or withdrew from the study.
A full description of the pulmonary rehabilitation programmes (setting, components and duration).
Health‐related quality of life measurement instruments and associated results.
Exercise capacity measure outcomes and corresponding results.

The lead review author and co‐review authors resolved discrepancies during the data extraction process through discussion; they consulted a third review author when unresolved issues remained. Review authors requested missing data from the authors of the primary studies. They asked these authors to provide additional information by filling in tables similar to the ones used by the review authors during the data extraction process. Two review authors (BMC, EM) entered all data into the Review Manager software (RevMan 2011) and checked them for accuracy.

If a study reported multiple group comparisons (e.g. exercise therapy with inspiratory muscle training compared with exercise therapy alone or with conventional community care), treatment groups considered relevant to PR were combined as if one intervention group, and this group was compared with the group receiving conventional community care. Studies in which multiple group comparisons included interventions that were not considered relevant to PR such as acupuncture were not combined.

Assessment of risk of bias in included studies

The lead review author (BMC) assessed the risk of bias for all included studies. A second review author (DC, EM or KM) independently assessed the risk of bias for each study. The review authors followed the criteria for assessing risk of bias provided by The Cochrane Collaboration in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) and contained in RevMan (RevMan 2011). We assessed risk of bias according to the following domains (Appendix 6).

Random sequence generation.
Allocation concealment.
Blinding of participants and personnel.
Blinding of outcome assessment.
Incomplete outcome data.
Selective outcome reporting.
Other bias.

We considered several important potential sources of bias that have proved to be major determinants of the magnitude of the effect size in clinical trials: unconcealed randomisation, unblinded study personnel, incomplete outcome data and attrition of more than 20% of those randomly assigned. The first of these has been associated with an overestimation of treatment effect by up to 40% (Schulz 1995), and the second may result in differential encouragement during performance testing, with the potential for distortion of the results (up to 30.5 metres in a six‐minute walk test) (Guyatt 1984). Schulz 1995 argued that loss to follow‐up of 20% or greater should be a matter of concern as it relates to the possibility of bias.

Review authors resolved disagreements by consensus. If details pertaining to randomisation, masking, drop‐out and withdrawal were not specified or were unclear in the original trial publication, we contacted the study authors to clarify the issue.

Measures of treatment effect

Continuous data

Different measures of HRQoL and exercise capacity have been reported in the primary studies. Both primary outcomes (HRQoL) and secondary outcomes (exercise capacity) are continuous outcomes. For these continuous variables, we recorded mean change from baseline or mean postintervention values and standard deviation (SD) for each group for outcomes measured using the same metrics. When 95% confidence intervals (CIs) and standard errors (SEs) were reported, we calculated SDs as guided by the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). When SDs were missing from studies and it was not possible to obtain the results from study authors, we used a mean value for the SD of the other studies that reported that outcome. All outcomes were reported independently, so standardised mean differences (SMDs) for outcomes were not required. Mean differences (MDs) with 95% CIs were calculated for each study by using a random‐effects model.
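The SD conversions and the mean imputation described above can be sketched as follows. The formulas are the standard large-sample ones for a single group mean (SD = SE × √n, and SD = √n × CI width / 3.92 for a 95% CI); for small samples a t-based divisor would normally replace 3.92. The function names are hypothetical, and this is not a reproduction of the review authors' actual calculations.

```python
import math
from statistics import mean
from typing import List, Optional


def sd_from_se(se: float, n: int) -> float:
    """SD of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD of a group from a confidence interval for its mean.

    z = 1.96 corresponds to a 95% CI under a normal approximation.
    """
    return math.sqrt(n) * (upper - lower) / (2 * z)


def impute_missing_sds(sds: List[Optional[float]]) -> List[float]:
    """Replace missing SDs (None) with the mean of the reported SDs."""
    reported = [sd for sd in sds if sd is not None]
    if not reported:
        raise ValueError("at least one study must report an SD")
    fill = mean(reported)
    return [fill if sd is None else sd for sd in sds]
```

For instance, a standard error of 2 in a group of 25 participants corresponds to an SD of 10.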

Dichotomous data

We did not plan to analyse dichotomous outcomes.

Unit of analysis issues

Cluster‐randomised trials

We included cluster‐randomised trials in the analysis for the current review alongside individually randomised trials. We made an adjustment to the sample size in these studies for each intervention based on the method described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). This method utilised the intracluster correlation co‐efficient (ICC) as calculated from trial results.
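One common way to make this kind of adjustment is to divide each arm's sample size by a design effect of 1 + (m − 1) × ICC, where m is the average cluster size. The sketch below assumes that approach; the function names are illustrative, and the inputs would come from the individual trial reports.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size of one arm after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)
```

With an average cluster size of 10 and an ICC of 0.05, the design effect is 1.45, so an arm of 145 participants contributes about as much information as 100 individually randomised participants.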

Multi‐armed trials

We included multi‐armed trials in this review. To overcome potential issues due to multiple, correlated comparisons, we analysed multi‐armed trials using methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). When feasible, we combined multiple comparison groups to create one relevant intervention group and one relevant comparison group.
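Combining two arms into a single group requires a pooled sample size, mean and SD. A minimal sketch of the standard formula for merging two groups is given below; the function name is hypothetical, and whether the review authors applied exactly this calculation to each trial is an assumption.

```python
import math
from typing import Tuple


def combine_groups(n1: int, mean1: float, sd1: float,
                   n2: int, mean2: float, sd2: float) -> Tuple[int, float, float]:
    """Merge two groups into one, returning (n, mean, sd).

    The combined SD accounts for both the within-group variation and
    the difference between the two group means.
    """
    n = n1 + n2
    combined_mean = (n1 * mean1 + n2 * mean2) / n
    between = (n1 * n2 / n) * (mean1 - mean2) ** 2
    variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2 + between) / (n - 1)
    return n, combined_mean, math.sqrt(variance)
```

The result equals the mean and SD that would be obtained from the pooled individual participant data; more than two arms can be merged by applying the function repeatedly.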

Dealing with missing data

For included studies, we noted the level of attrition; any study with greater than 20% attrition was considered at high risk of attrition bias. When standard deviations (SDs) of the change were missing from studies, and it was not possible to obtain the result from study authors, we used the mean value for the SD of other included studies that reported that outcome. We excluded from the analysis studies in which only medians and percentiles were available and study authors reported no other means of calculating mean change scores.

Assessment of heterogeneity

We assessed heterogeneity visually through inspection of forest plots, and statistical heterogeneity in each meta‐analysis using Tau², I² and Chi² statistics. We regarded heterogeneity as substantial when Tau² was greater than zero and I² was greater than 30% or a low P value (< 0.10) was reported for the Chi² test for heterogeneity.
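The heterogeneity statistics named above can be computed from study-level mean differences and their standard errors. The sketch below assumes the DerSimonian and Laird moment estimator for Tau², which is not stated in the text, and it reads the decision rule as "(Tau² > 0 and I² > 30%) or P < 0.10", which is one interpretation of the sentence above. The P value for the Chi² test is taken as an input rather than computed.

```python
from typing import Dict, Optional, Sequence


def heterogeneity(effects: Sequence[float], ses: Sequence[float]) -> Dict[str, float]:
    """Return Cochran's Q (the Chi² statistic), its df, Tau² and I² (%)."""
    if len(effects) != len(ses) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimator (an assumption, see lead-in).
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {"Q": q, "df": float(df), "tau2": tau2, "I2": i2}


def is_substantial(stats: Dict[str, float],
                   chi2_p_value: Optional[float] = None) -> bool:
    """Apply the review's threshold for substantial heterogeneity."""
    by_size = stats["tau2"] > 0 and stats["I2"] > 30
    by_test = chi2_p_value is not None and chi2_p_value < 0.10
    return by_size or by_test
```

Two studies with mean differences of 0 and 2 and standard errors of 1 give Q = 2, Tau² = 1 and I² = 50%, which would be flagged as substantial.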

Assessment of reporting biases

When 10 or more studies were included in the meta‐analysis, we investigated reporting biases (such as publication bias) by using funnel plots. When asymmetry was suggested on visual assessment, we undertook exploratory analyses to investigate asymmetry using the test proposed by Egger 1997 (see Table 1).
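The regression behind the Egger test can be sketched as follows: each study's standard normal deviate (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and the intercept is the bias estimate. This is an unweighted least-squares sketch with hypothetical names; it returns the intercept and its standard error only, and does not compute the P value or confidence interval.

```python
import math
from typing import Sequence, Tuple


def egger_intercept(effects: Sequence[float],
                    ses: Sequence[float]) -> Tuple[float, float]:
    """Return (bias, standard error of bias) from Egger's regression."""
    n = len(effects)
    if n != len(ses) or n < 3:
        raise ValueError("need at least three studies with matching inputs")

    x = [1.0 / se for se in ses]                 # precision
    y = [e / se for e, se in zip(effects, ses)]  # standard normal deviate

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept
```

An intercept close to zero suggests a symmetrical funnel plot, whereas an intercept far from zero suggests that smaller studies report systematically different effects.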

Table 1. Publication bias: results of Egger and Begg‐Mazumdar Kendall's tests

Outcome	Begg‐Mazumdar test	Egger test
CRQ Fatigue	Kendall's tau = 0.22807; P value 0.1863	Bias = 1.61189 (95% CI ‐0.194745 to 3.418525); P value 0.077
CRQ Emotional	Kendall's tau = 0.204678; P value 0.2378	Bias = 0.997332 (95% CI ‐0.618039 to 2.612702); P value 0.2101
CRQ Mastery	Kendall's tau = 0.146199; P value 0.4063	Bias = 1.531134 (95% CI ‐0.268167 to 3.330434); P value 0.0904
CRQ Dyspnoea (see Figure 1 for funnel plot)	Kendall's tau = 0.274854; P value 0.1082	Bias = 1.275427 (95% CI ‐0.761574 to 3.312427); P value 0.204
SGRQ Total (see Figure 2 for funnel plot)	Kendall's tau = ‐0.052632; P value 0.73	Bias = ‐0.459813 (95% CI ‐2.086751 to 1.167125); P value 0.5588
SGRQ Symptoms	Kendall's tau = 0.017544; P value 0.945	Bias = 0.076734 (95% CI ‐1.241745 to 1.395213); P value 0.9037
SGRQ Activity	Kendall's tau = ‐0.052632; P value 0.73	Bias = ‐0.336937 (95% CI ‐2.10096 to 1.427086); P value 0.692
6MWT	Kendall's tau = 0.16074; P value 0.1601	Bias = 1.24304 (95% CI 0.183967 to 2.302131); P value 0.0227
Incremental Shuttle Walk Test	Kendall's tau = 0.0776906; P value 0.846	Bias = ‐0.212523 (95% CI ‐2.7776 to 2.351859); P value 0.846
Cycle Ergometer	Kendall's tau = ‐0.2666667; P value 0.139	Bias = 1.57164 (95% CI 0.6053 to 2.337984); P value 0.0036

Figure 1

Funnel plot of comparison: 1 Rehabilitation versus usual care, outcome: 1.4 QoL ‐ Change in CRQ (Dyspnoea) (see Table 1 for Egger and Begg‐Mazumdar: Kendall's test results).

Figure 2

Funnel plot of comparison: 1 Rehabilitation versus usual care, outcome: 1.5 QoL ‐ Change in SGRQ (Total) (see Table 1 for Egger and Begg‐Mazumdar: Kendall's test results).

Data synthesis

Review authors undertook statistical analysis by using Review Manager software (RevMan 2011). Throughout the analysis, we used mean differences (MDs) as determined (to take into account pre‐experiment group differences) from the differences between preintervention and postintervention changes in treatment and control groups. We combined MDs according to random‐effects analyses (Shadish 1994) and presented the results as average treatment effects with 95% CIs and estimates of Tau² and I². In the case of cross‐over trials, we considered only the first study period and excluded from the analysis data obtained during the second study period. We explored heterogeneity through a priori specified subgroup analyses. When possible, for each outcome, we discussed the summary effect estimate in the context of its minimal clinically important difference (MCID). The MCID is defined as the smallest difference in score corresponding to the smallest difference perceived by the average patient that would mandate, in the absence of troublesome side effects and excessive costs, a change in management of a patient's condition (Jaeschke 1989).
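To illustrate the synthesis step, the sketch below pools study-level mean differences with random-effects inverse-variance weights for a given Tau², and then compares a 95% CI with the MCID. The function names and the wording of the returned labels are illustrative assumptions; effects are expressed as positive magnitudes of benefit, so a reduction in SGRQ is entered as its absolute value.

```python
import math
from typing import Sequence, Tuple


def pool_random_effects(effects: Sequence[float], ses: Sequence[float],
                        tau2: float) -> Tuple[float, float, float]:
    """Return (pooled MD, lower 95% limit, upper 95% limit)."""
    if len(effects) != len(ses) or len(effects) == 0:
        raise ValueError("effects and standard errors must match and be non-empty")
    # Each study's variance is inflated by the between-study variance Tau².
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se_pooled = math.sqrt(1.0 / total_w)
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled


def compare_with_mcid(lower: float, upper: float, mcid: float) -> str:
    """Describe where a CI for the size of benefit lies relative to the MCID."""
    if lower >= mcid:
        return "entire CI at or beyond the MCID"
    if upper < mcid:
        return "entire CI below the MCID"
    return "CI spans the MCID"
```

Using the figures in the summary of findings table, the CRQ dyspnoea interval (0.56 to 1.03 units) lies wholly beyond the 0.5 unit threshold, and the SGRQ total interval (4.52 to 9.26 units lower) lies wholly beyond the 4 unit threshold.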

Subgroup analysis and investigation of heterogeneity

To explain anticipated heterogeneity among study results, we defined a set of three a priori hypotheses on which sensitivity analyses were to be based. We identified potential sources of heterogeneity in relation to the outcomes of exercise capacity and HRQoL. We then classified these hypotheses into subcategories as follows.

Interventions

The contribution of each of the components of PR programmes to patient improvement in exercise capacity and HRQoL is not known. We hypothesised that the more comprehensive the rehabilitation programme, the larger would be the effect size in improving exercise capacity and HRQoL. We also hypothesised that a difference in intervention effect may be noted between hospital only‐based and community/home‐based interventions. Therefore, we performed a subgroup analysis of:

pulmonary rehabilitation and exercise only interventions versus PR plus a more comprehensive intervention within which education was included; and
hospital only‐based versus community/home‐based programmes.

Methodological quality

We hypothesised that the results of trials would be influenced by their methodological quality. For the purpose of this subgroup analysis, we defined high‐quality trials as those at low risk of bias for:

allocation concealment; or
incomplete outcome data (we took loss to follow‐up ≥ 20% to indicate high risk).

We assessed for subgroup differences by using interaction tests available within RevMan (RevMan 2011). We reported the results of subgroup analyses by quoting the Chi² statistic and the P value, and the interaction test by providing the I² value.
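As an illustration of the interaction test, the sketch below recomputes a test for subgroup differences from two pooled subgroup estimates. It assumes the test is the usual inverse‐variance Chi² across subgroup estimates and that standard errors can be recovered from the 95% CIs; the function names are hypothetical. The inputs are the CRQ emotional function estimates reported in Table 4 (community 0.21 [0.04, 0.39]; hospital 0.77 [0.51, 1.03]).

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def subgroup_difference_test(estimates, ses):
    """Chi^2 test for differences between pooled subgroup estimates.

    Returns (chi2, df, i2_percent, p_value). The P value is computed only
    for df == 1, where the Chi^2 tail has a closed form via erfc; it is
    None otherwise.
    """
    w = [1.0 / se ** 2 for se in ses]
    overall = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    chi2 = sum(wi * (e - overall) ** 2 for wi, e in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (chi2 - df) / chi2) * 100 if chi2 > 0 else 0.0
    p = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None
    return chi2, df, i2, p


# CRQ emotional function, community versus hospital (values from Table 4).
chi2, df, i2, p = subgroup_difference_test(
    [0.21, 0.77],
    [se_from_ci(0.04, 0.39), se_from_ci(0.51, 1.03)],
)
print(round(chi2, 2), df, round(i2, 1), round(p, 4))
```

Because the inputs are rounded to two decimal places, the recomputed statistic should be close to, but not exactly equal to, the Chi² of 12.24 and I² of 91.8% reported in Table 4.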

Sensitivity analysis

We performed sensitivity analyses on the basis of trial quality by repeating our analysis among only those trials judged to be of 'high quality.' For the purposes of this review, 'high‐quality' trials are defined as trials with low risk of bias due to allocation concealment or low risk of bias due to incomplete outcome data. We limited sensitivity analyses to primary outcomes (see Types of outcome measures).

Results

Description of studies

See Characteristics of included studies and Characteristics of excluded studies as well as baseline characteristics (Table 2) and study design (Table 3).

Table 2. Baseline characteristics

Study	Rehab sample size	Male	Female	Mean age (SD)	FEV₁ (SD)	Control sample size	Male	Female	Mean age (SD)	FEV₁ (SD)
Barakat 2008	35	na	na	63.7	41.9	36	na	na	65.9	43.3
Baumann 2012	37	na	na	65	45	44	na	na	63	47
Behnke 2000a	23	12	3	64.0 (1)	34.1 (7.4)	23	11	4	68.0 (2.2)	37.5 (6.6)
Bendstrup 1997	27	7	9	64 (3)	1.02 L/min (0.06)	20	7	9	65 (2)	1.04 L/min (0.07)
Booker 1984	32	na	na	66 (8)	0.85 L (0.29)	37	na	na	65 (7)	0.97 L (0.37)
Borghi‐Silva 2009	20	13	7	67 (10)	33 (9)	14	12	8	67 (10)	35 (11)
Boxall 2005	23	11	12	77.6 (7.6)	40.5 (15.9)	23	15	8	75.8 (8.1)	37.7 (15.0)
Busch 1988	7	5	2	65 (16)	26% (9)	7	6	1	66 (16)	27% (11)
Cambach 1997	15	7	8	62 (5)	59% (16)	8	6	2	62 (9)	60% (23)
Casaburi 2004	12	12	0	69 (10)	36% (9)	12	12	0	68 (9)	39% (12)
Casey 2013	178	117	61	68.8 (10.2)	57.6 (14.3)	172	106	66	68.4 (10.3)	59.7 (13.8)
Cebollero 2012	28	28	0	68 (7)	47.8 (5)	8	8	0	69 (5)	38.7 (5)
Chan 2011	69	61	8	73.6 (7.5)	91 (0.39)	67	58	9	73.6 (7.4)	89 (0.39)
Chlumsky 2001	13	12	1	63 (11)	43% (21)	6	5	1	65 (13)	51% (17)
Clark 1996	32	na	na	58 (8)	1.72 L (0.83)	16	na	na	55 (8)	1.44 L (0.59)
Cochrane 2006	74	32	42	na	na	50	18	32	na	na
Cockcroft 1981	18	18	0	61 (5)	1.53 L (0.70)	16	16	0	60 (5)	1.32 L (0.44)
De Souto Araujo 2012	21	12	9	59	39.2 (11.4) /43.9 (10.3)	11	8	3	71.1	45.1 (12.6)
Deering 2011	25	11	14	67.7 (5.3)	77.0 (19)	19	8	8	68.6 (5.5)	45.8 (18.3)
Elci 2008	39	33	6	59.67 (8.6)	47.7	39	33	6	58.08 (11.45)	46.28
Emery 1998	25	15	14	65 (6)	1.29 L (0.63)	25	12	13	67 (7)	1.02 L (0.37)
Engström 1999	26	14	12	66 (5)	31% (11)	24	12	12	67 (5)	34% (10)
Faager 2004	10	3	7	72 (9)	26 (7)	10	3	7	70 (8)	28 (6)
Faulkner 2010	10	na	na	na	na	10	na	na	na	na
Fernandez 2009	30	29	1	66 (8)	33 (10)	20	20	0	70 (5)	38 (12)
Finnerty 2001	36	25	11	70 (8)	41% (19)	29	19	10	68 (10)	41% (16)
Gohl 2006	17	6	4	62.5 (7)	53.4 (10.7)	17	7	2	53.7 (5.8)	63.2 (8.5)
Goldstein 1994	38	21	17	66 (7)	35% (15)	40	17	23	65 (8)	35% (12)
Gosselink 2000	37	31	6	60 (9)	41% (16)	33	30	3	63 (7)	43% (12)
Gottlieb 2011	35	7	15	74.1 (66–82)	64.27 (7.9)	26	7	13	73.2 (67–88)	67.05 (8.8)
Griffiths 2000	93	57	37	68 (8)	40% (16)	91	54	37	68 (8)	39% (16)
Gurgun 2013	30	28	28	64.0 (10.8)	41.9 (10.8)	16	15	1	67.8 (6.6)	39.3 (9.3)
Güell 1995	30	30	30	64 (7)	31% (12)	30	30	0	66 (6)	39% (14)
Güell 1998	18	16	2	68 (8)	32% (11)	17	17	0	66 (8)	38% (15)
Hernandez 2000	20	20	0	64 (8)	71.1 (18.9)	17	17	0	63 (7)	74.7 (14.7)
Hoff 2007	6	4	2	62.8 (1.4)	49.9 (4.6)	6	4	2	60.6 (3.0)	45.2 (6.0)
Jones 1985	8	6	2	64 (6)	0.78 L (0.27)	6	1	5	63 (8)	0.68 L (0.12)
Karapolat 2007	26	21	5	64.81 (9.4)	55.50%	19	18	1	67.21 (6.72)	58%
Lake 1990	7	6	1	66.3 (6.8)	0.83 L (0.25)	7	4	3	65.7 (3.5)	0.97 L (0.29)
Lindsay 2005	25	20	5	69.5 (9.3)	0.9 L (0.3)	25	18	7	69.8 (10.3)	0.8 L (0.4)
Liu 2012	36	26	10	61.34 (8.3)	61.27 (5.86)	36	29	7	62.2 (6.34)	61.43 (6.17)
McGavin 1977	12	12	0	61 (6)	0.97 L (0.33)	12	12	0	57 (8)	1.15 L (0.72)
McNamara 2013	38	18	15	72 (10)	60 (10)	15	8	7	70 (9)	55 (20)
Mehri 2007	20	11	9	52.1 (10.7)	na	18	7	11	52.17 (11.6)	na
Mendes De Oliveira 2010	56	46	10	66.4/71.3	47.5/ 51.5	29	19	10	70.8	41.4
Nalbant 2011	14	11	3	73.5	58.5 (48‐65)	15	13	2	68	57 (44‐66)
O'Shea 2007	27	na	na	66.9 (7)	49	27	na	na	68.4 (9.9)	52
Ozdemir 2010	25	25	0	60.9 (8.8)	54.5 (15.6)	25	25	0	64.1 (8.9)	54.1 (20.2)
Paz‐Diaz 2007	10	6	4	67 (5)	34 (11)	14	12	2	62 (7)	30 (9)
Petty 2006	149	80	69	68.8 (9.2)	na	73	40	33	66.8 (9.9)	na
Reardon 1994	10	5	5	66 (8)	35% (10)	10	5	5	66 (7)	33% (15)
Ringbaek 2000	24	1	23	62 (7)	50% (17)	21	6	15	65 (8)	44% (14)
Gomez 2006	64	39	9	64.1/64.9	74 (66.5‐81.5)	33	19	4	63.4	60.1 (55.6‐64.4)
Simpson 1992	14	5	9	73 (5)	40% (19)	14	10	4	70 (6)	39% (21)
Singh 2003	20	na	na	na	28 (7.5)	20	na	na	na	26 (7.1)
Sridhar 2008	61	30	31	69.9 (9.6)	42.9 (15.5)	61	30	31	69.68 (10.4)	48.9 (18.69)
Strijbos 1996	15	14	1	61 (6)	40% (20)	15	12	3	63 (5)	43% (9)
Theander 2009	15	3	9	66	35.1 (7.6)	15	10	4	64	32.3 (9.5)
Vallet 1994	10	7	3	60 (9)	57.2	10	8	2	58 (6)	55.7
Van Wetering 2010	102	72	30	65.9 (8.8)	58 (17)	97	69	28	67.2 (8.9)	60 (15)
Vijayan 2010	16	na	na	na	na	15	na	na	na	na
Weiner 1992	12	6	6	67 (9)	32.8 (3)	12	5	7	61 (9)	39.2 (2.8)
Wen 2008	32	31	1	67 (7)/68 (7)	46 (10)/50 (14)	9	9	0	66 (10)	52 (14)
Wijkstra 1994	28	23	5	64 (5)	44% (11)	15	14	1	62 (5)	45% (9)
Xie 2003	25	22	3	54 (6)	42% (16)	25	21	4	54 (6)	40% (17)

na: not available.

See: Summary of findings for the main comparison Rehabilitation versus usual care for chronic obstructive pulmonary disease

Table 3. Study design

Study	Follow‐up	Duration (weeks)	Setting	Programme type
Barakat 2008	14 weeks	14	Outpatient	Exercise + other
Baumann 2012	6 months	8	Community	Exercise + other
Behnke 2000a	3, 6 months	24	Inpatient	Exercise + other
Bendstrup 1997	12, 24 weeks	12	Outpatient	Exercise
Booker 1984	3, 6, 12 months	9	Home	Exercise + other
Borghi‐Silva 2009	6 weeks	6	Outpatient	Exercise
Boxall 2005	12 weeks	12	Home	Exercise + other
Busch 1988	18 weeks	18	Home	Exercise
Cambach 1997	3 months	12	Community	Exercise + other
Casaburi 2004	10 weeks	10	Outpatient	Exercise + other
Casey 2013	12 weeks	8	Community	Exercise + other
Cebollero 2012	12 weeks	12	Outpatient	Exercise
Chan 2011	3 months	12	Community	Exercise
Chlumsky 2001	8 weeks	8	Outpatient	Exercise
Clark 1996	12 weeks	12	Home	Exercise
Cochrane 2006	6 weeks, 6 months, 12 months	6	Outpatient	Exercise + other
Cockcroft 1981	2, 6 months	6	Outpatient	Exercise
De Souto Araujo 2012	8 weeks	8	Community	Exercise
Deering 2011	8 weeks	7	Outpatient	Exercise + other
Elci 2008	1, 3 months	12	Community /Home	Exercise + other
Emery 1998	10 weeks	10	Outpatient	Exercise + other
Engström 1999	12 months	52	Outpatient /Home	Exercise + other
Faager 2004	8 weeks, 6 months	8	Inpatient /Home	Exercise + other
Faulkner 2010	week 9	8	Community	Exercise + other
Fernandez 2009	1 year	52	Home	Exercise + other
Finnerty 2001	12, 24 weeks	6	Outpatient	Exercise + other
Gohl 2006	12 months	52	Community	Exercise
Goldstein 1994	24 weeks	8	Inpatient	Exercise + other
Gosselink 2000	6, 18 months	24	Outpatient	Exercise
Gottlieb 2011	6 months	7	Community	Exercise + other
Griffiths 2000	1 year	6	Outpatient /Home	Exercise + other
Gomez 2006	3, 6 months	12	Community	Exercise + other
Güell 1995	3, 6, 9, 12, 18, 24 months	12	Outpatient /Home	Exercise
Güell 1998	8 weeks	8	Outpatient	Exercise
Gurgun 2013	8 weeks, 6 months	8	Outpatient	Exercise + other
Hernandez 2000	12 weeks	12	Home	Exercise
Hoff 2007	8 weeks	8	Outpatient	Exercise
Jones 1985	10 weeks	10	Home	Exercise
Karapolat 2007	8, 12 weeks	8	Outpatient	Exercise + other
Lake 1990	8 weeks	8	Outpatient	Exercise
Lindsay 2005	6 weeks, 3 months	6	Community	Exercise + other
Liu 2012	6 months	24	Inpatient /Home	Exercise
McGavin 1977	14 weeks	?12	Home	Exercise
McNamara 2013	8 weeks	8	Outpatient	Exercise
Mehri 2007	4 weeks	4	Outpatient	Exercise
Mendes De Oliveira 2010	12 weeks	12	Outpatient /Home	Exercise + other
Nalbant 2011	3, 6 months	24	Nursing home	Exercise + other
O'Shea 2007	3, 6 months	12	Outpatient /Home	Exercise
Ozdemir 2010	1 month	4	Outpatient	Exercise
Paz‐Diaz 2007	8 weeks	8	Outpatient	Exercise
Petty 2006	8 weeks	8	Home	Exercise + other
Reardon 1994	6 weeks	6	Outpatient	Exercise + other
Ringbaek 2000	8 weeks	8	Outpatient	Exercise + other
Simpson 1992	8 weeks	8	Outpatient	Exercise
Singh 2003	4 weeks	4	Home	Exercise
Sridhar 2008	6 months	6	Outpatient /Home	Exercise + other
Strijbos 1996	3, 6, 12, 18 months	12	Outpatient	Exercise + other
Theander 2009	12 weeks	12	Outpatient /Home	Exercise + other
Vallet 1994	8 weeks	8	Inpatient	Exercise
Van Wetering 2010	4 months	12	Community	Exercise + other
Vijayan 2010	Unclear	6	Unclear	Exercise
Weiner 1992	6 months	24	Outpatient	Exercise
Wen 2008	12 weeks	12	Outpatient	Exercise
Wijkstra 1994	12 weeks	12	Outpatient /Home	Exercise + other
Xie 2003	12 weeks	12	Home	Exercise

Results of the search

Our search yielded 1284 citations with potential for inclusion (see Figure 3). We excluded 1132 citations during the initial screening of titles and abstracts and assessed 98 studies (152 citations) on the basis of a full‐text review. Of these, 51 studies (68 citations) failed to meet the inclusion criteria. A further five studies (eight citations) provided insufficient detail to allow a decision and are still awaiting classification (see Characteristics of studies awaiting classification). Of these, we conducted a teleconference with the author of two studies (Meshcheryakova 2010; Meshcheryakova 2012) and are awaiting additional unpublished information. We were not able to establish contact with the authors of the other three studies (Aksu 2006; D'Amico 2010; Ren 2011). Three studies were ongoing at the time of this review, and results were not yet published; the study authors wished to withhold results until after publication (Chang 2008; Gurgun 2011; Sathyapala 2008) (see Characteristics of ongoing studies). In addition, eight citations were related to five studies that were already included in the previous version of this review. Thus, 34 studies (65 citations) were included for the first time in this review, in addition to the 31 studies (65 citations) already included in the previous version of the review. We have provided details of the literature search for the previous version of the review in Appendix 1.
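The study flow figures quoted above can be cross‐checked with simple arithmetic. The short sketch below only restates the counts given in this paragraph; the variable names are illustrative.

```python
# Citation-level flow (counts taken from the paragraph above).
citations_found = 1284
excluded_on_title_abstract = 1132
full_text_citations = citations_found - excluded_on_title_abstract

# Study-level flow after full-text review.
full_text_studies = 98
excluded_studies = 51
awaiting_classification = 5
ongoing_studies = 3
already_in_previous_version = 5
new_studies = (
    full_text_studies
    - excluded_studies
    - awaiting_classification
    - ongoing_studies
    - already_in_previous_version
)

# Studies carried over from the previous version of the review.
previous_studies = 31
total_included = new_studies + previous_studies

print(full_text_citations, new_studies, total_included)
```

The totals agree with the 152 citations assessed in full text, the 34 newly included studies and the 65 studies included overall.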

Figure 3

Study flow diagram.

Included studies

We included the 31 RCTs from the 2006 version of the Cochrane review (Lacasse 2006). A total of 65 studies (represented by 130 citations) contributed to this meta‐analysis, including 34 new studies (Barakat 2008; Baumann 2012; Borghi‐Silva 2009; Casey 2013; Cebollero 2012; Chan 2011; Cochrane 2006; De Souto Araujo 2012; Deering 2011; Elci 2008; Faager 2004; Faulkner 2010; Fernandez 2009; Gohl 2006; Gomez 2006; Gottlieb 2011; Gurgun 2013; Hoff 2007; Karapolat 2007; Lindsay 2005; Liu 2012; McNamara 2013; Mehri 2007; Mendes De Oliveira 2010; Nalbant 2011; O'Shea 2007; Ozdemir 2010; Paz‐Diaz 2007; Petty 2006; Sridhar 2008; Theander 2009; Van Wetering 2010; Vijayan 2010; Wen 2008), in addition to the 31 studies included in the original review (Behnke 2000a; Bendstrup 1997; Booker 1984; Boxall 2005; Busch 1988; Cambach 1997; Casaburi 2004; Chlumsky 2001; Clark 1996; Cockcroft 1981; Emery 1998; Engström 1999; Finnerty 2001; Goldstein 1994; Gosselink 2000; Griffiths 2000; Güell 1995; Güell 1998; Hernandez 2000; Jones 1985; Lake 1990; McGavin 1977; Reardon 1994; Ringbaek 2000; Simpson 1992; Singh 2003; Strijbos 1996; Vallet 1994; Weiner 1992; Wijkstra 1994; Xie 2003). We provided descriptions of these individual studies in the Characteristics of included studies table.

These studies involved 3822 participants, 2090 of whom were randomly allocated to some form of exercise rehabilitation for a minimum duration of four weeks and 1732 of whom were randomly assigned to usual care. For a detailed account of the criteria required for inclusion, see Criteria for considering studies for this review. The sample size in the included studies ranged from 12 participants (Hoff 2007) to 350 participants (Casey 2013) with a median of 45 participants (interquartile range (IQR) 29.5 to 67). We noted a large gender imbalance across all studies, with 69% of participants being male and with 10 studies including no female participants.

Only six studies reported in‐patient‐based programmes, three of which were combined with a home‐based follow‐up component. Thirty‐seven studies were hospital out‐patient based; eight of these included a home‐based element. In all, 21 programmes were community based, 11 of which were entirely home based, and one programme combined community‐ and home‐based components. The venue for the programme run by Vijayan 2010 was unclear from the reports. The duration of the programmes ranged from four weeks (three studies) to one year (three studies). Eight‐ and 12‐week programmes (18 studies of each) were most common. Timelines for assessment of participants followed a pattern identical to that of programme duration.

All but two trials that met the inclusion criteria used a standard parallel‐group design. Casey 2013 utilised cluster samples from general practices, whereas Cambach 1997 conducted a cross‐over trial. Most studies (48 trials) randomly assigned participants to two groups (i.e. rehabilitation and usual care), and three trials randomly assigned participants to three intervention groups, in addition to the usual care group (Casaburi 2004; Cochrane 2006; Lake 1990). The remaining 14 trials utilised two intervention groups and a usual care group (Cebollero 2012; De Souto Araujo 2012; Deering 2011; Emery 1998; Gomez 2006; Gurgun 2013; Jones 1985; Liu 2012; McNamara 2013; Mendes De Oliveira 2010; Petty 2006; Strijbos 1996; Weiner 1992; Wen 2008).

Excluded studies

We excluded 51 studies from the current update during the full‐text screening process. The Characteristics of excluded studies table provides full details of the excluded studies.

Risk of bias in included studies

As a result of the nature of the intervention, it was expected that blinding of participants and of professionals who delivered the interventions was not possible. Consequently, risk of performance bias in all studies was high. Risk of bias for other bias domains varied across included studies, and insufficient detail was provided to inform judgement in several included studies (see Figure 4, Risk of bias summary table, and Figure 5, Risk of bias graph, for an overview).

Figure 4

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 5

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Allocation

We judged 53 included studies as having low risk of bias in random sequence generation. Information was insufficient to permit a decision in relation to 12 trials (Bendstrup 1997; Borghi‐Silva 2009; Clark 1996; Faager 2004; Fernandez 2009; Hoff 2007; Lindsay 2005; Mehri 2007; Nalbant 2011; Paz‐Diaz 2007; Vijayan 2010; Wen 2008). With regard to allocation concealment, we judged 28 studies as having low risk of bias (Behnke 2000a; Booker 1984; Boxall 2005; Busch 1988; Cambach 1997; Casaburi 2004; Casey 2013; Cebollero 2012; Cochrane 2006; Cockcroft 1981; De Souto Araujo 2012; Emery 1998; Engström 1999; Faulkner 2010; Finnerty 2001; Goldstein 1994; Gomez 2006; Gosselink 2000; Gottlieb 2011; Griffiths 2000; Gurgun 2013; Karapolat 2007; Liu 2012; McNamara 2013; Mendes De Oliveira 2010; O'Shea 2007; Theander 2009; Van Wetering 2010) and four studies as having high risk of bias (Baumann 2012; Güell 1995; Güell 1998; Jones 1985); the remaining 33 studies provided insufficient information to inform judgements.

Blinding

Performance bias

As a result of the nature of the intervention, it was not possible to blind participants or professionals who delivered the interventions. Consequently, we judged all studies as having high risk of performance bias.

Detection bias

Across studies, the level of reporting of whether outcome assessment was blinded was relatively poor. We judged 32 studies as having low risk of detection bias (Barakat 2008; Booker 1984; Busch 1988; Casaburi 2004; Casey 2013; Cebollero 2012; Chan 2011; Cochrane 2006; De Souto Araujo 2012; Deering 2011; Elci 2008; Emery 1998; Engström 1999; Finnerty 2001; Goldstein 1994; Gomez 2006; Griffiths 2000; Güell 1995; Güell 1998; Hernandez 2000; Jones 1985; Lake 1990; Liu 2012; McNamara 2013; O'Shea 2007; Petty 2006; Reardon 1994; Ringbaek 2000; Simpson 1992; Strijbos 1996; Van Wetering 2010; Weiner 1992). In two of these studies (Engström 1999; Simpson 1992), the primary outcome assessment (quality of life) was blinded but the secondary outcome assessment (exercise capacity) was not. In Lake 1990, the cycle ergometer test was blinded, but the six‐minute walk test was not. In Busch 1988, the cycle ergometer test was not blinded and the 12‐minute walk test was blinded. Among studies that reported blinding of outcome assessment, nine studies were judged as having high risk of detection bias (Boxall 2005; Cambach 1997; Faulkner 2010; Gosselink 2000; Gottlieb 2011; McGavin 1977; Theander 2009; Vallet 1994; Wijkstra 1994), and the remaining 24 studies provided insufficient information to inform judgements.

Incomplete outcome data

We judged 39 studies as having low risk of attrition bias (Barakat 2008; Borghi‐Silva 2009; Boxall 2005; Cambach 1997; Casaburi 2004; Chlumsky 2001; Cockcroft 1981; Emery 1998; Engström 1999; Fernandez 2009; Goldstein 1994; Griffiths 2000; Güell 1995; Güell 1998; Gurgun 2013; Hoff 2007; Karapolat 2007; Lake 1990; Lindsay 2005; Liu 2012; McGavin 1977; McNamara 2013; Mehri 2007; O'Shea 2007; Ozdemir 2010; Paz‐Diaz 2007; Petty 2006; Reardon 1994; Ringbaek 2000; Simpson 1992; Singh 2003; Strijbos 1996; Theander 2009; Vallet 1994; Van Wetering 2010; Vijayan 2010; Weiner 1992; Wijkstra 1994; Xie 2003) and 22 as having high risk (Baumann 2012 24% of people dropped out; Behnke 2000a 35%; Bendstrup 1997 24%; Booker 1984 27%; Busch 1988 30%; Casey 2013 24%; Chan 2011 23%; Cochrane 2006 43%; De Souto Araujo 2012 24%; Deering 2011 42%; Faager 2004 30%; Faulkner 2010 30%; Finnerty 2001 43%; Gohl 2006 44%; Gomez 2006 48%; Gosselink 2000 62%; Gottlieb 2011 32%; Hernandez 2000 38%; Jones 1985 26%; Mendes De Oliveira 2010 27%; Nalbant 2011 28%; Wen 2008 24%). Information was insufficient to inform judgements in five studies (Cambach 1997; Cebollero 2012; Clark 1996; Elci 2008; Vijayan 2010).

Selective reporting

We found no trial registration protocol for most studies to check whether all prespecified outcomes were reported in the articles. However, outcomes listed in the methods section of the included studies were reported in the results section, with the exception of four studies that were judged to have high risk of reporting bias (i.e. Ozdemir 2010, whose results for the CRQ are incomplete; Paz‐Diaz 2007, who did not provide results for the rehabilitation group for CRQ; Petty 2006, in which results of the six‐minute walk test and Short Form (SF)‐36 are not presented; and Weiner 1992, in which results of the SGRQ are not available). In relation to publication bias, we visually reviewed the funnel plots (Figure 1; Figure 2) and followed this by performing the Egger test (Egger 1997) (Table 1). Egger test results showed no significant publication bias across the studies included in the current meta‐analysis.
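For readers unfamiliar with the Egger test, the sketch below shows the regression on which it is commonly based: each study's standardised effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and the intercept is read as the bias coefficient. The review does not state which software produced the values in Table 1, so this is an illustrative implementation only. The function name and the input data are hypothetical.

```python
import math


def egger_intercept(effects, ses):
    """Egger-style regression of standardised effect on precision.

    Returns (intercept, standard_error_of_intercept). An intercept near
    zero suggests a symmetric funnel plot. Requires at least three studies.
    """
    y = [d / se for d, se in zip(effects, ses)]  # standardised effects
    x = [1.0 / se for se in ses]                 # precisions
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)  # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical data in which smaller studies (larger SE) show larger effects.
ses = [0.10, 0.20, 0.30, 0.40]
effects = [0.2 + 2.0 * se for se in ses]
bias, bias_se = egger_intercept(effects, ses)
print(round(bias, 3))
```

In this constructed example the effect grows in proportion to the standard error, so the intercept is clearly non‐zero; with identical effects in every study it would be zero.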

Other potential sources of bias

We found no other source of bias, with the exception of a tendency toward increased proportions of male participants, as was highlighted earlier.

Effects of interventions

Pulmonary rehabilitation versus usual care

For this comparison, we included all participants who were randomly assigned in the included studies and received PR (defined as exercise training for at least four weeks with or without educational and/or psychological support) and those allocated to usual care (see Characteristics of included studies for details). We also undertook subgroup analysis as discussed in the Subgroup analysis and investigation of heterogeneity section. All outcome results used in the analyses were based on baseline assessment measurements and the earliest follow‐up assessment up to three months after completion of the intervention.

Primary outcomes

Health‐related quality of life

Among the 65 trials that met the inclusion criteria of the meta‐analysis, 44 made an attempt to measure HRQoL using eight different strategies. Only three of these strategies ‐ the Transitional Dyspnoea Index (TDI; Mahler 1984), the Chronic Respiratory Disease Questionnaire (CRQ; Guyatt 1987a) and the St. George's Respiratory Questionnaire (SGRQ; Jones 1992) ‐ have been demonstrated to be valid and responsive. Of these, the CRQ and the SGRQ have become the recognised standard of assessment of HRQoL amongst patients with COPD and are reported here. We analysed the CRQ and the SGRQ separately. Not all subscales were fully completed by all participants, so the numbers of participants per outcome and per subscale varied.

Chronic Respiratory Disease Questionnaire (CRQ)

Scores for the CRQ are reported on a 7‐point scale. Although 23 studies utilised the CRQ to assess HRQoL, only 19 studies (1291 participants) provided results suitable for analysis.

Participants allocated to rehabilitation programmes had, on average, significantly greater changes in HRQoL CRQ scores across all subscales when compared with participants allocated to control groups (Fatigue: MD 0.68, 95% CI 0.45 to 0.92; 19 trials; 1291 participants; Tau² = 0.15; I² = 64%; Analysis 1.1; Emotional function: MD 0.56, 95% CI 0.34 to 0.78; 19 trials; 1291 participants; Tau² = 0.12; I² = 58%; Analysis 1.2; Mastery: MD 0.71, 95% CI 0.47 to 0.95; 19 trials; 1212 participants; Tau² = 0.16; I² = 63%; Analysis 1.3; Dyspnoea: MD 0.79, 95% CI 0.56 to 1.03; 19 trials; 1283 participants; Tau² = 0.15; I² = 63%; Analysis 1.4).

For each of the CRQ domains (dyspnoea, fatigue, emotional function and mastery), the common effect size exceeded the 'minimal clinically important difference' (MCID) (0.5 points on the 7‐point scale) (Jaeschke 1989). The lower limit of the confidence interval around the common treatment effect of the dyspnoea domain (Analysis 1.4) exceeded the MCID, indicating not only statistical significance but also clinical significance in the effect of PR. The lower limits of the remaining domains were slightly below the MCID (Analysis 1.1; Analysis 1.2; Analysis 1.3).

Heterogeneity identified across all domains of the CRQ was substantial, as Tau² was greater than zero, and in all cases, I² was greater than 30% and the P value for the Chi² test was less than 0.10. We undertook subgroup and sensitivity analyses to explore heterogeneity; the findings, which are presented later, did not explain the high level of heterogeneity.

St. George's Respiratory Questionnaire (SGRQ)

Scores for the SGRQ are reported on a 100‐point scale. Twenty trials utilised the SGRQ to assess the HRQoL of participants. Results were available in a usable format from 19 trials (a maximum of 1153 participants) for inclusion in the meta‐analysis. Barakat 2008 was not included in the analysis, as clarification regarding the SD of the change is needed from the study authors.

Similar to the CRQ, participants allocated to PR programmes had, on average, significantly greater changes in SGRQ scores across all subscales when compared with participants allocated to control groups (SGRQ total: MD ‐6.89, 95% CI ‐9.26 to ‐4.52; 19 trials; 1146 participants; Tau² = 13.17; I² = 59%; Analysis 1.5; SGRQ symptoms: MD ‐5.09, 95% CI ‐7.69 to ‐2.49; 19 trials; 1153 participants; Tau² = 7.79; I² = 26%; Analysis 1.6; SGRQ impact: MD ‐7.23, 95% CI ‐9.91 to ‐4.55; 19 trials; 1149 participants; Tau² = 17.94; I² = 58%; Analysis 1.7; SGRQ activity: MD ‐6.08, 95% CI ‐9.28 to ‐2.88; 19 trials; 1148 participants; Tau² = 27.01; I² = 64%; Analysis 1.8).

For each of the SGRQ domains (as well as the total SGRQ score), the common effect size exceeded the MCID of four units (Jones 1991; Quirk 1991) (Analysis 1.5; Analysis 1.6; Analysis 1.7; Analysis 1.8). All results of the analysis for all domains of the SGRQ were statistically significant. However, the whole of the 95% CI around the pooled treatment effect lay beyond the MCID only for the SGRQ total score and the SGRQ impact domain, demonstrating unequivocal clinical and statistical significance for these two outcomes.
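The comparison of pooled effects with the MCID, as applied to the CRQ and SGRQ results, can be sketched as a simple classification rule. The function name and the category labels are illustrative and are not terms used by the review authors. The inputs are the pooled estimates quoted in the text (CRQ MCID 0.5, higher is better; SGRQ MCID 4, lower is better).

```python
def classify_against_mcid(md, ci_low, ci_high, mcid, higher_is_better=True):
    """Place a pooled mean difference and its 95% CI relative to the MCID."""
    if not higher_is_better:
        # Flip signs so that a beneficial effect is always positive.
        md, ci_low, ci_high = -md, -ci_high, -ci_low
    if ci_low > mcid:
        return "whole CI beyond MCID"
    if md > mcid:
        return "point estimate beyond MCID, CI crosses MCID"
    if ci_low > 0:
        return "statistically significant, point estimate below MCID"
    return "not statistically significant"


# Pooled estimates as reported in the text.
results = {
    "CRQ dyspnoea": classify_against_mcid(0.79, 0.56, 1.03, 0.5),
    "CRQ fatigue": classify_against_mcid(0.68, 0.45, 0.92, 0.5),
    "SGRQ total": classify_against_mcid(-6.89, -9.26, -4.52, 4, higher_is_better=False),
    "SGRQ symptoms": classify_against_mcid(-5.09, -7.69, -2.49, 4, higher_is_better=False),
}
for outcome, label in results.items():
    print(outcome, "->", label)
```

Flipping the sign for scales on which lower scores are better lets one rule serve both questionnaires.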

Heterogeneity in results obtained from the total and all subscales of the SGRQ was substantial, with the exception of the symptoms subscale (Analysis 1.6).

Secondary outcomes

Maximal exercise capacity

A total of 34 trials measured maximal exercise capacity. We limited the meta‐analysis to the 16 trials that used the incremental cycle ergometer test.

These 16 studies included 779 participants. On average, a statistically significant increase in mean Wmax (W) was reported among participants allocated to PR compared with those allocated to usual care (MD 6.77, 95% CI 1.89 to 11.65; Tau² = 40.97; I² = 74%; Analysis 1.10). The common effect size exceeded the MCID (4 watts) proposed by Puhan 2011(b). The maximal exercise test showed substantial heterogeneity in the results obtained.

Functional exercise capacity

Of the included studies, 43 trials used the six‐minute walk test as an outcome. Of these, 38 (1879 participants: 1012 actively treated, 867 controls) presented the results in a format that could be used for the meta‐analysis (see Analysis 1.11). Investigators reported a statistically significant increase, on average, in the mean difference in metres walked associated with PR (MD 43.93 m, 95% CI 32.64 to 55.21; Tau² = 713.49; I² = 74%; Analysis 1.11). Both the common effect and the lower limit of its confidence interval exceeded the MCID for the 6MWD of 30 metres, as recommended by Holland 2014, indicating the clinical significance of the effect of PR.

Eight trials (694 participants) reported data on the incremental shuttle walk test (ISWT). These test results were analysed independently from those of the 6MWT. On average, a statistically significant increase in mean metres walked was noted among participants allocated to PR compared with those allocated to usual care (MD 39.77, 95% CI 22.38 to 57.15; Tau² = 181.56; I² = 32%). This result is slightly below the MCID of 47.5 m (Singh 2008; Singh 2014) and so falls short of the threshold for clinical significance.

Similar to previous outcomes on maximal exercise, both the six‐minute walk test and the ISWT analyses demonstrated substantial heterogeneity.

Several other outcome measures were used to measure functional capacity, but because of the limited numbers of trials providing data for these other outcomes (endurance shuttle walk test: two trials; 12‐minute walk test: four trials; four‐minute walk test: one trial), these findings were not included in the meta‐analysis.

Subgroup and sensitivity analyses

Rehabilitation versus usual care (subgroup analysis hospital‐ versus community‐based pulmonary rehabilitation)

In total, 39 included studies were considered to have a hospital‐based PR intervention delivered on an in‐patient or out‐patient basis. A total of 25 studies focused on programmes that were delivered in the community at community centres or in individuals' homes. One study had both a community‐based and an out‐patient‐based intervention group, so it was excluded from the subgroup analysis (Mendes De Oliveira 2010).

In the subgroup analysis for the CRQ domain outcomes, the 'community' subgroup included nine studies (Cambach 1997; Casey 2013; Faulkner 2010; Gomez 2006; Hernandez 2000; Lindsay 2005; O'Shea 2007; Singh 2003; Wijkstra 1994) and the 'hospital' subgroup included 10 studies (Behnke 2000a; Cebollero 2012; Goldstein 1994; Gosselink 2000; Griffiths 2000; Güell 1995; Güell 1998; McNamara 2013; Simpson 1992; Sridhar 2008). For SGRQ outcomes, the community subgroup included nine studies (Baumann 2012; Boxall 2005; Chan 2011; De Souto Araujo 2012; Elci 2008; Fernandez 2009; Gohl 2006; Gottlieb 2011; Van Wetering 2010) and the hospital subgroup included 10 studies (Chlumsky 2001; Deering 2011; Engström 1999; Finnerty 2001; Griffiths 2000; Gurgun 2013; Karapolat 2007; Paz‐Diaz 2007; Ringbaek 2000; Theander 2009).

Evidence suggested a significant difference in treatment effect between subgroups for all domains of the CRQ, with higher mean values, on average, in the PR group in hospital than in the community‐based group (Analysis 2.1; Analysis 2.2; Analysis 2.3; Analysis 2.4). No subgroup differences were reported for any of the SGRQ domains (Analysis 2.5; Analysis 2.6; Analysis 2.7; Analysis 2.8).

Rehabilitation versus usual care (subgroup analysis 'exercise only' vs 'exercise plus more comprehensive components')

A total of 31 trials were included in the 'exercise only' subgroup, and 34 trials in the 'exercise plus more comprehensive components' subgroup, of which 10 trials in the 'exercise only' subgroup (Cebollero 2012; Gosselink 2000; Güell 1995; Güell 1998; Hernandez 2000; McNamara 2013; O'Shea 2007; Simpson 1992; Singh 2003; Sridhar 2008), and nine in the more comprehensive subgroup (Behnke 2000a; Cambach 1997; Casey 2013; Faulkner 2010; Goldstein 1994; Gomez 2006; Griffiths 2000; Lindsay 2005; Wijkstra 1994) reported CRQ data.

For the SGRQ, five trials were included in the 'exercise only' subgroup (Chan 2011; Chlumsky 2001; De Souto Araujo 2012; Gohl 2006; Paz‐Diaz 2007) and 14 trials in the more comprehensive subgroup (Baumann 2012; Boxall 2005; Deering 2011; Elci 2008; Engström 1999; Fernandez 2009; Finnerty 2001; Gottlieb 2011; Griffiths 2000; Gurgun 2013; Karapolat 2007; Ringbaek 2000; Theander 2009; Van Wetering 2010).

No evidence was found of a significant difference in treatment effect between subgroups for any domain of the CRQ (Analysis 3.1; Analysis 3.2; Analysis 3.3; Analysis 3.4) or the SGRQ (Analysis 3.5; Analysis 3.6; Analysis 3.7; Analysis 3.8).

Please see Table 4 for a summary of results of the subgroup analysis.

Table 4. Summary of subgroup analysis

Pulmonary rehabilitation versus usual care. Subgroup: community versus hospital‐delivered programme
Outcome	Subscale	Subgroups	Heterogeneity	MD [95% CI]	Test for subgroup differences
CRQ	Fatigue	Community	Tau² = 0.10; I² = 52%	0.44 [0.14, 0.75]	Chi² = 3.98, df = 1 (P value 0.05), I² = 74.9%
CRQ	Fatigue	Hospital	Tau² = 0.09; I² = 51%	0.86 [0.58, 1.14]	Chi² = 3.98, df = 1 (P value 0.05), I² = 74.9%
CRQ	Emotional Function	Community	Tau² = 0.00; I² = 0%	0.21 [0.04, 0.39]	Chi² = 12.24, df = 1 (P value 0.0005), I² = 91.8%
CRQ	Emotional Function	Hospital	Tau² = 0.06; I² = 39%	0.77 [0.51, 1.03]	Chi² = 12.24, df = 1 (P value 0.0005), I² = 91.8%
CRQ	Mastery	Community	Tau² = 0.07; I² = 45%	0.40 [0.12, 0.67]	Chi² = 8.58, df = 1 (P value 0.003), I² = 88.3%
CRQ	Mastery	Hospital	Tau² = 0.05; I² = 31%	0.95 [0.70, 1.20]	Chi² = 8.58, df = 1 (P value 0.003), I² = 88.3%
CRQ	Dyspnoea	Community	Tau² = 0.03; I² = 26%	0.58 [0.34, 0.81]	Chi² = 4.05, df = 1 (P value 0.04), I² = 75.3%
CRQ	Dyspnoea	Hospital	Tau² = 0.17; I² = 60%	0.99 [0.66, 1.32]	Chi² = 4.05, df = 1 (P value 0.04), I² = 75.3%
SGRQ	Total	Community	Tau² = 24.00; I² = 73%	‐8.15 [‐12.16, ‐4.13]	Chi² = 0.69, df = 1 (P value 0.41), I² = 0%
SGRQ	Total	Hospital	Tau² = 6.41; I² = 35%	‐6.05 [‐8.91, ‐3.20]	Chi² = 0.69, df = 1 (P value 0.41), I² = 0%
SGRQ	Symptoms	Community	Tau² = 6.28; I² = 24%	‐3.66 [‐7.07, ‐0.24]	Chi² = 1.65, df = 1 (P value 0.20), I² = 39.2%
SGRQ	Symptoms	Hospital	Tau² = 4.96; I² = 15%	‐6.91 [‐10.51, ‐3.30]	Chi² = 1.65, df = 1 (P value 0.20), I² = 39.2%
SGRQ	Impact	Community	Tau² = 19.91; I² = 63%	‐8.17 [‐12.00, ‐4.34]	Chi² = 0.46, df = 1 (P value 0.50), I² = 0%
SGRQ	Impact	Hospital	Tau² = 22.39; I² = 58%	‐6.21 [‐10.33, ‐2.09]	Chi² = 0.46, df = 1 (P value 0.50), I² = 0%
SGRQ	Activity	Community	Tau² = 48.91; I² = 78%	‐7.82 [‐13.37, ‐2.28]	Chi² = 0.93, df = 1 (P value 0.33), I² = 0%
SGRQ	Activity	Hospital	Tau² = 10.45; I² = 36%	‐4.58 [‐8.16, ‐1.00]	Chi² = 0.93, df = 1 (P value 0.33), I² = 0%
Pulmonary rehabilitation versus usual care. Subgroup: exercise only programme versus exercise plus additional elements in programme
Outcome	Subscale	Subgroups	Heterogeneity	MD [95% CI]	Test for subgroup differences
CRQ	Fatigue	Exercise only	Tau² = 0.00; I² = 0%	0.73 [0.54, 0.92]	Chi² = 0.26, df = 1 (P value 0.61), I² = 0%
CRQ	Fatigue	Exercise + other	Tau² = 0.29; I² = 79%	0.61 [0.18, 1.03]	Chi² = 0.26, df = 1 (P value 0.61), I² = 0%
CRQ	Emotional Function	Exercise only	Tau² = 0.00; I² = 0%	0.51 [0.31, 0.71]	Chi² = 0.09, df = 1 (P value 0.77), I² = 0%
CRQ	Emotional Function	Exercise + other	Tau² = 0.28; I² = 79%	0.58 [0.16, 1.00]	Chi² = 0.09, df = 1 (P value 0.77), I² = 0%
CRQ	Mastery	Exercise only	Tau² = 0.01; I² = 11%	0.66 [0.44, 0.88]	Chi² = 0.12, df = 1 (P value 0.73), I² = 0%
CRQ	Mastery	Exercise + other	Tau² = 0.31; I² = 79%	0.74 [0.31, 1.18]	Chi² = 0.12, df = 1 (P value 0.73), I² = 0%
CRQ	Dyspnoea	Exercise only	Tau² = 0.06; I² = 31%	0.83 [0.56, 1.10]	Chi² = 0.13, df = 1 (P value 0.72), I² = 0%
CRQ	Dyspnoea	Exercise + other	Tau² = 0.25; I² = 77%	0.74 [0.35, 1.13]	Chi² = 0.13, df = 1 (P value 0.72), I² = 0%
SGRQ	Total	Exercise only	Tau² = 62.83; I² = 70%	‐7.87 [‐16.72, 0.98]	Chi² = 0.06, df = 1 (P value 0.81), I² = 0%
SGRQ	Total	Exercise + other	Tau² = 10.17; I² = 56%	‐6.76 [‐9.19, ‐4.34]	Chi² = 0.06, df = 1 (P value 0.81), I² = 0%
SGRQ	Symptoms	Exercise only	Tau² = 0.00; I² = 0%	‐7.38 [‐12.33, ‐2.44]	Chi² = 0.99, df = 1 (P value 0.32), I² = 0%
SGRQ	Symptoms	Exercise + other	Tau² = 13.88; I² = 41%	‐4.38 [‐7.62, ‐1.15]	Chi² = 0.99, df = 1 (P value 0.32), I² = 0%
SGRQ	Impact	Exercise only	Tau² = 33.34; I² = 63%	‐6.11 [‐12.60, 0.38]	Chi² = 0.17, df = 1 (P value 0.68), I² = 0%
SGRQ	Impact	Exercise + other	Tau² = 17.12; I² = 59%	‐7.61 [‐10.64, ‐4.57]	Chi² = 0.17, df = 1 (P value 0.68), I² = 0%
SGRQ	Activity	Exercise only	Tau² = 139.67; I² = 78%	‐9.33 [‐21.66, 2.99]	Chi² = 0.30, df = 1 (P value 0.59), I² = 0%
SGRQ	Activity	Exercise + other	Tau² = 18.51; I² = 60%	‐5.79 [‐8.95, ‐2.64]	Chi² = 0.30, df = 1 (P value 0.59), I² = 0%

CRQ: Chronic Respiratory Disease Questionnaire; MD: mean difference; SGRQ: St. George's Respiratory Questionnaire.
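As an illustrative aside, the 'Test for subgroup differences' column in Table 4 can be approximately reproduced from the published subgroup estimates alone. The sketch below is not the review authors' software. It assumes that each 95% CI is symmetric and normal‐based, back‐calculates standard errors from the rounded confidence limits, and uses hypothetical helper names (se_from_ci, subgroup_difference_test). With two subgroups, the statistic reduces to the squared difference between the two estimates divided by the sum of their variances, and I² is (Chi² ‐ df)/Chi².

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def subgroup_difference_test(md_a, ci_a, md_b, ci_b):
    """Chi-squared test (df = 1) for a difference between two subgroup estimates.

    Returns (chi2, p_value, i2_percent). Hypothetical helper for illustration.
    """
    var_a = se_from_ci(*ci_a) ** 2
    var_b = se_from_ci(*ci_b) ** 2
    chi2 = (md_a - md_b) ** 2 / (var_a + var_b)
    # Survival function of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # I-squared for the subgroup comparison, truncated at zero
    i2 = max(0.0, (chi2 - 1) / chi2) * 100 if chi2 > 0 else 0.0
    return chi2, p_value, i2


# CRQ Emotional Function, community versus hospital (values from Table 4)
chi2, p_value, i2 = subgroup_difference_test(0.21, (0.04, 0.39), 0.77, (0.51, 1.03))
print(f"Chi2 = {chi2:.2f}, P = {p_value:.4f}, I2 = {i2:.1f}%")
```

For the CRQ Emotional Function rows, this should land close to the reported Chi² = 12.24 and I² = 91.8%. Small discrepancies are expected because the inputs are rounded to two decimal places.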

Sensitivity analysis

A sensitivity analysis included only studies of high quality (studies for which both allocation concealment and incomplete outcome data were rated as low risk) (see risk of bias table in Figure 4). Thirteen studies met the criteria for high quality (Boxall 2005; Cambach 1997; Cockcroft 1981; Emery 1998; Engström 1999; Goldstein 1994; Griffiths 2000; Karapolat 2007; Liu 2012; McNamara 2013; O'Shea 2007; Theander 2009; Van Wetering 2010). Effect estimates were consistent with overall summary effect estimates for the two primary outcomes when contributing data were restricted to high‐quality studies, with the exception of one domain, for which the confidence interval widened enough to include the possibility of no difference between rehabilitation and control. All domains for both the CRQ and the SGRQ continued to be statistically significant when restricted to studies of high quality, with the exception of the SGRQ symptoms domain, which was no longer statistically significant (MD ‐4.12, 95% CI ‐8.42 to 0.21; seven trials; 572 participants; Tau² = 13.82; I² = 46%).

Neither the subgroup analyses nor the sensitivity analysis based on quality reduced or explained the high levels of heterogeneity.

Discussion

This review summarised 65 studies involving 3822 participants with chronic obstructive pulmonary disease (COPD), 2090 of whom were randomly allocated to some form of exercise rehabilitation for a minimum duration of four weeks and 1732 of whom were randomly assigned to usual care. This is the second update of this review, which was last updated in 2006 (Lacasse 2006). Pulmonary rehabilitation is now accepted within the scientific community as an essential strategy in the ongoing management of people with COPD (GOLD 2014). Development of objective health‐related quality of life (HRQoL) outcome measures (Kirshner 1985) and demonstration of a physiological rationale for exercise training in people with COPD (Casaburi 1991; Maltais 1996) have facilitated this acceptance. Results of the previous version of this meta‐analysis strongly supported pulmonary rehabilitation (PR) in the management of COPD, and results of this current update reconfirm these findings.

Three aspects of the meta‐analysis warrant comment. First, we examined the short‐term effects of PR in COPD, that is, the benefits of rehabilitation found at the completion of a programme. When the original review was undertaken, few investigators were examining the long‐term benefits of rehabilitation (Guell 2000; Ries 1995; Troosters 2000; Wijkstra 1995). More recently, focus on this aspect of PR has increased and exploration of strategies to maintain early benefits continues (Brooks 2002; Foglio 2001; Ries 2003). This review does not attempt to examine these issues. Second, we have been conservative in concluding clear benefit only when the limit of the 95% confidence interval (CI) representing the smallest treatment effect was still greater than the minimal clinically important difference (MCID). Third, we excluded a number of well‐conducted studies that have contributed to our understanding of PR, but in which control participants received interventions beyond what was considered conventional care. An example of this is Ries 1995, which was excluded on the grounds that control participants had been given an educational programme. Similarly, several studies in which an intervention such as inspiratory muscle training, psychosocial support or breathing exercises was compared with exercise training were excluded. Only studies in which usual care was directly compared with exercise rehabilitation were included for analysis.

As the care of patients with COPD is largely concerned with treating symptoms (Pauwels 2001), we believe that HRQoL should be considered as the primary outcome in PR. The present meta‐analysis reconfirms the findings of the previous version that PR is effective in relieving dyspnoea and fatigue, and in improving patients' emotional function and control over the disease. The magnitude of the improvement lies beyond the MCID.

In most trials, investigators measured HRQoL by using either the Chronic Respiratory Disease Questionnaire (CRQ) or the St. George's Respiratory Questionnaire (SGRQ). Head‐to‐head comparisons of these questionnaires have been published (Harper 1997; Rutten‐van Mölken 1999). In both studies, analyses of reliability, validity and responsiveness did not clearly favour one instrument above the other. Rutten‐van Mölken and colleagues (Rutten‐van Mölken 1999) suggested that the choice between the CRQ and the SGRQ should be based on other considerations, such as the required sample size. Only one trial included in the meta‐analysis reported results from both the CRQ and the SGRQ (Griffiths 2000), with no clear indication that one questionnaire is more sensitive to change than the other. Therefore, comparisons from this meta‐analysis are only indirect. We found wider 95% CIs around the pooled treatment effect from the SGRQ ‐ a situation that may be explained by the smaller number of participants contributing to this analysis.

Pulmonary rehabilitation programmes included in the meta‐analysis differed in several aspects, including clinical setting, duration and composition. We believe that this variation is responsible for the substantial heterogeneity observed in the results, a view in keeping with a recent study (Spruit 2014) and supported by Rochester 2014, who also identified this as an issue requiring further investigation. For instance, the contributions of educational activities and psychological support to exercise training remain uncertain. This information would be of utmost importance to physicians and allied healthcare professionals who prescribe rehabilitation and to those who allocate the resources. We addressed this issue in a systematic overview of the literature (Lacasse 1997). Since the time this review was published, further evidence from randomised controlled trials (RCTs) has been published to better define the type and intensity of exercise (Bernard 1999), as well as the influence of programme components, including patient education and self‐management (Bourbeau 2003), nutritional support (Steiner 2003) and respiratory muscle training (Watson 1997). Sometimes, evidence even took the form of systematic reviews (Ferreira 2012; Lotters 2002; Taylor 2005). Such questions were too specific to be directly addressed in this meta‐analysis, which aimed to investigate the overall effect of rehabilitation in COPD (not the effects of its components). Nevertheless, homogeneity among study results suggested that less sophisticated rehabilitation programmes may also be effective in improving HRQoL, although the between‐study comparison from which this conclusion follows is relatively weak.

Investigators have identified an increase in exercise tolerance and functional activities such as walking as other relevant outcomes of rehabilitation (Fishman 1994; Pauwels 2001). Our current interpretation of the results of the six‐minute walk test (6MWT) analysis differs from that of the previous version of the meta‐analysis (Lacasse 2006). In 2006, results of the meta‐analysis were compared with an MCID of 54 metres (95% CI 37 to 71 metres; Redelmeier 1997). From this comparison, the clinical significance of results obtained from the 2006 meta‐analysis was interpreted as uncertain. Since 2006, several studies have further investigated the issue of the MCID in field walk tests in chronic respiratory disease. Results of these studies have recently been summarised in an important systematic review, which was supported by the European Respiratory and American Thoracic Societies (Holland 2014; Singh 2014). Although variability across studies and methods used to determine the MCID is evident, available evidence suggests that the MCID for the 6MWT lies between 25 and 33 metres (median estimate 30 metres). Results of our meta‐analysis (i.e. MD of 43.93 metres with 95% CI between 32.64 and 55.21 metres) indicate the clinical significance of the effects of PR.
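The conservative decision rule described earlier in this Discussion (clear benefit only when the confidence limit representing the smallest treatment effect still exceeds the MCID) can be expressed as a short check. The following is a minimal sketch rather than part of the review's methods; the function name exceeds_mcid and its higher_is_better flag are hypothetical, and the inputs are the pooled estimates reported in the Summary of findings table.

```python
def exceeds_mcid(ci_lower, ci_upper, mcid, higher_is_better=True):
    """Return True when the CI limit nearest to 'no effect' still exceeds the MCID.

    mcid is given as a positive magnitude. For scales on which lower scores
    are better (such as the SGRQ), the smallest effect is the upper limit.
    """
    if higher_is_better:
        smallest_effect = ci_lower
    else:
        smallest_effect = -ci_upper
    return smallest_effect > mcid


# 6MWT: 95% CI 32.64 to 55.21 metres
print(exceeds_mcid(32.64, 55.21, mcid=30))   # median MCID estimate of 30 metres
print(exceeds_mcid(32.64, 55.21, mcid=54))   # MCID of 54 metres used in 2006

# SGRQ total: 95% CI -9.26 to -4.52 units, lower is better, MCID 4 units
print(exceeds_mcid(-9.26, -4.52, mcid=4, higher_is_better=False))
```

Under these assumptions, the 6MWT result clears the median MCID estimate of 30 metres but would not clear the 54‐metre threshold applied in 2006, which is consistent with the change in interpretation described above.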

When compared with the treatment effects of other important modalities of care for patients with COPD, such as long‐acting inhaled therapy or oral theophylline and its new derivatives (Kew 2014; Ram 2005), rehabilitation resulted in greater improvement in important domains of HRQoL and functional exercise capacity.

The importance of measures of maximal exercise capacity remains to be defined. An initial test may be useful in assisting with the prescription of an appropriate level of training. Retesting may provide physiological evidence that a training response has occurred and may be useful in adjustment of intensity levels during the programme (Jones 1988). As the results of maximal exercise tests correlate poorly with those of HRQoL measures (Guyatt 1985; Wijkstra 1994a), maximal exercise testing cannot serve as a substitute for such measures when the outcome of a rehabilitation programme is evaluated.

Figure 1

Funnel plot of comparison: 1 Rehabilitation versus usual care, outcome: 1.4 QoL ‐ Change in CRQ (Dyspnoea) (see Table 1 for Egger and Begg‐Mazumdar: Kendall's test results).

Figure 2

Funnel plot of comparison: 1 Rehabilitation versus usual care, outcome: 1.5 QoL ‐ Change in SGRQ (Total) (see Table 1 for Egger and Begg‐Mazumdar: Kendall's test results).

Figure 3

Study flow diagram.

Figure 4

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 5

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Analysis 1.1

Comparison 1 Rehabilitation versus usual care, Outcome 1 QoL ‐ Change in CRQ (Fatigue).

Analysis 1.2

Comparison 1 Rehabilitation versus usual care, Outcome 2 QoL ‐ Change in CRQ (Emotional Function).

Analysis 1.3

Comparison 1 Rehabilitation versus usual care, Outcome 3 QoL ‐ Change in CRQ (Mastery).

Analysis 1.4

Comparison 1 Rehabilitation versus usual care, Outcome 4 QoL ‐ Change in CRQ (Dyspnoea).

Analysis 1.5

Comparison 1 Rehabilitation versus usual care, Outcome 5 QoL ‐ Change in SGRQ (Total).

Analysis 1.6

Comparison 1 Rehabilitation versus usual care, Outcome 6 QoL ‐ Change in SGRQ (Symptoms).

Analysis 1.7

Comparison 1 Rehabilitation versus usual care, Outcome 7 QoL ‐ Change in SGRQ (Impacts).

Analysis 1.8

Comparison 1 Rehabilitation versus usual care, Outcome 8 QoL ‐ Change in SGRQ (Activity).

Analysis 1.9

Comparison 1 Rehabilitation versus usual care, Outcome 9 Maximal Exercise (Incremental shuttle walk test).

Analysis 1.10

Comparison 1 Rehabilitation versus usual care, Outcome 10 Maximal Exercise Capacity (cycle ergometer).

Analysis 1.11

Comparison 1 Rehabilitation versus usual care, Outcome 11 Functional Exercise Capacity (6MWT).

Analysis 2.1

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 1 QoL ‐ Change in CRQ (Fatigue).

Analysis 2.2

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 2 QoL ‐ Change in CRQ (Emotional Function).

Analysis 2.3

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 3 QoL ‐ Change in CRQ (Mastery).

Analysis 2.4

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 4 QoL ‐ Change in CRQ (Dyspnoea).

Analysis 2.5

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 5 QoL ‐ Change in SGRQ (Total).

Analysis 2.6

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 6 QoL ‐ Change in SGRQ (Symptoms).

Analysis 2.7

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 7 QoL ‐ Change in SGRQ (Impacts).

Analysis 2.8

Comparison 2 Rehabilitation versus usual care (subgroup analysis hospital vs community), Outcome 8 QoL ‐ Change in SGRQ (Activity).

Analysis 3.1

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 1 QoL ‐ Change in CRQ (Fatigue).

Analysis 3.2

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 2 QoL ‐ Change in CRQ (Emotional Function).

Analysis 3.3

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 3 QoL ‐ Change in CRQ (Mastery).

Analysis 3.4

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 4 QoL ‐ Change in CRQ (Dyspnoea).

Analysis 3.5

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 5 QoL ‐ Change in SGRQ (Total).

Analysis 3.6

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 6 QoL ‐ Change in SGRQ (Symptoms).

Analysis 3.7

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 7 QoL ‐ Change in SGRQ (Impacts).

Analysis 3.8

Comparison 3 Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other), Outcome 8 QoL ‐ Change in SGRQ (Activity).

Analysis 4.1

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 1 QoL ‐ Change in CRQ (Dyspnoea).

Analysis 4.2

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 2 QoL ‐ Change in CRQ (Emotional Function).

Analysis 4.3

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 3 QoL ‐ Low Risk CRQ (Fatigue).

Analysis 4.4

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 4 QoL ‐ Low Risk CRQ (Mastery).

Analysis 4.5

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 5 QoL ‐ Low Risk SGRQ (Total).

Analysis 4.6

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 6 QoL ‐ Low Risk SGRQ (Symptoms).

Analysis 4.7

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 7 QoL ‐ Low Risk SGRQ (Impacts).

Analysis 4.8

Comparison 4 Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome), Outcome 8 QoL ‐ Low Risk SGRQ (Activity).

Summary of findings for the main comparison. Rehabilitation versus usual care for chronic obstructive pulmonary disease

Rehabilitation versus usual care for chronic obstructive pulmonary disease
Patient or population: patients with chronic obstructive pulmonary disease Settings: hospital and community Intervention: rehabilitation versus usual care
Outcomes	*Illustrative comparative effects (95% CI)**		Number of participants (studies)	Quality of the evidence (GRADE)	Comments
	Response on control	Treatment effect
	Usual care	Rehabilitation versus usual care
QoL ‐ Change in CRQ (dyspnoea) CRQ Questionnaire. Scale from 1 to 7 (Higher is better and 0.5 unit is an important difference) Follow‐up: median 12 weeks	Median change = 0 units	Mean QoL ‐ change in CRQ (Dyspnoea) in the intervention groups was 0.79 units higher (0.56 to 1.03 higher)	1283 (19 studies)	⊕⊕⊕⊝ Moderate^1,2,3	Sensitivity analysis from studies at lower risk of bias was similar (MD 0.99, 95% CI 0.64 to 1.34; participants = 384; studies = 5; I² = 34%)
QoL ‐ Change in SGRQ (total) Scale from 0 to 100 (Lower is better and 4 units is an important difference) Follow‐up: median 12 weeks	Median change = 0.42 units	Mean QoL ‐ change in SGRQ (total) in the intervention groups was 6.89 units lower (9.26 to 4.52 lower)	1146 (19 studies)	⊕⊕⊕⊝ Moderate^2,3,4	Sensitivity analysis from studies at lower risk of bias was similar (MD ‐5.15, 95% CI ‐7.95 to ‐2.36; participants = 572; studies = 7; I² = 51%)
Change in maximal exercise (Incremental Shuttle walk test (ISWT)) Distance metres Follow‐up: median 12 weeks	Median change = 1 metre	Mean maximal exercise (incremental shuttle walk test) in the intervention groups was 39.77 metres higher (22.38 to 57.15 higher)	694 (8 studies)	⊕⊕⊕⊝ Moderate^2,3,5
Change in functional exercise capacity (6MWT) Distance metres Follow‐up: median 12 weeks	Median change = 3.4 metres	Mean functional exercise capacity (6MWT) in the intervention groups was 43.93 metres higher (32.64 to 55.21 higher)	1879 (38 studies)	⊕⊝⊝⊝ Very low^2,3,6,7
Change in maximal exercise capacity (cycle ergometer) Workmax (watt) Follow‐up: median 12 weeks	Median change = ‐0.05 watts	Mean maximal exercise capacity (cycle ergometer) in the intervention groups was 6.77 watts higher (1.89 to 11.65 higher)	779 (16 studies)	⊕⊕⊝⊝ Low^2,3,8,9
*The basis for the response on control is the median control group response across studies. CI: confidence interval; MD: mean difference.
GRADE Working Group grades of evidence. High quality: Further research is very unlikely to change our confidence in the estimate of effect. Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. Very low quality: We are very uncertain about the estimate.
¹17 studies reported random sequence generation (1 unclear), 12 reported allocation concealment, 2 did not have allocation concealment and it is unclear in 5 studies. 4 studies did not blind assessors, 11 blinded assessors and 4 were unclear as to assessor blinding. 6 studies had attrition bias greater than 20%.
²Downgraded as there is a high level of heterogeneity within the results. Several factors may impact heterogeneity, including content of the intervention programme, setting of the programme and severity of COPD.
³Greater than optimal information size (OIS). 95% confidence interval does not include "no effect," nor does the confidence limit cross the MID, so no need to downgrade.
⁴18 studies reported random sequence generation (2 unclear), 10 reported allocation concealment, 2 did not have allocation concealment and it is unclear in 7 studies. 3 studies did not blind assessors, 9 blinded assessors and 7 were unclear as to assessor blinding. 7 studies had attrition bias greater than 20%.
⁵All 8 studies reported random sequence generation, 5 reported allocation concealment and it is unclear in 3 studies. 5 studies had blind assessors with 1 not blinded, and 2 were unclear as to assessor blinding. 4 studies had attrition bias greater than 20%.
⁶34 studies reported random sequence generation, 4 were unclear, 20 reported allocation concealment, 3 did not have allocation concealment and it is unclear in 15 studies. 5 studies did not blind assessors, 19 blinded assessors and 13 were unclear as to assessor blinding. 13 studies had attrition bias greater than 20% and 2 were unclear.
⁷Downgraded as bias indicated for 6‐minute walk test: Egger: bias = 1.24304 (95% CI = 0.183967 to 2.302131; P value 0.0227). Begg‐Mazumdar: Kendall's tau = 0.16074 (P value 0.1601).
⁸All 16 studies reported random sequence generation, 6 reported allocation concealment, 3 did not have allocation concealment and it is unclear in 7 studies. 2 studies did not blind assessors, 10 blinded assessors and 4 were unclear as to assessor blinding. 4 studies had attrition bias greater than 20%.
⁹Downgraded as bias indicated for cycle ergometer test: Egger: bias = 1.57164 (95% CI = 0.6053 to 2.337984; P value 0.0036). Begg‐Mazumdar: Kendall's tau = ‐0.2666667 (P value 0.139).

Table 1. Publication bias: results of Egger and Begg‐Mazumdar Kendall's tests

CRQ Fatigue

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.22807; P value 0.1863

Egger: bias = 1.61189 (95% CI = ‐0.194745 to 3.418525); P value 0.077

CRQ Emotional

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.204678; P value 0.2378

Egger: bias = 0.997332 (95% CI = ‐0.618039 to 2.612702); P value 0.2101

CRQ Mastery

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.146199; P value 0.4063

Egger: bias = 1.531134 (95% CI = ‐0.268167 to 3.330434); P value 0.0904

CRQ Dyspnoea

(see Figure 1 for funnel plot)

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.274854; P value 0.1082

Egger: bias = 1.275427 (95% CI = ‐0.761574 to 3.312427); P value 0.204

SGRQ Total

(see Figure 2 for funnel plot)

Bias indicators

Begg‐Mazumdar: Kendall's tau = ‐0.052632; P value 0.73

Egger: bias = ‐0.459813 (95% CI = ‐2.086751 to 1.167125); P value 0.5588

SGRQ Symptoms

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.017544; P value 0.945

Egger: bias = 0.076734 (95% CI = ‐1.241745 to 1.395213); P value 0.9037

SGRQ Activity

Bias indicators

Begg‐Mazumdar: Kendall's tau = ‐0.052632; P value 0.73

Egger: bias = ‐0.336937 (95% CI = ‐2.10096 to 1.427086); P value 0.692

6MWT

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.16074; P value 0.1601

Egger: bias = 1.24304 (95% CI = 0.183967 to 2.302131); P value 0.0227

Incremental Shuttle Walk Test

Bias indicators

Begg‐Mazumdar: Kendall's tau = 0.0776906; P value 0.846

Egger: bias = ‐0.212523 (95% CI = ‐2.7776 to 2.351859); P value 0.846

Cycle Ergometer

Bias indicators

Begg‐Mazumdar: Kendall's tau = ‐0.2666667; P value 0.139
Egger: bias = 1.57164 (95% CI = 0.6053 to 2.337984); P value 0.0036

Table 2. Baseline characteristics

Study	Rehab sample size	Male	Female	Mean age (SD)	FEV₁ (SD)	Control sample size	Male	Female	Mean age (SD)	FEV₁ (SD)
Barakat 2008	35	na	na	63.7	41.9	36	na	na	65.9	43.3
Baumann 2012	37	na	na	65	45	44	na	na	63	47
Behnke 2000a	23	12	3	64.0 (1)	34.1 (7.4)	23	11	4	68.0 (2.2)	37.5 (6.6)
Bendstrup 1997	27	7	9	64 (3)	1.02 L/min (0.06)	20	7	9	65 (2)	1.04 L/min (0.07)
Booker 1984	32	na	na	66 (8)	0.85 L (0.29)	37	na	na	65 (7)	0.97 L (0.37)
Borghi‐Silva 2009	20	13	7	67 (10)	33 (9)	14	12	8	67 (10)	35 (11)
Boxall 2005	23	11	12	77.6 (7.6)	40.5 (15.9)	23	15	8	75.8 (8.1)	37.7 (15.0)
Busch 1988	7	5	2	65 (16)	26% (9)	7	6	1	66 (16)	27% (11)
Cambach 1997	15	7	8	62 (5)	59% (16)	8	6	2	62 (9)	60% (23)
Casaburi 2004	12	12	0	69 (10)	36% (9)	12	12	0	68 (9)	39% (12)
Casey 2013	178	117	61	68.8 (10.2)	57.6 (14.3)	172	106	66	68.4 (10.3)	59.7 (13.8)
Cebollero 2012	28	28	0	68 (7)	47.8 (5)	8	8	0	69 (5)	38.7 (5)
Chan 2011	69	61	8	73.6 (7.5)	91 (0.39)	67	58	9	73.6 (7.4)	89 (0.39)
Chlumsky 2001	13	12	1	63 (11)	43% (21)	6	5	1	65 (13)	51% (17)
Clark 1996	32	na	na	58 (8)	1.72 L (0.83)	16	na	na	55 (8)	1.44 L (0.59)
Cochrane 2006	74	32	42	na	na	50	18	32	na	na
Cockcroft 1981	18	18	0	61 (5)	1.53 L (0.70)	16	16	0	60 (5)	1.32 L (0.44)
De Souto Araujo 2012	21	12	9	59	39.2 (11.4) /43.9 (10.3)	11	8	3	71.1	45.1 (12.6)
Deering 2011	25	11	14	67.7 (5.3)	77.0 (19)	19	8	8	68.6 (5.5)	45.8 (18.3)
Elci 2008	39	33	6	59.67 (8.6)	47.7	39	33	6	58.08 (11.45)	46.28
Emery 1998	25	15	14	65 (6)	1.29 L (0.63)	25	12	13	67 (7)	1.02 L (0.37)
Engström 1999	26	14	12	66 (5)	31% (11)	24	12	12	67 (5)	34% (10)
Faager 2004	10	3	7	72 (9)	26 (7)	10	3	7	70 (8)	28 (6)
Faulkner 2010	10	na	na	na	na	10	na	na	na	na
Fernandez 2009	30	29	1	66 (8)	33 (10)	20	20	0	70 (5)	38 (12)
Finnerty 2001	36	25	11	70 (8)	41% (19)	29	19	10	68 (10)	41% (16)
Gohl 2006	17	6	4	62.5 (7)	53.4 (10.7)	17	7	2	53.7 (5.8)	63.2 (8.5)
Goldstein 1994	38	21	17	66 (7)	35% (15)	40	17	23	65 (8)	35% (12)
Gosselink 2000	37	31	6	60 (9)	41% (16)	33	30	3	63 (7)	43% (12)
Gottlieb 2011	35	7	15	74.1 (66–82)	64.27 (7.9)	26	7	13	73.2 (67–88)	67.05 (8.8)
Griffiths 2000	93	57	37	68 (8)	40% (16)	91	54	37	68 (8)	39% (16)
Gurgun 2013	30	28	28	64.0 (10.8)	41.9 (10.8)	16	15	1	67.8 (6.6)	39.3 (9.3)
Güell 1995	30	30	30	64 (7)	31% (12)	30	30	0	66 (6)	39% (14)
Güell 1998	18	16	2	68 (8)	32% (11)	17	17	0	66 (8)	38% (15)
Hernandez 2000	20	20	0	64 (8)	71.1 (18.9)	17	17	0	63 (7)	74.7 (14.7)
Hoff 2007	6	4	2	62.8 (1.4)	49.9 (4.6)	6	4	2	60.6 (3.0)	45.2 (6.0)
Jones 1985	8	6	2	64 (6)	0.78 L (0.27)	6	1	5	63 (8)	0.68 L (0.12)
Karapolat 2007	26	21	5	64.81 (9.4)	55.50%	19	18	1	67.21 (6.72)	58%
Lake 1990	7	6	1	66.3 (6.8)	0.83 L (0.25)	7	4	3	65.7 (3.5)	0.97 L (0.29)
Lindsay 2005	25	20	5	69.5 (9.3)	0.9 L (0.3)	25	18	7	69.8 (10.3)	0.8 L (0.4)
Liu 2012	36	26	10	61.34 (8.3)	61.27 (5.86)	36	29	7	62.2 (6.34)	61.43 (6.17)
McGavin 1977	12	12	0	61 (6)	0.97 L (0.33)	12	12	0	57 (8)	1.15 L (0.72)
McNamara 2013	38	18	15	72 (10)	60 (10)	15	8	7	70 (9)	55 (20)
Mehri 2007	20	11	9	52.1 (10.7)	na	18	7	11	52.17 (11.6)	na
Mendes De Oliveira 2010	56	46	10	66.4/71.3	47.5/ 51.5	29	19	10	70.8	41.4
Nalbant 2011	14	11	3	73.5	58.5 (48‐65)	15	13	2	68	57 (44‐66)
O'Shea 2007	27	na	na	66.9 (7)	49	27	na	na	68.4 (9.9)	52
Ozdemir 2010	25	25	0	60.9 (8.8)	54.5 (15.6)	25	25	0	64.1 (8.9)	54.1 (20.2)
Paz‐Diaz 2007	10	6	4	67 (5)	34 (11)	14	12	2	62 (7)	30 (9)
Petty 2006	149	80	69	68.8 (9.2)	na	73	40	33	66.8 (9.9)	na
Reardon 1994	10	5	5	66 (8)	35% (10)	10	5	5	66 (7)	33% (15)
Ringbaek 2000	24	1	23	62 (7)	50% (17)	21	6	15	65 (8)	44% (14)
Gomez 2006	64	39	9	64.1/64.9	74 (66.5‐81.5)	33	19	4	63.4	60.1 (55.6‐64.4)
Simpson 1992	14	5	9	73 (5)	40% (19)	14	10	4	70 (6)	39% (21)
Singh 2003	20	na	na	na	28 (7.5)	20	na	na	na	26 (7.1)
Sridhar 2008	61	30	31	69.9 (9.6)	42.9 (15.5)	61	30	31	69.68 (10.4)	48.9 (18.69)
Strijbos 1996	15	14	1	61 (6)	40% (20)	15	12	3	63 (5)	43% (9)
Theander 2009	15	3	9	66	35.1 (7.6)	15	10	4	64	32.3 (9.5)
Vallet 1994	10	7	3	60 (9)	57.2	10	8	2	58 (6)	55.7
Van Wetering 2010	102	72	30	65.9 (8.8)	58 (17)	97	69	28	67.2 (8.9)	60 (15)
Vijayan 2010	16	na	na	na	na	15	na	na	na	na
Weiner 1992	12	6	6	67 (9)	32.8 (3)	12	5	7	61 (9)	39.2 (2.8)
Wen 2008	32	31	1	67 (7)/68 (7)	46 (10)/50 (14)	9	9	0	66 (10)	52 (14)
Wijkstra 1994	28	23	5	64 (5)	44% (11)	15	14	1	62 (5)	45% (9)
Xie 2003	25	22	3	54 (6)	42% (16)	25	21	4	54 (6)	40% (17)
na: not available.

Table 3. Study design

Study	Follow‐up	Duration (weeks)	Setting	Programme type
Barakat 2008	14 weeks	14	Outpatient	Exercise + other
Baumann 2012	6 months	8	Community	Exercise + other
Behnke 2000a	3, 6 months	24	Inpatient	Exercise + other
Bendstrup 1997	12, 24 weeks	12	Outpatient	Exercise
Booker 1984	3, 6, 12 months	9	Home	Exercise + other
Borghi‐Silva 2009	6 weeks	6	Outpatient	Exercise
Boxall 2005	12 weeks	12	Home	Exercise + other
Busch 1988	18 weeks	18	Home	Exercise
Cambach 1997	3 months	12	Community	Exercise + other
Casaburi 2004	10 weeks	10	Outpatient	Exercise + other
Casey 2013	12 weeks	8	Community	Exercise + other
Cebollero 2012	12 weeks	12	Outpatient	Exercise
Chan 2011	3 months	12	Community	Exercise
Chlumsky 2001	8 weeks	8	Outpatient	Exercise
Clark 1996	12 weeks	12	Home	Exercise
Cochrane 2006	6 weeks, 6 months, 12 months	6	Outpatient	Exercise + other
Cockcroft 1981	2, 6 months	6	Outpatient	Exercise
De Souto Araujo 2012	8 weeks	8	Community	Exercise
Deering 2011	8 weeks	7	Outpatient	Exercise + other
Elci 2008	1, 3 months	12	Community /Home	Exercise + other
Emery 1998	10 weeks	10	Outpatient	Exercise + other
Engström 1999	12 months	52	Outpatient /Home	Exercise + other
Faager 2004	8 weeks, 6 months	8	Inpatient /Home	Exercise + other
Faulkner 2010	week 9	8	Community	Exercise + other
Fernandez 2009	1 year	52	Home	Exercise + other
Finnerty 2001	12, 24 weeks	6	Outpatient	Exercise + other
Gohl 2006	12 months	52	Community	Exercise
Goldstein 1994	24 weeks	8	Inpatient	Exercise + other
Gosselink 2000	6, 18 months	24	Outpatient	Exercise
Gottlieb 2011	6 months	7	Community	Exercise + other
Griffiths 2000	1 year	6	Outpatient /Home	Exercise + other
Gomez 2006	3, 6 months	12	Community	Exercise + other
Güell 1995	3, 6, 9, 12, 18, 24 months	12	Outpatient /Home	Exercise
Güell 1998	8 weeks	8	Outpatient	Exercise
Gurgun 2013	8 weeks, 6 months	8	Outpatient	Exercise + other
Hernandez 2000	12 weeks	12	Home	Exercise
Hoff 2007	8 weeks	8	Outpatient	Exercise
Jones 1985	10 weeks	10	Home	Exercise
Karapolat 2007	8, 12 weeks	8	Outpatient	Exercise + other
Lake 1990	8 weeks	8	Outpatient	Exercise
Lindsay 2005	6 weeks, 3 months	6	Community	Exercise + other
Liu 2012	6 months	24	Inpatient /Home	Exercise
McGavin 1977	14 weeks	?12	Home	Exercise
McNamara 2013	8 weeks	8	Outpatient	Exercise
Mehri 2007	4 weeks	4	Outpatient	Exercise
Mendes De Oliveira 2010	12 weeks	12	Outpatient /Home	Exercise + other
Nalbant 2011	3, 6 months	24	Nursing home	Exercise + other
O'Shea 2007	3, 6 months	12	Outpatient /Home	Exercise
Ozdemir 2010	1 month	4	Outpatient	Exercise
Paz‐Diaz 2007	8 weeks	8	Outpatient	Exercise
Petty 2006	8 weeks	8	Home	Exercise + other
Reardon 1994	6 weeks	6	Outpatient	Exercise + other
Ringbaek 2000	8 weeks	8	Outpatient	Exercise + other
Simpson 1992	8 weeks	8	Outpatient	Exercise
Singh 2003	4 weeks	4	Home	Exercise
Sridhar 2008	6 months	6	Outpatient /Home	Exercise + other
Strijbos 1996	3, 6, 12, 18 months	12	Outpatient	Exercise + other
Theander 2009	12 weeks	12	Outpatient /Home	Exercise + other
Vallet 1994	8 weeks	8	Inpatient	Exercise
Van Wetering 2010	4 months	12	Community	Exercise + other
Vijayan 2010	Unclear	6	Unclear	Exercise
Weiner 1992	6 months	24	Outpatient	Exercise
Wen 2008	12 weeks	12	Outpatient	Exercise
Wijkstra 1994	12 weeks	12	Outpatient /Home	Exercise + other
Xie 2003	12 weeks	12	Home	Exercise

Table 3. Study design

Table 4. Summary of subgroup analysis

Pulmonary rehabilitation versus usual care. Subgroup: community versus hospital‐delivered programme
Outcome	Subscale	Subgroups	Heterogeneity	MD [95% CI]	Test for subgroup differences
CRQ	Fatigue	Community	Tau² = 0.10; I² = 52%	0.44 [0.14, 0.75]	Chi² = 3.98, df = 1 (P value 0.05), I² = 74.9%
CRQ	Fatigue	Hospital	Tau² = 0.09; I² = 51%	0.86 [0.58, 1.14]	Chi² = 3.98, df = 1 (P value 0.05), I² = 74.9%
CRQ	Emotional Function	Community	Tau² = 0.00; I² = 0%	0.21 [0.04, 0.39]	Chi² = 12.24, df = 1 (P value 0.0005), I² = 91.8%
CRQ	Emotional Function	Hospital	Tau² = 0.06; I² = 39%	0.77 [0.51, 1.03]	Chi² = 12.24, df = 1 (P value 0.0005), I² = 91.8%
CRQ	Mastery	Community	Tau² = 0.07; I² = 45%	0.40 [0.12, 0.67]	Chi² = 8.58, df = 1 (P value 0.003), I² = 88.3%
CRQ	Mastery	Hospital	Tau² = 0.05; I² = 31%	0.95 [0.70, 1.20]	Chi² = 8.58, df = 1 (P value 0.003), I² = 88.3%
CRQ	Dyspnoea	Community	Tau² = 0.03; I² = 26%	0.58 [0.34, 0.81]	Chi² = 4.05, df = 1 (P value 0.04), I² = 75.3%
CRQ	Dyspnoea	Hospital	Tau² = 0.17; I² = 60%	0.99 [0.66, 1.32]	Chi² = 4.05, df = 1 (P value 0.04), I² = 75.3%
SGRQ	Total	Community	Tau² = 24.00; I² = 73%	‐8.15 [‐12.16, ‐4.13]	Chi² = 0.69, df = 1 (P value 0.41), I² = 0%
SGRQ	Total	Hospital	Tau² = 6.41; I² = 35%	‐6.05 [‐8.91, ‐3.20]	Chi² = 0.69, df = 1 (P value 0.41), I² = 0%
SGRQ	Symptoms	Community	Tau² = 6.28; I² = 24%	‐3.66 [‐7.07, ‐0.24]	Chi² = 1.65, df = 1 (P value 0.20), I² = 39.2%
SGRQ	Symptoms	Hospital	Tau² = 4.96; I² = 15%	‐6.91 [‐10.51, ‐3.30]	Chi² = 1.65, df = 1 (P value 0.20), I² = 39.2%
SGRQ	Impact	Community	Tau² = 19.91; I² = 63%	‐8.17 [‐12.00, ‐4.34]	Chi² = 0.46, df = 1 (P value 0.50), I² = 0%
SGRQ	Impact	Hospital	Tau² = 22.39; I² = 58%	‐6.21 [‐10.33, ‐2.09]	Chi² = 0.46, df = 1 (P value 0.50), I² = 0%
SGRQ	Activity	Community	Tau² = 48.91; I² = 78%	‐7.82 [‐13.37, ‐2.28]	Chi² = 0.93, df = 1 (P value 0.33), I² = 0%
SGRQ	Activity	Hospital	Tau² = 10.45; I² = 36%	‐4.58 [‐8.16, ‐1.00]	Chi² = 0.93, df = 1 (P value 0.33), I² = 0%
Pulmonary rehabilitation versus usual care. Subgroup: exercise only programme versus exercise plus additional elements in programme
Outcome	Subscale	Subgroups	Heterogeneity	MD [95% CI]	Test for subgroup differences
CRQ	Fatigue	Exercise only	Tau² = 0.00; I² = 0%	0.73 [0.54, 0.92]	Chi² = 0.26, df = 1 (P value 0.61), I² = 0%
CRQ	Fatigue	Exercise + other	Tau² = 0.29; I² = 79%	0.61 [0.18, 1.03]	Chi² = 0.26, df = 1 (P value 0.61), I² = 0%
CRQ	Emotional Function	Exercise only	Tau² = 0.00; I² = 0%	0.51 [0.31, 0.71]	Chi² = 0.09, df = 1 (P value 0.77), I² = 0%
CRQ	Emotional Function	Exercise + other	Tau² = 0.28; I² = 79%	0.58 [0.16, 1.00]	Chi² = 0.09, df = 1 (P value 0.77), I² = 0%
CRQ	Mastery	Exercise only	Tau² = 0.01; I² = 11%	0.66 [0.44, 0.88]	Chi² = 0.12, df = 1 (P value 0.73), I² = 0%
CRQ	Mastery	Exercise + other	Tau² = 0.31; I² = 79%	0.74 [0.31, 1.18]	Chi² = 0.12, df = 1 (P value 0.73), I² = 0%
CRQ	Dyspnoea	Exercise only	Tau² = 0.06; I² = 31%	0.83 [0.56, 1.10]	Chi² = 0.13, df = 1 (P value 0.72), I² = 0%
CRQ	Dyspnoea	Exercise + other	Tau² = 0.25; I² = 77%	0.74 [0.35, 1.13]	Chi² = 0.13, df = 1 (P value 0.72), I² = 0%
SGRQ	Total	Exercise only	Tau² = 62.83; I² = 70%	‐7.87 [‐16.72, 0.98]	Chi² = 0.06, df = 1 (P value 0.81), I² = 0%
SGRQ	Total	Exercise + other	Tau² = 10.17; I² = 56%	‐6.76 [‐9.19, ‐4.34]	Chi² = 0.06, df = 1 (P value 0.81), I² = 0%
SGRQ	Symptoms	Exercise only	Tau² = 0.00; I² = 0%	‐7.38 [‐12.33, ‐2.44]	Chi² = 0.99, df = 1 (P value 0.32), I² = 0%
SGRQ	Symptoms	Exercise + other	Tau² = 13.88; I² = 41%	‐4.38 [‐7.62, ‐1.15]	Chi² = 0.99, df = 1 (P value 0.32), I² = 0%
SGRQ	Impact	Exercise only	Tau² = 33.34; I² = 63%	‐6.11 [‐12.60, 0.38]	Chi² = 0.17, df = 1 (P value 0.68), I² = 0%
SGRQ	Impact	Exercise + other	Tau² = 17.12; I² = 59%	‐7.61 [‐10.64, ‐4.57]	Chi² = 0.17, df = 1 (P value 0.68), I² = 0%
SGRQ	Activity	Exercise only	Tau² = 139.67; I² = 78%	‐9.33 [‐21.66, 2.99]	Chi² = 0.30, df = 1 (P value 0.59), I² = 0%
SGRQ	Activity	Exercise + other	Tau² = 18.51; I² = 60%	‐5.79 [‐8.95, ‐2.64]	Chi² = 0.30, df = 1 (P value 0.59), I² = 0%
CRQ: Chronic Respiratory Disease Questionnaire; MD: mean difference; SGRQ: St. George's Respiratory Questionnaire.
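The "Test for subgroup differences" column can be approximately reproduced from the two subgroup estimates alone. The following is a minimal sketch, not the review's own software: it assumes normal-based 95% confidence intervals (so SE = CI width / (2 × 1.96)), exactly two subgroups (df = 1), and the usual definitions Q = (MD₁ − MD₂)² / (SE₁² + SE₂²) and I² = (Q − df) / Q. The function names are hypothetical. Because the inputs are the rounded values printed in the table, the result only approximates the reported statistics.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error implied by a normal-based 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def subgroup_difference(md_a, ci_a, md_b, ci_b):
    """Chi² (Q), P value and I² for the difference between two subgroups.

    md_a, md_b: pooled mean differences of the two subgroups.
    ci_a, ci_b: (lower, upper) 95% confidence limits of each subgroup.
    """
    var_a = se_from_ci(*ci_a) ** 2
    var_b = se_from_ci(*ci_b) ** 2
    q = (md_a - md_b) ** 2 / (var_a + var_b)
    df = 1
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(q / 2))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, p_value, i2


# CRQ Fatigue, community versus hospital, using the values from Table 4.
q, p_value, i2 = subgroup_difference(0.44, (0.14, 0.75), 0.86, (0.58, 1.14))
print(f"Chi² = {q:.2f}, P = {p_value:.3f}, I² = {i2:.1f}%")
```

For the CRQ Fatigue rows this gives values close to the reported Chi² = 3.98, P value 0.05 and I² = 74.9%; small discrepancies are expected from rounding of the published confidence limits.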

Comparison 1. Rehabilitation versus usual care

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 QoL ‐ Change in CRQ (Fatigue)	19	1291	Mean Difference (IV, Random, 95% CI)	0.68 [0.45, 0.92]
2 QoL ‐ Change in CRQ (Emotional Function)	19	1291	Mean Difference (IV, Random, 95% CI)	0.56 [0.34, 0.78]
3 QoL ‐ Change in CRQ (Mastery)	19	1212	Mean Difference (IV, Random, 95% CI)	0.71 [0.47, 0.95]
4 QoL ‐ Change in CRQ (Dyspnoea)	19	1283	Mean Difference (IV, Random, 95% CI)	0.79 [0.56, 1.03]
5 QoL ‐ Change in SGRQ (Total)	19	1146	Mean Difference (IV, Random, 95% CI)	‐6.89 [‐9.26, ‐4.52]
6 QoL ‐ Change in SGRQ (Symptoms)	19	1153	Mean Difference (IV, Random, 95% CI)	‐5.09 [‐7.69, ‐2.49]
7 QoL ‐ Change in SGRQ (Impacts)	19	1149	Mean Difference (IV, Random, 95% CI)	‐7.23 [‐9.91, ‐4.55]
8 QoL ‐ Change in SGRQ (Activity)	19	1148	Mean Difference (IV, Random, 95% CI)	‐6.08 [‐9.28, ‐2.88]
9 Maximal Exercise (Incremental shuttle walk test)	8	694	Mean Difference (IV, Random, 95% CI)	39.77 [22.38, 57.15]
10 Maximal Exercise Capacity (cycle ergometer)	16	779	Mean Difference (IV, Random, 95% CI)	6.77 [1.89, 11.65]
11 Functional Exercise Capacity (6MWT)	38	1879	Mean Difference (IV, Random, 95% CI)	43.93 [32.64, 55.21]
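The pooled quality-of-life estimates can be read against the minimal important differences stated in the summary of findings (0.5 units for the CRQ, 4 units for the SGRQ). The sketch below is illustrative only: the function name and the wording of the categories are hypothetical, and applying the 0.5-unit threshold to every CRQ domain is an assumption, since the summary of findings states it for the dyspnoea domain.

```python
def classify_against_mid(md, lower, upper, mid, lower_is_better=False):
    """Compare a mean difference and its 95% CI with a minimal important difference.

    For scales where lower scores are better (e.g. SGRQ), the signs are
    flipped so that a positive value always means improvement.
    """
    if lower_is_better:
        md, lower, upper = -md, -upper, -lower
    if lower >= mid:
        return "whole CI beyond MID"
    if md >= mid:
        return "point estimate beyond MID, CI crosses it"
    return "point estimate below MID"


# Values from Comparison 1; thresholds from the summary of findings.
print(classify_against_mid(0.79, 0.56, 1.03, mid=0.5))          # CRQ Dyspnoea
print(classify_against_mid(0.68, 0.45, 0.92, mid=0.5))          # CRQ Fatigue (assumed MID)
print(classify_against_mid(-6.89, -9.26, -4.52, mid=4,
                           lower_is_better=True))               # SGRQ Total
```

Under these assumptions, the CRQ dyspnoea and SGRQ total intervals lie entirely beyond their thresholds, whereas the CRQ fatigue interval has a lower limit (0.45) just below 0.5.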

Comparison 2. Rehabilitation versus usual care (subgroup analysis hospital vs community)

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 QoL ‐ Change in CRQ (Fatigue)	19	1291	Mean Difference (IV, Random, 95% CI)	0.68 [0.45, 0.92]
1.1 QoL ‐ Community CRQ (Fatigue)	9	648	Mean Difference (IV, Random, 95% CI)	0.44 [0.14, 0.75]
1.2 QoL ‐ Hospital CRQ (Fatigue)	10	643	Mean Difference (IV, Random, 95% CI)	0.86 [0.58, 1.14]
2 QoL ‐ Change in CRQ (Emotional Function)	19	1291	Mean Difference (IV, Random, 95% CI)	0.56 [0.34, 0.78]
2.1 QoL ‐ Community CRQ (Emotional Function)	9	648	Mean Difference (IV, Random, 95% CI)	0.21 [0.04, 0.39]
2.2 QoL ‐ Hospital CRQ (Emotional Function)	10	643	Mean Difference (IV, Random, 95% CI)	0.77 [0.51, 1.03]
3 QoL ‐ Change in CRQ (Mastery)	19	1212	Mean Difference (IV, Random, 95% CI)	0.71 [0.47, 0.95]
3.1 QoL ‐ Community CRQ (Mastery)	9	569	Mean Difference (IV, Random, 95% CI)	0.40 [0.12, 0.67]
3.2 QoL ‐ Hospital CRQ (Mastery)	10	643	Mean Difference (IV, Random, 95% CI)	0.95 [0.70, 1.20]
4 QoL ‐ Change in CRQ (Dyspnoea)	19	1283	Mean Difference (IV, Random, 95% CI)	0.82 [0.59, 1.05]
4.1 QoL ‐ Community Based CRQ (Dyspnoea)	8	633	Mean Difference (IV, Random, 95% CI)	0.58 [0.34, 0.81]
4.2 QoL ‐ Hospital Based CRQ (Dyspnoea)	11	650	Mean Difference (IV, Random, 95% CI)	0.99 [0.66, 1.32]
5 QoL ‐ Change in SGRQ (Total)	19	1146	Mean Difference (IV, Random, 95% CI)	‐6.89 [‐9.26, ‐4.52]
5.1 QoL ‐ Community SGRQ (Total)	9	643	Mean Difference (IV, Random, 95% CI)	‐8.15 [‐12.16, ‐4.13]
5.2 QoL ‐ Hospital SGRQ (Total)	10	503	Mean Difference (IV, Random, 95% CI)	‐6.05 [‐8.91, ‐3.20]
6 QoL ‐ Change in SGRQ (Symptoms)	19	1153	Mean Difference (IV, Random, 95% CI)	‐5.09 [‐7.69, ‐2.49]
6.1 QoL ‐ Community SGRQ (Symptoms)	9	649	Mean Difference (IV, Random, 95% CI)	‐3.66 [‐7.07, ‐0.24]
6.2 QoL ‐ Hospital SGRQ (Symptoms)	10	504	Mean Difference (IV, Random, 95% CI)	‐6.91 [‐10.51, ‐3.30]
7 QoL ‐ Change in SGRQ (Impacts)	19	1149	Mean Difference (IV, Random, 95% CI)	‐7.23 [‐9.91, ‐4.55]
7.1 QoL ‐ Community SGRQ (Impacts)	9	646	Mean Difference (IV, Random, 95% CI)	‐8.17 [‐12.00, ‐4.34]
7.2 QoL ‐ Hospital SGRQ (Impacts)	10	503	Mean Difference (IV, Random, 95% CI)	‐6.21 [‐10.33, ‐2.09]
8 QoL ‐ Change in SGRQ (Activity)	19	1148	Mean Difference (IV, Random, 95% CI)	‐6.08 [‐9.28, ‐2.88]
8.1 QoL ‐ Community SGRQ (Activity)	9	645	Mean Difference (IV, Random, 95% CI)	‐7.82 [‐13.37, ‐2.28]
8.2 QoL ‐ Hospital SGRQ (Activity)	10	503	Mean Difference (IV, Random, 95% CI)	‐4.58 [‐8.16, ‐1.00]

Comparison 3. Rehabilitation versus usual care (subgroup analysis exercise only vs exercise and other)

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 QoL ‐ Change in CRQ (Fatigue)	19	1291	Mean Difference (IV, Random, 95% CI)	0.68 [0.45, 0.92]
1.1 QoL ‐ Exercise Only CRQ (Fatigue)	10	480	Mean Difference (IV, Random, 95% CI)	0.73 [0.54, 0.92]
1.2 QoL ‐ Exercise + Other CRQ (Fatigue)	9	811	Mean Difference (IV, Random, 95% CI)	0.61 [0.18, 1.03]
2 QoL ‐ Change in CRQ (Emotional Function)	19	1291	Mean Difference (IV, Random, 95% CI)	0.56 [0.34, 0.78]
2.1 QoL ‐ Exercise Only CRQ (Emotional Function)	10	480	Mean Difference (IV, Random, 95% CI)	0.51 [0.31, 0.71]
2.2 QoL ‐ Exercise + Other CRQ (Emotional Function)	9	811	Mean Difference (IV, Random, 95% CI)	0.58 [0.16, 1.00]
3 QoL ‐ Change in CRQ (Mastery)	19	1212	Mean Difference (IV, Random, 95% CI)	0.71 [0.47, 0.95]
3.1 QoL ‐ Exercise Only CRQ (Mastery)	10	480	Mean Difference (IV, Random, 95% CI)	0.66 [0.44, 0.88]
3.2 QoL ‐ Exercise + Other CRQ (Mastery)	9	732	Mean Difference (IV, Random, 95% CI)	0.74 [0.31, 1.18]
4 QoL ‐ Change in CRQ (Dyspnoea)	19	1283	Mean Difference (IV, Random, 95% CI)	0.79 [0.56, 1.03]
4.1 QoL ‐ Exercise Only CRQ (Dyspnoea)	10	474	Mean Difference (IV, Random, 95% CI)	0.83 [0.56, 1.09]
4.2 QoL ‐ Exercise + Other CRQ (Dyspnoea)	9	809	Mean Difference (IV, Random, 95% CI)	0.74 [0.35, 1.13]
5 QoL ‐ Change in SGRQ (Total)	19	1146	Mean Difference (IV, Random, 95% CI)	‐6.89 [‐9.26, ‐4.52]
5.1 QoL ‐ Exercise Only SGRQ (Total)	5	230	Mean Difference (IV, Random, 95% CI)	‐7.87 [‐16.72, 0.98]
5.2 QoL ‐ Exercise + Other SGRQ (Total)	14	916	Mean Difference (IV, Random, 95% CI)	‐6.76 [‐9.19, ‐4.34]
6 QoL ‐ Change in SGRQ (Symptoms)	19	1153	Mean Difference (IV, Random, 95% CI)	‐5.09 [‐7.69, ‐2.49]
6.1 QoL ‐ Exercise Only SGRQ (Symptoms)	5	230	Mean Difference (IV, Random, 95% CI)	‐7.38 [‐12.33, ‐2.44]
6.2 QoL ‐ Exercise + Other SGRQ (Symptoms)	14	923	Mean Difference (IV, Random, 95% CI)	‐4.38 [‐7.62, ‐1.15]
7 QoL ‐ Change in SGRQ (Impacts)	19	1149	Mean Difference (IV, Random, 95% CI)	‐7.23 [‐9.91, ‐4.55]
7.1 QoL ‐ Exercise Only SGRQ (Impacts)	5	230	Mean Difference (IV, Random, 95% CI)	‐6.11 [‐12.60, 0.38]
7.2 QoL ‐ Exercise + Other SGRQ (Impacts)	14	919	Mean Difference (IV, Random, 95% CI)	‐7.61 [‐10.64, ‐4.57]
8 QoL ‐ Change in SGRQ (Activity)	19	1148	Mean Difference (IV, Random, 95% CI)	‐6.08 [‐9.28, ‐2.88]
8.1 QoL ‐ Exercise Only SGRQ (Activity)	5	230	Mean Difference (IV, Random, 95% CI)	‐9.33 [‐21.66, 2.99]
8.2 QoL ‐ Exercise + Other SGRQ (Activity)	14	918	Mean Difference (IV, Random, 95% CI)	‐5.79 [‐8.95, ‐2.64]

Comparison 4. Rehabilitation versus usual care (sensitivity analysis by allocation concealment and incomplete outcome)

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 QoL ‐ Change in CRQ (Dyspnoea)	5	384	Mean Difference (IV, Random, 95% CI)	0.99 [0.64, 1.34]
1.1 QoL ‐ Low Risk CRQ (Dyspnoea)	5	384	Mean Difference (IV, Random, 95% CI)	0.99 [0.64, 1.34]
2 QoL ‐ Change in CRQ (Emotional Function)	5	386	Mean Difference (IV, Random, 95% CI)	0.60 [0.09, 1.11]
2.1 QoL ‐ Low Risk CRQ (Emotional Function)	5	386	Mean Difference (IV, Random, 95% CI)	0.60 [0.09, 1.11]
3 QoL ‐ Low Risk CRQ (Fatigue)	5	386	Mean Difference (IV, Random, 95% CI)	0.90 [0.41, 1.39]
4 QoL ‐ Low Risk CRQ (Mastery)	5	386	Mean Difference (IV, Random, 95% CI)	0.77 [0.28, 1.26]
5 QoL ‐ Low Risk SGRQ (Total)	7	572	Mean Difference (IV, Random, 95% CI)	‐5.15 [‐7.95, ‐2.36]
6 QoL ‐ Low Risk SGRQ (Symptoms)	7	572	Mean Difference (IV, Random, 95% CI)	‐4.12 [‐8.45, 0.21]
7 QoL ‐ Low Risk SGRQ (Impacts)	7	572	Mean Difference (IV, Random, 95% CI)	‐5.92 [‐10.01, ‐1.82]
8 QoL ‐ Low Risk SGRQ (Activity)	7	572	Mean Difference (IV, Random, 95% CI)	‐5.33 [‐8.10, ‐2.57]