
Exercise for osteoarthritis of the knee


Abstract


Background

Knee osteoarthritis (OA) is a major public health issue because it causes chronic pain, reduces physical function and diminishes quality of life. Ageing of the population and increased global prevalence of obesity are anticipated to dramatically increase the prevalence of knee OA and its associated impairments. No cure for knee OA is known, but exercise therapy is among the dominant non‐pharmacological interventions recommended by international guidelines.

Objectives

To determine whether land‐based therapeutic exercise is beneficial for people with knee OA in terms of reduced joint pain or improved physical function and quality of life.

Search methods

Five electronic databases were searched, up until May 2013.

Selection criteria

All randomised controlled trials (RCTs) randomly assigning individuals and comparing groups treated with some form of land‐based therapeutic exercise (as opposed to exercise conducted in the water) with a non‐exercise group or a non‐treatment control group.

Data collection and analysis

Three teams of two review authors independently extracted data, assessed risk of bias for each study and assessed the quality of the body of evidence for each outcome using the GRADE (Grades of Recommendation, Assessment, Development and Evaluation) approach. We conducted analyses on continuous outcomes (pain, physical function and quality of life) immediately after treatment and on dichotomous outcomes (proportion of study withdrawals) at the end of the study; we also conducted analyses on the sustained effects of exercise on pain and function (two to six months, and longer than six months).

Main results

In total, we extracted data from 54 studies. Overall, 19 (20%) studies reported adequate random sequence generation and allocation concealment and adequately accounted for incomplete outcome data; we considered these studies to have an overall low risk of bias. Studies were largely free from selection bias, but research results may be vulnerable to performance and detection bias, as only four of the RCTs reported blinding of participants to treatment allocation, and, although most RCTs reported blinded outcome assessment, pain, physical function and quality of life were participant self‐reported.

High‐quality evidence from 44 trials (3537 participants) indicates that exercise reduced pain (standardised mean difference (SMD) ‐0.49, 95% confidence interval (CI) ‐0.39 to ‐0.59) immediately after treatment. Pain was estimated at 44 points on a 0 to 100‐point scale (0 indicated no pain) in the control group; exercise reduced pain by an equivalent of 12 points (95% CI 10 to 15 points). Moderate‐quality evidence from 44 trials (3913 participants) showed that exercise improved physical function (SMD ‐0.52, 95% CI ‐0.39 to ‐0.64) immediately after treatment. Physical function was estimated at 38 points on a 0 to 100‐point scale (0 indicated no loss of physical function) in the control group; exercise improved physical function by an equivalent of 10 points (95% CI 8 to 13 points). High‐quality evidence from 13 studies (1073 participants) revealed that exercise improved quality of life (SMD 0.28, 95% CI 0.15 to 0.40) immediately after treatment. Quality of life was estimated at 43 points on a 0 to 100‐point scale (100 indicated best quality of life) in the control group; exercise improved quality of life by an equivalent of 4 points (95% CI 2 to 5 points).

High‐quality evidence from 45 studies (4607 participants) showed a comparable likelihood of withdrawal from exercise allocation (event rate 14%) compared with the control group (event rate 15%), and this difference was not significant: odds ratio (OR) 0.93 (95% CI 0.75 to 1.15). Eight studies reported adverse events, all of which were related to increased knee or low back pain attributed to the exercise intervention provided. No study reported a serious adverse event.

In addition, 12 included studies provided two to six‐month post‐treatment sustainability data on 1468 participants for knee pain and on 1279 (10 studies) participants for physical function. These studies indicated sustainability of treatment effect for pain (SMD ‐0.24, 95% CI ‐0.35 to ‐0.14), with an equivalent reduction of 6 (3 to 9) points on 0 to 100‐point scale, and of physical function (SMD ‐0.15 95% CI ‐0.26 to ‐0.04), with an equivalent improvement of 3 (1 to 5) points on 0 to 100‐point scale.

Marked variability was noted across included studies among participants recruited, symptom duration, exercise interventions assessed and important aspects of study methodology. Individually delivered programmes tended to result in greater reductions in pain and improvements in physical function, compared to class‐based exercise programmes or home‐based programmes; however between‐study heterogeneity was marked within the individually provided treatment delivery subgroup.

Authors' conclusions

High‐quality evidence indicates that land‐based therapeutic exercise provides short‐term benefit that is sustained for at least two to six months after cessation of formal treatment in terms of reduced knee pain, and moderate‐quality evidence shows improvement in physical function among people with knee OA. The magnitude of the treatment effect would be considered moderate (immediate) to small (two to six months) but comparable with estimates reported for non‐steroidal anti‐inflammatory drugs. Confidence intervals around demonstrated pooled results for pain reduction and improvement in physical function do not exclude a minimal clinically important treatment effect. Since the participants in most trials were aware of their treatment, this may have contributed to their improvement. Despite the lack of blinding we did not downgrade the quality of evidence for risk of performance or detection bias. This reflects our belief that further research in this area is unlikely to change the findings of our review.


Plain language summary

Exercise for osteoarthritis of the knee

Background: what is osteoarthritis of the knee and what is exercise?

Osteoarthritis (OA) is a disease that affects joints such as the knee. When a joint loses cartilage, the bone grows to try to repair the damage. Instead of repairing it, however, the bone grows abnormally and makes things worse. For example, the bone can become misshapen, making the joint painful and unstable. Doctors used to think that OA resulted simply from wearing away of the cartilage. It is now regarded as a disease of the whole joint.

Exercise can be any activity that enhances or maintains muscle strength, physical fitness and overall health. People exercise for many reasons, including to lose weight, to strengthen muscles and to relieve the symptoms of OA.

Study characteristics

This summary of an updated Cochrane review presents what we know from research about the effects of exercise for people with OA of the knee. After searching for all relevant studies up to May 2013, we added 23 new studies since the previous version of the review, giving 54 studies (3913 participants), most involving people with mild to moderate knee OA. Except for one study in which participants enrolled in a Tai Chi programme, participants undertook land‐based exercise programmes consisting of traditional muscle strengthening, functional training and aerobic fitness, which were either individually supervised or delivered as part of a group; they were compared with people who did not exercise. Evidence from 44 studies (3537 participants) shows the effects of exercise immediately after treatment; 12 studies provided data on sustainability two to six months after treatment. Here we report only the results obtained immediately after the end of the treatment period.

Key results

Pain on a scale of 0 to 100 points (lower scores mean less pain):

• People who completed an exercise programme rated their pain 12 points lower (10 to 15 points lower) at the end of treatment than people who did not exercise (12% absolute improvement).

• People who completed an exercise programme rated their pain at 32 points.

• People who did not exercise rated their pain at 44 points.

Physical function on a scale of 0 to 100 points (lower scores mean better physical function):

• People who completed an exercise programme rated their physical function 10 points lower (8 to 13 points lower) at the end of treatment than people who did not exercise (10% absolute improvement).

• People who completed an exercise programme rated their physical function at 28 points.

• People who did not exercise rated their physical function at 38 points.

Quality of life on a scale of 0 to 100 points (higher scores mean better quality of life):

• Overall, people who completed an exercise programme rated their quality of life 4 points higher (2 to 5 points higher) at the end of treatment (4% absolute improvement).

• People who completed an exercise programme rated their quality of life at 47 points.

• People who did not exercise rated their quality of life at 43 points.

Withdrawals

• One fewer person out of 100 dropped out of the exercise programme (1% absolute reduction).

• 14 out of 100 people who exercised dropped out.

• 15 out of 100 people who did not exercise dropped out.

Quality of the evidence

High‐quality evidence shows that, among people with knee OA, exercise moderately reduced pain and slightly improved quality of life immediately after the end of treatment, with no increase in dropouts. Further research is unlikely to change these estimates.

Moderate‐quality evidence indicates that exercise moderately improved physical function immediately after the end of treatment. Further research may change these estimates.

Most clinical studies provided no precise information on side effects such as injuries or falls during exercise, but we expect these to be rare. Eight studies reported increased knee or low back pain attributed to the exercise programme. None of the included studies reported any injuries.

Authors' conclusions

Implications for practice

High‐quality evidence suggests that land‐based therapeutic exercise provides benefit in terms of reduced knee pain and improved quality of life, and moderate‐quality evidence suggests improved physical function, among people with knee OA. Since the participants in most trials were aware of their treatment, this may have contributed to their improvement. Despite the lack of blinding we did not downgrade the quality of evidence for risk of performance or detection bias. This reflects our belief that further research in this area is unlikely to change the findings of our review.

Healthcare professionals and people with OA can be reassured that any type of exercise programme that is done regularly and is closely monitored by healthcare professionals can improve pain and physical function related to knee OA in the short term. This allows a great deal of choice, ranging from individual physiotherapy‐led sessions and exercise classes to home‐based programmes. Exercise programmes that were individually provided appeared to be associated with greater improvements in knee pain and physical function.

Results of this meta‐analysis are restricted to evaluation of symptomatic benefits. Regular exercise has the potential to modify structural disease progression among people with knee OA, but this was not evaluated in this review and remains an unanswered question in the literature.

Implications for research

Treatment effect size for many of the studies was modest. Multi‐faceted interventions that incorporate exercise strategies into patient care may provide greater benefit and should be tested.

  1. Identify possible predictors of patient responsiveness to therapeutic exercise, such as radiographic disease severity, symptom duration, outcomes expectancy, psychological well being, obesity, knee stability, etc.

  2. Develop multi‐armed placebo‐controlled randomised clinical trials to help provide evidence of optimal exercise content and dosage.

  3. Initiate research to assess the long‐term effectiveness of exercise for people with knee OA in terms of structural disease progression.

Summary of findings

Summary of findings for the main comparison. Immediate post‐treatment effects of exercise for osteoarthritis of the knee

Patient or population: patients with knee OA
Settings: clinic or community
Intervention: land‐based exercise
Comparison: no exercise

| Outcomes | Illustrative comparative risks* (95% CI): assumed risk, no exercise | Illustrative comparative risks* (95% CI): corresponding risk, land‐based exercise | Relative effect (95% CI) | Number of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Pain. Self‐report questionnaires. Scale from 0‐100 (0 represents no pain) | Mean pain in the control groups was 44 points | Mean pain in intervention groups was 0.49 standard deviations lower (0.39‐0.59 lower). This translates to an absolute mean reduction of 12 (10‐15) points compared with control group on a 0‐100 scale [a] | | 3537 (44 studies) | ⊕⊕⊕⊕ High | SMD ‐0.49 (‐0.39 to ‐0.59). Absolute reduction in pain 12% (10%‐15%); relative change 27% (21%‐32%) [a]. NNTB 4 (3‐5) [b] |
| Physical function. Self‐report questionnaire. Scale from 0‐100 (0 represents no physical disability) | Mean physical function in control groups was 38 points | Mean physical function in intervention groups was 0.52 standard deviations lower (0.39‐0.64 lower). This translates to an absolute mean improvement of 10 (8‐13) points on a 0‐100 scale [c] | | 3913 (44 studies) | ⊕⊕⊕⊝ Moderate [d] | SMD ‐0.52 (‐0.39 to ‐0.64). Absolute improvement 10% (8%‐13%); relative improvement 26% (20%‐32%) [c]. NNTB 4 (3‐5) [b] |
| Quality of life. Self‐report questionnaire. Scale from 0‐100 (100 is maximum quality of life) | Mean quality of life in control groups was 43 points | Mean quality of life in intervention groups was 0.28 standard deviations higher (0.15‐0.4 higher). This translates to an absolute improvement of 4 (2‐5) points on a 0‐100 scale [e] | | 1073 (13 studies) | ⊕⊕⊕⊕ High | SMD 0.28 (0.15‐0.40). Absolute improvement 4% (2%‐5%); relative improvement 9% (5%‐13%) [e]. NNTB 8 (5‐14) [b] |
| Study withdrawals or dropouts | 153 per 1000 | 137 per 1000 | | 4607 (44 studies) | ⊕⊕⊕⊕ High | OR 0.93 (0.75‐1.15). Absolute risk reduction: 1% fewer events with exercise (2% fewer to 2% more); relative risk reduction 6% fewer events with exercise (21% fewer to 12% more). NNTH n/a [b] |

*The basis for the assumed risk (e.g. median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; GRADE: Grades of Recommendation, Assessment, Development and Evaluation; KOOS: Knee Osteoarthritis Outcome Scale; NNTB: Number needed to treat for an additional beneficial outcome; NNTH: Number needed to treat for an additional harmful outcome; SMD: Standardised mean difference.

GRADE Working Group grades of evidence.
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

[a] Calculations based on the control group baseline mean (SD) pain: 44.3 (24.4) points on 0‐100 scale (from Yip 2007).

[b] Number needed to treat for an additional beneficial outcome (NNTB) or harmful outcome (NNTH) not applicable (n/a) when result was not statistically significant. Number needed to treat (NNT) for continuous outcomes calculated using the Wells calculator (from the CMSG Editorial office; http://musculoskeletal.cochrane.org/), and for dichotomous outcomes using the Cates NNT calculator (www.nntonline.net/visualrx/).

[c] Calculations based on the control group baseline mean (SD) function: 40.0 (20.0) points on 0‐100 scale (from Hurley 2007).

[d] Physical function downgraded for inconsistency (heterogeneity, I² = 68%).

[e] Calculated on the basis of the control group baseline mean (SD): 39.2 (13.1) points on 0‐100 KOOS subscale (from Lund 2008).

Background

Description of the condition

Osteoarthritis (OA), the most common rheumatic disease, primarily affects the articular cartilage and the subchondral bone of a synovial joint, eventually resulting in joint failure. The most typical radiographic features include formation of osteophytes at the joint margins, joint space narrowing, subchondral sclerosis, subchondral cyst formation and chondrocalcinosis (Scott 1993). It has been estimated that about 40% to 80% of people with radiographic changes will have symptomatic disease. Symptomatic knee OA is highly prevalent among older people worldwide (10% to 30%), especially in rural regions, where occupational physical demands are high (Busija 2010).

People with symptomatic OA of the knee describe deep, aching pain. In early disease, pain is intermittent and most often is associated with joint use. For many people, symptomatic disease progresses, and the pain becomes more chronic and may occur at rest and during the night. The joint feels 'stiff,' resulting in typical pain and difficulty when movement is initiated after a period of rest. Individuals with advanced disease may experience crepitus or deep 'creaking' sounds on movement and often limited range of joint motion. People with progressive symptomatic knee OA experience increasing difficulty with daily functional activities. In fact, knee OA is more responsible than any other disease for disability in walking, stair climbing and housekeeping among non‐institutionalised people 50 years of age and older (Davis 1991; Guccione 1994; van Dijk 2006). Ultimately, chronic OA involving lower limb joints leads to reduced physical fitness with resultant increased risk of cardiometabolic co‐morbidity (Minor 1988; Philbin 1995; Nielen 2012) and early mortality (Hochberg 2008).

Description of the intervention

Therapeutic exercise covers a range of targeted physical activities that directly aim to improve muscle strength, joint range of motion and aerobic fitness.

How the intervention might work

Currently, no cure for OA is known. However, disease‐related factors, such as impaired muscle function and reduced fitness, are potentially amenable to exercise (Buchner 1992; Fiatarone 1993).

Exercise takes a multitude of forms and results in numerous systemic and local effects, some of which have been investigated among people with knee OA.

Among people with knee OA, improving muscle strength is one of the main aims of exercise, given that weakness is common. Strength training of sufficient dosage can address muscle weakness by improving muscle mass and/or recruitment. However, among patient groups, pain must be considered and may be a barrier leading to underdosage of the strength stimulus. Enhanced strength of the lower limb may lessen knee forces, reduce pain and improve physical function (Bennell 2008; Dekker 2013). Increased muscle strength may modify biomechanics, resulting in a decreased joint loading rate or localised stress in the articular cartilage, thereby playing an important role in both initiation and progression of knee OA (Cooper 1995; Felson 1995; Kujala 1995; McAlindon 1999; Rangger 1995; Slemenda 1997; Zhang 1996).

Poor physical fitness is another impairment reported among people with knee OA. Physiological reserve for aerobic capacity is enhanced primarily by increasing muscle oxidative capacity. Aerobic exercise (e.g. walking, cycling) of sufficient intensity increases muscle oxidative enzymes and muscle capillarisation, hence increasing peak oxygen uptake. Higher oxygen uptake is inversely related to morbidity and mortality and renders every submaximal daily task easier (in terms of effort). Thus, improved fitness may enhance quality of life by allowing a greater range of available daily tasks, thereby improving physical function.

Why it is important to do this review

International guidelines advocate various non‐pharmacological treatments, including exercise, for first‐line treatment of people with OA (Zhang 2010; Nelson 2013). This is an update of a previous Cochrane review (Fransen 2008).

Objectives

To determine whether land‐based therapeutic exercise is beneficial for people with knee OA in terms of reduced joint pain or improved physical function and quality of life.

Methods

Criteria for considering studies for this review

Types of studies

Randomised or quasi‐randomised controlled trials, published in the English language, comparing groups given some form of land‐based therapeutic exercise versus a non‐exercise group.

Types of participants

Male and female adults given an established diagnosis of knee OA according to accepted criteria (Altman 1991), or who self‐reported knee OA on the basis of chronic joint pain (with or without radiographic confirmation).

Types of interventions

Any land‐based non‐perioperative therapeutic exercise regimens aimed at relieving the symptoms of OA, regardless of content, duration, frequency or intensity. The comparator (control) group could be an active (given any non‐exercise intervention) or no treatment (including waiting list) group.

Types of outcome measures

In accordance with international consensus regarding the core set of outcome measures for phase III clinical trials in OA (Bellamy 1997), each randomised clinical trial had to include assessment of at least one of the following.

  1. Knee pain.

  2. Self‐reported physical function.

  3. Quality of life.

These outcomes were assessed at three time points: immediately at the end of treatment (post‐treatment), two to six months after cessation of monitored study treatment and longer than six months after cessation of monitored study treatment. Each included study was required to report measurement of outcomes in at least one of these time periods.

We also noted the number of participants withdrawing from the study before post‐treatment assessment and the number of participants experiencing adverse events, if provided.

Search methods for identification of studies

Electronic searches

Five electronic databases were searched from inception to May 2013: MEDLINE (Appendix 1), EMBASE (Appendix 2), the Cochrane Central Register of Controlled Trials (CENTRAL) (Appendix 3), the Cumulative Index to Nursing and Allied Health Literature (CINAHL) (Appendix 4) and the Physiotherapy Evidence Database (PEDro) (Appendix 5).

We also included a search of ClinicalTrials.gov (www.ClinicalTrials.gov) and the World Health Organization (WHO) trials portal (www.who.int/ictrp/en/).

Searching other resources

We searched the reference lists of identified included studies as well.

Data collection and analysis

Selection of studies

Three teams of two review authors (MF, SM, AH, MVdE, MS, KB) independently screened retrieved clinical studies for inclusion. If agreement was not achieved at any stage, a third review author from one of the other two teams adjudicated.

Data extraction and management

Three teams of two review authors (MF, SM, AH, MVdE, MS, KB) extracted data from all included studies and conducted the risk of bias assessment. If agreement was not achieved at any stage, a third review author from one of the other two teams adjudicated.

If a trial provided data from more than one pain scale, we extracted data from the pain scale that is highest on the list below according to a previously described hierarchy of pain‐related outcomes (Juni 2006; Reichenbach 2007).

  1. Global pain.

  2. Pain on walking.

  3. Western Ontario and McMaster Osteoarthritis Index (WOMAC) osteoarthritis pain subscore.

  4. Composite pain scores other than WOMAC.

  5. Pain on activities other than walking.

  6. Pain at rest or pain during the night.

  7. WOMAC global algofunctional score.

  8. Lequesne Osteoarthritis Index global score.

  9. Other algofunctional scale.
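The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `select_scale` and the idea that each reported scale has already been labelled with one of the hierarchy categories are assumptions made for the example, not part of the review's methods.

```python
# Pain-outcome hierarchy copied from the list above (highest priority first).
PAIN_HIERARCHY = [
    "Global pain",
    "Pain on walking",
    "WOMAC osteoarthritis pain subscore",
    "Composite pain scores other than WOMAC",
    "Pain on activities other than walking",
    "Pain at rest or pain during the night",
    "WOMAC global algofunctional score",
    "Lequesne Osteoarthritis Index global score",
    "Other algofunctional scale",
]


def select_scale(reported_scales, hierarchy=PAIN_HIERARCHY):
    """Return the reported scale that ranks highest in the hierarchy.

    `reported_scales` is a hypothetical list of category labels that a
    reviewer has already assigned to the scales reported by one trial.
    Returns None when no reported scale matches the hierarchy.
    """
    reported = set(reported_scales)
    for category in hierarchy:
        if category in reported:
            return category
    return None


# Hypothetical trial reporting three pain measures.
trial_scales = [
    "Pain at rest or pain during the night",
    "WOMAC osteoarthritis pain subscore",
    "Pain on walking",
]
print(select_scale(trial_scales))  # Pain on walking
```

Passing a different ordered list as `hierarchy` applies the same rule to other outcomes.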

Data on more than one physical function scale, when reported in a trial, were extracted according to the hierarchy presented below.

  1. Global disability score.

  2. Walking disability.

  3. WOMAC disability subscore.

  4. Composite disability scores other than WOMAC.

  5. Disability other than walking.

  6. WOMAC global scale.

  7. Lequesne Osteoarthritis Index global score.

  8. Other algofunctional scale.

If data on more than one quality of life scale were reported in a trial, data were extracted according to the hierarchy presented below.

  1. Short Form (SF)‐36, Mental Component Summary (MCS).

  2. SF‐12 MCS.

  3. EuroQol.

  4. Sickness Impact Profile (SIP).

  5. Nottingham Health Profile (NHP).

  6. Other quality of life scales.

Assessment of risk of bias in included studies

We assessed risk of bias in included studies in accordance with methods recommended by The Cochrane Collaboration (Risk of bias in included studies).

We assessed risk of bias according to the following domains.

  1. Random sequence generation.

  2. Allocation concealment.

  3. Blinding of participants and personnel.

  4. Blinding of outcome assessment, subjective self‐reported outcomes (pain, physical function, quality of life).

  5. Blinding of outcome assessment, other outcomes.

  6. Incomplete outcome data.

  7. Selective outcome reporting.

We graded each potential source of bias as high, low or unclear and provide justification for our judgement in the 'Risk of bias' table.

We summarised the risk of bias judgements across different studies for each of the seven domains listed.

We presented the figures generated by the 'Risk of bias' tool to provide summary assessments of risk of bias (Figure 1).


Risk of bias summary: review authors' judgements about each risk of bias item for each included study.


If the three domains of random sequence generation, allocation concealment and incomplete outcome data (selection bias and attrition bias) were adequately met in a study, we judged the overall risk of bias as low for that study.

Measures of treatment effect

As studies used a variety of continuous scales to evaluate pain, physical function and quality of life outcomes, a unitless measure of treatment effect size was needed to allow the results of various randomised controlled trials (RCTs) to be combined. We used standardised mean differences (SMDs) to calculate treatment effect sizes from the end of treatment, or change scores and related standard deviation (SD) scores, when possible. Treatment effect size therefore is a unitless measure providing an indication of size in terms of its variability. Outcomes pooled using SMDs were reexpressed as equivalent mean differences by multiplying by a representative control group (high weighting in pooled analyses) baseline SD. We pooled the Mantel‐Haenszel odds ratio (OR) to calculate the effects of treatment allocation on study withdrawal before the first outcome assessment.
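As a rough illustration of the two steps described above, the sketch below computes an SMD for a single trial and then re‐expresses a pooled SMD in points. The review does not state which SMD estimator was used, so the small‐sample (Hedges) correction here is an assumption, and the single‐trial numbers are hypothetical. The pooled SMD (‐0.52) and the representative baseline SD (20.0 points, Hurley 2007) are the physical function values reported in the 'Summary of findings' table.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference with a small-sample correction.

    Assumption: Hedges' adjusted g; the review text does not name the
    estimator that was actually used.
    """
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d * correction


def smd_to_points(smd, representative_sd):
    """Re-express an SMD as a mean difference in scale points."""
    return smd * representative_sd


# Hypothetical trial: exercise group 30 (SD 20, n=50), control 40 (SD 20, n=50).
g = hedges_g(30.0, 20.0, 50, 40.0, 20.0, 50)

# Pooled physical function SMD and representative baseline SD from the review.
function_points = smd_to_points(-0.52, 20.0)

print(round(g, 2), round(function_points, 1))
```

With the review's values the re‐expressed difference is about 10 points on the 0 to 100 scale, in line with the figure reported for physical function.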

Unit of analysis issues

The unit of analysis was the participant; thus no unit of analysis issues are described.

Dealing with missing data

No data were missing. We contacted study authors when data could not be extrapolated in the desired form from the published manuscript.

Assessment of heterogeneity

In a random‐effects model, overall effects are adjusted to include an estimate of the degree of variation between studies, or heterogeneity, in intervention effect (Tau²) (Deeks 2011). The Chi² test assesses whether differences in results are beyond those that can be attributed to sampling error (chance). The impact of heterogeneity on meta‐analysis results is quantified by the I² statistic. This statistic describes the percentage of variability in effect estimates that is due to heterogeneity rather than to chance (Deeks 2011): 30% to 60% probably represents moderate heterogeneity, and > 50% is usually considered as representing substantial heterogeneity.
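The three quantities named above can be computed from study effect estimates and their variances. The sketch below is a minimal illustration that assumes the DerSimonian‐Laird estimator for the between‐study variance; the review does not state which estimator was used, and the four effect sizes and variances are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (Q, I2 in percent, tau2) for a set of study estimates.

    Q is the Chi-squared heterogeneity statistic, I2 the percentage of
    variability attributed to heterogeneity, and tau2 the between-study
    variance (DerSimonian-Laird estimator, an assumption here).
    """
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled_fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2


# Hypothetical SMDs and variances for four trials.
example_effects = [-0.2, -0.5, -0.8, -0.4]
example_variances = [0.02, 0.03, 0.04, 0.025]
q, i2, tau2 = heterogeneity(example_effects, example_variances)
print(f"Q = {q:.2f}, I2 = {i2:.0f}%, tau2 = {tau2:.3f}")
```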

Assessment of reporting biases

For studies published after 1 July 2005, we screened the Clinical Trials Register at the International Clinical Trials Registry Platform of the World Health Organization (http://apps.who.int/trialssearch) to obtain the a priori trial protocol. We evaluated whether selective reporting of outcomes occurred (outcome reporting bias).

To assess for potential small‐study effects in meta‐analyses (i.e. intervention effect is more beneficial in smaller studies), we compared effect estimates derived from a random‐effects model with those obtained from a fixed‐effect model of meta‐analysis. In the presence of small‐study effects, the random‐effects model will provide a more beneficial estimate of the intervention than the fixed‐effect model (Sterne 2011).
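The comparison described above can be sketched as follows. The example assumes inverse‐variance pooling with a DerSimonian‐Laird between‐study variance for the random‐effects model, which the review does not specify, and uses hypothetical data in which the two smaller trials (larger variances) report larger benefits. For pain, a more negative SMD is more favourable to exercise.

```python
def pooled_estimates(effects, variances):
    """Return (fixed-effect, random-effects) pooled estimates.

    Inverse-variance weights are used for both models; the random-effects
    weights add a DerSimonian-Laird tau2 (an assumption in this sketch).
    """
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w_fixed = [1.0 / v for v in variances]
    sum_w = sum(w_fixed)
    fixed = sum(w * y for w, y in zip(w_fixed, effects)) / sum_w

    q = sum(w * (y - fixed) ** 2 for w, y in zip(w_fixed, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in w_fixed) / sum_w
    tau2 = max(0.0, (q - df) / c)

    w_random = [1.0 / (v + tau2) for v in variances]
    random_ = sum(w * y for w, y in zip(w_random, effects)) / sum(w_random)
    return fixed, random_


# Hypothetical pain SMDs: two small trials with large effects, two large
# trials with small effects.
small_study_effects = [-0.9, -0.8, -0.3, -0.25]
small_study_variances = [0.09, 0.08, 0.01, 0.012]
fixed, random_ = pooled_estimates(small_study_effects, small_study_variances)

# A random-effects estimate more favourable than the fixed-effect estimate
# is the pattern the text associates with small-study effects.
print(round(fixed, 2), round(random_, 2), random_ < fixed)
```

Because the random‐effects model gives relatively more weight to small trials, its estimate moves towards their results when they differ systematically from the large trials.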

Data synthesis

We used the random‐effects model to combine outcomes.

Summary of findings table

We created a 'Summary of findings' table by using the following outcomes: immediate post‐treatment pain, physical function, quality of life, withdrawals due to adverse events and total adverse events. We used GRADEpro software and the five GRADE (Grades of Recommendation, Assessment, Development and Evaluation) considerations (study limitations, consistency of effect, imprecision, indirectness and publication bias) to assess the quality of a body of evidence for stated outcomes (Schünemann 2011a; Schünemann 2011b).

Outcomes pooled using SMDs were reexpressed as absolute mean differences (or changes) by multiplying by a representative control group baseline SD from a trial using a familiar instrument and dividing by points of the measurement scale expressed as a percentage.

In the Comments column of the 'Summary of findings' table, we have presented the absolute percent difference, the relative percent change from baseline and the number needed to treat for an additional beneficial outcome (NNTB) (the NNTB is provided only for outcomes with statistically significant differences between intervention and control groups).

For continuous outcomes, absolute risk difference was calculated as mean difference between intervention and control groups given in original measurement units (divided by the scale), expressed as a percentage; the relative difference was calculated as the absolute change (or mean difference) divided by the baseline mean of the control group from a representative trial. The NNTB for continuous measures was calculated using the Wells calculator (available at the CMSG Editorial office; http://musculoskeletal.cochrane.org/).
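A worked example of these two percentages, using the pain values reported in the 'Summary of findings' table (SMD 0.49 lower; control group baseline mean 44.3 and SD 24.4 points from Yip 2007). The function names are illustrative only.

```python
def absolute_percent(mean_difference, scale_range=100.0):
    """Mean difference in scale points as a percentage of the scale."""
    return 100.0 * mean_difference / scale_range


def relative_percent(mean_difference, control_baseline_mean):
    """Mean difference as a percentage of the control baseline mean."""
    return 100.0 * mean_difference / control_baseline_mean


# Pain: re-express the pooled SMD in points, then as percentages.
pain_points = 0.49 * 24.4  # SMD x representative baseline SD
pain_absolute = absolute_percent(pain_points)
pain_relative = relative_percent(pain_points, 44.3)

print(round(pain_points), round(pain_absolute), round(pain_relative))
```

Rounded, this gives a 12‐point reduction, a 12% absolute reduction and a 27% relative change, matching the values reported for pain.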

We assumed a minimal clinically important difference (MCID) of 15 points on a 0 to 100‐point pain scale, and of 10 points on a 0 to 100‐point function scale.

Subgroup analysis and investigation of heterogeneity

The influence of using end of treatment or change scores was evaluated for the investigation of heterogeneity.

Subgroup analyses were conducted to explore possible differences in pooled SMDs for immediate post‐treatment pain and physical function according to:

1. treatment content (quadriceps exercises only, lower limb strengthening, strengthening and aerobics, walking programme, other programmes),

2. treatment delivery mode (individual, class‐based, home programme) and

3. number of face‐to‐face contact occasions (< 12, ≥ 12).

These subgroups were chosen to reflect differences in dosage and content of the exercise programmes using crude metrics that were usually available in the study reports.

Sensitivity analysis

1. We assessed the effect of potential selection and attrition bias on immediate post‐treatment pain and physical function outcomes.

2. We assessed the effect of potential detection bias on immediate post‐treatment pain and physical function outcomes.

Results

Description of studies

Results of the search

Of 212 retrieved RCTs identified by the literature search, 54 met the inclusion criteria (Abbott 2013; An 2008; Baker 2001; Bautch 1997; Bennell 2005; Bennell 2010; Bezalel 2010; Brismée 2007; Bruce‐Brand 2012; Chang 2012; Deyle 2000; Doi 2008; Ettinger 1997a/b; Foley 2003; Foroughi 2011; Fransen 2001; Fransen 2007; Gur 2002; Hay 2006; Hopman‐Rock 2000; Huang 2003; Huang 2005; Hughes 2004; Hurley 2007; Jan 2008; Jan 2009; Jenkinson 2009; Kao 2012; Keefe 2004; Kovar 1992; Lee 2009; Lim 2008; Lin 2009; Lund 2008; Maurer 1999; Messier 2004; Mikesky 2006; Minor 1989; O'Reilly 1999; Peloquin 1999; Quilty 2003; Rogind 1998; Salacinski 2012; Salli 2010; Schilke 1996; Simao 2012; Song 2003; Talbot 2003; Thomas 2002; Thorstensson 2005; Topp 2002; van Baar 1998; Wang 2011; Yip 2007). Details for each of the included studies are outlined in Characteristics of included studies.

One of the 54 studies included two clearly different exercise intervention groups and was treated as two trials, with the sample size of the control group equally divided between the two exercise intervention groups: aerobic walking and resistance training (Ettinger 1997a/b). Five of the included studies recruited people with a diagnosis of hip or knee OA (Foley 2003; Fransen 2007; Hopman‐Rock 2000; van Baar 1998; Abbott 2013). These five studies provided data specific for participants with knee OA. Five studies allocated participants to two (Gur 2002; Jan 2008; Jan 2009; Salli 2010) or three (Huang 2003) different forms of muscle strengthening. As control groups in these studies were relatively small, the mean effects of exercise allocations were combined and were compared with those of the control group. One study (Huang 2005) described two allocations combining exercise with ultrasound or hyaluronan. Only the exercise alone allocation was considered in the current review. Two studies described four treatment allocations (Messier 2004; Jenkinson 2009), two of which included a weight reduction programme. Only the exercise alone allocation versus the control group was considered in the current review. One study (Mikesky 2006) included participants without knee pain. Data were provided by the study author on 37 participants with knee pain and confirmed knee OA. One study (Keefe 2004) described four allocations, two involving a spouse‐assisted coping strategy intervention. Only the exercise alone groups and the control groups were evaluated in the current review. Two studies included (in addition to a more traditional exercise programme) a proprioceptive training allocation (Lin 2009) and an allocation to squatting on a vibratory platform (Simao 2012). One study stratified results according to varus or normal knee alignment (Lim 2008). These results were averaged for the two stratifications.

Included studies

Marked variability among the 54 included studies was noted with regard to study participants recruited, timing of outcomes assessments, exercise interventions assessed and important aspects of study methodology. Most studies recruited between 50 and 150 participants. However, 19 (35%) studies recruited fewer than 25 participants in one or both allocation groups (An 2008; Baker 2001; Bautch 1997; Brismée 2007; Bruce‐Brand 2012; Chang 2012; Foley 2003; Gur 2002; Keefe 2004; Lee 2009; Mikesky 2006; Minor 1989; Rogind 1998; Salacinski 2012; Salli 2010; Schilke 1996; Simao 2012; Song 2003; Talbot 2003), whereas five studies recruited more than 200 participants (Abbott 2013; Hurley 2007; Jenkinson 2009; Kao 2012; Thomas 2002), one of which recruited 750 participants (Thomas 2002).

Sample recruitment varied widely, with studies recruiting exclusively community volunteers (An 2008; Bennell 2005; Bennell 2010; Brismée 2007; Ettinger 1997a/b; Foroughi 2011; Fransen 2007; Hughes 2004; Kao 2012; Lim 2008; O'Reilly 1999; Peloquin 1999; Quilty 2003; Salacinski 2012; Wang 2011), patients drawn from specialist rheumatology or orthopaedic clinics (Bezalel 2010; Bruce‐Brand 2012; Doi 2008; Foley 2003; Jan 2008; Jan 2009; Lin 2009; Schilke 1996; Song 2003; Thorstensson 2005; Yip 2007), a mix of community volunteers and patients from specialist clinics or referred by general practitioners (Abbott 2013; Bautch 1997; Jenkinson 2009; Keefe 2004; Lund 2008; Minor 1989), patients referred by general practitioners (Hay 2006; Hurley 2007; Thomas 2002; van Baar 1998) or patients from physiotherapy waiting lists (Deyle 2000; Fransen 2001).

In two studies, approximately 50% of the sample reported a symptom duration of less than a year (Chang 2012; van Baar 1998), whilst a few other studies reported a mean symptom duration longer than 10 years (Foroughi 2011; Maurer 1999; Minor 1989). Many studies did not report symptom duration. Most studies stated that the American College of Rheumatology diagnostic criteria were used for study inclusion. However, 'knee pain in the past week' (O'Reilly 1999), ‘knee pain in the last month’ (Jenkinson 2009) or patellofemoral knee pain (Quilty 2003) was sufficient in three studies. In one study, patients with OA diagnosed via arthroscopy or who were on the waiting list for total knee replacement were included (Bruce‐Brand 2012). Five studies required radiographic disease of at least Kellgren and Lawrence Grade III for study participation (Bruce‐Brand 2012; Doi 2008; Lim 2008; Rogind 1998; Thorstensson 2005), whereas other studies included only participants with radiographic disease of Kellgren and Lawrence Grade III or less (Chang 2012; Jan 2008; Jan 2009; Lin 2009; Salacinski 2012). Many study cohorts comprised participants who were overweight (body mass index (BMI) 25 to 29.9 kg/m2) or obese (BMI ≥ 30 kg/m2). Consequently, mean BMI (reported or calculated from mean weight and height data) was in the normal range in only a few studies (Doi 2008; Jan 2008; Jan 2009; Lin 2009; Salacinski 2012). Two studies targeted only overweight or obese participants (BMI ≥ 28 kg/m2), resulting in cohorts with a mean BMI of 34 kg/m2 (Messier 2004) and a median BMI of 33.6 kg/m2 (Jenkinson 2009). This range of recruitment strategies and inclusion criteria resulted in wide variability in baseline radiographic and symptomatic disease severity between studies, when reported.

Many studies did not report medication use. One study excluded people taking non‐steroidal anti‐inflammatory drugs (NSAIDs) (Bautch 1997), whereas another included only people currently taking NSAIDs at least twice a week (Kovar 1992). Cessation of NSAID use was required for the duration of one study (Jan 2008). Another study offered paracetamol as required (up to 2 g per day) to all participants (Salli 2010). Sticky patch analgesia was available as required for all participants in a study in which the control group was taking NSAIDs (Doi 2008). One study stratified allocation groups according to glucosamine or chondroitin use (Foroughi 2011).

A wide range of therapeutic exercise programmes were assessed. Delivery mode varied between one‐on‐one (individual) programmes (Analysis 6.1; Analysis 7.1) and exercise programmes undertaken most often by the participant at home (Analysis 6.3; Analysis 7.3). However, many 'home' programmes incorporated home visits by a trained nurse or a community physiotherapist. Also, most individual treatments and class‐based programmes provided a home exercise programme. Only one study included allocation to individual treatment or to a class‐based programme (Fransen 2001). Results for each of these allocations were presented in the original manuscript for all participants (including those originally allocated to a waiting list control) and were presented as such for this comparison.

Complexity of content and mode of exercise varied considerably between studies. Simple quadriceps muscle strengthening (i.e. supine or seated knee extension using leg weight only) was used by one study (Doi 2008), whereas another study initially used very simple exercises (e.g. straightening knee over rolled towel) and progressed to functional exercises after several months (Jenkinson 2009). One study (Simao 2012) used squat exercises alone to strengthen multiple lower limb muscles, and another used multiple sitting and standing exercises with body weight only (Wang 2011). Other studies, although often using a combination of exercise equipment, used mainly elastic resistance bands (Bennell 2010; Bruce‐Brand 2012; Chang 2012; Topp 2002), free weights (Ettinger 1997a/b; Lim 2008) or resistance machines (Foley 2003; Foroughi 2011; Fransen 2001; Gur 2002; Huang 2003; Huang 2005; Jan 2008; Jan 2009; Maurer 1999; Mikesky 2006; Salli 2010; Schilke 1996). A number of studies employed complex, multi‐modal programmes including manual therapy, upper limb and/or truncal muscle strengthening and balance co‐ordination (Abbott 2013; Bennell 2005; Deyle 2000; Peloquin 1999; Rogind 1998; van Baar 1998), in addition to lower limb muscle strengthening. Aerobic walking (Ettinger 1997a/b; Kovar 1992; Messier 2004; Minor 1989; Talbot 2003) or cycling programmes (Salacinski 2012) were the focus of some studies. Five studies evaluated Tai Chi classes (Brismée 2007; Fransen 2007; Lee 2009; Song 2003; Yip 2007), and one study used Baduanjin exercises (An 2008). Exercises were not clearly described in one study (Kao 2012), and in another the website that provided exercise descriptions was not available (Hurley 2007). Overall, the exercise content of studies evidenced much variability, and many studies did not provide a clear rationale for choice of exercise.

Along with delivery mode and content, treatment 'dosage' (duration, frequency, intensity) varied widely between studies. Monitored treatment sessions, presented in individual or class‐based format, ranged from 20 to 60 minutes. Exercise frequency for monitored classes or for individual clinic sessions in most studies was two to three times per week; however, frequency varied between once per week (Bezalel 2010; Hopman‐Rock 2000; Kao 2012; Topp 2002; Yip 2007) and five times per week (An 2008). Concurrent monitored clinic classes and home programmes were provided in a few studies (Abbott 2013; Bennell 2010; Bruce‐Brand 2012; Topp 2002), thus potentially increasing the overall frequency of weekly exercise. The total number of monitored exercise sessions provided ranged from none (Talbot 2003) to 72 (Foroughi 2011). Four studies prescribed daily home exercise (Doi 2008; Jenkinson 2009; O'Reilly 1999; Thomas 2002), and one study monitored daily pedometer step counts (Talbot 2003). Total treatment duration for monitored classes or individual clinic sessions ranged from one month (Bezalel 2010; Deyle 2000) to six months (Foroughi 2011). Two studies prescribed home programmes for up to two years (Jenkinson 2009; Thomas 2002).

Prescribed exercise generally was of moderate to moderately high intensity, although some studies failed to report whether exercise intensity was maintained or progressed during the course of exercise training. Intensity achieved during strength training using free or limb weights or Theraband was commonly a 10‐repetition maximum (10RM) with varying numbers of sets (Bennell 2010; Chang 2012; Ettinger 1997a/b; Lim 2008) or was at least moderate (Bruce‐Brand 2012; Topp 2002; Wang 2011). One study ensured that strength exercise was conducted at least at 60% maximum heart rate (HRmax); this was progressed to the highest tolerable intensity (Thorstensson 2005). Muscle strength training conducted using a variety of resistance machines was generally very well quantified and ranged from 50% 1RM (Lin 2009), through 60% to 80% 1RM (Foley 2003; Foroughi 2011; Jan 2008; Jan 2009; Mikesky 2006), to maximum effort at various isokinetic speeds (Gur 2002; Huang 2003; Huang 2005; Maurer 1999; Salli 2010; Schilke 1996). For some studies, although strength exercises were described, exercise intensity was not quantified (Bezalel 2010; Doi 2008; Kao 2012; O'Reilly 1999; Thomas 2002). Aerobic exercise intensity, achieved via walking programmes, ranged from low (Bautch 1997; Talbot 2003) to moderate (50% to 70% heart rate reserve (HRR) or 60% to 80% HRmax) (Ettinger 1997a/b; Minor 1989). One study used moderate‐intensity (70% HRmax) stationary cycling (Salacinski 2012). A few other studies used moderate‐intensity walking (40% to 60% HRmax or 50% to 85% HRR) or cycling (50% to 60% HRmax) and resistance training in the same session (Fransen 2001; Hughes 2004; Keefe 2004; Messier 2004; Peloquin 1999). Tai Chi exercises were used in five studies (Brismée 2007; Fransen 2007; Lee 2009; Song 2003; Yip 2007), and Baduanjin (Qigong) exercises in one study (An 2008), but intensity was not measured (via heart rate or rating of perceived exertion). Other studies employed complex programmes of physiotherapy, exercise and other strategies, rendering overall assessment of exercise intensity difficult.

Thirty‐six of the 54 included studies (67%) used the Western Ontario and McMaster Universities Arthritis Index (WOMAC) to evaluate knee pain or self‐reported physical function. A variety of scales were used by the other studies. Thirteen studies used visual analogue scales (VASs) to measure pain (Abbott 2013; Bautch 1997; Bennell 2005; Brismée 2007; Gur 2002; Hopman‐Rock 2000; Huang 2003; Huang 2005; Lund 2008; Quilty 2003; Rogind 1998; Salacinski 2012; Salli 2010). Only three studies included a separate participant global assessment of treatment effectiveness (Kao 2012; van Baar 1998; Yip 2007).

Excluded studies

A total of 151 studies were excluded for reasons given in the Characteristics of excluded studies table (Ageberg 2010; Aglamis 2008; Aglamis 2009; Akyol 2010; Alfredo 2012; Anwer 2011; Aoki 2009; Atamaz 2006; Atamaz 2012; Boocock 2009; Borjesson 1996; Brosseau 2012; Bulthuis 2007; Bulthuis 2008; Callaghan 1995; Cetin 2008; Chaipinyo 2009; Chamberlain 1982; Cheing 2002; Cheing 2004; Ciolac 2011; Coupe 2007; Crotty 2009; Deyle 2005; Dias 2003; Diracoglu 2005; Duman 2012; Durmus 2007; Durmus 2012; Ebnezar 2012; Ebnezar 2012a; Evcik 2002; Evgeniadis 2008; Eyigor 2004; Farr 2010; Feinglass 2012; Fitzgerald 2011; Forestier 2010; Foroughi 2011a; Foster 2007; Gaal 2008; Gaudreault 2011; Gill 2009; Green 1993; Gremion 2009; Haslam 2001; Helmark 2010; Helmark 2012; Hinman 2007; Hiyama 2012; Hoeksma 2004; Huang 2005b; Hughes 2010; Hurley 1998; Hurley 2007a; Hurley 2012; Jan 1991; Jan 2008a; Jessep 2009; Karagulle 2007; Kawasaki 2008; Kawasaki 2009; King 2008; Konishi 2009; Kreindler 1989; Kuptniratsaikul 2002; Lankhorst 1982; Lim 2002; Lim 2010; Lin 2004; Lin 2007; Liu 2008; Mangione 1999; Marra 2012; Mascarin 2012; McCarthy 2004; McKnight 2010; McQuade 2011; Messier 1997; Messier 2000a; Messier 2000b; Messier 2007; Messier 2008; Miller 2012; Moss 2007; Murphy 2008; Neves 2011; Ng 2010; Nicklas 2004; Ozdincler 2005; Penninx 2001; Penninx 2002; Pereira 2011; Petersen 2010; Petersen 2011; Peterson 1993; Petrella 2000; Pietrosimone 2010; Pietrosimone 2012; Pisters 2010; Pisters 2010a; Piva 2011; Piyakhachornrot 2011; Quirk 1985; Rattanachaiyanont 2008; Ravaud 2004; Reid 2010; Reid 2011; Rejeski 1998; Sayers 2012; Schlenk 2011; Scopaz 2009; Selfe 2008; Sen 2004; Sevick 2009; Shakoor 2007; Shakoor 2010; Shen 2008; Silva 2008; Sled 2010; Song 2010; Soni 2012; Stitik 2007; Stitik 2007a; Sullivan 1998; Swank 2011; Sylvester 1989; Teixeira 2011; Thiengwittayaporn 2009; Toda 2001; Tok 2011; Topp 2009; Tsauo 2008; Tunay 2010; Tuzun 2004; van Baar 2001; Van Gool 2005; Veenhof 2007; Walls 2010; Wang 2006; Wang 2007; Wang 2007a; Wang 2009; Weng 2009; Whitehurst 2011; Williamson 2007; Williamson 2007a; Wyatt 2001; Yilmaz 2010; Yip 2007a; Yip 2008).

Risk of bias in included studies

According to the above criteria (methodological quality assessment), a total of 19 (35%) of the 54 studies could be considered as achieving 'low risk of bias' from the published report (Abbott 2013; Baker 2001; Bennell 2005; Bennell 2010; Ettinger 1997a/b; Foley 2003; Fransen 2001; Fransen 2007; Jenkinson 2009; Lee 2009; Lim 2008; Lin 2009; Lund 2008; Messier 2004; Quilty 2003; Thomas 2002; Thorstensson 2005; van Baar 1998; Wang 2011). Five of these studies provided sustainability (two to six months or longer than six months) data only (Abbott 2013; Jenkinson 2009; Messier 2004; Quilty 2003; Thomas 2002) (Figure 1).

Allocation

Although most studies reported the methods used to generate randomisation, allocation concealment procedures were less frequently described (Figure 1).

Blinding

Only four of the 54 included studies claimed blinding of study participants (Bennell 2005; Chang 2012; Foroughi 2011; Quilty 2003). Bennell 2005 used sham ultrasound (US) with non‐active gel as the placebo treatment; Chang 2012 had both allocations randomised to general physiotherapy with the addition of Theraband exercises for the experimental group. Foroughi 2011 provided low‐resistance, non‐progressive 'sham exercise'. The fourth study uniquely used a Zelen randomisation, leading the control group to be unaware of participation in a randomised trial (Quilty 2003).

Just over half (57%) of the 54 studies clearly stated that the outcomes assessor was blinded to group allocation. However, as outcomes evaluated in this review were participant self‐report (pain, physical function, quality of life), and given that participants were mostly not blinded to allocation status, vulnerability to biased reporting may still be present.

Incomplete outcome data

Just over half of the studies (29/54) reported minimal loss to follow‐up or utilised imputation methods (usually last observation carried forward) to perform 'intention‐to‐treat' analyses.

Selective reporting

The presence of reporting bias was simply based on study registration. As this criterion would cause earlier studies to be at a disadvantage (before study registration requirements), the risk of bias was judged as 'uncertain' for unregistered studies. Therefore this criterion also was not considered in the overall estimate of study bias.

Effects of interventions

See: Summary of findings for the main comparison Immediate post‐treatment effects of exercise for osteoarthritis of the knee

At the time of the original review, several attempts were made to contact seven study authors to obtain additional data. Four study authors responded, and two were able to provide requested results for the location of OA in the knee (Hopman‐Rock 2000; van Baar 1998), one was able to provide WOMAC scores disaggregated for pain and physical function (Deyle 2000) and one was able to provide change scores for each allocation group (Thomas 2002). No contact could be established with the other three study authors. Therefore, for one study a misprint assumption was made on one 'impossible' standard error of the mean score (Bautch 1997). For another study, two baseline standard deviations had to be extrapolated from a study of similar size using the same self‐report questionnaires (Maurer 1999). For the third study, post‐treatment results for the control group were used as the baseline for the active treatment groups (two‐group analysis) (Ettinger 1997a/b). For updated reviews, three studies that recruited participants with OA of the hip and/or OA of the knee (Abbott 2013; Foley 2003; Fransen 2007) provided data disaggregated according to the most symptomatic joint (hip or knee).

Comparison 1

Immediate post‐treatment effects

Pain

Forty‐four studies provided data on 3537 participants (Figure 2) (Analysis 1.1). Pooled results of these 44 studies demonstrated statistically significant benefit, with an SMD of 0.49 (95% CI 0.39 to 0.59). This effect size would be considered moderate (Cohen 1977) and was equivalent to a reduction of 12 points (95% CI 10 to 15 points) on a 0 to 100‐point VAS pain scale (0 means no pain). Between‐study heterogeneity was moderate (I2 = 47%). No significant difference was noted between the SMD extrapolated from change scores and from end of treatment scores (P value 0.77) (I2 = 0%).


Forest plot of comparison: 1 Post treatment, outcome: 1.1 Pain.

Physical function

Forty‐four studies provided data on 3913 participants (Figure 3) (Analysis 1.2). Pooled results of these 44 studies demonstrated statistically significant benefit, with an SMD of 0.52 (95% CI 0.39 to 0.64). This effect size would be considered moderate (Cohen 1977) and was equivalent to an improvement of 10 points (95% CI 8 to 13 points) on a 0 to 100‐point scale. Between‐study heterogeneity was substantial (I2 = 68%). No significant difference was noted between change and end of treatment scores (P value 0.36) (I2 = 0%).


Forest plot of comparison: 1 Post treatment, outcome: 1.2 Physical function.

Quality of life

Thirteen studies provided data on 1073 participants (Figure 4) (Analysis 1.3). Pooled results of these 13 studies demonstrated statistically significant benefit, with an SMD of 0.28 (95% CI 0.15 to 0.40). This effect size would be considered small (Cohen 1977) and was equivalent to an improvement of 4 points (95% CI 2 to 5 points) on a 0 to 100‐point scale. Between‐study heterogeneity was negligible (I2 = 0%). No significant difference was noted between change scores and end of treatment scores (P value 0.86) (I2 = 0%).


Forest plot of comparison: 1 Post treatment, outcome: 1.3 Quality of life.

Study withdrawals

Forty‐five studies provided data on study withdrawals at the time of the first post‐treatment assessment (Analysis 1.4). Of these 45 studies, only whole sample estimates (knee and hip OA) were available for two studies (Foley 2003; van Baar 2001). No significantly increased risk of study withdrawal was noted in the exercise allocation group (14%) compared with the control group (15%) (OR 0.93, 95% CI 0.75 to 1.15).
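
As a rough plausibility check, a crude odds ratio can be computed directly from the two overall withdrawal proportions. This is only a sketch: the pooled OR of 0.93 reported above comes from a meta‐analysis that weights study‐level estimates, and the 14% and 15% figures are rounded, so the crude value is not expected to match it exactly. The function names are illustrative.

```python
def odds(proportion):
    """Convert a proportion (0 < p < 1) to odds."""
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_intervention, p_control):
    """Odds ratio from two overall proportions, ignoring study-level weighting."""
    return odds(p_intervention) / odds(p_control)


# Overall withdrawal proportions as reported: exercise 14%, control 15%
crude_or = crude_odds_ratio(0.14, 0.15)
print(f"crude OR = {crude_or:.2f}")
```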

Comparison 2

Treatment sustainability (two to six months)

Pain

Twelve studies provided data on 1468 participants (Analysis 2.1). Pooled results demonstrated statistically significant benefit (SMD 0.24, 95% CI 0.14 to 0.35). This effect size would be considered small ‐ equivalent to a reduction of 6 (95% CI 3 to 9) points on a 0 to 100‐point scale. Between‐study heterogeneity was absent (I2 = 0%). No significant difference was noted between change scores and end of treatment scores (P value 0.40) (I2 = 0%).

Physical function

Ten studies provided data on 1279 participants (Analysis 2.2). Pooled results demonstrated statistically significant benefit (SMD 0.15, 95% CI 0.04 to 0.26). This effect size would be considered small ‐ equivalent to an improvement of 3 (95% CI 1 to 5) points on a 0 to 100‐point scale. Between‐study heterogeneity was absent (I2 = 0%). No significant difference was noted between change scores and end of treatment scores (P value 0.95) (I2 = 0%).

Comparison 3

Treatment sustainability (longer than six months)

Pain

After exclusion of two studies with extremely outlying results (Huang 2003; Huang 2005), six studies provided data on 1104 participants (Analysis 3.1). Pooled results demonstrated a non‐significant effect (SMD 0.08, 95% CI ‐0.15 to 0.30). Between‐study heterogeneity was moderate (I2 = 43%). No significant difference was noted between change scores and end of treatment scores (P value 0.73) (I2 = 0%).

Physical function

After exclusion of two studies with extremely outlying results (Huang 2003; Huang 2005), six studies provided data on 1098 participants (Analysis 3.2). Pooled results demonstrated statistically significant benefit (SMD 0.20, 95% CI 0.08 to 0.32). This effect size would be considered small ‐ equivalent to an improvement of 4 (95% CI 2 to 6) points on a 0 to 100‐point scale. Between‐study heterogeneity was absent (I2 = 0%). No significant difference was noted between change scores and end of treatment scores (P value 0.96) (I2 = 0%).

Subgroup Analyses

Comparison 4

Treatment content

Pain

Studies providing immediate post‐treatment assessments for pain were classified into five categories according to their exercise programme content (Analysis 4.1): quadriceps strengthening only (nine studies, 620 participants); lower limb strengthening (12 studies, 863 participants); combination strengthening and aerobic exercise (10 studies, 920 participants); walking programmes (four studies, 351 participants); and 'other programmes' (e.g. Tai Chi) (10 studies, 733 participants). Each of the treatment content subgroups reported significantly reduced pain. No significant differences were noted between the various exercise programmes in mean pooled SMD, which ranged from 0.35 for 'other programmes' to 0.50 to 0.64 for various strengthening/aerobic programmes ‐ equivalent to improvements of 9 points ('other programmes') to 12 to 16 points (various strengthening/aerobic programmes) on a 0 to 100‐point scale. Within‐group between‐study heterogeneity was substantial for the quadriceps strengthening (I2 = 70%) and lower limb strengthening (I2 = 61%) programmes.

Exclusion of two extreme outliers (simple quadriceps strengthening (Salli 2010) and lower limb strengthening (Gur 2002)) reduced the SMDs for these two subgroups to 0.49 and 0.47, and the within‐group heterogeneity to I2 = 26% and 37%, respectively.

Physical function

Studies providing immediate post‐treatment assessments for physical function were similarly classified (Analysis 4.2): quadriceps strengthening only (10 studies, 726 participants), lower limb strengthening (13 studies, 1066 participants), combination strengthening and aerobic exercise (10 studies, 1231 participants), walking programmes (three studies, 317 participants) and 'other programmes' (mostly Tai Chi or complex non‐specific programmes) (10 studies, 915 participants). Each of the treatment content subgroups reported significantly improved physical function. No significant differences were noted between the various exercise programmes in mean pooled SMD, which ranged from 0.27 for 'other programmes' to 0.74 for quadriceps strengthening only ‐ equivalent to improvements of 5 points ('other programmes') to 15 points (quadriceps strengthening only) on a 0 to 100‐point scale. Within‐group between‐study heterogeneity was considerable for many of the subgroups (quadriceps only I2 = 73%; lower limb strengthening I2 = 76%) and could not be reduced (by more than 25%) by exclusion of any one study.

Comparison 5

Treatment delivery mode

Pain

Studies providing immediate post‐treatment assessments for pain were categorised according to three treatment delivery modes (Analysis 5.1): individual treatments (14 studies, 1133 participants), class‐based programmes (24 studies, 1905 participants) and 'home' programmes (seven studies, 550 participants). Pooled analysis demonstrated that each of the treatment delivery modes provided significant reductions in pain: individual treatments: SMD 0.76, 95% CI 0.52 to 1.01; exercise classes: SMD 0.42, 95% CI 0.33 to 0.51; and home programmes: SMD 0.38, 95% CI 0.21 to 0.55. These effect sizes ranged from large (individual treatments) to small (home programmes) ‐ equivalent to improvements of 19 (95% CI 13 to 25) points for individual treatments, 10 (95% CI 8 to 12) points for exercise classes and 9 (95% CI 5 to 13) points for home programmes on a 0 to 100‐point scale. Between‐study heterogeneity for the category of individual treatments was substantial (I2 = 72%) and was negligible for class‐based programmes and home programmes (I2 = 0%). A statistically significant difference was detected among the three modes of delivery (P value 0.03).

After exclusion of two extreme outliers in the individual treatments category (Gur 2002; Salli 2010), the pooled SMD for that category (0.61, 95% CI 0.42 to 0.79) and its between‐study heterogeneity (I2 = 49%) were considerably reduced, and no statistically significant difference among the three modes of treatment delivery could be detected (P value 0.14).
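
The test for differences among the three delivery modes with all studies included (P value 0.03) can be approximated from the reported subgroup SMDs and confidence intervals. The sketch below assumes a standard heterogeneity (Q) test across subgroup estimates, with standard errors recovered from the 95% CIs; whether this matches the software the review authors used is an assumption, and because the inputs are rounded the result is only approximate. The function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile used to back-calculate standard errors from 95% CIs


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * Z_95)


def subgroup_difference_test(estimates, cis):
    """Q test across subgroup estimates; returns (Q, df, P value)."""
    ses = [se_from_ci(lo, hi) for lo, hi in cis]
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # Closed-form chi-square upper tail for the two cases needed here
    if df == 1:
        p_value = math.erfc(math.sqrt(q / 2.0))
    elif df == 2:
        p_value = math.exp(-q / 2.0)
    else:
        raise NotImplementedError("only two or three subgroups are handled in this sketch")
    return q, df, p_value


# Pain SMDs and 95% CIs as reported: individual, class-based, home programmes
mode_smds = [0.76, 0.42, 0.38]
mode_cis = [(0.52, 1.01), (0.33, 0.51), (0.21, 0.55)]

q_modes, df_modes, p_modes = subgroup_difference_test(mode_smds, mode_cis)
print(f"Q = {q_modes:.2f}, df = {df_modes}, P value = {p_modes:.3f}")
```

Under these assumptions, the approximate P value rounds to the reported 0.03.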

Physical function

Studies providing immediate post‐treatment assessments of physical function were similarly categorised (Analysis 5.2): individual treatments (16 studies, 1493 participants), class‐based programmes (24 studies, 2152 participants) and 'home' programmes (seven studies, 699 participants). Pooled analysis demonstrated that each of the treatment delivery modes provided significant improvements in physical function: individual treatments: SMD 0.76, 95% CI 0.50 to 1.03; exercise classes: SMD 0.38, 95% CI 0.26 to 0.49; and home programmes: SMD 0.37, 95% CI 0.21 to 0.53. These effect sizes would be considered large (individual treatments) to small (exercise classes, home programmes) ‐ equivalent to improvements of 16 (95% CI 10 to 21) points for individual treatments, 8 (95% CI 5 to 10) points for exercise classes and 7 (95% CI 4 to 11) points for home programmes on a 0 to 100‐point scale. Between‐study heterogeneity for the category of individual treatments was substantial (I2 = 84%) but was moderate for class‐based programmes (I2 = 33%) and minimal for home programmes (I2 = 8%). A statistically significant difference was detected among the three modes of delivery in terms of physical function (P value 0.03) (Analysis 5.2). Even after exclusion of one extreme outlier in individual treatments (Gur 2002), heterogeneity remained substantial (I2 = 78%), but differences among the three modes of delivery failed to achieve statistical significance (P value 0.06).

Comparison 6

Number of contact occasions

Pain

Studies providing immediate post‐treatment pain assessments were dichotomised according to the number of face‐to‐face contact occasions (in clinics or as home visits) with the healthcare professional supervising or monitoring the exercise programme (Analysis 6.1): fewer than 12 contact occasions (10 studies, 1019 participants) versus 12 or more contact occasions (34 studies, 2468 participants).

Both categories achieved significant benefit: fewer than 12 occasions: SMD 0.40, 95% CI 0.24 to 0.56; 12 or more contact occasions: SMD 0.55, 95% CI 0.45 to 0.66. Although '12 or more occasions' did result in a larger SMD, the effect size of both categories would be considered moderate ‐ equivalent to improvements of 10 (95% CI 6 to 14) points for fewer than 12 occasions and 13 (95% CI 11 to 16) points for 12 or more contact occasions on a 0 to 100‐point scale. Between‐study heterogeneity was moderate (I2 = 35% and 43%). No significant difference could be detected between the two categories of contact occasions in terms of pain (P value 0.15) (Analysis 6.1).

Physical function

Studies providing immediate post‐treatment assessments of physical function were similarly dichotomised (Analysis 6.2): fewer than 12 occasions (nine studies, 1033 participants) versus 12 or more contact occasions (33 studies, 2432 participants). Both categories achieved significant benefit: fewer than 12 contact occasions: SMD 0.33, 95% CI 0.09 to 0.57; 12 or more contact occasions: SMD 0.55, 95% CI 0.41 to 0.60. The category of '12 or more occasions' did result in a larger SMD (moderate effect size) compared with the 'fewer than 12 occasions' category (small effect size) ‐ equivalent to improvements of 7 (95% CI 2 to 11) points for fewer than 12 occasions and 11 (95% CI 8 to 12) points for 12 or more contact occasions, on a 0 to 100‐point scale. However, between‐study heterogeneity was considerable for each category (I2 = 72% and 60%), with no influential outliers (reducing heterogeneity > 25%). Differences between the two categories of contact occasions failed to achieve statistical significance (P value 0.09).

Sensitivity Analyses

Comparison 7

Selection and attrition bias
Pain

If random sequence generation, allocation concealment and incomplete outcome data domains were adequately met by a study, we judged the overall risk of bias as low for that study (14 studies, 1458 participants) (Analysis 7.1). All other included studies were categorised as 'uncertain or high risk of bias' (30 studies, 2029 participants). The pooled effects restricted to the 'low‐risk' studies still indicated a significant reduction in pain (SMD 0.47, 95% CI 0.36 to 0.59) ‐ equivalent to improvements of 12 (95% CI 9 to 15) points on a 0 to 100‐point scale, and very similar to the pooled effects with all studies included (SMD 0.49, 95% CI 0.39 to 0.59) (Analysis 1.1). Between‐study heterogeneity was negligible for studies with low risk of bias (I² = 14%) but substantial for studies categorised as having uncertain or high risk (I² = 52%).
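For readers unfamiliar with the I² values quoted for each category, the statistic expresses the share of variability in study estimates that is attributable to heterogeneity rather than chance. A minimal sketch of the usual calculation (Cochran's Q with inverse‐variance weights, then I² = (Q − df)/Q, truncated at zero) follows. The study‐level data are hypothetical and serve only to illustrate the arithmetic; they are not taken from the included trials.

```python
def i_squared(effects, std_errors):
    """Return (Q, I2 in %) for study-level effects using inverse-variance weights."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Hypothetical study-level SMDs and standard errors, for illustration only
effects = [0.2, 0.5, 0.8]
std_errors = [0.1, 0.1, 0.1]
q, i2 = i_squared(effects, std_errors)
print(f"Q = {q:.1f}, I2 = {i2:.0f}%")
```

With identical study estimates Q is zero and I² is reported as 0%; the more the estimates scatter relative to their standard errors, the closer I² moves towards 100%.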

Physical function

On the basis of the same criteria, 14 studies (1456 participants) were categorised as 'low risk' while 30 studies (2457 participants) were categorised as having 'uncertain or high risk' of bias (Analysis 7.2). Pooled SMDs for 'low‐risk' studies indicated a significant treatment effect: SMD 0.45 (95% CI 0.28 to 0.63) ‐ equivalent to improvements of 9 (95% CI 6 to 13) points on a 0 to 100‐point scale, and very similar to the pooled effect with all studies included (SMD 0.52, 95% CI 0.39 to 0.64) (Analysis 1.2). Between‐study heterogeneity was substantial for both categories (I² = 57% and 72%).

Detection bias
Pain

If participants were stated to be blinded to treatment allocation, we considered the study as low risk for detection bias (3 studies, 226 participants) (Analysis 7.3). All other included studies were categorised as 'uncertain or high risk' of bias (41 studies, 3261 participants). The mean effect for 'low risk' studies (SMD 0.37) was lower than the mean pooled effect with all studies included (SMD 0.49), but equivalent to a mean reduction in pain of 9 points on a 0 to 100‐point scale. However, the 95% CI around the mean SMD for the 'low risk' studies included the possibility of 'no effect' (95% CI ‐0.13 to 0.87). The small number of 'low risk' studies on the basis of participant blinding resulted in extremely wide 95% CIs around the SMD and substantial between‐study heterogeneity (I² = 64%).

Physical function

On the basis of the same criteria, 3 studies (226 participants) were categorised as 'low risk' while 41 studies (3687 participants) were categorised as having 'uncertain or high risk' of bias (Analysis 7.4). The mean effect for the 'low risk' studies (SMD 0.46) was very similar to the mean pooled effect with all studies included (SMD 0.52) and equivalent to a mean improvement in physical function of 9 points on a 0 to 100‐point scale. However, the 95% CI around the mean SMD for the 'low risk' studies included the possibility of 'no effect' (95% CI ‐0.22 to 1.14). Again, the small number of 'low risk' studies on the basis of participant blinding resulted in extremely wide 95% CIs and substantial between‐study heterogeneity (I² = 80%).

Comparisons 1 through 7

Both mean effect sizes and 95% CIs tended to be slightly smaller with a fixed‐effect model than with the random‐effects model used in this meta‐analysis. However, this difference was never clinically meaningful or statistically significant. The only exceptions were Analysis 7.3 and Analysis 7.4, where the fixed‐effect model resulted in markedly smaller SMDs for the 'low risk' categories.
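Why a fixed‐effect model can give a smaller pooled SMD and a narrower CI than a random‐effects model is easiest to see in a small numerical sketch. The code below pools inverse‐variance weighted estimates and, for the random‐effects case, assumes the DerSimonian‐Laird estimate of between‐study variance; the review's tables state only 'IV, Random', so the choice of estimator is an assumption. The three studies are hypothetical: one large study with a small effect and two small studies with larger effects.

```python
import math


def pool(effects, std_errors, random_effects=False):
    """Inverse-variance pooled estimate with 95% CI; optional DerSimonian-Laird tau^2."""
    variances = [se ** 2 for se in std_errors]
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    if random_effects:
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - (len(effects) - 1)) / c)
        # Adding tau^2 to every variance flattens the weights across studies
        weights = [1 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical data, for illustration only
effects = [0.2, 0.7, 0.9]
std_errors = [0.05, 0.20, 0.25]
fixed = pool(effects, std_errors)
random_ = pool(effects, std_errors, random_effects=True)
print("fixed:  %.2f (%.2f to %.2f)" % fixed)
print("random: %.2f (%.2f to %.2f)" % random_)
```

In this example the fixed‐effect estimate is dominated by the large study, whereas the random‐effects weights are more even, which pulls the pooled SMD towards the smaller studies and widens the interval. A similar mechanism may explain the marked differences noted for the small 'low risk' categories.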

Adverse events

Only eleven RCTs specifically reported on adverse events (Abbott 2013; Bennell 2010; Chang 2012; Foley 2003; Foroughi 2011; Fransen 2007; Hurley 2007; Jan 2009; Lim 2008; Lund 2008; van Baar 1998).

Abbott 2013 "detected no trial related adverse events," and van Baar 1998 stated that one participant receiving exercise reported adverse effects. Foley 2003 reported four withdrawals in the exercise group due to increased pain (two people), increased blood pressure (one person) and doctor's advice (one person) compared with one withdrawal due to illness in the control group. Fransen 2007 reported one withdrawal in the Tai Chi allocation group that was due to increased low back pain. The largest numbers of adverse events were reported by Bennell 2010 (five), Hurley 2007 (five), Jan 2009 (five), Lim 2008 (10) and Lund 2008 (11). All reported events were related to increased back, hip or knee pain among participants allocated to exercise. No serious adverse events were reported in any of the included studies.

Discussion

Summary of main results

This systematic review is an update of a previous Cochrane review, published in 2008, which included 32 RCTs. An additional 22 randomised controlled trials have been included in this update for a total of 54 trials, providing data on 5362 participants for outcomes on pain and on 5222 participants for outcomes on physical function. Overall, meta‐analysis demonstrated that evaluated land‐based therapeutic exercise programmes resulted in an immediate mean treatment benefit for knee pain (SMD 0.49, 95% CI 0.39 to 0.59), physical function (SMD 0.52, 95% CI 0.39 to 0.64) and quality of life (SMD 0.28, 95% CI 0.15 to 0.40). These mean immediate treatment benefits, extrapolated from 44 randomised controlled clinical trials involving 3537 participants for pain and 3913 participants for physical function, would be considered moderate ‐ equivalent to 12 (95% CI 10 to 15) points and 10 (95% CI 8 to 13) points for pain and physical function, respectively, on a 0 to 100‐point scale. Treatment benefit for quality of life, extrapolated from 13 trials involving 1073 participants, would be considered small ‐ equivalent to 4 points (95% CI 2 to 5 points). The benefit for pain is comparable with reported estimates for current simple analgesics and non‐steroidal anti‐inflammatory drugs taken for knee pain (Zhang 2010). Confidence intervals around demonstrated pooled results for pain reduction and improvement in physical function do not exclude a minimal clinically important treatment effect (15 points for pain and 10 points for physical function on a 0 to 100‐point scale). If the meta‐analysis result for immediate post‐treatment pain is restricted to those 14 studies, with a total of 1458 participants, evaluated as having low risk of selection and attrition bias, exercise still demonstrated significant benefit (SMD 0.47, 95% CI 0.36 to 0.59) of moderate size ‐ equivalent to 11 (95% CI 9 to 15) points on a 0 to 100‐point scale. 
Similar results were found for physical function when restricted to the 14 studies, with a total of 1456 participants, evaluated as having low risk of bias (SMD 0.45, 95% CI 0.28 to 0.63) ‐ equivalent to 9 (95% CI 6 to 13) points on a 0 to 100‐point scale.

A new analysis added to this Cochrane review is an evaluation of the effects of exercise on quality of life. A relatively small number of studies (13; 24%) evaluated immediate post‐treatment quality of life by using a variety of measures. Five studies reported the Mental Component Summary (MCS) of the Short Form‐36 (SF‐36) health survey, three studies reported the Knee Osteoarthritis Outcome Scale quality of life subscale, two studies evaluated the depression component of the Arthritis Impact Measurement Scales and one study each reported the Hospital Anxiety Depression Scale, SF‐12 MCS and Assessment of Quality of Life. These measures have been validated for use in people with knee OA and have demonstrated generally good responsiveness (Brazier 1999; Liang 1990; Monticone 2013). A small beneficial effect of exercise on quality of life was identified immediately post treatment for people with knee OA. Because of the limited number of studies reporting follow‐up quality of life outcomes, meta‐analysis of treatment sustainability for quality of life could not be performed in this review.

The pain‐relieving benefit of exercise declined at two to six months post exercise but was still significant, as evidenced in 12 studies involving 1468 participants (SMD 0.24, 95% CI 0.14 to 0.35). However, pain benefits were lost longer than six months post exercise, as was found in six studies involving 1104 participants (SMD 0.08, 95% CI ‐0.15 to 0.30). A small but significant treatment benefit for physical function remained two to six months following exercise, as extrapolated from 10 studies involving 1279 participants (SMD 0.15, 95% CI 0.04 to 0.26), as well as at time points longer than six months, as evidenced in six studies involving 1098 participants (SMD 0.20, 95% CI 0.08 to 0.32). These results suggest that although the pain‐relieving benefit of exercise is not maintained six or more months after treatment, improvements in physical function are better sustained.

Overall completeness and applicability of evidence

Because of marked heterogeneity within evaluated exercise programmes, subgroup analyses were conducted according to the stated main focus of the evaluated exercise programme, the mode of treatment delivery and the number of directly supervised treatment occasions. Although these subgroup analyses should be viewed as exploratory, as they are non‐randomised comparisons, some interesting findings were derived. A range of exercise types can be utilised in clinical practice, with lower limb muscle strengthening and general aerobic exercise recommended by most international guidelines (Hochberg 2012; McAlindon 2014). Few studies have attempted to directly compare different types of exercise. One study compared aerobic walking and muscle strengthening, but lack of study power for this particular research question led to inconclusive results (Ettinger 1997a/b). Two other studies compared different strengthening regimens: weight bearing quadriceps exercises versus non‐weight bearing quadriceps exercises in one study (Jan 2009), and concentric‐eccentric strengthening exercises versus isometric strengthening exercises in the other (Salli 2010). Neither study found significant differences between types of strengthening exercises. It is interesting to note that meta‐analyses also could not demonstrate significant differences in the magnitude of treatment effects for pain and physical function between the various exercise programmes (Analysis 4.1; Analysis 4.2). However, for both pain and physical function, exercise programmes classified as “other” (which included Tai Chi or complex non‐specific exercise programmes involving coordination, stretching or balancing exercises) yielded small benefits (pain: SMD 0.35, 95% CI 0.20 to 0.49; physical function: SMD 0.27, 95% CI 0.07 to 0.47) and seemed to be less effective than strengthening and aerobic exercise.
This may reflect the limited focus of these other exercise programmes on specific muscle groups, or it may reflect lower exercise intensity (which was not measured or was not quantifiable for most of these programmes). For physical function in particular, exercise involving quadriceps strengthening alone (10 studies) was the most beneficial, yielding an effect size considered large (SMD 0.74, 95% CI 0.41 to 1.07). Medium effects on physical function were identified for exercise programmes that employed general lower limb strengthening (SMD 0.54, 95% CI 0.26 to 0.83) and strengthening combined with aerobic exercise (SMD 0.52, 95% CI 0.36 to 0.67). Small benefits were detected for walking exercise programmes (SMD 0.35, 95% CI 0.11 to 0.58), although this result was obtained with pooled data from only three studies. Although a programme focusing on quadriceps strengthening yielded the greatest effect on physical function, no statistically significant differences between programmes were noted.

We examined the influence of the exercise programme delivery mode (Analysis 5.1; Analysis 5.2). Although studies assessing home programmes (SMD 0.38) and class‐based programmes (SMD 0.42) demonstrated effect sizes for pain that were consistently smaller than those for more closely supervised individual treatments (SMD 0.76), differences between the various forms of treatment delivery were not statistically significant after two extreme outliers were removed from the individual treatments category. For physical function, individual treatments also yielded a large effect size, whereas exercise classes and home programmes yielded small effect sizes; differences between the three delivery modes failed to achieve statistical significance (P value 0.06) after an extreme outlier had been excluded from the individual treatments category. It should be noted that substantial heterogeneity was demonstrated with individual treatment delivery, and this may reflect the varying numbers of individual contact sessions or the different exercise programmes.

The magnitude of the treatment effect for both pain and physical function was influenced by the number of face‐to‐face contact occasions with the healthcare professional supervising or monitoring the exercise programme (Analysis 6.1; Analysis 6.2). However, unlike in the previous Cochrane review, the difference between fewer than 12 occasions and 12 or more occasions failed to reach statistical significance; this is likely due to considerable between‐study heterogeneity. Taken together, results suggest that most people with knee OA need some form of ongoing monitoring or supervision to optimise clinical benefits of exercise treatment. We chose to classify exposure to exercise interventions on the basis of the number of contact occasions, not according to duration of treatment (e.g. number of weeks). Although no ideal method of classifying exercise therapy exposure is known, the number of contact occasions was chosen, as it provided a quantitative outcome for the number of potential progressions through the exercise programme. A threshold of 12 sessions was chosen because a large number of studies reported twice‐weekly sessions over six weeks or three‐times‐weekly sessions over four weeks, suggesting 12 as a relevant number for dichotomising data.

Exercise 'dosage,' which is a factor of frequency, intensity and programme duration, varied considerably between the studies included in this review. Uncertainties in actual dosage arise as a result of the dependence of exercise intensity not only upon exercise prescription but also upon individual exertion. The influence of programme duration upon dosage is difficult to quantify, with simple addition not providing a sufficient physiologically plausible model. Only one of the included studies attempted to evaluate the influence of exercise dosage on outcomes by comparing high‐ and low‐intensity resistance training of the knee flexor and extensor muscles while controlling for total exercise workload (Jan 2008). Investigators found no significant differences in pain or physical function between groups, although the study was considered to have a moderate to high risk of bias. Furthermore, studies with comparable exercise programme content were insufficient to provide a meaningful subgroup analysis of the influence of exercise dosage on treatment effectiveness. Therefore, specific recommendations cannot be made regarding optimal dosage (frequency, intensity, duration).

Quality of the evidence

Overall quality of the body of evidence was assessed as high when the GRADE approach was applied for pain and quality of life. Although a potential study limitation may exist for evidence on pain and quality of life (a potential for performance and detection bias that may overestimate effect sizes), we did not consider it substantial enough to downgrade the evidence. Evidence underpinning physical function was moderate and was downgraded because of inconsistency (marked heterogeneity between study findings).

For immediate post‐treatment pain and physical function, 14 of 42 studies (33%) were categorised as having low risk of selection and attrition bias (random sequence generation, allocation concealment and incomplete outcome data domains adequately met). Apart from adequate randomisation procedures and allocation concealment and limited loss to follow‐up, blinding of participants when outcome measures are self‐report would provide the best chance that trial results will be free of selection, performance, attrition and detection bias. Blinding of study participants is difficult to achieve in studies evaluating exercise programmes. Using 'sham' exercise as the control intervention can introduce ethical concerns (substantial wasted time for control participants attending an ineffective programme) and is likely to be fairly transparent to most people with OA.

Regarding other methodological criteria, findings included the following: most studies (40; 74%) reported using random sequence generation; 33 studies (61%) reported using blinded outcomes assessment (for other outcomes); only 24 studies (44%) reported adequate allocation concealment and 29 studies (54%) provided complete outcome data. When pooling the results according to risk of selection and attrition bias, the mean treatment effect size for immediate post‐treatment pain and physical function was similar for 'low‐risk' studies (Analysis 7.1; Analysis 7.2) compared with the pooled treatment effects with all studies included (Analysis 1.1; Analysis 1.2). The overall estimate of low risk of selection and attrition bias was comparable for the 22 studies identified in the update (eight 'low risk of bias' studies; 36%) and the 32 studies identified in the previous Cochrane review (11 'low risk of bias' studies; 34%). While the pooled results for the 'low risk' (detection bias) group indicated a lower mean effect for pain and physical function, the confidence intervals indicate a finding of uncertainty (not of 'no effect') as the confidence intervals do not exclude a clinically important effect.

Potential biases in the review process

Some important caveats to this review must be stated. First, given that the comparator in many studies was a no treatment control group, and that blinding of participants was not performed in almost all trials, the well‐documented strong placebo effects for self‐reported outcomes in knee OA (Zhang 2010) have not been controlled for in the exercise studies. Thus it is not possible to determine the exact magnitude of beneficial effects. The second issue concerns the responsiveness of self‐reported pain and physical function measures. Many of the studies included in this systematic review recruited a majority of participants with early or mild symptomatic disease. Although people with early disease frequently demonstrate reduced muscle strength and aerobic capacity compared with their age‐ and gender‐matched peers without symptomatic OA, these physiological impairments often are not yet large enough to translate into reportable difficulties on simple questionnaires. This lack of reportable difficulties would considerably reduce the potential range of improvement that was possible (ceiling effect) on self‐report questionnaires in people with early or mild disease. One of the potential benefits of exercise in people with early disease, such as increased physiological reserve capacity, will not be captured by these questionnaires. Objective measures of physical performance not only strengthen the methodological quality of a study when masking to allocation is unattainable for the participant, they also potentially provide data that can be used to better discriminate between people with early disease in whom disease‐related impairments have not yet developed into self‐reported functional limitations or disability. Thus, reporting of both objective physiological measures and self‐reported assessments in an individual study is desirable.

Several limitations of this review have been identified. We conducted an extensive literature search. Because resources were limited, we extracted data only from studies published in the English language, potentially excluding other evidence. Four studies were published in a language other than English (Carlos 2012; Ghroubi 2008; Oida 2008; Rosa 2012), and we were unable to source full text for two studies (Eungpinichpong 1997; Keogan 2007). These studies await classification. However, the possibility of publication bias could not be ruled out, as we did not attempt to retrieve unpublished studies.

The effectiveness of exercise was investigated only for measures of self‐reported pain, physical function and quality of life. However, regular exercise has been demonstrated to offer many other overall physical and mental health benefits, apart from those related to OA‐induced disease impairments. Therefore this review likely underestimates the overall beneficial effects of exercise amongst people with knee OA. Mediating effects of exercise dosage and disease severity on the effectiveness of exercise could not be ascertained because of large variability in reported data.

Agreements and disagreements with other studies or reviews

Updated results of this meta‐analysis concur with previously identified benefits of exercise for pain and physical function among people with knee OA. However, effect sizes are greater than those reported in the previous Cochrane review (SMD 0.40, 95% CI 0.30 to 0.50 for pain; SMD 0.37, 95% CI 0.25 to 0.49 for physical function). A moderate effect size for pain was noted, whereas the previous small effect size for physical function has increased and now would be classified as moderate. The larger effects identified in this review are likely due to separation of findings into those noted immediately post treatment and those reported at a follow‐up time point, which could not be done in the previous review, given the smaller study numbers. Hence the larger effects are a reflection of superior results immediately following treatment.

Figure 1. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 2. Forest plot of comparison: 1 Post treatment, outcome: 1.1 Pain.

Figure 3. Forest plot of comparison: 1 Post treatment, outcome: 1.2 Physical function.

Figure 4. Forest plot of comparison: 1 Post treatment, outcome: 1.3 Quality of life.

Analysis 1.1. Comparison 1 Post treatment, Outcome 1 Pain.

Analysis 1.2. Comparison 1 Post treatment, Outcome 2 Physical function.

Analysis 1.3. Comparison 1 Post treatment, Outcome 3 Quality of Life.

Analysis 1.4. Comparison 1 Post treatment, Outcome 4 Study withdrawals.

Analysis 2.1. Comparison 2 Treatment sustainability 2‐6 months, Outcome 1 Pain.

Analysis 2.2. Comparison 2 Treatment sustainability 2‐6 months, Outcome 2 Physical function.

Analysis 3.1. Comparison 3 Treatment sustainability > 6 months, Outcome 1 Pain.

Analysis 3.2. Comparison 3 Treatment sustainability > 6 months, Outcome 2 Physical function.

Analysis 4.1. Comparison 4 Treatment content, Outcome 1 Pain.

Analysis 4.2. Comparison 4 Treatment content, Outcome 2 Physical function.

Analysis 5.1. Comparison 5 Treatment delivery mode, Outcome 1 Pain.

Analysis 5.2. Comparison 5 Treatment delivery mode, Outcome 2 Physical Function.

Analysis 6.1. Comparison 6 Number of contact occasions, Outcome 1 Pain.

Analysis 6.2. Comparison 6 Number of contact occasions, Outcome 2 Physical function.

Analysis 7.1. Comparison 7 Sensitivity Analyses, Outcome 1 Selection and attrition bias: pain.

Analysis 7.2. Comparison 7 Sensitivity Analyses, Outcome 2 Selection and attrition bias: physical function.

Analysis 7.3. Comparison 7 Sensitivity Analyses, Outcome 3 Detection bias: pain.

Analysis 7.4. Comparison 7 Sensitivity Analyses, Outcome 4 Detection bias: physical function.

Summary of findings for the main comparison. Immediate post‐treatment effects of exercise for osteoarthritis of the knee

Patient or population: patients with knee OA
Settings: clinic or community
Intervention: land‐based exercise
Comparison: no exercise

| Outcomes | Illustrative comparative risks* (95% CI): assumed risk (no exercise) | Illustrative comparative risks* (95% CI): corresponding risk (land‐based exercise) | Relative effect (95% CI) | Number of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Pain. Self‐report questionnaires. Scale from 0‐100 (0 represents no pain) | Mean pain in the control groups was 44 points | Mean pain in intervention groups was 0.49 standard deviations lower (0.39‐0.59 lower). This translates to an absolute mean reduction of 12 (10‐15) points compared with control group on a 0‐100 scale [a] | | 3537 (44 studies) | ⊕⊕⊕⊕ High | SMD ‐0.49 (‐0.39 to ‐0.59). Absolute reduction in pain 12% (10%‐15%); relative change 27% (21%‐32%) [a]. NNTB 4 (3‐5) [b] |
| Physical function. Self‐report questionnaire. Scale from 0‐100 (0 represents no physical disability) | Mean physical function in control groups was 38 points | Mean physical function in intervention groups was 0.52 standard deviations lower (0.39‐0.64 lower). This translates to an absolute mean improvement of 10 (8‐13) points on a 0‐100 scale [c] | | 3913 (44 studies) | ⊕⊕⊕⊝ Moderate [d] | SMD ‐0.52 (‐0.39 to ‐0.64). Absolute improvement 10% (8%‐13%); relative improvement 26% (20%‐32%) [c]. NNTB 4 (3‐5) [b] |
| Quality of life. Self‐report questionnaire. Scale from 0‐100 (100 is maximum quality of life) | Mean quality of life in control groups was 43 points | Mean quality of life in intervention groups was 0.28 standard deviations higher (0.15‐0.4 higher). This translates to an absolute improvement of 4 (2‐5) points on a 0‐100 scale [e] | | 1073 (13 studies) | ⊕⊕⊕⊕ High | SMD 0.28 (0.15‐0.40). Absolute improvement 4% (2%‐5%); relative improvement 9% (5%‐13%) [e]. NNTB 8 (5‐14) [b] |
| Study withdrawals or dropouts | 153 per 1000 | 137 per 1000 | | 4607 (44 studies) | ⊕⊕⊕⊕ High | OR 0.93 (0.75‐1.15). Absolute risk reduction: 1% fewer events with exercise (2% fewer‐2% more); relative risk reduction 6% fewer events with exercise (21% fewer‐12% more). NNTH n/a [b] |

*The basis for the assumed risk (e.g. median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; GRADE: Grades of Recommendation, Assessment, Development and Evaluation; KOOS: Knee Osteoarthritis Outcome Scale; NNTB: Number needed to treat for an additional beneficial outcome; NNTH: Number needed to treat for an additional harmful outcome; SMD: Standardised mean difference.

GRADE Working Group grades of evidence.
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

[a] Calculations based on the control group baseline mean (SD) pain: 44.3 (24.4) points on 0‐100 scale (from Yip 2007).

[b] Number needed to treat for an additional beneficial outcome (NNTB) or harmful outcome (NNTH) not applicable (n/a) when result was not statistically significant. Number needed to treat (NNT) for continuous outcomes calculated using the Wells calculator (from the CMSG Editorial office; http://musculoskeletal.cochrane.org/), and for dichotomous outcomes using the Cates NNT calculator (www.nntonline.net/visualrx/).

[c] Calculations based on the control group baseline mean (SD) function: 40.0 (20.0) points on 0‐100 scale (from Hurley 2007).

[d] Physical function downgraded for inconsistency (heterogeneity, I² = 68%).

[e] Calculated on the basis of the control group baseline mean (SD): 39.2 (13.1) points on 0‐100 KOOS subscale (from Lund 2008).

Comparison 1. Post treatment

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Pain | 44 | 3537 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.49 [‐0.59, ‐0.39] |
| 1.1 Change scores | 28 | 2136 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.50 [‐0.62, ‐0.38] |
| 1.2 End of treatment scores | 16 | 1401 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.47 [‐0.65, ‐0.29] |
| 2 Physical function | 44 | 3913 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.64, ‐0.39] |
| 2.1 Change scores | 28 | 2253 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.47 [‐0.63, ‐0.31] |
| 2.2 End of treatment scores | 16 | 1660 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.59 [‐0.78, ‐0.40] |
| 3 Quality of Life | 13 | 1073 | Std. Mean Difference (IV, Random, 95% CI) | 0.28 [0.15, 0.40] |
| 3.1 Change scores | 8 | 848 | Std. Mean Difference (IV, Random, 95% CI) | 0.27 [0.13, 0.42] |
| 3.2 End of treatment scores | 5 | 225 | Std. Mean Difference (IV, Random, 95% CI) | 0.30 [0.04, 0.57] |
| 4 Study withdrawals | 45 | 4607 | Odds Ratio (M‐H, Random, 95% CI) | 0.93 [0.75, 1.15] |

Comparison 2. Treatment sustainability 2‐6 months

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Pain | 12 | 1468 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.24 [‐0.35, ‐0.14] |
| 1.1 Change | 4 | 563 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.19 [‐0.36, ‐0.02] |
| 1.2 End of follow‐up | 8 | 905 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.28 [‐0.42, ‐0.15] |
| 2 Physical function | 10 | 1279 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.15 [‐0.26, ‐0.04] |
| 2.1 Change scores | 4 | 566 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.15 [‐0.31, 0.02] |
| 2.2 End of follow‐up | 6 | 713 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.15 [‐0.32, 0.02] |

Comparison 3. Treatment sustainability > 6 months

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Pain | 8 | 1272 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐1.01, ‐0.03] |
| 1.1 Change | 4 | 1024 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.05 [‐0.35, 0.26] |
| 1.2 End of follow‐up | 4 | 248 | Std. Mean Difference (IV, Random, 95% CI) | ‐1.03 [‐2.02, ‐0.04] |
| 2 Physical function | 8 | 1266 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.57 [‐1.05, ‐0.10] |
| 2.1 Change | 4 | 1024 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.20 [‐0.32, ‐0.07] |
| 2.2 End of follow‐up | 4 | 242 | Std. Mean Difference (IV, Random, 95% CI) | ‐1.03 [‐2.07, 0.02] |

Comparison 4. Treatment content

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Pain | 44 | 3487 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.60, ‐0.41] |
| 1.1 Quads strengthening only | 9 | 620 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.64 [‐0.95, ‐0.33] |
| 1.2 Lower limb strengthening | 12 | 863 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.53 [‐0.78, ‐0.28] |
| 1.3 Strengthening and aerobics | 10 | 920 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.50 [‐0.64, ‐0.37] |
| 1.4 Walking programmes | 4 | 351 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.48 [‐0.83, ‐0.13] |
| 1.5 Other programmes | 10 | 733 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.35 [‐0.49, ‐0.20] |
| 2 Physical function | 44 | 4255 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.62, ‐0.39] |
| 2.1 Quadriceps strengthening only | 10 | 726 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.74 [‐1.07, ‐0.41] |
| 2.2 Lower limb strengthening | 13 | 1066 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.54 [‐0.83, ‐0.26] |
| 2.3 Strengthening and aerobics | 10 | 1231 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.67, ‐0.36] |
| 2.4 Walking programmes | 3 | 317 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.35 [‐0.58, ‐0.11] |
| 2.5 Other programmes | 10 | 915 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.27 [‐0.47, ‐0.07] |

Comparison 5. Treatment delivery mode

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Pain | 44 | 3588 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.50 [‐0.60, ‐0.41] |
| 1.1 Individual treatments | 14 | 1133 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.76 [‐1.01, ‐0.52] |
| 1.2 Class‐based programmes | 24 | 1905 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.42 [‐0.51, ‐0.33] |
| 1.3 Home programmes | 7 | 550 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.38 [‐0.55, ‐0.21] |
| 2 Physical Function | 45 | 4344 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.49 [‐0.61, ‐0.38] |
| 2.1 Individual treatments | 16 | 1493 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.76 [‐1.03, ‐0.50] |
| 2.2 Class‐based programmes | 24 | 2152 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.38 [‐0.49, ‐0.26] |
| 2.3 Home programmes | 7 | 699 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.37 [‐0.53, ‐0.21] |

Comparison 6. Number of contact occasions

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Pain | 44 | 3487 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.60, ‐0.41] |
| 1.1 Fewer than 12 occasions | 10 | 1019 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.40 [‐0.56, ‐0.24] |
| 1.2 12 or more occasions | 34 | 2468 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.55 [‐0.66, ‐0.43] |
| 2 Physical function | 44 | 3913 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.64, ‐0.39] |
| 2.1 Fewer than 12 occasions | 9 | 1033 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.33 [‐0.57, ‐0.09] |
| 2.2 12 or more occasions | 35 | 2880 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.57 [‐0.71, ‐0.43] |
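
Comparison 6 splits the trials into two subgroups by number of contact occasions. One rough way to judge whether two subgroup estimates differ is a z test on their difference, sketched below. This is an illustrative approximation, not the review's own test for subgroup differences: it assumes the subgroups contain different trials (so the estimates are independent), uses a normal approximation, and relies on the rounded intervals shown in the table. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def subgroup_difference(est_a, ci_a, est_b, ci_b):
    """Difference between two independent estimates, with z and two-sided p."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    diff = est_a - est_b
    # Variances add because the subgroups are assumed independent.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p


# Pain, Comparison 6: fewer than 12 occasions versus 12 or more occasions.
diff, z, p = subgroup_difference(-0.40, (-0.56, -0.24), -0.55, (-0.66, -0.43))
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Because the inputs are rounded to two decimals, the resulting p value should be read as indicative only.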

Comparison 7. Sensitivity Analyses

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Selection and attrition bias: pain | 44 | 3487 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.60, ‐0.41] |
| 1.1 Low risk | 14 | 1458 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.47 [‐0.59, ‐0.36] |
| 1.2 Unclear or high risk | 30 | 2029 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.53 [‐0.67, ‐0.39] |
| 2 Selection and attrition bias: physical function | 44 | 3913 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.64, ‐0.39] |
| 2.1 Low risk | 14 | 1456 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.45 [‐0.63, ‐0.28] |
| 2.2 Unclear or high risk | 30 | 2457 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.55 [‐0.72, ‐0.38] |
| 3 Detection bias: pain | 44 | 3487 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.60, ‐0.41] |
| 3.1 Low risk | 3 | 226 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.37 [‐0.87, 0.13] |
| 3.2 Unclear or high risk | 41 | 3261 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.61, ‐0.42] |
| 4 Detection bias: physical function | 44 | 3913 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.64, ‐0.39] |
| 4.1 Low risk | 3 | 226 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.46 [‐1.14, 0.22] |
| 4.2 Unclear or high risk | 41 | 3687 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.52 [‐0.65, ‐0.40] |
