
Exercise programs for people with dementia


Background

This is an update of our previous 2013 review. Several recent trials and systematic reviews of the impact of exercise on people with dementia are reporting promising findings.

Objectives

Primary objective

Do exercise programs for older people with dementia improve their cognition, activities of daily living (ADLs), neuropsychiatric symptoms, depression, and mortality?

Secondary objectives

Do exercise programs for older people with dementia have an indirect impact on family caregivers’ burden, quality of life, and mortality?

Do exercise programs for older people with dementia reduce the use of healthcare services (e.g. visits to the emergency department) by participants and their family caregivers?

Search methods

We identified trials for inclusion in the review by searching ALOIS (www.medicine.ox.ac.uk/alois), the Cochrane Dementia and Cognitive Improvement Group’s Specialised Register, on 4 September 2011, on 13 August 2012, and again on 3 October 2013.

Selection criteria

In this review, we included randomized controlled trials in which older people, diagnosed with dementia, were allocated either to exercise programs or to control groups (usual care or social contact/activities) with the aim of improving cognition, ADLs, neuropsychiatric symptoms, depression, and mortality. Secondary outcomes related to the family caregiver(s) and included caregiver burden, quality of life, mortality, and use of healthcare services.

Data collection and analysis

Independently, at least two authors assessed the retrieved articles for inclusion, assessed methodological quality, and extracted data. We analysed data for summary effects. We calculated mean differences or standardized mean difference (SMD) for continuous data, and synthesized data for each outcome using a fixed‐effect model, unless there was substantial heterogeneity between studies, when we used a random‐effects model. We planned to explore heterogeneity in relation to severity and type of dementia, and type, frequency, and duration of exercise program. We also evaluated adverse events.

Main results

Seventeen trials with 1067 participants met the inclusion criteria. However, the required data from three included trials and some of the data from a fourth trial were not published and not made available. The included trials were highly heterogeneous in terms of subtype and severity of participants' dementia, and type, duration, and frequency of exercise. Only two trials included participants living at home.

Our meta‐analysis revealed that there was no clear evidence of benefit from exercise on cognitive functioning. The estimated standardized mean difference between exercise and control groups was 0.43 (95% CI ‐0.05 to 0.92, P value 0.08; 9 studies, 409 participants). There was very substantial heterogeneity in this analysis (I² value 80%), most of which we were unable to explain, and we rated the quality of this evidence as very low. We found a benefit of exercise programs on the ability of people with dementia to perform ADLs in six trials with 289 participants. The estimated standardized mean difference between exercise and control groups was 0.68 (95% CI 0.08 to 1.27, P value 0.02). However, again we observed considerable unexplained heterogeneity (I² value 77%) in this meta‐analysis, and we rated the quality of this evidence as very low. This means that there is a need for caution in interpreting these findings.

In further analyses, in one trial we found that the burden experienced by informal caregivers providing care in the home may be reduced when they supervise the participation of the family member with dementia in an exercise program. The mean difference between exercise and control groups was ‐15.30 (95% CI ‐24.73 to ‐5.87; 1 trial, 40 participants; P value 0.001). There was no apparent risk of bias in this study. In addition, there was no clear evidence of benefit from exercise on neuropsychiatric symptoms (MD ‐0.60, 95% CI ‐4.22 to 3.02; 1 trial, 110 participants; P value 0.75), or depression (SMD 0.14, 95% CI ‐0.07 to 0.36; 5 trials, 341 participants; P value 0.16). We could not examine the remaining outcomes, quality of life, mortality, and healthcare costs, as either the appropriate data were not reported, or we did not retrieve trials that examined these outcomes.
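As a reading aid, the relationship between a reported estimate, its 95% CI, and its P value can be sketched in a few lines of Python. This is only an illustration under a normal approximation (critical value 1.96); the function name `ci_to_p` is hypothetical, and the review software may use slightly different calculations, so small discrepancies from the published P values are to be expected.

```python
import math

def ci_to_p(estimate, lower, upper, z_crit=1.96):
    """Approximate the standard error, z statistic and two-sided P value
    from a point estimate and its 95% confidence interval.
    Assumes a symmetric, normally distributed estimate (an assumption of this sketch)."""
    se = (upper - lower) / (2 * z_crit)      # CI half-width divided by 1.96
    z = estimate / se                        # test of "no difference"
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided tail area of the normal
    return se, z, p

# Figures taken from the Main results above
for label, est, lo, hi in [
    ("cognition (SMD)", 0.43, -0.05, 0.92),
    ("ADLs (SMD)", 0.68, 0.08, 1.27),
    ("neuropsychiatric symptoms (MD)", -0.60, -4.22, 3.02),
]:
    se, z, p = ci_to_p(est, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, P={p:.3f}")
```

Under these assumptions the recomputed P values for cognition, ADLs, and neuropsychiatric symptoms fall close to the reported 0.08, 0.02, and 0.75.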

Authors' conclusions

There is promising evidence that exercise programs may improve the ability to perform ADLs in people with dementia, although some caution is advised in interpreting these findings. The review revealed no evidence of benefit from exercise on cognition, neuropsychiatric symptoms, or depression. There was little or no evidence regarding the remaining outcomes of interest (i.e., mortality, caregiver burden, caregiver quality of life, caregiver mortality, and use of healthcare services).


Exercise programs for people with dementia

Background

In future, as the population ages, the number of people in our communities suffering with dementia will rise dramatically. This will not only affect the quality of life of people with dementia but also increase the burden on family caregivers, community care, and residential care services. Exercise is one lifestyle factor that has been identified as a potential means of reducing or delaying progression of the symptoms of dementia.

Study characteristics

This review evaluated the results of 17 trials (search dates August 2012 and October 2013), including 1,067 participants, that tested whether exercise programs could improve cognition (which includes such things as memory, reasoning ability and spatial awareness), activities of daily living, behaviour and psychological symptoms (such as depression, anxiety and agitation) in older people with dementia. We also looked for effects on mortality, quality of life, caregivers' experience and use of healthcare services, and for any adverse effects of exercise.

Key findings

There was some evidence that exercise programs can improve the ability of people with dementia to perform daily activities, but there was a lot of variation among trial results that we were not able to explain. The studies showed no evidence of benefit from exercise on cognition, psychological symptoms, and depression. There was little or no evidence regarding the other outcomes listed above. There was no evidence that exercise was harmful for the participants. We judged the overall quality of evidence behind most of the results to be very low.

Conclusion

Additional well‐designed trials would allow us to enhance the quality of the review by investigating the best type of exercise program for people with different types and severity of dementia and by addressing all of the outcomes.

Authors' conclusions

Implications for practice

With an increased number of trials now available, there is evidence suggesting that exercise programs may improve the ability of people with dementia to perform activities of daily living (ADLs). Healthcare providers who work with people with dementia and their caregivers should feel confident in promoting exercise among this population, as slowing the progression of dependence in ADLs will have significant benefits for the quality of life of people with dementia and their family caregivers, and may delay the need for placement in long‐term care settings. No trials reported adverse events related to exercise programs.

One trial that examined the burden experienced by family caregivers who provide care in the home revealed that this burden can be reduced if caregivers supervise the family member with dementia during participation in an exercise program. Therefore, encouraging caregivers to participate in exercise may also have a beneficial impact on their quality of life.

Setting of intervention (home versus institutional) should be considered in the future, if more studies become available. There was an insufficient number of trials to permit subgroup analyses that would determine which type of exercise (aerobic, strength training, balance, or a combination), at what frequency and duration, is most beneficial for specific types and severity of dementia. Clearly further research is needed to be able to develop best practice guidelines that would be helpful to healthcare providers in advising people with dementia living in institutional and community settings.

Implications for research

For older people in general, recent research recommends at least 150 minutes of moderate‐ to vigorous‐intensity aerobic exercise per week, in bouts of 10 minutes or more, to achieve better quality of life, improve functional abilities, and reduce risk of disease, death, and loss of independence by up to 60%. In addition, muscle‐ and bone‐strengthening activities using major muscle groups are recommended at least twice per week (Chodzko‐Zajko 2009; Tremblay 2011). Other research has revealed that aerobic‐type exercise has a clear benefit over strength training, and that moderate‐intensity exercise of at least one hour a day, three to five times or more a week, may be more effective in improving cognition (Kramer 2007; Middleton 2007).

However, these recommendations may not be appropriate for people with dementia. Further research is necessary to identify the optimal exercise modalities particularly in terms of frequency, intensity, and duration for people with different types and severity of dementia and to identify barriers and facilitators to improving adherence. Attempting to match the exercise programs with the needs, capabilities, and preferences of people with dementia, and ensuring adequate funding to provide regular, appropriate programs, over extended periods, by qualified instructors may increase adherence (Forbes 2007). Additional well designed trials that are conducted in the community setting, which is where most people with dementia live, and that examine outcomes of relevance to people with dementia (e.g. cognition, ADLs, depression, neuropsychiatric symptoms, quality of life and mortality), family caregiver outcomes (e.g. caregiver burden, quality of life, and mortality) and economic analysis of visits to emergency departments, acute care settings, and cost of residential care are also needed.

No serious adverse events were attributed to participation in the trials. Recent research suggests that high‐intensity weight‐bearing exercise does not seem to have a negative effect on functional balance (Conradsson 2010). However, further research is needed about potential adverse events from aerobic exercise programs to determine whether this population is similar to older adults in general who are less likely to fall and less likely to injure themselves from falls if they are physically active (Kannus 2005; Sherrington 2004), or if the risk of falling and of cardiovascular events is higher in people with dementia during aerobic exercise.

Clinical researchers should ensure that their trials are of high methodological quality and that published articles report the randomization process (sequence generation and allocation concealment), blinding of outcome assessors, attrition rates and reasons for drop‐outs from both treatment and control groups, rates of adherence to the exercise programs and reasons for withdrawal, and adverse events related to the exercise programs. Alternatively, they should be willing to share this information with reviewers when contacted. Providing statistically appropriate data (e.g. end point means and standard deviations) would also allow the trial results to be incorporated into meta‐analyses.

Summary of findings

Summary of findings for the main comparison. Exercise programs for people with dementia

Patient or population: people with dementia
Settings: long term care, community programs, home
Intervention: exercise program compared to usual care or a social group activity

| Outcomes | Illustrative comparative risks* (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|
| Cognition, SD units. Investigators measured cognition using different instruments. Higher scores represent better cognitive function. Follow‐up: 6‐36 weeks | The mean score for cognition in the intervention groups was 0.43 standard deviation units higher (0.05 lower to 0.92 higher) | 409 (9 studies) | ⊕⊝⊝⊝ very low (a) | As a rough guide, a difference of 0.2 SD represents a small, 0.5 a moderate and 0.8 a large treatment effect |
| Activities of daily living, SD units. Investigators measured ADLs using different instruments. Higher scores represent better performance. Follow‐up: 7‐52 weeks | The mean score for activities of daily living in the intervention groups was 0.68 standard deviations higher (0.08 to 1.27 higher) | 289 (6 studies) | ⊕⊝⊝⊝ very low (b) | |
| Depression, SD units. Investigators measured depression using a variety of scales. Lower scores represent improvement. Follow‐up: 6‐52 weeks | The mean score for depression in the intervention groups was 0.14 lower (0.36 lower to 0.07 higher) | 341 (5 studies) | ⊕⊕⊕⊝ moderate (c) | |
| Neuropsychiatric symptoms. Measured using NPI. Severity of symptoms is measured on a scale of 0‐144. A higher score indicates worse symptoms. Follow‐up: 12 months | The mean NPI score in the intervention group was 0.60 points lower (4.22 lower to 3.02 higher) | 110 (1 study) | ⊕⊝⊝⊝ very low (d) | A minimum difference of 8 points in the NPI scale has been considered to be clinically important |

GRADE Working Group grades of evidence
High quality: further research is very unlikely to change our confidence in the estimate of effect
Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low quality: we are very uncertain about the estimate

a rated down for serious inconsistency between studies (I² 80%), imprecision and publication bias (12 studies measured cognitive outcomes but data were only available from 9)

b rated down for serious inconsistency between studies (I² 77%) and imprecision

c rated down for imprecision

d rated down because data came from a single study, for imprecision and for publication bias (5 studies measured neuropsychiatric outcomes but only one provided usable data)

Background

 

Description of the condition

In 2012, the World Health Organization declared dementia to be a public health priority (World Health Organization 2012), citing the high global prevalence and economic impact on families, communities, and health service providers. In the coming decades, with the aging of the population, the number of people living with dementia in our communities will rise dramatically. This will increase the burden on family caregivers, community care, and residential care services (Alzheimer Society of Canada 2010; World Alzheimer Report 2011). People diagnosed with dementia often have unique needs, as they tend to be older and present with acquired impairment in memory, associated with other disturbances of higher cortical function, or personality changes (APA 1995; McKhann 1984). As a first approach, best practice guidelines currently recommend the exploration of behavioural and psychological interventions before initiating pharmacological interventions, due to the limited benefit of pharmacological treatments in reducing functional decline and their potential side effects (Forbes 2008a; Hogan 2008). Exercise is among the potential protective lifestyle factors identified as a strategy for treating the symptoms of dementia or delaying its progression (Lautenschlager 2010).

Description of the intervention

Exercise programs with older adults have been shown to improve cognitive function (Angevaren 2008; Erickson 2011; Tseng 2011), and depression (Chen 2009). Many of these studies used a 60‐minute exercise regimen scheduled three times per week that continued for 24 weeks (Tseng 2011). Hamer 2009 conducted a systematic review that included 16 prospective studies (163,797 participants without dementia at baseline with 3219 with dementia at follow‐up). The relative risk (RR) of dementia in the highest exercise category compared with the lowest was 0.72 (95% CI 0.60 to 0.86, P value < 0.001) and for Alzheimer's disease (AD) the RR was 0.55 (95% CI 0.36 to 0.84, P value 0.006). The authors concluded that exercise is inversely associated with risk of dementia (i.e. reduces the likelihood of dementia). Others, for example Chang 2010, have also revealed that mid‐life exercise may contribute to maintenance of cognitive function and may reduce or delay the risk of late‐life dementia. Intlekofer 2012 suggests that evidence is starting to emerge that exercise supports brain health, even when initiated after the appearance of AD pathology. Clearly, further investigation is needed in this area.

How the intervention might work

Physical activity refers to "body movement that is produced by the contraction of skeletal muscles and that increases energy expenditure" (Chodzko‐Zajko 2009). Exercise refers to "planned, structured, and repetitive movement to improve or maintain one or more components of physical fitness" (Chodzko‐Zajko 2009). A detailed examination of the potential mechanism(s) of physical activity and exercise is beyond the scope of this review. For further information the reader is directed to two recent reviews, Erickson 2012 and Davenport 2012. Briefly, exercise improves vascular health by reducing blood pressure (Fleg 2012), arterial stiffness (Fleg 2012), oxidative stress (Covas 2002), and systemic inflammation (Lavie 2011), and by enhancing endothelial function (Ghisi 2010), all of which are associated with the maintenance of cerebral perfusion (Churchill 2002; Davenport 2012; Rogers 1990). Recent evidence has shown a strong association between cerebral perfusion (i.e. balance between the supply and demand of nutrients to the brain), cognitive function, and fitness in older healthy adults (Brown 2010). Furthermore, insulin resistance or glucose intolerance is linked with amyloid β plaque formation (Farris 2003; Wareham 2000; Watson 2003), which is a feature of AD. Exercise is known to enhance insulin sensitivity and glucose control (Ryan 2000). Exercise may also preserve neuronal structure and promote neurogenesis, synaptogenesis, and capillarization (formation of nerve cells, the gaps between them, and blood vessels, respectively; Colcombe 2003), which may be associated with exercise‐induced elevation in brain‐derived neurotrophic factor (BDNF; Vaynman 2004), and insulin‐like growth factors (Cotman 2007). Animal and human studies investigating the role of BDNF have provided evidence that BDNF supports the health and growth of neurons and may regulate neuroplasticity (adaptability of the brain) as we age (Cheng 2003; Vaynman 2004).
Intlekofer 2012 recently reported that exercise reinstates hippocampal function (i.e. memory) by enhancing the expression of BDNF and other growth factors that promote neurogenesis, angiogenesis (formation of blood vessels), and synaptic plasticity. Taken together, animal and human studies indicate that exercise provides a powerful stimulus that can counteract the molecular changes that underlie the progressive loss of hippocampal function in advanced age and AD (Erickson 2012).

Why it is important to do this review

There was tremendous response to our 2013 review from both the media and researchers. Due to the suspected increase in research activity in this area, we feel it is important to keep our review updated and relatively current.

Objectives

Primary objective

  • Do exercise programs for older people with dementia improve their cognition, activities of daily living (ADLs), neuropsychiatric symptoms, depression, and mortality?

Secondary objectives

  • Do exercise programs for older people with dementia have an indirect impact on family caregivers’ burden, quality of life, and mortality?

  • Do exercise programs for older people with dementia reduce the use of healthcare services (e.g. visits to the emergency department) by participants and their family caregivers?

Methods

Criteria for considering studies for this review

Types of studies

In this review, we included randomized controlled trials (RCTs) in which older people diagnosed with dementia were allocated to either an exercise program or a control group (usual care or social contact/activities). Although we preferred parallel group trials, cross‐over trials were eligible, but we only considered data from the first treatment phase (prior to the cross‐over). We included non‐blinded trials, as it was unrealistic to expect blinding of the participants and those who conducted the exercise programs. We expected outcome assessors to be blinded to treatment allocation; however, we did not exclude studies if blinding of outcome assessors was not incorporated in the study. We rated studies for blinding in the 'Risk of bias' tables.

Types of participants

The majority of participants in the trials had to be older people (over 65 years of age) and diagnosed as having dementia using accepted criteria such as the Diagnostic and Statistical Manual of Mental Disorders (APA 1987; APA 1995; DSM‐IV 1994), the National Institute of Neurological and Communicative Disorders and Stroke, and the Alzheimer's Disease and Related Disorders Association (McKhann 1984), ICD‐10 (World Health Organization 1992), or CERAD‐K (Hwang 2010).

Types of interventions

Interventions included exercise programs offered over any length of time with the aim of improving cognition, activities of daily living (ADLs), neuropsychiatric symptoms, depression, and mortality in older people with dementia; reducing the family caregiver's burden; improving the caregiver's health or quality of life; decreasing caregiver mortality or use of healthcare services; or a combination of these. We included trials where the only difference between groups was the exercise intervention, and the types, frequencies, intensities, duration, and settings of the exercise programs were described. The exercise could be any combination of aerobic‐, strength‐, or balance‐training. The comparison groups received either usual care, or social contact/activities, to ensure that the participants received a similar amount of attention.

Types of outcome measures

Primary outcomes

The primary outcomes concerned the person with dementia, and included: cognition, ADLs, neuropsychiatric symptoms (e.g. agitation, aggression), depression, and mortality.

Secondary outcomes

The secondary outcomes included the family caregiver’s burden of care, quality of life, and mortality, and costs related to the use of healthcare services.

Search methods for identification of studies

Electronic searches

We searched ALOIS (www.medicine.ox.ac.uk/alois) ‐ the Cochrane Dementia and Cognitive Improvement Group’s Specialised Register ‐ on 4 September 2011, 13 August 2012, and most recently on 3 October 2013. The search terms used were: physical activity OR exercise OR cycling OR swim* OR gym* OR walk* OR danc* OR yoga OR ‘tai chi’.

ALOIS is maintained by the Trials Search Co‐ordinator of the Cochrane Dementia and Cognitive Improvement Group and contains studies in the areas of dementia prevention, dementia treatment, and cognitive enhancement in healthy adults. The studies are identified from: 

  1. monthly searches of a number of major healthcare databases: MEDLINE (Ovid SP), EMBASE (Ovid SP), CINAHL (EBSCOhost), PsycINFO (Ovid SP) and LILACS (BIREME);

  2. monthly searches of a number of trial registers: ISRCTN; UMIN (Japan's Trial Register); the World Health Organization portal (which covers ClinicalTrials.gov; ISRCTN; the Chinese Clinical Trials Register; the German Clinical Trials Register; the Iranian Registry of Clinical Trials and the Netherlands National Trials Register, plus others);

  3. quarterly search of The Cochrane Library’s Central Register of Controlled Trials (CENTRAL);

  4. six‐monthly searches of a number of grey literature sources: ISI Web of Knowledge Conference Proceedings; Index to Theses; and Australasian Digital Theses.

To view a list of all sources searched for ALOIS see About ALOIS on the ALOIS website.

Details of the search strategies used for the retrieval of reports of trials from the healthcare databases, CENTRAL, and conference proceedings can be viewed in the ‘Methods used in reviews’ section within the editorial information about the Dementia and Cognitive Improvement Group.

We performed additional searches in many of the sources listed above to cover the timeframe from the last searches performed for ALOIS to ensure that the search for the review was as up‐to‐date and as comprehensive as possible. There was no restriction on language. The search strategies used can be seen in Appendix 1 and Appendix 2.

We performed another search on 3 October 2013.

Data collection and analysis

Selection of studies

After merging search results and discarding duplicates, at least two authors (DF, SCF, ET) independently examined titles and abstracts of citations. If a title or abstract appeared to meet our inclusion criteria, we retrieved the full article for further assessment. At least two authors, one a content expert (SCF) and the others with expertise in conducting systematic reviews (DF, ET), independently assessed the retrieved articles for inclusion in the review according to the eligibility criteria outlined above. We resolved disagreements by discussion, or if necessary, referred to another author. The excluded articles and reasons for exclusion are listed in the ‘Characteristics of excluded studies’ table.

Data extraction and management

We extracted information from the published articles including the study setting, inclusion and exclusion criteria, participants’ diagnosis and level of activity, description of the exercise programs, the randomization process, blinding, drop‐out rates, and outcome data.

The mean change from baseline to final measurements and the standard deviation (SD) of the change were often not reported in the published reports. Accordingly, we extracted the final mean following the intervention period, the SD of this mean, and the number of participants for each group at each assessment. The included trials reported no dichotomous data of interest to this review. One author extracted data from published reports, or requested it from the original first author when necessary, and at least two authors checked this data entry. We resolved disagreements as noted above.

Assessment of risk of bias in included studies

Criteria for judging risk of bias were based on the Cochrane Handbook for Systematic Reviews of Interventions, version 5.1.0, chapter 8 (Higgins 2011). At least two authors with content expertise (SCF, SF), and two with expertise in conducting systematic reviews (DF, ET), independently assessed and rated the trials according to the 'Risk of bias' criteria below. The authors used an assessment tool to determine whether there was a low, high, or unclear risk of bias for each factor (see table 8.5.d, Higgins 2011). The identity of the publication and author information for each trial report was not masked. If the description of a process or outcome was unclear or missing, we contacted the original author of the trial in an attempt to retrieve the required information. Again, we resolved disagreements by discussion, or, if necessary, referred to a third author. We assessed the following criteria:

  1. Selection bias ‐ systematic differences between baseline characteristics of the groups being compared, including:

    1. random sequence generation;

    2. allocation concealment.

  2. Performance bias ‐ systematic differences between groups in the care that is provided, or in exposure to factors other than the interventions of interest, this includes:

    1. blinding of participants and personnel.

  3. Detection bias ‐ systematic differences between groups in how outcomes are determined, this includes:

    1. blinding of outcome assessments.

  4. Attrition bias ‐ systematic differences between groups in withdrawals from a study, this includes:

    1. incomplete outcome data.

  5. Reporting bias ‐ systematic differences between reported and unreported findings, that is:

    1. outcome reporting bias.

    2. publication bias.

  6. Other bias, such as:

    1. bias due to other problems.

Measures of treatment effect

Summary statistics were required for each trial and each outcome. For continuous data, we used the mean difference (MD) when the pooled trials used the same rating scale or test to assess an outcome. We used the standardized mean difference (SMD), which is the absolute mean difference divided by the SD, when the pooled trials used different rating scales or tests. We used the inverse variance method in the meta‐analysis. We reported all outcomes using 95% confidence intervals (CI). None of the trials included in the review reported dichotomous data of interest to this review.
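As a rough illustration of the calculations described above, the following Python sketch computes a mean difference, a standardized mean difference, and an inverse variance pooled estimate. The function names and the example numbers in the usage note are hypothetical. The SMD shown is the uncorrected form (difference in means divided by the pooled SD) with a commonly used variance approximation; the text does not state whether a small‐sample correction was applied, so this should not be read as the exact method of the review.

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference (MD) and its variance for two independent groups."""
    md = m1 - m2
    var = sd1 ** 2 / n1 + sd2 ** 2 / n2
    return md, var

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Uncorrected SMD: difference in means divided by the pooled SD.
    The variance uses a standard large-sample approximation (sketch assumption)."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var

def inverse_variance_pool(effects, variances):
    """Fixed-effect inverse variance pooling with a 95% confidence interval."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)
```

For example, `standardized_mean_difference(12, 4, 20, 10, 4, 20)` uses made‐up group means of 12 and 10 with a common SD of 4, which corresponds to a difference of half a standard deviation.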

Unit of analysis issues

If a cross‐over design study had been included in the review, we planned to consider only the results prior to the cross‐over for inclusion in our analysis; however, we did not have any cross‐over design studies to consider. If a trial included three or more arms, we considered the nature of the intervention and control arms, and combined the data from two treatment arms that were similar and had the same control group, as recommended in the Cochrane Handbook for Systematic Reviews of Interventions, section 7.7.3.8 and Table 7.7a (Higgins 2011). For one trial, Williams 2008, we pooled two intervention arms (exercise group and a walking group) that were compared with a control (conversation) group.
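Combining two similar arms into a single group, as described above, can be sketched with the standard formula for merging group sizes, means, and SDs. The function name `combine_arms` and the example values are hypothetical, and this is an illustrative sketch rather than the review's actual computation.

```python
import math

def combine_arms(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into one group (sample size, mean, SD).
    The combined SD accounts for both within-arm spread and the
    difference between the two arm means."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(var)

# Hypothetical example: two arms of 2 participants each,
# raw scores [1, 3] and [5, 7], so each arm has SD sqrt(2).
n, mean, sd = combine_arms(2, 2.0, math.sqrt(2), 2, 6.0, math.sqrt(2))
print(n, mean, round(sd, 3))
```

The combined result equals what would be obtained by computing the mean and sample SD directly on the four raw scores, which is a useful sanity check when merging arms by hand.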

Dealing with missing data

Many types of information were found to be missing from the published articles, such as descriptions of the process of randomization, blinding of outcome assessors, attrition and adherence to the exercise program, reasons for withdrawing, and statistical data (i.e. means and SDs). We emailed contact authors on at least three separate occasions over a three‐month period and requested them to provide the missing data. Some of this missing data is described in the 'Risk of bias' tables. The potential impact of the missing data on the results depended on the extent of missing data, the pooled estimate of the treatment effect, and the variability of the outcomes. We also considered variation in the degree of missing data as a potential source of heterogeneity. If available, we used intention‐to‐treat (ITT) data, and, if these were not available, we used only the reported completers’ data in the analyses.

Assessment of heterogeneity

We considered only trials that demonstrated clinical homogeneity (that is, trials that tested an exercise program and examined similar outcome measures) to be potentially appropriate for meta‐analysis. We explored heterogeneity initially through visual exploration of the forest plots. We then performed a test for statistical heterogeneity (a consequence of clinical or methodological diversity, or both, among trials) using the Chi² test (with a P value of < 0.10 indicating significance) and I² analysis. The I² analysis is a useful statistic for quantifying inconsistency (I² = [(Q ‐ df)/Q] x 100%, where Q is the Chi² statistic and df is its degrees of freedom; Higgins 2002; Higgins 2003). This describes the percentage of variability in effect estimates that is due to heterogeneity rather than sampling error (chance). Values greater than 50% are considered to represent substantial heterogeneity, and, when these occurred, we attempted to explain this variation. If the value was less than 30%, we presented the overall estimate using a fixed‐effect model. If, however, there was evidence of heterogeneity of the population or treatment effect, or both, between trials, then we used a random‐effects model, for which the confidence intervals are broader than those of a fixed‐effect model (Higgins 2011).
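The I² formula given above can be illustrated with a short Python sketch. The function names are hypothetical, and the example Q value of 40 is invented purely to show how nine studies (df = 8) could yield an I² of 80%; it is not taken from the review's data. Negative values are truncated to zero, which is the usual convention.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted sum of squared deviations of study effects
    from the inverse variance (fixed-effect) pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

def i_squared(q, df):
    """I² = [(Q - df) / Q] x 100%, truncated at 0 when Q is below df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical illustration: 9 studies (df = 8) with Q = 40
print(i_squared(40.0, 8))
```

With Q close to or below its degrees of freedom, the function returns 0, meaning that the observed variation is no greater than would be expected by chance.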

Assessment of reporting biases

We examined funnel plots to look for non‐significant study effects that might indicate publication bias. To investigate reporting biases within our included studies, we compared outcomes listed in the methods sections with reported results.

Data synthesis

We conducted the meta‐analyses using a fixed‐effect model except when we considered that there was significant diversity between studies in participants or interventions, or when the I² measure of heterogeneity was greater than 30%. In those cases we used a random‐effects model.
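The decision rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the review's actual analysis code: the function name `pool`, its parameters, and the example data are hypothetical, and the DerSimonian‐Laird estimator of between‐study variance is assumed here because the review does not name the random‐effects estimator it used.

```python
import math


def pool(effects, variances, clinically_diverse=False, i2_threshold=30.0):
    """Inverse-variance pooling with the model-selection rule described above.

    Returns (model, pooled_estimate, standard_error, i2_percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    i2 = max(0.0, (q - df) / q * 100.0) if q > 0 else 0.0

    # Fixed-effect model unless studies are diverse or I² exceeds the threshold
    if not clinically_diverse and i2 <= i2_threshold:
        return "fixed", fixed, math.sqrt(1.0 / sum(w)), i2

    # Random effects: DerSimonian-Laird between-study variance (assumed estimator)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return "random", pooled, math.sqrt(1.0 / sum(w_re)), i2


# Hypothetical, strongly heterogeneous pair of trials
model, estimate, se, i2 = pool([0.0, 1.0], [0.01, 0.01])
print(model, round(estimate, 2), round(se, 2), round(i2))
```

In the hypothetical example the random‐effects standard error is much larger than the fixed‐effect one would be, which is why confidence intervals widen under the random‐effects model.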

We assessed the overall quality of the evidence associated with the result of each meta‐analysis using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach, which gives an indication of the confidence that can be placed in the estimate of treatment effect. We summarized the effect estimates and GRADE ratings for our primary outcomes in a 'Summary of findings' table.

Subgroup analysis and investigation of heterogeneity

We decided a priori that, if there were sufficient data, we would conduct the following subgroup analyses to explore possible causes of heterogeneity.

Severity of dementia at baseline:

  1. mild (Mini Mental State Examination (MMSE) 17 to 26, or similar scale; Hogan 2007);

  2. moderate (MMSE 10 to 17, or similar scale; Hogan 2007);

  3. severe (MMSE < 10, or similar scale; Feldman 2005).

Disease type:

  1. AD;

  2. vascular dementia;

  3. mixed dementia;

  4. unclassified or other dementia.

Type of exercise program:

  1. aerobic;

  2. strength;

  3. balance.

Frequency of exercise program:

  1. up to three times per week;

  2. more than three times per week.

Duration of exercise program:

  1. up to 12 weeks;

  2. more than 12 weeks.
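The planned subgroup definitions above could be encoded as in the following Python sketch. All function names are hypothetical. Because the stated mild and moderate MMSE ranges both include a score of 17, the sketch assigns 17 to the mild group; this is an assumption, not a rule stated in the review.

```python
def severity_group(mmse):
    """Classify baseline dementia severity from an MMSE score."""
    if mmse < 10:
        return "severe"
    if mmse < 17:  # assumption: a score of exactly 17 is treated as mild
        return "moderate"
    if mmse <= 26:
        return "mild"
    return None  # above the range used for the planned subgroups


def frequency_group(sessions_per_week):
    """Classify the weekly frequency of the exercise program."""
    if sessions_per_week <= 3:
        return "up to three times per week"
    return "more than three times per week"


def duration_group(weeks):
    """Classify the duration of the exercise program."""
    return "up to 12 weeks" if weeks <= 12 else "more than 12 weeks"


print(severity_group(14), "|", frequency_group(5), "|", duration_group(16))
```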

Sensitivity analysis

We also considered sensitivity analyses, a priori, to explore possible causes of methodological heterogeneity, such as including studies that used a variety of measurement tools.

Results

Description of studies

Please see 'Characteristics of included studies', 'Characteristics of excluded studies', and 'Characteristics of ongoing studies' tables.

Results of the search

Database searches located a total of 5241 articles; we screened the abstracts and titles of 542 of these for inclusion. Sixty‐nine articles were retrieved and independently rated by two reviewers. Eighteen articles met the inclusion criteria (Christofoletti 2008; Conradsson 2010 (two articles); Eggermont 2009a; Eggermont 2009b; Francese 1997; Hwang 2010; Holliman 2001; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Stevens 2006; Van de Winckel 2004; Venturelli 2011; Volkers 2012; Vreugdenhil 2012; Williams 2008). Two of these articles reported on different outcomes from the same trial (Conradsson 2010). Thus, 17 trials were included in the review. Only one new trial has been added since our previously published review. See Figure 1 for a flow chart.


Study flow diagram

Included studies

The included articles were published between 1997 and 2012. Four trials were conducted in the USA (Francese 1997; Holliman 2001; Steinberg 2009; Williams 2008), one in Sweden (Conradsson 2010), two in France (Kemoun 2010; Rolland 2007), two in Australia (Stevens 2006; Vreugdenhil 2012), three in the Netherlands (Eggermont 2009a; Eggermont 2009b; Volkers 2012), and one each in Belgium (Van de Winckel 2004), Brazil (Christofoletti 2008), Italy (Venturelli 2011), South Korea (Hwang 2010), and Spain (Santana‐Sosa 2008).

Participants

Please see 'Characteristics of included studies' table.

Trial participants had been recruited from nursing homes (Eggermont 2009a; Eggermont 2009b; Francese 1997; Hwang 2010; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Stevens 2006; Venturelli 2011; Williams 2008), graduated residential care (Conradsson 2010), psychiatric facilities (Christofoletti 2008; Holliman 2001; Van de Winckel 2004), from three different types of institutions (day care centres, homes for the elderly, and nursing homes; Volkers 2012), and their own home settings (Steinberg 2009; Vreugdenhil 2012).

Consent was obtained from the participants in all trials, or from their legal guardian or a family member, or both. Three of the included trials recruited fewer than 20 participants (Francese 1997; Holliman 2001; Santana‐Sosa 2008); nine trials recruited between 24 and 66 participants (Christofoletti 2008; Eggermont 2009b; Hwang 2010; Kemoun 2010; Steinberg 2009; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012; Williams 2008); and five other trials recruited 100 or more participants (Conradsson 2010; Eggermont 2009a; Rolland 2007; Stevens 2006; Volkers 2012). The total number of participants assessed at baseline was 1067, and 919 of them (86.13%) completed the trials.
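The completion figure above can be checked with a short calculation. The Python sketch below uses only the two counts reported in this paragraph; the variable names are illustrative.

```python
baseline = 1067   # participants assessed at baseline
completed = 919   # participants who completed the trials

completion_pct = completed / baseline * 100
attrition_pct = 100 - completion_pct

print(f"{completion_pct:.2f}% completed; {attrition_pct:.2f}% did not")
```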

All the trials, except one, required a diagnosis of dementia for recruitment. Only 52% of participants (100/191) in the Conradsson 2010 trial had a diagnosed dementia disorder, but separate data were available for these participants.

Conradsson 2010, Venturelli 2011, and Volkers 2012 required participants to be 65 years or older. Eggermont 2009a and Eggermont 2009b required participants to be 70 years or older. Three trials had a length of residency or attendance requirement: participants had to have been living in the nursing home for three weeks (Holliman 2001), two months (Rolland 2007), or four months (Santana‐Sosa 2008).

The DSM‐IV set of criteria for diagnosis of dementia were the most commonly used in the included studies (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Kemoun 2010; Vreugdenhil 2012). Other authors used the National Institute of Neurological and Communicative Disorders and Stroke, and the Alzheimer Disease and Related Disorders Association (NINCDS‐ADRDA) criteria for probable or possible AD as eligibility for inclusion (Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Van de Winckel 2004; Vreugdenhil 2012; Williams 2008), the International Statistical Classification of Diseases, 10th revision (ICD‐10) definition of dementia (Christofoletti 2008), Clinical Dementia Rating (CDR3‐CDR4) for late stage AD (Venturelli 2011), Consortium to Establish a Registry for Alzheimer's Disease Assessment Package‐Korean (CERAD‐K; Hwang 2010), MMSE (Holliman 2001; Volkers 2012), chart review (Francese 1997), and a local Aged Care Assessment Team (Stevens 2006).

The majority of trial participants had AD (Eggermont 2009b; Francese 1997; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011; Volkers 2012; Vreugdenhil 2012; Williams 2008). One of the trials had participants with vascular disease and AD (Van de Winckel 2004). In the remaining trials, the participants’ type of dementia was not specified (Christofoletti 2008; Conradsson 2010; Eggermont 2009a; Holliman 2001; Hwang 2010; Stevens 2006).

One of the included trials had participants with mild dementia (Santana‐Sosa 2008). Six of the trials had participants with mild to moderate dementia (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Steinberg 2009; Stevens 2006; Vreugdenhil 2012). Only two of the trials had participants with moderate to severe dementia (Venturelli 2011; Williams 2008). Five of the trials had participants with mild to severe dementia (Hwang 2010; Kemoun 2010; Rolland 2007; Van de Winckel 2004; Volkers 2012), and two had participants with severe dementia (Francese 1997; Holliman 2001).

Eleven of the trials specified the participants’ level of physical ability: Eggermont 2009a required the ability to walk short distances with no aid; Eggermont 2009b required no apparent disability in hand motor function; Kemoun 2010 required the ability to walk 10 metres without technical assistance; Conradsson 2010 required the ability to stand up from a chair with help from no more than one person; Rolland 2007 required that the residents be able to transfer from a chair and walk six metres without human assistance; Francese 1997 required the residents to need one‐ to two‐person assistance to transfer; Van de Winckel 2004 required the ability to mimic the movements of the therapist and to be able to hear the music; Venturelli 2011 required an absence of mobility limitations; Steinberg 2009 and Volkers 2012 required that participants be ambulatory; and Williams 2008 required that participants be able to walk with assistance, but also that they be dependent in at least one of the following, bed mobility, transfers, gait, or balance. Conradsson 2010 and Venturelli 2011 required participants to be dependent on assistance from a person in one or more personal ADLs. Santana‐Sosa 2008 required that participants be free of neurological, vision, muscle or cardio‐respiratory disorders, and Christofoletti 2008 required that participants had no other neurological or neuropsychiatric conditions, had no prescriptions of antidepressant medications with central anti‐cholinergic or sedative action, and had no drug‐related cognitive or balance impairment. Eggermont 2009a and Eggermont 2009b required participants without visual disturbances, hearing difficulties, history of alcoholism, personality disorders, cerebral trauma, or disturbances of consciousness. Volkers 2012 required participants without diagnosis of personality disorders, cerebral traumata, hydrocephalus, neoplasm, disturbances of consciousness or focal brain disorders.

Additional inclusion criteria required participants to have: good general health, a stable medical history, and a caregiver who spent at least 10 hours per week with the participant (Steinberg 2009); a caregiver who either lived with the participant or visited daily (Vreugdenhil 2012); a score of seven or above on the Cornell Scale for Depression in Dementia (CSDD; Williams 2008); an MMSE score lower than 24/30 (Van de Winckel 2004); an MMSE score from 5 to 15 and a minimum score of 23 on the Performance Oriented Mobility Assessment (POMA) index, with a constant oxygen saturation during walking (SpO2 that exceeded 85%; Venturelli 2011); discharge scheduled after the trial (Holliman 2001); medical fitness (Christofoletti 2008); physical ability to participate (Francese 1997; Hwang 2010; Stevens 2006); ability to respond to most verbal requests (Stevens 2006; Van de Winckel 2004); and the ability to understand English (Francese 1997). Participants in the Holliman 2001 trial were not permitted to take part in another research trial at the same time.

Exercise programs

The administration of the exercise programs ranged from twice a week (Rolland 2007), to three times a week (Christofoletti 2008; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008), four times a week (Venturelli 2011), five times a week (Eggermont 2009a; Eggermont 2009b; Volkers 2012; Williams 2008), to daily (Van de Winckel 2004; Vreugdenhil 2012). Conradsson 2010 required participants to complete five sessions every two weeks. Steinberg 2009 required the participants in the exercise group to achieve a number of points that were accrued for performing activities in the aerobic, strength, and balance categories (one point for partially performing a task; two for completing). The goal was to achieve six aerobic points and four each of strength and balance per week. Each session varied in length from 20 minutes (Francese 1997), to 30 minutes (Eggermont 2009a; Eggermont 2009b; Holliman 2001; Stevens 2006; Van de Winckel 2004; Venturelli 2011; Volkers 2012; Vreugdenhil 2012; Williams 2008), 45 minutes (Conradsson 2010), 50 minutes (Hwang 2010), 60 minutes (Christofoletti 2008; Kemoun 2010), up to 75 minutes (Santana‐Sosa 2008). The period of time the program was offered varied greatly from two weeks (Holliman 2001), to six weeks (Eggermont 2009a; Eggermont 2009b), seven weeks (Francese 1997), 12 weeks (Santana‐Sosa 2008; Stevens 2006; Steinberg 2009; Van de Winckel 2004), 13 weeks (Conradsson 2010); 15 weeks (Kemoun 2010), 16 weeks (Vreugdenhil 2012; Williams 2008), six months (Christofoletti 2008; Venturelli 2011), 12 months (Rolland 2007), and up to 18 months (Volkers 2012).
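The points scheme described above for Steinberg 2009 can be sketched as a simple weekly tally. The Python below is a hypothetical illustration only: the data structures, function names, and the example activity log are assumptions, while the point values and weekly goals are those stated in the paragraph.

```python
# One point for partially performing a task, two for completing it
POINTS = {"partial": 1, "complete": 2}

# Weekly goal: six aerobic points and four each of strength and balance
WEEKLY_GOAL = {"aerobic": 6, "strength": 4, "balance": 4}


def weekly_points(log):
    """Sum points per category from (category, status) entries."""
    totals = {category: 0 for category in WEEKLY_GOAL}
    for category, status in log:
        totals[category] += POINTS[status]
    return totals


def goal_met(totals):
    """Return True when every category reaches its weekly goal."""
    return all(totals[c] >= WEEKLY_GOAL[c] for c in WEEKLY_GOAL)


# Hypothetical week of activity for one participant
log = (
    [("aerobic", "complete")] * 3
    + [("strength", "complete")] * 2
    + [("balance", "partial")] * 4
)
totals = weekly_points(log)
print(totals, goal_met(totals))
```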

In three trials the exercises were performed while seated in order to accommodate people in wheelchairs (Francese 1997; Holliman 2001; Stevens 2006). In the Rolland 2007 trial, the first half hour of the session consisted of walking, and the remainder of the program included strength and balance training. Francese 1997 offered an exercise regime that consisted of activities such as catching, throwing, and kicking balls; leg weight exercises; and parachute reaches. Holliman 2001 designed the exercise program to target the training of gross and fine motor skills and movement, and also to be meaningful and appropriate for the residents. This program included several interactive exercises such as passing a bean bag or playing volleyball in order to promote socialization. The program used by Stevens 2006 was based on joint and large muscle group movement with the intention of creating gentle, aerobic exertion. Christofoletti 2008 and Vreugdenhil 2012 used walking and upper and lower limb exercises to stimulate strength, balance, motor co‐ordination, agility, flexibility, and aerobic endurance. The Santana‐Sosa 2008 training sessions included joint mobility, resistance, and co‐ordination exercises. Hwang 2010 conducted an upper body dance therapy program. Van de Winckel 2004 incorporated upper and lower body strengthening as well as balance, trunk movements, and flexibility training, all supported by music. Participants in Conradsson 2010 performed a high‐intensity functional weight‐bearing exercise program, including strength and balance exercises. Steinberg 2009 focused on walking, strength training, balance, and flexibility training. Eggermont 2009a, Venturelli 2011, and Volkers 2012 provided a supervised walking program. Participants in Eggermont 2009b performed hand movements only. 
Williams 2008 compared two experimental interventions: the first combined walking and strength‐based exercises focusing on improving strength, balance, and flexibility, while the second consisted of supervised walking.

Control groups

The control groups for eight of the studies received usual care with no additional interventions (Christofoletti 2008; Hwang 2010; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Stevens 2006; Venturelli 2011; Vreugdenhil 2012). The control group for five studies included social contact (Eggermont 2009a; Steinberg 2009; Van de Winckel 2004; Volkers 2012; Williams 2008). In three studies the control groups consisted of social contact with additional activities such as films, singing, and reading (Conradsson 2010; Eggermont 2009b; Francese 1997). One study did not provide any details about the control group (Holliman 2001).

Primary outcomes

Cognitive functioning

The MMSE test was used frequently in trials to assess cognitive functioning (Christofoletti 2008; Holliman 2001; Steinberg 2009; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012). In addition, Hwang 2010 used the Cognitive Memory Performance Scale; Kemoun 2010 used the Rapid Evaluation of Cognition Functions Test; Stevens 2006 measured the progression of dementia with the Clock Drawing Test; while Eggermont 2009a, Eggermont 2009b, and Volkers 2012 used a Delayed Recall score using the Eight Words Test. For all these measuring scales, higher scores indicate less cognitive impairment.

Activities of daily living (ADL)

ADL were assessed using the Barthel ADL index (Conradsson 2010; Santana‐Sosa 2008; Venturelli 2011; Vreugdenhil 2012), Katz Index of ADLs (Rolland 2007; Santana‐Sosa 2008), and Changes in Advanced Dementia Scale (CADS; Francese 1997). Higher scores in the Barthel ADL Index, Katz Index and the CADS indicate greater ability to perform ADLs.

Neuropsychiatric symptoms

Five trials measured neuropsychiatric symptoms of the participants (Holliman 2001; Rolland 2007; Steinberg 2009; Stevens 2006; Van de Winckel 2004). The Holliman 2001 trial used a subscale of the Psychogeriatric Dependency Rating Scale (PGDRS) to measure behaviours such as wandering, active aggression and restlessness related to dementia. Rolland 2007 and Steinberg 2009 evaluated neuropsychiatric symptoms using the Neuropsychiatric Inventory (NPI). Stevens 2006 used the Revised Elderly Disability Scale, which assesses self‐help skills, behaviour, and six other categories that reflect functional ability. Van de Winckel 2004 also evaluated neuropsychiatric symptoms with the abbreviated Stockton Geriatric Rating Scale. Higher scores on these scales indicate worse or dependent behaviours; all measures were appropriate for people with dementia.

Depression

Trialists evaluated depression using the Montgomery‐Asberg Depression Rating Scale (Rolland 2007), the Cornell Scale for Depression in Dementia (CSDD; Steinberg 2009; Williams 2008), and the Geriatric Depression Scale (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Vreugdenhil 2012). All of these measures are valid, reliable and specific to people with dementia; higher scores indicate greater depression.

Mortality

None of the included studies measured mortality.

Secondary outcomes

Caregiver burden

Caregiver burden was assessed using the Screen for Caregiver Burden (Steinberg 2009), and the Zarit Burden Interview Scale (Vreugdenhil 2012). In both cases, higher scores indicate increased burden.

Caregiver quality of life

None of the included studies measured caregiver quality of life.

Caregiver mortality

None of the included studies measured caregiver mortality.

Use of healthcare services

None of the included studies measured use of healthcare services.

Excluded studies

Forty‐one trials were excluded for the following reasons:

  1. nine were not or were probably not randomized (Aman 2009; Arcoverde 2008; Batman 1999; Christofoletti 2011; de Melo Coelho 2013; Garuffi 2013; Kwak 2008; Litchke 2012; Thurm 2011);

  2. 11 did not include people diagnosed with dementia (Anon 1986; Hariprasad 2013; Kerse 2008; Littbrand 2006; Netz 1994; Powell 1974; Rodgers 2002; Scherder 2005; Suzuki 2012; van Uffelen 2005; Viscogliosi 2000);

  3. five were complex interventions in which exercise was combined with additional treatments or training so that groups did not differ in exposure to exercise alone (Burgener 2008; Logsdon 2012a; Oswald 2007; Pitkala 2013; Schwenk 2010);

  4. one study did not include an exercise program (Onor 2007);

  5. one study did not incorporate a comparison group comprised of people with dementia (Heyn 2008);

  6. two studies did not include usual care in the control group (Day 2012; Obisesan 2011); and

  7. 12 studies examined outcomes that were not of interest to this review (Abreu 2013; Hauer 2012; Littbrand 2011; McCurry 2011; Netz 2007; Padala 2012; Roach 2011; Rodriguez‐Ruiz 2013; Suttanon 2013; Tappen 2000; Williams 2007; Yagüez 2011).

Risk of bias in included studies

(See Characteristics of included studies.)

Allocation

Random sequence generation (selection bias)

In 12 trials the methods used to generate allocation sequence were not described or were unclear (Christofoletti 2008; Conradsson 2010; Eggermont 2009b; Francese 1997; Holliman 2001; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011; Volkers 2012; Williams 2008). We judged the remaining trials to be at low risk of bias for this domain, as sufficient information about the way the allocation sequence was generated was available (Eggermont 2009a; Rolland 2007; Stevens 2006; Van de Winckel 2004; Vreugdenhil 2012).

Allocation (selection bias)

In 12 of the trials the methods used to conceal allocation sequence were unclear or not described (Eggermont 2009a; Eggermont 2009b; Francese 1997; Holliman 2001; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008; Steinberg 2009; Stevens 2006; Van de Winckel 2004; Volkers 2012; Williams 2008). In the remaining trials, allocation concealment was adequate and, due to this factor, we rated the risk of selection bias as low (Christofoletti 2008; Conradsson 2010; Rolland 2007; Venturelli 2011; Vreugdenhil 2012).

Blinding

Blinding of participants and personnel (performance bias)

All studies were at high risk of performance bias, as blinding of participants and personnel to the intervention was not possible, due to the nature of rehabilitation trials.

Blinding of outcome assessment (detection bias)

Blinding of outcome assessors was not described in Francese 1997, Hwang 2010, and Stevens 2006. Venturelli 2011 stated that the evaluation was completed in a "blinded way" and provided no further explanation, so it was not clear whether outcome assessments had been blinded. Van de Winckel 2004 stated, "The physiotherapist who was conducting both treatments evaluated the patients on cognition. However, the nurses who scored the patients on behaviours were all blind to the group assignment." Therefore, this study was rated as being at high risk for detection bias for cognition outcomes. We judged the remaining trials to be at low risk for detection bias since outcome assessors were blinded (Christofoletti 2008; Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Holliman 2001; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Volkers 2012; Vreugdenhil 2012; Williams 2008).

Incomplete outcome data

Attrition rates (drop‐outs from the trials) varied from 0% to 37% in the included trials. The risk of attrition bias was unclear for Steinberg 2009, since the trial report did not provide data on attrition; we received no response when we requested this information. In Volkers 2012 the risk of attrition bias was also unclear, as the drop‐out rate for the experimental and control groups at the end of the study was not specified; instead, the author provided the actual and expected number of observations made for each outcome measure over the course of the trial. Stevens 2006 was the only author who did not indicate the group (experimental or control) from which the drop‐outs occurred. The drop‐out rates were higher in the experimental arms for Christofoletti 2008 (29% experimental versus 15% control), Kemoun 2010 (20% experimental versus 17% control), Conradsson 2010 (14% experimental versus 9% control), and Eggermont 2009b (12% experimental versus 3% control). Attrition was higher in the control groups for Francese 1997 (0% experimental versus 17% control), Van de Winckel 2004 (0% experimental versus 10% control), Venturelli 2011 (8% experimental versus 17% control), Rolland 2007 (16% experimental versus 19% control), and Hwang 2010 (29% experimental versus 43% control). Reasons for attrition were provided, and included: death, illness, increased disability, lack of interest, physician’s disapproval, withdrawal of consent by family, moving to another institution, and refusal to continue to participate.

In summary, we judged the majority of the trials to be at low risk of attrition bias. A high risk of attrition bias was reported for five of the included studies for a variety of reasons that included: failure to report attrition rates for individual groups; a high attrition rate; or an imbalance of attrition between the groups, or failure to provide reasons for attrition, or both (Christofoletti 2008; Holliman 2001; Hwang 2010; Kemoun 2010; Stevens 2006; see Characteristics of included studies). None of these studies used ITT principles of analysis.

Eggermont 2009a, Eggermont 2009b, and Conradsson 2010 did report conducting modified ITT analyses, but did not include all randomized participants. Eggermont 2009a enrolled 103 nursing home residents with dementia in the study, but included only 97 participants in the modified ITT analysis. Similarly Eggermont 2009b enrolled 66 participants, but included only 61 in the ITT analysis. Conradsson 2010 included 91 of the original 100 participants with dementia in the ITT analysis with no explanation. Thus, there was a potential risk of attrition bias in these studies.

Selective reporting

We judged all of the included trials as being at low risk of reporting bias.

Other potential sources of bias

Figure 2 and Figure 3 provide summaries of risk of bias.


Risk of bias graph: review authors' judgments about each risk of bias item presented as percentages across all included trials


Risk of bias summary: review authors' judgments about each risk of bias item for each included trial

Effects of interventions

See: Summary of findings for the main comparison Exercise programs for people with dementia

Primary outcomes

Cognition (nine trials; 409 participants)

Twelve of the included studies measured cognitive outcomes, but we were only able to obtain data from nine to include in the meta‐analysis of the effect of exercise on cognition (Christofoletti 2008; Eggermont 2009a; Eggermont 2009b; Hwang 2010; Kemoun 2010; Van de Winckel 2004; Venturelli 2011; Volkers 2012; Vreugdenhil 2012). We included post‐intervention measures following six weeks to six months of exercise intervention.

As a result of the clinical diversity among studies (participants; type, intensity and duration of exercise programs), we used a random‐effects model. The estimated standardized mean difference (SMD) between exercise and control groups was 0.43 (95% CI ‐0.05 to 0.92, P value 0.08; Analysis 1.1; Figure 4a), with nine studies and 409 participants. No clear conclusion can be drawn from this result because of the imprecision; it is compatible with either minimal harm or substantial benefit from the intervention. There was very substantial heterogeneity in this analysis (I² value 80%). We rated the quality of this evidence as very low because of the imprecision, inconsistency between studies, risk of bias and publication bias (see summary of findings Table for the main comparison).
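To show how the reported P value relates to the confidence interval for the cognition result above, the following Python sketch recovers an approximate standard error and two‐sided P value from the point estimate and its 95% CI. It assumes a symmetric, normal‐approximation interval (critical value 1.96), which is an assumption about how the interval was constructed; the function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z statistic and two-sided P value from a 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Pooled cognition result: SMD 0.43, 95% CI -0.05 to 0.92
se, z, p = p_from_ci(0.43, -0.05, 0.92)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.2f}")
```

Because the interval includes zero, the recovered P value is above 0.05, which matches the inconclusive reading of this result.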

We explored potential reasons for the high heterogeneity by conducting meta‐analyses that included only trials: 1) with people diagnosed with AD; 2) that ran the exercise programs for: more than 12 weeks; more than three times per week; or less than three times per week; 3) that included only aerobic exercise; or only strength exercise. None of these meta‐analyses reduced the heterogeneity. However, when we removed the Venturelli 2011 trial (the only trial that included only participants with moderate to severe dementia), the heterogeneity was reduced (Chi² value 23.15; I² value 70%). The result of this meta‐analysis nevertheless remained inconclusive (SMD 0.21, 95% CI ‐0.18 to 0.61, P value 0.28; 8 trials; 388 participants; very low quality evidence; Analysis 1.1; Figure 4b).


Forest plot of comparison 1: Physical activity vs usual care: cognition

Volkers 2012 was the only study that reported cognitive outcomes in which the intervention lasted longer than six months. Results from participants who remained in the study were also reported after 12 and 18 months of the exercise program. The estimated effect at both time‐points was imprecise and compatible with either benefit or harm from the exercise intervention (after 12 months: SMD 0.02, 95% CI ‐0.42 to 0.46, P value 0.93; 1 study, 62 participants; and after 18 months: SMD ‐0.08, 95% CI ‐0.61 to 0.45, P value 0.77; 1 study, 52 participants). We considered this to be very low quality evidence because of the imprecision and risk of bias. The level of compliance with the walking program in this study was low. The quantitative findings of this study have not been published in a peer‐reviewed journal.

Although three additional trials also examined cognition (Holliman 2001; Steinberg 2009; Stevens 2006), they could not be included in the analyses as the necessary data were not reported, and the authors did not provide them upon request. The conclusions of these studies were mixed: Holliman 2001 and Steinberg 2009 reported finding no benefit of exercise on cognition, whereas Stevens 2006 reported that participants in the exercise program showed cognitive benefits relative to the control group.

Activities of daily living (ADLs) (six trials; 289 participants)

Six studies measured the effect of exercise on ADLs (Conradsson 2010; Francese 1997; Rolland 2007; Santana‐Sosa 2008; Venturelli 2011; Vreugdenhil 2012). The exercise programs ranged from seven to 52 weeks in length. We included end point measures of means, standard deviations (SDs), and number of participants in each group in the meta‐analysis. As a result of the clinical diversity among studies (types and severity of dementia and type, intensity and duration of exercise programs), we used a random‐effects model.
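As a hedged illustration of how an SMD is obtained from the end point means, SDs, and group sizes mentioned above, the Python sketch below computes a bias‐corrected SMD (Hedges' g). The review does not state the exact SMD variant or variance formula used, so both are assumptions here, as are the function name `hedges_g` and the example numbers.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Bias-corrected SMD (Hedges' g) and an approximate standard error."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)  # small-sample correction factor
    g = j * d
    # Common large-sample approximation to the variance of g (assumed)
    se = math.sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2.0 * (n1 + n2)))
    return g, se


# Hypothetical trial: exercise group (mean 10, SD 2, n 20)
# versus control group (mean 8, SD 2, n 20)
g, se = hedges_g(10, 2, 20, 8, 2, 20)
print(f"g = {g:.3f}, SE = {se:.3f}")
```

Standardizing in this way is what allows trials that used different ADL scales to be pooled in one analysis.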

The meta‐analysis yielded an estimated SMD between exercise and control groups of 0.68 favouring the exercise group (95% CI 0.08 to 1.27, P value 0.03; six trials, 289 participants; Analysis 2.1; Figure 5). There was considerable heterogeneity in this analysis (I² value 77%). We rated the quality of this evidence as low because of the inconsistency and imprecision (results compatible with both minimal and moderate effect size; see summary of findings Table for the main comparison).


Forest plot of comparison 2: Physical activity vs usual care: Activities of daily living (ADLs)

We explored potential reasons for the high heterogeneity by conducting meta‐analyses that included only trials: 1) with participants diagnosed with AD; 2) that ran the exercise programs for more than 12 weeks; less than 12 weeks; more than three times per week; less than three times per week; 3) with a combination of aerobic and strength exercises; and 4) removing the trial that included only persons with moderate to severe dementia (Venturelli 2011). None of these meta‐analyses reduced the heterogeneity.

Neuropsychiatric symptoms (one trial; 110 participants)

Holliman 2001, Rolland 2007, Steinberg 2009, Stevens 2006, and Van de Winckel 2004 examined the effect of exercise on neuropsychiatric symptoms. Holliman 2001 did not provide the SDs when using the PGDRS behaviour scale, but did report that participants showed improved behaviour only during group sessions, and not outside the group. Steinberg 2009 and Stevens 2006 did not provide useable data. Stevens 2006 reported that the participants in the exercise program showed improvement in behaviour, while Steinberg 2009 reported increased neuropsychiatric symptoms. Van de Winckel 2004 also did not provide useable data and reported no significant behavioural effects. At 12 months, the Rolland 2007 study revealed no clear effect of exercise on neuropsychiatric symptoms (MD ‐0.60, 95% CI ‐4.22 to 3.02, P value 0.75; 1 trial, 110 participants). We considered this to be very low quality evidence (an imprecise result from a single study, publication bias; see summary of findings Table for the main comparison).

Depression (five studies; 341 participants)

Six studies examined the effect of exercise on depression (Conradsson 2010; Eggermont 2009b; Rolland 2007; Steinberg 2009; Vreugdenhil 2012; Williams 2008). Steinberg 2009 did not report the data needed for the analysis, or respond to requests for this data, so could not be included in the meta‐analysis. Williams 2008 included two experimental groups: a supervised individual walking group and a comprehensive individual exercise group. For this trial, we combined the data from the two experimental groups. Our meta‐analysis revealed no clear effect of exercise (SMD ‐0.14, 95% CI ‐0.36 to 0.07, P value 0.20; 5 trials, 341 participants). Heterogeneity in this analysis was low (I² value 0%; Analysis 3.1; Figure 6). We rated the quality of this evidence as moderate because of the imprecision. In addition, publication bias was suspected due to the missing Steinberg 2009 data (see summary of findings Table for the main comparison).
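The review does not describe how the data from the two Williams 2008 experimental groups were combined. One commonly used approach, sketched below in Python, merges two groups into a single group by pooling sample sizes, means, and SDs so that the result equals what would be obtained from the individual participant data. The function name `combine_groups` and the example values are hypothetical, and the choice of this method is an assumption.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Combine two groups into one; returns (n, mean, sd)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)


# Hypothetical check: groups {1, 3} and {5, 7} combine to {1, 3, 5, 7}
n, mean, sd = combine_groups(2, 2.0, math.sqrt(2), 2, 6.0, math.sqrt(2))
print(n, mean, round(sd, 3))
```

Combining the arms before analysis avoids counting the shared control group twice in the meta‐analysis.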


Figure 6. Forest plot of comparison 3: Physical activity vs usual care: depression

Mortality

No trials reported on mortality in people with dementia.

Secondary outcomes

Caregiver burden (one trial; 40 participants)

Two trials examined caregiver burden (Steinberg 2009; Vreugdenhil 2012). Steinberg 2009 did not report the data needed for the analysis, and the trial authors did not respond to requests for this data. The community‐based exercise program in Vreugdenhil 2012 was associated with a reduction in caregiver burden. The fixed‐effect model meta‐analysis yielded a mean difference between exercise and control groups of ‐15.30 (95% CI ‐24.73 to ‐5.87, P value 0.001; 1 trial, 40 participants). We rated this as low quality evidence (a single study, publication bias).

Caregiver quality of life

No studies reported on caregivers’ quality of life.

Caregiver mortality

No studies reported on caregivers’ mortality.

Use of healthcare services

No studies reported on use of healthcare services.

Adverse events (five trials)

Five studies addressed potential adverse events of exercise programs for people with dementia (Conradsson 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011). None of these trials reported any serious adverse events that could be attributed to the exercise intervention. One trial, Christofoletti 2008, indirectly addressed adverse events by stating there were no drop‐outs related to the treatment.

Discussion

Summary of main results

This review included 17 trials (18 articles) with a total of 1067 participants. Most participants were older people with AD. The exercise programs varied greatly: their duration ranged from two weeks to 18 months, and the activities differed (e.g. hand movements, sitting, walking, and upper and lower limb exercises). The review suggests that exercise programs may improve the ability of people with dementia to perform ADLs, although the considerable unexplained statistical heterogeneity observed in the ADL analyses calls for caution in interpreting these results. In addition, one trial revealed that the burden experienced by informal caregivers providing care in the home may be reduced if they supervise their family member with dementia during participation in an exercise program. This review found no clear evidence of benefit from exercise on cognitive functioning, neuropsychiatric symptoms, or depression. Nevertheless, these are encouraging results, as dementia is a debilitating disease that results in progressive decline in ability to perform ADLs, as well as other symptoms. A slowing of the development of dependence in ADLs is critical for enhancing the quality of life for people with dementia, and will have an impact on the family caregivers’ ability to sustain their caregiving role.

Overall completeness and applicability of evidence

The number of included trials was sufficient to address the first three objectives relating to the effect of exercise on cognition, ADLs, and depression. However, only one trial was included in the analyses of the effect on neuropsychiatric symptoms and caregiver burden, and no analyses were completed for the following outcomes: mortality in people with dementia, caregiver quality of life, caregiver mortality, and use of healthcare services. Although several additional included trials investigated cognition (three trials), neuropsychiatric symptoms (three trials), depression (one trial), and caregiver burden (one trial), useable data for inclusion in the meta‐analyses were not provided by the authors. It is important to include means and SDs for end point measures, or change from baseline to final measurement scores in published reports, or, alternatively, the trial authors should be willing to provide these data on request. Clearly, additional research is needed that examines these important outcomes and provides the data needed for meta‐analysis.

Only two studies were based in the community (Steinberg 2009; Vreugdenhil 2012); all others were conducted largely in institutions. Most people with dementia are cared for at home, and most caregivers wish to keep the family member with dementia at home for as long as possible. Knowing how to support family caregivers and delay the symptoms of dementia will have profound benefits for all involved. In addition, enabling people with dementia to remain in their homes for longer will lead to decreased healthcare costs. Further community‐based trials are needed that examine the benefit or harm of exercise on multiple domains of the person with dementia and the impact on their family caregivers.

The participants within the trials were not homogeneous in terms of their diagnosis (e.g. AD, vascular dementia, mixed dementia, other) or severity of dementia (e.g. mild, moderate, severe). This was unfortunate, as dementia should not be viewed as a single disease entity, and there is some evidence that exercise might affect the risk of these conditions differently (Rockwood 2007). Several observational studies have found that the preventive effects of exercise may be weaker for vascular dementia than for AD or dementia in general (Rockwood 2007). However, a more recent meta‐analysis revealed a significant association between exercise and a reduced risk of developing vascular dementia (odds ratio 0.62, 95% CI 0.42 to 0.92; Aarsland 2010).

Also, the exercise programs were not homogeneous in terms of the type (e.g. aerobic, strength, balance), duration (range: two weeks to 18 months), and frequency (range: two times per week to daily) of activities. Therefore, we compared type, duration (less than 12 weeks versus longer than 12 weeks), and frequency (less than three times per week versus more than three times per week) of the exercise programs in further subgroup analyses. However, because of the low number of trials in each category it was not possible to identify any relationship between the type, duration, or frequency of exercise, and the effect on ADL performance or on other outcomes.

Quality of the evidence

One additional trial was included in this updated review. As a result the number of participants increased to 1067 at baseline and 919 (86.13%) completed the trials, compared with 280 participants at baseline and 208 (74%) completing the trials in the original 2008 review. These are encouraging results.
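The completion percentages quoted above follow directly from the participant counts. The short sketch below reproduces the arithmetic; the helper name `completion_rate` is hypothetical and is used only for illustration.

```python
def completion_rate(completed, baseline):
    """Percentage of baseline participants who completed the trials."""
    return 100 * completed / baseline

# Counts as reported in the text for the updated and the original review
updated = completion_rate(919, 1067)
original = completion_rate(208, 280)
print(f"updated review: {updated:.2f}%, original review: {original:.2f}%")
# prints: updated review: 86.13%, original review: 74.29%
```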

Factors that may influence the quality of the evidence (our confidence in estimates of effect) include inconsistency, imprecision, and indirectness. Inconsistency refers to considering the upper and lower limits of the confidence intervals (CIs). The quality of evidence should be rated down if clinical action would differ if the upper versus the lower boundary of the CI represented the truth. Similarly, the quality of the evidence would be rated down for imprecision if the 95% CI includes appreciable benefit or harm. Indirectness refers to substantial differences that may exist between the population, the intervention or the outcomes measured in studies included in a systematic review. Publication bias was not reported in this review as at least 10 studies should be included in the meta‐analysis to adequately test for publication bias (Higgins 2011; section 10.4.3.1).

The three primary outcomes (cognition, ADLs, and depression) were all rated as very low on quality of evidence due to serious risk of bias, inconsistency, indirectness, and imprecision, and potential publication bias in some or all of these outcomes (see GRADE, summary of findings Table for the main comparison). Serious risk of bias was a possibility as many of the authors of the trials did not report the random sequence generation and allocation concealment processes adequately. A computer‐generated program managed by a third party is a rigorous approach that can be used to generate random allocation to groups, and ensures allocation concealment. Several authors did not report or did not describe adequately the outcome data for each main outcome. Although blinding of the participants and individuals conducting the exercise programs was not possible, it was expected that outcome assessors would be blinded. A few authors failed to report on the blinding of outcome assessors. High attrition rates, an imbalance of attrition between groups, and unknown reasons for attrition and poor adherence (or no description) to the exercise programs were also potential sources of bias in several of the included trials. In addition, some trials with high attrition rates did not conduct ITT analysis (see Figure 2 and Figure 3).

We rated inconsistency as serious for two of the outcomes of interest (cognition and ADLs) and not serious for the depression outcome. We rated indirectness and imprecision as serious for all three outcomes. The funnel plots revealed potential publication bias with the outcome of cognition, no publication bias for the outcome of ADLs, and suspected publication bias for the depression outcome (see GRADE, summary of findings Table for the main comparison).

Potential biases in the review process

This review was conducted as outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011); therefore, the introduction of bias during the review process was minimized. We are fairly confident that all relevant studies were identified, as the literature searches were conducted by Anna Noel‐Storr of the Cochrane Dementia and Cognitive Improvement Group and are updated at least every six months. However, not all of the included trials reported data that could be used in the meta‐analysis, and some authors did not respond to requests for these data. This meant that the results of four trials could not be included in our meta‐analyses. This was unfortunate, as the total number of trials that have examined the evidence of the benefit or lack of benefit of exercise programs in improving the symptoms of dementia is limited.

Agreements and disagreements with other studies or reviews

A recent systematic review, that included 13 RCTs with 896 participants (Potter 2011), found similar results to those identified in this review for depression; only one of the four trials identified by Potter 2011 that reported depression as an outcome found a benefit. However, Thune‐Boyle 2012 used a critical interpretive approach to synthesize the literature, and concluded that exercise appears to be beneficial in reducing depressed mood. The Potter 2011 review reported on two trials that found an improvement in quality of life; our review did not include any trials that examined quality of life, and only one trial that examined neuropsychiatric symptoms. Bowes 2013 was a scoping review that concluded that a more holistic approach is needed, which examines the benefit of exercise on mental health and well‐being in people with dementia living at home and the impact on their family caregivers. So we concur with Thune‐Boyle 2012 that the evidence is weak or lacking of the benefit of exercise on neuropsychiatric symptoms, such as repetitive behaviours, and also with Potter 2011 and Bowes 2013 that the evidence of the benefit of exercise on depression and quality of life is limited. We would agree with these authors that further research is needed that examines the benefit of exercise on cognition, ADLs, depression, neuropsychiatric symptoms, and quality of life.

Figure 1. Study flow diagram

Figure 2. Risk of bias graph: review authors' judgments about each risk of bias item presented as percentages across all included trials

Figure 3. Risk of bias summary: review authors' judgments about each risk of bias item for each included trial

Figure 4. Forest plot of comparison 1: Physical activity vs usual care: cognition

Figure 5. Forest plot of comparison 2: Physical activity vs usual care: Activities of daily living (ADLs)

Figure 6. Forest plot of comparison 3: Physical activity vs usual care: depression

Analysis 1.1. Comparison 1 Exercise vs usual care: cognition, Outcome 1 Cognition.

Analysis 2.1. Comparison 2 Exercise vs usual care: Activities of Daily Living (ADL), Outcome 1 Comparison of ADL.

Analysis 3.1. Comparison 3 Exercise vs usual care: depression, Outcome 1 Depression.

Summary of findings for the main comparison. Exercise programs for people with dementia

Exercise programs for people with dementia

Patient or population: people with dementia
Settings: long term care, community programs, home
Intervention: exercise program compared to usual care or a social group activity

| Outcomes | Illustrative comparative risks* (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- |
| Cognition, SD units. Investigators measured cognition using different instruments. Higher scores represent better cognitive function. Follow‐up: 6‐36 weeks | The mean score for cognition in the intervention groups was 0.43 standard deviation units higher (0.05 lower to 0.92 higher) | 409 (9 studies) | ⊕⊝⊝⊝ very low (a) | As a rough guide, a difference of 0.2 SD represents a small, 0.6 a moderate and 0.8 a large treatment effect |
| Activities of daily living, SD units. Investigators measured ADLs using different instruments. Higher scores represent better performance. Follow‐up: 7‐52 weeks | The mean score for activities of daily living in the intervention groups was 0.68 standard deviations higher (0.08 to 1.27 higher) | 289 (6 studies) | ⊕⊝⊝⊝ low (b) | |
| Depression, SD units. Investigators measured depression using a variety of scales. Lower scores represent improvement. Follow‐up: 6‐52 weeks | The mean score for depression in the intervention groups was 0.14 lower (0.36 lower to 0.07 higher) | 341 (5 studies) | ⊕⊕⊕⊝ moderate (c) | |
| Neuropsychiatric symptoms. Measured using NPI. Severity of symptoms is measured on a scale of 0‐144. A higher score indicates worse symptoms. Follow‐up: 12 months | The mean NPI score in the intervention group was 0.60 points lower (4.22 lower to 3.02 higher) | 110 (1 study) | very low (d) | A minimum difference of 8 points in the NPI scale has been considered to be clinically important |

GRADE Working Group grades of evidence
High quality: further research is very unlikely to change our confidence in the estimate of effect
Moderate quality: further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low quality: further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low quality: we are very uncertain about the estimate

a rated down for serious inconsistency between studies (I² 80%), imprecision and publication bias (12 studies measured cognitive outcomes but data were only available from 9)

b rated down for serious inconsistency between studies (I² 77%) and imprecision

c rated down for imprecision

d rated down because data came from a single study, for imprecision and for publication bias (5 studies measured neuropsychiatric outcomes but only one provided usable data)

Comparison 1. Exercise vs usual care: cognition

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Cognition | 9 | | Std. Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 1.1 Cognition: all trials | 9 | 409 | Std. Mean Difference (IV, Random, 95% CI) | 0.43 [‐0.05, 0.92] |
| 1.2 Cognition: excluded moderate‐severe dementia | 8 | 388 | Std. Mean Difference (IV, Random, 95% CI) | 0.21 [‐0.18, 0.61] |

Comparison 2. Exercise vs usual care: Activities of Daily Living (ADL)

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Comparison of ADL | 6 | 289 | Std. Mean Difference (IV, Random, 95% CI) | 0.68 [0.08, 1.27] |
| 1.1 ADL: all trials | 6 | 289 | Std. Mean Difference (IV, Random, 95% CI) | 0.68 [0.08, 1.27] |

Comparison 3. Exercise vs usual care: depression

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Depression | 5 | 341 | Std. Mean Difference (IV, Random, 95% CI) | 0.14 [‐0.07, 0.36] |
