Exercise programs for people with dementia

Dorothy Forbes; Emily J Thiessen; Catherine M Blake; Scott C Forbes; Sean Forbes

doi:10.1002/14651858.CD006489.pub3

Version published: 04 December 2013

Abstract

Background

This is an update of our previous 2008 review. Several recent trials and systematic reviews of the impact of exercise on people with dementia are reporting promising findings.

Objectives

Primary: Do exercise programs for older people with dementia improve cognition, activities of daily living (ADLs), challenging behaviour, depression, and mortality in older people with dementia?

Secondary: Do exercise programs for older people with dementia have an indirect impact on family caregivers’ burden, quality of life, and mortality?

Do exercise programs for older people with dementia reduce the use of healthcare services (e.g. visits to the emergency department) by participants and their family caregivers?

Search methods

We identified trials for inclusion in the review by searching ALOIS (www.medicine.ox.ac.uk/alois), the Cochrane Dementia and Cognitive Improvement Group’s Specialised Register, on 4 September 2011, and again on 13 August 2012. The search terms used were: 'physical activity' OR exercise OR cycling OR swim* OR gym* OR walk* OR danc* OR yoga OR ‘tai chi’.

Selection criteria

In this review, we included randomized controlled trials in which older people, diagnosed with dementia, were allocated either to exercise programs or to control groups (usual care or social contact/activities) with the aim of improving cognition, ADLs, behaviour, depression, and mortality. Secondary outcomes related to the family caregiver(s) and included caregiver burden, quality of life, mortality, and use of healthcare services.

Data collection and analysis

Independently, at least two authors assessed the retrieved articles for inclusion, assessed methodological quality, and extracted data. Data were analysed for summary effects using RevMan 5.1 software. We calculated mean differences or standardized mean difference (SMD) for continuous data, and synthesized data for each outcome using a fixed‐effect model, unless there was substantial heterogeneity between studies, when we used a random‐effects model. We planned to explore heterogeneity in relation to severity and type of dementia, and type, frequency, and duration of exercise program. We also evaluated adverse events.

Main results

Sixteen trials with 937 participants met the inclusion criteria. However, the required data from three trials and some of the data from a fourth trial were not published and not made available. The included trials were highly heterogeneous in terms of subtype and severity of participants' dementia, and type, duration and frequency of exercise. Only two trials included participants living at home. Our meta‐analysis suggested that exercise programs might have a significant impact on improving cognitive functioning (eight trials, 329 participants; SMD 0.55, 95% confidence interval (CI) 0.02 to 1.09). However, there was substantial heterogeneity between trials (I² value 80%), most of which we were unable to explain. We repeated the analysis omitting one trial, an outlier, that included only participants with moderate or severe dementia. This reduced the heterogeneity somewhat (I² value 68%), and produced a result that was no longer significant (seven trials, 308 participants; SMD 0.31, 95% CI ‐0.11 to 0.74). We found a significant effect of exercise programs on the ability of people with dementia to perform ADLs (six studies, 289 participants; SMD 0.68, 95% CI 0.08 to 1.27). However, again we observed considerable unexplained statistical heterogeneity (I² value 77%) in this meta‐analysis. This means that there is a need for caution in interpreting these findings. In further analyses, we found that the burden experienced by informal caregivers providing care in the home may be reduced when they supervise the participation of the family member with dementia in an exercise program (one study, 40 participants; MD ‐15.30, 95% CI ‐24.73 to ‐5.87), but we found no significant effect of exercise on challenging behaviours (one study, 110 participants; MD ‐0.60, 95% CI ‐4.22 to 3.02), or depression (six studies, 341 participants; MD ‐0.14, 95% CI ‐0.36 to 0.07).
We could not examine the remaining outcomes, quality of life, mortality, and healthcare costs, as either the appropriate data were not reported, or we did not retrieve trials that examined these outcomes.

Authors' conclusions

There is promising evidence that exercise programs can have a significant impact in improving ability to perform ADLs and possibly in improving cognition in people with dementia, although some caution is advised in interpreting these findings. The programs revealed no significant effect on challenging behaviours or depression. There was little or no evidence regarding the remaining outcomes of interest.

Plain language summary

Exercise programs for people with dementia

Background

In future, as the population ages, the number of people in our communities suffering with dementia will rise dramatically. This will not only affect the quality of life of people with dementia but also increase the burden on family caregivers, community care, and residential care services. Exercise is one lifestyle factor identified as a potential means of reducing or delaying progression of the symptoms of dementia.

Study Characteristics

This review evaluated the results of 16 trials (search date August 2012), including 937 participants, that tested whether exercise programs could improve cognition, activities of daily living, behaviour, depression, and mortality in older people with dementia or benefit their family caregivers.

Key Findings

There was promising evidence that exercise programs can significantly improve the cognitive functioning of people with dementia and their ability to perform daily activities, but there was a lot of variation between trial results that we were not able to explain. The studies showed no significant effect of exercise on mood. There was little or no evidence regarding the other outcomes listed above. Further well‐designed research is required to examine these outcomes and to determine the best type of exercise program for people with different types and severity of dementia.

Quality of Evidence

The authors have no conflicts of interest.

Authors' conclusions

Implications for practice

With an increased number of trials now available, there is evidence that suggests that exercise programs could have a significant impact on improving cognitive functioning and ability to perform activities of daily living (ADLs) in people with dementia. Healthcare providers who work with people with dementia and their caregivers should feel confident in promoting exercise among this population, as decreasing the progression of cognitive decline and dependence in ADLs will have significant benefits for people with dementia and their family caregivers’ quality of life, and possibly delay the need for placement in long‐term care settings. No trials reported adverse events related to exercise programs.

One trial that examined the burden experienced by family caregivers who provide care in the home revealed that this burden can be reduced if they supervise their family member with dementia during participation in an exercise program. Therefore, encouraging caregivers to participate in exercise may also have a beneficial impact on their quality of life.

Setting (home versus institutional) can be considered in the future, if more studies become available. There was an insufficient number of trials to permit subgroup analyses that would determine which type of exercise (aerobic, strength training, balance), at what frequency and duration, is most beneficial for specific types and severity of dementia. Clearly further research is needed to be able to develop best practice guidelines that would be helpful to healthcare providers in advising people with dementia living in institutional and community settings.

Implications for research

For older people in general, recent research recommends at least 150 minutes of moderate‐ to vigorous‐intensity aerobic exercise per week, in bouts of 10 minutes or more, to achieve better quality of life, improve functional abilities, and reduce risk of disease, death, and loss of independence by up to 60%. In addition, muscle and bone strengthening activities using major muscle groups are recommended at least two times per week (Chodzko‐Zajko 2009; Tremblay 2011). Other research has revealed that aerobic‐type exercise has a clear benefit over strength training, and moderate‐intensity exercise of at least one hour a day, three to five times or more a week may be more effective in improving cognition (Kramer 2007; Middleton 2007).

However, these recommendations may not be appropriate for people with dementia. Further research is necessary to identify the optimal exercise modalities particularly in terms of frequency, intensity, and duration for people with different types and severity of dementia and to identify barriers and facilitators to improving adherence. Attempting to match the exercise programs with the needs, capabilities, and preferences of people with dementia, and ensuring adequate funding to provide regular, appropriate programs, over extended periods, by qualified instructors may increase adherence (Forbes 2007). Additional well designed trials that are conducted in the community setting, which is where most people with dementia live, and that examine outcomes of relevance to people with dementia (e.g. cognition, ADLs, depression, challenging behaviours, and mortality), family caregiver outcomes (e.g. caregiver burden, quality of life, and mortality) and economic analysis of visits to emergency departments, acute care settings, and cost of residential care are also needed.

No serious adverse events were attributed to participation in the trials. Recent research suggests that high‐intensity weight‐bearing exercise does not seem to have a negative effect on functional balance (Conradsson 2010). However, further research is needed about potential adverse events from aerobic exercise programs to determine whether this population is similar to older adults in general who are less likely to fall and less likely to injure themselves from falls if they are physically active (Kannus 2005; Sherrington 2004), or if the risk of falling and of cardiovascular events is higher in people with dementia during aerobic exercise.

Clinical researchers should make a practice of ensuring that their trials are of high methodological quality and provide information on the randomization process (sequence generation and allocation concealment), blinding of outcome assessors, attrition rates and reasons for drop‐outs from both treatment and control groups, rate of adherence to the exercise programs and reasons for withdrawal, and adverse events to the exercise programs in published articles, or be willing to share this information with reviewers when contacted. Providing statistically appropriate data (e.g. end point means and standard deviations) would also ensure that the trial results can be incorporated into meta‐analysis.

Background

Description of the condition

In 2012, the World Health Organization declared dementia to be a public health priority (World Health Organization 2012), citing the high global prevalence and economic impact on families, communities, and health service providers. In the coming decades, with the aging of the population, the number of individuals living with dementia in our communities will rise dramatically. This will increase the burden on family caregivers, community care, and residential care services (Alzheimer Society of Canada 2010; World Alzheimer Report 2011). People diagnosed with dementia often have unique needs, as they tend to be older and present with acquired impairment in memory, associated with other disturbances of higher cortical function, or personality changes (APA 1995; McKhann 1984). As a first approach, best practice guidelines currently recommend the exploration of behavioural and psychological interventions before initiating pharmacological interventions, due to the limited benefit of pharmacological treatments in reducing functional decline and their potential side effects (Forbes 2008a; Hogan 2008). Exercise is among the potential protective lifestyle factors identified as a strategy for treating the symptoms of dementia or delaying its progression (Lautenschlager 2010).

Description of the intervention

Many studies have examined the influence of exercise on healthy older people. A Cochrane review that included 11 randomized controlled trials (RCTs) of aerobic exercise programs for older people reported that eight studies showed improvement in at least one aspect of cognitive function (Angevaren 2008), with the largest effects on cognitive speed, delayed memory functions, auditory and visual attention. However, the aspects of cognitive function that improved differed across studies, and the majority of comparisons were not significantly different. Since this review, further studies with older people have revealed that exercise improves depression (e.g. Chen 2009); aerobic exercise training increases the size of the anterior hippocampus, with improvements in cognition (Erickson 2011); and mid‐life exercise may contribute to maintenance of cognitive function and may reduce or delay the risk of late‐life dementia (Chang 2010). A recent narrative review included 12 medium‐ to high‐quality RCTs (Tseng 2011). Five of the studies that focused on healthy adults revealed that exercise can improve cognitive function. Most studies examined used a 60‐minute exercise regimen scheduled three times per week that continued for 24 weeks. Hamer 2009 conducted a systematic review that included 16 prospective studies (163,797 participants without dementia at baseline with 3219 with dementia at follow‐up). The relative risk (RR) of dementia in the highest exercise category compared with the lowest was 0.72 (95% CI 0.60 to 0.86, P value < 0.001) and for Alzheimer's disease (AD) the RR was 0.55 (95% CI 0.36 to 0.84, P value 0.006). The authors concluded that exercise is inversely associated with risk of dementia. However, Miller 2012 cautioned that, although the association between exercise and preserved cognition during aging was clearly demonstrated, the specific hypothesis ‐ that exercise is a cause of healthy cognitive aging ‐ had yet to be validated. 
A number of factors could mediate the exercise‐cognition association, including depression, and social or cognitive stimulation. These factors need to be studied systematically, including identifying the type of activity most useful to individual participants.

Recent research has also examined the influence of exercise with people with mild cognitive impairment (MCI). Miller 2011 reported no improvement in cognitive function in people with MCI who participated in an exercise program. They concluded that exercise may be beneficial prior to the onset of MCI, but is less beneficial after its onset. A twice‐weekly, group‐based, moderate‐intensity walking program over one year led to no improvement in cognition, except in those with better adherence. For these participants, the walking program was efficacious in improving memory in men, and memory and attention in women (Van Uffelen 2008). However, Intlekofer 2012 suggests that evidence is starting to emerge that exercise supports brain health even when initiated after the appearance of AD pathology. Clearly, further investigation is needed in this area.

How the intervention might work

Physical activity, as defined for this review, refers to "body movement that is produced by the contraction of skeletal muscles and that increases energy expenditure" (Chodzko‐Zajko 2009). Exercise refers to "planned, structured, and repetitive movement to improve or maintain one or more components of physical fitness" (Chodzko‐Zajko 2009). There are several potential mechanisms that link exercise programs, which include physical activity, to cognitive function. A detailed examination of the potential mechanism(s) is beyond the scope of this review. For further information the reader is directed to two recent reviews, Erickson 2012 and Davenport 2012. Briefly, exercise improves vascular health by reducing blood pressure (Fleg 2012), arterial stiffness (Fleg 2012), oxidative stress (Covas 2002), and systemic inflammation (Lavie 2011), and by enhancing endothelial function (Ghisi 2010), all of which are associated with the maintenance of cerebral perfusion (Churchill 2002; Davenport 2012; Rogers 1990). Recent evidence has shown a strong association between cerebral perfusion (i.e. balance between the supply and demand of nutrients to the brain), cognitive function, and fitness in older healthy adults (Brown 2010). Furthermore, insulin resistance or glucose intolerance is linked with amyloid β plaque formation (Farris 2003; Wareham 2000; Watson 2003), which is a feature of AD. Exercise is known to enhance insulin sensitivity and glucose control (Ryan 2000). Exercise may also preserve neuronal structure and promote neurogenesis, synaptogenesis, and capillarization (formation of nerve cells, the gaps between them, and blood vessels, respectively) (Colcombe 2003), which may be associated with exercise‐induced elevation in brain‐derived neurotrophic factor (BDNF) (Vaynman 2004), and insulin‐like growth factors (Cotman 2007).
Animal and human studies investigating the role of BDNF provide evidence that BDNF supports the health and growth of neurons and may regulate neuroplasticity (adaptability of the brain) as we age (Cheng 2003; Vaynman 2004). Intlekofer 2012 recently reported that exercise reinstates hippocampal function by enhancing the expression of BDNF and other growth factors that promote neurogenesis, angiogenesis (formation of blood vessels), and synaptic plasticity. Taken together, animal and human studies indicate that exercise provides a powerful stimulus that can counteract the molecular changes that underlie the progressive loss of hippocampal function in advanced age and AD (Erickson 2012).

Why it is important to do this review

Our initial Cochrane review in 2008 examined the influence of physical activity on people with dementia (Forbes 2008b). It included four RCTs, but only two provided the data necessary for the analysis. We concluded that there was insufficient evidence to draw conclusions about the effectiveness of physical activity programs in improving cognition, function, behaviour, depression, and mortality in people with dementia. Since conducting that review, more RCTs of exercise for dementia have been published, and an update is now timely to determine the effects of exercise programs on these outcomes as well as on burden, health, quality of life and mortality for family caregivers, and cost of healthcare services.

Objectives

Primary:

Do exercise programs for older people with dementia improve cognition, activities of daily living (ADLs), challenging behaviour, depression, and mortality in older persons with dementia?

Secondary:

Do exercise programs for older people with dementia have an indirect impact on family caregivers’ burden, quality of life, and mortality?
Do exercise programs for older people with dementia reduce the use of healthcare services (e.g. visits to the emergency department) by participants and their family caregivers?

Methods

Criteria for considering studies for this review

Types of studies

In this review, we included RCTs in which older people diagnosed with dementia were allocated to either an exercise program or a control group (usual care or social contact/activities). Although parallel group trials were preferred, cross‐over trials were eligible, but we only considered data from the first treatment phase (prior to the cross‐over). We included non‐blinded trials, as it was unrealistic to expect blinding of the participants and those who conducted the exercise programs. We expected outcome assessors to be blinded to treatment allocation; however, we did not exclude studies if blinding of outcome assessors was not incorporated in the study. We rated studies for blinding in the 'Risk of bias' tables.

Types of participants

The majority of participants in the trials had to be older people (over 65 years of age) and diagnosed as having dementia using accepted criteria such as the Diagnostic and Statistical Manual of Mental Disorders (APA 1995; DSM‐III‐R, DSM‐IV 1994), the National Institute of Neurological and Communicative Disorders and Stroke, and the Alzheimer's Disease and Related Disorders Association (McKhann 1984), ICD‐10 (World Health Organization 1992), or CERAD‐K (Hwang 2010).

Types of interventions

Interventions included exercise programs offered over any length of time with the aim of improving cognition, ADLs, behaviour, depression, and mortality in older people with dementia or improving the family caregiver burden, health, quality of life, or to decrease caregiver mortality, or use of healthcare services, or a combination of these. We included trials where the only difference between groups was the exercise intervention, and the types, frequencies, intensities, duration, and settings of the exercise programs were described. The exercise could be any combination of aerobic‐, strength‐, or balance‐training. The comparison groups received either usual care, or social contact/activities, to ensure that the participants received a similar amount of attention.

Types of outcome measures

Primary outcomes

The primary outcomes concerned the person with dementia, and included: cognition, ADLs, challenging behaviour (e.g. agitation, aggression), depression, and mortality.

Secondary outcomes

The secondary outcomes concerned the family caregiver(s), and included burden of care, quality of life, and mortality. We had intended to examine system costs related to use of health services; however, none of the included trials examined use of healthcare services.

Search methods for identification of studies

Electronic searches

We searched ALOIS (www.medicine.ox.ac.uk/alois) ‐ the Cochrane Dementia and Cognitive Improvement Group’s Specialised Register ‐ on 4 September 2011 and again on 14 August 2012. The search terms used were: physical activity OR exercise OR cycling OR swim* OR gym* OR walk* OR danc* OR yoga OR ‘tai chi’.

ALOIS is maintained by the Trials Search Co‐ordinator of the Cochrane Dementia and Cognitive Improvement Group and contains studies in the areas of dementia prevention, dementia treatment, and cognitive enhancement in healthy adults. The studies are identified from:

Monthly searches of a number of major healthcare databases: MEDLINE (Ovid SP), EMBASE (Ovid SP), CINAHL (EBSCOhost), PsycINFO (Ovid SP) and LILACS (BIREME).
Monthly searches of a number of trial registers: ISRCTN; UMIN (Japan's Trial Register); the World Health Organization portal (which covers ClinicalTrials.gov; ISRCTN; the Chinese Clinical Trials Register; the German Clinical Trials Register; the Iranian Registry of Clinical Trials and the Netherlands National Trials Register, plus others).
Quarterly search of The Cochrane Library’s Central Register of Controlled Trials (CENTRAL).
Six‐monthly searches of a number of grey literature sources: ISI Web of Knowledge Conference Proceedings; Index to Theses; Australasian Digital Theses.

To view a list of all sources searched for ALOIS see About ALOIS on the ALOIS website.

Details of the search strategies used for the retrieval of reports of trials from the healthcare databases, CENTRAL, and conference proceedings can be viewed in the ‘Methods used in reviews’ section within the editorial information about the Dementia and Cognitive Improvement Group.

We performed additional searches in many of the sources listed above to cover the timeframe from the last searches performed for ALOIS to ensure that the search for the review was as up‐to‐date and as comprehensive as possible. There was no restriction on language. The search strategies used can be seen in Appendix 1 and Appendix 2.

We performed another search in October 2013. We identified one new study for potential inclusion. We will incorporate this into the next update of this review.

Data collection and analysis

Selection of studies

After merging search results and discarding duplicates, at least two authors (DF, SCF, ET) independently examined titles and abstracts of citations. If a title or abstract appeared to represent our inclusion criteria, we retrieved the full article for further assessment. At least two authors, one a content expert (SCF) and the others with expertise in conducting systematic reviews (DF, ET), independently assessed the retrieved articles for inclusion in the review according to the eligibility criteria outlined above. We resolved disagreements by discussion, or if necessary, referred to another author. The excluded articles and reasons for exclusion are listed in the ‘Characteristics of excluded studies’ table.

Data extraction and management

We extracted information from the published articles including the study setting, inclusion and exclusion criteria, participants’ diagnosis and level of activity, description of the exercise programs, the randomization process, blinding, drop‐out rates, and outcome data.

The mean change from baseline to final measurements and the standard deviation (SD) of the change were often not reported in the published reports. Accordingly, we extracted the final mean measurements, the SD of the final mean, and the number of participants for each group at each assessment. The included trials reported no dichotomous data of interest to this review. One author extracted data from published reports, or requested it from the original first author when necessary, and at least two authors checked this. We resolved disagreements as noted above.

Assessment of risk of bias in included studies

Criteria for judging risk of bias were based on the Cochrane Handbook for Systematic Reviews of Interventions, version 5.1.0, chapter 8 (Higgins 2011). At least two authors, one a content expert (SCF), and the others (DF, ET) with expertise in conducting systematic reviews, independently assessed and rated the trials according to the 'Risk of bias' criteria below. The authors used an assessment tool to determine whether there was a low, high, or unclear risk of bias for each factor (see table 8.5.d, Higgins 2011). The identity of the publication and author information for each trial report was not masked. If the description of a process or outcome was unclear or missing, we contacted the original author of the trial in an attempt to retrieve the required information. Again, we resolved disagreements by discussion, or, if necessary, referred to a third author. We assessed the following criteria:

Selection bias ‐ systematic differences between baseline characteristics of the groups being compared, including:
1. random sequence generation;
2. allocation concealment.
Performance bias ‐ systematic differences between groups in the care that is provided, or in exposure to factors other than the interventions of interest, this includes:
1. blinding of participants and personnel.
Detection bias ‐ systematic differences between groups in how outcomes are determined, this includes:
1. blinding of outcome assessment.
Attrition bias ‐ systematic differences between groups in withdrawals from a study, this includes:
1. incomplete outcome data.
Reporting bias ‐ systematic differences between reported and unreported findings, that is:
1. selective reporting.
Other bias, such as:
1. bias due to other problems.

Measures of treatment effect

Summary statistics were required for each trial and each outcome. For continuous data, we used the mean difference (MD) when the pooled trials used the same rating scale or test to assess an outcome. We used the standardized mean difference (SMD), which is the absolute mean difference divided by the SD, when the pooled trials used different rating scales or tests. We used the inverse variance method in the meta‐analysis. All outcomes were reported using 95% confidence intervals (CI). None of the trials included in the review reported dichotomous data of interest to this review.
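As a rough illustration of the two effect measures described above, the following Python sketch computes an MD and an SMD, with 95% CIs and inverse variance weights, from each group's final mean, SD, and number of participants. It is not the RevMan implementation: the small‐sample (Hedges) correction, the large‐sample approximation used for the standard error of the SMD, and all function names and example numbers are assumptions made for illustration only.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Return (MD, SE) for two groups measured on the same scale."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se


def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Return (SMD, SE) using the pooled SD.

    Assumption: a Hedges small-sample correction is applied; the review
    text itself only says the mean difference is divided by the SD.
    """
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    g = correction * d
    # Common large-sample approximation to the standard error of the SMD.
    se = math.sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2)))
    return g, se


def ci95(estimate, se):
    """Return the lower and upper bounds of a 95% confidence interval."""
    return estimate - Z_95 * se, estimate + Z_95 * se


def inverse_variance_weight(se):
    """Weight a trial by the inverse of its variance (1 / SE squared)."""
    return 1 / se ** 2


if __name__ == "__main__":
    # Hypothetical trial: exercise group versus control group.
    smd, se = standardized_mean_difference(22.0, 4.0, 30, 20.0, 4.5, 28)
    low, high = ci95(smd, se)
    print(f"SMD {smd:.2f}, 95% CI {low:.2f} to {high:.2f}")
    print(f"inverse variance weight {inverse_variance_weight(se):.1f}")
```

A CI that excludes zero corresponds to a result that is statistically significant at the 5% level, which is how the pooled estimates in this review are read.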

Unit of analysis issues

If a cross‐over design study was included in the review, we planned to only consider the results prior to the cross‐over for inclusion in our analysis. However, we did not have any cross‐over design studies to consider. If a trial included three or more arms, consideration was given to the nature of the intervention and control arms. Combining the data from two treatment arms that were similar and had the same control group was undertaken as recommended in the Cochrane Handbook for Systematic Reviews of Interventions, section 7.7.3.8 and Table 7.7a (Higgins 2011). For one trial (Williams 2008), we pooled two intervention arms (exercise group and a walking group) that were compared with a control (conversation) group.
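The combination of two similar arms into a single group can be sketched as follows. This is a minimal Python illustration of the standard formula for merging the sample size, mean, and SD of two groups, as we understand the Handbook table cited above; the function name and the example values are hypothetical and are not data from Williams 2008.

```python
import math


def combine_arms(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into one group; return (n, mean, sd).

    The combined SD accounts for both the within-arm variability and
    the difference between the two arm means.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    between = (n1 * n2 / n) * (m1 - m2) ** 2
    variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2 + between) / (n - 1)
    return n, mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical exercise arm and walking arm sharing one control group.
    n, mean, sd = combine_arms(15, 18.2, 5.1, 17, 19.0, 4.8)
    print(f"combined arm: n={n}, mean={mean:.2f}, SD={sd:.2f}")
```

The merged group can then be compared with the shared control group as if the trial had two arms, which avoids counting the control participants twice.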

Dealing with missing data

Many types of information were found to be missing from the published articles, such as descriptions of the process of randomization, blinding of outcome assessors, attrition and adherence to the exercise program, reasons for withdrawing, and statistical data (i.e. means and SDs). We emailed contact authors on at least three separate occasions over a three‐month period and requested them to provide the missing data. Some of this missing data is described in the 'Risk of bias' tables. The potential impact of the missing data on the results depended on the extent of missing data, the pooled estimate of the treatment effect, and the variability of the outcomes. Variation in the degree of missing data was also considered as a potential source of heterogeneity. If available, intention‐to‐treat (ITT) data were used, and, if not available, only the reported completers’ data were used in the analyses.

Assessment of heterogeneity

We considered only trials that demonstrated clinical homogeneity (that is, trials that tested similar exercise programs and examined similar outcome measures) to be potentially appropriate for meta‐analysis. Heterogeneity was initially explored through visual exploration of the forest plots. A test for statistical heterogeneity (a consequence of clinical or methodological diversity, or both, among trials) was then performed using the Chi² test (with a P value < 0.10 indicating significance) and I² analysis. The I² analysis is a useful statistic for quantifying inconsistency (I² = [(Q ‐ df)/Q] x 100%, where Q is the Chi² statistic and df is its degrees of freedom; Higgins 2002; Higgins 2003). This describes the percentage of variability in effect estimates that is due to heterogeneity rather than sampling error (chance). Values greater than 50% are considered to represent substantial heterogeneity, and, when these occurred, we attempted to explain this variation. If the value was less than 30%, we presented the overall estimate using a fixed‐effect model. If, however, there was evidence of heterogeneity of the population or treatment effect, or both, between trials, then we used a random‐effects model, for which the confidence intervals are broader than those of a fixed‐effect model (Higgins 2011).
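The I² formula given above can be illustrated with a short Python sketch. It computes the Chi² (Cochran's Q) statistic from per‐trial effect estimates and standard errors using inverse variance weights, then derives I², truncating negative values at zero as is conventional. The function names and the example inputs are hypothetical; this is not the RevMan code.

```python
def cochran_q(effects, standard_errors):
    """Return (Q, df) for a set of trial effect estimates."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return q, df


def i_squared(q, df):
    """Return I² as a percentage: [(Q - df) / Q] x 100, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


if __name__ == "__main__":
    # Hypothetical SMDs and standard errors from four trials.
    effects = [0.10, 0.45, 0.90, 1.40]
    standard_errors = [0.20, 0.25, 0.30, 0.35]
    q, df = cochran_q(effects, standard_errors)
    print(f"Q = {q:.2f} on {df} df, I² = {i_squared(q, df):.0f}%")
```

Because Q is compared with its degrees of freedom, I² is zero whenever the observed variation between trials is no larger than would be expected by chance.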

Assessment of reporting biases

We explored publication bias through examining funnel plots for signs of asymmetry. We also explored additional reasons for asymmetry. However, only a few studies were included in the meta‐analyses, so this approach was not reliable. To investigate reporting biases within our included studies, we compared outcomes listed in the methods sections with reported results.

Data synthesis

As stated above, we conducted the meta‐analysis using a fixed‐effect model except when the I² measure of heterogeneity was greater than 30%, when we used a random‐effects model in the analyses.
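To make the distinction between the two models concrete, here is a minimal Python sketch of inverse‐variance pooling under a fixed‐effect model and under a random‐effects model. The use of the DerSimonian‐Laird estimate of between‐trial variance, the function name, and the effect sizes and standard errors are all assumptions for illustration; they are not data from the included trials.

```python
import math


def pool_inverse_variance(effects, std_errors):
    """Pool trial estimates; returns (estimate, SE) for both models plus tau²."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed_est = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q: weighted spread of trial estimates around the fixed-effect estimate.
    q = sum(w * (y - fixed_est) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimate of between-trial variance, floored at 0.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau² to each trial's own variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total_re = sum(re_weights)
    random_est = sum(w * y for w, y in zip(re_weights, effects)) / total_re

    return {
        "fixed": (fixed_est, math.sqrt(1.0 / total_w)),
        "random": (random_est, math.sqrt(1.0 / total_re)),
        "tau2": tau2,
    }


# Hypothetical SMDs and standard errors from three imaginary trials.
result = pool_inverse_variance([0.1, 0.5, 0.9], [0.2, 0.2, 0.2])
for model in ("fixed", "random"):
    est, se = result[model]
    print(f"{model}: {est:.2f} (95% CI {est - 1.96 * se:.2f} to {est + 1.96 * se:.2f})")
```

With these made‐up inputs both models give the same point estimate, but the random‐effects confidence interval is wider because the between‐trial variance is added to each trial's own variance.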

Subgroup analysis and investigation of heterogeneity

It was decided a priori that, if there were sufficient data, we would conduct the following subgroup analyses to explore possible causes of heterogeneity:

Severity of dementia at baseline:

mild (Mini Mental State Examination (MMSE) 17 to 26, or similar scale) (Hogan 2007);
moderate (MMSE 10 to 17, or similar scale) (Hogan 2007);
severe (MMSE < 10, or similar scale) (Feldman 2005).

Disease type:

AD;
vascular dementia;
mixed dementia;
unclassified or other dementia.

Type of exercise program:

aerobic;
strength;
balance.

Frequency of exercise program:

up to three times per week;
more than three times per week.

Duration of exercise program:

up to 12 weeks;
more than 12 weeks.

Sensitivity analysis

We also considered sensitivity analyses, a priori, to explore possible causes of methodological heterogeneity, such as including studies that used a variety of measurement tools.

Results

Description of studies

Please see 'Characteristics of included studies', 'Characteristics of excluded studies', and 'Characteristics of ongoing studies' tables.

Results of the search

Database searches located a total of 3844 articles; the abstracts and titles of 501 of these were screened for inclusion. Fifty‐one articles were retrieved and independently rated by two reviewers. Seventeen articles met the inclusion criteria (Christofoletti 2008; Conradsson 2010 (two articles); Eggermont 2009a; Eggermont 2009b; Francese 1997; Hwang 2010; Holliman 2001; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Stevens 2006; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012; Williams 2008). Two of these articles reported on different outcomes from the same trial (Conradsson 2010). Thus, 16 trials were included in the review. See Figure 1 for a flow chart.

Figure 1

Study flow diagram

Included studies

The included articles were published between 1997 and 2012. Four trials were conducted in the United States (Francese 1997; Holliman 2001; Steinberg 2009; Williams 2008), one in Sweden (Conradsson 2010), two in France (Kemoun 2010; Rolland 2007), two in Australia (Stevens 2006; Vreugdenhil 2012), two in the Netherlands (Eggermont 2009a; Eggermont 2009b), and one each in Belgium (Van de Winckel 2004), Brazil (Christofoletti 2008), Italy (Venturelli 2011), South Korea (Hwang 2010), and Spain (Santana‐Sosa 2008).

Participants

Please see 'Characteristics of included studies' table.

Participants were residents of nursing homes (Eggermont 2009a; Eggermont 2009b; Francese 1997; Hwang 2010; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Stevens 2006; Venturelli 2011; Williams 2008), graduated residential care (Conradsson 2010), or psychiatric facilities (Christofoletti 2008; Holliman 2001; Van de Winckel 2004), or lived in their own homes (Steinberg 2009; Vreugdenhil 2012).

Consent was obtained in all trials from the participants, from their legal guardian or family member, or both. Three of the included trials recruited fewer than 20 participants: Francese 1997, Holliman 2001, and Santana‐Sosa 2008. Nine trials recruited between 24 and 66 participants: Venturelli 2011, Van de Winckel 2004, Steinberg 2009, Hwang 2010, Kemoun 2010, Vreugdenhil 2012, Williams 2008, Christofoletti 2008, and Eggermont 2009b. Four other trials recruited 100 or more participants: Conradsson 2010, Rolland 2007, Stevens 2006, and Eggermont 2009a. The total number of participants who were assessed at baseline was 937, and 798 (85.2%) participants completed the trials.

All but one trial required a diagnosis of dementia for eligibility. Only 52% of participants (100/191) in the Conradsson 2010 trial had a diagnosed dementia disorder, but separate data were available for these participants. Conradsson 2010 and Venturelli 2011 required participants to be 65 years or older. Eggermont 2009a and Eggermont 2009b required participants to be 70 years or older. Three trials had a length of residency or attendance requirement: participants had to have been living in the nursing home for three weeks (Holliman 2001), two months (Rolland 2007), or four months (Santana‐Sosa 2008).

The most commonly used diagnostic criteria for dementia in the included studies were from DSM‐IV (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Kemoun 2010; Vreugdenhil 2012). Other authors used the National Institute of Neurological and Communicative Disorders and Stroke, and the Alzheimer Disease and Related Disorders Association (NINCDS‐ADRDA) criteria for probable or possible Alzheimer's disease as eligibility for inclusion (Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Van de Winckel 2004; Vreugdenhil 2012; Williams 2008), the International Statistical Classification of Diseases, 10th revision (ICD‐10) definition of dementia (Christofoletti 2008), Clinical Dementia Rating (CDR3‐CDR4) for late stage AD (Venturelli 2011), Consortium to Establish a Registry for Alzheimer's Disease Assessment Package‐Korean (CERAD‐K) (Hwang 2010), MMSE (Holliman 2001), chart review (Francese 1997), and a local Aged Care Assessment Team (Stevens 2006).

The majority of trial participants had AD (Eggermont 2009b; Francese 1997; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011; Vreugdenhil 2012; Williams 2008). One of the trials had participants with vascular disease and AD (Van de Winckel 2004). In the remaining trials, participants’ type of dementia was unspecified (Christofoletti 2008; Conradsson 2010; Eggermont 2009a; Holliman 2001; Hwang 2010; Stevens 2006).

One of the included trials had participants with mild dementia (Santana‐Sosa 2008). Six of the trials had participants with mild to moderate dementia (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Steinberg 2009; Stevens 2006; Vreugdenhil 2012). Only two of the trials had participants with moderate to severe dementia (Venturelli 2011; Williams 2008). Four of the trials had participants with mild to severe dementia (Hwang 2010; Kemoun 2010; Rolland 2007; Van de Winckel 2004), and two had participants with severe dementia (Francese 1997; Holliman 2001).

Ten of the trials specified the participants’ level of physical abilities: Eggermont 2009a required the ability to walk short distances with no aid; Eggermont 2009b required no apparent disability in hand motor function; Kemoun 2010 required the ability to walk 10 metres without technical assistance; Conradsson 2010 required the ability to stand up from a chair with help from no more than one person; Rolland 2007 required that the residents be able to transfer from a chair and walk six metres without human assistance; Francese 1997 required the residents to need one‐ to two‐person assistance to transfer; Van de Winckel 2004 required the ability to mimic the movements of the therapist and to be able to hear the music; Venturelli 2011 required an absence of mobility limitations; Steinberg 2009 required that participants be ambulatory; and Williams 2008 required that participants be able to walk with assistance, but also that they be dependent in at least one of the following: bed mobility, transfers, gait, or balance. Conradsson 2010 and Venturelli 2011 required participants to be dependent on assistance from a person in one or more personal ADLs. Santana‐Sosa 2008 required that participants be free of neurological, vision, muscle or cardio‐respiratory disorders, and Christofoletti 2008 required that participants had no other neurological or neuropsychiatric conditions, had no prescriptions of antidepressant medications with central anti‐cholinergic or sedative action, and had no drug‐related cognitive or balance impairment. Eggermont 2009a and Eggermont 2009b required that participants have no visual disturbances, hearing difficulties, history of alcoholism, personality disorders, cerebral trauma, or disturbances of consciousness.

Additional inclusion criteria required participants to have: good general health, a stable medical history, and a caregiver who spent at least 10 hours per week with the participant (Steinberg 2009); a caregiver who either lived with the participant or visited daily (Vreugdenhil 2012); a score of seven or above on the Cornell Scale for Depression in Dementia (CSDD) (Williams 2008); a MMSE score lower than 24/30 (Van de Winckel 2004); a MMSE score from 5 to 15 and a minimum score of 23 on the Performance Oriented Mobility Assessment (POMA) index, with a constant oxygen saturation during walking (SpO₂ that exceeded 85%) (Venturelli 2011); discharge scheduled after the trial (Holliman 2001); medical fitness (Christofoletti 2008); physical ability to participate (Francese 1997; Hwang 2010; Stevens 2006); ability to respond to most verbal requests (Stevens 2006; Van de Winckel 2004); and the ability to understand English (Francese 1997). Participants in the Holliman 2001 trial were not permitted to be participants in another, simultaneous research trial.

Exercise programs

The administration of the exercise programs ranged from twice a week (Rolland 2007), to three times a week (Christofoletti 2008; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008), four times a week (Venturelli 2011), five times a week (Eggermont 2009a; Eggermont 2009b; Williams 2008), to daily (Van de Winckel 2004; Vreugdenhil 2012). Conradsson 2010 required participants to complete five sessions every two weeks. Steinberg 2009 required the participants in the exercise group to achieve a number of points that were accrued for performing activities in the aerobic, strength, and balance categories (one point for partially performing a task; two for completing). The goal was to achieve six aerobic points and four each of strength and balance per week. Each session varied in length from 20 minutes (Francese 1997), to 30 minutes (Eggermont 2009a; Eggermont 2009b; Holliman 2001; Stevens 2006; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012; Williams 2008), 45 minutes (Conradsson 2010), 50 minutes (Hwang 2010), 60 minutes (Christofoletti 2008; Kemoun 2010), up to 75 minutes (Santana‐Sosa 2008). The period of time the program was offered varied greatly from two weeks (Holliman 2001), to six weeks (Eggermont 2009a; Eggermont 2009b), seven weeks (Francese 1997), 12 weeks (Santana‐Sosa 2008; Stevens 2006; Steinberg 2009; Van de Winckel 2004), 13 weeks (Conradsson 2010), 15 weeks (Kemoun 2010), 16 weeks (Vreugdenhil 2012; Williams 2008), six months (Christofoletti 2008; Venturelli 2011), up to 12 months (Rolland 2007).

In three trials the exercises were performed while seated in order to accommodate people in wheelchairs (Francese 1997; Holliman 2001; Stevens 2006). In the Rolland 2007 trial, the first half hour of the session consisted of walking, and the remainder of the program included strength and balance training. Francese 1997 offered an exercise regime that consisted of activities such as catching, throwing, and kicking balls; leg weight exercises; and parachute reaches. Holliman 2001 designed the exercise program to target the training of gross and fine motor skills and movement, and also to be meaningful and appropriate for the residents. This program included several interactive exercises such as passing a bean bag or playing volleyball in order to promote socialization. The program used by Stevens 2006 was based on joint and large muscle group movement with the intention of creating gentle, aerobic exertion. Christofoletti 2008 and Vreugdenhil 2012 used walking and upper and lower limb exercises to stimulate strength, balance, motor co‐ordination, agility, flexibility, and aerobic endurance. The Santana‐Sosa 2008 training sessions included joint mobility, resistance, and co‐ordination exercises. Hwang 2010 conducted an upper body dance therapy program. Van de Winckel 2004 incorporated upper and lower body strengthening as well as balance, trunk movements, and flexibility training, all supported by music. Participants in Conradsson 2010 performed a high‐intensity functional weight‐bearing exercise program, including strength and balance exercises. Steinberg 2009 focused on walking, strength training, balance, and flexibility training. Eggermont 2009a and Venturelli 2011 provided a supervised walking program. Participants in Eggermont 2009b performed hand movements only. 
Williams 2008 compared two experimental interventions: the first combined walking and strength‐based exercises focusing on improving strength, balance, and flexibility, while the second consisted of supervised walking.

Control groups

The control groups for eight of the studies received usual care with no additional interventions (Christofoletti 2008; Hwang 2010; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Stevens 2006; Venturelli 2011; Vreugdenhil 2012). The control group for four studies included social contact (Eggermont 2009a; Steinberg 2009; Van de Winckel 2004; Williams 2008). In three studies the control groups consisted of social contact with additional activities such as films, singing, and reading (Conradsson 2010; Eggermont 2009b; Francese 1997). One study did not provide any details about the control group (Holliman 2001).

Primary outcomes

Cognitive

The MMSE test was used frequently in trials (Christofoletti 2008; Holliman 2001; Steinberg 2009; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012). In addition, Hwang 2010 used the Cognitive Memory Performance Scale; Kemoun 2010 used the Rapid Evaluation of Cognition Functions Test; Stevens 2006 measured the progression of dementia with the Clock Drawing Test; while Eggermont 2009a and Eggermont 2009b used a Delayed Recall score using the Eight Words Test. For all these measuring scales, higher scores indicate less cognitive impairment.

Activities of daily living (ADL)

ADLs were assessed using the Barthel ADL Index (Conradsson 2010; Santana‐Sosa 2008; Venturelli 2011; Vreugdenhil 2012), Katz Index of ADLs (Rolland 2007; Santana‐Sosa 2008), and Changes in Advanced Dementia Scale (CADS) (Francese 1997). Higher scores in the Barthel ADL Index, Katz Index and the CADS indicate greater ability to perform ADLs.

Challenging behaviours

Four trials measured challenging behaviours of the participants (Holliman 2001; Rolland 2007; Stevens 2006; Van de Winckel 2004). A subscale of the Psychogeriatric Dependency Rating Scale (PGDRS) was used to measure behaviours such as wandering, active aggression and restlessness related to dementia (Holliman 2001). Rolland 2007 evaluated behavioural disturbances as a secondary outcome measure using the Neuropsychiatric Inventory (NPI). Stevens 2006 used the Revised Elderly Disability Scale, which assesses self‐help skills, behaviour, and six other categories that reflect functional ability. Behaviour was also evaluated with the abbreviated Stockton Geriatric Rating Scale (Van de Winckel 2004). Higher scores on these scales indicate worse or dependent behaviours; all measures are appropriate for people with dementia.

Depression

Depression was evaluated using the Montgomery‐Asberg Depression Rating Scale (Rolland 2007), Cornell Scale for Depression in Dementia (CSDD) (Steinberg 2009; Williams 2008), the NPI (Steinberg 2009), and the Geriatric Depression Scale (Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Vreugdenhil 2012). All of these measures are valid, reliable and specific to people with dementia; higher scores indicate greater depression.

Mortality

No included studies measured mortality.

Secondary outcomes

Caregiver burden

Caregiver burden was assessed using the Screen for Caregiver Burden (Steinberg 2009), and the Zarit Burden Interview Scale (Vreugdenhil 2012). In both cases, higher scores indicate increased burden.

Caregiver quality of life

No included studies measured caregiver quality of life.

Caregiver mortality

No included studies measured caregiver mortality.

Use of healthcare services

No included studies measured use of healthcare services.

Excluded studies

Twenty‐four trials were excluded for the following reasons: five were not or likely not randomized (Aman 2009; Arcoverde 2008; Batman 1999; Christofoletti 2011; Kwak 2008); nine did not include persons diagnosed with dementia (Anon 1986; Kerse 2008; Littbrand 2006; Netz 1994; Powell 1974; Rodgers 2002; Scherder 2005; van Uffelen 2005; Viscogliosi 2000); two studies included treatments in addition to the exercise (Burgener 2008; Oswald 2007); and one study did not include an exercise program (Onor 2007). One study did not incorporate a comparison group comprised of persons with dementia (Heyn 2008) and another study did not include usual care in the control group (Obisesan 2011). Five studies examined outcomes that were not of interest to this review (Littbrand 2011; Netz 2007; Tappen 2000; Williams 2007; Yagüez 2011).

Risk of bias in included studies

(See Characteristics of included studies.)

Allocation

Random sequence generation (selection bias)

In eleven trials the methods used to generate allocation sequence were not described or were unclear (Christofoletti 2008; Conradsson 2010; Eggermont 2009b; Francese 1997; Holliman 2001; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011; Williams 2008). The remaining trials were judged to be at low risk of bias for this domain, as sufficient information about the way the allocation sequence was generated was available (Eggermont 2009a; Rolland 2007; Stevens 2006; Van de Winckel 2004; Vreugdenhil 2012).

Allocation (selection bias)

Methods used to conceal allocation sequence in eleven of the trials were unclear or not described (Eggermont 2009a; Eggermont 2009b; Francese 1997; Holliman 2001; Hwang 2010; Kemoun 2010; Santana‐Sosa 2008; Steinberg 2009; Stevens 2006; Van de Winckel 2004; Williams 2008). In the remaining trials, allocation concealment was adequate and, due to this factor, the risk of selection bias was rated as low (Christofoletti 2008; Conradsson 2010; Rolland 2007; Venturelli 2011).

Blinding

Blinding of participants and personnel (performance bias)

All studies were at high risk of performance bias, as blinding of participants and personnel to the intervention was not possible, due to the nature of rehabilitation trials.

Blinding of outcome assessment (detection bias)

Blinding of outcome assessors was not described in Francese 1997, Hwang 2010, and Stevens 2006. Venturelli 2011 stated that the evaluation was completed in a "blinded way" and provided no further explanation, so it was not clear whether outcome assessments had been blinded. The Van de Winckel 2004 study stated, "The physiotherapist who was conducting both treatments evaluated the patients on cognition. However, the nurses who scored the patient on behaviours were all blind to the group assignment", and so was at high risk for detection bias for cognition outcomes. The remaining trials were at low risk for detection bias since outcome assessors were blinded (Christofoletti 2008; Conradsson 2010; Eggermont 2009a; Eggermont 2009b; Holliman 2001; Kemoun 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Vreugdenhil 2012; Williams 2008).

Incomplete outcome data

Attrition rates (drop‐outs from the trials) varied from 0% to 37% in the included trials. Attrition bias was unclear for Steinberg 2009, since the trial report did not provide data on attrition; we received no response when we requested this information. Stevens 2006 was the only author who did not indicate the group (experimental or control) from which the drop‐out occurred. The drop‐out rates were higher in the experimental arms for Christofoletti 2008 (29% experimental versus 15% control), Kemoun 2010 (20% experimental versus 17% control), Conradsson 2010 (14% experimental versus 9% control), and Eggermont 2009b (12% experimental versus 3% control). Attrition was higher in the control groups for Francese 1997 (0% experimental versus 17% control), Van de Winckel 2004 (0% experimental versus 10% control), Venturelli 2011 (8% experimental versus 17% control), Rolland 2007 (16% experimental versus 19% control), and Hwang 2010 (29% experimental versus 43% control). Reasons for attrition were provided, and included: death, illness, increased disability, disinterest, physician’s disapproval, withdrawal of consent by family, moving to another institution, and refusal to continue to participate.

In summary, the majority of the trials were rated as having low attrition bias; trials with high attrition used ITT principles in analysis. High attrition bias was reported for five of the included studies (Christofoletti 2008; Holliman 2001; Hwang 2010; Steinberg 2009; Stevens 2006) for a variety of reasons that included: failure to report attrition rates for individual groups; a high attrition rate; or an imbalance of attrition between the groups, failure to provide reasons for attrition, or both (see Characteristics of included studies).

Selective reporting

All of the included trials had low reporting bias, as all outcomes reported were based on the objectives of the study.

Other potential sources of bias

Several points included in the Characteristics of included studies section may be considered to indicate other potential sources of bias, such as attendance and adherence to the exercise programs. Only eight trials provided information regarding the adherence of participants in the intervention arms: Santana‐Sosa 2008 (98.9% for five subjects, 97% for three subjects); Venturelli 2011 (93%); Kemoun 2010 (90%); Conradsson 2010 (72%); Steinberg 2009 (59% of the diaries were received and 75% of the exercise group met their goals); Rolland 2007 (mean adherence 33% (SD 25.5) of the 88 sessions offered, although 100% were included in the ITT analysis); Eggermont 2009b (data analysed from residents with 80% attendance or greater); and Stevens 2006 (data analysed from residents with 75% attendance or greater). Figure 2 and Figure 3 provide summaries of risk of bias.

Figure 2

Risk of bias graph: review authors' judgments about each risk of bias item presented as percentages across all included trials

Figure 3

Risk of bias summary: review authors' judgments about each risk of bias item for each included trial

Effects of interventions

Primary outcomes

Cognition (eight trials; 329 participants)

We included eight trials that examined the effect of exercise on cognition in the meta‐analysis (Christofoletti 2008; Eggermont 2009a; Eggermont 2009b; Hwang 2010; Kemoun 2010; Van de Winckel 2004; Venturelli 2011; Vreugdenhil 2012). We included post‐intervention end point measures of means, SDs, and number of participants in each group in the meta‐analysis.
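For readers unfamiliar with the SMD, the sketch below shows one common way to compute it from the end point means, SDs, and group sizes: the difference in means divided by the pooled SD, with a small‐sample (Hedges) correction. The formula variant, the function name, and the input numbers are assumptions for illustration; they are not taken from the included trials or from the review's analysis files.

```python
import math


def standardized_mean_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Return (SMD, SE) using a pooled SD and Hedges' small-sample correction."""
    n = n1 + n2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n - 2))
    correction = 1.0 - 3.0 / (4.0 * n - 9.0)
    smd = (mean1 - mean2) / pooled_sd * correction
    # One commonly used approximation to the standard error of the corrected SMD.
    se = math.sqrt(n / (n1 * n2) + smd ** 2 / (2.0 * (n - 3.94)))
    return smd, se


# Hypothetical exercise and control groups of 10 participants each.
smd, se = standardized_mean_difference(10, 12.0, 4.0, 10, 10.0, 4.0)
print(f"SMD = {smd:.2f}, SE = {se:.2f}")
```

Because the SMD is expressed in SD units, it allows trials that used different cognition scales to be pooled, provided that higher scores point in the same direction on every scale.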

A random‐effects model was used, as the Chi² value was 34.67 and the I² value was 80%, indicating substantial heterogeneity. The meta‐analysis yielded significant results (P value 0.04) that favoured the exercise program (k = 8, n = 329; SMD 0.55, 95% CI 0.02 to 1.09) (Analysis 1.1; Figure 4a).
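The reported P value can be checked approximately against the confidence interval. The Python sketch below back‐calculates the standard error from a 95% CI, then a z statistic and a two‐sided P value, assuming a normal approximation. The function name is hypothetical, and the result is only an approximate check on rounded figures, not a re‐analysis.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Two-sided P value implied by a symmetric 95% CI (normal approximation)."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals twice the upper-tail normal probability.
    return math.erfc(abs(z) / math.sqrt(2.0))


# SMD 0.55, 95% CI 0.02 to 1.09, as reported for cognition.
print(f"P = {p_from_ci(0.55, 0.02, 1.09):.2f}")
```

The value obtained this way rounds to the reported 0.04, so the interval and the P value are mutually consistent.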

Figure 4

Forest plot of comparison 1: Physical activity vs usual care: cognition

We explored potential reasons for the high heterogeneity by completing separate meta‐analyses restricted to trials: 1) with people diagnosed with AD; 2) that ran the exercise programs for more than 12 weeks; 3) that offered exercise more than three times per week; 4) that offered exercise less than three times per week; 5) that included only aerobic exercise; and 6) that included only strength exercise. None of these meta‐analyses reduced the heterogeneity. However, when we removed the Venturelli 2011 trial ‐ since it was the only trial that included only participants with moderate to severe dementia ‐ the heterogeneity was reduced (Chi² value 18.49; I² value 68%), but the results of this meta‐analysis were not significant (P value 0.15; k = 7; n = 308; SMD 0.31, 95% CI ‐0.11 to 0.74) (Analysis 1.1; Figure 4b).

Although three additional trials also examined cognition (Holliman 2001; Steinberg 2009; Stevens 2006), they could not be included in the analyses as the necessary data were not reported, and the authors did not provide them upon request. However, their conclusions were mixed: Holliman 2001 and Steinberg 2009 showed no benefit of exercise on cognition, whereas participants in the exercise program in the Stevens 2006 study showed improvement in cognition.

Activities of daily living (ADLs) (six trials; 289 participants)

Six studies measured the effect of exercise on ADLs (Conradsson 2010; Francese 1997; Rolland 2007; Santana‐Sosa 2008; Venturelli 2011; Vreugdenhil 2012). We included end point measures of means, SDs, and number of participants in each group in the meta‐analysis. A random‐effects model was used as the Chi² value was 22.19 and the I² value was 77%, indicating considerable heterogeneity. The meta‐analysis yielded significant results (P value 0.03), that favoured the exercise program (k = 6; n = 289; SMD 0.68, 95% CI 0.08 to 1.27) (Analysis 2.1; Figure 5).

Figure 5

Forest plot of comparison 2: Physical activity vs usual care: Activities of daily living (ADLs)

We explored potential reasons for the high heterogeneity and completed the following meta‐analyses that included only trials: 1) with participants diagnosed with AD; and 2) that ran the exercise programs for more than 12 weeks; less than 12 weeks; more than three times per week; less than three times per week; a combination of aerobic and strength exercises; and removing the trial with only persons with moderate to severe dementia (Venturelli 2011). None of these meta‐analyses reduced the heterogeneity.

Challenging behaviours (four trials; 110 participants)

Holliman 2001, Rolland 2007, Stevens 2006, and Van de Winckel 2004 examined the effect of exercise on challenging behaviours.

Holliman 2001 did not provide the SDs when using the PGDRS behaviour scale, but did report that participants showed improved behaviour only during group sessions, and not outside the group. Stevens 2006 did not provide useable data but reported that the participants in the exercise program showed improvement in behaviour. Van de Winckel 2004 also did not provide useable data and reported no significant behavioural effects. The Rolland 2007 study revealed non‐significant results (P value 0.75; k = 1; n = 110; MD ‐0.60, 95% CI ‐4.22 to 3.02).

Depression (six studies; 341 participants)

Six studies examined the effect of exercise on depression (Conradsson 2010; Eggermont 2009b; Rolland 2007; Steinberg 2009; Vreugdenhil 2012; Williams 2008). The trial authors of Steinberg 2009 did not report the data needed for the analysis, or respond to requests for these data. Williams 2008 included two experimental groups: a supervised individual walking group and a comprehensive individual exercise group. We combined the data from the two experimental groups, resulting in a sample size of 33, a mean of 9.01 and an SD of 6.11. The meta‐analysis revealed a Chi² value of 1.87 and an I² value of 0%, indicating no heterogeneity. Consequently, a fixed‐effect model was utilized that revealed non‐significant results (P value 0.20; k = 5; n = 341; MD ‐0.14, 95% CI ‐0.36 to 0.07) (Analysis 3.1; Figure 6). As there was no heterogeneity, no subgroup or sensitivity analyses were needed.
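Combining two intervention arms into a single group, as was done for Williams 2008, can be carried out with the standard formulae for pooling group sizes, means, and SDs. The following Python sketch is an assumption about how such a combination might be done; the function name and the example scores are hypothetical and are not the Williams 2008 arm‐level data, which are not reported in this section.

```python
import math
import statistics


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Merge two groups' summary statistics into one (n, mean, SD)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Within-group variation plus the variation between the two group means.
    variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
                + (n1 * n2 / n) * (mean1 - mean2) ** 2) / (n - 1)
    return n, mean, math.sqrt(variance)


# Hypothetical raw scores for two arms, used only to check the formula.
arm_a = [1.0, 2.0, 3.0]
arm_b = [4.0, 6.0]
n, mean, sd = combine_groups(
    len(arm_a), statistics.mean(arm_a), statistics.stdev(arm_a),
    len(arm_b), statistics.mean(arm_b), statistics.stdev(arm_b),
)
# The combined SD should equal the SD of all scores taken together.
print(n, round(mean, 2), round(sd, 3), round(statistics.stdev(arm_a + arm_b), 3))
```

The combined SD is larger than a simple weighted average of the two SDs whenever the arm means differ, because the difference between the means contributes to the overall spread.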

Figure 6

Forest plot of comparison 3: Physical activity vs usual care: depression

Mortality

No trials reported on mortality in people with dementia.

Secondary outcomes

Caregiver burden (two trials; 40 participants)

Two trials examined caregiver burden (Steinberg 2009; Vreugdenhil 2012). As noted above, Steinberg 2009 did not report the data needed for the analysis and the trial authors did not respond to requests for this data. The community‐based exercise program in Vreugdenhil 2012 revealed a significant improvement in caregiver burden (P value 0.001). The fixed‐effect model revealed significant results (k = 1; n = 40; MD ‐15.30, 95% CI ‐24.73 to ‐5.87).

Caregiver quality of life

No studies reported on caregivers’ quality of life.

Caregiver mortality

No studies reported on caregivers’ mortality.

Use of healthcare services

No studies reported on use of healthcare services.

Adverse events (five trials)

Five studies addressed potential adverse events of exercise programs for people with dementia (Conradsson 2010; Rolland 2007; Santana‐Sosa 2008; Steinberg 2009; Venturelli 2011). None of these trials revealed any serious adverse events that could be attributed to the exercise intervention. One trial, Christofoletti 2008, indirectly addressed adverse events by stating there were no drop‐outs related to the treatment.

Discussion

Summary of main results

This review included 16 trials (17 articles) with a total of 937 participants. Most participants were older people with AD. The exercise programs varied greatly; the length of time that they ran ranged from two weeks to 12 months, and activities varied (e.g. hand movements, sitting, walking, and upper and lower limb exercises). The review suggests that exercise programs may have a significant impact on improving cognitive functioning and the ability to perform ADLs in people with dementia. There was substantial and considerable unexplained statistical heterogeneity observed in the cognitive and ADL analyses, respectively, which suggests the need for caution in interpreting these results. Indeed, when we removed the Venturelli 2011 trial from the cognitive analysis ‐ since it was the only trial that included only people with moderate to severe dementia ‐ the heterogeneity was reduced, and the results of this meta‐analysis were no longer significant. In addition, our findings reveal that the burden experienced by informal caregivers providing care in the home may be reduced if they supervise their family member with dementia during participation in an exercise program. This review found no significant effect of exercise on challenging behaviours or depression. Nevertheless, these are encouraging results, as dementia is a debilitating disease that results in progressive decline in cognition and ability to perform ADLs, as well as other symptoms. A slowing of both cognitive decline and the development of dependence in ADLs is critical for enhancing the quality of life for people with dementia, and will have an impact on the family caregivers’ ability to sustain their caregiving role.

Overall completeness and applicability of evidence

The number of included trials was sufficient to address the first three objectives relating to the effect of exercise on cognition, ADLs, and depression. However, only one trial was included in the analyses of the effect on challenging behaviours and caregiver burden, and no analyses were completed for the following outcomes: mortality in people with dementia, caregiver quality of life, caregiver mortality, and use of healthcare services. Although several additional included trials investigated cognition (three trials), challenging behaviours (three trials), depression (two trials), and caregiver burden (one trial), useable data for inclusion in the meta‐analyses were not provided by the authors. It is important to include means and SDs for end point measures, or change from baseline to final measurement scores in published reports, or the authors should be willing to provide these data on request. Clearly, additional research is needed that examines these important outcomes and provides the needed data for meta‐analysis.

Only two studies were based in the community (Steinberg 2009; Vreugdenhil 2012); all others were conducted in institutions. Most people with dementia are cared for at home, and most caregivers wish to keep the family member with dementia at home for as long as possible. Knowing how to support family caregivers and delay the symptoms of dementia will have profound benefits for all involved. In addition, enabling people with dementia to remain in their homes for longer will lead to decreased healthcare costs. Further community‐based trials are needed that examine the effect of exercise on multiple domains of the person with dementia and the impact on their family caregivers.

The participants within the trials were not homogeneous in terms of their diagnosis (e.g. AD, vascular dementia, mixed dementia, other) or severity of dementia (e.g. mild, moderate, severe). This was unfortunate, as dementia should not be viewed as a single disease entity, and there is preliminary evidence that exercise might affect the risk of these conditions differently (Rockwood 2007). Several observational studies have found that the preventive effects of exercise may be weaker for vascular dementia than for AD or dementia in general (Rockwood 2007). However, a more recent meta‐analysis (Aarsland 2010) revealed a significant association between exercise and a reduced risk of developing vascular dementia (odds ratio 0.62, 95% CI 0.42 to 0.92).
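As a rough guide to reading an odds ratio of this kind: ratio measures are conventionally analysed on the log scale, so a reported 95% CI should be roughly symmetric around the point estimate once logarithms are taken, and the association is statistically significant at the 5% level when the interval excludes 1. The sketch below applies that check to the figures quoted from Aarsland 2010. The function name is hypothetical and chosen only for illustration; it does not reproduce the original authors' analysis.

```python
import math


def summarize_ratio_ci(estimate: float, lower: float, upper: float) -> dict:
    """Summarize a ratio estimate (e.g. an odds ratio) with its 95% CI.

    Hypothetical helper for illustration only.
    """
    if min(estimate, lower, upper) <= 0:
        raise ValueError("ratio measures must be positive")
    # Midpoint of the interval on the log scale, transformed back to a ratio.
    log_scale_midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    return {
        "estimate": estimate,
        "log_scale_midpoint": log_scale_midpoint,
        # An interval lying entirely below or above 1 excludes "no association".
        "excludes_one": upper < 1 or lower > 1,
    }


# Figures quoted in the text: odds ratio 0.62, 95% CI 0.42 to 0.92.
summary = summarize_ratio_ci(0.62, 0.42, 0.92)
print(summary)
```

The back-transformed midpoint should sit close to the reported estimate; a small discrepancy is expected because the published bounds are rounded to two decimal places.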

Also, the exercise programs were not homogeneous in terms of the type (e.g. aerobic, strength, balance), duration (range: two weeks to 12 months), and frequency (range: two times per week to daily) of activities. Therefore, further subgroup analyses compared type, duration (less than 12 weeks versus longer than 12 weeks), and frequency (less than three times per week versus more than three times per week) of the exercise programs. However, because of the small number of trials included in the review, it was not possible to determine a dose‐response relationship between the type, duration, or frequency of exercise and the degree of protection from cognitive decline and other outcomes.

Quality of the evidence

Twelve additional trials were included in this updated review compared with the four included in the previous version of the review. As a result, the number of participants increased to 937 at baseline, of whom 798 (85.2%) completed the trials, compared with 280 at baseline and 208 (74%) completing the trials in our previous review. These are encouraging results. However, many of the authors of the trials did not report the random sequence generation and allocation concealment processes adequately. A computer‐generated randomization sequence managed by a third party is a rigorous approach to random allocation that also ensures allocation concealment. Several authors did not report or did not describe adequately the outcome data for each main outcome. Although blinding of the participants and individuals conducting the exercise programs was not possible, it was expected that outcome assessors would be blinded. A few authors failed to report on the blinding of outcome assessors. High attrition rates, an imbalance of attrition between groups, unknown reasons for attrition, and poor adherence to the exercise programs (or no description of adherence) were also potential sources of bias in several of the included trials. In addition, some trials with high attrition rates did not conduct ITT analysis (see Figure 2 and Figure 3).
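The completion percentages quoted above follow directly from the baseline and completer counts. A minimal sketch of that arithmetic is given below; the function name is an illustrative assumption, and only the four counts come from the text.

```python
def completion_rate(completed: int, enrolled: int) -> float:
    """Return the percentage of enrolled participants who completed a trial.

    Hypothetical helper for illustration; rounds to one decimal place.
    """
    if enrolled <= 0:
        raise ValueError("enrolled must be a positive count")
    if not 0 <= completed <= enrolled:
        raise ValueError("completed must lie between 0 and enrolled")
    return round(100 * completed / enrolled, 1)


# Counts reported in the text for the updated and the previous review.
updated_review = completion_rate(798, 937)
previous_review = completion_rate(208, 280)
print(f"Updated review:  {updated_review}% completed")
print(f"Previous review: {previous_review}% completed")
```

To one decimal place the previous review's figure is slightly above the whole-number 74% quoted in the text, which is a rounding difference only.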

Potential biases in the review process

This review was conducted as outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011); therefore, the introduction of bias during the review process was minimized. We are fairly confident that all relevant studies were identified, as the literature searches were conducted by Anna Noel‐Storr of the Cochrane Dementia and Cognitive Improvement Group and are updated at least every six months. However, not all of the included trials reported data that could be used in the meta‐analysis, and some authors did not respond to requests for these data. Thus, the results of four trials could not be included in our meta‐analyses. This was unfortunate, as the total number of trials that have examined the effectiveness of exercise programs for improving the symptoms of dementia is limited.

Agreements and disagreements with other studies or reviews

A recent systematic review that included 13 RCTs with 896 participants (Potter 2011) found non‐significant results for depression similar to those identified in this review; only one of the four trials identified by Potter et al that reported depression as an outcome found a positive effect. However, Thune‐Boyle 2012, who used a critical interpretive approach to synthesize the literature, concluded that exercise appears to be beneficial in reducing depressed mood. The Potter 2011 review reported on two trials that found an improvement in quality of life; our review did not include any trials that examined quality of life, and included only one trial that examined challenging behaviours. Thus, we concur with Thune‐Boyle 2012 that the evidence is weak or lacking for an effect of exercise on challenging behaviours, such as repetitive behaviours, and also with Potter 2011 that the evidence for an effect on depression and quality of life is limited. We agree with both sets of authors that further research is needed that examines depression, quality of life, and challenging behaviours.

Figure 1

Study flow diagram


Figure 2

Risk of bias graph: review authors' judgments about each risk of bias item presented as percentages across all included trials


Figure 3

Risk of bias summary: review authors' judgments about each risk of bias item for each included trial


Figure 4

Forest plot of comparison 1: Physical activity vs usual care: cognition


Figure 5

Forest plot of comparison 2: Physical activity vs usual care: Activities of daily living (ADLs)


Figure 6

Forest plot of comparison 3: Physical activity vs usual care: depression


Analysis 1.1

Comparison 1 Exercise vs usual care: cognition, Outcome 1 Cognition.


Analysis 2.1

Comparison 2 Exercise vs usual care: Activities of Daily Living (ADL), Outcome 1 Comparison of ADL.


Analysis 3.1

Comparison 3 Exercise vs usual care: depression, Outcome 1 Depression.


Comparison 1. Exercise vs usual care: cognition

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 Cognition	8		Std. Mean Difference (IV, Random, 95% CI)	Subtotals only
1.1 Cognition: all trials	8	329	Std. Mean Difference (IV, Random, 95% CI)	0.55 [0.02, 1.09]
1.2 Cognition: excluded moderate‐severe dementia	7	308	Std. Mean Difference (IV, Random, 95% CI)	0.31 [‐0.11, 0.74]
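
The change in significance between the two cognition rows can be read from the confidence intervals: the interval for all trials excludes zero, whereas the interval after excluding the trial of moderate to severe dementia does not. As a minimal sketch, assuming each interval is a symmetric normal‐approximation interval (an assumption; the exact random‐effects calculation performed in RevMan is not reproduced here), an approximate standard error, z statistic, and two‐sided P value can be recovered from the reported figures. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def approx_test_from_ci(estimate: float, lower: float, upper: float):
    """Approximate (SE, z, two-sided p) from a point estimate and its 95% CI.

    Assumes a symmetric normal-approximation interval; illustrative only.
    """
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")
    se = (upper - lower) / (2 * Z_95)
    z = estimate / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Standardized mean differences and 95% CIs taken from the table above.
all_trials = approx_test_from_ci(0.55, 0.02, 1.09)
without_moderate_severe = approx_test_from_ci(0.31, -0.11, 0.74)

for label, (se, z, p) in [
    ("All trials", all_trials),
    ("Excluding moderate-severe dementia", without_moderate_severe),
]:
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Because the published bounds are rounded to two decimal places, the recovered values are approximate and should be treated as a consistency check rather than as the review's reported statistics.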


Comparison 2. Exercise vs usual care: Activities of Daily Living (ADL)

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 Comparison of ADL	6	289	Std. Mean Difference (IV, Random, 95% CI)	0.68 [0.08, 1.27]
1.1 ADL: all trials	6	289	Std. Mean Difference (IV, Random, 95% CI)	0.68 [0.08, 1.27]


Comparison 3. Exercise vs usual care: depression

Outcome or subgroup title	No. of studies	No. of participants	Statistical method	Effect size
1 Depression	5	341	Std. Mean Difference (IV, Random, 95% CI)	0.14 [‐0.07, 0.36]

