
Psychological interventions to foster resilience in healthcare students


Background

Resilience can be defined as maintaining or regaining mental health during or after significant adversities such as a potentially traumatising event, challenging life circumstances, a critical life transition or physical illness. Healthcare students, such as medical, nursing, psychology and social work students, are exposed to various study‐ and work‐related stressors, the latter particularly during later phases of health professional education. They are at increased risk of developing symptoms of burnout or mental disorders. This population may benefit from resilience‐promoting training programmes.

Objectives

To assess the effects of interventions to foster resilience in healthcare students, that is, students in training for health professions delivering direct medical care (e.g. medical, nursing, midwifery or paramedic students), and those in training for allied health professions, as distinct from medical care (e.g. psychology, physical therapy or social work students).

Search methods

We searched CENTRAL, MEDLINE, Embase, 11 other databases and three trial registries from 1990 to June 2019. We checked reference lists and contacted researchers in the field. We updated this search in four key databases in June 2020, but we have not yet incorporated these results.

Selection criteria

Randomised controlled trials (RCTs) comparing any form of psychological intervention to foster resilience, hardiness or post‐traumatic growth versus no intervention, waiting list, usual care, and active or attention control, in adults (18 years and older) who are healthcare students. Primary outcomes were resilience, anxiety, depression, stress or stress perception, and well‐being or quality of life. Secondary outcomes were resilience factors.

Data collection and analysis

Two review authors independently selected studies, extracted data, assessed risks of bias, and rated the certainty of the evidence using the GRADE approach (at post‐test only).

Main results

We included 30 RCTs, of which 24 were set in high‐income countries and six in (upper‐ to lower‐) middle‐income countries. Twenty‐two studies focused solely on healthcare students (1315 participants; number randomised not specified for two studies), including both students in health professions delivering direct medical care and those in allied health professions, such as psychology and physical therapy. Half of the studies were conducted in a university or school setting with nursing/midwifery or medical students. Eight studies investigated mixed samples (1365 participants), with healthcare students and participants outside of a health professional study field.

Participants mainly included women (63.3% to 67.3% in mixed samples) from young adulthood (mean age range, if reported: 19.5 to 26.83 years; 19.35 to 38.14 years in mixed samples). Most studies investigated group interventions (17 studies), of high training intensity (11 studies; > 12 hours/sessions), delivered face‐to‐face (17 studies). Of the included studies, eight compared a resilience training based on mindfulness versus unspecific comparators (e.g. wait‐list).

The studies were funded by different sources (e.g. universities, foundations), or a combination of various sources (four studies). Seven studies did not specify a potential funder, and three studies received no funding support.

Risk of bias was high or unclear, with the main flaws in the performance, detection, attrition and reporting bias domains.

At post‐intervention, very‐low certainty evidence indicated that, compared to controls, healthcare students receiving resilience training may report higher levels of resilience (standardised mean difference (SMD) 0.43, 95% confidence interval (CI) 0.07 to 0.78; 9 studies, 561 participants), lower levels of anxiety (SMD −0.45, 95% CI −0.84 to −0.06; 7 studies, 362 participants), and lower levels of stress or stress perception (SMD −0.28, 95% CI −0.48 to −0.09; 7 studies, 420 participants). Effect sizes varied between small and moderate. There was little or no evidence of any effect of resilience training on depression (SMD −0.20, 95% CI −0.52 to 0.11; 6 studies, 332 participants; very‐low certainty evidence) or well‐being or quality of life (SMD 0.15, 95% CI −0.14 to 0.43; 4 studies, 251 participants; very‐low certainty evidence).
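To make the reported standardised mean differences concrete, the sketch below computes Cohen's d from two-group summary statistics, with an approximate large-sample 95% CI. The means, SDs and sample sizes are invented for illustration and do not come from any included study.

```python
import math

def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Cohen's d) using the pooled SD."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd
    # Large-sample standard error of d, then a Wald-type 95% CI
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return d, (d - 1.96 * se, d + 1.96 * se)

# Hypothetical post-test resilience scores: training group vs. wait-list
d, (lo, hi) = smd(mean_t=72.0, sd_t=10.0, n_t=30, mean_c=68.0, sd_c=10.0, n_c=30)
print(round(d, 2))  # 0.4: the training group scored 0.4 pooled SDs higher
```

On Cohen's rough benchmarks cited in this review, an SMD of 0.2 is small, 0.5 moderate and 0.8 large, so the pooled resilience effect of 0.43 falls between small and moderate; note that the CI in this tiny example crosses zero, illustrating why small trials yield imprecise estimates.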

Adverse effects were measured in four studies, but data were only reported for three of them. None of the three studies reported any adverse events occurring during the study (very‐low certainty evidence).

Authors' conclusions

For healthcare students, there is very‐low certainty evidence for the effect of resilience training on resilience, anxiety, and stress or stress perception at post‐intervention.

The heterogeneous interventions, the paucity of short‐, medium‐ or long‐term data, and the geographical distribution concentrated in high‐income countries limit the generalisability of the results. Conclusions should therefore be drawn cautiously. Since the findings suggest positive effects of resilience training for healthcare students with very‐low certainty evidence, high‐quality replications and improved study designs (e.g. a consensus on the definition of resilience, the assessment of individual stressor exposure, more attention controls, and longer follow‐up periods) are clearly needed.


Psychological interventions to foster resilience in healthcare students

Background
Healthcare students (e.g. medical, nursing, midwifery, paramedic, psychology, physical therapy, or social work students) have a high academic work load, are required to pass examinations and are exposed to human suffering. This can adversely affect their physical and mental health. Interventions to protect them against such stresses are known as resilience interventions. Previous systematic reviews suggest that resilience interventions can help students cope with stress and protect them against adverse consequences on their physical and mental health.

Review question
Do psychological interventions designed to foster resilience improve resilience, mental health, and other factors associated with resilience in healthcare students?

Search dates
The evidence is current to June 2019. The results of an updated search of four key databases in June 2020 have not yet been included in the review.

Study characteristics
We found 30 randomised controlled trials (studies in which participants are assigned to either an intervention or a control group by a procedure similar to tossing a coin). The studies evaluated a range of resilience interventions in participants aged on average between 19 and 38 years.

Healthcare students were the focus of 22 studies, with a total of 1315 participants (not specified for two studies). Eight studies included mixed samples (1365 participants) of healthcare students and non‐healthcare students.

Eight of the included studies compared a mindfulness‐based resilience intervention (i.e. an intervention fostering attention on the present moment, without judgements) versus unspecific comparators (e.g. wait‐list control receiving the training after a waiting period). Most interventions were performed in groups (17/30), with high training intensity of more than 12 hours or sessions (11/30), and were delivered face‐to‐face (i.e. with direct contact and face‐to‐face meetings between the intervention provider and the participants; 17/30).

The included studies were funded by different sources (e.g. universities, foundations), or a combination of various sources (four studies). Seven studies did not specify a potential funder, and three studies received no funding support.

Certainty of the evidence
A number of things reduce the certainty about whether resilience interventions are effective. These include limitations in the methods of the studies, different results across studies, the small number of participants in most studies, and the fact that the findings are limited to certain participants, interventions and comparators.

Key results
Resilience training for healthcare students may improve resilience, and may reduce symptoms of anxiety and stress immediately after the end of treatment. Resilience interventions do not appear to reduce depressive symptoms or to improve well‐being. However, the evidence from this review is limited and very uncertain. This means that we currently have very little confidence that resilience interventions make a difference to these outcomes and that further research is very likely to change the findings.

Very few studies reported on the short‐ and medium‐term impact of resilience interventions. Long‐term follow‐up assessments were not available for any outcome. Studies used a variety of different outcome measures and intervention designs, making it difficult to draw general conclusions from the findings. Potential adverse events were examined in only four studies; three of them found no undesired effects and one did not report its results. More research is needed, of high methodological quality and with improved study designs.

Authors' conclusions

Implications for practice

There is very uncertain evidence that resilience interventions are effective in improving resilience and in reducing self‐reported symptoms of anxiety, and stress or stress perception, at post‐test (small and moderate effect sizes).

The generalisability and applicability of the evidence is limited by the heterogeneous design and content of interventions (with a predominance of high‐intensity, face‐to‐face interventions delivered in a group setting), the scarcity of studies with short‐, medium‐ and long‐term follow‐up, the divergent efficacy measures used, for example, to measure resilience, and the limited geographical location (i.e. predominantly high‐income countries). We rated the certainty of the evidence in this review as being very low across all primary outcomes at post‐test. We therefore cannot draw strong conclusions about the effects of resilience interventions, as the true effect may be markedly different from the estimated effect.

We know little about the longer‐term effects of resilience training on most outcomes, because few studies included follow‐up assessments. Booster sessions were not conducted in any of the included studies.

The limited evidence that resilience training improves well‐being or quality of life (at post‐test) and several resilience factors might indicate the need to adapt the current intervention techniques and the protective factors trained.

The results of our review provide very uncertain evidence about whether resilience‐training programmes may be helpful in stabilising and improving the mental health of healthcare students as a group of students with high stressor exposure.

Implications for research

The findings of this review point to the need for further research of high methodological quality in order to determine the efficacy of resilience interventions in healthcare students.

A consensus on the definition of resilience and adequate outcome measures to be used consistently across the field would be important for future research. Following the growing consensus on resilience as a dynamic outcome (Bonanno 2015; Kalisch 2017), intervention studies might be guided by this definition and examine resilience as a primary outcome (Chmitorz 2018). Because none of the studies in healthcare students measured the participants' stressor exposure, it remains unclear whether healthcare students really benefit from resilience training by being better able to cope with stressors. Future studies should therefore measure resilience as a person’s mental health in relation to individual stressor load. Only if risk or stressor exposure is assessed (which differs from the subjective perception of stress) can researchers learn how an intervention changes resilience. In addition to the number of stressors, certain covariates should be assessed, such as the type of stressors (e.g. micro‐ versus macro stressors, psychological versus physiological stressors, acute versus chronic stressors) or the perceived severity of the stressors that occurred.

Study designs: there is a need for improved comparators, at least treatment as usual (TAU) or ideally active and attention control (Chmitorz 2018), to allow fair comparisons between resilience interventions and control. As already suggested (Chmitorz 2018), resilience‐training programmes could be implemented during or after the presence of a stressor. However, future studies should also use designs in which resilience training is provided prior to circumscribed stress situations (e.g. examinations; rotation of a healthcare student to a demanding hospital ward, such as an emergency department), in order to determine the resilience effects of the intervention, and to see whether the training does indeed improve resilience to the specific stress situation (Chmitorz 2018; Kalisch 2015).

In general, pre‐ and post‐assessments of the outcome indicators (e.g. for resilience) should be conducted, with future studies also addressing the lack of longer follow‐up periods and measuring stressor exposure before, throughout and after the intervention. It could also be interesting to investigate whether booster sessions might help maintain the effects of training over time.

Adequate sample sizes, determined by a priori power analyses, are urgently needed in this field to ensure sufficient statistical power.

Intervention studies might also benefit from more comprehensive baseline diagnostics of mental health (e.g. clinical interview) and better reporting of eligibility criteria for pre‐existing mental symptoms. This would allow more precise conclusions about whether resilience training reduces (clinically relevant) mental symptoms. Furthermore, the implications of the resilience concept would require a baseline mental health assessment. In order to investigate the effects of interventions on resilience (i.e. mental health in relation to stressor load) and to determine a specific 'resilience pattern or trajectory' under consideration, the status of psychological functioning as an outcome of interest at baseline is important. For example, when researchers are interested in testing the effects of an intervention in stressor‐exposed individuals on the resilience trajectory of sustained mental health (see also Description of the condition), they would have to demonstrate a positive mental health level at baseline and at post‐intervention. On the other hand, researchers considering a sample with elevated levels of mental symptoms at pre‐test (Harrer 2019) would be able to investigate the resilience trajectory of recovery or even of post‐traumatic growth (i.e. an increased level of functioning compared with that before stressor exposure).

Beyond RCTs, dismantling designs could be helpful in clarifying the efficacy of single components of resilience training.

In general, there is a need for a better reporting of intervention studies using international guidelines, such as the CONSORT statement (Schulz 2010). To guarantee higher transparency of study conduct and reporting, primary investigators should register trials or publish study protocols according to the SPIRIT guidelines (Standard Protocol Items: Recommendations for Interventional Trials; Chan 2013a; Chan 2013b).

Finally, future studies in this field should focus more on men. Research efforts should be intensified in low‐ and middle‐income countries in order to reach more robust conclusions about the effectiveness of training across various settings. More studies would be desirable with particular formats of intervention (e.g. online‐ and mobile‐based). Based on the varying relevance of resilience factors in different age groups (see long‐term cohort studies; Werner 1992; Werner 2001) and given that this review was limited to young adults (students), the participants' age and the protective factors trained might also have affected the findings. Future studies should therefore focus their efforts on the development and evaluation of resilience interventions that foster specific and validated age‐relevant factors in specific target groups.

In sum, there is still an urgent need for additional evidence to answer the question of which resilience interventions are really effective in healthcare students, and how they should be implemented. A larger number of RCTs in the field might then allow potential effect modifiers to be explored.

Summary of findings

Summary of findings 1. Resilience interventions versus control conditions for healthcare students


Patient or population: healthcare students, including students in training for health professions delivering direct medical care (e.g. medical students, nursing students), and allied health professions as distinct from medical care (e.g. psychology students, social work students); aged 18 years and older, irrespective of health status

Setting: any setting of health professional education (e.g. medical school, nursing school, psychology or social work department at university)

Intervention: any psychological intervention focused on fostering resilience or the related concepts of hardiness or post‐traumatic growth by strengthening well‐evidenced resilience factors that are thought to be modifiable by training (see Appendix 3), irrespective of content, duration, setting or delivery mode

Comparison: no intervention, wait‐list control, treatment as usual (TAU), active control, attention control

Outcomes

Anticipated absolute effects* (95% CI)

Relative effect
(95% CI)

№ of participants
(studies)

Certainty of the evidence
(GRADE)

Comments

Risk with control conditions

Risk with resilience interventions

Resilience
Measured by: investigators measured resilience using different instruments; higher scores mean higher resilience

Timing of outcome assessment: post‐intervention

The mean resilience score in the intervention groups was, on average, 0.43 standard deviations higher (0.07 higher to 0.78 higher)

561
(9 RCTs)

⊕⊝⊝⊝
Very lowa

SMD of 0.43 represents a moderate effect size (Cohen 1988b)b

Mental health and well‐being: anxiety
Measured by: investigators measured anxiety using different instruments; lower scores mean lower anxiety

Timing of outcome assessment: post‐intervention

The mean anxiety score in the intervention groups was, on average, 0.45 standard deviations lower (0.84 lower to 0.06 lower)

362
(7 RCTs)

⊕⊝⊝⊝
Very lowc

SMD of 0.45 represents a moderate effect size (Cohen 1988b)b

Mental health and well‐being: depression
Measured by: investigators measured depression using different instruments; lower scores mean lower depression

Timing of outcome assessment: post‐intervention

The mean depression score in the intervention groups was, on average, 0.20 standard deviations lower (0.52 lower to 0.11 higher)

332
(6 RCTs)

⊕⊝⊝⊝
Very lowd

SMD of 0.20 represents a small effect size (Cohen 1988b)b

Mental health and well‐being: stress or stress perception

Measured by: investigators measured stress or stress perception using different instruments; lower scores mean lower stress or stress perception

Timing of outcome assessment: post‐intervention

The mean stress or stress perception score in the intervention groups was, on average, 0.28 standard deviations lower (0.48 lower to 0.09 lower)

420
(7 RCTs)

⊕⊝⊝⊝
Very lowe

SMD of 0.28 represents a small effect size (Cohen 1988b)b

Mental health and well‐being: well‐being or quality of life

Measured by: investigators measured well‐being or quality of life using different instruments; higher scores mean higher well‐being or quality of life

Timing of outcome assessment: post‐intervention

The mean well‐being or quality of life score in the intervention groups was, on average, 0.15 standard deviations higher (0.14 lower to 0.43 higher)

251
(4 RCTs)

⊕⊝⊝⊝
Very lowf

SMD of 0.15 represents a small effect size (Cohen 1988b)b

Adverse events

There were no adverse events reported in association with study participation in 3 of 4 studies measuring potential adverse events.g

566

(3 RCTs)h

⊕⊝⊝⊝
Very lowi

*The risk in the intervention group (and its 95% CI) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; RCT: randomised controlled trial; SMD: standardised mean difference.

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect

aDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of performance, detection and attrition bias), by one level due to unexplained inconsistency (I2 = 75%), and by one level due to indirectness (studies limited to certain interventions (e.g. group setting, face‐to‐face delivery, moderate and high intensity, unspecified theoretical foundation) and comparators (no intervention, wait‐list)).

bAccording to Cohen 1988b, a standardised mean difference (SMD) of 0.2 represents a small difference (i.e. small effect size), 0.5 a moderate difference, and 0.8 a large difference.
cDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection and attrition bias, high risk of performance bias), by one level due to unexplained inconsistency (I2 = 66%), and by one level due to indirectness (studies limited to certain participants (medical students), interventions (e.g. group setting, moderate and high intensity) and comparators (no intervention, wait‐list)).
dDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection bias, high risk of performance and attrition bias), by one level due to unexplained inconsistency (I2 = 45%), by one level due to indirectness (studies limited to certain participants (medical students), interventions (e.g. group and individual setting, low and high intensity) and comparators (no intervention, wait‐list)), and by two levels due to imprecision (< 400 participants; 95% CI wide and inconsistent).
eDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection bias, high risk of performance, attrition and reporting bias), and by one level due to indirectness (studies limited to certain participants (medical and nursing students), interventions (group and individual setting, low and high intensity, mindfulness and unspecific theoretical foundation) and comparators (no intervention, wait‐list)).
fDowngraded by two levels due to study limitations (unclear risk of selection and detection bias, high and unclear risk of attrition bias, high risk of performance bias), by one level due to indirectness (studies limited to certain interventions (group setting, face‐to‐face and combined delivery, high intensity)), and by two levels due to imprecision (< 400 participants; 95% CI wide and inconsistent).

gKötter 2016 also assessed adverse events but did not report the respective data in the report.
hFor Galante 2018, subgroup data in healthcare students were not available; number of participants in total sample at post‐test (CORE‐OM data) was 482.
iDowngraded by two levels due to study limitations (unclear risk of selection and detection bias, unclear and high risk of attrition bias, high risk of performance and other bias (no systematic and validated assessment of adverse events)), and by one level due to indirectness (studies limited to certain interventions (individual setting, face‐to‐face, mindfulness based) and comparators (TAU)).
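Several of the footnotes above downgrade for unexplained inconsistency using the I² statistic. As a sketch of what those percentages mean, I² can be derived from Cochran's Q and the number of pooled studies; the Q value below is invented for illustration and only the resulting I² matches the footnote for the resilience outcome.

```python
def i_squared(q, k):
    """I²: percentage of variability in effect estimates attributable to
    between-study heterogeneity rather than chance, from Cochran's Q over
    k studies (degrees of freedom = k - 1)."""
    df = k - 1
    if q <= df:
        return 0.0  # no heterogeneity beyond what sampling error explains
    return 100.0 * (q - df) / q

# A hypothetical Q of 32 across 9 studies gives I² = 75%, the level of
# inconsistency cited for the pooled resilience outcome
print(i_squared(32.0, 9))  # 75.0
```

Values above roughly 50% to 75% are conventionally read as substantial to considerable heterogeneity, which is why such outcomes were downgraded.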

Background

For a description of abbreviations used in this review, please see Appendix 1.

Description of the condition

Since the introduction of Antonovsky’s salutogenesis as a basis for health promotion (Antonovsky 1979), and the Ottawa Charter for Health Promotion (WHO 1986), the concept of resilience has stimulated extensive research. Resilience describes the phenomenon whereby an individual does not experience mental health problems, or experiences them only temporarily, despite being subjected to psychological or physical stressors of short (acute) or long (chronic) duration (Kalisch 2015; Kalisch 2017). By definition, resilience always presupposes exposure to substantial risk or adversity (Earvolino‐Ramirez 2007; Jackson 2007; Luthar 2000; Masten 2001).

Stressor exposure in healthcare students and its consequences

Healthcare students are exposed to a large number of academic, clinical and psychosocial stressors. Substantial academic stressors include, for example, excessive academic workload (e.g. long hours of study, volume of information, difficult academic work), difficulties with studying and time management, competition with peers, examinations (e.g. high frequency), and fear of failing (Edwards 2010; Gazzaz 2018; Hill 2018). Further categories of stressor exposure may include social stressors such as conflicts with work‐life balance and relationship management, financial concerns, or uncertainty about the future (Chang 2012; Gazzaz 2018; Santen 2010). In addition to typical life changes during the transition from training (e.g. nursing or medical school) to (clinical) practice, healthcare students also have to adapt to challenges that are specific to their chosen field of work. Due to patient contact in later phases of training, they are exposed to patient‐related stressors such as exposure to human suffering and death (Hill 2018). Furthermore, clinical stressors identified among students and trainees in the healthcare sector include, for example, lack of practical skills, a theory‐to‐practice gap, tense atmosphere among clinical staff and negative attitudes of healthcare professionals, being criticised in front of staff and patients, or hospital ward rotations (Dyrbye 2009; Edwards 2010; Evans 2004; Hill 2018).

Chronic stressor exposure during health professional education has the potential to impact on the students' physical and mental health; for example, medical and nursing students have reported debilitating sleep disorders (Azad 2015; Belingheri 2020). Health professional education is perceived as stressful by many students, who report increased levels of perceived stress (Edwards 2010; Fares 2016; Foster 2018; Heinen 2017; Jacob 2013; Wilks 2010). Healthcare students, especially medical students, are at increased risk of developing symptoms of burnout, such as high emotional exhaustion (Cecil 2014; Dyrbye 2009; Dyrbye 2016; Fares 2016; Santen 2010), and stress‐related mental disorders such as depression (Bunevicius 2008; Compton 2008; Dyrbye 2006; Mao 2019; Tung 2018) and anxiety (Bunevicius 2008; Dyrbye 2006; Mao 2019). The experience of stressors and the resulting health impact may negatively affect students' academic (e.g. grades) and clinical performance (e.g. decline in empathy) (Gazzaz 2018; Kötter 2017; Neumann 2011; Yamada 2014; Ye 2018), and, as some studies demonstrate (Dyrbye 2011), may also contribute to the high attrition rates found among healthcare students (Hamshire 2019) and new graduates (Pine 2007).

Overall, based on these findings, the concept of resilience has become increasingly important for health professional education in recent years (Eley 2014; Hodges 2008; McAllister 2009; Pines 2012; Sanderson 2017; Stephens 2013; Tempski 2012; Thomas 2016; Waddell 2015; Wright 2019).

Definition of resilience

Three different approaches have been discussed in the definition of resilience (Hu 2015; Kalisch 2015). Trait resilience refers to resilience defined as personal resources or static, positive personality characteristics that enhance individual adaptation (Block 1996; Nowack 1989; Wagnild 1993). This approach has largely been superseded by a view of resilience as an outcome rather than a static personality trait (Kalisch 2015; Mancini 2009), i.e. mental health despite significant stress or trauma. According to this outcome‐oriented definition, the positive outcome of resilience is partially determined by several resilience factors (Kalisch 2015). To date, a large range of genetic, psychological, social and environmental factors, which often overlap and may interact, have been discussed in resilience research (Bengel 2012; Bonanno 2013; Carver 2010; Connor 2006; Earvolino‐Ramirez 2007; Feder 2011; Forgeard 2012; Haglund 2007; Iacoviello 2014; Kuiper 2012; Mancini 2009; Michael 2003; Ozbay 2007; Rutten 2013; Sapienza 2011; Sarkar 2014; Southwick 2005; Southwick 2012; Stewart 2011; Wu 2013; Zauszniewski 2010). Psychosocial resilience factors that are well‐evidenced according to the current state of knowledge and thought to be modifiable include: meaning or purpose in life, a sense of coherence, positive emotions, hardiness, self‐esteem, active coping, self‐efficacy, optimism, social support, cognitive flexibility (including positive reappraisal and acceptance), and religiosity or spirituality or religious coping (see Appendix 2: level 1). Most recently, resilience has been conceptualised as a multidimensional and dynamic process (Johnston 2015; Kalisch 2015; Kent 2014; Mancini 2009; Norris 2009; Rutten 2013; Sapienza 2011; Southwick 2012). This resilient process is characterised either by a trajectory of undisturbed mental health during or after adversities, or by temporary dysfunction followed by successful recovery (Kalisch 2015).
In general, resilience is viewed as the outcome of an interaction between the individual and his or her environment (Cicchetti 2012; Rutten 2013), which may be influenced through personal (e.g. optimism) as well as environmental resources (e.g. social support) (Haglund 2007; Iacoviello 2014; Kalisch 2015; Southwick 2005; Wu 2013). As such, resilience is modifiable and can be improved by interventions (Bengel 2012; Connor 2006; Southwick 2011).

Interventions to foster resilience

Interventions to foster resilience have been developed for and conducted in a variety of clinical and non‐clinical populations using various formats, such as multimedia programmes or face‐to‐face settings, and have been delivered in a group or individual context (see Bengel 2012 and Southwick 2011 for an overview). To date, several resilience‐training programmes that focus specifically on fostering resilience in healthcare students have been tested (Anderson 2017; Peng 2014). However, the empirical evidence for the efficacy of these interventions is still unclear and requires further research.

Description of the intervention

There is currently little consensus about when to consider a programme as ‘resilience training’, or what components are needed for effective programmes (Leppin 2014). The diversity across resilience‐training programmes in their theoretical assumptions, operationalisation of the construct, and inclusion of core components reflect the current state of knowledge (Joyce 2018; Leppin 2014; Macedo 2014; Robertson 2015; Vanhove 2016), with leading guidelines still under discussion (compare Kalisch 2015; Robertson 2015).

Most training programmes, whether individual or group‐based, are implemented face‐to‐face. Alternative formats include online interventions or combinations of different formats. Resilience‐training programmes often use methods such as discussions, role plays, practical exercises and homework to reinforce training content. They usually contain a psycho‐educative element to provide information on the concept of resilience, or specific training elements (e.g. cognitive restructuring).

In general, resilience interventions are based on different psychotherapeutic approaches: cognitive‐behavioural therapy (CBT; Abbott 2009); acceptance and commitment therapy (ACT; Ryan 2014); mindfulness‐based therapy (Geschwind 2011); attention and interpretation therapy (AIT; Sood 2014); problem‐solving therapy (Bekki 2013), as well as stress inoculation (Farchi 2010). A number of training programmes focus on fostering single or multiple psychosocial resilience factors (Kanekar 2010), without being assignable to a certain approach. Few interventions base their work on a defined resilience model (Schachman 2004; Steinhardt 2008).

How the intervention might work

Depending on the underlying resilience concept, resilience interventions target different resources and skills. The theoretical foundations of training programmes and the hypotheses on how they might maintain or regain mental health are as diverse as their content. Currently, no empirically‐validated theoretical framework exists that outlines the mode of action of resilience interventions (Bengel 2012; Leppin 2014).

As resilience as an outcome is determined by several potentially modifiable resilience factors (see Description of the condition), resilience interventions might work by strengthening these factors (see Appendix 3 for examples of possible training methods). However, depending on the underlying theoretical foundation, there are different theories of change on how certain factors and hence resilience might be affected.

From a cognitive‐behavioural perspective, stress‐related mental dysfunctions (e.g. depression) are considered to be the result of dysfunctional thinking (Beck 2011; Benjamin 2011). When confronted with adversity, people show maladaptive behavioural responses or experience negative mood states, or both, due to irrational cognition (Beck 1976; Ellis 1975). This is in line with other stress and resilience theories, which assume that it is not the stressor itself, but its cognitive appraisal that may lead to stress reactions (Kalisch 2015; Lazarus 1987). Modifying cognitive processes into more adaptive patterns of thought will therefore probably produce more adaptive responses to stress (Beck 1964). By challenging an individual’s maladaptive thoughts, and by teaching coping strategies, CBT‐based resilience interventions might be beneficial in promoting the resilience factors of cognitive flexibility and active coping.

As one form of CBT, stress inoculation therapy is based on the assumption that exposing individuals to milder forms of stress can strengthen coping strategies and the individual’s confidence in using his or her coping repertoire (Meichenbaum 2007). Resilience‐training programmes grounded in stress inoculation therapy might therefore foster resilience by enhancing factors such as self‐efficacy.

Problem‐solving therapy is closely related to CBT and based on problem‐solving theory. According to the problem‐solving model of stress and adaptation, effective problem‐solving can attenuate the negative effects of stress and adversity on well‐being by moderating or mediating (or both) the effects of stressors on emotional distress (Nezu 2013). Resilience interventions based on problem‐solving that enhance an individual’s positive problem orientation and planful problem‐solving might foster participants’ psychological adaptation to stress by increasing the resilience factor of active coping.

According to ACT (Hayes 2004; Hayes 2006), psychopathology is primarily the consequence of psychological inflexibility (Hayes 2006), which is also relevant when an individual is confronted with stressors. By teaching acceptance and mindfulness skills on the one hand (e.g. being in contact with the present moment), and commitment and behaviour‐change skills on the other (e.g. values, committed action), several resilience factors might be fostered in ACT‐based resilience interventions (e.g. cognitive flexibility, purpose in life). In particular, the acceptance of a full range of emotions taught in ACT might result in a better adjustment to stressful conditions.

In mindfulness‐based therapy (e.g. mindfulness‐based stress reduction (MBSR; Stahl 2010); AIT (Sood 2010)), mindfulness is characterised by the nonjudging awareness of the present moment and its accompanying mental phenomena (i.e. body sensations, thoughts and emotions). Since practitioners learn to accept whatever occurs in the present moment, they are thought to adapt more efficiently to stress (Grossman 2004; Shapiro 2005). As being more aware of the 'here and now' may enhance the sensitivity to positive aspects in life, mindfulness‐based resilience interventions might also help participants to gain a brighter outlook for the future (i.e. optimism) or to experience positive emotions more regularly. Teaching mindfulness might also increase participants’ cognitive flexibility by learning to accept negative situations and emotions.

Independently of the underlying theory, resilience training might work differently depending on the respective 'delivery format' and 'intervention setting' (Robertson 2015; Vanhove 2016). For example, interventions implemented face‐to‐face could work better than online formats in increasing resilience, due to the more direct contact between trainers and participants (Vanhove 2016), which might also increase compliance. Resilience training in an individual setting could be more effective than group‐based interventions, as trainers might be better able to attend to participants’ individual needs and provide feedback more easily (Vanhove 2016). On the other hand, group‐based interventions could enhance participants’ social resources. No previous review has examined the role of training duration on effect sizes of resilience interventions. High‐intensity resilience interventions that include weekly sessions over several weeks (e.g. combined with homework assignments or daily practice) could be more effective than low‐intensity training (e.g. a single session), as participants have more opportunity to apply the taught skills in daily life. Joyce 2018, who examined the role of the theoretical foundation of resilience interventions for the first time, found positive effect sizes on resilience for CBT‐based, mindfulness‐based and mixed interventions (i.e. CBT and mindfulness) compared to control. However, differences in the effects of resilience training based on other theoretical foundations have not yet been examined.

Why it is important to do this review

A large number of systematic reviews and meta‐analyses have investigated various forms of interventions to foster healthcare students' mental health, such as stress management, mentoring programmes, emotional intelligence interventions and mindfulness‐based training to reduce or prevent burnout, and crisis‐focused programmes (see Appendix 4). Although some of these reviews also identified interventions to foster resilience (e.g. Griffiths 2019), the primary review question did not specifically refer to such programmes.

A considerable number of systematic reviews and meta‐analyses of interventions to foster resilience (see Appendix 4) have synthesised the efficacy of resilience‐training programmes in clinical and non‐clinical adult populations (Bauer 2018; Joyce 2018; Leppin 2014; Macedo 2014; Massey 2019; Milne 2016; Pallavicini 2016; Pesantes 2015; Petriwskyj 2016; Reyes 2018; Robertson 2015; Skeffington 2013; Townshend 2016; Vanhove 2016; Van Kessel 2014; Wainwright 2019), or at least have searched for 'resilience' and related constructs (Deady 2017; Tams 2016). In a recent Cochrane Review, our group synthesised the evidence on the efficacy of resilience training in healthcare professionals (Kunzler 2020). There are so far only four relevant meta‐analyses (Joyce 2018; Kunzler 2020; Leppin 2014; Vanhove 2016). Previous reviews agree in their conclusion that resilience interventions can generally improve resilience, mental health and (job) performance. Nevertheless, there are some methodological and quality differences between the reviews, which complicate statements about the efficacy of resilience training or result in a variety of effect sizes. These include, for example, heterogeneous eligibility criteria and definitions of resilience training, rather simple and limited search strategies, the lack of a review protocol or PROSPERO registration for most reviews, and different guidelines for the conduct and reporting of the review.

Four systematic reviews of healthcare students (see Appendix 4) have synthesised evidence on the efficacy of resilience‐training programmes in this target group (Gilmartin 2017; McGowan 2016; Rogers 2016; Sanderson 2017), with Sanderson 2017 not focusing only on resilience interventions. One other review (Pezaro 2017) and a meta‐analysis (Lo 2018) also searched for 'resilience'. The six publications either investigated healthcare students such as medical students (Lo 2018) or combinations of healthcare students and healthcare professionals (i.e. with completed training) (Gilmartin 2017). Overall, they found mixed results for the efficacy of resilience‐training programmes. On the one hand, they identified some benefits to healthcare students, for example, in improving resilience or mental health outcomes (e.g. Gilmartin 2017; Pezaro 2017; Rogers 2016). On the other hand, as pointed out by some authors (e.g. McGowan 2016), the reviews' conclusions have been restricted by current limitations of resilience intervention research (e.g. heterogeneous definitions of resilience, and the low methodological rigour of studies). Comparable with reviews in other populations, the publications also suffer from methodological weaknesses, which limit the robustness of their findings (see Appendix 4). Most importantly, the number of RCTs included in previous reviews is rather limited (0 to 24 RCTs among 5 to 36 studies included in the six reviews), and the search period covered by the reviews is up to January 2017 (Gilmartin 2017), thus precluding any conclusions about the efficacy of resilience interventions in healthcare students that have been developed since then.

In our review, which seeks to address the methodological weaknesses of previous reviews, we were also particularly interested in psychological resilience interventions offered to this target group. The interventions had to be scientifically founded, i.e. they had to address one or more of the resilience factors stated above that are known to be associated with resilience in adults according to the state of current research (see Appendix 2: levels 1a to 1c). They also had to state the intention of promoting resilience or a related construct (hardiness, post‐traumatic growth). Lastly, the trained population had to fulfil the condition of potential stress or trauma exposure (a defining precondition of the resilience concept), i.e. being healthcare students (see Description of the condition), in order to clearly distinguish genuine resilience interventions from other interventions focused on fostering associated constructs such as mental health (Windle 2011a).

Resilience as a preventive concept is highly topical, and there is increasing interest worldwide in promoting mental health and preventing disease (WHO 1986; WHO 2004). Due to chronic stressor exposure and its potentially negative consequences for students’ health (see Description of the condition), healthcare students are viewed as an important target group for resilience interventions (McAllister 2009). This review therefore aims to provide further and more detailed evidence about which interventions are most likely to foster resilience and to prevent stress‐related mental health problems in healthcare students. The evidence base for this review might contribute to improving existing interventions and to facilitating the future development of training programmes. In this way, researchers, practitioners and policymakers could benefit from our work.

Objectives

To assess the effects of interventions to foster resilience in healthcare students, that is, students in training for health professions delivering direct medical care (e.g. medical, nursing, midwifery or paramedic students), and those in training for allied health professions, as distinct from medical care (e.g. psychology, physical therapy or social work students; see Differences between protocol and review).

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs), including cluster‐RCTs.

Types of participants

Adults aged 18 years and older, who are healthcare students, i.e. students in training for health professions delivering direct medical care (e.g. medical, nursing, midwifery or paramedic students) and those in training for allied health professions, as distinct from medical care (e.g. psychology, physical therapy, social work, counselling, occupational therapy, speech therapy, medical assistant or medical technician students).

Participants were included irrespective of health status.

At the time of the intervention, individuals had to be exposed to potential risk or stressors, which was ensured by focusing on healthcare students in this review (see Description of the condition; see Differences between protocol and review).

We included studies involving mixed samples (e.g. healthcare and non‐healthcare students) in the review. We also considered these studies in meta‐analyses (see Data synthesis) provided the data for healthcare students were reported separately or could be obtained by contacting the study authors.

Types of interventions

Any psychological resilience intervention, irrespective of content, duration, setting or delivery mode.

For the purpose of this review, we define psychological resilience interventions as follows: interventions focused on fostering resilience or the related concepts of hardiness or post‐traumatic growth, by strengthening well‐evidenced resilience factors that are thought to be modifiable by training (see above and Appendix 2; level 1). In order to use highly‐objective inclusion criteria, we considered only interventions that explicitly defined the objective of fostering resilience, hardiness, or post‐traumatic growth by using one or more of these terms in the publication (see Differences between protocol and review). We did not include studies that examined the efficacy of disorder‐specific psychotherapy (e.g. CBT for depression).

We considered the following comparators in this review: no intervention, wait‐list control, treatment as usual (TAU), active control, and attention control. We used the term ‘attention control’ for alternative treatments that mimicked the amount of time and attention received (e.g. by the trainer) in the treatment group. We also considered active controls to involve an alternative treatment (no TAU; for example, treatment developed specifically for the study), but that did not control for the amount of time and attention in the intervention group and was not attention control in a narrow sense.

Types of outcome measures

Due to the different ways in which resilience has been operationalised in previous research, resilience as an intervention outcome could not always be guaranteed in studies. We therefore also defined assessments of psychological adaptation (e.g. mental health) as primary outcomes.

Secondary outcomes included a range of psychological factors associated with resilience, according to the current state of knowledge, and were selected based on conceptual clarity and measurability (levels 1a and 1b; see Appendix 2).

Measures for the assessment of psychological resilience and psychological adaptation, as well as resilience factors, are specified on the basis of previous reviews on resilience interventions (Leppin 2014; Macedo 2014; Robertson 2015; Vanhove 2016) and reviews on resilience measurements (Pangallo 2015; Windle 2011b); see Helmreich 2017 and Appendix 5, Appendix 6, Appendix 7 in this review, respectively.

We considered self‐rated and observer‐ or clinician‐rated measures, as well as study outcomes at all time points. The lack of reporting of the primary or secondary outcomes described above was not an exclusion criterion for this review.

Primary outcomes

  • Resilience*, measured by improvements in specific resilience scales (Bengel 2012; Earvolino‐Ramirez 2007; Pangallo 2015; Windle 2011b), such as the Resilience Scale for Adults (Friborg 2003).

  • Mental health and well‐being, subsumed into the categories below, and measured by improvements in the respective assessment scales, such as the Depression Anxiety and Stress Scale (DASS‐21; Lovibond 1995). See Appendix 6 for further examples.

    • Anxiety*

    • Depression*

    • Stress or stress perception*

    • Well‐being or quality of life* (e.g. well‐being, life satisfaction, (health‐related) quality of life, vitality, vigour)

  • Adverse events*

Secondary outcomes

    • Social support

    • Optimism

    • Self‐efficacy

    • Active coping

    • Self‐esteem

    • Hardiness (although hardiness is often used as a synonym for resilience in the literature, we conceptualised it as a resilience factor in this review. See Appendix 2.)

    • Positive emotions

We extracted and reported data on secondary outcomes whenever they were assessed. If possible, we calculated and reported effect sizes.

Where data were available, we used outcomes marked by an asterisk (*) to generate the ‘Summary of findings’ table. If there was insufficient information, we provided a narrative description of the evidence.
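The review reports effect sizes for continuous outcomes where possible but does not specify the estimator at this point. As one illustration only (the function name and inputs are ours, not the review's), a standardised mean difference with the small‐sample correction (Hedges' g), a common choice for continuous outcomes in Cochrane reviews, could be computed as follows:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference (Hedges' g) between two groups.

    m1/sd1/n1: mean, standard deviation and sample size of the
    intervention group; m2/sd2/n2: the same for the control group.
    """
    # Pooled standard deviation across both groups
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled  # Cohen's d
    # Small-sample correction factor (approximate J)
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * j
```

For example, two groups of 50 students with means 12 and 10 on the same scale and a common standard deviation of 4 would give g of roughly 0.50, i.e. a medium effect by conventional benchmarks.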

Search methods for identification of studies

We ran the first searches for this review in October 2016, based on the MEDLINE search strategy in the protocol (Helmreich 2017) before changing the inclusion criteria of the review to focus on healthcare students (see Differences between protocol and review). For the top‐up searches in June 2019, we added a new section to the original search strategy using search terms to limit the search to healthcare sector workers and students.

Electronic searches

We searched the electronic sources listed below.

  • Cochrane Central Register of Controlled Trials (CENTRAL; 2019, Issue 6) in the Cochrane Library, which includes the Cochrane Developmental, Psychosocial and Learning Problems Specialised Register (searched 26 June 2019).

  • MEDLINE Ovid (1946 to 21 June 2019).

  • Embase Ovid (1974 to 2019 Week 25).

  • PsycINFO Ovid (1806 to June Week 3 2019).

  • CINAHL EBSCOhost (Cumulative Index to Nursing and Allied Health Literature; 1981 to 24 June 2019).

  • PSYNDEX EBSCOhost (1977 to 24 June 2019).

  • Web of Science Core Collection Clarivate (Science Citation Index; Social Science Citation Index; Conference Proceedings Citation Index ‐ Science; Conference Proceedings Citation Index ‐ Social Science & Humanities; 1970 to 26 June 2019).

  • International Bibliography of the Social Sciences ProQuest (IBSS; 1951 to 25 June 2019).

  • Applied Social Sciences Index & Abstracts ProQuest (ASSIA; 1987 to 24 June 2019).

  • ProQuest Dissertations & Theses (PQDT; 1743 to 24 June 2019).

  • Cochrane Database of Systematic Reviews (CDSR; 2019, Issue 6), part of the Cochrane Library (searched 26 June 2019).

  • Database of Abstracts of Reviews of Effects (DARE; 2015, Issue 4) part of the Cochrane Library (final issue; searched 27 October 2016).

  • Epistemonikos (epistemonikos.org; all available years, searched 24 June 2019).

  • ERIC EBSCOhost (Education Resources Information Center; 1966 to 26 June 2019).

  • Current Controlled Trials, now the ISRCTN registry (www.isrctn.com; 1 January 1990 to 24 June 2019).

  • ClinicalTrials.gov (clinicaltrials.gov; 1 January 1990 to 24 June 2019).

  • World Health Organization International Clinical Trials Registry Platform (WHO ICTRP; who.int/trialsearch; 1 January 1990 to 24 June 2019).

We report the search strategies for each database in Appendix 8 (up to 2016) and for the revised inclusion criteria, Appendix 9 (2016 onwards). We used the Cochrane Highly Sensitive Search Strategy to identify RCTs in MEDLINE (Lefebvre 2019). We adapted the search terms and syntax for other databases. The searches were not restricted by language, publication status or publication format. We limited our search to the period January 1990 onwards, to account for the fact that the concept of resilience and its operationalisation have developed significantly over the past decades (Fletcher 2013; Hu 2015; Kalisch 2015; Pangallo 2015). Because of the lack of homogeneity for the period 1990 to 2014 (Robertson 2015), it is likely that using a broader time frame would have made it even more difficult to detect resilience‐training studies with similar resilience concepts and assessments. Moreover, it appeared plausible to concentrate on the period 1990 to the present, since the idea of resilience as an outcome and a modifiable process has only emerged in recent years, and paved the way for the development of resilience‐promoting interventions (Bengel 2009; Southwick 2011). The idea of fostering resilience by specific training is therefore relatively new (Leppin 2014), as can also be seen in the review by Macedo 2014, which searched all years up to 2013 for studies on resilience interventions but found only RCTs published after 1990.

As resilience‐training programmes should be adapted to scientific findings on a regular basis, and with the current research focusing on the detection of general resilience mechanisms (Kalisch 2015; Luthar 2000), the last five years seemed especially important in synthesising the evidence on newly‐developed resilience training.

We performed a further scoping search of four key databases (CENTRAL, CINAHL EBSCOhost, PsycINFO Ovid, ClinicalTrials.gov) in June 2020 prior to the publication of this review. The results are awaiting classification and will be incorporated into the review at the next update.

Searching other resources

In addition to the electronic searches, we inspected the reference lists of all included RCTs and relevant reviews, and contacted researchers in the field as well as the authors of selected studies, to check if there were any unpublished or ongoing studies. If data were missing or unclear, we contacted the study authors.

Data collection and analysis

In the sections that follow, we report only the methods we used in this review. Preplanned but unused methods are reported in Table 1.

Table 1. Unused methods table

Method

Approach planned for analysis

Reason for non‐use

Measures of treatment effect

Dichotomous data
We had planned to analyse dichotomous outcomes by calculating the risk ratio (RR) of a successful outcome (i.e. improvement in relevant variables) for each trial. We had intended to express uncertainty in each result using 95% confidence intervals (CIs).

No study provided relevant dichotomous data for any of the primary or secondary outcomes included in this review.

Unit of analysis issues

Cluster‐randomised trials
In cluster‐randomised trials, if the clustering is ignored and the unit of analysis is different from the unit of allocation (‘unit‐of‐analysis error’) (Whiting‐O'Keefe 1984), P values may be artificially small and may result in false‐positive conclusions (Higgins 2019c). Had we encountered such cases, we would have accounted for the clustering in the data and followed the recommendations given in the literature (Higgins 2019c; White 2005). For those cluster‐randomised trials that did not report correct standard errors, we would have tried to recover correct standard errors by applying the usual formula for the variance inflation factor 1 + (M ‐ 1) ICC, where M is the average cluster size and ICC the intra‐cluster correlation coefficient (Higgins 2019c). If it had not been possible to extract ICC values from the study, we would have used the ICC of all cluster‐randomised trials in our review that investigated the same primary outcome scale in a similar setting. If this was not available, we would have used the average ICC of all other cluster‐randomised trials in our review. If no such studies were available, we would have used ICC = 0.05 as a mildly conservative guess for the primary analysis, and conducted a sensitivity analysis using ICC = 0.10. We had also planned to conduct sensitivity analyses based on the unit of randomisation as well as the ICC estimate in cluster‐randomised trials (see Sensitivity analysis).

No cluster‐RCT was identified and included in this review.

Multiple treatment groups

Had multiple groups in a study been relevant, we would have accounted for the correlation between the effect sizes from multi‐arm studies in a pair‐wise meta‐analysis (Higgins 2019c). We would have treated each comparison between a control group and a treatment group as an independent study. We would have multiplied the standard errors of the effect estimates by an adjustment factor to account for correlation between effect estimates. In so doing, we would have acknowledged heterogeneity between different treatment groups.

For studies with multiple treatment groups, we considered only one intervention group to be relevant for the review and meta‐analyses, based on the independent judgement of two review authors. Thus, in a pair‐wise meta‐analysis, we did not have to account for the correlation between the effect sizes for multi‐arm studies.

[…] If there is an adequate evidence base, we will consider performing a network meta‐analysis (see Data synthesis).

The evidence base was insufficient to conduct a network meta‐analysis.

Dealing with missing data

If standard deviations could neither be recovered from reported results nor obtained from the authors, we would have considered single imputation by means of pooled within‐treatment standard deviations from all other studies, provided that fewer than five studies had missing standard deviations. If five or more studies had missing standard deviations, we would have performed multiple imputation on the basis of the hierarchical model fitted to the non‐missing standard deviations.

We found no studies using the same scale that had missing standard deviations. Missing standard deviations could always be recovered from alternative statistical values or be obtained from the study authors.

Data synthesis

Had a trial reported more than one resilience scale, we would have used the scale with better psychometric qualities (as specified in Appendix 3 in Helmreich 2017) to calculate effect sizes.

All studies measuring resilience only used one resilience scale.

If a study provided data from two instruments used equally in the included RCTs, two review authors (AK, IH) would have identified the appropriate measure through discussion (compare Storebø 2020).

This did not occur in this review.

Network meta‐analyses (NMAs) would have been merely exploratory and would only have been conducted if the review results had a sufficient and adequate evidence base.

Network meta‐analyses offer the possibility of comparing multiple treatments simultaneously (Caldwell 2005). They combine both direct (head‐to‐head) and indirect evidence (Caldwell 2005; Mills 2012), by using direct comparisons of interventions within RCTs, as well as indirect comparisons across trials, on the basis of a common reference group (e.g. an identical control group) (Li 2011). A network meta‐analysis on resilience‐training programmes does not exist.

According to Mills 2012, Linde 2016 and the Cochrane Handbook (Chaimani 2019), there are three important conditions for the conduct of NMAs: transitivity, homogeneity, consistency. Had an NMA been possible, i.e. if the three conditions had been fulfilled, we would have conducted an analysis ‐ with expert statistical support as suggested by Cochrane (Chaimani 2019) ‐ using a frequentist approach in R (Rücker 2020; Viechtbauer 2010). For sensitivity analyses, we had planned to fit the same models using the restricted maximum likelihood method (Piepho 2012; Piepho 2014; Rücker 2020). We had intended to consider categorising resilience training into seven groups, based on the underlying training concept: (1) cognitive behavioural therapy, (2) acceptance and commitment therapy, (3) mindfulness‐based therapy, (4) attention and interpretation therapy, (5) problem‐solving therapy, (6) stress inoculation therapy and (7) multimodal resilience training. We may have included additional groups after conducting the full literature search. Reference groups that might have been included in the NMA were: attention control, wait‐list, treatment as usual or no intervention. We had planned to investigate inconsistency and flow of evidence in accordance with recommendations in the literature (e.g. Chaimani 2019; Dias 2010; König 2013; Krahn 2013; Krahn 2014; Lu 2006; Lumley 2002; Rücker 2020; Salanti 2008; White 2012b).

The evidence base was insufficient to conduct a network meta‐analysis.

Summary of findings

Depending on the assessment of heterogeneity and possible effect modifiers (see Subgroup analysis and investigation of heterogeneity), we would have created several ‘Summary of findings’ tables; for example, the clinical status of study populations or the comparator group.

We were not able to investigate potential effect modifiers for the primary outcomes in subgroup analyses and therefore created no additional ‘Summary of findings’ tables.

Subgroup analysis and investigation of heterogeneity

Where we detected substantial heterogeneity, we had planned to examine characteristics of studies that may be associated with this diversity (Deeks 2019). The selection of potential effect modifiers was based on experiences from previous reviews (Leppin 2014; Robertson 2015; Vanhove 2016).

We had intended to perform the following subgroup analyses on our primary outcomes, if we identified 10 or more studies during the review process (Deeks 2019):

  • setting of resilience interventions (group setting vs individual setting vs combined setting);

  • delivery format of resilience interventions (face‐to‐face vs online vs bibliotherapy vs combined delivery vs mobile‐based vs delivery not specified);

  • theoretical foundation of resilience‐training programmes (CBT vs ACT vs mindfulness‐based therapy vs AIT vs problem‐solving training vs stress inoculation vs multimodal resilience training vs coaching vs positive psychology vs nonspecific resilience training);

  • comparator group in intervention studies (attention control vs wait‐list control vs TAU vs no intervention vs active control vs control group not further specified); and

  • intensity of resilience interventions (low intensity vs moderate intensity vs high intensity).

For the primary outcomes at each time point, we identified fewer than 10 studies in a pair‐wise meta‐analysis.

Sensitivity analysis

Comparable with the planned subgroup analyses, we had planned to perform sensitivity analyses if more than 10 RCTs were included in a meta‐analysis. We had intended to restrict the sensitivity analyses to the primary outcomes.

For intervention studies assessing resilience with resilience scales, we had planned to perform a sensitivity analysis on the basis of the underlying concept (state versus trait) in these measures, and to limit the analysis to scales assessing resilience as an outcome of an intervention.

To examine the impact of the risk of bias of included trials, we had intended to limit the studies included in the sensitivity analysis to those whose risk of bias was rated as low or unclear, and to exclude studies assessed at high risk of bias; for studies with low or unclear risk of bias, we had planned to conduct subgroup analyses.

We had also intended to restrict the analyses to registered studies. We had planned to identify registration both by recording whether we found a study in a trial registry and by noting whether the study authors claimed to have registered it.

We had planned to perform sensitivity analyses by limiting analysis to those studies with low levels of missing data (less than 10% missing primary outcome). We had intended to limit the analysis to studies where missing data had been imputed or accounted for by fitting a model for longitudinal data, or where the proportion of missing primary outcome data was less than 10%.

We had also intended to perform sensitivity analyses based on the ICC estimate in cluster‐randomised trials that had not adjusted for clustering, by excluding cluster‐RCTs where standard errors had not been corrected or corrected only on the basis of an externally‐estimated ICC. In an additional sensitivity analysis, we had planned to replace all externally‐estimated ICCs less than 0.10 by 0.10.
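As background to this planned correction, the standard approximate adjustment inflates a study's standard error by the design effect 1 + (m − 1) × ICC, where m is the mean cluster size. A minimal Python sketch (illustrative only; not part of the review's analyses, which used RevMan 5 and R):

```python
import math

def adjusted_se(se, mean_cluster_size, icc):
    """Inflate a standard error by the design effect 1 + (m - 1) * ICC,
    the usual approximate correction for cluster-randomised trials
    that did not account for clustering."""
    design_effect = 1.0 + (mean_cluster_size - 1.0) * icc
    return se * math.sqrt(design_effect)
```

For example, with 21 participants per cluster and an ICC of 0.05 the design effect is 2, widening the standard error (and hence the confidence interval) by a factor of √2.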

Finally, we had intended to conduct a sensitivity analysis based on the unit of randomisation, by limiting the analysis to individually randomised trials.

For the primary outcomes at each time point, we identified fewer than 10 studies in a pair‐wise meta‐analysis.

This table provides details of analyses that had been planned and described in the protocol (Helmreich 2017), including revisions made at review stage, but were not used as they were not required or not feasible.

ACT: acceptance and commitment therapy; AIT: attention and interpretation therapy; CBT: cognitive‐behavioural therapy; RCT(s): randomised controlled trial(s); TAU: treatment as usual; vs: versus

Selection of studies

Two review authors (AK, IH) independently screened titles and abstracts to determine eligible studies. We immediately excluded clearly irrelevant papers. At full‐text level, the same two review authors (AK, IH), working independently, assessed eligibility in duplicate. We calculated inter‐rater reliability at both stages (title and abstract screening and full‐text screening), resolving any disagreements in study selection by discussion. Where we could reach no consensus, a third review author (AC or KL) arbitrated. If necessary, we contacted the study authors to seek additional information. We recorded all decisions in a PRISMA flow diagram (Moher 2009).

We assessed the feasibility of the selection criteria a priori, by screening 500 studies in order to attain acceptable inter‐rater reliability (see Differences between protocol and review). There was good agreement between the review authors (kappa = 0.72), and thus no need to refine or clarify the criteria. For scientific reasons, however, we adapted the eligibility criteria during review development (see Differences between protocol and review).
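For reference, the kappa coefficient reported here can be computed from a simple rater-agreement table; a minimal Python sketch (illustrative only, not the software used for the review):

```python
def cohens_kappa(table):
    """Cohen's kappa for two raters from a square agreement table
    (rows: rater 1's categories, columns: rater 2's categories)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed proportion of agreement (diagonal cells).
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance-expected agreement from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)
```

A kappa of 1 indicates perfect agreement; values above roughly 0.6 are conventionally read as good agreement, consistent with the 0.72 reported above.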

Data extraction and management

We developed a data extraction sheet (see Appendix 10), based on Cochrane guidelines (Li 2019), and tested it on 10 randomly‐selected included studies. This initial test resulted in sufficient agreement between the review authors. For each included study, two review authors (AK, IH) independently extracted the data in duplicate. The extraction sheet contained the following elements:

  • source and eligibility;

  • study methods (e.g. design);

  • allocation process;

  • participant characteristics;

  • interventions and comparators;

  • outcomes and assessment instruments (means and standard deviations (SDs) in any standardised scale);

  • results;

  • miscellaneous aspects.

We resolved any disagreements in data collection by discussion. Where we could reach no consensus, a third review author (AC or KL) arbitrated. If necessary, we contacted the study authors to seek additional information.

Assessment of risk of bias in included studies

Two review authors (AK, IH) independently assessed the risks of bias of the included studies. We checked the risk of bias for each study using the criteria presented in the Cochrane Handbook for Systematic Reviews of Interventions, hereafter referred to as the Cochrane Handbook (Higgins 2011a) (see Appendix 11). We resolved any disagreements by discussion or by consulting a third review author (AC or KL). In accordance with Cochrane’s 'Risk of bias' tool (Higgins 2011b), we critically assessed each study across the following domains:

  • sequence generation and allocation concealment (selection bias);

  • blinding of participants and personnel (performance bias);

  • blinding of outcome assessment (detection bias);

  • incomplete outcome data (attrition bias);

  • selective outcome reporting (reporting bias).

We also considered the baseline comparability between study conditions as part of selection bias (random‐sequence generation), which is not defined in the Cochrane Handbook. In the first part of the assessment, we described what was reported to have happened in the study for each domain, before assigning a judgement for the risk of bias (low, high or unclear) for the entry.

Measures of treatment effect

Dichotomous data

We did not need to use our preplanned methods for analysing dichotomous outcomes (Helmreich 2017), as none of the included studies reported relevant dichotomous data for any of the prespecified primary or secondary outcomes.

Continuous data

Because the included resilience‐training studies used different measurement scales to assess resilience and related constructs (see Table 2; Table 3), we used standardised mean difference (SMD) effect sizes (Cohen's d) and their 95% confidence intervals (CIs) for continuous data in pair‐wise meta‐analyses. We calculated effect sizes on the basis of means, standard deviations (SDs) and sample sizes for each study condition. If the respective data were not provided, we computed Cohen's d from alternative statistics (e.g. t test, change scores). We assessed the magnitude of effect for continuous outcomes using the criteria for interpreting SMDs suggested in the Cochrane Handbook (Schünemann 2019a): a value of 0.2 indicates a small effect; 0.5 a moderate effect; and 0.8 a large effect (Cohen 1988b).
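The effect‐size computations described above can be sketched in a few lines of Python (illustrative only; the review's calculations were performed in RevMan 5 and R):

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference using the pooled standard deviation."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / sd_pooled

def d_from_t(t, n1, n2):
    """Recover Cohen's d from an independent-samples t statistic,
    one of the alternative routes when means and SDs are not reported."""
    return t * math.sqrt(1 / n1 + 1 / n2)

def se_of_d(d, n1, n2):
    """Approximate standard error of d, used to build the 95% CI
    (d +/- 1.96 * SE)."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
```

Under the thresholds cited above, a d around 0.2 would be read as small, 0.5 as moderate and 0.8 as large.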

Table 2. Primary outcomes: scales used

  • Resilience: 17 studies
  • Anxiety: 9 studies
  • Depression: 10 studies
  • Stress or stress perception: 13 studies
  • Well‐being or quality of life: 6 studies

a For depression, we preferred depression scales over burnout scales if both measures were reported.
b Concerning Barry 2019, we included the values for the PSS‐10 in the pooled analysis, as this measure was used more often among the included studies.

Table 3. Secondary outcomes: scales used

  • Social support (perceived): 4 studies
  • Optimism: 4 studies
  • Self‐efficacy: 7 studies
  • Active coping: 2 studies
      • Houston 2017: taking action, newly created subscale for the respective sample using original items of the Brief Coping Orientations to Problems Experience scale (Carver 1997)
      • Porter 2008: planful problem‐solving subscale of Ways of Coping Questionnaire (Folkman 1988)
  • Self‐esteem: 2 studies
  • Hardiness: 1 study
  • Positive emotions: 6 studies

Unit of analysis issues

Cluster‐randomised trials

As the allocation of individuals to different conditions in resilience intervention studies partly occurs by groups (e.g. work sites, hospitals), we intended to include cluster‐randomised studies along with individually‐randomised studies. Since we identified no cluster‐RCTs, we only included individually‐randomised studies in meta‐analyses.

Repeated observations on participants

If there were longitudinal designs with repeated observations on participants, we defined several outcomes based on different follow‐up periods and conducted separate analyses, as recommended in the Cochrane Handbook (Higgins 2019b). One analysis included all studies with measurement at the end of intervention (post‐test), while other analyses were based on the period of follow‐up (short‐term: three months or less; medium‐term: more than three months to six months; and long‐term follow‐up: more than six months). We rated assessments as post‐intervention if performed within one week after the intervention. Assessments at more than one week after the intervention were counted as short‐term follow‐up.

Studies with multiple treatment groups

If selected studies contained two or more intervention groups, two review authors (AK, IH) determined which group was relevant to the review and the particular meta‐analysis, based on the inclusion criteria for interventions (see Types of interventions). For all studies that included several intervention groups, we considered only one intervention group as relevant for the review (see Results, specifically 'Interventions').

Dealing with missing data

In the case of studies where there were missing data, such as missing SDs, or where healthcare students had been combined with other participants, we contacted the original authors to enquire if the missing data or subgroup (summary outcome) data were available. To obtain missing summary outcome data for studies solely conducted in healthcare students, we contacted the study authors (at least twice) to request the respective data (i.e. means, SDs and sample sizes for the relevant study conditions, or alternative information to calculate the SMDs; see Measures of treatment effect).

We did not ask for individual‐level missing data for outcome data missing due to attrition, and performed no re‐analysis using imputation methods. We rated studies with high levels of missing data (≥ 10%) that used no imputation methods as at high risk of attrition bias (see Assessment of risk of bias in included studies). If the study authors reported both a complete‐case analysis and imputed data, we used the summary outcome data based on the imputed dataset (e.g. last observation carried forward (LOCF) in two studies, or ideally expectation maximisation or multiple imputation).

Following the recommendations in the Cochrane Handbook (Higgins 2019b), we computed missing SDs for continuous outcomes on the basis of other statistical information (e.g. t values, P values), since, as expected, all papers contained enough information to reconstruct SDs from the reported results.
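For instance, when a study reports group means, sample sizes and an independent‐samples t value but no SDs, the pooled SD can be recovered algebraically; a minimal Python sketch (illustrative only):

```python
import math

def pooled_sd_from_t(mean1, n1, mean2, n2, t):
    """Recover the pooled SD from group means, group sizes and an
    independent-samples t statistic, by inverting
    t = (mean1 - mean2) / (sd_pooled * sqrt(1/n1 + 1/n2))."""
    return (mean1 - mean2) / (t * math.sqrt(1 / n1 + 1 / n2))
```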

Studies for which authors provided additional data not originally reported (e.g. number of participants analysed) are described in detail in the Characteristics of included studies tables. We recorded missing data and attrition levels for each included study in the ‘Risk of bias’ tables (beneath the Characteristics of included studies tables).

Assessment of heterogeneity

We assessed the presence of clinical heterogeneity by comparing study and population characteristics across all eligible studies, by generating descriptive statistics. In accordance with the Cochrane Handbook (Deeks 2019), we explored if studies were sufficiently homogeneous for participant characteristics, interventions and outcomes.

We assessed methodological diversity by inspecting the included studies for variability in study design and risks of bias (e.g. method of randomisation). In accordance with previous reviews that have already described the great heterogeneity in resilience intervention studies (Joyce 2018; Leppin 2014; Macedo 2014; Robertson 2015; Vanhove 2016), we discuss the similarities and differences between the included studies for these study characteristics in the Results and Discussion sections.

To assess statistical heterogeneity between the included studies within each pair‐wise meta‐analysis (i.e. heterogeneity in observed treatment effects that exceeds sampling error alone), we relied on forest plots, Chi2 test, tau2, and I2 statistic, as suggested by Deeks 2019. We also considered G2, in order to take small‐study effects into account (Rücker 2011). Significant statistical heterogeneity is indicated by a P value in the Chi2 test lower than 0.10. Since resilience‐training studies are often conducted with relatively small sample sizes (e.g. Loprinzi 2011; Sood 2014), we acknowledge that the Chi2 test has only limited power in such cases. Tau2 also provides an estimate of between‐study variance in a random‐effects meta‐analysis. The I2 is a descriptive statistic, which reflects the percentage of total variation across studies that is due to heterogeneity rather than to chance. In accordance with guidelines (Deeks 2019), we assumed non‐important heterogeneity for I2 values of 0% to 40%, moderate heterogeneity for I2 values of 30% to 60%, substantial heterogeneity for I2 values of 50% to 90%, and considerable heterogeneity for I2 values between 75% and 100%. G2 indicates the proportion of unexplained variance, having allowed for possible small‐study effects (Rücker 2011). No statistical heterogeneity is indicated by a G2 near zero. We also calculated the 95% prediction intervals from random‐effects meta‐analyses (see Data synthesis; pooled analyses with more than two studies) to present the extent of between‐study variation (Deeks 2019).
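The statistics described above (the Chi² test's Q, tau² and I²) can be illustrated with a minimal Python sketch of the common DerSimonian‐Laird approach (illustrative only; the review's analyses used RevMan 5 and R):

```python
def heterogeneity(effects, variances):
    """Cochran's Q, DerSimonian-Laird tau^2 and I^2 (in %) for study
    effect estimates and their within-study variances."""
    w = [1 / v for v in variances]
    pooled_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Q: weighted squared deviations from the fixed-effect pooled estimate.
    q = sum(wi * (yi - pooled_fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # DerSimonian-Laird estimate of between-study variance, truncated at 0.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # I^2: percentage of total variation due to heterogeneity rather than chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, tau2, i2
```

An I² of 75% in this sketch would fall into the "considerable heterogeneity" band cited above.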

Assessment of reporting biases

We produced (contour‐enhanced) funnel plots for the primary outcomes at post‐test, plotting the effect estimates of studies against their standard errors on reversed scales (Page 2019; Peters 2008), in order to explore potential publication bias as part of our assessment of the certainty of the evidence and to create the 'Summary of findings' table (see Data synthesis). We considered the fact that funnel plot asymmetry does not necessarily reflect publication bias but can stem from a number of reasons (Page 2019). To differentiate between real asymmetry and chance, we followed the recommendations in Page 2019, and also used Egger’s test (regression test; Egger 1997) to test for funnel plot asymmetry.
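For illustration, Egger's test regresses each study's standard normal deviate (effect divided by its standard error) on its precision (1/SE) and inspects the intercept; a minimal Python sketch computing that intercept (the accompanying significance test is omitted, and this is not the software used for the review):

```python
def egger_intercept(effects, ses):
    """Intercept of Egger's regression of (effect / SE) on (1 / SE).
    An intercept far from zero suggests funnel-plot asymmetry."""
    y = [e / s for e, s in zip(effects, ses)]
    x = [1 / s for s in ses]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least-squares slope and intercept in closed form.
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - slope * mean_x
```

With a common underlying effect and no small‐study effects, the points lie on a line through the origin and the intercept is zero.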

We did not assess reporting bias as planned for the remaining outcomes at other time points (Helmreich 2017), due to an insufficient number of studies (fewer than 10 studies) included in the meta‐analyses for each outcome (see Effects of interventions).

Data synthesis

We synthesised the results, in narrative and tabular form, by describing the resilience interventions, their theoretical concept (when possible), as well as the populations and outcomes studied (see Results). We performed the statistical analyses either in Review Manager 5 (RevMan 5; Review Manager 2014) or in R (R 3.6.3 2019; libraries used: meta (Balduzzi 2019), metafor (Viechtbauer 2010) and metasens (Schwarzer 2019)), as appropriate.

We combined outcome measures of included studies through pair‐wise meta‐analyses (any resilience training versus control), in order to determine summary (pooled) intervention effects of resilience‐training programmes in healthcare students. The decision to summarise numerical results of RCTs in pair‐wise meta‐analyses depended on the number of studies found (at least two studies for a specific outcome and time point), as well as the homogeneity of the included studies by population (for age, sex), resilience interventions (i.e. comparable content and modalities), comparisons, outcomes measured (i.e. the same prespecified outcome, albeit with different assessment tools), and methodological quality (risk of bias) of selected studies. We conducted meta‐analyses where intervention studies did not differ excessively in their content, outcomes (measures) were not too diverse and there were no individual studies predominantly at high risk of bias.

For summary statistics for continuous data, we reported SMDs using an inverse variance random‐effects model. We used random‐effects pair‐wise meta‐analyses since we anticipated a certain degree of heterogeneity between studies, as indicated by the results of previous reviews (Joyce 2018; Leppin 2014; Macedo 2014; Robertson 2015; Vanhove 2016), and given the nature of the interventions included. We calculated the 95% prediction intervals from random‐effects meta‐analyses (see Assessment of heterogeneity). As part of our sensitivity analyses, we also performed fixed‐effect analyses (see Sensitivity analysis). We analysed continuous data reported as means and SDs separately from outcomes for which SMDs and their standard errors were derived from other statistics (e.g. an independent t test). We subsequently combined these values using the generic inverse variance method in RevMan 5 (Review Manager 2014).
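Inverse variance random‐effects pooling, with its 95% CI and an approximate 95% prediction interval, can be sketched as follows (a simplified illustration using normal quantiles throughout; the usual prediction interval formula uses a t distribution with k − 2 degrees of freedom and is slightly wider):

```python
import math

def random_effects_pool(effects, variances, tau2):
    """Inverse-variance random-effects pooled estimate, 95% CI and an
    approximate 95% prediction interval, given a tau^2 estimate."""
    # Random-effects weights: within-study variance plus tau^2.
    w = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    # Prediction interval also reflects between-study spread (tau^2).
    pi_se = math.sqrt(se ** 2 + tau2)
    pi = (pooled - 1.96 * pi_se, pooled + 1.96 * pi_se)
    return pooled, ci, pi
```

With tau² = 0 the sketch reduces to the fixed‐effect analysis and the prediction interval coincides with the CI, which is why prediction intervals were only reported for pooled analyses with between‐study variation.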

We also included studies with mixed samples (i.e. healthcare students and non‐healthcare students) in meta‐analyses, provided the subgroup data for healthcare students were reported separately or could be obtained from the study authors. If subgroup data were not available, we provided a narrative report of the findings of these studies in a separate section (see Effects of interventions, studies with mixed samples) for each outcome.

All studies measuring resilience used only one resilience scale. If a study reported more than one instrument for mental health and well‐being outcomes or for a specific resilience factor, we used the measure most often used among the included studies for effect size calculation. For the outcome of depression, we preferred depression scales over burnout scales if both measures were reported. Where studies reported both general measures of well‐being or quality of life and work‐related assessments (e.g. job satisfaction, work‐related vitality), we preferred general measures.

Once we had produced a summary of the evidence to date, and only if a pair‐wise meta‐analysis (any resilience training versus control) was possible, we examined whether the data were also suitable for a network meta‐analysis (NMA). There was insufficient evidence to perform an NMA.

Summary of findings

In this review, we used the software developed by the GRADE Working Group, GRADEpro: Guideline Development Tool (GRADEpro GDT), to create a 'Summary of findings' table for the comparison: resilience interventions versus control conditions for healthcare students.

We included all primary outcomes at post‐test in the ‘Summary of findings’ table. For each outcome, we assessed the certainty of the body of evidence using the GRADE approach proposed by the GRADE working group (Schünemann 2013; Schünemann 2019b), across the following five GRADE considerations:

  • limitations in the design and implementation of available studies (i.e. unclear or high risk of bias of studies contributing to the respective outcome; Guyatt 2011a);

  • high probability of publication bias (i.e. high risk of selective outcome reporting bias for studies contributing to the outcome, based on funnel plot asymmetry, Egger's test, different results of published versus unpublished studies, and whether the evidence consisted of many small studies with potential conflicts of interest) (Guyatt 2011b);

  • imprecision of results (i.e. small number of participants included in an outcome and wide CIs; Guyatt 2011c);

  • unexplained heterogeneity or inconsistency of results (i.e. heterogeneity based on variation of effect estimates, CIs, the statistical test of heterogeneity and I2, but the subgroup analyses fail to identify a plausible explanation; Guyatt 2011d); and

  • indirectness of evidence (i.e. included studies limited to certain participants, intervention types, or comparators; Guyatt 2011e).

According to the GRADE system, meta‐analyses for continuous outcomes should include sample sizes of at least 400 participants for sufficient statistical precision. Where there was both substantial inconsistency (I2 ≥ 60%) for an outcome and imprecision, we did not downgrade for imprecision, as the heterogeneity might have influenced the CI (i.e. precision) and we did not wish to double‐downgrade for the same problem.

Two review authors (AK, IH), working independently, conducted the assessment of the certainty of the evidence in duplicate, resolving any disagreements by discussion or by consulting a third review author (AC, KL). We interpreted the magnitude of effect for continuous outcomes according to the criteria suggested in the Cochrane Handbook (Schünemann 2019a) (i.e. 0.2 as a small effect, 0.5 as a moderate effect, 0.8 as a large effect).

We rated the certainty of the evidence as high, moderate, low or very low (Schünemann 2013). High‐certainty evidence indicates high confidence that the true effect lies close to that of the estimate of effect. Very low‐certainty evidence indicates that we have very little confidence in the effect estimate and that the true effect is likely to be substantially different from the estimate of effect.

See Differences between protocol and review.

Subgroup analysis and investigation of heterogeneity

Due to the limited number of studies that we could include in our meta‐analyses for the primary outcomes (fewer than 10 studies; Deeks 2019), we were not able to conduct the planned subgroup analyses to examine key characteristics of studies that may be associated with the substantial heterogeneity detected for several outcomes (see Effects of interventions). We were also unable to perform a subgroup analysis for training intensity (post hoc addition).

Sensitivity analysis

Due to the limited number of studies we were able to include in our meta‐analyses for the primary outcomes (fewer than 10 studies; Deeks 2019), we did not conduct most of the planned sensitivity analyses to test the robustness of the findings of this review.

However, for the primary outcomes at post‐test (i.e. the main analyses of this review), we performed the planned sensitivity analysis using a fixed‐effect model in addition to random‐effects meta‐analyses.

Results

Description of studies

Results of the search

We ran the first searches for this review in October 2016 according to the protocol (Helmreich 2017). We used the strategies in Appendix 8 to find studies in which the participants included any adults aged 18 years and older. Due to the large number of potentially eligible studies, we decided to split the review and changed the inclusion criteria to focus on healthcare sector workers and students (see Differences between protocol and review). Before running the top‐up searches in June 2019, we revised the original search strategy by limiting the population to healthcare sector workers and students (Appendix 9). Following these searches, we further revised the inclusion criteria to healthcare students only, which is the focus of this review.

In total, the database searches retrieved 37,737 records. We found an additional 663 records by searching other resources. Following de‐duplication, we screened the remaining 24,703 records by title and abstract. We deemed 21,629 records to be irrelevant and sought the full texts of the remaining 3074 records for further assessment. At title/abstract screening, agreement between the review authors was good for the original search (kappa = 0.70) and excellent for the top‐up searches (kappa = 0.99). Full‐text screening resulted in excellent inter‐rater reliability for both the original search (kappa = 0.95) and the top‐up searches (kappa = 1).

After revising the eligibility criteria to focus broadly on the healthcare sector (including healthcare professionals and healthcare students; see Differences between protocol and review), we identified 80 studies that were performed in any of these groups. We also identified nine ongoing studies and 29 studies awaiting classification. We found six additional reports of studies during the top‐up searches.

Finally, after revising the eligibility criteria to focus on healthcare students, we reassessed these 118 studies (from 144 reports). In total, for healthcare students, we included 30 studies (from 34 reports). We excluded a total of 3010 full text reports (see Figure 1); this figure includes 15 reports (13 excluded studies), which we needed to examine in detail to determine eligibility, and which are described in the Characteristics of excluded studies. We identified 22 studies awaiting classification (see Studies awaiting classification) and three ongoing studies (see Ongoing studies). For further details of our screening process, see the study flow diagram (Figure 1). We present the results of both searches in more detail in Appendix 12.


Study flow diagram for all searches.

From an updated (pre‐publication) search of four key databases in June 2020, we have added 16 studies (from 16 reports) to the Characteristics of studies awaiting classification tables. The results of these studies are not yet included in this review and will be incorporated at the next update. We also found an additional report of the included study by Houston 2017 (First 2018: a qualitative study in a subsample of Houston 2017).

Included studies

We present the corresponding references for the description of included studies in Appendix 13.

Study design

All 30 included studies used parallel‐group designs and were published between 2005 and 2019, with the exception of one completed but unpublished trial (ISRCTN64217625).

Location

Eleven studies were conducted in the USA, four in Canada, and three in Iran. Two studies apiece were performed in Australia, Germany, China, and the UK. The remaining studies took place in Belgium (Geschwind 2015), India (Mathad 2017), Switzerland (Recabarren 2019), and The Netherlands (Smeets 2014).

Settings

Training programmes were delivered at university or in schools (e.g. nursing school, school of medicine) in 11 studies. For nine studies, the intervention site was not further specified. As four studies included online or mobile resilience interventions, there was no concrete venue and participants could participate regardless of location. Three interventions took place in a laboratory. Two training programmes could be performed in the home setting (using a spoken compact disc (CD)). One resilience training was conducted in a mixed setting (online training plus face‐to‐face sessions with implementation site not further specified) (ISRCTN64217625).

Participants

Participants were mainly young women, reflecting the student samples (one study was conducted in doctoral candidates; Barry 2019). Most studies evaluated a resilience‐training programme in nursing or midwifery students, closely followed by medical students with an almost equal number of studies.

The total number of healthcare students randomised across 20 of the 30 included studies was 1315 (including one completed but unpublished study: ISRCTN64217625 (target sample size: 50)). For eight studies with mixed samples (Barry 2019; Galante 2018; Geschwind 2015; Goldstein 2019; Houston 2017; Recabarren 2019; Venieris 2017; Victor 2018), the total number of participants randomised was 1365. While the original number of healthcare students randomised in most of these mixed‐sample studies is unclear, we received this information from the authors of five studies (Barry 2019; Geschwind 2015; Houston 2017; Recabarren 2019; Victor 2018). For two studies, Kelleher 2018 and Samouei 2015, it was unclear how many participants were randomised. Overall, eight studies randomised 100 or more participants, and five studies randomised 30 participants or fewer.

Where data on age were available, the mean age across 13 studies in healthcare students (no studies with mixed samples) ranged from 19.5 to 26.83 years (SDs ranging from 0.77 to 5.12 years), with an average of 22.29 years (mean SD = 2.12 years). For mean age in studies with mixed samples, six out of eight studies reported a range of 19.35 to 38.14 years (SD ranging from 1.98 to 11.33 years) for the total samples, including healthcare students, with an average of 25.14 years (mean SD = 4.65 years). Three studies did not report mean age, but only the age range of participants (Houston 2017; Waddell 2005; Waddell 2015), and two studies (Galante 2018; Delaney 2016) considered participants aged 17 or 18 years and above, respectively. Six studies did not specify the age of the sample or were unclear (Chen 2018a; ISRCTN64217625; Kelleher 2018; Mejia‐Downs 2016; Miu 2016; Samouei 2015).

Women outnumbered men in seven studies conducted solely in healthcare students, and men predominated in a further six studies. Three studies included only women. For six studies, the sex of the participants was unclear (Chen 2018a; ISRCTN64217625; Kelleher 2018; Mejia‐Downs 2016; Samouei 2015; Waddell 2005). For 14 studies presenting the total numbers of men and women investigated, the proportion of women was 63.3%. Women outnumbered men in seven of the eight studies with mixed samples; one mixed‐sample study included only women. The proportion of women in eight studies with mixed samples that reported total numbers by sex was approximately 67.3%.

Eight studies included solely nursing or midwifery students, and seven were conducted in medical students. Three studies involved paramedic students and two involved psychology students. Two studies included physical therapy students. The eight remaining studies were performed with mixed samples, i.e. healthcare students combined with other individuals such as volunteers or students in other fields. Relevant subgroups within these studies included: university students (Goldstein 2019; Houston 2017); doctoral candidates in different health fields (Barry 2019); psychology students (Geschwind 2015; Recabarren 2019; Victor 2018); 'Clinical medicine' and 'Humanities and social sciences' students (p e76; Galante 2018); students in 'Health & Wellness' and 'Social and Behavioral Sciences' (p 146; Venieris 2017).

Twelve of the 30 studies assessed mental health at baseline. All studies measuring mental health used self‐report (screening) measures covering one or a small number of mental dysfunctions (e.g. Depression Anxiety and Stress Scales (DASS) in Barry 2019). Only one of these studies also conducted comprehensive baseline diagnostics with the use of a structured interview (Mini‐International Neuropsychiatric Interview (MINI); Recabarren 2019). Seventeen studies provided no data about the mental health status of the sample. For one unpublished trial (ISRCTN64217625) and one study published as a conference abstract (Goldstein 2019), the baseline mental health status was unclear, although both assessed mental health at baseline. Eight studies included mentally healthy participants only (Akbari 2017; Recabarren 2019), participants without severe psychiatric illness (not further specified; Mathad 2017), participants showing symptoms below a cut‐off on a screening instrument (Barry 2019; Wang 2012; Warnecke 2011) or participants without certain mental disorders or suicidality, e.g. bipolar disorder, psychosis (Miu 2016; Victor 2018). Wang 2012 only considered participants with a mental crisis. Since Victor 2018 focused on burdened students, they included participants with a symptom burden of four or more on the Global Severity Index of the Brief Symptom Inventory.

Interventions

All 30 studies examined the effects of a psychological intervention to foster resilience, hardiness or post‐traumatic growth in healthcare students, compared to a control condition. Most interventions were delivered in a group setting (17 studies), face‐to‐face (17 studies), and were based on mindfulness‐based theoretical approaches (eight studies). High‐intensity interventions (11 studies) and low‐intensity training programmes (10 studies) were relatively balanced.

Two studies had multiple intervention arms (Kötter 2016; Venieris 2017). In a three‐arm study (Kötter 2016), one intervention group (IG1) participated in a one‐hour psycho‐educative seminar (e.g. emotional reactions towards stressors) plus two individual coaching sessions, with the latter designed to foster individual stress‐management resources (i.e. resilience) using techniques such as eye movement desensitisation and neurolinguistic programming. IG2 received the psycho‐educative seminar only. Due to an unexpected shortfall in the sample size, the study authors combined both intervention arms in the quantitative analyses. Venieris 2017 was a three‐arm study comparing a positive psychology intervention (PPI; IG1) to an informative stress intervention (IG2) or a wait‐list control group. The PPI asked participants to engage in one of five activities (e.g. '3 grateful things') every day for three weeks, whereas IG2 provided information about stress and positive coping mechanisms. Since the study authors hypothesised an increase in resilience in the PPI group compared to the remaining groups, we considered this group to be relevant for our review.

Setting

Seventeen interventions were performed in groups. Seven studies were conducted in an individual setting. Four studies were performed in a variety of training settings. Two studies did not specify the type of setting (Chen 2018a; Miu 2016).

Delivery format

Seventeen studies delivered resilience interventions face‐to‐face. Five studies used multimodal delivery (e.g. face‐to‐face group sessions plus internet‐based training). Four studies examined online or mobile‐based resilience‐training programmes, and two studies tested interventions conducted in a laboratory setting, probably without face‐to‐face contact. Two studies used an audio intervention.

Training intensity

Treatment duration varied between a 20‐minute single intervention session (Geschwind 2015) and 40 hours in total, i.e. one hour a day for five days a week over eight weeks (Mathad 2017). Eleven studies included high‐intensity training (more than 12 hours or more than 12 sessions). Ten RCTs investigated low‐intensity interventions (i.e. less than five hours or three sessions or fewer), and seven studies evaluated moderate‐intensity training (i.e. more than five hours to 12 hours or less, or more than three sessions to 12 sessions or fewer). The intensity of the training was unclear for two interventions (Barry 2019; Venieris 2017).

Theoretical foundations

We categorised the interventions into six groups, based on their content and the descriptions provided by the study authors. We present a synthesis of the characteristics of studies within a specific theoretical foundation and the respective intervention content in Appendix 14.

Eight studies evaluated mindfulness‐based resilience interventions, including MBSR (Erogul 2014; Kelleher 2018) and content related to mindfulness‐based (self‐)compassion (e.g. Chen 2018a; Smeets 2014). Seven RCTs examined nonspecific resilience interventions that did not give details of the type of resilience‐training programmes conducted or their theoretical orientation, but aimed at fostering one or several prespecified resilience factors (see Appendix 2, level 1, e.g. self‐esteem, social support, active coping by problem‐solving, spirituality). Six studies included interventions based on a combination of two or more explicit theoretical foundations (e.g. CBT and positive psychology). Of these, two were based on mindfulness (e.g. MBSR) and CBT or cognitive therapy (ISRCTN64217625; Recabarren 2019), and four combined interventions that could not be clustered any further (Delaney 2016; Goldstein 2019; Kötter 2016; Victor 2018). Four resilience‐training programmes were based on positive psychology. Three interventions included only elements of CBT. Resilience‐training programmes based on coaching approaches were tested in two studies.

Comparators

With the exception of one study (Victor 2018), the 30 included studies each involved only one comparator. In Victor 2018, the intervention group PRM (see Interventions) was compared to both attention‐control and wait‐list control groups. For this review, we considered only the attention‐control group to be relevant.

Most studies included wait‐list control groups (10/30 studies) and no intervention comparators (7/30 studies), followed by attention control (6/30 studies; Victor 2018 (second CG)), active control (3/30 studies) and TAU (3/30 studies). Two studies did not specify the type of control group (Anderson 2017; Wang 2012).

Of the six studies comparing a resilience intervention with an attention‐control group, two studies conducted in a laboratory setting either instructed participants to think about a typical day and visualise this scenario for five minutes (Geschwind 2015), or used the ‘Wisdom On Wellness’ (WOW) intervention, which included some level of social interaction (Goldstein 2019). Further attention‐control comparators included a single laboratory session on the role of the brain and information processing (participants had to read an article, read testimonials from others and write to others about what they learned; Miu 2016); a time management intervention (Smeets 2014); an educational intervention on Twitter consisting of nursing trivia or questions related to nursing knowledge (Stephens 2012); and individual coaching sessions on the ABC (Activating event, Belief, Consequences) model (e.g. challenging dysfunctional thoughts and testing alternative ones) (Victor 2018).

In three studies, active control groups included a booklet and a worksheet on therapeutic communication (Delaney 2016); brochures about scientific information unrelated to psychology (Samouei 2015); and a standard, group‐based resilience training (Mind's resilience intervention; ISRCTN64217625). We considered the last intervention to use an active control group, rather than TAU, because the Mind’s group‐based resilience intervention for emergency personnel had been newly developed in a recent study (Wild 2016), and was not yet considered established standard care.

In three studies, TAU referred to usual mental health support (Galante 2018) or a standard undergraduate curriculum group for nursing students (Waddell 2015). Warnecke 2011 did not further specify the content of the TAU group.

One study used a design where a control group plus resilience intervention was compared to the control group alone (Galante 2018). One completed but unpublished study (ISRCTN64217625) examined the impact of a face‐to‐face resilience intervention (control group) versus the same resilience intervention with an additional internet‐based top‐up session (intervention group).

Outcome measures

The included RCTs used a diversity of outcome measures, but some studies measuring the same outcomes (e.g. perceived stress) used the same instrument (e.g. Perceived Stress Scale; Cohen 1983b; Cohen 1988a). All outcomes were based on self‐reported assessments and most studies used validated scales.

Primary outcomes

We defined treatment efficacy as an improvement in resilience, assessed by specific resilience scales, or an improvement in four categories of mental health and well‐being (i.e. anxiety, depression, stress or stress perception, and well‐being or quality of life). For each outcome, the studies used heterogeneous scales (see details in Table 2). Among the 30 included studies, 17 assessed resilience using a resilience scale, followed by stress or stress perception (13 studies), depression (e.g. depressive symptoms; 10 studies), anxiety (nine studies) and well‐being or quality of life (six studies).

Secondary outcomes

The authors of the included studies used a heterogeneous group of instruments to assess the secondary outcomes (see details in Table 3). Most of the included studies assessed self‐efficacy (seven studies), followed by positive emotions (six studies). Social support and optimism were assessed by four studies each. Active coping and self‐esteem were assessed by two studies each, while hardiness was an outcome measure in one study.

Funding sources

Funding sources for the included studies varied; in six studies they included universities (e.g. certain faculties, medical schools) and university research funds. In four studies, further funding was provided by different foundations. Two studies received funding from the nursing organisation Sigma Theta Tau. Single studies were supported by a scholarship (Waddell 2005), the Graduate and Professional Student Association (Venieris 2017), the US Substance Abuse and Mental Health Services Administration through a university's Disaster and Community Crisis Center (Houston 2017), and the Social Sciences and Humanities Council (Waddell 2015). Four studies reported a combination of funding sources (e.g. Canadian Mental Health Association, Campus Capacity Development Grant and Justice Institute of British Columbia; university, National Institute for Health Research Collaboration and Care East England; award and graduate research fellowship; university and charity). Seven studies did not report their funding sources (Mathad 2017; Miu 2016; Smeets 2014; Victor 2018; Wang 2012) or did not provide this information (e.g. in conference abstracts) (Chen 2018a; Kelleher 2018). Three studies received no funding support (Mueller 2018; Samouei 2015; Sahranavard 2018).

Excluded studies

We excluded 3010 irrelevant full text reports.

We excluded 13 studies that seemed to merit inclusion but on closer inspection did not (see Characteristics of excluded studies). Most of these studies (11/13) were excluded for an ineligible intervention (Brady 2016; De la Fuente 2018; De Vibe 2013; Duan 2019; Dvořáková 2017; Esch 2013; Huennekens 2018; Pogrebtsova 2018; Sampl 2017; Song 2015; Van Dijk 2015). Of these, eight studies only briefly mentioned the concept of resilience or a related construct (e.g. in the introduction or discussion section of a publication), but did not explicitly state the aim of fostering resilience, hardiness or post‐traumatic growth through the intervention (Brady 2016; De la Fuente 2018; De Vibe 2013; Dvořáková 2017; Esch 2013; Huennekens 2018; Pogrebtsova 2018; Song 2015). It was also unclear in Dvořáková 2017 whether healthcare students were included. Duan 2019 evaluated an intervention based on strengths‐based CBT to build resilience (Padesky 2012). However, the authors did not specify the intention of fostering resilience. Sampl 2017 (psychology students included) often mentioned the concept of resilience, but we excluded the study as, according to the investigators, it primarily focused on (measured) constructs such as mindfulness. Van Dijk 2015 was excluded, as the study mentioned resilience in a publication reporting baseline results of the RCT, but not in the final report.

We excluded one study due to ineligible study design: Victor 2017 evaluated a strengths‐based CBT intervention to foster resilience in first‐year psychology students by randomising participants to either the training or a no‐intervention control group. However, as participants were free to decline the invitation, the authors pointed out that randomisation may have been jeopardised by self‐selection bias.

Finally, we did not include ACTRN12617000300370, as we received information from the primary investigators that the trial in university staff failed to obtain human resources approval to proceed, and that the authors did not have any data relevant to the meta‐analysis.

Studies awaiting classification

We identified 22 studies awaiting classification.

For 20 studies, it was unclear whether the final sample also included healthcare students (Arch 2014; Bauman 2014; Beadel 2016; Chen 2018b; DRKS00011265; DRKS00013765; Enrique 2019; Gerson 2013 (study 1); Gerson 2013 (study 2); Harrer 2018; Herrero 2019; ISRCTN17156687; Kanekar 2010; Liu 2016; NCT02867657; NCT03903978; Oman 2008; Roghanchi 2013; Seligman 2007; Zhang 2018). In six studies (Arch 2014; Bauman 2014; Beadel 2016; Gerson 2013 (study 1); Gerson 2013 (study 2); Oman 2008), for example, the study authors reported partial recruitment in psychology departments. However, whether psychology students had been included in the final sample was not specified in the reports and could not be obtained from the study authors. Similarly, 14 studies only described recruiting university or college students in general, and based on the available reports it was unclear if the final samples included healthcare students at all (Chen 2018b; DRKS00011265; DRKS00013765; Enrique 2019; Harrer 2018; Herrero 2019; ISRCTN17156687; Kanekar 2010; Liu 2016; NCT02867657; NCT03903978; Roghanchi 2013; Seligman 2007; Zhang 2018).

In one study (NCT03669016) resilience was assessed as a secondary outcome, but we could not clearly determine the extent to which the trial focused on fostering this construct based on the trial registration, and received no response from the authors. The same applied to Chen 2018b, for which resilience or a related construct was not mentioned in the available conference abstract.

The study design of Ye 2016 could not be clearly determined, since the full text was not available and we identified no contact details to ask the study authors for more information. The same applied to Zhang 2018 (available as a conference abstract) and to Liu 2016, for which, besides the potential inclusion of healthcare students, randomisation was also unclear.

Details of these studies can be found in the Characteristics of studies awaiting classification tables.

Sixteen studies from the updated search in June 2020 were also added to the Characteristics of studies awaiting classification tables. They will be incorporated into this review at the update stage.

Ongoing studies

We found three ongoing studies that are likely to meet our inclusion criteria (Harrer 2019; NL7623; Wild 2018). All three studies are RCTs with parallel assignment. In a sample of German distance‐learning students experiencing elevated levels of depression (including psychology students), Harrer 2019 assessed the impact of TAU (e.g. general practitioner visits, counselling services) plus StudiCare Fernstudierende (a seven‐week online intervention with feedback on demand) versus usual care plus attention control (online psycho‐education). The intervention involved information about stress, systematic problem‐solving, muscle and breath relaxation, mindfulness, acceptance and tolerance, self‐compassion and creating a master plan (e.g. recognising physiological warning signs, creating a plan for the future). Using a longitudinal observational cohort study with a nested RCT, a Dutch study (NL7623) randomised students at the Erasmus University Medical Center in Rotterdam to either resilience training or an active control (psycho‐education about chronic stress and burnout prevention). In contrast to the other ongoing studies, the intervention group did not receive only one treatment, but followed a maximum of three intervention periods (e.g. mindfulness training, stress management training) of eight weeks each. Wild 2018 examined the impact of an internet‐based cognitive training for resilience (iCT‐R) versus attention control (mind‐online) and TAU (usual support by university) in a sample of paramedic students.

Further details of these studies can be found in the Characteristics of ongoing studies tables.

Risk of bias in included studies

The main limitations we found for risks of bias (≥ 20% high risk) across the 30 studies were in the following domains: performance bias, detection bias, attrition bias, and reporting bias. See Figure 2 and Figure 3 for ‘Risk of bias’ graphs, and Characteristics of included studies tables for further information. A large number of studies provided insufficient information to adequately judge the risk of selection bias. We identified the greatest variation across studies for attrition and reporting biases.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.



Risk of bias summary: review authors' judgements about each risk of bias item for each included study.


Allocation

Sequence generation

We rated eight studies at low risk of selection bias, since the investigators described a random component in the sequence‐generation process (e.g. computer‐generated random sequence, shuffling cards). For five of these, there was verified baseline comparability between study groups for sociodemographic characteristics (i.e. potential confounding factors) as well as outcome variables (Mueller 2018; Smeets 2014; Venieris 2017; Victor 2018; Waddell 2015). For three studies, there was evidence of a genuine random assignment (e.g. computer‐generated random sequence), but the authors provided no information (Galante 2018) or only partial information about potential baseline differences in sociodemographic and outcome measures (Recabarren 2019; Stephens 2012).

We rated 14 studies as having unclear risk of selection bias because there was no description of the sequence‐generation process (Akbari 2017; Anderson 2017; Delaney 2016; Erogul 2014; Houston 2017; Kötter 2016; Mathad 2017; Peng 2014; Porter 2008; Sahranavard 2018; Samouei 2015; Waddell 2005; Wang 2012; Warnecke 2011). Nine of these RCTs did not further specify the baseline comparability of groups for (some) sociodemographic characteristics or outcomes of interest, or both (Akbari 2017; Houston 2017; Kötter 2016; Mathad 2017; Peng 2014; Porter 2008; Sahranavard 2018; Samouei 2015; Waddell 2005). Based on the limited information in conference abstracts or trial registrations, we considered five further studies to be at unclear risk of bias (Chen 2018a; Goldstein 2019; ISRCTN64217625; Kelleher 2018; Mejia‐Downs 2016).

We judged three studies to be at high risk of selection bias since, despite randomisation, baseline comparability in sociodemographic characteristics or outcomes (or both) could not be verified by the investigators based on statistical analysis (Barry 2019; Geschwind 2015; Miu 2016).

Allocation concealment

Allocation concealment was not well reported and we rated all 30 included studies at unclear risk of selection bias for this domain.

Five studies described the randomisation process being concealed from participants or from personnel recruiting participants, or both, but did not further specify the method of allocation concealment (Erogul 2014; Miu 2016; Mueller 2018; Recabarren 2019; Waddell 2015).

The authors of 20 studies provided either insufficient or no information about the allocation concealment process (Akbari 2017; Anderson 2017; Barry 2019; Delaney 2016; Galante 2018; Geschwind 2015; Houston 2017; Kötter 2016; Mathad 2017; Peng 2014; Porter 2008; Sahranavard 2018; Samouei 2015; Smeets 2014; Stephens 2012; Venieris 2017; Victor 2018; Waddell 2005; Wang 2012; Warnecke 2011). In Barry 2019, the random‐sequence generation was probably concealed from the participants (sealed trial pack that participants were instructed to open only after the baseline assessment), but the allocation concealment for the investigators enrolling the participants was not specified. Similarly, for the two randomisation procedures used in Kötter 2016, the study authors described the allocation concealment in sufficient detail (sealed, opaque envelopes) for the second randomisation (IG1 versus IG2), but it was unclear for the first randomisation to either treatment or control group.

There was limited information in conference abstracts or trial registrations to reach a decision on the risk of bias for five studies, and we therefore rated them as having an unclear risk of bias (Chen 2018a; Goldstein 2019; ISRCTN64217625; Kelleher 2018; Mejia‐Downs 2016).

Blinding

Blinding of participants and personnel
Objective outcomes

Five of the 30 studies assessed one or several objective outcomes such as physical activity by accelerometers, heart/breathing rate or grade point average (Delaney 2016; Galante 2018; Goldstein 2019; ISRCTN64217625; Kelleher 2018). Although study personnel were not blinded in most of these studies (see Subjective outcomes below), we judged these studies to be at low risk of performance bias in relation to objective outcomes.

Subjective outcomes

We considered only one of the 30 studies to be at low risk of performance bias for subjective outcomes (Miu 2016), as this was a double‐blind RCT.

We rated two studies at unclear risk of performance bias (Anderson 2017; Venieris 2017). Anderson 2017 performed a (blended) online resilience intervention without specifying the blinding of participants and personnel. Venieris 2017 also delivered a training programme through an online educational system, and the blinding of participants was probably ensured (i.e. participants were not specifically informed about the number or nature of study conditions, but only informed that they may or may not be asked to participate in different activities), but there was insufficient information about the potential blinding of study personnel.

We judged 22 studies to be at high risk of performance bias because resilience interventions were performed entirely face‐to‐face (Akbari 2017; Chen 2018a; Delaney 2016; Erogul 2014; Galante 2018; Goldstein 2019; Houston 2017; Kelleher 2018; Kötter 2016; Mathad 2017; Mejia‐Downs 2016; Peng 2014; Porter 2008; Recabarren 2019; Sahranavard 2018; Samouei 2015; Victor 2018; Waddell 2015; Wang 2012), or included face‐to‐face elements (ISRCTN64217625; Smeets 2014; Waddell 2005), resulting in a lack of blinding of personnel. Some of these studies explicitly indicated the lack of blinding of both participants and personnel (e.g. Galante 2018; Kötter 2016; Recabarren 2019). We also rated five other studies at high risk of performance bias for the following reasons. Barry 2019 and Warnecke 2011, which were described as single‐blind studies, provided participants with a spoken CD; Warnecke 2011 described no blinding of participants, and it was unclear whether study personnel or participants had been blinded in Barry 2019. Mueller 2018 performed an online, self‐guided intervention and indicated no blinding of participants; blinding of study personnel was also unlikely, as they monitored discussion‐board postings within the intervention. Stephens 2012 conducted a resilience‐training programme on Twitter, delivered by a single researcher who also performed the outcome assessment. Geschwind 2015 included a resilience intervention conducted in a laboratory. Although there was no face‐to‐face contact in the latter two studies, study personnel were not blinded, as verbal communication with participants was possible and participants were observed by the intervention providers.

Blinding of outcome assessment
Objective outcomes

We considered all five studies measuring objective outcomes to be at low risk of detection bias. Although two of these studies did not adequately describe the blinding of outcome assessment (Delaney 2016; Galante 2018), we judged them to be at low risk of detection bias, since they used objective outcomes (e.g. physiological parameters), which we considered unlikely to be influenced by the lack of blinding. We applied the same rating to three other studies that used objective outcomes, even though there was insufficient information in the conference abstracts, posters or trial registrations (Goldstein 2019; Kelleher 2018; ISRCTN64217625).

Subjective outcomes

We judged only two studies to be at low risk of detection bias for the assessment of subjective outcomes (Galante 2018; Miu 2016), for the following reasons: data were collected using web‐based software to ensure masking of outcome assessors (Galante 2018), or researchers blind to condition provided a link to an online survey (Miu 2016).

We considered four studies to be at unclear risk of detection bias, because the study authors did not adequately describe the blinding of outcome assessment (Anderson 2017; Barry 2019; Geschwind 2015; Venieris 2017), and the risk of performance bias (i.e. blinding of participants) was low or unclear (see Blinding of participants and personnel).

Finally, we rated 24 studies at high risk of detection bias. Blinding of outcome assessment seemed unlikely in Kötter 2016 for the assessment after first randomisation, since the group allocation was concealed neither from study personnel nor from the participants (unclear blinding for the second assessment). In Stephens 2012, the outcome assessor was the same individual who provided the intervention and therefore could not be blinded. For the remaining 22 studies, due to (potential) performance bias (no blinding of participants), we judged that the participants' responses to questionnaires were likely to be affected by the lack of blinding (i.e. knowledge and beliefs about the intervention they received) (Akbari 2017; Chen 2018a; Delaney 2016; Erogul 2014; Goldstein 2019; Houston 2017; ISRCTN64217625; Kelleher 2018; Mathad 2017; Mejia‐Downs 2016; Mueller 2018; Peng 2014; Porter 2008; Recabarren 2019; Sahranavard 2018; Samouei 2015; Smeets 2014; Victor 2018; Waddell 2005; Waddell 2015; Wang 2012; Warnecke 2011).

Incomplete outcome data

We assessed 13 studies as being at low risk of attrition bias because they met at least one of the following criteria: the losses were similar across intervention and control groups; the reasons for missing data were unlikely to be related to the true outcome (e.g. dropout due to pregnancy); the losses were not substantial (< 10% of the number of randomised participants; e.g. two dropouts from 70 participants in Wang 2012); and/or study authors accounted for dropouts and losses to follow‐up by using statistical analyses aimed at reducing bias (e.g. multiple imputation) or preventing false‐positive conclusions (e.g. last observation carried forward) (Delaney 2016; Erogul 2014; Galante 2018; Houston 2017; Kötter 2016; Mejia‐Downs 2016; Mueller 2018; Peng 2014; Recabarren 2019; Smeets 2014; Stephens 2012; Victor 2018; Wang 2012). Four studies performed an intention‐to‐treat (ITT) analysis (Galante 2018; Kötter 2016; Recabarren 2019; Victor 2018). Based on data provided by the original investigator, Mejia‐Downs 2016 analysed all randomised participants, but it is unclear if there were any missing data that were imputed. The same applied to Peng 2014, who reported results for all participants randomised but did not state the level of missing data. Smeets 2014 provided contradictory information about the number of participants analysed (available‐case analysis according to p 797 versus ITT analysis according to Table 2 in the report), but we judged the study to be at low risk of bias, as the dropout was not substantial. Delaney 2016 did not clearly specify the number of participants analysed and we relied on the numbers reported in the publications, as we had no response from the study authors.

We rated eight studies at unclear risk of attrition bias. Four studies did not fully account for dropouts throughout the study or whether this differed between groups (Akbari 2017; Anderson 2017; Geschwind 2015; Samouei 2015). Samouei 2015 did not report the number of participants randomly allocated to each group. Anderson 2017 and two other studies (Sahranavard 2018; Waddell 2005: especially after phase two) did not clearly specify the number of participants analysed and we relied on the numbers reported in the publications, having received no response from the study authors, or had to derive these indirectly from other statistical values in the report with the help of the statistician (JK). We could not judge the risk of attrition bias from the information available in conference abstracts or trial registrations for two studies (ISRCTN64217625; Kelleher 2018), which we consequently rated at unclear risk of bias.

We considered nine studies to be at high risk of attrition bias. In three of these, the reasons for missing data were unlikely to be related to the true outcome (e.g. similar levels of missing data between groups with a difference of two or fewer lost individuals); however, there was substantial attrition (10% or more compared to the randomised sample), and study authors did not impute the missing data and performed an available‐case analysis (i.e. participants for whom outcomes were obtained at assessments) or a per‐protocol analysis (i.e. only participants who complied with their allocated intervention or attended a certain number of sessions), or both (Mathad 2017; Porter 2008; Warnecke 2011). Despite an imbalance in the levels of missing data between groups in Warnecke 2011, the investigators ensured that reasons for missing data were unlikely to be related to the true outcome, based on non‐significant differences in demographic and outcome variables between completers and non‐completers of the study (i.e. random loss). Missing data in the study were treated as absent and the study authors performed an available‐case analysis. Two studies did not provide sufficient information about dropouts, such as the number of participants randomised to each group or attrition by group (Chen 2018a; Waddell 2015). Based on the number of participants analysed, we presumed an available‐case or per‐protocol analysis, or both (e.g. no hypothesis testing in Chen 2018a due to dropout), and considered both studies to be at high risk of bias because of substantial dropout (≥ 10%). In four other studies at high risk of attrition bias (Barry 2019; Goldstein 2019; Miu 2016; Venieris 2017), reasons for missing data were likely to be related to the true outcome because of imbalance in missing data between groups, and an available‐case or per‐protocol analysis (or both) was conducted in all four studies. Not all studies reported reasons for missing data (e.g. Venieris 2017).

Selective reporting

To assess potential reporting bias for 22 non‐registered studies or those without a published study protocol (Anderson 2017; Barry 2019; Delaney 2016; Erogul 2014; Galante 2018; Geschwind 2015; Houston 2017; Mathad 2017; Miu 2016; Mueller 2018; Peng 2014; Porter 2008; Sahranavard 2018; Samouei 2015; Smeets 2014; Stephens 2012; Venieris 2017; Victor 2018; Waddell 2005; Waddell 2015; Wang 2012; Warnecke 2011), we considered whether the outcome measures described in the Methods section of the paper were reported in the Results section. We were unable to assess reporting bias for three additional, non‐registered studies, for which only conference abstracts were available (Chen 2018a; Goldstein 2019; Kelleher 2018).

Of the 25 non‐registered studies, we considered 19 to be free of reporting bias because the published results corresponded to those expected in these types of studies (Anderson 2017; Barry 2019; Delaney 2016; Erogul 2014; Geschwind 2015; Houston 2017; Mathad 2017; Miu 2016; Mueller 2018; Peng 2014; Porter 2008; Sahranavard 2018; Smeets 2014; Stephens 2012; Venieris 2017; Victor 2018; Waddell 2005; Wang 2012; Warnecke 2011). We rated four studies (including three reported as conference abstracts) at unclear risk of reporting bias (Chen 2018a; Goldstein 2019; Kelleher 2018; Waddell 2015); in Waddell 2015, although the published report seemed to include all expected outcomes, the reported assessment at time four was not further specified (i.e. it is unclear whether it is a 12‐month follow‐up period or a post‐test assessment). We judged two studies to be at high risk of bias, largely because not all prespecified outcomes were reported (Galante 2018; Samouei 2015).

Five studies were prospectively or retrospectively registered (Akbari 2017; ISRCTN64217625; Kötter 2016; Mejia‐Downs 2016; Recabarren 2019). Of these registered studies, we considered one study to be at low risk of reporting bias as the (unpublished) report included all expected outcomes in the prespecified way (Mejia‐Downs 2016); no full text (dissertation) was available for this study, but we considered the risk of reporting bias to be low based on the Results section, which was provided by the study author in response to our email request. For one registered trial (ISRCTN64217625) we could not determine the risk of reporting bias on the basis of trial registration, as the study was completed but unpublished and no further information was provided by the study authors during the publication process. We judged three registered studies to be at high risk of reporting bias because not all of the prespecified outcomes (Recabarren 2019) or time points (Akbari 2017; Kötter 2016) were reported. According to the study authors, Recabarren 2019 was only the first publication for this study, so future publications reporting the other prespecified outcomes are possible.

Effects of interventions

See: Summary of findings 1 Resilience interventions versus control conditions for healthcare students

See: summary of findings Table 1.

Overall, across the included studies in healthcare students, we were able to perform 14 pooled analyses that combined at least two studies.

We analysed effects on all primary outcomes at immediate post‐intervention and at short‐term follow‐up (except for well‐being or quality of life). No meta‐analyses were possible for any of the primary outcomes at medium‐term or long‐term follow‐up. For the secondary outcomes, we performed meta‐analyses for social support, optimism, self‐efficacy and positive emotions at post‐intervention, and social support at short‐term follow‐up. For several secondary outcomes, i.e. active coping, self‐esteem and positive emotions, only single‐study results were available at short‐term follow‐up (three months or less). No secondary outcome was measured at medium‐ or long‐term follow‐up.

We present the different outcome measures that we used to assess the primary and secondary outcomes in the included studies in Table 2 and Table 3, respectively. For the primary outcomes of resilience and well‐being or quality of life, as well as all secondary outcomes (social support, optimism, self‐efficacy, active coping, self‐esteem, hardiness, positive emotions), positive values indicate a higher (i.e. better) level of the corresponding outcome in the intervention group compared to the control group (e.g. higher resilience), whereas negative values refer to lower levels of the respective outcome in the intervention arm. For the remaining primary outcomes of anxiety, depression and stress or stress perception, negative values indicate a lower (i.e. better) level of these outcomes in the intervention arm (e.g. fewer depressive symptoms) compared to the control arm, while positive values refer to a higher level of depression, anxiety and stress or stress perception in the intervention group compared to control.

We report exact P values where provided by the study authors, unless P values are lower than 0.001, in which case they are expressed as P < 0.001. We rounded the t values and P values of Egger's tests.

Resilience interventions versus control conditions in healthcare students

Primary outcomes
Resilience

Post‐intervention

Thirteen studies evaluated the effect of resilience interventions compared to control groups on resilience at immediate post‐intervention.

Nine studies reported data suitable for quantitative analysis (Anderson 2017; Barry 2019; Erogul 2014; Houston 2017; Mathad 2017; Mueller 2018; Peng 2014; Stephens 2012; Wang 2012), including two studies with mixed samples (Barry 2019; Houston 2017) for which we obtained subgroup data for healthcare students by contacting the study authors. The pooled effect estimate suggests evidence of a moderate effect of resilience interventions on resilience at post‐intervention (standardised mean difference (SMD) 0.43, 95% confidence interval (CI) 0.07 to 0.78; P = 0.02; 9 studies, 561 participants; I2 = 75%; Tau2 = 0.21; P for heterogeneity < 0.001; G2 = 55.9%; 95% prediction interval: −0.54 to 1.41; Analysis 1.1; very‐low certainty evidence, see summary of findings Table 1).

For resilience at post‐intervention, we found no evidence of asymmetry based on funnel plots and Egger’s test (t = −1.42; df = 7; P = 0.20; see Appendix 15 and Appendix 16).
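
Egger's regression test, as applied here and for the other funnel‐plot assessments, regresses each study's standard normal deviate on its precision and tests whether the intercept differs from zero. A minimal sketch with hypothetical study estimates (illustrative values only, not data from this review; requires SciPy 1.7 or later for the `intercept_stderr` attribute):

```python
import numpy as np
from scipy import stats

# Hypothetical study effects (SMDs) and standard errors -- illustrative only
smd = np.array([0.10, 0.35, 0.60, 0.80, 0.25, 0.45, 0.15])
se = np.array([0.20, 0.15, 0.25, 0.30, 0.18, 0.22, 0.12])

# Egger's regression: standard normal deviate against precision;
# the intercept estimates funnel-plot asymmetry
res = stats.linregress(1 / se, smd / se)
t_val = res.intercept / res.intercept_stderr
p_val = 2 * stats.t.sf(abs(t_val), df=len(smd) - 2)  # two-sided test of the intercept
```

An intercept significantly different from zero would suggest small‐study effects such as reporting bias.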

Indicators of statistical heterogeneity were mixed, with I2 and Chi2 values indicating substantial to considerable heterogeneity, while G2 suggested moderate heterogeneity.
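
The pooled SMDs, Tau2, I2 and 95% prediction intervals reported throughout this section follow standard random‐effects meta‐analysis. A minimal sketch with hypothetical per‐study estimates (illustrative values only, not data from this review; we assume the DerSimonian‐Laird estimator of between‐study variance, the default inverse‐variance random‐effects approach in RevMan):

```python
import numpy as np
from scipy import stats

# Hypothetical per-study standardised mean differences and standard errors
# (illustrative values only, not data from this review).
smd = np.array([0.10, 0.35, 0.60, 0.80, 0.25])
se = np.array([0.20, 0.15, 0.25, 0.30, 0.18])

# Inverse-variance weights and Cochran's Q statistic
w = 1 / se**2
fixed = np.sum(w * smd) / np.sum(w)
q = np.sum(w * (smd - fixed)**2)
df = len(smd) - 1

# DerSimonian-Laird estimate of between-study variance (Tau^2)
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - df) / c)

# Random-effects pooled SMD with 95% confidence interval
w_re = 1 / (se**2 + tau2)
pooled = np.sum(w_re * smd) / np.sum(w_re)
se_pooled = np.sqrt(1 / np.sum(w_re))
ci = pooled + np.array([-1.0, 1.0]) * 1.96 * se_pooled

# I^2: percentage of total variability attributable to heterogeneity
i2 = max(0.0, (q - df) / q) * 100

# 95% prediction interval for the effect in a new study (t with k - 2 df)
t_crit = stats.t.ppf(0.975, len(smd) - 2)
pi = pooled + np.array([-1.0, 1.0]) * t_crit * np.sqrt(tau2 + se_pooled**2)
```

Because the prediction interval incorporates Tau2 as well as the pooled standard error, it is always wider than the confidence interval, which is why heterogeneous analyses above show prediction intervals spanning zero even where the CI does not.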

Single‐study results

Four studies also measuring resilience at post‐intervention could not be pooled with the studies above for the following reasons: for one unpublished study (ISRCTN64217625) we could not obtain the data from the study authors (i.e. trial completed, but publication process still ongoing). The same applied to two studies only available as conference abstracts (Chen 2018a; Kelleher 2018). Without indicating statistical values, Kelleher 2018 (sample size not specified) reported higher resilience scores at post‐intervention in the intervention group compared to the control group. For another study graphically reporting the results for resilience (Delaney 2016), we could not obtain the quantitative data from the study authors. However, Delaney 2016 (probably 37 participants included) reported no evidence of a difference in resilience between intervention and control groups at post‐test.

Short‐term follow‐up (≤ 3 months)

Seven individually‐randomised studies, including one study with a mixed sample (Victor 2018), assessed the effect of resilience‐training programmes versus control groups on resilience at short‐term follow‐up. We were able to combine the data from four of these studies, including one mixed‐sample study with available subgroup data (Victor 2018), in a meta‐analysis (Mejia‐Downs 2016; Stephens 2012; Victor 2018; Wang 2012). The pooled SMD for resilience was 0.20 (95% CI −0.44 to 0.84; P = 0.53; 4 studies, 209 participants; I2 = 80%; Tau2 = 0.34; P for heterogeneity = 0.002; G2 = 99.7%; 95% prediction interval: −2.66 to 3.07; Analysis 1.2), suggesting little or no evidence of a difference between resilience training and control.

The statistical values indicated substantial to considerable heterogeneity for this outcome.

Single‐study results

Three studies also measuring resilience at short‐term follow‐up could not be pooled with the studies above, for the following reasons: we could not obtain data by contacting the authors of two studies available as conference abstracts (Chen 2018a; Kelleher 2018). One of these (Kelleher 2018; sample size not specified) reported higher resilience scores for the resilience intervention compared to the control group at one‐month follow‐up. Samouei 2015 reported statistical values (e.g. means), but the number of participants randomised to and analysed in each group was not specified and not available from the study authors. The investigators found no significant difference between intervention and control groups for resilience at three‐month follow‐up (P = 0.27).

Medium‐term follow‐up (> 3 to ≤ 6 months)

One of three studies comparing a resilience intervention to control at medium‐term follow‐up provided suitable data for quantitative analysis (Erogul 2014). Using the Resilience Scale (RS‐14; range 14 (worst) to 98 (best); Wagnild 1993) in 57 participants, Erogul 2014 reported no statistically significant difference between the intervention group (mean = 82.4; SD = 9.8) and the no‐intervention group (mean = 77.3; SD = 12.5) at six‐month follow‐up (P = 0.12). Similarly, the calculated mean difference (MD) for this outcome showed little or no evidence of a difference between resilience intervention and control at medium‐term follow‐up (MD 5.10, 95% CI −0.72 to 10.92; P = 0.09; Analysis 1.3).
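
The mean difference and its confidence interval follow directly from the group summary statistics. A sketch of the Erogul 2014 calculation, assuming a 28/29 split of the 57 participants (the split is our assumption for illustration; only the total is given above):

```python
import math
from scipy import stats

# Group summaries from Erogul 2014 (RS-14); the group sizes of 28 and 29
# are an assumption for illustration -- only the total (57) is reported.
m_int, sd_int, n_int = 82.4, 9.8, 28
m_ctl, sd_ctl, n_ctl = 77.3, 12.5, 29

md = m_int - m_ctl                                     # mean difference
se = math.sqrt(sd_int**2 / n_int + sd_ctl**2 / n_ctl)  # standard error of MD
lo, hi = md - 1.96 * se, md + 1.96 * se                # 95% confidence interval
z = md / se
p = 2 * stats.norm.sf(abs(z))                          # two-sided P value
# md = 5.10; 95% CI roughly -0.72 to 10.92; P about 0.09
```

Under this assumed split, the sketch reproduces the MD 5.10 (95% CI −0.72 to 10.92; P = 0.09) reported for Analysis 1.3.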

Single‐study results

For one study presenting only graphical results for resilience (Delaney 2016), we could not obtain the numerical data from the study authors. The same applied to one unpublished study (ISRCTN64217625). Delaney 2016 (probably 37 participants included) showed no evidence of a difference between resilience training and control at four‐month follow‐up.

Mental health and well‐being: anxiety

Post‐intervention

Eight studies (including four with mixed samples: Barry 2019; Goldstein 2019; Houston 2017; Recabarren 2019) evaluated the effect of resilience interventions compared to controls on self‐reported anxiety at immediate post‐intervention. Seven studies reported data suitable for quantitative analysis (Barry 2019; Houston 2017; Kötter 2016; Recabarren 2019; Sahranavard 2018; Wang 2012; Warnecke 2011), including three studies with mixed samples for which subgroup data in healthcare students were available (Barry 2019; Houston 2017; Recabarren 2019). The pooled effect estimate for 362 participants suggests evidence for a moderate effect of resilience training on post‐intervention anxiety (SMD −0.45, 95% CI −0.84 to −0.06; P = 0.02; I2 = 66%; Tau2 = 0.17; P for heterogeneity = 0.008; G2 = 99.3%; Analysis 1.4; 95% prediction interval: −2.09 to 1.13; very‐low certainty evidence, see summary of findings Table 1).

Based on funnel plots and Egger’s test, we found no statistically significant asymmetry for anxiety at post‐intervention (Egger’s test: t = −1.61; df = 5; P = 0.17; see Appendix 15 and Appendix 16).

The statistical indicators of heterogeneity suggest there is substantial (I2 and Chi2) or considerable heterogeneity (G2) in the results for anxiety at post‐test.

Studies with mixed samples

One study with a mixed sample (Goldstein 2019; 45 participants analysed in total sample) measured anxiety at post‐intervention. However, the (subgroup) results for healthcare students were reported neither in the conference abstract nor in the poster, and we could not obtain them from the study authors.

Short‐term follow‐up (≤ 3 months)

At short‐term follow‐up, three studies compared the impact of resilience interventions versus controls on anxiety. We were able to combine the data from two studies with a total of 91 participants in a meta‐analysis (Porter 2008; Wang 2012). The pooled SMD for short‐term, self‐reported anxiety was −0.88 (95% CI −1.32 to −0.45; P < 0.001; I2 = 0%; Tau2 = 0; P for heterogeneity = 0.80; G2 = 0%; 95% prediction interval: incalculable due to only two studies; Analysis 1.5), revealing evidence of a large difference between groups in favour of resilience training for this outcome.

We detected no statistical heterogeneity for anxiety at short‐term follow‐up.

Studies with mixed samples

We were unable to pool the data from one study measuring anxiety at short‐term follow‐up in a mixed sample with the data from the above studies, due to unavailable subgroup data for healthcare students (Goldstein 2019; 45 participants analysed in total sample).

Mental health and well‐being: depression

Post‐intervention

Seven studies (including four mixed studies: Barry 2019; Goldstein 2019; Houston 2017; Recabarren 2019) assessed the effect of resilience interventions versus controls on self‐reported depression (or burnout; see Helmreich 2017 and Appendix 6 in this review) at post‐intervention. For three studies investigating healthcare and non‐healthcare students, we were able to retrieve the relevant subgroup data from the study authors (Barry 2019; Houston 2017; Recabarren 2019). Analysis of six studies (332 participants) providing data suitable for pooling (Barry 2019; Houston 2017; Kötter 2016; Recabarren 2019; Wang 2012; Warnecke 2011) suggested little or no evidence for a difference between resilience training and control group for post‐intervention depression (SMD −0.20, 95% CI −0.52 to 0.11; P = 0.20; I2 = 45%; Tau2 = 0.07; P for heterogeneity = 0.11; G2 = 99.1%; 95% prediction interval: −1.16 to 0.74; Analysis 1.6; very‐low certainty evidence, see summary of findings Table 1).

We found no indication of asymmetry for depression immediately post‐intervention (see Appendix 15 and Appendix 16; Egger’s test: t = −0.85; df = 4; P = 0.44).

The results for statistical heterogeneity were mixed, with no important heterogeneity indicated by Chi2 test, moderate heterogeneity indicated by the I2 value, and the G2 value suggesting considerable heterogeneity.

Studies with mixed samples

One study with a mixed sample (Goldstein 2019; 45 participants analysed in total sample) provided only a narrative report of improvements in self‐reported depression for the resilience‐training programme. We were not able to obtain the (subgroup) results for healthcare students from the study authors.

Short‐term follow‐up (≤ 3 months)

Five studies (including two studies with mixed samples: Goldstein 2019; Victor 2018) evaluated the effect of resilience training compared to controls on self‐reported depression at short‐term follow‐up. A meta‐analysis of four studies (including one mixed‐sample study with available subgroup data: Victor 2018) that could be combined (Miu 2016; Porter 2008; Victor 2018; Wang 2012) revealed evidence of a moderate difference between groups, in favour of resilience training for this outcome (SMD −0.65, 95% CI −1.26 to −0.04; P = 0.04; 4 studies, 226 participants; I2 = 76%; Tau2 = 0.28; P for heterogeneity = 0.006; G2 = 99.7%; 95% prediction interval: −2.90 to 1.62; Analysis 1.7; moderate effect size).

Based on statistical indicators, we found substantial to considerable heterogeneity (I2, Chi2 test and G2) for depression at short‐term follow‐up.

Studies with mixed samples

One study with a mixed sample (Goldstein 2019) measured the effects of a resilience intervention versus control on depression at short‐term follow‐up but could not be pooled with the above studies. Goldstein 2019 (45 participants analysed in total sample) provided only a narrative report of improvements in self‐reported depression for the resilience‐training programme, and we were not able to obtain the (subgroup) results for this time point from the study authors.

Mental health and well‐being: stress or stress perception

Post‐intervention

Eleven studies (three with mixed samples: Barry 2019; Goldstein 2019; Houston 2017) evaluated the effect of resilience interventions compared to control groups on self‐reported stress symptoms or the subjective perception of stress at immediate post‐intervention. We obtained the relevant subgroup data from the study authors for two studies involving both healthcare and non‐healthcare students (Barry 2019; Houston 2017), resulting in seven studies that could be combined (Barry 2019; Erogul 2014; Houston 2017; Kötter 2016; Mathad 2017; Stephens 2012; Warnecke 2011). The pooled effect estimate suggests evidence for a small effect of resilience interventions on stress or stress perception at post‐intervention (SMD −0.28, 95% CI −0.48 to −0.09; P = 0.004; 7 studies, 420 participants; I2 = 0%; Tau2 = 0; P for heterogeneity = 0.58; G2 = 99.1%; 95% prediction interval: −0.74 to 0.15; Analysis 1.8; very‐low certainty evidence, see summary of findings Table 1).

Based on funnel plots and Egger’s test, we found no statistically significant asymmetry for stress or stress perception at post‐test (see Appendix 15 and Appendix 16; Egger’s test: t = −1.55; df = 5; P = 0.18).

We found mixed results for heterogeneity in stress or stress perception at post‐test, with three values (I2, Tau2, Chi2 test) indicating no heterogeneity, while G2 suggested considerable heterogeneity.

Single‐study results

Three studies also measuring stress or stress perception at post‐intervention could not be pooled with the studies above, for the following reasons. For two studies available only as conference abstracts (Chen 2018a; Kelleher 2018), we could not obtain the data from the authors. Kelleher 2018 (sample size not specified) reported lower stress scores at post‐intervention in the intervention group compared to the control group, but did not indicate statistical values. Delaney 2016 reported the results for perceived stress graphically and we could not retrieve the quantitative data from the study authors; however, Delaney 2016 (probably 37 participants included) reported no evidence of a difference in perceived stress between intervention and control groups at post‐test.

Studies with mixed samples

Another study with a mixed sample measuring perceived stress at post‐intervention could not be pooled with the studies above, due to unavailable subgroup data (Goldstein 2019). For the total sample, including medical and nursing students (45 participants analysed), the study authors reported evidence of a between‐group difference for changes in perceived stress between baseline and post‐intervention (intervention arm: −24.6% average change, P = 0.017, d = −0.58; control arm: 0.8% average change, P > 0.05).

Short‐term follow‐up (≤ 3 months)

At short‐term follow‐up, five studies (including Goldstein 2019 with a mixed sample) compared the impact of resilience training versus controls on self‐reported stress or stress perception. We combined two studies reporting data suitable for quantitative analysis (Mejia‐Downs 2016; Stephens 2012). Analysis of these studies (113 participants) suggests little or no evidence for a difference between groups in stress or stress perception within three months post‐intervention (SMD 0.13, 95% CI −0.79 to 1.06; P = 0.78; I2 = 83%; Tau2 = 0.37; P for heterogeneity = 0.02; G2 = 0%; 95% prediction interval: incalculable due to only two studies; Analysis 1.9).

The findings for statistical heterogeneity for stress or stress perception at short‐term follow‐up were also mixed, with some values (I2, Chi2) indicating substantial to considerable heterogeneity, while no heterogeneity was suggested by G2.

Single‐study results

Two additional studies measuring stress or stress perception at short‐term follow‐up and available as conference abstracts (Chen 2018a; Kelleher 2018) could not be pooled with the above studies, since we were unable to obtain the relevant data from the study authors. Of these, Kelleher 2018 (sample size not specified) reported lower stress scores for the resilience intervention compared to the control group at one‐month follow‐up.

Studies with mixed samples

As at immediate post‐intervention, data from Goldstein 2019, who examined a mixed sample of medical and nursing students along with students from other fields (45 participants analysed in total sample), could not be combined with the other studies. The study authors identified evidence of a between‐group difference in changes in perceived stress between baseline and three‐month follow‐up in the total sample (intervention arm: −22.3% average change, P = 0.002, d = −0.84; control arm: −10.5% average change, P > 0.05).

Medium‐term follow‐up (> 3 to ≤ 6 months)

Two studies reported on stress or stress perception at medium‐term follow‐up (Delaney 2016; Erogul 2014), with quantitative data available for only one study (Erogul 2014). Using the Perceived Stress Scale (PSS; range = 0 (best) to 40 (worst); Cohen 2012) in 57 participants, the study authors reported no evidence for a difference between resilience training (mean = 14.9; SD = 6.6) and no intervention (mean = 18.4; SD = 6.9) at six‐month follow‐up (P = 0.08). The calculated MD indicated evidence for a difference in favour of the resilience intervention at medium‐term follow‐up (MD −3.50, 95% CI −7.00 to 0.00; P = 0.05; Analysis 1.10).

Single‐study results

One study measuring perceived stress at this time point could not be combined with Erogul 2014, as numerical data were not available (Delaney 2016). At four‐month follow‐up, Delaney 2016 (probably 37 participants included) provided a narrative report of no evidence of a difference between the resilience intervention and control for perceived stress.

Mental health and well‐being: well‐being or quality of life

Post‐intervention

At post‐intervention, five studies (including two with mixed samples: Goldstein 2019; Recabarren 2019) assessed the effect of resilience interventions compared to controls on self‐reported well‐being or quality of life. Including one mixed‐sample study for which we obtained subgroup data for healthcare students from the study authors (Recabarren 2019), four studies (251 participants) provided data suitable for quantitative analysis (Mathad 2017; Recabarren 2019; Smeets 2014; Wang 2012). The analysis revealed little or no evidence of an effect of resilience training (SMD 0.15, 95% CI −0.14 to 0.43; P = 0.31; I2 = 23%; Tau2 = 0.02; P for heterogeneity = 0.27; G2 = 99.9%; 95% prediction interval: −0.90 to 1.20; Analysis 1.11; very‐low certainty evidence, see summary of findings Table 1).

There was no statistical indication of asymmetry for well‐being or quality of life at post‐intervention (see Appendix 15 and Appendix 16; Egger’s test: t = 0.12; df = 2; P = 0.91).

We found mixed results for statistical heterogeneity. While I2 suggested unimportant heterogeneity, G2 indicated considerable heterogeneity.

Studies with mixed samples

One mixed‐sample study (Goldstein 2019), which examined healthcare and non‐healthcare students, could not be included in the meta‐analysis of well‐being or quality of life at post‐test, as relevant subgroup data were not available. Goldstein 2019 (45 participants analysed in total sample) provided a narrative report of improvements in self‐reported life satisfaction for resilience training compared to control.

Short‐term follow‐up (≤ 3 months)

Two studies (including one with a mixed sample: Goldstein 2019) evaluated the effect of resilience training compared to controls on self‐reported well‐being or quality of life at short‐term follow‐up, with quantitative data available for only one study (Wang 2012). Using the General Well‐Being Schedule (GWB; range = 0 (worst) to 110 (best)), the study authors reported higher well‐being scores in the intervention group (mean = 78.00; SD = 8.90) compared to the control group (mean = 69.60; SD = 7.20) at three‐month follow‐up, with a significant time × group interaction (F = 5.25; P < 0.01). Similarly, the calculated MD indicated evidence for a difference in favour of resilience training at this time point (MD 8.40, 95% CI 4.54 to 12.26; P < 0.001; Analysis 1.12).

Studies with mixed samples

One study with a mixed sample also compared the effects of a resilience intervention to control on well‐being or quality of life at short‐term follow‐up (Goldstein 2019), but could not be combined in analysis due to unavailable subgroup data. Except for correlations between life satisfaction and other outcomes, Goldstein 2019 (45 participants analysed in total sample) provided only a narrative report of improvements in self‐reported life satisfaction for resilience training compared to control in a sample of university students (including medical and nursing students).

Adverse events

Only four studies assessed the adverse or undesired effects of resilience training in healthcare students (Galante 2018; Kötter 2016; Victor 2018; Warnecke 2011), with three of them reporting no such effects (Galante 2018; Victor 2018; Warnecke 2011).

Galante 2018 reported no participants with adverse reactions related to self‐harm, suicidality or harm to others. The study authors also determined the number of adverse events as the number of participants exceeding cut‐off scores on the risk subscales of the Clinical Outcomes in Routine Evaluation Outcome Measure (CORE‐OM; Evans 2000) for psychological distress. In the intervention arm, 20 participants triggered this adverse event reporting protocol compared to 25 participants in the control arm. For the total sample (n = 482), the study authors found a lower score of psychological distress in the intervention arm (mean = 0.88; SD = 0.53) compared to the control group (mean = 1.04; SD = 0.54), with a significant difference in favour of the intervention arm (P = 0.001).

Victor 2018 did not systematically assess adverse events, but no negative effects were mentioned by the participants in verbal feedback. Similarly, according to Warnecke 2011, no adverse effects of the intervention were reported in the study, although the method of assessment is unclear. Kötter 2016 also measured adverse events but did not provide the relevant data in the available publication. Most studies in healthcare students provided no data on adverse effects.

Secondary outcomes
Resilience factors: social support

Post‐intervention

Two studies (including one with a mixed sample: Recabarren 2019) reported on perceived social support at post‐intervention. Recabarren 2019 provided subgroup data for psychology students, which we pooled with data from Stephens 2012. The analysis indicated little or no evidence of a difference in social support at post‐test (SMD 0.21, 95% CI −0.15 to 0.57; P = 0.25; I2 = 0%; Tau2 = 0; P for heterogeneity = 0.83; 2 studies, 121 participants; G2 = 0%; 95% prediction interval: incalculable due to only two studies; Analysis 1.13).

We found no heterogeneity for social support at post‐intervention, based on statistical indicators.

Short‐term follow‐up (≤ 3 months)

We combined data from two studies (Porter 2008; Stephens 2012), to estimate the effects of a resilience intervention compared to control on social support at short‐term follow‐up. The pooled SMD of social support across the studies was 0.23 (95% CI −0.18 to 0.64; P = 0.28; I2 = 0%; Tau2 = 0; P for heterogeneity = 0.96; 2 studies, 92 participants; G2 = 0%; 95% prediction interval: incalculable due to only two studies; Analysis 1.14), suggesting little or no evidence for an effect of a resilience intervention on social support within three months post‐intervention.

There was no statistical heterogeneity for social support at short‐term follow‐up.

Single‐study results

One additional study assessed social support at short‐term follow‐up (Mejia‐Downs 2016). We could not pool this study with the others because we could not estimate the SDs from the interquartile ranges due to skewed distribution (see Higgins 2019b). The study authors reported the median social support using the Social Provisions Scale (Cutrona 1987) in 43 participants at two‐week follow‐up: 83.0 (interquartile range = 17.0) in the intervention arm and 85.0 (interquartile range = 13.0) in the control arm (study author‐reported P > 0.05, no further detail available).
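
For approximately normal data, Cochrane guidance (Higgins 2019b) allows SDs to be approximated from interquartile ranges, since the IQR of a normal distribution spans about 1.35 SDs. A sketch of the conversion we would otherwise have applied to the Mejia‐Downs 2016 medians, shown only to illustrate why it was not valid here:

```python
# Normal-theory approximation (Cochrane Handbook): for roughly normal data,
# the interquartile range spans about 1.35 standard deviations, so SD ~ IQR / 1.35.
iqr_intervention = 17.0  # Social Provisions Scale, intervention arm
iqr_control = 13.0       # control arm

sd_intervention = iqr_intervention / 1.35  # ~12.6
sd_control = iqr_control / 1.35            # ~9.6
# With skewed data, as reported for this study, the approximation is
# unreliable, which is why the study could not be pooled.
```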

Resilience factors: optimism

Post‐intervention

At post‐intervention, two studies reported the effects of resilience interventions compared to controls on self‐reported optimism (Barry 2019; Smeets 2014). We combined the data from these studies in a meta‐analysis, which revealed little or no evidence for an effect of resilience training compared to control (SMD 0.29, 95% CI −0.20 to 0.78; P = 0.24; I2 = 0%; Tau2 = 0; P for heterogeneity = 0.54; 2 studies, 66 participants; G2 = 0%; 95% prediction interval: incalculable due to only two studies; Analysis 1.15).

There was no indication of heterogeneity for optimism at post‐test in any statistical values.

Short‐term follow‐up (≤ 3 months)

Single‐study results

Two studies assessed the effect of resilience‐training programmes versus controls on optimism at short‐term follow‐up (Mejia‐Downs 2016; Samouei 2015), but could not be pooled in meta‐analysis. Because of skewed distribution of data for optimism, we could not compute the SDs on the basis of interquartile ranges for one study (Mejia‐Downs 2016; see Higgins 2019b). The study authors reported the median optimism using the Life Orientation Test‐Revised (Scheier 1994): 25.0 (interquartile range = 8.0) in the intervention arm and 25.0 (interquartile range = 5.0) in the control arm (study author‐reported P > 0.05, no further details available; 43 participants). Samouei 2015 reported statistical values (e.g. means), but the number of participants randomised and analysed in each group was not specified and not available from the study authors. The investigators identified no significant difference between intervention and control groups for optimism at three‐month follow‐up (P = 0.23).

Resilience factors: self‐efficacy

Post‐intervention

Five individually‐randomised studies assessed the effect of resilience interventions compared to controls on self‐reported self‐efficacy at immediate post‐intervention (including two studies with mixed samples: Barry 2019; Recabarren 2019). We were able to retrieve the subgroup data for the two studies with mixed samples at post‐test from the study authors, resulting in five studies providing data suitable for quantitative analysis (Barry 2019; Recabarren 2019; Sahranavard 2018; Smeets 2014; Waddell 2015). The analysis revealed evidence for a moderate difference in favour of resilience training for self‐efficacy at post‐test (SMD 0.51, 95% CI 0.14 to 0.88; P = 0.008; I2 = 41%; Tau2 = 0.07; P for heterogeneity = 0.15; 5 studies, 219 participants; G2 = 74.5%; 95% prediction interval: −0.96 to 1.96; Analysis 1.16).

The statistical values for self‐efficacy at post‐test indicated moderate (I2) to substantial (G2) heterogeneity.

Short‐term follow‐up (≤ 3 months)

At short‐term follow‐up, only Waddell 2005 reported quantitative data on self‐efficacy. After phase two in this study with 20 participants, the study authors reported higher scores for career decision‐making self‐efficacy, assessed using the Career Decision‐Making Self‐Efficacy Scale (CDMSES; Betz 1996; Taylor 1983; range not specified; higher values indicate higher self‐efficacy), for the intervention arm (mean = 112.0) compared to the control arm (mean = 110.4), with no significant between‐group difference (t = 0.19; P = 0.85). The calculated MD also indicated little or no evidence for an effect of training on self‐efficacy (MD 1.60, 95% CI −14.65 to 17.85; P = 0.85; 1 study, 20 participants; Analysis 1.17).

Single‐study results

One study reporting on self‐reported self‐efficacy at short‐term follow‐up could not be pooled with Waddell 2005. Samouei 2015, for which the number of participants randomised and analysed in each group was unclear, reported no significant difference between the intervention and control groups for self‐efficacy at three‐month follow‐up (P = 0.36).

Resilience factors: active coping

Post‐intervention

One mixed‐sample study (Houston 2017) assessed the effect of a resilience intervention compared to control on the resilience factor of active coping at immediate post‐intervention. We received the relevant subgroup data from the investigators. Using the newly‐created 'taking action' subscale, based on original items of the Brief Coping Orientations to Problems Experience scale (Brief COPE; range 1 (worst) to 4 (best); Carver 1997) in 38 participants, the study authors found lower values of active coping (taking action) in the intervention arm (mean = 3.06; SD = 0.50) compared to the control arm (mean = 3.13; SD = 0.65) in healthcare students at post‐test. The calculated MD indicated little or no evidence for an effect of resilience training (MD −0.06, 95% CI −0.45 to 0.32; P = 0.74; 1 study, 38 participants; Analysis 1.18).

Short‐term follow‐up (≤ 3 months)

Only one study compared the effects of resilience training to control on active coping at short‐term follow‐up (Porter 2008). Based on data from 22 participants and using the 'planful problem‐solving' scale of the Ways of Coping Questionnaire (WOC; Folkman 1988; range not specified; higher values indicate higher planful problem‐solving), Porter 2008 reported higher values of active coping in the intervention group (mean = 1.78; SD = 0.43) compared to the control group (mean = 1.32; SD = 0.54) at two‐month follow‐up, with a significant time × group interaction (F = 13.20; P < 0.006). The calculated MD suggested evidence for a difference between resilience training and control in favour of training (MD 0.46, 95% CI 0.05 to 0.87; P = 0.03; 1 study, 22 participants; Analysis 1.19).

Resilience factors: self‐esteem

Post‐intervention

Only one study in a mixed sample, for which subgroup data were not available, reported on self‐reported self‐esteem at immediate post‐intervention (Goldstein 2019). For the total sample (45 participants analysed), the study authors only provided a narrative report of improvements in self‐esteem, and we therefore could not calculate an MD.

Short‐term follow‐up (≤ 3 months)

At short‐term follow‐up, two studies with mixed samples compared the effect of resilience interventions to controls on self‐esteem (Goldstein 2019; Victor 2018). The study authors of Victor 2018 provided the subgroup data for psychology students. Using the Rosenberg Self‐Esteem Scale (RSES; Ferring 1996; range not specified; higher values indicate higher self‐esteem) in 28 participants, the study authors found slightly higher values for self‐esteem in the intervention arm (mean = 2.42; SD = 0.45) compared to attention control (mean = 2.34; SD = 0.51) at three‐week follow‐up. The calculated MD of 0.08 (95% CI −0.28 to 0.44; P = 0.67; 1 study, 28 participants; Analysis 1.20) indicated little or no evidence for an effect of resilience intervention on this outcome.

Studies with mixed samples

In a mixed sample of medical and nursing students, Goldstein 2019 measured self‐esteem at three‐month follow‐up and provided a narrative report of improvements in this outcome (45 participants analysed in total sample). Subgroup data were not available.

Resilience factors: hardiness

Post‐intervention

One study assessed the effects of hardiness training compared to wait‐list control on hardiness at post‐intervention (Sahranavard 2018). The Ahvaz Hardiness Inventory (AHI; Kiamarthi 1998) was used to measure hardiness in 30 participants (15 in each group). However, given the possible range of scores for this outcome measure (0 (worst) to 81 (best) for individual scores; i.e. 0 to 1215 for sum scores across the 15 participants in each group), the reported values for hardiness at post‐test did not seem plausible (intervention arm: mean = 175.80 (SD = 6.00); control arm: mean = 167.80 (SD = 13.06)). We therefore decided against calculating the MD for this study.

Resilience factors: positive emotions

Post‐intervention

Five studies (including Geschwind 2015, who used a mixed sample) assessed the effect of resilience interventions compared to controls on self‐reported positive emotions at immediate post‐intervention. Two studies (Peng 2014; Smeets 2014) provided data suitable for quantitative analysis (112 participants). The pooled effect estimate revealed a moderate effect on positive emotions at post‐test in favour of resilience training (SMD 0.51, 95% CI 0.01 to 1.01; P = 0.05; I2 = 43%; Tau2 = 0.06; P for heterogeneity = 0.19; 2 studies, 112 participants; G2 = 0%; 95% prediction interval incalculable due to only two studies; Analysis 1.21).

There were mixed findings for statistical heterogeneity for positive emotions at post‐test. While I2 suggested moderate heterogeneity, there was no indication of heterogeneity using G2.
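For readers unfamiliar with the pooling method, a random‐effects meta‐analysis of the kind reported above can be sketched with the standard DerSimonian‐Laird estimator, which also yields Tau² and I². The effect sizes and variances below are hypothetical placeholders, not the actual study‐level data from Peng 2014 and Smeets 2014.

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling of study effect sizes
    (e.g. SMDs): returns pooled effect, its SE, tau^2 and I^2 (%)."""
    w = [1 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # between-study variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, tau2, i2

# Hypothetical SMDs and variances for two small trials
pooled, se, tau2, i2 = random_effects_pool([0.30, 0.75], [0.07, 0.08])
```

When tau² is zero the random‐effects weights reduce to the fixed‐effect weights, which is why the fixed‐effect sensitivity analyses reported later can serve as a robustness check on these pooled estimates.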

Single‐study results

Two other studies also measured the effects of resilience interventions compared to controls on positive emotions at post‐intervention (Akbari 2017; Sahranavard 2018), but could not be pooled with the aforementioned studies. Although Akbari 2017 (number of participants analysed not specified) reported a significant effect of resilience training compared to no intervention in increasing happiness (Oxford Happiness Questionnaire; Alipour 1993; Hills 2002; higher values indicate more happiness) at post‐test (F = 22.412, P < 0.001), the report indicated lower happiness values in the intervention arm (mean = 27.82; SD = 2.25) than in the control arm (mean = 42.11; SD = 2.25). We contacted the study authors but were unable to resolve this uncertainty. Sahranavard 2018 only presented combined results for positive and negative affect (assessed with the Positive and Negative Affect Schedule; PANAS; Watson 1988) and reported a significant effect of condition on this outcome (F = 4.96, P = 0.035) in a covariance analysis (30 participants). We could not obtain separate data for positive affect after contacting the study authors.

Studies with mixed samples

In Geschwind 2015, the subgroup data for psychology students were not available. For the total sample of 50 participants (i.e. psychology students and healthy volunteers), the study authors reported a significant time × condition interaction (F = 6.632, P = 0.002), with significantly higher positive affect in the intervention arm compared to the control arm after affect induction (t = 3.369, P = 0.002).

Short‐term follow‐up (≤ 3 months)

At short‐term follow‐up, one study compared the effects of a resilience intervention to wait‐list control on self‐reported positive emotions (Mejia‐Downs 2016). Using the positive affect subscale of the Modified Differential Emotion Scale (mDES; Fredrickson 2003; range not specified; higher values indicate more positive affect) in 43 participants, the study authors reported a significant time × group interaction for positive emotions (F = 5.73; P = 0.02), with higher values in the intervention arm (mean = 3.20; SD = 0.47) compared to the control arm (mean = 3.01; SD = 0.65) at two‐week follow‐up. The calculated MD suggested little or no evidence for an effect of resilience training at short‐term follow‐up (MD 0.19, 95% CI −0.15 to 0.53; P = 0.27; 1 study, 43 participants; Analysis 1.22).

Studies with mixed samples

Geschwind 2015 also assessed the effects of a resilience intervention compared to an attention control group at short‐term follow‐up. We could not pool the data from this study with the above study, as we could not obtain the subgroup data for psychology students from the study authors. For the total sample (50 participants), the effect of positive affect induction compared to control on positive affect was shown to be maintained at 20‐minute follow‐up (t = 2.053, P = 0.047).

Sensitivity analyses

We performed five sensitivity analyses using fixed‐effect pair‐wise meta‐analysis for the primary outcomes at post‐intervention. For each outcome, the results were consistent with the findings from the random‐effects meta‐analyses.

Resilience

Post‐intervention

We found evidence for a moderate difference in favour of resilience training (SMD 0.52, 95% CI 0.36 to 0.69; P < 0.001; 9 studies, 561 participants; Analysis 1.23).

Anxiety

Post‐intervention

We found evidence for a moderate difference in favour of resilience training (SMD −0.35, 95% CI −0.57 to −0.14; P = 0.001; 7 studies, 362 participants; Analysis 1.24).

Depression

Post‐intervention

We found little or no evidence for an effect of resilience training (SMD −0.18, 95% CI −0.40 to 0.04; P = 0.12; 6 studies, 332 participants; Analysis 1.25).

Stress or stress perception

Post‐intervention

We found evidence for a small difference in favour of resilience training (SMD −0.28, 95% CI −0.48 to −0.09; P = 0.004; 7 studies, 420 participants; Analysis 1.26).

Well‐being or quality of life

Post‐intervention

We found little or no evidence for an effect of resilience training (SMD 0.14 95% CI −0.10 to 0.39; P = 0.25; 4 studies, 251 participants; Analysis 1.27).

Discussion

Summary of main results

We identified 30 RCTs that fulfilled the inclusion criteria for this review, eight of which were conducted in mixed samples.

There is very‐low certainty evidence (meaning that the true effect may differ markedly from the estimated effect) that resilience interventions might be more effective than control for improving resilience, self‐reported symptoms of anxiety, and stress or stress perception at post‐test. Effect sizes ranged from small to moderate. We found little or no evidence for an effect of training on depressive symptoms and well‐being or quality of life at post‐intervention. At short‐term follow‐up (three months or less post‐intervention), the effect size for the reduction in anxiety symptoms increased from moderate to large. We also found some evidence for a moderate effect in favour of resilience training on depressive symptoms, and a single study provided evidence of an increase in well‐being. The moderate effect for resilience and the small effect for stress or stress perception found at post‐test were no longer evident at short‐term follow‐up. At medium‐term follow‐up (more than three months to six months or less), we no longer found evidence for a difference between a resilience intervention and control for resilience, while a single study still provided evidence for a decrease in stress symptoms. Anxiety, depression, and well‐being or quality of life were not measured at medium‐term follow‐up by any study. Long‐term follow‐up assessments (more than six months post‐intervention) were not available for any primary outcome.

For secondary outcomes, based on evidence from single studies, we found some evidence for moderate effects in favour of resilience training on self‐efficacy and positive emotions at post‐intervention, which were not maintained at short‐term follow‐up. While there was no evidence for an effect of training on active coping at post‐test, we found evidence for an effect in favour of resilience training at short‐term follow‐up. There was no evidence for a difference between training and control for social support (at post‐intervention or within three months post‐intervention), optimism (at post‐intervention only), or self‐esteem (at short‐term follow‐up only). Hardiness was not measured at short‐term follow‐up. None of the secondary outcomes were assessed at medium‐term or long‐term follow‐up.

The planned subgroup analyses to test for possible effect modifiers were not possible, due to the limited number of studies. The same applied to the planned sensitivity analyses to examine the robustness of the conclusions of this review, the exception being the sensitivity analyses using a fixed‐effect model for the primary outcomes at post‐test. Compared with the main analyses, the calculation of fixed‐effect instead of random‐effects pairwise meta‐analyses showed no changes in the evidence found.

Overall, we found very‐low certainty evidence in this review, meaning that we can draw no clear conclusions.

Overall completeness and applicability of evidence

The review highlights some issues about the completeness and applicability of the evidence for the effects of resilience interventions in healthcare students (for details, see Appendix 17).

Participants

Since stress‐related mental disorders are more prevalent in women (Kuehner 2017; Li 2017; WHO 2019) and since women report lower resilience (e.g. Kunzler 2018), the high proportion of women among the study participants may be explained by a higher interest among women to participate in resilience interventions. The applicability of the findings of this review to men may be limited, since gender differences in the prevalence of stress‐related mental disorders may reflect differences in biological vulnerability, social roles, or stress reactivity (Nazroo 1998; Verma 2011; WHO 2019), thereby possibly causing a different effect of resilience training in men and women.

The included studies mainly considered young individuals, but this was to be expected, given that the population of interest in this review is healthcare students.

Students in allied health professions were less represented, restricting the applicability of our findings to these fields of study.

The clinical relevance of mental symptoms at baseline, i.e. whether symptom load justified a diagnosis of a mental disorder, is unclear for most studies. However, a clear picture of the participants' baseline mental health would be important, as the large effect sizes in some studies (e.g. Wang 2012) might be explained in part by the inclusion of participants with a pre‐existing burden of mental symptoms or even clinical diagnoses.

The evidence was concentrated in North America, Europe and Asia (including the Near East), with only two studies from Australia. The applicability of the findings to other locations and ethnicities (e.g. South America, Africa, Oceania) therefore remains unclear. Twenty‐four of the 30 included studies were conducted in high‐income countries (e.g. USA), five in upper‐middle‐income countries (e.g. China), and only one in a lower‐middle‐income country. We therefore also advise some caution about the cross‐cultural applicability of the evidence.

In summary, the findings may be most applicable to young adult, female healthcare students, living in high‐income countries.

Interventions

Although the benefits of online‐ and mobile‐based interventions (e.g. 24/7 availability) have recently been discussed (Cuijpers 2017; Heber 2017; Heron 2010), we identified only four studies delivered in this format. Furthermore, the interventions ranged from high to low intensity, with treatment durations varying considerably. Theoretical approaches were relatively balanced between mindfulness‐based training, unspecified interventions and combinations. Overall, the findings of this review are mostly applicable to group interventions of high intensity, delivered face‐to‐face, and using a mindfulness‐based theoretical approach.

Comparators

The predominant use of no‐intervention and, in particular, wait‐list controls in the evidence found here is problematic, since these control groups have been shown to yield inflated effect sizes compared to active comparators in psychotherapy research (Mohr 2014). The evidence does not allow us to examine this issue, because we were not able to conduct the relevant subgroup analysis (see Table 1).

Outcomes

There were a large number of different measures for resilience in the studies (see Table 2). We were not able to investigate the potential effect of the underlying concept of resilience in these scales (see Table 1).

A large variety of assessments was also used for the primary outcomes of mental health and well‐being (e.g. burnout and depression scales for depression; see Helmreich 2017). This diversity of measures has to be considered as a potential source of heterogeneity in our meta‐analyses, and might have an impact on the interpretation of results.

Although factors such as social support are discussed as well‐evidenced resilience factors (see Helmreich 2017), relatively few of the included studies assessed these outcomes at the different follow‐up periods.

Adverse or undesired effects were not specified in most included studies, with three studies reporting no adverse or undesired effects. For psychotherapy, however, possible adverse outcomes have been discussed (Berk 2009; Moritz 2019). As resilience interventions often include confronting participants with individual problems, some of these training programmes might also have the potential to harm certain participants.

Lastly, very few studies had medium‐term follow‐up assessments and no study performed long‐term follow‐up, which might be explained by the students' restricted time in universities and schools, and general difficulties in establishing long‐term outcomes. Our ability to examine whether any benefits of resilience interventions are sustained in the medium‐ and long‐term was therefore also limited.

Quality of the evidence

Using the GRADE approach (Schünemann 2013; Schünemann 2019a), we rated the overall certainty of evidence at post‐intervention for all primary outcomes as very low for several reasons. First, important methodological limitations reduced the certainty of the evidence offered by most included studies. There was unclear and high risk of bias for several domains across the studies; notably, there was a high risk of bias in blinding of participants and personnel and loss to follow‐up, and unclear risk of bias for methods of sequence generation, allocation concealment and blinding of outcome assessment. Selective outcome reporting was also occasionally an issue. Second, three outcomes had moderate (I2 > 30%; depression) or substantial (I2 > 50%; resilience, anxiety) levels of unexplained heterogeneity and only partially overlapping CIs, leading to inconsistency. Third, for all (primary) outcomes at post‐intervention, the evidence was indirect, as studies were limited to certain participants (e.g. certain fields of health professional study), particular versions of resilience intervention (e.g. group setting, high training intensity) and certain comparators (e.g. no intervention, waiting list). Finally, due to the small number of participants included in the meta‐analyses for anxiety, depression and well‐being or quality of life (fewer than 400 participants), inconsistent messages about the 95% CI for the intervention effect (depression, well‐being or quality of life), and the 95% CI encompassing both a very small treatment effect and crossing the threshold for appreciable benefit of the intervention (resilience, anxiety), imprecision was a problem for four outcomes at post‐intervention. However, in the case of post‐intervention resilience and anxiety, we did not downgrade for imprecision. Rather, we downgraded only for inconsistency, as the substantial heterogeneity (I2 = 75% and 66%, respectively) for these outcomes might also have affected the CIs (i.e. precision) and we did not wish to double‐downgrade for the same problem.

We did not downgrade for publication bias for any of the primary outcomes at post‐intervention. Despite the small number of studies per meta‐analysis (fewer than 10 studies; see Assessment of reporting biases), inspection of funnel plots (see Appendix 16) and Egger's test revealed no statistical or visual evidence of asymmetry (see also Effects of interventions and Appendix 15). The funnel plots were symmetrical in shape and, where available, the results from grey literature did not differ from other published studies in the (non‐)evidence or the direction of effect. Due to the scarcity of larger studies across the primary outcomes at post‐test, a small‐study effect was difficult to assess and cannot be ruled out completely. Nevertheless, an overestimation of effects in smaller studies seemed unlikely, since the meta‐analyses mostly included small studies with both significant and non‐significant results. Although the evidence was largely based on small studies, there was almost no indication of potential conflicts of interest of relevance for the post‐test meta‐analyses, except for one study (Kötter 2016) included in the meta‐analyses for anxiety, depression and stress or stress perception (see Appendix 15).
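Egger's test, used above to check funnel‐plot asymmetry, regresses each study's standardized effect on its precision; an intercept far from zero suggests a small‐study effect. The sketch below uses made‐up numbers, not data from this review, and omits the significance test on the intercept.

```python
def egger_intercept(effects, ses):
    """Intercept of Egger's regression: standardized effect (y/se)
    regressed on precision (1/se) by ordinary least squares."""
    x = [1 / s for s in ses]                       # precision
    y = [e / s for e, s in zip(effects, ses)]      # standardized effect
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx

# Four hypothetical studies sharing one true effect (0.5) with differing
# standard errors: no small-study effect, so the intercept is ~0
b0 = egger_intercept([0.5, 0.5, 0.5, 0.5], [0.1, 0.2, 0.3, 0.4])
```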

Regarding adverse events, several GRADE domains (e.g. precision, inconsistency, publication bias) could not be assessed due to the small number of studies documenting any adverse effects of study participation (e.g. by verbal feedback from participants; Galante 2018; Victor 2018; Warnecke 2011; for Kötter 2016: adverse events measured, but not reported). Based on the narrative reporting in these studies, we downgraded the certainty of the evidence for this outcome for study limitations and indirectness.

Overall, the GRADE certainty rating was very low for all primary outcomes at post‐intervention, which means that there is a high degree of uncertainty about the estimates of effect observed. Future research in this area is very likely to substantially impact the effect estimates of resilience interventions.

Search methods

Appendix 18 includes further information on how potential biases in the search methods were prevented in this review.

Except for five completed, but unpublished studies (Chen 2018a; Goldstein 2019; ISRCTN64217625; Kelleher 2018; Mejia‐Downs 2016), we were able to retrieve the full texts for all included studies. In accordance with the CDPLP Editorial Team, we considered alternative sources (e.g. trial register entries) for these five studies. In 22 cases, we did not receive any reply from the study authors (i.e. eligibility criteria not verifiable due to unavailable full text or alternative information), or the responses were inadequate and did not provide sufficient information to enable us to reach a decision about the eligibility of the studies (see Characteristics of studies awaiting classification). We attempted to conduct a comprehensive search; however, the fact that 16 studies have not yet been incorporated, and will only be added in the update of this review, could be considered a potential source of bias.

Correspondence with the authors about data analysis was required for 25 included studies. For six studies for which we aimed to double‐check the available information (e.g. amount of missing data; per‐protocol analysis) by contacting the authors, we decided to rely on the reports and to include the studies in the meta‐analyses despite the missing response (Anderson 2017; Miu 2016; Sahranavard 2018; Smeets 2014; Waddell 2005; Waddell 2015). For three studies (Chen 2018a; ISRCTN64217625; Kelleher 2018), we received information that no data could be provided as the studies were completed and in the process of analysis or publication. For one study (Venieris 2017) the authors responded, but relevant subgroup data could not be retrieved, since the data had been collected several years ago and were saved on another computer. The primary investigators of three studies responded to our first enquiry, but did not react to a second enquiry (Geschwind 2015; Goldstein 2019) or were not able to provide the relevant subgroup data at the time of data analysis (Galante 2018).

Post hoc changes

We made a post hoc change to the eligibility criteria for the Types of interventions (see Differences between protocol and review) by subsequently limiting the study selection to interventions that explicitly stated the aim of fostering resilience, hardiness or post‐traumatic growth. Although the change raises the possibility of bias in the review process, we felt it was necessary to guarantee highly objective eligibility criteria and transparency. We do not believe that this divergence from the protocol (Helmreich 2017) introduces a serious bias. Due to the focus on interventions mentioning at least one of the three terms, general health‐promoting interventions (e.g. well‐being therapy, chronic disease self‐management, self‐management training after negative life events) not meeting this criterion were excluded from this review. However, other psychological interventions for healthcare students, which may ultimately be more economical than the theoretical approaches found in this review, might also foster mental health despite stressors (i.e. resilience), even though they are not labelled as 'resilience training'.

We also made a post hoc change to the eligibility criteria for Types of participants (see Differences between protocol and review) by limiting the included studies to healthcare students. Although the change raises the possibility of bias, we felt it was necessary because the restriction to healthcare students guarantees a systematic review with sufficiently homogeneous comparisons.

Further potential biases

Even within each type of theoretical foundation, there was some clinical heterogeneity in terms of intervention setting, delivery or intensity. However, as there is still no consensus or 'gold standard' on how to design resilience‐training programmes, which has led to a variety of approaches (see previous reviews, e.g. Leppin 2014), we decided to pool the data. We took this decision because this review had a larger evidence base than previous meta‐analyses, but we were not able to perform the planned subgroup analyses to investigate potential explanations for heterogeneity.

Beyond the five main results for the primary outcomes at post‐test, the large number of pooled analyses in this review might have increased the probability of a type I error, potentially leading to false‐positive results.

Another important limitation of this review is the unknown stressor or risk exposure in the included studies (see Implications for research). Although the health professional education might be associated with substantial stressors among participants of the included studies, a proven risk or stressor exposure was not applied as an inclusion criterion for this review (see Types of participants), only potential stressor exposure. Based on the definition of resilience (Windle 2011a), the effects of resilience interventions on resilience cannot be determined without ensuring a significant risk. The missing assessment of stressor exposure is a general problem in resilience intervention research (Chmitorz 2018). For healthcare students in particular, the stressor exposure might also vary at different time points of training (e.g. more stressors in year one versus year four or vice versa, due to expectations or fears about the transition to professional life). The students' limited time in institutions (e.g. university) should also be considered. As the number of potential risks or stressors (i.e. stressor load) is naturally restricted to the years of training, healthcare students might be exposed to fewer stressors than groups experiencing the same stressors over a longer period of time (e.g. healthcare professionals).

Agreements and disagreements with other studies or reviews

Studies or reviews in different clinical and non‐clinical adult populations

As mentioned under Why it is important to do this review, the efficacy of resilience interventions for adult populations has been previously examined in 13 systematic reviews and six meta‐analyses, including a recent Cochrane Review by our group on resilience interventions in healthcare professionals (Kunzler 2020). Overall, the reviews largely found positive effects of resilience training on different outcomes (e.g. resilience, mental health, physical health, performance); however, many review authors have pointed out the need for further research, due to elements such as the low methodological quality of the primary studies. Many of the reviews also considered study designs other than RCTs (e.g. Bauer 2018; Massey 2019), and focused on certain target groups (e.g. Milne 2016; Pallavicini 2016; Pesantes 2015; Petriwskyj 2016), or certain forms of intervention (e.g. Deady 2017). The number of RCTs on resilience training specifically was therefore rather limited, making comparisons with our review difficult.

Some of the previous reviews used broader eligibility criteria (e.g. clinical and non‐clinical individuals, employees) and identified more RCTs than other reviews (Joyce 2018; Macedo 2014; Leppin 2014; Robertson 2015; Vanhove 2016). Our review is focused on healthcare students, which is different from the mixed target groups in the previous reviews. Despite varying inclusion criteria, the findings of our review mostly agree with the previous research, although our review is based on evidence from a larger group of studies. For example, Macedo 2014 (seven RCTs in non‐clinical adult samples), whilst not pooling any data, identified some degree of effectiveness of resilience‐training programmes. Similarly, Robertson 2015 (eight RCTs in employees) found indications of benefits for personal resilience, mental health, well‐being and work performance in employees. With the exception of job performance, which was not examined here, these findings were confirmed by our review. With respect to the positive effects for resilience at post‐test, our review is consistent with and even showed evidence of a larger (moderate) effect than Joyce 2018 (17 RCTs in adults), who found a small positive effect of training on resilience at immediate post‐intervention. However, compared with Leppin 2014 (25 RCTs in diverse adult populations and persons with chronic diseases), who also found a moderate effect in favour of resilience training for up to three months after the end of training, our review suggested little or no evidence for a maintained positive effect for resilience at short‐term follow‐up. In contrast to Vanhove 2016, who identified positive effects on well‐being and psychological deficits (e.g. depressive symptoms) within one month post‐intervention, we found little or no evidence for an effect on these outcomes immediately post‐test. 
However, the maintained positive effects for anxiety and the delayed effect on depression between post‐test and short‐term follow‐up in our review are comparable to Vanhove 2016, who, as well as the positive effects at one‐month follow‐up or less, also observed sustained effects of training for the prevention of psychological deficits at more than one month after training. In general, our findings on mental health differ from Leppin 2014, who found no evidence for an effect of training for mental health outcomes (depression, quality of life) aside from resilience. Due to the limited number of studies in our review (fewer than 10 studies per meta‐analysis; Deeks 2019), we were not able to replicate the findings of previous reviews for effect modifiers such as training setting (Vanhove 2016), theoretical foundation (Joyce 2018), or study comparator (Leppin 2014).

Compared to our review on healthcare professionals (Kunzler 2020), this review delivers similar findings for healthcare students, although we identified a smaller number of studies for individuals in health professional education (30 RCTs) compared with studies in healthcare professionals who have completed training (44 RCTs). The moderate positive effect on resilience immediately after training, which we identified in this review, is consistent with Kunzler 2020. However, while the positive effect on resilience was maintained in the short term (three months or less after training) in healthcare professionals, we could not replicate this finding in healthcare students. The same applies to symptoms of stress or perceived stress: while we found evidence for a small, positive effect of training on post‐intervention stress in healthcare students, with no evidence of an effect at short‐term follow‐up and only a single study measuring this outcome at medium‐term follow‐up, there was a moderate, positive effect on post‐test stress in healthcare professionals, which was also sustained over time. Similar to our findings in healthcare professionals (i.e. increase from a small to a moderate positive effect size for depression between post‐test and short‐term follow‐up), we observed a similar delayed effect on depression in healthcare students, with evidence for a moderate, positive effect on depressive symptoms emerging only in the short‐term. In contrast with our review on healthcare professionals (no evidence of any effect), we found evidence for a positive effect of resilience interventions on healthcare students' well‐being or quality of life at short‐term follow‐up (single study), as well as on symptoms of anxiety at post‐test, which for anxiety were maintained at short‐term follow‐up. Comparable with our findings in Kunzler 2020, resilience factors (i.e. secondary outcomes in both reviews) were hardly assessed in healthcare students. 
Finally, several methodological weaknesses (e.g. paucity of medium‐ and long‐term follow‐ups), that we identified for RCTs in healthcare staff were also found in this review or were even more evident here (see Implications for research). We therefore judged the certainty of the evidence to be very uncertain for both reviews.

Studies or reviews in healthcare students

Five systematic reviews (Gilmartin 2017; McGowan 2016; Pezaro 2017; Rogers 2016; Sanderson 2017) and one meta‐analysis (Lo 2018) have synthesised the efficacy of resilience‐training programmes for healthcare students to date, although not all of them have focused solely on interventions (see Why it is important to do this review). Comparable with our review, two of these previous publications examined healthcare students in general (Lo 2018; Sanderson 2017), but most only targeted a subgroup of healthcare students (e.g. nursing and midwifery students; McGowan 2016) or a combination of qualified staff and (certain) students (Gilmartin 2017; Rogers 2016; Pezaro 2017). Similar to the reviews described above, most of the previous reviews in healthcare students (e.g. Lo 2018) also considered study designs other than RCTs. The number of RCTs on resilience training is therefore rather limited (i.e. 0 to 24 RCTs among 5 to 36 included studies in the six reviews), in contrast with our review, which identified 30 RCTs across various groups of healthcare students. Since the review questions of some of the six reviews did not focus solely on the construct of resilience or on intervention studies, the primary studies included in these reviews did not always explicitly mention the intention of fostering resilience. Instead, broader mental health interventions (e.g. Gilmartin 2017) were also considered, which renders comparisons with our review difficult.

Our review is most comparable with McGowan 2016 and Rogers 2016, who included educational interventions to promote resilience (McGowan 2016), or considered (qualitative) research covering educational interventions and resilience (Rogers 2016), with Rogers 2016 considering different groups of healthcare students in addition to healthcare professionals. McGowan 2016 identified no RCTs, and Rogers 2016 found only one RCT in healthcare students (Peng 2014), which we also included in our review. Comparable with Rogers 2016, we also identified Steinhardt 2008, but excluded it due to an 'ineligible population' based on information obtained from the study authors that they did not target healthcare students. Furthermore, we also identified several non‐RCTs found in McGowan 2016 (e.g. Jameson 2014; Judkins 2005b) during the study identification process for our review.

Lo 2018, the only previous meta‐analysis, included 24 RCTs (19 in the meta‐analysis) on any group intervention to enhance or maintain mental health in healthcare students, but identified only two studies that explicitly stated the intention of fostering resilience (Erogul 2014; Porter 2008), both of which are included in this review. The meta‐analyses of mental health (depression, anxiety), burnout and stress symptoms, which the study authors calculated separately for different theoretical foundations (e.g. mindfulness‐based training) and which yielded some positive effect sizes (e.g. stress reduction through mindfulness interventions compared to control), cannot readily be contrasted with the findings of this review.

Figure 1. Study flow diagram for all searches.

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 4. Study flow diagram for first searches (January 1990 to October 2016). aOne ongoing study in the box above is now awaiting classification.

Figure 5. Study flow diagram for second searches (October 2016 onwards). aPeng 2014; Galante 2018.

Figure 6. Contour‐enhanced funnel plot of comparison 1: Resilience intervention vs control, healthcare students, Resilience: post‐intervention.

Figure 7. Contour‐enhanced funnel plot of comparison 1: Resilience intervention vs control, healthcare students, Anxiety: post‐intervention.

Figure 8. Contour‐enhanced funnel plot of comparison 1: Resilience intervention vs control, healthcare students, Depression: post‐intervention.

Figure 9. Contour‐enhanced funnel plot of comparison 1: Resilience intervention vs control, healthcare students, Stress or stress perception: post‐intervention.

Figure 10. Contour‐enhanced funnel plot of comparison 1: Resilience intervention vs control, healthcare students, Well‐being or quality of life: post‐intervention.

Comparison 1. Resilience interventions versus control conditions in healthcare students: primary and secondary outcomes.

Analysis 1.1. Resilience: post‐intervention
Analysis 1.2. Resilience: short‐term follow‐up (≤ 3 months)
Analysis 1.3. Resilience: medium‐term follow‐up (> 3 to ≤ 6 months)
Analysis 1.4. Anxiety: post‐intervention
Analysis 1.5. Anxiety: short‐term follow‐up (≤ 3 months)
Analysis 1.6. Depression: post‐intervention
Analysis 1.7. Depression: short‐term follow‐up (≤ 3 months)
Analysis 1.8. Stress or stress perception: post‐intervention
Analysis 1.9. Stress or stress perception: short‐term follow‐up (≤ 3 months)
Analysis 1.10. Stress or stress perception: medium‐term follow‐up (> 3 to ≤ 6 months)
Analysis 1.11. Well‐being or quality of life: post‐intervention
Analysis 1.12. Well‐being or quality of life: short‐term follow‐up (≤ 3 months)
Analysis 1.13. Social support: post‐intervention
Analysis 1.14. Social support: short‐term follow‐up (≤ 3 months)
Analysis 1.15. Optimism: post‐intervention
Analysis 1.16. Self‐efficacy: post‐intervention
Analysis 1.17. Self‐efficacy: short‐term follow‐up (≤ 3 months)
Analysis 1.18. Active coping: post‐intervention
Analysis 1.19. Active coping: short‐term follow‐up (≤ 3 months)
Analysis 1.20. Self‐esteem: short‐term follow‐up (≤ 3 months)
Analysis 1.21. Positive emotions: post‐intervention
Analysis 1.22. Positive emotions: short‐term follow‐up (≤ 3 months)
Analysis 1.23. Resilience: post‐intervention, sensitivity analysis (fixed‐effect analysis)
Analysis 1.24. Anxiety: post‐intervention, sensitivity analysis (fixed‐effect analysis)
Analysis 1.25. Depression: post‐intervention, sensitivity analysis (fixed‐effect analysis)
Analysis 1.26. Stress or stress perception: post‐intervention, sensitivity analysis (fixed‐effect analysis)
Analysis 1.27. Well‐being or quality of life: post‐intervention, sensitivity analysis (fixed‐effect analysis)
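The primary analyses above pool per‐study standardised mean differences by inverse‐variance weighting, with random‐effects models as the main approach and fixed‐effect models as sensitivity analyses (Analyses 1.23 to 1.27). As an illustrative sketch only — the review's own calculations were done with dedicated meta‐analysis software, not this code — DerSimonian‐Laird random‐effects pooling can be written as:

```python
import math

def pool_dl(effects, ses):
    """DerSimonian-Laird random-effects pooling of study effect sizes.
    Returns the pooled effect, its standard error, and the
    between-study variance estimate tau^2."""
    w = [1 / s ** 2 for s in ses]            # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    # Cochran's Q measures observed heterogeneity around the fixed-effect mean
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)            # method-of-moments tau^2, floored at 0
    # random-effects weights add tau^2 to each study's variance
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, tau2
```

With tau^2 = 0 (homogeneous studies) the result coincides with the fixed‐effect estimate, which is why the fixed‐effect sensitivity analyses matter mainly when heterogeneity is substantial, as it was for several outcomes here (e.g. I² = 75% for resilience).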

Summary of findings 1. Resilience interventions versus control conditions for healthcare students


Patient or population: healthcare students, including students in training for health professions delivering direct medical care (e.g. medical students, nursing students), and allied health professions as distinct from medical care (e.g. psychology students, social work students); aged 18 years and older, irrespective of health status

Setting: any setting of health professional education (e.g. medical school, nursing school, psychology or social work department at university)

Intervention: any psychological intervention focused on fostering resilience or the related concepts of hardiness or post‐traumatic growth by strengthening well‐evidenced resilience factors that are thought to be modifiable by training (see Appendix 3), irrespective of content, duration, setting or delivery mode

Comparison: no intervention, wait‐list control, treatment as usual (TAU), active control, attention control

For each outcome, the anticipated absolute effect* (95% CI) in the intervention groups relative to the control groups is shown, together with the number of participants (studies), the certainty of the evidence (GRADE) and comments.

Resilience
Measured by: investigators measured resilience using different instruments; higher scores mean higher resilience
Timing of outcome assessment: post‐intervention
The mean resilience score in the intervention groups was, on average, 0.43 standard deviations higher (0.07 higher to 0.78 higher); 561 participants (9 RCTs); certainty of the evidence: ⊕⊝⊝⊝ very lowa. An SMD of 0.43 represents a moderate effect size (Cohen 1988b)b.

Mental health and well‐being: anxiety
Measured by: investigators measured anxiety using different instruments; lower scores mean lower anxiety
Timing of outcome assessment: post‐intervention
The mean anxiety score in the intervention groups was, on average, 0.45 standard deviations lower (0.84 lower to 0.06 lower); 362 participants (7 RCTs); certainty of the evidence: ⊕⊝⊝⊝ very lowc. An SMD of 0.45 represents a moderate effect size (Cohen 1988b)b.

Mental health and well‐being: depression
Measured by: investigators measured depression using different instruments; lower scores mean lower depression
Timing of outcome assessment: post‐intervention
The mean depression score in the intervention groups was, on average, 0.20 standard deviations lower (0.52 lower to 0.11 higher); 332 participants (6 RCTs); certainty of the evidence: ⊕⊝⊝⊝ very lowd. An SMD of 0.20 represents a small effect size (Cohen 1988b)b.

Mental health and well‐being: stress or stress perception
Measured by: investigators measured stress or stress perception using different instruments; lower scores mean lower stress or stress perception
Timing of outcome assessment: post‐intervention
The mean stress or stress perception score in the intervention groups was, on average, 0.28 standard deviations lower (0.48 lower to 0.09 lower); 420 participants (7 RCTs); certainty of the evidence: ⊕⊝⊝⊝ very lowe. An SMD of 0.28 represents a small effect size (Cohen 1988b)b.

Mental health and well‐being: well‐being or quality of life
Measured by: investigators measured well‐being or quality of life using different instruments; higher scores mean higher well‐being or quality of life
Timing of outcome assessment: post‐intervention
The mean well‐being or quality of life score in the intervention groups was, on average, 0.15 standard deviations higher (0.14 lower to 0.43 higher); 251 participants (4 RCTs); certainty of the evidence: ⊕⊝⊝⊝ very lowf. An SMD of 0.15 represents a small effect size (Cohen 1988b)b.

Adverse events
There were no adverse events reported in association with study participation in 3 of 4 studies measuring potential adverse events.g 566 participants (3 RCTs)h; certainty of the evidence: ⊕⊝⊝⊝ very lowi.
*The risk in the intervention group (and its 95% CI) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; RCT: randomised controlled trial; SMD: standardised mean difference.

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect

aDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of performance, detection and attrition bias), by one level due to unexplained inconsistency (I2 = 75%), and by one level due to indirectness (studies limited to certain interventions (e.g. group setting, face‐to‐face delivery, moderate and high intensity, unspecified theoretical foundation) and comparators (no intervention, wait‐list)).

bAccording to Cohen 1988b, a standardised mean difference (SMD) of 0.2 represents a small difference (i.e. small effect size), 0.5 a moderate difference, and 0.8 a large difference.
cDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection and attrition bias, high risk of performance bias), by one level due to unexplained inconsistency (I2 = 66%), and by one level due to indirectness (studies limited to certain participants (medical students), interventions (e.g. group setting, moderate and high intensity) and comparators (no intervention, wait‐list)).
dDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection bias, high risk of performance and attrition bias), by one level due to unexplained inconsistency (I2 = 45%), by one level due to indirectness (studies limited to certain participants (medical students), interventions (e.g. group and individual setting, low and high intensity) and comparators (no intervention, wait‐list)), and by two levels due to imprecision (< 400 participants; 95% CI wide and inconsistent).
eDowngraded by two levels due to study limitations (unclear risk of selection bias, high and unclear risk of detection bias, high risk of performance, attrition and reporting bias), and by one level due to indirectness (studies limited to certain participants (medical and nursing students), interventions (group and individual setting, low and high intensity, mindfulness and unspecific theoretical foundation) and comparators (no intervention, wait‐list)).
fDowngraded by two levels due to study limitations (unclear risk of selection and detection bias, high and unclear risk of attrition bias, high risk of performance bias), by one level due to indirectness (studies limited to certain interventions (group setting, face‐to‐face and combined delivery, high intensity)), and by two levels due to imprecision (< 400 participants; 95% CI wide and inconsistent).

gKötter 2016 also assessed adverse events but did not report the respective data in the report.
hFor Galante 2018, subgroup data in healthcare students were not available; number of participants in total sample at post‐test (CORE‐OM data) was 482.
iDowngraded by two levels due to study limitations (unclear risk of selection and detection bias, unclear and high risk of attrition bias, high risk of performance and other bias (no systematic and validated assessment of adverse events)), and by one level due to indirectness (studies limited to certain interventions (individual setting, face‐to‐face, mindfulness based) and comparators (TAU)).
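The Cohen 1988b benchmarks used in the comments column (footnote b) can be applied mechanically. As an illustrative sketch — not the review's own software — the following computes a standardised mean difference from group summaries and labels it with the nearest benchmark, which matches the labels used in the table above (e.g. 0.43 as moderate, 0.28 as small):

```python
import math

def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Cohen's d with a pooled SD)."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

def cohen_label(d):
    """Nearest Cohen 1988 benchmark: 0.2 small, 0.5 moderate, 0.8 large."""
    d = abs(d)
    if d < 0.35:       # midpoint between 0.2 and 0.5
        return "small"
    if d < 0.65:       # midpoint between 0.5 and 0.8
        return "moderate"
    return "large"
```

For example, `cohen_label(0.43)` returns "moderate" and `cohen_label(0.28)` returns "small", consistent with the interpretations in the summary of findings table.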

Table 1. Unused methods table

Method

Approach planned for analysis

Reason for non‐use

Measures of treatment effect

Dichotomous data
We had planned to analyse dichotomous outcomes by calculating the risk ratio (RR) of a successful outcome (i.e. improvement in relevant variables) for each trial. We had intended to express uncertainty in each result using 95% confidence intervals (CIs).

No study provided relevant dichotomous data for any of the primary or secondary outcomes included in this review.
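Had dichotomous data been available, the planned risk ratio with 95% CI would typically be computed on the log scale. A minimal sketch, assuming non‐zero event counts in both arms (the review itself would have used dedicated meta‐analysis software, not this code):

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio with a 95% CI computed on the log scale.
    Assumes non-zero event counts in both arms."""
    rr = (events_t / n_t) / (events_c / n_c)
    # standard error of log(RR)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    z = 1.96  # normal quantile for a 95% CI
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

For instance, 10/50 events with intervention versus 20/50 with control gives RR = 0.5 with a CI that excludes 1, i.e. a "successful outcome" (improvement) half as likely — or, depending on outcome coding, an adverse state half as frequent — in the intervention arm.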

Unit of analysis issues

Cluster‐randomised trials
In cluster‐randomised trials, if the clustering is ignored and the unit of analysis is different from the unit of allocation (‘unit‐of‐analysis error’) (Whiting‐O'Keefe 1984), P values may be artificially small and may result in false‐positive conclusions (Higgins 2019c). Had we encountered such cases, we would have accounted for the clustering in the data and followed the recommendations given in the literature (Higgins 2019c; White 2005). For those cluster‐randomised trials that did not report correct standard errors, we would have tried to recover correct standard errors by applying the usual formula for the variance inflation factor 1 + (M ‐ 1) ICC, where M is the average cluster size and ICC the intra‐cluster correlation coefficient (Higgins 2019c). If it had not been possible to extract ICC values from the study, we would have used the ICC of all cluster‐randomised trials in our review that investigated the same primary outcome scale in a similar setting. If this was not available, we would have used the average ICC of all other cluster‐randomised trials in our review. If no such studies were available, we would have used ICC = 0.05 as a mildly conservative guess for the primary analysis, and conducted a sensitivity analysis using ICC = 0.10. We had also planned to conduct sensitivity analyses based on the unit of randomisation as well as the ICC estimate in cluster‐randomised trials (see Sensitivity analysis).

No cluster‐RCT was identified and included in this review.
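The variance inflation described above is a one‐line adjustment. As a sketch under the protocol's stated fallback ICCs (0.05 for the primary analysis, 0.10 for the sensitivity analysis), with an illustrative average cluster size of 21 that is not taken from any included study:

```python
import math

def inflate_se(naive_se, avg_cluster_size, icc):
    """Inflate a standard error that ignored clustering by the square root
    of the design effect 1 + (M - 1) * ICC."""
    design_effect = 1 + (avg_cluster_size - 1) * icc
    return naive_se * math.sqrt(design_effect)

# Fallback ICCs as planned in the protocol: 0.05 (primary), 0.10 (sensitivity)
se_primary = inflate_se(0.10, 21, icc=0.05)      # design effect = 2.0
se_sensitivity = inflate_se(0.10, 21, icc=0.10)  # design effect = 3.0
```

With an average cluster size of 1 (i.e. individual randomisation) the design effect is 1 and the standard error is unchanged, which is why this correction is specific to cluster‐randomised trials.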

Multiple treatment groups

Had multiple groups in a study been relevant, we would have accounted for the correlation between the effect sizes from multi‐arm studies in a pair‐wise meta‐analysis (Higgins 2019c). We would have treated each comparison between a control group and a treatment group as an independent study. We would have multiplied the standard errors of the effect estimates by an adjustment factor to account for correlation between effect estimates. In so doing, we would have acknowledged heterogeneity between different treatment groups.

For studies with multiple treatment groups, we considered only one intervention group to be relevant for the review and meta‐analyses, based on the independent judgement of two review authors. Thus, in a pair‐wise meta‐analysis, we did not have to account for the correlation between the effect sizes for multi‐arm studies.

[…] If there is an adequate evidence base, we will consider performing a network meta‐analysis (see Data synthesis).

The evidence base was insufficient to conduct a network meta‐analysis.

Dealing with missing data

If standard deviations could neither be recovered from reported results nor obtained from the authors, we would have considered single imputation by means of pooled within‐treatment standard deviations from all other studies, provided that fewer than five studies had missing standard deviations. If more than five studies had missing standard deviations, we would have performed multiple imputation on the basis of the hierarchical model fitted to the non‐missing standard deviations.

We found no studies using the same scale that had missing standard deviations. Missing standard deviations could always be recovered from alternative statistical values or be obtained from the study authors.

Data synthesis

Had a trial reported more than one resilience scale, we planned to use the scale with better psychometric qualities (as specified in Appendix 3 in Helmreich 2017), to calculate effect sizes.

All studies measuring resilience only used one resilience scale.

If a study provided data from two instruments used equally in the included RCTs, two review authors (AK, IH) would have identified the appropriate measure through discussion (compare Storebø 2020).

This did not occur in this review.

Network meta‐analyses (NMAs) would have been merely exploratory and would only have been conducted if the review results had a sufficient and adequate evidence base.

Network meta‐analyses offer the possibility of comparing multiple treatments simultaneously (Caldwell 2005). They combine both direct (head‐to‐head) and indirect evidence (Caldwell 2005; Mills 2012), by using direct comparisons of interventions within RCTs, as well as indirect comparisons across trials, on the basis of a common reference group (e.g. an identical control group) (Li 2011). A network meta‐analysis on resilience‐training programmes does not exist.

According to Mills 2012, Linde 2016 and the Cochrane Handbook (Chaimani 2019), there are three important conditions for the conduct of NMAs: transitivity, homogeneity, consistency. Had an NMA been possible, i.e. if the three conditions had been fulfilled, we would have conducted an analysis ‐ with expert statistical support as suggested by Cochrane (Chaimani 2019) ‐ using a frequentist approach in R (Rücker 2020; Viechtbauer 2010). For sensitivity analyses, we had planned to fit the same models using the restricted maximum likelihood method (Piepho 2012; Piepho 2014; Rücker 2020). We had intended to consider categorising resilience training into seven groups, based on the underlying training concept: (1) cognitive behavioural therapy, (2) acceptance and commitment therapy, (3) mindfulness‐based therapy, (4) attention and interpretation therapy, (5) problem‐solving therapy, (6) stress inoculation therapy and (7) multimodal resilience training. We may have included additional groups after conducting the full literature search. Reference groups that might have been included in the NMA were: attention control, wait‐list, treatment as usual or no intervention. We had planned to investigate inconsistency and flow of evidence in accordance with recommendations in the literature (e.g. Chaimani 2019; Dias 2010; König 2013; Krahn 2013; Krahn 2014; Lu 2006; Lumley 2002; Rücker 2020; Salanti 2008; White 2012b).

The evidence base was insufficient to conduct a network meta‐analysis.

Summary of findings

Depending on the assessment of heterogeneity and possible effect modifiers (see Subgroup analysis and investigation of heterogeneity), we would have created several ‘Summary of findings’ tables, stratified, for example, by the clinical status of the study populations or by the comparator group.

We were not able to investigate potential effect modifiers for the primary outcomes in subgroup analyses and therefore created no additional ‘Summary of findings’ tables.

Subgroup analysis and investigation of heterogeneity

Where we detected substantial heterogeneity, we had planned to examine characteristics of studies that may be associated with this diversity (Deeks 2019). The selection of potential effect modifiers was based on experiences from previous reviews (Leppin 2014; Robertson 2015; Vanhove 2016).

We had intended to perform the following subgroup analyses on our primary outcomes, if we identified 10 or more studies during the review process (Deeks 2019):

  • setting of resilience interventions (group setting vs individual setting vs combined setting);

  • delivery format of resilience interventions (face‐to‐face vs online vs bibliotherapy vs combined delivery vs mobile‐based vs delivery not specified);

  • theoretical foundation of resilience‐training programmes (CBT vs ACT vs mindfulness‐based therapy vs AIT vs problem‐solving training vs stress inoculation vs multimodal resilience training vs coaching vs positive psychology vs nonspecific resilience training);

  • comparator group in intervention studies (attention control vs wait‐list control vs TAU vs no intervention vs active control vs control group not further specified); and

  • intensity of resilience interventions (low intensity vs moderate intensity vs high intensity).

For the primary outcomes at each time point, we identified fewer than 10 studies in a pair‐wise meta‐analysis.

Sensitivity analysis

Comparable with the planned subgroup analyses, we had planned to perform sensitivity analyses if more than 10 RCTs were included in a meta‐analysis. We had intended to restrict the sensitivity analyses to the primary outcomes.
For intervention studies assessing resilience with resilience scales, we had planned to perform a sensitivity analysis on the basis of the underlying concept (state versus trait) in these measures, and to limit the analysis to scales assessing resilience as an outcome of an intervention.

To examine the impact of the risk of bias of included trials, we had intended to limit the studies included in the sensitivity analysis to those whose risk of bias was rated as low or unclear, and to exclude studies assessed at high risk of bias; for studies with low or unclear risk of bias, we had planned to conduct subgroup analyses.

We had also intended to consider the restriction to registered studies. We had planned to identify registration, both by recording whether we found a study in a trial registry and by noting whether the study author claimed to have registered it.

We had planned to perform sensitivity analyses by limiting analysis to those studies with low levels of missing data (less than 10% missing primary outcome). We had intended to limit the analysis to studies where missing data had been imputed or accounted for by fitting a model for longitudinal data, or where the proportion of missing primary outcome data was less than 10%.

We had also intended to perform sensitivity analyses based on the ICC estimate in cluster‐randomised trials that had not adjusted for clustering, by excluding cluster‐RCTs whose standard errors had not been corrected, or had been corrected only on the basis of an externally‐estimated ICC. In an additional sensitivity analysis, we had planned to replace all externally‐estimated ICCs below 0.10 with 0.10.
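For illustration, the clustering correction referred to here can be sketched with the usual 'effective sample size' adjustment; the function name and all numbers below are hypothetical, not taken from the included trials:

```python
import math

def effective_sample_size(n, mean_cluster_size, icc):
    """Adjust a cluster-RCT arm's sample size by the design effect:
    design effect = 1 + (m - 1) * ICC, where m is the average cluster size.
    Dividing n by the design effect yields the effective sample size."""
    design_effect = 1 + (mean_cluster_size - 1) * icc
    return design_effect, n / design_effect

# Hypothetical numbers: 60 students per arm in clusters (classes) of 15,
# with an externally estimated ICC of 0.10 (the floor value mentioned above)
de, eff_n = effective_sample_size(60, 15, 0.10)
```

With these assumed values, the design effect is 1 + 14 × 0.10 = 2.4, shrinking the 60 participants per arm to an effective 25; a larger assumed ICC shrinks it further, which is why the choice of ICC warrants a sensitivity analysis.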

Finally, we had intended to conduct a sensitivity analysis based on the unit of randomisation, by limiting the analysis to individually randomised trials.

Likewise, for the primary outcomes at each time point, we identified fewer than 10 studies in a pair‐wise meta‐analysis; therefore, we performed none of these planned sensitivity analyses.

This table provides details of analyses that had been planned and described in the protocol (Helmreich 2017), including revisions made at review stage, but were not used as they were not required or not feasible.

ACT: acceptance and commitment therapy; AIT: attention and interpretation therapy; CBT: cognitive‐behavioural therapy; RCT(s): randomised controlled trial(s); TAU: treatment as usual; vs: versus

Table 1. Unused methods table
Table 2. Primary outcomes: scales used

Outcomes | Number of studies | Studies and instruments
Resilience | 17 |
Anxiety | 9 |
Depression | 10 |
Stress or stress perception | 13 |
Well‐being or quality of life | 6 |

a. For depression, we preferred depression scales over burnout scales if both measures were reported.
b. Concerning Barry 2019, we included the values for the PSS‐10 in the pooled analysis, as this measure was used more often among the included studies.
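Because the included studies measured each outcome with different instruments, effects must be standardised before pooling; Cochrane reviews report the standardised mean difference as Hedges' (adjusted) g. A minimal sketch of that computation, using invented summary data rather than any included study's results:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference (Hedges' g) between intervention (group 1)
    and control (group 2), with the small-sample correction factor J."""
    # Pooled standard deviation across both groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                    # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)      # small-sample correction
    return j * d

# Illustrative (invented) values: resilience-scale means/SDs, 30 per arm
g = hedges_g(72.0, 11.0, 30, 67.0, 12.0, 30)
```

With these invented numbers g comes out near 0.43; scale-free values like this are what the pooled analyses below combine across studies.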

Table 3. Secondary outcomes: scales used

Outcomes | Number of studies | Studies and instruments
Social support (perceived) | 4 |
Optimism | 4 |
Self‐efficacy | 7 |
Active coping | 2 | Houston 2017: taking action, newly created subscale for the respective sample using original items of the Brief Coping Orientations to Problems Experienced scale (Carver 1997); Porter 2008: planful problem‐solving subscale of the Ways of Coping Questionnaire (Folkman 1988)
Self‐esteem | 2 |
Hardiness | 1 |
Positive emotions | 6 |

Comparison 1. Resilience interventions versus control conditions in healthcare students: primary and secondary outcomes

Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size
1.1 Resilience: post‐intervention | 9 | 561 | Std. Mean Difference (IV, Random, 95% CI) | 0.43 [0.07, 0.78]
1.2 Resilience: short‐term follow‐up (≤ 3 months) | 4 | 209 | Std. Mean Difference (IV, Random, 95% CI) | 0.20 [‐0.44, 0.84]
1.3 Resilience: medium‐term follow‐up (> 3 to ≤ 6 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.4 Anxiety: post‐intervention | 7 | 362 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.45 [‐0.84, ‐0.06]
1.5 Anxiety: short‐term follow‐up (≤ 3 months) | 2 | 91 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.88 [‐1.32, ‐0.45]
1.6 Depression: post‐intervention | 6 | 332 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.20 [‐0.52, 0.11]
1.7 Depression: short‐term follow‐up (≤ 3 months) | 4 | 226 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.65 [‐1.26, ‐0.04]
1.8 Stress or stress perception: post‐intervention | 7 | 420 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.28 [‐0.48, ‐0.09]
1.9 Stress or stress perception: short‐term follow‐up (≤ 3 months) | 2 | 113 | Std. Mean Difference (IV, Random, 95% CI) | 0.13 [‐0.79, 1.06]
1.10 Stress or stress perception: medium‐term follow‐up (> 3 to ≤ 6 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.11 Well‐being or quality of life: post‐intervention | 4 | 251 | Std. Mean Difference (IV, Random, 95% CI) | 0.15 [‐0.14, 0.43]
1.12 Well‐being or quality of life: short‐term follow‐up (≤ 3 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.13 Social support: post‐intervention | 2 | 121 | Std. Mean Difference (IV, Random, 95% CI) | 0.21 [‐0.15, 0.57]
1.14 Social support: short‐term follow‐up (≤ 3 months) | 2 | 92 | Std. Mean Difference (IV, Random, 95% CI) | 0.23 [‐0.18, 0.64]
1.15 Optimism: post‐intervention | 2 | 66 | Std. Mean Difference (IV, Random, 95% CI) | 0.29 [‐0.20, 0.78]
1.16 Self‐efficacy: post‐intervention | 5 | 219 | Std. Mean Difference (IV, Random, 95% CI) | 0.51 [0.14, 0.88]
1.17 Self‐efficacy: short‐term follow‐up (≤ 3 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.18 Active coping: post‐intervention | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.19 Active coping: short‐term follow‐up (≤ 3 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.20 Self‐esteem: short‐term follow‐up (≤ 3 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.21 Positive emotions: post‐intervention | 2 | 112 | Std. Mean Difference (IV, Random, 95% CI) | 0.51 [0.01, 1.01]
1.22 Positive emotions: short‐term follow‐up (≤ 3 months) | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only
1.23 Resilience: post‐intervention, sensitivity analysis (fixed‐effect analysis) | 9 | 561 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.52 [0.36, 0.69]
1.24 Anxiety: post‐intervention, sensitivity analysis (fixed‐effect analysis) | 7 | 362 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.35 [‐0.57, ‐0.14]
1.25 Depression: post‐intervention, sensitivity analysis (fixed‐effect analysis) | 6 | 332 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.18 [‐0.40, 0.04]
1.26 Stress or stress perception: post‐intervention, sensitivity analysis (fixed‐effect analysis) | 7 | 420 | Std. Mean Difference (IV, Fixed, 95% CI) | ‐0.28 [‐0.48, ‐0.09]
1.27 Well‐being or quality of life: post‐intervention, sensitivity analysis (fixed‐effect analysis) | 4 | 251 | Std. Mean Difference (IV, Fixed, 95% CI) | 0.14 [‐0.10, 0.39]
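The gap between the random‐effects estimates (e.g. analysis 1.1, SMD 0.43) and the fixed‐effect sensitivity analyses (e.g. analysis 1.23, SMD 0.52) reflects the between‐study variance term (tau²) that random‐effects pooling adds to each study's weight. A minimal inverse‐variance sketch with a DerSimonian–Laird tau² estimate, using invented effect sizes rather than the review's data:

```python
import math

def pool(effects, ses, random_effects=True):
    """Inverse-variance pooling of study effect sizes (e.g. SMDs).
    The random-effects version uses the DerSimonian-Laird tau^2 estimate."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    if random_effects:
        # Cochran's Q heterogeneity statistic, then DL tau^2 (floored at 0)
        q = sum(wi * (yi - fixed)**2 for wi, yi in zip(w, effects))
        c = sum(w) - sum(wi**2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    else:
        tau2 = 0.0
    w_star = [1 / (se**2 + tau2) for se in ses]
    est = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_est = math.sqrt(1 / sum(w_star))
    return est, (est - 1.96 * se_est, est + 1.96 * se_est)

# Invented SMDs and standard errors for four hypothetical studies
effects = [0.9, 0.2, 0.6, 0.1]
ses = [0.25, 0.20, 0.30, 0.22]
re_est, re_ci = pool(effects, ses, random_effects=True)
fe_est, fe_ci = pool(effects, ses, random_effects=False)
```

Under heterogeneity, tau² evens out the study weights and widens the confidence interval, so the two models can give noticeably different pooled estimates, which is the rationale for reporting both as a sensitivity analysis.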
