Follow‐up strategies following completion of primary cancer treatment in adult cancer survivors

Background

Most cancer survivors receive follow‐up care after completing treatment, with the primary aim of detecting recurrence. Traditional follow‐up, consisting of fixed visits to a cancer specialist for examinations and tests, is expensive and may be burdensome for the patient. Follow‐up strategies involving non‐specialist care providers, different intensities of procedures, or the addition of survivorship care packages have been developed and tested; however, their effectiveness remains unclear.

Objectives

The objective of this review is to compare the effect of different follow‐up strategies in adult cancer survivors, following completion of primary cancer treatment, on the primary outcomes of overall survival and time to detection of recurrence. The secondary outcomes are health‐related quality of life, anxiety (including fear of recurrence), depression and cost.

Search methods

We searched CENTRAL, MEDLINE, Embase, four other databases and two trials registries up to 11 December 2018, together with reference checking, citation searching and contact with study authors to identify additional studies.

Selection criteria

We included all randomised trials that compared different follow‐up strategies in adult cancer survivors following completion of primary cancer treatment with curative intent, and that reported at least one of the outcomes listed above. We compared the effectiveness of: 1) non‐specialist‐led follow‐up (i.e. general practitioner‐led, nurse‐led, patient‐initiated or shared care) versus specialist‐led follow‐up; 2) less intensive versus more intensive follow‐up (based on clinical visits, examinations and diagnostic procedures); and 3) follow‐up integrating additional care components relevant to detection of recurrence (e.g. patient symptom education or monitoring, or survivorship care plans) versus usual care.

Data collection and analysis

We used the standard methodological guidelines of Cochrane and Cochrane Effective Practice and Organisation of Care (EPOC). We assessed the certainty of the evidence using the GRADE criteria. For each comparison, we present synthesised findings for overall survival and time to detection of recurrence as hazard ratios (HR), and for health‐related quality of life, anxiety and depression as mean differences (MD), with 95% confidence intervals (CI). When meta‐analysis was not possible, we reported the results of the individual studies. For survival and recurrence, we used meta‐regression analysis where possible to investigate whether the effects varied with cancer site, publication year and study quality.
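As an illustration of how hazard ratios from several trials can be combined, the sketch below pools them by inverse‐variance weighting on the log scale. It is a generic fixed‐effect example and not the review's own analysis code: the review does not state here which pooling model was used, and the function name `pool_hazard_ratios` and the input values are hypothetical.

```python
import math


def pool_hazard_ratios(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies: list of (hr, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, ci_low, ci_high in studies:
        # Standard error of log(HR), recovered from the width of the CI.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(hr)
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - z * pooled_se),
        math.exp(pooled_log_hr + z * pooled_se),
    )


# Hypothetical trial results (HR, lower 95% limit, upper 95% limit).
example_studies = [(0.80, 0.60, 1.07), (0.95, 0.75, 1.20), (0.88, 0.70, 1.11)]
pooled_hr, pooled_low, pooled_high = pool_hazard_ratios(example_studies)
print(f"Pooled HR {pooled_hr:.2f} (95% CI {pooled_low:.2f} to {pooled_high:.2f})")
```

A random‐effects model would add a between‐study variance term to each weight; that step is omitted here to keep the sketch short.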

Main results

We included 53 trials involving 20,832 participants across 12 cancer sites and 15 countries, mainly in Europe, North America and Australia. All studies were carried out in either a hospital or a general practice setting. Seventeen studies compared non‐specialist‐led follow‐up with specialist‐led follow‐up, 24 studies compared intensity of follow‐up and 12 studies compared patient symptom education or monitoring, or survivorship care plans, with usual care. Risk of bias was generally low or unclear in most studies, with a higher risk of bias in the smaller trials.

Non‐specialist‐led follow‐up compared with specialist‐led follow‐up

It is uncertain how this strategy affects overall survival (HR 1.21, 95% CI 0.68 to 2.15; 2 studies; 603 participants), time to detection of recurrence (4 studies, 1691 participants) or cost (8 studies, 1756 participants) because the certainty of the evidence is very low.

Non‐specialist‐led follow‐up versus specialist‐led follow‐up may make little or no difference to health‐related quality of life at 12 months (MD 1.06, 95% CI ‐1.83 to 3.95; 4 studies; 605 participants; low‐certainty evidence), and probably makes little or no difference to anxiety at 12 months (MD ‐0.03, 95% CI ‐0.73 to 0.67; 5 studies; 1266 participants; moderate‐certainty evidence). We are more certain that it has little or no effect on depression at 12 months (MD 0.03, 95% CI ‐0.35 to 0.42; 5 studies; 1266 participants; high‐certainty evidence).

Less intensive follow‐up compared with more intensive follow‐up

Less intensive versus more intensive follow‐up may make little or no difference to overall survival (HR 1.05, 95% CI 0.96 to 1.14; 13 studies; 10,726 participants; low‐certainty evidence) and probably increases time to detection of recurrence (HR 0.85, 95% CI 0.79 to 0.92; 12 studies; 11,276 participants; moderate‐certainty evidence). Meta‐regression analysis showed little or no difference in the intervention effects by cancer site, publication year or study quality.

It is uncertain whether this strategy has an effect on health‐related quality of life (3 studies, 2742 participants), anxiety (1 study, 180 participants) or cost (6 studies, 1412 participants) because the certainty of the evidence is very low. None of the studies reported on depression.

Follow‐up strategies integrating additional patient symptom education or monitoring, or survivorship care plans, compared with usual care

None of the studies reported on overall survival or time to detection of recurrence.

It is uncertain whether this strategy makes a difference to health‐related quality of life (12 studies, 2846 participants), anxiety (1 study, 470 participants), depression (8 studies, 2351 participants) or cost (1 study, 408 participants), because the certainty of the evidence is very low.

Authors' conclusions

The evidence on the effectiveness of the different follow‐up strategies varies considerably. Less intensive follow‐up may make little or no difference to overall survival, but probably delays detection of recurrence. However, because the two outcomes were not analysed together, we cannot draw direct conclusions about the effect of the interventions on survival after detection of recurrence. The effects of non‐specialist‐led follow‐up on survival and detection of recurrence, and of follow‐up intensity on health‐related quality of life, anxiety and depression, are uncertain. There was little evidence on the effects of follow‐up integrating additional patient symptom education or monitoring, and survivorship care plans.

Plain language summary

Follow‐up strategies after completion of primary cancer treatment

What was the aim of this review?

In this Cochrane Review, we aimed to find out whether cancer survivors who received three different types of follow‐up care after completing cancer treatment have better medical and personal outcomes. We collected and assessed all relevant studies and found 53 studies.

Key messages

Non‐specialist‐led follow‐up, such as follow‐up provided by a general practitioner or by nurses, makes little or no difference to health‐related quality of life, anxiety or depression compared with specialist‐led follow‐up. We are uncertain about its effects on overall survival and on detection of cancer that comes back after treatment (recurrence).

Less intensive follow‐up, such as follow‐up with fewer examinations or tests, may make little or no difference to overall survival, but probably delays detection of recurrence compared with more intensive follow‐up. However, other types of studies are needed before we can be certain about the effects of early detection of recurrence on survival. We are also uncertain about its effect on health‐related quality of life, anxiety and depression.

There was little evidence on the final type of follow‐up, which integrated additional components relevant to detection of recurrence, such as patient symptom education or monitoring, or survivorship care plans.

What was studied in the review?

After receiving treatment for cancer, most patients receive follow‐up care to look for signs of recurrence. If the cancer comes back, it is believed that it is better to detect it early, which allows earlier treatment and is expected to improve the patient's survival. Traditional follow‐up, which involves fixed visits to a cancer specialist in a hospital setting for examinations and tests, can be expensive and burdensome for the patient. Newer follow‐up strategies involving non‐specialist care providers, different intensities of examinations, or the addition of survivorship care plans have been developed and tested, but their effectiveness is not yet known.

The aim of this review was to find out whether three types of aftercare increased survival, reduced the time until recurrence was detected and improved patient outcomes such as health‐related quality of life, anxiety and depression, as well as cost. The types of aftercare were: 1) non‐specialist‐led follow‐up (e.g. general practitioner‐led, nurse‐led, patient‐initiated or shared care) versus specialist‐led follow‐up; 2) less intensive versus more intensive follow‐up (based on the frequency or intensity of clinical visits, examinations or diagnostic procedures); and 3) follow‐up integrating additional care components relevant to detection of recurrence (e.g. patient symptom education or monitoring, or survivorship care plans) versus usual care.

What are the main results of the review?

We analysed 53 studies with 20,832 participants with 12 types of cancer in 15 different countries, mainly in Europe, North America and Australia. All studies were carried out in either a hospital or a general practice setting.

When cancer survivors receive aftercare led by non‐specialists, such as general practitioners and nurses:

‐ We do not know whether overall survival is affected or whether cancer recurrence is detected earlier;

‐ It probably makes little or no difference to health‐related quality of life and anxiety, and makes no difference to depression at 12 months of follow‐up;

‐ We are uncertain whether there is a difference in cost between these two types of follow‐up strategy.

When cancer survivors receive less intensive aftercare, such as fewer examinations and tests:

‐ It may make little or no difference to overall survival, but probably delays detection of recurrence;

‐ We are uncertain whether it makes a difference to health‐related quality of life, anxiety and cost. We found no studies that evaluated depression.

When cancer survivors receive aftercare with additional education about their symptoms or survivorship care plans:

‐ We are uncertain whether this type of aftercare improves health‐related quality of life, anxiety or depression, or whether it increases the cost of care. We found no studies that evaluated overall survival or whether cancer recurrence is detected earlier.

How up‐to‐date is this review?

We examined studies published up to 11 December 2018.

Authors' conclusions

Implications for practice

In many high‐income countries, conventional follow‐up strategies, such as routine examination by a specialist, are being replaced by less intensive and supposedly more strategic approaches. In this context, it is important to assess whether these changes may lead to worse prognosis and worse patient‐reported outcomes. The finding that less intensive strategies do not negatively affect overall survival may be considered reassuring in light of this current trend.

The finding that more intensive strategies detect recurrences earlier has less clear implications. As the two outcomes (recurrence and survival) were not analysed together, we cannot draw direct conclusions about the effect of the interventions on survival after detection of recurrence. Indeed, the content and organisation of conventional follow‐up strategies have long focused on early detection of recurrence, based on the assumption that early detection translates into longer survival through early treatment. As this review is the first to find an effect of less intensive strategies on recurrence, more knowledge is needed. More high‐quality studies are needed that use an appropriate approach to evaluate the effect of follow‐up strategies on survival according to recurrence detection status. Several mechanisms may be at play that influence survival after recurrence compared with overall survival. The biology of recurrence may differ between cancer sites, and early detection and treatment of oligometastatic recurrence may lead to a survival benefit in some cancer sites but not in others.

Despite the lower certainty of the evidence, the results on quality of life and psychological distress indicate that we do not yet know enough about how to improve these aspects of cancer survivorship to a degree that shows a measurable effect. As we only included studies that focused on cancer follow‐up and excluded studies that focused solely on improving quality of life, the components of the included studies may have been too weak to make a difference. It is also possible that current measurement tools for quality of life, anxiety and depression are inadequate or suboptimal for capturing the psychological changes that patients may experience.

In many countries, there is an ongoing debate about whether the central focus of follow‐up should be broadened from detection of recurrence to also include somatic and psychological late effects. It is therefore important to know whether it is possible to provide interventions that can meaningfully improve these patient‐reported experiences. Alongside this discussion, it has also been suggested that patients should be more directly involved in cancer follow‐up, for example through self‐monitoring of symptoms and shared treatment decisions. This is a relatively new way of thinking about cancer follow‐up care. Some of the studies included in this review examined some of these principles, for example through patient education on symptom monitoring and patient‐initiated contact with health professionals, but further theoretical and clinical development is needed to examine ways of organising patient‐centred cancer follow‐up.

Implications for research

The assessment of the quality of the evidence in this review indicates that there is still room to improve the quality of randomised trials in cancer follow‐up. Although blinding of participants is often neither possible nor ethical, other efforts could be made to minimise bias due to differential treatment or differential outcome assessment. This review also highlights the need to use more optimal methods and statistical analyses where possible, as this has an important impact on the level of certainty of the evidence. Analysing time‐to‐event outcomes as continuous or dichotomous outcomes increases the risk of biased estimates, hampers meta‐analysis and limits the conclusions that can be drawn, as a large amount of information from each study is lost. For example, reporting time to detection of recurrence as mean time to event may give inaccurate estimates of the treatment effect, as such an analysis only takes into account the subset of participants who have had a recurrence and excludes those who are still cancer‐free. Finally, reporting participants' years of follow‐up when conducting survival analysis will help to improve the transparency of findings in clinical trials.
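The point about mean time to event can be illustrated with a small sketch. The data below are entirely hypothetical (two trial arms of 10 participants each, followed for up to 60 months) and the function names are illustrative; the Kaplan‐Meier estimator shown is the standard product‐limit method, not code from any included study.

```python
def naive_mean_time(times, events):
    """Mean time to recurrence among participants who recurred only."""
    recurred = [t for t, e in zip(times, events) if e == 1]
    return sum(recurred) / len(recurred)


def kaplan_meier(times, events):
    """Product-limit estimate of the recurrence-free proportion.

    times: follow-up time per participant; events: 1 = recurrence
    detected, 0 = censored (still recurrence-free at last contact).
    Returns a list of (time, recurrence_free_proportion) at event times.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        detected = 0
        leaving = 0
        # Group all participants whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            detected += data[i][1]
            leaving += 1
            i += 1
        if detected > 0:
            surviving *= 1 - detected / at_risk
            curve.append((t, surviving))
        at_risk -= leaving
    return curve


# Hypothetical arms: months of follow-up and recurrence indicator.
arm_a_times = [6, 12, 18, 24, 60, 60, 60, 60, 60, 60]
arm_a_events = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
arm_b_times = [10, 14, 60, 60, 60, 60, 60, 60, 60, 60]
arm_b_events = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(naive_mean_time(arm_a_times, arm_a_events))  # mean among 4 recurrences
print(naive_mean_time(arm_b_times, arm_b_events))  # mean among 2 recurrences
print(kaplan_meier(arm_a_times, arm_a_events)[-1])
print(kaplan_meier(arm_b_times, arm_b_events)[-1])
```

In this toy data set the naive mean (15 months in arm A, 12 months in arm B) suggests that recurrences are found sooner in arm B, although twice as many recurrences were detected in arm A during the same follow‐up. The time‐to‐event estimate uses every participant, including those still recurrence‐free, and so does not discard that information.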

One possible direction for future research is to investigate the effect of early detection of recurrence on patient outcomes, such as survival, when comparing more intensive and less intensive strategies. This could provide the knowledge needed to discuss how much clinical emphasis should be placed on detecting recurrence as early as possible. Several studies in this review investigated detection of recurrence and overall survival, but not together. Robust evidence will need to come from analyses using multi‐state models that take into account both time to detection of recurrence and death for all participants. Although such studies will require substantial resources and long follow‐up periods, they would provide the evidence needed to make decisions that affect the lives of millions of cancer survivors. Such studies could also investigate any potential adverse effects of different follow‐up strategies, an aspect that was outside the scope of this review.

Despite the relatively low level of evidence, this review indicates that more knowledge is needed on how to improve and provide the medical and psychosocial support that cancer survivors may need. Outcome measurement tools may also need to be developed specifically for somatic and psychological symptoms in cancer survivors, as most of the outcome measures in the included studies were developed to capture symptoms during cancer treatment. In parallel with the discussion about greater patient involvement in treatment, it has been suggested that patient involvement in research may also be beneficial, especially when developing patient‐centred interventions. This is a relatively new paradigm in research that is still being evaluated.

In this review, we identified many studies that contribute knowledge on a range of important outcomes in a large population of cancer survivors. We have also systematically identified the knowledge gaps that still need to be filled. Future research can build on and extend the findings of this review, as well as address the limitations of the methods. This knowledge is needed to justify the enormous resources that are increasingly required to provide safe and effective follow‐up for adult cancer survivors worldwide.

Summary of findings

Summary of findings for the main comparison. Non‐specialist‐led versus specialist‐led follow‐up after primary cancer treatment

Non‐specialist‐led versus specialist‐led follow‐up after primary cancer treatment

Patient or population: adult cancer survivors from the following cancer sites: breast, colon, colorectal, endometrial, ovarian, cervical, melanoma and oesophageal
Setting: outpatient treatment in hospitals or general practice in Australia, Canada, Denmark, Netherlands, Norway, Sweden and UK
Intervention: non‐specialist‐led (i.e. GP‐led, nurse‐led, patient‐initiated or shared care) follow‐up
Comparison: specialist‐led follow‐up

Table columns: Outcomes; Anticipated absolute effects* (95% CI) (risk with specialist‐led follow‐up, risk with non‐specialist‐led follow‐up); Relative effects (95% CI); Number of participants (a) (number of studies); Certainty of the evidence (GRADE); Comments

Outcome: Overall survival (follow‐up range: 12 months to 60 months)
Risk with specialist‐led follow‐up: 89 per 100 (b)
Risk with non‐specialist‐led follow‐up: 87 per 100 (79 to 93)
Difference: 2 fewer survivors in the intervention group per 100 participants (between 10 fewer to 4 more)
Relative effect (95% CI): HR 1.21 (0.68 to 2.15)
Number of participants (studies): 603 participants (2 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low (c)
Comments: 4 studies reported on overall survival. It is uncertain how non‐specialist‐led follow‐up affects overall survival as the certainty of the evidence is very low. We could not incorporate data from 2 other studies (N = 1077) in the meta‐analysis, both reported little or no difference in overall survival.

Outcome: Time to detection of recurrence (follow‐up range: 3 months to 60 months)
Anticipated absolute effects* (95% CI): See comment
Number of participants (studies): 1691 participants (4 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low (d)
Comments: 4 studies reported on time to detection of recurrence. It is uncertain how non‐specialist‐led follow‐up affects time to detection of recurrence as the certainty of the evidence is very low and we could not pool the reported data. 3 studies reported little or no difference in time to detection of recurrence and 1 study reported median time to recurrence but did not carry out any statistical analysis.

Outcome: Health‐related quality of life (at 12 months' follow‐up)
Measure: EORTC‐C30 global health status scale (higher scores indicate better HRQoL)
Anticipated absolute effect (95% CI): MD 1.06 higher (1.83 lower to 3.95 higher)
Number of participants (studies): 605 participants (4 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊝⊝ Low (e)
Comments: Thirteen studies reported on HRQoL using EORTC‐C30, SF‐36, SF‐12, EuroQoL‐5D and FACT at different time points. Meta‐analysis of 4 studies showed that non‐specialist‐led follow‐up may make little or no difference in HRQoL at 12 months as measured by the EORTC‐C30 global health status scale. The mean difference did not reach the minimal clinically important difference of 10 points identified for this scale. Studies that we could not incorporate in the meta‐analysis (N = 2385) generally reported that non‐specialist‐led follow‐up made little or no difference to HRQoL.

Outcome: Anxiety (at 12 months' follow‐up)
Measure: HADS‐Anxiety subscale (higher scores indicate worse anxiety)
Anticipated absolute effect (95% CI): MD 0.03 lower (0.73 lower to 0.67 higher)
Number of participants (studies): 1266 participants (5 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊕⊝ Moderate (f)
Comments: 12 studies reported on anxiety and 2 on fear of recurrence using STAI, HADS and FCRI at different time points. Meta‐analysis of 5 studies showed that non‐specialist‐led follow‐up probably makes little or no difference to anxiety at 12 months as measured by HADS‐Anxiety subscale. The mean difference did not reach the minimal clinically important difference of 1.5 points identified for this scale. Data from the studies that we could not incorporate in the meta‐analysis (N = 1755) generally reported that non‐specialist‐led follow‐up made little or no difference to anxiety and fear of recurrence, except 1 study reporting higher levels of fear of recurrence in the patient‐initiated follow‐up group.

Outcome: Depression (at 12 months)
Measure: HADS‐Depression subscale (higher scores indicate worse depression)
Anticipated absolute effect (95% CI): MD 0.03 higher (0.35 lower to 0.42 higher)
Number of participants (studies): 1266 participants (5 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊕⊕ High (g)
Comments: Eleven studies reported on depression using GHQ‐12 and HADS at different time points. Meta‐analysis of 5 studies showed that non‐specialist‐led follow‐up makes little or no difference to depression at 12 months as measured by HADS‐Depression subscale. The mean difference did not reach the minimal clinically important difference of 1.5 points identified for this scale. The studies that we could not incorporate in the meta‐analysis (N = 1378) generally reported that non‐specialist‐led follow‐up may make little or no difference to depression.

Outcome: Cost
Anticipated absolute effects* (95% CI): See comment
Number of participants (studies): 1756 participants (8 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low (h)
Comments: Eight studies reported cost outcomes but due to the substantial heterogeneity in how they measured and reported them, we could not pool the results in a meta‐analysis. It is uncertain whether non‐specialist‐led follow‐up has an effect on cost when compared with specialist‐led follow‐up, as the certainty of the evidence is very low. 6 studies reported lower cost per participant in the non‐specialist‐led group, while 2 studies reported higher cost per participant in the non‐specialist‐led group.

*The basis for the assumed risk in the comparison group (assumed comparator risk, ACR) is provided in the footnotes. The corresponding risk in the intervention group (and its 95% confidence interval) is based on the ACR and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; EORTC‐C30: European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; FACT: Functional Assessment of Cancer Therapy scale; FCRI: Fear of Cancer Recurrence Inventory; GHQ‐12: General Health Questionnaire‐12 items; GP: general practitioner; HADS: Hospital Anxiety and Depression Scale; HR: Hazard ratio; HRQoL: health‐related quality of life; MD: mean difference; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items; STAI: State Trait Anxiety Inventory

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

(a) From meta‐analysis if we pooled study results; for all studies if we did not pool study results.
(b) The ACR is the assumed proportion of participants who are alive in the comparison group.
(c) We judged the certainty of evidence to be very low and downgraded by three levels for very serious concerns regarding indirectness and imprecision, as representativeness is limited with only two studies, the HRs were not reported but indirectly estimated and the confidence interval was very wide.
(d) We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to few studies, reporting of results by different estimates that could not be pooled and high variance of the result estimates.
(e) We judged the certainty of evidence to be low and downgraded by two levels for serious concerns regarding inconsistency and imprecision due to differing estimates of effect and wide confidence intervals.
(f) We judged the certainty of evidence to be moderate as we downgraded by one level for concerns regarding inconsistency of results and indirectness due to few studies.
(g) We judged the certainty of evidence to be high although we had some concerns regarding indirectness due to few studies.
(h) We judged the certainty of evidence to be very low as the high heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost outcomes were measured and reported across studies.
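The relationship between the assumed comparator risk (ACR), the hazard ratio and the corresponding risk in the intervention group can be sketched as follows. This assumes the commonly used conversion for time‐to‐event outcomes in which the ACR is an event‐free (here, alive) proportion and the corresponding proportion is the ACR raised to the power of the HR; whether the review authors used exactly this formula is an assumption, and the function name is illustrative.

```python
def corresponding_survival(acr_alive, hazard_ratio):
    """Proportion alive in the intervention group.

    acr_alive: assumed proportion alive in the comparison group (0 to 1).
    hazard_ratio: hazard ratio for death, intervention versus comparison.
    Assumes proportional hazards, so S_intervention = S_comparison ** HR.
    """
    return acr_alive ** hazard_ratio


def per_100(proportion):
    """Express a proportion as a whole number per 100 participants."""
    return round(100 * proportion)


# Non-specialist-led versus specialist-led follow-up: ACR 89 per 100, HR 1.21.
print(per_100(corresponding_survival(0.89, 1.21)))

# Less intensive versus more intensive follow-up (Summary of findings 2):
# ACR 75 per 100, HR 1.05 (0.96 to 1.14).
print(per_100(corresponding_survival(0.75, 1.05)))
print(per_100(corresponding_survival(0.75, 1.14)))  # lower bound of survival
print(per_100(corresponding_survival(0.75, 0.96)))  # upper bound of survival
```

Under this assumption the sketch gives 87 per 100 for the first comparison and 74 per 100 (72 to 76) for the second, matching the tabulated point estimates. Applying it to the confidence limits of the first comparison gives values one unit away from the tabulated interval, which may reflect rounding of the ACR shown in the table.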

Summary of findings 2. Less intensive versus more intensive follow‐up after primary cancer treatment

Less compared with more intensive components in follow‐up after primary cancer treatment

Patient or population: adult cancer survivors from the following cancer sites: breast, colorectal, head‐and‐neck, Hodgkin lymphoma, melanoma, non‐small cell lung cancer and testicular cancer
Setting: outpatient treatment in hospitals in Australia, Denmark, China, Finland, France, India, Italy, Netherlands, Spain, Sweden, Switzerland and UK
Intervention: less intensive follow‐up (based on fewer clinical visits, examinations or less intensive diagnostic procedures)
Comparison: more intensive follow‐up

Outcomes

Anticipated absolute effects* (95% CI)

Relative effects
(95% CI)

Number of participantsa
(Number of studies)

Certainty of the evidence
(GRADE)

Comments

Risk with more intensive follow‐up

Risk with less intensive follow‐up

Overall survival

Follow‐up range: 24 months to 120 months

75 per 100b

74 per 100

(72 to 76)

HR 1.05
(0.96 to 1.14)

10,726 participants

(13 randomised trials)

⊕⊕⊝⊝
Lowc

18 studies reported on overall survival. Meta‐analysis of 13 studies showed that less intensive follow‐up may make little or no difference to overall survival. Meta‐regression analysis showed little or no difference in the intervention effects by cancer site, publication year or study quality.

We could not incorporate data from 5 other studies. 3 of these studies reported little or no difference in overall survival (N = 1752), while 2 studies reported improved survival with more intensive follow‐up (N = 544).

Difference: 1 fewer survivor in the intervention group per 100 participants (between 3 fewer to 1 more)

Time to detection of recurrence

Follow‐up range: 12 months to 120 months

27 per 100d

24 per 100

(22 to 25)

HR 0.85 (0.79 to 0.92)

11,276 participants

(12 randomised trials)

⊕⊕⊕⊝
Moderatee

22 studies reported on time to detection of recurrence. Meta‐analysis of 12 studies showed that less intensive follow‐up probably increases time to detection of recurrence. Meta‐regression analysis showed little or no difference in the intervention effects by cancer site, publication year or study quality.

We could not incorporate data from 10 other studies. 4 of these studies reported shorter time to detection of recurrence for more intensive follow‐up (N = 854), while 4 other studies reported little or no difference in detection of recurrence (N = 734). 1 study reported results that we could not use for this comparison (N = 337) and 1 study reported results based on only unresectable recurrence (N = 239).

Difference: 3 fewer detected recurrences in the intervention group per 100 participants (between 5 fewer and 2 fewer)

Health‐related quality of life

See comment

2742 participants (3 randomised trials)

⊕⊝⊝⊝
Very lowf

3 studies reported on HRQoL using SF‐36 and SF‐12 at varying time points. We could not pool the reported data. It is uncertain whether less intensive follow‐up has an effect on HRQoL when compared with more intensive follow‐up, as the certainty of the evidence is very low.

All 3 studies reported that less intensive follow‐up may make little or no difference in HRQoL when compared to more intensive follow‐up at time points ranging from 12 months to 5 years.

Anxiety

See comment

180 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowg

One study reported that less intensive follow‐up may make little or no difference to anxiety at 12 months follow‐up using STAI.

It is uncertain whether less intensive follow‐up has an effect on anxiety when compared with more intensive follow‐up, as the certainty of the evidence is very low.

Depression

None of the studies reported depression.

Cost

See comment

1412 participants (6 randomised trials)

⊕⊝⊝⊝
Very lowh

6 studies reported cost outcomes but due to the substantial heterogeneity in how they measured and reported this outcome, we could not pool the results in a meta‐analysis. It is uncertain whether less intensive follow‐up has an effect on cost when compared with more intensive follow‐up, as the certainty of the evidence is very low.

All studies report lower costs for the less intensive arm from the perspective of the participant or healthcare system but the difference in cost varied considerably depending on the components/procedures used in the different interventions.

*The basis for the assumed risk in the comparison group (assumed comparator risk, ACR) is provided in the footnotes. The corresponding risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; HR: hazard ratio; HRQoL: health‐related quality of life; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items; STAI: State Trait Anxiety Inventory

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

aFrom meta‐analysis if we pooled study results; for all studies if we did not pool study results.
bThe ACR is the assumed proportion of participants who are alive in the comparison group.
cWe judged the certainty of evidence to be low as we downgraded by two levels for some concerns regarding study limitations (lack of allocation concealment in one study) and indirectness as the studies were primarily investigating follow‐up after colorectal and breast cancer, and serious concerns regarding imprecision as the confidence interval includes effects that are not trivial (potentially up to 3 fewer survivors per 100 participants).
dThe ACR is the assumed proportion of participants with a detected recurrence in the comparison group.
eWe judged the certainty of evidence to be moderate as we downgraded by one level for serious concerns regarding indirectness as seven of the studies did not report hazard ratios, so we indirectly estimated them from published data.
fWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to the few studies, heterogeneous measures and reporting of results by different estimates that we could not pool.
gWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
hWe judged the certainty of evidence to be very low and downgraded by three levels as the substantial heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost was measured and reported across studies.
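
The absolute effects reported for overall survival can be reproduced approximately from the assumed comparator risk (ACR) and the hazard ratio. The sketch below assumes the usual proportional‐hazards conversion, in which the event‐free proportion in the intervention group equals the comparator proportion raised to the power of the HR. The review does not state the exact formula it used, and the function names are illustrative only.

```python
def event_free_with_intervention(comparator_event_free: float, hazard_ratio: float) -> float:
    """Event-free proportion in the intervention group, assuming proportional hazards."""
    return comparator_event_free ** hazard_ratio


def event_risk_with_intervention(comparator_event_risk: float, hazard_ratio: float) -> float:
    """Same conversion when the ACR is given as the proportion WITH the event."""
    return 1.0 - (1.0 - comparator_event_risk) ** hazard_ratio


def per_100(proportion: float) -> int:
    """Express a proportion as a whole number per 100 participants."""
    return round(proportion * 100)


# Overall survival row: ACR is 75 per 100 alive, HR 1.05 (0.96 to 1.14).
acr_alive = 0.75
point = per_100(event_free_with_intervention(acr_alive, 1.05))
# A larger HR means more deaths, so the upper HR bound gives the lower survival bound.
lower = per_100(event_free_with_intervention(acr_alive, 1.14))
upper = per_100(event_free_with_intervention(acr_alive, 0.96))
print(point, lower, upper)  # 74 72 76
```

Under this assumption the output matches the tabulated 74 per 100 (72 to 76). The second function applies when the ACR is an event proportion, as in the recurrence row; small differences from tabulated values can arise because the ACR shown in the table is itself rounded.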

Summary of findings 3. Follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care


Patient or population: adult cancer survivors from the following cancer sites: breast, colorectal, endometrial, ovarian and prostate cancer
Setting: outpatient treatment in hospitals or general practice in Australia, Canada, Netherlands, Sweden and USA
Intervention: follow‐up integrating additional components relevant for detection of recurrence (e.g. patient symptom education or monitoring, or survivorship care plans (SCP))
Comparison: usual care

Outcomes

Anticipated absolute effects* (95% CI)

Relative effects
(95% CI)

Number of participants (number of studies)

Certainty of the evidence
(GRADE)

Comments

Risk with usual care

Risk with follow‐up integrating additional patient symptom education/monitoring or SCP

Overall survival

None of the studies reported overall survival.

Time to detection of recurrence

None of the studies reported detection of recurrence.

Health‐related quality of life (HRQoL)

See comment

2846 participants (12 randomised trials)

⊕⊝⊝⊝
Very lowa

12 studies reported on HRQoL using EORTC‐C30, SF‐36, SF‐12, FACT and City of Hope QoL scale at varying time points. We could not pool the reported data. It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on HRQoL when compared with usual care, as the certainty of the evidence is very low.

11 studies reported that follow‐up integrating additional patient education/SCP may make little or no difference to HRQoL when compared to usual care at follow‐up ranging from 6 months to 12 months. 1 study reported that SCP and patient coaching improved HRQoL at 3 months' follow‐up.

Anxiety

See comment

470 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowb

One study reported that SCP may make little or no difference to anxiety at 12 months' follow‐up using HADS.

It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on anxiety when compared with usual care, as the certainty of the evidence is very low.

Depression

See comment

2351 participants (8 randomised trials)

⊕⊝⊝⊝
Very lowc

8 studies reported on depression using HADS, POMS, PHQ‐9, BSI‐18, CES‐D and the distress thermometer at varying time points. We could not pool the reported data. It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on depression when compared with usual care, as the certainty of the evidence is very low.

7 studies reported that follow‐up integrating additional patient education/SCP may make little or no difference to depression when compared to usual care at follow‐up ranging from 3 months to 12 months. 1 study reported that the intervention improved symptoms of depression at 12 months' follow‐up.

Cost

See comment

408 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowd

One study reported that the use of SCP may make little or no difference to cost at 2 years' follow‐up.

It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on cost when compared with usual care, as the certainty of the evidence is very low.

*The corresponding risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

BSI‐18: Brief Symptom Inventory‐18 items; CES‐D: Center for Epidemiological Studies‐Depression scale; CI: confidence interval; EORTC‐C30: European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; FACT: Functional Assessment of Cancer Therapy scale; HADS: Hospital Anxiety and Depression Scale; HRQoL: health‐related quality of life; PHQ‐9: Patient Health Questionnaire‐9 items; POMS: Profile of Mood States; QoL: quality of life; SCP: survivorship care plans; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

aWe judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to studies being at high risk of bias, the heterogeneous measures and reporting of results by different estimates that could not be pooled.
bWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
cWe judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to one study at high risk of bias, the heterogeneous measures and reporting of results by different estimates that could not be pooled.
dWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
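
The footnotes above describe certainty ratings in terms of levels downgraded. As a minimal sketch of that bookkeeping, assuming evidence from randomised trials starts at high certainty (the GRADE convention) and that the rating cannot fall below very low, the mapping can be written as follows. The names are illustrative and are not taken from the review.

```python
GRADE_LEVELS = ["High", "Moderate", "Low", "Very low"]


def grade_certainty(levels_downgraded: int, start: str = "High") -> str:
    """Return the GRADE certainty after downgrading from the starting level."""
    if levels_downgraded < 0:
        raise ValueError("levels_downgraded must be zero or positive")
    index = GRADE_LEVELS.index(start) + levels_downgraded
    # The rating cannot fall below the last category.
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]


print(grade_certainty(1))  # Moderate
print(grade_certainty(2))  # Low
print(grade_certainty(3))  # Very low
```

This matches the pattern in the footnotes: one level down gives moderate, two levels give low, and three levels give very low certainty.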

Background


Description of the condition

Cancer has become one of the leading causes of death worldwide. More than 14 million new cancer cases now occur each year, and the World Health Organization expects this figure to rise by 70% over the next two decades (World Cancer Report 2014). Together with screening programmes and steadily improving cancer treatment procedures, these figures are producing a surge in the population of cancer survivors, who are directed to many years of routine follow‐up care after completing primary cancer treatment (Davies 2011). In many countries, cancer is increasingly managed as a chronic disease (Rose 2009).

Cancer is heterogeneous, with a variety of cancer types, treatments and outcomes. This review is limited to the adult population, because childhood cancer differs biologically and aetiologically from adult cancer, which gives rise to different treatment and follow‐up issues (Bleyer 1990). The physical and psychosocial sequelae that adult cancer survivors experience after treatment vary enormously (Howell 2012). However, regardless of cancer site, key areas of concern for survivors include the development of recurrent or new cancer, the late and long‐term effects of cancer and its treatment, and psychosocial and functional issues such as depression, fear of recurrence and difficulties navigating aftercare services (Jorgensen 2015; Landier 2009). Cancer follow‐up care has been developed to address these concerns and, unsurprisingly, is becoming a complex intervention that makes increasing use of new strategies intended to meet patients' needs while remaining clinically effective and cost‐effective (Davies 2011; Rose 2009).

Description of the intervention

Cancer follow‐up refers to the process of care provided after completion of primary cancer treatment, whose main aim is surveillance and prompt detection of recurrence or of a new cancer, in order to optimise the outcomes of subsequent treatment (Collins 2004). Secondary aims of follow‐up programmes include identifying and managing the side effects and late effects of cancer and its treatment, providing informational and psychological support, and making relevant referrals to rehabilitation and other health services (Rose 2009). There is currently no formal definition of a follow‐up strategy, even though follow‐up interventions are becoming increasingly complex and comprise many elements that have been developed over recent decades to meet the aims described above (Howell 2012). We therefore define follow‐up strategies as the coordination and organisation of these elements. Box 1 systematically distinguishes between the elements, based on the 'Five Ws and one H' framework, and gives existing examples of each: why follow‐up intervention, who leads the intervention, where does it take place, when are visits scheduled, what is delivered in each session, and how is care delivered (Spencer‐Thomas 2016). Although the framework is typically associated with journalism, we found that it can be applied to understand the complexity of cancer follow‐up strategies systematically.

Box 1. Elements that make up cancer follow‐up strategies

Follow‐up question

Examples

Why follow‐up intervention?

  • Early detection of recurrence

  • Identification and management of physical and psychological symptoms

Who leads it?

  • Specialist‐led (e.g. oncologists or surgeons)

  • Nurse‐led

  • General practitioner‐led

  • Shared care

Where do visits take place?

  • Primary care

  • Secondary care

When are visits scheduled?

  • Schedule‐based: set frequency, timing and duration of follow‐up

  • Patient‐initiated, based on symptoms

What is delivered?

  • Surveillance components: physical examination, biochemical tests, imaging procedures, etc.

  • Aftercare components: patient information, symptom education, survivorship care plans, referrals to other services, etc.

How is care delivered?

  • Face‐to‐face

  • Through technology: telephone, email, etc.

Routine follow‐up programmes traditionally consist of scheduled, specialist‐led, face‐to‐face outpatient visits in a hospital setting. Visits are frequent during the first years (typically every two to four months), when the risk of recurrence is highest, followed by longer intervals between visits in later years, for up to 10 years or more (De Felice 2015). Appointments almost always include surveillance components aimed at detecting recurrence, such as clinical examinations, blood tests or imaging procedures, and increasingly also include aftercare components, such as patient education sessions, survivorship care plans and support in managing quality of life and psychosocial issues (Davies 2011). More intensive follow‐up interventions have been defined as those with more surveillance components at each appointment (Collins 2004). Certain aftercare components, such as symptom education and the integration of a survivorship care plan into clinical care, can also be expected to affect detection of recurrence, because patients are more likely to recognise and self‐report signs of recurrence or to adhere to follow‐up visits.

Unsurprisingly, cancer follow‐up places a heavy burden on national health systems, and conventional specialist‐led follow‐up is becoming less and less sustainable (Davies 2011). Other strategies that are less comprehensive and may be more cost‐effective have therefore been suggested, including nurse‐led follow‐up, follow‐up led by general practitioners in primary care, patient‐initiated follow‐up, fewer appointments, and the use of less intensive diagnostic tests and procedures (Brown 2002; Dickinson 2014; Hall 2011; Oeffinger 2006; Rose 2009). Another area of research in cancer follow‐up is the addition of survivorship packages that include patient education or information, such as a survivorship care plan (Jefford 2016). Regardless of the follow‐up strategy, the main aims remain surveillance directed at early detection of recurrence, aftercare for late and long‐term effects, and support for psychosocial and functional needs (Landier 2009). The expected outcomes of effective follow‐up strategies are improved survival rates, prompt detection of recurrence and better management of physical and psychological problems, which would lead to better quality of life for cancer survivors (Lewis 2009a). The current shift towards fewer visits to health professionals and a focus on patient empowerment also aims to increase survivors' ability to self‐manage their condition and to self‐initiate contact with health systems. However, this may increase anxiety and distress among survivors who lack the capacity for self‐monitoring and self‐management (Lewis 2009a).

How the intervention might work

Different mechanisms have been suggested to link the various components of follow‐up interventions with their outcomes. Surveillance rests on the idea that the earlier a cancer recurrence is detected, the more amenable it is to treatment and, therefore, the higher the survival rate (Clarke 2014). However, few trials have studied these two outcomes together, and there is currently no strong evidence that systematic surveillance improves time to detection of recurrence or overall survival (Clarke 2014; Jeffery 2016; Moschetti 2016). In addition, symptoms of recurrence are often detected by patients themselves between scheduled visits, and traditional hospital appointments often fail to meet patients' supportive care needs (De Felice 2015). This has led to a greater focus on strategies in which patients are trained to recognise, report and self‐manage symptoms of recurrence, late and long‐term side effects, emotional distress and functional needs (Davies 2011). This can be achieved through patient information and symptom education programmes, psychosocial support provided by trained nurses, and rapid referral to specialist care, rehabilitation and other healthcare services when needed (Davies 2011). There is evidence that patients respond positively to such strategies and that they may improve patient‐reported outcomes such as health‐related quality of life (Brennan 2011; Davies 2011). Figure 1 presents a model that we constructed to summarise the potential links between follow‐up strategies, possible mechanisms and expected outcomes.

As illustrated, cancer‐related factors (e.g. cancer type and its treatment) and patient‐related factors (e.g. age and sex) determine which symptoms are relevant for a particular group of patients, which in turn informs how the different elements are organised within specific follow‐up strategies. Follow‐up strategies may therefore work through different mechanisms for different patient groups to achieve the expected outcomes of survival and symptom management.

Why it is important to do this review

Because of the rapid rise in the number of cancer survivors, health systems are under increasing pressure to optimise follow‐up interventions so that they meet patients' physical, psychological and functional needs while remaining economically viable (Lewis 2009a). New models of follow‐up care that affect the lives of millions of survivors continue to be developed and implemented, yet guidance for developing effective follow‐up strategies has so far been limited by the small number of randomised trials available for each cancer site and by the heterogeneity of these studies. The optimal content and organisation of follow‐up procedures remain a matter of debate (Sperduti 2013). To date, four Cochrane Reviews have been published on follow‐up strategies for specific cancer sites (breast cancer [Moschetti 2016], non‐metastatic colorectal cancer [Jeffery 2016], cervical cancer [Lanceley 2013] and epithelial ovarian cancer [Clarke 2014]), but with the exception of the review on colorectal cancer follow‐up, they lacked the statistical power to draw conclusions about the effects of different follow‐up strategies because of the lack of studies. One Cochrane Review assessed follow‐up interventions regardless of cancer type, but it focused on interventions that improved continuity of care over the whole period of cancer treatment and did not include survival and recurrence outcomes (Aubin 2012). This review aims to fill this gap by including randomised trials of cancer follow‐up strategies across all cancer sites.

In this way, we sought to overcome the limitation of the small number of trials at particular sites and to provide a systematic summary of the latest available evidence on follow‐up strategies across multiple cancer types, including those not previously represented (e.g. lung cancer and head‐and‐neck cancer).

Objectives


The objective of this review is to compare the effect of different follow‐up strategies in adult cancer survivors, following completion of primary cancer treatment, on the primary outcomes of overall survival and time to detection of recurrence. Secondary outcomes are health‐related quality of life, anxiety (including fear of recurrence), depression and cost.

Methods


Criteria for considering studies for this review

Types of studies

We included all randomised trials that compared different follow‐up strategies in adult cancer survivors who had completed primary cancer treatment with curative intent, with respect to the outcomes of overall survival, time to detection of recurrence, health‐related quality of life, depression, anxiety and cost. We imposed no restrictions on the language of publication. Studies published in languages other than English were translated into English when necessary.

Types of participants

We included trials in adults (aged 18 years and over) who had completed primary cancer treatment with curative intent. Participants had to have a histological and clinical diagnosis of cancer, regardless of cancer type and stage.

Types of interventions

We included trials that compared any of the following interventions that could have an impact on detection of recurrence.

  • Non‐specialist‐led follow‐up (i.e. general practitioner‐led, nurse‐led, patient‐initiated or shared care) versus specialist‐led follow‐up

  • Less intensive versus more intensive follow‐up (based on clinical visits, examinations and diagnostic procedures)

  • Follow‐up integrating patient symptom education or monitoring, or survivorship care plans, versus usual care

We excluded studies that examined only psychosocial or rehabilitation components, and studies that investigated diagnostic components that were not integrated as part of clinical cancer follow‐up.

Types of outcome measures

Primary outcomes

  • Overall survival: calculated from the time of randomisation or study enrolment to the time of death

  • Time to detection of recurrence: calculated from the time of randomisation or study enrolment to detection of recurrence. In some studies, this outcome was called disease‐free survival.

Secondary outcomes

  • Health‐related quality of life

  • Anxiety (including fear of recurrence)

  • Depression

  • Cost

We included all studies that planned to report, or that reported, on at least one of the outcome measures. We considered health‐related quality of life, anxiety and depression only when studies measured them with validated scales, such as the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire for cancer patients (EORTC QLQ‐C30; Aaronson 1993), the 36‐item Short Form Health Survey (SF‐36; Ware 1994) or the Hospital Anxiety and Depression Scale (HADS; Snaith 2003). For all included studies, we also extracted cost outcome data when reported.

Search methods for identification of studies

Electronic searches

The review authors developed the search strategies in consultation with the Effective Practice and Organisation of Care (EPOC) Information Specialist (IS), who also ensured that the search strategy was peer‐reviewed by a second IS. We searched the Cochrane Database of Systematic Reviews (CDSR) and the Database of Abstracts of Reviews of Effects (DARE) for related systematic reviews and the databases below for primary studies on 11 December 2018:

  • Cochrane Central Register of Controlled Trials (CENTRAL; 2018, Issue 12) in the Cochrane Library

  • MEDLINE Ovid, including Epub Ahead of Print, In‐Process & Other Non‐Indexed Citations and Versions (1946 to 11 December 2018)

  • Embase Ovid (1974 to 11 December 2018)

  • PsycINFO Ovid (1967 to 11 December 2018)

  • CINAHL EBSCO (Cumulative Index to Nursing and Allied Health Literature; 1982 to 11 December 2018)

Search strategies comprise keywords and controlled vocabulary terms. We applied no language or time limits. All strategies used are provided in Appendix 1.

Searching other resources

We also searched the following registers for ongoing trials on 11 December 2018:

Additionally, we reviewed reference lists of all included studies and relevant systematic reviews, as well as contacted authors of relevant studies and reviews to clarify reported published information and to seek unpublished results and data.

Data collection and analysis

Selection of studies

We uploaded all titles and abstracts retrieved by electronic searching and through other sources into Covidence, which is an online platform that facilitates the management of the systematic review process (Covidence). We used Covidence to carry out both the title/abstract screening stage and the full‐text screening stage. Five review authors (BLH, RVK, LS, ASF and TAH) independently screened all titles and abstracts for inclusion and we obtained the full text of study reports and publications coded as "Yes" or "Maybe". Thereafter, the authors independently screened the full texts to identify studies for inclusion. We tagged excluded studies with the reason for exclusion, following a similar hierarchy as the screening algorithm: wrong intervention, wrong patient population (e.g. patients were not cancer‐free or were treated for recurrence) or wrong outcome. These reasons are predefined in Covidence. We resolved any disagreements during both screening stages through regularly held discussion meetings with the rest of the author team. We also identified ongoing studies and recorded any information available. We extracted information from Covidence regarding the selection process to complete a PRISMA flow diagram (Liberati 2009; Figure 2).

Data extraction and management

We used a modified Cochrane data collection form from our editorial group, Cochrane Effective Practice and Organisation of Care (EPOC), to capture study characteristics and outcome data (EPOC 2013). We used the first five studies to pilot and refine the template. Five review authors (BLH, RVK, LS, ASF and TAH) extracted the following study characteristics from included studies.

  • Methods: study design, number of study centres and location, study setting, withdrawals, date of study, follow‐up

  • Participants: cancer site, number, mean age, age range, gender, cancer stage, diagnostic criteria, inclusion criteria, exclusion criteria, other relevant characteristics

  • Interventions: intervention type, intervention components, comparison, fidelity assessment

  • Outcomes: main and other outcomes specified and collected, time points reported

  • Notes: funding for trial, notable conflicts of interest of trial authors, and ethical approval

For each study, one review author extracted all the pre‐defined relevant data and another review author independently read all the publications from the same study and double‐checked the form to ensure accuracy and that there were no missing data. We only extracted outcome data for outcomes relevant for this review. For studies with multiple reports, we extracted data from all the reports, if relevant. We resolved any disagreements during discussion meetings with the rest of the review author team. To minimise error, the review authors used the guidance provided in Chapter 7 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a), and took the online Cochrane Interactive Learning course on selecting studies and collecting data (Sambunjak 2017). We used information from the data collection forms to create the 'Characteristics of included studies' table. We noted if the study did not contribute data that could be pooled in a meta‐analysis.

Assessment of risk of bias in included studies

Five review authors (BLH, RVK, LS, ASF and TAH) independently assessed risk of bias for each study, using the criteria outlined in Chapter 8 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2017) and guidance from EPOC (EPOC 2015). To further minimise error, the authors also took the Cochrane Interactive Learning course on introduction to study quality and risk of bias (Page 2017). We resolved disagreements by discussion with the rest of the author team or with the editors of this review. We assessed the risk of bias according to the following domains.

  • Random sequence generation

  • Allocation concealment

  • Blinding of participants and personnel

  • Blinding of outcome assessment

  • Incomplete outcome data

  • Selective outcome reporting

  • Other bias, including baseline imbalances and risk of contamination

As the risk of detection bias differs for objective outcomes (survival and recurrence) and patient‐reported outcomes (quality of life, anxiety and depression), we assessed the risk of bias by type of outcome for the following domains: blinding of participants and personnel, blinding of outcome assessment and incomplete outcome data (Higgins 2017). For blinding of outcome assessment, we further assessed the risk of bias separately for survival and time to detection of recurrence because while there can be no doubt as to death, time to detection of recurrence may be influenced by judgement regarding clinical tests and assessments, which may be affected by lack of blinding.

We classified each potential source of bias as high, low, or unclear, and provide a quote from the study report and justification for our judgement in the 'Risk of bias' table. When considering treatment effects, we took into account the risk of bias for the studies that contributed to that outcome.

Measures of treatment effect

Time‐to‐event outcomes

We have presented time‐to‐event outcomes, overall survival and time to detection of recurrence, as hazard ratios (HRs) (Deeks 2017). We estimated log HRs and the associated standard error (SE) required for a meta‐analysis using the calculator in Review Manager 5 (Review Manager 2014) and a spreadsheet developed by Tierney 2007 that provides 11 methods for calculating HRs and the associated variance depending on the information available in each study. We used Method 3 for studies that provided a HR and its associated confidence interval (CI), Method 9 for studies that provided a P‐value from a log‐rank test, the number of events and the numbers randomised to each arm and Method 11 when a study only provided Kaplan‐Meier curves and numbers at risk. For multi‐armed studies, we also used the approach proposed by Parmar 1998 to estimate the overall log HR and its variance for the combined intervention arms. We also contacted the authors of relevant studies for additional information where possible and noted this, along with any response, in the Characteristics of included studies.
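As an illustration of the simplest case above (a study that reports a HR with its 95% CI), the following Python sketch shows the standard conversion to a log HR and its SE. It is not the Tierney 2007 spreadsheet itself; the function name and the example values are hypothetical, and a symmetric 95% CI on the log scale is assumed.

```python
import math

def log_hr_and_se(hr, ci_lower, ci_upper, z=1.96):
    """Return (log HR, SE of log HR) from a reported HR and its 95% CI.

    Assumes the CI is symmetric on the log scale, so its width on that
    scale equals 2 * z * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se

# Hypothetical study reporting HR 0.80 (95% CI 0.64 to 1.00)
log_hr, se = log_hr_and_se(0.80, 0.64, 1.00)
print(round(log_hr, 4), round(se, 4))
```

The resulting pair (log HR, SE) is the input format expected by generic inverse‐variance meta‐analysis.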

We did not specify the minimal clinically important difference (MCID) (Patrick 2011) for survival and time to detection of recurrence for this review. Instead, we assessed the importance of effects and the precision of the estimates based on how likely it seemed to us that some people would make different decisions if the true effect was near one end or the other of the CI, for example, when the CI includes effects that are not trivial (EPOC 2018).

Continuous outcomes

For health‐related quality of life, anxiety and depression, we calculated the mean difference (MD) for each measurement tool together with the 95% CI by using the mean final value scores and the associated standard deviations (SD) in each study (Deeks 2017). We used final value scores at 12 months as this was the time point reported by the majority of the included studies and because we considered one year a sufficient period of time to assess a meaningful effect of the intervention on patient‐reported outcomes. We also used mean final value scores and the associated SDs instead of other estimates of treatment effects because this was the measurement most consistently reported across the included studies. We did not use the standardised mean difference (SMD) because the different scales/subscales often measure different dimensions of an outcome and combining the results would not be meaningful.
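The mean difference calculation described above can be sketched in Python as follows. This is a minimal illustration under a normal approximation, not the Review Manager implementation; the function name and the example scores are hypothetical.

```python
import math

def mean_difference(mean_i, sd_i, n_i, mean_c, sd_c, n_c, z=1.96):
    """Return (MD, lower, upper) for intervention minus control.

    Uses final value means and SDs from each arm; the SE of the
    difference is sqrt(sd_i^2 / n_i + sd_c^2 / n_c).
    """
    md = mean_i - mean_c
    se = math.sqrt(sd_i ** 2 / n_i + sd_c ** 2 / n_c)
    return md, md - z * se, md + z * se

# Hypothetical 12-month scores on a 0 to 100 scale, 100 participants per arm
md, lower, upper = mean_difference(70, 20, 100, 68, 20, 100)
print(round(md, 2), round(lower, 2), round(upper, 2))
```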

In the few studies where means and SDs were not reported, we attempted to contact the study author for the required information. Where possible, we estimated the mean and SD using the methods described in Chapter 7 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a), and in Wan 2014. For each measurement tool, we state the range of scores possible, whether an increase in score is desirable and the MCID, if available from the literature. Information regarding whether a study author was contacted, what information was requested, and whether we received a reply was noted in the Characteristics of included studies table.

Unit of analysis issues

We included cluster‐randomised trials in the meta‐analysis only if we were able to extract an estimate of the treatment effect from an analysis that properly accounted for the cluster design. For trials with multiple intervention groups, we only included the group with the intervention that met the inclusion criteria for this review (Kimman 2011), or we combined the intervention groups (Primrose 2014), and created a single pair‐wise comparison with the control group for the meta‐analysis. We did not include cross‐over trials in this review.

Dealing with missing data

We report missing data and attrition rates for the included studies as part of the 'Risk of bias' assessment under the domain 'Incomplete outcome data'. Where possible, we contacted study authors in order to verify key study characteristics and obtain unreported outcome data. Almost all the included studies reported intention‐to‐treat (ITT) analyses or methods to impute missing data, thus indicating an ITT approach. If a study reported both ITT and per‐protocol analyses, we extracted only ITT outcome data. We did not impute missing data and we did not request individual patient data.

Assessment of heterogeneity

We used the Chi² test and the I² statistic (Higgins 2003), to measure statistical heterogeneity among the trials in the analysis for each outcome (Deeks 2017; Thompson 2002). We expected clinical heterogeneity, as there was substantial variation across studies on study and patient characteristics. Therefore, regardless of the statistical heterogeneity level, we planned and performed a random‐effects meta‐regression analysis, as reported below, to investigate prespecified study differences.
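For readers unfamiliar with these statistics, the sketch below shows how Cochran's Q (the Chi² heterogeneity statistic) and I² are conventionally computed from study effect estimates and their SEs. The function name and input values are hypothetical and the code is illustrative only; in the review these statistics were obtained from the meta‐analysis software.

```python
def cochran_q_i2(effects, ses):
    """Return (Q, degrees of freedom, I2 in percent).

    effects: study estimates on the analysis scale (e.g. log HRs or MDs)
    ses:     their standard errors
    """
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity,
    # truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Two hypothetical studies with opposite effects and small SEs
print(cochran_q_i2([-0.5, 0.5], [0.1, 0.1]))
```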

Assessment of reporting biases

For outcomes where we were able to pool more than 10 studies, we created and examined a funnel plot to explore possible publication biases and interpreted the results with caution (Sterne 2011). For studies where a protocol had been published or the study had been prospectively registered, we compared the predefined outcome measures with those that the study reported as part of the risk of bias assessment under the domain 'Selective reporting'.

Data synthesis

For time‐to‐event outcomes, we carried out a meta‐analysis for all the trials where it was possible to estimate a log HR and associated sampling variance. We present relative effects (HR) and estimated anticipated absolute effects in terms of the absolute risk of event‐free survival (i.e. the event being death) for overall survival, and as the absolute risk of an event for recurrence, based on formulae found in Schünemann 2019.
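The conversion from a HR to an anticipated absolute effect can be illustrated as follows. This sketch assumes the commonly used proportional‐hazards conversion; the exact formulae applied in the review are those in Schünemann 2019 and are not reproduced in this section. Function names and the example values are hypothetical.

```python
def event_free_survival_with_intervention(control_survival, hr):
    """Anticipated proportion event-free in the intervention group.

    Assumes proportional hazards: S_intervention = S_control ** HR.
    """
    return control_survival ** hr

def event_risk_with_intervention(control_risk, hr):
    """Anticipated proportion with the event (e.g. recurrence) in the
    intervention group, again assuming proportional hazards."""
    return 1 - (1 - control_risk) ** hr

# Hypothetical: 80% survival in the comparison group and a pooled HR of 0.90
print(round(event_free_survival_with_intervention(0.80, 0.90), 3))
print(round(event_risk_with_intervention(0.20, 0.90), 3))
```

Applying the same functions to the lower and upper CI limits of the HR gives the corresponding range for the absolute effect.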

For continuous outcomes, we carried out a meta‐analysis for each scale or subscale if at least three studies reported a measurement scale or subscale at 12 months' follow‐up, in order to have reasonable representativeness and probability of detecting the effect of interest.

We synthesised data based on three intervention comparisons:

  1. Non‐specialist‐led follow‐up (i.e. GP‐led, nurse‐led, patient‐initiated or shared care) versus specialist‐led follow‐up

  2. Less intensive versus more intensive follow‐up (based on clinical visits, examinations and procedures). In trials where the intervention group received the more intensive treatment, the reported estimates for intervention and comparison arms were reversed.

  3. Follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care

In each comparison group, we followed the same procedures with regards to undertaking a possible meta‐analysis and meta‐regression analysis (see Subgroup analysis and investigation of heterogeneity). We carried out all meta‐analyses in Review Manager 5 using the inverse variance random‐effects method (Review Manager 2014). We carried out meta‐regression in the statistical software R (version 3.5.1, package 'meta'; R 2017), and the codes are available in Appendix 2. For the studies that reported data that we could not pool in the meta‐analyses, we have presented the findings in a narrative manner (Deeks 2017).
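The inverse variance random‐effects method can be sketched in Python using the DerSimonian and Laird estimator of between‐study variance, which is the estimator conventionally associated with Review Manager 5. This is an illustrative re‐implementation with hypothetical inputs, not the code used in the review (the authors' R code is in Appendix 2).

```python
import math

def dersimonian_laird(effects, ses):
    """Return (pooled estimate, SE of pooled estimate, tau^2).

    effects: study estimates (e.g. log HRs); ses: their standard errors.
    """
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return pooled, se_pooled, tau2

# Two hypothetical log HRs with equal precision
pooled, se_pooled, tau2 = dersimonian_laird([-0.5, 0.5], [0.1, 0.1])
print(pooled, se_pooled, tau2)
```

Exponentiating the pooled log HR and its CI limits (pooled ± 1.96 × SE) returns the result to the HR scale.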

'Summary of findings' tables

For each intervention comparison, we used the GRADEpro software (GRADEpro GDT 2015), to create a 'Summary of findings' table for overall survival, time to detection of recurrence, health‐related quality of life, anxiety, depression and cost. We assessed the overall certainty of evidence for each outcome using the five GRADE considerations: study limitations (risk of bias), consistency (of effect and measurement across studies), imprecision (wide confidence intervals in study estimates), indirectness (representativeness and whether we had to indirectly calculate effect estimates), and publication bias (through funnel plots if we pooled 10 or more studies; Schünemann 2013). We also used the methods and recommendations described in Section 8.5 (Higgins 2017), and Chapter 12 (Schünemann 2017), of the Cochrane Handbook for Systematic Reviews of Interventions and the EPOC worksheets (EPOC 2017), and attach the GRADE evidence profiles for each outcome in Appendix 3.

As we only included randomised trials, the evidence certainty started at 'high'. If we identified any serious concerns in any of the five GRADE domains, we downgraded the certainty of the evidence accordingly by either one level to 'moderate', two levels to 'low' or three levels to 'very low' (Guyatt 2008). Three review authors (BLH, RVK and ASF) carried out the GRADE assessments of each outcome and we resolved any disagreements with the rest of the review author group. For the outcomes of health‐related quality of life, anxiety and depression, we presented the findings from the EORTC‐C30 Global health status subscale, the HADS‐Anxiety subscale and the HADS‐Depression subscale, as we judged the results from these subscales to be most representative of the outcome. Additional outcome information that we were not able to incorporate into the evidence from the meta‐analyses is noted in the comments section.

Subgroup analysis and investigation of heterogeneity

An important aim of this review is to investigate how characteristics of all the different strategies may relate to outcome effects. Therefore, we investigated this heterogeneity by performing meta‐regression analysis, that is, including various predefined study characteristics as explanatory variables and testing for significance. By doing so, we aimed to compare the effect of various follow‐up strategies without splitting participant data into subgroups, where conclusions may be misleading if there are a limited number of studies available. To avoid false positive conclusions that can occur through post‐hoc analyses, we identified the co‐variates we wished to investigate a priori (Thompson 2002), and carried out a meta‐regression analysis to investigate how the following variables relate to the primary outcomes of overall survival and time to detection of recurrence:

  • cancer site;

  • sex of participant;

  • age of participant;

  • study quality (i.e. high (4‐5 low 'Risk of bias' judgements); moderate (2‐3 low 'Risk of bias' judgements); low (0‐1 low 'Risk of bias' judgements));

  • year of publication (i.e. before 2000; after 2000).

We note that associations derived from meta‐regressions are observational, and that the risk of bias must be taken into account in the interpretation of results. Due to insufficient studies, we were not able to carry out meta‐regression analyses for the continuous outcomes and we present our findings in a narrative manner (Deeks 2017). We did not carry out subgroup analysis.

Sensitivity analysis

For each meta‐analysis, we carried out a sensitivity analysis whereby we restricted the analysis to studies with published HRs, means and SD, and we noted their impact on effect sizes. We did not restrict the analysis to studies with low risk of bias, as we had already investigated the effect of study quality through meta‐regression.

Assessment of bias in conducting the systematic review

We conducted this review according to the published protocol and report all deviations in the 'Differences between protocol and review' section below. We used the PRISMA statement (Liberati 2009), to guide the reporting of this review.

Results

Description of studies

Results of the search

After the removal of duplicates, the electronic search yielded 9110 references and we identified a further 17 references from other sources. Following title and abstract screening, we identified and retrieved the full text of 297 references and collated the references into 131 studies. Following full‐text screening, we identified 81 studies for inclusion, of which 28 studies were ongoing (see Characteristics of included studies and Characteristics of ongoing studies tables). We excluded 50 studies with reasons (see Characteristics of excluded studies). The screening flow diagram and the proportions of included studies that contributed to each comparison and outcome can be seen in Figure 2 (Moher 2009).

Included studies

Below, we provide a brief summary of the 53 included studies. Two of the included studies were cluster‐randomised trials (Murchie 2010; ROGY 2015), and three studies were multi‐armed (Kimman 2011; Primrose 2014; Secco 2002).

Participants, cancer site and setting

The included studies randomised 20,832 participants and spanned 12 cancer sites: 18 studies with participants who had breast cancer (Beaver 2009; Brown 2002; GIVIO 1994; Grunfeld 1996; Grunfeld 2006; Grunfeld 2011; Hershman 2013; Juarez 2013; Kimman 2011; Kirshbaum 2017; Koinberg 2004; Kokko 2003; Kvale 2016; Maly 2017; Oltra 2007; Rosselli Del Turco 1994; Ruddy 2016; Sheppard 2009), 16 studies in colorectal cancer (Beaver 2012; GILDA 2016; Jefford 2016; Kjeldsen 1997; Mäkelä 1992; Ohlsson 1995; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Schoemaker 1998; Secco 2002; Sobhani 2008; Sobhani 2018; Wang 2009; Wille‐Jorgensen 2018; Young 2013), three studies in non‐small cell lung cancer (NSCLC) (Gambazzi 2018; Monteil 2010; Westeel 2012), two studies in colon cancer (Augestad 2013; Wattchow 2006), two studies in endometrial cancer (Beaver 2017; Jeppesen 2018), two studies in gynaecological cancer (ROGY 2015; Morrison 2018), two studies in melanoma (Damude 2016; Murchie 2010), two studies with prostate cancer (Davis 2013; Emery 2016), two studies with oesophageal cancer (Malmstrom 2016; Verschuur 2009), and one study each in Hodgkin lymphoma (Picardi 2014), testicular cancer (Rustin 2007), head‐and‐neck cancer (Van der Meulen 2013) and oral cancer (D'Cruz 2016). The majority of the studies included participants with cancer stages I, II and III, while six studies also included participants with stage IV cancer at diagnosis, who had completed curatively‐intended treatment (Gambazzi 2018; Picardi 2014; ROGY 2015; Sobhani 2008; Sobhani 2018; Verschuur 2009). The studies had follow‐up periods ranging from six months to five years.

All the studies were carried out in either a hospital or general practice setting and 15 countries were represented: the UK (Beaver 2009; Beaver 2012; Beaver 2017; Brown 2002; Grunfeld 1996; Kirshbaum 2017; Morrison 2018; Murchie 2010; Primrose 2014; Rustin 2007; Sheppard 2009), Italy (GILDA 2016; GIVIO 1994; Picardi 2014; Pietra 1998; Rosselli Del Turco 1994; Secco 2002), Australia (Emery 2016; Jefford 2016; Schoemaker 1998; Wattchow 2006; Young 2013), the USA (Davis 2013; Hershman 2013; Juarez 2013; Kvale 2016; Maly 2017; Ruddy 2016), the Netherlands (Damude 2016; Kimman 2011; ROGY 2015; Van der Meulen 2013; Verschuur 2009), France (Monteil 2010; Sobhani 2008; Sobhani 2018; Westeel 2012), Canada (Grunfeld 2006; Grunfeld 2011), Denmark (Jeppesen 2018; Kjeldsen 1997; Wille‐Jorgensen 2018), Sweden (Koinberg 2004; Malmstrom 2016; Ohlsson 1995), Finland (Kokko 2003; Mäkelä 1992), Spain (Oltra 2007; Rodríguez‐Moranta 2006), China (Wang 2009), India (D'Cruz 2016), Norway (Augestad 2013), and Switzerland (Gambazzi 2018).

Nine studies did not report funding source (Brown 2002; Damude 2016; Koinberg 2004; Mäkelä 1992; Oltra 2007; Pietra 1998; Rustin 2007; Secco 2002; Wang 2009). The remaining studies were funded by either academic, public or non‐profit sources, although two studies also reported contributions from industry (GIVIO 1994; Westeel 2012). Authors from six studies reported disclosures on potential conflicts of interests (Jefford 2016; Maly 2017; Primrose 2014; ROGY 2015; Ruddy 2016; Wille‐Jorgensen 2018).

Types of interventions

The included studies investigated a wide range of interventions, and details of the types of interventions, comparisons and follow‐up periods are given in the Characteristics of included studies tables.

Six studies compared nurse‐led follow‐up with conventional specialist‐led follow‐up (Beaver 2009; Beaver 2012; Beaver 2017; Kimman 2011; Morrison 2018; Verschuur 2009).

Five studies compared GP‐led follow‐up with conventional specialist‐led follow‐up (Augestad 2013; Grunfeld 1996; Grunfeld 2006; Murchie 2010; Wattchow 2006).

Five studies compared patient‐initiated follow‐up with conventional specialist‐led follow‐up (Brown 2002; Jeppesen 2018; Kirshbaum 2017; Koinberg 2004; Sheppard 2009). This type of follow‐up was also referred to as 'open‐access' or 'on‐demand' follow‐up in some of the publications.

One study compared shared care (where a number of hospital visits are replaced by GP‐appointments) with conventional specialist‐led follow‐up (Emery 2016).

Four studies compared frequency of follow‐up visits: one study compared fewer visits with more frequent visits (Damude 2016), while three compared more frequent visits with fewer visits (Kjeldsen 1997; Pietra 1998; Wille‐Jorgensen 2018). When the latter three studies contributed data to the meta‐analysis comparing less intensive to more intensive follow‐up, we reversed the reported estimates for intervention and comparison arms.

Two studies compared a less intensive follow‐up intervention to more intensive follow‐up: Picardi 2014 compared follow‐up based on the use of chest X‐rays with follow‐up using PET/CT scans in Hodgkin lymphoma survivors, while Rustin 2007 compared follow‐up based on two CT scans with follow‐up based on five CT scans in testicular cancer survivors.

Eighteen studies compared a more intensive follow‐up intervention to less intensive follow‐up based on the use of additional or more intensive surveillance components, for example, additional examinations, imaging procedures or blood tests for biomarkers (D'Cruz 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Schoemaker 1998; Secco 2002; Sobhani 2008; Sobhani 2018; Wang 2009; Westeel 2012). When these studies contributed data to the meta‐analysis, we reversed the reported estimates for intervention and comparison arms.
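The section does not state the exact arithmetic used to reverse the estimates. For a HR, one plausible approach (an assumption here, not a description of the authors' procedure) is to take the reciprocal of the HR and swap the reciprocals of the CI limits, which is equivalent to changing the sign of the log HR while leaving its SE unchanged. The function name and values below are hypothetical.

```python
def reverse_hr(hr, ci_lower, ci_upper):
    """Re-express a HR for 'more versus less intensive' follow-up as
    'less versus more intensive'.

    The reciprocal of the upper limit becomes the new lower limit, and
    the reciprocal of the lower limit becomes the new upper limit.
    """
    return 1 / hr, 1 / ci_upper, 1 / ci_lower

# Hypothetical HR of 1.25 (95% CI 1.00 to 2.00) for more versus less intensive
print(reverse_hr(1.25, 1.00, 2.00))
```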

Twelve studies investigated the addition of care or information components to usual care that might be expected to affect surveillance of recurrences, such as symptom monitoring and feedback (Davis 2013), implementation of survivorship care plans/packages in clinical care (Grunfeld 2011; Hershman 2013; Jefford 2016; Kvale 2016; Maly 2017; ROGY 2015; Ruddy 2016), or implementation of supportive care packages that included patient education on symptoms of recurrence (Juarez 2013; Malmstrom 2016; Van der Meulen 2013; Young 2013).

Outcomes
Overall survival

Twenty‐two studies reported on the outcome of overall survival (D'Cruz 2016; GILDA 2016; GIVIO 1994; Grunfeld 2006; Kjeldsen 1997; Koinberg 2004; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Schoemaker 1998; Secco 2002; Sobhani 2018; Verschuur 2009; Wang 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018). Eight studies reported HRs (D'Cruz 2016; GILDA 2016; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Schoemaker 1998; Wang 2009; Westeel 2012; Wille‐Jorgensen 2018), and we were able to calculate HRs for seven studies based on the information reported or obtained from the study authors (GIVIO 1994; Kjeldsen 1997; Koinberg 2004; Mäkelä 1992; Ohlsson 1995; Sobhani 2018; Wattchow 2006), thus yielding 15 studies that contributed data for meta‐analysis. The remaining seven studies either did not carry out survival analysis or reported insufficient information to estimate a HR.

Time to detection of recurrence

Thirty studies reported on the outcome of time to detection of recurrence/disease‐free survival (Augestad 2013; Beaver 2009; Beaver 2012; Beaver 2017; Damude 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Grunfeld 1996; Grunfeld 2006; Kjeldsen 1997; Koinberg 2004; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Picardi 2014; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Rustin 2007; Secco 2002; Sobhani 2008; Sobhani 2018; Wang 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018). Four studies calculated time to detection of recurrence from the time of presentation of symptoms/suspicion of recurrence instead of from randomisation, as defined in our protocol, and we did not include the results from these studies in our analysis (Augestad 2013; Beaver 2009; Beaver 2012; Grunfeld 1996). Three studies reported HRs (GILDA 2016; Westeel 2012; Wille‐Jorgensen 2018), and we were able to calculate HRs for nine studies based on the information reported or obtained from the authors (Gambazzi 2018; GIVIO 1994; Kjeldsen 1997; Picardi 2014; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Rustin 2007; Sobhani 2008), thus yielding twelve studies that contributed data for the meta‐analysis. The remaining studies either did not carry out survival analysis or reported insufficient information to estimate a HR.

Health‐related quality of life

Twenty‐eight studies reported on the outcome of health‐related quality of life using a variety of validated measurement scales, such as the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ‐C30; Augestad 2013; Beaver 2017; Brown 2002; Jefford 2016; Kimman 2011; Kirshbaum 2017; Malmstrom 2016; Morrison 2018; ROGY 2015; Verschuur 2009), the 36‐item Short Form Health Survey (SF‐36; Damude 2016; Grunfeld 1996; Grunfeld 2006; Grunfeld 2011; Kvale 2016; Murchie 2010), the 12‐item Short Form Health Survey (SF‐12; Davis 2013; GILDA 2016; Maly 2017; Ruddy 2016; Wattchow 2006), the Functional Assessment of Cancer Therapy ‐ General (FACT‐G; Davis 2013), Breast (FACT‐B; Hershman 2013; Sheppard 2009), Colorectal (FACT‐C; Young 2013), the EuroQol‐5D (Augestad 2013; Verschuur 2009), and the City of Hope Quality of Life Questionnaire (Juarez 2013). An older study (GIVIO 1994), measured quality of life using a compilation of items selected from several quality‐of‐life instruments available in 1985. The results of the EORTC‐C30, SF‐36 and SF‐12 are not reported as overall scores but by subscales measuring specific domains of health‐related quality of life. We carried out a meta‐analysis for results at 12 months for scales or subscales that were reported by at least three studies as specified in our Methods section under Data synthesis. We have reported the studies that contributed data to meta‐analysis for each specific scale or subscale in the results section below.

Anxiety

Fourteen studies reported on the outcome of anxiety: five studies used the State Trait Anxiety Inventory (STAI; Beaver 2009; Beaver 2012; Beaver 2017; Damude 2016; Kimman 2011), and nine studies used the Hospital Anxiety and Depression Scale ‐ Anxiety subscale (HADS‐Anxiety; Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kirshbaum 2017; Koinberg 2004; Murchie 2010; ROGY 2015; Wattchow 2006). Two studies reported on fear of recurrence: one using a three‐item questionnaire that was still being tested (Sheppard 2009), and one using the Fear of Cancer Recurrence Inventory (Jeppesen 2018). We carried out a meta‐analysis for results at 12 months for scales or subscales that were reported by at least three studies as specified in our Methods section under Data synthesis. We have reported the studies that contributed data to meta‐analysis for each specific scale or subscale in the results section below.

Depression

Nineteen studies reported on the outcome of depression or psychological distress: nine studies used the Hospital Anxiety and Depression Scale ‐ Depression subscale (HADS‐Depression; Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kirshbaum 2017; Koinberg 2004; Murchie 2010; ROGY 2015; Wattchow 2006), three used the General Health Questionnaire (GHQ‐12; Beaver 2009; Beaver 2012; Sheppard 2009), two studies each used the Center for Epidemiological Studies‐Depression scale (CES‐D; Hershman 2013; Van der Meulen 2013), and the Distress thermometer (Juarez 2013; Young 2013), one study each used the Patient Health Questionnaire (PHQ‐9; Kvale 2016), the Profile of Mood States (POMS; Grunfeld 2011), and the Brief Symptom Inventory (BSI‐18; Jefford 2016). We carried out a meta‐analysis for results at 12 months only for scales or subscales that were reported by at least three studies as specified in our Methods section under Data synthesis. We have reported the studies that contributed data to meta‐analysis for each specific scale or subscale in the results section below.

Cost

Sixteen studies reported cost outcomes (Augestad 2013; Beaver 2009; Beaver 2017; Damude 2016; Grunfeld 1996; Grunfeld 2011; Kimman 2011; Koinberg 2004; Kokko 2003; Monteil 2010; Morrison 2018; Oltra 2007; Picardi 2014; Rodríguez‐Moranta 2006; Secco 2002; Verschuur 2009). However, there was high heterogeneity in how the studies measured and reported this outcome and we could not pool the results in a meta‐analysis.

Excluded studies

We excluded 44 studies with reasons. Following recommendations from the Cochrane Handbook for Systematic Reviews of Interventions, we classified studies as excluded with reason only if they were studies one might reasonably expect to be eligible for inclusion (see Characteristics of excluded studies table; Higgins 2011a). We excluded studies if the intervention was not follow‐up treatment after primary cancer treatment (wrong intervention; Chang 2013; Helgesen 2000; Majhail 2019; Mathew 2014; NCT03125070; NCT03360994; Ploos van Amstel 2016; Rustin 2010; Song 2018; Stanciu 2015; Visser 2015; Watson 2014), if the participants included patients who were not cancer‐free or were being treated for a recurrence (wrong patient population; Holtedahl 2005; Lanceley 2017; Moore 2002; NCT01973946; NCT02200133; NCT02361099; NCT03056469; NCT03424837; NCT03608410; Puri 2018; Skolarus 2017; Van Rhijn 2011), if our primary or secondary outcomes of interest were not an outcome included by the study (wrong outcomes; Faithfull 2001; Gulliford 1997; Haq 2015; Jefford 2011; Lyu 2016; NCT00049465; NCT01824745; NCT02209415; NCT03271099; NCT03618017; Parker 2018; Smith 2016; Strand 2011; Wheelock 2015), or if it was not a standard randomised trial (wrong design; Rogers 2018; Samawi 2017; Verberne 2015). Two potential studies registered on the ClinicalTrials.gov trials registry were reported as being withdrawn (NCT01993901; NCT02655068), and one study was never started due to lack of funding (Kessler 2013).

Risk of bias in included studies

Figure 3 shows the summary of the 'Risk of bias' assessments for all the included studies. Reasons for the authors' judgements are given for each study in the 'Risk of bias' tables under the Characteristics of included studies.


'Risk of bias' summary: review authors' judgements about each 'Risk of bias' item for each included study. Blank items indicate that this type of outcome was not reported by the study


Allocation

Random sequence generation

Forty studies clearly stated the methods used for random sequence generation and we judged the risk of selection bias in these studies to be low (Augestad 2013; Beaver 2009; Beaver 2012; Beaver 2017; Brown 2002; D'Cruz 2016; Damude 2016; Davis 2013; Emery 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Grunfeld 1996; Grunfeld 2006; Grunfeld 2011; Hershman 2013; Jefford 2016; Jeppesen 2018; Kimman 2011; Koinberg 2004; Kvale 2016; Malmstrom 2016; Maly 2017; Morrison 2018; Murchie 2010; Picardi 2014; Primrose 2014; Rodríguez‐Moranta 2006; ROGY 2015; Rosselli Del Turco 1994; Rustin 2007; Schoemaker 1998; Sheppard 2009; Sobhani 2018; Van der Meulen 2013; Verschuur 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018; Young 2013). The remaining thirteen studies did not provide sufficient information and we judged the risk of bias to be unclear (Juarez 2013; Kirshbaum 2017; Kjeldsen 1997; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Pietra 1998; Ruddy 2016; Secco 2002; Sobhani 2008; Wang 2009).

Allocation concealment

We judged 37 studies to be at low risk of bias (Augestad 2013; Beaver 2009; Beaver 2012; Beaver 2017; D'Cruz 2016; Damude 2016; Davis 2013; Emery 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Grunfeld 1996; Grunfeld 2006; Grunfeld 2011; Hershman 2013; Jeppesen 2018; Kimman 2011; Koinberg 2004; Malmstrom 2016; Maly 2017; Morrison 2018; Murchie 2010; Picardi 2014; Primrose 2014; Rodríguez‐Moranta 2006; ROGY 2015; Rosselli Del Turco 1994; Rustin 2007; Sheppard 2009; Sobhani 2018; Van der Meulen 2013; Verschuur 2009; Wang 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018; Young 2013). We judged studies that reported using a telephone‐, computer‐ or web‐based method of allocation to be at low risk of bias even if they did not specifically report that the allocation was concealed from the personnel involved in assigning participants to the treatment arms. Fifteen studies did not provide sufficient information for us to clearly judge whether allocation was adequately concealed prior to assignment (Brown 2002; Jefford 2016; Juarez 2013; Kirshbaum 2017; Kjeldsen 1997; Kokko 2003; Kvale 2016; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Pietra 1998; Ruddy 2016; Secco 2002; Sobhani 2008). We judged one study to be at high risk of selection bias (Schoemaker 1998), as participants were reported to be allocated by the assigner, "choosing the next card from a box of cards indicating the type of follow‐up".

Blinding

Given the nature of this type of intervention, it is usually not possible to blind participants and personnel to intervention arms. Following recommendations from the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2017), we assessed the domains of performance bias and detection bias by outcome group: objective outcomes (survival, recurrence and costs) and patient‐reported outcomes (health‐related quality of life, depression and anxiety).

Blinding of participants and personnel (performance bias)

Performance bias refers to systematic differences between groups in the care that is provided or received and requested (Higgins 2017). We judged all studies that reported on the objective outcomes of survival and time to detection of recurrence to be at unclear risk of bias, as blinding was either not possible or not done. We also judged all the studies reporting on patient‐reported outcomes to be at unclear risk of bias, except for three that we judged to be at low risk because they reported that participants were blinded (Hershman 2013; ROGY 2015; Van der Meulen 2013).

Blinding of outcome assessment (detection bias)

Detection bias refers to systematic differences between groups in how outcomes are determined (Higgins 2017). Here, we assessed the risk separately for the objective outcomes of survival and time to detection of recurrence. Since there can be no doubt whether a person is dead or alive, we judged all the studies reporting on overall survival to be at low risk of bias with regard to this outcome (GILDA 2016; GIVIO 1994; Kjeldsen 1997; Koinberg 2004; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Schoemaker 1998; Secco 2002; Sobhani 2018; Wang 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018). With regard to time to detection of recurrence, we judged the risk to be unclear, as we cannot rule out the possibility that the lack of blinding may influence judgement regarding clinical tests and assessments, which might influence the outcome. However, one study (Grunfeld 2006), reported that, "the outcome was assessed by a committee that was blinded to treatment allocation" so we judged it as low risk of detection bias for this outcome. With regard to patient‐reported outcomes, all of which were collected through self‐reported questionnaires, we judged all the studies, except the three mentioned above (Hershman 2013; ROGY 2015; Van der Meulen 2013), to be at unclear risk because participants were self‐assessors and were not blinded.

Incomplete outcome data

Attrition bias refers to systematic differences between groups in withdrawals from a study (Higgins 2017). Following recommendations from the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2017), we assessed the domain of attrition bias by outcome group as the same study may have low risk of bias for objective outcomes (where information on death or recurrence is available from hospital records) but a high risk of bias for patient‐reported outcomes (due to unreturned questionnaires).

We judged studies reporting on objective outcomes, where missing outcome data were balanced and due to similar reasons in both groups, to be at low risk of bias (Augestad 2013; Beaver 2012; Beaver 2017; D'Cruz 2016; Damude 2016; Emery 2016; Gambazzi 2018; GILDA 2016; Grunfeld 1996; Grunfeld 2006; Kimman 2011; Kjeldsen 1997; Koinberg 2004; Kokko 2003; Mäkelä 1992; Murchie 2010; Ohlsson 1995; Oltra 2007; Picardi 2014; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Ruddy 2016; Rustin 2007; Schoemaker 1998; Sobhani 2008; Sobhani 2018; Verschuur 2009; Wang 2009; Wattchow 2006; Westeel 2012; Wille‐Jorgensen 2018). Five studies did not report reasons for dropout and we judged them to be at unclear risk of bias (GIVIO 1994; Monteil 2010; Rosselli Del Turco 1994; Secco 2002; Sheppard 2009). We judged one study (Beaver 2009), to be at high risk of bias, as more participants in the intervention group did not receive the intervention or wanted to change group and were lost to follow‐up compared to the comparison group.

We judged studies reporting on patient‐reported outcomes, where response rates were balanced and missing data were due to similar reasons in both groups, to be at low risk of bias (Augestad 2013; Beaver 2017; Brown 2002; Damude 2016; Davis 2013; Emery 2016; GILDA 2016; Grunfeld 1996; Grunfeld 2006; Grunfeld 2011; Hershman 2013; Jefford 2016; Jeppesen 2018; Kimman 2011; Kjeldsen 1997; Koinberg 2004; Kvale 2016; Malmstrom 2016; Maly 2017; Morrison 2018; Murchie 2010; Ruddy 2016; Van der Meulen 2013; Verschuur 2009; Wattchow 2006; Young 2013). Three studies insufficiently reported reasons for loss to follow‐up or information on whether attrition was equally distributed between the groups, and we judged the risk of bias to be unclear (GIVIO 1994; Kirshbaum 2017; Sheppard 2009). We judged four studies to be at high risk of bias due to high attrition, imbalance in numbers or different reasons for attrition between the two groups (Beaver 2009; Beaver 2012; Juarez 2013; ROGY 2015).

Selective reporting

Sixteen studies had available study protocols or prospectively registered clinical trial entries where all of the studies' prespecified outcomes had been reported and we judged these studies to be at low risk of reporting bias (Augestad 2013; D'Cruz 2016; Emery 2016; GILDA 2016; Grunfeld 2006; Jefford 2016; Jeppesen 2018; Kimman 2011; Monteil 2010; Morrison 2018; ROGY 2015; Rustin 2007; Sobhani 2018; Westeel 2012; Wille‐Jorgensen 2018; Young 2013). We judged one study (Ruddy 2016), as being at high risk of reporting bias, as they did not report results for anxiety and depression in the publication, even though it was an outcome that was specified in the methods section. The remaining studies received a judgement of unclear risk, either because no study protocol was available for these studies or because they did not report all the outcomes specified in the protocol.

Other potential sources of bias

We judged four studies to be at high risk of bias due to the potential risk of contamination, the risk of surveillance bias, significant baseline imbalances, or a combination of two or all of these (Beaver 2012; Juarez 2013; Murchie 2010; Ruddy 2016). Four other studies also reported baseline imbalances but we had difficulty identifying whether the imbalance would introduce bias and thus, judged them to be at unclear risk: Grunfeld 1996 had more stage I participants in the hospital group compared to the GP group (50.3% versus 40.4%); Ohlsson 1995 had fewer women and more men in the control group compared to the intervention group (23 versus 33 women, 31 versus 20 men); Oltra 2007 had more disease stage I participants in the intervention group (28 versus 17) and more stage IIA participants in the comparison group (24 versus 11); and Secco 2002 had more participants with higher levels of pre‐operative carcinoembryonic antigen (CEA) and fewer participants with lower levels of pre‐operative CEA in the intervention group compared to the comparison group (31.5% versus 9.5%, 68.5% versus 90.5%). Kirshbaum 2017 did not record any baseline information other than age of the participants and we also judged the risk to be unclear. We judged the remaining studies to be at unclear risk of other bias.

Effects of interventions

See: Summary of findings for the main comparison Non‐specialist‐led versus specialist‐led follow‐up after primary cancer treatment; Summary of findings 2 Less intensive versus more intensive follow‐up after primary cancer treatment; Summary of findings 3 Follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care

Below, we present the effects of the interventions by outcome for each comparison group. For each outcome, we present the results of the meta‐analyses, the meta‐regression and sensitivity analyses for overall survival and time to detection of recurrence (if carried out) and a narrative synthesis of the studies with results that we could not pool.

Comparison 1: non‐specialist‐led versus specialist‐led follow‐up

We included 17 studies for this comparison (Augestad 2013; Beaver 2009; Beaver 2012; Beaver 2017; Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Jeppesen 2018; Kimman 2011; Kirshbaum 2017; Koinberg 2004; Morrison 2018; Murchie 2010; Sheppard 2009; Verschuur 2009; Wattchow 2006).

Overall survival

Four studies reported on the outcome of survival (Grunfeld 2006; Koinberg 2004; Verschuur 2009; Wattchow 2006).

Two studies reported data that we could pool in a meta‐analysis investigating nurse‐led follow‐up after breast cancer (Koinberg 2004), and GP‐led follow‐up after colon cancer (Wattchow 2006). It is uncertain how non‐specialist‐led follow‐up affects overall survival as the certainty of the evidence is very low (HR 1.21, 95% CI 0.68 to 2.15; P = 0.07; 2 studies; 603 participants; Analysis 1.1; Figure 4). The anticipated absolute effect was 2 fewer survivors per 100 patients (ranging from 10 fewer to 4 more). There was no statistical heterogeneity (I2 = 0%) and we could not carry out meta‐regression or sensitivity analyses. We downgraded the certainty of evidence by three levels for very serious concerns regarding indirectness and imprecision, as representativeness is limited with only two studies, the HRs were not reported but indirectly estimated and the confidence interval was very wide.
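As a rough illustration of how a hazard ratio can be translated into an anticipated absolute effect, the sketch below applies the standard proportional‐hazards conversion, in which survival in the intervention arm equals survival in the comparison arm raised to the power of the HR. The 90% comparison‐arm survival is an assumption chosen for illustration only and is not taken from the included studies; the function name is hypothetical.

```python
def survivors_difference_per_100(hr: float, baseline_survival: float) -> float:
    """Difference in survivors per 100 patients (intervention minus comparison).

    Assumes proportional hazards: S_intervention = S_comparison ** HR.
    """
    if not 0.0 < baseline_survival < 1.0:
        raise ValueError("baseline_survival must be strictly between 0 and 1")
    if hr <= 0.0:
        raise ValueError("hr must be positive")
    intervention_survival = baseline_survival ** hr
    return 100.0 * (intervention_survival - baseline_survival)


# ASSUMPTION: 90% survival in the specialist-led arm (illustrative only,
# not a figure reported by the review).
ASSUMED_BASELINE = 0.90

# Point estimate and 95% CI limits of the pooled HR reported above.
for label, hr in [("point estimate", 1.21),
                  ("lower CI limit", 0.68),
                  ("upper CI limit", 2.15)]:
    diff = survivors_difference_per_100(hr, ASSUMED_BASELINE)
    print(f"HR {hr:.2f} ({label}): {diff:+.1f} survivors per 100 patients")
```

A negative value means fewer survivors in the non‐specialist‐led arm. The result depends strongly on the assumed baseline survival, so the output will only approximate the absolute effect reported in the review.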


Forest plot of comparison 1. Non‐specialist‐led versus specialist‐led follow‐up, outcome: 1.1 overall survival

A HR greater than 1 indicates a higher hazard of death (worse survival) in the non‐specialist arm and a lower hazard of death (better survival) in the specialist‐led arm

The remaining two studies (1077 participants) reported little or no difference in survival for GP‐led follow‐up after breast cancer (risk difference 0.18%, 95% CI −2.90 to 3.26; Grunfeld 2006) or nurse‐led follow‐up after oesophageal cancer ("7 died in each group"; P = 0.41; Verschuur 2009).

Time to detection of recurrence

Four studies reported on time to detection of recurrence (Beaver 2017; Grunfeld 2006; Koinberg 2004; Wattchow 2006), but we could not use the reported data to indirectly estimate HRs and we could not pool the results. Thus, it is uncertain how non‐specialist‐led follow‐up affects time to detection of recurrence.

Three studies (1435 participants) reported little or no difference in time to detection of recurrence for: GP‐led follow‐up after breast cancer (risk difference 2.02%, 95% CI −2.13 to 6.16; Grunfeld 2006); GP‐led follow‐up after colon cancer (log‐rank P = 0.76; Wattchow 2006); and nurse‐led follow‐up after breast cancer (risk difference −0.3%, 95% CI −10 to 9; Koinberg 2004). The final study (Beaver 2017; 259 participants), investigated follow‐up after endometrial cancer and reported median time to recurrence in the nurse‐led arm (307 days; range 48 to 662) versus the hospital arm (172 days; range 99 to 436) but did not carry out any statistical analysis. We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to few studies, reporting of results by different estimates that could not be pooled and high variance of the result estimates.

Health‐related quality of life

Thirteen studies reported on health‐related quality of life using a variety of measurement scales and with varying follow‐up periods (Augestad 2013; Beaver 2017; Brown 2002; Grunfeld 1996; Grunfeld 2006; Kimman 2011; Kirshbaum 2017; Maly 2017; Morrison 2018; Murchie 2010; Sheppard 2009; Verschuur 2009; Wattchow 2006). We present results below, according to measurement scale.

The Medical Outcomes Study Short Form Health Survey (SF‐36)

The SF‐36 (Ware 1994), is a 36‐item self‐reported questionnaire consisting of eight subscales (physical functioning, physical role functioning, bodily pain, general health, vitality, social functioning, emotional role functioning and mental health) that can be grouped into two dimensions: the Physical Component Summary (PCS) and the Mental Component Summary (MCS). All scales have transformed scores from 0 to 100, with higher scores indicating better health. The MCID for the SF‐36 has been estimated to be approximately five points (Wyrwich 2005).

Three studies (1406 participants) used the SF‐36 (Grunfeld 1996; Grunfeld 2006; Murchie 2010). We were unable to pool the data for meta‐analysis as the three studies did not report on the same subscales at 12 months of follow‐up (criteria prespecified in the Data synthesis section). However, all three studies reported that non‐specialist‐led follow‐up may make little or no difference to health‐related quality of life. Two studies (1264 participants) investigated GP‐led follow‐up after breast cancer. Grunfeld 1996 reported small differences between groups in mean change scores from baseline to trial end at 18 months' follow‐up for social functioning (−1.8, 95% CI −7.2 to 3.5), mental health (0.5, 95% CI −4.1 to 5.1) and general health (0.6, 95% CI −3.6 to 4.8). Grunfeld 2006 reported a figure showing similar mean scores between groups over time for SF‐36 MCS and PCS with up to 60 months' follow‐up. Murchie 2010 reported little or no effect of GP‐led follow‐up after melanoma on SF‐36 scores for all subscales at 12 months' follow‐up with P values ranging from P = 0.149 to P = 1.000 (142 participants).

European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC‐C30)

The EORTC‐C30 is a 30‐item, self‐reported questionnaire consisting of six subscales (physical functioning, role functioning, social functioning, emotional functioning, cognitive functioning and global health status) and other single items on symptoms (Aaronson 1993). All scales have scores from 0 to 100, with higher scores indicating better health. The MCID for the EORTC‐C30 has been estimated to be approximately 10 points (Osoba 1998).

Seven studies reported on quality of life using the EORTC‐C30 (Augestad 2013; Beaver 2017; Brown 2002; Kimman 2011; Kirshbaum 2017; Morrison 2018; Verschuur 2009). We were able to pool data from four studies investigating GP‐led follow‐up after colon cancer (Augestad 2013), nurse‐led follow‐up after breast cancer (Kimman 2011), patient‐initiated follow‐up after breast cancer (Kirshbaum 2017), and nurse‐led follow‐up after oesophageal cancer (Verschuur 2009). We carried out meta‐analyses for all six subscales, although Kimman 2011 did not report data for the subscales physical functioning, cognitive functioning and social functioning. Compared to specialist‐led follow‐up, non‐specialist‐led follow‐up may make little or no difference at 12 months to: global health status (MD 1.06, 95% CI −1.83 to 3.95; P = 0.47, I2 = 32%; 4 studies, 605 participants; Analysis 1.2); physical functioning (MD 1.65, 95% CI −2.35 to 5.64; P = 0.42, I2 = 47%; 3 studies, 306 participants; Analysis 1.3); role functioning (MD 2.36, 95% CI −2.75 to 7.47; P = 0.36, I2 = 48%; 4 studies, 605 participants; Analysis 1.4); emotional functioning (MD 0.52, 95% CI −2.06 to 3.09; P = 0.69, I2 = 0%; 4 studies, 605 participants; Analysis 1.5); and cognitive functioning (MD 4.41, 95% CI −1.52 to 10.34; P = 0.14, I2 = 54%; 3 studies, 306 participants; Analysis 1.6). However, non‐specialist‐led follow‐up may slightly improve social functioning (MD 5.39, 95% CI 1.60 to 9.17; P = 0.005, I2 = 0%; 3 studies, 306 participants; Analysis 1.7), but this difference was not large enough to be clinically meaningful. We judged the certainty of evidence to be low and downgraded by two levels for serious concerns regarding inconsistency and imprecision due to differing estimates of effect and wide confidence intervals.
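The distinction between statistical and clinical significance in the social functioning result can be checked with a short calculation. The sketch below back‐calculates the standard error from the reported 95% CI under a normal approximation (an assumption; the review's software may have used a slightly different method), derives the two‐sided P value, and compares the point estimate with the 10‐point MCID cited above. The function names are illustrative.

```python
import math


def p_value_from_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.96) -> float:
    """Two-sided P value recovered from a symmetric 95% CI (normal approximation)."""
    standard_error = (upper - lower) / (2.0 * z_crit)
    z = estimate / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_clinically_meaningful(estimate: float, mcid: float) -> bool:
    """True when the absolute effect reaches the minimal important difference."""
    return abs(estimate) >= mcid


# Social functioning result reported above (Analysis 1.7).
md, ci_lower, ci_upper = 5.39, 1.60, 9.17
EORTC_MCID = 10.0  # approximate MCID cited in the text (Osoba 1998)

p = p_value_from_ci(md, ci_lower, ci_upper)
print(f"P = {p:.3f}")
print("CI excludes zero:", ci_lower > 0 or ci_upper < 0)
print("Reaches MCID:", is_clinically_meaningful(md, EORTC_MCID))
```

Under these assumptions the interval excludes zero, yet the point estimate stays well below the 10‐point threshold, which matches the interpretation given in the text.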

Of the remaining three studies, two studies (320 participants) did not contradict the results of the meta‐analyses. Beaver 2017 reported little or no effect of nurse‐led telephone follow‐up after endometrial cancer on all six subscales at time points ranging from 3 to 12 months after baseline data collection, and Brown 2002 investigated patient‐initiated follow‐up after breast cancer and reported similar median scores between groups on all subscales after 12 months of follow‐up. The third study (Morrison 2018; 24 participants), reported improved health‐related quality of life for nurse‐led telephone follow‐up after gynaecological cancer at six months for global health (MD 4.2, no 95% CI), physical functioning (MD 14.3, no 95% CI) and emotional functioning (MD 1.6, no 95% CI).

Other measures of health‐related quality of life

Four studies also reported that non‐specialist‐led follow‐up made little or no difference to health‐related quality of life using other measures than the above (659 participants). Wattchow 2006 reported on health‐related quality of life using the 12‐item Short Form Health Survey (SF‐12; Ware 1995), and reported little or no effect of GP‐led follow‐up after colon cancer on the median scores for both subscales at 12 months (PCS, P = 0.887; MCS, P = 0.510). Two studies reported on health‐related quality of life using the EuroQol‐5D (EuroQoL 1990). Augestad 2013 reported little or no effect of GP‐led follow‐up after colon cancer in mean differences from baseline to 24 months (P = 0.48) and Verschuur 2009 reported little or no effect of nurse‐led follow‐up after oesophageal cancer on mean scores at 13 months (P = 0.58). Using the Functional Assessment of Cancer Therapy (FACT) scale, Sheppard 2009 reported little or no effect of nurse‐led patient‐initiated follow‐up after breast cancer on mean scores at 18 months of follow‐up (P = 0.952).

Anxiety

Twelve studies reported on the outcome of anxiety using a variety of measurement scales and with varying follow‐up periods (Beaver 2009; Beaver 2012; Beaver 2017; Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kimman 2011; Kirshbaum 2017; Koinberg 2004; Murchie 2010; Wattchow 2006), and two studies reported on fear of recurrence (Jeppesen 2018; Sheppard 2009). We present results below, according to measurement scale.

State Trait Anxiety Inventory (STAI) ‐ state subscale

The STAI is a 40‐item, self‐reported questionnaire consisting of two subscales (state anxiety and trait anxiety; Spielberger 1983). Subscales have scores ranging from 20 to 80, with higher scores indicating greater anxiety. The MCID for the STAI has been estimated to be approximately 10 points (Corsaletti 2014).

Four studies reported on state anxiety using STAI (Beaver 2009; Beaver 2012; Beaver 2017; Kimman 2011), and three reported data that we could pool in a meta‐analysis (Beaver 2009; Beaver 2012; Kimman 2011). Compared to specialist‐led follow‐up, non‐specialist‐led follow‐up may make little or no difference to anxiety at 12 months' follow‐up as measured by STAI state subscale (MD −0.55, 95% CI −2.41 to 1.32; P = 0.57; I2 = 0%; 3 studies, 602 participants; Analysis 1.8). We judged the certainty of evidence to be low as we downgraded by two levels for serious concerns regarding study limitations (high risk of attrition bias for two of the studies) and indirectness due to only three studies.

We did not include results from Beaver 2017 in the meta‐analysis as the reported final scores included follow‐up periods of 3, 6 and 12 months (259 participants). However, the study reported the non‐inferiority of nurse‐led follow‐up in endometrial cancer (MD 0.7, 95% CI −1.9 to 3.3).

Hospital Anxiety and Depression Scale (HADS)‐Anxiety subscale

The HADS is a 14‐item, self‐reported questionnaire consisting of two subscales (anxiety and depression; Snaith 2003). Subscales have scores from 0 to 21, with higher scores indicating greater anxiety or depression. The MCID for the HADS has been estimated to be approximately 1.5 points (Puhan 2008).

Eight studies reported on anxiety using the HADS‐Anxiety subscale (Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kirshbaum 2017; Koinberg 2004; Murchie 2010; Wattchow 2006). Five studies contributed data to the meta‐analysis investigating patient‐initiated follow‐up after breast cancer (Brown 2002), shared‐care after prostate cancer (Emery 2016), GP‐led follow‐up after breast cancer (Grunfeld 2006), patient‐initiated follow‐up after breast cancer (Kirshbaum 2017), and GP‐led follow‐up after colon cancer (Wattchow 2006). Compared to specialist‐led follow‐up, non‐specialist‐led follow‐up probably makes little or no difference to anxiety at 12 months' follow‐up as measured by HADS‐Anxiety subscale (MD −0.03, 95% CI −0.73 to 0.67; P = 0.94, I2 = 51%; 5 studies, 1266 participants; Analysis 1.9). Sensitivity analysis (where we removed estimates that were indirectly derived from Brown 2002) did not change our conclusion (MD 0.27, 95% CI −0.15 to 0.69; P = 0.21; 4 studies, 1210 participants). We judged the certainty of evidence to be moderate as we downgraded by one level for concerns regarding inconsistency of results (I2 = 51%) and indirectness due to few studies.
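For readers unfamiliar with how a pooled mean difference, the I2 statistic and a leave‐one‐out sensitivity analysis relate to each other, the following sketch may help. It uses fixed‐effect inverse‐variance pooling for simplicity, which is not necessarily the model used in this review, and the study values are hypothetical placeholders rather than data from the included trials.

```python
import math


def pool_inverse_variance(estimates, standard_errors):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I2 (%)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, ci, i2


# HYPOTHETICAL mean differences and standard errors (not from the review).
studies = {
    "Study A": (-0.9, 0.45),
    "Study B": (0.3, 0.30),
    "Study C": (0.5, 0.60),
    "Study D": (-0.2, 0.70),
    "Study E": (0.4, 0.35),
}


def report(label, names):
    mds = [studies[n][0] for n in names]
    ses = [studies[n][1] for n in names]
    pooled, (low, high), i2 = pool_inverse_variance(mds, ses)
    print(f"{label}: MD {pooled:.2f} (95% CI {low:.2f} to {high:.2f}), I2 = {i2:.0f}%")


report("All studies", list(studies))
# Sensitivity analysis: drop one study and pool the rest again.
report("Without Study A", [n for n in studies if n != "Study A"])
```

If the conclusion is the same with and without the excluded study, the pooled result is considered robust to that study's inclusion.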

The remaining three studies did not contradict the results of the meta‐analyses (702 participants). Grunfeld 1996 reported little or no effect of GP‐led follow‐up after breast cancer on mean change from baseline to the end of the study (MD 0.4, 95% CI −0.3 to 1.2). Koinberg 2004 and Murchie 2010 reported dichotomised results with little or no effect of nurse‐led follow‐up after breast cancer at 18 months (RR 1.2, 95% CI 0.4 to 3.1) and GP‐led follow‐up after melanoma at 12 months (P = 0.87) respectively.

Fear of recurrence

Two studies reported conflicting results on fear of recurrence. Jeppesen 2018 reported that fear of recurrence decreased more in the comparison group than in the patient‐initiated group at 10 months' follow‐up (MD −5.9, 95% CI −10.9 to −0.9; P = 0.02; 214 participants) as measured by the Fear of Cancer Recurrence Inventory (Simard 2009), while Sheppard 2009 reported little or no effect of nurse‐led patient‐initiated follow‐up after breast cancer on fear of recurrence at 18 months (MD 0.5, 95% CI −0.3 to 1.0; 237 participants) using a three‐item questionnaire that was yet to be tested at that time.

Depression

Eleven studies reported on the outcome of depression or psychological distress using a variety of measurement scales and with varying follow‐up periods (Beaver 2009; Beaver 2012; Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kirshbaum 2017; Koinberg 2004; Murchie 2010; Sheppard 2009; Wattchow 2006). We present a synthesis according to measurement scale below:

Hospital Anxiety and Depression Scale (HADS)‐Depression subscale

Eight studies reported on depression using the HADS‐Depression subscale (Brown 2002; Emery 2016; Grunfeld 1996; Grunfeld 2006; Kirshbaum 2017; Koinberg 2004; Murchie 2010; Wattchow 2006). Five studies contributed data to the meta‐analysis investigating patient‐initiated follow‐up after breast cancer (Brown 2002), shared‐care after prostate cancer (Emery 2016), GP‐led follow‐up after breast cancer (Grunfeld 2006), patient‐initiated follow‐up after breast cancer (Kirshbaum 2017) and GP‐led follow‐up after colon cancer (Wattchow 2006). Compared to specialist‐led follow‐up, non‐specialist‐led follow‐up makes little or no difference to depression at 12 months as measured by HADS‐Depression subscale (MD 0.03, 95% CI −0.35 to 0.42; P = 0.86; I2 = 8%; 5 studies, 1266 participants; Analysis 1.10). Sensitivity analysis (where we removed estimates that were indirectly derived from Brown 2002) did not change our conclusion (MD 0.19, 95% CI −0.18 to 0.56; P = 0.30; 4 studies, 1210 participants). We judged the certainty of evidence to be high. We had some concerns regarding indirectness due to few studies, but we did not judge them serious enough to warrant downgrading the evidence by a whole level.

The remaining three studies did not contradict the results of the meta‐analyses (702 participants). Grunfeld 1996 reported little or no effect on depression for GP‐led follow‐up after breast cancer on mean change from baseline to the end of the trial (MD 0.4, 95% CI −0.2 to 1.1). Koinberg 2004 and Murchie 2010 reported little or no effect on depression for nurse‐led follow‐up after breast cancer at 18 months (RR 0.5, 95% CI 0.0 to 5.8) and GP‐led follow‐up after melanoma at 12 months (P = 0.91) respectively.

Other measures of depression

The remaining three studies (676 participants) also reported that non‐specialist‐led follow‐up may make little or no difference to depression. The studies reported using the General Health Questionnaire (GHQ‐12; Goldberg 1978). Beaver 2009 reported that, "Although the percentage of cases (scores ≥4) was consistently higher in the hospital group at the start, middle, and end of the trial, differences between the groups at each time point were not significant". Beaver 2012 reported that mean GHQ‐12 score was slightly higher in the hospital arm (Cohen’s d = 0.11), and Sheppard 2009 reported little or no difference between the point‐of‐need (patient‐initiated) and routine follow‐up (MD 0.1, 95% CI −1.4 to 1.0, P = 0.767).

Cost

Eight studies reported cost outcomes (Augestad 2013; Beaver 2009; Beaver 2017; Grunfeld 1996; Kimman 2011; Koinberg 2004; Morrison 2018; Verschuur 2009), but due to the substantial heterogeneity in how the outcome was measured and reported, we could not pool the results in a meta‐analysis (1756 participants). Details of how each study measured and reported cost outcomes are summarised in Table 1. In general, Augestad 2013; Beaver 2017; Grunfeld 1996; Koinberg 2004; Morrison 2018; and Verschuur 2009 reported lower cost per participant in the non‐specialist‐led group, while Beaver 2009 and Kimman 2011 reported higher cost per participant in the non‐specialist‐led group. We judged the certainty of evidence to be very low and downgraded by three levels as the substantial heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost was measured and reported across studies.
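To make the direction and size of the reported cost differences easier to compare, the sketch below computes the absolute and relative difference (intervention minus comparison) for four of the comparison 1 studies, using the per‐participant figures listed in Table 1. It is an illustration only: the currencies, price years, cost perspectives and time horizons differ between studies, so the differences are not comparable across rows and are not a substitute for pooling.

```python
# Costs per participant copied from Table 1 as (intervention, comparison).
# Currency, perspective and time horizon differ between studies.
COSTS = {
    "Augestad 2013 (GBP, societal, 24 months)": (8233.0, 9889.0),
    "Beaver 2017 (GBP, health service, 12 months)": (746.0, 823.0),
    "Kimman 2011 (EUR, annual)": (4672.0, 4419.0),
    "Koinberg 2004 (EUR, per year of follow-up)": (495.0, 630.0),
}


def cost_difference(intervention: float, comparison: float):
    """Return (absolute difference, relative difference in %) vs comparison."""
    absolute = intervention - comparison
    relative_pct = 100.0 * absolute / comparison
    return absolute, relative_pct


for study, (intervention, comparison) in COSTS.items():
    absolute, relative_pct = cost_difference(intervention, comparison)
    direction = "lower" if absolute < 0 else "higher"
    print(f"{study}: {absolute:+.0f} ({relative_pct:+.1f}%), "
          f"{direction} in the non-specialist-led arm")
```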

Table 1. Summary of cost outcomes

Study ID | Outcome measurement | Results: intervention | Results: comparison

Augestad 2013

GP‐led vs surgeon‐led follow‐up after colon cancer

  • The cost elements included costs related to hospital visits, GP visits, laboratory tests, radiology examinations, colonoscopy, examinations owing to suspected relapse, treatment of recurrence, travelling/transportation, production losses, co‐payments and other participant/family expenses.

  • Cost minimisation analysis was carried out and reported as:

    • cost per participant per 3‐month follow‐up cycle in GBP

    • mean societal cost in GBP for 24 months' follow‐up: mean (95% CI)

  • The follow‐up programme initiated 1186 healthcare contacts (GP 678 vs surgeon 508), 1105 diagnostic tests (GP 592 vs surgeon 513) and 778 hospital travels (GP 250 vs surgeon 528). GP‐organised follow‐up was associated with societal cost savings (GBP 8233 vs GBP 9889, P < 0.001).

Intervention: cost per participant GBP 292 (255 to 327); mean societal cost GBP 8233 (7904 to 8619)

Comparison: cost per participant GBP 351 (315 to 386); mean societal cost GBP 9889 (9569 to 10194)

Beaver 2009

Nurse‐led telephone vs hospital follow‐up after breast cancer

  • Resource use included training of nurses in telephone follow‐up, routine follow‐up consultations and diagnostic tests ordered, participant travel, and time off work and unit costs (e.g. salary, qualifications, ongoing training and clinic overhead)

  • Cost was reported as total NHS (UK) cost per participant in GBP (mean/SD) over a mean of 24 months

  • Owing to the cost of nurse training, greater frequency and longer duration of the telephone consultations, and the frequent use of junior medical staff in hospital clinics, the mean costs of follow‐up consultations were higher with telephone follow‐up (MD GBP 55, bias‐corrected 95% CI GBP 29 to GBP 77).

Intervention: GBP 179 (118)

Comparison: GBP 24 (116)

Beaver 2017

Nurse‐led telephone vs hospital follow‐up after endometrial cancer

  • Unit cost data were drawn primarily from the unit costs of health and social care and NHS Reference Costs. The cost of a nurse or doctor contact hour included salary (excluding overtime and shift payments), on‐costs (e.g. national insurance contributions), qualifications, the ratio of participant contact to non‐contact time, and overheads.

  • Unit cost data were collected from 2012/13 and inflated from the time of the study using the most recent (2016/17) UK GBP deflator so that costs were expressed in GBP at 2016/17 prices.

  • Reported as mean total health service cost per participant at 6 months and 12 months

  • Differences between groups: 6 months, GBP 8 (bias‐corrected 95% CI GBP −147 to GBP 141); 12 months, GBP −77 (GBP −334 to GBP 154)

Intervention: 6 months, GBP 434; 12 months, GBP 746

Comparison: 6 months, GBP 426; 12 months, GBP 823

Damude 2016

Less frequent vs conventional follow‐up schedule after melanoma

  • Study authors calculated total follow‐up costs of the first year for all participants from University Medical Center Groningen, Netherlands (UMCG) based on data from the UMCG financial administration.

  • Cost was reported as total cost per participant (EUR; mean/SD)

  • Study reported a reduction in hospital costs at 1‐year follow‐up of 45% in the intervention group compared to the conventional schedule group.

Intervention: EUR 417.66 (452.74)

Comparison: EUR 761.97 (683.37)

Grunfeld 1996

GP‐led vs hospital follow‐up after breast cancer

  • The economic evaluation considered costs to the health service (particularly, the costs of follow‐up visits and diagnostic tests) and costs to the participants (such as lost earnings and out‐of‐pocket expenses) over 18 months. All costs were expressed in 1994 GBP.

  • Cost was reported as average cost per participant in GBP (95% CI)

  • GP participants were seen significantly more frequently and each follow‐up visit lasted longer. GPs ordered more diagnostic tests than did specialists. Although the mean cost per visit in general practice was significantly less, the mean cost of diagnostic tests per visit was similar in the 2 groups.

Intervention: GBP 64.70 (5.80 to 301.90)

Comparison: GBP 195.10 (62 to 737.40)

Grunfeld 2011

SCP vs no SCP after breast cancer

  • Analysis assessed the healthcare and societal costs (physician visits, diagnostic and laboratory tests, participant travel costs and lost productivity, and additional costs associated with the SCP) and QALYs over the 2‐year follow‐up of the randomised trial.

  • Cost was reported as:

    • total cost per participant in 2011 CAD

    • QALY

  • The SCP is not cost‐effective. Total costs per participant were lower for standard care (CAD 698 vs CAD 765), and total QALYs were almost equivalent (1.42 for standard care vs 1.41 for the SCP). The probability that the SCP was cost effective was 0.26 at a threshold value of a QALY of CAD 50,000.

Intervention: CAD 765; QALY 1.41

Comparison: CAD 698; QALY 1.42

Kimman 2011

Nurse‐led telephone vs hospital follow‐up after breast cancer

  • Analysis included health care (e.g. diagnostic procedures, outpatient clinic visits, telephone interviews) and non‐healthcare‐related costs (e.g. productivity loss, informal care). Cost prices were obtained from the Dutch Governmental manual for healthcare cost analysis.

  • Costs were reported by treatment arm (see note) as:

    • mean annual costs per participant in 2008 Euros (95% CI) over 12 months

    • QALY (95% CI)

Note: this study was a 4‐armed 2 x 2 design that also compared the use of an educational group programme (EGP) vs no EGP. We only included the nurse‐led vs hospital follow‐up comparison for this review.

Intervention: EUR 4672 (3489 to 6033); QALY 0.769 (0.746 to 0.794)

Comparison: EUR 4419 (3410 to 5501); QALY 0.747 (0.707 to 0.778)

Koinberg 2004

Nurse‐led on demand vs standard hospital follow‐up after breast cancer

  • Cost elements included medical examinations (e.g. mammography, pulmonary X‐ray, scintigraphy, CT scans, US, biopsies), visits (nurse, physician, social worker, physiotherapist and breast prosthetic technician visits) and telephone contacts.

  • Costs are reported as mean cost per participant per year of follow‐up by intervention arm in 2006 Euros (95% CI) in Sweden.

  • Specialist nurse intervention with check‐ups on demand was 20% less expensive than routine follow‐up visits to the physician, explained by the numbers of visits to the physician in the respective study arms.

EUR 495 (410 to 797)

EUR 630 (557 to 1055)

Kokko 2003

Less intensive (CXR only when indicated) vs regular X‐rays after breast cancer

  • Costs from the hospital perspective (no. of contacts (visits and phone calls) and diagnostic tests) were compared during the first 5 years of follow‐up

  • Mean costs of follow‐up per participant in 2003‐2004 Euros were reported for 4 arms (see note):

    • A: every 3 months, routine tests (X‐ray every 6 months)

    • B: every 3 months, no routine tests

    • C: every 6 months, routine tests

    • D: every 6 months, no routine tests

  • Routine examinations in the follow‐up of asymptomatic primary breast cancer participants increase the costs of follow‐up 2.2 times.

Note: the clinical outcomes of this study (survival and recurrence) were reported in a separate paper according to 2 groups: 1 group had regular CXRs every 6 months while the other group had CXR only when clinically needed.

Arm B: EUR 1493

Arm D: EUR 1050

Arm A: EUR 2269

Arm C: EUR 1656

Monteil 2010

More intensive (CDET) vs CT scans after lung cancer

  • The analysis included only direct medical costs for each participant (imaging procedure, fixed hospital and patient transportation costs) over 2 years. Reimbursement prices were determined for each procedure by the French Healthcare system using 2002 repayment tariffs.

  • Costs were reported as average cost of follow‐up visits and imaging per participant in Euros (95% CI)

  • CDET imaging was more expensive, provided earlier detection of recurrence, but did not modify survival outcome.

EUR 1104.96 (954 to 1240)

EUR 755.47 (640 to 864)

Morrison 2018

Nurse‐led telephone vs hospital follow‐up after gynaecological cancer

  • Mean total cost per participant (SD) of all contacts with NHS primary and secondary care services and other cancer services over the 6‐month follow‐up period were calculated.

  • Difference between groups: GBP 26.60 (bootstrapped 95% CI GBP −290.37 to GBP 240.42)

  • Although this difference is not statistically significant, the mean total costs of service use were lower in the intervention group.

GBP 388.84 (320.11)

GBP 415.44 (329.08)

Oltra 2007

More intensive (additional diagnostic tests) vs standard follow‐up after breast cancer

  • Cost elements were not specified but were calculated over a median of 3 years of follow‐up in a hospital in Spain where participants were recruited from 1997‐1999.

  • Costs were reported as:

    • total cost of follow‐up in Euros

    • mean costs per participant in Euros

EUR 74,171

EUR 1278 per participant

EUR 24,567

EUR 390 per participant

Picardi 2014

Less intensive (US/chest radiography) vs standard (PET/CT scans) follow‐up after Hodgkin lymphoma

  • Cost was calculated from the perspective of the Italian National Healthcare System including costs of imaging procedures and surgical biopsies over median follow‐up of 60 months.

  • Costs were reported as average cost of follow‐up per participant in 2010 Euros

  • Estimated cost per relapse diagnosed with routine PET/CT was 10‐fold higher compared with that diagnosed with routine US/chest radiography (P < 0.0001 for difference between groups).

EUR 862

EUR 8818

Rodríguez‐Moranta 2006

More intensive (CT scan plus colonoscopy) vs simple follow‐up after colorectal cancer

  • Costs were calculated for all procedures performed during the scheduled follow‐up or as a result of additional work‐up for any suspected recurrence according to Hospital Clinic current billing (participants recruited from 1997‐2001). Indirect costs, such as time lost from work or transportation charges, were not factored into the analysis.

  • Costs were reported as:

    • cost per participant in Euros for median follow‐up of up to 49 months

    • cost per resectable tumour recurrence in Euros

  • Although overall cost of follow‐up was higher in the intensive strategy group (EUR 300,315) than in simple strategy group (EUR 188,630), the intensive surveillance strategy was more efficient when resectability was considered.

EUR 300,315 per participant

EUR 16,684 per tumour

EUR 188,630 per participant

EUR 18,863 per tumour

Sobhani 2018

More intensive (18FDG‐PET) vs conventional follow‐up after colorectal cancer

  • Costs were assessed in accordance with the Consolidated Health Economic Evaluation Reporting Standards statement for single‐trial‐based studies.

  • The prospective analysis determined the cost per life‐year gained with 18FDG‐PET/CT vs the standard of care over the 3‐year trial period. Hospital inpatient costs were estimated and the average cost for each study group determined with adjustment for the actual length of stay and resources used during the admission, including the cost of imaging studies. Discounting was not performed. Total cost (mean/SD) was computed both with and without the cost of 18FDG‐PET/CT in 2016 Euros. Costs were compared between groups using the Wilcoxon test.

  • Difference between groups P value: without imaging: P = 0.23; with imaging: P = 0.033

  • The probabilistic sensitivity analysis suggested that the intervention strategy increased costs without improving participant outcomes, with a likelihood of 87% for the survival end point

Without imaging: EUR 14,573 (27,531)

With imaging: EUR 18,192 (27,679)

EUR 11,131 (13,254)

Verschuur 2009

Nurse‐led vs surgeon‐led follow‐up after oesophageal cancer

  • Cost elements included comprehensive data on hospital costs (inpatient days, health practitioner care, medical treatment), diagnostic interventions and extramural care (GP‐care) for a period of 12 months' follow‐up

  • Costs were reported as total costs per participant, reported in 2006 Euros in the Netherlands

  • The total average costs per participant were not statistically significantly higher for standard follow‐up than nurse‐led follow‐up (EUR 3798 vs EUR 2592; P = 0.11). Costs of nurse‐led follow‐up visits were lower than those of standard follow‐up visits (EUR 234 vs EUR 503; P < 0.001).

EUR 2592

EUR 3798

18FDG: 18F‐fluoro‐2‐deoxy‐D‐glucose; CAD: Canadian Dollar; CI: confidence interval; CDET: coincidence detection system imaging; CT: computed tomography; CXR: chest X‐ray; EUR: Euro; GBP: Pound Sterling; GP: general practitioner; MD: mean difference; NHS: National Health Service; PET: positron emission tomography; QALY: quality‐adjusted life year; SCP: survivorship care plan; SD: standard deviation; US: ultrasound
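The cost‐effectiveness statement in the SCP row of the table above can be illustrated with simple arithmetic on the reported point estimates. The sketch below is not the study's analysis: it ignores sampling uncertainty (the reported probability of 0.26 comes from the study's own analysis, which cannot be reproduced from two point estimates), and the function and variable names are hypothetical.

```python
# Illustrative arithmetic only, using the point estimates quoted in the table
# for SCP vs standard care. Names are hypothetical, not taken from the study.

def incremental_summary(cost_new, cost_old, qaly_new, qaly_old, threshold):
    """Return incremental cost, incremental QALYs, net monetary benefit
    and whether the new option is dominated on point estimates."""
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    # Net monetary benefit: health gain valued at the threshold, minus extra cost.
    nmb = threshold * d_qaly - d_cost
    # "Dominated" here means: costs more and yields no more QALYs.
    dominated = d_cost > 0 and d_qaly <= 0
    return d_cost, d_qaly, nmb, dominated


d_cost, d_qaly, nmb, dominated = incremental_summary(
    cost_new=765.0, cost_old=698.0,  # CAD per participant: SCP vs standard care
    qaly_new=1.41, qaly_old=1.42,    # QALYs over 2 years: SCP vs standard care
    threshold=50_000.0,              # CAD per QALY, as quoted in the table
)
print(f"Incremental cost: CAD {d_cost:.0f}")
print(f"Incremental QALYs: {d_qaly:.2f}")
print(f"Net monetary benefit: CAD {nmb:.0f}")
print(f"SCP dominated on point estimates: {dominated}")
```

A negative net monetary benefit at the chosen threshold is consistent with the table's conclusion that the SCP is not cost‐effective, but the point estimates alone say nothing about how certain that conclusion is.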

Comparison 2: less intensive versus more intensive follow‐up

We included 24 studies for this comparison (D'Cruz 2016; Damude 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Kjeldsen 1997; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Picardi 2014; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Rustin 2007; Schoemaker 1998; Secco 2002; Sobhani 2008; Sobhani 2018; Wang 2009; Westeel 2012; Wille‐Jorgensen 2018).

Overall survival

Eighteen studies reported on the outcome of survival (D'Cruz 2016; GILDA 2016; GIVIO 1994; Kjeldsen 1997; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Schoemaker 1998; Secco 2002; Sobhani 2018; Wang 2009; Westeel 2012; Wille‐Jorgensen 2018).

Thirteen studies reported data that we could pool for meta‐analysis investigating less versus more intensive follow‐up after oral cancer (D'Cruz 2016), breast cancer (GIVIO 1994; Rosselli Del Turco 1994), non‐small cell lung cancer (Westeel 2012), and colorectal cancer (GILDA 2016; Kjeldsen 1997; Mäkelä 1992; Ohlsson 1995; Rodríguez‐Moranta 2006; Schoemaker 1998; Sobhani 2018; Wang 2009; Wille‐Jorgensen 2018).

Compared to more intensive follow‐up, we found that less intensive follow‐up may make little or no difference to overall survival (HR 1.05, 95% CI 0.96 to 1.14; P = 0.29, I² = 8%; 13 studies, 10,726 participants; Analysis 2.1; Figure 5). The anticipated absolute effect was 1 fewer survivor per 100 patients (ranging from 3 fewer to 1 more). A funnel plot showed no detectable publication bias. We judged the certainty of evidence to be low as we downgraded by two levels for some concerns regarding study limitations (lack of allocation concealment in one study, Schoemaker 1998) and indirectness as the studies were primarily investigating follow‐up after colorectal and breast cancer, and serious concerns regarding imprecision as the confidence interval includes effects that are not trivial (potentially up to 3 fewer survivors per 100 patients). Sensitivity analysis (where we removed estimates that were indirectly derived from the following studies: Kjeldsen 1997; Mäkelä 1992; Ohlsson 1995; Rosselli Del Turco 1994; Sobhani 2018) gave a similar result (HR 1.07, 95% CI 0.96 to 1.19; P = 0.24; 8 studies, 9037 participants).
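To make the link between a pooled hazard ratio and an absolute effect per 100 patients concrete, the following sketch applies the standard proportional‐hazards relationship, in which survival in the comparison arm raised to the power of the HR gives survival in the other arm. The baseline survival of 0.75 is a hypothetical value chosen for illustration; the text does not state the baseline risk used, so the output should not be read as a reproduction of the review's calculation.

```python
# Sketch: converting a hazard ratio into survivors gained or lost per 100 patients.
# ASSUMED_BASELINE is hypothetical; it is not a figure taken from the review.

def absolute_effect_per_100(baseline_survival, hr):
    """Difference in survivors per 100 patients under proportional hazards,
    where S_less_intensive = S_more_intensive ** HR."""
    survival_less_intensive = baseline_survival ** hr
    return 100.0 * (survival_less_intensive - baseline_survival)


ASSUMED_BASELINE = 0.75  # assumed survival proportion in the more intensive arm

for label, hr in [("point estimate", 1.05), ("lower CI limit", 0.96), ("upper CI limit", 1.14)]:
    effect = absolute_effect_per_100(ASSUMED_BASELINE, hr)
    direction = "more" if effect > 0 else "fewer"
    print(f"HR {hr:.2f} ({label}): {abs(effect):.1f} {direction} survivors per 100")
```

The result depends on the assumed baseline: the same HR translates into a smaller absolute difference when baseline survival is very high or very low.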


Forest plot of comparison 2. Less intensive versus more intensive follow‐up, outcome: 2.1 overall survival

A HR greater than 1 indicates a higher hazard of death (worse survival) in the less intensive arm and a lower hazard of death (better survival) in the more intensive arm.

We carried out meta‐regression analysis to quantify clinical heterogeneity and found little or no difference in the intervention effect by cancer site (breast, colorectal, lung, oral), P = 0.32; year of publication (before 2000, from 2000 onwards), P = 0.87; or study quality (high, moderate, low), P = 0.71. We did not carry out meta‐regression for participant age or sex, as age was similar across studies (approximately 60 to 65 years) and sex was either associated with cancer site (e.g. breast cancer) or effects were not reported separately by sex in studies with both men and women.

We could not incorporate data from five other studies (Kokko 2003; Monteil 2010; Pietra 1998; Primrose 2014; Secco 2002). Three of these studies (1752 participants) reported no difference between less and more intensive follow‐up on overall survival after breast cancer (five‐year survival rate = 85% versus 85%, no P‐value; Kokko 2003), lung cancer ("overall survival" = 26.5 months ± 19.6 versus 29 months ± 17.1, no P‐value; Monteil 2010), and colorectal cancer (survival curves comparing all four arms, log‐rank P = 0.56; Primrose 2014). The remaining two studies (544 participants) reported improved survival outcomes with more intensive follow‐up after colorectal cancer. Pietra 1998 reported improved cumulative survival rates (73% versus 58%, log‐rank P < 0.02) and Secco 2002 reported improved actuarial five‐year survival rates (for high‐risk patients: Chi² = 4.97, P < 0.05 and for low‐risk patients: Chi² = 7.90, P < 0.01).

Time to detection of recurrence

Twenty‐two studies reported on time to detection of recurrence (Damude 2016; Gambazzi 2018; GILDA 2016; GIVIO 1994; Kjeldsen 1997; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Picardi 2014; Pietra 1998; Primrose 2014; Rodríguez‐Moranta 2006; Rosselli Del Turco 1994; Rustin 2007; Secco 2002; Sobhani 2008; Sobhani 2018; Wang 2009; Westeel 2012; Wille‐Jorgensen 2018).

Twelve studies reported data that we were able to pool for meta‐analysis investigating less versus more intensive follow‐up after colorectal cancer (GILDA 2016; Kjeldsen 1997; Primrose 2014; Rodríguez‐Moranta 2006; Sobhani 2008; Wille‐Jorgensen 2018), breast cancer (GIVIO 1994; Rosselli Del Turco 1994), non‐small cell lung cancer (Gambazzi 2018; Westeel 2012), Hodgkin lymphoma (Picardi 2014) and testicular cancer (Rustin 2007).

Compared to more intensive follow‐up, less intensive follow‐up probably increases time to detection of recurrence (HR 0.85, 95% CI 0.79 to 0.92; P < 0.0001, I² = 0%; 12 studies, 11,276 participants; Analysis 2.2, Figure 6). The anticipated absolute effect was 3 fewer detected recurrences per 100 patients (ranging from 5 fewer to 2 fewer). A funnel plot showed no detectable publication bias. We judged the certainty of evidence to be moderate as we downgraded by one level for serious concerns regarding indirectness as HRs for eight of the studies were not reported but indirectly estimated from published data. Sensitivity analysis (where we removed all the studies except GILDA 2016; Picardi 2014; Westeel 2012; Wille‐Jorgensen 2018) showed that less intensive follow‐up may still delay detection of recurrence (HR 0.87, 95% CI 0.79 to 0.96; P = 0.006; 4 studies, 5872 participants).
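As a rough consistency check on a reported hazard ratio, its confidence interval and its P value, the standard error on the log scale can be back‐calculated from the interval width and a two‐sided P value derived from a normal approximation. The sketch below does this for the pooled estimate for time to detection of recurrence. It assumes a symmetric 95% interval on the log scale and works from rounded published limits, so the result is approximate; the function name is hypothetical.

```python
import math

# Sketch: recover the log-scale standard error and an approximate two-sided
# P value from a hazard ratio and its 95% CI (normal approximation assumed).

def se_and_p_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Return (SE of log HR, z statistic, two-sided P value)."""
    log_se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / log_se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return log_se, z, p_value


log_se, z, p_value = se_and_p_from_hr_ci(hr=0.85, lower=0.79, upper=0.92)
print(f"SE of log HR: {log_se:.4f}")
print(f"z statistic: {z:.2f}")
print(f"Approximate two-sided P value: {p_value:.1e}")
```

Because the inputs are rounded to two decimal places, small discrepancies from a published P value are expected and do not by themselves indicate an error.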


Forest plot of comparison 2. Less intensive versus more intensive follow‐up, outcome: 2.2 time‐to‐detection of recurrence

A HR less than 1 indicates a lower hazard of detecting recurrence (delay in detection of recurrence) in the less intensive arm and a higher hazard of detecting recurrence (better detection of recurrence) in the more intensive arm.

We carried out meta‐regression analysis to quantify clinical heterogeneity and found little or no difference in the intervention effect by cancer site (breast, colorectal, lung, other), P = 0.81; year of publication (before 2000, from 2000 onwards), P = 0.89 or study quality (high, moderate, low), P = 0.42. We did not carry out meta‐regression for participant age or sex, as age was similar across studies (approximately 60 to 65 years) and sex was either associated with cancer site (e.g. breast cancer) or effects were not reported separately by sex in studies with both men and women.

We could not incorporate data from 10 other studies (Damude 2016; Kokko 2003; Mäkelä 1992; Monteil 2010; Ohlsson 1995; Oltra 2007; Pietra 1998; Secco 2002; Sobhani 2018; Wang 2009). Four of these studies (854 participants) reported shorter time to detection of recurrence for more intensive follow‐up after colorectal cancer (mean/SD 10 ± 5 versus 15 ± 10 months, P = 0.002; Mäkelä 1992 and mean/SD 10.3 ± 2.7 versus 20.2 ± 6.1 months, P < 0.0003; Pietra 1998), lung cancer (mean/SD 12 ± 9.9 versus 18 ± 11.8 months, no P‐value; Monteil 2010) and breast cancer (mean 1.9 versus 2.1 years, no P‐value; Kokko 2003). Four other studies (734 participants) reported little or no difference between groups on time to recurrence after melanoma (no estimate reported by the authors, only Chi² P‐value = 0.893; Damude 2016), colorectal cancer (median 1.7 (range 0.3 to 7.6) years versus 2.0 (range 0.8 to 5.6) years, P > 0.05; Ohlsson 1995; and mean/SD 22 ± 17.6 months versus 35 ± 23.9 months, P = 0.49; Wang 2009), and detection of recurrence after breast cancer at three years' follow‐up (22.41%, 90% CI 13.40 to 31.42 versus 17.46%, 90% CI 9.59 to 25.30; Oltra 2007). Secco 2002 also reported on "disease‐free intervals" but the outcome data were not relevant for this review as the study compared high‐risk patients with low‐risk patients instead of patients in the intensive treatment group with patients in the minimal treatment group (337 participants). Finally, Sobhani 2018 only reported on detection of unresectable recurrence in colorectal cancer (7.0 versus 14.3 months in favour of the more intensive arm; 239 participants).

The delay in detection of recurrence between the two arms ranged from 2 to 10 months in the studies that reported the outcome in time. However, the clinical relevance of this delay on patient outcomes, such as survival, is uncertain as this depends on multiple factors, such as cancer site, treatment received or the patient’s disease burden, and the studies included were not designed to address this.

Health‐related quality of life

Three studies reported on health‐related quality of life (Damude 2016; GILDA 2016; GIVIO 1994), but we could not pool the reported data due to the different measures used and varying follow‐up periods. However, all three studies (2742 participants) reported that less intensive follow‐up made little or no difference in health‐related quality of life when compared to more intensive follow‐up. Damude 2016 used the MCS subscale of the RAND‐36 (a version of the SF‐36) and reported that less intensive follow‐up after melanoma had no effect on health‐related quality of life at 12 months (mean/SD 54.3 (7.6) versus 52.5 (8.8); P = 0.62). GILDA 2016 used the SF‐12 and reported graphs showing no difference between less intensive and more intensive follow‐up after colorectal cancer on the PCS and MCS subscales for up to 60 months of follow‐up. GIVIO 1994 measured quality of life using a compilation of items selected from several quality‐of‐life instruments and reported that "type of follow‐up did not affect various dimensions of health‐related quality of life" after breast cancer for up to five years' follow‐up. We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to the few studies, heterogeneous measures and reporting of results by different estimates that we could not pool.

Anxiety

Only Damude 2016 reported on anxiety using STAI and showed that less intensive follow‐up after melanoma made little or no difference to anxiety at 12 months (mean/SD 29.5 (8.8) versus 31.0 (9.9); P = 0.54; 180 participants). We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.

Depression

None of the studies reported on depression. Thus, there is a lack of evidence for the effect of less intensive versus more intensive follow‐up on depression following completion of primary cancer treatment in adult cancer survivors.

Cost

Six studies (1412 participants) reported cost outcomes (Damude 2016; Kokko 2003; Monteil 2010; Oltra 2007; Picardi 2014; Rodríguez‐Moranta 2006), but due to the substantial heterogeneity in how studies measured and reported the outcome, we could not pool the results in a meta‐analysis. We have summarised details of how each study measured and reported cost outcomes in Table 1. One study (Secco 2002) reported carrying out cost analysis but did not report any data. All studies reported lower costs for the less intensive arm from the perspective of the patient or healthcare system but the difference in cost varied considerably depending on the components of or procedures used in the different interventions. We judged the certainty of evidence to be very low and downgraded by three levels as the substantial heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost was measured and reported across studies.

Comparison 3: follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care

We included 12 studies for this comparison (Davis 2013; Grunfeld 2011; Hershman 2013; Jefford 2016; Juarez 2013; Kvale 2016; Malmstrom 2016; Maly 2017; ROGY 2015; Ruddy 2016; Van der Meulen 2013; Young 2013).

Overall survival

None of the studies reported on overall survival. Thus, there is a lack of evidence for the effect of follow‐up integrating patient education/survivorship care plans versus usual care on survival following completion of primary cancer treatment in adult cancer survivors.

Time to detection of recurrence

None of the studies reported on time to detection of recurrence. Thus, there is a lack of evidence for the effect of follow‐up integrating patient education/survivorship care plans versus usual care on detection of recurrence following completion of primary cancer treatment in adult cancer survivors.

Health‐related quality of life

All twelve studies (2846 participants) reported on health‐related quality of life using a variety of measurement scales and with varying follow‐up periods (Davis 2013; Grunfeld 2011; Hershman 2013; Jefford 2016; Juarez 2013; Kvale 2016; Malmstrom 2016; Maly 2017; ROGY 2015; Ruddy 2016; Van der Meulen 2013; Young 2013). We were unable to pool the reported data in a meta‐analysis as there was no scale or subscale that at least three studies reported at 12 months of follow‐up (criteria prespecified in the Data synthesis section). However, all the studies except Kvale 2016 reported little or no difference in health‐related quality of life between the intervention and usual care groups. We present results below according to measurement scale.

Two studies used the SF‐36. Grunfeld 2011 investigated a survivorship care plan intervention after breast cancer and unpublished results showed similar mean scores between groups at 12 months for both the PCS subscale (mean/SD 48.2 (8.8) versus 48.4 (9.4)) and MCS subscale (mean/SD 51.4 (9.4) versus 49.5 (10.7)). Kvale 2016 investigated the POSTCARE intervention (survivorship care plan and patient coaching) after breast cancer and reported clinically meaningful improvement for physical role functioning (P = 0.0009), bodily pain (P = 0.03) and emotional role functioning (P = 0.04) at three months' follow‐up.

Three studies used the EORTC‐C30. Jefford 2016 (the SurvivorCare intervention) and Malmstrom 2016 investigated the addition of patient supportive education or care packages to usual care after colorectal and oesophageal cancer respectively and both reported little or no difference on all subscales at six months between the intervention group and the usual care group. Unpublished results from ROGY 2015, investigating survivorship care plan after endometrial and ovarian cancer, also showed little or no difference on all subscales at 12 months between the survivorship care plan group and the usual care group.

Three studies used the SF‐12. Davis 2013 investigated the addition of patient symptom education and monitoring after prostate cancer and reported, "no significant group differences at 7 months" (P > 0.10), Maly 2017 investigated the addition of treatment summaries and survivorship care plan after breast cancer and reported, "no significant differences in SF‐12 scores from baseline to 12 months between groups (data not shown)", while Ruddy 2016 investigated a survivorship care plan with patient navigator intervention after breast cancer and reported similar medians and interquartile ranges in both groups at 12 months, but did not carry out any statistical analyses.

Three studies used various versions of the FACT scale. In addition to reporting on the SF‐12 above, Davis 2013 also reported little or no effect as measured by FACT‐G at seven months (P > 0.10). Young 2013 investigated the CONNECT intervention (additional patient education through structured telephone calls by trained nurses) after colorectal cancer and reported little or no effect as measured by the FACT‐C at six months (P = 0.58), and Hershman 2013 investigated survivorship care plan with patient‐nurse sessions after breast cancer and reported little or no effect as measured by the FACT‐B at six months for the physical well‐being subscale (P = 0.93) and functional well‐being subscale (P = 0.83).

Juarez 2013 investigated an individualised bilingual patient education programme after breast cancer and used the City of Hope Quality of Life Questionnaire to measure quality of life at three and six months and reported that, "QoL [quality of life] increased slightly in both groups or remained unchanged, without significant group by time interaction."

Although all the studies but one reported consistent results, we judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to studies being at high risk of bias, the heterogeneous measures and reporting of results by different estimates that could not be pooled.

Anxiety

Only ROGY 2015 reported on anxiety (470 participants). Unpublished data showed similar scores for both the survivorship care plan and usual care group on the HADS‐Anxiety subscale at 12 months for endometrial cancer survivors (mean/SD 5 (4) versus 4.2 (3.6)) and ovarian cancer survivors (mean/SD 5.4 (3.9) versus 6.2 (4.5)). We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.

Depression

Eight studies (2351 participants) reported on depression or psychological distress using a variety of measurement scales and with varying follow‐up periods (Grunfeld 2011; Hershman 2013; Jefford 2016; Juarez 2013; Kvale 2016; ROGY 2015; Van der Meulen 2013; Young 2013). We were unable to pool the reported data in a meta‐analysis as there was no scale or subscale that at least three studies reported at 12 months of follow‐up (criteria prespecified in the Data synthesis section). However, all the studies except Van der Meulen 2013 reported little or no difference in depression between the intervention and usual care groups. We present results below according to measurement scale.

One study used the HADS‐Depression subscale. Unpublished results from ROGY 2015 showed similar mean scores for both the survivorship care plan and usual care groups at 12 months' follow‐up among endometrial cancer survivors (mean/SD 3.8 (3.9) versus 3.7 (3.5)) and ovarian cancer survivors (mean/SD 3.5 (3.6) versus 4.4 (4.4)).

One study used the Profile of Mood States (POMS). Grunfeld 2011 reported little or no mean difference between the survivorship care plan and usual care groups at 12 months' follow‐up among breast cancer survivors (MD −2.6, 95% CI −5.6 to 0.5).

One study used the Patient Health Questionnaire (PHQ‐9). Kvale 2016 reported no difference between the POSTCARE and usual care groups in change scores from baseline to three‐month follow‐up among breast cancer survivors (P = 0.376).

One study used the Brief Symptom Inventory (BSI‐18). Jefford 2016 reported little or no difference between the SurvivorCare and usual care groups at six months' follow‐up among colorectal cancer survivors (MD −0.9, 95% CI −3.7 to 1.9).

Two studies used the Center for Epidemiological Studies‐Depression scale (CES‐D). Hershman 2013 reported little or no difference between the survivorship care plan and usual care groups at six months' follow‐up among breast cancer survivors (P = 0.83), while Van der Meulen 2013 reported a decrease in depressive symptoms in the 'NUCAI' survivorship care group compared with the usual care group at 12 months among head and neck cancer survivors (MD −2.8, 95% CI −5.2 to −0.3).

Two studies used the distress thermometer, a single‐question screening instrument that evaluates patients' distress during the past week on a scale of 1 to 10. Both studies reported little or no difference between groups at six months among breast cancer and colorectal cancer survivors respectively: Juarez 2013 (P = 0.305) and Young 2013 (MD −0.3, 95% CI −0.8 to 0.2).

Although all the studies but one reported consistent results, we judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to one study (Juarez 2013) being at high risk of bias (as the principal investigator was responsible for all aspects of the study, including implementation and follow‐up for both the experimental and attention control groups, which were also highly imbalanced), the heterogeneous measures and the reporting of results by different estimates that could not be pooled.

Cost

Only Grunfeld 2011 reported cost outcomes, using quality‐adjusted life years (QALYs) to calculate the cost‐effectiveness of a survivorship care plan after two years of follow‐up among breast cancer survivors (408 participants). One QALY is equivalent to one year in perfect health; this study reported 1.41 QALYs for the survivorship care plan arm and 1.42 QALYs for the usual care only arm. We judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.

Discussion

The aim of this systematic review was to summarise the available evidence on the effects of different follow‐up strategies after curative primary cancer treatment in adult cancer survivors on overall survival, time to detection of recurrence, health‐related quality of life, anxiety, depression and cost. Because of the wide range of follow‐up strategies available, we classified interventions into three groups according to whether the study investigated: 1) who delivers follow‐up, 2) the intensity of the follow‐up strategy in terms of clinical examinations and diagnostic procedures, or 3) the integration of other components into usual care that may be relevant to the detection of recurrence, such as patient symptom education/monitoring or information provided through survivorship care plans. This reflects the "who", "what" and "when" elements of Box 1. The "where" is implicit in who delivers follow‐up (e.g. follow‐up led by a general practitioner will take place in primary care), while the use of different formats/technologies to deliver care can vary across the same follow‐up elements and was not considered in this review.

Although we were able to group studies across cancer sites and synthesise the evidence for different outcomes, we were still unable to provide conclusive evidence of the effect of certain intervention types on specific outcomes (e.g. the effect of non‐specialist‐led follow‐up on survival and detection of recurrence) because of the lack of available studies. Nevertheless, identifying this knowledge gap across cancer sites is also an important part of this review. Below, we summarise the findings, discuss the strengths and limitations of this review, and outline the implications for future research and practice.

Summary of main results

We identified 53 studies that evaluated three broad types of follow‐up strategies.

Non‐specialist‐led follow‐up (i.e. general practitioner‐led, nurse‐led, patient‐initiated or shared care) versus specialist‐led follow‐up

We included 17 studies for this comparison. Four studies reported on overall survival, four studies reported on detection of recurrence, 13 studies reported on health‐related quality of life, 14 studies reported on anxiety, 11 studies reported on depression and eight studies reported on cost.

It is uncertain how this strategy affects overall survival (HR 1.21, 95% CI 0.68 to 2.15; very low‐certainty evidence) and time to detection of recurrence (data could not be pooled). This strategy may make little or no difference to health‐related quality of life at 12 months (MD 1.06, 95% CI −1.83 to 3.95; low‐certainty evidence), probably makes little or no difference to anxiety at 12 months (MD −0.03, 95% CI −0.73 to 0.67; moderate‐certainty evidence) and has little or no effect on depression at 12 months (MD 0.03, 95% CI −0.35 to 0.42; high‐certainty evidence). It is unclear whether non‐specialist‐led follow‐up is as cost‐effective as specialist‐led follow‐up because of considerable heterogeneity in how the included studies measured and reported this outcome.

Less intensive versus more intensive follow‐up (based on clinical visits, examinations or diagnostic procedures)

We included 24 studies for this comparison. Eighteen studies reported on overall survival, 22 on time to detection of recurrence, three on health‐related quality of life, one on anxiety, none on depression and six on cost.

Less intensive follow‐up may make little or no difference to overall survival (HR 1.05, 95% CI 0.96 to 1.14; low‐certainty evidence), but probably increases the time to detection of recurrence (HR 0.85, 95% CI 0.79 to 0.92; moderate‐certainty evidence). Meta‐regression analysis for both outcomes showed little or no difference in the intervention effect by cancer site, publication year or quality score. However, the clinical relevance of a delay in detecting recurrence for patient outcomes is not known, because we did not analyse the two outcomes together. Sensitivity analysis gave similar results for both outcomes. It is unclear whether less intensive follow‐up has an effect on health‐related quality of life, anxiety and depression, because the certainty of the evidence is very low (and there is no evidence at all for depression). All studies reported lower costs for the less intensive arm, although there was considerable heterogeneity in how they measured and reported this outcome.

Follow‐up integrating patient symptom education or monitoring, or survivorship care plans, versus usual care

We included 12 studies for this comparison. None of the studies reported on overall survival or detection of recurrence; all 12 reported on health‐related quality of life, one on anxiety, eight on depression and one on cost.

There is a lack of evidence on the effect of follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans, on overall survival and time to detection of recurrence, because none of the studies assessed these outcomes. It is unclear whether this strategy makes a difference to health‐related quality of life, anxiety, depression and cost, because the certainty of the evidence is very low. There was considerable heterogeneity in how the studies measured health‐related quality of life and depression, and anxiety and cost were each reported in a single study only.

Overall completeness and applicability of evidence

Thanks to the comprehensive search strategy, we were able to identify, as far as possible, nearly all available studies that address the objectives of this review. We were also able to pool the results of diverse interventions by constructing what we hope are three meaningful comparisons: 1) non‐specialist‐led versus specialist‐led follow‐up, 2) less intensive versus more intensive follow‐up, and 3) follow‐up integrating patient symptom education or monitoring, or survivorship care plans, versus usual care alone.

In this review, we included cancer sites that have not been represented before. We also used more advanced statistical methods to include all possible data in the most appropriate way available (e.g. by calculating hazard ratios for time‐to‐event outcomes) and to investigate heterogeneity without splitting the studies into subgroups and losing statistical power (e.g. by meta‐regression). In addition, we analysed the outcomes that can be considered the most important to patients, clinicians and health services, namely overall survival, detection of recurrence, quality of life, anxiety, depression and cost.

However, despite our efforts, the evidence presented in this review is limited by the fact that there is still a considerable lack of high‐quality research in many cancer sites and in many parts of the world. Although 12 cancer sites and 15 countries were represented in this review, most studies were conducted in patients with breast and colorectal cancer, and almost all studies were conducted in high‐income countries with universal healthcare systems, where there is a direct incentive to find the model that best balances cost and effect. Reasons for this may include that trials are very expensive to run and that relatively long follow‐up periods are needed for survival and recurrence outcomes, particularly in cancer sites with long survival and low recurrence rates. Therefore, the evidence presented in this review may not be directly applicable in certain parts of the world and for certain cancer types. We also note that we only included studies conducted in participants with curatively treated cancer. This means that the results cannot be generalised to the care of patients with late‐stage or incurable cancer, which is characterised by further treatment and surveillance of progression rather than of recurrence.

Finally, the breadth of our objectives could only be achieved at the expense of specificity. We therefore did not investigate more detailed aspects of an intervention, such as the effectiveness of tumour markers or specific types of imaging, nor did we investigate other outcomes, such as cancer‐specific survival or the effects of early treatment for asymptomatic recurrences. We also did not investigate the potential adverse effects of different follow‐up strategies. However, we hope that the framework presented in the introduction and the results of this comprehensive review can serve as a basis for further research into more specific aspects of cancer follow‐up.

Certainty of the evidence

Because the evidence came from randomised trials, the quality assessments started at the highest level of certainty. However, after the GRADE assessment, the final certainty of the evidence ranged from high to very low (Schünemann 2013). We summarised the certainty of the evidence according to the GRADE criteria for each outcome in the three 'Summary of findings' tables: Summary of findings table 1 (non‐specialist‐led versus specialist‐led follow‐up), Summary of findings table 2 (less intensive versus more intensive follow‐up) and Summary of findings table 3 (follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans, versus usual care alone). Details of the GRADE criteria assessed and downgraded for each outcome are given in the footnotes of each 'Summary of findings' table and in the GRADE evidence profiles (Appendix 3).

It should be noted that almost none of the trials in this review used blinding. In line with our judgement that the direction of the bias is unclear (see Blinding [performance bias and detection bias]), we did not consider the lack of blinding a serious enough concern to downgrade the certainty of the evidence.

For non‐specialist‐led follow‐up, the certainty of the evidence for the effect on overall survival and detection of recurrence was very low, which means that the synthesised results do not provide a reliable indication of the likely effect of this intervention. The certainty of the evidence for the effect of this intervention was low for health‐related quality of life, moderate for anxiety, high for depression and very low for cost. Therefore, although we can be relatively certain that this intervention does not worsen patients' anxiety or depression, we are less sure how it affects quality of life and cost.

For less intensive follow‐up, the certainty of the evidence for the effect on overall survival was low, mainly because of imprecision: the confidence interval included a harmful effect (potentially three additional deaths) that we considered important to patients and decision‐makers. For time to detection of recurrence, the certainty of the evidence was moderate, which means that the estimated effect is likely to be close to the true effect. The certainty of the evidence for the effect of this intervention on the remaining outcomes was very low, and there was no evidence for depression, which highlights the lack of knowledge about how this intervention affects patients' well‐being.

For follow‐up integrating patient education or monitoring and survivorship care plans, there was no evidence of the effect on overall survival and detection of recurrence, which highlights that it is not known how this newer type of follow‐up strategy affects patients' prognostic outcomes. The certainty of the evidence for the effect of this intervention on the remaining outcomes was very low, mainly because the studies used heterogeneous measures and time points that could not be pooled in a meta‐analysis.

Potential biases in the review process

We made several decisions and judgements during the review process that may have affected our conclusions. Below, we identify and discuss the potential strengths and limitations of the review process using the domains identified in the tool for assessing risk of bias in systematic reviews (ROBIS): study eligibility criteria, identification and selection of studies, data collection and study appraisal, and synthesis and findings (Whiting 2016).

Study eligibility criteria

Because of the broad scope of the objectives and research question of this review, all randomised trials comparing different types of follow‐up with a possible impact on the detection of recurrence in patients who had undergone primary treatment with curative intent were potentially eligible. This led to the retrieval of a large number of studies that compared a wide range of interventions and measured a wide range of outcomes. We adhered to the protocol by including only studies that reported on one of the predefined outcomes, and we decided to exclude studies that did not integrate the intervention into clinical follow‐up care, even if they reported delivering the intervention to participants undergoing follow‐up after primary treatment. This decision was based on the judgement that the results of such studies were a measure of the intervention alone (often a psychosocial or rehabilitation intervention), and not a measure of a change in cancer follow‐up per se. In addition, we included studies investigating a supportive care intervention only if there was a component that could be relevant to the detection of recurrence, for example symptom education and monitoring or survivorship care plans, in line with the focus of this review. The components of the intervention are reported in detail for all included studies in the Characteristics of included studies tables to help the reader explore the specific comparisons.

Identification and selection of studies

The search was conducted with the support of the Cochrane editorial group, EPOC, and covered all relevant databases. We imposed no restrictions on date, publication format or language. We also used additional search methods, such as searching through references in commentaries, abstracts and articles, in order to retrieve as many potentially relevant studies as possible. To systematise the identification and selection process and minimise errors in study inclusion, we used the recommended software, Covidence, to ensure that each study was independently screened by at least two review authors, and we held regular discussion meetings with the review author group to resolve any doubts.

Data collection and study appraisal

To systematise the process and minimise errors, we used standardised data collection forms based on the EPOC template, which we piloted and subsequently refined on the first five studies. For each study, one review author extracted all predefined relevant data, and another review author independently read all publications from the same study and double‐checked the form to ensure accuracy and that no data were missing. Several review authors took part in this process. At least two review authors independently carried out the 'Risk of bias' assessments for each study, and all review authors completed the interactive online learning modules developed by Cochrane Training (Page 2017; Sambunjak 2017) and read the relevant chapters of the Cochrane Handbook for Systematic Reviews of Interventions to minimise any errors (Higgins 2011a; Higgins 2017). The review author group held regular discussion meetings about the studies and resolved issues by consensus. Because we were interested in many outcomes, we also assessed the risk of bias in each domain separately for objective outcomes and patient‐reported outcomes, so that we could draw clearer conclusions about the limitations of the studies once the results had been synthesised.

Synthesis and findings

We made considerable efforts to include all possible studies in the synthesis of each outcome by contacting study authors and using statistical methods to convert published data into estimates that could be pooled. Where possible, we addressed heterogeneity statistically, and we reported all results as meta‐analyses or narrative summaries. We reported the results of each meta‐analysis together with the GRADE quality assessments and stated the minimal clinically important differences (MCIDs), where relevant, to help the reader draw appropriate conclusions. At least two review authors carried out the GRADE assessments for each outcome, and the review authors completed the interactive online learning modules developed by Cochrane Training (Page 2017; Sambunjak 2017) and read the relevant chapters of the Cochrane Handbook for Systematic Reviews of Interventions to minimise any errors (Deeks 2017).

Agreements and disagreements with other studies or reviews

To our knowledge, this is the first systematic review to comprehensively include studies across follow‐up interventions and cancer sites, and to pool their findings for each outcome according to three groups of follow‐up strategies. Our results extend the findings of the Cochrane Reviews on follow‐up of breast, colorectal, cervical and ovarian cancer that were available at the time of the search. However, although our results for survival and quality of life agree with the Cochrane Reviews on follow‐up of breast cancer and colorectal cancer, our finding of an effect on detection of recurrence was not found in those two reviews.

In the Cochrane Review of follow‐up strategies for breast cancer, based on five included studies, Moschetti 2016 concluded that less intensive follow‐up is as effective as more intensive follow‐up with respect to overall survival, time to detection of recurrence and quality of life. Their different result for time to detection of recurrence is probably because their meta‐analysis was based on only two studies (GIVIO 1994; Rosselli Del Turco 1994), and their result was borderline in favour of more intensive follow‐up (HR 0.84, 95% CI 0.71 to 1.00). In another systematic review of randomised trials investigating intensive follow‐up of breast cancer (Lafranconi 2017), which included six randomised trials (all of which are included in this review except Gulliford 1997), the authors concluded that a higher frequency of diagnostic tests or visits had no effect on overall mortality or recurrences. However, they summarised the effects as dichotomous outcomes and presented them as relative risks, which differs from time‐to‐event analyses.

In the Cochrane Review of follow‐up strategies for colorectal cancer, based on 15 included studies, Jeffery 2016 concluded that there was no overall survival or recurrence‐free survival benefit with more intensive follow‐up. Their different result for detection of recurrence is probably because they based their meta‐analysis on different studies from those in this review. Five of our studies were from other cancer sites (GIVIO 1994; Picardi 2014; Rosselli Del Turco 1994; Rustin 2007; Westeel 2012), and we did not include seven of their studies (Augestad 2013; Mäkelä 1992; Ohlsson 1995; Pietra 1998; Schoemaker 1998; Secco 2002; Wang 2009). Augestad 2013 calculated recurrence from the time of suspected recurrence rather than from randomisation, and although the authors of Jeffery 2016 kindly provided the methods they had used when we contacted them, we could not obtain enough information from the last five studies to estimate hazard ratios (see Effects of interventions, Comparison 2: less intensive versus more intensive follow‐up, Time to detection of recurrence). We did not identify any new systematic reviews of randomised trials investigating follow‐up of colorectal cancer, other than the five identified by Jeffery 2016, and we refer to the discussion in their review.

In the Cochrane Review of follow‐up strategies for ovarian cancer, Clarke 2014 identified one study that is not included in the present review (Rustin 2010). This study investigated early versus delayed treatment of relapsed ovarian cancer and found no overall survival benefit from immediate treatment based on raised CA125 levels compared with treatment delayed until the woman develops symptoms. Because patients were randomised only after raised serum levels had been detected, we did not include this study: we considered the results to be a measure of the effect of earlier versus later intervention in patients with recurrence, rather than of a change in follow‐up strategies among patients treated with curative intent. No studies were found in the Cochrane Review of follow‐up strategies for cervical cancer (Lanceley 2013), and there is an ongoing Cochrane Review investigating follow‐up strategies for women with endometrial cancer (Aslam 2016).

We found no other systematic reviews focusing on randomised trials of follow‐up strategies in specific cancer sites, although we identified four other systematic reviews that included randomised trials across different cancer sites. However, none of the four reviews carried out a meta‐analysis; they reported the results of the individual studies narratively. Lewis 2009a and Lewis 2009b reported on follow‐up in primary care versus secondary care (11 randomised trials) and nurse‐led versus conventional physician‐led follow‐up (four randomised trials), respectively. The conclusions of both publications agree with our results, except that they report "no statistically significant difference for recurrence rate". Dickinson 2014 carried out a systematic review of the use of technology to deliver cancer follow‐up care (13 randomised trials) and concluded that telephone follow‐up or internet‐based educational programmes were acceptable and feasible, based on patient satisfaction, quality of life and psychological distress outcomes. Barbieri 2018 carried out a systematic literature review of the cost‐effectiveness of follow‐up strategies after cancer treatment based on 44 articles (including non‐randomised trials). They concluded that intensive follow‐up is likely to be cost‐effective in colorectal cancer but not in breast cancer, and that there was insufficient evidence to draw conclusions for the other cancer sites.

Figure 1. Proposed model of cancer follow‐up, mechanisms and possible outcomes

Figure 2. Study flow diagram

Figure 3. 'Risk of bias' summary: review authors' judgements about each 'Risk of bias' item for each included study. Blank items indicate that this type of outcome was not reported by the study

Figure 4. Forest plot of comparison 1. Non‐specialist‐led versus specialist‐led follow‐up, outcome: 1.1 overall survival. A HR greater than 1 indicates a higher hazard of death (worse survival) in the non‐specialist arm and a lower hazard of death (better survival) in the specialist‐led arm

Figure 5. Forest plot of comparison 2. Less intensive versus more intensive follow‐up, outcome: 2.1 overall survival. A HR greater than 1 indicates a higher hazard of death (worse survival) in the less intensive arm and a lower hazard of death (better survival) in the more intensive arm

Figure 6. Forest plot of comparison 2. Less intensive versus more intensive follow‐up, outcome: 2.2 time‐to‐detection of recurrence. A HR less than 1 indicates a lower hazard of detecting recurrence (delay in detection of recurrence) in the less intensive arm and a higher hazard of detecting recurrence (better detection of recurrence) in the more intensive arm.

Analysis 1.1. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 1 Overall survival.

Analysis 1.2. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 2 EORTC‐C30 ‐ Global health status.

Analysis 1.3. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 3 EORTC‐C30 ‐ Physical functioning.

Analysis 1.4. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 4 EORTC‐C30 ‐ Role functioning.

Analysis 1.5. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 5 EORTC‐C30 ‐ Emotional functioning.

Analysis 1.6. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 6 EORTC‐C30 ‐ Cognitive functioning.

Analysis 1.7. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 7 EORTC‐C30 ‐ Social functioning.

Analysis 1.8. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 8 STAI ‐ State anxiety subscale.

Analysis 1.9. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 9 HADS ‐ Anxiety subscale.

Analysis 1.10. Comparison 1 Non‐specialist‐led versus specialist‐led follow‐up, Outcome 10 HADS ‐ Depression subscale.

Analysis 2.1. Comparison 2 Less intensive versus more intensive follow‐up, Outcome 1 Overall survival.

Analysis 2.2. Comparison 2 Less intensive versus more intensive follow‐up, Outcome 2 Time‐to‐detection of recurrence.

Summary of findings for the main comparison. Non‐specialist‐led versus specialist‐led follow‐up after primary cancer treatment

Patient or population: adult cancer survivors from the following cancer sites: breast, colon, colorectal, endometrial, ovarian, cervical, melanoma and oesophageal
Setting: outpatient treatment in hospitals or general practice in Australia, Canada, Denmark, Netherlands, Norway, Sweden and UK
Intervention: non‐specialist‐led (i.e. GP‐led, nurse‐led, patient‐initiated or shared care) follow‐up
Comparison: specialist‐led follow‐up

For each outcome the entries are: anticipated absolute effects* (95% CI), relative effects (95% CI), number of participants [a] (number of studies), certainty of the evidence (GRADE) and comments.

Outcome: Overall survival
Follow‐up range: 12 months to 60 months
Risk with specialist‐led follow‐up: 89 per 100 [b]
Risk with non‐specialist‐led follow‐up: 87 per 100 (79 to 93)
Difference: 2 fewer survivors in the intervention group per 100 participants (between 10 fewer to 4 more)
Relative effect (95% CI): HR 1.21 (0.68 to 2.15)
Number of participants (number of studies): 603 participants (2 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low [c]
Comments: 4 studies reported on overall survival. It is uncertain how non‐specialist‐led follow‐up affects overall survival as the certainty of the evidence is very low. We could not incorporate data from 2 other studies (N = 1077) in the meta‐analysis; both reported little or no difference in overall survival.

Outcome: Time to detection of recurrence
Follow‐up range: 3 months to 60 months
Anticipated absolute and relative effects: See comment
Number of participants (number of studies): 1691 participants (4 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low [d]
Comments: 4 studies reported on time to detection of recurrence. It is uncertain how non‐specialist‐led follow‐up affects time to detection of recurrence as the certainty of the evidence is very low and we could not pool the reported data. 3 studies reported little or no difference in time to detection of recurrence and 1 study reported median time to recurrence but did not carry out any statistical analysis.

Outcome: Health‐related quality of life (at 12 months' follow‐up)
Scale: EORTC‐C30 global health status scale (higher scores indicate better HRQoL)
Anticipated absolute effect (95% CI): MD 1.06 higher (1.83 lower to 3.95 higher)
Number of participants (number of studies): 605 participants (4 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊝⊝ Low [e]
Comments: Thirteen studies reported on HRQoL using EORTC‐C30, SF‐36, SF‐12, EuroQoL‐5D and FACT at different time points. Meta‐analysis of 4 studies showed that non‐specialist‐led follow‐up may make little or no difference in HRQoL at 12 months as measured by the EORTC‐C30 global health status scale. The mean difference did not reach the minimal clinically important difference of 10 points identified for this scale. Studies that we could not incorporate in the meta‐analysis (N = 2385) generally reported that non‐specialist‐led follow‐up made little or no difference to HRQoL.

Outcome: Anxiety (at 12 months' follow‐up)
Scale: HADS‐Anxiety subscale (higher scores indicate worse anxiety)
Anticipated absolute effect (95% CI): MD 0.03 lower (0.73 lower to 0.67 higher)
Number of participants (number of studies): 1266 participants (5 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊕⊝ Moderate [f]
Comments: 12 studies reported on anxiety and 2 on fear of recurrence using STAI, HADS and FCRI at different time points. Meta‐analysis of 5 studies showed that non‐specialist‐led follow‐up probably makes little or no difference to anxiety at 12 months as measured by HADS‐Anxiety subscale. The mean difference did not reach the minimal clinically important difference of 1.5 points identified for this scale. Data from the studies that we could not incorporate in the meta‐analysis (N = 1755) generally reported that non‐specialist‐led follow‐up made little or no difference to anxiety and fear of recurrence, except 1 study reporting higher levels of fear of recurrence in the patient‐initiated follow‐up group.

Outcome: Depression (at 12 months)
Scale: HADS‐Depression subscale (higher scores indicate worse depression)
Anticipated absolute effect (95% CI): MD 0.03 higher (0.35 lower to 0.42 higher)
Number of participants (number of studies): 1266 participants (5 randomised trials)
Certainty of the evidence (GRADE): ⊕⊕⊕⊕ High [g]
Comments: Eleven studies reported on depression using GHQ‐12 and HADS at different time points. Meta‐analysis of 5 studies showed that non‐specialist‐led follow‐up makes little or no difference to depression at 12 months as measured by HADS‐Depression subscale. The mean difference did not reach the minimal clinically important difference of 1.5 points identified for this scale. The studies that we could not incorporate in the meta‐analysis (N = 1378) generally reported that non‐specialist‐led follow‐up may make little or no difference to depression.

Outcome: Cost
Anticipated absolute and relative effects: See comment
Number of participants (number of studies): 1756 participants (8 randomised trials)
Certainty of the evidence (GRADE): ⊕⊝⊝⊝ Very low [h]
Comments: Eight studies reported cost outcomes but due to the substantial heterogeneity in how they measured and reported them, we could not pool the results in a meta‐analysis. It is uncertain whether non‐specialist‐led follow‐up has an effect on cost when compared with specialist‐led follow‐up, as the certainty of the evidence is very low. 6 studies reported lower cost per participant in the non‐specialist‐led group, while 2 studies reported higher cost per participant in the non‐specialist‐led group.

*The basis for the assumed risk in the comparison group (assumed comparator risk, ACR) is provided in the footnotes. The corresponding risk in the intervention group (and its 95% confidence interval) is based on the ACR and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; EORTC‐C30: European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; FACT: Functional Assessment of Cancer Therapy scale; FCRI: Fear of Cancer Recurrence Inventory; GHQ‐12: General Health Questionnaire‐12 items; GP: general practitioner; HADS: Hospital Anxiety and Depression Scale; HR: Hazard ratio; HRQoL: health‐related quality of life; MD: mean difference; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items; STAI: State Trait Anxiety Inventory

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

aFrom meta‐analysis if we pooled study results; for all studies if we did not pool study results.
bThe ACR is the assumed proportion of participants who are alive in the comparison group.
cWe judged the certainty of evidence to be very low and downgraded by three levels for very serious concerns regarding indirectness and imprecision, as representativeness is limited with only two studies, the HRs were not reported but indirectly estimated and the confidence interval was very wide.
dWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to few studies, reporting of results by different estimates that could not be pooled and high variance of the result estimates.
eWe judged the certainty of evidence to be low and downgraded by two levels for serious concerns regarding inconsistency and imprecision due to differing estimates of effect and wide confidence intervals.
fWe judged the certainty of evidence to be moderate as we downgraded by one level for concerns regarding inconsistency of results and indirectness due to few studies.
gWe judged the certainty of evidence to be high although we had some concerns regarding indirectness due to few studies.
hWe judged the certainty of evidence to be very low as the high heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost outcomes were measured and reported across studies.
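
The comparison against the minimal clinically important difference (MCID) in the anxiety and depression rows can be sketched as a small check. This is an illustration only, not the review's own analysis: the function name is hypothetical, and testing the whole confidence interval against the MCID is a stricter criterion than the table's statement that the mean difference did not reach it.

```python
def excludes_important_difference(ci_lower, ci_upper, mcid):
    """Return True when the whole 95% CI lies strictly inside (-mcid, +mcid).

    Hypothetical helper: if True, even the extremes of the interval are
    smaller than the minimal clinically important difference.
    """
    return -mcid < ci_lower and ci_upper < mcid


# Values taken from the table above (HADS subscales, MCID of 1.5 points).
hads_anxiety = excludes_important_difference(-0.73, 0.67, mcid=1.5)
hads_depression = excludes_important_difference(-0.35, 0.42, mcid=1.5)

print("HADS-Anxiety CI within MCID:", hads_anxiety)
print("HADS-Depression CI within MCID:", hads_depression)
```

For both HADS subscales the interval stays within 1.5 points in either direction, which is consistent with the "little or no difference" wording in the comments column.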

Summary of findings 2. Less intensive versus more intensive follow‐up after primary cancer treatment

Less compared with more intensive components in follow‐up after primary cancer treatment

Patient or population: adult cancer survivors from the following cancer sites: breast, colorectal, head‐and‐neck, Hodgkin lymphoma, melanoma, non‐small cell lung cancer and testicular cancer
Setting: outpatient treatment in hospitals in Australia, Denmark, China, Finland, France, India, Italy, Netherlands, Spain, Sweden, Switzerland and UK
Intervention: less intensive follow‐up (based on fewer clinical visits, examinations or less intensive diagnostic procedures)
Comparison: more intensive follow‐up

Outcomes

Anticipated absolute effects* (95% CI)

Relative effects
(95% CI)

Number of participantsa
(Number of studies)

Certainty of the evidence
(GRADE)

Comments

Risk with more intensive follow‐up

Risk with less intensive follow‐up

Overall survival

Follow‐up range: 24 months to 120 months

75 per 100b

74 per 100

(72 to 76)

HR 1.05
(0.96 to 1.14)

10,726 participants

(13 randomised trials)

⊕⊕⊝⊝
Lowc

18 studies reported on overall survival. Meta‐analysis of 13 studies showed that less intensive follow‐up may make little or no difference to overall survival. Meta‐regression analysis showed little or no difference in the intervention effects by cancer site, publication year or study quality.

We could not incorporate data from 5 other studies. 3 of these studies reported little or no difference in overall survival (N = 1752), while 2 studies reported improved survival with more intensive follow‐up (N = 544).

Difference: 1 fewer survivor in the intervention group per 100 participants (between 3 fewer and 1 more)

Time to detection of recurrence

Follow‐up range: 12 months to 120 months

27 per 100d

24 per 100

(22 to 25)

HR 0.85 (0.79 to 0.92)

11,276 participants

(12 randomised trials)

⊕⊕⊕⊝
Moderatee

22 studies reported on time to detection of recurrence. Meta‐analysis of 12 studies showed that less intensive follow‐up probably increases time to detection of recurrence. Meta‐regression analysis showed little or no difference in the intervention effects by cancer site, publication year or study quality.

We could not incorporate data from 10 other studies. 4 of these studies reported shorter time to detection of recurrence for more intensive follow‐up (N = 854), while 4 other studies reported little or no difference in detection of recurrence (N = 734). 1 study reported results that we could not use for this comparison (N = 337) and 1 study reported results based on only unresectable recurrence (N = 239).

Difference: 3 fewer detected recurrences in the intervention group per 100 participants (between 5 fewer and 2 fewer)

Health‐related quality of life

See comment

2742 participants (3 randomised trials)

⊕⊝⊝⊝
Very lowf

3 studies reported on HRQoL using SF‐36 and SF‐12 at varying time points. We could not pool the reported data. It is uncertain whether less intensive follow‐up has an effect on HRQoL when compared with more intensive follow‐up, as the certainty of the evidence is very low.

All 3 studies reported that less intensive follow‐up may make little or no difference in HRQoL when compared to more intensive follow‐up at time points ranging from 12 months to 5 years.

Anxiety

See comment

180 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowg

One study reported that less intensive follow‐up may make little or no difference to anxiety at 12 months' follow‐up using STAI.

It is uncertain whether less intensive follow‐up has an effect on anxiety when compared with more intensive follow‐up, as the certainty of the evidence is very low.

Depression

None of the studies reported depression.

Cost

See comment

1412 participants (6 randomised trials)

⊕⊝⊝⊝
Very lowh

6 studies reported cost outcomes but due to the substantial heterogeneity in how they measured and reported this outcome, we could not pool the results in a meta‐analysis. It is uncertain whether less intensive follow‐up has an effect on cost when compared with more intensive follow‐up, as the certainty of the evidence is very low.

All studies reported lower costs for the less intensive arm from the perspective of the participant or healthcare system but the difference in cost varied considerably depending on the components/procedures used in the different interventions.

*The basis for the assumed risk in the comparison group (assumed comparator risk, ACR) is provided in the footnotes. The corresponding risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: confidence interval; HR: hazard ratio; HRQoL: health‐related quality of life; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items; STAI: State Trait Anxiety Inventory

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

aFrom meta‐analysis if we pooled study results; for all studies if we did not pool study results.
bThe ACR is the assumed proportion of participants who are alive in the comparison group.
cWe judged the certainty of evidence to be low as we downgraded by two levels for some concerns regarding study limitations (lack of allocation concealment in one study) and indirectness as the studies were primarily investigating follow‐up after colorectal and breast cancer, and serious concerns regarding imprecision as the confidence interval includes effects that are not trivial (potentially up to 3 fewer survivors per 100 participants).
dThe ACR is the assumed proportion of participants with a detected recurrence in the comparison group.
eWe judged the certainty of evidence to be moderate as we downgraded by one level for serious concerns regarding indirectness as seven of the studies did not report hazard ratios, so we indirectly estimated them from published data.
fWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision due to the few studies, heterogeneous measures and reporting of results by different estimates that we could not pool.
gWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
hWe judged the certainty of evidence to be very low and downgraded by three levels as the substantial heterogeneity led to serious concerns regarding inconsistency, indirectness and imprecision in the way cost was measured and reported across studies.
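
The footnote marked * says that the risk in the intervention group is derived from the assumed comparator risk (ACR) and the hazard ratio. A minimal sketch of one common way to do this follows, assuming proportional hazards; the review does not state its exact formula, so the function names and the formula itself are assumptions made for illustration.

```python
def survival_per_100(acr_alive, hazard_ratio):
    """Survivors per 100 in the intervention group.

    Assumes proportional hazards, so S_intervention = S_comparison ** HR.
    acr_alive is the assumed proportion alive in the comparison group.
    """
    return 100 * acr_alive ** hazard_ratio


def events_per_100(acr_event, hazard_ratio):
    """Events per 100 in the intervention group under the same assumption.

    acr_event is the assumed proportion with the event in the comparison group.
    """
    return 100 * (1 - (1 - acr_event) ** hazard_ratio)


# Overall survival row: ACR 75 per 100, HR 1.05 (0.96 to 1.14).
# A larger HR means fewer survivors, so the upper HR limit gives the lower bound.
point = survival_per_100(0.75, 1.05)
lower = survival_per_100(0.75, 1.14)
upper = survival_per_100(0.75, 0.96)
print(round(point), round(lower), round(upper))

# Recurrence row: ACR 27 per 100, HR 0.85.
print(events_per_100(0.27, 0.85))
```

Under these assumptions the overall survival row is reproduced after rounding (74 per 100, 72 to 76). For detected recurrence the same approach with the rounded ACR of 27 per 100 gives a value between 23 and 24 per 100; the table reports 24, which may reflect an unrounded ACR.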

Summary of findings 3. Follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care

Follow‐up integrating additional patient symptom education or monitoring, or survivorship care plans versus usual care

Patient or population: adult cancer survivors from the following cancer sites: breast, colorectal, endometrial, ovarian and prostate cancer
Setting: outpatient treatment in hospitals or general practice in Australia, Canada, Netherlands, Sweden and USA
Intervention: follow‐up integrating additional components relevant for detection of recurrence (e.g. patient symptom education or monitoring, or survivorship care plans (SCP))
Comparison: usual care

Outcomes

Anticipated absolute effects* (95% CI)

Relative effects
(95% CI)

Number of participants (number of studies)

Certainty of the evidence
(GRADE)

Comments

Risk with usual care

Risk with follow‐up integrating additional patient symptom education/monitoring or SCP

Overall survival

None of the studies reported overall survival.

Time‐to‐detection of recurrence

None of the studies reported detection of recurrence.

Health‐related quality of life (HRQoL)

See comment

2846 participants (12 randomised trials)

⊕⊝⊝⊝
Very lowa

12 studies reported on HRQoL using EORTC‐C30, SF‐36, SF‐12, FACT and City of Hope QoL scale at varying time points. We could not pool the reported data. It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on HRQoL when compared with usual care, as the certainty of the evidence is very low.

11 studies reported that follow‐up integrating additional patient education/SCP may make little or no difference to HRQoL when compared to usual care at follow‐up ranging from 6 months to 12 months. 1 study reported that SCP and patient coaching improved HRQoL at 3 months' follow‐up.

Anxiety

See comment

470 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowb

One study reported that SCP may make little or no difference to anxiety at 12 months' follow‐up using HADS.

It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on anxiety when compared with usual care, as the certainty of the evidence is very low.

Depression

See comment

2351 participants (8 randomised trials)

⊕⊝⊝⊝
Very lowc

8 studies reported on depression using HADS, POMS, PHQ‐9, BSI‐18, CES‐D and the distress thermometer at varying time points. We could not pool the reported data. It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on depression when compared with usual care, as the certainty of the evidence is very low.

7 studies reported that follow‐up integrating additional patient education/SCP may make little or no difference to depression when compared to usual care at follow‐up ranging from 3 months to 12 months. 1 study reported that the intervention improved symptoms of depression at 12 months' follow‐up.

Cost

See comment

408 participants (1 randomised trial)

⊕⊝⊝⊝
Very lowd

One study reported that the use of SCP may make little or no difference to cost at 2 years' follow‐up.

It is uncertain whether follow‐up integrating additional patient symptom education/monitoring or SCP has an effect on cost when compared with usual care, as the certainty of the evidence is very low.

*The corresponding risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

BSI‐18: Brief Symptom Inventory‐18 items; CES‐D: Center for Epidemiological Studies‐Depression scale; CI: confidence interval; EORTC‐C30: European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire; FACT: Functional Assessment of Cancer Therapy scale; HADS: Hospital Anxiety and Depression Scale; HRQoL: health‐related quality of life; PHQ‐9: Patient Health Questionnaire‐9 items; POMS: Profile of Mood States; QoL: quality of life; SCP: survivorship care plans; SF‐36: Short Form Health Survey‐36 items; SF‐12: Short Form Health Survey‐12 items

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

aWe judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to studies being at high risk of bias, the heterogeneous measures and reporting of results by different estimates that could not be pooled.
bWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
cWe judged the overall certainty of evidence to be very low and downgraded by three levels for serious concerns regarding study limitations, indirectness and imprecision due to one study at high risk of bias, the heterogeneous measures and reporting of results by different estimates that could not be pooled.
dWe judged the certainty of evidence to be very low and downgraded by three levels for serious concerns regarding inconsistency, indirectness and imprecision since there was only one study.
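
The relationship between the number of levels downgraded in the footnotes and the certainty label and symbols shown in the table can be sketched as follows. This assumes the usual GRADE convention that evidence from randomised trials starts at high certainty; the function and constant names are hypothetical.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]


def grade_certainty(levels_downgraded, start="High"):
    """Map a number of downgrades to a (label, symbols) pair.

    Assumption: randomised evidence starts at 'High' and cannot fall
    below 'Very low', however many levels are subtracted.
    """
    index = max(LEVELS.index(start) - levels_downgraded, 0)
    symbols = "⊕" * (index + 1) + "⊝" * (3 - index)
    return LEVELS[index], symbols


for downgrades in range(4):
    print(downgrades, grade_certainty(downgrades))
```

With three levels of downgrading, as described in footnotes a to d, the sketch returns "Very low" with a single filled symbol.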

Table 1. Summary of cost outcomes

Study ID

Outcome measurement

Results

Intervention

Comparison

Augestad

GP‐led vs surgeon‐led follow‐up after colon cancer

  • The cost elements included costs related to hospital visits, GP visits, laboratory tests, radiology examinations, colonoscopy, examinations owing to suspected relapse, treatment of recurrence, travelling/transportation, production losses, co‐payments and other participant/family expenses.

  • Cost minimisation analysis was carried out and reported as:

    • cost per participant per 3‐month follow‐up cycle in GBP

    • mean societal cost in GBP for 24 months' follow‐up: mean (95% CI)

  • The follow‐up programme initiated 1186 healthcare contacts (GP 678 vs surgeon 508), 1105 diagnostic tests (GP 592 vs surgeon 513) and 778 hospital travels (GP 250 vs surgeon 528). GP‐organised follow‐up was associated with societal cost savings (GBP 8233 vs GBP 9889, P < 0.001).

Cost per participant

GBP 292 (255 to 327)

Mean societal cost

GBP 8233 (7904 to 8619)

Cost per participant

GBP 351 (315 to 386)

Mean societal cost

GBP 9889 (9569 to 10194)

Beaver 2009

Nurse‐led telephone vs hospital follow‐up after breast cancer

  • Resource use included training of nurses in telephone follow‐up, routine follow‐up consultations and diagnostic tests ordered, participant travel, and time off work and unit costs (e.g. salary, qualifications, ongoing training and clinic overhead).

  • Cost was reported as total NHS (UK) cost per participant in GBP (mean/SD) over a mean of 24 months

  • Owing to the cost of nurse training, greater frequency and longer duration of the telephone consultations, and the frequent use of junior medical staff in hospital clinics, the mean costs of follow‐up consultations were higher with telephone follow‐up (MD GBP 55, bias‐corrected 95% CI GBP 29 to GBP 77).

GBP 179 (118)

GBP 124 (116)

Beaver 2017

Nurse‐led telephone vs hospital follow‐up after endometrial cancer

  • Unit cost data were drawn primarily from the unit costs of health and social care and NHS Reference Costs. The cost of a nurse or doctor contact hour included salary (excluding overtime and shift payments), on‐costs (e.g. national insurance contributions), qualifications, the ratio of participant contact to non‐contact time, and overheads.

  • Unit cost data were collected from 2012/13 and inflated from the time of the study using the most recent (2016/17) UK GDP deflator so that costs were expressed in GBP at 2016/17 prices.

  • Reported as mean total health service cost per participant at 6 months and 12 months

  • Differences between groups: 6 months, GBP 8 (bias‐corrected 95% CI GBP −147 to GBP 141); 12 months, GBP −77 (GBP −334 to GBP 154)

6 months: GBP 434

12 months: GBP 746

6 months: GBP 426

12 months: GBP 823

Damude 2016

Less frequent vs conventional follow‐up schedule after melanoma

  • Study authors calculated total follow‐up costs of the first year for all participants from University Medical Center Groningen, Netherlands (UMCG) based on data from the UMCG financial administration.

  • Cost was reported as total cost per participant (EUR; mean/SD)

  • Study reported a reduction in hospital costs at 1‐year follow‐up of 45% in the intervention group compared to the conventional schedule group.

EUR 417.66 (452.74)

EUR 761.97 (683.37)

Grunfeld 1996

GP‐led vs hospital follow‐up after breast cancer

  • The economic evaluation considered costs to the health service (particularly, the costs of follow‐up visits and diagnostic tests) and costs to the participants (such as lost earnings and out‐of‐pocket expenses) over 18 months. All costs were expressed in 1994 GBP.

  • Cost was reported as average cost per participant in GBP (95% CI)

  • GP participants were seen significantly more frequently and each follow‐up visit lasted longer. GPs ordered more diagnostic tests than did specialists. Although the mean cost per visit in general practice was significantly less, the mean cost of diagnostic tests per visit was similar in the 2 groups.

GBP 64.70 (5.80 to 301.90)

GBP 195.10 (62 to 737.40)

Grunfeld 2011

SCP vs no SCP after breast cancer

  • Analysis assessed the healthcare and societal costs (physician visits, diagnostic and laboratory tests, participant travel costs and lost productivity, and additional costs associated with the SCP) and QALYs over the 2‐year follow‐up of the randomised trial.

  • Cost was reported as:

    • total cost per participant in 2011 CAD

    • QALY

  • The SCP is not cost‐effective. Total costs per participant were lower for standard care (CAD 698 vs CAD 765), and total QALYs were almost equivalent (1.42 for standard care vs 1.41 for the SCP). The probability that the SCP was cost‐effective was 0.26 at a threshold value of a QALY of CAD 50,000.

CAD 765

QALY 1.41

CAD 698

QALY 1.42

Kimman 2011

Nurse‐led telephone vs hospital follow‐up after breast cancer

  • Analysis included health care (e.g. diagnostic procedures, outpatient clinic visits, telephone interviews) and non‐healthcare‐related costs (e.g. productivity loss, informal care). Cost prices were obtained from the Dutch Governmental manual for healthcare cost analysis.

  • Costs were reported by treatment arm (see note) as:

    • mean annual costs per participant in 2008 Euros (95% CI) over 12 months

    • QALY (95% CI)

Note: this study was a 4‐armed 2 x 2 design that also compared the use of an educational group programme (EGP) vs no EGP. We only included the nurse‐led vs hospital follow‐up comparison for this review.

EUR 4672 (3489 to 6033)

QALY 0.769 (0.746 to 0.794)

EUR 4419 (3410 to 5501)

QALY 0.747 (0.707 to 0.778)

Koinberg 2004

Nurse‐led on demand vs standard hospital follow‐up after breast cancer

  • Cost elements included medical examinations (e.g. mammography, pulmonary X‐ray, scintigraphy, CT scans, US, biopsies), visits (nurse, physician, social worker, physiotherapist and breast prosthetic technician visits) and telephone contacts.

  • Costs are reported as mean cost per participant per year of follow‐up by intervention arm in 2006 Euros (95% CI) in Sweden.

  • Specialist nurse intervention with check‐ups on demand was 20% less expensive than routine follow‐up visits to the physician, explained by the numbers of visits to the physician in the respective study arms.

EUR 495 (410 to 797)

EUR 630 (557 to 1055)

Kokko 2003

Less intensive (CXR only when indicated) vs regular X‐rays after breast cancer

  • Costs from the hospital perspective (no. of contacts (visits and phone calls) and diagnostic tests) were compared during the first 5 years of follow‐up

  • Mean costs of follow‐up per participant in 2003‐2004 Euros were reported for 4 arms (see note):

    • A: every 3 months, routine tests (X‐ray every 6 months)

    • B: every 3 months, no routine tests

    • C: every 6 months, routine tests

    • D: every 6 months, no routine tests

  • Routine examinations in the follow‐up of asymptomatic primary breast cancer participants increase the costs of follow‐up 2.2 times.

Note: the clinical outcomes of this study (survival and recurrence) were reported in a separate paper according to 2 groups: 1 group had regular CXRs every 6 months while the other group had CXR only when clinically needed.

Arm B: EUR 1493

Arm D: EUR 1050

Arm A: EUR 2269

Arm C: EUR 1656

Monteil 2010

More intensive (CDET) vs CT scans after lung cancer

  • The analysis included only direct medical costs for each participant (imaging procedure, fixed hospital and patient transportation costs) over 2 years. Reimbursement prices were determined for each procedure by the French Healthcare system using 2002 repayment tariffs.

  • Costs were reported as average cost of follow‐up visits and imaging per participant in Euros (95% CI)

  • CDET imaging was more expensive, provided earlier detection of recurrence, but did not modify survival outcome.

EUR 1104.96 (954 to 1240)

EUR 755.47 (640 to 864)

Morrison 2018

Nurse‐led telephone vs hospital follow‐up after gynaecological cancer

  • Mean total cost per participant (SD) of all contacts with NHS primary and secondary care services and other cancer services over the 6‐month follow‐up period were calculated.

  • Difference between groups: GBP 26.60 (bootstrapped 95% CI GBP −290.37 to GBP 240.42)

  • Although this difference is not statistically significant, the mean total costs of service use were lower in the intervention group.

GBP 388.84 (320.11)

GBP 415.44 (329.08)

Oltra 2007

More intensive (additional diagnostic tests) vs standard follow‐up after breast cancer

  • Cost elements were not specified but were calculated over a median of 3 years of follow‐up in a hospital in Spain where participants were recruited from 1997‐1999.

  • Cost was reported as:

    • total cost of follow‐up in Euros

    • mean costs per participant in Euros

EUR 74,171

EUR 1278 per participant

EUR 24,567

EUR 390 per participant

Picardi 2014

Less intensive (US/chest radiography) vs standard (PET/CT scans) follow‐up after Hodgkin lymphoma

  • Cost was calculated from the perspective of the Italian National Healthcare System including costs of imaging procedures and surgical biopsies over median follow‐up of 60 months.

  • Costs were reported as average cost of follow‐up per participant in 2010 Euros

  • Estimated cost per relapse diagnosed with routine PET/CT was 10‐fold higher compared with that diagnosed with routine US/chest radiography (P < 0.0001 for difference between groups).

EUR 862

EUR 8818

Rodríguez‐Moranta 2006

More intensive (CT scan plus colonoscopy) vs simple follow‐up after colorectal cancer

  • Costs were calculated for all procedures performed during the scheduled follow‐up or as a result of additional work‐up for any suspected recurrence according to Hospital Clinic current billing (participants recruited from 1997‐2001). Indirect costs, such as time lost from work or transportation charges, were not factored into the analysis.

  • Costs were reported as:

    • cost per participant in Euros for median follow‐up of up to 49 months

    • cost per resectable tumour recurrence in Euros

  • Although overall cost of follow‐up was higher in the intensive strategy group (EUR 300,315) than in simple strategy group (EUR 188,630), the intensive surveillance strategy was more efficient when resectability was considered.

EUR 300,315 per participant

EUR 16,684 per tumour

EUR 188,630 per participant

EUR 18,863 per tumour

Sobhani 2018

More intensive (18FDG‐PET) vs conventional follow‐up after colorectal cancer

  • Costs were assessed in accordance with the Consolidated Health Economic Evaluation Reporting Standards statement for single‐trial‐based studies.

  • The prospective analysis determined the cost per life‐year gained with 18FDG‐PET/CT vs the standard of care over the 3‐year trial period. Hospital inpatient costs were estimated and average cost for each study group determined with adjustment for the actual length of stay and resources used during the admission including the cost of imaging studies. Discounting was not performed. Total cost (mean/SD) was computed both with and without the cost of 18FDG‐PET/CT in 2016 Euros. Costs were compared between groups using the Wilcoxon test.

  • Difference between groups P value: without imaging: P = 0.23; with imaging: P = 0.033

  • The probabilistic sensitivity analysis suggested that the intervention strategy increased costs without improving participant outcomes, with a likelihood of 87% for the survival end point.

Without imaging: EUR 14,573 (27,531)

With imaging: EUR 18,192 (27,679)

EUR 11,131 (13,254)

Verschuur 2009

Nurse‐led vs surgeon‐led follow‐up after oesophageal cancer

  • Cost elements included comprehensive data on hospital costs (inpatient days, health practitioner care, medical treatment), diagnostic interventions and extramural care (GP‐care) for a period of 12 months' follow‐up

  • Costs were reported as total costs per participant, reported in 2006 Euros in the Netherlands

  • The total average costs per participant were not statistically significantly higher for standard follow‐up than nurse‐led follow‐up (EUR 3798 vs EUR 2592; P = 0.11). Costs of nurse‐led follow‐up visits were lower than those of standard follow‐up visits (EUR 234 vs EUR 503; P < 0.001).

EUR 2592

EUR 3798

18FDG: 18F‐fluoro‐2‐deoxy‐D‐glucose; CAD: Canadian Dollar; CI: confidence interval; CDET: coincidence detection system imaging; CT: computed tomography; CXR: chest X‐ray; EUR: Euro; GBP: Pound Sterling; GP: general practitioner; MD: mean difference; NHS: National Health Service; PET: positron emission tomography; QALY: quality‐adjusted life year; SCP: survivorship care plan; SD: standard deviation; US: ultrasound
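
Several rows in Table 1 summarise a cost difference as a percentage or as an absolute saving. The arithmetic behind those summaries can be sketched from the per-participant figures in the table. The helper name is hypothetical, and the rounding used by the original study authors is not known.

```python
def percent_reduction(intervention_cost, comparison_cost):
    """Relative cost reduction of the intervention versus the comparison, in %."""
    return 100 * (comparison_cost - intervention_cost) / comparison_cost


# Damude 2016: EUR 417.66 (less frequent) vs EUR 761.97 (conventional).
damude = percent_reduction(417.66, 761.97)

# Koinberg 2004: EUR 495 (nurse-led on demand) vs EUR 630 (standard).
koinberg = percent_reduction(495, 630)

# Augestad: mean societal cost, GBP 8233 (GP-led) vs GBP 9889 (surgeon-led).
augestad_saving = 9889 - 8233

print(f"Damude 2016 reduction: {damude:.1f}%")
print(f"Koinberg 2004 reduction: {koinberg:.1f}%")
print(f"Augestad societal saving: GBP {augestad_saving}")
```

The Damude 2016 figure rounds to the 45% quoted in the table, and the Koinberg 2004 figure comes out slightly above 21%, which the table summarises as 20%.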

Comparison 1. Non‐specialist‐led versus specialist‐led follow‐up

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Overall survival | 2 | | Hazard Ratio (Random, 95% CI) | 1.21 [0.68, 2.15] |
| 2 EORTC‐C30 ‐ Global health status | 4 | 605 | Mean Difference (IV, Random, 95% CI) | 1.06 [−1.83, 3.95] |
| 3 EORTC‐C30 ‐ Physical functioning | 3 | 306 | Mean Difference (IV, Random, 95% CI) | 1.65 [−2.35, 5.64] |
| 4 EORTC‐C30 ‐ Role functioning | 4 | 605 | Mean Difference (IV, Random, 95% CI) | 2.36 [−2.75, 7.47] |
| 5 EORTC‐C30 ‐ Emotional functioning | 4 | 605 | Mean Difference (IV, Random, 95% CI) | 0.52 [−2.06, 3.09] |
| 6 EORTC‐C30 ‐ Cognitive functioning | 3 | 306 | Mean Difference (IV, Random, 95% CI) | 4.41 [−1.52, 10.34] |
| 7 EORTC‐C30 ‐ Social functioning | 3 | 306 | Mean Difference (IV, Random, 95% CI) | 5.39 [1.60, 9.17] |
| 8 STAI ‐ State anxiety subscale | 3 | 602 | Mean Difference (IV, Random, 95% CI) | −0.55 [−2.41, 1.32] |
| 9 HADS ‐ Anxiety subscale | 5 | 1266 | Mean Difference (IV, Random, 95% CI) | −0.03 [−0.73, 0.67] |
| 10 HADS ‐ Depression subscale | 5 | 1266 | Mean Difference (IV, Random, 95% CI) | 0.03 [−0.35, 0.42] |

Comparison 2. Less intensive versus more intensive follow‐up

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Overall survival | 13 | | Hazard Ratio (Random, 95% CI) | 1.05 [0.96, 1.14] |
| 2 Time‐to‐detection of recurrence | 12 | | Hazard Ratio (Random, 95% CI) | 0.85 [0.79, 0.92] |
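
The effect sizes in the comparison tables are given as an estimate with a 95% CI. As a rough reading aid, a z statistic and a two-sided P value can be recovered from such an interval under a normal approximation. This is a sketch rather than the review's own computation: the function name is hypothetical, hazard ratios are assumed to be symmetric on the log scale, and mean differences on the raw scale.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, log_scale=False):
    """Approximate z and two-sided P value from an estimate and its 95% CI.

    Assumes the interval is estimate +/- 1.96 * SE, on the log scale for
    ratio measures (log_scale=True) and on the raw scale otherwise.
    """
    if log_scale:
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    se = (upper - lower) / (2 * 1.96)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Comparison 2: hazard ratios for less versus more intensive follow-up.
print(z_and_p_from_ci(1.05, 0.96, 1.14, log_scale=True))  # overall survival
print(z_and_p_from_ci(0.85, 0.79, 0.92, log_scale=True))  # time to detection

# Comparison 1: mean difference for EORTC-C30 social functioning.
print(z_and_p_from_ci(5.39, 1.60, 9.17))
```

Under this approximation the overall survival interval is compatible with no difference, whereas the intervals for time to detection of recurrence and for social functioning exclude the null value.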
