Impact of low‐dose computed tomography (LDCT) screening on lung cancer‐related mortality

Background

Lung cancer is the most common cause of cancer‐related death in the world; however, lung cancer screening has not been implemented in most countries at a population level. A previous Cochrane Review found limited evidence for the effectiveness of lung cancer screening with chest radiography (CXR) or sputum cytology in reducing lung cancer‐related mortality; however, there has been increasing evidence supporting screening with low‐dose computed tomography (LDCT).

Objectives

To determine whether screening for lung cancer using LDCT of the chest reduces lung cancer‐related mortality and to evaluate the possible harms of LDCT screening.

Search methods

We performed the search in collaboration with the Information Specialist of the Cochrane Lung Cancer Group. It covered the Cochrane Lung Cancer Group Trial Register, the Cochrane Central Register of Controlled Trials (CENTRAL, the Cochrane Library, current issue), MEDLINE (accessed via PubMed) and Embase. We also searched clinical trial registries to identify unpublished and ongoing trials. We applied no restriction on language of publication. The search was performed up to 31 July 2021.

Selection criteria

Randomised controlled trials (RCTs) of lung cancer screening using LDCT that reported mortality or harm outcomes.

Data collection and analysis

Two review authors independently assessed trials for eligibility, extracted data and trial characteristics, and assessed the risk of bias of the included trials using the Cochrane RoB 1 tool. We assessed the certainty of the evidence using GRADE. The primary outcomes were lung cancer‐related mortality and harms of screening. Where appropriate, we performed a meta‐analysis for all outcomes using a random‐effects model. We included only trials with at least 5 years of follow‐up in the analysis of mortality outcomes. We reported risk ratios (RRs) and hazard ratios (HRs), with 95% confidence intervals (CIs), and used the I² statistic to investigate heterogeneity.
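As an illustration of the pooling approach described above, the following is a minimal sketch of a random‐effects meta‐analysis of risk ratios with an I² estimate. It assumes the DerSimonian‐Laird estimator, which is a common default but is not stated in this text to be the one the review authors used. The function name `pool_risk_ratios` and the trial counts are hypothetical and are not data from the included trials.

```python
import math


def pool_risk_ratios(trials):
    """Pool risk ratios with a DerSimonian-Laird random-effects model (assumed estimator).

    trials: list of (events_screened, n_screened, events_control, n_control).
    Assumes every arm has at least one event (no continuity correction).
    Returns (pooled_rr, ci_low, ci_high, i_squared_percent).
    """
    log_rr, var = [], []
    for a, n1, c, n0 in trials:
        log_rr.append(math.log((a / n1) / (c / n0)))
        # Large-sample variance of a log risk ratio.
        var.append(1 / a - 1 / n1 + 1 / c - 1 / n0)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in var]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = len(trials) - 1

    # Between-trial variance (tau^2), truncated at zero.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale) if df > 0 else 0.0
    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each trial's variance.
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), i2)


# Hypothetical counts for illustration only; NOT data from the included trials.
example_trials = [(40, 2000, 52, 2000), (15, 1200, 18, 1150), (90, 5000, 110, 5000)]
rr, low, high, i2 = pool_risk_ratios(example_trials)
print(f"RR {rr:.2f} (95% CI {low:.2f} to {high:.2f}), I2 = {i2:.0f}%")
```

When the trials agree closely, the estimated between‐trial variance is zero and the result coincides with a fixed‐effect analysis; as heterogeneity grows, smaller trials receive relatively more weight and the confidence interval widens.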

Main results

We included 11 trials in this review with a total of 94,445 participants. Trials were conducted in Europe and the USA in people aged 40 years or older, and most trials had an entry requirement of a ≥ 20 pack‐year smoking history (e.g. one pack of cigarettes per day for 20 years, or two packs per day for 10 years). One trial included male participants only. Eight trials were phase three RCTs, with two feasibility RCTs and one pilot RCT. Seven of the included trials had no screening as the comparator, and four trials had CXR screening as the comparator. Screening frequency included annual, biennial and incrementing intervals. The duration of screening ranged from 1 year to 10 years. Mortality follow‐up ranged from 5 years to approximately 12 years.

None of the included trials were at low risk of bias across all domains. The certainty of the evidence was moderate to low across the different outcomes, as assessed by GRADE.

The meta‐analysis of lung cancer‐related mortality included eight trials (91,122 participants) and showed a 21% reduction in mortality with LDCT screening compared with control groups of no screening or CXR screening (RR 0.79, 95% CI 0.72 to 0.87; moderate‐certainty evidence). There were probably no subgroup differences in analyses by control type, sex, geographical region and nodule management algorithm. Women appeared to have a larger lung cancer‐related mortality benefit than men with LDCT screening. There was also a 5% reduction in all‐cause mortality (including lung cancer‐related mortality) (RR 0.95, 95% CI 0.91 to 0.99; 8 trials, 91,107 participants; moderate‐certainty evidence).

Invasive tests were performed more frequently in the LDCT group (RR 2.60, 95% CI 2.41 to 2.80; 3 trials, 60,003 participants; moderate‐certainty evidence). However, 60‐day postoperative mortality did not differ significantly between groups (RR 0.68, 95% CI 0.24 to 1.94; 2 trials, 409 participants; moderate‐certainty evidence).

False‐positive results and recall rates were higher with LDCT screening than with CXR screening; however, the evidence in the meta‐analyses was of low certainty because of heterogeneity and risk of bias concerns. The estimate of overdiagnosis with LDCT screening was 18%; however, the 95% CI ranged from 0% to 36% (risk difference (RD) 0.18, 95% CI ‐0.00 to 0.36; 5 trials, 28,656 participants; low‐certainty evidence).

Four trials compared different aspects of health‐related quality of life (HRQoL) using various measures. Anxiety was pooled from three trials, with participants in LDCT screening reporting lower anxiety scores than those in the control group (standardised mean difference (SMD) ‐0.43, 95% CI ‐0.59 to ‐0.27; 3 trials, 8153 participants; low‐certainty evidence).

There were insufficient data to comment on the impact of LDCT screening on smoking behaviour.

Authors' conclusions

The current evidence supports a reduction in lung cancer‐related mortality with the use of LDCT for lung cancer screening in high‐risk populations (people over the age of 40 with a significant smoking exposure). However, data on harms are limited, and further trials are required to determine participant selection and the optimal frequency and duration of screening, given the potential for significant overdiagnosis of lung cancer. Trials of lung cancer screening in non‐smokers are ongoing.

Impact of computed tomography (CT) on lung cancer screening

Background

Lung cancer is the most common cause of cancer‐related death worldwide. Lung cancer survival depends largely on when the disease is diagnosed. It is essential to detect the disease as early as possible using radiography (chest x‐ray) or computed tomography (CT), a more detailed type of radiography in which multiple images of the lung are taken. The aim of this review was to gather information on the use of CT scans to detect lung cancer earlier and to find out whether early detection of lung cancer reduces death from this cause. We also evaluated the potential harms that can arise from using CT to screen for lung cancer, such as additional investigations and their related complications.

Description of included trials

The evidence is current to 31 July 2021. We included 11 trials with a total of 94,445 participants. The trials came from the USA and Europe. The earliest trial started in 1991 and the most recent in 2011. Participants were adults over the age of 40. The interval between CT screens ranged from one year to more than 2.5 years.

Key findings

Eight of the trials (91,122 participants) were included in the analysis of the main outcome, lung cancer‐related mortality. In people over the age of 40 with significant smoking exposure, CT screening reduced deaths from lung cancer by 21%, with 226 people needing to undergo screening to prevent one death from lung cancer. Deaths from any cause (including lung cancer) were also lower with CT screening. However, the effect was much smaller (only a 5% reduction in risk). Lung cancer was detected more frequently in the group of people who had CT screening than in those who had no screening. However, CT can produce false positives (a test that is positive or indeterminate for lung cancer when the person does not actually have lung cancer). False‐positive results were more common among people screened with CT than among those screened with chest x‐ray. As a result, people who had CT screening underwent more tests to investigate both cancer‐related and non‐cancer‐related conditions. Screening also carries the risk of detecting lung cancers that might never have progressed to cause harm to the person (known as overdiagnosis). The risk of lung cancer overdiagnosis with CT screening was estimated at 18%.

The trials were too different or did not provide enough information to analyse the impact of screening on quitting smoking or on quality of life. There was some evidence suggesting no long‐term psychological harm from screening, and some people in the CT screening group felt less anxious than people in control groups who were not offered screening.

Certainty of the evidence

The overall certainty of the evidence was moderate for outcomes related to death, with moderate‐ to low‐certainty evidence for other outcomes. The certainty rating for an outcome reflects how confident the authors are that the result is correct.

Authors' conclusions

Implications for practice

The evidence in this review from RCTs suggests that screening with LDCT leads to a reduction in lung cancer‐related mortality in high‐risk populations, although it should be noted that the certainty of this evidence is moderate. There is also an increase in investigations associated with LDCT screening, including those unrelated to lung cancer disease. More information is required to define the ideal frequency and duration of screening, as there is a probable loss of impact of screening on lung cancer‐related mortality as time from last screening scan increases.

Whilst available data suggest there is probably no difference between LDCT screening and non‐LDCT screening groups in mortality following invasive procedures, there remains a lack of data regarding harms of screening. Quality assurance with monitoring of harms, such as complications, false positives, recall rates and follow‐up, should be included as part of a screening programme with set key performance indicators.

When discussing lung cancer screening with patients, physicians can consider the relative risks and benefits for the individual prior to recommendation, and ensure that screening and subsequent follow‐up occurs in centres with experience in lung nodule and lung cancer care, particularly given concerns regarding overdiagnosis. The difference in management of solid versus subsolid nodules (Silva 2018), and incorporation of recommendations such as the European Union Position Statement on lung cancer screening (Oudkerk 2017), can be considered. Whilst there was limited use of risk prediction models in the included RCTs, these have been associated with improved participant selection (Field 2019; ten Haaf 2017), and further trials are required to evaluate and compare different models. Physicians may choose to also be mindful of potential psychosocial consequences of screening, although evidence has not demonstrated any long‐term impacts. Participants of screening programmes should receive counselling about the implications of a positive screen and significant incidental findings (SIFs).

Current smokers and former smokers with a significant pack‐year history formed the majority of the population in this review, with trials evaluating lung cancer screening with LDCT in predominantly non‐smokers still ongoing. Smoking cessation and other primary prevention strategies can be considered as part of a screening programme, although the optimal method for delivery of these strategies is still under investigation.

There have been several guidelines from the USA and Europe with favourable positions on national screening programmes for groups at high risk of lung cancer (Jonas 2021; Kauczor 2020; Mazzone 2021). Response to recruitment to screening programmes in the trial setting ranged from 1% to 5% of people approached with invitations and letters. Adherence to screening was noted to decrease over time, with only 54% of the annual screening group of the MILD trial completing their 6‐year scan (Pastorino 2012). Further consideration is required to determine the optimal way to engage with people at high risk of lung cancer, to ensure equitable access to screening when developing national screening programmes, as well as ensuring engagement with the programme once initiated.

Implications for research

Further research is needed to determine participant selection for a lung cancer screening programme, particularly focusing on groups under‐represented in this review (women and non‐smokers) and on geographic regions outside the USA and Europe.

Additional research is also required for the optimal duration and frequency of screening (van der Aalst 2021), with particular attention to the optimal assessment and management of lung nodules; there is no uniform approach or guideline to nodule management. Further review of nodule classifications for estimating lung cancer risk from nodules, such as the Lung CT Screening Reporting and Data System (Lung‐RADS; Lung‐RADS 2019) and the Brock model (McWilliams 2013), is needed, with particular attention to ground glass opacities and nodule management, which may contribute to overdiagnosis. None of the included trials prospectively evaluated the use of artificial intelligence in lung cancer screening.

Biomarkers are an evolving field, with most included trials only publishing very early descriptive data. Future research with biomarkers may help with the selection of participants for screening or prove useful as a screening modality. 

Summary of findings

Summary of findings 1. Low‐dose computed tomography (LDCT) screening compared to no LDCT screening for lung cancer‐related mortality

Patient or population: healthy adults
Setting: hospitals or screening centres
Intervention: LDCT screening
Comparison: no LDCT screening

| Outcomes | № of participants (trials) | Certainty of the evidence (GRADE) | Relative effect (95% CI) | Anticipated absolute effects* (95% CI): risk with no screening (trial population) | Anticipated absolute effects* (95% CI): risk difference |
| --- | --- | --- | --- | --- | --- |
| Lung cancer‐related mortality ‐ planned time points. Follow‐up: 6 years to 10 years from randomisation | 91,122 (8 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.79 (0.72 to 0.87) | 21 per 1000 | 4 fewer per 1000 people screened (3 fewer to 6 fewer) |
| All‐cause mortality ‐ planned time points. Follow‐up: 6 years to 10 years from randomisation | 91,107 (8 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.95 (0.91 to 0.99) | 89 per 1000 | 4 fewer per 1000 people screened (1 fewer to 8 fewer) |
| Overdiagnosis. Time point: ≥ 10 years from randomisation, excluding CXR trials | 28,656 (5 RCTs) | ⊕⊕⊝⊝ Low (a, c) | RD 0.18 (‐0.00 to 0.36) | | 180 more lung cancers overdiagnosed per 1000 lung cancers detected (0 more to 360 more) |
| Number of invasive tests. Time point: 3 years to 10 years from randomisation | 60,003 (3 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 2.60 (2.41 to 2.80) | 31 per 1000 | 49 more per 1000 people screened (45 more to 55 more) |
| Any death postsurgery. Time point: 6 years to 9 years from randomisation | 409 (2 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.68 (0.24 to 1.94) | 48 per 1000 | 15 fewer per 1000 people screened (37 fewer to 45 more) |
| Health‐related quality of life ‐ anxiety. Time point: 10 months to 5 years from randomisation. Measured by different scales | 8153 (3 RCTs) | ⊕⊕⊝⊝ Low (a, b) | SMD ‐0.43 (‐0.59 to ‐0.27) | | SMD 0.43 lower (0.27 to 0.59 lower) |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
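The rule in this footnote can be checked with simple arithmetic. The sketch below is illustrative only and is not the review authors' own computation; the function name `absolute_effect_per_1000` is hypothetical. It multiplies the assumed risk with no screening by (RR − 1) to obtain the change in events per 1000 people, using the two mortality rows of the table as inputs.

```python
def absolute_effect_per_1000(baseline_per_1000, rr):
    """Change in events per 1000 people implied by a risk ratio.

    Negative values mean fewer events with screening.
    """
    return baseline_per_1000 * (rr - 1.0)


# Inputs copied from the Summary of findings table; the baseline risks
# are already rounded there, so results are approximate.
rows = {
    "Lung cancer-related mortality": (21, 0.79, 0.72, 0.87),
    "All-cause mortality": (89, 0.95, 0.91, 0.99),
}

for outcome, (baseline, rr, rr_low, rr_high) in rows.items():
    point = absolute_effect_per_1000(baseline, rr)
    bound_a = absolute_effect_per_1000(baseline, rr_low)
    bound_b = absolute_effect_per_1000(baseline, rr_high)
    print(f"{outcome}: {point:.2f} per 1000 ({bound_a:.2f} to {bound_b:.2f})")
```

Rounded to whole numbers, these values agree with the table's "4 fewer (3 fewer to 6 fewer)" and "4 fewer (1 fewer to 8 fewer)". For other rows, such as the number of invasive tests, the same arithmetic on the rounded baseline can differ from the tabulated figure by about one per 1000.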

CI: confidence interval; CXR: chest x‐ray; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; RD: risk difference; SMD: standardised mean difference

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

a. Downgraded one level due to high risk of "other bias" in Becker 2020, De Koning 2020, Infante 2015, and Pastorino 2012.
b. Downgraded one level due to indirectness: only a subset of the trial population were included for quality assessment.
c. Downgraded one level due to heterogeneity.

Background

Description of the condition

Lung cancer remains the most common cause of cancer‐related death in the world (Ferlay 2019), resulting in an estimated 1.76 million deaths in 2018 (WHO 2018). Whilst historically a male predominant condition, the incidence of lung cancer is now comparable in men and women in the USA, representing approximately 13% of all new cancer diagnoses (Siegel 2019). In Germany, New Zealand, Denmark, Canada, the Netherlands and the USA, age‐specific lung cancer incidence rates have declined in males with each 5‐year birth cohort, with significant transition from male to female dominance in these countries in the younger age groups (30 to 49 years old) (Fidler‐Benaoudia 2020). There is a concerning upward trend in lung cancer‐related deaths in younger women (Levi 2007), with the death rate from lung cancer expected to exceed breast cancer‐related deaths in Europe in women (Malvezzi 2017). The current 5‐year survival for lung cancer is 19% in the USA, with poorer outcomes in small cell lung cancer and in the advanced stages (Howlader 2020). In the last decade, prognosis has improved in stage III and IV non‐small cell lung cancer (NSCLC) with the introduction of immunotherapy and targeted molecular therapy (Howlader 2020; NICE 2019). However, these treatments are mostly not considered curative, with the 5‐year survival in the USA for metastatic NSCLC being 6%, compared to 61% for local NSCLC (Howlader 2020). Complete resection of early‐stage NSCLC has the greatest potential for long‐term survival (beyond 10 years) (Hubbard 2012).

Tobacco smoking is recognised as the most significant risk factor for lung cancer (Halpern 1993; Peto 1994), and as such, primary prevention is an essential component of public health campaigns. However, additional factors such as age, genetic factors, airway obstruction, infections and environmental exposure affect risk (Alberg 2007; Bach 2003), with exposure to ambient air pollution increasingly contributing to the global burden of lung cancer (WHO 2016). Particularly in females, adenocarcinomas with detectable molecular mutation are more common in never‐smokers compared to people with a tobacco‐exposure history (Subramanian 2007). A number of validated risk prediction tools have been developed which incorporate smoking history, in addition to other risk factors, to estimate lung cancer risk (Cassidy 2008; Tammemägi 2013). These risk prediction models have been suggested to improve participant selection for lung cancer screening and have already been incorporated into screening programmes (Field 2019; ten Haaf 2017).

Description of the intervention

Lung cancers are commonly diagnosed at an advanced stage, with 48% of patients in Australia and 61% of patients in Denmark having metastatic NSCLC at the time of diagnosis (Walters 2013). Hence, several trials have evaluated the role of screening for the detection of preclinical disease. Early lung cancers may be visible on plain chest radiography (CXR) or computed tomography (CT) as a pulmonary nodule. A lung nodule is defined as a focal opacity, more or less well defined, measuring up to 3 cm (Hansell 2008). The sensitivity of CXR for the detection of pulmonary nodules < 1 cm is poor (Quekel 1999). Furthermore, in people presenting with symptoms of lung cancer, the sensitivity of CXR is only 80% or less (Bradley 2019). A CT scan is a more detailed type of radiographic imaging which uses a rotating x‐ray source. Multiple x‐ray attenuation measurements are taken from different angles and then processed on a computer using reconstruction algorithms to produce cross‐sectional images, or virtual slices, of the body. These cross‐sectional images are able to detect pulmonary nodules < 1 cm more reliably than CXR due to improved resolution and reduced obscuration from overlapping mediastinal, cardiac and chest wall structures. This is beneficial in the detection of small early‐stage lung cancers; however, CT‐detected nodules are not specific to cancer, with differentials including benign nodules, such as hamartomas, granulomas, and inflammatory nodules. Additional incidental findings described with low‐dose computed tomography (LDCT) include mediastinal lymphadenopathy, coronary artery calcification, aortic aneurysm, and non‐pulmonary malignancies (Swensen 2002).

How the intervention might work

LDCT screening has been established as a more sensitive tool to detect lung cancer at an early and resectable stage compared with CXR (Diederich 2002; Nawa 2002; Sobue 2002; Sone 2001; Swensen 2002). An earlier Cochrane Review on lung cancer screening found that annual CXR did not significantly reduce lung cancer mortality (Manser 2013). The same review concluded that LDCT screening was associated with a reduction in lung cancer mortality compared with CXR among high‐risk former and current smokers. Reviewers for the 2013 US Preventive Services Task Force Evidence Synthesis also concluded that high‐certainty evidence shows that LDCT screening can significantly reduce mortality from lung cancer (Humphrey 2013). The findings of both of these systematic reviews were based largely on the results of the National Lung Screening Trial (NLST, Aberle 2011) which used the comparator of CXR in a group of high‐risk former and current smokers. In a more recent systematic review, conducted as part of a Health Technology Assessment for the National Institute for Health Research (NIHR) in the UK, the reviewers concluded that LDCT may be clinically effective in reducing lung cancer mortality, but there is considerable uncertainty (Snowsill 2018).

Why it is important to do this review

Despite multiple international guidelines recommending LDCT screening for high‐risk former and current smokers, and calls for the implementation of screening, to our knowledge a nationally co‐ordinated screening programme has not been broadly adopted, apart from in Korea (Lewin 2016; Moyer 2014; Oudkerk 2017; Zhou 2015). In the USA, the Centers for Medicare & Medicaid Services has approved coverage and reimbursement for lung cancer screening for individuals who meet certain criteria (Jensen 2015). However, in the absence of a co‐ordinated programme, there have been concerns about the low uptake of screening and considerable variability in false‐positive rates between different providers (Pinsky 2018).

There was an urgent need for a contemporary systematic evidence synthesis that incorporates the growing evidence base from RCTs on both benefits and harms of screening in order to better understand the potential magnitude of any benefit and to understand in which groups any benefits might outweigh the harms. False‐positive test results and overdiagnosis are both potential sources of harm from screening which may lead to unnecessary interventions with adverse psychological impacts, morbidity and mortality. Overdiagnosis refers to the detection and diagnosis of lung cancers by screening which would have never caused the person harm, such as death or symptoms, in their lifetime when left untreated (Brodersen 2018). In a recent review of RCTs in which LDCT was compared to usual care (no screening), it was estimated that 49% of lung cancers detected by screening may have been overdiagnosed (Brodersen 2020). Radiation exposure has similarly been considered, with Gierada et al. describing an estimated risk of radiation‐induced cancer mortality after 20 annual chest LDCTs of 0.1%, based on a linear no threshold model of ionising radiation effects (Berrington de González 2008; FDA 2017; Gierada 2020; Rampinelli 2017). In the UK, screening for lung cancer is part of the National Health Service (NHS) long‐term plan, and its ambition is to reach around 600,000 people over 4 years, detecting approximately 3400 cancers across the UK (NHS 2019).

The purpose of this review was to assess the evidence regarding LDCT screening methods to reduce lung cancer‐related mortality and to evaluate the possible harms associated with screening. Additionally, we estimated the incidence of lung cancer and impact on smoking behaviour following screening. Another reason for conducting this review was to involve consumer participation to allow for different perspectives on outcomes and to disseminate the review findings.

Objectives

To determine whether screening for lung cancer using LDCT of the chest reduces lung cancer‐related mortality and to evaluate the possible harms of LDCT screening.

Methods

Criteria for considering studies for this review

Types of studies

We considered randomised controlled trials (RCTs). Randomisation by groups, clusters or individuals was acceptable. All trials reporting mortality as an outcome were eligible for inclusion in the review; however, we did not include those with < 5 years of mortality follow‐up data in quantitative synthesis.

We excluded:

  • observational cohort studies; and

  • case‐series studies.

Types of participants

We included trials with asymptomatic adults from all backgrounds. We excluded trials in adults with previous diagnosis and treatment of lung cancer. We verified entry requirements for all included trials to include only preclinical nodules.

Types of interventions

  • Intervention

    • LDCT, defined in 2016 as a volumetric CT dose index of ≤ 3 mGy in a standard‐sized patient (height 170 cm, weight 70 kg) (Kazerooni 2016). Newer technological improvements (iterative reconstruction) have enabled further dose reductions (Willemink 2013).

  • Comparator

    • LDCT screening versus no screening

    • LDCT screening versus any non‐LDCT intervention, including (but not limited to) CXR, sputum cytology or biomarkers (alone or in any combination)

In addition, we included trials which compared different frequencies of screening with LDCT, such as annual LDCT versus biennial LDCT.

Types of outcome measures

Primary outcomes

  • Lung cancer‐related mortality ≥ 5 years post‐randomisation

  • Harms of screening at any time point, including the number of invasive tests performed in those with a false‐positive diagnosis (positive screen in the absence of lung cancer), and any complications arising from these tests, including death

Secondary outcomes

  • All‐cause mortality (death from any cause, including lung cancer)

  • Lung cancer incidence (during screening and postscreening period in those trials which have recorded the incidence postscreening, to capture data on overdiagnosis where possible). In this review, baseline screen incidence data included both incident and prevalent cases of lung cancer first detected during baseline screening.

  • False‐positive rates and recall rates (proportion of participants recalled for interval CT at 3 months and > 6 months for follow‐up of a nodule or suspected lung cancer)

  • Impact on smoking behaviour: cessation, relapse rates, smoking intensity

  • Health‐related quality of life (HRQoL)/psychosocial consequences. We considered all time points recorded in trials, with an analytic plan for 6 months, 12 months, and 24 months interval assessments.

We recorded, where possible, any other outcomes presented in the primary studies, including but not limited to, stage at diagnosis, histology, radiation exposure, use of biomarkers, response rate, adherence to screening, contamination, interval lung cancers, false negatives, cost, medication implications, and incidental findings.

Search methods for identification of studies

Electronic searches

We searched the following electronic databases from inception to 31 July 2021. We performed the search in collaboration with the Information Specialist of the Cochrane Lung Cancer Group.

  • Cochrane Lung Cancer Group Trial Register

  • Cochrane Central Register of Controlled Trials (CENTRAL, the Cochrane Library, current issue) (Appendix 1)

  • MEDLINE, accessed via PubMed (Appendix 2)

  • Embase (Appendix 3)

We performed the MEDLINE search using the Cochrane highly sensitive search strategy, sensitivity and precision‐maximising version (2008 version) as described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022).

We also conducted searches in the following clinical trials registries to identify unpublished and ongoing trials.

We applied no restriction on language of publication.

Searching other resources

Ongoing trials and grey literature

We used the following additional resources.

  • Abstracts from 2018 and onwards from international lung cancer meetings, including World Conference on Lung Cancer, American Thoracic Society Conference, European Respiratory Society Conference, American Society of Clinical Oncology (ASCO) Conference, European Society of Medical Oncology (ESMO) Conference and European Conference of Clinical Oncology (ECCO)

  • We searched the bibliographies of identified trials and narrative reviews for additional citations.

  • We contacted authors of primary studies and experts in the field of lung cancer screening to determine whether they were aware of any additional relevant unpublished or published studies or works in progress.

We applied no restriction on language of publication.

Data collection and analysis

Selection of studies

We selected trials for inclusion according to the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022).

Two review authors (AB and CM) independently screened all titles and abstracts retrieved by the electronic searches using Covidence (Covidence 2017). Two review authors (AB and DM) then obtained the full texts of all potentially relevant trials and independently checked the eligibility of each trial against the review eligibility criteria. We resolved discordant evaluations by discussion to reach consensus. When necessary, we involved a third review author (RManser). We report the results of the trial selection process using a PRISMA flow diagram (Moher 2009).

Data extraction and management

The review authors developed a data extraction form using Covidence (Covidence 2017). Two review authors (AB and RM) independently extracted relevant data and performed a cross‐check. To reach consensus, we involved a third review author when necessary (RManser or DM). We were not blinded to the names of trial authors nor to the institutions where trials were conducted and funded. When we encountered multiple publications for the same trial, we chose the first publication dealing with the primary endpoint in this review as a study identifier (study ID).

We collected the following data.

  • Source: citation, trial name if applicable and contact details

  • Eligibility criteria and reasons for exclusion

  • Methods: trial design, total duration of trial, number of trial centres and locations, trial setting, date of trial and dates of first and last included participants

  • Characteristics of participants: number of participants, participant characteristics (age, sex, smoking status, performance status), country, ethnicity

  • Characteristics of interventions (e.g. frequency of scanning, dose of CT, duration of screening, interpretation of scans, criteria for significance)

  • Outcomes: primary and secondary outcomes (with definitions) and time points

  • Results: number of participants allocated to each group, and for each outcome of interest, sample size, missing participants, summary data for each group, estimate of effect with confidence interval and P value and subgroup analyses

  • Miscellaneous: funding source, notable conflicts of interest of trial authors

Assessment of risk of bias in included studies

Two review authors (AB and RM) independently applied the Cochrane RoB 1 tool in order to assess quality and potential biases across included trials (Higgins 2017). We rated each domain of the tool as having 'low', 'high', or 'unclear' risk of bias at trial level and for each outcome if possible, and we supported the rating of each domain with a brief description. We summarised risk of bias for each outcome within a trial by considering all domains relevant to the outcome (i.e. both trial‐level entries, such as allocation sequence concealment, and outcome‐specific entries, such as blinding). We provided a figure to summarise the risk of bias.

If the two review authors did not reach consensus, a third review author (RManser or DM) was consulted.

Using the Cochrane RoB 1 tool, we considered the following domains.

  • Selection bias ‐ generation of allocation sequence: we scored 'low risk' when a random component in the sequence generation process was stated, 'high risk' when a non‐random method was used such as date of birth or hospital admission and 'unclear risk' if not specified in the paper.

  • Selection bias ‐ allocation concealment: we scored 'low risk' when an appropriate method of concealing allocation was reported, such as a centralised randomisation scheme, an on‐site computer system or sealed opaque envelopes; 'high risk' when the allocation concealment method was not appropriate; and 'unclear risk' when the method was not specified in the paper.

  • Performance bias ‐ blinding of participants and personnel: we scored 'low risk' when the blinding of participants and key trial personnel was ensured. We scored 'high risk' when there was no blinding or incomplete blinding, and the review outcome (such as smoking behaviour changes) was likely to be influenced by the lack of blinding. We scored 'unclear risk' when there was insufficient information to make this judgement.

  • Detection bias ‐ blinding of outcome assessors: we scored 'low risk' when the outcome assessment was blinded. We scored 'high risk' when the assessment of review outcomes was not blinded. We scored 'unclear risk' when there was insufficient information to make this judgement.

  • Attrition bias ‐ incomplete outcome data: we scored 'low risk' when there were no missing data, reasons for missing data were provided, the amount of missing data was balanced across the groups, or an appropriate method was used to impute missing data. We scored 'high risk' when there was > 20% missing data or an imbalance in the numbers of, or reasons for, missing data across the trial groups. We scored 'unclear risk' when there was insufficient information to make this judgement.

  • Reporting bias ‐ selective reporting: we scored 'low risk' when the trial protocol was available and all prespecified trial outcomes were reported. When the protocol was not available but it was clear from the published papers that all expected outcomes were reported, we still rated these trials at low risk. We scored 'high risk' when not all prespecified outcomes were reported, when outcomes were reported only for subsets of the data, or when outcomes were incompletely reported. We scored 'unclear risk' when there was insufficient information to make this judgement.

  • Other sources of bias ‐ other bias: we scored 'low risk' if the trial appeared to be free of other sources of bias. We scored 'high risk' when there was at least one important bias, for example, the risk of contamination between the intervention and the control groups.

For cluster‐RCTs we addressed the following additional issues (Higgins 2022).

  • Randomisation process: we reported on the number of clusters involved and whether randomisation was performed at a single time point or in batches.

  • Recruitment bias: we investigated whether the participants within a cluster were aware of the intervention, the timing of randomisation relative to the recruitment of individuals, and any baseline imbalance between individuals (rather than clusters).

  • Bias due to deviations from intended interventions: we dealt with this issue in the same way as for the individually‐randomised trials.

  • Bias due to missing outcome data: we reported missing data for both the participants and the clusters.

  • Bias in measurement of the outcome: we reported on this bias in the same way as for the individually‐randomised trials.

  • Other bias: we reported on this bias in the same way as for the individually‐randomised trials.

Measures of treatment effect

For time‐to‐event outcomes (overall survival and relapse‐free survival), we had planned to use hazard ratios (HRs) to measure intervention effects after validating the proportional hazards assumption, so far as possible. However, only a few trials reported the hazard of death from the point of enrolment and reported each HR along with its 95% confidence interval (CI).

For dichotomous outcomes (i.e. lung cancer cases detected by CT screening), we used the extracted data from the original trials for both screened and unscreened controlled groups to estimate the overall incidence of newly‐diagnosed lung cancer cases.

We also calculated the risk of overdiagnosis by estimating the risk ratio (RR) of lung cancer (with 95% CIs) in the screened group compared with the control group, in trials that reported the cumulative incidence of lung cancer after the active phase of screening. The primary analysis for overdiagnosis was limited to trials in which the control group did not have any active screening; however, in a separate analysis, we also estimated the risk of overdiagnosis from CT screening relative to that of CXR screening in those trials where the control group was offered CXR screening.
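As an illustration of the risk ratio described above, the following is a minimal sketch using the standard large‐sample (log‐scale) confidence interval. The function name and the counts are hypothetical and are not taken from any included trial; the review itself does not state which software routine produced its RRs.

```python
import math

def risk_ratio(events_screen, n_screen, events_control, n_control, z=1.96):
    """Risk ratio of screened vs control with a normal-approximation CI on the log scale."""
    rr = (events_screen / n_screen) / (events_control / n_control)
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(
        1 / events_screen - 1 / n_screen + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts: 60 cancers among 1000 screened, 45 among 1000 controls
rr, lower, upper = risk_ratio(60, 1000, 45, 1000)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An RR above 1 with cumulative incidence measured after screening has stopped is what this analysis reads as a signal of overdiagnosis.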

For continuous outcomes (HRQoL), we used mean differences (MDs) between treatment arms when a similar scale was used to measure outcomes, and standardised mean differences (SMDs) when different scales were used to measure the same outcome. We applied SMDs when pooling anxiety data across the four trials that reported on anxiety. We checked whether higher scores had the same meaning across scales for a given outcome, explained the direction of effect, and reported when directions were reversed. We analysed data on an intention‐to‐screen basis.

Unit of analysis issues

For the included RCTs, the individuals were the unit of analysis by practice.

We identified cluster‐RCTs, that is, trials that used cluster randomisation as a way of avoiding contamination bias. Randomisation might have been performed by hospital, centre or city. When including data from these trials in meta‐analyses, we used the effective sample size method as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022). We calculated the effective sample size of groups in each cluster trial as the original sample size divided by the 'design effect'. We calculated the design effect as 1 + (M ‐ 1) × ICC, where M is the average cluster size and ICC is the intracluster correlation coefficient. For dichotomous data, we divided both the total number of participants and the number experiencing the event by the same design effect. For continuous data, only the sample size was reduced and the means and standard deviations (SDs) stayed the same (Higgins 2022).
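The effective sample size adjustment can be sketched as follows. This is an illustrative example only: the function names, cluster size, ICC value and counts are assumptions chosen for the arithmetic, not values from the included trials.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect = 1 + (M - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc

def effective_dichotomous(events, total, mean_cluster_size, icc):
    """Divide both the event count and the group size by the same design effect."""
    deff = design_effect(mean_cluster_size, icc)
    return events / deff, total / deff

# Hypothetical cluster-trial arm: 40 events among 1000 participants,
# average cluster size 21 and an assumed ICC of 0.05
deff = design_effect(21, 0.05)
eff_events, eff_total = effective_dichotomous(40, 1000, 21, 0.05)
print(f"design effect = {deff:.2f}, effective events = {eff_events:.1f}, "
      f"effective sample size = {eff_total:.1f}")
```

Because events and totals shrink by the same factor, the group's event risk is unchanged; only the precision attributed to the trial is reduced.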

Trials with multiple treatment groups

For trials with multiple comparison groups that compared two or more intervention groups with the same control group, we first tried to combine groups to create a single pair‐wise comparison. We calculated within‐study correlation as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022).

Dealing with missing data

When data were missing or unsuitable for analysis, we (DM) contacted trial authors to request further information using email addresses from trial reports, from trial registers or from trial author institutions. When data were missing to the extent that the trial could not be included in the meta‐analysis and attempts to retrieve data had been exhaustive, we presented the results in the review and discussed them in the context of trial findings. For each trial, we checked whether intention‐to‐screen analysis was applied (i.e. the number of analysed participants equalled the number of randomly‐assigned participants).

Assessment of heterogeneity

We followed Cochrane recommendations for assessment of heterogeneity (Higgins 2022). We visually investigated heterogeneity by using forest plots generated via Review Manager 5 (RevMan 5) (Review Manager 2020). We assessed statistical heterogeneity of treatment effects between pooled trials for each considered outcome by using the I² statistic to quantify the degree of heterogeneity (Higgins 2002), and we considered I² > 30% as showing moderate heterogeneity, with I² > 75% signifying substantial heterogeneity.
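For readers unfamiliar with the I² statistic, the sketch below derives it from Cochran's Q using inverse‐variance weights and applies the thresholds stated above. RevMan performs this calculation internally; the function names and example inputs here are hypothetical.

```python
def i_squared(effects, std_errors):
    """I² (%) = max(0, (Q - df) / Q) * 100, with Q from inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

def describe_heterogeneity(i2):
    """Apply the review's stated thresholds (> 30% moderate, > 75% substantial)."""
    if i2 > 75:
        return "substantial"
    if i2 > 30:
        return "moderate"
    return "below the moderate threshold"

# Hypothetical log risk ratios and standard errors from three trials
i2 = i_squared([-0.22, -0.05, -0.30], [0.08, 0.10, 0.15])
print(f"I² = {i2:.1f}% ({describe_heterogeneity(i2)})")
```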

Assessment of reporting biases

We were unable to generate funnel plots or perform Egger's linear regression tests to investigate reporting biases for any of the outcomes, as the maximum number of trials included in a single meta‐analysis (nine) was below the minimum of 10 trials required. We followed recommendations provided in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022). We noted interpretation was difficult when small numbers of trials (< 10) were included. When we observed evidence of small‐study effects, we performed sensitivity analyses according to regression‐based adjustment methods.

Data synthesis

We used intention‐to‐screen analyses by including all randomised people who were invited to screening where possible, and have specified when intention‐to‐screen analysis was not used for a study. When there were repeated observations on participants in long‐term trials, we included outcomes at different time points in separate analyses. We combined data when outcomes from different trials were measured at similar time points.

If sufficient clinically‐similar trials were available, we performed meta‐analyses, applying both fixed‐ and random‐effects meta‐analyses according to recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022). We entered data into RevMan 5 (Review Manager 2020). A review author (RM) entered the data, and a second review author (AB) double‐checked the data for accuracy. We only included trials in the meta‐analysis for lung cancer‐specific mortality and all‐cause mortality if they had at least 5 years of follow‐up. We applied the generic inverse‐variance method and random‐effects models for all types of outcomes. For dichotomous outcomes, we applied the DerSimonian and Laird method (DerSimonian 1986).
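The pooling step can be illustrated with a minimal sketch of generic inverse‐variance meta‐analysis using the DerSimonian and Laird estimate of between‐trial variance. The review's analyses were run in RevMan 5; the function name and the log risk ratios below are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau² (DerSimonian-Laird)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Method-of-moments estimate of between-trial variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2

# Hypothetical log risk ratios and their variances from three trials
log_rr, se, tau2 = dersimonian_laird([-0.22, -0.05, -0.30], [0.0064, 0.0100, 0.0225])
rr = math.exp(log_rr)
lower, upper = math.exp(log_rr - 1.96 * se), math.exp(log_rr + 1.96 * se)
print(f"pooled RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}), tau² = {tau2:.4f}")
```

When tau² is estimated as zero, the random‐effects weights equal the fixed‐effect weights and the two models give the same pooled estimate.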

To calculate overdiagnosis, we used the following formula, based on the diagnosis rate in the screened group, and then bootstrapped the estimate to obtain normal‐based 95% CIs.

 [(lung cancer incidence in LDCT screening group / total number of participants in screening group) ‐ (lung cancer incidence in control group / total number of participants in control group)] / (lung cancer incidence in LDCT screening group / total number of participants in screening group)
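The overdiagnosis calculation can be sketched as below. The review does not describe its bootstrap procedure in detail, so the resampling scheme here (participants resampled with replacement within each arm, then a normal‐based interval from the bootstrap standard deviation) is an assumption, as are the function names and the counts.

```python
import random
import statistics

def overdiagnosis_fraction(cases_screen, n_screen, cases_control, n_control):
    """Excess incidence in the screened group as a fraction of screened-group incidence."""
    risk_screen = cases_screen / n_screen
    risk_control = cases_control / n_control
    return (risk_screen - risk_control) / risk_screen

def _resample_cases(rng, cases, n):
    # Resample n participants with replacement and count how many are cases
    p = cases / n
    return sum(1 for _ in range(n) if rng.random() < p)

def bootstrap_normal_ci(cases_screen, n_screen, cases_control, n_control,
                        n_boot=500, seed=0, z=1.96):
    """Point estimate with a normal-based CI built from the bootstrap standard deviation."""
    rng = random.Random(seed)
    point = overdiagnosis_fraction(cases_screen, n_screen, cases_control, n_control)
    estimates = []
    while len(estimates) < n_boot:
        bs = _resample_cases(rng, cases_screen, n_screen)
        bc = _resample_cases(rng, cases_control, n_control)
        if bs == 0:
            continue  # fraction undefined when the resampled screened arm has no cases
        estimates.append(overdiagnosis_fraction(bs, n_screen, bc, n_control))
    sd = statistics.stdev(estimates)
    return point, point - z * sd, point + z * sd

# Hypothetical counts: 60 cancers among 1000 screened, 45 among 1000 controls
point, lower, upper = bootstrap_normal_ci(60, 1000, 45, 1000)
print(f"overdiagnosis = {point:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```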

Subgroup analysis and investigation of heterogeneity

We investigated the level of heterogeneity. When data were heterogeneous, we sought to identify the sources of this heterogeneity. When heterogeneity remained considerable (I² > 75%), we reported the results narratively without meta‐analysis.

We performed subgroup analyses by:

  • age

  • sex

  • smoking history or validated measures of lung cancer risk (including risk prediction model)

  • screening interval

  • geographical region

  • control type (usual care or CXR)

Sensitivity analysis

We conducted sensitivity analyses to assess whether results were robust to decisions made during the review process, such as our assessments of clinical heterogeneity. We looked at the impact of types of control groups. If we identified sufficient trials, we restricted the analysis to trials at low risk of bias, based on overall risk of bias judgement (Higgins 2017).

Summary of findings and assessment of the certainty of the evidence

As suggested in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022), we presented a summary of findings table, reporting the following outcomes listed in order of priority.

  • Lung cancer‐related mortality, using planned follow‐up time points (predefined by trial as opposed to unplanned, post hoc, extended follow‐up)

  • All‐cause mortality, using planned follow‐up time points

  • Overdiagnosis (this replaced lung cancer incidence)

  • Number of invasive tests (to represent harms of screening)

  • Any death postsurgery (this replaced the impact on smoking behaviour with an additional harm of screening outcome)

  • Anxiety (to represent HRQoL and psychosocial consequences) 

We followed the GRADE approach when creating our summary of findings table, as suggested in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2022). The GRADE approach specifies four levels of certainty (high, moderate, low, or very low) to rate the certainty of evidence in the following domains.

  • Risk of bias

  • Inconsistency

  • Indirectness

  • Imprecision

  • Publication bias

Results

Description of studies

Results of the search

Overall, we identified 5390 citations during our electronic search, of which we selected 43 for full‐text review. The evidence is current to 31 July 2021. Following full‐text review, we included 11 trials (reported in 182 multiple citations). We excluded 30 citations, with additional details provided in Characteristics of excluded studies. We identified an additional 47 citations for the included trials during full‐text review via searching of bibliographies and additional MEDLINE author searches. We identified two RCTs that were in keeping with our review protocol but had not published mortality or harm data (Sagawa 2012; Yang 2018). The first is a Japanese trial that started in 2010, comparing LDCT and CXR over a 10‐year period in people with a smoking history < 30 pack years (e.g. < 1 pack of cigarettes/day for 30 years or < 1.5 packs/day for 15 years etc.) (Sagawa 2012). The second trial (Yang 2018), based in China, similarly includes participants who do not have a strong smoking history; however, participants must have at least one high‐risk factor (family history of cancer or personal history of cancer, occupational exposures to carcinogenic agents, passive or active smoking exposure, or long‐term exposure to cooking oils). This trial started in 2013 and compares three rounds of biennial LDCT with no screening. These trials are described in more detail in Characteristics of ongoing studies.

The included trials were the US National Lung Screening Trial (NLST, Aberle 2011), German Lung Cancer Screening Intervention (LUSI, Becker 2020), French DEPISCAN trial (Blanchon 2007), Dutch‐Belgian Nederlands‐Leuvens Longkanker Screenings Onderzoek trial (NELSON, De Koning 2020), UK Lung Cancer Screening trial (UKLS, Field 2021), US Lung Screening Study (LSS, Gohagan 2005), Italian Detection And screening of early lung cancer by Novel imaging TEchnology trial (DANTE, Infante 2015), North American Jewish Hospital Lung Cancer Screening and Early Detection Study (LaRocca 2002), Italian Lung Cancer Screening trial (ITALUNG, Paci 2017), Multicentric Italian Lung Detection trial (MILD, Pastorino 2012), and the Danish Lung Cancer Screening Trial (DLCST, Wille 2016).

Search results are described in Figure 1.


Study selection flow diagram.


Included studies

Trial design and setting

Eight of the 11 trials were phase 3 RCTs (Aberle 2011; Becker 2020; De Koning 2020; Infante 2015; LaRocca 2002; Paci 2017; Pastorino 2012; Wille 2016), whilst the LSS (Gohagan 2005) and DEPISCAN (Blanchon 2007) trials were feasibility RCTs, and UKLS was a pilot RCT (Field 2021). Three of the 11 trials were conducted in the USA (Aberle 2011; Gohagan 2005; LaRocca 2002); the remaining trials were based in Europe.

All trials were conducted via hospitals or screening centres, with the number of sites varying from 1 to 33. The NLST had the most trial sites (Aberle 2011), followed by the French DEPISCAN trial with 14 sites (Blanchon 2007). 

LaRocca 2002 was the earliest trial to start, in 1991, followed by Gohagan 2005 in 2000. Field 2021 had the latest start date (2011) of the included trials, with the remaining trials starting between 2001 and 2007.

Trial participants

Overall, 94,445 people were included across the trials. The NLST had the largest sample size of the included trials with 53,456 participants (Aberle 2011). The next largest was the NELSON trial with 15,792 (De Koning 2020). Four trials had just over 4000 participants each (Becker 2020; Field 2021; Pastorino 2012; Wille 2016), whilst LSS (Gohagan 2005) and ITALUNG (Paci 2017) had over 3000 participants each. The DANTE trial had 2450 participants (Infante 2015). DEPISCAN had the smallest reported sample size, with 765 participants randomised (Blanchon 2007) and only 621 participants continuing after 144 withdrew consent. LaRocca 2002 reported that 871 participants were randomised, but did not report the allocation of participants.

In the UKLS trial (Field 2021), the number of participants included in Characteristics of included studies and number of participants in some analyses differ, as 87 participants in the UKLS trial were excluded post‐randomisation from analysis of long‐term data.

Inclusion criteria

Inclusion and exclusion criteria were similar across the trials, with overlapping age ranges from 40 years and above. Nine of the 11 trials had a lower age limit of 50 years or above (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Wille 2016). Ten of the 11 trials had an upper age limit of 75 years or less (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; LaRocca 2002; Paci 2017; Wille 2016). All trials, except UKLS (Field 2021), had a strong smoking history requirement as part of the inclusion criteria (at least 20 pack years). Field 2021 was one of the few trials to use a risk prediction model, with participants requiring a 5% risk of developing lung cancer in 5 years, based on the Liverpool Lung Project (LLP) Risk Prediction Model version 2 (LLPv2). The LLPv2 is a lung cancer risk calculator that incorporates factors such as age, tobacco smoking history, personal history of pulmonary disease or cancer, family history of lung cancer and occupational exposures (Field 2016).

Of note, the DANTE trial excluded all women from the trial (Infante 2015), and the NELSON trial (De Koning 2020) only recruited women in the Belgian arm of the trial, and not for the Netherlands cohort. No included trial reported equal representation of male and female participants.

In LaRocca 2002, participants required a normal or stable CXR prior to randomisation. The DANTE trial also required a baseline CXR and sputum cytology with clinical examination in both arms of their trial (Infante 2015). 

In addition to the basic demographics provided in the Characteristics of included studies, the NLST included information about education status (Aberle 2011), with 32% of participants having a college degree or higher level of education. Only 48% of their cohort were current smokers. Weight data were also collected, with 1% of their cohort underweight, 28% normal weight, 43% overweight and 28% obese. In the UKLS trial (Field 2021), 46% of the cohort had an education up to or equal to secondary level and 54% beyond secondary school. The DLCST participants had a relatively even distribution of low, middle, and high socioeconomic status (Wille 2016), with 74% of the cohort having 10 years or less of schooling.

Intervention

All trials used chest LDCT as their primary intervention, with reported settings ranging from 90 kVp to 140 kVp and 20 mA to 60 mA. The frequency and duration of LDCT varied between trials, with annual LDCT occurring in nine of the 11 trials (Aberle 2011; Becker 2020; Blanchon 2007; Gohagan 2005; Infante 2015; LaRocca 2002; Paci 2017; Pastorino 2012; Wille 2016). In the UKLS trial (Field 2021), only one LDCT was performed during the trial. The LSS trial conducted annual screening over 2 years (Gohagan 2005), whilst DEPISCAN (Blanchon 2007) and NLST (Aberle 2011) performed annual LDCT for 3 years. The ITALUNG trial performed annual LDCT for 4 years (Paci 2017), whilst four of the 11 trials performed annual LDCT screening for 5 years (Becker 2020; Infante 2015; LaRocca 2002; Wille 2016). The MILD trial had two intervention arms (Pastorino 2012), one for biennial scans and one for annual scans; over the 10‐year screening period, the biennial arm had a median of four LDCT scans whilst the annual group had a median of seven LDCT scans. The NELSON trial used incrementing intervals for the LDCT (De Koning 2020), with a baseline scan, then at 1 year, 2 years, and 2.5‐year intervals.

The majority of the trials used no screening for the control arm (Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016); however, four of the 11 trials used annual CXR in the comparison arm for the duration of the screening period (Aberle 2011; Blanchon 2007; Gohagan 2005; LaRocca 2002).

Six of the 11 trials used diameter criteria, without volumetric assessment using computer‐assisted tools, to determine the significance of pulmonary nodules (Aberle 2011; Blanchon 2007; Gohagan 2005; Infante 2015; LaRocca 2002; Paci 2017). LUSI (Becker 2020), UKLS (Field 2021), and DLCST (Wille 2016) used both diameter and volumetric criteria to determine nodule significance. The NELSON trial (De Koning 2020) and MILD trial (Pastorino 2012) used volumetric analysis only, evaluating nodules at baseline and calculating the volume doubling time of nodules at 3‐month follow‐up.
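Volume doubling time (VDT) is conventionally derived from two volume measurements under an assumption of exponential growth. The sketch below uses that standard formula and the NELSON growth thresholds listed in Table 1; it does not reproduce any trial's actual software, it ignores the 'new solid component' criterion for GROWCAT C, and the function names and nodule volumes are hypothetical.

```python
import math

def volume_doubling_time(volume_first, volume_second, interval_days):
    """VDT in days = interval * ln(2) / ln(V2 / V1), assuming exponential growth."""
    if volume_first <= 0 or volume_second <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if volume_second <= volume_first:
        return math.inf  # no measurable growth, so the nodule never doubles
    return interval_days * math.log(2) / math.log(volume_second / volume_first)

def growth_category(vdt_days):
    """Growth category by VDT alone: A (> 600 days), B (400 to 600 days), C (< 400 days)."""
    if vdt_days > 600:
        return "A"
    if vdt_days >= 400:
        return "B"
    return "C"

# Hypothetical nodule: 100 mm³ at baseline, 130 mm³ at the 3-month (90-day) scan
vdt = volume_doubling_time(100.0, 130.0, 90)
print(f"VDT = {vdt:.0f} days, category {growth_category(vdt)}")
```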

Outcomes and follow‐up

Of the published data, follow‐up ranged from 5 to 12 years post‐randomisation (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). NLST (Aberle 2011), NELSON (De Koning 2020), ITALUNG (Paci 2017), and MILD (Pastorino 2012) all have median follow‐ups of 10 or more years. The DANTE (Infante 2015) and MILD (Pastorino 2012) trials both published mortality data before and beyond 5 years, with only the later time points included. Yang 2018 published 2‐year mortality data following the baseline scan; however, this trial is ongoing.

Eight of the 11 trials used prespecified nodule follow‐up (Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). LaRocca 2002 did not state if any protocol was used; however, NLST (Aberle 2011) and LSS (Gohagan 2005) stated they did not use a trial‐wide algorithm for nodule follow‐up. Nodule management for each trial is described in Table 1.

Table 1. Nodule management

Columns: Trial; Interpretation; Management

Aberle 2011

Positive scan: findings suspicious of lung cancer, such as non‐calcified nodule ≥ 4 mm, lung consolidation, or obstructive atelectasis, nodule enlargement, and nodules with suspicious changes in attenuation

No trial‐wide algorithm

Becker 2020

Positive scan: any nodule ≥ 5 mm

  • No abnormality or nodule < 5 mm: routine screening

  • Nodules 5 mm to 7 mm: early recall (6 months)

  • Nodules 8 mm to 10 mm: earlier recall (3 months)

  • Nodules > 10 mm: immediate recall

On recall scans

  • VDT > 600 days: back to routine scans

  • VDT 400 to 600 days: early recall at 6 months

  • < 7.5 mm: early recall at 6 months

  • ≥ 7.5 mm to 10 mm: early recall at 3 months

  • VDT ≤ 400 days or > 10 mm diameter: immediate recall

 

Blanchon 2007

Positive scan: non‐calcified nodule > 5 mm

  • Non‐calcified nodule ≤ 5 mm: repeat LDCT in 1 year

  • Non‐calcified nodule > 5 mm and < 10 mm: repeat LDCT in 3 months

 

If no change: repeat scan at 6 months, 12 months and 24 months from baseline. If growth at any time: histological diagnosis.  

 

  • Non‐calcified nodule ≥ 10 mm: CT with contrast versus PET versus histological diagnosis discussed in MDM with pulmonary oncologist, radiologist and thoracic surgeon

De Koning 2020

Classification of non‐calcified nodules:

  • NODCAT 1: benign nodule (fat/benign calcifications) or other benign characteristics

  • NODCAT 2: any nodule, smaller than NODCAT 3 and no characteristics of NODCAT 1

  • NODCAT 3: solid (50 mm³ to 500 mm³), solid/pleural based (5 mm dmin to 10 mm dmin), partial solid/non‐solid component (≥ 8 mm dmean), partial solid/solid component (50 mm³ to 500 mm³), non‐solid (≥ 8 mm dmean)

  • NODCAT 4: solid (> 500 mm³), solid/pleural based (> 10 mm dmin), partial solid/solid component (> 500 mm³)

Classification of nodules based on growth:

  • GROWCAT A: VDT > 600 days

  • GROWCAT B: VDT 400 days to 600 days

  • GROWCAT C: VDT < 400 days or a new solid component in a non‐solid lesion

 

Management of non‐calcified nodules based on baseline screening

  • NODCAT 1: negative test, annual CT

  • NODCAT 2: negative test, annual CT

  • NODCAT 3: indeterminate test, 3‐month follow‐up CT

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT: positive test, histological diagnosis            

Management protocol for non‐calcified nodules at incidence screening

  • NODCAT 1: negative test, CT in year 4

  • NODCAT 2: indeterminate test, CT in year 3

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT C: positive test, histological diagnosis required

At year 4

  • NODCAT 1: negative test, CT in year 6

  • NODCAT 2: indeterminate test, CT after 1 year

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist

  • GROWCAT A: negative test, CT in year 6

  • GROWCAT B: indeterminate test, repeat CT after 1 year

  • GROWCAT C:  positive test, refer to pulmonologist 

At year 6

  • NODCAT 1: negative test, end of screening

  • NODCAT 2: indeterminate test, end of screening

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT A: negative test, end of screening

  • GROWCAT B: indeterminate screening, CT after 1 year

  • GROWCAT C: positive test, refer to pulmonologist                                                         

Preoperative biopsy was not routine.

Suspicious nodules were removed by VATS or thoracotomy with wedge resection+frozen section.
Lobectomies were performed only for central nodules that could not be approached by wedge resection.
If cancer was diagnosed by VATS, the procedure was converted to an open thoracotomy with sampling of lobar, interlobar, hilar and mediastinal lymph nodes as VATS resection in lung cancer was not fully implemented at the time of trial in the Netherlands. Mediastinoscopy was performed before proceeding to VATS or thoracotomy in subjects with mediastinal lymph nodes > 10 mm in short axis and/or positive nodes.

Field 2021

Classification of nodules:

  • Cat 1: nodules containing fat or with a benign pattern of calcification are considered benign. Solid nodules < 15 mm³ or, if pleural or juxtapleural, < 3 mm in diameter.

  • Cat 2: solid intraparenchymal nodules with a volume of 15 mm³ to 49 mm³. Pleural or juxtapleural nodules with a maximal diameter of 3.1 mm to 4.9 mm. Part solid nodules with a maximal non‐solid component of < 5 mm diameter or where the solid component volume is < 15 mm³.

  • Cat 3: solid intraparenchymal nodules with a volume of 50 mm³ to 500 mm³. Pleural or juxtapleural nodules with a maximal diameter of 5 mm to 9.9 mm. Non‐solid nodules with a maximal diameter of > 5 mm or part solid nodules where solid component volume is 15 mm³ to 500 mm³.

  • Cat 4: solid intraparenchymal nodules with a volume > 500 mm³, pleural or juxtapleural nodules with a maximal diameter of ≥ 10 mm. Part solid nodules with a solid component with a volume > 500 mm³.

  • Cat 1: nil further scans

  • Cat 2: follow‐up CT in 1 year and assessed for VDT or new solid component in non‐solid nodule

    • If no growth, stop follow‐up

    • If growth, refer for MDT assessment

  • Cat 3: follow‐up CT in 3 months and assessed for VDT or new solid component in non‐solid nodule. If no growth then CT in 9 months

    • If VDT > 400 days, stop follow‐up

    • If VDT ≤ 400 days then MDT assessment

    • If growth then MDT assessment

  • Cat 4: MDT assessment

Gohagan 2005

  • Positive scan: any non‐calcified nodule ≥ 4 mm

Other abnormalities could also be considered suspicious for lung cancer at the discretion of the radiologist. 

No trial‐wide algorithm for management

  • Telephone call to patient with positive test and urged to seek medical follow‐up with additional follow‐up calls at 4 weeks +/‐ 8 weeks if follow‐up had not begun at the 4‐week phone call

  • Referrals to specialists for follow‐up of positive screening; results were provided if requested by the participant

Infante 2015

  • Positive scan: non‐calcified pulmonary nodules ≥ 10 mm in diameter or smaller but showing spiculated margins, or non‐nodular lesions such as a hilar mass, focal ground glass opacities, major atelectasis, endobronchial lesions, mediastinal adenopathy, pleural effusion or pleural masses

No set trial‐wide algorithm for management

  • If lesion smooth and < 10 mm in size: LDCT at 3, 6, and 12 months

    • If no change occurs: follow‐up after 1 year

  • Non‐smooth lesion ≥ 6 mm but ≤ 10 mm: oral antibiotics and new HRCT after 6 to 8 weeks 

    • If no regression occurs: evaluation on a case‐by‐case basis as to the opportunity to follow the lesion or to perform invasive procedures to obtain a tissue diagnosis

  • Lesion ≥ 10 mm but ≤ 20 mm: oral antibiotics and new HRCT after 6 to 8 weeks

    • If no regression occurs: PET. If PET is positive: tissue diagnosis. If PET is negative: close follow‐up

  • Lesion ≥ 20 mm: discretional oral antibiotics and new HRCT or standard CT + PET

    • If PET positive: tissue diagnosis

    • If PET negative: close follow‐up 

  • Focal ground glass opacities: oral antibiotics and new HRCT after 6 to 8 weeks. Evaluation on case‐by‐case basis as to opportunity to follow lesion or obtain tissue diagnosis based on the size, number of lesions, location and ratio of any solid versus non‐solid component

LaRocca 2002

  • Positive scan: ≥ 5 mm nodule with suspicious features

  • Abnormal mass > 10 mm in diameter or 5 mm to 10 mm in diameter and highly suspicious for malignancy: CXR and tissue diagnosis is obtained

  • If the abnormal mass ≤ 10 mm in diameter: thin section high resolution image of the mass is obtained

  • If this image is normal or benign, annual spiral CT scanning is continued.

  • If the image is indeterminate, a repeat high‐resolution scan is performed in 3 months.

  • If the image is unchanged at 3 months, annual spiral CT scanning is continued.

  • If the mass is larger at 3 months: CXR and tissue diagnosis is performed.

Paci 2017

  • Positive scan: at least one non‐calcified nodule ≥ 5 mm or a non‐solid nodule ≥ 10 mm or the presence of a part‐solid nodule

  • Solid non‐calcified nodule ≥ 8 mm and non‐solid non‐calcified nodule > 10 mm: PET

    • If PET positive: FNA recommended (if FNA negative or indeterminate: 3 month follow‐up scan)

    • If PET negative: 3 month follow‐up scan

    • All cases with no nodule growth at follow‐up exam were invited to annual repeat CT scan

  • Solid or part‐solid non‐calcified nodules with a diameter of 5 mm to 7 mm: follow‐up LDCT after 3 months

    • If significant growth (increase ≥ 1 mm in mean diameter in a solid nodule or increase of solid component in a part solid nodule): considered potentially malignant

    • If considered potentially malignant and peripheral nodule, FDG PET or CT‐guided FNA arranged

    • If considered potentially malignant and deep nodule, FDG PET or bronchoscopy arranged.

    • Bronchoscopy also performed for airway abnormalities

  • If screening test revealed focal abnormalities consistent with inflammatory disease: antibiotic therapy and 1 month follow‐up CT recommended

  • In case of complete resolution, the subject was sent to annual repeat screening

  • In case of partial or lack of resolution, further 2‐month follow‐up CT performed

  • All subjects with FNA evidence of malignancy underwent a staging CT (CT chest/abdominal/head and neck exam with IV contrast).

 

Pastorino 2012

  • Negative nodule: non‐calcified nodule < 60 mm3 or nodules with fat or benign pattern of calcification

  • Indeterminate: non‐calcified nodules 60 mm3 to 250 mm3

  • Positive result: non‐calcified nodules > 250 mm3

  • Positive result was also based on findings such as non‐calcified hilar or mediastinal lymphadenopathy, atelectasis, consolidation, pleural findings

  • Solid lesions < 60 mm3 in volume (diameter < 4.8 mm) considered negative: repeat LDCT for 1 or 2 years

  • Nodules with a volume of 60 mm3 to 250 mm3 (5 mm to 8 mm in diameter): underwent repeat CT exam after 3 months

  • Nodules with a volume > 250 mm3: additional work‐up including PET or lung biopsy

  • Volumetric growth was used on serial imaging with significant growth considered ≥ 25% after 3‐month interval

    • If no growth, back to planned screening intervals

  • Ground glass opacities were conservatively managed.

Wille 2016

  • Category 1: nodules ≤ 15 mm in maximal diameter with benign characteristics or ≤ 20 mm for calcified nodules

  • Category 2: nodules < 5 mm

  • Category 3: nodules 5 mm to 15 mm not classified as benign

  • Category 4: nodules > 15 mm or suspicious morphology

  • Category 5: growing nodules (increase in volume ≥ 25%)

  • Category 1 and 2: nil further action

  • Category 3: indeterminate: repeat scan in 3 months

  • Category 4 and 5: diagnostic investigation

CT: computed tomography; CXR: chest x‐ray; dmean: mean diameter; dmin: minimal diameter; FDG PET: fluorodeoxyglucose positron emission tomography; FNA: fine needle aspiration; HRCT: high‐resolution computed tomography; IV: intravenous; LDCT: low‐dose computed tomography; MDM: multidisciplinary meeting; MDT: multidisciplinary team; PET: positron emission tomography; VATS: video‐assisted thoracoscopic surgery; VDT: volume doubling time.
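To illustrate how the volume‐based thresholds in the table above could be applied, the following Python sketch encodes the MILD (Pastorino 2012) volume categories and the ≥ 25% growth rule. It also includes two helper calculations that are not stated in the table and are assumptions made here: the diameter of a sphere of equal volume, and the standard exponential‐growth formula for volume doubling time (VDT), of the kind that could feed a 400‐day cut‐off such as the one listed for Cat 3 nodules. All function names are hypothetical and the sketch is not any trial's actual software.

```python
import math


def mild_volume_category(volume_mm3: float) -> str:
    """Classify a non-calcified solid nodule using the MILD volume thresholds."""
    if volume_mm3 < 60:
        return "negative"       # repeat LDCT at the planned interval
    if volume_mm3 <= 250:
        return "indeterminate"  # repeat CT after 3 months
    return "positive"           # additional work-up (PET or lung biopsy)


def equivalent_sphere_diameter_mm(volume_mm3: float) -> float:
    """Diameter of a sphere with the given volume (assumes a spherical nodule)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


def significant_growth(v1_mm3: float, v2_mm3: float, threshold: float = 0.25) -> bool:
    """True if the relative volume increase meets the 25% growth rule."""
    return (v2_mm3 - v1_mm3) / v1_mm3 >= threshold


def volume_doubling_time_days(v1_mm3: float, v2_mm3: float, interval_days: float) -> float:
    """VDT under an assumed exponential growth model; infinite if no growth."""
    if v1_mm3 <= 0 or v2_mm3 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v2_mm3 <= v1_mm3:
        return math.inf
    return interval_days * math.log(2.0) / math.log(v2_mm3 / v1_mm3)


if __name__ == "__main__":
    print(mild_volume_category(180.0))
    print(round(equivalent_sphere_diameter_mm(60.0), 1))
    print(volume_doubling_time_days(100.0, 130.0, 90.0) <= 400)
```

Under the spherical assumption, a 60 mm3 nodule corresponds to a diameter of a little under 5 mm, which is consistent with the diameter ranges quoted alongside the volume thresholds.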

Of the 11 trials, nine had a primary outcome that included lung cancer‐related mortality (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; LaRocca 2002; Paci 2017; Pastorino 2012; Wille 2016). The LSS trial had the primary outcome of feasibility of enrolling participants in a lung cancer screening programme (Gohagan 2005); however, it also had outcomes assessing harms of screening, such as extent of diagnostic follow‐up after abnormal screening findings. The DEPISCAN trial also had primary feasibility outcomes of enrolling participants in a lung cancer screening programme (Blanchon 2007), but also reported outcomes on harms and adverse events during diagnostic procedures, as well as the number of futile thoracotomies for benign lesions.

Excluded studies

We excluded 30 studies for the following reasons.

Details of these citations are provided in Characteristics of excluded studies.

Risk of bias in included studies

We performed the risk of bias assessment for all included trials with the Cochrane RoB 1 tool (Higgins 2017), and summarised the results in Characteristics of included studies, Figure 2 and Figure 3.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.



Risk of bias summary: review authors' judgements about each risk of bias item for each included study.


Allocation

We deemed allocation concealment adequate in nine of the 11 trials, suggesting a low risk of bias (Aberle 2011; Becker 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016; De Koning 2020). The remaining two trials had unclear risk of bias (Blanchon 2007; LaRocca 2002), with insufficient information available to determine if a centralised process was used.

We judged sequence generation adequate in nine of the 11 trials, suggesting low risk of bias (Aberle 2011; Becker 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016; De Koning 2020). The remaining two trials had unclear risk of bias (Blanchon 2007; LaRocca 2002), with insufficient information available to determine if a random method of sequence generation was used.

Blinding

Due to the nature of the intervention, no trial participants were blinded to their trial arm in included trials. For this review, lack of blinding of participants was unlikely to influence the primary outcomes (lung cancer‐related mortality and harms of screening). Blinding of assessors for the primary outcome of lung cancer‐related mortality was assessed as adequate in five of the 11 trials (Aberle 2011; Field 2021; Pastorino 2012; Paci 2017; Wille 2016). The UKLS trial (Field 2021) only assessed cause of death from registries and death certificates, without the use of a review board.

Two trials did not provide information regarding blinding of assessors (Blanchon 2007; LaRocca 2002), and we judged these at unclear risk. We also deemed Becker 2020 at unclear risk of bias; whilst the assessors in the trial were blinded to the arm when assessing lung cancer‐related mortality, the method of identification of lung cancer was not uniform, with 11 of the 67 cases in the control arm and 1 of the 85 cases in the intervention arm detected on death certificate only.

We deemed three of the 11 trials at high risk of bias (De Koning 2020; Gohagan 2005; Infante 2015): neither the LSS (Gohagan 2005) nor the DANTE trial (Infante 2015) blinded assessors; LSS (Gohagan 2005) only assessed cause of death from death certificates, without the use of a review board; and the NELSON trial (De Koning 2020) raised concerns regarding the method of assessing lung cancer as the cause of death: it changed from using a death review panel to using death certificates only during active follow‐up, with assessors also unblinded in 2018.

Some outcome measurements, such as all‐cause mortality, were not likely to be influenced by lack of blinding.

Incomplete outcome data

Missing data and withdrawals were adequately described in nine of the 11 included trials (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016), and we deemed these at low risk of bias. Of note, the NLST (Aberle 2011) included in its analysis 192 participants who were deemed ineligible for the trial post‐randomisation, and completed active follow‐up at the end of December 2009 (meaning the remaining causes of death were assessed as per the registries). The LSS (Gohagan 2005) excluded from analysis the 91 participants found to be ineligible post‐randomisation. The ITALUNG trial (Paci 2017) had moderate rates of dropout and non‐adherence (81% adherence to screening), but used intention‐to‐treat analysis. The remaining two trials had insufficient information available to make a judgement and we deemed them at unclear risk (Blanchon 2007; LaRocca 2002).

Selective reporting

We judged nine of the 11 included trials at low risk of reporting bias (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). Whilst the NELSON trial (De Koning 2020) has not published its cost‐analysis data, information from the authors confirmed their intention to do so. We judged the remaining two trials at unclear risk due to insufficient data available (Blanchon 2007; LaRocca 2002).

Other potential sources of bias

There were minimal deviations from protocols and balanced baselines in five of the 11 trials, and we deemed these at low risk of other bias (Aberle 2011; Becker 2020; Gohagan 2005; Paci 2017; Wille 2016). The DLCST (Wille 2016) reported a difference in baseline characteristics between the two groups in mean forced expiratory ratio (FER) (although no difference in mean forced expiratory volume in 1 second (FEV1)) and number of participants with > 35 pack‐year smoking history; however, we judged the size of the difference unlikely to have had a significant impact on outcomes. Blanchon 2007 and LaRocca 2002 both had insufficient data published to enable us to make an assessment and we judged these at unclear risk of bias. We judged four of the 11 trials at high risk of other bias (De Koning 2020; Field 2021; Infante 2015; Pastorino 2012). We deemed the NELSON (De Koning 2020) trial at high risk of bias due to a change in method of determining lung cancer‐related death during the trial, as well as additional amendments to the protocol to add a scan interval of 2.5 years after trial commencement. The trial also did not recruit any women in the Netherlands arm of the trial. Similarly, in the DANTE (Infante 2015) trial women were excluded, and there was an unbalanced baseline between trial arms with respiratory comorbidity more prevalent in the LDCT arm. The UKLS (Field 2021) trial excluded 87 participants from long‐term mortality and incidence analysis and did not use intention‐to‐screen analysis; however, we judged that the number of participants was likely too small to have an impact on results. It should be noted that LLPv2 was unintentionally used rather than LLPv1 as the risk prediction model in UKLS (Field 2021). The MILD (Pastorino 2012) trial had an unbalanced baseline between arms, with 90% of the control arm being current smokers compared with 69% of the LDCT arm. Additionally, when the MILD (Pastorino 2012) trial commenced recruitment, there were only the annual LDCT and biennial LDCT groups, with the no‐screening control group added later.

Effects of interventions

See: Summary of findings 1 Low‐dose computed tomography (LDCT) screening compared to no LDCT screening for lung cancer‐related mortality

Primary outcomes

1) Lung cancer‐related mortality
Lung cancer‐related mortality using planned follow‐up time points

We pooled the latest time point for planned lung cancer‐related mortality for all available trials. We included eight trials in this analysis (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). Time points (median time post‐randomisation) for these trials were 6.5 years, 8.8 years, 10 years, 7 years, 8.5 years, 9.3 years, 10 years, and 10 years respectively. We did not include the LSS (Gohagan 2005) as the planned follow‐up for the trial was only 2 years. The evidence showed a difference in lung cancer‐related mortality favouring screening with LDCT, with a reduction in lung cancer‐related mortality of 21% (risk ratio (RR) 0.79, 95% confidence interval (CI) 0.72 to 0.87; 8 trials, 91,122 participants; I2 = 0%; moderate‐certainty evidence; Analysis 1.1). The number needed to screen to prevent one additional lung cancer‐related death was 226. The NLST (Aberle 2011) and NELSON (De Koning 2020) trials both had strong weighting in this analysis, with the DLCST (Wille 2016) and DANTE (Infante 2015) trials demonstrating probably no difference with LDCT screening on lung cancer‐related mortality. When we performed sensitivity analysis using three trials with low risk of bias (Aberle 2011; Paci 2017; Wille 2016), the evidence still favoured screening, with a reduction in lung cancer‐related mortality (RR 0.81, 95% CI 0.71 to 0.92; 3 trials, 60,764 participants; I2 = 0%; high‐certainty evidence; Figure 4). The number needed to screen to prevent one additional lung cancer‐related death was 296.


Lung cancer mortality ‐ Planned time points ‐ Sensitivity analysis


When we analysed hazard ratios (HRs) from Becker 2020, Infante 2015, and Wille 2016 at the > 8 to 10‐year planned follow‐up time point post‐randomisation, there was probably no difference for people at risk for lung cancer‐related mortality with LDCT screening (HR 0.93, 95% CI 0.72 to 1.19; 3 trials, 10,606 participants; I2 = 0%; Analysis 1.2).
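The numbers needed to screen quoted above follow from an absolute risk reduction, which depends on both the risk ratio and the baseline risk in the control group. The Python sketch below shows that relationship. The function name is hypothetical, and the baseline risk used in the example is an illustrative assumption, since this section does not report the control‐group risk that was used.

```python
def number_needed_to_screen(control_risk: float, risk_ratio: float) -> float:
    """NNS = 1 / absolute risk reduction, where ARR = control_risk * (1 - RR)."""
    if not 0.0 < control_risk <= 1.0:
        raise ValueError("control_risk must be a proportion in (0, 1]")
    absolute_risk_reduction = control_risk * (1.0 - risk_ratio)
    if absolute_risk_reduction <= 0.0:
        raise ValueError("no risk reduction: NNS is undefined for RR >= 1")
    return 1.0 / absolute_risk_reduction


if __name__ == "__main__":
    # Hypothetical baseline: 2.1% lung cancer deaths in the control group.
    assumed_control_risk = 0.021
    pooled_rr = 0.79  # pooled risk ratio reported in Analysis 1.1
    print(number_needed_to_screen(assumed_control_risk, pooled_rr))
```

Because the result scales inversely with the baseline risk, the same risk ratio yields a larger number needed to screen in a lower‐risk population.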

Lung cancer‐related mortality using planned and unplanned follow‐up time points

We also grouped trial results by time points, including planned and unplanned extended follow‐up, as depicted in Analysis 1.3.

  • 5 to 6 years post‐randomisation: four RCTs reported this outcome at this time point (Becker 2020; De Koning 2020; Gohagan 2005; Wille 2016). There was probably no difference between LDCT and control groups in relation to lung cancer‐related mortality (RR 0.89, 95% CI 0.64 to 1.24; 4 trials, 27,263 participants; I2 = 42%). Heterogeneity amongst trials was moderate, but within acceptable limits. On average, 466 people would have to be screened to prevent one additional lung cancer‐related death.

  • More than 6 to 8 years post‐randomisation: we included three RCTs (Aberle 2011; De Koning 2020; Field 2021). The evidence showed there was a difference in lung cancer‐related mortality favouring LDCT screening over no screening (RR 0.77, 95% CI 0.69 to 0.86; 3 trials, 73,211 participants; I2 = 0%). On average, 233 people would have to be screened to prevent one additional death related to lung cancer.

  • More than 8 to 10 years post‐randomisation: we included six RCTs (Becker 2020; De Koning 2020; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016), and once more, pooling data showed a difference favouring LDCT screening in lung cancer‐related mortality (RR 0.79, 95% CI 0.69 to 0.90; 6 trials, 33,700 participants; I2 = 0%). Of note, screening in DANTE (Infante 2015) and DLCST (Wille 2016) probably made no difference; however, they were smaller trials. The MILD (Pastorino 2012) trial combined both biennial and annual trial group mortality data for the outcome. On average, 163 people would have to be screened to prevent one additional death from lung cancer.

  • More than 10 years post‐randomisation: we included three RCTs (Aberle 2011; De Koning 2020; Paci 2017), and the evidence showed a difference favouring LDCT screening in lung cancer‐related mortality (RR 0.86, 95% CI 0.75 to 0.98; 3 trials, 72,447 participants; I2 = 48%). Heterogeneity amongst trials was moderate, but within acceptable limits. On average, 222 people would have to be screened to prevent one additional death from lung cancer.

Lung cancer‐related mortality by time postcompletion of screening

We grouped trial results by years postcompletion of screening using planned and unplanned time point follow‐up data from all nine available trials (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016) in Analysis 1.5. When multiple time points were available for a trial within one bracket of time, we used the latest time point data.

  • Zero to 1 year postscreening completion: we included four RCTs (Becker 2020; De Koning 2020; Pastorino 2012; Wille 2016). The evidence showed a difference in lung cancer‐related mortality favouring screening (RR 0.76, 95% CI 0.61 to 0.94; 4 trials, 28,044 participants; I2 = 0%). On average, 324 people would need to be screened to prevent one additional death related to lung cancer.

  • 2 to 4.5 years postscreening completion: we included five RCTs (Aberle 2011; Becker 2020; De Koning 2020; Gohagan 2005; Infante 2015). The evidence favoured LDCT screening for lung cancer‐related mortality (RR 0.82, 95% CI 0.72 to 0.93; 5 trials, 79,063 participants; I2 = 18%). On average, 262 people would need to be screened to prevent one additional lung cancer‐related death.

  • 5 to 7 years postscreening completion: we included four RCTs (De Koning 2020; Field 2021; Paci 2017; Wille 2016). The evidence favoured screening for lung cancer‐related mortality (RR 0.78, 95% CI 0.67 to 0.90; 4 trials, 27,067 participants; I2 = 0%). On average, 149 people would need to be screened to prevent one additional death related to lung cancer.

  • More than 7 to 10 years postscreening completion: we included two RCTs (Aberle 2011; Paci 2017). There was probably no difference between the groups for lung cancer‐related mortality (RR 0.92, 95% CI 0.83 to 1.01; 2 trials, 56,658 participants; I2 = 6%).

Lung cancer‐related mortality by different subgroups

  • By screening arm

Planned time periods: we pooled lung cancer‐related mortality data from all eight available trials and divided the data into subgroups based on control arm comparator, CXR (Aberle 2011) or no screening (Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016), using the latest planned follow‐up time point for each trial in Analysis 1.4.

  1. For usual care: the evidence showed a difference in lung cancer‐related mortality favouring screening when compared to usual care (RR 0.78, 95% CI 0.69 to 0.88; 7 trials, 37,668 participants; I2 = 0%).

  2. For CXR: the evidence also favoured LDCT over CXR (RR 0.80, 95% CI 0.70 to 0.92; 1 trial, 53,454 participants).

There was no difference between subgroups. Test for subgroup differences: Chi² = 0.11, df = 1 (P = 0.74), I² = 0% (Analysis 1.4).

  • By screening intervals

We also presented the latest planned time point data from nine available trials by screening interval in Analysis 1.6. The MILD trial (Pastorino 2012) had mortality data presented separately by intervention group (biennial and annual). The NELSON (De Koning 2020) trial, with incremental intervals, demonstrated a reduction in lung cancer‐related mortality (RR 0.75, 95% CI 0.62 to 0.90; 1 trial, 15,789 participants), while data from NLST (Aberle 2011), which had three annual screens, also favoured LDCT (RR 0.80, 95% CI 0.70 to 0.92; 1 trial, 53,454 participants). The overall results favoured LDCT screening for lung cancer‐related mortality (RR 0.79, 95% CI 0.72 to 0.87; 9 trials, 91,122 participants; I2 = 0%), with no subgroup difference (test for subgroup differences: Chi² = 3.38, df = 6 (P = 0.76), I² = 0%).

  • By sex

Five trials reported mortality due to lung cancer by sex (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015), as depicted in Analysis 1.8 and Analysis 1.7.

  1. For women: we included four RCTs (Aberle 2011; Becker 2020; De Koning 2020; Field 2021). We used data from the latest planned time point available for this analysis. The evidence showed a difference in lung cancer‐related mortality in women favouring LDCT screening, and screening reduced the risk by 29% (RR 0.71, 95% CI 0.59 to 0.86; 4 trials, 26,965 participants; I2 = 0%; Analysis 1.8). The pooled HRs from three RCTs suggested that screening reduced the risk of death by 27% compared to no screening (HR 0.73, 95% CI 0.34 to 1.56; 3 trials, 4286 participants; I2 = 64%; Analysis 1.7); however, the 95% CI included 1, so there was probably no difference between the two arms. Removing Wille 2016 reduced the heterogeneity between trials without changing the finding (HR 0.50, 95% CI 0.23 to 1.07; 2 trials, 2449 participants; I2 = 15%) (analysis not shown).

  2. For men: we included five RCTs (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015). We used data from the latest planned time point available for this analysis. The evidence showed a difference in lung cancer‐related mortality in men favouring LDCT screening, and screening reduced risk by 15% (RR 0.85, 95% CI 0.76 to 0.95; 5 trials, 52,833 participants; I2 = 0%; Analysis 1.8). Analysis of HRs (HR 0.76, 95% CI 0.52 to 1.12; 2 trials, 5658 participants) suggested that screening could reduce the risk of death by 24% compared to no screening among men; however, the 95% CI included 1, so there was probably no difference for men at risk for lung cancer‐related mortality with LDCT screening (Analysis 1.7).

There was no difference between the two subgroups. Test for subgroup differences: Chi² = 2.49, df = 1 (P = 0.11), I² = 59.9%.
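The test for subgroup differences reported above can be approximated from the two pooled risk ratios and their confidence intervals. The Python sketch below recovers each standard error on the log scale from the 95% CI, computes the between‐subgroup heterogeneity statistic for two subgroups, and converts it to a P value and an I² value. The function names are hypothetical. Because the inputs are the rounded values printed in the text, the output (a statistic near 2.6 and a P value near 0.11) only approximates the reported Chi² = 2.49 and I² = 59.9%.

```python
import math


def log_effect_and_variance(rr: float, ci_low: float, ci_high: float, z: float = 1.96):
    """Return log(RR) and its variance, with the SE recovered from a 95% CI."""
    standard_error = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return math.log(rr), standard_error ** 2


def two_subgroup_difference_test(subgroup_a, subgroup_b):
    """Inverse-variance test for a difference between two subgroup estimates.

    Each subgroup is a tuple (rr, ci_low, ci_high). Returns (Q, p, I2 in %).
    """
    theta_a, var_a = log_effect_and_variance(*subgroup_a)
    theta_b, var_b = log_effect_and_variance(*subgroup_b)
    q = (theta_a - theta_b) ** 2 / (var_a + var_b)
    # For 1 degree of freedom, the chi-squared upper tail is erfc(sqrt(Q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    i_squared = max(0.0, (q - 1.0) / q) * 100.0 if q > 0 else 0.0
    return q, p_value, i_squared


if __name__ == "__main__":
    women = (0.71, 0.59, 0.86)  # RR and 95% CI from Analysis 1.8
    men = (0.85, 0.76, 0.95)
    q, p_value, i_squared = two_subgroup_difference_test(women, men)
    print(f"Q = {q:.2f}, P = {p_value:.2f}, I2 = {i_squared:.1f}%")
```

The closed form for the P value holds only for two subgroups (one degree of freedom); comparisons across more subgroups would need a general chi‐squared distribution function.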

  • By age

One trial (Aberle 2011) presented mortality data by age group for the latest planned time point (Analysis 1.9).

  1. For those < 65 years old: the evidence favoured LDCT screening to reduce lung cancer‐related mortality by 18% (RR 0.82, 95% CI 0.70 to 0.97; 1 trial, 39,234 participants).

  2. For those ≥ 65 years: the evidence favoured LDCT screening to reduce lung cancer‐related mortality by 38% (RR 0.62, 95% CI 0.52 to 0.74; 1 trial, 17,218 participants).

  • By smoking status

Only one trial (Aberle 2011) presented lung cancer‐related mortality data by smoking status (former or current). Data from both 6.5 years and 12.3 years post‐randomisation are provided in Analysis 1.10. At both time points, the evidence showed a benefit in LDCT screening for lung cancer‐related mortality in current smokers (6.5 years: RR 0.82, 95% CI 0.70 to 0.95; 1 trial, 25,760 participants; I2 = 0%) and (12.3 years: RR 0.89, 95% CI 0.81 to 0.98; 1 trial, 25,760 participants). However, evidence suggested there was probably no difference in former smokers (6.5 years: RR 0.91, 95% CI 0.74 to 1.11; 1 trial, 27,692 participants) and (12.3 years: RR 1.01, 95% CI 0.88 to 1.15; 1 trial, 27,692 participants).

The DLCST (Wille 2016) presented lung cancer‐related mortality by number of pack years smoked (< 35 or ≥ 35). There was probably no difference between the groups, either for < 35 pack years (RR 1.26, 95% CI 0.55 to 2.90; 1 trial, 2148 participants) or for ≥ 35 pack years (RR 0.92, 95% CI 0.54 to 1.54; 1 trial, 1955 participants) in this trial (Analysis 1.10).

  • By geographical regions

Planned time points: lung cancer‐related mortality by geographical region using the latest planned time point is presented in Analysis 1.11.

  1. Europe: we included seven trials (Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). The evidence demonstrated a benefit in lung cancer‐related mortality with LDCT screening (RR 0.78, 95% CI 0.69 to 0.88; 7 trials, 37,668 participants; I2 = 0%).

  2. USA: we included one trial (Aberle 2011). The evidence demonstrated a benefit in lung cancer‐related mortality with LDCT screening (RR 0.80, 95% CI 0.70 to 0.92; 1 trial, 53,454 participants).

This analysis (Analysis 1.11) is identical to Analysis 1.4, as the USA trial was the only one to use CXR as a comparison. Overall, the evidence suggested a lung cancer‐related mortality benefit with LDCT screening.

There was no difference between the groups. Test for subgroup differences: Chi² = 0.11, df = 1 (P = 0.74), I² = 0%.

  • By algorithms for nodule management

We also grouped trials by use of trial‐wide algorithms for nodule management (yes or no) using the latest planned time points in Analysis 1.12.

  1. Yes: we included six RCTs (Becker 2020; De Koning 2020; Field 2021; Paci 2017; Pastorino 2012; Wille 2016). The evidence suggested a difference in lung cancer‐related mortality favouring screening in this group (RR 0.75, 95% CI 0.66 to 0.86; 6 trials, 35,218 participants; I2 = 0%). We also applied a fixed‐effect model for analysis and the conclusion was the same (RR 0.75, 95% CI 0.66 to 0.86; 6 trials, 35,218 participants; I2 = 0%).

  2. No: we included two trials (Aberle 2011; Infante 2015). There was probably no difference using a random‐effects model for analysis (RR 0.84, 95% CI 0.70 to 1.01; 2 trials, 55,904 participants; I2 = 24%). However, when we applied a fixed‐effect model, the evidence showed a difference in lung cancer‐related mortality favouring screening (RR 0.83, 95% CI 0.73 to 0.94; 2 trials, 55,904 participants; I2 = 24%).

There was no difference between the groups. Test for subgroup differences: Chi² = 1.02, df = 1 (P = 0.31), I² = 2.2%.
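The contrast between fixed‐effect and random‐effects results described above can be illustrated with a small inverse‐variance pooling sketch in Python. It assumes the DerSimonian‐Laird estimator for the between‐trial variance, which is a common choice but is not named in this section, and the function names are hypothetical. The example pools the two single‐trial risk ratios quoted earlier for NLST and NELSON purely as a demonstration; this pairing is not an analysis from the review. When the heterogeneity statistic does not exceed its degrees of freedom, the estimated between‐trial variance is zero and the two models coincide, which is the pattern seen where I2 = 0%.

```python
import math


def log_effect_and_variance(rr: float, ci_low: float, ci_high: float, z: float = 1.96):
    """Return log(RR) and its variance, with the SE recovered from a 95% CI."""
    standard_error = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return math.log(rr), standard_error ** 2


def pool_risk_ratios(trials):
    """Pool (rr, ci_low, ci_high) tuples with fixed- and random-effects models."""
    effects = [log_effect_and_variance(*trial) for trial in trials]
    thetas = [theta for theta, _ in effects]
    variances = [variance for _, variance in effects]

    # Fixed-effect model: weights are the inverse of each within-trial variance.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * t for w, t in zip(weights, thetas)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-trial variance (tau^2).
    q = sum(w * (t - fixed) ** 2 for w, t in zip(weights, thetas))
    df = len(trials) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects model: tau^2 is added to every within-trial variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    random = sum(w * t for w, t in zip(re_weights, thetas)) / sum(re_weights)

    return {
        "fixed_rr": math.exp(fixed),
        "random_rr": math.exp(random),
        "q": q,
        "tau2": tau2,
    }


if __name__ == "__main__":
    nlst = (0.80, 0.70, 0.92)    # single-trial RR quoted for NLST
    nelson = (0.75, 0.62, 0.90)  # single-trial RR quoted for NELSON
    print(pool_risk_ratios([nlst, nelson]))
```

When tau² is greater than zero, the random‐effects weights become more equal across trials and the confidence interval widens, which is how a result can cross 1 under one model but not the other.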

  • By nodule analysis method

We grouped trials by method of nodule analysis (diameter criteria and/or volumetric criteria) using the latest planned time points in Analysis 1.13.

  1. Diameter criteria: we included three RCTs (Aberle 2011; Infante 2015; Paci 2017). The evidence showed a difference in lung cancer‐related mortality favouring screening (RR 0.81, 95% CI 0.72 to 0.92; 3 trials, 59,110 participants; I2 = 0%).

  2. Volume criteria: we included two RCTs (De Koning 2020; Pastorino 2012). The evidence showed a difference in lung cancer‐related mortality favouring screening (RR 0.74, 95% CI 0.62 to 0.88; 2 trials, 19,888 participants; I2 = 0%).

  3. Diameter and volume criteria: we included three trials (Becker 2020; Field 2021; Wille 2016). These trials demonstrated there was probably no difference between the groups (RR 0.79, 95% CI 0.60 to 1.04; 3 trials, 12,124 participants; I2 = 8%). It should be noted that all included trials had low participant numbers.

There was no difference between the groups. Test for subgroup differences: Chi² = 0.71, df = 2 (P = 0.70), I² = 0%. Nodule management pathways are detailed in Table 1.

2) Harms of screening
Number of all invasive tests performed

We grouped trial results based on time point (following baseline screening scan or at follow‐up) (Analysis 2.1).

  • At baseline: we included three RCTs (Aberle 2011; Gohagan 2005; Infante 2015). We combined invasive procedures and surgery numbers provided in Infante 2015 for this analysis. The evidence showed that more invasive tests were performed in the LDCT screening group (RR 2.90, 95% CI 2.25 to 3.75; 3 trials, 59,110 participants; I2 = 43%); 363 invasive tests were performed for every 10,000 participants screened with LDCT, with a number needed to harm (NNH) of 44. Heterogeneity was moderate, and when we removed Aberle 2011 from the analysis, there was no heterogeneity and no change to the conclusion (RR 3.56, 95% CI 2.53 to 5.01; 2 trials, 5768 participants; I2 = 0%) (analysis not shown). Both NLST (Aberle 2011) and LSS (Gohagan 2005) had CXR screening as a comparison, whilst the DANTE (Infante 2015) trial performed a CXR and sputum cytology in both groups prior to screening.

  • At follow‐up: we included three RCTs (Aberle 2011; Infante 2015; Pastorino 2012). The evidence showed that more invasive tests were performed in the LDCT screening group (RR 2.60, 95% CI 2.41 to 2.80; 3 trials, 60,003 participants; I2 = 0%; moderate‐certainty evidence). The data we used in the analysis for the MILD trial (Pastorino 2012) included surgery cases only; 788 invasive tests occurred for every 10,000 participants screened with LDCT (NNH = 21). The MILD trial (Pastorino 2012) was the only trial that had no CXRs performed in the control group.

There was no difference between the subgroups. Test for subgroup differences: Chi² = 0.67, df = 1 (P = 0.41), I² = 0%.

Whilst DEPISCAN (Blanchon 2007) included adverse events during diagnostic procedures and number of thoracotomies for benign disease, it did not specify the participant groups when presenting results and we therefore did not include it in our analysis of harms of screening.

Number of all non‐invasive tests performed 

We grouped trial results based on time point (following baseline screening scan or at follow‐up) (Analysis 2.2).

  • At baseline: we included three RCTs (Aberle 2011; Gohagan 2005; Infante 2015). The evidence showed that more non‐invasive tests were performed in the LDCT screening group (RR 3.28, 95% CI 2.40 to 4.48; 3 trials, 59,222 participants; I2 = 90%) (analysis not shown). Heterogeneity was high, with a slight reduction to 70% when we removed Infante 2015 (RR 2.68, 95% CI 2.30 to 3.12; 2 trials, 56,772 participants; I2 = 70%); 2154 non‐invasive tests would be performed for every 10,000 people screened with LDCT (NNH = 7). Of note, Infante 2015 was the only included trial that did not have CXR screening in the control arm, although participants did receive one at baseline. The DANTE (Infante 2015) trial included additional CT and PET scans, whilst Gohagan 2005 included pulmonary function tests, CT and CXR. The NLST (Aberle 2011) combined all additional imaging numbers, and hence heterogeneity was clinical.

  • At follow‐up: we included two RCTs which reported additional PET scans (Aberle 2011; Infante 2015). The evidence also showed that more non‐invasive tests were performed in the LDCT screening group (RR 3.56, 95% CI 1.81 to 7.01; 2 trials, 55,905 participants; I2 = 86%) (analysis not shown).

There was no difference between the subgroups. Test for subgroup differences: Chi² = 0.05, df = 1 (P = 0.83), I² = 0%.

Number of invasive tests performed in those with a false‐positive diagnosis (positive test in the absence of lung cancer) 

We grouped trial results based on time point (following baseline screening scan or at follow‐up) (Analysis 2.3).

  • At baseline: we only included one RCT (Gohagan 2005). The invasive interventions included bronchoscopy and biopsies, with a higher rate of intervention in the screening group (RR 3.09, 95% CI 1.57 to 6.07; 3318 participants; 1 trial; I2 = 0%); 205 invasive tests would be performed for false‐positive results for every 10,000 participants screened at baseline (NNH = 72).

  • At follow‐up: we included three RCTs (Aberle 2011; Infante 2015; Pastorino 2012). The NLST (Aberle 2011) included thoracotomy, bronchoscopy, and needle biopsy, whereas Infante 2015 and Pastorino 2012 included only surgery numbers for invasive procedures. The MILD (Pastorino 2012) intervention arm was the combined total of the biennial and annual screening groups. The evidence showed that more invasive tests were performed in the LDCT screening group (RR 3.91, 95% CI 3.21 to 4.76; 3 trials, 60,005 participants; I2 = 0%); 159 invasive tests would be performed in false‐positive results for every 10,000 participants screened (NNH = 85).

There was no difference between the subgroups. Test for subgroup differences: Chi² = 0.43, df = 1 (P = 0.51), I² = 0%. Invasive tests performed in non‐lung cancer‐related disease are summarised in Table 2.

Table 2. Invasive tests in non‐lung cancer‐related disease

 

Invasive tests in non‐lung cancer‐related disease

Aberle 2011

At 6.5 year follow‐up

LDCT group: 457 procedures for non‐lung cancer‐related disease (164 thoracotomies/thoracoscopies/mediastinoscopies; 227 bronchoscopies; 66 needle biopsies) out of 17,053 positive screening results over 3 rounds with complete diagnostic information

CXR group: 115 procedures for benign disease (45 thoracotomies/thoracoscopies/mediastinoscopies; 46 bronchoscopies; 24 needle biopsies) out of 4674 positive screening results over 3 rounds with complete information

Becker 2020

Baseline LDCT: 30 biopsies performed in benign disease (at least 5 thoracotomies, 2 VATS thoracoscopies, and 1 bronchoscopy)

Year 1 LDCT: 19 biopsies performed in benign disease

Year 2 LDCT: 12 biopsies performed in benign disease

Year 3 LDCT: 16 biopsies performed in benign disease

Year 4 LDCT: 13 biopsies performed in benign disease

Blanchon 2007

Trial arm not specified. Baseline: 3 thoracostomies performed for benign disease

De Koning 2020

Baseline LDCT: 27.2% of invasive procedures performed in benign disease

Between 2004 and 2008: 215 participants had surgery. 2/17 mediastinoscopies were in benign disease; 47/198 lung surgeries (thoracotomies +/‐ VATS) were in benign disease.

Field 2021

Baseline LDCT: 7 participants had needle biopsies, 1 EBUS bronchoscopy, 4 referrals for surgery completed for benign disease

Gohagan 2005

Baseline LDCT: 16 bronchoscopies, 19 lung biopsies or resection, and 23 any invasive procedures (including biopsy/resection, bronchoscopy, thoracotomy, thoracoscopy, mediastinotomy, mediastinoscopy) performed for benign disease

Baseline CXR: 5 bronchoscopies, 6 lung biopsies or resections, and 8 procedures (including biopsy/resection, bronchoscopy, thoracotomy, thoracoscopy, mediastinotomy, mediastinoscopy) performed for benign disease

Infante 2015

At 8.35 years median follow‐up

LDCT group: 17 surgeries for benign disease (3 mediastinoscopies, 7 VATS wedge resections, 6 open wedge resections, 1 open segmentectomy). 7 surgeries for other conditions (reported as 1 open biopsy, 1 extrapleural pneumonectomy for mesothelioma, 2 oesophagectomies for cancer, 1 oesophageal leiomyoma VATS resection, 2 VATS thymectomies, 1 lobectomy for aspergilloma)

Control arm: 5 surgeries for benign disease (2 VATS biopsies, 2 VATS wedge resection, 1 open wedge resection); 2 surgeries for other conditions (1 open lung biopsy for hilar lymphoma, 1 VATS thymectomy)

LaRocca 2002

Not available

Paci 2017

Baseline LDCT: 1 FNA biopsy and 1 (5.5% of all surgical resections) surgical resection for benign disease reported

Pastorino 2012

Median 6 annual LDCTs: 1 invasive diagnostic procedure (transthoracic needle aspiration, fibro bronchoscopy, transbronchial needle aspiration), 0 anatomical (lobectomy or segmentectomy) resections, 0 non‐anatomical resections (wedge resection) performed for benign disease

Median 3 biennial LDCTs: 3 invasive diagnostic procedures (transthoracic needle aspiration, fibro bronchoscopy, transbronchial needle aspiration), 0 anatomical (lobectomy or segmentectomy) resections, 1 non‐anatomical resection (wedge resection) performed for benign disease

Wille 2016

Baseline LDCT: 1 mediastinoscopy, 3 bronchoscopy with biopsy, 1 EUS, 2 EBUS, 2 VATS, 1 percutaneous biopsy performed for benign disease

CXR: chest x‐ray; EBUS: endobronchial ultrasound; EUS: endoscopic ultrasound;  FNA: fine needle aspirate; LDCT: low‐dose computed tomography; VATS: video‐assisted thoracoscopic surgery

There were no common time points for non‐invasive tests performed in participants without lung cancer; however, we compared total numbers in both groups in two RCTs (Aberle 2011; Gohagan 2005).

Any complications arising from tests including death

Two RCTs reported mortality rates within 60 days of surgery (Aberle 2011; Infante 2015). The NLST (Aberle 2011) had a CXR screening comparison arm, whereas the DANTE trial (Infante 2015) had a baseline CXR and sputum cytology for all participants followed by annual clinical examinations. There was probably no difference in mortality following surgery between the groups (RR 0.68, 95% CI 0.24 to 1.94; 2 trials, 409 participants; I2 = 0%; moderate‐certainty evidence; Analysis 2.4). Another RCT also reported postsurgery mortality rates (Paci 2017); however, we were unable to locate the total number of surgeries in each group and consequently did not include it in the analysis.

We reported a comparison of complications arising from tests for non‐cancer‐related disease in two RCTs at different time points (Aberle 2011; Gohagan 2005).

Secondary outcomes

3) All‐cause mortality

We combined trial data from all eight available trials using the latest planned follow‐up time point for each trial (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016; see Analysis 3.1). We excluded the LSS (Gohagan 2005) from this analysis as it did not have planned follow‐up at ≥ 5 years. The evidence showed a 5% risk reduction in all‐cause mortality with LDCT screening (RR 0.95, 95% CI 0.91 to 0.99; 8 trials, 91,107 participants; I2 = 0%; moderate‐certainty evidence); 210 people would need to be screened to prevent one death from any cause. When we performed a sensitivity analysis using trials with low risk of bias (Aberle 2011; Paci 2017; Wille 2016), there was probably a difference between the groups for all‐cause mortality favouring LDCT (RR 0.94, 95% CI 0.89 to 0.99; 3 trials, 60,764 participants; I2 = 0%); 204 people would need to be screened to prevent one death from any cause (Figure 5). When we analysed HRs from Becker 2020, Infante 2015 and Wille 2016 at the latest planned time points post‐randomisation, there was probably no difference in all‐cause mortality with LDCT screening (HR 0.98, 95% CI 0.87 to 1.12; 3 trials, 10,606 participants; I2 = 0%; Analysis 3.3).


All‐cause mortality ‐ Planned time points ‐ Sensitivity analysis


We also grouped trial results by time points (planned and unplanned) (Analysis 3.2).

  • 5 to 6 years post‐randomisation: we included three RCTs (Becker 2020; Gohagan 2005; Wille 2016). There was probably no difference between LDCT and control groups in all‐cause mortality (RR 1.14, 95% CI 0.88 to 1.47; 3 trials, 11,474 participants; I2 = 52%). There was moderate heterogeneity, which disappeared when we excluded the results from Becker 2020 (RR 1.26, 95% CI 1.03 to 1.54; 2 trials, 7422 participants; I2 = 0%).

  • More than 6 to 8 years post‐randomisation: we included two RCTs (Aberle 2011; Field 2021). The evidence showed there was a difference in all‐cause mortality favouring LDCT screening (RR 0.94, 95% CI 0.89 to 0.99; 2 trials, 57,422 participants; I2 = 0%).

  • More than 8 to 10 years post‐randomisation: we included six RCTs (Becker 2020; De Koning 2020; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). There was probably no difference between LDCT and control groups in all‐cause mortality (RR 0.97, 95% CI 0.91 to 1.03; 6 trials, 33,685 participants; I2 = 0%).

  • More than 10 years post‐randomisation: we included two RCTs (Aberle 2011; Paci 2017). There was probably no difference between LDCT and control groups in all‐cause mortality (RR 0.91, 95% CI 0.76 to 1.09; 2 trials, 56,658 participants; I2 = 76%). Heterogeneity between the two trials was high, with ITALUNG (Paci 2017) favouring screening. It should be noted that the NLST (Aberle 2011) had CXR as a comparison.

All‐cause mortality by different subgroups (planned time points)

  • By sex

Three trials reported mortality by sex (Aberle 2011; De Koning 2020; Infante 2015); see Analysis 3.4.

For women: we included two RCTs (Aberle 2011; De Koning 2020). The NELSON trial (De Koning 2020) results for women were not provided and so we calculated these from available data. There was probably no difference between LDCT and control groups in all‐cause mortality and heterogeneity was moderate (RR 0.89, 95% CI 0.76 to 1.03; 2 trials, 24,514 participants; I2 = 31%).

For men: we included three RCTs (Aberle 2011; De Koning 2020; Infante 2015). The evidence showed there was no difference in all‐cause mortality (RR 0.93, 95% CI 0.80 to 1.07; 3 trials, 49,162 participants; I2 = 82%) (analysis not shown). Heterogeneity was high, with Aberle 2011 having significant weight in the analysis. When we removed Aberle 2011, there was no heterogeneity; however, the data still suggested there was probably no difference between LDCT and control groups in all‐cause mortality (RR 1.00, 95% CI 0.93 to 1.09; 2 trials, 5632 participants; I2 = 0%).

There was no difference between the two groups. Test for subgroup differences: Chi² = 1.96, df = 1 (P = 0.16), I² = 48.9%.

At 8 to 10 years planned time points: we included two RCTs (Becker 2020; Paci 2017) and the data were collected as part of their planned analysis. There was probably no difference between LDCT and control groups in cardiovascular‐related mortality and heterogeneity was high (RR 0.76, 95% CI 0.37 to 1.56; 2 trials, 7258 participants; I2 = 78%) (analysis not shown).

At more than 10 years using unplanned time points, we included only one RCT (Paci 2017). The evidence showed there was a difference in cardiovascular mortality favouring LDCT screening (RR 0.53, 95% CI 0.34 to 0.81; 1 trial, 3206 participants). 

4) Lung cancer incidence and overdiagnosis 
Lung cancer incidence

We grouped trial results by time points (planned and unplanned) (Analysis 4.1).

  • At baseline: we included six trials (Aberle 2011; Blanchon 2007; De Koning 2020; Gohagan 2005; Infante 2015; Wille 2016). The evidence showed a higher incidence of lung cancer in the LDCT screening group (RR 4.98, 95% CI 2.01 to 12.35; 6 trials, 79,900 participants; I2 = 87%), with high heterogeneity. Removing Aberle 2011 reduced the heterogeneity to a moderate level (RR 6.45, 95% CI 3.21 to 12.98; 5 trials, 26,448 participants; I2 = 49%). Both the NLST (Aberle 2011) and LSS (Gohagan 2005) had CXR screening in the control arm.

  • At 1 year post‐randomisation: we included three RCTs (Aberle 2011; De Koning 2020; Wille 2016). The evidence showed a higher incidence of lung cancer in the LDCT screening group, with high heterogeneity (RR 2.12, 95% CI 1.35 to 3.31; 3 trials, 73,345 participants; I2 = 54%). After removing Aberle 2011, there was no heterogeneity (RR 2.87, 95% CI 1.78 to 4.60; 2 trials, 19,893 participants; I2 = 0%).

  • At 2 years post‐randomisation: we included two RCTs (Aberle 2011; Wille 2016). This analysis demonstrated a higher incidence of lung cancer in the LDCT screening group (RR 1.88, 95% CI 1.51 to 2.32; 2 trials, 57,556 participants; I2 = 0%).

  • At 3 years post‐randomisation: we included one RCT (Wille 2016). This trial suggested there was possibly no difference in the incidence of lung cancer between the groups (RR 1.71, 95% CI 0.68 to 4.35; 1 trial, 4104 participants; I2 = 0%).

  • At 4 years post‐randomisation: we included one RCT (Wille 2016). This trial demonstrated a higher incidence of lung cancer in the LDCT screening group (RR 2.67, 95% CI 1.05 to 6.80; 1 trial, 4104 participants; I2 = 0%).

  • At 5 to 7 years post‐randomisation: we included two RCTs (Aberle 2011; Becker 2020). The evidence showed a higher incidence of lung cancer in the LDCT screening group (RR 1.13, 95% CI 1.04 to 1.23; 2 trials, 57,506 participants; I2 = 0%).

  • At more than 7 years post‐randomisation: we included eight RCTs (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). Of note, only male participant data were available and included in this analysis for the NELSON trial (De Koning 2020). The MILD (Pastorino 2012) trial had both annual and biennial groups combined into the intervention arm. The evidence showed a higher incidence of lung cancer in the LDCT screening group (RR 1.17, 95% CI 1.02 to 1.33; 8 trials, 88,528 participants; I2 = 65%). Heterogeneity was high, and became low when we removed Wille 2016 from the analysis (RR 1.08, 95% CI 0.99 to 1.18; 7 trials, 84,424 participants; I2 = 27%) (analysis not shown).

We grouped trials with ≥ 10 years follow‐up post‐randomisation based on control arm (Analysis 4.2).

  • No screening comparison: we included five RCTs (Becker 2020; De Koning 2020; Paci 2017; Pastorino 2012; Wille 2016). There was possibly no difference in lung cancer incidence between the groups (RR 1.21, 95% CI 0.99 to 1.48; 5 trials, 28,656 participants; I2 = 66%). Heterogeneity was high, probably because of the DLCST trial (Wille 2016), which reported less lung cancer in the control group; heterogeneity disappeared when we removed it (RR 1.11, 95% CI 0.99 to 1.24; 4 trials, 24,552 participants; I2 = 0%) (analysis not shown).

  • CXR comparison: we included one RCT (Aberle 2011). There was probably no difference in lung cancer incidence between the groups (RR 1.01, 95% CI 0.95 to 1.08; 1 trial, 53,454 participants).

There was no difference between the subgroups. Test for subgroup differences: Chi² = 2.62, df = 1 (P = 0.11), I² = 61.8%.
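The subgroup statistics quoted in this section can be checked with a little arithmetic. The sketch below assumes the usual conventions, which this text does not spell out: the P value is the upper tail of a chi‐squared distribution, and I² = (Chi² − df)/Chi², floored at zero. The function names are illustrative only, and the closed form used for the P value holds only for df = 1, which is the case for every subgroup test reported here.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    # With df = 1 the statistic is the square of a standard normal deviate,
    # so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def i_squared(chi2: float, df: int) -> float:
    """I-squared as a fraction: variation beyond chance, floored at zero."""
    if chi2 <= 0:
        return 0.0
    return max(0.0, (chi2 - df) / chi2)


# Values reported for the control-arm subgroups: Chi2 = 2.62, df = 1
p_value = chi2_p_value_df1(2.62)
i2 = i_squared(2.62, 1)
print(f"P = {p_value:.2f}, I2 = {100 * i2:.1f}%")
```

With the reported Chi² of 2.62 this gives a P value of about 0.11 and an I² of about 61.8%, in line with the figures above. Small discrepancies for other subgroup tests are expected because the published Chi² values are rounded.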

Overdiagnosis 

Estimates were described by five trials (Aberle 2011; Becker 2020; Field 2021; Paci 2017; Wille 2016), and ranged from ‐4% to 67% for all lung cancers (Table 3). This was in part due to some adenocarcinoma subtypes. We divided overdiagnosis using definitions from the NLST (Aberle 2011), where overdiagnosis from a public health perspective used the cumulative lung cancer incidence rate as the denominator, whereas the clinical perspective used screen‐detected lung cancer incidence as the denominator.

Table 3. Recall rates, false positives and overdiagnosis 

 

Overall

  • Recall rates: overall baseline LDCT recall rate = 18% (8078/44,920)

  • False positives: overall false‐positive rate from baseline LDCT = 21% (8874/41,857)

Aberle 2011

  • Recall rates: all chest CTs performed postbaseline LDCT: 20% (5153/26,309); all chest CTs performed postbaseline CXR in control group: 6%

  • False positives: baseline LDCT: 6911/26,309 (26%); baseline CXR: 2243/26,035 (7%); 1‐year LDCT: 6728/24,715 (27%); 1‐year CXR: 1416/24,089 (6%); 2‐year LDCT: 3838/24,102 (16%); 2‐year CXR: 1094/23,346 (5%)

  • Overdiagnosis at 6.5 years post‐randomisation: 11% (95% CI 3.2 to 18.2) from a public health perspective and 18.5% (95% CI 5.4 to 30.6) from a clinical perspective of all lung cancers; 67.6% (95% CI 53.5 to 78.5) from a public health perspective and 78.9% (95% CI 62.2 to 93.5) from a clinical perspective of all BAC

  • Overdiagnosis at 11.3 years post‐randomisation: 3.1% of all lung cancers and 79% of BAC from a public health perspective

Becker 2020

  • Recall rates: baseline LDCT: 22% (451/2028); 1‐year LDCT: 5%; 2‐year LDCT: 4%; 3‐year LDCT: 6%; 4‐year LDCT: 5%

  • False positives: baseline LDCT: 426/2028 (21%); 1‐year LDCT: 77/1892 (4%); 2‐year LDCT: 62/1849 (3%); 3‐year LDCT: 94/1826 (5%); 4‐year LDCT: 88/1810 (5%)

  • Overdiagnosis at 9.7 years post‐randomisation: 17.8% (95% CI ‐7.4 to 44.7) from a public health perspective and 25.4% (95% CI ‐11.3 to 64.3) from a clinical perspective of all lung cancers; 90% (95% CI 54.3 to 164.4) from a public health perspective and 112.5% (95% CI 68.2 to 113.1) from a clinical perspective of all BAC

Blanchon 2007

  • Recall rates: not available

  • False positives: baseline LDCT: 73/336 (22%); baseline CXR: 14/285 (5%)

  • Overdiagnosis: not available

De Koning 2020

  • Recall rates: baseline LDCT: 19% (1438/7557); 1‐year LDCT: 19%

  • False positives: baseline LDCT: 107/7557 (1%); 1‐year LDCT: 64/7295 (1%); 3‐year LDCT: 276/6922 (4%); 5.5‐year LDCT: 62/5279 (1%)

  • Overdiagnosis: not available

Field 2021

  • Recall rates: baseline LDCT: 5% (103/1994)

  • False positives: baseline LDCT: 909/1994 (46%) when defined as needing any work‐up; 72/1994 (4%) when defined as referred to MDT

  • Overdiagnosis: estimated 15% of all lung cancers

Gohagan 2005

  • Recall rates: baseline LDCT: 15% (232/1586); baseline CXR: 5% (76/1550); overall post‐LDCT: 8%; overall post‐CXR in control group: 3%

  • False positives: baseline LDCT: 286/1586 (18%); baseline CXR: 139/1550 (9%)

  • Overdiagnosis: not available

Infante 2015

  • Recall rates: baseline LDCT: 10% (128/1276)

  • False positives: not available

  • Overdiagnosis: not available

LaRocca 2002

  • Recall rates: not available

  • False positives: not available

  • Overdiagnosis: not available

Paci 2017

  • Recall rates: baseline LDCT: 23% (366/1406); 1‐year LDCT: 14%; 2‐year LDCT: 13%; 3‐year LDCT: 11%

  • False positives: not available

  • Overdiagnosis: at 11.3 years post‐randomisation, estimated overdiagnosis rates reported as ‐4% using public health perspective and ‐10% from a clinical perspective

Pastorino 2012

  • Recall rates: baseline LDCT: 15% in annual group, 14% in (284/2303) biennial group; 1‐year LDCT: 3% in annual group, 3% in biennial group; 2‐year LDCT: 5% in annual group, 5% in biennial group; 3‐year LDCT: 3% in annual group, 7% in biennial group; 4‐year LDCT: 2% in annual group, 3% in biennial group; 5‐year LDCT: 1% in annual group, 7% in biennial group; 6‐year LDCT: 4% in annual group, 5% in biennial group

  • False positives: median 6 annual LDCT: 54/1152 (5%); median 3 biennial LDCT: 34/1151 (3%)

  • Overdiagnosis: not available

Wille 2016

  • Recall rates: baseline LDCT: 8% (155/2047); 1‐year LDCT: 1%; 2‐year LDCT: 1%; 3‐year LDCT: 1%; 4‐year LDCT: 1%

  • False positives: baseline LDCT: 162/2047 (8%); 1‐year LDCT: 34/1976 (2%); 2‐year LDCT: 39/1944 (2%); 3‐year LDCT: 32/1982 (2%); 4‐year LDCT: 35/1851 (2%)

  • Overdiagnosis: estimated 67.2% of lung cancers (95% CI 37.1 to 95.4) from unplanned posthoc analysis

BAC: bronchioalveolar carcinoma; CI: confidence interval; CT: computed tomography; CXR: chest x‐ray; LDCT: low‐dose computed tomography; MDT: multidisciplinary team. 

We estimated overdiagnosis based on the control arm (Analysis 4.3). We calculated estimates from the total incidence in each arm and these were not limited to screen‐detected cancers only. Based on extended follow‐up (≥ 10 years), the calculated rate of overdiagnosis for the NELSON trial (De Koning 2020) was 12%, with a wide CI (95% CI ‐1% to 25%). It should be noted that extended incidence, and hence overdiagnosis, was only relevant to male participants in this trial. The estimated overdiagnosis rate of ITALUNG (Paci 2017) was ‐11%, with a 95% CI of ‐42% to 20%. The estimated overdiagnosis rate of MILD (Pastorino 2012) was 16%, again with a wide CI (95% CI ‐10% to 41%). The DLCST (Wille 2016) had an estimated overdiagnosis rate of 47% and had the only CI that did not cross 0 (95% CI 30% to 64%). The NLST (Aberle 2011) compared LDCT to CXR and so we excluded it from the meta‐analysis; however, it had an estimated overdiagnosis rate of 1% (95% CI ‐5% to 7%). The DANTE (Infante 2015) trial included CXR and sputum cytology for both trial groups pre‐screening, and had a median follow‐up of < 10 years, and so we did not include it in the meta‐analysis of overdiagnosis. The calculated overdiagnosis rate for the DANTE (Infante 2015) trial was 26% (95% CI 5% to 48%).

Five trials compared LDCT screening with usual care after ≥ 10 years from randomisation (Becker 2020; De Koning 2020; Paci 2017; Pastorino 2012; Wille 2016). The estimated overdiagnosis rate was 18% (RD 0.18, 95% CI ‐0.00 to 0.36; 5 trials, 28,656 participants; I2 = 73%; low‐certainty evidence), with a wide CI that did not reach statistical significance (Analysis 4.3). It is estimated that 7 cases of lung cancer overdiagnosis would occur for every 1000 people screened (95% CI of 2 to 84 cases of overdiagnosis). Heterogeneity was also high, and reduced when we removed Wille 2016.

5) False‐positive, negative and recall rates

False‐positive rates were provided for eight RCTs (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Pastorino 2012; Wille 2016) and are detailed in Table 3. When we combined all available baseline LDCT results from seven trials (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Wille 2016), 21% of trial screens had a false‐positive result, with a range from 1% to 46%. The false‐positive rate of LDCT was lower in trials that used volumetric analysis alone (De Koning 2020; Pastorino 2012), which ranged from 1% to 5%, compared with diameter criteria alone (Aberle 2011; Blanchon 2007; Gohagan 2005), which ranged from 18% to 26%. The three trials that used both diameter and volumetric criteria (Becker 2020; Field 2021; Wille 2016) had false‐positive rates of 21%, 46%, and 8%, respectively. Four per cent of participants had false‐positive results reviewed in a multidisciplinary team meeting in UKLS (Field 2021). False‐positive rates also decreased with subsequent LDCT screens (Aberle 2011; Becker 2020; De Koning 2020; Wille 2016). It should be noted that false positives in the NELSON trial (De Koning 2020) did not include all participants who had an indeterminate scan, only those who had a positive follow‐up scan following an indeterminate result.
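The combined baseline figure can be reproduced from the counts in Table 3. The sketch below assumes that the combined rate is a crude pooled proportion (total false positives divided by total participants screened across the seven trials); this is an assumption about the method, although it does reproduce the 8874/41,857 shown in the table. For Field 2021 the "needing any work‐up" definition is used, and the variable names are illustrative.

```python
# Baseline LDCT false positives as (false positives, participants screened),
# copied from Table 3. Field 2021 uses the "needing any work-up" definition.
baseline_false_positives = {
    "Aberle 2011": (6911, 26309),
    "Becker 2020": (426, 2028),
    "Blanchon 2007": (73, 336),
    "De Koning 2020": (107, 7557),
    "Field 2021": (909, 1994),
    "Gohagan 2005": (286, 1586),
    "Wille 2016": (162, 2047),
}


def percent(events: int, total: int) -> float:
    """Proportion expressed as a percentage."""
    return 100.0 * events / total


# Per-trial rates, then the crude pooled rate (sum of events / sum of totals).
per_trial = {name: percent(fp, n) for name, (fp, n) in baseline_false_positives.items()}
total_fp = sum(fp for fp, _ in baseline_false_positives.values())
total_n = sum(n for _, n in baseline_false_positives.values())
pooled = percent(total_fp, total_n)

print(f"Pooled: {total_fp}/{total_n} = {pooled:.0f}%")
print(f"Range: {min(per_trial.values()):.0f}% to {max(per_trial.values()):.0f}%")
```

A crude pooled proportion gives large trials such as the NLST most of the weight, so the 21% mainly reflects the trials that used diameter criteria; the per‐trial range (1% to 46%) is the better guide to how much the rate depends on the definition of a positive screen.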

When we combined results from all three available trials comparing LDCT and CXR (Aberle 2011; Blanchon 2007; Gohagan 2005) using the latest follow‐up time point for each trial (Analysis 5.1), the evidence showed fewer false positives in the CXR groups, with high heterogeneity between the trials (RR 2.82, 95% CI 1.98 to 4.01; 3 trials, 56,101 participants; I2 = 90%) (analysis not shown). Removing Gohagan 2005 reduced the heterogeneity; however, the conclusion was unchanged (RR 3.31, 95% CI 2.45 to 4.47; 2 trials, 52,965 participants; I2 = 43%).
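To illustrate how a pooled RR, its CI and I2 arise from trial counts, the following sketch pools the baseline false‐positive counts for LDCT and CXR listed in Table 3. It assumes an inverse‐variance random‐effects model with the DerSimonian‐Laird estimate of between‐trial variance; the review specifies a random‐effects model but not the estimator, so the choice of estimator and the function names are assumptions made for illustration.

```python
import math

# (false positives LDCT, screened LDCT, false positives CXR, screened CXR),
# baseline counts copied from Table 3.
trials = {
    "Aberle 2011": (6911, 26309, 2243, 26035),
    "Blanchon 2007": (73, 336, 14, 285),
    "Gohagan 2005": (286, 1586, 139, 1550),
}


def log_rr_and_variance(a, n1, c, n2):
    """Log risk ratio and its large-sample variance for one trial."""
    log_rr = math.log((a / n1) / (c / n2))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return log_rr, variance


def random_effects_pool(rows):
    """DerSimonian-Laird pooling; returns (RR, lower, upper, I2 as a fraction)."""
    effects = [log_rr_and_variance(*row) for row in rows]
    y = [effect for effect, _ in effects]
    w = [1 / var for _, var in effects]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-trial variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = [1 / (var + tau2) for _, var in effects]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), i2)


rr, lower, upper, i2 = random_effects_pool(trials.values())
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, I2 = {100 * i2:.0f}%")
```

Under these assumptions the three trials give an RR of about 2.82 (95% CI about 1.98 to 4.01) with I2 of about 90%, and dropping Gohagan 2005 gives an RR of about 3.31 with I2 of about 43%, which agrees with the figures reported above. The large I2 reflects how far the Gohagan 2005 estimate (about 2.0) sits from the other two trials relative to their sampling error.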

Recall rate is the proportion of participants recalled for repeat CT at 3 months and beyond 6 months for follow‐up of a nodule or suspected lung cancer. Only two RCTs (Aberle 2011; Gohagan 2005) had baseline comparison data for recall rate (Analysis 5.3). The data suggested there were probably more recalls in the LDCT screening group compared to CXR groups (RR 5.31, 95% CI 1.73 to 16.34; 2 trials, 55,480 participants; I2 = 99%) (analysis not shown). Heterogeneity between the trials was high. Neither trial had a trial‐wide algorithm; however, both had similar definitions for a positive screen. LSS (Gohagan 2005) was a feasibility study, and there were only two screening rounds, with participants advised to seek medical follow‐up with specialists. Baseline screening recall rates from the trials that provided them (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016) are summarised in Table 3, with an overall recall rate following baseline screen of 18% (range of 5% to 23%). Of note, recall rates were defined differently between trials, with most trials including scans occurring up to 12 months postscreen and the NLST (Aberle 2011) and LSS (Gohagan 2005) data including all CT scans, not specifically recall scans, performed as a result of the baseline screen.

6) Smoking behaviour – cessation, relapse rates, smoking intensity

There were no common time points for assessment of quit rates. Individual trial results are presented in Analysis 6.1. Both the ITALUNG trial (Paci 2017) and the DLCST (Wille 2016) included smoking cessation as part of their programme, although Wille 2016 quantified it as minimal (< 5 minutes spent on smoking cessation per review). The ITALUNG (Paci 2017) trial and UKLS (Field 2021) confirmed smoking status via self‐reporting only, whereas the DLCST (Wille 2016) also confirmed smoking by measuring exhaled CO2 levels.

  • At 2 weeks post‐randomisation, Field 2021 showed a higher quit rate in the LDCT screening group compared with control (RR 2.16, 95% CI 1.47 to 3.18; 1 trial, 1545 participants).

  • At 1 year post‐randomisation, there was probably no difference in quit rates between the groups in Wille 2016 (RR 1.08, 95% CI 0.88 to 1.32; 1 trial, 3124 participants).

  • Within 2 years post‐randomisation, Field 2021 again showed a higher quit rate in the LDCT screening group compared with control (RR 1.51, 95% CI 1.15 to 1.97; 1 trial, 1524 participants).

  • At 4 years post‐randomisation, Paci 2017 demonstrated there was possibly no difference in quit rates between the groups (RR 1.17, 95% CI 0.99 to 1.37; 1 trial, 2447 participants).

Only one trial presented smoking relapse rates in both groups (Wille 2016). There was probably no difference in relapse rates between the groups in this trial (RR 0.95, 95% CI 0.65 to 1.41; 1 trial, 888 participants; Analysis 6.2).

7) Health‐related quality of life (HRQoL)/psychosocial consequences

HRQoL and psychosocial consequences were evaluated in four trials (Aberle 2011; De Koning 2020; Field 2021; Wille 2016). The trials measured different aspects of quality of life. Whilst small transient changes were at times reported, no long‐term adverse consequences on HRQoL were reported. The DLCST (Wille 2016) and UKLS (Field 2021) administered questionnaires to their whole trial cohort. The NLST (Aberle 2011) initially only invited participants of the LDCT screening group from 16 of the 23 American College of Radiology Imaging Network (ACRIN) sites with positive baseline LDCTs; however, they later also invited participants who had significant incidental findings (SIFs) on LDCT. The control group matched LDCT group participants with negative results, with a total of 2812 participants for this outcome. The NELSON (De Koning 2020) trial also only included a portion of their cohort for this analysis, taking a random sample of 733 participants from each trial group (LDCT screening and control).

The following questionnaires were used as assessments in these trials.

  • Lung cancer‐specific questionnaire Consequences of Screening in Lung Cancer (COS‐LC, Brodersen 2010): COS‐LC consists of nine psychosocial scales: four core scales (24 core items) and five lung cancer screening‐specific scales (25 lung cancer screening‐specific items). The four core scales measure anxiety (7 items), behaviour (7 items), dejection (6 items), sleep (4 items) and smoking (2 items). Higher scores indicate more negative psychosocial consequences. 

  • Short form 36‐item questionnaire (SF‐36, Ware 1992) version 2: SF‐36 has a physical component summary (PCS) and a mental component summary (MCS). The score ranges from 0 to 100. Higher scores indicate better HRQoL; lower scores indicate worse quality of life.

  • Short form 12‐item questionnaire (SF‐12, Gandek 1998; Ware 1996): SF‐12 is a short version of the 36‐item questionnaire which also has a PCS and an MCS, with both components having a maximum score of 50 each. As with the SF‐36, higher scores indicate better HRQoL.

  • EuroQol questionnaire (EQ‐5D, Essink‐Bot 1993; Kind 2005): EQ‐5D is a health questionnaire with five dimensions (mobility, self‐care, usual activities, pain/discomfort and anxiety/depression) as well as a visual analogue score (VAS). The VAS ranges from 0 (the worst imaginable health status) to 100 (the best imaginable health status).

  • Spielberger State‐Trait Anxiety Inventory (STAI‐6) (van der Bij 2003): STAI‐6 assesses generic anxiety, with scores ranging from 20 to 80; higher scores indicate more anxiety.

  • Hospital Anxiety and Depression Scale (HADS, Zigmond 1983): HADS consists of separate anxiety and depression components, each with a score ranging from 0 to 21. Scores for each component are considered normal (0‐8), borderline abnormal (9‐11) and abnormal (12‐21).

  • Impact of event scale (IES, Horowitz 1979): IES measures distress caused by traumatic events (in this instance lung cancer) and consists of 15 items with total scores 0 to 75, with subscales for intrusion (0 to 35) and avoidance (0 to 40); higher scores indicated more distress. 

  • Revised 6‐item Cancer Worry Scale (CWS‐R): CWS‐R has scores ranging from 4 to 24, with higher scores indicating more worry. 

Lung cancer‐specific questionnaires

DLCST (Wille 2016) used COS‐LC. Analysis 7.2 illustrates the change over time in mean differences (MDs) across all the core scales between the LDCT and control groups at rounds two and five compared to baseline.

  • Anxiety: at round two there was probably no difference between the groups (MD ‐0.13, 95% CI ‐0.33 to 0.07; 1 trial, 3352 participants), with higher anxiety scores in the control group compared with LDCT groups at round five (MD ‐0.51, 95% CI ‐0.76 to ‐0.26; 1 trial, 3185 participants). 

  • Behaviour: at round two both groups had increased negative response, with probably no difference between the groups (MD ‐0.21, 95% CI ‐0.42 to 0.00; 1 trial, 3337 participants). However, scores decreased in the LDCT group by round five (MD ‐0.60, 95% CI ‐0.88 to ‐0.32; 1 trial, 3180 participants).

  • Dejection: at round two there was probably no difference between the groups, with elevated scores in both groups (MD ‐0.15, 95% CI ‐0.36 to 0.06; 1 trial, 3377 participants). However at round five, the LDCT group had fewer psychological consequences (MD ‐0.58, 95% CI ‐0.82 to ‐0.34; 1 trial, 3195 participants).

  • Negative impact on sleep: at round two there was probably no difference between the groups, with elevated scores in both groups (MD ‐0.14, 95% CI ‐0.32 to 0.04; 1 trial, 3389 participants). However, at round five there was a significant difference favouring LDCT screening (MD ‐0.70, 95% CI ‐0.95 to ‐0.45; 1 trial, 3198 participants).

  • Other domains (self‐blame, focusing on airway symptoms, stigmatisation, introvert and harm of smoking): the lung cancer‐specific scales tended to demonstrate more negative psychosocial consequences in the control group compared with the LDCT screening group from round two to round five.

Anxiety

When we combined standardised mean differences (SMDs) in anxiety scores for the three available trials (De Koning 2020; Field 2021; Wille 2016; Analysis 7.1), the evidence favoured lower anxiety scores in the LDCT screening group (SMD ‐0.43, 95% CI ‐0.59 to ‐0.27; 3 trials, 8153 participants; I2 = 0%; low‐certainty evidence), although scores were not necessarily abnormal in either group.

  • In the NELSON trial (De Koning 2020), there was probably no difference between the groups at baseline (MD ‐0.48, 95% CI ‐1.63 to 0.67; 1 trial, 1288 participants) and at 2 years (MD ‐0.75, 95% CI ‐1.99 to 0.49; 1 trial, 931 participants).

  • In the UKLS trial (Field 2021), HADS anxiety scores were in the normal range in both groups; however, scores were lower in the LDCT group compared with the control group at baseline (MD ‐0.07, 95% CI ‐0.11 to ‐0.03; 1 trial, 4037 participants) and at 10 months to 27 months (MD ‐0.36, 95% CI ‐0.57 to ‐0.15; 1 trial, 4037 participants).

  • We did not include the NLST trial (Aberle 2011) in this analysis as the control group consisted of participants with negative screen results and not participants from the CXR group. Participants were separated based on screening outcome (negative, true positive, false positive, and SIFs) (Analysis 7.5). There was no difference in anxiety levels between the groups for the participants with negative screens at baseline (MD ‐0.26, 95% CI ‐1.79 to 1.27; 1 trial, 1162 participants) and at 6 months (MD ‐0.33, 95% CI ‐1.91 to 1.25; 1 trial, 1019 participants). There was probably no difference between the groups with participants with true positive screens at baseline (MD 1.63, 95% CI ‐6.31 to 9.57; 1 trial, 48 participants) and at 6 months (MD ‐2.69, 95% CI ‐11.69 to 6.31; 1 trial, 42 participants). False‐positive participants also did not demonstrate a difference in anxiety levels at baseline (MD 1.77, 95% CI ‐0.04 to 3.58; 1 trial, 835 participants) or at 6 months (MD 1.31, 95% CI ‐0.61 to 3.23; 1 trial, 703 participants) between the groups.

Depression

The UKLS trial (Field 2021) used the HADS score for depression. These scores were within normal limits for both groups. The LDCT group reported lower scores compared with the control group at baseline (MD ‐0.06, 95% CI ‐0.10 to ‐0.02; 1 trial, 4037 participants) and at 10 months to 27 months (MD ‐0.24, 95% CI ‐0.40 to ‐0.08; 1 trial, 4037 participants).

Stress

The NELSON trial (De Koning 2020) and UKLS (Field 2021) measured stress using different instruments.

  • The NELSON trial (De Koning 2020) used the IES score. There was probably no difference between the groups at baseline (MD 0.03, 95% CI ‐0.88 to 0.94; 1 trial, 1288 participants) and at 2 years (MD ‐0.31, 95% CI ‐1.30 to 0.68; 1 trial, 931 participants). However, they reported that people in the LDCT group who had indeterminate results had elevated distress following the result at 2 months. 

  • The UKLS trial (Field 2021) used the CWS‐R and did not report any difference between groups in cancer distress at 10 months to 27 months. However, they did observe that those participants referred to the lung cancer multidisciplinary meeting reported more lung cancer distress in the short term (approximately 2 weeks after randomisation to the non‐screening arm or receiving results of the baseline LDCT) (mean score 11.88, 95% CI 11.10 to 12.72), although they were more satisfied with their participation in the trial.

Generic HRQoL

The NELSON trial (De Koning 2020) measured HRQoL using the SF‐12 questionnaire and EQ‐5D, whilst NLST (Aberle 2011) used the SF‐36 version.

  • In the NELSON trial (De Koning 2020), there was probably no difference in the SF‐12 scores between the groups for both the PCS and MCS components, both at baseline and 2 years. There was also probably no difference between the two groups with EQ‐5D VAS at baseline and 2 years. 

    • PCS baseline (MD 0.28, 95% CI ‐0.66 to 1.22; 1 trial, 1288 participants) and at 2 years (MD 0.88, 95% CI ‐0.34 to 2.10; 1 trial, 931 participants).

    • MCS baseline (MD ‐0.06, 95% CI ‐1.42 to 1.30; 1 trial, 1288 participants) and at 2 years (MD 0.81, 95% CI ‐0.65 to 2.27; 1 trial, 931 participants).

    • EQ‐5D VAS baseline (MD 0.69, 95% CI ‐0.98 to 2.36; 1 trial, 1288 participants) and at 2 years (MD 2.08, 95% CI 0.18 to 3.98; 1 trial, 1010 participants).

  • In the NLST trial (Aberle 2011), groups were divided again by results of screening (negative, true positive, false positive, and SIF). 

    • Participants with true‐positive screens did have lower PCS and MCS scores (Analysis 7.3; Analysis 7.4), which declined at 6 months; however, scores were probably not significantly different between groups.

      • PCS at baseline (MD ‐1.94, 95% CI ‐7.33 to 3.45; 1 trial, 63 participants) and at 6 months (MD ‐0.20, 95% CI ‐7.32 to 6.92; 1 trial, 42 participants). MCS at baseline (MD ‐1.74, 95% CI ‐6.66 to 3.18; 1 trial, 63 participants) and at 6 months (MD 0.08, 95% CI ‐8.19 to 8.35; 1 trial, 42 participants).

    • There was probably no significant difference between the groups for those with negative and false‐positive screens at baseline and 6 months.

      • Participants with a negative screen: PCS at baseline (MD ‐1.07, 95% CI ‐2.09 to ‐0.05; 1 trial, 1381 participants) and at 6 months (MD ‐0.11, 95% CI ‐1.38 to 1.16; 1 trial, 1019 participants). MCS at baseline (MD ‐0.85, 95% CI ‐1.97 to 0.27; 1 trial, 1381 participants) and at 6 months (MD ‐0.15, 95% CI ‐1.52 to 1.22; 1 trial, 1019 participants). 

      • Participants with false‐positive screens: PCS at baseline (MD ‐0.72, 95% CI ‐1.98 to 0.54; 1 trial, 1024 participants) and at 6 months (MD ‐0.78, 95% CI ‐2.42 to 0.86; 1 trial, 703 participants). MCS at baseline (MD ‐0.19, 95% CI ‐1.43 to 1.05; 1 trial, 1024 participants) and at 6 months (MD ‐1.02, 95% CI ‐2.67 to 0.63; 1 trial, 703 participants).

8) Cancer stage at diagnosis 

We grouped trial results by time points (Analysis 8.1; Analysis 8.2; Analysis 8.3; Analysis 8.4; Analysis 8.5). Where specified in the trials, we separated limited and extensive small cell lung cancer (SCLC) from TNM (tumour, node, metastasis) staging (Goldstraw 2016).

  • At baseline: we included five trials (Aberle 2011; Blanchon 2007; Gohagan 2005; Infante 2015; Wille 2016; Analysis 8.1). The evidence suggested that stage 1 lung cancer was detected more in the LDCT screening group (RR 2.41, 95% CI 1.86 to 3.12; 5 trials, 64,092 participants; I2 = 0%). Analysis showed there was possibly no difference between the groups for stage 2 lung cancer (RR 1.88, 95% CI 0.99 to 3.58; 5 trials, 64,092 participants; I2 = 0%). There were fewer stage 3 cases of lung cancer in the control group; however, heterogeneity between trials was high (RR 4.28, 95% CI 1.06 to 17.27; 5 trials, 64,092 participants; I2 = 59%). Heterogeneity was reduced to zero when we removed Aberle 2011, although the outcome was unchanged (RR 8.69, 95% CI 2.33 to 32.35; 4 trials, 10,637 participants; I2 = 0%). There was probably no difference between the groups with stage 4 cancer (RR 1.05, 95% CI 0.70 to 1.55; 5 trials, 64,092 participants; I2 = 0%) and unknown stage lung cancer (RR 0.99, 95% CI 0.31 to 3.13; 2 trials, 56,773 participants; I2 = 0%). The DLCST (Wille 2016) was the only trial that reported limited and extensive stages. There were no cases in limited stage, and the extensive stage had more cases in the screening group. In the LDCT screening group, 53% of diagnosed lung cancers were stage 1, 7% were stage 2, 23% were stage 3, and 14% were stage 4. In the control group, 40%, 7%, 28%, and 23% of diagnosed cancers were stages 1, 2, 3, and 4, respectively.

  • At 1 year: we included three RCTs (Aberle 2011; Gohagan 2005; Wille 2016; Analysis 8.2). The evidence suggested that stage 1 cancer was detected more in the LDCT group compared with the control group (RR 2.57, 95% CI 1.24 to 5.32; 3 trials, 60,877 participants; I2 = 16%). There was probably no difference between the groups for stages 2 (RR 1.39, 95% CI 0.68 to 2.84; 3 trials, 60,877 participants; I2 = 0%) and 3 lung cancer (RR 1.22, 95% CI 0.76 to 1.95; 3 trials, 60,877 participants; I2 = 0%). The evidence showed that stage 4 lung cancer was detected more in the control group than in the LDCT screening group (RR 0.48, 95% CI 0.30 to 0.77; 3 trials, 60,877 participants; I2 = 0%). The NLST (Aberle 2011) was weighted 96% in that analysis. There was no difference between the groups for limited (RR 1.00, 95% CI 0.06 to 15.98; 1 trial, 4104 participants; I2 = 0%) and extensive (RR 3.00, 95% CI 0.12 to 73.60; 1 trial, 4104 participants; I2 = 0%) stages of lung cancer, as well as unknown stage (RR 1.35, 95% CI 0.17 to 10.75; 2 trials, 56,773 participants; I2 = 17%). In the LDCT screening group, 57% of diagnosed lung cancers were stage 1, 9% were stage 2, 19% were stage 3, and 13% were stage 4. In the control group, 30%, 9%, 22%, and 37% of diagnosed cancers were stages 1, 2, 3, and 4, respectively.

  • At 2 years: we included two RCTs (Aberle 2011; Wille 2016; Analysis 8.3). The evidence suggested that stage 1 cancer was detected more in the LDCT group compared with the control group (RR 3.53, 95% CI 1.66 to 7.53; 2 trials, 57,559 participants; I2 = 20%). There was probably no difference between the groups for stages 2, 3, and 4 lung cancer (RR 1.08, 95% CI 0.49 to 2.37; 2 trials, 57,559 participants; I2 = 0%), (RR 0.92, 95% CI 0.59 to 1.44; 2 trials, 57,559 participants; I2 = 0%) and (RR 0.80, 95% CI 0.52 to 1.24; 2 trials, 57,559 participants; I2 = 0%), respectively. The DLCST trial (Wille 2016) did not have any events in stage 2 lung cancer. We only included one trial (Wille 2016) for each of extensive (RR 0.14, 95% CI 0.01 to 2.76; 1 trial, 4104 participants) and unknown stages (RR 7.00, 95% CI 0.86 to 56.91; 1 trial, 53,455 participants) with no limited cases. In the LDCT screening group, 63% of diagnosed lung cancers were stage 1, 5% were stage 2, 14% were stage 3, and 15% were stage 4. In the control group, 33%, 8%, 26%, and 31% of diagnosed cancers were stages 1, 2, 3, and 4, respectively.

  • 5 to < 10 years post‐randomisation: we included four trials (Becker 2020; Field 2021; Infante 2015; Paci 2017; Analysis 8.4). All trials used TNM staging. The NLST trial (Aberle 2011) also included an occult lung cancer stage, which we combined with stage 1 for this analysis. The evidence showed that stage 1 occurred more frequently in the LDCT screening group (RR 2.26, 95% CI 1.43 to 3.57; 4 trials, 13,676 participants; I2 = 0%). There was probably no difference between the groups for stages 2 (RR 0.78, 95% CI 0.37 to 1.66; 4 trials, 13,676 participants; I2 = 9%) and 3 lung cancer (RR 0.84, 95% CI 0.47 to 1.49; 4 trials, 13,676 participants; I2 = 27%). Heterogeneity was mild amongst the trials for stage 3. The evidence suggested fewer cases of stage 4 lung cancer detected in the LDCT screening group (RR 0.55, 95% CI 0.34 to 0.91; 4 trials, 13,676 participants; I2 = 57%). Heterogeneity was present, and removing Field 2021 resulted in no heterogeneity without changing the findings (RR 0.70, 95% CI 0.50 to 0.97; 3 trials, 9708 participants; I2 = 0%). There was probably no difference between the groups with lung cancer of unknown stage (RR 0.67, 95% CI 0.41 to 1.12; 4 trials, 13,676 participants; I2 = 5%). In the LDCT screening group, 31% of diagnosed lung cancers were stage 1, 7% were stage 2, 17% were stage 3, and 32% were stage 4. In the control group, 11%, 8%, 16%, and 46% of diagnosed cancers were stages 1, 2, 3, and 4, respectively.

  • ≥ 10 years post‐randomisation: we included four trials (Aberle 2011; Paci 2017; Pastorino 2012; Wille 2016; Analysis 8.5). As previously, we combined occult lung cancer with stage 1 lung cancer for Aberle 2011. The evidence showed that stage 1 was detected more frequently in the LDCT screening group (RR 3.28, 95% CI 1.82 to 5.90; 4 trials, 11,409 participants; I2 = 56%). The heterogeneity level was acceptable. There was probably no difference between the groups for stage 2 (RR 0.94, 95% CI 0.76 to 1.17; 4 trials, 64,864 participants; I2 = 0%) and stage 3 lung cancer (RR 1.23, 95% CI 0.79 to 1.93; 4 trials, 64,864 participants; I2 = 56%). The heterogeneity level was acceptable amongst the trials for stage 3 lung cancer. The evidence suggested there were fewer cases of stage 4 lung cancer in the LDCT screening group (RR 0.77, 95% CI 0.69 to 0.86; 4 trials, 64,864 participants; I2 = 0%). There were possibly fewer cancers at unknown stages in the LDCT group compared with the control group (RR 0.67, 95% CI 0.45 to 0.99; 3 trials, 60,765 participants; I2 = 24%). In the LDCT screening group, 40% of diagnosed lung cancers were stage 1, 8% were stage 2, 18% were stage 3, and 28% were stage 4. In the control group, 26%, 9%, 18%, and 37% of diagnosed cancers were stages 1, 2, 3, and 4, respectively.
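
How the risk ratios above are read can be made explicit with a short sketch. It is an illustration only and not part of the review's analysis: it takes the baseline risk ratios and 95% CIs quoted above, flags whether each interval excludes the null value of 1, and back-calculates an approximate standard error on the log scale. The function names, and the assumption that each interval is symmetric on the log scale with z = 1.96, are made for this illustration.

```python
import math

def ci_excludes_null(lower, upper, null=1.0):
    """True when the 95% CI lies entirely above or below the null value."""
    return lower > null or upper < null

def approx_log_se(lower, upper, z=1.96):
    """Approximate SE of log(RR), assuming a CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# RR with lower and upper 95% CI limits at baseline, as quoted in the text above.
baseline = {
    "stage 1": (2.41, 1.86, 3.12),
    "stage 2": (1.88, 0.99, 3.58),
    "stage 3": (4.28, 1.06, 17.27),
    "stage 4": (1.05, 0.70, 1.55),
}

for stage, (rr, lower, upper) in baseline.items():
    verdict = "excludes 1" if ci_excludes_null(lower, upper) else "includes 1"
    se = approx_log_se(lower, upper)
    print(f"{stage}: RR {rr:.2f}, CI {verdict}, SE(log RR) ~ {se:.3f}")
```

Under these assumptions, the intervals for stages 1 and 3 exclude 1 and those for stages 2 and 4 include it, which is consistent with the wording used for each stage in the baseline bullet.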

9) Histology

We grouped trial results by time points (Analysis 9.1; Analysis 9.2; Analysis 9.3). For the purposes of this review, histological types presented are small cell lung carcinoma (SCLC), mixed SCLC and non‐small cell lung carcinoma (NSCLC), squamous cell carcinoma (SCC), adenocarcinoma (AC), bronchioalveolar carcinoma (BAC), and other. The category of 'other' is all other histological subtypes presented in trials, including sarcomatoid carcinomas, large cell carcinomas, neuroendocrine tumours and neuroendocrine carcinomas. It should be noted that BAC was reclassified as various adenocarcinoma subtypes in the lung cancer TNM classification by the World Health Organization (WHO) (Nicholson 2022); however, it has been included here as presented by the relevant trials.

  • Baseline: we included four trials (Aberle 2011; Blanchon 2007; Gohagan 2005; Infante 2015; Analysis 9.1). There was probably no difference in the number of SCLC and other histology between groups (RR 0.84, 95% CI 0.45 to 1.57; 4 trials, 59,987 participants; I2 = 2%) and (RR 1.32, 95% CI 0.90 to 1.94; 4 trials, 59,987 participants; I2 = 0%), respectively. SCC, AC, and BAC were more common in the LDCT arm (RR 1.47, 95% CI 1.01 to 2.13; 4 trials, 59,987 participants; I2 = 0%), (RR 2.81, 95% CI 1.38 to 5.71; 4 trials, 59,987 participants; I2 = 46%) and (RR 4.94, 95% CI 2.41 to 10.10; 2 trials, 55,904 participants; I2 = 0%), respectively. Heterogeneity between groups was moderate in the AC analysis.

  • 1 year: we only included one trial (Gohagan 2005; Analysis 9.2). More cases of AC were found in the LDCT screening group (RR 2.66, 95% CI 1.24 to 5.71; 1 trial, 3318 participants), with probably no difference between groups for SCLC (RR 2.00, 95% CI 0.37 to 10.89; 1 trial, 3318 participants), SCC (RR 0.83, 95% CI 0.25 to 2.72; 1 trial, 3318 participants) and other histology (RR 2.33, 95% CI 0.60 to 9.00; 1 trial, 3318 participants).

  • ≥ 7 years post‐randomisation: we included seven RCTs in the analysis (Aberle 2011; Becker 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016; Analysis 9.3). The latest available time point was 11.3 years post‐randomisation (Aberle 2011). For the DLCST trial (Wille 2016), the adenocarcinoma category also included mixed AC and BAC, as well as mixed AC and SCC. The MILD trial (Pastorino 2012) had both annual and biennial arms combined in the intervention group. For consistency, the latest available time point for data was taken for each trial. In the LUSI trial (Becker 2020), this meant that only data pertaining to three categories (AC, BAC, and other) were included. For SCLC, SCC, and other, there was probably no difference between the groups, with moderate heterogeneity between the trials for the other category: SCLC (RR 0.86, 95% CI 0.74 to 1.01; 6 trials, 71,281 participants; I2 = 0%), SCC (RR 1.04, 95% CI 0.81 to 1.32; 6 trials, 71,281 participants; I2 = 26%) and other (RR 0.87, 95% CI 0.68 to 1.11; 7 trials, 75,333 participants; I2 = 40%). The evidence suggested that AC and BAC were more common in the LDCT screening group, with high heterogeneity between trials in the AC category: AC (RR 1.49, 95% CI 1.05 to 2.10; 7 trials, 75,333 participants; I2 = 82%) (analysis not shown) and BAC (RR 2.73, 95% CI 1.96 to 3.81; 3 trials, 61,610 participants; I2 = 0%). When we removed NLST (Aberle 2011) from the AC analysis, heterogeneity decreased (RR 1.62, 95% CI 1.13 to 2.34; 6 trials, 21,879 participants; I2 = 69%). NLST (Aberle 2011) was the only included trial in this analysis that used CXR screening as a comparator. Mixed SCLC and NSCLC was only reported in one trial (Wille 2016) (RR 0.14, 95% CI 0.01 to 2.76; 1 trial, 4104 participants).

10) Other outcomes

  • Biomarkers: two trials have published data on small samples of their trial population DNA and microRNA profiles (Paci 2017; Pastorino 2012). Becker 2020 published early data on autoantibodies to tumour‐associated antigens in a subgroup of their cohort.

  • Response rate: eight RCTs had available data (Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; LaRocca 2002; Paci 2017; Wille 2016), which are summarised in Table 4. In larger trials that used information mail outs (Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Paci 2017), ≤ 5% of those contacted enrolled in the trial.

  • Adherence to screening: eight RCTs reported adherence to screening (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Paci 2017; Pastorino 2012). Overall, adherence to screening was good, with a decline noted at and after 5 years (Table 4). Three RCTs had comparative data with CXR control groups (Aberle 2011; Blanchon 2007; Gohagan 2005). Heterogeneity was high between the groups; however, the evidence suggested poorer adherence in the LDCT screening group (RR 1.05, 95% CI 1.01 to 1.09; 3 trials, 57,539 participants; I2 = 85%) (analysis not shown).

  • Contamination: seven RCTs reported on contamination (Becker 2020; Blanchon 2007; De Koning 2020; Gohagan 2005; Infante 2015; Pastorino 2012; Wille 2016). In LUSI (Becker 2020), 13% of the control group had undergone a CT scan. In DEPISCAN (Blanchon 2007), 6 participants had inadvertently undergone LDCT. In the NELSON trial (De Koning 2020), 4% of the control group in a random sample of 1460 participants had undergone chest CT. One per cent of control arm participants underwent a LDCT in the MILD trial (Pastorino 2012), and three participants in the DLCST control group (Wille 2016) had a chest CT for lung cancer screening purposes. Neither LSS (Gohagan 2005) nor DANTE (Infante 2015) clearly differentiated the number of scans performed for screening purposes only. Reported contamination rates are detailed in Table 4. Three RCTs had comparative data with control arms (Gohagan 2005; Infante 2015; Wille 2016). The evidence suggested there was no significant difference between the groups; however, heterogeneity was high (RR 1.35, 95% CI 0.32 to 5.68; 3 trials, 6902 participants; I2 = 74%).

  • Interval lung cancers: seven trials reported rates of interval cancers at different time points (Aberle 2011; Becker 2020; De Koning 2020; Gohagan 2005; Paci 2017; Pastorino 2012; Wille 2016). These are summarised in Table 5.

  • False negatives: five trials reported false‐negative cases at various time points (Aberle 2011; De Koning 2020; Field 2021; Infante 2015; Pastorino 2012). In NLST (Aberle 2011), there were < 1% false negatives each round of screening for the LDCT screening group and CXR control group for the first 3 years, with more false‐negative results in the CXR screening compared with LDCT screening (Analysis 5.2). The NELSON trial (De Koning 2020) also reported small numbers of false negatives with a total of five, seven, and seven false negatives across the first, second and third round of screening, respectively. The UKLS trial (Field 2021) reported three false negatives from their baseline LDCT. The DANTE trial (Infante 2015) reported one false negative in the intervention arm. The MILD trial (Pastorino 2012) reported 17 false negatives in the annual LDCT group and nine in the biennial LDCT group.

  • Incidental findings: six trials reported rates of incidental findings (Aberle 2011; Blanchon 2007; De Koning 2020; Field 2021; Infante 2015; Wille 2016), and these are summarised in Table 5.

  • Cost: three trials reported cost data (Aberle 2011; Field 2021; Wille 2016). The lowest cost was in the UK trial (Field 2021) which, following clarification with authors, was GBP 186 per screen per participant, including costs of recruitment. In the Danish trial (Wille 2016), the cost of a LDCT screen was EUR 282, with the total healthcare cost per participant per year being EUR 3756 in the screening group. Of note, the control arm cost per participant per year in this trial was EUR 3474 (EUR 2348 without lung function or counselling). The USA trial (Aberle 2011) had a cost per participant in the LDCT screening group of USD 1130, compared with USD 336 per participant in the CXR group. The costs of screening included the cost of investigating clinically significant incidental findings (SIFs).

  • Use of anxiolytics and antidepressants: only one trial investigated this outcome and concluded that participation in the trial was not associated with a change in prescriptions of these medications (Wille 2016).

  • Feasibility of general practitioner (GP) enrolment to lung cancer screening trial: one trial reported this as an outcome (Blanchon 2007), with participation rate of GPs reported as 41%. 
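
As a quick arithmetic check of the cost figures quoted above, the sketch below compares the annual healthcare cost per participant in the two arms of the Danish trial and the per-participant costs in the USA trial. The variable names are illustrative; the figures are those given in the text.

```python
# Annual healthcare cost per participant in the Danish trial (EUR), as quoted above.
screening_arm_cost = 3756
control_arm_cost = 3474
ldct_screen_cost = 282

difference = screening_arm_cost - control_arm_cost
print(f"Difference between arms: EUR {difference}")
print(f"Equals the quoted cost of one LDCT screen: {difference == ldct_screen_cost}")

# Per-participant cost in the USA trial (USD), LDCT group versus CXR group.
us_ratio = 1130 / 336
print(f"USA trial cost ratio (LDCT/CXR): {us_ratio:.1f}")
```

The between-arm difference in the Danish trial coincides with the quoted cost of a single LDCT screen; whether this is by construction in the trial's costing is not stated here.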

Table 4. Response, adherence and contamination rates

  • Aberle 2011

    • Response rates to recruitment: Not available

    • Adherence to screening: Overall adherence to all 3 screening rounds: 95% of participants completed LDCT scan, 93% in the control group completed CXR

    • Contamination: Not available

  • Becker 2020

    • Response rates to recruitment

      • 292,440 people received questionnaires

      • 95,797 people responded

      • 4913 people met eligibility criteria

      • 4052 people were enrolled and randomised to the trial (1% of those who received a questionnaire, and 4% of respondents)

    • Adherence to screening

      • Baseline: almost 100% completed LDCT scan.

      • 1‐year: 95% completed LDCT scan.

      • 2‐year: 93% completed LDCT scan.

      • 3‐year: 93% completed LDCT scan.

      • 4‐year: 94% completed LDCT scan.

    • Contamination: 10 years postrandomisation: 264 participants in the control arm had received a CT.

  • Blanchon 2007

    • Response rates to recruitment

      • 830 eligible people were approached

      • 765 people consented to be randomised (92% of eligible people)

    • Adherence to screening: Baseline: 86% participants completed LDCT scan, 75% in control arm completed CXR.

    • Contamination: At baseline: 6 participants in the control arm inadvertently received a LDCT.

  • De Koning 2020

    • Response rates to recruitment

      • 606,409 people received the first questionnaire

      • 150,920 responded to the questionnaire

      • 30,959 people were eligible and invited to participate

      • 15,822 people completed second questionnaire and were included and randomised (3% of people who received the first questionnaire and 51% of eligible respondents)

    • Adherence to screening

      • Baseline: 95% of participants completed LDCT scan

      • 1 year: 97% of participants completed LDCT scan

      • 3 years: 95% completed LDCT scan

      • 5.5 years: 78% of participants completed LDCT scan

    • Contamination: At 2 years: 3.6% of participants in the control arm had received CT for any reason

  • Field 2021

    • Response rates to recruitment

      • 247,354 people invited to participate in study

      • 75,958 responded positively to invitations

      • 8729 of respondents assessed as high risk

      • 5967 responded to second questionnaire

      • 4868 invited to recruitment centre

      • 4152 attended recruitment centre

      • 4061 consented to participation

      • 4055 randomised to trial (2% of those invited and 46% of high‐risk respondents)

    • Adherence to screening: Baseline: 98% of participants completed LDCT scan

    • Contamination: Not available

  • Gohagan 2005

    • Response rates to recruitment

      • 653,417 people mailed information packages

      • 12,270 people contacted screening centre and underwent eligibility assessment

      • 4828 people eligible for trial

      • 3409 people were randomised, however 91 participants subsequently found to be ineligible

      • 3318 participants randomised and included in analysis (1% of people who received mail packages and 27% who were screened for eligibility)

    • Adherence to screening

      • Baseline: 95% of participants completed LDCT scan, 93% of the control group completed CXR

      • 1‐year: 86% of participants completed LDCT scan, 80% of the control group completed CXR

    • Contamination

      • Contamination assessed by random sample of participants

      • At baseline: 5% of respondents in the intervention arm had received a CXR for medical or screening purposes. 0.9% of respondents in the control arm had received a CT for medical or screening purposes.

      • At 1 year: 10% of participants in the intervention arm had received a CXR for medical or screening purposes and 1.3% of respondents in the control arm had received a CT for medical or screening purposes.

  • Infante 2015

    • Response rates to recruitment: Not available

    • Adherence to screening: Not available

    • Contamination

      • 3 years post‐randomisation: intervention arm (74 extra CT scan, 233 extra CXRs); control arm (extra 68 CTs, 209 extra CXRs)

      • Did not specify if for screening purposes, only outside protocol

  • LaRocca 2002

    • Response rates to recruitment

      • 3418 people completed screening questionnaires

      • 904 participants completed pre‐screening baseline CXR

      • 871 participants randomised to trial (25% of people who completed screening questionnaire)

    • Adherence to screening: Not available

    • Contamination: Not available

  • Paci 2017

    • Response rates to recruitment

      • 71,232 people sent letters

      • 17,055 people responded to letters

      • 3206 people were eligible and randomised (5% of people who received letters and 19% of respondents)

    • Adherence to screening

      • Baseline: 87% completed LDCT scan

      • 1 year: 85% completed LDCT scan

      • 2 years: 82% completed scan

      • 3 years: 80% completed scan

    • Contamination: Not available

  • Pastorino 2012

    • Response rates to recruitment: Not available

    • Adherence to screening

      • Baseline: 97% of annual group completed scan, 97% of biennial group completed scan

      • 1 year: 97% of the annual group and 97% of the biennial group completed the scan

      • 2 years: 98% of the annual group and 95% of the biennial group completed the scan

      • 3 years: 97% of the annual group and 97% of the biennial group completed the scan

      • 4 years: 96% of the annual group and 92% of the biennial group completed the scan

      • 5 years: 79% of the annual group and 98% of the biennial group completed the scan

      • 6 years: 54% of the annual group and 77% of the biennial group completed the scan

    • Contamination: 10 years post‐randomisation: 21 of 1723 participants in the control arm had received a LDCT

  • Wille 2016

    • Response rates to recruitment

      • 5861 people assessed for eligibility

      • 4104 people randomised to the trial (70% of people assessed)

    • Adherence to screening: Not available

    • Contamination: After 5 years post‐randomisation: 0 cases of contamination in the intervention arm; 3 cases in the control arm

CT: computed tomography; CXR: chest x‐ray; LDCT: low‐dose computed tomography
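
The percentages in the response-rate column of Table 4 can be reproduced from the counts given there. The sketch below is a minimal check, assuming each percentage is the number randomised divided by the stated denominator and rounded to the nearest whole number; the dictionary layout and the helper name `pct` are illustrative.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

# Number randomised and the denominators named in Table 4 for each trial.
recruitment = {
    "Becker 2020": (4052, {"received questionnaire": 292_440, "respondents": 95_797}),
    "Blanchon 2007": (765, {"eligible people approached": 830}),
    "De Koning 2020": (15_822, {"received first questionnaire": 606_409, "eligible and invited": 30_959}),
    "Field 2021": (4055, {"invited": 247_354, "high-risk respondents": 8729}),
    "Gohagan 2005": (3318, {"mailed packages": 653_417, "screened for eligibility": 12_270}),
    "LaRocca 2002": (871, {"completed screening questionnaire": 3418}),
    "Paci 2017": (3206, {"sent letters": 71_232, "respondents": 17_055}),
    "Wille 2016": (4104, {"assessed for eligibility": 5861}),
}

for trial, (randomised, denominators) in recruitment.items():
    for label, n in denominators.items():
        print(f"{trial}: {pct(randomised, n)}% of {label} ({randomised}/{n})")
```

With this rounding rule, the recomputed values agree with the percentages stated in the table, including the ≤ 5% enrolment among those contacted in the larger mail-out trials.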

Table 5. Interval cancers and incidental findings

  • Aberle 2011

    • Interval cancers

      • Postbaseline LDCT: 18 lung cancers

      • Post‐1 year LDCT: 10 lung cancers

    • Incidental findings: Baseline LDCT data from non‐ACRIN centres (N = 17,309): 2625 cardiovascular abnormalities, 221 thyroid abnormalities, 419 adrenal abnormalities, 780 renal abnormalities, and 1064 hepatobiliary abnormalities

  • Becker 2020

    • Interval cancers

      • Postbaseline LDCT: 1 lung cancer

      • Post‐1‐year LDCT: 0 lung cancers

      • Post‐2‐year LDCT: 2 lung cancers

      • Post‐3‐year LDCT: 1 lung cancer

      • Post‐4‐year LDCT: 2 lung cancers

    • Incidental findings: Not available

  • Blanchon 2007

    • Interval cancers: Not available

    • Incidental findings

      • Baseline LDCT: 19 severe emphysema, 63 bronchiectasis, and 18 mediastinal findings

      • Baseline CXR in control group: 5 severe emphysema, 2 bronchiectasis, and 6 mediastinal findings

  • De Koning 2020

    • Interval cancers: After 3 rounds of LDCT screening: 35 interval lung cancers

    • Incidental findings: Baseline LDCT data from one centre (N = 1929): 76 liver findings, 53 kidney findings, 9 thyroid findings, 2 mediastinal findings, 1 adrenal finding, 1 breast finding, 1 colon finding, and 1 perineural cyst

  • Field 2021

    • Interval cancers: Not available

    • Incidental findings: Baseline LDCT: 4 aortic dilatations, 5 severe aortic valve calcifications, 4 mediastinal masses, 6 mediastinal or hilar lymphadenopathy, 41 pneumonias, 5 bronchiectasis, 8 pleural thickening, 7 smoking‐related interstitial lung diseases, 9 severe emphysemas, 6 unspecified interstitial fibrosing lung disease, 2 nonspecific interstitial pneumonias, 12 usual interstitial pneumonias, 1 sarcoidosis, 2 oesophageal thickening or dilatation, 1 breast mass, 2 lobar collapse, 1 biliary dilatation, 3 adrenal masses, 1 liver cirrhosis, 1 hydronephrosis, 1 liver mass, 1 pancreatic cyst, 3 renal masses, 1 splenomegaly, and 1 thyroid mass

  • Gohagan 2005

    • Interval cancers: 1 year post‐randomisation: 2 lung cancers in the LDCT group and 2 lung cancers in the control group

    • Incidental findings: Not available

  • Infante 2015

    • Interval cancers: Not available

    • Incidental findings

      • Baseline LDCT: 1 lymphoma, 1 oesophageal carcinoma, 1 malignant mesothelioma, 1 colon cancer with liver metastasis, and 2 renal cancers with pulmonary metastasis

      • Baseline CXR in the control group: 1 lymphoma

  • LaRocca 2002

    • Interval cancers: Not available

    • Incidental findings: Not available

  • Paci 2017

    • Interval cancers: Overall 6 interval lung cancers reported during 4 years of screening

    • Incidental findings: Not available

  • Pastorino 2012

    • Interval cancers

      • 4.4 years post‐randomisation: 5 lung cancers reported in annual group and 5 in the biennial group

      • 6.5 years post‐randomisation: 13 lung cancers reported in annual group and 10 in biennial group

    • Incidental findings: Not available

  • Wille 2016

    • Interval cancers: 1 interval lung cancer reported in LDCT group during year 3

    • Incidental findings: After 5 rounds of screening: 140 participants had 148 significant incidental findings (1 larynx, 3 thyroid, 9 gastroesophageal, 16 breast, 5 cardiac, 12 mediastinum, 28 aorta, 18 liver/gallbladder, 6 pancreas, 1 spleen, 2 intestines, 40 kidneys, 2 skin, 3 chest wall, and 2 vertebral column)

CXR: chest x‐ray; LDCT: low‐dose computed tomography

Discussion

Summary of main results

Primary outcomes 

  • For lung cancer‐related mortality, moderate‐certainty evidence showed a difference favouring LDCT screening. When we only included high‐certainty trials, the evidence still favoured LDCT screening. 

  • The evidence showed that the number of invasive and non‐invasive interventions is higher in the LDCT screening group compared with the control group, including rates of invasive interventions for non‐lung cancer‐related disease. However, there was probably no difference in death postsurgery between groups. 

Secondary outcomes 

  • For all‐cause mortality, the evidence showed a small difference favouring screening with LDCT. The analysis still favoured screening with LDCT when only high‐certainty trials were included. 

  • For estimated overdiagnosis at 10 or more years, the combined risk was 18%. However, the 95% CI was wide, suggesting possibly no difference between the groups, with a lower limit of the 95% CI just below 0 and an upper limit of 36%. This is in keeping with the incidence of lung cancer, demonstrating that there was possibly no difference in incidence between LDCT screening and control groups at 10 or more years. 

  • For false‐positive results from scans, the evidence showed that these were higher in the LDCT screening group. 

  • For recall rates, the evidence showed that these were higher in the LDCT screening group. 

  • For smoking cessation rates, results were mixed. However, there was probably no significant difference in smoking relapse rates between LDCT screening and control groups.

  • For psychosocial consequences, the evidence was of low certainty due to inconsistencies in outcome measures, sample groups, and timing of assessments. Overall the limited evidence available did not suggest any long‐term adverse impact on psychosocial well‐being or HRQoL with LDCT screening. 

  • For lung cancer staging, the evidence showed there was more stage 1 lung cancer detected in the LDCT group compared with control, across the time points. As time from randomisation increased, there was probably no difference between groups for stage 2 and 3 lung cancer. In later time points, the evidence showed there was more stage 4 lung cancer in the control groups compared to LDCT screening group. 

  • For lung cancer histology, the evidence showed there was more squamous cell carcinoma (SCC), adenocarcinoma (AC) and bronchioalveolar carcinoma (BAC) detected in the LDCT screening group compared with control groups at baseline. AC and BAC remained more prevalent in the LDCT screening group at later time points. 

Overall completeness and applicability of evidence

We identified 11 eligible RCTs and included eight for the main meta‐analysis of the primary outcome, lung cancer‐related mortality; we could not include two trials in the meta‐analysis due to data being unavailable, and we excluded one trial (Gohagan 2005) from Analysis 1.1 as it did not have any planned follow‐up time points.

The following considerations may affect the strength and completeness of the conclusions of this review.

  • Participant characteristics

    • Participants enrolled in these trials tended to have a strong tobacco smoking exposure history as a result of trial inclusion criteria. Trials investigating the role of LDCT for lung cancer‐screening in non‐smoking populations are still ongoing (Ongoing studies). Only one trial used a validated tool to predict lung cancer risk, with the other trials using smoking exposure as part of the inclusion criteria. 

    • Two RCTs had either zero or a significantly underrepresented number of female participants (De Koning 2020; Infante 2015). This is significant, as the subgroup analysis of female participants demonstrated a larger lung cancer‐related mortality benefit with LDCT screening compared to the male participants (Analysis 1.8), although CIs did overlap.

    • All included trials were conducted in the USA or Europe. Whilst not all trials reported a breakdown of ethnicity or race, the two trials that did reported a significant majority of white participants compared to other races (Aberle 2011; Field 2021).

    • Seven trials had additional fitness requirements (Becker 2020; Blanchon 2007; De Koning 2020; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016), including fitness for surgery for entry into the trial.

    • Information regarding education level and socioeconomic status of participants was limited. Three trials described education levels amongst participants (Aberle 2011; Field 2021; Wille 2016), with 32% of NLST participants having a tertiary degree or higher (Aberle 2011).

  • We could not provide enough information about the potential harms of LDCT screening as only a few trials provided data from all trial groups (Aberle 2011; Gohagan 2005; Infante 2015; Pastorino 2012). Heterogeneity ranged from high to low amongst trials in these analyses, as trials did not have uniform reporting and categorisation of invasive and non‐invasive interventions. Additionally, trials did not have a consensus definition for significance of nodules and investigation pathways.

  • Included trials did not use consistent measures of HRQoL, which made comparison challenging. 

    • Only two trials administered the questionnaires to their whole cohort (Field 2021; Wille 2016), with the NLST (Aberle 2011) and the NELSON (De Koning 2020) trials inviting only a small portion of their cohort. The NLST limited its assessment to those in the LDCT screening group (Aberle 2011).

    • Whilst the exact timing of assessments varied, all trials consistently reported no long‐term adverse consequences of screening with LDCT, except for true positives (participants diagnosed with cancer as a result of screening) in NLST (Aberle 2011). The NLST trial reported lower HRQoL and more anxiety in this group. 

    • The NELSON trial had an indeterminate category for classification of nodules (De Koning 2020), and reported an elevated cancer‐specific distress score in this group at 2 months. There was no difference between groups at 2 years.

    • The UKLS trial reported less decision satisfaction for participants in the control arm (Field 2021), which may reflect an increased awareness of risk of lung cancer in the control arm without the ability to participate in screening. Anxiety and lung cancer‐related distress were higher in the group referred to multidisciplinary meetings postscreening; however, this group had high satisfaction rates in their decision to participate in screening.

    • The DLCST reported more negative responses in both the screening and no‐screening groups in the fields of behaviour, dejection and negative impact on sleep at earlier time points (Wille 2016); however, the impact decreased by round five of screening. They also noted less anxiety in the screening group at round five. Their response rate to questionnaires in the control group was lower (76% compared with 94% in the LDCT screening group). In their smaller cohort study, Wille 2016 reported that, in the short term, participants with false‐positive results had more negative psychosocial consequences of screening than those in the control group (no screening) or those with true‐negative results, with no significant long‐term consequences.

    • The DLCST also reviewed psychosocial status and demographics of participants in the control group who did not attend their annual clinical review (Wille 2016). The trial reported that non‐attenders to visits were from more disadvantaged sociodemographic backgrounds and had lower psychosocial questionnaire scores from preceding rounds. This is an important consideration for lung cancer screening: it adds to the need for more comprehensive assessments of the psychosocial well‐being of potential lung cancer screening participants, and it acknowledges that the population who participates and engages in a trial may not reflect the general population.

    • Further research is required to assess factors affecting engagement in screening and adherence and how to manage potential short‐term adverse psychosocial outcomes. 

  • Evidence on the effect of screening on smoking behaviour was also limited by the available data.

  • The definition of recall rates was not consistent across trials (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). Recall rates in LUSI included all early recalls up to 12 months following LDCT (Becker 2020), whereas in the MILD trial recall was defined as 3 months post‐LDCT scan (Pastorino 2012). The NLST reported all further chest CT and did not specify recall CTs only (Aberle 2011). The NELSON trial also had an intermediate category for abnormal results (De Koning 2020), which likely resulted in an underestimation of false‐positive screens, when defined as any result that is not a negative screen.

  • Incidental findings were also not well described across the included trials, with varying categories of findings reported. Incidental findings are likely common, however, with the NLST reporting that 15% of participants had incidental cardiovascular disease noted on baseline LDCT (N = 17,309 from non‐ACRIN centres; Aberle 2011).

  • Some trials had limited publications. All RCT authors were contacted by a reviewer (DM) where required. 

Quality of the included trials

Included trials were generally well‐designed, and all included trials were RCTs.

  • Due to the nature of the intervention, an open‐label design may have increased performance bias in subjective outcomes.

  • Only a few trials used blinded death panels to review lung cancer‐related mortality, which may have influenced detection bias.

    • Six of the included trials used a death review panel in some capacity (Aberle 2011; Becker 2020; De Koning 2020; Infante 2015; Paci 2017; Wille 2016), particularly when assessing lung cancer‐related deaths for part or all of the duration of the trial.

    • Three trials used registries and/or death certificates only to determine cause of death (Field 2021; Gohagan 2005; Pastorino 2012).

    • The NELSON trial compared the use of death review panels with death certificates for determining cause of death in the first 266 deaths in the Netherlands cohort (De Koning 2020); the review panel reclassified 12% of cases. The NELSON trial subsequently ceased using a death review panel.

    • The NLST trial also reviewed use of a death review panel compared with death certificates to accurately determine cause of death (Aberle 2011). Cause of death was reclassified in 3% of cases following review by panel, with death certificates having a sensitivity of 91% for cause of death. The authors then revisited analysis of lung cancer‐related mortality using lung cancer‐related deaths provided by death certificate, and found a lung cancer mortality reduction of 18% (95% CI 4.2 to 25) compared to the 20% (95% CI 6.7 to 26.7) published.

    • The use of death panels in trials is expensive, and further research is required to determine whether it significantly adds to the assessment of lung cancer‐related mortality.

    • The use of registries alone for follow‐up and detection of mortality and lung cancer incidence was a concern, as it was limited by trial access to participant data, delays from outcome to registry notification, and assumed participants had remained in the catchment of the registry without name change or errors. In the NLST's extended follow‐up for instance (Aberle 2011), not all home state registries participated in the linkage, and some screening centres did not have access to participants' details to complete the linkage. The LUSI trial also had limitations with registry linkage (Becker 2020), with 39 participants declining data access. The UKLS trial had participants who had not consented to data linkages or had opted out of national registries (Field 2021).

    • Given concerns regarding completeness of follow‐up data, this review prioritised planned and/or active follow‐up in analysis over unplanned extended follow‐up, as unplanned follow‐up tended to rely more on registry data, and some trials that had used a death review panel (Aberle 2011; Paci 2017) ceased using it after planned follow‐up.

  • All‐cause mortality results were reliable across included trials.

  • Regarding risk of bias, excluding trials with unclear risk (Becker 2020; Blanchon 2007; LaRocca 2002), most domains of importance were at low risk overall. A few trials had high risk for certain domains (De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; Pastorino 2012); however, this is unlikely to have had a significant impact on the results presented.

Certainty of the evidence

See summary of findings Table 1.

We graded lung cancer‐related mortality as having moderate‐certainty evidence as it included eight trials which had low to high risk of bias. It should be noted that one trial had CXR as a comparison rather than no screening (Aberle 2011). Whilst screening with CXR has not been shown to reduce lung cancer‐related mortality (Manser 2013), there may potentially be some impact from screening with CXR, which may have diluted the effect of LDCT screening in the NLST trial. However, there was no heterogeneity between the included trials. The included trials also had different definitions for positive scans (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016).

We graded all‐cause mortality as having moderate‐certainty evidence as the eight included trials had risk of bias ratings from low to high (Aberle 2011; Becker 2020; De Koning 2020; Field 2021; Infante 2015; Paci 2017; Pastorino 2012; Wille 2016). The difference was small, with only a 5% risk reduction with screening and a 95% CI upper limit of 0.99 (1% risk reduction). CXR as a comparator in the NLST may have impacted the effect (Aberle 2011).
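The percentages quoted for all‐cause mortality follow directly from the risk ratio. As a minimal sketch (the function name `relative_risk_reduction` is illustrative and not taken from the review), the RR of 0.95 (95% CI 0.91 to 0.99) reported in summary of findings Table 1 can be converted to a relative risk reduction as follows:

```python
def relative_risk_reduction(rr: float) -> float:
    """Percentage risk reduction implied by a risk ratio (RR)."""
    return (1.0 - rr) * 100.0


# All-cause mortality values as reported in summary of findings Table 1.
rr, ci_lower, ci_upper = 0.95, 0.91, 0.99

point = relative_risk_reduction(rr)
# The upper CI limit of the RR gives the smallest plausible reduction,
# and the lower CI limit gives the largest.
smallest = relative_risk_reduction(ci_upper)
largest = relative_risk_reduction(ci_lower)

print(f"{point:.0f}% reduction (95% CI {smallest:.0f}% to {largest:.0f}%)")
```

Because the upper CI limit of 0.99 sits so close to 1, the smallest reduction compatible with the data is about 1%.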

We graded overdiagnosis as having low‐certainty evidence due to risk of bias in the included trials and high heterogeneity between trials. The heterogeneity was largely due to the DLCST (Wille 2016), which had significantly higher lung cancer incidence in the LDCT screening group compared with no screening. There was a higher incidence of lung cancer diagnosed after cessation of screening in the screening group, which was unusual and not adequately explained by the mild overrepresentation of smoking history > 35 pack years and lower forced expiratory ratio (FER) in the screening group at baseline. We chose to present the follow‐up data at 10 or more years despite this rating as it probably provides a better estimate for overdiagnosis of lung cancer with LDCT screening. Except for the DLCST (Wille 2016), the trials demonstrated a reduction in estimated overdiagnosis as time from randomisation increased. The other consideration, which was not well explored in all trials, was the background rate of CT scans performed in each country for other purposes, which may conversely diminish the perceived impact of overdiagnosis.

We graded the outcomes 'number of invasive tests' and 'any death postsurgery' as having moderate‐certainty evidence as both analyses included trials with concerns regarding risk of bias.

We graded HRQoL and psychosocial consequences as having low‐certainty evidence because we judged two of the three trials contributing to this outcome to be at high risk of bias, and because the included trials assessed this outcome differently (De Koning 2020; Field 2021; Wille 2016), which limited our ability to combine results. Only two trials included the whole trial cohort in their quality of life assessments (Field 2021; Wille 2016): the UKLS trial administered the questionnaires to the whole cohort (Field 2021), as did the DLCST (Wille 2016), which also performed an additional nested cohort trial focusing on those with a positive screen result. The NELSON trial took a random sample of 733 participants from each trial group (LDCT screening and control) (De Koning 2020). In the NLST (Aberle 2011), which we excluded from the meta‐analysis as it did not have data from the CXR group, only 16 sites invited participants to complete questionnaires. At those 16 sites, only participants with a positive scan were initially invited to participate, with the NLST subsequently inviting participants with a negative scan but SIFs on LDCT at a later time point. This cohort of participants was then matched with controls who had a negative LDCT scan.

The tools used for assessment of this outcome also varied, with only one trial using a lung cancer‐specific tool to assess psychosocial consequences (Wille 2016). The NELSON trial (De Koning 2020) and UKLS trial (Field 2021) used lung cancer‐specific distress scores. The NLST (Aberle 2011) and NELSON trial (De Koning 2020) both used generic health questionnaires (Short Form‐12 and 36), as well as the STAI‐6 for anxiety. The UKLS trial also measured the Hospital Anxiety and Depression Scale (HADS) and the Cancer Worry Scale, and used a satisfaction with decision score (Field 2021).

The trials also did not evaluate outcomes at uniform time points. The NLST assessed before the scan (could be any screen interval), 1 month postscreen, and 6 months postscreen (Aberle 2011). The NELSON trial assessed at baseline, 2 months post‐randomisation, and at 2 years (prior to the third screen) (De Koning 2020). The UKLS trial collected questionnaires before randomisation, at 2 weeks (after participants had received results of screen or been notified of allocation to control arm), and at 10 months to 27 months (Field 2021). The DLCST administered questionnaires annually for five years (Wille 2016). Their nested cohort trial collected questionnaires at baseline, 1 week (postresult of screen or annual visit), 1 month, 6 months, and 18 months. The DLCST also reviewed psychosocial status and demographics of participants in the control group who did not attend their annual clinical review. The trial reported that non‐attenders to visits were from socioeconomically disadvantaged backgrounds and had lower psychosocial questionnaire scores from preceding rounds. This is an important consideration for lung cancer screening, as lung cancer is disproportionately represented in this group (Mao 2001).

Potential biases in the review process

This review used the Cochrane highly‐sensitive search strategy and aimed to include all trials, both published and unpublished, and conducted a wide search including ongoing trial registry databases and abstracts from major conferences. We did not apply any language restrictions, and we are confident we have included all relevant trials to date in this review. At least two review authors (AB, CM, DM, RM, RManser) reviewed search results. Multiple authors (AB, RM, DM) checked all data, both during the extraction process and analysis of data against published results. We resolved any disagreements and queries via discussion. One review author (DM) contacted trial authors when additional data were required. 

Agreements and disagreements with other studies or reviews

The previous Cochrane Review on lung cancer screening evaluated multiple modalities of screening, including sputum cytology, CXR, and CT (Manser 2013). It concluded that there was no evidence to support screening with sputum cytology or CXR, but that more data were required for CT screening. As such, this review focused on lung cancer screening with LDCT and incorporated data from more RCTs (Aberle 2011; Becker 2020; Blanchon 2007; De Koning 2020; Field 2021; Gohagan 2005; Infante 2015; LaRocca 2002; Paci 2017; Pastorino 2012; Wille 2016), most of which were still ongoing at the time of the previous review.

When comparing our review to other published systematic reviews (Hoffman 2020; Huang 2019; Jonas 2021; Rota 2019; Sadate 2020; Mazzone 2021), our review incorporated more secondary outcomes and provided analysis at more defined time points and with additional subgroups. Two of the reviews (Hoffman 2020; Huang 2019) included the AME trial (Yang 2018), which does not have complete mortality data at 5 years. Despite some differences in included trials and time points, the reviews still favoured screening to reduce lung cancer‐related mortality and, when reported, did not show differences between groups in mortality after invasive procedures. However, most concluded that there was probably no difference in all‐cause mortality, as their 95% CI just reached or crossed 1. Our review included data from the UKLS trial (Field 2021), and had a 95% CI upper limit of 0.99. One review (Hoffman 2020) also evaluated diagnosis of stage 1 lung cancer, and found it to be more prevalent in the LDCT screening group.

One other review has focused on overdiagnosis in lung cancer screening RCTs (Brodersen 2020). This review included data from five RCTs (Becker 2020; De Koning 2020; Paci 2017; Pastorino 2012; Wille 2016), and estimated that 38% of screen‐detected lung cancers may be overdiagnosed in these trials (95% CI 14% to 63%). Brodersen 2020 performed a sensitivity analysis of the trials they rated as high quality (Becker 2020; Wille 2016), which had an estimated overdiagnosis rate of 49% (95% CI 11% to 89%). Of note, a recent publication by Gao 2022 investigated possible overdiagnosis of lung cancer by LDCT screening in a population of mostly non‐smoking Taiwanese women. They estimated that from 2004 to 2018, between 12% and 21% of women were overdiagnosed with lung cancer. Women and non‐smokers were underrepresented in the trials included in our review.

Our findings were consistent with other reviews on psychosocial outcomes related to lung cancer screening (Quaife 2021; Slatore 2014; Wu 2016).

With regard to assessment of risk of bias, our review was generally consistent with other reviews; however, we tended to be more conservative when evaluating 'other biases' in trials, such as protocol deviations and unbalanced baselines. We also contacted authors for clarification.

Figure 1. Study selection flow diagram.

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Figure 4. Lung cancer mortality ‐ Planned time points ‐ Sensitivity analysis

Figure 5. All‐cause mortality ‐ Planned time points ‐ Sensitivity analysis

Analysis 1.1. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 1: Lung cancer‐related mortality ‐ planned time points

Analysis 1.2. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 2: Lung cancer‐related mortality ‐ planned time points

Analysis 1.3. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 3: Lung cancer‐related mortality at different follow‐up time points (including unplanned)

Analysis 1.4. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 4: Lung cancer‐related mortality by screening arm ‐ planned time points

Analysis 1.5. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 5: Lung cancer‐related mortality – by time postscreening cessation (including unplanned time points)

Analysis 1.6. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 6: Lung cancer‐related mortality by screening interval ‐ planned time points

Analysis 1.7. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 7: Lung cancer‐related mortality by sex ‐ planned time points

Analysis 1.8. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 8: Lung cancer‐related mortality by sex ‐ planned time points

Analysis 1.9. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 9: Lung cancer‐related mortality by age ‐ planned time points

Analysis 1.10. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 10: Lung cancer related to smoking ‐ latest time point (including unplanned)

Analysis 1.11. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 11: Lung cancer‐related mortality by geography ‐ planned time points

Analysis 1.12. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 12: Nodule management algorithm ‐ planned follow‐up time points

Analysis 1.13. Comparison 1: Primary outcome: lung cancer‐related mortality, Outcome 13: Nodule management criteria ‐ planned follow‐up time points

Analysis 2.1. Comparison 2: Primary outcome: number of non‐invasive and invasive tests ‐ all time points, Outcome 1: Number of invasive tests

Analysis 2.2. Comparison 2: Primary outcome: number of non‐invasive and invasive tests ‐ all time points, Outcome 2: Non‐invasive tests

Analysis 2.3. Comparison 2: Primary outcome: number of non‐invasive and invasive tests ‐ all time points, Outcome 3: Number of invasive test for false positive

Analysis 2.4. Comparison 2: Primary outcome: number of non‐invasive and invasive tests ‐ all time points, Outcome 4: Death postsurgery

Analysis 3.1. Comparison 3: Secondary outcome: all‐cause mortality, Outcome 1: All‐cause mortality ‐ planned time points (latest time points)

Analysis 3.2. Comparison 3: Secondary outcome: all‐cause mortality, Outcome 2: All‐cause mortality ‐ all time points (planned and unplanned)

Analysis 3.3. Comparison 3: Secondary outcome: all‐cause mortality, Outcome 3: All‐cause mortality ‐ planned time points

Analysis 3.4. Comparison 3: Secondary outcome: all‐cause mortality, Outcome 4: All‐cause mortality by sex ‐ planned time points

Analysis 3.5. Comparison 3: Secondary outcome: all‐cause mortality, Outcome 5: Cardiovascular mortality ‐ planned and unplanned

Analysis 4.1. Comparison 4: Secondary outcome: lung cancer incidence, Outcome 1: Lung cancer incidence ‐ by different time points

Analysis 4.2. Comparison 4: Secondary outcome: lung cancer incidence, Outcome 2: Lung cancer incidence ‐ by control group at ≥ 10 years

Analysis 4.3. Comparison 4: Secondary outcome: lung cancer incidence, Outcome 3: Overdiagnosis at ≥ 10 years

Analysis 5.1. Comparison 5: Secondary outcome: false positives, negatives and recalls (number of screens), Outcome 1: False positive at baseline

Analysis 5.2. Comparison 5: Secondary outcome: false positives, negatives and recalls (number of screens), Outcome 2: False negative

Analysis 5.3. Comparison 5: Secondary outcome: false positives, negatives and recalls (number of screens), Outcome 3: Recall rates at baseline

Analysis 6.1. Comparison 6: Secondary outcome: impact on smoking behaviour, Outcome 1: stop smoking

Analysis 6.2. Comparison 6: Secondary outcome: impact on smoking behaviour, Outcome 2: smoking relapse

Analysis 7.1. Comparison 7: Secondary outcome: health‐related quality of life, Outcome 1: Anxiety ‐ at 10 months to 5 years (change over time and endpoints)

Analysis 7.2. Comparison 7: Secondary outcome: health‐related quality of life, Outcome 2: Quality of life measures at different time points

Analysis 7.3. Comparison 7: Secondary outcome: health‐related quality of life, Outcome 3: SF‐36v2: PCS by different components at baseline and at 6 months

Analysis 7.4. Comparison 7: Secondary outcome: health‐related quality of life, Outcome 4: SF‐36v2: MCS by different components at baseline and 6 months

Analysis 7.5. Comparison 7: Secondary outcome: health‐related quality of life, Outcome 5: Anxiety by different results at 1 and 6 months

Analysis 8.1. Comparison 8: Secondary outcome: lung cancer by stages at different time points, Outcome 1: baseline

Analysis 8.2. Comparison 8: Secondary outcome: lung cancer by stages at different time points, Outcome 2: at 1 year

Analysis 8.3. Comparison 8: Secondary outcome: lung cancer by stages at different time points, Outcome 3: At year 2

Analysis 8.4. Comparison 8: Secondary outcome: lung cancer by stages at different time points, Outcome 4: 5 to < 10 years

Analysis 8.5. Comparison 8: Secondary outcome: lung cancer by stages at different time points, Outcome 5: ≥ 10 years

Analysis 9.1. Comparison 9: Secondary outcome: lung cancer histology at different time points, Outcome 1: Histology types at baseline

Analysis 9.2. Comparison 9: Secondary outcome: lung cancer histology at different time points, Outcome 2: Histology at year 1

Analysis 9.3. Comparison 9: Secondary outcome: lung cancer histology at different time points, Outcome 3: Histology at follow‐up

Analysis 10.1. Comparison 10: Secondary outcome: other outcomes, Outcome 1: contamination

Summary of findings 1. Low‐dose computed tomography (LDCT) screening compared to no LDCT screening for lung cancer‐related mortality

Patient or population: healthy adults
Setting: hospitals or screening centres
Intervention: LDCT screening
Comparison: no LDCT screening

| Outcomes | № of participants (trials) | Certainty of the evidence (GRADE) | Relative effect (95% CI) | Anticipated absolute effects* (95% CI), trial population: risk with no screening | Anticipated absolute effects* (95% CI), trial population: risk difference |
|---|---|---|---|---|---|
| Lung cancer‐related mortality ‐ planned time points. Follow‐up: 6 years to 10 years from randomisation | 91,122 (8 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.79 (0.72 to 0.87) | 21 per 1000 | 4 fewer per 1000 people screened (3 fewer to 6 fewer) |
| All‐cause mortality ‐ planned time points. Follow‐up: 6 years to 10 years from randomisation | 91,107 (8 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.95 (0.91 to 0.99) | 89 per 1000 | 4 fewer per 1000 people screened (1 fewer to 8 fewer) |
| Overdiagnosis. Time point: ≥ 10 years from randomisation excluding CXR trials | 28,656 (5 RCTs) | ⊕⊕⊝⊝ Low (a, c) | RD 0.18 (‐0.00 to 0.36) | Not reported | 180 more lung cancers overdiagnosed per 1000 lung cancers detected (0 more to 360 more) |
| Number of invasive tests. Time point: 3 years to 10 years from randomisation | 60,003 (3 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 2.60 (2.41 to 2.80) | 31 per 1000 | 49 more per 1000 people screened (45 more to 55 more) |
| Any death postsurgery. Time point: 6 years to 9 years from randomisation | 409 (2 RCTs) | ⊕⊕⊕⊝ Moderate (a) | RR 0.68 (0.24 to 1.94) | 48 per 1000 | 15 fewer per 1000 people screened (37 fewer to 45 more) |
| Health‐related quality of life ‐ anxiety. Time point: 10 months to 5 years from randomisation. Measured by different scales | 8153 (3 RCTs) | ⊕⊕⊝⊝ Low (a, b) | SMD ‐0.43 (‐0.59 to ‐0.27) | Not reported | SMD 0.43 lower (0.27 to 0.59 lower) |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
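The footnote above can be illustrated with a short calculation. This is a minimal sketch, assuming the risk difference is simply the baseline risk multiplied by (RR − 1) and rounded to whole people per 1000; the function name `absolute_effect_per_1000` is hypothetical and not part of the review. The published figures were presumably derived from unrounded baseline risks, so some rows (for example, invasive tests) can differ from this sketch by about 1 per 1000.

```python
def absolute_effect_per_1000(baseline_per_1000, rr, rr_low, rr_high):
    """Return (point, from_lower_ci, from_upper_ci) risk differences per 1000.

    Negative values mean fewer events with screening; positive values mean more.
    Assumption: difference = baseline risk * (risk ratio - 1), rounded to an integer.
    """
    def diff(ratio):
        return round(baseline_per_1000 * (ratio - 1.0))

    return diff(rr), diff(rr_low), diff(rr_high)


# Lung cancer-related mortality: 21 per 1000, RR 0.79 (0.72 to 0.87)
lung = absolute_effect_per_1000(21, 0.79, 0.72, 0.87)
print(lung)  # (-4, -6, -3): 4 fewer (3 fewer to 6 fewer)

# All-cause mortality: 89 per 1000, RR 0.95 (0.91 to 0.99)
all_cause = absolute_effect_per_1000(89, 0.95, 0.91, 0.99)
print(all_cause)  # (-4, -8, -1): 4 fewer (1 fewer to 8 fewer)
```

Both results agree with the corresponding rows of the table, which suggests the tabulated risk differences follow this arithmetic.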

CI: confidence interval; CXR: chest x‐ray; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; RD: risk difference; SMD: standardised mean difference

GRADE Working Group grades of evidence
High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

 

 

 

 

 

 

a: Downgraded one level due to high risk of "other bias" in Becker 2020, De Koning 2020, Infante 2015, and Pastorino 2012.
b: Downgraded one level due to indirectness: only a subset of the trial population was included for quality assessment.
c: Downgraded one level due to heterogeneity.

Table 1. Nodule management 

 

Interpretation

Management 

Aberle 2011

Positive scan: findings suspicious of lung cancer, such as non‐calcified nodule ≥ 4 mm, lung consolidation, or obstructive atelectasis, nodule enlargement, and nodules with suspicious changes in attenuation

No trial‐wide algorithm

Becker 2020

Positive scan: any nodule ≥ 5 mm

  • No abnormality or nodule < 5 mm: routine screening

  • Nodules 5 mm to 7 mm: early recall (6 months)

  • Nodules 8 mm to 10 mm: earlier recall (3 months)

  • Nodules > 10 mm: immediate recall

On recall scans

  • > 600 VDT: back to routine scans

  • 400 VDT to 600 VDT: 6 months early recall

  • < 7.5 mm: early recall 6 months

  • ≥ 7.5 mm to 10 mm: early recall at 3 months

  • ≤ 400 VDT or > 10 mm diameter: immediate recall

 

Blanchon 2007

Positive scan: non‐calcified nodule > 5 mm

  • Non‐calcified nodule ≤ 5 mm: repeat LDCT in 1 year

  • Non‐calcified nodule > 5 mm and < 10 mm: repeat LDCT in 3 months

 

If no change: repeat scan at 6 months, 12 months and 24 months from baseline. If growth at any time: histological diagnosis.  

 

  • Non‐calcified nodule ≥ 10 mm: CT with contrast versus PET versus histological diagnosis discussed in MDM with pulmonary oncologist, radiologist and thoracic surgeon

De Koning 2020

Classification of non‐calcified nodules:

  • NODCAT 1: benign nodule (fat/benign calcifications) or other benign characteristics

  • NODCAT 2: any nodule, smaller than NODCAT 3 and no characteristics of NODCAT 1

  • NODCAT 3: solid (50 mm3 to 500 mm3), solid/pleural based (5 mm dmin to 10 mm dmin), partial solid/non‐solid component (≥ 8 mm dmean), partial solid/solid component (50 mm3 to 500 mm3), non‐solid (≥ 8 mm dmean)

  • NODCAT 4: solid (> 500 mm3), solid/pleural based (> 10 mm dmin), partial solid/solid component (> 500 mm3)

Classification of nodules based on growth:

  • GROWCAT A: VDT > 600 days

  • GROWCAT B: VDT 400 days to 600 days

  • GROWCAT C: VDT < 400 days or a new solid component in a non‐solid lesion

 

Management of non‐calcified nodules based on baseline screening

  • NODCAT 1: negative test, annual CT

  • NODCAT 2: negative test, annual CT

  • NODCAT 3: indeterminate test, 3‐month follow‐up CT

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT: positive test, histological diagnosis            

Management protocol for non‐calcified nodules at incidence screening

  • NODCAT 1: negative test, CT in year 4

  • NODCAT 2: indeterminate test, CT in year 3

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT C: positive test, histological diagnosis required

At year 4

  • NODCAT 1: negative test, CT in year 6

  • NODCAT 2: indeterminate test, CT after 1 year

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist

  • GROWCAT A: negative test, CT in year 6

  • GROWCAT B: indeterminate test, repeat CT after 1 year

  • GROWCAT C:  positive test, refer to pulmonologist 

 At year 6

  • NODCAT 1: negative test, end of screening

  • NODCAT 2: indeterminate test, end of screening

  • NODCAT 3: indeterminate test, CT after 6‐8 weeks

  • NODCAT 4: positive test, refer to pulmonologist for work up and diagnosis

  • GROWCAT A: negative test, end of screening

  • GROWCAT B: indeterminate test, CT after 1 year

  • GROWCAT C: positive test, refer to pulmonologist                                                         

Preoperative biopsy was not routine.

Suspicious nodules were removed by VATS or thoracotomy with wedge resection and frozen section.
Lobectomies were performed only for central nodules that could not be approached by wedge resection.
If cancer was diagnosed by VATS, the procedure was converted to an open thoracotomy with sampling of lobar, interlobar, hilar and mediastinal lymph nodes, because VATS resection for lung cancer was not fully implemented in the Netherlands at the time of the trial. Mediastinoscopy was performed before proceeding to VATS or thoracotomy in subjects with mediastinal lymph nodes > 10 mm in short axis and/or positive nodes.

Field 2021

Classification of nodules:

  • Cat 1: nodules containing fat or with a benign pattern of calcification are considered benign. Solid nodules < 15 mm3 or, if pleural or juxtapleural, < 3 mm in maximal diameter.

  • Cat 2: solid intraparenchymal nodules with a volume of 15 mm3 to 49 mm3. Pleural or juxtapleural nodules with a maximal diameter of 3.1 mm to 4.9 mm. Part solid nodules with a maximal non‐solid component of < 5 mm diameter or where the solid component volume is < 15 mm3.

  • Cat 3: solid intraparenchymal nodules with a volume of 50 mm3 to 500 mm3. Pleural or juxtapleural nodules with a maximal diameter of 5 mm to 9.9 mm. Non‐solid nodules with a maximal diameter of > 5 mm or part solid nodules where solid component volume is 15 mm3 to 500 mm3.

  • Cat 4: solid intraparenchymal nodules with a volume > 500 mm3, pleural or juxtapleural nodules with a maximal diameter of ≥ 10 mm. Part solid nodules with a solid component with a volume > 500 mm3.

  • Cat 1: nil further scans

  • Cat 2: follow‐up CT in 1 year and assessed for VDT or new solid component in non‐solid nodule

  • If no growth, stop follow‐up

  • If growth: MDT assessment

  • Cat 3: follow‐up CT in 3 months and assessed for VDT or new solid component in non‐solid nodule. If no growth then CT in 9 months  

  • If VDT > 400 days, stop follow‐up

  • If VDT ≤ 400 days then MDT assessment

  • If growth then MDT assessment

  • Cat 4: MDT assessment

Gohagan 2005

  • Positive scan: any non‐calcified nodule ≥ 4 mm

Other abnormalities could also be considered suspicious for lung cancer at the discretion of the radiologist. 

No trial‐wide algorithm for management

  • Participants with a positive test were telephoned and urged to seek medical follow‐up, with additional follow‐up calls at 4 weeks +/‐ 8 weeks if follow‐up had not begun by the 4‐week phone call

  • Referrals to specialists for follow‐up of positive screening; results were provided if requested by the participant

Infante 2015

  • Positive scan: non‐calcified pulmonary nodules ≥ 10 mm in diameter or smaller but showing spiculated margins, or non‐nodular lesions such as a hilar mass, focal ground glass opacities, major atelectasis, endobronchial lesions, mediastinal adenopathy, pleural effusion or pleural masses

No set trial‐wide algorithm for management

  • If lesion smooth and < 10 mm in size: LDCT at 3, 6, and 12 months

    • If no change occurs: follow‐up after 1 year

  • Non‐smooth lesion ≥ 6 mm but ≤ 10 mm: oral antibiotics and new HRCT after 6 to 8 weeks 

    • If no regression occurs: evaluation on a case‐by‐case basis as to the opportunity to follow the lesion or to perform invasive procedures to obtain a tissue diagnosis

  • Lesion ≥ 10 mm but ≤ 20 mm: oral antibiotics and new HRCT after 6 to 8 weeks

    • If no regression occurs: PET. If PET is positive: tissue diagnosis. If PET is negative: close follow‐up

  • Lesion ≥ 20 mm: discretional oral antibiotics and new HRCT or standard CT + PET

    • If PET positive: tissue diagnosis

    • If PET negative: close follow‐up 

  • Focal ground glass opacities: oral antibiotics and new HRCT after 6 to 8 weeks. Evaluation on case‐by‐case basis as to opportunity to follow lesion or obtain tissue diagnosis based on the size, number of lesions, location and ratio of any solid versus non‐solid component

LaRocca 2002

  • Positive scan: ≥ 5 mm nodule with suspicious features

  • Abnormal mass > 10 mm in diameter or 5 mm to 10 mm in diameter and highly suspicious for malignancy: CXR and tissue diagnosis is obtained

  • If the abnormal mass ≤ 10 mm in diameter: thin section high resolution image of the mass is obtained

  • If this image is normal or benign, annual spiral CT scanning is continued.

  • If the image is indeterminate, a repeat high‐resolution scan is performed in 3 months.

  • If the image is unchanged at 3 months, annual spiral CT scanning is continued.

  • If the mass is larger at 3 months: CXR and tissue diagnosis is performed.

Paci 2017

  • Positive scan: at least one non‐calcified nodule ≥ 5 mm or a non‐solid nodule ≥ 10 mm or the presence of a part‐solid nodule

  • Solid non‐calcified nodule ≥ 8 mm and non‐solid non‐calcified nodule > 10 mm: PET

    • If PET positive: FNA recommended (if FNA negative or indeterminate: 3 month follow‐up scan)

    • If PET negative: 3 month follow‐up scan

    • All cases with no nodule growth at follow‐up exam were invited to annual repeat CT scan

  • Solid or part‐solid non‐calcified nodules with a diameter of 5 mm to 7 mm: follow‐up LDCT after 3 months

    • If significant growth (increase ≥ 1 mm in mean diameter in a solid nodule or increase of solid component in a part solid nodule): considered potentially malignant

    • If considered potentially malignant and peripheral nodule, FDG PET or CT‐guided FNA arranged

    • If considered potentially malignant and deep nodule, FDG PET or bronchoscopy arranged.

    • Bronchoscopy also performed for airway abnormalities

  • If screening test revealed focal abnormalities consistent with inflammatory disease: antibiotic therapy and 1 month follow‐up CT recommended

  • In case of complete resolution, the subject was sent to annual repeat screening

  • In case of partial or lack of resolution, further 2‐month follow‐up CT performed

  • All subjects with FNA evidence of malignancy underwent a staging CT (CT chest/abdominal/head and neck exam with IV contrast).

 

Pastorino 2012

  • Negative nodule: non‐calcified nodule < 60 mm3 or nodules with fat or benign pattern of calcification

  • Indeterminate: non‐calcified nodules 60 mm3 to 250 mm3

  • Positive result: non‐calcified nodules > 250 mm3

  • Positive result was also based on findings such as non‐calcified hilar or mediastinal lymphadenopathy, atelectasis, consolidation, pleural findings

  • Solid lesions < 60 mm3 in volume (diameter < 4.8 mm): repeat LDCT after 1 or 2 years

  • Nodules with a volume of 60 mm3 to 250 mm3 (5 mm to 8 mm in diameter): underwent repeat CT exam after 3 months

  • Nodules with a volume > 250 mm3: additional work‐up including PET or lung biopsy

  • Volumetric growth was used on serial imaging with significant growth considered ≥ 25% after 3‐month interval

    • If no growth, back to planned screening intervals

  • Ground glass opacities were conservatively managed.

Wille 2016

  • Category 1: nodules ≤ 15 mm in maximal diameter with benign characteristics or ≤ 20 mm for calcified nodules

  • Category 2: nodules < 5 mm

  • Category 3: nodules 5 mm to 15 mm not classified as benign

  • Category 4: nodules > 15 mm or suspicious morphology

  • Category 5: growing nodules (increase in volume ≥ 25%)

  • Category 1 and 2: nil further action

  • Category 3: indeterminate: repeat scan in 3 months

  • Category 4 and 5: diagnostic investigation

CT: computed tomography; CXR: chest x‐ray; dmean: mean diameter; dmin: minimal diameter; FDG PET: fluorodeoxyglucose positron emission tomography; FNA: fine needle aspiration; HRCT: high‐resolution computed tomography; IV: intravenous; LDCT: low‐dose computed tomography; MDM: multidisciplinary meeting; MDT: multidisciplinary team; PET: positron emission tomography; VATS: video‐assisted thoracoscopic surgery; VDT: volume doubling time.
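Several protocols in Table 1 rely on volume doubling time (VDT) and on volume thresholds quoted alongside approximate diameters. The sketch below shows how these quantities relate, under two assumptions that are not stated in the table: the nodule grows exponentially between scans, and it is roughly spherical. The function names are hypothetical, and the `growcat` thresholds are copied from the De Koning 2020 growth categories (with exactly 400 days treated as GROWCAT B).

```python
import math


def volume_doubling_time(v1_mm3, v2_mm3, days_between):
    """VDT in days, assuming exponential growth: days * ln(2) / ln(v2 / v1)."""
    if v2_mm3 <= v1_mm3:
        return math.inf  # no measurable growth between the two scans
    return days_between * math.log(2) / math.log(v2_mm3 / v1_mm3)


def growcat(vdt_days, new_solid_component=False):
    """Growth category as listed for De Koning 2020 in Table 1."""
    if new_solid_component or vdt_days < 400:
        return "C"
    if vdt_days <= 600:
        return "B"
    return "A"


def sphere_diameter_mm(volume_mm3):
    """Diameter of a sphere with the given volume: (6V / pi) ** (1/3)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


# Hypothetical nodule that doubles from 100 mm3 to 200 mm3 in 90 days
vdt = volume_doubling_time(100, 200, 90)
print(round(vdt), growcat(vdt))  # 90 C

# Volume thresholds quoted for Pastorino 2012, expressed as diameters
print(round(sphere_diameter_mm(60), 1))   # 4.9
print(round(sphere_diameter_mm(250), 1))  # 7.8
```

The converted diameters are close to the 5 mm to 8 mm range quoted for the 60 mm3 to 250 mm3 band, which is what the spherical approximation would predict.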

Table 2. Invasive tests in non‐lung cancer‐related disease

 

Invasive tests in non‐lung cancer‐related disease

Aberle 2011

At 6.5 year follow‐up

LDCT group: 457 procedures for non‐lung cancer‐related disease (164 thoracotomies/thoracoscopies/mediastinoscopies; 227 bronchoscopies; 66 needle biopsies) out of 17,053 positive screening results over 3 rounds with complete diagnostic information

CXR group: 115 procedures for benign disease (45 thoracotomies/thoracoscopies/mediastinoscopies; 46 bronchoscopies; 24 needle biopsies) out of 4674 positive screening results over 3 rounds with complete information

Becker 2020

Baseline LDCT: 30 biopsies performed in benign disease (at least 5 thoracotomies, 2 VATS thoracoscopies, and 1 bronchoscopy)

Year 1 LDCT: 19 biopsies performed in benign disease

Year 2 LDCT: 12 biopsies performed in benign disease

Year 3 LDCT: 16 biopsies performed in benign disease

Year 4 LDCT: 13 biopsies performed in benign disease

Blanchon 2007

Trial arm not specified. Baseline: 3 thoracostomies performed for benign disease

De Koning 2020

Baseline LDCT: 27.2% of invasive procedures performed in benign disease

Between 2004 and 2008: 215 participants had surgery. 2/17 mediastinoscopies were in benign disease; 47/198 lung surgeries (thoracotomies +/‐ VATS) were in benign disease.

Field 2021

Baseline LDCT: 7 participants had needle biopsies, 1 EBUS bronchoscopy, 4 referrals for surgery completed for benign disease

Gohagan 2005

Baseline LDCT: 16 bronchoscopies, 19 lung biopsies or resection, and 23 any invasive procedures (including biopsy/resection, bronchoscopy, thoracotomy, thoracoscopy, mediastinotomy, mediastinoscopy) performed for benign disease

Baseline CXR: 5 bronchoscopies, 6 lung biopsies or resections, and 8 procedures (including biopsy/resection, bronchoscopy, thoracotomy, thoracoscopy, mediastinotomy, mediastinoscopy) performed for benign disease

Infante 2015

At 8.35 years median follow‐up

LDCT group: 17 surgeries for benign disease (3 mediastinoscopies, 7 VATS wedge resections, 6 open wedge resections, 1 open segmentectomy). 7 surgeries for other conditions (reported as 1 open biopsy, 1 extrapleural pneumonectomy for mesothelioma, 2 oesophagectomies for cancer, 1 oesophageal leiomyoma VATS resection, 2 VATS thymectomies, 1 lobectomy for aspergilloma)

Control arm: 5 surgeries for benign disease (2 VATS biopsies, 2 VATS wedge resection, 1 open wedge resection); 2 surgeries for other conditions (1 open lung biopsy for hilar lymphoma, 1 VATS thymectomy)

LaRocca 2002

Not available

Paci 2017

Baseline LDCT: 1 FNA biopsy and 1 (5.5% of all surgical resections) surgical resection for benign disease reported

Pastorino 2012

Median 6 annual LDCTs: 1 invasive diagnostic procedure (transthoracic needle aspiration, fibro bronchoscopy, transbronchial needle aspiration), 0 anatomical (lobectomy or segmentectomy) resections, 0 non‐anatomical resections (wedge resection) performed for benign disease

Median 3 biennial LDCTs: 3 invasive diagnostic procedures (transthoracic needle aspiration, fibro bronchoscopy, transbronchial needle aspiration), 0 anatomical (lobectomy or segmentectomy) resections, 1 non‐anatomical resection (wedge resection) performed for benign disease

Wille 2016

Baseline LDCT: 1 mediastinoscopy, 3 bronchoscopy with biopsy, 1 EUS, 2 EBUS, 2 VATS, 1 percutaneous biopsy performed for benign disease

CXR: chest x‐ray; EBUS: endobronchial ultrasound; EUS: endoscopic ultrasound;  FNA: fine needle aspirate; LDCT: low‐dose computed tomography; VATS: video‐assisted thoracoscopic surgery

Table 3. Recall rates, false positives and overdiagnosis 

 

Recall rates

Overall baseline LDCT recall rate = 18% (8078/44,920)

False positives

Overall false‐positive rate from baseline LDCT = 21% (8874/41,857)

Overdiagnosis

Aberle 2011

All chest CTs performed postbaseline LDCT: 20%

(5153/26,309)

All chest CTs performed postbaseline CXR in control group: 6%

Baseline

LDCT: 6911/26,309 (26%)

Baseline CXR: 2243/26,035 (7%)

1‐year LDCT: 6728/24,715 (27%)

1‐year CXR: 1416/24,089 (6%)

2‐year LDCT: 3838/24,102 (16%)

2‐year CXR: 1094/23,346 (5%)

At 6.5 years post‐randomisation:

  • 11% (95% CI 3.2 to 18.2) from a public health perspective and 18.5% (95% CI 5.4 to 30.6) from a clinical perspective of all lung cancers

  • 67.6% (95% CI 53.5 to 78.5) from a public health perspective and 78.9% (95% CI 62.2 to 93.5) from a clinical perspective of all BAC

 

At 11.3 years post‐randomisation:

  • 3.1% of all lung cancers and 79% of BAC from a public health perspective

Becker 2020

Baseline LDCT: 22% (451/2028)

1‐year LDCT: 5%

2‐year LDCT: 4%

3‐year LDCT: 6%

4‐year LDCT: 5%

Baseline LDCT: 426/2028 (21%)

1‐year LDCT: 77/1892 (4%)

2‐year LDCT: 62/1849 (3%)

3‐year LDCT: 94/1826 (5%)

4‐year LDCT: 88/1810 (5%)

At 9.7 years post‐randomisation:

  • 17.8% (95% CI ‐7.4 to 44.7) from a public health perspective and 25.4% (95% CI ‐11.3 to 64.3) from a clinical perspective of all lung cancers

  • 90% (95% CI 54.3 to 164.4) from a public health perspective and 112.5% (95% CI 68.2 to 113.1) from a clinical perspective of all BAC

Blanchon 2007

Not available

Baseline LDCT: 73/336 (22%)

Baseline CXR: 14/285 (5%)

Not available

De Koning 2020

Baseline LDCT: 19% (1438/7557)

1‐year LDCT: 19%

Baseline LDCT: 107/7557 (1%)

1‐year LDCT: 64/7295 (1%)

3‐year LDCT: 276/6922 (4%)

5.5‐year LDCT:  62/5279 (1%) 

Not available

Field 2021

Baseline LDCT: 5% (103/1994)

Baseline LDCT: *909/1994 (46%) 

**72/1994 (4%) 

*when defined as needing any work‐up 

**when defined as referred to MDT

Estimated 15% of all lung cancers

Gohagan 2005

Baseline LDCT: 15% (232/1586)

Baseline CXR: 5% (76/1550)

Overall post‐LDCT: 8%

Overall post‐CXR in control group: 3%

Baseline LDCT: 286/1586 (18%)

Baseline CXR: 139/1550 (9%)

Not available

Infante 2015

Baseline LDCT: 10% (128/1276)

Not available

Not available

LaRocca 2002

Not available

Not available

Not available

Paci 2017

Baseline LDCT: 23% (366/1406)

1‐year LDCT: 14%

2‐year LDCT: 13%

3‐year LDCT: 11%

Not available

At 11.3 years post‐randomisation, estimated overdiagnosis rates reported as ‐4% using public health perspective and ‐10% from a clinical perspective

Pastorino 2012

Baseline LDCT: 15% in annual group, 14% in (284/2303) biennial group

1‐year LDCT: 3% in annual group, 3% in biennial group

2‐year LDCT: 5% in annual group, 5% in biennial group

3‐year LDCT: 3% in annual group, 7% in biennial group

4‐year LDCT: 2% in annual group, 3% in biennial group

5‐year LDCT: 1% in annual group, 7% in biennial group

6‐year LDCT: 4% in annual group, 5% in biennial group

Median 6 annual LDCT: 54/1152 (5%)

Median 3 biennial LDCT: 34/1151 (3%)

Not available

Wille 2016

Baseline LDCT: 8% (155/2047)

1‐year LDCT: 1%

2‐year LDCT: 1%

3‐year LDCT: 1%

4‐year LDCT: 1%

Baseline LDCT: 162/2047 (8%)

1‐year LDCT: 34/1976 (2%)

2‐year LDCT: 39/1944 (2%)

3‐year LDCT: 32/1982 (2%)

4‐year LDCT: 35/1851 (2%)

Estimated 67.2% of lung cancers (95% CI 37.1 to 95.4) from unplanned posthoc analysis

BAC: bronchioalveolar carcinoma; CI: confidence interval; CT: computed tomography; CXR: chest x‐ray; LDCT: low‐dose computed tomography; MDT: multidisciplinary team. 
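The baseline recall rates in Table 3 are proportions of screened participants who were recalled. As a minimal sketch, the figures that are reported with both a count and a percentage can be recomputed as follows; `rate_percent` is a hypothetical helper, and rounding to the nearest whole percent is an assumption about how the table was prepared.

```python
def rate_percent(events, screened):
    """Proportion of screened participants with the event, as a whole percent."""
    return round(100 * events / screened)


# (recalled, screened, percentage reported in Table 3) at baseline LDCT
baseline_recall = {
    "Overall": (8078, 44920, 18),
    "Becker 2020": (451, 2028, 22),
    "De Koning 2020": (1438, 7557, 19),
    "Field 2021": (103, 1994, 5),
    "Gohagan 2005": (232, 1586, 15),
    "Infante 2015": (128, 1276, 10),
    "Wille 2016": (155, 2047, 8),
}

mismatches = []
for trial, (recalled, screened, reported) in baseline_recall.items():
    computed = rate_percent(recalled, screened)
    print(f"{trial}: {computed}% (reported {reported}%)")
    if computed != reported:
        mismatches.append(trial)

print(mismatches)  # [] when every recomputed rate matches the table
```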

Table 4. Response, adherence and contamination rates

 

Response rates to recruitment 

Adherence to screening 

Contamination

Aberle 2011

Not available

Overall adherence to all 3 screening rounds: 95% of participants completed LDCT scan, 93% in the control group completed CXR

Not available

Becker 2020

  • 292,440 people received questionnaires

  • 95,797 people responded

  • 4913 people met eligibility criteria

  • 4052 people were enrolled and randomised to the trial (1% of those who received a questionnaire, and 4% of respondents)

 

 

 

 

 

Baseline: almost 100% completed LDCT scan.

1‐year: 95% completed LDCT scan.

2‐year: 93% completed LDCT scan.

3‐year: 93% completed LDCT scan.

4‐year: 94% completed LDCT scan. 

10 years postrandomisation: 264 participants in the control arm had received a CT.

Blanchon 2007

  • 830 eligible people were approached

  • 765 people consented to be randomised (92% of eligible people) 

 

Baseline: 86% participants completed LDCT scan, 75% in control arm completed CXR.

At baseline: 6 participants in the control arm inadvertently received a LDCT. 

De Koning 2020

  • 606,409 people received the first questionnaire

  • 150,920 responded to the questionnaire

  • 30,959 people were eligible and invited to participate

  • 15,822 people completed second questionnaire and were included and randomised (3% of people who received the first questionnaire and 51% of eligible respondents)

Baseline: 95% of participants completed LDCT scan

1 year: 97% of participants completed LDCT scan

3 years: 95% completed LDCT scan

5.5 years: 78% of participants completed LDCT scan

At 2 years: 3.6% of participants in the control arm had received CT for any reason

Field 2021

  • 247,354 people invited to participate in study

  • 75,958 responded positively to invitations

  • 8729 of respondents assessed as high risk

  • 5967 responded to second questionnaire

  • 4868 invited to recruitment centre

  • 4152 attended recruitment centre

  • 4061 consented to participation 

  • 4055 randomised to trial (2% of those invited and 46% of high‐risk respondents)

Baseline: 98% of participants completed LDCT scan

Not available

Gohagan 2005

  • 653,417 people mailed information packages

  • 12,270 people contacted screening centre and underwent eligibility assessment

  • 4828 people eligible for trial

  • 3409 people were randomised, however 91 participants subsequently found to be ineligible 

  • 3318 participants randomised and included in analysis (1% of people who received mail packages and 27% who were screened for eligibility)

Baseline: 95% of participants completed LDCT scan, 93% of the control group completed CXR

1‐year: 86% of participants completed LDCT scan, 80% of the control group completed CXR 

Contamination assessed by random sample of participants

At baseline: 5% of respondents in the intervention arm had received a CXR for medical or screening purposes. 0.9% of respondents in the control arm had received a CT for medical or screening purposes.

At 1 year: 10% of participants in the intervention arm had received a CXR for medical or screening purposes and 1.3% of respondents in the control arm had received a CT for medical or screening purposes. 

Infante 2015

Not available

Not available

3 years post‐randomisation:

  • intervention arm: 74 extra CT scans, 233 extra CXRs

  • control arm: 68 extra CT scans, 209 extra CXRs

The trial did not specify whether these scans were for screening purposes, only that they were performed outside the protocol.

LaRocca 2002

  • 3418 people completed screening questionnaires

  • 904 participants completed pre‐screening baseline CXR

  • 871 participants randomised to trial (25% of people who completed screening questionnaire)

Not available

Not available

Paci 2017

  • 71,232 people sent letters

  • 17,055 people responded to letters

  • 3206 people were eligible and randomised (5% of people who received letters and 19% of respondents)

Baseline: 87% completed LDCT scan

1 year: 85% completed LDCT scan

2 years: 82% completed scan

3 years: 80% completed scan

 

Not available

Pastorino 2012

Not available

Baseline: 97% of annual group completed scan, 97% of biennial group completed scan

1 year: 97% of the annual group and 97% of the biennial group completed the scan

2 years: 98% of the annual group and 95% of the biennial group completed the scan

3 years: 97% of the annual group and 97% of the biennial group completed the scan

4 years: 96% of the annual group and 92% of the biennial group completed the scan

5 years: 79% of the annual group and 98% of the biennial group completed the scan

6 years: 54% of the annual group and 77% of the biennial group completed the scan

10 years post‐randomisation: 21 of 1723 participants in the control arm had received a LDCT

Wille 2016

  • 5861 people assessed for eligibility

  • 4104 people randomised to the trial (70% of people assessed)

Not available

After 5 years post‐randomisation: 0 cases of contamination in the intervention arm; 3 cases in the control arm

CT: computed tomography; CXR: chest x‐ray; LDCT: low‐dose computed tomography
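The response rates in Table 4 express the number randomised as a share of an earlier recruitment step. The sketch below recomputes a few of them from the counts listed above; the helper `share_percent` and the tuple layout are hypothetical, and whole-percent rounding is assumed.

```python
def share_percent(part, whole):
    """Share of `whole` represented by `part`, as a whole percent."""
    return round(100 * part / whole)


# (trial, randomised, denominator description, denominator, reported percent)
recruitment = [
    ("Becker 2020", 4052, "received a questionnaire", 292440, 1),
    ("Becker 2020", 4052, "respondents", 95797, 4),
    ("Blanchon 2007", 765, "eligible people approached", 830, 92),
    ("De Koning 2020", 15822, "received the first questionnaire", 606409, 3),
    ("De Koning 2020", 15822, "eligible respondents", 30959, 51),
    ("Field 2021", 4055, "invited", 247354, 2),
    ("Field 2021", 4055, "high-risk respondents", 8729, 46),
    ("Gohagan 2005", 3318, "screened for eligibility", 12270, 27),
    ("Paci 2017", 3206, "respondents", 17055, 19),
    ("Wille 2016", 4104, "assessed for eligibility", 5861, 70),
]

all_match = True
for trial, randomised, label, denominator, reported in recruitment:
    computed = share_percent(randomised, denominator)
    print(f"{trial}: {computed}% of {label} (reported {reported}%)")
    all_match = all_match and computed == reported

print(all_match)
```

Laying the funnel out this way makes clear how small a fraction of the people initially contacted ended up randomised in the mail-based recruitment trials.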

Table 5. Interval cancers and incidental findings 

 

Interval cancers 

Incidental findings

Aberle 2011

Postbaseline LDCT: 18 lung cancers

Post‐1 year LDCT: 10 lung cancers

Baseline LDCT data from non‐ACRIN centres (N = 17,309): 2625 cardiovascular abnormalities, 221 thyroid abnormalities, 419 adrenal abnormalities, 780 renal abnormalities, and 1064 hepatobiliary abnormalities

Becker 2020

Postbaseline LDCT: 1 lung cancer

Post‐1‐year LDCT: 0 lung cancers 

Post‐2‐year LDCT: 2 lung cancers

Post‐3‐year LDCT: 1 lung cancer

Post‐4‐year LDCT: 2 lung cancers

Not available

Blanchon 2007

Not available

Baseline LDCT: 19 severe emphysema, 63 bronchiectasis, and 18 mediastinal findings

Baseline CXR in control group: 5 severe emphysema, 2 bronchiectasis, and 6 mediastinal findings

De Koning 2020

After 3 rounds of LDCT screening: 35 interval lung cancers

Baseline LDCT data from one centre (N = 1929): 76 liver findings, 53 kidney findings, 9 thyroid findings, 2 mediastinal findings, 1 adrenal finding, 1 breast finding, 1 colon finding, and 1 perineural cyst

Field 2021

Not available

Baseline LDCT: 4 aortic dilatations, 5 severe aortic valve calcifications, 4 mediastinal masses, 6 mediastinal or hilar lymphadenopathy, 41 pneumonias, 5 bronchiectasis, 8 pleural thickening, 7 smoking‐related interstitial lung diseases, 9 severe emphysemas, 6 unspecified interstitial fibrosing lung disease, 2 nonspecific interstitial pneumonias, 12 usual interstitial pneumonias, 1 sarcoidosis, 2 oesophageal thickening or dilatation, 1 breast mass, 2 lobar collapse, 1 biliary dilatation, 3 adrenal masses, 1 liver cirrhosis, 1 hydronephrosis, 1 liver mass, 1 pancreatic cyst, 3 renal masses, 1 splenomegaly, and 1 thyroid mass

Gohagan 2005

1 year post‐randomisation: 2 lung cancers in the LDCT group and 2 lung cancers in the control group

Not available

Infante 2015

Not available

Baseline LDCT: 1 lymphoma, 1 oesophageal carcinoma, 1 malignant mesothelioma, 1 colon cancer with liver metastasis, and 2 renal cancers with pulmonary metastasis

Baseline CXR in the control group: 1 lymphoma

LaRocca 2002

Not available

Not available

Paci 2017

Overall 6 interval lung cancers reported during 4 years of screening

Not available

Pastorino 2012

  • 4.4 years post‐randomisation: 5 lung cancers reported in annual group and 5 in the biennial group

  • 6.5 years post‐randomisation: 13 lung cancers reported in annual group and 10 in biennial group

Not available

Wille 2016

1 interval lung cancer reported in LDCT group during year 3

After 5 rounds of screening: 140 participants had 148 significant incidental findings (1 larynx, 3 thyroid, 9 gastroesophageal, 16 breast, 5 cardiac, 12 mediastinum, 28 aorta, 18 liver/gallbladder, 6 pancreas, 1 spleen, 2 intestines, 40 kidneys, 2 skin, 3 chest wall, and 2 vertebral column)

CXR: chest x‐ray; LDCT: low‐dose computed tomography

Comparison 1. Primary outcome: lung cancer‐related mortality

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1.1 Lung cancer‐related mortality ‐ planned time points Show forest plot

8

91122

Risk Ratio (M‐H, Random, 95% CI)

0.79 [0.72, 0.87]

1.2 Lung cancer‐related mortality ‐ planned time points Show forest plot

3

Hazard Ratio (IV, Random, 95% CI)

Subtotals only

1.2.1 8 to 10 years

3

10606

Hazard Ratio (IV, Random, 95% CI)

0.93 [0.72, 1.19]

1.3 Lung cancer‐related mortality at different follow‐up time points (including unplanned) Show forest plot

9

Risk Ratio (M‐H, Random, 95% CI)

Subtotals only

1.3.1 5 to 6 years

4

27263

Risk Ratio (M‐H, Random, 95% CI)

0.89 [0.64, 1.24]

1.3.2 > 6 to 8 years

3

73211

Risk Ratio (M‐H, Random, 95% CI)

0.77 [0.69, 0.86]

1.3.3 > 8 to 10 years

6

33700

Risk Ratio (M‐H, Random, 95% CI)

0.79 [0.69, 0.90]

1.3.4 > 10 years

3

72447

Risk Ratio (M‐H, Random, 95% CI)

0.86 [0.75, 0.98]

1.4 Lung cancer‐related mortality by screening arm ‐ planned time points Show forest plot

8

91122

Risk Ratio (M‐H, Random, 95% CI)

0.79 [0.72, 0.87]

1.4.1 usual care

7

37668

Risk Ratio (M‐H, Random, 95% CI)

0.78 [0.69, 0.88]

1.4.2 CXR

1

53454

Risk Ratio (M‐H, Random, 95% CI)

0.80 [0.70, 0.92]

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1.5 Lung cancer‐related mortality – by time postscreening cessation (including unplanned time points) | 9 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 1.5.1 0 to 1 year | 4 | 28044 | Risk Ratio (M‐H, Random, 95% CI) | 0.76 [0.61, 0.94] |
| 1.5.2 2 to 4.5 years | 5 | 79063 | Risk Ratio (M‐H, Random, 95% CI) | 0.82 [0.72, 0.93] |
| 1.5.3 5 to 7 years | 4 | 27067 | Risk Ratio (M‐H, Random, 95% CI) | 0.78 [0.67, 0.90] |
| 1.5.4 > 7 to 10 years | 2 | 56658 | Risk Ratio (M‐H, Random, 95% CI) | 0.92 [0.83, 1.01] |
| 1.6 Lung cancer‐related mortality by screening interval ‐ planned time points | 8 | 91122 | Risk Ratio (M‐H, Random, 95% CI) | 0.79 [0.72, 0.87] |
| 1.6.1 annual ‐ 1 screen | 1 | 3968 | Risk Ratio (M‐H, Random, 95% CI) | 0.65 [0.41, 1.03] |
| 1.6.2 annual ‐ 3 screens | 1 | 53454 | Risk Ratio (M‐H, Random, 95% CI) | 0.80 [0.70, 0.92] |
| 1.6.3 annual ‐ 4 screens | 1 | 3206 | Risk Ratio (M‐H, Random, 95% CI) | 0.71 [0.48, 1.04] |
| 1.6.4 annual ‐ 5 screens | 3 | 10606 | Risk Ratio (M‐H, Random, 95% CI) | 0.93 [0.73, 1.18] |
| 1.6.5 annual ‐ 7 screens | 1 | 2052 | Risk Ratio (M‐H, Random, 95% CI) | 0.69 [0.37, 1.28] |
| 1.6.6 biennial ‐ 4 screens | 1 | 2047 | Risk Ratio (M‐H, Random, 95% CI) | 0.76 [0.42, 1.40] |
| 1.6.7 incremental ‐ interval 4 screens | 1 | 15789 | Risk Ratio (M‐H, Random, 95% CI) | 0.75 [0.62, 0.90] |
| 1.7 Lung cancer‐related mortality by sex ‐ planned time points | 3 | 9944 | Hazard Ratio (IV, Random, 95% CI) | 0.80 [0.55, 1.17] |
| 1.7.1 females | 3 | 4286 | Hazard Ratio (IV, Random, 95% CI) | 0.73 [0.34, 1.56] |
| 1.7.2 males | 2 | 5658 | Hazard Ratio (IV, Random, 95% CI) | 0.76 [0.52, 1.12] |
| 1.8 Lung cancer‐related mortality by sex ‐ planned time points | 5 | 79798 | Risk Ratio (M‐H, Random, 95% CI) | 0.81 [0.73, 0.89] |
| 1.8.1 females | 4 | 26965 | Risk Ratio (M‐H, Random, 95% CI) | 0.71 [0.59, 0.86] |
| 1.8.2 males | 5 | 52833 | Risk Ratio (M‐H, Random, 95% CI) | 0.85 [0.76, 0.95] |
| 1.9 Lung cancer‐related mortality by age ‐ planned time points | 1 | 56452 | Risk Ratio (M‐H, Random, 95% CI) | 0.72 [0.54, 0.95] |
| 1.9.1 < 65 years old | 1 | 39234 | Risk Ratio (M‐H, Random, 95% CI) | 0.82 [0.70, 0.97] |
| 1.9.2 ≥ 65 years old | 1 | 17218 | Risk Ratio (M‐H, Random, 95% CI) | 0.62 [0.52, 0.74] |
| 1.10 Lung cancer related to smoking ‐ latest time point (including unplanned) | 2 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 1.10.1 current smokers at 6.5 years ‐ planned | 1 | 25760 | Risk Ratio (M‐H, Random, 95% CI) | 0.82 [0.70, 0.95] |
| 1.10.2 former smokers at 6.5 years ‐ planned | 1 | 27692 | Risk Ratio (M‐H, Random, 95% CI) | 0.91 [0.74, 1.11] |
| 1.10.3 current smokers at 12.3 years ‐ unplanned | 1 | 25760 | Risk Ratio (M‐H, Random, 95% CI) | 0.89 [0.81, 0.98] |
| 1.10.4 former smokers at 12.3 years ‐ unplanned | 1 | 27692 | Risk Ratio (M‐H, Random, 95% CI) | 1.01 [0.88, 1.15] |
| 1.10.5 < 35 pack‐history | 1 | 2148 | Risk Ratio (M‐H, Random, 95% CI) | 1.26 [0.55, 2.90] |
| 1.10.6 ≥ 35 pack‐history | 1 | 1955 | Risk Ratio (M‐H, Random, 95% CI) | 0.92 [0.54, 1.54] |
| 1.11 Lung cancer‐related mortality by geography ‐ planned time points | 8 | 91122 | Risk Ratio (M‐H, Random, 95% CI) | 0.79 [0.72, 0.87] |
| 1.11.1 Europe | 7 | 37668 | Risk Ratio (M‐H, Random, 95% CI) | 0.78 [0.69, 0.88] |
| 1.11.2 USA | 1 | 53454 | Risk Ratio (M‐H, Random, 95% CI) | 0.80 [0.70, 0.92] |
| 1.12 Nodule management algorithm ‐ planned follow‐up time points | 8 | 91122 | Risk Ratio (M‐H, Random, 95% CI) | 0.79 [0.72, 0.87] |
| 1.12.1 yes | 6 | 35218 | Risk Ratio (M‐H, Random, 95% CI) | 0.75 [0.66, 0.86] |
| 1.12.2 no | 2 | 55904 | Risk Ratio (M‐H, Random, 95% CI) | 0.84 [0.70, 1.01] |
| 1.13 Nodule management criteria ‐ planned follow‐up time points | 8 | 91122 | Risk Ratio (M‐H, Random, 95% CI) | 0.79 [0.72, 0.87] |
| 1.13.1 diameter | 3 | 59110 | Risk Ratio (M‐H, Random, 95% CI) | 0.81 [0.72, 0.92] |
| 1.13.2 volume | 2 | 19888 | Risk Ratio (M‐H, Random, 95% CI) | 0.74 [0.62, 0.88] |
| 1.13.3 diameter and volume | 3 | 12124 | Risk Ratio (M‐H, Random, 95% CI) | 0.79 [0.60, 1.04] |
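
As a reading aid for the risk ratios listed above, the following minimal Python sketch converts a pooled estimate into a relative reduction and checks whether its 95% CI lies entirely on one side of 1. It is an illustration only, not part of the review's methods; the class name `RatioEstimate` and its method names are hypothetical, and the two example rows are transcribed from analyses 1.6 and 1.5.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatioEstimate:
    """One pooled ratio estimate with its 95% confidence interval."""
    label: str
    rr: float
    lower: float
    upper: float

    def relative_reduction_pct(self) -> float:
        """Relative risk reduction in percent (a negative value means an increase)."""
        return (1.0 - self.rr) * 100.0

    def excludes_no_effect(self) -> bool:
        """True when the 95% CI lies entirely below or entirely above 1."""
        return self.upper < 1.0 or self.lower > 1.0


# Values transcribed from the table rows above.
rows = [
    RatioEstimate("1.6 by screening interval, overall", 0.79, 0.72, 0.87),
    RatioEstimate("1.5.4 > 7 to 10 years postscreening", 0.92, 0.83, 1.01),
]

for row in rows:
    print(
        f"{row.label}: {row.relative_reduction_pct():.0f}% relative reduction, "
        f"CI excludes 1: {row.excludes_no_effect()}"
    )
```

A CI that crosses 1, as in row 1.5.4, means the data are compatible with no difference between groups; it does not by itself show that the effect has disappeared.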

Comparison 2. Primary outcome: number of non‐invasive and invasive tests ‐ all time points

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 2.1 Number of invasive tests | 4 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 2.1.1 at baseline | 3 | 59222 | Risk Ratio (M‐H, Random, 95% CI) | 2.90 [2.25, 3.75] |
| 2.1.2 at follow‐up | 3 | 60003 | Risk Ratio (M‐H, Random, 95% CI) | 2.60 [2.41, 2.80] |
| 2.2 Non‐invasive tests | 3 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 2.2.1 at baseline | 3 | 59222 | Risk Ratio (M‐H, Random, 95% CI) | 3.28 [2.40, 4.48] |
| 2.2.2 at follow‐up | 2 | 55905 | Risk Ratio (M‐H, Random, 95% CI) | 3.56 [1.81, 7.01] |
| 2.3 Number of invasive test for false positive | 4 | 63323 | Risk Ratio (M‐H, Random, 95% CI) | 3.84 [3.18, 4.64] |
| 2.3.1 at baseline | 1 | 3318 | Risk Ratio (M‐H, Random, 95% CI) | 3.09 [1.57, 6.07] |
| 2.3.2 at follow‐up | 3 | 60005 | Risk Ratio (M‐H, Random, 95% CI) | 3.91 [3.21, 4.76] |
| 2.4 Death postsurgery | 2 | 409 | Risk Ratio (M‐H, Random, 95% CI) | 0.68 [0.24, 1.94] |
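
The table reports only point estimates and 95% CIs. A rough standard error and p-value can be recovered from those, assuming each interval is a Wald-type interval that is symmetric on the log scale (an assumption of this sketch, not something the table states). The function names below are hypothetical, and because the published limits are rounded to two decimals the results are approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_ratio_se(lower: float, upper: float) -> float:
    """Approximate SE of log(ratio) from a 95% CI, assuming symmetry on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def two_sided_p(ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value for the null hypothesis ratio = 1."""
    z = math.log(ratio) / log_ratio_se(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values transcribed from rows 2.4 and 2.1.2 above: (estimate, lower, upper).
death_postsurgery = (0.68, 0.24, 1.94)
invasive_followup = (2.60, 2.41, 2.80)

for name, (rr, lo, hi) in [("2.4", death_postsurgery), ("2.1.2", invasive_followup)]:
    print(name, round(log_ratio_se(lo, hi), 3), two_sided_p(rr, lo, hi))
```

The wide interval for death postsurgery reflects the small number of participants (409), whereas the narrow interval for invasive tests at follow‐up rests on about 60,000 participants.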

Comparison 3. Secondary outcome: all‐cause mortality

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 3.1 All‐cause mortality ‐ planned time points (latest time points) | 8 | 91107 | Risk Ratio (M‐H, Random, 95% CI) | 0.95 [0.91, 0.99] |
| 3.2 All‐cause mortality ‐ all time points (planned and unplanned) | 9 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 3.2.1 5 to 6 years | 3 | 11474 | Risk Ratio (M‐H, Random, 95% CI) | 1.14 [0.88, 1.47] |
| 3.2.2 > 6 to 8 years | 2 | 57422 | Risk Ratio (M‐H, Random, 95% CI) | 0.94 [0.89, 0.99] |
| 3.2.3 > 8 to 10 years | 6 | 33685 | Risk Ratio (M‐H, Random, 95% CI) | 0.97 [0.91, 1.03] |
| 3.2.4 > 10 years | 2 | 56658 | Risk Ratio (M‐H, Random, 95% CI) | 0.91 [0.76, 1.09] |
| 3.3 All‐cause mortality ‐ planned time points | 3 | | Hazard Ratio (IV, Random, 95% CI) | Subtotals only |
| 3.3.1 any time points | 3 | | Hazard Ratio (IV, Random, 95% CI) | 0.98 [0.87, 1.12] |
| 3.4 All‐cause mortality by sex ‐ planned time points | 3 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 3.4.1 females | 2 | 24514 | Risk Ratio (M‐H, Random, 95% CI) | 0.89 [0.76, 1.03] |
| 3.4.2 males | 3 | 49162 | Risk Ratio (M‐H, Random, 95% CI) | 0.93 [0.80, 1.07] |
| 3.5 Cardiovascular mortality ‐ planned and unplanned | 2 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 3.5.1 8 to 10 years | 2 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 3.5.2 > 10 years ‐ unplanned | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |

Comparison 4. Secondary outcome: lung cancer incidence

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 4.1 Lung cancer incidence ‐ by different time points | 10 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 4.1.1 at baseline | 6 | 79900 | Risk Ratio (M‐H, Random, 95% CI) | 4.98 [2.01, 12.35] |
| 4.1.2 at year 1 | 3 | 73345 | Risk Ratio (M‐H, Random, 95% CI) | 2.12 [1.35, 3.31] |
| 4.1.3 at year 2 | 2 | 57556 | Risk Ratio (M‐H, Random, 95% CI) | 1.88 [1.51, 2.32] |
| 4.1.4 at year 3 | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 1.71 [0.68, 4.35] |
| 4.1.5 at year 4 | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 2.67 [1.05, 6.80] |
| 4.1.6 5 to 7 years | 2 | 57506 | Risk Ratio (M‐H, Random, 95% CI) | 1.13 [1.04, 1.23] |
| 4.1.7 > 7 years | 8 | 88528 | Risk Ratio (M‐H, Random, 95% CI) | 1.17 [1.02, 1.33] |
| 4.2 Lung cancer incidence ‐ by control group at ≥ 10 years | 6 | 82110 | Risk Ratio (M‐H, Random, 95% CI) | 1.15 [0.99, 1.34] |
| 4.2.1 usual care | 5 | 28656 | Risk Ratio (M‐H, Random, 95% CI) | 1.21 [0.99, 1.48] |
| 4.2.2 CXR | 1 | 53454 | Risk Ratio (M‐H, Random, 95% CI) | 1.01 [0.95, 1.08] |
| 4.3 Overdiagnosis at ≥ 10 years | 6 | | Risk Difference (IV, Random, 95% CI) | Subtotals only |
| 4.3.1 usual care | 5 | 28656 | Risk Difference (IV, Random, 95% CI) | 0.18 [‐0.00, 0.36] |
| 4.3.2 CXR | 1 | 53454 | Risk Difference (IV, Random, 95% CI) | 0.01 [‐0.05, 0.07] |
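
Negative limits in these tables are typeset with the Unicode hyphen U+2010 rather than an ASCII minus sign, so a naive `float()` call on a copied cell fails. The sketch below is one way to parse an "Effect size" cell into numbers; the function names `parse_effect` and `excludes_zero` are hypothetical, and the accepted cell layout ("estimate [lower, upper]") is an assumption based on the rows shown here.

```python
import re

# Map the typographic hyphen (U+2010) and minus sign (U+2212) to ASCII "-".
_DASHES = str.maketrans({"\u2010": "-", "\u2212": "-"})
_NUMBER = r"(-?\d+(?:\.\d+)?)"
_PATTERN = re.compile(
    rf"^\s*{_NUMBER}\s*\[\s*{_NUMBER}\s*,\s*{_NUMBER}\s*\]\s*$"
)


def parse_effect(cell: str):
    """Return (estimate, lower, upper) as floats, or None for text-only cells."""
    match = _PATTERN.match(cell.translate(_DASHES))
    if match is None:
        return None  # e.g. "Subtotals only", "Not estimable"
    estimate, lower, upper = (float(group) for group in match.groups())
    return estimate, lower, upper


def excludes_zero(lower: float, upper: float) -> bool:
    """For difference measures, True when the CI lies entirely on one side of 0."""
    return lower > 0.0 or upper < 0.0


# Cells transcribed from rows 4.3.1 and 4.3.2 above (risk differences).
usual_care = parse_effect("0.18 [\u20100.00, 0.36]")
cxr = parse_effect("0.01 [\u20100.05, 0.07]")
print(usual_care, excludes_zero(usual_care[1], usual_care[2]))
print(cxr, excludes_zero(cxr[1], cxr[2]))
```

For difference measures such as the risk difference, the no-effect value is 0 rather than 1, which is why the check differs from the one used for ratio measures.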

Comparison 5. Secondary outcome: false positives, negatives and recalls (number of screens)

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 5.1 False positive at baseline | 3 | 56101 | Risk Ratio (M‐H, Random, 95% CI) | 2.82 [1.98, 4.01] |
| 5.2 False negative | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 5.2.1 baseline | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 5.2.2 at year 1 | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 5.2.3 at year 2 | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Totals not selected |
| 5.3 Recall rates at baseline | 2 | 55480 | Risk Ratio (M‐H, Random, 95% CI) | 5.31 [1.73, 16.34] |

Comparison 6. Secondary outcome: impact on smoking behaviour

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 6.1 stop smoking | 3 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 6.1.1 at 2 weeks | 1 | 1545 | Risk Ratio (M‐H, Random, 95% CI) | 2.16 [1.47, 3.18] |
| 6.1.2 at 1 year | 1 | 3124 | Risk Ratio (M‐H, Random, 95% CI) | 1.08 [0.88, 1.32] |
| 6.1.3 within 2 years | 1 | 1524 | Risk Ratio (M‐H, Random, 95% CI) | 1.51 [1.15, 1.97] |
| 6.1.4 at year 4 | 1 | 2447 | Risk Ratio (M‐H, Random, 95% CI) | 1.17 [0.99, 1.37] |
| 6.2 smoking relapse | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 6.2.1 at 1 year | 1 | 888 | Risk Ratio (M‐H, Random, 95% CI) | 0.95 [0.65, 1.41] |

Comparison 7. Secondary outcome: health‐related quality of life

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 7.1 Anxiety ‐ at 10 months to 5 years (change over time and endpoints) | 3 | 8153 | Std. Mean Difference (IV, Random, 95% CI) | ‐0.43 [‐0.59, ‐0.27] |
| 7.2 Quality of life measures at different time points | 3 | | Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 7.2.1 Physical component summary of short‐form 12 (PCS) at baseline | 1 | 1288 | Mean Difference (IV, Random, 95% CI) | ‐0.17 [‐1.21, 0.87] |
| 7.2.2 Physical component summary of short‐form 12 (PCS) at 2 years | 1 | 931 | Mean Difference (IV, Random, 95% CI) | 0.88 [‐0.34, 2.10] |
| 7.2.3 Mental component summary of short‐form 12 (MCS) at baseline | 1 | 1288 | Mean Difference (IV, Random, 95% CI) | ‐0.06 [‐1.42, 1.30] |
| 7.2.4 Mental component summary of short‐form 12 (MCS) at 2 years | 1 | 931 | Mean Difference (IV, Random, 95% CI) | 0.81 [‐0.65, 2.27] |
| 7.2.5 EuroQol questionnaire visual analogue scale (EQ‐5D VAS) (1‐100) at baseline | 1 | 1288 | Mean Difference (IV, Random, 95% CI) | 0.69 [‐0.98, 2.36] |
| 7.2.6 EuroQol questionnaire visual analogue scale (EQ‐5D VAS) (1‐100) at 2 years | 1 | 1010 | Mean Difference (IV, Random, 95% CI) | 2.08 [0.18, 3.98] |
| 7.2.7 Spielberger State‐Trait Anxiety Inventory (STAI‐6) at baseline | 1 | 1288 | Mean Difference (IV, Random, 95% CI) | ‐0.48 [‐1.63, 0.67] |
| 7.2.8 Spielberger State‐Trait Anxiety Inventory (STAI‐6) at 2 years | 1 | 931 | Mean Difference (IV, Random, 95% CI) | ‐0.75 [‐1.99, 0.49] |
| 7.2.9 Impact of event scale (IES) total at baseline | 1 | 1288 | Mean Difference (IV, Random, 95% CI) | 0.03 [‐0.88, 0.94] |
| 7.2.10 Anxiety ‐ at baseline | 1 | 4037 | Mean Difference (IV, Random, 95% CI) | ‐0.07 [‐0.11, ‐0.03] |
| 7.2.11 Impact of event scale (IES) total at 2 years | 1 | 931 | Mean Difference (IV, Random, 95% CI) | ‐0.31 [‐1.30, 0.68] |
| 7.2.12 Anxiety ‐ at 10‐27 months | 1 | 4037 | Mean Difference (IV, Random, 95% CI) | ‐0.36 [‐0.57, ‐0.15] |
| 7.2.13 Anxiety (0‐18) at round 1 to 2 | 1 | 3352 | Mean Difference (IV, Random, 95% CI) | ‐0.13 [‐0.33, 0.07] |
| 7.2.14 Anxiety (0‐18) at round 1 to 5 | 1 | 3185 | Mean Difference (IV, Random, 95% CI) | ‐0.51 [‐0.76, ‐0.26] |
| 7.2.15 Depression ‐ at baseline | 1 | 4037 | Mean Difference (IV, Random, 95% CI) | ‐0.06 [‐0.10, ‐0.02] |
| 7.2.16 Depression ‐ at 10‐27 months | 1 | 4037 | Mean Difference (IV, Random, 95% CI) | ‐0.24 [‐0.40, ‐0.08] |
| 7.2.17 Behaviour (0‐21) at round 1 to 2 | 1 | 3337 | Mean Difference (IV, Random, 95% CI) | ‐0.21 [‐0.42, 0.00] |
| 7.2.18 Behaviour (0‐21) at round 1 to 5 | 1 | 3180 | Mean Difference (IV, Random, 95% CI) | ‐0.60 [‐0.88, ‐0.32] |
| 7.2.19 Dejection (0‐18) at round 1 to 2 | 1 | 3377 | Mean Difference (IV, Random, 95% CI) | ‐0.15 [‐0.36, 0.06] |
| 7.2.20 Dejection (0‐18) at round 1 to 5 | 1 | 3195 | Mean Difference (IV, Random, 95% CI) | ‐0.58 [‐0.82, ‐0.34] |
| 7.2.21 Negative impact on sleep (0‐12) round 1 to 2 | 1 | 3389 | Mean Difference (IV, Random, 95% CI) | ‐0.14 [‐0.32, 0.04] |
| 7.2.22 Negative impact on sleep (0‐12) round 1 to 5 | 1 | 3198 | Mean Difference (IV, Random, 95% CI) | ‐0.70 [‐0.95, ‐0.45] |
| 7.3 SF‐36v2: PCS by different components at baseline and at 6 months | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 7.3.1 Negative at baseline | 1 | 1381 | Mean Difference (IV, Random, 95% CI) | ‐1.07 [‐2.09, ‐0.05] |
| 7.3.2 Negative at 6 months | 1 | 1019 | Mean Difference (IV, Random, 95% CI) | ‐0.11 [‐1.38, 1.16] |
| 7.3.3 SIFs at baseline | 1 | 344 | Mean Difference (IV, Random, 95% CI) | 0.16 [‐2.45, 2.77] |
| 7.3.4 SIFs at 6 months | 1 | 226 | Mean Difference (IV, Random, 95% CI) | ‐1.25 [‐4.26, 1.76] |
| 7.3.5 False positive at baseline | 1 | 1024 | Mean Difference (IV, Random, 95% CI) | ‐0.72 [‐1.98, 0.54] |
| 7.3.6 False positive at 6 months | 1 | 703 | Mean Difference (IV, Random, 95% CI) | ‐0.78 [‐2.42, 0.86] |
| 7.3.7 True positive at baseline | 1 | 63 | Mean Difference (IV, Random, 95% CI) | ‐1.94 [‐7.33, 3.45] |
| 7.3.8 True positive at 6 months | 1 | 42 | Mean Difference (IV, Random, 95% CI) | ‐0.20 [‐7.32, 6.92] |
| 7.4 SF‐36v2: MCS by different components at baseline and 6 months | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 7.4.1 Negative at baseline | 1 | 1381 | Mean Difference (IV, Random, 95% CI) | ‐0.85 [‐1.97, 0.27] |
| 7.4.2 Negative at 6 months | 1 | 1019 | Mean Difference (IV, Random, 95% CI) | ‐0.15 [‐1.52, 1.22] |
| 7.4.3 SIFs at baseline | 1 | 344 | Mean Difference (IV, Random, 95% CI) | 0.63 [‐1.94, 3.20] |
| 7.4.4 SIFs at 6 months | 1 | 226 | Mean Difference (IV, Random, 95% CI) | 0.73 [‐2.27, 3.73] |
| 7.4.5 False positive at baseline | 1 | 1024 | Mean Difference (IV, Random, 95% CI) | ‐0.19 [‐1.43, 1.05] |
| 7.4.6 False positive at 6 months | 1 | 703 | Mean Difference (IV, Random, 95% CI) | ‐1.02 [‐2.67, 0.63] |
| 7.4.7 True positive at baseline | 1 | 63 | Mean Difference (IV, Random, 95% CI) | ‐1.74 [‐6.66, 3.18] |
| 7.4.8 True positive at 6 months | 1 | 42 | Mean Difference (IV, Random, 95% CI) | 0.08 [‐8.19, 8.35] |
| 7.5 Anxiety by different results at 1 and 6 months | 1 | | Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 7.5.1 Negative at 1 month | 1 | 1162 | Mean Difference (IV, Random, 95% CI) | ‐0.26 [‐1.79, 1.27] |
| 7.5.2 Negative at 6 months | 1 | 1019 | Mean Difference (IV, Random, 95% CI) | ‐0.33 [‐1.91, 1.25] |
| 7.5.3 SIFs at 1 month | 1 | 272 | Mean Difference (IV, Random, 95% CI) | ‐0.06 [‐3.52, 3.40] |
| 7.5.4 SIFs at 6 months | 1 | 226 | Mean Difference (IV, Random, 95% CI) | ‐0.60 [‐4.26, 3.06] |
| 7.5.5 False positive at 1 month | 1 | 835 | Mean Difference (IV, Random, 95% CI) | 1.77 [‐0.04, 3.58] |
| 7.5.6 False positive at 6 months | 1 | 703 | Mean Difference (IV, Random, 95% CI) | 1.31 [‐0.61, 3.23] |
| 7.5.7 True positive at 1 month | 1 | 48 | Mean Difference (IV, Random, 95% CI) | 1.63 [‐6.31, 9.57] |
| 7.5.8 True positive at 6 months | 1 | 42 | Mean Difference (IV, Random, 95% CI) | ‐2.69 [‐11.69, 6.31] |

Comparison 8. Secondary outcome: lung cancer by stages at different time points

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 8.1 baseline | 5 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 8.1.1 stage 1 (A+B) | 5 | 64092 | Risk Ratio (M‐H, Random, 95% CI) | 2.41 [1.86, 3.12] |
| 8.1.2 stage 2 (A+B) | 5 | 64092 | Risk Ratio (M‐H, Random, 95% CI) | 1.88 [0.99, 3.58] |
| 8.1.3 stage 3 (A+B) | 5 | 64092 | Risk Ratio (M‐H, Random, 95% CI) | 4.28 [1.06, 17.27] |
| 8.1.4 stage 4 | 5 | 64092 | Risk Ratio (M‐H, Random, 95% CI) | 1.05 [0.70, 1.55] |
| 8.1.5 SCLC ‐ limited | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | Not estimable |
| 8.1.6 SCLC ‐ extensive | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 19.00 [1.11, 326.23] |
| 8.1.7 unknown | 2 | 56773 | Risk Ratio (M‐H, Random, 95% CI) | 0.99 [0.31, 3.13] |
| 8.2 at 1 year | 3 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 8.2.1 stage 1 (A+B) | 3 | 60877 | Risk Ratio (M‐H, Random, 95% CI) | 2.57 [1.24, 5.32] |
| 8.2.2 stage 2 (A+B) | 3 | 60877 | Risk Ratio (M‐H, Random, 95% CI) | 1.39 [0.68, 2.84] |
| 8.2.3 stage 3 (A+B) | 3 | 60877 | Risk Ratio (M‐H, Random, 95% CI) | 1.22 [0.76, 1.95] |
| 8.2.4 stage 4 | 3 | 60877 | Risk Ratio (M‐H, Random, 95% CI) | 0.48 [0.30, 0.77] |
| 8.2.5 SCLC ‐ limited | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 1.00 [0.06, 15.98] |
| 8.2.6 SCLC ‐ extensive | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 3.00 [0.12, 73.60] |
| 8.2.7 unknown | 2 | 56773 | Risk Ratio (M‐H, Random, 95% CI) | 1.35 [0.17, 10.75] |
| 8.3 At year 2 | 2 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 8.3.1 stage 1 (A+B) | 2 | 57559 | Risk Ratio (M‐H, Random, 95% CI) | 3.53 [1.66, 7.53] |
| 8.3.2 stage 2 (A+B) | 2 | 57559 | Risk Ratio (M‐H, Random, 95% CI) | 1.08 [0.49, 2.37] |
| 8.3.3 stage 3 (A+B) | 2 | 57559 | Risk Ratio (M‐H, Random, 95% CI) | 0.92 [0.59, 1.44] |
| 8.3.4 stage 4 | 2 | 57559 | Risk Ratio (M‐H, Random, 95% CI) | 0.80 [0.52, 1.24] |
| 8.3.5 SCLC ‐ limited | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | Not estimable |
| 8.3.6 SCLC ‐ extensive | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 0.14 [0.01, 2.76] |
| 8.3.7 unknown | 1 | 53455 | Risk Ratio (M‐H, Random, 95% CI) | 7.00 [0.86, 56.91] |
| 8.4 5 to < 10 years | 4 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 8.4.1 stage 1 (A+B) | 4 | 13676 | Risk Ratio (M‐H, Random, 95% CI) | 2.26 [1.43, 3.57] |
| 8.4.2 stage 2 (A+B) | 4 | 13676 | Risk Ratio (M‐H, Random, 95% CI) | 0.78 [0.37, 1.66] |
| 8.4.3 stage 3 (A+B) | 4 | 13676 | Risk Ratio (M‐H, Random, 95% CI) | 0.84 [0.47, 1.49] |
| 8.4.4 stage 4 | 4 | 13676 | Risk Ratio (M‐H, Random, 95% CI) | 0.55 [0.34, 0.91] |
| 8.4.5 unknown | 4 | 13676 | Risk Ratio (M‐H, Random, 95% CI) | 0.67 [0.41, 1.12] |
| 8.5 ≥ 10 years | 4 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 8.5.1 stage 1 (A+B) | 4 | 64864 | Risk Ratio (M‐H, Random, 95% CI) | 2.57 [1.36, 4.84] |
| 8.5.2 stage 2 (A+B) | 4 | 64864 | Risk Ratio (M‐H, Random, 95% CI) | 0.94 [0.76, 1.17] |
| 8.5.3 stage 3 (A+B) | 4 | 64864 | Risk Ratio (M‐H, Random, 95% CI) | 1.23 [0.79, 1.93] |
| 8.5.4 stage 4 | 4 | 64864 | Risk Ratio (M‐H, Random, 95% CI) | 0.77 [0.69, 0.86] |
| 8.5.5 unknown | 3 | 60765 | Risk Ratio (M‐H, Random, 95% CI) | 0.67 [0.45, 0.99] |

Comparison 9. Secondary outcome: lung cancer histology at different time points

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 9.1 Histology types at baseline | 4 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 9.1.1 SCLC | 4 | 59987 | Risk Ratio (M‐H, Random, 95% CI) | 0.84 [0.45, 1.57] |
| 9.1.2 squamous cell carcinoma | 4 | 59987 | Risk Ratio (M‐H, Random, 95% CI) | 1.47 [1.01, 2.13] |
| 9.1.3 adenocarcinoma | 4 | 59987 | Risk Ratio (M‐H, Random, 95% CI) | 2.81 [1.38, 5.71] |
| 9.1.4 bronchoalveolar carcinoma | 2 | 55904 | Risk Ratio (M‐H, Random, 95% CI) | 4.94 [2.41, 10.10] |
| 9.1.5 other | 4 | 59987 | Risk Ratio (M‐H, Random, 95% CI) | 1.32 [0.90, 1.94] |
| 9.2 Histology at year 1 | 1 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 9.2.1 SCLC | 1 | 3318 | Risk Ratio (M‐H, Random, 95% CI) | 2.00 [0.37, 10.89] |
| 9.2.2 squamous cell carcinoma | 1 | 3318 | Risk Ratio (M‐H, Random, 95% CI) | 0.83 [0.25, 2.72] |
| 9.2.3 adenocarcinoma | 1 | 3318 | Risk Ratio (M‐H, Random, 95% CI) | 2.66 [1.24, 5.71] |
| 9.2.4 other | 1 | 3318 | Risk Ratio (M‐H, Random, 95% CI) | 2.33 [0.60, 9.00] |
| 9.3 Histology at follow‐up | 7 | | Risk Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 9.3.1 SCLC | 6 | 71281 | Risk Ratio (M‐H, Random, 95% CI) | 0.86 [0.74, 1.01] |
| 9.3.2 mixed SCLC + NSCLC | 1 | 4104 | Risk Ratio (M‐H, Random, 95% CI) | 0.14 [0.01, 2.76] |
| 9.3.3 squamous cell carcinoma | 6 | 71281 | Risk Ratio (M‐H, Random, 95% CI) | 1.04 [0.81, 1.32] |
| 9.3.4 adenocarcinoma | 7 | 75333 | Risk Ratio (M‐H, Random, 95% CI) | 1.49 [1.05, 2.10] |
| 9.3.5 bronchoalveolar carcinoma | 3 | 61610 | Risk Ratio (M‐H, Random, 95% CI) | 2.73 [1.96, 3.81] |
| 9.3.6 other | 7 | 75333 | Risk Ratio (M‐H, Random, 95% CI) | 0.87 [0.68, 1.11] |

Comparison 10. Secondary outcome: other outcomes

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 10.1 contamination | 3 | 6902 | Risk Ratio (M‐H, Random, 95% CI) | 1.35 [0.32, 5.68] |
