Stereotactic body radiation therapy versus more fractionated radical radiotherapy for adults with stage I/II non‐small cell lung cancer: a systematic review and network meta‐analysis

Abstract

Objectives

This is a protocol for a Cochrane Review (intervention). The objectives are as follows:

To assess the effectiveness and toxicity of different SBRT schedules (± additional treatments) in comparison with various more fractionated radical radiotherapy schedules (± additional treatments) in controlling primary tumor and prolonging survival in people with early stage NSCLC, and to rank all the schedules based on their efficacy and tolerability. 

Background

Description of the condition

Lung cancer is a major global public health concern with a significant financial burden on society (Ha 2020). In the USA, approximately 228,150 new cases of lung cancer were diagnosed in 2019, and the estimated number of deaths due to lung cancer was 142,670 (ASCO 2019). In 2015, approximately 730,000 new cases of lung cancer were diagnosed in China, and the estimated number of deaths due to lung cancer was 610,000 (CACA 2019; Sun 2020; Wu J 2019). Due to population growth and aging, it seems inevitable that the incidence of lung cancer will continue to increase (Sekine 2020). While lung cancer is a multistep and multifactorial disease, smoking is by far the leading risk factor. Additional risk factors include exposure to second‐hand smoke, radon, asbestos, air pollution and other occupational carcinogens, and genetic inheritance (ACS 2020; Cao 2019; Syn 2018).

Non‐small‐cell lung cancer (NSCLC) accounts for over 85% of all lung cancer cases (Wang 2019; Wu Q 2020). Among NSCLC cases, only approximately 20% of people are diagnosed at an early stage (Palma 2019). According to the National Comprehensive Cancer Network (NCCN) guideline, early‐stage disease comprises stages I and II with negative nodes (N0), which corresponds to T1‐T3N0M0 in the eighth edition of the TNM classification for lung cancer (NCCN 2022). The five‐year survival rates are 80% to 90% for pathologic stage IA, 73% for stage IB, 65% for stage IIA, and 56% for stage IIB. However, these figures include data for both N0 and N1 (positive node) disease states (Goldstraw 2016). The main clinical manifestations of NSCLC are a new cough in a smoker (or former smoker), recurrent pneumonia in the same anatomic location, or frequent exacerbations of chronic obstructive pulmonary disease. Additional symptoms include dyspnea, chest pain, and hemoptysis (Nasim 2019). However, because the early stages of this disease are often asymptomatic, people with NSCLC are less likely to be diagnosed early and treated successfully. It is anticipated that detection of early‐stage NSCLC will increase with the expansion of low‐dose computed tomography (LDCT) screening (Ernani 2019; Frick 2020). The various modalities for diagnosis and staging of lung cancer currently include fiberoptic bronchoscopy, endobronchial ultrasound (EBUS)‐guided bronchoscopic sampling, and computed tomography (CT)‐guided biopsy (Postmus 2017). The NCCN guideline recommends that the diagnostic strategy should be individualized for each person, depending on the size and location of the tumor, individual characteristics, and local expertise (NCCN 2022).

Description of the intervention

NSCLC is associated with a high mortality rate. Even when diagnosed at an early stage, people with untreated NSCLC have a poor prognosis, with a median survival time of just over one year (Fernandez 2020). At present, standard treatment for early‐stage NSCLC is surgical resection (Wu Z 2019). However, 20% to 30% of people with early‐stage NSCLC are ineligible for surgery due to old age or severe comorbidities (including chronic obstructive lung disease and coronary artery disease), or refuse surgery (Fernandez 2020; Liang 2020; Shiue 2018). In these situations, radiotherapy is an appropriate alternative treatment option. Until 20 to 25 years ago, radiotherapy provided only a modest increase in survival time (compared to untreated people) (Ostheimer 2019; Wisnivesky 2005). However, subsequent technological advances in electronic engineering, computer sciences, and imaging physics have brought about significant improvements in radiotherapy techniques. Radiotherapy has advanced from a 'planar imaging‐based' discipline to include conformal radiotherapy, intensity‐modulated radiotherapy (IMRT), image‐guided radiotherapy, and stereotactic body radiation therapy (SBRT). SBRT, also known as stereotactic ablative radiotherapy (SABR), is one such high‐precision image‐guided form of radiotherapy. Using this technique, it is possible to deliver a very high radiation dose to a tumor target in short regimens of 1 to 10 fractions (NCCN 2022; Nicosia 2019; Ostheimer 2019). Because of its high precision, SBRT is able to minimize the dose delivered to surrounding normal tissues (Hiley 2020).
The basic principles of SBRT include: precise, reproducible stereotactic localization of the tumor (using either internal or external references); daily image guidance for tumor re‐localization and visualization of critical normal structures; and delivery of a high radiotherapy dose in one or more treatment fractions (usually a maximum of 10 fractions) (Amini 2014; Guckenberger 2014; Kavanagh 2005). Because of these characteristics, SBRT can deliver a significantly higher biologically effective dose (BED) to tumors over a shorter period of time compared to conventional radiotherapy. Although SBRT has been widely recommended for medically inoperable early‐stage NSCLC, many issues remain unresolved (Zhang 2011). Firstly, different SBRT dose and fractionation schedules have been reported, and the optimal regimen is undefined (Singh 2019). Thus, the selection of SBRT schedules currently depends on clinical experience (Luo 2019). Secondly, it remains unknown whether adding systemic treatments such as chemotherapy or immunotherapy improves prognosis compared with SBRT alone. This is a probable direction for future research (Prezzano 2019). Finally, SBRT involves an array of the latest technology available in radiation therapy, and thus requires a multidisciplinary team approach. Consequently, SBRT may not be available in many hospitals or clinics, especially in low‐ and middle‐income countries (Datta 2014; Lievens 2015). In a study conducted to assess current lung SBRT practice in the UK, only 36 of the 62 National Health Service (NHS) radiotherapy centers surveyed (58%) delivered lung SBRT (Beasley 2019). Compared to SBRT, other more fractionated radical radiotherapy techniques, such as conventionally fractionated radiotherapy (CFRT) (which is usually given daily for four to seven weeks), are more widely available and require less complex technology.
However, compared to more fractionated radical radiotherapy treatments for early‐stage NSCLC, SBRT is more convenient for the person (because it can shorten treatment times), and may decrease toxicity and increase survival (Kennedy 2020; Lievens 2015). Recently, the TROG 09.02 (CHISEL) randomized, multicenter controlled trial that compared SBRT versus CFRT demonstrated a survival benefit with SBRT (Ball 2019). While both radiation techniques are options for people with early‐stage NSCLC, no systematic reviews have been published to date to show that all the SBRT regimens are superior to all the other more fractionated radical radiotherapy treatments, and no review has provided a synthesis and direct comparison of these results.

How the intervention might work

During radiotherapy, ionizing radiation can damage cells both directly and indirectly. Direct damage breaks DNA itself. Ionizing radiation also causes cell death indirectly through the formation of free radicals and peroxides, which can subsequently induce DNA damage (Fiorino 2020; Hawley 2013). Any resulting double‐stranded DNA breaks usually induce cell death or cause defective DNA. Since the 1970s, a conventional fractionation radiotherapy schedule in NSCLC has been defined as 45 Gy to 70 Gy, comprising one treatment per day of 1.8 Gy to 2.5 Gy per fraction. Additional curative‐intent fractionation schemes were subsequently suggested, such as continuous hyper‐fractionated accelerated radiotherapy (CHART) and hypo‐fractionated radiotherapy, and these have been widely adopted in several countries (Alaswad 2019; Ramroth 2016). In recent years, improved radiation therapy technologies, including modern linear accelerators equipped with suitable image‐guidance technology, sophisticated immobilization systems and advanced treatment planning software, have enabled the application of SBRT (Guckenberger 2014; Román 2020). SBRT delivers ablative doses of highly conformal radiation to the tumor in only a few fractions, while a sharp dose fall‐off outside the tumor volume minimizes dose to surrounding normal tissues (compared with other more fractionated radical radiotherapy) (Merlotti 2021). By optimizing the BED and shortening the treatment time, the effect of tumor repopulation is decreased. Consequently, SBRT improves disease control and limits toxicity (Phillips 2019; Schonewolf 2019). Additionally, a growing number of studies suggest that high doses per fraction can injure vasculature and enhance antitumor immunity, so that SBRT has greater antitumor efficacy than would be predicted by classical radiobiology. Furthermore, radiotherapy, and SBRT in particular, can influence immune responses in both the tumor microenvironment and the immune system.
Radiotherapy can kill cancer cells, release tumor‐associated antigens, and induce immunogenic cancer cell stress or death. As a result, SBRT may provoke tumor responses both at the treatment site and in remote, non‐irradiated regional or distant metastases via what is called an 'abscopal effect' (Giaj 2020; Li 2019; Zhang 2017). Therefore, SBRT is likely to be more effective than other radical fractionated radiotherapy at controlling tumors, decreasing toxicity, and improving survival.

Why it is important to do this review

Radiotherapy is an alternative treatment for people with early‐stage NSCLC who are medically inoperable or refuse surgery. Studies have shown that the survival rate following standard fractionated radiotherapy lies between that of untreated people and that of people treated surgically (Rowell 2001; Wu Z 2019). To further improve prognosis, other curative‐intent fractionation schemes have been suggested by various institutions. With the development of radiation therapy techniques, the use of SBRT has increased rapidly since 2001 (Palma 2010). As mentioned above, this is because SBRT delivers a more precise conformal dose to the tumor and a smaller dose to the surrounding healthy tissues (Schonewolf 2019). Both SBRT and more fractionated radical radiotherapy are radiotherapy options for people with early‐stage NSCLC. Two randomized controlled trials (RCTs) and one pair‐wise meta‐analysis have compared SBRT and CFRT (Ball 2019; Li 2020; Nyman 2016). However, conventional pair‐wise meta‐analyses can only generate effect estimates for treatment interventions compared in head‐to‐head trials (Parry Smith 2020). Therefore, in the absence of a high‐quality RCT comparing all SBRT schedules with different radical radiotherapy schedules, uncertainty remains about which option is best for medically inoperable early‐stage NSCLC. A network meta‐analysis would allow a comparison of all interventions, including those for which head‐to‐head comparisons have not been conducted (Al Said 2019; Jansen 2008; Lu 2004). In addition, a network meta‐analysis could also calculate the probability that a specific intervention constitutes the most effective intervention with the fewest side effects, thereby allowing ranking of the available interventions (Parry Smith 2020).

Objectives

To assess the effectiveness and toxicity of different SBRT schedules (± additional treatments) in comparison with various more fractionated radical radiotherapy schedules (± additional treatments) in controlling primary tumor and prolonging survival in people with early stage NSCLC, and to rank all the schedules based on their efficacy and tolerability. 

Methods

Criteria for considering studies for this review

Types of studies

We will focus on RCTs because they are the best study design for evaluating the effectiveness of interventions. We will consider studies that also include participants with stage III/IV disease if they provide sufficient data for people with stage I/II disease. We will include both full‐text and abstract publications if they provide sufficient information on study design, characteristics of participants (people with inoperable early‐stage NSCLC or refusing to undergo surgery), and interventions (SBRT or other radical radiotherapy). We will include trials with participants receiving SBRT (± additional treatments) or other more fractionated radical radiotherapy (± additional treatments) in at least one treatment arm. We will not include quasi‐RCTs.

Types of participants

We will include adults (≥ 18 years old) with primary early‐stage NSCLC (stages I and II with N0 using the eighth edition of the TNM classification and staging system for lung cancer, or tumor ≤ 7 cm in the greatest dimension with N0 using other versions). Participants should have no history of radiation therapy (including brachytherapy) or chemotherapy. They should be medically inoperable or refuse surgery.  

We will exclude people with multiple synchronous primary tumors or recurrent lung cancers, and people who have received external beam radiotherapy combined with brachytherapy or surgery.

We will not apply any upper age or gender restrictions.

Types of interventions

We will include any of the following interventions.

  1. Any type of SBRT. The doses given, fractionated doses, and radiotherapy regimen should be clearly described.

  2. Any type of SBRT + additional treatments (including adjuvant, neoadjuvant, concurrent or sequential), e.g. chemotherapy or immunotherapy. However, our definition of additional treatments does not include non‐therapeutic associated interventions (e.g. a physical function assessment or a comprehensive geriatric assessment).

  3. Any type of radical fractionated radiotherapy. Fractionated radical radiotherapy can be defined as 50 Gy or more delivered in over 10 fractions. It can include conventional radiotherapy (delivered ≤ 2.5 Gy per fraction once a day) and other 'non‐conventional' regimens, such as 50 Gy to 55 Gy in 20 fractions over 4 weeks, CHART (54 Gy in 36 fractions, three times per day over 12 continuous days) or hypo‐fractionated radiotherapy schedules, which are widely used in some countries.

  4. Any type of radical fractionated radiotherapy + additional treatments such as chemotherapy or immunotherapy. However, our definition of additional treatments does not include non‐therapeutic associated interventions.

We will not include trials that use the same radiotherapy regimen but only compare different immobilization systems, the dose calculation grid size, or other similar research.  

We plan to conduct both traditional meta‐analysis and network meta‐analysis (NMA). We will conduct pair‐wise meta‐analysis for every treatment comparison with at least two studies. In the NMA we will consider all the possible groups or combinations as separate arms and compare them with each other. For different SBRT schedules, we will divide them into more fractionated or less fractionated regimens. For different radical radiotherapy, we will consider different schedules as separate groups such as CFRT, CHART or hypo‐fractionated radiotherapy. In addition, we will calculate BED and EQD2 (equivalent dose in 2 Gy/f) of every radiotherapy schedule. We will use a linear‐quadratic equation to calculate the BED: BED = nd (1 + d/(α/β)), where n represents the number of fractions and d the dose per fraction; we will use α/β = 10 for the analysis. We will calculate EQD2 = BED/(1 + 2/(α/β)). If there is a sufficient number of included studies, we will divide BED or EQD2 into several groups, such as: low (≤ 84 Gy), medium (84 Gy to 105 Gy), medium to high (105 Gy to 145 Gy), high (> 145 Gy). We will consider these as separate groups.
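The dose calculations above can be sketched as follows. This is a minimal illustration, not part of the protocol: it takes n as the number of fractions and d as the dose per fraction in Gy (the usual linear‐quadratic convention), uses α/β = 10 as stated, and assumes that each group's upper bound is inclusive, because the text does not say how values falling exactly on a boundary are assigned. The function names are hypothetical.

```python
def bed(n_fractions: int, dose_per_fraction: float, alpha_beta: float = 10.0) -> float:
    """Biologically effective dose: BED = n * d * (1 + d / (alpha/beta))."""
    return n_fractions * dose_per_fraction * (1 + dose_per_fraction / alpha_beta)


def eqd2(bed_value: float, alpha_beta: float = 10.0) -> float:
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2 / (alpha/beta))."""
    return bed_value / (1 + 2 / alpha_beta)


def dose_group(value_gy: float) -> str:
    """Assign a BED or EQD2 value to one of the four groups named in the text.

    Assumption: upper bounds are inclusive (84 -> low, 105 -> medium, 145 -> medium to high).
    """
    if value_gy <= 84:
        return "low"
    if value_gy <= 105:
        return "medium"
    if value_gy <= 145:
        return "medium to high"
    return "high"


if __name__ == "__main__":
    # Illustrative schedules: (label, number of fractions, dose per fraction in Gy).
    schedules = [
        ("54 Gy in 3 fractions", 3, 18.0),
        ("60 Gy in 30 fractions", 30, 2.0),
        ("54 Gy in 36 fractions (CHART)", 36, 1.5),
    ]
    for label, n, d in schedules:
        b = bed(n, d)
        print(f"{label}: BED = {b:.1f} Gy, EQD2 = {eqd2(b):.1f} Gy, group = {dose_group(b)}")
```

Keeping the α/β ratio as a parameter makes it easy to repeat the calculation with a different value if a sensitivity check is wanted.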

Types of outcome measures

Primary outcomes

  1. Overall survival: defined as the time from randomization until death from any cause.

  2. Radiotherapy‐related toxicity (RT): divided into early‐stage RT and late‐stage RT. Early‐stage RT includes any toxic events occurring in the esophagus, lung, skin and heart within six months of completion of radiotherapy. Radiation pneumonitis and radiation esophagitis will be our main concerns. Late‐stage RT includes any toxic events occurring more than six months after completion of radiotherapy. Radiation pulmonary fibrosis will be our main concern. If there are sufficient data, we will also analyze other radiotherapy‐related toxicities, such as skin reactions, nausea and vomiting. Treatment‐related toxicities can be defined according to the criteria of RTOG (Radiation Therapy Oncology Group) (RTOG 2018), CTCAE (Common Terminology Criteria for Adverse Events) (CTCAE 2009), or by the authors of the included studies if their definition is reasonable.

Secondary outcomes

  1. Progression‐free survival (PFS): defined as the time from randomization to any progression or death.

  2. Local control: defined as the time from randomization to the occurrence of any local treatment failure during follow‐up. Local treatment failure can be defined according to RECIST (Response Evaluation Criteria in Solid Tumors) guidelines, on the basis of imaging changes, or by histologic confirmation (e.g. a tumor enlargement by a prespecified percentage with proof of PET (positron emission tomography) or biopsy) (Eisenhauer 2009).

  3. Tumor response to treatment (including complete response, partial response, progressive disease or stable disease): response to treatment defined according to RECIST guidelines (Eisenhauer 2009). 

  4. Health‐related quality of life (HRQoL): measured by a validated scale (e.g. the European Organisation for Research and Treatment of Cancer quality of life questionnaire EORTC QLQ‐C30 (Aaronson 1993)).

Search methods for identification of studies

Electronic searches

We will search the following databases:

  • the Cochrane Central Register of Controlled Trials (CENTRAL; latest issue);

  • MEDLINE accessed via PubMed (1946 to present);

  • Embase (1980 to present);

  • the WHO International Clinical Trials Registry Platform (ICTRP) search portal (apps.who.int/trialsearch/AdvSearch.aspx) for all prospectively registered and ongoing trials;

  • ClinicalTrials.gov (www.clinicaltrials.gov/).

There will be no limitation on publication language or publication type.

We will use the search strategies recommended by the Cochrane Lung Cancer Group. We will search all databases using the combination of controlled vocabulary (e.g. medical subject headings (MeSH) in MEDLINE, Emtree in Embase) and free‐text terms. We will perform the MEDLINE search using the Cochrane highly sensitive search strategy and precision‐maximizing version, as referenced in the Cochrane Handbook for Systematic Reviews of Interventions (Lefebvre 2021). Our search strategies are presented in Appendix 1 for CENTRAL, Appendix 2 for MEDLINE, and  Appendix 3 for Embase. 

Searching other resources

We will search the reference lists of included studies and contact experts in the field for information about ongoing or non‐published studies.

We will also search for lung cancer and radiotherapy reports recorded from proceedings in the following sources from 2019 onwards, as these may not already be included in the databases we intend to search:

  • American Society of Clinical Oncology (ASCO);

  • European Society for Medical Oncology (ESMO);

  • European Cancer Organisation (ECCO);

  • International Association for the Study of Lung Cancer (IASLC);

  • American Society for Radiation Oncology (ASTRO);

  • European Society for Radiotherapy and Oncology (ESTRO);

  • World Conference on Lung Cancer.

Data collection and analysis

Selection of studies

We will use reference management software to manage the records retrieved from searches of electronic databases (EndNote 2018). Two review authors (YZ, H‐M F) will independently screen titles and abstracts of all identified studies, and exclude those that do not meet the inclusion criteria. We will retrieve the full‐text copies of the remaining studies. Two review authors (YZ, H‐M F) will independently assess the eligibility of retrieved papers. We will identify and exclude duplicates, and use the studies with the most up‐to‐date results. We will contact the authors if more information is required to determine eligibility for inclusion. We will record the reasons for exclusion of ineligible studies. We will make detailed records of the selection process in order to complete a PRISMA flow diagram and characteristics of excluded studies table (Moher 2009; Murthy 2018). We will resolve disagreements by discussion between the two review authors, and if necessary we will consult a third review author (J‐H T).

Data extraction and management

Firstly, two review authors (YZ, H‐M F) will design and independently pilot data extraction forms. Then, for included studies, the same two review authors will independently extract and record data on the data extraction forms. If there are differences, we will find consensus; if this cannot be reached, we will consult a third reviewer (J‐H T).

We will extract the following information from the eligible primary studies.

  • Publication details (i.e. year, country, authors, affiliation of authors)

  • Study methodology: setting, study design, method of randomization, allocation concealment, blinding, total duration of study, duration of follow‐up period, numbers of centers, year trial started

  • Participants: sample size, numbers enrolled in each arm, mean age, age range, gender, Eastern Cooperative Oncology Group (ECOG) performance status, diagnostic criteria, NSCLC histological subtype, staging of NSCLC, staging system used, location of tumor, smoking history, withdrawals, inclusion and exclusion criteria

  • Intervention: details of interventions and comparisons (i.e. radiation dose fractionation schedule, type of radiotherapy, length; use of additional treatment), diagnostic PET‐CT

  • Outcome measures: primary and secondary results, type of questionnaires used to assess HRQoL

  • Type of data analyses (e.g. intention‐to‐treat, modified intention‐to‐treat)

  • Funding for trial or conflict of interest

One review author (CW) will copy the data from the data collection form into Review Manager 5.4 (Review Manager 2020). Another author (S‐F F) will double‐check that the data are entered correctly, comparing the study reports with the data presented in the systematic review (Hanna 2013; Soon 2018; Zhu 2019).

Assessment of risk of bias in included studies

Two review authors (YZ, H‐M F) will evaluate the risk of bias for each study independently. We will resolve disagreements by discussion between the two review authors or by referring to the third review author (J‐H T).

We will use the criteria outlined in the Cochrane risk of bias tool (RoB1) (Higgins 2017a). We will assess the risk of bias according to the following domains.

  1. Random sequence generation (selection bias; per study)

  2. Allocation concealment (selection bias; per study)

  3. Blinding of participants and personnel (performance bias; per outcome)

  4. Blinding of outcome assessment (detection bias; per outcome)

  5. Incomplete outcome data (attrition bias; per outcome)

  6. Selective outcome reporting (reporting bias; per study)

  7. Other bias

We will classify each domain as 'low', 'high' or 'unclear' risk of bias for each included study. We will record these judgments and justifications, with brief descriptions for each domain and study. We will complete a risk of bias table for each included study and will summarize the risk of bias across studies.

Measures of treatment effect

We will enter the data from data extraction forms into Review Manager 5.4 to calculate the intervention effect (Review Manager 2020). For each outcome, we will calculate the summary estimates of treatment effect with 95% confidence intervals (CI). We will use the following measures of the effect of treatment.

  • For dichotomous data (i.e. tumor response, early and late toxicity), we will use risk ratios (RRs) or odds ratios (ORs).

  • For continuous data (i.e. health‐related quality of life measures), we will use mean differences (MDs) if outcome measurements in all studies are made on the same scale. If studies use different scales, we will use standardized mean differences (SMDs).

  • For time‐to‐event variables (i.e. overall survival), we will use hazard ratios (HRs). If the HRs are not reported directly, we will use the methods described by Parmar 1998 and Tierney 2007 to obtain the information indirectly.
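Two of the measures listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the review's analysis code (the protocol specifies Review Manager 5.4 for that): the first function computes a risk ratio with a large‐sample 95% CI from a 2×2 table, and the second recovers the log hazard ratio and its standard error from a reported HR and 95% CI, which is one of the simpler indirect routes of the kind Parmar 1998 and Tierney 2007 describe. Function names and example numbers are hypothetical.

```python
import math


def risk_ratio(events_a: int, total_a: int, events_b: int, total_b: int, z: float = 1.96):
    """Risk ratio of arm A versus arm B with a large-sample confidence interval.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def log_hr_from_ci(hr: float, ci_lower: float, ci_upper: float, z: float = 1.96):
    """Log hazard ratio and its standard error from a reported HR and CI.

    Assumes the CI is symmetric on the log scale, so its log-width is 2 * z * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(risk_ratio(10, 100, 20, 100))
    print(log_hr_from_ci(0.5, 0.25, 1.0))
```

The log HR and its standard error are the quantities a generic inverse‐variance meta‐analysis needs as input.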

Unit of analysis issues

We will not include cluster‐randomized trials and cross‐over trials in the review. In multiple‐arm trials, for pair‐wise meta‐analysis, we will only consider the relevant arms that meet the inclusion criteria. We will list additional arms in the characteristics of included studies table. We will also try to combine relevant groups to create a single, pair‐wise comparison. If they cannot be combined, we will use the methods recommended by Higgins 2021 to avoid double‐counting of study participants. For the network meta‐analyses, we will treat multi‐armed studies as multiple independent comparisons, and we will adjust for correlations inherent in multiple‐arm trials using standard methods (e.g. Dias 2016).

Dealing with missing data

If there are inadequate or missing data, we will make efforts to contact the study authors to obtain detailed information. We will use intention‐to‐treat (ITT) data whenever possible. Otherwise, we will use the data available to us. This may result in the use of per‐protocol analyses. Since, according to the Cochrane Handbook for Systematic Reviews of Interventions, per‐protocol analyses may be biased, we will address the potential impact of them in the assessment of risk of bias (Higgins 2022a). In addition, for dichotomous data, we will conduct best‐worst case scenario analysis (which assumes a good outcome in the intervention group and bad outcome in the control group) and the worst‐best case scenario (which assumes a bad outcome in the intervention group and a good outcome in the control group) as sensitivity analyses (Higgins 2017b). For continuous data, if mean and standard deviation (SD) cannot be extracted directly, we will calculate SD from standard errors (SE), confidence intervals (CI), P value and t‐values, if these are reported in the trials. If the data are likely to be normally distributed, we will use the median for meta‐analysis when the mean is not available. If it is not possible to calculate the SD by the above methods, we will impute the SD using the highest SD in other trials for that outcome. This form of imputation can down‐weight a study for calculation of MDs and may bias the effect estimate to no effect for calculation of SMDs. We will use sensitivity analyses to assess the impact of the imputation (Higgins 2022b). If the data are presented only in graphs, we will extract data using software such as the GetData Graph Digitizer (www.getdata-graph-digitizer.com/) or similar. If we still cannot obtain the data, we will explain the missing data in the data extraction form and risk of bias table. We will use sensitivity analysis to explore the impact of missing data on findings. 
We will also discuss the potential impact of missing data on the findings in the discussion section of the review.
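The SD conversions mentioned above can be sketched as follows, assuming the reported standard error or confidence interval refers to a single group mean and that the sample is large enough for the normal approximation (for small groups a t‐distribution multiplier would replace 1.96). The function names are hypothetical.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """SD of a single group from the standard error of its mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD of a single group from the CI of its mean.

    Assumes a large-sample CI, so SE = (upper - lower) / (2 * z).
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(sd_from_se(2.0, 25))
    print(sd_from_ci(8.04, 11.96, 100))
```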

Assessment of heterogeneity

We will assess clinical and methodological heterogeneity by carefully examining the important clinical characteristics and methodological differences of included trials. We will assess the presence of clinical heterogeneity by comparing these characteristics within each pair‐wise comparison and effect estimates for different subgroups (as stated in Subgroup analysis and investigation of heterogeneity). Different study designs and risk of bias can contribute to methodological heterogeneity. In addition, we will assess statistical heterogeneity. Firstly, we will use visual inspection of the forest plot to assess heterogeneity. Then, we will use formal statistical tests to quantify heterogeneity. The Chi² test can be used to assess whether observed differences in results are compatible with chance alone. A low P value (< 0.10) from the Chi² test provides evidence of heterogeneity of intervention effects. The I² statistic is useful to quantify inconsistency; according to the Cochrane Handbook for Systematic Reviews of Interventions, the I² statistic can be interpreted as follows (Deeks 2021).

  • 0% to 40%: may not be important

  • 30% to 60%: moderate heterogeneity

  • 50% to 90%: substantial heterogeneity

  • 75% to 100%: considerable heterogeneity

Therefore, we will consider a P value < 0.10 or I² > 50% to indicate substantial statistical heterogeneity.

If we find heterogeneity, we will summarize the data using a random‐effects model and attempt to determine possible reasons using prespecified subgroup analysis.
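As a rough illustration of the statistics named above, the sketch below computes Cochran's Q (the Chi² heterogeneity statistic) and I² from study effect estimates on an additive scale (for example log RRs or log HRs) and their standard errors, using inverse‐variance weights. It is a minimal sketch with hypothetical names; the P value for Q, which needs the chi‐square distribution with k − 1 degrees of freedom, is left out to keep the example free of third‐party dependencies.

```python
def cochran_q_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared in percent).

    Q = sum of w_i * (y_i - pooled)^2 with inverse-variance weights w_i = 1 / se_i^2.
    I-squared = max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


if __name__ == "__main__":
    # Made-up log effect estimates and standard errors, for illustration only.
    q, df, i2 = cochran_q_i2([0.0, 2.0], [0.5, 0.5])
    print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```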

In addition, for network meta‐analysis, we will compare the fixed‐effect model and random‐effects model. We will present results of random‐effects model meta‐analysis with the between‐trial heterogeneity parameter. The posterior for the between‐trial heterogeneity parameter is likely to be extremely sensitive to the prior. For dichotomous outcomes, if the posterior distribution of the between‐trial heterogeneity parameter includes values that are implausibly high, we will use informative priors as described by Turner (Dias 2012; Turner 2012).

Assessment of transitivity across treatment comparisons

We will assess transitivity by comparing the distribution of effect modifiers across the different comparisons. In this context we expect that the transitivity assumption will hold, assuming the following:

  • the common treatment which can be used to 'connect' two other interventions is similar when it appears in different trials (e.g. more fractionated SBRT is administered in a similar way in more fractionated SBRT versus CFRT trials and in more fractionated SBRT versus less fractionated SBRT trials);

  • none of the pair‐wise comparisons differ with respect to the distribution of effect modifiers (participants, interventions, co‐interventions, methods and timing for outcome measurement, comparators, and risk of bias). 

If we still cannot rule out factors related to the violation of the transitivity assumption after including studies according to the exact inclusion criteria and comparing the distribution of the potential effect modifiers across the available direct comparisons in the network, we will rate down the certainty of the evidence due to intransitivity and discuss it carefully.

Assessment of reporting biases

We will attempt to minimize publication and other biases, and their potential impact, by ensuring a comprehensive search for both unpublished and published eligible studies. In order to avoid duplicate publication bias, we will identify and exclude duplicate reports of the same study by scrutinizing and comparing authors, location, setting and sample size. We will address non‐reporting or under‐reporting of results by searching for trials register records, protocols, abstracts of presentations, or by scrutinizing the Methods section of published articles for a list of outcomes for comparison with the reported published outcomes for each included study (Page 2021). We will use funnel plots to assess the possibility of small‐study effects and publication bias if we include at least 10 studies for any outcome. In the event that we can only perform direct pair‐wise analyses, we will generate and visually assess funnel plots. We will also assess them using a linear regression test (Egger 1997) or Harbord test (Harbord 2006). If we find funnel plot asymmetry by visual inspection, we will discuss its possible sources, as indicated in table 13.3.b of the Cochrane Handbook (Page 2021). 
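The linear regression test mentioned above (Egger 1997) regresses each study's standardized effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below computes only the ordinary least squares intercept and slope. It is a simplified illustration with hypothetical names, and it omits the significance test on the intercept that a full analysis would report.

```python
def egger_regression(effects, std_errors):
    """Return (intercept, slope) from regressing effect/SE on 1/SE.

    The intercept is the asymmetry measure; the slope estimates the underlying effect.
    Requires at least two studies with different standard errors.
    """
    x = [1.0 / se for se in std_errors]                   # precision
    y = [e / se for e, se in zip(effects, std_errors)]    # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Made-up data: identical effects at different precisions give no asymmetry.
    print(egger_regression([0.5, 0.5, 0.5], [0.1, 0.2, 0.4]))
```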

To account for the fact that studies estimate effects for different comparisons in an NMA, we will use the comparison‐adjusted funnel plot, an extension of the conventional funnel plot appropriate for NMAs, to assess the presence of small‐study effects. This is a scatterplot of the difference between each study‐specific effect size and the corresponding comparison‐specific summary effect, plotted against the study’s standard error. The resulting plot should be symmetric around the zero line in the absence of small‐study effects in the network. Any asymmetry in the plot indicates the presence of small‐study effects (Chaimani 2013). In order to draw the plot, we need to order the treatments in a meaningful way, reflecting which treatment the small‐study effects would favor in a comparison (Ibrahim 2020). Newer treatments are often expected to be favored by small‐study effects (Chaimani 2012; Dias 2010). Therefore, prior to drawing this plot, we will order interventions from the oldest to the newest treatments in the entire evidence base. All treatment regimens will be ranked according to the publication year of the earliest trial for each regimen included in the network. Then, in each study, the arm that contains the most novel regimen will be considered to be favored (Salanti 2010). Funnel plots can only be seen as a generic means of displaying small‐study effects and not necessarily publication bias (Page 2021). To assist interpretation, we will use contour‐enhanced funnel plots, which add contours of statistical significance onto the plot, to help distinguish publication bias from other factors that lead to asymmetry in the funnel plot (Moreno 2009a; Moreno 2009b). We will also use appropriate network meta‐regression models (Chaimani 2012) to assess the potential for small‐study effects in NMA (Boutron 2020). 
If any important association between study size and effect size is found, and publication bias is suspected, we will explore the possibility that funnel plot asymmetry is due to publication bias by using a selection model (Chaimani 2017; Mavridis 2013; Mavridis 2014). We will describe the outcome of this in the Discussion section of the review and interpret the results carefully.
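The coordinates of a comparison‐adjusted funnel plot can be sketched as follows. This is a minimal illustration under two assumptions that are not specified in the protocol: each comparison‐specific summary effect is a fixed‐effect inverse‐variance average, and every study effect has already been oriented as newer versus older treatment. The function name and data are hypothetical.

```python
from collections import defaultdict

def comparison_adjusted_points(studies):
    """Return (centered effect, standard error) pairs for a comparison-adjusted funnel plot.

    studies: list of dicts with keys 'comparison', 'effect' and 'se'.
    Assumption: the comparison-specific summary is an inverse-variance
    weighted mean of the studies that share the same comparison.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # comparison -> [sum(w * y), sum(w)]
    for s in studies:
        w = 1.0 / s["se"] ** 2
        sums[s["comparison"]][0] += w * s["effect"]
        sums[s["comparison"]][1] += w
    summary = {c: num / den for c, (num, den) in sums.items()}
    # x-coordinate: study effect minus its comparison-specific summary
    # y-coordinate: the study's standard error
    return [(s["effect"] - summary[s["comparison"]], s["se"]) for s in studies]

# Hypothetical studies; comparison labels are illustrative only
studies = [
    {"comparison": "SBRT vs CFRT", "effect": -0.30, "se": 0.20},
    {"comparison": "SBRT vs CFRT", "effect": -0.55, "se": 0.35},
    {"comparison": "SBRT A vs SBRT B", "effect": 0.10, "se": 0.25},
]
for x, y in comparison_adjusted_points(studies):
    print(round(x, 3), y)
```

Points scattered symmetrically around zero on the horizontal axis would be consistent with the absence of small‐study effects.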

Methods for pair‐wise meta‐analyses

We will use Review Manager 5.4 to analyze data (Review Manager 2020).

  • For dichotomous outcomes, we will use the Mantel‐Haenszel method.

  • For continuous outcomes, we will use the inverse‐variance method.

  • For time‐to‐event outcomes, we will use the generic inverse variance method.

If there is no substantial heterogeneity, we will use the fixed‐effect model. However, if substantial heterogeneity exists, we will use the random‐effects model. In addition, if meta‐analysis is not possible, we will undertake a narrative review of the findings according to the Synthesis without meta‐analysis (SWiM) guideline (Campbell 2020).
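To make the distinction between the two models concrete, the sketch below pools effects with the generic inverse‐variance method and reports Cochran's Q, I² and a between‐study variance. It assumes the DerSimonian‐Laird estimator for the random‐effects model, which is an assumption of this example rather than something stated in the protocol, and the function name is hypothetical.

```python
import numpy as np

def pool_inverse_variance(effects, std_errors):
    """Fixed-effect and random-effects (DerSimonian-Laird, assumed) pooling."""
    y = np.asarray(effects, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    w = 1.0 / se**2                              # inverse-variance weights
    fixed = float(np.sum(w * y) / np.sum(w))
    q = float(np.sum(w * (y - fixed) ** 2))      # Cochran's Q
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = float(np.sum(w) - np.sum(w**2) / np.sum(w))
    tau2 = max(0.0, (q - df) / c)                # between-study variance
    w_re = 1.0 / (se**2 + tau2)                  # random-effects weights
    random = float(np.sum(w_re * y) / np.sum(w_re))
    return {
        "fixed": fixed,
        "fixed_se": float(np.sqrt(1.0 / np.sum(w))),
        "random": random,
        "random_se": float(np.sqrt(1.0 / np.sum(w_re))),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }

# Hypothetical log hazard ratios with their standard errors
print(pool_inverse_variance([-0.35, -0.10, -0.60], [0.15, 0.20, 0.25]))
```

When the estimated between‐study variance is zero, the two models give the same pooled estimate; the more heterogeneity there is, the more the random‐effects weights even out across studies.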

Methods for network meta‐analysis

We will perform NMA to compare multiple interventions simultaneously for each of the primary and secondary outcomes. Using STATA 14.2, we will obtain a network plot to ensure that the trials are connected by interventions (Chaimani 2013; Stata). We will exclude any trials that are not connected to the network from the NMA and report only the direct pair‐wise meta‐analysis for such comparisons. We will conduct a Bayesian NMA using the Markov chain Monte Carlo method. We will run both random‐effects models and fixed‐effect models using OpenBUGS/WinBUGS (Winbugs 2000), according to guidance from the National Institute for Health and Care Excellence (NICE) Decision Support Unit (DSU) documents (Dias 2016). We will report the results of both models. For model selection, we will use the deviance information criterion (DIC), considering a difference of more than five points to be significant. Residual deviance also indicates whether a model’s fit is satisfactory: the closer its value is to the number of data points, the better the model fit. We will select the model with the lower DIC and with the residual deviance closer to the number of data points to explain our results. We will use a binomial likelihood and logit link for binary outcomes, a normal likelihood and identity link for continuous outcomes, and a binomial likelihood and cloglog link for time‐to‐event outcomes. We will fit a hierarchical Bayesian model using three different initial values, employing codes provided by the NICE DSU (Dias 2016). We will set vague or flat priors, N(0, 100²), for the trial baselines and treatment effects. For the random‐effects model, we will use a uniform (0, 5) prior for the between‐trial standard deviation (SD), assuming the same between‐trial SD across treatment comparisons (Dias 2016). We will use 100,000 iterations after a 'burn‐in' of 50,000 simulations, check for convergence visually (i.e. inspect whether the values in different chains mix well) and obtain effect estimates. If we do not obtain convergence, we will increase the number of simulations for 'burn‐in' and iterations. If we still do not obtain convergence, we will use alternative initial values and priors, employing methods suggested by van Valkenhoef 2012.

We will calculate the probability of each treatment at each possible rank and estimate the surface under the cumulative ranking curve (SUCRA) for each outcome (Salanti 2011). We will present rankograms as a visual reflection of the uncertainty in the ranking probabilities (Chaimani 2017). If we get a sparse network, as described by Brignardello‐Petersen 2019a, the assumption of common between‐study heterogeneity across the network may result in spuriously wide confidence intervals for some of the comparisons in the network and excessively imprecise network estimates using a random‐effects model. In this case, according to the recommendations by Brignardello‐Petersen 2019a, we will trust the results of Bayesian fixed‐effect models or complete sensitivity analyses using frequentist models which can be accomplished by using STATA 14.2 (Stata).
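The ranking step can be illustrated with a short sketch that turns posterior samples of treatment effects into rank probabilities and SUCRA values, where SUCRA for a treatment is the sum of its cumulative rank probabilities over the first a − 1 ranks divided by a − 1 (a being the number of treatments). The function names and the simulated samples are hypothetical; in practice the samples would come from the MCMC output.

```python
import numpy as np

def rank_probabilities(samples, higher_is_better=True):
    """samples: array (n_iterations, n_treatments) of posterior draws.

    Returns a matrix P where P[t, r] is the probability that
    treatment t takes rank r (r = 0 is the best rank).
    """
    samples = np.asarray(samples, dtype=float)
    n_iter, n_treat = samples.shape
    order = np.argsort(-samples if higher_is_better else samples, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(n_iter)[:, None]
    ranks[rows, order] = np.arange(n_treat)  # rank of each treatment per draw
    probs = np.zeros((n_treat, n_treat))
    for t in range(n_treat):
        probs[t] = np.bincount(ranks[:, t], minlength=n_treat) / n_iter
    return probs

def sucra(probs):
    """Surface under the cumulative ranking curve for each treatment."""
    probs = np.asarray(probs, dtype=float)
    a = probs.shape[0]
    cumulative = np.cumsum(probs, axis=1)[:, :-1]  # first a - 1 ranks only
    return cumulative.sum(axis=1) / (a - 1)

# Simulated draws for three hypothetical treatments (illustrative only)
rng = np.random.default_rng(0)
draws = rng.normal(loc=[0.0, 0.3, 0.5], scale=0.2, size=(5000, 3))
p = rank_probabilities(draws, higher_is_better=True)
print(np.round(p, 3))       # each row is one treatment's rankogram
print(np.round(sucra(p), 3))
```

A SUCRA of 1 means a treatment is always ranked best and 0 means it is always ranked worst; each row of the rank probability matrix is the rankogram for one treatment.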

To avoid drawing misleading conclusions, we will consider the simultaneous presentation of results for outcomes about efficacy and toxicity. In addition, we will interpret the findings based on the combination of evidence characteristics (risk of bias in included studies, heterogeneity, incoherence and selection bias), GRADE evaluations, and results with confidence intervals (Chaimani 2019; Guyatt 2008).

Assessment of inconsistency

We will assess inconsistency between direct and indirect sources of evidence. Firstly, because inconsistency is a reflection of the presence of effect modifiers, we will assess the distribution of potential effect modifiers across treatment comparisons based on an evaluation of participant, intervention and methodological characteristics (Dias 2014; Robertson 2019). Then we will fit both an inconsistency model and a consistency model. We will conduct a global assessment of inconsistency using OpenBUGS/WinBUGS (Winbugs 2000), comparing the goodness of fit of an inconsistency model with the standard consistency model used in the main analyses, which assumes consistency between direct and indirect evidence. We will assess the impact of heterogeneity using I² or Chi² and between‐trials SD (for random‐effects models only) because the risk of inconsistency is greatly reduced if between‐trial heterogeneity is low. We will assess goodness‐of‐fit statistics (residual deviance and DIC). If there is an improved fit of the inconsistency model of five points or more on the DIC or substantial reduction in between‐study deviation, there might be sufficient evidence of potential inconsistency (Robertson 2019). In addition, we will fit node‐splitting models using the GeMTC package in R, which can examine inconsistency locally (R 2017; van Valkenhoef 2016a; van Valkenhoef 2016b). If inconsistency is detected, node‐splitting methods can be helpful to identify which piece of evidence is responsible for inconsistency (Chaimani 2017). We will also complete prespecified subgroup or meta‐regression analyses to explore and explain the possible sources of heterogeneity and inconsistency (Chaimani 2019).
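The idea behind a local inconsistency check can be illustrated with a much simpler frequentist calculation than the Bayesian node‐splitting models described above: an adjusted indirect comparison through a common comparator, followed by a z statistic for the difference between the direct and indirect estimates. This is a swapped‐in simplification for illustration only, not the GeMTC procedure; the function names and numbers are hypothetical.

```python
import math

def indirect_estimate(d_ab, se_ab, d_ac, se_ac):
    """Indirect effect of C versus B through common comparator A.

    d_ab: effect of B versus A; d_ac: effect of C versus A (same scale,
    e.g. log hazard ratio). Variances add because the sources are independent.
    """
    return d_ac - d_ab, math.sqrt(se_ab**2 + se_ac**2)

def inconsistency_z(direct, se_direct, indirect, se_indirect):
    """Difference between direct and indirect evidence with its z statistic."""
    diff = direct - indirect
    se = math.sqrt(se_direct**2 + se_indirect**2)
    return diff, se, diff / se

# Hypothetical log hazard ratios
d_ind, se_ind = indirect_estimate(d_ab=-0.20, se_ab=0.15, d_ac=-0.45, se_ac=0.20)
diff, se_diff, z = inconsistency_z(direct=-0.10, se_direct=0.18,
                                   indirect=d_ind, se_indirect=se_ind)
print(round(d_ind, 3), round(se_ind, 3), round(z, 3))
```

A large absolute z value would point to disagreement between the direct and indirect evidence for that comparison, which is the kind of signal node‐splitting is designed to locate within a full network.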

Subgroup analysis and investigation of heterogeneity

If substantial heterogeneity exists, and there is a sufficient number of included studies (> 10), we will complete subgroup analysis according to the following prespecified characteristics:

  1. Intervention characteristics: the treatment techniques used for SBRT and radical fractionated radiotherapy, such as IMRT or three‐dimensional conformal radiation therapy (3DCRT);

  2. Participant characteristics: age (> 60 years versus ≤ 60 years); pathology (squamous or non‐squamous); cancer stage (stage I or stage II); location of tumor (central or peripheral);

  3. Year trial started (< 2010, 2010 to 2015, > 2015).

In addition, if there is a sufficient number of included studies (e.g. more than 10 studies for each covariate modeled), we will also perform meta‐regression to investigate heterogeneity and inconsistency with the help of the codes provided in the NICE DSU guidance (Dias 2012). We will use a single common interaction term model as recommended by the guidance. This model assumes an identical interaction effect for all treatments compared to the reference treatment/standard treatment (i.e. conventional radiotherapy). This guarantees that the interaction terms cancel out when two active treatments are compared. For continuous covariates, the interaction terms will describe how the intervention effect changes with a unit increase in the covariate. For categorical covariates, the interaction terms will estimate the additional intervention effect in each subgroup compared to a nominated reference subgroup (Deeks 2021; Dias 2012). We will give the interaction terms non‐informative priors: for example N(0, 100²) (Dias 2012). We will include each factor as a covariate in the NMA models. We will present the results of fitting fixed‐effect and random‐effects NMAs and interaction models with covariates in a table (including posterior treatment effects, interaction terms, between‐trial heterogeneity and their credible intervals; the residual deviance, number of parameters and DIC etc.). We will compare the heterogeneity (between‐study SD) and goodness‐of‐fit statistics (DIC, residual deviance) from the model with covariates against those from the no‐covariate model. In addition, if the 95% credible interval (CrI) of the interaction term does not include zero, we will consider this to represent statistically significant heterogeneity. If a sufficiently credible modifier exists and can explain the heterogeneity, we will present separate evidence summaries for each subgroup. 
If heterogeneity remains unexplained after the investigation, we will rate down the certainty of evidence (Guyatt 2011; Schünemann 2021).
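The single common interaction term model can be written out as a sketch in the following form. The notation is an assumption made for illustration: θ is the linear predictor on the link scale for arm k of trial i, μ is the trial baseline, δ is the trial‐specific treatment effect relative to arm 1, t denotes the treatment in a given arm, x is the trial‐level covariate, and treatment 1 is the reference (conventional radiotherapy).

```latex
\theta_{ik} = \mu_i + \delta_{i,1k} + \left(\beta_{t_{ik}} - \beta_{t_{i1}}\right) x_i ,
\qquad
\beta_{1} = 0, \quad \beta_{t} = b \ \text{ for all } t \neq 1 ,
\qquad
b \sim N\!\left(0,\ 100^{2}\right)
```

For a trial comparing two active treatments, both interaction coefficients equal b, so the covariate term becomes (b − b)x = 0 and drops out; for a trial whose arm 1 is the reference treatment, the term reduces to bx.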

Sensitivity analysis

We will execute sensitivity analyses to examine the robustness of the review findings.

  • Undertaking the analysis excluding unpublished studies (i.e. those for which only abstracts are available).

  • Undertaking the analysis excluding lower quality studies (those at high risk of bias).

  • Undertaking the analysis to compare the results of the random‐effects model and the fixed‐effect model.

  • Undertaking the analysis excluding missing data.

  • If a trial reports only per‐protocol analysis results, we plan to re‐analyze the results using the best‐worst case scenario and worst‐best case scenario analyses as sensitivity analysis.

  • Undertaking the analysis excluding trials in which any imputation technique is used. We will also undertake sensitivity analysis by excluding the trials in which the median is used to replace the mean.
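The best‐worst and worst‐best case scenarios listed above can be sketched for a dichotomous outcome as follows. The example assumes that the event is unfavorable (e.g. death) and that the effect measure is a risk ratio; the function name and counts are hypothetical.

```python
def scenario_risk_ratio(exp, ctl, scenario):
    """Risk ratio (experimental / control) under an extreme-case imputation.

    exp, ctl: dicts with 'events', 'observed' and 'missing' counts,
    where 'observed' is the number of participants with known outcomes.
    Assumption: the event is unfavorable.
    'best-worst': no missing experimental participant had the event,
                  every missing control participant did.
    'worst-best': the reverse.
    """
    if scenario == "best-worst":
        e_events = exp["events"]
        c_events = ctl["events"] + ctl["missing"]
    elif scenario == "worst-best":
        e_events = exp["events"] + exp["missing"]
        c_events = ctl["events"]
    else:
        raise ValueError("scenario must be 'best-worst' or 'worst-best'")
    e_total = exp["observed"] + exp["missing"]  # all randomized participants
    c_total = ctl["observed"] + ctl["missing"]
    return (e_events / e_total) / (c_events / c_total)

# Hypothetical trial that reported only per-protocol results
experimental = {"events": 10, "observed": 80, "missing": 20}
control = {"events": 20, "observed": 90, "missing": 10}
print(scenario_risk_ratio(experimental, control, "best-worst"))
print(scenario_risk_ratio(experimental, control, "worst-best"))
```

If the conclusion is the same at both extremes, missing outcome data are unlikely to have driven the result.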

Summary of findings and assessment of the certainty of the evidence

We will evaluate and present the certainty of the evidence for each outcome using the GRADE approach. This takes into account criteria related to internal validity (risk of bias, inconsistency, imprecision, publication bias) and external validity (such as directness of results) (Guyatt 2008). Certainty of evidence can be graded into four levels: high, moderate, low, and very‐low certainty. Two authors (YZ and H‐M F) will independently assess the certainty of the body of evidence for each outcome following the GRADE framework. If there are any disagreements, we will consult a third author (J‐H T). We will present the reasons for all decisions to downgrade or upgrade the quality of the evidence, and, where necessary, we will provide comments to aid the reader's understanding of the review. 

We will follow the approach suggested by Puhan and Brignardello‐Petersen to evaluate confidence in evidence from a network meta‐analysis (Brignardello‐Petersen 2018; Brignardello‐Petersen 2019a; Brignardello‐Petersen 2019b; Puhan 2014). Firstly, we will present direct and indirect treatment estimates for each comparison of the evidence network. Then, we will rate the certainty of evidence separately for direct and indirect evidence. We will assess direct evidence using the standard GRADE approach. For the indirect evidence, we will focus on the most dominant first order loop. We will assess the direct comparisons contributing to the indirect comparison using the GRADE approach, and the lower confidence rating will be the initial confidence rating of the indirect comparison. In the absence of a first order loop, we will use a higher order loop. However, we will rate down further the initial confidence rating of the indirect comparison in the presence of intransitivity. We will evaluate the certainty of both the direct and indirect evidence without considering imprecision. For the network evidence, if the network estimate is dominated by either the direct or the indirect estimate, we will base the network quality rating on the dominant estimate and will not rate it down because of incoherence. But if both sources of evidence contribute to a similar degree, we will use the higher of the two quality ratings as the network quality rating, and rate it down further for incoherence. We will then consider imprecision. In addition, if intransitivity and incoherence coexist, we will not rate the network estimate down separately for both issues, but will rate it down once to cover both.

We will use GRADEpro GDT to create summary of findings tables, as recommended in the Cochrane Handbook (Schünemann 2021). We will adopt a modified version of the new summary of findings tables format for NMAs (Yepes‐Nuñez 2019). We will report the following outcomes: overall survival, progression‐free survival, radiotherapy‐related toxicity, tumor response to treatment, health‐related quality of life.