
PET‐CT for assessing mediastinal lymph node involvement in patients with suspected resectable non‐small cell lung cancer

This is not the most recent version


Abstract

This is a protocol for a Cochrane Review (Diagnostic test accuracy). The objectives are as follows:

To examine the accuracy of integrated PET‐CT for mediastinal staging of patients with suspected or confirmed NSCLC which is potentially suitable for treatment with curative intent.   

Background

Target condition being diagnosed

Accurately determining the diagnosis and stage of lung cancer is important to enable patients to be offered the best possible treatment, but the process is often complex. The symptoms and signs of lung cancer can be difficult to distinguish from those of other diseases (some of which may coexist in lung cancer patients) and many lung cancers are diagnosed via atypical pathways (e.g. emergency or Accident & Emergency admissions, via other specialities, or opportunistically; Department of Health 2011) by means of a variety of different biopsies and imaging techniques, some of which yield information about both diagnosis and staging (NICE 2011). The complexity is augmented by the need to consider the location of the primary tumour, patient preferences and the fitness of the patient, which itself may influence both diagnostic and treatment decisions and may require a change to the diagnostic and staging pathway.

A major determinant of treatment offered to patients with non‐small cell lung cancer (NSCLC) is the intra‐thoracic (mediastinal) nodal status. If the disease has not spread to the ipsilateral mediastinal and/or subcarinal (N2) nodes and the patient is otherwise considered fit for surgery, resection is often the treatment of choice (see also Manser 2005). Other treatment options for these patients include combination or single‐modality treatment with radiotherapy and/or chemotherapy (see also O'Rourke 2010). Planning the optimal treatment is therefore critically dependent on accurate staging of the disease.  

Lung cancer staging is performed using an arsenal of different complementary tests, some of which are non‐invasive (e.g. various types of imaging; Silvestri 2007), some of which are invasive (e.g. surgical staging, mediastinoscopy) and some of which are minimally invasive (e.g. endobronchial ultrasound‐guided transbronchial needle aspiration (EBUS‐TBNA), endoscopic ultrasound‐guided fine needle aspiration (EUS‐FNA) and various other methods for obtaining biopsies) (see also Detterbeck 2007). Of these, the only test that can access all the relevant lymph nodes is surgical staging, by means of resection of the primary tumour and systematic nodal dissection. However, surgical staging is highly invasive and so is not appropriate for many patients without first acquiring further information about the likely suitability for resection with curative intent. Suitability for resection with curative intent will most often be determined by imaging tests (including combined positron emission tomography and computed tomography; PET‐CT) to assess the probability of malignant involvement and to detect extra‐thoracic metastases and mediastinal lymph node metastasis that would preclude treatment with curative intent. Imaging findings may have to be followed by one or more biopsies to confirm the results of these tests pathologically. Occasionally, when imaging tests are unequivocally positive for cancer, the findings alone will be enough to exclude patients from radical treatment. This, in effect, means that only the patients who receive resection will receive the ultimate reference standard (i.e. surgical staging), while those patients who are found to have unresectable NSCLC will usually have had their cancer stage pathologically confirmed by a number of other tests that are considered suitable for the location of the affected lymph node(s).

The reference standard for the present review will therefore necessarily have to consist of a number of invasive tests that all yield pathologically confirmable information and that collectively can be considered as tests that provide cyto‐histological confirmation of tumour extent. The secondary aims of this review reflect our consideration of this issue as we aim to consider potential differences in the reference standard as a source of heterogeneity between the studies.

Index test(s)

PET‐CT is a non‐invasive staging method of the mediastinum that is widely available and used extensively by all lung cancer multi‐disciplinary teams. PET‐CT is most commonly performed using [18F]‐2‐fluoro‐deoxy‐D‐glucose (FDG) as a tracer to provide a measure of glucose uptake with simultaneous computed tomography to aid localisation. Before receiving a PET‐CT scan, most patients will already have received a CT scan and PET‐CT is most commonly used to confirm early‐stage disease in patients who have no significant nodes (≥ 1 cm on the short axis) on CT or to clarify nodal status, in which case PET‐CT is not always the first test after CT. That is, currently, the role of PET‐CT is primarily in triaging patients by identifying patients with no spread to the mediastinum who may therefore be candidates for resection and distinguishing those from patients with distant and/or mediastinal metastases that may need to be biopsied before their treatment plan can be developed.

Rationale

Although the non‐invasive nature of PET‐CT constitutes one of the major advantages of the test, PET‐CT may be suboptimal in detecting malignancy in normal‐sized lymph nodes and in ruling out malignancy in patients with co‐existing inflammatory or infectious diseases (Cerfolio 2005; Kim 2006; Lee 2007; Shim 2005; Tournoy 2007; Yi 2007). The role of PET‐CT in the accurate staging pathway for patients with lung cancer is therefore still debated and a crucial question is when a biopsy sample is needed to increase the sensitivity and specificity of PET‐CT. Multi‐disciplinary teams must have a clear idea of the likelihood of false positive and false negative PET‐CT results in a given circumstance (in particular, for a given size of mediastinal nodes) in order to best manage patients and advise them whether or not a biopsy is necessary. A false negative rate that is consistently above 20% would cause clinicians to question the utility of the test. However, the question is complex. For example, in the case of detection of distant metastases in patients otherwise fit for surgery, a 20% false negative rate might lead to only one patient in 100 having futile surgery (as the baseline rate of distant metastases is around 5%, and 20% of 5% is 1%). Where patients have had surgery, a 20% false negative rate in mediastinal nodes would mean that management is changed in one in five patients, but this does not necessarily mean that an operation was the wrong thing to do, as we know from the National Lung Cancer Audit that outcomes are better when patients have had surgery, even for N2 disease (NICE 2011). On balance, we will focus on nodal metastases in this review. False negative outcomes of PET‐CT should only apply to nodes that are not significantly enlarged on (a prior) CT, as enlarged nodes should be biopsied. False positives are of less concern since they should always be followed by a further test to confirm.
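The arithmetic behind the distant metastases example can be sketched as follows. This is an illustrative calculation only: the function name is hypothetical, and the sketch assumes that every patient with a false negative result goes on to surgery that proves futile, which is a simplification rather than a claim made in this protocol.

```python
def futile_surgery_per_100(prevalence: float, false_negative_rate: float) -> float:
    """Expected number of patients per 100 whose disease is missed by the test.

    Assumption (ours): each missed patient proceeds to futile surgery.
    """
    return 100 * prevalence * false_negative_rate


# Figures quoted in the text: a baseline rate of distant metastases of
# around 5% and a false negative rate of 20%.
missed = futile_surgery_per_100(prevalence=0.05, false_negative_rate=0.20)
print(round(missed, 2))  # 1.0, i.e. about one patient in 100
```

The same function shows how quickly the picture changes when the baseline rate is higher, which is why the false negative rate cannot be judged in isolation from prevalence.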

This review represents an extension of a previous review we have undertaken in this area for the 2011 NICE guideline on 'The diagnosis and treatment of lung cancer (update)' (NICE 2011) which included fewer studies and no meta‐analysis. To the best of our knowledge the accuracy of PET‐CT for detecting N2 disease in patients with suspected resectable NSCLC has not previously been systematically reviewed and meta‐analysed by any others.

Objectives

To examine the accuracy of integrated PET‐CT for mediastinal staging of patients with suspected or confirmed NSCLC which is potentially suitable for treatment with curative intent.   

Secondary objectives

To assess potential sources of heterogeneity including study design (e.g. retrospective/prospective, consecutive/random series), patient populations (number and characteristics, e.g. T‐ and N‐stage, significant nodes on prior CT, country), different cut‐off values for test positivity (malignancy), differences in PET‐CT image acquisition and/or scanning equipment and potential differences in reference standard (mediastinoscopy/pathological or surgical staging).

Investigation of sources of heterogeneity

See 'Secondary objectives'.

Methods

Criteria for considering studies for this review

Types of studies

Prospective or retrospective cross‐sectional studies that assess the diagnostic accuracy of integrated PET‐CT for diagnosing N2 disease in patients with suspected resectable NSCLC and report patients as the unit of analysis.

Participants

Patients with suspected/confirmed NSCLC who are considered potentially suitable for primary resection. This review will not consider patients who are being re‐staged after induction or neoadjuvant chemotherapy.

Index tests

PET‐CT carried out on the various available integrated PET‐CT scanners with cut‐off values for test positivity as reported in the studies that will be included. That is, the type of integrated PET‐CT scanner, scanner manufacturer and cut‐off values will not influence whether a study is included or not; rather, we will examine the potential contribution of these factors to systematic between‐study variation as potential sources of heterogeneity as part of the secondary objectives. However, we will not consider studies that employ tracers other than FDG or other nuclear medicine imaging such as single photon emission computed tomography (SPECT) or stand‐alone PET.

Target conditions

Resectability of lung cancer depends on the locoregional spread of the disease. NSCLC is generally not considered resectable if it has spread beyond N1 disease. Thus, the target condition of this review is resectable NSCLC which for the present purposes is defined as NSCLC that has not spread to the ipsilateral mediastinal and/or subcarinal (N2) lymph nodes.

Reference standards

Pathological confirmation of PET‐CT results from samples obtained via surgical resection with mediastinal sampling, mediastinoscopy, video‐assisted thoracic surgery (VATS), EBUS‐TBNA, EUS‐FNA, TBNA, transthoracic needle aspiration (TTNA) and/or biopsies of extra‐thoracic sites.

Search methods for identification of studies

Electronic searches

We will search the following databases using the search terms and strategy identified in Table 1: MEDLINE; EMBASE; MEDION; Science Citation Index; The Cochrane Library; CINAHL; EBM online; ACP Journal; C‐EBLM; ARIF; HTA database; DARE; Scopus; NTIS; openSIGLE; ProQuest Dissertations & Theses; Index Medicus for the Eastern Mediterranean region; KoreaMed; IMSEAR; Panteleimon; WPRIM; African Index Medicus; LILACS; IndMED; Australasian Medical Index; the Chinese Biomedical Literature Database; ISI Proceedings; and BIOSIS Previews. We will search trials registers (such as http://clinicaltrials.gov and http://www.controlled‐trials.com) for research projects in progress. We will impose no language or publication status restrictions on the search.

Table 1. Search strategy for OVID MEDLINE database (and adapted for other databases)

1. exp Lung Neoplasms/
2. ((lung or lungs or pulmonary) adj3 (neoplasm$ or cancer$ or carcinoma$ or adenocarcinoma$ or angiosarcoma$ or chondrosarcoma$ or sarcoma$ or teratoma$ or lymphoma$ or blastoma$ or microcytic$ or tumour$ or tumor$)).ti,ab.
3. NSCLC.ti,ab.
4. or/1‐3
5. Tomography/
6. Tomography, Emission‐Computed/
7. Positron‐Emission Tomography/
8. Tomography, Spiral Computed/
9. Fludeoxyglucose f 18/
10. (FDG or Fludeoxyglucose or fluorodeoxyglucose or depreotide).tw.
11. ((positron or photon or scintillation) adj3 (emission or tomograph$)).tw.
12. (CGC or PET or SPECT or NEOTECT or NEOSPECT or NEOTEC).tw.
13. or/5‐12
14. 4 and 13
15. Neoplasm Staging/
16. (staging or grading).tw.
17. Lymphatic Metastasis/
18. Neoplasm Invasiveness/
19. Neoplasm Metastasis/
20. Mediastinal Neoplasms/
21. Lymph Nodes/
22. Mediastinum/
23. ((lymph node or mediastin$) adj2 (malignan$ or metast$ or neoplasm$)).tw.
24. or/15‐23
25. 14 and 24

Searching other resources

We will handsearch the reference lists of included articles along with the reference lists of any relevant review articles identified through the search. We will also contact the authors of included studies and other experts in the field of lung cancer staging for information about any ongoing or unpublished studies. We will impose no language or publication status restrictions on the search.

Data collection and analysis

Selection of studies

Firstly, one of the review authors (MSH) will assess the titles and abstracts of all the studies identified by the search for potential inclusion. This first stage of screening will exclude all records that are not studies of PET‐CT in patients with NSCLC. Secondly, two of the review authors (MSH and DRB) will assess the titles and abstracts of the remaining records for potential inclusion, and thirdly two of the review authors (MSH and DRB) will independently consider the full records of all potentially relevant studies for inclusion by applying the selection criteria outlined in the Types of studies section. We will resolve potential disagreements by discussion.

Data extraction and management

Using a standardised data extraction form, two authors (MSH, DRB) will extract data pertaining to study design, participant details, index and reference tests, and funding (see Table 2). We will resolve potential disagreements by discussion. If there are studies where only a subgroup of the participants meet the inclusion criteria for the current review, we will only extract data on this subgroup.

Table 2. Study details

Study ID: First author, year of publication

Clinical features and settings: Inclusion and exclusion criteria, previous tests for lung cancer (diagnosis and staging), clinical setting

Participants: Sample size, age, sex, comorbidities, country, histology of primary tumour

Study design: Prospective/retrospective, case‐control/consecutive/random patient series

Reference standard(s): The reference standard(s) used

Index test: Details of the index test used including the type of PET‐CT scanner, FDG dose, injection‐to‐scan time, attenuation correction and the cut‐off values for test positivity (malignancy)

Follow‐up: All patients accounted for in results, missing/uninterpretable test results, reasons for withdrawal, adverse events caused by test

Notes: Source of funding, anything else of relevance

FDG: [18F]‐2‐fluoro‐deoxy‐D‐glucose
PET‐CT: positron emission tomography and computed tomography

For the comparison of the index test with the reference standard, we will extract the number of true and false positives and true and false negatives for the index test if these numbers are presented in the studies. Otherwise, we will attempt to reconstruct the two‐by‐two table of true and false positives and negatives from the information reported in the studies.
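Where a study reports only summary measures, the reconstruction of the two‐by‐two table could proceed along the following lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, it assumes that the study reports sensitivity, specificity and the numbers of participants with and without the target condition according to the reference standard, and the example figures are invented for illustration rather than taken from any study.

```python
def reconstruct_two_by_two(n_condition_present: int, n_condition_absent: int,
                           sensitivity: float, specificity: float) -> dict:
    """Back-calculate the cells of a two-by-two table from summary measures.

    Counts are rounded to the nearest whole participant, so the result is
    only as exact as the precision of the reported figures allows.
    """
    tp = round(sensitivity * n_condition_present)  # index test positive, condition present
    fn = n_condition_present - tp                  # index test negative, condition present
    tn = round(specificity * n_condition_absent)   # index test negative, condition absent
    fp = n_condition_absent - tn                   # index test positive, condition absent
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical study: 40 participants with and 160 without the target condition,
# reported sensitivity 75% and specificity 90%.
table = reconstruct_two_by_two(40, 160, sensitivity=0.75, specificity=0.90)
print(table)
```

A reconstructed table should be checked against any other figures the study reports (for example predictive values or the total sample size) before it is used.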

Assessment of methodological quality

Two of the authors (MSH, DRB) will assess the quality of each study using a modified version of the QUADAS tool (Whiting 2003), as outlined in Table 3. Each item listed in Table 3 requires a 'yes', 'no' or 'unclear' response. We will also document the reasons for each response in accordance with the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy (http://srdta.cochrane.org./). We included three additional items on our checklist: 1. In studies where a cut‐off value was used, was it pre‐specified? 2. Was there a clear definition of a positive result? 3. Was the study free of commercial funding? We included the items pertaining to cut‐off values and clear definitions of positive results to take into account the subjective nature of PET‐CT image interpretation, which may be based on a variety of different criteria such as extensive clinical experience, different standardised uptake values (SUV) and/or different morphological features. We included the third additional item in order to record any potential bias resulting from commercial interest in the results. We will resolve potential disagreements between the quality ratings through discussion.

Table 3. Study characteristics

Item

Description

Was the spectrum of patients representative of the spectrum of patients who will receive the test in practice?

'Yes' if the characteristics of the participants are well described and probably typical of a secondary healthcare setting
'No' if the sample is unrepresentative of people with potentially resectable lung cancer in general
'Unclear' if the source or characteristics of participants is not adequately described

Is the reference standard likely to correctly identify the target condition?

'Yes' if reference standard is sampling of mediastinal nodes with pathological diagnosis
'No' if there is no sampling of mediastinal nodes with pathological diagnosis
'Unclear' if insufficient information is provided

Is the time period between the reference standard and the index test short enough to be reasonably sure that the target condition did not change between the two tests?

'Yes' if the time period between PET‐CT and the reference standard is ≤ 8 weeks
'No' if the time period between PET‐CT and the reference standard is > 8 weeks
'Unclear' if insufficient information is provided

Partial verification avoided?

'Yes' if all participants who received the index test also underwent the reference test
'No' if not all the participants who received the index test also underwent the reference test
'Unclear' if insufficient information is provided
If not all participants received the reference tests, how many did not (of the total)?

Differential verification avoided?

'Yes' if the same reference test was used regardless of the index test results
'No' if different reference tests are used depending on the results of the index test
'Unclear' if insufficient information is provided
If any participants received a different reference test, what were the reasons stated for this, and how many participants were involved?

Incorporation avoided?

Should be 'Yes' for all studies, as the reference standard is defined in the inclusion criteria as pathological staging
'Yes' if the index test is not part of the reference standard
'No' if the index test is clearly part of the reference standard
'Unclear' if insufficient information is provided to assess this item

Reference standard results blinded?

'Yes' if the report stated that the person undertaking the reference test did not know the results of the index tests, or if the two tests were carried out in different places
'No' if the report stated that the same person performed both tests, or that the results of the index tests were known to the person undertaking the reference tests
'Unclear' if insufficient information is provided

Index test results blinded?

'Yes' if the report stated that the person undertaking the index test did not know the results of the reference tests, or if the two tests were carried out in different places
'No' if the report stated that the same person performed both tests, or that the results of the reference tests were known to the person undertaking the index test
'Unclear' if insufficient information is provided

Relevant clinical data available?

'Yes' if clinical data (i.e. patient history, other test results) would normally be available when the test results are interpreted and similar data are available in the study
'No' if the clinical data (i.e. patient history, other test results) that would normally be available when the test results are interpreted are not available in the study or if other test results are available that would not normally be available when interpreting the test results
'Unclear' if the paper does not explain which clinical information was available at the time of assessment

Uninterpretable results reported?

'Yes' if the number of participants in the two‐by‐two table matches the number of participants recruited into the study, or if sufficient explanation is provided for any discrepancy
'No' if the number of participants in the two‐by‐two table does not match the number of participants recruited into the study, and insufficient explanation is provided for any discrepancy
'Unclear' if insufficient information is given to permit judgement
Report how many results were uninterpretable (of the total)

Withdrawals explained?

'Yes' if there are no participants excluded from the analysis, or if exclusions are adequately described
'No' if there are participants excluded from the analysis and there is no explanation given
'Unclear' if not enough information is given to assess whether any participants were excluded from the analysis
Report how many participants were excluded from the analysis, for reasons other than uninterpretable results

If a cut‐off value has been used, was it established before the study was started (pre‐specified cut‐off value)?

'Yes' if pre‐specified
'No' if the authors selected the optimal cut‐off value based on the results of the study
'Unclear' if there is a range of cut‐off values and there is doubt which cut‐off has been used, or if there is no mention at all of a cut‐off value

Did the study provide a clear definition of what was considered to be a positive result?

'Yes' if the definition of a positive result is clearly stated (e.g. SUV)
'No' if no definition of what was considered a positive result is stated or the definition of a positive result varied between the patients
'Unclear' if not enough information is given to permit judgement

Was the study free of commercial funding?

'Yes' if the funding source is clearly stated and is not commercial
'No' if the funding source is clearly stated and is commercial
'Unclear' if not enough information is given to assess whether the funding source is commercial

SUV: standardised uptake value

Statistical analysis and data synthesis

We will extract the numbers of true positives, false positives, true negatives and false negatives for each study based only on the ability of PET‐CT to distinguish between N0 or N1 disease on the one hand and N2 or N3 disease on the other. N0 and N1 disease are therefore both considered negatives and N2 and N3 are both considered positives. If PET‐CT indicates N1 disease which is shown by the reference standard to be N0 disease (and vice versa), the PET‐CT result will still be considered a true negative because N0 and N1 disease are both considered resectable disease. The same principle will apply to N2 and N3 disease, that is, if PET‐CT indicates N2 disease which is shown to be N3 disease by the reference standard (and vice versa) the PET‐CT result will still be considered a true positive. However, if PET‐CT indicates N0 or N1 disease which the reference standard shows to be N2 or N3 disease, then the PET‐CT result will be considered a false negative and similarly if PET‐CT indicates N2 or N3 disease which is shown by the reference standard to be N0 or N1 disease, the PET‐CT result will be considered a false positive. If data for more than one positivity threshold are reported, we will initially extract all the data, but only analyse the threshold most commonly used by all the studies. We will only extract data with patient as the unit of analysis and not, for example, lymph node.
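The classification rule described above can be expressed as a short sketch. The function and variable names are hypothetical and the patient list is invented purely to exercise each branch of the rule; it is not data from any study.

```python
from collections import Counter

# Dichotomisation used in this review: N0/N1 count as negative, N2/N3 as positive.
NEGATIVE_STAGES = {"N0", "N1"}
POSITIVE_STAGES = {"N2", "N3"}


def is_positive(n_stage: str) -> bool:
    """Return True for N2/N3, False for N0/N1; reject anything else."""
    if n_stage in POSITIVE_STAGES:
        return True
    if n_stage in NEGATIVE_STAGES:
        return False
    raise ValueError(f"Unknown N stage: {n_stage!r}")


def classify(pet_ct_stage: str, reference_stage: str) -> str:
    """Label one patient as TP, FP, FN or TN against the reference standard."""
    index_positive = is_positive(pet_ct_stage)
    reference_positive = is_positive(reference_stage)
    if index_positive and reference_positive:
        return "TP"
    if index_positive:
        return "FP"
    if reference_positive:
        return "FN"
    return "TN"


# Invented (PET-CT stage, reference standard stage) pairs, one per patient.
patients = [("N0", "N0"), ("N1", "N0"), ("N2", "N3"), ("N0", "N2"), ("N3", "N1")]
counts = Counter(classify(index, reference) for index, reference in patients)
print(dict(counts))
```

Note that the N1 versus N0 pair and the N2 versus N3 pair count as agreement, because only the boundary between N0/N1 and N2/N3 matters for this review.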

We will plot the estimates of the observed sensitivities and specificities in forest plots and in a receiver operating characteristic (ROC) plot of sensitivity versus 1‐specificity in order to visually assess the between‐study variability. If collected data are based on different thresholds for a positive test result, we will fit a summary ROC curve using the HSROC model (Harbord 2006; Rutter 2001). We will select one threshold per study in the special case of a single study reporting data for more than one threshold. If studies show sufficient clinical homogeneity (see Investigations of heterogeneity), we will obtain summary accuracy estimates for one or more clinically relevant thresholds, which we will select before performing the analyses. We will present summary estimates with a 95% confidence ellipse in the ROC space. We will use pooled estimates of sensitivity and specificity to calculate the positive and negative likelihood ratios.
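The relationship between sensitivity, specificity and the likelihood ratios can be illustrated with the standard formulae. The sketch below works from a single two‐by‐two table with invented counts; it is not the pooled HSROC analysis planned for this review, and the function name is hypothetical. The same two likelihood ratio formulae apply when pooled sensitivity and specificity are substituted.

```python
def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and likelihood ratios from one two-by-two table.

    Raises ZeroDivisionError if specificity is exactly 1 (LR+ undefined)
    or exactly 0 (LR- undefined); a real analysis would need to handle this.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Invented counts for illustration only.
measures = accuracy_measures(tp=30, fp=16, fn=10, tn=144)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these counts the sketch gives a sensitivity of 0.75, a specificity of 0.90, a positive likelihood ratio of 7.50 and a negative likelihood ratio of 0.28.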

Investigations of heterogeneity

Several factors can contribute to heterogeneity in diagnostic accuracy of a test across studies. We will check for heterogeneity as part of the planned meta‐analysis. Anticipated sources of heterogeneity include study design (e.g. retrospective/prospective, consecutive/random series), patient populations (number and characteristics, e.g. T‐ and N‐stage, significant nodes on prior CT, country), different cut‐off values for test positivity (malignancy), differences in PET‐CT image acquisition and/or scanning equipment and potential differences in reference standard (mediastinoscopy/pathological or surgical staging).

Sensitivity analyses

If sufficient data are available, we will examine the robustness of the meta‐analyses by conducting sensitivity analyses using different components of the quality assessment, particularly those relating to whether cut‐off values were pre‐specified, whether a clear definition for test positivity was used and whether commercial funding was provided. We will add covariates to the bivariate model to test whether either sensitivity or specificity, or both, differ in subgroups of studies defined according to these covariates. The analysis will aim to estimate valid measures of predictive accuracy taking into account confounding by any methodological flaws. We will use the NLMIXED procedure in SAS version 9.1 for Windows (SAS Institute Inc, Cary, NC, USA) to fit the HSROC model.
