Ultrasound, CT, MRI, or PET‐CT for staging and re‐staging of adults with cutaneous melanoma

Summary of findings table

Question		How accurate is ultrasound, CT, MRI, or PET‐CT for staging or re‐staging of cutaneous invasive melanoma in adults?
Population:		Adults with a confirmed diagnosis of melanoma undergoing imaging for staging purposes: Before sentinel lymph node biopsy (SLNB) to identify nodal metastases For full body staging following removal of the primary melanoma For full body staging due to suspected recurrence of disease
Index test(s):		Ultrasound with or without fine needle aspiration cytology (FNAC) Computed tomography (CT) Magnetic resonance imaging (MRI) Positron emission tomography–computed tomography (PET‐CT)
Comparator test:		All of the index tests may be used in comparison to each other
Target condition:		For pre‐SLNB imaging: detection of nodal metastases For all other imaging: detection of any metastases
Reference standard:		Histology plus clinical or imaging follow‐up
Action:		If accurate, positive results of imaging before SLNB in some circumstances could allow patients with nodal metastases to proceed directly to commence adjuvant therapy and avoid an additional invasive procedure (SLNB). Accurate whole body imaging will allow appropriate locoregional and systemic therapies to be initiated in a timely manner
Quantity of evidence (n = 39 studies)		Number of studies		Number of participants		Number of cases
Per patient data:		34		4980		1265
Per lesion data:		7		417 (1846 lesions)		1061 metastases
Limitations
Risk of bias:		Some concerns due to poor reporting across almost all domains. Unclear risk for participant selection method (11/39) or exclusions not clearly described (3/39). High risk from exclusions on the basis of index test results (4/39). Low risk for the index test for pre‐SLNB ultrasound (6/11), other ultrasound evaluation (3/5), CT (7/10), and MRI (4/4). For PET‐CT, unclear risk from lack of description of blinded case note review to ascertain imaging results for retrospective studies (13/23) and high risk from data driven selection of test threshold (1/23). Unclear risk for reference standard from lack of detail on participant follow‐up schedules (12/39). Lack of blinding of the histological diagnosis (2/39) or data collection on follow‐up (3/39) to the index result. High risk from differential verification (20/39) and participant exclusions (13/39). Low risk for comparisons between tests (6/9)
Applicability of evidence to question:		High or unclear concern for applicability for almost all domains. High concern for participant selection from mixed populations (11/39) or data presented per lesion (5/39). Unclear concern from lack of clarity regarding study population. High concern for index tests from poor description of test thresholds (pre‐SLNB ultrasound (1/11), other ultrasound (1/5), CT (5/10), MRI (3/4), PET‐CT (4/23)) or consensus test interpretation (CT (6/10), MRI (2/4), PET‐CT (11/23)). Unclear concern for application and interpretation of the index test (pre‐SLNB ultrasound (10/11), CT (3/10), MRI (2/4), PET‐CT (6/23)) or unclear observer expertise (pre‐SLNB ultrasound (6/11), CT (3/10), MRI (2/4), PET‐CT (6/23)). Unclear concern for applicability of the reference standard from lack of description of the target condition or no breakdown of cases according to nodal or distant metastases. Expertise of the histopathologist poorly described (6/39)
Findings
Thirty‐nine studies reporting accuracy data for pre‐SLNB imaging (n = 18) or for whole body imaging (n = 24) were included. The 24 studies of whole body imaging were of primary staging (n = 6) or staging for potential recurrence of disease (n = 3), or were conducted in mixed or not clearly described populations (n = 15). As we are unable to make clear statements regarding the expected accuracy of imaging at any particular point on the clinical pathway for the mixed population group, the findings presented are based on results for pre‐SLNB imaging, and for primary staging and re‐staging of melanoma only.
Test: pre‐SLNB imaging
Test	Studies: patients (cases)	Sensitivity (95% CI)	Specificity (95% CI)	Numbers in a cohort of 1000 lesions at a median prevalence of 23.7%^a
				TP (95% CI)	FN (95% CI)	FP (95% CI)	TN (95% CI)
US	11: 2614 (542)	35.4 (17.0 to 59.4)	93.9 (86.1 to 97.5)	84 (40 to 141)	153 (197 to 96)	47 (106 to 19)	716 (657 to 744)
US + FNAC	3: 1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)	43 (8 to 134)	194 (229 to 103)	2 (7 to 1)	761 (756 to 762)
PET‐CT	4: 170 (49)	10.2 (4.31 to 22.3)	96.5 (87.1 to 99.1)	24 (10 to 53)	213 (227 to 184)	27 (98 to 7)	736 (665 to 756)
Whole body imaging for primary staging of melanoma
Quantity of evidence (n = 6 studies)		Number of studies		Number of participants		Number of cases
Any metastases		3		81		51
Nodal metastases		3		373		68
Distant metastases		2		112		17
Findings
Four of the six studies evaluated PET‐CT, one in comparison to CT. In participants with primary melanomas > 4 mm thick (two studies), sensitivities for the detection of any metastases were 30% (95% CI 7% to 65%) to 47% (95% CI 29% to 65%), and specificities 73% (95% CI 45% to 92%) to 88% (95% CI 68% to 97%). One study of any participant referred for PET‐CT demonstrated no false positive results for either CT or PET‐CT for the detection of nodal metastases (specificity 100%, 95% CI 92% to 100%); however, sensitivity was higher for PET‐CT (38%, 95% CI 14% to 68%) compared to CT (23%, 95% CI 5% to 54%). For the detection of distant metastases, two additional cases were detected with PET‐CT (sensitivity 42%, 95% CI 15% to 72%) in comparison to CT (25%, 95% CI 5% to 57%) with no difference in specificity (93%, 95% CI 81% to 99%). One study of PET‐CT suggested that an SUVmax threshold ≥ 2.2 at baseline predicted later recurrence with a sensitivity of 89% (95% CI 52% to 100%) and a specificity of 61% (95% CI 41% to 78%). No data for MRI were identified. Results for ultrasound for the detection of nodal metastases (2 studies) were highly variable and likely subject to bias.
Whole body imaging for re‐staging of melanoma
Quantity of evidence (n = 3 studies)		Number of studies		Number of participants (lesions)		Number of cases (metastases)
Any metastases:		2 (1)		153 (139)		95 (87)
Nodal metastases:		1		460		37
Distant metastases:		0		N/A		N/A
Findings:
Two studies of PET‐CT for re‐staging were pooled; summary sensitivity for the detection of any metastasis was 92.6% (95% CI 85.3% to 96.4%) and specificity 89.7% (95% CI 78.8% to 95.3%) (153 patients, 95 cases). In one of the two studies, PET‐CT was more sensitive (89%, 95% CI 78% to 96%) than CT alone (increase of 21%). With similar specificity (88%, 95% CI 76% to 95%), PET‐CT was more sensitive in the subgroup with stage IIIC to IV disease (100%, 95% CI 81% to 100%) than in those with less advanced disease (84%, 95% CI 69% to 94%). One study of ultrasound in clinically node negative patients undergoing follow‐up demonstrated 100% sensitivity (95% CI 91% to 100%) for 'common signs of malignancy' or focal hypoechoic cortical thickening (considered test positive) with a specificity of 93% (95% CI 90% to 95%). No data for MRI were identified.
^aMedian prevalence observed across 11 studies of pre‐SLNB ultrasound (interquartile range: 25th percentile 20.5%, 75th percentile 25.4%). CT: computed tomography; FN: false negative; FNAC: fine needle aspiration cytology; FP: false positive; MRI: magnetic resonance imaging; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; TN: true negative; TP: true positive.
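The cohort numbers in the pre‐SLNB imaging table can be reproduced from the summary estimates by simple arithmetic. The following Python sketch is illustrative only and is not part of the review's methods: the function name `cohort_counts` and the rounding order (number of cases first, then true positives and false positives) are assumptions that happen to match the tabulated point estimates.

```python
def cohort_counts(sensitivity, specificity, prevalence=0.237, cohort=1000):
    """Translate sensitivity and specificity (as proportions) into
    expected TP, FN, FP and TN counts in a hypothetical cohort.

    The rounding order used here is an assumption, not a method
    stated in the source.
    """
    cases = round(cohort * prevalence)          # people with the target condition
    non_cases = cohort - cases                  # people without the target condition
    tp = round(cases * sensitivity)             # true positives
    fn = cases - tp                             # false negatives (missed cases)
    fp = round(non_cases * (1 - specificity))   # false positives
    tn = non_cases - fp                         # true negatives
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Summary estimates for pre-SLNB ultrasound taken from the table above
print(cohort_counts(0.354, 0.939))
# {'TP': 84, 'FN': 153, 'FP': 47, 'TN': 716}
```

Applying the same function to the lower and upper confidence limits of each estimate gives the bracketed ranges; because FN and FP fall as sensitivity and specificity rise, their ranges appear in descending order in the table.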

Background

This review is one of a series of Cochrane Diagnostic Test Accuracy (DTA) reviews on the diagnosis and staging of melanoma and keratinocyte skin cancers conducted for the National Institute for Health Research (NIHR) Cochrane Systematic Reviews Programme. Appendix 1 shows the content and structure of the programme. Appendix 2 provides a glossary of terms used, and Appendix 3 presents a table of acronyms used.

Target condition being diagnosed

Melanoma is one of the most aggressive forms of skin cancer, with the potential to metastasise to other parts of the body via the lymphatic system and the bloodstream. Melanoma accounts for a small percentage of skin cancer cases but is responsible for up to 75% of skin cancer deaths (Boring 1994; Cancer Research UK 2017). Melanoma arises from uncontrolled proliferation of melanocytes ‐ the epidermal cells that produce pigment or melanin. It most commonly arises in the skin but can occur in any organ that contains melanocytes, including mucosal surfaces, the back of the eye, and the lining around the spinal cord and brain. 'Cutaneous melanoma' refers to a skin lesion with malignant melanocytes present in the dermis, and includes superficial spreading and nodular, acral lentiginous, and lentigo maligna melanoma variants (Figure 1).

Figure 1

Sample photographs of superficial spreading melanoma (left) and nodular melanoma (right). Copyright © 2010 Dr. Rubeta Matin: reproduced with permission.

The incidence of melanoma rose to over 200,000 newly diagnosed cases worldwide in 2012 (Erdmann 2013; Ferlay 2015), with an estimated 55,000 deaths (Ferlay 2015). The highest incidence is observed in Australia, with 11,405 new cases of melanoma of the skin (ACIM 2014), and in New Zealand, with 2341 registered cases in 2010 (Cancer Society of New Zealand 2013). In the USA for 2014, the predicted incidence was 73,870 per annum, and the predicted number of deaths 9940 (Siegel 2015). The highest rates in Europe are seen in northwestern Europe and the Scandinavian countries, with highest incidence reported in Switzerland of 25.8 per 100,000 in 2012. Rates in the UK trebled from 4.6 and 6.0 per 100,000 in men and women, respectively, in England in 1990, to 18.6 and 19.6 per 100,000 in 2012 (EUCAN 2012). In the UK, melanoma has one of the fastest rising incidence rates of any cancer, and it shows the biggest projected increase in incidence between 2007 and 2030 (Mistry 2011). In the decade leading up to 2013, age standardised incidence increased by 46%, with 14,500 new cases in 2013 and 2459 deaths in 2014 (Cancer Research UK 2017a). Although overall incidence rates are higher in women than in men, the rate of incidence in men is increasing faster than in women (Arnold 2014).

The rising incidence of melanoma is thought to be primarily related to rising recreational sun exposure and tanning bed use, along with an increasingly ageing population with higher lifetime recreational ultraviolet (UV) exposure (Boniol 2012; Gandini 2005), in conjunction with possible earlier detection (Belbasis 2016; Linos 2009). Putative risk factors are reviewed in detail elsewhere (Belbasis 2016), but they can be broadly divided into host and environmental factors. Host factors include fair skin and light hair or eye colour; older age (Geller 2002); male sex (Geller 2002); previous skin cancer history (Tucker 1985); predisposing skin lesions (e.g. high melanocytic naevus counts (Gandini 2005), clinically atypical naevi (Gandini 2005), and large congenital naevi (Swerdlow 1995)); genetically inherited skin disorders (e.g. xeroderma pigmentosum) (Lehmann 2011); and a family history of melanoma (Gandini 2005). Environmental factors include recreational and occupational exposure to sunlight (both cumulative and episodic burning) (Armstrong 1977; Gandini 2005); artificial tanning (Boniol 2012); and immunosuppression (e.g. in organ transplant recipients or human immunodeficiency virus (HIV)‐positive individuals) (DePry 2011). Lower socioeconomic class may be associated with delayed presentation and thus more advanced disease at diagnosis (Reyes‐Ortiz 2006).

The main prognostic indicators following diagnosis of cutaneous melanoma can be divided into histological and clinical factors. Histologically, Breslow thickness is the single most important predictor of survival, as it is a quantitative measure of tumour invasion or volume, and thus propensity to metastasise (Balch 2001). Other factors associated with poorer prognosis histologically include microscopic ulceration, mitotic rate, microscopic satellites, regression, lymphovascular invasion, and nodular (rapidly growing) or amelanotic (lacking in melanin pigment) subtypes (Moreau 2013; Shaikh 2012). Independent of tumour thickness, prognosis is worse in older people, males, and those with locally recurrent lesions, regional lymph node involvement, or primary lesion location on the scalp or neck (Zemelman 2014).

Following histological confirmation of diagnosis, the lesion is staged according to the American Joint Committee on Cancer (AJCC) Staging System to inform treatment strategy (the eighth version of the Staging System ‐ AJCC 8 ‐ is outlined in Gershenwald 2017). Stage 0 refers to melanoma in situ; stages I to II localised melanoma; stage III regional metastasis (spread to the lymph nodes, usually but not always those nearest to the primary tumour); and stage IV distant metastasis. A preliminary stage is assigned based on histological evaluation (thickness of primary lesion and presence of ulceration) and clinical (and sometimes radiological) assessment of regional lymph nodes. A pathological stage is then confirmed based on histology of the primary lesion and of the regional lymph nodes (if the patient has sentinel lymph node biopsy (SLNB) or completion lymphadenectomy (CLND) for those with clinically palpable lymph nodes) and imaging to confirm the presence or absence of disseminated disease, where indicated.

An American database of over 40,000 patients from 1998 onwards, which assisted the development of AJCC 8, indicated five‐year survival of 99% for very early‐stage melanoma, dropping to anything between 32% and 93% in stage III disease, depending on tumour thickness, the presence of ulceration, and the number of involved nodes (Gershenwald 2017). Before the advent of targeted therapy and immunotherapies, disseminated melanoma (to distant sites/visceral organs) was associated with median survival of six to nine months, one‐year survival of 25%, and three‐year survival of 15% (Balch 2009; Korn 2008).

Between 1975 and 2010, five‐year relative survival for melanoma (i.e. not including death from other causes) in the United States increased from 80% to 94%, with survival for localised, regional, and distant disease estimated at 99%, 70%, and 18%, respectively, in 2010 (Cho 2014). However, mortality rates showed little change, at 2.1 deaths per 100,000 in 1975 and 2.7 per 100,000 in 2010 (Cho 2014). Increasing incidence of localised disease over the same period (from 5.7 to 21 per 100,000) suggests that much of the observed improvement in survival may be due to earlier detection and heightened vigilance (Cho 2014). New targeted therapies for advanced (stage IV) melanoma (e.g. BRAF inhibitors) have improved survival, and immunotherapies are evolving such that long‐term survival is being documented (Rozeman 2018). No new data regarding survival prospects for patients with stage IV disease were analysed for the AJCC 8 staging guidelines because of lack of contemporary data (Gershenwald 2017).

Treatment of melanoma

Treatment of melanoma varies to some extent, according to the stage of disease upon diagnosis. For primary melanoma, the mainstay of treatment is complete lesion excision, with a safety margin some distance from the borders of the primary tumour to remove both the tumour and any malignant cells that might have spread into the surrounding skin (Garbe 2016; Marsden 2010; NICE 2015a; SIGN 2017; Sladden 2009). Recommended surgical margins vary according to tumour thickness ‐ Garbe 2016 ‐ and stage of disease at presentation ‐ NICE 2015a. Evidence for further local or regional interventions such as wider surgical margins is limited (Sladden 2009; Wheatley 2016), although further trials in this area are planned.

Sentinel lymph node biopsy has been offered to those without clinically palpable lymph nodes as a means of providing prognostic information for several years, with the option of CLND in the event of a positive result (metastases identified on SLNB). Recent data (MSLT‐II ‐ Kyrgidis 2015 and Morton 2014 ‐ and DeCOG ‐ Leiter 2016 and Leiter 2018 ‐ trials) show no survival benefit from CLND for this patient group, and the procedure is no longer a standard of care for most patients. Recent advances demonstrating longer recurrence‐free survival for patients with stage III melanoma receiving BRAF‐directed therapy or immunotherapies have resulted in use of SLNB as a test to identify patients who should be offered adjuvant treatment (Eggermont 2016; Eggermont 2018; Long 2017; Weber 2017). Currently available guidelines do not, as yet, reflect this recent change in practice (Garbe 2016; NICE 2015a). In the UK, the National Institute for Health and Care Excellence (NICE) has already approved dabrafenib and trametinib for adjuvant treatment of resected BRAF V600 mutation positive melanoma (NICE 2018a), with further appraisals of pembrolizumab for adjuvant treatment of melanoma with high risk of recurrence (NICE 2018b), as well as ongoing appraisals of nivolumab for adjuvant treatment of resected stage III and IV melanoma (NICE 2019a).

For stage IV melanoma, dacarbazine was the only drug approved worldwide for many years, with fotemustine used in some European countries (Avril 2004), and interleukin (IL)‐2 given in the USA (Atkins 1999). Temozolomide has also been used, especially for people with brain metastases, because of its strong ability to pass the blood‐brain barrier (Lukas 2014; Zhu 2014). This landscape has changed dramatically, with two distinct therapeutic approaches suggesting survival benefit in metastatic melanoma: (1) targeting mutations in tumour cells, and (2) providing immunomodulation (Chapman 2011; Chapman 2012; Dummer 2014; Hamid 2013; Hodi 2010; Larkin 2014; Robert 2015; Villanueva 2010). Several different therapies have now shown high response rates and, most important, have demonstrated for the first time in the treatment of melanoma the potential for a durable clinical response (Chapman 2011; Hamid 2013; Hodi 2010; Hodi 2016; Larkin 2015; Maio 2015; Sznol 2013). Several therapies are now recommended for use alone or in combination for particular subgroups of patients with metastatic melanoma, both in the UK ‐ NICE 2018a ‐ and beyond ‐ Garbe 2016 ‐ and have recently been the topic of a Cochrane Review (Pasquali 2018). An appraisal of encorafenib with binimetinib for advanced BRAF V600 mutation positive melanoma is under way (NICE 2019b), and several other treatments are currently suspended pending marketing authorisation applications from the companies concerned (NICE 2018c).

Psychosocial interventions to improve quality of life and general psychological distress after diagnosis for patients with cancer are also available. However, a Cochrane Review found considerable variation in the evidence to support such interventions (Galway 2012).

Index test(s)

Accurate staging of melanoma is more important than ever, in part to avoid unnecessary treatment and associated morbidity in those with early‐stage disease, and in part to ensure that potentially effective therapies are initiated in a timely manner for those with nodal or distant metastatic disease.

Imaging techniques such as ultrasound, computed tomography (CT), or magnetic resonance imaging (MRI) scans can be undertaken at several points along the clinical pathway, including on initial presentation of disease (primary staging), on development of recurrence (re‐staging), and on follow‐up after previous treatment for those who are asymptomatic for recurrence. The use of imaging during follow‐up with no specific clinical indication for imaging (i.e. as a monitoring test for disease surveillance) is not the focus of our reviews. Historically, most staging in terms of imaging has been undertaken in people with clinical stage III and IV disease (see Clinical pathway). However, this landscape is changing as more adjuvant systemic therapies for melanoma are becoming available.

Imaging tests are typically undertaken and interpreted by radiologists, with decisions about patient management following imaging or SLNB made at multi‐disciplinary team meetings that include oncologists, dermatologists, and surgeons (Clinical pathway).

Ultrasound

Ultrasound uses high‐frequency sound waves to create images of the body. Ultrasound can be used to assist in detection of diseased lymph nodes in patients with clinically node negative melanoma. Patients who have a positive imaging result proceed to fine needle aspiration cytology (FNAC) or core biopsy, and patients who are negative on ultrasound alone or on ultrasound combined with FNAC proceed to SLNB. A 2011 systematic review identified 21 studies of ultrasound for primary lymph node staging or surveillance; for primary staging, sensitivity was 60% for detection of diseased lymph nodes, with specificity of 97% (the number of studies that considered staging vs surveillance is unclear) (Xing 2011).

Computed tomography (CT) (non‐contrast‐enhanced or contrast‐enhanced)

Computed tomography scans use ionising radiation in the form of X‐rays to take cross‐sectional images of the body (Bluemm 1983; van Waes 1983). This procedure involves varying amounts of radiation according to the area of the body to be scanned (Mahesh 2017), and it can be conducted with an intravenous contrast agent (contrast‐enhanced) to increase the sensitivity of metastasis detection in solid organs.

Mohr 2009 describes contrast‐enhanced CT as the best method of identifying intrathoracic metastases and as superior to X‐ray for detection of mediastinal and hilar adenopathy associated with lymphatic spread and for assessment of lesions in the bone. Computed tomography can also be used for assessment of metastatic spread to the brain, but magnetic resonance imaging (MRI) is considered more sensitive (Goulart 2011). Overall specificity is reportedly high for detection of regional nodal and distant disease, but sensitivity varies from 23% to 85% for detection of lymph node metastases, and from 25% to 74% for assessment of distant spread (Xing 2011).

Magnetic resonance imaging (MRI) (non‐contrast‐enhanced or contrast‐enhanced)

Magnetic resonance imaging scans use large magnets and non‐ionising radiation in the form of radio waves to generate images of the body (Ai 2012). These scans are more expensive and take longer to carry out compared to CT scans (Whaley 2016b).

We did not identify any systematic reviews of MRI for melanoma staging through our scoping searches; however, several studies have considered whole body MRI (Jouvet 2014; Mosavi 2013), as well as MRI for detection of brain or hepatic metastases (Aukema 2010a; Sofue 2012). Because melanoma is one of the top three cancers responsible for brain metastases (Cagney 2017), the body of evidence for the incremental accuracy of MRI compared with other imaging tests must be considered.

PET‐CT (positron emission tomography‐computed tomography)

Positron emission tomography‐computed tomography is a hybrid imaging technique that provides both functional and anatomical information. It involves injection of a weakly radioactive positron‐emitting radiopharmaceutical, which is usually 2‐deoxy‐2‐[¹⁸F]fluoro‐D‐glucose (FDG), for the purposes of oncological imaging. The distribution of FDG throughout the body is represented on images, with malignant tissue usually demonstrating greater levels of FDG uptake than normal tissue (Lammertsma 2017). The low‐dose CT component of the study generates attenuation factors that improve the quality of PET images and allows accurate anatomical localisation of areas of FDG uptake (IAEA 2016). Although initially, PET scanners were stand‐alone devices, since 2004, all modern scanners have been integrated PET‐CT scanners (Jones 2017). A systematic review of the added value of integrated PET‐CT compared to PET alone across a range of cancers suggested a 10% increase in sensitivity of PET‐CT compared to PET alone from a meta‐analysis of 10 comparative studies (Gao 2013). For these reasons, PET alone has not been considered as an index test for this review.

In comparison to CT alone, PET‐CT is generally considered to be a more sensitive test (Xing 2011); however, increases in sensitivity must be linked to any patient benefit in terms of changes in management and ultimately in patient outcomes (Schroer‐Gunther 2012; Subesinghe 2013). It may be that PET‐CT has the greatest added value for metastases in areas that are difficult to image with CT or other imaging modalities (Tan 2012), or for indeterminate metastases in areas such as the lung. Whether these assumptions are supported by current evidence has yet to be established. The evidence report for the NICE guideline in 2015 found no evidence "to suggest that earlier treatment of metastatic disease improves survival and therefore increased sensitivity was viewed currently as not an important issue" (NICE 2015d). With adjuvant therapy now an increasing option for melanoma, this conclusion seems likely to be revised in a future guideline update.

Clinical pathway

Staging of confirmed melanoma takes place in secondary and tertiary care settings only (NICE 2015a). Recommendations on the management of melanoma following diagnosis, published in the 2015 NICE Guideline (NICE 2015a), as well as in other UK guideline documents (Burkill 2014; Marsden 2010; Melanoma Taskforce 2011), are summarised in Figure 2 and are outlined below; however, practice varies across the UK. It is important to note that clinical practice is changing as more adjuvant therapies are licensed for the treatment of melanoma, and this is not adequately reflected by current guidelines. However, a consensus statement reflecting changes in decision thresholds for the use of SLNB for staging of melanoma has been published (Melanoma Focus 2018). Any key variations in practice recommended in European or US guidelines (ESMO 2019; Swetter 2019), or under consideration in a current Australian guideline update (Cancer Council Australia 2019; Gyorki 2018; Millward 2018; Morton 2018; Saw 2018), are also reflected below.

Figure 2

Summary of 2015 NICE guideline recommendations for the management of cutaneous melanoma following primary diagnosis (NICE 2015a); not necessarily reflective of current practice.

Following complete excision of the primary lesion, all patients should undergo preliminary staging. This involves a detailed clinical history to determine if there are any symptoms such as weight loss suggesting metastatic spread of disease, followed by a thorough clinical examination, including whole body skin examination, palpation of the lymph nodes, and full abdominal and chest examination (Figure 2). A preliminary stage is assigned on the basis of histopathology results for the primary lesion(s). Those with palpable lymph nodes are automatically assigned to clinical stage III or IV, and those with no palpable lymph nodes are assigned a stage between 0 and IIC, according to the thickness of the tumour (Breslow) and the presence of ulceration (Gershenwald 2017).

The results of all investigations carried out during the process of diagnosis are discussed at a multi‐disciplinary team meeting (Melanoma Taskforce 2011), where decisions regarding further staging procedures are made. This could be a local skin multi‐disciplinary team or, for those with stage IIB disease and above, a specialist skin multi‐disciplinary team (Marsden 2010). Teams should include dermatologists, surgeons (including plastic surgeons), medical and clinical oncologists, radiologists, histopathologists, skin cancer nurse specialists, physiotherapists, psychologists, lymphoedema service providers, occupational therapists, and cosmetic camouflage advisors (Melanoma Taskforce 2011).

On current UK guidance (based on AJCC version 7 (Balch 2009)), no further staging investigations beyond a full clinical examination are recommended for people with thin melanomas (≤ 1 mm) without ulceration or mitoses, and SLNB is reserved for those with stage IB or stage II disease (NICE 2015a). Current practice is now based on staging according to AJCC version 8, for example, with ‘thin’ melanomas now defined as < 0.8 mm in thickness without evidence of ulceration (Gershenwald 2017). Furthermore, with the advent of new adjuvant therapies, SLNB is now considered essential in determining eligibility for systemic adjuvant therapy (Gyorki 2018; Melanoma Focus 2018; Swetter 2019), and imaging is used in sentinel node positive patients to confirm absence of further disease spread (ESMO 2019; Swetter 2019). SLNB is recommended for those with primary melanoma greater than 1.0 mm and should be considered for some patients with thinner melanomas (i.e. melanomas < 0.8 mm with ulceration, and melanomas 0.8 to 1.0 mm with or without ulceration), especially in the presence of lymphovascular invasion or a mitotic rate of at least 2 per mm² (Melanoma Focus 2018). Those with clinically palpable lymph nodes or with significant nodal disease identified on imaging are likely to undergo CLND, with the option of adjuvant therapy for those with no evidence of distant metastases.

Available recommendations on the optimal choice of imaging tests vary to some extent, even within the UK (Burkill 2014; Melanoma Focus 2014; NICE 2015a). Computed tomography is generally the imaging test of choice; however, some centres additionally offer high‐resolution ultrasound, MRI, or PET‐CT scans. The National Institute for Health and Care Excellence recommends CT staging to identify those who may benefit from systemic therapy among those with stage IIC, stage III, or suspected stage IV disease (NICE 2015a), as well as imaging of the brain (with CT for adults and MRI for children and young adults) only if metastatic disease outside the central nervous system is suspected (NICE 2015a). However, the Melanoma Focus position paper recommends that all ‘high‐risk’ patients should undergo CT of the chest, abdomen, and pelvis (or whole body PET‐CT), plus MRI of the head, as standard treatment (Melanoma Focus 2014). In current clinical practice, eligibility for imaging is likely to diverge from both of these target groups; however, the emergence of new treatment options is not likely to impact the choice of imaging tests performed nor body sites imaged.

European guidelines recommend pre‐SLNB baseline lymph node (LN) ultrasound for stage IB to IIA disease, and CT or PET for stage IIB and upwards (ESMO 2019). Australian guidelines in Morton 2018 and US guidelines in Swetter 2019 recommend against baseline imaging for all asymptomatic and clinically node negative patients. In the United States, CT or PET‐CT may be considered for sentinel lymph node (SLN) positive disease but otherwise should be reserved for investigation of specific signs or symptoms of nodal or distant metastases (Swetter 2019). In Australia, ultrasound and FNAC are recommended to identify the extent of regional LN involvement in clinically node positive melanoma (Saw 2018), as well as whole body PET‐CT with CT or MRI of the brain for clinical stage III or IV disease (Saw 2018; Millward 2018).

The Royal College of Radiologists guideline recommends that scans should be tailored according to the site of the primary lesion and the most likely regional lymph node basin. In general, CT imaging of the head, chest, abdomen, and pelvis should be employed for lower limb and lower body wall lesions, with CT of the neck added for upper limb, scalp, neck, and upper torso primary tumours (Burkill 2014). Magnetic resonance imaging may be more appropriate for imaging the central nervous system (Burkill 2014). Although PET‐CT has been suggested to have a role in imaging the lower limbs, further evidence is required (Burkill 2014).

Genotyping is also now offered to identify BRAF mutations to allow further planning of systemic treatment (Melanoma Taskforce 2011; NICE 2018a; NICE 2019b).

Prior test(s)

Consideration of the degree of prior testing that study participants have undergone is key to interpretation of resulting test accuracy indices, which are known to vary according to the spectrum or case mix of included participants (Lachs 1992; Leeflang 2013; Moons 1997; Usher‐Smith 2016). Prior testing can be considered in two ways. First, the results of any tests undertaken around the time of application of the index test may contribute to the decision to undertake the index test in any particular study participant. For example, PET‐CT may be undertaken because of the presence of high‐risk primary melanoma characteristics or because of abnormal findings on abdominal ultrasound or chest X‐ray; the likelihood of abnormal findings on PET‐CT, and therefore sensitivity or specificity, may be influenced by the results of any tests previously undergone.

Second, prior testing can be considered in terms of the place on the clinical pathway or the time course of disease that patients have reached. People undergoing imaging for staging following a primary diagnosis of melanoma are less likely to have metastatic spread of disease compared to those for whom imaging is prompted by signs of recurrence, and the nature of any disease spread is likely to vary between a primary staging population and patients undergoing follow‐up, who may have already undergone previous treatment such as complete lymphadenectomy. Reinhardt 2006 evaluated the accuracy of CT, PET, and PET‐CT in 250 participants with melanoma "at different time points in the course of disease", including primary staging after sentinel node biopsy (n = 75); therapy control after chemotherapy for metastatic disease (n = 42); staging of clinically suspected recurrent disease (n = 65); and assessment during follow‐up within five years of primary treatment (n = 68). For both nodal and distant staging, the overall sensitivity and specificity of each test masked likely variations in accuracy between subgroups. For example, the overall sensitivity and specificity of CT for detection of nodal metastases were 85% and 87%, but when estimated for each subgroup of participants, the sensitivity of CT ranged from 67% for those undergoing follow‐up to 93% for those having imaging for treatment evaluation, and specificities ranged from 73% for the treatment evaluation group to 93% for those having primary staging (Reinhardt 2006). The overall pooled analysis suggested statistically significant differences in sensitivities (CT 73% vs PET‐CT 99%; P < 0.0001) and in specificities (CT 88% vs PET‐CT 98%; P < 0.0001) for detection of distant metastases, but for the primary staging subgroup, no difference in sensitivities was observed (93.8% for both tests) and the difference in specificities was non‐significant (CT 94.9% vs PET‐CT 98.3%) (Reinhardt 2006). 
For the re‐staging subgroup, differences in both sensitivities (CT 85% vs PET‐CT 100%) and specificities (CT 79% vs PET‐CT 96%) between tests were observed (Reinhardt 2006). Although subgroup numbers were relatively small, these findings lend support to the hypothesis that the clinical pathway does affect test accuracy in this context, although as for other tests and diseases, the mechanisms of action can be complex and difficult to identify (Leeflang 2013).

Role of index test(s)

Ultrasound with FNAC as a triage test before SLNB was originally promoted as having a role in fast‐tracking those with positive cytology results (micro‐metastases identified) to CLND, while those with negative cytology may proceed to SLNB, as required (Voit 2014). With the changing clinical pathway and lack of evidence for survival benefit from CLND (Leiter 2018; Morton 2014), the only potential role for ultrasound and FNAC in the UK is considered to be at centres where SLNB is not immediately available (with a positive cytology result indicating that adjuvant therapy should be initiated); however, this approach is still recommended for use following primary melanoma diagnosis in Europe (ESMO 2019), as well as for clinically node positive melanoma in Australia (Saw 2018).

No role has been recommended for imaging tests in early‐stage disease. The need to rule out distant metastases among those who are otherwise eligible for adjuvant therapy suggests that imaging might now be used in a much more broadly defined patient group than previously. To date, CT has been recommended as the imaging approach of choice for detection of nodal and distant spread for those with stage III or IV disease (and for those with stage IIC if no SLNB has been performed) (NICE 2015a). Positron emission tomography‐computed tomography is increasingly used; however, practice varies across the country, primarily according to availability. Any advantages for disease management derived from PET‐CT are not yet known. The most appropriate role for MRI in staging melanoma in adults, other than for central nervous system disease, remains unclear.

Alternative test(s)

Several other tests may be used to inform disease management following a diagnosis of melanoma.

Sentinel lymph node biopsy, which allows detection of metastatic spread to the regional lymph node basins, is the topic of another review in this series of reviews (Ferrante di Ruffano 2019).

Core needle biopsy of the lymph nodes, as in Whaley 2016a, or FNAC, as in Hall 2013, to confirm the presence of macro‐metastases can be guided by simple palpation or, for more deep‐seated lesions, via image‐based guidance to identify micro‐metastases (requiring use of a microscope for visualisation) (Bohelay 2015). Although the accuracy of core needle biopsy compared to fine needle aspiration has been identified as a key clinical question for investigation, this topic is beyond the scope of these reviews, which focus primarily on detection of non‐palpable metastatic disease.

Genetic testing of primary melanoma specimens, for BRAF mutations for example, is used increasingly (NICE 2015a), particularly with the emergence of systemic treatments for BRAF V600 mutation positive melanoma (Chapman 2011; Chapman 2012; Larkin 2014; Larkin 2015). However, its purpose is to inform systemic treatment decisions rather than to serve as an integral part of the staging procedure itself. Biomarkers, such as S100, are used in countries such as Germany as a marker of prognosis (Gray 2014), or of early disease relapse (Peric 2011), rather than for staging purposes per se (Egberts 2010; Pirpiris 2010), and lactate dehydrogenase (LDH) is part of AJCC staging for stage IV (Pirpiris 2010); however, these approaches are beyond the scope of our reviews.

Rationale

Appropriate staging of melanoma is crucial for ensuring that patients are directed to the most appropriate and effective treatment. Several tests are available to assist in the staging of melanoma; however, their comparative accuracy for detection of nodal or distant metastases, or both, according to histological stage at presentation is unclear.

The NICE guideline recommendations for staging (see Clinical pathway) were based on available systematic reviews of both SLNB and imaging tests (Hall 2013; Jimenez‐Requena 2010; Krug 2008; Rodriguez 2014; Valsecchi 2011; Xing 2011), with some supplementary data derived from primary studies (NICE 2015d). Most reviews are limited in terms of currency (de Rosa 2011; Jimenez‐Requena 2010; Krug 2008; Valsecchi 2011; Warycha 2009; Xing 2011), with literature searches in most cases extending only as recently as 2009 (Jimenez‐Requena 2010; Krug 2008; Valsecchi 2011; Xing 2011). Furthermore, the only review that compared accuracy across imaging tests did not consider histological stage (Xing 2011). Two reviews provide a more recent evaluation of PET and PET‐CT (search dates up to 2012 and 2011, respectively) (Rodriguez 2014; Schroer‐Gunther 2012); however, the Schroer‐Gunther 2012 review also relied on previously published reviews (Jimenez‐Requena 2010; Krug 2008), with supplementary searching for more recently published studies, and the Rodriguez 2014 review included only stage III melanoma. The Schroer‐Gunther 2012 review relied on quality assessment that was carried out for the original systematic reviews, and only a small number of studies were eventually included; the review authors themselves recommend that future reviews should include a broader range of study designs (Schroer‐Gunther 2012).

The comparative accuracy of imaging tests according to stage of disease therefore remains to be determined. Furthermore, any evidence for or against the routine use of brain scanning in stage III melanoma with either CT or MRI remains to be identified. Positron emission tomography‐computed tomography is increasingly used, but any additional role of this test compared with CT or MRI needs to be examined according to particular patient groups.

This review follows a generic Cochrane DTA protocol for staging of melanoma (Dinnes 2017). The Background and Methods sections of this review therefore include some text that was originally published in the protocol (Dinnes 2017), along with text that overlaps some of our other reviews for the diagnosis or staging of melanoma (e.g. Dinnes 2018; Ferrante di Ruffano 2019).

Objectives

Primary objectives

We estimated accuracy separately according to the point in the clinical pathway at which imaging tests were used. Our objectives were:

to determine the diagnostic accuracy of ultrasound or PET‐CT for detection of nodal metastases before sentinel lymph node biopsy in adults with confirmed cutaneous invasive melanoma; and
to determine the diagnostic accuracy of ultrasound, CT, MRI, or PET‐CT for whole body imaging in adults with cutaneous invasive melanoma:
- for detection of any metastasis in adults with a primary diagnosis of melanoma (i.e. primary staging at presentation); and
- for detection of any metastasis in adults undergoing staging of recurrence of melanoma (i.e. re‐staging prompted by findings on routine follow‐up).

We undertook separate analyses according to whether accuracy data were reported per patient or per lesion.

Secondary objectives

We sought to determine the diagnostic accuracy of ultrasound, CT, MRI, or PET‐CT for whole body imaging (detection of any metastasis) in mixed or not clearly described populations of adults with cutaneous invasive melanoma.

For study participants undergoing primary staging or re‐staging (for possible recurrence), and for mixed or unclear populations, our objectives were:

to determine the diagnostic accuracy of ultrasound, CT, MRI, or PET‐CT for detection of nodal metastases;
to determine the diagnostic accuracy of ultrasound, CT, MRI, or PET‐CT for detection of distant metastases; and
to determine the diagnostic accuracy of ultrasound, CT, MRI, or PET‐CT for detection of distant metastases according to metastatic site.

Investigation of sources of heterogeneity

We aimed to consider a range of potential sources of heterogeneity for investigation, as outlined in our generic protocol and described in Appendix 4, but insufficient data were identified to allow any heterogeneity investigations to be undertaken.

Methods

Criteria for considering studies for this review

Types of studies

We included test accuracy studies that allow comparison of results of the index test versus a reference standard, including:

prospective and retrospective studies;
studies where all participants receive a single index test and a reference standard;
studies where all participants receive more than one index test(s) (concurrently) and a reference standard;
studies where participants are allocated (by any method) to receive different index tests or combinations of index tests and all receive a reference standard (between‐person comparative studies);
studies that recruit a series of participants unselected by true disease status; and
diagnostic case‐control studies that separately recruit diseased and non‐diseased groups (Rutjes 2005).

We excluded follow‐up and surveillance studies using repeated imaging tests to detect disease recurrence, as defining the most appropriate follow‐up schedule for melanoma patients is not the primary objective of these reviews.

We excluded studies if it was not possible to derive the numbers of true positives, false positives, false negatives, and true negatives from data provided in the paper, and we excluded small studies with fewer than five disease positive or fewer than five disease negative participants or lesions identified on imaging. Although the size threshold of five is arbitrary, such small studies are likely to yield unreliable estimates of sensitivity or specificity, and are unlikely to add precision to estimates of accuracy.
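The 2×2 requirement and the size threshold described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the review's extraction tooling: the class name `TwoByTwo`, the helper `meets_size_threshold`, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test results cross-classified against the reference standard."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative

    @property
    def diseased(self) -> int:
        return self.tp + self.fn

    @property
    def non_diseased(self) -> int:
        return self.fp + self.tn

    def sensitivity(self) -> float:
        return self.tp / self.diseased

    def specificity(self) -> float:
        return self.tn / self.non_diseased


def meets_size_threshold(table: TwoByTwo, minimum: int = 5) -> bool:
    """True when both the disease positive and disease negative groups reach the minimum."""
    return table.diseased >= minimum and table.non_diseased >= minimum


# Hypothetical counts, for illustration only (not taken from any included study).
example = TwoByTwo(tp=18, fp=4, fn=6, tn=72)
small = TwoByTwo(tp=3, fp=2, fn=1, tn=40)

print(f"sensitivity = {example.sensitivity():.2f}, specificity = {example.specificity():.2f}")
print("example eligible:", meets_size_threshold(example))
print("small eligible:", meets_size_threshold(small))
```

In this sketch the second table would be excluded because it contains only four disease positive participants, even though its specificity could be estimated from 42 disease negative participants.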

We included studies reporting either lesion‐based or participant‐based analyses; however, we accorded more weight to those reporting data on a per participant basis as detection of multiple metastatic sites in an individual patient may have a disproportionate effect on estimates of test accuracy based on per lesion data. Furthermore, treatment following staging is generally directed to the patient rather than to the individual metastatic lesion, making the patient the more appropriate unit of analysis.

We excluded studies available only as conference abstracts.

Participants

We included studies in adults with cutaneous melanoma at any primary site who were undergoing staging, either following primary presentation of disease or following recurrence of disease. We included for completeness studies that included mixed populations of patients, or where the clinical pathway could not be determined, but we undertook no statistical pooling. We included studies if up to 10% of participants had other forms of melanoma such as ocular or mucosal melanoma. We included studies with greater proportions of participants with non‐cutaneous melanoma and studies including participants with other forms of cancer only if test results for participants with cutaneous melanoma could be differentiated.

Index tests

Studies reporting accuracy data for a single application of one or more of the following tests were eligible for inclusion.

Ultrasound (with or without subsequent FNAC or core biopsy).
CT (non‐contrast‐enhanced or contrast‐enhanced).
PET‐CT (¹⁸FDG only).
MRI (non‐contrast‐enhanced or contrast‐enhanced).

We included any threshold for deciding test positivity, either qualitative or quantitative.

We excluded studies reporting multiple applications of the same test in more than 10% of study participants because of anticipated effects on test accuracy (multiple tests increasing the chance of detection of metastases, thereby increasing test sensitivity and reducing specificity). The threshold of 10% is arbitrary but allows for inclusion of studies primarily focused on evaluating the accuracy of a single test application for staging of disease. We excluded studies of surveillance imaging following initial definitive treatment.

Target conditions

Primary target conditions were defined as detection of:

nodal metastases in participants scheduled for SLNB (to identify those who should proceed directly to CLND); and
any metastases for all other staging.

Two additional definitions of the target condition were considered in secondary analyses, namely, detection of:

any nodal metastases; and
any distant metastases (combined or by metastatic site).

Reference standards

Acceptable reference standards include:

histology of lymph node or distant specimens, with samples obtained by core biopsy, SLNB, or lymph node dissection;
cytology of lymph node specimens, with samples obtained by core biopsy or fine needle aspiration;
clinical or radiological follow‐up of at least three months to identify nodal or distant recurrence; and
any combination of the above.

We excluded studies using cross‐sectional imaging‐based reference standards (i.e. direct comparison of the index test vs an alternative reference standard imaging test).

Search methods for identification of studies

Electronic searches

The Information Specialist (SB) carried out a comprehensive search for published and unpublished studies. A single large literature search was conducted to cover all topics in the programme grant (see Appendix 1 for a summary of reviews included in the programme grant). This allowed screening of search results for potentially relevant papers for all reviews at the same time. A search combining disease‐related terms with terms related to test names, using both text words and subject headings, was formulated. The search strategy was designed to capture studies evaluating tests for the diagnosis or staging of skin cancer. As a majority of records were related to searches for tests for staging of disease, a filter using terms related to cancer staging and to accuracy indices was applied to the staging test search to try to eliminate irrelevant studies, for example, those using imaging tests to assess treatment effectiveness. A sample of 300 records that would be missed by applying this filter was screened and the filter adjusted to include potentially relevant studies. When piloted on MEDLINE, inclusion of the filter for staging tests reduced the overall numbers by around 6000. The final search strategy, incorporating the filter, was subsequently applied to all bibliographic databases as listed below (Appendix 5). The final search result was cross‐checked against the list of studies included in five systematic reviews; our search identified all but one of these studies, and this study was not indexed on MEDLINE. The Information Specialist devised the search strategy, with input from the Information Specialist from Cochrane Skin. No additional limits were used.

We searched the following bibliographic databases to 29 August 2016 for relevant published studies.

MEDLINE via OVID (from 1946).
MEDLINE In‐Process & Other Non‐Indexed Citations via OVID.
Embase via OVID (from 1980).

We searched the following bibliographic databases to 30 August 2016 for relevant published studies.

Cochrane Central Register of Controlled Trials (CENTRAL; 2016, Issue 7), in the Cochrane Library.
Cochrane Database of Systematic Reviews (CDSR; 2016, Issue 8), in the Cochrane Library.
Cochrane Database of Abstracts of Reviews of Effects (DARE; 2015, Issue 2).
CRD HTA (Health Technology Assessment) database (2016, Issue 3).
Cumulative Index to Nursing and Allied Health Literature (CINAHL) via EBSCO from 1960.

We searched the following databases for relevant unpublished studies using a strategy based on the MEDLINE search.

Conference Proceedings Citation Index (CPCI), via Web of Science™ (from 1990; searched 28 August 2016).
Science Citation Index (SCI) Expanded™ via Web of Science™ (from 1900, using the 'Proceedings and Meetings Abstracts' Limit function; searched 29 August 2016).

We searched the following trials registers using the search terms 'melanoma', 'squamous cell', 'basal cell', and 'skin cancer' combined with 'diagnosis'.

Zetoc (from 1993; searched 28 August 2016).
US National Institutes of Health Ongoing Trials Register (www.clinicaltrials.gov; searched 29 August 2016).
NIHR Clinical Research Network Portfolio Database (www.nihr.ac.uk/research‐and‐impact/nihr‐clinical‐research‐network‐portfolio/; searched 29 August 2016).
World Health Organization International Clinical Trials Registry Platform (apps.who.int/trialsearch/; searched 29 August 2016).

We aimed to identify all relevant studies regardless of language or publication status (published, unpublished, in press, or in progress), but because of time constraints, we were unable to follow up on potentially relevant studies identified from conference abstracts. We applied no date limits.

Searching other resources

We screened relevant systematic reviews identified by the searches for their included primary studies, and we included any missed by our searches. We checked the reference lists of all included papers, and subject experts within the author team reviewed the final list of included studies. We conducted no electronic citation searching.

Data collection and analysis

Selection of studies

At least one review author (JDi or NC) screened titles and abstracts and discussed and resolved any queries by consensus. A pilot screen of 539 MEDLINE references showed good agreement (89% with a kappa of 0.77) between screeners. Primary test accuracy studies and test accuracy reviews (for scanning of reference lists) of any test used to investigate suspected melanoma, basal cell carcinoma (BCC), or cutaneous squamous cell carcinoma (cSCC) were included at initial screening. Inclusion criteria were applied independently by both a clinical review author (from one of a team of 12 clinician reviewers) and a methodologist review author (JDi, NC, or LFR) to all full‐text articles, and disagreements were resolved by consensus or by a third party (JDe, CD, HW, RM) (Appendix 6). No study authors were contacted in regard to study eligibility because of the volume of data retrieved. Authors of eligible studies were contacted when insufficient data were presented to allow for construction of 2×2 contingency tables.
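For readers unfamiliar with the agreement statistics quoted for the pilot screen, the sketch below shows how percentage agreement and Cohen's kappa are obtained from a 2×2 table of two screeners' include/exclude decisions. The function name and the counts are hypothetical; the pilot screen's actual cross-tabulation is not reported here, so the example does not attempt to reproduce the published figures.

```python
def cohen_kappa(both_include: int, only_a: int, only_b: int, both_exclude: int) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two screeners' binary decisions."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    # Marginal probability that each screener includes a record.
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance if the two screeners decided independently.
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 100 screened records, for illustration only.
observed, kappa = cohen_kappa(both_include=40, only_a=5, only_b=6, both_exclude=49)
print(f"agreement = {observed:.0%}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because part of the observed agreement would be expected by chance alone given each screener's overall inclusion rate.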

The study selection process is described in a PRISMA‐DTA flowchart (McInnes 2018).

Data extraction and management

One clinical (SAC, AD, AG, LP) and at least one methodologist review author (LFR, JDi) extracted data concerning details of study design, participants, index test(s) or test combinations, criteria for index test positivity, reference standards, and data required to populate a 2×2 diagnostic contingency table for each index test using a piloted data extraction form. Disagreements were resolved through discussion or by a third party (JDe, CD, HW, RM).

Dealing with multiple publications and companion papers

In the event of multiple reports of a primary study, the most complete and up‐to‐date data source available was used to contribute 2×2 contingency table data to eliminate double‐counting of datasets. When possible, yield of information regarding study methods and participants was maximised by extracting relevant data from multiple publications.

Assessment of methodological quality

We assessed risk of bias and applicability of included studies using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS‐2) checklist (Whiting 2011), which had been tailored to the review topic (Appendix 7). We piloted the modified QUADAS‐2 tool on a small number of included full‐text articles. One clinical (as detailed above) and at least one methodologist review author (LFR, JDi, BH, or SB) independently assessed quality for the remaining studies; any disagreement was resolved by consensus or by a third party when necessary (JDe, CD, HW, RM).

Statistical analysis and data synthesis

We conducted separate analyses first according to whether study participants were recruited on primary presentation of melanoma or with a disease recurrence, and second according to our primary and secondary objectives (i.e. detection of any metastasis (which must include both nodal and distant recurrence) and detection of nodal metastasis alone or detection of any distant metastasis, as defined under Target condition being diagnosed).

Studies may report test accuracy per lesion or per patient. Our unit of analysis for primary analyses was the patient, as study participants may have multiple metastatic sites at any one time, such that a per lesion analysis may overestimate test accuracy.

We initially explored the data by plotting estimates of sensitivity and specificity on coupled forest plots and in receiver operating characteristic (ROC) space for each index test. We performed meta‐analyses using the bivariate method to produce summary operating points (summary sensitivities and specificities) with 95% confidence and prediction regions (Chu 2006; Macaskill 2010; Reitsma 2005). When few studies were available for a meta‐analysis, we simplified the bivariate model to univariate fixed‐effect or random‐effects logistic regression models depending on whether or not heterogeneity was observed on forest plots and in ROC space (Takwoingi 2015). If there were only two or three studies and we observed heterogeneity on the plots, we did not pool the data, as a fixed‐effect approach would be inappropriate and the number of studies too small to reliably estimate random effects.
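As a rough illustration of the simplest of these models, the sketch below pools sensitivity across studies under a univariate fixed‐effect assumption. It relies on the fact that an intercept‐only binomial logistic regression has a closed‐form maximum likelihood estimate (the pooled proportion), with a Wald confidence interval constructed on the logit scale. It is not the bivariate or random‐effects model fitted with meqrlogit in the review; the function name and the study counts are hypothetical.

```python
import math
from statistics import NormalDist


def pooled_proportion_fixed_effect(events, totals, level=0.95):
    """Pooled proportion with a Wald interval on the logit scale.

    Equivalent to an intercept-only binomial logistic regression in which
    every study shares the same underlying sensitivity (or specificity).
    """
    x, n = sum(events), sum(totals)
    if x == 0 or x == n:
        raise ValueError("logit is undefined when the pooled proportion is 0 or 1")
    p = x / n
    logit = math.log(p / (1 - p))
    se = math.sqrt(1 / x + 1 / (n - x))  # standard error of the pooled logit
    z = NormalDist().inv_cdf(0.5 + level / 2)

    def expit(value):
        return 1 / (1 + math.exp(-value))

    return p, expit(logit - z * se), expit(logit + z * se)


# Hypothetical true positives and numbers of diseased participants in three studies.
true_positives = [8, 16, 24]
diseased = [10, 20, 30]
estimate, lower, upper = pooled_proportion_fixed_effect(true_positives, diseased)
print(f"pooled sensitivity = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A random‐effects version would add a study‐level variance term to the logit, which is why it cannot be estimated reliably from only two or three studies.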

To compare the accuracy of the index tests, we performed both direct and indirect test comparisons, as comparative studies are scarce (Takwoingi 2013). To formally compare index tests, we added a co‐variate for test type to a bivariate model (i.e. bivariate meta‐regression). We used likelihood ratio tests to assess the statistical significance of differences in sensitivity and specificity by comparing models without the co‐variate terms versus models containing the co‐variate terms. Using parameter estimates from bivariate meta‐regression models, we calculated absolute differences in sensitivity and specificity. We obtained 95% confidence intervals and P values for these differences using the delta method and the Wald test, respectively. When the number of studies in a direct comparison was insufficient for meta‐regression, we examined individual study results and computed absolute differences in sensitivity and specificity for each comparative study. We calculated 95% confidence intervals (CIs) for these differences using the Newcombe‐Wilson method without continuity correction (Newcombe 1998).
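The interval for a difference in sensitivities or specificities within a single comparative study can be sketched as follows. The code implements the score‐based method for two independent proportions without continuity correction described in Newcombe 1998, built from the Wilson interval for each proportion. Treating the two proportions as independent is an assumption of this sketch; the function names are hypothetical, and the counts are arbitrary illustrative values rather than data from an included study.

```python
import math
from statistics import NormalDist


def wilson_interval(x: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a single proportion, without continuity correction."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = x / n
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return centre - half_width, centre + half_width


def newcombe_difference(x1: int, n1: int, x2: int, n2: int, level: float = 0.95):
    """Difference p1 - p2 with a Newcombe-Wilson interval (independent proportions)."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, level)
    l2, u2 = wilson_interval(x2, n2, level)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


# Illustrative counts: test A detects 56 of 70 cases, test B detects 48 of 80 cases.
diff, lower, upper = newcombe_difference(56, 70, 48, 80)
print(f"difference = {diff:.2f} (95% CI {lower:.4f} to {upper:.4f})")
```

Unlike a simple Wald interval, this construction keeps the limits within the range −1 to 1 and behaves better when proportions are close to 0 or 1, which is common for specificity estimates.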

We conducted analyses using Review Manager 5 (Review Manager 2014), along with the meqrlogit command in the statistical software STATA version 15 (STATA 2017).

Investigations of heterogeneity

We initially examined heterogeneity between studies by visually inspecting forest plots of sensitivity and specificity and summary ROC plots. We identified insufficient numbers of studies to allow meta‐regression to formally investigate potential sources of heterogeneity.

Sensitivity analyses

We performed no sensitivity analyses because limited data were available.

Assessment of reporting bias

Because of uncertainty about the determinants of publication bias for diagnostic accuracy studies and the inadequacy of tests for detecting funnel plot asymmetry (Deeks 2005), we did not assess publication bias.

Results

Results of the search

We identified and screened for inclusion a total of 34,507 unique references. Of these, we reviewed 1035 full‐text papers for eligibility for any one of the reviews of tests for staging of melanoma or cSCC. Of the 1035 full‐text papers assessed, we excluded 829 from all reviews in our series (see Figure 3 PRISMA flow diagram of search and eligibility results).

Figure 3

PRISMA flow diagram.

Of the 390 studies tagged as potentially eligible for this review of imaging tests for staging of melanoma, we included 39 publications. Exclusions were due to publication as a conference abstract (n = 202), not a primary study (n = 103), not a test accuracy study (no index test and/or reference standard reported) (n = 11), wrong index test (n = 125; including 17 studies with more than one scan reported per participant), inadequate reference standard (n = 90), wrong study population (n = 47), inadequate sample size (n = 55), wrong target condition (n = 125), missing data to complete 2×2 contingency table (n = 46), and duplicate or related publication (n = 86). We have provided a list of the 351 publications excluded from this review with reasons for exclusion in Characteristics of excluded studies. We contacted the authors of four included studies for further details of study methods (Chai 2012; Reinhardt 2006; Stoffels 2012; Voit 2014). We received a response in regard to one study (Reinhardt 2006), but study authors did not provide the additional data requested.

The 39 included study publications provide 195 contingency table datasets for a total of 5204 study participants. Thirty‐four studies reported data on a per patient basis, including two that also reported data per lesion identified on imaging (Cachin 2014; Iagaru 2007), and five reported data only on a per lesion basis (Dellestable 2011; Hausmann 2011; Jouvet 2014; Pfannenberg 2007; Pfluger 2011). The 34 studies that reported data per patient included 4980 study participants, 1265 of whom had confirmed metastatic disease. The seven studies that reported data per lesion included 417 study participants with 1846 potentially metastatic lesions identified on imaging, 1061 of which were confirmed metastases.

Table 1 cross‐tabulates the index tests evaluated and the population groups and target conditions considered in the 39 included studies. Eighteen studies considered the use of imaging for nodal metastases before SLNB; 11 of these studies considered the use of ultrasound, and eight evaluated PET‐CT. Twenty‐four studies evaluated the use of imaging as a staging tool in study participants undergoing primary staging on diagnosis of melanoma (n = 6) or re‐staging for recurrence of disease (n = 3), or in mixed (n = 11) or not clearly described populations (n = 4). The imaging tests evaluated included ultrasound (n = 5), CT (n = 10), MRI (n = 4), and PET‐CT (n = 15) for detection of any metastases (n = 14), nodal metastases (n = 14), or distant metastases (n = 9). Five studies also reported data separately by metastatic site.

Table 1. Cross‐tabulation of studies by index test, population group, and target condition

Study

US‐ FNAC

MRI

PET‐CT

Population group

Population detail

Reference standard

Any metastases

Distant metastases

Nodal metastases

Other sites

PRIMARY STAGING

Arrangoiz 2012

‐

Primary (any); primary (pre‐SLNB)

BT > 4 mm

SLNB/CLND/FU

Per patient

Per patient/ Pre‐SLNB

‐

Chai 2012

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND ± FU

‐

Pre‐SLNB

‐

Hafner 2004

‐

(X)

Primary (pre‐SLNB); primary

Standard SLNB
Any (incl N+)

SLNB/CLND

‐

Per patient/
Pre‐SLNB

‐

Hinz 2011

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Hinz 2013

‐

Primary (pre‐SLNB)

High risk (BT ≥ 2.0 mm or other RF)

SLNB

‐

Pre‐SLNB

‐

Hocevar 2004

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

Pre‐SLNB

‐

Kang 2011

‐

Primary (any)

All staging (incl N+)

Histology/FU

per patient

‐

Kell 2007

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Klode 2010

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Kunte 2009

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Maubec 2007

‐

Primary (any); primary (pre‐SLNB)

BT > 4 mm

SLNB/CLND ± FU

Per patient

‐

Pre‐SLNB

‐

Prayer 1990

‐

Primary (any)

All staging (incl N+)

CLND/FU

‐

Per patient

‐

Radzhabova 2009

‐

Primary (pre‐SLNB)

Standard SLNB; any (incl N+)

SLNB ± FU

‐

Pre‐SLNB

‐

Revel 2010

‐

Primary (pre‐SLNB)

HN MM

SLNB

‐

Pre‐SLNB

‐

Sanki 2009

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

Pre‐SLNB

Sibon 2007

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Singh 2008

‐

Primary (pre‐SLNB)

Standard SLNB/BT > 4 mm

SLNB

‐

Pre‐SLNB

‐

van Rijk 2006

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

pre‐SLNB

‐

Veit‐Haibach 2009

‐

Primary (any)

All staging (incl N+)

Histology/FU

‐

Per patient

‐

Voit 2014

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

Pre‐SLNB

‐

Wagner 2012

‐

Primary (pre‐SLNB)

High risk (BT ≥ 4 mm or > 1 mm and ulcerated)

SLNB/CLND

‐

Pre‐SLNB

‐

RE‐STAGING

Iagaru 2007

‐

Re‐staging

Any re‐staging

Histology/FU

Per patient

/Per lesion

‐

Rubaltelli 2011

‐

Re‐staging

Any FU and suspicious on B‐mode US

FNAC/Histology/FU

‐

Per patient

‐

Strobel 2007a

‐

Re‐staging

High risk (BT > 4 mm, etc.), elevated S100

Histology/Cytology/FU

Per patient

‐

MIXED OR UNCLEARLY REPORTED

Abbott 2011

‐

Mixed

Stage III

Histology/FU

Per patient

‐

Aukema 2010a

‐

(X ‐ Brain)

Mixed

S100 positive

FNAC/Histology/Imaging FU

Per patient

‐

Aukema 2010b

‐

(X ‐ Brain)

Unclear

Node positive

FNAC/Histology/FU

Per patient

‐

Bastiaannet 2009

‐

(X)

Mixed

All node positive

Histology/FU

‐

Per patient

‐

Cachin 2014

‐

Mixed

Stage III

Histology/Imaging/FU

Per patient/Per lesion

Per lesion

Dellestable 2011

‐

Mixed

All staging

Histology/FU

Per lesion

Hausmann 2011

‐

Unclear

Stage III/IV

Histology/FU

Per lesion

Jouvet 2014

‐

Unclear

Stage IV

FNAC/FU

Per lesion

Klebl 2003

‐

Mixed

Clark IV/V in FU

Histology/FU

‐

Per patient

‐

Pfannenberg 2007

‐

Mixed

Stage III/IV

Histology/Imaging/FU

Per lesion

Per patient

Per lesion

Pfluger 2011

‐

Mixed

All stage III

Histology/FU

Per patient

‐

Reinhardt 2006

‐

Mixed

All staging (incl N+)

Histology/FU

Per patient

‐

Strobel 2007b

‐

Unclear

High risk (BT > 4 mm, etc.)

Histology/Cytology/FU

Per patient

‐

van den Brekel 1998

‐

Mixed

HN MM and N+

Histology

‐

Per patient

‐

van Wissen 2016

‐

Mixed

Stage IIIB/IIIC palpable groin mets

Histology (combined groin dissection)

‐

Per patient

‐

BT: Breslow thickness; CLND: complete lymph node dissection; CT: computed tomography; FNAC: fine needle aspiration cytology; FU: follow‐up; HN: head and neck; MM: malignant melanoma; MRI: magnetic resonance imaging; mm: millimetre; N+: node positive; PET: positron emission tomography; RF: risk factor; SLNB: sentinel lymph node biopsy; US: ultrasound.

Methodological quality of included studies

The overall methodological quality of all included study cohorts is summarised in Figure 4 and Figure 5. Studies were generally at low or unclear risk of bias and of high or unclear concern regarding applicability of the evidence.

Figure 4

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.

Figure 5

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.

Over half of studies (23; 59%) were at low risk of bias for participant selection. High risk of bias was observed in four studies (10%) because of inappropriate participant exclusions; all excluded study participants on the basis of findings on the index test (ultrasound in all cases) (Hinz 2011; Hinz 2013; Radzhabova 2009; Rubaltelli 2011). Those at unclear risk of bias (n = 12) (Abbott 2011; Aukema 2010b; Cachin 2014; Hocevar 2004; Iagaru 2007; Jouvet 2014; Kang 2011; Klebl 2003; Pfluger 2011; Prayer 1990; Sanki 2009; Singh 2008) did not clearly describe participant recruitment as random or consecutive (n = 11; all except Iagaru 2007), or did not clearly report participant exclusions (n = 3) (Iagaru 2007; Kang 2011; Pfluger 2011).

Over half of evaluations were considered at low risk of bias for the index test (55% (6/11) for pre‐SLNB ultrasound; 60% (3/5) for other uses of ultrasound; 70% (7/10) for CT; 100% (4/4) for MRI; and 43% (10/23) for PET‐CT). Across the 11 evaluations of pre‐SLNB ultrasound, five (45%) studies were retrospective or unclear in the nature of their design and did not describe blinded case note review to ascertain imaging test results (Hinz 2013; Hocevar 2004; Radzhabova 2009; Sanki 2009; van Rijk 2006). The same rationale for unclear risk of bias was made for two of the five (40%) other evaluations of ultrasound (Prayer 1990; Rubaltelli 2011), three evaluations of CT (30%) (Iagaru 2007; Pfluger 2011; van den Brekel 1998), and 13 (57%) evaluations of PET‐CT (Abbott 2011; Arrangoiz 2012; Aukema 2010a; Hinz 2013; Iagaru 2007; Kang 2011; Klode 2010; Pfluger 2011; Revel 2010; Singh 2008; Strobel 2007a; van Wissen 2016; Wagner 2012). One evaluation of pre‐SLNB ultrasound ‐ Radzhabova 2009 ‐ and one of PET‐CT ‐ Iagaru 2007 ‐ did not clearly prespecify the index test threshold. One study of PET‐CT ‐ Kang 2011 ‐ retrospectively selected the maximum standardised uptake value (SUVmax) threshold for PET‐CT using ROC analysis and therefore scored high risk of bias for this domain.

Most studies (27/39) were judged at low risk of bias for the reference standard; the 12 studies at unclear risk of bias provided no information on the follow‐up schedule used to determine final disease status (Arrangoiz 2012; Aukema 2010a; Aukema 2010b; Bastiaannet 2009; Cachin 2014; Dellestable 2011; Iagaru 2007; Jouvet 2014; Pfluger 2011; Rubaltelli 2011; Strobel 2007b; Veit‐Haibach 2009). Although blinding of the reference standard diagnosis did not contribute to overall risk of bias, two studies clearly reported blinding of the histological diagnosis (Pfannenberg 2007; Sibon 2007), and three reported blinding of data collection on follow‐up (Hausmann 2011; Pfannenberg 2007; Reinhardt 2006). Two studies reported no blinding of histological diagnosis (Cachin 2014; Singh 2008), and three reported no blinding to the original imaging result during follow‐up (Abbott 2011; Cachin 2014; Jouvet 2014).

Two‐thirds of studies were at high risk of bias for participant flow and timing (26/39), and one was judged as having unclear risk. High risk of bias was considered in one study because of performance of the imaging test (PET‐CT) up to four months after the reference standard (SLNB) (Maubec 2007); in 19 studies (49%) because of differential verification (Abbott 2011; Arrangoiz 2012; Aukema 2010a; Aukema 2010b; Bastiaannet 2009; Cachin 2014; Dellestable 2011; Hausmann 2011; Iagaru 2007; Jouvet 2014; Kang 2011; Klebl 2003; Pfannenberg 2007; Pfluger 2011; Prayer 1990; Reinhardt 2006; Strobel 2007a; Strobel 2007b; Veit‐Haibach 2009); and in 13 studies (33%) because of exclusion of participants from the analysis (Bastiaannet 2009; Cachin 2014; Chai 2012; Dellestable 2011; Hafner 2004; Hausmann 2011; Klebl 2003; Pfannenberg 2007; Pfluger 2011; Radzhabova 2009; Rubaltelli 2011; van Wissen 2016; Wagner 2012).

Among the nine studies providing direct comparisons of index tests, six were judged at low risk of bias for the comparative domain. Pfluger 2011 was considered at high risk of bias, as PET‐CT and CT images were interpreted side by side, and in Hinz 2013, only a subgroup of those with US also underwent PET‐CT. In two studies, blinding between tests was not clearly described (Hinz 2013; Iagaru 2007).

In terms of applicability of evidence to the review question, 41% (n = 16) of studies were of high or unclear concern due to participant selection (Figure 4). High concern was primarily due to inclusion of participants from mixed population groups (including primary staging, re‐staging, or patient follow‐up) (Abbott 2011; Aukema 2010a; Bastiaannet 2009; Cachin 2014; Dellestable 2011; Klebl 2003; Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; van den Brekel 1998; van Wissen 2016), or it was due to the presentation of only per lesion rather than per patient data (Dellestable 2011; Hausmann 2011; Jouvet 2014; Pfannenberg 2007; Pfluger 2011). Three studies were of unclear concern due to lack of clear description of the indication for imaging (Aukema 2010b; Iagaru 2007; Strobel 2007b).

Almost all test evaluations were considered at high or unclear concern around applicability of the index test. For pre‐SLNB ultrasound, there was high concern from lack of detail regarding the threshold used (n = 1) (Hafner 2004), and unclear concern resulted from lack of information on application and interpretation of the index test (n = 9) (Chai 2012; Hafner 2004; Hinz 2011; Hocevar 2004; Kunte 2009; Radzhabova 2009; Sibon 2007; van Rijk 2006; Voit 2014), or regarding the expertise of the observer performing the ultrasound examination (n = 6) (Chai 2012; Hinz 2011; Kunte 2009; Radzhabova 2009; Sanki 2009; van Rijk 2006).

For CT, six evaluations were of high concern due to use of consensus test interpretation (Iagaru 2007; Jouvet 2014; Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; Veit‐Haibach 2009), two for MRI (Jouvet 2014; Pfannenberg 2007), and 11 for PET‐CT (Aukema 2010a; Aukema 2010b; Iagaru 2007; Jouvet 2014; Kang 2011; Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; Revel 2010; Strobel 2007a; Strobel 2007b). Only five CT evaluations described the provision of usual clinical information to test interpreters (Jouvet 2014; Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; Veit‐Haibach 2009), one evaluation of MRI (Pfannenberg 2007), and four for PET‐CT (Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; Revel 2010). Three CT evaluations were unclear on the information provided to assist test interpretation (Bastiaannet 2009; Dellestable 2011; van den Brekel 1998), two for MRI (Dellestable 2011; Hausmann 2011), and six for PET‐CT (Dellestable 2011; Kell 2007; Klode 2010; Maubec 2007; Singh 2008; Veit‐Haibach 2009).

Inadequate details of test threshold were provided in five evaluations of CT (Bastiaannet 2009; Dellestable 2011; Iagaru 2007; Jouvet 2014; Reinhardt 2006), three for MRI (Dellestable 2011; Hausmann 2011; Jouvet 2014), and four for PET‐CT (Abbott 2011; Hinz 2013; Jouvet 2014; Reinhardt 2006). Threshold details were unclear for one study for both CT and MRI (Pfannenberg 2007), as were six for PET‐CT (Aukema 2010a; Aukema 2010b; Dellestable 2011; Klode 2010; Maubec 2007; Wagner 2012). Two CT evaluations were unclear with regard to observer expertise (Dellestable 2011; van den Brekel 1998), one for MRI (Dellestable 2011), and seven for PET‐CT (Arrangoiz 2012; Dellestable 2011; Hinz 2013; Kell 2007; Klode 2010; Maubec 2007; Revel 2010).

For applicability of the reference standard, five studies were considered of low concern (Hafner 2004; Hinz 2011; Hinz 2013; Kang 2011; van Wissen 2016), and one was rated of high concern because it did not present data for the primary target condition of any metastasis (nodal plus distant metastases) (Veit‐Haibach 2009). The remaining 33 studies were considered at unclear concern for applicability because they did not clearly define the target condition or provide a breakdown according to nodal or distant metastases. Only five studies described the expertise of the histopathologist (Hafner 2004; Hinz 2011; Hinz 2013; Kang 2011; van Wissen 2016); the remaining studies were rated of unclear concern.

Findings

1. Imaging for detection of nodal metastases before SLNB

Imaging before SLNB can be used to identify patients with nodal metastatic disease that is not detectable clinically such that they can bypass the SLNB procedure and undergo complete lymph node dissection. Eighteen studies were included, 10 of which considered the use of pre‐SLNB ultrasound (Chai 2012; Hafner 2004; Hinz 2011; Hocevar 2004; Kunte 2009; Radzhabova 2009; Sanki 2009; Sibon 2007; van Rijk 2006; Voit 2014); seven evaluated PET‐CT (Arrangoiz 2012; Kell 2007; Klode 2010; Maubec 2007; Revel 2010; Singh 2008; Wagner 2012), and one evaluated both tests (Hinz 2013). Three studies of ultrasound also presented accuracy data for ultrasound combined with FNAC (i.e. complete lymph node dissection recommended only if both ultrasound and FNAC were positive for metastases) (Hocevar 2004; van Rijk 2006; Voit 2014).

Forest plots of study data are provided in Figure 6. Summary estimates for indirect and direct comparisons of tests are presented in Table 2 and Figure 7. Summary details of all studies in this section are presented alphabetically in Appendix 8.

Figure 6

Forest plot of all data for pre‐SLNB ultrasound, ultrasound plus FNAC, or PET‐CT for the detection of nodal metastasis.
(HN MM ‐ head and neck only malignant melanoma.)

Figure 7

Summary ROC plot comparing pre‐SLNB ultrasound vs ultrasound plus FNAC vs PET‐CT.

Table 2. Summary results from studies of imaging for primary staging or re‐staging

Test	Studies	Participants (cases)	Sensitivity (95% CI), %	Specificity (95% CI), %
Comparison of imaging tests before SLNB
Indirect comparison of imaging tests for detection of nodal metastasis (per patient data)
US	11	2614 (542)	35.4 (17.0 to 59.4)	93.9 (86.1 to 97.5)
US‐FNAC	3	1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)
PET‐CT	4	170 (49)	10.2 (4.31 to 22.3)	96.5 (87.1 to 99.1)
Difference			P = 0.07	P < 0.001
Direct comparison of imaging tests for detection of nodal metastasis (per patient data)
US	3	1164 (259)	58.7 (36.5 to 77.9)	79.4 (70.0 to 86.4)
US‐FNAC	3	1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)
Difference			‐40.7 (‐75.0 to ‐6.50), P = 0.02	+20.4 (+12.2 to +28.6), P < 0.001
Whole body imaging
Imaging for re‐staging for the detection of any metastasis (per patient data)
PET‐CT	2^a	153 (95)	92.6 (85.3 to 96.4)	89.7 (78.8 to 95.3)

CI: confidence interval; CT: computed tomography; FNAC: fine needle aspiration cytology; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; US: ultrasound.

^aWhere there were only two studies, estimates of summary sensitivity and summary specificity were obtained by using univariate fixed‐effect logistic regression models to pool sensitivities and specificities separately.
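The pooling described in footnote a can be sketched as follows. An intercept‐only (fixed‐effect) logistic regression fitted to pooled counts has a closed‐form solution, so no model‐fitting library is needed. This is a minimal illustration, not the review authors' code: the function name is hypothetical, the Wald interval on the logit scale is an assumption about how the confidence limits were obtained, and the counts (88 of 95 cases test positive; 52 of 58 non‐cases test negative) are back‐calculated from the percentages in the PET‐CT re‐staging row of Table 2 rather than reported in this section.

```python
import math

def pooled_proportion_logit_ci(events, total, z=1.96):
    """Pooled proportion with a Wald CI computed on the logit scale.

    For an intercept-only logistic regression the maximum-likelihood
    intercept is logit(events / total) and its standard error is
    sqrt(1/events + 1/(total - events)). Requires 0 < events < total.
    """
    if not 0 < events < total:
        raise ValueError("logit-scale Wald interval needs 0 < events < total")
    logit = math.log(events / (total - events))
    se = math.sqrt(1 / events + 1 / (total - events))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(logit), inv_logit(logit - z * se), inv_logit(logit + z * se)

# ASSUMED counts, back-calculated from Table 2 (153 participants, 95 cases).
sens = pooled_proportion_logit_ci(88, 95)   # test-positive cases / all cases
spec = pooled_proportion_logit_ci(52, 58)   # test-negative non-cases / all non-cases

print([round(100 * v, 1) for v in sens])
print([round(100 * v, 1) for v in spec])
```

Under these assumed counts the estimates and limits agree with the Table 2 row to one decimal place, which suggests, but does not prove, that the summary values were derived in this way.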

Description of studies

Study design and setting. Four of the 18 studies (22%) were prospective case series (Hafner 2004; Hinz 2011; Kunte 2009; Maubec 2007), nine (50%) were retrospective (Arrangoiz 2012; Chai 2012; Hinz 2013; Kell 2007; Klode 2010; Revel 2010; Sibon 2007; van Rijk 2006; Wagner 2012), and five (28%) did not clearly report the design (Hocevar 2004; Radzhabova 2009; Sanki 2009; Singh 2008; Voit 2014). Studies were conducted in Europe (n = 14), Australia (n = 1; Sanki 2009), and the USA (n = 3; Arrangoiz 2012; Chai 2012; Kell 2007).

Participants. Thirteen of the 18 studies (72%) were considered to have been conducted in ‘standard’ SLNB populations. These studies either reported the inclusion of participants with primary melanomas with Breslow thickness of at least 0.76 mm or 1 mm unless other adverse prognostic factors were present such as Clark level of at least IV, ulceration, or regression (Chai 2012; Hafner 2004; Hinz 2011; Kell 2007; Klode 2010; Kunte 2009; Sanki 2009; Sibon 2007; Singh 2008; van Rijk 2006; Voit 2014), or reported the inclusion of ‘candidates for SLNB’ with no further detail (Hocevar 2004; Radzhabova 2009). One study restricted inclusion to participants with head and neck melanoma only (Revel 2010), and the remaining four studies included only higher‐risk participants, with Breslow thickness of at least 2 mm (Hinz 2013), or 4 mm (Arrangoiz 2012; Maubec 2007; Wagner 2012).

A total of 2894 participants were included, with 640 nodal metastases identified (prevalence 22%, ranging from 10% in Hinz 2011 to 60% in Hinz 2013). Sample sizes ranged from 20 participants in Hinz 2013 and Maubec 2007 to 1000 in Voit 2014. When reported (n = 11), the ages of included participants ranged from one year in Hocevar 2004 to 94 years in Voit 2014. The mean age of included participants was reported in 11 studies (the median of reported means was 57.5 years, range 50 to 67 years), and the median age was reported in five studies (the median of reported medians was 58 years, range 55 to 62 years); five studies reported neither the mean nor median age of included study participants (Hinz 2013; Hocevar 2004; Radzhabova 2009; Sanki 2009; Wagner 2012). When reported (n = 15), 48% of included participants were male. Of 11 studies reporting the site of the primary melanoma lesion (excluding Revel 2010, which included head and neck melanomas only), the percentage of participants with head and neck melanoma ranged from 0% in Hinz 2013 to 36% in Maubec 2007 (median 14%), and melanoma of the extremities, including the hands or feet where documented, from 32% in Maubec 2007 to 56% in Kunte 2009 (median 50%).

Ultrasound. The 11 studies of pre‐SLNB ultrasound were all conducted in standard SLNB populations, although Hinz 2013 restricted inclusion to participants with melanomas at least 2 mm thick or with risk factors such as ulceration or regression. The two studies by Hinz and colleagues excluded participants with classic sonographic signs of lymphatic metastasis (Hinz 2011; Hinz 2013), whereas Radzhabova 2009 included only those who were positive on US or in whom metastases could not be excluded.

Studies employed mainly B‐mode ultrasound, with two studies also employing Doppler ultrasound in all participants (Hinz 2011; Voit 2014). B‐mode ultrasound frequencies were variable, mainly ranging from 5 or 6 MHz to 10 or 12 MHz in each study, apart from Voit 2014, which used three transducers ranging from 1 to 18 MHz in frequency. Ultrasound was performed before lymphoscintigraphy in five studies (Chai 2012; Hafner 2004; Hinz 2011; Hinz 2013; Sibon 2007), after lymphoscintigraphy in two studies (Sanki 2009; Voit 2014), and both before and after lymphoscintigraphy in four studies (Hocevar 2004; Kunte 2009; Radzhabova 2009; van Rijk 2006). Lymph node basins were imaged according to the site of the primary melanoma (Chai 2012; Hafner 2004; Hinz 2011; Hinz 2013; Sibon 2007; Voit 2014), according to the site marked following lymphoscintigraphy (Sanki 2009; Voit 2014), or this information was not reported (Hocevar 2004; Kunte 2009; Radzhabova 2009; van Rijk 2006). Criteria for detection of nodal metastases were clearly described in all studies apart from Hafner 2004 (Appendix 8). Ultrasound was reported to be performed by dermatologists (Kunte 2009), sonographers (Voit 2014), radiologists (Hafner 2004; Hocevar 2004; Sibon 2007), or nuclear medicine physicians (Sanki 2009), or this was not reported.

PET‐CT. Of the eight studies of PET‐CT before SLNB, four were conducted in any participant eligible for SLNB (Hinz 2013; Kell 2007; Klode 2010; Singh 2008); one in those with head and neck melanoma (Revel 2010); and three in higher‐risk melanoma populations (Arrangoiz 2012; Maubec 2007; Wagner 2012). Singh 2008 also reported data for the subgroup of participants with higher‐risk melanoma (Breslow thickness > 4 mm). When reported (n = 3), studies employed two‐dimensional (2D) PET (Wagner 2012), three‐dimensional (3D) PET (Maubec 2007), or either 2D or 3D PET (Arrangoiz 2012). PET was combined with unenhanced CT in Arrangoiz 2012, Kell 2007, and Maubec 2007, and with contrast‐enhanced CT in Hinz 2013, Klode 2010, and Singh 2008 (use of contrast was not reported in Revel 2010 and Wagner 2012). When reported, CT was used for attenuation correction (Arrangoiz 2012; Hinz 2013; Revel 2010; Singh 2008; Wagner 2012), as well as for anatomical localisation (Revel 2010; Wagner 2012).

Criteria for the detection of nodal metastases were not reported in Hinz 2013, were based on a qualitative assessment of increased ¹⁸FDG uptake in six studies (Kell 2007; Klode 2010; Maubec 2007; Revel 2010; Singh 2008; Wagner 2012), and were based on a quantitative assessment of focal uptake in Arrangoiz 2012 (SUV ≥ 2.5) (see Appendix 8). Performance and interpretation of PET‐CT were not clearly described. For example, Wagner 2012 reported interpretation by a nuclear medicine specialist, while two others mentioned an in‐house medical physicist ‐ Singh 2008 ‐ and a team of radiologists and nuclear physicians ‐ Arrangoiz 2012. Only Revel 2010 and Wagner 2012 reported the provision of clinical or other radiological findings to assist PET‐CT interpretation.

Reference standard. Ten studies (56%) evaluated the accuracy of imaging in comparison to histology from SLNB alone (Hinz 2011; Hinz 2013; Kell 2007; Klode 2010; Kunte 2009; Radzhabova 2009; Revel 2010; Sanki 2009; Sibon 2007; Singh 2008), seven studies (39%) included histology results from participants proceeding directly to CLND as well as SLNB results as a reference standard (Chai 2012; Hafner 2004; Hocevar 2004; Maubec 2007; van Rijk 2006; Voit 2014; Wagner 2012), and one study reported only data for histology based on CLND or SLNB combined with follow‐up to determine any false negative results on PET‐CT as the reference standard (Arrangoiz 2012).

Participant exclusions. Five studies reported the exclusion of between two and eight participants primarily due to technical failure of SLNB (sentinel node not identified or SLNB not performed) (Chai 2012; Hafner 2004; Maubec 2007; Revel 2010; Wagner 2012), and in one study (Radzhabova 2009), 100 participants did not undergo SLNB on the basis of a negative ultrasound finding.

Results: ultrasound for detection of nodal metastases

Across the 11 ultrasound evaluations, sensitivity for detection of nodal metastasis in comparison to a histological reference standard (SLNB or CLND) ranged from 0% in Hafner 2004 to 33% in Chai 2012 and Kunte 2009 in eight studies, and from 71% in Hocevar 2004 and Voit 2014 to 100% in Radzhabova 2009 in three. Specificity ranged from 73% in Voit 2014 to 100% in Kunte 2009, Hinz 2011, Hinz 2013, and Radzhabova 2009 (Figure 6). Radzhabova 2009 included a highly selected group of study participants, which likely explains the perfect sensitivity and specificity observed. The particularly low sensitivity in Hafner 2004 (0%) may be related to the use of only a 5‐MHz ultrasound transducer, but this study was poorly reported and other explanations may be possible. The relatively high sensitivities (both 71%) in Hocevar 2004 and Voit 2014 are also difficult to explain based on the information reported. In terms of specificity, Kunte 2009, Hinz 2011, and Hinz 2013 all applied ultrasound before and after the use of lymphoscintigraphy, which is likely to have contributed to the 100% specificity observed.

The summary sensitivity of ultrasound across the 11 studies was 35.4% (95% CI 17.0% to 59.4%) and summary specificity was 93.9% (86.1% to 97.5%) for 2614 participants and 542 confirmed cases of nodal metastasis (Table 2; Figure 7).

The three studies that reported the accuracy of ultrasound combined with FNAC reported decreased sensitivity but increased specificity in comparison to ultrasound alone. Sensitivities ranged from 3% (95% CI 0% to 14%) in van Rijk 2006 to 51% (95% CI 44% to 58%) in Voit 2014, and specificities ranged from 99% (95% CI 92% to 100%) in van Rijk 2006 to 100% (95% CI 99% to 100%) in Voit 2014 (Figure 6). The summary sensitivity was 18.0% (95% CI 3.58% to 56.5%), and summary specificity was 99.8% (95% CI 99.1% to 99.9%), based on 1164 participants and 259 cases (Table 2; Figure 7).

Results: PET‐CT for detection of nodal metastases

The four studies comparing PET‐CT to histology based on SLNB in standard SLNB populations reported sensitivities ranging from 0% (95% CI 0% to 26%) in Hinz 2013 to 22% (95% CI 3% to 60%) in Kell 2007, and specificities from 89% (95% CI 72% to 98%) in Kell 2007 to 100% (95% CI 63% to 100%) in Hinz 2013 and 100% (95% CI 92% to 100%) in Klode 2010 (Figure 6). The summary sensitivity was 10.2% (95% CI 4.31% to 22.3%) and summary specificity was 96.5% (95% CI 87.1% to 99.1%) for 170 participants and 49 confirmed cases of nodal metastasis (Table 2; Figure 7).

Data from the three studies in higher‐risk melanoma populations (75 participants with 28 cases) that compared PET‐CT to histology based on SLNB alone could not be pooled because of substantial heterogeneity likely resulting from small sample sizes (Maubec 2007; Singh 2008; Wagner 2012). Sensitivities ranged from 0% (95% CI 0% to 41%) in Maubec 2007 to 43% (95% CI 18% to 71%) in Wagner 2012, and specificities from 92% (95% CI 64% to 100%) in Maubec 2007 to 100% (95% CI 48% to 100%) in Singh 2008 and 100% (95% CI 88% to 100%) in Wagner 2012 (Figure 6).

One of these studies ‐ Maubec 2007 ‐ and Arrangoiz 2012 reported data for PET‐CT compared to histology based on SLNB plus follow‐up to identify false negatives. Maubec 2007 identified one additional false negative result on follow‐up, but sensitivity (0%) and specificity (92%) remained the same with marginal changes to CIs (95% CI 0% to 37% for sensitivity and 62% to 100% for specificity). Arrangoiz 2012 reported sensitivity of PET‐CT as 41% (95% CI 24% to 61%) and specificity as 89% (95% CI 71% to 98%) (Figure 6).

Revel 2010 reported the sensitivity of PET‐CT as 20% (95% CI 3% to 56%) and the specificity as 100% (95% CI 69% to 100%) for 20 participants with head and neck melanoma when compared to SLNB alone as a reference standard. Adding data for a follow‐up reference standard identified two additional nodal metastases missed on PET‐CT, giving sensitivity of 17% (95% CI 2% to 48%) and specificity of 100% (95% CI 69% to 100%) (Figure 6).

Results: comparison between tests

Upon comparison of ultrasound alone, ultrasound plus FNAC, and PET‐CT, summary sensitivities were not statistically significantly different (P = 0.07), and summary specificities were significantly higher for ultrasound plus FNAC compared to the other two modalities (P < 0.001) (Table 2; Figure 7).

In the three studies that directly compared ultrasound alone with ultrasound plus FNAC (1164 participants; 259 cases of nodal metastases), ultrasound alone had higher sensitivity (58.7%, 95% CI 36.5% to 77.9%) but lower specificity (79.4%, 95% CI 70.0% to 86.4%) than in the overall pooled analysis of all 11 ultrasound studies. Requiring both ultrasound and FNAC to be positive for nodal metastases (as an indicator for CLND instead of SLNB) reduced sensitivity by 40.7 percentage points (95% CI 6.50 to 75.0; P = 0.02) but increased specificity by 20.4 percentage points (95% CI 12.2 to 28.6; P < 0.001) (Table 2).
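The trade‐off reported for the combined test follows from the decision rule itself: if a participant counts as positive only when ultrasound and FNAC are both positive, sensitivity cannot exceed that of ultrasound alone and specificity cannot fall below it. The sketch below illustrates this with an invented 10‐participant cohort; none of these values come from an included study, and the helper name `sens_spec` is hypothetical. The last two lines simply recompute the differences from the direct comparison rows of Table 2.

```python
def sens_spec(positive, diseased):
    """Return (sensitivity, specificity) from paired lists of booleans."""
    pairs = list(zip(positive, diseased))
    tp = sum(1 for p, d in pairs if p and d)
    fn = sum(1 for p, d in pairs if not p and d)
    tn = sum(1 for p, d in pairs if not p and not d)
    fp = sum(1 for p, d in pairs if p and not d)
    return tp / (tp + fn), tn / (tn + fp)

# HYPOTHETICAL cohort (not data from the review):
# 4 participants with nodal metastases, then 6 without.
diseased = [True] * 4 + [False] * 6
us_pos = [True, True, True, False] + [True, True, False, False, False, False]
# FNAC confirms only some of the ultrasound-positive nodes.
fnac_pos = [True, False, True, False] + [False] * 6

# Combined rule: positive only when ultrasound AND FNAC are both positive.
combined = [u and f for u, f in zip(us_pos, fnac_pos)]

us_sens, us_spec = sens_spec(us_pos, diseased)
both_sens, both_spec = sens_spec(combined, diseased)
print(us_sens, us_spec)      # ultrasound alone
print(both_sens, both_spec)  # ultrasound plus FNAC

# Differences in percentage points, from the direct comparison in Table 2.
sens_diff = round(18.0 - 58.7, 1)
spec_diff = round(99.8 - 79.4, 1)
print(sens_diff, spec_diff)
```

In this toy cohort the combined rule removes both false positives but also loses one true positive, mirroring the direction of the pooled differences.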

2. Whole body imaging for detection of any metastases, nodal metastases, and distant metastases

Twenty‐four studies evaluated whole body imaging. Summary characteristics of all studies are tabulated alphabetically in Appendix 9, and a narrative description is provided in Appendix 10. Results are presented below according to the population group studied, the target condition and imaging test, and the presentation of data per patient and per lesion.

Primary staging

Six studies recruited participants undergoing primary staging following a confirmed diagnosis of melanoma (Arrangoiz 2012; Hafner 2004; Kang 2011; Maubec 2007; Prayer 1990; Veit‐Haibach 2009). Two studies included any participant following diagnosis of melanoma (Kang 2011; Veit‐Haibach 2009); two excluded those with distant metastases on diagnosis (Hafner 2004; Maubec 2007). Maubec 2007 restricted data to those with melanomas at least 4 mm in thickness; one study included clinically node positive participants but did not report exclusion of those with distant metastases (Prayer 1990), and one included only clinically node negative participants with melanomas of at least 4 mm Breslow thickness (Arrangoiz 2012). Three studies also reported data for pre‐SLNB imaging (Arrangoiz 2012; Hafner 2004; Maubec 2007), two of which reported subgroup data for clinically node negative participants who underwent SLNB (Hafner 2004; Maubec 2007). All six studies reported accuracy data on a per patient basis; no per lesion data were identified.

Forest plots of all available study data are presented in Figure 8. Summary details of all studies in this section are presented alphabetically in Appendix 9. Sensitivities and specificities from studies evaluating more than one target condition (any metastasis, nodal metastasis, or distant metastasis) are tabulated in Appendix 11.

Figure 8

Forest plot of imaging for primary staging, for the detection of any metastases, nodal metastases, and distant metastases (per patient and per lesion data).

Results: detection of any metastases

Three studies presented data for detection of any metastasis in 118 study participants with 51 cases of metastatic disease (Figure 8) (Arrangoiz 2012; Kang 2011; Maubec 2007); the prevalence of metastases ranged from 24% in Kang 2011 to 57% in Arrangoiz 2012.

CT. No data on CT were identified for participants undergoing primary staging of melanoma.

MRI. No data on MRI were identified for participants undergoing primary staging of melanoma.

PET‐CT. Three studies presented per patient data for PET‐CT for detection of any metastasis; no per lesion data were identified.

Two studies evaluated PET‐CT in participants with melanomas ≥ 4 mm in thickness: one reported sensitivity and specificity for detection of any metastases of 47% (95% CI 29% to 65%) and 88% (95% CI 68% to 97%) (56 participants; 32 cases of metastatic disease) (Arrangoiz 2012); and in the other (Maubec 2007), sensitivity was 30% (95% CI 7% to 65%) and specificity 73% (95% CI 45% to 92%) (25 participants; 10 cases of metastatic disease) (Figure 8). Arrangoiz 2012 identified four patients with distant metastases whose disease would have been missed without PET‐CT imaging (prevalence of distant metastases 4/56; 7%). In the other study (Maubec 2007), no distant metastases were identified, but all three false positive results suggested possible distant metastases.

The third study evaluated PET‐CT for the prediction of subsequent recurrence in any participant following diagnosis of melanoma (Kang 2011). Stage of disease on presentation was reported as stage 0 (n = 7), stage I or II (n = 23), stage III (n = 6), and stage IV (n = 1); all patients underwent curative surgery for primary and metastatic lesions. The sensitivity of PET‐CT for the prediction of later recurrence at an SUVmax ≥ 2.2 at baseline was 89% (95% CI 52% to 100%) and specificity 61% (95% CI 41% to 78%) (37 participants; nine cases of metastatic disease) (Figure 8). The accuracy of PET‐CT for initial staging was not reported. Three of the nine patients who developed recurrence during follow‐up were stage III or greater at presentation.
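The totals and prevalence range quoted for detection of any metastasis can be checked directly from the per‐study counts given in the paragraphs above. This is a short arithmetic sketch; the dictionary layout is an illustrative choice, and the counts are those stated in the text for Arrangoiz 2012, Kang 2011, and Maubec 2007.

```python
# (participants, cases of metastatic disease) as reported in the text above
studies = {
    "Arrangoiz 2012": (56, 32),
    "Kang 2011": (37, 9),
    "Maubec 2007": (25, 10),
}

total_n = sum(n for n, _ in studies.values())
total_cases = sum(c for _, c in studies.values())

# Prevalence per study, as a whole-number percentage
prevalence = {name: round(100 * c / n) for name, (n, c) in studies.items()}

print(total_n, total_cases)
print(prevalence)
```

The sums give 118 participants and 51 cases, and the per‐study prevalences span 24% to 57%, matching the figures stated at the start of this subsection.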

Results: detection of nodal metastases

Three studies presented data for the detection of nodal metastases in 373 participants with 68 cases of nodal metastases (Hafner 2004; Prayer 1990; Veit‐Haibach 2009) (Figure 8); the prevalence of nodal metastases ranged from 13% in Prayer 1990 to 26% in Hafner 2004.

Ultrasound. Two studies evaluated the use of ultrasound in any participant following the diagnosis of melanoma, including those who were clinically node positive (Hafner 2004; Prayer 1990). Hafner 2004 restricted inclusion to those with melanomas ≥ 1 mm in thickness, and all underwent SLNB including three with clinically detectable nodal metastases (data for clinically node negative are reported in the pre‐SLNB imaging section above). The sensitivity of ultrasound was 8% (95% CI 1% to 25%) and specificity 88% (95% CI 78% to 94%) (100 participants, 26 with nodal metastases) (Figure 8); the only true positive results were both detected on physical examination. In Prayer 1990, the sensitivity of ultrasound was 100% (95% CI 88% to 100%) and specificity 97% (95% CI 93% to 99%) (217 participants, 29 with nodal metastases) (Figure 8). These results are likely to be influenced by incorporation bias and inadequate follow‐up to identify false negatives on ultrasound. Of 217 included participants, 15% (32/217) had suspicious findings on palpation; however, among these, only those who were also found to have suspicious nodes on ultrasound had histological verification (n = 15). This left 17 who were positive on palpation but negative on ultrasound, along with 165 who were negative on both palpation and ultrasound, with a six‐month follow‐up reference standard.

Other imaging tests. One study presented data comparing CT with PET‐CT for the detection of nodal metastases in all participants referred for PET‐CT after primary melanoma resection (Veit‐Haibach 2009). No false positive results were obtained with either test (specificity 100%, 95% CI 92% to 100%), but sensitivity was higher for PET‐CT (38%, 95% CI 14% to 68%) compared to CT (23%, 95% CI 5% to 54%) (56 participants, 13 with nodal metastases) (Figure 8). Initial staging procedures including histology of the primary lesion, SLNB, and all imaging procedures apart from PET‐CT identified four of the 13 participants with nodal metastases, two of whom were identified only on SLNB and were missed by all imaging tests (Veit‐Haibach 2009). Both CT and PET‐CT correctly identified additional participants with nodal metastases (one and three, respectively) that were not picked up by other staging procedures at the time of melanoma diagnosis.

No data on MRI to detect nodal disease were identified for participants undergoing primary staging of melanoma.

Results: detection of distant metastases

Two studies presented data for the detection of distant metastases in 112 participants with 17 cases of distant metastases (Arrangoiz 2012; Veit‐Haibach 2009) (Figure 8); the prevalence of distant metastases was 9% in Arrangoiz 2012 and 21% in Veit‐Haibach 2009.

CT. One study presented data for CT: sensitivity for the detection of 12 distant metastases was 25% (95% CI 5% to 57%) and specificity 93% (95% CI 81% to 99%) (56 participants) (Veit‐Haibach 2009).

MRI. No per patient data on MRI were identified for participants undergoing primary staging of melanoma.

PET‐CT. Veit‐Haibach 2009 provided a direct comparison of CT with PET‐CT; PET‐CT detected two additional cases of distant metastases in comparison to CT (sensitivity 42%, 95% CI 15% to 72%), with no difference in specificity (93%, 95% CI 81% to 99%) (Figure 8).

Arrangoiz 2012 evaluated the use of PET‐CT in clinically node negative participants with primary melanoma greater than 4 mm in thickness and no indications for distant metastases; all five distant metastases were detected by PET‐CT (sensitivity 100%, 95% CI 48% to 100%), with three false positive results (specificity 94%, 95% CI 84% to 99%) (56 participants).
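The confidence intervals quoted for these small samples are consistent with exact binomial (Clopper‐Pearson) intervals, although this section does not state which method was used, so that is an assumption. The following is a minimal sketch of how such an interval can be obtained from a count of true positives and the number of cases; the function names are illustrative and are not taken from the review.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, lo=0.0, hi=1.0, iters=100):
    """Bisection for a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n (assumed method)."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p))
    )
    # Upper limit: the p at which P(X <= x) equals alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else _root_of_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper


# Arrangoiz 2012: all five distant metastases detected by PET-CT.
lo_5_of_5, hi_5_of_5 = exact_ci(5, 5)
print(f"5/5: {lo_5_of_5:.0%} to {hi_5_of_5:.0%}")
```

With five of five cases detected, the lower limit depends only on the sample size, which is why a sensitivity of 100% can still carry a lower bound near 48%.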

Results: detection of distant metastases by metastatic site

No data were identified for the detection of metastatic disease according to metastatic site in participants undergoing imaging for primary staging of melanoma.

In three of six studies, scan coverage was reported to include the skull (Arrangoiz 2012; Kang 2011; Maubec 2007), but the detection of brain metastases was not separately documented.

Re‐staging

Three studies recruited participants undergoing imaging for re‐staging of disease following a clinical indication of recurrence (Iagaru 2007; Rubaltelli 2011; Strobel 2007a). One study included any participant having imaging for re‐staging purposes (Iagaru 2007), and two included clinically node negative participants either undergoing ultrasound of the regional lymph nodes as part of a follow‐up program (Rubaltelli 2011), or with raised serum S100 (> 0.2 μg/L) during follow‐up (Strobel 2007a).

Forest plots of all available study data are presented in Figure 9. Summary estimates of sensitivity and specificity are presented in Table 2. Summary details of all studies in this section are presented alphabetically in Appendix 9.

Figure 9

Forest plot of imaging for re‐staging of melanoma, for the detection of any metastases or nodal metastases (per patient and per lesion data).

Results: detection of any metastases

Two studies provided per patient data for the detection of any metastasis in 153 participants with 95 cases of metastatic disease (Figure 9); the prevalence of any metastasis was 53% in Iagaru 2007 and 83% in Strobel 2007a.

CT. In one study, the sensitivity of CT for detection of any metastasis on a per patient basis was 68% (95% CI 54% to 80%) and specificity 94% (95% CI 83% to 99%) (106 participants; 56 cases of metastatic disease) (Iagaru 2007).

MRI. No data on MRI were identified for participants undergoing re‐staging of melanoma.

PET‐CT. Two studies evaluated PET‐CT on a per‐patient basis in 153 participants, 95 of whom had confirmed metastases (Iagaru 2007; Strobel 2007a); summary sensitivity was 92.6% (95% CI 85.3% to 96.4%) and specificity 89.7% (95% CI 78.8% to 95.3%) (Table 2).

Comparison of PET‐CT with CT in Iagaru 2007 demonstrated PET‐CT to be more sensitive (89%, 95% CI 78% to 96%) than CT alone (increase of 21%), with similar specificity (88%, 95% CI 76% to 95%). Similar results were observed on a per lesion basis (Figure 9). Although numbers were small, PET‐CT was more sensitive in the subgroup with stage IIIc to IV disease who underwent PET‐CT for re‐staging after therapy (100%, 95% CI 81% to 100%) (n = 32; 18 with metastatic disease) than in those with less advanced disease (84%, 95% CI 69% to 94%).

Results: detection of nodal metastases

One study presented per patient data for the detection of nodal recurrence after primary treatment in 460 participants with 37 cases of nodal metastases (prevalence 8%) (Rubaltelli 2011) (Figure 9).

Ultrasound. When participants with 'common signs of malignancy' or with focal hypoechoic cortical thickening were classified as positive for metastases, ultrasound detected all participants with nodal metastases (sensitivity 100%, 95% CI 91% to 100%) with a specificity of 93% (95% CI 90% to 95%) (460 participants, 37 with nodal metastases) (Rubaltelli 2011). The combination of contrast‐enhanced ultrasound with B‐mode ultrasound for participants with focal cortical thickening (presence of perfusion defects corresponding to the cortical focal thickening required for a positive test result) increased specificity to 100% (95% CI 98% to 100%).

Other imaging tests. No data on CT, MRI, or PET‐CT for the detection of nodal metastases were identified for participants undergoing re‐staging for recurrence of melanoma.

Results: detection of distant metastases

No data were identified for the detection of distant metastases in participants undergoing re‐staging for disease recurrence.

Results: detection of distant metastases by metastatic site

Two of three studies conducted in participants undergoing imaging for re‐staging of melanoma included imaging of the brain and documented some results for the detection of brain metastases.

In Iagaru 2007, one of the nine lesions classified as a false negative on PET‐CT was a brain lesion that was identified by MRI during follow‐up; the total number of brain metastases identified in the study was not reported.

In Strobel 2007a, two brain metastases were identified on PET‐CT, both of which were confirmed to be malignant on the reference standard.

3. Staging in mixed or not clearly described populations

Studies in mixed and not clearly described populations have been considered together on the basis that we would be unable to make clear statements regarding the expected accuracy of imaging at any particular point on the clinical pathway for either subset of studies. Table 3 describes variability in the clinical pathway and indications for imaging, inclusion criteria, and stage of disease of participants included in these studies.

Table 3. Characteristics of studies conducted in mixed or unclear population groups

Study Population group	Participant inclusion criteria and reported indications for imaging	Stage of disease on presentation	Imaging tests	Patients/cases (prevalence) [lesions/metastases (prevalence)]	Average no. metastases per patient
PER PATIENT DATA
Abbott 2011 Mixed – primary or follow‐up	Undergoing FU after prior SLNB/CLND for micro‐metastases or presenting with clinically detectable nodal disease at or subsequent to initial diagnosis	Stage: IIIA 18, 53% IIIB 10, 29% IIIC 6, 18%	PET‐CT (NR)	34/7 (21%)	N/A
Aukema 2010a Mixed – primary or re‐staging	Asymptomatic S100 positive. Previously treated for locoregional recurrence (n = 15) or distant metastases (n = 5); or with unfavourable primary tumour (n = 6), primary melanoma with simultaneous nodal metastases (n = 20)	Any (stage NR)	PET‐CT (U)	46/23 (50%)	N/A
Aukema 2010b Unclear	Palpable and pathology proven lymph node metastases and no signs of distant metastases. Imaging to identify further ‘undetected’ disease	Stage III: 100%	PET‐CT (U)	70/30 (43%)	N/A
Bastiaannet 2009 Mixed – primary or re‐staging	Node positive (clinical or histology/cytology proven) candidates for CLND; imaging to identify further disease. Includes those with LN mets diagnosed at time of primary diagnosis 39, 15.5%; LN metastases identified ≤ 3 years since primary diagnosis 145, 57.8%; recurrence > 3 years since primary diagnosis 67, 26.7%	Stage III (100%)	CT (CE)	251/78 (31%) distant metastases	N/A
Cachin 2014 Mixed ‐ staging or re‐staging	Any primary MM, visceral metastases, or cutaneous metastases from unknown primary	Any; 51% with metastases	PET‐CT (NR)	67/39 (58%) [176/85 (48%)]	N/A
Klebl 2003 Mixed	Clark level IV or V undergoing FU after primary surgery. Reports primary (n = 8) and imaging during follow‐up (n = 75)	Any (NR)	US	79/17 (22%) nodal	N/A
Reinhardt 2006 Mixed – primary, re‐staging, FU, disease response	All with PET‐CT for primary staging after sentinel node biopsy (n = 75); therapy control after chemotherapy of metastatic disease (n = 42); staging of clinically suspected recurrent disease (n = 65); during follow‐up within 5 years of primary treatment (n = 68)	Stage I 22, 9% Stage II 88, 35% Stage III 108, 43% Stage IV 32, 13%	CT (CE) PET‐CT (CE)	250/116 (46%)	N/A
Strobel 2007b Unclear	High risk melanoma (BT > 4 mm, or Clark level III or IV, or known resected metastases) with PET‐CT for depiction or exclusion of metastases	Any (NR)	PET‐CT (CE)	124/53 (43%)	N/A
van den Brekel 1998 Mixed ‐ primary and recurrence	Head and neck MM with CT before neck dissection, including therapeutic and elective (negative on palpation). "Interval between the treatment of the primary and the neck dissection ranged from 0 to 8.8 years (mean: 21 months)"	Stage I to II 8, 31% Stage III 18, 69%	CT (CE)	26/21 (81%) nodal	N/A
van Wissen 2016 Mixed ‐ primary and recurrence	Stage IIIB or IIIC MM with palpable groin metastases; selected for therapeutic combined groin dissection. Discussion states: "large proportion of our patients were initially treated for their primary tumour at other hospitals, and sometimes years prior to the current groin dissection"	All stage IIIB and C	PET‐CT (U)	69/59 (superficial nodes 86%) 67/24 (deep nodes 36%)	N/A
PER LESION DATA
Cachin 2014 Mixed ‐ staging or re‐staging	Any primary MM, visceral metastases, or cutaneous metastases from unknown primary. Lesions with equivocal focal uptake considered test positive. Only 1 eligible index test	Any: 51% with metastases	PET‐CT (NR)	67/39 (58%) [176/85 (48%)]	1 (85/67)
Dellestable 2011 Mixed ‐ primary or follow‐up	All with PET‐CT regardless of AJCC stage or indication for examination Number of lesions included varies per test	Stage I to II: 27.5% Stage III to IV: 72.5%	CT (CE) MRI (DW) PET‐CT	40 [108/66 (61%)] [117/70 (60%)] [119/72 (61%)]	2 (72/40)
Hausmann 2011 Unclear	AJCC stage III or IV with positive SLNB or suspicious lesions on ultrasound or X‐ray studies Number of lesions included same per test	Stage III 38% Stage IV 62%	CT (CE) MRI (NR)	33 All tests [824/455 (55%)]	14 (455/33)
Jouvet 2014 Unclear	AJCC stage IV. Number of lesions included varies per test	Stage IV: 100%	CT (CE) MRI (DW) MRI (DW + ultrafast GE) PET‐CT	37 (218 lesions) [209/115 (55%)] [218/125 (57%)] [191/104 excl brain (54%)]	3 (125/37)
Pfannenberg 2007 Mixed – incl primary, FU, and NR	Stage III or IV imaged before surgery due to abnormal radiological, clinical, and laboratory findings, or routine surveillance in high risk Number of lesions included same per test	Stage III: 39% Stage IV: 61%	CT (CE) MRI (DW + ultrafast GE) PET‐CT	64 All tests [420/297 (71%)]	5 (297/64)
Pfluger 2011 Mixed ‐ primary or follow‐up	Melanoma with regional lymph node metastases; excluded any lesions newly arising during follow‐up Number of lesions included same per test	Stage III: 100%	CT (CE); CT (U) PET‐CT (CE); (U)	50 All tests [232/151 (65%)]	3 (151/50)

AJCC: American Joint Committee on Cancer; BT: Breslow thickness; CE: contrast enhanced; CLND: complete lymph node dissection; CT: computed tomography; DW: diffusion weighted; FNAC: fine needle aspiration cytology; FU: follow‐up; GE: gradient echo; HN: head and neck; LN: lymph node; MM: malignant melanoma; MRI: magnetic resonance imaging; mm: millimetre; N+: node positive; N/A: not applicable; NR: not reported; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; U: unenhanced; US: ultrasound; VIBE: MRI sequence.

Fifteen studies were conducted in mixed population groups (n = 11), including participants undergoing primary staging, re‐staging, and follow‐up imaging (i.e. at more than one point in the clinical pathway) (Abbott 2011; Aukema 2010a; Bastiaannet 2009; Cachin 2014; Dellestable 2011; Klebl 2003; Pfannenberg 2007; Pfluger 2011; Reinhardt 2006; van den Brekel 1998; van Wissen 2016), or did not clearly describe the clinical pathway in included participants (n = 4) (Aukema 2010b; Hausmann 2011; Jouvet 2014; Strobel 2007b).

Stage of disease on recruitment was not reported in four studies (Aukema 2010a; Cachin 2014; Klebl 2003; Strobel 2007b), two studies included any stage of disease (with 56% (Reinhardt 2006) and 73% (Dellestable 2011) at stage III or stage IV), six included only stage III melanoma (Abbott 2011; Aukema 2010b; Bastiaannet 2009; Pfluger 2011; van den Brekel 1998; van Wissen 2016), one included stage IV only (Jouvet 2014), and two included either stage III or IV melanoma (both with just under 40% with stage III disease) (Hausmann 2011; Pfannenberg 2007).

Nine of the fifteen studies reported accuracy data only on a per patient basis (Abbott 2011; Aukema 2010a; Aukema 2010b; Bastiaannet 2009; Klebl 2003; Reinhardt 2006; Strobel 2007b; van den Brekel 1998; van Wissen 2016), five reported data per lesion (Dellestable 2011; Hausmann 2011; Jouvet 2014; Pfannenberg 2007; Pfluger 2011), and one reported both per patient and per lesion data (Cachin 2014). Of those reporting per lesion data, two studies reported accuracy data only for those imaging abnormalities identified by each test (Dellestable 2011; Jouvet 2014), such that comparative studies reported different numbers of lesions and confirmed metastases per index test evaluated (Dellestable 2011; Jouvet 2014). Three studies included all lesions detected by any index test so that the number of lesions included in each 2×2 contingency table was the same for every test (Hausmann 2011; Pfannenberg 2007; Pfluger 2011); two included all lesions considered suspicious by any one index test (Hausmann 2011; Pfannenberg 2007); and one reported including only lesions considered positive for melanoma on at least one index test (Pfluger 2011). Variation in lesion inclusion has the potential to reduce sensitivity in studies that included all lesions detected by any index test, as any metastases missed by any one test would count as false negative results in the contingency table; any missed benign lesions would be considered true negative results, but a large number of lesions would need to be missed to have any detectable effect on specificity.
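The effect of lesion inclusion rules on per lesion sensitivity can be illustrated with a small sketch. All lesion identifiers and test results below are hypothetical and are not data from any included study; the point is only that the same test calls yield a lower sensitivity when the contingency table contains every lesion found by any test rather than only the lesions that the test itself identified.

```python
def sensitivity(called_positive, lesions_in_table, true_mets):
    """Per lesion sensitivity given which lesions enter the 2x2 table."""
    mets_in_table = lesions_in_table & true_mets
    true_positives = called_positive & mets_in_table
    return len(true_positives) / len(mets_in_table)


# Hypothetical reference standard: lesions 1-10 are metastases, 11-12 benign.
true_mets = set(range(1, 11))

# Lesions each test identified as abnormalities, and those it called positive.
seen_a, positive_a = set(range(1, 9)) | {11}, set(range(1, 7))
seen_b, positive_b = set(range(3, 11)) | {12}, set(range(4, 11))

# Rule 1: each test is assessed only on the abnormalities it identified.
own_a = sensitivity(positive_a, seen_a, true_mets)
own_b = sensitivity(positive_b, seen_b, true_mets)

# Rule 2: every test is assessed on all lesions detected by any test,
# so metastases a test never saw are counted as its false negatives.
pooled = seen_a | seen_b
pooled_a = sensitivity(positive_a, pooled, true_mets)
pooled_b = sensitivity(positive_b, pooled, true_mets)

print(f"Test A: own lesions {own_a:.0%}, pooled lesions {pooled_a:.0%}")
print(f"Test B: own lesions {own_b:.0%}, pooled lesions {pooled_b:.0%}")
```

In this toy example specificity is unaffected because each benign lesion is seen by one test only; as noted above, many benign lesions would need to be missed before specificity changed appreciably.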

The considerable clinical heterogeneity between studies in terms of population groups, stages of disease, lesion selection, differences between tests, and definitions of target conditions (either including or excluding imaging of the brain) means that no conclusions can be drawn from studies in mixed and not clearly described populations (Table 3; Appendix 9). Study results are therefore described narratively in Appendix 12.

Discussion

Summary of main results

There is a considerable volume of literature evaluating the accuracy of imaging tests for staging of melanoma, but other than the specific use of imaging to identify nodal metastases before sentinel lymph node biopsy (SLNB), only a small number of identified studies were eligible for our review and were conducted in well‐defined populations of participants undergoing primary staging or re‐staging for disease recurrence. In terms of methodological quality, studies were generally at low or unclear risk of bias, with methods of participant selection and blinded case note review particularly poorly reported. Studies were of high or unclear concern regarding the applicability of evidence due to participant selection from mixed or not clearly defined populations, poorly described application and interpretation of the index test or observer expertise, and lack of definition of the target condition (e.g. including or excluding the detection of metastases of the brain, no breakdown of cases according to nodal or distant metastases). Because few studies compared eligible tests and because available data for magnetic resonance imaging (MRI) were limited and information regarding the stage of disease at diagnosis was lacking, we could not fully answer the review question.

Four main findings can be drawn from our review.

1. Pre‐SLNB ultrasound combined with fine needle aspiration cytology (FNAC) allows around a fifth of patients with nodal metastases to be identified with few false positive results.

Almost half of all included studies (18/39) considered the use of imaging before SLNB to identify people with nodal metastases, particularly the use of ultrasound with or without FNAC (n = 11). Study populations were well defined and included people likely to be considered eligible for SLNB in routine practice. Studies primarily used B‐mode ultrasound, although two also used Doppler ultrasound in all participants, and the use of ultrasound before or after lymphoscintigraphy to identify sentinel nodes varied.

The Summary of findings table presents key results and translates summary estimates to a hypothetical cohort of 1000 people. Given that completion lymphadenectomy (CLND) is no longer a standard of care for patients who are eligible for SLNB, the previously postulated benefit from ultrasound and FNAC of triaging those with nodal metastases on cytology directly to CLND is no longer relevant. We have therefore framed the potential benefit from ultrasound and FNAC in terms of access to adjuvant therapy, but any benefit would be incurred only if a result from imaging and cytology could be obtained significantly more quickly than an SLNB result, or if SLNB was not available or was contraindicated.

All imaging tests had poor sensitivity, detecting at best a third of people with nodal metastases that were not clinically detectable (sensitivity of 35.4% for ultrasound alone); however, all summary specificities were higher than 90%.

With use of the median prevalence of nodal metastases observed across the 11 studies of ultrasound, a test sensitivity of 35.4% would correctly identify 84 of 237 people with nodal metastases but with 47 false positive results, or people who would be incorrectly considered for adjuvant therapy. Combining ultrasound with FNAC, such that only those positive on both ultrasound and subsequent FNAC would be considered to have nodal disease (i.e. a more narrowly defined threshold for test positivity), reduces by 41 the number of people with nodal metastases who are correctly identified (from 84 to 43) but also reduces to two the number with false positive results. In other words, for every 1000 people eligible for SLNB, ultrasound with FNAC has the potential to allow around a fifth of those with nodal metastases to be considered for adjuvant therapy without the need for a more invasive procedure (SLNB), at a cost of two people being inappropriately managed (false positives).

However, considerable between‐study differences were observed, such that the number of people with false positive results could range between one and seven, and the number of people with false negative results could range between 8 and 134. Results were also dominated by a single large study of 1000 participants from an expert group (Voit 2014), and it is difficult to determine whether these results could be replicated.
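The cohort arithmetic used for this first finding can be reproduced with a short sketch that converts prevalence, sensitivity, and specificity into whole‐person counts. The prevalence (237 per 1000) and ultrasound sensitivity (35.4%) are taken from the text; the summary specificity is not quoted in this passage, so the value used here is back‐calculated from the 47 false positives among the 763 people without nodal metastases and should be read as an assumption rather than a reported estimate. The function name is illustrative.

```python
def natural_frequencies(cohort, prevalence, sensitivity, specificity):
    """Translate accuracy estimates into counts of people in a cohort."""
    diseased = round(cohort * prevalence)
    healthy = cohort - diseased
    tp = round(diseased * sensitivity)
    fp = round(healthy * (1 - specificity))
    return {"TP": tp, "FN": diseased - tp, "FP": fp, "TN": healthy - fp}


# Assumption: specificity implied by 47 false positives in 763 people
# without nodal metastases (not a figure stated in this section).
us_specificity = 1 - 47 / 763

ultrasound = natural_frequencies(
    cohort=1000, prevalence=0.237, sensitivity=0.354, specificity=us_specificity
)
print(ultrasound)

# Share of people with nodal metastases identified by ultrasound plus FNAC,
# using the 43 of 237 quoted in the text.
fnac_share = 43 / 237
print(f"Ultrasound plus FNAC identifies {fnac_share:.0%} of nodal metastases")
```

Expressing results as counts per 1000 people makes the trade‐off explicit: adding FNAC roughly halves the number of nodal metastases identified while nearly eliminating false positive results.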

2. Limited test accuracy data were available for whole body imaging via positron emission tomography‐computed tomography (PET‐CT) for primary staging or re‐staging for disease recurrence and none evaluated MRI.

Of 24 studies meeting the inclusion criteria for this review, only six clearly recruited participants who were undergoing primary staging following a confirmed diagnosis of melanoma and three recruited participants undergoing imaging for re‐staging of disease following some clinical indication of recurrence. Most of the studies (6/9) considered PET‐CT, two in comparison to CT alone, and three studies examined the use of ultrasound. None of the studies included in these groups evaluated MRI. Observed sensitivities and specificities of PET‐CT for the detection of any metastases appeared to be higher for those having re‐staging of disease (summary sensitivity from two studies was 92.6% and specificity 89.7%) compared to primary staging (sensitivities ranged from 30% to 47% and specificities from 73% to 88%); PET‐CT was also more sensitive than CT alone in both population groups, but participant numbers were very small.

3. No conclusions can be drawn regarding routine imaging of the brain with either MRI or CT.

We excluded from this review a number of studies that reported data for ‘conventional imaging’ including CT or MRI because they did not have clearly defined imaging protocols whereby all included participants underwent both tests (Finkelstein 2004; Fuster 2004; Gulec 2003; Oehr 1999; Paquet 2000; Rinne 1998). Furthermore, we identified no eligible studies reporting data for MRI in primary or re‐staging populations.

Of the studies conducted in mixed populations, scan coverage variably included the brain such that the definition of the target condition of any metastasis could either include or specifically exclude the detection of brain metastases. Generally speaking, studies were too small to include significant numbers of brain lesions. Only two studies in mixed population groups identified a sufficient number of brain metastases to allow sensitivities to be estimated. Jouvet 2014 showed CT with intravenous (IV) contrast and MRI with ultrafast gradient echo sequences to have sensitivities of 95% (CT) or more (100% for MRI) compared to 65% sensitivity for MRI without ultrafast gradient echo sequences. In Cachin 2014, PET‐CT detected one of seven confirmed metastases of the brain (sensitivity 14%, 95% CI 0% to 58%).

4. There are high concerns regarding the applicability of the evidence, although risk of bias is generally low.

Study quality was moderate in terms of risk of bias, and there are real concerns regarding the applicability of the evidence to the review question. Much of this concern is due to the inclusion of mixed and not clearly defined participant groups. There was a tendency to include participants based on the availability of results for a particular test, but more careful consideration of the indication for imaging is needed before the comparative accuracy of tests at different points on the clinical pathway can be established. Although there is an understandable temptation to translate results from mixed populations to a primary staging or re‐staging setting, there is at least some evidence that accuracy varies by pathway and in different ways for different tests (Reinhardt 2006), and this is supported by work in other fields (Leeflang 2013).

Further concerns around applicability relate to reporting of data per lesion as opposed to per patient, not only potentially impacting estimates of test sensitivity and specificity but making it more difficult to consider the implications of testing for patient management unless further information is provided in the papers. Although one might expect sensitivity to be inflated by per lesion data, effects on accuracy are not always clear cut. Cachin 2014 was one of the few studies reporting data both per patient and per lesion; both the sensitivity and specificity of PET‐CT were higher for data reported per patient (87% and 71%) compared to those reported per lesion (80% and 54%). The detection of additional metastatic lesions by any one test is of limited benefit if there is no resulting change in stage of disease classification or in patient management options. For example, in Pfluger 2011, the five lesions found to be false negative on unenhanced PET‐CT were all identified in patients with multiple metastases, such that there would have been no impact on TNM stage; on the other hand, all six false positive lesions were identified in otherwise metastasis‐free patients who were falsely upstaged from M0 to M1.
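The difference between per lesion and per patient accuracy can be sketched by rolling lesion‐level results up to the patient. The records below are hypothetical and loosely mirror the pattern described for Pfluger 2011: a missed lesion in a patient with other detected metastases does not change the per patient result, whereas a false positive lesion in a metastasis‐free patient does. A patient is treated as test positive if any lesion is called positive, which is a simplifying assumption.

```python
from collections import defaultdict


def accuracy(pairs):
    """Sensitivity and specificity from (has_disease, test_positive) pairs."""
    pairs = list(pairs)
    tp = sum(1 for d, t in pairs if d and t)
    fn = sum(1 for d, t in pairs if d and not t)
    fp = sum(1 for d, t in pairs if not d and t)
    tn = sum(1 for d, t in pairs if not d and not t)
    return tp / (tp + fn), tn / (tn + fp)


def roll_up(lesions):
    """Collapse (patient, is_metastasis, test_positive) records to patients."""
    has_met, test_pos = defaultdict(bool), defaultdict(bool)
    for patient, is_met, positive in lesions:
        has_met[patient] |= is_met
        test_pos[patient] |= positive
    return [(has_met[p], test_pos[p]) for p in has_met]


# Hypothetical lesion-level records: (patient, is_metastasis, test_positive).
lesions = [
    ("A", True, True), ("A", True, True), ("A", True, False),  # one lesion missed
    ("B", False, True),                                        # falsely upstaged
    ("C", False, False),
]

lesion_sens, lesion_spec = accuracy((m, t) for _, m, t in lesions)
patient_sens, patient_spec = accuracy(roll_up(lesions))
print(f"Per lesion:  sensitivity {lesion_sens:.0%}, specificity {lesion_spec:.0%}")
print(f"Per patient: sensitivity {patient_sens:.0%}, specificity {patient_spec:.0%}")
```

Here the missed lesion lowers per lesion sensitivity but leaves per patient sensitivity unchanged, while the single false positive lesion affects both analyses.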

We also noted variations in the scan coverage between studies, which will impact the definition of the target condition, and limited information was provided on the thresholds used to identify the presence of metastases.

Strengths and weaknesses of the review

The strengths of this review include an in‐depth and comprehensive electronic literature search, systematic review methods including double extraction of papers by both clinicians and methodologists, and attempts to contact study authors to clarify data. A detailed and replicable analysis of methodological quality was undertaken and a clear analysis structure was adopted.

In comparison to other available systematic reviews of imaging tests (e.g. Catalano 2011; Danielsen 2014; El‐Maraghi 2008; Rodriguez 2014; Sadigh 2014; Schroer‐Gunther 2012; Xing 2011), our review covers a more recent search period and a broader review question, including both primary staging and re‐staging of melanoma, as opposed to one or the other (Danielsen 2014; Schroer‐Gunther 2012), and this review considers the comparative accuracy of different tests as opposed to reviewing a single test (as in all other reviews apart from Xing 2011). We have also separately considered data according to reporting of study data per patient as opposed to per lesion ‐ an approach not taken by any of the other identified systematic reviews.

Our stringent application of review inclusion criteria means that we excluded a considerable proportion of studies included in previous reviews. For example, across a selection of four reviews considering PET, we included only 3 of 12 (El‐Maraghi 2008), 7 of 12 (Xing 2011), 3 of 9 (Rodriguez 2014), and 1 of 7 studies included in those reviews (Danielsen 2014). We excluded studies on the basis of having evaluated PET alone rather than PET‐CT (Acland 2000; Acland 2001; Belhocine 2002; Fink 2004; Havenga 2003; Koskivuo 2007; Longo 2003; Nguyen 1999; Steinert 1998; Tyler 2000; Vereecken 2005; Wagner 1999; Wagner 2005), a combination of PET and PET‐CT, which could not be differentiated from each other (Horn 2006; Wagner 2011), use of PET for treatment response (Beasley 2012; Raymond 2011), use of an inadequate reference standard (e.g. minimum follow‐up period was not reported) (Peric 2011), inadequate sample size (Libberecht 2005), or inability to estimate the 2×2 data (Mottaghy 2007). We included all five PET‐CT studies included by Schroer‐Gunther and colleagues for primary staging of melanoma (Schroer‐Gunther 2012), but we considered two of the five to have been conducted in mixed rather than primary staging populations (Aukema 2010b; Strobel 2007b).

A similar picture was observed for other tests. We included only 7 of 22 studies of ultrasound and 4 of 13 studies of CT alone that were included by Xing 2011, 6 of 24 studies of ultrasound from Catalano 2011, and 2 of 8 studies of CT in Sadigh 2014. The most common reason for exclusion of studies of ultrasound from this review was the reporting of more than one ultrasound scan per patient (e.g. Binder 1997; Brountzos 2003; Schmid‐Wendtner 2003; Tregnaghi 1997; Voit 2001). For CT, it was the reporting of accuracy data for CT combined with other imaging tests such as MRI in Finkelstein 2004 and Fuster 2004 or reporting of more than one CT scan per patient in Sawyer 2009 and Swetter 2002, inadequate reference standards (Buzaid 1993; Buzaid 1995; Chomyn 1992; Holder 1998), or the inclusion of more than 10% of participants with non‐cutaneous melanoma (Brady 2006; Sofue 2012).

The main concerns for this review result from the poor reporting of primary studies, in particular, limiting assessment of studies according to clinical pathway, by stage of disease on diagnosis, and by varying definitions of the target condition. This review is also somewhat limited by the date of the last search (2016); however, imaging of melanoma has not been a particularly fast‐moving field (with only 7 of 39 included studies published in the five years before the search); furthermore, we are not aware of publication of any landmark studies in the interim period.

Applicability of findings to the review question

Varying definitions of eligible study populations and lack of clarity regarding the patient pathway and any prior testing restrict the extent to which our findings can be applied in the clinical setting.

Figure 1

Figure 2

Summary of 2015 NICE guideline recommendations for the management of cutaneous melanoma following primary diagnosis (NICE 2015a); not necessarily reflective of current practice.

Figure 3

PRISMA flow diagram.

Figure 4

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.

Figure 5

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.

Figure 6

Forest plot of all data for pre‐SLNB ultrasound, ultrasound plus FNAC, or PET‐CT for the detection of nodal metastasis.
(HN MM ‐ head and neck only malignant melanoma.)

Figure 7

Summary ROC plot comparing pre‐SLNB ultrasound vs ultrasound plus FNAC vs PET‐CT.

Figure 8

Forest plot of imaging for primary staging, for the detection of any metastases, nodal metastases, and distant metastases (per patient and per lesion data).

Figure 9

Forest plot of imaging for re‐staging of melanoma, for the detection of any metastases or nodal metastases (per patient and per lesion data).

Figure 10

Forest plot of tests for the detection of any metastases (mixed populations ‐ per patient data).

Figure 11

Forest plot of tests for the detection of any metastases (mixed populations ‐ per lesion data).

Figure 12

ROC plot of direct comparisons between CT and MRI for the detection of any metastases in mixed population group studies (per lesion data).

Figure 13

ROC plot of direct comparisons between CT and PET‐CT for the detection of any metastases in mixed population group studies (per lesion data).

Figure 14

ROC plot of direct comparisons between MRI and PET‐CT for the detection of any metastases in mixed population group studies (per lesion data).

Figure 15

Forest plot of tests for the detection of nodal metastases (mixed populations ‐ per patient data).

Figure 16

Forest plot of tests for the detection of nodal metastases (mixed populations ‐ per lesion data).

Figure 17

ROC plot of direct comparisons between CT and MRI for the detection of nodal metastases in mixed population group studies (per lesion data).

Figure 18

ROC plot of direct comparisons between CT and PET‐CT for the detection of nodal metastases in mixed population group studies (per lesion data).

Figure 19

ROC plot of direct comparisons between MRI and PET‐CT for the detection of nodal metastases in mixed population group studies (per lesion data).

Figure 20

Forest plot of tests for the detection of distant metastases (mixed populations ‐ per patient data only).

Figure 21

Forest plot of tests for the detection of distant metastases (per lesion data).

Figure 22

ROC plot of direct comparisons between CT and MRI for the detection of distant metastases in mixed population group studies (per lesion data).

Figure 23

ROC plot of direct comparisons between CT and PET‐CT for the detection of distant metastases in mixed population group studies (per lesion data).

Figure 24

ROC plot of direct comparisons between MRI and PET‐CT for the detection of distant metastases in mixed population group studies (per lesion data).

Figure 25

Forest plot of tests for the detection of bone metastasis in mixed population groups (per lesion data).

Figure 26

Forest plot of tests for the detection of lung metastasis in mixed population groups (per lesion data).

Figure 27

Forest plot of tests for the detection of liver metastasis in mixed population groups (per lesion data).

Figure 28

Forest plot of tests: 75 soft tissue metastasis ‐ PET‐CT ‐ MIXED (per lesion), 76 local/subcutaneous metastasis ‐ CT ‐ MIXED (per lesion), 77 local/subcutaneous metastasis ‐ MRI ‐ MIXED (per lesion), 78 local/subcutaneous metastasis ‐ MRI (DW + VIBE) ‐ MIXED (per lesion), 79 local/subcutaneous metastasis ‐ PET‐CT ‐ MIXED (per lesion).

Test 1

Pre‐SLNB US vs Histology ‐ Nodal mets ‐ per patient.

Test 2

Pre‐SLNB US (stringent US criteria) vs Histology ‐ Nodal mets ‐ per patient.

Test 3

Pre‐SLNB US‐FNAC ‐ Nodal mets ‐ per patient.

Test 4

Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ all SLNB ‐ per patient.

Test 5

Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ high risk ‐ per patient.

Test 6

Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ head and neck only ‐ per patient.

Test 7

Pre‐SLNB PET‐CT vs Histology/FU ‐ Nodal mets ‐ high risk ‐ per patient.

Test 8

Pre‐SLNB PET‐CT vs Histology/FU ‐ Nodal mets ‐ head and neck only ‐ per patient.

Test 9

Any metastasis ‐ PET‐CT ‐ PRIMARY ‐ Any stage (per pt).

Test 10

Any metastasis ‐ PET‐CT ‐ PRIMARY ‐ BT > 4 mm (per pt).

Test 11

Any metastasis ‐ CT ‐ RE‐STAGING ‐ Any stage (per pt).

Test 12

Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Any stage (per pt).

Test 13

Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Stage IIIb or less (per pt).

Test 14

Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Stage IIIc to IV (per pt).

Test 15

Any metastasis ‐ CT‐ MIXED ‐ All data (per pt).

Test 16

Any metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per pt).

Test 17

Any metastasis ‐ PET‐CT (plus CT) ‐ Mixed ‐ Any stage (per pt).

Test 18

Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Any stage (per lesion).

Test 19

Any metastasis ‐ CT‐ MIXED ‐ All data (per lesion).

Test 20

Any metastasis (incl brain) ‐ CT (U) ‐ MIXED (per lesion).

Test 21

Any metastasis (incl brain) ‐ CT (CE) ‐ MIXED (per lesion).

Test 22

Any metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion).

Test 23

Any metastasis (excl brain) ‐ MRI (DW + VIBE) ‐ MIXED (per lesion).

Test 24

Any metastasis (incl brain) ‐ MRI (DW) ‐ MIXED (per lesion).

Test 25

Any metastasis (incl brain) ‐ MRI (DW + VIBE) ‐ MIXED (per lesion).

Test 26

Any metastasis (incl brain) ‐ MRI plus CT ‐ MIXED (per lesion).

Test 27

Any metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 28

Any metastasis (incl brain) ‐ PET‐CT (U) ‐ MIXED (per lesion).

Test 29

Any metastasis (direct test comparisons) ‐ CT ‐ Mixed ‐ Stage III/IV (per lesion).

Test 30

Any metastasis (direct test comparisons) ‐ MRI ‐ Mixed ‐ Stage III/IV (per lesion).

Test 31

Any metastasis (direct test comparisons) ‐ PET‐CT ‐ Mixed ‐ Stage III/IV (per lesion).

Test 32

Nodal metastasis ‐ US ‐ PRIMARY (per pt).

Test 33

Nodal metastasis ‐ CT ‐ PRIMARY (per pt).

Test 34

Nodal metastasis ‐ PET‐CT ‐ PRIMARY (per pt).

Test 35

Nodal metastasis ‐ US ‐ RE‐STAGING (per pt).

Test 36

Nodal metastasis ‐ US plus US (CE) ‐ RE‐STAGING (per pt).

Test 37

Nodal metastasis ‐ US ‐ MIXED (per pt).

Test 38

Nodal metastasis ‐ CT ‐ MIXED (per pt).

Test 39

Nodal metastasis (superficial groin) ‐ PET‐CT (indeterminate test positive) ‐ MIXED (per pt).

Test 40

Nodal metastasis (superficial groin) ‐ PET‐CT (indeterminate test negative) ‐ MIXED (per pt).

Test 41

Nodal metastasis (deep groin) ‐ PET‐CT (indeterminate test positive) ‐ MIXED (per pt).

Test 42

Nodal metastasis (deep groin) ‐ PET‐CT (indeterminate test negative) ‐ MIXED (per pt).

Test 43

Nodal metastasis ‐ PET‐CT ‐ MIXED (per pt).

Test 44

Nodal metastasis ‐ CT ‐ MIXED ‐ All data (per lesion).

Test 45

Nodal metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion).

Test 46

Nodal metastasis ‐ MRI (DW + VIBE) ‐ MIXED (per lesion).

Test 47

Nodal metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 48

Superficial nodal metastasis ‐ US ‐ Mixed ‐ stage IV (per LNB).

Test 49

Superficial nodal metastasis ‐ CT ‐ Mixed ‐ stage IV (per LNB).

Test 50

Superficial nodal metastasis ‐ MRI ‐ Mixed ‐ stage IV (per LNB).

Test 51

Superficial nodal metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ Stage IV (per lesion).

Test 52

Superficial nodal metastasis ‐ PET‐CT ‐ Mixed ‐ stage IV (per LNB).

Test 53

Distant metastasis ‐ CT ‐ PRIMARY (per pt).

Test 54

Distant metastasis ‐ PET‐CT ‐ PRIMARY (per pt).

Test 55

Distant metastasis ‐ CT ‐ MIXED ‐ All data (per pt).

Test 56

Distant metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per pt).

Test 57

Distant metastasis ‐ CT ‐ Mixed ‐ All data (per lesion).

Test 58

Distant metastasis ‐ MRI ‐ Mixed ‐ All data (per lesion).

Test 59

Distant metastasis ‐ PET‐CT ‐ Mixed ‐ All data (per lesion).

Test 60

Distant metastasis (excl brain) ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion).

Test 61

Distant metastasis (incl brain) ‐ MRI (DW) ‐ Mixed ‐ stage III/IV (per lesion).

Test 62

Distant metastasis (incl brain) ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion).

Test 63

Bone metastasis ‐ CT‐ MIXED ‐ All data (per lesion).

Test 64

Bone metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion).

Test 65

Bone metastasis ‐ MRI (DW + VIBE) ‐ MIXED ‐ All data (per lesion).

Test 66

Bone metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 67

Liver metastasis ‐ CT‐ MIXED ‐ All data (per lesion).

Test 68

Liver metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion).

Test 69

Liver metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion).

Test 70

Liver metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 71

Lung metastasis ‐ CT ‐ MIXED ‐ All data (per lesion).

Test 72

Lung metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion).

Test 73

Lung metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion).

Test 74

Lung metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 75

Soft tissue metastasis ‐ PET‐CT ‐ MIXED (per lesion).

Test 76

Local/subcutaneous metastasis ‐ CT ‐ MIXED (per lesion).

Test 77

Local/subcutaneous metastasis ‐ MRI ‐ MIXED (per lesion).

Test 78

Local/subcutaneous metastasis ‐ MRI (DW + VIBE) ‐ MIXED (per lesion).

Test 79

Local/subcutaneous metastasis ‐ PET‐CT ‐ MIXED (per lesion).

Test 80

Brain metastasis ‐ CT‐ MIXED ‐ All data (per lesion).

Test 81

Brain metastasis ‐ MRI (DW) ‐ MIXED ‐ All data (per lesion).

Test 82

Brain metastasis ‐ MRI (DW + VIBE) ‐ MIXED ‐ All data (per lesion).

Test 83

Brain metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion).

Test 93

'Other' metastasis ‐ CT ‐ Mixed ‐ Any stage (per lesion).

Test 94

'Other' metastasis ‐ MRI ‐ Mixed ‐ Any stage (per lesion).

Test 95

'Other' metastasis ‐ PET‐CT ‐ Mixed ‐ Any stage (per lesion).

Test 96

'Other' metastasis ‐ CT ‐ Mixed ‐ stage III/IV (per lesion).

Test 97

'Other' metastasis ‐ MRI ‐ Mixed ‐ stage III/IV (per lesion).

Test 98

'Other' metastasis ‐ PET‐CT ‐ Mixed ‐ stage III/IV (per lesion).

Summary of findings table

Question		How accurate is ultrasound, CT, MRI, or PET‐CT for staging or re‐staging of cutaneous invasive melanoma in adults?
Population:		Adults with a confirmed diagnosis of melanoma undergoing imaging for staging purposes: Before sentinel lymph node biopsy (SLNB) to identify nodal metastases For full body staging following removal of the primary melanoma For full body staging due to suspected recurrence of disease
Index test(s):		Ultrasound with or without fine needle aspiration cytology (FNAC) Computed tomography (CT) Magnetic resonance imaging (MRI) Positron emission tomography–computed tomography (PET‐CT)
Comparator test:		All of the index tests may be used in comparison to each other
Target condition:		For pre‐SLNB imaging: detection of nodal metastases For all other imaging: detection of any metastases
Reference standard:		Histology plus clinical or imaging follow‐up
Action:		If accurate, positive results of imaging before SLNB in some circumstances could allow patients with nodal metastases to proceed directly to commence adjuvant therapy and avoid an additional invasive procedure (SLNB). Accurate whole body imaging will allow appropriate locoregional and systemic therapies to be initiated in a timely manner
Quantity of evidence (n = 39 studies)		Number of studies		Number of participants		Number of cases
Per patient data:		34		4980		1265
Per lesion data:		7		417 (1846 lesions)		1061 metastases
Limitations
Risk of bias:		Some concerns due to poor reporting across almost all domains. Unclear risk for participant selection method (11/39) or exclusions not clearly described (3/39). High risk from exclusions on the basis of index test results (4/39). Low risk for the index test for pre‐SLNB ultrasound (6/11), other ultrasound evaluation (3/5), CT (7/10), and MRI (4/4). For PET‐CT, unclear risk from lack of description of blinded case note review to ascertain imaging results for retrospective studies (13/23) and high risk from data driven selection of test threshold (1/23). Unclear risk for reference standard from lack of detail on participant follow‐up schedules (12/39). Lack of blinding of the histological diagnosis (2/39) or data collection on follow‐up (3/39) to the index result. High risk from differential verification (20/39) and participant exclusions (13/39). Low risk for comparisons between tests (6/9)
Applicability of evidence to question:		High or unclear concern for applicability for almost all domains. High concern for participant selection from mixed populations (11/39) or data presented per lesion (5/39). Unclear concern from lack of clarity regarding study population. High concern for index tests from poor description of test thresholds (pre‐SLNB ultrasound (1/11), other ultrasound (1/5), CT (5/10), MRI (3/4), PET‐CT (4/23)) or consensus test interpretation (CT (6/10), MRI (2/4), PET‐CT (11/23)). Unclear concern for application and interpretation of the index test (pre‐SLNB US (10/11), CT (3/10), MRI (2/4), PET‐CT (6/23)) or unclear observer expertise (pre‐SLNB ultrasound (6/11), CT (3), MRI (2/4), PET‐CT (6/23)). Unclear concern for applicability of the reference standard from lack of description of the target condition or no breakdown of cases according to nodal or distant metastases. Expertise of the histopathologist poorly described (6/39)
Findings
Thirty‐nine studies reporting accuracy data for pre‐SLNB imaging (n = 18) or for whole body imaging (n = 24) were included. The 24 studies of whole body imaging were of primary staging (n = 6) or staging for potential recurrence of disease (n = 3), or were conducted in mixed or not clearly described populations (n = 15). As we are unable to make clear statements regarding the expected accuracy of imaging at any particular point on the clinical pathway for the mixed population group, the findings presented are based on results for pre‐SLNB imaging, and for primary staging and re‐staging of melanoma only.
Test: pre‐SLNB imaging
Test	Studies: patients (cases)	Sensitivity (95% CI)	Specificity (95% CI)	Numbers in a cohort of 1000 lesions at a median prevalence of 23.7%^a
				TP (95% CI)	FN (95% CI)	FP (95% CI)	TN (95% CI)
US	11: 2614 (542)	35.4 (17.0 to 59.4)	93.9 (86.1 to 97.5)	84 (40 to 141)	153 (197 to 96)	47 (106 to 19)	716 (657 to 744)
US + FNAC	3: 1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)	43 (8 to 134)	194 (229 to 103)	2 (7 to 1)	761 (756 to 762)
PET‐CT	4: 170 (49)	10.2 (4.31 to 22.3)	96.5 (87.1 to 99.1)	24 (10 to 53)	213 (227 to 184)	27 (98 to 7)	736 (665 to 756)
Whole body imaging for primary staging of melanoma
Quantity of evidence (n = 6 studies)		Number of studies		Number of participants		Number of cases
Any metastases		3		81		51
Nodal metastases		3		373		68
Distant metastases		2		112		17
Findings
Four of the six studies evaluated PET‐CT, one in comparison to CT. In participants with primary melanomas > 4 mm thick (two studies), sensitivities for the detection of any metastases were 30% (95% CI 7% to 65%) to 47% (95% CI 29% to 65%), and specificities 73% (95% CI 45% to 92%) to 88% (95% CI 68% to 97%). One study of any participant referred for PET‐CT demonstrated no false positive results for either CT or PET‐CT for the detection of nodal metastases (specificity 100%, 95% CI 92% to 100%); however, sensitivity was higher for PET‐CT (38%, 95% CI 14% to 68%) compared to CT (23%, 95% CI 5% to 54%). For the detection of distant metastases, two additional cases were detected with PET‐CT (sensitivity 42%, 95% CI 15% to 72%) in comparison to CT (25%, 95% CI 5% to 57%) with no difference in specificity (93%, 95% CI 81% to 99%). One study of PET‐CT suggested that an SUVmax threshold ≥ 2.2 at baseline predicted later recurrence with a sensitivity of 89% (95% CI 52% to 100%) and a specificity of 61% (95% CI 41% to 78%). No data for MRI were identified. Results for ultrasound for the detection of nodal metastases (2 studies) were highly variable and likely subject to bias.
Whole body imaging for re‐staging of melanoma
Quantity of evidence (n = 3 studies)		Number of studies		Number of participants (lesions)		Number of cases (metastases)
Any metastases:		2 (1)		153 (139)		95 (87)
Nodal metastases:		1		460		37
Distant metastases:		0		N/A		N/A
Findings:
Two studies of PET‐CT for re‐staging were pooled; summary sensitivity for the detection of any metastasis was 92.6% (95% CI 85.3% to 96.4%) and specificity 89.7% (95% CI 78.8% to 95.3%) (153 patients, 95 cases). In one of the two studies, PET‐CT was more sensitive (89%, 95% CI 78% to 96%) than CT alone (increase of 21%). With similar specificity (88%, 95% CI 76% to 95%), PET‐CT was more sensitive in the subgroup with stage IIIc to IV disease (100%, 95% CI 81% to 100%) than in those with less advanced disease (84%, 95% CI 69% to 94%). One study of ultrasound in clinically node negative patients undergoing follow‐up demonstrated 100% sensitivity (95% CI 91% to 100%) for 'common signs of malignancy' or focal hypoechoic cortical thickening (considered test positive) with a specificity of 93% (95% CI 90% to 95%). No data for MRI were identified.
^aMedian prevalence observed across 11 studies of pre‐SLNB ultrasound (interquartile range: 25th percentile 20.5%, 75th percentile 25.4%). CT: computed tomography; FN: false negative; FNAC: fine needle aspiration cytology; FP: false positive; MRI: magnetic resonance imaging; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; TN: true negative; TP: true positive.
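The "numbers in a cohort of 1000" columns in the pre‐SLNB table above follow from applying each summary sensitivity and specificity to the median prevalence of 23.7%. A minimal Python sketch of that arithmetic is given below. The function name and the rounding convention are assumptions made for illustration and are not taken from the review; only the point estimates are computed, not the confidence limits.

```python
def natural_frequencies(sensitivity, specificity, prevalence, cohort=1000):
    """Convert accuracy estimates (given as proportions) into expected counts.

    Hypothetical helper for illustration; rounding to whole people is an assumption.
    """
    diseased = round(cohort * prevalence)        # people with nodal metastases
    non_diseased = cohort - diseased             # people without nodal metastases
    tp = round(diseased * sensitivity)           # true positives
    tn = round(non_diseased * specificity)       # true negatives
    return {"TP": tp, "FN": diseased - tp, "FP": non_diseased - tn, "TN": tn}


# Summary estimates from the pre-SLNB rows of the table (as proportions).
summary_estimates = {
    "US": (0.354, 0.939),
    "US + FNAC": (0.180, 0.998),
    "PET-CT": (0.102, 0.965),
}

for test, (sens, spec) in summary_estimates.items():
    print(test, natural_frequencies(sens, spec, prevalence=0.237))
```

Under these assumptions the sketch reproduces the tabulated point estimates, for example 84 TP, 153 FN, 47 FP, and 716 TN for ultrasound.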

Table 1. Cross‐tabulation of studies by index test, population group, and target condition

Study

US‐ FNAC

MRI

PET‐CT

Population group

Population detail

Reference standard

Any metastases

Distant metastases

Nodal metastases

Other sites

PRIMARY STAGING

Arrangoiz 2012

‐

Primary (any); primary (pre‐SLNB)

BT > 4 mm

SLNB/CLND/FU

Per patient

Per patient/Pre‐SLNB

‐

Chai 2012

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND ± FU

‐

Pre‐SLNB

‐

Hafner 2004

‐

(X)

Primary (pre‐SLNB); primary

Standard SLNB; any (incl N+)

SLNB/CLND

‐

Per patient/Pre‐SLNB

‐

Hinz 2011

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Hinz 2013

‐

Primary (pre‐SLNB)

High risk (BT ≥ 2.0 mm or other RF)

SLNB

‐

Pre‐SLNB

‐

Hocevar 2004

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

Pre‐SLNB

‐

Kang 2011

‐

Primary (any)

All staging (incl N+)

Histology/FU

Per patient

‐

Kell 2007

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Klode 2010

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Kunte 2009

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Maubec 2007

‐

Primary (any); primary (pre‐SLNB)

BT > 4 mm

SLNB/CLND ± FU

Per patient

‐

Pre‐SLNB

‐

Prayer 1990

‐

Primary (any)

All staging (incl N+)

CLND/FU

‐

Per patient

‐

Radzhabova 2009

‐

Primary (pre‐SLNB)

Standard SLNB; any (incl N+)

SLNB ± FU

‐

Pre‐SLNB

‐

Revel 2010

‐

Primary (pre‐SLNB)

HN MM

SLNB

‐

Pre‐SLNB

‐

Sanki 2009

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

Pre‐SLNB

Sibon 2007

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB

‐

Pre‐SLNB

‐

Singh 2008

‐

Primary (pre‐SLNB)

Standard SLNB/BT > 4 mm

SLNB

‐

Pre‐SLNB

‐

van Rijk 2006

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

Pre‐SLNB

‐

Veit‐Haibach 2009

‐

Primary (any)

All staging (incl N+)

Histology/FU

‐

Per patient

‐

Voit 2014

‐

Primary (pre‐SLNB)

Standard SLNB

SLNB/CLND

‐

Pre‐SLNB

‐

Wagner 2012

‐

Primary (pre‐SLNB)

High risk (BT ≥ 4 mm or > 1 mm and ulcerated)

SLNB/CLND

‐

Pre‐SLNB

‐

RE‐STAGING

Iagaru 2007

‐

Re‐staging

Any re‐staging

Histology/FU

Per patient/Per lesion

‐

Rubaltelli 2011

‐

Re‐staging

Any FU and suspicious on B‐mode US

FNAC/Histology/FU

‐

Per patient

‐

Strobel 2007a

‐

Re‐staging

High risk (BT > 4 mm, etc.), elevated S100

Histology/Cytology/FU

Per patient

‐

MIXED OR UNCLEARLY REPORTED

Abbott 2011

‐

Mixed

Stage III

Histology/FU

Per patient

‐

Aukema 2010a

‐

(X ‐ Brain)

Mixed

S100 positive

FNAC/Histology/Imaging FU

Per patient

‐

Aukema 2010b

‐

(X ‐ Brain)

Unclear

Node positive

FNAC/Histology/FU

Per patient

‐

Bastiaannet 2009

‐

(X)

Mixed

All node positive

Histology/FU

‐

Per patient

‐

Cachin 2014

‐

Mixed

Stage III

Histology/Imaging/FU

Per patient/Per lesion

Per lesion

Dellestable 2011

‐

Mixed

All staging

Histology/FU

Per lesion

Hausmann 2011

‐

Unclear

Stage III/IV

Histology/FU

Per lesion

Jouvet 2014

‐

Unclear

Stage IV

FNAC/FU

Per lesion

Klebl 2003

‐

Mixed

Clark IV/V in FU

Histology/FU

‐

Per patient

‐

Pfannenberg 2007

‐

Mixed

Stage III/IV

Histology/Imaging/FU

Per lesion

Per patient

Per lesion

Pfluger 2011

‐

Mixed

All stage III

Histology/FU

Per patient

‐

Reinhardt 2006

‐

Mixed

All staging (incl N+)

Histology/FU

Per patient

‐

Strobel 2007b

‐

Unclear

High risk (BT > 4 mm, etc.)

Histology/Cytology/FU

Per patient

‐

van den Brekel 1998

‐

Mixed

HN MM and N+

Histology

‐

Per patient

‐

van Wissen 2016

‐

Mixed

Stage IIIB/IIIC palpable groin mets

Histology (combined groin dissection)

‐

Per patient

‐

Table 2. Summary results from studies of imaging for primary staging or re‐staging

Test	Studies	Participants (cases)	Sensitivity (95% CI), %	Specificity (95% CI), %
Comparison of imaging tests before SLNB
Indirect comparison of imaging tests for detection of nodal metastasis (per patient data)
US	11	2614 (542)	35.4 (17.0 to 59.4)	93.9 (86.1 to 97.5)
US‐FNAC	3	1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)
PET‐CT	4	170 (49)	10.2 (4.31 to 22.3)	96.5 (87.1 to 99.1)
Difference			P = 0.07	P < 0.001
Direct comparison of imaging tests for detection of nodal metastasis (per patient data)
US	3	1164 (259)	58.7 (36.5 to 77.9)	79.4 (70.0 to 86.4)
US‐FNAC	3	1164 (259)	18.0 (3.58 to 56.5)	99.8 (99.1 to 99.9)
Difference			‐40.7 (‐75.0 to ‐6.50), P = 0.02	+20.4 (+12.2 to +28.6), P < 0.001
Whole body imaging
Imaging for re‐staging for the detection of any metastasis (per patient data)
PET‐CT	2^a	153 (95)	92.6 (85.3 to 96.4)	89.7 (78.8 to 95.3)
CI: confidence interval; CT: computed tomography; FNAC: fine needle aspiration cytology; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; US: ultrasound. ^aWhere there were only two studies, estimates of summary sensitivity and summary specificity were obtained by using univariate fixed‐effect logistic regression models to pool sensitivities and specificities separately.
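Footnote a describes pooling two studies with univariate fixed‐effect logistic regression. With no covariates, such a model reduces to the pooled proportion across studies, with a Wald confidence interval computed on the logit scale. The Python sketch below illustrates this under stated assumptions: the function name is hypothetical, and the counts used in the example (88 of 95 cases test positive, 52 of 58 non‐cases test negative) are illustrative values chosen to be consistent with the reported totals of 153 patients and 95 cases; they are not extracted from the included studies.

```python
import math


def pooled_proportion(events, total, z=1.96):
    """Pooled proportion with a logit-scale Wald interval.

    Equivalent to an intercept-only fixed-effect logistic regression fitted
    to the summed counts. Hypothetical helper for illustration only.
    """
    non_events = total - events
    if events == 0 or non_events == 0:
        raise ValueError("logit interval is undefined when a cell is zero")
    logit = math.log(events / non_events)
    se = math.sqrt(1 / events + 1 / non_events)   # standard error of the logit

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return events / total, expit(logit - z * se), expit(logit + z * se)


# Illustrative counts only (assumed, not reported in the table).
sens, sens_lo, sens_hi = pooled_proportion(events=88, total=95)
spec, spec_lo, spec_hi = pooled_proportion(events=52, total=58)
print(f"sensitivity {sens:.1%} ({sens_lo:.1%} to {sens_hi:.1%})")
print(f"specificity {spec:.1%} ({spec_lo:.1%} to {spec_hi:.1%})")
```

With these assumed counts the sketch gives estimates that agree with the PET‐CT re‐staging row (92.6%, 85.3% to 96.4%; 89.7%, 78.8% to 95.3%). The bivariate models used where more studies were available are not covered by this sketch.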

Table 3. Characteristics of studies conducted in mixed or unclear population groups

Study Population group	Participant inclusion criteria and reported indications for imaging	Stage of disease on presentation	Imaging tests	Patients/cases (prevalence) [lesions/metastases (prevalence)]	Average no. metastases per patient
PER PATIENT DATA
Abbott 2011 Mixed – primary or follow‐up	Undergoing FU after prior SLNB/CLND for micro‐metastases or presenting with clinically detectable nodal disease at or subsequent to initial diagnosis	Stage: IIIA 18, 53% IIIB 10, 29% IIIC 6, 18%	PET‐CT (NR)	34/7 (21%)	N/A
Aukema 2010a Mixed – primary or re‐staging	Asymptomatic S100 positive. Previously treated for locoregional recurrence (n = 15) or distant metastases (n = 5); or with unfavourable primary tumour (n = 6), primary melanoma with simultaneous nodal metastases (n = 20)	Any (stage NR)	PET‐CT (U)	46/23 (50%)	N/A
Aukema 2010b Unclear	Palpable and pathology proven lymph node metastases and no signs of distant metastases. Imaging to identify further ‘undetected’ disease	Stage III: 100%	PET‐CT (U)	70/30 (43%)	N/A
Bastiaannet 2009 Mixed – primary or re‐staging	Node positive (clinical or histology/cytology proven) candidates for CLND; imaging to identify further disease. Includes those with LN mets diagnosed at time of primary diagnosis 39, 15.5%; LN metastases identified ≤ 3 years since primary diagnosis 145, 57.8%; recurrence > 3 years since primary diagnosis 67, 26.7%	Stage III (100%)	CT (CE)	251/78 (31%) distant metastases	N/A
Cachin 2014 Mixed ‐ staging or re‐staging	Any primary MM, visceral metastases, or cutaneous metastases from unknown primary	Any; 51% with metastases	PET‐CT (NR)	67/39 (58%) [176/85 (48%)]	N/A
Klebl 2003 Mixed	Clark level IV or V undergoing FU after primary surgery. Reports primary (n = 8) and imaging during follow‐up (n = 75)	Any (NR)	US	79/17 (22%) nodal	N/A
Reinhardt 2006 Mixed – primary, re‐staging, FU, disease response	All with PET‐CT for primary staging after sentinel node biopsy (n = 75); therapy control after chemotherapy of metastatic disease (n = 42); staging of clinically suspected recurrent disease (n = 65); during follow‐up within 5 years of primary treatment (n = 68)	Stage I 22, 9% Stage II 88, 35% Stage III 108, 43% Stage IV 32, 13%	CT (CE) PET‐CT (CE)	250/116 (46%)	N/A
Strobel 2007b Unclear	High risk melanoma (BT > 4 mm, or Clark level III or IV, or known resected metastases) with PET‐CT for depiction or exclusion of metastases	Any (NR)	PET‐CT (CE)	124/53 (43%)	N/A
van den Brekel 1998 Mixed ‐ primary and recurrence	Head and neck MM with CT before neck dissection, including therapeutic and elective (negative on palpation). "Interval between the treatment of the primary and the neck dissection ranged from 0 to 8.8 years (mean: 21 months)"	Stage I to II 8, 31% Stage III 18, 69%	CT (CE)	26/21 (81%) nodal	N/A
van Wissen 2016 Mixed ‐ primary and recurrence	Stage IIIB or IIIC MM with palpable groin metastases; selected for therapeutic combined groin dissection. Discussion states: "large proportion of our patients were initially treated for their primary tumour at other hospitals, and sometimes years prior to the current groin dissection"	All stage IIIB and C	PET‐CT (U)	69/59 (superficial nodes 86%) 67/24 (deep nodes 36%)	N/A
PER LESION DATA
Cachin 2014 Mixed ‐ staging or re‐staging	Any primary MM, visceral metastases, or cutaneous metastases from unknown primary. Lesions with equivocal focal uptake considered test positive. Only 1 eligible index test	Any: 51% with metastases	CT (NR)	67/39 (58%) [176/85 (48%)]	1 (85/67)
Dellestable 2011 Mixed ‐ primary or follow‐up	All with PET‐CT regardless of AJCC stage or indication for examination. Number of lesions included varies per test	Stage I to II: 27.5% Stage III to IV: 72.5%	CT (CE) MRI (DW) PET‐CT	40 [108/66 (61%)] [117/70 (60%)] [119/72 (61%)]	2 (72/40)
Hausmann 2011 Unclear	AJCC stage III or IV with positive SLNB or suspicious lesions on ultrasound or X‐ray studies. Number of lesions included same per test	Stage III 38% Stage IV 62%	CT (CE) MRI (NR)	33 All tests [824/455 (55%)]	14 (455/33)
Jouvet 2014 Unclear	AJCC stage IV. Number of lesions included varies per test	Stage IV: 100%	CT (CE) MRI (DW) MRI (DW + ultrafast GE) PET‐CT	37 (218 lesions) [209/115 (55%)] [218/125 (57%)] [191/104 excl brain (54%)]	3 (125/37)
Pfannenberg 2007 Mixed – incl primary, FU, and NR	Stage III or IV imaged before surgery due to abnormal radiological, clinical, and laboratory findings, or routine surveillance in high risk. Number of lesions included same per test	Stage III: 39% Stage IV: 61%	CT (CE) MRI (DW + ultrafast GE) PET‐CT	64 All tests [420/297 (71%)]	5 (297/64)
Pfluger 2011 Mixed ‐ primary or follow‐up	Melanoma with regional lymph node metastases; excluded any lesions newly arising during follow‐up. Number of lesions included same per test	Stage III: 100%	CT (CE); CT (U) PET‐CT (CE); (U)	50 All tests [232/151 (65%)]	3 (151/50)
AJCC: American Joint Committee on Cancer; BT: Breslow thickness; CE: contrast enhanced; CLND: complete lymph node dissection; CT: computed tomography; DW: diffusion weighted; FNAC: fine needle aspiration cytology; FU: follow‐up; GE: gradient echo; HN: head and neck; LN: lymph node; MM: malignant melanoma; MRI: magnetic resonance imaging; mm: millimetre; N+: node positive; N/A: not applicable; NR: not reported; PET: positron emission tomography; SLNB: sentinel lymph node biopsy; U: unenhanced; US: ultrasound; VIBE: volumetric interpolated breath‐hold examination (MRI sequence).
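The prevalence and "average no. metastases per patient" columns of Table 3 are simple ratios of the counts shown in each row. As a rough check, they can be recomputed as in the Python sketch below; the helper names are hypothetical, and rounding to the nearest whole number is an assumption inferred from the tabulated values.

```python
def prevalence_pct(cases, total):
    """Prevalence as a whole-number percentage (cases out of patients or lesions)."""
    return round(100 * cases / total)


def mean_metastases_per_patient(metastases, patients):
    """Average number of metastases per patient, rounded to a whole number."""
    return round(metastases / patients)


# Abbott 2011: 7 cases among 34 patients.
print(prevalence_pct(7, 34))
# Hausmann 2011: 455 metastases in 33 patients.
print(mean_metastases_per_patient(455, 33))
```

These two calls correspond to the 21% prevalence reported for Abbott 2011 and the 14 metastases per patient reported for Hausmann 2011.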

Table Tests. Data tables by test

Test	No. of studies	No. of participants
1 Pre‐SLNB US vs Histology ‐ Nodal mets ‐ per patient	11	2604
2 Pre‐SLNB US (stringent US criteria) vs Histology ‐ Nodal mets ‐ per patient	1	132
3 Pre‐SLNB US‐FNAC ‐ Nodal mets ‐ per patient	3	1164
4 Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ all SLNB ‐ per patient	4	170
5 Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ high risk ‐ per patient	3	75
6 Pre‐SLNB PET‐CT vs Histology ‐ Nodal mets ‐ head and neck only ‐ per patient	1	20
7 Pre‐SLNB PET‐CT vs Histology/FU ‐ Nodal mets ‐ high risk ‐ per patient	2	76
8 Pre‐SLNB PET‐CT vs Histology/FU ‐ Nodal mets ‐ head and neck only ‐ per patient	1	22
9 Any metastasis ‐ PET‐CT ‐ PRIMARY ‐ Any stage (per pt)	1	37
10 Any metastasis ‐ PET‐CT ‐ PRIMARY ‐ BT > 4 mm (per pt)	2	81
11 Any metastasis ‐ CT ‐ RE‐STAGING ‐ Any stage (per pt)	1	106
12 Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Any stage (per pt)	2	153
13 Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Stage IIIb or less (per pt)	1	76
14 Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Stage IIIc to IV (per pt)	1	30
15 Any metastasis ‐ CT‐ MIXED ‐ All data (per pt)	1	250
16 Any metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per pt)	6	591
17 Any metastasis ‐ PET‐CT (plus CT) ‐ Mixed ‐ Any stage (per pt)	1	124
18 Any metastasis ‐ PET‐CT ‐ RE‐STAGING ‐ Any stage (per lesion)	1	139
19 Any metastasis ‐ CT‐ MIXED ‐ All data (per lesion)	5	1770
20 Any metastasis (incl brain) ‐ CT (U) ‐ MIXED (per lesion)	1	232
21 Any metastasis (incl brain) ‐ CT (CE) ‐ MIXED (per lesion)	1	209
22 Any metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion)	4	1556
23 Any metastasis (excl brain) ‐ MRI (DW + VIBE) ‐ MIXED (per lesion)	1	195
24 Any metastasis (incl brain) ‐ MRI (DW) ‐ MIXED (per lesion)	1	218
25 Any metastasis (incl brain) ‐ MRI (DW + VIBE) ‐ MIXED (per lesion)	1	218
26 Any metastasis (incl brain) ‐ MRI plus CT ‐ MIXED (per lesion)	1	116
27 Any metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	5	1138
28 Any metastasis (incl brain) ‐ PET‐CT (U) ‐ MIXED (per lesion)	1	232
29 Any metastasis (direct test comparisons) ‐ CT ‐ Mixed ‐ Stage III/IV (per lesion)	3	1430
30 Any metastasis (direct test comparisons) ‐ MRI ‐ Mixed ‐ Stage III/IV (per lesion)	3	1439
31 Any metastasis (direct test comparisons) ‐ PET‐CT ‐ Mixed ‐ Stage III/IV (per lesion)	2	611
32 Nodal metastasis ‐ US ‐ PRIMARY (per pt)	2	317
33 Nodal metastasis ‐ CT ‐ PRIMARY (per pt)	1	56
34 Nodal metastasis ‐ PET‐CT ‐ PRIMARY (per pt)	1	56
35 Nodal metastasis ‐ US ‐ RE‐STAGING (per pt)	1	460
36 Nodal metastasis ‐ US plus US (CE) ‐ RE‐STAGING (per pt)	1	460
37 Nodal metastasis ‐ US ‐ MIXED (per pt)	1	79
38 Nodal metastasis ‐ CT ‐ MIXED (per pt)	2	276
39 Nodal metastasis (superficial groin) ‐ PET‐CT (indeterminate test positive) ‐ MIXED (per pt)	1	69
40 Nodal metastasis (superficial groin) ‐ PET‐CT (indeterminate test negative) ‐ MIXED (per pt)	1	69
41 Nodal metastasis (deep groin) ‐ PET‐CT (indeterminate test positive) ‐ MIXED (per pt)	1	67
42 Nodal metastasis (deep groin) ‐ PET‐CT (indeterminate test negative) ‐ MIXED (per pt)	1	67
43 Nodal metastasis ‐ PET‐CT ‐ MIXED (per pt)	1	250
44 Nodal metastasis ‐ CT ‐ MIXED ‐ All data (per lesion)	4	629
45 Nodal metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion)	4	630
46 Nodal metastasis ‐ MRI (DW + VIBE) ‐ MIXED (per lesion)	1	53
47 Nodal metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	4	288
48 Superficial nodal metastasis ‐ US ‐ Mixed ‐ stage IV (per LNB)	1	33
49 Superficial nodal metastasis ‐ CT ‐ Mixed ‐ stage IV (per LNB)	1	33
50 Superficial nodal metastasis ‐ MRI ‐ Mixed ‐ stage IV (per LNB)	1	33
51 Superficial nodal metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ Stage IV (per lesion)	1	33
52 Superficial nodal metastasis ‐ PET‐CT ‐ Mixed ‐ stage IV (per LNB)	1	33
53 Distant metastasis ‐ CT ‐ PRIMARY (per pt)	1	56
54 Distant metastasis ‐ PET‐CT ‐ PRIMARY (per pt)	2	112
55 Distant metastasis ‐ CT ‐ MIXED ‐ All data (per pt)	2	501
56 Distant metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per pt)	1	250
57 Distant metastasis ‐ CT ‐ Mixed ‐ All data (per lesion)	4	920
58 Distant metastasis ‐ MRI ‐ Mixed ‐ All data (per lesion)	4	926
59 Distant metastasis ‐ PET‐CT ‐ Mixed ‐ All data (per lesion)	4	618
60 Distant metastasis (excl brain) ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion)	1	142
61 Distant metastasis (incl brain) ‐ MRI (DW) ‐ Mixed ‐ stage III/IV (per lesion)	1	165
62 Distant metastasis (incl brain) ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion)	1	165
63 Bone metastasis ‐ CT‐ MIXED ‐ All data (per lesion)	3	97
64 Bone metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion)	3	99
65 Bone metastasis ‐ MRI (DW + VIBE) ‐ MIXED ‐ All data (per lesion)	1	35
66 Bone metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	4	133
67 Liver metastasis ‐ CT‐ MIXED ‐ All data (per lesion)	4	150
68 Liver metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion)	4	155
69 Liver metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion)	1	27
70 Liver metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	4	94
71 Lung metastasis ‐ CT ‐ MIXED ‐ All data (per lesion)	4	325
72 Lung metastasis ‐ MRI ‐ MIXED ‐ All data (per lesion)	4	325
73 Lung metastasis ‐ MRI (DW + VIBE) ‐ Mixed ‐ stage III/IV (per lesion)	1	45
74 Lung metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	4	155
75 Soft tissue metastasis ‐ PET‐CT ‐ MIXED (per lesion)	1	25
76 Local/subcutaneous metastasis ‐ CT ‐ MIXED (per lesion)	3	139
77 Local/subcutaneous metastasis ‐ MRI ‐ MIXED (per lesion)	3	148
78 Local/subcutaneous metastasis ‐ MRI (DW + VIBE) ‐ MIXED (per lesion)	1	22
79 Local/subcutaneous metastasis ‐ PET‐CT ‐ MIXED (per lesion)	3	102
80 Brain metastasis ‐ CT‐ MIXED ‐ All data (per lesion)	1	20
81 Brain metastasis ‐ MRI (DW) ‐ MIXED ‐ All data (per lesion)	1	20
82 Brain metastasis ‐ MRI (DW + VIBE) ‐ MIXED ‐ All data (per lesion)	1	20
83 Brain metastasis ‐ PET‐CT ‐ MIXED ‐ All data (per lesion)	1	9
93 'Other' metastasis ‐ CT ‐ Mixed ‐ Any stage (per lesion)	1	26
94 'Other' metastasis ‐ MRI ‐ Mixed ‐ Any stage (per lesion)	1	21
95 'Other' metastasis ‐ PET‐CT ‐ Mixed ‐ Any stage (per lesion)	1	26
96 'Other' metastasis ‐ CT ‐ Mixed ‐ stage III/IV (per lesion)	2	160
97 'Other' metastasis ‐ MRI ‐ Mixed ‐ stage III/IV (per lesion)	2	160
98 'Other' metastasis ‐ PET‐CT ‐ Mixed ‐ stage III/IV (per lesion)	1	25