Intraoperative imaging technology to maximise extent of resection for glioma

Michael D Jenkinson; Damiano Giuseppe Barone; Andrew Bryant; Luke Vale; Helen Bulbeck; Theresa A Lawrie; Michael G Hart; Colin Watts

doi:10.1002/14651858.CD012788.pub2

Version published: 22 January 2018

Abstract

Background

Extent of resection is considered to be a prognostic factor in neuro‐oncology. Intraoperative imaging technologies are designed to help surgeons maximise the extent of resection. It is not clear whether any of these tools, some of which are very expensive, should be recommended (alone or in combination) as standard care for people with brain tumours. We set out to determine whether intraoperative imaging technology offers any advantage in terms of extent of resection over standard surgery and whether any one technology is more effective than another.

Objectives

To establish the overall effectiveness and safety of intraoperative imaging technology in resection of glioma. To supplement this review of effects, we also wished to identify cost analyses and economic evaluations as part of a Brief Economic Commentary (BEC).

Search methods

We searched the Cochrane Central Register of Controlled Trials (CENTRAL) (Issue 7, 2017), MEDLINE (1946 to June, week 4, 2017), and Embase (1980 to 2017, week 27). We searched the reference lists of all identified studies. We handsearched two journals, the Journal of Neuro‐Oncology and Neuro‐oncology, from 1991 to 2017, including all conference abstracts. We contacted neuro‐oncologists, trial authors, and manufacturers regarding ongoing and unpublished trials.

Selection criteria

Randomised controlled trials evaluating people of all ages with presumed new or recurrent glial tumours (of any location or histology) from clinical examination and imaging (computed tomography (CT) or magnetic resonance imaging (MRI), or both). Additional imaging modalities (e.g. positron emission tomography, magnetic resonance spectroscopy) were not mandatory. Interventions included intraoperative MRI (iMRI), fluorescence‐guided surgery, ultrasound, and neuronavigation (with or without additional image processing, e.g. tractography).

Data collection and analysis

Two review authors independently assessed the search results for relevance, undertook critical appraisal according to known guidelines, and extracted data using a prespecified pro forma.

Main results

We identified four randomised controlled trials, using different intraoperative imaging technologies: iMRI (2 trials including 58 and 14 participants, respectively); fluorescence‐guided surgery with 5‐aminolevulinic acid (5‐ALA) (1 trial, 322 participants); and neuronavigation (1 trial, 45 participants). We identified one ongoing trial assessing iMRI with a planned sample size of 304 participants for which results are expected to be published around autumn 2018. We identified no trials for ultrasound.

Meta‐analysis was not appropriate due to differences in the tumours included (eloquent versus non‐eloquent locations) and variations in the image guidance tools used in the control arms (usually selective utilisation of neuronavigation). There were significant concerns regarding risk of bias in all the included studies. All studies included people with high‐grade glioma only.

Extent of resection was increased in one trial of iMRI (risk ratio (RR) of incomplete resection 0.13, 95% confidence interval (CI) 0.02 to 0.96; 1 study, 49 participants; very low‐quality evidence) and in the trial of 5‐ALA (RR of incomplete resection 0.55, 95% CI 0.42 to 0.71; 1 study, 270 participants; low‐quality evidence). The other trial assessing iMRI was stopped early after an unplanned interim analysis including 14 participants, therefore the trial provides very low‐quality evidence. The trial of neuronavigation provided insufficient data to evaluate the effects on extent of resection.

Reporting of adverse events was incomplete and suggestive of significant reporting bias (very low‐quality evidence). Overall, reported events were low in most trials. There was no clear evidence of improvement in overall survival with 5‐ALA (hazard ratio 0.83, 95% CI 0.62 to 1.07; 1 study, 270 participants; low‐quality evidence). Progression‐free survival data were not available in an appropriate format for analysis. Data for quality of life were only available for one study and suffered from significant attrition bias (very low‐quality evidence).

Authors' conclusions

Intraoperative imaging technologies, specifically iMRI and 5‐ALA, may be of benefit in maximising extent of resection in participants with high‐grade glioma. However, this is based on low to very low quality evidence, and is therefore very uncertain. The short‐ and long‐term neurological effects are uncertain. Effects of image‐guided surgery on overall survival, progression‐free survival, and quality of life are unclear. A brief economic commentary found limited economic evidence for the equivocal use of iMRI compared with conventional surgery. In terms of costs, a non‐systematic review of economic studies suggested that, compared with standard surgery, use of image‐guided surgery has an uncertain effect on costs and that 5‐aminolevulinic acid was more costly. Further research, including studies of ultrasound‐guided surgery, is needed.

Plain language summary

Image‐guided surgery for brain tumours

Background
Surgery has a key role in the management of many types of brain tumour. Removing as much tumour as possible is very important, as in some types of brain tumour this can help patients to live longer and to feel better. However, removing a brain tumour may in some cases be difficult because the tumour either looks like normal brain tissue or is near brain tissue that is needed for normal functioning. New methods of seeing tumours during surgery have been developed to help surgeons better identify tumour from normal brain tissue.

Question
1. Is image‐guided surgery more effective at removing brain tumours than surgery without image guidance?
2. Is one image guidance technology or tool better than another?

Study characteristics
Our search strategy is up to date as of July 2017. We found four trials looking at three different types of tools to help improve the amount of tumour that is removed. The tumour being evaluated was high‐grade glioma. Imaging interventions used during surgery included:

• magnetic resonance imaging (iMRI) during surgery to assess the amount of remaining tumour;
• fluorescent dye (5‐aminolevulinic acid) to mark out the tumour; or
• imaging before surgery to map out the location of a tumour, which was then used at the time of surgery to guide the surgery (neuronavigation).

All the studies had compromised methods, which could mean their conclusions were biased. Some studies were funded by the manufacturers of the image guidance technology being evaluated.

Key results
We found low‐ to very low‐quality evidence that use of image‐guided surgery may result in more of the tumour being removed surgically in some people. The short‐ and long‐term neurological effects are uncertain. We did not have the data to determine whether any of the evaluated technologies affect overall survival, time until disease progression, or quality of life. There was very low‐quality evidence for neuronavigation, and we identified no trials for ultrasound guidance. In terms of costs, a non‐systematic review of economic studies suggested that, compared with standard surgery, use of image‐guided surgery has an uncertain effect on costs and that 5‐aminolevulinic acid was more costly than conventional surgery.

Quality of the evidence
Evidence for intraoperative imaging technology for use in removing brain tumours is sparse and of low to very low quality. Further research is needed to assess three main questions.

1. Is removing more of the tumour better for the patient in the long term?
2. What are the risks of causing a patient to have worse symptoms by taking out more of the tumour?
3. How does resection affect a patient's quality of life?

Authors' conclusions

Implications for practice

The purpose of these technologies is to make surgical resection safer and more effective. Patient selection, patient‐specific information, and informed consent are all essential to ensure that these technologies are used appropriately in the pathway of care. Standardisation of patient management through the use of evidence‐based clinical practice will ensure consistent surgical standards of care wherever a patient is treated.

Patient selection must be emphasised. All the trials included predominantly young participants of good performance status and with a well‐defined tumour in a non‐eloquent region that was amenable to safe complete resection.

Implications for research

The current studies provide a limited knowledge base upon which to consider implementing such technologies. Important questions remain about benefit in terms of overall survival, progression‐free survival, and the risk of adverse events. Future trials could be done with a similar design to those already performed but with simple improvements to the trial methodology and outcome reporting.

A direct comparison between individual intraoperative imaging technologies could help to establish their relative merits and, in particular, to provide cost‐effectiveness data. The most logical comparison would be between iMRI and 5‐ALA, while ultrasound and advanced imaging neuronavigation (e.g. tractography or functional imaging based) have theoretical advantages but have not yet been the subject of a randomised controlled trial (RCT). However, units with access to all technologies are likely to be rare, and patients who are suitable for either procedure are likely to be very highly selected, although experience‐based RCTs are a possible way around this. Nevertheless, there are ongoing RCTs comparing different forms of image‐guided surgery, and these can hopefully be incorporated into an update of this review once they are completed (Ongoing studies). A network meta‐analysis may allow indirect comparisons of each technology, and a formal economic analysis could allow financial considerations to be factored into the equation.

Evidence regarding extent of resection and the means with which to achieve this is becoming stronger, but this still needs to be balanced with making surgery safer. Awake craniotomy is probably the main means of enabling a maximal safe resection, particularly with tumours in eloquent areas. A comparison of tractography or functional MRI guided surgery versus awake craniotomy is potentially a relevant question for resection of tumours in eloquent areas.

Summary of findings

Summary of findings 1. iMRI image‐guided surgery compared to standard surgery for high‐grade glioma

iMRI image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: iMRI image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery
Outcomes	Illustrative comparative risks* (95% CI): assumed risk, control	Illustrative comparative risks* (95% CI): corresponding risk, image‐guided surgery	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	32 per 100¹	4 per 100 (1 to 31)	RR 0.13 (0.02 to 0.96)	49 participants (1 study)	⊕⊝⊝⊝ very low¹,²,³	Small trial of highly selected participants with potential bias in allocation and performance. One other trial reported this outcome but did not contribute towards the analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝ very low⁴	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable				⊕⊝⊝⊝ very low⁴	Not reported by trial authors so graded as very low quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝ very low⁴	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Not estimable				⊕⊝⊝⊝ very low⁴	Quality of life was not reported in the trial.
*The assumed risk is based on the individual trial only, as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; iMRI: intraoperative magnetic resonance imaging; RR: risk ratio
GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.
¹Expressed in terms of risk of incomplete resection (bad outcome).
²Small trial so quality of the evidence downgraded by one level.
³Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
⁴Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.
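The relationship between the assumed risk, the relative effect, and the corresponding risk in the table above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the corresponding risk is simply the assumed risk multiplied by the risk ratio (and by each confidence limit) and rounded to whole people per 100; the function name `corresponding_risk` is illustrative and does not come from the review.

```python
def corresponding_risk(assumed_per_100, rr, rr_low, rr_high):
    """Return (point, lower, upper) corresponding risk per 100 people.

    Assumption for illustration: corresponding risk = assumed risk x risk ratio,
    applied to the point estimate and to each 95% confidence limit,
    then rounded to the nearest whole person.
    """
    def scale(ratio):
        return round(assumed_per_100 * ratio)

    return scale(rr), scale(rr_low), scale(rr_high)


# Inputs taken from the iMRI row above: 32 per 100 with incomplete
# resection in the control arm, RR 0.13 (95% CI 0.02 to 0.96).
print(corresponding_risk(32, 0.13, 0.02, 0.96))  # (4, 1, 31)
```

The result agrees with the "4 per 100 (1 to 31)" shown in the table, which expresses the risk of incomplete resection.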

Summary of findings 2. 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma

5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: 5‐ALA image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery
Outcomes	Illustrative comparative risks* (95% CI): assumed risk, control	Illustrative comparative risks* (95% CI): corresponding risk, image‐guided surgery	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	64 per 100¹	35 per 100 (27 to 45)	RR 0.55 (0.42 to 0.71)	270 participants (1 study)	⊕⊕⊝⊝ low²	Highly selected participants with potential bias in allocation and performance.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝ very low³	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable due to reporting of HR and since just a single trial reported on this outcome we did not arbitrarily choose a snap shot in time in which to use as basis to calculate the assumed and corresponding risks as this may be misleading.		HR 0.82 (0.62 to 1.07)	270 participants (1 study)	⊕⊕⊝⊝ low²	The overall quality of this outcome was low in this trial and was downgraded for highly selected participants with potential bias in allocation and performance.
Progression‐free survival	Not adequately reported in the trials				⊕⊝⊝⊝ very low³	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝ very low³	Quality of life was not reported in the trial.
*The assumed risk is based on the individual trial only, as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
5‐ALA: 5‐aminolevulinic acid; CI: confidence interval; HR: hazard ratio; RR: risk ratio
GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.
¹Expressed in terms of risk of incomplete resection (bad outcome).
²Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
³Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.

Summary of findings 3. Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma
Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: neuronavigation image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery
Outcomes	Illustrative comparative risks* (95% CI): assumed risk, control	Illustrative comparative risks* (95% CI): corresponding risk, image‐guided surgery	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	Not estimable	Not estimable	Not reported	45 participants (1 study)	⊕⊝⊝⊝ very low¹,²,⁴	Small study of highly selected participants at very high risk of allocation bias. Complete resection was achieved in three participants in the control group and five participants in the neuronavigation group. However, there was significant attrition, with not all participants completing imaging, and the denominators for these figures were not stated, precluding formal analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝ very low²	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable				⊕⊝⊝⊝ very low³	Not reported by trial authors so graded as very low quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝ very low²	Progression‐free survival or time to progression was not reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝ very low³	Quality of life was reported in the trial but only 19 participants (8 in the neuronavigation arm and 11 in the standard surgery arm) completed questionnaires postoperatively at 3 months, constituting only 64.5% of all eligible participants, and no statistical analysis was presented.
*The assumed risk is based on the individual trial only, as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval
GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.
¹Small trial so quality of the evidence downgraded by one level.
²Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
³Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.

Background

Description of the condition

Tumours of the central nervous system constitute a large group characterised by a wide range of genetic, histological, and functional diversity (Louis 2016). Secondary brain tumours or metastases are the most common, accounting for almost half of all central nervous system tumours. Primary brain tumours typically occur as some variation of a glioma, so called because they arise from the supporting glial cell architecture; of these, glioblastoma is the most frequent and most malignant histological subtype (Ohgaki 2009).

Brain tumours may present with headaches, neurological deficits, or seizures, alone or in combination. Treatment choices include surgery (usually biopsy or resection), radiotherapy, and chemotherapy. National guidelines recommend that management of a central nervous system tumour should be discussed by a multidisciplinary team and individually tailored to patient needs (NICE 2006).

Description of the intervention

Intraoperative magnetic resonance imaging (iMRI) involves the use of an MRI during the actual operation to assess where there remains tumour that can be removed. Details on the fine anatomical structure of soft tissues provided by this technique have revolutionised the field of neuroscience, but the equipment is expensive and bulky. Intraoperative MRI requires a specific portable MRI scanner or a parallel stationary MRI scanner that is available for use in an adjacent diagnostic room. Acquisition of iMRI is aimed at providing high‐definition, easily interpreted images for real‐time assessment of tumour resection, allowing the possibility of immediate further resection during the same operative session (Black 1997; Seifert 2003). Uptake has been limited by low field strength scanners, which are associated with poor image quality, extended surgical time, and substantial capital costs.

Neuronavigation refers to representing a spatial position on the patient in imaging data. Preoperative imaging is used to localise a lesion, perform a tailored craniotomy, and guide resection. Postoperative MRI is performed to determine the extent of resection. A major limitation of this technique is the phenomenon of intraoperative brain shift, whereby the preoperative anatomy is altered during tumour resection and accuracy is consequently reduced. Advantages include the potential to use advanced imaging (functional MRI or tractography, for example) to define eloquent or invaded tissues.

Ultrasonography, performed in two or three dimensions (2D or 3D, respectively), enables visualisation of structures through recorded reflections of echoes of ultrasonic wave pulses (frequency > 20 kHz) directed into the tissue of interest. Freehand movement of an ultrasonography probe allows determination of image volume in 3D. Volumetric reconstruction allows neuronavigation accurate to within 1.4 mm. Updated 3D ultrasonography volumes can be created at any time during surgery. Advantages include relative affordability, easy repeatability, non‐invasiveness, lack of radiation, and the option for use in combination with other intraoperative technologies; the main disadvantage is operator variability, because efficacy depends on skill and experience (Unsgaard 2006).

Fluorescence‐guided surgery uses 5‐aminolevulinic acid (5‐ALA) as a natural biochemical precursor of haemoglobin that elicits synthesis and accumulation of fluorescent porphyrins preferentially in mitotically active tissue (Regula 1995). Porphyrin fluorescence can be visualised with the use of a modified microscope and ultraviolet light with the aim of identifying neoplastic tissue (Stummer 1998; Stummer 2000). Limitations include lack of a clear boundary between neoplastic and eloquent tissue, and variability in uptake of 5‐ALA depending on tumour characteristics. Distinct from iMRI and 3D ultrasonography, both of which involve little cost after the initial outlay, is the cost per patient of each dose of 5‐ALA, in addition to the requirement for a specific compatible operating microscope. Adverse events include hypotension, nausea, photosensitivity, and photodermatosis.

How the intervention might work

The extent of surgical resection is considered to be one of several important prognostic factors in neuro‐oncology. For some tumours this is clearly established, while for others the relationship is less clear (Hart 2011). Although high‐quality evidence is lacking, estimated benefits of gross total resection include a possible extension of survival from around 11 months to 14 months in glioblastoma, and from around 60 months to 90 months in low‐grade glioma, albeit based on highly selected patients in non‐randomised trials (Watts 2016). Limitations to the extent of surgical resection are related to difficulty in identifying residual tumour intraoperatively and proximity of the tumour to eloquent brain. Intraoperative imaging technologies have been developed to aid detection of residual tumour with the aim of extending resection. This information can be used by the surgeon to optimise resection, thereby potentially improving prognosis. Overall, prognosis for people with high‐grade glioma and low‐grade glioma depends on several factors, of which extent of resection is one, and the importance of this prognostic factor may vary in different subgroups of these patients.

Why it is important to do this review

Maximising the extent of resection comes with the risk of encroaching upon eloquent brain. Potential benefits of more extensive tumour resection must be balanced against risks of causing new neurological deficits and reduced quality of life. This demands an objective assessment of risks and benefits for each technology.

Experience with each different technology is often limited within individual units. Often, technologies are seen as an evolution of established techniques and are not subjected to the rigorous scrutiny required for other new therapies, therefore evidence is potentially limited to small single‐institution case series. Direct comparisons between different intraoperative imaging technologies are necessary to limit overexpenditure on redundant technology and potential risk to patients.

This review aims to serve as a single comprehensive resource describing level of evidence and effectiveness for each technology.

Objectives

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs).

Types of participants

This review included people of all ages with presumed new or recurrent glial tumours (of any location or histology) from clinical examination and imaging (computed tomography (CT) or MRI, or both). Additional imaging modalities (e.g. positron emission tomography, magnetic resonance spectroscopy) were not mandatory.

Types of interventions

Any of the following interventions could be compared with each other as well as within each intervention class (e.g. different forms of fluorescence‐guided surgery).

Intraoperative MRI (iMRI): defined as using a portable or fixed scanner (and moving scanner or patient, respectively) to acquire image data while the patient remains under anaesthesia. Can be integrated with neuronavigation (see below).
Neuronavigation or image guidance: defined as a system that integrates preoperative or intraoperative image data and creates a translation map between 'world space' and 'image space' to allow co‐registration of imaging and patient anatomy.
Intraoperative ultrasonography (2D or 3D): defined as a system that uses freehand movement of an ultrasonography probe over the region of interest and subsequently generates a volumetric reconstruction allowing intraoperative neuronavigation.
Fluorescence‐guided surgery: defined as administration of a contrast agent and intraoperative visualisation with the use of ultraviolet light (usually a specific mode of an operating microscope).
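The 'translation map' between world space and image space mentioned in the neuronavigation definition above can be illustrated with a rigid transform. This is only a sketch under stated assumptions: the 4x4 homogeneous matrix, the rotation about a single axis, and the function names `make_rigid_transform` and `world_to_image` are all hypothetical, and clinical systems typically estimate such a transform from fiducials or surface matching rather than from hand‐chosen values.

```python
import math


def make_rigid_transform(angle_deg, translation):
    """Build a 4x4 homogeneous matrix: rotation about the z axis, then translation.

    Illustrative only; the angle and translation are hand-chosen here.
    """
    angle = math.radians(angle_deg)
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def world_to_image(matrix, point):
    """Map a point from patient ('world') coordinates into image coordinates."""
    vec = (point[0], point[1], point[2], 1.0)
    # Only the first three rows are needed for a rigid (non-projective) map.
    return tuple(sum(row[i] * vec[i] for i in range(4)) for row in matrix[:3])


# Hypothetical registration: 90 degree rotation plus a shift of (10, 0, 5) mm.
registration = make_rigid_transform(90.0, (10.0, 0.0, 5.0))
print(world_to_image(registration, (1.0, 0.0, 0.0)))
```

A fixed matrix like this cannot follow intraoperative brain shift, because it assumes the anatomy stays rigid relative to the preoperative images.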

Types of outcome measures

Primary outcomes

Extent of resection: as shown on follow‐up imaging. Historically this has been broadly divided into complete resection, partial resection, and biopsy. Updated response criteria are available to enable dichotomising this into measurable and non‐measurable disease for contrast‐enhancing lesions (Wen 2010). Volumetric assessment is a better method of assessment in terms of accuracy and objectivity but requires additional imaging processing time and is not used routinely in many NHS (National Health Service) centres. Intraoperative evaluation of extent of resection by the operating surgeon is a biased and unverifiable method and therefore is not acceptable (Hensen 2008). We planned to use percentage resection, residual, mean volumes, and percentage of total/non‐total resection.
Adverse events: type (as defined by MedDRA (Medical Dictionary for Regulatory Activities) criteria) and timing (MedDRA 2008). Examples include haematoma, wound complications, infection (and site), cerebrospinal fluid leak, oedema, seizures, and general medical complications. Additional procedures required for complications should be noted. Both the total number of complications and the number of complications per participant should be stated.

Secondary outcomes

Overall survival: length of time (in days, weeks, or months) from randomisation to death (from any cause).
Progression‐free survival (PFS): use of open and thorough criteria to define recurrence according to clinical symptoms, imaging, and increase in steroid therapy (Wen 2010).
Quality of life (QoL): use of a reliable and objective grading measure such as the EORTC QLQ‐C30/BN‐20 (European Organisation for Research and Treatment of Cancer QoL assessment specific to brain neoplasms) and FACT‐BrS (Functional Assessment of Cancer Therapy ‐ brain subscale) (Mauer 2008).

We planned to present a 'Summary of findings' table reporting the following outcomes, which are listed in order of priority (see Data synthesis).

Extent of resection
Adverse events
Overall survival
PFS
QoL

Search methods for identification of studies

Non‐English language journals were eligible for inclusion.

Electronic searches

We searched the following electronic databases:

Cochrane Central Register of Controlled Trials (CENTRAL; Issue 7, 2017) in the Cochrane Library.
MEDLINE (Ovid) (1946 to June week 4 2017).
Embase (Ovid) (1980 to 2017 week 27).

Search strategies for identification of RCTs in CENTRAL, MEDLINE, and Embase are depicted in Appendix 1; Appendix 2; Appendix 3, and for the brief economic commentary for MEDLINE in Appendix 4 and Embase in Appendix 5.

Searching other resources

We searched the references of all identified studies for additional trials.

Handsearching

We handsearched the Journal of Neuro‐Oncology, Neuro‐oncology, Journal of Neurosurgery, and Neurosurgery from 1991 to 2017, to identify trials that may not have been included in the electronic databases, including a search of all conference abstracts published in these journals.

Personal communication

We contacted neuro‐oncology experts to obtain information on current or pending RCTs as well as authors to clarify whether their study met the inclusion criteria or to request additional information where aspects of the publication were unclear.

We contacted the following neuro‐oncology experts for information on any current or pending RCTs: Professor Mitchel S Berger; Dr E Antonio Chiocca; Dr Michael Vogelbaum; Professor Hugues Duffau; Professor Johan Pallud; Professor Walter Stummer; Professor Manfred Westphal; Professor Jorg Tonn; Professor Roland Goldbrunner; Professor Mark Bernstein; Professor Gelareh Zadeh; Professor Francesco di Mecco; Professor Franco Servadei; Professor Lorenzo Bello; Professor Domenico Davella; Professor Alessandro Olivi.

Data collection and analysis

Selection of studies

We identified studies in three stages. During title/abstract screening (for both intervention and economic analyses), we used a machine learning classifier designed to distinguish RCTs from non‐RCTs and applied this tool to de‐duplicated electronic search results (Wallace 2017). This classifier assigned a probability score to each retrieved citation (title‐abstract record) that reported an RCT. Citations with an assigned probability score greater than or equal to 0.1 were retained; we automatically discarded citations with a probability score less than 0.1.
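The thresholding step described above can be sketched as follows. This is a minimal illustration, not the Cochrane RCT classifier itself: the `Citation` type, the `screen_citations` function, and the example records and scores are hypothetical; only the 0.1 cut-off comes from the text.

```python
from dataclasses import dataclass

# Cut-off stated in the review: scores >= 0.1 are retained for manual screening.
RCT_THRESHOLD = 0.1


@dataclass
class Citation:
    """Hypothetical title-abstract record with a classifier score attached."""
    record_id: str
    rct_probability: float


def screen_citations(citations, threshold=RCT_THRESHOLD):
    """Split citations into those retained and those automatically discarded."""
    retained = [c for c in citations if c.rct_probability >= threshold]
    discarded = [c for c in citations if c.rct_probability < threshold]
    return retained, discarded


# Invented example scores, for illustration only.
example = [
    Citation("rec-001", 0.92),
    Citation("rec-002", 0.10),  # exactly at the threshold, so retained
    Citation("rec-003", 0.04),
]
retained, discarded = screen_citations(example)
```

A low cut-off of this kind favours sensitivity: a record is discarded only when the classifier considers it very unlikely to report an RCT.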

Two review authors (MGH and DGB) independently examined and screened remaining abstracts to see if they met the inclusion or exclusion criteria. Next, we obtained the full texts of selected studies, which we further examined and compared against the inclusion and exclusion criteria. At all times, we resolved disagreements through discussion. If sufficient data were not available for assessment or there was uncertainty about inclusion criteria, we contacted the relevant trial authors for clarification.

Data extraction and management

For included studies, two review authors (MGH and DGB) independently abstracted data using a prespecified form designed to gather information required for characteristics of included studies and validity tables (Juni 2001). We resolved differences by discussion. Specific data extracted included the following.

Participant characteristics: age (mean and range), gender, performance status based on Karnofsky performance score (KPS) (Table 1) or WHO score (Table 2) (Karnofsky 1948; Oken 1982), tumour location, contrast enhancement, and tumour histology.
Trial characteristics: inclusion and exclusion criteria, randomisation methods and stratification, allocation concealment (if applicable), blinding (of whom and when), and statistics. Definitions identified included extent of resection, progression, and adverse events.
Intervention: iMRI: field strength, timing, type of scanner (separate suite or 'double‐donut'), sequences performed, contrast administration, and reporting methods. Neuronavigation: imaging sequences and timing, brand of equipment. 5‐ALA: dose and timing, timing of ultraviolet light used intraoperatively, microscope used. Ultrasound: brand, timing, operator experience. Additionally, surgical decision making influenced by intraoperative imaging should be stated.
Outcome assessment: extent of resection (and measurement methods), overall survival, PFS, QoL, and adverse events. We recorded additional quality control information on follow‐up, as well as presence of an intention‐to‐treat cohort, deviations from protocol, and post‐recurrence management.

Table 1. Karnofsky performance score

Score	Definition
100	Normal, no complaints, no evidence of disease
90	Able to carry on normal activity: minor symptoms of disease
80	Normal activity with effort: some symptoms of disease
70	Cares for self: unable to carry on normal activity or active work
60	Requires occasional assistance but is able to care for needs
50	Requires considerable assistance and frequent medical care
40	Disabled: requires special care and assistance
30	Severely disabled: hospitalisation is indicated, death is not imminent
20	Very sick, hospitalisation is necessary: active treatment is necessary
10	Moribund, fatal processes are progressing rapidly
0	Dead

Table 2. WHO performance score

Grade	Definition
0	Fully active, able to carry on all pre‐disease performance without restriction
1	Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature, e.g. light house work, office work
2	Ambulatory and capable of all self care, but unable to carry out any work activities. Up and about more than 50% of waking hours
3	Capable of only limited self care, confined to bed or chair more than 50% of waking hours
4	Completely disabled. Cannot carry out any self care. Totally confined to bed or chair
5	Dead

Assessment of risk of bias in included studies

We critically appraised trials deemed relevant according to the criteria reported in NHS CRD Report No. 4 (CRD 2008). We allocated trials according to risk of bias as described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We covered specific core 'Risk of bias' items including selection, performance, detection, attrition (deeming this to be adequate if at least 80% of participants were assessed for all outcomes specified in the review), reporting, and other biases. Operator blinding was not possible, but participant and outcome assessment blinding was desirable, although not mandatory. Two review authors (MGH and DGB) provided independent critical appraisal. We resolved disputes through discussion.

Measures of treatment effect

Time‐to‐event data (survival and PFS): We abstracted the log hazard ratio (HR) and standard error (SE) of the log HR for inputting into Review Manager 5 (Review Manager 2014). We additionally presented the overall numbers of participants experiencing the event of interest during the trial period. If the HR and its variance were not presented (i.e. other survival data were presented, e.g. median survival, ranges or percentages at stated time points), we attempted to abstract the data required to estimate these (Parmar 1998).
Continuous outcomes (QoL and extent of resection): We abstracted the final value and standard deviation (SD) of the outcome of interest for each treatment arm at the end of follow‐up.
Dichotomous outcomes (adverse events, mortality, and extent of resection): We abstracted the number of participants in each treatment arm who experienced the outcome of interest to estimate a risk ratio (RR).
Dichotomous and continuous data: We abstracted the number of participants assessed at each endpoint.
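When a trial reports an HR with its 95% confidence interval rather than the log HR and its SE, the two quantities needed for Review Manager can be recovered on the log scale. The sketch below shows one standard conversion, assuming the interval is symmetric on the log scale; the function name is hypothetical, and the example values are the overall survival HR reported for 5‐ALA in this review (HR 0.82, 95% CI 0.62 to 1.07).

```python
import math


def log_hr_and_se(hr, ci_lower, ci_upper, z=1.959964):
    """Return (log HR, SE of log HR) from an HR and its 95% CI.

    Assumes the CI was built as exp(log HR +/- z * SE), so that the
    width of the interval on the log scale equals 2 * z * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se


# Overall survival for 5-ALA as reported in this review.
log_hr, se = log_hr_and_se(0.82, 0.62, 1.07)
print(f"log HR = {log_hr:.4f}, SE = {se:.4f}")
```

Because published intervals are rounded, an SE recovered this way is approximate.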

When possible, all data abstracted were those relevant to an intention‐to‐treat analysis. In the case of missing data required for review outcomes, we contacted study authors to request the pertinent information. Two review authors (MGH and DGB) extracted data and entered it into Review Manager 5.

Unit of analysis issues

We did not anticipate any unit of analysis issues, and there were none to note.

Dealing with missing data

In the case of missing data required for review outcomes, we contacted study authors. We did not impute missing outcome data for any of the outcomes.

Assessment of heterogeneity

We planned to assess heterogeneity between studies by visually inspecting forest plots, estimating the percentage of heterogeneity (I² statistic) between trials that could not be ascribed to sampling variation (Higgins 2011), and performing a formal statistical test of the significance of identified heterogeneity (Deeks 2001). However, this was not applicable as we did not conduct meta‐analyses.
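For reference, the I² statistic mentioned above is usually derived from Cochran's Q. The notation here is an assumption introduced for illustration: k is the number of trials, and each trial i contributes an effect estimate with a standard error.

```latex
Q = \sum_{i=1}^{k} w_i \left(\hat{\theta}_i - \hat{\theta}\right)^2, \qquad
w_i = \frac{1}{\mathrm{SE}_i^{2}}, \qquad
\hat{\theta} = \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}, \qquad
I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%
```

Under the hypothesis of no heterogeneity, Q approximately follows a chi‐squared distribution with k − 1 degrees of freedom, which is the basis of the formal test.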

Assessment of reporting biases

We intended to construct funnel plots of treatment effect versus precision to investigate the likelihood of publication bias. Had these plots suggested that treatment effects may not have been sampled from a symmetrical distribution, as assumed by the random‐effects model, we planned to perform additional meta‐analyses using the fixed‐effect model.

Data synthesis

Two review authors (DGB and MGH) independently entered data into Review Manager 5. We planned to pool data if trial characteristics (methodology, participants, interventions, controls, and outcomes) were similar. We planned to use the following methods in the Data synthesis, Subgroup analysis and investigation of heterogeneity, and Sensitivity analysis to perform meta‐analyses.

Time‐to‐event data: We intended to pool HRs and their variances using the generic inverse variance function of Review Manager 5.
Continuous outcomes: We intended to pool mean differences (MDs) between treatment arms at the end of follow‐up if all trials measured the outcome on the same scale, or otherwise use the standardised mean difference (SMD).
Dichotomous outcomes: We intended to calculate the RR for each study and then pool values for all studies.

We planned to use random‐effects models for all meta‐analyses (DerSimonian 1986), but to perform additional fixed‐effect analyses if an asymmetrical distribution was found (see Assessment of reporting biases).
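The planned pooling was not carried out in this review, but the method can be sketched. The example below is a minimal illustration of generic inverse variance pooling with a DerSimonian and Laird estimate of between‐trial variance; it is not Review Manager's implementation, the function name is hypothetical, and the two input trials are invented numbers rather than data from the included studies.

```python
import math


def pool_inverse_variance(estimates, std_errors, random_effects=True):
    """Pool effect estimates (e.g. log HRs) by inverse variance weighting.

    Returns (pooled estimate, SE of pooled estimate, tau squared).
    With random_effects=True, tau squared is the DerSimonian-Laird
    method-of-moments estimate, truncated at zero.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    tau2 = 0.0
    if random_effects and len(estimates) > 1:
        # Cochran's Q around the fixed-effect estimate.
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
        df = len(estimates) - 1
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c)
        # Random-effects weights add tau squared to each trial's variance.
        weights = [1.0 / (se ** 2 + tau2) for se in std_errors]

    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    pooled_se = math.sqrt(1.0 / sum_w)
    return pooled, pooled_se, tau2


# Two hypothetical trials reporting log HRs with equal precision.
pooled, pooled_se, tau2 = pool_inverse_variance([-0.2, -0.4], [0.1, 0.1])
```

When tau squared is estimated as zero, the random‐effects result coincides with the fixed‐effect result.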

We presented the overall quality of evidence for each outcome (see Types of outcome measures) according to the GRADE approach, which takes into account issues related not only to internal validity (risk of bias, inconsistency, imprecision, publication bias) but also to external validity (e.g. directness of results) (Langendam 2013). We created a 'Summary of findings' table based on the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) and using GradePro GDT. We used the GRADE checklist and GRADE Working Group for quality of evidence definitions (Meader 2014). We downgraded evidence by one level for serious concerns (or two levels for very serious concerns) for each limitation, as follows.

High quality: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: We are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low quality: Our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low quality: We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

Subgroup analysis and investigation of heterogeneity

Owing to differences in prognosis, we planned to perform subgroup analyses according to tumour type, including:

high‐grade glioma;
low‐grade glioma; or
primary versus recurrent disease in high‐grade glioma and primary disease versus disease progression in low‐grade glioma.

Sensitivity analysis

We planned to perform a sensitivity analysis to investigate how trial quality affected the robustness of findings. We planned to perform a subsequent sensitivity analysis of trials that included objective, blinded early postoperative MRI and histology in their assessment of extent of resection.

Results

Description of studies

See Characteristics of included studies and Characteristics of excluded studies.

Results of the search

The literature search revealed a total of 4109 records: CENTRAL, 927 references; MEDLINE, 519 references; Embase, 887 references. After de‐duplication and use of the Cochrane RCT classifier, 790 records remained.

We utilised the Cochrane author support tool Covidence for title and abstract screening of the 790 records. Two review authors (MGH and DGB) then independently examined the remaining 20 references. We excluded those studies that clearly did not meet the inclusion criteria, and obtained full‐text copies of 8 potentially relevant references, choosing 6 trials (reported in 12 nested publications) for inclusion. Subsequently 1 trial was excluded following correspondence with the lead author (Wu 2007), and 1 trial was classified as on‐going (Wu 2014), resulting in 4 trials for inclusion. Any disagreements were resolved through discussion with a third review author (AB) (Figure 1).

Figure 1

Study flow diagram.

We originally planned to include one study that was labelled as an RCT but without a description of the randomisation methods (scored as at unclear risk of selection bias) (Wu 2007). However, subsequent email correspondence with the lead author revealed that "randomisation methods were not strict, and that investigators were aware of allocation prior to enrolment". We therefore excluded this study.

Included studies

The four included studies are described in detail in Characteristics of included studies.

In summary, we identified two trials of intraoperative MRI (Kubben 2014; Senft 2011), one trial of fluorescence‐guided surgery (Stummer 2006), and one trial of neuronavigation (Willems 2006). We did not find any eligible studies of ultrasound‐guided surgery. All studies included people with presumed high‐grade glioma on preoperative imaging. None of the included trials included people with low‐grade glioma, although one ongoing trial includes this patient subgroup (Wu 2014).

Intraoperative MRI

Kubben 2014 recruited 14 participants from multiple centres in Belgium and the Netherlands between 2010 and 2012. Participants had to have a supratentorial brain tumour suspected to be a glioblastoma and an indication for gross total resection. The trial compared surgery with iMRI versus surgery without iMRI (of which either arm could include neuronavigation). Outcomes were residual tumour volume, complications, quality of life (EORTC QLQ‐C30), and overall survival. The reported results were originally intended as an interim analysis, but the trial was stopped early after it. This interim analysis was not specified a priori, so the sample size calculation would not have accounted for it even if the trial had been fully completed. The size of the trial and circumstances around its early completion are reflected in the 'Risk of bias' assessment and GRADE profile (see below).

Senft 2011 recruited 58 participants from a single German neurosurgical centre between 2007 and 2010. Participants had to have a known or suspected glioma that was contrast enhancing and amenable to complete resection. The trial compared surgery with iMRI versus surgery without iMRI (of which either arm could include neuronavigation). The primary outcome was extent of resection. Secondary outcomes were volume of residual tumour on postoperative MRI, PFS at six months, duration of surgery, and treatment‐related morbidity.

Fluorescence‐guided surgery

Stummer 2006 recruited 322 participants from multiple centres in Germany between 1999 and 2004. Participants had to have a malignant glioma on imaging. The trial compared surgery with 5‐ALA versus surgery without 5‐ALA (of which either arm could include neuronavigation). Primary outcomes were complete tumour resection on MRI (< 72 hours' post‐operation and > 1.5 T) and PFS. Secondary outcomes were residual tumour volume, overall survival, type and severity of neurological deficits after surgery, and toxic effects.

Neuronavigation

Willems 2006 recruited 45 participants from a single Dutch centre between 1999 and 2002. Participants had to have a single space‐occupying lesion. The trial compared surgery with neuronavigation versus surgery without neuronavigation. Primary outcomes were extent of resection and survival. Secondary outcomes were procedure duration, usefulness of neuronavigation, extent of resection, QoL, and postoperative course (including neurological status and adverse events).

Excluded studies

We excluded 12 studies, as follows (see Characteristics of excluded studies).

Six were not RCTs (Czyz 2011; Koc 2008; Stepp 2007; Wu 2003; Wu 2004; Zhang 2015).
Three were only presented as abstracts and we were unable to obtain sufficient information even after attempting correspondence with the original trial authors (Chen 2011; Chen 2012; Seddighi 2016).
Three did not directly compare an intraoperative imaging intervention with either another intraoperative imaging intervention or standard surgery (Eljamel 2008; Rohde 2011; Stummer 2017).

Risk of bias in included studies

Summary data for risk of bias are presented in table format (Figure 2; Figure 3). A detailed description is provided below and in the Characteristics of included studies.

Figure 2

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 3

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Allocation

Randomisation methods

Randomisation methods were described and were satisfactory in all four included trials, for a judgement of low risk of bias (Kubben 2014; Senft 2011; Stummer 2006; Willems 2006).

Allocation concealment

We judged one trial to be at high risk of bias because allocation concealment was potentially inadequate (i.e. sealed envelopes) (Senft 2011), one trial to be at low risk of bias (Stummer 2006), and the remaining two trials to be at unclear risk of bias (Kubben 2014; Willems 2006).

Blinding

Blinded assessment for extent of resection was performed in three trials (Kubben 2014; Senft 2011; Stummer 2006), and for histological assessment in one trial (Stummer 2006). Regarding overall survival, blinding would not affect outcome reporting but could affect subsequent treatment. For QoL, PFS, and adverse events, blinding would likely affect the outcomes reported. None of the trials blinded participants or clinicians.

Incomplete outcome data

In one trial all participants were accounted for (Kubben 2014). In two trials all participants were accounted for, but an intention‐to‐treat analysis was not performed, as those participants that had alternative pathological diagnoses were excluded (Senft 2011; Stummer 2006). In the remaining trial there was evidence of attrition bias for extent of resection (analysis of 32 out of 42 participants) (Willems 2006).

Selective reporting

One trial reported all outcomes and was therefore at low risk of reporting bias (Senft 2011). Selective outcome reporting was apparent in three trials: one trial did not report quality of life outcomes (Kubben 2014); one trial did not report full outcome data in the form of figures and appropriate statistics for survival, PFS, and adverse events for 5‐ALA (Stummer 2006); and one trial did not present full data for survival, QoL, or adverse events (Willems 2006). Adverse event data in all studies were particularly poorly reported in terms of total number of events, number of participants with multiple events, and timing of events.

Other potential sources of bias

One of the issues with iMRI is attribution bias. A surgeon who knows that residual disease can be checked during the operation may not operate as aggressively as they would without that option. When the scan is then done, residual disease is more likely to be detected and removed, and the success of the removal is attributed to the iMRI. This is likely to affect outcomes that report a difference between the first intraoperative and final postoperative MRI scans.

Early cessation of trial

All four trials were stopped early. Kubben 2014 was stopped early based on the results of an interim analysis not specified a priori. Given the low number of participants involved, we excluded this trial from quantitative analysis. Senft 2011 was stopped early based on the results of an interim analysis not specified a priori. Significance values were consequently adjusted (a P value of less than 0.04 was subsequently regarded as significant). Stummer 2006 was stopped early based on the results of a scheduled interim analysis with compensated power calculation. Willems 2006 was stopped early but no reason was given.

Industry sponsorship

Industry sponsorship was apparent in three trials. Kubben 2014 was financially supported by Medtronic Navigation, but the sponsors "were not involved in writing the protocol, had no access to the data, was not involved in writing the manuscript, and had no veto right for submission." Senft 2011 included authors that had received an honorarium from Medtronic (who manufactured the iMRI machine used in the study), although it was emphasised that the study received no funding from Medtronic. Stummer 2006 was sponsored by medac GmbH (who manufacture Gliolan), which was involved in the study design, quality assurance, and quality control but had no role in the interpretation of data, and the corresponding author had final responsibility for the article (although the author was a paid consultant to both medac GmbH and Zeiss, which manufactures the microscopes used for 5‐ALA). One trial did not state if there were conflicts of interest (Willems 2006).

Effects of interventions

See: Summary of findings 1 iMRI image‐guided surgery compared to standard surgery for high‐grade glioma; Summary of findings 2 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma; Summary of findings 3 Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Extent of resection

Meta‐analysis was not appropriate due to differences in the tumours included (eloquent versus non‐eloquent locations) and variations in the image guidance tools used in the control arms (usually selected utilisation of neuronavigation). Due to the small number of studies (four), we did not construct a funnel plot. The risk ratio (RR) for the extent of resection in participants with high‐grade glioma favoured the experimental arms in two of the four trials reporting this outcome, indicating a lower risk of having an incomplete resection with the intervention.

iMRI was assessed in two trials (Kubben 2014; Senft 2011).
5‐ALA was assessed in the trial of Stummer 2006.
Neuronavigation was assessed in the trial of Willems 2006.

Complete resection

iMRI: In the trial of Senft 2011, complete tumour resection was achieved in 23/24 (96%) of participants in the intervention group compared with 17/25 (68%) of participants in the control group (RR for incomplete resection 0.13, 95% confidence interval (CI) 0.02 to 0.96; very low‐quality evidence). In the Kubben 2014 trial, tumour resection was reported using residual tumour volume and data for complete tumour resections were not available.
5‐ALA: Complete resection was performed in 90/139 (65%) of participants in the intervention group versus 47/131 (36%) of participants in the control group (RR for incomplete resection 0.55, 95% CI 0.42 to 0.71; low‐quality evidence).
Neuronavigation: Complete resection was achieved in three participants in the control group and five participants in the neuronavigation group. However, there was significant attrition, with not all participants having complete imaging, and the denominators for these figures were not stated, precluding meta‐analysis (very low‐quality evidence).
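The risk ratios for incomplete resection quoted above can be reproduced from the reported counts. The sketch below uses the usual large‐sample formula for the standard error of the log RR; the function name is hypothetical, and the counts of incomplete resections are derived from the complete resection counts given in the text (for example, 24 − 23 = 1 in the iMRI arm of Senft 2011).

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.959964):
    """Return (RR, lower 95% limit, upper 95% limit) for arm A versus arm B.

    Uses SE(log RR) = sqrt(1/a - 1/n_a + 1/b - 1/n_b), which requires
    at least one event in each arm.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Incomplete resections = participants assessed minus complete resections.
senft = risk_ratio(24 - 23, 24, 25 - 17, 25)          # iMRI versus control
stummer = risk_ratio(139 - 90, 139, 131 - 47, 131)    # 5-ALA versus control

for name, (rr, lower, upper) in [("Senft 2011", senft), ("Stummer 2006", stummer)]:
    print(f"{name}: RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The wide interval for Senft 2011 reflects the single incomplete resection in the iMRI arm, which dominates the standard error.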

Adverse events

Adverse events (AEs) were reported in an inconsistent manner between trials and not according to the prespecified manner required in our protocol. Specifically, data were not available for participants at risk, participants with multiple events, timing of events, and outcomes of events. We therefore adopted a descriptive method using the data available to describe the AEs in each trial.

iMRI: In the trial of Senft 2011, new or aggravated neurological deficits were present in 2/25 (8%) of participants in the control group and 3/24 (13%) participants in the iMRI group; intraoperative imaging did not lead to continuation of tumour resection in any of the participants with AEs. Two participants had symptomatic haematomas, which were not attributable to the use of iMRI. In one participant, hemianopia was deliberately accepted due to tumour extension around the temporal horn of the lateral ventricle involving the optic radiation. In the Kubben 2014 trial, a single participant in the intervention arm experienced a postoperative haemorrhage.
5‐ALA: Adverse events were present in 58.7% of the intervention arm versus 57.8% of the control arm. Neurological AEs were present in 42.8% of the intervention arm (7.0% grade 3 to 4) and 44.5% of the control arm (5.2% grade 3 to 4). Significant neurological AEs were present in 12.4% of the intervention arm versus 11.6% of the control arm. The number of participants with a deterioration in the National Institutes of Health Stroke Scale compared to baseline tended to be higher in the intervention arm at 48 hours (26.2% with 5‐ALA versus 14.5% in the control arm) but not at 7 days (20.5% versus 10.7%), 6 weeks (17.1% versus 11.3%), and 3 months (19.6% versus 18.6%). No denominators were given for each result, preventing calculation of the RR and CI.
Neuronavigation: New or worsened neurological deficits were present at three months in 45.5% of participants in the control group and 18.2% of participants in the neuronavigation group. During the first three months after surgery, seven participants (31.8%) in the control group and seven (30.4%) in the neuronavigation group experienced a new, non‐neurological adverse event. In three participants in the neuronavigation group, these events were fatal (pulmonary embolism, cardiac arrest with pulseless electrical activity, and postoperative pulmonary insufficiency). Other adverse events included pulmonary or urinary tract infection, surgical removal of an epidural haematoma, surgical cyst drainage, repeated tumour debulking, cerebrospinal fluid leakage, postoperative delirium, and insufficiently treated steroid‐induced diabetes.

Survival

iMRI: Senft 2011 did not assess this outcome, while Kubben 2014 did not report overall survival in the prespecified manner for inclusion.
5‐ALA: There was no difference in overall survival between the intervention and control arms (hazard ratio (HR) 0.82, 95% CI 0.62 to 1.07). Median survival was also reported and this was 15.2 months (95% CI 12.9 to 17.5) in the intervention arm versus 13.5 months (95% CI 12.0 to 14.7) in the control arm.
Neuronavigation: The median survival time was 9 months in the control arm and 5.6 months in the intervention arm and the HR was reported to be 1.6. However, no confidence intervals were available or able to be calculated.

Time to progression or progression‐free survival

iMRI: In Senft 2011, the median PFS in the intervention arm was 226 days (95% CI 0.0 to 454) versus 154 days (95% CI 60 to 248) in the control arm. HRs or their respective CIs were not available and could not be calculated. Kubben 2014 did not assess these outcomes.
5‐ALA: Median PFS was 5.1 months (95% CI 3.4 to 6.0) in the intervention arm versus 3.6 months (95% CI 3.2 to 4.4 months) in the control arm. HRs and their respective CIs were not available and could not be calculated.
Neuronavigation: These outcomes were not assessed.

Quality of life

iMRI: Senft 2011 and Kubben 2014 did not report data for this outcome.
5‐ALA: This outcome was not assessed.
Neuronavigation: Quality of life questionnaires at 3 months postoperatively were completed by 19 participants (eight in the neuronavigation arm and 11 in the standard surgery arm), constituting 64.5% of all eligible participants. The questionnaire included one part with 30 general questions and another part with 20 brain‐specific questions (BN‐20). Out of 26 outcome measures that were presented, the direction of change differed in seven (all in the BN‐20 group): four were in favour of the neuronavigation group and three were in favour of standard surgery. No statistical analysis was presented.

We considered this evidence to be of low to very low quality for all reported outcomes (summary of findings Table 1; summary of findings Table 2; summary of findings Table 3).

Discussion

Summary of main results

We identified two RCTs for iMRI (Kubben 2014; Senft 2011), one for fluorescence‐guided surgery with 5‐ALA (Stummer 2006), and one assessing neuronavigation using standard preoperative MRI sequences (Willems 2006). Formal meta‐analysis was not possible due to the different comparisons and variability in the control arm population between trials. We were therefore limited to performing a narrative analysis of the included trials.

Two trials demonstrated a benefit for intraoperative imaging technology (iMRI and 5‐ALA, respectively) in terms of extent of resection (the primary outcome) (Senft 2011; Stummer 2006). Overall survival data were available for 5‐ALA only; there was no clear evidence that 5‐ALA improved overall survival. Data for PFS were only available for two trials, and were not available in the format specified (hazard ratios and their variance). Nevertheless, there was a suggestion that 5‐ALA increased PFS compared with standard surgery. Quality of life data were only reported in a single trial, and there was significant attrition and reporting bias. Adverse event reporting varied considerably between trials but in general was poorly performed. With 5‐ALA, it appears that neurological deterioration is more common after fluorescence‐guided surgery. The studies that reported on this effect noted that it occurred mainly among those with fixed deficits and early after surgery, but there was subsequently a trend towards recovery (Stummer 2006). Other adverse events appeared to be rare and similar in frequency between study arms.

To supplement the main systematic review of effects, we sought to identify cost analyses and economic evaluations that compared the interventions with each other or between different variants of the same intervention. A search of MEDLINE and Embase identified six such studies (Eljamel 2016; Esteves 2015; Hall 2003; Kowalik 2000; Makary 2011; Schulder 2003; Slof 2015) (one study was reported in two papers ‐ Hall 2003; Kowalik 2000).

Of the six studies, three studies compared iMRI to conventional surgery (Kowalik 2000; Makary 2011; Schulder 2003); two compared 5‐ALA with white light surgery (Esteves 2015; Slof 2015); and one compared conventional, 5‐ALA, fluorescein, ultrasound, and iMRI surgery (Eljamel 2016). Three studies were conducted in the USA (Kowalik 2000; Makary 2011; Schulder 2003), one in Portugal (Esteves 2015), and one in Spain (Slof 2015), and for one it was unclear (but was probably the USA) (Eljamel 2016). Four studies were based on non‐randomised retrospective comparative cohorts (Kowalik 2000; Makary 2011; Schulder 2003; Slof 2015); one was based on a review and pairwise meta‐analyses (Eljamel 2016), and one used data from a trial and retrospective cohorts (Esteves 2015). The cohort studies all involved fewer than 100 participants, except for the study by Slof 2015, which included 254 participants who received 5‐ALA and 120 who received white light surgery. All the studies except one (Esteves 2015), which integrated data using a Markov model, were based on comparisons of individual patient level data.

The costs included and the time horizon over which they were considered varied markedly. Only one study considered costs over the patient lifetime (Esteves 2015), and one only considered the drug cost (Slof 2015). The other studies considered costs incurred in hospital for the index surgery. Costs were reported in US dollars in four studies (Esteves 2015; Kowalik 2000; Makary 2011; Schulder 2003), but the price year (2005/6) was stated only in one study (Makary 2011). The other two studies reported costs in Euros, and the price year was 2012 in one study (Esteves 2015) and not stated in the other (Slof 2015). Two studies were cost analyses only (Kowalik 2000; Schulder 2003). Effects were resection rates (Eljamel 2016), resection‐free years (Makary 2011), quality‐adjusted life years (Eljamel 2016; Esteves 2015; Slof 2015), PFS (Esteves 2015), and life years (Esteves 2015).

For the comparison of iMRI with conventional surgery, two studies reported a potential cost saving driven by reductions in length of stay (Kowalik 2000; Schulder 2003), and a third study reported lower mean costs that were not statistically significant (Makary 2011). The one cost‐effectiveness analysis reported a longer interval to resection (20.1 versus 6.7 months; P = 0.02); further results suggested iMRI was more cost‐effective in terms of cost per resection‐free years. Another study reported that iMRI was the most costly of conventional, 5‐ALA, fluorescein, and ultrasound‐assisted surgery (Eljamel 2016). iMRI was the least cost‐effective, but the results could not be replicated from the data presented in the study. Estimates of cost‐effectiveness (and cost over a longer follow‐up) need to be considered in the light of the very limited evidence for iMRI where there is a benefit shown in terms of extent of resection but no evidence in the review of clinical effectiveness on overall survival.

For the comparison of 5‐ALA and standard surgery, 5‐ALA was on average more costly in both studies, but resulted in more quality‐adjusted life years over the patient lifetime or over the time to progression of disease (Esteves 2015; Slof 2015). In both cases the study authors concluded that the extra costs were worth the extra quality‐adjusted life years and that these conclusions were consistent over all sensitivity analyses conducted. These findings of extra effectiveness in the economic studies need to be considered in the context of the findings of the review of the best available clinical effectiveness data summarised above. National Institute for Health and Care Excellence (NICE) guidance on the cost‐effectiveness of 5‐ALA is expected to be published in June 2018.
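Judgements that extra costs are "worth" the extra quality‐adjusted life years are commonly expressed as an incremental cost‐effectiveness ratio (extra cost divided by extra effect). The following is a minimal sketch of that arithmetic only; the function name and all input values are hypothetical and are not taken from Esteves 2015, Slof 2015, or any other included study.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect.

    Effects can be in any single unit (for example quality-adjusted life years),
    as long as both arms use the same unit.
    """
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        # With no difference in effect the ratio is undefined.
        raise ValueError("no difference in effect; ICER is undefined")
    return delta_cost / delta_effect


# Hypothetical inputs for illustration only (NOT data from the included studies):
# the new strategy costs 2000 more and yields 0.5 more quality-adjusted life years.
example = icer(cost_new=12000.0, cost_old=10000.0, effect_new=1.5, effect_old=1.0)
print(example)  # 4000.0 per quality-adjusted life year gained
```

Whether a given ratio represents good value depends on the willingness‐to‐pay threshold of the decision maker, which this sketch deliberately leaves out.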

Overall completeness and applicability of evidence

All the identified trials included highly selected participants in specialised centres, and the applicability of these findings to a more general population needs to be carefully considered. Participants included in the trials tended to be generally young and of good performance status. In addition, most trials also clearly specified the types of tumours that were to be included, and would not have randomised those patients with eloquent tumours or where a complete resection was not feasible. Potentially those enrolled in one of the iMRI trials (Senft 2011) were likely to have more resectable or less eloquent tumours than those in the 5‐ALA trial (Stummer 2006), given the far higher resection rates in both arms of the iMRI study (96% iMRI and 68% control versus 65% 5‐ALA and 36% control).

The majority of included trials only enrolled participants with probable high‐grade glioma. We identified no RCTs for ultrasound‐guided surgery, which may reflect the less widespread application of this particular technology. There are theoretical advantages to this technology, such as relative affordability, repeatability, and possibly better sensitivity in low‐grade tumours than the other included intraoperative imaging modalities. Nevertheless, it currently does not have the same evidence base as other intraoperative imaging modalities to recommend its use in routine clinical practice.

Quality of the evidence

It is clearly feasible to perform RCTs for new surgical interventions, and it appears now to have become standard practice to perform an RCT for assessing new intraoperative imaging technologies. The openness of major centres to enrolling participants in RCTs to provide clear outcome data is a major step forward in neuro‐oncology. Some aspects of the included trials were at low risk of bias, such as randomisation methods and blinded, objective reporting for extent of resection. However, the overall risk of bias was high, and there were consistent concerns with stopping trials early and the role of industry involvement (summary of findings Table 1; summary of findings Table 2; summary of findings Table 3).

Extent of resection was the primary outcome for all of the included trials. This has the advantage of being the outcome most directly influenced by intraoperative imaging. However, there is still no evidence from RCTs that resection (either total or less than total) improves outcomes for high‐grade glioma over biopsy alone (Hart 2011). Subgroup analyses, particularly for the 5‐ALA trial (Stummer 2006), have shown that participants who have a complete resection of all contrast‐enhancing tumour survive longer than those with residual tumour (Pichlmeier 2008). Studies of chemotherapy have also found that those without residual tumour survive longer (Stupp 2005). While this is not direct evidence in favour of complete resection, but rather a post hoc non‐randomised subgroup analysis, it is becoming increasingly apparent that a complete tumour resection is desirable, particularly when it can be achieved safely. Precisely how much a complete resection contributes towards the overall outcome is unclear. New methods of imaging (e.g. amino acid positron emission tomography) have found that tumours frequently extend out from the contrast‐enhancing margin on MRI (Miwa 2004). However, validation of this approach has yet to be established, and the need for a cyclotron makes widespread application and testing a challenge in the UK; MRI therefore remains the current standard of care for assessing residual tumour.

After extent of resection, studies tended to focus on PFS rather than overall survival. There are certain advantages to this in that possibly fewer participants are required and the results may be available sooner. Additionally, it may provide a more direct assessment of the effect of the primary intervention that is not confounded by subsequent therapy. However, it can be argued that overall survival should still remain the main outcome of interest. Firstly, survival is so short in high‐grade glioma that the practical benefits of assessing PFS are less relevant. Secondly, assessment of PFS can be more subjective, and is critically dependent on the timing and interpretation of imaging, which can often be complicated (Wen 2010).

Quality control for surgical neuro‐oncology trials is an emerging area (GNOSIS 2007). Standardisation of reporting is required to allow clear comparisons between trials in meta‐analyses. Detailed reporting is required for tumour location with regard to eloquent brain; operative technique used; postoperative imaging protocol; assessment of extent of resection; and recording of adverse events (including total numbers of events, total number of participants at risk, number of participants with multiple events; severity, timing, and outcome of events, i.e. resolution or persistence of neurological deficits).

Potential biases in the review process

We took multiple steps in the review process to minimise bias, including double independent literature sift and data extraction, not pooling results due to heterogeneity, and using strict inclusion criteria. Overall, these steps acted to minimise bias and restrict the review to the best available evidence. Notably, this led to the exclusion of one trial that was titled as an RCT and had been included in an earlier version of this review. Specifically, the lead author stated that randomisation was not strict, surgeons were aware of allocations prior to enrolment, and that bias of participant allocation was inevitable. In a previous Cochrane Review (Barone 2014), this study was included to allow open discussion of its methodology, while for this review we felt it more appropriate to concentrate on the highest‐quality evidence in order to generate the most robust findings.

Notably, the majority of trials identified through the search strategy were not RCTs. It could be argued that excluding this volume of data biases our review and that it would be more appropriate to consider a Cochrane Review of non‐randomised studies (NRS). However, the issue of selection bias is critical, particularly in surgical trials. Participants enrolled in an NRS are likely to have a better prognosis than a control population, and it is impossible to accurately account for this bias without using randomisation. It would therefore not be clear what benefit intraoperative imaging had on the overall outcome. Meta‐analysis of RCTs remains the most reliable way of assessing the benefits of specific intraoperative imaging modalities. However, NRS may also have a role, particularly regarding technology development and reporting of adverse events.

This review included two specific groups of technologies, those that used imaging obtained intraoperatively and those that used imaging obtained preoperatively for use in an intraoperative manner. We felt that both methods were suitable for comparison, as the goals are similar: namely, to achieve maximal safe resection via the application of surgical technology. A major concern with preoperative imaging is intraoperative brain shift, whereby anatomical localisation is affected by events that occur during surgery (e.g. anaesthesia, brain retraction, tumour resection, dural opening, and cerebrospinal fluid drainage). Imaging obtained intraoperatively can theoretically account for brain shift and allow more accurate navigation than imaging obtained preoperatively. In this review we found that a single trial did not demonstrate an effect for intraoperative imaging utilising preoperatively acquired data (Willems 2006).

Another technique that is commonly used in neuro‐oncology surgery is awake craniotomy. This is often perceived as a technology to make surgery safer by allowing intraoperative mapping of eloquent brain. It is not typically regarded as a technique to maximise extent of resection and was therefore not included in this review.

We did not subject these studies to critical appraisal, and we do not attempt to draw any firm or general conclusions regarding the relative costs or efficiency of the interventions being compared. For the comparison of iMRI surgery with conventional surgery, it is clear that the available economic evidence is, at best, equivocal. For the comparison of 5‐ALA with white light surgery, the available economic evidence indicates that use of 5‐ALA could be a promising strategy from an economic perspective, but the effectiveness data used in the economic studies are not consistent with the findings of the review of effectiveness.

Agreements and disagreements with other studies or reviews

We are not aware of any other similar reviews that compare all the different types of intraoperative imaging or other interventions to maximise the extent of resection in neuro‐oncology. Currently, there are no national guidelines appraising the use of any of the technologies, for example by NICE, but guidance on 5‐ALA and iMRI is expected in 2018 when the NICE guidelines for the management of primary brain tumours are published. Many of the trials are relatively recent and appraisal is often limited to a linked editorial. In addition, many of the techniques have only been used in specialised trial centres, and real‐world experience is limited. Further prospective data reporting such real‐world experience would help inform future clinical guidelines and NHS (National Health Service) policy by reporting data on patients who are unsuitable for RCTs, for example due to co‐morbidities.

An interim analysis of an ongoing trial of iMRI is broadly in agreement with the findings of this review (Wu 2014). This reported outcomes on 114 out of a projected 304 participants. Complete resection was achieved in 86% of the iMRI arm versus 53% of the control arm. There was no difference in AEs or PFS, and OS was not reported.

Figure 1

Study flow diagram.


Figure 2

Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.


Figure 3

Risk of bias summary: review authors' judgements about each risk of bias item for each included study.


Summary of findings 1. iMRI image‐guided surgery compared to standard surgery for high‐grade glioma

Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: iMRI image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery

Outcomes	Illustrative comparative risk* (95% CI): assumed risk (control)	Illustrative comparative risk* (95% CI): corresponding risk (image‐guided surgery)	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	32¹ per 100	4 per 100 (1 to 31)	RR 0.13 (0.02 to 0.96)	49 participants (1 study)	⊕⊝⊝⊝¹,²,³ very low	Small trial of highly selected participants with potential bias in allocation and performance. One other trial reported this outcome but did not contribute towards the analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝⁴ very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable				⊕⊝⊝⊝⁴ very low	Not reported by trial authors so graded as very low quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝⁴ very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Not estimable				⊕⊝⊝⊝⁴ very low	Quality of life was not reported in the trial.

*The basis for the assumed risk is only based on individual trials as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval; iMRI: intraoperative magnetic resonance imaging; RR: risk ratio

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹Expressed in terms of risk of incomplete resection (bad outcome).
²Small trial so quality of the evidence downgraded by one level.
³Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
⁴Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.
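The table note states that the corresponding risk is derived from the assumed risk and the relative effect. A minimal sketch of that arithmetic follows, assuming the corresponding risk is simply the assumed risk multiplied by the risk ratio (and by each limit of its 95% CI), rounded to whole participants per 100; the function name is illustrative only.

```python
def corresponding_risk(assumed_per_100, rr, rr_low, rr_high):
    """Apply a risk ratio and its 95% CI limits to an assumed control-group risk.

    Returns (point estimate, lower limit, upper limit), each per 100 participants
    and rounded to the nearest whole number.
    """
    point = round(assumed_per_100 * rr)
    low = round(assumed_per_100 * rr_low)
    high = round(assumed_per_100 * rr_high)
    return point, low, high


# iMRI versus standard surgery: assumed risk 32 per 100, RR 0.13 (0.02 to 0.96)
imri = corresponding_risk(32, 0.13, 0.02, 0.96)
print(imri)  # (4, 1, 31)

# 5-ALA versus standard surgery: assumed risk 64 per 100, RR 0.55 (0.42 to 0.71)
ala = corresponding_risk(64, 0.55, 0.42, 0.71)
print(ala)  # (35, 27, 45)
```

Under this assumption the results reproduce the tabulated values: 4 per 100 (1 to 31) for iMRI, and 35 per 100 (27 to 45) for 5‐ALA in Summary of findings 2. Both are risks of incomplete resection, as noted in footnote 1.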


Summary of findings 2. 5‐ALA image‐guided surgery compared to standard surgery for high‐grade glioma

Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: 5‐ALA image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery

Outcomes	Illustrative comparative risk* (95% CI): assumed risk (control)	Illustrative comparative risk* (95% CI): corresponding risk (image‐guided surgery)	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	64¹ per 100	35 per 100 (27 to 45)	RR 0.55 (0.42 to 0.71)	270 participants (1 study)	⊕⊕⊝⊝² low	Highly selected participants with potential bias in allocation and performance.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝³ very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable: the effect was reported as an HR and only a single trial reported this outcome, so we did not arbitrarily choose a point in time as the basis for calculating the assumed and corresponding risks, as this may be misleading.		HR 0.82 (0.62 to 1.07)	270 participants (1 study)	⊕⊕⊝⊝² low	The overall quality of this outcome was low in this trial and was downgraded for highly selected participants with potential bias in allocation and performance.
Progression‐free survival	Not adequately reported in the trials				⊕⊝⊝⊝³ very low	Progression‐free survival or time to progression was not adequately reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝³ very low	Quality of life was not reported in the trial.

*The basis for the assumed risk is only based on individual trials as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
5‐ALA: 5‐aminolevulinic acid; CI: confidence interval; HR: hazard ratio; RR: risk ratio

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹Expressed in terms of risk of incomplete resection (bad outcome).
²Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
³Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.


Summary of findings 3. Neuronavigation image‐guided surgery compared to standard surgery for high‐grade glioma

Patient or population: high‐grade glioma
Settings: specialist centres
Intervention: neuronavigation image‐guided surgery (based on post‐operative MRI)
Comparison: standard surgery

Outcomes	Illustrative comparative risk* (95% CI): assumed risk (control)	Illustrative comparative risk* (95% CI): corresponding risk (image‐guided surgery)	Relative effect (95% CI)	No. of participants (studies)	Quality of the evidence (GRADE)	Comments
Extent of resection: complete resection	Not estimable	Not estimable	Not reported	45 participants (1 study)	⊕⊝⊝⊝¹,²,⁴ very low	Small study of highly selected participants at very high risk of allocation bias. Complete resection was achieved in three participants in the control group and five participants in the neuronavigation group. However, there was significant attrition, with not all participants completing imaging, and the denominators for these figures were not stated, precluding formal analysis.
Adverse events	Inadequately and inconsistently reported in the trial				⊕⊝⊝⊝² very low	Adverse events were reported in an inconsistent manner and not according to the manner prespecified in our protocol. Additionally, we were mainly interested in identifying serious adverse events, which were inadequately reported.
Overall survival	Not estimable				⊕⊝⊝⊝³ very low	Not reported by trial authors so graded as very low quality evidence.
Progression‐free survival	Not estimable				⊕⊝⊝⊝² very low	Progression‐free survival or time to progression was not reported in the trial.
Quality of life	Inadequately reported or not assessed at all in the included trials				⊕⊝⊝⊝³ very low	Quality of life was reported in the trial but only 19 participants (8 in the neuronavigation arm and 11 in the standard surgery arm) completed questionnaires postoperatively at 3 months, constituting only 64.5% of all eligible participants, and no statistical analysis was presented.

*The basis for the assumed risk is only based on individual trials as only single trial reports were available. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: confidence interval

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹Small trial so quality of the evidence downgraded by one level.
²Highly selected participants with potential bias in allocation and performance as well as in other 'Risk of bias' domains, thus downgraded by two levels.
³Outcome was not reported (or inadequately reported for meaningful conclusions to be drawn), therefore giving lowest quality of evidence judgement.
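The downgrading described in the footnotes to the summary of findings tables can be read as simple steps down an ordered scale. The sketch below assumes the usual GRADE convention that evidence from randomised trials starts at high quality, and assumes that a rating cannot fall below very low; the names used are illustrative and do not come from the review.

```python
# GRADE quality levels, ordered from lowest to highest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def grade_after_downgrading(levels_down, start="high"):
    """Return the GRADE rating after moving down `levels_down` steps from `start`.

    Assumption: the rating is floored at "very low" however many steps are applied.
    """
    index = GRADE_LEVELS.index(start) - levels_down
    return GRADE_LEVELS[max(index, 0)]


# Small trial (one level) plus risk of bias (two levels), as for extent of
# resection in the iMRI and neuronavigation tables.
print(grade_after_downgrading(1 + 2))  # very low

# Risk of bias only (two levels), as for extent of resection in the 5-ALA table.
print(grade_after_downgrading(2))  # low
```

Outcomes that were not reported, or were inadequately reported, were assigned the lowest rating directly rather than by stepwise downgrading.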


Table 1. Karnofsky performance score

Score	Definition
100	Normal, no complaints, no evidence of disease
90	Able to carry on normal activity: minor symptoms of disease
80	Normal activity with effort: some symptoms of disease
70	Cares for self: unable to carry on normal activity or active work
60	Requires occasional assistance but is able to care for needs
50	Requires considerable assistance and frequent medical care
40	Disabled: requires special care and assistance
30	Severely disabled: hospitalisation is indicated, death is not imminent
20	Very sick, hospitalisation is necessary: active treatment is necessary
10	Moribund, fatal processes are progressing rapidly
0	Dead


Table 2. WHO performance score

Grade	Definition
0	Fully active, able to carry on all pre‐disease performance without restriction
1	Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature, e.g. light house work, office work
2	Ambulatory and capable of all self care, but unable to carry out any work activities. Up and about more than 50% of waking hours
3	Capable of only limited self care, confined to bed or chair more than 50% of waking hours
4	Completely disabled. Cannot carry out any self care. Totally confined to bed or chair
5	Dead


