Selective versus routine intraoperative cholangiography for cholecystectomy

Diego R Kleinubing; Rachel Riera; Delcio Matos; Marcelo Moura Linhares

doi:10.1002/14651858.CD012971

Version published: 26 February 2018

Abstract

This is a protocol for a Cochrane Review (Intervention). The objectives are as follows:

To assess the benefits and harms of selective versus routine intraoperative cholangiography in people undergoing cholecystectomy.

Background

Description of the condition

The gallbladder stores bile produced by the liver (Di Ciaula 2017). The stored bile in the gallbladder and biliary tract may crystallise and form cholesterol or black pigmented stones in a condition known as cholelithiasis (Stinton 2012). Female sex, family history, genetics, age, ethnic origin, obesity, rapid weight loss, and a sedentary lifestyle are among the risk factors for cholesterol gallstone formation. Black pigmented stones are common in cirrhosis, chronic haemolysis, and Crohn's disease, particularly when the Crohn's disease involves the ileal segment (Stinton 2012).

The prevalence of gallstones in the general population is reported to be 6% to 15% (Barbara 1987; Loria 1994; Duncan 2012). A subset of 5% to 10% of the symptomatic population will also present with common bile duct stones (so‐called choledocholithiasis), which may progress to biliary obstruction and lead to complications such as jaundice, cholangitis, and pancreatitis (Hunter 1992; Robinson 1995; Petelin 2003; Caddy 2006; O'Neil 2008).

Liver biochemical tests (gamma‐glutamyltransferase, alkaline phosphatases, alanine aminotransferase, aspartate aminotransferase, bilirubin) and transabdominal ultrasound are screening tests for choledocholithiasis in people with gallstones, especially in ruling out the presence of common bile duct stones. Both tests have high negative predictive values, of more than 97% for the liver biochemical tests when combined and 95% to 96% for normal bile duct diameter on transabdominal ultrasound (normal range from 3 mm to 6 mm) (Parulekar 1979; Bruneton 1981; Liu 2001; Yang 2008). Therefore, laboratory and ultrasound tests constitute the initial evaluation and are mainly used to screen people who need further investigation due to the poor sensitivity and specificity of the tests for choledocholithiasis (Freitas 2006; Gurusamy 2015a; Molvar 2016). Once suspected, definitive diagnosis of common bile duct stones may be confirmed with imaging techniques such as magnetic resonance cholangiopancreatography (sensitivity 93% and specificity 96%), endoscopic ultrasound (sensitivity 95% and specificity 97%), endoscopic retrograde cholangiopancreatography (ERCP) (sensitivity 83% and specificity 99%), and intraoperative cholangiography (sensitivity 99% and specificity 99%), according to availability and technical experience (Aziz 2014; Giljaca 2015; Gurusamy 2015b; Molvar 2016). ERCP may also be used as a therapeutic intervention (Desari 2013; Gurusamy 2015b).
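As a worked illustration of how sensitivity and specificity figures such as those quoted above translate into predictive values, the following minimal sketch applies Bayes' theorem. The function name is hypothetical, and the 10% prevalence is an assumption taken from the upper end of the 5% to 10% range cited earlier; it is not a value prescribed by this protocol.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a diagnostic test using Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Magnetic resonance cholangiopancreatography figures quoted in the text
# (sensitivity 93%, specificity 96%); the 10% prevalence is an assumption.
ppv, npv = predictive_values(0.93, 0.96, 0.10)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

At low prevalence the negative predictive value stays high even for an imperfect test, which is why such tests are mainly useful for ruling out common bile duct stones.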

Gallbladder stones may cause acute cholecystitis, a complication diagnosed in 10% to 35% of people admitted for cholecystectomy. This condition can distort the anatomy of Calot's triangle and hamper its interpretation during cholecystectomy. This may lead to iatrogenic bile duct injuries, which are rare but serious complications, with several large population‐based studies showing an occurrence of 0.3% to 0.5% (Garber 1997; Gurusamy 2013; Stewart 2014). In addition, anatomical variations of the biliary tree, which may be detected in nearly 50% of the general population, may make dissection of the hepatocystic triangle complex, predisposing to injuries of the extrahepatic bile ducts (Chaib 2014; Khayat 2014; Kostakis 2017).

Cholecystectomy is the procedure of choice to treat people with symptomatic gallstones (Gurusamy 2010).

Description of the intervention

Intraoperative cholangiography was first described by Mirizzi in 1931 (Mirizzi 1931). It involves the injection of dye through a catheter into the biliary tree via the cystic duct during cholecystectomy. This diagnostic procedure is a well‐established technique for delineating biliary anatomy and intraductal stones, and is considered the gold standard for intraoperative imaging of the biliary anatomy during laparoscopic cholecystectomy (Cushieri 1994; Buddingh 2011; Khan 2011; Jamal 2016).

Depending on resource availability and in the absence of magnetic resonance cholangiopancreatography and ERCP, intraoperative cholangiography based on clinical, biochemical, and ultrasound results can be performed to detect common bile duct stones during cholecystectomy (Gurusamy 2015b).

This procedure can be performed routinely in people undergoing cholecystectomy, regardless of their clinical condition, biochemical tests, and ultrasound results; or selectively, under prespecified conditions, if at least one of the following criteria is met (ASGE Standards of Practice Committee 2010):

- history of jaundice, cholangitis, or pancreatitis;
- abnormal liver function tests: gamma‐glutamyltransferase greater than 37 U/L, alkaline phosphatases greater than 95 U/L, alanine aminotransferase greater than 40 U/L, aspartate aminotransferase greater than 40 U/L, and total bilirubin greater than 1.2 mg/dL;
- ultrasonographic evidence of common bile duct stone or dilation (diameter greater than 6 mm; Freitas 2006; Molvar 2016);
- indefinite anatomy during cholecystectomy (Traverso 2006).
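The selection rule above can be sketched as a simple predicate. This is an illustrative sketch only: the class, field, and function names are hypothetical, and treating any single liver test above its limit as "abnormal" is an assumption, since the criteria do not state how many tests must be raised.

```python
from dataclasses import dataclass, field

# Upper limits quoted in the criteria above.
LFT_UPPER_LIMITS = {
    "ggt_u_per_l": 37.0,
    "alp_u_per_l": 95.0,
    "alt_u_per_l": 40.0,
    "ast_u_per_l": 40.0,
    "total_bilirubin_mg_per_dl": 1.2,
}
CBD_DIAMETER_LIMIT_MM = 6.0


@dataclass
class Findings:
    """Hypothetical container for one person's findings."""
    history_jaundice_cholangitis_pancreatitis: bool = False
    liver_tests: dict = field(default_factory=dict)  # keys as in LFT_UPPER_LIMITS
    cbd_stone_on_ultrasound: bool = False
    cbd_diameter_mm: float = 0.0
    unclear_anatomy_during_surgery: bool = False


def selective_ioc_indicated(f):
    """True if at least one selective criterion is met."""
    # Assumption: one raised liver test is enough to count as abnormal.
    abnormal_liver_test = any(
        f.liver_tests.get(name, 0.0) > limit
        for name, limit in LFT_UPPER_LIMITS.items()
    )
    stone_or_dilation = (
        f.cbd_stone_on_ultrasound or f.cbd_diameter_mm > CBD_DIAMETER_LIMIT_MM
    )
    return (
        f.history_jaundice_cholangitis_pancreatitis
        or abnormal_liver_test
        or stone_or_dilation
        or f.unclear_anatomy_during_surgery
    )


print(selective_ioc_indicated(Findings(liver_tests={"ggt_u_per_l": 50.0})))
```

Under a routine strategy the predicate is irrelevant, because cholangiography is performed in everyone.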

The major role of intraoperative cholangiography is for people with unclear ductal anatomy during cholecystectomy. Whether its routine use prevents common bile duct injury is still a matter of debate (Fletcher 1999; Singh 2000; Flum 2001; Flum 2003; Duncan 2012; Sajid 2012; Stewart 2014; Kumar 2015). Though not considered mandatory for everyone with gallbladder stones undergoing cholecystectomy, on‐table cholangiogram is recommended for people who have an intermediate to high probability of common bile duct stones which have not been diagnosed preoperatively by other means (e.g. ERCP) by using factors such as age, liver test results, and ultrasound findings (Williams 2008; ASGE Standards of Practice Committee 2010; Mohandas 2010). Once intraoperatively detected by cholangiography, common bile duct stones can be removed through laparoscopic exploration, conversion to open surgery, or intraoperative or postoperative ERCP, with comparable success rates depending on availability and technical skills (Desari 2013).

How the intervention might work

Considering the relatively low prevalence of common bile duct stones and the low likelihood of choledocholithiasis based on high negative predictive values of clinical, biochemical, and ultrasound tests (ASGE Standards of Practice Committee 2010), it might be that routine compared with selective intraoperative cholangiography offers more harm than benefit. In this scenario, the selective procedure emerges as the possibly recommended evaluation for people with biochemical and ultrasound abnormalities and in selected people with Calot's triangle anatomy distortion during cholecystectomy.

In addition, the selective approach seems to be more reasonable and safer, as intraoperative cholangiography may entail adverse outcomes such as postprocedure pancreatitis, higher costs, false‐positive results leading to unnecessary common bile duct exploration, and increased operation time (Borjeson 2000; Ford 2012).

Why it is important to do this review

Different points of view regarding intraoperative cholangiography have emerged over the years, and the question about selective or routine intraoperative cholangiography remains controversial (Berci 1991; Talamini 2003). Therefore, it is necessary to resolve this controversy about the benefits and harms of the selective and routine approaches in order to guide decision‐making. All related randomised clinical trials need to be identified, meta‐analysed when possible, and critically appraised to support an evidence‐based choice regarding the selective or routine use of intraoperative cholangiography.

We found no meta‐analyses or systematic reviews with meta‐analyses and Trial Sequential Analysis dealing specifically with this topic.

Objectives

To assess the benefits and harms of selective versus routine intraoperative cholangiography in people undergoing cholecystectomy.

Methods

Criteria for considering studies for this review

Types of studies

Randomised clinical trials irrespective of blinding, language, year, format, and publication status. We will also consider quasi‐randomised studies, controlled clinical studies, and observational studies retrieved with our searches for data on harm. We are aware that by not searching for all observational studies on harms, we run the risk of putting more weight on potential benefits than on potential harms.

Types of participants

We will include people, regardless of their age or sex, undergoing cholecystectomy (open or laparoscopic), elective or emergency, for any benign condition such as symptomatic gallstone, acalculous cholecystitis, or gallbladder polyp.

Types of interventions

Experimental: selective intraoperative cholangiography. We will accept the definition of selective intraoperative cholangiography given by the authors of the randomised clinical trial publications. The criteria definition for indication may include:

- history of jaundice, cholangitis, or pancreatitis;
- abnormal liver function tests (alanine aminotransferase, aspartate aminotransferase, alkaline phosphatases, gamma‐glutamyltransferase, bilirubin);
- ultrasonographic evidence of common bile duct stone or dilation (greater than 6 mm);
- indefinite anatomy during cholecystectomy.

Comparator: routine intraoperative cholangiography performed in all participants, regardless of clinical condition, biochemical tests, and ultrasound results.

Types of outcome measures

Primary outcomes

All‐cause mortality.

Serious adverse events. Serious adverse events correspond to Grade III or above of the Clavien‐Dindo classification, the only validated system for classifying postoperative complications (Dindo 2004). We will classify the following postoperative complications as serious adverse events: post‐intraoperative cholangiography acute pancreatitis; and bile duct injury during cholecystectomy, confirmed radiologically by intraoperative cholangiography, if diagnosed intraoperatively. Common bile duct injury suspected at postoperative follow‐up (at three months or six months) is to be confirmed by ERCP or magnetic resonance cholangiopancreatography. Classification of bile duct injury will be according to Strasberg and further stratified from type A to type E, which are subdivided into E1 to E5, according to the Bismuth classification system (Strasberg 1995). Other serious post‐operative complications such as acute cholangitis, bile leak, biliary strictures, intra‐abdominal abscess, secondary cirrhosis, haemorrhage requiring reoperation, and biloma will also be considered as serious adverse events.

Health‐related quality of life: measured by any validated general or specific tool such as the 36‐item Short Form (SF‐36) (SF‐36 1994) and Gastrointestinal Quality of Life Index (GIQLI 1995).

We will analyse each primary outcome separately, including each component of serious adverse events.

Secondary outcomes

Non‐serious adverse events: may include non‐serious local and systemic postoperative complications. Local complications: wound infection (according to the Surgical Site Infection Criteria of the Centers for Disease Control and Prevention; Berríos‐Torres 2017) or incision haematoma. Systemic complications: pneumoperitoneum‐related (carbon dioxide embolism, vasovagal reflex, cardiac arrhythmias, hypercarbic acidosis), nausea or vomiting, diarrhoea, or lung atelectasis, if they do not lead to hospitalisation or prolongation of hospitalisation.
Pain: right upper abdomen, right shoulder, and incision pain (at trocar sites and subcostal incision), as provided by trial authors.
Common bile duct stones:

- if detected intraoperatively: we will consider intraoperative cholangiography as the diagnostic method. We are aware that this outcome is inherently biased, likely to occur more often in the control group;
- if detected postoperatively: suspected by clinical signs and symptoms such as jaundice, right upper abdominal pain, back pain, nausea or vomiting, and confirmed by serum chemistry analysis (gamma‐glutamyltransferase greater than 37 U/L, alkaline phosphatases greater than 95 U/L, alanine aminotransferase greater than 40 U/L, aspartate aminotransferase greater than 40 U/L, total bilirubin greater than 1.2 mg/dL) plus demonstration of calculi on imaging: ultrasound, magnetic resonance cholangiopancreatography, or ERCP. It is important to differentiate these stones from primary common bile duct stones, which may occur after a symptom‐free interval of at least two years following cholecystectomy (Saharia 1977).

We will analyse the outcomes separately, considering each time point. We will primarily base our conclusions on the results at maximum follow‐up.

Search methods for identification of studies

We will conduct a systematic search with no language limitations to find all relevant randomised clinical trials.

Electronic searches

We will search the Cochrane Hepato‐Biliary Group Controlled Trials Register (Gluud 2017), the Cochrane Central Register of Controlled Trials (CENTRAL) in the Cochrane Library, MEDLINE Ovid, Embase Ovid, Science Citation Index Expanded (Web of Science), Latin American and Caribbean Health Science Information Database (LILACS; Biblioteca Virtual em Saúde ‐ BVS), Cumulative Index to Nursing and Allied Health Literature (CINAHL; EBSCO), and the Physiotherapy Evidence Database (PEDro) (Royle 2003). The preliminary search strategies with the expected time spans of the searches are given in Appendix 1.

We will also endeavour to identify randomised clinical trials referenced in non‐English databases using our personal contacts or local access, or asking the Cochrane Hepato‐Biliary Group Information Specialist to contact Cochrane collaborators worldwide, with the same intent.

Searching other resources

We will search online trial registries such as ClinicalTrials.gov (clinicaltrials.gov/), European Medicines Agency (EMA) (www.ema.europa.eu/ema/), World Health Organization International Clinical Trials Registry Platform (www.who.int/ictrp), the Food and Drug Administration (FDA) (www.fda.gov), and contact pharmaceutical company sources for ongoing or unpublished trials. We will handsearch the reference lists of identified studies to identify additional studies and contact the main authors of studies and content experts for further unpublished or ongoing studies.

Data collection and analysis

We will perform the review following the recommendations of Cochrane (Higgins 2011) and the Cochrane Hepato‐Biliary Group (Gluud 2017). We will perform the analyses using Review Manager 5 (RevMan 2014) and Trial Sequential Analysis (TSA 2011).

Selection of studies

Two review authors (DK, MML) will independently screen abstracts for probable inclusion of studies retrieved by the search, read the full‐text reports/publications of those trials eligible for inclusion, and select trials that meet the inclusion criteria. We will justify the reasons for excluding studies in a 'Characteristics of excluded studies' table. We will resolve disagreements by consulting a third author (RR).

We will identify and exclude duplicates and collate multiple reports of the same study, so that each study rather than each report is the unit of interest in the review. We will record the selection process in sufficient detail to complete a PRISMA flow diagram.

Data extraction and management

Two review authors (DK, MML) will independently extract the following information and data from included primary trials and complete a data extraction form which will be piloted on at least one study that meets the inclusion criteria of the review.

Publication data (i.e. year, country, authors).
Study design.
Setting, inclusion and exclusion criteria, methods of randomisation, allocation concealment, and blinding.
Population data (i.e. age, sex, history of jaundice, cholangitis, or pancreatitis; liver function tests, ultrasonographic evidence of common bile duct stone, or dilation).
Intervention data (i.e. criteria for selective intraoperative cholangiography, type of cholangiography).
Outcome measures (including benefits and adverse events).
Dropouts.
Length of follow‐up.
Types of data analyses (i.e. per protocol, intention‐to‐treat, modified intention‐to‐treat).

Two review authors (DK, MML) will transfer data into Review Manager 5 (RevMan 2014). A third review author (RR) will double‐check that data are properly entered by comparing the data with those provided in the study. We will complete a 'Characteristics of included studies' table.

We will contact the authors of randomised clinical trials in case of missing data.

Assessment of risk of bias in included studies

Three review authors (DK, MML, and DM) will independently evaluate the risk of bias of the included studies as described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011), the Cochrane Hepato‐Biliary Group (Gluud 2017), and methodological studies (Schulz 1995; Moher 1998; Kjaergard 2001; Wood 2008; Savović 2012a; Savović 2012b; Lundh 2017).

We will use the following domains with definitions in the assessment of risk of bias.

Allocation sequence generation

Low risk of bias: the study authors performed sequence generation using computer random number generation or a random number table. Drawing lots, tossing a coin, shuffling cards, and throwing dice were adequate if an independent person not otherwise involved in the study performed them.
Unclear risk of bias: the study authors did not specify the method of sequence generation.
High risk of bias: the sequence generation method was not random. We will only include such studies for assessment of harms.

Allocation concealment

Low risk of bias: the participant allocations could not have been foreseen in advance of, or during, enrolment. A central and independent randomisation unit controlled allocation. The investigators were unaware of the allocation sequence (e.g. if the allocation sequence was hidden in sequentially numbered, opaque, and sealed envelopes).
Unclear risk of bias: the study authors did not describe the method used to conceal the allocation so the intervention allocations may have been foreseen before, or during, enrolment.
High risk of bias: it is likely that the investigators who assigned the participants knew the allocation sequence. We will only include such studies for assessment of harms.

Blinding of participants and personnel

Low risk of bias: any of the following: no blinding or incomplete blinding, but the review authors judged that the outcome was not likely to be influenced by lack of blinding; or blinding of participants and key study personnel ensured, and it was unlikely that the blinding could have been broken.
Unclear risk of bias: any of the following: insufficient information to permit judgement of 'low risk' or 'high risk;' or the trial did not address this outcome.
High risk of bias: any of the following: no blinding or incomplete blinding, and the outcome was likely to be influenced by lack of blinding; or blinding of key study participants and personnel attempted, but likely that the blinding could have been broken, and the outcome was likely to be influenced by lack of blinding.

Blinding of outcome assessment

Low risk of bias: any of the following: no blinding of outcome assessment, but the review authors judged that the outcome measurement was not likely to be influenced by lack of blinding; or blinding of outcome assessment ensured, and unlikely that the blinding could have been broken.
Unclear risk of bias: any of the following: insufficient information to permit judgement of 'low risk' or 'high risk;' or the trial did not address this outcome.
High risk of bias: any of the following: no blinding of outcome assessment, and the outcome measurement was likely to be influenced by lack of blinding; or blinding of outcome assessment, but likely that the blinding could have been broken, and the outcome measurement was likely to be influenced by lack of blinding.

Incomplete outcome data

Low risk of bias: missing data were unlikely to make treatment effects depart from plausible values. The study used sufficient methods, such as multiple imputation, to handle missing data.
Unclear risk of bias: there was insufficient information to assess whether missing data in combination with the method used to handle missing data were likely to induce bias on the results.
High risk of bias: the results were likely to be biased due to missing data.

Selective outcome reporting

Low risk of bias: the trial reported the following predefined outcomes: mortality, serious adverse events, and surgical‐related morbidity. If the original trial protocol was available, the outcomes should have been those called for in that protocol. If the trial protocol was obtained from a trial registry (e.g. www.ClinicalTrials.gov), the outcomes sought should have been those enumerated in the original protocol if the trial protocol was registered before or at the time that the trial was begun. If the trial protocol was registered after the trial was begun, we will not consider those outcomes to be reliable.
Unclear risk of bias: the study authors did not report all predefined outcomes fully, or it was unclear whether the study authors recorded data on these outcomes or not.
High risk of bias: the study authors did not report one or more predefined outcomes.

For‐profit bias

Low risk of bias: the trial appeared free of industry sponsorship or other type of for‐profit support that could manipulate the trial design, conduct, or trial results.
Unclear risk of bias: the trial may or may not have been free of for‐profit bias as the trial did not provide any information on clinical trial support or sponsorship.
High risk of bias: the trial was sponsored by industry or received another type of for‐profit support.

Other bias

Low risk of bias: the trial appeared free of other factors that could put it at risk of bias.
Unclear risk of bias: the trial may or may not have been free of other factors that could put it at risk of bias.
High risk of bias: there were other factors in the trial that could put it at risk of bias.

Overall risk of bias

We will assess overall risk of bias in the trials as:

low risk of bias: if all the bias domains described in the above paragraphs are classified as low risk of bias;

high risk of bias: if one or more of the bias domains described in the above paragraphs are classified as 'unclear risk of bias' or 'high risk of bias.'
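The overall judgement rule above can be sketched as follows. The domain keys and the function name are hypothetical labels chosen for this illustration; only the decision rule itself (all domains low, otherwise high) comes from the text.

```python
# The eight bias domains defined above (keys are illustrative labels).
DOMAINS = (
    "allocation_sequence_generation",
    "allocation_concealment",
    "blinding_of_participants_and_personnel",
    "blinding_of_outcome_assessment",
    "incomplete_outcome_data",
    "selective_outcome_reporting",
    "for_profit_bias",
    "other_bias",
)


def overall_risk_of_bias(judgements):
    """Return 'low' only if every domain is judged 'low'; otherwise 'high'.

    `judgements` maps each domain to 'low', 'unclear', or 'high'.
    """
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing judgements for: {missing}")
    if all(judgements[d] == "low" for d in DOMAINS):
        return "low"
    # Any 'unclear' or 'high' domain makes the trial high risk overall.
    return "high"


example = {d: "low" for d in DOMAINS}
example["allocation_concealment"] = "unclear"
print(overall_risk_of_bias(example))
```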

We will assess the domains 'Blinding of outcome assessment,' 'Incomplete outcome data,' and 'Selective outcome reporting' for each outcome. Thus, we will be able to assess the bias risk for each outcome in addition to each trial.

We will base our primary conclusions and our presentation in the 'Summary of findings' table on the results of our primary outcomes at low risk of bias.

We will resolve disagreements by discussion or by consulting a third review author (DM). We will have two assessors (Marcelo Linhares, Diego Kleinubing) and one adjudicator (Rachel Riera).

Measures of treatment effect

We will estimate the treatment effect for dichotomous data as risk ratios (RR), for continuous data as mean differences (MD), and for time‐to‐event data as hazard ratios (HR). We will report 95% confidence intervals (CI) and Trial Sequential Analysis‐adjusted CI.
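For orientation, a risk ratio and its conventional 95% CI can be computed from a two‐by‐two table as in the sketch below. It uses the standard large‐sample formula on the log scale; the function name and the event counts are hypothetical, and Review Manager 5 remains the tool named in this protocol for the actual analyses.

```python
import math


def risk_ratio(events_exp, n_exp, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a Wald-type CI computed on the log scale.

    Assumes non-zero event counts in both groups (no continuity correction).
    """
    rr = (events_exp / n_exp) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(
        1 / events_exp - 1 / n_exp + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 10/100 events (selective) versus 20/100 (routine).
rr, lower, upper = risk_ratio(10, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```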

Unit of analysis issues

The unit of analysis will be trial participants as randomised per intervention group. Due to the nature of the intervention and the clinical situation, we do not expect to find cross‐over designs. However, if any exist, we will include only data from the first phase of the trial (Qizilbash 1998). We will also consider trials with cluster designs. For cluster‐randomised trials, we will perform an approximately correct analysis if we can extract data on the number of clusters randomised to each intervention group or the average (mean) size of each cluster; the outcome data ignoring the cluster design for the total number of participants; and an estimate of the intra‐cluster correlation coefficient (ICC). If we are unable to access the individual participant data to allow us to calculate an estimate of the ICC, we then will use external estimates obtained from similar studies. If this information is also not available, we will analyse the results of cluster studies using a general summary considering each cluster as the unit of analysis.
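One common form of an approximately correct analysis for cluster‐randomised trials divides event counts and sample sizes by a design effect of 1 + (M − 1) × ICC, where M is the mean cluster size. The sketch below assumes that approach; the function names and the numerical values are hypothetical.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect for a cluster-randomised trial: 1 + (M - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_counts(events, participants, mean_cluster_size, icc):
    """Shrink events and participants to their effective sample size."""
    deff = design_effect(mean_cluster_size, icc)
    return events / deff, participants / deff


# Hypothetical arm: 30 events among 400 participants, clusters of 20, ICC 0.05.
eff_events, eff_n = effective_counts(30, 400, mean_cluster_size=20, icc=0.05)
print(f"effective events = {eff_events:.1f}, effective n = {eff_n:.1f}")
```

The adjusted counts can then be entered into a meta‐analysis alongside individually randomised trials.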

In case of multiple treatment groups, for primary analyses, we will pool results from relevant intervention groups and we will compare them with the pooled results from eligible control groups, creating single, pair‐wise comparisons (Higgins 2011, Section 16.5.4).

When only a subset of relevant participants are included in a trial, we will consider this trial only when the results are presented separately for the subgroup of interest for this review.

Dealing with missing data

Whenever possible, we will contact the original investigators to request missing dichotomous or continuous data from their published reports. If trialists used intention‐to‐treat analysis to deal with missing data, we will use these data in our primary analysis. Otherwise, we will use the data that are available to us.

Dealing with missing data using sensitivity analysis

If trials report only per protocol analysis results, we will include missing data by considering participants as treatment failures or treatment successes by imputing them according to the following two scenarios:

'extreme case' analysis favouring the experimental intervention ('best‐worst' case scenario): none of the participants who dropped out from the experimental group experienced the outcome, but all the participants who dropped out from the control group experienced the outcome; including all randomised participants in the denominator.
'extreme case' analysis favouring the control ('worst‐best' case scenario): all participants who dropped out from the experimental group, but none of the participants who dropped out from the control group experienced the outcome; including all randomised participants in the denominator.

We will perform the two sensitivity scenario analyses only for our primary outcomes.
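The two extreme‐case scenarios can be sketched as below for an unfavourable dichotomous outcome such as mortality. The function and argument names, and the counts in the example, are hypothetical.

```python
def extreme_case_scenarios(exp_events, exp_analysed, exp_dropouts,
                           ctl_events, ctl_analysed, ctl_dropouts):
    """Return (events, total) per group under both extreme-case scenarios.

    All randomised participants are included in the denominators.
    """
    exp_total = exp_analysed + exp_dropouts
    ctl_total = ctl_analysed + ctl_dropouts
    return {
        # Best-worst: no experimental dropout had the outcome,
        # every control dropout did.
        "best_worst": {
            "experimental": (exp_events, exp_total),
            "control": (ctl_events + ctl_dropouts, ctl_total),
        },
        # Worst-best: every experimental dropout had the outcome,
        # no control dropout did.
        "worst_best": {
            "experimental": (exp_events + exp_dropouts, exp_total),
            "control": (ctl_events, ctl_total),
        },
    }


# Hypothetical trial: 5/90 analysed (10 dropouts) versus 8/95 analysed (5 dropouts).
for name, groups in extreme_case_scenarios(5, 90, 10, 8, 95, 5).items():
    print(name, groups)
```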

Assessment of heterogeneity

We will address the presence of heterogeneity in both clinical and statistical ways.

We will specifically examine the degree of heterogeneity between trials by observing the results of the I² statistic (Higgins 2002). As thresholds for the interpretation of the I² statistic could be misleading, we will use the following approximation for interpretation of heterogeneity provided in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011):

0% to 40%: might not be important;
30% to 60%: may represent moderate heterogeneity*;
50% to 90%: may represent substantial heterogeneity*;
75% to 100%: considerable heterogeneity*.

*The importance of the observed value of the I² statistic depends on the magnitude and direction of effects and the strength of evidence for heterogeneity (e.g. P value from the Chi² test, or a CI for the I² statistic).
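As a reminder of how the statistic is obtained, the sketch below computes Cochran's Q from inverse‐variance weights and derives I² = max(0, (Q − df)/Q) × 100%. The function name and the input values are hypothetical; effects are assumed to be on an additive scale such as log risk ratios.

```python
def i_squared(effects, variances):
    """I-squared (in percent) from study effects and their variances."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Two hypothetical studies with equal variances and discordant effects.
print(f"I2 = {i_squared([0.0, 1.0], [0.1, 0.1]):.0f}%")
```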

For the heterogeneity adjustment of the required information size in the Trial Sequential Analysis, we will use diversity (D²) because the I² statistic used for this purpose may underestimate the required information size (Wetterslev 2009).

Depending on the number of eligible trials, we will add covariates to a meta‐regression model to adjust for heterogeneity.

Assessment of reporting biases

For meta‐analyses with at least 10 trials, we will draw funnel plots using Review Manager 5 to investigate if bias and other small‐study effects are present (RevMan 2014). We will use a linear regression approach to determine funnel plot asymmetry; we will use the Harbord test for dichotomous outcomes in cases where tau² is less than 0.1 (Harbord 2006), and we will use the Rücker test in cases where tau² is more than 0.1 (Rücker 2008). We will use Egger's regression asymmetry test for continuous outcomes (Egger 1997), and the adjusted rank correlation test (Begg 1994).
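To make the regression approach concrete, the following sketch implements the basic form of Egger's test: the standardised effect (effect divided by its standard error) is regressed on precision (the inverse of the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and the data are hypothetical, and the sketch returns only the intercept, its standard error, and the t statistic; converting the statistic to a P value requires a t distribution with k − 2 degrees of freedom, which is left to statistical software.

```python
import math


def egger_intercept(effects, standard_errors):
    """Ordinary least squares intercept of (effect / SE) on (1 / SE)."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in standard_errors]
    y = [e / se for e, se in zip(effects, standard_errors)]
    x_mean, y_mean = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance with k - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (k - 2)
    se_intercept = math.sqrt(resid_var * (1 / k + x_mean ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": k - 2}


# Hypothetical data in which smaller studies (larger SE) show larger effects.
result = egger_intercept([0.10, 0.25, 0.45, 0.80], [0.10, 0.20, 0.30, 0.40])
print(result)
```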

Data synthesis

Meta‐analysis

We will perform the meta‐analyses using Review Manager 5 (RevMan 2014), and according to the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011).

We will present the results of dichotomous outcomes of individual trials as RR with 95% CI and the results of the continuous outcomes as MD with 95% CI. We will apply both fixed‐effect model (Demets 1987) and random‐effects model (DerSimonian 1986) meta‐analyses. If there are statistically significant discrepancies in the results (e.g. one giving a significant intervention effect and the other no significant intervention effect), we will report the more conservative point estimate analysis (Jakobsen 2014). The more conservative point estimate is the estimate closest to zero effect. If the two point estimates are equal, we will use the estimate with the widest CI as our main result of the two analyses. We will consider a P value of 0.025 or less, two‐tailed, as statistically significant if the required information size is reached due to our three primary outcomes at six‐months' follow‐up (Jakobsen 2014). We will use an eight‐step procedure to assess if the thresholds for significance are crossed (Jakobsen 2014). We will present heterogeneity using the I² statistic (Higgins 2002). We will present the results of the individual trials and meta‐analyses in the form of forest plots.
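The difference between the two models can be illustrated with a minimal inverse‐variance sketch, using the DerSimonian and Laird moment estimator of the between‐trial variance (tau²) for the random‐effects model. The function name and the inputs are hypothetical, and the effects are assumed to be on an additive scale such as log risk ratios; the review itself will rely on Review Manager 5.

```python
def pool(effects, variances):
    """Fixed-effect and DerSimonian-Laird random-effects pooled estimates.

    Returns (estimate, variance) for each model, plus tau-squared.
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Moment estimator of the between-trial variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    return {"fixed": (fixed, 1 / sw), "random": (random, 1 / sw_re), "tau2": tau2}


# Two hypothetical trials with discordant effects and equal variances.
print(pool([0.0, 1.0], [0.1, 0.1]))
```

When tau² is zero the two models coincide; when it is positive the random‐effects CI is wider, which is why the two analyses can disagree on statistical significance.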

Where data are only available from one trial, we will use Fisher's exact test for dichotomous data (Fisher 1922) and Student's t‐test for continuous data (Student 1908) to present the results in a narrative way.
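For a single trial with dichotomous data, a two‐sided Fisher's exact test can be computed directly from the hypergeometric distribution, as sketched below. The function name and the example table are hypothetical; the two‐sided P value here sums the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical single trial: 3/4 events in one group versus 1/4 in the other.
print(f"P = {fisher_exact_two_sided(3, 1, 1, 3):.4f}")
```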

Trial Sequential Analysis

We will examine apparently significant beneficial and harmful intervention effects and neutral effects with Trial Sequential Analyses to evaluate if these apparent effects could be caused by random error (Brok 2008; Wetterslev 2008; Brok 2009; Thorlund 2009; Wetterslev 2009; Thorlund 2010; Thorlund 2011; TSA 2011; Wetterslev 2017).

We will use Trial Sequential Analysis as cumulative meta‐analyses are at risk of producing random errors due to sparse data and repetitive testing of the accumulating data (Wetterslev 2008). To minimise random errors, we will calculate the required information size (i.e. the number of participants needed in a meta‐analysis to detect or reject a certain intervention effect) (Wetterslev 2008). The required information size calculation should also account for the diversity present in the meta‐analysis (Wetterslev 2008; Wetterslev 2009; Wetterslev 2017).

In our meta‐analysis, the diversity‐adjusted required information size for dichotomous outcomes will be based on the event proportion in the control group; assumption of an a priori relative risk reduction of 20% or the relative risk reduction observed in the included trials at low risk of bias; a risk of type I error of 2.5% due to three primary outcomes (Jakobsen 2014); a risk of type II error of 10%; and the observed diversity of the included trials in the meta‐analysis (Castellini 2016). We will also calculate and report the Trial Sequential Analysis‐adjusted CI (Thorlund 2011). The underlying assumption of Trial Sequential Analysis is that testing for significance may be performed each time a new trial is added to the meta‐analysis. We will add the trials according to the year of publication, and, if more than one trial has been published in a year, we will add trials alphabetically according to the last name of the first author.
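The required information size described above can be approximated with the usual two‐group formula for a dichotomous outcome, inflated by 1/(1 − D²). In the sketch below, the type I error of 2.5%, the type II error of 10%, and the relative risk reduction of 20% come from this protocol; the function name, the control event proportion of 10%, and the diversity values are assumptions chosen purely for illustration.

```python
from statistics import NormalDist


def required_information_size(control_risk, rrr, alpha=0.025, beta=0.10,
                              diversity=0.0):
    """Diversity-adjusted required information size (total participants)."""
    if not 0 <= diversity < 1:
        raise ValueError("diversity (D squared) must be in [0, 1)")
    exp_risk = control_risk * (1 - rrr)
    mean_risk = (control_risk + exp_risk) / 2
    delta = control_risk - exp_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(1 - beta)
    unadjusted = (
        4 * (z_alpha + z_beta) ** 2 * mean_risk * (1 - mean_risk) / delta ** 2
    )
    return unadjusted / (1 - diversity)


# Assumed control event proportion of 10%; D squared of 0 and of 0.5.
print(round(required_information_size(0.10, 0.20)))
print(round(required_information_size(0.10, 0.20, diversity=0.5)))
```

With these inputs, diversity of 50% doubles the number of participants required, which shows how strongly heterogeneity can affect the amount of evidence needed.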

On the basis of the diversity‐adjusted required information size, trial sequential monitoring boundaries will be constructed (Thorlund 2011). These boundaries will determine the statistical inference one may draw regarding a cumulative meta‐analysis that has not reached the required information size. If the cumulative Z‐curve crosses the trial sequential monitoring boundary for benefit or harm before the diversity‐adjusted required information size is reached, firm evidence may be considered established and further trials may be superfluous. In contrast, if the Z‐curve does not cross a boundary for benefit or harm, further trials will most probably be needed to detect or reject a certain intervention effect, unless the cumulative Z‐curve crosses the trial sequential monitoring boundaries for futility.
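The cumulative Z‐curve referred to above can be sketched as follows. This is a simplified illustration only: it assumes fixed‐effect inverse‐variance pooling of log risk ratios, assumes every trial arm has at least one event, and does not compute the monitoring boundaries themselves, which require the Trial Sequential Analysis software. The trial data, author names, and the required information size passed in are all hypothetical.

```python
from math import log, sqrt

# Hypothetical trials:
# (first author, year, events selective, n selective, events routine, n routine)
trials = [
    ("Silva", 2005, 12, 100, 15, 100),
    ("Adams", 2005, 8, 80, 9, 82),
    ("Brown", 2001, 20, 150, 18, 148),
]


def cumulative_z_curve(trials, required_information_size):
    """Return (author, year, information fraction, cumulative Z) per step."""
    # Add trials by year of publication, then by first author's last name.
    ordered = sorted(trials, key=lambda t: (t[1], t[0]))
    sum_weights = 0.0
    sum_weighted_effects = 0.0
    participants = 0
    curve = []
    for author, year, e_sel, n_sel, e_rout, n_rout in ordered:
        log_rr = log((e_sel / n_sel) / (e_rout / n_rout))
        variance = 1 / e_sel - 1 / n_sel + 1 / e_rout - 1 / n_rout
        weight = 1 / variance
        sum_weights += weight
        sum_weighted_effects += weight * log_rr
        participants += n_sel + n_rout
        # Pooled estimate divided by its standard error.
        z = sum_weighted_effects / sqrt(sum_weights)
        curve.append((author, year, participants / required_information_size, z))
    return curve


curve = cumulative_z_curve(trials, required_information_size=1320)
```

Each tuple in `curve` gives the fraction of the required information size accrued so far and the Z‐value at that point; these are the coordinates that would be plotted against the monitoring boundaries for benefit, harm, and futility.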

A more detailed description of Trial Sequential Analysis can be found at www.ctu.dk/tsa/ (Thorlund 2011).

Subgroup analysis and investigation of heterogeneity

When possible, we will conduct the following subgroup analyses to assess for possible sources of heterogeneity among trials with common features.

Trials at low risk of bias compared to trials at high risk of bias.
Different indication criteria for selective intraoperative cholangiography.
Sex.
Age of participants (65 years or older compared to younger than 65 years).
Trial participants with or without acute cholecystitis.
Cholecystocholangiography and cystic duct cholangiography.

We may perform additional subgroup analyses if such are deemed of high importance during the review preparation. We will note such inclusions in the 'Differences between protocol and review' section of the review.

Sensitivity analysis

In addition to the sensitivity analyses specified under Dealing with missing data, we may perform additional sensitivity analyses if considered necessary (e.g. including only trials published as full‐paper articles, thereby excluding abstracts and unpublished trials; or including only trials with intention‐to‐treat data analysis, or only trials with data as analysed). If the number of trials permits, we will also perform a sensitivity analysis excluding studies that were described as open‐label or that were not blinded. We will note such inclusions in the 'Differences between protocol and review' section of the review.

'Summary of findings' tables

We will assess confidence in the evidence using the GRADE criteria (Atkins 2004) and GRADEpro software (GRADEpro 2008). We will assess all primary and secondary outcomes of the review against the five GRADE factors that bear on the quality of the evidence: limitations in the design and implementation of included studies (risk of bias); indirectness of evidence (population, intervention, control, outcomes); unexplained heterogeneity or inconsistency of results (including problems with subgroup analyses); imprecision of results (wide CIs and as evaluated with our Trial Sequential Analyses) (Jakobsen 2014); and a high probability of publication bias. We will follow the recommendations of Section 8.5 and Chapter 12 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We will grade the certainty of the evidence as 'high,' 'moderate,' 'low,' or 'very low'; these grades are defined as follows.

High certainty: this research provides a very good indication of the likely effect; the likelihood that the effect will be substantially different is low.
Moderate certainty: this research provides a good indication of the likely effect; the likelihood that the effect will be substantially different is moderate.
Low certainty: this research provides some indication of the likely effect; however, the likelihood that it will be substantially different is high.
Very low certainty: this research does not provide a reliable indication of the likely effect; the likelihood that the effect will be substantially different is very high.
