Serological tests for primary biliary cholangitis

Merica Aralica; Vanja Giljaca; Goran Poropat; Goran Hauser; Davor Štimac

doi:10.1002/14651858.CD012560

Version published: 23 February 2017

Abstract

This is a protocol for a Cochrane Review (Diagnostic test accuracy). The objectives are as follows:

To determine and compare the diagnostic accuracy in terms of sensitivity and specificity of different serological markers for diagnosis of primary biliary cholangitis in people suspected of having the disease.

Background

Target condition being diagnosed

Primary biliary cholangitis is a chronic cholestatic liver disease of unknown aetiology. It is characterised by destruction of small‐sized intrahepatic bile ducts due to immune mechanisms, leading to cholestasis (a block of bile transport from the liver to the duodenum, the first part of the intestine), cirrhosis (scarring of the liver), and eventually liver failure (loss of liver function) (AASLD 2009). Primary biliary cholangitis affects adults of all races, but predominantly Caucasian middle‐aged women. It seems that genetic, social, and past infectious status are contributing factors for development of the disease (Gershwin 2005). The incidence and prevalence of primary biliary cholangitis vary widely: incidence ranges from 0.33 to 5.8 per 100,000 population per year and prevalence from 1.9 to 40.2 per 100,000 population (Boonostra 2012).

In one cohort study from 2004, the majority of people with primary biliary cholangitis (61%) had no signs related to the disease at the time of diagnosis (Prince 2004). Almost half of initially asymptomatic people with the disease developed one or more symptoms of primary biliary cholangitis in the first five years (Prince 2004). In a 10‐year period after diagnosis, more than half of participants with primary biliary cholangitis developed pruritus (itch) and fatigue, and nearly a quarter of all participants progressed to liver failure (Prince 2002). Although a small proportion of participants with primary biliary cholangitis remained symptom‐free 10 years after diagnosis, all participants faced mortality three times higher than expected in the general population (Prince 2002).

In about one third of people, primary biliary cholangitis coexisted with some autoimmune disease (an abnormal immune attack on a person's own healthy tissues) affecting organs other than the liver (Gershwin 2005), or people had liver‐related autoimmune diseases such as autoimmune hepatitis (Chazouilleres 1998).

Diagnosis of primary biliary cholangitis is generally confirmed if two of the three following criteria are fulfilled: elevated activity of alkaline phosphatase (ALP ‐ an enzyme from the liver, placenta, and bone) in the blood for more than six months, detection of antimitochondrial antibodies (AMA, antibodies that are directed to the person's own proteins on mitochondria, compartments in the cell) in blood, and liver biopsy findings compatible with the disease (AASLD 2009).
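The two‐of‐three rule described above can be expressed as a simple check. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it is not a clinical decision tool.

```python
def meets_two_of_three(alp_elevated_over_6_months: bool,
                       ama_detected: bool,
                       biopsy_compatible: bool) -> bool:
    """Return True when at least two of the three criteria are fulfilled.

    The three inputs mirror the criteria listed in the text; how each one
    is established (assay, threshold, histology) is outside this sketch.
    """
    fulfilled = sum([alp_elevated_over_6_months, ama_detected, biopsy_compatible])
    return fulfilled >= 2


# Elevated ALP plus detectable AMA, no biopsy performed or not compatible.
print(meets_two_of_three(True, True, False))
# Elevated ALP alone.
print(meets_two_of_three(True, False, False))
```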

Ursodeoxycholic acid is an officially approved drug for people with primary biliary cholangitis, but its beneficial effect on symptoms, all‐cause mortality, and liver transplantation remained unproven in a Cochrane Review (Rudic 2012). Liver transplantation is a treatment option recommended for end‐stage liver disease in people with primary biliary cholangitis (EASL 2009).

Index test(s)

There are several autoantibodies (circulating proteins that attack the person's own tissues) that are connected to primary biliary cholangitis.

AMA are autoantibodies in the blood of people with primary biliary cholangitis that attack their bile cells in the liver (Gershwin 1987). Due to AMA structure, it is possible to detect them in the blood by an immunoassay, a method that employs an antibody (from tested blood) and antigen (from test) to get a specific type of reaction called a serological reaction.

In clinical guidelines, AMA detection is recommended in people with persistent high activity of ALP in the blood if they are free of obstruction in the liver and bile tree (EASL 2009). Absence or presence of this type of obstruction may be assessed by imaging studies such as abdominal ultrasound and magnetic resonance cholangiopancreatography (AASLD 2009).

In the blood serum (a liquid part of blood), AMA detection is recommended by indirect immunofluorescence (IIF) (Vergani 2004). Generally, it is an immunoassay that utilises a fluorescent microscope and specific cells on microscope slides for the detection of diverse types of autoantibodies. The result of IIF is reported as a titre (Kavanaugh 2000). An adult with elevated ALP and the presence of AMA in titre equal to or higher than 1:40 can be diagnosed with primary biliary cholangitis with confidence. In these conditions, liver biopsy is not obligatory (AASLD 2009; EASL 2009). However, the same recommendations state that if a person has high clinical suspicion of primary biliary cholangitis and negative AMA, biopsy is necessary to confirm the presence of the disease.

Studies have indicated that some people with clinical signs of primary biliary cholangitis and positive liver biopsy may have no detectable AMA (Prince 2002; Hu 2014a). Nevertheless, it seems that AMA status has no effect on the course of the disease, drug therapy, or survival of people with primary biliary cholangitis (Kim 1997; Prince 2002; Muratori 2004). Besides IIF, some other immunoassays such as enzyme‐linked immunoassay (ELISA, a quantitative analytical method) and Western blot (WB) are utilised for AMA detection (Hu 2014a). The type of immunoassay performed as well as study design may influence the diagnostic accuracy of AMA (Hu 2014a).

The anti‐M2 is the most common subtype of AMA and may have a diagnostic and prognostic value in primary biliary cholangitis (Flisiak 2005). Detection of anti‐M2 subtype may be useful in people with suggestive primary biliary cholangitis presentation but persisting negative AMA (Kitami 1994).

Generally, anti‐M2 may be detected by different immunoassays such as ELISA, line immunoassay (LIA, a qualitative immunoassay), or WB (Kitami 1994; Dahnrich 2009; Saito 2012). Using several different immunoassays may lead to diverse estimates of the diagnostic accuracy of this antibody in the same cohort (Dahnrich 2009).

Therefore, AMA and anti‐M2 are noninvasive tests that may be useful in the work‐up of primary biliary cholangitis, AMA as a triage test and anti‐M2 as an add‐on test, both in people who are symptomatic and in people who are asymptomatic.

Antinuclear antibodies (ANA) are autoantibodies that attack a person's own proteins of the cell. ANA is the main serological test in screening for an autoimmune disease and it is commonly performed in primary biliary cholangitis work‐up, but there is no support for such practice in clinical guidelines (Kavanaugh 2000; Prince 2002; EASL 2009). IIF with a substrate of human epithelial line HEp‐2 cells on microscope slides is recommended for ANA detection (Kavanaugh 2000). A positive ANA result on IIF can be recorded as multiple nuclear dots or a nuclear rim pattern (Szostecki 1992). If multiple nuclear dots are detected by IIF, they may be more specifically defined by ELISA as three different ANA subtypes: anti‐sp100 (major antigen of multiple nuclear dots), anti‐promyelocytic leukaemia (anti‐PML, a nuclear protein), and anti‐sp140 in the blood of the same person (Zuchner 1997; Liu 2010). Anti‐sp100 may have a diagnostic and drug monitoring role in primary biliary cholangitis (Wichmann 2003; Gatselis 2013). It seems that anti‐PML shares a diagnostic role with anti‐sp100 and, like anti‐sp100, is more frequent in people with AMA‐negative primary biliary cholangitis (Xiao 2012). Anti‐sp140 antibodies appear frequently with anti‐PML and anti‐sp100 in the same group of people with primary biliary cholangitis, but their clinical role is unclear (Granito 2010). In contrast, a nuclear rim pattern detected by IIF may be confirmed by ELISA as the presence of anti‐gp210 antibodies in the same people. These antibodies are considered a useful predictive marker for progression to end‐stage liver disease, but their diagnostic value seems to be modest (Nakamura 2007; Hu 2014b).

ANA are present in half of people with primary biliary cholangitis, predominantly in people who are AMA negative (Zuchner 1997; Muratori 2003). The prevalence rates of anti‐sp100 antibodies in people with primary biliary cholangitis range from 21% to 34% and prevalence rates of anti‐gp210 range from 26% to 34% (Zuchner 1997; Muratori 2002; Miyachi 2003; Hu 2011). By contrast, in the general population the prevalence rate of anti‐sp100 is 0.05% and that of anti‐gp210 is 0.04% (Liu 2010). Diagnostic accuracy of ANA subtypes may depend on the immunoassays performed for their detection, taking into account that ELISA and LIA are common immunoassays for detection of these antibodies in routine clinical practice (Muratori 2003; Liu 2010; Saito 2012).

ANA and its subtypes have not yet been incorporated in routine diagnostic algorithms for primary biliary cholangitis, although they may be used in the work‐up as add‐on tests along with AMA. Furthermore, estimation of ANA in a meta‐analysis may play a role in more accurate diagnosis of primary biliary cholangitis.

Eosinophil peroxidase antibodies (anti‐EPO) appear as novel serological markers, independent of AMA and ANA, but they may be associated with diagnosis of primary biliary cholangitis (Takigouchi 2005).

Currently, only AMA has been recommended as a serological test for the diagnosis of primary biliary cholangitis. However, in AMA negative people suspected of having primary biliary cholangitis, a liver biopsy is still recommended. Therefore, we are in need of more diagnostic tests that would confirm or exclude the diagnosis of primary biliary cholangitis. Those tests may be used as add‐on tests after AMA has been evaluated. The index tests that will be evaluated in this systematic review and meta‐analysis are AMA, ANA, anti‐M2, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO antibodies (see Index tests).

Clinical pathway

Primary biliary cholangitis diagnostic work‐up usually begins with detection of persistent elevation of ALP activity in the blood, often in people who are initially asymptomatic (AASLD 2009). Raised ALP activity in people with primary biliary cholangitis may be isolated or accompanied by elevated levels of immunoglobulin M (IgM, a class of natural antibodies in the human body) and abnormal results of other liver function tests (a set of blood enzyme tests that check how well the liver is working) that are biochemical markers of cholestasis (EASL 2009). When the person presents with clinical or biochemical (or both) evidence of cholestasis, an abdominal ultrasound or magnetic resonance cholangiopancreatography (or both) may be performed (EASL 2009). These imaging studies allow for ruling out any obstruction in extrahepatic biliary ducts (biliary ducts outside the liver) that may be the cause of blockage in bile transport. AMA testing is then performed, and, depending on the findings of AMA, the decision is made whether the person should proceed to liver biopsy. Liver biopsy and pathohistological findings typical for primary biliary cholangitis are used as definitive diagnostic tests confirming the presence or absence of the disease (AASLD 2009). Other serological markers have not been recommended by clinical guidelines for routine use in people suspected of primary biliary cholangitis. However, they may have sufficient diagnostic accuracy to be used as add‐on tests, and the need for liver biopsy may be reduced in such people. A flowchart of the current and hypothesised clinical pathway is shown in Figure 1.

Figure 1

Diagnostic pathway for diagnosis of primary biliary cholangitis (PBC).
ALP: alkaline phosphatase; AMA: antimitochondrial antibody; ANA: antinuclear antibody; anti‐EPO: anti‐eosinophil peroxidase antibody; anti‐PML: anti‐promyelocytic leukaemia antibody

Implication of positive tests

After exclusion of extrahepatic cholestasis, serological tests for diagnosis of primary biliary cholangitis are usually performed. This includes mainly AMA, and less often ANA. If AMA is positive, there is a consensus that liver biopsy is not necessary for diagnosis and treatment for primary biliary cholangitis can be introduced (AASLD 2009). Other serological tests are not performed in this case. If only ANA is positive, it is still considered that liver biopsy has to be performed. However, since other tests have not been evaluated in a systematic review of diagnostic accuracy, it is unknown whether any of those tests may have higher diagnostic accuracy than AMA.

Implication of negative tests

If AMA are negative in a person with elevated ALP and without another cause of cholestasis, the suspicion remains that the person has primary biliary cholangitis, despite the negative serology. In this case, ANA may be determined. However, if ANA are positive in an AMA‐negative person, the requirements for diagnosis of primary biliary cholangitis are not met and biopsy is mandatory according to current guidelines. Where ANA are negative in an AMA‐negative person, a biopsy is also performed. Other serological tests have not been sufficiently investigated in clinical practice and no concrete recommendations for their use have been issued in present guidelines for diagnosis of primary biliary cholangitis (AASLD 2009; EASL 2013). Therefore, if a person has symptoms and signs that are sufficiently suspicious of primary biliary cholangitis and a negative serological work‐up, a liver biopsy has to be done. Liver biopsy is an invasive procedure which has complications in about 10% of people (Anania 2014). There may be other serological tests that could diagnose primary biliary cholangitis with sufficient accuracy so that liver biopsy could be avoided. We plan to evaluate the diagnostic accuracy of such tests (see Objectives and Index tests).

Alternative test(s)

There is a set of autoantibodies, such as anti‐p97/valosin containing protein (anti‐p97/VCP) and anti‐β2‐glycoprotein I (anti‐β2GPI), that are possibly related to prognosis of primary biliary cholangitis, but this is supported by a scarce body of evidence (Miyachi 2004; Zachou 2006). Another serological marker associated with primary biliary cholangitis prognosis is anticentromere antibodies. They are currently used in clinical work‐up of primary biliary cholangitis, mainly in a set with other ANA subtypes (Parveen 1995; Nakamura 2007; Mahler 2010). These markers are related to prognosis in people already diagnosed with primary biliary cholangitis, rather than to diagnosis of the disease.

Rationale

Primary biliary cholangitis is considered a rare disease, but its prevalence has been increasing (Boonostra 2012). Rising prevalence of primary biliary cholangitis combined with a steady mortality rate expands the burden of the disease, resulting in annual inpatient costs of USD 65 million to USD 115 million in hospitals in the USA (Kim 2002; Gershwin 2005). Furthermore, primary biliary cholangitis is one of the top four indications for liver transplantation in Europe (EASL 2013).

The goal of primary biliary cholangitis diagnostic work‐up should be the timely diagnosis, followed by appropriate management. Both may have an impact on survival as the probability of death or liver transplantation rises in people treated in later stages of the disease (Corpechot 2005). According to current clinical guidelines, AMA is an established serological marker with limits regarding low levels of recommendations, mainly due to lack of appropriate diagnostic accuracy studies (AASLD 2009; EASL 2009). AMA may have an entire spectrum of presentation in people suspected of having primary biliary cholangitis, both symptomatic and asymptomatic. It may be present in people's blood for years before the clinical manifestation of the disease (Prince 2004). It may also be absent in people who are symptomatic (Prince 2002), which may be a consequence of using a suboptimal type of immunoassay (Muratori 2004). In people with suggestive primary biliary cholangitis presentation and persisting negative AMA, detection of anti‐M2 or ANA may be helpful in establishing diagnosis (Kitami 1994; Lacerda 1995; Invernizzi 1997). Furthermore, ANA is a basic serological test in the diagnostic evaluation of autoimmune disease (Agmon‐Levin 2014). Since primary biliary cholangitis shares some features of autoimmunity, ANA may be useful as a diagnostic test for the disease and has been mentioned as such in current management guidelines (EASL 2009). In addition, all of these serological markers may be detected by different types of immunoassays of diverse design, accuracy, and technical details. All the mentioned markers may be a source of inconsistency in results of a single serological marker in selected people (Kavanaugh 2000). Furthermore, AMA‐negative people suspected of having primary biliary cholangitis currently must undergo liver biopsy to confirm the presence or absence of the disease. 
Therefore, we need further serological markers that may be used as add‐on tests to AMA to allow for better diagnosis of primary biliary cholangitis in people who present a challenge in the diagnostic pathway.

Currently, there are no Cochrane Reviews of studies assessing diagnostic accuracy of serological markers for diagnosis of primary biliary cholangitis.

Objectives

Secondary objectives

To investigate variation in the diagnostic accuracy of AMA, ANA, anti‐M2, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO antibodies according to the following potential sources of heterogeneity.

Studies at low risk of bias versus studies with unclear or high risk of bias (as assessed by the Quality Assessment of Diagnostic Accuracy Studies (QUADAS‐2) tool) (Table 1).

Table 1. QUADAS‐2 tool for assessment of methodological quality of included studies

	Signalling question	Signalling question	Signalling question	Risk of bias	Applicability
Domain 1: participant selection
Participant selection	Was a consecutive or random sample of participants enrolled?	Was a case‐control design avoided?	Did a study avoid inappropriate exclusions?	Could the selection of participants have introduced bias?	Are there concerns that the included participants and setting do not match the review question?
Participant selection	Yes: all consecutive participants or random sample of participants with suspected primary biliary cholangitis. No: only selected participants were enrolled. Unclear: this was not clear from the report.	Yes: a case‐control design was avoided. No: a case‐control design was not avoided. Unclear: this was not clear from the report.	Yes: a study avoided inappropriate exclusions, e.g. it included asymptomatic people with elevated ALP or people with elevated ALP and other autoimmune diseases. No: a study excluded difficult‐to‐diagnose participants, e.g. participants with suspected primary biliary cholangitis with other autoimmune disease were excluded. Unclear: this was not clear from the report.	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: the selected participants and settings matched the review question. High: the selected participants and settings differed from the review question. Unclear: this was not clear from the report.
Domain 2: index test
Index test	Were the index test results interpreted without knowledge of the results of the reference standard?	If a threshold was used, was it prespecified?	‐	Could the conduct or interpretation of the index test have introduced bias?	Are there concerns that the index test, its conduct, or its interpretation differ from the review question?
Index test	Yes: index test was always conducted and interpreted before the liver biopsy. No: index test was interpreted with knowledge of results of liver biopsy. Unclear: this was not clear from the report.	Yes: a threshold was prespecified for all autoantibodies. No: a threshold was not prespecified for all autoantibodies. Unclear: this was not clear from the report.	‐	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: there are low concerns the index test conduct or interpretation differed from the review question. High: there are high concerns the index test methodology or interpretation varied from the review question. Unclear: this was not clear from the report.
Domain 3: reference standard
Reference standard	Is the reference standard likely to classify the target condition correctly?	Were the reference standard results interpreted without knowledge of the results of the index test?	‐	Could the reference standard, its conduct, or its interpretation have introduced bias?	Are there concerns that the target condition as defined by the reference standard does not match the review question?
Reference standard	Yes: all participants were verified by liver biopsy and histology. No: some included participants were not verified by liver biopsy and histology, and this was related to the result of the index test, i.e. participants who were AMA positive. Unclear: this was not clear from the report.	Yes: the liver biopsy results were interpreted without knowledge of index test results. No: the liver biopsy results were interpreted with knowledge of index test results. Unclear: this was not clear from the report.	‐	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: all participants underwent common techniques of liver biopsy and their histology reports were classified as compatible with primary biliary cholangitis or not. High: all participants or a proportion of them did not undergo common or generally accepted liver biopsy procedures or histology techniques used were not in line with accepted standard.
Domain 4: flow and timing
Flow and timing	Was there an appropriate interval between index tests and the reference standard?	Did all participants receive the reference standard?	Were all participants included in the analysis?	Could the participant flow have introduced bias?	‐
Flow and timing	Yes: the interval between index tests and the reference standard was ≤ 6 months (an arbitrary value). No: the interval between index tests and the reference standard was > 6 months. Unclear: this was not clear from the report.	Yes: all participants received the reference standard, i.e. liver biopsy and histology. No: some included participants received the reference standard but it was inconclusive for primary biliary cholangitis, or some participants did not receive the reference standard due to AMA positivity. Unclear: this was not clear from the report.	Yes: all participants recruited into the study were included in the analysis. No: the number of participants included in the analysis differed from the number of participants enrolled in the study. Unclear: this was not clear from the report.	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	‐

ALP: alkaline phosphatase; AMA: antimitochondrial antibody.

Full‐text publications versus abstracts (this may indicate publication bias if there is an association between the results of the study and the study reaching full publication) (Eloubeidi 2001).
Prospective versus retrospective studies.
The prevalence of people who are symptomatic versus people who are asymptomatic (the presence of symptoms may increase the pretest probability). People who are symptomatic will be defined as people with fatigue (lasting more than three months; anaemia and hypothyroidism excluded), pruritus, jaundice, and abdominal pain in the absence of biliary stones, oesophageal varices, ascites, and liver failure (Prince 2004; AASLD 2009).
Studies that included 30% or less of participants with other autoimmune diseases versus studies that included more than 30% of such participants.
Detection of index tests by different types of immunoassays.
Diagnostic accuracy of anti‐M2, ANA, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO according to the prevalence of AMA‐negative participants in the included studies.
Diagnostic accuracy of AMA, anti‐M2, ANA, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO in participants suspected of primary biliary cholangitis referred from a general practitioner clinic versus people referred from a specialist clinic.
Diagnostic accuracy of AMA, anti‐M2, ANA, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO in participants without liver cirrhosis versus participants with liver cirrhosis (as defined by individual studies).

Methods

Criteria for considering studies for this review

Types of studies

We will include studies that provide information comparing one or more index tests against the reference standard in the appropriate participant population (see Participants). We will include studies irrespective of language or publication status, or whether data were collected prospectively or retrospectively. We plan to include comparative studies in which two or more index tests were performed in the same study population either by giving all participants both index tests or by randomly allocating participants to receive different index tests for each participant group.

Participants

We will include both asymptomatic and symptomatic men and women suspected of primary biliary cholangitis. The suspected presence of primary biliary cholangitis will be defined as: fatigue (lasting more than three months; anaemia and hypothyroidism excluded), pruritus, jaundice, and abdominal pain in the absence of biliary stones, oesophageal varices, ascites, and liver failure (Prince 2004; AASLD 2009). We will exclude participants that have undergone liver transplantation.

Index tests

Studies evaluating AMA, ANA, anti‐M2, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO antibodies will be eligible. We do not know if there is sufficient diagnostic accuracy to this approach, and that is what we plan to investigate.

We do not expect any tests other than AMA and ANA to be ordered at the same time as other tests investigated because such tests are not commonly used in practice and will probably be investigated in studies that are specifically designed for that purpose. At the moment, we cannot expect any clinical use of those add‐on tests on the premise that they are correlated and that one should precede another.

Target conditions

Primary biliary cholangitis.

Reference standards

Liver biopsy with a sample that shows a pattern compatible with primary biliary cholangitis, characterised by inflammatory destruction of bile ducts in the liver. Changes in the structure of liver tissue are divided into four stages ranging from inflammation, interface hepatitis (liver cells dying), and fibrosis (destruction of liver tissue structure) to cirrhosis (scarring of the liver) (AASLD 2009).

Search methods for identification of studies

Electronic searches

We will search The Cochrane Hepato‐Biliary Group Controlled Trials Register, The Cochrane Hepato‐Biliary Group Diagnostic Test of Accuracy Studies Register, the Cochrane Library (latest issue), MEDLINE via OvidSP (January 1946 to date of search), Embase via OvidSP (January 1947 to date of search), Science Citation Index Expanded via Web of Science (January 1898 to date of search), and BIOSIS via Web of Science (January 1969 to date of search). Preliminary search strategies with the expected time spans of the searches are shown in Appendix 1.

Searching other resources

We will look for further studies in the reference lists of included studies as well as in any related systematic reviews in Medion and ARIF (Aggressive Research Intelligence Facility). Initial search strategies for these databases are shown in Appendix 1. We will perform the 'related search' function in MEDLINE (OvidSP) and Embase (OvidSP) and a 'citing reference' search (search the articles which cited the included articles) in Science Citation Index Expanded and Embase (OvidSP) to identify additional references linked to included studies. We will search for conference proceedings in the BIOSIS databases and via Embase. We will also search clinical trial registers for additional trials (EU Clinical trials register (www.clinicaltrialsregister.eu/) and Clinicaltrials.gov (clinicaltrials.gov/)).

Data collection and analysis

Selection of studies

By searching the references, two review authors (GP and MA) will independently identify relevant studies. We will obtain full texts of references that at least one of the review authors judges relevant. Two review authors (GH and MA) will independently assess the full‐text articles. We will resolve any differences in study selection by arbitration with DŠ. For data extraction, we will use only data from studies that meet the inclusion criteria. We will consider data from abstracts if they contain sufficient data for analysis.

Data extraction and management

Two review authors (VG and MA) will individually extract the following data from each included study: first author of report, year of publication of report, study design (prospective or retrospective; cross‐sectional studies or case‐control studies that report results of diagnostic accuracy of index tests in people suspected to have primary biliary cholangitis); inclusion and exclusion criteria for individual studies; total number of participants; number of males and females; mean age of the participants; tests carried out prior to index tests; index tests; reference standard; and true positive, false positive, true negative, and false negative data. If necessary, we will seek further information from the authors of the studies. If there are any disagreements between the review authors, we will discuss them with DŠ, who will make the final decision.
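The four extracted counts determine each study's sensitivity and specificity. The Python sketch below shows the arithmetic; the function names and the counts are hypothetical and serve only as an illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of people with the target condition who test positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of people without the target condition who test negative."""
    return tn / (tn + fp)


# Hypothetical 2x2 counts for one study (not taken from any included study).
tp, fp, fn, tn = 45, 5, 15, 35

print(sensitivity(tp, fn))  # 45 / (45 + 15) = 0.75
print(specificity(tn, fp))  # 35 / (35 + 5) = 0.875
```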

Assessment of methodological quality

We will use the QUADAS‐2 tool to evaluate the quality of included studies (Whiting 2011; see Table 1). We will evaluate the two segments of the QUADAS‐2 evaluation separately (i.e. the risk of bias and applicability). Two review authors (VG and MA) will independently assess the quality of included studies. We will judge studies that receive positive answers to all signalling questions regarding risk of bias to be at low risk of bias. If any of the signalling questions receives a negative or unclear answer, we will judge the study to be at risk of bias. If there are any concerns or unclear aspects regarding applicability, then we will judge the study as having concerns regarding applicability. If there are no concerns, we will judge the study as not having concerns regarding applicability. If necessary, we will ask for additional information from the authors of included studies to reach a satisfactory level of quality assessment. DŠ will arbitrate any differences regarding quality assessment.
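The risk‐of‐bias rule described above is mechanical once the signalling questions have been answered. The following Python sketch illustrates it; the data layout, function names, and answer strings ('yes', 'no', 'unclear') are assumptions made for this example and are not part of the QUADAS‐2 tool itself.

```python
from typing import Dict, List


def domain_risk_of_bias(answers: List[str]) -> str:
    """Low risk only if every signalling question in the domain is 'yes'."""
    return "low" if all(a == "yes" for a in answers) else "high/unclear"


def study_at_low_risk(domains: Dict[str, List[str]]) -> bool:
    """A study is at low risk of bias only if every domain is at low risk."""
    return all(domain_risk_of_bias(a) == "low" for a in domains.values())


# Hypothetical answers for one study.
study = {
    "participant selection": ["yes", "yes", "yes"],
    "index test": ["yes", "unclear"],
    "reference standard": ["yes", "yes"],
    "flow and timing": ["yes", "yes", "yes"],
}

for name, answers in study.items():
    print(name, "->", domain_risk_of_bias(answers))
print("low risk overall:", study_at_low_risk(study))
```

In this hypothetical study, the single 'unclear' answer in the index test domain is enough for the study to be judged at risk of bias.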

Statistical analysis and data synthesis

To explore between‐study variation in the performance of each test visually, we will plot estimates of sensitivity and specificity from each study using forest plots and in receiver‐operating characteristic (ROC) space. Because our focus of inference is summary points, we will use the bivariate model to summarise the sensitivity and specificity of each test jointly (Reitsma 2005; Chu 2006). This model accounts for between‐study variability in estimates of sensitivity and specificity through the inclusion of random effects for the logit sensitivity and logit specificity parameters of the bivariate model. However, in the unlikely case where different thresholds are used for antibody positivity, we will use the hierarchical summary ROC model to construct ROC curves of sensitivity and specificity (Rutter 2001). In this scenario, we will also perform analyses using the bivariate model with data from subgroups of included studies that use the same threshold and we will calculate the summary points of sensitivity and specificity at the level of different positivity thresholds that will have been encountered across included studies.
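For orientation, the bivariate model referred to above is commonly written in the following binomial‐normal form. This is a sketch in standard notation; the symbols are illustrative labels and are not taken from this protocol. Here, for study i, n_{1i} is the number of participants with the target condition and n_{0i} the number without it.

```latex
\begin{aligned}
TP_i &\sim \mathrm{Binomial}(n_{1i},\, Se_i), \qquad
TN_i \sim \mathrm{Binomial}(n_{0i},\, Sp_i), \\
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix}
\sigma_{Se}^{2} & \sigma_{Se,Sp} \\
\sigma_{Se,Sp} & \sigma_{Sp}^{2}
\end{pmatrix}
\right)
\end{aligned}
```

The summary sensitivity and specificity are obtained by back‐transforming the means, i.e. the inverse logit of \mu_{Se} and \mu_{Sp}; the variances and the covariance are the random‐effects terms that capture between‐study variability.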

Using all available studies (i.e. an indirect comparison), we will compare the diagnostic accuracy of ANA, AMA, anti‐M2, anti‐sp100, anti‐gp210, anti‐PML, anti‐sp140, and anti‐EPO antibodies by including covariate terms for test type in the bivariate model to estimate differences in the sensitivity and specificity of these tests. We will also allow the variances of the random effects and their covariance to depend on test type, thus allowing the variances to differ between tests. We will use likelihood ratio tests to compare the fit of different models, and we will also compare the estimates of sensitivity and specificity between models to check the robustness of our assumptions about the variances of the random effects. If four or more studies that compare two or more tests in the same study population are available, we plan to perform a direct head‐to‐head comparison by limiting the test comparison to such studies. We will perform meta‐analyses and carry out likelihood ratio tests using the METADAS macro (Takwoingi 2010) for the SAS statistical package (SAS).
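The arithmetic behind a likelihood ratio test of two nested models can be sketched as follows. This is not the METADAS macro and does not fit any model; it only assumes that the maximised log‐likelihoods of a reduced and a full model are already available. The function names and the log‐likelihood values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x < 0 or df < 1 or int(df) != df:
        raise ValueError("need x >= 0 and a positive integer df")
    y = x / 2.0
    # Start from the closed forms for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # ... then apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Compare nested models; returns the test statistic and its P value."""
    statistic = max(0.0, -2.0 * (loglik_reduced - loglik_full))
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical log-likelihoods: the full model adds 2 covariate terms.
statistic, p_value = likelihood_ratio_test(-152.4, -148.1, extra_params=2)
print(f"chi-square = {statistic:.2f}, P = {p_value:.4f}")
```

The degrees of freedom equal the number of additional parameters in the full model, for example the covariate terms for test type.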

We will create a table of pretest probabilities against post‐test probabilities using the observed median and range of prevalence of primary biliary cholangitis from all included studies. We will calculate the post‐test probabilities from these pretest probabilities and the summary positive and negative likelihood ratios, which we will derive with the METADAS macro from the bivariate model fitted to estimate the summary sensitivity and specificity of each index test.
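The conversion from pretest to post‐test probability can be sketched as below, using the standard relations LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and post‐test odds = pretest odds × likelihood ratio. The function names, the summary estimates, and the pretest probabilities are hypothetical values chosen for illustration and are not results of this review.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def post_test_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    post_odds = pretest / (1 - pretest) * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical summary estimates and pretest probabilities (not results).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.95)
for pretest in (0.05, 0.10, 0.30):
    print(f"pretest {pretest:.2f}: "
          f"after positive {post_test_probability(pretest, lr_pos):.3f}, "
          f"after negative {post_test_probability(pretest, lr_neg):.3f}")
```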

Investigations of heterogeneity

We will visually inspect forest plots of sensitivity and specificity, and summary ROC plots, to investigate the potential sources of heterogeneity stated in Secondary objectives. We will explore heterogeneity by adding covariates to the bivariate model, which allows sensitivity and specificity to be estimated separately for each covariate level and the effect of each covariate on the summary sensitivity and specificity to be assessed (meta‐regression with one covariate at a time). We will consider a P value less than 0.05 statistically significant.

Sensitivity analyses

Exclusion of participants with uninterpretable results can result in overestimation of diagnostic test accuracy (Schuetz 2012). In practice, uninterpretable results of serological tests are rare since there is a specific threshold for positivity. If any uninterpretable results exist, we will perform a sensitivity analysis by including uninterpretable test results as test negatives if sufficient data are available. Inclusion of case‐control studies may result in overestimation of the diagnostic accuracy of a test (Lijmer 1999). Therefore, we will perform a sensitivity analysis in which we will exclude case‐control studies and perform meta‐analysis of only cross‐sectional studies.
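The effect of counting uninterpretable results as test negatives can be shown on a single 2x2 table. In this sketch, uninterpretable results in participants with the disease become false negatives and those in participants without the disease become true negatives. The function names and all counts are hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) for a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def uninterpretable_as_negative(tp, fp, fn, tn, unint_diseased, unint_non_diseased):
    """Reclassify uninterpretable index test results as test negatives."""
    return tp, fp, fn + unint_diseased, tn + unint_non_diseased


# Hypothetical study with 10 uninterpretable results among participants with
# the disease and 5 among participants without it.
table = dict(tp=80, fp=5, fn=10, tn=90)
primary = accuracy(**table)
sensitivity_analysis = accuracy(
    *uninterpretable_as_negative(**table, unint_diseased=10, unint_non_diseased=5)
)
print(f"excluded:     sensitivity {primary[0]:.3f}, specificity {primary[1]:.3f}")
print(f"as negatives: sensitivity {sensitivity_analysis[0]:.3f}, "
      f"specificity {sensitivity_analysis[1]:.3f}")
```

In this invented example, sensitivity falls from about 0.89 to 0.80 once the uninterpretable results are counted, which is the kind of overestimation the sensitivity analysis is meant to detect.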

Assessment of reporting bias

We will assess the differences in summary sensitivity and specificity between full‐text studies and studies available as abstracts as presented in the section on Investigations of heterogeneity.

Figure 1


Table 1. QUADAS‐2 tool for assessment of methodological quality of included studies

Domain	Signalling question 1	Signalling question 2	Signalling question 3	Risk of bias	Applicability
Domain 1: participant selection
Participant selection	Was a consecutive or random sample of participants enrolled?	Was a case‐control design avoided?	Did a study avoid inappropriate exclusions?	Could the selection of participants have introduced bias?	Are there concerns that the included participants and setting do not match the review question?
Participant selection	Yes: all consecutive participants or random sample of participants with suspected primary biliary cholangitis. No: only selected participants were enrolled. Unclear: this was not clear from the report.	Yes: a case‐control design was avoided. No: a case‐control design was not avoided. Unclear: this was not clear from the report.	Yes: a study avoided inappropriate exclusions, e.g. it included asymptomatic people with elevated ALP or people with elevated ALP and other autoimmune diseases. No: a study excluded difficult‐to‐diagnose participants, e.g. participants with suspected primary biliary cholangitis with other autoimmune disease were excluded. Unclear: this was not clear from the report.	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: the selected participants and settings matched the review question. High: the selected participants and settings differed from the review question. Unclear: this was not clear from the report.
Domain 2: index test
Index test	Were the index test results interpreted without knowledge of the results of the reference standard?	If a threshold was used, was it prespecified?	‐	Could the conduct or interpretation of the index test have introduced bias?	Are there concerns that the index test, its conduct, or its interpretation differ from the review question?
Index test	Yes: index test was always conducted and interpreted before the liver biopsy. No: index test was interpreted with knowledge of results of liver biopsy. Unclear: this was not clear from the report.	Yes: a threshold was prespecified for all autoantibodies. No: a threshold was not prespecified for all autoantibodies. Unclear: this was not clear from the report.	‐	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: there are low concerns the index test conduct or interpretation differed from the review question. High: there are high concerns the index test methodology or interpretation varied from the review question. Unclear: this was not clear from the report.
Domain 3: reference standard
Reference standard	Is the reference standard likely to classify the target condition correctly?	Were the reference standard results interpreted without knowledge of the results of the index test?	‐	Could the reference standard, its conduct, or its interpretation have introduced bias?	Are there concerns that the target condition as defined by the reference standard does not match the review question?
Reference standard	Yes: all participants were verified by liver biopsy and histology. No: some included participants were not verified by liver biopsy and histology, and this was related to the result of the index test, i.e. participants who were AMA positive. Unclear: this was not clear from the report.	Yes: the liver biopsy results were interpreted without knowledge of index test results. No: the liver biopsy results were interpreted with knowledge of index test results. Unclear: this was not clear from the report.	‐	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	Low: all participants underwent common techniques of liver biopsy and their histology reports were classified as compatible with primary biliary cholangitis or not. High: all participants or a proportion of them did not undergo common or generally accepted liver biopsy procedures, or the histology techniques used were not in line with the accepted standard.
Domain 4: flow and timing
Flow and timing	Was there an appropriate interval between index tests and the reference standard?	Did all participants receive the reference standard?	Were all participants included in the analysis?	Could the participant flow have introduced bias?	‐
Flow and timing	Yes: the interval between index tests and the reference standard was ≤ 6 months (an arbitrary value). No: the interval between index tests and the reference standard was > 6 months. Unclear: this was not clear from the report.	Yes: all participants received the reference standard, i.e. liver biopsy and histology. No: some included participants received the reference standard but it was inconclusive for primary biliary cholangitis, or some participants did not receive the reference standard due to AMA positivity. Unclear: this was not clear from the report.	Yes: all participants recruited into the study were included in the analysis. No: the number of participants included in the analysis differed from the number of participants enrolled in the study. Unclear: this was not clear from the report.	Low risk: 'yes' for all signalling questions. High/unclear risk: 'no' or 'unclear' for at least 1 signalling question.	‐
ALP: alkaline phosphatase; AMA: antimitochondrial antibody.

