Rapid tests for the diagnosis of visceral leishmaniasis in patients with suspected disease

Background

The diagnosis of visceral leishmaniasis (VL) in patients with fever and an enlarged spleen relies on demonstrating Leishmania parasites in tissue samples and on serological tests. Parasitological techniques are invasive, require sophisticated laboratories, consume time, or lack accuracy. Rapid diagnostic tests that are easy to perform have recently become available.

Objectives

To determine the diagnostic accuracy of rapid tests for diagnosing VL in patients with suspected disease presenting at health services in endemic areas.

Search methods

We searched MEDLINE, EMBASE, LILACS, CIDG SR, CENTRAL, SCI‐expanded, Medion, Arif, CCT, and the WHO trials register on 3 December 2013, without applying language or date limits.

Selection criteria

This review includes original, phase III, diagnostic accuracy studies of rapid tests in patients clinically suspected to have VL. As reference standards, we accepted: (1) direct smear or culture of spleen aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; and (3) latent class analysis.

Data collection and analysis

Two review authors independently extracted data and assessed the quality of the included studies using the QUADAS‐2 tool. Disagreements were resolved by a third author. We carried out a meta‐analysis to estimate the sensitivity and specificity of rapid tests, using a bivariate normal model with a complementary log‐log link function. We analysed each index test separately. As possible sources of heterogeneity, we explored: geographical area, commercial brand of index test, type of reference standard, disease prevalence, study size, and risk of bias (QUADAS‐2). We also undertook a sensitivity analysis to assess the influence of imperfect reference standards.

Main results

We included 24 studies containing information about five index tests (rK39 immunochromatographic test (ICT), KAtex latex agglutination test in urine, FAST agglutination test, rK26 ICT, and rKE16 ICT), which recruited 4271 participants (2605 with VL). We carried out a meta‐analysis for the rK39 ICT (including 18 studies; 3622 participants) and the latex agglutination test (six studies; 1374 participants). The results showed considerable heterogeneity. For the rK39 ICT, the overall sensitivity was 91.9% (95% confidence interval (95% CI) 84.8 to 96.5) and the specificity was 92.4% (95% CI 85.6 to 96.8). The sensitivity was lower in east Africa (85.3%; 95% CI 74.5 to 93.2) than in the Indian subcontinent (97.0%; 95% CI 90.0 to 99.5). For the latex agglutination test, the overall sensitivity was 63.6% (95% CI 40.9 to 85.6) and the specificity was 92.9% (95% CI 76.7 to 99.2).

Authors' conclusions

The rK39 ICT shows high sensitivity and specificity for the diagnosis of visceral leishmaniasis in patients with febrile splenomegaly and no previous history of the disease, but the sensitivity is notably lower in east Africa than in the Indian subcontinent. Other rapid tests lack accuracy, validation, or both.

Plain language summary

Rapid diagnostic tests for visceral leishmaniasis

Visceral leishmaniasis (or kala‐azar) is caused by a parasite, results in fever, an enlarged spleen and other health problems, and occurs in India, Bangladesh and Nepal, east Africa, the Mediterranean region and Brazil. Without treatment patients die, and adequate treatment can result in cure, so diagnosis is important. Many of the tests used to determine whether a patient has visceral leishmaniasis are complicated, costly, painful and sometimes dangerous for the patients. Rapid diagnostic tests that are safe and easy to perform are now available.

This Cochrane review describes how accurate these rapid diagnostic tests are for visceral leishmaniasis. We summarized the studies that evaluated the rapid tests in patients who, according to their physicians, could have the disease. We only included studies in which the researchers had used established methods to distinguish patients with visceral leishmaniasis from those who did not have the disease.

We found 24 studies that contained information about five different rapid tests. A total of 4271 patients participated in these studies. One of the rapid tests (called the rK39 immunochromatographic test) gave correct positive results in 92% of the patients with visceral leishmaniasis and correct negative results in 92% of the patients who did not have the disease. This test worked better in India and Nepal than in east Africa. In India and Nepal, it gave correct positive results in 97% of the patients with the disease. In east Africa, it gave correct positive results in only 85% of the patients with the disease.

A second rapid test (called the latex agglutination test) gave correct positive results in 64% of the patients with the disease and correct negative results in 93% of the patients without the disease. For the other rapid tests evaluated, there are too few studies to determine how accurate they are.

Authors' conclusions

Implications for practice

The rK39 ICT can be clearly recommended as a rapid diagnostic test for use in clinical care in the Indian subcontinent in patients with febrile splenomegaly and no previous history of VL.

In east Africa, the rK39 ICT can replace the DAT and other diagnostic tests as the basis for the therapeutic decision in patients suspected to have VL. However, because of the low sensitivity of the rK39 ICT, a negative test result does not rule out VL. Therefore, additional actions are needed in case of a negative result (eg second or different test, referral, coming back after two weeks for repeat testing or other). Too little evidence has accrued so far from other regions to make specific recommendations.

It is important to remember that this antibody‐based test has to be used in combination with a clinical case definition (fever of more than two weeks, splenomegaly and no previous history of VL) that needs to be strictly adhered to.

The sensitivity of the KAtex latex agglutination test in urine is too low to recommend it for standard practice guidelines for detection of VL in similar settings.

Implications for research

Although this review yielded solid evidence for recommending the rK39 ICT for clinical practice in the Indian subcontinent and east Africa, regular head‐to‐head comparisons of available tests by region and by test format, as was done by Cunningham et al. (Cunningham 2012), would be helpful, as these would inform policy makers in their purchasing policies and quality assurance.

Several rapid diagnostic tests such as the latex agglutination test and the FAST should be further developed and evaluated before any recommendation on wide‐scale use can be made. Furthermore, better tests are needed, ie tests that are more specific for the acute stage of disease and more sensitive in all geographic regions, especially east Africa.

Last but not least, phase III clinical prospective studies are an essential element in the research and development process of a diagnostic device because they provide the basis for clinical guidelines. More studies of this type are needed, although in the case of VL, they are more complex in terms of reference standards. In addition, further standardization of evaluation methodology and a broader awareness of the QUADAS‐2 and STARD criteria by investigators would be beneficial for the quality of future evidence in this field.

Summary of findings

Summary of findings 1. rK39 immunochromatographic test for visceral leishmaniasis in the Indian subcontinent

Population: Patients suspected to have visceral leishmaniasis disease [1]

Setting: Health services in endemic areas of the Indian subcontinent [2]

New test: rK39 immunochromatographic test [3]

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis [4]

Pooled sensitivity: 0.97 (95% CI 0.90 to 1.00) | Pooled specificity: 0.90 (95% CI 0.76 to 0.98)

Setting [5] | Positive predictive value [6] | Negative predictive value [6]
Peripheral health centre with a prior probability of disease of 40% | 87% | 98%
Referral centre with a prior probability of disease of 60% | 94% | 95%

Number of participants (studies): 1468 (6 studies)

Quality of the evidence (QUADAS‐2) [7]:

Risk of bias: none of the studies had a low risk of bias in all domains. One study had a high risk of bias (domain of flow and timing). Five studies had an unclear risk of bias (domains of index test or reference standard).

Applicability: low concerns in all studies and in all domains.

Interpretation: When the rK39 ICT is used in the Indian subcontinent (see footnote 1), in a setting where the prior probability of VL among clinical suspects is 40%, which is typically seen in a peripheral health centre in an endemic area, the positive predictive value of the test is 87%. This means that out of 100 patients with a positive rK39 ICT result, 87 would have VL (true positive result) and 13 would have another disease (false positive). The negative predictive value is 98%, meaning that out of 100 patients with a negative rK39 ICT result, 98 would have another disease (true negative) and 2 would have VL (false negative).

When the same test is used in a setting with a prior probability of VL of 60%, which is more typical for a referral centre in an endemic area, the positive predictive value is 94% and the negative predictive value is 95%.

A likelihood ratio is another way of expressing how informative a diagnostic test is: it indicates to what extent the rK39 ICT result changes the odds that a patient has VL. The likelihood ratio of a positive rK39 ICT result is 9.90, and the likelihood ratio of a negative test result is 0.03. This means that in the Indian subcontinent, a positive rK39 ICT result is a strong argument in favour of VL (ruling in) and that a negative rK39 ICT result is a strong argument against VL (ruling out).
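The predictive values and likelihood ratios quoted above follow from the pooled sensitivity and specificity by Bayes' theorem. The sketch below is illustrative only: the function names are hypothetical, and it uses the rounded pooled estimates (0.97 and 0.90) rather than the unrounded estimates from the meta‐analysis.

```python
def predictive_values(sensitivity, specificity, prior):
    """Return (PPV, NPV) by Bayes' theorem for a given prior probability of disease."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    true_neg = specificity * (1 - prior)
    false_neg = (1 - sensitivity) * prior
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with a binary result."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


# Rounded pooled estimates for the rK39 ICT in the Indian subcontinent.
SENSITIVITY, SPECIFICITY = 0.97, 0.90

for setting, prior in [("peripheral health centre", 0.40), ("referral centre", 0.60)]:
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prior)
    print(f"{setting} (prior {prior:.0%}): PPV {ppv:.0%}, NPV {npv:.0%}")

lr_pos, lr_neg = likelihood_ratios(SENSITIVITY, SPECIFICITY)
print(f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

With these rounded inputs the predictive values round to the 87% and 98% (prior 40%) and 94% and 95% (prior 60%) given in the table. The positive likelihood ratio comes out near 9.7 rather than 9.90, presumably because the reported figure was calculated from unrounded estimates.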

CI: confidence interval

Boelaert M, Verdonck K, Menten J, Sunyoto T, van Griensven J, Chappuis F, Rijal S. Rapid diagnostic tests for visceral leishmaniasis. Cochrane Database of Systematic Reviews 2011, Issue 6. Art. No.: CD009135. DOI: 10.1002/14651858.CD009135.

1. The rK39 immunochromatographic test must be used in combination with a clinical case definition (fever and splenomegaly for more than two weeks and no previous history of visceral leishmaniasis). Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The results of the meta‐analysis showed considerable heterogeneity, which was partly explained by the geographic region.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

6. A narrative explanation of the predictive values is given in Appendix 3.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.

Summary of findings 2. rK39 immunochromatographic test for visceral leishmaniasis in east Africa

Population: Patients suspected to have visceral leishmaniasis disease [1]

Setting: Health services in endemic areas of east Africa [2]

New test: rK39 immunochromatographic test [3]

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis [4]

Pooled sensitivity: 0.85 (95% CI 0.75 to 0.93) | Pooled specificity: 0.91 (95% CI 0.80 to 0.97)

Setting [5] | Positive predictive value [6] | Negative predictive value [6]
Peripheral health centre with a prior probability of disease of 40% | 86% | 90%
Referral centre with a prior probability of disease of 60% | 93% | 81%

Number of participants (studies): 1692 (9 studies)

Quality of the evidence (QUADAS‐2) [7]:

Risk of bias: three studies had a low risk of bias across all domains; four studies had an unclear risk of bias (domains of index test or reference standard); two studies had a high risk of bias (domains of patient selection or flow and timing, or both).

Applicability: one study with high concerns about applicability (domain of patient selection); low concerns for all other studies and all other domains.

Interpretation: When the rK39 ICT is used in east Africa (see footnote 1), in a setting where the prior probability of VL is 40%, which is typically seen in a peripheral health centre in an endemic area, the positive predictive value of the test is 86%. This means that out of 100 patients with a positive rK39 ICT result, 86 would have VL (true positive result) and 14 would have another disease (false positive). The negative predictive value is 90%, meaning that out of 100 patients with a negative rK39 ICT result, 90 would have another disease (true negative) and 10 would have VL (false negative).

When the same test is used in a setting with a prior probability of VL of 60%, which is more typical for a referral centre in an endemic area, the positive predictive value is 93% and the negative predictive value is 81%.

In east Africa, the likelihood ratio of a positive rK39 ICT result is 9.58, and the likelihood ratio of a negative rK39 ICT result is 0.16. This means that a positive rK39 ICT result is a strong argument in favour of VL (ruling in), and that a negative rK39 ICT result is not an absolute argument against VL (it does not allow VL to be ruled out completely).
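To see how a likelihood ratio changes the odds of disease, the prior probability can be converted to odds, multiplied by the likelihood ratio, and converted back to a probability. The following is a minimal sketch of that standard calculation, using the east African likelihood ratios quoted above (9.58 and 0.16); the function name is hypothetical.

```python
def post_test_probability(prior, likelihood_ratio):
    """Update a prior probability of disease with a likelihood ratio (odds form of Bayes' theorem)."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Likelihood ratios reported for the rK39 ICT in east Africa.
LR_POSITIVE, LR_NEGATIVE = 9.58, 0.16

for prior in (0.40, 0.60):
    after_positive = post_test_probability(prior, LR_POSITIVE)
    after_negative = post_test_probability(prior, LR_NEGATIVE)
    print(
        f"prior {prior:.0%}: probability of VL after a positive test {after_positive:.0%}, "
        f"after a negative test {after_negative:.0%}"
    )
```

The probability of VL after a negative result is the complement of the negative predictive value. At a prior probability of 60% it stays close to 19%, which illustrates why a negative rK39 ICT result in east Africa does not rule out VL.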

CI: confidence interval

1. The rK39 immunochromatographic test must be used in combination with a clinical case definition (fever and splenomegaly for more than two weeks and no previous history of visceral leishmaniasis). Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The results of the meta‐analysis showed considerable heterogeneity, which was partly explained by the geographic region.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

6. A narrative explanation of the predictive values is given in Appendix 3.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.

Summary of findings 3. Latex agglutination test in urine for the diagnosis of visceral leishmaniasis

Population: Patients suspected to have visceral leishmaniasis disease [1]

Setting: Health services in endemic areas [2]

New test: Latex agglutination test in urine [3]

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis [4]

Pooled sensitivity: 0.64 (95% CI 0.41 to 0.86) | Pooled specificity: 0.93 (95% CI 0.77 to 0.99) [5]

Setting [6] | Positive predictive value | Negative predictive value
Peripheral health centre with a prior probability of disease of 40% | 86% | 79%
Referral centre with a prior probability of disease of 60% | 93% | 63%

Number of participants (studies): 1374 (6 studies)

Quality of the evidence (QUADAS‐2) [7]:

Risk of bias: none of the studies had a low risk of bias in all domains. One study had a high risk of bias (domain of flow and timing). Five studies had an unclear risk of bias (domain of reference standard).

Applicability: low concerns in all studies and in all domains.

CI: confidence interval

1. Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The studies included in this review were conducted in Ethiopia, Kenya, Sudan, India and Nepal.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. The results of the meta‐analysis showed considerable heterogeneity.

6. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.
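The table for the latex agglutination test gives no narrative interpretation, so it can help to express the pooled estimates as expected counts in a hypothetical group of clinical suspects. The sketch below assumes a group of 1000 patients and the rounded pooled estimates (sensitivity 0.64, specificity 0.93); the function name and the group size are illustrative choices, not part of the review.

```python
def expected_counts(sensitivity, specificity, prior, n=1000):
    """Expected 2x2 table for n clinical suspects, rounded to whole patients."""
    with_vl = prior * n
    without_vl = n - with_vl
    true_pos = sensitivity * with_vl
    true_neg = specificity * without_vl
    return {
        "true positives": round(true_pos),
        "false negatives": round(with_vl - true_pos),
        "false positives": round(without_vl - true_neg),
        "true negatives": round(true_neg),
    }


# Rounded pooled estimates for the latex agglutination test in urine.
for setting, prior in [("peripheral health centre", 0.40), ("referral centre", 0.60)]:
    counts = expected_counts(0.64, 0.93, prior)
    npv = counts["true negatives"] / (counts["true negatives"] + counts["false negatives"])
    print(f"{setting}: {counts}, NPV {npv:.0%}")
```

Under these assumptions, 144 of the 400 patients with VL in the peripheral health centre scenario would have a negative test result, which is consistent with the low negative predictive values in the table.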

Background

Target condition being diagnosed

Visceral leishmaniasis (VL), also known as kala‐azar, is a life‐threatening systemic disease caused by the obligate intracellular protozoan, Leishmania, and transmitted through the bites of phlebotomine sand flies (Herwaldt 1999). Leishmanial infection can cause a diverse spectrum of diseases, of which VL is the most severe form; it is almost always fatal without adequate, timely treatment. The species that cause VL are Leishmania donovani in Asia and eastern Africa, and L. infantum in Europe, North Africa and Latin America (Boelaert 2000). The geographical distribution of VL is often limited to well‐identified endemic foci, but it has also emerged as epidemics (Seaman 1996) and as one of the opportunistic infections in human immunodeficiency virus (HIV)‐positive patients (Alvar 2008; Pasquau 2005). VL is a neglected disease that affects the poorest and most vulnerable people in rural, remote settings where there is limited access to health care. The estimated incidence is 200,000 to 400,000 cases per year, of which more than 90% are reported from India, Bangladesh, Ethiopia, Sudan, South Sudan, and Brazil (Alvar 2012). The two main transmission modes of VL are anthroponotic and zoonotic. Anthroponotic (human‐to‐human) transmission is predominant in the Indian subcontinent and in eastern Africa, while zoonotic transmission is found in the Mediterranean region and the Americas. The zoonotic transmission mode involves dogs as the main parasite reservoir (Chappuis 2007).

The pathogenesis of VL entails a complex interaction between the characteristics of the parasite and the host (Rittig 2000). The parasite life cycle involves the replication of the promastigote form in the gut of the female sand fly and is completed when she takes a blood meal on a non‐immune host. In the human body, the Leishmania promastigote evolves into an amastigote form (without flagellum), which colonizes the macrophages in the liver, spleen and bone marrow. The control of the infection depends on the characteristics of the host, especially an intact cell‐mediated immunity in the form of a T‐helper type 1 response, involving the secretion of interleukin 12 and interferon‐gamma. The failure of this cell‐mediated immunity in some, but not all, of the infected individuals ultimately leads to the clinical manifestations of VL disease (Murray 2005).

The main clinical manifestations of VL are fever of insidious onset, weakness, loss of appetite, weight loss, abdominal distension due to enlargement of the spleen or the liver or both, lymph node enlargement, and a low blood cell count (pancytopenia). In children, other symptoms include diarrhoea, coughing, abdominal pain, and growth retardation. In any age group, if left untreated, the disease will progress with time, causing debilitation, bleeding, susceptibility to secondary infection and, eventually, death. Anti‐leishmanial treatment is life‐saving but non‐response or relapse can occur. The pentavalent antimonials, sodium stibogluconate and meglumine antimoniate, despite their toxicity, have been the mainstay of VL treatment in many areas for decades. Alternatives include (liposomal) amphotericin B, paromomycin, and miltefosine. Combination therapy is the suggested way forward to increase treatment efficacy, prevent resistance, and reduce treatment duration and cost (van Griensven 2010). Because VL may be fatal, it is important that diagnostic tests for VL do not give false‐negative results. Nor should they give false‐positive results, so that people without VL do not receive the toxic VL treatment.

As stated above, asymptomatic or subclinical infections also occur, and in cross‐sectional surveys in endemic areas, a significant proportion of the healthy individuals present with anti‐leishmanial antibodies. In longitudinal studies, the ratio between incident clinical cases and incident asymptomatic infections ranged from 1:5 in Kenya (Ho 1982) to 1:8 in Brazil (Evans 1992), 1:9 in Ethiopia (Hailu 2009), 1:8.9 in India and Nepal (Ostyn 2011), and 1:11 and 2.6:1 in Sudan (Zijlstra 1994). Furthermore, antibodies (seropositivity) in VL patients tend to persist for many years after cure (De Almeida 2006; Hailu 1990).

Index test(s)

Rapid diagnostic tests (RDTs) are defined as equipment‐free diagnostic devices that do not require highly skilled laboratory staff. The results of an RDT can be read easily within minutes, or at most an hour or two (PATH 2008). Most RDTs work by capturing either an antigen or an antibody on a solid surface and then attaching molecules to them that allow detection by the naked eye. The technology used is mainly immunochromatography with a dipstick or lateral flow format. These immunochromatographic tests (ICTs) are used for VL diagnosis in dipstick or cassette format, using protein isolated from Leishmania sp as the antigen. The recombinant form of the 39 amino‐acid‐sequence from L. chagasi is the most widely used and is known as rK39. Other recombinant antigens such as rK9, rK16, rK26 and rK28 have also been evaluated. Currently, there are several on‐going projects to develop a VL test based on antigen detection, for example, the latex agglutination test detecting a urinary leishmanial antigen.

The rK39 antigen was first used in an enzyme‐linked immunosorbent assay (ELISA) format, but the newer dipstick or strip formats are increasingly being used in the field or away from the main centres. These rK39 ICTs give a rapid result (typically within 10 to 20 minutes) and a binary reading (positive or negative). The test procedure involves adding the patient's blood or serum, together with diluent buffer, to the strip. When present in the blood or serum, specific antibodies bind to the rK39 antigen fixed on the strip, and the result can be read with the naked eye in the test window. There is a limited number of commercially available RDTs for VL, such as the IT‐LEISH® (DiaMed AG, Switzerland – now Biorad, France), Kalazar Detect® (InBios International, USA), Onsite Leishmania Ab Rapid Test (CTK Biotech, USA) as well as a number of prototypes under various formats (Cunningham 2012).

rK39 ICTs cannot be used to diagnose VL in people with a past history of VL due to the persistence of antibodies after cure. In addition, to avoid diagnosis of asymptomatic infections, these tests must be applied in patients suspected to have VL, ie patients with prolonged fever and splenomegaly (WHO 2010).

Clinical pathway

Prior test(s)

The RDTs are meant to be used in patients with a clinical suspicion of VL without previous history of VL. According to the case definition proposed by the World Health Organization (WHO), VL is an illness with prolonged irregular fever, splenomegaly and weight loss as its main symptoms (WHO 2010).

Role of index test(s)

The RDTs were specifically developed for field use in VL‐endemic areas. If they are sufficiently accurate, they could be used for the early diagnosis of VL at both peripheral and central levels. A positive RDT result would then confirm the diagnosis of VL in clinically suspected patients and allow the start of treatment (WHO 2010).

Alternative test(s)

A definite VL diagnosis is provided by the demonstration of the parasite through microscopic visualization in spleen, bone marrow or lymph node aspirates. These invasive procedures are not always feasible in field settings, and using them is by no means free of risk: spleen aspirates are the most sensitive (93% to 99%) but the aspiration carries a rare but potentially fatal risk of bleeding (Zijlstra 1992). Splenic aspiration should only be performed if there is rapid access to blood transfusion in case of bleeding and if there are no contraindications. Clinical contraindications include signs of active bleeding, jaundice, pregnancy, a barely palpable spleen and a bad general condition. Biological contraindications include severe anaemia, a prolonged prothrombin time and a low platelet count (WHO 2010). Bone marrow and lymph node aspiration are safer, but both have lower sensitivity (Sarker 2004; Babiker 2007). Parasitological confirmation through culture or molecular techniques such as polymerase chain reaction (PCR) is possible, but their complexity restricts their use as routine diagnostic methods. In addition, in endemic areas, a substantial proportion of healthy individuals have parasite DNA in the blood, which is detected by PCR (Bhattarai 2009).

Serological tests based on indirect fluorescence antibody, ELISA and Western blot were developed, but their use is limited in the first‐line health services of endemic areas because they require laboratory infrastructure that is too sophisticated. A more useful antibody detection test is the direct agglutination test (DAT), a semi‐quantitative method based on visual agglutination obtained with increasing dilutions of blood or serum mixed with stained, killed parasites in V‐shaped wells (Harith 1986). It has been extensively validated and used in the field. Nonetheless, it cannot be categorized as an RDT because it requires a degree of laboratory skills, facilities and equipment, and the result is ready only after overnight incubation (Boelaert 1999).

Rationale

Correctly diagnosing VL disease is crucial as the signs and symptoms of VL are not specific enough to differentiate the condition from chronic malaria, schistosomiasis or other systemic infections. VL should be suspected in patients presenting with fever and splenomegaly, but confirmation is needed. The classical method, which relies on microscopic smears from tissue aspirates (spleen, bone marrow, lymph node), is unsuitable for settings with limited resources. Other methods such as antibody detection are more feasible, most notably the DAT and the rK39 antigen‐based ICT. Both have been specifically developed in the past two decades for use in such contexts and have shown high diagnostic accuracy in most endemic areas (Boelaert 2004; Chappuis 2006).

The major drawbacks of these serological methods are linked to the antibodies that remain detectable after a cure and those that are due to past or present asymptomatic infection present in a sizeable proportion of residents of endemic areas. The use of a clinical case definition and adequate medical history taking should help to avoid false positives, yet it underscores the need for better VL diagnosis tests that are specific to acute stage disease. Ideally, the test should be highly sensitive, as VL is potentially fatal, and it should also be specific, as presumptive treatment cannot be fully justified with the current treatment regimens (den Boer 2006).

The RDTs for VL, mainly but not exclusively the rK39‐based ICTs, seem to be the current solution for field diagnosis in remote settings: their ease of use, convenience and cost make them potentially advantageous to increase patients' access to VL diagnosis and treatment. The current WHO recommendation for VL case management includes the use of the rK39‐based ICT as the basis for initiating treatment, and it has also been adopted as the diagnostic tool in first‐line services in the VL elimination initiative in the Indian subcontinent (WHO 2005). In other areas, the RDTs have been used as the first test in diagnosis‐treatment algorithms that often incorporate other test(s), such as the DAT or tissue aspiration (Raguenaud 2007; Veeken 2003).

Despite its operational advantages, some regional variability in rK39‐based ICT performance has been observed (Chappuis 2006). Other prototype tests have been proposed as RDTs, but their real value in clinical practice is unclear. Furthermore, in the published literature, many phase II studies (with a sub‐optimal, case‐control design) might overshadow the more limited information of better quality based on phase III studies (including consecutive patients suspected to have VL). Thus, it is of utmost importance to synthesize the available evidence in this rapidly evolving field. Knowing the diagnostic accuracy of these RDTs would help inform policy makers and stakeholders on how best to use them in VL control in diverse settings. The role of RDTs in the different epidemiological contexts could be better defined, giving clearer evidence on their accuracy, and this review aims to contribute to exactly this goal.

Objectives

To determine the diagnostic accuracy of rapid tests for the diagnosis of visceral leishmaniasis in patients with suspected disease presenting at health services in endemic areas.

Secondary objectives

Investigation of sources of heterogeneity

We investigated differences in diagnostic accuracy in relation to test conditions (index and reference: commercial brand of index test, type of reference standard), geographical region (the Indian subcontinent, eastern Africa, Latin America and the Mediterranean region), disease prevalence in the sample, and study quality.

Methods

Criteria for considering studies for this review

Types of studies

Only original diagnostic accuracy studies of the phase III type (Zhou 2002) were included in the review, that is, prospective or retrospective cohort studies (meaning studies with patients who were consecutively enrolled, be it prospectively or retrospectively, and in which all patients were given the index test and reference standard), or studies that enrolled a random selection of patients from a series of patients. Randomized controlled trials in which patients were randomized to one of several index tests and all received the reference test could also be included.

Participants

Patients with a clinical suspicion of VL presenting at health services in endemic areas, that is, patients who have been febrile for more than two weeks and who present with splenomegaly, according to the WHO case definition (WHO 2010).

We excluded studies with participants:

  • who were previously treated for VL (non‐responders or relapsed cases); and

  • who had signs and symptoms of other forms of leishmaniasis, such as post kala‐azar dermal leishmaniasis (PKDL).

Studies in which only a subgroup of participants was eligible for the review were included if it was possible to extract relevant data specific to that subgroup.

Studies of patients with HIV or other co‐infections were eligible for this review.

Index tests

All types of RDTs for VL, with results that are read out within one hour, regardless of the manufacturer. The RDTs could be assessed alone or in comparison with other tests.

Target conditions

The target condition was restricted to current clinical VL, and did not include asymptomatic leishmanial infection or VL in the past. 

Reference standards

We accepted the following reference standards for the diagnosis of active VL (adapted from Boelaert 2007):

  • reference standard including direct smears or culture of splenic aspirate;

  • composite reference standard (Sullivan 2003) based on a combination of several tests. Test algorithms could include smears or culture of tissue aspirate, serology (other than the index test), or clinical arguments; and

  • latent class analysis (LCA) (Zhou 2002) based on one or more of the following: smears or culture of tissue aspirates, serology (other than the index test), clinical response to antimonial treatment, or specific clinical signs (pancytopenia, darkened skin). To be selected, the studies had to assess the conditional independence assumption between the tests, and, if conditional dependence was expected, they had to use appropriate statistical methods (Dendukuri 2001). More information about LCA is given in Appendix 1.

Search methods for identification of studies

We attempted to identify all relevant studies regardless of their publication status (published, unpublished, in press, ongoing). No language or date limits were applied. We limited our searches to studies conducted in humans.

Electronic searches

To identify all relevant studies, we searched the following databases using the search terms and strategy described in Appendix 2: MEDLINE, EMBASE, LILACS, CIDG SR, the Cochrane Central Register of Controlled Trials (CENTRAL) published in The Cochrane Library, ISI Web of Knowledge (Science Citation Index Expanded), Medion, Arif, and CCT.

We also searched the metaRegister of Controlled Trials (mRCT) and the WHO Clinical Trials Registry Platform using "leishman*" and "diagnos* OR RDT*" as search terms.

MeSH and other search terms included: visceral leishmaniasis, kala‐azar, Leishmania donovani, Leishmania infantum, Leishmania chagasi, K39, immunoassay, chromatography, dipstick, immunochromatography, rapid diagnostic tests, RDT, antibody detection, serology.

Searching other resources

We checked the reference lists of the studies identified by the above methods. We also checked conference proceedings and reports from the WHO.

Data collection and analysis

Selection of studies

Two review authors assessed the titles and abstracts identified through the search strategy. All potentially relevant articles were retrieved in full and two authors (MB and TS or MB and KV) independently examined them for inclusion in the present review using an assessment form. Any discrepancies were resolved by discussion with a third author (JvG).

Data extraction and management

We extracted a standard set of data from each study, using a specifically modified data extraction form. If only a subgroup of participants in a study was eligible, we extracted the data for that particular subgroup.

Two authors independently extracted the data (MB and JvG). The discrepancies were resolved by discussion with a third author (KV).

For each study, data were extracted on:

Study ID

First author, year of publication.

Clinical features and settings

Presenting signs and symptoms, clinical setting.

Participants

Sample size, summary statistics of age (mean or median), sex (% male or female), HIV status (% positive, if available), country.

Study design

Whether the sample was consecutive or random.

Were consecutive patients enrolled retrospectively or prospectively?

If the study evaluated more than one RDT, how were tests allocated to individuals, or did each individual receive all tests?

Target condition

Leishmania species, clinical features.

Reference standard

The reference standard test(s) used.

Who performed the reference standard test(s) and where?

How many observers or repeats were used?

How were discrepancies between observers resolved?

If not all patients received the reference tests, how many did not (and what proportion were they of the total)?

If any participant received a different reference test, what reasons were stated for this, and how many participants were involved?

Index test

The commercial name, with batch number, if provided.

Transport and storage conditions.

Details of the test operators.

Index test results and reference standard results

Number of missing, not interpretable or doubtful results (for both index and reference tests).

For each index test (and for each acceptable reference standard, if more than one reference standard was reported), the number of true and false positives, false and true negatives (2 x 2 contingency table); or for studies that used LCA as the reference standard: estimates and 95% confidence intervals (CI) for the prevalence and for the sensitivity and specificity of each index test.

Notes

Anything else of relevance

Assessment of methodological quality

We first assessed the standard signalling questions of the QUADAS‐2 tool (Whiting 2011) and decided to omit one question (“If a threshold was used, was it pre‐specified?”) and not to add any specific signalling questions for this review.

The QUADAS‐2 tool comprises four domains: patient selection, index test, reference standard, and flow and timing. We considered that the risk of bias for a certain domain was low if all the signalling questions for that domain indicated a low risk of bias. If the answer to at least one signalling question for a certain domain was 'no', the risk of bias for that domain was considered to be high. If none of the answers was 'no', but the answer to at least one question was 'unclear', the risk of bias for that domain was considered to be unclear.
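The domain-level rule described above can be sketched in a few lines. This is only an illustration: the function name `domain_risk` and the answer labels `'yes'`, `'no'` and `'unclear'` are assumptions made for this example, with `'yes'` taken to mean that a signalling question indicates a low risk of bias.

```python
def domain_risk(answers):
    """Return the risk of bias for one QUADAS-2 domain.

    `answers` holds the answers to that domain's signalling questions,
    each 'yes', 'no' or 'unclear' (labels assumed for this sketch).
    """
    normalized = [a.strip().lower() for a in answers]
    # At least one 'no' makes the domain high risk.
    if any(a == "no" for a in normalized):
        return "high"
    # No 'no', but at least one 'unclear', makes the domain unclear.
    if any(a == "unclear" for a in normalized):
        return "unclear"
    # All questions answered 'yes': low risk.
    return "low"


# Hypothetical answers for illustration only.
print(domain_risk(["yes", "unclear", "yes"]))
```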

Two review authors (JvG and MB) then independently applied the tool to the included records. For each record, they assessed risk of bias and concerns about applicability. Several of the original studies included in this review were conducted by the review team. In order to ensure an objective assessment of all the included records, the judgements about eligibility, bias and applicability were entirely based on the published documents and not on unpublished background information. In addition, two of the three review authors who made the judgements about bias and applicability (JvG and KV) had not been involved in any of the original studies. In case of discrepant judgements, KV took the final decision.

Statistical analysis and data synthesis

The aim of the analysis was to estimate the diagnostic accuracy, that is, the sensitivity (Se) and specificity (Sp), of the index tests (RDTs for VL). We analysed each RDT separately and did not plan to make comparisons between different index tests.
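As a reminder of how Se and Sp follow from a 2 x 2 table, the sketch below computes both from the four cell counts. The counts are hypothetical, and the Wilson score interval is an assumption chosen for this illustration; the review itself reports the intervals given in the source publications.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (illustrative choice)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity from the cells of a 2 x 2 table."""
    return {
        "Se": tp / (tp + fn),          # true positives among diseased
        "Se_ci": wilson_interval(tp, tp + fn),
        "Sp": tn / (tn + fp),          # true negatives among non-diseased
        "Sp_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts, not taken from any included study.
result = accuracy_from_2x2(tp=90, fp=5, fn=10, tn=95)
print(result["Se"], result["Sp"])
```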

The analysis was performed using R (R Development Team 2010) and WinBugs (Spiegelhalter 2003) because of these programs’ flexibility in model fitting, in line with recommendations in the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy ‐ Chapter 10 Analysing and Presenting Results (Macaskill 2010). More technical detail on the statistical methods can be found in Appendix 1 and Menten 2013.

1) Outcome data

The statistical analysis was based on summary data as provided in the source publications: either the number of true and false positive or negative test results compared to the reference test, or the Sensitivity (Se) and Specificity (Sp) of the index tests and 95% CI or credible intervals obtained from latent class analysis.

For those studies that reported data using several reference standards (Table 1), we extracted and summarized all 2 x 2 tables or Se and Sp estimates. The main analysis used a single 2 x 2 table or set of Se and Sp estimates for each study. The primary data for each study were selected based on a predefined ranking of reference standards: (1) latent class analysis, (2) parasitology and serology, (3) parasitology including spleen aspirate without serology, and (4) parasitology not including spleen aspirate without serology. The influence of the possible selection of alternative data sets was explored in sensitivity analyses.
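The ranking rule can be expressed as a simple lookup. The sketch below is illustrative only: the dictionary `REFERENCE_RANK`, the function `split_primary_secondary` and the record layout are hypothetical names introduced here, and only the ranking itself comes from the text.

```python
# Predefined ranking of reference standards (1 = preferred for primary analysis).
REFERENCE_RANK = {
    "latent class analysis": 1,
    "parasitology and serology": 2,
    "parasitology including spleen aspirate, no serology": 3,
    "parasitology not including spleen aspirate, no serology": 4,
}


def split_primary_secondary(estimates):
    """Return (primary, others) for one study's sets of estimates.

    Each estimate is a dict with at least a 'reference' key naming
    one of the categories in REFERENCE_RANK.
    """
    ordered = sorted(estimates, key=lambda e: REFERENCE_RANK[e["reference"]])
    return ordered[0], ordered[1:]


# The two sets of estimates reported for one study, labelled as in Table 1.
veeken = [
    {"label": "Veeken 2003 - spleen",
     "reference": "parasitology including spleen aspirate, no serology"},
    {"label": "Veeken 2003 - composite",
     "reference": "parasitology and serology"},
]
primary, secondary = split_primary_secondary(veeken)
print(primary["label"])
```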

Table 1. Studies on rK39 ICT presenting two sets of Se and Sp estimates based on different reference standards: estimates included in primary analysis and sensitivity analysis

| Study | Selected for primary analysis | Selected for sensitivity analysis |
| --- | --- | --- |
| Boelaert 2004 ‐ LCA & Boelaert 2004 ‐ classic | LCA | Parasitology including spleen aspirate – no serology |
| Boelaert 2008 ‐ Ethiopia & Diro 2007 | LCA | Parasitology including spleen aspirate – no serology |
| Boelaert 2008 ‐ India & Sundar 2007 | LCA | Parasitology and serology |
| Cota 2013 ‐ composite 1 & Cota 2013 ‐ composite 2 * | Parasitology and serology | Parasitology not including spleen aspirate – no serology |
| Machado de Assis 2012 & Machado de Assis 2011 | LCA | Parasitology not including spleen aspirate – no serology |
| Veeken 2003 ‐ composite & Veeken 2003 ‐ spleen | Parasitology and serology | Parasitology including spleen aspirate – no serology |

* Cota 2013 describes HIV‐infected patients only and is not included in the formal meta‐analysis.

2) Basic data presentation

The basic study results were summarized using coupled forest plots, presenting the crude Se and Sp estimates and 95% CIs as presented in the original publications. A summary receiver operating characteristic (ROC) plot was presented to assess the need for a ROC‐based analysis rather than the bivariate logistic normal model, planned and described below.

3) Basic statistical model

The primary analysis was performed using the bivariate model with a hierarchical approach (Reitsma 2005). The bivariate, rather than a ROC‐based, approach was chosen based on the prior experience of the author team in a VL diagnostic meta‐analysis (Chappuis 2006), where no threshold effects were observed. In addition, the rapid tests report the results as positive or negative, which implies a common cut‐off point for test positivity and which indicates, on prior grounds, that the bivariate model is the most appropriate. We determined the most appropriate link function for the bivariate model using the deviance information criterion (DIC, Verde 2010) and graphical assessment of model fit. Based on this analysis, we selected the complementary log‐log (cloglog) function (see Appendix 1) as providing the best and most parsimonious fit to the data. All results are presented on the probability scale allowing direct comparison with results obtained using the logit link function.
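For readers less familiar with the complementary log‐log link, the sketch below shows the transform, its inverse, and the logit link for comparison. It is a generic illustration of the two link functions, not the code used in the review (which is given in Appendix 1).

```python
import math


def cloglog(p):
    """Complementary log-log transform of a probability in (0, 1)."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(x):
    """Map a value on the cloglog scale back to the probability scale."""
    return 1.0 - math.exp(-math.exp(x))


def logit(p):
    """Logit transform, shown for comparison."""
    return math.log(p / (1.0 - p))


# Unlike the logit, the cloglog is asymmetric around p = 0.5.
for p in (0.5, 0.9, 0.99):
    print(p, round(cloglog(p), 3), round(logit(p), 3))
```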

For the studies estimating Se and Sp compared to a reference standard, at the lower level the cell counts in the 2 x 2 tables extracted from each study were modelled using binomial distributions (Macaskill 2010). 

For the studies estimating Se and Sp through LCA, the cloglog transforms of the Se and Sp and their standard errors were derived from the estimates and CIs in the source publications. They were then entered in the lower level of the hierarchical model using the normal distribution.
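One simple way to obtain a transformed estimate and an approximate standard error from a published estimate and its 95% interval is sketched below. It assumes the interval is roughly symmetric on the cloglog scale, which is an assumption of this illustration; the exact derivation used in the review is described in Appendix 1. The function name and the input values are hypothetical.

```python
import math


def cloglog(p):
    """Complementary log-log transform of a probability in (0, 1)."""
    return math.log(-math.log(1.0 - p))


def cloglog_estimate_and_se(estimate, lower, upper, z=1.96):
    """Transform an estimate and its interval to the cloglog scale.

    Assumes the interval is approximately symmetric after transformation,
    so that its width equals 2 * z * SE.
    """
    for value in (estimate, lower, upper):
        if not 0.0 < value < 1.0:
            raise ValueError("values must lie strictly between 0 and 1")
    centre = cloglog(estimate)
    se = (cloglog(upper) - cloglog(lower)) / (2.0 * z)
    return centre, se


# Hypothetical LCA estimate: Se 0.90 with interval 0.84 to 0.95.
centre, se = cloglog_estimate_and_se(0.90, 0.84, 0.95)
print(centre, se)
```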

The Se and Sp, irrespective of their source, were, in the higher level of the hierarchical model, presumed to come from the same bivariate normal, that is, the model proposed corresponds to a hybrid approach of the bivariate model of Reitsma (Reitsma 2005) and standard random effects meta‐analyses (Whitehead 2002). This model was estimated using Markov Chain Monte Carlo (MCMC) methods as implemented by WinBugs with uniform ]0,1[ priors for the index test Se and Sp. The WinBugs code for the model can be found in Appendix 1.
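The two‐level structure described above can be written out schematically as follows. This is a sketch of the general form only: the symbols g, μ, σ, ρ and s are notation introduced here for illustration, and the model as actually fitted, including the prior specification, is the one given in Appendix 1.

```latex
% Schematic only; notation introduced for this sketch.
\begin{align*}
g(p) &= \log\{-\log(1 - p)\} && \text{(cloglog link)} \\
% Lower level: studies reporting 2 x 2 tables
TP_i &\sim \mathrm{Binomial}(TP_i + FN_i,\ Se_i), &
TN_i &\sim \mathrm{Binomial}(TN_i + FP_i,\ Sp_i) \\
% Lower level: studies reporting LCA estimates with standard errors s
g(\widehat{Se}_i) &\sim \mathrm{N}\bigl(g(Se_i),\ s_{Se,i}^2\bigr), &
g(\widehat{Sp}_i) &\sim \mathrm{N}\bigl(g(Sp_i),\ s_{Sp,i}^2\bigr) \\
% Higher level: all studies share one bivariate normal distribution
\begin{pmatrix} g(Se_i) \\ g(Sp_i) \end{pmatrix}
&\sim \mathrm{N}_2\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix}
\sigma_{Se}^2 & \rho\,\sigma_{Se}\sigma_{Sp} \\
\rho\,\sigma_{Se}\sigma_{Sp} & \sigma_{Sp}^2
\end{pmatrix}
\right)
\end{align*}
```

The summary Se and Sp reported on the probability scale correspond to the inverse transforms of the two means.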

4) Model summary

Results from the models were provided as point estimates of test Se and Sp and joint 95% CI and prediction regions. The confidence regions and prediction regions were plotted on the Se versus 1‐Sp diagram (that is, the ROC space), together with the crude or model‐based point estimates of Se and Sp, or both, for each study. In tables, we presented marginal CIs obtained from these confidence regions.

Investigations of heterogeneity

The amount of between‐study heterogeneity was graphically assessed in coupled forest plots.

To explain the between‐study variability, we assessed the following possible sources of heterogeneity:

  • geographical area: Indian subcontinent, eastern Africa, Latin America, Mediterranean region (note: this is related to the Leishmania species diversity and distribution: L. donovani complex in Asia and eastern Africa, and L. infantum in the Mediterranean region and Latin America);

  • commercial brand or type of test: categories determined based on extracted data;

  • type of reference standard: (a) parasitology including spleen aspirate – no serology; (b) parasitology not including spleen aspirate – no serology; (c) parasitology and serology; (d) latent class analysis;

  • disease prevalence (in the sample): below versus equal to or above the median (65%) across all studies; and

  • quality assessment: a first indicator of quality was based on the QUADAS‐2 assessment. Here, we used an overall, study‐level assessment of the risk of bias. If the risk of bias in a study was low in all four QUADAS‐2 domains, we defined the overall risk of bias in that study as 'low'. If the risk of bias was high in at least one of the QUADAS‐2 domains, we defined the overall risk of bias as 'high'. In all other cases, the overall risk of bias was labelled 'uncertain'. A second indicator of quality was the size of the study (below versus equal to or above median [250]), because in order to ensure reasonably precise estimates of sensitivity and specificity, investigators are expected to consider sample size issues during the planning of the study (Bachmann 2006; Peeling 2007). Larger studies may have been subjected to closer scrutiny and may have recruited a more representative sample of the clinical suspect population. Yet, the precision of Se and Sp estimates in original studies does not only depend on sample size, but also on disease prevalence (Deeks 2005).
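The study‐level risk of bias rule and the median splits used for prevalence and study size can be sketched as follows. The function names and the example values are hypothetical; the cut‐offs (65% for prevalence, 250 for study size) are those stated in the list above.

```python
def overall_risk(domain_ratings):
    """Combine the four QUADAS-2 domain ratings into one study-level rating."""
    ratings = list(domain_ratings.values())
    if any(r == "high" for r in ratings):
        return "high"
    if all(r == "low" for r in ratings):
        return "low"
    return "uncertain"


def at_or_above(value, cutoff):
    """Dichotomize a covariate: below versus equal to or above the cut-off."""
    return "at or above" if value >= cutoff else "below"


# Hypothetical study, for illustration only.
ratings = {
    "patient selection": "low",
    "index test": "unclear",
    "reference standard": "low",
    "flow and timing": "low",
}
print(overall_risk(ratings))
print(at_or_above(0.70, 0.65))   # prevalence against the 65% median
print(at_or_above(180, 250))     # study size against the median of 250
```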

We informally assessed the influence of these factors by presenting summary Sp and Se for subgroups of studies (categories of covariate). In a formal analysis, these sources were then individually added as fixed‐effect predictors of Se and Sp in the statistical model described above. We assessed the estimates and 95% CI of the effects of each of the predictors on Se and Sp. If warranted by the results of these analyses and the number of studies available, we planned to consider developing this model further using multiple predictors or assuming separate bivariate normal distributions in different subgroups of studies.

Sensitivity analyses

Analysis using secondary reference standards

As a sensitivity analysis to the choice of the set of estimates of a given study, we made an alternative selection of the reference standard for those studies presenting results for two reference standards (Table 1).

Analysis allowing for imperfect reference standards

The model described above estimated the Se and Sp by comparison to a reference standard presuming that this reference standard was perfect, that is, that its Se and Sp were both equal to 1. However, in VL, it can be expected that reference standards have less than perfect Se or Sp, or both.

To allow for possible imperfect reference standards, we used the following approach:

  • during data extraction, each study was classified according to the type of reference standard: (a) parasitology including spleen aspirate – no serology; (b) parasitology not including spleen aspirate – no serology; (c) parasitology and serology; or (d) latent class analysis;

  • for these reference standards, expert opinion for the Se and Sp was elicited from seven experts (three of the authors FC, MB and SR, and four from the WHO technical expert panel on leishmaniasis);

  • the expert opinion on the reference test Se and Sp was then entered as priors, using the beta distribution, in an extension of the bivariate model described above as a Bayesian sensitivity analysis (Greenland 2006).

The bivariate model was extended using a multinomial distribution for the cell counts in the 2 x 2 tables, as in the latent class analysis of diagnostic tests (Black 2002). The Se and Sp of the index test were modelled through a bivariate normal distribution as in the basic model described above. The Se and Sp of the reference tests were assumed to be equal across studies using the same reference standard, with informative priors for the Se and Sp derived from expert opinion as described above. Prevalences were estimated for each study separately by assigning a uniform ]0,1[ prior to the study‐specific prevalences. Data from studies that performed LCA in the source publication were included as in the basic model, as they already allowed for uncertainty in the true disease status in the source publication. The model was fitted using MCMC methods. Care was taken in assessing the fit and identifiability of the model using posterior predictive checks (Gelman 1995). It could be expected that the model would remain identifiable through the use of informative priors on the Se and Sp of the reference test and the assumption that the Se and Sp of the index test across studies arose from the same normal bivariate distribution (Enoe 2000), but, if needed, further constraints on the model parameters could be added (for example, by constraining the Sp of some of the reference standards to 1).
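To illustrate why an imperfect reference standard matters, the sketch below computes the four multinomial cell probabilities of a 2 x 2 table from the prevalence and the Se and Sp of both tests. It assumes that the index and reference tests are conditionally independent given true disease status, which is an assumption of this example and not necessarily of the model in Appendix 1. All names and numerical values are hypothetical.

```python
def cell_probabilities(prev, se_index, sp_index, se_ref, sp_ref):
    """Cell probabilities (index result, reference result) of a 2 x 2 table.

    Assumes conditional independence of the two tests given disease status.
    """
    d, nd = prev, 1.0 - prev
    return {
        ("+", "+"): d * se_index * se_ref + nd * (1 - sp_index) * (1 - sp_ref),
        ("+", "-"): d * se_index * (1 - se_ref) + nd * (1 - sp_index) * sp_ref,
        ("-", "+"): d * (1 - se_index) * se_ref + nd * sp_index * (1 - sp_ref),
        ("-", "-"): d * (1 - se_index) * (1 - se_ref) + nd * sp_index * sp_ref,
    }


def apparent_sensitivity(cells):
    """Sensitivity that would be observed if the reference were taken as truth."""
    return cells[("+", "+")] / (cells[("+", "+")] + cells[("-", "+")])


# Hypothetical values: true index Se and Sp of 0.92, prevalence 0.65.
perfect = cell_probabilities(0.65, 0.92, 0.92, se_ref=1.0, sp_ref=1.0)
imperfect = cell_probabilities(0.65, 0.92, 0.92, se_ref=0.90, sp_ref=0.95)
print(apparent_sensitivity(perfect), apparent_sensitivity(imperfect))
```

With a perfect reference standard the apparent sensitivity equals the true value; with the imperfect one in this example it is lower, because some reference‐positive patients do not have the disease.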

The results from this model were compared with those obtained in the primary analysis, described above. We also contrasted the assumptions of our model against the assumption of perfect ascertainment underlying the primary analysis approach. The performance of the model was evaluated through simulation studies. Details are given in Appendix 1 and Menten 2013.

Assessment of structure of hierarchical model

We compared the primary results with those obtained using the standard logit link function for the hierarchical model. In addition, we relaxed the assumption of normally distributed random effects by using the t‐distribution instead of the normal distribution.

Assessment of reporting bias

It is well known that reporting or publication bias in studies of diagnostic test accuracy may be difficult to detect and that formal tests of funnel plot asymmetry are biased (Macaskill 2010). We did not formally test for publication bias but explored the relationship between the logit Se, logit Sp, and log diagnostic odds ratio and the effective sample size (Deeks 2005) using funnel plots. However, the presence or absence of funnel plot asymmetry was not interpreted as definite proof of the presence or absence of publication bias.
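The quantities plotted in such funnel plots can be computed as sketched below. The effective sample size follows the usual definition attributed to Deeks 2005, 4·n1·n2/(n1 + n2), where n1 and n2 are the numbers of diseased and non‐diseased participants. The 0.5 continuity correction for zero cells is an assumption of this sketch, and the counts are hypothetical.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """Effective sample size: 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio; adds `correction` to all cells if any is zero."""
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


# Hypothetical study: 100 diseased and 100 non-diseased participants.
ess = effective_sample_size(100, 100)
x_axis = 1.0 / math.sqrt(ess)   # commonly plotted against the log DOR
print(ess, x_axis, log_dor(tp=90, fp=10, fn=10, tn=90))
```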

Results

Results of the search

The date of the search was 3 December 2013. The search identified 1758 records, each record corresponding to a published article (Figure 1). After screening titles and abstracts, 1648 irrelevant records were removed. We were unable to obtain the full‐text article of one record and excluded one other article because it was written in Chinese and no translators were available. We retrieved the full text of the remaining 108 records and excluded another 87 for the following reasons: publication of the same study in more than one record (we kept the record that was published most recently and excluded the others; n = 4); no original research (n = 6); index test not a rapid test (n = 1); target condition not clinical visceral leishmaniasis (n = 5); not a phase III diagnostic accuracy study (n = 66); and reference standard not according to criteria (n = 5; Figure 1 and Characteristics of excluded studies). The remaining 21 records were included in this review.


Study flow diagram showing the process of selection of records and studies for the review and for the meta‐analyses


Several records reported multi‐country, multicentre or stratified data. If a ‘study’ is defined as a phase III design leading to a sensitivity and specificity estimate of one or more RDTs in a specific patient population, then three of the 21 included records contained results of more than one unique study. One article described RDT performance in patient populations in five countries (Boelaert 2008, henceforth distinguished as Boelaert 2008 ‐ Ethiopia; Boelaert 2008 ‐ India; Boelaert 2008 ‐ Kenya; Boelaert 2008 ‐ Nepal; Boelaert 2008 ‐ Sudan); Ter Horst et al. reported data separately for HIV‐positive and HIV‐negative patient populations (ter Horst 2009 ‐ HIV neg; ter Horst 2009 ‐ HIV pos); and Diro 2007 included data from clinically suspected patients presenting at the study site and from people identified through active case finding in the community. Furthermore, in one record, two different brands of the rK39 immunochromatographic test (ICT) were evaluated in the same population (Chappuis 2005 ‐ DiaMed; Chappuis 2005 ‐ InBios), and we treated this record as two different studies. On the other hand, three patient populations were presented in more than one record. Boelaert 2008 ‐ Ethiopia re‐analysed the same patient population data as Diro 2007, but using a different reference standard. Similarly, Boelaert 2008 ‐ India described the same patient population as Sundar 2007 but again, analysed it with a different reference standard. This also occurred in Machado de Assis 2011 and Machado de Assis 2012. We have treated these records as one single study in each country with a primary and a secondary analysis (see below).  Altogether, the 21 included records contained information about 25 unique studies.  We excluded the study based on active case finding (part of Diro 2007) from the review as it did not comply with the eligibility criteria of our protocol and, therefore, we finally included 24 studies.

These 24 studies included 4271 participants of whom 2606 were classified as having VL. Four studies used a reference standard including direct smears or culture of splenic aspirate, or both, 11 used a composite reference standard, three used latent class analysis, and six presented two sets of accuracy estimates using two different reference standard categories (Table 1).

The 24 studies contained information about five index tests: the rK39 ICT, the latex agglutination test in urine, the FAST agglutination test, an rK26 ICT and an rKE16 ICT. Six studies assessed the accuracy of more than one type of index test in the same patient population: four studies evaluated the rK39 ICT and the latex agglutination test; one study evaluated the rK39 and the rKE16 ICT; and one study evaluated the rK39 ICT, the rK26 ICT, and the latex agglutination test (Sundar 2007). Overall, the rK39 ICT test was evaluated in 20 studies including a total of 3806 participants of whom 2370 had VL. The latex agglutination test was evaluated in seven studies, which corresponds to 1459 participants including 873 people with VL. The FAST agglutination screening test was evaluated in two studies with a total of 148 participants including 69 with VL. The rK26 ICT test was assessed in one study with 352 participants of whom 282 had VL, and the rKE16 test was evaluated in one study with 219 participants of whom 131 had VL.

For the meta‐analysis of the accuracy of the rK39 ICT test, 18 out of 20 studies were included (Table 2). At this stage, we excluded two studies because they only described HIV‐positive patients (ter Horst 2009 ‐ HIV pos and Cota 2013). In all the other studies using the rK39 ICT test, the included patients were HIV negative or the HIV status was unknown but considered of no importance in the study population. Six out of the 18 included studies generated more than one set of sensitivity and specificity estimates by using different reference standards in a primary and a secondary analysis. We selected one of the two estimates for the primary meta‐analysis and explored the effects of this choice in the sensitivity analysis (Table 1).

Table 2. Overall Analysis Summary

| Index test and subgroup | Number of studies | Se, % (95% CI) | Sp, % (95% CI) |
| --- | --- | --- | --- |
| rK39 immunochromatographic test: overall | 18 | 91.9 (84.8 to 96.5) | 92.4 (85.6 to 96.8) |
| rK39 immunochromatographic test: Indian subcontinent | 6 | 97.0 (90.0 to 99.5) | 90.2 (76.1 to 97.7) |
| rK39 immunochromatographic test: East Africa | 9 | 85.3 (74.5 to 93.2) | 91.1 (80.3 to 97.3) |
| Latex agglutination test: overall | 6 | 63.6 (40.9 to 85.6) | 92.9 (76.7 to 99.2) |

For the meta‐analysis of the accuracy of the latex agglutination test, we included six studies (Table 2). We excluded one study because it described mostly HIV‐positive patients (Vilaplana 2004).

Methodological quality of included studies

The QUADAS‐2 tool was used to assess the quality of the 21 included records in terms of the risk of bias and of concerns about applicability. Figure 2 summarizes the overall methodological quality and Figure 3 gives the ratings for each of the included records. In the domain of patient selection, we had high concerns about the risk of bias and the applicability of one record (Veeken 2003 ‐ composite) because the inclusion of participants was conditional on the availability of stored serum samples and of diagnostic information from another serological test that could have correlated with the index test. In another record (Cota 2013), we had high concerns about the applicability to the review question because a considerable proportion of the study population (21%) had a previous history of VL. In the domains of the index test and the reference standard, the risk of bias was unclear for 14 of the 21 included records. The most frequent underlying reason was that the publication did not report whether the index test results were interpreted without knowledge of the reference standard and vice versa. We considered that there was a high risk of bias in those studies that used bone marrow aspirate smear tests and culture with or without clinical information as a reference standard due to the low sensitivity of these techniques (Kilic 2008 and Cota 2013 ‐ composite 2). In the domain of flow and timing, we considered that there was a high risk of bias in five records, because not all patients were included in the analysis (Rijal 2004; ter Horst 2009; Veeken 2003; and Peruhype‐Magalhaes 2012) or because not all patients received the same reference standard (Sundar 1998).



Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Footnote

Figure 2 represents studies but our quality assessment was done per record. Some of the records include more than one study. Figure 2 shows all the estimates used in this review, including the estimates for primary and sensitivity analyses of the same study.



Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Footnote

Figure 3 represents studies but our quality assessment was done per record. Some of the records include more than one study. Figure 3 shows all the estimates used in this review, including the estimates for primary and sensitivity analyses of the same study.

Findings

rK39‐based rapid diagnostic tests

Data summary

Twenty studies reported Se and Sp estimates for the rK39 ICT. Six of these studies gave two sets of Se and Sp estimates, based on alternative reference standards (Table 1). A forest plot of the 26 available Se/Sp estimates is given in Figure 4. Figure 5 presents the available Se/Sp pairs according to the reference standard used; the results referring to the same data analysed with different reference standards are connected with a line.



rK39 ICT: forest plot of all the available estimates of sensitivity and specificity (n = 20 studies; 26 sets of estimates)

Footnote

For studies that use latent class analysis, the counts of true positive, false positive, false negative and true negative results are imputed from the Se and Sp estimates and the overall sample size. The estimates and confidence intervals are subsequently calculated from these imputed values.
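A possible way to carry out such an imputation is sketched below. It additionally uses the prevalence estimated by the LCA to split the sample into diseased and non‐diseased participants; that step, the rounding, and the function name are assumptions of this illustration, since the footnote does not spell out the procedure. The input values are hypothetical.

```python
def impute_counts(n_total, prevalence, se, sp):
    """Impute 2 x 2 cell counts from LCA estimates (illustrative only)."""
    n_diseased = round(n_total * prevalence)
    n_non_diseased = n_total - n_diseased
    tp = round(n_diseased * se)
    tn = round(n_non_diseased * sp)
    return {
        "TP": tp,
        "FN": n_diseased - tp,       # remaining diseased participants
        "TN": tn,
        "FP": n_non_diseased - tn,   # remaining non-diseased participants
    }


# Hypothetical LCA output: 200 participants, prevalence 0.5, Se 0.9, Sp 0.8.
counts = impute_counts(200, 0.5, 0.9, 0.8)
print(counts)
```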


rK39 ICT: summary of sensitivity‐specificity pairs according to the reference standard (n = 20 studies; 26 sets of estimates). The type of reference standard is classified as: (a) parasitology including spleen aspirate – no serology; (b) parasitology not including spleen aspirate – no serology; (c) parasitology and serology; or (d) latent class analysis. The data points that are connected with a line refer to the same data analysed with different reference standards.


Meta‐analysis

A formal meta‐analysis was performed on 18 of the 26 available data points: studies including mainly HIV‐infected patients were reported separately (ter Horst 2009 ‐ HIV pos, Cota 2013 ‐ composite 1 and Cota 2013 ‐ composite 2) and, for the five remaining studies with multiple Se and Sp estimates, we used one set of estimates for the primary analysis (Table 1). Combining data from all studies without accounting for possible covariates explaining heterogeneity and using the bivariate model (Reitsma 2005), the average Se was 91.9% (95% CI 84.8 to 96.5) and the average Sp was 92.4% (95% CI 85.6 to 96.8).

The 95% prediction interval (PI) for the diagnostic accuracy that would be observed in a new study was 52.3% to 100% for the Se and 54.9% to 100% for the Sp (Figure 6).


rK39 ICT: basic summary using the bivariate normal model with a complementary log‐log link function. This analysis combines data from all studies without accounting for covariates. The average sensitivity is 91.9% (95% confidence interval: 84.8 to 96.5) and the average specificity is 92.4% (95% confidence interval: 85.6 to 96.8). The confidence region is indicated with a full line, the prediction region with a dotted line.


Of note, in contrast to what is presumed in the HSROC model, the estimated correlation between the transformed Se and Sp was positive: 0.16 (95% CI: ‐0.40 to 0.65). Given this observation, we report only the results from the bivariate model with the complementary log‐log link, and not those from the ROC‐based analysis.
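For readers unfamiliar with the link functions involved, the following sketch shows the standard definitions of the complementary log‐log and logit transformations that map a proportion onto the real line, where the bivariate normal random effects are assumed. It is a generic illustration and does not reproduce the authors' model code; the function names are chosen here for illustration.

```python
import math

def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def inv_cloglog(x):
    """Inverse of the complementary log-log transform."""
    return 1.0 - math.exp(-math.exp(x))

def logit(p):
    """Logit transform: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Pooled estimates reported above, expressed on each transformed scale
for label, p in (("Se", 0.919), ("Sp", 0.924)):
    print(label, round(cloglog(p), 3), round(logit(p), 3))

# The complementary log-log link is asymmetric around p = 0.5, unlike the logit
print(cloglog(0.5), logit(0.5))
```

The asymmetry of the complementary log‐log link is one reason it can fit proportions close to 100% differently from the logit link.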

Assessment of heterogeneity

A summary of the heterogeneity assessment through meta‐analysis can be found in Figure 7 and Table 3.


rK39 ICT; summary of the heterogeneity assessment through meta‐analysis: sensitivity and specificity estimates by covariates. Rectangles indicate significant differences.


Table 3. Heterogeneity assessment of the rK39 ICT meta‐analysis

| Covariate | Category | Sensitivity estimate (%) | (95% CI) | Specificity estimate (%) | (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Geographic region | Indian subcontinent | 97.0 | (90.0 to 99.5) | 90.2 | (76.1 to 97.7) |
| Geographic region | East Africa | 85.3 | (74.5 to 93.2) | 91.1 | (80.3 to 97.3) |
| Commercial brand | DiaMed | 86.4 | (70.7 to 95.9) | 94.4 | (80.5 to 99.2) |
| Commercial brand | InBios | 91.3 | (83.8 to 96.3) | 91.2 | (83.1 to 96.3) |
| Commercial brand | Other | 97.8 | (88.7 to 99.9) | 92.5 | (75.5 to 99.0) |
| Disease prevalence in sample | Low (< 65%) | 91.0 | (82.2 to 96.9) | 94.4 | (87.2 to 98.3) |
| Disease prevalence in sample | High (≥ 65%) | 92.4 | (84.5 to 97.4) | 89.8 | (79.7 to 95.8) |
| Study size | Small (< 250) | 91.5 | (82.7 to 96.7) | 89.8 | (80.6 to 95.6) |
| Study size | Large (≥ 250) | 92.3 | (83.4 to 97.4) | 94.5 | (87.8 to 98.3) |
| QUADAS‐2: risk of bias | Low | 89.5 | (69.5 to 98.3) | 96.0 | (82.8 to 99.7) |
| QUADAS‐2: risk of bias | Unclear | 91.9 | (83.7 to 97.1) | 92.0 | (84.0 to 96.7) |
| QUADAS‐2: risk of bias | High | 93.0 | (77.9 to 99.1) | 87.8 | (70.7 to 97.2) |
| Reference standard | Parasitology ‐ no serology * | 95.6 | (84.3 to 99.6) | 89.1 | (72.4 to 97.4) |
| Reference standard | Parasitology and serology | 89.6 | (78.6 to 96.7) | 94.6 | (85.2 to 98.7) |
| Reference standard | Latent class analysis | 91.0 | (79.8 to 97.2) | 91.5 | (80.7 to 97.3) |

* The categories "parasitology including spleen aspirate ‐ no serology" and "parasitology not including spleen aspirate ‐ no serology" were combined because the latter category contained only one study.

Geographic region

We assessed the difference in diagnostic accuracy between East Africa and the Indian subcontinent. There were nine data points from East Africa (two from Ethiopia, two from Kenya, three from Sudan, and two from Uganda), and six from the Indian subcontinent (two from India and four from Nepal). Three studies from other regions (two from Brazil and one from Turkey) were not considered in this analysis. The Se was significantly lower in East Africa (85.3%; 95% CI: 74.5 to 93.2) than in the Indian subcontinent (97.0%; 95% CI: 90.0 to 99.5). There was no significant difference in Sp: the Sp (95% CI) in East Africa was 91.1% (80.3 to 97.3) and in the Indian subcontinent 90.2% (76.1 to 97.7) (summary of findings Table 1 and summary of findings Table 2). Confidence and prediction regions by geographic region are given in Figure 8.


rK39 ICT: summary of the meta‐regression model with effects for geographic region using the bivariate normal model with a complementary log‐log link function. The sensitivity was significantly lower in East Africa (85.3%; 95% confidence interval: 74.5 to 93.2) than in the Indian subcontinent (97.0%; 95% confidence interval: 90.0 to 99.5). There was no significant difference in specificity: the specificity (95% confidence interval) in East Africa was 91.1% (80.3 to 97.3) and in the Indian subcontinent 90.2% (76.1 to 97.7). The confidence region is indicated with a full line, the prediction region with a dotted line.


Commercial brand or type of test

The majority of studies used the InBios rK39 ICT test (11 studies); four assessed the DiaMed rK39 ICT test; and three evaluated other tests (DiaMed DUAL‐IT L/M ICT test, Amrad ICT rK39 ICT test, and Arista Biologicals rK39 ICT test). There were no significant differences in diagnostic accuracy between commercial brands.

Disease prevalence

We separated studies by estimated prevalence of VL in the study sample: nine studies reported prevalence rates < 65% and nine reported prevalence rates ≥ 65%. There were no significant differences in diagnostic accuracy between studies with low and high prior probability of disease in the patient sample.

Quality assessment

We used study size and risk of bias according to QUADAS‐2 as indicators of the quality of the primary studies.

Study size

We categorized studies according to the total number of study participants (true cases + non‐cases) into small (< 250) and large (≥ 250) studies. There were 10 small and eight large studies. There were no significant differences in diagnostic accuracy between small and large studies, but there was a trend for larger studies to show a higher Sp estimate.

QUADAS‐2: risk of bias

Studies were categorized according to the risk of bias assessed with QUADAS‐2: low risk (three studies), high risk (four studies), and unclear risk of bias (11 studies). There were no significant differences among these categories, but there was a trend for studies with high risk of bias to show a lower Sp and studies with low risk of bias to show a higher Sp.

Type of reference standard

Studies were categorized according to the reference standard used in the primary publication. There were no significant differences in accuracy among these categories. The influence of the reference standard is explored further in the sensitivity analyses below.

Diagnostic accuracy in HIV‐positive participants

Two studies evaluated the rK39 ICT in HIV‐positive participants and were not included in the meta‐analysis.

The first study reported on the accuracy of the DiaMed rK39 ICT in 71 HIV‐positive participants in Ethiopia (ter Horst 2009 ‐ HIV pos). Based on a composite reference standard that combined direct parasitological examination of spleen aspirate with DAT serology, 65 participants had a diagnosis of VL. In this setting, the sensitivity of the rK39 ICT was 64.6% and the specificity 66.7%.

The second study evaluated the InBios rK39 ICT in 113 HIV‐infected people in Brazil. Data were extracted according to two different reference standards: (1) the decision of an adjudication committee after clinical follow‐up, taking into account all available information (including parasitology and serology) (Cota 2013 ‐ composite 1; VL prevalence: 40.7%); and (2) direct smear and culture of bone marrow aspirate (Cota 2013 ‐ composite 2; VL prevalence: 36.9%). The choice of reference standard had little influence. The Se of the rK39 ICT was low (45.7% with the first and 46.3% with the second reference standard) and the Sp was high (97.0% with the first and 97.1% with the second reference standard).

Sensitivity analyses

We performed sensitivity analyses of the influence of the choice of reference standard and possible biases resulting from imperfect reference standards. Given the impact of geographic region on the diagnostic accuracy of rK39 ICT, we corrected the sensitivity analyses for region effects and limited the analyses to 15 studies performed in the Indian subcontinent and East Africa. An overview of the results of the sensitivity analyses is given in Table 4.

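Table 4. Sensitivity analysis results of the rK39 ICT meta‐analysis

| Analysis | Region | Sensitivity estimate (%) | (95% CI) | Specificity estimate (%) | (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Main Analysis | Indian subcontinent | 97.0 | (90.0 to 99.5) | 90.2 | (76.1 to 97.7) |
| Main Analysis | East Africa | 85.3 | (74.5 to 93.2) | 91.1 | (80.3 to 97.3) |
| Alternative Analysis Set | Indian subcontinent | 96.1 | (88.9 to 99.2) | 86.7 | (69.2 to 99.2) |
| Alternative Analysis Set | East Africa | 85.2 | (75.0 to 92.7) | 90.1 | (77.0 to 97.4) |
| Bayesian analysis allowing for imperfect reference standards: expert priors ‐ using main analysis set | Indian subcontinent | 97.3 | (91.9 to 99.5) | 93.7 | (74.9 to 99.7) |
| Bayesian analysis allowing for imperfect reference standards: expert priors ‐ using main analysis set | East Africa | 86.7 | (77.5 to 94.0) | 95.7 | (84.3 to 99.8) |
| Bayesian analysis allowing for imperfect reference standards: vague priors ‐ using main analysis set | Indian subcontinent | 97.3 | (91.9 to 99.5) | 93.0 | (77.8 to 99.3) |
| Bayesian analysis allowing for imperfect reference standards: vague priors ‐ using main analysis set | East Africa | 86.1 | (77.2 to 93.3) | 94.3 | (83.8 to 99.6) |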

Analysis using secondary reference standards

To assess sensitivity to the choice of the set of estimates for a given study, we made an alternative selection of the reference standard for those studies presenting results for two reference standards (Table 1). The estimated Sp was slightly lower than in the main analysis set: 86.7% versus 90.2% for the Indian subcontinent and 90.1% versus 91.1% for East Africa. This may be related to a lower Se of the reference standards included in this analysis, which would result in an underestimation of the Sp of the index test. The estimated Se was similar in the sensitivity and main analyses.

Analysis allowing for imperfect reference standards

In the studies available for this sensitivity analysis, two types of reference standards were used: parasitology including spleen aspirate – no serology (three studies) and a combined reference standard of parasitology and serology (six studies), in addition to the six studies analysed using latent class analysis. We elicited the opinion of seven VL experts on the diagnostic accuracy of the two reference standards and its possible variation (Figure 9). We used these expert opinions as prior information in a Bayesian statistical model to estimate the diagnostic accuracy of the rK39 ICT. This analysis indicated that, because of the imperfect sensitivity of the reference standard, the Sp of the rK39 ICT may be somewhat underestimated. The estimated Sp correcting for this bias was 93.7% in the Indian subcontinent and 95.7% in East Africa, compared to 90.2% and 91.1% assuming perfect reference standards. The estimates of Se did not change significantly when we corrected for imperfect reference standards. In this model, the Se of spleen aspirate was estimated to be 96.0% (95% CI: 92.5 to 98.8) and the Sp 99.4% (95% CI: 98.1 to 99.9). The Se of the combined reference standard of spleen parasitology and a serological test was estimated to be 98.6% (95% CI: 97.1 to 99.4) and the Sp 96.5% (95% CI: 91.5 to 99.2).


Opinion from seven experts on visceral leishmaniasis regarding the diagnostic accuracy and possible variation of two reference standards: parasitology including spleen aspirate without serology and any combination of parasitology with serology. These expert opinions were used as prior information in a Bayesian statistical model to estimate the diagnostic accuracy of the rK39 ICT.


Similar results were obtained when we used non‐informative priors instead of expert opinion to estimate the Bayesian model.

Assessment of structure of hierarchical model

Using the more standard logit link instead of the better‐fitting complementary log‐log link increased the DIC from 171.3 to 199.0. The logit link resulted in similar estimates of Se and Sp: the average Se was 91.9% (95% CI: 83.6 to 96.2) and the average Sp was 92.4% (95% CI: 84.9 to 96.3). However, the remaining heterogeneity was considerably larger, resulting in wider prediction regions than in the primary analysis: 28.3% to 99.7% for Se and 31.1% to 99.7% for Sp (Figure 10).


rK39 ICT: basic model summary using the bivariate logistic normal model. This analysis combines data from all studies without accounting for covariates. The average sensitivity is 91.9% (95% confidence interval: 83.6 to 96.2) and the average specificity is 92.4% (95% confidence interval: 84.9 to 96.3). The confidence region is indicated with a full line, the prediction region with a dotted line. In this analysis, the prediction intervals are wide: 28.3% to 99.7% for the sensitivity and 31.1% to 99.7% for the specificity.


Using a bivariate t‐distribution for the random‐effects did not improve model fit compared to the normal model.

Assessment of reporting bias

Reporting bias was assessed through the robust funnel plot proposed by Deeks 2005. This robust funnel plot presents the log of the diagnostic odds ratio by a robust measure of the study size. No indication of reporting bias was observed in this plot (Figure 11).
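As an illustration of how each study is positioned in such a funnel plot, the sketch below computes the log diagnostic odds ratio and the inverse square root of the effective sample size from a 2x2 table, which are the quantities commonly plotted in the approach of Deeks 2005. The 0.5 continuity correction and the example counts are assumptions made here for illustration and are not taken from the review.

```python
import math

def funnel_point(tp, fp, fn, tn):
    """Return (ln DOR, 1 / sqrt(ESS)) for one study's 2x2 table.

    Assumption: 0.5 is added to every cell when any cell is zero, so that
    the diagnostic odds ratio stays finite.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    ln_dor = math.log((tp * tn) / (fp * fn))
    n_diseased = tp + fn
    n_non_diseased = fp + tn
    # The effective sample size down-weights studies with unbalanced groups
    ess = 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)
    return ln_dor, 1.0 / math.sqrt(ess)

# Hypothetical study: 100 cases and 100 non-cases
print(funnel_point(tp=90, fp=5, fn=10, tn=95))
```

An association between the two coordinates across studies would suggest small‐study effects; the absence of such a pattern is what the text above describes.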


rK39 ICT; assessment of reporting bias: funnel plot for log diagnostic odds ratio


Latex agglutination test

Data summary

Seven studies were identified that reported Se or Sp results of the latex agglutination test in urine (KAtex). For two of these studies, two sets of Se and Sp estimates were available. A forest plot of the nine available Se/Sp estimates is given in Figure 12.


Latex agglutination test: forest plot of all the available estimates of sensitivity and specificity (n = 7 studies; 9 sets of estimates).


Meta‐analysis

One study was not pooled with the others in the formal meta‐analysis because, contrary to the other studies, this study included a majority of HIV‐positive participants (Vilaplana 2004). For the two studies that presented Se/Sp estimates using LCA and a reference standard on the same data (Boelaert 2008 ‐ Ethiopia and Diro 2007, Boelaert 2008 ‐ India and Sundar 2007), the results from the two analysis approaches were very similar. We used the results from LCA for the primary meta‐analysis.

Combining data from six studies without accounting for possible covariates explaining heterogeneity, and using the bivariate normal model with complementary log‐log link, the average Se was 63.6% (95% CI: 40.9 to 85.6) and the average Sp was 92.9% (95% CI: 76.7 to 99.2) (Figure 13 and summary of findings Table 3). The 95% prediction intervals were very wide: 16.5% to 99.6% for the Se and 41.3% to 100% for the Sp (Figure 13). The estimated correlation between the transformed Se and Sp was ‐0.39 (95% CI: ‐0.90 to 0.70).


Latex agglutination test: basic summary using the bivariate normal model with a complementary log‐log link function. This analysis combines data from all studies without accounting for covariates. The average sensitivity (95% confidence interval) is 63.6% (40.9 to 85.6) and the average specificity is 92.9% (76.7 to 99.2). The confidence region is indicated with a full line, the prediction region with a dotted line. The 95% prediction intervals are very wide: 16.5% to 99.6% for the sensitivity and 41.3% to 100% for the specificity.


Assessment of heterogeneity

Of the predefined sources of heterogeneity, geographic region, disease prevalence, and study size showed sufficient variability across studies to warrant a meta‐regression. All studies used the KAtex latex agglutination test manufactured by Kalon Biological. Five of the six studies had an unclear risk of bias. All studies in East Africa reported prevalences of VL < 65% and all studies in the Indian subcontinent reported prevalences ≥ 65%. Consequently, the effects of region and disease prevalence in the sample could not be separated. There were no significant differences in diagnostic accuracy of the latex agglutination test among regions or study sizes (Figure 14 and Table 5).


Latex agglutination test; summary of the heterogeneity assessment through meta‐analysis: sensitivity and specificity estimates by covariates.


Table 5. Heterogeneity assessment of the Latex agglutination test meta‐analysis

| Covariate | Category | Sensitivity estimate (%) | (95% CI) | Specificity estimate (%) | (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Geographic region | Indian subcontinent | 50.8 | (34.1 to 69.3) | 95.3 | (73.6 to 99.8) |
| Geographic region | East Africa | 77.9 | (58.2 to 92.3) | 88.6 | (59.5 to 92.3) |
| Study size | Small (< 200) | 52.9 | (27.6 to 84.1) | 86.2 | (52.5 to 99.2) |
| Study size | Large (≥ 200) | 68.5 | (45.2 to 88.2) | 94.5 | (49.0 to 100.0) |

Sensitivity analysis

As a sensitivity analysis to the choice of the set of estimates of a given study, we made an alternative selection of the reference standard for those studies presenting results for two reference standards. The results were very similar to the primary analysis: the average (95% CI) Se was 63.4% (40.8 to 85.4) and the average Sp was 92.8% (76.3 to 99.2) (Table 6).

Table 6. Sensitivity analysis results of the Latex agglutination test meta‐analysis

| Analysis | Sensitivity estimate (%) | (95% CI) | Specificity estimate (%) | (95% CI) |
| --- | --- | --- | --- | --- |
| Main Analysis | 63.6 | (40.9 to 85.6) | 92.9 | (76.7 to 99.2) |
| Alternative Analysis Set | 63.4 | (40.8 to 85.4) | 92.8 | (76.3 to 99.2) |

Diagnostic accuracy in HIV‐positive participants

One study assessed the accuracy of the latex agglutination test in a population of 85 mostly HIV‐positive participants (93%) in Spain (Vilaplana 2004). According to a reference standard based on culture or direct parasitological examination of bone marrow aspirate, 12 participants were classified as having VL. The estimated Se and Sp of the KAtex test were very high: the Se was 100% (12/12; 95% CI: 74 to 100) and the Sp was 96% (70/73; 95% CI: 88 to 99).
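The counts reported here allow the confidence intervals to be recomputed. The sketch below uses an exact (Clopper‐Pearson) binomial interval, found by bisection on the binomial distribution; this choice of method is an assumption, as the review does not state how these intervals were derived, and the helper names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_increasing(g, target, iters=100):
    """Bisection on [0, 1] for an increasing function g, solving g(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

se_ci = exact_ci(12, 12)  # sensitivity: 12 of 12 VL cases tested positive
sp_ci = exact_ci(70, 73)  # specificity: 70 of 73 non-cases tested negative
print([round(100 * x) for x in se_ci], [round(100 * x) for x in sp_ci])
```

With only 12 confirmed cases, the lower limit for the sensitivity stays near 74% even though every case tested positive, which illustrates how little a single small study can establish.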

FAST

Two studies assessed the accuracy of the Fast agglutination screening test (FAST), one in Ethiopia (Hailu 2006) and one in Turkey (Kilic 2008). Both studies were small (89 participants in Ethiopia and 59 in Turkey). The prevalence of VL in the study population was 51% in Ethiopia and 41% in Turkey. The reference standard was the culture or direct parasitological examination of spleen or lymph node samples (Ethiopia) or of bone marrow samples (Turkey). In Ethiopia, the estimated Se of FAST was 91.1% and the Sp 70.5%. In Turkey, the estimated Se was 95.8% and the Sp 85.7%.

rK26‐based rapid diagnostic test

One study assessed the accuracy of an rK26‐based rapid diagnostic test manufactured by InBios International (Seattle, Washington, USA) in a population of 352 patients with suspected VL in India (Sundar 2007). According to a reference standard combining direct parasitological examination of spleen aspirate with DAT serology and response to VL treatment, 282 participants had a diagnosis of VL. The sensitivity of the rK26 ICT was 21.3% and the specificity 100%.

rKE16 immunochromatographic test

One study evaluated an rKE16‐based ICT (Signal KA, Span Diagnostics, India) in Kenya (Mbui 2013). This study included 219 patients suspected to have VL. Based on a reference standard consisting of direct smear examination of a splenic aspirate sample, 131 participants were classified as having VL. The Se of the rKE16 ICT was 77.1% and the Sp 95.5%.

Discussion

Summary of main results

The rK39 ICT, developed in 1998, is the rapid diagnostic test for VL that has been most thoroughly evaluated so far. We retrieved 21 publications that met our inclusion criteria for the review, and from these, 18 independent sets of sensitivity (Se) and specificity (Sp) estimates could ultimately be included in a formal meta‐analysis of the diagnostic accuracy of the rK39 ICT. In patients with febrile splenomegaly and no previous history of VL, the overall sensitivity of the rK39 ICT was 91.9% (95% CI 84.8 to 96.5) and the specificity 92.4% (95% CI 85.6 to 96.8). Sensitivity was significantly lower in East Africa (85.3%; 95% CI 74.5 to 93.2) than in the Indian subcontinent (97.0%; 95% CI 90.0 to 99.5), but there was no significant difference in specificity between geographic regions (summary of findings Table 1, summary of findings Table 2, and Appendix 3). This assessment of heterogeneity did not include the Latin American and Mediterranean regions, as the number of studies from those regions was too low. There was no significant difference according to prevalence of disease in the sample or manufacturer of the rK39 ICT, but the number of estimates in some categories of the latter was low (11/18 InBios, 4/18 DiaMed, and 3/18 other brands). Generally, this heterogeneity assessment should be interpreted with care because of the limited number of studies included in the meta‐analysis. Apart from the lack of power, several covariates may be correlated with one another, which would confound their effects.

For the KAtex test, a latex agglutination test in urine, when used in a clinical setting in patients with febrile splenomegaly, overall sensitivity was 63.6% (95% CI 40.9 to 85.6) and specificity 92.9% (95% CI 76.7 to 99.2) (summary of findings Table 3). Seven studies were included in the review and six in the meta‐analysis. Because of the limited number of estimates, no complete heterogeneity assessment could be made. There were no significant differences in diagnostic accuracy of the KAtex test among regions or study sizes, but the number of included studies was too small to allow for definite conclusions.

The number of studies addressing three other rapid diagnostic tests (FAST, rK26 ICT, and rKE16 ICT) was too low to allow for a meta‐analysis. Our conclusions do not apply to HIV‐positive patients. In summary, completeness of evidence was most satisfactory for the rK39 ICT, only partially complete for the latex agglutination test in urine, and too incomplete for the other three tests.

The methodological quality of the evidence on the rK39 ICT and the latex agglutination test was acceptable, although we had concerns about the risk of bias in some studies (see below).

Strengths and weaknesses of the review

Our review protocol explicitly set out to select clinical evaluations with a phase III design, based on consecutively recruited patients who were all suspected of having VL, as this level of evidence is required for making policy recommendations for clinical practice. Therefore, we had to exclude many records using a phase II (case control) design. However, in some records, the study design was not clearly reported, which made it difficult to distinguish between a phase III and a case control design. Because of this poor reporting, we may have missed some true phase III studies.

We believe that the procedures used for study selection were quite exhaustive, as we did not impose any a priori language barriers (excluding only one study in Chinese ex‐post), and we searched several databases including regional ones such as LILACS. Therefore, we believe that the information presented here reflects the body of evidence accumulating over the past 15 years rather well. In addition, assessment of possible publication bias did not indicate that smaller studies without favourable results for the RDTs were less likely to be reported. 

Most important was the striking heterogeneity in methods across the studies. Although the included studies all complied with the criteria for an acceptable reference standard set out in our protocol, there was a remarkable variety in the reference standards that were used. These reference standards were based on different tests, done in different orders, with different cut‐off values, and making different use of clinical information. We formally assessed the risk of bias in each record using the revised QUADAS‐2 instrument. In five of the 21 included records, we had clear concerns related to the patient and sample flow and the timing. For 14 records, the risk of bias was difficult to assess in the domains of the index test and the reference standard, mainly because authors failed to report explicitly if and how blinding was applied. When we categorized records into low (n = 3), unclear (n = 11), and high (n = 7) risk of bias, there was no significant difference in Se and Sp of the rK39 ICT, but high‐risk studies tended to have a higher sensitivity and a lower specificity. One possible explanation is that the reference standards in studies classified as at high risk of bias were of sub‐optimal sensitivity. This could result in the cases selected being primarily severe cases, which may be easy to diagnose with rapid diagnostic tests, while the control group may contain some undiagnosed true VL cases (falsely negative in the reference standard).

We tried to assess the influence of this quality in study design on the estimates of accuracy in a sensitivity analysis in the case of the rK39 ICT. Neither the use of a secondary analysis set that was based on a different reference standard, nor an analysis allowing for an imperfect reference standard affected the main conclusions of the meta‐analysis on the rK39 ICT, although the secondary analysis using an alternative reference standard in five studies tended to give a lower specificity estimate. This may be related to a lower Se of the reference standards included in this analysis, which would result in an underestimation of the Sp of the index test.

Finally, the findings seemed sufficiently homogeneous to allow for a meaningful summary estimate for the specificity of the rK39 ICT as well as the Se and Sp of the latex agglutination test in urine. However, the sensitivity of the rK39 ICT differed according to geographic region, and seems substantially lower in East Africa than in the Indian subcontinent. This finding is in line with an earlier meta‐analysis conducted by our team (Chappuis 2006) and with a multi‐country study conducted by WHO/TDR (data not included) (Cunningham 2012). Several hypotheses have been raised to explain this lower sensitivity in East Africa: a lower level of circulating antibodies, different age group, or parasitological differences (Bhattacharyya 2013; Harhay 2011).

Applicability of findings to the review question

Our review focused on clinical studies (phase III) of diagnostic accuracy, recruiting participants representative of the spectrum of patients encountered in clinical practice in whom a rapid diagnostic test for VL would be warranted. In summary, those patients would be in line with the WHO case definition and defined as “patients with a clinical suspicion of VL, that is, those who are febrile for more than two weeks and have splenomegaly, presenting at health services in endemic areas.” (Source: protocol). Patients who were previously treated for VL (non‐responders or relapsed cases), and those who had signs and symptoms of other forms of leishmaniasis, such as post kala‐azar dermal leishmaniasis (PKDL), were excluded. All clinical studies included in the review clearly corresponded with these inclusion criteria. Nevertheless, the settings of these clinical studies varied: some were conducted in tertiary care centres, and some in smaller hospitals of an intermediary level. Few studies recruited patients at the first‐contact point, the most peripheral level of the health service, i.e. health centres or health posts. Although the level of health service leads to variable prior probability of disease in the study sample, and as such is of influence in diagnostic accuracy studies, we believe that our findings are applicable across the several levels of the health system, as an analysis of heterogeneity according to disease prevalence did not reveal any difference in the Se or Sp estimates.

Our review does not apply to HIV‐positive patients, and caution is needed in areas with high HIV prevalence. Because HIV status is known to influence diagnostic accuracy (ter Horst 2009 ‐ HIV neg; ter Horst 2009 ‐ HIV pos; Alvar 2008), we originally intended to analyse the accuracy of the RDTs in HIV‐positive and HIV‐negative patients separately. Unfortunately, the number of included studies of HIV‐positive people was too low (n = 3) for a separate analysis or a comparison. Furthermore, the information from the Mediterranean region and from Latin America was too limited to draw robust conclusions. Finally, it was not possible to precisely evaluate the importance of the type of sample (serum, plasma or blood; fresh or frozen urine) or the manufacturer or version of a certain test, because these parameters did not vary enough among the included studies. In particular, Cunningham et al. pointed to important variations of sensitivity in a head‐to‐head comparison of five brands of RDT where two tests based on rKE16 antigen performed substantially less well in East Africa and Brazil (Cunningham 2012).

In summary, we conclude that for current practice of VL control, there is one rapid diagnostic test, the rK39 ICT, which has been sufficiently evaluated, showing high sensitivity and specificity on average, but with a notably lower sensitivity in East Africa compared to the Indian subcontinent. In the Indian subcontinent, the rK39 ICT clearly complies with the normative requirements put forward before, of a sensitivity above 95% and a specificity above 90% (Boelaert 2007). In East Africa, the sensitivity of the rK39 ICT does not comply with these criteria, and the test can therefore not be used as a stand‐alone test to reliably rule out VL in a patient who is suspected to have VL. This does not mean that the use of an rK39 ICT is precluded, but it should be embedded in a test algorithm with a clear instruction on what to do in case of a negative rK39 ICT (second test, referral, come back after two weeks for repeat testing or other).
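The consequence of the lower sensitivity for ruling out VL can be made concrete with Bayes' rule. The sketch below converts the regional Se and Sp estimates from this review into predictive values; the pre‐test probability of 50% is a purely hypothetical value chosen for illustration and does not come from the included studies.

```python
def predictive_values(se, sp, prevalence):
    """Return (PPV, NPV) for a test with the given Se and Sp at a given prevalence."""
    tp = se * prevalence              # true positive fraction of all tested
    fn = (1 - se) * prevalence        # false negative fraction
    tn = sp * (1 - prevalence)        # true negative fraction
    fp = (1 - sp) * (1 - prevalence)  # false positive fraction
    return tp / (tp + fp), tn / (tn + fn)

PRETEST = 0.50  # hypothetical pre-test probability of VL among clinical suspects

ppv_isc, npv_isc = predictive_values(0.970, 0.902, PRETEST)  # Indian subcontinent
ppv_ea, npv_ea = predictive_values(0.853, 0.911, PRETEST)    # East Africa

# Probability that a patient with a negative rK39 ICT nevertheless has VL
print(round(1 - npv_isc, 3), round(1 - npv_ea, 3))
```

Under this assumed pre‐test probability, the share of test‐negative patients who still have VL is several times higher with the East African estimates than with those for the Indian subcontinent, which is why a negative result there calls for a follow‐up step in the algorithm.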

For the latex agglutination test, although only six studies were included in the meta‐analysis, we conclude that its low sensitivity makes it unsuitable for use in current clinical practice. This does not exclude the possibility that its performance could be improved in the future.

No recommendations can be made concerning the FAST test, the rK26 ICT or the rKE16 ICT because of the paucity of evidence.

Figure 1. Study flow diagram showing the process of selection of records and studies for the review and for the meta‐analyses

Figure 2. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Footnote: Figure 2 represents studies but our quality assessment was done per record. Some of the records include more than one study. Figure 2 shows all the estimates used in this review, including the estimates for primary and sensitivity analyses of the same study.

Figure 3. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Footnote: Figure 3 represents studies but our quality assessment was done per record. Some of the records include more than one study. Figure 3 shows all the estimates used in this review, including the estimates for primary and sensitivity analyses of the same study.

Figure 4. rK39 ICT: forest plot of all the available estimates of sensitivity and specificity (n = 20 studies; 26 sets of estimates)

Footnote: For studies that use latent class analysis, the counts of true positive, false positive, false negative and true negative results are imputed from the Se and Sp estimates and the overall sample size. The estimates and confidence intervals are subsequently calculated from these imputed values.
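The imputation described in this footnote can be sketched as follows. This is a minimal illustration and not the review's own procedure: the function name `impute_counts` and the `prevalence` argument are assumptions. The footnote mentions only the Se and Sp estimates and the overall sample size, so the split into participants with and without VL has to be supplied separately (for example, from the prevalence estimated by the latent class model).

```python
def impute_counts(se, sp, n, prevalence):
    """Impute a 2x2 table (TP, FP, FN, TN) from accuracy estimates.

    se, sp      -- sensitivity and specificity as proportions (0 to 1)
    n           -- overall sample size of the study
    prevalence  -- ASSUMED proportion of participants with VL; not stated
                   in the footnote, supplied here for illustration only
    """
    diseased = round(n * prevalence)   # participants assumed to have VL
    non_diseased = n - diseased        # participants assumed not to have VL
    tp = round(se * diseased)          # index test positive, VL present
    fn = diseased - tp                 # index test negative, VL present
    tn = round(sp * non_diseased)      # index test negative, VL absent
    fp = non_diseased - tn             # index test positive, VL absent
    return tp, fp, fn, tn


# Hypothetical study: 200 participants, half with VL, Se 90%, Sp 80%.
tp, fp, fn, tn = impute_counts(se=0.90, sp=0.80, n=200, prevalence=0.50)
print(tp, fp, fn, tn)
```

Because the counts are rounded to whole participants, the sensitivity and specificity recomputed from them can differ slightly from the input estimates.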

Figure 5. rK39 ICT: summary of sensitivity‐specificity pairs according to the reference standard (n = 20 studies; 26 sets of estimates). The type of reference standard is classified as: (a) parasitology including spleen aspirate – no serology; (b) parasitology not including spleen aspirate – no serology; (c) parasitology and serology; or (d) latent class analysis. The data points that are connected with a line refer to the same data analysed with different reference standards.

Figure 6. rK39 ICT: basic summary using the bivariate normal model with a complementary log‐log link function. This analysis combines data from all studies without accounting for covariates. The average sensitivity is 91.9% (95% confidence interval: 84.8 to 96.5) and the average specificity is 92.4% (95% confidence interval: 85.6 to 96.8). The confidence region is indicated with a full line, the prediction region with a dotted line.
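For readers unfamiliar with the complementary log‐log link named in this caption, the sketch below shows only the transformation itself and its inverse. It does not fit the bivariate random‐effects model used in the review; the function names `cloglog` and `inv_cloglog` are illustrative assumptions.

```python
import math


def cloglog(p):
    """Complementary log-log link: maps a proportion in (0, 1) to the real line."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse link: maps a value on the transformed scale back to a proportion."""
    return 1.0 - math.exp(-math.exp(eta))


# Average sensitivity and specificity reported for the rK39 ICT basic summary.
for label, p in [("sensitivity", 0.919), ("specificity", 0.924)]:
    eta = cloglog(p)
    print(f"{label}: p={p:.3f}  cloglog={eta:.3f}  back-transformed={inv_cloglog(eta):.3f}")
```

Unlike the logit, this link is asymmetric around 0.5 (the transformed value of 0.5 is negative, not zero), which is one reason it can behave differently from a logistic model when most estimates lie close to 100%.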

Figure 7. rK39 ICT: summary of the heterogeneity assessment through meta‐analysis: sensitivity and specificity estimates by covariates. Rectangles indicate significant differences.

Figure 8. rK39 ICT: summary of the meta‐regression model with effects for geographic region using the bivariate normal model with a complementary log‐log link function. The sensitivity was significantly lower in East Africa (85.3%; 95% confidence interval: 74.5 to 93.2) than in the Indian subcontinent (97.0%; 95% confidence interval: 90.0 to 99.5). There was no significant difference in specificity: the specificity (95% confidence interval) in East Africa was 91.1% (80.3 to 97.3) and in the Indian subcontinent 90.2% (76.1 to 97.7). The confidence region is indicated with a full line, the prediction region with a dotted line.

Figure 9. Opinion from seven experts on visceral leishmaniasis regarding the diagnostic accuracy and possible variation of two reference standards: parasitology including spleen aspirate without serology and any combination of parasitology with serology. These expert opinions were used as prior information in a Bayesian statistical model to estimate the diagnostic accuracy of the rK39 ICT.

Figure 10. rK39 ICT: basic model summary using the bivariate logistic normal model. This analysis combines data from all studies without accounting for covariates. The average sensitivity is 91.9% (95% confidence interval: 83.6 to 96.2) and the average specificity is 92.4% (95% confidence interval: 84.9 to 96.3). The confidence region is indicated with a full line, the prediction region with a dotted line. In this analysis, the prediction intervals are wide: 28.3% to 99.7% for the sensitivity and 31.1% to 99.7% for the specificity.

Figure 11. rK39 ICT: assessment of reporting bias: funnel plot for log diagnostic odds ratio
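The quantity on which this funnel plot is based, the log diagnostic odds ratio, can be computed from a study's 2×2 counts as sketched below. The function name `log_dor`, the continuity correction and the example counts are assumptions for illustration; the caption does not state how zero cells were handled or which precision measure the plot uses.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Return (log diagnostic odds ratio, standard error) for one 2x2 table.

    A continuity correction of 0.5 is added to every cell when any cell is
    zero; this is a common convention and an ASSUMPTION here, not a detail
    taken from the review.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    ldor = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return ldor, se


# Hypothetical study counts, for illustration only.
ldor, se = log_dor(tp=90, fp=20, fn=10, tn=80)
print(f"log DOR = {ldor:.2f}, SE = {se:.2f}")
```

In a funnel plot, each study contributes one point (its log diagnostic odds ratio against a measure of its precision); marked asymmetry of the resulting scatter is read as a possible sign of reporting bias.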

Figure 12. Latex agglutination test: forest plot of all the available estimates of sensitivity and specificity (n = 7 studies; 9 sets of estimates).

Figure 13. Latex agglutination test: basic summary using the bivariate normal model with a complementary log‐log link function. This analysis combines data from all studies without accounting for covariates. The average sensitivity (95% confidence interval) is 63.6% (40.9 to 85.6) and the average specificity is 92.9% (76.7 to 99.2). The confidence region is indicated with a full line, the prediction region with a dotted line. The 95% prediction intervals are very wide: 16.5% to 99.6% for the sensitivity and 41.3% to 100% for the specificity.

Figure 14. Latex agglutination test: summary of the heterogeneity assessment through meta‐analysis: sensitivity and specificity estimates by covariates.

Test 1. rK39 immunochromatographic test.

Test 2. KAtex.

Test 3. FAST.

Test 4. rK26 immunochromatographic test.

Test 5. rK39 Primary Analysis.

Test 6. rKE16 immunochromatographic test.

Summary of findings 1. rK39 immunochromatographic test for visceral leishmaniasis in the Indian subcontinent

Population: Patients suspected to have visceral leishmaniasis disease¹

Setting: Health services in endemic areas of the Indian subcontinent²

New test: rK39 immunochromatographic test³

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis⁴

Pooled sensitivity: 0.97 (95% CI 0.90 to 1.00)

Pooled specificity: 0.90 (95% CI 0.76 to 0.98)

| Setting⁵ | Positive predictive value⁶ | Negative predictive value⁶ |
| --- | --- | --- |
| Peripheral health centre with a prior probability of disease of 40% | 87% | 98% |
| Referral centre with a prior probability of disease of 60% | 94% | 95% |

Number of participants (studies): 1468 (6 studies)

Quality of the evidence (QUADAS‐2)⁷

Risk of bias: none of the studies had a low risk of bias in all domains. One study had a high risk of bias (domain of flow and timing). Five studies had an unclear risk of bias (domains of index test or reference standard).

Applicability: low concerns in all studies and in all domains.

Interpretation: When the rK39 ICT is used¹ in the Indian subcontinent, in a setting where the prior probability of VL among clinical suspects is 40%, which is typically seen in a peripheral health centre in an endemic area, the positive predictive value of the test is 87%. This means that out of 100 patients with a positive rK39 ICT result, 87 would have VL (true positive result) and 13 would have another disease (false positive). The negative predictive value is 98%, meaning that out of 100 patients with a negative rK39 ICT result, 98 would have another disease (true negative) and 2 would have VL (false negative).

When the same test is used in a setting with a prior probability of VL of 60%, which is more typical for a referral centre in an endemic area, the positive predictive value is 94% and the negative predictive value is 95%.

A likelihood ratio is another way of expressing how informative a diagnostic test is: it indicates to what extent the rK39 ICT result changes the odds that a patient has VL. The likelihood ratio of a positive rK39 ICT result is 9.90, and the likelihood ratio of a negative test result is 0.03. This means that in the Indian subcontinent, a positive rK39 ICT result is a strong argument in favour of VL (ruling in) and that a negative rK39 ICT result is a strong argument against VL (ruling out).

CI: confidence interval

Boelaert M, Verdonck K, Menten J, Sunyoto T, van Griensven J, Chappuis F, Rijal S. Rapid diagnostic tests for visceral leishmaniasis. Cochrane Database of Systematic Reviews 2011, Issue 6. Art. No.: CD009135. DOI: 10.1002/14651858.CD009135.

1. The rK39 immunochromatographic test must be used in combination with a clinical case definition (fever and splenomegaly for more than two weeks and no previous history of visceral leishmaniasis). Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The results of the meta‐analysis showed considerable heterogeneity, which was partly explained by the geographic region.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

6. A narrative explanation of the predictive values is given in Appendix 3.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.
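The predictive values and likelihood ratios quoted in the interpretation for the Indian subcontinent follow from the pooled sensitivity and specificity and the assumed prior probability of disease. The sketch below is an illustration rather than the review's own code; the function names are assumptions, and the inputs are the pooled estimates for the Indian subcontinent (97.0% and 90.2%) expressed as proportions.

```python
def predictive_values(se, sp, prior):
    """Return (PPV, NPV) for a test with sensitivity `se` and specificity `sp`
    used where the prior probability of disease is `prior`."""
    tp = se * prior                # true positives, per patient tested
    fp = (1 - sp) * (1 - prior)    # false positives
    fn = (1 - se) * prior          # false negatives
    tn = sp * (1 - prior)          # true negatives
    return tp / (tp + fp), tn / (tn + fn)


def likelihood_ratios(se, sp):
    """Return (LR of a positive result, LR of a negative result)."""
    return se / (1 - sp), (1 - se) / sp


# Pooled estimates for the Indian subcontinent, as proportions.
SE, SP = 0.970, 0.902

for setting, prior in [("peripheral health centre", 0.40), ("referral centre", 0.60)]:
    ppv, npv = predictive_values(SE, SP, prior)
    print(f"{setting}: PPV = {ppv:.0%}, NPV = {npv:.0%}")

lr_pos, lr_neg = likelihood_ratios(SE, SP)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

The predictive values change with the prior probability, whereas the likelihood ratios depend only on sensitivity and specificity, which is why the table reports predictive values separately for each setting.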

Summary of findings 2. rK39 immunochromatographic test for visceral leishmaniasis in east Africa

Population: Patients suspected to have visceral leishmaniasis disease¹

Setting: Health services in endemic areas of east Africa²

New test: rK39 immunochromatographic test³

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis⁴

Pooled sensitivity: 0.85 (95% CI 0.75 to 0.93)

Pooled specificity: 0.91 (95% CI 0.80 to 0.97)

| Setting⁵ | Positive predictive value⁶ | Negative predictive value⁶ |
| --- | --- | --- |
| Peripheral health centre with a prior probability of disease of 40% | 86% | 90% |
| Referral centre with a prior probability of disease of 60% | 93% | 81% |

Number of participants (studies): 1692 (9 studies)

Quality of the evidence (QUADAS‐2)⁷

Risk of bias: three studies had a low risk of bias across all domains; four studies had an unclear risk of bias (domains of index test or reference standard); two studies had a high risk of bias (domains of patient selection or flow and timing, or both).

Applicability: one study with high concerns about applicability (domain of patient selection); low concerns for all other studies and all other domains.

Interpretation: When the rK39 ICT is used¹ in east Africa, in a setting where the prior probability of VL is 40%, which is typically seen in a peripheral health centre in an endemic area, the positive predictive value of the test is 86%. This means that out of 100 patients with a positive rK39 ICT result, 86 would have VL (true positive result) and 14 would have another disease (false positive). The negative predictive value is 90%, meaning that out of 100 patients with a negative rK39 ICT result, 90 would have another disease (true negative) and 10 would have VL (false negative).

When the same test is used in a setting with a prior probability of VL of 60%, which is more typical for a referral centre in an endemic area, the positive predictive value is 93% and the negative predictive value is 81%.

In east Africa, the likelihood ratio of a positive rK39 ICT result is 9.58, and the likelihood ratio of a negative rK39 ICT result is 0.16. This means that a positive rK39 ICT result is a strong argument in favour of VL (ruling in), but that a negative rK39 ICT result is not an absolute argument against VL (it does not allow VL to be ruled out completely).

CI: confidence interval

Boelaert M, Verdonck K, Menten J, Sunyoto T, van Griensven J, Chappuis F, Rijal S. Rapid diagnostic tests for visceral leishmaniasis. Cochrane Database of Systematic Reviews 2011, Issue 6. Art. No.: CD009135. DOI: 10.1002/14651858.CD009135.

1. The rK39 immunochromatographic test must be used in combination with a clinical case definition (fever and splenomegaly for more than two weeks and no previous history of visceral leishmaniasis). Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The results of the meta‐analysis showed considerable heterogeneity, which was partly explained by the geographic region.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

6. A narrative explanation of the predictive values is given in Appendix 3.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.
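The interpretation for east Africa describes likelihood ratios as changing the odds that a patient has VL. That update can be sketched as below, assuming the pooled east African estimates (85.3% and 91.1%) as inputs; the function name `post_test_probability` is illustrative and not taken from the review.

```python
def post_test_probability(prior, likelihood_ratio):
    """Update a prior probability of disease with a likelihood ratio."""
    prior_odds = prior / (1 - prior)              # probability -> odds
    post_odds = prior_odds * likelihood_ratio     # odds form of Bayes' theorem
    return post_odds / (1 + post_odds)            # odds -> probability


# Pooled estimates for east Africa, as proportions.
SE, SP = 0.853, 0.911
LR_POS = SE / (1 - SP)      # likelihood ratio of a positive result
LR_NEG = (1 - SE) / SP      # likelihood ratio of a negative result

for prior in (0.40, 0.60):
    p_pos = post_test_probability(prior, LR_POS)
    p_neg = post_test_probability(prior, LR_NEG)
    print(f"prior {prior:.0%}: VL probability {p_pos:.0%} after a positive, "
          f"{p_neg:.0%} after a negative result")
```

The probability of VL that remains after a negative result is the complement of the negative predictive value, which makes explicit why a negative rK39 ICT alone does not rule out VL in this region.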

Summary of findings 3. Latex agglutination test in urine for the diagnosis of visceral leishmaniasis

Population: Patients suspected to have visceral leishmaniasis disease¹

Setting: Health services in endemic areas²

New test: Latex agglutination test in urine³

Reference standard: (1) direct smear test or culture of splenic aspirate; (2) composite reference standard based on one or more of the following: parasitology, serology, or response to treatment; or (3) latent class analysis⁴

Pooled sensitivity: 0.64 (95% CI 0.41 to 0.86)⁵

Pooled specificity: 0.93 (95% CI 0.77 to 0.99)⁵

| Setting⁶ | Positive predictive value | Negative predictive value |
| --- | --- | --- |
| Peripheral health centre with a prior probability of disease of 40% | 86% | 79% |
| Referral centre with a prior probability of disease of 60% | 93% | 63% |

Number of participants (studies): 1374 (6 studies)

Quality of the evidence (QUADAS‐2)⁷

Risk of bias: none of the studies had a low risk of bias in all domains. One study had a high risk of bias (domain of flow and timing). Five studies had an unclear risk of bias (domain of reference standard).

Applicability: low concerns in all studies and in all domains.

CI: confidence interval

Boelaert M, Verdonck K, Menten J, Sunyoto T, van Griensven J, Chappuis F, Rijal S. Rapid diagnostic tests for visceral leishmaniasis. Cochrane Database of Systematic Reviews 2011, Issue 6. Art. No.: CD009135. DOI: 10.1002/14651858.CD009135.

1. Studies with mainly HIV‐positive patients were not included in the pooled analyses.

2. The studies included in this review were conducted in Ethiopia, Kenya, Sudan, India and Nepal.

3. This rapid diagnostic test has been developed specifically for field use. It is less invasive, less time‐consuming, and easier to perform than the alternative parasitological or serological tests.

4. Latent class analysis is a modelling technique that allows us to estimate the sensitivity and specificity of a set of diagnostic tests in situations in which there is no good reference standard.

5. The results of the meta‐analysis showed considerable heterogeneity.

6. Two hypothetical situations: a peripheral health centre and a referral centre with a different prior probability of disease.

7. QUADAS‐2 is a tool for the assessment of the quality of diagnostic accuracy studies. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of concerns regarding applicability.

Table 1. Studies on rK39 ICT presenting two sets of Se and Sp estimates based on different reference standards: estimates included in primary analysis and sensitivity analysis

| Study | Selected for primary analysis | Selected for sensitivity analysis |
| --- | --- | --- |
| Boelaert 2004 ‐ LCA & Boelaert 2004 ‐ classic | LCA | Parasitology including spleen aspirate – no serology |
| Boelaert 2008 ‐ Ethiopia & Diro 2007 | LCA | Parasitology including spleen aspirate – no serology |
| Boelaert 2008 ‐ India & Sundar 2007 | LCA | Parasitology and serology |
| Cota 2013 ‐ composite 1 & Cota 2013 ‐ composite 2 * | Parasitology and serology | Parasitology not including spleen aspirate – no serology |
| Machado de Assis 2012 & Machado de Assis 2011 | LCA | Parasitology not including spleen aspirate – no serology |
| Veeken 2003 ‐ composite & Veeken 2003 ‐ spleen | Parasitology and serology | Parasitology including spleen aspirate – no serology |

* Cota 2013 describes HIV‐infected patients only and is not included in the formal meta‐analysis.
Table 2. Overall Analysis Summary

| Test | Analysis | Number of studies | Se, % (95% CI) | Sp, % (95% CI) |
| --- | --- | --- | --- | --- |
| rK39 immunochromatographic test | Overall | 18 | 91.9 (84.8 to 96.5) | 92.4 (85.6 to 96.8) |
| rK39 immunochromatographic test | Indian subcontinent | 6 | 97.0 (90.0 to 99.5) | 90.2 (76.1 to 97.7) |
| rK39 immunochromatographic test | East Africa | 9 | 85.3 (74.5 to 93.2) | 91.1 (80.3 to 97.3) |
| Latex agglutination test | Overall | 6 | 63.6 (40.9 to 85.6) | 92.9 (76.7 to 99.2) |
Table 3. Heterogeneity assessment of the rK39 ICT meta‐analysis

| Covariate | Category | Sensitivity estimate, % (95% CI) | Specificity estimate, % (95% CI) |
| --- | --- | --- | --- |
| Geographic region | Indian subcontinent | 97.0 (90.0 to 99.5) | 90.2 (76.1 to 97.7) |
| Geographic region | East Africa | 85.3 (74.5 to 93.2) | 91.1 (80.3 to 97.3) |
| Commercial brand | DiaMed | 86.4 (70.7 to 95.9) | 94.4 (80.5 to 99.2) |
| Commercial brand | InBios | 91.3 (83.8 to 96.3) | 91.2 (83.1 to 96.3) |
| Commercial brand | Other | 97.8 (88.7 to 99.9) | 92.5 (75.5 to 99.0) |
| Disease prevalence in sample | Low (< 65%) | 91.0 (82.2 to 96.9) | 94.4 (87.2 to 98.3) |
| Disease prevalence in sample | High (≥ 65%) | 92.4 (84.5 to 97.4) | 89.8 (79.7 to 95.8) |
| Study size | Small (< 250) | 91.5 (82.7 to 96.7) | 89.8 (80.6 to 95.6) |
| Study size | Large (≥ 250) | 92.3 (83.4 to 97.4) | 94.5 (87.8 to 98.3) |
| QUADAS‐2: risk of bias | Low | 89.5 (69.5 to 98.3) | 96.0 (82.8 to 99.7) |
| QUADAS‐2: risk of bias | Unclear | 91.9 (83.7 to 97.1) | 92.0 (84.0 to 96.7) |
| QUADAS‐2: risk of bias | High | 93.0 (77.9 to 99.1) | 87.8 (70.7 to 97.2) |
| Reference standard | Parasitology – no serology * | 95.6 (84.3 to 99.6) | 89.1 (72.4 to 97.4) |
| Reference standard | Parasitology and serology | 89.6 (78.6 to 96.7) | 94.6 (85.2 to 98.7) |
| Reference standard | Latent class analysis | 91.0 (79.8 to 97.2) | 91.5 (80.7 to 97.3) |

* The categories "parasitology including spleen aspirate – no serology" and "parasitology not including spleen aspirate – no serology" were combined because the latter category contained only one study.
Table 4. Sensitivity analysis results of the rK39 ICT meta‐analysis

| Analysis | Region | Sensitivity estimate, % (95% CI) | Specificity estimate, % (95% CI) |
| --- | --- | --- | --- |
| Main Analysis | Indian subcontinent | 97.0 (90.0 to 99.5) | 90.2 (76.1 to 97.7) |
| Main Analysis | East Africa | 85.3 (74.5 to 93.2) | 91.1 (80.3 to 97.3) |
| Alternative Analysis Set | Indian subcontinent | 96.1 (88.9 to 99.2) | 86.7 (69.2 to 99.2) |
| Alternative Analysis Set | East Africa | 85.2 (75.0 to 92.7) | 90.1 (77.0 to 97.4) |
| Bayesian analysis allowing for imperfect reference standards: expert priors, using main analysis set | Indian subcontinent | 97.3 (91.9 to 99.5) | 93.7 (74.9 to 99.7) |
| Bayesian analysis allowing for imperfect reference standards: expert priors, using main analysis set | East Africa | 86.7 (77.5 to 94.0) | 95.7 (84.3 to 99.8) |
| Bayesian analysis allowing for imperfect reference standards: vague priors, using main analysis set | Indian subcontinent | 97.3 (91.9 to 99.5) | 93.0 (77.8 to 99.3) |
| Bayesian analysis allowing for imperfect reference standards: vague priors, using main analysis set | East Africa | 86.1 (77.2 to 93.3) | 94.3 (83.8 to 99.6) |
Table 5. Heterogeneity assessment of the Latex agglutination test meta‐analysis

| Covariate | Category | Sensitivity estimate, % (95% CI) | Specificity estimate, % (95% CI) |
| --- | --- | --- | --- |
| Geographic region | Indian subcontinent | 50.8 (34.1 to 69.3) | 95.3 (73.6 to 99.8) |
| Geographic region | East Africa | 77.9 (58.2 to 92.3) | 88.6 (59.5 to 92.3) |
| Study size | Small (< 200) | 52.9 (27.6 to 84.1) | 86.2 (52.5 to 99.2) |
| Study size | Large (≥ 200) | 68.5 (45.2 to 88.2) | 94.5 (49.0 to 100.0) |
Table 6. Sensitivity analysis results of the Latex agglutination test meta‐analysis

| Analysis | Sensitivity estimate, % (95% CI) | Specificity estimate, % (95% CI) |
| --- | --- | --- |
| Main Analysis | 63.6 (40.9 to 85.6) | 92.9 (76.7 to 99.2) |
| Alternative Analysis Set | 63.4 (40.8 to 85.4) | 92.8 (76.3 to 99.2) |
Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 rK39 immunochromatographic test | 26 | 5544 |
| 2 KAtex | 9 | 1848 |
| 3 FAST | 2 | 148 |
| 4 rK26 immunochromatographic test | 1 | 352 |
| 5 rK39 Primary Analysis | 16 | 3574 |
| 6 rKE16 immunochromatographic test | 1 | 219 |