
CSF tau and the CSF tau/ABeta ratio for the diagnosis of Alzheimer's disease dementia and other dementias in people with mild cognitive impairment (MCI)


Background

Research suggests that measurable change in cerebrospinal fluid (CSF) biomarkers occurs years in advance of the onset of clinical symptoms (Beckett 2010). In this review, we aimed to assess the ability of CSF tau biomarkers (t‐tau and p‐tau) and the CSF tau (t‐tau or p‐tau)/ABeta ratio to enable the detection of Alzheimer’s disease pathology in patients with mild cognitive impairment (MCI). These biomarkers have been proposed as important in new criteria for Alzheimer's disease dementia that incorporate biomarker abnormalities.

Objectives

To determine the diagnostic accuracy of 1) CSF t‐tau, 2) CSF p‐tau, 3) the CSF t‐tau/ABeta ratio and 4) the CSF p‐tau/ABeta ratio index tests for detecting people with MCI at baseline who would clinically convert to Alzheimer’s disease dementia or other forms of dementia at follow‐up.

Search methods

The most recent search for this review was performed in January 2013. We searched MEDLINE (OvidSP), Embase (OvidSP), BIOSIS Previews (Thomson Reuters Web of Science), Web of Science Core Collection, including Conference Proceedings Citation Index (Thomson Reuters Web of Science), PsycINFO (OvidSP), and LILACS (BIREME). We searched specialized sources of diagnostic test accuracy studies and reviews. We checked reference lists of relevant studies and reviews for additional studies. We contacted researchers for possible relevant but unpublished data. We did not apply any language or date restriction to the electronic searches. We did not use any methodological filters to restrict the search overall.

Selection criteria

We selected those studies that had prospectively well‐defined cohorts with any accepted definition of MCI and with CSF t‐tau or p‐tau and CSF tau (t‐tau or p‐tau)/ABeta ratio values, documented at or around the time the MCI diagnosis was made. We also included studies which looked at data from those cohorts retrospectively, and which contained sufficient data to construct two by two tables expressing those biomarker results by disease status. Moreover, studies were only selected if they applied a reference standard for Alzheimer's disease dementia diagnosis, for example, the NINCDS‐ADRDA or Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM‐IV) criteria.
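As a minimal sketch of what such a two by two table contains, the following Python snippet cross-classifies the baseline index test result against disease status at follow-up and derives sensitivity and specificity from it. The class name `TwoByTwo` and the counts are hypothetical and chosen purely for illustration; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Baseline index test result cross-classified by disease status at follow-up."""
    tp: int  # test positive, converted to dementia
    fp: int  # test positive, did not convert
    fn: int  # test negative, converted to dementia
    tn: int  # test negative, did not convert

    def sensitivity(self) -> float:
        # Proportion of converters who had a positive test at baseline
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-converters who had a negative test at baseline
        return self.tn / (self.tn + self.fp)


# Hypothetical counts for illustration only; not data from the review.
example = TwoByTwo(tp=30, fp=15, fn=10, tn=45)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
```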

Data collection and analysis

We screened all titles generated by the electronic database searches. Two review authors independently assessed the abstracts of all potentially relevant studies, and the full papers for eligibility. Two independent assessors performed data extraction and quality assessment. Where data allowed, we derived estimates of sensitivity at fixed values of specificity from the model we fitted to produce the summary receiver operating characteristic (ROC) curve.

Main results

In total, 1282 participants with MCI at baseline were identified in the 15 included studies of which 1172 had analysable data; 430 participants converted to Alzheimer’s disease dementia and 130 participants to other forms of dementia. Follow‐up ranged from less than one year to over four years for some participants, but in the majority of studies was in the range one to three years.

Conversion to Alzheimer’s disease dementia

The accuracy of the CSF t‐tau was evaluated in seven studies (291 cases and 418 non‐cases). The sensitivity values ranged from 51% to 90% while the specificity values ranged from 48% to 88%. At the median specificity of 72%, the estimated sensitivity was 75% (95% CI 67 to 85), the positive likelihood ratio was 2.72 (95% CI 2.43 to 3.04), and the negative likelihood ratio was 0.32 (95% CI 0.22 to 0.47).

Six studies (164 cases and 328 non‐cases) evaluated the accuracy of the CSF p‐tau. The sensitivities were between 40% and 100% while the specificities were between 22% and 86%. At the median specificity of 47.5%, the estimated sensitivity was 81% (95% CI 64 to 91), the positive likelihood ratio was 1.55 (95% CI 1.31 to 1.84), and the negative likelihood ratio was 0.39 (95% CI 0.19 to 0.82).
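The likelihood ratios quoted above follow from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The Python sketch below applies these definitions to the rounded p‐tau figures. The function name `likelihood_ratios` is an illustrative assumption, and because the inputs are rounded the results only approximate the model‐based values reported in the review.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# CSF p-tau: estimated sensitivity 81% at the median specificity of 47.5%
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.81, specificity=0.475)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Recomputed this way, the point estimates come out close to, but not exactly equal to, the reported 1.55 and 0.39, which is expected when starting from rounded inputs.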

Five studies (140 cases and 293 non‐cases) evaluated the accuracy of the CSF p‐tau/ABeta ratio. The sensitivities were between 80% and 96% while the specificities were between 33% and 95%. We did not conduct a meta‐analysis because the studies were few and small. Only one study reported the accuracy of CSF t‐tau/ABeta ratio.

Our findings are based on studies with poor reporting. A significant number of studies had unclear risk of bias for the reference standard, participant selection and flow and timing domains. According to the assessment of index test domain, eight of 15 studies were of poor methodological quality.

The accuracy of these CSF biomarkers for ‘other dementias’ had not been investigated in the included primary studies.

Investigation of heterogeneity

The main sources of heterogeneity were thought likely to be reference standards used for the target disorders, sources of recruitment, participant sampling, index test methodology and aspects of study quality (particularly, inadequate blinding).

We were not able to formally assess the effect of each potential source of heterogeneity as planned, due to the small number of studies available to be included.

Authors' conclusions

The insufficiency and heterogeneity of research to date leave considerable uncertainty regarding the value of CSF testing of t‐tau, p‐tau or the p‐tau/ABeta ratio for the diagnosis of Alzheimer's disease in current clinical practice. Particular attention should be paid to the risk of misdiagnosis and overdiagnosis of dementia (and therefore over‐treatment) in clinical practice. These tests, like other biomarker tests which have been subject to Cochrane DTA reviews, appear to have better sensitivity than specificity and therefore might have greater utility in ruling out Alzheimer's disease as the aetiology of the individual's evident cognitive impairment, as opposed to ruling it in. The heterogeneity observed in the few studies awaiting classification suggests our initial summary will remain valid. However, these tests may have limited clinical value until uncertainties have been addressed. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one that has been available from the included studies we have identified.

Plain language summary

Proteins in cerebrospinal fluid for the early prediction of Alzheimer's disease dementia or other dementias in people with mild cognitive problems

Background

The number of people with dementia and other cognitive problems (problems affecting perception, thinking and recognition) is increasing worldwide. Diagnosing dementia at an early stage is recommended, but there is no agreement on the best method. A range of tests has been developed that healthcare professionals can use when assessing people with poor memory or cognitive impairment. In this review, we focused on diagnostic tests of the cerebrospinal fluid (CSF), the fluid that surrounds the brain and spinal cord. These tests are referred to below as CSF tests.

Review question

We reviewed the evidence on the accuracy of CSF tests for identifying those people with mild cognitive impairment (MCI) who will develop Alzheimer's disease dementia or other forms of dementia over time.

Study characteristics

The evidence is current to January 2013. We included 15 studies covering a total of 1282 participants with MCI. The majority of studies (n = 9) were published between 2010 and 2013. The remaining six studies were published between 2004 and 2009. All included studies were conducted in Europe.

Study sizes varied, ranging from 15 to 231 participants. The mean age (range) of the youngest sample was 64 years (45 to 76 years), and the mean age (standard deviation) of the oldest sample was 73.4 years (6.6 years).

Quality of the evidence

Our findings are based on studies with poor reporting. The majority of studies had an unclear risk of bias (a systematic distortion of the results) because of insufficient details on how participants were selected and how the clinical diagnosis of dementia was made. Based on the assessment of how the CSF tests were conducted and interpreted, eight of the 15 studies were of poor methodological quality.

Key findings

Below is a summary of the key findings for the tests:

CSF t‐tau test for conversion from MCI to Alzheimer's disease dementia

Sensitivity values in seven individual studies ranged from 51% to 90%, while specificity values ranged from 48% to 88%. Statistical analysis of these studies showed that, at a fixed specificity of 72%, the estimated sensitivity was 77%, and that, at a prevalence of 37%, the positive predictive value was 62% and the negative predictive value was 84%. On the basis of these results, on average 62 out of 100 people with MCI and a positive test result would develop Alzheimer's disease dementia, but 38 would not. On average, 84 out of 100 people with MCI and a negative test result would not develop Alzheimer's disease dementia, but 16 would.

CSF p‐tau test for conversion from MCI to Alzheimer's disease dementia

Sensitivity values in six individual studies ranged from 40% to 100%, while specificity values ranged from 22% to 86%. Statistical analysis of these studies showed that, at a fixed specificity of 48%, the estimated sensitivity was 81%, and that, at a prevalence of 37%, the positive predictive value was 48% and the negative predictive value was 81%. On the basis of these results, on average 48 out of 100 people with MCI and a positive test result would develop Alzheimer's disease dementia, but 52 would not. On average, 81 out of 100 people with MCI and a negative test result would not develop Alzheimer's disease dementia, but 19 would.
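The predictive values quoted for both tests can be reproduced from sensitivity, specificity and prevalence using Bayes' theorem. The Python sketch below is illustrative only: the function name `predictive_values` is an assumption, and the inputs are the rounded figures from this summary rather than the underlying model output.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) as proportions, using Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values from the summary; prevalence of conversion assumed to be 37%
for name, sens, spec in [("CSF t-tau", 0.77, 0.72), ("CSF p-tau", 0.81, 0.48)]:
    ppv, npv = predictive_values(sens, spec, prevalence=0.37)
    print(f"{name}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Because the predictive values depend on prevalence, the same test would give different figures in a setting where fewer or more people with MCI go on to develop Alzheimer's disease dementia.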

We found that diagnostic testing of the CSF, as a single test, lacks the accuracy to identify those people with MCI who will develop Alzheimer's disease dementia or other forms of dementia within a given period. The data suggest that a negative CSF test in people with MCI indicates, with relatively high probability, the absence of Alzheimer's disease as the cause of their clinical symptoms. A positive CSF test, however, does not confirm the presence of Alzheimer's disease as the cause of the clinical symptoms in people with MCI.

There were methodological problems in the included studies which meant that a clear answer to the review question was not possible. The main limitations of this review were the poor reporting in the included studies, the lack of widely accepted thresholds for CSF tests in people with MCI, differences in the length of follow‐up, and the marked variation in the accuracy of CSF tests between the included studies.

Authors' conclusions

Implications for practice

The principal conclusion from our review is that of ongoing uncertainty regarding the true value of these tests in the management of people with prodromal dementia or MCI.

The use of and access to lumbar punctures (LP) in dementia clinics varies greatly between and within countries. The test is usually straightforward with only very occasional side effects, such as headache. However, acceptability of an LP to patients and carers also varies greatly and may reflect the views of the clinicians proposing its use and their perspective on the diagnostic value of the test. In the context of the new diagnostic criteria being used for prodromal Alzheimer's dementia (Dubois 2014), the tests studied here have been used as being indicative of Alzheimer's pathology. The data from this review suggest that a negative CSF test, in people with MCI, is likely to reflect the absence of Alzheimer's disease pathology as the aetiology of their clinical symptoms. However, in CSF sampling for ABeta and tau levels, a positive result does very little to indicate the presence of Alzheimer's disease as the aetiology of their clinical symptoms. In the new National Institute on Aging and Alzheimer Association criteria (Albert 2011), the presence of abnormally high t‐tau/p‐tau or low ABeta is thought to indicate MCI due to Alzheimer's disease. This is not supported by this or previous reports. What is more consistent with our findings is that, in the presence of normal levels of CSF t‐tau/p‐tau and ABeta, we can say MCI is not due to Alzheimer's disease. This is an important distinction and one that has important implications when conveying risk of progression to people with MCI, as well as when giving pretest counselling to people before the test takes place. The positive and negative likelihood ratios we have generated as illustrations of our findings demonstrate that positive and negative test results do change the probability of disease from pre‐ to post‐test. However, these changes are relatively small and well below what would be expected from standard thresholds for a good test. 
The language used in the new criteria does not reflect this level of uncertainty or small incremental benefit, and therefore confers much greater diagnostic accuracy on these tests than is currently merited. Our review suggests that where these tests are used to assist clinical diagnosis, their limitations and low incremental benefit should be considered. In the current absence of any disease‐modifying interventions, overdiagnosis may do a patient greater harm than underdiagnosis. However, this is a rapidly moving field and if disease‐modifying or secondary prevention interventions become available, then this opinion will shift, more so if the interventions are effective, low cost and well tolerated.
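To illustrate the size of the shift from pre‐ to post‐test probability discussed above, the following Python sketch converts a pre‐test probability to odds, multiplies by the likelihood ratio, and converts back. It uses the CSF t‐tau likelihood ratios reported in the abstract (2.72 and 0.32) and takes the median percentage converting (37%) as the pre‐test probability. That choice of pre‐test probability and the function name `post_test_probability` are assumptions made for illustration.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via the odds form of Bayes' theorem."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


PRE_TEST = 0.37  # assumed: median percentage converting to Alzheimer's disease dementia

after_positive = post_test_probability(PRE_TEST, likelihood_ratio=2.72)  # CSF t-tau LR+
after_negative = post_test_probability(PRE_TEST, likelihood_ratio=0.32)  # CSF t-tau LR-
print(f"pre-test {PRE_TEST:.0%} -> positive test {after_positive:.0%}, "
      f"negative test {after_negative:.0%}")
```

Under these assumptions, a positive result still leaves a substantial chance that the person will not convert, while a negative result lowers but does not remove the risk.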

Implications for research

These tests nevertheless still have value in clinical trials of drugs proposed to affect the Alzheimer's disease process, where normal levels should be used to exclude subjects from the trial, in the knowledge that many individuals with 'positive' tests who are entered into the study will still not progress to Alzheimer's disease dementia. Moreover, there may well be an interaction between test results, diagnosis, and stage of illness. For instance, abnormalities in these tests may be more specific if noted in younger people with no or minimal symptoms as opposed to older, symptomatic people, in whom they may reflect nonspecific neurodegeneration, ageing or physiological reactions to ABeta oligomerisation (plaque formation). Only by undertaking longitudinal studies in mid‐life, preclinical populations can we answer that proposition; these types of studies are ongoing. Several initiatives to collate data from across numerous cohort studies are also commencing. These include Dementias Platform UK and the IMI‐funded European Medical Information Framework (EMIF) (http://www.emif.eu; http://www.dementiasplatform.uk) and the European Prevention of Alzheimer's Dementia (EPAD) (Ritchie 2016) programme, which will deliver analyses of source‐aggregated data that will, in most cases, have been collected under standardised conditions, as well as (in the case of EPAD) developing a new longitudinal cohort which will provide at least a ten‐fold increase in sample size for predementia disease modelling (Ritchie 2016). New cohort data are also being regularly published, especially as extant cohorts undergo further assessments. Together, this suggests that, in such a rapidly moving field, this review will need to be updated on a regular basis and will include data from preclinical (asymptomatic) as well as MCI populations.

Summary of findings

Summary of findings Performance of CSF biomarkers in early diagnosis of dementia

What is the diagnostic accuracy of CSF biomarker levels for detecting Alzheimer's disease pathology in people with mild cognitive impairment (MCI), and identifying those MCI participants who would convert to Alzheimer’s disease dementia or other forms of dementia over time?

Descriptive

Patient population

Participants diagnosed with MCI at baseline using any of the Petersen criteria or CDR = 0.5 or any 16 definitions included by Matthews (Matthews 2008)

Sampling procedure

Consecutive or random (n = 5)

Not consecutive or random (n = 3)

Unclear (n = 7)

Sources of recruitment

University memory clinic (n = 8); European multicentre memory clinics (n = 2); inpatients (n = 2); General Hospital memory clinic (n = 1); Research centre outpatient memory clinic (n = 1); not reported (n = 1)

Prior testing

The only testing prior to performing the plasma and CSF biomarkers was the application of diagnostic criteria for identifying participants with MCI.

MCI criteria

Petersen criteria (n = 14)

Global Deterioration Scale (GDS) (n = 1)

Index tests

CSF t‐tau or CSF p‐tau or CSF p‐tau/ABeta ratio or CSF t‐tau/ABeta ratio

Reference standard

NINCDS‐ADRDA and/or DSM and/or ICD criteria for Alzheimer's disease dementia (n = 12); Global Deterioration Scale (GDS) & Research criteria (n = 1); CDR = 1 criteria (n = 1); not specified (n = 1)

McKeith criteria for Lewy body dementia; Lund criteria for frontotemporal dementia; and NINDS AIREN criteria for vascular dementia

Target condition

Alzheimer’s disease dementia or any other types of dementia

Included studies

Prospectively well‐defined cohorts of MCI participants (n = 7), nested case‐control studies with a prospectively defined MCI group (n = 6) and studies with a retrospectively defined MCI group with longitudinal data (n = 2).

Fifteen studies (N = 1282 participants) were included. Number included in analysis: 1172

Quality concerns

Patient selection and conduct of the reference standard were poorly reported. Applicability concerns were generally low. Regarding the inclusion criteria set in the review, the majority of included studies did match the review question: 'Could CSF t‐tau and CSF t‐tau/ABeta ratio biomarkers identify those MCI participants with Alzheimer’s disease pathology at baseline who would convert clinically to dementia at follow up?' However, due to the limited number of included studies and the levels of heterogeneity, it is difficult to determine to what extent the findings from a meta‐analysis can be applied to clinical practice.

Limitations

Limited investigation of heterogeneity due to insufficient number of studies. There was a lack of common thresholds.

The last three columns give the consequences in a cohort of 100 people with MCI.

| Target condition | Test | Studies | Cases/participants | Median specificity from included studies (%) | Sensitivity, % (95% CI), at median specificity [1] | Median percentage converting [2] | Missed cases | Overdiagnosed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Alzheimer's disease dementia | CSF t‐tau | 7 | 436/709 | 72 | 77 (67, 85) | 37 | 9 | 18 |
| Alzheimer's disease dementia | CSF p‐tau | 6 | 164/492 | 47.5 | 81 (64, 91.5) | 37 | 7 | 33 |
| Alzheimer's disease dementia | CSF p‐tau/ABeta ratio | 5 | 140/433 | No meta‐analysis | No meta‐analysis | | | |
| All types of dementia | CSF t‐tau | 4 | 166/319 | No meta‐analysis | No meta‐analysis | | | |

Investigation of heterogeneity: the planned investigations were not possible due to the limited number of studies available for each analysis. We were unable to investigate the effect of duration of follow‐up due to substantial variation in length and reporting.

Conclusions: Given the insufficient evidence to evaluate the diagnostic value in MCI of CSF t‐tau, CSF p‐tau, CSF t‐tau/ABeta ratio and CSF p‐tau/ABeta ratio for Alzheimer's disease dementia and other forms of dementia examined in this review, particular attention should be paid to the risk of misdiagnosis and overdiagnosis of dementia (and therefore overtreatment) in clinical practice. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one that has been available from the included studies we have identified.

[1] Meta‐analytic estimate of sensitivity derived from the HSROC model at a fixed value of specificity. Summary estimates of sensitivity and specificity were not computed because the studies that contributed to the estimation of the summary ROC curve used different thresholds.

[2] The median percentage converting was calculated using all the studies that reported 'conversion from MCI to Alzheimer's disease dementia' (Table 2).
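The 'missed cases' and 'overdiagnosed' figures in the summary of findings can be reproduced with simple arithmetic: missed cases are converters with a negative test, and overdiagnosed are non‐converters with a positive test. The Python sketch below assumes a cohort of 100, the tabulated median percentage converting of 37%, and the tabulated sensitivity and specificity. The function name `cohort_consequences` is illustrative.

```python
def cohort_consequences(sensitivity: float, specificity: float,
                        percentage_converting: float,
                        cohort_size: int = 100) -> tuple[int, int]:
    """Return (missed cases, overdiagnosed) for a cohort, rounded to whole people."""
    converters = cohort_size * percentage_converting / 100.0
    non_converters = cohort_size - converters
    missed = converters * (1.0 - sensitivity)             # false negatives
    overdiagnosed = non_converters * (1.0 - specificity)  # false positives
    return round(missed), round(overdiagnosed)


# Values taken from the summary of findings rows
print("CSF t-tau:", cohort_consequences(0.77, 0.72, percentage_converting=37))
print("CSF p-tau:", cohort_consequences(0.81, 0.475, percentage_converting=37))
```

The lower specificity of CSF p‐tau is what drives its larger number of overdiagnosed people, even though it misses slightly fewer cases.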

Background

Dementia is a progressive syndrome of global cognitive impairment with resultant functional decline.  In the United Kingdom (UK), it affects 5% of the population over 65 and 25% of those over 85 (Knapp 2007).  Worldwide, there were estimated to be 36 million people living with dementia in 2010 (Wilmo 2010), and this will increase to over 115 million by 2050 (Prince 2013). The greatest increases in prevalence are likely to be seen in the developing regions.  By 2040, China and its western‐Pacific neighbours are predicted to have 26 million people living with dementia (Ferri 2005). 

Dementia encompasses a group of neurodegenerative disorders that are characterised by progressive loss of cognitive function and ability to perform activities of daily living, that can be accompanied by neuropsychiatric symptoms and challenging behaviours of varying type and severity.  The underlying pathology is usually degenerative and subtypes of dementia include Alzheimer’s disease dementia, vascular dementia, dementia with Lewy bodies, and frontotemporal dementia.  There may be considerable overlap in the clinical and pathological presentations (MRC CFAS 2001), and there is often coexistence of Alzheimer’s disease dementia, vascular dementia and other causes of neuronal atrophy (Matthews 2009; Savva 2009). 

Alzheimer’s disease dementia is an incurable, progressive, neurodegenerative condition which accounts for over 50% of all dementias, afflicting 5% of men and 6% of women over the age of 60 worldwide (World Health Organization 2010). Its prevalence increases exponentially with age, with Alzheimer’s dementia affecting fewer than 1% of people aged from 60 to 64 years, but 24% to 33% of those over the age of 85 (Ferri 2005).

There have been over a dozen different definitions used to describe cognitive impairment that is somehow qualitatively different from so‐called ‘normal’ ageing. The first complaints in people with Alzheimer’s disease spectrum are often cognitive problems such as problems with planning and judgement, as well as the more characteristic memory complaints. This may lead to a diagnosis of Mild Cognitive Impairment (MCI) if formal testing reveals objective evidence of cognitive impairment. It has not been previously mandated which psychometric tests should be used to objectively define cognitive impairment. However, the objectivity of the cognitive impairment diagnosis is critical, as it differentiates this population from a group with subjective cognitive impairment, which is more likely to have a non‐neurodegenerative aetiology. MCI is a heterogeneous condition, the diagnosis of which holds very little prognostic significance. There are four outcomes for those within an MCI population: progression to Alzheimer’s disease dementia, progression to another dementia, maintaining stable MCI, and recovery. Currently, 16 different classifications are used to define MCI (Matthews 2008). In this review, MCI refers to this extended definition of MCI, to the clinical criteria defined by the Petersen criteria or revised Petersen criteria (Petersen 1999; Petersen 2004; Winbald 2004), or to the Clinical Dementia Rating (CDR = 0.5) scale (Morris 1993). 

Studies indicate that an annual average of 5% to 15% of people with MCI progress to Alzheimer’s disease dementia (Petersen 1999; Bruscoli 2004; Mattson 2009; Petersen 2009). This all depends on clinical profile, settings and investigation for vascular disease. At the present time, there is no clinical method to determine accurately which of those people with MCI will develop Alzheimer’s disease dementia or other forms of dementia. 

Recent consensus guidelines have been developed, e.g. the second iteration of the International Working Group (IWG2) criteria on 'prodromal dementia', which seek to improve prognostic accuracy in the prodromal phase of Alzheimer's dementia by incorporating Alzheimer's disease‐related biomarkers into the criteria (Dubois 2014). It is in this context that reviews such as this one become especially relevant and timely.

Research suggests that measurable change in positron emission tomography (PET), magnetic resonance imaging (MRI) and cerebrospinal fluid (CSF) biomarkers occurs years in advance of the onset of clinical symptoms (Beckett 2010). In this review, we aimed to assess the ability of CSF total tau (t‐tau), CSF phosphorylated tau (p‐tau), the CSF t‐tau/ABeta ratio, and the CSF p‐tau/ABeta ratio, to enable the detection of Alzheimer’s dementia and other forms of dementia in people with MCI. These biomarkers have been chosen as they are considered to be the most intimately expressed biomarkers of the Alzheimer's disease core pathology; namely, the aggregation and fibrilisation of the amyloid plaque and hyperphosphorylation of tau. Consequently, these biomarkers have been proposed as important in new criteria for Alzheimer's disease dementia that incorporate biomarker abnormalities. PET imaging of amyloid is now approved by both the FDA and EMA to rule out Alzheimer's disease as the aetiology of MCI, especially in individuals with unusual clinical presentations. However, manufacturers of these tracers have ongoing 'appropriate use criteria' post‐marketing studies to learn where these tests have greatest usage and utility for the person's accurate diagnosis. Recent improvements to CSF sampling and the relatively inexpensive nature of this test compared with PET scanning mean that it will remain the test of choice for documenting CSF protein abnormalities in neurodegenerative disease. Side effects are increasingly rare but include headache and local reactions at the site of the lumbar puncture. Patients on anticoagulant therapies (except aspirin) are considered by most practitioners to be at too high a risk to undergo this procedure for the diagnosis of Alzheimer’s dementia.

Target condition being diagnosed

In this review, there are two target conditions: i) Alzheimer's disease dementia and ii) other forms of dementia, both of which were assessed at follow‐up.

We compared the index test results obtained at baseline with the results of the reference standard (clinical criteria) obtained at follow‐up (delayed verification of clinical diagnosis). 

Index test(s)

This review is part of a suite of reviews for assessing the accuracy of CSF ABeta (Ritchie 2014), PET Amyloid (Zhang 2014; Smailagic 2015), MMSE (Arevalo‐Rodriguez 2015), and other index tests in identifying those people with MCI without clinical onset of dementia, who would develop Alzheimer's disease dementia or other forms of dementia during follow‐up. We planned to consider the following:

Total tau (t‐tau) and phosphorylated tau (p‐tau) CSF biomarker tests

Tau is a microtubule‐associated protein located primarily in neuronal axons. There are six different human isoforms, each of which has multiple phosphorylation sites. Physiologically tau interacts with tubulin and plays an important role in the organisation and stabilisation of microtubules. Independent of phosphorylation status, slightly increased levels of CSF total tau (t‐tau) have been associated with ageing, vascular dementia, multiple sclerosis, AIDS dementia, head injury and tauopathy; significant increases with Creutzfeldt‐Jakob disease and meningoencephalitis; and a threefold increase has been seen in Alzheimer’s disease compared to normal controls (Shoji 2002). A systematic review of CSF biomarkers for Alzheimer’s disease analysing 41 studies of CSF t‐tau, demonstrated a specificity of 90% and sensitivity of 81% in diagnosing the condition (Blennow 2003).

The p‐tau protein also has a number of potential phosphorylation sites (Billingsley 1997) and abnormal hyperphosphorylation has been shown to be associated with microtubule disruption and the formation of neurofibrillary tangles, dystrophic neurites surrounded by neuritic plaques, and neuropil threads, major components of Alzheimer’s disease pathophysiology (see Mandelkow 1998). A systematic review of 11 studies of CSF p‐tau in Alzheimer’s disease indicated a diagnostic specificity and sensitivity of 92% and 80% respectively (Blennow 2003).

There is great interest around the use of biomarkers and imaging techniques for the prediction of progression from MCI populations to Alzheimer’s disease dementia and other forms of dementia. The international consortium study Alzheimer Disease Neuroimaging Initiative (ADNI), performed between 2004 and 2009, has so far been a key cohort study for predicting the progression from MCI to Alzheimer’s disease using biomarkers, and demonstrated a sensitivity and specificity of CSF t‐tau of 70% and 92% and CSF p‐Tau181 of 68% and 73% respectively (Petersen 2010).

T‐tau/ABeta ratio and p‐tau/ABeta ratio CSF biomarker tests

ABeta is produced mainly by neurons, secreted into the CSF and then cleared through the blood‐brain barrier and degraded by the reticuloendothelial system. ABeta levels are thus regulated in strict equilibrium between the brain, CSF and blood (Shoji 1992), but, in Alzheimer’s disease patients, ABeta42 forms insoluble amyloid and accumulates as intracerebral fibrils, resulting in decreased levels of CSF ABeta42 (Shoji 2001).

ABeta in CSF has only modest potential as a test for delayed verification of Alzheimer’s disease (Ritchie 2014), with meta‐analysis of studies being hampered by poor methodological quality (Noel‐Storr 2013) and multiple thresholds being reported between studies (Ritchie 2011).

In 2001, the American Academy of Neurology produced practical guidelines for dementia, including three Class II or III reports in a systematic review of a combination study of ABeta42 and t‐tau CSF levels. The sensitivity and specificity for diagnosis of Alzheimer’s disease were 85% and 87% (Knopman 2001), supported by the 2001 systematic review revealing 83% to 100% sensitivity and 85% to 95% specificity for the CSF ABeta42 and t‐tau combination assay (Blennow 2003). Again, the ADNI cohort study demonstrated that the t‐tau/ABeta42 ratio could be used to predict conversion from MCI to Alzheimer’s disease dementia, revealing a sensitivity of 86% and specificity of 85% (Petersen 2010).

Clinical pathway

Dementia develops over several years and there is a presumed period when people are asymptomatic, although disease pathology may have accumulated. Individuals or their relatives may first notice subtle impairments of short‐term memory when the completion of complex tasks such as management of finances or medications becomes increasingly difficult. In the UK, people usually present to their general practitioner who may then refer them to a specialist following a brief cognitive test, clinical examination and exclusion of relevant physical illness. The biomarkers may then be administered by a specialist. There is, however, much regional variability in this, with Spain and Nordic countries favouring CSF sampling in their routine clinical work‐up, whereas other countries, such as the UK, do not. However, many people with dementia do not present until much later in the disorder and they will, therefore, follow a different pathway to diagnosis, for example, being identified during an admission to general hospital for a physical illness. Thus, the pathway influences the accuracy of the diagnostic test. The accuracy of the test will vary with the experience of the administrator, and the accuracy of the subsequent diagnosis will vary with the history of referrals to the particular healthcare setting. Diagnostic assessment pathways may vary in other countries and diagnoses may be made by a variety of specialists including psychiatrists, neurologists, and geriatricians.

Role of index test(s)

The sampling of CSF and assay for levels of tau and ABeta could have a role when applied in specialist clinics. Due to the costs, risks, and complexity of the testing, CSF tests will not be applied in a primary care setting. The roles of these index tests are as add‐on biomarker tests which have been proposed in new research diagnostic criteria to complement clinical examination and cognitive tests.

Alternative test(s)

We did not include alternative tests in this review, because there are currently no standard practice tests available for the diagnosis of dementia. 

Rationale

Recently proposed research diagnostic criteria for ‘prodromal dementia’/‘pre‐dementia stage’/‘MCI due to Alzheimer's disease pathology’, for ‘Alzheimer’s disease’ and for the ‘preclinical states of Alzheimer's disease’ (Albert 2011; Dubois 2010; Dubois 2014) incorporate biomarkers based on imaging or CSF measures within the diagnostic rubric. These tests are core to the criteria, on the assumption that they will improve the specificity of the traditional, solely clinical criteria. It is crucial that each of these biomarkers is assessed for its diagnostic accuracy before it is adopted as a routine test in clinical practice. It is worth noting that in each of these criteria, a single abnormality in any of the proposed biomarker/imaging tests is considered sufficient to make a diagnosis of prodromal Alzheimer’s disease dementia.

Underpinning the new criteria is the assumption that if Alzheimer’s disease pathology can be diagnosed at an earlier, pre‐dementia stage, this could open critical windows for interventions that will have a greater likelihood of success in affecting disease pathways and thereby improving clinical symptoms. Earlier accurate diagnosis will also help people with pre‐dementia cognitive impairment, their families and potential carers make timely plans for the future. Coupled with appropriate contingency planning, proper recognition of the disease may also help to prevent inappropriate and potentially harmful admissions to hospital or institutional care (Bourne 2007). In addition, the accurate early identification of a dementia syndrome may improve opportunities for the use of newly evolving interventions designed to delay or prevent progression to more debilitating stages of dementia.

Objectives

To determine the diagnostic accuracy of 1) CSF t‐tau, 2) CSF p‐tau, 3) the CSF t‐tau/ABeta ratio and 4) the CSF p‐tau/ABeta ratio index tests for detecting people with MCI at baseline who would clinically convert to Alzheimer’s disease dementia or other forms of dementia at follow‐up.

Secondary objectives

To investigate the amount and associations of heterogeneity in the included studies of test accuracy.

We expected heterogeneity to be an important component of the review. We planned to use target population, index test, target disorder and study quality as a framework for the investigation of heterogeneity.

Methods

Criteria for considering studies for this review

Types of studies

We considered longitudinal cohort studies in which index test results were obtained at baseline and the reference standard results at follow‐up (see Index tests; Reference standards). These studies necessarily employ delayed verification of conversion to dementia and are sometimes labelled as ‘delayed verification cross‐sectional studies’ (Bossuyt 2008; Knottnerus 2002). This approach recognises the challenges of concurrent application of the reference test and index test. In reality, the reference standard for dementia is tissue sampling and histological examination, either at post mortem or from brain biopsy. Brain biopsy is not undertaken in any setting and a post mortem is so distant an event from the index test being conducted that there is the possibility that disease may have developed in the years after the index test. The Dementia DTA group chose to use later diagnosis of dementia (using standardised criteria) as evidence of delayed verification. This methodology has been published by our group (Mason 2010) and also reflects the approach taken in most of the primary research in this area.

We included nested case‐control studies if they incorporated a delayed verification design. We believe this can only occur in the context of a cohort study, so these studies are invariably diagnostic nested cohort studies. We only included data on performance of the index test to discriminate between people with MCI who converted to dementia and those who remained stable from those studies. We did not consider data from healthy controls or any other control group.

Participants

Participants recruited and clinically classified as those with mild cognitive impairment (MCI) at baseline were eligible for inclusion in this review. The diagnosis of MCI was established using the Petersen criteria or revised Petersen criteria (Petersen 1999; Petersen 2004; Winblad 2004) and/or Matthews criteria (Matthews 2008) and/or 'CDR = 0.5' (Morris 1993). These criteria include: subjective complaints; a decline in memory objectively verified by neuropsychological testing in combination with a history from the patient; a decline in other cognitive domains; no or minimal impairment of activities of daily living; and not meeting the criteria for dementia. Therefore, the eligible participants had a number of tests, e.g. neuropsychological tests for cognitive deficit and checklists for activities of daily living, before study entry. Participants were defined either as amnestic single domain, amnestic multiple domain, non‐amnestic single domain, non‐amnestic multiple domain, or nonspecified MCI participants.

We included participants from secondary and tertiary settings. Although demographic and clinical characteristics of MCI, as well as sources of recruitment, might differ in those settings, we decided not to limit our review by setting; instead, we planned to look for variation within and between settings, and examined the potential influence of the setting on diagnostic performance of the index test in the analyses.

We excluded those studies that included people with MCI possibly caused by: i) current or past alcohol/drug abuse; ii) central nervous system (CNS) trauma (e.g. subdural haematoma), tumour, or infection; iii) other neurological conditions, e.g. Parkinson’s or Huntington’s diseases.

Details of the causes of study dropouts are crucial: if such data are missing, the reliability of the conclusions must be questioned. We therefore planned to take this into consideration.

Index tests

Studies that assessed the accuracy of CSF t‐tau, CSF p‐tau, the CSF t‐tau/ABeta ratio, or the CSF p‐tau/ABeta ratio were included.

There are currently no generally accepted standards for the plasma or CSF ABeta test threshold, and therefore it was not possible to prespecify what constituted a positive or negative result. We used the criteria which were applied in each included primary study to classify participants as either test positive or test negative.

Measure of index test: t‐tau, p‐tau and ABeta levels in CSF (ng/L or pg/mL)

The assays most commonly used were the conventional kit (Innogenetics, Ghent, Belgium), the INNOTEST Phospho‐Tau(181) kit, the INNOTEST ABeta42 kit, or the multiplexing INNO‐BIA AlzBio3 kit for CSF.

We did not include a comparator test because there are currently no standard practice tests available for the diagnosis of dementia. We compared the index tests with a reference standard.

Target conditions

There were two target conditions in this review:

  1. Alzheimer’s disease dementia (conversion from MCI to Alzheimer’s disease dementia)

  2. Any other forms of dementia (conversion from MCI to any other forms of dementia)

Reference standards

For the purpose of this review, several definitions of Alzheimer’s disease dementia were acceptable. Included studies could apply probable or possible NINCDS‐ADRDA (National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association) criteria (McKhann 1984). The Diagnostic and Statistical Manual of Mental Disorders (DSM) (DSMIII 1987; DSMIV 1994) and International Classification of Diseases (ICD) (World Health Organization 2010) definitions for Alzheimer’s disease dementia were also acceptable. It should be noted that different iterations of these standards may not be directly comparable over time (e.g. DSM‐IIIR versus DSM‐IV). Moreover, the validity of the diagnoses may vary with the degree or manner in which the criteria have been operationalised (e.g. individual clinician versus algorithm versus consensus determination). We planned to consider all these issues in interpreting the results, using sensitivity analyses as appropriate.

Similarly, differing clinical definitions of other forms of dementia were acceptable. For Lewy body dementia, the reference standard is the McKeith criteria (McKeith 1996; McKeith 2005). For frontotemporal dementia, the reference standard is the Lund criteria (LMG 1994; Neary 1998; Boxer 2005). DSM (DSMIII 1987; DSMIV 1994) and ICD (World Health Organization 2010) were also acceptable for frontotemporal and vascular dementias.

The time interval over which progression from MCI to Alzheimer’s disease dementia or other forms of dementia happened is also important. As age is the principal risk factor for Alzheimer's dementia and other forms of dementias, the longer the duration of follow‐up, the more likely the possibility of generating false positive findings for the index test. To this end, no limits were put on the length of follow‐up in the included studies, though this important variable was captured so we could examine between‐study variations. This change reflected an alteration to the original thinking in the published protocol and is noted in the Differences between protocol and review section of this review.

We planned to segment analyses into separate follow‐up periods for the delay in verification: less than one year; one year to less than two years; two to less than four years; and more than four years.
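
The planned follow‐up bands can be sketched as a simple lookup. This is an illustration only: the function name `followup_band` is hypothetical, and because the bands as stated leave a follow‐up of exactly four years unassigned, the sketch assumes that such a value belongs to the longest band.

```python
def followup_band(years: float) -> str:
    """Assign a follow-up duration (in years) to one of the planned bands."""
    if years < 0:
        raise ValueError("follow-up duration cannot be negative")
    if years < 1:
        return "less than one year"
    if years < 2:
        return "one year to less than two years"
    if years < 4:
        return "two to less than four years"
    # Assumption: exactly four years is grouped with the longest band.
    return "four years or more"


# Illustrative durations, expressed in years.
for years in (0.5, 1.5, 2.0, 5.9):
    print(years, "->", followup_band(years))
```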

Search methods for identification of studies

Electronic searches

The main search for this review was performed in January 2013. However, we ran a top‐up search in December 2015. We searched MEDLINE (OvidSP), Embase (OvidSP), BIOSIS Previews (Thomson Reuters Web of Science), Web of Science Core Collection, including Conference Proceedings Citation Index (Thomson Reuters Web of Science), PsycINFO (OvidSP), and LILACS (BIREME) (see Appendix 1 for details of the sources searched, the search strategies used, and the number of hits that were retrieved for the search carried out in January 2013). The results of the top‐up search that was carried out in December 2015 have not yet been fully incorporated into the review (please see Results of the search for more details).

We also requested a search of the Cochrane Register of Diagnostic Test Accuracy Studies (managed by the Cochrane Renal Group).

We did not apply any language or date restrictions to the electronic searches. We did not use methodological search filters (collections of terms aimed at reducing the number needed to screen by filtering out irrelevant records and retaining only those that are relevant) in the main bibliographic databases (MEDLINE, Embase and PsycINFO) as a single‐stranded method to restrict the search overall, because available filters have not yet proved sensitive enough for systematic review searches (Beynon 2013). Instead, to maximise sensitivity, we used a multi‐stranded approach, with some searches run in parallel, that included specific terms designed to capture diagnostic studies (see search narrative in Appendix 1).

Searching other resources

We checked the reference lists of all relevant studies for additional studies. We also conducted searches in the MEDION database (Meta‐analyses van Diagnostisch Onderzoek) at www.mediondatabase.nl, Database of Abstracts of Reviews of Effects (DARE) at http://www.crd.york.ac.uk/CRDWeb, Health Technology Assessment Database (HTA Database) at http://www.crd.york.ac.uk/CRDWeb, and Aggressive Research Intelligence Facility (ARIF) database at www.arif.bham.ac.uk for other related systematic diagnostic accuracy reviews; we searched for systematic reviews of diagnostic studies from the International Federation of Clinical Chemistry and Laboratory Medicine Committee for Evidence‐based Laboratory Medicine database (C‐EBLM). We checked reference lists of any relevant systematic reviews for additional studies. We also contacted researchers involved in relevant studies for applicable and usable but unpublished data.

Data collection and analysis

Selection of studies

Two researchers (EL and AN‐S) screened all titles and abstracts generated by the electronic database searches for relevance. 

Two researchers (EL and AN‐S) independently reviewed the remaining abstracts of selected titles and selected all potentially‐eligible studies for full text review. Four researchers (NS, AN‐S, SM and EL) independently further assessed full manuscripts against the inclusion criteria (see Criteria for considering studies for this review). Where necessary, a third arbitrator (CWR) resolved disagreements that the two researchers could not resolve through discussion.

Where a study included useable data but these were not presented in the published manuscript, we contacted the authors directly to request further information. If the same data set was presented in more than one paper, we included only the primary paper.

We detailed the numbers of studies selected at each point in a study flow diagram (Figure 1).



Study flow diagram

Note: a top‐up search performed in December 2015 revealed 6134 records

85 records retained after de‐duplication and assessment by one experienced reviewer

81 records excluded after further assessment performed by two review authors

4 studies identified for possible inclusion (Characteristics of studies awaiting classification)

Data extraction and management

We extracted data onto a study‐specific form which included the following:

  • Author, year of publication, and journal.

  • The index test and assay type used (thresholds used to define positive and negative tests).

  • The criteria used for clinical definition for the baseline population.

  • Baseline demographics of the study population (age, gender, apolipoprotein E (ApoE) status, MMSE and clinical setting).

  • The duration of follow‐up (mean, minimum, maximum, and median).

  • The proportion of participants developing the outcome of interest (Alzheimer's disease dementia using NINCDS‐ADRDA criteria) as well as other forms of dementias where standard criteria were used.

  • The sensitivity and specificity of the index test in defining Alzheimer's disease dementia (these were used to back‐translate into a 2 x 2 table (Appendix 2)).

  • Other data relevant for creating 2 x 2 tables (TP = true test positive; FP = false test positive; FN = false test negative; TN = true test negative) e.g. the number of 'abnormal' and 'normal' tests and baseline variables; the number of disease 'presence' and disease 'absence' at follow‐up, as well as through scrutiny of scatter plots.
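
The back‐translation of reported sensitivity and specificity into a 2 x 2 table can be sketched as follows. This is a minimal illustration, not the procedure documented in Appendix 2: the function name `back_translate` is hypothetical, and rounding to the nearest whole participant is an assumption. The example figures (134 participants, 57 converters, sensitivity 51%, specificity 88%) are those reported for CSF t‐tau in Hansson 2006.

```python
def back_translate(n_total, n_converters, sensitivity, specificity):
    """Rebuild 2 x 2 counts from reported sensitivity and specificity.

    Counts are rounded to the nearest whole participant (an assumption),
    so the result is approximate when the published percentages were
    themselves rounded.
    """
    n_stable = n_total - n_converters           # participants who did not convert
    tp = round(sensitivity * n_converters)      # converters with a positive test
    fn = n_converters - tp                      # converters with a negative test
    tn = round(specificity * n_stable)          # non-converters with a negative test
    fp = n_stable - tn                          # non-converters with a positive test
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


table = back_translate(n_total=134, n_converters=57,
                       sensitivity=0.51, specificity=0.88)
print(table)  # {'TP': 29, 'FP': 9, 'FN': 28, 'TN': 68}
```

The reconstructed counts of false positives and false negatives can then be compared with any counts reported directly in the paper.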

We also extracted data necessary for the assessment of quality as defined below.

Data extraction was performed independently by two blinded review authors (NS and AN‐S). Disagreement in data extraction was resolved by discussion, with the potential to involve a third author (CWR) as arbitrator, if necessary.

Assessment of methodological quality

Two review authors (NS and AN‐S), blinded to each other’s scores, independently performed methodological quality assessments of each study using the QUADAS‐2 tool (Whiting 2011), as recommended by the Cochrane Collaboration. Disagreement was resolved by further review and discussion with the potential to involve a third author (CWR) as arbitrator, if necessary.

The tool is made up of four domains: participant selection, index test, reference standard and participant flow. Each domain was assessed in terms of risk of bias, with the first three domains also considered in terms of applicability concerns (QUADAS‐2) (Appendix 3). The components of each of these domains, and a rubric detailing how judgements concerning risk of bias are made, are given in Appendix 4. Key areas of quality assessment for this review were participant selection, index test, and blinding.

We did not use QUADAS‐2 data to form a summary quality score. We produced a narrative summary describing numbers of studies that were found to have high/low/unclear risk of bias, as well as concerns regarding applicability.

Statistical analysis and data synthesis

We evaluated test accuracy according to the target condition. There are no accepted thresholds to define what constitutes a positive or negative CSF index test for identifying those people with MCI who would convert to Alzheimer’s disease dementia or other forms of dementia over time. Therefore, the estimates of diagnostic accuracy reported in primary studies were likely to be based on data‐driven threshold selection (Leeflang 2008). We conducted exploratory analyses by plotting estimates of sensitivity and specificity from each study on forest plots and in receiver operating characteristic (ROC) space. We did not meta‐analyse pairs of sensitivity and specificity using the bivariate model, as originally planned, because the results were not clinically interpretable when studies with mixed thresholds were included in the analysis. Instead, we fitted HSROC meta‐analysis models to estimate summary ROC curves using SAS (Statistical Analysis Software), version 9.2 (SAS Institute 2011). For illustrative purposes, we derived estimates of sensitivity and likelihood ratios from the HSROC models at a fixed value of specificity (chosen a priori as the median specificity for the studies that were analysed when fitting the model). Confidence intervals for sensitivity and the likelihood ratios were calculated using the delta method (Davison 2003), using the 'estimate' command after fitting the HSROC models in SAS. HSROC models were only fitted for analyses where data for 2 x 2 tables were provided by at least six studies, given the need to estimate five parameters.

Where HSROC models were fitted, we summarised the post‐test probability of conversion from MCI to dementia given a positive test result and given a negative test result for a range of pretest probabilities (prevalences of conversion). This was done by plotting the post‐test probabilities against the pretest probabilities, calculating the former based on the pretest probabilities and the likelihood ratios estimated from the HSROC model at the median of the observed specificity values from the included studies. A positive predictive value (PPV) and a negative predictive value (NPV) were also reported, based on the median prevalence (pretest probability) of conversion across studies. We caution that these post‐test probabilities and PPV and NPV values relate to likelihood ratios for hypothetical values of sensitivity and specificity for which the true threshold value of the index test was not known.
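
The chain from a fitted summary curve to post‐test probabilities can be illustrated with a short sketch. It is not the SAS code used for this review. It assumes the Rutter and Gatsonis parameterisation of the HSROC model, in which sensitivity at a given specificity depends on an accuracy parameter and a shape parameter, and all function names and numerical values below are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def hsroc_sensitivity(accuracy, shape, specificity):
    """Sensitivity on the summary ROC curve at a fixed specificity.

    Assumed (Rutter and Gatsonis) form of the curve:
    logit(sens) = accuracy * exp(-shape / 2) + exp(-shape) * logit(1 - spec)
    """
    return expit(accuracy * math.exp(-shape / 2.0)
                 + math.exp(-shape) * logit(1.0 - specificity))


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest / (1.0 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical parameter values, not estimates from this review.
fixed_spec = 0.70
sens = hsroc_sensitivity(accuracy=2.0, shape=0.0, specificity=fixed_spec)
lr_pos, lr_neg = likelihood_ratios(sens, fixed_spec)
for pretest in (0.2, 0.4, 0.6):
    ppv = post_test_probability(pretest, lr_pos)        # P(conversion | test positive)
    npv = 1.0 - post_test_probability(pretest, lr_neg)  # P(no conversion | test negative)
    print(f"pretest={pretest:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Working on the odds scale keeps the calculation to a single multiplication by the likelihood ratio, which is why the same two ratios can be reused across the whole range of pretest probabilities.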

Investigations of heterogeneity

Heterogeneity was investigated through visual examination of forest plots of sensitivities and specificities and through visual examination of the ROC plot of the raw data. The main sources of heterogeneity were thought likely to be reference standards used, participant sampling, index test methodology and aspects of study quality (particularly inadequate blinding).

Because there were insufficient studies, we did not perform meta‐regression (by including each potential source of heterogeneity as a covariate in the HSROC model) as planned (Differences between protocol and review).

Sensitivity analyses

We planned to investigate the effect of quality items (such as pre‐specifying the threshold) on the accuracy of index tests by undertaking sensitivity analyses. Due to the limited number of studies, we did not perform any sensitivity analyses (Differences between protocol and review).

Assessment of reporting bias

We did not investigate reporting bias because of current uncertainty about how it operates in test accuracy studies and the interpretation of existing analytical tools such as funnel plots. 

Results

Results of the search

The total number of records identified by the searches up to January 2013 was 20,446. After de‐duplication, a small team of assessors performed a first assessment of the remaining records. After a second assessment, 255 records were retained, of which 178 were excluded after assessment performed by two review authors. Seventy‐seven references were identified as possibly eligible studies and were assessed for inclusion (Figure 1). Fifteen papers were included, and 40 were discarded for the following reasons: i) data not suitable for analysis or insufficient data for creating two by two tables (n = 28) (Characteristics of excluded studies); ii) not a delayed verification study (n = 2); iii) not MCI participants at baseline (n = 2); iv) unsuitable index test (n = 2); v) reference not obtained (n = 5). In addition, 22 papers were identified as multiple publications. One paper was not in English (Urakami 2004). No extra studies were found through reference checking. We obtained usable data for five papers (Amlien 2013; Galluzzi 2010; Hansson 2006; Visser 2009; Vos 2013) through contacting the authors.

We ran a top‐up search in December 2015. The results of this search will be fully incorporated into the review at update. However, readers may wish to know that this search identified a total of 6314 results. After screening, four new studies were identified for inclusion within the review (please see Additional Tables: Table 1 for more details). The characteristics of these four new studies and their heterogeneity were all consistent with the fully incorporated studies.

Table 1. Studies awaiting classification

Conversion from MCI to Alzheimer’s disease dementia

| Study | Participants n/N (included in analysis) | Index test | Positive tests, n/N (%) | Threshold for abnormal test (prespecified Yes/No) | Number of converters (%) | FP and FN | Sensitivity (study level) | Specificity (study level) | Duration of follow‐up |
|---|---|---|---|---|---|---|---|---|---|
| *Balasa 2014 | 51/51 | CSF ABeta42/p‐tau ratio | 25/51 (49%) | < 6.43 (Yes) | 24/51 (47%) | FP = 1; FN = 0 | 100% | 96% | 41 months for MCI‐AD; 30 months for MCI‐MCI |
| *Ewers 2012 | 130/130 | CSF t‐tau | 65/130 (50%) | Not reported | 58/130 (45%) | FP = 30; FN = 23 | 60.7% | 58.9% | 24 months |
| | | CSF p‐tau | 67/130 (51.5%) | Not reported | 58/130 (45%) | FP = 30; FN = 21 | 63.9% | 58.9% | |
| *Leuzy 2015 | 33/33 | CSF t‐tau | 15/33 (45%) | > 400 pg/mL (Yes) | 12/33 (36%) | FP = 7; FN = 4 | 67% | 67% | Not reported |
| | | CSF t‐tau/ABeta ratio | 12/33 (36%) | < 1.14 (Yes) | 12/33 (36%) | FP = 6; FN = 6 | 50% | 71% | |

Conversion from MCI to all dementias

| Study | Participants n/N (included in analysis) | Index test | Positive tests, n/N (%) | Threshold for abnormal test (prespecified Yes/No) | Number of converters (%) | FP and FN | Sensitivity (study level) | Specificity (study level) | Duration of follow‐up |
|---|---|---|---|---|---|---|---|---|---|
| *Eckerstrom 2015 | 73/73 | CSF p‐tau | 15/73 (20.5%) | 73 pg/mL (No) | 27/73 (36.9%) | FP = 3; FN = 15 | 75% | 92% | 43.1 ± 23 months MCI‐stable; 33.7 ± 24 months MCI converters |

Study awaiting translation: Urakami 2004

AD: Alzheimer's disease; FN: false negative; FP: false positive; MCI: mild cognitive impairment

*Authors need to be contacted in order to obtain missing data/relevant information. Data presented are provisional.

Included Studies

The Characteristics of included studies table lists the characteristics of the 15 included studies containing a total of 1282 participants with MCI at baseline of whom 1172 had analysable data. Two studies (Buchhave 2012; Hansson 2006) involved the same cohort. Buchhave 2012 reported the data for the CSF p‐tau/ABeta ratio index test from a new follow‐up period.

Study designs were seven prospectively well‐defined cohorts of participants with MCI (Buchhave 2012; Fellgiebel 2007; Galluzzi 2010; Herukka 2007; Kester 2011; Palmqvist 2012; Vos 2013), six nested case‐control studies with a prospectively defined MCI group (Amlien 2013; Hansson 2006; Koivunen 2008; Monge‐Argiles 2011; Parnetti 2012; Visser 2009) and two studies with a retrospectively defined MCI group with longitudinal data (Eckerstrom 2010; Hampel 2004).

A majority of studies (n = 9) were published between 2010 and 2013. The remaining six studies were published from 2004 to 2008. All of the included studies were conducted in Europe (five in Sweden, two in Italy, two in Finland, one in the Netherlands, one in Spain, one in Norway, one in Germany, and two were European multi‐centre studies). They used one version or another of the Petersen criteria for MCI. Twelve studies applied NINCDS‐ADRDA criteria or NINCDS‐ADRDA and DSM criteria as a reference standard for Alzheimer’s disease dementia. Amlien 2013 used Global Dementia Scale (GDS) and Research criteria, Fellgiebel 2007 used 'CDR = 1 criteria' and Parnetti 2012 did not specify the reference standard at follow‐up.

Study sizes ranged from 15 (Koivunen 2008) to 231 participants (Vos 2013). Nine papers included participants with a mean age of 70 years or under. The mean (range) age of the youngest sample was 64 years (45 to 76) (Amlien 2013) and the mean (SD) age of the oldest sample was 73.4 (6.6) years (Monge‐Argiles 2011). Sampling procedure and APOE ɛ4 carrier status were poorly reported. Participants were mainly recruited from university memory clinics (n = 8), while one study did not report sources of recruitment (Koivunen 2008).

Different CSF biomarker level values were used as a threshold in the included studies (Additional tables: Table 2). The threshold was prespecified in only five studies (Amlien 2013; Herukka 2007; Kester 2011; Koivunen 2008; Vos 2013). The percentage of converters to Alzheimer’s disease dementia ranged from 22% (Visser 2009) to 56% (Hampel 2004). CSF index test positivity ranged from 23% (Amlien 2013) to 69% (Vos 2013). Duration of follow‐up was reported as mean and standard deviation (SD), or median, or range. Most studies had follow‐up between 12 and 36 months. Some participants were followed up for less than one year in three of the included studies (Fellgiebel 2007; Hampel 2004; Monge‐Argiles 2011), and for more than four years in five of the included studies (Buchhave 2012; Hansson 2006; Herukka 2007; Palmqvist 2012; Parnetti 2012). Participants in the remaining seven studies (Amlien 2013; Eckerstrom 2010; Galluzzi 2010; Kester 2011; Koivunen 2008; Visser 2009; Vos 2013) were followed up from one to three years.

Table 2. Conversion from MCI to Alzheimer's disease dementia

Included studies, index test and test accuracy at study level for conversion from MCI to Alzheimer’s disease dementia

| Study | Participants n/N (included in analysis) | Index test | Positive tests, n/N (%) | Threshold for abnormal test (prespecified Yes/No) | Number of converters (%) | FP and FN | Sensitivity (study level) | Specificity (study level) | Duration of follow‐up |
|---|---|---|---|---|---|---|---|---|---|
| Amlien 2013 | 49/39 | CSF t‐tau | 9/39 (23%) | ≥ 300 ng/L for age younger than 50 years; ≥ 450 ng/L for age 50 to 69 years; ≥ 500 ng/L for age older than 70 years (Sjogren 2001) (Yes) | 9/39 (23%) | FP = 4; FN = 4 | 56% | 87% | mean 2.6 ± 0.5 years (range 1.6 to 4 years) |
| Buchhave 2012* | 137/134 | CSF p‐tau/ABeta ratio | 69/134 (51%) | < 6.2 ng/L (No) | 72/134 (54%) | FP = 6; FN = 9 | 88% | 90% | median 9.2 years (range 4 to 12 years) |
| Fellgiebel 2007 | 16/16 | CSF p‐tau | 12/16 (75%) | ≥ 50 pg/mL (No) | 4/16 (25%) | FP = 8; FN = 0 | 100% | 33% | mean 19.6 ± 9.0 months |
| Hampel 2004 | 52/52 | CSF t‐tau | 38/52 (73%) | ≥ 479 ng/L (No) | 29/52 (56%) | FP = 12; FN = 3 | 90% | 48% | mean 8.4 ± 5.1 months (range 2 to 24 months) |
| Hansson 2006* | 137/134 | CSF t‐tau | 38/134 (28%) | > 350 ng/L (No) | 57/134 (42%) | FP = 9; FN = 28 | 51% | 88% | Total sample: median 5.2 years (range 4.0 to 6.8 years); MCI‐AD: median 4.3 years (range 1.1 to 6.7 years); MCI‐other dementias: median 4.2 years (range 1.5 to 3 years) |
| | | CSF p‐tau | 50/134 (37%) | ≥ 60 ng/L (No) | 57/134 (42%) | FP = 11; FN = 18 | 68% | 86% | |
| | | CSF p‐tau/ABeta ratio | 74/134 (55%) | < 6.5 ng/L (No) | 57/134 (42%) | FP = 19; FN = 2 | 96% | 75% | |
| Kester 2011 | 153/100 | CSF t‐tau | 64/100 (64%) | > 356 pg/mL (Yes) | 42/100 (42%) | FP = 29; FN = 7 | 83% | 50% | median 18 months (IQR 13 to 24) |
| Koivunen 2008 | 15/14 | CSF p‐tau | 9/14 (64%) | ≥ 70 pg/mL (Yes) | 5/14 (36%) | FP = 7; FN = 3 | 40% | 22% | 2 years |
| | | CSF p‐tau/ABeta ratio | 9/14 (64%) | < 6.5 pg/mL (Yes) | 5/14 (36%) | FP = 6; FN = 1 | 80% | 33% | |
| Monge‐Argiles 2011 | 37/37 | CSF t‐tau | 16/37 (43%) | ≥ 77.5 pg/mL (No) | 11/37 (28%) | FP = 8; FN = 3 | 73% | 69% | 6 months |
| | | CSF p‐tau | 20/37 (54%) | ≥ 54.5 pg/mL (No) | 11/37 (28%) | FP = 11; FN = 2 | 82% | 58% | |
| | | CSF p‐tau/ABeta ratio | 18/37 (49%) | 0.17 (No) | 11/37 (28%) | FP = 9; FN = 2 | 82% | 66% | |
| | | CSF t‐tau/ABeta ratio | 23/37 (62%) | 0.18 (No) | 11/37 (28%) | FP = 13; FN = 1 | 91% | 50% | |
| Palmqvist 2013 | 133/133 | CSF t‐tau | 65/133 (49%) | > 87 pg/mL (No) | 52/133 (39%) | FP = 23; FN = 10 | 81% | 72% | mean 5.9 years (range 3.2 to 8.8 years) |
| | | CSF p‐tau | 46/133 (34%) | > 39 pg/mL (No) | 52/133 (39%) | FP = 11; FN = 17 | 67% | 86% | |
| Parnetti 2012 | 90/90 | CSF p‐tau/ABeta ratio | 29/90 (32%) | 1074.0 (No) | 32/90 (35%) | FP = 3; FN = 6 | 81% | 95% | maximum: 4 years; mean 3.40 ± 1.01 years |
| Visser 2009 | 168/158 | CSF p‐tau | 108/158 (68%) | ≥ 51 pg/mL (used in clinical practice) (No) | 35/158 (22%) | FP = 77; FN = 4 | 88% | 37% | range 1 to 3 for MCI |
| | | CSF p‐tau | 45/158 (28%) | ≥ 85 pg/mL (> 90th percentile of controls after correction for age) (No) | 35/158 (22%) | FP = 25; FN = 15 | 57% | 80% | |
| | | CSF p‐tau/ABeta ratio | 77/158 (49%) | < 9.92 (< 10th percentile of reference group after correction for age) (No) | 35/158 (22%) | FP = 49; FN = 7 | 80% | 60% | |
| Vos 2013 | 231/214 | CSF t‐tau | 93/214 (43%) | > 450 pg/mL for age less than 70 years; > 500 pg/mL for age older than 70 years (Yes) | 91/214 (42%) | FP = 28; FN = 26 | 71% | 77% | mean 2.5 ± 1.0 years |
| | | CSF t‐tau/ABeta ratio | 147/214 (69%) | ABeta1–42/(240 + [1.18 × t‐tau]) < 1.0 (Yes) | 91/214 (42%) | FP = 60; FN = 4 | 96% | 51% | |

AD: Alzheimer's disease; FN: false negative; FP: false positive; IQR: interquartile range; MCI: mild cognitive impairment

*Studies involved the same participants. Only Hansson 2006 is included in the meta‐analysis
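
The columns of Table 2 are linked arithmetically, which allows a simple consistency check on any row. The sketch below is illustrative only (the function name `accuracy_from_counts` is hypothetical) and uses the Koivunen 2008 CSF p‐tau row: 14 participants analysed, 5 converters, FP = 7 and FN = 3.

```python
def accuracy_from_counts(n_analysed, n_converters, fp, fn):
    """Derive the remaining 2 x 2 cells and accuracy measures for one row."""
    n_stable = n_analysed - n_converters   # participants who did not convert
    tp = n_converters - fn                 # converters with a positive test
    tn = n_stable - fp                     # non-converters with a negative test
    return {
        "TP": tp,
        "TN": tn,
        "test_positive": tp + fp,
        "sensitivity": tp / n_converters,
        "specificity": tn / n_stable,
    }


row = accuracy_from_counts(n_analysed=14, n_converters=5, fp=7, fn=3)
print(row["test_positive"],
      round(100 * row["sensitivity"]),
      round(100 * row["specificity"]))  # 9 40 22
```

The output agrees with the 9/14 positive tests, 40% sensitivity and 22% specificity shown for that row.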

Excluded studies

Twenty‐nine studies, nine of which were ADNI studies, were excluded because they did not meet the inclusion criteria for participants, index test or target condition, or because they did not report diagnostic accuracy data (Characteristics of excluded studies). We contacted the authors of two of the ADNI studies (Landau 2010; Westman 2012) in order to obtain additional data for creating two by two tables. Further information was not available for the Landau 2010 study at the time this review was prepared. The author of the Westman 2012 study informed us that the accuracy of combined, not individual, CSF biomarkers was assessed in their study.

Studies awaiting classification

The Characteristics of studies awaiting classification table lists the characteristics of four studies which might be considered for inclusion in an updated review. The authors of all those studies need to be contacted in order to obtain missing data/relevant information. Regarding the target condition ‘Conversion from MCI to Alzheimer’s disease’, provisional data from two studies (Ewers 2012; Leuzy 2015) might be used for the analysis of CSF t‐tau; data for the analysis of the CSF p‐tau and ABeta42/p‐tau ratio index tests might be available only from Ewers 2012 and Balasa 2014, respectively.

Additional Tables: Table 1 shows that the percentage of converters to Alzheimer’s disease dementia ranged from 36% to 47%. Duration of follow‐up was between 24 and 41 months. Leuzy 2015 did not report duration of follow‐up and Ewers 2012 did not report a threshold value. The heterogeneity of results in these four studies was consistent with that observed in the fully incorporated studies.

Methodological quality of included studies

Methodological quality was assessed using the QUADAS‐2 tool (Whiting 2011).

Review authors’ judgements about each methodological quality item for each included study are presented in the Characteristics of included studies table and Figure 2. The overall methodological quality of included study cohorts is summarised in Figure 3.


Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study



Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies


In the participant selection domain, we considered five studies (Eckerstrom 2010; Hampel 2004; Herukka 2007; Kester 2011; Koivunen 2008) to be at high risk of bias because the participants were not consecutively or randomly enrolled or both the sampling procedure and exclusion criteria were not described. We stated that all included studies avoided a case‐control design because we only considered data on performance of the index test to discriminate between people with MCI who converted to dementia and those who remained stable. We considered four studies (Amlien 2013; Buchhave 2012; Galluzzi 2010; Hansson 2006) to be at low risk of bias. We considered the remaining six studies to be at unclear risk of bias, due to poor reporting on sampling procedure or exclusion criteria.

In the index test domain, we considered eight studies (Buchhave 2012; Eckerstrom 2010; Fellgiebel 2007; Hampel 2004; Hansson 2006; Monge‐Argiles 2011; Palmqvist 2012; Parnetti 2012) to be at high risk of bias because the threshold used was not prespecified and the optimal cutoff level was determined from ROC analyses; therefore, the accuracy of the CSF biomarkers reported in these studies appeared to be overestimated. We considered two studies (Amlien 2013; Galluzzi 2010) to be at unclear risk of bias, due to poor reporting. We considered the remaining five studies to be at low risk of bias.

In the reference standard domain, we considered nine studies (Amlien 2013; Eckerstrom 2010; Fellgiebel 2007; Galluzzi 2010; Hampel 2004; Kester 2011; Koivunen 2008; Monge‐Argiles 2011; Parnetti 2012) to be at unclear risk of bias, mainly because it was not reported whether clinicians conducting follow‐up were aware of initial CSF biomarker analysis results. Three of those nine studies did not clearly report the reference standards used for diagnosing Alzheimer’s disease dementia. We were not able to obtain the information about how the reference standard was obtained and by whom, due to poor reporting. We considered the remaining six studies to be at low risk of bias.

In the flow and timing domain, we judged nine studies (Amlien 2013; Eckerstrom 2010; Fellgiebel 2007; Galluzzi 2010; Koivunen 2008; Monge‐Argiles 2011; Parnetti 2012; Visser 2009; Vos 2013) to be at unclear risk of bias because not all participants were included in the analysis and/or the follow‐up period was shorter than one year and/or reporting was poor. We judged three studies (Galluzzi 2010; Hansson 2006; Kester 2011) to be at high risk of bias because a large number of participants with non‐Alzheimer's disease dementia were excluded from the analysis. We considered the remaining three studies to be at low risk of bias.

For assessment of applicability concerns, for the majority of the studies there was no concern that the included participants and setting, the conduct and interpretation of the index test, and the target condition (as defined by the reference standard) in each of the included studies did not match the review question. We judged two studies (Eckerstrom 2010; Koivunen 2008) to be of unclear applicability because of concerns regarding the participant characteristics or setting. We also judged four studies (Amlien 2013; Eckerstrom 2010; Fellgiebel 2007; Parnetti 2012) to be of unclear applicability because of concerns with respect to the reference standard.

It should be noted that the lack of concern about applicability of the three domains mentioned above was based on the inclusion criteria set in the review, and therefore the judgment about applicability may be overstated.

Findings

The key characteristics of each study are summarised in Additional Tables: Table 2 and Table 3. Included studies used a range of different thresholds. The number of positive CSF index tests at baseline varied across studies. The summary of main results for the fifteen included studies is presented in summary of findings Table.

Table 3. Conversion from MCI to All dementia

Included studies, index test and test accuracy at study level for conversion from MCI to All dementias

| Study | Participants n/N (included in analysis) | Index test | Number and % of positive tests | Threshold (test abnormal) (prespecified Yes/No) | Number of converters (%) | FP and FN | Sensitivity | Specificity | Duration of follow‐up |
|---|---|---|---|---|---|---|---|---|---|
| Eckerstrom 2010 | 42/42 | CSF t‐tau | 15/42 (36%) | ≥ 500 ng/L (No) | 21/42 (50%) | FP = 1; FN = 7 | 67% | 95% | Total sample: 19.6 ± 9.0 months; MCI‐MCI: 19.5 ± 9.3 months; MCI‐progressive: 17.6 ± 8.8 months (4/8 MCI‐AD: 23.7 ± 2.0 months) |
| Galluzzi 2010 | 90/64 | CSF t‐tau | 24/64 (37.5%) | > 450 pg/mL for subjects with an age range between 51 and 70 determined; > 500 pg/mL for subjects with an age range between 71 and 93 (Yes) | 34/64 (53%) | FP = 5; FN = 15 | 56% | 83% | Total sample: 8.4 ± 5.1 months (range 2 to 24 months); follow‐up interval for converters was 9.6 ± 5.4, and for non‐converters 7.0 ± 4.3 months |
| Hansson 2006 | 137/134 | CSF t‐tau | 38/134 (28%) | > 350 pg/mL (No) | 78/134 (58%) | FP = 5; FN = 45 | 42% | 91% | Total sample: median 5.2 years (range 4.0 to 6.8); MCI‐AD: median: 4.3 years (range 1.1 to 6.7); MCI‐other dementias: median 4.2 (1.5 to 6.3) |
| Herukka 2007 | 79/79 | CSF t‐tau | 43/79 (54%) | > 400 pg/mL (Yes) | 33/79 (42%) | FP = 17; FN = 7 | 79% | 63% | Mean 3.52 ± 1.95 years in MCI converters; mean 4.56 ± 3.09 years in MCI‐stable |

AD: Alzheimer's disease; FN: false negative; FP: false positive; MCI: mild cognitive impairment
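The study-level sensitivity and specificity values in Table 3 can be recovered from the reported counts. The sketch below is illustrative only: the function and variable names are ours, and the number of non-converters is an assumption derived as participants included in the analysis minus converters.

```python
# Counts taken from Table 3: (study, converters, non-converters, FP, FN).
# Non-converters are derived here as analysed participants minus converters.
STUDIES = [
    ("Eckerstrom 2010", 21, 42 - 21, 1, 7),
    ("Galluzzi 2010", 34, 64 - 34, 5, 15),
    ("Hansson 2006", 78, 134 - 78, 5, 45),
    ("Herukka 2007", 33, 79 - 33, 17, 7),
]


def accuracy(converters, non_converters, fp, fn):
    """Return (sensitivity, specificity) from two by two table counts."""
    tp = converters - fn       # converters who tested positive
    tn = non_converters - fp   # non-converters who tested negative
    return tp / converters, tn / non_converters


RESULTS = {name: accuracy(c, nc, fp, fn) for name, c, nc, fp, fn in STUDIES}

for name, (sens, spec) in RESULTS.items():
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Rounded to whole percentages, these values agree with the sensitivity and specificity columns of Table 3.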

CSF t‐tau for Alzheimer's disease dementia

Individual study estimates of sensitivity and specificity are shown in Figure 4 for each of the seven studies (291 cases and 418 non‐cases) that evaluated Alzheimer’s disease dementia. The sensitivity values ranged from 51% to 90% while the specificity values ranged from 48% to 88%. The thresholds used ranged from ≥ 77 to ≥ 500 pg/mL (ng/L).


Forest plot of 1 CSF t‐tau conversion to AD dementia.


The summary ROC curve summarising the accuracy of CSF t‐tau across the seven studies is shown in Figure 5. Because of the variation in thresholds, we did not estimate a summary sensitivity and specificity. However, we derived estimates of sensitivity and likelihood ratios at fixed values of specificity from the HSROC model we fitted to produce the summary ROC curve. At the median specificity of 72%, the estimated sensitivity was 77% (95% CI 67 to 85), the positive likelihood ratio was 2.72 (95% CI 2.43 to 3.04), and the negative likelihood ratio was 0.32 (95% CI 0.22 to 0.47).
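As a rough check on the likelihood ratios quoted above, they can be approximated from the rounded sensitivity and specificity. This is a minimal sketch with our own function name, not the HSROC model used in the review. With rounded inputs the positive likelihood ratio comes out at about 2.75 rather than the reported 2.72, presumably because the review worked from unrounded model estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded CSF t-tau values: sensitivity 77% at the median specificity of 72%.
LR_POS, LR_NEG = likelihood_ratios(0.77, 0.72)
print(f"LR+ = {LR_POS:.2f}, LR- = {LR_NEG:.2f}")
```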


Summary ROC Plot of 1 CSF t‐tau conversion to AD dementia.


At the median specificity (72%) and the median prevalence of Alzheimer's disease dementia (37%) (pretest probability, Figure 6), the positive predictive value was 62%, which means on average 62 out of 100 people with MCI and a positive index test result would convert to Alzheimer's disease dementia, but 38 would not. The negative predictive value of 84% means that on average 84 out of 100 people with MCI and with a negative index test result would not convert to Alzheimer's disease dementia, but 16 would.
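The predictive values above follow from Bayes' theorem applied to the sensitivity, specificity and pretest probability. The sketch below uses the rounded figures from the text; the function name is ours and is purely illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given pretest probability (prevalence)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# CSF t-tau: sensitivity 77%, specificity 72%, median prevalence 37%.
PPV, NPV = predictive_values(0.77, 0.72, 0.37)
print(f"PPV = {PPV:.0%}, NPV = {NPV:.0%}")
```

Rounded to whole percentages, this reproduces the positive predictive value of 62% and the negative predictive value of 84%.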


Post‐test probability plots (Analysis 1): Conversion from MCI to Alzheimer’s disease for CSF t‐tau as a diagnostic test


In a hypothetical cohort of 100 people with MCI taking the CSF t‐tau test, there would be on average nine false negatives (participants who convert but incorrectly tested negative) and 18 false positives (participants who did not convert but incorrectly tested positive) (summary of findings Table).
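The hypothetical cohort figures can be sketched in the same way, assuming the 37% median prevalence and the rounded sensitivity and specificity for CSF t‐tau. The function name and the cohort_size parameter are illustrative choices of ours.

```python
def cohort_errors(sensitivity, specificity, prevalence, cohort_size=100):
    """Return expected (false negatives, false positives) in a cohort."""
    converters = prevalence * cohort_size
    non_converters = cohort_size - converters
    false_negatives = converters * (1 - sensitivity)       # missed cases
    false_positives = non_converters * (1 - specificity)   # overdiagnosed
    return false_negatives, false_positives


# CSF t-tau: sensitivity 77%, specificity 72%, prevalence 37%.
MISSED, OVERDIAGNOSED = cohort_errors(0.77, 0.72, 0.37)
print(f"missed: {MISSED:.1f}, overdiagnosed: {OVERDIAGNOSED:.1f}")
```

The expected counts are about 8.5 and 17.6, which round to the nine false negatives and 18 false positives stated above.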

CSF p‐tau for Alzheimer's disease dementia

Six studies (164 cases and 328 non‐cases) evaluated the accuracy of CSF p‐tau for conversion to Alzheimer’s disease dementia (Figure 7). The sensitivities were between 40% and 100%, while the specificities were between 22% and 86%. The thresholds used ranged from ≥ 39 to ≥ 85 pg/mL (ng/L).


Forest plot of 2 CSF p‐tau conversion to AD dementia.


Figure 8 shows the summary ROC space. We derived the summary estimates at different points on the fitted HSROC curve. At the median specificity of 48%, the estimated sensitivity was 81% (95% CI 64 to 91), the positive likelihood ratio was 1.55 (95% CI 1.31 to 1.84), and the negative likelihood ratio was 0.39 (95% CI 0.19 to 0.82).


Summary ROC Plot of 2 CSF p‐tau conversion to AD dementia.


At the median specificity (48%) and the median prevalence of Alzheimer's disease dementia (37%) (pretest probability, Figure 9), the positive predictive value was 48%, which means on average 48 out of 100 people with MCI and with a positive index test result would convert to Alzheimer's disease dementia but 52 would not. The negative predictive value of 81% means that on average 81 out of 100 people with MCI with a negative index test result would not convert to Alzheimer's disease dementia, but 19 would.
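An equivalent route to these post‐test probabilities is the odds form of Bayes' theorem, which multiplies the pretest odds by the likelihood ratio. The sketch below assumes the rounded CSF p‐tau likelihood ratios reported in this section (1.55 and 0.39) and the 37% pretest probability; the function name is ours.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# CSF p-tau at a pretest probability of 37%.
P_AFTER_POSITIVE = post_test_probability(0.37, 1.55)  # positive test result
P_AFTER_NEGATIVE = post_test_probability(0.37, 0.39)  # negative test result
print(f"after positive: {P_AFTER_POSITIVE:.0%}, after negative: {P_AFTER_NEGATIVE:.0%}")
```

A positive result moves the probability of conversion from 37% to about 48% (the positive predictive value), and a negative result moves it to about 19% (100% minus the negative predictive value of 81%).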


Post‐test probability plots (Analysis 2): Conversion from MCI to Alzheimer’s disease for CSF p‐tau as a diagnostic test


In a hypothetical cohort of 100 people with MCI taking the CSF p‐tau test, there would be on average seven false negatives (participants who convert but incorrectly tested negative) and 33 false positives (participants who did not convert but incorrectly tested positive) (summary of findings Table).

CSF p‐tau/ABeta ratio for Alzheimer's disease dementia

Five studies (140 cases and 293 non‐cases) evaluated the accuracy of the CSF p‐tau/ABeta ratio for conversion to Alzheimer’s disease dementia (Figure 10). The sensitivities were between 80% and 96%, while the specificities were between 33% and 95%. We were not able to report the range of thresholds due to different measurements: < 6.6 pg/mL (ng/L); 0.18; 1074.0; < 9.92. Figure 11 shows the summary ROC space. We did not conduct a meta‐analysis because the studies were few and small.


Forest plot of 3 CSF p‐tau/ABeta ratio to AD dementia.



Summary ROC Plot of 3 CSF p‐tau/ABeta ratio to AD dementia.


CSF t‐tau/ABeta ratio for Alzheimer's disease dementia

Only two studies (Monge‐Argiles 2011; Vos 2013) evaluated the accuracy of the CSF t‐tau/ABeta ratio for conversion to Alzheimer’s disease dementia. The sensitivities were 50% and 51%, and specificities were 91% and 96%, respectively. We were not able to conduct the meta‐analysis.

CSF t‐tau for all forms of dementia (combined Alzheimer's disease dementia and non‐Alzheimer's disease dementia)

Only four studies (166 cases and 153 non‐cases) evaluated the accuracy of CSF t‐tau for conversion to all forms of dementia (Figure 12 and Figure 13). The sensitivity values ranged from 42% to 79%, while the specificity values ranged from 63% to 95%. The thresholds used ranged from > 350 to ≥ 500 pg/mL (ng/L). As above, we did not conduct a meta‐analysis because the studies were few and small.


Forest plot of 4 CSF t‐tau conversion to all forms of dementia.



Summary ROC Plot of 4 CSF t‐tau conversion to All dementias.


Investigation of heterogeneity

We were not able to formally assess the effects of each potential source of heterogeneity as planned, due to the small number of studies available to be included.

Sensitivity analyses

Due to the limited number of studies evaluating each of the four CSF biomarkers for Alzheimer’s disease dementia or other types of dementia, we did not perform the planned sensitivity analyses.

Discussion

We performed a review of the available evidence on the diagnostic accuracy of CSF biomarker levels for detecting Alzheimer's disease pathology in people with MCI, and identifying those MCI participants who would convert to Alzheimer’s disease dementia or other forms of dementia over time. In the absence of a contemporaneous reference standard for Alzheimer's disease diagnosis relative to the application of the index test, the decision to use a delayed verification design was taken for all DTA reviews by our group. This, however, creates problems when the length of follow‐up varies between studies: in a chronic disorder where age is the principal risk factor, a longer study could create false positive findings. To address this, length of follow‐up was collected to help interpret between‐study variations in accuracy.

There is, however, a paucity of evidence in relation to the accuracy of CSF biomarkers. Where data were available for conversion to Alzheimer’s disease dementia, there was a wide range of sensitivity (51% to 90%; 40% to 100%; 80% to 96%) and specificity (48% to 88%; 22% to 86%; 33% to 95%) values for the CSF t‐tau, CSF p‐tau and CSF p‐tau/ABeta ratio index tests, respectively.

Due to the wide variations in thresholds, we did not estimate a summary sensitivity and specificity. To illustrate the potential strengths and weaknesses of CSF biomarker levels, and subject to the considerable uncertainty of this statistical approach, we estimated from the fitted summary ROC curve that the sensitivity was 77% (95% CI 67 to 85) for CSF t‐tau and 81% (95% CI 64 to 91) for CSF p‐tau at the included study median specificities of 72% and 48%, respectively. Assuming a conversion rate of MCI to Alzheimer’s dementia of 37%, for every 100 people tested with CSF t‐tau, nine individuals with a negative test would progress and 18 with a positive test would not progress to Alzheimer’s dementia; for every 100 people tested with CSF p‐tau, seven individuals with a negative test would progress and 33 with a positive test would not progress to Alzheimer’s dementia. The estimates of predictive values and consequences in a cohort of 100 (‘missed cases’ and ‘over‐diagnosed’) were based on hypothetical sensitivity and specificity values for which the threshold of the test is unknown; therefore, these findings should be interpreted with caution.

We were not able to evaluate the accuracy of CSF biomarkers for conversion from MCI to other forms of dementia (non‐Alzheimer’s disease dementia). As a result of the information available from four studies (Eckerstrom 2010; Galluzzi 2010; Hansson 2006; Herukka 2007), we evaluated the accuracy of CSF t‐tau for conversion to all types of dementia (combined Alzheimer's disease dementia and non‐Alzheimer’s disease dementia). The sensitivity values ranged from 42% to 79% while the specificity values ranged from 63% to 95%. We did not conduct a meta‐analysis because the studies were few and small.

Previous reviews of tests of amyloid in CSF and plasma (Ritchie 2014) and evidenced through PET imaging (Zhang 2014) have been published. They highlighted that as a test, there was consistently better sensitivity than specificity whereby the absence of evidence of amyloid pathology (low levels in CSF and high levels in the cortices) was likely to exclude a later diagnosis of Alzheimer's disease dementia, whereas the presence of amyloid pathology did not add much incremental benefit to diagnostic accuracy. Considering the findings of this systematic review, we have demonstrated again that the NPV is greater than the PPV which is a reflection of the higher sensitivity of these tests compared to their specificity. That is, a test indicating absence of biomarker abnormality and hence suggesting absence of disease is of more value than a positive biomarker indicating disease. CSF biomarkers are better at ruling out Alzheimer’s disease than ruling it in as a cause of the clinical symptoms, and therein progression to Alzheimer’s dementia in people described as having MCI. However, the reported optimal thresholds in individual papers tended to yield better sensitivities than specificities and this was reflected in our sROC analysis; therefore, those results should be interpreted with caution.

Given the insufficient evidence to evaluate the diagnostic value in MCI of CSF t‐tau, CSF p‐tau and the CSF p‐tau/ABeta ratio for Alzheimer's disease dementia and other dementias examined in this review, particular attention should be paid to the risk of misdiagnosis and overdiagnosis of dementia (and therefore overtreatment) in clinical practice. Our findings are consistent with the expert opinion conveyed by Molinuevo et al (Molinuevo 2014) where it was recognised that negative test results were more clinically useful than positive ones. They still saw a routine use for these tests in clinical practice, and our review will help describe the degree of accuracy to help inform clinicians using this test in their current practice. As sensitivity of this test was better than specificity, the risk of a missed diagnosis, or a false‐negative test, was lower. False reassurance given to a patient that they do not have or will not get Alzheimer's dementia would also have serious clinical consequences; however, appropriate pretest counselling for what can and cannot be revealed through CSF testing would mitigate the risk of an inappropriate level of salience being afforded to this particular test.

Summary of main results

In total, 1282 participants with MCI at baseline were identified in the fifteen included studies, of which 1172 had analysable data; 430 participants converted to Alzheimer’s disease dementia and 130 participants to other forms of dementia at follow‐up. It was possible to undertake a summary analysis of the CSF t‐tau and p‐tau markers but not the ratio, as too few studies presented results for the ratio. Consistent with the findings from the amyloid reviews, CSF t‐tau and p‐tau were reasonably sensitive tests for later diagnosis of Alzheimer's disease dementia, but had poor specificity. This is illustrated in Figure 6 and Figure 9 where the small positive likelihood ratio for both CSF t‐tau and p‐tau has very little impact on the change from pretest probability to post‐test probability. With respect to the CSF t‐tau/ABeta ratio, it was not possible to generate likelihood ratios, due to only one study (Monge‐Argiles 2011) reporting data. However, from Figure 10, it can be seen that for all but one study (Parnetti 2012), the sensitivity exceeded the specificity for the p‐tau/ABeta ratio. Figure 12, though, demonstrates across four studies that the specificity of CSF t‐tau is improved when the outcome is 'all forms of dementia', suggesting that the elevation of tau is a nonspecific marker of neurodegeneration and not tightly tethered to Alzheimer's disease pathology.

Our findings were based on studies with poor reporting and most included studies had an unclear risk of bias, mainly for reference standard and participant selection domains. Nine studies (56%) had unclear risk of bias for the flow and timing domain, mainly due to not including all participants in the analysis or inappropriate duration of the follow‐up period. According to the assessment of the index test domain, 50% of studies were of poor methodological quality.

The main sources of heterogeneity were thought likely to be index test thresholds, reference standards used for the target disorders, sources of recruitment, participant sampling and aspects of study quality (particularly inadequate blinding). We were not able to formally assess the effects of each potential source of heterogeneity, as planned, due to the small number of studies available to be included.

Strengths and weaknesses of the review

There were a number of strengths to this review. This review was conducted in adherence to the inclusion criteria and methods described in a published protocol (Ritchie 2011). We searched a number of electronic databases, using an extensive range of appropriate database indexing terms and equivalent text words covering the index test, how it was measured, and the target condition. The multi‐stranded search approach that we adopted to combine different search concepts in searches run in parallel, some including a more specific diagnostic component, has successfully increased the overall sensitivity of the search and is a strength of this review. Our searches were not limited by language. We contacted 12 study authors and usable data were obtained for five studies (Amlien 2013; Galluzzi 2010; Hansson 2006; Visser 2009; Vos 2013).

There were, however, also a number of limitations to this review. There was limited published information and substantial variation in the quality of the papers and caution is needed when interpreting these findings. Most included studies provided little data on participants at baseline. Several studies reported high or unclear dropout and withdrawal rates. Studies also contained wide variations in thresholds. It is also a weakness of the review that variability in length of follow‐up in the various cohorts was so great. It would stand to reason that a longer follow‐up period would more likely yield more cases of dementia, given that age is the principal risk factor for dementia. On the other hand, short follow‐up periods might increase false negative results. This topic is of great interest to the field where determination of proximal and distal biomarkers is being considered. In an MCI population presenting to a clinician, it is the question of proximity to a decline to dementia which is the most relevant; in this regard, follow‐up periods of over five years lose clinical meaningfulness. Standardisation of the follow‐up period would help reviews like this; this has been suggested in our group's recent STARDdem proposals (Noel‐Storr 2014). In our review, we were unable to formally test what effect length of follow‐up had on the accuracy of the test. The various contributors to the heterogeneity across the studies may affect the study results. Given the poor reporting within the included studies, it is difficult to determine the underlying difference or differences among the included studies. This highlights a shortfall of large‐scale, high‐quality empirical research conducted in this area. Future studies should provide clearer reporting of the participants, equipment, usage and the implications of implementing the tests.
As the current research area is rapidly changing, further research exploring the impact of the CSF p‐tau/ABeta ratio on clinical outcomes is needed. To this end, we conducted a very recent literature review which revealed four new studies that will be fully incorporated in our next planned update. These four studies demonstrated the same between‐study heterogeneity in results and methodology that we had observed in the included studies, with the implication that there will not be an impact of incorporation on our existing conclusions.

Applicability of findings to the review question

These findings can be considered a reasonable answer to the question being set in this review. Caution, though, should still apply because of the quality and reporting issues highlighted from the included papers and the small data set. This is especially true when drawing conclusions from the analysis of the p‐tau/ABeta ratio. This is particularly important as it is this ratio that is often favoured in clinical practice as being most accurate. However, this review and the previous published reviews of amyloid tests and Alzheimer's disease pathology consistently demonstrate reasonable sensitivity and poor specificity; accordingly, it is likely that the ratio of two sensitive tests will generate greater sensitivity than specificity.

Figure 1. Study flow diagram

Note: a top‐up search performed in December 2015 revealed 6134 records

85 records retained after de‐duplication and assessment by one experienced reviewer

81 records excluded after further assessment performed by two review authors

4 studies identified for possible inclusion (Characteristics of studies awaiting classification)

Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figure 4. Forest plot of 1 CSF t‐tau conversion to AD dementia.

Figure 5. Summary ROC Plot of 1 CSF t‐tau conversion to AD dementia.

Figure 6. Post‐test probability plots (Analysis 1): Conversion from MCI to Alzheimer’s disease for CSF t‐tau as a diagnostic test

Figure 7. Forest plot of 2 CSF p‐tau conversion to AD dementia.

Figure 8. Summary ROC Plot of 2 CSF p‐tau conversion to AD dementia.

Figure 9. Post‐test probability plots (Analysis 2): Conversion from MCI to Alzheimer’s disease for CSF p‐tau as a diagnostic test

Figure 10. Forest plot of 3 CSF p‐tau/ABeta ratio to AD dementia.

Figure 11. Summary ROC Plot of 3 CSF p‐tau/ABeta ratio to AD dementia.

Figure 12. Forest plot of 4 CSF t‐tau conversion to all forms of dementia.

Figure 13. Summary ROC Plot of 4 CSF t‐tau conversion to All dementias.

Test 1. CSF t‐tau conversion to AD dementia.

Test 2. CSF p‐tau conversion to AD dementia.

Test 3. CSF p‐tau/ABeta ratio to AD dementia.

Test 4. CSF t‐tau conversion to All dementias.

Summary of findings Performance of CSF biomarkers in early diagnosis of dementia

What is the diagnostic accuracy of CSF biomarker levels for detecting Alzheimer's disease pathology in people with mild cognitive impairment (MCI), and identifying those MCI participants who would convert to Alzheimer’s disease dementia or other forms of dementia over time

Descriptive

Patient population

Participants diagnosed with MCI at baseline using any of the Petersen criteria or CDR = 0.5 or any of the 16 definitions included by Matthews (Matthews 2008)

Sampling procedure

Consecutive or random (n = 5)

Not consecutive or random (n = 3)

Unclear (n = 7)

Sources of recruitment

University memory clinic (n = 8); European multicentre memory clinics (n = 2); inpatients (n = 2); General Hospital memory clinic (n = 1); Research centre outpatient memory clinic (n = 1); not reported (n = 1)

Prior testing

The only testing prior to performing the plasma and CSF biomarkers was the application of diagnostic criteria for identifying participants with MCI.

MCI criteria

Petersen criteria (n = 14)

Global Deterioration Scale (GDS) (n = 1)

Index tests

CSF t‐tau or CSF p‐tau or CSF p‐tau/ABeta ratio or CSF t‐tau/ABeta ratio

Reference standard

NINCDS‐ADRDA and/or DSM and/or ICD criteria for Alzheimer's disease dementia (n = 12); Global Deterioration Scale (GDS) & Research criteria (n = 1); CDR = 1 criteria (n = 1); not specified (n = 1)

McKeith criteria for Lewy body dementia; Lund criteria for frontotemporal dementia; and NINDS AIREN criteria for vascular dementia

Target condition

Alzheimer’s disease dementia or any other types of dementia

Included studies

Prospectively well‐defined cohorts of MCI participants (n = 7), nested case‐control studies with a prospectively defined MCI group (n = 6) and studies with a retrospectively defined MCI group with longitudinal data (n = 2).

Fifteen studies (N = 1282 participants) were included. Number included in analysis: 1172

Quality concerns

Patient selection and conduct of the reference standard were poorly reported. Applicability concerns were generally low. Regarding the inclusion criteria set in the review, the majority of included studies did match the review question: 'Could CSF t‐tau and CSF t‐tau/ABeta ratio biomarkers identify those MCI participants with Alzheimer’s disease pathology at baseline who would convert clinically to dementia at follow‐up?' However, due to a limited number of included studies and levels of heterogeneity, it is difficult to determine to what extent the findings from a meta‐analysis can be applied to clinical practice.

Limitations

Limited investigation of heterogeneity due to insufficient number of studies. There was a lack of common thresholds.

Test

Median percentage converting (range) 2

Studies

Cases/participants

Median specificity from included studies

Sensitivity

(95% CI)1 at median specificity

Consequences in a cohort of 100

Median percentage converting2

Missed cases

Overdiagnosed

Alzheimer's disease dementia

CSF t‐tau

7

436/709

72

77 (67, 85)

37

9

18

Alzheimer's disease dementia

CSF p‐tau

6

164/492

47.5

81 (64, 91.5)

37

7

33

Alzheimer's disease dementia

CSF p‐tau/ ABeta ratio

5

140/433

No meta‐analysis

No meta‐analysis

All types of dementia

CSF t‐tau

4

166/319

No meta‐analysis

No meta‐analysis

Investigation of heterogeneity: the planned investigations were not possible due to the limited number of studies available for each analysis. We were unable to investigate the effect of duration of follow‐up due to substantial variation in length and reporting.

Conclusions: Given the insufficient evidence to evaluate the diagnostic value in MCI of CSF t‐tau, CSF p‐tau, CSF t‐tau/ABeta ratio and CSF p‐tau/ABeta ratio for Alzheimer's disease dementia and other forms of dementia examined in this review, particular attention should be paid to the risk of misdiagnosis and overdiagnosis of dementia (and therefore overtreatment) in clinical practice. Future studies with more uniform approaches to thresholds, analysis and study conduct may provide a more homogeneous estimate than the one that has been available from the included studies we have identified.

1Meta‐analytic estimate of sensitivity derived from the HSROC model at a fixed value of specificity. Summary estimates of sensitivity and specificity were not computed because the studies that contributed to the estimation of the summary ROC curve used different thresholds.

2The median percentage converting was calculated using all the studies that reported 'conversion from MCI to Alzheimer's disease dementia' (Table 2).
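The 'Consequences in a cohort of 100' columns follow arithmetically from the other columns of the summary of findings. The sketch below reproduces that arithmetic for the two rows with a meta‐analytic estimate. The function name `cohort_consequences` is hypothetical, and rounding to the nearest whole person is an assumption about how the published figures were obtained.

```python
def cohort_consequences(pct_converting, sensitivity_pct, specificity_pct, cohort=100):
    """Return (missed cases, overdiagnosed) for a hypothetical cohort.

    missed cases  = converters the test fails to flag (false negatives)
    overdiagnosed = non-converters the test flags (false positives)
    Rounding to whole people is an assumption, not stated in the review.
    """
    converters = cohort * pct_converting / 100
    non_converters = cohort - converters
    missed = converters * (1 - sensitivity_pct / 100)
    overdiagnosed = non_converters * (1 - specificity_pct / 100)
    return round(missed), round(overdiagnosed)


# Values taken from the summary of findings rows above.
t_tau = cohort_consequences(pct_converting=37, sensitivity_pct=77, specificity_pct=72)
p_tau = cohort_consequences(pct_converting=37, sensitivity_pct=81, specificity_pct=47.5)

print("CSF t-tau:", t_tau)  # (9, 18)
print("CSF p-tau:", p_tau)  # (7, 33)
```

With 37 of 100 people converting, a sensitivity of 77% leaves about 9 converters undetected, and a specificity of 72% flags about 18 of the 63 non‐converters, which agrees with the tabulated values.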

Summary of findings Performance of CSF biomarkers in early diagnosis of dementia
Table 1. Studies awaiting classification

Conversion from MCI to Alzheimer’s disease dementia

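| Study | Participants n/N (included in analysis) | Index test | Number and % of positive tests | Threshold (test abnormal) (prespecified Yes/No) | Number of converters (%) | FP and FN | Test accuracy at study level: sensitivity | Test accuracy at study level: specificity | Duration of follow‐up |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| *Balasa 2014 | 51/51 | CSF ABeta42/p‐tau ratio | 25/51 (49%) | < 6.43 (Yes) | 24/51 (47%) | FP = 1; FN = 0 | 100% | 96% | 41 months for MCI‐AD; 30 months for MCI‐MCI |
| *Ewers 2012 | 130/130 | CSF t‐tau | 65/130 (50%) | Not reported | 58/130 (45%) | FP = 30; FN = 23 | 60.7% | 58.9% | 24 months |
| | | CSF p‐tau | 67/130 (51.5%) | Not reported | 58/130 (45%) | FP = 30; FN = 21 | 63.9% | 58.9% | |
| *Leuzy 2015 | 33/33 | CSF t‐tau | 15/33 (45%) | > 400 pg/mL (Yes) | 12/33 (36%) | FP = 7; FN = 4 | 67% | 67% | Not reported |
| | | CSF t‐tau/ABeta ratio | 12/33 (36%) | < 1.14 (Yes) | 12/33 (36%) | FP = 6; FN = 6 | 50% | 71% | |

Conversion from MCI to all dementias

| Study | Participants n/N (included in analysis) | Index test | Number and % of positive tests | Threshold (test abnormal) (prespecified Yes/No) | Number of converters (%) | FP and FN | Test accuracy at study level: sensitivity | Test accuracy at study level: specificity | Duration of follow‐up |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| *Eckerstrom 2015 | 73/73 | CSF p‐tau | 15/73 (20.5%) | 73 pg/mL (No) | 27/73 (36.9%) | FP = 3; FN = 15 | 75% | 92% | 43.1 ± 23 months MCI‐stable; 33.7 ± 24 months MCI converters |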

Study awaiting translation

Urakami 2004

AD: Alzheimer's disease; FN: false negative; FP: false positive; MCI: mild cognitive impairment

*Authors need to be contacted in order to obtain missing data/relevant information. Data presented are provisional.

Table 2. Conversion from MCI to Alzheimer's disease dementia

Included studies, index test and test accuracy at study level for conversion from MCI to Alzheimer’s disease dementia

| Study | Participants n/N (included in analysis) | Index test | Number and % of positive tests | Threshold (test abnormal) (prespecified Yes/No) | Number of converters (%) | FP and FN | Test accuracy at study level: sensitivity | Test accuracy at study level: specificity | Duration of follow‐up |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Amlien 2013 | 49/39 | CSF t‐tau | 9/39 (23%) | ≥ 300 ng/L for age younger than 50 years; ≥ 450 ng/L for age 50 to 69 years; ≥ 500 ng/L for age older than 70 years (Sjogren 2001) (Yes) | 9/39 (23%) | FP = 4; FN = 4 | 56% | 87% | mean 2.6 ± 0.5 years (range 1.6 to 4 years) |
| Buchhave 2012* | 137/134 | CSF p‐tau/ABeta ratio | 69/134 (51%) | < 6.2 ng/L (No) | 72/134 (54%) | FP = 6; FN = 9 | 88% | 90% | median 9.2 years (range 4 to 12 years) |
| Fellgiebel 2007 | 16/16 | CSF p‐tau | 12/16 (75%) | ≥ 50 pg/mL (No) | 4/16 (25%) | FP = 8; FN = 0 | 100% | 33% | mean 19.6 ± 9.0 months |
| Hampel 2004 | 52/52 | CSF t‐tau | 38/52 (73%) | ≥ 479 ng/L (No) | 29/52 (56%) | FP = 12; FN = 3 | 90% | 48% | mean 8.4 ± 5.1 months (range 2 to 24 months) |
| Hansson 2006* | 137/134 | CSF t‐tau | 38/134 (28%) | > 350 ng/L (No) | 57/134 (42%) | FP = 9; FN = 28 | 51% | 88% | Total sample: median 5.2 years (range 4.0 to 6.8 years); MCI‐AD: median 4.3 years (range 1.1 to 6.7 years); MCI‐other dementias: median 4.2 years (range 1.5 to 6.3 years) |
| | | CSF p‐tau | 50/134 (37%) | ≥ 60 ng/L (No) | 57/134 (42%) | FP = 11; FN = 18 | 68% | 86% | |
| | | CSF p‐tau/ABeta ratio | 74/134 (55%) | < 6.5 ng/L (No) | 57/134 (42%) | FP = 19; FN = 2 | 96% | 75% | |
| Kester 2011 | 153/100 | CSF t‐tau | 64/100 (64%) | > 356 pg/mL (Yes) | 42/100 (42%) | FP = 29; FN = 7 | 83% | 50% | median 18 months (IQR 13 to 24) |
| Koivunen 2008 | 15/14 | CSF p‐tau | 9/14 (64%) | ≥ 70 pg/mL (Yes) | 5/14 (36%) | FP = 7; FN = 3 | 40% | 22% | 2 years |
| | | CSF p‐tau/ABeta ratio | 9/14 (64%) | < 6.5 pg/mL (Yes) | 5/14 (36%) | FP = 6; FN = 1 | 80% | 33% | |
| Monge‐Argiles 2011 | 37/37 | CSF t‐tau | 16/37 (43%) | ≥ 77.5 pg/mL (No) | 11/37 (28%) | FP = 8; FN = 3 | 73% | 69% | 6 months |
| | | CSF p‐tau | 20/37 (54%) | ≥ 54.5 pg/mL (No) | 11/37 (28%) | FP = 11; FN = 2 | 82% | 58% | |
| | | CSF p‐tau/ABeta ratio | 18/37 (49%) | 0.17 (No) | 11/37 (28%) | FP = 9; FN = 2 | 82% | 66% | |
| | | CSF t‐tau/ABeta ratio | 23/37 (62%) | 0.18 (No) | 11/37 (28%) | FP = 13; FN = 1 | 91% | 50% | |
| Palmqvist 2013 | 133/133 | CSF t‐tau | 65/133 (49%) | > 87 pg/mL (No) | 52/133 (39%) | FP = 23; FN = 10 | 81% | 72% | mean 5.9 years (range 3.2 to 8.8 years) |
| | | CSF p‐tau | 46/133 (34%) | > 39 pg/mL (No) | 52/133 (39%) | FP = 11; FN = 17 | 67% | 86% | |
| Parnetti 2012 | 90/90 | CSF p‐tau/ABeta ratio | 29/90 (32%) | 1074.0 (No) | 32/90 (35%) | FP = 3; FN = 6 | 81% | 95% | maximum: 4 years; mean 3.40 ± 1.01 years |
| Visser 2009 | 168/158 | CSF p‐tau | 108/158 (68%) | ≥ 51 pg/mL (used in clinical practice) (No) | 35/158 (22%) | FP = 77; FN = 4 | 88% | 37% | range 1 to 3 for MCI |
| | | CSF p‐tau | 45/158 (28%) | ≥ 85 pg/mL (> 90th percentile of controls after correction for age) (No) | 35/158 (22%) | FP = 25; FN = 15 | 57% | 80% | |
| | | CSF p‐tau/ABeta ratio | 77/158 (49%) | < 9.92 (< 10th percentile of reference group after correction for age) (No) | 35/158 (22%) | FP = 49; FN = 7 | 80% | 60% | |
| Vos 2013 | 231/214 | CSF t‐tau | 93/214 (43%) | > 450 pg/mL for age less than 70 years; > 500 pg/mL for age older than 70 years (Yes) | 91/214 (42%) | FP = 28; FN = 26 | 71% | 77% | mean 2.5 ± 1.0 years |
| | | CSF t‐tau/ABeta ratio | 147/214 (69%) | ABeta1–42/(240 + [1.18 × t‐tau]) < 1.0 (Yes) | 91/214 (42%) | FP = 60; FN = 4 | 96% | 51% | |

AD: Alzheimer's disease; FN: false negative; FP: false positive; MCI: mild cognitive impairment

*Studies involved the same participants. Only Hansson 2006 is included in the meta‐analysis
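The sensitivity and specificity columns in Table 2 can be recovered from the counts in the same row. The sketch below does this for the seven CSF t‐tau studies included in the meta‐analysis and then takes the median specificity. The class name `StudyCounts` and its field names are hypothetical; the counts are transcribed from Table 2.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class StudyCounts:
    """Counts for one index test in one study (hypothetical container)."""
    study: str
    n: int           # participants included in the analysis
    converters: int  # participants who converted to AD dementia
    fp: int          # false positives
    fn: int          # false negatives

    @property
    def tp(self) -> int:
        return self.converters - self.fn

    @property
    def tn(self) -> int:
        return self.n - self.converters - self.fp

    @property
    def sensitivity(self) -> float:
        return self.tp / self.converters

    @property
    def specificity(self) -> float:
        return self.tn / (self.n - self.converters)


# CSF t-tau rows of Table 2: (study, N, converters, FP, FN).
T_TAU = [
    StudyCounts("Amlien 2013", 39, 9, 4, 4),
    StudyCounts("Hampel 2004", 52, 29, 12, 3),
    StudyCounts("Hansson 2006", 134, 57, 9, 28),
    StudyCounts("Kester 2011", 100, 42, 29, 7),
    StudyCounts("Monge-Argiles 2011", 37, 11, 8, 3),
    StudyCounts("Palmqvist 2013", 133, 52, 23, 10),
    StudyCounts("Vos 2013", 214, 91, 28, 26),
]

for s in T_TAU:
    print(f"{s.study}: sensitivity {s.sensitivity:.0%}, specificity {s.specificity:.0%}")

median_specificity = median(s.specificity for s in T_TAU)
print(f"median specificity: {median_specificity:.0%}")
```

For Hansson 2006, for example, 57 converters with 28 false negatives give 29 true positives (29/57, about 51%), and 77 non‐converters with 9 false positives give 68 true negatives (68/77, about 88%). The median of the seven specificities rounds to 72%, the value listed for CSF t‐tau in the summary of findings.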

Table 3. Conversion from MCI to All dementia

Included studies, index test and test accuracy at study level for conversion from MCI to All dementias

| Study | Participants n/N (included in analysis) | Index test | Number and % of positive tests | Threshold (test abnormal) (prespecified Yes/No) | Number of converters (%) | FP and FN | Test accuracy at study level: sensitivity | Test accuracy at study level: specificity | Duration of follow‐up |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Eckerstrom 2010 | 42/42 | CSF t‐tau | 15/42 (36%) | ≥ 500 ng/L (No) | 21/42 (50%) | FP = 1; FN = 7 | 67% | 95% | Total sample: 19.6 ± 9.0 months; MCI‐MCI: 19.5 ± 9.3 months; MCI‐progressive: 17.6 ± 8.8 months (4/8 MCI‐AD: 23.7 ± 2.0 months) |
| Galluzzi 2010 | 90/64 | CSF t‐tau | 24/64 (37.5%) | > 450 pg/mL for participants aged 51 to 70; > 500 pg/mL for participants aged 71 to 93 (Yes) | 34/64 (53%) | FP = 5; FN = 15 | 56% | 83% | Total sample: 8.4 ± 5.1 months (range 2 to 24 months); converters: 9.6 ± 5.4 months; non‐converters: 7.0 ± 4.3 months |
| Hansson 2006 | 137/134 | CSF t‐tau | 38/134 (28%) | > 350 pg/mL (No) | 78/134 (58%) | FP = 5; FN = 45 | 42% | 91% | Total sample: median 5.2 years (range 4.0 to 6.8); MCI‐AD: median 4.3 years (range 1.1 to 6.7); MCI‐other dementias: median 4.2 years (range 1.5 to 6.3) |
| Herukka 2007 | 79/79 | CSF t‐tau | 43/79 (54%) | > 400 pg/mL (Yes) | 33/79 (42%) | FP = 17; FN = 7 | 79% | 63% | Mean 3.52 ± 1.95 years in MCI converters; mean 4.56 ± 3.09 years in MCI‐stable |

AD: Alzheimer's disease; FN: false negative; FP: false positive; MCI: mild cognitive impairment

Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 CSF t‐tau conversion to AD dementia | 7 | 709 |
| 2 CSF p‐tau conversion to AD dementia | 6 | 492 |
| 3 CSF p‐tau/ABeta ratio to AD dementia | 5 | 433 |
| 4 CSF t‐tau conversion to All dementias | 4 | 319 |
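The study and participant counts in this table can be cross‐checked against the study‐level rows of Tables 2 and 3. The sketch below sums participants and converters per test. The `STUDIES` mapping and the `tally` helper are hypothetical names; the numbers are transcribed from those tables, with Buchhave 2012 left out because it shares participants with Hansson 2006.

```python
# (study, participants included in analysis, converters), from Tables 2 and 3.
STUDIES = {
    "CSF t-tau, AD dementia": [
        ("Amlien 2013", 39, 9), ("Hampel 2004", 52, 29), ("Hansson 2006", 134, 57),
        ("Kester 2011", 100, 42), ("Monge-Argiles 2011", 37, 11),
        ("Palmqvist 2013", 133, 52), ("Vos 2013", 214, 91),
    ],
    "CSF p-tau, AD dementia": [
        ("Fellgiebel 2007", 16, 4), ("Hansson 2006", 134, 57), ("Koivunen 2008", 14, 5),
        ("Monge-Argiles 2011", 37, 11), ("Palmqvist 2013", 133, 52),
        ("Visser 2009", 158, 35),
    ],
    "CSF p-tau/ABeta ratio, AD dementia": [
        ("Hansson 2006", 134, 57), ("Koivunen 2008", 14, 5),
        ("Monge-Argiles 2011", 37, 11), ("Parnetti 2012", 90, 32),
        ("Visser 2009", 158, 35),
    ],
    "CSF t-tau, all dementias": [
        ("Eckerstrom 2010", 42, 21), ("Galluzzi 2010", 64, 34),
        ("Hansson 2006", 134, 78), ("Herukka 2007", 79, 33),
    ],
}


def tally(rows):
    """Return (number of studies, total participants, total converters)."""
    return len(rows), sum(n for _, n, _ in rows), sum(c for _, _, c in rows)


totals = {test: tally(rows) for test, rows in STUDIES.items()}
for test, (k, participants, cases) in totals.items():
    print(f"{test}: {k} studies, {cases}/{participants} cases/participants")
```

The study and participant totals agree with the table above for all four tests. The summed converters also agree with the cases/participants column of the summary of findings for three tests (164/492, 140/433 and 166/319); for CSF t‐tau the study‐level rows sum to 291 converters, whereas the summary of findings lists 436/709.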
