Pulse oximetry screening for critical congenital heart defects

Abstract

Background

Health outcomes are improved when newborn babies with critical congenital heart defects (CCHDs) are detected before acute cardiovascular collapse. The main screening tests used to identify these babies include prenatal ultrasonography and postnatal clinical examination; however, even though both of these methods are available, a significant proportion of babies are still missed. Routine pulse oximetry has been reported as an additional screening test that can potentially improve detection of CCHD.

Objectives

• To determine the diagnostic accuracy of pulse oximetry as a screening method for detection of CCHD in asymptomatic newborn infants

• To assess potential sources of heterogeneity, including:

○ characteristics of the population: inclusion or exclusion of antenatally detected congenital heart defects;

○ timing of testing: < 24 hours versus ≥ 24 hours after birth;

○ site of testing: right hand and foot (pre‐ductal and post‐ductal) versus foot only (post‐ductal);

○ oxygen saturation: functional versus fractional;

○ study design: retrospective versus prospective design, consecutive versus non‐consecutive series; and

○ risk of bias for the "flow and timing" domain of QUADAS‐2.

Search methods

We searched the Cochrane Central Register of Controlled Trials (CENTRAL; 2017, Issue 2) in the Cochrane Library and the following databases: MEDLINE, Embase, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Health Services Research Projects in Progress (HSRProj), up to March 2017. We searched the reference lists of all included articles and relevant systematic reviews to identify additional studies not found through the electronic search. We applied no language restrictions.

Selection criteria

We selected studies that met predefined criteria for design, population, tests, and outcomes. We included cross‐sectional and cohort studies assessing the diagnostic accuracy of pulse oximetry screening for diagnosis of CCHD in term and late preterm asymptomatic newborn infants. We considered all protocols of pulse oximetry screening (eg, different saturation thresholds to define abnormality, post‐ductal only or pre‐ductal and post‐ductal measurements, test timing less than or greater than 24 hours). Reference standards were diagnostic echocardiography (echocardiogram) and clinical follow‐up, including postmortem findings, mortality, and congenital anomaly databases.

Data collection and analysis

We extracted accuracy data for the threshold used in primary studies. We explored between‐study variability and correlation between indices visually through use of forest and receiver operating characteristic (ROC) plots. We assessed risk of bias in included studies using the QUADAS‐2 tool. We used the bivariate model to calculate random‐effects pooled sensitivity and specificity values. We investigated sources of heterogeneity using subgroup analyses and meta‐regression.

Main results

Twenty‐one studies met our inclusion criteria (N = 457,202 participants). Nineteen studies provided data for the primary analysis (oxygen saturation threshold < 95% or ≤ 95%; N = 436,758 participants). The overall sensitivity of pulse oximetry for detection of CCHD was 76.3% (95% confidence interval [CI] 69.5 to 82.0) (low certainty of the evidence). Specificity was 99.9% (95% CI 99.7 to 99.9), with a false‐positive rate of 0.14% (95% CI 0.07 to 0.22) (high certainty of the evidence). Summary positive and negative likelihood ratios were 535.6 (95% CI 280.3 to 1023.4) and 0.24 (95% CI 0.18 to 0.31), respectively. These results showed that out of 10,000 apparently healthy late preterm or full‐term newborn infants, six will have CCHD (median prevalence in our review). Screening by pulse oximetry will detect five of these infants as having CCHD and will miss one case. In addition, screening by pulse oximetry will falsely identify another 14 infants out of the 10,000 as having suspected CCHD when they do not have it.

The false‐positive rate for detection of CCHD was lower when newborn pulse oximetry was performed longer than 24 hours after birth than when it was performed within 24 hours (0.06%, 95% CI 0.03 to 0.13, vs 0.42%, 95% CI 0.20 to 0.89; P = 0.027).

Forest and ROC plots showed greater variability in estimated sensitivity than specificity across studies. We explored heterogeneity by conducting subgroup analyses and meta‐regression of inclusion or exclusion of antenatally detected congenital heart defects, timing of testing, and risk of bias for the "flow and timing" domain of QUADAS‐2, and we did not find an explanation for the heterogeneity in sensitivity.

Authors' conclusions

Pulse oximetry is a highly specific and moderately sensitive test for detection of CCHD with very low false‐positive rates. Current evidence supports the introduction of routine screening for CCHD in asymptomatic newborns before discharge from the well‐baby nursery.

Plain language summary

Pulse oximetry for diagnosis of critical congenital heart defects

Review question

We reviewed the evidence on the accuracy of pulse oximetry for detecting critical congenital heart defects (CCHDs) in asymptomatic newborns.

Background

CCHD occurs in around two in every 1000 newborns and is a leading cause of infant death. Timely diagnosis is crucial for the best outcome for these babies, but current screening methods may fail to detect up to 50% of affected babies before birth, and those who are sent home before diagnosis often die or suffer major morbidity. However, babies with CCHD often have low blood oxygen levels, which can be detected quickly and non‐invasively by pulse oximetry, using a sensor placed on the newborn's hand or foot. A pulse oximeter is a machine that measures, non‐invasively, the amount of oxygen carried around the body by red blood cells. Oxygen from the lungs binds to hemoglobin in red blood cells, forming oxyhemoglobin. If oxygen is not bound, deoxyhemoglobin is formed. In health, almost all hemoglobin is oxyhemoglobin, so oxygen saturation (ie, the percentage of hemoglobin that is bound to oxygen) is close to 100%. A pulse oximeter measures this by shining light through peripheral blood vessels (eg, the fingertip in adults, the hand or foot in babies). Oxyhemoglobin and deoxyhemoglobin absorb this light differently, and the proportion of light absorbed can be analyzed by software within the oximeter, which then calculates the percentage saturation of hemoglobin with oxygen.

Study characteristics

We searched up to March 2017 for evidence on the use of pulse oximetry to detect CCHD in newborns and found 21 studies. These studies used different thresholds to define a pulse oximetry test as positive. We combined all studies that used a threshold of around 95% (19 studies with 436,758 newborns).

Key results

This review found that for every 10,000 apparently healthy newborns screened, around six will have CCHD. The pulse oximetry test will correctly identify five of these newborns as having CCHD (but will miss one case). Newborns who are missed could die or suffer major morbidity.

For every 10,000 apparently healthy newborns screened, 9994 will not have CCHD. The pulse oximetry test will correctly identify 9980 of them (but 14 newborns will be investigated for suspected CCHD). Some of these newborns may be exposed to unnecessary additional tests and a longer hospital stay, but a few will have a serious non‐cardiac illness.

The number of newborns wrongly investigated for CCHD decreased when pulse oximetry was performed more than 24 hours after birth.

Certainty of the evidence

We judged the included studies to be mainly at low or unclear risk of bias across the domains assessed. Some studies used less robust methods to confirm negative test results. We rated the overall certainty of the evidence as moderate.

Authors' conclusions

Implications for practice

This review provides further compelling evidence for the use of pulse oximetry as a routine screening test for early identification of CCHD in asymptomatic babies in the well‐baby nursery. The test has high specificity and moderate sensitivity and meets the criteria for universal screening.

Current evidence supports the introduction of routine screening for CCHD in asymptomatic newborns before discharge from the well‐baby nursery. The test appears feasible in various middle‐income countries and shows consistent test accuracy.

Some important elements regarding specific screening algorithms need further consideration. Data show no difference in sensitivity based on the site of testing (post‐ductal only or pre‐ductal and post‐ductal). However, only two studies using pre‐ductal and post‐ductal saturations reported absolute saturation values rather than just test results (de‐Wahl Granelli 2009; Ewer 2011). As has been reported previously by Ewer 2016 and Thangaratinam 2012, several CCHD cases that were detected by pre‐ductal and post‐ductal testing would have been missed by post‐ductal testing alone, but the numbers are too small to affect sensitivity analysis results.

In addition, the lower false‐positive rate with screening after 24 hours must be balanced against two considerations: many countries discharge babies within 24 hours, and, importantly, most reported studies did not take into account the risk that a baby with CCHD or other serious illness may present before screening takes place (de‐Wahl Granelli 2014; Ewer 2011; Ewer 2016; Riede 2010; Thangaratinam 2012).

The prevalence of CCHD is quite low, and most test‐positive infants do not have the target condition. The false‐positive rate is variable and depends largely on the timing of the screening (earlier screening ‐ within 24 hours of age ‐ has a higher false‐positive rate than screening after 24 hours). This raises concerns that a false‐positive test may unnecessarily increase parental anxiety and may lead to avoidable investigations and delayed discharge. Investigators in the UK PulseOx study assessed the acceptability of pulse oximetry screening and reported on anxiety created by the test ‐ particularly among mothers of false‐positive (FP) babies (Ewer 2012a; Powell 2013). Investigators quantified satisfaction with, and perceptions of, the test and anxiety and depression following screening by using validated questionnaires on samples of mothers whose babies were true positive, false positive, and true negative. All participants were predominantly satisfied with screening, and it is important to note that mothers given false‐positive results after screening were no more anxious than those given true‐negative results. Many studies report identification of alternative non‐cardiac conditions by pulse oximetry screening. Although these conditions ‐ such as congenital pneumonia and early‐onset sepsis ‐ are technically false positives, their identification may be seen as a positive additional benefit of screening; they are more likely to be detected within the first 24 hours, allowing early treatment of individuals with these potentially serious conditions. Healthcare providers must consider the potential for overdiagnosis of these conditions following screening and must apply rigorous criteria to classify these conditions.

Implications for research

The large sample size of this review along with precise estimates of sensitivity and specificity suggests that further research into the accuracy of this screening method is unnecessary. In addition, several countries, including the USA, have already implemented screening. However, given concerns related to differential verification, we propose that monitoring of screening outcomes (including possible reduction in early mortality) and management of false positives should be performed in a rigorous manner.

Further evidence regarding the routine screening of babies outside the well‐baby nursery (including neonatal intensive care unit [NICU] stays and out‐of‐hospital births) is required. Additional raw saturation data and further analysis are required to elucidate the relative sensitivities of post‐ductal versus pre‐ductal and post‐ductal saturation testing.

The ability of pulse oximetry to detect non‐cardiac illness such as respiratory and infectious conditions has been well described, but test accuracy remains unclear.

Summary of findings

Should pulse oximetry be used to diagnose CCHD in asymptomatic newborns?

Patient or population: asymptomatic newborns at the time of pulse oximetry screening

Setting: hospital births

Index test: pulse oximetry

Reference test: Reference standards were both diagnostic echocardiography (echocardiogram) and clinical follow‐up in the first 28 days of life, including postmortem findings and mortality and congenital anomaly databases to identify false‐negative patients.

Studies: We included prospective or retrospective cohorts and cross‐sectional studies. We excluded case reports and studies of case‐control design.

Threshold: < 95% or ≤ 95%

Summary accuracy (95% CI): sensitivity 76.3% (69.5 to 82.0); specificity 99.9% (99.7 to 99.9)

Number of participants (diseased/non‐diseased): 436,758 (345/436,413)

Number of studies: 19

Prevalence, median (range): 0.6 per 1000 (0.1 to 3.7)

Implications in a cohort of 10,000 newborns tested (95% CI):

| Outcome | Prevalence 0.6 per 1000 | Prevalence 0.1 per 1000 | Prevalence 3.7 per 1000 | Certainty of the evidence (GRADE) |
| --- | --- | --- | --- | --- |
| True positives (newborns with CCHD) | 5 (4 to 5) | 1 (1 to 1) | 28 (26 to 30) | LOW* ⊕⊕⊝⊝ |
| False negatives (newborns incorrectly classified as not having CCHD) | 1 (1 to 2) | 0 (0 to 0) | 9 (7 to 11) | LOW* ⊕⊕⊝⊝ |
| True negatives (newborns without CCHD) | 9980 (9966 to 9987) | 9985 (9971 to 9992) | 9949 (9935 to 9956) | HIGH ⊕⊕⊕⊕ |
| False positives (newborns incorrectly classified as having CCHD) | 14 (7 to 28) | 14 (7 to 28) | 14 (7 to 28) | HIGH ⊕⊕⊕⊕ |

CCHD: critical congenital heart defect; CI: confidence interval.

*Sensitivity: We downgraded the certainty of the evidence from high to low because of the low number of CCHD cases included in the review (serious imprecision) and because of a serious risk of differential verification bias (ie, diagnosis was established by echocardiography in test‐positive cases, whereas test negatives were usually confirmed by clinical follow‐up or by accessing congenital malformation registries and mortality databases).

Certainty of the evidence (Balshem 2011)
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
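
The implications figures in the summary of findings follow from simple arithmetic on the pooled estimates. The Python sketch below applies the pooled sensitivity (76.3%) and false‐positive rate (0.14%) to a cohort of 10,000 newborns at a given prevalence. The function and argument names are illustrative assumptions rather than part of the review's methods, and rounding to whole infants is a simplification.

```python
def natural_frequencies(sensitivity, false_positive_rate,
                        prevalence_per_1000, cohort=10_000):
    """Expected screening outcomes in a cohort (illustrative helper, not from the review)."""
    with_cchd = cohort * prevalence_per_1000 / 1000
    without_cchd = cohort - with_cchd
    true_pos = with_cchd * sensitivity
    false_pos = without_cchd * false_positive_rate
    return {
        "TP": round(true_pos),
        "FN": round(with_cchd - true_pos),
        "FP": round(false_pos),
        "TN": round(without_cchd - false_pos),
    }


# Pooled estimates reported in the review; prevalences are the median and the range limits.
for prevalence in (0.6, 0.1, 3.7):
    print(prevalence, natural_frequencies(0.763, 0.0014, prevalence))
```

At the median prevalence of 0.6 per 1000, this gives 5 true positives, 1 false negative, 14 false positives, and 9980 true negatives, in line with the counts reported for that prevalence.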

Background

Congenital heart defects (CHDs) constitute the most common group of congenital malformations, with an incidence of 4 to 10 per 1000 live births (Botto 2001; Lloyd‐Jones 2009; Mahle 2009; Wren 2008); they account for more deaths than any other congenital malformation (Heron 2007; Mahle 2009; Office of National Statistics, 2015), and up to 10% of all infant deaths are attributed to them (Abu‐Harb 1994; Boneva 2001; Knowles 2005; Lloyd‐Jones 2009; Wren 2008). Life‐threatening critical CHDs (CCHDs) account for approximately 15% to 25% of all CHDs (Mahle 2009; Wren 2008). Most CCHDs are amenable to treatment, but poor clinical condition at the time of surgery increases mortality and has been shown to result in worse outcomes for conditions such as hypoplastic left heart (Brown 2001; Brown 2006), coarctation of the aorta (Franklin 2002), and transposition of the great arteries (Tworetzky 2001). Early detection of these conditions can reduce the risk of acute cardiovascular collapse and death (Abu‐Harb 1994; Mahle 2009).

Most newborns with a CCHD are asymptomatic at birth (Wren 2008); detection before the onset of symptoms usually involves routine screening by antenatal ultrasound scan, as described by Allan 1986 and Bull 1999, and by postnatal clinical examination of the cardiovascular system, as reported by Hall 1999. Unfortunately, both methods have a variable, and often low, detection rate (Abu‐Harb 1994a; Carvalho 2002; Chew 2007; Garne 2001; Tegnander 2006; Westin 2006; Wren 1999), and up to 30% of infants born with CCHDs are discharged home before the diagnosis has been established (Abu‐Harb 1994; Brown 2006; Mellander 2006; Wren 2008), with reported mortality rates as high as 50% (Chang 2008).

Although antenatal detection rates following screening ultrasonography are improving, average detection of isolated CCHD remains less than 50% (Abu‐Harb 1994a; Carvalho 2002; Chew 2007; Garne 2001; Tegnander 2006; Westin 2006; Wren 1999). Clinical examination abnormalities such as murmur and weak pulse are often absent in early postnatal life, and the more common finding of cyanosis (bluish discoloration of the skin due to reduced oxygen in the blood) is frequently clinically undetectable (Mahle 2008; O'Donnell 2007). The fact that most infants with CCHD will have such mild cyanosis has led to the exploration of pulse oximetry assessment as a possible screening test to identify affected infants (Ewer 2012a; Knowles 2005; Lloyd‐Jones 2009).

Following publication of several large test accuracy studies, several countries adopted pulse oximetry screening as routine practice, and many more are considering its introduction (de‐Wahl Granelli 2014; Ewer 2014; Kuelling 2009; Mahle 2012; Manzoni 2017).

In addition to test accuracy, studies have demonstrated that pulse oximetry screening is cost‐effective (Knowles 2005; Peterson 2013; Roberts 2012), and that it is acceptable to both parents and clinical staff (Narayen 2017; Powell 2013).

The vast majority of babies studied have been screened in a hospital setting (specifically, the well‐baby nursery) at low altitude. However, screening has been reported recently in other settings, including neonatal units (Iyengar 2014; Suresh 2013), out‐of‐hospital settings such as home births (Cawsey 2016; Lhost 2014; Narayen 2016a), and births at moderate altitude (Han 2013; Wright 2014).

This review does not include settings outside the well‐baby nursery.

Target condition being diagnosed

The definition of CCHD is not consistent, and the literature reveals many interpretations (Ewer 2012a). One of the difficulties arises because some conditions (such as coarctation of the aorta and pulmonary stenosis) may or may not predispose to acute collapse, depending on relative severity. For the purposes of this review, we have used a previously described definition of CCHD, that is, "any potentially life‐threatening duct‐dependent heart lesion from which infants either die or require invasive procedures (surgery or cardiac catheterization) in the first 28 days of life" (Ewer 2012a; Wren 2008). The definition includes all infants with hypoplastic left heart syndrome, pulmonary atresia with intact ventricular septum, simple transposition of the great arteries, or interruption of the aortic arch. In addition, all infants dying or needing surgery or catheter in the first 28 days of life with coarctation of the aorta, aortic valve stenosis, pulmonary valve stenosis, tetralogy of Fallot, pulmonary atresia with ventricular septal defect, or total anomalous pulmonary venous connection are classified as having critical congenital heart defects. This definition offers the advantages that it allows a degree of assessment of severity of certain lesions based on early death or intervention, it is relatively easy to categorize, and it has been used in several test accuracy studies and previous systematic reviews (Ewer 2011; Ewer 2012a; Thangaratinam 2012; Zhao 2014).

Index test(s)

Pulse oximetry measurement of oxygen saturations in an asymptomatic newborn infant can be used to identify CCHD before discharge from hospital. Pulse oximetry is an accurate and well‐established test used to quantify hypoxemia (low oxygen levels in the blood) that is rapid, painless, and easy to perform in all patient groups including newborn infants (Ewer 2012a; Ewer 2013; Knowles 2005; Lloyd‐Jones 2009; Mahle 2009; Narayen 2016b). Any trained individual can perform pulse oximetry screening, and results can be obtained in approximately five minutes. For infants enrolled in pulse oximetry screening studies, pulse oximetry probes (to measure oxygen saturations) are placed on the foot only (post‐ductal) or on the right hand and foot (pre‐ductal and post‐ductal) (Ewer 2012b; Ewer 2013; Ewer 2016). The index test allows screening to reduce the number of infants discharged from hospital before diagnosis of CCHD, and it can be performed at any point before discharge, either before or after the clinical examination.

Clinical pathway

Standard screening for CCHD usually includes midtrimester ultrasonography of pregnant women, which includes assessment of fetal cardiac anatomy. If a cardiac defect is suspected when this examination is performed, a detailed fetal echocardiogram may confirm the diagnosis. Most newborn infants also undergo one or more clinical examinations before discharge from the hospital, which include assessment of the cardiovascular system (auscultation of heart sounds, palpation of peripheral pulses). If a cardiac defect is suspected upon completion of either of these screening tests, then a postnatal diagnostic echocardiogram is usually obtained. As described previously, these screening tests have variable, and often low, detection rates.

The population included in this review may or may not have had antenatal screening. All were asymptomatic at the time of pulse oximetry screening.

Alternative test(s)

In addition to the screening tests already described, alternatives such as routine screening fetal echocardiography and postnatal echocardiography have been proposed but are unlikely to be cost‐effective (Knowles 2005). This review did not assess the accuracy of existing screening tests (ie, antenatal ultrasonography and physical examination).

Rationale

Hypoxemia, or suboptimal arterial oxygen saturation, is present in most infants with CCHD (Ewer 2012a; Lloyd‐Jones 2009; Mahle 2009). Some may have overt cyanosis, but in many, the degree of hypoxemia may be difficult to discern on clinical examination. Pulse oximetry is a quick, painless, non‐invasive, and reliable method used to determine arterial oxygen saturation levels; it has been widely used in many areas of clinical medicine for over 30 years. The concept of using oxygen saturations as a screen for critical heart defects was first reported more than 15 years ago. Pulse oximetry screening may allow detection of infants who have been missed by other screening methods before they are discharged from hospital, allowing urgent cardiac intervention before the onset of life‐threatening cardiorespiratory collapse. Systematic reviews of pulse oximetry screening studies have been published (Thangaratinam 2007; Thangaratinam 2012), and indeed this screening technique is now common practice in the United States and in some European countries. However, this is the only review that includes recent large test accuracy studies, including one reported from a middle‐income country (China).

We performed a systematic review of studies assessing the diagnostic accuracy of screening with pulse oximetry (index test) in relation to echocardiography or clinical follow‐up (reference standard) for detection of CCHD in asymptomatic newborn infants. It is important to note that we wanted to determine how well a negative pulse oximetry test result rules out a CCHD diagnosis. Several previous reviews have explored this topic (Ewer 2012a; Ewer 2013; Knapp 2010; Knowles 2005; Lloyd‐Jones 2009; Narayen 2016b; Thangaratinam 2007).

Objectives

  • To determine the diagnostic accuracy of pulse oximetry as a screening method for detection of CCHD in asymptomatic newborn infants

Secondary objectives

  • To assess potential sources of heterogeneity, including:

    • characteristics of the population: inclusion or exclusion of antenatally detected congenital heart defects;

    • timing of testing: < 24 hours versus ≥ 24 hours after birth;

    • site of testing: right hand and foot (pre‐ductal and post‐ductal) versus foot only (post‐ductal);

    • oxygen saturation: functional versus fractional;

    • study design: retrospective versus prospective design, consecutive versus non‐consecutive series; and

    • risk of bias for the "flow and timing" domain of QUADAS‐2.

Methods

Criteria for considering studies for this review

Types of studies

We considered inclusion of prospective or retrospective cohort and cross‐sectional studies evaluating the diagnostic accuracy of pulse oximetry as a screening method for detection of critical congenital heart defects in asymptomatic newborn infants. A study of diagnostic accuracy should provide sufficient data for construction of the two‐by‐two table showing the cross‐classification of disease status (CCHD) and test outcome (pulse oximetry). We excluded studies if we could not extract true‐positive (TP), true‐negative (TN), false‐positive (FP), and false‐negative (FN) values after contacting corresponding authors of primary studies when necessary. We excluded case reports and studies of case‐control design.

Participants

We included studies that recruited asymptomatic (with no signs of respiratory or cardiac illness) term or near‐term newborns before discharge from hospital.

Index tests

The test under evaluation was pulse oximetry screening to identify low oxygen saturation. We included all protocols of screening (eg, post‐ductal [foot] only vs pre‐ductal and post‐ductal [right hand and foot], different saturation thresholds to define abnormality, different numbers of repeat tests). Criteria for defining a screen as positive or negative in this review were those used by the authors of respective publications.

Target conditions

Critical congenital heart defects as defined above.

Reference standards

Reference standards were diagnostic echocardiography (echocardiogram) and clinical follow‐up in the first 28 days of life, including postmortem findings and information from mortality and congenital anomaly databases, to identify patients with false‐negative findings.

Search methods for identification of studies

Electronic searches

Information Specialists of the Cochrane Neonatal Review Group performed the searches. Using the strategy described in Appendix 1, they searched the following databases.

  • Cochrane Central Register of Controlled Trials (CENTRAL; 2017, Issue 2) in the Cochrane Library.

  • MEDLINE via PubMed (1966 to March 2017).

  • Embase via Ovid (1980 to March 2017).

  • Cumulative Index to Nursing and Allied Health Literature (CINAHL) (1982 to March 2017).

The MEDLINE search strategy included medical subject headings (MeSH) and free text words (see Appendix 1). We adjusted this strategy for use with the other electronic databases. We considered a combination of medical subject headings and text terms to generate three subsets of citations: one subset indexing the index test (pulse oximetry), a second subset indexing the target population (infant‐newborn), and a third subset indexing the clinical condition (congenital heart disease). We combined these subsets to generate a set of citations relevant to our research question. We considered both published and unpublished reports for inclusion and excluded studies published in abstract form only. We applied no language restriction to the electronic searches.

Searching other resources

We used the Science Citation Index, accessed via the Institute for Scientific Information (ISI) Web of Science, to retrieve reports citing the studies included in this review. We searched for similar systematic reviews in the Database of Abstracts of Reviews of Effects (DARE) to March 2017, to cross‐reference results. We also searched the Health Services Research Projects in Progress (HSRProj) database (http://www.nlm.nih.gov/hsrproj/) (searched on March 20, 2017). We handsearched the reference lists of all relevant primary studies on the topic of our interest to identify cited articles not captured by our electronic searches (up to March 15, 2017). We applied no language restrictions.

Data collection and analysis

Selection of studies

Two review authors (MNP and JZ) independently screened titles and abstracts identified through electronic literature searches to identify potentially eligible studies. First, we excluded those records classified by both review authors as "excluded." Second, we independently assessed the full text of reports classified as "unsure" or "potentially eligible" by applying the selection criteria outlined above in the Criteria for considering studies for this review section. We resolved disagreements through discussion. If we could not reach consensus, we consulted a third review author (AKE).

Data extraction and management

We used a standardized data extraction form to aid extraction of relevant information and data from each included study. Three review authors (MNP, LFP, and AKE) separately participated in data extraction. MNP and LFP extracted data corresponding to study design, participant details, method of testing, threshold saturation level, and type of oxygen saturation measured, as well as timing of the test and inclusion or exclusion of infants with suspected congenital heart defects after antenatal ultrasound screening in pregnancy, reference tests, and funding. AKE and MNP extracted the following data to reconstruct the two‐by‐two table: true‐positive, false‐positive, true‐negative, and false‐negative values or, if not available, relevant parameters (sensitivity, specificity, or positive and negative predictive values). Two review authors (MNP and JZ) incorporated data and study characteristics into Review Manager 5.3 (RevMan 2014).

Dealing with duplicate publications

We included studies that had been published in duplicate only once, ensuring that we extracted all relevant data from all publications.

Inconclusive results

Although we did not anticipate uninterpretable results, when we detected such cases, we excluded them from analysis and reported their frequency in tables.

Assessment of methodological quality

Two review authors (MNP and LFP) independently appraised the methodological quality of each included study using the QUADAS‐2 tool (Whiting 2011). QUADAS‐2 consists of four domains, each requiring a risk of bias categorization of low, high, or unclear risk. The first three domains are also assessed in terms of concerns about applicability (applicability concerns ratings). Each domain comprises a set of signaling questions that should be marked as "yes," "no," or "unclear." We tailored QUADAS‐2 for our specific review question by modifying signaling questions accordingly and providing guidance on how to assess risk of bias and applicability concerns ratings (Appendix 2). We resolved disagreements between risk of bias and applicability concern ratings through discussion or by consultation with a third review author (AKE). We summarized our results in the text and in tables and corresponding figures. We decided post hoc to assess the certainty of evidence by using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach (GRADEpro GDT; Hultcrantz 2017; Schunemann 2008).

Statistical analysis and data synthesis

We performed analyses using methods described in Chapter 10 of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy (Macaskill 2010).

We considered pulse oximetry screening as positive if the oxygen saturation level was below the threshold defined in the primary study, and negative if it was above that threshold. Cross‐classification of these test results with those of the reference standard(s) produced the numbers of true positives, false positives, true negatives, and false negatives for each study, based on the ability of pulse oximetry to detect CCHD.

We used data from the two‐by‐two tables to calculate sensitivity and specificity for individual studies. We present individual study results by plotting the estimates of sensitivity and specificity (and their 95% confidence intervals) in both forest plots and receiver operating characteristic (ROC) scatter plots. We extracted accuracy data for the threshold used in primary studies.
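As a rough illustration of how a two‐by‐two table yields these estimates, the following Python sketch computes sensitivity and specificity with 95% confidence intervals. The function names and the example counts are hypothetical, and the Wilson score interval is an assumption chosen for simplicity; the text above does not state which interval method was used for the forest plots.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def accuracy_from_two_by_two(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity and specificity, each with a 95% Wilson interval."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one study; these are not data from the review.
example = accuracy_from_two_by_two(tp=18, fp=40, fn=6, tn=19936)
print(example)
```

With few diseased infants in the denominator, the interval around sensitivity is much wider than the interval around specificity, which is why sensitivity estimates from small studies should be read with caution.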

We performed meta‐analyses using a bivariate model (Chu 2006; Reitsma 2005). This model accounts for intra‐study accuracy variability and inter‐study variations in test performance with inclusion of random effects. We analyzed studies sharing the same threshold and obtained summary accuracy estimates (when the number of studies was enough). We present these estimates with a 95% confidence ellipse in the ROC space. We used pooled estimates of sensitivity and specificity to derive positive and negative likelihood ratios that can be used to update the prior probability of having CCHD to a post‐test probability of having CCHD after a positive or negative pulse oximetry result. The greater the positive likelihood ratio and the lower the negative likelihood ratio, the more important the effect of the test on changing pretest into post‐test probabilities. We did not calculate positive and negative predictive values because these indices depend on the prevalence of the target condition (ie, CCHD).

For analyses, we used a METADAS SAS macro that estimates parameters for the model with SAS Proc NLMIXED (SAS Institute Inc. 2004; Takwoingi 2010). We entered parameter estimates from the bivariate model into RevMan to produce the summary operating point with a 95% confidence region and a 95% prediction region (Chu 2006; Reitsma 2005; RevMan 2014).

Investigations of heterogeneity

We explored between‐study variability and correlation between indices visually through forest and ROC plots. We measured total between‐study variability in sensitivity and in specificity through variances of the random effects for logit(sensitivity), logit(specificity), and their covariance of the bivariate model. We also provided confidence and prediction ellipses. We further investigated heterogeneity by exploring effects of several study‐level factors through subgroup and meta‐regression analyses including covariate terms to the bivariate model (Chu 2006; Reitsma 2005).

When available, we examined the following covariates.

  • Inclusion or exclusion of antenatally detected congenital heart defects.

  • Screening test method (the screening test may be performed at different times after birth, oxygen saturations may be measured at pre‐ductal and post‐ductal sites or at post‐ductal sites only, and, finally, the oxygen saturation measured could be expressed as "functional" [which refers to the percentage of hemoglobin capable of binding oxygen that is oxygenated] or "fractional" [which refers to the percentage of total hemoglobin that is oxygenated]). In most cases, differences between the two values are very small, and most modern pulse oximeters measure functional saturations only.

  • Study design (included studies may be prospective or retrospective, and may enroll consecutive patients or not). We expect that retrospective studies are more prone to information and selection biases. In this review, it is more likely that medical records of infants with a positive index test result include more information as compared with medical records of infants with a negative test result (information bias). In a similar way, infants with any CCHD are more likely than infants without CCHD to be detected and included in the study after a retrospective medical records review (selection bias).

  • Risk of bias of the "flow and timing" domain of the QUADAS‐2 questionnaire (unclear/high vs low risk of bias). We expected that studies would use different reference standards to confirm index test results (echocardiogram, clinical follow‐up, mortality registries, and congenital anomaly databases).

Sensitivity analyses

We examined the robustness of meta‐analyses by conducting sensitivity analyses. We checked the impact of excluding studies from analysis according to domains of the QUADAS‐2 assessment. Additionally, we decided to perform ad hoc sensitivity analyses to explore how sensitivity and specificity vary by including or excluding studies with different thresholds.

Assessment of reporting bias

We did not investigate reporting bias, given the limited power of available tests and uncertainty about interpreting statistical evidence of funnel plot asymmetry as necessarily implying publication bias (Leeflang 2008).

Results

Results of the search

Figure 1 shows details of the search and selection process. Electronic database searches yielded a total of 3415 references from CENTRAL, MEDLINE, Embase, and CINAHL. Searches for primary studies through other resources did not reveal additional potentially eligible studies.


Flow of studies through the screening process. CCHD: critical congenital heart defect.

After de‐duplication, two review authors (MNP and JZ) independently assessed 2695 references against the inclusion criteria. During initial screening of titles and abstracts, we identified 56 studies (46 full‐text papers and 10 conference abstracts). We excluded 2639 references because they did not meet the inclusion criteria. We also excluded those published in abstract form only (n = 10). Of 46 full‐text studies, nine studies provided a partial two‐by‐two diagnostic table, and we excluded them. We excluded 17 other studies for the following reasons.

  • Different outcomes (not accuracy) (n = 6).

  • Inability to determine CCHD outcomes (n = 1).

  • Out‐of‐hospital births (n = 2).

  • Preliminary studies (n = 1).

  • Different population (n = 2).

  • Health technology assessment report on already included study (n = 1).

  • Case‐control study (n = 1).

  • Journal club (n = 1).

  • Case report (n = 1).

  • Different index test (n = 1) (see Characteristics of excluded studies).

We obtained one additional study by searching Science Citation Index (Gomez‐Rodriguez 2015). We included 21 studies in a quantitative synthesis (Arlettaz 2006; Bakr 2005; Bhola 2014; de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Jones 2016; Klausner 2017; Koppel 2003; Meberg 2008; Oakley 2015; Ozalkaya 2016; Richmond 2002; Riede 2010; Rosati 2005; Sendelbach 2008; Singh 2014; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015).
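The counts reported in this section can be reconciled with a few lines of arithmetic. The Python sketch below is only a bookkeeping check; the variable names are illustrative, and every number is copied from the text above.

```python
# Counts as reported in the text above.
screened_after_deduplication = 2695
excluded_at_title_abstract = 2639
conference_abstracts = 10
partial_two_by_two_table = 9
other_exclusions = {
    "different outcomes (not accuracy)": 6,
    "inability to determine CCHD outcomes": 1,
    "out-of-hospital births": 2,
    "preliminary studies": 1,
    "different population": 2,
    "HTA report on already included study": 1,
    "case-control study": 1,
    "journal club": 1,
    "case report": 1,
    "different index test": 1,
}
added_from_citation_search = 1

potentially_eligible = screened_after_deduplication - excluded_at_title_abstract
full_text_papers = potentially_eligible - conference_abstracts
other_excluded_total = sum(other_exclusions.values())
included = (
    full_text_papers
    - partial_two_by_two_table
    - other_excluded_total
    + added_from_citation_search
)

print(potentially_eligible, full_text_papers, other_excluded_total, included)
```

The totals agree with the 56 potentially eligible reports, 46 full‐text papers, 17 other exclusions, and 21 included studies stated in the text.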

Characteristics of studies

We provide in the Characteristics of included studies table details on the design, setting, population, index test, target condition, and reference standard of all included studies. We prepared an additional table (Table 1) to summarize the main characteristics.

Table 1. Main studies characteristics

Study | Population: antenatal diagnosis of CHD | Index test: pulse oximeter | Index test: limb | Index test: test timing | Index test: oxygen saturation | Index test: threshold | Reference standard: positive pulse oximetry | Reference standard: negative pulse oximetry
--- | --- | --- | --- | --- | --- | --- | --- | ---
Arlettaz 2006 | included | Nellcor NPB‐40 | post‐ductal | within 24 hours | functional | < 95% | echocardiography | NA
Bakr 2005 | excluded | Digioxi PO 920 | pre‐ductal and post‐ductal | longer than 24 hours | fractional | ≤ 94% | echocardiography | cardiology database
Bhola 2014 | included | Masimo Radical 5 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | cardiology database
De‐Wahl 2009 | excluded | Radical SET v4 | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | mortality data
Ewer 2011 | included | Radical‐7 | pre‐ductal and post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up, cardiology database & congenital registry
Gomez‐Rodriguez 2015 | excluded | Radical‐5 | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up
Jones 2016 | excluded | NA | pre‐ductal and post‐ductal | within 24 hours | NA | ≤ 95% | echocardiography | National Congenital Heart Disease Audit
Klausner 2017 | excluded | NA | pre‐ductal and post‐ductal | longer than 24 hours | NA | < 95% | echocardiography | clinical follow‐up
Koppel 2003 | excluded | Ohmeda Medical | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | clinical follow‐up & congenital registry
Meberg 2008 | excluded | RAD‐5v | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up
Oakley 2015 | excluded | Nellcor NPB 40 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | cardiology database & mortality data
Ozalkaya 2016 | excluded | Nellcor | pre‐ductal and post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | echocardiography
Richmond 2002 | included | Oxi machine | post‐ductal | within 24 hours | fractional | < 95% | echocardiography | mortality data & congenital registry
Riede 2010 | excluded | NA | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | congenital registry
Rosati 2005 | excluded | NA | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | clinical follow‐up
Sendelbach 2008 | excluded | Nellcor N‐395 | post‐ductal | within 24 hours | functional | < 96% | echocardiography | clinical follow‐up
Singh 2014 | excluded | NA | pre‐ductal and post‐ductal | within 24 hours | functional | < 95% | echocardiography | mortality data & congenital registry & cardiology database
Turska 2012 | excluded | Novametrix, Nellcor & Masimo | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up and Public Health registries
Van Niekerk 2016 | excluded | Nellcor | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | NA
Zhao 2014 | excluded | RAD‐5V | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | clinical follow‐up
Zuppa 2015 | excluded | Ohmeda 3900 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | NA

NA: not available

Of 3415 references, we identified 21 primary studies that were eligible for inclusion and provided data for 457,202 newborn infants (Figure 1). Studies were published between 2002 and 2017. Countries included were United Kingdom (Ewer 2011; Jones 2016; Oakley 2015; Richmond 2002; Singh 2014), Italy (Rosati 2005; Zuppa 2015), USA (Klausner 2017; Koppel 2003; Sendelbach 2008), Australia (Bhola 2014), China (Zhao 2014), Germany (Riede 2010), Mexico (Gomez‐Rodriguez 2015), Norway (Meberg 2008), Poland (Turska 2012), Saudi Arabia (Bakr 2005), South Africa (Van Niekerk 2016), Sweden (de‐Wahl Granelli 2009), Switzerland (Arlettaz 2006), and Turkey (Ozalkaya 2016).

Sixteen studies were prospective cohorts (Arlettaz 2006; Bakr 2005; de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Koppel 2003; Meberg 2008; Oakley 2015; Richmond 2002; Riede 2010; Rosati 2005; Sendelbach 2008; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015), and five were retrospective cohorts (Bhola 2014; Jones 2016; Klausner 2017; Ozalkaya 2016; Singh 2014). Seventeen studies excluded newborns who were suspected to have congenital heart disease after antenatal ultrasound screening during pregnancy (Bakr 2005; de‐Wahl Granelli 2009; Gomez‐Rodriguez 2015; Jones 2016; Klausner 2017; Koppel 2003; Meberg 2008; Oakley 2015; Ozalkaya 2016; Riede 2010; Rosati 2005; Sendelbach 2008; Singh 2014; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015) (Table 1).

Nine studies performed pulse oximetry within 24 hours after birth (Arlettaz 2006; Ewer 2011; Gomez‐Rodriguez 2015; Jones 2016; Meberg 2008; Richmond 2002; Sendelbach 2008; Singh 2014; Turska 2012) (Table 1). Twelve studies used the foot alone (post‐ductal) to measure oxygen saturation, and the remainder used both right hand and foot (pre‐ductal and post‐ductal) (Table 1). Investigators used several different pulse oximeter models (see description in Table 1). Two studies measured fractional saturations (Bakr 2005; Richmond 2002) (Table 1). Eight studies used a post‐ductal saturation threshold of less than 95% (Arlettaz 2006; Bhola 2014; Gomez‐Rodriguez 2015; Meberg 2008; Oakley 2015; Richmond 2002; Turska 2012; Zuppa 2015), three studies used a post‐ductal saturation threshold ≤ 95% (Koppel 2003; Riede 2010; Rosati 2005), six studies used both pre‐ductal and post‐ductal saturations less than 95% (de‐Wahl Granelli 2009; Ewer 2011; Klausner 2017; Singh 2014; Van Niekerk 2016; Zhao 2014), and two studies used both pre‐ductal and post‐ductal saturations ≤ 95% (Jones 2016; Ozalkaya 2016). Two studies reported different positive thresholds (Bakr 2005 reported both pre‐ductal and post‐ductal fractional saturation ≤ 94%, and Sendelbach 2008 reported post‐ductal saturation < 96%) (Table 1). In summary, the most common threshold was less than 95% (n = 14); five studies reported a threshold lower than or equal to 95%, and two studies reported thresholds ≤ 94% and < 96%, respectively. No study reported results for more than one threshold.

Studies used different methods to verify test results: Investigators verified positive test results by echocardiography and negative results by interrogation of congenital anomaly registers, mortality data, or clinical follow‐up (Table 1).

Methodological quality of included studies

We appraised the quality of primary diagnostic accuracy studies using the QUADAS‐2 tool. We present quality assessment results for individual studies in the Characteristics of included studies table and in Figure 2. We have summarized the overall risk of bias and applicability concerns of studies in Figure 3.


Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.


Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.

We judged the risk that patient selection (QUADAS‐2, domain 1) had introduced bias as low in 10 studies (Arlettaz 2006; Bakr 2005; de‐Wahl Granelli 2009; Ewer 2011; Jones 2016; Klausner 2017; Koppel 2003; Ozalkaya 2016; Richmond 2002; Zuppa 2015), high in two because investigators did not avoid inappropriate exclusions (Oakley 2015; Van Niekerk 2016), and unclear in the remaining nine studies (Bhola 2014; Gomez‐Rodriguez 2015; Meberg 2008; Riede 2010; Rosati 2005; Sendelbach 2008; Singh 2014; Turska 2012; Zhao 2014). Applicability was of low concern for all studies in the patient selection domain.

For the index test assessment (QUADAS‐2, domain 2), we considered all studies to be at low risk of bias and low concern regarding applicability.

We judged the risk that conduct or interpretation of reference standard(s) (QUADAS‐2, domain 3) had introduced bias as low in four studies because investigators used echocardiography to confirm both positive and negative pulse oximetry cases (Ozalkaya 2016), or because they used echocardiography to confirm pulse oximetry positives and clinical follow‐up in the first 28 days of life, which included postmortem findings and mortality and congenital anomaly databases to identify false‐negative screening cases (Ewer 2011; Koppel 2003; Turska 2012). This comprehensive combination of clinical follow‐up and review of registries and databases was considered as having low risk of bias. We considered that three studies reporting only echocardiography as the reference standard for positive pulse oximetry results were at high risk of bias (Arlettaz 2006; Van Niekerk 2016; Zuppa 2015). We considered risk for the remaining 14 studies as unclear because they used an incomplete reference standard to identify false‐negative cases (Bakr 2005; Bhola 2014; de‐Wahl Granelli 2009; Gomez‐Rodriguez 2015; Jones 2016; Klausner 2017; Meberg 2008; Oakley 2015; Richmond 2002; Riede 2010; Rosati 2005; Sendelbach 2008; Singh 2014; Zhao 2014); six studies used echocardiography and follow‐up (Gomez‐Rodriguez 2015; Klausner 2017; Meberg 2008; Rosati 2005; Sendelbach 2008; Zhao 2014), and eight studies used echocardiography and different mortality and malformations registries (Bakr 2005; Bhola 2014; de‐Wahl Granelli 2009; Jones 2016; Oakley 2015; Richmond 2002; Riede 2010; Singh 2014). It is noteworthy that only one study used echocardiography for positive and negative pulse oximetry results (Ozalkaya 2016). Applicability was of low concern for all studies in the reference standard(s) domain.

For flow and timing assessment (QUADAS‐2, domain 4), 11 studies were at low risk of bias (de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Koppel 2003; Meberg 2008; Oakley 2015; Ozalkaya 2016; Richmond 2002; Riede 2010; Singh 2014; Zhao 2014), and the remaining studies were at unclear risk because information reported was insufficient to permit judgment (Arlettaz 2006; Bakr 2005; Bhola 2014; Jones 2016; Klausner 2017; Rosati 2005; Sendelbach 2008; Turska 2012; Van Niekerk 2016; Zuppa 2015).

Findings

Results of meta‐analysis

We considered for primary analysis all studies with thresholds around 95% (< 95% and ≤ 95%). As expected, this was the most common threshold among included studies (n = 19 studies; 436,758 participants) (Arlettaz 2006; Bhola 2014; de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Jones 2016; Klausner 2017; Koppel 2003; Meberg 2008; Oakley 2015; Ozalkaya 2016; Richmond 2002; Riede 2010; Rosati 2005; Singh 2014; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015). The overall sensitivity of pulse oximetry for detection of critical congenital heart defects was 76.3% (95% confidence interval [CI] 69.5 to 82.0). Specificity was 99.9% (95% CI 99.7 to 99.9) with a false‐positive rate of 0.14% (95% CI 0.07 to 0.22) (summary of findings Table). Summary positive and negative likelihood ratios were 535.6 (95% CI 280.3 to 1023.4) and 0.24 (95% CI 0.18 to 0.31), respectively.

Fourteen out of 19 studies used a threshold lower than 95% (Arlettaz 2006; Bhola 2014; de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Klausner 2017; Meberg 2008; Oakley 2015; Richmond 2002; Singh 2014; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015), and five studies used a threshold lower than or equal to 95% (Jones 2016; Koppel 2003; Ozalkaya 2016; Riede 2010; Rosati 2005).

Two additional studies used different thresholds: One used a threshold lower than or equal to 94% with sensitivity and specificity of 100% (95% CI 29 to 100) and 100% (95% CI 100 to 100), respectively (Bakr 2005); the other used a threshold of less than 96% with sensitivity and specificity of 100% (95% CI 3 to 100) and 100% (95% CI 100 to 100), respectively (Sendelbach 2008).

Overall, we have included in this review 349 cases of CCHD. The median prevalence of CCHD was 0.6 per 1000 live births (range 0.1 to 3.7; interquartile range 0.4 to 1.2).
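To make the likelihood ratios more concrete, the following Python sketch converts a pretest probability into a post‐test probability through odds. It takes the median prevalence (0.6 per 1000) and the summary likelihood ratios (535.6 and 0.24) reported above as inputs; the function names are illustrative, and using the median prevalence as the pretest probability is an assumption made for the example.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a probability with a likelihood ratio by working on the odds scale."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Inputs taken from the text: median prevalence and summary likelihood ratios.
prevalence = 0.6 / 1000
after_positive = post_test_probability(prevalence, 535.6)
after_negative = post_test_probability(prevalence, 0.24)

print(f"Probability of CCHD after a positive screen: {after_positive:.1%}")
print(f"Probability of CCHD after a negative screen: {after_negative:.3%}")
```

Under these assumptions, a positive screen raises the probability of CCHD from 0.06% to about 24%, and a negative screen lowers it to roughly 0.014%. Recomputing the likelihood ratios from the rounded summary sensitivity and specificity with `likelihood_ratios` will not reproduce 535.6 exactly, because the published ratios were derived from unrounded model estimates.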

Investigations of heterogeneity

To visualize total variability in sensitivity and specificity, we present the data in forest and ROC scatter plots (Figure 4; Figure 5). Forest plots show studies in increasing order of specificity (Figure 4). Sensitivity of the 21 studies ranged from 0% to 100%, and specificity from 99% to 100%. Forest and ROC plots show greater variability in estimated sensitivity than specificity across studies. Given results from investigations of heterogeneity, we used the bivariate model to estimate summary sensitivity and specificity (summary points) instead of the hierarchical summary ROC model to estimate summary ROC curves.


Forest plot of sensitivity and specificity. The figure shows the estimated sensitivity and specificity of the study (blue square) and its 95% confidence interval (black horizontal line). Studies are ordered by ascending specificity.


Summary ROC plot for pulse oximetry using a threshold of less than 95% or less than or equal to 95% (n = 19 studies). The solid circle corresponds to the summary estimate of sensitivity and specificity, and is shown with a 95% prediction region (dashed line).

For the primary analysis, we measured total between‐study variability in sensitivity and in specificity through variances of the random effects for logit(sensitivity), logit(specificity), and their covariance, which were 0.102, 2.001, and ‐0.340, respectively. We represented the summary operating point with a 95% prediction region (Figure 5) and explored heterogeneity by differentiating studies on the basis of antenatal screening for CHD, timing of testing, type of oxygen saturation, study design, and risk of bias for the "flow and timing" domain of QUADAS‐2. We plotted subgroups of studies in the ROC space.
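The variance components reported above are on the logit scale, which makes them hard to read directly. The Python sketch below back‐transforms them under simplifying assumptions: it derives the correlation between logit(sensitivity) and logit(specificity) from the reported covariance, and an approximate range within which 95% of study‐level sensitivities would fall if the summary sensitivity of 76.3% were known exactly. It ignores uncertainty in the summary estimate and the joint distribution with specificity, so it is not a reconstruction of the prediction region in Figure 5. The variable names are illustrative.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def expit(x: float) -> float:
    return 1 / (1 + math.exp(-x))


# Random-effects variance components reported in the text (logit scale).
var_logit_sensitivity = 0.102
var_logit_specificity = 2.001
covariance = -0.340

# Correlation between the two random effects.
correlation = covariance / math.sqrt(var_logit_sensitivity * var_logit_specificity)

# Approximate marginal range for study-level sensitivity around the summary point.
summary_sensitivity = 0.763
half_width = 1.96 * math.sqrt(var_logit_sensitivity)
sensitivity_low = expit(logit(summary_sensitivity) - half_width)
sensitivity_high = expit(logit(summary_sensitivity) + half_width)

print(f"correlation: {correlation:.2f}")
print(f"approximate range of study-level sensitivity: "
      f"{sensitivity_low:.1%} to {sensitivity_high:.1%}")
```

A negative correlation of this kind indicates that studies with higher sensitivity tended to have lower specificity, a pattern often associated with differences in effective threshold between studies.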

Subgroup analysis and meta‐regression

Table 2 summarizes results of the subgroup analysis including sensitivity and false‐positive rates.

Table 2. Subgroup analysis

Subgroup | N | Sensitivity (95% CI) | Relative sensitivity P value | False‐positive rate (FPR) (95% CI) | Relative FPR P value
--- | --- | --- | --- | --- | ---
Antenatal diagnosis: Included | 4 | 86.3% (71.8 to 94.0) | 0.071 | 0.46% (0.13 to 1.59) | 0.231
Antenatal diagnosis: Excluded | 15 | 74.1% (65.7 to 81.1) | | 0.10% (0.05 to 0.21) |
Test timing: Longer than 24 hours | 11 | 73.6% (62.8 to 82.1) | 0.393 | 0.06% (0.03 to 0.13) | 0.027
Test timing: Within 24 hours | 8 | 79.5% (70.0 to 86.6) | | 0.42% (0.20 to 0.89) |
Limb: Foot only | 11 | 81.2% (70.9 to 88.4) | 0.197 | 0.13% (0.05 to 0.31) | 0.718
Limb: Foot and right hand | 8 | 71.2% (58.5 to 81.3) | | 0.17% (0.06 to 0.46) |
Risk of bias ("flow and timing"): Unclear risk of bias | 9 | 77.8% (64.1 to 87.3) | 0.937 | 0.05% (0.02 to 0.12) | 0.016
Risk of bias ("flow and timing"): Low risk of bias | 10 | 77.3% (68.8 to 84.0) | | 0.34% (0.17 to 0.66) |

Antenatal diagnosis

Four studies included newborn infants with an antenatal suspicion of congenital heart defects (Arlettaz 2006; Bhola 2014; Ewer 2011; Richmond 2002), and 15 studies excluded them (de‐Wahl Granelli 2009; Gomez‐Rodriguez 2015; Jones 2016; Klausner 2017; Koppel 2003; Meberg 2008; Oakley 2015; Ozalkaya 2016; Riede 2010; Rosati 2005; Singh 2014; Turska 2012; Van Niekerk 2016; Zhao 2014; Zuppa 2015). Summary estimates of sensitivity were 86.3% (95% CI 71.8 to 94.0) for studies that included these infants, and 74.1% (95% CI 65.7 to 81.1) for studies that excluded them. Summary estimates of specificity were 99.5% (95% CI 98.4 to 99.9) with a false‐positive rate of 0.46% (95% CI 0.13 to 1.59) for studies that included these infants, and 99.9% (95% CI 99.8 to 100) with a false‐positive rate of 0.10% (95% CI 0.05 to 0.21) for studies that excluded them. Sensitivity (P = 0.071) and specificity (P = 0.231) did not change significantly when newborn infants with antenatal suspicion of congenital heart defects were included compared with when they were excluded.

Test timing

Eleven studies performed pulse oximetry screening after 24 hours from birth (Bhola 2014; de‐Wahl Granelli 2009; Klausner 2017; Koppel 2003; Oakley 2015; Ozalkaya 2016; Riede 2010; Rosati 2005; Van Niekerk 2016; Zhao 2014; Zuppa 2015), and the other eight studies performed pulse oximetry within 24 hours of birth (Arlettaz 2006; Ewer 2011; Gomez‐Rodriguez 2015; Jones 2016; Meberg 2008; Richmond 2002; Singh 2014; Turska 2012). Summary estimates of sensitivity and specificity of studies that performed screening after 24 hours were 73.6% (95% CI 62.8 to 82.1) and 99.9% (95% CI 99.9 to 100). For studies that performed screening within 24 hours, summary estimates of sensitivity and specificity were 79.5% (95% CI 70.0 to 86.6) and 99.6% (95% CI 99.1 to 99.8). Test timing to perform pulse oximetry had no significant effect on sensitivity (P = 0.393), but the false‐positive rate for detection of CCHD was lower when newborn pulse oximetry was done after 24 hours from birth than when it was done within 24 hours (0.06% [95% CI 0.03 to 0.13] vs 0.42% [95% CI 0.20 to 0.89]; P = 0.027).

Limbs

Eleven studies used the foot alone (post‐ductal) to measure oxygen saturation (Arlettaz 2006; Bhola 2014; Gomez‐Rodriguez 2015; Koppel 2003; Meberg 2008; Oakley 2015; Richmond 2002; Riede 2010; Rosati 2005; Turska 2012; Zuppa 2015); summary estimates of sensitivity and specificity were 81.2% (95% CI 70.9 to 88.4) and 99.9% (95% CI 99.7 to 100), respectively, with a false‐positive rate of 0.13% (95% CI 0.05 to 0.31). Eight studies used both right hand and foot (pre‐ductal and post‐ductal) (de‐Wahl Granelli 2009; Ewer 2011; Jones 2016; Klausner 2017; Ozalkaya 2016; Singh 2014; Van Niekerk 2016; Zhao 2014); summary estimates of sensitivity and specificity for this group of studies were 71.2% (95% CI 58.5 to 81.3) and 99.8% (95% CI 99.5 to 99.9), respectively, with a false‐positive rate of 0.17% (95% CI 0.06 to 0.46). We noted no significant differences in sensitivity (P = 0.197) nor in specificity (P = 0.718) for pulse oximetry when measures were obtained in the foot alone rather than in both the foot and the right hand.

Risk of bias

We judged nine studies as having unclear risk of bias for the "flow and timing" domain of QUADAS‐2 (Arlettaz 2006; Bhola 2014; Klausner 2017; Koppel 2003; Riede 2010; Rosati 2005; Turska 2012; Van Niekerk 2016; Zuppa 2015). Summary estimates of sensitivity and specificity were 77.8% (95% CI 64.1 to 87.3) and 100% (95% CI 99.9 to 100), respectively, with a false‐positive rate of 0.05% (95% CI 0.02 to 0.12). We judged the remaining 10 studies as having low risk of bias (de‐Wahl Granelli 2009; Ewer 2011; Gomez‐Rodriguez 2015; Jones 2016; Meberg 2008; Oakley 2015; Ozalkaya 2016; Richmond 2002; Singh 2014; Zhao 2014); summary estimates of sensitivity and specificity were 77.3% (95% CI 68.8 to 84.0) and 99.7% (95% CI 99.3 to 99.8), respectively, with a false‐positive rate of 0.34% (95% CI 0.17 to 0.66). Risk of bias for this domain had no significant effect on sensitivity (P = 0.937), but studies judged as having unclear risk of bias for the "flow and timing" domain had higher specificity (P = 0.016).

Sensitivity analysis

We performed a sensitivity analysis while excluding from the primary analysis studies with a threshold ≤ 95% (Jones 2016; Koppel 2003; Ozalkaya 2016; Riede 2010; Rosati 2005). For this analysis, sensitivity and specificity were 78.1% (95% CI 71.2 to 83.7) and 99.8% (95% CI 99.6 to 99.9) with a false‐positive rate of 0.23% (95% CI 0.12 to 0.44). Exclusion of these studies increased the sensitivity and false‐positive rate of pulse oximetry screening.

We also performed a sensitivity analysis for which we added to the primary analysis studies with a threshold ≤ 94% and < 96% (Bakr 2005; Sendelbach 2008). For this analysis, sensitivity and specificity were 77% (95% CI 70 to 82) and 100% (95% CI 100 to 100), respectively. Inclusion of these studies produced a slight improvement in the sensitivity of the test.

Furthermore, we investigated the effects of potential sources of bias by removing the four studies judged as having high risk of bias in one of the QUADAS‐2 domains (Arlettaz 2006; Oakley 2015; Van Niekerk 2016; Zuppa 2015). For this analysis, sensitivity and specificity were similar to those in the main analysis, at 75.5% (95% CI 68.2 to 81.6) and 99.79% (95% CI 99.7 to 99.9), respectively.

Discussion

Summary of main results

For this review, we have identified and summarized the results of all available cohort studies reporting the test accuracy of pulse oximetry screening for detection of critical congenital heart defects (CCHDs) in asymptomatic late preterm and full‐term infants in postnatal wards or well‐baby nurseries. We have presented the main results in summary of findings Table. We analyzed data on 457,202 participants from 21 included studies. We restricted the primary analysis to studies with thresholds around 95% (< 95% and ≤ 95%). Analysis, including 436,758 participants from 19 studies, showed that pulse oximetry screening is a highly specific screening test with moderate sensitivity and a low overall false‐positive rate. Overall sensitivity was 76.3%, specificity was 99.9%, and the false‐positive rate was 0.14%. Summary positive and negative likelihood ratios were 535.6 and 0.24, respectively. Inclusion of studies that used different saturation thresholds from those in the primary analysis slightly improved the sensitivity of the test. Exclusion of studies at high risk of bias did not significantly alter overall sensitivity or specificity. Between‐study heterogeneity was higher in sensitivity than in specificity estimates.

Most studies were conducted in high‐income countries (USA, Europe); however, we also included studies from middle‐income countries, which increases the generalizability of review findings. We noted methodological variation between studies with respect to inclusion or exclusion of babies with a suspected antenatal diagnosis, timing of testing (before or after 24 hours of age), site of testing (post‐ductal only or pre‐ductal and post‐ductal), functional or fractional saturation measurement, and study design (prospective or retrospective). Subgroup analysis showed no effect on sensitivity or specificity among these variables, although later screening was associated with a lower false‐positive rate than was reported with earlier screening.

The definition of CCHD provided in the published literature is highly variable. We attempted to address this by applying a strict definition (see above) to categorize CCHD in a standardized manner, thus reducing the risk of an incorrect diagnosis.

Strengths and weaknesses of the review

Strengths of this review include a comprehensive literature search performed to identify all relevant studies, rigorous assessment of risk of bias of included studies using the QUADAS‐2 tool, duplicate data extraction, and performance of subgroup and sensitivity analyses to investigate differences in estimates of accuracy of pulse oximetry among studies with high, low, or unclear risk of bias. However, only one study included more than 100 CCHD cases, and 12 studies included fewer than 10 cases. The relatively low number of CCHD cases included in this review indicates that the precision of sensitivity is still low.

Our review has explored and quantified the heterogeneity, and review authors have tried to identify possible sources of heterogeneity. Exploration of sources of heterogeneity has produced different results for sensitivity and specificity. Sensitivity has not been affected by any of the a priori selected sources of heterogeneity. We cannot rule out the presence of unexplained heterogeneity in this accuracy index, although it is highly likely that some of the variability observed in sensitivities of individual studies could be explained by the paucity of CCHD cases. Use of different strategies for confirming pulse oximetry negative cases (ie, passively with mortality or registry data rather than active clinical follow‐up) could well have introduced some degree of heterogeneity into sensitivity results. However, this post hoc exploration was not performed, given the scarcity of data. This means that sensitivity estimates are somewhat unstable with wide confidence intervals. At the same time, this scarcity made analysis of heterogeneity underpowered. Conversely, specificity was affected by the timing of the test and by the risk of bias due to the flow and timing domain of the QUADAS‐2 tool. Statistical significance achieved by the specificity analysis is a direct consequence of the large number of healthy newborns included in the review. On the other hand, the magnitude of differences between subgroup analyses was small. False‐positive rates were 0.06% and 0.42% for newborns screened after and before 24 hours of birth, respectively. The absolute difference was 0.36% with more false‐positives in the earlier screening group as compared with the late screening group. This means, in relative terms, seven times more false positives are seen in the earlier screening group than in the late screening group. 
Similarly, false‐positive rates varied between studies judged as having unclear or low risk of bias for the "flow and timing" domain of QUADAS‐2 (ie, 0.05% vs 0.34% for unclear and low risk groups of studies, respectively). The absolute difference was a 0.29% reduction in false positives in the unclear risk group, which equates to almost seven times fewer false positives in relative terms.
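The absolute and relative differences quoted above can be checked with a short calculation. This is a minimal sketch that uses only the false‐positive rates reported in the text; the function name `compare_fpr` is hypothetical.

```python
def compare_fpr(fpr_higher, fpr_lower):
    """Return (absolute difference, ratio) for two false-positive rates given in %."""
    return fpr_higher - fpr_lower, fpr_higher / fpr_lower


# Test timing: within 24 hours (0.42%) vs longer than 24 hours (0.06%)
timing_abs, timing_ratio = compare_fpr(0.42, 0.06)

# "Flow and timing" domain: low risk (0.34%) vs unclear risk (0.05%)
bias_abs, bias_ratio = compare_fpr(0.34, 0.05)

print(f"Timing: difference {timing_abs:.2f} percentage points, ratio {timing_ratio:.1f}")
print(f"Risk of bias: difference {bias_abs:.2f} percentage points, ratio {bias_ratio:.1f}")
```

An absolute difference of 0.36 percentage points corresponds to about 36 additional false positives per 10,000 newborns screened.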

Agreements and disagreements with other studies or reviews

This review includes more studies and a larger body of data from a significantly greater number of infants than were included in similar previous systematic reviews of the test accuracy of pulse oximetry screening to detect CCHD (Mahle 2009; Thangaratinam 2007; Thangaratinam 2012), which reported identical statistical methods and meta‐analyses. Compared with the largest prior review (Thangaratinam 2012), authors of this review screened a significantly larger number of references (2695 vs 552) and included data from over 220,000 more babies, allowing greater precision of the estimates of test accuracy, and providing the most complete meta‐analysis available so far.

Overall sensitivity is similar to that described by Thangaratinam 2012 (76.3% vs 76.5%), as is the overall false‐positive rate (0.14% vs 0.14%). The statistically significantly lower false‐positive rate with later screening compared with early screening persists (0.06% vs 0.42% in this review and 0.05% vs 0.5% in Thangaratinam 2012).

Applicability of findings to the review question

This review includes a large number of relevant studies that met our inclusion criteria, and review authors had few concerns about the relevance of their findings to our review questions. We mainly judged included studies to be at low or unclear risk of bias in QUADAS‐2 domains. Most studies had a prospective design with consecutive enrollment of participants and included an adequate description of the index test. Some studies reported the exclusion criteria poorly. Data were complete and were available for all included studies.

Risk of differential verification bias was unavoidable as diagnosis was established by echocardiography in test‐positive cases; however, test‐negative cases were usually confirmed by clinical follow‐up or by examination of congenital malformation registries and mortality databases; risk of bias in the conduct or interpretation of reference standard(s) was unclear in most studies that used incomplete reference standards. This of course raises the possibility that some of the false negatives may be misclassified as true negatives. This misclassification overestimates sensitivity and specificity. Differential verification may have had an impact on the sensitivity estimate. For this reason, along with the potential for imprecision, given the small number of CCHD cases, we have downgraded the GRADE certainty of evidence for sensitivity to "low." In our review, studies judged as having unclear risk of bias for the “flow and timing” domain showed higher specificity.
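The direction of this bias can be shown with a small worked example. The counts below are hypothetical and chosen only for illustration. The sketch assumes that a missed false negative is simply counted among the true negatives and therefore drops out of the sensitivity denominator.

```python
def observed_sensitivity(tp, fn, missed_fn=0):
    """Sensitivity when `missed_fn` false negatives are wrongly counted as true negatives."""
    if not 0 <= missed_fn <= fn:
        raise ValueError("missed_fn must be between 0 and fn")
    return tp / (tp + fn - missed_fn)


# Hypothetical cohort: 100 CCHD cases, 76 detected by pulse oximetry
true_value = observed_sensitivity(tp=76, fn=24)
# Incomplete follow-up fails to identify 10 of the 24 false negatives
biased_value = observed_sensitivity(tp=76, fn=24, missed_fn=10)

print(f"True sensitivity: {true_value:.1%}")
print(f"Observed sensitivity: {biased_value:.1%}")
```

Because healthy newborns vastly outnumber CCHD cases, adding a handful of misclassified cases to the true negatives changes specificity only negligibly, whereas the effect on sensitivity can be substantial.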

Figure 1. Flow of studies through the screening process. CCHD: critical congenital heart defect.

Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.

Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.

Figure 4. Forest plot of sensitivity and specificity. The figure shows the estimated sensitivity and specificity of each study (blue square) and its 95% confidence interval (black horizontal line). Studies are ordered by ascending specificity.

Figure 5. Summary ROC plot for pulse oximetry using a threshold of < 95% or ≤ 95% (n = 19 studies). The solid circle corresponds to the summary estimate of sensitivity and specificity, and is shown with a 95% prediction region (dashed line).

Test 1. All studies.

Test 2. Primary analysis (threshold < 95% or ≤ 95%).

Should pulse oximetry be used to diagnose CCHD in asymptomatic newborns?

Patient or population: asymptomatic newborns at the time of pulse oximetry screening

Setting: hospital births

Index test: pulse oximetry

Reference test: Reference standards were both diagnostic echocardiography (echocardiogram) and clinical follow‐up in the first 28 days of life, including postmortem findings and mortality and congenital anomaly databases to identify false‐negative patients.

Studies: We included prospective or retrospective cohorts and cross‐sectional studies. We excluded case reports and studies of case‐control design.

| Threshold | Summary accuracy (95% CI) | Number of participants (diseased/non‐diseased) | Number of studies | Prevalence median (range) |
|---|---|---|---|---|
| < 95% or ≤ 95% | Sensitivity 76.3% (69.5 to 82.0); specificity 99.9% (99.7 to 99.9) | 436,758 (345/436,413) | 19 studies | 0.6 per 1000 (0.1 to 3.7) |

Implications (in a cohort of 10,000 newborns tested [95% CI])

| Outcome | Prevalence 0.6 per 1000 | Prevalence 0.1 per 1000 | Prevalence 3.7 per 1000 | Certainty of the evidence (GRADE) |
|---|---|---|---|---|
| True positives (newborns with CCHD) | 5 (4 to 5) | 1 (1 to 1) | 28 (26 to 30) | LOW* ⊕⊕⊝⊝ |
| False negatives (newborns incorrectly classified as not having CCHD) | 1 (1 to 2) | 0 (0 to 0) | 9 (7 to 11) | LOW* ⊕⊕⊝⊝ |
| True negatives (newborns without CCHD) | 9980 (9966 to 9987) | 9985 (9971 to 9992) | 9949 (9935 to 9956) | HIGH ⊕⊕⊕⊕ |
| False positives (newborns incorrectly classified as having CCHD) | 14 (7 to 28) | 14 (7 to 28) | 14 (7 to 28) | HIGH ⊕⊕⊕⊕ |

CCHD: critical congenital heart defect; CI: confidence interval.

Sensitivity: *We have downgraded certainty of the evidence from high to low because of the low number of CCHD cases included in the review (serious imprecision) and because there was a serious risk of differential verification bias (ie, diagnosis was established by echocardiography in test‐positive cases, whereas test‐negative cases were usually confirmed by clinical follow‐up or by accessing congenital malformation registries and mortality databases).

Certainty of the evidence (Balshem 2011)
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
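The "Implications" figures in the table can be approximately reproduced from the summary estimates. The sketch below assumes the pooled sensitivity of 76.3% and the overall false‐positive rate of 0.14% quoted in the discussion (specificity is reported in the table only as a rounded 99.9%). The function name `cohort_outcomes` and the rounding to whole newborns are illustrative choices, not the review authors' method.

```python
def cohort_outcomes(prevalence_per_1000, sensitivity=0.763, fpr=0.0014, cohort=10_000):
    """Expected screening outcomes in a cohort, rounded to whole newborns."""
    diseased = cohort * prevalence_per_1000 / 1000
    healthy = cohort - diseased
    tp = sensitivity * diseased   # CCHD cases detected
    fp = fpr * healthy            # healthy newborns who test positive
    return {
        "TP": round(tp),
        "FN": round(diseased - tp),
        "TN": round(healthy - fp),
        "FP": round(fp),
    }


# Median, minimum, and maximum prevalence reported in the table
for prevalence in (0.6, 0.1, 3.7):
    print(prevalence, cohort_outcomes(prevalence))
```

The false‐positive count stays at about 14 per 10,000 across the three prevalence scenarios because nearly all screened newborns are healthy whatever the prevalence.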

Table 1. Main studies characteristics

| Study | Population: antenatal diagnosis of CHD | Index test: pulse oximeter | Index test: limb | Index test: test timing | Index test: oxygen saturation | Index test: threshold | Reference standard: positive pulse oximetry | Reference standard: negative pulse oximetry |
|---|---|---|---|---|---|---|---|---|
| Arlettaz 2006 | included | Nellcor NPB‐40 | post‐ductal | within 24 hours | functional | < 95% | echocardiography | NA |
| Bakr 2005 | excluded | Digioxi PO 920 | pre‐ductal and post‐ductal | longer than 24 hours | fractional | ≤ 94% | echocardiography | cardiology database |
| Bhola 2014 | included | Masimo Radical 5 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | cardiology database |
| De‐Wahl 2009 | excluded | Radical SET v4 | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | mortality data |
| Ewer 2011 | included | Radical‐7 | pre‐ductal and post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up, cardiology database & congenital registry |
| Gomez‐Rodriguez 2015 | excluded | Radical‐5 | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up |
| Jones 2016 | excluded | NA | pre‐ductal and post‐ductal | within 24 hours | NA | ≤ 95% | echocardiography | National Congenital Heart Disease Audit |
| Klausner 2017 | excluded | NA | pre‐ductal and post‐ductal | longer than 24 hours | NA | < 95% | echocardiography | clinical follow‐up |
| Koppel 2003 | excluded | Ohmeda Medical | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | clinical follow‐up & congenital registry |
| Meberg 2008 | excluded | RAD‐5v | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up |
| Oakley 2015 | excluded | Nellcor NPB 40 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | cardiology database & mortality data |
| Ozalkaya 2016 | excluded | Nellcor | pre‐ductal and post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | echocardiography |
| Richmond 2002 | included | Oxi machine | post‐ductal | within 24 hours | fractional | < 95% | echocardiography | mortality data & congenital registry |
| Riede 2010 | excluded | NA | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | congenital registry |
| Rosati 2005 | excluded | NA | post‐ductal | longer than 24 hours | functional | ≤ 95% | echocardiography | clinical follow‐up |
| Sendelbach 2008 | excluded | Nellcor N‐395 | post‐ductal | within 24 hours | functional | < 96% | echocardiography | clinical follow‐up |
| Singh 2014 | excluded | NA | pre‐ductal and post‐ductal | within 24 hours | functional | < 95% | echocardiography | mortality data & congenital registry & cardiology database |
| Turska 2012 | excluded | Novametrix, Nellcor & Masimo | post‐ductal | within 24 hours | functional | < 95% | echocardiography | clinical follow‐up and Public Health registries |
| Van Niekerk 2016 | excluded | Nellcor | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | NA |
| Zhao 2014 | excluded | RAD‐5V | pre‐ductal and post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | clinical follow‐up |
| Zuppa 2015 | excluded | Ohmeda 3900 | post‐ductal | longer than 24 hours | functional | < 95% | echocardiography | NA |

NA: not available

Table 2. Subgroup analysis

| Subgroup | N | Sensitivity (95% CI) | Relative sensitivity P value | False‐positive rate (FPR) (95% CI) | Relative FPR P value |
|---|---|---|---|---|---|
| Antenatal diagnosis: included | 4 | 86.3% (71.8 to 94.0) | 0.071 | 0.46% (0.13 to 1.59) | 0.231 |
| Antenatal diagnosis: excluded | 15 | 74.1% (65.7 to 81.1) | | 0.10% (0.05 to 0.21) | |
| Test timing: longer than 24 hours | 11 | 73.6% (62.8 to 82.1) | 0.393 | 0.06% (0.03 to 0.13) | 0.027 |
| Test timing: within 24 hours | 8 | 79.5% (70.0 to 86.6) | | 0.42% (0.20 to 0.89) | |
| Limb: foot only | 11 | 81.2% (70.9 to 88.4) | 0.197 | 0.13% (0.05 to 0.31) | 0.718 |
| Limb: foot and right hand | 8 | 71.2% (58.5 to 81.3) | | 0.17% (0.06 to 0.46) | |
| Risk of bias ("flow and timing"): unclear risk of bias | 9 | 77.8% (64.1 to 87.3) | 0.937 | 0.05% (0.02 to 0.12) | 0.016 |
| Risk of bias ("flow and timing"): low risk of bias | 10 | 77.3% (68.8 to 84.0) | | 0.34% (0.17 to 0.66) | |

Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
|---|---|---|
| 1 All studies | 21 | 457,202 |
| 2 Primary analysis (threshold < 95% or ≤ 95%) | 19 | 436,758 |
