Tests for detecting strabismus in children aged 1 to 6 years in the community

Background

Strabismus (misalignment of the eyes) is a risk factor for impaired visual development, affecting both visual acuity and stereopsis. Non‐expert examiners in the community can detect strabismus using several different index tests: direct measures of misalignment (corneal or fundus reflex tests) or indirect measures such as stereopsis and visual acuity. The reference test used by trained professionals to detect strabismus is the cover‒uncover test.

Objectives

To assess and compare the accuracy of tests, alone or in combination, for detection of strabismus in children aged 1 to 6 years, in a community setting by non‐expert screeners or primary care professionals, to inform healthcare commissioners setting up childhood screening programmes.

The secondary objective was to investigate sources of heterogeneity of diagnostic accuracy.

Search methods

We searched the Cochrane Central Register of Controlled Trials (CENTRAL; 2016, Issue 12) (which contains the Cochrane Eyes and Vision Trials Register) in the Cochrane Library, the Health Technology Assessment Database (HTAD) in the Cochrane Library (2016, Issue 4), MEDLINE Ovid (1946 to 5 January 2017), Embase Ovid (1947 to 5 January 2017), CINAHL (January 1937 to 5 January 2017), Web of Science Conference Proceedings Citation Index‐Science (CPCI‐S) (January 1990 to 5 January 2017), BIOSIS Previews (January 1969 to 5 January 2017), MEDION (to 18 August 2014), the Aggressive Research Intelligence Facility (ARIF) database (to 5 January 2017), the ISRCTN registry (www.isrctn.com/editAdvancedSearch; searched 5 January 2017), ClinicalTrials.gov (www.clinicaltrials.gov; searched 5 January 2017) and the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (www.who.int/ictrp/search/en; searched 5 January 2017). We did not use any date or language restrictions in the electronic searches for trials. In addition, we searched orthoptic journals and conference proceedings that are not electronically listed.

Selection criteria

We included all prospective or retrospective population‐based test accuracy studies of consecutive participants. Studies compared index tests, alone or in combination, with the reference test. We included only studies with sufficient data for analysis, specifically to calculate sensitivity and specificity and so determine diagnostic accuracy.

Participants were aged 1 to 6 years. Studies reporting data on participants outside this age range were included if subgroup data were available.

Permitted settings were population‐based vision screening programmes and screening programmes of the kind carried out in schools.

Data collection and analysis

We used standard methodological procedures expected by Cochrane. In brief, two review authors independently assessed titles and abstracts for eligibility and extracted the data, with a third (senior) author resolving any disagreement. We analysed the data primarily for specificity and sensitivity.

Main results

One study, from a total of 1236 papers, abstracts and trials, was eligible for inclusion. It had a total of 335 participants, of whom 271 completed both the screening test and the reference test. The screening test, which used an automated photoscreener, had a sensitivity of 0.46 (95% confidence interval (CI) 0.19 to 0.75) and a specificity of 0.97 (95% CI 0.94 to 0.99). The overall number affected by strabismus was low (13 = 4.8%).

Authors' conclusions

There are very limited data in the literature to establish the accuracy of tests for detecting strabismus in the community when performed by non‐expert screeners. A large prospective study comparing methods would be required to determine which tests are the most accurate.

Tests for detecting strabismus in children aged 1 to 6 years in the community

Review aim
The aim of this Cochrane Review was to find out how effective different tests are at detecting strabismus in children aged 1 to 6 years, outside ophthalmology services. These tests were used in the community and performed by examiners who were not eye specialists.

Background
Strabismus (also known as a squint) occurs when the eyes are not aligned. It can lead to reduced vision and to the eyes not working together, including loss of 3D vision. A range of different tests can be used to detect strabismus directly, by measuring the misalignment, or indirectly, by measuring the level of vision in each eye (visual acuity) or by measuring three‐dimensional vision (stereopsis). It is not known which of these tests is the most accurate at correctly identifying children with strabismus.

Results and conclusion
Only one study met the standards for inclusion in this review. This study used a photoscreener (a type of camera that measures refractive errors and misalignment). After screening, all children were offered an examination by an ophthalmologist to confirm which of them had strabismus. The photoscreener was very accurate at identifying children without strabismus (high specificity), but not at correctly identifying those with strabismus (low sensitivity).

As only one study could be included in this review, it was not possible to conclude which test is the most accurate for strabismus screening. Further studies would be needed to determine this, and they would have to include large numbers of children to allow statistically valid conclusions.

How up to date is this review?
Cochrane researchers searched for studies published up to 5 January 2017.

Authors' conclusions

Implications for practice

Identifying strabismus as part of a screening programme is most important if it impacts on visual acuity (leading to amblyopia) or stereopsis. Therefore, screening in the community does not need to test directly for strabismus by assessing ocular misalignment, although other studies suggest that doing so increases sensitivity. There is a lack of evidence on which tests are most accurate in detecting strabismus specifically in a normal population being screened by non‐expert screeners.

Implications for research

Cochrane Reviews of the accuracy of screening tests to detect anisometropia and amblyopia would complement the evidence review on screening strategies. Given the prevalence of amblyopia and amblyogenic risk factors, primary vision and strabismus screening studies would require large numbers of children to be screened. Such studies may be cost‐effective if run alongside existing vision screening programmes. As visual acuity alone may not be sufficiently sensitive to detect strabismus, addition of autorefractor, stereoacuity, corneal light reflex testing or novel devices should be considered (VIP 2007). Although the sensitivity of screening tests should be around 80%, specificity may not need to be as high, because further assessment for amblyopia is non‐invasive and does not carry a risk of harm.

Summary of findings

Summary of findings 1. Summary of findings table

Accuracy of a photoscreener to detect strabismus in the community

Patient/population: children aged 1 to 6 years old

Setting: school

Index test: plusoptiX S04 photoscreener

Target condition: constant and intermittent manifest strabismus

Reference standard: cover test at distance and near

| Number of studies | Number of participants | Number affected by target condition | Sensitivity of test (95% CI) | Specificity of test (95% CI) | Risk of bias based on QUADAS‐2 domains | Comments |
|---|---|---|---|---|---|---|
| 1 (Arthur 2009) | 271 | 13 | 0.46 (0.19 to 0.75) | 0.97 (0.94 to 0.99) | Unclear risk | Low participation rate of 25% |

CI: confidence interval
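
To illustrate how the sensitivity estimate in this table can be reproduced, the following Python sketch computes a proportion with an exact (Clopper-Pearson) binomial confidence interval. Two inputs are assumptions rather than statements from the review: the count of 6 detected cases out of 13 (inferred because it is the only whole number that rounds to 0.46) and the choice of the Clopper-Pearson method, which the review does not name. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05, tol=1e-9):
    """Clopper-Pearson (exact binomial) interval for k successes in n trials."""

    def solve(cdf_at, target):
        # The binomial CDF decreases as p increases, so bisect on p in [0, 1].
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid  # p is still too small
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


affected = 13  # children with strabismus on the reference test (from the table)
detected = 6   # ASSUMPTION: inferred from the reported 0.46, not stated in the review

sensitivity = detected / affected
low, high = exact_ci(detected, affected)
print(round(sensitivity, 2), round(low, 2), round(high, 2))
```

Under these assumptions the sketch should print 0.46 0.19 0.75, in line with the tabulated sensitivity. The width of the interval reflects how few affected children (13) the estimate rests on.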

Summary of findings 2. Data extraction from included studies

Study ID

Arthur 2009

Clinical features and settings

Previous testing and results: unknown.
Setting: elementary school.

Referral route/selection: all who were screened offered gold standard examination.

Participants

Sample size: 1343 children were invited to the study (consent forms were sent out, which may have introduced selection bias because concerned parents are more likely to return them) and 387 consents were returned. Of these, 45 were excluded because consent arrived too late, 7 were excluded for document errors, 28 were absent on the day of screening and 1 was uncooperative, leaving 306 screened. Of those screened, 275 had the gold standard examination (14 declined, 11 were unable to attend within the time frame, 6 were uncontactable), and 271 had data interpretable for both index and reference tests (3 photographs unusable, 1 did not complete the examination).

Socio‐demographic items: 98% 4 to 5 years of age, gender and ethnicity not given, no ocular abnormalities (i.e. media opacities, which would affect test results/technical failure rates). Geographic region: Limestone school district, Ontario, Canada.

Study design

Selection: all patients with data available on both index and reference tests as single group.
Enrolment: consecutive series, enrolled by post in combination with dental screening programme.
Identification: prospective.
If more than one test: one test.

Target condition

Constant and intermittent manifest strabismus (esotropia, exotropia, vertical tropia, microtropia), prevalence of the target condition in the sample: 13 (of 271).

Reference standard

Test definition and description: monocular visual acuity with occlusion glasses (crowded Keeler logMAR letter matching test/Crowded Kay pictures/Cardiff cards), cover test at distance and near, ocular movements and convergence, binocular single vision assessment (20D base‐out prism test and/or stereopsis) and red reflex test.

Standards: discharged if VA 0.2 logMAR or better, binocular single vision at distance and near and no suspected ocular pathology, 6‐ to 12‐week review and re‐check if borderline, cycloplegic refraction/dilated examination all others.
Test operator(s): optometrist or orthoptist or ophthalmologist.
Timing of reference standard: separate visit to hospital but timing unknown.

Index tests

plusoptiX S04 photoscreener, co‐axial camera, handheld at 1 m.
Criteria for positive test result: eye alignment > 10 degrees from centre (manually flagged as abnormal), anisometropia > 1D, astigmatism > 1.25D, myopia > 3D, hyperopia > 3.5D, anisocoria > 1 mm.
Details of test operators: certified dental assistants after 3 hours of training.
Timing: 5 to 10 seconds image acquisition time repeated if necessary.
Manufacturer: Plusoptix GmbH.
Technical characteristics: 3rd generation, infrared, coaxial video camera, portable, handheld, non‐contact.

Follow‐up

How many participants were lost to follow‐up: 31.
How many have missing or uninterpretable test results: 4.
Adverse events noted that could be caused by the test: 0.

Notes

Sources of funding: none declared.

Anything else of relevance: low participation rate (25%).
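
The participant counts extracted above can be checked for internal consistency with a few lines of arithmetic. The Python sketch below is only an illustration: the numbers are taken from the Participants and Follow‐up entries, while the variable names and the grouping into dictionaries are hypothetical.

```python
# Counts reported for Arthur 2009 in the data extraction table.
consents_returned = 387

excluded_before_screening = {
    "consent too late": 45,
    "document errors": 7,
    "absent on day of screening": 28,
    "uncooperative": 1,
}
lost_to_follow_up = {
    "declined": 14,
    "unable to attend within time frame": 11,
    "uncontactable": 6,
}
uninterpretable = {
    "photographs unusable": 3,
    "did not complete exam": 1,
}

# Each stage is the previous stage minus the exclusions listed for it.
screened = consents_returned - sum(excluded_before_screening.values())
examined = screened - sum(lost_to_follow_up.values())
analysed = examined - sum(uninterpretable.values())

print(screened, examined, analysed)
print(sum(lost_to_follow_up.values()), sum(uninterpretable.values()))
```

The three stage totals match the 306 screened, 275 examined and 271 analysed children reported in the table, and the two sums match the 31 lost to follow‐up and 4 missing or uninterpretable results.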

Background

Target condition being diagnosed

Strabismus is a physical condition in which the eyes are not aligned. It is associated with deficient binocularity, the mechanism that integrates visual information from both eyes. Strabismus can be primary, or it can be a consequence of poor vision in one eye or of refractive errors. Less commonly, strabismus can be caused by lesions affecting the oculomotor, trochlear or abducens nerve, or higher neurological pathways. Strabismus is rarely caused by developmental or traumatic defects of the extraocular muscles. Strabismus is a risk factor for the development of amblyopia during the 'sensitive period' of vision development. During this period, neural plasticity is greatest, and it begins to decline around the age of 6 years; clinical interventions are typically offered to children up to the age of 10 years. Screening programmes therefore attempt to identify children with amblyogenic risk factors before the age of 6 years, to allow remedial treatment.

Prevalence figures for strabismus vary. The most recent screening study in Baltimore, USA, found a prevalence of manifest strabismus of 3.3% in Caucasian and 2.1% in African American children aged 6 to 71 months (Friedman 2009). Other population‐based studies have reported a prevalence of childhood strabismus between 0.01% and 3.1%, indicating that prevalence may vary greatly by ethnicity, age, type of strabismus and definitions used (Graham 1974; Matsuo 2007a; Matsuo 2007b; Preslan 1996; Traboulsi 2008; Turacli 1995; Wedner 2000; Appendix 1).

Relevance of strabismus in children

There are many subtypes of strabismus. In the context of childhood vision screening programmes, the most relevant distinction is between manifest and latent strabismus. Manifest strabismus is a risk factor for the development of amblyopia, the commonest vision disorder in children (prevalence 1.6% to 3.6% in Western societies) (Simons 1996a).

Amblyopia is a developmental anomaly of spatial vision, usually associated with strabismus, anisometropia or visual deprivation early in life (Ciuffreda 1991). Amblyopes have reduced visual acuity in one or both eyes, reduced contrast sensitivity and reduced contour integration. Clinical definitions of amblyopia are based on visual acuity only, taking into consideration the age of the child and progressive improvement of 'normal visual acuity' in the early years. Unilateral amblyopia is often defined as an interocular difference in best‐corrected visual acuity (of 2 logMAR or Snellen chart lines) (Friedman 2009), or best‐corrected visual acuity of 0.30 logMAR or worse in either eye (Rahi 2002; Traboulsi 2008). In 3‐year‐old children, bilateral amblyopia is suspected if best‐corrected visual acuity is worse than 0.40 logMAR in one eye and worse than 0.3 logMAR in the other eye in the presence of a bilateral amblyogenic risk factor. In 4‐year‐old children, the thresholds are 0.3 and 0.18 logMAR, respectively (Schmidt 2004).
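
The age‐specific thresholds for suspected bilateral amblyopia can be written as a small lookup. The Python sketch below is a minimal illustration, not a clinical tool: the function and variable names are hypothetical, only the two ages quoted from Schmidt 2004 are covered, and it assumes the usual logMAR convention that a higher value means worse acuity.

```python
# Thresholds quoted above (Schmidt 2004): age in years -> (worse eye, better eye),
# in logMAR. Acuity must be WORSE (numerically higher) than both limits.
BILATERAL_THRESHOLDS = {
    3: (0.40, 0.30),
    4: (0.30, 0.18),
}


def bilateral_amblyopia_suspected(age_years, va_right, va_left, bilateral_risk_factor):
    """Apply the quoted thresholds to best-corrected acuities given in logMAR."""
    if age_years not in BILATERAL_THRESHOLDS:
        # The passage gives no thresholds for other ages, so none are guessed here.
        raise ValueError("no threshold quoted for this age")
    if not bilateral_risk_factor:
        return False
    worse_limit, better_limit = BILATERAL_THRESHOLDS[age_years]
    worse_eye = max(va_right, va_left)
    better_eye = min(va_right, va_left)
    return worse_eye > worse_limit and better_eye > better_limit


print(bilateral_amblyopia_suspected(3, 0.5, 0.4, True))  # both eyes beyond the limits
print(bilateral_amblyopia_suspected(3, 0.5, 0.3, True))  # better eye not worse than 0.3
```

Keeping the thresholds in a table keyed by age separates the quoted values from the comparison logic, so a different published definition could be swapped in without changing the function.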

Strabismus has a profound effect on stereopsis or perception of depth. Stereopsis normally develops within the first 3 to 4 months of age and reaches adult levels by the age of 24 to 36 months (Braddick 1980; Fawcett 2005; Fox 1980; Petrig 1981; Takai 2005). Two studies reported that stereoacuity continues to develop beyond the age of 3 years, and may not yet be fully mature at 5 years or 12 years of age, respectively (Simons 1981a; Walraven 1993). Normal adult stereopsis is 50 to 60 seconds of arc; some childhood vision screening programmes have used a threshold of 400 seconds of arc for "suspicion of amblyopia" (Traboulsi 2008). Reduced stereopsis adversely affects motor skills, particularly fine motor skills (Grant 2007; Hrisos 2006; O'Connor 2010; Webber 2008).

Significant misalignment can affect development (through unilateral reduced acuity, lack of depth perception and limitation of peripheral visual field), social interactions, and emotional well‐being. In children with infantile esotropia, surgical correction of strabismus leads to improvement in general development as measured by the Bayley scale (Rogers 1982). Scores on anxiety and depression scales such as the National Eye Institute Visual Functioning Questionnaire and the Hospital Anxiety and Depression Scale differ significantly from those of non‐strabismic children, and improve following surgical strabismus correction (Bernfeld 1982; Chai 2009). Children with strabismus may have significantly greater conduct and externalising problems (Koklanis 2006).

Strabismus can also be an indicator of severe eye and health problems. As it can indicate poor vision, it may in rare cases be the first sign of childhood cataract, glaucoma, or tumours of the eye, optic nerve, orbit or brain, such as retinoblastoma, glioma, or rhabdomyosarcoma.

Gross misalignment of the eyes is usually noticed by members of the family or carers. Small angles of deviation are not necessarily apparent. In young children, features such as a broad nasal bridge or certain lid positions and shape (epicanthus) can give rise to pseudostrabismus, i.e. a perception of strabismus when in fact the eyes are straight. 

Diagnosis of strabismus: the cover test

The cover test is based on the observation of a refixation movement of a deviated eye when the fixing eye is covered (Gamble 1950; McKean 1976; Romano 1971; Scott 1973). The basic form of the cover test, the cover‒uncover test, establishes the diagnosis of manifest strabismus. An occluder is introduced in front of one eye, then removed, re‐establishing binocular viewing. If an eye moves when the other is covered, this indicates that this eye was not fixing before the cover was introduced. Any eye movement is interpreted as 'test positive' and 'manifest strabismus present'; the magnitude of the movement is often categorised as small, moderate or large. This test is used in some screening programmes to detect strabismus (Fogt 2000). The accuracy of the cover test in detecting strabismus may be affected by the child's age at screening, with better test performance in children over the age of 3 years (Williams 2001).

Variations of the cover test are used to diagnose latent strabismus, and to measure the magnitude of both manifest and latent strabismus. The presence of latent strabismus is assessed by using the alternate cover test. The occluder is moved from eye to eye, allowing viewing of the target with one eye only, without permitting binocular viewing. The observer notes refixation movements of either eye as the cover is removed.

Quantitative measurements are obtained by neutralising the strabismus with prisms held in front of one eye whilst performing the cover‒uncover test (simultaneous prism cover test) or the alternate cover test (prism alternate cover test); the endpoint of measurements is the prism with which no refixation movement is observed when the cover is removed. To trained professionals (orthoptists) refixation on cover test can indicate strabismus; however, this has not been used in published screening studies.

All cover tests are carried out with the participant fixing on a target presented at distances of 6, 4 or 3 metres, and then at near distances (33 cm or 40 cm). In children, the distance target is often presented at 3 metres. In very young children the test is often only carried out at near fixation.

The cover‒uncover test aims to detect strabismus, but not refractive errors, the other significant group of amblyogenic risk factors. Its accuracy as a standalone amblyopia screening test is therefore limited (Schmidt 2004). Conversely, addition of the cover‒uncover test to vision screening tests increases the detection rate of strabismus (VIP 2007). Vision screening programmes for children between 4 and 6 years traditionally use optotype testing to determine visual acuity (matching or naming letters or pictures), with or without a cover test to detect strabismus. In an effort to screen younger children to identify and treat problems early, these 'manual' screening programmes are increasingly supplemented or even replaced by the use of devices such as photorefractors, which also aim to provide information about refractive amblyogenic risk factors. The American Association of Pediatric Ophthalmology and Strabismus (AAPOS) recently published updated recommendations for automated screening programmes (Donahue 2013). Screening methods were categorised into refractive and non‐refractive screening instruments. With regard to detection of strabismus, the AAPOS recommends that non‐refractive screening devices should detect manifest strabismus greater than 8 prism dioptres (PD) in primary position (Donahue 2013). UK recommendations suggest that screening at age 4 to 5 years provides the greatest accuracy and allows adequate time to treat (Solebo 2015).

Index test(s)

Different tests are in use to detect strabismus in a community or primary care setting by non‐expert screeners or primary care professionals.

  • Type 1 tests directly identify ocular misalignment, for example corneal (Hirschberg) or fundus reflections tests (Brückner).

  • Type 2 tests assess binocular function such as stereoacuity, e.g. contour and random dot stereotests from which presence of strabismus is deduced.

  • Type 3 tests are designed to detect reduced central vision/visual acuity which may in turn be associated with strabismus, e.g. HOTV, LEA symbols, Keeler (previously Glasgow) crowded logMAR, Sonksen crowded logMAR, crowded Kay picture test, displayed as paper‐ or computer‐based charts or conventional retroilluminated charts.

  • Type 4 tests are those automated refraction devices which are also designed to report ocular misalignment.

We planned to include studies that report combinations of several index tests.

Other tests for strabismus, such as controlled binocular acuity test, suppression tests, blur test, and tests designed to detect reduced fusional reserve (prism reflex test, prism fusion range) are not used by lay screeners, but only by trained professionals (orthoptists). Orthoptist‐delivered screening is not within the remit of this review.

Principles underlying each type of index test

Detailed information about each index test is given in Appendix 2.

Type 1: tests which directly identify ocular misalignment

In manifest strabismus with childhood onset, information from the deviating eye is suppressed, so that a person does not perceive double vision. The principle of the cover test is that when the fixating eye is covered, the deviating eye will move to primary position (looking straight ahead position) to take up fixation, as long as it has some vision and does not have eccentric fixation or severe eye movement deficit. Presence, speed and magnitude of this refixation movement are the outcomes of the cover test.

Type 2: tests of binocular function ‒ stereopsis

The visual axes need to be within a certain angle to each other in order to detect information that is presented in stereotests. Strabismus may be associated with reduced stereopsis.

Type 3: tests designed to detect reduced central vision/visual acuity

Though not a specific indicator, visual acuity tests may indicate the presence of strabismus‐induced amblyopia.

Type 4: automated refraction devices designed to report ocular misalignment

Some autorefractors indicate asymmetry of corneal reflections.

Clinical pathway

Childhood vision screening programmes vary around the world, and may differ between high‐ and low‐ to middle‐income countries.

World Health Organization (WHO) Member States are grouped into low‐ and middle‐income countries (LMIC) by WHO region, separating out high‐income countries within each of these regions, based on the World Bank's gross national income per capita (World Bank; World Health Organization 2014). Low‐ and middle‐income states have a gross national income per capita of less than USD 12,276 whereas high‐income states have a gross national income per capita of USD 12,276 or more.

High‐income countries often have national guidelines for screening, though these are not necessarily matched by an established national screening programme. For example in Israel and Sweden, there are established national screening programmes (Schmucker 2009). In the UK, the National Screening Committee recommends vision screening of children during the school year of their fifth birthday, to be delivered under the supervision of orthoptists with the focus on screening for visual impairment as the target condition, not other risk factors for amblyopia (Hall 2003; UK National Screening Committee). Abnormal screening results trigger referral for a comprehensive eye examination. Despite this recommendation, implementation of childhood vision screening continues to show regional variation. In the USA, Canada, Belgium and Germany there are no national screening programmes although regional programmes do exist. There are, however, guidelines in the USA which recommend vision and alignment screening between the age of 3 and 3.5 years by a suitably trained individual (American Academy of Ophthalmology 2012). In Canada, there are national guidelines for screening visual acuity and ocular alignment at 3 to 5 years of age, but no established screening programme (Canadian Paediatric Society 2009). In many countries office‐based paediatricians, ophthalmologists and optometrists offer annual 'child health' or 'eye health' checks, respectively, but these occur outside national programmes.

In low‐ and middle‐income countries, little information on national screening programmes is available in the literature or online (World Health Organization 2014). One exception is Iran, where a national screening programme of 3 to 6 year olds performed by kindergarten teachers assessing visual acuity with illiterate 'E' Snellen charts has been in place since 1996 with an estimated uptake of 67% of eligible children in 2005 (Khandekar 2009). In India there is no national screening (Jose 2009).

There are efforts to find cost‐effective strategies for screening in developing countries such as a remote photoscreener system piloted in Brazil and China, and a home‐based screening programme in China performed by parents (Donahue 2008; Lan 2012).

Rationale

Strabismus is a risk factor for the development of amblyopia. Whilst large deviations may be detected by family, friends or lay screeners, small deviations may go unnoticed, leading to suppression of visual information from the deviated eye. Childhood vision screening programmes use varying combinations of tests, depending on the age at which children are tested, and the type of professionals carrying out tests. Strabismus tests as part of a combination of tests may increase the precision of childhood amblyopia screening (VIP 2007). Published vision screening studies often lack specific information on strabismus detection, instead reporting overall precision in detecting amblyopia.

This review does not propose screening for strabismus that is of no aesthetic concern or visual consequence, but it is important to summarise current evidence on accuracy of tests in detecting strabismus in a screening setting, in order to enable healthcare commissioners to implement the most effective screening programme.

Objectives

To assess and compare the accuracy of tests, alone or in combination, for detection of strabismus in children aged 1 to 6 years, in a community setting by non‐expert screeners or primary care professionals to inform healthcare commissioners setting up childhood screening programmes.

Secondary objectives

Other objectives were to investigate sources of heterogeneity of diagnostic accuracy, including:

  • age;

  • setting;

  • type of professionals performing the test;

  • study design;

  • study size (< 100 vs. ≥ 100 participants, which may reflect the adoption of different sampling strategies);

  • variation in the way a test is carried out;

  • type of strabismus (convergent vs. divergent, horizontal/vertical);

  • severity of strabismus (amount of misalignment, constant/intermittent/latent).

Methods

Criteria for considering studies for this review

Types of studies

We have included all prospective or retrospective population‐based test accuracy studies of consecutive participants. By 'population‐based' we mean not only screening studies, implying sampling based on census, but also studies recruiting from community services such as schools or paediatric health districts. Hospital cohorts were excluded, unless the sampling from a community service was clearly described.

We have included studies that compare a single index test, or a combination of index tests, with a reference standard (cover test, performed as a standalone test or as part of a comprehensive eye examination). Case‐control studies, in which children are selected based on their disease status, have been excluded unless they are nested in large prospective consecutive studies. Studies had to provide sufficient data to calculate diagnostic accuracy (sensitivity, specificity). We planned to include studies in which only a subgroup of participants had undergone the reference test; the results from these were to be considered in subgroup analysis.

Participants

We included children aged 1 to 6 years old. Strabismus is a risk factor for the development of amblyopia during the 'sensitive period' of vision development. During this period, neural plasticity is greatest, and it begins to decline around the age of 6 years; clinical interventions are typically offered to children up to the age of 10 years. Screening programmes therefore attempt to identify children with amblyogenic risk factors before the age of 6 years, to allow remedial treatment. We set the lower limit of the age range at 1 year to avoid overlap with early postnatal eye screening programmes.

In countries where children start school in the academic year of their fifth birthday, screening programmes aim to capture children aged 4 to 5 years, i.e. during their first year at school. In other countries, the year of school entry can be earlier or later, and vision screening programmes may be carried out in the first year of school, or independent from schools. An age range of 1 to 6 years allows inclusion of all population‐based studies in children at risk of developing amblyopia from strabismus.

When studies included children outside the range of 1 to 6 years, we tried to obtain subgroup data. If we could not obtain subgroup data, we excluded these studies. We had intended to include such studies if the proportion of children older than 6 years was below an agreed threshold (e.g. 20%), and we would then have conducted sensitivity or subgroup analyses as appropriate.

We considered children attending population‐based vision screening programmes. We included opportunistic screening programmes, such as those for children attending schools. We excluded orthoptist‐delivered programmes, as these include the reference standard.

Index tests

We included any test used by lay screeners to detect strabismus, either directly by identifying misalignment, or indirectly by identifying a consequence of strabismus such as loss of stereovision. The participant age range means that different tests may be used, as appropriate for the age of participants in each particular study. We describe the index tests by test type rather than enumerating them individually.

  • Type 1: tests which directly identify ocular misalignment:

    • corneal reflections tests (Hirschberg);

    • fundus reflections test (Brückner).

  • Type 2: tests of binocular function: stereoacuity:

    • e.g. contour and random dot stereotests.

  • Type 3: tests designed to detect reduced central vision/visual acuity:

    • e.g. HOTV, LEA symbols, Keeler (previously Glasgow) crowded logMAR, Sonksen crowded logMAR, crowded Kay picture test, displayed as paper‐ or computer‐based charts or conventional retroilluminated charts.

  • Type 4: automated refraction devices designed to report ocular misalignment.

If the search had revealed several high‐quality studies for each test type for inclusion in this review, we would have considered splitting the review by test type group.

Finally, we did not consider tests that require specialist skills, such as the 4‐dioptre prism test, since we are concerned with population screening which is typically carried out by non‐expert professionals, not by orthoptists, optometrists or ophthalmologists, who would directly use our reference standard, the cover‒uncover test.

Target conditions

The target condition is constant or intermittent manifest strabismus of any magnitude and type (esotropia, microtropia, exotropia, hyper/hypotropia).

Reference standards

The reference standard considered in this review was the cover‒uncover test, whether used alone or within a comprehensive ophthalmic examination, or in combination with other tests, by trained personnel.

We included studies that used the cover‒uncover test, regardless of the type of professional performing the test. We recorded the type of professional (ophthalmologist, orthoptist, optometrist, trained technician, non‐expert screener) and planned to analyse it in subgroups.

We included studies in which the cover‒uncover or alternate cover test was used as part of a comprehensive eye examination, which often also includes visual acuity, biomicroscopy and refraction. In this scenario there is a risk of incorporation bias, which can be avoided by ensuring that the tests forming part of the reference standard 'comprehensive eye examination' do not belong to the same test type as the index test included in that analysis. We excluded the whole study if it assessed a single index test affected by incorporation bias, and we excluded part of the study data if a study comparing several index tests suffered from incorporation bias for a specific test.

Search methods for identification of studies

Electronic searches

The Cochrane Eyes and Vision Information Specialist searched the following electronic databases. There were no language or publication year restrictions.

Searching other resources

We used the weblink pcwww.liv.ac.uk/˜rowef/index_files/Page646.htm to search the following orthoptic journals and conference proceedings which are not electronically listed: British and Irish Orthoptic Journal, American Orthoptic Journal, Australian Orthoptic Journal, European Strabismus Association, International Strabismus Association and the International Orthoptic Congress. We contacted study authors for further clarification when required.

Data collection and analysis

Selection of studies

Two review authors (SH, VT) independently assessed the titles and abstracts for eligibility. We sorted the abstracts into 'definitely exclude' and 'possibly include' categories, recognising that sometimes it is not possible to judge from the abstract whether a reference fulfils the criteria or not. We placed all abstracts selected by at least one review author in the 'possibly include' category. We resolved disagreements at each step by discussion between the two review authors and a third senior author (AD‐N).

Data extraction and management

We extracted the number of:

  • true positives (TP), i.e. participants categorised as strabismic by both the reference and index test;

  • false negatives (FN), i.e. participants categorised as strabismic by the reference test, but as non‐strabismic by the index test;

  • true negatives (TN), i.e. participants categorised as non‐strabismic by both the reference and index tests;

  • false positives (FP), i.e. participants categorised as non‐strabismic by the reference test, but as strabismic by the index test;

  • participants with uninterpretable index test results;

  • missing data, i.e. participants included in the study, but not in the analyses, by causes of exclusion.

Uninterpretable test results at individual participant level were recorded in primary publications when a child did not comply with a test, i.e. refused to give an answer during assessment of visual acuity or stereovision, or did not fixate on targets for automated devices, or in case of ocular abnormalities affecting the clarity of cornea, lens or vitreous, or a combination of the three. For each study, we recorded how such cases were treated in the analyses.

Two review authors independently extracted the data to ensure consistency and entered these into Cochrane's statistical software, Review Manager 5 (RevMan 5) (Review Manager 2014). We have extracted the data shown in Table 1, which we have displayed in the Characteristics of included studies tables.

Table 1. Data extraction from included studies

Study ID

First author, year of publication

Clinical features and settings

Previous testing and results.
Setting: community/school/clinic (office) setting.

Referral route/selection.

Participants

Sample size.

Socio‐demographic items: age, gender, ethnicity, frequency of ocular abnormalities (i.e. media opacities, which would affect test results/technical failure rates), geographic region.

Study design

Selection: as single group/as separate group with/without target condition.
Enrolment: consecutive series.
Identification: prospective/retrospective.
If more than one test: how were tests allocated to individuals, did each individual receive all tests?

Target condition

Constant and intermittent manifest strabismus (esotropia, exotropia, vertical tropia, microtropia), including the prevalence of the target condition in the sample.

Reference standard

Test definition and description, i.e. cover test; 'comprehensive eye examination' (visual acuity, cover test, cycloplegic refraction).
Test operator(s).
Timing of reference standard.

Index tests

Test definition and description.
Criteria for positive test result.
Details of test operators.
Timing.
Manufacturer.
Technical characteristics.

Follow‐up

How many participants were lost to follow‐up: unknown.
How many have missing or uninterpretable test results: unknown.
Adverse events noted that could be caused by the test: none reported.

Notes

Sources of funding.

Abbreviations.

Anything else of relevance.

Assessment of methodological quality

We used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)‐2 tool to evaluate the risk of bias and applicability of primary studies (www.bris.ac.uk/quadas/quadas‐2). QUADAS‐2 consists of four key domains: patient selection, index test, reference standard, and flow and timing. The tool is completed in four phases.

  1. The review question is stated.

  2. Review‐specific guidance is developed.

  3. The published flow diagram for the primary study is reviewed, or a flow diagram is constructed if none is reported.

  4. Bias and applicability are judged.

Each domain is assessed in terms of the risk of bias and the first three are also assessed in terms of concerns regarding applicability. To help reach a judgement on the risk of bias, signalling questions are included. These flag aspects of study design related to the potential for bias and aim to help review authors make risk of bias judgements. Two review authors independently assessed the methodological quality of the included studies. A third senior author resolved disagreements on study quality. Table 2 shows the guidance the review authors used when judging the methodological quality of studies.

Table 2. QUADAS‐2 assessment guidance

Domain

Yes

No

Unclear

PATIENT SELECTION

Describe methods of patient selection: Describe included patients (prior testing, presentation, intended use of index test and setting)

Was a consecutive or random sample of patients enrolled?

Consecutive sampling or random sampling of children according to inclusion criteria.

Non‐random sampling or sampling based on volunteering or referral.

Unclear whether consecutive or random sampling used.

Was a case‐control design avoided?

Yes for all studies since case‐control studies are excluded unless nested in cohort studies.

N/A

N/A

Did the study avoid inappropriate exclusions?

Exclusions are detailed and felt to be appropriate (systemic disease causing strabismus).

Children with known strabismus can be excluded.

Inappropriate exclusions are reported e.g. of children in whom strabismus has been suspected in primary care but not confirmed by trained professionals.

Exclusions are not detailed (pending contact with study authors).

Risk of bias: could the selection of patients have introduced bias?

‐ 

 ‐

 ‐

Concerns regarding applicability: are there concerns that the included patients do not match the review question?

Inclusion of children in community settings, such as school or screening settings, with no previous diagnosis of any eye disease.

Inclusion of children over the age of 6 years, referred to clinical settings, referred to eye professionals for suspect eye disease, or assessed in commercial settings on a volunteer basis; or previous diagnosis of failed screening test or strabismus.

Unclear inclusion criteria.

INDEX TEST 

Describe the index test and how it was conducted and interpreted

Were the index test results interpreted without knowledge of the results of the reference standard?

A statement that the index test was performed "blinded" or "independently and without knowledge of" reference standard results is sufficient, and full details of the blinding procedure are not required; or there is a clear temporal pattern to the order of testing that precludes the need for formal blinding.

Reference standard results available to those who conducted or interpreted the index tests.

Unclear whether results are interpreted independently.

If a threshold was used, was it pre‐specified?

Many included index tests are based on continuous measures (e.g. eye deviation, stereopsis, refractive error, visual acuity); the study authors declare that the selected cut‐off used to dichotomise data was specified a priori, or a protocol is available with this information.

A study is classified at higher risk of bias if the authors define the optimal cut‐off post hoc based on their own study data.

No information on pre‐selection of index test cut‐off values.

Risk of bias: could the conduct or interpretation of the index test have introduced bias?

‐ 

 ‐

‐ 

Concerns regarding applicability: are there concerns that the index test, its conduct, or interpretation differ from the review question?

Tests used and testing procedure clearly reported and tests executed by personnel with sufficient training.

Tests used are not validated or study personnel is insufficiently trained.

Unclear tests (e.g. stereopsis‐based tests but does not mention if a validated test is used) or unclear study personnel profile, background and training.

REFERENCE STANDARD

Describe the reference standard and how it was conducted and interpreted

Is the reference standard likely to correctly classify the target condition?

Cover‒uncover test performed by trained professionals, e.g. ophthalmologists, optometrists, orthoptists.

Complete eye examination with cover‒uncover test used as reference standard but not only the cover‒uncover test used to judge on strabismus (e.g. visual acuity measure also used).

Complete eye examination used but unclear whether cover‒uncover test used.

Were the reference standard results interpreted without knowledge of the results of the index test?

A statement that the reference standard was performed "blinded" or "independently and without knowledge of" index test results is sufficient, and full details of the blinding procedure are not required; or there is a clear temporal pattern to the order of testing that precludes the need for formal blinding.

Index test results available to those who conducted the reference standard; or the index test is part of the reference standard (e.g. visual acuity within a complete ophthalmic examination used as reference standard and visual acuity is also the index test analysed ‒ this will be specific to each analysis).

Unclear whether results are interpreted independently.

Risk of bias: could the reference standard, its conduct, or its interpretation have introduced bias?

 

 

 

Concerns regarding applicability: are there concerns that the target condition as defined by the reference standard does not match the review question?

Cover‒uncover test used and testing procedure executed by personnel with sufficient training.

Cover‒uncover test used by personnel with inappropriate profile or insufficient training.

Unclear study personnel profile, background and training.

FLOW AND TIMING

Describe any patients who did not receive the index test(s) and/or reference standard or who were excluded from the 2×2 table (refer to flow diagram): describe the time interval and any interventions between index test(s) and reference standard

Was there an appropriate interval between index test(s) and reference standard?

No more than three months between index and reference test execution, and no corrective intervention between assessments.

More than three months between index and reference test execution.

Unclear whether test results are executed within three months.

Did all patients receive a reference standard?

All children receiving the index test are verified with the reference standard.

The verification rate of index test‐positive children is definitely higher than that of negative children (the opposite is unlikely).

Unclear whether all children receiving the index test are verified with the reference standard.

Did all patients receive the same reference standard?

All children are verified with the cover‒uncover test by trained professionals.

Some children, i.e. positive children, are verified with the cover‒uncover test by specialised personnel, while the others are verified by personnel with lower level of training.

Unclear whether all children are verified with the cover‒uncover test by trained professionals.

Were all patients included in the analysis?

The number of children included in the study matches the number in analyses, and children with undefined or borderline test results are not excluded. However, children in whom one or more index tests are not performed because they are poorly cooperative can be excluded.

The number of children included in the study does not match the number in analyses and children with undefined or borderline test results are excluded from the analyses.

The number of children analysed is reported, but not the number included in the study; or it is unclear whether there were inappropriate exclusions.

Risk of bias: could the patient flow have introduced bias? 

 ‐

 ‐

 ‐

COMPARATIVE STUDIES (MULTIPLE INDEX TESTS)

Were all tests performed on all patients, or randomly assigned?

All children received all index tests, or tests were randomly assigned.

Not all children received all index tests and the assignment criterion was opportunistic or non‐random (e.g. depending on test availability or type of professional).

Not all children received all index tests and the assignment criterion was unclear.

Could the order in which the index tests were used affect the target condition or the interpretation of the alternative tests? 

The order of presentation of the index test was random or alternate to avoid fatigue effects; or clear that no fatigue effect can arise.

Several tests are delivered in a fixed order which can cause children to be less compliant with the second or later test.

Unclear order of test presentation.

We scored the risk of bias signalling questions as 'yes/no/unclear' as detailed in Table 2. Risk of bias was judged as 'low', 'high' or 'unclear'. When we answered 'yes' to all signalling questions for a domain, we could judge the risk of bias as 'low'. If we answered any question as 'no', this flagged the potential for bias. When this occurred, we followed the guidelines developed in phase 2 of the quality assessment process to judge the risk of bias. We used the 'unclear' category only when insufficient data were reported to permit a judgement.
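The decision rule in this paragraph can be sketched as a small function. This is only an illustration: the function name and the 'needs judgement' label are hypothetical, and the final step after a 'no' answer remains a reviewer judgement guided by Table 2, which the code does not attempt to automate.

```python
def flag_domain(answers):
    """Map signalling-question answers for one QUADAS-2 domain to a provisional flag.

    answers: iterable of 'yes' / 'no' / 'unclear' strings (hypothetical encoding).
    Returns 'low' when every answer is 'yes', 'needs judgement' when any
    answer is 'no' (the reviewers then apply their phase 2 guidance), and
    'unclear' otherwise.
    """
    normalised = [a.strip().lower() for a in answers]
    allowed = {"yes", "no", "unclear"}
    if not normalised or any(a not in allowed for a in normalised):
        raise ValueError("answers must be a non-empty list of yes/no/unclear")
    if all(a == "yes" for a in normalised):
        return "low"
    if "no" in normalised:
        # A 'no' only flags potential bias; it does not by itself mean 'high'.
        return "needs judgement"
    return "unclear"


# Example: a domain with one unreported item and no 'no' answers.
patient_selection = flag_domain(["unclear", "yes", "yes"])
```

Under this sketch a domain is never rated 'high' automatically, which mirrors the text: a 'no' answer triggers the review‐specific guidance rather than a fixed rating.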

We judged applicability of primary studies to the review question in a similar manner.

We also recorded study sponsorship.

Statistical analysis and data synthesis

We used two‐by‐two data of index and reference test results to calculate the sensitivities and specificities, with their 95% confidence intervals. We used the RevMan 5 software for descriptive analyses, and plotted individual studies in forest plots.
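As an illustration of this calculation, the sketch below computes sensitivity and specificity with exact binomial (Clopper-Pearson) 95% confidence intervals from a two‐by‐two table, using only the Python standard library. The choice of the exact interval is an assumption made for this sketch, not a statement about how RevMan 5 computes its intervals, and the function names are hypothetical. The counts in the demonstration are those that can be derived from the Findings for Arthur 2009 (6 true positives, 7 false negatives, 8 false positives, 250 true negatives).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, lo=0.0, hi=1.0, iters=80):
    """Root of an increasing function g on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


def accuracy(tp, fn, tn, fp):
    """Sensitivity and specificity, each as (estimate, lower, upper)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *exact_ci(tp, tp + fn)),
        "specificity": (spec, *exact_ci(tn, tn + fp)),
    }


result = accuracy(tp=6, fn=7, tn=250, fp=8)
for name, (est, lo, hi) in result.items():
    print(f"{name}: {est:.2f} ({lo:.2f} to {hi:.2f})")
```

With only 13 children with strabismus in the denominator for sensitivity, the interval is necessarily wide, which is why a single small study cannot support firm conclusions about sensitivity.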

Considering test threshold across different test types was the most important analytic issue in this review. We planned to use a continuous output measure for most tests: ocular misalignment as prism dioptres (PD) or degrees for test type 1; stereoscopic acuity as seconds of arc for test type 2; visual acuity in logMAR for test type 3 (acknowledging that comparison of values may be hampered by use of charts with different optotype size steps, and that simple mathematical conversion from Snellen to logMAR may be inaccurate); and millimetres or a ratio for test type 4. Other tests listed in Appendix 2 and used in the diagnosis of, but not in screening for, strabismus are not based on an explicit common measure. However, in practice the heterogeneous execution and technical characteristics of the tests made it difficult to consider using an explicit threshold in statistical analyses, and implicit threshold effects are more likely.

Analyses within each test type

We intended to analyse different tests within each test type group, using the following strategy. For each study, we intended to extract data at specific thresholds if available. We attempted to extract cut‐offs of 8 PD for horizontal and 1 PD for vertical deviations in test type 1; 400 arc seconds for test type 2; and visual acuity 0.2 logMAR for test type 3. UK screening recommendations specify "less than 0.2 logMAR" as the referral threshold (UK National Screening Committee); guidelines from the AAPOS specify that optotype‐based screening (which covers test type 3) should detect visual acuity of less than 0.176 logMAR (Snellen 20/30) at all ages. Threshold values for test type 4 have not been published; we therefore used "any asymmetry, in millimetres or as a ratio" as the threshold. Thresholds are summarised in Table 3.
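The equivalence between Snellen 20/30 and 0.176 logMAR quoted above follows from the definition of logMAR as the base‐10 logarithm of the minimum angle of resolution, i.e. the Snellen denominator divided by the numerator. A minimal sketch follows; the function name is hypothetical, and, as the review itself cautions, this simple mathematical conversion may be inaccurate when charts use different optotype size steps.

```python
from math import log10


def snellen_to_logmar(numerator, denominator):
    """Convert a Snellen fraction (e.g. 20/30 or 6/9) to logMAR.

    logMAR = log10(denominator / numerator); higher values mean worse acuity.
    """
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen numerator and denominator must be positive")
    return log10(denominator / numerator)


# The AAPOS threshold quoted in the text: Snellen 20/30.
aapos_threshold = snellen_to_logmar(20, 30)
print(f"20/30 = {aapos_threshold:.3f} logMAR")
```

The same function gives 0 logMAR for 6/6 (or 20/20) and 1.0 logMAR for 20/200.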

Table 3. Thresholds for analysis

Test type categories

Tests included

Output measure

Threshold to extract data

1) Tests which identify ocular misalignment

1.1) Corneal reflections tests: Hirschberg, Krimsky (prism reflection test).
1.2) Fundus reflections test: Brückner.

Prism dioptres (PD).

8 PD for horizontal deviations; 1 PD for vertical deviations (no published threshold identified).

2) Test of binocular function: stereopsis

Stereoacuity tests such as contour and random dot stereotests.

Seconds of arc.

400 seconds of arc.

3) Tests designed to detect reduced central vision

3.1) Visual acuity tests, e.g. HOTV, LEA symbols, Keeler (previously Glasgow) crowded logMAR, Sonksen crowded logMAR, crowded Kay picture test.

3.2) Suppression tests.
3.3) Blur test.

LogMAR or logMAR equivalent.

0.2 logMAR.

4) Automated refraction devices designed to report ocular misalignment

Millimetres of asymmetry of corneal reflections, or a ratio.

No published threshold identified.

Investigations of heterogeneity

The framework for likely sources of heterogeneity was described previously and mainly includes setting and study population, particularly regarding referral method and inclusion criteria; type of professional executing the reference standard; and study quality assessment.

We planned to investigate heterogeneity in the first instance through visual examination of forest plots of sensitivities and specificities and through visual examination of the receiver operating characteristic (ROC) plot of the raw data. However, we had insufficient data to investigate these secondary objectives.

Sensitivity analyses

Where appropriate (i.e. if not already explored in our analyses of heterogeneity) and if sufficient data were available, we planned to explore the sensitivity of any summary accuracy estimates to aspects of study quality such as nature of masking and type of reference standard, guided by the anchoring statements developed in our QUADAS‐2 exercise. 

Assessment of reporting bias

We did not assess publication bias since there is no standard method to achieve this in diagnostic test accuracy reviews (Deeks 2005). For selective outcome reporting issues, such as the use of a specific cut‐off of ocular misalignment, we did not search for a protocol to assess within‐study reporting bias, since protocols of diagnostic accuracy studies are not routinely reported.

Results

Results of the search

The searches yielded a total of 2327 records (Figure 1). After de‐duplication, two authors (SH, VT) independently screened 1236 records and excluded 1129 that did not meet the inclusion criteria (including age range, examiner type and primary use of the cover test), were not relevant or lacked results. The remaining 107 studies underwent full‐text review, with disagreements resolved by a third author (AD‐N). We contacted the authors of six posters to ascertain relevant publications and data; three replied, and the one publication identified did not meet the inclusion criteria because no strabismus outcomes were reported (Shallo‐Hoffman 2004). In addition, we contacted the authors of three published studies for which full analysis of the results required additional data (Enzenauer 2000; Robinson 1999; Tung 2006); two of these authors responded, but further data were unavailable (summarised in the Characteristics of excluded studies table). One study met the inclusion criteria and had sufficient data for analysis (Arthur 2009).


Figure 1. Study flow diagram.

Methodological quality of included studies

Arthur 2009 was a prospective study performed in a community setting, with all eligible children invited for screening and all screened participants offered a gold standard examination (Summary of findings 1). Eligible children were all junior kindergarten students in a specific school district of Ontario, Canada; 98% of those enrolled were 4 or 5 years of age. The screening was conducted by certified dental assistants in conjunction with an existing dental screening programme. The dental assistants were trained to use the plusoptiX S04 photoscreener (Plusoptix GmbH), with a corneal reflex more than 10 degrees from the centre defined as failing the test. Bias assessment indicated an unclear risk of bias for the patient selection domain but a low risk of bias for all other QUADAS‐2 domains (Figure 2; Figure 3). The unclear rating reflected the relatively low screening uptake of 25%, with included children volunteering rather than being sampled randomly or consecutively. No data were available on the prevalence of strabismus in non‐responders compared with responders. No adverse outcomes were reported.


Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study


Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Findings

Three hundred and six children were screened with the photoscreener. Two hundred and seventy‐one had both interpretable screening photographs and a completed gold standard examination; the others declined (n = 14), were unable to attend within the study timeframe (n = 11), became uncontactable (n = 6), had uninterpretable photographs (n = 3) or had an incomplete examination (n = 1). The photoscreener was used to ascertain refractive error, anisocoria and ocular misalignment, with 14 children referred specifically for ocular misalignment. A total of 13 children were found to have strabismus on gold standard examination, of whom six had been referred for ocular misalignment, two had been referred for refractive error and five had passed the screening test. The two participants referred for refractive error rather than ocular misalignment and found to have strabismus were counted as false negatives when calculating accuracy. The main outcomes were a sensitivity of 0.46 (95% confidence interval (CI) 0.19 to 0.75) and a specificity of 0.97 (95% CI 0.94 to 0.99) (Figure 4). The estimated prevalence of strabismus in the screened population was 4.8%. The types of strabismus identified were intermittent exotropia (n = 3 well‐controlled, n = 2 poorly‐controlled), esotropia (n = 4), hypertropia (n = 2) and exotropia (n = 2).
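The two‐by‐two table behind these estimates can be reconstructed from the counts in the paragraph above. The sketch below does so under one labelled assumption: that all 14 children referred for ocular misalignment were among the 271 analysed. The function name is hypothetical.

```python
def two_by_two(n_analysed, n_with_condition, n_index_positive, n_true_positive):
    """Derive (TP, FN, FP, TN) from marginal counts and the number of true positives."""
    tp = n_true_positive
    fn = n_with_condition - tp          # strabismic on reference, missed by index test
    fp = n_index_positive - tp          # referred for misalignment, no strabismus found
    tn = n_analysed - tp - fn - fp      # everyone else
    return tp, fn, fp, tn


# Counts from the Findings: 271 analysed, 13 with strabismus,
# 14 referred for misalignment (assumed all among the 271), 6 of them confirmed.
tp, fn, fp, tn = two_by_two(271, 13, 14, 6)

sensitivity = tp / (tp + fn)            # 6 / 13
specificity = tn / (tn + fp)            # 250 / 258
prevalence = (tp + fn) / 271            # 13 / 271

print(tp, fn, fp, tn)
print(f"sensitivity {sensitivity:.2f}, specificity {specificity:.2f}, "
      f"prevalence {prevalence:.1%}")
```

The seven false negatives are the five children who passed screening plus the two referred only for refractive error, consistent with how the review counted them.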


Figure 4. Forest plot of Test 1: Photoscreener.

Discussion

Screening for strabismus in children in the community may be achieved by tests that directly ascertain misalignment of the eyes (corneal or fundus reflections) or that indirectly detect associated reduced vision or stereopsis. Small deviations may not be noticed by the family but may have a significant impact on visual development, hence the rationale for screening.

Summary of main results

Few data are available on strabismus screening in the community as performed by lay examiners; most published screening studies focus on amblyopia. We identified one study that met the full inclusion criteria for this review, in which all children screened with a photoscreener were offered a gold standard examination (Arthur 2009). The risk of bias was unclear. The results indicated high specificity but low sensitivity, implying the potential for a substantial number of false negatives. The absolute number of children found to have strabismus by gold standard examination was small: 13 of 271.

Strengths and weaknesses of the review

Only one study was analysed in this review, prohibiting any conclusion on the accuracy of screening tests. It remains unclear whether other screening modalities would have significant accuracy for screening in this context.

Applicability of findings to the review question

The findings have limited applicability to the review question, which was to assess and compare the accuracy of multiple tests in screening for strabismus. The single included study suggests that the plusoptiX S04 photoscreener could provide a specific but not sensitive test for ocular misalignment, but one study is not sufficient for a robust conclusion (95% CI for sensitivity about 20% to 75%). Further studies are needed. Because of the lack of relevant studies, the secondary objective of investigating sources of heterogeneity in diagnostic accuracy could not be addressed.

Our review of the literature identified other studies with relevant results that did not meet the inclusion criteria and so could not be included. These included the Vision in Preschoolers (VIP) study, a large multicentre trial performed in two phases to ascertain the accuracy of various screening tools in children aged 3 to 4 years (VIP 2007). In phase I, trained eye care professionals performed the screening assessment; in phase II, trained nurses and lay screeners performed the screening tests. The study population was enriched from a preceding general screening programme: all children who failed screening were included, together with a proportion of those who did not. The aim was to enrich the sample for ocular pathology so as to better ascertain the accuracy of the screening methods. This study therefore could not be included in this review, but its conclusions remain relevant.

VIP 2007 specifically assessed methods of screening for strabismus. For the lay and nurse screeners it included four tests: Retinomax autorefraction, SureSight Vision Screener autorefractor, LEA symbols visual acuity testing and Stereo Smile Test II stereoacuity testing. Of 4040 children screened, 157 (3.9%) were found to have strabismus. For lay screeners, two combinations (the Stereo Smile test with the SureSight autorefractor, and the Stereo Smile test with the Retinomax autorefractor) were associated with a statistically significant increase in the sensitivity of strabismus detection at 90% specificity; no such increase was observed for other test combinations or for the nurse screeners. The study concluded that whether to add tests of eye alignment to acuity or refraction tests would depend on a screening programme's goals and resources. It also indicates that tests of visual acuity alone would be insufficient to identify all cases of strabismus.

A large prospective, consecutively enrolled study of all 3 to 6 year olds in an eastern province of Taiwan used two different index tests for screening with all children offered a gold standard examination by a single ophthalmologist (Tung 2006). Screening for strabismus performed in 2003 on 2868 children was conducted by trained kindergarten teachers using both a National Taiwan University (NTU) random dot stereogram to detect stereopsis less than 300 seconds of arc and Hirschberg corneal reflexes at 1 metre, with any displacement of the light reflexes considered abnormal. The number screened and then unavailable for the gold standard assessment was not disclosed. Detailed outcome numbers were not provided and as such this study could not be included in this review. However, the overall sensitivity and specificity for the NTU random dot stereogram were 38.9% and 90.4% respectively and for the Hirschberg light reflex were 75% and 98.9% respectively suggesting good efficacy for the Hirschberg light reflexes as a screening modality.

In summary the applicability of available studies to primary screening programmes is limited. Future screening studies should also consider the optimum screening age, which for optotype‐based tests is around 4 to 5 years (Solebo 2015). Lastly, we would recommend further research into long‐term visual and psychosocial outcomes of childhood strabismus, to explore the benefits of early detection.


Summary of findings 1. Summary of findings table

Accuracy of a photoscreener to detect strabismus in the community

Patient/population: children aged 1 to 6 years old

Setting: school

Index test: plusoptiX S04 photoscreener

Target condition: constant and intermittent manifest strabismus

Reference standard: cover test at distance and near

Number of studies: 1 (Arthur 2009)

Number of participants: 271

Number affected by target condition: 13

Sensitivity of test (95% CI): 0.46 (0.19 to 0.75)

Specificity of test (95% CI): 0.97 (0.94 to 0.99)

Risk of bias based on QUADAS‐2 domains: unclear risk

Comments: low participation rate of 25%

CI: confidence interval

Summary of findings 2. Data extraction from included studies

Study ID

Arthur 2009

Clinical features and settings

Previous testing and results: unknown.
Setting: elementary school.

Referral route/selection: all who were screened offered gold standard examination.

Participants

Sample size: 1343 children were invited to the study by consent forms sent to parents (this may have introduced selection bias, as concerned parents may have been more likely to return consent forms); 387 consents were returned, of which 45 were excluded as too late and 7 for document errors; 28 children were absent on the day of screening and 1 was uncooperative, leaving 306 screened. Of these, 275 had the gold standard examination (14 declined, 11 were unable to attend within the time frame, 6 were uncontactable), and 271 had interpretable data for both index and reference tests (3 photographs unusable, 1 did not complete the examination).

Socio‐demographic items: 98% aged 4 to 5 years; gender and ethnicity not given; no ocular abnormalities (i.e. media opacities, which would affect test results/technical failure rates). Geographic region: Limestone school district, Ontario, Canada.

Study design

Selection: all patients with data available on both index and reference tests as single group.
Enrolment: consecutive series, enrolled by post in combination with dental screening programme.
Identification: prospective.
If more than one test: one test.

Target condition

Constant and intermittent manifest strabismus (esotropia, exotropia, vertical tropia, microtropia). Prevalence of the target condition in the sample: 13 of 271 (4.8%).

Reference standard

Test definition and description: monocular visual acuity with occlusion glasses (crowded Keeler logMAR letter matching test/Crowded Kay pictures/Cardiff cards), cover test at distance and near, ocular movements and convergence, binocular single vision assessment (20D base‐out prism test and/or stereopsis) and red reflex test.

Standards: discharged if visual acuity (VA) was 0.2 logMAR or better, with binocular single vision at distance and near and no suspected ocular pathology; review and re‐check at 6 to 12 weeks if borderline; cycloplegic refraction/dilated examination for all others.
Test operator(s): optometrist or orthoptist or ophthalmologist.
Timing of reference standard: separate visit to hospital but timing unknown.

Index tests

plusoptiX S04 photoscreener, co‐axial camera, handheld at 1 m.
Criteria for positive test result: eye alignment > 10 degrees from centre (manually flagged as abnormal), anisometropia > 1D, astigmatism > 1.25D, myopia > 3D, hyperopia > 3.5D, anisocoria > 1 mm.
Details of test operators: certified dental assistants after 3 hours of training.
Timing: image acquisition time of 5 to 10 seconds, repeated if necessary.
Manufacturer: Plusoptix GmbH.
Technical characteristics: 3rd generation, infrared, coaxial video camera, portable, handheld, non‐contact.
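
The positive-test criteria listed for the index test can be sketched as a simple rule. This is an illustration only: the field names are hypothetical, and it assumes that exceeding any single criterion makes the result positive, which the entry does not state explicitly. In the study the alignment criterion was flagged manually, whereas the sketch treats it as a numeric reading.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One child's photoscreener reading (hypothetical field names)."""
    alignment_deg: float     # eye alignment deviation from centre, degrees
    anisometropia_d: float   # dioptres
    astigmatism_d: float     # dioptres
    myopia_d: float          # dioptres, as a magnitude
    hyperopia_d: float       # dioptres
    anisocoria_mm: float     # millimetres


# Limits taken from the "Criteria for positive test result" entry.
LIMITS = {
    "alignment_deg": 10.0,
    "anisometropia_d": 1.0,
    "astigmatism_d": 1.25,
    "myopia_d": 3.0,
    "hyperopia_d": 3.5,
    "anisocoria_mm": 1.0,
}


def failed_criteria(reading):
    """Names of the criteria whose limit is strictly exceeded."""
    return [name for name, limit in LIMITS.items()
            if getattr(reading, name) > limit]


def is_positive(reading):
    # ASSUMPTION: any one exceeded criterion gives a positive result.
    return bool(failed_criteria(reading))


borderline = Reading(10.0, 1.0, 1.25, 3.0, 3.5, 1.0)
misaligned = Reading(12.0, 0.5, 0.5, 0.0, 1.0, 0.2)
print(is_positive(borderline), failed_criteria(misaligned))
```

Note that most of these criteria concern refractive error or pupil size rather than alignment, so a positive result under this rule does not by itself indicate strabismus.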

Follow‐up

How many participants were lost to follow‐up: 31.
How many have missing or uninterpretable test results: 4.
Adverse events noted that could be caused by the test: 0.

Notes

Sources of funding: none declared.

Anything else of relevance: low participation rate (25%).
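
The participant counts reported for Arthur 2009 can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the Participants and Follow‐up entries; the variable names are illustrative.

```python
# Counts copied from the Participants entry for Arthur 2009.
invited = 1343
consents_returned = 387

excluded_before_screening = {
    "consent returned too late": 45,
    "document errors": 7,
    "absent on day of screening": 28,
    "uncooperative": 1,
}
lost_before_reference_exam = {
    "declined": 14,
    "unable to attend within time frame": 11,
    "uncontactable": 6,
}
uninterpretable_results = {
    "photographs unusable": 3,
    "did not complete exam": 1,
}

screened = consents_returned - sum(excluded_before_screening.values())
examined = screened - sum(lost_before_reference_exam.values())
analysed = examined - sum(uninterpretable_results.values())

# Each derived count should agree with the figure reported in the table.
assert screened == 306
assert examined == 275
assert analysed == 271
assert sum(lost_before_reference_exam.values()) == 31  # lost to follow-up
assert sum(uninterpretable_results.values()) == 4      # missing/uninterpretable

affected = 13
print(f"prevalence in analysed sample: {affected / analysed:.1%}")
```

The reported counts are internally consistent: the 31 children lost to follow‐up and the 4 with missing or uninterpretable results account exactly for the difference between 306 screened and 271 analysed.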

Table 1. Data extraction from included studies

Study ID

First author, year of publication

Clinical features and settings

Previous testing and results.
Setting: community/school/clinic (office) setting.

Referral route/selection.

Participants

Sample size.

Socio‐demographic items: age, gender, ethnicity, frequency of ocular abnormalities (i.e. media opacities, which would affect test results/technical failure rates), geographic region.

Study design

Selection: as single group/as separate group with/without target condition.
Enrolment: consecutive series.
Identification: prospective/retrospective.
If more than one test: how were tests allocated to individuals, did each individual receive all tests?

Target condition

Constant and intermittent manifest strabismus (esotropia, exotropia, vertical tropia, microtropia), including the prevalence of the target condition in the sample.

Reference standard

Test definition and description, i.e. cover test; 'comprehensive eye examination' (visual acuity, cover test, cycloplegic refraction).
Test operator(s).
Timing of reference standard.

Index tests

Test definition and description.
Criteria for positive test result.
Details of test operators.
Timing.
Manufacturer.
Technical characteristics.

Follow‐up

How many participants were lost to follow‐up: unknown.
How many have missing or uninterpretable test results: unknown.
Adverse events noted that could be caused by the test: none reported.

Notes

Sources of funding.

Abbreviations.

Anything else of relevance.

Table 2. QUADAS‐2 assessment guidance

Domain

Yes

No

Unclear

PATIENT SELECTION

Describe methods of patient selection: Describe included patients (prior testing, presentation, intended use of index test and setting)

Was a consecutive or random sample of patients enrolled?

Consecutive sampling or random sampling of children according to inclusion criteria.

Non‐random sampling or sampling based on volunteering or referral.

Unclear whether consecutive or random sampling used.

Was a case‐control design avoided?

Yes for all studies since case‐control studies are excluded unless nested in cohort studies.

N/A

N/A

Did the study avoid inappropriate exclusions?

Exclusions are detailed and felt to be appropriate (systemic disease causing strabismus). Children with known strabismus can be excluded.

Inappropriate exclusions are reported, e.g. of children in whom strabismus has been suspected in primary care but not confirmed by trained professionals.

Exclusions are not detailed (pending contact with study authors).

Risk of bias: could the selection of patients have introduced bias?

‐ 

 ‐

 ‐

Concerns regarding applicability: are there concerns that the included patients do not match the review question?

Inclusion of children in community settings, such as school or screening settings, with no previous diagnosis of any eye disease.

Inclusion of children over the age of 6 years, referred to clinical settings, referred to eye professionals for suspected eye disease, or assessed in commercial settings on a volunteer basis; or with a previous failed screening test or diagnosis of strabismus.

Unclear inclusion criteria.

INDEX TEST 

Describe the index test and how it was conducted and interpreted

Were the index test results interpreted without knowledge of the results of the reference standard?

A statement that the test was performed "blinded" or "independently and without knowledge of" reference standard results is sufficient, and full details of the blinding procedure are not required; or there is a clear temporal pattern to the order of testing that precludes the need for formal blinding.

Reference standard results available to those who conducted or interpreted the index tests.

Unclear whether results are interpreted independently.

If a threshold was used, was it pre‐specified?

Many included index tests are based on continuous measures (e.g. eye deviation, stereopsis, refractive error, visual acuity); the study authors declare that the selected cut‐off used to dichotomise data was specified a priori, or a protocol is available with this information.

A study is classified at higher risk of bias if the authors define the optimal cut‐off post hoc based on their own study data.

No information on pre‐selection of index test cut‐off values.

Risk of bias: could the conduct or interpretation of the index test have introduced bias?

‐ 

 ‐

‐ 

Concerns regarding applicability: are there concerns that the index test, its conduct, or interpretation differ from the review question?

Tests used and testing procedure clearly reported and tests executed by personnel with sufficient training.

Tests used are not validated or study personnel are insufficiently trained.

Unclear tests (e.g. stereopsis‐based tests are used but the study does not mention whether a validated test is used) or unclear study personnel profile, background and training.

REFERENCE STANDARD

Describe the reference standard and how it was conducted and interpreted

Is the reference standard likely to correctly classify the target condition?

Cover‒uncover test performed by trained professionals, e.g. ophthalmologists, optometrists, orthoptists.

Complete eye examination with cover‒uncover test used as reference standard, but strabismus is not judged on the cover‒uncover test alone (e.g. a visual acuity measure is also used).

Complete eye examination used but unclear whether cover‐uncover test used.

Were the reference standard results interpreted without knowledge of the results of the index test?

A statement that the reference standard was performed "blinded" or "independently and without knowledge of" index test results is sufficient, and full details of the blinding procedure are not required; or there is a clear temporal pattern to the order of testing that precludes the need for formal blinding.

Index test results available to those who conducted the reference standard; or the index test is part of the reference standard (e.g. visual acuity within a complete ophthalmic examination used as reference standard and visual acuity is also the index test analysed; this will be specific to each analysis).

Unclear whether results are interpreted independently.

Risk of bias: could the reference standard, its conduct, or its interpretation have introduced bias?

 

 

 

Concerns regarding applicability: are there concerns that the target condition as defined by the reference standard does not match the review question?

Cover‒uncover test used and testing procedure executed by personnel with sufficient training.

Cover‒uncover test used by personnel with inappropriate profile or insufficient training.

Unclear study personnel profile, background and training.

FLOW AND TIMING

Describe any patients who did not receive the index test(s) and/or reference standard or who were excluded from the 2×2 table (refer to flow diagram): describe the time interval and any interventions between index test(s) and reference standard

Was there an appropriate interval between index test(s) and reference standard?

No more than three months between index and reference test execution, and no corrective intervention between assessments.

More than three months between index and reference test execution.

Unclear whether the index and reference tests are executed within three months of each other.

Did all patients receive a reference standard?

All children receiving the index test are verified with the reference standard.

The verification rate of index test‐positive children is definitely higher than that of negative children (the opposite is unlikely).

Unclear whether all children receiving the index test are verified with the reference standard.

Did all patients receive the same reference standard?

All children are verified with the cover‒uncover test by trained professionals.

Some children, i.e. positive children, are verified with the cover‒uncover test by specialised personnel, while the others are verified by personnel with lower level of training.

Unclear whether all children are verified with the cover‒uncover test by trained professionals.

Were all patients included in the analysis?

The number of children included in the study matches the number in analyses, and children with undefined or borderline test results are not excluded. However, children in whom one or more index tests are not performed because they are poorly cooperative can be excluded.

The number of children included in the study does not match the number in analyses and children with undefined or borderline test results are excluded from the analyses.

The number of children analysed is reported, but not the number included in the study; or it is unclear whether there were inappropriate exclusions.

Risk of bias: could the patient flow have introduced bias? 

 ‐

 ‐

 ‐

COMPARATIVE STUDIES (MULTIPLE INDEX TESTS)

Were all tests performed on all patients, or randomly assigned?

All children received all index tests, or tests were randomly assigned.

Not all children received all index tests and the assignment criterion was opportunistic or non‐random (e.g. depending on test availability or type of professional).

Not all children received all index tests and the assignment criterion was unclear.

Could the order in which the index tests were used affect the target condition or the interpretation of the alternative tests? 

The order of presentation of the index test was random or alternate to avoid fatigue effects; or clear that no fatigue effect can arise.

Several tests are delivered in a fixed order which can cause children to be less compliant with the second or later test.

Unclear order of test presentation.

Table 3. Thresholds for analysis

1) Tests which identify ocular misalignment
Tests included: 1.1) corneal reflection tests: Hirschberg, Krimsky (prism reflection test); 1.2) fundus reflection test: Brückner.
Output measure: prism dioptres (PD).
Threshold to extract data: 8 PD for horizontal deviations; 1 PD for vertical deviations (no published threshold identified).

2) Test of binocular function: stereopsis
Tests included: stereoacuity tests such as contour and random dot stereotests.
Output measure: seconds of arc.
Threshold to extract data: 400 seconds of arc.

3) Tests designed to detect reduced central vision
Tests included: 3.1) visual acuity tests, e.g. HOTV, LEA symbols, Keeler (previously Glasgow) crowded logMAR, Sonksen crowded logMAR, crowded Kay picture test; 3.2) suppression tests; 3.3) blur test.
Output measure: logMAR or logMAR equivalent.
Threshold to extract data: 0.2 logMAR.

4) Automated refraction devices designed to report ocular misalignment
Output measure: millimetres of asymmetry or corneal reflections.
Threshold to extract data: no published threshold identified.
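
To illustrate how the thresholds in this table would turn a continuous measurement into a positive or negative index test result, here is a minimal sketch. The dictionary keys and function name are hypothetical. The direction and strictness of each comparison are assumptions: a result counts as positive only when it is strictly worse (numerically larger) than the threshold, and deviations are passed in as magnitudes. Automated refraction devices are left out because no threshold was identified for them.

```python
# Thresholds from Table 3; keys are illustrative names, not from the review.
THRESHOLDS = {
    "horizontal_deviation_pd": 8.0,   # prism dioptres
    "vertical_deviation_pd": 1.0,     # prism dioptres
    "stereoacuity_arcsec": 400.0,     # seconds of arc
    "visual_acuity_logmar": 0.2,      # logMAR (higher is worse)
}


def dichotomise(measure, value):
    """Return True (test positive) if value is worse than the threshold.

    ASSUMPTION: 'worse' means strictly greater than the threshold for every
    measure; the table gives the cut-off values but not the comparison rule.
    """
    if measure not in THRESHOLDS:
        raise KeyError(f"no threshold defined for {measure!r}")
    return value > THRESHOLDS[measure]


results = {
    "visual_acuity_logmar": dichotomise("visual_acuity_logmar", 0.3),
    "stereoacuity_arcsec": dichotomise("stereoacuity_arcsec", 200.0),
    "horizontal_deviation_pd": dichotomise("horizontal_deviation_pd", 8.0),
}
print(results)
```

Fixing a single cut-off per test type in advance, as the table does, keeps data extraction comparable across studies that report different thresholds.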

Table Tests. Data tables by test

Test: 1 Photoscreener
No. of studies: 1
No. of participants: 271
