
(Ultra-)long-acting insulin analogues versus NPH insulin (human isophane insulin) for adults with type 2 diabetes mellitus


Background

Evidence that antihyperglycaemic therapy is beneficial for people with type 2 diabetes mellitus is conflicting. While the United Kingdom Prospective Diabetes Study (UKPDS) found tighter glycaemic control to be beneficial, other studies, such as the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial, found that intensive therapy to lower blood glucose to near normal levels did more harm than good. Study results also showed different effects for different antihyperglycaemic drugs, regardless of the blood glucose levels achieved. In consequence, firm conclusions on the effect of interventions on patient-relevant outcomes cannot be drawn from the effect of these interventions on blood glucose concentration alone. In theory, the use of newer insulin analogues may result in fewer macrovascular and microvascular events.

Objectives

To compare the effects of long-term treatment with (ultra-)long-acting insulin analogues (insulin glargine U100 and U300, insulin detemir and insulin degludec) with NPH (neutral protamine Hagedorn) insulin (human isophane insulin) in adults with type 2 diabetes mellitus.

Search methods

For this Cochrane Review update, we searched CENTRAL, MEDLINE, Embase, ICTRP Search Portal and ClinicalTrials.gov. The date of the last search was 5 November 2019, except for Embase, which was last searched on 26 January 2017. We applied no language restrictions.

Selection criteria

We included randomised controlled trials (RCTs) comparing the effects of treatment with (ultra-)long-acting insulin analogues with NPH insulin in adults with type 2 diabetes mellitus.

Data collection and analysis

Two review authors independently selected trials, assessed risk of bias, extracted data and evaluated the overall certainty of the evidence using GRADE. Trials were pooled using random-effects meta-analyses.
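The pooling step can be illustrated with a minimal sketch of a random-effects meta-analysis of risk ratios. It assumes the DerSimonian-Laird estimator for the between-trial variance, which the text above does not specify, and the trial counts are hypothetical values chosen only to make the example run; they are not data from the review.

```python
import math

def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Return the log risk ratio and its approximate variance for one trial."""
    rr = (events_t / n_t) / (events_c / n_c)
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), var

def dersimonian_laird(effects, variances):
    """Pool per-trial log effects with DerSimonian-Laird random-effects weights."""
    k = len(effects)
    w = [1 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-trial variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, tau2

# HYPOTHETICAL trials: (events analogue, n analogue, events NPH, n NPH).
trials = [(10, 200, 15, 200), (8, 150, 9, 150), (20, 400, 30, 400)]
effects, variances = zip(*(log_risk_ratio(*t) for t in trials))
pooled, se, tau2 = dersimonian_laird(effects, variances)
print(f"pooled RR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(pooled - 1.96 * se):.2f} to {math.exp(pooled + 1.96 * se):.2f})")
```

When the trials agree closely, the estimated between-trial variance is zero and the result coincides with a fixed-effect analysis; with more heterogeneity, smaller trials receive relatively more weight.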

Main results

We identified 24 RCTs. Of these, 16 trials compared insulin glargine with NPH insulin and eight trials compared insulin detemir with NPH insulin. In these trials, 3419 people with type 2 diabetes mellitus were randomised to insulin glargine and 1321 people to insulin detemir. The duration of the included trials ranged from 24 weeks to five years. In studies comparing insulin glargine with NPH insulin, target values ranged from 4.0 mmol/L to 7.8 mmol/L (72 mg/dL to 140 mg/dL) for fasting blood glucose, from 4.4 mmol/L to 6.6 mmol/L (80 mg/dL to 120 mg/dL) for nocturnal blood glucose and less than 10 mmol/L (180 mg/dL) for postprandial blood glucose, when applicable. In studies comparing insulin detemir with NPH insulin, blood glucose and glycosylated haemoglobin A1c (HbA1c) target values ranged from 4.0 mmol/L to 7.0 mmol/L (72 mg/dL to 126 mg/dL) for fasting blood glucose, from less than 6.7 mmol/L (120 mg/dL) to less than 10 mmol/L (180 mg/dL) for postprandial blood glucose, from 4.0 mmol/L to 7.0 mmol/L (72 mg/dL to 126 mg/dL) for nocturnal blood glucose and from 5.8% to less than 6.4% for HbA1c, when applicable.
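The paired units in the glucose targets can be checked with a short conversion sketch. It assumes a factor of 18.016 mg/dL per mmol/L, derived from the molar mass of glucose (about 180.16 g/mol); the function names are illustrative and the review itself does not state which factor or rounding the trials used.

```python
MG_DL_PER_MMOL_L = 18.016  # assumption: glucose molar mass of about 180.16 g/mol

def mgdl_to_mmol(mg_dl):
    """Convert a blood glucose value from mg/dL to mmol/L, rounded to one decimal."""
    return round(mg_dl / MG_DL_PER_MMOL_L, 1)

def mmol_to_mgdl(mmol_l):
    """Convert a blood glucose value from mmol/L to mg/dL, rounded to a whole number."""
    return round(mmol_l * MG_DL_PER_MMOL_L)

for target in (72, 126, 140, 180):
    print(f"{target} mg/dL is about {mgdl_to_mmol(target)} mmol/L")
```

Because both directions involve rounding, converting a rounded mmol/L value back to mg/dL does not always return the original figure.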

All trials had an unclear or high risk of bias in several risk of bias domains.

Overall, insulin glargine and insulin detemir resulted in fewer participants experiencing hypoglycaemia when compared with NPH insulin. Changes in HbA1c were comparable for long-acting insulin analogues and NPH insulin.

Insulin glargine compared with NPH insulin had a risk ratio (RR) for severe hypoglycaemia of 0.68 (95% confidence interval (CI) 0.46 to 1.01; P = 0.06; absolute risk reduction (ARR) –1.2%, 95% CI –2.0 to 0; 14 trials, 6164 participants; very low-certainty evidence). The RR for serious hypoglycaemia was 0.75 (95% CI 0.52 to 1.09; P = 0.13; ARR –0.7%, 95% CI –1.3 to 0.2; 10 trials, 4685 participants; low-certainty evidence). Treatment with insulin glargine reduced the incidence of confirmed hypoglycaemia and confirmed nocturnal hypoglycaemia.

Treatment with insulin detemir compared with NPH insulin found an RR for severe hypoglycaemia of 0.45 (95% CI 0.17 to 1.20; P = 0.11; ARR –0.9%, 95% CI –1.4 to 0.4; 5 trials, 1804 participants; very low-certainty evidence). The Peto odds ratio for serious hypoglycaemia was 0.16 (95% CI 0.04 to 0.61; P = 0.007; ARR –0.9%, 95% CI –1.1 to –0.4; 5 trials, 1777 participants; low-certainty evidence). Treatment with detemir also reduced the incidence of confirmed hypoglycaemia and confirmed nocturnal hypoglycaemia.
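The relationship between a reported ratio, its confidence interval and its P value can be sketched as follows. The sketch assumes a normal approximation on the log scale (a Wald test), which may differ from the exact method used in the review, and because the published figures are rounded the recovered P values are only approximate.

```python
import math

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of the log ratio
    z = math.log(ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))                     # two-sided normal P value
    return z, p

# Serious hypoglycaemia, glargine vs NPH: RR 0.75 (0.52 to 1.09)
print(round(p_from_ratio_ci(0.75, 0.52, 1.09)[1], 2))
# Severe hypoglycaemia, detemir vs NPH: RR 0.45 (0.17 to 1.20)
print(round(p_from_ratio_ci(0.45, 0.17, 1.20)[1], 2))
```

Both calls reproduce the reported P values (0.13 and 0.11) to two decimals. An interval that includes 1 always corresponds to a P value above 0.05 under this approximation.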

Information on patient-relevant outcomes, such as death from any cause, diabetes-related complications, health-related quality of life and socioeconomic effects, was insufficient or lacking in almost all included trials. For those outcomes for which some data were available, there were no significant differences between treatment with glargine or detemir and treatment with NPH. There was no clear difference between insulin analogues and NPH insulin in terms of weight gain.

The incidence of adverse events was comparable for people treated with glargine or detemir and people treated with NPH.

We found no trials comparing ultra-long-acting insulin glargine U300 or insulin degludec with NPH insulin.

Authors' conclusions

While the effects on HbA1c were comparable, treatment with insulin glargine and insulin detemir resulted in fewer participants experiencing hypoglycaemia when compared with NPH insulin. Treatment with insulin detemir also reduced the incidence of serious hypoglycaemia. However, serious hypoglycaemic events were rare and the absolute risk reducing effect was low. Approximately one in 100 people treated with insulin detemir instead of NPH insulin benefited.
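The "one in 100" figure follows from the absolute risk reduction for serious hypoglycaemia with detemir (ARR –0.9%, 95% CI –1.1 to –0.4). A minimal sketch of that arithmetic, using an illustrative function name, is:

```python
def people_treated_per_person_benefiting(arr_percent):
    """Number of people treated for one to benefit, from an ARR given in percent."""
    if arr_percent == 0:
        raise ValueError("an absolute risk reduction of zero gives no finite number")
    return 100.0 / abs(arr_percent)

for arr in (-0.9, -1.1, -0.4):
    n = people_treated_per_person_benefiting(arr)
    print(f"ARR {arr}% -> about 1 in {round(n)}")
```

The point estimate corresponds to about 1 in 111 people, and the interval limits to roughly 1 in 91 and 1 in 250, which is consistent with the rounded statement in the text.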

In the studies, low blood glucose and HbA1c targets, corresponding to near normal or even non-diabetic blood glucose levels, were set. Therefore, results from the studies are only applicable to people in whom such low blood glucose concentrations are targeted. However, current guidelines recommend less intensive blood glucose lowering for most people with type 2 diabetes in daily practice (e.g. people with cardiovascular diseases, with a long history of type 2 diabetes, who are susceptible to hypoglycaemia, or elderly people). Additionally, low-certainty evidence and trial designs that did not conform with current clinical practice meant it remains unclear if the same effects will be observed in daily clinical practice. Most trials did not report patient-relevant outcomes.


(Ultra-)long-acting insulin analogues compared with NPH insulin (human isophane insulin) for adults with type 2 diabetes mellitus

Introduction

Type 2 diabetes mellitus is a progressive disease, meaning that more and more antihyperglycaemic medications are needed to reach recommended glycosylated haemoglobin A1c (HbA1c) levels as the duration of the disease increases. The HbA1c test reflects average blood glucose levels over the preceding two to three months. Over time, many people will require insulin therapy. Insulin therapy often consists of administering human basal insulins once or twice daily. Basal insulins are long-acting insulins with a delayed onset of action that cover the body's basic insulin needs. Rapid-acting insulins are used to cover meals. The most common side effects of insulin therapy are low blood sugar (hypoglycaemia) and weight gain. New synthetic insulins, called (ultra-)long-acting insulin analogues, have been developed with the aim of reducing side effects and enabling better blood glucose control.

Review question

We aimed to compare the effects of treatment with (ultra-)long-acting insulin analogues with NPH (neutral protamine Hagedorn) insulin (human isophane insulin).

Search date

The evidence is current to 5 November 2019.

Background

It is unclear whether, and to what extent, (ultra-)long-acting insulin analogues have more beneficial effects or fewer harmful effects compared with NPH insulin.

Study characteristics

All 24 included studies were randomised controlled trials (clinical studies where people are randomly assigned to one of two or more treatment groups). Sixteen studies compared the long-acting insulin glargine with NPH insulin and eight studies compared the long-acting insulin detemir with NPH insulin. In these studies, 3419 people with type 2 diabetes mellitus were randomised to insulin glargine and 1321 people to insulin detemir. The duration of the studies ranged from 24 weeks to five years.

Key results

The different insulins lowered HbA1c by approximately the same amount.

Treatment with insulin glargine or insulin detemir instead of NPH insulin resulted in fewer people experiencing hypoglycaemia. Treatment with insulin detemir reduced the risk of serious hypoglycaemia. However, serious hypoglycaemia occurred only rarely in the studies: in fewer than one in 100 people treated with insulin detemir and in approximately one in 100 people treated with NPH insulin. Approximately one in 100 people treated with insulin detemir instead of NPH insulin benefited.

There was little information on diabetes-related complications (such as heart disease, kidney disease, damage to the retina of the eye and amputations), death from any cause and health-related quality of life. Where available, study results did not indicate clear differences between insulin analogues and NPH insulin.

There was no clear difference between insulin analogues and NPH insulin in side effects or weight gain.

None of the included studies reported on socioeconomic effects (such as costs of the intervention, absence from work or medication use).

Certainty of the evidence

In the studies, very low target values for blood glucose and HbA1c were set. However, doctors often recommend higher targets for people who have a long history of type 2 diabetes, who have had a heart attack or stroke, or who are elderly. With higher target values, hypoglycaemia occurs less often and more people need to be treated with insulin analogues instead of NPH insulin to prevent hypoglycaemia in one person. Therefore, the study results are only applicable to people who are treated to these very low blood glucose target values.
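The effect of a lower baseline risk on the number of people who need to be treated can be sketched as below. The sketch assumes that the relative effect stays constant when the baseline risk changes, which is a simplifying assumption rather than a finding of the review. It uses the detemir estimate for severe hypoglycaemia (RR 0.45, NPH risk 17 per 1000) together with a hypothetical halved baseline risk.

```python
def number_needed_to_treat(baseline_risk, risk_ratio):
    """People treated to prevent one event, assuming a constant risk ratio."""
    absolute_reduction = baseline_risk * (1 - risk_ratio)
    if absolute_reduction <= 0:
        raise ValueError("no risk reduction, so no finite number needed to treat")
    return 1 / absolute_reduction

trial_risk = 17 / 1000        # NPH risk of severe hypoglycaemia in the trials
lower_risk = trial_risk / 2   # HYPOTHETICAL risk under less strict glucose targets
for risk in (trial_risk, lower_risk):
    print(f"baseline {risk * 1000:.1f} per 1000 -> treat about "
          f"{round(number_needed_to_treat(risk, 0.45))} people")
```

Halving the baseline risk doubles the number of people who must be treated for one person to benefit, from about 107 to about 214 in this illustration.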

In many studies, adequate adjustment of NPH insulin was not possible. However, doctors will make such adjustments in daily practice. Therefore, the benefit of insulin analogues is expected to be even smaller in practice.

Treatment in all but one of the studies lasted 12 months or less. However, diabetes-related complications usually develop only after many years. Therefore, most studies could not answer the important question of whether treatment with different insulin preparations has different effects on diabetes-related complications. This means that potentially important differences between insulin analogues and NPH insulin may have gone undetected.

All studies had problems in the way they were conducted.

Authors' conclusions

Implications for practice

In people with type 2 diabetes mellitus, treatment with insulin detemir reduced the incidence of serious hypoglycaemia, and treatment with insulin glargine or detemir reduced the incidence of confirmed hypoglycaemic and confirmed nocturnal hypoglycaemic events, as compared with NPH insulin, with no substantial difference in glycosylated haemoglobin A1c (HbA1c) lowering. However, serious hypoglycaemic events were rare and the absolute risk reducing effect was low. Approximately one in 100 people treated with insulin detemir instead of NPH insulin benefited.

In all studies, low blood glucose and HbA1c targets, corresponding to near normal or even non‐diabetic blood glucose levels, were set. Therefore, results from the studies at hand are only applicable to people in whom such low blood glucose concentrations are targeted. However, current guidelines recommend less intensive blood glucose lowering for the majority of people with type 2 diabetes in daily practice (e.g. people with cardiovascular diseases, with a long history of type 2 diabetes, who are susceptible to hypoglycaemia or elderly people). Additionally, low‐certainty evidence and trial designs that did not conform with current clinical practice meant it remains unclear if the same effects will be observed in daily clinical practice.

We found no clear effects of glargine or detemir compared with NPH on diabetes‐related complications.

Data on health‐related quality of life and socioeconomic effects were limited or not available.

Implications for research

For most patient‐important outcomes it remains to be clarified if there is a clinically relevant difference between treatment with insulin glargine or detemir and NPH insulin in people with type 2 diabetes mellitus. Furthermore, data are required on socioeconomic effects, as well as from low‐ and middle‐income countries as they were under‐represented in the available trials.

Summary of findings

Summary of findings 1. Insulin glargine versus NPH insulin for type 2 diabetes mellitus

Patient: participants with type 2 diabetes mellitus
Intervention: insulin glargine
Comparison: NPH insulin (human isophane insulin)

| Outcomes | Risk for NPH insulin | Risk for insulin glargine | Relative effect (95% CI) | No of participants (trials) | Certainty of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Diabetes‐related complications (follow‐up: 6 months to 36 weeks) | | | | | | |
| (1) Fatal MI | See comment | See comment | See comment | 934 (4 RCTs) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported 3/352 participants in the glargine 100 IU group vs 0/349 participants in the NPH group experienced fatal MI; 3 additional trials with 233 participants reported that no fatal MI occurred. |
| (2) Fatal stroke | See comment | See comment | See comment | 934 (4 RCTs) | ⊕⊝⊝⊝ Very low (a) | No fatal strokes occurred. |
| (3) Progression in retinopathy | 101 per 1000 | 104 per 1000 (60 to 178) | RR 1.03 (0.60 to 1.77) | 1947 (5 RCTs) | ⊕⊝⊝⊝ Very low (b) | The 95% prediction interval ranged between 0.22 and 4.83. |
| (4) Amputations | See comment | See comment | See comment | 34 (1 RCT) | ⊕⊝⊝⊝ Very low (c) | 1 trial reported that no amputation or ESRD occurred. |
| (5) ESRD | See comment | See comment | See comment | 34 (1 RCT) | ⊕⊝⊝⊝ Very low (c) | 1 trial reported that no amputation or ESRD occurred. |
| Hypoglycaemic episodes (follow‐up: 24 weeks to 5 years) | | | | | | |
| (1) Severe hypoglycaemia | 37 per 1000 | 25 per 1000 (17 to 37) | RR 0.68 (0.46 to 1.01) | 6164 (14 RCTs) | ⊕⊝⊝⊝ Very low (d) | The 95% prediction interval ranged between 0.33 and 1.40. |
| (2) Serious hypoglycaemia | 27 per 1000 | 20 per 1000 (14 to 29) | RR 0.75 (0.52 to 1.09) | 4685 (10 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.48 and 1.16. |
| (3) Confirmed hypoglycaemia (BG < 75 mg/dL) | 572 per 1000 | 526 per 1000 (486 to 578) | RR 0.92 (0.85 to 1.01) | 4115 (7 RCTs) | ⊕⊝⊝⊝ Very low (f) | The 95% prediction interval ranged between 0.69 and 1.22. |
| (4) Confirmed hypoglycaemia (BG < 55 mg/dL) | 180 per 1000 | 159 per 1000 (146 to 173) | RR 0.88 (0.81 to 0.96) | 4388 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.79 and 0.98. |
| (5) Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 351 per 1000 | 274 per 1000 (239 to 312) | RR 0.78 (0.68 to 0.89) | 4225 (8 RCTs) | ⊕⊝⊝⊝ Very low (f) | The 95% prediction interval ranged between 0.53 and 1.14. |
| (6) Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 115 per 1000 | 85 per 1000 (74 to 98) | RR 0.74 (0.64 to 0.85) | 4759 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.62 and 0.88. |
| HRQoL (follow‐up: 28 weeks to 48 weeks) | See comment | See comment | See comment | 1228 (3 RCTs) | ⊕⊝⊝⊝ Very low (h) | 3 trials reported no statistically significant differences between glargine groups and NPH groups in HRQoL total scores (W‐BQ22; EQ‐5) or any subscales. |
| All‐cause mortality (follow‐up: 24 weeks to 5 years) | 8 per 1000 | 9 per 1000 (5 to 15) | Peto OR 1.06 (0.62 to 1.82) | 6173 (14 RCTs) | ⊕⊕⊝⊝ Low (i) | – |
| AEs other than hypoglycaemia (follow‐up: 24 weeks to 5 years) | | | | | | |
| (1) SAE | 135 per 1000 | 132 per 1000 (117 to 148) | RR 0.98 (0.87 to 1.10) | 5499 (13 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.86 and 1.12. |
| (2) Overall AE | 662 per 1000 | 669 per 1000 (649 to 682) | RR 1.01 (0.98 to 1.03) | 6170 (14 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.99 and 1.03. |
| (3) AE leading to discontinuation | 17 per 1000 | 20 per 1000 (14 to 30) | RR 1.21 (0.84 to 1.76) | 6149 (13 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.79 and 1.84. |
| Socioeconomic effects | Not reported | | | | | |
| HbA1c (follow‐up: 24 weeks to 5 years) | The mean change in HbA1c ranged across control groups from –2.12% to +0.1% | The mean change in HbA1c in the intervention groups was 0.07% lower (0.18% lower to 0.03% higher) | – | 5809 (16 RCTs) | ⊕⊕⊝⊝ Low (k) | The 95% prediction interval ranged between –0.46% and 0.32%. |

AE: adverse event; BG: blood glucose; CI: confidence interval; EQ‐5(D): EuroQol 5 (Dimension); ESRD: end‐stage renal disease; HbA1c: glycosylated haemoglobin A1c; HRQoL: health‐related quality of life; MD: mean difference; MI: myocardial infarction; NPH: neutral protamine Hagedorn; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; SAE: serious adverse event; W‐BQ22: Well‐Being Questionnaire (22 items).

GRADE Working Group grades of evidence

High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

(a) Downgraded three levels because of risk of bias and serious imprecision (very sparse data) – see Appendix 1.
(b) Downgraded three levels because of risk of bias, inconsistency and imprecision – see Appendix 1.
(c) Downgraded three levels because of indirectness and serious imprecision (very sparse data) – see Appendix 1.
(d) Downgraded three levels because of risk of bias, imprecision and inconsistency – see Appendix 1.
(e) Downgraded two levels because of risk of bias and imprecision – see Appendix 1.
(f) Downgraded three levels because of risk of bias, inconsistency and imprecision – see Appendix 1.
(g) Downgraded one level because of risk of bias – see Appendix 1.
(h) Downgraded three levels because of risk of bias and serious imprecision – see Appendix 1.
(i) Downgraded two levels because of risk of bias and imprecision – see Appendix 1.
(j) Downgraded one level because of imprecision – see Appendix 1.
(k) Downgraded two levels because of inconsistency and imprecision – see Appendix 1.
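The "Risk for insulin glargine" column can be approximated from the NPH risk and the risk ratio. The sketch below assumes the usual approach of multiplying the comparator risk by the RR and by each limit of its confidence interval; the function name is illustrative.

```python
def risk_per_1000(baseline_per_1000, rr, rr_low, rr_high):
    """Apply a risk ratio and its CI limits to a baseline risk per 1000 people."""
    return tuple(round(baseline_per_1000 * x) for x in (rr, rr_low, rr_high))

# Severe hypoglycaemia: NPH risk 37 per 1000, RR 0.68 (0.46 to 1.01)
print(risk_per_1000(37, 0.68, 0.46, 1.01))   # (25, 17, 37)
# Serious hypoglycaemia: NPH risk 27 per 1000, RR 0.75 (0.52 to 1.09)
print(risk_per_1000(27, 0.75, 0.52, 1.09))   # (20, 14, 29)
```

Some rows of the table differ from this calculation by one per 1000 (for example, 180 × 0.88 gives 158 rather than 159), presumably because the published baseline risks and ratios are themselves rounded.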

Summary of findings 2. Insulin detemir versus NPH insulin for type 2 diabetes mellitus

Patient: participants with type 2 diabetes mellitus
Intervention: insulin detemir
Comparison: NPH insulin (human isophane insulin)

| Outcomes | Risk for NPH insulin | Risk for insulin detemir | Relative effect (95% CI) | No of participants (trials) | Certainty of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Diabetes‐related complications (follow‐up: 24 weeks to 26 weeks) | | | | | | |
| (1) Fatal MI | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no fatal MI or fatal stroke occurred. |
| (2) Fatal stroke | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no fatal MI or fatal stroke occurred. |
| (3) Progression in retinopathy | 25 per 1000 | 37 per 1000 (17 to 82) | RR 1.50 (0.68 to 3.32) | 972 (2 RCTs) | ⊕⊝⊝⊝ Very low (a) | – |
| (4) Amputations | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no amputation or ESRD occurred. |
| (5) ESRD | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no amputation or ESRD occurred. |
| Hypoglycaemic episodes (follow‐up: 24 weeks to 7 months) | | | | | | |
| (1) Severe hypoglycaemia | 17 per 1000 | 8 per 1000 (3 to 21) | RR 0.45 (0.17 to 1.20) | 1804 (5 RCTs) | ⊕⊝⊝⊝ Very low (b) | The 95% prediction interval ranged between 0.09 and 2.21. |
| (2) Serious hypoglycaemia | 11 per 1000 | 2 per 1000 (0 to 7) | Peto OR 0.16 (0.04 to 0.61) | 1777 (5 RCTs) | ⊕⊕⊝⊝ Low (c) | – |
| (3) Confirmed hypoglycaemia (BG < 75 mg/dL) | 562 per 1000 | 410 per 1000 (343 to 484) | RR 0.73 (0.61 to 0.86) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (d) | The 95% prediction interval ranged between 0.36 and 1.48. |
| (4) Confirmed hypoglycaemia (BG < 55 mg/dL) | 493 per 1000 | 237 per 1000 (158 to 350) | RR 0.48 (0.32 to 0.71) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.20 and 1.13. |
| (5) Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 309 per 1000 | 176 per 1000 (145 to 210) | RR 0.57 (0.47 to 0.68) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.39 and 0.84. |
| (6) Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 40 per 1000 | 13 per 1000 (6 to 25) | RR 0.32 (0.16 to 0.63) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.07 and 1.42. |
| Health‐related quality of life (follow‐up: 26 weeks to 36 weeks) | See comment | See comment | See comment | 873 (3 RCTs) | ⊕⊝⊝⊝ Very low (b) | 3 trials reported no statistically significant difference between detemir groups and NPH groups in HRQoL total scores (ITR‐QOLN; DHP‐2; SF‐36) or any subscales. |
| All‐cause mortality (follow‐up: 24 weeks to 48 weeks) | 5 per 1000 | 4 per 1000 (1 to 13) | Peto OR 0.74 (0.20 to 2.65) | 2328 (8 RCTs) | ⊕⊕⊝⊝ Low (f) | – |
| AEs other than hypoglycaemia (follow‐up: 24 weeks to 48 weeks) | | | | | | |
| (1) SAE | 71 per 1000 | 62 per 1000 (45 to 85) | RR 0.88 (0.64 to 1.20) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.60 and 1.30. |
| (2) Overall AE | 611 per 1000 | 629 per 1000 (586 to 678) | RR 1.03 (0.96 to 1.11) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.94 and 1.13. |
| (3) AE leading to discontinuation | 18 per 1000 | 22 per 1000 (12 to 40) | RR 1.22 (0.67 to 2.25) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.57 and 2.62. |
| Socioeconomic effects | Not reported | | | | | |
| HbA1c (follow‐up not stated) | The mean change in HbA1c ranged across control groups from –1.9% to –0.32% | The mean change in HbA1c in the intervention groups was 0.13% higher (0.02% lower to 0.28% higher) | – | 2233 (7 RCTs) | ⊕⊕⊝⊝ Low (h) | The 95% prediction interval ranged between –0.28% and 0.54%. |

AE: adverse event; BG: blood glucose; CI: confidence interval; DHP‐2: Diabetes Health Profile 2; ESRD: end‐stage renal disease; HbA1c: glycosylated haemoglobin A1c; HRQoL: health‐related quality of life; ITR‐QOLN: insulin therapy‐related quality of life at night; MD: mean difference; MI: myocardial infarction; NPH: neutral protamine Hagedorn; OR: odds ratio; RR: risk ratio; SAE: serious adverse event; SF‐36: 36‐item Short Form Health Survey.

GRADE Working Group grades of evidence

High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

(a) Downgraded three levels because of risk of bias and serious imprecision (very sparse data) – see Appendix 2.
(b) Downgraded three levels because of risk of bias and serious imprecision – see Appendix 2.
(c) Downgraded two levels because of risk of bias and imprecision – see Appendix 2.
(d) Downgraded two levels because of risk of bias and inconsistency – see Appendix 2.
(e) Downgraded two levels because of risk of bias and imprecision – see Appendix 2.
(f) Downgraded two levels because of serious imprecision – see Appendix 2.
(g) Downgraded one level because of imprecision – see Appendix 2.
(h) Downgraded two levels because of inconsistency and imprecision – see Appendix 2.
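For outcomes reported as a Peto odds ratio, the "Risk for insulin detemir" column can be approximated by converting the NPH risk to odds, applying the odds ratio and converting back to a risk. The sketch below assumes that the Peto OR can be treated like an ordinary odds ratio for this purpose, which is reasonable only when events are rare; the function name is illustrative.

```python
def risk_per_1000_from_or(baseline_per_1000, odds_ratio):
    """Convert a baseline risk per 1000 and an odds ratio into a risk per 1000."""
    p0 = baseline_per_1000 / 1000
    odds1 = odds_ratio * p0 / (1 - p0)      # odds in the intervention group
    return round(1000 * odds1 / (1 + odds1))

# Serious hypoglycaemia: NPH risk 11 per 1000, Peto OR 0.16 (0.04 to 0.61)
print([risk_per_1000_from_or(11, x) for x in (0.16, 0.04, 0.61)])  # [2, 0, 7]
# All-cause mortality: NPH risk 5 per 1000, Peto OR 0.74 (0.20 to 2.65)
print([risk_per_1000_from_or(5, x) for x in (0.74, 0.20, 2.65)])   # [4, 1, 13]
```

Both results agree with the corresponding rows of the table above.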

Background

Description of the condition

Type 2 diabetes mellitus is a metabolic disorder characterised by relative insulin deficiency resulting from a reduced sensitivity of tissues to insulin, impaired insulin secretion by pancreatic β‐cells, or both (ADA 2020). This in turn leads to chronic hyperglycaemia (i.e. elevated levels of plasma glucose) with disturbances in carbohydrate, fat and protein metabolism (ADA 2020). Long‐term complications of diabetes mellitus include retinopathy, nephropathy, neuropathy and increased risk of cardiovascular disease (CVD) (ADA 2020).

Description of the intervention

Type 2 diabetes mellitus is a progressive disease that causes a decline in pancreatic β‐cell function. Thus, at some point during the course of the disease, treatment with oral glucose‐lowering agents or other non‐insulin glucose‐lowering agents may not suffice, and exogenous insulin will be necessary to achieve the desired glucose levels. At this stage, treatment with intermediate or long‐acting insulins is one of the recommended treatment options (ADA 2020).

Historically, intermediate‐ and long‐acting insulin preparations were obtained by crystallising insulin with either protamine (NPH type) or zinc (Lente type). Treatment with these basal insulins, however, has drawbacks. Achieving lower blood glucose levels carries an increased risk of hypoglycaemia (Ahrén 2013). As NPH is associated with a pronounced insulin peak following injection and variable absorption (Heinemann 2000; Lepore 2000), targeting lower glycosylated haemoglobin A1c (HbA1c) levels is often difficult and leads to a higher incidence of hypoglycaemic events (Ahrén 2013).

To provide people with diabetes mellitus with insulin that has a more suitable physiological time course, so‐called insulin analogues have been developed. Insulin analogues are insulin‐like molecules, engineered on the basis of the molecular structure of human insulin by changing the amino acid sequence and physicochemical properties. Four such (ultra‐)long‐acting insulin analogues – insulin detemir (Levemir), insulin glargine U100 (Lantus), insulin degludec (Tresiba) and insulin glargine U300 (Toujeo) – are currently available on the market.

Adverse effects of the intervention

Compared to human insulin, some insulin analogues have shown higher mitogenic potency and insulin‐like growth factor binding affinity in in vitro and animal studies (Grant 1993; Jorgensen 1992; King 1985; Kurtzhals 2000). These effects differ depending on the insulin analogue, but the results of these studies cannot clarify their relevance for people with diabetes mellitus. The American and European pharmaceutical registration bodies, the Food and Drug Administration (FDA) and the European Medicines Agency (EMA), have commented on the mitogenic and carcinogenic potency of long‐acting insulin analogues and concluded that there appear to be few detrimental effects (EMA 2003; EMA 2004; EMA 2012; FDA 2000; FDA 2005). One cohort study based on data from a large German statutory insurance fund found a dose‐dependent increase in cancer risk for treatment with insulin glargine compared with human insulin (Hemkens 2009).

Epidemiological investigations indicate that higher blood glucose concentrations are associated with a higher risk of developing micro‐ and macrovascular diabetic complications (Adler 1997; Klein 1995; Turner 1998). The United Kingdom Prospective Diabetes Study (UKPDS) showed that lowering blood glucose to near normal levels might reduce microvascular complications (Holman 2008; UKPDS‐33 1998; UKPDS 34 1998). However, the evidence on whether antihyperglycaemic therapy has positive effects on macrovascular complications and mortality is conflicting. Several studies investigating the effects of tight versus less tight glycaemic control have not shown a clear reduction in the risk of macrovascular complications (ACCORD 2008; ADVANCE 2008; Duckworth 2009; Kooy 2009). Furthermore, investigations into different pharmacological interventions have shown a reduction in the risk of complications without a significant simultaneous change in blood glucose concentrations (Marso 2016; Zinman 2015), while others have reported an increase in the risk of mortality and macrovascular complications despite a substantial decrease in blood glucose levels (ACCORD 2008; Singh 2007). In consequence, firm conclusions on the effect of interventions on patient‐relevant outcomes cannot be drawn from the effect of these interventions on blood glucose concentrations alone.

The new long‐ and ultra‐long‐acting insulins are usually more expensive than NPH insulin. While price differences may not be a problem for health services in high‐income countries, they may be important in low‐ and middle‐income countries.

How the intervention might work

Based on the altered time‐action profiles of insulin analogues, several possible advantages in the therapy of people with type 2 diabetes mellitus have been suggested. For instance, it has been hypothesised that the longer action (lower fasting plasma glucose) and the less pronounced peak (less hypoglycaemia, especially during the night) will enable both HbA1c and the risk of hypoglycaemia to be reduced. It has also been suggested that the use of insulin glargine or detemir may improve patients' health‐related quality of life and treatment satisfaction.

Why it is important to do this review

The aim of the original Cochrane Review was to systematically review the clinical efficacy and safety of insulin glargine and detemir in the treatment of people with type 2 diabetes mellitus. Although their pharmacokinetic profiles appeared to indicate that long‐acting insulin analogues improved the insulin therapy of people with type 2 diabetes mellitus, their superiority in a clinical setting had still to be confirmed. This update of the original Cochrane Review was necessary because new trials on the topic have been published and new ultra‐long‐acting insulin analogues – insulin degludec and insulin glargine U300 – have been launched on the market since publication of the original review.

Objectives

To compare the effects of long‐term treatment with (ultra‐)long‐acting insulin analogues (insulin glargine U100 and U300, insulin detemir and insulin degludec) with NPH insulin (human isophane insulin) in adults with type 2 diabetes mellitus.

Methods

Criteria for considering studies for this review

Types of studies

We included randomised controlled trials (RCTs).

Reports for which no full publication existed were only considered for inclusion in this review if the available information met the publication criteria in the CONSORT statement.

Types of participants

Non‐pregnant adults (aged 18 years and older) with type 2 diabetes mellitus.

Types of interventions

We had intended to compare the following interventions with the comparator.

Only trials reporting on subcutaneously administered insulin were considered for inclusion in this review.

Intervention

  • Long‐acting insulin analogues (insulin glargine U100 or insulin detemir) or ultra‐long‐acting insulin analogues (insulin glargine U300 or insulin degludec).

Comparator

  • NPH insulin

Concomitant interventions had to be the same in both the intervention and comparator groups to enable fair comparisons.

Minimum duration of intervention

We considered studies with a minimum intervention duration of 24 weeks. In the case of a cross‐over design, each of the periods had to last at least 24 weeks.

Minimum duration of follow‐up

Minimal duration of follow‐up was 24 weeks. In case of a cross‐over design, duration of follow‐up for each of the periods had to be at least 24 weeks.

We defined extended follow‐up periods (also called open‐label extension studies) as the follow‐up of participants once the original trial as specified in the trial protocol had been terminated. However, such studies are frequently of an observational nature, and we evaluated them only for adverse events (Buch 2011; Megan 2012).

Types of outcome measures

We did not exclude trials solely based on their outcome measures. In case none of our primary or secondary outcomes were reported, we planned to provide at least some basic information in an additional table.

Primary outcomes

  • Diabetes‐related complications.

  • Hypoglycaemic episodes.

  • Health‐related quality of life.

Secondary outcomes

  • All‐cause mortality.

  • Adverse events other than hypoglycaemia.

  • Socioeconomic effects.

  • HbA1c.

Method of outcome measurement

  • Diabetes‐related complications: such as renal failure, amputation, blindness or deterioration in retinopathy, myocardial infarction, stroke, heart failure, revascularisation procedures.

  • Hypoglycaemic episodes: number of severe (as defined in the studies), serious (as defined in the studies), confirmed (as defined in the studies) and confirmed nocturnal hypoglycaemic episodes (as defined in the studies).

  • Health‐related quality of life: evaluated using a validated instrument such as the 36‐item Short Form (SF‐36) or EuroQol 5 Dimension (EQ‐5D).

  • All‐cause mortality: defined as death from any cause and measured at any time after participants had been randomised to intervention or comparator groups.

  • Adverse events other than hypoglycaemia: such as cancer incidence, skin reactions and measured at any time after participants had been randomised to intervention or comparator groups.

  • Socioeconomic effects: such as direct costs defined as admission or readmission rates, mean length of stay, visits to general practitioner, accident/emergency visits; medication consumption; indirect costs defined as resources lost due to illness by the participant or a family member.

  • HbA1c: measured in % or mmol/mol.
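
Because HbA1c may be reported in either unit, values sometimes need converting before they can be compared. The following is a minimal sketch that assumes the widely used NGSP/IFCC master equation; the review itself does not state which conversion was applied, and the function names are hypothetical. Note that a difference between two HbA1c values converts with the slope only, which agrees with the paired values quoted in Table 1 (for example 0.3% and 3.3 mmol/mol).

```python
# Assumed NGSP/IFCC master equation (not stated in the review):
#   NGSP (%) = 0.09148 * IFCC (mmol/mol) + 2.152
SLOPE = 0.09148
INTERCEPT = 2.152


def percent_to_mmol_mol(hba1c_percent: float) -> float:
    """Convert an absolute HbA1c value from % (NGSP) to mmol/mol (IFCC)."""
    return (hba1c_percent - INTERCEPT) / SLOPE


def mmol_mol_to_percent(hba1c_mmol_mol: float) -> float:
    """Convert an absolute HbA1c value from mmol/mol (IFCC) to % (NGSP)."""
    return SLOPE * hba1c_mmol_mol + INTERCEPT


def percent_difference_to_mmol_mol(difference_percent: float) -> float:
    """Convert a difference or SD in % to mmol/mol (the intercept cancels)."""
    return difference_percent / SLOPE


if __name__ == "__main__":
    print(round(percent_to_mmol_mol(7.0), 1))
    print(round(percent_difference_to_mmol_mol(0.3), 1))
```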

Timing of outcome measurement

  • Diabetes‐related complications, hypoglycaemic episodes, all‐cause mortality, adverse events other than hypoglycaemia, socioeconomic effects: measured at any time after participants had been randomised to intervention or comparator groups.

  • Health‐related quality of life: measured at the latest time point of measurement during follow‐up.

  • HbA1c: measured as change between baseline and end of follow‐up.

Search methods for identification of studies

Electronic searches

Searches for the previous review version were conducted to 11 December 2006. For this update, we revised the search strategies.

We searched the following sources without restrictions on the language of publication:

  • from 1 September 2006 to 26 January 2017 (for detailed search strategies, see Appendix 3):

    • Cochrane Central Register of Controlled Trials (CENTRAL) via the Cochrane Register of Studies Online (CRSO) (searched 26 January 2017);

    • MEDLINE Ovid (Epub Ahead of Print, In‐Process & Other Non‐Indexed Citations, Ovid MEDLINE(R) Daily and Ovid MEDLINE(R) 1946 to Present) (searched 26 January 2017);

    • Embase Ovid (1974 to 2017 Week 04) (searched 26 January 2017);

    • ClinicalTrials.gov (www.clinicaltrials.gov) (searched 26 January 2017);

    • World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (www.who.int/trialsearch/) (searched 26 January 2017);

  • from 1 January 2017 to 5 November 2019 (for detailed search strategies, see Appendix 4):

    • CENTRAL via the CRSO (searched 5 November 2019);

    • MEDLINE Ovid (Epub Ahead of Print, In‐Process & Other Non‐Indexed Citations, Ovid MEDLINE(R) Daily and Ovid MEDLINE(R) 1946 to Present) (searched 5 November 2019);

    • ClinicalTrials.gov (www.clinicaltrials.gov) (searched 5 November 2019);

    • WHO ICTRP (www.who.int/trialsearch/) (searched 5 November 2019).

Searching other resources

We tried to identify other potentially eligible trials or ancillary publications by searching the reference lists of included trials, systematic reviews, meta‐analyses and health technology assessment reports. Inquiries were directed to the two main pharmaceutical companies producing long‐acting insulin analogues (Sanofi, Novo Nordisk). In addition, we contacted authors of potentially relevant and included trials to obtain additional information on the retrieved trials and to determine if further trials existed that we might have missed.

We also searched the databases of regulatory agencies (EMA and FDA) (Hart 2012; Schroll 2015).

We considered additional information based on original trial reports published in a report by the German Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen (Institute for Quality and Efficiency in Health Care) (IQWiG 2009), which we defined as grey literature. This report was cited as an additional source. Where journal publications and the IQWiG 2009 report were inconsistent, we preferentially used data from the IQWiG report, since these data were based on original trial reports and therefore deemed more reliable.

We did not use abstracts or conference proceedings for data extraction because this information source does not fulfil the requirements of the CONSORT statement, which is "an evidence‐based, minimum set of recommendations for reporting randomized trials" (CONSORT; Scherer 2018).

Data collection and analysis

Selection of studies

Two review authors (JE and TS or KH) independently scanned the abstract, title or both, of every record we retrieved in the literature searches, to determine which trials we should assess further. We obtained the full text of all potentially relevant records. We resolved any disagreements through consensus or by recourse to a third review author (JE, KH, TS, KJ). If we could not resolve a disagreement, we categorised the trial as a 'study awaiting classification' and contacted the trial authors for clarification. We present an adapted PRISMA flow diagram to show the process of trial selection (Liberati 2009).

Data extraction and management

For trials that fulfilled our inclusion criteria, two review authors (JE, KH, TS, KJ) independently extracted key participant and intervention characteristics. We reported data on efficacy outcomes and adverse events using standardised data extraction sheets from the Cochrane Metabolic and Endocrine Disorders (CMED) Group. We resolved any disagreements by discussion or, if required, by consultation with a third review author (JE, KH, TS, KJ) (for details, see Characteristics of included studies table; Table 1; Appendix 5; Appendix 6; Appendix 7; Appendix 8; Appendix 9; Appendix 10; Appendix 11; Appendix 12; Appendix 13; Appendix 14; Appendix 15; Appendix 16; Appendix 17; Appendix 18; Appendix 19; Appendix 1; Appendix 2; Appendix 20).

Table 1. Overview of trial populations

| Trial ID (study design) | Intervention(s) and comparator(s) | Description of power and sample size calculation | Screened/eligible (n) | Randomised (n) | ITT (n) | Analysed (n) | Finishing trial (n) | Randomised finishing trial (%) | Follow‐up (extended follow‐up)a |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Berard 2015 (parallel RCT) | I: insulin glargine once‐daily | | | 32 | | | | | 6 months |
| | C: NPH insulin once daily or twice daily | | | 34 | | | | | |
| | total: | | | 66 | | | | | |
| Eliaschewitz 2006 (parallel RCT, equivalence design) | I: insulin glargine at bedtime + glimepiride 4 mg/day in the morning | Based on an equivalence region of 0.5% and an SD of 2.0% for the differences in HbA1c between the groups, equivalence can be demonstrated with a statistical power of 80% with 199 participants per group, based on a 1‐sided α = 0.05. A 1:1 randomisation would require 199 evaluable participants in each group. Based on an expectation that 20% of the participants would not be evaluable, the study required the enrolment of 240 in each group. | 918/— | | 231 | Efficacy: 218; Safety: 231 | 218 | | 24 weeks |
| | C: NPH insulin at bedtime + glimepiride 4 mg/day in the morning | | | | 250 | Efficacy: 244; Safety: 250 | 244 | | |
| | total: | | | 528 | 481 | Efficacy: 462; Safety: 481 | 462 | 87.5 | |
| Fajardo Montañana 2008 (parallel RCT) | I: insulin detemir at bedtime | 272 participants (230 evaluable) were required to detect a difference in weight change of 1.5 kg (SD 4.0) between groups after 26 weeks, using a 2‐sided test with a 0.05 significance level. | 345/293 | 126 | 125 | 125 | 119 | 94.4 | 26 weeks |
| | C: NPH insulin at bedtime | | | 151 | 146 | 146 | 139 | 92.1 | |
| | total: | | | 277 | 271 | 271 | 252 | 91.0 | |
| Fritsche 2003 (parallel RCT, non‐inferiority design) | I1: insulin glargine in the morning + glimepiride 3 mg | Based on the assumption of an SD of σ = 2.0%, a difference of Δ = 0.5% for HbA1c reductions among treatment groups can be detected with an α‐error of 0.05 and a β‐error of 0.2. This equates to a statistical power of 80% with 199 participants per group. With use of a 1:1:1 randomisation, 597 participants would be required for this study. Assuming a non‐evaluable rate of 20%, 720 participants (240 per group) would need to be enrolled in this study. | 938/752 | 237 | 236 | 236 | 225 | 94.9 | 24 weeks |
| | I2: insulin glargine at bedtime + glimepiride 3 mg | | | 229 | 227 | 227 | 210 | 91.7 | |
| | C: NPH insulin at bedtime + glimepiride 3 mg | | | 234 | 232 | 232 | 205 | 87.6 | |
| | total: | | | 700 | 695 | 695 | 640 | 91.4 | |
| Haak 2005 (parallel RCT, non‐inferiority design) | I: detemir once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | The study had sufficient power (85%) to detect a mean difference of 0.4% in HbA1c between groups. A 95% 2‐sided CI was constructed for the difference between the group means (insulin detemir − NPH insulin); insulin detemir was deemed non‐inferior if the upper limit of the 95% CI was < 0.4% (absolute). Treatments were considered comparable if the non‐inferiority criterion was fulfilled. | —/— | 341 | 341 | 341 | 315 | 92.4 | 26 weeks |
| | C: NPH insulin once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | | 164 | 164 | 164 | 156 | 95.1 | |
| | total: | | | 505 | 505 | 505 | 471 | 93.3 | |
| Hermanns 2015 (cross‐over RCT) | I: insulin glargine | In a previous cross‐sectional study, different effect sizes of insulin glargine compared to NPH insulin in terms of SF‐12 (d = 0.10), PAID (d = 0.22), and ITEQ (d = 0.29) scores were observed. The mean effect size of all 3 scales was d = 0.166. Since the present study had a cross‐over design, in which each participant served as his/her own control, an effect size on the primary endpoint DRQoL of d = 0.20 was expected. Such an effect can be detected with 90% power using a paired t‐test with a significance level of 5% and with 265 participants. | 460/— | 343; sequence A: 176, sequence B: 167 | 339; sequence A: 175, sequence B: 164 | 229b; sequence A: 118, sequence B: 111 | 296; sequence A: 151, sequence B: 145 | 86.3; sequence A: 85.8, sequence B: 86.8 | 48 weeks (efficacy); 49 weeks (safety) |
| | C: NPH basal insulin | | | | | | | | |
| | total: | | | 343 | 339 | 229b | 296 | 86.3 | |
| Hermansen 2006 (parallel RCT, non‐inferiority design) | I: detemir in the morning and evening | A non‐inferiority criterion, defined as a < 0.4% difference in HbA1c, was calculated to require 198 completers per arm for 95% power with a 5% significance level and with a maximum baseline‐adjusted SD of 1.1% | 735/490 | 237 | 237 | 237 | 227 | 95.8 | 24 weeks |
| | C: NPH insulin in the morning and evening | | | 239 | 238 | 238 | 225 | 94.1 | |
| | total: | | | 476 | 475 | 475 | 452 | 95.0 | |
| Home 2015 (parallel RCT) | I: insulin glargine | It was estimated that at least 568 evaluable participants (670 were randomised with 15% not assessable) needed to be randomised to detect a difference in change of HbA1c of 0.3% (3.3 mmol/mol) at the 5% significance level with 90% power. This assumes an SD of change of HbA1c of 1.1% (12 mmol/mol) | 1102/— | 355 | 352 | 352 | 335 | 94.5 | 36 weeks (efficacy); 37 weeks (safety) |
| | C: NPH insulin | | | 353 | 349 | 349 | 328 | 92.9 | |
| | total: | | | 708 | 701 | 701 | 663 | 93.6 | |
| Hsia 2011 (parallel RCT) | I1: insulin glargine at bedtime | Based on previously published HbA1c levels in oral agent‐treated participants from the study centre, enrolment of 24 in each of the 3 treatment arms (72 total) would provide 95% power to detect an HbA1c difference of 0.8%, at a 5% significance level. | —/108 | 30 | 30 | 30 | 20c | 66.7c | 26 weeks |
| | I2: insulin glargine in the morning | | | 25 | 25 | 25 | 14c | 56.0c | |
| | C: NPH insulin at bedtime | | | 30 | 30 | 30 | 17c | 56.7c | |
| | total: | | | 85 | 85 | 85 | 51c | 60.0c | |
| Kawamori 2003 (parallel RCT, non‐inferiority design) | I: insulin glargine once in the morning + OAD | | —/400 | 167 | 158 | Efficacy: 141; Safety: 158 | 141 | 84.4 | 28 weeks |
| | C: NPH insulin once in the morning + OAD | | | 168 | 159 | Efficacy: 134; Safety: 159 | 134 | 79.8 | |
| | total: | | | 335 | 317 | Efficacy: 275; Safety: 317 | 275 | 82.1 | |
| Kobayashi 2007 A (parallel RCT, non‐inferiority design) | I: insulin detemir once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | 454/401d | 70 | 67 | 67 | 65 | 92.9 | 48 weeks |
| | C: NPH insulin once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | | 35 | 35 | 35 | 32 | 91.4 | |
| | total: | | | 105 | 102 | 102 | 97 | 92.4 | |
| Kobayashi 2007 B (parallel RCT, non‐inferiority design) | I: insulin detemir at bedtime + OAD | | 437/371 | 183 | 180 | 180 | 160 | 87.4 | 36 weeks |
| | C: NPH insulin at bedtime + OAD | | | 188 | 183 | 183 | 172 | 97.5 | |
| | total: | | | 371 | 363 | 363 | 332 | 89.5 | |
| Massi 2003 (parallel RCT) | I: insulin glargine once daily at bedtime + OAD | Based on 1:1 randomisation and using a t‐test, a total number of 384 participants (192 for each group) was required to detect a mean difference of 0.5% glycated haemoglobin between insulin glargine and NPH insulin with a significance level of α = 5% and a statistical power of 90%. It was estimated that a total of 480 participants were to be enrolled to have 384 participants evaluable for efficacy analysis. | 687/— | 293 | 289 | 289 | 277 | 94.5 | 52 weeks |
| | C: NPH insulin once daily at bedtime + OAD | | | 285 | 281 | 281 | 252 | 88.4 | |
| | total: | | | 578 | 570 | 570 | 529 | 91.5 | |
| NCT00687453 (parallel RCT, non‐inferiority design) | I: insulin glargine at bedtime | | 27/24e | 11 | 11 | 11 | 8c | 73c | 6 months |
| | C: NPH insulin in the morning and at bedtime | | | 13 | 13 | 13 | 7c | 54c | |
| | total: | | | 24 | 24 | 24 | 15c | 62.5c | |
| NN304‐1337 (parallel RCT) | I: insulin detemir once daily at bedtime + metformin | | | 309 | 309 | 309 | 266 | 86.1 | 24 weeks |
| | C: NPH insulin once daily at bedtime + metformin | | | 158 | 158 | 158 | 140 | 88.6 | |
| | total: | | | 467 | 467 | 467 | 406 | 86.9 | |
| NN304‐1808 (parallel RCT, non‐inferiority design) | I: insulin detemir once daily before breakfast ± metformin at optimal dose | For 80% power and 5% significance level with a baseline‐adjusted SD of 1.1, a total of 238 completers (119 per group) was required. Owing to a 20% maximal expected frequency of participants lost for follow‐up, 286 were to have been included, 143 in each group. | 124/— | 38 | 38 | 38 | 21c | 55.3c | 7 months |
| | C: NPH insulin once daily before breakfast ± metformin at optimal dose | | | 48 | 48 | 48 | 20c | 41.7c | |
| | total: | | | 86 | 86 | 86 | 41c | 47.7c | |
| NN304‐3614 (parallel RCT) | I: insulin detemir in the evening + insulin aspart each meal | As the primary objective was to demonstrate a difference of 5% in the primary endpoint, using an analysis of covariance model with 3 factors and 1 covariate and with an SD of 5%, the number of participants needed would be 23 per group. Assuming a withdrawal rate of 20%, the total number of randomised participants would be 58. With a planned screening failure of 20%, the total number of participants planned would be 73. | 81/— | 25 | 24 | 24 | 21 | 84.0 | 26 weeks |
| | C: NPH insulin in the evening + insulin aspart each meal | | | 35 | 35 | 35 | 31 | 88.6 | |
| | total: | | | 60 | 59 | 59 | 52 | 86.7 | |
| Pan 2007 (parallel RCT, non‐inferiority design) | I: insulin glargine in the evening + glimepiride 3 mg in the morning | Assuming an SD of 1.6% for the changes from baseline in HbA1c in the 2 groups, and a maximum difference between the groups to be equivalent to 0.4%, the sample size per group was calculated to provide 80% power. The sample size was adjusted for an evaluation rate of 90%, and a total of 440 participants (220 per group) was thus targeted for randomisation. | | 224 | 220 | 220f | 211 | 94.2 | 24 weeks |
| | C: NPH insulin in the evening + glimepiride 3 mg in the morning | | | 224 | 223 | 223f | 214 | 95.5 | |
| | total: | | | 448 | 443 | 443 | 425 | 94.9 | |
| Betônico 2019 (cross‐over RCT, non‐inferiority design) | I: insulin glargine in the morning + insulin lispro at mealtime | A sample size of 34 participants provided 90% power to detect a mean difference of 0.7% (7.7 mmol/mol) in the primary endpoint (HbA1c), considering a 15% dropout rate and assuming an SD of 0.85% and a type I error of 5%. | 193/40 | Period 1 – glargine/period 2 – NPH: 16; Period 1 – NPH/period 2 – glargine: 18 | After period 1: 16; After period 2: 15 | After period 1: 16; After period 2: 15 | Period 1 – glargine/period 2 – NPH: 14; Period 1 – NPH/period 2 – glargine: 15 | Period 1 – glargine/period 2 – NPH: 87.5; Period 1 – NPH/period 2 – glargine: 83.3 | Cross‐over trial, 6 months per period |
| | C: NPH insulin 3 times daily + insulin lispro at mealtime | | | | After period 1: 18; After period 2: 14 | After period 1: 18; After period 2: 14 | | | |
| | total: | | | 34 | After period 1: 34; After period 2: 29 | After period 1: 34; After period 2: 29 | 29 | 85.3 | |
| Riddle 2003 (parallel RCT) | I: insulin glargine once at bedtime + OAD | Based on previous data, randomisation of 750 participants had the power to provide an 85% chance of detecting, with α = 5%, a 10% treatment effect for the primary outcome measure | 1381/764 | 372 | 367 | 367 | 334 | 89.8 | 24 weeks |
| | C: NPH insulin once at bedtime + OAD | | | 392 | 389 | 389 | 357 | 91.1 | |
| | total: | | | 764 | 756 | 756 | 691 | 90.4 | |
| Rosenstock 2001 (parallel RCT) | I: insulin glargine once daily at bedtime + premeal regular insulin | The study was designed to provide 90% power to detect a mean difference of 0.5% in HbA1c between treatment groups | 846g/— | 260 | 259 | 259 | 231 | 88.8 | 28 weeks |
| | C: NPH insulin once at bedtime or twice daily in the morning and at bedtime + premeal regular insulin | | | 261 | 259 | 259 | 238 | 91.2 | |
| | total: | | | 521 | 518 | 518 | 469 | 90.0 | |
| Rosenstock 2009 (parallel RCT, non‐inferiority design) | I: insulin glargine once daily, generally at bedtime | Sample size was calculated assuming a 20% 5‐year event rate for a ≥ 3 step progression in diabetic retinopathy on the Early Treatment of Diabetic Retinopathy Study scale from baseline to end of study (based on data from the Diabetes Control and Complications Trial), and a non‐inferiority margin of 10% (half of the expected background rate of 20%) was chosen. Assuming that approximately 40% of the randomised participants would not be evaluable, a sample size of 840 randomised participants (420 per group) was calculated to provide at least 80% power for declaring non‐inferiority | 1413/— | 515 | 513 | 513 (ITT); 514h (safety population) | 374 | 72.6 | 5 years |
| | C: NPH insulin twice daily, generally in the morning and at bedtime | | | 509 | 504 | 504 (ITT); 503h (safety population) | 364 | 71.5 | |
| | total: | | | 1024 | 1017 | 1017 | 738 | 72.1 | |
| Yki‐Järvinen 2006 (parallel RCT) | I: insulin glargine at bedtime + metformin | The sample size calculation was based on differences observed in a previous study between 11 insulin‐naive participants treated with NPH and metformin and 12 participants treated with glargine and metformin for 1 year in Helsinki. In this study, HbA1c differed by 0.5% at the end of 1 year; the SDs for the groups were not different and averaged 0.87. The mean HbA1c change for the NPH + metformin group was −0.8 (SE 0.2%) (11 participants), and for the glargine + metformin group it was −1.3 (SE 0.3%) (12 participants) at the end of 1 year. Assuming α = 0.05 and 80% power, the required number of participants per group to observe a difference of 0.5% is 50. To allow for a 10% dropout rate, 110 participants were randomised | 157/110 | 61 | 61 | 61 | 60 | 98.4 | 36 weeks |
| | C: NPH insulin at bedtime + metformin | | | 49 | 49 | 49 | 48 | 98.0 | |
| | total: | | | 110 | 110 | 110 | 108 | 98.2 | |
| Yokoyama 2006 (parallel RCT) | I: insulin glargine once at breakfast + aspart/lispro at each meal with or without OADs | | —/— | 31 | | | | | 6 months |
| | C: NPH insulin daily at bedtime + aspart/lispro at each meal with or without OADs | | | 31 | | | | | |
| | total: | | | 62 | | | | | |
| Overall total | All interventions | | | | | | | | |
| | All comparators | | | | | | | | |
| | All interventions and comparators | | | 8677 | | | | | |

— denotes not reported.

aFollow‐up under randomised conditions until end of trial or if not available, duration of intervention; extended follow‐up refers to follow‐up of participants once the original study was terminated as specified in the power calculation.
bModified ITT set for primary endpoint evaluation (including randomised participants with valid values for DRQoL for both treatment periods).
cStudy prematurely discontinued.
dParticipants with type 1 and type 2 diabetes.
e3 participants not randomised due to protocol violations.
fSafety population: 444 participants.
gAccording to European Medicines Agency report.
h1 participant who was randomised to receive NPH insulin received insulin glargine throughout the study, and was consequently counted in the ITT population as an NPH participant, but in the safety population as an insulin glargine participant, leading to a discrepancy in the numbers for the ITT and safety populations in both the insulin glargine and NPH insulin arms.

C: comparator; CI: confidence interval; DRQoL: diabetes‐related quality of life; HbA1c: glycosylated haemoglobin A1c; I: intervention; ITEQ: Insulin Therapy Experience Questionnaire; ITT: intention‐to‐treat; n: number of participants; NPH: neutral protamine Hagedorn; OAD: oral antihyperglycaemic drug; PAID: Problem Areas In Diabetes; RCT: randomised controlled trial; SD: standard deviation; SE: standard error; SF‐12: 12‐item Short Form Health Survey.

We provided information on potentially relevant ongoing trials, including the trial identifier in the Characteristics of ongoing studies table and in Appendix 10 'Matrix of trial endpoint (publications and trial documents)'. We tried to find the protocol for each included trial and we reported primary, secondary and other outcomes in comparison with data in publications in Appendix 10.

We emailed all authors of included trials to enquire whether they would be willing to answer questions on their trials. We presented the results of this survey in Appendix 19. We then asked for relevant missing information on the trial from the primary trial author(s), where necessary.

Dealing with duplicate and companion publications

In the event of duplicate publications, companion documents or multiple reports of a primary trial, we maximised the information yielded by collating all available data and we used the most complete data set aggregated across all known publications. We listed duplicate publications, companion documents, multiple reports of a primary trial and trial documents of included trials (such as trial registry information) as secondary references under the study identifier (ID) of the included trial. Furthermore, we also listed duplicate publications, companion documents, multiple reports of a trial and trial documents of excluded trials (such as trial registry information) as secondary references under the study ID of the excluded trial.

Data from clinical trial registers

If data from included trials were available as study results in clinical trial registers such as ClinicalTrials.gov or similar sources, we made full use of this information and extracted the data. If there was also a full publication of the trial, we collated and critically appraised all available data. If an included trial was marked as a completed study in a clinical trial register but no additional information (study results, publication or both) was available, we added this trial to the Characteristics of excluded studies table.

Assessment of risk of bias in included studies

Two review authors (JE, KH, TS, KJ) independently assessed the risk of bias for each included trial. We resolved disagreements by consensus or by consulting a third review author (JE, KH, TS, KJ). If disagreement persisted, we consulted the remainder of the review author team and made a judgement based on consensus. If adequate information was unavailable from the study publications, study protocols or other sources, we contacted the study authors to request missing data on 'Risk of bias' items.

We used the Cochrane 'Risk of bias' assessment tool (Higgins 2019a), to assign assessments of low, high or unclear risk of bias (for details, see Appendix 5; Appendix 6). We evaluated individual bias items as described in the Cochrane Handbook for Systematic Reviews of Interventions, according to the criteria and associated categorisations contained therein (Higgins 2019a).

Summary assessment of risk of bias

We presented a 'Risk of bias' graph and a 'Risk of bias' summary figure. We distinguished between self‐reported and investigator‐assessed and adjudicated outcome measures.

We considered the following self‐reported outcomes.

  • Diabetes‐related complications, reported by participants.

  • Hypoglycaemia, when measured by participants.

  • Health‐related quality of life.

  • Adverse events other than hypoglycaemia, as reported by participants.

We considered the following outcomes to be investigator‐assessed.

  • Diabetes‐related complications, evaluated/as measured by trial personnel.

  • Hypoglycaemia, when measured by trial personnel.

  • All‐cause mortality.

  • Adverse events other than hypoglycaemia, as measured by trial personnel.

  • Socioeconomic effects.

  • HbA1c.

Risk of bias for a study across outcomes

Some risk of bias domains, such as selection bias (sequence generation and allocation sequence concealment), affect the risk of bias across all outcome measures in a study. In case of high risk of selection bias, we marked all endpoints investigated in the associated study as being at high risk. Otherwise, we did not perform a summary assessment of the risk of bias across all outcomes for a study.

Risk of bias for an outcome within a study and across domains

We assessed the risk of bias for an outcome measure by including all entries relevant to that outcome (i.e. both study‐level entries and outcome‐specific entries). We considered low risk of bias to denote a low risk of bias for all key domains, unclear risk to denote an unclear risk of bias for one or more key domains and high risk to denote a high risk of bias for one or more key domains.
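
The decision rule above can be written down as a short function. This is only an illustrative sketch: the function name and the string labels are hypothetical, and the precedence of 'high' over 'unclear' when both occur among the key domains is an assumption, since the text defines each category separately.

```python
from typing import Iterable

ALLOWED = {"low", "unclear", "high"}


def outcome_risk_of_bias(key_domain_judgements: Iterable[str]) -> str:
    """Summarise key-domain judgements for one outcome within one study.

    Assumption (not stated in the review): 'high' takes precedence over
    'unclear' when both appear among the key domains.
    """
    judgements = [j.strip().lower() for j in key_domain_judgements]
    if not judgements:
        raise ValueError("at least one key domain judgement is required")
    unknown = set(judgements) - ALLOWED
    if unknown:
        raise ValueError(f"unexpected judgement(s): {sorted(unknown)}")
    if "high" in judgements:
        return "high"      # high risk in one or more key domains
    if "unclear" in judgements:
        return "unclear"   # unclear risk in one or more key domains
    return "low"           # low risk in all key domains


if __name__ == "__main__":
    print(outcome_risk_of_bias(["low", "low", "unclear"]))
```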

Risk of bias for an outcome across studies and across domains

To facilitate our assessment of the certainty of the evidence for key outcomes, we assessed risk of bias across studies and domains for the outcomes included in the 'Summary of findings' tables. We defined the evidence as being at low risk of bias when most information came from studies at low risk of bias, unclear risk of bias when most information came from studies at low or unclear risk of bias and high risk of bias when a sufficient proportion of information came from studies at high risk of bias.

Measures of treatment effect

When at least two included trials were available for a comparison and a given outcome, we tried to express dichotomous data as a risk ratio (RR) or odds ratio (OR) with 95% confidence intervals (CI). For continuous outcomes measured on the same scale (e.g. weight loss in kilograms), we estimated the intervention effect using the mean difference (MD) with 95% CI. For continuous outcomes measuring the same underlying concept (e.g. health‐related quality of life) but using different measurement scales, we intended to calculate the standardised mean difference with 95% CI. We also planned to express time‐to‐event data as a hazard ratio with 95% CI.
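
For readers less familiar with these effect measures, the sketch below shows the standard large‐sample formulas for a single trial: RR and OR with 95% CI on the log scale, and MD with 95% CI. The function names and all numbers in the usage lines are hypothetical and are not taken from any included trial; arms with zero events would need a continuity correction, which is not handled here.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def risk_ratio(events_i, n_i, events_c, n_c):
    """Risk ratio (intervention vs comparator) with a 95% CI on the log scale."""
    rr = (events_i / n_i) / (events_c / n_c)
    se_log = math.sqrt(1 / events_i - 1 / n_i + 1 / events_c - 1 / n_c)
    return rr, rr * math.exp(-Z95 * se_log), rr * math.exp(Z95 * se_log)


def odds_ratio(events_i, n_i, events_c, n_c):
    """Odds ratio with a 95% CI on the log scale (Woolf method)."""
    a, b = events_i, n_i - events_i
    c, d = events_c, n_c - events_c
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, or_ * math.exp(-Z95 * se_log), or_ * math.exp(Z95 * se_log)


def mean_difference(mean_i, sd_i, n_i, mean_c, sd_c, n_c):
    """Mean difference (intervention minus comparator) with a 95% CI."""
    md = mean_i - mean_c
    se = math.sqrt(sd_i ** 2 / n_i + sd_c ** 2 / n_c)
    return md, md - Z95 * se, md + Z95 * se


if __name__ == "__main__":
    # Hypothetical counts: 10/100 events versus 20/100 events.
    print(risk_ratio(10, 100, 20, 100))
    print(odds_ratio(10, 100, 20, 100))
    # Hypothetical HbA1c changes (%): -1.3 (SD 0.9) versus -0.8 (SD 0.9).
    print(mean_difference(-1.3, 0.9, 50, -0.8, 0.9, 50))
```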

Unit of analysis issues

We intended to consider the level at which randomisation occurred, such as cross‐over trials, cluster‐randomised trials and multiple observations for the same outcome. If more than one comparison from the same trial was eligible for inclusion in the same meta‐analysis, we either combined groups to create a single pair‐wise comparison or appropriately reduced the sample size so that the same participants did not contribute more than once (splitting the 'shared' group into two or more groups). While the latter approach offers some solution to adjusting the precision of the comparison, it does not account for correlation arising because the same set of participants was included in multiple comparisons (Higgins 2019b).

We planned to reanalyse cluster‐RCTs that had not appropriately adjusted for potential clustering of participants within clusters in their analyses. Variance in the intervention effects would have been inflated by a design effect. Calculation of a design effect would have involved estimation of an intracluster correlation (ICC). We would have obtained estimates of ICCs through contact with authors, or imputed them using estimates from other included trials that reported ICCs, or using external estimates from empirical research (e.g. Bell 2013). We also planned to examine the impact of clustering using sensitivity analyses.
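
As an illustration of the planned adjustment, the sketch below uses the usual approximation in which the design effect equals 1 + (m − 1) × ICC, with m the mean cluster size. The function names and example values are hypothetical, and the review does not report having applied this to any trial.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0 <= icc <= 1:
        raise ValueError("ICC must lie between 0 and 1")
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


def inflate_standard_error(se: float, mean_cluster_size: float, icc: float) -> float:
    """Standard error inflated by the square root of the design effect."""
    return se * math.sqrt(design_effect(mean_cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical trial: 400 participants, clusters of 20, ICC = 0.05.
    print(design_effect(20, 0.05))
    print(effective_sample_size(400, 20, 0.05))
```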

Dealing with missing data

If possible, we obtained missing data from the authors of the included trials. We carefully evaluated important numerical data such as the numbers of screened and randomly assigned participants, as well as the intention‐to‐treat (ITT), as‐treated and per‐protocol populations. We investigated attrition rates (e.g. dropouts, losses to follow‐up, withdrawals), and we critically appraised issues concerning missing data and the use of imputation methods (e.g. last observation carried forward (LOCF)).

In trials where the standard deviation (SD) of the outcome was not available at follow‐up or could not be recreated, we standardised using the mean of the pooled baseline SD from those trials in which this information was reported.

Where included trials did not report means and SDs for outcomes and we did not receive the necessary information from trial authors, we imputed these values by estimating the mean and variance from the median, range and the size of the sample (Hozo 2005).
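The imputation from median, range and sample size can be sketched as follows. The formulas and sample-size thresholds are the ones usually attributed to Hozo 2005 and are stated here as an assumption to be checked against that paper; the function names are hypothetical.

```python
import math


def hozo_mean(minimum: float, median: float, maximum: float, n: int) -> float:
    """Approximate the mean; for larger samples (n > 25) the median is used directly."""
    if n > 25:
        return median
    return (minimum + 2.0 * median + maximum) / 4.0


def hozo_sd(minimum: float, median: float, maximum: float, n: int) -> float:
    """Approximate the SD from the range, using the sample-size bands assumed above."""
    value_range = maximum - minimum
    if n <= 15:
        variance = ((minimum - 2.0 * median + maximum) ** 2 / 4.0 + value_range ** 2) / 12.0
        return math.sqrt(variance)
    if n <= 70:
        return value_range / 4.0
    return value_range / 6.0


# Hypothetical HbA1c summary: median 8.0%, range 6.0% to 12.0%, 20 participants.
example_mean = hozo_mean(6.0, 8.0, 12.0, 20)
example_sd = hozo_sd(6.0, 8.0, 12.0, 20)
```

Because these are approximations, values imputed this way are natural candidates for the sensitivity analyses on imputation.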

We investigated the impact of imputation on meta‐analyses by performing sensitivity analyses and we reported which trials were included with imputed SDs per outcome.

Assessment of heterogeneity

In the event of substantial clinical or methodological heterogeneity, we did not report study results as the pooled effect estimate in a meta‐analysis.

We identified heterogeneity (inconsistency) by visually inspecting the forest plots and by using a standard Chi² test with a significance level of α = 0.1 (Deeks 2019). In view of the low power of this test, we also considered the I² statistic – which quantifies inconsistency across studies – to assess the impact of heterogeneity on the meta‐analysis (Higgins 2002; Higgins 2003). When we identified heterogeneity, we attempted to determine possible reasons for this by examining individual characteristics of the study and subgroups.
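For illustration, Cochran's Q (the Chi² statistic) and I² can be computed from study effect estimates and their standard errors as sketched below. The function names and input values are hypothetical; the P value for Q would additionally need a chi-squared distribution function, which is left out so that the sketch needs no external library.

```python
def cochran_q(effects, standard_errors):
    """Return Cochran's Q and its degrees of freedom using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q: float, df: int) -> float:
    """I² as a percentage: the share of variability in excess of chance, floored at 0."""
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical mean differences in HbA1c (percentage points) from two trials.
q_stat, q_df = cochran_q([-0.5, 0.5], [0.1, 0.1])
inconsistency = i_squared(q_stat, q_df)
```

In this deliberately extreme example the two estimates point in opposite directions with small standard errors, so I² is close to 100%.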

Assessment of reporting biases

If we included 10 or more studies that investigated a particular outcome, we planned to use funnel plots to assess small‐study effects. Several explanations may account for funnel plot asymmetry, including true heterogeneity of effect with respect to study size, poor methodological design (and hence bias of small studies) and selective non‐reporting (Kirkham 2010). Therefore, we interpreted the results carefully (Sterne 2011).

Data synthesis

We planned to undertake (or display) a meta‐analysis only if participants, interventions, comparisons and outcomes were judged to be sufficiently similar to ensure an answer that was clinically meaningful. For the outcome HbA1c, we used the change between baseline and end of follow‐up for the comparison between the two groups. MDs were calculated using a random‐effects model for the meta‐analysis. In some studies, the mean change and its variance for each group were presented in the published report of the trial. In other cases where these estimates were not reported, we had to calculate appropriate variances, if possible, from other statistics presented. The same approach was used for the outcome weight gain (change in body mass index (BMI)). Furthermore, we looked at different episodes of hypoglycaemia (severe, serious, less than 70 mg/dL to 75 mg/dL, less than 50 mg/dL to 55 mg/dL, nocturnal less than 70 mg/dL to 75 mg/dL, nocturnal less than 50 mg/dL to 55 mg/dL) and serious adverse events. For the meta‐analysis of severe and serious hypoglycaemic episodes, we used Peto's OR method, since the event rates were low.
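The Peto one-step method used for rare events can be sketched as follows: each trial contributes an observed-minus-expected event count and a hypergeometric variance, and the pooled log odds ratio is their ratio. The function name, the tuple layout and the event counts are illustrative assumptions, not data from the review.

```python
import math


def peto_odds_ratio(studies):
    """Pool 2x2 tables with the Peto method.

    `studies` is an iterable of (events_treatment, n_treatment, events_control, n_control).
    Returns the pooled odds ratio with an approximate 95% confidence interval.
    """
    sum_o_minus_e = 0.0
    sum_variance = 0.0
    for events_trt, n_trt, events_ctl, n_ctl in studies:
        total_n = n_trt + n_ctl
        total_events = events_trt + events_ctl
        expected = n_trt * total_events / total_n
        variance = (n_trt * n_ctl * total_events * (total_n - total_events)
                    / (total_n ** 2 * (total_n - 1)))
        sum_o_minus_e += events_trt - expected
        sum_variance += variance
    if sum_variance == 0.0:
        raise ValueError("No events (or only events) in all studies; Peto OR is undefined.")
    log_or = sum_o_minus_e / sum_variance
    se = 1.0 / math.sqrt(sum_variance)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


# Hypothetical trials: (events on analogue, n on analogue, events on NPH, n on NPH).
pooled_or, ci_low, ci_high = peto_odds_ratio([(2, 100, 6, 100), (1, 150, 3, 150)])
```

Trials with no events in either arm have zero variance and contribute nothing to the pooled estimate, which is one reason the method is suited to low event rates.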

We interpreted random‐effects meta‐analyses with due consideration for the whole distribution of effects and planned to present a prediction interval (Borenstein 2017a; Borenstein 2017b; Higgins 2009). A prediction interval requires at least three studies to be calculated and specifies a predicted range for the true treatment effect in an individual study (Riley 2011). In addition, we performed statistical analyses according to the statistical guidelines presented in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2019).
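For orientation, the 95% prediction interval is commonly written as below. The notation is assumed here rather than taken from the review: k is the number of studies, μ̂ the pooled random-effects estimate, τ̂² the estimated between-study variance and SE(μ̂) the standard error of the pooled estimate.

```latex
\[
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
\]
```

The t quantile has k − 2 degrees of freedom, which is why the interval cannot be calculated with fewer than three studies.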

Subgroup analysis and investigation of heterogeneity

We expected the following characteristics to introduce clinical heterogeneity, and planned to carry out the following subgroup analyses for the primary outcomes including investigation of interactions.

  • Different additional antihyperglycaemic therapy such as oral antidiabetic drugs (OADs) versus insulin.

  • NPH once daily versus NPH twice or three times daily.

Sensitivity analysis

We intended to perform sensitivity analyses for the primary outcomes to explore the influence of the following factors (when applicable) on effect size by restricting analysis to the following:

  • Published trials.

  • Taking into account risk of bias, as specified in the Assessment of risk of bias in included studies section.

  • Very long or large trials to establish the extent to which they dominated the results.

  • Trials using the following filters: diagnostic criteria, imputation, language of publication, source of funding (industry versus other) or country.

We also tested the robustness of results by repeating the analyses using different measures of effect size (RR, OR, etc.) and different statistical models (fixed‐effect and random‐effects models).
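The comparison of statistical models can be sketched as below for mean differences. The random-effects part uses the DerSimonian and Laird moment estimator of the between-study variance, which is an assumption about the estimator; the function name and inputs are hypothetical.

```python
import math


def pool_mean_differences(mean_differences, standard_errors):
    """Return fixed-effect and DerSimonian-Laird random-effects pooled estimates."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, mean_differences)) / sum_w

    # Between-study variance (tau²) from Cochran's Q, floored at zero.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, mean_differences))
    df = len(mean_differences) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0.0 else 0.0

    random_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    sum_rw = sum(random_weights)
    random = sum(w * y for w, y in zip(random_weights, mean_differences)) / sum_rw

    return {
        "fixed": (fixed, math.sqrt(1.0 / sum_w)),
        "random": (random, math.sqrt(1.0 / sum_rw)),
        "tau2": tau2,
    }


# Hypothetical HbA1c mean differences (percentage points) and standard errors.
result = pool_mean_differences([-0.5, 0.5], [0.1, 0.1])
```

When τ² is estimated as zero the two models coincide; when it is positive the random-effects standard error is larger, which is the contrast this sensitivity analysis is meant to expose.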

Certainty of the evidence

We present the overall certainty of the evidence for each outcome specified below, according to the GRADE approach, which takes into account issues related to internal validity (risk of bias, inconsistency, imprecision, publication bias) and external validity (such as directness of results). Two review authors (JE, KH, TS, KJ) independently rated the certainty of the evidence for each outcome. We resolved any differences in assessment by discussion or by consultation with a third review author (JE, KH, TS, KJ).

We included appendices entitled 'Checklist to aid consistency and reproducibility of GRADE assessments' (Appendix 1; Appendix 2), to help with standardisation of the 'Summary of findings' tables (Meader 2014). Alternatively, we planned to use the GRADEpro GDT software and would have presented evidence profile tables as an appendix (GRADEpro GDT). If meta‐analysis was not possible, we presented the results in a narrative format in the 'Summary of findings' table. We justified all decisions to downgrade the certainty of the evidence using footnotes, and we made comments to aid the reader's understanding of the Cochrane Review when necessary.

'Summary of findings' tables

We presented a summary of the evidence in 'Summary of findings' tables. This provides key information about the best estimate of the magnitude of effect, in relative terms and as absolute differences for each relevant comparison of alternative management strategies; the numbers of participants and studies addressing each important outcome; and a rating of overall confidence in effect estimates for each outcome. We created the 'Summary of findings' tables using the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Schünemann 2019), along with Review Manager 5 software (Review Manager 2020).

Interventions presented in the 'Summary of findings' tables were long‐acting insulin analogues (insulin glargine U100 or insulin detemir). The comparator was NPH insulin.

We reported the following outcomes, listed according to priority.

  • Diabetes‐related complications.

  • Hypoglycaemic episodes.

  • Health‐related quality of life.

  • All‐cause mortality.

  • Adverse events other than hypoglycaemia.

  • Socioeconomic effects.

  • HbA1c.

Results

Description of studies

For a detailed description of trials, see Table 1, and the Characteristics of included studies, Characteristics of excluded studies, Characteristics of studies awaiting classification, and Characteristics of ongoing studies tables.

Results of the search

Using the described strategies, the update searches in 2017 yielded 1969 and in 2019 yielded 2644 results. We identified 92 additional records including the IQWiG report through non‐database sources (IQWiG 2009). After deduplication, 3810 records remained.

After reading the 3810 abstracts, we excluded 3721 articles by consensus as irrelevant to the question under review, leaving 89 articles for further examination. We identified 10 additional publications by handsearching the reference lists of included trials, systematic reviews/meta‐analyses and Health Technology Assessment (HTA) reports or databases from regulatory agencies. After screening the full text of the selected 99 publications and after contacting authors of potentially relevant studies, 28 studies (59 articles/records) met the inclusion criteria. Two of these studies (two records) were potentially relevant ongoing trials and three trials (five records) were classified as studies awaiting assessment. Finally, incorporating 15 additional studies with the nine studies from the previous version of the review, in this review update 24 completed trials (63 articles/records) could be included. For further details see flow diagram in Figure 1.


Trial flow diagram; HTA: health technology assessment; RCT: randomised controlled trial.


Included studies

A detailed description of the characteristics of included trials is presented elsewhere (see Characteristics of included studies table and Appendix 7; Appendix 8; Appendix 9). The following is a short overview.

Source of data

The results of 20 trials were at least partially published in scientific journals between 2000 and 2019. For three of the trials, information and results were mostly obtained from entries in ClinicalTrials.gov and from pharmaceutical manufacturers' study reports. For 14 trials, we relied on additional information that was based on the original study reports published in a report by the Institute for Quality and Efficiency in Health Care (IQWiG 2009). For one trial, the IQWiG report was the only available source of data. We contacted authors of all studies that were not included in the IQWiG report to request missing data or clarify issues regarding the methodology of the trial. Nine of the authors replied but only two of them provided information that was relevant to the review. All other authors were contacted during the preparation of the IQWiG report and their replies incorporated in the IQWiG report, where relevant (see Appendix 19).

Comparisons
Glargine U300 versus NPH

We found no trials comparing glargine U300 with NPH insulin.

Glargine versus NPH

Sixteen trials compared NPH to insulin glargine (Berard 2015; Eliaschewitz 2006; Fritsche 2003; Hermanns 2015; Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; NCT00687453; Pan 2007; Betônico 2019; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006; Yokoyama 2006).

Glargine was given once daily in all trials and was generally administered shortly before retiring to bed (Berard 2015; Eliaschewitz 2006; Home 2015; Massi 2003; NCT00687453; Pan 2007; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). Three trials administered glargine in the morning (Kawamori 2003; Betônico 2019; Yokoyama 2006). Two further trials had two interventional arms, of which one involved taking glargine at bedtime and one in the morning (Fritsche 2003; NCT00687453). Both intervention arms were compared with NPH. In one trial, glargine could be administered at any time, as long as it was at the same time each day (Hermanns 2015).

Most trials administered NPH once daily, either shortly before retiring to bed (Eliaschewitz 2006; Fritsche 2003; Home 2015; Hsia 2011; Massi 2003; Pan 2007; Riddle 2003; Yki‐Järvinen 2006; Yokoyama 2006), or in the morning (Kawamori 2003). Three trials gave NPH once daily at bedtime and once more in the morning if blood glucose targets were not met (Berard 2015; Hermanns 2015; Rosenstock 2001). Two trials compared insulin glargine to NPH insulin both at bedtime and in the morning (NCT00687453; Rosenstock 2009). One trial administered NPH three times daily, at breakfast, lunch and bedtime (Betônico 2019).

Three trials administered short‐acting insulins (either regular insulin or short‐acting insulin analogues) at mealtimes in addition to glargine and NPH (Betônico 2019; Rosenstock 2001; Yokoyama 2006). In nine trials, concomitant medication to lower blood glucose consisted only of OADs (Eliaschewitz 2006; Fritsche 2003; Hsia 2011; Kawamori 2003; Massi 2003; NCT00687453; Pan 2007; Riddle 2003; Yki‐Järvinen 2006). In the four remaining trials, additional blood glucose lowering medication included OADs and, if necessary, short‐acting insulins (Berard 2015; Hermanns 2015; Rosenstock 2009; Yokoyama 2006).

All 16 trials used insulin glargine U100.

Detemir versus NPH

Eight trials compared NPH insulin to insulin detemir (Fajardo Montañana 2008; Haak 2005; Hermansen 2006; Kobayashi 2007 A; Kobayashi 2007 B; NN304‐1337; NN304‐1808; NN304‐3614).

Five of these trials administered detemir and NPH once daily, either shortly before retiring to bed (Fajardo Montañana 2008; Kobayashi 2007 B; NN304‐1337; NN304‐3614), or in the morning (NN304‐1808). Two trials gave detemir and NPH at bedtime and, if necessary, in the morning (Haak 2005; Kobayashi 2007 A), and one study gave them at bedtime and in the morning (Hermansen 2006).

Four studies gave OADs as concomitant medication to lower blood glucose (Hermansen 2006; Kobayashi 2007 B; NN304‐1337; NN304‐1808), three studies gave insulin aspart at mealtimes (Haak 2005; Kobayashi 2007 A; NN304‐3614), and one study gave a combination of insulin aspart and OADs (Fajardo Montañana 2008).

Degludec versus NPH

We found no trials comparing insulin degludec with NPH insulin.

Overview of trial populations
Glargine versus NPH

Overall, 6330 people with type 2 diabetes mellitus were randomised to the different comparison groups. Individual sample sizes ranged from 24 to 1024 participants per study. Between 60% and 95% of randomised participants finished the trials.

Detemir versus NPH

Overall, 2347 people with type 2 diabetes mellitus were randomised to the different comparison groups. Individual sample sizes ranged from 60 to 505 per study. Between 48% and 95% of participants finished the trial.

Trial design
Glargine versus NPH

Fourteen trials had a parallel design (Berard 2015; Eliaschewitz 2006; Fritsche 2003; Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; NCT00687453; Pan 2007; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006; Yokoyama 2006), and two had a cross‐over design (Hermanns 2015; Betônico 2019). Seven trials had a superiority design (Hermanns 2015; Home 2015; Massi 2003; Riddle 2003; Rosenstock 2001; Yki‐Järvinen 2006; Yokoyama 2006), and seven trials had an equivalence/non‐inferiority design (Eliaschewitz 2006; Fritsche 2003; Kawamori 2003; NCT00687453; Pan 2007; Betônico 2019; Rosenstock 2009). It was unclear whether the remaining two trials had a superiority or an equivalence/non‐inferiority design (Berard 2015; Hsia 2011).

Eleven trials had a multicentre design with the number of centres ranging from seven to 111 (Eliaschewitz 2006; Fritsche 2003; Hermanns 2015; Home 2015; Kawamori 2003; Massi 2003; Pan 2007; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). Eight trials involved more than 50 centres (Eliaschewitz 2006; Fritsche 2003; Home 2015; Kawamori 2003; Massi 2003; Riddle 2003; Rosenstock 2001; Rosenstock 2009).

Neither participants nor study personnel or outcome assessors were reported to be blinded in any of the trials.

Trials were performed between 1997 and 2016. This information was not available for seven trials (Berard 2015; Eliaschewitz 2006; Kawamori 2003; Pan 2007; Rosenstock 2001; Yki‐Järvinen 2006; Yokoyama 2006).

The mean duration of intervention was identical to the mean duration of follow‐up in all trials and ranged from six to 60 months.

Seven trials had formal run‐in periods ranging from two to 12 weeks (Fritsche 2003; Home 2015; Hsia 2011; NCT00687453; Riddle 2003; Yki‐Järvinen 2006; Yokoyama 2006).

Two trials were terminated early, one because of a lack of funding (Hsia 2011), and one for unspecified reasons (NCT00687453).

Detemir versus NPH

All trials had a parallel design and all but three had a non‐inferiority design. Of these three trials, two had a superiority design (Fajardo Montañana 2008; NN304‐3614), and in one trial, the design was unclear (NN304‐1337).

Seven trials were multicentre, with the number of centres ranging from five to 65. For one trial, no information on the number of trial centres was available (NN304‐1337).

Neither participants nor study personnel or outcome assessors were reported to be blinded in any of the trials.

Trials were performed between 2003 and 2010. For two trials, no information on when the trials were performed was available (Haak 2005; NN304‐1337).

The mean duration of intervention was similar to the mean duration of follow‐up in all trials and ranged from 24 to 48 weeks.

One trial had a formal run‐in period of two weeks' duration (NN304‐1337).

NN304‐1808 was discontinued prematurely because of recruitment problems.

Settings
Glargine versus NPH

Some information on settings was available for six trials: two trials were conducted in a speciality primary care clinic (Hsia 2011; NCT00687453), one in university hospital facilities (Betônico 2019), two in outpatient facilities (Hermanns 2015; Yokoyama 2006), and one in an inpatient and outpatient care facility (Kawamori 2003).

Detemir versus NPH

No information was available for any of the trials.

Participants
Glargine versus NPH

Only people with type 2 diabetes mellitus were included. Major exclusion criteria were insulin therapy before the initiation of the trial in six trials (Eliaschewitz 2006; Fritsche 2003; Hermanns 2015; Massi 2003; Riddle 2003; Yki‐Järvinen 2006), use of insulin analogues in one trial (Rosenstock 2009), and severe diabetic retinopathy in eight trials (Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; NCT00687453; Pan 2007; Rosenstock 2009; Yki‐Järvinen 2006). The mean duration of diabetes at baseline ranged from eight to 19 years. Participants were mostly of white ethnicity (for those publications with no information on the ethnic composition of the study population, it was inferred from the study locations), with mean age ranging from 50 to 62 years. The proportion of women in the comparison groups varied between 25% and 77%. Most participants were overweight, with mean BMI ranging from 23 kg/m² to 35 kg/m². None of the trials were performed on pharmaco‐naive people (i.e. people whose treatment consisted only of dietary changes, exercise or both). Metabolic control in participants ranged from 6.9% (51 mmol/mol) to 9.6% (81 mmol/mol) HbA1c at baseline. Two trials including 515 participants were conducted in low‐ and middle‐income countries (Eliaschewitz 2006; Betônico 2019). In addition, three further trials had study centres in low‐ and middle‐income countries (Home 2015; Massi 2003; Pan 2007). Two trials conducted in the USA were specifically designed to investigate the effects of glargine versus NPH in low‐income inner‐city ethnic minorities (Hsia 2011; NCT00687453).

Detemir versus NPH

Only people with type 2 diabetes mellitus were included. Major exclusion criteria were severe retinopathy (Fajardo Montañana 2008; Haak 2005; Hermansen 2006; Kobayashi 2007 A; NN304‐1337; NN304‐1808; NN304‐3614), and recurrent major hypoglycaemia (Haak 2005; Hermansen 2006; Kobayashi 2007 A; NN304‐1337; NN304‐1808). Mean duration of diabetes ranged from about 10 to 17 years. Participants were mostly of white or Asian ethnicity (for those publications with no information on the ethnic composition of the study population, it was inferred from the study locations), with mean age ranging from 55 to 78 years in the various comparison groups. Most participants were overweight, with BMI ranging from 22 kg/m² to 32 kg/m². The proportion of women varied from 29% to 62% in the different comparison groups. None of the studies were performed on pharmaco‐naive people. Metabolic control in participants ranged from 7.6% to 9.5% HbA1c at baseline. One trial was partly conducted in a low‐income country, in Puerto Rico as well as the USA (NN304‐1337). All other trials were carried out in Europe and Japan.

Diagnosis
Glargine versus NPH

In one trial, the diagnosis of type 2 diabetes was made according to American Diabetes Association (ADA) criteria valid at the time (Betônico 2019), and in one further trial in accordance with WHO criteria (Pan 2007). For all other trials, the exact criteria used to diagnose type 2 diabetes mellitus were unclear.

Detemir versus NPH

In two trials, the diagnosis of type 2 diabetes was made in accordance with the ADA criteria valid at the time (Haak 2005; NN304‐1808). For all other trials, the exact diagnostic criteria used to diagnose type 2 diabetes mellitus were unclear.

Interventions
Glargine versus NPH

All insulins were administered subcutaneously in all included trials. Dosages of investigative drugs and concomitant medications varied between trials and participants. None of the included trials were placebo controlled.

While administering glargine once daily is adequate, usage instructions recommend adapting the number of daily injections for NPH as necessary, as is common in clinical practice. Thus, for studies limiting NPH to a single daily injection, the comparator could not be considered adequate (Eliaschewitz 2006; Fritsche 2003; Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; Pan 2007; Riddle 2003; Yki‐Järvinen 2006; Yokoyama 2006).

Target values for fasting blood glucose concentration ranged from 4.0 mmol/L to 7.8 mmol/L (72 mg/dL to 140 mg/dL) (Eliaschewitz 2006; Fritsche 2003; Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; Pan 2007; Betônico 2019; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). Two studies set nocturnal blood glucose targets ranging from 4.4 mmol/L to 6.6 mmol/L (80 mg/dL to 120 mg/dL) (Home 2015; Betônico 2019). One study prespecified an additional target for postprandial blood glucose concentration below 10 mmol/L (180 mg/dL) (Betônico 2019). For four studies target values remained unclear (Berard 2015; Hermanns 2015; NCT00687453; Yokoyama 2006).

The design of the investigation by Yokoyama 2006 required upward titration of insulin glargine with the aim of basal insulin making up 50% of the total daily insulin requirement. In contrast to this, the percentage of NPH in the total daily insulin requirement was left unchanged, thus introducing a difference in the treatments of the two comparison groups. This made the trial unfit to identify substance‐specific differences between insulin glargine and NPH insulin. Even though this was the case, we were unable to formally exclude this trial on the basis of our prespecified exclusion criteria. When appropriate, therefore, we conducted a sensitivity analysis that excluded this study.

Detemir versus NPH

All insulins were administered subcutaneously in all included trials. Dosages of investigative drugs and concomitant medications varied among trials and participants. None of the included trials were placebo controlled.

In accordance with usage instructions and common clinical practice, the number of daily injections of NPH should be adjusted as necessary. Thus, the comparator could not be considered adequate in studies limiting NPH to a single daily injection (Fajardo Montañana 2008; Kobayashi 2007 B; NN304‐1337; NN304‐1808; NN304‐3614).

Target values for fasting blood glucose concentration ranged from 4.0 mmol/L to 7.0 mmol/L (72 mg/dL to 126 mg/dL) (Fajardo Montañana 2008; Haak 2005; Hermansen 2006; Kobayashi 2007 A; Kobayashi 2007 B; NN304‐1337). One study set a nocturnal blood glucose target ranging from 4.0 mmol/L to 7.0 mmol/L (72 mg/dL to 126 mg/dL) (Haak 2005). Four studies prespecified an additional target for postprandial blood glucose concentration, ranging from below 6.7 mmol/L (120 mg/dL) to below 10 mmol/L (180 mg/dL) (Fajardo Montañana 2008; Haak 2005; Kobayashi 2007 A; Kobayashi 2007 B). In Hermansen 2006, a predinner blood glucose target of 6.0 mmol/L (108 mg/dL) or below was also set. For two studies, target values remained unclear (NN304‐1808; NN304‐3614).

In trials comparing glargine or detemir to NPH, reported target blood glucose levels used in the adjustment of blood glucose lowering medications were consistent with the target range recommended by ADA for most non‐pregnant adults with diabetes (ADA 2020), but they were generally at the lower end of the recommended target range. Furthermore, they did not take into account ADA's recommendation that the goals be adjusted individually, depending on the duration of diabetes, known CVD and other factors. In the included trials, the participants had diabetes for eight to 19 years at the start of the trial. In current clinical practice, blood glucose targets for many of the trials' participants would probably have been raised. This also casts doubt on the adequacy of the interventions and comparators.
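The paired mmol/L and mg/dL glucose targets quoted in this section can be cross-checked with a simple unit conversion. The sketch below assumes a conversion factor of 18.016 mg/dL per mmol/L, derived from the molar mass of glucose (about 180.16 g/mol); published values are often rounded or computed with a factor of 18, so small discrepancies are to be expected. The function names are hypothetical.

```python
# Assumed conversion factor: 1 mmol/L of glucose is about 18.016 mg/dL.
MG_PER_DL_PER_MMOL_PER_L = 18.016


def mmol_per_l_to_mg_per_dl(value: float) -> float:
    """Convert a blood glucose concentration from mmol/L to mg/dL."""
    return value * MG_PER_DL_PER_MMOL_PER_L


def mg_per_dl_to_mmol_per_l(value: float) -> float:
    """Convert a blood glucose concentration from mg/dL to mmol/L."""
    return value / MG_PER_DL_PER_MMOL_PER_L


# Fasting target range reported for the detemir trials: 4.0 to 7.0 mmol/L.
lower_mg_dl = round(mmol_per_l_to_mg_per_dl(4.0))
upper_mg_dl = round(mmol_per_l_to_mg_per_dl(7.0))
```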

Outcomes
Glargine versus NPH

HbA1c was the defined primary outcome in all but four trials (Berard 2015; Hermanns 2015; Rosenstock 2009; Yokoyama 2006). Further defined primary outcomes were progression of retinopathy in one trial (Rosenstock 2009), and health‐related quality of life in another (Hermanns 2015). For two trials, the primary outcome remained unclear (Berard 2015; Yokoyama 2006). For all but two trials (Berard 2015; Yokoyama 2006), information on adverse events was reported. Reports on late diabetes complications were available for eight trials (Home 2015; Hsia 2011; Massi 2003; NCT00687453; Betônico 2019; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). Fourteen trials either reported mortality directly, or it could be deduced from provided information (Eliaschewitz 2006; Fritsche 2003; Hermanns 2015; Home 2015; Hsia 2011; Kawamori 2003; Massi 2003; NCT00687453; Pan 2007; Betônico 2019; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). Three trials reported information on health‐related quality of life (Hermanns 2015; Massi 2003; Rosenstock 2001). No trial provided information on socioeconomic effects.

Detemir versus NPH

HbA1c was the defined primary outcome in six trials (Haak 2005; Hermansen 2006; Kobayashi 2007 A; Kobayashi 2007 B; NN304‐1337; NN304‐1808). Further defined primary outcomes were weight loss (Fajardo Montañana 2008) and changes in trunk fat mass (NN304‐3614).

All trials reported information on adverse events. All trials either reported on mortality, or it could be deduced from the information provided.

Reports on late diabetes complications were available in five trials (Fajardo Montañana 2008; Haak 2005; Kobayashi 2007 B; NN304‐1337; NN304‐1808).

Three trials reported information on health‐related quality of life (Fajardo Montañana 2008; Haak 2005; Kobayashi 2007 A). No trial provided information on socioeconomic effects.

Excluded studies

Full‐text evaluation in the study selection process of this review update resulted in the exclusion of 36 trials (40 articles/records). The main reasons for exclusion were that the study was a systematic review (meta‐analysis/HTA report), that the comparison was not adequate, and that the study was not a RCT (see Figure 1). For details see Characteristics of excluded studies table.

Studies awaiting classification

We classified three trials with five references as awaiting classification (see Characteristics of studies awaiting classification table). All three trials with an estimated total of 140 participants were listed in ClinicalTrials.gov as having been completed (NCT00788840; NCT01310452; NCT01500850), but no study results were reported and no publications were available. We contacted the investigators for each trial, but none of them replied.

Ongoing trials

We found two potentially relevant ongoing RCTs. One trial investigates the ultra‐long‐acting insulin analogue insulin degludec in comparison to insulin detemir, insulin glargine or NPH insulin in adults aged 60 to 75 years with type 2 diabetes mellitus who are at high risk of developing Alzheimer's disease (EUCTR2017‐004454‐42‐ES). The estimated number of participants in this trial is 188. The trial will assess the rate of hypoglycaemic events and blood glucose measures, which are primary and secondary outcomes in our review. The estimated completion date for the trial is not stated. The second trial investigates the ultra‐long‐acting insulin analogue insulin glargine U‐300 in comparison to NPH insulin in insulin‐naive adults with type 2 diabetes mellitus who are suboptimally controlled on their previous antidiabetic treatment (NCT03389490). The estimated number of participants is 50. Changes in HbA1c and the incidence of hypoglycaemia, which are primary and secondary outcomes in our review, are predefined secondary outcomes in this trial. For further details see Characteristics of ongoing studies table.

Risk of bias in included studies

For details on the risk of bias in the included trials, see Characteristics of included studies table.

For an overview of assessments of each risk of bias item for individual trials and across all trials see Figure 2 and Figure 3.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included trials (blank cells indicate that the particular outcome was not measured in some trials).



Risk of bias summary: review authors' judgements about each risk of bias item for each included trial (blank cells indicate that the particular outcome was not measured in the trial)


Allocation

All included studies were RCTs. Regarding method of randomisation and allocation concealment, we judged seven trials to have a low risk of bias based on the information from journal publications (Fajardo Montañana 2008; Fritsche 2003; Hermansen 2006; Home 2015; Massi 2003; NN304‐1808; Riddle 2003). In seven additional trials (Eliaschewitz 2006; Haak 2005; NN304‐1337; Pan 2007; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006), randomisation and allocation concealment were considered adequate in the IQWiG report (IQWiG 2009), while in one trial randomisation was adequate but allocation concealment unclear (Kawamori 2003). For another trial, detailed information was only available for allocation concealment, while the randomisation method was unclear (Hermanns 2015). The remaining eight trials only reported that participants were randomised without providing any information on the method used (Berard 2015; Hsia 2011; Kobayashi 2007 A; Kobayashi 2007 B; NCT00687453; NN304‐3614; Betônico 2019; Yokoyama 2006). Therefore, we considered these trials to have an unclear risk of bias with regard to randomisation and allocation concealment.

Blinding

Participants or carers were not blinded to the interventions in any of the included trials. Although blinding is difficult in such trials because glargine and detemir are clear solutions while NPH is milky in appearance, the fact remains that an open design, especially with no blinded outcome assessment and poor or unclear concealment of allocation, carries an increased risk of bias.

None of the trials provided explicit information on a blinded outcome assessment. Where measured, all primary and secondary outcomes in this review, except HbA1c, were participant reported, investigator assessed or both. Thirteen trials conducted assessment of HbA1c in central laboratories (Eliaschewitz 2006; Fajardo Montañana 2008; Fritsche 2003; Haak 2005; Hermanns 2015; Hermansen 2006; Home 2015; Hsia 2011; Massi 2003; NN304‐1337; Riddle 2003; Rosenstock 2001; Rosenstock 2009). A blinded outcome assessment can therefore be assumed and we considered these studies to carry a low risk of performance and detection bias for this outcome measure. Non‐serious adverse events and diabetes‐related complications were participant reported or investigator assessed in all but one trial and were, therefore, considered to carry an unclear risk of performance and detection bias. One trial described an adjudicated outcome measurement for diabetic retinopathy with treatment‐group‐masked grading of fundus photographs, and, therefore, carried a low risk of detection bias for this outcome (Rosenstock 2009). Health‐related quality of life measurements and non‐severe or severe hypoglycaemia were exclusively participant reported in the included trials. As blood glucose was self‐measured in all trials, including for confirmed hypoglycaemic events, an increased risk of subjective influence existed. Therefore, these outcomes carried a high risk of performance and detection bias. We considered serious hypoglycaemia (fulfilling at least one criterion of a serious adverse event), serious adverse events themselves and mortality to carry a low risk of performance and detection bias, since the possibility of subjective interference is minimal for these outcome measures.

Incomplete outcome data

Twenty‐two trials reported the number of participants who were randomised and who finished the trial (Eliaschewitz 2006; Fajardo Montañana 2008; Fritsche 2003; Haak 2005; Hermanns 2015; Hermansen 2006; Home 2015; Hsia 2011; Kawamori 2003; Kobayashi 2007 A; Kobayashi 2007 B; Massi 2003; NCT00687453; NN304‐1337; NN304‐1808; NN304‐3614; Pan 2007; Betônico 2019; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006). The percentage of randomised participants completing their respective trials ranged from 48% to 98%. The remaining two trials only stated the number of randomised participants (Berard 2015; Yokoyama 2006).

Nineteen trials used an ITT approach for efficacy outcomes and 22 trials for safety outcomes. Even though none of the trials included all randomised participants in the analyses (so that they were not, in a strict sense, ITT analyses), the difference between randomised and analysed participants was small in all but three trials (Hsia 2011; NCT00687453; NN304‐1808), so we did not judge this to be a substantial problem. Two trials reported per‐protocol analyses of efficacy outcomes, one with an equivalence design (Eliaschewitz 2006) and one with a non‐inferiority design (Kawamori 2003).

Seventeen trials used LOCF in the analyses for missing data (Eliaschewitz 2006; Fajardo Montañana 2008; Fritsche 2003; Haak 2005; Hermanns 2015; Hermansen 2006; Home 2015; Hsia 2011; Kobayashi 2007 A; Kobayashi 2007 B; NCT00687453; NN304‐1337; NN304‐1808; NN304‐3614; Riddle 2003; Rosenstock 2001; Rosenstock 2009). It was unclear how the remaining seven investigations treated missing data in the analyses (Berard 2015; Kawamori 2003; Massi 2003; Pan 2007; Betônico 2019; Yki‐Järvinen 2006; Yokoyama 2006).
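
As a minimal sketch of what LOCF (last observation carried forward) does, assume each participant's measurements are stored visit by visit, with None marking a missed visit. The data layout, the example values and the function name locf are illustrative assumptions and are not taken from any included trial.

```python
from typing import List, Optional


def locf(visits: List[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward.

    Each missing visit is filled with the most recent observed value.
    Visits before the first observation stay missing, because there is
    nothing to carry forward yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical participant who dropped out after the second visit.
hba1c = [8.9, 8.1, None, None]
print(locf(hba1c))
```

LOCF implicitly assumes that a participant's value would have stayed unchanged after dropout, which is one reason why low completion rates matter when judging the risk of bias from incomplete outcome data.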

Twenty‐two trials described discontinuing participants and provided at least some details on the reasons for terminating the trial. Two trials reported neither the number of discontinuing participants nor the reasons for withdrawal (Berard 2015; Yokoyama 2006).

Three trials were at high risk of incomplete outcome data for all outcomes of relevance for this review. The trials with completion rates of only 48% (NN304‐1808), 60% (Hsia 2011), and 63% (NCT00687453) were prematurely discontinued. Two trials stated the reasons for discontinuation, which were recruitment problems (NN304‐1808) and funding limits (Hsia 2011). We considered two trials to have incomplete outcome data for the primary outcome health‐related quality of life (Kobayashi 2007 A; Massi 2003). In one trial, 8% (glargine) and 23% (NPH) of the trial population were not included in health‐related quality of life analyses (Kobayashi 2007 A). In the other trial, 12% (glargine) and 15% (NPH) of the trial population were not included in health‐related quality of life analyses because validated questionnaires were not available in all languages (Massi 2003). In one trial, only per‐protocol analysis was available for HbA1c measurement, with missing data of 16% in the glargine group and 20% in the NPH group (Kawamori 2003). Therefore, this trial was at high risk of reporting incomplete outcome data for this secondary outcome.

Selective reporting

Four trials have not yet been published in peer‐reviewed journals (NCT00687453; NN304‐1337; NN304‐1808; NN304‐3614), and information was only obtained from pharmaceutical manufacturers' study reports, ClinicalTrials.gov entries, the IQWiG report or correspondence with the study investigator. Six trials had a high risk of reporting bias on one or more of the outcomes of relevance for this review (Berard 2015; Home 2015; Kobayashi 2007 A; NCT00687453; NN304‐1808; Yokoyama 2006). Two trials had an unclear risk of reporting bias (Hermanns 2015; Pan 2007). The risk of reporting bias was low in all other trials. For details, see Appendix 10 and Appendix 11.

Other potential sources of bias

We judged three trials at high risk in the 'other bias' section because of their premature termination (Hsia 2011; NCT00687453; NN304‐1808). With the exception of one (Betônico 2019), all other trials had an unclear risk in this section, either because they had received funding from a pharmaceutical company (Berard 2015; Eliaschewitz 2006; Fajardo Montañana 2008; Fritsche 2003; Hermanns 2015; Hermansen 2006; Home 2015; Kawamori 2003; Kobayashi 2007 A; Kobayashi 2007 B; Massi 2003; NN304‐1337; NN304‐3614; Pan 2007; Riddle 2003; Rosenstock 2001; Rosenstock 2009; Yki‐Järvinen 2006), or did not report their funding source (Haak 2005; Yokoyama 2006).

Effects of interventions

See: Summary of findings 1 Insulin glargine versus NPH insulin for type 2 diabetes mellitus; Summary of findings 2 Insulin detemir versus NPH insulin for type 2 diabetes mellitus

Baseline characteristics

For details of baseline characteristics, see Appendix 8 and Appendix 9.

Long‐acting insulin analogue glargine versus NPH insulin

As it is not possible to include a comparison group twice in the same meta‐analysis, we could not consider all treatment arms from two trials in the meta‐analyses (Fritsche 2003; Hsia 2011). For reasons of homogeneity, only the comparison of glargine in the evening versus NPH in the evening was included in our analyses.

For cross‐over studies, we considered only the results from the first period for continuous outcomes, but the results from both periods for dichotomous outcomes (Hermanns 2015; Betônico 2019).

Primary outcomes
Diabetes‐related complications

Two trials reported a total of six non‐fatal myocardial infarctions, three in people treated with glargine and three in people treated with NPH (very low‐certainty evidence) (Pan 2007; Betônico 2019).

Four trials provided information on fatal myocardial infarctions, three in people treated with glargine, and one in a person treated with NPH (very low‐certainty evidence) (Home 2015; Hsia 2011; Betônico 2019; Yki‐Järvinen 2006).

One trial with 32 participants reported no strokes in either group (very low‐certainty evidence) (Betônico 2019). There were no fatal strokes in four trials with 934 participants (very low‐certainty evidence) (Home 2015; Hsia 2011; Betônico 2019; Yki‐Järvinen 2006).

For end‐stage renal disease for glargine versus NPH, there were no events in either group (1 trial, 34 participants; very low‐certainty evidence; Betônico 2019).

There was no evidence of a difference in progression of retinopathy (three steps) (RR 1.03, 95% CI 0.60 to 1.77; P = 0.90; 5 trials, 1974 participants; very low‐certainty evidence; Analysis 1.1). The 95% prediction interval ranged between 0.22 and 4.83.
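
The relationship between a confidence interval and a prediction interval can be sketched as follows. The example uses the common approximation log(RR) ± t × sqrt(τ² + SE²), with t taken from a t distribution with k − 2 degrees of freedom for k trials. The between‐trial variance tau2 = 0.16 is an assumed value chosen for illustration, since it is not reported in this section, and the function name is hypothetical.

```python
import math
from typing import Tuple


def prediction_interval_rr(rr: float, ci_low: float, ci_high: float,
                           tau2: float, t_crit: float) -> Tuple[float, float]:
    """Approximate 95% prediction interval for a pooled risk ratio.

    Works on the log scale: log(RR) +/- t_crit * sqrt(tau2 + SE^2).
    SE is recovered from the reported 95% confidence interval, and
    t_crit is the 97.5th percentile of a t distribution with k - 2
    degrees of freedom (k = number of trials).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    centre = math.log(rr)
    return math.exp(centre - half_width), math.exp(centre + half_width)


# RR and CI as reported for retinopathy progression (5 trials, so
# df = 3 and t_crit = 3.182). tau2 = 0.16 is an ASSUMED value.
low, high = prediction_interval_rr(1.03, 0.60, 1.77, tau2=0.16, t_crit=3.182)
print(round(low, 2), round(high, 2))
```

Under this assumed heterogeneity the sketch gives an interval close to the reported 0.22 to 4.83. The prediction interval is much wider than the confidence interval because it adds the between‐trial variance to the uncertainty of the pooled estimate.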

Amputations: there were no events in either group (1 trial, 34 participants; very low‐certainty evidence; Betônico 2019).

None of the trials provided further information on late diabetic complications.

Hypoglycaemic episodes

The RR for severe hypoglycaemia was 0.68 (95% CI 0.46 to 1.01 in random‐effects meta‐analysis; P = 0.06; 14 trials, 6164 participants; very low‐certainty evidence; Analysis 1.2). The 95% prediction interval ranged between 0.33 and 1.40. Fixed‐effect meta‐analysis showed a RR of 0.67 (95% CI 0.50 to 0.90; P = 0.007).
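
The difference between the two models can be checked approximately from the published numbers alone. The sketch below assumes normality on the log scale and recovers the two‐sided P value from each ratio and its 95% CI; the function name is hypothetical and this is not the software used in the review.

```python
import math


def p_from_ratio_ci(estimate: float, ci_low: float, ci_high: float) -> float:
    """Two-sided P value implied by a ratio estimate and its 95% CI,
    assuming approximate normality on the log scale."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(estimate) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


p_random = p_from_ratio_ci(0.68, 0.46, 1.01)  # random-effects result
p_fixed = p_from_ratio_ci(0.67, 0.50, 0.90)   # fixed-effect result
print(round(p_random, 3), round(p_fixed, 3))
```

Both values land near the reported P = 0.06 and P = 0.007; small discrepancies are expected because the published limits are rounded to two decimals. The point estimates are almost identical, so the different conclusions stem from the wider random‐effects interval.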

The RR for serious hypoglycaemia was 0.75 (95% CI 0.52 to 1.09; P = 0.54; 10 trials, 4685 participants; low‐certainty evidence; Analysis 1.3). The 95% prediction interval ranged between 0.48 and 1.16.

The RR for confirmed hypoglycaemia less than 75 mg/dL was 0.92 (95% CI 0.85 to 1.01; P = 0.08; 7 trials, 4115 participants; low‐certainty evidence; Analysis 1.4). The 95% prediction interval ranged between 0.69 and 1.22.

The RR for confirmed hypoglycaemia less than 55 mg/dL was 0.88 in favour of glargine (95% CI 0.81 to 0.96; P = 0.005; 8 trials, 4388 participants; moderate‐certainty evidence; Analysis 1.5). The 95% prediction interval ranged between 0.79 and 0.98.

The RR for nocturnal confirmed hypoglycaemia less than 75 mg/dL was 0.78 (95% CI 0.68 to 0.89; P < 0.001; 8 trials, 4225 participants; very low‐certainty evidence; Analysis 1.6). The 95% prediction interval ranged between 0.53 and 1.14.

The RR for nocturnal confirmed hypoglycaemia less than 55 mg/dL was 0.74 in favour of glargine (95% CI 0.64 to 0.85; P < 0.001; 8 trials, 4759 participants; moderate‐certainty evidence; Analysis 1.7). The 95% prediction interval ranged between 0.62 and 0.88.

Health‐related quality of life

Three trials reported health‐related quality of life (1228 participants; very low‐certainty evidence).

Massi 2003 used the Well‐being Questionnaire (W‐BQ22). The difference between trial start and trial end for total score was 1.0 (95% CI –45.0 to 32.0) for glargine and 0.0 (95% CI –25.2 to 46.2) for NPH (P = 0.40).

Rosenstock 2001 used the W‐BQ22. The difference between trial start and trial end for total score was 0.5 (95% CI –22.0 to 36.0) for glargine and 0.0 (95% CI –37.0 to 39.0) for NPH (P = 0.25).

Hermanns 2015 used the EuroQol 5 (EQ‐5) instrument. The difference between trial start and trial end for EQ‐5 descriptive was –0.009 (SD 0.1727) for glargine and 0.001 (SD 0.1606) for NPH (P = 0.62). The difference between trial start and trial end for EQ‐5 Visual Analogue Scale (VAS) was –0.0 (SD 0.1646) for glargine and 0.009 (SD 0.1655) for NPH (P = 0.64).

Secondary outcomes
All‐cause mortality

The Peto OR for death from any cause was 1.06 (95% CI 0.62 to 1.82; P = 0.83; 14 trials, 6173 participants; low‐certainty evidence; Analysis 1.8).
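
For rare events such as deaths, the Peto method pools trials through observed‐minus‐expected event counts and hypergeometric variances. The following is a minimal sketch of that calculation; the 2×2 counts are hypothetical and do not come from the included trials, and the function name is an assumption.

```python
import math
from typing import List, Tuple

# (events_treatment, n_treatment, events_control, n_control)
Trial = Tuple[int, int, int, int]


def peto_odds_ratio(trials: List[Trial]) -> Tuple[float, float, float]:
    """Pooled Peto odds ratio with a 95% confidence interval."""
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for a, n1, c, n2 in trials:
        n = n1 + n2
        events = a + c
        if events == 0 or events == n:
            continue  # no events (or only events): no information
        expected = n1 * events / n
        variance = n1 * n2 * events * (n - events) / (n ** 2 * (n - 1))
        sum_o_minus_e += a - expected
        sum_v += variance
    if sum_v == 0:
        raise ValueError("no trial contributes information")
    log_or = sum_o_minus_e / sum_v
    se = 1 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


# Hypothetical trials; the second has no deaths and is skipped.
example = [(2, 300, 1, 290), (0, 150, 0, 148), (3, 500, 4, 510)]
odds_ratio, ci_low, ci_high = peto_odds_ratio(example)
print(round(odds_ratio, 2), round(ci_low, 2), round(ci_high, 2))
```

Trials without any events drop out of the calculation, which is why the number of contributing trials can be smaller than the number of trials reporting the outcome.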

Adverse events other than hypoglycaemia

The RR for serious adverse events was 0.98 (95% CI 0.87 to 1.10; P = 0.74; 13 trials, 5499 participants; moderate‐certainty evidence; Analysis 1.9). The 95% prediction interval ranged between 0.86 and 1.12.

The RR for all adverse events was 1.01 (95% CI 0.98 to 1.03; P = 0.62; 14 trials, 6170 participants; moderate‐certainty evidence; Analysis 1.10). The 95% prediction interval ranged between 0.99 and 1.03.

The RR for adverse events leading to discontinuation of the trial was 1.21 (95% CI 0.84 to 1.76; P = 0.30; 13 trials, 6149 participants; moderate‐certainty evidence; Analysis 1.11). The 95% prediction interval ranged between 0.79 and 1.84.

There was an increase in weight gain (BMI) in favour of NPH insulin (MD 0.12 kg/m², 95% CI 0.02 to 0.22; P = 0.02; 8 trials, 2405 participants; Analysis 1.12). The 95% prediction interval ranged between 0.06 kg/m² and 0.26 kg/m².

The RR for adverse skin reactions was 1.06 (95% CI 0.83 to 1.35; P = 0.63; 10 trials, 4735 participants; Analysis 1.13). The 95% prediction interval ranged between 0.80 and 1.41.

The RR for eye‐related adverse events was 1.08 (95% CI 0.86 to 1.35; P = 0.52; 9 trials, 4204 participants; Analysis 1.14). The 95% prediction interval ranged between 0.83 and 1.41.

Socioeconomic effects

No study investigated socioeconomic effects.

HbA1c

The MD in HbA1c was –0.07% (95% CI –0.18 to 0.03; P = 0.17; 16 trials, 5809 participants; low‐certainty evidence; Analysis 1.15). The 95% prediction interval ranged between –0.46% and 0.32%.

Subgroup analyses

We performed subgroup analyses for trials with OADs as concomitant blood glucose lowering medications versus trials with short‐acting insulin as a concomitant blood glucose lowering medication for hypoglycaemic events (severe, serious, confirmed, nocturnal confirmed) and progression of diabetic retinopathy. Interaction was only found for the outcome progression of retinopathy. Retinopathy progression (three steps) for OADs showed a RR of 0.73 (95% CI 0.38 to 1.38, P = 0.33). Retinopathy progression (three steps) for short‐acting insulin showed a RR of 2.75 (95% CI 1.10 to 6.91, P = 0.03) (test for subgroup difference: I² = 81.6%; P = 0.02).

We performed subgroup analyses for trials with NPH administration once daily versus NPH administration more than once daily for hypoglycaemic events (severe, serious, confirmed, nocturnal confirmed) and progression of diabetic retinopathy. Interaction was only found for the outcome nocturnal confirmed hypoglycaemia less than 75 mg/dL. Nocturnal confirmed hypoglycaemia less than 75 mg/dL for NPH once daily showed a RR of 0.74 (95% CI 0.62 to 0.89; P = 0.001). Nocturnal confirmed hypoglycaemia less than 75 mg/dL for NPH more than once daily showed a RR of 0.91 (95% CI 0.82 to 1.01; P = 0.11) (test for subgroup difference: I² = 74.7%; P = 0.05).
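
The subgroup interaction tests reported above can be approximately reproduced from the published subgroup estimates. The sketch below assumes normality on the log scale and handles exactly two subgroups, so that the heterogeneity statistic Q has one degree of freedom; the function name is hypothetical.

```python
import math
from typing import Tuple

Estimate = Tuple[float, float, float]  # (RR, ci_low, ci_high)


def subgroup_difference(first: Estimate, second: Estimate) -> Tuple[float, float, float]:
    """Test for a difference between two subgroup risk ratios.

    Returns (Q, I2 in percent, P). Q is compared with a chi-squared
    distribution with 1 degree of freedom.
    """
    def log_and_variance(rr: float, lo: float, hi: float) -> Tuple[float, float]:
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        return math.log(rr), se ** 2

    y1, v1 = log_and_variance(*first)
    y2, v2 = log_and_variance(*second)
    q = (y1 - y2) ** 2 / (v1 + v2)
    i2 = max(0.0, (q - 1) / q) * 100 if q > 0 else 0.0
    p = math.erfc(math.sqrt(q / 2))  # upper tail of chi-squared, 1 df
    return q, i2, p


# Retinopathy progression: OADs versus short-acting insulin.
q_ret, i2_ret, p_ret = subgroup_difference((0.73, 0.38, 1.38), (2.75, 1.10, 6.91))
# Nocturnal confirmed hypoglycaemia < 75 mg/dL: NPH once daily versus more often.
q_noc, i2_noc, p_noc = subgroup_difference((0.74, 0.62, 0.89), (0.91, 0.82, 1.01))
print(round(i2_ret, 1), round(p_ret, 2))
print(round(i2_noc, 1), round(p_noc, 2))
```

The results are close to the reported values (I² = 81.6%, P = 0.02 and I² = 74.7%, P = 0.05); exact agreement is not expected because the inputs are rounded to two decimals.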

We did not conduct any further subgroup analyses because there were not enough trials or events to evaluate effects.

Sensitivity analyses

We performed sensitivity analyses for the following factors: very long (follow‐up more than 12 months) and very large trials (more than 1000 participants). Trials did not differ enough in terms of other variables to allow meaningful sensitivity analyses. Analyses including and excluding the results from Yokoyama 2006 were also not feasible. We investigated the robustness of the pooled results by repeating the analyses using different statistical models (fixed‐ and random‐effects).

For nocturnal confirmed hypoglycaemia less than 75 mg/dL, the analysis of all studies showed a RR of 0.78 (95% CI 0.68 to 0.89; P = 0.003). When restricted to very large and very long studies, the RR was 0.92 (95% CI 0.82 to 1.02; P = 0.11). The meta‐analyses' results remained robust for all other outcomes.

Results also remained robust when we repeated the analyses using different statistical models.
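
To illustrate what repeating an analysis under a different statistical model involves, the sketch below pools the same hypothetical log risk ratios with an inverse‐variance fixed‐effect model and with a random‐effects model. It assumes the DerSimonian‐Laird estimator for the between‐trial variance, because this section does not state which estimator was used; all trial values are invented for illustration.

```python
import math
from typing import List, Tuple


def pool(log_effects: List[float], variances: List[float],
         tau2: float = 0.0) -> Tuple[float, float]:
    """Inverse-variance pooled estimate and its standard error.
    tau2 = 0 gives the fixed-effect model."""
    weights = [1 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, log_effects)) / total
    return estimate, math.sqrt(1 / total)


def dersimonian_laird_tau2(log_effects: List[float],
                           variances: List[float]) -> float:
    """Moment estimate of the between-trial variance."""
    weights = [1 / v for v in variances]
    fixed, _ = pool(log_effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    if c <= 0:
        return 0.0  # a single trial: heterogeneity cannot be estimated
    return max(0.0, (q - (len(log_effects) - 1)) / c)


# Hypothetical log risk ratios and their variances for four trials.
log_rr = [math.log(0.55), math.log(0.80), math.log(0.95), math.log(0.60)]
variances = [0.04, 0.02, 0.01, 0.09]

tau2 = dersimonian_laird_tau2(log_rr, variances)
for label, (est, se) in (("fixed", pool(log_rr, variances)),
                         ("random", pool(log_rr, variances, tau2))):
    print(label, round(math.exp(est), 2),
          round(math.exp(est - 1.96 * se), 2),
          round(math.exp(est + 1.96 * se), 2))
```

When the estimated between‐trial variance is zero, both models give the same result; otherwise the random‐effects interval is wider, so a pooled result is usually called robust when its direction and interpretation do not change between the two.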

Assessment of reporting bias

We drew a funnel plot for the primary endpoint severe hypoglycaemia (14 trials; Figure 4). The funnel plot did not indicate any increased risk of publication bias.


Funnel plot of comparison: 1 Insulin glargine versus NPH insulin, outcome: 1.2 Severe hypoglycaemia.

Long‐acting insulin analogue detemir versus NPH insulin

Primary outcomes
Diabetes‐related complications

Three trials reported a total of three non‐fatal myocardial infarctions, two in participants treated with detemir and one in a participant treated with NPH (very low‐certainty evidence) (Fajardo Montañana 2008; Haak 2005; NN304‐3614).

Fajardo Montañana 2008 reported no fatal myocardial infarctions in either group. No other trial reported on this endpoint.

Three trials reported stroke (Fajardo Montañana 2008; Haak 2005; NN304‐1337). One trial reported no fatal strokes in any group (NN304‐1337). The other two reported two non‐fatal events in the detemir groups and one non‐fatal event in the NPH groups. The certainty of evidence was very low.

Fajardo Montañana 2008 reported that no end‐stage renal disease occurred in any of the participants. No further information was available in the other trials.

Three‐step progression of retinopathy for detemir versus NPH showed a RR of 1.50 (95% CI 0.68 to 3.32; P = 0.32; 2 trials; 972 participants; very low‐certainty evidence; Analysis 2.1).

Fajardo Montañana 2008 reported that no participants underwent amputations.

No further information on late diabetic complications was available in any of the trials.

Hypoglycaemic episodes

The RR for severe hypoglycaemia was 0.45 (95% CI 0.17 to 1.20; P = 0.63; 5 trials, 1804 participants; very low‐certainty evidence; Analysis 2.2). The 95% prediction interval ranged between 0.09 and 2.21.

The Peto OR for serious hypoglycaemia was 0.16 (95% CI 0.04 to 0.61; P = 0.007; 5 trials, 1777 participants; low‐certainty evidence; Analysis 2.3).

The RR for confirmed hypoglycaemia less than 75 mg/dL was 0.73 in favour of detemir (95% CI 0.61 to 0.86; P < 0.001; 4 trials, 1718 participants; very low‐certainty evidence; Analysis 2.4). The 95% prediction interval ranged between 0.36 and 1.48.

The RR for confirmed hypoglycaemia less than 55 mg/dL was 0.48 in favour of detemir (95% CI 0.32 to 0.71; P < 0.001; 4 trials, 1718 participants; low‐certainty evidence; Analysis 2.5). The 95% prediction interval ranged between 0.20 and 1.13.

The RR for nocturnal confirmed hypoglycaemia less than 75 mg/dL was 0.57 in favour of detemir (95% CI 0.47 to 0.68; P < 0.001; 4 trials, 1718 participants; low‐certainty evidence; Analysis 2.6). The 95% prediction interval ranged between 0.39 and 0.84.

The RR for nocturnal confirmed hypoglycaemia less than 55 mg/dL was 0.32 in favour of detemir (95% CI 0.16 to 0.63; P = 0.001; 4 trials; 1718 participants; low‐certainty evidence; Analysis 2.7). The 95% prediction interval ranged between 0.07 and 1.42.

Health‐related quality of life

Three trials reported information on health‐related quality of life (Fajardo Montañana 2008; Haak 2005; Kobayashi 2007 B). As the trials used different instruments for measuring health‐related quality of life, a meta‐analysis was not feasible.

Haak 2005 used the Diabetes Health Profile 2 (DHP‐2) questionnaire. The MD in barriers to activity for detemir compared with NPH was –0.16 (95% CI –2.45 to 2.13; P = 0.89). The MD in disinhibited eating for detemir compared with NPH was 1.34 (95% CI –1.52 to 4.20; P = 0.36). The MD in psychological distress for detemir compared with NPH was –0.19 (95% CI –2.46 to 2.09; P = 0.87).

Fajardo Montañana 2008 used the SF‐36 questionnaire. The MD in total score physical health for detemir compared with NPH was 2.83 (95% CI –1.56 to 7.23; P = 0.21). The MD in total score mental health for detemir compared with NPH was 4.19 (95% CI –0.22 to 8.61; P = 0.06).

Kobayashi 2007 A used the Insulin Therapy Related Quality Of Life At Night (ITR‐QOLN) instrument. The MD in total score for detemir compared with NPH was 1.7 (95% CI –4.4 to 7.8; P = 0.58).

Secondary outcomes
All‐cause mortality

The Peto OR for death from any cause was 0.74 (95% CI 0.20 to 2.65; P = 0.64; 8 trials, 2328 participants; low‐certainty evidence; Analysis 2.8).

Adverse events other than hypoglycaemia

The RR for serious adverse events was 0.88 (95% CI 0.64 to 1.20; P = 0.40; 8 trials, 2328 participants; moderate‐certainty evidence; Analysis 2.9). The 95% prediction interval ranged between 0.60 and 1.30.

The RR for all adverse events was 1.03 (95% CI 0.96 to 1.11; P = 0.35; 8 trials, 2328 participants; moderate‐certainty evidence; Analysis 2.10). The 95% prediction interval ranged between 0.94 and 1.13.

The RR for adverse events leading to discontinuation of the trial was 1.22 (95% CI 0.67 to 2.25; P = 0.52; 8 trials, 2328 participants; moderate‐certainty evidence; Analysis 2.11). The 95% prediction interval ranged between 0.57 and 2.62.

The MD for weight gain (BMI) was –0.60 kg/m² (95% CI –0.88 to –0.32; P < 0.001; 1 trial; 278 participants) (Fajardo Montañana 2008).

The RR for adverse skin reactions was 1.28 (95% CI 0.63 to 2.59; P = 0.50; 5 trials; 1777 participants; Analysis 2.12).

The RR for eye‐related adverse events was 0.75 (95% CI 0.41 to 1.37; P = 0.34; 6 trials; 1386 participants; Analysis 2.13).

Socioeconomic effects

No study investigated socioeconomic effects.

HbA1c

The MD for HbA1c was 0.13% (95% CI –0.02 to 0.28; P = 0.08; 7 trials, 2233 participants; very low‐certainty evidence; Analysis 2.14). The 95% prediction interval ranged between –0.28% and 0.54%.

Subgroup analyses

Subgroup analyses in trials with OADs versus short‐acting insulin as concomitant blood glucose‐lowering medications either showed no substantial differences (severe, serious, confirmed and confirmed nocturnal hypoglycaemia), or were not feasible because of the low number of trials and events. This was also true for subgroup analyses involving trials administering NPH once daily versus trials administering NPH at least twice daily.

Sensitivity analyses

We performed sensitivity analyses for the following factors: published or unpublished trials, and commercially or non‐commercially funded trials. Trials did not differ enough in terms of other variables to allow for meaningful additional sensitivity analyses. We also investigated the robustness of the pooled results by repeating the analyses using different statistical models (fixed‐ and random‐effects).

Severe hypoglycaemia including all trials found an OR of 0.37 (95% CI 0.15 to 0.92; P = 0.03). Severe hypoglycaemia including published trials only found an OR of 0.43 (95% CI 0.17 to 1.13; P = 0.09). The results of the meta‐analyses remained robust for all other outcomes.

The results of the meta‐analyses remained robust for all outcomes after exclusion of non‐commercially funded studies and following comparisons using different statistical models.

Assessment of reporting bias

We did not draw funnel plots due to the limited number of trials (eight).

Discussion

Summary of main results

With regard to diabetes complications, information on myocardial infarction, stroke, amputations and end‐stage renal disease was available from only a few trials, with a small number of events. No trustworthy inferences could be drawn from these results. There were more data on retinopathy; however, meta‐analyses did not show statistically significant or clinically relevant differences between treatment with glargine or detemir and NPH.

There were no clear differences for all‐cause mortality when comparing treatment with long‐acting insulin‐analogues to NPH treatment. Information was available from almost all included trials and the number of people dying during a trial was low.

Three trials comparing glargine to NPH and three further trials comparing detemir to NPH reported outcomes on health‐related quality of life utilising mostly different instruments. None of them found substantial differences between treatment with glargine or detemir and NPH.

We found no substantial differences between interventions and comparators in the frequency of adverse events.

No trial reported on socioeconomic effects.

Treatment of people with type 2 diabetes mellitus with insulin glargine and insulin detemir compared to NPH insulin resulted in no substantial differences in hypoglycaemic episodes. HbA1c lowering was comparable between treatments. Serious hypoglycaemia was somewhat lower following insulin detemir treatment compared to NPH insulin. Both insulin glargine and insulin detemir showed lower confirmed (nocturnal) hypoglycaemia rates in comparison to NPH insulin.

We considered no evidence to be of high certainty and all trials to have an unclear or high risk of bias in one or more risk of bias domains.

Overall completeness and applicability of evidence

This Cochrane Review is the most current and comprehensive systematic review to compare the effects of (ultra‐)long‐acting insulin analogues with those of NPH insulin. We have included 16 trials (6330 participants) comparing glargine to NPH and eight trials (2342 participants) comparing detemir to NPH.

We conducted an extensive search for trials, included publications in all languages, and tried to obtain additional data on all trials. However, the provision of additional data was limited. We also took into consideration additional information published in a report by the German Institute for Quality and Efficiency in Health Care that was based on the original trial reports (IQWiG 2009).

The diagnostic criteria for type 2 diabetes mellitus were not specified in most of the included trials. Participants had diabetes for eight years or longer at the beginning of all trials.

In the included trials, blood glucose targets set for insulin dose titration were comparable to those set in studies comparing the effects of a near‐normal blood glucose reduction with a less intense reduction. In some cases, they were even lower. In other words, the trials aimed to achieve a near‐normal reduction in blood glucose levels for the participants. This contrasts with the recommendations of professional associations for the individual setting of target values (ADA 2020). For example, a more moderate therapy target is recommended for people with a long duration of illness, significant comorbidity or diabetes‐associated complications, and for people with limited life expectancy and resources (ADA 2020). In fact, since all trial participants had the disease for a relatively long time, higher target levels may well have been more appropriate. Incidence of serious or severe hypoglycaemia is directly associated with the intensity of blood glucose lowering. It follows that less stringent blood glucose or HbA1c target values will result in less frequent major hypoglycaemic events, and that the absolute risk reduction achievable with insulin analogues will then be smaller. Therefore, results from the studies at hand are only applicable to people in whom very low blood glucose concentrations are targeted.
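
The dependence of the absolute effect on the baseline event rate can be illustrated with simple arithmetic. The sketch below applies the point estimate for severe hypoglycaemia with glargine (RR 0.68, whose confidence interval includes no effect) to three hypothetical comparator risks. The baseline risks are assumptions chosen for illustration, and the calculation presumes that the relative effect stays constant across target levels.

```python
def absolute_risk_reduction(baseline_risk: float, risk_ratio: float) -> float:
    """Absolute risk reduction when a constant risk ratio is applied
    to the baseline (comparator) risk."""
    return baseline_risk * (1 - risk_ratio)


RR_SEVERE = 0.68  # point estimate reported for glargine versus NPH

# Hypothetical risks of severe hypoglycaemia in the NPH arm, from
# strict to less stringent glucose targets (assumed values).
for baseline in (0.06, 0.03, 0.01):
    arr = absolute_risk_reduction(baseline, RR_SEVERE)
    print(f"baseline {baseline:.0%}: ARR {arr:.2%}, "
          f"about {1 / arr:.0f} treated per event avoided")
```

With the same relative effect, a lower baseline risk gives a proportionally smaller absolute reduction, which is the reasoning behind restricting applicability to people treated to very low glucose targets.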

However, even for those people for whom a blood glucose reduction to near normal concentrations can be considered an adequate treatment goal, the trial results provided only limited information about the different effects of insulin analogues and NPH insulin. Most studies (10/16 glargine trials and 5/8 detemir trials) limited NPH to a single injection per day. Also, an adjustment of the blood glucose‐lowering comedications (short‐acting insulin or oral glucose‐lowering agents) was not possible. Neither restriction corresponds to current good clinical practice.

Because of the limited applicability of the results it remains unclear if the same effects will be observed in daily clinical practice. An indication that this is not the case is provided by the study of Lipska and colleagues, a retrospective observational study using data from a large US health management organisation (Lipska 2018). The authors found that initiating therapy with a long‐acting insulin analogue was not associated with a reduced risk of hypoglycaemic emergency department visits, hospital admissions or improved glycaemic control compared to NPH insulin.

We identified no trials comparing NPH insulin therapy with ultra‐long‐acting insulin analogue treatment. Data can only be derived from indirect comparisons based on the results from trials comparing long‐acting and ultra‐long‐acting insulin analogues. The results of a network meta‐analysis suggested that, compared to NPH insulin, ultra‐long‐acting insulin analogues reduce the risk of hypoglycaemic events (Madenidou 2018). However, regarding the applicability of the results, in addition to the uncertainty of indirect comparisons, restrictions apply as again the set target titration values corresponded to a near‐normal blood glucose reduction (Gerstein 2012; Rosenstock 2018).

Quality of the evidence

For most patient‐important outcomes, information was absent or very limited, and it came from only a small number of trials. Furthermore, the reported frequency of such outcomes was low. Duration of follow‐up was 12 months or less for all studies but one, which lasted for 60 months (Rosenstock 2009).

None of the included 24 studies could be classified as having low risk of bias for all risk of bias domains. The major shortcoming in all included trials was that neither participants nor study personnel were blinded to the respective interventions. Although blinding of participants and personnel would have been complicated, no effort was made to at least provide for a blinded outcome assessment. While this is of lesser concern for objective outcomes such as centrally measured HbA1c or death from any cause, it means that more subjective and self‐reported outcomes such as hypoglycaemia are at high risk of bias.

This is even more important when the bias‐prone definitions of hypoglycaemia in the included trials are taken into consideration. Patients may inappropriately deny having had severe hypoglycaemia and in this context 'third party help' is a soft and variable description of severity. More robust definitions such as 'injection of glucose or glucagon by a third person' may result in more reliable data (Mühlhauser 1998). Since classification of hypoglycaemia as a serious adverse event requires the presence of specific additional criteria (ICH 2016), severe hypoglycaemic events, which simultaneously fulfil at least one of these criteria, are less vulnerable to bias. To minimise the risk of bias, apart from severe or serious hypoglycaemia, we only considered events for which a confirmed blood glucose measurement was available (ADA 2005). But even for confirmed hypoglycaemia, the risk of bias was high as participants may have chosen not to report events or may have made mistakes when transcribing blood glucose readings.

In addition, the methods of randomisation and allocation concealment remained unclear in many trials.

Pharmaceutical companies producing long‐acting insulin analogues funded most trials. Some argue that systematic bias favours products that are made by the company funding the research. Explanations include the selection of an inappropriate comparator to the product under investigation, and publication bias (Lexchin 2003).

Major reasons for downgrading the certainty of the evidence (GRADE) were lack of blinding of participants, study personnel and outcome‐assessment, inconsistency (95% prediction intervals) and imprecision (small numbers of studies reporting on outcomes and low frequency of events).

Potential biases in the review process

As part of the original review, we contacted all authors of the available trials and producers of insulin‐analogues and requested missing data and clarification when risk of bias domains could not be adequately assessed. For this update, we contacted all authors of the additionally identified trials, unless they had been included in the IQWiG report. This was because additional data had already been provided for the IQWiG report. We again contacted the two pharmaceutical companies manufacturing insulin glargine and insulin detemir. We also sought additional data from documents available from the USA and European medical agencies and trial registries. The IQWiG report was an important source of data because it included data provided by Sanofi and Novo Nordisk. But despite these efforts, a large quantity of data were still missing, which limited our investigations of the effects of the different insulins on a large number of outcomes and influenced our assessments of risk of bias and certainty of the evidence.

We excluded trials that lasted less than 24 weeks. While this is consistent with our effort to investigate the long‐term effects of long‐acting insulin analogues compared to NPH, especially on diabetes‐related complications, it also resulted in fewer data on outcomes such as HbA1c and hypoglycaemia.

We were unable to draw funnel plots because of the small number of trials comparing detemir to NPH.

It was only possible to investigate heterogeneity by conducting subgroup and sensitivity analyses for a limited number of outcomes and variables.

Two review authors extracted the data. However, the review authors extracting the data were not blinded to the trial they were extracting data from.

Agreements and disagreements with other studies or reviews

Several other systematic reviews and meta‐analyses have investigated the effects of treatment with (ultra‐)long‐acting insulin analogues compared to treatment with NPH (Bi 2012; Freemantle 2016; Frier 2013; Owens 2017; Rys 2015). However, they all differ from this Cochrane Review in several aspects: only Bi 2012 compared insulin glargine and insulin detemir to NPH insulin, but did not provide separate analyses and results for the two insulin‐analogues. Owens 2017 only included trials comparing glargine U100 once daily in combination with OAD with treatment with NPH once daily combined with OADs. Freemantle 2016 compared glargine U300 to other basal insulins in a network meta‐analysis. Rys 2015 included trials comparing NPH with glargine and Frier 2013 included trials comparing NPH with detemir. With the exception of Owens 2017, trials of less than 24 weeks' duration were also considered. Frier 2013 did not conduct any meta‐analyses.

Despite these differences, the reported results were similar. All authors reported comparable effects of (ultra‐)long‐acting insulin analogues and NPH on HbA1c and weight gain, and they all found a lower rate of nocturnal hypoglycaemia when using long‐acting insulin‐analogues compared to NPH. Owens 2017 and Bi 2012 also found lower rates of overall hypoglycaemia and Rys 2015 found less symptomatic hypoglycaemia when treating with insulin analogues. In Owens 2017 and Rys 2015, there were no beneficial effects of insulin analogues on severe hypoglycaemia.

Figure 1. Trial flow diagram; HTA: health technology assessment; RCT: randomised controlled trial.

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included trials (blank cells indicate that the particular outcome was not measured in some trials).

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included trial (blank cells indicate that the particular outcome was not measured in the trial).

Figure 4. Funnel plot of comparison: 1 Insulin glargine versus NPH insulin, outcome: 1.2 Severe hypoglycaemia.

Analysis 1.1. Comparison 1: Insulin glargine versus NPH insulin, Outcome 1: Diabetes‐related complications (progression in retinopathy)

Analysis 1.2. Comparison 1: Insulin glargine versus NPH insulin, Outcome 2: Severe hypoglycaemia

Analysis 1.3. Comparison 1: Insulin glargine versus NPH insulin, Outcome 3: Serious hypoglycaemia

Analysis 1.4. Comparison 1: Insulin glargine versus NPH insulin, Outcome 4: Confirmed hypoglycaemia (blood glucose (BG) < 75 mg/dL)

Analysis 1.5. Comparison 1: Insulin glargine versus NPH insulin, Outcome 5: Confirmed hypoglycaemia (BG < 55 mg/dL)

Analysis 1.6. Comparison 1: Insulin glargine versus NPH insulin, Outcome 6: Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL)

Analysis 1.7. Comparison 1: Insulin glargine versus NPH insulin, Outcome 7: Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL)

Analysis 1.8. Comparison 1: Insulin glargine versus NPH insulin, Outcome 8: All‐cause mortality

Analysis 1.9. Comparison 1: Insulin glargine versus NPH insulin, Outcome 9: Adverse events other than hypoglycaemia (serious adverse effects)

Analysis 1.10. Comparison 1: Insulin glargine versus NPH insulin, Outcome 10: Adverse events other than hypoglycaemia (all adverse events (AE))

Analysis 1.11. Comparison 1: Insulin glargine versus NPH insulin, Outcome 11: Adverse events other than hypoglycaemia (AEs leading to discontinuation)

Analysis 1.12. Comparison 1: Insulin glargine versus NPH insulin, Outcome 12: Adverse events other than hypoglycaemia (weight gain)

Analysis 1.13. Comparison 1: Insulin glargine versus NPH insulin, Outcome 13: Adverse events other than hypoglycaemia (skin reactions)

Analysis 1.14. Comparison 1: Insulin glargine versus NPH insulin, Outcome 14: Adverse events other than hypoglycaemia (eye related AEs)

Analysis 1.15. Comparison 1: Insulin glargine versus NPH insulin, Outcome 15: Glycosylated haemoglobin (HbA1c)

Analysis 2.1. Comparison 2: Insulin detemir vs NPH insulin, Outcome 1: Diabetes‐related complications (progression in retinopathy)

Analysis 2.2. Comparison 2: Insulin detemir vs NPH insulin, Outcome 2: Severe hypoglycaemia

Analysis 2.3. Comparison 2: Insulin detemir vs NPH insulin, Outcome 3: Serious hypoglycaemia

Analysis 2.4. Comparison 2: Insulin detemir vs NPH insulin, Outcome 4: Confirmed hypoglycaemia (blood glucose (BG) < 75 mg/dL)

Analysis 2.5. Comparison 2: Insulin detemir vs NPH insulin, Outcome 5: Confirmed hypoglycaemia (BG < 55 mg/dL)

Analysis 2.6. Comparison 2: Insulin detemir vs NPH insulin, Outcome 6: Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL)

Analysis 2.7. Comparison 2: Insulin detemir vs NPH insulin, Outcome 7: Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL)

Analysis 2.8. Comparison 2: Insulin detemir vs NPH insulin, Outcome 8: All‐cause mortality

Analysis 2.9. Comparison 2: Insulin detemir vs NPH insulin, Outcome 9: Adverse events other than hypoglycaemia (serious adverse events)

Analysis 2.10. Comparison 2: Insulin detemir vs NPH insulin, Outcome 10: Adverse events other than hypoglycaemia (all adverse events (AE))

Analysis 2.11. Comparison 2: Insulin detemir vs NPH insulin, Outcome 11: Adverse events other than hypoglycaemia (AEs leading to discontinuation)

Analysis 2.12. Comparison 2: Insulin detemir vs NPH insulin, Outcome 12: Adverse events other than hypoglycaemia (skin reactions)

Analysis 2.13. Comparison 2: Insulin detemir vs NPH insulin, Outcome 13: Adverse events other than hypoglycaemia (eye‐related AEs)

Analysis 2.14. Comparison 2: Insulin detemir vs NPH insulin, Outcome 14: Glycosylated haemoglobin (HbA1c)

Summary of findings 1. Insulin glargine versus NPH insulin for type 2 diabetes mellitus

Patient: participants with type 2 diabetes mellitus

Intervention: insulin glargine

Comparison: NPH insulin (human isophane insulin)

| Outcomes | Risk for NPH insulin | Risk for insulin glargine | Relative effect (95% CI) | No of participants (trials) | Certainty of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Diabetes‐related complications. Follow‐up: 6 months to 36 weeks | | | | | | |
| (1) Fatal MI | See comment | See comment | See comment | 934 (4 RCTs) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that 3/352 participants in the glargine 100 IU group vs 0/349 participants in the NPH group experienced fatal MI; 3 additional trials with 233 participants reported that no fatal MI occurred. |
| (2) Fatal stroke | See comment | See comment | See comment | 934 (4 RCTs) | ⊕⊝⊝⊝ Very low (a) | No fatal strokes occurred. |
| (3) Progression in retinopathy | 101 per 1000 | 104 per 1000 (60 to 178) | RR 1.03 (0.60 to 1.77) | 1947 (5 RCTs) | ⊕⊝⊝⊝ Very low (b) | The 95% prediction interval ranged between 0.22 and 4.83. |
| (4) Amputations | See comment | See comment | See comment | 34 (1 RCT) | ⊕⊝⊝⊝ Very low (c) | 1 trial reported that no amputation or ESRD occurred. |
| (5) ESRD | See comment | See comment | See comment | 34 (1 RCT) | ⊕⊝⊝⊝ Very low (c) | 1 trial reported that no amputation or ESRD occurred. |
| Hypoglycaemic episodes. Follow‐up: 24 weeks to 5 years | | | | | | |
| (1) Severe hypoglycaemia | 37 per 1000 | 25 per 1000 (17 to 37) | RR 0.68 (0.46 to 1.01) | 6164 (14 RCTs) | ⊕⊝⊝⊝ Very low (d) | The 95% prediction interval ranged between 0.33 and 1.40. |
| (2) Serious hypoglycaemia | 27 per 1000 | 20 per 1000 (14 to 29) | RR 0.75 (0.52 to 1.09) | 4685 (10 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.48 and 1.16. |
| (3) Confirmed hypoglycaemia (BG < 75 mg/dL) | 572 per 1000 | 526 per 1000 (486 to 578) | RR 0.92 (0.85 to 1.01) | 4115 (7 RCTs) | ⊕⊝⊝⊝ Very low (f) | The 95% prediction interval ranged between 0.69 and 1.22. |
| (4) Confirmed hypoglycaemia (BG < 55 mg/dL) | 180 per 1000 | 159 per 1000 (146 to 173) | RR 0.88 (0.81 to 0.96) | 4388 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.79 and 0.98. |
| (5) Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 351 per 1000 | 274 per 1000 (239 to 312) | RR 0.78 (0.68 to 0.89) | 4225 (8 RCTs) | ⊕⊝⊝⊝ Very low (f) | The 95% prediction interval ranged between 0.53 and 1.14. |
| (6) Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 115 per 1000 | 85 per 1000 (74 to 98) | RR 0.74 (0.64 to 0.85) | 4759 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.62 and 0.88. |
| HRQoL. Follow‐up: 28 weeks to 48 weeks | See comment | See comment | See comment | 1228 (3 RCTs) | ⊕⊝⊝⊝ Very low (h) | 3 trials reported no statistically significant differences between glargine groups and NPH groups in HRQoL total scores (W‐BQ22; EQ‐5) or any subscales. |
| All‐cause mortality. Follow‐up: 24 weeks to 5 years | 8 per 1000 | 9 per 1000 (5 to 15) | Peto OR 1.06 (0.62 to 1.82) | 6173 (14 RCTs) | ⊕⊕⊝⊝ Low (i) | |
| AEs other than hypoglycaemia. Follow‐up: 24 weeks to 5 years | | | | | | |
| (1) SAE | 135 per 1000 | 132 per 1000 (117 to 148) | RR 0.98 (0.87 to 1.10) | 5499 (13 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.86 and 1.12. |
| (2) Overall AE | 662 per 1000 | 669 per 1000 (649 to 682) | RR 1.01 (0.98 to 1.03) | 6170 (14 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.99 and 1.03. |
| (3) AE leading to discontinuation | 17 per 1000 | 20 per 1000 (14 to 30) | RR 1.21 (0.84 to 1.76) | 6149 (13 RCTs) | ⊕⊕⊕⊝ Moderate (j) | The 95% prediction interval ranged between 0.79 and 1.84. |
| Socioeconomic effects | Not reported | | | | | |
| HbA1c. Follow‐up: 24 weeks to 5 years | The mean change in HbA1c ranged across control groups from –2.12% to +0.1% | The mean change in HbA1c in the intervention groups was 0.07% lower (0.18% lower to 0.03% higher) | | 5809 (16 RCTs) | ⊕⊕⊝⊝ Low (k) | The 95% prediction interval ranged between –0.46% and 0.32%. |

AE: adverse event; BG: blood glucose; CI: confidence interval; EQ‐5(D): EuroQol 5 (Dimension); ESRD: end‐stage renal disease; HbA1c: glycosylated haemoglobin A1c; HRQoL: health‐related quality of life; MD: mean difference; MI: myocardial infarction; NPH: neutral protamine Hagedorn; OR: odds ratio; RCT: randomised controlled trial; RR: risk ratio; SAE: serious adverse event; W‐BQ22: Well‐Being Questionnaire (22 items).

GRADE Working Group grades of evidence

High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

a Downgraded three levels because of risk of bias and serious imprecision (very sparse data) – see Appendix 1.
b Downgraded three levels because of risk of bias, inconsistency and imprecision – see Appendix 1.
c Downgraded three levels because of indirectness and serious imprecision (very sparse data) – see Appendix 1.
d Downgraded three levels because of risk of bias, imprecision and inconsistency – see Appendix 1.
e Downgraded two levels because of risk of bias and imprecision – see Appendix 1.
f Downgraded three levels because of risk of bias, inconsistency and imprecision – see Appendix 1.
g Downgraded one level because of risk of bias – see Appendix 1.
h Downgraded three levels because of risk of bias and serious imprecision – see Appendix 1.
i Downgraded two levels because of risk of bias and imprecision – see Appendix 1.
j Downgraded one level because of imprecision – see Appendix 1.
k Downgraded two levels because of inconsistency and imprecision – see Appendix 1.
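
As a rough illustration of how the "Risk for insulin glargine" entries relate to the other columns, the sketch below multiplies the assumed NPH risk by the risk ratio and by each limit of its 95% CI. This is the usual convention for summary of findings tables with a risk ratio; the function name `corresponding_risk` is an assumption made for this example, and rows reported as Peto OR would need an odds-based conversion that is not shown here.

```python
def corresponding_risk(assumed_per_1000, rr, ci_low, ci_high):
    """Scale an assumed comparator risk (per 1000) by a risk ratio and its 95% CI limits.

    Hypothetical helper for illustration; returns (point estimate, lower, upper),
    each rounded to whole participants per 1000.
    """
    return tuple(round(assumed_per_1000 * factor) for factor in (rr, ci_low, ci_high))


# Severe hypoglycaemia: assumed NPH risk 37 per 1000, RR 0.68 (0.46 to 1.01)
print(corresponding_risk(37, 0.68, 0.46, 1.01))
```

With the severe hypoglycaemia inputs this gives 25 per 1000 (17 to 37), which matches the table. Small differences for other rows can arise because the published inputs are themselves rounded.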

Summary of findings 2. Insulin detemir versus NPH insulin for type 2 diabetes mellitus

Patient: participants with type 2 diabetes mellitus

Intervention: insulin detemir

Comparison: NPH insulin (human isophane insulin)

| Outcomes | Risk for NPH insulin | Risk for insulin detemir | Relative effect (95% CI) | No of participants (trials) | Certainty of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Diabetes‐related complications. Follow‐up: 24 weeks to 26 weeks | | | | | | |
| (1) Fatal MI | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no fatal MI or fatal stroke occurred. |
| (2) Fatal stroke | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no fatal MI or fatal stroke occurred. |
| (3) Progression in retinopathy | 25 per 1000 | 37 per 1000 (17 to 82) | RR 1.50 (0.68 to 3.32) | 972 (2 RCTs) | ⊕⊝⊝⊝ Very low (a) | – |
| (4) Amputations | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no amputation or ESRD occurred. |
| (5) ESRD | See comment | See comment | See comment | 271 (1 RCT) | ⊕⊝⊝⊝ Very low (a) | 1 trial reported that no amputation or ESRD occurred. |
| Hypoglycaemic episodes. Follow‐up: 24 weeks to 7 months | | | | | | |
| (1) Severe hypoglycaemia | 17 per 1000 | 8 per 1000 (3 to 21) | RR 0.45 (0.17 to 1.20) | 1804 (5 RCTs) | ⊕⊝⊝⊝ Very low (b) | The 95% prediction interval ranged between 0.09 and 2.21. |
| (2) Serious hypoglycaemia | 11 per 1000 | 2 per 1000 (0 to 7) | Peto OR 0.16 (0.04 to 0.61) | 1777 (5 RCTs) | ⊕⊕⊝⊝ Low (c) | – |
| (3) Confirmed hypoglycaemia (BG < 75 mg/dL) | 562 per 1000 | 410 per 1000 (343 to 484) | RR 0.73 (0.61 to 0.86) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (d) | The 95% prediction interval ranged between 0.36 and 1.48. |
| (4) Confirmed hypoglycaemia (BG < 55 mg/dL) | 493 per 1000 | 237 per 1000 (158 to 350) | RR 0.48 (0.32 to 0.71) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.20 and 1.13. |
| (5) Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 309 per 1000 | 176 per 1000 (145 to 210) | RR 0.57 (0.47 to 0.68) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.39 and 0.84. |
| (6) Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 40 per 1000 | 13 per 1000 (6 to 25) | RR 0.32 (0.16 to 0.63) | 1718 (4 RCTs) | ⊕⊕⊝⊝ Low (e) | The 95% prediction interval ranged between 0.07 and 1.42. |
| Health‐related quality of life. Follow‐up: 26 weeks to 36 weeks | See comment | See comment | See comment | 873 (3 RCTs) | ⊕⊝⊝⊝ Very low (b) | 3 trials reported no statistically significant difference between detemir groups and NPH groups in HRQoL total scores (ITR‐QOLN; DHP‐2; SF‐36) or any subscales. |
| All‐cause mortality. Follow‐up: 24 weeks to 48 weeks | 5 per 1000 | 4 per 1000 (1 to 13) | Peto OR 0.74 (0.20 to 2.65) | 2328 (8 RCTs) | ⊕⊕⊝⊝ Low (f) | |
| AEs other than hypoglycaemia. Follow‐up: 24 weeks to 48 weeks | | | | | | |
| (1) SAE | 71 per 1000 | 62 per 1000 (45 to 85) | RR 0.88 (0.64 to 1.20) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.60 and 1.30. |
| (2) Overall AE | 611 per 1000 | 629 per 1000 (586 to 678) | RR 1.03 (0.96 to 1.11) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.94 and 1.13. |
| (3) AE leading to discontinuation | 18 per 1000 | 22 per 1000 (12 to 40) | RR 1.22 (0.67 to 2.25) | 2328 (8 RCTs) | ⊕⊕⊕⊝ Moderate (g) | The 95% prediction interval ranged between 0.57 and 2.62. |
| Socioeconomic effects | Not reported | | | | | |
| HbA1c. Follow‐up: | The mean change in HbA1c ranged across control groups from –1.9% to –0.32% | The mean change in HbA1c in the intervention groups was 0.13% higher (0.02% lower to 0.28% higher) | | 2233 (7 RCTs) | ⊕⊕⊝⊝ Low (h) | The 95% prediction interval ranged between –0.28% and 0.54%. |

AE: adverse event; BG: blood glucose; CI: confidence interval; DHP‐2: Diabetes Health Profile 2; ESRD: end‐stage renal disease; HbA1c: glycosylated haemoglobin A1c; HRQoL: health‐related quality of life; ITR‐QOLN: insulin therapy‐related quality of life at night; MD: mean difference; MI: myocardial infarction; NPH: neutral protamine Hagedorn; OR: odds ratio; RR: risk ratio; SAE: serious adverse event; SF‐36: 36‐item Short Form Health Survey.

GRADE Working Group grades of evidence

High certainty: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

a Downgraded three levels because of risk of bias and serious imprecision (very sparse data) – see Appendix 2.
b Downgraded three levels because of risk of bias and serious imprecision – see Appendix 2.
c Downgraded two levels because of risk of bias and imprecision – see Appendix 2.
d Downgraded two levels because of risk of bias and inconsistency – see Appendix 2.
e Downgraded two levels because of risk of bias and imprecision – see Appendix 2.
f Downgraded two levels because of serious imprecision – see Appendix 2.
g Downgraded one level because of imprecision – see Appendix 2.
h Downgraded two levels because of inconsistency and imprecision – see Appendix 2.
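
The footnotes above state how many levels each outcome was downgraded. A minimal sketch of how that count maps to the certainty label and the ⊕/⊝ symbols follows, assuming (as GRADE does for randomised trials) that the rating starts at High. The names `LEVELS` and `certainty_after_downgrading` are made up for this illustration.

```python
# GRADE certainty levels, ordered from highest to lowest.
LEVELS = ["High", "Moderate", "Low", "Very low"]


def certainty_after_downgrading(levels_down, start="High"):
    """Return (label, symbols) after downgrading `levels_down` levels from `start`.

    Hypothetical helper: the rating cannot drop below "Very low".
    """
    idx = min(LEVELS.index(start) + levels_down, len(LEVELS) - 1)
    filled = len(LEVELS) - idx  # number of filled circles for this level
    symbols = "⊕" * filled + "⊝" * (len(LEVELS) - filled)
    return LEVELS[idx], symbols


for n in (1, 2, 3):
    print(n, certainty_after_downgrading(n))
```

One level down gives Moderate (⊕⊕⊕⊝), two give Low (⊕⊕⊝⊝) and three give Very low (⊕⊝⊝⊝), which can be used to cross-check the label and the symbols in each row.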

Table 1. Overview of trial populations

| Trial ID (study design) | Intervention(s) and comparator(s) | Description of power and sample size calculation | Screened/eligible (n) | Randomised (n) | ITT (n) | Analysed (n) | Finishing trial (n) | Randomised finishing trial (%) | Follow‐up (extended follow‐up) (a) |
|---|---|---|---|---|---|---|---|---|---|
| Berard 2015 (parallel RCT) | I: insulin glargine once daily | | | 32 | | | | | 6 months |
| | C: NPH insulin once daily or twice daily | | | 34 | | | | | |
| | total: | | | 66 | | | | | |
| Eliaschewitz 2006 (parallel RCT, equivalence design) | I: insulin glargine at bedtime + glimepiride 4 mg/day in the morning | Based on an equivalence region of 0.5% and an SD of 2.0% for the differences in HbA1c between the groups, equivalence can be demonstrated with a statistical power of 80% with 199 participants per group, based on a 1‐sided α = 0.05. A 1:1 randomisation would require 199 evaluable participants in each group. Based on an expectation that 20% of the participants would not be evaluable, the study required the enrolment of 240 in each group. | 918/— | | 231 | Efficacy: 218; Safety: 231 | 218 | | 24 weeks |
| | C: NPH insulin at bedtime + glimepiride 4 mg/day in the morning | | | | 250 | Efficacy: 244; Safety: 250 | 244 | | |
| | total: | | | 528 | 481 | Efficacy: 462; Safety: 481 | 462 | 87.5 | |
| Fajardo Montañana 2008 (parallel RCT) | I: insulin detemir at bedtime | 272 participants (230 evaluable) were required to detect a difference in weight change of 1.5 kg (SD 4.0) between groups after 26 weeks, using a 2‐sided test with a 0.05 significance level. | 345/293 | 126 | 125 | 125 | 119 | 94.4 | 26 weeks |
| | C: NPH insulin at bedtime | | | 151 | 146 | 146 | 139 | 92.1 | |
| | total: | | | 277 | 271 | 271 | 252 | 91.0 | |
| Fritsche 2003 (parallel RCT, non‐inferiority design) | I1: insulin glargine in the morning + glimepiride 3 mg | Based on the assumption of an SD of σ = 2.0%, a difference of Δ = 0.5% for HbA1c reductions among treatment groups can be detected with an α‐error of 0.05 and a β‐error of 0.2. This equates to a statistical power of 80% with 199 participants per group. With use of a 1:1:1 randomisation, 597 participants would be required for this study. Assuming a non‐evaluable rate of 20%, 720 participants (240 per group) would need to be enrolled in this study. | 938/752 | 237 | 236 | 236 | 225 | 94.9 | 24 weeks |
| | I2: insulin glargine at bedtime + glimepiride 3 mg | | | 229 | 227 | 227 | 210 | 91.7 | |
| | C: NPH insulin at bedtime + glimepiride 3 mg | | | 234 | 232 | 232 | 205 | 87.6 | |
| | total: | | | 700 | 695 | 695 | 640 | 91.4 | |
| Haak 2005 (parallel RCT, non‐inferiority design) | I: detemir once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | The study had sufficient power (85%) to detect a mean difference of 0.4% in HbA1c between groups. A 95% 2‐sided CI was constructed for the difference between the group means (insulin detemir minus NPH insulin); insulin detemir was deemed non‐inferior if the upper limit of the 95% CI was < 0.4% (absolute). Treatments were considered comparable if the non‐inferiority criterion was fulfilled. | —/— | 341 | 341 | 341 | 315 | 92.4 | 26 weeks |
| | C: NPH insulin once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | | 164 | 164 | 164 | 156 | 95.1 | |
| | total: | | | 505 | 505 | 505 | 471 | 93.3 | |
| Hermanns 2015 (cross‐over RCT) | I: insulin glargine | In a previous cross‐sectional study, different effect sizes of insulin glargine compared to NPH insulin in terms of SF‐12 (d = 0.10), PAID (d = 0.22), and ITEQ (d = 0.29) scores were observed. The mean effect size of all 3 scales was d = 0.166. Since the present study had a cross‐over design, in which each participant served as his/her own control, an effect size on the primary endpoint DRQoL of d = 0.20 was expected. Such an effect can be detected with 90% power using a paired t‐test with a significance level of 5% and with 265 participants. | 460/— | 343; sequence A: 176, sequence B: 167 | 339; sequence A: 175, sequence B: 164 | 229 (b); sequence A: 118, sequence B: 111 | 296; sequence A: 151, sequence B: 145 | 86.3; sequence A: 85.8, sequence B: 86.8 | 48 weeks (efficacy); 49 weeks (safety) |
| | C: NPH basal insulin | | | | | | | | |
| | total: | | | 343 | 339 | 229 (b) | 296 | 86.3 | |
| Hermansen 2006 (parallel RCT, non‐inferiority design) | I: detemir in the morning and evening | A non‐inferiority criterion, defined as a < 0.4% difference in HbA1c, was calculated to require 198 completers per arm for 95% power with a 5% significance level and with a maximum baseline‐adjusted SD of 1.1%. | 735/490 | 237 | 237 | 237 | 227 | 95.8 | 24 weeks |
| | C: NPH insulin in the morning and evening | | | 239 | 238 | 238 | 225 | 94.1 | |
| | total: | | | 476 | 475 | 475 | 452 | 95.0 | |
| Home 2015 (parallel RCT) | I: insulin glargine | It was estimated that at least 568 evaluable participants (670 were randomised with 15% not assessable) needed to be randomised to detect a difference in change of HbA1c of 0.3% (3.3 mmol/mol) at the 5% significance level with 90% power. This assumes an SD of change of HbA1c of 1.1% (12 mmol/mol). | 1102/— | 355 | 352 | 352 | 335 | 94.5 | 36 weeks (efficacy); 37 weeks (safety) |
| | C: NPH insulin | | | 353 | 349 | 349 | 328 | 92.9 | |
| | total: | | | 708 | 701 | 701 | 663 | 93.6 | |
| Hsia 2011 (parallel RCT) | I1: insulin glargine at bedtime | Based on previously published HbA1c levels in oral agent‐treated participants from the study centre, enrolment of 24 in each of the 3 treatment arms (72 total) would provide 95% power to detect an HbA1c difference of 0.8%, at a 5% significance level. | —/108 | 30 | 30 | 30 | 20 (c) | 66.7 (c) | 26 weeks |
| | I2: insulin glargine in the morning | | | 25 | 25 | 25 | 14 (c) | 56.0 (c) | |
| | C: NPH insulin at bedtime | | | 30 | 30 | 30 | 17 (c) | 56.7 (c) | |
| | total: | | | 85 | 85 | 85 | 51 (c) | 60.0 (c) | |
| Kawamori 2003 (parallel RCT, non‐inferiority design) | I: insulin glargine once in the morning + OAD | | —/400 | 167 | 158 | Efficacy: 141; Safety: 158 | 141 | 84.4 | 28 weeks |
| | C: NPH insulin once in the morning + OAD | | | 168 | 159 | Efficacy: 134; Safety: 159 | 134 | 79.8 | |
| | total: | | | 335 | 317 | Efficacy: 275; Safety: 317 | 275 | 82.1 | |
| Kobayashi 2007 A (parallel RCT, non‐inferiority design) | I: insulin detemir once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | 454/401 (d) | 70 | 67 | 67 | 65 | 92.9 | 48 weeks |
| | C: NPH insulin once daily at bedtime or twice daily in the morning and at bedtime + mealtime insulin aspart | | | 35 | 35 | 35 | 32 | 91.4 | |
| | total: | | | 105 | 102 | 102 | 97 | 92.4 | |
| Kobayashi 2007 B (parallel RCT, non‐inferiority design) | I: insulin detemir at bedtime + OAD | | 437/371 | 183 | 180 | 180 | 160 | 87.4 | 36 weeks |
| | C: NPH insulin at bedtime + OAD | | | 188 | 183 | 183 | 172 | 91.5 | |
| | total: | | | 371 | 363 | 363 | 332 | 89.5 | |
| Massi 2003 (parallel RCT) | I: insulin glargine once daily at bedtime + OAD | Based on 1:1 randomisation and using a t‐test, a total number of 384 participants (192 for each group) was required to detect a mean difference of 0.5% glycated haemoglobin between insulin glargine and NPH insulin with a significance level of α = 5% and a statistical power of 90%. It was estimated that a total of 480 participants were to be enrolled to have 384 participants evaluable for efficacy analysis. | 687/— | 293 | 289 | 289 | 277 | 94.5 | 52 weeks |
| | C: NPH insulin once daily at bedtime + OAD | | | 285 | 281 | 281 | 252 | 88.4 | |
| | total: | | | 578 | 570 | 570 | 529 | 91.5 | |
| NCT00687453 (parallel RCT, non‐inferiority design) | I: insulin glargine at bedtime | | 27/24 (e) | 11 | 11 | 11 | 8 (c) | 73 (c) | 6 months |
| | C: NPH insulin in the morning and at bedtime | | | 13 | 13 | 13 | 7 (c) | 54 (c) | |
| | total: | | | 24 | 24 | 24 | 15 (c) | 62.5 (c) | |
| NN304‐1337 (parallel RCT) | I: insulin detemir once daily at bedtime + metformin | | | 309 | 309 | 309 | 266 | 86.1 | 24 weeks |
| | C: NPH insulin once daily at bedtime + metformin | | | 158 | 158 | 158 | 140 | 88.6 | |
| | total: | | | 467 | 467 | 467 | 406 | 86.9 | |
| NN304‐1808 (parallel RCT, non‐inferiority design) | I: insulin detemir once daily before breakfast ± metformin at optimal dose | For 80% power and 5% significance level with a baseline‐adjusted SD of 1.1, a total of 238 completers (119 per group) was required. Owing to a 20% maximal expected frequency of participants lost for follow‐up, 286 were to have been included, 143 in each group. | 124/— | 38 | 38 | 38 | 21 (c) | 55.3 (c) | 7 months |
| | C: NPH insulin once daily before breakfast ± metformin at optimal dose | | | 48 | 48 | 48 | 20 (c) | 41.7 (c) | |
| | total: | | | 86 | 86 | 86 | 41 (c) | 47.7 (c) | |
| NN304‐3614 (parallel RCT) | I: insulin detemir in the evening + insulin aspart each meal | As the primary objective was to demonstrate a difference of 5% in the primary endpoint, using an analysis of covariance model with 3 factors and 1 covariate and with an SD of 5%, the number of participants needed would be 23 per group. Assuming a withdrawal rate of 20%, the total number of randomised participants would be 58. With a planned screening failure of 20%, the total number of participants planned would be 73. | 81/— | 25 | 24 | 24 | 21 | 84.0 | 26 weeks |
| | C: NPH insulin in the evening + insulin aspart each meal | | | 35 | 35 | 35 | 31 | 88.6 | |
| | total: | | | 60 | 59 | 59 | 52 | 86.7 | |
| Pan 2007 (parallel RCT, non‐inferiority design) | I: insulin glargine in the evening + glimepiride 3 mg in the morning | Assuming an SD of 1.6% for the changes from baseline in HbA1c in the 2 groups, and a maximum difference between the groups to be equivalent to 0.4%, the sample size per group was calculated to provide 80% power. The sample size was adjusted for an evaluation rate of 90%, and a total of 440 participants (220 per group) was thus targeted for randomisation. | | 224 | 220 | 220 (f) | 211 | 94.2 | 24 weeks |
| | C: NPH insulin in the evening + glimepiride 3 mg in the morning | | | 224 | 223 | 223 (f) | 214 | 95.5 | |
| | total: | | | 448 | 443 | 443 | 425 | 94.9 | |
| Betônico 2019 (cross‐over RCT, non‐inferiority design) | I: insulin glargine in the morning + insulin lispro at mealtime | A sample size of 34 participants provided 90% power to detect a mean difference of 0.7% (7.7 mmol/mol) in the primary endpoint (HbA1c), considering a 15% dropout rate and assuming an SD of 0.85% and a type I error of 5%. | 193/40 | Period 1 – glargine/period 2 – NPH: 16; Period 1 – NPH/period 2 – glargine: 18 | After period 1: 16; After period 2: 15 | After period 1: 16; After period 2: 15 | Period 1 – glargine/period 2 – NPH: 14; Period 1 – NPH/period 2 – glargine: 15 | Period 1 – glargine/period 2 – NPH: 87.5; Period 1 – NPH/period 2 – glargine: 83.3 | Cross‐over trial, 6 months per period |
| | C: NPH insulin 3 times daily + insulin lispro at mealtime | | | | After period 1: 18; After period 2: 14 | After period 1: 18; After period 2: 14 | | | |
| | total: | | | 34 | After period 1: 34; After period 2: 29 | After period 1: 34; After period 2: 29 | 29 | 85.3 | |
| Riddle 2003 (parallel RCT) | I: insulin glargine once at bedtime + OAD | Based on previous data, randomisation of 750 participants had the power to provide an 85% chance of detecting, with α = 5%, a 10% treatment effect for the primary outcome measure. | 1381/764 | 372 | 367 | 367 | 334 | 89.8 | 24 weeks |
| | C: NPH insulin once at bedtime + OAD | | | 392 | 389 | 389 | 357 | 91.1 | |
| | total: | | | 764 | 756 | 756 | 691 | 90.4 | |
| Rosenstock 2001 (parallel RCT) | I: insulin glargine once daily at bedtime + premeal regular insulin | The study was designed to provide 90% power to detect a mean difference of 0.5% in HbA1c between treatment groups. | 846 (g)/— | 260 | 259 | 259 | 231 | 88.8 | 28 weeks |
| | C: NPH insulin once at bedtime or twice daily in the morning and at bedtime + premeal regular insulin | | | 261 | 259 | 259 | 238 | 91.2 | |
| | total: | | | 521 | 518 | 518 | 469 | 90.0 | |
| Rosenstock 2009 (parallel RCT, non‐inferiority design) | I: insulin glargine once daily, generally at bedtime | Sample size was calculated assuming a 20% 5‐year event rate for a ≥ 3 step progression in diabetic retinopathy on the Early Treatment of Diabetic Retinopathy Study scale from baseline to end of study (based on data from the Diabetes Control and Complications Trial), and a non‐inferiority margin of 10% (half of the expected background rate of 20%) was chosen. Assuming that approximately 40% of the randomised participants would not be evaluable, a sample size of 840 randomised participants (420 per group) was calculated to provide at least 80% power for declaring non‐inferiority. | 1413/— | 515 | 513 | 513 (ITT); 514 (h) (safety population) | 374 | 72.6 | 5 years |
| | C: NPH insulin twice daily, generally in the morning and at bedtime | | | 509 | 504 | 504 (ITT); 503 (h) (safety population) | 364 | 71.5 | |
| | total: | | | 1024 | 1017 | 1017 | 738 | 72.1 | |
| Yki‐Järvinen 2006 (parallel RCT) | I: insulin glargine at bedtime + metformin | The sample size calculation was based on differences observed in a previous study between 11 insulin‐naive participants treated with NPH and metformin and 12 participants treated with glargine and metformin for 1 year in Helsinki. In this study, HbA1c differed by 0.5% at the end of 1 year; the SDs for the groups were not different and averaged 0.87. The mean HbA1c change for the NPH + metformin group was −0.8 (SE 0.2%) (11 participants), and for the glargine + metformin group it was −1.3 (SE 0.3%) (12 participants) at the end of 1 year. Assuming α = 0.05 and 80% power, the required number of participants per group to observe a difference of 0.5% is 50. To allow for a 10% dropout rate, 110 participants were randomised. | 157/110 | 61 | 61 | 61 | 60 | 98.4 | 36 weeks |
| | C: NPH insulin at bedtime + metformin | | | 49 | 49 | 49 | 48 | 98.0 | |
| | total: | | | 110 | 110 | 110 | 108 | 98.2 | |
| Yokoyama 2006 (parallel RCT) | I: insulin glargine once at breakfast + aspart/lispro at each meal with or without OADs | | —/— | 31 | | | | | 6 months |
| | C: NPH insulin daily at bedtime + aspart/lispro at each meal with or without OADs | | | 31 | | | | | |
| | total: | | | 62 | | | | | |
| Overall total | All interventions | | | | | | | | |
| | All comparators | | | | | | | | |
| | All interventions and comparators | | | 8677 | | | | | |

— denotes not reported.

a) Follow‐up under randomised conditions until end of trial or, if not available, duration of intervention; extended follow‐up refers to follow‐up of participants once the original study was terminated as specified in the power calculation.
b) Modified ITT set for primary endpoint evaluation (including randomised participants with valid values for DRQoL for both treatment periods).
c) Study prematurely discontinued.
d) Participants with type 1 and type 2 diabetes.
e) 3 participants not randomised due to protocol violations.
f) Safety population: 444 participants.
g) According to European Medicines Agency report.
h) 1 participant who was randomised to receive NPH insulin received insulin glargine throughout the study. This participant was counted in the ITT population as an NPH participant but in the safety population as an insulin glargine participant, leading to a discrepancy between the ITT and safety population numbers in both the insulin glargine and NPH insulin arms.

C: comparator; CI: confidence interval; DRQoL: diabetes‐related quality of life; HbA1c: glycosylated haemoglobin A1c; I: intervention; ITEQ: Insulin Therapy Experience Questionnaire; ITT: intention‐to‐treat; n: number of participants; NPH: neutral protamine Hagedorn; OAD: oral antihyperglycaemic drug; PAID: Problem Areas In Diabetes; RCT: randomised controlled trial; SD: standard deviation; SE: standard error; SF‐12: 12‐item Short Form Health Survey.

Table 1. Overview of trial populations
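
The sample size description for Yki‐Järvinen 2006 in Table 1 (difference in HbA1c of 0.5%, average SD 0.87, α = 0.05, 80% power, about 50 participants per group, 110 randomised after a 10% allowance for dropouts) can be roughly cross-checked. The sketch below is not the trial's own calculation: it assumes a two-sided, two-sample comparison of means with equal SDs and equal group sizes, and uses the normal approximation. The function names are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per group for a two-sided, two-sample comparison of means.

    Assumption (not stated in the trial report): normal approximation,
    equal SDs and equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def total_with_dropout(per_group, dropout_percent):
    """Total for two groups, inflated by a whole-number dropout percentage."""
    total = 2 * per_group
    return total + total * dropout_percent // 100


# Values taken from the Yki-Järvinen 2006 row of Table 1.
approx_n = n_per_group(delta=0.5, sd=0.87)
print(approx_n)                    # normal-approximation estimate per group
print(total_with_dropout(50, 10))  # 50 per group as reported, plus 10%
```

Under these assumptions the approximation gives 48 participants per group, close to the 50 reported; a calculation based on the t distribution would give a slightly larger number. Adding 10% to the reported 2 × 50 participants reproduces the 110 randomised.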
Comparison 1. Insulin glargine versus NPH insulin

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1.1 Diabetes‐related complications (progression in retinopathy) | 5 | 1947 | Risk Ratio (M‐H, Random, 95% CI) | 1.03 [0.60, 1.77] |
| 1.2 Severe hypoglycaemia | 14 | 6164 | Risk Ratio (IV, Random, 95% CI) | 0.68 [0.46, 1.01] |
| 1.3 Serious hypoglycaemia | 10 | 4685 | Risk Ratio (IV, Random, 95% CI) | 0.75 [0.52, 1.09] |
| 1.4 Confirmed hypoglycaemia (blood glucose (BG) < 75 mg/dL) | 7 | 4115 | Risk Ratio (M‐H, Random, 95% CI) | 0.92 [0.85, 1.01] |
| 1.5 Confirmed hypoglycaemia (BG < 55 mg/dL) | 8 | 4388 | Risk Ratio (M‐H, Random, 95% CI) | 0.88 [0.81, 0.96] |
| 1.6 Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 8 | 4225 | Risk Ratio (M‐H, Random, 95% CI) | 0.78 [0.68, 0.89] |
| 1.7 Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 8 | 4759 | Risk Ratio (M‐H, Random, 95% CI) | 0.74 [0.64, 0.85] |
| 1.8 All‐cause mortality | 14 | 6173 | Peto Odds Ratio (Peto, Fixed, 95% CI) | 1.06 [0.62, 1.82] |
| 1.9 Adverse events other than hypoglycaemia (serious adverse effects) | 13 | 5499 | Risk Ratio (M‐H, Random, 95% CI) | 0.98 [0.87, 1.10] |
| 1.10 Adverse events other than hypoglycaemia (all adverse events (AE)) | 14 | 6170 | Risk Ratio (M‐H, Random, 95% CI) | 1.01 [0.98, 1.03] |
| 1.11 Adverse events other than hypoglycaemia (AEs leading to discontinuation) | 13 | 6149 | Risk Ratio (M‐H, Random, 95% CI) | 1.21 [0.84, 1.76] |
| 1.12 Adverse events other than hypoglycaemia (weight gain) | 8 | 2405 | Mean Difference (IV, Random, 95% CI) | 0.12 [0.02, 0.22] |
| 1.13 Adverse events other than hypoglycaemia (skin reactions) | 10 | 4735 | Risk Ratio (M‐H, Random, 95% CI) | 1.06 [0.83, 1.35] |
| 1.14 Adverse events other than hypoglycaemia (eye related AEs) | 9 | 4204 | Risk Ratio (M‐H, Random, 95% CI) | 1.08 [0.86, 1.35] |
| 1.15 Glycosylated haemoglobin (HbA1c) | 16 | 5809 | Mean Difference (IV, Random, 95% CI) | −0.07 [−0.18, 0.03] |

Comparison 2. Insulin detemir vs NPH insulin

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 2.1 Diabetes‐related complications (progression in retinopathy) | 2 | 972 | Risk Ratio (M‐H, Random, 95% CI) | 1.50 [0.68, 3.32] |
| 2.2 Severe hypoglycaemia | 5 | 1804 | Risk Ratio (IV, Random, 95% CI) | 0.45 [0.17, 1.20] |
| 2.3 Serious hypoglycaemia | 5 | 1777 | Peto Odds Ratio (Peto, Fixed, 95% CI) | 0.16 [0.04, 0.61] |
| 2.4 Confirmed hypoglycaemia (blood glucose (BG) < 75 mg/dL) | 4 | 1718 | Risk Ratio (M‐H, Random, 95% CI) | 0.73 [0.61, 0.86] |
| 2.5 Confirmed hypoglycaemia (BG < 55 mg/dL) | 4 | 1718 | Risk Ratio (M‐H, Random, 95% CI) | 0.48 [0.32, 0.71] |
| 2.6 Confirmed nocturnal hypoglycaemia (BG < 75 mg/dL) | 4 | 1718 | Risk Ratio (M‐H, Random, 95% CI) | 0.57 [0.47, 0.68] |
| 2.7 Confirmed nocturnal hypoglycaemia (BG < 55 mg/dL) | 4 | 1718 | Risk Ratio (M‐H, Random, 95% CI) | 0.32 [0.16, 0.63] |
| 2.8 All‐cause mortality | 8 | 2328 | Peto Odds Ratio (Peto, Fixed, 95% CI) | 0.74 [0.20, 2.65] |
| 2.9 Adverse events other than hypoglycaemia (serious adverse events) | 8 | 2328 | Risk Ratio (M‐H, Random, 95% CI) | 0.88 [0.64, 1.20] |
| 2.10 Adverse events other than hypoglycaemia (all adverse events (AE)) | 8 | 2328 | Risk Ratio (M‐H, Random, 95% CI) | 1.03 [0.96, 1.11] |
| 2.11 Adverse events other than hypoglycaemia (AEs leading to discontinuation) | 8 | 2328 | Risk Ratio (M‐H, Random, 95% CI) | 1.22 [0.67, 2.25] |
| 2.12 Adverse events other than hypoglycaemia (skin reactions) | 5 | 1777 | Risk Ratio (M‐H, Random, 95% CI) | 1.28 [0.63, 2.59] |
| 2.13 Adverse events other than hypoglycaemia (eye‐related AEs) | 6 | 1386 | Risk Ratio (M‐H, Random, 95% CI) | 0.75 [0.41, 1.37] |
| 2.14 Glycosylated haemoglobin (HbA1c) | 7 | 2233 | Mean Difference (IV, Random, 95% CI) | 0.13 [−0.02, 0.28] |
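
As a reading aid for the effect sizes in Comparison 2, the sketch below checks whether a 95% CI excludes the value of no effect: 1 for ratio measures (risk ratio, Peto odds ratio) and 0 for mean differences. It also back-calculates an approximate standard error on the log scale from a ratio's CI limits, which assumes the interval is symmetric on the log scale. The function names are illustrative, and the figures are copied from four rows of the table.

```python
from math import log
from statistics import NormalDist


def excludes_null(lower, upper, null_value):
    """True if the confidence interval does not contain the no-effect value."""
    return not (lower <= null_value <= upper)


def log_scale_se(lower, upper, level=0.95):
    """Approximate SE of a log ratio, back-calculated from its CI limits.

    Assumption: the CI is symmetric on the log scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (log(upper) - log(lower)) / (2 * z)


# (label, estimate, lower, upper, no-effect value); figures from Comparison 2.
rows = [
    ("2.2 Severe hypoglycaemia (RR)", 0.45, 0.17, 1.20, 1.0),
    ("2.5 Confirmed hypoglycaemia, BG < 55 mg/dL (RR)", 0.48, 0.32, 0.71, 1.0),
    ("2.8 All-cause mortality (Peto OR)", 0.74, 0.20, 2.65, 1.0),
    ("2.14 HbA1c (MD)", 0.13, -0.02, 0.28, 0.0),
]

for label, estimate, lower, upper, null_value in rows:
    verdict = "excludes" if excludes_null(lower, upper, null_value) else "includes"
    print(f"{label}: {estimate} [{lower}, {upper}] {verdict} {null_value}")
```

Of these four rows, only 2.5 has an interval that excludes the no-effect value. An interval that includes it does not show the treatments are equivalent, only that the data are compatible with no difference.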
