
Serotonin and noradrenaline reuptake inhibitors (SNRIs) for fibromyalgia


Abstract

Background

Fibromyalgia is a clinically well-defined chronic condition of unknown etiology characterized by chronic widespread pain that often co-exists with sleep disturbances, cognitive dysfunction and fatigue. People with fibromyalgia often report high disability levels and poor quality of life. Drug therapy, for example with serotonin and noradrenaline reuptake inhibitors (SNRIs), focuses on reducing key symptoms and improving quality of life. This review updates and extends the 2013 version of this systematic review.

Objectives

To assess the efficacy, tolerability and safety of serotonin and noradrenaline reuptake inhibitors (SNRIs) compared with placebo or other active drug(s) in the treatment of fibromyalgia in adults.

Search methods

For this update we searched CENTRAL, MEDLINE, Embase, the US National Institutes of Health and the World Health Organization (WHO) International Clinical Trials Registry Platform for published and ongoing trials, and examined the reference lists of reviewed articles, to 8 August 2017.

Selection criteria

We selected randomized, controlled trials of any formulation of SNRIs against placebo or any other active treatment of fibromyalgia in adults.

Data collection and analysis

Three review authors independently extracted data, examined study quality and assessed risk of bias. For efficacy, we calculated the number needed to treat for an additional beneficial outcome (NNTB) for pain relief of 50% or greater and of 30% or greater, patient's global impression of being much or very much improved, and dropout rates due to lack of efficacy, and the standardized mean differences (SMD) for fatigue, sleep problems, health-related quality of life, mean pain intensity, depression, anxiety, disability, sexual function, cognitive disturbances and tenderness. For tolerability, we calculated the number needed to treat for an additional harmful outcome (NNTH) for withdrawals due to adverse events and for nausea, insomnia and somnolence as specific adverse events. For safety, we calculated the NNTH for serious adverse events. We undertook meta-analysis using a random-effects model. We assessed the quality of the evidence using GRADE and created a 'Summary of findings' table.
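As an illustration of what pooling under a random-effects model involves, the following is a minimal Python sketch. It assumes the DerSimonian-Laird estimator, which is a commonly used choice; the review states only that a random-effects model was used, not which estimator. The function name `pool_random_effects` and the input values in the usage lines are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    effects:   per-study effect estimates (e.g. risk differences or SMDs)
    variances: the corresponding within-study variances
    Returns (pooled effect, 95% CI lower, 95% CI upper, tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    k = len(effects)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical risk differences and variances from three imaginary trials.
pooled, low, high, tau2 = pool_random_effects([0.05, 0.12, 0.09],
                                              [0.0009, 0.0016, 0.0004])
print(f"pooled RD {pooled:.3f} (95% CI {low:.3f} to {high:.3f}), tau^2 {tau2:.5f}")
```

When the studies agree closely, the between-study variance estimate is zero and the result coincides with a fixed-effect analysis; when they disagree, the weights become more equal and the confidence interval widens.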

Main results

We added eight new studies with 1979 participants for a total of 18 included studies with 7903 participants. Seven studies investigated duloxetine and nine studies investigated milnacipran against placebo. One study compared desvenlafaxine with placebo and pregabalin. One study compared duloxetine with L-carnitine. The majority of studies were at unclear or high risk of bias in three to five domains.

The quality of evidence of all comparisons of desvenlafaxine, duloxetine and milnacipran versus placebo in studies with a parallel design was low due to concerns about publication bias and indirectness, and very low for serious adverse events due to concerns about publication bias, imprecision and indirectness. The quality of evidence of all comparisons of duloxetine and desvenlafaxine with other active drugs was very low due to concerns about publication bias, imprecision and indirectness.

Duloxetine and milnacipran had no clinically relevant benefit over placebo in pain relief of 50% or greater: 1274 of 4104 (31%) participants on duloxetine and milnacipran reported pain relief of 50% or greater compared to 591 of 2814 (21%) participants on placebo (risk difference (RD) 0.09, 95% confidence interval (CI) 0.07 to 0.11; NNTB 11, 95% CI 9 to 14). Duloxetine and milnacipran had a clinically relevant benefit over placebo in patient's global impression of being much or very much improved: 888 of 1710 (52%) participants on duloxetine and milnacipran reported being much or very much improved compared to 354 of 1208 (29%) participants on placebo (RD 0.19, 95% CI 0.12 to 0.26; NNTB 5, 95% CI 4 to 8). Duloxetine and milnacipran had a clinically relevant benefit over placebo in pain relief of 30% or greater (RD 0.10, 95% CI 0.08 to 0.12; NNTB 10, 95% CI 8 to 12). Duloxetine and milnacipran had no clinically relevant benefit for fatigue compared to placebo (SMD -0.13, 95% CI -0.18 to -0.08; NNTB 18, 95% CI 12 to 29). There were no statistically significant differences between duloxetine or milnacipran and placebo in reducing sleep problems (SMD -0.07, 95% CI -0.15 to 0.01). Duloxetine and milnacipran had no clinically relevant benefit compared to placebo in improving health-related quality of life (SMD -0.20, 95% CI -0.25 to -0.15; NNTB 11, 95% CI 8 to 14).

There were 794 of 4166 (19%) participants on SNRIs who dropped out due to adverse events compared to 292 of 2863 (10%) participants on placebo (RD 0.07, 95% CI 0.04 to 0.10; NNTH 14, 95% CI 10 to 25). There was no difference in serious adverse events between duloxetine, milnacipran or desvenlafaxine and placebo (RD -0.00, 95% CI -0.01 to 0.00).
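The numbers needed to treat quoted above follow from the risk differences as NNT = 1 / |RD|, with the confidence limits obtained by inverting the limits of the RD interval. The short Python sketch below reproduces this arithmetic from the rounded RDs given in the text. The helper names and the choice to round to the nearest whole number are assumptions made for illustration; the review does not state its rounding rule.

```python
def nnt_from_rd(rd: float) -> int:
    """Number needed to treat implied by a risk difference: round(1 / |RD|)."""
    if rd == 0:
        raise ValueError("NNT is undefined when the risk difference is zero")
    return round(1 / abs(rd))


def nnt_with_ci(rd: float, ci_low: float, ci_high: float) -> tuple[int, int, int]:
    """Return (NNT, lower bound, upper bound) from an RD and its 95% CI.

    Only meaningful when the CI excludes zero. The larger risk difference
    gives the smaller NNT, so the bounds are sorted after inversion.
    """
    if ci_low <= 0 <= ci_high:
        raise ValueError("CI includes zero; an NNT interval is not defined")
    bounds = sorted((nnt_from_rd(ci_low), nnt_from_rd(ci_high)))
    return nnt_from_rd(rd), bounds[0], bounds[1]


# Risk differences as reported in the text above.
print(nnt_with_ci(0.09, 0.07, 0.11))  # pain relief of 50% or greater
print(nnt_with_ci(0.19, 0.12, 0.26))  # much or very much improved
print(nnt_with_ci(0.07, 0.04, 0.10))  # withdrawal due to adverse events
```

The function refuses to return an interval when the RD confidence interval includes zero, which mirrors the review's practice of not calculating an NNT for outcomes without a statistically significant difference.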

There was no difference between desvenlafaxine and placebo in efficacy, tolerability and safety in one small trial.

There was no difference between duloxetine or desvenlafaxine and their active comparators (L-carnitine, pregabalin) in efficacy, tolerability and safety in two trials.

Authors' conclusions

The update did not change the major findings of the previous review. Based on low- to very low-quality evidence, the SNRIs duloxetine and milnacipran provided no clinically relevant benefit over placebo in the frequency of pain relief of 50% or greater, but they did provide a clinically relevant benefit in patient's global impression of being much or very much improved and in the frequency of pain relief of 30% or greater. Duloxetine and milnacipran provided no clinically relevant benefit over placebo in improving health-related quality of life or in reducing fatigue. Duloxetine and milnacipran did not differ significantly from placebo in reducing sleep problems. Dropout rates due to adverse events were higher with duloxetine and milnacipran than with placebo. On average, the potential benefits of duloxetine and milnacipran in fibromyalgia were outweighed by their potential harms. However, a minority of people with fibromyalgia might experience substantial symptom relief without clinically relevant adverse events with duloxetine or milnacipran.

We found no placebo-controlled studies with SNRIs other than desvenlafaxine, duloxetine and milnacipran.


Plain language summary

Serotonin and noradrenaline reuptake inhibitors for fibromyalgia

Bottom line

Duloxetine and milnacipran may reduce pain in people with fibromyalgia. However, some of these people may also experience side effects such as nausea (feeling sick) and drowsiness. A minority of people with fibromyalgia experience symptom relief without side effects from duloxetine and milnacipran.

Background

People with fibromyalgia often have chronic (longer than three months) widespread pain as well as sleep problems, difficulty thinking and exhaustion. They often report poor health-related quality of life. There is currently no cure for fibromyalgia, so treatments aim to relieve symptoms and improve health-related quality of life.

Serotonin and noradrenaline are chemicals produced by the human body that are involved in the regulation of pain, sleep and mood. Low concentrations of serotonin have been reported in people with fibromyalgia. Serotonin and noradrenaline reuptake inhibitors (SNRIs) are a class of antidepressants that increase the concentration of serotonin and noradrenaline in the brain.

Study characteristics

In August 2017, we updated our searches for clinical trials in which SNRIs were used to treat symptoms of fibromyalgia in adults. We found eight new studies since the previous version of the review. In total, we found 18 studies with 7903 participants. Studies lasted from four to 27 weeks and compared the SNRIs desvenlafaxine, duloxetine and milnacipran with a fake medication (placebo). We rated the quality of the evidence from the studies using four levels: very low, low, moderate or high. Very low-quality evidence means that we are very uncertain about the results. High-quality evidence means that we are very confident in the results.

Key results and quality of the evidence

Duloxetine and milnacipran were better than placebo in reducing pain by 50% or more and in improving overall well-being (low-quality evidence). Duloxetine and milnacipran were better than placebo in improving health-related quality of life and in reducing fatigue (low-quality evidence). Duloxetine and milnacipran were not better than placebo in reducing sleep problems (low-quality evidence). More people dropped out of the trials due to side effects with duloxetine and milnacipran than with placebo (low-quality evidence). More people reported nausea and drowsiness with duloxetine and milnacipran than with placebo (low-quality evidence). Duloxetine, milnacipran and placebo did not differ in the frequency of serious side effects experienced (very low-quality evidence).

Authors' conclusions

Implications for practice

For people with fibromyalgia

Only a minority of people may profit from treatment with the serotonin and noradrenaline reuptake inhibitors (SNRIs) duloxetine and milnacipran in terms of meaningful relief of fibromyalgia symptoms and a good tolerability of the drug. The majority of people will not experience substantial relief of fibromyalgia symptoms or will terminate the treatment because of adverse events, or both. There is no evidence for the efficacy of other SNRIs such as desvenlafaxine and venlafaxine.

For physicians

If duloxetine or milnacipran are being considered for the treatment of fibromyalgia, a frank discussion between the physician and patient about the potential benefits and harms of both drugs is important. The contraindications (concomitant use of monoamine oxidase inhibitors, uncontrolled narrow‐angle glaucoma, substantial alcohol use or evidence of chronic liver damage) and warnings (suicidality, hepatotoxicity, serotonin syndrome, abnormal bleeding, discontinuation syndrome, elevated blood pressure, urinary hesitation and retention) are to be discussed (Häuser 2010b). Defining realistic goals of therapy (e.g. pain relief of 30% or more and/or improvement of daily functioning) by people with fibromyalgia and their physicians before starting drug treatment has been recommended (Häuser 2015b).

The recommended dosages are duloxetine 60 mg a day, and milnacipran 100 mg a day. A dose response has not been demonstrated. Higher doses are associated with more adverse events (Cording 2015; Häuser 2010b). Treatment should only be continued in responders, that is to say in people who reached the predefined treatment goals with a reasonable tolerability of duloxetine or milnacipran (Petzke 2017).

A class effect of SNRIs on fibromyalgia symptoms cannot be assumed. One study found no difference between four dosages of desvenlafaxine and placebo in mean pain intensity reduction (NCT00697787). One study found no differences between venlafaxine and placebo in all outcomes of efficacy (Ziljstra 2002).

Treating fibromyalgia with drugs only, such as SNRIs alone, is discouraged since current best practices in fibromyalgia guidelines recommend using the combination of pharmacological therapy with aerobic exercise and psychological therapies (Ablin 2013; MacFarlane 2017; Petzke 2017). This is especially true for symptoms where duloxetine and milnacipran are ineffective, but other therapies are effective, for example, aerobic exercise for fatigue (Häuser 2010c), and cognitive‐behavioral therapies for depression (Bernardy 2017).

Since relatively few participants achieve a worthwhile response with SNRIs, it is important to establish stopping rules, so that when someone does not respond within a specified time, they can be switched to an alternative treatment. This will reduce the number of participants exposed to adverse events in the absence of benefit. One study included in this review demonstrated that some people with fibromyalgia who do not respond to duloxetine might respond to milnacipran (Bateman 2013).

For policy‐makers

Since no single treatment is effective in a majority of individuals with fibromyalgia, this relatively small number who benefit may be considered worthwhile, particularly if appropriate switching or stopping rules are in place.

For funders

Treatment with duloxetine and milnacipran for fibromyalgia may be considered worthwhile, particularly if switching and stopping rules are in place in case the predefined treatment goals are not reached or the drugs are not well tolerated, or both. It is important that the treatment is supervised by a physician experienced in the treatment with duloxetine and milnacipran.

Implications for research

General

Analysis of all studies investigating duloxetine and milnacipran in fibromyalgia at the level of individual participant data could provide important information, for example, whether or not a clinically important pain response delivers large functional and quality‐of‐life benefits. Moreover, a re‐analysis of the data using baseline observation carried forward, and responder analysis where discontinuation is classified as non‐response, would allow a determination of the true efficacy of duloxetine and milnacipran in fibromyalgia. All journals should follow the BMJ rule that reports of randomized trials will only be considered for publication if the authors commit to making the relevant anonymous participant‐level data available on reasonable request (BMJ).

Studies in any continent and the inclusion of people with inflammatory rheumatic diseases, osteoarthritis and mental disorders (depressive and anxiety disorders, post‐traumatic stress disorder) are necessary to provide external validity of the study findings.

A standardized psychiatric interview at study entry can stratify participants according to comorbid anxiety and depressive disorders.

There is a bias towards studies conducted in the USA. To provide generalizability of study results, study populations equally recruited from every continent are necessary.

It is necessary that the details of the assessment of adverse events (spontaneous reports, open questions, symptom questionnaires) are reported by the studies because the type and frequency of adverse events are influenced by the modes of assessment (Häuser 2012). It is mandatory that adverse events be reported using the International Conference on Harmonization guidelines, and coded within organ classes using the Medical Dictionary for Regulatory Activities (MedDRA) (International Council for Harmonisation 2016). It is desirable that regulatory agencies standardize the assessment strategies of adverse events in RCTs.

It is important to control for potential effects of co‐interventions on outcomes.

Measurement (endpoints)

It is important to use responder criteria for a clinically relevant improvement of sleep problems, fatigue, depression and physical function (disability) (Arnold 2012b). Homogeneous outcomes for studies with an enriched enrolment randomized withdrawal (EERW) design need to be defined.

Comparison between active treatments

It is important not only to compare with placebo but also with drugs with known efficacy, such as amitriptyline or pregabalin. In addition, more studies with defined subgroups (e.g. major depression, no adequate response to a specific drug treatment) are necessary.

Summary of findings

Summary of findings 1. Serotonin noradrenaline reuptake inhibitors compared with placebo for fibromyalgia - studies with parallel design

Patient or population: people with fibromyalgia

Settings: study centers in North, Central and South America, Asia and Europe

Intervention: serotonin noradrenaline reuptake inhibitors (duloxetine, milnacipran)

Comparison: placebo

| Outcomes | Probable outcome with intervention (95% CI) | Probable outcome with placebo | Relative effect: SMD or risk difference (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| Self-reported pain relief of 50% or greater | 309 per 1000 (282 to 344) | 210 per 1000 | RD 0.09 (0.07 to 0.11) | 6918 (15 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTB 11 (95% CI 9 to 14) |
| Patient Global Impression to be much or very much improved (PGIC) | 519 per 1000 (459 to 573) | 293 per 1000 | RD 0.19 (0.12 to 0.26) | 2918 (6 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTB 5 (95% CI 4 to 8) |
| Self-reported fatigue (20-100 scale). Higher scores indicate higher fatigue problem levels | Mean fatigue score was 2.6 points lower (1.0 to 5.0 points lower) based on a 20-100 scale | Baseline mean score 69.4 (SD 12.3) (footnote 3) | SMD -0.13 (-0.18 to -0.08) | 6168 (12 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTB 18 (95% CI 12 to 29) |
| Self-reported sleep problems (0-100 scale). Higher scores indicate higher sleep problem levels | Mean sleep problems score was 1.2 points lower (0.2 higher to 5.5 points lower) based on a 0-100 scale | Baseline mean score 68.0 (23.8) (footnote 4) | SMD -0.07 (-0.15 to 0.01) | 4547 (8 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTB not calculated due to lack of statistically significant difference |
| Self-reported health-related quality of life (0-100 scale). Higher scores indicate higher burden of disease (lower quality of life) | Mean health-related quality of life problems score was 3.9 points lower (2.3 to 5.3 points lower) based on a 0-100 scale | Baseline mean score 57.9 (SD 14.1) (footnote 5) | SMD -0.20 (-0.25 to -0.15) | 6861 (14 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTB 11 (95% CI 8 to 14) |
| Tolerability (withdrawal due to adverse events) | 191 per 1000 (172 to 210) | 102 per 1000 | RD 0.07 (0.04 to 0.10) | 7029 (15 studies) | ⊕⊕⊝⊝ low (footnotes 1, 2) | NNTH 14 (95% CI 10 to 25) |
| Safety (serious adverse events) | 18 per 1000 (16 to 20) | 21 per 1000 | RD -0.00 (-0.01 to 0.00) | 6732 (13 studies) | ⊕⊝⊝⊝ very low (footnotes 1, 2, 6) | NNTH not calculated due to lack of statistically significant difference |

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; FIQ: Fibromyalgia Impact Questionnaire; MFI: Multidimensional Fatigue Inventory; MOS‐Sleep problem index: Medical Outcome Study ‐ sleep problem index; NNTB: number needed to treat for an additional beneficial outcome; NNTH: number needed to treat for an additional harm; NRS: numerical rating scale; RD: risk difference; SMD: standardized mean difference; VAS: visual analog scale

GRADE Working Group grades of evidence
High quality: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different.
Low quality: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low quality: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

Footnotes

1. Downgraded once for indirectness: participants with major medical diseases and mental disorders (except major depression) were excluded in > 50% of studies
2. Downgraded once for publication bias
3. Clauw 2008: N = 401 participants; MFI NRS 20-100 scale
4. Mease 2009b: N = 223 participants; MOS Sleep problem index NRS 0-100 scale
5. Arnold 2010b: N = 509 participants; FIQ VAS 0-80 scale
6. Downgraded once for imprecision due to low event rate

Background

Description of the condition

Fibromyalgia is defined by the American College of Rheumatology (ACR) 1990 classification criteria as widespread pain lasting for longer than three months with tenderness on palpation at 11 or more of 18 specified tender points (Wolfe 1990). Chronic widespread pain is frequently associated with other symptoms, such as poor sleep, fatigue, and depression (Wolfe 2013a). People with moderate and severe forms of fibromyalgia often report high disability levels and poor quality of life along with extensive use of medical care (Häuser 2015a; Häuser 2017). Fibromyalgia symptoms can be assessed by patient self‐report using the fibromyalgia criteria and severity scales for clinical and epidemiological studies: a modification of the ACR Preliminary Diagnostic Criteria for Fibromyalgia (so‐called Fibromyalgia Symptom Questionnaire) (Wolfe 2011a). For a clinical diagnosis, the ACR 1990 classification criteria (Wolfe 1990), the ACR 2010 preliminary diagnostic criteria (Wolfe 2010) and the 2016 criteria (Wolfe 2016) can be used. Lacking a specific laboratory test, diagnosis is established by a history of the key symptoms and the exclusion of somatic diseases sufficiently explaining the key symptoms (Wolfe 2010). For epidemiology studies, the modified ACR 2010 preliminary diagnostic criteria (survey criteria) can be used (Wolfe 2011a).

The indexing of fibromyalgia within the international classification of diseases is under debate. While some rheumatologists have thought of it as a specific pain disorder and central sensitivity syndrome (Clauw 2014; Yunus 2008), recent research points to small fibre pathology in a subgroup of people with fibromyalgia that may be of pathophysiological importance (Üceyler 2017a). In psychiatry and psychosomatic medicine, fibromyalgia symptoms are categorized as a functional somatic syndrome, a bodily distress syndrome, a physical symptom disorder, or a somatoform disorder (Häuser 2009; Häuser 2014).

Fibromyalgia is a heterogeneous condition. The definite etiology (causes) of this syndrome remains unknown. A model of interacting biological and psychosocial variables in the predisposition, triggering and development of the chronicity of fibromyalgia symptoms has been suggested (Üceyler 2017a). Inflammatory rheumatoid arthritis (Wolfe 2011a), depression (Chang 2015), genetics (Arnold 2013; Lee 2012), obesity combined with physical inactivity (Mork 2010), physical and sexual abuse in childhood (Häuser 2010a), sleep problems (Mork 2012), and smoking (Choi 2010) predict future development of fibromyalgia. Psychosocial stress (e.g. working place and family conflicts) and physical stress (e.g. infections, surgery, accidents) might trigger the onset of chronic widespread pain and fatigue (Clauw 2014; Üceyler 2017a). Depression and post‐traumatic stress disorder worsen fibromyalgia symptoms (Häuser 2013a; Lange 2010).

Several factors are associated with the pathophysiology (functional changes associated with or resulting from disease) of fibromyalgia, but the relationship is unclear. The functional changes include alteration of sensory processing in the brain (so‐called central sensitization), reduced reactivity of the hypothalamus‐pituitary‐adrenal axis to stress, increased pro‐inflammatory and reduced anti‐inflammatory cytokine profiles (produced by cells involved in inflammation), disturbances in neurotransmitters such as dopamine and serotonin, and small nerve fibre pathology (Üceyler 2017a). Prolonged exposure to stress, as outlined above, may contribute to these functional changes in predisposed individuals (Bradley 2009).

Fibromyalgia is common. Numerous studies have investigated its prevalence in different settings and countries. A review gives a global mean prevalence of 2.7% (range 0.4% to 9.3%), and a mean in the Americas of 3.1%, in Europe of 2.5% and in Asia of 1.7%. It is more common in women, with a female to male ratio of 3:1 (4.2%:1.4%) (Queiroz 2013). Estimates of prevalence in specific populations vary greatly, but have been reported as being as high as 9% in female textile workers in Turkey and 10% in metalworkers in Brazil (Queiroz 2013). The change in diagnostic criteria does not appear to have significantly affected estimates of prevalence (Wolfe 2013a).

Since specific treatment aimed at altering the pathogenesis is not possible, drug therapy that focuses on symptom reduction is ubiquitously employed.

Description of the intervention

Serotonin and noradrenaline (norepinephrine) reuptake inhibitors (SNRIs) act on noradrenergic and serotonergic neurons in the nervous system. Serotonin and noradrenaline are implicated in the mediation of endogenous pain inhibitory mechanisms.

How the intervention might work

Dysfunction of serotonin and noradrenaline transmission, which mediates endogenous analgesic mechanisms via the descending inhibitory pain pathways in the central nervous system, may play a key role in the pathophysiology of fibromyalgia. Researchers found that levels of metabolites of biogenic amines key to descending inhibition were lower than normal in at least three fibromyalgia body fluid compartments (Legangneux 2001; Russell 1992). Imbalance or deficiency in serotonin and noradrenaline is also associated with other key symptoms of fibromyalgia such as fatigue and cognitive deficits (Bradley 2009). Treatment with SNRI increases transmission of these neurotransmitters and may improve disease states associated with serotonin and noradrenaline deficiencies such as pain, fatigue and cognitive deficits.

Why it is important to do this review

There is a transatlantic difference in the approval of SNRIs as a treatment for fibromyalgia by drug agencies (Briley 2010). The SNRIs duloxetine and milnacipran have been approved by the US Food and Drug Administration (FDA), but not by the European Medicines Agency (EMA), for the management of fibromyalgia. The FDA stated that the sponsors of the two drugs had provided adequate evidence of their benefits and harms to support their indication for the management of fibromyalgia (Department of health & Human Services 2008; Department of health & Human Services 2009). The EMA, however, denied clinically relevant effects for both drugs, on the basis of a lack of robust evidence of efficacy, and because the adverse effects profile was considered to outweigh the benefits (EMA 2008; EMA 2010). In 2013, we conducted a systematic review on SNRIs in fibromyalgia which included randomized controlled trials that had not been evaluated by the FDA and EMA (Häuser 2013b). Since then, new randomized controlled trials with duloxetine (Leombruni 2015; Murakami 2015) and milnacipran (Bateman 2013; Matthey 2013; Staud 2015) have been published that were evaluated neither by the FDA and EMA nor by the previous version of this review (Häuser 2013b). With new data available, and in the light of the divergent appraisals of duloxetine and milnacipran by the FDA and EMA, we saw the need to evaluate the efficacy and safety of SNRIs according to recently established methodological standards of pain medicine (Moore 2010a), in order to assist people with fibromyalgia and doctors in shared decision making on pharmacological treatment options.

Objectives

To assess the efficacy, tolerability and safety of serotonin and noradrenaline reuptake inhibitors (SNRIs) compared with placebo or other active drug(s) in the treatment of fibromyalgia in adults.

Methods

Criteria for considering studies for this review

Types of studies

We included studies if they were double blind, randomized and controlled trials (RCTs) following four weeks of treatment (titration and maintenance) or longer. We included studies with a parallel, cross‐over and enriched enrolment randomized withdrawal (EERW) design. We included studies with a cross‐over design where (a) separate data from the two periods were reported, or (b) data were presented that excluded a statistically significant carry‐over effect, or (c) statistical adjustments were carried out in case of a significant carry‐over effect. Trials had to have at least 20 participants per treatment arm and had to report at least one of the outcomes of efficacy as defined below and of tolerability and safety as defined below. We required full journal publication, with the exception of online clinical trial results summaries of otherwise unpublished clinical trials, and abstracts with sufficient data for analysis. We did not include short abstracts (usually meeting reports). We excluded studies that were non‐randomized, without control groups, studies of experimental pain, case reports, and clinical observations.
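The design-related inclusion criteria above can be expressed as a simple screening check. The following Python sketch is illustrative only: the `Study` record, its field names and the `is_eligible` function are hypothetical, and the publication-type requirements (full journal publication or sufficiently detailed results summaries) are deliberately left out because they need manual judgement.

```python
from dataclasses import dataclass

ALLOWED_DESIGNS = {"parallel", "cross-over", "EERW"}


@dataclass
class Study:
    """Hypothetical record of the design features named in the criteria."""
    double_blind: bool
    randomized: bool
    controlled: bool
    design: str                       # "parallel", "cross-over" or "EERW"
    treatment_weeks: float            # titration plus maintenance
    min_participants_per_arm: int     # size of the smallest treatment arm
    reports_efficacy_outcome: bool    # at least one predefined efficacy outcome
    reports_tolerability_and_safety: bool
    crossover_carryover_handled: bool = False  # any of conditions (a) to (c)


def is_eligible(s: Study) -> bool:
    """Apply the design-related inclusion criteria described above."""
    if not (s.double_blind and s.randomized and s.controlled):
        return False
    if s.design not in ALLOWED_DESIGNS:
        return False
    # Cross-over studies qualify only if carry-over was addressed.
    if s.design == "cross-over" and not s.crossover_carryover_handled:
        return False
    if s.treatment_weeks < 4 or s.min_participants_per_arm < 20:
        return False
    return s.reports_efficacy_outcome and s.reports_tolerability_and_safety


# Hypothetical example: a 12-week parallel trial with 50 participants per arm.
example = Study(True, True, True, "parallel", 12, 50, True, True)
print(is_eligible(example))
```

Encoding the criteria this way makes each exclusion reason explicit, which can help when two review authors screen studies independently and need to reconcile their decisions.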

Types of participants

Adults (over 18 years) having a clinical diagnosis of fibromyalgia by any published, recognized and standardized criteria (Smythe 1981; Wolfe 1990; Wolfe 2010; Wolfe 2011b; Yunus 1981; Yunus 1982; Yunus 1984).

Types of interventions

We included trials comparing SNRIs with placebo or another active drug with proven efficacy to reduce fibromyalgia symptoms.

We allowed co‐interventions, such as physical therapy or other drugs different from those being assessed in the trial.

We considered the following SNRIs in this review: desvenlafaxine, duloxetine, milnacipran and venlafaxine.

Types of outcome measures

We followed some suggestions of the OMERACT Fibromyalgia Working Group (Mease 2009a), the Initiative of Methods, Measurement and Pain Assessment in Clinical Trials (IMMPACT) (Dworkin 2009), and of best practice in the reporting of systematic reviews in chronic pain (Moore 2010a; Moore 2010b), for selecting outcome measures.

Primary outcomes

  • Self‐reported pain relief of 50% or greater. Number of participants who reported a pain relief of 50% or greater in parallel and cross‐over design studies. For EERW design, loss of therapeutic response of self‐reported pain relief was defined as less than 30% reduction in visual analog scale (VAS) pain from pre‐drug exposure or worsening of fibromyalgia requiring alternative treatment.

  • Patient perceived global improvement (Patient Global Impression of Change (PGIC), or Clinical Global Impression (CGI) of severity): number of participants who reported to be much or very much improved for parallel and cross‐over design studies; number of participants who reported a loss of therapeutic response to be much or very much improved in studies with EERW design.

  • Tolerability (withdrawals due to adverse events)

  • Safety (serious adverse events)

Secondary outcomes

  • Self‐reported fatigue. We used the following preference: validated combined scale (e.g. Multidimensional Fatigue Inventory (MFI), Fatigue Severity Scale (FSS), Multidimensional Assessment of Fatigue (MAF) or other validated scales) over single item scales (e.g. Fibromyalgia Impact Questionnaire (FIQ) fatigue VAS, or other single item scales). We selected reduction of self‐reported fatigue as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported fatigue in studies with an EERW design.

  • Self‐reported sleep problems. We used the following preference: validated combined scale (e.g. Medical Outcomes Study (MOS) sleep scale, or other validated scales), over single item assessment (e.g. FIQ sleep VAS, or other single item scales). We selected reduction of self‐reported sleep problems as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported sleep problems in studies with an EERW design.

  • Self‐reported health‐related quality of life (HRQoL) measured by the total score of the Fibromyalgia Impact Questionnaire (FIQ). We selected improvement of self‐reported HRQoL as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported HRQoL in studies with an EERW design.

  • Self‐reported pain relief of 30% or greater. There was no comparable outcome in studies with an EERW design.

  • Self‐reported mean pain intensity. We used the following preferences: (a) we preferred electronic diaries over paper; (b) 24‐hour recall pain, weekly recall pain with visual analog scale (VAS); (c) paper VAS, paper numeric 11‐point ordinal scale (Numeric Rating Scale NRS), combined pain measures, pain drawings. We selected reduction of self‐reported mean pain intensity as the outcome for studies with a parallel and cross‐over design. There was no comparable outcome in studies with an EERW design.

  • Self‐reported depression. We used the following preference: validated combined scale (Beck Depression Inventory (BDI), or other validated scales), over single‐item assessment (e.g. FIQ subscale for depression, or other single item scales). We selected reduction of self‐reported depression as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported depression in studies with an EERW design.

  • Self‐reported anxiety. We used the following preference: validated combined scale (Beck Anxiety Inventory (BAI), State Trait Anxiety Inventory (STAI), or other validated scales), over single item scale (FIQ anxiety VAS, or other single item scales). We selected reduction of self‐reported anxiety as the outcome for studies with a parallel and cross‐over design and loss of therapeutic response of self‐reported anxiety in studies with an EERW design.

  • Self‐reported disability (impairment of physical function). We used the following preference: validated combined scale (Brief Pain Inventory (BPI) interference from pain, Short‐Form Health Survey (SF‐36) physical summary score, or other validated scales), over single item scale (FIQ physical impairment VAS, or other single item scales). We selected reduction of self‐reported disability as an outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported disability in studies with an EERW design.

  • Self‐reported sexual function. We used the following preference: validated combined scale (Arizona Sexual Experience Scale, or other validated scale), over single item scale. We selected reduction of self‐reported sexual problems as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported sexual problems in studies with an EERW design.

  • Self‐reported cognitive disturbances: validated combined scale (Multiple Ability Self‐report Questionnaire (MASQ), or any other validated scale), over single item scale. We selected reduction of self‐reported cognitive disturbances as the outcome for studies with a parallel and cross‐over design, and loss of therapeutic response of self‐reported cognitive disturbances in studies with an EERW design.

  • Tenderness: measurement of tender point pain threshold

  • Number of participants dropping out due to lack of efficacy

  • Specific adverse events frequently associated with the use of SNRIs (nausea, somnolence, insomnia)

Search methods for identification of studies

Electronic searches

We ran three searches for the update, with the first in November 2015, the second in August 2016 and the third in August 2017. For this update we searched:

  • the Cochrane Central Register of Controlled Trials (CENTRAL; 2017, Issue 7) in the Cochrane Library;

  • MEDLINE accessed through PubMed (Sept 2012 to August 2017);

  • Embase accessed through SCOPUS (Sept 2012 to August 2017).

See Appendix 1 for details of all search strategies used. There were no language or date restrictions.

Searching other resources

We also searched the websites of the US National Institutes of Health (www.clinicaltrials.gov/) and the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (apps.who.int/trialsearch/) to August 2017 for ongoing trials. We searched the bibliographies of reviewed articles and retrieved relevant articles. Our search included all languages. We contacted content experts about unpublished and further possible studies.

Data collection and analysis

Selection of studies

Two review authors (WH, BW) independently scrutinized all the titles and abstracts identified by the searches and determined which fulfilled the selection criteria. A third review author (NÜ) verified that the selection had been carried out correctly.

Data extraction and management

Three review authors (NÜ, PW, WH) extracted data independently onto a specially designed data extraction form. We would have resolved any disagreements by discussion with another review author (BW), but this was not necessary. One author (WH) entered data into Review Manager 5 (RevMan 5) (RevMan 2014) and two authors (NÜ, PW) checked them. We resolved discrepancies by discussion.

Assessment of risk of bias in included studies

Two review authors (NÜ, WH) independently assessed the risk of bias of each included trial. We resolved disagreements by consensus and, if needed, referral to a third review author (BW).

We assessed the following risks of bias for each study.

  • Random sequence generation (checking for possible selection bias). We assessed the method used to generate the allocation sequence as: low risk of bias (any truly random process, for example random number table; computer random number generator); unclear risk of bias (method used to generate sequence not clearly stated). We excluded studies using a non‐random process (for example, odd or even date of birth; hospital or clinic record number).

  • Allocation concealment (checking for possible selection bias). The method used to conceal allocation to interventions prior to assignment determines whether intervention allocation could have been foreseen in advance of, or during recruitment, or changed after assignment. We assessed the methods as: low risk of bias (for example, telephone or central randomization; consecutively numbered, sealed, opaque envelopes); unclear risk of bias (method not clearly stated).

  • Blinding of participants and personnel/treatment providers (systematic performance bias). We assessed the methods used to blind participants and personnel/treatment providers from knowledge of which intervention a participant received. We assessed the methods as: low risk of bias (study states that it was blinded and describes the method used to achieve blinding, for example, identical tablets; matched in appearance and smell); unclear risk of bias (study states that it was blinded but does not provide an adequate description of how it was achieved); high risk (blinding of participants was not ensured, e.g. tablets different in form or taste).

  • Blinding of outcome assessment (checking for possible detection bias). We assessed the methods used to blind study outcome assessors from knowledge of which intervention a participant received. We assessed the methods as: low risk of bias (study states that outcome assessors were blinded to the intervention or exposure status of participants); unclear risk of bias (study states that it was blinded but does not provide an adequate description of how it was achieved); high risk of bias (outcome assessors knew the intervention or exposure status of participants).

  • Incomplete outcome data (checking for possible attrition bias due to the amount, nature, and handling of incomplete outcome data). We assessed the methods used to deal with incomplete data as: low risk of bias (fewer than 10% of participants did not complete the study or used ’baseline observation carried forward’ analysis, or both); unclear risk of bias (used ’last observation carried forward’ (LOCF) analysis); or high risk of bias (used ’completer’ analysis).

  • Reporting bias due to selective outcome reporting (reporting bias). We checked if an a priori study protocol was available and if all outcomes of the study protocol were reported in the publications of the study. There is low risk of reporting bias if the study protocol is available and all of the study’s prespecified (primary and secondary) outcomes that are of interest in the review have been reported in the prespecified way, or if the study protocol is not available but it is clear that the published reports include all expected outcomes, including those that were prespecified (convincing text of this nature may be uncommon). There is a high risk of reporting bias if not all of the study’s prespecified primary outcomes have been reported; one or more primary outcomes is reported using measurements, analysis methods or subsets of the data (for example, subscales) that were not prespecified; one or more reported primary outcomes were not prespecified (unless clear justification for their reporting is provided, such as an unexpected adverse effect); one or more outcomes of interest in the review are reported incompletely so that they cannot be entered in a meta‐analysis; the study report fails to include results for a key outcome that would be expected to have been reported for such a study.

  • Group similarity at baseline (selection bias). We assessed similarity of the study groups at baseline for the most important prognostic clinical and demographic indicators. There is low risk of bias if groups are similar at baseline for demographic factors, value of main outcome measure(s), and important prognostic factors. There is high risk of bias if groups are not similar at baseline for demographic factors, value of main outcome measure(s) and important prognostic factors.

  • Size of study (checking for possible biases confounded by small size). We assessed studies as being at low risk of bias (≥ 200 participants per treatment arm); unclear risk of bias (50 to 199 participants per treatment arm); high risk of bias (< 50 participants per treatment arm).

We defined studies with zero to two unclear or high risks of bias as high‐quality studies, those with three to five unclear or high risks of bias as moderate‐quality studies, and those with six to eight unclear or high risks of bias as low‐quality studies (Häuser 2015b).
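
The counting rule above, together with the study size thresholds, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that each study carries exactly eight domain ratings (the eight domains listed above), each recorded as "low", "unclear" or "high".

```python
def study_size_risk(participants_per_arm: int) -> str:
    """Risk of bias from study size, using the thresholds stated in the text."""
    if participants_per_arm >= 200:
        return "low"
    if participants_per_arm >= 50:
        return "unclear"
    return "high"


def study_quality(domain_ratings: list[str]) -> str:
    """Overall quality from the number of domains rated 'unclear' or 'high'.

    Assumes one rating per domain for the eight domains described above.
    """
    if len(domain_ratings) != 8:
        raise ValueError("expected one rating for each of the eight domains")
    flagged = sum(rating in ("unclear", "high") for rating in domain_ratings)
    if flagged <= 2:
        return "high"
    if flagged <= 5:
        return "moderate"
    return "low"


# Hypothetical study: two unclear domains and one high-risk domain.
ratings = ["low", "low", "unclear", "low", "unclear", "low", "low", "high"]
print(study_quality(ratings))
```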

Measures of treatment effect

Our effect measures of choice were risk differences (RD) for dichotomous data and standardized mean difference (SMD) for continuous data (using the inverse variance method). We used a random‐effects model because we assumed that clinical heterogeneity would be present. We expressed uncertainty using 95% confidence intervals (CIs).

We calculated the number needed to treat for an additional beneficial outcome (NNTB) as the reciprocal of the absolute risk reduction (ARR). For unwanted effects, the NNTB becomes the number needed to treat for an additional harmful outcome (NNTH) and is calculated in the same manner. For dropouts due to lack of efficacy, NNTp becomes the number of participants needed to prevent an additional unwanted outcome and is calculated in the same manner. The threshold for 'clinically relevant benefit' or 'clinically relevant harm' was set for categorical variables by an absolute risk reduction or increase of 10% or greater, corresponding to an NNTB or NNTH of 10 or less (Moore 2008).
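
As a worked illustration of these definitions, the following Python sketch computes a risk difference, the corresponding number needed to treat, and the 10% relevance check for a single two-arm comparison. The function names and the event counts are hypothetical and are not taken from any included study; exact fractions are used so that a difference of exactly 10% is not lost to floating-point rounding.

```python
import math
from fractions import Fraction


def risk_difference(events_treated: int, n_treated: int,
                    events_control: int, n_control: int) -> Fraction:
    """Absolute difference in event rates (treated minus control)."""
    return Fraction(events_treated, n_treated) - Fraction(events_control, n_control)


def number_needed_to_treat(rd: Fraction) -> float:
    """Reciprocal of the absolute risk difference (NNTB or NNTH by context)."""
    if rd == 0:
        return math.inf
    return float(1 / abs(rd))


def is_clinically_relevant(rd: Fraction, threshold: Fraction = Fraction(1, 10)) -> bool:
    """True if the absolute risk change is 10% or greater (NNT of 10 or less)."""
    return abs(rd) >= threshold


# Hypothetical counts: 30 of 100 responders on drug, 20 of 100 on placebo.
rd = risk_difference(30, 100, 20, 100)
print(float(rd), number_needed_to_treat(rd), is_clinically_relevant(rd))
```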

We used Cohen’s categories to evaluate the magnitude of the effect size, calculated by SMD, with values for Hedges' g as follows: 0.2 to 0.5 equating to a small effect size, 0.5 to 0.8 equating to a medium effect size, and more than 0.8 equating to a large effect size (Cohen 1988). We considered values of g less than 0.2 to equate to a 'not substantial' effect size (Häuser 2015b). The threshold ’clinically relevant benefit’ was set for continuous variables by an effect size more than 0.2 (Fayers 2014).
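
A minimal Python sketch of this classification follows. It assumes the usual small-sample correction for Hedges' g and treats the boundary values as follows: exactly 0.5 counts as medium and exactly 0.8 counts as medium, since the text defines large as more than 0.8. The function names and the example values are hypothetical.

```python
import math


def hedges_g(mean_trt: float, sd_trt: float, n_trt: int,
             mean_ctl: float, sd_ctl: float, n_ctl: int) -> float:
    """Standardized mean difference with the usual small-sample correction."""
    df = n_trt + n_ctl - 2
    pooled_sd = math.sqrt(((n_trt - 1) * sd_trt ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df)
    cohen_d = (mean_trt - mean_ctl) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)
    return cohen_d * correction


def effect_size_category(g: float) -> str:
    """Map the magnitude of g onto the categories described in the text."""
    magnitude = abs(g)
    if magnitude < 0.2:
        return "not substantial"
    if magnitude < 0.5:
        return "small"
    if magnitude <= 0.8:
        return "medium"
    return "large"


# Hypothetical group summaries (not from any included study).
g = hedges_g(5.0, 2.0, 50, 4.0, 2.0, 50)
print(round(g, 3), effect_size_category(g))
```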

We calculated the NNTBs for continuous variables (fatigue, sleep problems, HRQoL) using the Wells calculator software available at the Cochrane Musculoskeletal Group editorial office, which estimates, from the SMDs, the proportion of participants who will benefit from treatment if there was a statistically significant (P value ≤ 0.05) difference between SNRIs and control group (Norman 2001). We used a minimally important difference (MID) of 0.5 for calculation.

We calculated measures of treatment effect if at least two studies with at least 200 participants were available.

Unit of analysis issues

In trials comparing multiple SNRI‐dosage arms with one placebo group, for continuous outcomes we adjusted the number of participants in the placebo group according to the number of participants in the different SNRI‐dosage arms. For dichotomous variables we pooled the different SNRI dosage arms and compared the pooled results with the placebo arm.
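
One possible reading of this adjustment is sketched below in Python: the shared placebo group is divided among the SNRI arms in proportion to their sizes for continuous outcomes, and the SNRI arms are summed into one group for dichotomous outcomes. The proportional split is an assumption, since the text does not state the exact rule, and the function names and numbers are hypothetical.

```python
def split_control_group(n_control: int, active_arm_sizes: list[int]) -> list[float]:
    """Share the placebo participants among active arms in proportion to arm size.

    Assumption: a proportional split; the review text does not specify the rule.
    """
    total_active = sum(active_arm_sizes)
    return [n_control * size / total_active for size in active_arm_sizes]


def pool_arms(arms: list[tuple[int, int]]) -> tuple[int, int]:
    """Combine (events, participants) pairs from several dosage arms into one group."""
    events = sum(e for e, _ in arms)
    participants = sum(n for _, n in arms)
    return events, participants


# Hypothetical trial: two dosage arms of 100 and 200 participants, 120 on placebo.
print(split_control_group(120, [100, 200]))
print(pool_arms([(10, 100), (30, 200)]))
```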

Dealing with missing data

We used intention‐to‐treat (ITT) analysis data. The ITT population consisted of participants who were randomized, took the assigned study medication, and provided at least one post‐baseline assessment. Wherever possible, we assigned zero improvement to missing participants. However, most studies in chronic pain report results, including responder results, using last observation carried forward. This has been questioned as being potentially biased, as withdrawal is an important outcome that makes last observation carried forward unreliable. Last observation carried forward can lead to overestimation of efficacy, particularly in situations where adverse event withdrawal rates differ between active and control groups. At this time it is unclear what strategy can actually be used to deal with missing data inside studies (Moore 2012). We examined and reported imputation strategies clearly.

Where means or SDs were missing, we attempted to obtain these data through contacting trial authors for the first version, but not for the update of the review. Where SDs were not available from trial authors, we calculated them from t‐values, CIs or standard errors, where reported in articles (Higgins 2011). Where rates of pain relief of 30% and 50% or greater were not reported and not provided on request, we calculated them from means and SDs by a validated imputation method (Furukawa 2005).
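
The standard conversions for recovering a missing SD can be sketched as follows. This Python example assumes large-sample 95% confidence intervals (multiplier 1.96) and, for the t‐value case, a two-group comparison with a common SD; the function names and input values are hypothetical.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """SD of a single group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD of a single group from a confidence interval for its mean.

    Assumes a large-sample 95% interval (z = 1.96) unless another z is given.
    """
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


def sd_from_t(mean_difference: float, t_value: float, n1: int, n2: int) -> float:
    """Common within-group SD from the t-value of a two-group comparison."""
    se_difference = abs(mean_difference / t_value)
    return se_difference / math.sqrt(1 / n1 + 1 / n2)


# Hypothetical inputs.
print(sd_from_se(0.5, 100))
print(sd_from_ci(9.02, 10.98, 100))
print(sd_from_t(2.0, 4.0, 50, 50))
```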

Assessment of heterogeneity

We used the I² statistic for heterogeneity (Higgins 2003). I² values less than 25% indicate low heterogeneity, values of 25% to less than 50% indicate moderate heterogeneity, and values of 50% or over indicate substantial heterogeneity (Deeks 2011).
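
For illustration, the statistic can be derived from Cochran's Q and the number of studies, and then mapped onto the categories above. The Python sketch below uses hypothetical function names and input values; it assigns a value of exactly 50% to the substantial category.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Percentage of variability due to heterogeneity rather than chance."""
    df = n_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def heterogeneity_category(i2_percent: float) -> str:
    """Categories as described in the text above."""
    if i2_percent < 25:
        return "low"
    if i2_percent < 50:
        return "moderate"
    return "substantial"


# Hypothetical meta-analysis: Q = 20 across 11 studies.
value = i_squared(20.0, 11)
print(value, heterogeneity_category(value))
```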

Assessment of reporting biases

We assessed publication bias using a method designed to detect the amount of unpublished data with a null effect required to make any result clinically irrelevant (usually taken to mean an NNTB of 10 or higher) (Moore 2008).

Data synthesis

We undertook each meta‐analysis using a random‐effects model in RevMan 5 (RevMan 2014).
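
To make the random‐effects pooling concrete, the sketch below implements inverse‐variance pooling with the DerSimonian and Laird estimate of between‐study variance. This estimator is an assumption made for illustration, since the text names only the model and the software; the function name and the input values are hypothetical.

```python
import math


def random_effects_pool(effects: list[float], variances: list[float]) -> dict:
    """Inverse-variance random-effects pooling (DerSimonian and Laird tau^2)."""
    k = len(effects)
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "estimate": pooled,
        "ci_lower": pooled - 1.96 * se,
        "ci_upper": pooled + 1.96 * se,
        "tau2": tau2,
    }


# Hypothetical effect estimates (for example, risk differences) with variances.
print(random_effects_pool([0.0, 1.0], [0.1, 0.1]))
```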

Quality of evidence

Two review authors (NÜ, WH) independently rated the quality of the outcomes. We used the GRADE system to rank the quality of the evidence using the GRADEprofiler Guideline Development Tool software (GRADEpro GDT), and the guidelines provided in Chapter 12.2 of the Cochrane Handbook for Systematic Reviews of Interventions (Schünemann 2011).

The GRADE approach uses five considerations (study limitations, consistency of effect, imprecision, indirectness and publication bias) to assess the quality of the body of evidence for each outcome. The GRADE system uses the following criteria for assigning grade of evidence.

  • High: we are very confident that the true effect lies close to that of the estimate of the effect

  • Moderate: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different

  • Low: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect

  • Very low: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect

The GRADE system uses the following criteria for assigning a quality level to a body of evidence (Chapter 12, Schünemann 2011).

  • High: randomized trials; or double‐upgraded observational studies

  • Moderate: downgraded randomized trials; or upgraded observational studies

  • Low: double‐downgraded randomized trials; or observational studies

  • Very low: triple‐downgraded randomized trials; or downgraded observational studies; or case series/case reports

Factors that may decrease the quality level of a body of evidence are as follows.

  • limitations in the design and implementation of available studies suggesting high likelihood of bias. We assumed that there were limitations in study design if more than 50% of participants were from low‐quality studies, as defined by the 'Risk of bias' tool;

  • indirectness of evidence (indirect population, intervention, control, outcomes). We assessed whether the question being addressed by the systematic review diverged from the available evidence, in terms of the population in routine clinical care, if exclusion of participants with clinically relevant somatic disease (e.g. inflammatory rheumatic diseases) and/or depressive and anxiety disorders in the included studies resulted in 50% or more of the total participant collective of the systematic review coming from studies in which participants with clinically relevant somatic disease (e.g. inflammatory rheumatic diseases) and/or depressive and anxiety disorders had been excluded;

  • unexplained heterogeneity or inconsistency of results (including problems with subgroup analyses);

  • imprecision of results (wide confidence intervals);

  • high probability of publication bias. We assumed a publication bias if all studies were initiated and funded by the manufacturer of the drug.

Factors that may increase the quality level of a body of evidence are:

  • large magnitude of effect;

  • all plausible confounding would reduce a demonstrated effect or suggest a spurious effect when results show no effect;

  • dose‐response gradient.

We decreased the grade rating by one (‐ 1) or two (‐ 2) (up to a maximum of ‐ 3 to 'very low') if we identified:

  • serious (‐ 1) or very serious (‐ 2) limitation to study quality;

  • important inconsistency (‐ 1);

  • some (‐ 1) or major (‐ 2) uncertainty about directness;

  • imprecise or sparse data (‐ 1);

  • high probability of reporting bias (‐ 1).
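
The downgrading rules listed above can be expressed as a short Python sketch. It assumes that evidence from randomized trials starts at 'high' and that the total downgrade is capped at three levels; the factor labels, function name and example are hypothetical, and upgrading is not modelled.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Maximum number of levels each factor can remove, per the list above.
MAX_DOWNGRADE = {
    "study limitations": 2,
    "inconsistency": 1,
    "indirectness": 2,
    "imprecision": 1,
    "reporting bias": 1,
}


def grade_rating(downgrades: dict[str, int], start: str = "high") -> str:
    """Apply downgrades to a starting level, never going below 'very low'."""
    total = 0
    for factor, points in downgrades.items():
        if factor not in MAX_DOWNGRADE:
            raise ValueError(f"unknown factor: {factor}")
        if not 0 <= points <= MAX_DOWNGRADE[factor]:
            raise ValueError(f"invalid downgrade for {factor}: {points}")
        total += points
    total = min(total, 3)
    return LEVELS[max(0, LEVELS.index(start) - total)]


# Hypothetical outcome: serious study limitations and some indirectness.
print(grade_rating({"study limitations": 1, "indirectness": 1}))
```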

'Summary of findings' table

We included a 'Summary of findings' table to present the main findings in a transparent and simple tabular format. In particular, we included key information concerning the quality of evidence, the magnitude of effect of the interventions examined, and the sum of available data on the outcomes.

We included the following outcomes in the 'Summary of findings' table.

  • Self‐reported pain relief 50% or greater

  • Patient global impression to be much or very much improved

  • Self‐reported fatigue

  • Self‐reported sleep problems

  • Self‐reported health‐related quality of life

  • Withdrawal rates due to adverse events (tolerability)

  • Serious adverse events (safety)

Subgroup analysis and investigation of heterogeneity

We performed a subgroup analysis of duloxetine and milnacipran studies to test for potential differences in benefits and harms of these two drugs. We performed a subgroup analysis of studies with and without European participants to test for potential transatlantic differences between the efficacy and adverse events of SNRIs. A more detailed analysis of European versus non‐European participants was not possible because the studies with mixed continent samples did not report how many participants were recruited from each continent. We decided to restrict the comparisons to pain relief of 50% or greater and dropout due to adverse events in order not to inflate the number of comparisons. To test the hypotheses of a subgroup effect, we used a test of interaction with a predetermined, two‐tailed α value of 0.05 for subgroup analysis of studies with and without European participants (Altman 2003). We did not conduct the intended subgroup analyses with gender and pain because individual participant data were not available.
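
A test of interaction between two subgroup estimates can be sketched as a z‐test on their difference. The Python example below assumes two independent subgroup estimates with known standard errors and a normal approximation; the function name and the numbers are hypothetical.

```python
import math


def interaction_test(estimate_1: float, se_1: float,
                     estimate_2: float, se_2: float) -> tuple[float, float]:
    """z statistic and two-tailed P value for the difference of two estimates."""
    difference = estimate_1 - estimate_2
    se_difference = math.sqrt(se_1 ** 2 + se_2 ** 2)
    z = difference / se_difference
    # Two-tailed P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical subgroup risk differences with their standard errors.
z, p = interaction_test(0.30, 0.03, 0.10, 0.04)
print(round(z, 2), p < 0.05)
```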

Sensitivity analysis

We planned to conduct sensitivity analyses (different statistical models applied, diagnostic criteria used in the trial, presence/absence of any mental or psychiatric disorder, and presence/absence of any concomitant systemic disease).

Results

Description of studies

Results of the search

In the previous review, one excluded study (Gendreau 2005) was incorrectly categorized, and should have been a secondary reference to Vitton 2004. The total number of studies excluded in the 2013 review was nine. The total number of included studies in the 2013 review was 10 (11 reports).

The updated searches (last performed 8 August 2017) produced 214 records after duplicates were removed. We excluded 22 studies in total: we excluded nine in the 2013 review (Branco 2011; Chappell 2009b; Dwight 1998; Goldenberg 2010; Hsiao 2007; Mease 2010; Saxe 2012; Sayar 2003; NCT00369343) (total corrected in this review), and 13 additional studies for the update (Ahmed 2016; Ang 2013; Natelson 2015; NCT00725101; NCT00793520; NCT01108731; NCT01173055; NCT01234675; NCT01294059; NCT01331109; NCT01621191; Trugman 2014; Ziljstra 2002). One study with desvenlafaxine excluded in the 2013 review, available in clinicaltrials.gov and which did not report data suited for meta‐analysis (NCT00369343), was published in a peer‐reviewed journal in 2017. The published data were again not suited for meta‐analysis (Allen 2017, secondary reference to NCT00369343). One study with venlafaxine, which was not found in the search for the 2013 review (Ziljstra 2002), was published in a peer‐reviewed journal in 2015 (vanDerWeide 2015, secondary reference to Ziljstra 2002). The published data of both reports were not suited for meta‐analysis. See the Characteristics of excluded studies table for further details about reasons for exclusion and Figure 1 for the study flow diagram.


Study flow diagram

Mohs 2012 is a secondary reference for Arnold 2010a, which was an included study in the 2013 review; Mease 2014 is a secondary reference for Clauw 2013, an included study added at this update.

We included eight new studies (nine reports) (Arnold 2012a; Bateman 2013; Clauw 2013; Leombruni 2015; Matthey 2013; Murakami 2015; NCT00697787; Staud 2015).

In sum, we included 18 studies in the qualitative and quantitative analysis. See the Characteristics of included studies table for a full description of the studies.

Included studies

We included eight studies with duloxetine (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2012a; Chappell 2009a; Leombruni 2015; Murakami 2015; Russell 2008), nine studies with milnacipran (Arnold 2010b; Bateman 2013; Branco 2010; Clauw 2008; Clauw 2013; Matthey 2013; Staud 2015; Mease 2009b; Vitton 2004) and one study with desvenlafaxine (NCT00697787) in the analysis of placebo controlled trials. The eight studies with duloxetine included 11 study arms with different dosages of duloxetine. The studies with milnacipran contained 11 study arms with different dosages of milnacipran. Two studies were entered in the analysis of active drug controlled trials (Leombruni 2015; NCT00697787), one with duloxetine (Leombruni 2015) and one with desvenlafaxine (NCT00697787). One of these studies had three study arms (desvenlafaxine fixed dosage, pregabalin fixed dosage, placebo) (NCT00697787). The studies included a total of 7903 participants.

Study characteristics

All studies were conducted in multiple research centers except three single‐center studies (Leombruni 2015; Matthey 2013; Staud 2015). Eight studies were conducted in the USA (Arnold 2004; Arnold 2005; Clauw 2008; Mease 2009b; Clauw 2013; NCT00697787; Staud 2015; Vitton 2004), two studies each in the USA and Puerto Rico (Arnold 2010a; Russell 2008) and in more than one continent (Arnold 2012a; Bateman 2013), one study each in the USA and Western Europe (Chappell 2009a), in the USA and Canada (Arnold 2010b) and in Japan (Murakami 2015), and three studies in Europe (Branco 2010; Leombruni 2015; Matthey 2013). All studies had a parallel design except one with an EERW design (Clauw 2013). Study duration ranged between 6 and 12 weeks in 11 studies (short‐term studies) (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2012a; Bateman 2013; Clauw 2013; Leombruni 2015; Matthey 2013; NCT00697787; Staud 2015; Vitton 2004) and between 13 and 26 weeks in four studies (medium‐term studies) (Arnold 2010b; Branco 2010; Clauw 2008; Russell 2008). Two studies had a long‐term duration (> 26 weeks) with 27 weeks each (Chappell 2009a; Mease 2009b). Two studies were started after 2010 (Leombruni 2015; Murakami 2015); the remaining studies were conducted between 2002 and 2010.

All studies were funded by the manufacturer of the respective drug except one study that did not report details of funding (Leombruni 2015). There was no investigator‐initiated study or public funding. All authors but five (Arnold 2005; Arnold 2010a; Arnold 2010b; Leombruni 2015; NCT00697787) declared potential conflicts of interest. Two authors (Matthey 2013; Staud 2015) stated that they had no potential financial conflict of interest. The remaining authors, who declared conflicts of interest, reported having received payments from the sponsor of the study for consultancies, owning stocks, or being employees of the sponsor of the study.

Participant characteristics

All studies included participants over 18 years old. Diagnosis of fibromyalgia was established in all studies by the ACR 1990 classification criteria (Wolfe 1990). All studies required a pain score of more than 3 for inclusion except for Chappell 2009a; NCT00697787; Staud 2015 and Vitton 2004, which did not require a minimum pain score for inclusion. Mease 2009b required a pain score of more than 4 for inclusion. Bateman 2013 required that participants reported no adequate reduction of fibromyalgia symptoms by previous treatment with duloxetine 60 mg a day. All studies excluded participants with somatic diseases, including inflammatory rheumatic diseases. All duloxetine studies included participants with mental disorders, except for major depression (all studies) and general anxiety disorder (all but one study, Arnold 2010a). All milnacipran studies excluded participants with severe mental disorders including major depression except Vitton 2004. Middle‐aged, white women prevailed in all studies: the median of the mean age was 49 years (range 47 to 55 years). The median of the percentage of women was 95% (range 82% to 100%). The median of the percentage of white people was 91% (range 0% to 97%). Three studies conducted in Europe (Branco 2010; Leombruni 2015; Matthey 2013) and two studies conducted in the USA (NCT00697787; Staud 2015) did not report the ethnicity of the participants. The percentage of participants with major depressive disorder in the duloxetine studies ranged from 4% to 41%. A total of 4230 (mean 235; SD 55; minimum 23, maximum 795) participants were included in the active drug groups and 2997 (mean 167; SD 36; minimum 21, maximum 509) in the comparison groups.

Interventions

Duloxetine dosage was fixed, with 30 mg a day in Arnold 2012a and Russell 2008, 60 mg a day in Arnold 2005; Murakami 2015 and Russell 2008, and 120 mg a day in Arnold 2004; Arnold 2005 and Russell 2008. Duloxetine dosage was flexible with 30 mg or 60 mg a day in Leombruni 2015 and 60 mg or 120 mg a day in Arnold 2010a and Chappell 2009a. Milnacipran dosage was fixed, with 100 mg a day in Bateman 2013; Branco 2010; Clauw 2008; Mease 2009b and Staud 2015, and with 200 mg a day in Clauw 2008; Matthey 2013; Mease 2009b and Vitton 2004, and flexible (100 mg or 200 mg a day) in Arnold 2010b and Clauw 2013. In addition, one study compared duloxetine 60 mg a day, fixed, with L‐carnitine 300 mg a day (Leombruni 2015), and one study compared desvenlafaxine 200 mg a day, fixed, with pregabalin 450 mg a day, fixed (NCT00697787). The other studies had a single SNRI arm. The rescue medication in duloxetine trials was acetaminophen (paracetamol) up to 2 g a day and aspirin up to 325 mg a day, and in milnacipran trials was hydrocodone up to 60 mg a day. The desvenlafaxine study did not report on rescue medication (NCT00697787).

Primary outcomes
Self‐reported pain relief of 50% or greater

All studies used different measures for pain. We selected the predefined primary outcome variables of the studies for analysis. The duloxetine studies (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2012a; Chappell 2009a; Murakami 2015; Russell 2008) assessed pain using the Brief Pain Inventory (BPI) 24‐hour average pain score except Leombruni 2015, which used a VAS 0 to 10 scale. The milnacipran trials used the patient electronic diary 24‐hour recall pain score (Arnold 2010b; Bateman 2013; Branco 2010; Clauw 2008; Clauw 2013; Matthey 2013; Staud 2015; Mease 2009b; Vitton 2004) and the desvenlafaxine study used a pain score numeric rating scale (NRS) from 0 (no pain) to 10 (worst pain imaginable) (NRS 0‐10) across the last 24 hours and seven days (NCT00697787).

Patient global impression much or very much improved

Patient global impression of change was assessed by all studies except three studies (NCT00697787; Staud 2015; Vitton 2004). However, five studies reported only average scores, which could not be used for the predefined analysis (Arnold 2012a; Branco 2010; Leombruni 2015; Matthey 2013; Murakami 2015). The desvenlafaxine study did not assess this outcome (NCT00697787).

Tolerability (withdrawals due to adverse events)

All studies reported the number of participants dropping out due to adverse events.

Safety (serious adverse events)

Four studies reported no details at all of the assessment (Bateman 2013; Leombruni 2015; NCT00697787; Staud 2015). The remaining studies used physical examination, electrocardiograms, and laboratory analysis for the assessment of adverse events. Four studies did not report details about how they had assessed subjective adverse symptoms (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2010b). Three studies reported the recording of spontaneously‐reported adverse events (Chappell 2009a; Russell 2008; Vitton 2004), another two studies reported spontaneously‐reported and investigator‐observed adverse events (Clauw 2008; Mease 2009b), and one study reported both spontaneously‐reported and investigator‐observed (use of non‐leading questions) adverse events (Branco 2010). Two studies used the Columbia Suicide Severity Scale to assess suicidality (Arnold 2012a; Murakami 2015).

Secondary outcomes
Self‐reported fatigue

Fatigue was assessed by the single item of the FIQ (Arnold 2004; Arnold 2005; Arnold 2012a; Bateman 2013; Murakami 2015; Vitton 2004), by a VAS 0‐10 (Staud 2015), or by the Multidimensional Fatigue Inventory (MFI) (Arnold 2010a; Arnold 2010b; Branco 2010; Chappell 2009a; Clauw 2008; Clauw 2013; Matthey 2013; Mease 2009b). One study assessed fatigue by a visual analog scale (VAS) from 0 to 100 (Staud 2015). Two studies did not report the outcome (Bateman 2013; Leombruni 2015). One study did not assess this outcome (NCT00697787).

Self‐reported sleep problems

Six duloxetine studies (Arnold 2004; Arnold 2005; Arnold 2010a; Chappell 2009a; Murakami 2015; Russell 2008) assessed sleep disturbances using the BPI sleep interference scale. However, three of these studies did not report the sleep outcomes (Arnold 2004; Arnold 2010a; Chappell 2009a). Two duloxetine studies (Arnold 2012a; Leombruni 2015) did not assess sleep problems.

Sleep was assessed by the Medical Outcomes Study (MOS) in four milnacipran studies (Branco 2010; Clauw 2008; Matthey 2013; Mease 2009b). The Vitton 2004 study used the Jenkins Sleep Scale. One milnacipran study did not report on the assessment of sleep outcomes (Arnold 2010b). The remaining milnacipran studies did not assess sleep problems (Arnold 2012a; Bateman 2013; Clauw 2013; Staud 2015).

The study with desvenlafaxine did not assess this outcome (NCT00697787).

Self‐reported health‐related quality of life

Two studies did not assess health‐related quality of life (NCT00697787; Staud 2015). The remaining studies, except Arnold 2010a, used the FIQ total score; one of them used the revised FIQ (Clauw 2013). Arnold 2010a used the Short Form Health Survey SF‐36. One study did not report the FIQ total score (Leombruni 2015).

Self‐reported pain relief of 30% or greater

All studies used different measures for pain. We selected the predefined primary outcome variables of the studies for analysis. The duloxetine studies (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2012a; Chappell 2009a; Murakami 2015; Russell 2008) assessed pain using the Brief Pain Inventory (BPI) 24‐hour average pain score, except Leombruni 2015, which used a VAS 0 to 10 scale. The milnacipran trials assessed pain using the patient electronic diary 24‐hour recall pain score (Arnold 2010b; Bateman 2013; Branco 2010; Clauw 2008; Clauw 2013; Matthey 2013; Staud 2015; Mease 2009b; Vitton 2004). The desvenlafaxine study used a pain score numeric rating scale (NRS) from 0 (no pain) to 10 (worst pain imaginable) across the last 24 hours and seven days (NCT00697787).

Self‐reported mean pain intensity

All studies used different measures for pain. We selected the predefined primary outcome variables of the studies for analysis. The duloxetine studies (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2012a; Chappell 2009a; Murakami 2015; Russell 2008) assessed pain using the Brief Pain Inventory (BPI) 24‐hour average pain score, except Leombruni 2015, which used a VAS 0 to 10 scale. The milnacipran trials assessed pain using the patient electronic diary 24‐hour recall pain score (Arnold 2010b; Bateman 2013; Branco 2010; Clauw 2008; Clauw 2013; Matthey 2013; Staud 2015; Mease 2009b; Vitton 2004). The desvenlafaxine study used a pain score numeric rating scale (NRS) from 0 (no pain) to 10 (worst pain imaginable) across the last 24 hours and seven days (NCT00697787).

Self‐reported depression

Arnold 2005 used the Hamilton Depression Rating Scale (HDRS), Leombruni 2015 used the depression subscale of the Hospital Anxiety and Depression Scale, Clauw 2013 and Vitton 2004 used the FIQ single item depression scale, and Staud 2015 used a VAS 0 to 100 scale. The remaining studies used the Beck Depression Inventory (BDI). The desvenlafaxine study did not assess this outcome (NCT00697787).

Self‐reported anxiety

Four studies used the Beck Anxiety Inventory (BAI) to assess anxiety (Arnold 2004; Arnold 2010a; Arnold 2010b; Arnold 2012a), two studies used the State‐Trait Anxiety Inventory (STAI) (Branco 2010; Matthey 2013), three studies used the FIQ single item scale (Clauw 2013; Murakami 2015; Vitton 2004), and Staud 2015 used a VAS 0 to 100. The remaining studies, including the desvenlafaxine study (NCT00697787), did not assess this outcome.

Self‐reported disability

We used the BPI average interference scale as a measure of disability in six studies with duloxetine (Arnold 2004; Arnold 2005; Arnold 2010a; Arnold 2010b; Arnold 2012a; Branco 2010; Chappell 2009a; Russell 2008). The remaining seven studies used three different measures for disability/physical function, namely subscale data of: the Multidimensional Health Assessment Questionnaire (MDHAQ) (Clauw 2008); the Short Form Health Survey physical component summary score (Clauw 2013; Leombruni 2015; Murakami 2015; Mease 2009b); and the FIQ single item subscale (Bateman 2013; Vitton 2004). One study did not report the FIQ single item subscale score (Matthey 2013). Two studies did not assess this outcome (NCT00697787; Staud 2015).

Self‐reported sexual function

Only three studies reported on the assessment of sexual function by the Arizona Sexual Experience Scale. However, one study did not report the data (Clauw 2008), and the other did not report the SDs (Mease 2009b). Only one study reported outcomes suitable for meta‐analysis (Bateman 2013).

Self‐reported cognitive disturbances

Three duloxetine studies assessed cognitive disturbances ('fibro fog') using the mental fatigue subscale of the MFI (Arnold 2010a; Chappell 2009a; Russell 2008), and five milnacipran studies used the Multiple Ability Self‐report Questionnaire (MASQ) (Arnold 2010b; Bateman 2013; Branco 2010; Clauw 2008; Mease 2009b).

Tender point pain threshold

Only four duloxetine studies measured tender point pain threshold (Arnold 2004; Arnold 2005; Chappell 2009a; Russell 2008).

Dropout due to lack of efficacy

All the included studies reported this outcome except Clauw 2013 and Staud 2015.

Specific adverse events

Nausea

All the included studies reported this adverse event except Arnold 2004; Leombruni 2015; Matthey 2013; Staud 2015; and Vitton 2004.

Somnolence

All the duloxetine studies except Arnold 2004 reported this adverse event, as well as the desvenlafaxine study (NCT00697787). None of the studies with milnacipran reported on this outcome.

Insomnia

All the included studies reported this adverse event except Arnold 2004; Clauw 2013; Leombruni 2015; Murakami 2015; Matthey 2013; Staud 2015; and Vitton 2004.

Excluded studies

We excluded 22 studies in total. Five studies had fewer than 20 participants per treatment arm (Ahmed 2016; Natelson 2015; NCT00793520; NCT01108731; NCT01234675); 11 studies had no control group (Branco 2011; Chappell 2009b; Dwight 1998; Goldenberg 2010; Hsiao 2007; Mease 2010; NCT00725101; NCT01294059; NCT01331109; NCT01621191; Sayar 2003); two studies did not include outcomes of efficacy, which were a precondition for inclusion in our review (NCT01173055; Trugman 2014); one study combined milnacipran with education or psychological therapies (Ang 2013); one study lasted less than four weeks (Saxe 2012); one study was only published as an abstract (Ziljstra 2002); and for one study the results were incompletely reported and not suited for quantitative analysis (NCT00369343).

Studies awaiting classification

We found one study with duloxetine 60 mg a day whose results were not reported (NCT01268631). The recruitment status of another study was unknown (NCT01268631).

Risk of bias in included studies

In general, the risk of bias differed between the included studies (see Figure 2 and Figure 3 for 'Risk of bias' summary and graph). Detailed information regarding the 'Risk of bias' assessment of every study is given in the Characteristics of included studies table. Seven studies met the predefined criteria of high quality for methodology (Arnold 2010a; Arnold 2010b; Branco 2010; Clauw 2008; Mease 2009b; Murakami 2015; Vitton 2004), seven studies of moderate quality for methodology (Arnold 2004; Arnold 2005; Arnold 2012a; Chappell 2009a; Matthey 2013; Russell 2008; Staud 2015) and four studies of low quality for methodology (Bateman 2013; Clauw 2013; Leombruni 2015; NCT00697787). The assessment is based on the reports in the publications. In this update we did not request missing details of methods, as we had done for the first review, because some of our requests for the first version went unanswered.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies


Risk of bias summary: review authors' judgements about each risk of bias item for each included study

Allocation

Random sequence generation was adequately described and therefore all studies were at low risk of bias except Bateman 2013; Clauw 2013; Leombruni 2015; NCT00697787 and Staud 2015, which did not adequately describe it (unclear risk of bias).

Allocation concealment was adequately described and therefore all studies were at low risk of bias except Arnold 2012a; Bateman 2013; Leombruni 2015 and NCT00697787, which did not adequately describe it (unclear risk of bias).

Blinding

Blinding of participants and personnel was adequately described in all studies except Bateman 2013; Clauw 2013; Leombruni 2015; Matthey 2013; NCT00697787 and Staud 2015, which did not adequately describe it (unclear risk of bias).

Blinding (detection bias)

Blinding of outcome assessors was adequately described in all studies except Arnold 2012a; Bateman 2013; Clauw 2013; Leombruni 2015; Matthey 2013; NCT00697787 and Staud 2015, which did not adequately describe it (unclear risk of bias).

Incomplete outcome data

Most outcomes of the study of Vitton 2004 were based on analysis of observed cases that were provided on request (high risk of bias). Leombruni 2015 and Staud 2015 also performed completer analysis (high risk of bias). Only Murakami 2015 provided analysis by the baseline observation carried forward method. The remaining studies imputed missing data by baseline or last observation carried forward and therefore we judged them to be at unclear risk of bias.

Selective reporting

Only Arnold 2010a; Murakami 2015; NCT00697787 and Vitton 2004 reported or provided on request all data of interest for this review if outlined in the protocol, and we judged them to be at low risk of bias. We judged the remaining studies to be at high risk of bias.

Other potential sources of bias

Group similarity at baseline

No significant differences in demographic and clinical variables between the study groups (low risk of bias) could be detected in the studies included except in NCT00697787 and Vitton 2004 (high risk of bias).

Sample size bias

The sample size was of a low risk of bias only in Arnold 2010a; Arnold 2010b; Branco 2010; Clauw 2008 and Mease 2009b. Six studies had a high risk of bias (Bateman 2013; Leombruni 2015; Matthey 2013; NCT00697787; Staud 2015; Vitton 2004) and seven studies an unclear risk of bias (Arnold 2004; Arnold 2005; Arnold 2012a; Chappell 2009a; Clauw 2013; Murakami 2015; Russell 2008).

Effects of interventions

See: Summary of findings 1 Serotonin noradrenaline reuptake inhibitors compared with placebo for fibromyalgia ‐ studies with parallel design

All SNRIs (desvenlafaxine, duloxetine, milnacipran) versus placebo, studies with parallel and cross‐over design

Primary outcomes
Self‐reported pain relief of 50% or greater

We entered 15 studies with 6918 participants into an analysis of the RD of participant‐reported pain relief of 50% or greater. For this outcome, 1274 of 4104 (31.0%) participants with duloxetine and milnacipran and 591 out of 2814 (21.0%) participants in the placebo group reported pain relief of 50% or greater. The RD was 0.09 (95% CI 0.07 to 0.11) (see Analysis 1.1). NNTB was 11 (95% CI 9 to 14) (P value < 0.0001). According to the predefined categories there was no clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias (see summary of findings Table 1).
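As an illustration of how the NNTB relates to the RD reported above, the following minimal Python sketch takes the reciprocal of the pooled RD and of its confidence limits. The function name `nnt_from_rd` and the rounding to the nearest whole number are assumptions made for illustration; the review's own figures were presumably derived from unrounded estimates, so small discrepancies are possible for other outcomes.

```python
def nnt_from_rd(rd, ci_low, ci_high):
    """Return (NNT, lower bound, upper bound) as rounded reciprocals
    of a risk difference and its 95% confidence limits.

    Hypothetical helper: rounding to the nearest integer is an assumption.
    """
    # The NNT has no finite interval when the RD or its CI includes zero.
    if rd == 0 or ci_low * ci_high <= 0:
        raise ValueError("NNT is undefined when the RD or its CI includes zero")
    point = round(1 / abs(rd))
    # The larger RD limit gives the smaller NNT bound, so sort the results.
    bounds = sorted(round(1 / abs(limit)) for limit in (ci_low, ci_high))
    return point, bounds[0], bounds[1]


# Pooled RD for pain relief of 50% or greater: 0.09 (95% CI 0.07 to 0.11)
print(nnt_from_rd(0.09, 0.07, 0.11))
```

With the RD of 0.09 (95% CI 0.07 to 0.11), this gives 11 (9 to 14), in line with the NNTB stated above.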

Patient global impression to be much or very much improved

We entered six studies with 2918 participants into an analysis of patient global impression much or very much improved. In the duloxetine and milnacipran groups, 888 of 1710 participants (51.9%) reported being much or very much improved, compared with 354 of 1208 (29.3%) participants in the placebo group. The RD was 0.19 (95% CI 0.12 to 0.26) (see Analysis 1.2). NNTB was 5 (95% CI 4 to 8) (P value < 0.0001). According to the predefined categories there was a clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Tolerability (withdrawals due to adverse events)

We entered 15 studies, with 7029 participants, into an analysis of withdrawals due to adverse events. Out of 4166 participants with desvenlafaxine, duloxetine and milnacipran, 794 (19.1%) dropped out due to adverse events and 292 participants out of 2863 (10.2%) dropped out in the placebo group. The RD was 0.07 (95% CI 0.04 to 0.10). The NNTH with desvenlafaxine, duloxetine and milnacipran was 14 (95% CI 10 to 25) (P value < 0.0001) (see Analysis 1.3). According to the predefined categories there was no clinically meaningful harm with desvenlafaxine, duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Safety (serious adverse events)

We entered 13 studies, with 6732 participants, into an analysis of serious adverse events. A serious adverse event was noted in 73 out of 4022 (1.8%) participants with desvenlafaxine, duloxetine and milnacipran and in 58 out of 2710 (2.1%) participants in the placebo group. The RD was ‐0.00 (95% CI ‐0.01 to 0.00) (P value 0.90) (Analysis 1.4). The quality of evidence was very low, downgraded due to indirectness, imprecision and publication bias.

Secondary outcomes
Self‐reported fatigue

We entered 12 studies with 6168 participants into an analysis of the effects of desvenlafaxine, duloxetine and milnacipran on fatigue reduction. The SMD was ‐0.13 (95% CI ‐0.18 to ‐0.08) (P value < 0.001). Based on Cohen's categories, the effect on fatigue of SNRIs versus placebo was not substantial (Analysis 1.5). The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported sleep problems

We entered eight studies, with 4547 participants, into an analysis of the effects of duloxetine and milnacipran on reduction of sleep disturbances. The overall effect on sleep disturbances was not significant (P value = 0.11) (see Analysis 1.6). The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported health‐related quality of life

We entered 14 studies, with 6861 participants, into an analysis of the effects of duloxetine and milnacipran on health‐related quality of life. SMD was ‐0.20 (95% CI ‐0.25 to ‐0.15) (P value < 0.0001). Based on Cohen's categories the effect on disease‐related quality of life of SNRIs versus placebo was not substantial (Analysis 1.7). The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported pain relief of 30% or greater

We entered 15 studies, with 6924 participants, into an analysis of the RD of participant‐reported pain relief of 30% or greater. There were 1653 out of 4105 participants (40.3%) with duloxetine and milnacipran and 888 out of 2819 (31.5%) participants in the placebo group who reported pain relief of 30% and more. The RD was 0.10 (95% CI 0.08 to 0.12). NNTB was 10 (95% CI 8 to 12) (P value < 0.0001) (Analysis 1.8). According to the predefined categories there was a clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported pain intensity

We entered 16 studies, with 7014 participants, into an analysis of the effects of desvenlafaxine, duloxetine and milnacipran on pain intensity reduction. The SMD was ‐0.22 (95% CI ‐0.27 to ‐0.17) (P value < 0.0001). According to Cohen's categories the effect on pain of desvenlafaxine, duloxetine and milnacipran compared to placebo was small (Analysis 1.9). According to the predefined categories there was a clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.
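The labelling of SMDs by Cohen's categories can be sketched as follows. This is an illustrative Python example only: the conventional cut-offs of 0.2, 0.5 and 0.8 and the label 'not substantial' for values below 0.2 are assumptions based on common usage of Cohen's categories, and the function name `cohen_category` is hypothetical. A reported value of exactly 0.20 is a boundary case, where the label depends on the unrounded estimate.

```python
def cohen_category(smd):
    """Label the magnitude of a standardized mean difference.

    Cut-offs (0.2, 0.5, 0.8) are assumed conventional values,
    not taken from this section of the review.
    """
    size = abs(smd)  # the sign only indicates the direction of the effect
    if size < 0.2:
        return "not substantial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# SMDs reported in this section for fatigue and pain intensity
for outcome, smd in [("fatigue", -0.13), ("pain intensity", -0.22)]:
    print(outcome, cohen_category(smd))
```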

Self‐reported depression

We entered 14 studies, with 6478 participants into an analysis of the effects of duloxetine and milnacipran on depression reduction. One study reported only the outcomes of one of three dosage groups (Russell 2008). SMD was ‐0.16 (95% CI ‐0.21 to ‐0.11) (P value < 0.0001) (Analysis 1.10). Based on Cohen's categories, the effect on depression of duloxetine and milnacipran versus placebo was not substantial. The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported anxiety

We entered nine studies, with 3533 participants, into an analysis of the effects of duloxetine and milnacipran on anxiety reduction. The overall effect on anxiety was not significant (P value = 0.21) (Analysis 1.11). The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported disability

We entered 13 studies, with 6789 participants into an analysis of the effects of duloxetine and milnacipran on disability reduction. SMD was ‐0.21 (95% CI ‐0.26 to ‐0.16) (P value < 0.0001). Based on Cohen's categories the effect on disability of duloxetine and milnacipran versus placebo was small (Analysis 1.12). According to the predefined categories there was a clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Self‐reported sexual function

One study with 100 participants assessed the effects of milnacipran on sexual function. There was no difference between milnacipran and placebo on sexual function (P value = 0.59). The quality of evidence was very low, downgraded due to indirectness, imprecision and publication bias.

Self‐reported cognitive disturbances

We entered eight studies, with 5444 participants, into an analysis of the effects of duloxetine and milnacipran on cognitive disturbances. The overall effect on 'fibro fog' was significant (P value < 0.0001). SMD was ‐0.16 (95% CI ‐0.21 to ‐0.10). Based on Cohen's categories, the effect on cognitive disturbances of duloxetine and milnacipran versus placebo was not substantial (Analysis 1.13). The quality of evidence was low, downgraded due to indirectness and publication bias.

Tenderness

We entered five studies, with 1444 participants, which performed tender point pain threshold measurement. Duloxetine and milnacipran were superior to placebo in raising the tender point pain threshold (P value = 0.0007), suggesting less tenderness. SMD was ‐0.21 (95% CI ‐0.33 to ‐0.09). Based on Cohen's categories the effect on tenderness of duloxetine and milnacipran versus placebo was small (Analysis 1.14). According to the predefined categories there was a clinically meaningful benefit with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Dropout due to lack of efficacy

We entered 14 studies, with 6924 participants, into an analysis of withdrawals due to lack of efficacy. Out of 4082 participants with desvenlafaxine, duloxetine and milnacipran, 264 (6.5%) dropped out due to lack of efficacy and 258 out of 2842 (9.1%) dropped out in the placebo group. The RD was ‐0.03 (95% CI ‐0.04 to ‐0.02). The number of participants needed to prevent an additional unwanted outcome (NNTp) with desvenlafaxine, duloxetine and milnacipran was 33 (95% CI 25 to 50) (P value < 0.0001) (see Analysis 1.15). According to the predefined categories there was no clinically meaningful benefit with desvenlafaxine, duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Specific adverse events

Nausea

We entered 12 studies, with 6606 participants, into an analysis of nausea as an adverse event. Out of 3918 participants with desvenlafaxine, duloxetine and milnacipran, 1253 (32.0%) reported nausea and 382 out of 2688 participants (14.2%) reported nausea in the placebo group. The RD was 0.16 (95% CI 0.14 to 0.19). The NNTH with desvenlafaxine, duloxetine and milnacipran was 6 (95% CI 5 to 7) (P value < 0.0001) (see Analysis 1.16). According to the predefined categories there was a clinically meaningful harm with desvenlafaxine, duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Somnolence

We entered seven studies, with 2514 participants, into an analysis of somnolence as an adverse event. Out of 1426 participants with duloxetine and milnacipran, 155 (10.9%) reported somnolence and 51 participants out of 1088 (4.7%) reported somnolence in the placebo group. The RD was 0.05 (95% CI 0.02 to 0.08). The NNTH with SNRIs was 20 (95% CI 12 to 50) (P value = 0.0004) (see Analysis 1.17). According to the predefined categories there was no clinically meaningful harm with duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

Insomnia

We entered nine studies, with 5387 participants, into an analysis of insomnia as an adverse event. Out of 3119 participants with desvenlafaxine, duloxetine and milnacipran, 298 (9.6%) reported insomnia and 132 participants out of 2268 (5.8%) reported insomnia in the placebo group. The RD was 0.03 (95% CI 0.01 to 0.04) (P value < 0.0001). The NNTH with desvenlafaxine, duloxetine and milnacipran was 33 (95% CI 25 to 100) (see Analysis 1.18). According to the predefined categories there was no clinically meaningful harm with desvenlafaxine, duloxetine and milnacipran. The quality of evidence was low, downgraded due to indirectness and publication bias.

All SNRIs versus placebo, studies with enriched enrolment randomized withdrawal design

We downgraded the quality of evidence by one level each to very low because of limitations of study design, indirectness and imprecision for all outcomes. We present a qualitative analysis of the data because there was only one study with 151 participants examining milnacipran available (Clauw 2013).

Primary outcomes
Loss of therapeutic response for self‐reported pain relief

There were 35 of 100 (35%) participants in the milnacipran group and 32 of 51 (62.7%) participants in the placebo group who reported a loss of therapeutic response (P value 0.0008).
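For readers who wish to check a single‐study comparison of this kind, the sketch below computes a crude risk difference and a two‐sided P value. It assumes a Wald test with unpooled variance, which is an assumption on our part: this section does not state which test was used. The function name `risk_difference_test` is hypothetical. Under that assumption, the counts above (35 of 100 versus 32 of 51) give a P value of approximately 0.0008.

```python
import math


def risk_difference_test(events_a, n_a, events_b, n_b):
    """Return (RD, z, two-sided P) for two independent proportions.

    Assumes a Wald test with unpooled variance (an illustrative choice,
    not confirmed by the source).
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rd = p_a - p_b
    # Standard error of the difference, each arm contributing its own variance
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = rd / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rd, z, p_value


# Loss of therapeutic response: milnacipran 35/100, placebo 32/51
rd, z, p = risk_difference_test(35, 100, 32, 51)
print(f"RD = {rd:.2f}, z = {z:.2f}, P = {p:.4f}")
```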

Loss of therapeutic response for patient's global impression to be much or very much improved

There were 22 of 100 participants (22%) in the milnacipran group and 25 of 51 (49.0%) participants in the placebo group who reported being much or very much worse (P value 0.0009).

Tolerability (withdrawals due to adverse events)

Two of 100 (2%) participants in the milnacipran group and 0 of 51 (0%) participants in the placebo group dropped out due to adverse events (P value 0.33).

Safety (serious adverse events)

One of 100 (1%) participants in the milnacipran group and 0 of 51 (0%) participants in the placebo group experienced a serious adverse event (P value 0.58).

Secondary outcomes
Self‐reported fatigue: Loss of therapeutic response

There were 36 of 100 participants (36%) in the milnacipran group and 20 of 51 participants (39.2%) in the placebo group who reported a loss of therapeutic response for fatigue reduction (P value 0.70).

Self‐reported sleep problems: Loss of therapeutic response

This outcome was not assessed by the study.

Self‐reported health‐related quality of life: Loss of therapeutic response

This outcome was not reported by the study.

Self‐reported depression: Loss of therapeutic response

This outcome was not reported by the study.

Self‐reported anxiety: Loss of therapeutic response

This outcome was not reported by the study.

Self‐reported disability: Loss of therapeutic response

There were 47 of 100 (47%) participants in the milnacipran group and 26 of 51 (51%) participants in the placebo group who reported a loss of therapeutic response for reduction of self‐reported disability (P value 0.82).

Self‐reported sexual function: Loss of therapeutic response

This outcome was not assessed by the study.

Self‐reported cognitive disturbances: Loss of therapeutic response

This outcome was not assessed by the study.

Tenderness: Loss of therapeutic response

This outcome was not assessed by the study.

Dropout due to lack of efficacy

This outcome was not reported sufficiently for meta‐analysis.

Specific adverse events

Nausea

Four of 100 (4%) participants with milnacipran and 1 of 50 (2%) participants with placebo reported nausea as an adverse event (P value 0.52).

Somnolence

This outcome was not reported by the study.

Insomnia

This outcome was not reported by the study.

All SNRIs versus other active drugs

We present a qualitative analysis of the data because we analysed only two studies with fewer than 200 participants for this comparison. We downgraded the quality of evidence by one level each because of limitations of study design, indirectness and imprecision to very low for each outcome.

Primary outcomes
Self‐reported pain relief of 50% or greater

This outcome was not reported by the studies analysed.

Patient global impression to be much or very much improved

This outcome was not reported by the studies analysed.

Tolerability (withdrawals due to adverse events)

Eight of 28 participants (28.6%) with duloxetine and 0 of 56 (0%) participants with L‐carnitine withdrew due to side effects (P value 0.008). None of 42 (0%) participants with desvenlafaxine and six of 43 (14.0%) participants with pregabalin dropped out due to side effects (P value 0.02).

Safety (serious adverse events)

No serious adverse event was reported in the study with duloxetine versus L‐carnitine. A serious adverse event was noted in one of the 42 (2.4%) participants with desvenlafaxine and in 1 of 43 (2.3%) participants with pregabalin (P value 0.99).

Secondary outcomes
Self‐reported fatigue

This outcome was not reported by the studies analysed.

Self‐reported sleep problems

This outcome was not reported by the studies analysed.

Self‐reported health‐related quality of life

This outcome was not reported by the studies analysed.

Self‐reported pain relief of 30% or greater

This outcome was not reported by the studies analysed.

Self‐reported pain intensity

There was no statistically significant difference between duloxetine and L‐carnitine (P value 0.87).

Self‐reported depression

There was no statistically significant difference between duloxetine and L‐carnitine (P value 0.33).

Self‐reported anxiety

There was no statistically significant difference between duloxetine and L‐carnitine (P value 0.76).

Self‐reported disability

There was no statistically significant difference between duloxetine and L‐carnitine (P value 0.42).

Self‐reported sexual function

This outcome was not reported by the studies analysed.

Self‐reported cognitive disturbances

This outcome was not reported by the studies analysed.

Tenderness

This outcome was not reported by the studies analysed.

Dropout due to lack of efficacy

Three of 42 (7.1%) participants with desvenlafaxine and none of 43 participants with pregabalin dropped out due to lack of efficacy (P value 0.09). No participant with duloxetine and L‐carnitine dropped out due to lack of efficacy.

Specific adverse events

Nausea

Six of 42 (14.3%) participants with desvenlafaxine and three of 43 (7.0%) participants with pregabalin reported nausea as an adverse event (P value 0.15).

Somnolence

Two of 42 (4.8%) participants with desvenlafaxine and six of 43 (14.0%) participants with pregabalin reported somnolence as an adverse event (P value 0.33).

Insomnia

One of 42 (2.4%) participants with desvenlafaxine and none of 43 (0%) participants with pregabalin reported insomnia as an adverse event (P value 0.31).

Heterogeneity

The I² statistic was less than 25% for all comparisons except for the outcome 'withdrawal due to adverse events' in the comparison SNRIs versus placebo (60%) and the outcome 'withdrawal due to adverse events' in the comparison SNRIs versus other active drugs (95%).
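For orientation, the I² statistic is conventionally defined from Cochran's heterogeneity statistic Q and its degrees of freedom. The formula below is the standard definition and is given as background only; the Q values of the individual analyses are not reported in this section, and the symbol k (number of studies in a comparison) is introduced here for illustration.

```latex
I^2 = \max\left(0,\ \frac{Q - \mathrm{df}}{Q}\right) \times 100\%,
\qquad \mathrm{df} = k - 1
```

On this scale, a value of 0% means that the observed variation between study results is no greater than expected by chance, while values such as the 60% and 95% noted above indicate that much of the variation reflects genuine differences between studies.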

Publication bias

Studies with 1459 participants with a null effect on patient global impression to be much or very much improved would have been required to make the result clinically irrelevant (NNTB of 10 or higher).

Subgroup analysis

Duloxetine and milnacipran

There was no difference between duloxetine and milnacipran in the rates of pain relief of 50% or greater (P value 0.53) (see Analysis 1.1), in the reduction of fatigue (P value 0.73) (see Analysis 1.5) and in improvement of health‐related quality of life (P value 0.56) (see Analysis 1.7). Duloxetine was superior to milnacipran in the number of participants who reported to be much or very much improved (P value < 0.0001) (see Analysis 1.2) and in reducing sleep problems (P value 0.0006) (see Analysis 1.6). The dropout rate due to adverse events in milnacipran studies was higher than in duloxetine studies (P value 0.0007) (see Analysis 1.3). There was no difference between duloxetine and milnacipran in the frequency of serious adverse events (P value 0.90) (see Analysis 1.4).

There was no difference between duloxetine and milnacipran in the rates of pain relief of 30% and greater (P value 0.65) (see Analysis 1.8) and dropout rates due to lack of efficacy (P value 0.22) (see Analysis 1.15). There was no difference in reduction of mean pain intensity between duloxetine and milnacipran (P value 0.10) (see Analysis 1.9). Duloxetine was superior to milnacipran in reducing depression (P value 0.007) (see Analysis 1.10), disability (P value 0.01) (see Analysis 1.12) and cognitive disturbances (P value 0.02) (see Analysis 1.13). There was no difference between duloxetine and milnacipran in reducing anxiety (P value 0.74) (see Analysis 1.11) and tenderness (P value 0.12) (see Analysis 1.14). There was no difference between duloxetine and milnacipran in the frequency of nausea (P value 0.12) (see Analysis 1.16) and insomnia (P value 0.39) (see Analysis 1.18).

Studies with and without European participants

The RD of pain relief of 50% or greater and of withdrawals due to adverse events did not differ between studies without European participants and studies with European participants (see Additional Table 1).

Table 1. Subgroup analysis. Efficacy and safety of SNRIs in studies with North American and European participants

| Outcome | Number of participants (studies) | Effect size RD (95% CI) | Test for overall effect P value | Heterogeneity (%) | Test of interaction: effect estimate and P value |
|---|---|---|---|---|---|
| Self‐reported pain relief 50% or greater | | | | | Z = 0.78; P = 0.43 |
| Only North American participants | 3935 (8) | 0.10 (0.08 to 0.13) | < 0.001 | 0 | |
| Only European participants | 960 (2) | 0.06 (0.01 to 0.12) | 0.02 | 0 | |
| Withdrawal due to adverse events | | | | | Z = 1.14; P = 0.25 |
| Only North American participants | 3935 (8) | 0.08 (0.04 to 0.13) | < 0.0002 | 71 | |
| Only European participants | 960 (2) | 0.12 (0.08 to 0.17) | < 0.0001 | 0 | |

CI: confidence interval; RD: risk difference; SNRIs: serotonin and noradrenaline reuptake inhibitors

Sensitivity analysis

We did not conduct the intended sensitivity analyses (different statistical models applied, diagnostic criteria used in the trial, according to the presence or absence of any mental or psychiatric disorder, presence or absence of any concomitant systemic disease), because the studies did not differ in these characteristics.

Discussion

Summary of main results

The SNRIs duloxetine and milnacipran did not show a clinically relevant benefit compared to placebo in participant‐reported pain relief of 50% or greater. The SNRIs duloxetine and milnacipran showed a clinically relevant benefit compared to placebo in participant‐reported pain relief of 30% or greater and in increased patient‐perceived global improvement. The SNRIs duloxetine and milnacipran had a clinically relevant benefit compared to placebo in reducing mean pain intensity, disability and tenderness. The effect of duloxetine and milnacipran compared to placebo in reducing fatigue, depression, limitations of health‐related quality of life and cognitive disturbances was not clinically relevant. There were no differences between duloxetine or milnacipran and placebo in reducing sleep problems and anxiety. The dropout rate due to adverse events with duloxetine or milnacipran did not show a clinically relevant difference to placebo. There were no differences in the frequency of serious adverse events between duloxetine or milnacipran and placebo. Desvenlafaxine was not superior to placebo in mean pain intensity reduction.

Overall completeness and applicability of evidence

We are confident that we did not miss studies of the SNRIs duloxetine and milnacipran, because all trials of these drugs had been registered in connection with applications to regulatory agencies for approval in fibromyalgia management. We cannot rule out the possibility that negative study results with other SNRIs have not been published or have been missed by our search strategy. We identified one study investigating desvenlafaxine that had been terminated prior to completion. The data available did not suggest any therapeutic effect (NCT00369343).

The applicability (external validity) of evidence is very limited for the following reasons.

  • The studies were performed in research centers and not in routine clinical care. It is known that the efficacy of drug therapies is higher in the context of RCTs than in routine clinical care (Routman 2010).

  • The substantial placebo and nocebo response rates seen across all of the SNRI trials impede the appraisal of the efficacy and tolerability of SNRIs in fibromyalgia. However, high placebo and nocebo response rates have been observed in all fibromyalgia drug trials, not only in those of SNRIs (Häuser 2011; Häuser 2012).

  • The exclusion criteria were strict. Participants were not allowed to take some defined concomitant medications for their fibromyalgia symptoms. This excluded a large number of participants who were unwilling, or unable, to come off medications, such as other antidepressants and anticonvulsants. For this reason, participant selection in the RCTs was biased towards recruiting participants with less severe symptoms than are seen in the community (Fuller‐Thomson 2012). Participants with other medical disorders, such as inflammatory rheumatic diseases, which are frequently associated with fibromyalgia, were also excluded. The study results cannot be applied to people with so‐called secondary fibromyalgia (associated with inflammatory rheumatic diseases) (Clauw 1995). All except one of the studies with milnacipran excluded all potential participants with major mental disorders, while the studies with duloxetine excluded all participants with major mental disorders except for those with major depression and general anxiety disorder. The study results cannot be applied to people with fibromyalgia and concomitant psychiatric disease, except for the duloxetine studies that suggest efficacy in fibromyalgia with major depression and general anxiety disorder.

  • The majority of the participants were middle‐aged women. The authors of the duloxetine studies provided a pooled subgroup analysis that demonstrated the efficacy of duloxetine in male participants (Russell 2008). A similar analysis was not available for milnacipran. Neither of the pharmaceutical companies (Eli‐Lilly and Pierre Fabre/Forest Laboratories) presented a subgroup analysis of participants over 65 years of age.

  • Only adult participants were included. Whether the study results can be applied to children or adolescents remains to be clarified. One study with milnacipran in adolescents with fibromyalgia was terminated early due to low enrolment (Arnold 2015).

  • Although the review included studies with a duration of therapy of up to 27 weeks, the long‐term efficacy and safety of SNRIs in fibromyalgia cannot be assessed from the included studies. Long‐term, open‐label extension studies with duloxetine and milnacipran demonstrated sustained symptom relief and tolerability in up to 20% of the participants who were enrolled in the RCT prior to the open‐label period (Arnold 2013; Branco 2011; Mease 2010; Mease 2013; Murakami 2017).

  • Although the review included two RCTs that compared SNRIs with other active drugs, the value of SNRIs relative to other drugs and non‐pharmacological therapies still needs to be determined. The trial comparing desvenlafaxine with pregabalin was terminated early after Wyeth, the manufacturer of desvenlafaxine, was acquired by Pfizer, the manufacturer of pregabalin. There was no difference between the two drugs in pain reduction (NCT00697787). A network meta‐analysis did not find relevant differences between drugs (tricyclic antidepressants, SNRIs, SSRIs and pregabalin) and aerobic exercise and cognitive behavioral therapies in mean pain reduction and total dropout rates (Nüesch 2013).

Quality of the evidence

The quality of evidence ranges from low to very low across the different outcomes. The likelihood that the effect could be substantially different is high or very high. The main limiting factors, which led to downgrading of quality for all outcomes, were indirectness and publication bias. All of the reviewed studies had been sponsored by pharmaceutical companies. The quality of evidence of this review is based on the data presented in peer‐reviewed journals and some additional details that were provided on request by the pharmaceutical companies or principal investigators. However, not all data requested were provided. Selective non‐reporting of some negative study results on pain, sleep and anxiety, as well as non‐reporting of serious adverse events, is possible.

Potential biases in the review process

We searched for unpublished studies with SNRIs, but we are not certain that we identified all other studies that might have been performed but not published.

We might have overestimated the risk of bias of some studies that were added to this update and that did not report some details of methodology (e.g. randomization and blinding procedures). In contrast to the first version of this review we did not ask the study authors or the sponsors of the studies for the missing details.

Nearly all studies selected statistical methods (last observation carried forward) that bias results towards exaggerating the efficacy of drugs (Moore 2012).

The subgroup comparisons of duloxetine versus milnacipran for the outcomes of patient global impression of being much or very much improved and of sleep problems were limited by the small number of studies presenting results suitable for meta‐analysis (patient impression of change) or assessing the outcome (sleep problems).

The influence of allowed co‐interventions (e.g. rescue medication) on positive effects and adverse events was unclear because type and dosage of co‐interventions were not clearly reported or controlled for.

This systematic review update included 7903 participants. To capture rare and potentially severe adverse events a larger data set would have been necessary. For example, to capture an adverse event with a frequency of 1:100,000, 300,000 patients would need to be observed (Andersohn 2008). Rare complications of SNRIs include suicide (Taylor 2013), severe liver injury (Voican 2014), hypertension (de Toledo 2007) and sexual dysfunction (Higgins 2010).
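
The figure of 300,000 patients is consistent with the "rule of three": to have about a 95% chance of observing at least one event that occurs with probability p per patient, roughly 3/p patients are needed. The sketch below illustrates that arithmetic under stated assumptions (independent patients and a fixed per‐patient event probability); the cited source may derive the number differently, and the function names are hypothetical. Applied to the 7903 participants in this update, the same reasoning suggests that only events with a frequency of roughly 1 in 2600 or higher were likely to be observed at least once.

```python
import math


def patients_needed(event_rate: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one event among n patients) >= confidence."""
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - event_rate))


def rule_of_three(event_rate: float) -> float:
    """Approximation n ~ 3 / p, because -ln(0.05) is close to 3."""
    return 3 / event_rate


def smallest_detectable_rate(n_patients: int, confidence: float = 0.95) -> float:
    """Smallest event rate likely to yield at least one event among n patients."""
    return 1 - (1 - confidence) ** (1 / n_patients)


exact = patients_needed(1 / 100_000)
approx = rule_of_three(1 / 100_000)
rate_7903 = smallest_detectable_rate(7903)
print(exact, round(approx), f"about 1 in {round(1 / rate_7903)}")
```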

We were not able to perform individual participant data analyses because these data were not published or provided by the sponsors of the studies. Therefore we could not test if moderate or substantial pain reduction was associated with improvement of fatigue, function, sleep, depression, anxiety, ability to work, and general health status, as has been demonstrated for pregabalin in fibromyalgia (Moore 2010c). The NNTB for substantial pain relief with duloxetine and milnacipran in our analysis was lower than for relief of sleep problems and fatigue. In addition, sleep problems are a common side effect of SNRIs (Rahmadi 2011) and insomnia as an adverse event was more frequently reported by participants with SNRIs than with placebo in this review. Therefore it might be possible that a substantial improvement of fibromyalgia pain by duloxetine and milnacipran is not associated with a substantial improvement in other key symptom domains of fibromyalgia in some people.

Agreements and disagreements with other studies or reviews

We cannot share the conclusion of some reviews that the efficacy of duloxetine and milnacipran in the management of fibromyalgia has been proven (Arnold 2010c; Kyle 2010; Ormseth 2010; Ursini 2010). Neither drug has a benefit on all key symptoms of fibromyalgia. Our results are in line with Cochrane Pain and Palliative and Supportive Care reviews that analyse drugs for fibromyalgia separately. Cording 2015 included six studies with 4238 participants in total in a review of milnacipran in fibromyalgia. The review authors concluded that milnacipran 100 mg or 200 mg per day was effective only for a minority of people in the treatment of pain due to fibromyalgia, providing moderate levels of pain relief (at least 30%) to about 40% of participants, compared with about 30% with placebo. The use of last observation carried forward imputation may overestimate the efficacy of milnacipran. Using stricter criteria for 'responder' and a more conservative method of analysis gave lower response rates (about 26% with milnacipran versus 17% with placebo). Withdrawals for any reason were more common with milnacipran than placebo, and more common with 200 mg (NNTH 9) than 100 mg (NNTH 23), compared with placebo. This was largely driven by adverse event withdrawals, where the NNTH compared with placebo was 14 for 100 mg and 7 for 200 mg (Cording 2015). Lunn and co‐authors included six studies involving 2249 participants with fibromyalgia. Duloxetine at 60 mg daily was effective for fibromyalgia over 12 weeks (RR for ≥ 50% reduction in pain 1.57, 95% CI 1.20 to 2.06; NNTB 8, 95% CI 4 to 21) and over 28 weeks (RR 1.58, 95% CI 1.10 to 2.27). Adverse events were common in both treatment and placebo arms but more common in the treatment arm, with a dose‐dependent effect. Most adverse events were minor (Lunn 2014). 

Our results confirm the conclusions of the aforementioned reviews that the tolerability and safety of both drugs are limited, because a substantial number of participants dropped out of trials due to adverse events. The most frequent adverse events with both drugs were nausea, dry mouth, headache, constipation and hyperhidrosis (increased perspiration) (Cording 2015). The lack of difference in serious adverse events between SNRIs and placebo indicates that most adverse events were minor.

We cannot share the conclusion of a systematic review that venlafaxine (which is metabolized to desvenlafaxine) is "at least modestly effective in treating fibromyalgia" (VanderWeide 2015). VanderWeide 2015 included one RCT with venlafaxine. There was no difference between venlafaxine and placebo in ITT analysis in pain reduction at the end of treatment (six weeks) (Ziljstra 2002). We found no differences between desvenlafaxine and placebo in pain reduction in one trial (NCT00697787).

The results of our subgroup comparisons of duloxetine and milnacipran are in line with those of network meta‐analyses that did not find significant differences between the two drugs in pain reduction and tolerability (Lee 2016; Nüesch 2013). However, these network meta‐analyses did not test for some other outcomes relevant for people with fibromyalgia. We found that duloxetine was superior to milnacipran in reducing sleep problems, depression, disability and cognitive disturbances and in improving global well‐being.

Routine clinical care data call into question the long‐term effectiveness or tolerability, or both, of duloxetine and milnacipran in the majority of people with fibromyalgia. A longitudinal study in people with fibromyalgia of the National Data Bank of Rheumatic Diseases found that pain scores fell by 0.17 (95% CI 0.03 to 0.30) units on an 11‐point scale following the start of therapy with duloxetine, milnacipran or pregabalin, a reduction that was statistically significant but not clinically relevant. There was no significant improvement in fatigue or functional status with these drugs (Wolfe 2013b). In a retrospective analysis using a US claims database to identify adults with a first diagnosis of fibromyalgia between 2009 and 2011, the discontinuation rates were 52% for duloxetine and 72% for milnacipran after 12 months (Liu 2016). The one‐year discontinuation rate of SSRI/SNRI antidepressants including duloxetine and milnacipran was 74% in patients of a large Israeli Health Maintenance Organisation (Ben‐Ami 2017).

Considering the current differences in regulatory approval regarding the use of duloxetine and milnacipran in fibromyalgia in the USA and Japan versus Europe, it seems relevant to comment on whether our data support either of these positions. There was one European study each with duloxetine (Chappell 2009a) and milnacipran (Branco 2010). According to EMA analysis, duloxetine and milnacipran did not meet the primary endpoint of the study, namely the superiority over placebo in the reduction of mean pain intensity (EMA 2008; EMA 2010). Our analyses demonstrated a superiority of duloxetine and milnacipran respectively in pain relief of 50% or more. It is our view that the trial data show that the benefits of duloxetine and milnacipran (NNTB 11 for an incremental 50% pain reduction) are nearly counterbalanced by the risk of side effects (NNTH 14 for an incremental dropout rate due to adverse events). The data do not provide clear support for either of the regulatory positions over the other. Thus, our review cannot provide support for any of these regulatory positions.
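
The NNTB and NNTH quoted above follow from the pooled risk differences as the reciprocal of the absolute risk difference. The sketch below illustrates that arithmetic with the rounded RDs from the Summary of findings table. The function name and the rounding to the nearest whole number are assumptions, chosen because they agree with the published figures; they do not describe the software the review authors used.

```python
def number_needed_to_treat(rd: float, ci_low: float, ci_high: float):
    """Return (NNT, lower bound, upper bound) as 1 / |RD|, rounded to the nearest integer.

    Only meaningful when the confidence interval of the RD excludes zero.
    """
    if ci_low <= 0 <= ci_high:
        raise ValueError("CI includes zero: NNT is not calculated")
    point = round(1 / abs(rd))
    # The larger risk difference gives the smaller NNT, so the bounds swap order.
    bounds = sorted(round(1 / abs(x)) for x in (ci_low, ci_high))
    return point, bounds[0], bounds[1]


# Pain relief of 50% or greater: RD 0.09 (0.07 to 0.11)
nntb_pain = number_needed_to_treat(0.09, 0.07, 0.11)
# Withdrawal due to adverse events: RD 0.07 (0.04 to 0.10)
nnth_withdrawal = number_needed_to_treat(0.07, 0.04, 0.10)
print(nntb_pain, nnth_withdrawal)
```

The guard against intervals that include zero mirrors the table's practice of not calculating an NNTB or NNTH when the difference is not statistically significant, as for serious adverse events.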

Figure 1. Study flow diagram

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included study

Analysis 1.1. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 1: Self‐reported pain relief of 50% or greater

Analysis 1.2. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 2: PGIC much or very much improved

Analysis 1.3. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 3: Withdrawal due to adverse events

Analysis 1.4. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 4: Serious adverse events

Analysis 1.5. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 5: Self‐reported fatigue

Analysis 1.6. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 6: Self‐reported sleep problems

Analysis 1.7. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 7: Self‐reported health‐related quality of life

Analysis 1.8. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 8: Self‐reported pain relief of 30% or greater

Analysis 1.9. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 9: Self‐reported mean pain intensity

Analysis 1.10. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 10: Self‐reported depression

Analysis 1.11. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 11: Self‐reported anxiety

Analysis 1.12. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 12: Self‐reported disability

Analysis 1.13. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 13: Self‐reported cognitive disturbances

Analysis 1.14. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 14: Tenderness

Analysis 1.15. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 15: Withdrawal due to lack of efficacy

Analysis 1.16. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 16: Nausea

Analysis 1.17. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 17: Somnolence

Analysis 1.18. Comparison 1: SNRIs versus placebo in parallel and cross‐over design trials, Outcome 18: Insomnia

Summary of findings 1. Serotonin noradrenaline reuptake inhibitors compared with placebo for fibromyalgia ‐ studies with parallel design

Patient or population: people with fibromyalgia

Settings: study centers in North, Central and South America, Asia and Europe

Intervention: serotonin noradrenaline reuptake inhibitors (duloxetine, milnacipran)

Comparison: placebo

| Outcomes | Probable outcome with intervention (95% CI) | Probable outcome with placebo | Relative effect: SMD or risk difference (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Self‐reported pain relief of 50% or greater | 309 per 1000 (282 to 344) | 210 per 1000 | RD 0.09 (0.07 to 0.11) | 6918 (15 studies) | ⊕⊕⊝⊝ low¹,² | NNTB 11 (95% CI 9 to 14) |
| Patient Global Impression to be much or very much improved (PGIC) | 519 per 1000 (459 to 573) | 293 per 1000 | RD 0.19 (0.12 to 0.26) | 2918 (6 studies) | ⊕⊕⊝⊝ low¹,² | NNTB 5 (95% CI 4 to 8) |
| Self‐reported fatigue (20‐100 scale). Higher scores indicate higher fatigue problem levels | Mean fatigue score was 2.6 points lower (1.0 to 5.0 points lower) based on a 20‐100 scale | Baseline mean score 69.4 (SD 12.3)³ | SMD -0.13 (-0.18 to -0.08) | 6168 (12 studies) | ⊕⊕⊝⊝ low¹,² | NNTB 18 (95% CI 12 to 29) |
| Self‐reported sleep problems (0‐100 scale). Higher scores indicate higher sleep problem levels | Mean sleep problems score was 1.2 points lower (0.2 higher to 5.5 points lower) based on a 0‐100 scale | Baseline mean score 68.0 (23.8)⁴ | SMD -0.07 (-0.15 to 0.01) | 4547 (8 studies) | ⊕⊕⊝⊝ low¹,² | NNTB not calculated due to lack of statistically significant difference |
| Self‐reported health‐related quality of life (0‐100 scale). Higher scores indicate higher burden of disease (lower quality of life) | Mean health‐related quality of life problems score was 3.9 points lower (2.3 to 5.3 points lower) based on a 0‐100 scale | Baseline mean score 57.9 (SD 14.1)⁵ | SMD -0.20 (-0.25 to -0.15) | 6861 (14 studies) | ⊕⊕⊝⊝ low¹,² | NNTB 11 (95% CI 8 to 14) |
| Tolerability (withdrawal due to adverse events) | 191 per 1000 (172 to 210) | 102 per 1000 | RD 0.07 (0.04 to 0.10) | 7029 (15 studies) | ⊕⊕⊝⊝ low¹,² | NNTH 14 (95% CI 10 to 25) |
| Safety (serious adverse events) | 18 per 1000 (16 to 20) | 21 per 1000 | RD -0.00 (-0.01 to 0.00) | 6732 (13 studies) | ⊕⊝⊝⊝ very low¹,²,⁶ | NNTH not calculated due to lack of statistically significant difference |

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; FIQ: Fibromyalgia Impact Questionnaire; MFI: Multidimensional Fatigue Inventory; MOS‐Sleep problem index: Medical Outcome Study ‐ sleep problem index; NNTB: number needed to treat for an additional beneficial outcome; NNTH: number needed to treat for an additional harm; NRS: numerical rating scale; RD: risk difference; SMD: standardized mean difference; VAS: visual analog scale

GRADE Working Group grades of evidence
High quality: we are very confident that the true effect lies close to that of the estimate of the effect.
Moderate quality: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different.
Low quality: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.
Very low quality: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

¹Downgraded once: indirectness: participants with major medical diseases and mental disorders except major depression excluded in > 50% of studies
²Downgraded once: publication bias
³Clauw 2008: N = 401 participants; MFI NRS 20‐100 scale
⁴Mease 2009b: N = 223 participants; MOS Sleep problem index NRS 0‐100 scale
⁵Arnold 2010b: N = 509 participants; FIQ VAS 0‐80 scale
⁶Downgraded once: imprecision due to low event rate

Comparison 1. SNRIs versus placebo in parallel and cross‐over design trials

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1.1 Self‐reported pain relief of 50% or greater | 15 | 6918 | Risk Difference (IV, Random, 95% CI) | 0.09 [0.07, 0.11] |
| 1.1.1 Duloxetine | 7 | 2582 | Risk Difference (IV, Random, 95% CI) | 0.10 [0.06, 0.14] |
| 1.1.2 Milnacipran | 8 | 4336 | Risk Difference (IV, Random, 95% CI) | 0.09 [0.06, 0.11] |
| 1.2 PGIC much or very much improved | 6 | 2918 | Risk Difference (M‐H, Random, 95% CI) | 0.19 [0.12, 0.26] |
| 1.2.1 Duloxetine | 1 | 530 | Risk Difference (M‐H, Random, 95% CI) | 0.35 [0.27, 0.42] |
| 1.2.2 Milnacipran | 5 | 2388 | Risk Difference (M‐H, Random, 95% CI) | 0.15 [0.11, 0.19] |
| 1.3 Withdrawal due to adverse events | 15 | 7029 | Risk Difference (IV, Random, 95% CI) | 0.07 [0.04, 0.10] |
| 1.3.1 Desvenlafaxine | 1 | 82 | Risk Difference (IV, Random, 95% CI) | -0.03 [-0.09, 0.04] |
| 1.3.2 Duloxetine | 7 | 2642 | Risk Difference (IV, Random, 95% CI) | 0.05 [0.02, 0.07] |
| 1.3.3 Milnacipran | 7 | 4305 | Risk Difference (IV, Random, 95% CI) | 0.11 [0.07, 0.14] |
| 1.4 Serious adverse events | 13 | 6732 | Risk Difference (IV, Random, 95% CI) | -0.00 [-0.01, 0.00] |
| 1.4.1 Desvenlafaxine | 1 | 82 | Risk Difference (IV, Random, 95% CI) | 0.00 [-0.05, 0.05] |
| 1.4.2 Duloxetine | 6 | 2432 | Risk Difference (IV, Random, 95% CI) | -0.00 [-0.01, 0.01] |
| 1.4.3 Milnacipran | 6 | 4218 | Risk Difference (IV, Random, 95% CI) | -0.00 [-0.01, 0.01] |
| 1.5 Self‐reported fatigue | 12 | 6168 | Std. Mean Difference (IV, Random, 95% CI) | -0.13 [-0.18, -0.08] |
| 1.5.1 Duloxetine | 5 | 1954 | Std. Mean Difference (IV, Random, 95% CI) | -0.12 [-0.21, -0.03] |
| 1.5.2 Milnacipran | 7 | 4214 | Std. Mean Difference (IV, Random, 95% CI) | -0.14 [-0.21, -0.07] |
| 1.6 Self‐reported sleep problems | 8 | 4547 | Std. Mean Difference (IV, Random, 95% CI) | -0.07 [-0.15, 0.01] |
| 1.6.1 Duloxetine | 3 | 1382 | Std. Mean Difference (IV, Random, 95% CI) | -0.20 [-0.31, -0.10] |
| 1.6.2 Milnacipran | 5 | 3165 | Std. Mean Difference (IV, Random, 95% CI) | 0.02 [-0.05, 0.10] |
| 1.7 Self‐reported health‐related quality of life | 14 | 6861 | Std. Mean Difference (IV, Random, 95% CI) | -0.20 [-0.25, -0.15] |
| 1.7.1 Duloxetine | 7 | 2604 | Std. Mean Difference (IV, Random, 95% CI) | -0.22 [-0.30, -0.13] |
| 1.7.2 Milnacipran | 7 | 4257 | Std. Mean Difference (IV, Random, 95% CI) | -0.19 [-0.25, -0.12] |
| 1.8 Self‐reported pain relief of 30% or greater | 15 | 6924 | Risk Difference (IV, Random, 95% CI) | 0.10 [0.08, 0.12] |
| 1.8.1 Duloxetine | 7 | 2588 | Risk Difference (IV, Random, 95% CI) | 0.11 [0.07, 0.15] |
| 1.8.2 Milnacipran | 8 | 4336 | Risk Difference (IV, Random, 95% CI) | 0.10 [0.07, 0.13] |
| 1.9 Self‐reported mean pain intensity | 16 | 7014 | Std. Mean Difference (IV, Random, 95% CI) | -0.22 [-0.27, -0.17] |
| 1.9.1 Desvenlafaxine | 1 | 82 | Std. Mean Difference (IV, Random, 95% CI) | 0.16 [-0.27, 0.59] |
| 1.9.2 Duloxetine | 7 | 2619 | Std. Mean Difference (IV, Random, 95% CI) | -0.26 [-0.35, -0.18] |
| 1.9.3 Milnacipran | 8 | 4313 | Std. Mean Difference (IV, Random, 95% CI) | -0.20 [-0.26, -0.13] |
| 1.10 Self‐reported depression | 14 | 6478 | Std. Mean Difference (IV, Random, 95% CI) | -0.16 [-0.21, -0.11] |
| 1.10.1 Duloxetine | 7 | 2264 | Std. Mean Difference (IV, Random, 95% CI) | -0.25 [-0.34, -0.17] |
| 1.10.2 Milnacipran | 7 | 4214 | Std. Mean Difference (IV, Random, 95% CI) | -0.11 [-0.17, -0.05] |
| 1.11 Self‐reported anxiety | 9 | 3533 | Std. Mean Difference (IV, Random, 95% CI) | -0.08 [-0.21, 0.05] |
| 1.11.1 Duloxetine | 4 | 1403 | Std. Mean Difference (IV, Random, 95% CI) | -0.07 [-0.17, 0.04] |
| 1.11.2 Milnacipran | 5 | 2130 | Std. Mean Difference (IV, Random, 95% CI) | -0.11 [-0.36, 0.13] |
| 1.12 Self‐reported disability | 13 | 6789 | Std. Mean Difference (IV, Random, 95% CI) | -0.21 [-0.26, -0.16] |
| 1.12.1 Duloxetine | 7 | 2602 | Std. Mean Difference (IV, Random, 95% CI) | -0.29 [-0.37, -0.21] |
| 1.12.2 Milnacipran | 6 | 4187 | Std. Mean Difference (IV, Random, 95% CI) | -0.16 [-0.22, -0.10] |
| 1.13 Self‐reported cognitive disturbances | 8 | 5444 | Std. Mean Difference (IV, Random, 95% CI) | -0.16 [-0.21, -0.10] |
| 1.13.1 Duloxetine | 3 | 1360 | Std. Mean Difference (IV, Random, 95% CI) | -0.27 [-0.38, -0.16] |
| 1.13.2 Milnacipran | 5 | 4084 | Std. Mean Difference (IV, Random, 95% CI) | -0.12 [-0.18, -0.05] |
| 1.14 Tenderness | 5 | 1444 | Std. Mean Difference (IV, Random, 95% CI) | -0.21 [-0.33, -0.09] |
| 1.14.1 Duloxetine | 4 | 1364 | Std. Mean Difference (IV, Random, 95% CI) | -0.23 [-0.35, -0.12] |
| 1.14.2 Milnacipran | 1 | 80 | Std. Mean Difference (IV, Random, 95% CI) | 0.12 [-0.31, 0.56] |
| 1.15 Withdrawal due to lack of efficacy | 14 | 6924 | Risk Difference (IV, Random, 95% CI) | -0.03 [-0.04, -0.02] |
| 1.15.1 Desvenlafaxine | 1 | 82 | Risk Difference (IV, Random, 95% CI) | 0.07 [-0.02, 0.16] |
| 1.15.2 Duloxetine | 7 | 2642 | Risk Difference (IV, Random, 95% CI) | -0.04 [-0.06, -0.02] |
| 1.15.3 Milnacipran | 6 | 4200 | Risk Difference (IV, Random, 95% CI) | -0.02 [-0.04, -0.01] |
| 1.16 Nausea | 12 | 6606 | Risk Difference (IV, Random, 95% CI) | 0.16 [0.14, 0.19] |
| 1.16.1 Desvenlafaxine | 1 | 82 | Risk Difference (IV, Random, 95% CI) | 0.04 [-0.10, 0.18] |
| 1.16.2 Duloxetine | 6 | 2432 | Risk Difference (IV, Random, 95% CI) | 0.19 [0.15, 0.22] |
| 1.16.3 Milnacipran | 5 | 4092 | Risk Difference (IV, Random, 95% CI) | 0.15 [0.12, 0.18] |
| 1.17 Somnolence | 7 | 2514 | Risk Difference (IV, Random, 95% CI) | 0.05 [0.02, 0.08] |
| 1.17.1 Desvenlafaxine | 1 | 82 | Risk Difference (IV, Random, 95% CI) | -0.05 [-0.17, 0.06] |
| 1.17.2 Duloxetine | 6 | 2432 | Risk Difference (IV, Random, 95% CI) | 0.06 [0.03, 0.09] |
| 1.18 Insomnia | 9 | 5387 | Risk Difference (M‐H, Random, 95% CI) | 0.03 [0.01, 0.04] |
| 1.18.1 Desvenlafaxine | 1 | 82 | Risk Difference (M‐H, Random, 95% CI) | -0.08 [-0.18, 0.03] |
| 1.18.2 Duloxetine | 4 | 1684 | Risk Difference (M‐H, Random, 95% CI) | 0.04 [0.01, 0.07] |
| 1.18.3 Milnacipran | 4 | 3621 | Risk Difference (M‐H, Random, 95% CI) | 0.03 [0.01, 0.04] |