Topical analgesics for acute and chronic pain in adults ‐ an overview of Cochrane Reviews

Sheena Derry; Philip J Wiffen; Eija A Kalso; Rae Frances Bell; Dominic Aldington; Tudor Phillips; Helen Gaskell; R Andrew Moore

doi:10.1002/14651858.CD008609.pub2

Version published: 12 May 2017

Abstract

Background

Topical analgesic drugs are used for a variety of painful conditions. Some are acute, typically strains or sprains, tendinopathy, or muscle aches. Others are chronic, typically osteoarthritis of hand or knee, or neuropathic pain.

Objectives

To provide an overview of the analgesic efficacy and associated adverse events of topical analgesics (primarily nonsteroidal anti‐inflammatory drugs (NSAIDs), salicylate rubefacients, capsaicin, and lidocaine) applied to intact skin for the treatment of acute and chronic pain in adults.

Methods

We identified systematic reviews in acute and chronic pain published to February 2017 in the Cochrane Database of Systematic Reviews (the Cochrane Library). The primary outcome was at least 50% pain relief (participant‐reported) at an appropriate duration. We extracted the number needed to treat for one additional beneficial outcome (NNT) for efficacy outcomes for each topical analgesic or formulation, and the number needed to treat for one additional harmful outcome (NNH) for adverse events. We also extracted information on withdrawals due to lack of efficacy or adverse events, systemic and local adverse events, and serious adverse events. We required information from at least 200 participants, in at least two studies. We judged that there was potential for publication bias if the addition of four studies of typical size (400 participants) with zero effect increased NNT compared with placebo to 10 (minimal clinical utility). We extracted GRADE assessment in the original papers, and made our own GRADE assessment.

Main results

Thirteen Cochrane Reviews (206 studies with around 30,700 participants) assessed the efficacy and harms from a range of topical analgesics applied to intact skin in a number of acute and chronic painful conditions. Reviews were overseen by several Review Groups, and concentrated on evidence comparing topical analgesic with topical placebo; comparisons of topical and oral analgesics were rare.

For at least 50% pain relief, we considered evidence was moderate or high quality for several therapies, based on the underlying quality of studies and susceptibility to publication bias.

In acute musculoskeletal pain (strains and sprains) with assessment at about seven days, therapies were diclofenac Emulgel (78% Emulgel, 20% placebo; 2 studies, 314 participants, NNT 1.8 (95% confidence interval 1.5 to 2.1)), ketoprofen gel (72% ketoprofen, 33% placebo, 5 studies, 348 participants, NNT 2.5 (2.0 to 3.4)), piroxicam gel (70% piroxicam, 47% placebo, 3 studies, 522 participants, NNT 4.4 (3.2 to 6.9)), diclofenac Flector plaster (63% Flector, 41% placebo, 4 studies, 1030 participants, NNT 4.7 (3.7 to 6.5)), and diclofenac other plaster (88% diclofenac plaster, 57% placebo, 3 studies, 474 participants, NNT 3.2 (2.6 to 4.2)).

In chronic musculoskeletal pain (mainly hand and knee osteoarthritis) therapies were topical diclofenac preparations for less than six weeks (43% diclofenac, 23% placebo, 5 studies, 732 participants, NNT 5.0 (3.7 to 7.4)), ketoprofen over 6 to 12 weeks (63% ketoprofen, 48% placebo, 4 studies, 2573 participants, NNT 6.9 (5.4 to 9.3)), and topical diclofenac preparations over 6 to 12 weeks (60% diclofenac, 50% placebo, 4 studies, 2343 participants, NNT 9.8 (7.1 to 16)). In postherpetic neuralgia, topical high‐concentration capsaicin had moderate‐quality evidence of limited efficacy (33% capsaicin, 24% placebo, 2 studies, 571 participants, NNT 11 (6.1 to 62)).

We judged evidence of efficacy for other therapies as low or very low quality. Limited evidence of efficacy, potentially subject to publication bias, existed for topical preparations of ibuprofen gels and creams, unspecified diclofenac formulations and diclofenac gel other than Emulgel, indomethacin, and ketoprofen plaster in acute pain conditions, and for salicylate rubefacients for chronic pain conditions. Evidence for other interventions (other topical NSAIDs, topical salicylate in acute pain conditions, low concentration capsaicin, lidocaine, clonidine for neuropathic pain, and herbal remedies for any condition) was very low quality and typically limited to single studies or comparisons with sparse data.

We assessed the evidence on withdrawals as moderate or very low quality, because of small numbers of events. In chronic pain conditions lack of efficacy withdrawals were lower with topical diclofenac (6%) than placebo (9%) (11 studies, 3455 participants, number needed to treat to prevent (NNTp) 26, moderate‐quality evidence), and topical salicylate (2% vs 7% for placebo) (5 studies, 501 participants, NNTp 21, very low‐quality evidence). Adverse event withdrawals were higher with topical capsaicin low‐concentration (15%) than placebo (3%) (4 studies, 477 participants, NNH 8, very low‐quality evidence), topical salicylate (5% vs 1% for placebo) (7 studies, 735 participants, NNH 26, very low‐quality evidence), and topical diclofenac (5% vs 4% for placebo) (12 studies, 3552 participants, NNH 51, very low‐quality evidence).

In acute pain, systemic or local adverse event rates with topical NSAIDs (4.3%) were no greater than with topical placebo (4.6%) (42 studies, 6740 participants, high‐quality evidence). In chronic pain, local adverse events with topical low‐concentration capsaicin (63%) were more frequent than with topical placebo (5 studies, 557 participants, NNH 2.6, high‐quality evidence). Moderate‐quality evidence indicated more local adverse events than placebo in chronic pain conditions with topical diclofenac (NNH 16) and more local pain with topical high‐concentration capsaicin (NNH 16). There was moderate‐quality evidence of no additional local adverse events with topical ketoprofen over topical placebo in chronic pain. Serious adverse events were rare (very low‐quality evidence).

GRADE assessments of moderate or low quality in some of the reviews were considered by us to be very low because of small numbers of participants and events.

Authors' conclusions

There is good evidence that some formulations of topical diclofenac and ketoprofen are useful in acute pain conditions such as sprains or strains, with low (good) NNT values. There is a strong message that the exact formulation used is critically important in acute conditions, and that might also apply to other pain conditions. In chronic musculoskeletal conditions with assessments over 6 to 12 weeks, topical diclofenac and ketoprofen had limited efficacy in hand and knee osteoarthritis, as did topical high‐concentration capsaicin in postherpetic neuralgia. Though NNTs were higher, this still indicates that a small proportion of people had good pain relief.

Use of GRADE in Cochrane Reviews with small numbers of participants and events requires attention.

Plain language summary

Do painkillers applied to the skin really work?

Key findings

Diclofenac Emulgel, ketoprofen gel, piroxicam gel, and diclofenac plaster work reasonably well for strains and sprains. For hand and knee osteoarthritis, the nonsteroidal anti‐inflammatory drugs (NSAIDs) topical diclofenac and topical ketoprofen, rubbed on the skin for at least 6 to 12 weeks, help reduce pain by at least half in a small number of people. For postherpetic neuralgia (pain after shingles), topical high‐concentration capsaicin (derived from chili peppers) can reduce pain by at least half in a small number of people.

Background

Painkillers applied to the skin are called topical (local) analgesics. There has been much debate about whether they work, how they work, and in which painful conditions.

Study characteristics

We searched the Cochrane Database of Systematic Reviews (the Cochrane Library) for systematic reviews of topical analgesics published up to February 2017. The reviews assessed the treatment of short‐term (acute, less than three months) or long‐term (chronic, more than three months) painful conditions. We looked at how well topical analgesics worked, what harms they caused, and whether people dropped out of the studies. We also looked at the quality of the evidence.

Key results

Most reviews compared the effects of a topical analgesic with a topical placebo. A topical placebo looks like the active product but has no pain‐relieving properties. Using a placebo cancels out any effects that the act of rubbing itself may have with some of these topical analgesics.

For strains and sprains, several topical NSAID painkillers rubbed on the skin help reduce pain by at least half over about a week in around 1 in 2 to 1 in 5 people. These medicines are diclofenac Emulgel, ketoprofen gel, piroxicam gel, diclofenac Flector plaster, and other diclofenac plaster. How the medicines are made up (their formulation) is important in determining how well they work.

For hand and knee osteoarthritis, the NSAID painkillers topical diclofenac and topical ketoprofen, rubbed on the skin, help reduce pain by at least half over at least 6 to 12 weeks in around 1 in 5 to 1 in 10 people. For postherpetic neuralgia, a single application of topical high‐concentration capsaicin can reduce pain by at least half in around 1 in 12 people for 8 to 12 weeks.

There is no good evidence to support any other topical painkiller in any other painful condition.

Topical low‐concentration capsaicin caused local side effects (such as itching or rash) in 4 in 10 people, and side effects caused 1 in 12 people to drop out of the study. Side effects and dropouts due to side effects were otherwise rare or no different from those with topical placebo. Serious side effects were rare.

Quality of the evidence

The quality of the evidence ranged from high to very low. The main reason for very low‐quality evidence was the small number of participants in some studies, which makes it impossible (or unsafe) to estimate benefit or harm.

Authors' conclusions

Implications for practice

For people with pain

The major implication for people with pain is the knowledge that there is a body of reliable evidence about the efficacy of topical analgesics in different types of acute and chronic pain. Not every person will achieve good pain relief even with the most effective drugs, and analgesic failure is to be expected with particular drugs in particular people. Failure to achieve good pain relief should not be acceptable because it is likely that failure with any one drug could be reversed with another.

For clinicians

The major implication for clinicians is the knowledge that there is a body of reliable evidence about a number of topical analgesics in acute and chronic pain. Drug and formulation matter, so choice of therapy should usually be driven by the evidence: topical diclofenac and ketoprofen gel for strains and sprains, and to an extent in knee and hand osteoarthritis. Topical capsaicin high‐concentration may be of limited use in some people with postherpetic neuralgia.

Topical salicylate, low‐concentration capsaicin, clonidine, and lidocaine are not well supported by evidence, and there is little evidence of effect. There may be circumstances in which an experienced clinician still chooses to use them, because the evidence does not exclude beneficial effects in a small percentage of people.

For policy makers

The issue is not which topical analgesic product, but achieving success ‐ good pain relief is the goal of treatment. Surveys over a long period have shown that acute and chronic pain are poorly treated, and that many people experience moderate or severe pain despite being on treatment. Pain treatment is often part of a complex of interactions between the person with pain, their pain condition, and desired outcome; the overview helps by presenting evidence from which rational choices and decisions can be made.

For funders

The important message is that some topical analgesics can produce very good pain relief for some people with acute or chronic pain. The issue is not which topical analgesic product works best, but achieving success for individual people with pain. Good pain relief is the goal of treatment. Surveys over a long period have shown that acute and chronic pain are poorly treated, and that many people experience moderate or severe pain despite being on treatment. Pain treatment is often part of a complex of interactions between the person with pain, their pain condition, and desired outcome; the overview helps by presenting evidence from which rational choices and decisions can be made, especially about the place of particular products in care pathways.

Implications for research

General

The individual reviews and this overview have highlighted the lack of good evidence for many topical analgesics. Most of the studies and the participants included in them did not contribute to any reliable assessment of efficacy or harm. That is a waste, and the ethics of research of that sort is hard to justify.

Most topical analgesics are inexpensive, and have not had the detailed examination in large, properly‐conducted randomised trials that would be expected of modern medicines. And yet they offer good levels of pain relief for at least some people with acute and chronic pain, with a general absence of systemic adverse events in this overview, but also in studies designed to examine rare but serious harm.

Design

While there appears to be general consensus over design of studies, many of the individual studies in the individual reviews fail to meet reasonable standards. Much of that reflects the age of the studies and the standards of reporting extant at the time of publication. Other shortcomings are more fundamental, such as the need for an adequate duration of studies investigating chronic pain; while efficacy may be established relatively early (four to six weeks), longer duration allows for assessment of tolerability.

As much as studies are needed to examine efficacy compared with placebo, or a commonly used active comparator, study designs to examine the pragmatic value of topical analgesics might be of especial value in chronic pain conditions. An example of such a trial design has been published (Moore 2010c).

Measurement (endpoints)

People with acute and chronic pain want an outcome of treatment that is equivalent to 'no worse than mild pain'. There is no reason why current pain trial methods could not report this as an outcome.

Other

Many of the improvements in understanding acute pain have been derived from individual participant‐level analyses. These can only come from close co‐operation with the pharmaceutical industry, which overwhelmingly funds the studies and 'owns' the data. Industry has a responsibility to perform more useful analyses than just those required for regulatory purposes. The main implication for research is methodological, and that is driven by data analysis at the level of the individual participant.

Background

Description of the condition

A topical analgesic medication is one applied to body surfaces such as the skin or mucous membranes to treat painful ailments; they are either rubbed onto the skin or made into patches or plasters that are stuck onto the skin. Painful conditions that might be treated by direct application of drugs include, for example, painful cutaneous ulcers; wounds of various sorts including surface wounds or wounds inside the body due to surgery or pain due to infiltration by needles; and painful eye conditions, especially perioperatively, such as after cataract surgery. Each of these, and other, situations, might be described as a topical application of drug. In this overview we have restricted our scope to drugs that are applied to intact skin, which is the situation where topical analgesics are most frequently used outside special circumstances.

Pain is a very common experience. Acute pain is of short duration, lasting less than three months, and gradually resolves as the injured tissues heal. Chronic pain is usually defined as pain lasting three to six months or longer. Acute pain conditions like tension‐type headache, migraine, and acute low back pain rank amongst the top 10 most common conditions worldwide (Vos 2012). Chronic painful conditions comprise 5 of the 11 top‐ranking conditions for years lived with disability in 2010 (Vos 2012); they include low back pain (LBP), neck pain, osteoarthritis (OA), and other musculoskeletal diseases.

Pain is responsible for considerable loss of quality of life and employment, and increased health costs (Moore 2014). People with pain want it to go away, and relatively quickly (Moore 2013a); understanding this has led to a recognition that this should drive what we regard as useful outcomes in pain trials, namely a large reduction in pain, or being in a low pain state (Moore 2010a).

Topical analgesic drugs are used to treat both acute pain (strains, sprains, tendonitis, acute back pain, muscle aches) and chronic pain (osteoarthritis of hand or knee, low back pain, and specific types of neuropathic pain). Topical analgesics are recommended in guidelines for the pain of osteoarthritis (Hochberg 2012; NICE 2014) and neuropathic pain (Attal 2010; Finnerup 2015; NICE 2013).

Description of the interventions

A number of different topical analgesics have been tested in a wide range of different painful conditions. The scope of this overview covers a number of possible interventions.

For strains and sprains, topical nonsteroidal anti‐inflammatory drugs (NSAIDs) or topical rubefacients.
For osteoarthritis, topical NSAIDs, topical rubefacients, and low‐concentration topical capsaicin.
For neuropathic pain, topical local anaesthetic (lidocaine, for example) or high‐concentration topical capsaicin.
Topical herbal medicines have been used for a variety of painful conditions.
Other possible interventions include glyceryl trinitrate for some joint pains.

Many different topical formulations may be used, including but not limited to creams, foams, gels, lotions, ointments, and plasters (patches). The exact formulation of a topical medication is often determined by the required rate of drug delivery (Moore 2008a). Plasters containing drug reservoirs result in slow absorption rates, lower blood levels, and reduced first pass effect in the liver. They are frequently used for transdermal delivery of drugs that are distributed systemically (opioids or contraceptive steroids), but are also available for some NSAIDs for topical drug delivery.

How the intervention might work

Topical medications are applied externally and are absorbed through the skin. They exert their effects close to the site of application, and there should be little systemic uptake or distribution. This compares with transdermal application, where the medication is applied externally and is taken up through the skin, but relies on systemic distribution for its effect.

For a topical formulation to be effective, it must first pass through the skin. Individual drugs have different degrees of penetration, and some formulations add substances that improve skin penetration and result in higher drug concentrations in tissues. A balance between lipid and aqueous solubility is needed to optimise penetration, and use of prodrug esters has been suggested as a way of enhancing permeability. Formulation is also crucial to good skin penetration. Experiments with artificial membranes or human epidermis suggest that creams are generally less effective than gels or sprays, but newer formulations such as microemulsions may have greater potential.

Topical NSAIDs

NSAIDs reversibly inhibit the enzyme cyclooxygenase (prostaglandin endoperoxide synthase or COX), now recognised to consist of two isoforms, COX‐1 and COX‐2, mediating production of prostaglandins and thromboxane A2 (Fitzgerald 2001); inhibition of the COX‐2 isoform reduces inflammation and produces analgesic effects. Relatively little is known about the mechanism of action of this class of compounds aside from their ability to inhibit COX‐dependent prostanoid formation (Hawkey 1999). Systemically, prostaglandins mediate a variety of physiological functions such as maintenance of the gastric mucosal barrier, regulation of renal blood flow, and regulation of endothelial tone. They also play an important role in inflammatory and nociceptive (pain) processes. The rationale behind topical application is based on the ability of NSAIDs to inhibit COX enzymes locally and peripherally, with minimum systemic uptake. Their use is therefore limited to conditions where the pain is superficial and localised, such as in joints and skeletal muscle.

Once the drug has reached the site of action, it must be present at a sufficiently high concentration to inhibit COX enzymes and produce pain relief. It is probable that topical NSAIDs exert their action by local reduction of symptoms arising from periarticular and intracapsular structures. Tissue levels of NSAIDs applied topically certainly reach levels high enough to inhibit COX‐2. Plasma concentrations found after topical administration, however, are only a fraction (usually much less than 5%) of the levels found in plasma following oral administration. Topical application can potentially limit systemic adverse events by minimising systemic concentrations of the drug. We know that upper gastrointestinal bleeding is low with chronic use of topical NSAIDs (Evans 1995), but have no certain knowledge of effects on heart failure, or renal failure, both of which are associated with oral NSAID use. Current guidelines in the UK encourage use of topical NSAIDs ahead of oral NSAIDs, COX‐2 inhibitors, or opioids for hand or knee osteoarthritis (NICE 2014).

Topical rubefacients

Rubefacients (typically containing salicylates) cause irritation of the skin, and are believed to relieve pain in muscles, joints and tendons, and other musculoskeletal pains in the extremities by counter‐irritation (Martindale 2016). These agents cause a reddening of the skin by causing the blood vessels of the skin to dilate, which gives a soothing feeling of warmth. The term counter‐irritant refers to the idea that irritation of the sensory nerve endings alters or offsets pain in the underlying muscle or joints that are served by the same nerves (Morton 2002).

There has been confusion about which compounds should be classified as rubefacients. Salicylates are related pharmacologically to aspirin and NSAIDs, but when used in topical products (often as amine derivatives) their principal action is as skin irritants. By contrast, topical NSAIDs penetrate the skin and underlying tissues where they inhibit COX enzymes, as described above. We will include salicylates and nicotinate esters as rubefacients.

Topical capsaicin

Capsaicin is the active compound present in chili peppers, responsible for making them hot when eaten. It binds to nociceptors (sensory receptors responsible for sending signals that cause the perception of pain) in the skin, and specifically to the TRPV1 receptor, which controls movement of sodium and calcium ions across the cell membrane. Initially, binding opens the ion channel (influx of sodium and calcium ions), causing depolarisation and the production of action potentials, which are usually perceived as itching, pricking, or burning sensations. Repeated applications or high concentrations give rise to a long‐lasting effect, which has been termed 'defunctionalisation', probably owing to a number of different effects that together overwhelm the cell's normal functions, and can lead to reversible degeneration of nerve terminals (Anand 2011).

Topical creams with low‐concentration capsaicin designed for repeated applications are used to treat pain from a wide range of chronic conditions including postherpetic neuralgia (PHN), peripheral diabetic neuropathy (PDN), osteoarthritis and rheumatoid arthritis, in addition to pruritus and psoriasis. The creams typically contain capsaicin 0.025% or 0.075%, but in some countries 0.25% creams are available (Martindale 2016).

A high‐concentration (8%) patch has been developed to increase the amount and speed of delivery of capsaicin to the skin, and improve tolerability. Rapid delivery is thought to improve tolerability because cutaneous nociceptors are 'defunctionalised' quickly, and the single application avoids both non‐compliance and contamination of the home environment with particles of dried capsaicin cream (Anand 2011). The high‐concentration product is a single application with a minimum interval of 3 months, performed in a clinic, with cooling, local anaesthesia or short‐acting opioids to reduce local pain on application. Patients are usually monitored for up to two hours after treatment. Stringent conditions are required, and as well as using trained healthcare professionals, the treatment setting needs to be well ventilated and spacious due to the vapour of the capsaicin, and cough due to inhalation of capsaicin particles/dust is a hazard for both the healthcare professionals and the patients.

High‐concentration capsaicin is licensed in the European Union (EU) to treat neuropathic pain, and in the USA to treat peripheral postherpetic neuralgia. It is available on prescription only; it was licensed in 2009 in Europe and the USA. The US Food and Drug Administration (FDA) refused a licence for neuropathic pain in HIV in 2012. The EU licence originally restricted use to non‐diabetic patients, but this restriction was lifted in 2015.

Topical lidocaine

Topical lidocaine dampens peripheral nociceptor sensitisation and central nervous system hyperexcitability if used in recommended doses. It may benefit people with PHN or traumatic nerve injury, where its lack of systemic side effects makes it an attractive option. As cream and gel, lidocaine is cumbersome to administer. The patch is more convenient, being worn for 12 hours in every 24 hours. In addition to the local anaesthetic effect, the patch provides protection against mechanical stimulation (dynamic allodynia), which is a frequent problem in PHN.

Lidocaine is an amide‐type local anaesthetic agent that acts by stabilising neuronal membranes. It impairs membrane permeability to sodium, which in turn blocks impulse propagation, and thus dampens both peripheral nociceptor sensitisation, and eventually central nervous system hyperexcitability. It also suppresses neuronal discharge in A delta and C fibres. Regenerating nerve fibres have an accumulation of sodium channels. When lidocaine binds to such sodium channels it initiates an 'inactive state' from which normal activation is unable to occur. Lidocaine reduces the frequency rather than the duration of sodium channel opening. In a small dose it inhibits ectopic discharges, although it does not disrupt normal neuronal function. Lidocaine also suppresses spontaneous impulse generation from dorsal root ganglia, where the herpes virus remains dormant after initial infection by Varicella zoster (chickenpox) (Khaliq 2007).

Other effects on keratinocytes and immune cells, or activation of irritant receptors (TRPV1 and TRPA1), may also contribute to the analgesic effect of topical lidocaine (Sawynok 2014). Long‐term use may cause a loss of epidermal nerve fibres (Wehrfritz 2011).

Why it is important to do this overview

Use of topical analgesics is increasing as patients and clinicians look for alternatives to systemic treatments and their associated adverse events. The efficacy of topical interventions is now widely recognised, together with a different pattern of troublesome adverse events from oral analgesics. Guidelines recommend topical analgesics for a range of pain conditions (Attal 2010; Finnerup 2015; Hochberg 2012; NICE 2013; NICE 2014).

There is a need to pull together the best available evidence for all interventions and conditions and assess the evidence with regard to current understanding of potential biases in order to facilitate decisions about which interventions are helpful in particular circumstances.

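Objectives

To provide an overview of the analgesic efficacy and associated adverse events of topical analgesics (primarily nonsteroidal anti‐inflammatory drugs (NSAIDs), salicylate rubefacients, capsaicin, and lidocaine) applied to intact skin for the treatment of acute and chronic pain in adults.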

Methods

Criteria for considering reviews for inclusion

We considered for inclusion Cochrane Reviews assessing randomised controlled trials (RCTs) of topical analgesics for pain relief in adults in distinct clinical conditions.

Acute musculoskeletal conditions (sprains, strains, muscle pain)
Osteoarthritis, rheumatoid arthritis, or other chronic musculoskeletal conditions
Neuropathic pain

Search methods for identification of reviews

We searched the Cochrane Database of Systematic Reviews (the Cochrane Library) for relevant reviews; Appendix 1 shows the search strategy. In addition, we examined reviews produced by Cochrane Musculoskeletal; Cochrane Pain, Palliative, and Supportive Care; and Cochrane Skin for suitable reviews, and performed broader searches using the terms 'topical' and 'pain' in title, abstract, and keywords.

Data collection and analysis

Two review authors (RAM, SD) independently carried out searches, selected reviews for inclusion, and carried out assessment of methodological quality, and data extraction. Any disagreements were resolved by discussion involving a third author.

Selection of reviews

Included reviews assessed RCTs of the effects of topical application of analgesics for pain relief in adults (as defined by individual reviews), compared with placebo or active comparator if available, and included:

a clearly defined clinical question;
details of inclusion and exclusion criteria;
details of databases searched and relevant search strategies;
participant‐reported pain relief;
summary results for at least one desired outcome.

Data extraction and management

We took data from the included reviews, planning to refer to original study reports only if specific data were missing.

We collected information on the following.

Number of included studies and participants
Intervention and dose
Comparator
Condition treated: acute pain (strains and sprains, overuse injuries), chronic pain (arthritis, neuropathic pain)
Time of assessment

We extracted risk difference (RD) or risk ratio (RR) and number needed to treat for one additional beneficial or harmful outcome (NNT or NNH) for the following outcomes.

At least 50% pain relief (participant‐reported)
Any other measure of 'improvement' (participant‐reported)
Adverse events: local and systemic, and particularly serious adverse events
Withdrawals (particularly withdrawals caused by lack of efficacy or because of adverse events)

Assessment of methodological quality of included reviews

We assessed the methodological quality of included reviews using the following criteria (adapted from AMSTAR; Shea 2007).

Was an a priori design provided?
Was there duplicate study selection and data extraction?
Was a comprehensive literature search performed?
Were published and unpublished studies included irrespective of language of publication?
Was a list of studies (included and excluded) provided?
Were the characteristics of the included studies provided?
Was the scientific quality of the included studies assessed and documented?
Was the scientific quality of the included studies used appropriately in formulating conclusions?
Were the methods used to combine the findings of studies appropriate?
Was the conflict of interest stated?

For each review we assessed the likelihood of publication bias by calculating the number of participants in studies with zero effect (RR of one) that would be needed to give an NNT too high to be clinically relevant (Moore 2008b). In this case we considered an NNT of 10 or more for the outcome 'at least 50% maximum pain relief' or 'substantial benefit' at a specified assessment time to be the cut‐off for clinical relevance. We used this method because statistical tests for presence of publication bias have been shown to be unhelpful (Thornton 2000).
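The calculation can be sketched as follows, under simplifying assumptions that are not stated in the text: the existing trials contribute n participants with a pooled NNT, the hypothetical null‐effect trials contribute m participants with the same allocation ratio between active and placebo, and events are pooled by simple addition. The symbols n, m, and T (the threshold NNT, here 10) are introduced only for illustration.

```latex
\[
\frac{1}{\mathrm{NNT}_{\text{pooled}}}
  = \frac{n \cdot \dfrac{1}{\mathrm{NNT}} + m \cdot 0}{n + m}
  = \frac{1}{T}
\quad\Longrightarrow\quad
m = n\left(\frac{T}{\mathrm{NNT}} - 1\right)
\]
```

As a purely hypothetical example, with n = 500 participants and an observed NNT of 4, about 500 × (10/4 − 1) = 750 participants in null‐effect studies would be needed to raise the NNT to 10.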

Data synthesis

We did not plan additional quantitative analyses, since only results from properly conducted Cochrane Reviews were considered. The aim was to concentrate on specific outcomes such as the proportion of participants with at least 50% pain relief, all‐cause or adverse event discontinuations, and serious adverse events, and to explore how these can be compared across different treatments for the same condition. Care was taken to ensure that we compared like with like, for example in duration of treatment, which can be an additional source of bias (Moore 2010b). Importantly, issues of low trial quality, inadequate size, and whether trials were truly valid for the particular condition, were highlighted in making any between‐therapy comparisons.

We considered study size and the overall amount of information available for analysis. There are issues over both random chance effects with small amounts of data, and potential bias in small studies, especially in pain (Dechartres 2013; Dechartres 2014; Moore 1998; Nüesch 2010; Thorlund 2011).

We did not use information from pooled analyses unless they included data from at least 200 participants for the outcome (Moore 1998). Where appropriate we used or calculated risk ratio (RR) or risk difference (RD) with 95% confidence intervals (CI) using a fixed‐effect model (Morris 1995). We used or calculated NNT and NNH with 95% CIs using the pooled number of events, using the method devised by Cook and Sackett (Cook 1995). We assumed a statistically significant difference from control when the 95% CI of the RR did not include the number one or the RD the number zero.
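A minimal sketch of these calculations from pooled event counts is given below. It assumes a single pooled 2×2 table, a Wald interval for the RD, a log‐scale interval for the RR, and an NNT interval obtained by inverting the RD limits; the function name and return format are hypothetical, and the code is not taken from the review or from the cited methods papers.

```python
import math

def rr_rd_nnt(events_active, n_active, events_control, n_control, z=1.96):
    """Risk ratio, risk difference and NNT from pooled counts (illustrative).

    Assumes non-zero event counts in both groups; zero cells would need
    a continuity correction, which is not handled here.
    """
    p_a = events_active / n_active
    p_c = events_control / n_control

    # Risk ratio with a confidence interval computed on the log scale.
    rr = p_a / p_c
    se_log_rr = math.sqrt(1 / events_active - 1 / n_active
                          + 1 / events_control - 1 / n_control)
    rr_ci = (math.exp(math.log(rr) - z * se_log_rr),
             math.exp(math.log(rr) + z * se_log_rr))

    # Risk difference with a Wald confidence interval.
    rd = p_a - p_c
    se_rd = math.sqrt(p_a * (1 - p_a) / n_active
                      + p_c * (1 - p_c) / n_control)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    # Report an NNT (or NNH) only when the RD interval excludes zero.
    significant = rd_ci[0] > 0 or rd_ci[1] < 0
    nnt, nnt_ci = None, None
    if significant:
        nnt = 1 / abs(rd)
        lo_abs, hi_abs = sorted(abs(limit) for limit in rd_ci)
        # The larger absolute RD limit gives the smaller NNT limit.
        nnt_ci = (1 / hi_abs, 1 / lo_abs)

    return {"rr": rr, "rr_ci": rr_ci, "rd": rd, "rd_ci": rd_ci,
            "nnt": nnt, "nnt_ci": nnt_ci}
```

With hypothetical counts of 60/100 responders on active treatment and 40/100 on placebo, `rr_rd_nnt(60, 100, 40, 100)` gives an RR of 1.5, an RD of 0.2, and an NNT of 5.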

Quality of the evidence

We used the GRADE (Grades of Recommendation, Assessment, Development and Evaluation) system to assess the quality of the evidence related to the key outcomes listed in 'Types of outcome measures', as appropriate (Appendix 2). Two review authors independently rated the quality of each outcome, without reference to any GRADE evaluation in the original reports.

We paid particular attention to inconsistency, where point estimates vary widely across studies, or confidence intervals (CIs) of studies show minimal or no overlap (Guyatt 2011), and potential for publication bias, based on the amount of unpublished data required to make the result clinically irrelevant (Moore 2008b). Small studies have been shown to overestimate treatment effects, probably because the conduct of small studies is more likely to be less rigorous, allowing critical criteria to be compromised (Dechartres 2013; Nüesch 2010), and large studies often have smaller treatment effects (Dechartres 2014). Cochrane Reviews have been criticised for perhaps over‐emphasising results of underpowered studies or analyses (AlBalawi 2013; Turner 2013), and simulation studies demonstrate that small studies have low power to estimate treatment effect with accuracy (Moore 1998; Thorlund 2011).

In addition, there may be circumstances where the overall rating for a particular outcome needs to be adjusted as recommended by GRADE guidelines (Guyatt 2013a). For example, if there are so few data that the results are highly susceptible to the random play of chance, or if studies use last observation carried forward (LOCF) imputation in circumstances where there are substantial differences in adverse event withdrawals (Moore 2012), one would have no confidence in the result, and would need to downgrade the quality of the evidence by three levels, to very low quality. In circumstances where there were no data reported for an outcome, we would report the level of evidence as very low quality (Guyatt 2013b).

We report both the GRADE assessment made by the authors of individual Cochrane reviews, and a GRADE assessment made by us based on current knowledge. We used the following descriptors for levels of evidence (EPOC 2015); "substantially different" in this context implies a large enough difference that it might affect a decision.

High: this research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different is low.
Moderate: this research provides a good indication of the likely effect. The likelihood that the effect will be substantially different is moderate.
Low: this research provides some indication of the likely effect. However, the likelihood that it will be substantially different is high.
Very low: this research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different is very high.

We used the amount and quality of evidence to report results in a hierarchical way (Moore 2015). We split the available information into five groups, essentially according to the GRADE descriptors.

Drugs and doses for which Cochrane Reviews found no information (very low‐quality evidence).
Drugs and doses for which Cochrane Reviews found inadequate information: fewer than 200 participants in comparisons, in at least two studies (very low‐quality evidence).
Drugs and doses for which Cochrane Reviews found evidence of effect, but where results were potentially subject to publication bias. We considered the number of additional participants needed in studies with zero effect (relative benefit of one) required to change the NNT for at least 50% maximum pain relief to an unacceptably high level (in this case the arbitrary NNT of 10) (Moore 2008b). Where this number was less than 400 (equivalent to four studies with 100 participants per comparison, or 50 participants per group), we considered the results to be susceptible to publication bias and therefore unreliable (low‐quality evidence).
Drugs and doses for which Cochrane Reviews found no evidence of effect or evidence of no effect: more than 200 participants in comparisons, but where there was no statistically significant difference from placebo (moderate‐ or high‐quality evidence).
Drugs and doses for which Cochrane Reviews found evidence of effect, where results were reliable and not subject to potential publication bias (high‐quality evidence).
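The five groups above can be read as a simple decision rule, sketched below. This is a mechanical simplification for illustration: the function names, the group numbering (taken from the order of the list), and the formula for the number of null‐effect participants are assumptions, and the review authors also applied judgement that a rule like this does not capture.

```python
def null_participants_needed(n_participants, nnt, threshold_nnt=10.0):
    """Participants in null-effect studies needed to raise the NNT to the
    threshold, assuming simple pooling and equal allocation (an assumption)."""
    if nnt >= threshold_nnt:
        return 0
    return round(n_participants * (threshold_nnt / nnt - 1))

def evidence_group(n_participants, n_studies, significant, nnt=None,
                   min_participants=200, min_studies=2, bias_cutoff=400):
    """Return the group number (1 to 5) in the order listed above."""
    if n_participants == 0:
        return 1  # no information
    if n_participants < min_participants or n_studies < min_studies:
        return 2  # inadequate information
    if not significant:
        return 4  # no evidence of effect, or evidence of no effect
    if nnt is None:
        raise ValueError("an NNT is required for a significant result")
    if null_participants_needed(n_participants, nnt) < bias_cutoff:
        return 3  # effect, but potentially subject to publication bias
    return 5      # effect, reliable

# Illustrative inputs: 314 participants in two studies with an NNT of 1.8,
# and 241 participants in two studies with an NNT of 3.9.
reliable = evidence_group(314, 2, significant=True, nnt=1.8)
susceptible = evidence_group(241, 2, significant=True, nnt=3.9)
```

Under these assumptions the first input needs about 1430 null‐effect participants and falls in group 5, while the second needs about 377 and falls in group 3.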

'Summary of findings' table

We did not plan to include a 'Summary of findings' table, as set out in the author guide (PaPaS 2012), and recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Chapter 4.6.6, Higgins 2011). The reasons include the difficulty of using such tables when there are many different conditions, interventions, and outcomes, because they quickly become unwieldy.

We planned to summarise information in the text, as appropriate. In‐text tables were organised in the format of condition, intervention, outcome, numbers of studies and participants, RR or RD.

Key information included the quality of evidence and the magnitude of effect of the interventions examined, as appropriate for the condition studied. This included, for example in chronic pain conditions, the sum of available data on the outcomes of 'substantial benefit' (Patient Global Impression of Change (PGIC) very much improved from weeks 2 to 8 and weeks 2 to 12), 'moderate benefit' (PGIC much or very much improved from weeks 2 to 8 and weeks 2 to 12), withdrawals due to adverse events, withdrawals due to lack of efficacy, serious adverse events, and death (a particular serious adverse event).

Results

We included 13 reviews (Cameron 2011; Cameron 2013; Cui 2010; Cumpston 2009; Derry 2012; Derry 2014a; Derry 2014b; Derry 2015; Derry 2016; Derry 2017; Otlean 2014; Pattanittum 2013; Wrzosek 2015). We excluded one review because it was a protocol without data (Johnston 2007).

Description of included reviews

The 13 included reviews covered a range of treatments for acute and chronic pain conditions. The reviews involved 206 studies with around 30,700 participants.

Acute pain was addressed in four reviews. These were:

topical salicylate‐containing rubefacients for acute injuries (mainly strains, sprains, and acute low back pain) (Derry 2014b);
topical NSAIDs for strains and sprains (Derry 2015);
glyceryl trinitrate for rotator cuff injury among people with acute symptoms (Cumpston 2009);
topical NSAIDs for lateral elbow pain (Pattanittum 2013).

Chronic pain was examined in 12 reviews. These were:

topical glyceryl trinitrate for chronic rotator cuff pain (Cumpston 2009);
topical NSAIDs for chronic lateral elbow pain (Pattanittum 2013);
topical herbal remedies for rheumatoid arthritis (Cameron 2011), osteoarthritis (Cameron 2013), neck pain due to cervical degenerative disease (Cui 2010), and low back pain (Otlean 2014);
topical NSAIDs for chronic musculoskeletal conditions (Derry 2016);
topical salicylate‐containing rubefacients for chronic musculoskeletal conditions (Derry 2014b);
topical capsaicin high‐ and low‐concentration for neuropathic pain (Derry 2012; Derry 2017);
topical lidocaine for neuropathic pain (Derry 2014a);
topical clonidine for neuropathic pain (Wrzosek 2015).

Summary table A has details of the number of included studies and participants, the intervention, comparators, and the condition treated, whether acute or chronic, and the time of assessment.

Summary table A: Details of included reviews

Review	Condition treated	Studies	Participants	Topical intervention	Comparators	Assessment time (weeks)
Acute pain conditions
Derry 2014b	Strains, sprains, low back pain	6	697	Salicylate rubefacients	Placebo and active	1
Derry 2015	Strains and sprains	61	9001	NSAID	Placebo and active	1
Acute and chronic pain conditions
Cumpston 2009	Rotator cuff disease or chronic tendinopathy	3	121	Glyceryl trinitrate	Placebo	1 ‐ 24
Pattanittum 2013	Lateral elbow pain	15	759	NSAID	Placebo	2 ‐ 4
Chronic pain conditions
Cameron 2011	Rheumatoid arthritis	22	1243	Herbal remedies	Placebo and active	Up to 26
Cameron 2013	Osteoarthritis	7	785	Herbal remedies	Placebo and active	Up to 4
Cui 2010	Neck pain due to degenerative disease	4	1100	Herbal remedies	Placebo and active	Up to 4
Derry 2012	Neuropathic pain	6	389	Topical capsaicin low‐concentration	Placebo	6 ‐ 8
Derry 2014a	Neuropathic pain	12	508	Lidocaine	Placebo	From a single dose up to 12 weeks
Derry 2014b	Chronic musculoskeletal conditions	10	671	Salicylate rubefacients	Placebo and active	2
Derry 2016	Chronic musculoskeletal conditions	39	10631	NSAID	Placebo and active	2 ‐ 12
Derry 2017	Neuropathic pain	8	2488	Topical capsaicin high‐concentration	Placebo	8 ‐ 12
Otlean 2014	Low back pain	14	2050	Herbal remedies	Placebo and active	Typically 3
Wrzosek 2015	Neuropathic pain	2	344	Clonidine	Placebo	8 ‐ 12

Methodological quality of included reviews

All the reviews met all AMSTAR criteria (Shea 2007). They:

had a priori design;
performed duplicate study selection and data extraction;
had a comprehensive literature search;
included published and any unpublished studies irrespective of language of publication, although not all reviews contacted companies or researchers for unpublished trial data;
provided a list of included and excluded studies;
provided characteristics of included studies;
assessed and documented the scientific quality of the included studies;
used the scientific quality of the included studies appropriately in formulating conclusions, because only studies with minimal risk of bias were included (a particular issue was trial size, but conclusions were not drawn from inadequate data sets, based on previously established criteria (Moore 1998));
used appropriate methods to combine findings of studies and, importantly, provided analyses according to drug dose; and
had appropriate conflict of interest statements.

All reviews except two reported a GRADE assessment, but not necessarily for all comparisons or outcomes (Cui 2010; Derry 2012).

Effect of interventions

We have reported first the preferred efficacy outcome of at least 50% pain relief compared with placebo, followed by other efficacy outcomes, and comparisons with other active treatments.

At least 50% pain relief

1 Interventions for which Cochrane Reviews found no information

None of the reviews reported finding no information.

2 Interventions for which Cochrane Reviews found inadequate information

A number of reviews reported that there were interventions where the amount of information was small, with fewer than 200 participants in at least two studies. These included:

Acute pain

Topical benzydamine for strains and sprains (Derry 2015). There were 193 participants in three studies. Pooled analysis demonstrated no difference between topical benzydamine and topical placebo. The review authors made no GRADE assessment for this comparison in the review.

Chronic pain

Glyceryl trinitrate for rotator cuff disease (Cumpston 2009). There were fewer than 200 participants in total, with no pooled analysis of the three included studies. The GRADE assessment made by the review authors for quality of evidence in a single study with 20 participants was low.
Evening primrose oil, borage seed oil, blackcurrant seed oil (with gamma‐linolenic acid) versus placebo in rheumatoid arthritis (Cameron 2011). Pooled analysis of three studies for pain intensity involved only 82 participants. The GRADE assessment made by the review authors for quality of evidence for this was moderate. No other herbal remedies had adequate information on efficacy.
No herbal remedy had adequate information on efficacy in osteoarthritis in 200 participants (Cameron 2013). The GRADE assessment made by the review authors for quality of evidence for this was moderate.
No herbal remedy had adequate information on efficacy in randomised double‐blind studies in neck pain in 200 participants (Cui 2010). No GRADE assessment was made.
No herbal remedy had adequate information on efficacy in low back pain in 200 participants (Otlean 2014). The GRADE assessment made by the review authors for quality of evidence for this was very low.
Topical lidocaine for neuropathic pain (Derry 2014a). Few analyses were possible due to poor reporting. Only a single study with 58 participants provided relevant data. The GRADE assessment made by the review authors for quality of evidence for this was very low.
Topical capsaicin low‐concentration (0.075%) for neuropathic pain (Derry 2012). There were 124 participants in two studies. Pooled analysis demonstrated no difference between topical capsaicin low‐concentration and topical placebo. No specific GRADE assessment was made.
Topical clonidine for neuropathic pain (Wrzosek 2015). Only a single study with 179 participants provided relevant data. The GRADE assessment made by the review authors for quality of evidence for this was low.

We assessed the evidence quality for all these interventions as very low. This means that this research does not provide a reliable indication of the likely effect and that the likelihood that the effect will be substantially different is very high. In doing this, we agreed with GRADE assessments made by the review authors in two reviews (Derry 2014a; Otlean 2014), but we disagreed with all others, where GRADE assessments were either low or moderate. Moderate‐quality evidence, for example, implies that the research provides a good indication of the likely effect. With fewer than 200 participants in trials with methodological problems that come with a high risk of bias, that seems improbable.

3 Interventions for which Cochrane Reviews found no evidence of effect or evidence of no effect

No reviews demonstrated null effect for a topical intervention compared with topical placebo.

We agreed with the original authors that the evidence for topical salicylate rubefacients in acute pain conditions was very low quality (Derry 2014b). This was despite an apparently significant effect with RR of 1.9 (95% CI 1.5 to 2.5) and NNT of 3.2 (2.4 to 4.9) calculated from the pooled analysis and with an apparently low susceptibility to publication bias calculated from the aggregated data of 689 participants. The reason was that the most recent high‐quality study in that review showed no difference between topical salicylate and topical placebo. There were quality and potential bias issues other than simply publication bias, and we and the original authors judged that the available data did not amount to good evidence of effect, or of no effect.

4 Interventions for which Cochrane Reviews found evidence of effect, but where results were potentially subject to publication bias

Some reviews had some evidence of effect, but in these either the number of participants was low, or the size of the effect was low, or both. In that circumstance the number of participants in unpublished, null‐effect studies required to render the result susceptible to publication bias can be small. Where this number was less than 400 (equivalent to four studies with 100 participants per comparison, or 50 participants per group), we considered the results to be susceptible to publication bias and therefore unreliable and indicative of low‐quality evidence. The appropriateness or otherwise of this categorisation is discussed below, but these results are the least reliable of those available from the reviews.

Summary table B shows the topical treatments where our judgement was of high susceptibility to publication bias (low‐quality evidence), and ordered according to susceptibility to publication bias. Comparisons were all with placebo. Acute conditions treated were strains and sprains, acute low back pain (Derry 2014b; Derry 2015), or acute lateral elbow pain (Pattanittum 2013). Chronic conditions were osteoarthritis or low back pain (Derry 2014b), or hand and knee osteoarthritis (Derry 2016). The number in the susceptibility to publication bias column refers to the number of participants in studies with null effect needed to produce an NNT worse than 10. For topical capsaicin (high‐concentration) the measured effect was above 10 (Derry 2017), and so we judged it to be subject to possible publication bias due to the wide confidence interval of the NNT.

Also included in Summary table B is salicylate rubefacients for acute pain. This was despite there needing to be 689 participants in unpublished, null‐effect trials to increase the NNT to our threshold of 10. The review authors graded this as very low‐quality evidence, and pointed out that the most recent study with over half of all participants had a null effect (Derry 2014b). All the studies had important methodological uncertainties, eliminating any trust in the results.

We assessed the evidence quality for all these interventions as very low or, in one case moderate, based on the underlying quality of studies in the reviews and susceptibility to publication bias.

In acute pain we judged the evidence to be very low quality for ketoprofen plaster, diclofenac gel other than Emulgel, indomethacin, and ibuprofen cream in strains and sprains, and for diclofenac (unspecified formulation) in lateral elbow pain. This means that this research does not provide a reliable indication of the likely effect, and that the likelihood that the effect will be substantially different is very high. We disagreed with the moderate‐quality GRADE assessment given for ibuprofen gel, which was based on a high RR, low NNT, and a susceptibility to publication bias on the borderline of 400 participants. Moderate‐quality research provides a good indication of the likely effect; with only two studies and 241 participants, a more conservative view is that the quality of the evidence is low: the research provides some indication of the likely effect, but the likelihood that it will be substantially different is high.

In chronic pain we agreed with the GRADE assessments in the original reviews. Salicylate rubefacient studies have a number of problems over and above susceptibility to publication bias; importantly for a chronic condition, the time of assessment was only at two weeks (Derry 2014b). A very low‐quality GRADE assessment is therefore reasonable. For various formulations of diclofenac in chronic pain, there was very considerable evidence from 2343 participants of a low effect, and NNT of close to 10. A judgement of moderate‐quality research is appropriate because the quantity of data provides a good indication of the likely effect. We therefore agreed with the GRADE assessments in the original reviews (Derry 2014b; Derry 2016).

Summary table B: Results potentially subject to publication bias

			Percent with outcome
Reference	Topical treatment	Studies/participants	Active	Placebo	RR 95% CI	NNT 95% CI	Susceptibility to publication bias	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Ibuprofen ‐ gel	2/241	42	16	2.7 (1.7 to 4.2)	3.9 (2.7 to 6.7)	377	Moderate quality
Derry 2015	Ibuprofen ‐ cream	3/195	71	56	1.3 (1.03 to 1.6)	6.4 (3.4 to 41)	110	No specific GRADE given
Pattanittum 2013	Diclofenac (unspecified formulation)	3/153	Continuous data used		Not reported	7 (3 to 21)	66	Very low quality
Derry 2015	Indomethacin	3/341	58	46	1.3 (1.03 to 1.6)	8.3 (4.4 to 65)	73	No specific GRADE given
Derry 2015	Diclofenac ‐ gel other than Emulgel	1/232	94	82	1.2 (1.1 to 1.3)	8.0 (4.8 to 24)	58	No specific GRADE given
Derry 2015	Ketoprofen ‐ plaster	2/335	73	60	1.2 (1.04 to 1.4)	8.2 (4.5 to 47)	29	No specific GRADE given
Chronic pain conditions
Derry 2014b	Salicylate rubefacient	6/455	45	28	1.6 (1.2 to 2.0)	6.2 (4.0 to 13)	279	Very low quality

Footnote: CI: confidence interval; NNT: number needed to treat for an additional beneficial outcome; RR: risk ratio

5 Interventions for which Cochrane Reviews found evidence of effect, where results were reliable and not subject to potential publication bias

Reliable results are presented in Summary table C, ordered according to susceptibility to publication bias. Comparisons were all with placebo.

For topical NSAIDs, acute conditions treated were strains and sprains treated for one week (Derry 2015), while chronic conditions were knee and hand osteoarthritis treated for two to less than six weeks (diclofenac), or treated for 6 to 12 weeks (ketoprofen), or chronic musculoskeletal conditions (Derry 2014b; Derry 2016).

In acute pain conditions NNTs for topical NSAIDs varied from 1.8 for diclofenac Emulgel to 4.7 for diclofenac formulated as Flector plaster. The number of participants was modest, ranging from 314 in studies of diclofenac Emulgel to 1030 for studies of diclofenac formulated as Flector plaster. Results involving low (good) NNTs were particularly robust, with over 1000 participants in studies with null effect needed to overturn all results except that for piroxicam gel, and over 5000 needed for diclofenac formulated as Flector plaster.

In chronic pain conditions, NNTs for topical NSAIDs were higher, between 5 and 10, depending on NSAID and study duration. The numbers of participants were high for ketoprofen at 2573 and diclofenac in longer‐term studies (2343), but modest for diclofenac in studies of shorter duration (732). Over 1000 participants in unpublished, null‐effect studies would be required to overturn the result for ketoprofen, and only 48 for diclofenac in studies between six and 12 weeks. The results for topical diclofenac are included in this section because the numbers in studies were large, the effect size was small, and because the review identified around 6000 participants in unpublished studies of topical NSAIDs (Derry 2016). The existence of a large body of evidence with a large effect size in this quantity of unpublished material is unlikely, and the susceptibility of the result to publication bias is small.

For topical NSAIDs in acute and chronic pain we assessed the evidence quality for these interventions as moderate or high quality based on the underlying quality of studies in the reviews and susceptibility to publication bias. The high quality assessment was based on a large effect size (NNT of below 2), consistent results in moderately‐sized recent studies of high quality, and a low susceptibility to publication bias. High quality indicates that the research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different is low. We also considered the evidence for diclofenac as Flector plaster to be high quality because of the low susceptibility to publication bias (Derry 2015).

Most other GRADE assessments in the original reviews were moderate. While the research provided a good indication of the likely effect, effect sizes tended to be smaller, with somewhat greater susceptibility to publication bias (Derry 2015; Derry 2016). We also considered the other diclofenac plasters and piroxicam gel to have moderate‐quality evidence, although it was not specifically graded in the original review.

One important comparison, between topical and oral NSAIDs (Derry 2016), was difficult to assess in terms of GRADE. Of the five studies (1735 participants) only two compared the same NSAID (diclofenac) in both topical and oral formulations; the three others compared a topical NSAID against an oral formulation of a different NSAID. The results were consistent, finding no greater or lesser benefit with topical NSAID (55%) than oral NSAID (54%). The RR was 1.0 (95% CI 0.95 to 1.1). As there was no difference, susceptibility to publication bias could not be calculated. No specific GRADE assessment was given in the review.

Topical high‐concentration capsaicin in neuropathic pain conditions had an NNT for postherpetic neuralgia of 11, and so susceptibility to publication bias could not be calculated using our prespecified method. The original review reported results according to type of neuropathic pain condition, and found similar estimates of beneficial effect in HIV neuropathy, but not peripheral diabetic neuropathy, and with similar results across different efficacy outcomes. Unpublished large studies with substantially higher efficacy are unlikely to exist, and so the potential for publication bias is small. We agreed with the GRADE assessment of moderate‐quality evidence.

Summary table C: Results not subject to publication bias

			Percent with outcome
Reference	Topical treatment	Studies/participants	Active	Placebo	RR 95% CI	NNT 95% CI	Susceptibility to publication bias	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Diclofenac ‐ Flector plaster	4/1030	63	41	1.5 (1.4 to 1.7)	4.7 (3.7 to 6.5)	5029	No specific GRADE given
Derry 2015	Diclofenac ‐ Emulgel	2/314	78	20	3.8 (2.7 to 5.5)	1.8 (1.5 to 2.1)	1430	High quality
Derry 2015	Ketoprofen ‐ gel	5/348	72	33	2.2 (1.7 to 2.8)	2.5 (2.0 to 3.4)	1044	Moderate quality
Derry 2015	Diclofenac ‐ other plaster	3/474	88	57	1.6 (1.4 to 1.8)	3.2 (2.6 to 4.2)	1007	No specific GRADE given
Derry 2015	Piroxicam gel	3/522	70	47	1.5 (1.3 to 1.7)	4.4 (3.2 to 6.9)	664	No specific GRADE given
Chronic pain conditions
Derry 2016	Ketoprofen gel	4/2573	63	48	1.1 (1.01 to 1.2)	6.9 (5.4 to 9.3)	1156	Moderate quality
Derry 2016	Diclofenac (< 6 weeks' duration)	5/732	43	23	1.9 (1.5 to 2.3)	5.0 (3.7 to 7.4)	732	Moderate quality
Derry 2016	Diclofenac ‐ various formulations (> 6 weeks' duration)	4/2343	60	50	1.2 (1.1 to 1.3)	9.8 (7.1 to 16)	48	Moderate quality
Derry 2017	Capsaicin (high‐concentration)	2/571	33	24	1.3 (1.0 to 1.7)	11 (6.1 to 62)	Result above threshold of 10	Moderate quality

Footnote: CI: confidence interval; NNT: number needed to treat for an additional beneficial outcome; RR: risk ratio

Other pain outcomes

Topical capsaicin 0.075% for neuropathic pain had some information on pain efficacy in 258 participants, of whom only 124 had the important outcome of at least 50% pain intensity reduction (Derry 2012). Null data from only 102 participants would be required to produce a NNT of 10, which might be considered a threshold for useful efficacy in neuropathic pain. No GRADE assessment was made, but the authors reported that no effect size for efficacy could be safely calculated based on the available data (Moore 2013b).

Comparisons other than placebo

Only one review had sufficient information for an analysis of a topical versus an oral analgesic. Derry 2016 analysed five osteoarthritis studies in which topical piroxicam, ketoprofen, or diclofenac was compared with oral ibuprofen (900 mg and 1200 mg daily), celecoxib (200 mg daily), or diclofenac (100 mg or 150 mg daily) over 3 to 12 weeks. In 1735 participants there was no difference in the proportion achieving treatment success equivalent to at least 50% pain intensity reduction with topical NSAID (55%) and oral NSAID (54%). The RR was 1.0 (95% CI 0.95 to 1.1). The GRADE assessment for quality of evidence for this was moderate.

Withdrawals

Summary table D presents the available results for withdrawals, either as all‐cause for acute pain conditions, or due to lack of efficacy and adverse events for chronic pain conditions. In each case the comparison is with placebo. Not all reviews were able to report information on withdrawals due to incomplete or inconsistent reporting in the original studies, or because there was so little information available.

The amount of available information varied considerably, from as few as 179 participants with topical clonidine (where results are presented for completeness), to 5790 for all NSAIDs in acute pain such as strains and sprains. In almost all cases the number of events was limited, as the rate of events was typically below 10%. Even when there was an apparently large effect, as with adverse event withdrawals with low‐concentration capsaicin, the number of events was small, which meant that the GRADE estimation was very low quality. By contrast, larger numbers of participants did allow a greater confidence in both a low rate of withdrawals and the lack of difference between active and placebo topical treatments. We agreed with all the GRADE assessments concerning withdrawal.

Summary table D: Summary of available results on withdrawals, and cause

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNTp or NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Derry 2015 (Adverse event withdrawals)	Strains and sprains	All NSAIDs	42/5790	1	1	1.0 (0.7 to 1.7)	n/c	High quality
Pattanittum 2013 (All‐cause withdrawals)	Lateral elbow pain	Diclofenac	4/485	0	0	n/c	n/c	Low
Chronic pain conditions (lack of efficacy withdrawals)							NNTp
Derry 2014a	Neuropathic pain	Lidocaine	4/293	7	12	0.6 (0.3 to 1.1)	n/c	No specific GRADE given
Derry 2014b	Acute and chronic	Salicylate	5/501	2	7	0.4 (0.2 to 0.9)	21 (12 to 120)	Very low quality
Derry 2016	Knee and hand OA	Diclofenac	11/3455	6	9	0.6 (0.5 to 0.8)	26 (18 to 47)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2885	6	9	1.1 (0.8 to 1.6)	n/c	Moderate quality
Derry 2017	Neuropathic pain	Capsaicin (high‐concentration)	6/2073	2	3	0.6 (0.3 to 1.04)	n/c	Moderate quality
Wrzosek 2015	Neuropathic pain	Clonidine	1/179	1	1	1.0 (0.06 to 16)	n/c	Very low quality
Chronic pain conditions (adverse event withdrawals)							NNH
Derry 2012	Neuropathic pain	Capsaicin (low‐concentration)	4/477	15	3	5.0 (2.3 to 11)	8.1 (5.7 to 14)	No specific GRADE given
Derry 2014a	Neuropathic pain	Lidocaine	5/349	2	1	1.2 (0.3 to 4.6)	n/c	No specific GRADE given
Derry 2014b	Acute and chronic	Salicylate	7/737	5	1	4.2 (1.5 to 12)	26 (15 to 85)	Very low quality
Derry 2016	Knee and hand OA	Diclofenac	12/3552	5	4	1.6 (1.1 to 2.1)	51 (30 to 170)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2621	6	5	1.3 (0.9 to 1.8)	n/c	Moderate quality
Derry 2017	Neuropathic pain	Capsaicin (high‐concentration)	8/2487	1	1	0.8 (0.4 to 1.8)	n/c	Moderate quality
Wrzosek 2015	Neuropathic pain	Clonidine	1/179	1	3	0.3 (0.03 to 3.2)	n/c	Very low quality

Footnote: CI: confidence interval; n/c = not calculated; NNTp = number needed to treat to prevent; NNH = number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drugs; OA: osteoarthritis; RR: risk ratio.
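As a rough cross-check of the NNTp and NNH columns above, the sketch below recomputes each value as the reciprocal of the absolute difference between the active and placebo percentages. This is an illustrative assumption about the arithmetic, not the reviews' own calculation, which would pool unrounded study data; because the percentages in the table are rounded, the recomputed values only approximate the reported ones. The function name `number_needed` is hypothetical.

```python
def number_needed(active_pct: float, placebo_pct: float) -> float:
    """Return 1 / |absolute risk difference| from two percentages.

    Read as NNH when the active arm has more events, and as NNTp
    when the active arm has fewer events.
    """
    diff = abs(active_pct - placebo_pct) / 100.0
    if diff == 0:
        raise ValueError("no difference between arms; value is undefined")
    return 1.0 / diff


# (row, active %, placebo %, point estimate printed in Summary table D)
ROWS = [
    ("Capsaicin (low-concentration), adverse event withdrawals", 15, 3, 8.1),
    ("Salicylate, adverse event withdrawals", 5, 1, 26),
    ("Salicylate, lack of efficacy withdrawals", 2, 7, 21),
]

for label, active, placebo, reported in ROWS:
    recomputed = number_needed(active, placebo)
    print(f"{label}: recomputed {recomputed:.1f}, reported {reported}")
```

The recomputed figures sit close to the reported ones; the small gaps are what one would expect from working with whole-number percentages.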

Adverse events

Summary table E presents the available results for participants experiencing at least one adverse event over the duration of the studies. In each case the comparison is with placebo. Not all reviews were able to report information on withdrawals due to incomplete or inconsistent reporting in the original studies, or because there was so little information available. Topical capsaicin (high‐concentration) presented particular problems for reporting adverse events because the application was usually undertaken after local anaesthesia (Derry 2017). While adverse events were generally well reported this was not done in a way that is comparable with other studies.

With the exception of topical salicylate rubefacients in all available acute and chronic pain studies combined, there were no differences between active topical analgesic and topical placebo. For topical salicylates, more participants with topical salicylate reported an adverse event, with a NNH of 17 (9.9 to 58).

We agreed with all GRADE assessments with the exception of diclofenac for lateral elbow pain. The review authors judged the evidence concerning adverse events to be low quality (Pattanittum 2013); we judged it to be very low quality because the very small number of participants and the low event rates meant that there were very few events.

Summary table E: Participants with at least one adverse event

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Pattanittum 2013	Lateral elbow pain	Diclofenac	3/153	2	1	1.6 (0.2 to 12)	n/c	Low quality
Derry 2015 (Systemic adverse events)	Strains and sprains	All NSAIDs	36/5576	3	4	1.0 (0.7 to 1.3)	n/c	High quality
Chronic pain conditions							NNH
Wrzosek 2015	Neuropathic pain	Clonidine	2/344	12	13	0.7 (0.1 to 3.1)	n/c	Very low quality
Derry 2016 (Systemic adverse events)	Knee and hand OA	Diclofenac	7/1266	6	7	0.9 (0.6 to 1.3)	n/c	Very low quality
Acute and chronic pain conditions							NNH
Derry 2014b	OA or LBP	Salicylate	11/984	15	9	1.6 (1.2 to 2.0)	17 (9.9 to 58)	Low quality

Footnote: CI: confidence interval; LBP: low back pain; n/c: not calculated; NNH: number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drug; OA: osteoarthritis; RR: risk ratio

Summary table F presents the available results for participants experiencing a local adverse event over the duration of the studies. In each case the comparison is with placebo. Not all reviews were able to report information on withdrawals due to incomplete or inconsistent reporting in the original studies, or because there was so little information available. For topical capsaicin high‐concentration, studies that did not capture the application‐associated local adverse events reported event rates that were not different between topical capsaicin high‐concentration and placebo, with the exception of pain in around 10% of participants.

There were three interventions for which there were more local adverse events with topical analgesic than with placebo. These were topical capsaicin (low‐concentration) for neuropathic pain with a NNH of 2.5 (2.1 to 3.1) (Derry 2012), topical diclofenac for hand and knee osteoarthritis with a NNH of 16 (12 to 23) (Derry 2016), and topical salicylate rubefacients for osteoarthritis or low back pain with a NNH of 31 (16 to 300) (Derry 2014b).

We agreed with all GRADE assessments made by the authors, ranging from high quality for all NSAIDs in acute painful conditions, to very low quality for topical clonidine in neuropathic pain. In acute pain, results for individual topical NSAIDs were not graded, but based on the evidence available we graded the results high quality (diclofenac, ketoprofen, and piroxicam) or moderate quality (felbinac, indomethacin, and ibuprofen), based mainly on the amount of information available. Topical capsaicin (low‐concentration) evidence was not graded, but we considered it high quality based on a large effect size, consistency between studies, and a reasonably large number of events, together with biological plausibility.

Summary table F: Participants with at least one local adverse event

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Strains and sprains	All NSAIDs	42/6740	4.3	4.6	0.98 (0.8 to 1.2)	n/c	High quality
Derry 2015	Strains and sprains	Diclofenac	15/3271	3.1	4.3	0.8 (0.6 to 1.1)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Ketoprofen	8/852	11	9.5	1.2 (0.8 to 1.7)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Piroxicam	3/522	2.3	5.4	0.4 (0.2 to 1.1)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Felbinac	3/397	3.0	1.5	1.9 (0.5 to 7.5)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Indomethacin	3/354	6.3	2.2	2.7 (0.9 to 7.7)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Ibuprofen	3/321	10	4.3	2.3 (0.98 to 5.4)	n/c	No specific GRADE given
Chronic pain conditions							NNH
Derry 2012	Neuropathic pain	Capsaicin (low‐concentration)	5/557	63	24	2.6 (2.1 to 3.3)	2.5 (2.1 to 3.1)	No specific GRADE given
Derry 2016	Knee and hand OA	Diclofenac	15/3658	14	8	1.8 (1.5 to 2.2)	16 (12 to 23)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2621	15	13	1.0 (0.85 to 1.3)	n/c	Moderate quality
Derry 2017	Neuropathic pain	Capsaicin (high‐concentration; pain only)	4/1005	10	4	2.4 (1.4 to 4.1)	16 (11 to 31)	Moderate quality
Wrzosek 2015	Neuropathic pain	Clonidine	2/344	12	13	0.7 (0.1 to 3.1)	n/c	Very low quality
Acute and chronic pain conditions							NNH
Derry 2014b	Strains and sprains, OA or LBP	Salicylate	10/869	6	2	2.2 (1.1 to 4.1)	31 (16 to 300)	Very low quality

Footnote: CI: confidence interval; LBP: low back pain; n/c: not calculated; NNH: number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drug; OA: osteoarthritis; RR: risk ratio
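In the table above, an NNH appears only in rows where the 95% CI of the risk ratio excludes 1; all other rows are marked n/c. That convention is inferred from the table and is not stated explicitly in this section. Assuming it holds, the minimal sketch below applies the rule to a few rows and compares the result with what the table prints. The helper name `ci_excludes_one` is hypothetical.

```python
def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when a risk-ratio confidence interval lies wholly above or below 1."""
    return lower > 1.0 or upper < 1.0


# (treatment, RR CI lower, RR CI upper, NNH column as printed in Summary table F)
ROWS = [
    ("Capsaicin (low-concentration)", 2.1, 3.3, "2.5 (2.1 to 3.1)"),
    ("Diclofenac, knee and hand OA", 1.5, 2.2, "16 (12 to 23)"),
    ("Salicylate", 1.1, 4.1, "31 (16 to 300)"),
    ("Ibuprofen, strains and sprains", 0.98, 5.4, "n/c"),
    ("Piroxicam, strains and sprains", 0.2, 1.1, "n/c"),
    ("All NSAIDs, strains and sprains", 0.8, 1.2, "n/c"),
]

for name, lower, upper, printed in ROWS:
    status = "NNH expected" if ci_excludes_one(lower, upper) else "n/c expected"
    print(f"{name}: RR CI {lower} to {upper} -> {status}; table shows {printed}")
```

Borderline rows such as ibuprofen (lower limit 0.98) show why the point estimate alone is not enough: the risk ratio is above 2, but the interval still includes 1, so no NNH is given.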

Serious adverse events

Serious adverse events were uncommon. They were not noted in several reviews of short duration or with sparse data (Cui 2010; Cumpston 2009; Derry 2014b; Oltean 2014; Pattanittum 2013). A number of reviews indicated that there were no serious adverse events reported (Cameron 2013; Derry 2012; Derry 2015), or that there was no difference between topical analgesic and placebo (Derry 2016; Derry 2017). Of the remainder, one found no clear data (Wrzosek 2015), one had some possible serious adverse events for one herbal preparation (Cameron 2011), and one reported six serious adverse events in 263 participants in an open‐label extension of a topical lidocaine study, apparently unrelated to treatment (Derry 2014a).

Review authors rarely gave a GRADE assessment. We assessed the evidence as very low quality, based on the sparse data available.

Discussion

Summary of main results

This overview review showed that 13 individual Cochrane Reviews assessed the efficacy and harms from a range of available topical analgesics applied to intact skin in a number of acute and chronic painful conditions. The reviews involved 206 studies with around 30,700 participants.

For efficacy, we considered that there was moderate or high‐quality evidence for several therapies, based on the underlying quality of studies in the reviews and susceptibility to publication bias in comparisons with placebo topical therapy. In acute musculoskeletal pain conditions (strains and sprains) these were diclofenac Emulgel (NNT 1.8 (1.5 to 2.1)), ketoprofen gel (NNT 2.5 (2.0 to 3.4)), piroxicam (NNT 4.4 (3.2 to 6.9)), diclofenac Flector plaster (NNT 4.7 (3.7 to 6.5)), and diclofenac other plaster (NNT 3.2 (2.6 to 4.2)). In chronic musculoskeletal pain conditions (mainly hand and knee osteoarthritis) these were topical diclofenac preparations over less than 6 weeks (NNT 5.0 (3.7 to 7.4)), ketoprofen over 6 to 12 weeks (NNT 6.9 (5.4 to 9.3)), and topical diclofenac preparations over 6 to 12 weeks (NNT 9.8 (7.1 to 16)). In postherpetic neuralgia topical high‐concentration capsaicin had moderate‐quality evidence of limited efficacy (NNT 11 (6.1 to 62)).

We judged that evidence of efficacy for all other therapies for acute or chronic pain conditions was low or very low. Limited evidence of efficacy potentially subject to publication bias existed for topical preparations of ibuprofen gels and creams, unspecified diclofenac formulations and diclofenac gel other than Emulgel, indomethacin, and ketoprofen plaster in acute pain conditions, and for salicylate rubefacients for chronic pain conditions. Evidence for other interventions (other topical NSAIDs, topical salicylate in acute pain conditions, low‐concentration capsaicin, lidocaine, or clonidine for neuropathic pain, and herbal remedies for any condition) was of very low quality, and typically was limited to single studies or comparisons with sparse data from fewer than 200 participants.
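The publication bias criterion in the Methods (adding 400 participants from zero‐effect studies raises the NNT to 10) can be sketched as a dilution calculation. The sketch assumes that null‐effect participants shrink the pooled absolute risk difference in proportion to their share of the total, so that the number of null participants needed is n × (10/NNT − 1). That formula is an assumption about the arithmetic, not a method confirmed in this section. The function names and the inputs in the usage lines are hypothetical and are not data from any included review.

```python
def null_participants_to_reach(nnt: float, participants: int,
                               threshold_nnt: float = 10.0) -> float:
    """Participants in zero-effect studies needed to raise the NNT to the threshold.

    Assumes the pooled risk difference (1 / NNT) is diluted in proportion
    to the added null-effect participants.
    """
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    if nnt >= threshold_nnt:
        return 0.0  # already at or beyond the threshold
    return participants * (threshold_nnt / nnt - 1.0)


def potentially_susceptible(nnt: float, participants: int,
                            null_budget: int = 400,
                            threshold_nnt: float = 10.0) -> bool:
    """True if null_budget zero-effect participants would push the NNT to the threshold."""
    needed = null_participants_to_reach(nnt, participants, threshold_nnt)
    return needed <= null_budget


# Hypothetical inputs for illustration only.
print(null_participants_to_reach(4.0, 250))   # 375.0
print(potentially_susceptible(4.0, 250))      # True: 375 is within the 400 budget
print(potentially_susceptible(4.0, 1000))     # False: 1500 would be needed
```

Under this sketch, a good NNT from a small body of data is fragile, while the same NNT from a larger body of data is not, which is in line with the emphasis on the amount of information throughout the overview.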

One strong message is that for a number of topical analgesics, the exact formulation used may be of critical importance. This is seen most clearly for topical NSAIDs, particularly in acute pain. Different formulations of the same drug, usually at similar concentrations, have widely different effect sizes. For example, topical diclofenac in acute pain has NNTs ranging between 1.8 and 8, depending on formulation tested, and that was true for five different formulations. Similarly ketoprofen and ibuprofen gels and creams had very different effects. There is a strong indication that for acute pain conditions, gels tend to be superior to creams.

A second message is that the analgesic efficacy of topical analgesics is not just about rubbing them in. It is commonly believed that rubbing in itself is beneficial, although there is little evidence for this except in experimental pain (Kammers 2010). While rubbing may be part of topical analgesic application, rigorous trial methods include a rubbed placebo without an active agent, thus taking rubbing out of the equation when assessing drug efficacy (Tramèr 2004). That does not mean that rubbing plays no part in the overall analgesic experience; it may even be part of the reason for the relatively high placebo response rates of 20% to 57% seen.

We assessed evidence on withdrawals as very low quality, based on small numbers of events. In chronic pain conditions lack of efficacy withdrawals were lower than placebo with topical diclofenac (NNTp 26) and topical salicylate (NNTp 21), and adverse event withdrawals were higher than placebo with topical capsaicin low‐concentration (NNH 8), topical salicylate (NNH 26), and topical diclofenac (NNH 51).

We assessed evidence for systemic or local adverse event rates being no higher with topical NSAIDs than topical placebo as high quality. We also assessed local adverse events with topical capsaicin low‐concentration (NNH 2.5) as high quality. There was moderate‐quality evidence for more local adverse events than placebo in chronic pain conditions with topical diclofenac (NNH 16) and local pain with topical capsaicin high‐concentration (NNH 16). There was moderate‐quality evidence of no additional local adverse events with topical ketoprofen over topical placebo in chronic pain.

There was very low‐quality evidence for serious adverse events. It might be argued that the absence of events in a large body of studies speaks to an absence of risk, but the limited duration of the studies makes them unsuitable for assessing rare but serious harm. Epidemiological studies provide evidence that gastrointestinal bleeding and perforation risk is not increased with topical NSAIDs (Evans 1995).

Reporting of adverse events in systematic reviews has been criticised because they compound the poor reporting of harms in primary studies by failing to report on harms or doing so inadequately (Zorzela 2014). More and better information on adverse events, and particular adverse events, would be valuable. However, the problems are numerous, and include issues around the methods of collection of adverse events, and about their reporting (Derry 2001; Edwards 1999). Other limitations in both individual studies and systematic reviews include small numbers of participants and events in studies generally not powered to measure these outcomes.

Overall completeness and applicability of evidence

While the overview review included data from 206 studies with almost 31,000 participants, there was considerable fragmentation of this body of data that limits its completeness and applicability. Fragmentation derives from five main sources.

A range of different active agents (NSAID, capsaicin, salicylate, lidocaine, clonidine, glyceryl trinitrate, and a range of different herbal remedies). Some of these, such as NSAIDs, have a range of different chemical entities with probable different efficacy at doses used.
A number of different formulations that affect, or potentially affect, efficacy; an example is different formulations of diclofenac and ketoprofen in acute and chronic pain studies.
A range of different acute and chronic pain conditions.
Trial durations inappropriate for the conditions treated.
Inconsistency in reporting efficacy outcomes of relevance, or adverse event rates. Many studies reported average pain scores or pain change only, not recognising the value people with pain place on high levels of pain reduction (Moore 2013a), or the fact that in all pain conditions a bimodal distribution for analgesic response is the norm (Moore 2013b; Moore 2013c).

These factors mean that many of the possible comparisons involve small numbers of studies, participants, and events. Small numbers limit completeness and applicability even if all other considerations are excluded. Usually there are additional imperfections in trial methods or reporting that raise the possibility of risk of bias, further limiting applicability.

Topical analgesics offer possible important benefits to older people, in part due to lower adverse events (Gaskell 2014). No evidence in these reviews specifically related to older people.

We are not aware of any major topical analgesic that is not covered by the individual reviews in this overview. Menthol is used in many products, but we did not find any Cochrane or non‐Cochrane Review. Topical use of other drugs such as ketamine and amitriptyline is known; there are a few trials, but no systematic reviews.

Quality of the evidence

Fragmentation also limits the quality of the evidence. Evidence of efficacy of moderate or high quality is available only for certain formulations of diclofenac, ketoprofen, and piroxicam in acute pain from strains and sprains, diclofenac and ketoprofen in hand and knee osteoarthritis, and high‐concentration capsaicin in postherpetic neuralgia. Evidence of moderate or high quality is available only for some adverse event outcomes, typically from pooling studies of different topical products.

There was no indication that the individual reviews were not properly performed. One area of possible concern was inconsistency in the use of GRADE. There were several instances where evidence from rather small amounts of data was graded as moderate or low in the original review. Because this was based on low numbers of participants and events, we judged evidence quality to be very low (Moore 1998; Thorlund 2011). Assessments of Cochrane Reviews have made a similar criticism (AlBalawi 2013; Turner 2013). This matters, as a GRADE assessment of moderate‐quality evidence might result in a different approach by professionals or in guidance than one of very low quality.

Potential biases in the overview process

One potential bias is the overlap between authors of the overview and of some of the individual reviews. This has been addressed by having other experienced authors for the overview.

Agreements and disagreements with other studies or reviews

We are not aware of any similar overview reviews.
