Topical analgesics for acute and chronic pain in adults ‐ an overview of Cochrane Reviews

Sheena Derry; Philip J Wiffen; Eija A Kalso; Rae Frances Bell; Dominic Aldington; Tudor Phillips; Helen Gaskell; R Andrew Moore

doi:10.1002/14651858.CD008609.pub2



Version published: 12 May 2017

https://doi.org/10.1002/14651858.CD008609.pub2


Abstract


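Background

Topical analgesic drugs are used for a variety of painful conditions. Some are acute, typically strains or sprains, tendinopathy, or muscle aches. Others are chronic, typically osteoarthritis of the hand or knee, or neuropathic pain.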

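Objectives

To provide an overview of the analgesic efficacy and associated adverse events of topical analgesics (primarily nonsteroidal anti‐inflammatory drugs (NSAIDs), salicylate rubefacients, capsaicin, and lidocaine) for the treatment of acute and chronic pain in adults.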

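Methods

We searched the Cochrane Database of Systematic Reviews (the Cochrane Library) for systematic reviews in acute and chronic pain published up to February 2017. The primary outcome was at least 50% pain relief (participant‐reported) at an appropriate duration. We extracted the number needed to treat for one additional beneficial outcome (NNT) for efficacy outcomes for each topical analgesic or formulation, and the number needed to treat for one additional harmful outcome (NNH) for adverse events. We also extracted information on withdrawals due to lack of efficacy or adverse events, systemic and local adverse events, and serious adverse events. We required information from at least 200 participants, in at least two studies. We judged that there was potential for publication bias if the addition of four zero‐effect studies of typical size (400 participants) would increase the NNT compared with placebo to 10 (minimal clinical utility). We extracted the GRADE assessments from the original papers, and made our own GRADE assessment.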

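Main results

Thirteen Cochrane Reviews (206 studies with around 30,700 participants) assessed the efficacy and harms of a range of topical analgesics applied to the skin in a number of acute and chronic painful conditions. The reviews were overseen by several Review Groups, and concentrated on evidence comparing topical analgesic with topical placebo; comparisons of topical and oral analgesics were rare.

For at least 50% pain relief, we considered the evidence to be of moderate or high quality for several therapies, based on the underlying quality of the studies and susceptibility to publication bias.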

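In acute musculoskeletal pain (strains and sprains) with assessment at about seven days, the therapies were diclofenac Emulgel (78% Emulgel, 20% placebo; 2 studies, 314 participants; NNT 1.8, 95% CI 1.5 to 2.1), ketoprofen gel (72% ketoprofen, 33% placebo; 5 studies, 348 participants; NNT 2.5, 95% CI 2.0 to 3.4), piroxicam gel (70% piroxicam, 47% placebo; 3 studies, 522 participants; NNT 4.4, 95% CI 3.2 to 6.9), diclofenac plaster (63% diclofenac, 41% placebo; 4 studies, 1030 participants; NNT 4.7, 95% CI 3.7 to 6.5), and other diclofenac plasters (88% diclofenac plaster, 57% placebo; 3 studies, 474 participants; NNT 3.2, 95% CI 2.6 to 4.2).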

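In chronic musculoskeletal pain (mainly hand and knee osteoarthritis), the therapies were topical diclofenac preparations for less than six weeks (43% diclofenac, 23% placebo; 5 studies, 732 participants; NNT 5.0, 95% CI 3.7 to 7.4), ketoprofen over 6 to 12 weeks (63% ketoprofen, 48% placebo; 4 studies, 2573 participants; NNT 6.9, 95% CI 5.4 to 9.3), and topical diclofenac preparations over 6 to 12 weeks (60% diclofenac, 50% placebo; 4 studies, 2343 participants; NNT 9.8, 95% CI 7.1 to 16). In postherpetic neuralgia, topical high‐concentration capsaicin had moderate‐quality evidence of limited efficacy (33% capsaicin, 24% placebo; 2 studies, 571 participants; NNT 11, 95% CI 6.1 to 62).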

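We judged the evidence of efficacy for other therapies to be of low or very low quality. Limited evidence of efficacy, potentially subject to publication bias, existed for topical preparations of ibuprofen gels and creams, unspecified diclofenac formulations and diclofenac gel other than Emulgel, indomethacin, and ketoprofen plaster in acute pain conditions, and for salicylate rubefacients in chronic pain conditions. Evidence for other interventions (other topical NSAIDs, topical salicylate in acute pain conditions, low‐concentration capsaicin, lidocaine, clonidine for neuropathic pain, and herbal remedies for any condition) was of very low quality and typically limited to single studies or comparisons with sparse data.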

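We assessed the evidence on withdrawals as moderate or very low quality, because of small numbers of events. In chronic pain conditions, lack of efficacy withdrawals were lower with topical diclofenac (6%) than with topical placebo (9%) (11 studies, 3455 participants; number needed to treat to prevent (NNTp) 26; moderate‐quality evidence), and lower with topical salicylate (2%) than with topical placebo (7%) (5 studies, 501 participants; NNTp 21; very low‐quality evidence). Adverse event withdrawals were higher with topical low‐concentration capsaicin (15%) than with placebo (3%) (4 studies, 477 participants; NNH 8; very low‐quality evidence), with topical salicylate (5%) than with placebo (1%) (7 studies, 735 participants; NNH 26; very low‐quality evidence), and with topical diclofenac (5%) than with placebo (4%) (12 studies, 3552 participants; NNH 51; very low‐quality evidence).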

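In acute pain, systemic or local adverse event rates with topical NSAIDs (4.3%) were no greater than with topical placebo (4.6%) (42 studies, 6740 participants; high‐quality evidence). In chronic pain, local adverse events with topical low‐concentration capsaicin (63%) were higher than with topical placebo (5 studies, 557 participants; NNH 2.6; high‐quality evidence). Moderate‐quality evidence indicated more local adverse events than placebo in chronic pain conditions with topical diclofenac (NNH 16) and with topical high‐concentration capsaicin (NNH 16). There was moderate‐quality evidence of no additional local adverse events with topical ketoprofen over topical placebo in chronic pain. Serious adverse events were rare (very low‐quality evidence).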

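Because of small numbers of participants and events, we considered GRADE assessments of moderate or low quality in some of the reviews to be very low.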

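Authors' conclusions

There is good evidence that some formulations of topical diclofenac and ketoprofen are useful in acute pain conditions such as sprains or strains, with low (good) NNT values. There is a strong message that the exact formulation used is critically important in acute pain conditions, and this might also apply to other pain conditions. In chronic musculoskeletal conditions with assessments over 6 to 12 weeks, topical diclofenac and ketoprofen had limited efficacy in hand and knee osteoarthritis, as did topical high‐concentration capsaicin in postherpetic neuralgia. Although the NNTs were higher, this still indicates that a small proportion of people had good pain relief.

Use of GRADE in Cochrane Reviews with small numbers of participants and events requires attention.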

Plain language summary


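Do painkillers rubbed on the skin really work?

Bottom line

Diclofenac Emulgel, ketoprofen gel, piroxicam gel, and diclofenac plaster work reasonably well for strains and sprains. For hand and knee osteoarthritis, the nonsteroidal anti‐inflammatory drugs (NSAIDs) topical diclofenac and topical ketoprofen, rubbed on the skin for at least 6 to 12 weeks, help reduce pain by at least half in a modest number of people. For postherpetic neuralgia (pain following shingles), topical high‐concentration capsaicin (derived from chili peppers) can reduce pain by at least half in a small number of people.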

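Background

Painkillers rubbed on the skin are called topical painkillers (analgesics). There has been considerable debate over whether they work, how they work, and in which painful conditions they work.

Study characteristics

We searched the Cochrane Database of Systematic Reviews (the Cochrane Library) for systematic reviews of topical painkillers published up to February 2017. The reviews assessed treatment of short‐term (acute, less than three months) or long‐term (chronic, more than three months) pain conditions. We examined how well the topical painkillers worked, any harm they caused, and whether people dropped out of the studies. We also looked at the quality of the evidence.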

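Key results

Most reviews assessed the effectiveness of a topical painkiller compared with a topical placebo. A topical placebo is the same as the active material, except that it has no painkiller in it. Using a placebo cancels out the effect that rubbing might have with some topical painkillers.

For strains and sprains, several topical NSAID painkillers rubbed on the skin help reduce pain by at least half over about a week in around 1 in 2 to 1 in 5 people. These are diclofenac Emulgel, ketoprofen gel, piroxicam gel, diclofenac plaster, and other diclofenac plasters. How the drug is made up is important in determining how well it works.

For hand and knee osteoarthritis, the NSAIDs topical diclofenac and topical ketoprofen rubbed on the skin help reduce pain by at least half over at least 6 to 12 weeks in around 1 in 5 to 1 in 10 people. For postherpetic neuralgia, a single application of topical high‐concentration capsaicin can reduce pain by at least half in around 1 in 12 people for 8 to 12 weeks.

There is no good evidence to support the use of any other topical painkiller in any other painful condition.

Topical low‐concentration capsaicin causes local side effects (such as itching or rash) in 4 in 10 people, and side effects lead to withdrawal in 1 in 12 people. Side effects and withdrawals because of side effects were otherwise uncommon, or no different from those with a topical placebo. Serious side effects were uncommon.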

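Quality of the evidence

The quality of the evidence ranged from high to very low. The main reason for evidence being of very low quality was the small number of participants in some studies, which makes it impossible (or unsafe) to estimate benefit or harm.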

Authors' conclusions

Implications for practice

For people with pain

The major implication for people with pain is the knowledge that there is a body of reliable evidence about the efficacy of topical analgesics in different types of acute and chronic pain. Not every person will achieve good pain relief even with the most effective drugs, and analgesic failure is to be expected with particular drugs in particular people. Failure to achieve good pain relief should not be acceptable because it is likely that failure with any one drug could be reversed with another.

For clinicians

The major implication for clinicians is the knowledge that there is a body of reliable evidence about a number of topical analgesics in acute and chronic pain. Drug and formulation matter, so choice of therapy should usually be driven by the evidence: topical diclofenac and ketoprofen gel for strains and sprains, and to an extent in knee and hand osteoarthritis. Topical capsaicin high‐concentration may be of limited use in some people with postherpetic neuralgia.

Topical salicylate, low‐concentration capsaicin, clonidine, and lidocaine are not well supported by evidence, or show little evidence of effect. There may be circumstances when an experienced clinician may still choose to use them, because the evidence does not exclude beneficial effects in a small percentage of people.

For policy makers

The issue is not which topical analgesic product, but achieving success ‐ good pain relief is the goal of treatment. Surveys over a long period have shown that acute and chronic pain are poorly treated, and that many people experience moderate or severe pain despite being on treatment. Pain treatment is often part of a complex of interactions between the person with pain, their pain condition, and desired outcome; the overview helps by presenting evidence from which rational choices and decisions can be made.

For funders

The important message is that some topical analgesics can produce very good pain relief for some people with acute or chronic pain. The issue is not which topical analgesic product works best, but achieving success for individual people with pain. Good pain relief is the goal of treatment. Surveys over a long period have shown that acute and chronic pain are poorly treated, and that many people experience moderate or severe pain despite being on treatment. Pain treatment is often part of a complex of interactions between the person with pain, their pain condition, and desired outcome; the overview helps by presenting evidence from which rational choices and decisions can be made, especially about the place of particular products in care pathways.

Implications for research

General

The individual reviews and this overview have highlighted the lack of good evidence for many topical analgesics. Most of the studies and the participants included in them did not contribute to any reliable assessment of efficacy or harm. That is a waste, and the ethics of research of that sort is hard to justify.

Most topical analgesics are inexpensive, and have not had the detailed examination in large, properly‐conducted randomised trials that would be expected of modern medicines. And yet they offer good levels of pain relief for at least some people with acute and chronic pain, with a general absence of systemic adverse events, not only in this overview but also in studies designed to examine rare but serious harm.

Design

While there appears to be general consensus over design of studies, many of the individual studies in the individual reviews fail to meet reasonable standards. Much of that reflects the age of the studies and the standards of reporting extant at the time of publication. Other issues are more fundamental, such as the need for an adequate duration in studies investigating chronic pain; while efficacy may be established relatively early (four to six weeks), longer duration allows for assessment of tolerability.

As much as studies are needed to examine efficacy compared with placebo, or a commonly used active comparator, study designs to examine the pragmatic value of topical analgesics might be of especial value in chronic pain conditions. An example of such a trial design has been published (Moore 2010c).

Measurement (endpoints)

People with acute and chronic pain want an outcome of treatment that is equivalent to 'no worse than mild pain'. There is no reason why current pain trial methods could not report this as an outcome.

Other

Many of the improvements in understanding acute pain have been derived from individual participant‐level analyses. These can only come from close co‐operation with the pharmaceutical industry, which overwhelmingly funds the studies and 'owns' the data. Industry has a responsibility to perform more useful analyses than just those required for regulatory purposes. The main implication for research is methodological, and that is driven by data analysis at the level of the individual participant.

Background

Description of the condition

A topical analgesic medication is one applied to body surfaces such as the skin or mucous membranes to treat painful ailments; they are either rubbed onto the skin or made into patches or plasters that are stuck onto the skin. Painful conditions that might be treated by direct application of drugs include, for example, painful cutaneous ulcers; wounds of various sorts including surface wounds or wounds inside the body due to surgery or pain due to infiltration by needles; and painful eye conditions, especially perioperatively, such as after cataract surgery. Each of these situations, and others, might be described as topical application of a drug. In this overview we have restricted our scope to drugs that are applied to intact skin, which is the situation where topical analgesics are most frequently used outside special circumstances.

Pain is a very common experience. Acute pain is of short duration, lasting less than three months, and gradually resolves as the injured tissues heal. Chronic pain is usually defined as pain lasting three to six months or longer. Acute pain conditions like tension‐type headache, migraine, and acute low back pain rank amongst the top 10 most common conditions worldwide (Vos 2012). Chronic painful conditions comprise 5 of the 11 top‐ranking conditions for years lived with disability in 2010 (Vos 2012); they include low back pain (LBP), neck pain, osteoarthritis (OA), and other musculoskeletal diseases.

Pain is responsible for considerable loss of quality of life and employment, and increased health costs (Moore 2014). People with pain want it to go away, and relatively quickly (Moore 2013a); understanding this has led to a recognition that this should drive what we regard as useful outcomes in pain trials, namely a large reduction in pain, or being in a low pain state (Moore 2010a).

Topical analgesic drugs are used to treat both acute pain (strains, sprains, tendonitis, acute back pain, muscle aches) and chronic pain (osteoarthritis of hand or knee, low back pain, and specific types of neuropathic pain). Topical analgesics are recommended in guidelines for the pain of osteoarthritis (Hochberg 2012; NICE 2014) and neuropathic pain (Attal 2010; Finnerup 2015; NICE 2013).

Description of the interventions

A number of different topical analgesics have been tested in a wide range of different painful conditions. The scope of this overview covers a number of possible interventions.

For strains and sprains, topical nonsteroidal anti‐inflammatory drugs (NSAIDs) or topical rubefacients.
For osteoarthritis, topical NSAIDs, topical rubefacients, and low‐concentration topical capsaicin.
For neuropathic pain, topical local anaesthetic (lidocaine, for example) or high‐concentration topical capsaicin.
Topical herbal medicines have been used for a variety of painful conditions.
Other possible interventions include glyceryl trinitrate for some joint pains.

Many different topical formulations may be used, including but not limited to creams, foams, gels, lotions, ointments, and plasters (patches). The exact formulation of a topical medication is often determined by the required rate of drug delivery (Moore 2008a). Plasters containing drug reservoirs result in slow absorption rates, lower blood levels, and reduced first pass effect in the liver. They are frequently used for transdermal delivery of drugs that are distributed systemically (opioids or contraceptive steroids), but are also available for some NSAIDs for topical drug delivery.

How the intervention might work

Topical medications are applied externally and are absorbed through the skin. They exert their effects close to the site of application, and there should be little systemic uptake or distribution. This compares with transdermal application, where the medication is applied externally and is taken up through the skin, but relies on systemic distribution for its effect.

For a topical formulation to be effective, it must first pass through the skin. Individual drugs have different degrees of penetration, and some formulations add substances that improve skin penetration and result in higher drug concentrations in tissues. A balance between lipid and aqueous solubility is needed to optimise penetration, and use of prodrug esters has been suggested as a way of enhancing permeability. Formulation is also crucial to good skin penetration. Experiments with artificial membranes or human epidermis suggest that creams are generally less effective than gels or sprays, but newer formulations such as microemulsions may have greater potential.

Topical NSAIDs

NSAIDs reversibly inhibit the enzyme cyclooxygenase (prostaglandin endoperoxide synthase or COX), now recognised to consist of two isoforms, COX‐1 and COX‐2, mediating production of prostaglandins and thromboxane A2 (Fitzgerald 2001); inhibition of the COX‐2 isoform reduces inflammation and produces analgesic effects. Relatively little is known about the mechanism of action of this class of compounds aside from their ability to inhibit COX‐dependent prostanoid formation (Hawkey 1999). Systemically, prostaglandins mediate a variety of physiological functions such as maintenance of the gastric mucosal barrier, regulation of renal blood flow, and regulation of endothelial tone. They also play an important role in inflammatory and nociceptive (pain) processes. The rationale behind topical application is based on the ability of NSAIDs to inhibit COX enzymes locally and peripherally, with minimum systemic uptake. Their use is therefore limited to conditions where the pain is superficial and localised, such as in joints and skeletal muscle.

Once the drug has reached the site of action, it must be present at a sufficiently high concentration to inhibit COX enzymes and produce pain relief. It is probable that topical NSAIDs exert their action by local reduction of symptoms arising from periarticular and intracapsular structures. Tissue levels of NSAIDs applied topically certainly reach levels high enough to inhibit COX‐2. Plasma concentrations found after topical administration, however, are only a fraction (usually much less than 5%) of the levels found in plasma following oral administration. Topical application can potentially limit systemic adverse events by minimising systemic concentrations of the drug. We know that upper gastrointestinal bleeding is low with chronic use of topical NSAIDs (Evans 1995), but have no certain knowledge of effects on heart failure, or renal failure, both of which are associated with oral NSAID use. Current guidelines in the UK encourage use of topical NSAIDs ahead of oral NSAIDs, COX‐2 inhibitors, or opioids for hand or knee osteoarthritis (NICE 2014).

Topical rubefacients

Rubefacients (typically containing salicylates) cause irritation of the skin, and are believed to relieve pain in muscles, joints and tendons, and other musculoskeletal pains in the extremities by counter‐irritation (Martindale 2016). These agents cause a reddening of the skin by causing the blood vessels of the skin to dilate, which gives a soothing feeling of warmth. The term counter‐irritant refers to the idea that irritation of the sensory nerve endings alters or offsets pain in the underlying muscle or joints that are served by the same nerves (Morton 2002).

There has been confusion about which compounds should be classified as rubefacients. Salicylates are related pharmacologically to aspirin and NSAIDs, but when used in topical products (often as amine derivatives) their principal action is as skin irritants. By contrast, topical NSAIDs penetrate the skin and underlying tissues where they inhibit COX enzymes, as described above. We will include salicylates and nicotinate esters as rubefacients.

Topical capsaicin

Capsaicin is the active compound present in chili peppers, responsible for making them hot when eaten. It binds to nociceptors (sensory receptors responsible for sending signals that cause the perception of pain) in the skin, and specifically to the TRPV1 receptor, which controls movement of sodium and calcium ions across the cell membrane. Initially, binding opens the ion channel (influx of sodium and calcium ions), causing depolarisation and the production of action potentials, which are usually perceived as itching, pricking, or burning sensations. Repeated applications or high concentrations give rise to a long‐lasting effect, which has been termed 'defunctionalisation', probably owing to a number of different effects that together overwhelm the cell's normal functions, and can lead to reversible degeneration of nerve terminals (Anand 2011).

Topical creams with low‐concentration capsaicin designed for repeated applications are used to treat pain from a wide range of chronic conditions including postherpetic neuralgia (PHN), peripheral diabetic neuropathy (PDN), osteoarthritis and rheumatoid arthritis, in addition to pruritus and psoriasis. The creams typically contain capsaicin 0.025% or 0.075%, but in some countries 0.25% creams are available (Martindale 2016).

A high‐concentration (8%) patch has been developed to increase the amount and speed of delivery of capsaicin to the skin, and improve tolerability. Rapid delivery is thought to improve tolerability because cutaneous nociceptors are 'defunctionalised' quickly, and the single application avoids both non‐compliance and contamination of the home environment with particles of dried capsaicin cream (Anand 2011). The high‐concentration product is a single application with a minimum interval of 3 months, performed in a clinic, with cooling, local anaesthesia or short‐acting opioids to reduce local pain on application. Patients are usually monitored for up to two hours after treatment. Stringent conditions are required: treatment must be given by trained healthcare professionals, and the treatment setting needs to be well ventilated and spacious because of the capsaicin vapour. Cough due to inhalation of capsaicin particles or dust is a hazard for both healthcare professionals and patients.

High‐concentration capsaicin is licensed in the European Union (EU) to treat neuropathic pain, and in the USA to treat peripheral postherpetic neuralgia. It is available on prescription only; it was licensed in 2009 in Europe and the USA. The US Food and Drug Administration (FDA) refused a licence for neuropathic pain in HIV in 2012. The EU licence originally restricted use to non‐diabetic patients, but this restriction was lifted in 2015.

Topical lidocaine

Topical lidocaine dampens peripheral nociceptor sensitisation and central nervous system hyperexcitability if used in recommended doses. It may benefit people with PHN or traumatic nerve injury, where its lack of systemic side effects makes it an attractive option. As cream and gel, lidocaine is cumbersome to administer. The patch is more convenient, being worn for 12 hours in every 24 hours. In addition to the local anaesthetic effect, the patch provides protection against mechanical stimulation (dynamic allodynia), which is a frequent problem in PHN.

Lidocaine is an amide‐type local anaesthetic agent that acts by stabilising neuronal membranes. It impairs membrane permeability to sodium, which in turn blocks impulse propagation, and thus dampens both peripheral nociceptor sensitisation, and eventually central nervous system hyperexcitability. It also suppresses neuronal discharge in A delta and C fibres. Regenerating nerve fibres have an accumulation of sodium channels. When lidocaine binds to such sodium channels it initiates an 'inactive state' from which normal activation is unable to occur. Lidocaine reduces the frequency rather than the duration of sodium channel opening. In a small dose it inhibits ectopic discharges, although it does not disrupt normal neuronal function. Lidocaine also suppresses spontaneous impulse generation from dorsal root ganglia, where the herpes virus remains dormant after initial infection by Varicella zoster (chickenpox) (Khaliq 2007).

Other effects on keratinocytes and immune cells, or activation of irritant receptors (TRPV1 and TRPA1), may also contribute to the analgesic effect of topical lidocaine (Sawynok 2014). Long‐term use may cause a loss of epidermal nerve fibres (Wehrfritz 2011).

Why it is important to do this overview

Use of topical analgesics is increasing as patients and clinicians look for alternatives to systemic treatments and their associated adverse events. The efficacy of topical interventions is now widely recognised, together with a different pattern of troublesome adverse events from oral analgesics. Guidelines recommend topical analgesics for a range of pain conditions (Attal 2010; Finnerup 2015; Hochberg 2012; NICE 2013; NICE 2014).

There is a need to pull together the best available evidence for all interventions and conditions and assess the evidence with regard to current understanding of potential biases in order to facilitate decisions about which interventions are helpful in particular circumstances.

Objectives

To provide an overview of the analgesic efficacy and associated adverse events of topical analgesics (primarily nonsteroidal anti‐inflammatory drugs (NSAIDs), salicylate rubefacients, capsaicin, and lidocaine) applied to intact skin for the treatment of acute and chronic pain in adults.

Methods

Criteria for considering reviews for inclusion

We considered for inclusion Cochrane Reviews assessing randomised controlled trials (RCTs) of topical analgesics for pain relief in adults in distinct clinical conditions.

Acute musculoskeletal conditions (sprains, strains, muscle pain)
Osteoarthritis, rheumatoid arthritis, or other chronic musculoskeletal conditions
Neuropathic pain

Search methods for identification of reviews

We searched the Cochrane Database of Systematic Reviews (the Cochrane Library) for relevant reviews; Appendix 1 shows the search strategy. In addition, we examined reviews produced by Cochrane Musculoskeletal; Cochrane Pain, Palliative, and Supportive Care; and Cochrane Skin for suitable reviews, and performed broader searches using the terms 'topical' and 'pain' in title, abstract, and keywords.

Data collection and analysis

Two review authors (RAM, SD) independently carried out searches, selected reviews for inclusion, and carried out assessment of methodological quality, and data extraction. Any disagreements were resolved by discussion involving a third author.

Selection of reviews

Included reviews assessed RCTs of the effects of topical application of analgesics for pain relief in adults (as defined by individual reviews), compared with placebo or active comparator if available, and included:

a clearly defined clinical question;
details of inclusion and exclusion criteria;
details of databases searched and relevant search strategies;
participant‐reported pain relief;
summary results for at least one desired outcome.

Data extraction and management

We took data from the included reviews, planning to refer to original study reports only if specific data were missing.

We collected information on the following.

Number of included studies and participants
Intervention and dose
Comparator
Condition treated: acute pain (strains and sprains, overuse injuries), chronic pain (arthritis, neuropathic pain)
Time of assessment

We extracted risk difference (RD) or risk ratio (RR) and number needed to treat for one additional beneficial or harmful outcome (NNT or NNH) for the following outcomes.

At least 50% pain relief (participant‐reported)
Any other measure of 'improvement' (participant‐reported)
Adverse events: local and systemic, and particularly serious adverse events
Withdrawals (particularly withdrawals caused by lack of efficacy or because of adverse events)

Assessment of methodological quality of included reviews

We assessed the methodological quality of included reviews using the following criteria (adapted from AMSTAR; Shea 2007).

Was an a priori design provided?
Was there duplicate study selection and data extraction?
Was a comprehensive literature search performed?
Were published and unpublished studies included irrespective of language of publication?
Was a list of studies (included and excluded) provided?
Were the characteristics of the included studies provided?
Was the scientific quality of the included studies assessed and documented?
Was the scientific quality of the included studies used appropriately in formulating conclusions?
Were the methods used to combine the findings of studies appropriate?
Was the conflict of interest stated?

For each review we assessed the likelihood of publication bias by calculating the number of participants in studies with zero effect (RR of one) that would be needed to give an NNT too high to be clinically relevant (Moore 2008b). In this case we considered an NNT of 10 or more for the outcome 'at least 50% maximum pain relief' or 'substantial benefit' at a specified assessment time to be the cut‐off for clinical relevance. We used this method because statistical tests for presence of publication bias have been shown to be unhelpful (Thornton 2000).
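
As an illustration of the calculation described above, the following minimal Python sketch estimates how many participants in zero‐effect studies would be needed to raise a pooled NNT to the cut‐off of 10. It assumes that null data dilute the pooled risk difference in simple proportion to the number of participants added, with equally sized treatment and control groups; this is a simplification and not necessarily the exact procedure of Moore 2008b. The function name and its parameters are hypothetical.

```python
def null_participants_needed(participants: int, nnt: float,
                             threshold_nnt: float = 10.0) -> float:
    """Participants in zero-effect studies needed to raise the pooled NNT
    to threshold_nnt.

    Assumption (illustrative only): adding m null participants to n observed
    participants scales the risk difference by n / (n + m), so the NNT is
    multiplied by (n + m) / n. Solving NNT * (n + m) / n = threshold for m
    gives m = n * (threshold / NNT - 1).
    """
    if participants <= 0 or nnt <= 0:
        raise ValueError("participants and nnt must be positive")
    if nnt >= threshold_nnt:
        # The result is already at or beyond the clinical-relevance cut-off.
        return 0.0
    return participants * (threshold_nnt / nnt - 1.0)


# Ketoprofen gel in strains and sprains, figures taken from the abstract:
# 348 participants, NNT 2.5.
print(null_participants_needed(348, 2.5))  # 1044.0
```

With the ketoprofen gel figures reported in the abstract (348 participants, NNT 2.5), the sketch gives 1044 participants, well above the 400 participants used in this overview as the threshold for susceptibility to publication bias.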

Data synthesis

We did not plan additional quantitative analyses, since only results from properly conducted Cochrane Reviews were considered. The aim was to concentrate on specific outcomes such as the proportion of participants with at least 50% pain relief, all‐cause or adverse event discontinuations, and serious adverse events, and to explore how these can be compared across different treatments for the same condition. Care was taken to ensure that we compared like with like, for example in duration of treatment, which can be an additional source of bias (Moore 2010b). Importantly, issues of low trial quality, inadequate size, and whether trials were truly valid for the particular condition, were highlighted in making any between‐therapy comparisons.

We considered study size and the overall amount of information available for analysis. There are issues over both random chance effects with small amounts of data, and potential bias in small studies, especially in pain (Dechartres 2013; Dechartres 2014; Moore 1998; Nüesch 2010; Thorlund 2011).

We did not use information from pooled analyses unless they included data from at least 200 participants for the outcome (Moore 1998). Where appropriate we used or calculated risk ratio (RR) or risk difference (RD) with 95% confidence intervals (CI) using a fixed‐effect model (Morris 1995). We used or calculated NNT and NNH with 95% CIs using the pooled number of events, using the method devised by Cook and Sackett (Cook 1995). We assumed a statistically significant difference from control when the 95% CI of the RR did not include the number one or the RD the number zero.
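
The effect measures named in this paragraph can be sketched in a few lines of Python. The example below works on pooled event counts for a single comparison and is only a sketch under stated assumptions: it uses a Wald interval for the risk difference, a log‐scale interval for the risk ratio, and takes the NNT interval as the reciprocals of the risk difference limits. It does not implement fixed‐effect weighting across studies, and the function names and the event counts in the example are hypothetical rather than taken from any included review.

```python
import math
from typing import Optional, Tuple

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def risk_difference(events_t: int, n_t: int,
                    events_c: int, n_c: int) -> Tuple[float, float, float]:
    """Risk difference (treatment minus control) with a Wald 95% CI."""
    p_t, p_c = events_t / n_t, events_c / n_c
    rd = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return rd, rd - Z_95 * se, rd + Z_95 * se


def risk_ratio(events_t: int, n_t: int,
               events_c: int, n_c: int) -> Tuple[float, float, float]:
    """Risk ratio with a 95% CI computed on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


def nnt(events_t: int, n_t: int, events_c: int,
        n_c: int) -> Tuple[float, Optional[float], Optional[float]]:
    """NNT (or NNH) as 1 / |RD|, with CI limits from the reciprocals of the
    RD limits. No interval is returned when the RD interval includes zero,
    because the difference is then not statistically significant."""
    rd, rd_lo, rd_hi = risk_difference(events_t, n_t, events_c, n_c)
    if rd_lo <= 0.0 <= rd_hi:
        return (math.inf if rd == 0 else 1 / abs(rd)), None, None
    limits = sorted((1 / abs(rd_lo), 1 / abs(rd_hi)))
    return 1 / abs(rd), limits[0], limits[1]


# Hypothetical pooled counts: 50/100 responders with active treatment,
# 30/100 with placebo.
print(risk_ratio(50, 100, 30, 100))
print(nnt(50, 100, 30, 100))
```

In this hypothetical comparison the lower limit of the risk ratio interval lies above one, so the difference from control would be treated as statistically significant under the rule stated above.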

Quality of the evidence

We used the GRADE (Grades of Recommendation, Assessment, Development and Evaluation) system to assess the quality of the evidence related to the key outcomes listed in 'Types of outcome measures', as appropriate (Appendix 2). Two review authors independently rated the quality of each outcome, without reference to any GRADE evaluation in the original reports.

We paid particular attention to inconsistency, where point estimates vary widely across studies, or confidence intervals (CIs) of studies show minimal or no overlap (Guyatt 2011), and potential for publication bias, based on the amount of unpublished data required to make the result clinically irrelevant (Moore 2008b). Small studies have been shown to overestimate treatment effects, probably because the conduct of small studies is more likely to be less rigorous, allowing critical criteria to be compromised (Dechartres 2013; Nüesch 2010), and large studies often have smaller treatment effects (Dechartres 2014). Cochrane Reviews have been criticised for perhaps over‐emphasising results of underpowered studies or analyses (AlBalawi 2013; Turner 2013), and simulation studies demonstrate that small studies have low power to estimate treatment effect with accuracy (Moore 1998; Thorlund 2011).

In addition, there may be circumstances where the overall rating for a particular outcome needs to be adjusted as recommended by GRADE guidelines (Guyatt 2013a). For example, if there are so few data that the results are highly susceptible to the random play of chance, or if studies use last observation carried forward (LOCF) imputation in circumstances where there are substantial differences in adverse event withdrawals (Moore 2012), one would have no confidence in the result, and would need to downgrade the quality of the evidence by three levels, to very low quality. In circumstances where there were no data reported for an outcome, we would report the level of evidence as very low quality (Guyatt 2013b).

We report both the GRADE assessment made by the authors of individual Cochrane reviews, and a GRADE assessment made by us based on current knowledge. We used the following descriptors for levels of evidence (EPOC 2015); "substantially different" in this context implies a large enough difference that it might affect a decision.

High: this research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different is low.
Moderate: this research provides a good indication of the likely effect. The likelihood that the effect will be substantially different is moderate.
Low: this research provides some indication of the likely effect. However, the likelihood that it will be substantially different is high.
Very low: this research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different is very high.

We used the amount and quality of evidence to report results in a hierarchical way (Moore 2015). We split the available information into five groups, essentially according to the GRADE descriptors.

Drugs and doses for which Cochrane Reviews found no information (very low‐quality evidence).
Drugs and doses for which Cochrane Reviews found inadequate information: fewer than 200 participants in comparisons, in at least two studies (very low‐quality evidence).
Drugs and doses for which Cochrane Reviews found evidence of effect, but where results were potentially subject to publication bias. We considered the number of additional participants needed in studies with zero effect (relative benefit of one) required to change the NNT for at least 50% maximum pain relief to an unacceptably high level (in this case the arbitrary NNT of 10) (Moore 2008b). Where this number was less than 400 (equivalent to four studies with 100 participants per comparison, or 50 participants per group), we considered the results to be susceptible to publication bias and therefore unreliable (low‐quality evidence).
Drugs and doses for which Cochrane Reviews found no evidence of effect or evidence of no effect: more than 200 participants in comparisons, but where there was no statistically significant difference from placebo (moderate‐ or high‐quality evidence).
Drugs and doses for which Cochrane Reviews found evidence of effect, where results were reliable and not subject to potential publication bias (high‐quality evidence).
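
The publication bias check described in the list above can be illustrated with a minimal sketch. It assumes that adding participants from zero‐effect studies with equal‐sized arms dilutes the pooled risk difference in proportion to the total number of participants, so that the NNT scales by (n + m) / n. The function names, the rounding, and the simple dilution model are assumptions made for illustration; they are not taken from Moore 2008b, and values computed from rounded NNTs may differ slightly from those computed from unrounded data.

```python
def null_participants_needed(n_participants: int, nnt: float,
                             threshold: float = 10.0) -> int:
    """Estimate how many participants in zero-effect studies would be
    needed to raise a pooled NNT to `threshold`.

    Assumed model (illustrative only): null studies dilute the pooled
    risk difference in proportion to total participants, so
    new_nnt = nnt * (n + m) / n. Solving for m at new_nnt == threshold
    gives m = n * (threshold / nnt - 1).
    """
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    if nnt >= threshold:
        # The observed NNT is already at or beyond the threshold.
        return 0
    return round(n_participants * (threshold / nnt - 1))


def is_susceptible(null_needed: int, cutoff: int = 400) -> bool:
    """Apply the overview's cutoff: fewer than 400 null participants
    needed means the result is treated as susceptible to publication bias."""
    return null_needed < cutoff


if __name__ == "__main__":
    # 241 participants with an NNT of 3.9
    needed = null_participants_needed(241, 3.9)
    print(needed, is_susceptible(needed))
    # 732 participants with an NNT of 5.0
    needed = null_participants_needed(732, 5.0)
    print(needed, is_susceptible(needed))
```

Under this model, a small data set with a strong effect and a large data set with a weak effect can both fall below the cutoff, which is why the overview considers the number of participants and the effect size together.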

'Summary of findings' table

We did not plan to include a 'Summary of findings' table of the kind set out in the author guide (PaPaS 2012) and recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Chapter 4.6.6, Higgins 2011). Such tables are difficult to use when there are many different conditions, interventions, and outcomes, because they quickly become unwieldy.

We planned to summarise information in the text, as appropriate. In‐text tables were organised in the format of condition, intervention, outcome, numbers of studies and participants, RR or RD.

Key information included the quality of evidence and the magnitude of effect of the interventions examined, as appropriate for the condition studied. This included, for example in chronic pain conditions, the sum of available data on the outcomes of 'substantial benefit' (Patient Global Impression of Change (PGIC) very much improved from weeks 2 to 8 and weeks 2 to 12), 'moderate benefit' (PGIC much or very much improved from weeks 2 to 8 and weeks 2 to 12), withdrawals due to adverse events, withdrawals due to lack of efficacy, serious adverse events, and death (a particular serious adverse event).

Results

We included 13 reviews (Cameron 2011; Cameron 2013; Cui 2010; Cumpston 2009; Derry 2012; Derry 2014a; Derry 2014b; Derry 2015; Derry 2016; Derry 2017; Otlean 2014; Pattanittum 2013; Wrzosek 2015). We excluded one review because it was a protocol without data (Johnston 2007).

Description of included reviews

The 13 included reviews covered a range of treatments for acute and chronic pain conditions. The reviews involved 206 studies with around 30,700 participants.

Acute pain was addressed in four reviews. These were:

topical salicylate‐containing rubefacients for acute injuries (mainly strains, sprains, and acute low back pain) (Derry 2014b);
topical NSAIDs for strains and sprains (Derry 2015);
glyceryl trinitrate for rotator cuff injury among people with acute symptoms (Cumpston 2009);
topical NSAIDs for lateral elbow pain (Pattanittum 2013).

Chronic pain was examined in 12 reviews. These were:

topical glyceryl trinitrate for chronic rotator cuff pain (Cumpston 2009);
topical NSAIDs for chronic lateral elbow pain (Pattanittum 2013);
topical herbal remedies for rheumatoid arthritis (Cameron 2011), osteoarthritis (Cameron 2013), neck pain due to cervical degenerative disease (Cui 2010), and low back pain (Otlean 2014);
topical NSAIDs for chronic musculoskeletal conditions (Derry 2016);
topical salicylate‐containing rubefacients for chronic musculoskeletal conditions (Derry 2014b);
topical capsaicin high‐ and low‐concentration for neuropathic pain (Derry 2012; Derry 2017);
topical lidocaine for neuropathic pain (Derry 2014a);
topical clonidine for neuropathic pain (Wrzosek 2015).

Summary table A has details of the number of included studies and participants, the intervention, comparators, and the condition treated, whether acute or chronic, and the time of assessment.

Summary table A: Details of included reviews

Review	Condition treated	Studies	Participants	Topical intervention	Comparators	Assessment time (weeks)
Acute pain conditions
Derry 2014b	Strains, sprains, low back pain	6	697	Salicylate rubefacients	Placebo and active	1
Derry 2015	Strains and sprains	61	9001	NSAID	Placebo and active	1
Acute and chronic pain conditions
Cumpston 2009	Rotator cuff disease or chronic tendinopathy	3	121	Glyceryl trinitrate	Placebo	1 ‐ 24
Pattanittum 2013	Lateral elbow pain	15	759	NSAID	Placebo	2 ‐ 4
Chronic pain conditions
Cameron 2011	Rheumatoid arthritis	22	1243	Herbal remedies	Placebo and active	Up to 26
Cameron 2013	Osteoarthritis	7	785	Herbal remedies	Placebo and active	Up to 4
Cui 2010	Neck pain due to degenerative disease	4	1100	Herbal remedies	Placebo and active	Up to 4
Derry 2012	Neuropathic pain	6	389	Topical capsaicin low‐concentration	Placebo	6‐8
Derry 2014a	Neuropathic pain	12	508	Lidocaine	Placebo	From a single dose up to 12 weeks
Derry 2014b	Chronic musculoskeletal conditions	10	671	Salicylate rubefacients	Placebo and active	2
Derry 2016	Chronic musculoskeletal conditions	39	10631	NSAID	Placebo and active	2 ‐ 12
Derry 2017	Neuropathic pain	8	2488	Topical capsaicin high‐concentration	Placebo	8 ‐ 12
Otlean 2014	Low back pain	14	2050	Herbal remedies	Placebo and active	Typically 3
Wrzosek 2015	Neuropathic pain	2	344	Clonidine	Placebo	8 ‐ 12

Methodological quality of included reviews

All the reviews met all AMSTAR criteria (Shea 2007). They:

had a priori design;
performed duplicate study selection and data extraction;
had a comprehensive literature search;
included published and any unpublished studies irrespective of language of publication, although not all reviews contacted companies or researchers for unpublished trial data;
provided a list of included and excluded studies;
provided characteristics of included studies;
assessed and documented the scientific quality of the included studies;
used the scientific quality of the included studies appropriately in formulating conclusions, because only studies with minimal risk of bias were included (a particular issue was trial size, but conclusions were not drawn from inadequate data sets, based on previously established criteria (Moore 1998));
used appropriate methods to combine findings of studies and, importantly, provided analyses according to drug dose; and
had appropriate conflict of interest statements.

All reviews except two reported a GRADE assessment, but not necessarily for all comparisons or outcomes (Cui 2010; Derry 2012).

Effect of interventions

We have reported first the preferred efficacy outcome of at least 50% pain relief compared with placebo, followed by other efficacy outcomes, and comparisons with other active treatments.

At least 50% pain relief

1 Interventions for which Cochrane Reviews found no information

None of the reviews reported finding no information.

2 Interventions for which Cochrane Reviews found inadequate information

A number of reviews reported that there were interventions where the amount of information was small, with fewer than 200 participants in at least two studies. These included:

Acute pain

Topical benzydamine for strains and sprains (Derry 2015). There were 193 participants in three studies. Pooled analysis demonstrated no difference between topical benzydamine and topical placebo. The review authors made no GRADE assessment for this comparison in the review.

Chronic pain

Glyceryl trinitrate for rotator cuff disease (Cumpston 2009). There were fewer than 200 participants in total, with no pooled analysis of the three included studies. The GRADE assessment made by the review authors for quality of evidence in a single study with 20 participants was low.
Evening primrose oil, borage seed oil, blackcurrant seed oil (with gamma‐linolenic acid) versus placebo in rheumatoid arthritis (Cameron 2011). Pooled analysis of three studies for pain intensity involved only 82 participants. The GRADE assessment made by the review authors for quality of evidence for this was moderate. No other herbal remedies had adequate information on efficacy.
No herbal remedy had adequate information on efficacy in osteoarthritis in 200 participants (Cameron 2013). The GRADE assessment made by the review authors for quality of evidence for this was moderate.
No herbal remedy had adequate information on efficacy in randomised double‐blind studies in neck pain in 200 participants (Cui 2010). No GRADE assessment was made.
No herbal remedy had adequate information on efficacy in low back pain in 200 participants (Otlean 2014). The GRADE assessment made by the review authors for quality of evidence for this was very low.
Topical lidocaine for neuropathic pain (Derry 2014a). Few analyses were possible due to poor reporting. Only a single study with 58 participants provided relevant data. The GRADE assessment made by the review authors for quality of evidence for this was very low.
Topical capsaicin low‐concentration (0.075%) for neuropathic pain (Derry 2012). There were 124 participants in two studies. Pooled analysis demonstrated no difference between topical capsaicin low‐concentration and topical placebo. No specific GRADE assessment was made.
Topical clonidine for neuropathic pain (Wrzosek 2015). Only a single study with 179 participants provided relevant data. The GRADE assessment made by the review authors for quality of evidence for this was low.

We assessed the evidence quality for all these interventions as very low. This means that this research does not provide a reliable indication of the likely effect and that the likelihood that the effect will be substantially different is very high. In doing this, we agreed with GRADE assessments made by the review authors in two reviews (Derry 2014a; Otlean 2014), but we disagreed with all others, where GRADE assessments were either low or moderate. Moderate‐quality evidence, for example, implies that the research provides a good indication of the likely effect. With fewer than 200 participants in trials with methodological problems that come with a high risk of bias, that seems improbable.

3 Interventions for which Cochrane Reviews found no evidence of effect or evidence of no effect

No reviews demonstrated null effect for a topical intervention compared with topical placebo.

We agreed with the original authors that the evidence for topical salicylate rubefacients in acute pain conditions was very low quality (Derry 2014b). This was despite an apparently significant effect with RR of 1.9 (95% CI 1.5 to 2.5) and NNT of 3.2 (2.4 to 4.9) calculated from the pooled analysis and with an apparently low susceptibility to publication bias calculated from the aggregated data of 689 participants. The reason was that the most recent high‐quality study in that review showed no difference between topical salicylate and topical placebo. There were quality and potential bias issues other than simply publication bias, and we and the original authors judged that the available data did not amount to good evidence of effect, or of no effect.

4 Interventions for which Cochrane Reviews found evidence of effect, but where results were potentially subject to publication bias

Some reviews had some evidence of effect, but in these either the number of participants was low, or the size of the effect was low, or both. In that circumstance the number of participants in unpublished, null‐effect studies required to render the result susceptible to publication bias can be small. Where this number was less than 400 (equivalent to four studies with 100 participants per comparison, or 50 participants per group), we considered the results to be susceptible to publication bias and therefore unreliable and indicative of low‐quality evidence. The appropriateness or otherwise of this categorisation is discussed below, but these results are the least reliable of those available from the reviews.

Summary table B shows the topical treatments where our judgement was of high susceptibility to publication bias (low‐quality evidence), and ordered according to susceptibility to publication bias. Comparisons were all with placebo. Acute conditions treated were strains and sprains, acute low back pain (Derry 2014b; Derry 2015), or acute lateral elbow pain (Pattanittum 2013). Chronic conditions were osteoarthritis or low back pain (Derry 2014b), or hand and knee osteoarthritis (Derry 2016). The number in the susceptibility to publication bias column refers to the number of participants in studies with null effect needed to produce an NNT worse than 10. For topical capsaicin (high‐concentration) the measured effect was above 10 (Derry 2017), and so we judged it to be subject to possible publication bias due to the wide confidence interval of the NNT.

Also included in Summary table B is salicylate rubefacients for acute pain. This was despite there needing to be 689 participants in unpublished, null‐effect trials to increase the NNT to our threshold of 10. The review authors graded this as very low‐quality evidence, and pointed out that the most recent study with over half of all participants had a null effect (Derry 2014b). All the studies had important methodological uncertainties, eliminating any trust in the results.

We assessed the evidence quality for all these interventions as very low or, in one case moderate, based on the underlying quality of studies in the reviews and susceptibility to publication bias.

For strains and sprains in acute pain, we judged the evidence to be very low quality for ketoprofen plaster, diclofenac gel other than Emulgel, indomethacin, and ibuprofen cream, as we did for diclofenac (unspecified formulation) in lateral elbow pain. This means that this research does not provide a reliable indication of the likely effect, and that the likelihood that the effect will be substantially different is very high. We disagreed with the moderate‐quality GRADE assessment given for ibuprofen gel, which was based on a high RR, a low NNT, and a susceptibility to publication bias on the borderline of 400 participants. Moderate‐quality research provides a good indication of the likely effect, and with only two studies with 241 participants a more conservative view is that the quality of the evidence is low; the research provides some indication of the likely effect, but the likelihood that it will be substantially different is high.

In chronic pain we agreed with the GRADE assessments in the original reviews. Salicylate rubefacient studies have a number of problems over and above susceptibility to publication bias; importantly for a chronic condition, the time of assessment was only at two weeks (Derry 2014b). A very low‐quality GRADE assessment is therefore reasonable. For various formulations of diclofenac in chronic pain, there was very considerable evidence from 2343 participants of a low effect, and NNT of close to 10. A judgement of moderate‐quality research is appropriate because the quantity of data provides a good indication of the likely effect. We therefore agreed with the GRADE assessments in the original reviews (Derry 2014b; Derry 2016).

Summary table B: Results potentially subject to publication bias

			Percent with outcome
Reference	Topical treatment	Studies/participants	Active	Placebo	RR 95% CI	NNT 95% CI	Susceptibility to publication bias	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Ibuprofen ‐ gel	2/241	42	16	2.7 (1.7 to 4.2)	3.9 (2.7 to 6.7)	377	Moderate quality
Derry 2015	Ibuprofen ‐ cream	3/195	71	56	1.3 (1.03 to 1.6)	6.4 (3.4 to 41)	110	No specific GRADE given
Pattanittum 2013	Diclofenac (unspecified formulation)	3/153	Continuous data used		Not reported	7 (3 to 21)	66	Very low quality
Derry 2015	Indomethacin	3/341	58	46	1.3 (1.03 to 1.6)	8.3 (4.4 to 65)	73	No specific GRADE given
Derry 2015	Diclofenac ‐ gel other than Emulgel	1/232	94	82	1.2 (1.1 to 1.3)	8.0 (4.8 to 24)	58	No specific GRADE given
Derry 2015	Ketoprofen ‐ plaster	2/335	73	60	1.2 (1.04 to 1.4)	8.2 (4.5 to 47)	29	No specific GRADE given
Chronic pain conditions
Derry 2014b	Salicylate rubefacient	6/455	45	28	1.6 (1.2 to 2.0)	6.2 (4.0 to 13)	279	Very low quality

Footnote: CI: confidence interval; NNT: number needed to treat for an additional beneficial outcome; RR: risk ratio

5 Interventions for which Cochrane Reviews found evidence of effect, where results were reliable and not subject to potential publication bias

Reliable results are presented in Summary table C, ordered according to susceptibility to publication bias. Comparisons were all with placebo.

For topical NSAIDs, acute conditions treated were strains and sprains treated for one week (Derry 2015), while chronic conditions were knee and hand osteoarthritis treated for two to less than six weeks (diclofenac), or treated for 6 to 12 weeks (ketoprofen), or chronic musculoskeletal conditions (Derry 2014b; Derry 2016).

In acute pain conditions NNTs for topical NSAIDs varied from 1.8 for diclofenac Emulgel to 4.7 for diclofenac formulated as Flector plaster. The number of participants was modest, ranging from 314 in studies of diclofenac Emulgel to 1030 for studies of diclofenac formulated as Flector plaster. Results involving low (good) NNTs were particularly robust, with over 1000 participants in studies with null effect needed to overturn all results except that for piroxicam gel, and over 5000 needed for diclofenac formulated as Flector plaster.

In chronic pain conditions, NNTs for topical NSAIDs were higher, between 5 and 10, depending on NSAID and study duration. The numbers of participants were high for ketoprofen at 2573 and diclofenac in longer‐term studies (2343), but modest for diclofenac in studies of shorter duration (732). Over 1000 participants in unpublished, null‐effect studies would be required to overturn the result for ketoprofen, and only 48 for diclofenac in studies between six and 12 weeks. The results for topical diclofenac are included in this section because the numbers in studies were large, the effect size was small, and because the review identified around 6000 participants in unpublished studies of topical NSAIDs (Derry 2016). The existence of a large body of evidence with a large effect size in this quantity of unpublished material is unlikely, and the susceptibility of the result to publication bias is small.

For topical NSAIDs in acute and chronic pain we assessed the evidence quality for these interventions as moderate or high quality based on the underlying quality of studies in the reviews and susceptibility to publication bias. The high quality assessment was based on a large effect size (NNT of below 2), consistent results in moderately‐sized recent studies of high quality, and a low susceptibility to publication bias. High quality indicates that the research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different is low. We also considered the evidence for diclofenac as Flector plaster to be high quality because of the low susceptibility to publication bias (Derry 2015).

Most other GRADE assessments in the original reviews were moderate. While the research provided a good indication of the likely effect, effect sizes tended to be smaller with somewhat greater susceptibility to publication bias (Derry 2015; Derry 2016). We also considered the other diclofenac plasters and piroxicam gel to have moderate‐quality evidence, although it was not specifically graded in the original review.

One important comparison, between topical and oral NSAIDs (Derry 2016), was difficult to assess in terms of GRADE. Of the five studies (1735 participants) only two compared the same NSAID (diclofenac) in both topical and oral formulations; the three others compared a topical NSAID against an oral formulation of a different NSAID. The results were consistent, finding no greater or lesser benefit with topical NSAID (55%) than oral NSAID (54%). The RR was 1.0 (95% CI 0.95 to 1.1). As there was no difference, susceptibility to publication bias could not be calculated. No specific GRADE assessment was given in the review.

Topical high‐concentration capsaicin in neuropathic pain conditions had an NNT for postherpetic neuralgia of 11, and so susceptibility to publication bias could not be calculated using our prespecified method. The original review reported results according to type of neuropathic pain condition, and found similar estimates of beneficial effect in HIV neuropathy, but not peripheral diabetic neuropathy, and with similar results across different efficacy outcomes. Unpublished large studies with substantially higher efficacy are unlikely to exist, and so the potential for publication bias is small. We agreed with the GRADE assessment of moderate‐quality evidence.

Summary table C: Results not subject to publication bias

			Percent with outcome
Reference	Topical treatment	Studies/participants	Active	Placebo	RR 95% CI	NNT 95% CI	Susceptibility to publication bias	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Diclofenac ‐ Flector plaster	4/1030	63	41	1.5 (1.4 to 1.7)	4.7 (3.7 to 6.5)	5029	No specific GRADE given
Derry 2015	Diclofenac ‐ Emulgel	2/314	78	20	3.8 (2.7 to 5.5)	1.8 (1.5 to 2.1)	1430	High quality
Derry 2015	Ketoprofen ‐ gel	5/348	72	33	2.2 (1.7 to 2.8)	2.5 (2.0 to 3.4)	1044	Moderate quality
Derry 2015	Diclofenac ‐ other plaster	3/474	88	57	1.6 (1.4 to 1.8)	3.2 (2.6 to 4.2)	1007	No specific GRADE given
Derry 2015	Piroxicam gel	3/522	70	47	1.5 (1.3 to 1.7)	4.4 (3.2 to 6.9)	664	No specific GRADE given
Chronic pain conditions
Derry 2016	Ketoprofen gel	4/2573	63	48	1.1 (1.01 to 1.2)	6.9 (5.4 to 9.3)	1156	Moderate quality
Derry 2016	Diclofenac (< 6 weeks' duration)	5/732	43	23	1.9 (1.5 to 2.3)	5.0 (3.7 to 7.4)	732	Moderate quality
Derry 2016	Diclofenac ‐ various formulations (> 6 weeks' duration)	4/2343	60	50	1.2 (1.1 to 1.3)	9.8 (7.1 to 16)	48	Moderate quality
Derry 2017	Capsaicin (high‐concentration)	2/571	33	24	1.3 (1.0 to 1.7)	11 (6.1 to 62)	Result above threshold of 10	Moderate quality

Footnote: CI: confidence interval; NNT: number needed to treat for an additional beneficial outcome; RR: risk ratio

Other pain outcomes

Topical capsaicin 0.075% for neuropathic pain had some information on pain efficacy in 258 participants, of whom only 124 had the important outcome of at least 50% pain intensity reduction (Derry 2012). Null data from only 102 participants would be required to produce a NNT of 10, which might be considered a threshold for useful efficacy in neuropathic pain. No GRADE assessment was made, but the authors reported that no effect size for efficacy could be safely calculated based on the available data (Moore 2013b).

Comparisons other than placebo

Only one review had sufficient information for an analysis of a topical versus an oral analgesic. Derry 2016 analysed five osteoarthritis studies in which topical piroxicam, ketoprofen, or diclofenac was compared with oral ibuprofen (900 mg and 1200 mg daily), celecoxib (200 mg daily), or diclofenac (100 mg or 150 mg daily) over 3 to 12 weeks. In 1735 participants there was no difference in the proportion achieving treatment success equivalent to at least 50% pain intensity reduction with topical NSAID (55%) and oral NSAID (54%). The RR was 1.0 (95% CI 0.95 to 1.1). The GRADE assessment for quality of evidence for this was moderate.

Withdrawals

Summary table D presents the available results for withdrawals, either as all‐cause for acute pain conditions, or due to lack of efficacy and adverse events for chronic pain conditions. In each case the comparison is with placebo. Not all reviews were able to report information on withdrawals due to incomplete or inconsistent reporting in the original studies, or because there was so little information available.

The amount of available information varied considerably, from as few as 179 participants with topical clonidine (where results are presented for completeness), to 5790 for all NSAIDs in acute pain such as strains and sprains. In almost all cases the number of events was limited, as the rate of events was typically below 10%. Even when there was an apparently large effect, as with adverse event withdrawals with low‐concentration capsaicin, the number of events was small, which meant that the GRADE estimation was very low quality. By contrast, larger numbers of participants did allow a greater confidence in both a low rate of withdrawals and the lack of difference between active and placebo topical treatments. We agreed with all the GRADE assessments concerning withdrawal.

Summary table D: Summary of available results on withdrawals, and cause

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNTp or NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Derry 2015 (Adverse event withdrawals)	Strains and sprains	All NSAIDs	42/5790	1	1	1.0 (0.7 to 1.7)	n/c	High quality
Pattanittum 2013 (All‐cause withdrawals)	Lateral elbow pain	Diclofenac	4/485	0	0	n/c	n/c	Low
Chronic pain conditions (lack of efficacy withdrawals)							NNTp
Derry 2014a	Neuropathic pain	Lidocaine	4/293	7	12	0.6 (0.3 to 1.1)	n/c	No specific GRADE given
Derry 2014b	Acute and chronic	Salicylate	5/501	2	7	0.4 (0.2 to 0.9)	21 (12 to 120)	Very low quality
Derry 2016	Knee and hand OA	Diclofenac	11/3455	6	9	0.6 (0.5 to 0.8)	26 (18 to 47)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2885	6	9	1.1 (0.8 to 1.6)	n/c	Moderate quality
Derry 2017	Neuropathic pain	Capsaicin (high‐concentration)	6/2073	2	3	0.6 (0.3 to 1.04)	n/c	Moderate quality
Wrzosek 2015	Neuropathic pain	Clonidine	1/179	1	1	1.0 (0.06 to 16)	n/c	Very low quality
Chronic pain conditions (adverse event withdrawals)							NNH
Derry 2012	Neuropathic pain	Capsaicin (low‐concentration)	4/477	15	3	5.0 (2.3 to 11)	8.1 (5.7 to 14)	No specific GRADE given
Derry 2014a	Neuropathic pain	Lidocaine	5/349	2	1	1.2 (0.3 to 4.6)	n/c	No specific GRADE given
Derry 2014b	Acute and chronic	Salicylate	7/737	5	1	4.2 (1.5 to 12)	26 (15 to 85)	Very low quality
Derry 2016	Knee and hand OA	Diclofenac	12/3552	5	4	1.6 (1.1 to 2.1)	51 (30 to 170)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2621	6	5	1.3 (0.9 to 1.8)	n/c	Moderate quality
Derry 2017	Neuropathic pain	Capsaicin (high‐concentration)	8/2487	1	1	0.8 (0.4 to 1.8)	n/c	Moderate quality
Wrzosek 2015	Neuropathic pain	Clonidine	1/179	1	3	0.3 (0.03 to 3.2)	n/c	Very low quality

Footnote: CI: confidence interval; n/c = not calculated; NNTp = number needed to treat to prevent; NNH = number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drugs; OA: osteoarthritis; RR: risk ratio.

Adverse events

Summary table E presents the available results for participants experiencing at least one adverse event over the duration of the studies. In each case the comparison is with placebo. Not all reviews were able to report information on adverse events due to incomplete or inconsistent reporting in the original studies, or because there was so little information available. Topical capsaicin (high‐concentration) presented particular problems for reporting adverse events because the application was usually undertaken after local anaesthesia (Derry 2017). While adverse events were generally well reported, this was not done in a way that is comparable with other studies.

With the exception of topical salicylate rubefacients in all available acute and chronic pain studies combined, there were no differences between active topical analgesic and topical placebo. For topical salicylates, more participants with topical salicylate reported an adverse event, with a NNH of 17 (9.9 to 58).
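
As a rough check on how a figure such as the NNH above relates to the event rates, the following minimal sketch takes the reciprocal of the absolute difference in event rates. The function name is hypothetical, and the use of rounded percentages as inputs is an assumption made for illustration; the reviews calculated NNH and NNT values from pooled event counts, so results obtained from rounded percentages can differ slightly from the published values.

```python
def number_needed(active_pct: float, placebo_pct: float) -> float:
    """Reciprocal of the absolute risk difference.

    Inputs are percentages of participants with the outcome in the
    active and placebo groups. For a harmful outcome this is an NNH;
    for a beneficial outcome it is an NNT. Illustrative helper only.
    """
    risk_difference = abs(active_pct - placebo_pct) / 100.0
    if risk_difference == 0:
        raise ValueError("no difference between groups; value is undefined")
    return 1.0 / risk_difference


if __name__ == "__main__":
    # Adverse event rates of 15% with topical salicylate and 9% with placebo
    print(round(number_needed(15, 9)))
```

With event rates of 15% and 9%, the risk difference is 0.06 and the reciprocal rounds to 17, in line with the NNH reported for topical salicylates.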

We agreed with all GRADE assessments with the exception of diclofenac for lateral elbow pain. The review authors judged the evidence concerning adverse events to be low quality (Pattanittum 2013); we judged it to be very low quality because the very small number of participants and low event rates would mean very few events.

Summary table E: Participants with at least one adverse event

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Pattanittum 2013	Lateral elbow pain	Diclofenac	3/153	2	1	1.6 (0.2 to 12)	n/c	Low quality
Derry 2015 (Systemic adverse events)	Strains and sprains	All NSAIDs	36/5576	3	4	1.0 (0.7 to 1.3)	n/c	High quality
Chronic pain conditions							NNH
Wrzosek 2015	Neuropathic pain	Clonidine	2/344	12	13	0.7 (0.1 to 3.1)	n/c	Very low quality
Derry 2016 (Systemic adverse events)	Knee and hand OA	Diclofenac	7/1266	6	7	0.9 (0.6 to 1.3)	n/c	Very low quality
Acute and chronic pain conditions							NNH
Derry 2014b	OA or LBP	Salicylate	11/984	15	9	1.6 (1.2 to 2.0)	17 (9.9 to 58)	Low quality

Footnote: CI: confidence interval; LBP: low back pain; n/c: not calculated; NNH: number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drugs; OA: osteoarthritis; RR: risk ratio

Summary table F presents the available results for participants experiencing a local adverse event over the duration of the studies. In each case the comparison is with placebo. Not all reviews were able to report information on local adverse events due to incomplete or inconsistent reporting in the original studies, or because there was so little information available. For topical capsaicin high‐concentration, studies that did not capture the application‐associated local adverse events reported event rates that were not different between topical capsaicin high‐concentration and placebo, with the exception of pain in around 10% of participants.

There were three interventions for which there were more local adverse events with topical analgesic than with placebo. These were topical capsaicin (low‐concentration) for neuropathic pain with a NNH of 2.5 (2.1 to 3.1) (Derry 2012), topical diclofenac for hand and knee osteoarthritis with a NNH of 16 (12 to 23) (Derry 2016), and topical salicylate rubefacients for osteoarthritis or low back pain with a NNH of 31 (16 to 300) (Derry 2014b).

We agreed with all GRADE assessments made by the authors, ranging from high quality for all NSAIDs in acute painful conditions, to very low quality for topical clonidine in neuropathic pain. In acute pain, results for individual topical NSAIDs were not graded, but based on the evidence available we graded the results high quality (diclofenac, ketoprofen, and piroxicam) or moderate quality (felbinac, indomethacin, and ibuprofen), based mainly on the amount of information available. Topical capsaicin (low‐concentration) evidence was not graded, but we considered it high quality based on a large effect size, consistency between studies, and a reasonably large number of events, together with a biological plausibility.

Summary table F: Participants with at least one local adverse event

				Percent with outcome
Reference	Condition treated	Topical treatment	Studies/Participants	Active	Placebo	RR 95% CI	NNH 95% CI	GRADE (review‐reported)
Acute pain conditions
Derry 2015	Strains and sprains	All NSAIDs	42/6740	4.3	4.6	0.98 (0.8 to 1.2)	n/c	High quality
Derry 2015	Strains and sprains	Diclofenac	15/3271	3.1	4.3	0.8 (0.6 to 1.1)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Ketoprofen	8/852	11	9.5	1.2 (0.8 to 1.7)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Piroxicam	3/522	2.3	5.4	0.4 (0.2 to 1.1)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Felbinac	3/397	3.0	1.5	1.9 (0.5 to 7.5)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Indomethacin	3/354	6.3	2.2	2.7 (0.9 to 7.7)	n/c	No specific GRADE given
Derry 2015	Strains and sprains	Ibuprofen	3/321	10	4.3	2.3 (0.98 to 5.4)	n/c	No specific GRADE given
Chronic pain conditions							NNH
Derry 2012	Neuropathic pain	Capsaicin (low‐concentration)	5/557	63	24	2.6 (2.1 to 3.3)	2.5 (2.1 to 3.1)	No specific GRADE given
Derry 2016	Knee and hand OA	Diclofenac	15/3658	14	8	1.8 (1.5 to 2.2)	16 (12 to 23)	Moderate quality
Derry 2016	Knee and hand OA	Ketoprofen	4/2621	15	13	1.0 (0.85 to 1.3)	n/c	Moderate quality
Derry 2017	Neuropathic	Capsaicin (high‐concentration (pain only))	4/1005	10	4	2.4 (1.4 to 4.1)	16 (11 to 31)	Moderate quality
Wrzosek 2015	Neuropathic	Clonidine	2/344	12	13	0.7 (0.1 to 3.1)	n/c	Very low quality
Acute and chronic pain conditions							NNH
Derry 2014b	Strains and sprains, OA or LBP	Salicylate	10/869	6	2	2.2 (1.1 to 4.1)	31 (16 to 300)	Very low quality

Footnote: CI: confidence interval; LBP: low back pain; n/c: not calculated; NNH: number needed to treat for an additional harmful outcome; NSAID: nonsteroidal anti‐inflammatory drug; OA: osteoarthritis; RR: risk ratio

Serious adverse events

Serious adverse events were uncommon. They were not noted in several reviews of short duration or with sparse data (Cui 2010; Cumpston 2009; Derry 2014b; Oltean 2014; Pattanittum 2013). A number of reviews indicated that there were no serious adverse events reported (Cameron 2013; Derry 2012; Derry 2015), or that there was no difference between topical analgesic and placebo (Derry 2016; Derry 2017). Of the remainder, one found no clear data (Wrzosek 2015), one had some possible serious adverse events for one herbal preparation (Cameron 2011), and one reported six serious adverse events in 263 participants in an open label extension of a topical lidocaine study, apparently unrelated to treatment (Derry 2014a).

Review authors rarely gave a GRADE assessment. We assessed the evidence as very low quality, based on the sparse data available.

Discussion

Summary of main results

This overview review showed that 13 individual Cochrane Reviews assessed the efficacy and harms from a range of available topical analgesics applied to intact skin in a number of acute and chronic painful conditions. The reviews involved 206 studies with around 30,700 participants.

For efficacy, we considered that there was moderate or high‐quality evidence for several therapies, based on the underlying quality of studies in the reviews and susceptibility to publication bias in comparisons with placebo topical therapy. In acute musculoskeletal pain conditions (strains and sprains) these were diclofenac Emulgel (NNT 1.8 (1.5 to 2.1)), ketoprofen gel (NNT 2.5 (2.0 to 3.4)), piroxicam (NNT 4.4 (3.2 to 6.9)), diclofenac Flector plaster (NNT 4.7 (3.7 to 6.5)), and diclofenac other plaster (NNT 3.2 (2.6 to 4.2)). In chronic musculoskeletal pain conditions (mainly hand and knee osteoarthritis) these were topical diclofenac preparations used for less than 6 weeks (NNT 5.0 (3.7 to 7.4)), ketoprofen over 6 to 12 weeks (NNT 6.9 (5.4 to 9.3)), and topical diclofenac preparations over 6 to 12 weeks (NNT 9.8 (7.1 to 16)). In postherpetic neuralgia, topical high‐concentration capsaicin had moderate‐quality evidence of limited efficacy (NNT 11 (6.1 to 62)).

We judged that evidence of efficacy for all other therapies for acute or chronic pain conditions was low or very low. Limited evidence of efficacy potentially subject to publication bias existed for topical preparations of ibuprofen gels and creams, unspecified diclofenac formulations and diclofenac gel other than Emulgel, indomethacin, and ketoprofen plaster in acute pain conditions, and for salicylate rubefacients for chronic pain conditions. Evidence for other interventions (other topical NSAIDs, topical salicylate in acute pain conditions, low‐concentration capsaicin, lidocaine, or clonidine for neuropathic pain, and herbal remedies for any condition) was of very low quality, and typically was limited to single studies or comparisons with sparse data from fewer than 200 participants.

One strong message is that for a number of topical analgesics, the exact formulation used may be of critical importance. This is seen most clearly for topical NSAIDs, particularly in acute pain. Different formulations of the same drug, usually at similar concentrations, have widely different effect sizes. For example, across five different formulations tested, topical diclofenac in acute pain has NNTs ranging between 1.8 and 8. Similarly, ketoprofen and ibuprofen gels and creams had very different effects. There is a strong indication that for acute pain conditions, gels tend to be superior to creams.

A second message is that the analgesic efficacy of topical analgesics is not just about rubbing them in. It is commonly believed that rubbing in itself is beneficial, although there is little evidence for this outside experimental pain (Kammers 2010). While rubbing may be part of topical analgesic application, rigorous trial methods include a rubbed placebo without an active agent, thus taking rubbing out of the equation when assessing drug efficacy (Tramèr 2004). That does not mean that rubbing plays no part in the overall analgesic experience; it may even be part of the reason for the relatively high placebo response rates of 20% to 57% seen.

We assessed evidence on withdrawals as very low quality, based on small numbers of events. In chronic pain conditions lack of efficacy withdrawals were lower than placebo with topical diclofenac (NNTp 26) and topical salicylate (NNTp 21), and adverse event withdrawals were higher than placebo with topical capsaicin low‐concentration (NNH 8), topical salicylate (NNH 26), and topical diclofenac (NNH 51).

We assessed evidence for systemic or local adverse event rates being no higher with topical NSAIDs than topical placebo as high quality. We also assessed local adverse events with topical capsaicin low‐concentration (NNH 2.5) as high quality. There was moderate‐quality evidence for more local adverse events than placebo in chronic pain conditions with topical diclofenac (NNH 16) and local pain with topical capsaicin high‐concentration (NNH 16). There was moderate‐quality evidence of no additional local adverse events with topical ketoprofen over topical placebo in chronic pain.

There was very low‐quality evidence for serious adverse events. It might be argued that the absence of events in a large body of studies speaks to an absence of risk, but the limited duration of the studies makes them unsuitable for assessing rare but serious harm. Epidemiological studies provide evidence that gastrointestinal bleeding and perforation risk is not increased with topical NSAIDs (Evans 1995).

Reporting of adverse events in systematic reviews has been criticised because they compound the poor reporting of harms in primary studies by failing to report on harms or doing so inadequately (Zorzela 2014). More and better information on adverse events, and particular adverse events, would be valuable. However, the problems are numerous, and include issues around the methods of collection of adverse events, and about their reporting (Derry 2001; Edwards 1999). Other limitations in both individual studies and systematic reviews include small numbers of participants and events in studies generally not powered to measure these outcomes.

Overall completeness and applicability of evidence

While the overview review included data from 206 studies with almost 31,000 participants, there was considerable fragmentation of this body of data that limits its completeness and applicability. Fragmentation derives from five main sources.

A range of different active agents (NSAID, capsaicin, salicylate, lidocaine, clonidine, glyceryl trinitrate, and a range of different herbal remedies). Some of these, such as NSAIDs, have a range of different chemical entities with probable different efficacy at doses used.
A number of different formulations that affect, or potentially affect, efficacy; an example is different formulations of diclofenac and ketoprofen in acute and chronic pain studies.
A range of different acute and chronic pain conditions.
Trial durations inappropriate for the conditions treated.
Inconsistency in reporting efficacy outcomes of relevance, or adverse event rates. Many studies will have reported average pain scores or pain change only, not recognising the value people with pain place on high levels of pain reduction (Moore 2013a), or the fact that in all pain conditions a bimodal distribution for analgesic response is the norm (Moore 2013b; Moore 2013c).

These factors mean that many of the possible comparisons involve small numbers of studies, participants, and events. Small numbers limit completeness and applicability even if all other considerations are excluded. Usually there are additional imperfections in trial methods or reporting that raise the possibility of risk of bias, further limiting applicability.

Topical analgesics offer possible important benefits to older people, in part due to lower adverse events (Gaskell 2014). No evidence in these reviews specifically related to older people.

We are not aware of any major topical analgesic that is not covered by the individual reviews in this overview. Menthol is used in many products, but we did not find any Cochrane or non‐Cochrane Review. Topical use of other drugs such as ketamine and amitriptyline is known; there are few trials, but no systematic reviews.

Quality of the evidence

Fragmentation also limits the quality of the evidence. Evidence of efficacy of moderate or high quality is available only for certain formulations of diclofenac, ketoprofen, and piroxicam in acute pain from strains and sprains, diclofenac and ketoprofen in hand and knee osteoarthritis, and high‐concentration capsaicin in postherpetic neuralgia. Evidence of moderate or high quality is available only for some adverse event outcomes, typically from pooling studies of different topical products.

There was no indication that the individual reviews were not properly performed. One area of possible concern was inconsistency in the use of GRADE. There were several instances where evidence from rather small amounts of data was graded as moderate or low in the original review. Because this was based on low numbers of participants and events, we judged evidence quality to be very low (Moore 1998; Thorlund 2011). Assessments of Cochrane Reviews have made a similar criticism (AlBalawi 2013; Turner 2013). This is not unimportant, as a GRADE assessment of moderate‐quality evidence might result in a different approach by professionals or in guidance than one of very low quality.

Potential biases in the overview process

One potential bias is the overlap between authors of the overview and of some of the individual reviews. This has been addressed by having other experienced authors for the overview.

Agreements and disagreements with other studies or reviews

We are not aware of any similar overview reviews.
