Herbal medicine for low‐back pain

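Background

Low‐back pain (LBP) is a common condition and imposes a substantial economic burden on people living in industrialized societies. A large proportion of people with chronic LBP use complementary and alternative medicine (CAM), visit CAM practitioners, or both. Several herbal medicines have been purported for use in treating people with LBP. This is an update of a Cochrane Review first published in 2006.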

Objectives

To determine the effectiveness of herbal medicine for non‐specific LBP.

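Search methods

We searched the following electronic databases up to September 2014: MEDLINE, EMBASE, CENTRAL, CINAHL, ClinicalTrials.gov, the World Health Organization International Clinical Trials Registry Portal, and PubMed. We checked reference lists in review articles, guidelines, and retrieved trials, and personally contacted individuals with expertise in this area.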

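Selection criteria

We included randomized controlled trials (RCTs) examining adults (over 18 years of age) with acute, sub‐acute, or chronic non‐specific LBP. The interventions were herbal medicines, which we defined as plants used for medicinal purposes in any form. Primary outcome measures were pain and function.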

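Data collection and analysis

A library scientist with the Cochrane Back Review Group conducted the database searches. One review author contacted content experts and acquired relevant citations. We downloaded full references and abstracts of the identified studies and retrieved a copy of each study for final inclusion decisions. Two review authors assessed risk of bias, GRADE criteria (GRADE 2004), and CONSORT compliance, and a random subset was compared with assessments by another individual. Two review authors assessed clinical relevance and resolved any disagreements by consensus.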

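Main results

We included 14 RCTs (2050 participants) in this review. One trial on Solidago chilensis M. (Brazilian arnica) (20 participants) found very low quality evidence of reduced perception of pain and improved flexibility with application of a Brazilian arnica‐containing gel twice daily compared to placebo gel. Capsicum frutescens cream or plaster probably produces more favourable results than placebo in people with chronic LBP (three trials, 755 participants, moderate quality evidence). Based on current evidence, it is not clear whether topical capsicum cream is more beneficial than placebo for treating people with acute LBP (one trial, 40 participants, low quality evidence). Another trial found C. frutescens cream to be equivalent to a homeopathic ointment (one trial, 161 participants, very low quality evidence). Daily doses of Harpagophytum procumbens (devil's claw), standardized to 50 mg or 100 mg harpagoside, may be better than placebo for short‐term improvements in pain and may reduce use of rescue medication (two trials, 315 participants, low quality evidence). Another H. procumbens trial demonstrated relative equivalence to 12.5 mg per day of rofecoxib (Vioxx®) (one trial, 88 participants, very low quality evidence). Daily doses of Salix alba (white willow bark), standardized to 120 mg to 240 mg salicin, are probably better than placebo for short‐term improvements in pain and use of rescue medication (two trials, 261 participants, moderate quality evidence). An additional trial demonstrated relative equivalence to 12.5 mg per day of rofecoxib (one trial, 228 participants) but was graded as very low quality evidence. S. alba minimally affected platelet thrombosis compared with a cardioprotective dose of acetylsalicylate (one trial, 51 participants). One trial examining Symphytum officinale L. (comfrey) found low quality evidence that Kytta‐Salbe comfrey extract ointment is better than placebo ointment for short‐term improvements in pain as assessed by VAS. Aromatic lavender essential oil applied by acupressure may reduce subjective pain intensity and improve lateral spine flexion and walking time compared to untreated participants (one trial, 61 participants, very low quality evidence). No significant adverse events were noted within the included trials.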

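Authors' conclusions

C. frutescens (cayenne) reduces pain more than placebo. Although H. procumbens, S. alba, S. officinale L., S. chilensis, and lavender essential oil also seem to reduce pain more than placebo, evidence for these substances was of moderate quality at best. Additional well‐designed large trials are needed to test these herbal medicines against standard treatments. In general, the completeness of reporting in these trials was poor. Trialists should refer to the CONSORT statement extension for reporting trials of herbal medicine interventions.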

Herbal medicine for low‐back pain

Review question

We examined the evidence regarding the effect of herbal medicine on pain in people with non‐specific low‐back pain (LBP).

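Background
Low‐back pain is common, and about 35% of the population experience it at some time. Non‐specific LBP is defined as pain between the lowest rib and the bottom of the buttocks that is not caused by serious underlying problems such as rheumatoid arthritis, infection, fracture, cancer, or sciatica due to a prolapsed disc or other pressure on nerves. Herbal medicines taken orally or applied to the skin are used to treat many conditions, including back pain.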

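Study characteristics
Researchers from the Cochrane Collaboration examined the evidence available up to August 5, 2013. Fourteen studies tested six herbal medicines and included 2050 adults with non‐specific acute or chronic LBP. Two oral herbal medicines, Harpagophytum procumbens (devil's claw) and Salix alba (white willow bark), were compared to placebo (sham pills) or rofecoxib (Vioxx®). Three topical creams, plasters, or gels, Capsicum frutescens (cayenne), Symphytum officinale L. (comfrey), and Solidago chilensis (Brazilian arnica), were compared to placebo creams or plasters and a homeopathic gel. One essential oil, lavender, was compared to no treatment. The average age of people included in the trials was 52 years, and studies usually lasted three weeks.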

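Key results
Devil's claw, in a standardized daily dose of 50 mg or 100 mg harpagoside, may reduce pain more than placebo; a standardized daily dose of 60 mg reduced pain about as much as a daily dose of 12.5 mg of Vioxx®. Willow bark, in a standardized daily dose of 120 mg to 240 mg of salicin, reduced pain more than placebo; a standardized daily dose of 240 mg reduced pain about as much as a daily dose of 12.5 mg of Vioxx® (a non‐steroidal anti‐inflammatory drug). Cayenne was tested in several forms: in plaster form, it reduced pain more than placebo and about as much as the homeopathic gel Spiroflor SRL. Two other ointment‐based medicines, S. officinale and S. chilensis, appeared to reduce perception of pain more than placebo creams. Lavender essential oil applied by acupressure appeared effective in reducing pain and improving flexibility compared to conventional treatment. Adverse effects were reported, but appeared to be primarily confined to mild, transient gastrointestinal complaints or skin irritations.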

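Quality of the evidence
Most included trials were at low risk of bias, and the quality of the evidence was mostly low to moderate. There was moderate quality evidence only for C. frutescens. Trials tested only the effects of short‐term use (up to six weeks). Authors of eight of the included trials had a potential conflict of interest, and four other authors did not disclose conflicts of interest. Now that Vioxx® has been withdrawn from the market due to adverse effects, three of the herbal medicines should be compared with readily available pain medications, such as non‐steroidal anti‐inflammatory drugs and acetaminophen, for relative effectiveness and safety.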

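Conclusions
Low to moderate quality evidence shows that four herbal medicines may reduce pain in acute and chronic LBP in the short term and have few side effects. There is no evidence yet that any of these substances are safe and efficacious for long‐term use. Large, well‐designed trials are needed to confirm the effects of these interventions.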

Authors' conclusions

Implications for practice

A topically applied plaster or cream of C. frutescens and a cream of S. officinale appear to reduce pain more than placebo. These herbal medicines could be considered as treatment options for acute LBP (S. officinale) and chronic LBP (C. frutescens).

Implications for research

Additional large RCTs at low risk of bias and completely reported must be done to determine if the herbal medicines discussed above are effective in the treatment of acute and chronic LBP. In particular, more trials are needed that include people with acute and subacute LBP. Also, additional trials testing these herbal medicines against standard treatments (acetaminophen, NSAIDs) will clarify their equivalence in terms of efficacy and effectiveness. The quality of reporting in these trials was generally poor and thus trialists should refer to the CONSORT and related statements in designing and reporting clinical trials of herbal medicines.

Summary of findings

Summary of findings for the main comparison (table 1). Brazilian arnica extract compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Patient or population: patients with back pain
Settings: outpatient clinic
Intervention: extract of Brazilian arnica
Comparison: placebo

Outcome: pain reduction based on the pain VAS instrument (0‐100 scale)
No of participants (trials): 20 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹
Comments: Very small sample size (only N = 10 in the treatment group). This trial found that topical application of Brazilian arnica reduced the perception of pain and increased flexibility in the treated group compared to baseline values in that group. Unknown if acute or chronic LBP.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Selection bias was high to unclear, performance bias was low risk to unclear risk, with other attributes being low risk.

Summary of findings 2. Topical capsaicin cream or plaster compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Patient or population: patients with chronic LBP or soft tissue pain
Settings: outpatient clinic
Intervention: topical capsicum cream or plaster
Comparison: placebo

Outcome: pain perception according to the pain VAS (scale 0‐10)
No of participants (trials): 755 (three trials)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate¹
Comments: All three trials found a statistically significant difference between the capsaicin intervention and placebo. In three trials minor adverse effects were noted in the treatment groups, requiring no specific follow‐up treatments.

¹ All three trials exhibited low to unclear risk in selection bias, performance bias and attrition bias. One trial was at high risk for selective reporting.

Summary of findings 3. Topical capsaicin cream compared with placebo for patients with acute non‐specific LBP

Patient or population: patients with acute mechanical LBP
Settings: outpatient clinic
Intervention: Rado‐Salil ointment
Comparison: placebo

Outcome: pain evaluation on a 10 cm linear scale
No of participants (trials): 40 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²
Comments: Pain improvements were significantly greater in the capsicum cream group up to day 14. Adverse events: pruritus, one in the placebo group and one in the Rado‐Salil group; local erythema and burning, three in the Rado‐Salil group.

¹ Exhibited unclear risk for selection bias as well as unclear baseline similarities. Performance bias was low risk, as was attrition bias, but the trial was at high risk for incomplete outcome data.

² As under 400 participants were included, evidence was downgraded to very low from low.

Summary of findings 4. H. procumbens compared to placebo for non‐specific chronic back pain

Patient or population: patients with chronic back pain
Settings: outpatient clinic
Intervention: H. procumbens extract
Comparison: placebo

Outcome: Arhus pain index (scale 0‐130)
No of participants (trials): 315 (two trials)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low¹,²
Comments: In one trial a 50 mg dose of H. procumbens was used, and in the second trial 50 mg and 100 mg doses were used, with both trials showing a significantly improved pain score over placebo.

¹ Both included trials exhibited low risk of bias regarding selection bias with one trial at unclear risk of bias. Performance bias was at low risk of bias, as was attrition bias with one trial at high risk of bias for incomplete outcome data.

² Two trials included under 400 participants and we downgraded the evidence to low from moderate.

Summary of findings 5. H. procumbens extract compared to Vioxx® for non‐specific chronic LBP

Patient or population: patients with chronic LBP
Settings: outpatient clinic
Intervention: H. procumbens extract
Comparison: Vioxx®

Outcome: Modified Arhus Index (scale 0‐120)
No of participants (trials): 88 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²
Comments: H. procumbens was compared to Vioxx®. Both groups showed similar pain reduction scores, with no demonstrable difference between groups. Adverse effects were noted in both groups.

¹ This trial was at low risk of bias for all risk of bias factors, with the exception of allocation concealment and compliance, which were at unclear risk of bias.

² Downgraded from low to very low as under 400 participants were included.

Summary of findings 6. Willow bark extract compared to placebo for non‐specific chronic LBP

Patient or population: patients with chronic LBP
Settings: outpatient clinic and public advertisement
Intervention: willow bark extract
Comparison: placebo

Outcome: pain VAS (scale 0‐10)
No of participants (trials): 261 (two trials)
Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate¹,²
Comments: The high dose (240 mg) treatment group showed a significant reduction in pain scores versus the low dose (120 mg) group and the placebo group. There was one severe allergic reaction related to the extract noted. One trial (N = 51) also examined the effect of the extract on platelet aggregation.

¹ Both trials were at low to unclear risk for selection bias, low risk for performance bias with one trial exhibiting high risk in baseline characteristics similarity. Both trials were rated as an overall low risk of bias since they met our predetermined cut‐point of 50% of the criteria on which the trial methods were assessed.

² Downgraded from high to moderate as under 400 participants were included between both trials.

Summary of findings 7. Willow bark extract compared to rofecoxib for non‐specific chronic LBP

Patient or population: patients with chronic LBP
Settings: outpatient clinic
Intervention: willow bark extract
Comparison: rofecoxib

Outcomes: Arhus Index (scale 0‐130); pain VAS (scale 0‐10)
No of participants (trials): 228 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²
Comments: There was no significant difference in effectiveness or adverse events between the extract and rofecoxib.

¹ Low risk for selection bias, high risk for performance bias, and high and low risk for attrition bias.

² Downgraded from low to very low as under 400 participants were included.

Summary of findings 8. Comfrey root extract compared to placebo for acute lower and upper back non‐specific pain

Patient or population: patients with acute lower and upper back pain
Settings: outpatient setting
Intervention: comfrey root extract
Comparison: placebo

Outcome: pain VAS sum (decrease) on active standardized movement (mm)
No of participants (trials): 120 (one trial)
Quality of the evidence (GRADE): ⊕⊕⊝⊝ low¹,²
Comments: The root extract showed a statistically and clinically relevant reduction in acute back pain versus placebo.

¹ Unclear risk for selection bias, low risk for both performance and attrition bias.

² Downgraded from moderate to low as under 400 participants were included.

Summary of findings 9. Lavender oil acupressure massage and acupoint stimulation compared to usual treatment for acute non‐specific LBP

Patient or population: patients with acute LBP
Settings: old aged home and community centre
Intervention: lavender oil massage
Comparison: usual therapy

Outcome: pain VAS (0‐10 scale)
No of participants (trials): 61 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹
Comments: One week post‐study the treatment group showed a significant (P = 0.0001) reduction in VAS pain as well as improved walking time and lateral spine flexion range.

¹ Sequence generation was at low risk of bias but allocation concealment was at high risk. Performance bias was at high and unclear risk. Co‐interventions and timing outcome assessment factors were at high risk of bias.

Summary of findings 10. Spiroflor SRL compared to CCC for chronic non‐specific LBP

Patient or population: patients with acute and chronic LBP
Settings: outpatient clinic
Intervention: Spiroflor SRL
Comparison: CCC

Outcome: pain VAS (0‐100 scale)
No of participants (trials): 161 (one trial)
Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²
Comments: Spiroflor SRL and CCC were equally effective in treating acute LBP, but the CCC group experienced more adverse events and adverse drug reactions.

CCC = Cremor Capsici Compositus FNA; SRL = Homeopathic combination of Symphytum officinale, Rhus toxicodendron and Ledum palustre

¹ All risk of bias factors were at low risk of bias, except patient compliance which was at high risk.

² Downgraded from low to very low as under 400 participants were included.

Background

Low‐back pain (LBP) and its related disability are major public health problems across industrialized nations. As a result, research efforts have intensified to identify effective treatment and management strategies for people with LBP (Mounce 2002). Point prevalence estimates of LBP vary widely depending on the methodology used, but range from 12% to 40% of the population (Friedley 2010). Additionally, several studies indicate that LBP prevalence is increasing over time (Friedley 2010). Data from a United States national survey from 2002 reported a three‐month prevalence of 26.4%, with higher prevalence among American Indians and Alaskan Natives (35.0%), and lower prevalence among Asian Americans (19.0%) (Deyo 2006). LBP prevalence peaks between the ages of 45 and 64 years and is more common among lower socioeconomic status groups, as defined by income and education (Deyo 2006). Lifetime prevalence of LBP is estimated at 67% (Deyo 2006).

In the United States, back pain accounts for 19 million physician visits, 250 million workdays lost, and $14 billion (USD) in direct expenditures (Grabois 2005). Indirect costs, excluding short‐ and long‐term disability, are estimated at up to $100 billion (USD) per year (Eisenberg 2012). LBP‐related disorders caused 2.63 million annual emergency department (ED) visits, or 2.3% of all visits to EDs in the United States (Friedman 2010). This amounts to substantial societal productivity losses and an economic burden for health‐care systems in many industrialized countries (Mounce 2002).

Wide variations in the medical and surgical management of LBP reflect widespread professional uncertainty about optimal care of people with LBP (Eisenberg 2012). Over 1000 randomized controlled trials (RCTs) have been published evaluating all types of conservative, complementary, or surgical treatments for LBP that are commonly used in primary and secondary care (Koes 2006). A special focus issue of The Spine Journal reviewed 25 categories of treatment presented for the management of chronic LBP (Haldeman 2008). Several interventions are included in clinical practice guidelines on LBP, including: back schools, non‐steroidal anti‐inflammatory drugs (NSAIDs), the McKenzie method, needle acupuncture, spinal manipulation, trigger point injections, and watchful waiting (Haldeman 2008). A summary of European clinical guidelines for chronic LBP includes cognitive behavior therapy, supervised exercise therapy, educational interventions, biopsychosocial treatment, and short‐term use of NSAIDs and weak opioids (Koes 2006).

Although systematic reviews suggest that few of these interventions have sufficient evidence to suggest benefit, it does appear that acute LBP can usually be effectively managed by encouraging activity, reassurance, and short‐term symptom control (analgesics or NSAIDs) (Koes 2006). Treatments that demonstrate some effectiveness for the management of chronic LBP include exercise therapy, behavioural treatment, and multidisciplinary treatment programs, as well as short‐term use of analgesics or NSAIDs (Koes 2006).

Research in complementary and alternative medicine (CAM) has increased over the past 15 years. Rigorous literature is growing steadily and is subsequently clarifying the validity of these techniques (e.g., Vickers 2000). Specifically, the number of randomized trials of complementary treatments has doubled approximately every five years (Vickers 2000) and currently, the Cochrane Complementary Medicine Field Trials Registry contains over 43,000 records. In addition, CAM teaching institutions are now beginning to teach principles of evidence‐based medicine and clinical epidemiology (Mills 2002; Sierpina 2002). These initiatives are well placed, given the large number of visits to CAM practitioners (Metcalfe 2010; Frass 2012). A recent population survey in Canada found that 12.4% of Canadians visited a CAM practitioner in the year they were surveyed, between 2001 and 2005 (Metcalfe 2010). A review article on the international acceptance and use of CAM found prevalence rates between 5.0% and 74.8%, with an overall average prevalence of 32.2% (Frass 2012). Follow‐up studies indicate that there has been a steady increase in CAM use (Frass 2012). CAM users are more often women, middle‐aged, educated, and experiencing chronic disease (Metcalfe 2010; Frass 2012). Back pain or back problems are among the five most common medical conditions for which CAM is used (Frass 2012). Among people reporting back problems, between 16.8% and 57.2% seek CAM treatments (Frass 2012).

Several herbal medicines are reported treatments for various types of pain. These include Commiphora molmol (myrrh), Capsicum frutescens (capsicum), Salix alba (white willow bark), Melaleuca alternifolia (tea tree), Angelica sinensis (dong quai), Aloe vera (aloe), Thymus officinalis (thyme), Mentha piperita (peppermint), Arnica montana (arnica), Curcuma longa (curcumin), Tanacetum parthenium (feverfew), Harpagophytum procumbens (devil's claw), and Zingiber officinale (ginger) (Blumenthal 1998). Many have been the subject of extensive biochemical research, resulting in the delineation of their pharmacological and physiological effects (Mills 2000). For example, the mechanism of C. frutescens is partially related to its ability to deplete substance P, a neurotransmitter for pain perception (Keitel 2001). S. alba is a platelet inhibitor and analgesic, and H. procumbens has analgesic and anti‐inflammatory properties (Chrubasik 1996). In addition, some of these herbal species have been clinically tested for the relief of symptoms of LBP (Krivoy 2000; Laudahn 2001a; Mills 2000; Stam 2001).

Given the large public health and economic burden LBP causes and the large number of people with LBP who regularly visit CAM practitioners, a systematic review of these herbal medicines was warranted.

Objectives

To determine the effectiveness of herbal medicine compared to placebo, no intervention, or other interventions in the treatment of non‐specific LBP.

Methods

Criteria for considering studies for this review

Types of studies

RCTs.

Types of participants

Adults aged over 18 years, suffering from acute (lasting up to six weeks), sub‐acute (six to 12 weeks) or chronic (longer than 12 weeks) non‐specific LBP.

We defined LBP as pain localized to the area between the costal margin or the 12th rib to the inferior gluteal fold. Non‐specific LBP indicated that no specific cause was detectable, such as infection, neoplasm, metastasis, osteoporosis, rheumatoid arthritis, fracture, inflammatory process, or radicular syndrome (Waddell 1996).
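The duration categories above can be written as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and the handling of the exact boundaries (six weeks counted as acute, 12 weeks as sub‐acute) is an assumption read from the wording "up to six weeks" and "longer than 12 weeks".

```python
def classify_lbp_duration(weeks: float) -> str:
    """Map the duration of an LBP episode (in weeks) to a duration category.

    Assumed boundary handling: exactly 6 weeks is 'acute' and exactly
    12 weeks is 'sub-acute'; only durations beyond 12 weeks are 'chronic'.
    """
    if weeks < 0:
        raise ValueError("duration cannot be negative")
    if weeks <= 6:
        return "acute"
    if weeks <= 12:
        return "sub-acute"
    return "chronic"
```

For example, `classify_lbp_duration(8)` falls in the sub‐acute band under these assumptions.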

Types of interventions

For the purpose of this Cochrane Review, we defined a herbal medicine as all or part of a plant used for medicinal purposes, administered orally (ingestion) or applied topically. This definition did not include plant substances smoked (e.g. Cannabis sativa), individual chemicals derived from plants, or synthetic chemicals that were based on constituents of plants. However, we considered C. sativa and other plants that can be smoked as herbal medicines in this Cochrane Review if they were ingested. Various forms of oral herbal medicine include: standardized extracts (encapsulated or tablet form), tinctures (alcohol, glycerine, etc.), dried herb (encapsulated or tablet form), raw whole herb infusion (e.g. tea) and decoction (e.g. boiled‐down tea). Topical herbal applications include ointments, essential oils, creams (petroleum or glycerine‐based), powders, plasters, and poultices. We excluded opioids as they bridge the definitions of herbal medicine and analgesic.

Types of outcome measures

  1. Pain intensity (e.g. visual analogue scale (VAS), numerical rating scale (NRS)) and proportion of pain‐free patients (use of analgesic medications);

  2. Back pain specific functional status measured by validated instruments (e.g. Roland Disability Questionnaire (RDQ), Oswestry Disability Index (ODI), modified Aberdeen LBP Scale);

  3. Overall improvement (% reporting subjective improvement, NRS);

  4. Return to Work or Work Status (% of population, number of days of absenteeism);

  5. Lumbar flexibility (measured by Schober method, fingertip‐to‐ground distance).

Search methods for identification of studies

We used the search strategy recommended by the Cochrane Back Review Group (CBRG) (Bombardier 2011; Furlan 2009). The search strategies for the identification of RCTs recommended by Robinson 2002 were also used in some of the strategies in this update.

Electronic searches

We searched the following electronic databases up to September 11, 2014:

  1. MEDLINE (OvidSP, 1946 to September Week 1 2014) and MEDLINE In‐Process & Other Non‐Indexed Citations (OvidSP, September 10, 2014)

  2. EMBASE (OvidSP, 1980 to 2014 Week 36)

  3. Cochrane Central Register of Controlled Trials (CENTRAL, The Cochrane Library; Issue 8 of 12, August 2014)

  4. Cumulative Index to Nursing and Allied Health Literature (CINAHL; EBSCO, 1981 to 2014)

  5. ClinicalTrials.gov

  6. World Health Organization International Clinical Trials Registry Portal (WHO ICTRP)

  7. PubMed

Searches have been conducted annually since 2009. CENTRAL (which contains the Complementary Medicine Field trials register) and CINAHL were added to the strategy in 2009, the trials registries ClinicalTrials.gov and WHO ICTRP were added in 2012, and MEDLINE In‐Process & Other Non‐Indexed Citations was added in 2013. A supplemental search of PubMed was added in 2014 to capture items not indexed in MEDLINE using the strategy recommended by Duffy 2014. The 2014 search strategies for all databases are included in Appendix 1; Appendix 2 contains the 2009 strategies and highlights updates to these strategies to 2013.

Searching other resources

We reviewed reference lists in review articles, guidelines, and in the retrieved trials. We also contacted individuals with expertise in herbal medicine and LBP to identify additional trials. We translated non‐English articles, and JJG and MvT discussed these articles following the same procedures described below.

Data collection and analysis

Selection of studies

A library scientist with the CBRG conducted the electronic searches. Two review authors (HNO and JJG) independently selected studies based on title, abstract, and keywords. We included studies that met the inclusion criteria. If it was unclear from the title and abstract if a study fulfilled the inclusion criteria, we retrieved the full‐text article for final selection. We used a consensus method to resolve any disagreements.

Data extraction and management

Two review authors (HNO and JJG) independently extracted the data from each trial using a standardized form. We extracted the following data from each trial: recruitment, characteristics of the trial population (age, gender), setting (e.g. year, country of origin), duration of LBP (acute, subacute, or chronic), previous treatment for LBP, number of participants initially recruited, number of participants randomized, number of drop‐outs or withdrawals, duration of intervention, type of herbal medicine used (plant name and form of delivery and dosage), standardization information (e.g. percentage of active constituent per delivery unit), characteristics of the control intervention (type and duration), types of outcome measures, summary statistics, timing of outcome assessments, compliance, adverse effects due to intervention, and authors' conclusions as to the intervention's effectiveness.

Reporting quality

One review author (HNO) assessed the reporting quality of each included trial using the CONSORT statement (Moher 2012) and the CONSORT statement for herbal interventions (Gagnier 2006c). HNO scored each criterion as 'yes' (Y), 'no' (N) or 'don't know' (DK). 'Yes' indicated that the criterion was met. 'No' reflected the lack of fulfilment of that criterion. 'Don't know' reflected the fact that there was insufficient information to determine if this criterion was fulfilled or not. We considered a trial to have high reporting quality if it contained at least 50% of the CONSORT checklist items or at least 50% of the CONSORT for herbal interventions items.
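
The 50% rule described above can be sketched in Python. This is a minimal illustration rather than the review authors' own tooling: the function names and the list-of-strings input format are hypothetical, and treating 'DK' the same as 'N' (not met) is an assumption.

```python
def fraction_met(scores):
    """Return the share of checklist items scored 'Y' (met).

    `scores` is a list of 'Y', 'N', or 'DK' strings, one per checklist item.
    Assumption: 'N' and 'DK' both count as not met.
    """
    if not scores:
        raise ValueError("at least one checklist item is required")
    return sum(1 for s in scores if s == "Y") / len(scores)


def has_high_reporting_quality(consort_scores, herbal_scores, cutoff=0.5):
    """Apply the 'at least 50% on either checklist' rule."""
    return (fraction_met(consort_scores) >= cutoff
            or fraction_met(herbal_scores) >= cutoff)
```

A trial passes if either checklist reaches the cut-off, which mirrors the "or" in the rule as stated.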

Assessment of risk of bias in included studies

Two review authors (HNO and JJG) independently assessed methodological quality using the method recommended by Furlan 2009. Given JJG's familiarity with the literature, trials were not blinded for authors, institution or journal. We used the 12 items in the methodological quality assessment reflecting internal validity, together with operational definitions, recommended by the CBRG in their updated method guidelines for systematic reviews to assess methodological quality (Furlan 2009). We scored each criterion as high, low, or unclear. 'High' indicated that the criterion was not met. 'Low' reflected the fulfilment of that criterion. 'Unclear' reflected the fact that there was insufficient information to determine whether this criterion was fulfilled or not. We considered a trial to have low risk of bias if the trial met at least 50% (6/12) of internal validity items and we found no other serious flaws.
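
As a rough sketch of this classification, the following Python function counts the items rated 'low' and applies the six-of-12 cut-off. The function name, the input format, and the `serious_flaw` flag are hypothetical; labelling every trial below the cut-off as 'high' is also an assumption made for illustration.

```python
def risk_of_bias_category(item_ratings, min_low=6, serious_flaw=False):
    """Classify a trial from its 12 internal-validity item ratings.

    `item_ratings` holds 'low', 'high', or 'unclear' for each item;
    'low' means the criterion was fulfilled.
    """
    if len(item_ratings) != 12:
        raise ValueError("expected ratings for 12 items")
    fulfilled = sum(1 for rating in item_ratings if rating == "low")
    # Low risk of bias needs enough fulfilled items and no other serious flaw.
    if fulfilled >= min_low and not serious_flaw:
        return "low"
    return "high"
```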

Data synthesis

We planned to analyse dichotomous outcomes by calculating the relative risk values (RR). For continuous outcomes, we planned to calculate the mean difference (MD) when the same instrument was used to measure outcomes or the standardized mean difference (SMD) when different instruments were used to measure the outcomes. We planned to use 95% confidence intervals (95% CI) to express the uncertainty of the findings. However, we were unable to combine the trials through meta‐analysis because of insufficient data and clinical heterogeneity. Therefore we conducted a qualitative analysis of trial findings.
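
For readers unfamiliar with the planned dichotomous analysis, a risk ratio with a 95% confidence interval can be computed as sketched below. This uses the common large-sample (Wald) interval on the log scale, which is an assumption here since the review does not state the exact method. The function name is hypothetical and the counts in the usage lines are invented for illustration, not trial data.

```python
import math


def risk_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio with a Wald confidence interval on the log scale.

    Returns (rr, lower, upper). Requires at least one event in each arm.
    """
    if min(events_t, events_c) <= 0:
        raise ValueError("both arms need at least one event")
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts, for illustration only (not trial data).
rr, lo, hi = risk_ratio_ci(events_t=30, n_t=100, events_c=15, n_c=100)
print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that includes 1 would indicate no statistically significant difference between groups at the 5% level.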

Two review authors (HNO and JJG) independently assessed the overall quality of the evidence for each outcome using the GRADE approach (GRADE 2004), as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) and adapted in the updated CBRG method guidelines (Furlan 2009). Domains that decreased the quality of the evidence include: study design and risk of bias, inconsistency of results, indirectness, imprecision (sparse data) and other factors (e.g. reporting bias).

We determined the overall quality of evidence for each outcome by combining the assessments on all domains. The five levels of evidence include:

  • High quality evidence: there are consistent findings among at least 75% of RCTs with low risk of bias, consistent, direct and precise data and no known or suspected publication biases. Further research is unlikely to change either the estimate or our confidence in the results.

  • Moderate quality evidence: one of the domains is not met. Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.

  • Low quality evidence: two of the domains are not met. Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.

  • Very low quality evidence: three of the domains are not met. We are very uncertain about the results.

  • No evidence: no RCTs were identified that addressed this outcome.
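
The mapping from unmet domains to an evidence level listed above can be sketched as a small lookup. The function name is hypothetical, and treating more than three unmet domains as 'very low' is an assumption, since the list only describes up to three.

```python
GRADE_LEVELS = {0: "high", 1: "moderate", 2: "low", 3: "very low"}


def grade_level(domains_not_met, n_rcts):
    """Return the evidence level for one outcome.

    `domains_not_met` is the number of GRADE domains that were not met;
    `n_rcts` is the number of RCTs that addressed the outcome.
    """
    if n_rcts == 0:
        return "no evidence"
    if domains_not_met < 0:
        raise ValueError("domains_not_met cannot be negative")
    # Assumption: four or five unmet domains are also rated 'very low'.
    return GRADE_LEVELS[min(domains_not_met, 3)]
```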

Clinical relevance

One review author (JJG) assessed the clinical relevance of each trial using these five questions:

  1. Are the patients described in detail so that you can decide whether they are comparable to those that you see in your practice?

  2. Are the interventions and treatment settings described well enough so that you can provide the same for your patients?

  3. Were all clinically relevant outcomes measured and reported?

  4. Is the size of the effect clinically important?

  5. Are the likely treatment benefits worth the potential harms?

Results

Description of studies

We identified 775 references in total and excluded 725 references after reviewing titles and abstracts. Fifty papers were retrieved in full, 34 of which were excluded. We have listed the reasons for exclusion in the Characteristics of excluded studies section. We excluded a further six articles, for a total of 40 excluded articles, because they were notes or comments on a previous study or repeated articles. Therefore, in the original search we included 10 articles (Chrubasik 1996; Chrubasik 1999; Chrubasik 2000; Chrubasik 2001a; Chrubasik 2003; Frerick 2003; Ginsberg 1987; Krivoy 2001; Keitel 2001; Stam 2001). In this search update (July 2005 to August 5, 2013), we identified a total of four citations that met the inclusion criteria (Chrubasik 2010; da Silva 2010; Giannetti 2010; Yip 2004), which we included with the 10 articles identified in earlier literature searches. In a search update through September 2014, we identified five citations, none of which met the inclusion criteria.
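
The study flow can be checked with simple arithmetic. The sketch below takes the counts exactly as stated in this paragraph; the variable names are hypothetical.

```python
# Counts as reported in the text.
identified = 775
excluded_on_title_abstract = 725
excluded_after_full_text = 34
excluded_notes_or_repeats = 6
included_from_update = 4

retrieved_full_text = identified - excluded_on_title_abstract
total_excluded_full_text = excluded_after_full_text + excluded_notes_or_repeats
included_original = retrieved_full_text - total_excluded_full_text
included_total = included_original + included_from_update

print(retrieved_full_text, total_excluded_full_text,
      included_original, included_total)
```

The derived values agree with the 50 full-text papers, 40 excluded articles, 10 originally included articles, and 14 included trials reported in the review.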

Three trials used an oral form of the herbal species H. procumbens (devil's claw; Chrubasik 1996; Chrubasik 1999; Chrubasik 2003), three trials used oral S. alba (white willow bark; Chrubasik 2000; Chrubasik 2001a; Krivoy 2001), five trials used topical C. frutescens (cayenne; Chrubasik 2010; Frerick 2003; Ginsberg 1987; Keitel 2001; Stam 2001), one used lavender essential oil applied by acupressure (Yip 2004), one used topical S. chilensis (Brazilian arnica; da Silva 2010), and one used topical Symphytum officinale L. (comfrey; Giannetti 2010).

Four trials compared various oral herbal medicines with placebo (Chrubasik 1996; Chrubasik 1999; Chrubasik 2000; Krivoy 2001). Two trials compared oral herbal medicines to standard pain medications (Chrubasik 2001a; Chrubasik 2003). Six trials compared topical herbal medicines to placebo (Chrubasik 2010; da Silva 2010; Frerick 2003; Giannetti 2010; Ginsberg 1987; Keitel 2001). One trial compared a topical herbal medicine to a topical homeopathic medicine (Stam 2001) and one trial compared a topical herbal medicine to no treatment (Yip 2004).

The three H. procumbens trials included participants with acute exacerbations of chronic non‐specific LBP (Chrubasik 1996; Chrubasik 1999; Chrubasik 2003). Similarly, the three trials of S. alba preparations included homogeneous populations with acute episodes of chronic non‐specific LBP (Chrubasik 2000; Chrubasik 2001a; Krivoy 2001). One trial using a topical C. frutescens ointment included patients with acute mechanical LBP (Ginsberg 1987). Two trials using a Capsicum pain plaster included participants with chronic non‐specific LBP (Frerick 2003; Keitel 2001). One trial using Capsicum cream included patients with chronic pain of the soft tissues of the musculoskeletal apparatus, with a subgroup of back pain patients (Chrubasik 2010). A second trial using a topical Capsicum ointment included a sample of patients with either newly occurring acute LBP or acute episodes of chronic LBP (Stam 2001). The lavender essential oil trial included patients with non‐specific sub‐acute LBP (Yip 2004). The trial assessing S. chilensis included patients seeking treatment for a diagnosis of lumbago (da Silva 2010). Finally, the S. officinale trial included patients with acute non‐specific upper or lower back pain (Giannetti 2010).

Three trials (Chrubasik 1996; Chrubasik 1999; Frerick 2003) used a relatively unknown LBP scale, the Arhus Index, which was designed to monitor outcomes of clinical trials of LBP. The Arhus Index is a back‐pain specific index that includes physical impairment, pain, and disability scores, which are summed into a total score (Manniche 1994). The pain scale is rated by the patient and includes back pain and leg pain, with a score that ranges from 0 to 60. The disability scale consists of a questionnaire that asks about 15 daily tasks, with a score that ranges from 0 to 30. The physical impairment score is obtained by scoring on a deep knee bend, a modified Schober's test, a low‐back strength test and a measure of analgesic use, with a total combined score ranging from 0 to 40. The higher the scores, the more physical impairment, pain and disability. This test takes approximately 15 minutes to complete. It has been shown to be a valid and reliable measure of LBP (Manniche 1994).
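
To make the structure of the index concrete, the sketch below sums the three subscale scores using the ranges given above, which implies a total between 0 and 130. The function and constant names are hypothetical, and the range check is an added safeguard rather than part of the published instrument.

```python
ARHUS_RANGES = {
    "pain": (0, 60),
    "disability": (0, 30),
    "physical_impairment": (0, 40),
}


def arhus_total(pain, disability, physical_impairment):
    """Sum the three Arhus subscale scores after checking their ranges."""
    scores = {
        "pain": pain,
        "disability": disability,
        "physical_impairment": physical_impairment,
    }
    for name, value in scores.items():
        low, high = ARHUS_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} score must be between {low} and {high}")
    # Higher totals indicate more pain, disability and physical impairment.
    return sum(scores.values())
```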

In the H. procumbens trials, Chrubasik 1996 used a standardized dosage of 50 mg harpagoside per day or 2400 mg of the crude extract; Chrubasik 1999 used daily dosages of the proprietary extract WS 1531 at 600 and 1200 mg of the crude herb, which was the equivalent of 50 and 100 mg harpagoside; and Chrubasik 2003 used a proprietary extract of Doloteffin, containing a daily dose of 60 mg harpagoside, or 12.5 mg rofecoxib (Vioxx®).

In the white willow bark extract trials, Chrubasik 2000 utilized an extract containing 0.153 mg salicin per mg and made comparisons between daily doses of 120 mg salicin, 240 mg salicin and a matched placebo; Chrubasik 2001a used S. alba containing a daily dose of 240 mg salicin and compared it to 12.5 mg rofecoxib; and Krivoy 2001 used a daily dose of S. alba containing 240 mg salicin compared to placebo and 100 mg acetylsalicylate.
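
As a worked illustration of the standardization figure for Chrubasik 2000, the amount of extract needed to deliver a given salicin dose follows from dividing the dose by the salicin content per mg of extract. The function name is hypothetical, and the resulting extract masses are derived by this arithmetic rather than reported in the review.

```python
SALICIN_PER_MG_EXTRACT = 0.153  # mg salicin per mg extract (Chrubasik 2000)


def extract_mass_for_salicin(salicin_mg, salicin_per_mg=SALICIN_PER_MG_EXTRACT):
    """Return the extract mass (mg) that contains `salicin_mg` of salicin."""
    if salicin_per_mg <= 0:
        raise ValueError("salicin content must be positive")
    return salicin_mg / salicin_per_mg


for dose in (120, 240):
    mass = extract_mass_for_salicin(dose)
    print(f"{dose} mg salicin -> about {mass:.0f} mg extract")
```

By this calculation the 120 mg and 240 mg salicin doses correspond to roughly 784 mg and 1569 mg of extract per day.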

The five trials of topical C. frutescens preparations used: a topical plaster application containing 11 mg of capsaicinoids per plaster (Keitel 2001); a plaster containing an ethanolic extract of cayenne pepper standardized to 22 µg/cm² of capsaicinoids (Frerick 2003); a gel called Cremor Capsici Compositus FNA (CCC), which contains 100 mg of Capsicum Oleoresin (BPC), 10 g of glycol salicylate, 1 g of methylnicotinate, and a combined 1 g of histamine hydrochloride, sorbitol, methylparahydroxybenzoate, triethanolamine, lanette wax, stearic acid, and purified water (Stam 2001); a gel called Rado‐Salil, containing 17.64 mg acetylsalicylate, 26.47 mg methylsalicylate, 8.82 mg glycosalicylate, 8.82 mg salicylic acid, 4.41 mg camphor, 55.14 mg menthol and 15.44 mg Capsicum Oleoresin per gram (Ginsberg 1987); and a cream called Finalgon CPD Warmecreme, containing 2.2 to 2.6 g soft extract of capsici fructus acer corresponding to 53 mg capsaicin (0.05%) (Chrubasik 2010).

In the trial using Symphytum officinale, Giannetti 2010 used an ointment called Kytta‐Salbe containing 35 g 99% reduced Rad symphyti fluid extract per 100 g.

The S. chilensis trial used a gel containing plant extract diluted in propylene glycol and added at a proportion of 5% to carbomer gel, corresponding to active substances in 5 g of dry raw material (da Silva 2010).

The trial that applied lavender oil using acupressure used 3% Lavandula angustifolia essential oil in grape seed oil.

Twelve of the 14 included studies reported information regarding adverse events associated with the study medication. We have reported details of the included trials in the Characteristics of included studies.

We assessed the included trials for any potential conflicts of interest by looking at funding sources (public vs. private) and whether a trial author was employed by a private pharmaceutical, nutraceutical, or herbal medicine manufacturer. Three trials reported no conflict of interest (Chrubasik 1999; Giannetti 2010; Ginsberg 1987). However, we deemed a conflict of interest possible in Giannetti 2010, as several trial authors were employed by pharmaceutical companies. The authors of eight trials had potential conflicts of interest (Chrubasik 1996; Chrubasik 2000; Chrubasik 2003; Chrubasik 2010; Frerick 2003; Giannetti 2010; Keitel 2001; Stam 2001). In five trials, an author was employed by a pharmaceutical company (Chrubasik 2010; Frerick 2003; Giannetti 2010; Keitel 2001; Stam 2001), one trial was funded by a professional academy (Chrubasik 2000), one trial was funded by a pharmaceutical company (Chrubasik 2003), and for one trial, the experimental herbal medicine was supplied by a company (Chrubasik 1996). The remaining trial was funded by an oil company; we could not determine whether this was a conflict of interest (da Silva 2010). In the final three trials (Chrubasik 2001a; Krivoy 2001; Yip 2004), we considered conflicts of interest unlikely.

Risk of bias in included studies

Two review authors (HNO and JJG) assessed the methodological quality criteria in full for all included papers. Agreement between the review authors was over 98%. The mean score for methodological quality assessment criteria (Furlan 2009) of all included studies was 7.2, with a median score of 7.5 and a range of four to ten. Using a cut‐off point of six fulfilled criteria out of 12, 11 trials (79%) were at low risk of bias (Chrubasik 1996; Chrubasik 1999; Chrubasik 2000; Chrubasik 2001a; Chrubasik 2003; Chrubasik 2010; Frerick 2003; Giannetti 2010; Keitel 2001; Krivoy 2001; Stam 2001). The main methodological shortcomings of the H. procumbens trials included a lack of reporting of allocation concealment, compliance rates, controls for co‐interventions and acceptability of withdrawal or drop‐out rates during the follow‐up period. Of the included S. alba trials, one was an open‐label trial and the other two did not report allocation concealment, compliance rates, controls for co‐interventions, or the acceptability of withdrawal or drop‐out rates during the follow‐up period. The capsicum trial by Stam 2001 was at low risk of bias. The other capsicum trials (Frerick 2003; Keitel 2001; Ginsberg 1987; Chrubasik 2010) failed to report the type of randomization, allocation concealment, similarity of baselines, outcome assessor, investigator and participant blinding, comparability of co‐interventions, and acceptability of compliance. The lavender (Yip 2004) and Brazilian arnica (da Silva 2010) trials were at high risk of bias, meeting less than half of the criteria. The comfrey trial failed to report randomization and treatment allocation, blinding of the outcome assessor, and similarity of the groups at baseline. The risk of bias assessment of each included trial is given in Figure 1.


Summary of risk of bias for each of the included trials.


GRADE

We applied the GRADE criteria to the included trials as recommended in Higgins 2011. Four treatments (Capsicum cream, Capsicum plaster, S. alba, and devil's claw) had several RCTs investigating their use, and we subsequently assessed them on all GRADE criteria. We assessed the remaining six treatments as having moderate to very low quality of evidence, as there was only one included RCT for each treatment.

We downgraded the evidence from three trials to low quality evidence due to limitations in study design (da Silva 2010; Ginsberg 1987; Yip 2004). For two of these treatments (lavender and Brazilian arnica) there was evidence from only one RCT; we therefore downgraded the evidence to very low quality due to insufficient data and having fewer than 400 included participants. Giannetti 2010 had no other downgrades aside from being a single study with fewer than 400 included participants, resulting in a grade of low quality. The trials comparing Capsicum cream or plaster to placebo for chronic LBP (Chrubasik 2010; Frerick 2003; Keitel 2001) were analyzed together and determined to provide moderate quality evidence for this treatment. Ginsberg 1987 compared capsicum to placebo in acute LBP and was downgraded for study limitations and a small sample size. Stam 2001 was the only article examining Capsicum cream compared to a homeopathic gel; therefore we graded this evidence as very low quality due to a small sample size and low patient compliance. We downgraded the evidence from the trials comparing H. procumbens to placebo (Chrubasik 1996; Chrubasik 1999) for imprecise data and a sample size of fewer than 400, resulting in an overall grade of low quality. Chrubasik 2003, which compared H. procumbens to rofecoxib, was the only article to examine this comparison, and thus we downgraded the evidence due to study limitations (problems with allocation concealment and compliance) and a small sample size. Two trials assessed S. alba compared to placebo (Chrubasik 2000; Krivoy 2001); we rated this evidence as moderate quality due to a small sample size, potential selection bias, and differences in baseline characteristics. Finally, we rated the evidence from the single trial comparing S. alba to rofecoxib (Chrubasik 2001a) as very low quality due to indirectness, imprecision and a small sample size.

Clinical relevance

Four trials met all five clinical relevance criteria (Chrubasik 1996; Chrubasik 2000; Ginsberg 1987; Yip 2004). Of the trials testing H. procumbens, two trials did not meet items four and five (Chrubasik 1999; Chrubasik 2003). Of the S. alba trials, one did not meet item one (Krivoy 2001), one did not meet items four and five (Chrubasik 2001a) and it was not possible to tell if one trial fulfilled items four and five or not (Krivoy 2001). Of the C. frutescens trials, three did not meet item one (Chrubasik 2010; Ginsberg 1987; Keitel 2001), one did not meet item three (Chrubasik 2010), two did not meet items four and five (Frerick 2003; Stam 2001), and for Keitel 2001, it was not possible to tell if items four and five were met or not. Giannetti 2010 did not meet items 1 to 3, while da Silva 2010 did not meet item three and it was not possible to tell if it met item 5.

Reporting (CONSORT)

We assessed reporting in the published articles using the CONSORT statement and the CONSORT statement for herbal interventions. On average, the included trials had information on 45.3% of the CONSORT items, with a range from 19.4% (Ginsberg 1987) to 59.5% (Chrubasik 2000; Stam 2001). Items missing in over half of the included trials included a description of trial design, any changes to methods or outcomes after trial commencement, how sample size was determined, explanation of interim analyses and stopping guidelines, type of randomization, method used to implement the random allocation sequence (including who generated it), methods for subgroup or adjusted analyses, dates defining periods of recruitment and follow‐up, why the trial was stopped, trial limitations and sources of bias, generalizability, registration number and name of trial registry, where the full protocol can be accessed, and sources of funding and support. Using the arbitrary cut‐off of 50% of items, the average reporting in these trials was poor. However, eight trials had good completeness of reporting (Chrubasik 1999; Chrubasik 2000; Chrubasik 2001a; Chrubasik 2003; Chrubasik 2010; Giannetti 2010; Stam 2001; Yip 2004).

Also, we assessed reporting using the CONSORT statement for herbal interventions. Using these guidelines, the average trial included information on 45.4% of the checklist items, with a range from 27.3% (Ginsberg 1987) to 58.2% (Giannetti 2010). Information that was reported in under half of the trials included the Latin binomial for the herbal medicine product, the part of the plant used, the authority and family name of all herbal ingredients, the name of the manufacturer of the product, the type and concentration of extraction solvent used, the drug to extract ratio, the method of authentication of raw material, whether a voucher specimen was retained and where it is stored, how the duration of drug administration was determined, weight amount of all known herbal product constituents, qualitative testing of product, a description of the practitioners, concomitant herbal medicine use, and discussion of results in relation to other available products. Using the arbitrary cut‐off of 50% of items, the average reporting in these studies was poor. However, five trials had good completeness of reporting (Chrubasik 1996; Chrubasik 2000; Chrubasik 2003; Chrubasik 2010; Giannetti 2010).

Effects of interventions

See:

  • Summary of findings table 1 (main comparison): Brazilian arnica extract compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

  • Summary of findings table 2: Topical capsaicin cream or plaster compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

  • Summary of findings table 3: Topical capsaicin cream compared with placebo for patients with acute non‐specific LBP

  • Summary of findings table 4: H. procumbens compared to placebo for non‐specific chronic back pain

  • Summary of findings table 5: H. procumbens extract compared to Vioxx® for non‐specific chronic LBP

  • Summary of findings table 6: Willow bark extract compared to placebo for non‐specific chronic LBP

  • Summary of findings table 7: Willow bark extract compared to rofecoxib for non‐specific chronic LBP

  • Summary of findings table 8: Comfrey root extract compared to placebo for acute lower and upper back non‐specific pain

  • Summary of findings table 9: Lavender oil acupressure massage and acupoint stimulation compared to usual treatment for acute non‐specific LBP

  • Summary of findings table 10: Spiroflor SRL compared to CCC for chronic non‐specific LBP

1a) H. procumbens (devil's claw) versus placebo

The two included trials testing H. procumbens enrolled participants suffering from acute exacerbations of chronic LBP lasting longer than six months (summary of findings Table 4).

Chronic LBP
50 mg Harpagoside dose

Two four‐week trials, which included 315 participants, tested extracts of H. procumbens standardized to 50 mg harpagoside (H) per day versus placebo (Chrubasik 1996; Chrubasik 1999). Both trials found a significant increase in the number of pain‐free patients in the 50 mg H group (9% to 17%) versus placebo (2% to 5%). One trial found that for participants taking 50 mg H, the percentage with no pain or mild LBP increased over the four‐week period (from 2% in week 1 to 24% in week 4), whereas the percentage with unbearable or severe pain decreased over the four weeks (from 59% in week 1 to 35% in week 4; Chrubasik 1999). Tramadol consumption decreased more in both trials in the group that received 50 mg H than in the group that received placebo. However, this decrease did not reach statistical significance in Chrubasik 1999, and Chrubasik 1996 did not perform a statistical test on this measure. Both trials used the Arhus Index. The overall Arhus score improved by 21% in both the 50 mg H group and the placebo group, with no significant difference between groups. The pain subscale was significantly improved in favour of the 50 mg H group in both trials (median change for those with current LBP of 43%, Chrubasik 1999; median change of 34%, Chrubasik 1996), which was a greater improvement than that of the group that received 100 mg H in one trial (median change for those with current LBP of 37%, Chrubasik 1999).

Based on this low quality evidence, a daily dose of 50 mg harpagoside in an aqueous extract of H. procumbens may reduce pain more than placebo in the short‐term, in patients with acute episodes of chronic non‐specific LBP. Long‐term treatment data are not yet available.

100 mg Harpagoside dose

Chrubasik 1999, a four‐week trial which included 197 participants, tested H. procumbens standardized to 100 mg harpagoside (H) per day versus placebo. The number of patients who were pain‐free for at least five days in the fourth week of treatment was significantly higher in the 100 mg H group (N = 10) than in either the placebo (N = 3) or lower dose (50 mg H) groups (N = 6). Half of the pain‐free patients in the 100 mg H group had a neurological deficit at the start of the trial. The changes from baseline in the overall Arhus Index, the pain index, invalid index and physical impairment index did not differ between the three groups. The percentage of patients with no or mild LBP increased over the four‐week period, whereas the percentage with unbearable or severe pain decreased.

Therefore, there is low quality evidence that a daily dose of 100 mg harpagoside in an aqueous extract of H. procumbens may lead to a greater number of patients who are pain‐free for at least five days, in the fourth week of treatment of acute episodes of chronic non‐specific LBP. Superiority of the higher dose has not been shown.

1b) S. alba versus placebo

We included two trials which enrolled participants suffering from acute exacerbations of chronic LBP lasting longer than six months (summary of findings Table 6).

Chronic LBP
120 mg salicin dose

A four‐week trial, including 210 participants, tested two doses of S. alba, standardized to either 120 mg or 240 mg salicin (S) per day, against placebo (N = 70 for each group; Chrubasik 2000). The number of patients who were pain‐free for at least five days in the fourth week of treatment increased from baseline in the placebo group (N = 4), the 120 mg salicin group (N = 15) and the 240 mg salicin group (N = 27), with the trend for dose being significant. The number of patients requiring relief medication (Tramadol) during each week decreased to 33 during week four for the placebo group, 10 for the 120 mg salicin group and three for the 240 mg salicin group, with the trend for dose being significant. The total Arhus Index, pain index, invalid index, and physical impairment index did not change from baseline for the placebo group but improved in the groups receiving either 120 mg or 240 mg salicin. The trend for dose was significant, with the group receiving 240 mg salicin showing more improvement in the total Arhus Index score and the pain index than the group receiving 120 mg salicin.

There is moderate quality evidence that a daily dose of 120 mg salicin from an extract of S. alba results in more pain‐free patients in the short‐term for individuals with acute episodes of chronic non‐specific LBP.

240 mg salicin dose

Two trials, which included 261 patients, tested 240 mg salicin (Chrubasik 2000; Krivoy 2001). Results for the Chrubasik 2000 trial are reported above. In summary, for the 240 mg salicin per day group, there were more participants who were pain‐free for five days during the fourth week of treatment, and fewer patients required relief medication. There was a trend of greater improvements with higher dose for all outcomes and significant differences between the groups receiving 120 mg and 240 mg salicin for the total Arhus Index score and the pain index. The additional trial by Krivoy 2001, which was designed to test the effect of S. alba extract on platelet aggregation, did not measure clinically relevant outcomes. Although the trial authors stated that fewer patients in the group receiving 240 mg salicin required rescue medication (i.e. Tramadol) than in the placebo group, they did not provide any data.

Based on moderate quality evidence a daily dose of 240 mg salicin from an extract of S. alba probably reduces pain more than either placebo or a daily dose of 120 mg of salicin in the short term for individuals with acute episodes of chronic non‐specific LBP.

1c) C. frutescens versus placebo

We included four trials: one that enrolled participants with acute LBP (Ginsberg 1987), though the actual duration of LBP was not described; two trials with participants with chronic LBP that had lasted longer than three months (Frerick 2003; Keitel 2001); and one trial which included participants with chronic soft tissue pain (with a subset of patients experiencing chronic back pain) and did not describe duration of pain (Chrubasik 2010; summary of findings Table 2; summary of findings Table 3).

Acute LBP
Cream

Ginsberg 1987 gave 40 participants with acute mechanical LBP either a cream called Rado‐Salil, containing salicylate and capsicum (N = 20) or a placebo cream containing bergamot and lavender (N = 20) for 14 days. At day three, there was an improvement in pain score in the Rado‐Salil group of almost 2 cm on the VAS, which was significantly better than the placebo group. By day 14, the improvement increased to 3.79 cm, which was also significantly greater than the placebo group. In addition, both patients and physicians rated the effect of Rado‐Salil more favourably than the placebo group rated the effect of their cream.

Chronic LBP
Cream

Chrubasik 2010 included 281 participants suffering from chronic non‐specific soft‐tissue pain who were randomly allocated to either a placebo cream group (N = 141) or a Capsicum cream group (N = 140) for 21 days. A reduction in pain by at least 30% was achieved in 75.0% of the Capsicum group and 40.9% of the placebo group. A reduction in pain by at least 50% was achieved in 50.0% of the Capsicum group and 28.8% of the placebo group. The median relative pain sum score improvement was 48.9% in the Capsicum group and 22.5% in the placebo group. The capsicum treatment was rated as either "excellent" or "good" by patients in 59.3% of cases compared to 21.9% for the placebo group. The absolute number of days where patients reported an analgesic effect were > 70% among the Capsicum group and below 30% in the placebo group. For the majority of patients, the maximum effect was reached within two hours after application, and in 50% of patients the effect persisted for two to four hours.

Plaster

Keitel 2001 included 154 participants with acute episodes of chronic non‐specific LBP, who were randomly allocated to either a placebo plaster group (N = 77) or a Capsicum plaster group (N = 77) for three weeks. A reduction in pain by at least 30% was achieved in 60.9% of the Capsicum group and 42.1% of the placebo group. A reduction in pain by at least 50% was reported in 35.1% of the Capsicum group and 17.1% of the placebo group. The total Arhus score improved significantly more in the group using Capsicum (38.5%) than in the group using placebo (28%). Physician global ratings of efficacy were considered "excellent" or "good" in 75.7% of those using Capsicum and 47.4% of those using the placebo. After treatment, 13.5% of participants using Capsicum and 6.6% using placebo were symptom‐free. Compliance was 90.6% in the group using Capsicum and 88.1% in the group using placebo.

Frerick 2003 enrolled 320 participants suffering from chronic non‐specific LBP, who were randomly allocated to either a placebo plaster group (N = 160) or a Capsicum plaster group (N = 160) for 21 days. The total Arhus Index score decreased significantly more in the group using Capsicum (33%) than in the group using placebo (22%). The Arhus compound pain score decreased significantly more in the group using Capsicum (42%) than in those using placebo (31%). A reduction in pain by at least 30% was achieved in 67% of those using Capsicum and 49% of those using placebo, and a reduction in pain by at least 50% was seen in 45% and 24%, respectively. The Arhus subscale for physical impairment also decreased significantly more in the Capsicum group (21%) than in the placebo group (10%). Similar results were found for the disability subscale (35% vs. 22%, respectively). The capsicum treatment was rated as either "excellent" or "good" by investigators in 74% of cases compared to 36% for the placebo group. Compliance was reported as being "very good" or "good" in both groups (91% and 93%, respectively).

Therefore, there is moderate quality evidence that a plaster or a cream of C. frutescens reduces pain and improves function more than placebo in the short‐term for individuals with chronic non‐specific LBP. There is very low quality evidence that a cream of C. frutescens may reduce pain more than placebo in the short‐term for individuals with acute non‐specific LBP.

1d) S. officinale versus placebo

Acute LBP

Giannetti 2010 included 120 participants with acute non‐specific upper or lower back pain, and randomized participants to either comfrey ointment treatment (N = 60) or placebo ointment (N = 60) for five days, with three applications per day. Over the course of the treatment, pain intensity by VAS decreased 95.2% in the comfrey group, a significant difference from the placebo group (37.8%). Reported back pain at rest decreased 97.4% in the comfrey group compared to 39.6% in the placebo group. Pressure algometry values in the trigger point increased 125% in the comfrey group and 71.8% in the placebo group. Global assessments of efficacy by patients and investigators were all superior in the comfrey group (good or excellent for 80%) compared to the placebo group (good or excellent for 18.4%) (summary of findings Table 8).

There is low quality evidence that a cream of S. officinale reduces pain more than placebo in the short‐term for individuals with acute episodes of upper or lower back pain.

1e) S. chilensis versus placebo

da Silva 2010 randomized 20 patients seeking treatment for lumbago to either treatment with S. chilensis gel (N = 10) or placebo gel (N = 10) for 15 days. Each gel was applied twice per day. The S. chilensis treatment group reported a significant change in pain, as assessed by VAS at the end of the treatment period compared to baseline values. Also they experienced a significant increase in lumbar flexibility. Participants treated with placebo did not experience significant changes in perception of pain or lumbar flexibility over the course of treatment. The placebo and treatment groups were not statistically compared to one another.

Therefore, based on very low quality evidence, S. chilensis gel may reduce pain and improve lumbar flexibility in the short term for people with lumbago (summary of findings Table for the main comparison).

2a) H. procumbens versus rofecoxib

Chronic LBP

Chrubasik 2003 included 88 patients with acute episodes of chronic non‐specific LBP in a six‐week trial, and tested H. procumbens standardized to 60 mg harpagoside per day (N = 44) versus 12.5 mg rofecoxib per day (N = 44). There was no significant difference between the 60 mg H. procumbens group (10/44) and the rofecoxib group (5/44) in the number of patients who were pain‐free for at least five days in the sixth week of treatment. The number of patients with improvements in pain scores did not differ between the two groups. This trial may lack power due to its small sample size. The number of patients using rescue medication (Tramadol) decreased from baseline in both groups, but did not differ between groups at week six. At the end of six weeks, there were no differences between groups for current LBP, scores on the Arhus pain index, invalid index, functional index, or the total score for the Arhus Index. The health assessment questionnaire (HAQ) improved in both groups during the six‐week period, with no differences between groups.
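
The remark about limited power can be illustrated with the pain‐free counts quoted above (10/44 versus 5/44). The sketch below computes the risk difference with a Wald‐type 95% confidence interval; this interval method and the function name `risk_difference_ci` are assumptions made for illustration, not the analysis used by the trial authors or by this review.

```python
import math


def risk_difference_ci(events_a: int, n_a: int, events_b: int, n_b: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Risk difference (group a minus group b) with a Wald-type interval.

    Hypothetical helper; z = 1.96 corresponds to a two-sided 95% interval.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Pain-free patients in week six: H. procumbens 10/44, rofecoxib 5/44.
diff, lower, upper = risk_difference_ci(10, 44, 5, 44)
print(f"difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With 44 participants per arm the interval is wide and spans zero, which is consistent with the statement that the trial may have been too small to detect a difference.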

Therefore, based on current evidence, it is unclear whether a daily dose of 60 mg harpagoside in an aqueous extract of H. procumbens differs in effectiveness compared to a daily dose of 12.5 mg rofecoxib in the short‐term for individuals with acute episodes of chronic non‐specific LBP (very low quality evidence; summary of findings Table 5).

2b) S. alba versus rofecoxib

Chronic LBP

Chrubasik 2001a included 228 participants with acute episodes of chronic non‐specific LBP in a four‐week trial, and tested S. alba standardized to provide a daily dose of 240 mg salicin against 12.5 mg per day of rofecoxib. Both the rofecoxib and the 240 mg salicin groups improved by 44% on the pain scale, the Arhus invalid index, pain index, and physical impairment index. The percentage of patients requiring NSAIDs, Tramadol, or both was 10% for the S. alba group and 13% for the rofecoxib group. Approximately 90% of physicians and patients rated either treatment as effective and close to 100% rated either treatment as acceptable.

It is unclear, based on current evidence, whether a daily dose of 240 mg salicin (of an extract of S. alba) is more effective than a daily dose of 12.5 mg rofecoxib in the short‐term for people with acute episodes of chronic non‐specific LBP (very low quality evidence; summary of findings Table 7).

3) C. frutescens versus homeopathic treatment

Acute and chronic LBP

Stam 2001 included 161 participants, who were a mixed group of patients with new acute LBP and acute episodes of chronic LBP. Participants were randomly allocated to either a Spiroflor SRL homeopathic gel (SRL) group (N = 83) or a CCC (Capsici Oleoresin gel) group (N = 78) for a period of seven days. Each of the gels was applied at 3 g/day. Both groups showed a significant reduction in pain on the VAS scale, with a decrease of 38.2 mm in the SRL group and 36.6 mm in the CCC group. In the SRL group, 50% of participants reported that treatment was 80% effective and 18% reported total (100%) effectiveness. In the CCC group, this was 55% and 15%, respectively. There were also no differences in the proportion of participants using paracetamol, the proportion of participants still unable to work at the end of the study, and overall efficacy.

Based on current evidence, it is unclear whether Spiroflor SRL homeopathic gel and CCC gel differ in efficacy (very low quality evidence; summary of findings Table 10).

4) Lavender versus conventional treatment

Acute LBP

Yip 2004 included 61 participants with non‐specific sub‐acute LBP for most days in the previous four weeks. The trial used lavender essential oil, applied by acupressure eight times over a three‐week period, to treat 32 participants. "Conventional treatment", which is not described in the text, was used to treat the remaining 29 participants. One week after the end of the study, the intervention group reported significantly lower pain ratings (39% reduction) than the control group (no change). Both groups reported similar decreases in pain duration. Improvements were also seen in the lavender group in walking time and fingertip‐to‐floor distance, but these changes were functionally insignificant (9% and 4% changes, respectively). Acceptance of the intervention was rated as "satisfied" or "strongly satisfied" by 93% in the lavender‐treatment group. Trial authors did not report acceptance of the control group.

Therefore, based on current evidence it is unclear whether lavender essential oil applied via acupressure treatments significantly reduces perception of pain among people reporting non‐specific sub‐acute LBP (very low quality evidence; summary of findings Table 9).

Discussion

Quality of the evidence

We included 14 RCTs in this review. Three trials examined H. procumbens (devil's claw), three trials examined S. alba (white willow bark), five assessed C. frutescens (cayenne), one examined S. officinale (comfrey), one Lavandula angustifolia (lavender), and one S. chilensis (Brazilian arnica). Although reporting quality in the included trials was poor, risk of bias is not directly related to reporting quality (Huwiler‐Muntener 2002). Therefore, the risk of bias of poorly reported trials remains unclear. We attempted to contact all trial authors to clarify aspects of trials that were inadequately reported in the published manuscripts but did not receive replies from several corresponding trial authors.

Efficacy

The results of the included trials suggest that specific herbal medicines may be effective for short‐term (four to six weeks) improvement in pain and functional status for individuals with acute episodes of chronic non‐specific LBP. Ten trials were placebo‐controlled while four trials were comparative. There is insufficient evidence to make definitive conclusions regarding those trials comparing herbal medicine interventions to standard drugs. Two of the comparative trials used Vioxx® as a comparator (Chrubasik 2001a; Chrubasik 2003), another used a homeopathic topical preparation (Stam 2001), and the last compared to "conventional treatment" (Yip 2004). Given the severe adverse effects of Vioxx® and its subsequent removal from the retail market, additional trials testing these herbal medicines against standard drugs (acetaminophen, NSAIDs) are needed.

Although the majority of these trials were considered to have homogenous LBP populations, we were unable to pool and analyse trial data due to lack of reporting of sufficient raw data. Therefore, we could not provide quantitative evidence of efficacy of the six individual herbal medicines used in these trials. Instead, we used the GRADE criteria to synthesize the data. The included trials did not assess long‐term efficacy (e.g. return to work, recurrence), which therefore remains to be determined.
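
The footnotes to the summary of findings tables state that evidence was downgraded by one level where fewer than 400 participants contributed to a comparison. That single step can be sketched as follows; the names `GRADE_LEVELS` and `downgrade_for_sample_size` are hypothetical, and the sketch covers only this sample‐size rule, not the full GRADE assessment (risk of bias, inconsistency and other domains are not modelled).

```python
# GRADE quality levels ordered from lowest to highest.
GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def downgrade_for_sample_size(level: str, n_participants: int,
                              threshold: int = 400) -> str:
    """Drop one GRADE level when fewer than `threshold` participants contribute.

    Hypothetical helper; the threshold of 400 is taken from the table footnotes.
    """
    index = GRADE_LEVELS.index(level)
    if n_participants < threshold and index > 0:
        index -= 1  # one-level downgrade; cannot go below "very low"
    return GRADE_LEVELS[index]


# Starting levels and participant counts as stated in the table footnotes.
print(downgrade_for_sample_size("moderate", 315))  # H. procumbens vs placebo
print(downgrade_for_sample_size("high", 261))      # willow bark vs placebo
print(downgrade_for_sample_size("low", 40))        # capsicum cream, acute LBP
```

A comparison with 400 or more participants, such as the 755 participants in the chronic LBP capsicum trials, is left at its starting level by this rule.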

Given the overwhelming evidence that conflicts of interest may bias trial results, we assessed the potential for conflict of interest in these trials. We determined that a conflict of interest was a possibility in eight included trials. It is not possible to determine the specific influence of these potential conflicts on results of this Cochrane Review.

This review highlights research that, when combined, indicates that there are at least four herbal medicines that have low to moderate quality of evidence for the short‐term treatment of acute episodes of non‐specific LBP. These interventions are reported to have very few side effects, but more research is required to extensively explore the safety of these herbals. The adverse effects appear to be primarily confined to mild, transient gastrointestinal complaints and skin irritations. Large observational studies are needed to explore the safety of these herbals relative to standard medications such as acetaminophen and NSAIDs.

This review has several strengths, including the comprehensive search strategy, the inclusion of only the highest quality trial design and use of suggested methods for systematic reviews of interventions for LBP (Furlan 2009). One drawback of this review is that many included trials were authored by the same trialists (Chrubasik and colleagues). It is possible that the results may be systematically biased in some way. It is imperative that trials of these herbal medicines be repeated by other research groups and in different settings.

The qualitative analysis used here may be regarded as a strength and drawback. That is, though it would have been incorrect to statistically combine data from heterogeneous trials, the qualitative method used does not provide information on the size of the treatment effect. Without this quantitative data it is hard to determine whether these herbal interventions cause clinically significant effects on patients suffering from non‐specific LBP. Quantitative analyses were precluded by incomplete reporting of data in these trials. Evidence suggests that reporting of clinical trials, irrespective of the intervention, is poor (e.g. Moher 2001). Specifically, RCTs of herbal interventions report less than half of the required information as outlined by the CONSORT statement (Gagnier 2006a). An extension of the CONSORT statement for the reporting of RCTs of herbal medicine interventions has been developed and should be referred to when reporting such trials (Gagnier 2006b). These guidelines will aid trialists in planning, implementing, and reporting controlled clinical trials.

Another point of note arises from the known heterogeneity of herbal medicine products. That is, herbal medicines often vary in the type of preparation (liquid versus dried versus topical) and thus in the amount of chemical constituents per dose. These variations influence the pharmacokinetics and therefore the relative efficacy of these products.

Summary of main results

We included 14 trials (2050 participants) in this Cochrane Review. Daily doses of H. procumbens (devil's claw) standardized to 50 mg or 100 mg harpagoside may be better than placebo for short‐term improvements in pain and may reduce use of rescue medication (two trials, 315 participants, low quality evidence). Another H. procumbens trial demonstrated relative equivalence to 12.5 mg per day of rofecoxib (Vioxx®) (one trial, 88 participants, very low quality evidence). Daily doses of S. alba (white willow bark) standardized to 120 mg or 240 mg salicin are probably better than placebo for short‐term improvements in pain and use of rescue medication (two trials, 261 participants, moderate quality evidence). An additional trial demonstrated relative equivalence to 12.5 mg per day of rofecoxib (one trial, 228 participants, very low quality evidence). S. alba minimally affected platelet thrombosis versus a cardioprotective dose of acetylsalicylate (one trial, 51 participants). C. frutescens cream and C. frutescens plaster each produced more favourable results than placebo in people with chronic LBP (three trials, 755 participants, moderate quality evidence). Also, C. frutescens cream was preferable to placebo in people with acute LBP (one trial, 40 participants, very low quality evidence). Another trial found equivalence of C. frutescens cream to a homeopathic ointment (one trial, 161 participants, very low quality evidence). S. officinale L. (comfrey root extract) applied three times daily may be better than placebo ointment for short‐term improvements in pain (one trial, 120 participants, low quality evidence). One trial of S. chilensis M. (Brazilian arnica) found a reduction in perception of pain and improved flexibility with application of Brazilian arnica‐containing gel twice daily as compared to placebo gel (one trial, 20 participants, very low quality evidence). Aromatic lavender essential oil applied by acupressure reduced subjective pain intensity and improved lateral spine flexion and walking time compared with participants who were not offered treatment (one trial, 61 participants, very low quality evidence). There were no significant adverse events noted within the trials included in this Cochrane Review.

Figure 1. Summary of risk of bias for each of the included trials.

Summary of findings for the main comparison. Summary of findings table 1: Brazilian arnica extract compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Brazilian arnica extract compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Patient or population: patients with back pain

Settings: outpatient clinic

Intervention: extract of Brazilian arnica

Comparison: placebo

Outcomes: Pain reduction based on the pain VAS instrument (0‐100 scale)

No of participants (trials): 20 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹

Comments: Very small sample size, with only N = 10 in the treatment group. This trial found that topical application of Brazilian arnica reduced the perception of pain and increased flexibility in the treated group compared to baseline values in that group. Unknown if acute or chronic LBP.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Selection bias was high to unclear, performance bias was low risk to unclear risk, with other attributes being low risk.

Summary of findings 2. Summary of findings table 2: Topical capsaicin cream or plaster compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Topical capsaicin cream or plaster compared to placebo for patients with non‐specific chronic back pain or soft tissue pain

Patient or population: patients with chronic LBP or soft tissue pain

Settings: Outpatient clinic

Intervention: topical capsicum cream or plaster

Comparison: placebo

Outcomes: Pain perception according to the pain VAS scale (0‐10)

No of participants (trials): 755 (three trials)

Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate¹

Comments: All three trials found a statistically significant difference between the capsaicin intervention and placebo. In three trials minor adverse effects were noted in the treatment groups, requiring no specific follow‐up treatments.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ All three trials exhibited low to unclear risk in selection bias, performance bias and attrition bias. One trial was at high risk for selective reporting.

Summary of findings 3. Summary of findings table 3: Topical capsaicin cream compared with placebo for patients with acute non‐specific LBP

Topical capsaicin cream compared with placebo for patients with acute non‐specific LBP

Patient or population: patients with acute mechanical LBP

Settings: outpatient clinic

Intervention: Rado‐Salil ointment

Comparison: placebo

Outcomes: Pain evaluation on a 10 cm linear scale

No of participants (trials): 40 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²

Comments: Pain improvements were significantly greater in the capsicum cream group up to day 14. Adverse events: pruritus, one in the placebo group and one in the Rado‐Salil group; local erythema and burning, three in the Rado‐Salil group.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Exhibited unclear risk for selection bias as well as unclear baseline similarities. Performance bias was low risk, as was attrition bias, but it was high risk for incomplete outcome data.

² As under 400 participants were included, evidence was downgraded to very low from low.

Summary of findings 4. Summary of findings table 4: H. procumbens compared to placebo for non‐specific chronic back pain

H. procumbens compared to placebo for non‐specific chronic back pain

Patient or population: patients with chronic back pain

Settings: outpatient clinic

Intervention: H. procumbens extract

Comparison: placebo

Outcomes: Arhus pain index (scale 0‐130)

No of participants (trials): 315 (two trials)

Quality of the evidence (GRADE): ⊕⊕⊝⊝ low¹,²

Comments: In one trial a 50 mg dose of H. procumbens was used, and in the second trial 50 mg and 100 mg doses were used, with both trials showing a significantly improved pain score over placebo.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Both included trials exhibited low risk of bias regarding selection bias with one trial at unclear risk of bias. Performance bias was at low risk of bias, as was attrition bias with one trial at high risk of bias for incomplete outcome data.

² Two trials included under 400 participants and we downgraded the evidence to low from moderate.

Summary of findings 5. Summary of findings table 5: H. procumbens extract compared to Vioxx® for non‐specific chronic LBP

H. procumbens extract compared to Vioxx® for non‐specific chronic LBP

Patient or population: patients with chronic LBP

Settings: outpatient clinic

Intervention: H. procumbens extract

Comparison: Vioxx®

Outcomes: Modified Arhus Index (scale 0‐120)

No of participants (trials): 88 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²

Comments: H. procumbens was compared to Vioxx®. Both groups showed similar pain reduction scores, with no demonstrable difference between groups. There were adverse effects noted in both groups.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ This trial was at low risk of bias for all risk of bias factors, with the exception of allocation concealment and compliance which were at unclear risk of bias.

² Downgraded to very low versus low as under 400 participants were included.

Summary of findings 6. Summary of findings table 6: Willow bark extract compared to placebo for non‐specific chronic LBP

Willow bark extract compared to placebo for non‐specific chronic LBP

Patient or population: patients with chronic LBP

Settings: outpatient clinic and public advertisement

Intervention: willow bark extract

Comparison: placebo

Outcomes: Pain VAS (scale 0‐10)

No of participants (trials): 261 (two trials)

Quality of the evidence (GRADE): ⊕⊕⊕⊝ moderate¹,²

Comments: The high dose (240 mg) treatment group showed a significant reduction in pain scores versus the low dose (120 mg) group and the placebo group. There was one severe allergic reaction related to the extract noted. One trial (N = 51) also examined the effect of the extract on platelet aggregation.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Both trials were at low to unclear risk for selection bias, low risk for performance bias with one trial exhibiting high risk in baseline characteristics similarity. Both trials were rated as an overall low risk of bias since they met our predetermined cut‐point of 50% of the criteria on which the trial methods were assessed.

² Downgraded from high to moderate as under 400 participants were included between both trials.

Summary of findings 7. Summary of findings table 7: Willow bark extract compared to rofecoxib for non‐specific chronic LBP

Willow bark extract compared to rofecoxib for non‐specific chronic LBP

Patient or population: patients with chronic LBP

Settings: outpatient clinic

Intervention: willow bark extract

Comparison: rofecoxib

Outcomes: Arhus Index (scale 0‐130); pain VAS (scale 0‐10)

No of participants (trials): 228 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹,²

Comments: There was no significant difference in the effectiveness and adverse events between the extract and rofecoxib.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Low risk for selection bias, high risk for performance bias, and high and low risk for attrition bias.

² Downgraded from low to very low as under 400 participants were included.

Summary of findings 8. Summary of findings table 8: Comfrey root extract compared to placebo for acute lower and upper back non‐specific pain

Comfrey root extract compared to placebo for acute lower and upper back non‐specific pain

Patient or population: patients with acute lower and upper back pain

Settings: outpatient setting

Intervention: comfrey root extract

Comparison: placebo

Outcomes: Pain VAS sum (decrease) on active standardized movement (mm)

No of participants (studies): 120 (one trial)

Quality of the evidence (GRADE): ⊕⊕⊝⊝ low¹,²

Comments: The root extract showed a statistically and clinically relevant reduction in acute back pain versus placebo.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Unclear risk for selection bias, low risk for both performance and attrition bias.

² Downgraded from moderate to low as under 400 participants were included.

Summary of findings 9. Summary of findings table 9: Lavender oil acupressure massage and acupoint stimulation compared to usual treatment for acute non‐specific LBP

Lavender oil acupressure massage and acupoint stimulation compared to usual treatment for acute non‐specific LBP

Patient or population: patients with acute LBP

Settings: old aged home and community centre

Intervention: lavender oil massage

Comparison: usual therapy

Outcomes: Pain VAS (0‐10 scale)

No of participants (trials): 61 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹

Comments: One week post‐study the treatment group showed a significant (P = 0.0001) reduction in VAS pain as well as improved walking time and lateral spine flexion range.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

¹ Sequence generation was at low risk of bias but allocation concealment was at high risk. Performance bias was at high and unclear risk. Co‐interventions and timing outcome assessment factors were at high risk of bias.

Summary of findings 10. Summary of findings table 10: Spiroflor SRL compared to CCC for chronic non‐specific LBP

Spiroflor SRL compared to CCC for chronic non‐specific LBP

Patient or population: patients with acute and chronic LBP

Settings: outpatient clinic

Intervention: Spiroflor SRL

Comparison: CCC

Outcomes: Pain VAS (0‐100 scale)

No of participants (trials): 161 (one trial)

Quality of the evidence (GRADE): ⊕⊝⊝⊝ very low¹

Comments: Spiroflor SRL and CCC were equally effective in treating acute LBP but the CCC group experienced greater adverse events and adverse drug reactions.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

CCC = Cremor Capsici Compositus FNA; SRL = Homeopathic combination of Symphytum officinale, Rhus toxicodendron and Ledum palustre

¹ All risk of bias factors were at low risk of bias, except patient compliance which was at high risk.

² Downgraded from low to very low as under 400 participants were included.
