
Payment methods for outpatient care facilities


Abstract

Background

Outpatient care facilities provide a variety of basic healthcare services to individuals who do not require hospitalisation or institutionalisation, and are usually the patient's first contact. The provision of outpatient care contributes to immediate and large gains in health status, and a large portion of total health expenditure goes to outpatient healthcare services. Payment method is one of the most important incentive methods applied by purchasers to guide the performance of outpatient care providers.

Objectives

To assess the impact of different payment methods on the performance of outpatient care facilities and to analyse the differences in impact of payment methods in different settings.

Search methods

We searched the Cochrane Central Register of Controlled Trials (CENTRAL), 2016, Issue 3, part of the Cochrane Library (searched 8 March 2016); MEDLINE, OvidSP (searched 8 March 2016); Embase, OvidSP (searched 24 April 2014); PubMed (NCBI) (searched 8 March 2016); Dissertations and Theses Database, ProQuest (searched 8 March 2016); Conference Proceedings Citation Index (ISI Web of Science) (searched 8 March 2016); IDEAS (searched 8 March 2016); EconLit, ProQuest (searched 8 March 2016); POPLINE, K4Health (searched 8 March 2016); China National Knowledge Infrastructure (searched 8 March 2016); Chinese Medicine Premier (searched 8 March 2016); OpenGrey (searched 8 March 2016); ClinicalTrials.gov, US National Institutes of Health (NIH) (searched 8 March 2016); World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (searched 8 March 2016); and the website of the World Bank (searched 8 March 2016).

In addition, we searched the reference lists of included studies and carried out a citation search for the included studies via ISI Web of Science to find other potentially relevant studies. We also contacted authors of the main included studies regarding any further published or unpublished work.

Selection criteria

Randomised trials, non‐randomised trials, controlled before‐after studies, interrupted time series, and repeated measures studies that compared different payment methods for outpatient health facilities. We defined outpatient care facilities in this review as facilities that provide health services to individuals who do not require hospitalisation or institutionalisation. We only included methods used to transfer funds from the purchaser of healthcare services to health facilities (including groups of individual professionals). These include global budgets, line‐item budgets, capitation, fee‐for‐service (fixed and unconstrained), pay for performance, and mixed payment. The primary outcomes were service provision outcomes, patient outcomes, healthcare provider outcomes, costs for providers, and any adverse effects.

Data collection and analysis

At least two review authors independently extracted data and assessed the risk of bias. We conducted a structured synthesis. We first categorised the comparisons and outcomes and then described the effects of different types of payment methods on different categories of outcomes. We used a fixed‐effect model for meta‐analysis within a study if a study included more than one indicator in the same category of outcomes. We used a random‐effects model for meta‐analysis across studies. If the data for meta‐analysis were not available in some studies, we calculated the median and interquartile range. We reported the risk ratio (RR) for dichotomous outcomes and the relative change for continuous outcomes.
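The two‐stage pooling described above can be sketched as follows. This is a minimal illustration, not the review authors' code: the text does not name the estimators, so inverse‐variance weighting for the fixed‐effect step and the DerSimonian‐Laird estimator for the random‐effects step are assumptions, and the study names, log risk ratios, and variances are hypothetical.

```python
import math

def fixed_effect(estimates, variances):
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)

def random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooled estimate, variance, and tau^2."""
    k = len(estimates)
    weights = [1.0 / v for v in variances]
    fixed, _ = fixed_effect(estimates, variances)
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(star, estimates)) / sum(star)
    return pooled, 1.0 / sum(star), tau2

# Hypothetical data: per study, log risk ratios and their variances
# for several indicators in the same outcome category.
studies = {
    "study_A": ([math.log(1.10), math.log(1.20)], [0.010, 0.020]),
    "study_B": ([math.log(0.95), math.log(1.05), math.log(1.00)],
                [0.015, 0.015, 0.030]),
}

# Step 1: fixed-effect pooling of indicators within each study.
within = [fixed_effect(ys, vs) for ys, vs in studies.values()]

# Step 2: random-effects pooling across studies, back-transformed to an RR.
pooled_log_rr, pooled_var, tau2 = random_effects(
    [y for y, _ in within], [v for _, v in within]
)
pooled_rr = math.exp(pooled_log_rr)
```

Pooling on the log scale keeps the risk ratio symmetric around 1; the between‐study variance `tau2` falls to zero when the study estimates agree, in which case the random‐effects result coincides with the fixed‐effect one.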

Main results

We included 21 studies of health facilities providing primary health care and mental health care. The studies were from Afghanistan, Burundi, China, Democratic Republic of Congo, Rwanda, Tanzania, the United Kingdom, and the United States. There were three kinds of payment comparisons.

1) Pay for performance (P4P) combined with some existing payment method (capitation or different kinds of input‐based payment) compared to the existing payment method

We included 18 studies in this comparison; however, we excluded five of them from the effects analysis due to high risk of bias. From the remaining 13 studies, we found that the extra P4P incentives probably slightly improved the health professionals' use of some tests and treatments (adjusted RR median = 1.095, range 1.01 to 1.17; moderate‐certainty evidence), and probably led to little or no difference in adherence to quality assurance criteria (adjusted percentage change median = ‐1.345%, range ‐8.49% to 5.8%; moderate‐certainty evidence). We also found that P4P incentives may have led to little or no difference in patients' utilisation of health services (adjusted RR median = 1.01, range 0.96 to 1.15; low‐certainty evidence) and may have led to little or no difference in the control of blood pressure or cholesterol (adjusted RR = 1.01, range 0.98 to 1.04; low‐certainty evidence).

2) Capitation combined with P4P compared to fee‐for‐service (FFS)

One study found that compared with FFS, a capitated budget combined with payment based on providers' performance on antibiotic prescriptions and patient satisfaction probably slightly reduced antibiotic prescriptions in primary health facilities (adjusted RR 0.84, 95% confidence interval 0.74 to 0.96; moderate‐certainty evidence).

3) Capitation compared to FFS

Two studies compared capitation to FFS in mental health centres in the United States. Based on these studies, the effects of capitation compared to FFS on the utilisation and costs of services were uncertain (very low‐certainty evidence).

Authors' conclusions

Our review found that if policymakers intend to apply P4P incentives to pay health facilities providing outpatient services, this intervention will probably lead to a slight improvement in health professionals' use of tests or treatments, particularly for chronic diseases. However, it may lead to little or no improvement in patients' utilisation of health services or health outcomes. When considering using P4P to improve the performance of health facilities, policymakers should carefully consider each component of their P4P design, including the choice of performance measures, the performance target, the payment frequency, whether there will be additional funding, whether the payment level is sufficient to change the behaviours of health providers, and whether the payment to facilities will be allocated to individual professionals. Unfortunately, the studies included in this review did not help to inform those considerations.

Well‐designed comparisons of different payment methods for outpatient health facilities in low‐ and middle‐income countries and studies directly comparing different designs (e.g. different payment levels) of the same payment method (e.g. P4P or FFS) are needed.

Plain language summary

Payment methods for outpatient care facilities

Review aim

The aim of this Cochrane review was to assess the effect of different payment systems for outpatient care facilities. We collected and analysed all relevant studies to answer this question and included 21 studies.

Key messages

Pay‐for‐performance systems probably have only small benefits or make little or no difference to healthcare provider behaviour or patients' use of healthcare services. We are uncertain whether they cause harm. We are uncertain about the benefits and harms of other payments systems because the research is lacking or of very low certainty.

What was studied in the review?

Many healthcare services are offered to patients through outpatient facilities rather than to inpatients in hospitals. Outpatient facilities are also known as ambulatory care facilities, and include primary healthcare centres, outpatient clinics, urgent care centres, family planning centres, mental health centres, and dental clinics.

Different systems to reimburse outpatient (ambulatory) care facilities for their services are available to governments and health insurers. These systems include:

• budget systems, where the facility is given a fixed amount of money in advance to cover expenses for a fixed period;

• capitation payment systems, where the facility is paid a fixed amount of money in advance to provide specific services to each enrolled patient for a fixed period;

• fee‐for‐service systems, where payment is based on the specific services that the healthcare facility provides;

• pay‐for‐performance systems, where payment is partly based on the performance of the facility's healthcare providers.

Different payment systems can have different effects on how healthcare facilities deliver care. These changes can be intentional or unintentional and can lead to both benefits and harms. At best, a payment system can encourage healthcare providers to offer the right healthcare services to the right patients in the best and most cost‐efficient way. However, payment systems can also lead providers to offer poor‐quality, expensive, and unnecessary care, which can ultimately have a negative impact on patients' health.

This Cochrane review assessed the effect of different payment systems for outpatient care facilities. Other Cochrane reviews have assessed the effect of different payment systems for individual healthcare professionals and for inpatient facilities.

Main results

We found 21 relevant studies from the United Kingdom, the United States, Rwanda, Burundi, Tanzania, Afghanistan, China, and Democratic Republic of Congo. Most of the studies were from primary healthcare facilities. The studies assessed capitation systems, fee‐for‐service systems, and different types of pay‐for‐performance systems.

Pay‐for‐performance systems:

• probably slightly improve providers' use of some tests and treatments;

• probably lead to little or no difference in providers' compliance with quality assurance criteria;

• may lead to little or no difference in patients' use of health services;

• may lead to little or no difference in patients' health status.

Capitation combined with a pay‐for‐performance system targeted at reducing antibiotic use probably slightly reduces antibiotic prescriptions when compared to a fee‐for‐service system.

Two studies compared capitation with fee‐for‐service systems; however, we assessed the certainty of the evidence as very low.

We did not find any relevant studies that assessed budget systems.

How up‐to‐date is this review?

We searched for studies that had been published up to March 2016.

Authors' conclusions

Implications for practice

If policymakers are considering the use of pay‐for‐performance (P4P) incentives to pay outpatient health facilities, our review found that this intervention will probably lead to a slight improvement in service delivery, such as the use of tests or treatments for controlling risk factors for chronic diseases, but may lead to little or no improvement in utilisation of health services or health outcomes. We did not find a relationship between the design components of P4P and its effectiveness due to the limited number of included studies for subgroup analysis, and the costs and effects of adding P4P to an existing payment method are uncertain. The effects of using P4P without additional resources are also uncertain.

The effects of other payment methods are uncertain due to very low certainty or a lack of evidence.

Implications for research

The majority of studies included in this review are from high‐income countries, and there is a need for well‐designed research into payment methods for outpatient health facilities in low‐ and middle‐income countries. There is also a need for more well‐designed studies to directly compare or evaluate the effects of differences in the design of P4P and other payment methods. Future research should evaluate the costs of changes in payment methods as well as the effects, since some payment interventions, like P4P, involve more resources for an increase of the payment level and entail administrative costs for performance assessment and management. Current evidence on the efficiency of P4P is scarce and inconclusive (Emmert 2011).

Summary of findings

Summary of findings for the main comparison. P4P plus some existing payment method compared with existing payment method for provision and patient outcomes

Patient or population: outpatient health facilities

Settings: United States, United Kingdom, Rwanda, Afghanistan

Intervention: P4P plus some existing payment method

Comparison: existing payment method (capitation or input‐based payment)

| Outcomes | Impact: RR for dichotomous outcomes and relative percentage change for continuous outcomes, median (range) | No of participants (studies) | Certainty of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- |
| Provision outcomes (prescription of testing or treatment, dichotomous) | The adjusted RR median = 1.095 (ranged from 1.01 to 1.17) | 3 randomised trials and 1 CBA | Moderate ⊕⊕⊕⊝ | Of 3 randomised trials, 2 were rated as unclear risk of bias, and only 1 was rated as low risk of bias. The certainty was downgraded 1 level because of limitation in study design. |
| Provision outcomes (compliance with quality criteria, continuous) | The adjusted percentage change median = ‐1.345% (ranged from ‐8.49% to 5.8%) | 2 randomised trials | Moderate ⊕⊕⊕⊝ | 2 randomised trials were rated as unclear risk of bias. The certainty was downgraded 1 level because of limitation in study design. |
| Patients' utilisation of health services (dichotomous) | The adjusted RR median = 1.01 (ranged from 0.96 to 1.15) | 3 randomised trials and 1 CBA | Low ⊕⊕⊝⊝ | 3 randomised trials were rated as unclear risk of bias. The certainty was downgraded 1 level because of limitation in study design. The heterogeneity among estimates of effect of different studies was tested, and the certainty was downgraded 1 level because of inconsistency. |
| Patients' health outcomes (dichotomous) | The adjusted RR median = 1.01 (ranged from 0.98 to 1.04) | 1 randomised trial | Low ⊕⊕⊝⊝ | This trial was rated as unclear risk of bias. In addition, only 1 study targeting small primary health clinics in the United States was included, and the certainty was downgraded 1 level because of indirectness. |
| Provider outcomes | | 0 | | |
| Costs | The P4P intervention costs were greater than usual care costs without P4P incentives by USD 86,796 in total, and USD 83 per additional referral to telephone counselling and USD 300 per additional enrollee to quit line services. | 1 randomised trial | Low ⊕⊕⊝⊝ | This trial was rated as unclear risk of bias. In addition, only 1 study targeted 1 specific health service (referral to telephone counselling for smokers) in the United States, and the certainty was downgraded 1 level because of indirectness. |
| Adverse effects | When the P4P intervention ended, there was a significant reduction in performance in the intervention group compared with the control group. | 1 randomised trial | Low ⊕⊕⊝⊝ | This trial was rated as unclear risk of bias. In addition, only 1 study targeted primary care clinics in 5 Veterans Affairs networks in the United States, and the certainty was downgraded 1 level because of indirectness. |

CBA: controlled before‐after study; P4P: pay for performance; RR: risk ratio

GRADE Working Group grades of evidence
High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.
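The "median (range)" summaries used where meta‐analysis was not possible can be computed as in the minimal sketch below. The study‐level values are hypothetical: only the endpoints 1.01 and 1.17 and the median 1.095 appear in this review, and the two middle values are invented purely so that the example reproduces that median. The interquartile range helper uses one of several quartile conventions, which is an assumption.

```python
import statistics

def median_and_range(values):
    """Return (median, minimum, maximum) of study-level effect estimates."""
    return statistics.median(values), min(values), max(values)

def interquartile_range(values):
    """Return (Q1, Q3) using the 'inclusive' quartile method (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3

# Hypothetical adjusted risk ratios from four studies; the middle two
# values are invented for illustration and are not reported in the review.
hypothetical_rrs = [1.01, 1.07, 1.12, 1.17]

med, low, high = median_and_range(hypothetical_rrs)
q1, q3 = interquartile_range(hypothetical_rrs)
print(f"median = {med:.3f}, range {low} to {high}")
```

With an even number of studies the median is the mean of the two middle estimates, which is why a median such as 1.095 need not equal any single study's result.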

Summary of findings 2. Capitation plus P4P compared with FFS for provision improvement

Patient or population: primary healthcare facilities in rural areas

Settings: China

Intervention: capitation plus P4P

Comparison: FFS

| Outcomes | Impact: RR (95% CI) | No of participants (studies) | Certainty of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- |
| Provision outcomes | The adjusted RR for dichotomous outcome was 0.84 (95% CI 0.74 to 0.96) | 1 randomised trial | Moderate ⊕⊕⊕⊝ | This trial was rated as unclear risk of bias, and the certainty was downgraded 1 level because of limitation in study design. |
| Patient outcomes | | 0 | | |
| Provider outcomes | | 0 | | |
| Costs | | 0 | | |
| Adverse effects | | 0 | | |

CI: confidence interval; FFS: fee‐for‐service; P4P: pay for performance; RR: risk ratio

GRADE Working Group grades of evidence
High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.
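As a worked reading of the single estimate in this comparison (RR 0.84, 95% CI 0.74 to 0.96), the sketch below converts the risk ratio into a relative reduction and recovers the standard error implied by the confidence interval. It assumes the interval was built with a normal approximation on the log scale, which the review does not state; the function name is illustrative.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) implied by a 95% CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Values reported for capitation plus P4P compared with FFS.
rr, lower, upper = 0.84, 0.74, 0.96

# An RR of 0.84 corresponds to a 16% relative reduction in prescriptions.
relative_reduction = 1 - rr

# Width of the interval on the log scale gives the standard error.
se_log_rr = se_from_ci(lower, upper)

# Under the log-normal assumption the CI is symmetric around log(RR),
# so the geometric midpoint of the limits should be close to the RR.
ci_midpoint_rr = math.exp((math.log(lower) + math.log(upper)) / 2)

print(f"relative reduction = {relative_reduction:.0%}")
print(f"SE of log RR = {se_log_rr:.3f}, CI midpoint = {ci_midpoint_rr:.2f}")
```

Because the whole interval lies below 1, the reduction is distinguishable from no effect at the 5% level; the review nonetheless describes it as a slight reduction.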

Summary of findings 3. Capitation compared with FFS for provision, patient, and cost outcomes

Patient or population: mental health centres

Settings: United States

Intervention: capitation

Comparison: FFS

| Outcomes | Impacts | No of participants (studies) | Certainty of the evidence (GRADE) | Comments |
| --- | --- | --- | --- | --- |
| Provision outcomes (number of children treated as outpatients or for disruptive behaviour, or the number of very young children treated, continuous) | 1 study showed that in for‐profit mental health centres, capitation resulted in more children being treated as outpatients and for disruptive behaviour, and more very young children being treated. | 1 ITS study | Very low ⊕⊝⊝⊝ | The study design is ITS and was initially graded as moderate. This study was rated as unclear risk of bias, and so was downgraded 1 level because of limitation in study design. In addition, this study only targeted mental health centres in the United States, and the certainty was downgraded 1 level because of indirectness. |
| Patient outcomes (number of children in inpatient or emergency treatment, continuous) | 1 study showed that capitation resulted in a decrease in the number of inpatients. 2 studies showed contradictory results for the change in number of emergency department visits. | 2 ITS studies | Very low ⊕⊝⊝⊝ | These 2 ITS studies were rated as unclear risk of bias, so the certainty was downgraded. The studies only targeted mental health centres in the United States, and so the certainty was downgraded because of indirectness. In addition, there was inconsistency in the results of the studies. |
| Cost outcomes (cost level, continuous) | 1 study showed that capitation resulted in a reduction in total costs for all services and costs for inpatient care in all mental health centres, and an increase in outpatients only in for‐profit mental health centres. | 1 ITS study | Very low ⊕⊝⊝⊝ | The study design is ITS and was initially graded as moderate. This study was rated as unclear risk of bias, and so was downgraded 1 level because of limitation in study design. In addition, this study only targeted mental health centres in the United States, and the certainty was downgraded 1 level because of indirectness. |
| Provider outcomes | | 0 | | |
| Adverse effects | | 0 | | |

FFS: fee‐for‐service; ITS: interrupted time series

GRADE Working Group grades of evidence

High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.

Background

Description of the condition

Outpatient care facilities, also known as ambulatory health facilities, are organisations that deliver healthcare services to individuals who do not require hospitalisation or institutionalisation. They provide a variety of types of health care including preventive health care, treatment of acute illness, dental care, and some types of maternal and family‐planning care. Most of these services are first‐contact and basic healthcare services. The provision of outpatient care contributes to immediate and large gains in health status, and a large portion of total health expenditure goes to outpatient healthcare services, especially in low‐ and middle‐income countries (Berman 2000).

Description of the intervention

Based on Barnum's framework of "flow of funds under generic reforms" (Figure 1), the flow of funds would typically be from the Ministry of Health to government providers, and from social or private insurers, if they exist, to providers. Different payment methods can be used for different outpatient care facilities (Barnum 1995).

The most commonly used payment systems to remunerate outpatient care facilities are budgets, capitation, fee‐for‐service, pay for performance, and mixed systems (Barnum 1995; Langenbrunner 2009; WHO 2000).

Line‐item budgets

The allocation of a fixed amount of funds to a healthcare provider to cover specific line items (or input costs), such as personnel, utilities, medicines, and supplies, for a certain period of time (Langenbrunner 2009). Line‐item budgets are widely used in low‐ and middle‐income countries and are often an important part of a centrally directed healthcare system (Barnum 1995).

Global budgets

A payment fixed in advance to cover aggregate expenditures for a given period. This method is used by government or insurers to pay hospitals (Hirdes 1996; Wolfe 1993), as well as some types of outpatient care facilities. For example, in 1996 the National Health Insurance program in Taiwan implemented a global budget payment system for clinics with the aim of reducing pharmaceutical expenditures (Lee 2006). Global budgets can be an important element of health sector reforms such as decentralisation of the healthcare system.

Capitation

The provider is paid a predetermined fixed rate in advance to provide a defined set of services for each individual enrolled with the provider for a fixed period. Capitation payment may be a flat fee for each of the enrollees or it can be a risk‐adjusted fee, based on the relative risk of the registered population. This payment method has been widely used in low‐, middle‐, and high‐income countries. For example, in Thailand, capitation was used to motivate hospitals to provide comprehensive health services (ILO/UNDP 1993); it was also used to pay primary care providers in Hungary (Deeble 1992).

Fee‐for‐service

Providers are reimbursed based on specific items provided. Fee‐for‐service, with fixed‐fee schedules or without (unconstrained), is commonly used in such countries as Canada, China, Japan, and the Republic of Korea; among private insurers in the Gulf states, such as Saudi Arabia; in indemnity plans in the United States (a type of health insurance that reimburses the patient or provider as expenses are incurred) (Pati 2005); and in parts of Western Europe, such as Austria and Germany.

Pay for performance

The payment is directly linked to the performance of healthcare providers. Pay for performance can be used to pay individuals, groups of people, or organisations by government or insurers. Pay‐for‐performance schemes vary widely in terms of the types of performance that are targeted, how performance is measured, when payments for performance are paid, the size of payments for performance, and the proportion of total reimbursements that is paid for performance (Witter 2012).

Mixed systems

A mixed system may be adopted simply because it is administratively more practical or to counter the adverse incentives of specific payment methods while retaining their desirable features. Most provider payment systems are mixed.
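To make the definitions above concrete, the following toy sketch computes what a purchaser would pay one facility for one period under each method. Every name, rate, fee, and threshold is hypothetical, and the P4P rule shown (a fixed bonus paid on top of a base payment when a performance score reaches a target) is only one of the many designs the text says exist.

```python
from dataclasses import dataclass

@dataclass
class Period:
    """Hypothetical activity of one facility over one payment period."""
    enrolled_patients: int
    services_delivered: dict   # service name -> number delivered
    performance_score: float   # share of performance indicators met, 0 to 1

def budget_payment(fixed_amount):
    """Budget: a fixed amount set in advance, independent of activity."""
    return fixed_amount

def capitation(period, rate_per_enrollee):
    """Capitation: a fixed rate for each enrolled individual."""
    return period.enrolled_patients * rate_per_enrollee

def fee_for_service(period, fee_schedule):
    """Fee-for-service with a fixed-fee schedule: pay per item provided."""
    return sum(fee_schedule[name] * count
               for name, count in period.services_delivered.items())

def pay_for_performance(base_payment, period, target, bonus):
    """Mixed system: base payment plus a bonus if the target is reached."""
    return base_payment + (bonus if period.performance_score >= target else 0.0)

# Hypothetical facility and price parameters.
period = Period(
    enrolled_patients=1000,
    services_delivered={"consultation": 2500, "lab_test": 400},
    performance_score=0.82,
)
fees = {"consultation": 4.0, "lab_test": 6.5}

print(budget_payment(11000.0))                    # 11000.0
print(capitation(period, 12.0))                   # 12000.0
print(fee_for_service(period, fees))              # 12600.0
print(pay_for_performance(capitation(period, 12.0), period,
                          target=0.8, bonus=1500.0))  # 13500.0
```

Only the fee‐for‐service amount grows with the number of services delivered, and only the P4P amount depends on measured performance, which is the contrast in incentives that the rest of this section discusses.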

Categories of payment methods

Provider payment methods can be categorised based on three characteristics (Table 1) (Langenbrunner 2009):

Table 1. Outpatient care facilities payment methods and characteristics

Column headings: Payment methods; Payment rate determined (prospectively, retrospectively); Payment made (prospectively, retrospectively); Payment related to (inputs, outputs)

Rows: Line‐item budgets; Global budgets; Capitation; Fee‐for‐service (unconstrained, fixed); Pay for performance

  1. whether payment rates to providers for a single service or a package of services are set prospectively (in advance) or retrospectively (after services are provided). For prospectively set payment methods, services are bundled into a package reimbursed at a fixed payment rate, and some financial risk is shifted from the purchaser to the provider. Alternatively, payment rates are set retrospectively when the provider is simply reimbursed the amount that is billed, the reimbursement rates reflect the cost of providing the services, and the purchaser bears all the financial risk;

  2. whether payment to the provider is made prospectively (before services are provided) or retrospectively (after services are provided). With prospective rate setting, the actual payment may be made either prospectively or retrospectively;

  3. whether the payment that is made to providers is based on inputs used to provide services (i.e. all costs of providing services are financed) or on outputs produced, such as cases treated, bed‐days completed, or individual services provided (i.e. each test, procedure, or consultation).

How the intervention might work

Different types of payment methods have different incentives. Retrospective payment systems provide incentives to providers to deliver more services, and thereby might increase utilisation of services. There is little risk for providers, provided the payments are appropriate, and there are no incentives for patient selection. However, retrospective payment systems can also provide incentives to provide unnecessary and inappropriate care. Retrospective payments can also provide incentives to deliver desired services, thereby improving quality. In contrast, prospective payment methods can provide incentives to deliver rational levels of services, to improve efficiency, and to contain costs (Langenbrunner 2009).

An input‐based payment method creates incentives to increase the number of inputs. An output‐based payment method creates stronger incentives to increase the number or quality of services delivered. The lower the levels of aggregation at which services are defined as outputs, the greater the incentives are to increase the number of services delivered.

Based on these theoretical incentives, different payment methods might be expected to have different effects on the quality and quantity of services provided per patient, the efficiency (cost per unit), and selection of patients (risk selection) (Table 2).

Table 2. Incentives in pure reimbursement systems of outpatient care facilities

Column headings: Reimbursement type; Performance; Services/Case; Quantity; Quality; Cost/Unit; Risk selection

Line‐item budgets: ‐‐; +; 0

Global budgets: ‐‐; ‐‐; 0

Capitation: ‐‐; ‐‐; ‐‐; ++

Fee‐for‐service

‐Unconstrained: ++; +; ‐‐; 0

‐Fixed: ++; +; ‐‐; ‐‐; +

Case‐based: ‐‐; ++; ++; ‐‐; +

Pay for performance: +; ++; ++; ‐‐; +

Line‐item budgets are input‐based, whereas global budgets can be input‐ or output‐based payments. Both are prospective, which means that providers have incentives to under‐provide services (negative effects) and increase inputs, and no incentives to improve the efficiency of the input mix. The main incentive of global budgets is to encourage providers to control healthcare costs, rather than to improve performance.

Fee‐for‐service with a fixed‐fee schedule and bundling of services is output‐based, which provides incentives to increase the number of services delivered, including unnecessary ones (negative effects), and to reduce the amount of input per service. Fee‐for‐service with no fixed‐fee schedule (unconstrained) is input‐based and retrospective. The provider has incentives to increase the number of services and increase inputs.

Capitation is output‐based and prospective. The provider has incentives to increase output or attract more patients to enrol, which increases the total payment received. Providers might attract enrollees through improved quality of care, additional services that are not typically covered, or other measures that patients may perceive as increasing the benefit of enrolling with that provider rather than with another provider. It also provides incentives to improve efficiency of the input mix and decrease inputs, focus on less expensive health promotion and prevention, and attempt to select healthier enrollees (negative effects).

Pay for performance is output‐based and paid retrospectively. Providers are informed how much they could be paid for different levels of performance in service provision. This steers providers' behaviour towards the desired performance target, which could be an improvement in the quantity or quality of services; however, it can also lead providers to select patients for whom the performance target (e.g. blood pressure control) is easier to achieve and to neglect services that are not included in the performance target (negative effects).
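The classification in the preceding paragraphs can be captured in a small lookup structure, sketched below. Field and function names are illustrative assumptions. The entries record only what the paragraphs state: where the text does not say whether a method is prospective or retrospective, the timing is left as `None` rather than inferred.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

@dataclass(frozen=True)
class PaymentMethod:
    name: str
    basis: FrozenSet[str]    # "inputs" and/or "outputs"
    timing: Optional[str]    # "prospective", "retrospective", or None if unstated

METHODS = [
    PaymentMethod("line-item budget", frozenset({"inputs"}), "prospective"),
    PaymentMethod("global budget", frozenset({"inputs", "outputs"}), "prospective"),
    # The text calls fixed-fee FFS output-based but does not state its timing.
    PaymentMethod("fee-for-service (fixed)", frozenset({"outputs"}), None),
    PaymentMethod("fee-for-service (unconstrained)", frozenset({"inputs"}), "retrospective"),
    PaymentMethod("capitation", frozenset({"outputs"}), "prospective"),
    PaymentMethod("pay for performance", frozenset({"outputs"}), "retrospective"),
]

def select(basis: Optional[str] = None, timing: Optional[str] = None) -> List[str]:
    """Names of methods matching the given basis and/or timing."""
    return [
        m.name for m in METHODS
        if (basis is None or basis in m.basis)
        and (timing is None or m.timing == timing)
    ]

print(select(basis="outputs", timing="prospective"))
# ['global budget', 'capitation']
```

Keeping basis and timing as separate fields mirrors the way the text attributes different incentives to each dimension.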

Administrative costs are incurred whatever payment methods are used. Administrative costs include the costs of making payments and of monitoring inputs and outputs (quantity and quality of services provided). More complex payment methods, such as pay‐for‐performance schemes, entail more administrative costs.

Why it is important to do this review

Several systematic reviews have evaluated the effects of payment to individual health providers (Flodgren 2011; Gosden 2000; Scott 2011). However, little is known about the effects of the different types of payment system on performance of healthcare facilities.

This review focused on payment for healthcare facilities providing outpatient services. Together with the Cochrane reviews on payment methods for individual health professionals and payment methods for hospitals (Jia 2015; Mathes 2014), this review will contribute to the evidence on which types of payment methods are effective in improving provision of outpatient health services, from which policy advice can be disseminated to policymakers in different countries and managers of health insurance.

Objectives

To assess the impact of different payment methods on the performance of outpatient care facilities and to analyse the differences in impact of payment methods in different settings.

Methods

Criteria for considering studies for this review

Types of studies

  • Randomised trials, including cluster‐randomised trials

  • Non‐randomised trials

  • Interrupted time series and repeated measures studies with:

    • a clearly defined point in time when the intervention occurred;

    • at least three data points before and three data points after the intervention.

  • Controlled before‐after studies with:

    • contemporaneous data collection;

    • a minimum of two intervention and two control sites.

Types of participants

We evaluated the payment targeted to health facilities, so outpatient care facilities are the participants included in this review. Outpatient care facilities, also known as ambulatory health facilities, are those facilities that provide health services to individuals who do not require hospitalisation or institutionalisation, including community healthcare centres, clinics (including outpatient clinics), urgent care centres, family planning centres, mental health centres, and dental clinics. We also included primary care practices (groups of individual professionals).

Because some studies measure the performance of health facilities, and thus the impact of payment methods, at the level of individual patients, the participants in this review also included patients receiving services from outpatient care facilities.

Types of interventions

The payment methods varied with respect to the level at which the incentives were targeted. We only included the provider payment method that was used to transfer funds from the purchaser of healthcare services to the level of health facilities (including groups of individual professionals). In this review, payment methods included:

  • global budgets;

  • line‐item budgets;

  • capitation;

  • fee‐for‐service (fixed and unconstrained) (FFS);

  • pay for performance (P4P);

  • mixed payment.

We included comparisons of:

  • any two types or combinations of the above payment methods for outpatient care facilities; or

  • changes in the design of a payment method, such as increasing or decreasing the level of funding, changing the payment frequency, or changing the performance target used in P4P.

We excluded methods of paying individuals who work in outpatient care facilities.

Types of outcome measures

Primary outcomes

To be included, the study must have reported at least one of the following objective measures of outpatient care facilities' performance in health services provision.

  1. Service provision outcomes (controlled by providers' behaviour):

    1. Quantity of health services provided (e.g. proportion of patients getting aspirin prescription, rate of referral of smokers to quit line);

    2. Quality of health services provided (e.g. adherence to guidelines, quality score for certain health services).

  2. Patient outcomes (not only controlled by providers' behaviour):

    1. Patients' utilisation of health services (e.g. proportion of women having any prenatal care, proportion of children being fully immunised);

    2. Patients' intermediate and final health outcomes (e.g. blood pressure of patients with hypertension, health‐related quality of life, mortality).

  3. Healthcare provider outcomes (e.g. workload, work morale)

  4. Costs for providers (e.g. cost per service, administration costs, total cost for purchasers)

  5. Adverse effects (e.g. unnecessary services, reduced access to services (especially for disadvantaged populations), and patient selection)

Secondary outcomes

  1. Satisfaction of patients, providers, or other stakeholders.

Search methods for identification of studies

Electronic searches

We searched the following databases.

  • Cochrane Central Register of Controlled Trials (CENTRAL), 2016, Issue 3, part of the Cochrane Library (searched 8 March 2016)

  • MEDLINE, 1946 to present, In‐Process and Other Non‐Indexed Citations, OvidSP (searched 8 March 2016)

  • Embase, 1947 to present, OvidSP (searched 24 April 2014)

  • PubMed, 1966 to present (searched 8 March 2016)

  • Dissertations and Theses Database, 1861 to present, ProQuest (searched 8 March 2016)

  • Conference Proceedings Citation Index, 1990 to present (ISI Web of Science) (searched 8 March 2016)

  • IDEAS Research Papers in Economics, 1927 to present (searched 8 March 2016)

  • EconLit, 1969 to present, ProQuest (searched 8 March 2016)

  • POPLINE (Population Information Online), 1970 to present, K4Health (searched 8 March 2016)

  • China National Knowledge Infrastructure (CHKD‐CNKI), 1915 to present (searched 8 March 2016)

  • Chinese Medicine Premier (Wanfang Data), 1988 to present (searched 8 March 2016)

  • Website of the World Health Organization (www.who.int/en/) (searched 8 March 2016)

  • Website of the World Bank (www.worldbank.org/) (searched 8 March 2016)

The Cochrane Effective Practice and Organisation of Care (EPOC) Group Information Specialist developed the MEDLINE strategy in consultation with the authors.

Search strategies comprised keywords and controlled vocabulary terms. We applied no language limits. We searched all databases from database start date to date of search.

Searching other resources

Grey literature

We conducted a grey literature search to identify studies not indexed in the databases listed above. We searched one grey literature database: OpenGrey (System for Information on Grey Literature in Europe) (www.opengrey.eu/) (searched 8 March 2016).

Trial registries

  • ClinicalTrials.gov, US National Institutes of Health (clinicaltrials.gov/) (searched 8 March 2016)

  • World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) (www.who.int/ictrp/en/) (searched 8 March 2016)

In addition, we:

  • searched reference lists of all relevant papers identified;

  • searched Science Citation Index and Social Sciences Citation Index, ISI Web of Science, for papers citing any studies included in the review;

  • searched PubMed for related citations to any studies included in the review;

  • contacted authors of relevant papers regarding any further published or unpublished work.

All search strategies are provided in Appendix 1.

Data collection and analysis

Selection of studies

Two review authors scanned titles and abstracts of all articles obtained from the search and retrieved the full text of articles deemed relevant. Two review authors independently assessed full texts of studies for inclusion. Any disagreements on inclusion were resolved by discussion with a third review author or EPOC editor.

The screening process and results are reported in a study flow chart (Figure 2). We described all included studies in the Characteristics of included studies tables, even if the included studies did not report usable results for re‐analysis or synthesis. We listed studies that appeared to meet the inclusion criteria, but were eventually excluded, in the Characteristics of excluded studies table.


Figure 2. Study flow diagram.

Data extraction and management

Two review authors independently carried out data extraction using a data extraction form adopted from the Cochrane good practice data collection form (EPOC 2013a). The information we extracted included:

  • general information of study;

  • participants and setting;

  • study method;

  • intervention groups, including payment method description, duration of intervention, whether patients could choose providers, and how purchasers monitored the implementation of payment;

  • outcomes, including outcome measures, time points measured, unit of measurement, and person measuring outcomes;

  • results, including results reported by authors, analysis method, unintended effects, and whether re‐analysis was required and possible.

Any disagreements were resolved by discussion with a third review author or the EPOC contact editor. Data were managed in Microsoft Word. For interrupted time series (ITS) studies reporting time series data that were not appropriately analysed, we extracted and re‐analysed the data as described in the EPOC resources for review authors (EPOC 2013b).

Assessment of risk of bias in included studies

We used the EPOC suggested 'Risk of bias' criteria to assess the risk of bias for each outcome in all included studies (EPOC 2013c). For each criterion, two review authors independently described what was reported in the study, commented on the description, and judged the risk of bias. Any unresolved disagreements were discussed with a third review author, and, if consensus could not be reached, with the EPOC contact editor. We summarised the overall risk of bias across criteria for the primary outcome of the included studies. For randomised trials, non‐randomised trials, and controlled before‐after studies, we primarily considered four criteria: baseline outcome measurements, baseline characteristics measurements, incomplete outcome data addressed, and protection against contamination. If all four criteria were scored 'low risk of bias' for the outcome in a study, the summary assessment was that there was a low risk of bias; if one or more key criteria were scored 'unclear', the summary assessment was unclear risk of bias. If one or more key criteria were scored 'high risk of bias', the summary assessment was high risk of bias. For ITS studies, when we summarised the overall risk of bias across these criteria, we primarily considered the following criteria: intervention independence, intervention affecting data collection, and incomplete outcome data addressed.
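The summary rule described above for randomised trials, non‐randomised trials, and controlled before‐after studies can be sketched in code. This is only an illustrative reading of the rule, not a tool used in the review: the criterion keys and the function name are hypothetical, and the precedence of 'high' over 'unclear' when both occur is an assumption, since the text does not state it explicitly.

```python
from typing import Dict

# The four key criteria named in the text; the key strings are hypothetical labels.
KEY_CRITERIA = (
    "baseline_outcome_measurements",
    "baseline_characteristics",
    "incomplete_outcome_data",
    "protection_against_contamination",
)


def summarise_risk_of_bias(judgements: Dict[str, str]) -> str:
    """Summarise per-criterion judgements ('low', 'unclear', 'high') for one outcome."""
    values = [judgements[criterion] for criterion in KEY_CRITERIA]
    # Assumption: any 'high' judgement dominates, even if others are 'unclear'.
    if any(value == "high" for value in values):
        return "high"
    if any(value == "unclear" for value in values):
        return "unclear"
    # All four key criteria were judged 'low'.
    return "low"
```

For ITS studies the same function could be applied with the three ITS criteria listed above in place of `KEY_CRITERIA`.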

Measures of treatment effect

For randomised trials, non‐randomised trials, and controlled before‐after studies, we recorded or calculated risk ratios (RRs) with 95% confidence intervals (CIs) for dichotomous outcomes. If adjusted analysis was done, we reported the effect estimates reported by the authors, also converting them into RRs when possible. For continuous outcomes, when possible, we reported the absolute change from a statistical analysis adjusted for baseline differences or the relative change adjusted for baseline differences in the outcome measures. If not enough data were provided for statistical analysis, we only reported the absolute and relative change adjusted for baseline differences.
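As an illustration of the dichotomous effect measure, the sketch below computes an unadjusted risk ratio with a 95% confidence interval using the standard log‐scale approximation. It is a minimal example under stated assumptions: the function name is hypothetical, it requires non‐zero event counts in both groups, and it does not adjust for clustering or baseline differences, which the included studies may have handled differently.

```python
import math
from typing import Tuple


def risk_ratio(events_int: int, total_int: int,
               events_ctl: int, total_ctl: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) for intervention versus control.

    Uses the usual large-sample standard error of log(RR).
    Assumes both event counts are greater than zero.
    """
    risk_int = events_int / total_int
    risk_ctl = events_ctl / total_ctl
    rr = risk_int / risk_ctl
    se_log_rr = math.sqrt(
        1 / events_int - 1 / total_int + 1 / events_ctl - 1 / total_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

With invented counts of 60/100 events in the intervention group and 50/100 in the control group, the function returns an RR of 1.2 with an interval that spans 1.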

For interrupted time series and repeated measures studies, we attempted to report the difference between the predicted value based on the pre‐intervention trend and the estimated value based on the change in level and post‐intervention trend at relevant time points (immediately after the intervention (change in level), and at one year, two years, and three years). However, only one of the included interrupted time series studies provided enough data for us to calculate these effect measures. For the other interrupted time series studies, we therefore used only the change in level immediately after the intervention and the change in trend to measure the treatment effects, or the effects reported by the authors if re‐analysis was not possible.
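The ITS effect measures described above can be illustrated with a simple two‐segment linear fit. This is a sketch only and not the EPOC re‐analysis procedure itself: it fits ordinary least squares lines separately to the pre‐ and post‐intervention points (which gives the same point estimates as a segmented regression with level and slope change terms), ignores autocorrelation and seasonality, and uses hypothetical function names.

```python
from typing import Dict, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def its_effects(times: Sequence[float], values: Sequence[float],
                start: float, horizons: Sequence[float]) -> Dict[str, object]:
    """Estimate change in level, change in trend, and effects at given horizons.

    `start` is the first time point under the intervention; each horizon is
    measured in the same time units as `times`.
    """
    pre = [(t, v) for t, v in zip(times, values) if t < start]
    post = [(t, v) for t, v in zip(times, values) if t >= start]
    # Mirrors the inclusion criterion of at least three points on each side.
    if len(pre) < 3 or len(post) < 3:
        raise ValueError("need at least three data points before and after")
    pre_a, pre_b = fit_line([t for t, _ in pre], [v for _, v in pre])
    post_a, post_b = fit_line([t for t, _ in post], [v for _, v in post])

    def effect_at(t: float) -> float:
        # Post-intervention estimate minus the extrapolated pre-intervention trend.
        return (post_a + post_b * t) - (pre_a + pre_b * t)

    return {
        "change_in_level": effect_at(start),
        "change_in_trend": post_b - pre_b,
        "effects": {h: effect_at(start + h) for h in horizons},
    }
```

A full re‐analysis would normally also test and correct for autocorrelation, which this sketch deliberately leaves out.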

We included five controlled before‐after studies with very high risk of bias and unclear reporting for a large number of outcomes, but did not use them in effects analyses (Bonfrer 2014a; Canavan 2008; Rudasingwa 2015; Soeters 2008; Soeters 2011).

Unit of analysis issues

We planned to re‐analyse comparisons that allocated clusters (e.g. clinics in one district) but did not account for clustering, if we could extract the intracluster correlation coefficient. However, all included studies adjusted for clustering in their analyses.
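Although no re‐analysis was needed, a common approximate way to account for clustering is to shrink the sample size by the design effect. The sketch below shows that general technique and is not a procedure taken from this review; the function names and the example numbers in the usage note are hypothetical.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for clusters of roughly equal size: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size (or event count) after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)
```

For example, with an assumed intracluster correlation coefficient of 0.05 and 21 patients per clinic on average, the design effect is 2, so 400 patients would count as an effective sample of 200.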

Dealing with missing data

We contacted the original investigators to request missing data. At the time of submission of this review we had not received any response on missing data. For studies that reported point estimates of effect measures without confidence intervals, we used the available data; if the effect estimates themselves were missing, we included the study in the review but not in analyses of the effects of payment methods. If information needed for subgroup analyses was missing (e.g. for P4P, the size of the incentive or frequency of payment), we contacted the original investigators to request it. We received replies from two authors regarding P4P interventions. If we did not receive a response clarifying the P4P design, we did not include the study in subgroup analyses.

Assessment of heterogeneity

We conducted meta‐analysis to synthesise the effect measures of included studies if they had:

  • similar intervention and comparison payment methods: the payment methods evaluated by all studies were defined as the same category of payment;

  • same participants: e.g. all targets of payment methods included were primary care clinics or practices;

  • same category of outcome measures: e.g. all outcome measures were health service provision measures, or all were patient outcome measures.

When the included studies were similar enough based on the above criteria, we used the Chi² test and I² statistic to assess statistical heterogeneity. When the P value from a Chi² test was smaller than 0.1, we interpreted this as an indication that the observed difference in results across studies was unlikely to have occurred by chance alone.
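For readers unfamiliar with these statistics, the sketch below computes Cochran's Q (the Chi² heterogeneity statistic) and I² from study effect estimates and their variances. It is a generic illustration with a hypothetical function name, not the software used in the review. The P value is omitted to keep the example free of dependencies; it would be the upper tail of a chi‐squared distribution with k − 1 degrees of freedom, for example via `scipy.stats.chi2.sf(q, df)`.

```python
from typing import Sequence, Tuple


def heterogeneity(effects: Sequence[float],
                  variances: Sequence[float]) -> Tuple[float, int, float]:
    """Return (Q, degrees of freedom, I-squared in percent).

    `effects` are study estimates on an additive scale (e.g. log risk ratios)
    and `variances` are their sampling variances.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of Q beyond its expected value, floored at zero.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared
```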

We used the random‐effects model for meta‐analysis across studies, because the payment methods included several components; the payment methods conducted by different purchasers or in different areas were not exactly the same; and the effectiveness of payment methods may also have been influenced by many contextual factors that varied in different studies. In addition, the outcome measures under the same category were not exactly the same in different studies.

We explored heterogeneity in the design of payment methods through prespecified subgroup analyses. We downgraded the certainty of the evidence for results from meta‐analyses with high levels of heterogeneity without a compelling explanation for the heterogeneity.

Assessment of reporting biases

We planned to use funnel plots to examine asymmetry and assess the potential of any asymmetry being due to publication bias. However, there were too few studies of similar comparisons to allow for a meaningful assessment of asymmetry.

Data synthesis

We conducted a structured synthesis, as described in the EPOC resources for review authors (EPOC 2013d). We first categorised the comparisons and outcomes, and then described the effects of different kinds of payment methods on different categories of outcomes. We also listed and described the differences in context and components of payment methods in different studies. We considered the potential influence of these factors on the effects of the payment methods in the Discussion. We planned to use meta‐analysis if we found more than one study with similar comparisons and outcomes. Only three randomised trials (Bardach 2013; Engineer 2016; Petersen 2013) and one controlled before‐after study (Basinga 2011) were similar enough and provided sufficient data for a meta‐analysis. For these studies, we first used a fixed‐effect model for meta‐analysis within a study, if the study included more than one outcome in the same category of outcome measures (service provision measures or patient outcome measures). We then used the random‐effects model for meta‐analysis across studies. If the data for meta‐analysis were not available in some studies, we calculated the median and interquartile range of effect sizes, if there was a sufficient number of included studies. We reported RRs for dichotomous outcomes and the relative change for continuous outcomes.
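The two‐stage pooling described above (fixed‐effect within a study, then random‐effects across studies) can be sketched as follows. This is an illustration under assumptions rather than the review's actual computation: it works on log risk ratios, uses the DerSimonian‐Laird estimator for the between‐study variance (the text does not name the estimator), treats multiple outcomes within a study as independent, and all function and variable names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Estimate = Tuple[float, float]  # (log risk ratio, variance of the log risk ratio)


def fixed_effect(estimates: List[Estimate]) -> Estimate:
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    pooled = sum(w * eff for w, (eff, _) in zip(weights, estimates)) / total
    return pooled, 1.0 / total


def random_effects(estimates: List[Estimate]) -> Tuple[float, float, float]:
    """DerSimonian-Laird pooling; returns (pooled effect, standard error, tau^2)."""
    k = len(estimates)
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    fe, _ = fixed_effect(estimates)
    q = sum(w * (eff - fe) ** 2 for w, (eff, _) in zip(weights, estimates))
    c = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (var + tau2) for _, var in estimates]
    re_total = sum(re_weights)
    pooled = sum(w * eff for w, (eff, _) in zip(re_weights, estimates)) / re_total
    return pooled, math.sqrt(1.0 / re_total), tau2


def two_stage_pool(studies: Dict[str, List[Estimate]]) -> Dict[str, float]:
    """Pool outcomes within each study, then pool the study-level results."""
    per_study = [fixed_effect(outcomes) for outcomes in studies.values()]
    pooled, se, tau2 = random_effects(per_study)
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "tau2": tau2,
    }
```

The input is a mapping from a study label to its list of outcome‐level estimates in one outcome category; the result is the pooled RR with a 95% confidence interval and the estimated between‐study variance.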

For the synthesised effects of each comparison, we assessed the certainty of evidence for each outcome (i.e. the extent of our confidence in the estimate of effect across studies) using the GRADE approach (Guyatt 2008).

Subgroup analysis and investigation of heterogeneity

Based on the incentives of different reimbursement systems for outpatient care facilities (Table 2), in the protocol for this review we hypothesised and analysed the following factors (Table 3) that might affect the size of effects of payment methods to explain any differences in the effects of payment methods.

Table 3. Factors that might modify the effects of changes in payment methods on the delivery of services per case

| Explanatory factors | How we will categorise the factor | Hypothesised direction of the interaction | Basis for the hypothesis |
| --- | --- | --- | --- |
| Larger fees (per service) | Relative increase in fees (continuous) | Larger (positive) effects with larger relative increases | The larger the incentive, the larger the effect |
| Duration of follow‐up | When outcomes are measured relative to when the change was made (continuous) | Larger (positive) effects with shorter follow‐up | Other changes and adjustments over time might reduce the initial incentive. |
| Ownership | For‐profit vs not‐for‐profit ownership | Larger (positive) effects with for‐profit ownership | For‐profit facilities might be more motivated to increase income and therefore more sensitive to changes in incentives. |
| Multiple providers | Choice of providers available to patients vs little or no choice of providers | Larger (negative) effects with little or no choice | Need to attract and retain patients might provide counteractive incentives to offer more services. |
| Monitoring | Monitoring vs no monitoring of the delivery of services | Larger (negative) effects without monitoring | Monitoring might provide counteractive incentives to offer more services. |

  • Larger fees: Relative increase in fees.

  • Duration of follow‐up: When outcomes are measured relative to when the change was made.

  • Ownership: For‐profit versus not‐for‐profit ownership.

  • Multiple providers: Choice of providers available to patients versus little or no choice of providers.

  • Monitoring: Monitoring versus no monitoring of the delivery of services.

Of the three kinds of comparisons included in this review, only one had enough studies for subgroup analysis (P4P plus existing capitation or input‐based payment compared with the existing payment method). For P4P specifically, we hypothesised and analysed the following factors that might affect the size of the effects of P4P payment; these factors and hypotheses were based on the dimensions of P4P schemes defined by Conrad and Perry and the analysis of results‐based financing by Oxman (Conrad 2009; Oxman 2009).

  • Type of performance measures applied by P4P: service provision measures, patient outcome measures, or combined measures (P4P with provision measures leads to larger effects because provision behaviour is easier to change).

  • Type of performance target applied by P4P: threshold payment, or pay for each instance of service (P4P with pay for each instance of service leads to larger effects because one instance of service is easier to achieve).

  • Size of incentive (percentage of P4P payment on total income level): lower than 10%, 10% to 30%, higher than 30% (the larger the size, the larger the effects).

  • Frequency of monitoring and feedback: quarterly, annual (the more frequently the monitoring and feedback, the larger the effects).

  • Frequency of payment: quarterly, annual (the more frequently the payment, the larger the effects).

  • Resourcing: involvement of extra fund or not (involvement of extra fund leads to larger effects because of the support of extra resources).

  • Individual payment inside facility: no payment to individual, equally allocated to individual, or allocated to individual based on individual performance (allocation to individual based on individual performance leads to larger effects because individual health workers are more motivated).

  • Duration of follow‐up: 1 year or less, 1.5 to 2 years, 3 years (larger effects with shorter follow‐up because changes and adjustments over time might reduce the initial incentive).

We did not include two factors we had prespecified in the protocol in the review: ownership (for‐profit versus not‐for‐profit ownership) and multiple providers (choice of providers available to patients versus little or no choice of providers), because data and information from the included studies were insufficient.

Sensitivity analysis

We did not use imputed data and did not include studies with a high risk of bias in meta‐analyses. We therefore did not conduct sensitivity analyses to assess the robustness of the findings in relation to assumptions about imputed data or judgments about the risk of bias, as planned in the protocol.

Results

Description of studies

See: Characteristics of included studies

Results of the search

We identified 55,558 references after removing duplicates. We screened this large number of references because the searches were conducted without the study design filters in MEDLINE Ovid and Embase in order to also find relevant studies for a larger scoping review on payment for health facilities or individual providers.

Two researchers independently examined these references. We retrieved 1338 full‐text articles regarded as potentially relevant, which two review authors read independently. We included 21 studies evaluating payment methods for outpatient health facilities (Figure 2). Basinga 2011 and a 2010 unpublished working paper by the same authors were from the same study (a P4P program in Rwanda). They had the same outcome measures, analysis methods, and study results, so we listed the unpublished working paper as a secondary reference to Basinga 2011. Bonfrer 2014a, Soeters 2008, and Rudasingwa 2015 were also from the same study (a P4P program in Burundi), but they applied different outcome measures, data sources, and analysis methods; we therefore treated them as three studies in the analysis.

Included studies

Study design

We included 21 studies (see Characteristics of included studies): eight randomised trials (An 2008; Bardach 2013; Engineer 2016; Hillman 1998; Hillman 1999; Petersen 2013; Roski 2003; Yip 2014), six controlled before‐after studies (Basinga 2011; Bonfrer 2014a; Canavan 2008; Rudasingwa 2015; Soeters 2008; Soeters 2011), four interrupted time series studies (Catalano 2000; Catalano 2005; McLintock 2014; Serumaga 2011), and three repeated measures studies (Alshamsan 2012; Chien 2012; Lee 2011).

Participants and settings

All of the facilities in the included studies provided primary health care or mental health care, although the facilities had different names in different countries. Four studies evaluated P4P for general practitioner practices in the United Kingdom (Alshamsan 2012; Lee 2011; McLintock 2014; Serumaga 2011). Nine studies were conducted in the United States, of which two included community mental health centres (Catalano 2000; Catalano 2005). The remaining seven included clinics providing different types of primary health care (An 2008; Bardach 2013; Hillman 1998; Hillman 1999; Petersen 2013; Roski 2003) or physician groups providing primary health care (Chien 2012). Eight studies were conducted in low‐ and middle‐income countries: Rwanda (Basinga 2011), Tanzania (Canavan 2008), Burundi (Soeters 2008), Democratic Republic of Congo (Bonfrer 2014a; Rudasingwa 2015; Soeters 2011), China (Yip 2014), and Afghanistan (Engineer 2016). All of these studies from low‐ and middle‐income countries included primary healthcare facilities (centres).

Interventions' characteristics and comparison interventions

Characteristics of included studies tables provide a summary of the interventions and comparisons. The interventions varied. There were three types of comparisons: P4P plus some existing payment method (capitation or input‐based payment) compared to the existing payment method (Alshamsan 2012; An 2008; Bardach 2013; Basinga 2011; Bonfrer 2014a; Canavan 2008; Chien 2012; Engineer 2016; Hillman 1998; Hillman 1999; Lee 2011; McLintock 2014; Petersen 2013; Roski 2003; Rudasingwa 2015; Serumaga 2011; Soeters 2008; Soeters 2011); P4P combined with capitation compared to FFS (Yip 2014); and capitation versus FFS (Catalano 2000; Catalano 2005).

Most of the payment methods evaluated in the included studies were P4P; however, the design of these P4P interventions varied. Based on the dimensions of P4P schemes defined by Conrad and Perry and the analysis of results‐based financing by Oxman (Conrad 2009; Oxman 2009), we systematically disentangled the P4P payment methods into seven components and described their characteristics (Table 4): the performance measures applied by P4P, the performance target for payment applied by P4P, size of incentive, frequency of monitoring, frequency of payment, individual payment inside the facility, and resourcing (whether extra funding was involved).

Table 4. The characteristics of P4P payments included in review

Study

Performance measures

Performance target

Size of incentive

Frequency of monitoring

Frequency of payment

Individual payment

Resourcing (if with more funds)

Alshamsan 2012

Lee 2011

Serumaga 2011

McLintock 2014

Both provision and outcome measures: 76 clinical quality indicators and 70 indicators relating to organisation of care and patient experience. Of the clinical indicators, 10 relate to maintaining disease registers, 56 to processes of care (such as measuring disease parameters and giving treatments), and 10 to intermediate outcomes (such as controlling blood pressure).

Threshold payment: Practices are awarded points based on the proportion of patients for whom targets are achieved, between a lower achievement threshold of 40% for most indicators (i.e. practices must achieve the targets for over 40% of patients to receive any points) and an upper threshold that varies according to the indicator. Each point earned the practice a certain amount of money, adjusted for patient population size and disease prevalence. A maximum of 1000 points was available.

The highest level of performance payment is 25% of total income.

Annual

Annual

Allocated to individual based on individual performance

Yes

An 2008

Provision outcome measures: referral of smokers to consultation

Threshold payment combined with payment for each instance: Clinics that referred 50 smokers would receive a USD 5000 performance bonus. Clinics would also receive USD 25 for each referral beyond the initial 50.

Not clear, but mentioned "This incentive amount was arrived at after consultation with the management team and represents an amount that was judged as likely to be meaningful to most clinics ..."

10 months

10 months

Into clinics' operation fund, no payment to individual physicians and administrators

Yes

Bardach 2013

Both provision and outcome measures: 4 quality goals, including aspirin prescription, blood pressure control, cholesterol control, and smoking cessation intervention provision

Payment for each instance of performance measure unit: An incentive was paid for every instance of a patient meeting the quality goal (e.g. USD 20 for 1 instance of blood pressure control). A higher payment was paid for patients who had certain comorbidities or who, as proxies for socioeconomic status, had Medicaid insurance or were uninsured.

Approximately 5% of an average physician's annual salary

Quarterly

Annual

Allocated to individual based on individual performance

Yes

Basinga 2011

Process measures: The 14 key maternal and child healthcare output indicators. Some of these output indicators are reasons for a visit, such as prenatal care or delivery, whereas others are services provided during a visit, such as tetanus vaccination during prenatal care.

Payment for each instance of performance measure unit: The basis for payment is calculated from the number of services provided across the 14 kinds of services; the final payment level is adjusted based on a quality index.

Facility funding increased by 22%

Quarterly

Quarterly

77% of P4P¹ funds to allocate to individual personnel, amounting to 35% increase in salary

No, control group funding also increased by the same level.

Canavan 2008

Process measures: outpatient utilisation rate, delivery rate, VCT² clients

Threshold payment: 50% of support paid upfront for the year; 50% paid retrospectively if all the targets are met (outpatient utilisation rate 0.6, delivery rate 20/1000, VCT² clients 20/1000).

8% of facility income

Semi‐annual

Semi‐annual

50% maximum bonus allocated to individual

Yes

Chien 2012

Both provision and outcome measures: diabetes patient completing all the missing care processes, and whether glycated haemoglobin and low‐density lipoprotein levels were lowered or at goal levels

Payment for each instance of performance measure unit: Certain amount of money for each patient paid if this patient met the performance target, e.g. USD 15 for 1 glycated haemoglobin test, USD 35 for glycated haemoglobin < 7%.

Not clear, but mentioned that "then incentive amount ... may not have been strong enough"

Annual

Annual

Not clear

Yes

Engineer 2016

Provision outcome measures: volume of 9 primary health services provided, combined with service provision quality indicators

Payment for each instance of performance measure unit: Certain amount of bonus per unit per quarter, e.g. USD 1.30 to USD 2.67 for first antenatal care visit; final payment was also adjusted by quality indicators.

The bonus amounts paid were about 6% to 11% above health workers' base salary, and increased to about 14% to 28% depending on the health worker's cadre.

Quarterly

Quarterly

All allocated to individual, but the allocation method was determined by health facility managers, including giving individual bonuses proportional to the health worker's salary, giving them in equal amounts to all staff, or giving them based on their determination of an individual's contribution to the performance indicators.

Yes

Hillman 1998

Provision outcome measures: compliance with a quality assurance policy, i.e. referral, where clinically indicated, for Pap test, colorectal screening, or mammography

Threshold payment: 3 intervention sites with highest compliance scores received full bonus (20% of capitation); 3 with the next highest scores and the 3 improving most from previous audit both received partial bonus (10% of capitation).

10% to 20% of capitation for all female members 50 years of age and older

Semi‐annual

Semi‐annual

Not clear, 38.5% of sites were solo group.

Yes

Hillman 1999

Provision outcome measures: compliance with provision of defined services for children, including immunisation, other preventive services

Threshold payment: 3 intervention sites with highest compliance scores received full bonus (20% of capitation); 3 with the next highest scores and the 3 improving most from previous audit both received partial bonus (10% of capitation).

10% to 20% of capitation for all paediatric members up to 7 years

Semi‐annual

Semi‐annual

Not clear, 42.1% of sites were solo group.

Yes

Petersen 2013

Combined provision and outcome measures: blood pressure thresholds or appropriately responding to uncontrolled blood pressure, prescribing guideline‐recommended antihypertensive medications

Payment for each instance: a maximum per‐record reward of USD 18.20, USD 9.10 for each successful measure

Mean level was 1.6% of a physician's salary.

4 months

4 months

Equally allocated to individual physician, non‐physician in team

Yes

Roski 2003

Provision outcome measures: Tobacco status clearly identified at each visit and documented in their medical records for their last visit; smokers should have provision of advice to quit smoking documented in their medical record.

Threshold payment: Performance targets were set at approximately 15 percentage points above the average performance for these clinic practices as assessed by the medical group 2 years prior to the effort described here. Incentive amounts were based on the number of providers per clinic. Specifically, clinics with 1 to 7 providers could receive a USD 5000 award, and clinics with 8 or more providers were eligible for a USD 10,000 bonus. Clinics that reached or exceeded only 1 of the 2 performance goals were eligible for half the amount.

Not clear, just discussed "it is not clear whether significantly higher incentive payments would have been able to focus clinic sites' attention more strongly on ..."

Semi‐annual

Annual

Not clear, just mentioned "Clinics were provided with suggestions on how to spend earned incentive payments (i.e., travel and registration for educational courses). Ultimately, clinics decided how to allocate incentive payments."

Yes

Soeters 2008

Bonfrer 2014a

Rudasingwa 2015

Provision measures: health provision actions and quality composite index

Payment for each instance: Fixed amount paid per targeted action; multiplied by quality bonus ranging from 1 to 1.25 based on quarterly reviews of quality.

Studies published at different times reported different proportions:

58% of facility total revenue in 2009;

in 2014, this part accounted for 40% of total health facility budget;

in 2010, 20% of total health facility revenue.

Quarterly

Quarterly

Allocated to individual based on individual performance, using a systematic approach called "indices".

Yes

Soeters 2011

Provision measures: health provision actions and quality composite index with 154 indicators

Payment for each instance: Fixed amount paid per targeted action; top‐up of 15% available, based on quarterly reviews of quality. Also 15% additional payment for remote facilities.

Not clear, but should be the major component of funding for the health centres

Quarterly

Quarterly

Just mentioned facilities having discretion to pay staff.

Yes

Yip 2014

Provision measures: antibiotic prescription rates and patient satisfaction

Threshold payment: 70% of the budget allocated to health facilities firstly, withholding the balance until after performance assessments at the middle and end of the year; after each assessment, the performance scores were compared between each health facility to the average score in the county; each centre that scored above the average received more than the 30% of the budget that had been withheld, in proportion to how much above the county average its score was. Each centre that scored below the average received less than the 30%, in proportion to how much lower than average its score was.

30% of capitation budget

Semi‐annual

Semi‐annual

No allocation to individual

No

1P4P: pay for performance
2VCT: voluntary counselling and testing
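
Two of the payment rules summarised in the table above can be made concrete with a small sketch. The Python below illustrates a fee‐per‐action payment scaled by a quality multiplier between 1 and 1.25 (as described for Soeters 2008), and a withheld‐budget rule in which 30% of the budget is released according to a facility's score relative to the county average (as described for Yip 2014). The function names, the fee schedule, and in particular the linear scaling applied to the withheld share are illustrative assumptions; the included studies do not report the exact formulas.

```python
def fee_per_action_payment(counts, fees, quality_multiplier):
    """Fixed fee per targeted action, multiplied by a quality bonus.

    The 1 to 1.25 range follows the description of Soeters 2008;
    the fee schedule passed in is hypothetical.
    """
    if not 1.0 <= quality_multiplier <= 1.25:
        raise ValueError("quality multiplier must lie between 1 and 1.25")
    base = sum(fees[action] * n for action, n in counts.items())
    return base * quality_multiplier


def withheld_budget_payout(budget, score, county_average, withheld_share=0.30):
    """Payout from the withheld share of a budget after an assessment.

    Assumption: the payout scales linearly with score / county_average,
    so a facility scoring exactly the county average receives the full
    withheld share, a higher score receives more, a lower score less.
    The source describes only the direction of this adjustment.
    """
    if county_average <= 0:
        raise ValueError("county average must be positive")
    return budget * withheld_share * (score / county_average)


# Hypothetical fee schedule and service counts, for illustration only.
fees = {"prenatal_visit": 2.0, "facility_delivery": 10.0}
counts = {"prenatal_visit": 100, "facility_delivery": 20}
quarterly_payment = fee_per_action_payment(counts, fees, quality_multiplier=1.25)

# A facility scoring above the county average receives more than 30%.
above_average = withheld_budget_payout(1000.0, score=90, county_average=80)
```

The two functions differ in what drives the payment: the first rewards each counted action and then scales the total, whereas the second redistributes a fixed withheld amount according to relative performance.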

Performance measures applied by P4P

There were two categories of performance measures: provision performance measures and outcome performance measures. For the provision measures, payment was based on the providers' performance on the process of service provision, for example percentage of tobacco use identified and percentage of smokers receiving advice to quit (Roski 2003), percentage of patients whose blood pressure was measured (Serumaga 2011), percentage of patients with guideline‐based prescriptions and treatments (Petersen 2013), number or percentage of women having any prenatal care (Basinga 2011; Soeters 2008; Soeters 2011), or number or percentage of women with a facility‐based delivery (Basinga 2011; Bonfrer 2014a; Canavan 2008; Engineer 2016; Rudasingwa 2015; Soeters 2008; Soeters 2011). For the outcome measures, payment was based on patients' health or behaviour outcomes, for example the percentage of patients with blood pressure controlled (Alshamsan 2012; Bardach 2013; Petersen 2013), total cholesterol control (Alshamsan 2012; Lee 2011), or the percentage of smokers with seven‐days sustained abstinence from smoking (Roski 2003). There were also several P4P programs that used combined process and outcome performance measures (Alshamsan 2012; Bardach 2013; Lee 2011; Petersen 2013; Serumaga 2011).

Performance target for payment

This component is the level of performance for which the incentives were paid. There are two main categories in the included P4P methods: threshold payment and payment for each instance of performance. Threshold payment means that the providers only received the performance incentives if they achieved a certain level of performance, and this level of performance could be absolute performance or relative performance compared with other providers or their previous performance, for example at least 40% of covered patients with diabetes who have a record of total cholesterol (Alshamsan 2012; Lee 2011; Serumaga 2011), the outpatient utilisation rate achieving at least 0.6 per resident and the facility‐based delivery achieving at least 20/1000 (Canavan 2008), referral of at least 50 smokers (An 2008), percentage of tobacco status identified achieving 15% above the average performance (Roski 2003), or being one of six facilities with the highest compliance scores or one of three facilities improving the most from a previous audit (Hillman 1998; Hillman 1999). Payment for each instance of performance means providers received an award for each instance of service provision or patient's health improvement, for example payment of USD 20 for one case of blood pressure control (Bardach 2013), USD 15 for one glycated haemoglobin test (Chien 2012), or USD 18.20 payment for one blood pressure control or appropriate response to uncontrolled blood pressure (Petersen 2013). One P4P program used a combined performance target (An 2008), in which the intervention clinics who referred 50 smokers received a USD 5000 performance bonus, and clinics also received USD 25 for each referral beyond the initial 50.
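
The combined performance target in An 2008 can be written as a short rule. The sketch below assumes that clinics referring fewer than 50 smokers received no bonus, which is our reading of the threshold definition above; the function and parameter names are illustrative.

```python
def combined_target_bonus(referrals, threshold=50, threshold_bonus=5000, per_extra=25):
    """Bonus in USD under a combined threshold and per-instance target.

    Defaults follow the figures reported for An 2008: USD 5000 for
    reaching 50 referrals, plus USD 25 for each referral beyond 50.
    Assumption: no bonus is paid below the threshold.
    """
    if referrals < threshold:
        return 0
    return threshold_bonus + per_extra * (referrals - threshold)


# A clinic with 60 referrals earns the threshold bonus plus 10 extra referrals.
example_bonus = combined_target_bonus(60)
```

Setting `per_extra=0` reduces the rule to a pure threshold payment, and setting `threshold=0` with `threshold_bonus=0` reduces it to a pure payment for each instance.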

Size of incentive

This component is the size of P4P incentives as a percentage of the total income of health facilities or individuals (if incentives were allocated to individuals within facilities). Some P4P incentives accounted for less than 10% of the total income of health facilities or individuals (Bardach 2013; Canavan 2008; Petersen 2013); in two P4P programs evaluated in four studies (Alshamsan 2012; Basinga 2011; Lee 2011; Serumaga 2011), incentive levels were about 20% to 25% of total facilities' income; in one program in Burundi (Soeters 2008), the P4P payments constituted 58% of facilities' revenue. In three studies, the levels of incentive were only compared with part of the facilities' revenue: 20% of capitation for some female members 50 years of age and older (Hillman 1998; Hillman 1999), and 30% of capitation fee from one insurance plan (Yip 2014). Several studies did not report the exact level of incentive (An 2008; Chien 2012; Roski 2003; Soeters 2011).

Frequency of monitoring and payment

The frequency of monitoring and payment was the same in most P4P programs, including quarterly (Basinga 2011; Bonfrer 2014a; Engineer 2016; Rudasingwa 2015; Soeters 2008; Soeters 2011), 4 months (Petersen 2013), semi‐annual (Canavan 2008; Hillman 1998; Hillman 1999; Yip 2014), 10 months (An 2008), and annual (Alshamsan 2012; Chien 2012; Lee 2011; McLintock 2014; Serumaga 2011). In two P4P programs, performance was monitored more often than it was paid for: four times in Bardach 2013 and twice in Roski 2003, with both programs paying the health facilities annually.

Individual payment within facilities

How the facilities allocate the payment to individual health workers is an important component, potentially related to the effects of P4P methods. Some payment methods were explicitly described as incentives or bonuses allocated to individual physicians based on their individual performance (Alshamsan 2012; Bardach 2013; Bonfrer 2014a; Lee 2011; Rudasingwa 2015; Serumaga 2011; Soeters 2008), or were equally allocated to individuals (Petersen 2013). In P4P programs in Rwanda and Tanzania (Basinga 2011; Canavan 2008), part of P4P incentives received by facilities was used to increase individual's income, but the allocation criteria were not clear. In a P4P program in Afghanistan (Engineer 2016), all of the P4P incentives received by facilities were used to pay individual health workers, and facility managers decided the allocation criteria. In two P4P programs, facilities did not use the extra payment to reward individual health professionals (An 2008; Yip 2014). Four studies did not provide information on how the facilities allocated P4P payments to individuals (Hillman 1998; Hillman 1999; Roski 2003; Soeters 2011).

Resourcing (if extra funding)

Some P4P programs included the input of additional resources (Alshamsan 2012; An 2008; Bardach 2013; Bonfrer 2014a; Chien 2012; Engineer 2016; Hillman 1998; Hillman 1999; Lee 2011; Petersen 2013; Roski 2003; Rudasingwa 2015; Serumaga 2011; Soeters 2008; Soeters 2011), and the additional resources were paid based on performance. In another two P4P programs (Basinga 2011; Yip 2014), there was no input of additional resources; instead, existing resources were shifted from input‐based payment or FFS to P4P.

Study outcomes

We categorised the majority of outcome measures in the included studies into two types: service provision measures and patient outcome measures.

Provision measures included the quantity and quality of service provision. They included the services related to the control of risk factors for chronic diseases, An 2008, Bardach 2013, Petersen 2013, Roski 2003, Serumaga 2011, Hillman 1998, Hillman 1999, and Rudasingwa 2015, and the general outpatient consultations (Yip 2014), for example the rate of smokers referred to a quit line (An 2008), the percentage of hypertension patients prescribed guideline‐recommended medications (Petersen 2013), and the percentage of outpatient visits with an antibiotic prescription (Yip 2014).

Patient outcome measures included patients' utilisation of health services and their intermediate and final health status changes, which could not be entirely controlled by providers. This category of outcome measures covered maternal and children healthcare services, Basinga 2011, Soeters 2008, Soeters 2011, Bonfrer 2014a, Engineer 2016, and Rudasingwa 2015, and the control of risk factors for chronic diseases (Alshamsan 2012; Bardach 2013; Lee 2011; Petersen 2013; Roski 2003; Serumaga 2011), for example the percentage of children aged 12 to 23 months being fully immunised (Basinga 2011), the percentage of general population with cholesterol control (Bardach 2013), the glycated haemoglobin level of diabetic patients (Alshamsan 2012), and the percentage of smokers’ 7‐day sustained abstinence from smoking (Roski 2003).

One study evaluating the effects of capitation compared with FFS also included costs of services as outcome measures, including total inpatient costs and total costs of treating people younger than 18 (Catalano 2000).

All outcome measures reported in the included studies are listed in Table 5.

Table 5. Outcome measures of included studies (for studies included in effects analysis)

Study

Primary outcomes

Secondary outcomes

Unintended or adverse effects

Length of observation

Provision outcomes

Patient outcomes

Costs

Bardach 2013

Proportion of patients 18 years or older with IVD1 or 40 years or older with DM2 taking aspirin or another antithrombotic therapy (including cilostazol, clopidogrel bisulfate, warfarin sodium, dipyridamole);

Proportion of patients 18 years or older identified as current smokers who received certain smoking cessation services (cessation counselling, referral for counselling, or prescription or increased dose of a cessation aid)

Proportion of patients aged 18 to 75 years with hypertension getting blood pressure control (with blood pressure lower than 140/90 mmHg (if without DM2) or lower than 130/80 mmHg (if with DM2)) (Health);

Proportion of male patients 35 years or older and female patients 45 years or older without IVD1 or DM2 who have cholesterol control (total cholesterol lower than 240 mg/dL or low‐density lipoprotein lower than 160 mg/dL measured in the past 5 years) (Health)

12 months

Petersen 2013

Proportion of physicians' patients getting the guideline‐recommended antihypertensive medications

Proportion of physicians' patients with blood pressure control or appropriate response to uncontrolled blood pressure (Health)

Performance of physician groups during the final intervention period to the post‐washout performance period

24 months

Chien 2012

Probability of diabetes patients getting glycated haemoglobin testing (Utilisation);

Probability of diabetes patients getting lipid testing (Utilisation);

Probability of diabetes patients getting dilated eye exam (Utilisation)

12 months

Basinga 2011

Probability of respondents getting tetanus vaccine during prenatal visit

Probability of respondents having any prenatal care (Utilisation);

Probability of respondents having 4 or more prenatal care visits (Utilisation);

Probability of respondents having institutional delivery (Utilisation);

Probability of children younger than 23 months having a preventive visit in previous 4 weeks (Utilisation);

Probability of children aged 24 to 59 months having a preventive visit in previous 4 weeks (Utilisation);

Probability of children aged 12 to 23 months being fully immunised (Utilisation)

23 months

Roski 2003

Percentage of tobacco users identified at last visit;

Percentage of smokers who received advice to quit;

Percentage of smokers who were offered assistance to quit at last visit

Percentage of respondents reporting using any aids for smoking cessation (Utilisation);

Percentage of respondents reporting using any medication for quitting (Utilisation);

Percentage of respondents reporting using any counselling services (Utilisation);

Percentage of smoker respondents with 7‐day sustained abstinence from smoking (Health);

Percentage of respondents being current smokers (7‐day point prevalence);

Percentage of respondents reporting intention to quit within 30 days (Health)

12 months for provision outcomes;

18 months for patient outcomes

Serumaga 2011

Proportion of patients receiving 0, 1, 2, and 3 or more classes of antihypertensive drugs as a proportion of all study patients

Proportion of patients with blood pressure measured each month (Utilisation);

Proportion of patients with controlled blood pressure (blood pressure less than 150/90 mmHg) (Health);

Percentage of patients with hypertension‐related adverse outcomes (myocardial infarction, stroke, renal failure, heart failure) or on all‐cause mortality (Health)

12 months;

24 months;

36 months

An 2008

Rate of referral of smokers to quit line

Rate of smokers enrolled into quit line (Utilisation)

The marginal cost per additional quit line enrollee

10 months

Hillman 1999

Compliance scores3 of providers for immunisation;

Compliance scores of providers for other indicators;

Overall compliance scores of providers

6 months;

12 months;

18 months

Hillman 1998

Compliance scores for Pap test;

Compliance scores for colorectal screening;

Compliance scores for mammography;

Compliance scores for breast exam;

Total compliance scores

6 months;

12 months;

18 months

Alshamsan 2012

Glycated haemoglobin level for diabetes patients;

Total cholesterol level for diabetes patients (Health);

Systolic blood pressure for diabetes patients (Health);

Diastolic blood pressure for diabetes patients (Health)

Ethnic disparities in all outcomes

12 months;

24 months;

36 months

Lee 2011

Total cholesterol level for CHD4 patients (Health);

Total cholesterol level for stroke patients (Health);

Systolic blood pressure for CHD4 patients (Health);

Systolic blood pressure for stroke patients (Health);

Systolic blood pressure for hypertension patients (Health);

Diastolic blood pressure for CHD4 patients (Health);

Diastolic blood pressure for stroke patients (Health);

Diastolic blood pressure for hypertension patients (Health)

Ethnic disparities in all outcomes

12 months;

24 months;

36 months

Yip 2014

Percentage of visits with antibiotic prescription in Township Health Centre;

Percentage of visits with antibiotic prescription in Village Posts

Patient satisfaction score in Township Health Centre;

Patient satisfaction score in Village Posts;

Total expenditure per visit;

Total drug expenditure per visit

Patient volume

Catalano 2000

Number of people younger than 18 receiving outpatient services

Number of people younger than 18 receiving inpatient services;

Number of people younger than 5 in treatment;

Number of disruptive children in treatment;

Number of people younger than 18 treated in emergency

Total outpatient costs;

Total costs of treating people younger than 18;

Total inpatient costs

12 months

18 months

Catalano 2005

Number of emergency visits by adults who had a primary mental or substance use disorder

12 months

Engineer 2016

Percentage of current use of modern family planning methods;
Percentage of at least 1 antenatal checkup from a skilled provider;
Percentage of skilled birth attendant present at latest delivery;
Percentage of postnatal checkup within 42 days of delivery by a skilled provider;
Percentage of children who received pentavalent 3 vaccination;

Concentration index for institutional deliveries;
Concentration index for children's utilisation of outpatient services

20 indicators covering 5 domains of quality of care: Client and community perspectives, including an index of overall client satisfaction and perceived quality of care; Human resources perspectives, including a health worker satisfaction index and health worker motivation index; Physical capacity of health facility inputs (drugs, equipment, infrastructure); Quality of service provision, measuring 4 processes of care; and Management systems

23 to 25 months

McLintock 2014

Percentage of patients on the diabetes register or CHD4 register, or both, for whom case finding, diagnosis, and prescription for depression has been undertaken

Percentage of patients with non‐target long‐term physical conditions for whom case finding for depression, diagnosis, and prescription has been undertaken

60 months

1IVD: ischaemic vascular disease
2DM: diabetes mellitus
3Compliance scores: the extent of providers' consistency with the quality assurance criteria
4CHD: coronary heart disease

Excluded studies

Studies that initially appeared to meet the inclusion criteria but that were eventually excluded are listed in the Characteristics of excluded studies table. We excluded all of these studies because they did not fulfil the criteria for study design.

Risk of bias in included studies

Our assessment of the risk of bias for each of the included studies can be found in the 'Risk of bias' tables in the Characteristics of included studies tables.

Of the eight randomised trials, we assessed one study as at low risk of bias (Bardach 2013). We judged the remaining seven trials as having an unclear risk of bias for all the primary outcomes (An 2008; Engineer 2016; Hillman 1998; Hillman 1999; Petersen 2013; Roski 2003; Yip 2014). One major issue with some of the randomised trials was that the statistical comparison of the characteristics and outcomes of participants at baseline was not done (An 2008; Hillman 1998; Hillman 1999). This matters less for large randomised trials, where random allocation tends to balance the groups; however, the trials in this review randomised only small numbers of health facilities (12 to 143), so chance imbalance at baseline cannot be ruled out.

Basinga 2011 was described as a randomised trial by the authors, but during the study the original randomised allocation of districts was changed. Originally, eight blocks were randomised into two comparison groups, and one group in each block was randomly assigned to the intervention group. However, before implementation of the baseline survey, the administrative district boundaries were redefined by the government in a decentralisation process. As a result, some of the districts selected for this study were combined with districts that already had existing P4P schemes. Consequently, the researchers had to switch the assignment (intervention or control) for eight districts from four blocks, and add one block to the sample. Due to the unclear allocation process and changed allocation, we grouped and analysed this study as a controlled before‐after study. Despite the change of allocation, the intervention and control groups were comparable in terms of the main characteristics and outcomes measured at baseline, so we rated this study as a controlled before‐after study with a low risk of bias.

We assessed five other included controlled before‐after studies as being at high risk of bias (Bonfrer 2014a; Canavan 2008; Rudasingwa 2015; Soeters 2008; Soeters 2011). The main issue with these studies was that the control areas had very different characteristics from the intervention areas. Another concern was that several different interventions or supports from international donors were in place at the same time as the intervention being evaluated (Bonfrer 2014a; Canavan 2008; Rudasingwa 2015; Soeters 2008; Soeters 2011). It was therefore not clear if the effects of the payment interventions were independent of other changes. We included these studies and described their characteristics in this review, but did not use them in analysing intervention effects.

Of seven ITS or repeated measure studies, we assessed one study as having a low risk of bias (Chien 2012), five studies as having an unclear risk of bias (Alshamsan 2012; Catalano 2000; Catalano 2005; Lee 2011; Serumaga 2011), and one study as having a high risk of bias for the primary outcomes (McLintock 2014). Catalano 2000 and Catalano 2005 used different data sources before and after the intervention. Alshamsan 2012, Lee 2011, Serumaga 2011, and McLintock 2014 all used existing medical record databases, and general practitioners' recording of performance measures may have improved in the intervention group (and not in the control group) after the intervention, because the financial incentives for their performance were based on what they recorded.

Effects of interventions

See: Summary of findings for the main comparison P4P plus some existing payment method compared with existing payment method for provision and patient outcomes; Summary of findings 2 Capitation plus P4P compared with FFS for provision improvement; Summary of findings 3 Capitation compared with FFS for provision, patient, and cost outcomes

Comparison 1: P4P plus existing capitation or input‐based payment compared to the existing payment method

Eighteen studies compared P4P plus existing capitation or input‐based payment with the existing payment method (Alshamsan 2012; An 2008; Bardach 2013; Basinga 2011; Bonfrer 2014a; Canavan 2008; Chien 2012; Engineer 2016; Hillman 1998; Hillman 1999; Lee 2011; McLintock 2014; Petersen 2013; Roski 2003; Rudasingwa 2015; Serumaga 2011; Soeters 2008; Soeters 2011), and we excluded five controlled before‐after studies from the effects analysis due to high risk of bias (Bonfrer 2014a; Canavan 2008; Rudasingwa 2015; Soeters 2008; Soeters 2011). Thirteen studies were included in the effects analysis under this comparison (Alshamsan 2012; An 2008; Bardach 2013; Basinga 2011; Chien 2012; Engineer 2016; Hillman 1998; Hillman 1999; Lee 2011; McLintock 2014; Petersen 2013; Roski 2003; Serumaga 2011). These studies found that adding P4P to an existing payment method probably slightly improved the care provided by health professionals (moderate‐certainty evidence) and may have little or no effect on utilisation of health services or patient outcomes (low‐certainty evidence) (summary of findings Table for the main comparison).

Effect on provision outcomes

Nine studies reported provision outcomes, of which six randomised trials, Bardach 2013, Hillman 1998, Hillman 1999, Petersen 2013, Roski 2003, and An 2008, and one controlled before‐after study, Basinga 2011, reported seven dichotomous provision outcomes and nine continuous provision outcomes. Two ITS studies reported six dichotomous provision outcomes (McLintock 2014; Serumaga 2011). The nine studies included a variety of specific provision outcome measures (Table 5).

In the six randomised trials, Bardach 2013, Hillman 1998, Hillman 1999, Petersen 2013, Roski 2003, and An 2008, and the controlled before‐after study, Basinga 2011, that we included in the effects analysis, four studies reported dichotomous outcomes (Bardach 2013; Basinga 2011; Petersen 2013; Roski 2003). Of these, three studies reported an adjusted risk ratio (RR) and its confidence interval (CI), or reported other outcome measures and relevant data to calculate an adjusted RR and its CI (Bardach 2013; Basinga 2011; Petersen 2013). If we included only those three studies in the primary synthesis analysis (Table 6), the adjusted RR for improvement in service provision was 1.08 (95% CI 1.03 to 1.14) (Analysis 1.1). If we included all four studies in the primary analysis, the adjusted RR for improvement in services provision across the four studies ranged from 1.01 to 1.17 (median = 1.095). Three studies reported nine continuous outcomes (Table 7) (An 2008; Hillman 1998; Hillman 1999). Only one study did not report baseline data for calculating the baseline adjusted relative change (An 2008), and this study was excluded from this analysis. For the continuous provision outcomes, the adjusted percentage change ranged from ‐8.49% to 5.8% (median = ‐1.345%).

Table 6. Effects of P4P on dichotomous provision outcomes

Study

Outcome measures

Control/baseline level

Risk ratio

Confidence intervals

1. USA Bardach 2013, randomised trial

Proportion of patients with ischaemic vascular disease or diabetes mellitus getting aspirin therapy prescription

59.7%

1.10

1.04, 1.16

Proportion of patients getting smoking cessation intervention

18.9%

1.23

1.03, 1.46

Synthesised effects inside the study (fixed‐effect model)

1.11

1.05, 1.17

2. USA Petersen 2013, randomised trial

Percentage of patients prescribed guideline‐recommended medications

63.0%

1.01

0.92, 1.12

3. Rwanda Basinga 2011, CBA

Proportion of respondents getting tetanus vaccine during prenatal visit

67.0%

1.08

0.997, 1.15

Synthesised effect across the above 3 studies (random‐effects model)

1.08

1.03, 1.14

4. USA Roski 2003, randomised trial

Percentage of patients identified as tobacco users at last visit

40.5%

1.20

Percentage of smokers who received advice to quit

35.4%

1.17

Percentage of smokers who were offered assistance to quit at last visit

19.7%

0.72

Synthesised effects inside the study (median)

1.17

Synthesised effect across the above 4 studies (median)

1.095

CBA: controlled before‐after study
P4P: pay for performance
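
The median‐based synthesis in Table 6 can be checked with a few lines of Python. The sketch below takes the study‐level risk ratios as reported in the table, uses the within‐study median for Roski 2003 (which reported no confidence intervals), and then takes the median across the four studies. Variable names are illustrative.

```python
from statistics import median

# Risk ratios for the three Roski 2003 outcomes, as reported in Table 6.
roski_2003_rrs = [1.20, 1.17, 0.72]

# One risk ratio per study: the within-study synthesised value for
# Bardach 2013, the single reported value for Petersen 2013 and
# Basinga 2011, and the within-study median for Roski 2003.
study_level_rrs = {
    "Bardach 2013": 1.11,
    "Petersen 2013": 1.01,
    "Basinga 2011": 1.08,
    "Roski 2003": median(roski_2003_rrs),
}

# With an even number of studies, the median is the mean of the two
# middle values (here 1.08 and 1.11).
overall_median_rr = median(study_level_rrs.values())
```

Taking the median within each study first means that a study reporting several outcomes contributes only one value to the overall median.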

Table 7. Effects of P4P on continuous provision outcomes

Study

Outcome measures

Control/baseline level

Absolute change

Relative change

1. USA An 2008, randomised trial

Rate of referral of smokers to quit line (not adjusted by baseline, not used for analysis)

4.2%

7.2%

+171%

2. USA Hillman 1999, randomised trial

Compliance scores for immunisation

60.2%

4.8%

+7.97%

Compliance scores for other indicators

55.2%

3.2%

+5.80%

Overall compliance scores

53.7%

2.8%

+5.21%

Synthesised effects inside the study (median)

3.2%

5.80%

3. USA Hillman 1998, randomised trial

Compliance scores for Pap test

25.4%

‐5.2%

‐20.47%

Compliance scores for colorectal screening

14.9%

3.3%

+22.15%

Compliance scores for mammography

40.9%

‐5.1%

‐12.47%

Compliance scores for breast exam

23.0%

‐0.2%

‐0.87%

Total compliance scores

27.1%

‐2.3%

‐8.49%

Synthesised effects inside the study (median)

‐2.3%

‐8.49%

Synthesised effect across the above studies, excluding An 2008 (median)

‐1.345%

P4P: pay for performance
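
The relative changes in Table 7 are consistent with dividing the absolute change by the control/baseline level. The Python sketch below works under that assumption, which reproduces the reported percentages, and then repeats the median synthesis across Hillman 1999 and Hillman 1998 (An 2008 is left out because it was not adjusted by baseline). Function and variable names are illustrative.

```python
from statistics import median


def relative_change(baseline, absolute_change):
    """Relative change in percent, assuming absolute change / baseline."""
    return 100.0 * absolute_change / baseline


# (control/baseline level, absolute change) in percentage points, from Table 7.
hillman_1999 = [(60.2, 4.8), (55.2, 3.2), (53.7, 2.8)]
hillman_1998 = [(25.4, -5.2), (14.9, 3.3), (40.9, -5.1), (23.0, -0.2), (27.1, -2.3)]


def study_median(rows):
    """Median relative change within one study, rounded as in the table."""
    return median(round(relative_change(base, change), 2) for base, change in rows)


median_1999 = study_median(hillman_1999)
median_1998 = study_median(hillman_1998)

# Median across the two studies retained in the analysis.
overall_median_change = median([median_1999, median_1998])
```

Because only two studies remain, the overall median is simply the midpoint between the two within‐study medians, so it is sensitive to either study.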

One ITS study evaluated the impact of P4P on the proportion of patients receiving one, two, or three blood pressure control drugs (Serumaga 2011). It found little or no impact (Table 8). The second ITS study evaluated the impact of P4P on diagnosis and treatment of depression in patients with diabetes and coronary heart disease (McLintock 2014). It found an increase in the rate of diagnoses and the rate of new antidepressant prescriptions, but little or no change in the upward trend of these indicators.
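
ITS results are summarised as an immediate change in level and a change in trend. One common way to obtain such estimates is to fit a straight line to the observations before the intervention and another to those after it, then compare the two lines at the intervention time. The Python sketch below illustrates this under simplifying assumptions: it ignores autocorrelation and seasonality, it is not necessarily the model used by the included studies, and the series is hypothetical and noiseless.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def its_effects(times, values, intervention_time):
    """Return (change in level, change in trend) at the intervention time."""
    pre = [(t, v) for t, v in zip(times, values) if t < intervention_time]
    post = [(t, v) for t, v in zip(times, values) if t >= intervention_time]
    pre_slope, pre_intercept = fit_line(*zip(*pre))
    post_slope, post_intercept = fit_line(*zip(*post))
    # Level change: gap between the two fitted lines at the intervention time.
    level_change = (post_intercept + post_slope * intervention_time) - (
        pre_intercept + pre_slope * intervention_time
    )
    # Trend change: difference between the post and pre slopes.
    trend_change = post_slope - pre_slope
    return level_change, trend_change


# Hypothetical monthly series: rises 0.5 per month, then at month 12
# jumps by 2 and rises 0.75 per month afterwards.
times = list(range(24))
values = [
    50 + 0.5 * t if t < 12 else 50 + 0.5 * t + 2 + 0.25 * (t - 12)
    for t in times
]
level_change, trend_change = its_effects(times, values, intervention_time=12)
```

A result of "little or no impact" corresponds to both estimates being close to zero with confidence intervals that include zero.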

Table 8. Effect measures of included ITS and RM studies

Comparison 1: P4P plus some existing payment method vs existing payment method

Study

Immediate change in level

Change in trend

Other effects results reported by authors

Estimate

Confidence interval

Estimate

Confidence interval

Serumaga 2011, ITS

Proportion of patients receiving 1 drug (%) (provision outcome)

0.07

‐0.83, 0.98

0.03

‐0.01 to 0.07

Proportion of patients receiving 2 drugs (%) (provision outcome)

0.03

‐0.19, 0.26

‐0.01

‐0.01 to 0.02

Proportion of patients receiving 3 or more drugs (%) (provision outcome)

0.11

‐0.26, 0.47

0.02

‐0.15 to 0.18

Percentage of patients with blood pressure measured each month (%) (patient outcome, utilisation)

0.85

‐3.04, 4.74

‐0.01

‐0.24 to 0.21

Proportion of patients with controlled blood pressure (%) (patient outcome, health)

‐1.19

‐2.06, 1.09

‐0.01

‐0.06 to 0.03

Percentage of patients with hypertension‐related adverse outcomes (myocardial infarction, stroke, renal failure, heart failure) or on all‐cause mortality (%) (patient outcome, health)

0.07

‐0.13, 0.28

0.05

‐0.02 to 0.07

Alshamsan 2012, RM

Systolic blood pressure level (patient outcome, health)

‐1.95

‐2.87, ‐1.02

‐1.04

‐1.42 to ‐0.64

Diastolic blood pressure level (patient outcome, health)

‐0.51

‐1.05, 0.01

0.19

‐0.03 to 0.41

Total cholesterol level (patient outcome, health)

‐0.12

‐0.18, ‐0.06

0.03

0.01 to 0.05

Glycated haemoglobin level (patient outcome, health)

0.04

‐0.04, 0.12

0.19

0.15 to 0.22

Lee 2011, RM

Systolic blood pressure level for CHD patients (patient outcome, health)

‐0.81

‐2.01, 0.49

‐0.53

‐1.09 to 0.02

Diastolic blood pressure level for CHD patients (patient outcome, health)

‐0.32

‐1.06, 0.42

0.32

‐0.00 to 0.64

Total cholesterol level for CHD patients (patient outcome, health)

‐0.01

‐0.08, 0.06

0.02

‐0.01 to 0.05

Systolic blood pressure level for stroke patients (patient outcome, health)

‐1.92

‐3.89, 0.05

‐0.79

‐1.64 to 0.06

Diastolic blood pressure level for stroke patients (patient outcome, health)

‐0.38

‐1.50, 0.74

0.26

‐0.22 to 0.74

Total cholesterol level for stroke patients (patient outcome, health)

‐0.11

‐0.23, 0.02

‐0.01

‐0.05 to 0.07

Systolic blood pressure level for hypertension patients (patient outcome, health)

‐1.18

‐1.76, ‐0.61

‐0.83

‐1.08 to ‐0.58

Diastolic blood pressure level for hypertension patients (patient outcome, health)

‐0.77

‐1.10, ‐0.43

0.03

‐0.11 to 0.18

McLintock 2014

Rate of coded case finding for depression in patients with diabetes and CHD (provision outcome)

Increase from 0.07/1000 to 7.45/1000 per month (OR 99.76, 95% CI 83.15 to 119.68)

Rate of new depression‐related diagnoses in patients with diabetes and CHD (provision outcome)

Increase from 21/1000 to 94/1000 per month (OR 2.09, 95% CI 1.92 to 2.27), the trends before and after interventions were 0.

Rate of new antidepressant prescribing in these patients (provision outcome)

Rates of prescribing increased over the full period of observation. The trends before and after interventions were similar.

Chien 2012

Rate of patients receiving glycated haemoglobin testing (patient outcome, utilisation)

After the intervention the adjusted RR 1.00 (95% CI 0.94 to 1.04)

Rate of patients receiving lipid testing (patient outcome, utilisation)

After the intervention the adjusted RR 1.02 (95% CI 0.99 to 1.04)

Rate of patients receiving dilated eye exam (patient outcome, utilisation)

After the intervention the adjusted RR 0.95 (95% CI 0.84 to 1.05)

Comparison 3: Capitation vs FFS

Study

Immediate change in level

Change in trend

Other effects results reported by authors

Estimate

Confidence interval

Estimate

Confidence interval

Catalano 2005, ITS

Number of emergency visits in not‐for‐profit health centres' area (patient outcome, health)

‐7.422

‐12.808 to ‐2.036

‐0.332

‐0.510 to ‐0.154

Number of emergency visits in for‐profit health centres' area (patient outcome, health)

‐5.305

‐12.861 to 2.251

‐0.164

‐0.419 to 0.091

Catalano 2000, ITS

Number of people in outpatient treatment (provision outcome)

Weekly mean increase from 1196 before to 1299 after the intervention in for‐profit community health centres, the difference between real and expected level from history trend being 82.92, P < 0.01. No effects on not‐for‐profit community health centres.

Number of very young (< 5 years old) children in treatment (provision outcome)

Weekly mean increase from 94 before intervention to 100 after intervention in for‐profit community health centres, the difference between real and expected level from history trend being 18.53, P < 0.01. No effects on not‐for‐profit community health centres.

Number of children who receive treatment for disruptive behaviour (provision outcome)

Weekly mean increase from 287 before intervention to 318 after intervention in for‐profit community health centres, the difference between real and expected level from history trend being 72, P < 0.01. No effects on not‐for‐profit community health centres.

Number of inpatients treated (patient outcome, health)

Weekly mean decrease from 77 before intervention to 13 after intervention in not‐for‐profit health centres, the difference between real and expected level from history trend being ‐49, P < 0.01, weekly mean decreasing from 96 to 45 in for‐profit health centres, the difference between real and expected level from history trend being ‐52, P < 0.01.

Number of people treated in emergency (patient outcome, health)

Weekly mean change from 7.9 to 7.6 in for‐profit community health centres, the difference between real and expected level from history trend being 6.66, P < 0.01. No effects on not‐for‐profit community health centres.

Total costs for all services

Weekly mean change from 507,796 before intervention to 534,800 after intervention, the difference between real and expected level from history trend being USD ‐211,400 in not‐for‐profit health centres P < 0.01, weekly mean changing from 421,705 to 441,341, the difference between real and expected level from history trend being USD ‐178,500 in for‐profit health centres P < 0.01.

Total costs for inpatient care

Weekly mean change from 186,834 to 51,717, the difference between real and expected level from history trend being USD ‐134,200 in not‐for‐profit health centres P < 0.01, weekly mean changing from 216,166 to 111,238, the difference between real and expected level from history trend being USD ‐201,200 in for‐profit health centres P < 0.01.

Total outpatient costs

Weekly mean change from 205,539 to 330,102 in for‐profit health centres, the difference between real and expected level from history trend being USD 44,577, P < 0.01. No effects on not‐for‐profit community health centres.

CHD: coronary heart disease
CI: confidence interval
FFS: fee‐for‐service
ITS: interrupted time series study
OR: odds ratio
P4P: pay for performance
RM: repeated measures study
RR: risk ratio

Effect on patient outcomes

Among the 13 studies included in this comparison, 10 studies reported patient outcomes, including five randomised trials and one controlled before‐after study reporting 23 dichotomous outcomes and 3 continuous outcomes (An 2008; Bardach 2013; Basinga 2011; Engineer 2016; Petersen 2013; Roski 2003), and 4 ITS or repeated measures (RM) studies reporting 6 dichotomous outcomes and 12 continuous outcomes (Alshamsan 2012; Chien 2012; Lee 2011; Serumaga 2011). The specific patient outcome measures varied (Table 5). We grouped these outcome measures into outcomes related to utilisation of services and outcomes related to health outcomes, as described in the Methods section.

The outcome measure for one trial was a combination of patient outcome and provision outcome and was excluded from analysis (Petersen 2013). In the remaining studies, three studies' outcomes were related to utilisation of services (Basinga 2011; Engineer 2016; Roski 2003), of which two studies reported an adjusted RR and its CI, or reported relevant data to calculate an adjusted RR and its CI (Basinga 2011; Engineer 2016). Including only these two studies in the primary analysis (Table 9), the adjusted RR for improvement in service utilisation was 1.11 (95% CI 1.02 to 1.22) (Analysis 2.1). If we included all four studies in the primary analysis, the adjusted RR for improvement in service utilisation across the four studies ranged from 0.96 to 1.15 (median = 1.01).

Table 9. Effects of P4P on dichotomous patient outcomes

| Study | Outcome measures | Control/baseline level | Risk ratio | Confidence intervals |
|---|---|---|---|---|
| Utilisation outcomes | | | | |
| 1. Rwanda Basinga 2011, CBA | Proportion of respondents having any prenatal care | 96.0% | 1.002 | 0.98, 1.03 |
| | Proportion of respondents having 4 or more prenatal care visits | 11.0% | 1.07 | 0.43, 1.72 |
| | Proportion of respondents having institutional delivery | 36.0% | 1.23 | 1.04, 1.41 |
| | Proportion of children younger than 23 months with a preventive visit in previous 4 weeks | 0.24% | 1.50 | 1.17, 1.83 |
| | Proportion of children aged 24 to 59 months with a preventive visit in previous 4 weeks | 0.14% | 1.79 | 1.42, 2.16 |
| | Proportion of children aged 12 to 23 months being fully immunised | 0.63% | 0.91 | 0.71, 1.12 |
| | Synthesised effects inside the study (fixed‐effect model) | | 1.01 | 0.99, 1.04 |
| 2. Afghanistan Engineer 2016, randomised trial | Percentage of current use of modern family planning methods | 10.3 | 0.96 | 0.47, 1.95 |
| | Percentage of at least 1 antenatal checkup from a skilled provider | 56.9 | 1.01 | 0.85, 1.20 |
| | Percentage of skilled birth attendant present at latest delivery | 22.5 | 1.19 | 0.91, 1.55 |
| | Percentage of postnatal checkup within 42 days of delivery by a skilled provider | 24.7 | 1.03 | 0.10, 10.39 |
| | Percentage of children received pentavalent 3 vaccination | 62.0 | 0.95 | 0.90, 0.99 |
| | Synthesised effects inside the study (fixed‐effect model) | | 0.96 | 0.92, 1.00 |
| | Synthesised effect across the above 2 studies (random‐effects model) | | 1.11 | 1.02, 1.22 |
| 3. USA Roski 2003, randomised trial | Percentage of respondents reporting using any aids for smoking cessation | 22.3% | 0.93 | |
| | Percentage of respondents reporting using any medication for quitting | 21.6% | 0.92 | |
| | Percentage of respondents reporting using any counselling services | 1.0% | 1.23 | |
| | Percentage of smoker respondents with 7‐day sustained abstinence from smoking | 19.2% | 1.16 | |
| | Percentage of respondents being current non‐smokers (7‐day point prevalence) | 19.2% | 1.17 | |
| | Percentage of respondents reporting intention to quit within 30 days | 9.4% | 1.13 | |
| | Synthesised effects inside the study (median) | | 1.145 | |
| | Synthesised effect across the above 3 studies (median) | | 1.01 | |
| Health outcomes | | | | |
| 4. USA Bardach 2013, randomised trial | Proportion of patients with no IVD or DM getting blood pressure control | 34.6% | 1.14 | 1.03, 1.25 |
| | Proportion of patients with IVD getting blood pressure control | 47.8% | 0.82 | 0.56, 1.11 |
| | Proportion of patients with DM getting blood pressure control | 11.8% | 1.43 | 1.10, 1.84 |
| | Proportion of patients with IVD or DM getting blood pressure control | 17.0% | 1.29 | 1.06, 1.55 |
| | Proportion of general population with cholesterol control | 91.4% | 0.99 | 0.96, 1.01 |
| | Synthesised effects inside the study (fixed‐effect model) | | 1.01 | 0.98, 1.04 |
| Combined health and provision outcomes | | | | |
| 5. USA Petersen 2013, randomised trial | Percentage of patients achieving guideline‐recommended blood pressure thresholds or receiving an appropriate response to uncontrolled blood pressure (combination of provision and patient outcome measure, not used for analysis) | 86% | 1.04 | 0.98, 1.10 |
| | Synthesised effect across the above 3 studies (median) | | 1.07 | |

CBA: controlled before‐after study
DM: diabetes mellitus
IVD: ischaemic vascular disease
P4P: pay for performance
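
The median‐based rows of Table 9 can be checked by hand. The sketch below recomputes the within‐study median for Roski 2003 and the median across the three study‐level utilisation estimates, using only the risk ratios listed in the table. The variable names are illustrative and are not taken from the review.

```python
from statistics import median

# Risk ratios for the six Roski 2003 utilisation outcomes, as listed in Table 9.
roski_2003 = [0.93, 0.92, 1.23, 1.16, 1.17, 1.13]

# Within-study synthesis by median (even count: mean of the two middle values).
within_roski = median(roski_2003)

# Study-level synthesised risk ratios from Table 9:
# Basinga 2011 (1.01), Engineer 2016 (0.96), Roski 2003 (median above).
study_level = [1.01, 0.96, within_roski]
across_studies = median(study_level)

print(f"Roski 2003 median RR: {within_roski:.3f}")
print(f"Median RR across studies: {across_studies:.2f}")
```

Using the median avoids needing confidence intervals, which Roski 2003 did not report, at the cost of ignoring differences in precision between outcomes and studies.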

Two randomised trials reported continuous outcomes related to utilisation of services (An 2008; Engineer 2016). An 2008 found that the overall percentage of smokers who enrolled in a quit line service was higher in intervention clinics (3.0%) than in control clinics (1.3%; relative change without adjusting for baseline 131%, P = 0.005). Engineer 2016 evaluated the effects of P4P on the equity of health service utilisation by measuring the concentration index, which captures the extent to which a health indicator is concentrated among the disadvantaged or the advantaged. The concentration index ranges from ‐1 to 1, with 0 indicating complete equity. When a population is ranked by increasing socioeconomic status, the concentration index is negative when the health indicator is concentrated among the disadvantaged and positive when it is concentrated among the advantaged. The study found little or no change in the level of inequity for institutional deliveries (concentration index increased by 75.7% from a baseline level of 0.1000 (P = 0.3)) or for children's utilisation of outpatient services (concentration index decreased by 46.81% from a baseline level of 0.0047 (P = 0.98)).
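
To make the sign convention of the concentration index concrete, the following is a minimal sketch using the common covariance formulation (twice the covariance between the health indicator and the fractional socioeconomic rank, divided by the indicator's mean). This is an assumption for illustration: the review does not state which estimator Engineer 2016 used, and the function name and toy data are hypothetical.

```python
def concentration_index(health, ses):
    """Concentration index of `health` over individuals ranked by `ses`.

    Assumes individual-level data, equal weights, and a positive mean of
    `health`. Ties in `ses` are broken arbitrarily by the sort.
    """
    # Rank individuals from lowest to highest socioeconomic status.
    ordered = [h for _, h in sorted(zip(ses, health), key=lambda p: p[0])]
    n = len(ordered)
    mean_h = sum(ordered) / n
    # Fractional rank of the i-th poorest individual; these ranks average 0.5.
    ranks = [(i + 0.5) / n for i in range(n)]
    cov = sum((h - mean_h) * (r - 0.5) for h, r in zip(ordered, ranks)) / n
    return 2 * cov / mean_h


# Toy data: 1 = used the service, 0 = did not; ses is a wealth score.
ses = [1, 2, 3, 4]
print(concentration_index([0, 0, 0, 1], ses))  # use concentrated among the richest
print(concentration_index([1, 0, 0, 0], ses))  # use concentrated among the poorest
print(concentration_index([1, 1, 1, 1], ses))  # use spread equally
```

The first call gives a positive value, the second a negative value, and the third zero, matching the interpretation described above.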

One trial's outcomes were related to health outcomes (proportion of patients with blood pressure control and proportion of general population with cholesterol control) (Bardach 2013). In this study, the adjusted RR for improvement in patients' health outcomes was 1.01 (95% CI 0.98 to 1.04) (Analysis 3.1).
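
As an illustration of how a within‐study fixed‐effect estimate of this kind can be obtained, the sketch below applies generic inverse‐variance pooling on the log scale to the five Bardach 2013 risk ratios listed in Table 9. It assumes the reported intervals are symmetric 95% intervals on the log scale and that the outcomes can be treated as independent; the review's own analysis may have used different inputs, so the result should be close to, but need not exactly match, the reported values. Names are illustrative.

```python
import math

# (risk ratio, lower 95% limit, upper 95% limit) for Bardach 2013, from Table 9.
BARDACH_2013 = [
    (1.14, 1.03, 1.25),
    (0.82, 0.56, 1.11),
    (1.43, 1.10, 1.84),
    (1.29, 1.06, 1.55),
    (0.99, 0.96, 1.01),
]

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of risk ratios on the log scale."""
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, lower, upper in estimates:
        # Recover the standard error of log(RR) from the width of the interval.
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z95 * pooled_se),
        math.exp(pooled_log + Z95 * pooled_se),
    )


pooled_rr, ci_low, ci_high = pool_fixed_effect(BARDACH_2013)
print(f"Pooled RR {pooled_rr:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The cholesterol outcome has by far the narrowest interval, so it dominates the pooled estimate; this is why the pooled RR sits near 1 even though three of the blood pressure outcomes have RRs well above 1.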

One ITS study, Serumaga 2011, and two RM studies, Alshamsan 2012 and Lee 2011, reported the immediate change in level and the change in trend after the intervention for patient outcomes (Table 8). Serumaga 2011 evaluated the effects of a P4P incentive on quality of care and outcomes among patients with hypertension in primary care in the United Kingdom. This study included patient utilisation and patient health outcomes: the percentage of patients with blood pressure measured, the proportion of patients with controlled blood pressure, and the percentage of patients with hypertension‐related adverse outcomes (myocardial infarction, stroke, renal failure, heart failure). It found little or no change in the level or trend of these outcome measures after the P4P scheme.

Alshamsan 2012 evaluated the same P4P scheme in the United Kingdom, examining systolic blood pressure level, diastolic blood pressure level, total cholesterol level, and glycated haemoglobin level in patients with diabetes. It found that the introduction of this P4P scheme was associated with an initial reduction in systolic blood pressure level, and that this improvement was sustained over the three years following its introduction. The scheme was also associated with an initial reduction in cholesterol level, but this reduction was not sustained, and the scheme had little or no effect on diastolic blood pressure level or glycated haemoglobin level.

Lee 2011 also evaluated this P4P scheme in the United Kingdom, focusing on the management of stroke, hypertension, and coronary heart disease (CHD). It found that the scheme resulted in a reduction in systolic blood pressure level and diastolic blood pressure level for hypertension patients immediately after the start of the scheme, but this reduction was sustained only for the systolic blood pressure level. This study found that the P4P scheme had little or no impact on other health outcome measures, including systolic blood pressure level and diastolic blood pressure level for CHD patients, total cholesterol level for CHD patients, systolic blood pressure level and diastolic blood pressure level for stroke patients, and total cholesterol level for stroke patients.

Another RM study reported insufficient data for re‐analysis to obtain the change in level and change in trend for outcome measures (Chien 2012). It found that immediately after the intervention there was little or no change in the rate of patients receiving glycated haemoglobin testing, lipid testing, or dilated eye examination.
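
The "immediate change in level" and "change in trend" reported for the ITS and RM studies are the quantities typically estimated by segmented regression. The sketch below fits such a model by ordinary least squares to synthetic, noise‐free data (not data from any included study). It assumes a single intervention point and ignores autocorrelation and seasonality, which a real ITS analysis would need to address; all function names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def segmented_fit(y, start):
    """Fit y = b0 + b1*t + b2*post + b3*post*(t - start) by least squares.

    b2 is the immediate change in level at the intervention and
    b3 is the change in trend (slope) after it.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic series: 24 periods before and 24 after a hypothetical intervention,
# with a level drop of 3.0 and a slope change of -0.2 built in.
START = 24
series = [
    10.0 + 0.5 * t + ((-3.0 - 0.2 * (t - START)) if t >= START else 0.0)
    for t in range(48)
]
b0, b1, level_change, trend_change = segmented_fit(series, START)
print(f"level change {level_change:.3f}, trend change {trend_change:.3f}")
```

Because the synthetic series has no noise, the fit recovers the built‐in level and trend changes; with real data the estimates would carry uncertainty, which is what the confidence intervals in Table 8 express.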

Effect on healthcare provider outcomes

No relevant healthcare provider outcomes were reported.

Effect on cost

Only An 2008 reported the cost of implementing the P4P program for referral of smokers to telephone counselling. The results showed that the P4P intervention costs were greater than usual care costs without P4P incentives by USD 86,796 in total. In return for these costs, intervention clinics provided 1042 additional referrals of smokers to telephone counselling that resulted in 289 additional enrollees. The marginal cost for the intervention clinics was therefore USD 83 per additional referral to telephone counselling and USD 300 per additional enrollee to quit line services (low‐certainty evidence) (summary of findings Table for the main comparison).
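
The marginal cost figures follow directly from dividing the extra cost by the extra output. A short check of that arithmetic, using only the numbers reported above (variable names are illustrative):

```python
# Figures reported for An 2008 in the text above.
extra_cost_usd = 86_796        # additional cost of P4P over usual care
extra_referrals = 1042         # additional referrals to telephone counselling
extra_enrollees = 289          # additional enrollees in quit line services

cost_per_referral = extra_cost_usd / extra_referrals
cost_per_enrollee = extra_cost_usd / extra_enrollees

print(f"USD {cost_per_referral:.2f} per additional referral")
print(f"USD {cost_per_enrollee:.2f} per additional enrollee")
```

Rounded to whole dollars, these reproduce the USD 83 per referral and USD 300 per enrollee quoted in the text.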

Unintended or adverse effects

Four studies reported some unintended or adverse effects. Petersen 2013 found that after the P4P intervention had ended, there was a significant reduction in blood pressure control and appropriate response to uncontrolled blood pressure in the intervention group compared with the control group (low‐certainty evidence) (summary of findings Table for the main comparison).

Effect on secondary outcomes

No relevant secondary outcomes were reported.

Comparison 2: Capitation plus P4P versus FFS

We included one study in this comparison (Yip 2014), which found that capitation plus P4P probably slightly improved antibiotic use, which was the performance target for the P4P (moderate‐certainty evidence) (summary of findings Table 2).

The intervention in this study was a payment reform aimed at improving the quality of services provided by primary health providers in rural China. The performance target was designed to control antibiotics use. This intervention was applied by the New Rural Cooperative Medical Insurance (NCMS), the major health insurance for rural residents in China. In intervention areas, NCMS changed its traditional FFS to the capitated budget based on the number of NCMS enrollees for each health facility, and at the beginning of every year the NCMS disbursed 70% of the budget to the health centres, withholding the balance until after performance assessments at the middle and end of the year. The performance indicators included antibiotic prescription rates (oral and by injection) and measures of patient satisfaction.
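
The budget mechanics described above can be sketched as follows. Only the 70% up‐front share comes from the text; the per‐enrollee rate and enrollee count are hypothetical placeholders, and the rule for releasing the withheld balance after the performance assessments is not specified in the review, so it is not modelled here.

```python
def split_capitated_budget(enrollees, rate_per_enrollee, upfront_share=0.70):
    """Split an annual capitated budget into up-front and withheld parts.

    `rate_per_enrollee` is a hypothetical input; the 0.70 default is the
    up-front share reported for the scheme evaluated in Yip 2014.
    """
    budget = enrollees * rate_per_enrollee
    upfront = budget * upfront_share
    withheld = budget - upfront  # held back pending performance assessments
    return budget, upfront, withheld


# Hypothetical facility: 10,000 enrollees at CNY 50 per enrollee per year.
budget, upfront, withheld = split_capitated_budget(10_000, 50.0)
print(f"budget {budget:.0f}, up-front {upfront:.0f}, withheld {withheld:.0f}")
```

Withholding part of the budget is what links the capitation payment to the performance indicators: the facility's total income depends on the assessment results rather than on the volume of services billed.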

Effect on provision outcomes

One randomised trial evaluated this intervention (Yip 2014), finding that compared with FFS, capitation combined with P4P targeting control of antibiotic prescriptions led to a reduction of antibiotic prescriptions in village and township health facilities in China (adjusted RR 0.84, 95% CI 0.74 to 0.96) (Analysis 4.1).

Effect on patient outcomes

No relevant outcomes were reported.

Effect on healthcare provider outcomes

No relevant healthcare provider outcomes were reported.

Effect on cost

No relevant outcomes were reported.

Unintended or adverse effects

Capitation could provide incentives to underprovide health care, so this study also analysed whether the intervention influenced patient volume. It found that at the township and village health facility levels, the adjusted relative change in the number of patient visits per day was ‐14.3% (P > 0.01) and ‐9.3% (P > 0.01), respectively.

Effect on secondary outcomes

Yip 2014 also analysed the effects of capitation plus P4P on total expenditure per visit, drug expenditure per visit, and patient satisfaction. At the township health facility level, there was little or no difference in the total expenditure, which increased by CNY 0.02 (adjusted relative change 0.096%, P = 0.994), or the expenditure for drugs, which decreased by CNY 0.88 (adjusted relative change 4.74%, P = 0.600). At the village health facility level, the total expenditure decreased by CNY 1.04 (adjusted relative change 6.3%, P = 0.002), and there was little or no difference in expenditure for drugs, which decreased by CNY 0.24 (adjusted relative change 2.1%, P = 0.227). There was little or no difference in patients' satisfaction with the healthcare services, measured using a satisfaction score from 1 (very dissatisfied) to 5 (very satisfied) (adjusted relative change ‐0.46% (P = 0.913) and ‐0.38% (P = 0.693) at the township and village health facility level, respectively).

Comparison 3: Capitation versus FFS

Two studies evaluated the effects of capitation for community mental health centres compared with FFS in the United States (Catalano 2000; Catalano 2005). The effects of capitation compared to FFS based on these studies were uncertain (very low‐certainty evidence) (summary of findings Table 3).

Effect on provision outcomes

Catalano 2000 was an ITS that analysed the effects of capitation on provision outcomes. Capitation offered incentives to provide more prevention or outpatient services for controlling cost with the fixed total payment based on number of registered patients. This study used three provision outcomes: the number of people receiving outpatient treatment, the number of very young (less than 5 years old) children in treatment, and the number of children receiving treatment for disruptive behaviour, because the author assumed that capitation incentivised providers to detect health problems in children of a younger age and with a less serious status, so that more serious episodes and more expensive treatment were reduced. This ITS did not report the results in detail, and provided insufficient data to conduct re‐analysis, so we only described the results reported by the authors. They reported increases in all three provision outcomes with capitation in only one subgroup of intervention areas (for‐profit community health centres) (Table 8).

Effect on patient outcomes

Two outcomes, the number of people requiring inpatient treatment and the number of people requiring emergency treatment, were regarded as health outcomes. Catalano 2000 found that capitation resulted in a decrease in the number of inpatients treated in all subgroups of intervention areas; for the number of people treated in emergency, only the initial level increase was observed in the for‐profit community health centres subgroup, but the increase was not sustained (Table 8).

Another ITS study also reported effects on emergency visits (Catalano 2005). In not‐for‐profit health centres, there was a reduction in emergency visits shortly after the introduction of capitation payment, and the increasing trend in emergency visits was reduced after capitation; in for‐profit health centres, there was little or no effect on either the level or the trend of emergency visits (Table 8).

Effect on healthcare provider outcomes

No relevant healthcare provider outcomes were reported.

Effect on costs

Catalano 2000 reported downward shifts in the total costs for all services and the total costs for inpatient care in both the not‐for‐profit and for‐profit capitated health centres. Regarding total outpatient costs, the increase was only found in the for‐profit health centres (Table 8).

Unintended or adverse effects

No relevant outcomes were reported.

Effect on secondary outcomes

No relevant outcomes were reported.

Discussion

Summary of main results

See summary of findings Table for the main comparison, summary of findings Table 2, summary of findings Table 3

P4P combined with an existing capitation or input‐based payment method compared to the existing payment method

In this comparison, most of the P4P interventions (12 of 14 included P4P programs) were marginal payments (extra payments aiming to modify targeted provider behaviours) that did not replace regular funding systems (or cover the full costs of service provision). We found evidence to suggest that extra P4P incentives probably slightly improved the use of some tests and treatments by health providers, but probably led to little or no difference in adherence to quality assurance criteria. We also found that P4P incentives may lead to little or no difference in patients' utilisation of health services or health outcomes. One study found that adding a P4P scheme to an existing payment method may lead to higher costs than the existing payment method (An 2008).

Capitation combined with P4P compared to FFS

We included only one study in this comparison. The P4P was mainly targeted at controlling antibiotic prescriptions in outpatient visits. Compared with FFS, a capitated budget combined with payment based on providers' performance on antibiotic prescriptions and patient satisfaction probably slightly reduced antibiotic prescriptions in primary health facilities.

Capitation compared to FFS

This intervention targeted mental health centres in the included studies and aimed to motivate health providers to provide more outpatient and preventive services to control overall costs. The effects of capitation compared to FFS based on this evidence were uncertain, because the certainty of this evidence was very low.

Overall completeness and applicability of evidence

The health facilities in the studies included in this review all provided primary health care or mental health care. We found no evidence on dental clinics. This review covered most payment methods for outpatient health facilities other than budget payment. However, we identified only three comparisons. For the comparison of P4P added to an existing capitation or an input‐based payment method versus the existing payment method, four of the primary outcomes were reported in one or more studies: service provision, patient outcomes, costs, and adverse effects. However, only one study reported costs, no studies reported provider outcomes, and the certainty of the evidence was low for utilisation of health services and patient outcomes. These studies were from low‐, middle‐, and high‐income countries.

Nearly all of the P4P programs included extra funding in addition to the change in the payment method. It is thus unclear to what extent the effects of these P4P programs on service provision can be attributed to the increase in resources, and it is uncertain whether P4P programs that do not include extra funding would have similar effects. In addition, information on how the incentive payments were used inside the health facilities was lacking in some of the included studies. Since P4P is intended to improve targeted behaviours through financial incentives, it is uncertain to what extent the way in which incentive payments were used influenced the effects of the P4P programs that were evaluated, and it would be difficult to replicate (or know how to improve) this component of the programs.

The countries in which the included studies were conducted had well‐developed electronic records or insurance claim data (in high‐income countries) or specially designed data systems for evaluating the effects of P4P programs (in low‐ and middle‐income countries). Unavailability of an electronic information system or resources to support the administrative cost of P4P will limit its use.

The studies comparing capitation combined with P4P to FFS, and capitation to FFS, were each conducted in a single country (China and the United States, respectively), so the evidence base is incomplete, and the findings may have limited applicability in other settings.

Subgroup analysis

We included more than one study evaluating the effects of P4P on service provision, utilisation, and health outcomes. Four included studies evaluated utilisation outcomes, but one study did not report the design of P4P components clearly (Roski 2003), leaving only three studies for subgroup analysis of the effects of P4P on service provision (Table 10). Due to the limited number of studies and multiple differences between the P4P programs (Table 4), we were unable to conduct meaningful subgroup comparisons (Table 10).

Table 10. Subgroup analysis

| | Bardach 2013 | Petersen 2013 | Basinga 2011 |
|---|---|---|---|
| Effect size for service provision measures | RR 1.11 (1.05, 1.17) | RR 1.01 (0.92, 1.12) | RR 1.08 (0.997, 1.15) |
| Design of P4P | | | |
| Performance measures | Both provision and outcome measures | Both provision and outcome measures | Both provision and outcome measures |
| Performance target | Pay for each instance of performance measure unit | Pay for each instance of performance measure unit | Pay for each instance of performance measure unit |
| Size of incentive | 5% of an average physician's annual salary | 1.6% of an average physician's annual salary | 35% increase in salary |
| Frequency of monitoring | Quarterly | 4 months | Quarterly |
| Frequency of payment | Annual | 4 months | Quarterly |
| Individual payment | Allocated to individual based on individual performance | Equally allocated to individual physician, non‐physician in team | 77% of P4P fund allocated to individual personnel, but not clear how it was allocated |
| Resourcing (if with more funds) | Yes | Yes | No (in control facilities, the input‐based payments were increased by the average amount of P4P payments received by facilities in the intervention group) |

P4P: pay for performance
RR: risk ratio

Certainty of the evidence

The certainty of the evidence for the effects of adding P4P to an existing payment method on the provision of services (use of tests or treatments) was moderate because of study limitations. We assessed only one study as having low risk of bias. Common problems among the studies included: no clear description of the random allocation method or factors uncontrolled by researchers (policy change) influencing the initial random allocation, and no baseline outcomes and characteristics for small numbers of facilities that were randomly allocated. The certainty of the evidence for adherence to quality criteria was also moderate due to heterogeneity. Only one study provided four specific patient health outcomes. The certainty of this evidence on health outcomes was low because of the risk of bias and uncertainty about the applicability of the evidence outside of the setting in which the study was done (small primary health clinics in the United States) (summary of findings Table for the main comparison).

The certainty of the evidence for the effects of capitation combined with P4P compared to FFS on provision of services was moderate because there was only one randomised trial, which had unclear risk of bias, and because of uncertainty about the applicability of the evidence outside of the setting in which the study was done (primary healthcare facilities in rural China) (summary of findings Table 2).

The certainty of the evidence for the effects of capitation compared to FFS on the provision of services, health outcomes, and costs was very low because of the risk of bias. We included only two ITS studies in mental health centres in the United States in this comparison. The studies used different data sources before and after the intervention (Medicaid fee‐for‐service claims before and a shadow billing system after) (summary of findings Table 3).

Potential biases in the review process

We carried out an extensive search to ensure that we identified all relevant studies, but it remains possible that we could have missed some unpublished studies. We contacted the authors of relevant studies to clarify some questions on the design of payment methods and research results, and for finding additional and ongoing studies, but at the time of submission of this review we had received a reply from the author of only one study.

Agreements and disagreements with other studies or reviews

Current reviews on payment are focused on payments to individual health professionals (Gosden 2000; Houle 2012; Witter 2012), and there are no reviews on payments to facilities. However, there are several reviews on the effectiveness of P4P that overlap with this review (Petersen 2006; Schatz 2008; Scott 2011; Witter 2012). There are two reasons for overlap. One is that some reviews targeted one type of payment intervention (e.g. P4P) but did not constrain the level of payment (individual or facility) (Petersen 2006; Scott 2011; Witter 2012). The other is that one review focused on payment to health professionals (Scott 2011), but also included payment to practices or physician groups. We found that some studies asserted that the payment methods they evaluated were to health professionals or physicians (Engineer 2016; Hillman 1998; Hillman 1999), but that the payment was actually based on performance of a facility or physician group. In this situation, only part of the payment is allocated to individual health professionals, or all of the payment is allocated to individuals but not allocated based on individuals' behaviours. The mechanism of how these payments affect behaviours may be different from how direct payments to individuals affect behaviours. Our review included all payment methods to facilities and excluded direct payments to individuals.

Regarding the Cochrane reviews that overlap with our review (Scott 2011; Witter 2012), we contacted the authors, compared our data extraction, and discussed any disagreements. Several controlled before‐after studies from low‐income countries that were included in the Witter 2012 review (Canavan 2008; Soeters 2008; Soeters 2011) were included in our review only for description and not for effects analysis. We only included studies with low or unclear risk of bias in our analysis, as this provides better evidence to analyse the effects. For the comparison of P4P plus an existing payment method to the existing payment method, other reviews all found that the effects of P4P varied in direction and size (Petersen 2006; Schatz 2008; Scott 2011), so that it was difficult to draw general conclusions. Unlike those reviews, we categorised outcome measures into provision outcomes, utilisation outcomes, and health outcomes, based on the extent of control of health providers on these outcomes, and attempted to draw conclusions on the effects of P4P at the facility level on each category of outcomes.

Figure 1

Study flow diagram.
Figure 2

Study flow diagram.

Analysis 1.1

Comparison 1 Effects of P4P on outpatient health facilities' performance: dichotomous provision outcomes, Outcome 1 Service provision outcomes.

Analysis 2.1

Comparison 2 Effects of P4P on outpatient health facilities' performance: dichotomous patients' utilisation outcomes, Outcome 1 Patients' utilisation outcomes.

Analysis 3.1

Comparison 3 Effects of P4P on outpatient health facilities' performance: dichotomous patients' health outcomes, Outcome 1 Patients' health outcomes.

Analysis 4.1

Comparison 4 Effects of P4P plus capitation on outpatient health facilities' performance compared to FFS, Outcome 1 Service provision outcomes (percentage of getting certain kinds of services, dichotomous).

Analysis 4.2

Comparison 4 Effects of P4P plus capitation on outpatient health facilities' performance compared to FFS, Outcome 2 Patient outcomes (patient satisfaction, continuous).

Summary of findings for the main comparison. P4P plus some existing payment method compared with existing payment method for provision and patient outcomes

P4P plus some existing payment method compared with existing payment method for provision and patient outcomes

Patient or population: outpatient health facilities

Settings: United States, United Kingdom, Rwanda, Afghanistan

Intervention: P4P plus some existing payment method

Comparison: existing payment method (capitation or input‐based payment)

Outcomes

Impact: RR for dichotomous outcomes and relative percentage change for continuous outcomes

Median (range)

No of participants
(studies)

Certainty of the evidence
(GRADE)

Comments

Provision outcomes (prescription of testing or treatment, dichotomous)

The adjusted RR median = 1.095 (ranged from 1.01 to 1.17)

3 randomised trials and 1 CBA

Moderate

⊕⊕⊕⊝

Of 3 randomised trials, 2 were rated as unclear risk of bias, and only 1 was rated as low risk of bias. The certainty was downgraded 1 level because of limitation in study design.

Provision outcomes (compliance with quality criteria, continuous)

The adjusted percentage change median = ‐1.345% (ranged from ‐8.49% to 5.8%)

2 randomised trials

Moderate

⊕⊕⊕⊝

2 randomised trials were rated as unclear risk of bias. The certainty was downgraded 1 level because of limitation in study design.

Patients' utilisation of health services (dichotomous)

The adjusted RR median = 1.01 (ranged from 0.96 to 1.15)

3 randomised trials and 1 CBA

Low

⊕⊕⊝⊝

3 randomised trials were rated as unclear risk of bias. The certainty was downgraded 1 level because of limitation in study design. The heterogeneity among estimates of effect of different studies was tested, and the certainty was downgraded 1 level because of inconsistency.

Patients' health outcomes (dichotomous)

The adjusted RR median = 1.01 (ranged from 0.98 to 1.04)

1 randomised trial

Low

⊕⊕⊝⊝

This trial was rated as unclear risk of bias. In addition, only 1 study targeting small primary health clinics in the United States was included, and the certainty was downgraded 1 level because of indirectness.

Provider outcomes

0

Costs

The P4P intervention costs were greater than usual care costs without P4P incentives by USD 86,796 in total, and USD 83 per additional referral to telephone counselling and USD 300 per additional enrollee to quit line services.

1 randomised trial

Low

⊕⊕⊝⊝

This trial was rated as unclear risk of bias. In addition, only 1 study targeted 1 specific health service (referral to telephone counselling for smokers) in the United States, and the certainty was downgraded 1 level because of indirectness.

Adverse effects

When the P4P intervention ended, there was a significant reduction in performance in the intervention group compared with the control group.

1 randomised trial

Low

⊕⊕⊝⊝

This trial was rated as unclear risk of bias. In addition, only 1 study targeted primary care clinics in 5 Veterans Affairs networks in the United States, and the certainty was downgraded 1 level because of indirectness.

CBA: controlled before‐after study; P4P: pay for performance; RR: risk ratio

GRADE Working Group grades of evidence
High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.

Summary of findings 2. Capitation plus P4P compared with FFS for provision improvement

Capitation plus P4P compared with FFS for provision improvement

Patient or population: primary healthcare facilities in rural areas

Settings: China

Intervention: capitation plus P4P

Comparison: FFS

Outcomes

Impact: RR (95% CI)

No of participants
(studies)

Certainty of the evidence
(GRADE)

Comments

Provision outcomes

The adjusted RR for dichotomous outcome was 0.84 (95% CI 0.74 to 0.96)

1 randomised trial

Moderate

⊕⊕⊕⊝

This trial was rated as unclear risk of bias, and the certainty was downgraded 1 level because of limitation in study design.

Patient outcomes

0

Provider outcomes

0

Costs

0

Adverse effects

0

CI: confidence interval; FFS: fee‐for‐service; P4P: pay for performance; RR: risk ratio

GRADE Working Group grades of evidence
High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.

Summary of findings 3. Capitation compared with FFS for provision, patient, and cost outcomes

Capitation compared with FFS for provision, patient, and cost outcomes

Patient or population: mental health centres

Settings: United States

Intervention: capitation

Comparison: FFS

Outcomes

Impacts

No of participants (studies)

Certainty of the evidence
(GRADE)

Comments

Provision outcomes (number of children treated as outpatients or for disruptive behaviour, or the number of very young children treated, continuous)

1 study showed that in for‐profit mental health centres, capitation resulted in more children being treated as outpatients and for disruptive behaviour, and more very young children being treated.

1 ITS study

Very low

⊕⊝⊝⊝

The study design is ITS and was initially graded as moderate. This study was rated as unclear risk of bias, and so was downgraded 1 level because of limitation in study design. In addition, this study only targeted mental health centres in the United States, and the certainty was downgraded 1 level because of indirectness.

Patient outcomes (number of children in inpatient or emergency treatment, continuous)

1 study showed that capitation resulted in a decrease in the number of inpatients. 2 studies showed contradictory results for the change in number of emergency department visits.

2 ITS studies

Very low

⊕⊝⊝⊝

These 2 ITS studies were rated as unclear risk of bias, so the certainty was downgraded. The studies only targeted mental health centres in the United States, and so the certainty was downgraded because of indirectness. In addition, there was inconsistency in the results of the studies.

Cost outcomes (cost level, continuous)

1 study showed that capitation resulted in a reduction in total costs for all services and costs for inpatient care in all mental health centres, and an increase in outpatients only in for‐profit mental health centres.

1 ITS study

Very low

⊕⊝⊝⊝

The study design is ITS and was initially graded as moderate. This study was rated as unclear risk of bias, and so was downgraded 1 level because of limitation in study design. In addition, this study only targeted mental health centres in the United States, and the certainty was downgraded 1 level because of indirectness.

Provider outcomes

0

Adverse effects

0

FFS: fee‐for‐service; ITS: interrupted time series

GRADE Working Group grades of evidence

High certainty: This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different** is low.
Moderate certainty: This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different** is moderate.
Low certainty: This research provides some indication of the likely effect. However, the likelihood that it will be substantially different** is high.
Very low certainty: This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different** is very high.

**Substantially different = a large enough difference that it might affect a decision.
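The comments columns of the three summary of findings tables all follow the same pattern: a starting certainty level is lowered by one level for each stated reason (limitation in study design, inconsistency, indirectness). A minimal sketch of that bookkeeping is given below. The function name `grade_certainty` is hypothetical, and the starting level of "high" for randomised trials is an assumption taken from common GRADE practice; the tables themselves only state that the ITS studies started at "moderate".

```python
# Ordered from lowest to highest certainty.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(start: str, downgrades: int) -> str:
    """Lower the starting level by one step per downgrade, never below 'very low'."""
    index = LEVELS.index(start) - downgrades
    return LEVELS[max(index, 0)]


# Rows taken from the summary of findings tables above.
# Assumption: randomised trials start at "high"; ITS studies start at
# "moderate", as stated in the comments for Summary of findings 3.
examples = {
    "P4P, provision outcomes (study design)": ("high", 1),
    "P4P, utilisation (study design + inconsistency)": ("high", 2),
    "Capitation vs FFS, ITS (study design + indirectness)": ("moderate", 2),
}

for label, (start, downgrades) in examples.items():
    print(f"{label}: {grade_certainty(start, downgrades)}")
```

Under these assumptions the three example rows come out as moderate, low, and very low, matching the grades reported in the tables.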

Table 1. Outpatient care facilities payment methods and characteristics

Payment methods

Payment rate determined

Payment made

Payment related to

Prospectively

Retrospectively

Prospectively

Retrospectively

Inputs

Outputs

Line‐item budgets

Global budgets

Capitation

Fee‐for‐service

‐Unconstrained

‐Fixed

Pay for performance

Table 2. Incentives in pure reimbursement systems of outpatient care facilities

Reimbursement type

Performance

Services/Case

Quantity

Quality

Cost/Unit

Risk selection

Line‐item budgets

‐‐

+

0

Global budgets

‐‐

‐‐

0

Capitation

‐‐

‐‐

‐‐

++

Fee‐for‐service

‐Unconstrained

++

+

‐‐

0

‐Fixed

++

+

‐‐

‐‐

+

Case‐based

‐‐

++

++

‐‐

+

Pay for performance

+

++

++

‐‐

+

Table 3. Factors that might modify the effects of changes in payment methods on the delivery of services per case

Explanatory factors

How we will categorise the factor

Hypothesised direction of the interaction

Basis for the hypothesis

Larger fees (per service)

Relative increase in fees (continuous)

Larger (positive) effects with larger relative increases

The larger the incentive, the larger the effect

Duration of follow‐up

When outcomes are measured relative to when the change was made (continuous)

Larger (positive) effects with shorter follow‐up

Other changes and adjustments over time might reduce the initial incentive.

Ownership

For‐profit vs not‐for‐profit ownership

Larger (positive) effects with for‐profit ownership

For‐profit facilities might be more motivated to increase income and therefore more sensitive to changes in incentives.

Multiple providers

Choice of providers available to patients vs little or no choice of providers

Larger (negative) effects with little or no choice

Need to attract and retain patients might provide counteractive incentives to offer more services.

Monitoring

Monitoring vs no monitoring of the delivery of services

Larger (negative) effects without monitoring

Monitoring might provide counteractive incentives to offer more services.

Table 4. The characteristics of P4P payments included in review

Study

Performance measures

Performance target

Size of incentive

Frequency of monitoring

Frequency of payment

Individual payment

Resourcing (if with more funds)

Alshamsan 2012

Lee 2011

Serumaga 2011

McLintock 2014

Both provision and outcome measures: 76 clinical quality indicators and 70 indicators relating to organisation of care and patient experience. Of the clinical indicators, 10 relate to maintaining disease registers, 56 to processes of care (such as measuring disease parameters and giving treatments), and 10 to intermediate outcomes (such as controlling blood pressure).

Threshold payment: Practices are awarded points based on the proportion of patients for whom targets are achieved, between a lower achievement threshold of 40% for most indicators (i.e. practices must achieve the targets for over 40% of patients to receive any points) and an upper threshold that varies according to the indicator. Each point earned the practice a set amount of money, adjusted for patient population size and disease prevalence. A maximum of 1000 points was available.

The highest level of performance payment is 25% of total income.

Annual

Annual

Allocated to individual based on individual performance

Yes

An 2008

Provision outcome measures: referral of smokers to consultation

Threshold payment combined with payment for each instance: Clinics that referred 50 smokers would receive a USD 5000 performance bonus. Clinics would also receive USD 25 for each referral beyond the initial 50.

Not clear, but mentioned "This incentive amount was arrived at after consultation with the management team and represents an amount that was judged as likely to be meaningful to most clinics ..."

10 months

10 months

Into clinics' operation fund, no payment to individual physicians and administrators

Yes

Bardach 2013

Both provision and outcome measures: 4 quality goals, including aspirin prescription, blood pressure control, cholesterol control, and smoking cessation intervention provision

Payment for each instance of performance measure unit: An incentive was paid for every instance of a patient meeting the quality goal (e.g. 1 blood pressure control USD 20). A higher payment was made for patients who had certain comorbidities or who, as proxies for socioeconomic status, had Medicaid insurance or were uninsured.

Approximately 5% of an average physician's annual salary

Quarterly

Annual

Allocated to individual based on individual performance

Yes

Basinga 2011

Process measures: The 14 key maternal and child healthcare output indicators. Some of these output indicators are reasons for a visit, such as prenatal care or delivery, whereas others are services provided during a visit, such as tetanus vaccination during prenatal care.

Payment for each instance of performance measure unit: The basis for payment is calculated from the number of services provided across the 14 kinds of services; the final payment level is adjusted based on a quality index.

Facility funding increased by 22%

Quarterly

Quarterly

77% of P4P¹ funds allocated to individual personnel, amounting to a 35% increase in salary

No, control group funding also increased by the same level.

Canavan 2008

Process measures: outpatient utilisation rate, delivery rate, VCT² clients

Threshold payment: 50% of support paid upfront for the year; 50% paid retrospectively if all the targets are met (outpatient utilisation rate 0.6, delivery rate 20/1000, VCT² clients 20/1000).

8% of facility income

Semi‐annual

Semi‐annual

50% maximum bonus allocated to individual

Yes

Chien 2012

Both provision and outcome measures: diabetes patient completing all the missing care processes, and whether glycated haemoglobin and low‐density lipoprotein levels were lowered or at goal levels

Payment for each instance of performance measure unit: A set amount of money was paid for each patient who met the performance target, e.g. USD 15 for 1 glycated haemoglobin test, USD 35 for glycated haemoglobin < 7%.

Not clear, but mentioned that "then incentive amount ... may not have been strong enough"

Annual

Annual

Not clear

Yes

Engineer 2016

Provision outcome measures: volume of 9 primary health services provided, combined with service provision quality indicators

Payment for each instance of performance measure unit: Certain amount of bonus per unit per quarter, e.g. USD 1.30 to USD 2.67 for first antenatal care visit; final payment was also adjusted by quality indicators.

The bonus amounts paid were about 6% to 11% above health workers' base salary, and increased to about 14% to 28% depending on the health worker's cadre.

Quarterly

Quarterly

All allocated to individual, but the allocation method was determined by health facility managers, including giving individual bonuses proportional to the health worker's salary, giving them in equal amounts to all staff, or giving them based on their determination of an individual's contribution to the performance indicators.

Yes

Hillman 1998

Provision outcome measures: compliance with a quality assurance policy, i.e. referral, where clinically indicated, for Pap test, colorectal screening, or mammography

Threshold payment: 3 intervention sites with highest compliance scores received full bonus (20% of capitation); 3 with the next highest scores and the 3 improving most from previous audit both received partial bonus (10% of capitation).

10% to 20% of capitation for all female members 50 years of age and older

Semi‐annual

Semi‐annual

Not clear, 38.5% of sites were solo group.

Yes

Hillman 1999

Provision outcome measures: compliance with provision of defined services for children, including immunisation, other preventive services

Threshold payment: 3 intervention sites with highest compliance scores received full bonus (20% of capitation); 3 with the next highest scores and the 3 improving most from previous audit both received partial bonus (10% of capitation).

10% to 20% of capitation for all paediatric members up to 7 years

Semi‐annual

Semi‐annual

Not clear, 42.1% of sites were solo group.

Yes

Petersen 2013

Combined provision and outcome measures: blood pressure thresholds or appropriately responding to uncontrolled blood pressure, prescribing guideline‐recommended antihypertensive medications

Payment for each instance: a maximum per‐record reward of USD 18.20, USD 9.10 for each successful measure

Mean level was 1.6% of a physician's salary.

4 months

4 months

Equally allocated to individual physician, non‐physician in team

Yes

Roski 2003

Provision outcome measures: Tobacco status clearly identified at each visit and documented in their medical records for their last visit; smokers should have provision of advice to quit smoking documented in their medical record.

Threshold payment: Performance targets were set at approximately 15 percentage points above the average performance for these clinic practices as assessed by the medical group 2 years prior to the effort described here. Incentive amounts were based on the number of providers per clinic. Specifically, clinics with 1 to 7 providers could receive a USD 5000 award, and clinics with 8 or more providers were eligible for a USD 10,000 bonus. Clinics that reached or exceeded only 1 of the 2 performance goals were eligible for half the amount.

Not clear, just discussed "it is not clear whether significantly higher incentive payments would have been able to focus clinic sites' attention more strongly on ..."

Semi‐annual

Annual

Not clear, just mentioned "Clinics were provided with suggestions on how to spend earned incentive payments (i.e., travel and registration for educational courses). Ultimately, clinics decided how to allocate incentive payments."

Yes

Soeters 2008

Bonfrer 2014a

Rudasingwa 2015

Provision measures: health provision actions and quality composite index

Payment for each instance: Fixed amount paid per targeted action; multiplied by quality bonus ranging from 1 to 1.25 based on quarterly reviews of quality.

Studies published at different times reported different proportions:

58% of facility total revenue in 2009;

in 2014, this part accounted for 40% of total health facility budget;

in 2010, 20% of total health facility revenue.

Quarterly

Quarterly

Allocated to individual based on individual performance, using a systematic approach called "indices".

Yes

Soeters 2011

Provision measures: health provision actions and quality composite index with 154 indicators

Payment for each instance: Fixed amount paid per targeted action; top‐up of 15% available, based on quarterly reviews of quality. Also 15% additional payment for remote facilities.

Not clear, but should be the major component of funding for the health centres

Quarterly

Quarterly

Just mentioned facilities having discretion to pay staff.

Yes

Yip 2014

Provision measures: antibiotic prescription rates and patient satisfaction

Threshold payment: 70% of the budget was allocated to health facilities upfront, and the remaining 30% was withheld until after performance assessments at the middle and end of the year. After each assessment, each health facility's performance score was compared with the average score in the county. Each centre that scored above the average received more than the 30% of the budget that had been withheld, in proportion to how much above the county average its score was. Each centre that scored below the average received less than the 30%, in proportion to how much lower than average its score was.

30% of capitation budget

Semi‐annual

Semi‐annual

No allocation to individual

No

¹ P4P: pay for performance
² VCT: voluntary counselling and testing
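Two of the threshold rules in Table 4 are specific enough to be written out as arithmetic. The sketch below encodes the An 2008 and Roski 2003 rules as described in the table. The function names are hypothetical, and two readings are assumptions rather than statements from the studies: that the An 2008 bonus is triggered at 50 or more referrals, and that a Roski 2003 clinic meeting neither goal receives nothing.

```python
def an_2008_bonus(referrals: int) -> int:
    """USD 5000 once a clinic reaches 50 referrals, plus USD 25 per referral beyond 50.

    Assumption: the bonus is triggered at 50 or more referrals.
    """
    if referrals < 50:
        return 0
    return 5000 + 25 * (referrals - 50)


def roski_2003_bonus(providers: int, goals_met: int) -> float:
    """USD 5000 (1 to 7 providers) or USD 10,000 (8 or more) for both goals;
    half of that amount when only 1 of the 2 goals is reached.

    Assumption: no payment when neither goal is reached.
    """
    full_amount = 5000 if providers <= 7 else 10000
    if goals_met >= 2:
        return float(full_amount)
    if goals_met == 1:
        return full_amount / 2
    return 0.0


print(an_2008_bonus(60))        # clinic with 60 referrals
print(roski_2003_bonus(8, 1))   # large clinic reaching one of two goals
```

Writing the rules this way makes the difference in incentive structure visible: the An 2008 payment keeps growing with every referral past the threshold, whereas the Roski 2003 payment is capped once both goals are met.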

Table 5. Outcome measures of included studies (for studies included in effects analysis)

Study

Primary outcomes

Secondary outcomes

Unintended or adverse effects

Length of observation

Provision outcomes

Patient outcomes

Costs

Bardach 2013

Proportion of patients 18 years or older with IVD¹ or 40 years or older with DM² taking aspirin or another antithrombotic therapy (including cilostazol, clopidogrel bisulfate, warfarin sodium, dipyridamole);

Proportion of patients 18 years or older identified as current smokers who received certain smoking cessation services (cessation counselling, referral for counselling, or prescription or increased dose of a cessation aid)

Proportion of patients aged 18 to 75 years with hypertension getting blood pressure control (with blood pressure lower than 140/90 mmHg (if without DM²) or lower than 130/80 mmHg (if with DM²)) (Health);

Proportion of male patients 35 years or older and female patients 45 years or older without IVD¹ or DM² who have cholesterol control (total cholesterol lower than 240 mg/dL or low‐density lipoprotein lower than 160 mg/dL measured in the past 5 years) (Health)

12 months

Petersen 2013

Proportion of physicians' patients getting the guideline‐recommended antihypertensive medications

Proportion of physicians' patients with blood pressure control or appropriate response to uncontrolled blood pressure (Health)

Performance of physician groups during the final intervention period to the post‐washout performance period

24 months

Chien 2012

Probability of diabetes patients getting glycated haemoglobin testing (Utilisation);

Probability of diabetes patients getting lipid testing (Utilisation);

Probability of diabetes patients getting dilated eye exam (Utilisation)

12 months

Basinga 2011

Probability of respondents getting tetanus vaccine during prenatal visit

Probability of respondents having any prenatal care (Utilisation);

Probability of respondents having 4 or more prenatal care visits (Utilisation);

Probability of respondents having institutional delivery (Utilisation);

Probability of children younger than 23 months having a preventive visit in previous 4 weeks (Utilisation);

Probability of children aged 24 to 59 months having a preventive visit in previous 4 weeks (Utilisation);

Probability of children aged 12 to 23 months being fully immunised (Utilisation)

23 months

Roski 2003

Percentage of tobacco users identified at last visit;

Percentage of smokers who received advice to quit;

Percentage of smokers who were offered assistance to quit at last visit

Percentage of respondents reporting using any aids for smoking cessation (Utilisation);

Percentage of respondents reporting using any medication for quitting (Utilisation);

Percentage of respondents reporting using any counselling services (Utilisation);

Percentage of smoker respondents 7‐day sustained abstinence from smoking (Health);

Percentage of respondents being current smokers (7‐day point prevalence);

Percentage of respondents reporting intention to quit within 30 days (Health)

12 months for provision outcomes;

18 months for patient outcomes

Serumaga 2011

Proportion of patients receiving 0, 1, 2, and 3 or more classes of antihypertensive drugs as a proportion of all study patients

Proportion of patients with blood pressure measured each month (Utilisation);

Proportion of patients with controlled blood pressure (blood pressure less than 150/90 mmHg) (Health);

Percentage of patients with hypertension‐related adverse outcomes (myocardial infarction, stroke, renal failure, heart failure) or on all‐cause mortality (Health)

12 months;

24 months;

36 months

An 2008

Rate of referral of smokers to quit line

Rate of smokers enrolled into quit line (Utilisation)

The marginal cost per additional quit line enrollee

10 months

Hillman 1999

Compliance scores³ of providers for immunisation;

Compliance scores of providers for other indicators;

Overall compliance scores of providers

6 months;

12 months;

18 months

Hillman 1998

Compliance scores for Pap test;

Compliance scores for colorectal screening;

Compliance scores for mammography;

Compliance scores for breast exam;

Total compliance scores

6 months;

12 months;

18 months

Alshamsan 2012

Glycated haemoglobin level for diabetes patients;

Total cholesterol level for diabetes patients (Health);

Systolic blood pressure for diabetes patients (Health);

Diastolic blood pressure for diabetes patients (Health)

Ethnic disparities in all outcomes

12 months;

24 months;

36 months

Lee 2011

Total cholesterol level for CHD⁴ patients (Health);

Total cholesterol level for stroke patients (Health);

Systolic blood pressure for CHD⁴ patients (Health);

Systolic blood pressure for stroke patients (Health);

Systolic blood pressure for hypertension patients (Health);

Diastolic blood pressure for CHD⁴ patients (Health);

Diastolic blood pressure for stroke patients (Health);

Diastolic blood pressure for hypertension patients (Health)

Ethnic disparities in all outcomes

12 months;

24 months;

36 months

Yip 2014

Percentage of visits with antibiotic prescription in Township Health Centre;

Percentage of visits with antibiotic prescription in Village Posts

Patient satisfaction score in Township Health Centre;

Patient satisfaction score in Village Posts;

Total expenditure per visit;

Total drug expenditure per visit

Patient volume

Catalano 2000

Number of people younger than 18 receiving outpatient services

Number of people younger than 18 receiving inpatient services;

Number of people younger than 5 in treatment;

Number of disruptive children in treatment;

Number of people younger than 18 treated in emergency

Total outpatient costs;

Total costs of treating people younger than 18;

Total inpatient costs

12 months

18 months

Catalano 2005

Number of emergency visits by adults who had a primary mental or substance use disorder

12 months

Engineer 2016

Percentage of current use of modern family planning methods;
Percentage of at least 1 antenatal checkup from a skilled provider;
Percentage of skilled birth attendant present at latest delivery;
Percentage of postnatal checkup within 42 days of delivery by a skilled provider;
Percentage of children who received pentavalent 3 vaccination;

Concentration index for institutional deliveries;
Concentration index for children's utilisation of outpatient services

20 indicators covering 5 domains of quality of care: Client and community perspectives, including an index of overall client satisfaction and perceived quality of care; Human resources perspectives, including a health worker satisfaction index and health worker motivation index; Physical capacity of health facility inputs (drugs, equipment, infrastructure); Quality of service provision, measuring 4 processes of care; and Management systems

23 to 25 months

McLintock 2014

Percentage of patients on the diabetes register or CHD⁴ register, or both, for whom case finding, diagnosis, and prescription for depression has been undertaken

Percentage of patients with non‐target long‐term physical conditions for whom case finding for depression, diagnosis, and prescription has been undertaken

60 months

¹ IVD: ischaemic vascular disease
² DM: diabetes mellitus
³ Compliance scores: the extent to which providers' practice was consistent with the quality assurance criteria
⁴ CHD: coronary heart disease

Table 6. Effects of P4P on dichotomous provision outcomes

| Study | Outcome measures | Control/baseline level | Risk ratio | Confidence intervals |
| --- | --- | --- | --- | --- |
| 1. USA Bardach 2013, randomised trial | Proportion of patients with ischaemic vascular disease or diabetes mellitus getting aspirin therapy prescription | 59.7% | 1.10 | 1.04, 1.16 |
| | Proportion of patients getting smoking cessation intervention | 18.9% | 1.23 | 1.03, 1.46 |
| | Synthesised effects inside the study (fixed‐effect model) | | 1.11 | 1.05, 1.17 |
| 2. USA Petersen 2013, randomised trial | Percentage of patients prescribed guideline‐recommended medications | 63.0% | 1.01 | 0.92, 1.12 |
| 3. Rwanda Basinga 2011, CBA | Proportion of respondents getting tetanus vaccine during prenatal visit | 67.0% | 1.08 | 0.997, 1.15 |
| Synthesised effect across the above 3 studies (random‐effects model) | | | 1.08 | 1.03, 1.14 |
| 4. USA Roski 2003, randomised trial | Percentage of patients identified as tobacco users at last visit | 40.5% | 1.20 | |
| | Percentage of smokers who received advice to quit | 35.4% | 1.17 | |
| | Percentage of smokers who were offered assistance to quit at last visit | 19.7% | 0.72 | |
| | Synthesised effects inside the study (median) | | 1.17 | |
| Synthesised effect across the above 4 studies (median) | | | 1.095 | |

CBA: controlled before‐after study
P4P: pay for performance
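The synthesised rows in Table 6 can be approximately reproduced from the reported figures. The sketch below pools the two Bardach 2013 risk ratios with an inverse‐variance fixed‐effect model on the log scale and then takes the median of the four study‐level risk ratios. It is an illustration only: the standard errors are back‐calculated from the reported 95% confidence intervals under a normal approximation, which is an assumption and not necessarily the procedure the review authors used, and the function name `fixed_effect_rr` is hypothetical.

```python
import math
from statistics import median

Z_95 = 1.96  # normal quantile assumed for the 95% confidence intervals


def fixed_effect_rr(estimates):
    """Pool (rr, ci_low, ci_high) tuples by inverse-variance weighting on the log scale."""
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, ci_low, ci_high in estimates:
        # Assumption: standard error recovered from the width of the reported CI.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(rr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    half_width = Z_95 / math.sqrt(total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - half_width),
        math.exp(pooled_log + half_width),
    )


# The two Bardach 2013 outcomes as reported in Table 6.
bardach_2013 = [(1.10, 1.04, 1.16), (1.23, 1.03, 1.46)]
pooled_rr, pooled_low, pooled_high = fixed_effect_rr(bardach_2013)

# Within-study median for Roski 2003 (no confidence intervals reported).
roski_median = median([1.20, 1.17, 0.72])

# Study-level risk ratios: Bardach, Petersen, Basinga, Roski.
overall_median = median([1.11, 1.01, 1.08, roski_median])

print(f"Bardach 2013 pooled RR: {pooled_rr:.2f} ({pooled_low:.2f}, {pooled_high:.2f})")
print(f"Median RR across the 4 studies: {overall_median:.3f}")
```

With these inputs the pooled Bardach 2013 estimate rounds to the 1.11 (1.05, 1.17) shown in the table, and the median across the four studies is 1.095.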

Table 7. Effects of P4P on continuous provision outcomes

| Study | Outcome measures | Control/baseline level | Absolute change | Relative change |
| --- | --- | --- | --- | --- |
| 1. USA An 2008, randomised trial | Rate of referral of smokers to quit line (not adjusted by baseline, not used for analysis) | 4.2% | 7.2% | +171% |
| 2. USA Hillman 1999, randomised trial | Compliance scores for immunisation | 60.2% | 4.8% | +7.97% |
| | Compliance scores for other indicators | 55.2% | 3.2% | +5.80% |
| | Overall compliance scores | 53.7% | 2.8% | +5.21% |
| | Synthesised effects inside the study (median) | | 3.2% | 5.80% |
| 3. USA Hillman 1998, randomised trial | Compliance scores for Pap test | 25.4% | ‐5.2% | ‐20.47% |
| | Compliance scores for colorectal screening | 14.9% | 3.3% | +22.15% |
| | Compliance scores for mammography | 40.9% | ‐5.1% | ‐12.47% |
| | Compliance scores for breast exam | 23.0% | ‐0.2% | ‐0.87% |
| | Total compliance scores | 27.1% | ‐2.3% | ‐8.49% |
| | Synthesised effects inside the study (median) | | ‐2.3% | ‐8.49% |
| Synthesised effect across the above 4 studies (median) | | | | ‐1.345% |

P4P: pay for performance
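The relative change column in Table 7 is consistent with dividing each absolute change by the corresponding control/baseline level. The sketch below recomputes the column and the median‐based summaries from the reported figures. The function name `relative_change` is hypothetical, and taking the final median over the two Hillman trials only is an assumption based on the An 2008 row being marked as not used for analysis.

```python
from statistics import median


def relative_change(baseline, absolute_change):
    """Relative change (%) = absolute change / control or baseline level x 100."""
    return 100.0 * absolute_change / baseline


# (baseline %, absolute change in percentage points) as reported in Table 7.
hillman_1999 = [(60.2, 4.8), (55.2, 3.2), (53.7, 2.8)]
hillman_1998 = [(25.4, -5.2), (14.9, 3.3), (40.9, -5.1), (23.0, -0.2), (27.1, -2.3)]

rel_1999 = [round(relative_change(b, a), 2) for b, a in hillman_1999]
rel_1998 = [round(relative_change(b, a), 2) for b, a in hillman_1998]

# Within-study medians, then the median across the two trials used for analysis.
study_medians = [median(rel_1999), median(rel_1998)]
overall = median(study_medians)

print("Hillman 1999 relative changes:", rel_1999)
print("Hillman 1998 relative changes:", rel_1998)
print(f"Median across the two trials: {overall:.3f}%")
```

Under these assumptions the within‐study medians are 5.80% and ‐8.49%, and their median is ‐1.345%, the value reported in the last row of the table.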

Table 8. Effect measures of included ITS and RM studies

Comparison 1: P4P plus some existing payment method vs existing payment method

| Study and outcome | Immediate change in level: estimate | Immediate change in level: confidence interval | Change in trend: estimate | Change in trend: confidence interval | Other effects reported by authors |
| --- | --- | --- | --- | --- | --- |
| Serumaga 2011, ITS | | | | | |
| Proportion of patients receiving 1 drug (%) (provision outcome) | 0.07 | ‐0.83 to 0.98 | 0.03 | ‐0.01 to 0.07 | |
| Proportion of patients receiving 2 drugs (%) (provision outcome) | 0.03 | ‐0.19 to 0.26 | ‐0.01 | ‐0.01 to 0.02 | |
| Proportion of patients receiving 3 or more drugs (%) (provision outcome) | 0.11 | ‐0.26 to 0.47 | 0.02 | ‐0.15 to 0.18 | |
| Percentage of patients with blood pressure measured each month (%) (patient outcome, utilisation) | 0.85 | ‐3.04 to 4.74 | ‐0.01 | ‐0.24 to 0.21 | |
| Proportion of patients with controlled blood pressure (%) (patient outcome, health) | ‐1.19 | ‐2.06 to 1.09 | ‐0.01 | ‐0.06 to 0.03 | |
| Percentage of patients with hypertension‐related adverse outcomes (myocardial infarction, stroke, renal failure, heart failure) or all‐cause mortality (%) (patient outcome, health) | 0.07 | ‐0.13 to 0.28 | 0.05 | ‐0.02 to 0.07 | |
| Alshamsan 2012, RM | | | | | |
| Systolic blood pressure level (patient outcome, health) | ‐1.95 | ‐2.87 to ‐1.02 | ‐1.04 | ‐1.42 to ‐0.64 | |
| Diastolic blood pressure level (patient outcome, health) | ‐0.51 | ‐1.05 to 0.01 | 0.19 | ‐0.03 to 0.41 | |
| Total cholesterol level (patient outcome, health) | ‐0.12 | ‐0.18 to ‐0.06 | 0.03 | 0.01 to 0.05 | |
| Glycated haemoglobin level (patient outcome, health) | 0.04 | ‐0.04 to 0.12 | 0.19 | 0.15 to 0.22 | |
| Lee 2011, RM | | | | | |
| Systolic blood pressure level for CHD patients (patient outcome, health) | ‐0.81 | ‐2.01 to 0.49 | ‐0.53 | ‐1.09 to 0.02 | |
| Diastolic blood pressure level for CHD patients (patient outcome, health) | ‐0.32 | ‐1.06 to 0.42 | 0.32 | ‐0.00 to 0.64 | |
| Total cholesterol level for CHD patients (patient outcome, health) | ‐0.01 | ‐0.08 to 0.06 | 0.02 | ‐0.01 to 0.05 | |
| Systolic blood pressure level for stroke patients (patient outcome, health) | ‐1.92 | ‐3.89 to 0.05 | ‐0.79 | ‐1.64 to 0.06 | |
| Diastolic blood pressure level for stroke patients (patient outcome, health) | ‐0.38 | ‐1.50 to 0.74 | 0.26 | ‐0.22 to 0.74 | |
| Total cholesterol level for stroke patients (patient outcome, health) | ‐0.11 | ‐0.23 to 0.02 | ‐0.01 | ‐0.05 to 0.07 | |
| Systolic blood pressure level for hypertension patients (patient outcome, health) | ‐1.18 | ‐1.76 to ‐0.61 | ‐0.83 | ‐1.08 to ‐0.58 | |
| Diastolic blood pressure level for hypertension patients (patient outcome, health) | ‐0.77 | ‐1.10 to ‐0.43 | 0.03 | ‐0.11 to 0.18 | |
| McLintock 2014 | | | | | |
| Rate of coded case finding for depression in patients with diabetes and CHD (provision outcome) | | | | | Increase from 0.07/1000 to 7.45/1000 per month (OR 99.76, 95% CI 83.15 to 119.68) |
| Rate of new depression‐related diagnoses in patients with diabetes and CHD (provision outcome) | | | | | Increase from 21/1000 to 94/1000 per month (OR 2.09, 95% CI 1.92 to 2.27). The trends before and after the intervention were both 0. |
| Rate of new antidepressant prescribing in these patients (provision outcome) | | | | | Rates of prescribing increased over the full period of observation. The trends before and after the intervention were similar. |
| Chien 2012 | | | | | |
| Rate of patients receiving glycated haemoglobin testing (patient outcome, utilisation) | | | | | After the intervention, adjusted RR 1.00 (95% CI 0.94 to 1.04) |
| Rate of patients receiving lipid testing (patient outcome, utilisation) | | | | | After the intervention, adjusted RR 1.02 (95% CI 0.99 to 1.04) |
| Rate of patients receiving dilated eye exam (patient outcome, utilisation) | | | | | After the intervention, adjusted RR 0.95 (95% CI 0.84 to 1.05) |

Comparison 3: Capitation vs FFS

| Study and outcome | Immediate change in level: estimate | Immediate change in level: confidence interval | Change in trend: estimate | Change in trend: confidence interval | Other effects reported by authors |
| --- | --- | --- | --- | --- | --- |
| Catalano 2005, ITS | | | | | |
| Number of emergency visits in not‐for‐profit health centres' area (patient outcome, health) | ‐7.422 | ‐12.808 to ‐2.036 | ‐0.332 | ‐0.510 to ‐0.154 | |
| Number of emergency visits in for‐profit health centres' area (patient outcome, health) | ‐5.305 | ‐12.861 to 2.251 | ‐0.164 | ‐0.419 to 0.091 | |
| Catalano 2000, ITS | | | | | |
| Number of people in outpatient treatment (provision outcome) | | | | | For‐profit community health centres: weekly mean increased from 1196 before to 1299 after the intervention; difference between the observed level and the level expected from the historical trend 82.92 (P < 0.01). No effects on not‐for‐profit community health centres. |
| Number of very young (< 5 years old) children in treatment (provision outcome) | | | | | For‐profit community health centres: weekly mean increased from 94 before to 100 after the intervention; difference between the observed level and the level expected from the historical trend 18.53 (P < 0.01). No effects on not‐for‐profit community health centres. |
| Number of children who receive treatment for disruptive behaviour (provision outcome) | | | | | For‐profit community health centres: weekly mean increased from 287 before to 318 after the intervention; difference between the observed level and the level expected from the historical trend 72 (P < 0.01). No effects on not‐for‐profit community health centres. |
| Number of inpatients treated (patient outcome, health) | | | | | Not‐for‐profit health centres: weekly mean decreased from 77 before to 13 after the intervention; difference between the observed level and the level expected from the historical trend ‐49 (P < 0.01). For‐profit health centres: weekly mean decreased from 96 to 45; difference between the observed level and the level expected from the historical trend ‐52 (P < 0.01). |
| Number of people treated in emergency (patient outcome, health) | | | | | For‐profit community health centres: weekly mean changed from 7.9 to 7.6; difference between the observed level and the level expected from the historical trend 6.66 (P < 0.01). No effects on not‐for‐profit community health centres. |
| Total costs for all services | | | | | Not‐for‐profit health centres: weekly mean changed from 507,796 before to 534,800 after the intervention; difference between the observed level and the level expected from the historical trend USD ‐211,400 (P < 0.01). For‐profit health centres: weekly mean changed from 421,705 to 441,341; difference between the observed level and the level expected from the historical trend USD ‐178,500 (P < 0.01). |
| Total costs for inpatient care | | | | | Not‐for‐profit health centres: weekly mean changed from 186,834 to 51,717; difference between the observed level and the level expected from the historical trend USD ‐134,200 (P < 0.01). For‐profit health centres: weekly mean changed from 216,166 to 111,238; difference between the observed level and the level expected from the historical trend USD ‐201,200 (P < 0.01). |
| Total outpatient costs | | | | | For‐profit health centres: weekly mean changed from 205,539 to 330,102; difference between the observed level and the level expected from the historical trend USD 44,577 (P < 0.01). No effects on not‐for‐profit community health centres. |

CHD: coronary heart disease
CI: confidence interval
FFS: fee‐for‐service
ITS: interrupted time series study
OR: odds ratio
P4P: pay for performance
RM: repeated measures study
RR: risk ratio
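
To make the two ITS effect measures in Table 8 concrete, the following minimal sketch estimates an immediate change in level and a change in trend from a single time series. It is a generic illustration, not the model used by any included study: it fits separate ordinary least-squares lines to the pre‐ and post‐intervention segments, ignores autocorrelation and seasonality, and runs on a synthetic, noise‐free series. The function names `fit_line` and `its_effects` are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one segment."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def its_effects(series, start):
    """Return (level change, trend change) for an intervention starting at index `start`."""
    times = list(range(len(series)))
    pre_slope, pre_icpt = fit_line(times[:start], series[:start])
    post_slope, post_icpt = fit_line(times[start:], series[start:])
    # Level change: gap between the post-intervention line and the extrapolated
    # pre-intervention line at the first post-intervention time point.
    level_change = (post_icpt + post_slope * start) - (pre_icpt + pre_slope * start)
    # Trend change: difference between the post- and pre-intervention slopes.
    trend_change = post_slope - pre_slope
    return level_change, trend_change

# Synthetic series (NOT data from any included study): 12 points before and
# 12 points after an intervention that drops the level by 7.4 and the slope by 0.13.
pre = [50 + 0.2 * t for t in range(12)]
post = [50 + 0.2 * t - 7.4 - 0.13 * (t - 12) for t in range(12, 24)]
level, trend = its_effects(pre + post, start=12)
print(round(level, 2), round(trend, 2))
```

A negative level change with a confidence interval excluding zero, as reported for emergency visits in not‐for‐profit areas in Catalano 2005, corresponds to a downward step at the intervention; a negative trend change corresponds to a slope that becomes less steep or turns downward afterwards.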

Table 9. Effects of P4P on dichotomous patient outcomes

| Study | Outcome measure | Control/baseline level | Risk ratio | Confidence interval |
| --- | --- | --- | --- | --- |
| Utilisation outcomes | | | | |
| 1. Rwanda Basinga 2011, CBA | Proportion of respondents having any prenatal care | 96.0% | 1.002 | 0.98 to 1.03 |
| | Proportion of respondents having 4 or more prenatal care visits | 11.0% | 1.07 | 0.43 to 1.72 |
| | Proportion of respondents having institutional delivery | 36.0% | 1.23 | 1.04 to 1.41 |
| | Proportion of children younger than 23 months with a preventive visit in the previous 4 weeks | 0.24% | 1.50 | 1.17 to 1.83 |
| | Proportion of children aged 24 to 59 months with a preventive visit in the previous 4 weeks | 0.14% | 1.79 | 1.42 to 2.16 |
| | Proportion of children aged 12 to 23 months being fully immunised | 0.63% | 0.91 | 0.71 to 1.12 |
| | Synthesised effects inside the study (fixed‐effect model) | | 1.01 | 0.99 to 1.04 |
| 2. Afghanistan Engineer 2016, randomised trial | Percentage of current use of modern family planning methods | 10.3 | 0.96 | 0.47 to 1.95 |
| | Percentage of at least 1 antenatal checkup from a skilled provider | 56.9 | 1.01 | 0.85 to 1.20 |
| | Percentage of skilled birth attendant present at latest delivery | 22.5 | 1.19 | 0.91 to 1.55 |
| | Percentage of postnatal checkup within 42 days of delivery by a skilled provider | 24.7 | 1.03 | 0.10 to 10.39 |
| | Percentage of children who received pentavalent 3 vaccination | 62.0 | 0.95 | 0.90 to 0.99 |
| | Synthesised effects inside the study (fixed‐effect model) | | 0.96 | 0.92 to 1.00 |
| | Synthesised effect across the above 2 studies (random‐effects model) | | 1.11 | 1.02 to 1.22 |
| 3. USA Roski 2003, randomised trial | Percentage of respondents reporting using any aids for smoking cessation | 22.3% | 0.93 | |
| | Percentage of respondents reporting using any medication for quitting | 21.6% | 0.92 | |
| | Percentage of respondents reporting using any counselling services | 1.0% | 1.23 | |
| | Percentage of smoker respondents with 7‐day sustained abstinence from smoking | 19.2% | 1.16 | |
| | Percentage of respondents being current non‐smokers (7‐day point prevalence) | 19.2% | 1.17 | |
| | Percentage of respondents reporting intention to quit within 30 days | 9.4% | 1.13 | |
| | Synthesised effects inside the study (median) | | 1.145 | |
| | Synthesised effect across the above 3 studies (median) | | 1.01 | |
| Health outcomes | | | | |
| 4. USA Bardach 2013, randomised trial | Proportion of patients with no IVD or DM getting blood pressure control | 34.6% | 1.14 | 1.03 to 1.25 |
| | Proportion of patients with IVD getting blood pressure control | 47.8% | 0.82 | 0.56 to 1.11 |
| | Proportion of patients with DM getting blood pressure control | 11.8% | 1.43 | 1.10 to 1.84 |
| | Proportion of patients with IVD or DM getting blood pressure control | 17.0% | 1.29 | 1.06 to 1.55 |
| | Proportion of general population with cholesterol control | 91.4% | 0.99 | 0.96 to 1.01 |
| | Synthesised effects inside the study (fixed‐effect model) | | 1.01 | 0.98 to 1.04 |
| Combined health and provision outcomes | | | | |
| 5. USA Petersen 2013, randomised trial | Percentage of patients achieving guideline‐recommended blood pressure thresholds or receiving an appropriate response to uncontrolled blood pressure (combination of provision and patient outcome measures, not used for analysis) | 86% | 1.04 | 0.98 to 1.10 |
| | Synthesised effect across the above 3 studies (median) | | 1.07 | |

CBA: controlled before‐after study
DM: diabetes mellitus
IVD: ischaemic vascular disease
P4P: pay for performance
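
As an illustration of how a within‐study "fixed‐effect model" summary such as the one for Bardach 2013 can be formed, the sketch below pools the five Bardach 2013 risk ratios from Table 9 by inverse‐variance weighting on the log scale. It assumes the tabulated intervals are 95% confidence intervals and recovers each standard error from the interval width; the review does not state that this is exactly how its summary was computed, and the function name `pool_fixed` is hypothetical. Because the inputs are rounded to two decimals, the result should be close to, but need not match exactly, the tabulated 1.01 (0.98 to 1.04).

```python
import math

def pool_fixed(rows, z=1.96):
    """Inverse-variance fixed-effect pooling of risk ratios on the log scale.

    rows: iterable of (risk ratio, lower CI bound, upper CI bound).
    Returns (pooled RR, lower bound, upper bound).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rr, lower, upper in rows:
        # Standard error of log(RR), recovered from the width of the interval.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1 / se ** 2
        weighted_sum += weight * math.log(rr)
        weight_total += weight
    estimate = weighted_sum / weight_total
    se_pooled = math.sqrt(1 / weight_total)
    return (
        math.exp(estimate),
        math.exp(estimate - z * se_pooled),
        math.exp(estimate + z * se_pooled),
    )

# Bardach 2013 health outcomes as tabulated: (RR, lower, upper)
bardach_2013 = [
    (1.14, 1.03, 1.25),
    (0.82, 0.56, 1.11),
    (1.43, 1.10, 1.84),
    (1.29, 1.06, 1.55),
    (0.99, 0.96, 1.01),
]
pooled_rr, pooled_lower, pooled_upper = pool_fixed(bardach_2013)
print(round(pooled_rr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

The cholesterol control outcome has by far the narrowest interval, so it carries most of the weight and pulls the pooled estimate towards 1 despite the larger risk ratios for blood pressure control.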

Table 10. Subgroup analysis

| | Bardach 2013 | Petersen 2013 | Basinga 2011 |
| --- | --- | --- | --- |
| Effect size for service provision measures | RR 1.11 (1.05, 1.17) | RR 1.01 (0.92, 1.12) | RR 1.08 (0.997, 1.15) |
| Design of P4P | | | |
| Performance measures | Both provision and outcome measures | Both provision and outcome measures | Both provision and outcome measures |
| Performance target | Pay for each instance of performance measure unit | Pay for each instance of performance measure unit | Pay for each instance of performance measure unit |
| Size of incentive | 5% of an average physician's annual salary | 1.6% of an average physician's annual salary | 35% increase in salary |
| Frequency of monitoring | Quarterly | Every 4 months | Quarterly |
| Frequency of payment | Annual | Every 4 months | Quarterly |
| Individual payment | Allocated to individuals based on individual performance | Allocated equally among the individual physicians and non‐physicians in the team | 77% of P4P fund allocated to individual personnel, but not clear how it was allocated |
| Resourcing (whether P4P came with more funds) | Yes | Yes | No (in control facilities, the input‐based payments were increased by the average amount of P4P payments received by facilities in the intervention group) |

P4P: pay for performance
RR: risk ratio

Comparison 1. Effects of P4P on outpatient health facilities' performance: dichotomous provision outcomes

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Service provision outcomes | 3 | | Risk Ratio (Random, 95% CI) | 1.08 [1.03, 1.14] |
| 1.1 Process outcomes of Bardach 2013 | 1 | | Risk Ratio (Random, 95% CI) | 1.13 [1.03, 1.23] |
| 1.2 Process outcomes of Petersen 2013 | 1 | | Risk Ratio (Random, 95% CI) | 1.01 [0.92, 1.11] |
| 1.3 Process outcomes of Basinga 2011 | 1 | | Risk Ratio (Random, 95% CI) | 1.08 [1.00, 1.17] |
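
The "Random" method named in Comparison 1 refers to random‐effects meta‐analysis. As a hedged illustration, the sketch below applies the DerSimonian‐Laird estimator to the three subgroup estimates listed in Comparison 1. This is an assumption about the estimator, and the review's overall estimate was presumably computed from the individual outcomes within each study rather than from these three rounded subgroup summaries, so the output should land near the tabulated 1.08 [1.03, 1.14] without reproducing it exactly. The function name `pool_random` is hypothetical.

```python
import math

def pool_random(rows, z=1.96):
    """DerSimonian-Laird random-effects pooling of risk ratios on the log scale.

    rows: list of (risk ratio, lower 95% CI bound, upper 95% CI bound).
    Returns (pooled RR, lower bound, upper bound, tau squared).
    """
    logs = [math.log(rr) for rr, _, _ in rows]
    variances = [((math.log(hi) - math.log(lo)) / (2 * z)) ** 2 for _, lo, hi in rows]
    weights = [1 / v for v in variances]

    # Fixed-effect estimate and Cochran's Q, needed to estimate between-study variance.
    fixed = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, logs))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(rows) - 1)) / c)

    # Random-effects weights add tau squared to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(re_weights, logs)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return (
        math.exp(estimate),
        math.exp(estimate - z * se),
        math.exp(estimate + z * se),
        tau2,
    )

# Subgroup estimates from Comparison 1: (RR, lower, upper)
subgroups = [
    (1.13, 1.03, 1.23),  # Bardach 2013
    (1.01, 0.92, 1.11),  # Petersen 2013
    (1.08, 1.00, 1.17),  # Basinga 2011
]
rr, lower, upper, tau2 = pool_random(subgroups)
print(round(rr, 2), round(lower, 2), round(upper, 2), round(tau2, 4))
```

When the estimated between‐study variance is zero the random‐effects weights reduce to the fixed‐effect weights, so the two models give the same answer; here the three subgroups differ enough that a small positive variance is estimated and the interval widens slightly.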

Comparison 2. Effects of P4P on outpatient health facilities' performance: dichotomous patients' utilisation outcomes

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Patients' utilisation outcomes | 2 | | Risk Ratio (Random, 95% CI) | 1.11 [1.02, 1.22] |
| 1.1 Utilisation outcomes of Basinga 2011 | 1 | | Risk Ratio (Random, 95% CI) | 1.23 [0.99, 1.52] |
| 1.2 Utilisation outcomes of Engineer 2016 | 1 | | Risk Ratio (Random, 95% CI) | 0.96 [0.92, 1.00] |

Comparison 3. Effects of P4P on outpatient health facilities' performance: dichotomous patients' health outcomes

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Patients' health outcomes | 1 | | Risk Ratio (Fixed, 95% CI) | 1.01 [0.98, 1.04] |
| 1.1 Health outcomes of Bardach 2013 | 1 | | Risk Ratio (Fixed, 95% CI) | 1.01 [0.98, 1.04] |

Comparison 4. Effects of P4P plus capitation on outpatient health facilities' performance compared to FFS

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
| --- | --- | --- | --- | --- |
| 1 Service provision outcomes (percentage of getting certain kinds of services, dichotomous) | 1 | | Risk Ratio (Fixed, 95% CI) | 0.84 [0.74, 0.96] |
| 2 Patient outcomes (patient satisfaction, continuous) | 1 | | Mean Difference (Fixed, 95% CI) | ‐0.02 [‐0.43, 0.39] |
