Can Combining Performance-Based Financing With Equity Measures Result in Greater Equity in Utilization of Maternal Care Services? Evidence From Burkina Faso

Background: As countries reform health financing systems towards universal health coverage, concerns are growing about the need to include the most vulnerable segments of society and to counteract existing inequities in service coverage. To this end, selected countries in sub-Saharan Africa have decided to couple performance-based financing (PBF) with demand-side equity measures. Still, evidence on the equity impacts of these more complex PBF models is largely lacking. We aimed to fill this knowledge gap by assessing the equity impact of PBF combined with equity measures on utilization of maternal health services in Burkina Faso.
Methods: Our study took place in 24 districts in rural Burkina Faso. We implemented an experimental design (cluster-randomized trial) nested within a quasi-experimental one (pre- and post-test design with independent controls). Our analysis relied on self-reported data on pregnancy history and on use of maternal and reproductive health services from 9999 (baseline) and 11 010 (endline) women of reproductive age (15-49 years). We estimated effects using a difference-in-differences (DID) approach, purposely focused on identifying program effects among the poorest wealth quintile.
Results: PBF improved the utilization of a few selected maternal health services compared to status quo service provision. These benefits, however, did not accrue to the poorest 20%, but rather to the other quintiles. PBF combined with equity measures did not produce better or more equitable results than standard PBF, with specific differences only on selected outcomes.
Conclusion: Our findings challenge the notion that implementing equity measures alongside PBF is sufficient to produce an equitable distribution of program benefits and point to the need to identify more innovative and context-sensitive measures to ensure adequate access to care for the poorest. Our findings also highlight the importance of considering changing policy environments and the need to assess interferences across policies.


Background
There is a growing concern that health inequalities related to social determinants of health are responsible for the slow progress witnessed in health and healthcare at global, regional and country levels, potentially jeopardizing opportunities to achieve the health-related Sustainable Development Goals (SDGs). 1 As countries embark on health financing and service delivery reforms, often targeting women and children first, monitoring health and healthcare inequalities remains an essential element of tracking progress towards the SDG 3 target of achieving universal health coverage by 2030, ensuring that no one is left behind. 2 Monitoring is even more urgent and important in low- and middle-income countries (LMICs) because, despite the recent progress made in curbing maternal and child deaths, serious inequities in maternal and child health persist, especially in sub-Saharan Africa. 3 Exacerbating the situation, LMICs also face huge coverage gaps, health system inefficiencies, and insufficient quality of service delivery. 4
In recent years, much attention has been paid to performance-based financing (PBF) as a possible strategy to improve health system performance. PBF aims at reorienting health providers' behavior towards provision of more and better quality care through the implementation of performance contracts that reward the attainment of predefined targets. 5 One vividly debated program design topic is the potential of PBF programs to reinforce rather than counteract existing inequities. 6,7 Still, very few studies have looked at the equity impact of PBF on health services use, and on maternal care services in particular. While some studies suggest pro-least poor effects, 8,9 others find evidence for the opposite 10 or for distributionally neutral effects. 11,12 Given the mixed evidence, an increasing number of authors advocate the introduction of PBF designs deliberately aimed at spreading program benefits more evenly across wealth groups. 8,11,12 However, to date limited evidence is available on such designs. 13
Our study evaluates an "equity-conscious" PBF design, which was recently implemented in Burkina Faso and which was piloted with 3 different equity measures. We examined program effects on maternal health service utilization and defined equity as equal access given equal need, with access measured as reported service use and need 14 measured in terms of a woman's pregnancy status.

Study Setting
Burkina Faso is a landlocked West African country with a population of 18.6 million and a life expectancy of 60 years. Infant and under-five mortality rates stand respectively at 61 and 89 deaths per 1000 live births. An estimated 41.1 percent of the population live below the national poverty line of US$1.90 a day. Maternal mortality remains high at an estimated 371 per 100 000 live births. Multiple challenges related to maternal care persist, including serious inequities in access linked to sub-standard quality, low geographical accessibility, and financial barriers. 15 Prior to PBF, Burkina Faso undertook several health financing reforms to increase coverage and reduce inequities in access to and utilization of maternal health services, such as removal of user fees for antenatal care (ANC) services in 2002, and an 80% removal of user fees for delivery care in 2007, with a provision for full exemption of the ultra-poor. 16 Later, in 2016, with the introduction of the national free healthcare policy, known as the gratuité, the government removed all user fees for services delivered to children under the age of 5 years and to pregnant and lactating women. 17 As described below, the introduction of the national free healthcare policy induced the Ministry of Health to modify PBF prices, specifically to remove the additional equity-measure payments (implemented in PBF2 and PBF3) for selected services targeted by both the national free healthcare policy and the PBF program. 18

The Intervention and the Study Design
Following an initial pilot in the districts of Titao, Leo, and Boulsa, starting in January 2014, Burkina Faso piloted PBF combined with different equity interventions in 12 districts distributed across 6 regions (Boucle du Mouhoun, Centre-Nord, Centre-Ouest, Nord, Sud-Ouest, Centre-Est). In these districts, health facilities were rewarded by the Ministry of Health for achievement of defined health service indicators using a case-based payment system, adjusted for quality of care after verification. More details on the intervention design have been described elsewhere. 18 In brief, PBF was implemented according to 4 different models, 3 of which included an equity intervention specifically targeting the ultra-poor, as summarized in Table 1. The details of the ultra-poor selection process have been described elsewhere. 19,20 Policy-makers expected the equity measures to induce increased utilization among the ultra-poor through 4 different pathways. First, they assumed that the targeting process would sensitize communities, and particularly the targeted ultra-poor, to the importance of health service utilization in case of need. Second, they assumed that the equity component would sensitize health workers to the importance of making specific efforts to facilitate health service utilization among the ultra-poor. Third, the removal of user fees for the targeted ultra-poor was assumed to reduce barriers to healthcare utilization among the ultra-poor. Finally, the elevated price levels for treating the targeted ultra-poor patients were assumed to enable and motivate health facilities to provide services to the ultra-poor free of charge.
To address our study's primary objective of measuring the equity impact of PBF combined with equity measures compared to standard PBF alone, we first needed to investigate the effect of PBF compared to status quo service provision. To do so, we adopted a design that combined experimental with quasi-experimental elements. More specifically, we conducted a cluster-randomized trial nested within a pre- and post-test study with independent controls. Hereafter, we describe the different elements of our study in detail, referring to the quasi-experiment as study component 1 and to the cluster-randomized trial as study component 2. The accompanying figure provides a summary of the details of both study components, including final sample sizes used in the analysis.

PBF Model Description
Standard PBF (PBF1): Performance contracts based on a case-based payment method adjusted for quality were signed between the Ministry of Health and health facilities. Verification agencies were employed to verify service provision data submitted by individual facilities. PBF unit prices were calculated based on the relative cost and frequency of the services provided. Additional incentives were calculated every quarter on the basis of quantity outcomes and service quality if facilities achieved a quality score of 50% (later changed to 60%). Incentives were expected to pay for expenditures incurred, to increase savings and to pay bonuses to individual staff members.
PBF1 plus systematic targeting and subsidization of health services for the ultra-poor (PBF2): Used the same health service purchasing model as PBF1, but added specific equity measures meant to ease access to and utilization of maternal healthcare services among the ultra-poor living in the catchment areas of the participating health facilities, with the following components: (a) systematic targeting of the ultra-poor to identify at most the poorest 20% of the population; (b) providing the identified ultra-poor with proof of status so they could access health services at no cost at the point of use; and (c) higher purchase unit prices than in PBF1 for health services delivered to the targeted ultra-poor (ie, as compensation for the revenues lost due to free health services provided to the ultra-poor at the point of use). The adjusted higher unit prices applied only to services where user fees existed, such as tetanus toxoid vaccine, delivery and family planning services among others, while for services already provided free of charge at point of use, such as HIV and tuberculosis testing and treatment among others, the same unit prices as in standard PBF were used. The additional payments were removed in June 2016 after the introduction of the national free healthcare policy.
PBF2 combined with higher incentive purchase price to provide health services to the ultra-poor (PBF3): Used the same purchasing arrangement as PBF1 and PBF2 and also involved the same targeting mechanisms and equity measures for the ultra-poor as in PBF2. The main difference was in the unit prices, whereby services provided to the ultra-poor were reimbursed at a higher unit price than in PBF2, at around 150% of the PBF2 unit prices. The higher unit prices were meant to compensate for the lost revenue from user fees, and also to offer health workers an additional incentive to attract or reach out to the ultra-poor. This applied only to services where user fees were still charged at the point of use. These additional payments were removed in June 2016 after the introduction of the national free healthcare policy.
PBF1 plus community-based health insurance, combined with targeting and subsidization of health services for the ultra-poor (PBF4): Involved implementation of PBF1 alongside CBHI, whereby insurance at an annual premium of 3900 F CFA (US$7) per individual was offered to the whole population, with the ultra-poor identified using the same targeting mechanism as in PBF2 and PBF3. CBHI insurance premiums for the ultra-poor were fully paid for by the PBF program, and payments to health facilities were made both by the CBHI scheme, as a replacement for user fees, and by the PBF program, using a case-based payment system as in PBF1.

Study Component 1
For study component 1, six regions (Boucle du Mouhoun, Centre-Nord, Centre-Ouest, Nord, Sud-Ouest, Centre-Est) were identified non-randomly by the government and its development partners as intervention regions. Within each region, 2 districts were selected as intervention districts, ie, destined to receive PBF, and 2 districts (or, when this was not possible, districts in a neighbouring region) as controls, ie, to continue with status quo service provision with no PBF. The intervention districts were purposely selected based on poor performance on selected maternal health indicators. 18 Control districts were selected to be as similar as possible to intervention districts, including in terms of performance on maternal health indicators. This study component was set up to allow a comparison between the 12 PBF districts and the 12 districts with status quo service provision without PBF. Since the intervention was assigned at district level, districts effectively functioned as clusters for this study component.

Study Component 2
For study component 2, 10 of the 12 intervention districts (including 2 where community-based health insurance [CBHI] already existed, which made implementation of the CBHI-based model possible) were targeted by the Ministry of Health and development partners for randomization. Only 10 districts were targeted because of financial constraints: once targeting costs were calculated, funds were insufficient to allow targeting across all 12 selected districts. In 8 of the 10 targeted districts, located in 4 regions (Centre-Est, Centre-Nord, Sud-Ouest, Nord), clusters (primary healthcare facilities) were randomized to receive either PBF1 or PBF combined with one of 2 equity measures (PBF2 or PBF3), so as to test the additional effect of combining standard PBF with equity measures as one way of reducing inequities in access to and utilization of health services. Randomization took place in public randomization ceremonies in which the districts and regions concerned took turns drawing primary healthcare facility names from a box containing the names of all primary healthcare facilities in the 4 regions. The sequence started with the selection of a pre-defined PBF model, and facilities were then assigned to models in the order in which they were drawn from the box. 18 For example, in the 8 districts of the 4 regions, assignment proceeded as follows: first facility, PBF1; second facility, PBF2; third facility, PBF3; fourth facility, PBF1; fifth facility, PBF2; sixth facility, PBF3; and so on. In these 8 districts, the three-arm randomization resulted in samples of 90 PBF1 facilities, 83 PBF2 facilities, and 84 PBF3 facilities. In the Boucle du Mouhoun region, where 2 districts already implementing CBHI were targeted, 59 facilities were randomized to receive either PBF1 or PBF4 (following the same procedure outlined above), generating samples of 29 PBF1 and 30 PBF4 facilities.
The 18 facilities which had been implementing CBHI prior to the launch of the study were excluded from randomization and hence from our study. Nevertheless, for ethical reasons, these facilities all implemented PBF in addition to CBHI. This study component was set to allow us to measure the benefit of combining PBF with an equity measure compared to implementing standard PBF on its own. In particular, we used this experimental component to measure the additional equity effects of PBF2, PBF3, and PBF4 compared to PBF1.
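The sequential assignment used in the randomization ceremonies described above can be sketched in a few lines of Python. This is an illustration only, not the procedure's official implementation: the function name `assign_arms`, the facility labels, and the use of a seeded shuffle to stand in for drawing names from a box are all assumptions made for the example.

```python
import random

ARMS = ["PBF1", "PBF2", "PBF3"]  # order in which models are handed out


def assign_arms(facilities, arms=ARMS, seed=None):
    """Draw facility names in random order and assign arms cyclically.

    The shuffle stands in for drawing names from the box; the i-th name
    drawn receives arms[i % len(arms)], ie, PBF1, PBF2, PBF3, PBF1, ...
    """
    rng = random.Random(seed)
    drawn = list(facilities)
    rng.shuffle(drawn)
    return {name: arms[i % len(arms)] for i, name in enumerate(drawn)}


# Hypothetical district with 10 facilities (labels are made up).
facilities = [f"facility_{i:02d}" for i in range(1, 11)]
allocation = assign_arms(facilities, seed=2014)
```

Because arms are handed out cyclically, arm sizes within one draw sequence differ by at most one facility. The same function could represent the two-arm allocation in Boucle du Mouhoun by passing `arms=["PBF1", "PBF4"]`.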
Sampling and Data Sources
For both study components, we used repeated cross-sectional household survey data collected at baseline from November 2013 to March 2014 and at endline from April to June 2017. Sampling followed a three-stage cluster sampling procedure. First, for each primary healthcare facility included in the study (416 in intervention districts and 117 in control districts), we randomly selected one village. The number of facilities is larger in intervention districts because there we took a census of all facilities, whereas in control districts we randomly selected one third of all facilities. Second, within each village, we randomly selected 15 households from among all households identified as having at least one woman who was pregnant or had completed a pregnancy in the prior 24 months (inclusion criterion). Third, within a household, we interviewed all women of reproductive age (15-49 years), irrespective of whether they had a recent history of pregnancy. The survey collected information on use of reproductive and maternal health services from women of reproductive age (15-49 years). Data on use of family planning were collected from all women of reproductive age regardless of marital status, while data on use of maternal health services were collected only from women with a recent pregnancy. At baseline in study component 1, our sample comprised 9999 women of reproductive age (15-49 years), of whom 7766 were in the intervention group and 2233 in the control group.

Variables and Their Measurement
Table 2 summarizes all outcomes and control variables.
Our outcomes were selected to capture service coverage (defined as utilization given need, ie, pregnancy status) along the reproductive and maternal health service continuum and to reflect services which were incentivized by the PBF program, namely: ANC in the first trimester, at least 4 antenatal care (ANC4+) visits, at least 2 doses of tetanus toxoid vaccine (TTV2+), iron supplementation, HIV testing in pregnancy, facility-based delivery, at least 1 postnatal care (PNC1+) visit, at least 3 postnatal care (PNC3+) visits, and modern family planning methods (female sterilization, male sterilization, intrauterine device [IUD]/spiral, injectables/depoprovera, implants/norplant, male condom, female condom, diaphragm, foam/jelly). To improve the precision of our estimates, we included a number of control variables which, based on our previous work, have the potential to explain variation in the outcome indicators. 21 We relied on multiple correspondence analysis, run separately on the baseline and endline samples, to generate a wealth index based on asset ownership and dwelling characteristics. 22 Given our specific research focus and the intervention's intention to treat the ultra-poor, 23 and in line with prior literature, 8 we divided households into 2 wealth brackets: the lowest 20% (ie, the ultra-poor) and the upper 80%.
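The split into the two wealth brackets can be illustrated with a short Python sketch. It assumes the wealth index scores have already been computed (the multiple correspondence analysis itself is not reproduced here), and the function name `wealth_brackets`, the labels, the example scores, and the rank-based handling of ties are assumptions made for illustration rather than the authors' procedure.

```python
def wealth_brackets(scores, cutoff_pct=20):
    """Label each household 'lowest20' or 'upper80' by wealth index rank.

    scores: one wealth index value per household (lower = poorer).
    The poorest cutoff_pct percent of households, by rank, form the
    bottom bracket; ties are broken by position in the input list.
    """
    n = len(scores)
    n_poor = n * cutoff_pct // 100  # integer arithmetic avoids float rounding
    order = sorted(range(n), key=lambda i: scores[i])  # poorest first
    poorest = set(order[:n_poor])
    return ["lowest20" if i in poorest else "upper80" for i in range(n)]


# Hypothetical index scores for 10 households in one survey wave.
scores = [0.9, -1.2, 0.1, 2.3, -0.4, 1.1, -2.0, 0.5, 0.0, 1.7]
labels = wealth_brackets(scores)
```

Since the index was generated separately for baseline and endline, a function like this would be applied once per survey wave, so that "lowest 20%" is always defined relative to the households surveyed in the same round.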

Data Analysis
Bivariate Analysis
First, we used t tests to assess systematic differences in the distribution of outcome and control variables across study arms for both study components.

Regression Analysis
Second, to assess the overall impact of PBF compared to status quo, we relied on study component 1 and used a difference-in-differences (DID) estimation approach, 24 comparing intervention districts (irrespective of specific study arm) with control districts. We estimated a linear probability model with standard errors clustered at district level. In addition, for each outcome, we included village fixed effects and several individual-level covariates (equation 1):

$$Y_{dvit} = \alpha_v + \beta\, Y17_t + \lambda\,(PBF_d \times Y17_t) + X_{it}\gamma + \varepsilon_{dvit} \qquad (1)$$

where $Y_{dvit}$ is the outcome for individual $i$ from village $v$ in district $d$ at time $t$ (baseline or endline); $Y17_t$ is a dummy variable representing endline; $PBF_d$ is a dummy variable denoting PBF exposure (1 = PBF, 0 = control); $\alpha_v$ denotes village fixed effects capturing time-invariant unobserved differences across villages; $X_{it}$ is a vector of individual-level covariates; $\beta$ and $\gamma$ are the coefficients on the endline dummy and the covariates, respectively; and $\varepsilon_{dvit}$ is the error term. $\lambda$ is the coefficient of interest (on the interaction term between PBF and endline) and gives the DID estimate for the effect of being located in a PBF district.
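The logic of the DID estimate can be made concrete with a minimal Python sketch. It computes the simple two-group, two-period version of the estimator, ie, the change in the PBF group minus the change in the control group, without the village fixed effects, covariates, or clustered standard errors used in the study. The function names and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def did_estimate(records):
    """Return the 2x2 DID estimate from (pbf, endline, outcome) records.

    pbf and endline are 0/1 dummies; outcome is 0/1 service use.
    DID = (PBF endline - PBF baseline) - (control endline - control baseline).
    """
    cell = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for pbf, endline, outcome in records:
        cell[(pbf, endline)].append(outcome)
    change_pbf = mean(cell[(1, 1)]) - mean(cell[(1, 0)])
    change_ctrl = mean(cell[(0, 1)]) - mean(cell[(0, 0)])
    return change_pbf - change_ctrl


def make_cell(pbf, endline, users, n):
    """Build n records for one group-period cell, of which `users` used the service."""
    return [(pbf, endline, 1)] * users + [(pbf, endline, 0)] * (n - users)


# Hypothetical coverage: PBF 80% -> 95%, control 70% -> 75%.
records = (
    make_cell(1, 0, 16, 20) + make_cell(1, 1, 19, 20)
    + make_cell(0, 0, 14, 20) + make_cell(0, 1, 15, 20)
)
effect = did_estimate(records)  # 0.15 - 0.05, ie, about 10 percentage points
```

In a regression without covariates or fixed effects, the coefficient on the PBF-by-endline interaction equals this difference of differences, which is why the interaction coefficient is read as the program effect.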
To determine overall PBF effects compared to status quo by socio-economic status group, we estimated regression model 1 separately by wealth bracket, following Lannes et al. 8 Third, to answer our key question on the equity impact of the PBF models integrating equity interventions, we relied exclusively on study component 2 (10 districts) and again used DID to estimate a linear probability model (equation 2), where $Y_{vit}$ is the outcome for individual $i$ from village $v$ at time $t$ (baseline or endline) in the intervention districts. $\lambda_2$ and $\lambda_3$ are the coefficients of interest that give the DID estimates for the effects of being resident in PBF2 and PBF3 areas compared to PBF1, respectively, and $\lambda_4$ is the coefficient of interest that gives the DID estimate for the effect of being resident in a PBF4 area compared to PBF1 in the Boucle du Mouhoun region.
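For orientation, one plausible form of the model for study component 2, written by analogy with equation 1, is sketched below. This is a reconstruction under stated assumptions, not the authors' published equation: the arm dummies $PBFk_v$ (equal to 1 if the facility serving village $v$ was assigned to model $k$), the coefficients $\beta$ and $\gamma$, and the use of a single summation over arms are all introduced here for illustration.

```latex
Y_{vit} = \alpha_v + \beta\, Y17_t
        + \sum_{k} \lambda_k \,\bigl(PBFk_v \times Y17_t\bigr)
        + X_{it}\gamma + \varepsilon_{vit}
% k \in \{2, 3\} in the 4 regions with three-arm randomization (reference arm: PBF1)
% k = 4 in Boucle du Mouhoun (reference arm: PBF1)
```

Under this reading, each $\lambda_k$ compares the baseline-to-endline change in arm $k$ with the corresponding change in PBF1 areas, so a negative $\lambda_k$ means the equity-enhanced model improved less (or declined more) than standard PBF.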
As described earlier, to estimate specific effects by socio-economic status, we performed separate analyses by wealth bracket. 8 Furthermore, for study component 1, we performed several robustness checks to account for the small number of clusters (24 districts). We did so in light of the existing literature suggesting that: (1) a small number of clusters results in a higher likelihood of estimating downward-biased standard errors, potentially leading to over-rejection of the null hypothesis, ie, suggesting significant program impact while in reality there is none or very little impact 25 ; and (2) bias arising from a small number of clusters is more acute in situations characterized by an imbalance in cluster sample sizes. 26 Hence, to account for these 2 problems pertaining to our study component 1, we relied on the "wild bootstrap" method for the related analyses. This method relies on a bootstrap-t procedure instead of bootstrapping the standard errors. 25 We performed all analyses using Stata 14 (Stata Corporation, Texas, USA).
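To give a sense of how a wild bootstrap-t works with few clusters, the following Python sketch applies the idea to a deliberately simplified setting. It assumes each district has already been collapsed to a single number, its baseline-to-endline change in coverage, so that the DID estimate is the difference in mean change between PBF and control districts. This is not the individual-level Stata analysis run in the study; the function names, the Rademacher (plus or minus 1) weights, the unequal-variance standard error, and the example data are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def t_stat(treated, control):
    """Difference in means divided by an unequal-variance standard error."""
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def wild_bootstrap_p(treated, control, reps=999, seed=0):
    """Two-sided wild bootstrap-t p-value with the null of no effect imposed.

    treated, control: one change score per cluster (district).
    """
    rng = random.Random(seed)
    t_obs = t_stat(treated, control)
    pooled = list(treated) + list(control)
    grand = mean(pooled)                 # restricted fit: common mean only
    resid = [y - grand for y in pooled]  # residuals under the null
    n_t = len(treated)
    extreme = 0
    for _ in range(reps):
        # Rademacher weights: flip each cluster's residual sign with prob. 1/2.
        y_star = [grand + rng.choice((-1, 1)) * r for r in resid]
        if abs(t_stat(y_star[:n_t], y_star[n_t:])) >= abs(t_obs):
            extreme += 1
    return extreme / reps


# Hypothetical district-level changes in coverage (percentage points).
pbf_changes = [6.0, 4.5, 7.2, 5.1, 3.9, 6.6]
control_changes = [1.0, 2.2, 0.4, 1.8, 2.9, 0.7]
p_value = wild_bootstrap_p(pbf_changes, control_changes)
```

The key design choice is that the t statistic itself, rather than the standard error, is re-computed in every bootstrap draw, and the observed t statistic is then compared against that simulated distribution. This is what distinguishes the bootstrap-t approach from simply bootstrapping standard errors.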

Results
The results of our study are presented by study component. Tables 3 and 4 show the bivariate analysis of the characteristics of women of reproductive age (15-49 years) and of coverage of maternal health services at baseline, respectively. At baseline, women in intervention and control districts were comparable on most demographic characteristics except age (Table 3).

Study Component 1
Bivariate Analysis
In contrast, significant differences between the PBF and control groups existed in baseline values for a number of outcome variables: ANC4+ visits, HIV testing in pregnancy, facility-based delivery, PNC1+ visit and PNC3+ visits (Table 4).

Regression Analysis
Table 5 summarizes the results of the regression models pertaining to the overall impact of PBF compared to status quo, both for the entire sample and stratified by socio-economic group, for study component 1. We detected a positive effect of PBF on utilization of facility-based delivery [4.4 percentage points (pp) (P < .1)] and of PNC3+ visits [6.6 pp (P < .1)]. This effect was primarily driven by an effect among the upper 80% of 5.5 pp (P < .05) and of 7.2 pp (P < .1) for facility-based delivery and PNC3+ visits, respectively. Among the poorest 20%, we detected an increase attributable to PBF in utilization of modern family planning methods of 7.6 pp (P < .1). Table 6 summarizes the results of the robustness tests using the "wild bootstrap" method. 25 The results show that the estimates included in this study were all within the 95% confidence interval, and as such, there is no concern that the DID estimates in this study component are substantially biased due to the small number of clusters and/or to imbalances in cluster sample sizes. These results imply that we are not at risk of detecting an effect as significant when there was in fact no effect.
Tables 7 and 8 present the bivariate analysis of the characteristics of women of reproductive age (15-49 years) and of coverage of maternal health services, respectively, at baseline in PBF2, PBF3 and PBF1 in the 4 regions (Centre-Est, Centre-Nord, Sud-Ouest, Nord) and in PBF4 and PBF1 in the Boucle du Mouhoun region. At baseline, women across the 4 PBF arms were comparable except for marital status, distance to primary healthcare facilities, and age (Table 7). The random allocation of facilities to the 4 PBF intervention arms resulted in balance across intervention arms for the majority of the outcome variables, and utilization of most maternal care services increased over time, with the exception of TTV2+ (Table 8). Table 9 summarizes the results of the regression models aimed at estimating the additional benefit of PBF2, PBF3, and PBF4 compared to PBF1 for study component 2. Only PBF4 appeared to produce additional benefits, with significant positive effects on TTV2+ of 13.1 pp (P < .05) and on iron supplementation of 6.2 pp (P < .05) over and above PBF1. PBF2 performed worse than PBF1 in terms of its effect on utilization of TTV2+, by 6.8 pp (P < .1), while PBF3 had negative additional effects on facility-based delivery of 3.8 pp (P < .1) and on PNC1+ visit of 6.7 pp (P < .1). Table 10 represents the core of our analysis, as it summarizes the results of the regression models aimed at estimating the additional benefit of PBF2, PBF3, and PBF4 compared to PBF1 by socio-economic subgroup for study component 2. Similar to the overall findings presented in Table 9, the equity measures that accompanied the implementation of PBF did not result in any additional benefit for the poorest 20%, but rather the opposite on certain indicators.
PBF2 and PBF3 decreased utilization of facility-based delivery by 11.7 pp (P < .05) and 11.8 pp (P < .05), respectively among the poorest 20%. In addition, PBF3 decreased utilization of iron supplementation by 7.4 pp (P < .05) and modern family planning methods by 12.7 pp (P < .05) among the poorest 20%, while PBF4 decreased utilization of PNC1+ visit by 24.1 pp (P < .05) among the poorest 20%. The overall positive additional effect of PBF4 on TTV2+ and iron supplementation coverage (seen in Table 9) was present only among the upper 80%, with an increase of 13.7 pp (P < .05) and 7.6 pp (P < .05) respectively, but not among the poorest 20%.

Discussion
This study makes a unique contribution to the literature by combining experimental and quasi-experimental elements to investigate not only the overall impact of PBF on maternal health service coverage, but specifically the role of combining equity interventions with standard PBF to reduce existing inequities.
Unfortunately, the results do not correspond to what policy-makers and their development partners had intended to achieve when designing the intervention, and in some instances even run in the opposite direction. It is unfortunate that no data on the costs of any of the interventions (PBF, equity measures, gratuité) are available, as these would have allowed policy-makers to put the results into better perspective. Nevertheless, appraising findings across our multiple strands of analysis, it appears that while PBF produced modest changes compared to status quo, the implementation of equity interventions did not generate additional benefits compared to PBF alone, neither for women in general nor for the poorest women specifically. In a few selected instances, PBF combined with equity interventions even resulted in worse outcomes than PBF alone. This observation is aligned with some published evidence. For instance, in Cambodia, coupling PBF with maternity vouchers to cover user fees for the poor was also observed not to improve service utilization for the poor. 27 Two studies from Rwanda, where PBF was coupled with CBHI in the analysis, showed mixed results. One study found that PBF yielded no equity effects, 12 while the other study detected pro-poor effects for utilization of facility-based deliveries but a negative equity effect on use of modern family planning methods among the poor. 8
Before we attempt to uncover reasons why PBF did not attain the intended equity effects, we shall briefly comment on our finding regarding facility-based delivery. While an absolute change of 4 percentage points at a significance level of 10% may appear to be negligible, it is in fact remarkable considering the extremely high baseline utilization values, approaching 90%. In addition, we ought to consider that study component 1 was largely underpowered due to the constraints imposed by the small number of clusters, equivalent to 24 districts.
Since the least-poor drove changes in utilization of facility-based delivery, it is plausible to assume that PBF might have produced positive changes in quality of service delivery necessary to encourage further utilization among those with means to be receptive to quality improvements. Further research into the impact of PBF on quality of service delivery is needed to verify this hypothesis. The positive change observed on PNC is less striking, since baseline utilization values departed from relatively low levels. Nevertheless, this change is highly relevant given that health systems currently struggle to increase use of PNC services. 28 The general decline in coverage of TTV2+ from baseline to endline in both PBF and control catchment areas may be due to the fact that by the time the endline data were collected, most women in the catchment areas had received the 5 doses stipulated by government policy and were therefore not eligible to receive additional vaccination. This issue arose as the question in the survey was set to capture new vaccinations rather than overall vaccination coverage. Understanding the lack of additional benefit produced by the more complex PBF models integrating equity interventions compared to standard PBF requires a closer inspection of the study context. Endline data collection took place between April and June 2017, approximately one year after the launch of the national free healthcare policy targeting women and children. 17 Given a pregnancy recall period of 24 months, this means that by the time we collected endline data, only a portion of women in our study area had been exempted from payment of user fees for all maternal care services except modern family planning methods, irrespective of whether they lived in the catchment area of a standard PBF facility (PBF1) or in areas with an additional targeting and subsidization of the ultra-poor (PBF2, PBF3) or a CBHI (PBF4) model.
It could be argued that following the launch of the national free healthcare policy, health service use among the ultra-poor for the specific indicators included in our study might have caught up so fast in PBF1 areas, due to the removal of the financial barrier, as to make it impossible for us to detect any effect of the PBF equity interventions which might have been there prior to June 2016. Our analysis, however, clearly indicates that saturation (ie, utilization rates of 100%) was not reached for any of the targeted indicators. In addition, we note that our effect estimation is by no means invalidated by the implementation of the national free healthcare policy for maternal health, since pre- and post-test designs with independent controls that rely on a DID analytical approach are not compromised by the presence of group-invariant factors, such as policies launched across all districts in the country simultaneously. 29 As such, if the national free healthcare policy did have any effect on service utilization (which we do not know, because this is beyond the scope of this study), it is likely to have done so in all PBF models and control districts, without affecting our ability to detect differences between PBF and control districts and across PBF models.
Nevertheless, we need to acknowledge that the introduction of the national free healthcare policy induced policy-makers to adjust the implementation of the equity measures in PBF2 and PBF3. Specifically, qualitative interviews with key stakeholders revealed that following the launch of the national free healthcare policy, the Ministry of Health removed additional compensation to healthcare facilities in PBF2 and PBF3 for all those services which were included in the free healthcare policy benefit package. This means that effectively, by the time we collected endline data, PBF2 and PBF3 were equivalent to PBF1 in terms of incentives related to all maternal care services except modern family planning. This could well have demotivated health providers from seeking innovative strategies for reaching out to provide the poor with the needed services. 30 It ought to be noted, however, that the introduction of the national free healthcare policy only touched the assumed financial mechanisms of the equity components, while effects of the sensitization mechanisms activated by the targeting exercise should have remained constant in PBF2 and PBF3 facility catchment areas only, but not in PBF1 facility catchment areas. As such, the introduction of the free national healthcare policy could have diluted, but not fully removed, the effect of the equity measures, had there been one in the first place.
The fact that we detected a negative effect of PBF2, PBF3, and PBF4 compared to PBF1 on selected indicators, especially when considering the stratified analysis looking only at the poorest, is worrisome. However, this appears to corroborate existing evidence pointing at the presence of unintended consequences related to the implementation of the community-based targeting and related subsidized program 30 and at general challenges related to the implementation of the overall PBF program in the country. 31,32 Appraising our current findings in light of existing literature suggests that combining PBF and equity interventions into a single intervention might have resulted in a level of complexity not easily manageable for front-line healthcare providers, ultimately leading to effects contrary to the ones that had been anticipated. For example, evidence shows that health providers introduced ceilings to services offered to the ultra-poor as a way of adapting to the complexity of the PBF interventions and to the long delays in receiving incentive payments, which created financial difficulties for health workers. 30 Albeit worrisome, the results of our analysis are not per se surprising as they align with PBF evidence from other settings as well as with prior research assessing the impact of earlier targeted exemption policies in Burkina Faso. For instance, the national obstetric care policy implemented from 2007 to 2016 failed to reach the poorest women effectively. 33 A recently published study indicated that lack of fidelity in implementing exemption policies may be due to providers' lack of adequate knowledge in the first place. 34 This suggests a need to educate providers on the purpose and procedures of a given policy to transform them into real agents of change, since poor communities are not sufficiently empowered to overcome all relevant barriers to access in response to a single targeting mechanism. Further qualitative inquiry is needed to unravel if and to what extent providers' understanding of the targeted exemptions represented a barrier to the effective implementation of equity interventions in Burkina Faso.
In addition, it is possible that mere removal of user fees through targeting was insufficient to enable very poor people to seek care. Prior evidence from the region and the country specifically clearly points at the presence of important nonfinancial barriers to access. 33,35 For example, inequities existed in facility-based delivery due to distance to catchment primary health facility, literacy, parity and religion. 21 There were also inequities in utilization rates for ANC4+ visits due to distance to catchment primary health facility, literacy, parity, religion and marital status and for PNC1+ visit due to distance to catchment primary health facility, age and religion. 21 Since the PBF program did not address these other sources of inequity, other than household wealth, arguably they still constituted barriers to uptake of essential maternal health services by the poorest women. Still, further research is needed to unravel why in some settings equity interventions are not effective in narrowing equity gaps while in others, such as Tanzania, 10 combining PBF and targeted exemptions resulted in greater service use among the poor in public health facilities. This is in line with Renmans et al, who have observed that although PBF has received increasing attention, a lot remains unknown about the exact mechanisms triggered by PBF arrangements. 36 As such, they have called for more research to examine the exact mechanisms through which not only incentives, but also ancillary components operate. Such knowledge is necessary to understand and appreciate the effectiveness, desirability and appropriateness of PBF as a possible tool towards health systems strengthening in LMICs.

Methodological Considerations
Our study is not without limitations. First, since the intervention took place within a real-life setting, we cannot rule out that other interventions with similar objectives took place alongside PBF, especially in control districts. Hence, we cannot estimate the extent to which our comparator really reflects status quo utilization rates. Second, we need to acknowledge that the women identified in our study as the poorest do not exactly match those identified by the program's community-based targeting procedure. Hence, the reader ought to be aware that our findings illustrate the impact on the lowest quintile in general and not on targeted individuals specifically. Parallel research efforts based on a different dataset are ongoing to look at the impact of the equity interventions specifically on targeted individuals. Third, the power to detect impact in study component 1 was limited by the relatively low number of clusters. This limitation was noted well in advance, when the overall PBF impact evaluation study was being designed, but financial and policy challenges made it impossible to increase the number of clusters, and hence all key stakeholders agreed to accept this limitation. Fourth, the purposive selection of the districts represents a potential threat to external validity, more specifically to the generalizability of the results emerging from study component 1. In particular, the purposive selection of the districts does not allow us to make inferences about the possible effects of PBF in districts with higher baseline values; hence, we need to exercise caution in generalizing the results of this study to other contexts. This purposive selection, however, does not represent a threat to the internal validity of the DID analysis, since it does not violate the basic assumptions of the DID model. 29,37 Fifth, as discussed extensively earlier, the modifications made to the PBF design following the introduction of the national free healthcare policy could have contributed to diluting, but not eliminating, the effect of PBF, had there been one in the first place.

Conclusion
PBF is being implemented in LMICs and sub-Saharan Africa in particular as a response to weak health systems performance. Although rapidly growing, evidence regarding its effectiveness is still very mixed. Evidence regarding PBF equity impacts on use of health services and maternal health is particularly scarce and, when available, it is mixed, in some cases conflicting. Our results indicate that even well-designed PBF interventions which integrate explicit equity components are not sufficient to overcome inequities in health service use. As such, our results confirm the need for additional interventions reaching beyond the financial realm to ensure access to care by the ultra-poor. In addition, our findings suggest that changing policy environments inevitably affect the way an intervention, in this case PBF combined with equity measures, is carried out and hence should be explicitly acknowledged when appraising effects.
Lastly, we would like to reiterate the importance of carefully monitoring and measuring the equity impact of interventions targeted at improving access and quality of service delivery as an integral element of SDG 3. The experience of the PBF program in Burkina Faso provides a clear illustration of how even well-intended and accurately designed interventions may fail to achieve their objective due to a variety of contextual elements that shape implementation in unintended ways. As such, greater attention ought to be paid to context and competing policies in designing strategies that aim at building synergies between supply and demand to overcome existing inequities.