The Time for Austerity: Estimating the Average Treatment Effect of Fiscal Policy

Abstract

After the Global Financial Crisis, a controversial rush to fiscal austerity followed in many countries. Yet research on the effects of austerity on macroeconomic aggregates was and still is unsettled, mired by the difficulty of identifying multipliers from observational data. This article reconciles seemingly disparate estimates of multipliers within a unified and state‐contingent framework. We achieve identification of causal effects with new propensity‐score based methods for time series data. Using this novel approach, we show that austerity is always a drag on growth, and especially so in depressed economies: a 1% of GDP fiscal consolidation translates into a loss of 3.5% of real GDP over five years when implemented in a slump, rather than just 1.8% in a boom.

I solemnly affirm and believe, if a hundred or a thousand men of the same age, same temperament and habits, together with the same surroundings, were attacked at the same time by the same disease, that if one half followed the prescriptions of the doctors of the variety of those practising at the present day, and that the other half took no medicine but relied on Nature’s instincts, I have no doubt as to which half would escape. – A doctor, quoted by Petrarch in his letter to Boccaccio, 1364 (Donaldson, 2016).

The boom, not the slump, is the right time for austerity at the Treasury. – J. M. Keynes (1937)

In 1809 on a battlefield in Portugal, in a defining experiment in epidemiological history, a Scottish surgeon and his colleagues attempted what some believe to have been the first recognisable medical trial, a test of the effectiveness of bloodletting on a sample of 366 soldiers allocated into treatment and control groups by alternation. The cure was shown to be bogus.
Tests of this sort heralded the beginning of the end of premodern medicine, vindicating sceptics like Petrarch for whom the idea of a fair trial was a mere thought experiment. Yet, even with alternation, allocation bias – i.e. ‘insufficient randomisation’ – remained pervasive in poor experimental designs (e.g. via foreknowledge of assignment) and the intellectual journey was only completed in the 1940s with the landmark British Medical Research Council trials of patulin and streptomycin. Ever since the randomised controlled trial has been the foundation of evidence‐based medicine.1 Is a similar evidence‐based macroeconomics possible and what can it learn from this noble scientific tradition? Ideas from the experimental approach bridge medicine, epidemiology, and statistics, and they have slowly infected empirical economics, although mostly on the micro side.2 In this article, we delve into the experimental toolkit so as to re‐examine a key issue for macroeconomics, the need to ensure treatments are somehow re‐randomised in non‐experimental data. We do this in the context of the foremost academic and policy dispute of the day – the effects of fiscal policy shocks on output; see, in particular, Alesina and Ardagna (2010) and Guajardo et al. (2014).3 Identification of the effects of fiscal consolidation in empirical studies has broadly taken one of two forms. In the context of a vector autoregression, one option is to achieve identification based on exclusion restrictions. In practice, these restrictions are roughly equivalent to a regression‐control strategy based on a limited number of observable controls and (usually) a linear conditional mean assumption. Examples of this strand of the literature include Alesina and Perotti (1995), Perotti (1999) and Mountford and Uhlig (2009). The other strand of the literature has approached the identification problem through instrumental variables. 
Examples of this approach include Auerbach and Gorodnichenko (2012, 2013), Mertens and Ravn (2013, 2014) and Owyang et al. (2013). Following a new and arguably more promising direction, we take a third fork on the road to identification based on the Rubin Causal Model. This approach has the attractive features of being semi‐parametric (and hence flexible with respect to the functional form), providing better control for observables, and offering a more reliable alternative when the putative instrumental variables for policy action are themselves possibly endogenous. Tests of instrument validity are well‐known to have low power (Cameron and Trivedi, 2005) but, more importantly, formal testing is not an option when we are in the case of exact identification. We find that, on average, fiscal consolidations generate a drag on GDP growth. The effect is also state dependent: if a 1% of GDP fiscal consolidation is imposed in a slump then it results in a real GDP loss of around 3.5% over five years, rather than just 1.8% in a boom. We arrive at this conclusion by carefully constructing an encompassing framework that allows us to evaluate the type of approach followed by several recent articles in the literature (to be discussed in detail shortly) to improve comparability with the methods we introduce in this article. In addition to accommodating existing methods, the framework allows us to address the identification concerns we uncover via the application of an estimator from the family of ‘doubly robust’ augmented inverse‐propensity‐score weighted regression adjustment methods (Robins et al., 1994; Scharfstein et al., 1999; Robins, 2000; Hirano et al., 2003; Imbens, 2004; Lunceford and Davidian, 2004; Glynn and Quinn, 2010). To provide more texture to our results, we evaluate the UK austerity programme implemented by the Coalition Government after the 2010 election. 
The Global Financial Crisis struck the US and the UK in a similar way and these economies ran on parallel trajectories until 2010. Thereafter the UK experienced a second slowdown while the US continued to grow. Using our estimates we compute how much of the slowdown could be attributed to the austerity programme; we find it to be a very significant contribution (rising to 3.1% of GDP in 2013) and larger than official estimates. Thus, better models, with state‐dependent features, could improve official fiscal policy analyses going forward.

1. The Austerity Debate: A Road Map

Are fiscal consolidations expansionary, neutral, or contractionary? In order to answer this question and understand the different answers the literature has arrived at so far, we proceed in a series of incremental stages.

First, we use the OECD annual panel data set adopted in two recent high‐profile yet seemingly irreconcilable studies. The ‘expansionary austerity’ idea has come to be associated with the paper by Alesina and Ardagna (2010, henceforth AA), an idea perhaps dating back at least to Giavazzi and Pagano (1990). On the other side, the IMF team of Guajardo et al. (2014, henceforth GLP) reached the opposite conclusion of ‘contractionary austerity’. By juxtaposing these two papers we are not implying that the literature falls evenly or comprehensively within these two camps. We use the contraposition as a rhetorical device much like Perotti (2013), who presents a lucid discussion of the empirical pitfalls in this research area.

Second, we use Jordà (2005) local projections (LPs), rather than structural vector autoregressions (SVARs). The reason is that, among other advantages that we will discuss momentarily, LPs are a convenient pedestal on which all extensions of existing estimation methods can rest. The unified framework provides the reader a way to compare the results across a set of nested estimation strategies.
LPs provide a flexible semi‐parametric regression control strategy to estimate dynamic multipliers and include, as a special case, impulse responses calculated with an SVAR. LPs accommodate possibly non‐linear, or state‐dependent responses easily, and indeed we find that the effects of fiscal policy can be very different in the boom and the slump, as emphasised by Keynes in the 1930s. State‐dependent multipliers based on LPs have been taken up in some very recent papers (Auerbach and Gorodnichenko, 2012, 2013, for the US and OECD; Owyang et al., 2013, for the US and Canada). Other recent papers on state‐dependent multipliers, using various measures of slack, include Barro and Redlick (2011) and Nakamura and Steinsson (2014). For a critical survey see Parker (2011). Long ago, Perotti (1999) explored the idea of ‘expansionary austerity’ with state‐dependent multipliers. We calculate the impact of fiscal policy shocks based on LPs using the AA measure of policy, the change in the cyclically‐adjusted primary balance (d.CAPB).4 When we restrict attention to ‘large’ shocks (changes in CAPB larger in magnitude than 1.5% of GDP, which is the benchmark cutoff value used by AA and proposed earlier by Alesina and Perotti, 1995), we replicate the ‘expansionary austerity’ result. However, when we condition on the state of the economy, we find that this result is driven entirely by what happens during a boom. The expansionary effects of fiscal consolidation evaporate when the economy is in a slump. Third, we then use instrumental variable (IV) estimation of the LPs to account for unobserved confounders. Specifically, we instrument the cyclically‐adjusted primary balance using the IMF’s narrative measure of an exogenous fiscal consolidation in GLP. This type of ‘narrative‐based identification’ has been applied by, e.g. Romer and Romer (1989, 1997), Ramey and Shapiro (1998) and Mertens and Ravn (2013, 2014). 
Our IV estimation then turns out to replicate the flavour of the GLP results: austerity is contractionary, and strongly so in slumps. Fourth, we show that the proposed IMF narrative instrumental variable has a significant forecastable element driven by plausible state variables, such as the debt‐to‐GDP level, the cyclical level or rate of growth of real GDP, and the lagged treatment indicator itself (since austerity programmes are typically persistent, multi‐year affairs).5 Formal testing of instrument validity is not possible since we have exact (not over‐) identification. However, the evidence that we provide calls into question the validity of the narrative instrumental variable. As noted above in the history of medicine and as with any efforts to construct a narrative policy variable that is exogenous, one has to worry about the possibility that treatment is still contaminated by endogeneity, which would impart allocation bias to any estimates.6 Fifth, in order to purge the remaining allocation bias, we use inverse probability weighting (IPW) estimation based on a prediction model of the narrative policy variable to estimate the LP responses. We consider the IMF narrative policy variable as a ‘fiscal treatment’ – i.e. a binary indicator rather than a continuous variable – and we are interested in characterising a dynamic average treatment effect (ATE). In new work, Angrist et al. (2013) introduce IPW estimators in a time series context to calculate the dynamic ATE responses to policy interventions. We follow a slightly different approach using augmented regression‐adjusted estimation instead, denoted AIPW, which combines IPW with regression control and adjusts the estimator to achieve semi‐parametric efficiency (Lunceford and Davidian, 2004). Our AIPW estimator falls into the broad class of ‘doubly robust’ estimators of which Robins et al. 
(1994) is perhaps the earliest reference (Scharfstein et al., 1999; Robins, 2000; Hirano et al., 2003; Imbens, 2004; Lunceford and Davidian, 2004; Glynn and Quinn, 2010). The ‘doubly robust’ property means that consistency of the estimated ATE can be proved when either the propensity score model or the conditional mean model (or both) is correctly specified; Monte Carlo evidence also suggests that the estimator performs better than alternatives in more general cases. The remainder of the article expands on each of these stages in turn.

2. Replicating Expansionary Austerity: OLS Results

Our first estimates use OLS estimation with the LP method, based on the traditional variable in the literature, the change in the cyclically adjusted primary balance (denoted d.CAPB), the same variable used by Alesina and Perotti (1995) and by AA, and used as a reference point by GLP in the IMF study. The local projection is done from year 0, when a policy change is assumed to be announced, with the fiscal impacts first felt in year 1, consistent with the timing in GLP. The LP output forecast path is constructed out to year 5; we show deviations from year 0 levels and also the sum of these deviations, or ‘lost output’, across all five years. To create a benchmark estimating equation that mimics the standard setup in the literature, the typical LP equation that we estimate has the form:

$$y_{i,t+h} - y_{i,t} = \alpha_i^h + \Lambda^h D_{i,t+1} + \beta_{L0}^h \Delta y_{i,t} + \beta_{L1}^h \Delta y_{i,t-1} + \beta_C^h y_{i,t}^C + v_{i,t+h}, \qquad (1)$$

for $h = 1, \ldots, 5$, where $y_{i,t+h} - y_{i,t}$ denotes the cumulative change from time $t$ to $t + h$ in 100 times the log of real GDP, the $\alpha_i^h$ are country‐fixed effects, and $D_{i,t+1}$ denotes the d.CAPB policy variable (measured from time $t$ to time $t + 1$ given the assumed timing of the announcement and implementation of fiscal plans).
Finally, to control for reversion to the potential output trend, the term $y_{i,t}^C$ is the output gap, denoting the cyclical component of GDP, and it is proxied here by deviations of log real GDP from an HP trend estimated with a smoothing parameter of λ = 100. We use the subscripts L0 and L1 for the β parameters associated with $\Delta y_{i,t-l}$ for $l = 0, 1$ so as not to confuse them with the j = 1, 0 treatment‐control index that we will use later.

Our choice of the HP filter with λ = 100 was justified by a series of experiments undertaken with US postwar data (from FRED), which showed that a relatively high smoothing parameter was needed if the proposed proxy series (HP filtered log real GDP) was to come close to matching the official CBO output gap series. We also replicated this type of analysis using a bandpass filter tuned to various frequencies and the conclusions were very similar. That is to say, we found that the conventional filter frequencies typically used in the business cycle literature are too low to provide a good match with the output gap, which is what we want in our model so as to control for reversion to trend. These experiments are available from the authors upon request.

The specification (1) nests the main elements in AA and GLP to facilitate comparisons of our results with theirs. The coefficient $\Lambda^h$ from expression (1) is the parameter governing the impact of the continuous policy treatment measured by d.CAPB. It corresponds to the constrained version of expression (7) below in Section 5, which we have rearranged to get a direct estimate of the average response to policy intervention $\Lambda^h$ from the regression output, but which is otherwise specified the same way. Table 1 reports estimates based on expression (1). Estimated log real GDP impacts (× 100) for each year are reported in columns 1–5 and the five‐year sum of the deviations in the final column 6.
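Expression (1) can be estimated one horizon at a time. The following is a minimal sketch under stated assumptions, not the authors' code: it assumes a balanced panel stored as NumPy arrays, builds the output gap with a textbook HP filter (λ = 100), and absorbs the country fixed effects with dummy variables. The function names `hp_filter` and `local_projections` and the array layout are hypothetical, and the sketch returns point estimates only (no country-clustered standard errors).

```python
import numpy as np


def hp_filter(y, lam=100.0):
    """Hodrick-Prescott filter: return (trend, cycle) for a 1-D series."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # K is the (n-2) x n second-difference operator.
    K = np.zeros((n - 2, n))
    for r in range(n - 2):
        K[r, r:r + 3] = (1.0, -2.0, 1.0)
    # The trend minimises sum (y - tau)^2 + lam * sum (second diff of tau)^2.
    trend = np.linalg.solve(np.eye(n) + lam * (K.T @ K), y)
    return trend, y - trend


def local_projections(y, d, horizons=(1, 2, 3, 4, 5), lam=100.0):
    """OLS estimate of Lambda^h in expression (1), one regression per horizon.

    y : (countries, years) array of 100 * log real GDP (balanced panel assumed)
    d : (countries, years) array; d[i, t] is the policy change from t-1 to t
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    n_c, n_t = y.shape
    # Output gap proxy: HP cycle computed country by country.
    gap = np.vstack([hp_filter(y[i], lam)[1] for i in range(n_c)])
    estimates = {}
    for h in horizons:
        t = np.arange(2, n_t - h)  # base years with two lags and h leads available
        lhs = (y[:, t + h] - y[:, t]).ravel()
        regressors = [
            d[:, t + 1],               # D_{i,t+1}: policy change from t to t+1
            y[:, t] - y[:, t - 1],     # Delta y_{i,t}
            y[:, t - 1] - y[:, t - 2], # Delta y_{i,t-1}
            gap[:, t],                 # y^C_{i,t}
        ]
        X = np.column_stack([r.ravel() for r in regressors])
        # Country dummies play the role of the fixed effects alpha_i^h.
        dummies = np.kron(np.eye(n_c), np.ones((t.size, 1)))
        coef, *_ = np.linalg.lstsq(np.hstack([X, dummies]), lhs, rcond=None)
        estimates[h] = coef[0]
    return estimates


if __name__ == "__main__":
    # Purely synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    d = rng.normal(size=(17, 30))
    y = np.cumsum(2.0 + 0.3 * d + rng.normal(scale=0.5, size=d.shape), axis=1)
    print(local_projections(y, d))
```

Because each horizon is a separate regression, the same loop can be rerun on subsamples (for example, observations with a positive or non-positive output gap) without changing the specification.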
In parallel with the main result in AA, the data appear to support the notion that fiscal consolidation can be expansionary, especially in the first two years, although the effects are economically modest and the cumulative effect over a five‐year period is small and negative. If we focus on multiplier estimates based on large consolidations (i.e. changes in CAPB larger than 1.5% of GDP using the Alesina and Perotti (1995) and AA cutoff value), then the results are almost identical. Small consolidation packages have a small effect on output, but the estimates are imprecise.

Table 1. Fiscal Multiplier, Effect of d.CAPB, OLS Estimates. Deviation in log real GDP (relative to Year 0, × 100).

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal multiplier, full sample | 0.11** | 0.12** | −0.04 | −0.21*** | −0.32** | −0.42** |
| | (0.04) | (0.05) | (0.04) | (0.07) | (0.12) | (0.16) |
| Fiscal multiplier, large change in CAPB (>1.5%) | 0.12** | 0.13** | −0.04 | −0.23*** | −0.33** | −0.41* |
| | (0.04) | (0.05) | (0.04) | (0.07) | (0.12) | (0.19) |
| Fiscal multiplier, small change in CAPB (≤1.5%) | 0.06 | 0.11 | 0.03 | −0.07 | −0.23 | −0.53 |
| | (0.07) | (0.15) | (0.14) | (0.19) | (0.28) | (0.50) |
| Observations | 457 | 440 | 423 | 406 | 389 | 389 |

Notes: Standard errors (clustered by country) in parentheses. ***/**/* indicate p < 0.01/0.05/0.10. Additional controls: cyclical component of y, two lags of change in y, country fixed effects.

Would the picture change much if we broke down the analysis of the impact of consolidation according to whether the economy is experiencing a boom or a slump? Estimation was next carried out on two bins of the data to allow responses to be state dependent. We sort on the sign of $y^C$, the time‐0 output gap (HP filtered), into ‘boom’ and ‘slump’ bins, to capture conditions at time 0 varying across the cycle.
This partition places just over 200 observations in each of the ‘boom’ and ‘slump’ bins, given the AA‐GLP combined data set with about 450 observations in total, after allowing for observations lost due to lags. Note that this partition is meant to provide a more granular statistical summary of the main features of the data. We are not arguing whether or not a boom or a slump is more likely under a particular choice of fiscal policy or another.

Table 2 shows OLS estimated responses using expression (1) by sorting the data into these two bins. Panel (a) shows the estimated response coefficient at year h based on values of d.CAPB common to the AA and GLP data sets. Panel (b) shows results when we estimate separate response coefficients for ‘large’ and ‘small’ changes in d.CAPB, following the 1.5% of GDP cutoff value employed by Alesina and Perotti (1995) and by AA. These distinctions prove to be relatively unimportant since, as can be seen, all of the action is driven by ‘large’ changes, with similar coefficients on the ‘large’ changes in panel (b) and all changes in panel (a). In panel (b), the coefficients for ‘small’ changes are small and not statistically significant at conventional levels. This is similar to what we found in Table 1.

Table 2. Fiscal Multiplier, Effect of d.CAPB, OLS Estimates, Booms versus Slumps. Deviation in log real GDP (relative to Year 0, × 100).

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Panel (a): uniform effect of d.CAPB changes | | | | | | |
| Fiscal multiplier, $y^C > 0$, boom | 0.21*** | 0.24*** | 0.05 | −0.17 | −0.22 | −0.02 |
| | (0.07) | (0.07) | (0.05) | (0.11) | (0.15) | (0.24) |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, $y^C \le 0$, slump | −0.03 | −0.07 | −0.17 | −0.23* | −0.41** | −0.98** |
| | (0.04) | (0.07) | (0.11) | (0.12) | (0.18) | (0.40) |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |
| Panel (b): separate effects of d.CAPB for large (>1.5%) and small (≤1.5%) changes in CAPB | | | | | | |
| Fiscal multiplier, large change in CAPB, $y^C > 0$, boom | 0.23** | 0.24*** | 0.06 | −0.15 | −0.18 | 0.13 |
| | (0.08) | (0.08) | (0.06) | (0.11) | (0.15) | (0.28) |
| Fiscal multiplier, small change in CAPB, $y^C > 0$, boom | 0.06 | 0.21 | −0.04 | −0.32 | −0.57 | −1.55 |
| | (0.11) | (0.35) | (0.40) | (0.37) | (0.41) | (1.14) |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, large change in CAPB, $y^C \le 0$, slump | −0.02 | −0.05 | −0.18 | −0.30* | −0.52** | −1.16* |
| | (0.05) | (0.08) | (0.13) | (0.16) | (0.23) | (0.56) |
| Fiscal multiplier, small change in CAPB, $y^C \le 0$, slump | −0.05 | −0.16 | −0.10 | 0.13 | 0.17 | 0.03 |
| | (0.12) | (0.21) | (0.23) | (0.32) | (0.49) | (1.10) |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |

Notes: Standard errors (clustered by country) in parentheses. ***/**/* indicate p < 0.01/0.05/0.10. $y^C$ is the cyclical component of log y (log real GDP), from HP filter with λ = 100. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. The boom bin is for observations where the cyclical component $y^C$ is greater than zero; the slump bin is for observations where the cyclical component is less than or equal to zero. Large consolidations means larger than 1.5% of GDP; small means less than or equal to 1.5% of GDP.

The results are reasonable and consistent with the literature, and particularly the GLP replication of the AA‐type results. The OLS estimates suggest that fiscal austerity is expansionary, since the only statistically significant coefficients are ones that have a positive sign.
However, our stratification of the results by the state of the cycle at time 0 brings out a new insight, and shows that this result is entirely driven by what happens in booms. It is only in the boom that we find a significant positive response of real GDP to fiscal tightening, with a coefficient or ‘multiplier’ (the more general usage of the term, which we follow in the remainder of the article) of nearly 0.25 in years 1 and 2. Over five years, the sum of these effects is small, near 0.15. In the slump, the estimate of the policy response is not statistically different from zero and in many cases it is negative.

3. Replicating Contractionary Austerity: IV Results

One widely shared concern with the OLS estimates just discussed is that the policy measure d.CAPB may be highly imperfect for the job. It likely suffers from both measurement error and endogeneity. A recent frank discussion of the measurement problems with this concept is presented by Perotti (2013). Moreover, disentangling the true cyclical component of this variable from the observed actual outcome has to rely on modelling assumptions about the sensitivity of taxes and revenues to the cycle – effects which may be only imprecisely estimated and which may not be stable over time or across countries. If that attempt at purging the cyclical part of the variable still leaves some endogenous variation in d.CAPB, then the implicit assumption of exogeneity needed for a causal estimate and policy analysis would be violated.

One potential solution therefore is to seek a different and more direct measure of underlying fiscal policy change, using the so‐called ‘narrative approach’ (Romer and Romer, 1989). This was the arduous strategy adopted by the IMF’s GLP study, which went back over the history of 17 OECD countries and estimated the timing and magnitude of fiscal policy shocks on a year‐by‐year basis, based on documentary evidence from each country concerning the policies enacted since the 1970s.
GLP focused exclusively on fiscal consolidation episodes, where authorities aimed to reduce their budget deficit, and they sought events that were not reactions to the contemporaneous or prospective economic conditions, so that they could claim plausible exogeneity. We employ the IMF narrative measures in two ways: much of the time we use an indicator of a fiscal treatment (denoted Treatment), which is simply a country‐year binary 0‐1 dummy that shows when a consolidation is taking place; the other variable of interest is the IMF’s estimate of the magnitude of the consolidation measures in that year as a percentage of GDP (denoted Total), which provides a scaled measure of that year’s austerity package.

To bring this IMF approach into our framework, and consistent with our OLS replication of the AA results above, we present in Tables 3 and 4 our IV estimates, which make use of the IMF narrative variable. We re‐estimate expression (1) using the IMF dates of fiscal consolidations as both binary and continuous instruments. This approach is parallel to the approach in Mertens and Ravn (2013, 2014) for the US and based on Stock and Watson (2012). If the IMF approach is correct and has found truly exogenous shocks to fiscal policy, then it would be a valid instrument for d.CAPB. It would also be a potentially strong instrument: the raw correlation between d.CAPB (year 1 versus year 0) and Treatment (in year 1) is 0.31 and a bivariate regression has an F‐statistic of over 50; the same applies when Treatment is replaced by Total (in year 1).

Table 3. Fiscal Multiplier, Effect of d.CAPB, IV Estimates. Deviation in log real GDP (relative to Year 0, × 100).

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal multiplier, binary Treatment IV | −0.34** | −0.72*** | −0.76*** | −0.78*** | −0.88*** | −2.94*** |
| | (0.12) | (0.23) | (0.25) | (0.23) | (0.28) | (0.84) |
| First stage F‐statistic | 32.85 | 33.41 | 26.61 | 31.99 | 30.78 | 30.78 |
| Fiscal multiplier, continuous Total IV | −0.46*** | −0.81*** | −0.69** | −0.58* | −0.68** | −2.77** |
| | (0.13) | (0.23) | (0.31) | (0.28) | (0.30) | (0.97) |
| First stage F‐statistic | 53.90 | 51.52 | 48.95 | 45.25 | 42.39 | 42.39 |
| Observations | 457 | 440 | 423 | 406 | 389 | 389 |

Notes: Standard errors (clustered by country) in parentheses. ***/**/* indicate p < 0.01/0.05/0.10. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. d.CAPB instrumented by IMF fiscal action variable in binary 0‐1 form (Treatment) in the top panel, and as a continuous (Total) variable in the bottom panel. First stage F‐statistic reports the Kleibergen‐Paap weak identification Wald test statistic.

Table 4. Fiscal Multiplier, Effect of d.CAPB, IV Estimates (binary IV), Booms versus Slumps. Deviation in log real GDP (relative to Year 0, × 100).

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal multiplier, $y^C > 0$, boom | −0.34 | −0.32 | −0.13 | −0.59 | −0.81 | −1.36 |
| | (0.33) | (0.50) | (0.51) | (0.52) | (0.59) | (1.78) |
| First stage F‐statistic | 11.60 | 10.22 | 8.16 | 11.67 | 11.87 | 11.87 |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, $y^C \le 0$, slump | −0.25 | −0.76*** | −0.95*** | −0.79** | −0.93* | −3.35** |
| | (0.15) | (0.25) | (0.31) | (0.33) | (0.45) | (1.19) |
| First stage F‐statistic | 32.45 | 32.45 | 27.74 | 28.34 | 28.10 | 28.10 |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |

Notes: Standard errors (clustered by country) in parentheses. ***/**/* indicate p < 0.01/0.05/0.10. The boom bin is for observations where the cyclical component $y^C$ is greater than zero; the slump bin is for observations where the cyclical component is less than or equal to zero. $y^C$ is the cyclical component of log y (log real GDP), from HP filter with λ = 100. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. d.CAPB instrumented by IMF fiscal action variable in binary 0‐1 form (Treatment). First stage F‐statistic reports the Kleibergen‐Paap weak identification Wald test statistic.
Fiscal multiplier, yC>0 ⁠, boom −0.34 −0.32 −0.13 −0.59 −0.81 −1.36 (0.33) (0.50) (0.51) (0.52) (0.59) (1.78) First stage F‐statistic 11.60 10.22 8.16 11.67 11.87 11.87 Observations 222 205 192 180 175 175 Fiscal multiplier, yC≤0 ⁠, slump −0.25 −0.76*** −0.95*** −0.79** −0.93* −3.35** (0.15) (0.25) (0.31) (0.33) (0.45) (1.19) First stage F‐statistic 32.45 32.45 27.74 28.34 28.10 28.10 Observations 235 235 231 226 214 214 Deviation in log real GDP (relative to Year 0, × 100) . . (1) . (2) . (3) . (4) . (5) . (6) . . Year 1 . Year 2 . Year 3 . Year 4 . Year 5 . Sum . Fiscal multiplier, yC>0 ⁠, boom −0.34 −0.32 −0.13 −0.59 −0.81 −1.36 (0.33) (0.50) (0.51) (0.52) (0.59) (1.78) First stage F‐statistic 11.60 10.22 8.16 11.67 11.87 11.87 Observations 222 205 192 180 175 175 Fiscal multiplier, yC≤0 ⁠, slump −0.25 −0.76*** −0.95*** −0.79** −0.93* −3.35** (0.15) (0.25) (0.31) (0.33) (0.45) (1.19) First stage F‐statistic 32.45 32.45 27.74 28.34 28.10 28.10 Observations 235 235 231 226 214 214 Notes Standard errors (clustered by country) in parentheses. ***/**/* Indicate p < 0.01/0.05/0.10. The boom bin is for observations where the cyclical component yC is greater than zero, the slump bin is for observations where the cyclical component is less than or equal to zero. yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. d.CAPB instrumented by IMF fiscal action variable in binary 0‐1 form (treatment). First stage F‐statistic reports the Kleibergen‐Paap weak identification Wald test statistic Open in new tab We begin by re‐estimating the full sample specification reported in the top panel of Table 1 using instrumental variables in two ways. First we use the IMF narrative variables on dates of fiscal consolidation as a binary instrument (first row). Second, for a continuous IV we use the size of the consolidation identified by the IMF (second row). 
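The logic of instrumenting the policy variable with a narrative 0-1 event can be illustrated with a small simulation. The sketch below is not the specification estimated in Tables 3 and 4 (it has no controls, fixed effects or clustering); the data-generating process, the value of `TRUE_EFFECT` and all function names are hypothetical choices made only for illustration. It uses the textbook just-identified IV ratio, cov(z, y)/cov(z, d).

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def ols_slope(x, y):
    """Bivariate OLS slope of y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Just-identified IV slope of y on x, using z as the single instrument."""
    return covariance(z, y) / covariance(z, x)

rng = random.Random(0)   # fixed seed so the sketch is reproducible
TRUE_EFFECT = -0.6       # hypothetical multiplier used to simulate the data

z, d, y = [], [], []
for _ in range(20000):
    instrument = 1.0 if rng.random() < 0.3 else 0.0   # exogenous 0-1 event
    confounder = rng.gauss(0.0, 1.0)                  # unobserved, moves policy and output
    policy = instrument + 0.8 * confounder + rng.gauss(0.0, 1.0)
    outcome = TRUE_EFFECT * policy + confounder + rng.gauss(0.0, 1.0)
    z.append(instrument)
    d.append(policy)
    y.append(outcome)

ols_estimate = ols_slope(d, y)    # contaminated by the confounder
iv_estimate = iv_slope(z, d, y)   # uses only instrument-driven variation in policy
print(f"OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}  truth: {TRUE_EFFECT:.2f}")
```

In this simulated setting the OLS slope is pulled towards zero by the confounder, while the IV ratio stays close to the assumed effect. That holds only because the simulated instrument is exogenous by construction, which is exactly the property that has to be argued for the narrative variable.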
The results are reported in Table 3. Strikingly, the message here completely overturns the findings in Table 1. This is of course a well-known problem, consistent with the pronounced divergence between the AA and GLP results. Fiscal consolidation is unambiguously contractionary. Using the sum of coefficients reported in column (6) of Table 3, for every 1% of GDP in fiscal consolidation, the path of real GDP is pushed down by roughly 0.57% each year on average over the five subsequent years (about 0.59% with the binary instrument and 0.55% with the continuous one). This result is not sensitive to whether we use the binary or continuous instrument.

The previous Section broke down the analysis as a function of whether the economy is in a boom or a slump. For completeness, and as a check that the IV results in Table 3 are robust, we reproduced much of the analysis in Table 2 using instrumental variables based on the binary version of the IMF narrative variable. These results are reported in Table 4. Almost identical results (not shown) arise when the continuous IV is used, so the precise choice of IV makes very little difference to the overall message.

The IV-based responses suggest that austerity is contractionary, since the only statistically significant coefficients here have a negative sign. However, stratification by the state of the cycle shows that this result is now driven by what happens in slumps. It is only in the slump bin that we find a significant negative response of real GDP to fiscal tightening. In the slump bin of Table 4 we find a coefficient or 'multiplier' of between −0.25 and −0.95 in years 1 to 5. Over five years the sum of these effects is −3.35**, so the average effect of a 1% of GDP fiscal consolidation is to depress the output level by about 0.67% per year over this horizon.

4. Endogenous Austerity: Is the Narrative Instrument Valid?

So far we have briefly replicated the current state of the literature but this is not entirely pointless.
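The per-year figures quoted above follow from dividing the five-year sums in column (6) of Tables 3 and 4 by the horizon. A minimal sketch of that arithmetic, with the sums copied from the tables and the dictionary labels and function name chosen here purely for illustration:

```python
# Five-year sums of LP coefficients, copied from column (6) of Tables 3 and 4.
five_year_sums = {
    "full sample, binary IV": -2.94,
    "full sample, continuous IV": -2.77,
    "boom, binary IV": -1.36,
    "slump, binary IV": -3.35,
}

HORIZON_YEARS = 5

def average_annual_effect(total, years=HORIZON_YEARS):
    """Average yearly deviation of log real GDP (x 100) per 1% of GDP consolidation."""
    return total / years

averages = {label: average_annual_effect(total)
            for label, total in five_year_sums.items()}

for label, value in averages.items():
    print(f"{label}: {value:.2f}% per year")
```

These are simple averages of the reported point estimates; they carry the same sampling uncertainty as the sums they are computed from.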
It serves to show that the LP framework can capture different sides of the debate in a uniform empirical design, on a consistent data sample, allowing us to focus on how differences in estimation and identification assumptions lead to different results. It also shows how the LP estimation method makes it very easy to allow for non-linearity and to stratify results; here we found significant variations in responses across bins designed to capture variations in the state of the economy from boom to slump. We found that fiscal impacts indeed vary considerably across these states in a manner that is intuitive and not unexpected: the output response to fiscal austerity is less favourable the weaker is the economy.

Does this mean that Keynes was right? Before drawing any conclusions we evaluate whether the IMF narrative variable might be a legitimate instrument. Have we identified the causal effect of fiscal consolidations on output? We cannot formally test the validity of the IMF narrative instrument since the LPs are just-identified. However, if the IMF's narrative variable can be predicted by excluded controls and those controls are correlated with the outcome, at a minimum the excluded controls should be added to the regression. At worst, predictability points to having failed to resolve the allocation bias in our estimates: episodes of consolidation identified by the IMF might be simply an endogenous response by the fiscal authority. This possible shortcoming of the 'narrative identification' strategy has been noted before in the context of monetary policy (Leeper, 1997) and we have the same concern here. To address this issue we report three diagnostic tests in this Section in Tables 5, 6 and 7.

Table 5. Checking for Balance in Treatment and Control Sub-populations

| | Difference (Treated minus Control) |
|---|---|
| Public debt to GDP ratio | 0.13* (0.03) |
| Deviation of log output from trend | −0.72* (0.20) |
| Output growth rate | −0.63* (0.18) |
| Treatment (lagged) | 0.56* (0.04) |
| Observations | 491 |

Notes. Standard errors in parentheses. ***/**/* indicate p < 0.01/0.05/0.10.

Table 6. Omitted Variables Explain Output Fluctuations

| Model | OLS | IV (binary) | IV (continuous) |
|---|---|---|---|
| Real GDP growth | 0.00 | 0.00 | 0.00 |
| Real private loan growth | 0.24 | 0.56 | 0.54 |
| CPI Inflation | 0.00 | 0.00 | 0.00 |
| Change in investment to GDP ratio | 0.01 | 0.00 | 0.00 |
| Short-term interest rate | 0.00 | 0.00 | 0.00 |
| Long-term interest rate | 0.00 | 0.01 | 0.02 |
| Current account to GDP ratio | 0.00 | 0.00 | 0.00 |

Notes. See text. Entries are the p-value of a test of the null hypothesis that the given variable and its lag are irrelevant in determining output given the fiscal treatment. The test is applied to three models. 'OLS' refers to the LP responses calculated in 2; 'IV' refers to the LP responses calculated using the binary instrument in 4; and 'IV-Total' refers to the LP responses calculated using the continuous instrument.

Table 7. Fiscal Treatment Regression, Pooled Probit Estimators (average marginal effects)

Probit model of treatment at time t + 1 (fiscal consolidation event), models (1) to (4)

| Regressor | Reported estimates (standard errors) |
|---|---|
| Public debt/GDP (t) | 0.33*** (0.073), 0.28*** (0.073), 0.12* (0.064), 0.11* (0.064) |
| Cyclical component of log y (t) ($y^C$) | −0.026** (0.011), −0.012 (0.009) |
| Growth rate of output (t) | −0.030** (0.012), −0.024** (0.010) |
| Treatment (t), models (3) and (4) | 0.41*** (0.020), 0.41*** (0.019) |
| Observations | 457, 457, 457, 457 |
| Classification test: AUC | 0.61 (0.03), 0.66 (0.03), 0.81 (0.02), 0.82 (0.02) |

Notes. Standard errors in parentheses. ***/**/* indicate p < 0.01/0.05/0.10. $y^C$ is the cyclical component of log y (log real GDP), from HP filter with λ = 100. AUC is the area under CCF curve. AUC ∈ [0.5, 1]; $H_0$: AUC = 0.5. See text.
In the ideal randomised controlled trial, with treatment and control units allocated randomly, the probability density function of each of the controls in X would be the same for each subpopulation: there would be perfect overlap between the two subpopulation densities. For example, the distribution of debt to GDP ratios would be similar in the subpopulation of narrative IMF fiscal consolidations and the subpopulation of all other observations. A simple way to check for this balance condition, as it is often referred to in the literature, is to test the equality of the means across subpopulations. Notice that the balance condition also lies behind the implicit assumption that one can estimate the LP by restricting the coefficient of the controls to be the same for the treatment and control groups, an observation that we discuss in detail in Section 5.

The balance condition is evaluated in Table 5 for several potentially important macroeconomic control variables included in expression (1). The null hypothesis of balance is rejected for all of them, strongly suggesting that the IMF narrative dates are not truly exogenous events.

We go beyond this simple check and perform two additional tests. First, we check if the outcome is predictable by a set of available controls not yet included in the analysis. To be clear, the original AA and GLP papers do include in their analysis a robustness check that includes other controls. However, the controls they consider are typically related fiscal variables rather than the set of macroeconomic controls we consider here.
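As a brief aside on the balance check reported in Table 5, the test of equal means can be sketched as a difference in subpopulation means with an unequal-variance (Welch) standard error. The numbers below are invented debt-to-GDP ratios, not the sample behind Table 5, and the choice of a Welch standard error is an assumption of this sketch rather than a description of how the table was computed.

```python
import math
from statistics import mean, variance

def balance_check(treated, control):
    """Difference in means (treated minus control), Welch standard error, t-ratio."""
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return diff, se, diff / se

# Hypothetical debt-to-GDP ratios for consolidation and non-consolidation years.
treated_debt = [0.72, 0.65, 0.80, 0.91, 0.68, 0.77, 0.85, 0.70]
control_debt = [0.55, 0.60, 0.48, 0.66, 0.52, 0.58, 0.63, 0.50, 0.57, 0.61]

diff, se, t_ratio = balance_check(treated_debt, control_debt)
print(f"difference = {diff:.2f}, s.e. = {se:.3f}, t = {t_ratio:.1f}")
```

A large t-ratio, as in this made-up example, is the pattern that signals imbalance: the covariate is distributed differently across treated and control observations.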
In Table 6 we report the results of such tests by re-examining whether our candidate model in expression (1) admits as additional explanation the following variables: real GDP growth; real private loan growth; CPI inflation; the change in the investment to GDP ratio; the short-term interest rate on government securities (usually three months in maturity); the long-term rate on government securities (usually 5–10 year bonds); and the current account to GDP ratio. The first three variables are expressed as 100 times the log difference. In all cases, we consider the value of the variable and one lag.

The tests are conducted with the one-period ahead local projection (the equivalent of the corresponding equation in a VAR) using the full sample according to expression (1). The objective is to set a higher bar for the possibly omitted regressors to be significant. Partitioning the sample into the growth bins we used earlier could generate spurious findings since the tests would rely on a smaller sample. Table 6 reports the p-value associated with the joint null that the candidate variable and its lag are not significant. A rejection means that output fluctuations could be due to reasons other than the fiscal treatment variable. The message is clear: most of the excluded controls are highly significant. For now, a cautious interpretation is to view these findings as a source of concern rather than conclusive evidence that the multipliers reported earlier are incorrect.

Next we check for another condition: do excluded controls predict fiscal consolidations? Table 7 asks whether variation in the IMF binary treatment variable identified by GLP can be predicted. The results indicate that we have a reasonable basis for this concern. This is a set of estimated treatment equations, where we use a pooled probit estimator to predict the IMF fiscal consolidation variable in year 1, presumptively announced at year 0, based on state variables at time 0.
As shown in Appendix A.1, our later results are robust to alternative binary classification models such as pooled logit, and fixed-effects probit and logit with controls for global time-varying trends.

Table 7 shows in column (1) that treatment is more likely, as expected, when public debt to GDP is high: the coefficient is positive, meaning that governments tend to pursue austerity when debt has run up. In column (2) we add $y^C$ (the output gap) and the growth rate of y to further condition on the state of the economy: when the economy is growing below potential, there is an increase in the likelihood of consolidation. Moreover, austerity is more likely to be pursued when growth slows, in stark contrast to what common-sense textbook countercyclical policy suggests. But this finding is in line with contemporary experience in Europe and the UK, although all of the sample data we use here are pre-crisis. Thus, the act of engaging in pro-cyclical fiscal policy is not a new-fangled craze but more of a chronic tendency in advanced countries.

Finally, columns (3) and (4) add the lag of the dependent variable Treatment and this has a highly significant coefficient: as we know from the raw data series generated by the IMF study, the fiscal consolidation episodes are typically long, drawn-out affairs, so once such a programme is started it tends to run for several years. Being in treatment today is thus a good predictor of being in treatment tomorrow. In these last two columns the lagged growth rate rather than the cyclical level of output emerges as the slightly better predictor of treatment.

Further confirmation of the predictive ability of these treatment regressions is provided by the AUC statistic.7 The AUC is commonly used in biostatistics and machine learning to evaluate classification ability (Jordà and Taylor, 2011). Under the null that the covariates have no classification ability, AUC = 0.5. Perfect classification ability corresponds to AUC = 1.
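The two benchmark values just mentioned can be checked with a small sketch. It relies on the standard rank-based characterisation of the AUC, namely the probability that a randomly chosen treated observation receives a higher predicted score than a randomly chosen control observation, with ties counted as one half. Treating this as equivalent to the area statistic reported in Table 7 is an assumption of the sketch, and the scores and labels are made up.

```python
def auc(scores, labels):
    """Share of (treated, control) pairs ranked correctly; ties count one half."""
    treated = [s for s, d in zip(scores, labels) if d == 1]
    control = [s for s, d in zip(scores, labels) if d == 0]
    wins = 0.0
    for s1 in treated:
        for s0 in control:
            if s1 > s0:
                wins += 1.0
            elif s1 == s0:
                wins += 0.5
    return wins / (len(treated) * len(control))

# Hypothetical treatment indicators and predicted treatment probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
informative = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]
uninformative = [0.5] * len(labels)

auc_informative = auc(informative, labels)      # most pairs ranked correctly
auc_uninformative = auc(uninformative, labels)  # no classification ability
auc_perfect = auc(labels, labels)               # scores separate the groups exactly
print(auc_informative, auc_uninformative, auc_perfect)
```

The pairwise loop is quadratic in the sample size, which is harmless for a panel of a few hundred country-year observations.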
The AUC has an approximate Gaussian distribution in large samples. Table 7 measures the classification ability of each specification. The AUC statistics show that the probits have very good predictive ability, with AUC around 0.65 when lagged treatment is omitted (column (2)), and over 0.8 when lagged treatment is included (columns (3) and (4)). The AUCs are all significantly different from 0.5. The key lesson from Table 7 is simply that the IMF variable has a significant forecastable component.8 The question, then, is how to deal with the problem of potentially endogenous instruments. The remainder of this article provides one answer.

5. Statistical Design

Using three different checks, the previous Section raises concerns that the narrative IMF variable could be an invalid instrument. The empirical strategy that we propose is based on taking triple insurance against this potential endogeneity. First, we take the episodes of consolidation from the IMF narrative variable as the subset of all consolidation episodes that are a candidate for random allocation. Think of it as a pseudo-IV step. Second, we include the extended set of covariates from Tables 6 and 7 and add them as right-hand side variables in the LP of expression (1). Third, we use inverse propensity score weighting on this LP to re-randomise allocation of the IMF fiscal consolidation events.

In order to facilitate the exposition we momentarily drop the cross-sectional country index in the panel. Denote, as before, $y_t$ the outcome variable of interest, the log of real GDP. In other applications $y_t$ could be a $k_y$-dimensional vector. Let $D_t$ denote the fiscal policy variable. $D_t$ will now be a discrete random variable $D_t \in \{0,1\}$ based on the IMF narrative indicator of exogenous fiscal consolidations, although earlier it was the continuous d.CAPB variable. The methods that we present next can be extended to settings in which the policy variable takes on a small number of discrete values.
Next we allow for a $k_w$-dimensional vector of variables, $w_t$, that are not included in the vector $y_t$, but which could be relevant predictors of the policy variable $D_t$. Finally, denote $X_t$ the rich conditioning set given by $\Delta y_{t-1}, \Delta y_{t-2}, \ldots$; $D_{t-1}, D_{t-2}, \ldots$; and $w_t$. We assume that policy is determined by $D_t = D(X_t, \psi, \varepsilon_t)$ where $\psi$ refers to the parameters of the implied policy function and $\varepsilon_t$ is an idiosyncratic source of random variation. Therefore, $D(X_t, \psi, \cdot)$ refers to the systematic component of policy determination.

To make further progress at this point, we will borrow from Definition 1 in Angrist et al. (2013, henceforth AJK). This defines potential outcomes given by $y_{t,h}^{\psi}(d) - y_t$ as the value that the observed outcome variable $y_{t+h} - y_t$ would have taken if $D_t = d$ for all $\psi \in \Psi$ and $d \in \mathcal{D}$. In our application, the difference $y_{t+h} - y_t$ refers to the cumulative change in the outcome from $t$ to $t+h$ and $d = 0, 1$. The horizon $h$ can be any positive integer.

The causal effect of a policy intervention is defined as the unobservable random variable given by the difference $[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t]$. Notice that $y_t$ is only used to benchmark the cumulative change and it is observed at time $t$. We assume that the parameters of the policy function do not change. Following AJK, we can state the selection-on-observables assumption (or the conditional ignorability or conditional independence assumption as it is sometimes called) as:

$$[y_{t,h}^{\psi}(d) - y_t] \perp D_t \mid X_t; \psi \quad \text{for all } h > 0, \text{ and for } d \in \mathcal{D} \text{ and } \psi \in \Psi. \tag{2}$$

That is, the treatment-control allocation is independent of potential outcomes, given the variables or controls $X_t$. This condition does not imply that there is no effect of policy on the outcome given controls. We are simply stating that conditional on controls, policy allocation is independent of the potential outcome, whatever that might be. Consider the ideal randomised experiment to understand the role that the conditional independence assumption plays.
The average causal effect of a policy intervention on the outcome at time $t+h$, given by

$$E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t]\},$$

could be simply calculated using group means as:

$$\hat{\Lambda}^h_{GroupMean} = \frac{1}{n_1}\sum_t D_t (y_{t+h} - y_t) - \frac{1}{n_0}\sum_t (1 - D_t)(y_{t+h} - y_t) \quad \text{for all } h > 0, \tag{3}$$

where $n_1 = \sum_t D_t$ and $n_0 = \sum_t (1 - D_t)$ are the number of observations in treatment and control groups, respectively. Alternatively, the ATE, $\Lambda^h$, could be calculated from the auxiliary regression:

$$(y_{t+h} - y_t) = D_t \alpha_1^h + (1 - D_t)\alpha_0^h + v_{t+h} \quad \text{for all } h > 0. \tag{4}$$

The difference in the OLS estimates of the intercepts $\hat{\alpha}_1^h - \hat{\alpha}_0^h = \hat{\Lambda}^h$ in expression (4) is equivalent to that in expression (3). More conveniently, one could estimate the ATE directly from the regression:

$$(y_{t+h} - y_t) = \alpha_0^h + D_t \Lambda^h + v_{t+h} \quad \text{for all } h > 0. \tag{5}$$

Even when data are randomly allocated across the treatment and control subpopulations, it would be natural to condition on the $X_t$ to adjust for small-sample differences in subpopulation characteristics and therefore to gain in efficiency. The estimator is consistent for the ATE whether or not regressors are included. Notice that the model for the outcomes is unspecified. The estimate of the ATE does not depend on specific assumptions about this model if the conditional ignorability assumption is met.

Allocation to treatment and control groups is not usually random in observational data. To appreciate the role of the selection-on-observables assumption in (2), consider elaborating on the example. First, by the law of iterated expectations, we can write:

$$E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t]\} = E[E(y_{t+h} - y_t \mid D_t = 1; X_t) - E(y_{t+h} - y_t \mid D_t = 0; X_t)] = \Lambda^h \quad \text{for all } h > 0. \tag{6}$$

Assume that a linear regression control strategy suffices to do the appropriate conditioning for the $X_t$ and hence obtain a consistent estimate of $E[y_{t+h} - y_t \mid D_t, X_t]$. This is a big assumption that we relax later on in the article. Note this is the assumption of studies based on VARs where identification does not rely on external information.
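As a quick numerical check on the randomised-experiment benchmark, the sketch below confirms that the group-means estimator in expression (3) and the OLS coefficient on the treatment dummy in expression (5) coincide. The treatment indicators and cumulative output changes are made-up numbers, and the function names are illustrative only.

```python
from statistics import mean

def group_mean_ate(d, dy):
    """Expression (3): mean outcome change of treated minus that of controls."""
    treated = [v for t, v in zip(d, dy) if t == 1]
    control = [v for t, v in zip(d, dy) if t == 0]
    return mean(treated) - mean(control)

def ols_dummy_slope(d, dy):
    """Expression (5): OLS slope from regressing the change on a constant and D."""
    d_bar, y_bar = mean(d), mean(dy)
    numerator = sum((t - d_bar) * (v - y_bar) for t, v in zip(d, dy))
    denominator = sum((t - d_bar) ** 2 for t in d)
    return numerator / denominator

# Hypothetical treatment indicators and cumulative changes y_{t+h} - y_t.
d = [1, 0, 0, 1, 0, 1, 0, 0]
dy = [-1.2, 0.8, 1.1, -0.4, 0.5, -0.9, 1.4, 0.2]

ate_group_means = group_mean_ate(d, dy)
ate_regression = ols_dummy_slope(d, dy)
print(ate_group_means, ate_regression)
```

The equality is algebraic, so it holds for any sample containing both treated and control observations; whether the common value estimates a causal effect depends entirely on how treatment was allocated.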
Then the average causal effect of a policy intervention on the outcome variable at time $t+h$ in the maintained example can be calculated by expanding expression (4) with

$$y_{t+h} - y_t = D_t \alpha_1^h + (1 - D_t)\alpha_0^h + D_t X_t \beta_1^h + (1 - D_t) X_t \beta_0^h + v_{t+h} \quad \text{for all } h > 0. \tag{7}$$

If one imposes the constraint $\beta_1^h = \beta_0^h$, then expression (7) is nothing more than a standard LP of expression (1) and $\Lambda^h = \alpha_1^h - \alpha_0^h$ is the policy response at horizon $h$. The standard linear LP is a direct estimate of the typical impulse response derived from a traditional VAR, as Jordà (2005) shows.

This naïve constrained specification, which characterises responses derived from a VAR, imposes two implicit assumptions. First, the effect of the controls $X_t$ on the outcomes is assumed to be stable across the treated and control subpopulations. Second, the expected value of $X_t$ in each subpopulation is assumed to be the same. The first assumption is potentially defensible. The economic mechanism describing the transmission of interest rates on real GDP could be the same whether or not there is a fiscal consolidation, for example. The second assumption is more difficult to defend. It is unlikely that, say, government debt levels are the same in the treated and control groups. Fiscal consolidations are often driven by high levels of debt.

This is a good place to make a connection with structural VAR identification. When $h = 1$, the LP is equivalent to the corresponding equation in a VAR. A specification that includes all contemporaneous variables as controls (in addition to their lags) could be seen as equivalent to imposing the Choleski ordering in which the policy variable is ordered last. However, unlike a VAR, there is no need to impose exclusion restrictions on the remaining variables in the system if the focus shifts to a different response/intervention pair.
Practically speaking it would be advisable not to impose such constraints but rather include all available observable controls again and let the data choose which variables are appropriate conditioning information. The larger principle is to ensure that fluctuations in the shocked variable cannot be explained by any observable information. In that respect, it is perhaps useful to remember that the true square-root of the reduced-form residual covariance matrix need not be upper-triangular or even have any zero entries for that matter.

When instruments are available one can further achieve identification using instrumental variable methods as in Stock and Watson (2012) and Mertens and Ravn (2013, 2014). We have shown above how IV methods can be used with the LP approach in a more natural way. However, it is important to recall several features required to resolve the identification puzzle. These are: (i) the instrument is relevant, which appears to be the case as we discussed earlier; (ii) the instrument is valid, which is untestable given just-identification and for which the analysis of the previous Section raises concerns; and (iii) predetermined and exogenous controls are not omitted from the specification. This latter requirement is not resolved by the use of the instrument, especially when there is substantial evidence that the controls are predictive for the instrument, as shown here.

Taking up our earlier discussion once more, using expressions (6) and (7), notice that:

$$E\big(E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t] \mid X_t\}\big) = E\big(E\{[D_t(\hat{\alpha}_1^h + X_t \hat{\beta}_1^h)] - [(1 - D_t)(\hat{\alpha}_0^h + X_t \hat{\beta}_0^h)] \mid X_t\}\big) = E(\hat{\alpha}_1^h - \hat{\alpha}_0^h) = E(\hat{\Lambda}^h) = \Lambda^h,$$

under the maintained assumptions of the example that $E(X_t \mid D_t = 1) = E(X_t \mid D_t = 0)$ and $\beta_1^h = \beta_0^h$, and noticing that $E(D_t \mid D_t = 1) = E[(1 - D_t) \mid D_t = 0] = 1$. More generally, if we do not impose the implicit assumptions of the naïve LP specification, the analogous representation to the group means expression (3) is:

$$\hat{\Lambda}^h_{RA} = \frac{1}{n_1}\sum_t D_t [m_1^h(X_t, \hat{\theta}_1^h)] - \frac{1}{n_0}\sum_t (1 - D_t)[m_0^h(X_t, \hat{\theta}_0^h)] \quad \text{for all } h > 0, \tag{8}$$

where $m_j^h(\cdot)$ is a generic specification of the conditional mean of $(y_{t+h} - y_t)$ in each subpopulation $j = 1, 0$ and $\theta_j^h = (\alpha_j^h \ \beta_j^h)'$ for the regression example in (7). The $n_1$ and $n_0$ have been defined earlier. Note that this more general form of regression adjustment allows the conditional means to be different for the treated and control subpopulations and allows their effect on the outcome to differ as well.

5.1. Re-randomisation Through the Propensity Score

Recall that the critical assumption is the conditional ignorability or selection-on-observables condition (2). Rosenbaum and Rubin (1983) show that:

$$X_t \perp D_t \mid p(D_t = 1 \mid X_t, \psi),$$

that is, the propensity score $p(D_t = 1 \mid X_t, \psi)$ is all that is needed to capture the effect of the $X_t$ in the selection-on-observables condition.9 This result provides further support for the IPW estimator. Recall the ATE is, by definition:

$$\Lambda^h = E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t]\} = E\big(E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t] \mid X_t\}\big), \tag{9}$$

using the law of iterated expectations. Looking inside the expectations in the final term above, the average policy response conditional on $X_t$, in terms of observable data, is:

$$E\{[y_{t,h}(1) - y_t] - [y_{t,h}(0) - y_t] \mid X_t\} = E[y_{t,h} - y_t \mid D_t = 1; X_t] - E[y_{t,h} - y_t \mid D_t = 0; X_t] \quad \text{for all } h > 0, \tag{10}$$

where it is assumed that the policy environment characterised by $\psi \in \Psi$ remains constant.

Estimation of these conditional expectations can be simplified considerably when a model for the policy variable $D_t$ is available. Angrist and Kuersteiner (2004, 2011) refer to the predicted value from such a policy model as the policy propensity score. The policy propensity score is meant to ensure the estimation of the policy response (the ATE in the microeconomics parlance) is consistent under the main assumption. In addition, it acts as a dimension-reduction device. Ideally, any predictor of policy should be included, regardless of whether that predictor is a fundamental variable in a macroeconomic model. The probit results reported in Table 7 can be seen as candidate estimates of this policy propensity score.
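To make the role of the policy propensity score concrete, the following sketch simulates a setting in which treatment is more likely when a single binary covariate (a hypothetical "high debt" indicator) is switched on, and that covariate also lowers the outcome. The propensity score is estimated by within-cell treatment frequencies, a deliberately simple stand-in for the probit models discussed above, and the treated and control observations are reweighted by the inverse of their estimated treatment probabilities, with weights normalised to sum to one within each group. All numbers, names and the data-generating process are assumptions made for illustration.

```python
import random
from statistics import mean

def estimate_propensity(x, d):
    """Estimated P(D=1 | X): share of treated observations within each covariate cell."""
    counts, treated = {}, {}
    for xi, di in zip(x, d):
        counts[xi] = counts.get(xi, 0) + 1
        treated[xi] = treated.get(xi, 0) + di
    return [treated[xi] / counts[xi] for xi in x]

def ipw_ate(d, dy, p_hat):
    """Inverse propensity score weighted difference with normalised weights."""
    w1 = [di / pi for di, pi in zip(d, p_hat)]
    w0 = [(1 - di) / (1 - pi) for di, pi in zip(d, p_hat)]
    treated_mean = sum(w * v for w, v in zip(w1, dy)) / sum(w1)
    control_mean = sum(w * v for w, v in zip(w0, dy)) / sum(w0)
    return treated_mean - control_mean

def group_mean_difference(d, dy):
    """Unweighted difference in means between treated and control observations."""
    return (mean(v for t, v in zip(d, dy) if t == 1)
            - mean(v for t, v in zip(d, dy) if t == 0))

rng = random.Random(1)   # fixed seed for reproducibility
TRUE_ATE = -1.0          # hypothetical treatment effect used in the simulation

x, d, dy = [], [], []
for _ in range(20000):
    high_debt = 1 if rng.random() < 0.5 else 0
    p_true = 0.7 if high_debt else 0.2          # treatment depends on the covariate
    treated = 1 if rng.random() < p_true else 0
    change = TRUE_ATE * treated - 2.0 * high_debt + rng.gauss(0.0, 1.0)
    x.append(high_debt)
    d.append(treated)
    dy.append(change)

p_hat = estimate_propensity(x, d)
naive_estimate = group_mean_difference(d, dy)   # suffers from allocation bias
ipw_estimate = ipw_ate(d, dy, p_hat)            # reweighted towards random allocation
print(f"naive: {naive_estimate:.2f}  IPW: {ipw_estimate:.2f}  truth: {TRUE_ATE:.2f}")
```

In this simulation the unweighted comparison overstates the contraction because treated observations are disproportionately high-debt, while the reweighted comparison is close to the assumed effect. The result relies on the covariate that drives selection being observed and on every cell containing both treated and control observations.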
We instead construct the policy propensity score using a richer specification that includes all the controls used in Table 6 as well. Denote the policy propensity score $P(D_t=j\mid X_t)=p_j(X_t,\psi)$ for $j = 1,0$. Clearly $p_1(X_t,\psi)=1-p_0(X_t,\psi)$. Using the selection‐on‐observables condition in expression (2) shown earlier, then:

$$E[(y_{t,h}-y_t)\,1\{D_t=j\}\mid X_t]=E\{[y_{t,h}(j)-y_t]\mid X_t\}\,p_j(X_t,\psi)\quad\text{for } j=1,0.\tag{11}$$

Solving for $E\{[y_{t,h}(j)-y_t]\mid X_t\}$ and taking unconditional expectations, by integrating over $X_t$, the ATE in (9) can be calculated as:

$$\Lambda^h=E\{[y_{t,h}(1)-y_t]-[y_{t,h}(0)-y_t]\}=E\left[(y_{t,h}-y_t)\left(\frac{1\{D_t=1\}}{p_1(X_t,\psi)}-\frac{1\{D_t=0\}}{p_0(X_t,\psi)}\right)\right]\quad\text{for all } h>0.\tag{12}$$

Under standard regularity conditions (detailed in AJK) an estimate of expression (12) can be obtained using sample moments which generalise the sample moments presented earlier in expression (3) for the OLS case. Suppose that the first‐stage treatment model takes the form of a probability of treatment at time $t$ given by the estimated model $\hat p_t=p_1(X_t,\hat\psi)$, where $\hat\psi$ is the estimated parameter vector, and $1-\hat p_t=p_0(X_t,\hat\psi)$. The inverse propensity score weighted (IPW) ‘ratio estimator’ of the ATE is:

$$\hat\Lambda_{IPW}^h=\frac{1}{n}\sum_t\frac{D_t(y_{t+h}-y_t)}{\hat p_t}-\frac{1}{n}\sum_t\frac{(1-D_t)(y_{t+h}-y_t)}{1-\hat p_t}.\tag{13}$$

Some improvements can be made to this expression. Imbens (2004) and Lunceford and Davidian (2004) suggest renormalising the weights so that they sum up to one in small samples. Hence expression (13) becomes:

$$\hat\Lambda_{IPW}^h=\frac{1}{n_1^*}\sum_t\frac{D_t(y_{t+h}-y_t)}{\hat p_t}-\frac{1}{n_0^*}\sum_t\frac{(1-D_t)(y_{t+h}-y_t)}{1-\hat p_t},\tag{14}$$

where

$$n_1^*=\sum_t\frac{D_t}{\hat p_t},\qquad n_0^*=\sum_t\frac{(1-D_t)}{(1-\hat p_t)},\tag{15}$$

and the notation $n_j^*$ parallels the notation $n_j$ for $j = 1,0$ in (3). Note that $E(D_t/p_t)=E[E(D_t\mid X_t)/p_t]=1$; similarly $E[(1-D_t)/(1-p_t)]=E\{E[(1-D_t)\mid X_t]/(1-p_t)\}=1$; and hence it follows that in large samples expressions (13) and (14) apply the same weighting, since $E(n_1^*)=E(n_0^*)=n$. These expressions are natural analogs of the Group Mean estimator in (3), with inverse propensity‐score weighting to correct for allocation bias and to achieve a quasi‐random distribution of treatment and control observations via reweighting.

5.2. Regression Adjustment (IPWRA) and Augmented IPW (AIPW)

As a way to enhance robustness, researchers have derived estimators with a regression adjustment component added to the standard IPW estimator presented above. This estimator parallels that in expression (8) but using IPW. To further enhance efficiency, the augmented IPW or AIPW estimator combines the IPW and IPWRA estimators in a manner to be discussed shortly.

It is natural to consider extending the estimator in expression (8) using the propensity score. Formally, the basis for such an estimator would be to transition from expression (11) to (12) in the following manner:

$$\Lambda^h=E\left[(y_{t+h}-y_t\mid X_t)\left(\frac{1\{D_t=1\}}{p_1(X_t,\psi)}-\frac{1\{D_t=0\}}{p_0(X_t,\psi)}\right)\right]\quad\text{for all } h>0,\tag{16}$$

which can be implemented by first projecting the outcome variable on the set of control variables (Robins and Rotnitzky, 1995; Robins et al., 1995; Wooldridge, 2007). The inverse propensity score weighted estimator with regression adjustment (IPWRA) is then given by:

$$\hat\Lambda_{IPWRA}^h=\frac{1}{n_1^*}\sum_t\frac{D_t\,m_1^h(X_t,\hat\theta_1^h)}{\hat p_t}-\frac{1}{n_0^*}\sum_t\frac{(1-D_t)\,m_0^h(X_t,\hat\theta_0^h)}{1-\hat p_t},\tag{17}$$

where again $m_j^h(X_t,\hat\theta_j^h)$ for $j = 1,0$ is the conditional mean from the first‐step regression of $(y_{t+h}-y_t)$ on $X_t$ as in expression (8) in Section 5. The $n_j^*$ for $j = 1,0$ are the same as in expression (15). It is clear that (17) nests all the previous estimators, the Group Mean (3), the RA (8) and the IPW (14) as special cases.

The estimator in expression (17) falls into the class of doubly robust estimators (Imbens, 2004; Lunceford and Davidian, 2004; Wooldridge, 2007; Kreif et al., 2013). The intuition behind the estimator is to use the regression model as a way to ‘predict’ the unobserved potential outcomes. Consistency of the estimated ATE only requires either the conditional mean model or the propensity score model to be correctly specified. However, although (17) is one of a large class of unbiased IPWRA estimators of ATE, it is not the most efficient in this class. Starting with Robins et al.
(1994) and, more recently, Lunceford and Davidian (2004), the estimator within the doubly‐robust class having the smallest asymptotic variance is the (locally) semi‐parametric efficient estimator:

$$\hat\Lambda_{AIPW}^h=\frac{1}{n}\sum_t\left\{\left[\frac{D_t(y_{t+h}-y_t)}{\hat p_t}-\frac{(1-D_t)(y_{t+h}-y_t)}{(1-\hat p_t)}\right]-\frac{(D_t-\hat p_t)}{\hat p_t(1-\hat p_t)}\left[(1-\hat p_t)\,m_1^h(X_t,\hat\theta_1^h)+\hat p_t\,m_0^h(X_t,\hat\theta_0^h)\right]\right\}.\tag{18}$$

Thus, the estimator in (18) can be seen as the basic IPW estimator plus an adjustment consisting of the weighted average of the two regression estimators. The adjustment term has expectation zero when the estimated propensity scores and regression models are replaced by their population counterparts. Moreover, the adjustment term stabilises the estimator when the propensity scores get close to zero or one (Glynn and Quinn, 2010) and this alleviates the need to truncate the propensity score weights as suggested in Imbens (2004). Another way to interpret the AIPW estimator is to realise that:

$$\hat\Lambda_{AIPW}^h=\hat\Lambda_{IPW}^h+(\hat\Lambda_{RA}^h-\hat\Lambda_{IPWRA}^h).\tag{19}$$

Readers familiar with the bootstrap will notice the similarities between the bootstrap bias correction formula and (19).

The AIPW has a number of attractive theoretical properties. Using the theory of M‐estimation, Lunceford and Davidian (2004) show that the estimator is asymptotically normally distributed. In addition, they show that the variance can be calculated using the empirical sandwich estimator $V(\hat\Lambda_{AIPW}^h)=\frac{1}{n^2}\sum_t(\hat I_t^h)^2$, where:

$$\hat I_t^h=\left[\frac{D_t(y_{t+h}-y_t)}{\hat p_t}-\frac{(1-D_t)(y_{t+h}-y_t)}{(1-\hat p_t)}\right]-\frac{(D_t-\hat p_t)}{\hat p_t(1-\hat p_t)}\left[(1-\hat p_t)\,m_1^h(X_t,\hat\theta_1^h)+\hat p_t\,m_0^h(X_t,\hat\theta_0^h)\right]-\hat\Lambda_{AIPW}^h.\tag{20}$$

Later we allow for the possibility that the $\hat I_t^h$ are not a martingale difference sequence and calculate standard errors using cluster robust methods. When the propensity score and the regression function are modelled correctly, the AIPW achieves the semi‐parametric efficiency bound. Alternatively, Imbens (2004) shows that standard errors for $\hat\Lambda_{AIPW}^h$ can be calculated with the bootstrap.

5.3. Intuition

Although these techniques are relatively new to macroeconomics, matching estimators using inverse propensity score weighting have been frequently implemented in applied microeconomics with cross‐sectional data. Matching methods more generally constitute a benchmark within the medical research literature when trials are suspected of being contaminated by allocation bias. The provenance of the particular inverse propensity score weighting method we employ is thus well established.

Figure 1 provides intuition for the methods that we just described and exemplifies the perils of allocation bias with a simple bivariate manufactured example based on one observable confounder. Panel (a) displays the hypothetical frequencies of observing the control variable x in the treatment and control subpopulations using circles of varying diameter. Think of it as a display of the raw data. In the control subpopulation, we are more likely to observe low values of x. This is indicated by the bigger circles in dark grey that are located near the vertical axis. The opposite is the case for the treated subpopulation, indicated with the light grey circles that grow the further they are from the vertical axis. The example is set up so that the true ATE = 1.

Fig. 1. An Example of Allocation Bias and the IPWRA Estimator
Notes. See text. True ATE = 1. Panel (a) displays the raw hypothetical data using circles of increasing diameter to denote where the data are more frequently observed. Treated units are in light grey, control units in dark grey. The Group Means estimate of the ATE = 3.67. Panel (b) is the same as panel (a) and adds regression lines to each subpopulation. The RA ATE = 1. Panel (c) displays the data in panel (a) once it has been inversely weighted by the frequency with which they are observed. The IPW ATE = 1. Panel (d) adds regression lines to panel (c) to show the IPWRA estimate of the ATE = 1.
The naïve Group Means estimator based on (3) consists of the difference in means between the two subpopulations, delivering an estimate of the ATE = 7.33 (treated mean) − 3.67 (control mean) = 3.67 (up to rounding) that is almost four times as large as the true ATE. Panel (b) implements the regression adjustment estimator described in (8). Now the ATE is estimated using the conditional mean implied by the regression estimates of each subpopulation. In this case ATE = 6 (treated regression mean) − 5 (control regression mean) = 1, which is the correct ATE. In the simple example, the effect of the control x is linear so a regression control strategy suffices to obtain the correct ATE.

Suppose that you do not want to make assumptions about the functional form of the regression needed to adjust for the covariate x in the ATE estimator. Panel (c) implements the IPW estimator in (14). Using weights based on the inverse frequency with which the data are observed for each value of x in each subpopulation generates a ‘pseudo‐randomised’ sample from which the simple difference in means estimator delivers the correct answer. In this case ATE = 6 (treatment mean using inverse weights) − 5 (control mean using inverse weights) = 1, again the correct value.
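The logic of panels (a) to (c) can be reproduced in a few lines. The sketch below is illustrative only: the data are hypothetical (they are not the data behind Figure 1), the propensity score is estimated by simple cell frequencies rather than a probit, and regression adjustment is implemented in its textbook form, averaging the fitted values from each subpopulation's regression over the full sample. All function and variable names are assumptions introduced here; the outcome y plays the role of the change in the outcome between t and t + h.

```python
from collections import Counter

# Hypothetical data, NOT the data behind Figure 1: tuples of (x, treated, y).
# Control outcome is 3 + x and treated outcome is 4 + x, so the true ATE is 1.
counts_treated = {1: 1, 2: 2, 3: 3}   # treated units cluster at high x
counts_control = {1: 3, 2: 2, 3: 1}   # control units cluster at low x
data = []
for x, k in counts_treated.items():
    data += [(x, 1, 4.0 + x)] * k
for x, k in counts_control.items():
    data += [(x, 0, 3.0 + x)] * k


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def ols_line(points):
    """Intercept and slope of the least-squares line through (x, y) pairs."""
    xbar = mean(x for x, _ in points)
    ybar = mean(y for _, y in points)
    sxx = sum((x - xbar) ** 2 for x, _ in points)
    sxy = sum((x - xbar) * (y - ybar) for x, y in points)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def group_means(data):
    """Raw difference in subpopulation means, as in the Group Means estimator."""
    treated = mean(y for _, d, y in data if d == 1)
    control = mean(y for _, d, y in data if d == 0)
    return treated - control


def regression_adjustment(data):
    """Fit one line per subpopulation, then average the gap over all units."""
    a1, b1 = ols_line([(x, y) for x, d, y in data if d == 1])
    a0, b0 = ols_line([(x, y) for x, d, y in data if d == 0])
    return mean((a1 + b1 * x) - (a0 + b0 * x) for x, _, _ in data)


def cell_propensity(data):
    """Share of treated units at each value of x (assumed score model)."""
    total = Counter(x for x, _, _ in data)
    treated = Counter(x for x, d, _ in data if d == 1)
    return {x: treated[x] / total[x] for x in total}


def ipw_normalised(data, p):
    """IPW with weights renormalised to sum to one within each group."""
    n1_star = sum(d / p[x] for x, d, _ in data)
    n0_star = sum((1 - d) / (1 - p[x]) for x, d, _ in data)
    treated = sum(d * y / p[x] for x, d, y in data) / n1_star
    control = sum((1 - d) * y / (1 - p[x]) for x, d, y in data) / n0_star
    return treated - control


scores = cell_propensity(data)
naive = group_means(data)
ra = regression_adjustment(data)
ipw = ipw_normalised(data, scores)
```

With these made-up data the group-means difference is about 1.67, while regression adjustment and normalised IPW both return the true value of 1 up to floating-point error.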
In practice, one may be unsure about the correct specification of either the regression or the propensity score describing the appropriate reweighting scheme. Panel (d) combines the two approaches (IPW and regression adjustment) based on expression (17). This estimator is ‘doubly robust’, meaning that it still delivers the correct estimate of the ATE when either the propensity score or the regression (but not both) is incorrectly specified. In the example there is no gain from using this procedure but one can still verify that ATE = 6 (conditional regression mean for treated using inverse weighting) − 5 (conditional regression mean for control using inverse weighting) = 1. Again, the correct ATE.

When policy interventions are mostly driven by the endogenous response to controls, we can think of the observable treatment/control subpopulations as being oversampled from the region of the distribution in which the propensity score attains its highest values. Moments calculated with this raw empirical distribution will therefore be biased: not enough probability mass is given to observations with low propensity scores. Weighting by the inverse of the propensity score shifts weight away from the oversampled toward the undersampled region of the distribution. This shift of probability mass reconstructs the appropriate frequency weights of the underlying true distribution of outcomes under treatment and control so that the means estimated from each subpopulation are no longer biased and their difference is an unbiased estimate of the ATE.

5.4. What We Do

The next Section reports the results of applying the AIPW estimator (18) to measure the ATE of fiscal consolidations as a counterpoint to the conventional OLS and IV results reported earlier.
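For readers who want to see the mechanics, the following is a minimal sketch of the AIPW point estimate in (18) together with the simple (unclustered) variance based on (20). It is not the authors' code: the function name `aipw`, its argument layout and the toy data are assumptions made for illustration, and the fitted propensity scores and conditional means are taken as given inputs rather than estimated.

```python
import math


def aipw(d, y, p, m1, m0):
    """AIPW estimate and standard error from per-observation inputs.

    d  : treatment indicators (0 or 1)
    y  : outcome changes, i.e. y_{t+h} - y_t
    p  : fitted propensity scores, strictly between 0 and 1
    m1 : fitted conditional means under treatment
    m0 : fitted conditional means under control
    """
    n = len(y)
    terms = []
    for dt, yt, pt, m1t, m0t in zip(d, y, p, m1, m0):
        ipw = dt * yt / pt - (1 - dt) * yt / (1 - pt)
        augmentation = (dt - pt) / (pt * (1 - pt)) * ((1 - pt) * m1t + pt * m0t)
        terms.append(ipw - augmentation)
    estimate = sum(terms) / n
    # Sandwich-type variance: average squared deviation of each term from
    # the estimate, divided by n (no clustering in this sketch).
    variance = sum((term - estimate) ** 2 for term in terms) / n ** 2
    return estimate, math.sqrt(variance)


# Hypothetical illustration (not the paper's data): one discrete confounder x,
# control outcome 3 + x, treated outcome 4 + x, so the true ATE is 1.
rows = ([(1, 1)] * 1 + [(2, 1)] * 2 + [(3, 1)] * 3
        + [(1, 0)] * 3 + [(2, 0)] * 2 + [(3, 0)] * 1)
x = [xt for xt, _ in rows]
d = [dt for _, dt in rows]
y = [(4.0 if dt else 3.0) + xt for xt, dt in rows]

true_p = [xt / 4 for xt in x]        # treated share in each x cell
good_m1 = [4.0 + xt for xt in x]     # correct outcome model, treated
good_m0 = [3.0 + xt for xt in x]     # correct outcome model, control
flat_p = [0.5] * len(x)              # deliberately wrong propensity score
zero_m = [0.0] * len(x)              # deliberately wrong outcome model

right_score, _ = aipw(d, y, true_p, zero_m, zero_m)
right_model, _ = aipw(d, y, flat_p, good_m1, good_m0)
both_wrong, _ = aipw(d, y, flat_p, zero_m, zero_m)
```

In this toy setting the estimate equals the true ATE of 1 when either the score or the outcome model is right, and falls back to the biased group-means difference (about 1.67) only when both are wrong, which is the double robustness property in miniature.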
As a way to understand where the differences come from, we first implement the AIPW estimator by restricting the parameters of the regression (based on LPs) to be the same in the treated and control subpopulations, as is implicit in the OLS and IV approaches. Under that constraint, the results from the AIPW estimator are close to the IV results seen earlier. Next we allow for the parameters to vary across subpopulations, adhering to the way expression (18) is typically applied in the policy evaluation literature. These results deliver the same qualitative implication of contractionary austerity but show that the effects of consolidations are quantitatively even more painful.

6. Contractionary Austerity Revisited: Estimates of the Average Effect of Fiscal Consolidations

This Section presents AIPW estimates of the ATE of fiscal consolidations. Following standard procedures, the propensity score used here is based on a saturated probit model that extends the set of controls used in Table 7 with the current and lagged values of the controls in Table 6. The saturated probit also includes country fixed effects. Although we do not report the coefficient estimates of this more saturated model, it is worth mentioning that the AUC in this treatment model rises to 0.86.

Figure 2 provides smooth kernel density estimates of the distribution of the propensity score for the treated and control units to check for overlap. One way to think of overlap is to consider what overlap would be in the ideal RCT. The empirical distributions of the propensity score for treated and control units would be uniform and identical to each other. At the other extreme, suppose that treatment is allocated mechanically on the basis of controls. Then the distribution of treated units would spike at one and be zero elsewhere and the distribution of control units would spike at zero and be zero elsewhere.
Despite the high AUC, the Figure indicates considerable overlap between the distributions, which suggests that we have a satisfactory first‐stage model with which to identify the ATE properly using IPW methods.

Fig. 2. Overlap Check: Empirical Distributions of the Treatment Propensity Score
Notes. See text. The propensity score is estimated using the saturated probit specification discussed in the text, which includes country fixed effects. The Figure displays the predicted probabilities of treatment with a dashed line for the treatment observations and with a solid line for the control observations.
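A numerical companion to a visual overlap check can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example scores and the default cap of 10 on the weights are hypothetical choices made here, and the fitted propensity scores are taken as given.

```python
def ipw_weights(d, p):
    """Weight 1/p for treated observations and 1/(1 - p) for controls."""
    return [1.0 / pt if dt == 1 else 1.0 / (1.0 - pt) for dt, pt in zip(d, p)]


def overlap_report(d, p, cap=10.0):
    """Summarise common support and the size of the implied IPW weights."""
    weights = ipw_weights(d, p)
    treated = [pt for dt, pt in zip(d, p) if dt == 1]
    control = [pt for dt, pt in zip(d, p) if dt == 0]
    # Common support: the range of scores spanned by both groups.
    low = max(min(treated), min(control))
    high = min(max(treated), max(control))
    inside = sum(1 for pt in p if low <= pt <= high)
    return {
        "common_support": (low, high),
        "share_inside": inside / len(p),
        "max_weight": max(weights),
        "n_above_cap": sum(1 for w in weights if w > cap),
        "truncated": [min(w, cap) for w in weights],
    }


# Hypothetical fitted scores: one treated unit with a very low score and one
# control unit with a very high score, both of which receive large weights.
d = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.6, 0.05, 0.4, 0.2, 0.95]
report = overlap_report(d, p)
```

A report of this kind flags the observations that would dominate a plain IPW average, which is where truncation or an augmentation term becomes relevant.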
However, the Figure also indicates that there are some observations likely to get very high weights. Specifically, there are control (treated) units whose propensity score is near one (zero) and hence who get weights in the IPW in excess of 10. In general, it is often recommended to truncate the maximum weights in the IPW to 10 (Imbens, 2004; Cole and Hernán, 2008). However, the AIPW has the property that high weights in the IPW are compensated at the same rate by the augmentation term. Experiments not reported here indicate that this is indeed what happens in practice and that truncation is unnecessary in our application (see Appendix A.3).

Using the more saturated probit, we then estimate cumulated responses and their sum to the five‐year horizon as before. Our indicator of a fiscal consolidation is the narrative IMF indicator, the Treatment variable. Since Treatment is binary, we are estimating average effects only. However, coincidentally, the average treatment size (or dose) is close to 1% of GDP in these data (the exact value is 0.97, with a standard deviation of 0.07 in the full sample and is not significantly different in booms and slumps), so the interpretation of these responses is directly comparable to a conventional multiplier, with only a small upscaling (of 1/0.97) for strict accuracy. We can return to this rescaling issue in a moment when we make a formal comparison with the previous OLS and IV results.

We begin by discussing Table 8, which is the direct counterpart to the OLS and IV result presentations in Tables 1 and 3. Here we show the ATE of fiscal consolidation using the AIPW estimator (18), for the full sample (i.e. no use of boom and slump bins, yet) and using the propensity score estimates based on the saturated probit. Both the treatment equation probit model and the outcome‐equation AIPW model include country‐fixed effects.

Table 8. Average Treatment Effect of Fiscal Consolidation, AIPW Estimates, Full Sample
Deviations of log real GDP (relative to Year 0, × 100)

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal ATE, restricted ($\theta_1^h=\theta_0^h$) | −0.17 | −0.55** | −0.61*** | −0.88** | −1.14** | −3.22*** |
| | (0.17) | (0.23) | (0.20) | (0.32) | (0.42) | (0.89) |
| Fiscal ATE, unrestricted ($\theta_1^h\neq\theta_0^h$) | −0.24 | −0.70** | −0.75*** | −0.93** | −1.23** | −3.61*** |
| | (0.16) | (0.26) | (0.25) | (0.33) | (0.47) | (1.06) |
| Observations | 456 | 439 | 423 | 406 | 389 | 389 |

Notes. Empirical sandwich standard errors (clustered by country) in parentheses (see expression (20)). ***/**/* Indicate p < 0.01/0.05/0.10. Conditional mean controls: cyclical component of y, two lags of change in y, country fixed effects. $y^C$ is the cyclical component of log y (log real GDP), from HP filter with λ = 100. Specification includes country fixed effects in the propensity score model and in the AIPW model. Propensity score based on the saturated probit model as described in the text. AIPW estimates do not impose restrictions on the weights of the propensity score. Truncated results not reported here but available upon request. See text.

Table 8 is organised into two rows. The first row reports the results based on imposing the restriction $\theta_1^h=\theta_0^h$, the usual implicit restriction used without hesitation in the macro‐VAR empirical literature and the same restriction we imposed in reporting the results of Tables 1 and 3. The second row reports the results that do not impose the $\theta_1^h=\theta_0^h$ restriction. The results are qualitatively similar to those reported in Table 3 in that we still find that austerity is contractionary.
However, the estimated impacts of fiscal consolidations on output are now even bigger. Recall that according to the IV estimates, the accumulated loss over five years was −2.94***. This would imply an average annual real GDP loss of about 0.59% of GDP per 1% of fiscal consolidation over each of the five years. Here our AIPW estimate with unrestricted coefficients has a sum effect of −3.61*** over five years. This would imply an average annual real GDP loss of about 0.74% of GDP per 1% of fiscal consolidation over each of the five years (using a 1/0.97 rescaling factor). Thus the implied output losses due to austerity are about 25% larger under our AIPW estimation than with IV estimation.

Next we once again explore the same partition of the data into booms and slumps, allocating observations to the bins according to whether output is above or below trend as in earlier Sections, to provide a more granular view of these results. Table 9 presents these AIPW estimates based on the same saturated policy propensity score probit model described earlier.

These results show that in a boom a fiscal consolidation has on average a small, negative, but imprecisely estimated effect. The first row of the Table indicates that the accumulated loss over five years is 1.80% of GDP. In a slump, the results are about twice as strong and statistically significant: over five years, the accumulated loss is 3.54% of GDP (significant at the 5% level), as shown in the second row of the Table. Scaling these effects for the average treatment size (0.97% of GDP), the average loss per 1% fiscal consolidation is 0.37% of GDP per year over the five‐year window in booms, and 0.73% of GDP per year in slumps.

Table 9. Average Treatment Effect of Fiscal Consolidation, AIPW Estimates, Booms Versus Slumps
Deviations of log real GDP (relative to Year 0, × 100)

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal ATE, $y^C>0$, boom | −0.33 | −0.68* | −0.36 | −0.55 | −0.56 | −1.80 |
| | (0.22) | (0.39) | (0.41) | (0.57) | (0.84) | (1.85) |
| Fiscal ATE, $y^C<0$, slump | −0.19 | −0.76*** | −0.96*** | −0.68 | −0.95 | −3.54** |
| | (0.19) | (0.25) | (0.33) | (0.43) | (0.61) | (1.52) |
| Observations | 456 | 439 | 423 | 406 | 389 | 389 |

Notes. Empirical sandwich standard errors (clustered by country) in parentheses (see (20)). ***/**/* Indicate p < 0.01/0.05/0.10. Conditional mean controls: cyclical component of y, two lags of change in y, country fixed effects. $y^C$ is the cyclical component of log y (log real GDP), from HP filter with λ = 100. Specification includes country fixed effects in the propensity score model and in the AIPW model. Propensity score based on the saturated probit model as described in the text. AIPW estimates do not impose restrictions on the weights of the propensity score. The boom bin is for observations where the cyclical component $y^C$ is greater than zero, the slump bin is for observations where the cyclical component is less than or equal to zero.

Summing up our LP results, we always find more adverse paths when austerity is imposed in slumps rather than in booms but there are sometimes big differences across specifications. OLS suggests that austerity might have a small and imprecisely estimated expansionary effect, although a more granular view indicates that even then, this result holds only in booms. Using the ‘narrative’ instrument we would walk away believing more firmly that austerity is contractionary. The estimated effect with IV is relatively small and imprecisely estimated for the boom but stronger and significant in the slump, adding up to a loss of −3.3% of output over five years for the typical consolidation. Finally, using the AIPW estimator we find even larger contractionary effects of austerity, still not statistically significant in booms, and amounting to −3.5% over five years in slumps.
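The per‐year figures quoted in this Section follow from a simple rescaling of the five‐year sums, which can be checked with the short sketch below. The helper name and constants are assumptions introduced here; the inputs are the sums quoted in the text and tables, and the IV sum is treated as already expressed per 1% of GDP, so it is not rescaled by the average treatment size.

```python
AVERAGE_TREATMENT_SIZE = 0.97  # average consolidation, % of GDP (from the text)
YEARS = 5                      # length of the horizon over which sums are taken


def annual_loss_per_one_percent(cumulated_sum, dose=1.0, years=YEARS):
    """Average annual output loss (% of GDP) per 1% of GDP consolidation.

    cumulated_sum is the (negative) five-year sum of responses; dose is the
    average treatment size that the sum corresponds to.
    """
    return -cumulated_sum / dose / years


iv_full = annual_loss_per_one_percent(-2.94)
aipw_full = annual_loss_per_one_percent(-3.61, AVERAGE_TREATMENT_SIZE)
aipw_boom = annual_loss_per_one_percent(-1.80, AVERAGE_TREATMENT_SIZE)
aipw_slump = annual_loss_per_one_percent(-3.54, AVERAGE_TREATMENT_SIZE)

for label, value in [("IV, full sample", iv_full),
                     ("AIPW, full sample", aipw_full),
                     ("AIPW, boom", aipw_boom),
                     ("AIPW, slump", aipw_slump)]:
    print(f"{label}: {value:.2f}% of GDP per year")
```

Rounded to two decimals, these reproduce the 0.59, 0.74, 0.37 and 0.73 reported above.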
One may quibble that the size of the consolidation should also be taken into consideration. In principle, this is a valid concern. However, in practice this would mean extending the space of discrete interventions. Given the data, there would be too few observations to obtain robust results (and in some cases, insufficient data to estimate the desired effects). Fortunately, as we have discussed earlier, fiscal consolidations typically average about 1% relative to GDP with a tight range of variation, which greatly facilitates the interpretability of our findings.

Figure 3 displays the coefficients reported in Table 9, with appropriate rescaling in the case of AIPW to allow for the average treatment size, to show the dynamic ATE impacts of fiscal consolidations in graphical form and compares them with the responses obtained using the IV coefficient estimates which were reported earlier in Table 4.

Fig. 3. Comparing AIPW and IV Estimates of the Response of the Output Path in the Case of a 1% Fiscal Consolidation, Deviations of Log Real GDP (relative to Year 0, × 100)
Notes. Panel (a) reports the cumulative ATE responses based on $y_{t+h}-y_t$, whereas panel (b) presents the accumulated ATE output loss, which is the running sum of the coefficients displayed in panel (a). 95%/90% error bands displayed. The top row shows the results for the subpopulation of observations in the boom, measured in deviations above HP trend. The bottom row shows the results for the subpopulation of observations in the slump, measured in deviations below HP trend. AIPW refers to the responses calculated using the AIPW estimator of subsection 5.2; IV refers to the IV estimator discussed in Section 5. AIPW impacts are rescaled to allow for the average size of fiscal consolidation. See text.
Our results underscore that austerity tends to be painful but that timing matters: the least painful fiscal consolidations, from a growth and hence budgetary perspective, will tend to be those launched from a position of strength, that is, in the boom not the slump. This would seem to require moderately wise policymaking and/or fiscal regimes (councils, rules etc.), not to mention an ability to stay below any debt limit so as to maintain capital market access to permit smoothing.

The next Section puts our new results to work in the context of the austerity programme launched in the UK by the Coalition administration in 2010, to show how our analysis can be used in practice. Moreover, by putting our results in a realistic situation outside the sample used for estimation, we obtain a feel for how well calibrated our findings are to the recent macroeconomic experience of a representative economy from our sample.

7. Counterfactual: Coalition Austerity and the UK Recession

This Section makes a counterfactual forecast of the post‐2007 path of the UK economy without the fiscal austerity policies imposed by the Coalition Government after the 2010 election. These estimates are based on a sample that excludes the Global Financial Crisis.
Therefore, the exercise has the flavour of an out‐of‐sample evaluation.

We need to be very clear about the assumptions, explicit or implicit, in our counterfactual. We assume that zero fiscal consolidation was done in the years 2010–3 as the counterfactual, that this was feasible, and that it would not have caused other possibly adverse outcomes. Most crucially we assume that this would have had no damaging effect on UK government debt yields, or country risk, in the relevant time frame. We assume, in other words, that the UK had the fiscal space to follow a counterfactual path for those years.

Critics may question this assumption but no evidence has been adduced to support their view. First, without a model on their part of the determinants of UK sovereign risk showing such a spike in yields, the point is not well supported. In fact, to the extent that we have any cross‐country evidence on the impact of fiscal consolidations on sovereign yields it shows that austerity tends to worsen sovereign spreads, or has a small and often non‐significant effect (Born et al., 2015, Figure 4). Second, the evidence from UK longer‐run history is even less favourable: the country emerged from the Napoleonic Wars and the two World Wars with debt ratios higher than current levels relative to GDP (and tax revenues much lower) and yet never defaulted on market debt or lost capital market access. So there is no within‐country evidence of this link either. Third, recent UK history also goes against this argument: despite public debt levels rising under the Coalition (mainly because a slow recovery meant poor tax revenues) the UK gilt yield has fallen to its lowest level in centuries. Indeed, with stronger output growth under the counterfactual, tax revenues would have been higher, a factor that would have alleviated debt build up, all else equal.10

Fig. 4. UK Austerity: Forecast, Actual, and Counterfactual Paths for Real GDP, 2007–13
Notes. Units are percent of 2007 real GDP, the last peak. OBR forecast is from http://budgetresponsibility.independent.gov.uk/wordpress/docs/pre_budget_forecast_140610.pdf. The Jordà et al. (2013) path is for real GDP per capita, extended to a six‐year horizon, adjusted by +0.65% per year given the UK rate of population growth. Actual data from ONS in March 2014. Model counterfactuals subtract estimated AIPW responses in the slump bin, suitably scaled. See text.
Notes. Units are percent of 2007 real GDP, the last peak. OBR forecast is from http://budgetresponsibility.independent.gov.uk/wordpress/docs/pre_budget_forecast_140610.pdf. The Jordà et al. (2013) path is for real GDP per capita, extended to a six‐year horizon, adjusted by +0.65% per year given the UK rate of population growth. Actual data from ONS in March 2014. Model counterfactuals subtract estimated AIPW responses in the slump bin, suitably scaled. See text. The UK experienced a much weaker recovery than in the US, where nothing close to a double dip took place. The divergence between the two recovery paths began in 2010 (Schularick and Taylor, 2012). Since both countries’ central banks acted with aggressive easing, by going to the zero bound and pursuing quantitative easing policies thereafter, explanations for the differences have focused elsewhere. Various explanations have been offered, ranging from tighter UK fiscal policy, to spillovers from the Eurozone and weak trade links with fast‐growth emerging markets. Other stories have invoked contractions in oversized UK sectors such as finance and North Sea oil and gas, the extent of non‐bank finance and differential energy costs (Davies, 2012; Posen, 2012). To gain quantitative traction on the share of responsibility that should be borne by fiscal policy we use our AIPW estimates. We scale, and assign the impacts of fiscal shocks as follows. As a measure of the change in fiscal stance we use the change in the UK Office of Budget Responsibility’s (OBR) cyclically adjusted primary balance. The changes turn out to be +2.3% of GDP in year 1 (2009–10 to 2010–1), followed by +1.5% in year 2, and +0.1% in year 3, showing a slowing of the pace of tightening in year 3, but with further austerity planned in future years.11 This gives us a sequence of three fiscal policy shocks. 
Note that the average treatment size is 0.97, so for this counterfactual exercise we scale treatment effects due to each shock by a factor of 1/0.97. We then have to compute the impact of each shock at each horizon and make sure we assign it appropriately. Our AIPW estimation already allows for the fact that if at time 0 a treatment occurs, then its measured impact at time h ≥ 1 includes not just the direct impact of the policy on output but also its indirect impact arising from the fact that treatment at time 0 also predicts some positive probability of treatment at time h ≥ 1. To prevent double counting we therefore need to subtract these ‘expected austerity’ measures carefully from any forecast of fiscal impacts in year 1 and beyond. The effects of the first round of austerity in 2010–1 can be computed directly from the AIPW estimates above (for the slump bin, since the UK was already in a deep recession then). For example, the effect of the 2010–1 austerity shock in 2011 itself would be computed as the shock magnitude of +2.3 (OBR data, as above) multiplied by the scaling factor of 1/0.97 (noted above), and then multiplied by the AIPW coefficient of −0.19 (from the slump bin in year 1). In subsequent years, however, an adjustment must be made. For example, the effect of the 2010–1 plus 2011–2 austerity shocks in 2012 itself would be computed in two parts. First, there is a similar direct effect of the first year shock on second year output: the first year shock magnitude of 2.3 (again) multiplied by the scaling factor of 1/0.97 (again), and then multiplied by the AIPW coefficient of −0.76 (from the slump bin but now in year 2). Second, there is the additional effect from unexpected treatment in year 2 conditional on treatment in year 1.
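The direct-effect arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the scaling step, using the numbers quoted in the text (shock of +2.3, average treatment size 0.97, slump-bin AIPW coefficients of −0.19 and −0.76). The helper name `direct_effect` and the dictionary layout are assumptions made for this sketch, and the 'unexpected treatment' adjustment for later years is deliberately left out.

```python
# Minimal sketch of the direct-effect scaling described in the text.
# All numbers are those quoted in the article; names are hypothetical.

AVG_TREATMENT_SIZE = 0.97   # average treatment size, so the scale factor is 1/0.97
SHOCK_2010_11 = 2.3         # change in OBR cyclically adjusted primary balance, % of GDP

# Slump-bin AIPW coefficients quoted in the text, keyed by horizon in years.
AIPW_SLUMP = {1: -0.19, 2: -0.76}


def direct_effect(shock, horizon):
    """Shock size x scaling factor x AIPW coefficient at the given horizon."""
    return shock * (1.0 / AVG_TREATMENT_SIZE) * AIPW_SLUMP[horizon]


# Effect of the 2010-1 shock on output in 2011 (year 1) and, as the first
# of the two parts described in the text, on output in 2012 (year 2).
effect_2011 = direct_effect(SHOCK_2010_11, 1)
effect_2012_direct = direct_effect(SHOCK_2010_11, 2)

print(round(effect_2011, 2), round(effect_2012_direct, 2))
```

The results are in the same units as the AIPW coefficients (log real GDP relative to year 0, × 100), so they can be read as approximate percentage deviations of output.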
To get at this problem, we estimate a simple LP regression for the forward path of treatment at time h, conditional on treatment today, and use these to weight austerity impacts in years 2 and 3.12 The results of this counterfactual exercise are presented in Figure 4 and, for reference, we also show various actual and forecast paths for UK real GDP from 2007 (the business cycle peak) to 2013. As a starting point, without knowledge of what was to happen after the 2010 Coalition austerity programme, what might have been the ex ante expected path of the UK economy? This question is answered by the two dashed lines. The double‐long‐dashed line shows the unconditional historical path in a financial crisis recession based on a large sample of all advanced‐economy recessions from 1870 to 2007 in Jordà et al. (2013), extended to the six‐year horizon. We restrict attention to their average path for highly leveraged economies after a financial crisis, a category which includes the UK case in 2007. Clearly, a seriously painful recession was to be expected anyway: if output is scaled to 100 in 2007, this path shows a 4% drop over two years, to a level of 96 by 2009, followed by recovery thereafter to about 104 in 2013. What did the authorities expect? According to the June 2010 Pre‐Budget report of the OBR they expected something similar but slightly worse to unfold after 2010, as shown by the short‐dashed path in the Figure. The bottom in output here is 94.2 and the recovery was predicted to be initially slower, although by 2012 the OBR thought the output level would be 100.6 and by 2013 it would be at 103.4, in the same units. (Thus the difference between the two displayed forecast paths is only 0.6% of GDP by 2013.) Alas, this did not come to pass, as shown by the solid line in the chart using actual UK (ONS) data to depict the outturn of events. Everything was going more or less in line with the forecast path until 2010. 
After that, a double‐dip recession was avoided only by a decimal rounding and the UK real economy virtually flatlined for a couple of years before a small uptick in 2013. (In per capita terms, the UK economy actually shrank.) How much of the UK’s dismal performance can be attributed to the fiscal policy choice of instigating austerity during a slump? The answer based on our counterfactual model is about 7/9, or just over three‐quarters. This is shown by the dotted line in the chart, which cumulates the effects of each of the three years of austerity on growth from 2010 to 2013. By 2013, the last year in the window, the cumulative effects of these choices amounted to about 3.1% of GDP (in 2007 units) where the total gap relative to the actual path was 4.0%, thus leaving an unexplained residual of 0.9%. Our model also suggests that additional drag from the 2010–3 austerity policies would have been felt into 2014–6, even if no further austerity had been imposed.13 In 2013, at the end of the period analysed here, OBR published an estimate that austerity caused a roughly −1.5% change in output in the year 2013. Our −3.1% estimate of the impact of fiscal austerity on economic activity is just over twice as large. We think this important difference is largely due to the fact that, unlike us, OBR does not allow for state‐dependence and they also arbitrarily force the effects of fiscal policy to decay to zero after four years. Both of these modelling choices would appear to be strongly rejected by the data, however.14 Even so, our 3.1% estimate could still be biased down because we are unable to adjust for monetary policy at the zero lower bound (ZLB). The UK out‐of‐sample counterfactual is based in a liquidity trap environment but the in‐sample data we used for estimation overwhelmingly were not. Our estimates used data from the 1970s to 2007.
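As a quick check on the attribution figures just quoted, the share, residual and comparison with the OBR estimate follow from simple arithmetic. The sketch below uses only the rounded numbers given in the text; the variable names are illustrative.

```python
# Attribution arithmetic using the rounded figures quoted in the text.
austerity_effect = 3.1   # cumulative effect of austerity by 2013, % of 2007 GDP
total_gap = 4.0          # total gap relative to the actual path, % of 2007 GDP
obr_estimate = 1.5       # magnitude of the OBR estimate for 2013, % of output

share_explained = austerity_effect / total_gap   # close to 7/9
residual = total_gap - austerity_effect          # unexplained part of the gap
ratio_to_obr = austerity_effect / obr_estimate   # "just over twice as large"

print(f"share explained: {share_explained:.3f} (7/9 = {7 / 9:.3f})")
print(f"residual: {residual:.1f}% of GDP")
print(f"ratio to OBR estimate: {ratio_to_obr:.2f}")
```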
Out of 173 consolidation episodes, there are only seven country‐year observations at the ZLB, all relating to Japan in the period 1990–2007. Economic theory (Christiano et al., 2011; Eggertsson and Krugman, 2012; Rendahl, 2012) and also historical evidence from the 1930s (Almunia et al., 2010) indicate that fiscal multipliers are much larger under ZLB conditions than in normal times when monetary policy is away from this constraint. But we cannot hope to capture the ZLB effect convincingly in our sample with just a handful of observations from Japan, so this must remain a goal for future research where we hope to apply our new estimation methods to a large set of contemporary and historical data.

8. Conclusion

Few macroeconomic policy debates have generated as much controversy as the current austerity argument and, as Europe stagnates, the furore appears to be far from over. Amidst the cacophony of competing estimates of fiscal multipliers, the goal of this article is not to add another source of noise. Rather, the main contribution is to harmonise dissonant views into a unified framework where the merits of each approach can be properly evaluated. The effect of fiscal consolidation on macroeconomic outcomes is ultimately an empirical question. In the absence of randomised controlled trials, we have to rely on observational data. And to measure the causal effect of fiscal consolidations on growth, it is critical that identification assumptions be properly evaluated and that empirical methods be suitably adjusted to the demands of the data. Whenever outcomes are correlated with observables that determine the likelihood of treatment, the effect of the treatment cannot be causally measured without bias. Yet, this allocation bias prevents us from being able to tell whether or not the low or even inverted values of the fiscal multiplier often found in this strand of the literature are indeed close enough to the truth.
If episodes of fiscal consolidation could be separated by whether or not they are explained by circumstances, identification could be, once again, restored. The narrative approach relies on a careful reading of the records to achieve just such a separation. Moreover, results from this approach indicate that the fiscal multiplier is larger in magnitude, especially in depressed economies. However, when those consolidations believed to be exogenous are predictable by omitted observable controls, two concerns arise. First, the instrument may not be exogenous as argued. Second, even if the instrument is exogenous, the omitted predictors introduce bias in the IV estimator. Extant results in the literature can be somewhat reconciled by interpreting exogenous consolidations as instrumental variables. After all, if the narrative approach were not very informative about the exogeneity of these episodes, there should not be any difference in the value of the multiplier estimated using simple least squares and IV methods. So, while potentially problematic, the narrative approach (through these IV estimates) seems to be isolating fiscal consolidations that differ from those in the overall population in some important respects. Whether the fiscal multiplier estimated with instrumental variables can be interpreted causally requires further analysis. Dissatisfaction with the potential violation of exogeneity conditions required for identification could lead one, like Mill (1836), to the nihilistic conclusion that without an experimentum crucis mere observational data are hopelessly unsuitable for testing a macroeconomic hypothesis. But we believe the battle is not lost. Propensity score methods, common in biostatistics, medical research and in applied microeconomics when ideal randomised trials are unavailable, offer a last line of defence. Recent work by Angrist et al. (2013) introduced inverse probability weighted estimators of ATEs for time series data.
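To make the idea of inverse probability weighting concrete, the following is a minimal sketch of an augmented inverse probability weighted (AIPW) estimator of an average treatment effect, the class of estimator this article adopts. It is not the authors' implementation: it ignores the panel and time series structure, the boom/slump bins, local projections and clustered standard errors. The data are synthetic, the function name `aipw_ate` is hypothetical, the true propensity score is used in place of an estimated probit, and clipping the score to [0.1, 0.9] is just one assumed way of implementing truncation.

```python
import numpy as np


def aipw_ate(y, d, p, mu1, mu0, trunc=(0.1, 0.9)):
    """AIPW (doubly robust) estimate of the average treatment effect.

    y   : observed outcomes
    d   : binary treatment indicator (1.0 = treated, 0.0 = control)
    p   : propensity scores, P(d = 1 | controls)
    mu1 : predicted outcome under treatment for every observation
    mu0 : predicted outcome under control for every observation
    """
    p = np.clip(p, trunc[0], trunc[1])  # assumed truncation rule: clip the score
    treated_term = mu1 + d * (y - mu1) / p
    control_term = mu0 + (1.0 - d) * (y - mu0) / (1.0 - p)
    return float(np.mean(treated_term - control_term))


# Synthetic example: treatment is more likely when x is high, and x also
# raises the outcome, so a naive comparison of means is biased.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-x))            # true propensity score
d = (rng.random(n) < p_true).astype(float)   # treatment assignment
y = 2.0 * x - 1.0 * d + rng.normal(size=n)   # true treatment effect is -1

naive = float(y[d == 1].mean() - y[d == 0].mean())

# Outcome models: separate linear fits for treated and control observations.
coef1 = np.polyfit(x[d == 1], y[d == 1], 1)
coef0 = np.polyfit(x[d == 0], y[d == 0], 1)
mu1 = np.polyval(coef1, x)
mu0 = np.polyval(coef0, x)

ate = aipw_ate(y, d, p_true, mu1, mu0)
print(f"naive difference in means: {naive:.2f}")
print(f"AIPW estimate:             {ate:.2f}")
```

In this toy setting the naive difference in means is pushed upwards by the allocation bias, while the AIPW estimate should land close to the true effect of −1.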
Our appeal to this approach begins by recognising that fiscal consolidations may not be exogenous events, even those identified by the narrative approach. Next we construct a predictive model for the likelihood of fiscal consolidation using various specifications including some with a rich set of available observable controls. The predictive model serves to reallocate probability mass from the regions of the distributions in the treatment/control subpopulations that are oversampled to those regions that are undersampled, thus enabling identification in the framework of the Rubin Causal Model. Our preferred AIPW estimates, which correct for the endogeneity of the fiscal treatments, are quantitatively much closer to those from the instrumental variables specification than to those from the least squares specification, although this outcome could not have been known without doing the analysis. This result provides some measure of comfort on the potential validity of the instrument. Our analysis suggests even larger austerity impacts than the IMF study when the economy is growing below its long‐run trend, however. This is likely a result of correcting attenuation bias due to the omitted predictors of fiscal consolidation and the re‐randomisation methods that we use. Generally, in the slump, austerity prolongs the pain, much more so than in the boom. It appears that Keynes was right after all.

Appendix

(A.1) OLS with Country‐fixed Effects and Controlling for World Growth

This subsection reports estimates of the OLS specification (expression (1)) when the model is extended to include the World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends. The following Table A1 corresponds to Table 2 using this alternative specification.

Table A1. Fiscal Multiplier, d.CAPB, OLS Estimate, Booms versus Slumps
Log real GDP (relative to Year 0, × 100)

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Panel (a): Uniform effect of d.CAPB changes | | | | | | |
| Fiscal multiplier, yC > 0, boom | 0.21*** | 0.25*** | 0.06 | −0.18* | −0.26* | −0.07 |
| | (0.07) | (0.07) | (0.05) | (0.10) | (0.14) | (0.24) |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, yC ≤ 0, slump | −0.03 | −0.06 | −0.17 | −0.23* | −0.41** | −0.97** |
| | (0.03) | (0.06) | (0.10) | (0.12) | (0.17) | (0.37) |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |
| Panel (b): Separate effects of d.CAPB for Large (> 1.5%) and Small (≤ 1.5%) changes | | | | | | |
| Fiscal multiplier, large change in CAPB, yC > 0, boom | 0.23*** | 0.25*** | 0.07 | −0.17 | −0.22 | 0.08 |
| | (0.08) | (0.08) | (0.06) | (0.10) | (0.14) | (0.27) |
| Fiscal multiplier, small change in CAPB, yC > 0, boom | 0.04 | 0.19 | −0.02 | −0.35 | −0.68 | −1.68 |
| | (0.12) | (0.33) | (0.40) | (0.37) | (0.39) | (1.11) |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, large change in CAPB, yC ≤ 0, slump | −0.03 | −0.05 | −0.18 | −0.30* | −0.52** | −1.16** |
| | (0.04) | (0.08) | (0.12) | (0.16) | (0.22) | (0.53) |
| Fiscal multiplier, small change in CAPB, yC ≤ 0, slump | −0.05 | −0.15 | −0.10 | 0.13 | 0.16 | 0.03 |
| | (0.12) | (0.21) | (0.23) | (0.32) | (0.49) | (1.09) |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |

Notes. Standard errors (clustered by country) in parentheses. ***/**/* Indicates p < 0.01/0.05/0.10 respectively. Additional controls: cyclical component of y, two lags of change in y, country fixed effects; and also growth rate of world real GDP (World Bank). yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100.

(A.2) IV with Country‐fixed Effects and Controlling for World Growth

The following Table A2 corresponds to Table 4 when we add the World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends.

Table A2. Fiscal Multiplier, d.CAPB, IV Estimate (binary), Booms versus Slumps
Log real GDP (relative to Year 0, × 100)

| | (1) Year 1 | (2) Year 2 | (3) Year 3 | (4) Year 4 | (5) Year 5 | (6) Sum |
|---|---|---|---|---|---|---|
| Fiscal multiplier, yC > 0, boom | −0.32 | −0.33 | −0.14 | −0.54 | −0.67 | −1.18 |
| | (0.32) | (0.52) | (0.51) | (0.45) | (0.45) | (1.54) |
| Observations | 222 | 205 | 192 | 180 | 175 | 175 |
| Fiscal multiplier, yC ≤ 0, slump | −0.24 | −0.76*** | −0.95*** | −0.79** | −0.94** | −3.38*** |
| | (0.15) | (0.24) | (0.31) | (0.32) | (0.42) | (1.10) |
| Observations | 235 | 235 | 231 | 226 | 214 | 214 |

Notes. Standard errors (clustered by country) in parentheses. ***/**/* Indicates p < 0.01/0.05/0.10 respectively. Additional controls: cyclical component of y, two lags of change in y, country fixed effects; and also growth rate of world real GDP (World Bank). yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100. d.CAPB instrumented by IMF fiscal action variable in binary 0‐1 form (treatment).

(A.3) Robustness

As discussed in the text, we explored the sensitivity of our results to different model specifications. These findings are shown in Table A3. In each case we show the impacts that these model changes have on the estimated five‐year summed estimate of the response of output to the fiscal treatment in the two output level bins. We also report the predictive ability test for the first stage in each case based on the area under the curve (AUC) statistic and its standard error.

Table A3. ATE of Fiscal Consolidation, AIPW Estimates, Booms versus Slumps, Various Propensity Score Models and Truncations
Sum of log real GDP impacts, years 1 to 5 (all relative to Year 0, × 100)

| Estimator | (1) Probit CFE + world GDP | (2) Logit CFE | (3) Logit CFE + world GDP | (4) Probit CFE, p ∈ [0.1, 0.9] | (5) Probit CFE, p ∈ [0.2, 0.8] |
|---|---|---|---|---|---|
| Fiscal ATE, yC > 0, boom | −1.81 | −1.74 | −1.72 | −1.82 | −1.78 |
| | (1.85) | (1.84) | (1.86) | (1.90) | (1.93) |
| Fiscal ATE, yC ≤ 0, slump | −3.46** | −4.00** | −3.88** | −3.58** | −3.58** |
| | (1.50) | (1.40) | (1.41) | (1.50) | (1.48) |
| First‐stage, AUC | 0.88 | 0.85 | 0.85 | 0.87 | 0.86 |
| | (0.02) | (0.02) | (0.02) | (0.02) | (0.02) |
| Observations | 389 | 389 | 389 | 389 | 389 |

Notes. Standard errors (clustered by country) in parentheses. ***/**/* Indicates p < 0.01/0.05/0.10 respectively. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100. AUC is the area under the Correct Classification Frontier (null = 1/2); see text. First‐stage p‐score models for the fiscal treatment are: column (1): as in Table 9, but including the year‐0 World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends. Column (2): as in Table 9, but pooled logit estimator. Column (3): as in column (2), but including the year‐0 World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends. Column (4): as in Table 9, pooled probit, but probability weights truncated to [0.1, 0.9]. Column (5): as in Table 9, pooled probit, but probability weights truncated to [0.2, 0.8].

In the main text we adopted a baseline specification of a pooled probit with country‐fixed effects in the first‐stage binary treatment regression.
In column (1) we add the year‐0 World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends in both stages. In column (2) we show the first‐stage using a pooled logit estimator with country‐fixed effects. In column (3), we extend the estimator in column (3) and add the year‐0 World real GDP growth rate (from the World Bank dataset) as a control to capture global time varying trends in both stages. Columns (4) and (5) report the results for the baseline probit model in the main text when probability weights are truncated to [0.1, 0.9] and [0.2, 0.8]. The message from these checks is that our results are not sensitive to the particular choice of first‐stage model used to generate the propensity score. In the boom bin, effects are always small and statistically insignificant. In the slump bin the effects are negative and significant. (A.4) Estimated LP Equation for Future Treatment For our UK counterfactuals we use LP‐OLS estimates of future treatment as a response to treatment today. This allows us to compute expected and unexpected components of fiscal shocks in multi‐year austerity programmes, e.g. UK 2010–3. The estimates are shown in Table A4. Table A4 LP Estimate of Impact Treatment on Future Treatment, OLS Estimates, Booms versus Slumps Dependent variable: Treatment in year h (consolidation from year h to h + 1) . . (1) . (2) . (3) . . Treatment (t + 1) . Treatment (t + 2) . Treatment (t + 4) . Treatment (t) 0.509*** 0.281*** 0.171*** (0.054) (0.055) (0.042) Observations 439 421 404 Dependent variable: Treatment in year h (consolidation from year h to h + 1) . . (1) . (2) . (3) . . Treatment (t + 1) . Treatment (t + 2) . Treatment (t + 4) . Treatment (t) 0.509*** 0.281*** 0.171*** (0.054) (0.055) (0.042) Observations 439 421 404 Notes Standard errors (clustered by country) in parentheses. ***/**/* Indicates p < 0.01/0.05/0.10 respectively. 
Additional controls: cyclical component of y, two lags of change in y, country fixed effects. yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100.

Table A4
LP Estimate of Impact of Treatment on Future Treatment, OLS Estimates, Booms versus Slumps

Dependent variable: Treatment in year h (consolidation from year h to h + 1)

                    (1)                 (2)                 (3)
                    Treatment (t + 1)   Treatment (t + 2)   Treatment (t + 4)
Treatment (t)       0.509***            0.281***            0.171***
                    (0.054)             (0.055)             (0.042)
Observations        439                 421                 404

Notes: Standard errors (clustered by country) in parentheses. ***/**/* indicates p < 0.01/0.05/0.10 respectively. Additional controls: cyclical component of y, two lags of change in y, country fixed effects. yC is the cyclical component of log y (log real GDP), from HP filter with λ = 100.

(A.5) Measures of UK Fiscal Consolidation 2010–3

Measures of the size of UK fiscal treatments are shown in Table A5. As discussed in the text, in our UK counterfactuals we use the change in the UK Office of Budget Responsibility (OBR) cyclically adjusted primary balance as a measure of the scale of the fiscal treatment in each period (panel a). Alternative measures exist, such as the OBR’s cyclically adjusted Treaty balance (panel b) or the IMF government structural balance (panel c). (‘Treaty’ refers to Maastricht Treaty definitions.) All three paths are broadly similar; our preferred OBR cyclically‐adjusted primary balance series (a) shows smaller changes than the other two series.

Table A5
OBR and IMF Measures of the Size of UK Fiscal Consolidations, 2010–3
Levels and Changes in Percent of GDP

Budget year                                          (1)       (2)       (3)       (4)
                                                     2009/10   2010/1    2011/2    2012/3
(a) OBR cyc.‐adjust. primary bal. (used in text)     −6.8      −4.4      −2.9      −2.8 (−1.0)*
    Change                                           –         +2.3      +1.5      +0.1 (+1.9)*
    Cumulative change                                –         +2.3      +3.8      +3.9 (+5.7)*
(b) OBR, cyc.‐adjust. Treaty def., sign reversed     −9.5      −7.4      −5.9      −3.6
    Change                                           –         +2.1      +1.5      +2.3
    Cumulative change                                –         +2.1      +3.6      +5.9

IMF calendar year                                    2010      2011      2012      2013
(c) IMF, government structural balance               −8.5      −6.6      −5.4      −4.0
    Change                                           –         +1.9      +1.2      +1.4
    Cumulative change                                –         +1.9      +3.1      +4.5

Notes: Data from IMF WEO October 2012 database, HM Treasury Autumn Statements 2011 and 2012, and HM Treasury and OBR Budget 2013 and 2014 documents online. The data in panel (a) are updated based on March 2014 OBR updates and are consistent with the estimates computed by Simon Wren‐Lewis (http://mainlymacro.blogspot.com/2014/03/i-got-to-third-sentence-of-osbornes.html). The figures in parentheses (*) are headline figures, which include ‘distortions’ due to credits taken for accounting adjustments involving the Bank of England’s asset purchase programme and the Royal Mail.

Footnotes

1 See Chalmers (2005, 2011), who discusses Petrarch, bloodletting and the MRC clinical trials.

2 Angrist and Pischke (2010, p. 5) judge that ‘progress has been slower in empirical macro’. The fear that standard empirical practices would not work, especially for aggregate economic questions, goes back a long way, at least to J. S. Mill (1836, p.
18), who favoured a priori reasoning alone, arguing that: ‘There is a property common to almost all the moral sciences, and by which they are distinguished from many of the physical; this is, that it is seldom in our power to make experiments in them’.

3 Ironically enough, fiscal policy debates are now littered with medical metaphors. In 2011 German Finance Minister Wolfgang Schäuble wrote in The Financial Times that ‘austerity is the only cure for the Eurozone’, while Paul Krugman, at The New York Times, likened it to ‘economic bloodletting’. In the FT, Martin Wolf cautioned that ‘the idea that treatment is right irrespective of what happens to the patient falls into the realm of witch‐doctoring, not science’. Martin Taylor, former head of Barclays, put it bluntly: ‘Countries are being enrolled, like it or not, in the economic equivalent of clinical trials’.

4 The d.CAPB measure used by AA is based on Blanchard (1993). The construction of this variable consists of adjusting for cyclical fluctuations using the unemployment rate.

5 The potential endogeneity of fiscal consolidation episodes has been noted by other authors. For example, Ardagna (2004) uses political variables as an exogenous driver for consolidation in a GLS simultaneous equation model of growth and consolidation for the period 1975–2002. Hernández de Cos and Moral‐Benito (2013) use economic variables as instruments.

6 For example, in the debate over the use of narrative methods to assess monetary policy, see the exchange between Leeper (1997) and Romer and Romer (1997).

7 AUC stands for area under the curve. The curve usually refers to the Receiver Operating Characteristic (ROC) curve. It also refers to the Correct Classification Frontier, as in Jordà and Taylor (2011).

8 Hernández de Cos and Moral‐Benito (2013) have arrived at a similar conclusion. Their proposed solution to the lack of exogeneity problem is to use an instrumental variable approach.
Instruments rely on data for predetermined controls and on past consolidations. Since data on predetermined controls already appear in the specification of previous studies (AA, GLP, etc.), the key question is whether past consolidation data predict current consolidation episodes. Fixed‐effect panel estimation already takes into account heterogeneity in the unconditional probability of consolidation across countries. Take Australia as an example: it is unlikely that the consolidation observed in 1985 helps determine the likelihood of consolidation in the year 1994 beyond the observation that Australia may consolidate more or less often than the typical country (already captured by the fixed effect). There may therefore be little to gain in terms of strengthening the identification.

9 Correction of incidental truncation with IPW has a long history in statistics (Horvitz and Thompson, 1952) and is generally viewed as more general than Heckman’s (1976) selection model. Heckman’s (1976) approach corrects for incidental truncation using the inverse Mills ratio, and requires specific distributional assumptions and at least one selection variable not affecting the structural equation. Heckman’s approach is only known to work for special non‐linear models, such as an exponential regression model (Wooldridge, 1997). See Wooldridge (2010) for a more general discussion.

10 Low yields have been evident throughout the advanced economies in recent years, outside the Eurozone crisis countries. Although this is outside the scope of our article, the main explanations for these trends would seem to be a strong medium‐run flight to safety since 2008 and a longer‐run drift to lower real rates globally over two decades driven by global demography and/or EM precautionary saving motives. An additional factor is that having its own central bank means that the UK cannot suffer a sudden stop or a self‐fulfilling run on the sovereign like, say, countries in the Eurozone periphery.
Together, these factors appear to have contributed to enlarged fiscal space in the form of a massive absorption potential for UK gilts.

11 Considerable controversy attends the question as to whether austerity policy was eased in year 3, with the Chancellor and HM Treasury insisting that consolidation continued, but many critics suggesting the data showed otherwise. This is often referred to as the Plan A versus Plan B debate. See the discussion by Jonathan Portes, http://www.niesr.ac.uk/blog/fiscal-policy-plan-and-recovery-explaining-economics. For consistency with official sources, we use the official OBR figures, excluding certain accounting credits in year 3 due to Bank of England and Royal Mail asset transactions which are not related to fiscal plans. See Appendix A.5 and Table A5. Two alternative measures of fiscal shocks are discussed in the Appendix, one from the OBR and one from the IMF. The measure we have chosen is the official UK measure and is more modest than these alternative measures.

12 The LP estimation for the forward path of treatment for the necessary three‐year horizon is reported in Table A4. We find that the ATE estimate of a change in probability, in the slump bin, of a treatment in year 1 given a treatment in year 0 is 0.51; the model also gives a 28% chance in year 2 and 17% in year 3. For our counterfactual this means that 51% of the Coalition austerity in 2011–2 (and 28% in 2012–3) was ‘baked in’, in probabilistic terms, by the decision to do austerity in 2010–1. So this component is already accounted for in the AIPW output path estimates. The net effects can be computed mechanically, as we illustrate in the following example. First, we can compute the first‐year shock magnitude of 2.3 multiplied by the scaling factor of 1/0.97, and then multiplied by the AIPW coefficient of −0.76 (from the slump bin in year 2).
Second, we can add to this the second‐year shock magnitude of 1.5 (OBR) multiplied by the scaling factor of 1/0.97, multiplied by the AIPW coefficient of −0.24 (from the slump bin in year 1), and multiplied by the probability of no treatment in year 2, which is 0.49 = 1 − 0.51. In a similar way, we can assign unexpected and expected effects of contemporaneous treatment to prior treatment in all years along the path.

13 The residual in Figure 4 could be accounted for by factors outside the framework: export patterns, the Eurozone crisis, or idiosyncratic UK sector shocks. There may have also been over‐optimism in the 2010 forecast (e.g. OBR underestimating either the size or economic impacts of upcoming austerity shocks).

14 See the impacts for 2010–1, 2011–2, and 2012–3 cumulated to 2012–3 in Chart 2.26 of the OBR’s Forecast Evaluation Report, http://cdn.budgetresponsibility.independent.gov.uk/FER2013.pdf.

References

Alesina, A. and Ardagna, S. (2010). ‘Large changes in fiscal policy: taxes versus spending’, in (J.R. Brown, ed.), Tax Policy and the Economy, pp. 35–68, Chicago, IL: University of Chicago Press.
Alesina, A. and Perotti, R. (1995). ‘Fiscal expansions and adjustments in OECD economies’, Economic Policy, vol. 10(21), pp. 207–47.
Almunia, M., Bénétrix, A., Eichengreen, B., O’Rourke, K.H. and Rua, G. (2010). ‘From great depression to great credit crisis: similarities, differences and lessons’, Economic Policy, vol. 25(62), pp. 219–65.
Angrist, J.D., Jordà, O. and Kuersteiner, G.M. (2013). ‘Semiparametric estimates of monetary policy effects: string theory revisited’, NBER Working Paper No. 19355.
Angrist, J.D. and Kuersteiner, G.M. (2004). ‘Semiparametric causality tests using the policy propensity score’, NBER Working Paper No. 10975.
Angrist, J.D. and Kuersteiner, G.M. (2011). ‘Causal effects of monetary shocks: semiparametric conditional independence tests with a multinomial propensity score’, Review of Economics and Statistics, vol. 93(3), pp. 725–47.
Angrist, J.D. and Pischke, J. (2010). ‘The credibility revolution in empirical economics: how better research design is taking the con out of econometrics’, Journal of Economic Perspectives, vol. 24(2), pp. 3–30.
Ardagna, S. (2004). ‘Fiscal stabilizations: when do they work and why’, European Economic Review, vol. 48(5), pp. 1047–74.
Auerbach, A.J. and Gorodnichenko, Y. (2012). ‘Measuring the output responses to fiscal policy’, American Economic Journal: Economic Policy, vol. 4(2), pp. 1–27.
Auerbach, A.J. and Gorodnichenko, Y. (2013). ‘Fiscal multipliers in recession and expansion’, in (A. Alesina and F. Giavazzi, eds.), Fiscal Policy after the Financial Crisis, pp. 98–102, Chicago, IL: University of Chicago Press.
Barro, R.J. and Redlick, C.J. (2011). ‘Macroeconomic effects from government purchases and taxes’, Quarterly Journal of Economics, vol. 126(1), pp. 51–102.
Blanchard, O.J. (1993). ‘Suggestion for a new set of fiscal indicators’, OECD Economics Department Working Paper No. 79.
Born, B., Müller, G. and Pfeifer, G. (2015). ‘Does austerity pay off?’, CEPR Discussion Paper No. 10425.
Cameron, C.A. and Trivedi, P.K. (2005). Microeconometrics: Methods and Applications, Cambridge: Cambridge University Press.
Chalmers, I. (2005).
‘Statistical theory was not the reason that randomisation was used in the British Medical Research Council’s clinical trial of streptomycin for pulmonary tuberculosis’, in (G. Jorland, A. Opinel, and G. Weisz, eds.), Body Counts: Medical Quantification in Historical and Sociological Perspectives, pp. 309–34, Montreal: McGill‐Queens University Press.
Chalmers, I. (2011). ‘Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers’, Journal of the Royal Society of Medicine, vol. 104(9), pp. 383–86.
Christiano, L., Eichenbaum, M. and Rebelo, S. (2011). ‘When is the government spending multiplier large?’, Journal of Political Economy, vol. 119(1), pp. 78–121.
Cole, S.R. and Hernán, M.A. (2008). ‘Constructing inverse probability weights for marginal structural models’, American Journal of Epidemiology, vol. 168(6), pp. 656–64.
Davies, G. (2012). ‘Why is the UK recovery weaker than the US?’, Financial Times, November 14. Available at: http://blogs.ft.com/gavyndavies/2012/11/14/why-is-the-uk-recovery-weaker-than-the-us/ (last accessed: 4 November 2015).
Donaldson, I.M.L. (2016). ‘Petrarch’s letter to Boccaccio “on the proud and presumptuous behaviour of physicians”’, JLL Bulletin: Commentaries on the history of treatment evaluation. Available at: http://www.jameslindlibrary.org/articles/petrarchs-letter-to-boccaccio-on-the-proud-and-presumptuous-behaviour-of-physicians/.
Eggertsson, G.B. and Krugman, P. (2012). ‘Debt, deleveraging, and the liquidity trap: a Fisher‐Minsky‐Koo approach’, Quarterly Journal of Economics, vol. 127(3), pp. 1469–513.
Giavazzi, F. and Pagano, M. (1990).
‘Can severe fiscal contractions be expansionary? Tales of two small European countries’, in (O.J. Blanchard and S. Fischer, eds.), NBER Macroeconomics Annual 1990, pp. 75–122, Cambridge, MA: MIT Press.
Glynn, A.N. and Quinn, K.M. (2010). ‘An introduction to the augmented inverse propensity weighted estimator’, Political Analysis, vol. 18(1), pp. 36–56.
Guajardo, J., Leigh, D. and Pescatori, A. (2014). ‘Expansionary austerity: new international evidence’, Journal of the European Economic Association, vol. 12(4), pp. 949–68.
Heckman, J.J. (1976). ‘The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models’, Annals of Economic and Social Measurement, vol. 5(4), pp. 475–92.
Hernández de Cos, P. and Moral‐Benito, E. (2013). ‘Fiscal consolidations and economic growth’, Fiscal Studies, vol. 34(4), pp. 491–515.
Hirano, K., Imbens, G.W. and Ridder, G. (2003). ‘Efficient estimation of average treatment effects using the estimated propensity score’, Econometrica, vol. 71(4), pp. 1161–89.
Horvitz, D.G. and Thompson, D.J. (1952). ‘A generalization of sampling without replacement from a finite population’, Journal of the American Statistical Association, vol. 47(260), pp. 663–85.
Imbens, G.W. (2004). ‘Nonparametric estimation of average treatment effects under exogeneity: a review’, Review of Economics and Statistics, vol. 86(1), pp. 4–29.
Jordà, O. (2005).
‘Estimation and inference of impulse responses by local projections’, American Economic Review, vol. 95(1), pp. 161–82.
Jordà, O., Schularick, M. and Taylor, A.M. (2013). ‘When credit bites back’, Journal of Money, Credit and Banking, vol. 45(s2), pp. 3–28.
Jordà, O. and Taylor, A.M. (2011). ‘Performance evaluation of zero net‐investment strategies’, NBER Working Paper No. 17150.
Keynes, J.M. (1937). ‘How to avoid a slump’, The Times, 12–14th January.
Kreif, N., Grieve, R., Radice, R. and Sekhon, J.S. (2013). ‘Regression‐adjusted matching and double‐robust methods for estimating average treatment effects in health economic evaluation’, Health Services and Outcomes Research Methodology, vol. 13(2–4), pp. 174–202.
Leeper, E.M. (1997). ‘Narrative and VAR approaches to monetary policy: common identification problems’, Journal of Monetary Economics, vol. 40(3), pp. 641–57.
Lunceford, J.K. and Davidian, M. (2004). ‘Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study’, Statistics in Medicine, vol. 23(19), pp. 2937–60.
Mertens, K. and Ravn, M.O. (2013). ‘The dynamic effects of personal and corporate income tax changes in the United States’, American Economic Review, vol. 103(4), pp. 1212–47.
Mertens, K. and Ravn, M.O. (2014). ‘A reconciliation of SVAR and narrative estimates of tax multipliers’, Journal of Monetary Economics, vol. 68(S), pp. S1–19.
Mill, J. (1836).
‘On the definition of political economy; and on the method of philosophical investigation in that science’, London and Westminster Review, vol. 26(1), pp. 1–29.
Mountford, A. and Uhlig, H. (2009). ‘What are the effects of fiscal policy shocks?’, Journal of Applied Econometrics, vol. 24(6), pp. 960–92.
Nakamura, E. and Steinsson, J. (2014). ‘Fiscal stimulus in a monetary union: evidence from US regions’, American Economic Review, vol. 104(3), pp. 753–92.
Owyang, M.T., Ramey, V.A. and Zubairy, S. (2013). ‘Are government spending multipliers greater during periods of slack? Evidence from twentieth‐century historical data’, American Economic Review, vol. 103(3), pp. 129–34.
Parker, J.A. (2011). ‘On measuring the effects of fiscal policy in recessions’, Journal of Economic Literature, vol. 49(3), pp. 703–18.
Perotti, R. (1999). ‘Fiscal policy in good times and bad’, Quarterly Journal of Economics, vol. 114(4), pp. 1399–436.
Perotti, R. (2013). ‘The austerity myth: gain without pain?’, in (A. Alesina and F. Giavazzi, eds.), Fiscal Policy after the Financial Crisis, pp. 307–54, Chicago, IL: University of Chicago Press.
Posen, A. (2012). Why is their recovery better than ours? (Even though neither is good enough.) Speech at the National Institute of Economic and Social Research, London, 27 March. Available at: http://www.bankofengland.co.uk/publications/Documents/speeches/2012/speech560.pdf (last accessed: 4 November 2015).
Ramey, V.A. and Shapiro, M.D. (1998).
‘Costly capital reallocation and the effects of government spending’, Carnegie‐Rochester Conference Series on Public Policy, vol. 48(1), pp. 145–94.
Rendahl, P. (2012). ‘Fiscal policy in an unemployment crisis’, Cambridge Working Papers in Economics 1211.
Robins, J.M. (2000). ‘Robust estimation in sequentially ignorable missing data and causal inference models’, Proceedings of the American Statistical Association Section on Bayesian Statistical Science 1999, pp. 6–10, Alexandria, VA: American Statistical Association.
Robins, J.M. and Rotnitzky, A. (1995). ‘Semiparametric efficiency in multivariate regression models’, Journal of the American Statistical Association, vol. 90(429), pp. 122–9.
Robins, J.M., Rotnitzky, A. and Zhao, L.P. (1994). ‘Estimation of regression coefficients when some regressors are not always observed’, Journal of the American Statistical Association, vol. 89(427), pp. 846–66.
Robins, J.M., Rotnitzky, A. and Zhao, L.P. (1995). ‘Analysis of semiparametric regression models for repeated outcomes in the presence of missing data’, Journal of the American Statistical Association, vol. 90(429), pp. 106–21.
Romer, C.D. and Romer, D.H. (1989). ‘Does monetary policy matter? A new test in the spirit of Friedman and Schwartz’, in (O.J. Blanchard and S. Fischer, eds.), NBER Macroeconomics Annual 1989, pp. 121–70, Cambridge, MA: MIT Press.
Romer, C.D. and Romer, D.H. (1997). ‘Identification and the narrative approach: a reply to Leeper’, Journal of Monetary Economics, vol. 40(3), pp. 659–65.
Rosenbaum, P.R.
and Rubin, D.B. (1983). ‘The central role of the propensity score in observational studies for causal effects’, Biometrika, vol. 70(1), pp. 41–55.
Scharfstein, D.O., Rotnitzky, A. and Robins, J.M. (1999). ‘Adjusting for nonignorable drop‐out using semiparametric nonresponse models: rejoinder’, Journal of the American Statistical Association, vol. 94(448), pp. 1135–46.
Schularick, M. and Taylor, A.M. (2012). ‘Fact‐checking financial recessions: US‐UK update’, VoxEU, October 24. Available at: http://www.voxeu.org/article/fact-checking-financial-recessions-us-uk-update (last accessed: 4 November 2015).
Stock, J.H. and Watson, M.W. (2012). ‘Disentangling the channels of the 2007–09 recession’, Brookings Papers on Economic Activity, Spring, pp. 81–156.
Wooldridge, J.M. (1997). ‘Quasi‐likelihood methods for count data’, in (M. H. Pesaran and P. Schmidt, eds.), Handbook of Applied Econometrics, vol. 2, pp. 352–406, Oxford: Blackwell.
Wooldridge, J.M. (2007). ‘Inverse probability weighted M‐estimation for general missing data problems’, Journal of Econometrics, vol. 141(2), pp. 1281–301.
Wooldridge, J.M. (2010). Econometric Analysis of Cross Section and Panel Data, 2nd edn, Cambridge, MA: MIT Press.

Author notes

The views expressed herein are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Federal Reserve Bank of San Francisco or the Board of Governors of the Federal Reserve System.
We thank three referees and the editor as well as seminar participants at the Federal Reserve Bank of San Francisco, the Swiss National Bank, the NBER Summer Institute, the Bank for International Settlements, the European Commission and HM Treasury for helpful comments and suggestions. We are particularly grateful to Daniel Leigh for sharing data and Early Elias for outstanding research assistance. All errors are ours.

© 2015 Royal Economic Society

The Economic Journal, Oxford University Press

The Time for Austerity: Estimating the Average Treatment Effect of Fiscal Policy

The Economic Journal, Volume 126 (590) – Feb 1, 2016
