Friday, April 19, 2013

On Reinhart and Rogoff

If you follow macroeconomics or global economic policy, by now you'd be familiar with the recent blow-up over the so-called Reinhart-Rogoff (R&R) results. Carmen Reinhart and Kenneth Rogoff have compiled data on, and conducted extensive research into, the experience of countries following financial crises, which led to their book This Time Is Different. The central thesis of the book was that financial crises have certain common, hence perhaps predictable, patterns, and that among the biggest challenges in mitigating their occurrence or ensuing damage has been the tendency to believe that "this time is different". The book received plaudits when it was released - many have gone on to say that it's been their "bible" in the current global financial crisis.

One of the themes that R&R covered in their book was the relationship between rising public debt and economic growth. This evolved into an independent line of published research, from which the 'headline' take-away was that countries with public debt/GDP ratios of over 90% experience strikingly slow growth, and thus the 90% level might be an intuitive tipping point beyond which may lie treacherous territory for sovereign borrowing. This argument, though made with suitable academic precaution, has been used or at least invoked by several influential policy-makers in developed economies with debt/GDP ratios exceeding the 90% level, as reason to impose an immediate cutback in fiscal spending. It is obviously at the centre of some of the most important global policy debates of our times.

Two days ago, Rortybomb (Mike Konczal) posted research from three University of Massachusetts economists (Herndon, Ash, Pollin) who tried to replicate the Reinhart-Rogoff results but could not. They found that the R&R results stemmed from a rather messy combination of an Excel coding error, a choice to exclude certain data points, and a weighting scheme that seemed to skew the results and did not seem intuitively defensible. Unsurprisingly, the microcosm of the online world that cares about these things more or less exploded, first with schadenfreude, then with appeals to subtle readings, and then very quickly, as is the wont of online discourse, with more systemic and 'meta' things like the central lessons to be learned from the debacle, the incentives facing economists, etc.

For a quick update on what has been happening, Felix Salmon has a good round-up and Tyler Cowen continues to act as a conscientious and indispensable clearing-house.

A short summary of just what has happened may be useful. First of all, though the schadenfreude is directed mostly at the Excel error, that error itself does not drive the biggest change in the result at hand. Correcting it moves the average growth experienced by countries with debt in excess of 90% of GDP from -0.1% to merely 0.2%. But removing the asymmetric weights and including all episodes changes it further to 2.2%, and that's really where the meat and juice of the debate lies. Rortybomb has further posted analysis by U-Mass economist Arin Dube indicating that the correlation between debt and slow growth actually reflects reverse causality, i.e. it's the slow growth that causes high debt ratios, rather than vice versa.
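To see why a weighting choice alone can move an average this much, here is a minimal sketch in Python. The growth figures are invented purely for illustration and are not taken from R&R or from Herndon, Ash and Pollin; the country labels and function names are likewise hypothetical. As I understand the critique, the weighting at issue gives each country one vote regardless of how many years it spent above 90%, whereas the alternative gives each country-year one vote.

```python
# Hypothetical data only: annual growth rates (%) for country-years
# with debt/GDP above 90%. Invented for illustration, not real figures.
high_debt_growth = {
    "A": [2.5, 2.4, 2.6, 2.5, 2.3, 2.7, 2.5, 2.5, 2.4, 2.6],  # ten high-debt years
    "B": [-7.0],                                              # a single bad year
    "C": [1.0, 1.2, 0.8],                                     # three high-debt years
}


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def country_weighted_mean(data):
    """Average each country's years first, then average the country means.

    Every country gets one vote, however many years it contributes.
    """
    return mean([mean(years) for years in data.values()])


def pooled_mean(data):
    """Average over all country-years, so every year gets one vote."""
    return mean([g for years in data.values() for g in years])


if __name__ == "__main__":
    print(round(country_weighted_mean(high_debt_growth), 2))  # roughly -1.17
    print(round(pooled_mean(high_debt_growth), 2))            # roughly 1.5
```

With these made-up numbers, country B's single bad year counts for as much as country A's entire decade under the first scheme, which is enough to flip the sign of the average. Neither scheme is self-evidently right, which is exactly why the choice needs to be stated and defended.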

The economists themselves responded remarkably fast, and were forthcoming about accepting the coding error. However, they insisted that firstly, they had only claimed association, not causation, and secondly, at ~2%, growth for countries with public debt more than 90% of GDP was still a full percentage point lower than that of countries with public debt less than 90%. The first is the plausible deniability of every sophist's argument - after all, who gets global fame for simply showing that a high ratio is "associated with" a low denominator? It was always the potential causation hinted at which mattered, and the op-eds that Rogoff and Reinhart wrote for popular audiences seemed to imply causation unambiguously. The second may still be relevant, but ~2% vs ~3% simply lacks the jaw-drop value of less than 0% vs ~3%, and is actually not even statistically significant.

But the R&R defence wholly skipped over the deeper issues around including and excluding data points and arbitrary weighting schemata. On this, others have risen, not to their defence per se, but to shift focus and draw attention to the larger issues at hand. I have now seen a bunch of these arguments, and none of them are very convincing.

First up, let's get the 'any data analysis exercise will have to make these choices' bit out of the way. This is true, and for precisely this reason data analysis exercises offer, or should offer, at the very least, footnotes or explanatory pieces detailing these choices and a short note on why they were preferred to other choices. Investment banking analysts are routinely laughed at for plucking out of the air their assumptions about growth projections, industry growth, market share etc., and yet any half-decent analyst would not dare offer his or her work without detailing qualitative reasons for the quantitative assumptions being made and a list of sensitivities. Arguably, the former is more important and should be dealt with in depth by an academic - especially if the choices being made are not prima facie intuitive (unlike, say, removing a 10 sigma outlier). It is worth pointing out that the main reason R&R come up with the 90% figure is that their intervals are 30 percentage points wide, i.e. they split the data into buckets of 0-30%, 30-60% and so on. This is purely an artifact of a modelling choice, and the actual tipping point, assuming any exists, may be 80% or 110% or 93.7%.
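The point about bucket boundaries can be made concrete with a small sketch. The observations below are invented for illustration (they are not R&R's data), and the function name and the alternative bucket edges are my own hypothetical choices. The same eight data points are grouped twice, once with 30-point buckets and once with shifted edges.

```python
from bisect import bisect_right

# Hypothetical (debt/GDP %, growth %) observations, invented for illustration.
observations = [
    (20, 3.8), (45, 3.1), (70, 3.0), (85, 2.9),
    (95, 2.8), (105, 2.6), (115, 1.0), (130, 0.8),
]


def bucket_means(obs, edges):
    """Mean growth per debt bucket.

    `edges` is a sorted list of bucket lower bounds; the last bucket is open-ended.
    Debt ratios are assumed to be at or above the first edge.
    """
    groups = {}
    for debt, growth in obs:
        # Find the largest lower bound that does not exceed this debt ratio.
        lower = edges[bisect_right(edges, debt) - 1]
        groups.setdefault(lower, []).append(growth)
    return {lower: sum(g) / len(g) for lower, g in sorted(groups.items())}


# Buckets of 0-30, 30-60, 60-90 and 90+.
thirty_wide = bucket_means(observations, [0, 30, 60, 90])
# An equally arbitrary alternative: 0-40, 40-80, 80-110 and 110+.
shifted = bucket_means(observations, [0, 40, 80, 110])

if __name__ == "__main__":
    print(thirty_wide)
    print(shifted)
```

In this toy data the real drop in growth happens somewhere between 105% and 115%. The 30-point buckets nonetheless report the fall at "90%", because that is simply where the last bucket starts, while the shifted buckets report it at "110%". The threshold you find is bounded by the edges you chose to look at.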

Here, a good way of going about it might be to set up a graph and see if there are any natural 'inflection points'. If the simple graph between the two variables won't do, consider natural transformations like time-derivatives or logarithms until you get a graph with an inflection point. Ultimately, your data range choices have to make intuitive sense to numerate people. If you have to torture the data too much, it's quite possible you're looking at a very banal reality.

Another set of defences underscores the point that the episode mostly shows the pitfalls of macro analysis owing to the small data sets available. While true, this is inapplicable to the issue at hand. This is a problem with all data sets that show structural transformation, even if they go back a long time, and so with all data sets that concern themselves with the socio-commercial reality of our society. R&R stand questioned not because a prediction of theirs came unhinged owing to a small data set. Rather, given the small data set, they chose to transform it in such a manner that very different conclusions were reached than if those transformations had been eschewed. These transformations are reminiscent of the sham that passes for the concept of risk-weighted assets when calculating capital ratios for banks - I contend that the fineness achieved was outweighed by the robustness lost due to it, and that this would have been true even ex-ante.

Finally, some have made the point that R&R are simply reacting to the incentive system in academic economics and popular commentary. Economics, it turns out, is a discipline that does not prize replication of results the way hard sciences do, and prizes novel insight more, so that errors of the type R&R made are rife through the profession. R&R might actually be ahead of the curve by engaging with the criticism so openly. Further, popular commentary obviously rewards strong claims more than agnostic and possibly sterile data inference exercises. Ergo, the choices that R&R made follow. Even if these arguments are true, one is forced to ask - should we then resist the urge to trash R&R or simply add the urge to trash the entire economics profession to the mix? Moreover, is it alright to extend to Kenneth Rogoff and Carmen Reinhart the leeway in standards that we may be able to afford an up-and-coming popular essayist or a graduate student seeking a job and tenure? Rogoff has been the chief economist of the IMF, is a member of the super-elite Group of 30 and was a student of Rudiger Dornbusch (who advised several LatAm governments in the '80s) and Stanley Fischer (possibly the macroeconomist with the most star-studded policy experience ever) during the phase at MIT which has produced some of the most prominent international macroeconomists of our era (Bernanke, Obstfeld, Krugman, Frankel). He is a plausible contender for the topmost policy jobs in a Republican administration. Reinhart is somewhat less entrenched, but is also a tenured professor at Harvard, a top-10 authority on sovereign debt and default especially in emerging nations, and brought the term 'financial repression' into the modern macro lexicon. By any stretch, they are about as elite as the academic macropolicy elite gets, and remember that these are days when macroeconomists are disproportionately at the centre of the global policy elite.

No, this defence from diminished expectations won't cut it. To check how diminished our expectations may have become, see this otherwise brilliant Neil Irwin piece where he seems thankful to R&R for simply compiling and collating the data that backed their broader research.

Reinhart and Rogoff did a shoddy job and deserve most of the ridicule coming their way.