The “Factor Zoo”: Some Thoughts On “Is There A Replication Crisis In Finance?”

A few months ago three researchers published an astonishingly ambitious and compendious paper called “Is There a Replication Crisis in Finance?” (Their names are Theis Jensen, Bryan Kelly, and Lasse Pedersen; two are at the Copenhagen Business School, one is at Yale, and two also work for AQR Capital Management.) It attempts to refute several recent papers that have said that there is indeed a replication crisis in finance: that researchers today are unable to replicate the findings of earlier researchers who claimed that going long and/or short certain factors results in improved returns. This new paper appears to prove that factor-based investing actually works. As the authors write,

Our findings challenge the dire view of finance research. We find that the majority of factors do replicate, do survive joint modeling of all factors, do hold up out-of-sample, are strengthened (not weakened) by the large number of observed factors, are further strengthened by global evidence, and the number of factors can be understood as multiple versions of a smaller number of themes. At the same time, a non-trivial minority of factors fail to replicate in our data, but the overall evidence is much less disastrous than some people suggest.

In this article I’m going to summarize their paper and talk about the various factors that they tested, what factors they didn’t test, their factor classification system, some things about their research that remain vague, and some useful conclusions about the results.

The Difference Between This Paper and Previous Ones

What do these researchers do differently from the authors of the you-can’t-replicate-this papers?

  1. They use one-month holding periods rather than six- or twelve-month periods. (This makes sense to me, as it provides a level playing field for factors that have short and long look-back periods.)
  2. They use terciles rather than deciles. (Earlier papers claimed that a factor didn’t work if the top tenth of stocks ranked by the factor failed to beat the bottom tenth; these authors say that a factor works if the top third beats the bottom third. This is a broader and somewhat more forgiving measure.)
  3. They test in 93 different countries. (This is one of the signal merits of this study.)
  4. They use value-weighted (cap-weighted) results, but winsorize market caps at the 80th percentile of NYSE stocks, so that massive firms don’t overwhelm the rest. (This makes sense especially if you're designing a system that works for large caps.)
  5. They exclude factors that the original researchers found insignificant (or at least they say they do, but they end up including a few anyway).
  6. They measure success for a factor by looking at its alpha rather than its raw return. (This is, in my opinion, exactly the way it should be measured.)
  7. They use a Bayesian approach to factor evaluation based on the prior assumption that alpha is zero. This, together with considering all factors simultaneously, naturally lowers the p-value threshold for factor success.
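To make items 2 and 4 concrete, here is a minimal sketch of a one-period tercile long-short return with capped value weights. It is an illustration under stated assumptions, not the authors' code: the field names (`signal`, `cap`, `ret`), the function names, and the `cap_limit` parameter (a stand-in for the NYSE 80th-percentile market cap) are all hypothetical, and the data is made up.

```python
def capped_value_weighted_return(stocks, cap_limit):
    """Average return weighted by market cap, with each cap limited to cap_limit."""
    weights = [min(s["cap"], cap_limit) for s in stocks]
    total = sum(weights)
    return sum(w * s["ret"] for w, s in zip(weights, stocks)) / total


def tercile_long_short(stocks, cap_limit):
    """Return of the top third of stocks by signal minus the bottom third."""
    ranked = sorted(stocks, key=lambda s: s["signal"])
    n = len(ranked) // 3
    if n == 0:
        raise ValueError("need at least three stocks")
    bottom, top = ranked[:n], ranked[-n:]
    return (capped_value_weighted_return(top, cap_limit)
            - capped_value_weighted_return(bottom, cap_limit))


# Made-up one-month data: signal value, market cap ($ millions), monthly return.
stocks = [
    {"signal": 0.1, "cap": 100, "ret": -0.02},
    {"signal": 0.2, "cap": 1000, "ret": 0.00},
    {"signal": 0.3, "cap": 50, "ret": 0.01},
    {"signal": 0.4, "cap": 200, "ret": 0.01},
    {"signal": 0.5, "cap": 5000, "ret": 0.03},
    {"signal": 0.6, "cap": 100, "ret": 0.01},
]

# Hypothetical cap limit of $300 million, standing in for the NYSE 80th percentile.
spread = tercile_long_short(stocks, cap_limit=300)
print(f"tercile long-short return: {spread:.4f}")
```

Without the cap, the $5,000 million stock would carry about 98% of the weight in the top tercile; with it, that stock's weight drops to 75%. A decile sort would change only the `// 3` to `// 10`.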

As a result, their overall out-of-sample success rate for factors tested in academic papers is a massive 85%. In addition, they find that higher in-sample alphas correspond to higher out-of-sample alphas. Their conclusion: academic research into factors is totally valid.

Why Out-of-Sample Results are Rarely Higher than In-Sample Results

As anyone who does any backtesting can attest, in-sample alpha is almost always higher than out-of-sample alpha, and this is confirmed by the authors’ testing. The authors don’t explain why, though, so I thought I would.

  1. Regression to the mean. One must start with the assumption of zero alpha, that is, the assumption that the market is either quite efficient or quite random. In both cases, betting on a factor will prove unprofitable. A researcher will backtest several factors and publish the results for those that work best. Some of that in-sample success is luck, and luck does not repeat. Because of the statistical law of regression to the mean, factors that work best over one period are unlikely to work best over another period.
  2. Arbitrage. Once a factor has been published, investors are going to try to use that factor, thus arbitraging away its effect. For example, let’s say I publish an academic paper that touts a new factor, Factor X, that nobody has used before. A number of investors read that paper and decide to go long stocks with high levels of Factor X and short stocks with low levels of Factor X. If enough investors do this, the prices of stocks with high levels of Factor X will rise and the prices of stocks with low levels of Factor X will fall. After a short while, high–Factor X stocks will be, on the whole, quite expensive and low–Factor X stocks will be quite cheap. This will drastically reduce the profitability of the Factor X–based investing strategy. Very widely used factors such as book-to-market, price-to-earnings, price-to-sales, and return on equity may have been mostly arbitraged away.
  3. Changes in market structure. There are always going to be fundamental structural changes in market conditions. For example, the elimination of trading commissions has enabled much more frequent placement of small orders. The ready availability of fundamental and aggregate estimate data has enabled a far greater number of people to trade according to factor analysis. The rise of the internet has created huge and fundamental changes in the retail, business-to-business, communications, and technology industries. The creation of new kinds of securities like ETFs and SPACs has fundamentally altered the way people invest. All of these changes create conditions in which replicating the success of past factors will be difficult.
  4. Manipulation of data. The more certain factors become important to shareholders, the more financial officers at companies will try to give the shareholders what they want. Companies are given rather broad discretion in reporting various expenses, and there is strong evidence that they manipulate earnings, EBITDA, and free-cash-flow numbers to influence their stock price. This makes it harder to replicate findings based on those numbers.
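The first reason, regression to the mean, can be illustrated with a small simulation. This is a sketch under loud assumptions, not anything from the paper: every factor has a true alpha of exactly zero, monthly returns are independent normal noise, and the function name and all parameter values (200 factors, 120 months, 3% monthly volatility, 10 factors "published") are made up for illustration.

```python
import random
from statistics import fmean


def simulate_selection(n_factors=200, n_months=120, monthly_vol=0.03,
                       n_picked=10, seed=0):
    """Pick the best in-sample factors when every true alpha is zero.

    Returns the average in-sample and out-of-sample monthly return
    of the n_picked factors with the highest in-sample average.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_factors):
        # Both periods are pure noise around zero: no factor has any real edge.
        in_sample = fmean([rng.gauss(0.0, monthly_vol) for _ in range(n_months)])
        out_sample = fmean([rng.gauss(0.0, monthly_vol) for _ in range(n_months)])
        results.append((in_sample, out_sample))

    # "Publish" only the factors that looked best in the backtest.
    picked = sorted(results, key=lambda r: r[0], reverse=True)[:n_picked]
    return fmean([r[0] for r in picked]), fmean([r[1] for r in picked])


in_avg, out_avg = simulate_selection()
print(f"picked factors, in-sample average monthly return:     {in_avg:.4%}")
print(f"picked factors, out-of-sample average monthly return: {out_avg:.4%}")
```

The picked factors look good in sample only because they were selected for looking good; out of sample their average falls back toward zero. Real factors may have genuine alpha on top of this effect, which the simulation deliberately leaves out.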

Disclosure: My top ten holdings right now: STRT, CTG, PMTS, RMNI, HBP, EVC, WFG, INSE, RCKY, TGLS.
