Here is part three in my series. After this post there will be a final part four. In between parts three and four, however, I will publish a new post on Ricardo’s theory of rents and financial markets, a personal view of markets that I have been developing. The current series, while I have put a great deal of work into it, is based upon University of Chicago professor John Cochrane’s seminal market theories and analysis.
Most active management and performance evaluation just is not well described by the alpha-beta, information-systematic, selection-style split anymore. There is no “alpha.” There is just beta you understand and beta you don’t understand, and beta you are positioned to buy vs. beta you are already exposed to and should sell. -John Cochrane
In the previous post on efficient markets, I explained the traditional view of efficient markets and how it has evolved. I then extrapolated this to the current financial industry, investment firms, and what it means to actively beat a market. I would now like to focus on the academic literature and efficient market theory that has continued to develop since its foundation in the 1970s. Most investors stop their study of finance far too early. After learning the basics of the efficient market hypothesis, they move on to the Wall Street Journal, The Economist, Zero Hedge, and the Financial Times. And as these students read more, the efficient market hypothesis becomes only a lingering memory of an overly abstract theory. This is unfortunate. While the introduction to investments is often the same across the board, there is less consensus and many divergent schools of thought beyond this common starting point.
I will primarily be using John Cochrane’s notes on predictability, available on his website, as my reference. The first goal for most aspiring investors is to search for patterns and then use those patterns to make predictions about financial markets; in other words, to make money. Research built upon market theory and used to make predictions is mostly empirical: instead of considering how markets ought to behave based on human nature and economic intuition, we use data to understand how they actually work and to look for patterns. In finance and economics a large weight is placed on theory due to the inability to conduct controlled experiments. However, now that we have a robust theory of markets, we can start to place empirical results in full context.
The primary tool of empirical research in investments is the following equation: R(t+1) = a + b1x1(t) + b2x2(t) + e(t+1). This is a simple regression equation. If financial economists were engaged in a battle to make predictions about financial markets, this equation would be the standard-issue M16 rifle. The primary point of my posts is to offer a non-mathematical understanding of financial markets, but I have decided one equation will not hurt. Below I explain the separate parts of the regression in the context of financial analysis.
R(t+1): This is the expected return in the following period (this could be any time interval we choose, such as one day or one year). “t” is the current period and therefore “t+1” is the following period.
b1x1(t): This is the first beta we use to forecast our left-hand variable R(t+1) using current information. The most commonly used beta in financial economics is the ‘market factor,’ which measures how much an asset covaries with a chosen market index. It is so popular that it is often referred to as simply ‘beta,’ though when using multiple betas we are sure to specify that it is the market factor beta. A common error in investment analysis is to accept beta as a robust, catch-all risk measurement. The beta of a given asset or portfolio will vary with the market index it is measured against as well as the time frame specified. For example, the beta of Apple will differ if it is measured relative to the S&P 500 as opposed to the Barclays Aggregate Bond Index. In this case the S&P 500 makes more intuitive sense, though there is no ‘correct’ answer. Interpretation of beta should also consider the return distribution of the index, the overall volatility, and the fact that beta risk isn’t perfectly linear. While we tend to assume something close to a normal distribution for equities, in the case of options or derivatives a market beta value can be misleading. Very low and very high beta stocks are often less reliable measurements for various reasons.
b2x2(t): This is simply a second factor with its own beta. A regression can have one, two, or many betas.
e(t+1): This is the error term (or disturbance term). We assume it is normally distributed with a mean of zero. It captures variables (betas) we have not considered, measurement errors, unpredictable effects, and nonlinearities.
a: This is the alpha term, formally titled Jensen’s alpha after Michael Jensen’s seminal paper. In a regression on investments, this term will be positive if the investor received greater return than the risk suggests, and negative if the investor did not receive return commensurate with risk. There is no magic in this term and the math is simple: if the return was 10% and the beta factor(s) can only explain 9%, the residual 1% is assigned to alpha. The leap from that 1% to calling it riskless profit is massive; it might have been due to market volatility or beta factors that were not included in the model.
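To make the mechanics concrete, here is a minimal sketch of the single-factor regression described above, using ordinary least squares. The fund and market return series are hypothetical numbers invented for illustration; the intercept of the fit is Jensen’s alpha.

```python
import numpy as np

# Hypothetical annual excess returns for a fund and the market factor.
fund = np.array([0.12, 0.05, -0.08, 0.15, 0.02, 0.09, -0.03, 0.11])
mkt  = np.array([0.10, 0.04, -0.09, 0.12, 0.01, 0.07, -0.05, 0.08])

# Design matrix with a column of ones: the intercept is Jensen's alpha.
X = np.column_stack([np.ones_like(mkt), mkt])
coef, *_ = np.linalg.lstsq(X, fund, rcond=None)
alpha, beta = coef

# For a single factor, the slope equals cov(fund, mkt) / var(mkt),
# which is the usual definition of market beta.
beta_check = np.cov(fund, mkt, ddof=1)[0, 1] / np.var(mkt, ddof=1)

# Whatever return the market factor cannot explain lands in alpha.
print(f"alpha={alpha:.4f}, beta={beta:.3f}")
```

Note that the positive intercept here says nothing by itself about skill: with eight data points it could easily reflect luck or an omitted risk factor, which is exactly the caveat discussed above.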
The hunt for alpha is so hyped that there are swaths of investors (and students) hoping to find it, many not even aware how it fits into the above regression. To identify alpha an investor must first create a regression model that, ideally, has a beta for each relevant risk factor. This model would be run over a sufficiently large sample and time period, with the goal of determining whether the investor consistently received positive alpha, suggesting returns above the expected returns for his or her risk exposure. The unfortunate part of this model is that risk factors are difficult to observe and construct. Consider that a risk factor must isolate and measure a significant and constant level of risk observable in all similar assets, for which investors demand compensation.
The market beta factor, the darling variable of the CAPM, has many faults. Low beta stocks often receive significantly higher returns than the model anticipates, and high beta stocks often receive significantly lower returns. Demand for compensation for beta risk also varies over time: events such as recessions push investors toward risk aversion, so if a recession has recently occurred the market beta factor may suddenly lose efficacy. Despite these drawbacks, the market beta factor is still the most dominant and dependable risk factor. This is due in part to the point I made in part one: it is micro-founded in individual choice, so there is no need to guess why beta works.
However, there are other notable beta factors. Fama and French noticed an issue with the market beta. When they formed 10 portfolios sorted on size and book-to-market ratios, the mean portfolio returns covaried with exposure to these two factors, while the market beta showed insignificant covariance. This suggested that there were additional dimensions to risk not being captured by the market beta, so Fama and French added two more factors. Size is measured by small cap vs. large cap, with the notion that small cap is riskier and earns higher returns. Value is measured by the book-to-market (BtM) ratio, with the notion that firms with a high BtM offer higher returns. These high BtM stocks are called ‘value’ stocks, in contrast to low BtM stocks, which are called ‘growth’ stocks. There is one last commonly used risk factor, developed by Carhart, called momentum, which uses past returns as a predictor of future returns.
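The multi-factor version of the regression is a direct extension of the single-factor case. Below is a sketch of a Carhart-style four-factor fit; the factor series, the fund’s loadings, and the noise level are all simulated assumptions for illustration, not real market data.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120  # ten years of hypothetical monthly observations

# Simulated factor return series standing in for market, size (SMB),
# value (HML), and momentum (UMD) factors.
factors = rng.normal(0.0, 0.04, size=(T, 4))

# Simulate a fund whose returns load on all four factors plus noise,
# with no true alpha at all.
loadings = np.array([1.0, 0.4, 0.3, 0.2])
fund = factors @ loadings + rng.normal(0.0, 0.01, size=T)

# Regress fund returns on the factors; the intercept is the alpha estimate.
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, fund, rcond=None)
print("alpha estimate:", round(coef[0], 4))
print("estimated loadings:", np.round(coef[1:], 2))
```

Because the simulated fund earns all of its return from factor exposure, the fitted alpha comes out near zero, which is the point of the exercise: once the right betas are in the model, the “skill” term evaporates.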
The final result: when a full regression is run on a series of investments, all returns not explained by these risk factors are attributed to alpha, and alpha then becomes interpreted as skill. As I mentioned before, alpha must by definition be zero across the entire market. Not every investor can receive higher returns for their risk; for every investor with positive alpha, other investors must have negative alpha. However, this theory presupposes that all risk factors are observable and measurable. The truth of the matter is that different firms use different risk factors, and even the best models fail to capture some. For example, a firm might have exposure to a few risk factors it does not include in its model, and so attribute to alpha what ought to be attributed to beta.
In addition, a large sample of active investing is required to determine that a positive or negative alpha is consistent enough to be attributed to an investor’s ability. Very loosely speaking, it might take five years to even make an educated guess that a manager is beating the market, and longer still to feel safe concluding a manager has talent. The amount of time required tends to be painfully long. Even then, with thousands of managers and investors, a few are bound to land in the 99.99th percentile by luck alone. As a result investors tend to search for the traits of a typical good manager, which is likely one reason for the obsession with prestige and pedigree in investments. A firm might be unable to show statistically significant alpha, but it can show a team of highly educated and trained investors.
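The role of luck can be illustrated with a small simulation: give every manager exactly zero skill and count how many still look skilled after five years. The manager count, return volatility, and 5% threshold below are all invented assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_managers, n_years = 10_000, 5

# Every manager has zero skill: annual alpha is pure noise with a
# 10% standard deviation and a mean of zero.
annual_alpha = rng.normal(0.0, 0.10, size=(n_managers, n_years))
avg_alpha = annual_alpha.mean(axis=1)

# Count managers whose five-year average alpha exceeds 5% a year
# purely by chance.
lucky = int((avg_alpha > 0.05).sum())
print(lucky, "of", n_managers, "zero-skill managers look skilled")
```

With these assumptions, well over a thousand of the ten thousand managers clear the 5%-a-year bar despite having no skill at all, which is why short track records are nearly useless as evidence of talent.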
An interesting example to start with is insurance. In fact, insurance products are often traded as assets on financial markets. Home insurance is an investment with a negative expected return but a positive value: we don’t expect to receive a positive return, but we value it enough to buy it anyway. When paying for insurance we are transferring different risk factors to an insurance firm: the risk of fire, theft, earthquake, or other negotiated perils. Each one of these is a beta factor. We must pay another party to take our risk, yet value is created for society as a whole by spreading risk out. The insurance company’s goal is to offer you the cheapest deal it can while being sure it is compensated for all the risk factors it absorbs. Now suppose the insurer decides to cover all water damage your house suffers. It considers the cost of your house, its size, and the other variables necessary to price the water damage risk factor. However, it might miss information. Perhaps the purchaser of insurance lives near a swamp. In that case the insurance company is not being adequately compensated for the risk it has accepted: it has agreed to pay for water damage, but the chance of water damage is higher than it expects. In this situation the homeowner receives insurance alpha. They are having all their risk taken away for less than it ought to cost.
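The swamp story can be put in numbers. Here is a stylized sketch of risk-factor pricing for a home policy; every probability, loss amount, and the 20% load are invented for illustration.

```python
# Each peril is a "beta factor": a probability of loss and a loss size.
# All figures are hypothetical.
risks = {
    "fire":       (0.002, 200_000),
    "theft":      (0.010,  10_000),
    "earthquake": (0.001, 150_000),
    "water":      (0.005,  20_000),  # insurer's (mistaken) estimate
}
expected_loss = sum(p * loss for p, loss in risks.values())
premium = expected_loss * 1.20  # 20% load for costs and profit

# The swamp-side homeowner's true water-damage probability is 2%,
# four times what the insurer priced in.
true_expected_loss = expected_loss - 0.005 * 20_000 + 0.020 * 20_000
print(f"premium: {premium:.0f}, true expected loss: {true_expected_loss:.0f}")
```

Under these assumptions the homeowner pays a premium of about 900 while transferring roughly 1,050 of expected loss: the gap is the “insurance alpha” described above, and it persists only until the insurer discovers the swamp variable.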
A year later the cost of insurance increases for the family. The incredibly clever team of statisticians at the insurance company has realized that proximity to swamps is an important variable in water damage. Despite shopping around, the family finds this new information has flooded the market. They are no longer receiving more benefit than risk (alpha); instead they must now pay more to have another party take on their risk (beta). The source of the family’s alpha became common knowledge, and as a result turned into beta.
Now that I have covered market efficiency, market alpha, market beta, and both the simplifications and reality of our tools, I would like to move on to the empirics and cover predictability.
Excess stock returns are unpredictable based on past price movements. A regression of annual returns on lagged returns, from 1927-2008, shows a predictability beta of 0.04. What this means is that the annual returns over this period were gathered into a data series, which was then ‘copied’ and lagged one year, the goal being to see whether last year’s return helps forecast the current year’s return. With a beta coefficient of 0.04, if the return was 10% last year, we expect only (0.04)*(10%) = 0.4% of additional return from “momentum.” In addition the R^2 value, a statistical measure of the proportion of return variance that can be forecast one year ahead, is near zero. Predictability of returns is equally unimpressive on time intervals shorter than a year. Generally it is of near zero value to forecast future price movements from past prices alone. There are some strategies that consider isolated momentum in high volatility periods over fractions of a second, and others that focus on the intricacies of mutual fund pricing, but these clever traders use far more robust models than simple annual return predictions over 80 years. And while some pure momentum models do exist (predicting future price from past price alone), momentum is nearly always one of many factors in a full model. It is also important for traders to respect the reality that of the hundreds of thousands of regressions run on past data, some will show good results simply through cherry-picking or statistical luck, and that variables which once predicted excess returns may no longer do so.
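The lagged-return regression is easy to reproduce on simulated data. The sketch below draws 82 independent annual returns (so past returns carry no information by construction) and runs the same regression; as with the 1927-2008 data, both the lag beta and the R^2 come out near zero. All numbers are simulated assumptions, not Cochrane’s data.

```python
import numpy as np

rng = np.random.default_rng(7)

# 82 hypothetical annual excess returns, drawn i.i.d. so that past
# returns carry no information about future returns.
returns = rng.normal(0.08, 0.20, size=82)

# Regress each year's return on the prior year's return.
current, lagged = returns[1:], returns[:-1]
X = np.column_stack([np.ones_like(lagged), lagged])
(a, b), *_ = np.linalg.lstsq(X, current, rcond=None)

# R^2: the share of next-year return variance the lag explains.
resid = current - X @ np.array([a, b])
r2 = 1 - resid.var() / current.var()
print(f"lag beta={b:.3f}, R^2={r2:.3f}")
```

The small nonzero beta estimate that comes out of a run like this is exactly the kind of statistical noise that, in real data, tempts researchers into seeing momentum where there is none.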
Conversely, the T-bill market is highly predictable. This isn’t surprising. The Federal Reserve is likely to use the previous year’s interest rate as a base before attempting any adjustments. As a result the lagged ‘momentum’ beta for T-bills is 0.91. This does not violate market efficiency: even if you know T-bill returns will be high the following year, you must still borrow at that same high rate. Whereas if you know stock returns will be higher than the T-bill rate in the following year, you can borrow at the T-bill rate and invest in the market with full clairvoyance that the spread will be profitable.
While excess equity returns cannot forecast themselves with any real consistency, other variables have shown an ability to forecast future expected returns. For example, dividend yields vary over time. A dividend yield is the current annual dividend divided by the stock price: a high dividend yield means the dividend is high relative to the price, and a low yield means it is low relative to the price. If a stock has a low dividend yield relative to its long-run average, we might guess that future dividends will be higher than past dividends. The reason is that the stock price incorporates all expected future dividend increases, whereas the current dividend is simply what was last paid. For example, if a firm announces that future dividend payments will be double the current payments due to unexpected success, this does not affect the past dividend payment, but it will increase the price of the stock. In this situation the dividend yield will be lower than its long-term average, and we should expect future dividends to be higher, bringing it back to its average. Conversely, if the dividend yield is high we might expect future dividends to be lower.
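The announcement effect above can be sketched with hypothetical numbers; the dividend, prices, and the size of the price jump are all invented for illustration.

```python
# A stylized example of a dividend announcement's effect on yield.
dividend = 2.00                   # last annual dividend per share
price = 50.00                     # current share price
yield_before = dividend / price   # trailing yield before the news

# The firm announces future dividends will double. The trailing
# dividend is unchanged, but the price rises to reflect the news.
price_after = 80.00               # hypothetical post-announcement price
yield_after = dividend / price_after

print(f"yield before: {yield_before:.1%}, after: {yield_after:.1%}")
```

The yield falls from 4.0% to 2.5% even though nothing about past payouts changed, which is why a below-average dividend yield tends to signal higher expected future dividends rather than a bargain.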
The dividend yield has the ability to forecast future returns and to forecast itself. The statistical significance is not high, but it is material. This fits with our theoretical economic understanding of the market: we cannot be clairvoyant, but we can use variables to improve our understanding of future returns. The dividend yield works as a great starting point, and Cochrane uses it as an initial example due to its simplicity and clear statistical evidence. However, there are thousands of papers on thousands of variables, all attempting to predict the future. A casino does not need to win every time to make money, only 51% of the time. I will not include the quantitative data in this post; I suggest reading Cochrane’s notes on predictability for a full treatment.
It is now important to reconcile the ability to forecast future expected returns (discount rates) with market efficiency. At first blush predictability seems terribly incompatible with market efficiency, yet the two end up fitting together perfectly well. Firstly, market efficiency does not require constant returns; variation is allowed. We can theorize that this variation is simply a function of changing discount rates, or risk. While the dividend yield holds predictive power, it is possible that when the dividend yield rises, risk rises as well, which would explain the higher expected return. If this were the case, markets would simply be continuing to behave rationally: dividend yield forecasting would let us identify when risk is higher, with the rational expectation that returns ought to be higher in periods of higher risk.
While this is congruent with efficient markets, it is a different perspective than what was originally thought in the 1970s. The original theory was that returns are constant, meaning predictability is impossible. If returns are constant and equity returns are a purely random walk there is by definition no good or bad time to invest. As a result any characteristic or ratio (such as the dividend price ratio) meant nothing about whether an investor ought to buy or not. After all, if returns are constant valuation ratios should simply reflect efficient beliefs for an investor. Since Cochrane phrased his point so succinctly and clearly, I will include an excerpt.
“This is a huge change in viewpoint from the classic efficient markets/constant returns view, circa 1980. We used to think that expected returns are constant; stocks are a random walk; there is no “good time” to invest or “bad time”. Now, of course, prices move around. Isn’t a low [P/D] a good “buying opportunity?” No, we would have said, low [P/D] happens when people expect declines in dividend growth. Variation in P/D [price to dividend] occurs entirely because of cashflow news. What we see in these results is exactly the opposite. Now we think that market P/D variance corresponds 100% to expected return news, and none at all to cashflow news. (Prices decline when current dividends decline of course.) (Things get even stronger when we add more variables; technically this result refers to forecasts using only [the dividend yield].) In this sense, our view of the world has changed from 100% / 0 to 0 / 100%.” -John Cochrane, “Notes on Predictability.”
The prime point of this passage is that valuation ratios act differently under a microscope. The original market theory was wrong: it stated that returns were constant and valuation ratios merely reflected rational investors considering future business conditions. As we see, returns are not constant, and valuation ratios can be used to predict future returns. Moreover, these ratios tend to be related far more to market discount rates than to firm-specific features. P/D variance is tied to expected-return news, not actual future dividend news: if the P/D ratio is high or low, it does not mean future cash flows or dividends will change; instead we can expect the price to change. Different valuations help us anticipate the return environment we are in, and this is what it means for returns to be predictable.
Stepping aside from ratios, it is also possible to understand how finding different risk factors, and isolating them in a portfolio, can increase wealth for all investors. For example, a hedge fund might increase its exposure to emerging-market currency risk. As we know, in markets this will be compensated with a corresponding return. If this risk turns out to have a low correlation with traditional asset classes, such as US stocks, taking purposeful exposure could be a novel and useful strategy. It is possible that while emerging-market currency risk exists, and those who hold it are compensated, there is no way to gain pure exposure to it. Another hedge fund, investing in emerging-market firms with strong fundamentals as value investors, will be gaining marginal exposure to currency risk it does not want. This second hedge fund might engage in a currency swap with the first, which does want the risk. Despite the complex financial tools involved, the end economic reality is similar to a farmer selling livestock manure to a fertilizer firm: the firm that does not want currency risk uses financial tools to unload it on the firm that does. And savvy investors may now invest in the currency hedge fund to gain returns with low correlation to their more traditional asset classes (which carry more traditional risk, like the US business cycle).
The point of this example is that the world of active investing and hedge funds does not fall apart under a reasonably efficient market. It is viable for hedge funds to actively trade on what we typically consider ‘beta’ factors. And while some hedge funds do make absurd amounts of money per year (and these are few and far between), this can be rationalized: when spending hundreds of millions on infrastructure, programmers, physicists, and economists, it is possible to gain an edge by maintaining meticulous exposure to exactly the right risk factors at exactly the right moments. Yet while this might provide an edge, it also leaves these funds teetering on the edge of a cliff. It is not unknown for the most brilliantly guided hedge funds to blow up, as Long-Term Capital Management displayed to the world of financial markets.