Learning vs. Signalling

Just under two years ago I started working at the Federal Reserve in Financial Research. My year at the LSE before the Fed had been intense, and it was essentially the first time in my life that I began thinking mathematically. After the LSE I wanted more academic training before I left the world of academia for good. At the Fed I’ve studied everything that interests me, both during work and outside of work, without obsessing over how I will accurately signal my knowledge for my next job. I simply wanted to focus on becoming smarter. I knew that everything I was learning in the world of computational finance, scientific inference, statistics, programming, and math was valued in other jobs. As a result I worked on plenty of personal projects, completed lots of segments of Coursera courses that were useful, and began to enjoy my new ability to learn new disciplines efficiently. These two years have been great. Now it’s time to start shamelessly signaling. It is challenging to find a balance between developing my own skills on my terms and finding credible ways to prove what I know to others. Time to restore balance, abandon life as San Francisco Economic Monk, and enter the world of Business.

For my own sake, here is a list of everything I’ve done and taught myself at the Fed (both in and out of work) over the past two years.

1.) Math:

When I began working at the Fed I had a weak foundation in mathematics. It appears to me that the majority of Economics and private sector quantitative work sticks to linear models. The Financial Economics team I am on is one of the more mathematically intensive teams at the Fed, since yield curves and asset pricing exist in the world of Stochastic Differential Equations (SDEs), continuous time, and filtering.

[Image: A non-homogeneous ODE, useful in asset pricing.]

I began with multivariable integrals. This was a topic I had briefly covered before, and one I needed to be better at in order not to struggle with probability distributions. I primarily used Stewart’s textbook, which doesn’t go into proofs but focuses instead on how to work through computations. I spent about 100-120 hours just doing integrals by hand this way after work. I then moved on to ordinary differential equations (ODEs), because this was the technique used to model the yield curve (technically the yield curve is modeled with SDEs, but they reduce to ODEs with a few theorems).

My first goal was to learn how to solve this equation:

[Image: From An arbitrage-free generalized Nelson-Siegel term structure model.]

For the first six months of living in San Francisco, before Alison joined me, every weekend I would wake up, walk to the park near my place, and lie on the grass doing math for a few hours. Eventually I decided I wanted to be even better at math, and began studying the proofs underlying calculus. I’m not sure if this was a good use of time. I had hoped it would give me a richer ability to follow practical applications, and I’m not sure that it did. At least it gave me a small peek into the beauty of math, which I hadn’t fully appreciated.

Otherwise I continued to become better at Matrix Algebra, both because I used it at work and because in practical usage it’s really not that hard. I spent about three months in 2016 teaching myself introductory SDEs and Brownian motion, since I wanted to have a better understanding of the models that we work with at the Fed.

2.) The Paper:

At the Fed my boss Jens Christensen put me as a co-author on a (working) paper called The TIPS Liquidity Premium. This is an incredible paper that attempts to filter out the liquidity premium on Treasury Inflation Protected Securities (TIPS) issued by the U.S. Government. When we succeed, we will have the least biased measure of the U.S. real rate in the field. This is an interesting question because the TIPS market has historically not been as heavily traded as the market for normal Treasuries. This means that if you buy a TIPS, you are assuming some risk: if you want to sell it immediately in the future, you may not be able to find a buyer. And if you desperately need to sell it, you may only find a buyer by lowering your price below its true value. Because of this fear, the buyer of a TIPS from the U.S. Government will require a slightly better price as compensation for this risk. For example, today the 5-year constant maturity TIPS is trading at a yield of 39 basis points, and the 5-year Treasury at 176 basis points. The difference, 137 basis points, is considered to be break-even inflation, which is what expected inflation has to be for the two asset prices to be in equilibrium with one another. The TIPS yield here represents the real rate of intertemporal substitution in the US economy.
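The break-even arithmetic above can be written out as a short Python sketch. The function names are hypothetical, and the 20 basis point liquidity premium is a made-up number used only to show the direction of the adjustment; it is not an estimate from the paper.

```python
def breakeven_inflation_bp(nominal_yield_bp: float, tips_yield_bp: float) -> float:
    """Observed break-even inflation in basis points: nominal yield minus TIPS yield."""
    return nominal_yield_bp - tips_yield_bp


def liquidity_adjusted_breakeven_bp(
    nominal_yield_bp: float, tips_yield_bp: float, liquidity_premium_bp: float
) -> float:
    """Break-even inflation after stripping an assumed liquidity premium out of the TIPS yield.

    Assumption: the observed TIPS yield equals the real yield plus a liquidity premium,
    so removing the premium lowers the real yield and raises break-even inflation.
    """
    real_yield_bp = tips_yield_bp - liquidity_premium_bp
    return nominal_yield_bp - real_yield_bp


# Yields quoted in the text, in basis points.
nominal_5y_bp = 176.0
tips_5y_bp = 39.0

observed = breakeven_inflation_bp(nominal_5y_bp, tips_5y_bp)
print(f"Observed break-even: {observed:.0f} bp ({observed / 100:.2f}%)")

# Purely illustrative premium; NOT a result from the paper.
hypothetical_premium_bp = 20.0
adjusted = liquidity_adjusted_breakeven_bp(nominal_5y_bp, tips_5y_bp, hypothetical_premium_bp)
print(f"Adjusted break-even with a {hypothetical_premium_bp:.0f} bp premium: {adjusted:.0f} bp")
```

The point of the second function is only that a positive liquidity premium in TIPS yields makes the raw yield difference understate break-even inflation, which is why filtering the premium out matters.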

I rewrote the computational heart of the model from R to C++ to improve the speed. This has been my biggest project over the past two years, and the one where I have learned the most. I began working at the Fed barely able to code in R. I had a project that required me to take a heavy filtering model in a language I barely knew, and rewrite it in a language I didn’t know at all, at a point where I had barely even learned how a loop works. I started with ‘Hello World’ in C++, and ended with a few hundred lines of dense mathematical code in C++ that I compiled into an R package. In between was a year of nights and weekends of constant frustration with occasional profound excitement.

[Image: The ODE pricing equations for our paper, which I can now solve.]

3.) More Asset Pricing:

Since I worked on an asset pricing team, I spent a lot of time outside work trying to learn the foundations of the field. I worked a lot out of John Cochrane’s book on Asset Pricing, and did about four weeks of his Coursera course. This took a long time, since I first had to spend a few months learning more time series and stochastic processes to follow along. I also read the seminal work on return predictability in equities, and some on bond prices, by guys like Campbell, Shiller, Fama, and Cochrane. I studied yield curve modeling from my boss’s paper and the book Yield Curve Modeling and Forecasting by Diebold and Rudebusch.

I have mixed feelings about this field. Financial markets are extremely interesting, and through the years I have spent working in investments and studying economic theory, financial institutions, and political economy, paired with working in academic financial economics, I have developed a robust and deep knowledge of asset markets and investments that is more advanced than that of almost all who work in private sector investments research. While many in the field are better at the coding and math than I am, my time studying and working at the Fed has given me an advanced understanding of financial market reasoning and theory.

What does shock me is how rich the literature on predictability is, yet how little investment ‘professionals’ know about these papers. If your goal is to predict asset prices and beat the market, shouldn’t you have at least a great base knowledge of the academic literature, what works and doesn’t work, and where the current research is in this field? This is necessary not just for quant funds, but even for endowments or fundamental funds. For example, if I worked at an endowment or fundamental fund I would (and could) recreate the main valuation signals from Shiller’s papers. Even if investing in individual stocks, wouldn’t it be useful to have a program that uses time-series techniques to report on the historical, mean-reverting behavior of market valuations, and estimates the parameters to give a context within which to talk about individual companies or stocks? This would be even more useful for endowments, which invest across all asset classes.

[Image: A chart from Parsimonious Modeling of Yield Curves by Nelson and Siegel.]

4.) Econometrics and Time-Series:

I have spent a reasonable amount of time studying Jim Hamilton’s PhD textbook on Time-Series. I began with the foundations of ARIMA models, and eventually moved on to the Kalman Filter, which we use in our model at the Fed. I’ve also continued to study experimental design by reading and working through Mostly Harmless Econometrics by Angrist and Pischke. I combined this with studying Asset Pricing to code up a toy model to forecast the price of a short-VIX ETF.

I chose the short-VIX for a few reasons. The first is that the VIX is based on option prices for the S&P500, meaning it is a mean-reverting process. The second is that the VIX increases as the expected variance of future prices goes up (this is correlated with market fear, but not the same thing). This means that it is similar to insurance, which requires that the buyer pay a risk premium. As a result the short-VIX receives a risk premium for selling market insurance. Lastly, selling volatility insurance is extremely risky for investment funds, as it means that when the market does poorly they will have to pay out ‘insurance’ to other funds. In short, when the market does poorly, they will do the worst. I hypothesize that for this reason the risk premium received for selling volatility insurance is very high. Using this thesis, I coded up a relatively simple rolling ARIMA model to forecast the future price of short-VIX, which I’ll post to my website eventually. I’ve recently read more of Options, Futures, and Other Derivatives by Hull. It would be a fun challenge to recreate this model, but instead of using the SVXY ETF, to build it from scratch using S&P500 option values. I might work on this with my brother at some point, as we want to build some investment strategies together.
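To give a feel for what a rolling forecast of a mean-reverting series looks like, here is a minimal Python sketch. It is not the model described above: it swaps in the simplest possible case, an AR(1) fitted by ordinary least squares on a trailing window, and it runs on simulated data rather than SVXY prices. All function names, the window length, and the simulation parameters are assumptions for illustration.

```python
import random


def fit_ar1(series):
    """OLS fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def rolling_ar1_forecasts(series, window):
    """One-step-ahead forecasts, refitting on the trailing `window` points each step."""
    forecasts = []
    for t in range(window, len(series)):
        c, phi = fit_ar1(series[t - window:t])
        # Forecast x[t] using only information available up to t-1.
        forecasts.append(c + phi * series[t - 1])
    return forecasts


# Simulated mean-reverting series (hypothetical parameters, not market data).
random.seed(0)
true_c, true_phi = 2.0, 0.9  # implies a long-run mean of c / (1 - phi) = 20
x = [20.0]
for _ in range(499):
    x.append(true_c + true_phi * x[-1] + random.gauss(0.0, 1.0))

preds = rolling_ar1_forecasts(x, window=250)
actual = x[250:]
mse = sum((p - a) ** 2 for p, a in zip(preds, actual)) / len(preds)
print(f"{len(preds)} rolling forecasts, out-of-sample MSE = {mse:.3f}")
```

Refitting on a trailing window lets the estimated mean and persistence drift over time, which is the main reason to roll the model rather than fit it once on the full history.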

I have also studied Bayesian Computation with R by Jim Albert. I only spent a few weeks on that book, but it was great to finally code up the basics behind Bayesian prior and posterior modeling. I hope to have an opportunity someday to learn more about pure Bayesian statistics. At the moment, I’m investing my efforts into machine learning algorithms, which leads me into my next category.
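Since the Kalman Filter comes up earlier in this section, here is a minimal sketch of its predict and update steps in Python. It assumes the simplest possible setup, a one-dimensional random-walk state observed with noise; the model used at the Fed is far richer. The function name, the variances `q` and `r`, and the simulated data are all hypothetical.

```python
import random


def kalman_filter_1d(observations, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a local-level model.

    State:       x[t] = x[t-1] + w[t],  Var(w) = q
    Observation: y[t] = x[t] + v[t],    Var(v) = r
    Returns the filtered state estimates, one per observation.
    """
    x, p = x0, p0
    filtered = []
    for y in observations:
        # Predict: the random-walk state keeps its mean, uncertainty grows by q.
        x_pred = x
        p_pred = p + q
        # Update: weight the forecast error by the Kalman gain.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x)
    return filtered


# Simulated example: a slowly drifting level observed with heavy noise.
random.seed(1)
level = 0.0
noisy = []
for _ in range(100):
    level += random.gauss(0.0, 0.1)
    noisy.append(level + random.gauss(0.0, 1.0))

estimates = kalman_filter_1d(noisy, q=0.01, r=1.0)
print(f"last observation = {noisy[-1]:.2f}, filtered estimate = {estimates[-1]:.2f}")
```

The gain is the whole story: when observation noise `r` is large relative to the predicted uncertainty, the filter trusts its forecast; when `r` is small, it trusts the new data.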

[Image: The steps associated with the Kalman Filter. I think it’s so neat that the same mathematical models we use for satellites and Rocket Science are also used in asset pricing. Filtering information from noise isn’t field specific.]

5.) Data Science and Machine Learning:

I want to stay on the West Coast, and I really want to live in Seattle. The most booming industry over here is data science, a poorly defined field that requires people who know statistics, computer science, and scientific inference. It sounds like a great fit for me, as I have been using a mix of the scientific method, causal inference, programming, and math for the past three years. What I have learned is that the biggest challenge is that economists are never as good at coding as a software engineer and not as good at statistics as a Statistician. In general, we are better at asking the right questions and setting up good scientific practices and research design. Ultimately, equations and code are tools used to answer questions, but they can’t solve anything by themselves.

For the past five months I have been studying data science tools. At the Fed we rebuilt our data infrastructure to use Python to query PostgreSQL databases. On my own time I have read a lot of machine learning textbooks and academic papers on models (e.g. Random Forests). And recently I competed in a Kaggle data science competition, landing in the 40th percentile. Nothing to brag about, but not awful for my first competition, in a new programming language, using new models. I am confident I will do much better in my next one, as I spent about 100 hours struggling through lots of easy issues.

I am now doing a Stanford Coursera course on Machine Learning for a certificate. I’m annoyed because it’s very easy so far, as I’ve already taught myself most of these foundational topics. Unfortunately, I have no way to credibly signal it yet, so I have to play the game. As I finish this course and clean up my code from the Kaggle competition, I will have a credible signal that I can do Machine Learning. It still feels like I’m new at it, but with my background at the Fed I underestimate how fast I pick this stuff up compared to others.

[Image: A standard classification problem in machine learning. What is the right way to find the dividing lines in a high-dimensional space?]

6.) Coding:

I could barely code at all when I started at the Fed. I’ve learned an incredible amount over the past two years. I’m proficient in R now, and able to write fast code within the realm of data cleaning, analysis, and modeling, without having to google too much. I spent a lot of time coding in C++ using mathematical libraries, and learning how to address inefficiencies in R by creating new functions in C++. Again, this isn’t traditional development coding, but coding as a tool for quantitative scientific work.

Despite spending on average five hours coding a day over the past two years, I have done it all by figuring things out as I go, and spending most of my time working with data and models. Over the past half year, I have been coding in Python. I’m still not as efficient as I am in R, but I have been learning far more about what it means to be a true programmer since I started working in Python. At work we have been tackling problems that involve rebuilding production code, which has given me a chance to learn how to write stable automated code outside of simply modelling.

I’ve spent a few weeks studying basic algorithms, and learning best practices when programming. If I want a job in data science, I will need to spend at least 3-5 months studying data structures and algorithms. This will also help me in the long run, as currently most of what I know I have just picked up and never properly studied.

[Image: Some pricing equations I coded in C++.]

7.) History and Journalism (and Russia):

I have written a few blog posts on this topic, and it’s a field I find fascinating. This year I read Homage to Catalonia and The Road to Wigan Pier by George Orwell, as well as his published journals and letters. I also read Chronicles of Wasted Time by Malcolm Muggeridge, who was a British journalist, as well as a spy during WWII. Both were men documenting the political environment of Britain and Europe during the 20th century. They both wrote heavily about Russia and the Soviet Union, as well as the misguided communist sympathizers on the left. It’s incredible how wrong the left was on the success of communism, while all the intellectuals at Oxford and Cambridge opined on how progressive it was as a state model. It’s easy to look back on Muggeridge and Orwell as having simply been smart men, but at the time they were widely hated by both nationalist and communist groups. Many intellectuals had disdain for their work. Despite this, they abstained from excitement over ideologies and movements for excitement’s sake, and instead worked intensely to travel the world and base their worldview on empirical reality. Muggeridge saw first hand the starvation of the Russian people, and no rationalization by the intellectual left would change his view.

My interest in these two writers lined up with my reading of War and Peace and Anna Karenina by Leo Tolstoy. The first gave an incredible account of the will of the Russian people. Anna Karenina devoted a large amount of time to the failing structure of agriculture in 19th century Russia, which had moved from serfdom to sharecropping and would later move to collectivization under communist rule: a failure that would pave the path for the starvation of untold millions and the rise of totalitarianism.

These journalistic and novelistic accounts of communism and Russia led me to talk to a professor of Development who specializes in Russian history and who recommended I read The Soviet Tragedy: A History of Socialism in Russia, 1917-1991 by Martin Malia, which I have not yet finished.

[Image: Audrey Hepburn in the film adaptation of War and Peace (I still need to watch this).]

8.) Syria:

The tragedy in Syria has no end in sight. I have followed it intensely for the past few years, keeping track of the different groups fighting within the country. As I have read the news, I have also paid close attention to the information circulating on social media. On Reddit there is a subreddit called SyrianCivilWar, which posts Twitter accounts, YouTube videos, and primary source blog posts tracking battles, rumors, and information. Having followed this for some time, it’s incredible how much more complicated the reality is than what is presented in the news. To an extent this is understandable, as the news must condense information. The issue is that the sides, the reasons for fighting, the massacres, and the conditions for peace are all highly dimensional. There is no clear policy to solve this issue.

The foreign policy status quo has been something as follows: Identify the bad guy. Make sure he’s not making new alliances with other big bad guys (e.g. Russia or Iran). Find guys who are fighting the bad guys (call them moderates). Give moderates guns. Bomb bad guys. Threaten big bad guys to stay away.

This strategy worked to an extent during the Cold War. By that I mean it drained the Soviet Union of tons of resources (e.g. Afghanistan), which some argue resulted in the collapse of the Soviet Union. But it has yielded no obvious dividends in the 21st century. Each time our status quo analysis has missed a highly dimensional complexity, involving clashing value systems, cultures, and our own desire to completely solve the situation. In this case Assad is the ‘bad guy’ working with Russia ‘the badder guy.’ And the amount of civilian murder Assad has presided over absolutely means he has secured himself his place in hell, if such a place exists. On the other hand, Assad is the only hope for a secular government in Syria, and still has the popular support of the people under his control. Without Assad, it is almost certain an Islamist government would take power. Russia realizes this, and also realizes this would mean either more terrorist attacks within their country, or a new Western puppet government in Syria (less likely). Iran realizes a destabilized Syria, with an Islamist government, would threaten their security as well. Surprisingly, they aren’t evil countries who scheme together, but are instead looking out for their regional interests.

The answers are neither obvious nor simple. Imposing a no-fly zone on Russia, bombing Assad and ousting him from power, and putting some US troops on the ground will literally not solve a single issue.

[Image: An info-graphic I made a few years ago.]

9.) Philosophy of Science:

This is the ace up my sleeve. It is the intellectual foundation of how I hope to be better at modeling the world than my competition. Scientific inference in the face of massive uncertainty is hardly taught at all within academia. It’s not taught at all in engineering disciplines, which instead exploit the logical rules of the universe. Programming follows logical rules, but doesn’t rely on inference. Within Economics it is taught more than in the STEM fields, but even within Economics it is not formally studied. It is instead taught by osmosis through reading the great academic papers of the past. This is usually not enough, which explains why most academic research is shitty regressions with zero citations.

Not surprisingly, studying the connection between making scientific inferences about the world and combining them with statistics is extremely challenging. It is incredibly common for a researcher to pose a question, gather data, and fit a model to the data, only to present it to a roomful of scientists, none of whom are convinced of its accuracy. I’ve seen it happen all the time at the Fed. But on what basis do these scientists not find it compelling? They can’t prove the model is wrong, but they have a hunch that there is something missing. This hunch is based on their own empirical experience: they know you can make a model sing however you want if you torture it enough.

This happens even more often in quantitative consulting and analysis. It is easy to ask a question, guess what the answer could be, find data and make it match your guess, and then turn it all into a neat story to sell your model. If those you present to don’t have a great background and intuition, they are likely going to look at your t-statistic and think “golly-gee, that’s significant! The model must be true.” In truth, the result is probably just a step above randomness.
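A small simulation makes the point about tortured models concrete. The Python sketch below is only an illustration under stated assumptions: the function names, the sample sizes, and the |t| > 2 cutoff are hypothetical choices, and every series is pure random noise. It regresses one noise series on many unrelated noise series and counts how many slopes look ‘significant’ anyway.

```python
import math
import random


def t_stat_of_slope(x, y):
    """t-statistic of the slope in a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return beta / std_err


def count_spurious_hits(n_obs, n_candidates, threshold=2.0, seed=0):
    """Count candidate predictors with |t| above `threshold` when nothing is related."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_candidates):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        if abs(t_stat_of_slope(x, y)) > threshold:
            hits += 1
    return hits


hits = count_spurious_hits(n_obs=100, n_candidates=200)
print(f"{hits} of 200 pure-noise predictors cleared |t| > 2")
```

With a cutoff near the usual 5% level, roughly one candidate in twenty should clear it by chance alone, so anyone who tries enough specifications and reports only the winner can always find a ‘significant’ result.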

This year I have been reading Against Method by Paul Feyerabend; Conjectures and Refutations, The Poverty of Historicism, and The Logic of Scientific Discovery by Karl Popper; and Error and the Growth of Experimental Knowledge by Deborah Mayo. I have also followed the blogs of Andrew Gelman, Scott Adams, and Scott Alexander, which cover the scientific method, statistics, and empiricism. Mostly Harmless Econometrics also focuses on issues of Causal Inference, which I was first exposed to at the LSE.

[Image: Karl Popper: The King of the Philosophy of Science.]

10.) Success:

I have taken my goals of the pursuit of knowledge and rigorous skills very seriously these past three years. And I have been lucky to have been accepted to the LSE and then the Fed. Now I’m preparing to leave the world of academia, and work in the private sector (although at the Fed we did a lot of faster paced policy work). I want to live in Seattle again, and for much of this year despaired that it couldn’t happen. That the jobs I want are on the East Coast, that there aren’t the right jobs in Seattle, or that I’m not properly qualified to get jobs in Tech, as they don’t get many applicants from the LSE or the Fed. It was too easy to let myself become discouraged, and I’m working on ending that style of thinking. Finding a fulfilling job in the right city isn’t easy, but it’s not impossible either. After reading How to Fail at Almost Everything and Still Win Big by Scott Adams (author of Dilbert) I decided to form a new outlook. I’m usually not the type to read self-help books. But Scott Adams has the right mix of cynicism and optimism that appeals to me, without being too heavy on the ‘self-help’ garbage.

He writes a lot on our own psychology, and how we often form negative outlooks based on our own guesses of the odds of success, despite the fact that we rarely get the odds right. I assumed I couldn’t get a great job I want in Seattle, because I already ‘knew’ that all the jobs I want are on the East Coast. It’s certainly true the easiest and highest paying jobs for me to find would be on the East Coast. However, it’s also true that Seattle is a pretty big city, and I don’t know everything that exists. I’m working on forming a more positive and optimistic attitude. There is no reason I’m unable to find a great job in Seattle. I might have to work harder to find the right team, and network harder, but that’s within my abilities.

[Image: A great Dilbert comic by Scott Adams (from Dilbert.com).]

11.) Final Thoughts:

I spent just over the first half of my twenties trying to learn as much as I possibly could. It was rough graduating with a Finance degree into a crippled job market in Seattle. My greatest fear was not being able to learn more in the fields I found most interesting. I had a deep admiration for analytical scientists and economists who were able not only to come up with clever questions, but also to implement their own answers using a mix of statistics, modeling, and programming. While I will continue to improve for the rest of my life, my time at the LSE, the Federal Reserve, and my own autodidactic ways have given me a rigorous foundation.

Now I need to not only refine these skills, but learn how to market and signal them in order to build my own career progression in the private sector. I’ll succeed at this too.