Thoughts on Optimal Investment

Passive Management

Generally speaking, there are two types of investment: active and passive. Active investing entails spending time/money/effort to find good investments and good times to buy/sell in order to optimize your risk-adjusted rate of return, while passive investing entails picking one or more index funds and simply buying more of them each month.

There is, admittedly, a continuum here, but by and large passive investment tends to be as good as (if not superior to) active investment for the majority of people [Passive management].

For this reason, I'm not going to talk about how to choose which stocks/bonds/etc. to invest in. Instead, I'm going to discuss the other aspects of investing: which indices to use, how to allocate your savings between those indices, and how to avoid large tax bills.

This hypothesis is backed up by historical data: low-fee index funds have outperformed most actively managed funds, so it's pretty clear that's where you should be looking. For this reason, I will only be looking at index funds and Treasury bonds going forward. I am ignoring gold (which is a terrible long-run investment) and individual stocks/bonds, with the exception of US Treasury bonds.

Taxes

Any American who wants to understand optimal investment needs to have a solid understanding of the US tax code. I've listed the most important topics below, but this list isn't meant to explain any of these in-depth; it should instead be viewed as a good point to start your own research.

  1. Tax Advantaged Accounts - principally 401(k)s and IRAs [Comparison of 401(k) and IRA accounts], but also HSAs and FSAs. After maxing out these contributions, you can save even more tax-advantaged money via the "mega backdoor" conversion [Mega Backdoor Roths: How They Work]. Technically, your house can also be a tax-advantaged investment in that your mortgage interest may be tax deductible.
  2. Short Term Penalties - there are two main ways holding investments short-term is penalized by the US tax code. First, generally speaking, selling an asset a year or less after buying it will result in the gain being taxed as ordinary income. If you hold the asset for more than a year, it is generally taxed at the lower capital gains rates [Topic No. 409 Capital Gains and Losses]. Similarly, dividends are taxed as ordinary income by default, but if (a) you hold the stock for at least a couple of months and (b) the stock is for a US corporation, then you can generally pay the lower capital gains rates [Qualified dividend].
  3. Tax loss harvesting - There are a number of tax advantages to selling an investment at a loss. Realized losses first cancel out capital gains made during the year. Up to $3,000 of any remaining net loss can then be deducted from your ordinary income each year, and anything beyond that carries forward to future years. If you keep harvesting losses while deferring your gains, then (with enough assets, discipline, and luck) you can leave significantly more of your wealth to someone/thing else when you die. Note: when selling for a tax loss, you can't buy the same (or a substantially identical) security within 30 days before or after the sale.

Measuring Success

There are, financially speaking, two epochs to your adult life: saving for retirement and spending during retirement.

Given a fixed savings rate, the goal of the first epoch is to maximize your utility during retirement. Assuming you spend reasonably in retirement and have a typical utility-consumption elasticity of 0.35, that utility is given by

$$ u = -s^{-0.35} $$

where $s$ is how much you have saved when you retire. So, for investment pre-retirement, we want to find the asset mix that, historically, has optimized this utility function.

After you retire, the goals fundamentally change. Now you want to withdraw as much as possible while having minimal risk of going bankrupt. To determine this, we will find the asset mix that allows for the greatest withdrawals while never, historically speaking, going bankrupt.

An Easier Metric

The most common way to weight returns and risk is via the Sharpe ratio, which is defined as

$$ \frac{E[r]-r_0}{\sqrt{Var(r)}} $$

where $r_0$ is the risk-free rate of return.

However, I maintain that this measurement is silly. An easy way to see why is to notice that the only way to get a negative Sharpe ratio is to have expected returns below the risk-free rate. An investment that earns 0.01% higher expected returns but is incredibly risky (say an all-or-nothing coin flip) ends up with a positive Sharpe ratio, suggesting it's better overall than the risk-free option. This is obviously ridiculous.
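To make the objection concrete, here is a minimal sketch that computes the Sharpe ratio of a hypothetical all-or-nothing coin flip. The 2% risk-free rate and the payoff sizes are illustrative assumptions (chosen so the expected return beats the risk-free rate by 0.01 percentage points), not figures from any dataset.

```python
import math

def sharpe_ratio(outcomes, probabilities, risk_free):
    """Sharpe ratio of a discrete return distribution."""
    mean = sum(p * r for r, p in zip(outcomes, probabilities))
    variance = sum(p * (r - mean) ** 2 for r, p in zip(outcomes, probabilities))
    return (mean - risk_free) / math.sqrt(variance)

RISK_FREE = 0.02  # assumed risk-free rate, for illustration only

# Hypothetical coin flip: lose everything (-100%) or gain 104.02%,
# so the expected return is 2.01%, just above the risk-free rate.
coin_flip_outcomes = [-1.0, 1.0402]
coin_flip_sharpe = sharpe_ratio(coin_flip_outcomes, [0.5, 0.5], RISK_FREE)
```

The resulting ratio is tiny but positive, even though half the time this bet wipes out the entire investment.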

The Sharpe ratio is based on the assumption that returns are normally distributed. Although this is wrong (log-returns skew negative and have fatter tails), the model is still useful, and I'd like to suggest a different evaluation method based on the same assumption.

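First, let our utility function be $-m^{-\epsilon}$, where $m$ is the amount of money we end up with, and let $\mu$ and $\sigma$ be the mean and variance of our returns, respectively (note that $\sigma$ denotes the variance here, not the standard deviation). We can then prove that expected utility is an increasing function of

$$ e^{\left(\mu - \epsilon \cdot \sigma / 2 \right)} $$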

This means choosing how to invest is equivalent to trying to optimize

$$ \begin{equation} E[R] - \frac{\epsilon \cdot Var[R]}{2} \end{equation} $$

where $R$ is real returns and $\epsilon \approx 0.35$.
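For readers who want to check the claim: if the log-return $r$ is normal with mean $\mu$ and variance $\sigma$, and final wealth is $m = e^r$, then $E[-m^{-\epsilon}] = -e^{-\epsilon (\mu - \epsilon \sigma / 2)}$, which increases with $\mu - \epsilon \sigma / 2$. The sketch below compares that closed form against a simple numerical integral. It assumes normally distributed log-returns, and the function names are my own.

```python
import math
from statistics import NormalDist

def expected_utility_numeric(mu, variance, eps, n=20001, width=10.0):
    """Riemann-sum estimate of E[-exp(-eps * r)] for r ~ Normal(mu, variance)."""
    sd = math.sqrt(variance)
    dist = NormalDist(mu, sd)
    low = mu - width * sd
    step = 2 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        r = low + i * step
        total += -math.exp(-eps * r) * dist.pdf(r) * step
    return total

def expected_utility_closed_form(mu, variance, eps):
    """Closed form under the normal log-return assumption."""
    return -math.exp(-eps * (mu - eps * variance / 2))

def score(mu, variance, eps):
    """The quantity the text proposes optimizing: mean minus eps * variance / 2."""
    return mu - eps * variance / 2
```

Because the closed form is an increasing function of `score`, ranking investments by `score` gives the same ordering as ranking them by expected utility under this assumption.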

To be clear, $R$ is referencing the distribution of returns over the time period of interest. For instance, if you're investing for 10 years, $R$ is the distribution of 10-year-returns. An interesting thing to note here is that if, say, 1-year returns are independent, then $E[R]$ and $Var[R]$ both grow perfectly proportionally over time. This observation will come back later.

Regarding the issue of kurtosis and skew in returns, I've only investigated kurtosis so far, using the Student's t-distribution. I found that increasing kurtosis from 3 (the value for a normal distribution) to 4 was nearly equivalent to increasing $\epsilon$ by 4.5%. This is, in my opinion, quite small and suggests that ignoring kurtosis can be largely accounted for by just being a little more risk-averse.

TODO: Investigate skew.

Asset Allocation

Since we're committed to passive investment, our main question is simply how many eggs to put in different baskets. Your answer to this is called your "asset mix" and people typically think of it as being a mix of the following:

  • Domestic stocks
  • Foreign stocks
  • Domestic corporate bonds
  • Foreign corporate bonds
  • Treasury bonds
  • Gold
  • Housing/apartments

Using some mathematical properties of expectation and variance, we can take our earlier equation, which assumes a single investment,

$$ E[R] - \frac{\epsilon \cdot Var[R]}{2} $$

and generalize it to support multiple investments:

$$ \begin{equation} \sum_i^n{p_i \cdot E[R_i]} - \frac{\epsilon}{2} \left( \sum_{i,j}^{n,n}{p_i \cdot p_j \cdot Cov[R_i, R_j]} \right) \end{equation} $$

where $R_i$ represents the distribution of returns for an asset and $p_i$ represents the percent of our portfolio we have invested in that asset.

I know this formula looks really complicated, but it's really just a multivariate quadratic, which means it can be solved efficiently with quadratic programming [Quadratic programming].
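As a rough illustration of the objective, the sketch below evaluates it and finds the best weights by brute-force grid search. This is a stand-in for a real quadratic programming solver, not the method used for the results in this post. It assumes weights between 0 and 1 that sum to 1 (no shorting or leverage), and the example inputs are made-up numbers, not estimates from data.

```python
from itertools import product

def objective(p, means, cov, eps):
    """Expected return minus (eps / 2) times portfolio variance."""
    n = len(p)
    expected = sum(p[i] * means[i] for i in range(n))
    variance = sum(p[i] * p[j] * cov[i][j] for i in range(n) for j in range(n))
    return expected - eps * variance / 2

def best_allocation(means, cov, eps, steps=100):
    """Grid search over weights in increments of 1/steps that sum to 1."""
    n = len(means)
    best_p, best_value = None, float("-inf")
    for head in product(range(steps + 1), repeat=n - 1):
        if sum(head) > steps:
            continue  # the last weight would be negative
        p = [k / steps for k in head] + [(steps - sum(head)) / steps]
        value = objective(p, means, cov, eps)
        if value > best_value:
            best_p, best_value = p, value
    return best_p, best_value

# Hypothetical inputs: two uncorrelated assets with identical mean and variance.
example_means = [0.05, 0.05]
example_cov = [[0.04, 0.0], [0.0, 0.04]]
example_weights, example_value = best_allocation(example_means, example_cov, eps=0.5)
```

With two identical, uncorrelated assets the search settles on an even split, which is the diversification effect the covariance term is meant to capture. Grid search gets slow beyond a handful of assets, which is where a proper quadratic programming solver earns its keep.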

Bootstrapping Historical Data

There are two chief problems with using historical data. The first is obvious: past returns are no guarantee of future returns. The second is less obvious: we don't actually have that much historical data. The oldest dataset I could find was compiled by Shiller [Shiller] (of Case-Shiller fame); it goes back to 1871 and only tracks two things: the S&P 500 and Treasury bond interest rates. 149 years might seem like a lot, but we're interested in 40-year investment horizons, and 149 years only contains 3.7 independent 40-year periods. For this reason, we'll have to be more "creative" with our analysis.

The obvious solution is bootstrapping: basically chopping up our dataset into smaller time periods, assuming these are independent, and then resampling them (with replacement) to generate a million years of data.
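A minimal sketch of this resampling idea, assuming the inputs are per-period log-returns (so that they can simply be summed over a horizon). The function name and parameters are my own, not taken from any library.

```python
import random

def bootstrap_horizon_returns(period_log_returns, periods_per_path, n_paths, seed=0):
    """Draw periods with replacement and sum their log-returns into long-horizon paths."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [
        sum(rng.choices(period_log_returns, k=periods_per_path))
        for _ in range(n_paths)
    ]
```

For example, with quarterly data a 40-year path would use `periods_per_path=160`, and `math.exp` turns each summed log-return back into a growth factor.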

To verify independence between annual returns, I used the Shiller data to compute 140 real annual total returns for the S&P 500. A linear regression finds no relationship between one year's returns and the next's (95% CI = -0.16 to 0.18). Doing the same thing for variance (risk) yields a slope estimate of 0.07 (95% CI = -0.10 to 0.24). In both cases we come nowhere close to rejecting the null hypothesis (both p-values > 0.39).
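The regression of each year's return on the previous year's can be sketched as an ordinary least-squares slope. This is only the point estimate. The confidence intervals quoted above would need a standard error on top of it, which this sketch leaves out.

```python
def lag_one_slope(returns):
    """OLS slope from regressing each return on the one before it."""
    x = returns[:-1]  # returns in period t
    y = returns[1:]   # returns in period t + 1
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx
```

A slope near zero is what independence predicts. A clearly positive slope would suggest momentum, and a clearly negative one mean reversion.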

The other thing I did to investigate independence was to check whether annual returns and returns over longer time periods have similar distributions. First things first: even if independence is violated, the average 2-year return will be exactly double the average annual return, since that's one of the properties of expected value. However, we can check whether variance also grows linearly (as the independence assumption predicts). Using the same dataset, I computed the variance of annual log-returns to be 0.0305. For 2-year, 3-year, and 10-year returns the variance was 0.0327, 0.0251, and 0.0310, respectively. Given that the sample sizes were 69, 46, and 13, respectively, I think we can consider this prediction generally fulfilled and the theory that annual returns are independent as confirmed as it can be.
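One way to run this kind of check is to sum log-returns over non-overlapping blocks of k periods and divide the variance of those sums by k. Under independence the result should be roughly the same for every k. This is a sketch under that reading, and the per-period scaling is my assumption rather than something stated above.

```python
from statistics import variance

def per_period_variance(log_returns, k):
    """Sample variance of non-overlapping k-period log-returns, divided by k."""
    blocks = [
        sum(log_returns[i:i + k])
        for i in range(0, len(log_returns) - k + 1, k)
    ]
    return variance(blocks) / k
```

If returns tend to reverse from one period to the next, the per-period variance shrinks as k grows. If they trend, it grows.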

What about shorter intervals? After all, if monthly returns are independent, then we can 12x our sample size, and if daily returns are independent, we can increase it by a whopping 253x!

For methodological reasons, the Shiller dataset is really only accurate for annual return analysis, so we'll have to turn to a different dataset. I obtained daily S&P 500 total return data from Yahoo Finance [S&P 500 (TR) (^SP500TR)]. Unfortunately, inflation is only estimated monthly rather than daily, so I ignored it for this analysis.

I computed the first four moments (mean, variance, skew, kurtosis) for returns over 1, 2, 3, 4, 5, 10, 20, 40, and 63 days. An important point to note is that I'm referring to trading days rather than calendar days. For this reason, the 20-day and 63-day returns are very close to monthly and quarterly returns.
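A sketch of how such a table could be computed from a series of daily index levels. It assumes log-returns over non-overlapping k-day windows and population (not sample-corrected) moments, with kurtosis on the scale where a normal distribution scores 3. Both function names are my own.

```python
import math

def k_day_log_returns(levels, k):
    """Log-returns over non-overlapping windows of k trading days."""
    return [
        math.log(levels[i + k] / levels[i])
        for i in range(0, len(levels) - k, k)
    ]

def moments(xs):
    """Mean, variance, skew, and (non-excess) kurtosis of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2
```

Calling `moments(k_day_log_returns(levels, k))` for each k in the list above gives one row per horizon.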

Naturally, I found no change in mean return (that would be mathematically impossible), but I found a statistically significant reduction in variance as the time period increased, relative to what independence would predict. This trend was mostly caused by shrinkage between 1-day and 3-day returns, but the sign persisted throughout the data with the exception of the step between 40 and 63 days.

I also found that skew changed monotonically from -0.44 for 1-day returns to -1.23 for 40-day returns and kurtosis generally decreased from 14.6 to 6.1.

Together, this makes using daily or weekly returns to bootstrap long-term returns dubious at best. Though statistically insignificant, the same trends persisted through the 20- and 40-day returns, which makes even monthly returns suspect.

By the time we look at the difference between 40- and 63-day returns, the variance trend reverses, but not the skew or kurtosis trends. It's hard to know for sure, but since we know variance is far more important than kurtosis in determining asset optimality (a 3 -> 4 change in kurtosis is roughly equivalent to a 4.5% increase in variance), this makes it look like treating quarterly returns as independent is plausibly okay for long-run analysis.

In addition to this empirical support, we have a good reason to prefer quarterly returns a priori: companies announce earnings every quarter, which can be responsible for large stock swings on those days. For this reason, earnings-call days probably can't be treated as coming from the same distribution as other trading days, which makes frequencies higher than quarterly theoretically suspect.

Finally, as a reminder, we also prefer quarterly returns to annual returns since it increases our dataset, allowing for better statistical analysis. For these reasons, I will use quarterly returns for bootstrapping going forward.

Statistical Analysis

I put together real total quarterly returns for 5 indices from 1891 to 2015. I included (based on availability) returns on housing, the S&P 500, gold, government bills, and government bonds [Shiller] [Karsten] [Compound Annual Growth Rate (Annualized Return)] [Jordà].

The average real return for gold over the last 140 years is 0.4%. When you see the S&P 500 averaging 8.1% returns with similar risk, it's pretty clear that rational investors don't invest in gold. For this reason, while I would expect the efficient-market hypothesis (EMH) to apply to the gold market, I wouldn't expect it to apply between the gold market and other investment markets.

I have similar thoughts about government bonds, which average 1.7% real annual returns compared to 5.5% real returns from housing despite housing having less variance in returns and comparable covariance with the S&P 500.

I actually feel the same way about T-bills, but it's less obvious since (as far as I know) there aren't really any other comparable investment options.

For this reason, I'm mostly interested in stocks, housing, and corporate bonds - though I couldn't find any long-run data on the latter.

In any case, I solved for the optimal asset allocation with quadratic programming. I believe $\epsilon \approx 0.35$, but I've seen people suggest most investors assume $\epsilon \approx 1$, so I used $\epsilon = 0.5$ as a compromise. After solving, I found that from 1891 to 2015 you should have invested only in domestic stocks (the S&P 500).

The devil, however, is in the details. For instance, the best asset class depends a lot on the decade. More generally, the e/p (earnings-to-price) ratio of the previous five years correlates with the subsequent 5-year returns for stocks but not housing:

Asset Class       Slope
S&P 500           +1.5 (CI = +0.1 to +2.8)
Gold              -1.1 (CI = -2.6 to +0.5)
Housing           +0.1 (CI = -0.4 to +0.6)
Treasury Bills    +0.3 (CI = -0.5 to +1.0)
Treasury Bonds    +0.2 (CI = -0.8 to +1.3)

This is actually an important point. The e/p ratio has averaged 4.6% in the last 5 years, whereas it has historically averaged 7.4%. These facts and the slope above suggest that stocks will deliver ~4% lower expected returns in the next 5 years. This, in turn, has huge implications for the optimal investment portfolio, suggesting that you should only invest in real estate.
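The ~4% figure appears to come from multiplying the gap in e/p by the S&P 500 slope. A quick check of that arithmetic, assuming the slope applies directly to the percentage-point gap:

```python
HISTORICAL_EP = 7.4     # long-run average e/p, in percent (from the text)
RECENT_EP = 4.6         # average e/p over the last 5 years, in percent (from the text)
SLOPE = 1.5             # S&P 500 point estimate from the table
SLOPE_CI = (0.1, 2.8)   # its confidence interval from the table

gap = HISTORICAL_EP - RECENT_EP                        # how far e/p sits below its average
expected_drop = SLOPE * gap                            # point estimate of the shortfall
drop_range = tuple(bound * gap for bound in SLOPE_CI)  # shortfall at each end of the CI
```

The point estimate works out to about 4.2 percentage points, but the interval on the slope leaves a wide range around it (roughly 0.3 to 7.8).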

It makes sense that the e/p ratio predicts stock returns since the e/p acts as an effective long-term floor on stock returns. To see this, you just need to realize that any company that returns all profits as dividends will (in the long-run) have returns exactly equal to its e/p ratio. Assuming companies are profit maximizers, this proves the e/p ratio acts as a lower bound on returns in the long-run.

However, per the efficient-market hypothesis, we'd expect the e/p ratio to also predict other investment returns with similar strength. This, historically, hasn't been true.

Like I mentioned before, I don't really think gold or government bills/bonds are efficient in a cross-market sense. For this reason, I'm only really interested in the discrepancy between how e/p predicts the S&P 500 and how it (doesn't) predict housing.

Pre & Post Retirement

All this analysis has assumed you have a fixed nest egg that you are neither adding money to nor withdrawing money from.

If you are still contributing money to your nest egg, then you are more risk tolerant than our simple model investor, which makes the S&P 500 even more attractive.

To determine the optimal asset allocation in retirement, I determined which allocation would have allowed you to withdraw the largest fixed amount of money per year (in real terms) without ever, historically, running out within 40 years. Given the 5 investment options above, the best allocation is 90% housing and 10% S&P 500 - an allocation that allows you to withdraw 4.3% of your assets each year. Note that the S&P 500 alone only lets you withdraw 2.1% of your assets, so this is a huge improvement.
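A minimal sketch of this kind of backtest for a single series of portfolio returns. It assumes annual real returns that already reflect the chosen allocation, withdrawals taken at the start of each year, and returns above -100%. The function names are my own, and the timing convention may differ from the one behind the figures above.

```python
def max_withdrawal_rate(real_returns):
    """Largest fixed real withdrawal, as a fraction of starting assets,
    that lasts exactly through the given sequence of annual returns."""
    growth = 1.0            # cumulative growth factor before this year's withdrawal
    discounted_years = 0.0  # sum of 1 / growth over all withdrawal dates
    for r in real_returns:
        discounted_years += 1.0 / growth
        growth *= 1.0 + r
    return 1.0 / discounted_years

def historical_safe_rate(real_returns, horizon):
    """Worst case of max_withdrawal_rate over every historical window of `horizon` years."""
    windows = (
        real_returns[i:i + horizon]
        for i in range(len(real_returns) - horizon + 1)
    )
    return min(max_withdrawal_rate(w) for w in windows)
```

Running `historical_safe_rate` for each candidate allocation and keeping the one with the highest result reproduces the spirit of the search described above. With zero real returns for 40 years, the rate collapses to 1/40 = 2.5%, which is a useful sanity check.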

However, looking at returns on corporate bonds since 1980 suggests that you may be able to do even better by throwing those into the mix. I used those 35 years of data to randomly generate some returns with the appropriate mean, variance, and covariance back to 1891 and found an optimal allocation of 70% corporate bonds and 30% housing, with an allowable withdrawal rate of 5.1%.

Summary

  • While saving for retirement, invest your money solely in the S&P 500 and housing, probably with an emphasis on the former.
  • After retiring, you can probably safely withdraw around 4.5% of your initial assets each year (adjusting for inflation) if you invest in housing and corporate bonds.

References

Wikipedia contributors. (2020, June 12). Passive management. In Wikipedia, The Free Encyclopedia. Retrieved 21:54, June 21, 2020, from https://en.wikipedia.org/w/index.php?title=Passive_management&oldid=962158625

Shiller, R. J. Online Data Robert Shiller. http://www.econ.yale.edu/~shiller/data.htm

Wikipedia contributors. (2020, February 2). Comparison of 401(k) and IRA accounts. In Wikipedia, The Free Encyclopedia. Retrieved 21:20, June 21, 2020, from https://en.wikipedia.org/w/index.php?title=Comparison_of_401(k)_and_IRA_accounts&oldid=938818789

Coombes, A. (2020). Mega Backdoor Roths: How They Work. Nerd Wallet. https://www.nerdwallet.com/blog/investing/mega-backdoor-roths-work/

Topic No. 409 Capital Gains and Losses. (2020). Internal Revenue Service. https://www.irs.gov/taxtopics/tc409

Wikipedia contributors. (2020, March 8). Qualified dividend. In Wikipedia, The Free Encyclopedia. Retrieved 21:37, June 21, 2020, from https://en.wikipedia.org/w/index.php?title=Qualified_dividend&oldid=944552367

S&P 500 (TR) (^SP500TR). Yahoo Finance. https://finance.yahoo.com/quote/%5ESP500TR/history

Wikipedia contributors. (2020, June 26). Quadratic programming. In Wikipedia, The Free Encyclopedia. Retrieved 05:37, June 30, 2020, from https://en.wikipedia.org/w/index.php?title=Quadratic_programming&oldid=964585052

CPI for All Urban Consumers (CPI-U). U.S. Bureau of Labor Statistics. https://data.bls.gov/cgi-bin/surveymost

Karsten. (2018). EarlyRetirementNow SWR Toolbox v2.0 - save your own copy before editing! https://docs.google.com/spreadsheets/d/1QGrMm6XSGWBVLI8I_DOAeJV5whoCnSdmaR8toQB2Jz8/edit#gid=1084562995 (see also https://earlyretirementnow.com/2018/08/29/google-sheet-updates-swr-series-part-28/)

Compound Annual Growth Rate (Annualized Return). http://www.moneychimp.com/features/market_cagr.htm

Jordà, Ò., Schularick, M., and Taylor, A. Jordà-Schularick-Taylor Macrohistory Database. Macrofinance & Macrohistory Lab. http://www.macrohistory.net/data/