Optimal Taxation

This page is for my personal thoughts on optimal taxation; for the wider literature see here. I disregard large portions of the theoretical literature, because a lot of it is based on the terrible assumption that the only way the tax system affects people's labor decisions is via their marginal tax rate.

This page is mostly concerned with the equity-efficiency trade-off. For my thoughts on Pigovian taxes see here.


  • Use lots of tagging.
  • Tax capital gains at lower rates than labor income.
  • Don't tax corporate income.
  • Raise the top tax brackets.

Equality-Efficiency Tradeoff

Governments intervene all the time to push the market towards outcomes they deem favorable. In general, such steering falls into one of three categories:

  • Strategic Interventions - Most governments have certain strategic goals. A common one is a desire for the country to be self-sufficient in "essential" industries so that the country can support itself during crises (wars, pandemics, famines, etc).
  • Efficiency Interventions - Governments frequently intervene to correct or mitigate market failures. For instance, they might tax or regulate pollution and prevent monopolistic corporate mergers. Likewise, law enforcement fits here since it's a public good that free markets can't adequately supply.
  • Redistributive Interventions - Governments typically also engage in policies to redistribute resources from some groups to others.

I don't have much to say about strategic interventions.

Assuming they follow economic theory, interventions to boost efficiency are generally uncontroversial among economists since they rely only on the assumption that people can rank possible outcomes.

This contrasts with redistributive interventions, where you need to assume (1) that people can assign numbers to outcomes and (2) that you can compare these numbers between people. I personally accept both these conditions (and assume them going forward), but many economists and philosophers do not.

Therefore, my general goal, as always, is to maximize social welfare ("utility"). Unfortunately, this is made challenging because I reject a central premise of most of the literature: that the only part of the tax system that changes how long someone works is the marginal tax rate they face. I am stunned this is so widely assumed when many models, such as this one, don't predict it. As far as I can tell, this assumption is made exclusively to make the mathematical analysis easier. Once we drop it, much of the past work stops being useful.

That being said, much of the literature doesn't rely on this assumption and I think it's quite valuable. I've also done some of my own work, which I report here.


What is Tagging?


Deductions for "Unavoidable" Expenses

TODO: health care, mortgage interest, SALT

Charitable Deduction

The elasticity of charitable donations to tax changes is probably a little over 1 (Peloza & Steel, The price elasticities of charitable contributions: A meta-analysis), which suggests that for every dollar a charitable deduction costs the government, private charities gain a little more than a dollar.
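To see why an elasticity just above 1 implies charities gain more than the deduction costs, here is a minimal sketch. It assumes (my assumption, not the meta-analysis's) a constant price elasticity of giving, where the price of donating a dollar is one minus the donor's marginal tax rate. The function name and all numbers are purely illustrative.

```python
def deduction_effect(base_giving, marginal_rate, elasticity):
    """Return (extra giving, revenue cost) from making donations deductible.

    Assumes giving = base_giving * price ** (-elasticity), where the
    price of donating one dollar falls from 1 to (1 - marginal_rate).
    """
    price = 1.0 - marginal_rate
    giving = base_giving * price ** (-elasticity)
    extra_giving = giving - base_giving
    revenue_cost = marginal_rate * giving  # tax forgone on every deducted dollar
    return extra_giving, revenue_cost


for e in (0.8, 1.0, 1.2):
    extra, cost = deduction_effect(100.0, 0.3, e)
    print(f"elasticity={e}: charities gain {extra:.2f}, government loses {cost:.2f}")
```

Under this assumption the two amounts are equal at an elasticity of exactly 1, and charities come out ahead only when the elasticity exceeds 1.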

From this, the main question becomes how effective charities are at doing good per dollar relative to the government. This good isn't necessarily easily measurable either; from the paper:

Cordes (2001, p. 3) notes the correlation between cash donations and volunteerism and argues that a system encouraging more individual donations would also “help foster civic virtues that are needed to maintain a ‘civil society.’”

In any case, it's not obvious whether this is true.

One argument against allowing charitable deductions is that a willingness to give probably correlates positively with not valuing money very much, which (via the tagging philosophy) means donors should actually be taxed more.

Capital Gains Taxation

Bad Arguments For Taxing More

Before I start, I want to make the goalposts clear. I will assume the default hypothesis is that the tax system is a progressive tax on all income. For this reason, the fact that people who earn capital income tend to have higher incomes is not a good argument for taxing them at a higher rate since that should already be reflected in that they'll be in higher tax brackets on average.

Good Arguments for Taxing More

That being said, I think there is really only one good argument for taxing capital income at a higher rate than labor income, which is a variation on the redistribution-based critique above.

Basically, suppose Alice and Bob have the same incomes, but Alice earns hers entirely through labor while Bob earns his entirely through capital. In expectation, I think it's pretty clear Alice values the marginal dollar more than Bob does for a couple reasons.

First, the fact Bob has large financial wealth suggests he has a high ability to earn, which (via basic tagging arguments) suggests we should tax him more.

Second, suppose we know that Alice and Bob, in fact, do have the same ability to earn, suppose Alice and Bob have the same utility function, and suppose consumption has diminishing marginal utility while labor has increasing marginal disutility. It then follows that Bob values the marginal dollar less than Alice. This provides another justification to tax him more.

Obviously the above situation is rather extreme, but less extreme variants hold for households who get different proportions of their income from labor or capital.

A counterargument is that one of the reasons some people earn more in capital income than other people is a preference for free-time over consumption. I think this reasoning is true, but it doesn't fully (or even mostly) eliminate the argument's potency.

Bad Arguments for Taxing Less

While people aren't perfectly rational, I generally am willing to entertain that model of human behavior for economic arguments. However, this willingness doesn't extend to long-term human behavior since there is ample evidence humans aren't even remotely rational when weighing costs and benefits of decisions over decades.

However, long-term rationality is presupposed by many of the theoretical arguments against the taxation of capital income. As a result, I don't think much of this literature is a good guide to optimal policy in the real world. With that being said, I'll examine what I consider the solid arguments for taxing capital income at rates higher (or lower) than labor income.

Good Arguments for Taxing Less

The easiest arguments for lower capital gains taxation are real-world arguments. In particular,

  1. The capital gains tax is currently imposed on nominal capital gains even though it is real capital gains that provide the ability to consume. Therefore if we want to tax labor and capital income at the same rate, we need to allow the deduction of inflation from capital gains. However, a permanently lower tax rate is equivalent in the long-run to the inflation deduction while also acting as a counter-cyclical fiscal policy.
  2. Capital income is frequently already taxed by the corporate income tax before it reaches the owners of stocks. Therefore, for stock owners in particular, the effective tax rate on their income is higher than the literal rate in tax law. Hence, if we want equal effective tax rates on labor and capital income, the nominal tax on capital income must actually be lower.
  3. Our society is set up to discourage savings by default in a variety of ways: we provide extremely subsidized emergency medical care to people without assets; we provide less financial aid for college students who come from high-saving families; many welfare programs are reduced or eliminated if you have high savings. These misaligned incentives build atop people's natural and well-documented tendency to not care about their future well-being. Taxing capital gains exacerbates this tendency.

A fourth argument I find somewhat convincing is more theoretical. Consider the Solow–Swan model, which basically claims long-term GDP is given by

$$ Y = K^\alpha $$

where $K$ is the amount of capital in a country. This model implies that capital income is $\alpha Y$ and labor income is $(1-\alpha) Y$. Suppose that $s_K$ is the percent of capital income that is saved and $s_L$ is the percent of labor income that is saved. Finally, suppose a fixed percent of capital is used up each year.

From this it follows that

$$ Y \propto \left( \alpha s_K + (1-\alpha) s_L \right)^{\alpha/(1-\alpha)} $$
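For readers who want the intermediate step, here is a sketch of how the proportionality follows. The depreciation rate $d$ is my own notation for the "fixed percent of capital used up each year"; everything else is as defined above. In the long run, total saving must just replace the capital that wears out:

$$ \left( \alpha s_K + (1-\alpha) s_L \right) Y = d K $$

Substituting $K = Y^{1/\alpha}$ and solving for $Y$:

$$ Y^{(1-\alpha)/\alpha} = \frac{\alpha s_K + (1-\alpha) s_L}{d} \quad\Rightarrow\quad Y = \left( \frac{\alpha s_K + (1-\alpha) s_L}{d} \right)^{\alpha/(1-\alpha)} $$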

It also follows that every dollar of tax on capital income will reduce $Y$ by $\frac{\alpha s_K}{\alpha s_K + (1-\alpha)s_L}$ dollars, while every dollar of tax on labor income will reduce $Y$ by $\frac{\alpha s_L}{\alpha s_K + (1-\alpha)s_L}$ dollars.

So, if $s_K \gt s_L$, then the cost of taxing capital is greater than the cost of taxing labor. Conversely, if $s_L \gt s_K$, then the cost of taxing labor is greater.
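As a rough numerical check of this ranking, the sketch below iterates a discrete-time version of the model until output settles. The parameter values, the depreciation rate `dep`, and the treatment of each tax as a small lump sum whose revenue is not saved are all my own illustrative assumptions. Because the simulation lets lower output feed back into lower saving, the levels it reports need not match the expressions above; the point is only that the relative cost of the two taxes comes out close to $s_K / s_L$.

```python
def long_run_output(alpha, s_k, s_l, dep, tax_k=0.0, tax_l=0.0, steps=5000):
    """Iterate capital accumulation until output settles, then return Y = K**alpha."""
    k = 1.0
    for _ in range(steps):
        y = k ** alpha
        capital_income = alpha * y - tax_k        # after-tax capital income
        labor_income = (1 - alpha) * y - tax_l    # after-tax labor income
        saving = s_k * capital_income + s_l * labor_income
        k = (1 - dep) * k + saving
    return k ** alpha


alpha, s_k, s_l, dep = 1 / 3, 0.4, 0.1, 0.1   # illustrative values only
tax = 1e-3                                     # a small tax, in units of output

base = long_run_output(alpha, s_k, s_l, dep)
loss_from_capital_tax = base - long_run_output(alpha, s_k, s_l, dep, tax_k=tax)
loss_from_labor_tax = base - long_run_output(alpha, s_k, s_l, dep, tax_l=tax)

print(f"baseline output:           {base:.4f}")
print(f"loss per unit capital tax: {loss_from_capital_tax / tax:.3f}")
print(f"loss per unit labor tax:   {loss_from_labor_tax / tax:.3f}")
print(f"ratio of losses:           {loss_from_capital_tax / loss_from_labor_tax:.2f}")
```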

In the real world, $s_K$ is almost certainly greater than $s_L$ (TODO: cite), which suggests we should avoid taxing capital income.

In less mathematical language, the basic argument is that (a) more capital income is saved than labor income (b) thus taxing capital incomes reduces long-run capital (c) which reduces the ability of both capital-owners and workers to earn/consume.

A counter to this line of reasoning is that it's better to incentivize savings directly by letting people deduct it on their taxes. However, due to tagging-esque reasons, even with this kind of deduction, there remains an incentive for a benevolent social planner to favor capital income.

A second counter is that this all presupposes a closed economy. In an open economy, capital is determined by the global interest rate, so all of the above fails to follow. This is true, but (a) financial flows aren't perfectly open (Equity home bias puzzle, Backus–Kehoe–Kydland puzzle, Feldstein–Horioka puzzle), (b) a pure patriot should favor promoting savings since it means we're borrowing less from foreigners, and (c) a perfect utilitarian cares about the world, which is perfectly closed. In fact, (c) could end up being the decisive factor for utilitarians since investment by a globally rich person (like an American) could cause very large benefits in the form of higher wages for the global poor.

Fifth, we should probably be discouraging labor in general, which suggests we should be taxing it more than capital income.


Finally, one last consideration is that some capital income has significant negative externalities. In particular, if you earn your money via short-term trading, nearly all your income comes at the expense of other traders, which suggests it should be taxed at a very high rate to internalize the externality. For this reason, I feel pretty good about high taxes on short-term capital gains.

However, for long-term capital income, we have two arguments in favor of taxing capital income more than labor income and five for taxing it less, but we have no attempt at weighing these pros and cons against each other, which makes it unclear what is optimal policy.

On the whole, my impression is that taxing capital income at lower rates makes sense, mainly because the first two arguments for lower rates are so obviously true and require no assumptions (theoretical or empirical) and because of the argument that lower capital income taxes boost the incomes of foreigners, which could potentially be a very big deal.

That being said, I'm quite uncertain and could see myself deciding higher taxes are warranted or (conversely) even that a negative tax is optimal.

Corporate Taxation

[see here]

According to the neoclassical model, taxes on capital income are less distortionary than taxes on corporate income, especially with existing depreciation deduction and international flow rules. Moreover, if we assume a perfectly open economy, taxing dividends and capital gains has a much smaller distortionary effect than taxing corporate profits. On the other hand, according to the models of CEO agency, taxing dividends is more distortionary. For this reason, it's unclear whether corporate income taxation is more or less distortionary than capital income taxation.

I personally feel the arguments against corporate taxation are stronger than the arguments against taxing dividends and capital gains. While CEO motivations undoubtedly matter, it's not clear to me that the models used to show corporate taxes are better capture how real CEOs behave. Conversely, while we don't have truly open capital flows, they are at least somewhat open, which mutes the investment-reducing effects of taxing dividends and capital gains without muting the effects of taxing corporate profits.

Consumption Taxes

Flat Consumption Taxes

Assuming you are born with a net worth of zero and die with a net worth of zero, your lifetime inflows of wealth equal your lifetime outflows. That is:

(Labor Income) + (Capital Income) - (Capital Losses) - (Taxes) = (Consumption) + (Donations)

In other words, for a particular (reasonable) definition of income, every dollar you earn is spent. For this reason a flat consumption tax is equivalent to a flat income tax on all income minus capital losses (e.g. selling at a loss, interest paid, etc.).
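A small worked example of this equivalence, under simplifying assumptions of my own: no donations, no capital losses, and a consumption tax quoted as a markup on pre-tax spending (so a 25% rate means paying 1.25 per 1.00 of goods). The function names are illustrative.

```python
def consume_under_consumption_tax(lifetime_income, consumption_tax_rate):
    """Everything earned is eventually spent, tax included."""
    consumption = lifetime_income / (1 + consumption_tax_rate)
    tax_paid = lifetime_income - consumption
    return consumption, tax_paid


def consume_under_income_tax(lifetime_income, income_tax_rate):
    """Tax is taken off income first; the remainder is all consumed."""
    tax_paid = lifetime_income * income_tax_rate
    return lifetime_income - tax_paid, tax_paid


t_c = 0.25
t_i = t_c / (1 + t_c)  # the flat income tax rate that raises the same revenue
print(consume_under_consumption_tax(100.0, t_c))
print(consume_under_income_tax(100.0, t_i))
```

Both calls leave the household with the same consumption and the government with the same revenue, which is the sense in which the two taxes are the same tax.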

Since (1) we have no good reason to think a flat consumption tax is optimal and (2) a flat consumption tax is equivalent to a particular kind of income tax, we have no good reason to prefer a flat consumption tax over an income tax.

Minor Variations

A natural rejoinder to this is that consumption taxes don't need to be completely flat.

For instance, some people say we could exempt necessities or provide a refund of some number of dollars to each person.

Unfortunately, these specific justifications aren't really promising when examined rigorously since they don't really offer anything an income tax wouldn't (i.e. progressivity).

That being said, there is a place for specific consumption taxes in the optimal tax system. The obvious exception is Pigovian taxes, which are justifiable even absent redistribution concerns. Beyond that, there are actually several other examples of goods/services that should be (theoretically) taxed even with optimal income taxation. In particular:

  • Things correlating with leisure (games, hotels, flight tickets, etc)
  • Things black-market sellers tend to want to buy on the legal market.
  • College tuition.

That being said, despite these theoretical arguments, I continue to be skeptical of consumption taxation in ways beyond Pigovian taxes for a couple reasons:

  • The welfare effects of taxing/subsidizing specific goods/services can vary quite a bit for reasons completely independent of a standard labor/consumption model. For instance, we should probably actually subsidize leisure activities and we should probably be taxing college anyways.
  • Despite my rather "socialist" conclusions on this page, I do actually give some merit to the idea that expansion of the government to fix tiny social problems is likely net-negative, even if financially, the pros outweigh the cons. In the real world legislators and regulators aren't perfect when writing the law, bureaucrats and police aren't perfect enforcing it, and the cost for private people/firms to comply isn't zero.

[ As an aside, I do think that if you don't care at all about redistribution, consumption taxes have a lot to offer since you can structure them to minimize deadweight loss. ]

The Top Tax Rate

The Optimal Top Tax Rate

The traditional interpretation of the Laffer curve is that you never want to have rates above the revenue-maximizing rate, since such rates harm both the government's ability to help society and the person being taxed. For the very wealthy, it's been argued that the marginal social welfare of an additional dollar is effectively zero. I generally buy this argument.

Note that while I think much of the tax literature is bunk, my criticism doesn't affect analysis of the top tax bracket, because changes to this rate don't affect people who make less than the top bracket's threshold. This means that analysis of this optimal rate can be done in an almost entirely model-agnostic way. To this end, it has been shown that, assuming the marginal social welfare is zero for the very wealthy, the optimal top tax rate is given by (Saez):

$$ \tau^* = \frac{1}{1 + a \zeta} $$

where $\zeta$ is the elasticity of income with respect to the net-of-tax rate $1 - \tau$ and $a$ is the Pareto parameter of the income distribution (i.e. how unequally income is distributed). Note: this method is not super precise because $\zeta$ could vary with the tax rate. Nevertheless, this method should give a somewhat accurate and reasonably assumption-free estimate of the optimal top tax rate. Based on this, Saez estimates an optimal top rate of between 50% and 80%, the precise value depending a great deal on $\zeta$.
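To get a feel for how sensitive the result is to $\zeta$, the following sketch evaluates the zero-welfare-weight formula from Saez (2001), $\tau^* = 1/(1 + a\zeta)$, where $\zeta$ is the elasticity of income with respect to the net-of-tax rate. The Pareto parameter `a = 2` and the grid of elasticities are illustrative choices of mine, not estimates.

```python
def optimal_top_rate(pareto_a, elasticity):
    """Revenue-maximizing top marginal rate when the social welfare
    weight on top earners is zero (Saez, 2001)."""
    return 1.0 / (1.0 + pareto_a * elasticity)


for zeta in (0.125, 0.25, 0.5):
    print(f"zeta={zeta}: tau* = {optimal_top_rate(2.0, zeta):.0%}")
```

With $a = 2$, elasticities between 0.125 and 0.5 span the 50% to 80% range quoted above.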

These rates are rather high relative to what the US has now and, in fact, you could justify an even higher tax rate if you actually want to discourage working, which seems like a realistic goal for a utilitarian.

Contra Optimal Top Tax Rate (and contra that)

Conversely, as mentioned above, the primary assumption here is that the optimal tax rate is one that maximizes revenue. In principle, you could justify a lower tax rate by claiming that consumption by the very wealthy yields non-zero social welfare. While technically true, I am really skeptical this changes the analysis much.

Besides rejecting the utilitarian framework, the other avenue of attack is to claim that the rich improve the welfare of other people in the economy and that high tax rates discourage this, thereby harming people with lower incomes.

Generally speaking, I'm skeptical of this claim. In a competitive labor market, your compensation should be equal to the marginal revenue you bring into the firm, which doesn't leave a whole lot of room for positive externalities.

That being said, I do think this argument holds some water for innovators. In particular, innovation has positive externalities:

  • Once the patent expires, anyone can duplicate their work.
  • Even before the patent expires, other people can build off the work's novel ideas.

That being said, this argument for subsidizing the tax rates of the wealthy has some weaknesses:

  • Patents give the first innovator a legal monopoly already. If their innovation would otherwise never have happened, this is still a Pareto efficient improvement, but I think it's fairly intuitive that a large amount of new innovation would have happened anyways without any particular inventor. In this sense, a large portion of the money enabled by the patent is just a rent. Because of this, it's entirely plausible that we've already "over-internalized" the positive externality from innovations.
  • Presumably a large portion of the wealthy are not, in fact, innovators. Given this, if we want to encourage more innovation, there are more effective ways: funding for R&D, tax credits for R&D, subsidizing grad school for STEM students, promoting savings, etc.

In short, it's likely labor-discouraging tax rates would reduce innovation, but it's unclear whether we're over- or under-supplying innovation at the moment.

My Model


[ For simplicity, I'll be focusing only on labor income in this section, and I'll be ignoring all tagging considerations except for household size. ]

The standard model used in the optimal taxation literature includes the idea that if $t$ is the marginal tax rate, each 1% change in $(1-t)$ will cause a fixed percent change in hours worked (a constant elasticity). This assumption can be useful (as in computing the optimal top rate above), but it is obviously an extreme simplification that I generally think runs contrary to the evidence.

In particular, lump sum cash transfers also reduce hours worked (Robins), as do lottery winnings (Imbens). Moreover, despite the fact that wages are now far higher than they were historically, we work fewer hours. Since higher wages are equivalent to a lower flat tax, this flies in the face of conventional economic theory.

With all that in mind, the obvious utility function to use is (in my opinion)

$$ u = -\frac{(w L)^{-\epsilon}}{\epsilon} - \alpha \frac{L^{\beta}}{\beta} $$

The first term is the typical way income/consumption is related to utility in the literature (mostly because it assumes a constant elasticity). The second term is just the same type of function (a power function) for labor, which is likewise equivalent to supposing a constant elasticity between utility and labor.

The two denominators are there just to simplify the math, but don't affect the result because of the $\alpha$ parameter. To see this, just take the derivative:

$$ \frac{du}{dL} = w (w L)^{-\epsilon-1} - \alpha L^{\beta-1} $$

Then set to zero and solve:

$$ \frac{w^{-\epsilon}}{\alpha} = L^{\beta + \epsilon} $$

[ Note: this implies that a 1% increase in $w$ should cause a $\frac{\epsilon}{\epsilon + \beta}$% decrease in $L$. ]
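The closed-form solution and the elasticity in the note above can be checked numerically. This is a minimal sketch; the values of $\alpha$, $\beta$, and $w$ are arbitrary placeholders, and $\epsilon = 0.35$ is used only for concreteness.

```python
def optimal_labor(wage, alpha, beta, eps):
    """Hours that solve w^(-eps) / alpha = L^(beta + eps)."""
    return (wage ** (-eps) / alpha) ** (1.0 / (beta + eps))


def marginal_utility_of_labor(labor, wage, alpha, beta, eps):
    """du/dL = w (w L)^(-eps - 1) - alpha L^(beta - 1)."""
    return wage * (wage * labor) ** (-eps - 1) - alpha * labor ** (beta - 1)


alpha, beta, eps, wage = 1.0, 2.0, 0.35, 20.0   # placeholder values
hours = optimal_labor(wage, alpha, beta, eps)
print("du/dL at the optimum:", marginal_utility_of_labor(hours, wage, alpha, beta, eps))

# Percent change in hours after a 1% raise, versus the formula -eps / (eps + beta).
raised_hours = optimal_labor(wage * 1.01, alpha, beta, eps)
simulated_pct = 100 * (raised_hours / hours - 1)
formula_pct = -eps / (eps + beta)
print("simulated % change:", simulated_pct)
print("formula % change:  ", formula_pct)
```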

However, people live in households and coordinate decisions with the people living there. To extend the above individual utility function, we just add some extra people's labors and assume cost of living grows with the household size by $c(n)$:

$$ -\frac{1}{\epsilon} \left( \frac{1}{c(n)} \sum_{i=1}^{n}{w_i L_i} \right)^{-\epsilon} - \frac{\alpha}{\beta} \sum_{i=1}^{n}{L_i^{\beta}} $$

For simplicity's sake, I assume $c(n)$ is simply the poverty line (HHS Poverty Guidelines for 2020), but note that since $\alpha$ and $\epsilon$ are yet to be chosen, the choice of $c(n)$ is scale-invariant.

Speaking of, I chose $\epsilon = 0.35$ since that's what the evidence appears to suggest.

Estimating $\beta$

First things first, $\beta$ must be greater than 1 because otherwise each hour of free time would actually be worth more than the last, which is obviously ludicrous.

Returning briefly to the individual utility function, the math above shows that we can estimate $\beta$ with

$$ \beta = \frac{\epsilon}{\delta} - \epsilon $$

where $\delta$ is the magnitude of the elasticity of $L$ with respect to $w$ (the $\frac{\epsilon}{\epsilon + \beta}$ from the note above).

Historically, between 1948 and 2018, real GDP per hour worked increased 3.43-fold while hours-per-working-age-person increased 8%. However, the surge of women into the workforce increased the number of laborers by ~22%, and women work 16% fewer hours than men, which suggests that, absent that trend, $L$ would have decreased by around 10%:

0.08 - 0.22*(1-0.16)

Plugging that in yields an estimate of $\delta \approx 0.077$ and $\beta \approx 4.2 $.
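The arithmetic behind these two steps is easy to reproduce. The sketch below only recomputes the counterfactual change in hours and the mapping from $\delta$ to $\beta$; it takes $\delta \approx 0.077$ as given rather than re-deriving it from the productivity data. Variable names are mine.

```python
eps = 0.35  # curvature of consumption utility used in the text

# Counterfactual change in hours, netting out women entering the workforce.
observed_change = 0.08   # hours per working-age person rose 8%
extra_workers = 0.22     # ~22% more laborers
hours_gap = 0.16         # women work 16% fewer hours than men
counterfactual_change = observed_change - extra_workers * (1 - hours_gap)
print(f"counterfactual change in L: {counterfactual_change:.1%}")


def beta_from_wage_elasticity(delta, eps):
    """Invert delta = eps / (eps + beta)."""
    return eps / delta - eps


beta_estimate = beta_from_wage_elasticity(0.077, eps)
print(f"beta: {beta_estimate:.1f}")
```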

However, I'm not very confident with this approach since my estimation of $\delta$ is pretty ad hoc and quite sensitive to my estimate for $\epsilon$.

Another approach is to look at how much less people work when they are given transfers. To do this, we alter the utility function to:

$$ u = -\frac{(w L + \gamma)^{-\epsilon}}{\epsilon} - \alpha \frac{L^{\beta}}{\beta} $$

where $\gamma$ is the size of the transfer. Optimizing yields

$$ \frac{w^{-\epsilon}}{\alpha} = L^{\beta+\epsilon} (1 + \gamma/(wL))^{\epsilon+1} $$

Assuming $\gamma$ is small, this suggests that increasing $\gamma$ by 1% of someone's income reduces $L$ by $(\epsilon+1)/(\epsilon + \beta)$%.
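The step from the first-order condition to this rule of thumb can be sketched as follows (a first-order approximation, valid only when $\gamma$ is small relative to $wL$). Holding $w$ fixed and taking logs of the optimality condition:

$$ (\beta+\epsilon) \ln L + (\epsilon+1) \ln\left(1 + \frac{\gamma}{wL}\right) = \ln \frac{w^{-\epsilon}}{\alpha} $$

Since $\ln(1+x) \approx x$ for small $x$, a small change in the transfer share must be offset by a change in $\ln L$:

$$ \Delta \ln L \approx -\frac{\epsilon+1}{\epsilon+\beta} \, \Delta\left(\frac{\gamma}{wL}\right) $$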

Lottery research suggests that this is about 0.1 (Imbens), which implies $\beta \approx 13$. A series of negative income tax studies conducted in the US guaranteed people significant portions of their income (typically around half) and saw ~5% reductions in labor, again suggesting $\beta \approx 13$ (data from Robins).

However, both these studies are biased towards finding small reductions in labor (and hence a high $\beta$):

  • If you win the lottery, your leisure time is less valuable to you because all your friends still have to work.
  • The negative income tax studies only lasted a few years and long-term elasticities are generally higher than short-term ones.

So how low could $\beta$ be? As a gut check, suppose we guaranteed everyone their current income forever. It seems plausible this would cut hours worked in half. If so, that implies $\beta \approx 2.4$, which is much lower than our estimate of 13.

All things considered, it seems very likely that $1 \lt \beta \lt 13$, but that still leaves quite a bit of room for error. Due to this uncertainty, I'll continue the analysis using both $\beta=2$ and $\beta=13$.
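The values of $\beta$ discussed in this section all come from inverting the same small-transfer rule: the percent drop in $L$ is roughly $(\epsilon+1)/(\epsilon+\beta)$ times the transfer as a percent of income. Below is a minimal sketch, with function names of my own choosing. In the gut check the transfer equals a person's entire income, which stretches the small-$\gamma$ approximation, so the sketch also solves the first-order condition exactly for that case as a sensitivity check.

```python
import math

EPS = 0.35


def beta_small_transfer(labor_drop, transfer_share, eps=EPS):
    """Invert labor_drop = (eps + 1) / (eps + beta) * transfer_share."""
    return (eps + 1) * transfer_share / labor_drop - eps


def beta_exact(labor_ratio, transfer_share, eps=EPS):
    """Solve L0^(beta+eps) = L^(beta+eps) * (1 + gamma/(w L))^(eps+1) for beta.

    labor_ratio is L / L0; transfer_share is gamma / (w L0).
    """
    growth = 1 + transfer_share / labor_ratio
    return (eps + 1) * math.log(growth) / math.log(1 / labor_ratio) - eps


print("lottery (0.1% drop per 1% transfer):", round(beta_small_transfer(0.001, 0.01), 2))
print("NIT (5% drop for a ~50% transfer):  ", round(beta_small_transfer(0.05, 0.5), 2))
print("gut check, linear rule:             ", round(beta_small_transfer(0.5, 1.0), 2))
print("gut check, exact condition:         ", round(beta_exact(0.5, 1.0), 2))
```

The linear rule gives roughly 13 for both studies and roughly 2.4 for the gut check. The exact condition puts the gut-check value somewhat lower, still inside the $1 \lt \beta \lt 13$ range.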

Demographic Distribution

Based on work by Saez, we know the upper end of the income distribution is characterized by a Pareto distribution with Pareto parameter $a \approx 2$, at least for married taxpayers (Saez). We also know the threshold for top 1% income for an individual is about \$329,551 (DQYDJ). From this we can deduce the PDF for the top 1% is

$$ (1295/x)^3 $$
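The constant 1295 can be recovered from the two inputs above. This sketch assumes an exact Pareto tail with shape 2 and treats \$329,551 as the income exceeded by exactly 1% of individuals; variable names are mine.

```python
shape = 2.0            # Pareto parameter
threshold = 329_551.0  # top-1% income threshold
top_share = 0.01

# Survival function P(X > x) = (scale / x) ** shape must equal 1% at the threshold.
scale = threshold * top_share ** (1 / shape)

# Density: shape * scale**shape / x**(shape + 1), rewritten as (c / x) ** (shape + 1).
c = (shape * scale ** shape) ** (1 / (shape + 1))
print(f"c = {c:.1f}")


def top_tail_pdf(x):
    """Density of income x, valid for x at or above the top-1% threshold."""
    return (c / x) ** (shape + 1)


# Sanity check: the density should put 1% of the mass above the threshold.
# The integral of c^3 / x^3 from T to infinity is c^3 / (2 T^2).
mass_above_threshold = c ** 3 / (2 * threshold ** 2)
print(f"mass above threshold = {mass_above_threshold:.4f}")
```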

TODO: Finish qqqq

The other thing we need to include is how wages correlate with household size. For simplicity, we'll assume all households consist of either a man, a woman, or a man-woman pair, and that any of these groups can have kids. I'm ignoring gay and lesbian couples due to lack of data.

The first thing to note is that when both spouses earn, the correlation between their incomes is ~0.23 (Schwartz). Note that since spouses don't all work the same number of hours, the correlation between spouses' wages is even higher. In particular, 40% of wives don't work, 40% work full-time, and 20% work part-time. If we therefore treat whether a working wife works part-time or full-time as a biased (1:2) coin flip, we can compute the implied true wage correlation, which ends up being ~0.24.

Next, we note that 48% of households have a married couple, 17% have a single head with dependents, 17% are a single male, and 18% are a single female (Table HH-1).

While I can find household size distributions in general (HH-4), I can't find distributions by marital status, so I'll just assume that the distribution of number-of-kids is the same for married and unmarried households. Doing this yields





Saez, E. (2001). Using elasticities to derive optimal income tax rates. The Review of Economic Studies, 68(1), 205-229. https://doi.org/10.1111/1467-937X.00166

Wikipedia contributors. (2020, July 14). Solow–Swan model. In Wikipedia, The Free Encyclopedia. Retrieved 21:03, September 14, 2020, from https://en.wikipedia.org/w/index.php?title=Solow%E2%80%93Swan_model&oldid=967676382

Wikipedia contributors. (2019, December 4). Equity home bias puzzle. In Wikipedia, The Free Encyclopedia. Retrieved 20:52, September 9, 2020, from https://en.wikipedia.org/w/index.php?title=Equity_home_bias_puzzle&oldid=929180771

Wikipedia contributors. (2020, June 23). Backus–Kehoe–Kydland puzzle. In Wikipedia, The Free Encyclopedia. Retrieved 20:52, September 9, 2020, from https://en.wikipedia.org/w/index.php?title=Backus%E2%80%93Kehoe%E2%80%93Kydland_puzzle&oldid=964074299

Wikipedia contributors. (2020, July 9). Feldstein–Horioka puzzle. In Wikipedia, The Free Encyclopedia. Retrieved 20:52, September 9, 2020, from https://en.wikipedia.org/w/index.php?title=Feldstein%E2%80%93Horioka_puzzle&oldid=966830962

Peloza, J., & Steel, P. (2005). The price elasticities of charitable contributions: A meta-analysis. Journal of Public Policy & Marketing, 24(2), 260-272. https://doi.org/10.1509%2Fjppm.2005.24.2.260

Robins, P. K. (1985). A comparison of the labor supply findings from the four negative income tax experiments. Journal of Human Resources, 567-582. https://doi.org/10.2307%2F145685

Imbens, G. W., Rubin, D. B., & Sacerdote, B. I. (2001). Estimating the effect of unearned income on labor earnings, savings, and consumption: Evidence from a survey of lottery players. American Economic Review, 91(4), 778-794. https://doi.org/10.1257/aer.91.4.778

Office of the Assistant Secretary for Planning and Evaluation. (2020). U.S. Federal Poverty Guidelines Used to Determine Financial Eligibility for Certain Federal Programs: HHS Poverty Guidelines for 2020. https://aspe.hhs.gov/poverty-guidelines

Schwartz, C. R. (2010). Earnings inequality and the changing association between spouses’ earnings. American Journal of Sociology, 115(5), 1524-1557. https://doi.org/10.1086/651373

United States Census Bureau. (2019). Table HH-1. Households by Type: 1940 to Present. https://www.census.gov/data/tables/time-series/demo/families/households.html

United States Census Bureau. (2019). Table HH-4. Households by Size: 1960 to Present. https://www.census.gov/data/tables/time-series/demo/families/households.html

DQYDJ. Average, Median, Top 1%, and all United States Individual Income Percentiles in 2019. https://dqydj.com/average-median-top-individual-income-percentiles/