redding.dev

chetty economics taxes inequality literature-summary

Optimal Taxation Literature

[ This page is my attempt at an unbiased review of the optimal taxation literature, much as one would get at a university. It is based on videos of lectures from Harvard, the cited papers, and other reading. It is mostly based on this series of lectures Topic 4: Optimal Taxation Part 1. I largely ignore empirical considerations, preferring to focus this page on the purely theoretical models and their results. ]

TODO: Efficiency cost of Taxation (start here).

Lump Sum Transfers

The second fundamental theorem of welfare economics states that you can achieve any Pareto optimal outcome via only lump-sum wealth redistribution. For a proof see the Wikipedia page Fundamental theorems of welfare economics.

Unfortunately, this theory runs into serious problems when it makes contact with reality.

Probably the most fatal flaw is that it assumes the government has perfect information. In practice, governments generally want to transfer money from people with high earning ability to people with low earning ability, but they don't have a way to perfectly determine each person's earning ability.

Other problems come from the theorem's other fairly strong assumptions: that there are no transaction costs, that all actors have perfect information, and that there are no monopolies.

Nevertheless, this ideal has significant influence on the literature and you'll often see papers make claims about being the second best redistribution mechanism - the best is always implied to be lump sum transfers.

Consumption Taxes

Ramsey Model

The overall elasticity of a good is the percent change in equilibrium quantity caused by a one percent change in price due to taxes. You can compute this from the elasticity of demand ($\epsilon_d$) and supply ($\epsilon_s$) using this formula:

$$ \frac{\epsilon_d \epsilon_s}{\epsilon_d + \epsilon_s} $$

Ramsey proved that if you're levying flat consumption taxes, then the tax on each good should be inversely proportional to the elasticity of that good to minimize deadweight loss A Contribution to the Theory of Taxation.

This system obviously ignores redistributive goals and is probably regressive since necessities tend to be inelastic while luxury goods tend to be elastic.
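As a rough illustration of the inverse-elasticity rule, the sketch below combines demand and supply elasticities with the formula above and then gives each good a tax rate proportional to one over that overall elasticity. The goods, their elasticities, and the scale constant `k` are made-up assumptions, and elasticities are entered as absolute values.

```python
def overall_elasticity(eps_d, eps_s):
    """Combine demand and supply elasticities (both as absolute values)."""
    return eps_d * eps_s / (eps_d + eps_s)


def ramsey_rates(goods, k=0.05):
    """Tax rate on each good, inversely proportional to its overall elasticity.

    k is an arbitrary scale (an assumption for this sketch); in a full model
    it would be pinned down by how much revenue must be raised.
    """
    return {name: k / overall_elasticity(d, s) for name, (d, s) in goods.items()}


# Hypothetical (demand elasticity, supply elasticity) pairs
goods = {"bread": (0.2, 1.0), "yachts": (2.0, 1.0)}
rates = ramsey_rates(goods)
for name, rate in rates.items():
    print(name, round(rate, 3))
```

The inelastic necessity ends up with the higher rate, which is exactly the regressivity concern.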

Diamond extends this result from optimizing merely efficiency to optimizing social welfare as a whole. He determined that if we still restrict ourselves to flat consumption taxes, the optimal rate on each good should be multiplied by the average marginal utility for consumers of that good A many-person Ramsey tax rule.

Land Value Tax

As a special case of the Ramsey rule, perfectly inelastic goods and services should be taxed to the greatest extent possible. This is the main impetus behind a land value tax - taxing the value of all land but not the value of any improvements made to it (e.g. buildings).

The logic is simply that since the supply of land is perfectly inelastic (well mostly Land reclamation), a tax on land has zero dead-weight loss and is therefore optimal from an efficiency standpoint. In fact, from this perspective, we can levy a tax equal to the entire value of the land times the interest rate, effectively giving the government all income derived from unimproved land - all without any efficiency trade-off.

Another reason the tax tends to be popular is that it's progressive, since the poor tend to own no land at all. A skeptic may think landlords will just raise rents, but since the supply is perfectly inelastic, the entire incidence of the tax should fall on landlords rather than renters.
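The incidence claim can be checked against the standard textbook rule that, for a small tax in a competitive market, buyers bear a share $\epsilon_s / (\epsilon_s + \epsilon_d)$ of the tax (both elasticities as absolute values). That rule is not derived on this page, so treat the sketch below as resting on that assumption; the plot value and interest rate are made up.

```python
def buyer_share(eps_d, eps_s):
    """Fraction of a small tax borne by buyers (elasticities as absolute values)."""
    return eps_s / (eps_s + eps_d)


def annual_full_lvt(land_value, interest_rate):
    """Yearly tax that captures all income from unimproved land."""
    return land_value * interest_rate


# Perfectly inelastic supply (land): renters bear none of the tax.
land_share = buyer_share(eps_d=0.5, eps_s=0.0)
# An ordinary good with equal elasticities: the tax is split evenly.
ordinary_share = buyer_share(eps_d=0.5, eps_s=0.5)
# Hypothetical plot worth 200,000 at a 5% interest rate.
yearly_tax = annual_full_lvt(200_000, 0.05)
print(land_share, ordinary_share, round(yearly_tax, 2))
```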

Finally, if your unimproved plot of land gains value, it's because other people improved the land around you. This makes that value an externality and makes you essentially a rent seeker. A 100% land value tax removes this unfair value people get when they luck out and their land becomes more valuable through no actions of their own.

There are also arguments against:

It's difficult to assess how much a piece of land would be worth without any improvements.
If land values increase around you, you may be unable to afford the land value tax and will, thus, be forced from your home.
A land value tax is unnecessary to the extent the Atkinson-Stiglitz result holds (discussed later).

Income Tax

Laffer Curve (Flat Income Tax)

Instead of optimizing for social welfare, what if we want to optimize for tax revenue? For simplicity, we'll consider a flat tax example.

First of all, it is obvious that revenue will be zero at t=0 since there is no tax. It is nearly as obvious that revenue will be zero at t=1, since no one will work if the tax rate is 100%. Assuming some revenue can be raised, it therefore follows the revenue-maximizing rate is between 0% and 100%.

Let $\epsilon$ be the elasticity of income with respect to $1-t$. For instance, if $\epsilon=0.4$, then increasing taxes from 0% to 1% will cause income to decrease by 0.4%.

Given the above, we can show that the revenue-maximizing tax rate is given by

$$ t_{max} = \frac{1}{\epsilon + 1} $$

Using our arbitrary example of $\epsilon=0.4$, this implies a revenue-maximizing rate of ~71%.
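Here is a quick numerical check of that formula. It is a sketch that assumes the elasticity is constant, so pre-tax income is $z_0 (1-t)^\epsilon$ and revenue is $t \cdot z_0 (1-t)^\epsilon$; the function names and the brute-force grid are my own choices.

```python
def revenue(t, eps, z0=1.0):
    """Revenue from a flat tax t when income is z0 * (1 - t) ** eps."""
    return t * z0 * (1.0 - t) ** eps


def revenue_maximizing_rate(eps, steps=100_000):
    """Brute-force search over tax rates between 0 and 1."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: revenue(t, eps))


eps = 0.4
t_grid = revenue_maximizing_rate(eps)
t_formula = 1.0 / (1.0 + eps)
print(round(t_grid, 3), round(t_formula, 3))
```

Revenue is zero at both ends of the grid and the search lands on the same rate as the closed-form expression.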

The traditional interpretation is that you never want to have rates above the revenue-maximizing rate, since such rates harm both the government's ability to help society and also harm the person being taxed. In principle, (and maybe in practice), you could justify a higher tax rate if you actually want to discourage working.

The Mirrlees Model (Income Taxes)

Suppose consumption is given by $c = w L - T(w L)$, where $w$ is your wage, $L$ is your labor, and $T(w L)$ is how much you pay in taxes.

Also, suppose everyone has the same utility function $u(c, L)$, where $c$ is consumption and $L$ is labor, and that they choose $L$ to maximize it. The first-order condition is:

$$ w(1-T')\frac{\partial u}{\partial c} + \frac{\partial u}{\partial L} = 0 $$
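To make the condition above concrete, here is a sketch with one particular utility function, $u(c, L) = c - \frac{L^{1+1/e}}{1+1/e}$, and a flat tax at rate `t`. Both are assumptions picked for illustration, not part of the model. With this utility the condition reduces to $L = (w(1-t))^e$, so $e$ is the elasticity of labor with respect to the after-tax wage.

```python
def utility(L, w, t, e):
    """u(c, L) = c - L**(1 + 1/e) / (1 + 1/e), with c = w * L * (1 - t)."""
    c = w * L * (1.0 - t)
    return c - L ** (1.0 + 1.0 / e) / (1.0 + 1.0 / e)


def best_labor_grid(w, t, e, max_L=10.0, steps=100_000):
    """Labor supply found by brute force over a grid of L values."""
    grid = [max_L * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda L: utility(L, w, t, e))


w, t, e = 4.0, 0.25, 0.5           # arbitrary wage, tax rate, and elasticity
L_grid = best_labor_grid(w, t, e)  # direct maximization
L_foc = (w * (1.0 - t)) ** e       # solution of the condition above
print(round(L_grid, 3), round(L_foc, 3))
```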

Finally, suppose there are $h(w)$ people with wage $w$ and that the government has its own social welfare function $G$ such that it is optimizing for

$$ \int_0^\infty{G(u(c,L)) h(w) dw} $$

but it needs to raise enough revenue to match our expenditures, $E$:

$$ \int_0^\infty T(w L) h(w) dw \geq E $$

Two notes:

Conventional utilitarianism suggests $G(x)=x$. However, both unconventional utilitarians (e.g. Rawlsian ones) and conventional utilitarians who believe people optimize for the wrong things should choose a non-trivial $G$.
This model is premised on the assumption that the only reason we don't perfectly redistribute income is due to behavioral responses. Some people argue that at least some people intrinsically deserve at least some of their income.

From this approach, a few general results have been proven:

Taxes should be negative at low incomes.
Taxes should be positive at high incomes.
Marginal tax rates should never be negative Seade.
Marginal tax rates should never be above 100% (trivial since no one would work).
The marginal tax rate on the very highest earner should be zero if the skill distribution is bounded Sadka. This result is generally believed to be inapplicable to reality as we'll discuss in a bit.

Suppose the maximum tax bracket has rate $\tau$ and is applied to all income above $\bar{z}$. Suppose there are $N$ people who earn above that and that their average income is $z^m$.

If I increase $\tau$, this

increases tax revenue simply due to having a higher tax rate
reduces tax revenue by discouraging labor
reduces the welfare of these $N$ people by making them consume less

These three effects are called the fiscal effect, the behavioral effect, and the welfare effect. I won't really discuss them more, but they come up a lot in the literature and are the sources of some of the derivatives I merely state later on.

We can represent this mathematically using the derivative of social welfare with respect to $\tau$:

$$ N \left( (1-g)(z^m - \bar{z}) - \epsilon \frac{\tau}{1 - \tau}z^m \right) $$

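where $g$ represents how much the government values an extra dollar of consumption for these top earners relative to an extra dollar of tax revenue.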

We can optimize by setting this derivative to zero to achieve Using elasticities to derive optimal income tax rates:

$$ \frac{\tau}{1 - \tau} = \frac{(1-g)(z^m/\bar{z} - 1)}{\epsilon \cdot z^m / \bar{z}} $$

Note, these results make intuitive sense:

As $g$ increases, $\tau$ should decrease.
As $\epsilon$ increases, $\tau$ should decrease.
As $z^m/\bar{z}$ increases, income inequality increases, and $\tau$ increases.
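These comparative statics are easy to check numerically. The sketch below just solves the formula above for $\tau$; the parameter values are arbitrary and `ratio` is my name for $z^m/\bar{z}$.

```python
def optimal_top_rate(g, eps, ratio):
    """Top rate from tau / (1 - tau) = (1 - g)(ratio - 1) / (eps * ratio).

    ratio is z^m / z_bar: average income above the threshold, divided by
    the threshold itself.
    """
    x = (1.0 - g) * (ratio - 1.0) / (eps * ratio)
    return x / (1.0 + x)


base = optimal_top_rate(g=0.0, eps=0.25, ratio=2.0)
higher_g = optimal_top_rate(g=0.5, eps=0.25, ratio=2.0)
higher_eps = optimal_top_rate(g=0.0, eps=0.5, ratio=2.0)
higher_ratio = optimal_top_rate(g=0.0, eps=0.25, ratio=3.0)
print(round(base, 3), round(higher_g, 3), round(higher_eps, 3), round(higher_ratio, 3))
```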

A couple things to note:

This formula is less precise than it appears. Most people believe, for instance, that changes to the top tax rate will affect $z^m$ and $\epsilon$.
This result can also be used to show both (1) that the top earner should face a 0% marginal tax rate and (2) that this "proof" is completely inapplicable to reality.
Basically, for the top earner, as $\bar{z}$ approaches their income, the optimal $\tau$ goes to zero. Hence, if we were to implement a tax bracket just below the top earner's income, that bracket's rate should be 0%.
In English, the basic argument goes "if we can predict the top earner's income, we can introduce a new 0% tax bracket just below to encourage them to work more with virtually no revenue loss."
However, this entire paradigm is inapplicable to reality, because we cannot (in fact) predict the top earner's income very well. If, for instance, the top bracket's threshold is half the top earner's income, this entire argument utterly breaks down.
As a special case, we can reproduce the Laffer curve result by letting $\bar{z}=0$ and $g=0$.
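Spelling out that last special case: the version of the formula written in terms of $z^m/\bar{z}$ divides by $\bar{z}$, so it is easier to start from the derivative of social welfare itself and plug in $\bar{z}=0$ and $g=0$:

$$ N \left( (1-0)(z^m - 0) - \epsilon \frac{\tau}{1 - \tau}z^m \right) = 0 \implies \frac{\tau}{1-\tau} = \frac{1}{\epsilon} \implies \tau = \frac{1}{\epsilon + 1} $$

which matches the flat-tax $t_{max}$ from the Laffer curve section.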

After proving this, Saez uses this model to argue the optimal top tax rate is between 50% and 80% Using elasticities to derive optimal income tax rates.

Let's generalize this to non-linear income taxes.

Consider the people earning between $z$ and $z+dz$. If I raise their marginal tax rate by a small amount, everyone earning above $z$ pays more tax, while the people inside the band have less incentive to work. The derivative of social welfare with respect to that marginal tax rate is:

$$ (1 - H(z)) - (1 - H(z))G(z) - h(z) \cdot z \cdot \epsilon \cdot \frac{T'}{1-T'} $$

where $H$ is the CDF of income, $h$ is its pdf, and $G(z)$ is the average social welfare weight the government places on people earning above $z$.

When we set this to zero, we get

$$ \frac{T'(z)}{1-T'(z)} = \frac{1}{\epsilon} \frac{1 - H(z)}{z h(z)} (1 - G(z)) $$

Note, this result requires us to assume that labor decisions are only made based on the marginal tax rate. This is almost certainly false.

One note on the $(1-H(z))/h(z)$ term. This term implies that when few people are earning $z$ income and lots of people are earning above $z$, the marginal tax rate at $z$ should be high. The intuitive reason is that our distortion is low (few people pay the high marginal tax), but it lets us raise effective rates on everyone above that point without discouraging their labor.
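To see how the distribution term behaves, here is a sketch that plugs a Pareto income distribution into the formula above. The Pareto shape `a`, the constant welfare weight `G`, and the constant elasticity `eps` are all simplifying assumptions of mine; in the real formula $G$ varies with $z$.

```python
def pareto_cdf(z, a, z_min=1.0):
    """H(z) for a Pareto distribution with shape a and minimum income z_min."""
    return 1.0 - (z_min / z) ** a


def pareto_pdf(z, a, z_min=1.0):
    """h(z) for the same Pareto distribution."""
    return a * z_min ** a / z ** (a + 1.0)


def marginal_rate(z, eps, G, a):
    """Solve T'/(1 - T') = (1/eps) * (1 - H)/(z * h) * (1 - G) for T'."""
    dist_term = (1.0 - pareto_cdf(z, a)) / (z * pareto_pdf(z, a))
    x = dist_term * (1.0 - G) / eps
    return x / (1.0 + x)


for z in (2.0, 10.0, 50.0):
    print(z, round(marginal_rate(z, eps=0.25, G=0.5, a=2.0), 3))
```

With a Pareto distribution the $(1-H(z))/(z h(z))$ term is the constant $1/a$, so under these assumptions the marginal rate is the same at every income; any variation has to come from $G$, $\epsilon$, or a non-Pareto shape.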

Although we can't elegantly solve for the entire tax system, we can solve for the best system computationally based on the current US tax system and income distribution. Saez did this in Using elasticities to derive optimal income tax rates and found:

Typically this Mirrlees model advocates that marginal tax rates follow a U-shaped curve, while having a lump sum grant to those with no earnings.

TODO: Diamond 1998, Piketty 1997

Discrete Labor Model

One of the problems with the Mirrlees model is that it assumes people can choose any number of hours, when, in reality, it is difficult to work, say, 4 hours per week. Instead, people are likely to respond to taxes and transfers by dropping out of or re-entering the workforce.

Suppose we have a finite number of jobs and each individual is trained for only one of those jobs and all people working the same job earn the same wage. Each individual can merely choose whether or not to work and the government can choose to levy a tax on each job. Finally, suppose that if after-tax income goes down by 1%, the participation rate of a job goes down by some fixed percent.

This model implies that work subsidies are optimal Optimal income transfer programs: intensive versus extensive labor supply responses - directly contradicting our earlier result that the marginal tax rate should never be negative. This justifies programs like the EITC.

Saez finally combines both the Mirrlees and the discrete labor model into a single general model, but that is more complicated, so we won't go into it other than to say that this can still support negative marginal tax rates at low incomes.

Capital Gains Taxation

The Wikipedia page Optimal capital income taxation provides a really good overview - to the extent that I feel like I'm almost copy-pasting it at times, but I've seen similar collections of arguments in lectures, so I guess this falls under "common knowledge". I'm mainly including this section for completeness.

Arguments Against

The Ramsey model implies there should be no capital gains tax. The reasoning was given by Judd and Chamley and is relatively straightforward:

Suppose the interest rate is 5%, we levy a 10% capital gains tax, and you invest \$1 to spend later.
If a consumption tax rate is $t$, then instead of being able to buy $A$ goods, you'll only be able to buy $A/(1+t)$ goods. Equivalently, if you can buy $B$ goods with the tax, then $t=A/B-1$.
After 1 year, you have \$1.045 to consume, but you would have had \$1.05 to consume absent the tax. This implies a tax rate of (1.05/1.045-1) ~ 0.48%
After 100 years, you have \$81.59 to consume, but you would have had \$131.50 to consume absent the tax. This implies a tax rate of 131.50/81.59-1 ~ 61%
After 1000 years, the implied tax rate grows to 11731%.
etc. As time goes to infinity, the implied tax rate grows to infinity.
However, the Ramsey formula implies the tax rate cannot be infinity for any good, so we have a contradiction.

From this we know that, in the long run, capital gains taxes must tend towards zero.
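The arithmetic in the list above can be reproduced in a few lines. This is only a sketch: it assumes, as the example does, that the 10% tax is charged on each year's 5% return (so money compounds at 4.5%), and the function name is my own.

```python
def implied_consumption_tax(years, interest=0.05, cap_tax=0.10):
    """Consumption-tax rate equivalent to taxing investment returns every year.

    A is what 1 dollar grows to untaxed, B is what it grows to when each
    year's return is taxed; the implied rate is t = A / B - 1.
    """
    untaxed = (1.0 + interest) ** years
    taxed = (1.0 + interest * (1.0 - cap_tax)) ** years
    return untaxed / taxed - 1.0


for years in (1, 100, 1000):
    print(years, f"{implied_consumption_tax(years):.2%}")
```

The implied rate grows geometrically with the horizon, which is the core of the argument.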

Another argument comes from the Atkinson-Stiglitz theorem. As we showed above, capital gains taxes are equivalent to consumption taxes, so, to the extent you buy the Atkinson-Stiglitz result, you should agree capital gains taxes should be zero.

Some economists have also invoked the Diamond-Mirrlees production efficiency result Optimal Taxation and Public Production I: Production Efficiency by arguing capital is an input to production and therefore shouldn't be taxed Mankiw - this is disputed The Case for a Progressive Tax: From Basic Research to Policy Recommendations.

Finally, there are more pragmatic arguments for taxing capital income at a lower rate than labor income:

Most countries already tax corporate profits, so taxing stock dividends/gains is effectively double taxation, which makes the rates higher than they naively appear.
Taxes on capital income are taxes on nominal returns rather than real returns. For instance, the value I derive from a 10% bond is very different if inflation is 2% vs 12%. Lower tax rates can be justified as a sort of ad-hoc alternative to letting people deduct inflation from their capital income.

Finally, if we focus exclusively on corporate income taxes rather than taxes on dividends and capital gains, there are a variety of other arguments. One such argument is that if we assume the economy is open, then a tax on corporate profits will cause less investment in the country, reducing the marginal return of labor, causing labor incomes to fall. The capitalists within the country, on the other hand, will simply shift their investments abroad and not see their incomes change at all.

All that being said, it's not clear that capital is actually very mobile across countries Domestic savings and international capital flows (see also Feldstein–Horioka puzzle Equity home bias puzzle). On the other hand, economists do generally believe capital is getting more mobile over time.

Arguments For

Conversely, there are many arguments for a capital gains tax as well [TODO].

Bernheim
progressive income taxation - Golosov
credit market imperfections - Aiyagari and (Farhi and Werning 2011)
A theory of optimal capital taxation

More generally, are agents even rational when making saving choices?

What To Tax?

Production Efficiency

The Diamond-Mirrlees production efficiency result is basically a proof that in a competitive economy, the government shouldn't distort the inputs of firms, choosing instead to levy taxes only on the final goods and services Optimal Taxation and Public Production I: Production Efficiency.

I don't see this result as politically controversial, since I can't really think of a non-rent-seeking political motive to tax (or subsidize) inputs. Contrast this with taxing outputs, where there are lots of motives: don't tax food, tax luxuries, apply tariffs, etc.

TODO

Optimal Inefficient Production
Dasgupta

Atkinson-Stiglitz

See here.

Odds and Ends

Tagging

The idea behind tagging is that we alter taxes and transfers based on people's immutable characteristics. For instance, we might (and do) give more money to people who have certain disabilities.

Akerlof showed that the optimal income tax system will make it so the average marginal welfare weight of the blind should equal that of the non-blind Akerlof. If, for instance, you define social welfare as $\ln(x)$, then the marginal social welfare is $1/x$. So, according to Akerlof, $\frac{1}{n}\sum{1/x_i}$ should be equal for blind and non-blind people. Equivalent logic holds for other immutable tags.
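As a toy illustration of that condition, the sketch below takes log welfare, two made-up groups of incomes, and finds the per-person lump-sum transfer from the untagged group to the tagged group that equalizes the two averages of $1/x_i$. It ignores behavioral responses entirely, and the function names, incomes, and equal-per-person funding rule are all assumptions of mine.

```python
def avg_marginal_weight(incomes):
    """Average of 1/x, the marginal welfare weight under log welfare."""
    return sum(1.0 / x for x in incomes) / len(incomes)


def equalizing_transfer(tagged, untagged, tol=1e-10):
    """Per-person transfer to the tagged group, paid equally by the untagged
    group, that equalizes the groups' average marginal welfare weights.

    Uses bisection and assumes the tagged group starts with the higher weight.
    """
    share = len(tagged) / len(untagged)   # tax per untagged person per unit of transfer
    lo, hi = 0.0, min(untagged) / share   # hi would leave someone with zero income
    while hi - lo > tol:
        t = (lo + hi) / 2.0
        gap = (avg_marginal_weight([x + t for x in tagged])
               - avg_marginal_weight([x - share * t for x in untagged]))
        if gap > 0:
            lo = t   # tagged group still has the higher weight: transfer more
        else:
            hi = t
    return (lo + hi) / 2.0


tagged = [10.0, 20.0]     # hypothetical incomes of the tagged (e.g. blind) group
untagged = [30.0, 60.0]   # hypothetical incomes of everyone else
t = equalizing_transfer(tagged, untagged)
print(round(t, 2))
```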

If, on the other hand, these tags are partially mutable, things get more complicated.

Now, politically, there are lots of characteristics that, though pretty much immutable, are still quite controversial to use. Examples include height, sex, and race. People have argued that the fact people don't want to tax these tags implies the overall model is wrong Optimal taxation in theory and practice. Two proposed corrections are

That we value "horizontal equity" - that two people with the same abilities but different immutable characteristics should pay the same taxes
That we should only use tags that cause higher income, not ones that merely correlate with them. Likewise, people seem more open to giving welfare based on things that cause it to be harder to make ends meet (e.g. # of kids, medical expenses) than to characteristics that merely correlate with the same thing.

Cash vs In-Kind Transfer

Naively, theory suggests governments should prefer cash transfers to in-kind transfers (e.g. food stamps, housing vouchers, etc). The arguments for this are straightforward: (1) the person receiving the welfare knows what they need better than the government and (2) cash transfers are easier/cheaper for both the government and the recipient to handle.

However, some economists argue that in-kind transfers have their place Nichols. For instance, suppose you have a soup kitchen that requires waiting in line to get soup.

A poor person might get 2 utils from soup and might lose 1 util from waiting, so they choose to wait in line for free soup.

A rich person might get 1 util from soup (they can easily afford better food) and lose 2 utils from waiting, so they choose not to wait in line for free soup.

In this way, in-kind transfers can closely target the people who have more free time relative to income (i.e. the poor).
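The self-selection logic is simple enough to write down directly. This sketch uses the util numbers from the soup kitchen example; the function name and the rule "queue only if the net gain is positive" are my framing of it.

```python
def waits_in_line(soup_utils, waiting_cost_utils):
    """A person queues only if the soup is worth more than the wait."""
    return soup_utils - waiting_cost_utils > 0


# (utils gained from soup, utils lost from waiting)
people = {"poor": (2, 1), "rich": (1, 2)}
takers = [name for name, (soup, wait) in people.items() if waits_in_line(soup, wait)]
print(takers)  # ['poor']
```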

There are some theoretical situations where cash transfers work better and others where in-kind transfers work better. Which is better in the real world depends a great deal on the specific situation.

Tax Incidence

One issue with periodic (i.e. monthly) welfare payments is that they cause a temporary surge in demand, which firms can take advantage of by raising prices.

For instance, even though it is illegal for companies to price-discriminate based on whether purchases are made with food-stamps, the fact that food-stamps are paid out at the beginning of the month means that the food-stamps program causes a ~30% surge in demand for food during the beginning of the month in high food-stamp areas. Presumably for this reason, stores raise prices at the beginning of the month by ~2.5%. In this way, some of the social welfare created by food stamps makes its way to the stores rather than the intended recipients Hastings. Conversely, people who don't receive food-stamps are also hurt by the higher prices.

Note that if everyone received food stamps (to destigmatize them), this effect would be significantly larger for two reasons: (1) we'd see a larger surge, and (2) the current limit on stores' ability to set prices "optimally" would disappear. That limit exists today because, even in high food-stamp areas, the vast majority of customers aren't using food stamps.

Conversely, this demonstrates that, today, legally requiring that food stamps be treated as cash largely prevents firms from reaping the benefits of this social program for themselves. In a similar way, the fact that the EITC only applies to workers with kids helps prevent employers from taking its money; likewise, some people believe that the EITC causes general wage cuts, so non-EITC workers end up being hurt Rothstein.

An alternative approach to estimating the effect of tax/welfare policy is to look at how asset prices change when a policy is announced (see, for example, Friedman).

Finally, it's worth pointing out that mandates are very different than taxes. For instance, if the government implements a 10% payroll tax to pay for a healthcare program, we'd expect firms and laborers to both bear some of that tax while the unhealthy benefit. In the Mirrlees labor model, this will reduce total hours worked.

However, if the government mandates that employers pay for healthcare and makes it illegal to discriminate based on health, then we'd expect (for the most part) employees to bear the brunt of the cost via lower pay, but we wouldn't expect their overall compensation to fall. Likewise, it's entirely possible in the Mirrlees labor model that total hours worked won't change.

In particular, if an employee values the mandated benefit at $\alpha$ times its cash cost, the distortion's size is $(1 - \alpha)$ times what achieving the mandate with a tax would be.
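A tiny sketch of that scaling, with an arbitrary baseline of 100 for the distortion the equivalent tax would cause (the function name and the baseline are assumptions for illustration):

```python
def mandate_distortion(tax_distortion, alpha):
    """Distortion from a mandate, given the distortion of the equivalent tax.

    alpha is how much the employee values the benefit per dollar of its cost.
    """
    return (1.0 - alpha) * tax_distortion


for alpha in (0.0, 0.5, 1.0):
    print(alpha, mandate_distortion(tax_distortion=100.0, alpha=alpha))
```

If employees value the benefit at its full cost, the mandate causes no distortion at all; if they place no value on it, it is exactly as distortionary as the tax.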

Salience

todo Salience and taxation: Theory and evidence

Philosophical Considerations

Philosophically, you can think of redistributive concerns as "insurance behind a veil of ignorance". However, you can also justify progressive taxation as "insurance in front of a veil of ignorance." The basic idea is that if you face an adverse event (e.g. you become blind), a progressive tax system makes it so you have to reduce your consumption by less than you otherwise would have. See Varian for a more mathematical analysis.

Intertemporal Models

TODO 31:15 from Topic 5: Income Taxation and Labor Supply part 3.

Commentary

To stay "objective", I don't include my own comments on this page except for this section.

My main comment is that all the income tax literature derived from the Mirrlees model is utter bullshit. It's all based on the assumption that only marginal tax rates affect labor decisions, when this is laughably false.

The obvious counterexample is welfare: if I give everyone \$100k for free, a large number of them will work less (if at all) even though marginal tax rates haven't changed.

I can only assume this assumption is used because achieving nice mathematical results using a more complete model is either exceptionally difficult or impossible, but how On Earth can economists be using this to justify policy recommendations?

The only exceptions to my ire are (1) that marginal tax rates should never exceed 100% and (2) Saez's optimal top tax rate result.

This is also, in my opinion, why there's the ludicrous consensus that lump sum taxes aren't distortionary, when they obviously are. For instance, if Alice is working 1 hour a week for \$10 and living off beans, rice, and propane, she's making \$520 per year. You can bet your soul that she'd work more if the government levied a \$1000 lump sum tax on her.

Seriously, God have mercy on their souls. The relentless obsession with marginal tax rates is the single biggest sin in economics.

Chetty, R. Bruich, G. (2012). Topic 4: Optimal Taxation Part 1. Youtube. https://www.youtube.com/watch?v=IPlCtuB3B68
Ramsey, F. P. (1927). A Contribution to the Theory of Taxation. The Economic Journal, 37(145), 47-61. https://doi.org/10.2307/2222721
Diamond, P. A. (1975). A many-person Ramsey tax rule.
Judd, K. L. (1985). Redistributive taxation in a simple perfect foresight model. Journal of public Economics, 28(1), 59-83.
Wikipedia contributors. (2020, August 16). Land reclamation. In Wikipedia, The Free Encyclopedia. Retrieved 14:50, August 19, 2020, from https://en.wikipedia.org/w/index.php?title=Land_reclamation&oldid=973278082
Chamley, C. (1986). Optimal taxation of capital income in general equilibrium with infinite lives. Econometrica: Journal of the Econometric Society, 607-622. https://doi.org/10.2307/1911310
Mirrlees, J. A. (1971). An exploration in the theory of optimum income taxation. The review of economic studies, 38(2), 175-208. http://doi.org/10.2307/2296779
Atkinson, A. B., & Stiglitz, J. E. (1976). The design of tax structure: direct versus indirect taxation. Journal of public Economics, 6(1-2), 55-75. https://doi.org/10.1016/0047-2727(76)90041-4
Seade, J. (1982). On the sign of the optimum marginal income tax. The Review of Economic Studies, 49(4), 637-643. https://doi.org/10.2307/2297292
Sadka, E. (1976). On income distribution, incentive effects and optimal income taxation. The review of economic studies, 43(2), 261-267. https://doi.org/10.2307/2297322
Saez, E. (2001). Using elasticities to derive optimal income tax rates. The review of economic studies, 68(1), 205-229. https://doi.org/10.1111/1467-937X.00166
Saez, E. (2002). Optimal income transfer programs: intensive versus extensive labor supply responses. The Quarterly Journal of Economics, 117(3), 1039-1073. https://doi.org/10.1162/003355302760193959
Akerlof, G. A. (1978). The economics of "tagging" as applied to the optimal income tax, welfare programs, and manpower planning. The American economic review, 68(1), 8-19.
Mankiw, N. G., Weinzierl, M., & Yagan, D. (2009). Optimal taxation in theory and practice. Journal of Economic Perspectives, 23(4), 147-74.
Nichols, A. L., & Zeckhauser, R. J. (1982). Targeting transfers through restrictions on recipients. The American Economic Review, 72(2), 372-377.
Varian, H. R. (1980). Redistributive taxation as social insurance. Journal of public Economics, 14(1), 49-68.
Chetty, R. Bruich, G. (2012). Topic 5: Income Taxation and Labor Supply part 3. Youtube. https://youtu.be/Mx0ZyYGqtjM?t=1874
Wikipedia contributors. (2020, July 22). Optimal capital income taxation. In Wikipedia, The Free Encyclopedia. Retrieved 18:50, August 25, 2020, from https://en.wikipedia.org/w/index.php?title=Optimal_capital_income_taxation&oldid=968893798
Diamond, Peter A.; Mirrlees, James A. (1971). "Optimal Taxation and Public Production I: Production Efficiency". The American Economic Review. 61 (1): 8–27.
Diamond, P. A.; Saez, E. (2011). "The Case for a Progressive Tax: From Basic Research to Policy Recommendations". Journal of Economic Perspectives. 25 (4): 165–190 [p. 177]. https://doi.org/10.1257/jep.25.4.165
Wikipedia contributors. (2020, July 24). Fundamental theorems of welfare economics. In Wikipedia, The Free Encyclopedia. Retrieved 19:27, August 25, 2020, from https://en.wikipedia.org/w/index.php?title=Fundamental_theorems_of_welfare_economics&oldid=969212261
Jacobs, B. (2015). Optimal Inefficient Production. Mimeo, Erasmus University Rotterdam.
Stiglitz, J. E., & Dasgupta, P. (1971). Differential taxation, public goods, and economic efficiency. The Review of Economic Studies, 38(2), 151-174.
Kaplow, L., 2006. On the undesirability of commodity taxation even when income taxation is not optimal. Journal of Public Economics 90, 1235–1250. https://doi.org/10.1016/j.jpubeco.2005.07.001
Laroque, G., 2005. Indirect taxation is superfluous under separability and taste homogeneity: a simple proof. Economics Letters 87, 141–144. https://doi.org/10.1016/j.econlet.2004.10.010
Hellwig, M. F. (2010). A generalization of the Atkinson–Stiglitz (1976) theorem on the undesirability of nonuniform excise taxation. Economics Letters, 108(2), 156–158. https://doi.org/10.1016/j.econlet.2010.04.035
Boadway, R., & Pestieau, P. (2003). 21 Indirect Taxation and Redistribution: The Scope of the Atkinson-Stiglitz Theorem. Economics for an imperfect world: Essays in honor of Joseph E. Stiglitz, 387.
Naito, H. (2007). Atkinson-Stiglitz Theorem with Endogenous Human Capital Accumulation. The BE Journal of Economic Analysis & Policy, 7(1). https://doi.org/10.2202/1935-1682.1516
Saez, E. (2002). The desirability of commodity taxation under non-linear income taxation and heterogeneous tastes. Journal of Public Economics, 83(2), 217-230. https://doi.org/10.1016/S0047-2727(00)00159-6
Newbery, D. M. (1986). On the desirability of input taxes. Economics Letters, 20(3), 267-270. https://doi.org/10.1016/0165-1765(86)90036-4
Gorman, W. M. (1968). Measuring the quantities of fixed factors.
Naito, H. (1999). Re-examination of uniform commodity taxes under a non-linear income tax system and its implication for production efficiency. Journal of Public Economics, 71(2), 165-188. https://doi.org/10.1016/S0047-2727(98)00052-8
Kleven, H. J., Richter, W. F., & Sørensen, P. B. (2000). Optimal taxation with household production. Oxford Economic Papers, 52(3), 584-594. https://doi.org/10.1093/oep/52.3.584
Keen, M., & Wildasin, D. (2004). Pareto-efficient international taxation. American Economic Review, 94(1), 259-275. http://doi.org/10.1257/000282804322970797
Naito, H. (2004). Endogenous human capital accumulation, comparative advantage and direct vs. indirect redistribution. Journal of Public Economics, 88(12), 2685-2710. https://doi.org/10.1016/j.jpubeco.2003.07.003
Saez, E. (2004). Direct or indirect tax instruments for redistribution: short-run versus long-run. Journal of Public Economics, 88(3-4), 503-518. https://doi.org/10.1016/S0047-2727(02)00222-0
Gaube, T., 2005. Income taxation, endogenous factor prices and production efficiency. Scandinavian Journal of Economics, 107(2), pp.335-352. https://doi.org/10.1111/j.1467-9442.2005.00411.x
Jacobs, B., & Bovenberg, A. L. (2011). Optimal taxation of human capital and the earnings function. Journal of Public Economic Theory, 13(6), 957-971. https://doi.org/10.1111/j.1467-9779.2011.01527.x
Gomes, R., Lozachmeur, J. M., & Pavan, A. (2018). Differential taxation and occupational choice. The Review of Economic Studies, 85(1), 511-557. https://doi.org/10.1093/restud/rdx022
Hastings, J., & Washington, E. (2010). The first of the month effect: consumer behavior and store responses. American economic Journal: economic policy, 2(2), 142-62. https://doi.org/10.1257/pol.2.2.142
Rothstein, J. (2008). The unintended consequences of encouraging work: Tax incidence and the EITC (Vol. 165). Princeton, NJ: Center for Economic Policy Studies, Princeton University.
Feldstein, M., & Horioka, C. (1979). Domestic savings and international capital flows (No. w0310). National Bureau of Economic Research. https://doi.org/10.3386/w0310
Wikipedia contributors. (2020, July 9). Feldstein–Horioka puzzle. In Wikipedia, The Free Encyclopedia. Retrieved 17:27, September 4, 2020, from https://en.wikipedia.org/w/index.php?title=Feldstein%E2%80%93Horioka_puzzle&oldid=966830962
Wikipedia contributors. (2019, December 4). Equity home bias puzzle. In Wikipedia, The Free Encyclopedia. Retrieved 17:24, September 4, 2020, from https://en.wikipedia.org/w/index.php?title=Equity_home_bias_puzzle&oldid=929180771
Friedman, J. N. (2009). The incidence of the Medicare prescription drug benefit: using asset prices to assess its impact on drug makers. Harvard University.
Bernheim, B. D. (2002). Taxation and saving. In Handbook of public economics (Vol. 3, pp. 1173-1249). Elsevier. https://doi.org/10.1016/S1573-4420(02)80022-2
Golosov, M., Kocherlakota, N., & Tsyvinski, A. (2003). Optimal indirect and capital taxation. The Review of Economic Studies, 70(3), 569-587. https://doi.org/10.1111/1467-937X.00256
Aiyagari, S. R. (1995). Optimal capital income taxation with incomplete markets, borrowing constraints, and constant discounting. Journal of political Economy, 103(6), 1158-1175.
Piketty, T., & Saez, E. (2012). A theory of optimal capital taxation (No. w17989). National Bureau of Economic Research.
Chetty, R., Looney, A., & Kroft, K. (2009). Salience and taxation: Theory and evidence. American economic review, 99(4), 1145-77. https://doi.org/10.1257/aer.99.4.1145