Twin Studies on Educational Attainment
See this spreadsheet for a summary.
Instrumental Variables Methodology
We want to estimate the impact of educational attainment on earnings. Our general approach is to look at pairs of identical twins and estimate the slope of earnings on education within each pair. If all confounding is either genetic or shared-environmental in nature, then this within-twin slope is a good estimator of the causal effect of education on earnings.
This assumption is questionable. For instance, it is plausible that unshared environmental factors cause differences in ambition between two twins, which causes both educational and earning differences. However, it seems plausible that the bulk of confounders are genetic and shared-environmental in origin, which makes this approach seem fairly promising - especially compared to other, more naive, models.
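To make the within-twin idea concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the effect size, the strength of the shared "family" confounder, and the helper names `ols_slope` and `simulate_twin_pairs` are all assumptions, not values or methods taken from any of the studies discussed here. It compares a naive pooled slope with the slope on within-pair differences when confounding is entirely shared between twins.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_twin_pairs(n_pairs, true_effect, seed=0):
    """Simulate (education, log_earnings) for both twins in each pair.

    All parameters are hypothetical. `family` stands in for genes plus
    shared environment and is identical for the two twins in a pair.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        family = rng.gauss(0, 1)  # shared confounder
        twins = []
        for _ in range(2):
            educ = 12 + 2 * family + rng.gauss(0, 1)
            log_earn = true_effect * educ + 0.3 * family + rng.gauss(0, 0.3)
            twins.append((educ, log_earn))
        pairs.append(twins)
    return pairs


TRUE_EFFECT = 0.08  # hypothetical causal effect per year of education
pairs = simulate_twin_pairs(20000, TRUE_EFFECT)

# Naive slope: pool every individual and ignore the pairing.
educ_all = [twin[0] for pair in pairs for twin in pair]
earn_all = [twin[1] for pair in pairs for twin in pair]
naive_slope = ols_slope(educ_all, earn_all)

# Within-twin slope: regress the earnings difference on the education
# difference, so anything shared by both twins cancels out.
d_educ = [pair[0][0] - pair[1][0] for pair in pairs]
d_earn = [pair[0][1] - pair[1][1] for pair in pairs]
within_slope = ols_slope(d_educ, d_earn)

print(f"naive slope:       {naive_slope:.3f}")
print(f"within-twin slope: {within_slope:.3f}")
```

Under these made-up parameters the naive slope should land well above the true effect, while the within-twin slope should land close to it. If the simulation added an unshared confounder (such as the ambition example above), the within-twin slope would be biased as well.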
However, in practice, most of the relevant twin data is collected via surveys, which means we're relying on fallible humans. This introduces "measurement error", as the twins filling out the surveys misremember/lie about how much they earn and what degrees they have. Broadly speaking, such errors bias our causal estimates downwards.
The most common analytical solution to this problem is to use an "instrumental-variables" ("IV") model (described in Stock). At least regarding the effect of education on earnings, this approach appears to have been pioneered by Ashenfelter et al. in 1994 (Ashenfelter, O., & Krueger, A.).
Instrumental Variable Studies
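The following sketch illustrates why measurement error pulls the within-twin slope downwards and how an IV estimator can undo that. It assumes, for illustration only, that we have two independent noisy reports of each pair's education difference (for example, one built from self-reports and one from each twin's report about their sibling), and that the second report is used as the instrument for the first. The noise levels, effect size, and variable names are all hypothetical, and only error in education is modeled.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


rng = random.Random(1)
TRUE_EFFECT = 0.08  # hypothetical causal effect per year of education
N_PAIRS = 20000

# True within-pair education difference and the resulting earnings difference.
d_educ_true = [rng.gauss(0, 1) for _ in range(N_PAIRS)]
d_earn = [TRUE_EFFECT * d + rng.gauss(0, 0.3) for d in d_educ_true]

# Two noisy reports of the same education difference, with independent errors.
report_a = [d + rng.gauss(0, 0.7) for d in d_educ_true]
report_b = [d + rng.gauss(0, 0.7) for d in d_educ_true]

# OLS on one noisy report: attenuated towards zero.
ols_slope = cov(report_a, d_earn) / cov(report_a, report_a)

# IV: use report_b as an instrument for report_a. The independent errors
# drop out of the covariances, so the attenuation disappears.
iv_slope = cov(report_b, d_earn) / cov(report_b, report_a)

print(f"true effect: {TRUE_EFFECT:.3f}")
print(f"OLS slope:   {ols_slope:.3f}")
print(f"IV slope:    {iv_slope:.3f}")
```

With these made-up noise levels the OLS slope should come out at roughly two-thirds of the true effect, while the IV slope should be close to it. The correction relies on the two reports having uncorrelated errors; if the errors are correlated, the IV estimate is no longer consistent.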
US: National Research Council's Twin Sample
In 1955, researchers assembled data from various government/survey sources on white male veteran twins born between 1917 and 1927 (Socioeconomic success). In 1974, other researchers mailed out a second set of surveys that included questions about educational attainment and earnings (Intergenerational transmission of income and wealth). Unfortunately, probably because IV models were still novel in the field, they did not report IV estimates.
US: Twinsburg Festival
Ashenfelter and his colleagues interviewed several hundred twins at the Twins Day Festival in Twinsburg, Ohio in August of 1991. Their first paper (Ashenfelter, O., & Krueger, A.), based on just one year of data collection, found that the within-twin IV slope between educational attainment and earnings was actually significantly higher than the naive slope. The second paper (Ashenfelter, O., & Rouse, C.), based on the same data plus data from 1992 and 1993, failed to replicate this result, instead finding that the naive slope is slightly upwardly biased. In particular, Table III suggests the bias is about 14%.
US: Minnesota Twin Registry
The Minnesota Twin Registry is the largest birth-certificate-based twin registry in the United States. It consists of twins born between 1936 and 1955, with data collected by surveys in the 1980s. Behrman et al. performed an IV analysis using this data and found the within-twin IV slope was 8% smaller than the naive slope (“Ability” biases in schooling returns and twins: a test and new estimates).
Other Rich-Country Studies
- Australian Twin Registry - Analysis of this data results in quite different point estimates. One study finds the within-pair-IV slope is 40% smaller than the naive slope (Miller); the other study finds it is 50% larger (Lee). Given the large standard errors, (1) this large discrepancy isn't actually all that surprising and (2) both estimates are consistent with the roughly 10% bias found in the above studies.
- Swedish Twin Registry - In 1999, Isacsson found a 15% bias using a non-IV model (Isacsson, G., 1999).
- St. Thomas' U.K. Adult Twin Registry - The naive slope and the within-pair-IV slope were equal (Table 2 from Bonjour).
Conclusion
Ignoring the discordant Australian studies, the above estimates all suggest the within-twin slope is 0-20% smaller than the naive general-population slope, with a fixed-effect meta-analytic estimate of around 8% (or 12% if we only look at the two US samples). So, a decent summary of the IV-related literature is that the slope between education and earnings is about 10% smaller within twin pairs than within the entire population.
All of this suggests that ability bias plays a fairly small role in the correlation between educational attainment and earnings.
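For readers unfamiliar with the term, a fixed-effect meta estimate is an inverse-variance weighted average of the study estimates. The sketch below shows the calculation. The point estimates are the bias figures quoted above, but the standard errors are HYPOTHETICAL placeholders (the text above does not report them), so the output is not expected to reproduce the 8% figure. Which studies entered that figure is also not stated here, and the function name `fixed_effect_pool` is my own.

```python
def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5
    return pooled, pooled_se


# Bias of the naive slope relative to the within-twin slope, as quoted above.
bias_estimates = {
    "Twinsburg (US)": 0.14,
    "Minnesota (US)": 0.08,
    "Sweden": 0.15,
    "St. Thomas' (UK)": 0.00,
}

# HYPOTHETICAL standard errors: equal placeholders, NOT taken from the papers.
# With equal standard errors the pooled value is just the simple average.
placeholder_se = {name: 0.05 for name in bias_estimates}

names = list(bias_estimates)
pooled, pooled_se = fixed_effect_pool(
    [bias_estimates[n] for n in names],
    [placeholder_se[n] for n in names],
)
print(f"pooled bias (placeholder SEs): {pooled:.3f} +/- {pooled_se:.3f}")
```

Substituting each study's reported standard error for the placeholders would shift weight towards the more precise studies, which is presumably why the weighted figure differs from a simple average of the point estimates.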
Post-IV Studies
Later researchers started questioning the premises of the IV model in different ways. For instance, Black re-analyzed data from Krueger and found that while the IV slope was 8% smaller than the population-wide slope, the author's preferred slope estimate was only 4% smaller. Meanwhile, Isacsson, G. (2004) re-analyzed data from Isacsson, G. (1999) and claims that IV models are biased upwards by about 30%. See also Sandewall.
Ultimately, I'd guess the "true" slope (i.e. sans measurement error) between educational attainment and earnings is ~20% smaller within twin pairs than within the general population. I would like to come back to these methodological questions later, but, to be frank, they look unresolvable with current human knowledge.
Heterogeneity
Finally, an important question is how these estimates vary with educational attainment level. For instance, it is possible that graduating from high school is subject to minimal ability-bias confounding, while graduating with a PhD is subject to substantial confounding (or vice-versa). As far as I can tell, this has been examined in only two places:
- Ashenfelter, O., & Rouse, C. - These researchers generally find that ability bias becomes a larger confounder at later education levels (see Table IVb). For instance, in the naive model a year of education on the margin of 12th grade is associated with a 0.085 increase in log-income, while the value is 0.114 for 16th grade. Meanwhile, the slopes within twin pairs using IV are 0.101 and 0.079, respectively.
- Isacsson, G. (2004) - The author reports results in Tables I, III, and V. They're hard to interpret due to the education categories used.
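To put the Ashenfelter and Rouse numbers on the same footing as the percentages used elsewhere in this post, the short calculation below expresses the gap between the naive and within-twin IV slopes as a share of the naive slope. The four slopes are the ones quoted above; the function name `relative_bias` and this particular way of defining the bias are my own choices.

```python
# Slopes quoted above from Table IVb (log-income per year of education).
naive = {"12th grade": 0.085, "16th grade": 0.114}
within_iv = {"12th grade": 0.101, "16th grade": 0.079}


def relative_bias(naive_slope, within_slope):
    """Share of the naive slope that is not reproduced within twin pairs."""
    return (naive_slope - within_slope) / naive_slope


bias_by_margin = {
    margin: relative_bias(naive[margin], within_iv[margin]) for margin in naive
}

for margin, bias in bias_by_margin.items():
    print(f"{margin}: implied bias = {bias:+.1%} of the naive slope")
```

On these figures the naive slope understates the within-twin IV slope by about 19% at the 12th-grade margin and overstates it by about 31% at the 16th-grade margin, which matches the pattern of ability bias growing at later education levels.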