College Ranking: Primary Analysis
[ Part of a sequence of posts constructing my own college rankings. ]
Causation: Reading the Tea Leaves
It would sure be convenient if these fixed effect estimates represented the causal effect a college has on its students' incomes. But do they?
I think we have some evidence the answer is "mostly yes".
First, note that, based on the data discussed here:
- The correlation between SAT score and income is about r~0.20.
- The correlation between college selectivity, as measured by average SAT score, and an individual student's SAT score is about r~0.65.
- The correlation between college selectivity and alumni income at the student-level is about r~0.12.
Note that if SAT is a perfect intelligence test, and intelligence is the sole cause of the correlation between college selectivity and income, then we'd expect the correlation between college selectivity and income to be equal to the product of the correlation between SAT and college selectivity and the correlation between SAT and earnings.
We can run the numbers: 0.65 * 0.20 = 0.13. This is awfully close to the observed correlation of 0.12.
Now, this isn't quite as perfect as it sounds, because the SAT is not, in fact, a perfect measure of intelligence. If we assume SAT scores correlate with intelligence at r~0.9, then we find the expected correlation (assuming intelligence is the sole cause) is (0.65/0.9) * (0.20/0.9) = 0.16.
The observed correlation (0.12) is lower than 0.16, which suggests that treating intelligence as the sole cause actually over-explains the observed covariance between college selectivity and earnings.
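The arithmetic above can be checked in a few lines. This is only a sketch of the single-factor reasoning: the variable names are made up for illustration, and the 0.9 figure is the assumed SAT-intelligence correlation stated in the text, not an estimate from the data.

```python
# Student-level correlations quoted in the post.
r_sat_income = 0.20        # individual SAT score vs. income
r_sat_selectivity = 0.65   # individual SAT score vs. college average SAT
r_observed = 0.12          # college selectivity vs. income

# If SAT were a perfect intelligence test and intelligence were the only
# link, the selectivity-income correlation would be the simple product.
r_expected_perfect = r_sat_selectivity * r_sat_income

# Assumed correlation between SAT and intelligence (an assumption from the text).
r_sat_intelligence = 0.9

# Correct each correlation for the imperfect measure before multiplying.
r_expected_corrected = (r_sat_selectivity / r_sat_intelligence) * (
    r_sat_income / r_sat_intelligence
)

print(round(r_expected_perfect, 2))    # 0.13
print(round(r_expected_corrected, 2))  # 0.16
```

Both expected values sit at or above the observed 0.12, which is the sense in which the single-factor story "over-explains" the correlation.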
You can note that this is inconsistent (it is), but it also suggests that individual intelligence is the single factor that drives the bulk of the correlation between college selectivity and income. Therefore, by controlling for intelligence, we've at least removed most of the confounding variance that makes estimating the effect of attending a specific college so difficult.
This is not a knockdown argument that the fixed effects are causal, but it does, I hope, demonstrate that this methodology is at least a good starting point for estimating causal effects - and definitely far better than existing models.
Causation: Quasi-Experimental Design
Several years later, Chetty et al. published a paper (Diversifying Society’s Leaders? The Causal Effects of Admission to Highly Selective Private Colleges) that uses a variety of quasi-experimental methods to check whether this observational value-added (VA) approach yields roughly causal estimates. Of particular interest, they compare people who get off a waitlist at an Ivy+ school and then attend with people who do not get off the waitlist and then attend a flagship state school. They go to great lengths to show that this process is essentially random.
The estimated "effects" are given in Table 5 in the paper, which I'll partly reproduce here:
Outcome | Waitlist Est | Waitlist SE | VA Est | VA SE |
Top 1% Income Prob | 4.63 | 1.18 | 5.38 | 0.01 |
Top 10% Income Prob | 3.87 | 1.75 | 4.71 | 0.01 |
Top 25% Income Prob | 1.89 | 1.28 | 2.77 | 0.01 |
Mean Income Rank | 1.38 | 0.72 | 1.51 | 0.00 |
Attend Elite Grad School | 5.56 | 2.75 | 8.85 | 0.01 |
Attend Non-Elite Grad School | -0.04 | 0.02 | -0.06 | 0.01 |
Work at Elite Firm | 17.59 | 4.02 | 23.63 | 0.03 |
Work at Prestigious Firm | 15.54 | 4.19 | 22.42 | 0.02 |
The first main point is that the estimates using the waitlist design were never statistically significantly different from the estimates using the VA design. In other words, this evidence is consistent with the VA model yielding causal estimates.
However, the second main point is that the VA model estimates are consistently larger than the waitlist model estimates, ranging from 1.09x (mean income rank) to 1.59x (attend elite grad school). This suggests that there are yet additional unobserved "quality" differences not accounted for by the VA model - that is, beyond SAT score, parent income, race, gender, and home state.
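Both points can be checked directly from the table. The sketch below computes the VA-to-waitlist ratio and a simple z-score for the difference between the two estimates. Treating the two estimates as independent is an assumption of this sketch, not something taken from the paper, and the function and variable names are illustrative.

```python
import math

# (outcome, waitlist est, waitlist SE, VA est, VA SE), copied from the table above.
rows = [
    ("Top 1% Income Prob", 4.63, 1.18, 5.38, 0.01),
    ("Top 10% Income Prob", 3.87, 1.75, 4.71, 0.01),
    ("Top 25% Income Prob", 1.89, 1.28, 2.77, 0.01),
    ("Mean Income Rank", 1.38, 0.72, 1.51, 0.00),
    ("Attend Elite Grad School", 5.56, 2.75, 8.85, 0.01),
    ("Attend Non-Elite Grad School", -0.04, 0.02, -0.06, 0.01),
    ("Work at Elite Firm", 17.59, 4.02, 23.63, 0.03),
    ("Work at Prestigious Firm", 15.54, 4.19, 22.42, 0.02),
]


def compare(wl_est, wl_se, va_est, va_se):
    """Return (VA / waitlist ratio, z-score of the VA - waitlist difference)."""
    ratio = va_est / wl_est
    # Assumes the two estimates are independent (a simplification).
    z = (va_est - wl_est) / math.sqrt(wl_se**2 + va_se**2)
    return ratio, z


results = {name: compare(*vals) for name, *vals in rows}
for name, (ratio, z) in results.items():
    print(f"{name:30s} ratio={ratio:.2f}  z={z:+.2f}")
```

Under that independence assumption, every |z| stays below 1.96, while every ratio is above 1.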
Control Methodology
This is all great, but the quasi-experimental paper is playing this game on easy mode, since Ivy+ colleges are probably broadly similar to the flagship state schools they are compared with in terms of some other obvious confounders. In particular, schools with a STEM or occupational focus almost certainly have biased estimates in the VA model, since the VA model will implicitly assume that the school is causing the students to go into particular careers, when, in fact, many students enter college already intending to go into a particular field.
Given this starting point, I think the best we can do now is try to control for other confounders and hope what remains is causal. The two obvious ones are gender and STEM-focus. Indeed, the schools with the highest FEs are all STEM- or business-focused.
The natural step forward is to control for field of study by constructing a linear model where the dependent variable is the FE estimates discussed above and the independent variables are things that we believe are biasing them away from being causal estimates - things like the proportion of students in each field of study. Then, we can interpret the residuals of this model as the actual causal effects.
However, adding controls for field of study is, a priori, controversial, because it implicitly means that we are assuming the percent of (say) physics majors at a school says something about the students rather than the school - this assumption might be completely wrong!
For instance, it is entirely possible that more selective schools are more likely to be STEM-heavy. If this were the case, adding a control for STEM majors will over-correct our FEs.
To evade this problem, we will attempt an unconventional analysis. We will include par_rank and average SAT score in our model, perform the regression, and then add back to the residuals the variance that was removed by par_rank and school SAT score. The rationale here is that to the extent a confounder operates through parent income or SAT score, we will consider that a genuine causal effect of the college, since we know we've already properly controlled for the causal effects of those two variables. However, to the extent that a variable predicts greater alumni income independent of parent income or alumni SAT scores, we will generally interpret such covariance as not caused by the college, but rather caused by the pre-existing traits of its students.
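A minimal sketch of this "regress, then add back" step is below. It uses synthetic data and NumPy's least-squares solver; the variable names (`stem_share`, `avg_sat`, `fe`) and all coefficients are invented for illustration and are not the actual code or data behind the rankings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical number of colleges

# Synthetic stand-ins; none of these values come from the real data.
par_rank = rng.normal(size=n)
avg_sat = 0.6 * par_rank + rng.normal(size=n)
stem_share = 0.3 * avg_sat + rng.normal(size=n)
fe = 2.0 * par_rank + 0.5 * avg_sat + 1.5 * stem_share + rng.normal(size=n)

# Step 1: regress the fixed effects on the confounder AND on the two
# variables whose contribution we want to keep.
X = np.column_stack([np.ones(n), stem_share, par_rank, avg_sat])
beta, *_ = np.linalg.lstsq(X, fe, rcond=None)
residuals = fe - X @ beta

# Step 2: add back only what par_rank and avg_sat explained. The part
# attributed to stem_share (and the intercept) stays removed.
add_back = beta[2] * par_rank + beta[3] * avg_sat
adjusted = residuals + add_back
```

The net effect is that `adjusted` equals the original fixed effect minus the intercept and minus only the part attributed to `stem_share`, where that part is estimated while holding `par_rank` and `avg_sat` fixed.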
Gender & Race
In addition to controlling for fields of study, we will also include a control for the proportion of the student body that is female. I intended to also include a control for the proportion of each (non-white) race; however, due to the reasoning below, I have chosen not to.
A priori, it's at least possible that attending a college with a lot of {demographic} causally reduces your future earnings, such as by reducing the expected economic value of your social network, a connection that seems likely given recent research (Social capital I: measurement and associations with economic mobility; Social capital II: determinants of economic connectedness). As with fields of study, we should have some protection from this issue because we're already including par_rank and college SAT score as controls.
Ideally, we could observe the values that the race dummy variables had in Chetty et al's models, but they didn't publish those. They did, however, give some hints in the online appendix. For instance, in section I, they say
We bound the degree to which high mobility rates can be explained simply by colleges enrolling a large share of Asian students who would attain top-quintile outcomes regardless of college—i.e., by the ecological (group level) correlation between Asian share and top-quintile outcome rate.
In Online Appendix Figure X, we present a binned scatter plot of the relationship between the fraction of students who reach the top quintile of the income distribution and Asian shares across colleges. As Asian shares rise from 0% to 5%, the percentage of students who reach the top quintile rises by nearly 15 percentage points (pp). Even if every Asian student ended up in the top quintile of the earnings distribution, the fraction of students in the top quintile would rise by a maximum of 5 pp over this range (a non-parametric upper bound, depicted by the solid line on the figure). Hence, non-Asian students at colleges with larger Asian shares must also have higher top-quintile outcome rates, either because they are also more positively selected or because such colleges have higher value-added.
To gauge the extent to which individual-level differences in top-quintile outcome rates drive the correlation between Asian shares and mobility rates, we use Census data to estimate that Asian students from low-income families have top-quintile outcome rates that are at most 23.5 pp higher than non-Asians. An “Asian-adjusted” measure of top-quintile outcome rates that subtracts 0.235 times the Asian share from the raw top-quintile outcome rate at each college yields mobility rates that have a correlation of more than 0.98 with our baseline estimates. The Asian-adjusted mobility rates continue to have a correlation of 0.43 with Asian shares, implying that most of the baseline correlation of 0.54 between mobility rates and Asian shares is due to ecological factors.
In English: colleges with more Asians have more mobility. Only a small part of this is explainable by the fact that Asians generally have more mobility. This indicates that about 80% (0.43/0.54) of the higher mobility at Asian colleges is due to the causal effect of the college rather than merely the pre-existing races of the Asians studying there.
I tried adding the proportion of students of each race in as control variables. I then tried including them with other controls. The latter slopes were generally less than 80% the former slopes. For this reason, I think the 20% correlational "effect" noted by Chetty et al is already properly controlled for by the other variables.
For the above reasons, I do not include race controls in my final models. This is essentially me believing that either (a) something besides field of study causes both high Asian attendance and high earnings at a school, or (b) high Asian attendance causes non-Asians to earn more. I remain agnostic regarding which is true.
SAT and Parent Income
The rest of my analysis (and data) can be found in this GitHub repo. Feel free to dive deeper into the code and data, but I'll leave the highlights here.
The first surprising factoid is that when I throw both average SAT score and par_rank into a regression (with or without controls), a college's average SAT score no longer predicts its FE. When par_rank is removed from the regression, the SAT score becomes predictive again (as we noted in the last post). This suggests there is no causal advantage from going to a college with smart kids - merely an advantage from going to a college with rich kids.
Why might this be the case?
Researchers have found (hat-tip to Malcolm Gladwell) that the proportion of students majoring in STEM remains relatively constant between liberal arts colleges and yet relative SAT score within the college still predicts which students will major in STEM (Christopher Strenta; Why Did I Say "Yes" to Speak Here?; David and Goliath: Underdogs, Misfits, and the Art of Battling Giants).
His point is that whether someone majors in STEM is partly determined by their ability at STEM relative to their peers, but is almost entirely unaffected by their absolute STEM ability.
My point is that the above suggests that which college you attend 100% affects what major you choose: attending a better school makes you feel dumb relative to your peers, which makes you less likely to major in STEM.
Fortunately, this doesn't affect our rankings, because it's not a confounder, but is, instead, a legitimate causal effect that attending a college will have on a student in expectation.
It also doesn't affect our decision to control for the proportion of students in each field, since the above suggests, all else being equal, this proportion should be similar if your students are similar - exactly the kind of situation that suggests control variables are warranted.
However, the facts that (1) attending a more selective school makes you less likely to major in STEM and (2) majoring in STEM confers a large pay premium provide an elegant story for why school SAT score doesn't appear to cause its individual students to succeed more: In The Case Against Education, Caplan correctly argues that there are "sheepskin" effects whereby you look like your more-able peers, but he never considered that your more-able peers would raise standards and push you out of more lucrative fields. It appears the two effects are of similar strength.
Still, not to be outdone, college SAT score does predict the rate at which graduates attain PhDs, even after controlling for parent income. In fact, while having rich parents predicts greater PhD attainment rates, including SAT in the model flips that slope's sign. This suggests that coming from money actually causes you to be less likely to get a PhD, but that because such students tend to be smarter, the naive correlation ends up being positive.
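A sign flip like this is easy to reproduce with synthetic data. The sketch below is purely illustrative: the coefficients are invented, and it shows only the mechanism (a positive naive slope that turns negative once a correlated variable enters the model), not the actual PhD regressions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # hypothetical number of observations

# Invented data-generating process: richer families have higher SAT scores,
# SAT raises PhD attainment, and income itself lowers it.
income = rng.normal(size=n)
sat = 0.7 * income + rng.normal(size=n)
phd = 1.0 * sat - 0.3 * income + rng.normal(size=n)


def ols_slopes(y, *columns):
    """Least-squares slopes of y on the given columns (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1:]


naive_income_slope = ols_slopes(phd, income)[0]            # income only
controlled_income_slope = ols_slopes(phd, income, sat)[0]  # income + SAT

print(f"naive: {naive_income_slope:+.2f}, controlled: {controlled_income_slope:+.2f}")
```

In this toy setup the naive slope is positive because income proxies for SAT, and it turns negative once SAT is held fixed.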
Fields, Race, and Gender
Alumni from a college with 10pp more STEM majors (versus humanities majors) will tend to earn 0.25pp more after controlling for par_rank and SAT score. Interpreted causally, this suggests STEM degrees pay about 8% more than humanities degrees.
As discussed in the quote above, I think there is some evidence that most of the association between the racial composition of college classes and outcomes is not simply due to ability bias, but reflects causal effects. Relative to whites, there are some large effects. A 10pp increase in Asian students is associated with a +2.1pp increase in college FEs - literally eight times larger than the (correlational) effect of STEM degrees.
The number is +1.5 for Hispanics and +0.4 for Blacks. Gender significantly predicts earnings: by about 9 percentiles.
3 Models
While correlations with the racial composition of the student body likely represent causal effects, the same is not true for the composition of student gender and field of study - especially since we are explicitly including any remaining par_rank and SAT score covariance in our causal estimate.
Therefore, when computing the first model, I used a model that controls for just field of study and gender. As mentioned, I also control for par_rank and SAT score, but I then add that covariance back into the residuals when computing the rankings. Finally, I interpret these residuals as causal effects. This constitutes the first model.
For the second model, I used more fine-grained fields of study (38 v 8 field-clusters) and the difference between a school's average math and average verbal SAT scores - a metric of how "mathy" a school is. The downside is that to add these variables, I have to pull from a different datasource and intersecting the two brings my number of schools down from 779 to 554.
For the third model, I go back to the original dataset of 779 schools, but I exclude any school with a major-cluster that differs from Harvard's major-cluster proportion by more than 20 percentage points. This excludes various schools with specific focuses on STEM, business, trades, nursing, etc. The idea is that this removes some of the hard-to-control-for confounders that must exist among such a heterogeneous sample. This results in 223 schools.
Then, I apply the same controls as in Model 1.
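One way to read the Model 3 exclusion rule is sketched below: drop a school if any single major-cluster share differs from Harvard's by more than 20 percentage points. This reading, the cluster names, and every number here are assumptions for illustration; the real filter lives in the repo.

```python
# Hypothetical major-cluster shares (fractions of graduates), for illustration only.
schools = {
    "Harvard": {"stem": 0.30, "business": 0.00, "humanities": 0.35, "social": 0.35},
    "Tech Institute": {"stem": 0.80, "business": 0.05, "humanities": 0.05, "social": 0.10},
    "Liberal Arts College": {"stem": 0.20, "business": 0.05, "humanities": 0.40, "social": 0.35},
}

THRESHOLD_PP = 20  # maximum allowed gap, in percentage points


def max_gap_pp(shares, reference):
    """Largest absolute gap between two schools' cluster shares, in pp."""
    clusters = set(shares) | set(reference)
    return 100 * max(abs(shares.get(c, 0.0) - reference.get(c, 0.0)) for c in clusters)


reference = schools["Harvard"]
kept = [
    name for name, shares in schools.items()
    if max_gap_pp(shares, reference) <= THRESHOLD_PP
]
print(kept)  # ['Harvard', 'Liberal Arts College']
```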
The rankings implied by all three models can be found here.
Controversial Models: 2b, 3b
You may find the above quite counterintuitive, so there is one last tweak I thought might plausibly be a good idea, but it's probably controversial.
Basically, the above models assume that whatever is left after first controlling for confounders is causal. However, you could imagine a world where college tier causes both higher earnings and a change in confounders: for instance, "good" colleges cause people to be more likely to major in economics rather than psychology, which causes them to earn more.
The above models rule such an effect out by fiat. We can relax that assumption by including a college's tier in the regression and then adding its covariance back in - similar to how we treat SAT score and par_rank. While this removes the assumption that college tier can't possibly cause both earnings and a confounder, it introduces a new assumption: that college tier is not associated with the human capital of who a college admits, at least after adjusting for the other confounders - that it actually only has causal effects. This assumption naively seems atrocious, but, seeing as we've included SAT score and parent income among the other confounders, it might actually be plausible.
Ultimately, I think the assumption is very controversial and too generous to top-tier colleges; however, I think the assumptions used in models 1-3 are probably too conservative and make top-tier colleges look worse than they actually are. Therefore, for completeness and fairness' sake, I modified models 2 and 3 to include tier in the regression.
The resulting dummy parameters were statistically insignificant across all tiers. For this reason, I ignore these models.
More Limitations
In addition to the issues already discussed, there are other issues with these rankings. One is sampling error. Simply put, if a college has 500 graduates per year, that's 1500 incomes in the sample (ages 32-34). Some back-of-the-envelope math suggests this means the 95% confidence interval around a typical college's ranking is probably around ±1.4pp.
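A figure of this size can be reproduced with a standard-error calculation. The sketch below assumes income ranks within a college are spread roughly like a uniform distribution on 0-100. That assumption is introduced here for illustration and may not match the original calculation, since the true within-college spread is probably somewhat smaller.

```python
import math

n_incomes = 1500  # incomes per college, from the text

# Assumption: within-college income ranks are roughly uniform on 0-100,
# so their standard deviation is 100 / sqrt(12), about 28.9 points.
sd_rank = 100 / math.sqrt(12)

# Half-width of a 95% confidence interval for the college's mean rank.
half_width = 1.96 * sd_rank / math.sqrt(n_incomes)

print(round(half_width, 2))  # 1.46
```

That lands in the same ballpark as the ±1.4pp quoted above.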
Another issue is that 8 (or 38) fields of study don't control for all field-based variance. For instance "engineering" is a single cluster even in the 38-field model, even though different disciplines have different earning potentials.
Finally, it seems likely that there exists further heterogeneity between schools' student bodies. For instance, students at liberal arts schools might prioritize earnings less than students at more conventional schools. If this were true, this would bias these causal estimates downwards.
Results
Here are the average residuals (hopefully causal effects) for different institution tiers by model. All are relative to the average residual for nonselective 4-year colleges.
Tier | Fixed Effects | Model 1 | Model 2 | Model 3 |
Ivy Plus | +14.66 | +6.69 | +2.11 | +1.95 |
Other elite schools | +11.44 | +4.89 | +0.29 | +0.22 |
Highly selective public | +12.73 | +4.28 | +0.10 | +0.60 |
Highly selective private | +8.53 | +2.96 | +0.17 | +2.99 |
Selective public | +6.31 | -0.09 | -0.87 | -5.22 |
Selective private | +5.95 | +0.12 | -0.65 | -5.45 |
I personally find models #2 and #3 most convincing. They can best be summarized as not supporting the idea that attending more selective colleges makes students earn more in adulthood - that virtually all the effects are, in fact, simple ability bias.
Lest you think that it's a bunch of rich trust-fund babies, parent income correlates positively with wage, hours worked, and probability of being employed up to at least the 99th percentile (Race and economic opportunity in the United States: An intergenerational perspective, Figure IV). Look, no one disputes that Harvard kids disproportionately come from wealthy families, but only 15% come from top-1% families (Harvard University). Is it possible that within the top 1% kids start getting lazy? It's possible, but when all the evidence points one direction, it's foolhardy to put much weight on such conjecture.
Still, despite the above non-results, these rankings do not suggest that schools have no effect on student incomes. For instance, the top colleges per the third model are
Estimated Effect | Name | Average SAT |
+10.0 | College Of The Holy Cross | 1360 |
+7.7 | Tufts University | 1455 |
+7.2 | Spelman College | 1125 |
+7.2 | Princeton University | 1510 |
+7.0 | Saint Anselm College | 1220 |
+7.0 | State University Of New York At Albany | 1170 |
+6.9 | Colgate University | 1385 |
Look, I don't know to what extent these ratings correspond to causal effects. I do know there is more reason to believe they do than any other existing rankings out there. And these rankings suggest there are some seriously undervalued colleges - that, for instance, Spelman College boosts its graduates' incomes by more than 20% relative to the average college despite being pretty easy to get into.
Finally, from a broader perspective, the correlation between the models' causal estimates and alumni's actual income (at the student-level) ranges from r~0.12 to r~0.17, depending on the model. This suggests that while your alma mater has some causal effect on your earnings, it explains at most about 3% of income inequality.
And this last result is very robust. For instance, if you just naively assume that the average income of a college's alumni represents pure causal effects, this correlation is actually even smaller: r~0.07.
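The step from a correlation to a "share of inequality explained" is just squaring r, reading "inequality" as variance in income. That reading is an assumption of this sketch.

```python
# Student-level correlations quoted above.
correlations = {
    "model estimates (low end)": 0.12,
    "model estimates (high end)": 0.17,
    "naive college averages": 0.07,
}

# Share of variance explained is the squared correlation.
variance_explained = {label: r**2 for label, r in correlations.items()}

for label, share in variance_explained.items():
    print(f"{label}: r^2 = {share:.1%}")
```

The squared correlations work out to roughly 1.4%, 2.9%, and 0.5% respectively.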
The tl;dr here is that where you go to college matters, which colleges are best barely correlates with conventional wisdom, and college sorting contributes only a minuscule amount to age-adjusted income inequality.