Mobility: Location (Part 2)
Adjusting for Cognitive Ability
Chetty et al's Opportunity Atlas and its associated Data Library include the typical income of children from a neighborhood, adjusted for their race and parents' income.
Chetty et al give reasons why these estimates likely reflect causal effects, but one obvious confounder they don't control for is the intelligence of the children in an area. This is despite knowing that the correlation between a child's SAT score and later income persists even after controlling for their parents' income and the income of their neighbors (The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility).
Let's correct this oversight.
In my college ranking sequence, I consider the associations among parent income, child income, and child SAT scores. In broad strokes, the correlations are roughly:
| Variables | Approximate r |
| --- | --- |
| child_income ~ parent_income | 0.29 |
| child_income ~ sat_score | 0.19 |
| sat_score ~ parent_income | 0.30 |
We'd prefer to fit a single model predicting child income from parent income and SAT score within each census tract, but we sadly lack the data. Instead, we will take Chetty's income-adjusted estimates and then use some nifty math to adjust for SAT scores on top of that. This ends up finding that we should subtract 0.103 SD of child income for every 1 SD increase in SAT score.
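To make the "nifty math" concrete: since Chetty et al's estimates already hold parent income fixed, one way to recover the 0.103 figure is to take the raw income–SAT correlation and strip out the part that runs through parent income. A minimal sketch, using the approximate correlations above (this derivation is my reconstruction, not necessarily the exact calculation used):

```python
# Approximate correlations from the college-ranking sequence.
r_income_parent = 0.29  # child_income ~ parent_income
r_income_sat = 0.19     # child_income ~ sat_score
r_sat_parent = 0.30     # sat_score ~ parent_income

# Chetty et al's estimates are already adjusted for parent income, so
# subtract the income <- parent -> SAT path from the raw income~SAT
# correlation to get the residual SAT association.
adjustment = r_income_sat - r_income_parent * r_sat_parent
print(f"{adjustment:.3f} SD of child income per 1 SD of SAT")
# → 0.103 SD of child income per 1 SD of SAT
```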
Unfortunately, Chetty et al don't provide the average SAT score by census tract, so we use the `gsmn_math_g3_2013` variable instead: the average score on a standardized 3rd-grade math test. To make this substitution reasonable, however, we need to know the standard deviation of this test score.
We get hints at the answer in "Where Is the Land of Opportunity?":
We obtain data on mean grade 3–8 math and English test scores by [commuting zone] from the Global Report Card. The Global Report Card converts school district–level scores on statewide tests to a single national scale by benchmarking statewide test scores to scores on the National Assessment of Educational Progress (NAEP) test... We regress test scores on mean parent family income (from 1996 to 2000) in the core sample and compute residuals to obtain an income-adjusted measure of test score gains...
The income-adjusted test score and dropout rates are very highly correlated with upward mobility across all specifications, as shown in the fourth panel of Figure VIII. In the baseline specification, the magnitude of the correlation between both measures and upward mobility is nearly 0.6.
The definition is elaborated on in the dataset's codebook (Codebook for Table 9):
Mean 3rd grade math test scores in 2013. Obtained from the Stanford Education Data Archive (SEDA) and measured at the district level. We create a crosswalk from districts to tracts by weighting by the proportion of land area that a given school district covers in a tract.
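The land-area weighting the codebook describes can be sketched as follows. The data layout, names, and numbers here are all hypothetical (neither SEDA nor the tract shapefiles are distributed in this form); the point is just the area-weighted average:

```python
# Hypothetical crosswalk rows: (tract_id, district_id, share of the tract's
# land area covered by that district).
crosswalk = [
    ("T1", "D1", 0.6),
    ("T1", "D2", 0.4),
]
district_scores = {"D1": 3.0, "D2": 4.0}  # made-up grade-3 math scores

def tract_score(tract_id, crosswalk, district_scores):
    """Area-weighted average of district-level scores within one tract."""
    rows = [(d, w) for t, d, w in crosswalk if t == tract_id]
    total = sum(w for _, w in rows)
    return sum(district_scores[d] * w for d, w in rows) / total

print(round(tract_score("T1", crosswalk, district_scores), 3))  # → 3.4
```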
Sadly, I couldn't find a concrete citation, so I went looking on my own for these two data sources: the Global Report Card and SEDA. The former's website is now defunct, so I used the Internet Archive to investigate further, but it ended up being a dead end.
The Global Report Card's test score metric is pretty straightforward (Greene):
For example, the average student in Scarsdale School District in Westchester County, New York scored nearly one standard deviation above the mean for New York on the state's math exam. The average student in New York scored six hundredths of a standard deviation above the national average of the NAEP exam given in the same year, and the average student in the United States scored about as far in the negative direction (-.055) from the international average on PISA. Our final index score for Scarsdale in 2007 is equal to the sum of the district, state, and national estimates (1 + .06 + -.055 = 1.005). Since the final index score is expressed in standard deviation units, it can easily be converted to a percentile for easy interpretation. In our example, Scarsdale would rank at the seventy-seventh percentile internationally in math.
The SEDA's metric follows a vaguely similar methodology, but is full of other nuances (Fahle). In addition to the scale with mean = 0, sd = 1, they also provide a "Grade Standardized Scale," where a "4" is defined as the average score of a 4th grader and an "8" as the average score of an 8th grader; the remaining scores are inferred via linear transformation. Based on the information in that source (especially Table 4), it looks like the average student moves up about 1/3 of a standard deviation in score each year in both reading and math.
When I inspect the actual values Chetty et al provide, I find a median of 3.22, with half the data lying between 2.61 and 3.76. This sure makes it look like we're dealing with the Grade Standardized Scale.
So, our approach is this: we convert `gsmn_math_g3_2013` to (shifted) z-scores by dividing by three, then multiply by 4.1 and adjust Chetty et al's mobility estimates by that many percentiles.
For instance, if students in a neighborhood are scoring 2 SD higher on math tests in third grade, we'll subtract 8.2pp from Chetty et al's mobility statistics, thereby removing the bias that would otherwise plague smart-but-poor and dumb-but-rich neighborhoods.
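Putting the pieces together, the whole adjustment can be sketched as below. The function name, column handling, and the choice to center at the median of 3.22 are my assumptions; the text above only pins down the divide-by-3 conversion and the 4.1-percentile-per-SD factor:

```python
def adjust_mobility(mobility_pctile, gsmn_math_g3_2013, center=3.22):
    # Grade-standardized scores move ~1/3 SD per grade, so dividing by 3
    # converts grade units into SD units; centering at the median of 3.22
    # (an assumption) pins the average tract to zero adjustment.
    z = (gsmn_math_g3_2013 - center) / 3.0
    # 0.103 SD of child income corresponds to ~4.1 percentiles of mobility.
    return mobility_pctile - 4.1 * z

# A tract scoring 2 SD (= 6 grade-scale points) above the center loses 8.2pp:
print(round(adjust_mobility(50.0, 3.22 + 6.0), 1))  # → 41.8
```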
Ultimately, these adjusted metrics correlate very closely with the original metrics (r ≈ 0.99), and the slope when predicting the adjusted mobility estimate from the naive one is about 0.98. Recall that the slope between the naive mobility estimate and the actual mobility effect is around 0.6, so most of the remaining confounding is evidently unrelated to intelligence.
Covariates
todo