chetty economics personal-finance education social-politics inequality intelligence twins

Opportunity Atlas Adjustment

Chetty et al's Opportunity Atlas (The Opportunity Atlas) and the associated data (Data Library) include the typical income of children from a neighborhood, adjusted for their race and their parents' income.

Chetty et al give reasons why these estimates likely reflect causal effects, but an obvious confounder they don't control for is the intelligence of the children in an area - even though they know that the correlation between a child's SAT score and later income persists after controlling for their parents' income and the income of their neighbors (The opportunity atlas: Mapping the childhood roots of social mobility).

This page is my attempt to correct this oversight.

I laid out the correlations between parent income, child income, and child SAT scores in footnotes here. In broad strokes, the correlations are roughly

Variables                      Approximate r
child_income ~ parent_income   0.29
child_income ~ sat_score       0.19
sat_score ~ parent_income      0.30

If we assume these variables come from a multivariate normal distribution, we can simulate this data. First, we compute the residuals of the child_income ~ parent_income model, which is essentially what Chetty uses. Next, we linearly fit these residuals against sat_score. This analysis reveals that a 1 SD increase in a child's SAT score predicts a 0.11 SD increase in their eventual earnings, even after controlling for parental income.
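
A minimal sketch of this simulation, assuming standardized variables (mean 0, SD 1) and the rounded correlations from the table above. The function name, sample size, and seed are illustrative choices, not taken from the original analysis. With these rounded inputs the slope comes out near 0.10 (analytically, 0.19 - 0.29 × 0.30 ≈ 0.10), slightly below the 0.11 quoted above, which presumably came from unrounded correlations.

```python
import numpy as np

# Rounded correlations from the table above.
# Variable order: parent_income, child_income, sat_score.
R = np.array([
    [1.00, 0.29, 0.30],
    [0.29, 1.00, 0.19],
    [0.30, 0.19, 1.00],
])


def residual_slope(n=200_000, seed=0):
    """Slope of (child_income residualized on parent_income) on sat_score."""
    rng = np.random.default_rng(seed)
    parent, child, sat = rng.multivariate_normal(np.zeros(3), R, size=n).T

    # Step 1: fit child_income ~ parent_income and keep the residuals.
    b_parent = np.cov(child, parent)[0, 1] / np.var(parent, ddof=1)
    resid = (child - child.mean()) - b_parent * (parent - parent.mean())

    # Step 2: fit the residuals against sat_score.
    return np.cov(resid, sat)[0, 1] / np.var(sat, ddof=1)


simulated = residual_slope()
analytic = 0.19 - 0.29 * 0.30  # cov(residual, sat) when var(sat) = 1
print(f"simulated slope: {simulated:.3f}, analytic slope: {analytic:.3f}")
```

Because the residuals are regressed on the raw SAT score (not on SAT residualized against parental income), this follows the two-step procedure described above rather than a full multiple regression.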

The next step is to see how much of the variation in Chetty et al's mobility estimates is explained by average test scores by census tract.

We get hints that the answer is "a lot" (Where is the land of opportunity?):

We obtain data on mean grade 3–8 math and English test scores by [commuting zone] from the Global Report Card. The Global Report Card converts school district–level scores on statewide tests to a single national scale by benchmarking statewide test scores to scores on the National Assessment of Educational Progress (NAEP) test... We regress test scores on mean parent family income (from 1996 to 2000) in the core sample and compute residuals to obtain an income-adjusted measure of test score gains...

The income-adjusted test score and dropout rates are very highly correlated with upward mobility across all specifications, as shown in the fourth panel of Figure VIII. In the baseline specification, the magnitude of the correlation between both measures and upward mobility is nearly 0.6.

So even after adjusting for race, parent incomes, and neighbor income, a strong and robust correlation persists between the average test score in a school district and the incomes of its alumni.

Chetty et al's conclusion is that school quality is paramount to child success. An equally consistent hypothesis is that each child's SAT score predicts that specific child's success. The truth is probably in between, but where exactly?

The simplest path forward is to compute the slope between census tracts' average SAT scores and their average mobility scores. Then, we compare this estimate to the 0.43 slope we found above. The difference would estimate the total effect of having smart neighbors, smart peers, and good schools.

Unfortunately, Chetty et al don't provide the school data they used in their analysis. Fortunately, they do provide another set of test score data at the census-tract level - though this data is entirely derived from school-district-level data (Data Library; Codebook for Table 9).

The definition for this variable is (Codebook for Table 9):

Mean 3rd grade math test scores in 2013. Obtained from the Stanford Education Data Archive (SEDA) and measured at the district level. We create a crosswalk from districts to tracts by weighting by the proportion of land area that a given school district covers in a tract.
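
As a rough illustration of that crosswalk, here is a sketch under my reading of the codebook: each tract's score is the average of the scores of the districts overlapping it, weighted by the share of the tract's land area each district covers. The function name, the identifiers, and the toy numbers are hypothetical and not taken from the Opportunity Insights replication code.

```python
from collections import defaultdict


def tract_scores(overlaps, district_score):
    """Area-weighted district-to-tract crosswalk.

    overlaps: iterable of (tract_id, district_id, share_of_tract_area)
    district_score: dict mapping district_id -> mean test score
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for tract, district, share in overlaps:
        if district not in district_score:
            continue  # no score for this district; skip its area
        weighted_sum[tract] += share * district_score[district]
        total_weight[tract] += share
    # Renormalize so that skipped districts don't drag the average down.
    return {t: weighted_sum[t] / total_weight[t]
            for t in weighted_sum if total_weight[t] > 0}


# Hypothetical example: tract T1 straddles two districts, T2 sits in one.
overlaps = [("T1", "A", 0.75), ("T1", "B", 0.25), ("T2", "B", 1.0)]
scores = {"A": 3.0, "B": 4.0}
result = tract_scores(overlaps, scores)
print(result)  # {'T1': 3.25, 'T2': 4.0}
```

Note that every tract lying wholly inside one district simply inherits that district's score, which is why the tract-level variable carries no information beyond the district-level data.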

Sadly, I couldn't find any concrete citation, so I went looking on my own for these two data sources: the Global Report Card and the SEDA. Equally sadly, the former's website is now defunct, so I will need to use the Internet Archive and hope it ended up indexing everything I need.

The Global Report Card's test score metric is pretty straightforward (Greene & McGee):

For example, the average student in Scarsdale School District in Westchester County, New York scored nearly one standard deviation above the mean for New York on the state's math exam. The average student in New York scored six hundredths of a standard deviation above the national average of the NAEP exam given in the same year, and the average student in the United States scored about as far in the negative direction (-.055) from the international average on PISA. Our final index score for Scarsdale in 2007 is equal to the sum of the district, state, and national estimates (1+.06+ -.055 = 1.055). Since the final index score is expressed in standard deviation units, it can easily be converted to a percentile for easy interpretation. In our example, Scarsdale would rank at the seventy seventh percentile internationally in math.

The SEDA's metric follows a vaguely similar methodology, but is full of other nuances (Fahle et al). In addition to the scale with mean=0, sd=1, they also provide a "Grade Standardized Scale" where a "4" is defined as the average score of a 4th grader and an "8" is defined as the average score of an 8th grader. The remaining scores are inferred via linear transformation. Based on the information in that source (especially Table 4), it looks like the average student moves up about 1/3 of a standard deviation in score each year in both reading and math.
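
To make the linear transformation concrete, here is a simplified sketch based only on the description above, not on SEDA's actual procedure. The function name and the two anchor means are hypothetical: they are chosen so that the average student gains 1/3 of a student-level SD per year, meaning the grade 8 mean sits 4/3 SD above the grade 4 mean.

```python
def to_grade_scale(score_sd, mean_g4_sd, mean_g8_sd):
    """Map a score in student-level SD units onto the grade scale.

    The line is pinned so that the grade 4 mean maps to 4 and the
    grade 8 mean maps to 8; everything else follows linearly.
    """
    return 4.0 + 4.0 * (score_sd - mean_g4_sd) / (mean_g8_sd - mean_g4_sd)


# Hypothetical anchors: 1/3 SD of growth per year over four years.
MEAN_G4 = 0.0
MEAN_G8 = 4.0 / 3.0

print(to_grade_scale(MEAN_G4, MEAN_G4, MEAN_G8))  # 4.0
print(to_grade_scale(MEAN_G8, MEAN_G4, MEAN_G8))  # 8.0
# One student-level SD above the grade 4 mean is about 3 grade levels up.
print(to_grade_scale(1.0, MEAN_G4, MEAN_G8))
```

Under these assumed anchors, one grade level corresponds to 1/3 of a student-level SD, which is the conversion factor used when translating between the two scales.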

When I inspect the actual values Chetty et al provide, I find a median of 3.22 with half the data lying between 2.61 and 3.76. This sure makes it look like we're dealing with the latter scale. This suggests the standard deviation of district-test-score-averages is about 0.85 grade levels, or about 0.28 student-level standard deviations.

This suggests Chetty et al's estimates will be biased upwards by about 0.12 SD in earnings per SD in school-district test score.
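
The arithmetic behind the last two paragraphs can be checked with a short script. This is a sketch that assumes the district averages are roughly normally distributed, so that the interquartile range is about 1.349 SDs; it takes the quartiles, the 1/3 SD-per-grade conversion, and the 0.43 slope directly from the text above. The variable names are my own.

```python
from statistics import NormalDist

# Quartiles of the tract-level test score variable, as reported above.
q1, q3 = 2.61, 3.76
iqr = q3 - q1

# For a normal distribution, IQR = 2 * z(0.75) * SD, i.e. about 1.349 * SD.
sd_grade_levels = iqr / (2 * NormalDist().inv_cdf(0.75))

# One grade level is about 1/3 of a student-level SD (see the SEDA discussion).
sd_student_units = sd_grade_levels / 3

# Scale the 0.43 slope from the text by the spread of district averages.
bias = 0.43 * sd_student_units

print(f"SD in grade levels:      {sd_grade_levels:.2f}")   # 0.85
print(f"SD in student-level SDs: {sd_student_units:.2f}")  # 0.28
print(f"Implied upward bias:     {bias:.2f}")              # 0.12
```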

If we conservatively assume there is no variance in test scores between schools within a district, TODO...

The Opportunity Atlas. Opportunity Insights. https://www.opportunityatlas.org/
Data Library: Publicly available data we've produced and replication code. Opportunity Insights. https://opportunityinsights.org/data/
Chetty, R., Friedman, J. N., Hendren, N., Jones, M. R., & Porter, S. R. (2018). The opportunity atlas: Mapping the childhood roots of social mobility (No. w25147). National Bureau of Economic Research. https://doi.org/10.3386/w25147
Chetty, R., Hendren, N., Kline, P., & Saez, E. (2014). Where is the land of opportunity? The geography of intergenerational mobility in the United States. The Quarterly Journal of Economics, 129(4), 1553-1623. https://doi.org/10.1093/qje/qju022
Codebook for Table 9: Neighborhood Characteristics by Census Tract. Opportunity Insights. https://opportunityinsights.org/wp-content/uploads/2019/07/Codebook-for-Table-9.pdf
Fahle, E. M., Shear, B. R., Kalogrides, D., & Reardon, S. F. (2017). Stanford Education Data Archive: Technical Documentation. https://cepa.stanford.edu/sites/default/files/SEDA_documentation_v20b.pdf
Global Report Card. (2021). Internet Archive: Wayback Machine. https://web.archive.org/web/20210728200553/http://www.globalreportcard.org/
Greene, J., & McGee, J. B. (2021). About the Index. Global Report Card. Internet Archive: Wayback Machine. https://web.archive.org/web/20210412033249/http://www.globalreportcard.org/about.html
George W. Bush Presidential Center. https://www.bushcenter.org/