This study is a reproduction of Charles W. Sterling III et al’s study on “Connections Between Present-Day Water Access and Historical Redlining”.
Sterling III, Charles W., et al. “Connections between present-day water access and historical redlining.” Environmental Justice (2023). DOI:[10.1089/env.2022.0115](https://doi.org/10.1089/env.2022.0115)
This study uses ACS Census data and historical HOLC records of neighborhoods to examine the correlation of historical redlining and current-day access to water in cities. The study uses a binary logistic regression to identify relationships of different demographic and HOLC varaibles to access to water and sewage. The study finds that historically worse HOLC scores were correlated with less access to water in cities across all regions of the United States.
Key words: ACS, HOLC Grade, Redlining, Water
AccessSubject: Social and Behavioral Sciences: Geography:
Human GeographyDate created: 4/7/25Date modified: 4/7/25Spatial Coverage: A collection of inner-city cores with
historical HOLC grades that are stored in the University of Richmond
Mapping Inequality database.Spatial Resolution: HOLC polygonsSpatial Reference System: NAD83Temporal Coverage: 2016-2020Temporal Resolution: Observations collected yearlySpatial Coverage: A collection of inner-city cores with
historical HOLC grades that are stored in the University of Richmond
Mapping Inequality database.Spatial Resolution: HOLC polygonsSpatial Reference System: NAD83Temporal Coverage: 2016-2020Temporal Resolution: Observations collected yearly
(ACS)This study is a reproduction of Charles W. Sterling III et
al’s study on “Connections Between Present-Day Water Access and
Historical Redlining”, a study which set out to explore the relationship
between prevalence of the complete plumbing ACS variable in
census blocks and historical neighborhood HOLC grades. In the study,
Sterling hypothesizes that there will be a link between HOLC grade and
complete plumbing. This hypothesis is based on the HOLC’s well
documented history of assigning neighborhoods of color the lowest grades
and previous findings of a nationwide link between race and complete
plumbing prevalance as found in Deitz and Meehan 2019. Sterling’s work
uses similar method to this study, using a logistic regression to test
the link between the neighborhood grades and plumbing.
Other methods included in this study include the use of Areal Interpolation to supposedly assign ACS variable characteristics to the HOLC graded neighborhoods. This is somewhat of a strange strategy; often used to fill in missing or anamolous values in gridded data, areal interpolation doesn’t make a lot of sense for something like assigning data to an irregularly shaped polygon. Furthermore, areal interpolation assigns a value that is a average of several values of the areal units on either side of it. This suggests that the phenomena to be studied in the filled in area is a continuation of patterns that are present proximal to it. This is almost the opposite of the point of HOLC grades, grades assigned to neighborhoods based on their characteristics that set them apart. There’s a somewhat simple explanation to why Sterling chose to use this approach; The study that Sterling cites as the published usage of this technique for re-aggregating HOLC data - Fricker and Allen 2022 - used area-weighted-reaggregation to re-assign casualties from tornado paths to the neighborhoods they passed through, but incorrectly used the term ‘areal interpolation’ to describe their method. This means that likely Sterling also just used AWR. This assumption is supported by an interrogation of the python package used for the reaggregation, which considers AWR “the simplest form of areal interpolation”, a somewhat misleading designation given that the two refer to different techniques.
This study used only two data sources; American Community Survey (ACS) demographic data and maps of neighborhood’s historical mortgage HOLC grades, published and maintained on the University of Richmond’s Mapping Inequality Database.
Title: American Community Survey (ACS)Abstract: The ACS is a nationwide survey that collects
and produces information on social, economic, housing, and demographic
characteristics about our nation’s population every yearSpatial Coverage: United States of AmericaSpatial Resolution: ACS data is aggregated by Census
Block.Spatial Representation Type: vector,
MULTIPOLYGONSpatial Reference System: NAD83Temporal Coverage: ACS data from 2016-2020.Temporal Resolution: Data is collected annually.Lineage: Downloaded and used as-is.Distribution: Data is publically avalible from the US
Census (https://www.census.gov/programs-surveys/acs/data.html)[https://www.census.gov/programs-surveys/acs/data.html]Constraints: Publicly available data.Data Quality: N/ASee acs_values.csv at
/data/raw/public/acs_values.csv
Title: HOLC grades from the Mapping Inequality Database
(MID) - Abstract: HOLC grades are historical neighborhood
delineations created by the Home Owners’ Loan Corporation in the 1930s
to assess mortgage lending risk. These maps were used to guide
investment and disinvestment in urban areas and are a foundational
dataset for understanding redlining and its long-term impacts. -
Spatial Coverage: Select major U.S. cities -
Spatial Resolution: Neighborhood-scale boundaries (city
block to district level, variable by city) -
Spatial Representation Type: vector,
MULTIPOLYGON - Spatial Reference System:
NAD83 - Temporal Coverage: 1935–1940
(approximate dates of HOLC map creation) -
Temporal Resolution: Single mapping effort. -
Lineage: Downloaded and used as-is. -
Distribution: Data is avalible on the Mapping Inequality
website (https://dsl.richmond.edu/panorama/redlining/data)[https://dsl.richmond.edu/panorama/redlining/data] -
Constraints: Publicly available data under a CC-by-NC
2.5 license - Data Quality: Accurate to the degree that
the original digitization of the 1930’s HOLC maps was accurate.
| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
|---|---|---|---|---|---|---|---|
| city | City Name | City in which the HOLC designated area resides | character | N/A | Names of US cities | none | none |
| state | State Abbreviation | USPS state designator | character | N/A | The 50 states | N/A | none |
| category | Desirability | Assessment of investment desirability based on HOLC designation | character | subjective/historical | a range of desigations | none | none |
| grade | Grade | Historical HOLC grade | character | subjective/historical | A - D | none | none |
| label | Shape Label | HOLC grade and arbitrary number (A1, C3) to divide same-grade areas from the same city | character | N/A | A-D, number of same-grade zones in the same city | none | none |
| residential | Is it residential? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
| commercial | Is it commercial? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
| industrial | Is it industrial? | ” ” | binary (TRUE/FALSE) | N/A | TRUE/FALSE | none | none |
There are no prior observations related with this data.
Two main source of bias in this study come from the HOLC data itself, while another comes from the ACS data. Firstly, the spatial extent of the HOLC data only covers the centers of major cities across the united states, limiting the analysis to center-cities. Secondly, ‘Areal Interpolation’ was used to determine census data values for each HOLC neighborhood, a method that doesn’t seem logical or advisable give the study context. Some data in the study suggests that this actually was a more logical area weighted reaggregation. Additionally, it’s unclear how this areal interpolation was done. Lastly, there could be a difference in results if an areal unit other than census block groups were used to perform this analysis.
First, remap ACS values from block groups to HOLC polygons through areal interpolation, likely actually area-weighted reaggregation.
Using census statistics from the ACS dataset, resulting calculated assigned variables are: Black Population %, White Population %, Indigenous American %, Asian Population %, Hispanic or Latino Population %, % Below Poverty Line, % Housing Units Owned, % Housing Units Rented, % Mobile Homes, % Homes before 1980, % Homes after 1980, % With Complete Plumbing, % With Incomplete Plumbing, and % Foreign Born. This data is separated by region as well as collected nationally. Using regional and national averages for plumbing rates, a binary variable is calculated on whether plumbing is below or above the average. This bianary variable will be creaed with a case/when function, using a national average of 2.61668% incomplete plumbing to determine if HOLC neighborhoods fall above or below. Neighborhoods that fall above will be assigned a 1, others will be assigned a 0.
This data can then be verified with the Supplementary Data Tables.
Then, use the Dunn Test to determine whether incomplete plumbing % in Grade A is significantly different than the other Grades and compared with supplementary table 1.
Using a Stepwise function with the \(StepAIC\) package and the HOLC Grade A as a reference group, we can gather a correlation table to remove the highest correlation values. In this case, the highest correlation value will be removed, and compared with supplementary table 2.
Sterling III’s original study used a binary logistic regression model to compare categorical hold grades and a census block group’s binary desigantion of whether plumbing rates were above or below average. Results were gathered as the Average Marginal Effects (AME) of different census-derived varaibles represented as binary values.
To test potential unaccounted for geographic correlation, we will re-run using a geographic weighted regression (GWR) to assess the degree to which the proximity of similar HOLC grades effect the statistical correlation found by the original authors.
Data is presented as 3 figures, 2 tables, and 7 supplementary tables:
Figures:
Tables:
Supplementary Tables:
National Data Tables - AME Table for the Northeast region using the average 1.447623 as the threshold for the binary variable - AME Table for the South region using the average 3.414391 as the threshold for the binary variable - AME Table for the Midwest region using the average 3.756471 as the threshold for the binary variable - AME Table for the West region using the average 0.7836609 as the threshold for the binary variable
Sterling hypothesizes that areas of present-day incomplete plumbing within U.S. cities (i.e., communities with a proportion of homes lacking complete plumbing above the national average) are significantly associated with HOLC neighborhood designations. The study finds evidence for this hypothesis through the correlation of lower HOLC grades and incomplete plumbing.
A successful reproduction would confirm this hypothesis with similar derived values to those found in the original study. Because the data transformations that are a part of this workflow are relatively simple, I would expect that there is very little margin for error in the difference of the final product from that of Sterling. The one potential complication is Sterling’s unsubstantiated use of either AWR or AI, which will need to be investigated. Because we’re not sure what method he used, our results could be slightly different depending on what package was chosen.
Comparing the results of the bianary logistic regression and the geographic weighted regression will be interesting, assessing the degree to which geographic correlation plays a role in the significance found as part of the results of this study.
The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.
This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:[10.17605/OSF.IO/W29MQ](https://doi.org/10.17605/OSF.IO/W29MQ)
Sterling III, Charles W., et al. “Connections between present-day water access and historical redlining.” Environmental Justice (2023). DOI:[10.1089/env.2022.0115](https://doi.org/10.1089/env.2022.0115)
Shiloh Deitz and Katie Meehan. “Plumbing Poverty: Mapping Hot Spots of Racial and Geographic Inequality in U.S. Household Water Insecurity.” Annals of the American Association of Geographers 109 (2019): 1092–1109.
Tyler Fricker and Douglas L. Allen. “A Place-Based Analysis of Tornado Activity and Casualties in Shreveport, Louisiana.” Natural Hazards 113 (2022): 1853–1874.