This study is a replication of:
Meng, Yunliang. 2021. Crime rates and contextual characteristics: A case study in Connecticut, USA. Human Geographies 15, (2) (11): 209-228, https://www.proquest.com/scholarly-journals/crime-rates-contextual-characteristics-case-study/docview/2638089143/se-2 (accessed April 6, 2025).
Key words: Connecticut, crime, inequality, contextual characteristics
Subject: Social and Behavioral Sciences: Geography: Human Geography
Date created: 04/06/2024
Date modified: 2025-04-16
Spatial Coverage: Connecticut, USA
Spatial Resolution: County Subdivisions
Spatial Reference System: EPSG: 2234
Temporal Coverage: 2013 - 2017
Temporal Resolution: 1 year
This is a replication of a study on crime and contextual characteristics in Connecticut. The original study uses geographically weighted regression to test how crime rates at the county subdivision level vary based on several socio-demographic characteristics.
The original study is observational, using socio-demographic indicators from the Census Bureau’s American Community Survey 5-year estimates and crime data from the Uniform Crime Report disseminated by the Federal Bureau of Investigation.
We will attempt to use the same methods and data sources as the original authors to see whether our results differ from theirs and whether any methodological details are missing from their paper.
There are two data sources for this study: demographic data from the American Community Survey and crime rate statistics from the Uniform Crime Report gathered by the FBI.
Title: CT Census Subdivision Socio-demographic Data
Abstract: BCT Census County Subdivision Socio-demographic Data
Spatial Coverage: Connecticut
Spatial Resolution: County Subdivision
Spatial Representation Type: vector
Spatial Reference System: EPSG: 2234
Temporal Coverage: 2013-2017
Temporal Resolution: 1 year
Lineage: collected using the census API and tidycensus package in R
Distribution: Publicly available
Constraints: Public data
Data Quality: trustworthy
## Reading layer `county_subdivision' from data source
## `/Users/dermotmcmillan/Desktop/GitHub/RPr-CT-crime/data/raw/public/county_subdivision.gpkg'
## using driver `GPKG'
## Simple feature collection with 173 features and 92 fields (with 4 geometries empty)
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -73.72777 ymin: 40.98014 xmax: -71.78699 ymax: 42.05059
## Geodetic CRS: NAD83
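The lineage above mentions the census API and the tidycensus package. A minimal sketch of how such a download might look is given here, under several assumptions: the helper name `fetch_ct_subdivisions` is hypothetical, only a handful of the variables are listed, and the call needs tidycensus, sf, a Census API key, and network access. The layer read above is in geographic NAD83 coordinates, so the sketch also projects the result to the stated reference system, EPSG: 2234.

```r
# A few of the ACS variables used in this study; the names become
# column prefixes in the downloaded table.
acs_vars <- c(
  total_population  = "B01003_001",
  median_income     = "B19013_001",
  poverty_total_pop = "B17001_001",
  poverty_below     = "B17001_002",
  housing_total     = "B25003_001",
  housing_renter    = "B25003_003"
)

# Hypothetical helper (not the authors' script). Nothing is downloaded
# until the function is called.
fetch_ct_subdivisions <- function(vars = acs_vars) {
  acs <- tidycensus::get_acs(
    geography = "county subdivision",
    variables = vars,
    state     = "CT",
    year      = 2017,     # end year of the 2013-2017 5-year estimates
    survey    = "acs5",
    output    = "wide",   # one row per subdivision; E/M suffixes mark estimate/margin
    geometry  = TRUE
  )
  # Project from geographic NAD83 to EPSG:2234 (NAD83 / Connecticut, ftUS).
  sf::st_transform(acs, 2234)
}
```

Calling `fetch_ct_subdivisions()` would return an sf object whose estimate columns carry an `E` suffix (for example `total_populationE`), so column names may need adjusting to match the labels in the data dictionary.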
| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
|---|---|---|---|---|---|---|---|
| total_population | B01003_001 | Total population (Estimate) | … | … | … | … | … |
| age_20m | B01001_008 | Population of Males aged 20 | … | … | … | … | … |
| age_21m | B01001_009 | Population of Males aged 21 | … | … | … | … | … |
| age_22_24m | B01001_010 | Population of Males aged 22-24 | … | … | … | … | … |
| age_25_29m | B01001_011 | Population of Males aged 25-29 | … | … | … | … | … |
| age_30_34m | B01001_012 | Population of Males aged 30-34 | … | … | … | … | … |
| age_20f | B01001_032 | Population of Females aged 20 | … | … | … | … | … |
| age_21f | B01001_033 | Population of Females aged 21 | … | … | … | … | … |
| age_22_24f | B01001_034 | Population of Females aged 22-24 | … | … | … | … | … |
| age_25_29f | B01001_035 | Population of Females aged 25-29 | … | … | … | … | … |
| age_30_34f | B01001_036 | Population of Females aged 30-34 | … | … | … | … | … |
| education_total | B15003_001 | Total population aged 25 and over | … | … | … | … | … |
| education_assoc | B15003_021 | Highest degree or the highest level of school completed = Associate's degree | … | … | … | … | … |
| education_ba | B15003_022 | Highest degree or the highest level of school completed = Bachelor's degree | … | … | … | … | … |
| education_ma | B15003_023 | Highest degree or the highest level of school completed = Master's degree | … | … | … | … | … |
| education_pro | B15003_024 | Highest degree or the highest level of school completed = Professional school degree | … | … | … | … | … |
| education_phd | B15003_025 | Highest degree or the highest level of school completed = Doctorate degree | … | … | … | … | … |
| median_income | B19013_001 | Median Household Income | … | … | … | … | … |
| poverty_total_pop | B17001_001 | Total population for whom poverty status is determined | … | … | … | … | … |
| poverty_below | B17001_002 | Income below the poverty level in last 12 months | … | … | … | … | … |
| unemployment_total | B23025_001 | Total population aged 16 and over | … | … | … | … | … |
| unemployment_total_in_labor | B23025_002 | Population in Labor Force | … | … | … | … | … |
| unemployment_unemployed | B23025_005 | Unemployed population considered to be in labor force | … | … | … | … | … |
| housing_total | B25003_001 | Occupied Housing Units | … | … | … | … | … |
| housing_renter | B25003_003 | Renter occupied Housing Units | … | … | … | … | … |
| housing_units_total | B25024_001 | Housing Units | … | … | … | … | … |
| housing_units_2 | B25024_004 | Housing Units w/ 2 units | … | … | … | … | … |
| housing_units_3_4 | B25024_005 | Housing Units w/ 3 or 4 units | … | … | … | … | … |
| housing_units_5_9 | B25024_006 | Housing Units w/ 5 to 9 units | … | … | … | … | … |
| housing_units_10_19 | B25024_007 | Housing Units w/ 10 to 19 units | … | … | … | … | … |
| housing_units_20_49 | B25024_008 | Housing Units w/ 20-49 units | … | … | … | … | … |
| housing_units_50 | B25024_009 | Housing Units w/ 50 or more units | … | … | … | … | … |
| moved_total | B07001_001 | Population aged 1 year and over in the US | … | … | … | … | … |
| moved_within_12_months | B07001_017 | Population that has moved homes in the past 12 months | … | … | … | … | … |
| households_total | B11003_001 | Family Type by Presence and Age of Own Children Under 18 Years | … | … | … | … | … |
| lone_parent_families_m | B11003_010 | Male householder, no wife present | … | … | … | … | … |
| lone_parent_families_f | B11003_016 | Female householder, no husband present | … | … | … | … | … |
| hispanic | B03002_012 | Hispanic | … | … | … | … | … |
| race_white | B03002_003 | Not Hispanic or Latino, White alone | … | … | … | … | … |
| race_black | B03002_004 | Not Hispanic or Latino, Black or African American alone | … | … | … | … | … |
| race_asian | B03002_006 | Not Hispanic or Latino, Asian alone | … | … | … | … | … |
| race_native | B03002_005 | Not Hispanic or Latino, American Indian and Alaska Native Alone | … | … | … | … | … |
| race_pacific | B03002_007 | Not Hispanic or Latino, Native Hawaiian and Other Pacific Islander Alone | … | … | … | … | … |
| race_other | B03002_008 | Not Hispanic or Latino, Some Other Race Alone | … | … | … | … | … |
| race_two_or_more | B03002_009 | Not Hispanic or Latino, Two or more races | … | … | … | … | … |
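
Most of the variables in this table are counts that have to be turned into percentages before modelling. The sketch below shows one way to do that in base R with the labels from the table. The denominators are assumptions (the original study does not state them), and the two rows of numbers are invented purely for illustration, not real ACS estimates.

```r
# Toy values, invented for illustration only (not real ACS estimates).
acs <- data.frame(
  name = c("Town A", "Town B"),
  total_population = c(10000, 20000),
  age_20m = c(100, 300), age_21m = c(100, 300), age_22_24m = c(200, 600),
  age_25_29m = c(300, 800), age_30_34m = c(300, 1000),
  age_20f = c(100, 300), age_21f = c(100, 300), age_22_24f = c(200, 600),
  age_25_29f = c(300, 800), age_30_34f = c(300, 1000),
  poverty_total_pop = c(9800, 19500), poverty_below = c(980, 3900),
  unemployment_total_in_labor = c(5000, 11000),
  unemployment_unemployed = c(250, 1100)
)

# Small helper: numerator over denominator, expressed as a percentage.
pct <- function(num, den) 100 * num / den

# All male and female age-group columns between 20 and 34.
age_cols <- grep("^age_", names(acs), value = TRUE)

# Assumed denominators: total population, poverty universe, labor force.
acs$pct_age_20_34  <- pct(rowSums(acs[age_cols]), acs$total_population)
acs$pct_poverty    <- pct(acs$poverty_below, acs$poverty_total_pop)
acs$pct_unemployed <- pct(acs$unemployment_unemployed,
                          acs$unemployment_total_in_labor)

acs[c("name", "pct_age_20_34", "pct_poverty", "pct_unemployed")]
```

Swapping a denominator (for example, `unemployment_total` instead of `unemployment_total_in_labor`) is a one-line change, which makes it easy to compare candidate transformations against the published summary statistics.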
Title: CT
Abstract: BCT Census town level Crime Data
Spatial Coverage: Connecticut
Spatial Resolution: town
Spatial Representation Type: non-spatial
Temporal Coverage: 2013-2017
Temporal Resolution: 1 year
Lineage: gathered on 04/06/2024 from http://data.ctdata.org/dataset/ucr-crime-index
Distribution: Publicly available
Constraints: Public data
Data Quality: good, reported from local law enforcement agencies
The threat to validity most relevant to this study is the Modifiable Areal Unit Problem, since crime rates will have different social and spatial patterns at different scales. There are also potential sources of error related to endogeneity and spatial autocorrelation, both of which are moderately accounted for in the original study. Additionally, the results do not have predictive power because the GWR is too regionally specific and overfit. Instead, these results can be interpreted as exploratory, requiring more rigorous research to contextualize and verify any findings. Bias is also inherent to crime data, since crime is socially constructed and criminality is at least partially defined around race and class in America. Over-policing and over-reporting in low-income areas and Black and brown neighborhoods introduce bias into the measurement of crime itself.
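Spatial autocorrelation, named above as a source of error, is commonly summarized with global Moran's I. The following is a minimal base R sketch of that statistic, offered as one possible diagnostic rather than the check used in the original study. The function name `morans_i` and the four-unit toy example are hypothetical.

```r
# Global Moran's I for values x and a spatial weights matrix w,
# where w[i, j] > 0 when units i and j are neighbours.
morans_i <- function(x, w) {
  stopifnot(is.matrix(w), nrow(w) == length(x), ncol(w) == length(x))
  z  <- x - mean(x)   # deviations from the mean
  n  <- length(x)
  s0 <- sum(w)        # sum of all weights
  (n / s0) * sum(w * outer(z, z)) / sum(z^2)
}

# Toy example: four units in a row, each neighbouring the next one.
w <- matrix(0, 4, 4)
w[cbind(1:3, 2:4)] <- 1
w <- w + t(w)         # make the neighbour relation symmetric

toy_i <- morans_i(c(1, 2, 3, 4), w)
toy_i
```

Values above the expected value under no autocorrelation, -1/(n - 1), suggest that similar values cluster in space. Applying the same function to model residuals is one way to see how much spatial structure a regression leaves unexplained.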
There are several methodological choices that the original authors did not specify, and which we will have to infer by comparing results and summary statistics. Specifically, we need to choose a spatial weights matrix for the GWR. We will start with the default ArcGIS spatial weights matrix (since they used the ArcGIS tool for their analysis) and go from there. If we cannot determine which one they used, we will choose our own and compare results. There are also some transformation choices for the census data that we will have to infer by comparing our data to the summary statistics provided (e.g., which denominator to use for percentages).
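To make the spatial weights choice concrete: GWR fits a separate weighted least squares regression at each location, with weights that decay with distance from that location. The base R sketch below uses a fixed-bandwidth Gaussian kernel, which is an assumption for illustration and not a confirmed setting of the original study. The function names, toy coordinates, and bandwidth are all hypothetical.

```r
# Gaussian distance-decay kernel: weight 1 at distance 0, falling with d.
gaussian_weights <- function(d, bandwidth) exp(-0.5 * (d / bandwidth)^2)

# Local GWR coefficients at observation i via weighted least squares:
# beta_i = (X' W_i X)^{-1} X' W_i y
gwr_local_coef <- function(X, y, coords, i, bandwidth) {
  d <- sqrt(rowSums(sweep(coords, 2, coords[i, ])^2))  # distances to unit i
  w <- gaussian_weights(d, bandwidth)
  drop(solve(crossprod(X * w, X), crossprod(X * w, y)))
}

# Toy data: y is an exact linear function of x1, so every local fit
# should recover the same intercept and slope whatever the weights.
coords <- cbind(x = c(0, 1, 2, 3, 4), y = c(0, 0, 1, 1, 2))
x1 <- c(1, 3, 2, 5, 4)
X  <- cbind(intercept = 1, x1 = x1)
y  <- 1 + 2 * x1

beta_1 <- gwr_local_coef(X, y, coords, i = 1, bandwidth = 2)
beta_1
```

With real data the coefficients differ by location, and changing the kernel or bandwidth changes them too, which is why the unspecified weights matter for matching the published maps.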
Data transformations for Crime and Census data are provided in the following workflow:
We will attempt to reproduce the graphics, model outputs and summary statistics provided in the study.
Graphs/figures we hope to reproduce:
For this figure, we hope to classify the beta values into equally spaced bins, with the bin around 0 treated as neutral.
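One way to build such a classification is sketched below in base R: an odd number of equal-width bins, symmetric around zero, so that the middle bin straddles 0 and can be mapped to a neutral colour. The function name `classify_beta` and the choice of five bins are assumptions for illustration.

```r
# Classify coefficients into equal-width bins centred on zero.
classify_beta <- function(beta, n_bins = 5) {
  stopifnot(n_bins %% 2 == 1)            # odd count gives a middle bin
  limit <- max(abs(beta), na.rm = TRUE)  # symmetric range: [-limit, limit]
  stopifnot(limit > 0)
  # Equally spaced break points whose end points are exactly -limit and limit.
  breaks <- limit * (2 * (0:n_bins) / n_bins - 1)
  cut(beta, breaks = breaks, include.lowest = TRUE)
}

beta_toy  <- c(-1, -0.5, 0, 0.5, 1)
beta_bins <- classify_beta(beta_toy)
table(beta_bins)
```

Because the range is symmetric, the most negative and most positive bins have the same width even when the coefficients themselves are skewed, which keeps a diverging colour scheme comparable on both sides of zero.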
There was no explicit research question. We will treat the results as exploratory and potentially fit a spatial lag model to expand on the research. We also hope to caveat the results with qualitative context and offer some criticisms of predicting crime in general. For each of the graphs and figures, we hope to get very similar results, but this will depend heavily on our ability to determine the spatial weights matrix and data transformations used by the original authors.
The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.
This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:[10.17605/OSF.IO/W29MQ](https://doi.org/10.17605/OSF.IO/W29MQ)