Abstract

This study is a replication of:

Meng, Yunliang. 2021. Crime rates and contextual characteristics: A case study in Connecticut, USA. Human Geographies 15, (2) (11): 209-228, https://www.proquest.com/scholarly-journals/crime-rates-contextual-characteristics-case-study/docview/2638089143/se-2 (accessed April 6, 2025).

Study metadata

Original study spatio-temporal metadata

  • Spatial Coverage: Connecticut, USA
  • Spatial Resolution: County Subdivisions
  • Spatial Reference System: EPSG: 2234
  • Temporal Coverage: 2013 - 2017
  • Temporal Resolution: 1 year

Study design

This is a replication of a study on crime and contextual characteristics in Connecticut. The original study uses geographically weighted regression to test how crime rates at the county subdivision level vary based on several socio-demographic characteristics.
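For orientation, the usual textbook form of a geographically weighted regression is sketched below. The notation ($y_i$, $x_{ik}$, $(u_i, v_i)$, $W_i$) is ours and is an assumption; the exact specification in the original paper may differ.

```latex
y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i,
\qquad
\hat{\boldsymbol{\beta}}(u_i, v_i) = \left( X^{\top} W_i X \right)^{-1} X^{\top} W_i \, \mathbf{y}
```

Here $y_i$ is the crime rate in county subdivision $i$, $x_{ik}$ is its $k$-th socio-demographic characteristic, $(u_i, v_i)$ are the coordinates of its centroid, and $W_i$ is a diagonal matrix of kernel weights that decline with distance from $(u_i, v_i)$.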

The original study is observational. It uses socio-demographic indicators from the Census Bureau’s American Community Survey 5-year estimates and crime data from the Uniform Crime Report disseminated by the Federal Bureau of Investigation.

We will attempt to use the same methods and data sources as the original author to see whether our results differ from the published ones and whether any methodological details are missing from the published account.

Materials and procedure

Computational environment

Data and variables

There are two data sources for this study: demographic data from the American Community Survey and crime rate statistics from the Uniform Crime Report gathered by the FBI.

Census County Subdivisions

  • Title: CT Census Subdivision Socio-demographic Data
  • Abstract: CT Census County Subdivision Socio-demographic Data
  • Spatial Coverage: Connecticut
  • Spatial Resolution: County Subdivision
  • Spatial Representation Type: vector
  • Spatial Reference System: EPSG: 2234
  • Temporal Coverage: 2013-2017
  • Temporal Resolution: 1 year
  • Lineage: collected using the census API and tidycensus package in R
  • Distribution: Publicly available
  • Constraints: Public data
  • Data Quality: trustworthy
## Reading layer `county_subdivision' from data source 
##   `/Users/dermotmcmillan/Desktop/GitHub/RPr-CT-crime/data/raw/public/county_subdivision.gpkg' 
##   using driver `GPKG'
## Simple feature collection with 173 features and 92 fields (with 4 geometries empty)
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -73.72777 ymin: 40.98014 xmax: -71.78699 ymax: 42.05059
## Geodetic CRS:  NAD83
| Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
|---|---|---|---|---|---|---|---|
| total_population | B01003_001 | Total population (estimate) | | | | | |
| age_20m | B01001_008 | Population of males aged 20 | | | | | |
| age_21m | B01001_009 | Population of males aged 21 | | | | | |
| age_22_24m | B01001_010 | Population of males aged 22-24 | | | | | |
| age_25_29m | B01001_011 | Population of males aged 25-29 | | | | | |
| age_30_34m | B01001_012 | Population of males aged 30-34 | | | | | |
| age_20f | B01001_032 | Population of females aged 20 | | | | | |
| age_21f | B01001_033 | Population of females aged 21 | | | | | |
| age_22_24f | B01001_034 | Population of females aged 22-24 | | | | | |
| age_25_29f | B01001_035 | Population of females aged 25-29 | | | | | |
| age_30_34f | B01001_036 | Population of females aged 30-34 | | | | | |
| education_total | B15003_001 | Total population | | | | | |
| education_assoc | B15003_021 | Highest degree or level of school completed = Associate's degree | | | | | |
| education_ba | B15003_022 | Highest degree or level of school completed = Bachelor's degree | | | | | |
| education_ma | B15003_023 | Highest degree or level of school completed = Master's degree | | | | | |
| education_pro | B15003_024 | Highest degree or level of school completed = Professional school degree | | | | | |
| education_phd | B15003_025 | Highest degree or level of school completed = Doctorate degree | | | | | |
| median_income | B19013_001 | Median household income | | | | | |
| poverty_total_pop | B17001_001 | Total population | | | | | |
| poverty_below | B17001_002 | Income below the poverty level in the last 12 months | | | | | |
| unemployment_total | B23025_001 | Total population | | | | | |
| unemployment_total_in_labor | B23025_002 | Population in labor force | | | | | |
| unemployment_unemployed | B23025_005 | Unemployed population considered to be in labor force | | | | | |
| housing_total | B25003_001 | Occupied housing units | | | | | |
| housing_renter | B25003_003 | Renter-occupied housing units | | | | | |
| housing_units_total | B25024_001 | Housing units | | | | | |
| housing_units_2 | B25024_004 | Housing units w/ 2 units | | | | | |
| housing_units_3_4 | B25024_005 | Housing units w/ 3 or 4 units | | | | | |
| housing_units_5_9 | B25024_006 | Housing units w/ 5 to 9 units | | | | | |
| housing_units_10_19 | B25024_007 | Housing units w/ 10 to 19 units | | | | | |
| housing_units_20_49 | B25024_008 | Housing units w/ 20 to 49 units | | | | | |
| housing_units_50 | B25024_009 | Housing units w/ 50 or more units | | | | | |
| moved_total | B07001_001 | Population 1 year or more in the US | | | | | |
| moved_within_12_months | B07001_017 | Population that has moved homes in the past 12 months | | | | | |
| households_total | B11003_001 | Family Type by Presence and Age of Own Children Under 18 Years | | | | | |
| lone_parent_families_m | B11003_010 | Male householder, no wife present | | | | | |
| lone_parent_families_f | B11003_016 | Female householder, no husband present | | | | | |
| hispanic | B03002_012 | Hispanic | | | | | |
| race_white | B03002_003 | Not Hispanic or Latino, White alone | | | | | |
| race_black | B03002_004 | Not Hispanic or Latino, Black or African American alone | | | | | |
| race_asian | B03002_006 | Not Hispanic or Latino, Asian alone | | | | | |
| race_native | B03002_005 | Not Hispanic or Latino, American Indian and Alaska Native alone | | | | | |
| race_pacific | B03002_007 | Not Hispanic or Latino, Native Hawaiian and Other Pacific Islander alone | | | | | |
| race_other | B03002_008 | Not Hispanic or Latino, Some Other Race alone | | | | | |
| race_two_or_more | B03002_009 | Not Hispanic or Latino, Two or more races | | | | | |
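Several indicators in this table come with more than one plausible denominator, for example `unemployment_total` versus `unemployment_total_in_labor`. The sketch below shows one way to compute an indicator under each candidate so the result can be compared with published summary statistics. It is written in plain Python purely as an illustration and is not part of the study's R workflow; the function name `percent` and the counts are hypothetical, not real ACS values.

```python
def percent(numerator, denominator):
    """Return numerator as a percentage of denominator, or None if undefined."""
    if denominator is None or denominator == 0:
        return None
    return 100.0 * numerator / denominator

# Hypothetical counts for one county subdivision (not real ACS values).
town = {
    "unemployment_total": 20000,           # B23025_001
    "unemployment_total_in_labor": 13000,  # B23025_002
    "unemployment_unemployed": 910,        # B23025_005
}

# Compute the same indicator under each candidate denominator.
candidates = {
    denom: percent(town["unemployment_unemployed"], town[denom])
    for denom in ("unemployment_total", "unemployment_total_in_labor")
}

for denom, value in candidates.items():
    print(f"{denom}: {value:.2f}%")
```

Whichever candidate reproduces the published mean and range most closely across all county subdivisions is the more likely choice, although a close match does not prove it was the one used.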

Connecticut Crime Rate / Type

  • Title: CT
  • Abstract: CT Census town-level Crime Data
  • Spatial Coverage: Connecticut
  • Spatial Resolution: town
  • Spatial Representation Type: non-spatial
  • Temporal Coverage: 2013-2017
  • Temporal Resolution: 1 year
  • Lineage: gathered on 04/06/2024 from http://data.ctdata.org/dataset/ucr-crime-index
  • Distribution: Publicly available
  • Constraints: Public data
  • Data Quality: good, reported from local law enforcement agencies

Prior observations

Bias and threats to validity

The threat most relevant to this study is the Modifiable Areal Unit Problem (MAUP), since crime rates show different social and spatial patterns at different scales. There are also potential sources of error related to endogeneity and spatial autocorrelation, both of which are moderately accounted for in the original study. Additionally, the results do not have predictive power because the GWR is too regionally specific and overfit. Instead, these results should be interpreted as exploratory, requiring more rigorous research to contextualize and verify any findings. Bias is also inherent to crime data, since crime is socially constructed and criminality is at least partially defined around race and class in America. Over-policing and over-reporting in low-income areas and Black and brown neighborhoods introduce bias into the measurement of crime itself.
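Spatial autocorrelation in crime rates or model residuals is commonly summarized with global Moran's I. The sketch below is a minimal pure-Python version using binary contiguity weights. The four-unit chain and its values are invented for illustration, and this is not necessarily the diagnostic used in the original study.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a dict {(i, j): w_ij}."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights.values())  # sum of all weights
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)

# Toy example: four units in a row, each neighbouring the next (symmetric weights).
values = [1.0, 2.0, 3.0, 4.0]
weights = {}
for i in range(len(values) - 1):
    weights[(i, i + 1)] = 1.0
    weights[(i + 1, i)] = 1.0

print(round(morans_i(values, weights), 3))
```

Values well above zero indicate that similar values cluster in space; values near zero suggest little spatial structure.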

Data transformations / analysis

There are several methodological choices that the original author did not specify, which we will have to infer by comparing results and summary statistics. Specifically, we need to choose a spatial weights matrix for the GWR. We will start with the default ArcGIS spatial weights settings (since the original analysis used the ArcGIS tool) and go from there. If we cannot determine which one was used, we will choose our own and compare results. There are also some transformation choices for the census data that we will have to infer by comparing our data to the summary statistics provided (e.g., which denominator to use for percentages).
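Choosing a spatial weights scheme for GWR comes down to choosing a kernel and a bandwidth. The sketch below contrasts two common options, a fixed Gaussian kernel and an adaptive bisquare kernel. Which option the original ArcGIS run used is not stated here, so this is only an illustration; the function names, distances, and bandwidth values are hypothetical.

```python
import math

def gaussian_weight(d, bandwidth):
    """Fixed Gaussian kernel: weight decays smoothly with distance d."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)

def bisquare_weight(d, bandwidth):
    """Bisquare kernel: weight falls to zero at and beyond the bandwidth."""
    if d >= bandwidth:
        return 0.0
    return (1.0 - (d / bandwidth) ** 2) ** 2

def adaptive_bandwidth(distances, k):
    """Distance to the k-th nearest neighbour, used as a local bandwidth."""
    return sorted(distances)[k - 1]

# Hypothetical distances (arbitrary units) from one regression point
# to five other county subdivision centroids.
distances = [2.0, 5.0, 9.0, 14.0, 30.0]

# Fixed kernel: the same bandwidth everywhere.
fixed = [gaussian_weight(d, bandwidth=10.0) for d in distances]

# Adaptive kernel: the bandwidth stretches to reach the 4th nearest neighbour.
local_bw = adaptive_bandwidth(distances, k=4)
adaptive = [bisquare_weight(d, local_bw) for d in distances]

print([round(w, 3) for w in fixed])
print([round(w, 3) for w in adaptive])
```

A fixed bandwidth gives sparse rural areas few effective neighbours, while an adaptive bandwidth keeps the neighbour count roughly constant, so the two choices can produce visibly different local coefficients.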

Data transformations for Crime and Census data are provided in the following workflow:

Workflow

Results

We will attempt to reproduce the graphics, model outputs and summary statistics provided in the study.

Graphs / figures we hope to reproduce:

  • Figure 2
  • Figure 3
  • Figure 4
  • Figure 5. For this figure, we hope to classify the beta values into equally spaced bins in which values around 0 are neutral.
  • Figure 6
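One way to get equally spaced bins with a neutral class around zero is to make the breaks symmetric about zero and use an odd number of classes. The sketch below illustrates that idea in plain Python; the function name `symmetric_breaks`, the choice of five classes, and the coefficient values are assumptions for illustration, not results from the study.

```python
def symmetric_breaks(values, n_classes=5):
    """Equal-width class breaks symmetric about zero.

    With an odd n_classes the middle class straddles zero and can be
    drawn in a neutral colour.
    """
    if n_classes % 2 == 0:
        raise ValueError("n_classes must be odd so one class is centred on zero")
    limit = max(abs(v) for v in values)  # largest magnitude sets the range
    width = 2 * limit / n_classes
    return [-limit + i * width for i in range(n_classes + 1)]

# Hypothetical local coefficients (not results from the study).
betas = [-0.8, -0.1, 0.05, 0.3, 1.0]
breaks = symmetric_breaks(betas, n_classes=5)
print([round(b, 2) for b in breaks])
```

Breaks built this way can be passed to any mapping tool that accepts fixed class boundaries, paired with a diverging palette whose middle colour is neutral.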

Tables we hope to reproduce:

  • Table 1
  • Table 2
  • Table 3
  • Table 4

Discussion

There was no explicit research question. We will treat the results as exploratory and may fit a spatial lag model to expand on the research. We also hope to caveat the results with qualitative context and offer some criticisms of predicting crime in general. For each of the graphs and figures, we hope to get very similar results, but this will depend heavily on our ability to determine the spatial weights matrix and data transformations used by the original author.

Integrity Statement

The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.

This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:[10.17605/OSF.IO/W29MQ](https://doi.org/10.17605/OSF.IO/W29MQ)

References

Müller, Kirill. 2020. Here: A Simpler Way to Find Your Files. https://here.r-lib.org/.
Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.
———. 2025. sf: Simple Features for R. https://r-spatial.github.io/sf/.
Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Tennekes, Martijn. 2018. “tmap: Thematic Maps in R.” Journal of Statistical Software 84 (6): 1–39. https://doi.org/10.18637/jss.v084.i06.
———. 2025. Tmap: Thematic Maps. https://github.com/r-tmap/tmap.
Walker, Kyle. 2024. Tigris: Load Census TIGER/Line Shapefiles. https://github.com/walkerke/tigris.
Walker, Kyle, and Matt Herman. 2025. Tidycensus: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames. https://walker-data.com/tidycensus/.
Wickham, Hadley. 2023. Tidyverse: Easily Install and Load the Tidyverse. https://tidyverse.tidyverse.org.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.