Chapter 4 Area-weighted re-aggregation

Suppose that we need to summarize population data from zones in the boundaries of the watersheds. We only have one commonality between the two layers: geographic location. To make matters more complicated, the boundaries are not neatly nested within one another. In order to move data from the zones into the watersheds, we must assume that the population (and variables thereof) are equally distributed across the space of the zones.

4.1 Calculate source areas

zones <- mutate(zones, s_area = st_area(geom))

4.2 Disaggregate

Let’s use the st_intersection operation to overlay zones and watersheds.

fragments <- st_intersection(zones, watersheds)

Resulting in the table…

fragments |> st_drop_geometry() |> knitr::kable()

	zone	pop	minority	poverty	s_area	name
1	NW	5300	4300	2000	875000 [m^2]	Moshassuck
4	SW	4600	3600	1300	1000000 [m^2]	Moshassuck
3	SE	4400	3200	1400	1000000 [m^2]	Narragansett
4.1	SW	4600	3600	1300	1000000 [m^2]	Narragansett
1.1	NW	5300	4300	2000	875000 [m^2]	Blackstone
2	NE	4800	3800	1350	1000000 [m^2]	Blackstone
3.1	SE	4400	3200	1400	1000000 [m^2]	Blackstone
4.2	SW	4600	3600	1300	1000000 [m^2]	Blackstone

and the map…

pointgrid_map +
  fragments |> rownames_to_column(var = "id") |> 
    tm_shape() +
    tm_polygons(fill_alpha = 0.5) +
    tm_text("id")

Now, calculate area-weighted estimates. We use an as.numeric function to discard the units data of m^2.

fragments <- mutate(fragments,
                    f_area = st_area(geom),
                    aw = as.numeric(f_area / s_area),
                    aw_pop = aw * pop,
                    aw_minor = aw * minority,
                    aw_pov = aw * poverty)

fragments |> st_drop_geometry() |> 
  knitr::kable(digits = 2) |>
  kable_styling() |> 
  scroll_box(width = "100%", box_css = "border: 0px;")

	zone	pop	minority	poverty	s_area	name	f_area	aw	aw_pop	aw_minor	aw_pov
1	NW	5300	4300	2000	875000 [m^2]	Moshassuck	281250 [m^2]	0.32	1703.57	1382.14	642.86
4	SW	4600	3600	1300	1000000 [m^2]	Moshassuck	750000 [m^2]	0.75	3450.00	2700.00	975.00
3	SE	4400	3200	1400	1000000 [m^2]	Narragansett	468750 [m^2]	0.47	2062.50	1500.00	656.25
4.1	SW	4600	3600	1300	1000000 [m^2]	Narragansett	125000 [m^2]	0.12	575.00	450.00	162.50
1.1	NW	5300	4300	2000	875000 [m^2]	Blackstone	593750 [m^2]	0.68	3596.43	2917.86	1357.14
2	NE	4800	3800	1350	1000000 [m^2]	Blackstone	1000000 [m^2]	1.00	4800.00	3800.00	1350.00
3.1	SE	4400	3200	1400	1000000 [m^2]	Blackstone	531250 [m^2]	0.53	2337.50	1700.00	743.75
4.2	SW	4600	3600	1300	1000000 [m^2]	Blackstone	125000 [m^2]	0.12	575.00	450.00	162.50

4.3 Reaggregate

Now it’s time to re-aggregate into watersheds. In R, we accomplish this by grouping with group_by and summarizing with summarize. We dissolve the polygons together with st_union. Based on this, it may note be required to explicitly specify geom = st_union(geom) https://bookdown.org/robinlovelace/geocompr/geometric-operations.html

watersheds_pop <- fragments |> group_by(name) |> 
  summarize(
    sumpop = sum(aw_pop),
    summinor = sum(aw_minor),
    sumpov = sum(aw_pov),
    geom = st_union(geom)
  )

Let’s calculate the percentage minority and percentage below poverty and view the resulting attribute table.

watersheds_pop <- mutate(watersheds_pop,
                         pctminority = round(summinor / sumpop * 100, 1),
                         pctpov = round(sumpov / sumpop * 100, 1))

watersheds_pop |> st_drop_geometry() |> knitr::kable(digits = 2)

name	sumpop	summinor	sumpov	pctminority	pctpov
Blackstone	11308.93	8867.86	3613.39	78.4	32.0
Moshassuck	5153.57	4082.14	1617.86	79.2	31.4
Narragansett	2637.50	1950.00	818.75	73.9	31.0

And map the percentage in poverty in watersheds.

pointgrid_map +
  watersheds_pop |> 
    tm_shape() +
    tm_polygons(fill = "pctminority",
                fill_alpha = 0.5,
                fill.legend = tm_legend_hide()) +
    tm_text("pctminority")

Another approach could have summarized the attribute data without the geometry data, and then joined the summaries back to the watersheds by attribute.

First, drop the geometry data and summarize the attribute data.

watersheds_sum <- fragments |> 
  st_drop_geometry() |> 
  group_by(name) |> 
  summarize(
    sumpop = sum(aw_pop),
    summinor = sum(aw_minor),
    sumpov = sum(aw_pov)
  )

watersheds_sum |> knitr::kable(digits = 1)

name	sumpop	summinor	sumpov
Blackstone	11308.9	8867.9	3613.4
Moshassuck	5153.6	4082.1	1617.9
Narragansett	2637.5	1950.0	818.8

Next, join the data to watersheds by attribute.

watersheds_pop <- watersheds |> 
  left_join(watersheds_sum, by = join_by(name)) |> 
  mutate(pctminority = round(summinor / sumpop * 100, 1),
         pctpov = round(sumpov / sumpop * 100, 1))

watersheds_pop |> st_drop_geometry() |> knitr::kable(digits = 1)

name	sumpop	summinor	sumpov	pctminority	pctpov
Moshassuck	5153.6	4082.1	1617.9	79.2	31.4
Narragansett	2637.5	1950.0	818.8	73.9	31.0
Blackstone	11308.9	8867.9	3613.4	78.4	32.0

And map the percentage of people in poverty in watersheds.

pointgrid_map +
  watersheds_pop |> 
    tm_shape() +
    tm_polygons(fill = "pctminority",
                fill_alpha = 0.5,
                fill.legend = tm_legend_hide()) +
    tm_text("pctminority")