New police stations based on Chicago crime location clusters

In this example we use point clustering to decide where to locate five new police stations in Chicago, based on a sample of 5,000 crime locations.

Generating the clusters

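First, we compute clusters of crime locations using the ST_CLUSTERKMEANS function. The query is shown in three variants that differ only in how the function is qualified (`carto-un`.carto, `carto-un-eu`.carto, or carto); use the one that matches your Analytics Toolbox installation: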

-- Variant 1: functions qualified with the `carto-un` project
WITH clustered_points AS
(
    SELECT `carto-un`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, cluster_element.geom AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element;

-- Variant 2: functions qualified with the `carto-un-eu` project
WITH clustered_points AS
(
    SELECT `carto-un-eu`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, cluster_element.geom AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element;

-- Variant 3: no project prefix, so the carto dataset is resolved in the project running the query
WITH clustered_points AS
(
    SELECT carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, cluster_element.geom AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element;

This query aggregates the sampled crime locations into a single array of point geometries, partitions them into five clusters, and then unnests the result so that each row holds one point together with the identifier of the cluster it belongs to. In the resulting visualization, each cluster is shown in a different color.
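As a quick sanity check, it can help to see how many points fall into each cluster before placing anything on the map. The sketch below reuses the `carto-un` variant of the query above and only adds a COUNT(*) per cluster. The num_points alias is an illustrative name, not part of the Analytics Toolbox, and the prefix should be swapped for the one that matches your installation.

```sql
-- Count how many sampled crime locations were assigned to each cluster.
WITH clustered_points AS
(
    SELECT `carto-un`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT
    cluster_element.cluster,
    COUNT(*) AS num_points  -- illustrative alias
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element
GROUP BY cluster_element.cluster
ORDER BY num_points DESC;
```

A cluster with far fewer points than the others may be a hint that a different number of clusters is worth trying.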

Calculating the clusters' centers

Once the sample points are split into clusters, we can derive other geometries from each cluster, such as its center, envelope, or concave/convex hull. In this example we are interested in the center of each cluster, since that is where we are going to place the police stations. The Analytics Toolbox offers several functions for this task: ST_CENTERMEAN, ST_CENTERMEDIAN and ST_CENTEROFMASS. Here we use ST_CENTERMEDIAN: the points of each cluster are first merged into a single geography with ST_UNION_AGG, and that geography is then passed to ST_CENTERMEDIAN to obtain the location of the new police station:

-- Variant 1: functions qualified with the `carto-un` project
WITH clustered_points AS
(
    SELECT `carto-un`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, `carto-un`.carto.ST_CENTERMEDIAN(ST_UNION_AGG(cluster_element.geom)) AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element
GROUP BY cluster_element.cluster;

-- Variant 2: functions qualified with the `carto-un-eu` project
WITH clustered_points AS
(
    SELECT `carto-un-eu`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, `carto-un-eu`.carto.ST_CENTERMEDIAN(ST_UNION_AGG(cluster_element.geom)) AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element
GROUP BY cluster_element.cluster;

-- Variant 3: no project prefix, so the carto dataset is resolved in the project running the query
WITH clustered_points AS
(
    SELECT carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
)
SELECT cluster_element.cluster, carto.ST_CENTERMEDIAN(ST_UNION_AGG(cluster_element.geom)) AS geom
FROM clustered_points, UNNEST(cluster_arr) AS cluster_element
GROUP BY cluster_element.cluster;

In the resulting visualization, the bigger dots represent the police stations we have decided to open based on our analysis.
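One way to judge the proposed locations is to measure how far the sampled crimes are from the station of their own cluster. The following is a minimal sketch, assuming the `carto-un` variant of the functions. It relies on BigQuery's native ST_DISTANCE, which returns meters for GEOGRAPHY values, and the names stations, cluster_id, station_geom, cluster_points, pt, num_points, avg_distance_m and max_distance_m are illustrative aliases rather than anything defined by the Analytics Toolbox.

```sql
WITH clustered_points AS
(
    SELECT `carto-un`.carto.ST_CLUSTERKMEANS(ARRAY_AGG(ST_GEOGPOINT(longitude, latitude)), 5) AS cluster_arr
    FROM cartobq.docs.chicago_crime_sample
),
stations AS
(
    -- One row per cluster: the proposed station plus the points assigned to it.
    SELECT
        cluster_element.cluster AS cluster_id,
        `carto-un`.carto.ST_CENTERMEDIAN(ST_UNION_AGG(cluster_element.geom)) AS station_geom,
        ARRAY_AGG(cluster_element.geom) AS cluster_points
    FROM clustered_points, UNNEST(cluster_arr) AS cluster_element
    GROUP BY cluster_element.cluster
)
SELECT
    cluster_id,
    ARRAY_LENGTH(cluster_points) AS num_points,
    -- Average and maximum distance (in meters) from each point to its cluster's station.
    (SELECT AVG(ST_DISTANCE(pt, station_geom)) FROM UNNEST(cluster_points) AS pt) AS avg_distance_m,
    (SELECT MAX(ST_DISTANCE(pt, station_geom)) FROM UNNEST(cluster_points) AS pt) AS max_distance_m
FROM stations
ORDER BY cluster_id;
```

The clustering result is referenced only once here, so the station and the distances are computed from the same cluster assignment. Replacing ST_CENTERMEDIAN with one of the other center functions mentioned earlier and comparing the distance columns is one possible way to choose between them.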

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 960401.
