Space-time hotspot analysis: Identifying traffic accident hotspots

Spatiotemporal analysis plays a crucial role in extracting meaningful insights from data that possess both spatial and temporal components. By incorporating spatial information, such as geographic coordinates, with temporal data, such as timestamps, spatiotemporal analysis unveils dynamic behaviors and dependencies across various domains. This applies to different industries and use cases like car sharing and micro-mobility planning, urban planning, transportation optimization, and more.

In this example, we will perform space temporal analysis to identify traffic accident hotspots using the location and time of accidents in the city of Barcelona in 2018.

Step 1. Get the data ready

The dataset can be found in cartobq.docs.bcn_accidents_2018. For the purpose of this analysis, only the location and time of accidents are relevant. The table below shows an extraction of 10 of these accidents.

SELECT
 ST_GEOGFROMTEXT(geometry) AS geolocation,
 datetime
FROM
 `cartobq.docs.bcn_accidents_2018`
LIMIT
 10

In addition, the map below shows all accidents in the city of Barcelona in 2018.

On the left panel, the exact locations of the accidents are shown, while on the right one, the aggregated number of accidents per H3 cell at resolution 9 is displayed. At the bottom of the map, the number of accidents over time is shown, where a periodicity can be observed.

Step 2. Generate space-time bins

The next step is to bucketize the data in space bins and time intervals. For this example, a spatial index H3 at resolution 9 and weekly time intervals were chosen. The data is aggregated by H3 cell and week. This can be achieved with the following code:

CREATE TABLE project.dataset.bcn_accidents_count_grid AS
SELECT
 `carto-un`.carto.H3_FROMGEOGPOINT(ST_GEOGFROMTEXT(geometry), 9) as h3,
 DATETIME_TRUNC(CAST(datetime AS DATETIME), WEEK) AS datetime,
 COUNT(*) AS value
FROM
 `cartobq.docs.bcn_accidents_2018`
GROUP BY
 `carto-un`.carto.H3_FROMGEOGPOINT(ST_GEOGFROMTEXT(geometry), 9),
 DATETIME_TRUNC(CAST(datetime AS DATETIME), WEEK)

Step 3. Perform space-time hotspot analysis

Now let us use the space-time Getis Ord Gi* function to calculate the z-score for each H3 cell and week. For that purpose, we will use the GETIS_ORD_SPACETIME_H3_TABLE function of the Analytics Toolbox.

This function needs the following inputs:

  • A table with the h3 cells and their corresponding date-time and number of accidents (input).

  • A table's fully qualified name to save results (output_table).

  • The name of the column with the h3 indexes (index_col).

  • The name of the column with the date (date_col).

  • The name of the column with the values to use for the spacetime Getis-Ord computation (value_col).

  • The size of the k-ring (size). This is the spatial lag used for computing the corresponding Gi* statistic. In our case, we will take 1 ring around each h3 cell.

  • The time unit (time_freq). Equivalent to the h3 resolution for space aggregation time_freq is the time aggregation we will use. We select week as our unit of time aggregation.

  • The size of the time bandwidth (time_bw). This determines the neighboring weeks to be considered for calculating the corresponding Gi* statistic. For this example, we will take 2 weeks, i.e., for every week, we consider the two prior and the two posterior weeks as neighbors.

  • The kernel functions to be used for spatial (kernel) and time weights (kernel_time). For this example, we use uniform kernel for space and quartic kernel for time.

And returns a table with the following schema:

  • index: H3 spatial index at the provided resolution, same as input

  • date: date-time at the provided resolution, same as input

  • gi: the z-score

  • p-value: The two-tail p-value

Running the following, the Getis Ord Gi* for each H3 cell and week is returned.

CALL `carto-un`.carto.GETIS_ORD_SPACETIME_H3_TABLE(
    'project.dataset.bcn_accidents_count_grid',
    'project.dataset.bcn_accidents_count_grid_stgi',
    'h3',
    'datetime',
    'value',
    1,
    'WEEK',
    2,
    'uniform',
    'quartic'
);

Step 4. Getting cold and hotspots

We can now filter the previous table to keep only the rows whose p value is less or equal than 5% and gi positive. This results in keeping only the cells and weeks which are considered as hotspots. Respectively, for coldspots, we need to filter the p value to be less or equal than 5% and gi negative. Then we aggregate per H3 cells the count of weeks left.

SELECT index AS h3, COUNT(*) AS n_weeks 
FROM project.dataset.bcn_accidents_count_grid_stgi
WHERE p_value < 0.05 AND gi > 0
GROUP BY index

The output is shown in the following map, and the number of weeks per cell with a significantly high number of accidents is shown.

Last updated