Understanding accident hotspots

In this tutorial, we’ll use accident data to explore which parts of Paris’ cycle network could benefit most from improved safety measures.

We’ll be using two tables for this: PARIS_BIKE_ACCIDENTS (to understand the accident hotspots) and PARIS_CYCLING_NETWORK (to link these to physical cycle infrastructure). Both can be found in the CARTO Academy Data listing on the Snowflake Marketplace. Users of other clouds can access the network data in the CARTO Data Warehouse demo tables dataset, and can download the accident location data here and load it into their own data warehouse.

If you'd like to replicate this analysis for another study area, many local government data hubs will publish similar data on accident locations. The sample cycle network data was sourced from OpenStreetMap; you can follow our guide for accessing data from this source here.

Step-by-Step tutorial

Creating a Workflow

  1. In the CARTO Workspace, head to Workflows and Create a Workflow, using the connection where your data is stored.

  2. Under Sources, locate Paris bike accidents & Paris Cycling Network and drag them onto the canvas.

#1 Convert accidents to a H3 grid & filter to a study area

In this step, we create a study area covering the city of Paris, convert both the study area and the bike accidents to H3 grids, and join the two so that only accidents inside the study area remain.

  1. Create a study area: you can use any boundary polygon to do this, or follow our approach of using the Draw Custom Features component to create a custom study area.

  2. Convert to a H3 grid using the H3 Polyfill component. We're interested in street-level insights, and so have used a resolution of 10.

  3. Convert the bike accidents to a second H3 grid using H3 from GeoPoint. This should also be a resolution of 10.

  4. Next, Join the two together. Using an inner join type effectively filters the accidents grid to our study area.
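
To make the filtering effect of the inner join concrete, here is a minimal Python sketch. It is not what Workflows runs internally; the cell IDs are hypothetical placeholders rather than real H3 indexes, and `inner_join_on_h3` is an illustrative helper, not a CARTO function.

```python
# Hypothetical cell IDs standing in for real H3 indexes at resolution 10.
study_area_cells = {"cell_a", "cell_b", "cell_c"}

# One record per accident, already tagged with the cell that contains it.
accidents = [
    {"accident_id": 1, "h3": "cell_a"},
    {"accident_id": 2, "h3": "cell_a"},
    {"accident_id": 3, "h3": "cell_c"},
    {"accident_id": 4, "h3": "cell_z"},  # falls outside the study area
]


def inner_join_on_h3(rows, cells):
    """Keep only rows whose H3 cell also appears in the study-area grid."""
    return [row for row in rows if row["h3"] in cells]


filtered = inner_join_on_h3(accidents, study_area_cells)
print([row["accident_id"] for row in filtered])  # [1, 2, 3]
```

Because both inputs share the same cell identifiers, the join is a simple equality match on the H3 column, with no geometry comparison needed.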

#2 Aggregate & calculate hotspots

  1. Next, use a Group by component to count the accidents in each cell: set H3 as the aggregation column with the type count. The result is a H3 accident frequency grid.

  2. We then use a Getis Ord component to calculate Gi* hotspots (i.e. statistically significant clusters of high data values) from this grid.

  3. Finally, a Simple Filter is used to retain only the H3 cells with a p-value of less than 0.1, meaning the retained hotspots are statistically significant at the 90% confidence level.
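
The three steps above can be sketched in plain Python to show what the statistic measures. This is an illustration only, under several assumptions: it uses the textbook Getis-Ord Gi* formula with binary weights, a row of nine integer-numbered cells in place of real H3 neighborhoods, and a two-sided normal p-value. The component's actual neighborhood size and weighting options are not covered in this tutorial, and `gi_star` is a hypothetical helper name.

```python
import math
from collections import Counter


def gi_star(counts, neighbors):
    """Return {cell: (z_score, p_value)} using binary weights.

    counts: {cell: value}; neighbors: {cell: iterable of neighboring cells}.
    Each cell is included in its own neighborhood (the "star" in Gi*).
    """
    n = len(counts)
    mean = sum(counts.values()) / n
    s = math.sqrt(sum(v * v for v in counts.values()) / n - mean ** 2)
    results = {}
    for cell in counts:
        hood = {cell, *(c for c in neighbors[cell] if c in counts)}
        w_sum = len(hood)  # binary weights: sum(w) and sum(w^2) coincide
        local_sum = sum(counts[c] for c in hood)
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z = (local_sum - mean * w_sum) / denom
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        results[cell] = (z, p)
    return results


# Toy data (made up): one accident in cells 0-5, a cluster in cells 6-8.
accident_cells = [0, 1, 2, 3, 4, 5] + [6] * 8 + [7] * 9 + [8] * 8
counts = Counter(accident_cells)  # the "Group by ... count" step
neighbors = {i: [i - 1, i + 1] for i in range(9)}  # stand-in for H3 adjacency

scores = gi_star(counts, neighbors)
# Equivalent of the Simple Filter: keep cells with p < 0.1.
hotspots = sorted(cell for cell, (z, p) in scores.items() if p < 0.1)
print(hotspots)
```

With this toy data, cells 7 and 8 pass the filter. A positive z-score marks a cluster of high values (a hotspot) and a negative one a cluster of low values, so it can be worth checking the sign of the score as well as the p-value.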

#3 Convert the cycle network to a H3 grid

To transform these hotspots into actionable insights, we’ll now work out which parts of the cycle network infrastructure fall within accident hotspots - and so could benefit from some targeted improvements. Rather than using a slower spatial join to do this, we’ll leverage H3 again.

  1. First, we run a 25-meter ST Buffer on the cycle network.

  2. Use H3 Polyfill (resolution 10) again to convert these to a H3 grid - at this stage, we’ll make sure to enable “Keep table input columns.”

#4 Filter network to accident hotspots

  1. Now use another inner Join to join our polyfilled cycle network H3 grid to the results of our hotspot analysis. The result? We’re left with only the H3 cells that are both in an accident hotspot and on the cycling network.

  2. Use one final Group by with the following parameters:

    1. Group by column: CARTODB_ID

    2. Aggregation: GI (AVG), HIGHWAY (ANY) & GEOM_JOINED (ANY). You can also use an ANY aggregation to retain any contextual information from the cycle links, such as highway name.
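
As a rough illustration of what this final Group by produces, the sketch below averages GI per cycle link and keeps one HIGHWAY value per link. The rows are made up for the example, the geometry column is omitted for brevity, and `summarise_links` is a hypothetical helper rather than part of Workflows.

```python
from collections import defaultdict

# Made-up rows after the inner join: one row per (cycle link, hotspot cell).
joined = [
    {"CARTODB_ID": 1, "HIGHWAY": "cycleway", "GI": 2.0},
    {"CARTODB_ID": 1, "HIGHWAY": "cycleway", "GI": 3.0},
    {"CARTODB_ID": 2, "HIGHWAY": "residential", "GI": 1.5},
]


def summarise_links(rows):
    """Group rows by CARTODB_ID, averaging GI and keeping any HIGHWAY value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["CARTODB_ID"]].append(row)
    summary = {}
    for link_id, members in groups.items():
        summary[link_id] = {
            "GI_AVG": sum(m["GI"] for m in members) / len(members),
            # ANY: every row of a link carries the same value, so take the first.
            "HIGHWAY": members[0]["HIGHWAY"],
        }
    return summary


link_summary = summarise_links(joined)
print(link_summary[1])  # {'GI_AVG': 2.5, 'HIGHWAY': 'cycleway'}
```

Grouping by the link ID collapses the several H3 cells that one buffered link covers back into a single row per link.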

Now we have a table consisting of cycle links which can be found in an accident hotspot, as well as their respective average GI* score which indicates the strength of the hotspot. You can see the full workflow below.

You can explore the results of this in the map below - we’ve also included the original accident locations and H3 hotspots.

By exploring the map, we can see that many of the cycle accident hotspots occur in the center of Paris. In particular, the residential street of Rue Malher in the Marais has the highest average GI* score of any part of the cycle network. With 21 accidents having occurred in the H3 cell which covers the Rue Malher & Rue de Rivoli junction (the highest count of any H3 cell in the city), this junction would appear to be an ideal candidate for some safety improvements.
