Time series clustering: Identifying areas with similar traffic accident patterns

Last updated 10 months ago

Spatio-temporal analysis is crucial in extracting meaningful insights from data that possess both spatial and temporal components. By incorporating spatial information, such as geographic coordinates, with temporal data, such as timestamps, spatio-temporal analysis unveils dynamic behaviors and dependencies across various domains. This applies to different industries and use cases like car sharing and micromobility planning, urban planning, transportation optimization, and more.

In this example, we will perform spatio-temporal analysis to identify areas with similar traffic accident patterns over time, using the location and time of accidents in London in 2021 and 2022, provided by Transport for London. This tutorial builds upon this previous one, where we explained how to use the spacetime Getis-Ord functionality to identify traffic accident hotspots.

Data

The source data covers two years of collisions, aggregated weekly into an H3 grid, with the number of collisions counted per cell and week. The data is available at cartobq.docs.spacetime_collisions_weekly_h3 and can be explored in the map below.

Spacetime Getis-Ord

We start by performing a spacetime hotspot analysis to better understand our data. We can use the following call to the Analytics Toolbox to run the procedure:

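-- Analytics Toolbox in the carto-un project (US multi-region)
CALL `carto-un`.carto.GETIS_ORD_SPACETIME_H3_TABLE(
 'cartobq.docs.spacetime_collisions_weekly_h3',     -- input table
 'cartobq.docs.spacetime_collisions_weekly_h3_gi',  -- output table
 'h3',            -- column with the H3 index
 'week',          -- column with the date
 'n_collisions',  -- column with the value to analyze
 3,               -- size of the spatial neighborhood
 'WEEK',          -- time unit
 1,               -- temporal bandwidth, in time units
 'gaussian',      -- spatial kernel
 'gaussian'       -- temporal kernel
);
-- Analytics Toolbox in the carto-un-eu project (EU multi-region)
CALL `carto-un-eu`.carto.GETIS_ORD_SPACETIME_H3_TABLE(
 'cartobq.docs.spacetime_collisions_weekly_h3',
 'cartobq.docs.spacetime_collisions_weekly_h3_gi',
 'h3',
 'week',
 'n_collisions',
 3,
 'WEEK',
 1,
 'gaussian',
 'gaussian'
);
-- Manual installation of the Analytics Toolbox
CALL carto.GETIS_ORD_SPACETIME_H3_TABLE(
 'cartobq.docs.spacetime_collisions_weekly_h3',
 'cartobq.docs.spacetime_collisions_weekly_h3_gi',
 'h3',
 'week',
 'n_collisions',
 3,
 'WEEK',
 1,
 'gaussian',
 'gaussian'
);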

By performing this analysis, we can check how different parts of the city become “hotter” or “colder” as time progresses.
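
To make "hotter" and "colder" concrete: the Getis-Ord Gi* statistic is a z-score that compares the weighted sum of values in a cell's neighborhood with what the global mean would predict. The Python sketch below computes it over a hypothetical one-dimensional row of cells with uniform weights. It is only an illustration under those assumptions, not the Analytics Toolbox implementation, which works on H3 neighborhoods and time steps using the kernels passed in the call above. The function names, the toy counts and the window size are all made up for the example.

```python
import math


def uniform_window(i, n, radius=1):
    """Weight 1 for cells within `radius` of cell i (including i itself), else 0."""
    return [1.0 if abs(i - j) <= radius else 0.0 for j in range(n)]


def getis_ord_gi_star(values, weights_for=uniform_window):
    """Return the Gi* z-score of every cell in `values`."""
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the values (population form, as in Gi*).
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = weights_for(i, n)
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = std * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Toy collision counts: a run of high values in the middle of the row.
counts = [1, 1, 1, 10, 12, 11, 1, 1, 1, 1]
gi = getis_ord_gi_star(counts)
hottest = max(range(len(gi)), key=lambda i: gi[i])
```

Cells surrounded by high counts get a large positive score (a hot spot), while cells surrounded by low counts get a negative one (a cold spot).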

Finding time series clusters

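Once we have an initial understanding of the spacetime patterns of our data, we proceed to cluster H3 cells based on their temporal patterns. To do this, we use the TIME_SERIES_CLUSTERING procedure, which takes as input:

  • input: The query or fully qualified name of the table with the data

  • output_table: The fully qualified name of the output table

  • partitioning_column: Time series unique IDs, which in this case are the H3 indexes

  • ts_column: Name of the column with the timestamp of each observation

  • value_column: Name of the column with the value per ID and timestep

  • options: A JSON containing the advanced options for the procedure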

One of the advanced options is the time series clustering method. Currently, it features two basic approaches:

  • Value characteristic, which clusters the series based on the step-by-step distance between their values. The closer two signals are at each time step, the more similar the series are considered to be and the higher the chance that they are clustered together.

  • Profile characteristic, which clusters the series based on their dynamics over the time span provided. In this case, the higher the correlation between two series, the higher the chance that they are clustered together.
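
The difference between the two approaches can be illustrated with a small Python sketch. It assumes Euclidean distance for the value characteristic and one minus the Pearson correlation for the profile characteristic. These are common choices picked for illustration; the exact distances used inside the procedure are not stated here, and the function names and toy series are hypothetical.

```python
import math


def value_distance(a, b):
    """Step-by-step (Euclidean) distance between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def profile_distance(a, b):
    """Correlation-based distance: 0 when the two series move together."""
    return 1.0 - pearson(a, b)


base = [1.0, 2.0, 3.0, 2.0, 1.0]
same_shape = [x + 10.0 for x in base]     # same dynamics, very different level
same_level = [2.0, 2.0, 1.5, 2.0, 2.0]    # similar level, different dynamics

# By value, `base` is closer to `same_level`; by profile, to `same_shape`.
closer_by_value = value_distance(base, same_level) < value_distance(base, same_shape)
closer_by_profile = profile_distance(base, same_shape) < profile_distance(base, same_level)
```

In this toy case a value-based method would group `base` with `same_level`, while a profile-based method would group it with `same_shape`.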

Clustering the raw series can be tricky, since both methods are sensitive to noise in the series. However, because we already smoothed the signal with the spacetime Getis-Ord, we can cluster the cells based on the resulting temperature instead. We will only consider cells where at least 60% of the observations are reasonably significant, that is, have a p-value below 0.05.

-- Analytics Toolbox in the carto-un project (US multi-region)
CALL `carto-un`.carto.TIME_SERIES_CLUSTERING(
 '''
   SELECT * FROM `cartobq.docs.spacetime_collisions_weekly_h3_gi`
   QUALIFY PERCENTILE_CONT(p_value, 0.6) OVER (PARTITION BY index) < 0.05
 ''',
 'cartobq.docs.spacetime_collisions_weekly_h3_clusters',
 'index',
 'date',
 'gi',
 JSON '{ "method": "profile", "n_clusters": 4 }'
);
-- Analytics Toolbox in the carto-un-eu project (EU multi-region)
CALL `carto-un-eu`.carto.TIME_SERIES_CLUSTERING(
 '''
   SELECT * FROM `cartobq.docs.spacetime_collisions_weekly_h3_gi`
   QUALIFY PERCENTILE_CONT(p_value, 0.6) OVER (PARTITION BY index) < 0.05
 ''',
 'cartobq.docs.spacetime_collisions_weekly_h3_clusters',
 'index',
 'date',
 'gi',
 JSON '{ "method": "profile", "n_clusters": 4 }'
);
-- Manual installation of the Analytics Toolbox
CALL carto.TIME_SERIES_CLUSTERING(
 '''
   SELECT * FROM `cartobq.docs.spacetime_collisions_weekly_h3_gi`
   QUALIFY PERCENTILE_CONT(p_value, 0.6) OVER (PARTITION BY index) < 0.05
 ''',
 'cartobq.docs.spacetime_collisions_weekly_h3_clusters',
 'index',
 'date',
 'gi',
 JSON '{ "method": "profile", "n_clusters": 4 }'
);
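
The QUALIFY clause in the query above keeps a cell only when the 60th percentile of its p-values is below 0.05, which means at least 60% of its observations are significant. The Python sketch below reproduces that logic on made-up data, assuming the linear interpolation that BigQuery's PERCENTILE_CONT applies. The cell names and p-values are hypothetical.

```python
def percentile_cont(values, q):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)                      # floor, since position >= 0
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical p-values per H3 cell, one per weekly time step.
p_values_by_cell = {
    "cell_a": [0.01, 0.02, 0.03, 0.04, 0.50],  # 4 of 5 observations significant
    "cell_b": [0.01, 0.20, 0.30, 0.40, 0.50],  # 1 of 5 observations significant
}

# Keep only the cells whose 60th percentile of p-values is below 0.05.
kept_cells = [
    cell
    for cell, p_values in p_values_by_cell.items()
    if percentile_cont(p_values, 0.6) < 0.05
]
```

Here only `cell_a` passes the filter, so only its time series would be passed on to the clustering step.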

Although clustering the Getis-Ord output instead of the raw counts may feel like an extra layer of indirection, it provides several advantages:

  • Since the signal has been temporally smoothed, there is less noise in the dynamics of the series.

  • Since the signal has been geographically smoothed, nearby cells are more likely to be clustered together.

This map shows the different clusters that are returned as a result:

We can immediately see the different dynamics in the widget:

  • Apart from cluster #3, which clearly groups the “colder” areas, the rest start 2021 with very similar accident counts.

  • However, from July 2021 onwards, cluster #2 accumulates clearly more collisions than clusters #1 and #4.

  • Even though #1 and #4 have similar levels, they differ at certain points, like September 2021 or January 2022.

These insights are a useful starting point for further analysis of the possible causes of these behaviors, and we were able to extract them at a single glance at the map. This method “collapsed” the results of the spacetime Getis-Ord into a space-only result, which makes the data easier to explore and understand.

For further detail on the spacetime Getis-Ord, take a look at the documentation and this tutorial.