
Analyzing origin and destination patterns

Last updated 12 months ago

This tutorial uses the H3 Spatial Index to visualize origin and destination trip patterns in a clear, digestible way. We'll transform the origin and destination locations of 2.5 million trips into a single H3 frequency grid, making it easy to compare the spatial distribution of pick up and drop off locations. This kind of analysis is crucial for resource planning in any industry where you expect your origins to have a different geography from your destinations.

You can use any table that contains origin and destination data. We'll be using the NYC Taxi Rides demo table, which you can find in the CARTO Data Warehouse (BigQuery) or in the CARTO Academy Data listing on the Snowflake Marketplace.


Step-by-Step tutorial

Creating a Workflow

  1. In the CARTO Workspace, head to Workflows and Create a Workflow, using the connection where your data is stored.

  2. Under Sources, locate NYC Taxi Rides (or whichever input dataset you're using) and drag it onto the workflow canvas.

#1 Filtering trips to a specific time period

When running origin-destination analysis, it's important to think about not only spatial but temporal patterns. We can expect to see different trends at different times of the day and we don't want to miss any nuances here.

  1. Connect NYC Taxi Rides to a Simple Filter component.

  2. Set the filter condition to PART_OF_DAY = morning (see screenshot above). You can pick any time period you'd like. To preview all of the available time periods, select the NYC Taxi Rides source, open the Data preview and view Column Stats (histogram icon) for the PART_OF_DAY variable.
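
As a rough illustration of what the filter and the Column Stats preview do, here is a minimal Python sketch on a few made-up rows. Only the PART_OF_DAY column name and the morning value come from this tutorial; the trip_id column, the other time period values and the helper name are assumptions for illustration, and the second returned list is an assumed complement of the match output.

```python
from collections import Counter

# Hypothetical sample rows; the real table has many more rows and columns.
trips = [
    {"trip_id": 1, "PART_OF_DAY": "morning"},
    {"trip_id": 2, "PART_OF_DAY": "evening"},
    {"trip_id": 3, "PART_OF_DAY": "morning"},
    {"trip_id": 4, "PART_OF_DAY": "night"},
]


def split_by_condition(rows, column, value):
    """Split rows into (match, unmatch) lists based on an equality condition."""
    match = [r for r in rows if r[column] == value]
    unmatch = [r for r in rows if r[column] != value]
    return match, unmatch


# Rough equivalent of Column Stats: which time periods exist, and how often.
period_counts = Counter(r["PART_OF_DAY"] for r in trips)

# Rough equivalent of the filter condition PART_OF_DAY = morning.
morning_trips, other_trips = split_by_condition(trips, "PART_OF_DAY", "morning")
```

Only the matching rows (morning_trips here) are carried forward into the next section.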

Note we've started grouping sections of the workflow together with annotation boxes to help keep things organized.

#2 Convert origins and destinations to a H3 frequency grid

  1. Connect the match output of the Simple Filter to a H3 from GeoPoint component and change the points column to PICKUP_GEOM, which will create a H3 cell for each input geometry. We're looking for junction and street level insights here, so change the resolution to 11.

  2. Connect the output of this to a Group by component. Set the Group by column to H3 and the aggregation column to H3 (COUNT). This will count the number of duplicate H3 IDs, i.e. the number of points which fall within each cell.

  3. Repeat steps 1 & 2, this time setting the initial points column to DROPOFF_GEOM.

  4. Add a Join component and connect the results of your two Group by components to this. Set the join type to Full Outer; this will retain all cells, even where they don't match (for instance, we will retain a H3 cell that has pickups but no dropoffs).
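
The four steps above can be sketched in plain Python to show the logic of the frequency grid and the full outer join. This is a sketch under stated assumptions, not how the Workflows components are implemented: to_cell is a stand-in square grid, not real H3 indexing, and the sample coordinates are made up. The output column names follow the ones used later in this tutorial. If the h3 Python package (v4) is available, its latlng_to_cell function could be passed as the indexer instead.

```python
from collections import Counter


def to_cell(lat, lng, resolution):
    """Stand-in for an H3 indexing function (NOT real H3): snaps a point to a
    square grid whose cells get smaller as the resolution increases."""
    size = 1.0 / (2 ** resolution)
    return f"{resolution}:{int(lat // size)}:{int(lng // size)}"


def frequency_grid(points, resolution=11, indexer=to_cell):
    """Count how many points fall in each cell (index each point, then group)."""
    return Counter(indexer(lat, lng, resolution) for lat, lng in points)


def full_outer_join(pickups, dropoffs):
    """Keep every cell found on either side; the missing side becomes None."""
    rows = []
    for cell in sorted(set(pickups) | set(dropoffs)):
        rows.append({
            "H3": cell if cell in pickups else None,
            "H3_COUNT": pickups.get(cell),
            "H3_JOINED": cell if cell in dropoffs else None,
            "H3_COUNT_JOINED": dropoffs.get(cell),
        })
    return rows


# Hypothetical (lat, lng) samples standing in for PICKUP_GEOM and DROPOFF_GEOM.
pickup_points = [(40.78, -73.96), (40.78, -73.96), (40.75, -73.98)]
dropoff_points = [(40.75, -73.98), (40.71, -74.01)]

joined = full_outer_join(frequency_grid(pickup_points),
                         frequency_grid(dropoff_points))
```

Cells that appear on only one side keep None in the other side's columns, which is why the joined table needs some tidying before it can be used.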

Now we have a H3 grid with count columns for the number of pick ups and drop offs, but if you look in the data preview, things are getting a little messy - so let's clean them up!

#3 Data cleaning

  1. Create Column: at the moment our H3 index IDs are contained in two separate columns: H3 and H3_JOINED. We want a single column containing all IDs, so let's create a column called H3_FULL and use the following CASE statement to combine the two: CASE WHEN H3 IS NULL THEN H3_JOINED ELSE H3 END.

  2. Drop Columns: now we can drop both H3 and H3_JOINED to avoid any confusion.

  3. Rename Column: now, let's rename H3_COUNT as pickup_count and H3_COUNT_JOINED as dropoff_count to keep things clear.

Now, you should have a table with the fields H3_FULL, pickup_count and dropoff_count, just like in the preview above!
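
The three cleaning steps can be sketched in Python as a single row transformation. This is an illustrative sketch only: the cell IDs below are hypothetical placeholders rather than real H3 indexes, and clean_row is a helper name introduced here. The column names and the CASE logic are the ones from the steps above.

```python
def clean_row(row):
    """Create H3_FULL, drop the two source ID columns and rename the counts."""
    # Same logic as: CASE WHEN H3 IS NULL THEN H3_JOINED ELSE H3 END
    h3_full = row["H3_JOINED"] if row["H3"] is None else row["H3"]
    # Returning only the new keys drops H3 and H3_JOINED.
    return {
        "H3_FULL": h3_full,
        "pickup_count": row["H3_COUNT"],
        "dropoff_count": row["H3_COUNT_JOINED"],
    }


# Hypothetical output of the full outer join (placeholder cell IDs).
joined = [
    {"H3": "cell_a", "H3_COUNT": 2, "H3_JOINED": None, "H3_COUNT_JOINED": None},
    {"H3": "cell_b", "H3_COUNT": 1, "H3_JOINED": "cell_b", "H3_COUNT_JOINED": 1},
    {"H3": None, "H3_COUNT": None, "H3_JOINED": "cell_c", "H3_COUNT_JOINED": 1},
]

cleaned = [clean_row(r) for r in joined]
```

Note that a cell with pickups but no dropoffs (or the reverse) still carries an empty count on the missing side after this step.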

#4 Normalize & Compare

Now, we can compare the spatial distribution of pickups and dropoffs:

  1. Connect two consecutive Normalize components, first normalizing pickup_count, and then dropoff_count. This will convert the raw counts into scores from 0 to 1, making a relative comparison possible.

  2. Add a Create Column component, and calculate the difference between the two normalized fields (pickup_count_norm - dropoff_count_norm). The result of this will be a score ranging from -1 (relatively more dropoffs) to 1 (relatively more pickups).
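
To make the arithmetic concrete, here is a small Python sketch of the two steps above. The exact formula used by the Normalize component is not stated in this tutorial, so min-max scaling is an assumption here. The sample counts, the placeholder cell IDs, the difference column name and the choice to treat missing counts as 0 are also assumptions for illustration.

```python
def min_max_normalize(values):
    """Scale values to the 0-1 range (assumed formula: (v - min) / (max - min))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical cleaned table, with missing counts already set to 0.
rows = [
    {"H3_FULL": "cell_a", "pickup_count": 10, "dropoff_count": 0},
    {"H3_FULL": "cell_b", "pickup_count": 0, "dropoff_count": 20},
    {"H3_FULL": "cell_c", "pickup_count": 5, "dropoff_count": 10},
]

pickup_norm = min_max_normalize([r["pickup_count"] for r in rows])
dropoff_norm = min_max_normalize([r["dropoff_count"] for r in rows])

for row, p, d in zip(rows, pickup_norm, dropoff_norm):
    row["pickup_count_norm"] = p
    row["dropoff_count_norm"] = d
    # Positive: relatively more pickups; negative: relatively more dropoffs.
    row["difference"] = p - d
```

In this made-up sample, cell_a scores 1.0 (pickups only), cell_b scores -1.0 (dropoffs only) and cell_c scores 0.0, because it sits at the midpoint of both ranges.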

You can see the full workflow below.

Check out the results below!

Do you notice any patterns here? We can see more drop offs in the business district of Midtown - particularly along Park Avenue - and more pick ups in the more residential areas such as the Upper East and West Side, clearly reflecting the morning commute!

The 2.5 million trips, totalling 5 million origin and destination geometries, are a huge amount of data to work with, which is why we converted them to a Spatial Index to make the analysis more manageable. This follows the straightforward approach from the Convert points to a Spatial Index tutorial.
