Optimizing your data for spatial analysis

Last updated 8 months ago

Geospatial datasets are often larger than their non-geospatial counterparts, and geospatial operations can be slow or resource-intensive. That's no surprise: representing things and events on Earth and then computing their relationships is not an easy task.

With CARTO, you can run spatial analytics at scale, on millions or billions of data points, by combining the computational power of your data warehouse with our expertise and tools. And we'll try to make it easy for you!

In this guide we'll help you prepare your data so that it is optimized for spatial analysis with CARTO.

Benefits of using optimized data

Having clean, optimized data at the source (your data warehouse) will:

  • Improve the performance of all analyses, apps, and visualizations made with CARTO

  • Reduce the computing costs incurred in your data warehouse

General tips and rules

Before we start diving into the specific optimizations and tricks available in your data warehouse, there are some typical data optimization patterns that apply to all data warehouses:

Optimization rule #1 — Can you reduce the volume of data?

While CARTO tries to automatically optimize the amount of data requested, having a huge source table is always a bigger challenge than having a smaller one.

Sometimes we find ourselves trying to use a huge table called raw_data with 50 TB of data, only to then realize: I actually don't need all the data in this table!

  • If that's your case and the raw data is static, then it's a good idea to materialize the subset or aggregation that you need for your use case in a different (smaller) table.

  • If that's your case and the raw data changes constantly, then it might be a good idea to build a data pipeline that refreshes your (smaller) table. You can build it easily using CARTO Workflows.
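As a rough illustration of rule #1, here is a minimal sketch of materializing a smaller table from a large one. It uses Python's built-in sqlite3 module as a stand-in for your data warehouse; the table sales_by_store, the columns of raw_data, and the date filter are all hypothetical, and the exact SQL will differ per warehouse.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse here.
# The raw_data columns below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_data (store_id TEXT, day TEXT, lon REAL, lat REAL, sales REAL)"
)
conn.executemany(
    "INSERT INTO raw_data VALUES (?, ?, ?, ?, ?)",
    [
        ("A", "2024-01-01", -3.70, 40.42, 10.0),
        ("A", "2024-01-02", -3.70, 40.42, 15.0),
        ("B", "2024-01-01", 2.17, 41.39, 7.0),
        ("B", "2023-12-31", 2.17, 41.39, 99.0),  # outside the period we care about
    ],
)

# Materialize only the subset (recent rows) and the aggregation
# (one row per store) that the use case actually needs.
conn.execute(
    """
    CREATE TABLE sales_by_store AS
    SELECT store_id, lon, lat, SUM(sales) AS total_sales
    FROM raw_data
    WHERE day >= '2024-01-01'
    GROUP BY store_id, lon, lat
    """
)

summary = conn.execute(
    "SELECT store_id, total_sales FROM sales_by_store ORDER BY store_id"
).fetchall()
print(summary)
```

Maps and analyses can then read from the small table instead of scanning raw_data. If the raw data keeps changing, the same statement would be re-run on a schedule by whatever pipeline refreshes the table.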

Optimization rule #2 — Are you using the right spatial data type?

If you've read our previous guides, you already know CARTO supports multiple spatial data types.

Each data type has its own particularities when it comes to performance and optimization:

  • Points: points are great for representing specific locations, but dealing with millions or billions of points is typically a sub-optimal way of solving spatial challenges. Consider aggregating your points into spatial indexes using CARTO Workflows.

  • Polygons: polygons typically reflect meaningful areas in our analysis, but they quickly become expensive when there are too many of them, or when they are too small or too complex. Consider simplifying your polygons or using a higher-level aggregation to reduce the number of polygons. Both of these operations can be achieved with CARTO Workflows.

    • Polygons are also known to become invalid geometries.

    • Generally it is a good idea to avoid overlapping geometries.

  • Lines: lines are an important way of representing linear features such as highways and rivers, and are key to network analyses like route optimization. Like polygons, they can quickly become expensive and should be simplified where possible.

  • Spatial Indexes: spatial indexes currently offer the best performance and costs for visualization and analysis purposes ✨ If you're less familiar with spatial indexes or need a refresher, we have prepared a dedicated Introduction to Spatial Indexes.
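To make the idea of aggregating points into a spatial index more concrete, here is a minimal sketch in plain Python. It assigns each point to a quadkey cell (the standard Web Mercator tile addressing scheme) and counts points per cell. This is a simplified stand-in for illustration only: the function name, the sample coordinates, and the zoom level are assumptions, and in practice you would use the spatial index functions available in your data warehouse or in CARTO Workflows.

```python
import math
from collections import Counter


def lonlat_to_quadkey(lon: float, lat: float, zoom: int) -> str:
    """Return the quadkey of the Web Mercator tile that contains a point."""
    # Web Mercator is only defined up to about 85.05 degrees of latitude.
    lat = max(min(lat, 85.05112878), -85.05112878)
    n = 1 << zoom  # number of tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    sin_lat = math.sin(math.radians(lat))
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)
    # Clamp so that points on the eastern/southern edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    # Interleave the bits of x and y, most significant bit first.
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


# Hypothetical sample: two nearby points in Madrid and one in New York.
points = [(-3.7038, 40.4168), (-3.7030, 40.4170), (-74.0060, 40.7128)]

# One row per cell instead of one row per point.
counts = Counter(lonlat_to_quadkey(lon, lat, zoom=10) for lon, lat in points)
for cell, n_points in counts.items():
    print(cell, n_points)
```

The two Madrid points fall into the same cell, so three point rows collapse into two cell rows. With millions of points, the reduction is usually far larger.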

Data warehouse specific optimizations

The techniques to optimize your spatial data are slightly different for each data warehouse provider, so we've prepared specific guides for each of them. Check the ones that apply to you to learn more:

CARTO will automatically detect any missing optimization when you try to use data in Data Explorer or Builder. In most cases, we'll help you apply it automatically, in a new table or in that same table.

Optimizing your Google BigQuery data

  • Make sure your data is clustered by your geometry or spatial index column.

Optimizing your Snowflake data

  • If your data is points/polygons: make sure Search Optimization is enabled on your geometry column.

  • If your data is based on spatial indexes: make sure it is clustered by your spatial index column.

Optimizing your Amazon Redshift data

  • If your data is points/polygons: make sure the SRID is set to EPSG:4326.

  • If your data is based on spatial indexes: make sure you're using your spatial index column as the sort key.

Optimizing your Databricks data

  • Make sure your data uses your H3 column as the z-order.

Optimizing your PostgreSQL data

  • Make sure your data is indexed by your geometry or spatial index column.

  • If your data is points/polygons: make sure the SRID is set to EPSG:3857.

Optimizing your CARTO Data Warehouse data

  • Make sure your data is clustered by your geometry or spatial index column.
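The sketch below shows how these recommendations might translate into SQL for each warehouse. The Python helper only builds the statement text and does not connect to anything. The function name, the `_optimized` table suffix, and the index naming are hypothetical; the statements follow each warehouse's own SQL syntax as we understand it rather than anything CARTO prescribes, so check them against your warehouse's documentation before running them. The CARTO Data Warehouse is left out.

```python
from typing import List


def optimization_statements(
    warehouse: str, table: str, column: str, spatial_index: bool
) -> List[str]:
    """Build (but do not run) SQL that applies the optimization for a warehouse.

    `column` is the geometry column, or the spatial index column
    when `spatial_index` is True.
    """
    if warehouse == "bigquery":
        # Clustering is defined when the table is created, so write a clustered copy.
        return [
            f"CREATE OR REPLACE TABLE {table}_optimized "
            f"CLUSTER BY {column} AS SELECT * FROM {table}"
        ]
    if warehouse == "snowflake":
        if spatial_index:
            return [f"ALTER TABLE {table} CLUSTER BY ({column})"]
        return [f"ALTER TABLE {table} ADD SEARCH OPTIMIZATION ON GEO({column})"]
    if warehouse == "redshift":
        if spatial_index:
            return [f"ALTER TABLE {table} ALTER SORTKEY ({column})"]
        # ST_SetSRID only labels the geometry; it assumes the
        # coordinates are already longitude/latitude.
        return [f"UPDATE {table} SET {column} = ST_SetSRID({column}, 4326)"]
    if warehouse == "databricks":
        return [f"OPTIMIZE {table} ZORDER BY ({column})"]
    if warehouse == "postgresql":
        index_name = f"{table.replace('.', '_')}_{column}_idx"
        if spatial_index:
            return [f"CREATE INDEX {index_name} ON {table} ({column})"]
        return [
            # Reproject to EPSG:3857 first, then build a GiST index (PostGIS).
            f"ALTER TABLE {table} ALTER COLUMN {column} "
            f"TYPE geometry(Geometry, 3857) USING ST_Transform({column}, 3857)",
            f"CREATE INDEX {index_name} ON {table} USING GIST ({column})",
        ]
    raise ValueError(f"Unsupported warehouse: {warehouse}")


# Hypothetical table and column names, for illustration only.
for statement in optimization_statements("postgresql", "stores", "geom", False):
    print(statement + ";")
```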

How CARTO helps you apply these optimizations

Check our Data Explorer documentation for more information.

As you've seen through this guide, we try our best to automatically optimize the performance and the costs of all analyses, apps, and visualizations made using CARTO. We also provide tools like CARTO Workflows or our Data Explorer UI-assisted optimizations to help you succeed.