Optimizing your data for spatial analysis

It's not uncommon for geospatial datasets to be larger than their non-geospatial counterparts, and geospatial operations are sometimes slow or resource-demanding. That's not a surprise: representing things and events on Earth and then computing their relationships is not an easy task.

With CARTO, you can run spatial analytics at scale, combining the computational power of your data warehouse with our expertise and tools, for millions or billions of data points. And we'll try to make it easy for you!

In this guide we'll help you prepare your data so that it is optimized for spatial analysis with CARTO.

Benefits of using optimized data

Having clean, optimized data at the source (your data warehouse) will:

  • Improve the performance of all analyses, apps, and visualizations made with CARTO

  • Reduce the computing costs incurred in your data warehouse

General tips and rules

Before we start diving into the specific optimizations and tricks available in your data warehouse, there are some typical data optimization patterns that apply to all data warehouses:

Data warehouse specific optimizations

The techniques to optimize your spatial data are slightly different for each data warehouse provider, so we've prepared specific guides for each of them. Check the ones that apply to you to learn more:

CARTO will automatically detect any missing optimization when you try to use data in Data Explorer or Builder. In most cases, we'll help you apply it automatically, in a new table or in that same table.

Check our Data Explorer documentation for more information.

Optimizing your Google BigQuery data

  • Make sure your data is clustered by your geometry or spatial index column.
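
As a minimal sketch of the clustering tip above, the helper below builds a BigQuery statement that copies a table into a new one clustered by the chosen column. The function name, table names, and column name are hypothetical and used only for illustration; review the generated SQL before running it in your own project.

```python
def bigquery_cluster_sql(source_table: str, target_table: str, cluster_column: str) -> str:
    """Build a statement that recreates a table clustered by one column.

    cluster_column can be a GEOGRAPHY column or a spatial index column
    (for example an H3 or quadbin id).
    """
    return (
        f"CREATE OR REPLACE TABLE `{target_table}`\n"
        f"CLUSTER BY {cluster_column}\n"
        f"AS SELECT * FROM `{source_table}`;"
    )


# Hypothetical names, for illustration only.
bigquery_sql = bigquery_cluster_sql(
    source_table="my_project.my_dataset.stores",
    target_table="my_project.my_dataset.stores_clustered",
    cluster_column="geom",
)
print(bigquery_sql)
```

Writing to a new table keeps the original untouched, so you can compare query cost and speed before switching over.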

Optimizing your Snowflake data

  • If your data is points/polygons: make sure Search Optimization is enabled on your geometry column

  • If your data is based on spatial indexes: make sure it is clustered by your spatial index column.
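
The two Snowflake cases above could be sketched as follows. The helper picks between enabling Search Optimization on a geography column and clustering by a spatial index column. The function name, table names, and column names are hypothetical, and the sketch assumes the geometry column is of type GEOGRAPHY and that Search Optimization is available on your Snowflake account.

```python
def snowflake_optimize_sql(table: str, column: str, uses_spatial_index: bool) -> str:
    """Build the Snowflake statement matching the kind of spatial column."""
    if uses_spatial_index:
        # Spatial index data (e.g. H3 or quadbin ids): cluster by that column.
        return f"ALTER TABLE {table} CLUSTER BY ({column});"
    # Point/polygon data: enable geospatial Search Optimization on the column.
    return f"ALTER TABLE {table} ADD SEARCH OPTIMIZATION ON GEO({column});"


# Hypothetical names, for illustration only.
snowflake_geo_sql = snowflake_optimize_sql("MY_DB.PUBLIC.STORES", "GEOM", uses_spatial_index=False)
snowflake_h3_sql = snowflake_optimize_sql("MY_DB.PUBLIC.STORES_H3", "H3", uses_spatial_index=True)
print(snowflake_geo_sql)
print(snowflake_h3_sql)
```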

Optimizing your Amazon Redshift data

  • If your data is points/polygons: make sure the SRID is set to EPSG:4326

  • If your data is based on spatial indexes: make sure you're using your spatial index column as the sort key.
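
A rough sketch of both Redshift tips follows. For point/polygon data it only labels geometries whose SRID is unset (0) as EPSG:4326, which assumes the coordinates are already longitude/latitude. Data stored in another coordinate system would need reprojecting instead. The function name, table names, and column names are hypothetical.

```python
def redshift_optimize_sql(table: str, column: str, uses_spatial_index: bool) -> str:
    """Build the Redshift statement matching the kind of spatial column."""
    if uses_spatial_index:
        # Spatial index data: use the spatial index column as the sort key.
        return f"ALTER TABLE {table} ALTER SORTKEY ({column});"
    # Point/polygon data: tag geometries without an SRID as EPSG:4326.
    # Assumption: coordinates are already lon/lat; this does not reproject.
    return (
        f"UPDATE {table} SET {column} = ST_SetSRID({column}, 4326) "
        f"WHERE ST_SRID({column}) = 0;"
    )


# Hypothetical names, for illustration only.
redshift_geo_sql = redshift_optimize_sql("public.stores", "geom", uses_spatial_index=False)
redshift_h3_sql = redshift_optimize_sql("public.stores_h3", "h3", uses_spatial_index=True)
print(redshift_geo_sql)
print(redshift_h3_sql)
```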

Optimizing your Databricks data

  • Make sure your data uses your H3 column as the z-order.
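
One way to apply the z-order tip, assuming the table is a Delta table, is an OPTIMIZE statement with ZORDER BY. The helper below only builds that statement. The function name, table name, and column name are hypothetical.

```python
def databricks_zorder_sql(table: str, h3_column: str) -> str:
    """Build a statement that compacts a Delta table and z-orders it by the H3 column."""
    return f"OPTIMIZE {table} ZORDER BY ({h3_column});"


# Hypothetical names, for illustration only.
databricks_sql = databricks_zorder_sql("my_catalog.my_schema.stores_h3", "h3")
print(databricks_sql)
```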

Optimizing your PostgreSQL data

  • Make sure your data is indexed by your geometry or spatial index column.

  • If your data is points/polygons: make sure the SRID is set to EPSG:3857
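
The PostgreSQL tips above might translate into statements like the ones this sketch builds. It assumes PostGIS is installed and, for point/polygon data, that the geometry column already carries a known SRID so that ST_Transform can reproject it to EPSG:3857. The function name, index naming scheme, table names, and column names are hypothetical.

```python
from typing import List


def postgres_optimize_sql(table: str, column: str, uses_spatial_index: bool) -> List[str]:
    """Build the PostgreSQL statements matching the kind of spatial column."""
    index_name = f"{table}_{column}_idx"  # hypothetical naming scheme
    if uses_spatial_index:
        # Spatial index data: a regular (B-tree) index on the index column.
        return [f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} ({column});"]
    return [
        # Reproject to EPSG:3857 and record the SRID in the column type.
        f"ALTER TABLE {table} ALTER COLUMN {column} "
        f"TYPE geometry(Geometry, 3857) USING ST_Transform({column}, 3857);",
        # Point/polygon data: a GiST index on the geometry column.
        f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} USING GIST ({column});",
    ]


# Hypothetical names, for illustration only.
postgres_geo_sql = postgres_optimize_sql("stores", "geom", uses_spatial_index=False)
postgres_h3_sql = postgres_optimize_sql("stores_h3", "h3", uses_spatial_index=True)
for statement in postgres_geo_sql + postgres_h3_sql:
    print(statement)
```

The reprojection runs before the index is created so the index is built once, on the final geometries.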

Optimizing your CARTO Data Warehouse data

  • Make sure your data is clustered by your geometry or spatial index column.

How CARTO helps you apply these optimizations

As you've seen throughout this guide, we try our best to automatically optimize the performance and cost of all analyses, apps, and visualizations made with CARTO. We also provide tools such as CARTO Workflows and the UI-assisted optimizations in Data Explorer to help you succeed.
