Changing between types of geographical support

Last updated 3 months ago


The tutorials on this page will teach you how to transform different types of geographic support (such as points, lines and polygons) - and their variables - into polygons. By the end, you will understand how to enrich geographical data and how different geographical supports can impact spatial analysis. We'll be using functions from CARTO's Analytics Toolbox, and you'll be provided with both SQL and low-code approaches.

You will need...

Access to a target polygon table - this is the table we will be transforming data into. You will also need source line and point tables, which we will be transforming data from. These tables will need to have some sort of spatial overlap.

We will be using the following BigQuery tables, sourced from Madrid’s Open Data Portal. You will need either a Google BigQuery connection or to use the CARTO Data Warehouse to use these specific tables.

  • cartobq.docs.madrid_districts: District boundaries in Madrid.

  • cartobq.docs.madrid_bike_parkings: Locations of public bicycle parking.

  • cartobq.docs.madrid_bike_all_infrastructure: Bicycle-friendly infrastructure (bike lanes, shared lanes, and quiet streets).

  • cartobq.docs.madrid_bike_parkings_5min: 5-minute walking isolines around bike parking locations.

Before we start: understanding MAUP, intensive & extensive properties

When aggregating spatial data, it is important to be aware of the Modifiable Areal Unit Problem (MAUP). MAUP occurs when spatial data is grouped into different geographical units, which can lead to misleading interpretations. This issue arises because variation in the size and shape of the geographical areas affects the aggregation results.

One of the ways that spatial analysts overcome MAUP is by converting data to a regular grid, including Spatial Indexes like H3 and Quadbin. You can see the difference in the maps below. Learn more about the benefits of this approach here, or get started with our Create and enrich an index tutorial.

To better understand MAUP, we distinguish between two types of properties:

  • Extensive properties: These typically increase as the size of an area increases. This could include population, total bike parking spots or total road length.

  • Intensive properties: These are independent of area size and are often derived by normalizing extensive properties. Examples include population density, bike parking density or road length per capita.

You can see the difference between these two types of properties in the maps below, the first of which shows the extensive bike parking count, and the second of which shows the intensive bike parking density.

When transforming numeric variables between different types of geographic support, it's important to be aware of whether you are working with an extensive or intensive variable, as this will impact the type of aggregation you do. For instance, if you wanted to calculate the total population in a county based on census tracts, you would want to sum this extensive property. If you wanted to calculate the population density, you would want to average this intensive property.
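As a minimal sketch of this distinction, the BigQuery query below aggregates three made-up census tracts into a single county. The tract values and column names are hypothetical and only illustrate the arithmetic. The extensive population is summed, while the intensive density is shown two ways: as a plain average of the tract densities, and as an area-weighted average recomputed from the summed totals, which is one common refinement when tracts differ in size.

```sql
-- Hypothetical tracts: names, populations and areas are invented for illustration.
WITH tracts AS (
  SELECT * FROM UNNEST([
    STRUCT('tract_a' AS tract_id, 1000 AS population, 2.0 AS area_km2),
    STRUCT('tract_b', 3000, 1.0),
    STRUCT('tract_c', 2000, 5.0)
  ])
)
SELECT
  -- Extensive property: totals add up across units.
  SUM(population) AS county_population,                    -- 6000
  -- Intensive property, plain average of the tract densities (500, 3000, 400).
  AVG(population / area_km2) AS simple_avg_density,        -- 1300.0
  -- Intensive property, area-weighted: total population over total area.
  SUM(population) / SUM(area_km2) AS area_weighted_density -- 750.0
FROM tracts;
```

The two density figures differ because the small, dense tract dominates the plain average, which is the kind of effect MAUP describes.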


Points to polygons

Time needed: < 5 minutes

For our example, we'll be counting the number of bike parking locations in each district. We'll make use of the ENRICH_POLYGONS function using count as the aggregation function. This will create a new column in the destination table called id_count with the total number.

Prior to running the enrichment, we'll also need to generate a row number so that we have a numeric variable to aggregate.

Prefer to use SQL?
DROP TABLE IF EXISTS `cartobq.docs.changing_geo_points_to_polygons`;

CALL `carto-un`.carto.ENRICH_POLYGONS(
  'SELECT id, name, geom FROM `cartobq.docs.madrid_districts`',
  'geom',
  'SELECT id, geom FROM `cartobq.docs.madrid_bike_parkings`',
  'geom',
  [('id', 'count')],
  ['`cartobq.docs.changing_geo_points_to_polygons`']
);
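
For comparison, a rough plain-SQL equivalent of this count is sketched below using a spatial join with ST_CONTAINS. It assumes the same tables and columns as the call above and is not how ENRICH_POLYGONS is implemented; the id_count name simply mirrors the output column described earlier. Because it uses an inner join, districts without any bike parking are omitted rather than reported as zero.

```sql
-- Sketch only: count bike parking points per district with a spatial join.
SELECT
  d.id,
  d.name,
  ANY_VALUE(d.geom) AS geom,  -- GEOGRAPHY cannot be a GROUP BY key, so pick it per group
  COUNT(p.id) AS id_count
FROM `cartobq.docs.madrid_districts` AS d
JOIN `cartobq.docs.madrid_bike_parkings` AS p
  ON ST_CONTAINS(d.geom, p.geom)  -- keep points that fall inside the district
GROUP BY d.id, d.name
ORDER BY id_count DESC;
```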

Explore the results 👇


Lines to polygons

Time needed: < 5 minutes

Next, we'll be transforming lines to polygons - but still using the ENRICH_POLYGONS function. For our example, we want to calculate the length of cycling infrastructure within each district.

In the Workflow below, we will aggregate the lane_value variable with sum as the aggregation function (but you could similarly run other aggregation types such as count, avg, min and max). This ensures that the lane values are proportionally assigned based on their intersection length with the district boundaries (rather than the entire length of each line). The sums of all these proportional lengths will be stored in the lane_value_sum column in the destination table.

Prefer to use SQL?
DROP TABLE IF EXISTS `cartobq.docs.changing_geo_lines_to_polygons`;

CALL `carto-un`.carto.ENRICH_POLYGONS(
  'SELECT id, name, geom FROM `cartobq.docs.madrid_districts`',
  'geom',
  'SELECT geom, lane_value FROM `cartobq.docs.madrid_bike_all_infrastructure`',
  'geom',
  [('lane_value', 'sum')],
  ['`cartobq.docs.changing_geo_lines_to_polygons`']
);
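
The proportional assignment described above can be approximated in plain BigQuery SQL. This is a sketch of the idea, not the Analytics Toolbox implementation: each line's lane_value is scaled by the share of its length that falls inside a district, and the scaled values are then summed. The output name lane_value_sum_sketch is made up for illustration.

```sql
-- Sketch only: length-proportional sum of lane_value per district.
SELECT
  d.id,
  d.name,
  SUM(
    l.lane_value
    -- Fraction of the line that lies inside the district (NULL if the line has zero length).
    * SAFE_DIVIDE(
        ST_LENGTH(ST_INTERSECTION(d.geom, l.geom)),
        ST_LENGTH(l.geom)
      )
  ) AS lane_value_sum_sketch
FROM `cartobq.docs.madrid_districts` AS d
JOIN `cartobq.docs.madrid_bike_all_infrastructure` AS l
  ON ST_INTERSECTS(d.geom, l.geom)
GROUP BY d.id, d.name;
```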

Explore the results 👇


Polygons to polygons

Time needed: < 5 minutes

We can also use polygons as source geometries. This is incredibly useful when working with different organizational units - such as census tracts and block groups - which is very common when working with location data. The function works very similarly to when enriching with lines: it will sum the proportions of the intersecting polygons of each district. In this case, the proportions are computed using the intersecting area, rather than length.

Again, we use the Enrich Polygons component for this process, summing the area which intersects each district.

Prefer to use SQL?
DROP TABLE IF EXISTS `cartobq.docs.changing_geo_polygons_to_polygons`;

CALL `carto-un`.carto.ENRICH_POLYGONS(
  'SELECT id, name, geom FROM `cartobq.docs.madrid_districts`',
  'geom',
  'SELECT geom, ST_AREA(geom) AS coverage FROM `cartobq.docs.madrid_bike_parkings_5min_area`',
  'geom',
  [('coverage', 'sum')],
  ['`cartobq.docs.changing_geo_polygons_to_polygons`']
);

Explore the results 👇

In the resulting map, we can see the total area covered by 5-minute walking isolines per district, in square meters.


Advanced enrichment methods

Time needed: < 10 minutes

In addition to the standard enrichment methods we've covered, there are more advanced, alternative ways to enrich polygons. These include:

  • Raw enrichment: This method pairs source and target geometries that intersect and provides useful details, such as the area of the intersection. This allows users to apply their own aggregation methods as needed.

  • Weighted enrichment: This method distributes data based on a chosen column, using a proxy variable to customize how values are aggregated across polygons.
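
To make the idea behind raw enrichment more concrete, the plain-SQL sketch below pairs each district with every 5-minute isoline it intersects and reports the intersection area alongside the isoline's total area. It approximates the concept rather than calling the Analytics Toolbox function itself, and it assumes the isolines table listed under "You will need..." exposes a geom column. The output column names are hypothetical.

```sql
-- Sketch only: one row per intersecting (district, isoline) pair.
SELECT
  d.id AS district_id,
  d.name AS district_name,
  ST_AREA(ST_INTERSECTION(d.geom, i.geom)) AS intersection_area_m2,  -- overlap, in square meters
  ST_AREA(i.geom) AS isoline_area_m2                                 -- full isoline, in square meters
FROM `cartobq.docs.madrid_districts` AS d
JOIN `cartobq.docs.madrid_bike_parkings_5min` AS i
  ON ST_INTERSECTS(d.geom, i.geom);
```

From these pairs you can apply whichever aggregation suits your analysis, for example summing intersection_area_m2 per district.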

This requires two enrichment steps:

  • Weighted enrichment: Using the Enrich Polygons with Weights component, we distribute the estimated number of bikes based on the number of buildings and their floors, assuming taller buildings house more people.

  • H3 grid aggregation: We enrich a standardized H3 grid, making it easier to analyze and visualize patterns with an Enrich H3 Grid component. This approach transforms a single city-wide estimate into a detailed spatial distribution, helping identify where bicycle infrastructure should be expanded to meet demand.
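
The weighting step can be illustrated with a small, self-contained sketch. The districts, building counts and floor averages below are invented, and using buildings multiplied by floors as the weight is an assumption made for illustration; only the 2.15 million citywide total comes from this tutorial. Each area receives a share of the total proportional to its weight, so the shares add back up to the original estimate.

```sql
-- Hypothetical inputs: building counts and average floors per area are invented.
WITH buildings AS (
  SELECT * FROM UNNEST([
    STRUCT('district_1' AS district, 120 AS num_buildings, 4.0 AS avg_floors),
    STRUCT('district_2', 80, 10.0),
    STRUCT('district_3', 200, 2.0)
  ])
),
weights AS (
  -- Assumed proxy for residents: more buildings and more floors mean a larger weight.
  SELECT district, num_buildings * avg_floors AS weight
  FROM buildings
)
SELECT
  district,
  weight,
  -- Distribute the citywide total in proportion to each area's weight.
  2150000 * weight / SUM(weight) OVER () AS estimated_bikes
FROM weights;
```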

Explore the results 👇

This tutorial covered how to enrich spatial data using the CARTO Analytics Toolbox, addressing challenges like MAUP and leveraging Spatial Indexes for better accuracy. By exploring raw and weighted enrichment, we demonstrated how broad statistics can be transformed into meaningful spatial insights. These techniques will help you make more informed decisions in your own spatial analysis.

Let's start with something simple: counting the number of points in a polygon, which can be achieved with the below Workflow. If this is your first time using CARTO Workflows, we recommend reading our Introduction to Workflows first to get familiar.

If you were to undertake this task with "vanilla SQL", it would be a far more complicated process and would require deeper use of spatial predicates (relationships) such as ST_CONTAINS or ST_INTERSECTS. However, this approach is versatile enough to handle more complex spatial operations - let's explore an example.

To demonstrate this, we'll use a simple Workflow to estimate the distribution of bicycles across the city using the Overture Buildings dataset. Our starting assumption is that 65% of the population owns a bike, leading to a total estimate of 2.15 million bicycles citywide.
