Work with unique Spatial Index properties

Take advantage of the unique properties of Spatial Indexes

On this page, you'll learn how to take advantage of some of the unique properties of Spatial Indexes.

Use parent and children hierarchies

Being able to seamlessly move data between resolutions is one of the reasons Spatial Indexes are so powerful. With geometries, this would involve a heavy spatial join operation, whereas Spatial Indexes let you do it with an efficient string operation.

Resolutions are referred to as having "parent" and "child" relationships; less detailed (coarser) resolutions are the parents, and more detailed (finer) resolutions are the children. In this tutorial, we'll share how you can easily move between these resolutions.

💡 You will need a Spatial Index table to follow this tutorial. You can use your own or follow the steps in the Create or enrich an index tutorial. We'll be using "Spatial Features - United States of America (H3 Resolution 8)" which you can access as a demo table from the CARTO Data Warehouse.

Our source dataset (USA Spatial Features H3 - resolution 8) has around 12 million cells in it - which is a huge amount! In this tutorial, we'll create the workflow below to move up the hierarchy to the parent resolution 7 to make this slightly more manageable.

  1. In the CARTO workspace, head to Workflows > Create a new workflow. Choose the relevant connection for where your data is stored; if you're following this tutorial you can also use the CARTO Data Warehouse.

  2. Drag your Spatial Index table onto the canvas.

  3. Next, drag a H3 to Parent component onto the canvas. Note you can also use a Quadbin to Parent component if you are using quadbins.

  4. Set your index column (likely "H3") and a parent resolution - we'll use 7. Run! This process will have generated a new column in your table - "H3_Parent."

  5. You can now use a Group by component - setting the Group by field to H3_Parent - to create a new table at the new resolution. At this point you can also aggregate any relevant numeric variables; for instance we will SUM the Population field.

At this point, it is good practice to use a Rename Column component to rename the H3_Parent column "H3" so it can be easily identified as the index column.
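
The steps above can be sketched outside of Workflows as well. The snippet below is a minimal illustration, not the implementation behind the H3 to Parent component: it assumes the standard H3 cell layout (resolution stored in bits 52-55, one 3-bit digit per resolution, unused digits set to 7), and the cell IDs and population values are made-up sample data. The function names are hypothetical.

```python
from collections import defaultdict

DIGIT_BITS = 3          # each resolution digit occupies 3 bits
MAX_RES = 15            # finest H3 resolution
RES_OFFSET = 52         # resolution is stored in bits 52-55
RES_MASK = 0xF << RES_OFFSET
UNUSED_DIGIT = 0b111    # marks digits finer than the cell's own resolution


def h3_to_parent(cell: str, parent_res: int) -> str:
    """Return the parent of an H3 cell (hex string) at a coarser resolution."""
    h = int(cell, 16)
    child_res = (h & RES_MASK) >> RES_OFFSET
    if not 0 <= parent_res <= child_res:
        raise ValueError("parent_res must be between 0 and the cell's resolution")
    # Overwrite the resolution field with the parent resolution.
    h = (h & ~RES_MASK) | (parent_res << RES_OFFSET)
    # Mark every digit finer than the parent resolution as unused.
    for res in range(parent_res + 1, child_res + 1):
        h |= UNUSED_DIGIT << ((MAX_RES - res) * DIGIT_BITS)
    return format(h, "x")


def aggregate_to_parent(rows, parent_res):
    """Group (cell, population) rows by parent cell and SUM the population."""
    totals = defaultdict(int)
    for cell, population in rows:
        totals[h3_to_parent(cell, parent_res)] += population
    return dict(totals)


# Sample resolution 8 cells with invented population values.
rows = [
    ("8828308281fffff", 120),
    ("8828308283fffff", 80),
    ("8828308291fffff", 50),
]
parents = aggregate_to_parent(rows, parent_res=7)
print(parents)
```

The first two sample cells share a resolution 7 parent, so their populations are summed into a single row, mirroring the Group by step. No geometry is touched at any point.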

Create K-rings

K-rings are a simple concept to understand, but can be a powerful tool in your analytics arsenal.

A ring is the set of adjacent cells surrounding an originating, central cell. The origin cell is referred to as “0,” and the cells adjacent to it are ring “1.” The cells adjacent to those are ring “2,” and so on - as highlighted in the image below.

What makes this so powerful is that it enables fast and inexpensive distance-based calculations; rather than having to make calculations based on - for example - buffers or isolines, you could instead stipulate 10 K-rings. This is a far quicker and cheaper calculation as it removes the need to use heavy geometries.

💡 You will need a Spatial Index table to follow this tutorial. We have used the Retail Stores dataset from demo tables in the CARTO Data Warehouse, and used a Simple Filter to filter this table to stores in Boston. We've then used H3 from GeoPoint to convert these to a H3 table. Please refer to the Convert points to a Spatial Index tutorial for more details on this process.

  1. Connect your H3 table to a H3 KRing component. Note you can also use a Quadbin KRing component if you are using this type of index.

  2. Set the K-ring to 1. You can use this documentation and this hexagon properties calculator to work out how many K-rings you need to approximate specific distances. For instance, we are using a H3 resolution of 8 which has a long-diagonal "radius" of roughly 1km. This means our K-ring of 1 will cover an area approximately 1km away from the central cell.

  3. Run your workflow! This will generate a new field called kring_index which contains the H3 reference for the K-ring cells, which can be linked to the central cell, referenced in the column H3.
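
To make the ring idea concrete, here is a small sketch on an idealized hexagon grid using axial (q, r) coordinates. This is an assumption made for illustration only: it is not real H3 indexing and not how the H3 KRing component is implemented. It only shows how the number of cells grows with the K-ring size, and how each K-ring cell stays linked to its origin cell.

```python
def hex_kring(center, k):
    """All axial (q, r) cells within k steps of center, including center itself."""
    cq, cr = center
    cells = []
    for dq in range(-k, k + 1):
        # Limit dr so that the hex distance from the center never exceeds k.
        r_min = max(-k, -dq - k)
        r_max = min(k, -dq + k)
        for dr in range(r_min, r_max + 1):
            cells.append((cq + dq, cr + dr))
    return cells


def kring_rows(origins, k):
    """One (origin, kring_cell) row per K-ring cell, similar in shape to a KRing output."""
    return [(origin, cell) for origin in origins for cell in hex_kring(origin, k)]


# Cell counts follow 1 + 3k(k + 1) on a grid of pure hexagons.
sizes = {k: len(hex_kring((0, 0), k)) for k in (0, 1, 2, 10)}
print(sizes)
```

On a grid made only of hexagons, a K-ring of 1 holds 7 cells and a K-ring of 10 holds 331, so the output table grows quickly as the K-ring size increases.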

So how can you use this? Well, you can see an example in the workflow above in the "Calculate the population" section, where we analyze the population within 1km of each store.

We run a Join (inner) on the results of the K-ring, joining it by the kring_index column to the H3 column in USA Spatial Features table (available for free to all CARTO users via the Spatial Data Catalog). Next, with the Group by component we aggregate by summing the population, and grouping by H3_joined. This gives us the total population in the K-ring around each central cell, approximately the population within 1km of each store. Finally, we use a Join (left) to join this back to our original H3 index which contains the store information.

With this approach, we leverage string-based - rather than geometry-based - calculations, for lighter storage and faster results - ideal for working at scale!
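
The join and aggregation described above can be sketched with plain dictionaries. All cell IDs, store names and population values below are invented placeholders, and the function names are hypothetical; the point is only that every step is a key lookup on the index value rather than a geometric operation.

```python
from collections import defaultdict


def population_per_origin(kring_rows, population_by_cell):
    """Inner join on the K-ring cell, then SUM population per central cell."""
    totals = defaultdict(int)
    for origin_cell, kring_cell in kring_rows:
        # Inner join: K-ring cells without a population record are dropped.
        if kring_cell in population_by_cell:
            totals[origin_cell] += population_by_cell[kring_cell]
    return dict(totals)


def join_back_to_stores(stores_by_cell, totals):
    """Left join: keep every store, with None where no population was found."""
    return [
        (name, cell, totals.get(cell))
        for cell, name in stores_by_cell.items()
    ]


# Placeholder data: (central cell, K-ring cell) pairs and population per cell.
kring_rows = [
    ("cell_a", "cell_a"), ("cell_a", "n1"), ("cell_a", "n2"),
    ("cell_b", "cell_b"), ("cell_b", "n2"), ("cell_b", "n9"),
]
population_by_cell = {"cell_a": 100, "n1": 40, "n2": 60, "cell_b": 10}
stores_by_cell = {"cell_a": "Store A", "cell_b": "Store B", "cell_c": "Store C"}

totals = population_per_origin(kring_rows, population_by_cell)
result = join_back_to_stores(stores_by_cell, totals)
print(result)
```

Note that a cell shared by two K-rings (here "n2") is counted once for each store, which is the expected behavior when the question is "how many people live near each store".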

Convert indexes into a geometry

There are some instances where you may want to convert Spatial Indexes back into a geometry. A common example of this is where you wish to calculate the distance from a Spatial Index cell to another feature, for instance to understand the distance from each cell to its closest 4G network tower.

There are two main ways you can achieve this - convert the index cell to a central point, or to a polygon.

💡 You will need a Spatial Index table to follow this tutorial. You can use your own or follow the steps in the Create or enrich an index tutorial. We have used the USA States dataset (available for free to all CARTO users via the Spatial Data Catalog) and filtered it to California. We then used H3 Polyfill to create a H3 index (resolution 5) to cover this area. For more information on this process please refer to the Convert polygons to a Spatial Index tutorial.

  • Converting to a point geometry: connect any Spatial Index component or source to a H3 Center component. Note you can alternatively use Quadbin Center.

  • Converting to a polygon geometry: connect any Spatial Index component or source to a H3 Boundary component. Note you can alternatively use Quadbin Boundary.

So, which should you use? It depends completely on the outcome you're looking for.

Point geometries are much lighter than polygons, and so will enable faster analysis and lighter storage. They can also be more representative for analysis. Let's illustrate by returning to our example of finding the distance between each cell and nearby 4G towers. By calculating the distance from the central point, you are essentially calculating the average distance for the whole cell. If you were to use a polygon boundary, your results would be skewed towards the side of the cell which is closest to the tower. On the other hand, polygon boundaries enable "cleaner" visualizations and are more appropriate for any overlay analysis you may need to do.
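
The difference between the two measurements can be illustrated with a toy example. The sketch below assumes flat (planar) coordinates, a regular hexagon with a circumradius of 1 unit, and a tower located outside the cell; real cells and distances would be handled by your data warehouse's spatial functions, so treat the names and numbers here as hypothetical.

```python
import math


def hexagon_vertices(cx, cy, circumradius):
    """Vertices of a regular hexagon centered on (cx, cy)."""
    return [
        (cx + circumradius * math.cos(math.radians(60 * i)),
         cy + circumradius * math.sin(math.radians(60 * i)))
        for i in range(6)
    ]


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    # Projection of p onto the segment, clamped to the segment's ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_boundary(p, vertices):
    """Shortest distance from an outside point p to the polygon's edges."""
    n = len(vertices)
    return min(
        point_segment_distance(p, vertices[i], vertices[(i + 1) % n])
        for i in range(n)
    )


center = (0.0, 0.0)
tower = (5.0, 0.0)
cell = hexagon_vertices(center[0], center[1], circumradius=1.0)

center_distance = math.hypot(tower[0] - center[0], tower[1] - center[1])
boundary_distance = distance_to_boundary(tower, cell)
print(center_distance, boundary_distance)
```

The boundary-based distance is shorter by up to the cell's circumradius, which is the skew towards the nearest side of the cell described above.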

But remember - because Spatial Index grids are geographically "fixed" it's easy to move between index and geometry, or between different geometry types.

Enriching a geometry with a Spatial Index

So, you've learned how to convert a geometry to a Spatial Index, and how to convert that Spatial Index back to a geometry. Another really common task which is made more efficient with Spatial Indexes is to use them to enrich a geometry - for instance to calculate the population within a specified area.

In this tutorial, we'll calculate the total population within 25 miles of Grand Central Station NYC. You can adapt this for any example; all you need is a polygon to enrich, and a Spatial Index to do the enriching with.

For this specific example, you will need access to the USA Spatial Features H3 table (available for free to all CARTO users either in the CARTO Data Warehouse > demo data > demo tables, or via the Spatial Data Catalog). In addition, the workflow below creates a buffer polygon of 25 miles from Grand Central Station, which we've manually digitized using the Table from GeoJSON component.

To run the enrichment, follow the below steps:

  1. In addition to your polygon, drag your Spatial Index layer onto the canvas.

  2. Connect the ST Buffer output to a H3 Polyfill component (note you can also use a Quadbin Polyfill if you are using this Spatial Index type).

  3. Set the resolution of H3 Polyfill to the same resolution as your input Spatial Index; for us that is 8. If you have multiple polygon input features, we recommend enabling the Keep input table columns option. Optional: run the workflow to check out the interim results! You should have a H3 grid covering your polygon.

  4. To attach population data to this grid, use a Join component with the type Left, and connect the results of H3 Polyfill to the top input. For the bottom input, connect the Spatial Index source layer (for us, that's the Spatial Features table).

  5. Set the main and secondary table columns as H3 (or whichever field contains your index references). With the join type set to Left, every cell from the H3 Polyfill component is kept, and only the features from the Spatial Features table which can also be found in the H3 Polyfill output are attached. Run!

  6. Finally, we want to know the total population in this area, so add a Group by component. Set the aggregation column to population_joined and the type as SUM. If you had multiple input polygons and you wanted to know the total population per polygon, here you could set the Group by column to the unique polygon ID - but we just want to know the total for one polygon so we can leave this empty. Run!

And what's the result?


13,576,991 people live within 25 miles of Grand Central Station NYC!

The benefit of this approach is that after you've run the H3 Polyfill component, all of the calculations are based on string fields, rather than geometries. This makes the analysis far less computationally expensive - and faster!
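
As a rough sketch of what happens after the polyfill step, the snippet below left-joins a list of (polygon ID, cell) rows to a population lookup and sums the result per polygon. The polygon ID, cell IDs and population values are invented placeholders, and the function name is hypothetical.

```python
def enrich_polygons(polyfill_rows, population_by_cell):
    """Left join polyfill cells to population on the index, then SUM per polygon."""
    totals = {}
    for polygon_id, cell in polyfill_rows:
        # Left join: cells with no population record are kept and add nothing.
        population = population_by_cell.get(cell, 0)
        totals[polygon_id] = totals.get(polygon_id, 0) + population
    return totals


# Placeholder output of a polyfill step: one row per cell covering the polygon.
polyfill_rows = [("buffer_1", "c1"), ("buffer_1", "c2"), ("buffer_1", "c3")]
# Placeholder population table; "c9" lies outside the polygon, "c3" has no record.
population_by_cell = {"c1": 1000, "c2": 2500, "c9": 999}

enriched = enrich_polygons(polyfill_rows, population_by_cell)
print(enriched)
```

With several input polygons, the same function returns one total per polygon ID, which corresponds to setting the Group by column to the unique polygon ID.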

Check out more examples of data enrichment in the Workflows Gallery!
