Generate a dynamic index based on user-defined weighted variables

Context

In this tutorial, we'll explore how to create a versatile web map application using Builder, focusing on the dynamic customization of index scores through SQL Parameters. You'll learn how to normalize variables using Workflows and how to craft an index based on these normalized variables. We'll guide you through dynamically applying specific weights to these variables, enabling the index to flexibly align with various user scenarios.

Whether it's for optimizing location-based services, fine-tuning geomarketing strategies, or diving deep into trend analysis, this tutorial provides you with the essential tools and knowledge. You'll gain the ability to draw significant and tailored insights from intricate geospatial data, making your mapping application a powerful asset for a wide range of scenarios.

Step-by-Step Guide:

In this guide, we'll walk you through:

Creating normalized variables with Workflows

Access Workflows from your CARTO Workspace using the Navigation menu.

Select the data warehouse where you have the data accessible. We'll be using the CARTO Data Warehouse, which should be available to all users.

In the Sources section location on the left panel, navigate to demo_data > demo tables within CARTO Data Warehouse. Drag and drop the below sources to the canvas.
- usa_states_boundaries
- derived_spatialfeatures_usa_h3res8_v1_yearly_v2
- cell_towers_worldwide

We are going to focus our analysis in California. To extract California boundary, we add the Simple Filter component into the canvas and we connect USA States Boundary source to its input. Then, in the node configuration panel we select 'name' as column, 'equal to' as the operation, and 'California' as the value. We click on "Run". You can use the Map Preview to visualize the output.

We are going to leverage spatial indexes, specifically H3 at resolution level 8, to generate our dynamic, weighted index. After isolating the California state boundary, our next step is to transform it into H3 cells. Add the H3 Polyfill component to the canvas and set the resolution to level 8 in the node. Then, proceed by clicking 'Run' to complete the transformation.

Now that we have California H3 cells, we can use the Join component to select Derived Spatial Features source located in California. Add the component to the canvas, link both sources and select 'Inner' as the join type in the node. Then, click on "Run".

Now we can begin normalizing our key variables. Normalizing a variable involves adjusting its values to a common scale, making it easier to compare across different datasets.

Prior to normalizing, we will use the Select component to keep only the necessary columns using the below expression:

h3, 
population_joined as population, 
retail_joined as retail, 
transportation_joined as transport, 
leisure_joined as leisure

Now, let's normalize our desired variables. To do so, add the Normalize component to the canvas. In the node, select one of the desired variables such population. Click on "Run". Once completed, you can visualizes the result in the Data Preview. By inspecting it you can reveal a new column named population_norm with data varying from 0 to 1.

Repeat the above process by adding the Normalize compoment for each of the remaining variables: retail, leisure and transport.

After finishing with the variables from Derived Spatial Features, we can start analyzing the distance between each H3 cell and the closest cell tower location. The first step of this analysis is to extract the cell towers located within California state boundary. To do so, we will use the Spatial Filter component adding Cell Towers Worldwide source as the main input and California state as the secondary input. In the node, select 'Intersect' as the spatial predicate.

Then, we need to extract the centroid geometry from the H3 cells so we can perform a point-to-point distance operation. To do so, add the H3 Center component to the canvas and link it with H3 Polyfill output as we are only interested on the H3 ids.

Add a unique id to the filtered Cell Tower locations by using Row Number component that will add a new column to your table with the row count on it.

We can now add the Distance to nearest component to calculate the closest distance between each H3 cell to the nearest cell tower location in California. Link the H3 Center output as the main source and add the filtered cell tower locations as the secondary input. In the node, set the configuration as per below image with the distance set to 500 meters. You can use the Data Preview to visualise the resulted columns.

With the distance calculated, we can normalize our variable. As on previous steps, we will use the Normalize compoment to achieve that specifying the column as the nearest_distance.

Given that in our case, a higher distance to a cell tower location is considered less favorable, we need to invert our scale so that higher values are interpreted positively. To achieve this, utilize the Select component and apply the following statement to reverse the scale, thereby assigning higher values a more positive significance.

h3,
1 - nearest_distance_norm as nearest_distance_norm, 
nearest_distance

Let's join the normalized variables using the Join component. In the node, set the join type to 'Inner', as we are only interested on those locations where there is a cell tower location with a minimum distance of 500 meters.

The final step in our analysis is to save our output results as a table. We will use the Save as Table component to generate a table from the normalized variables using H3 spatial index and the California state boundary so we can visualize the analysis location. Save both tables within CARTO Data Warehouse > Organization > Private and name them as following:
- California State Boundary: california_boundary
- Normalized variables: california_normalized_variables

Now that the Workflows is done, you can add Annotations, edit the component names and organize it so that the analysis is easy to read and share.

Creating an Index Score using normalized variables

In Workflows, preview the map result of Save as Table component to generate the California Boundary source. Click on "Create map".

A map opens with California Boundary added as table source. Change the Map Title to "Create index score using normalized variables" and rename the layer to "Search Area".

Access the Layer panel, disable the Fill Color and set the Stroke Color to red, setting the Stroke Width to 1.5.

Now, we will add the normalized variables sources.
- Select the Add source from button at the bottom left on the page.
- Click on the CARTO Data Warehouse connection.
- Select Type your own query.
- Click on the Add Source button.

The SQL query panel will be opened.

Enter the following query replacing the qualified table name by your output table created in Step 15. You can find this name in the Data Explorer by the navigating to the recently created table. Once the query is updated, make sure the Spatial Data Type selected is H3. Then, click on "Run".

SELECT * FROM carto-dw-ac-dp1glsh.private_atena_onboardingdemomaps_ca2c4d8c.califoria_normalized_variables

Now, rename let's modify the query creating an index score based on the normalized variables we previously generated in Workflows. Update the SQL query as per below and click on "Run". Then, rename the Layer to 'Index Score'.

WITH index AS (
 SELECT 
  	h3,
    population_norm + retail_norm + leisure_norm + transport_norm + nearest_distance_norm_joined as index_score
  FROM carto-dw-ac-dp1glsh.private_atena_onboardingdemomaps_ca2c4d8c.califoria_normalized_variables)
  
SELECT h3,ML.MIN_MAX_SCALER(index_score) OVER() as index_score FROM index

After running the SQL query, the data source is updated. Then, you can style your H3 layer by index_score, an index that has been calculated considering all variables as equal weights.

While indexes with equal weights offer valuable insights, we'll also explore custom weighting for each variable. This approach caters to diverse user scenarios, particularly in identifying optimal business locations. In Builder, you can apply weights to variables in two ways:

Static Weights: Here, specific weights are applied directly in the SQL query. These weights are fixed and can only be changed by the Editor. This method is straightforward and useful for standard analyses.
Dynamic Weights: This more flexible approach involves using SQL Parameters. It allows Viewer users to adjust weights for each variable, tailoring the analysis to their specific business needs.

Let's begin with the static method:

Edit your SQL query to include static weights for each normalized variable. Experiment with different weights to observe how they impact the index score. Each time you modify and re-run the query, you'll see how these adjustments influence the overall results.

WITH data_ AS (
  SELECT
  	h3,
  	population_norm * 1 as population_norm,
  	retail_norm * 0.2 as retail_norm,
  	leisure_norm * 0.2 as leisure_norm,
  	transport_norm * 0.6 as transport_norm,
  	nearest_distance_norm_joined * 1 as nearest_distance_norm
  FROM carto-dw-ac-dp1glsh.private_atena_onboardingdemomaps_ca2c4d8c.califoria_normalized_variables),

index AS (
 SELECT 
  	h3,
    population_norm + retail_norm + leisure_norm + transport_norm + nearest_distance_norm as index_score
  FROM data_)
  
SELECT h3,ML.MIN_MAX_SCALER(index_score) OVER() as index_score FROM index

Enabling SQL Parameters for user-defined index customization

SQL parameters are placeholders that you can add in your SQL Query source and can be replaced by input values set by users. In this tutorial, we will learn how you can use them to dynamically update the weights of normalized variables.

The first step in this section is to create a SQL Numeric Parameter. You can access this by clicking on the top right icon in the Sources Panel.

Set the SQL Numeric Parameter configuration as follows:
- Slider Type: Simple Slider
- Min Value: 0
- Default Value: 0.5
- Max Value: 1
- Display name: Population Weight
- SQL name: {{population_weight}}

Once you create a parameter, a parameter control is added to the right panel. From there, you can copy the parameter SQL name to add it to your query. In this case, we will add it as the weight to our population_norm column.

Repeat Step 26 to add a SQL Numeric Parameter and update the SQL Query for each of the normalized variables: leisure_norm, retail_norm, transport_norm and nearest_distance_norm The output SQL query and parameter panel should look similar to the below.

WITH data_ AS (
  SELECT
  	h3,
  	population_norm * {{population_weight}} as population_norm,
  	retail_norm * {{retail_weight}} as retail_norm,
  	leisure_norm * {{leisure_weight}} as leisure_norm,
  	transport_norm * {{transport_weight}} as transport_norm,
  	nearest_distance_norm_joined * {{cell_tower_distance_weight}} as nearest_distance_norm
  FROM carto-dw-ac-dp1glsh.private_atena_onboardingdemomaps_ca2c4d8c.califoria_normalized_variables),

index AS (
 SELECT 
  	h3,
    population_norm + retail_norm + leisure_norm + transport_norm + nearest_distance_norm as index_score
  FROM data_)
  
SELECT h3,ML.MIN_MAX_SCALER(index_score) OVER() as index_score FROM index

Now, style your map as desired. We will be setting up our Fill Color palette to ColorBrewer RdPu 4 with color based on index_socre and changing the basemap to CARTO Dark Matter. You can test the parameter controls to see how the index is updated dynamically taking into account the input weight values.

Let's add a description to our map that can provide viewer users with further context about this map and how to use it.

In the Legend tab, set the legend to open when the map is first loaded.

Finally we can make the map public and share the link to anybody.
- For that you should go to Share section on the top right corner and set the map as Public.
- Activate SQL parameters controls options so that Viewer users can control the exposed parameters.

Copy the public share link and access the map as a Viewer. The end result should look similar to the below:

Last updated 26 days ago