# Spatial Scoring: Measuring merchant attractiveness and performance

<div align="left"><figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FUx7fNjcfw9KvGNf1JTaW%2Fadvanced%20banner.png?alt=media&#x26;token=ea2ec56e-2c6a-4c54-bae4-561b2fa33b7b" alt="Advanced difficulty banner" width="175"><figcaption></figcaption></figure></div>

Spatial scores combine diverse data sources into a **single score**, giving businesses one consistent measure for evaluating a merchant's potential across different locations. By consolidating variables such as [footfall](https://carto.com/spatial-data-catalog/foot-traffic/), [demographic profiles](https://carto.com/spatial-data-catalog/demographic-data/) and [spend](https://carto.com/industries/financial-services), data scientists can develop actionable strategies to optimize sales, reduce costs, and gain a competitive edge.

### A step-by-step guide to Spatial Scoring

In this tutorial, we’ll be scoring potential merchants across Manhattan to determine the best locations for our product: canned iced coffee!

This tutorial has two main steps:

1. **Data Collection & Preparation** to collate all of the relevant variables into the necessary format for the next steps.
2. **Calculating merchant attractiveness** for selling our product. In this step, we’ll be combining data on footfall and proximity to transport hubs into a meaningful score to rank which potential points of sale would be best placed to stock our product.

### You will need...

* **An Area of Interest (AOI) layer.** This is a polygon layer which we will use to filter USA-wide data to just the area we are analyzing. Subscribe to the [County - United States of America (2019)](https://clausa.app.carto.com/spatial-data-catalog/browser/geography/cdb_county_7fc835db/) layer via the **Data Observatory** tab of your CARTO Workspace. Note that you can use any AOI you like, but the footfall sample data only covers Manhattan, so you will not be able to use it for other regions (see below).
* **Potential Points of Sale (POS) data.** We will be using **retail\_stores** from the CARTO Data Warehouse (demo data > demo tables).
* **Footfall data.** Our data partner Unacast has kindly provided a sample of their [Activity - United States of America (Quadgrid 17)](https://clausa.app.carto.com/spatial-data-catalog/browser/dataset/uc_activity_1ef60fe2/) data for this tutorial. You can also find it in the CARTO Data Warehouse, as **unacast\_activity\_sample\_manhattan** (demo data > demo tables). The assumption here is that the higher the footfall, the greater the potential sales of our iced coffee!
* **Proximity to public transport hubs.** Let's imagine the marketing for our iced coffee cans directly targets professionals and commuters - where better to stock our products than close to stations? We'll be using [OpenStreetMap](https://carto.com/blog/osm-bigquery) as the source for this data (the **osm\_pois\_usa** table), which again you can access via the CARTO Data Warehouse (demo data > demo tables).

***

## Step 1: Data Collection & Preparation

The first step in any analysis is data collection and preparation - we need to calculate the footfall for each store location, as well as the proximity to a station.

To get started:

1. Log into the CARTO Workspace, then head to **Workflows** and **Create a new workflow**; use the **CARTO Data Warehouse** connection.
2. Drag the four data sources onto the canvas:
   1. To do this for the Points of Sale, Footfall and Public transport hubs, go to **Sources** (on the left of the screen) > Connection > Demo data > demo\_tables.
   2. For the AOI counties layer, switch from Connection to Data Observatory then select CARTO and find County - United States of America (2019).

The full workflow for this analysis is below; let's look at this section-by-section.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FNfKy7fHdKHWTDWl9Up6p%2Facademy_scoring_full%20workflow.png?alt=media&#x26;token=1e989244-4231-4e68-9dfd-9416a637e1f8" alt="The full spatial scoring workflow"><figcaption><p>The full spatial scoring workflow</p></figcaption></figure>

### Section 1: Filter retail stores to the AOI

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FLrkpCa5ddUrbnMWkPmbv%2Facademy_scoring_workflow%20section%201.png?alt=media&#x26;token=1db4af51-eab7-4f3a-8015-013b17ad8bb4" alt="The workflow for filtering retail stores to the AOI"><figcaption><p>Filtering retail stores to the AOI</p></figcaption></figure>

1. Use a [Simple Filter](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#simple-filter) with the condition **do\_label equal to New York** to filter the polygon data to Manhattan (New York County).
2. Next, use a [Spatial Filter](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#spatial-filter) to filter the retail\_stores table to those which **intersect** the AOI we have just created. There should be 66 stores remaining.
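
If you prefer to think in SQL, the Spatial Filter step can be approximated with BigQuery's `ST_INTERSECTS`. The sketch below is only an illustration: the polygon is a rough made-up bounding box rather than the real county boundary, the two store points are hypothetical, and this is not the SQL that Workflows generates.

```sql
-- Hypothetical inline data standing in for the AOI polygon and the store points.
WITH aoi AS (
  SELECT ST_GEOGFROMTEXT(
    'POLYGON((-74.03 40.70, -73.93 40.70, -73.93 40.80, -74.03 40.80, -74.03 40.70))'
  ) AS geom
),
stores AS (
  -- Store 1 falls inside the box, store 2 lies east of it.
  SELECT 1 AS cartodb_id, ST_GEOGPOINT(-73.985, 40.758) AS geom
  UNION ALL
  SELECT 2, ST_GEOGPOINT(-73.80, 40.75)
)
-- Keep only the stores whose geometry intersects the AOI.
SELECT
  s.cartodb_id,
  s.geom
FROM stores AS s
JOIN aoi AS a
  ON ST_INTERSECTS(s.geom, a.geom);
```

With this sample data, only store 1 is returned.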

### Section 2: Calculating footfall

There are various methods for assigning [Quadbin](https://academy.carto.com/creating-workflows/workflow-templates/spatial-indexes) grid data to points such as retail stores. You may have noticed that our sample footfall data has some missing values, so we will assign footfall based on the value of the closest Quadbin grid cell.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FFguT612YeFbjnGjooHeO%2Facademy_scoring_workflow%20section%202.png?alt=media&#x26;token=508f687a-5a97-49ae-9873-543b340d9146" alt="The workflow section for calculating footfall"><figcaption><p>Calculating footfall with CARTO Workflows</p></figcaption></figure>

1. Use [Quadbin Center](https://docs.carto.com/carto-user-manual/workflows/components/spatial-indexes#quadbin-center) to convert each grid cell to a central point geometry.
2. Now that we have two point geometries, we can run the [Distance to nearest](https://docs.carto.com/carto-user-manual/workflows/components/spatial-operations#distance-to-nearest) component. Use the output of Section 1 (Spatial Filter; all retail stores in Manhattan) as the top input, and the Quadbin Center as the bottom input.
   1. The input geometry columns should both be "geom" and the ID columns should be "cartodb\_id" and "quadbin" respectively.
   2. Make sure to change the radius to 1000 meters; this is the maximum search distance for nearby features.
3. Finally, use a [Join](https://docs.carto.com/carto-user-manual/workflows/components/joins#join) component to access the footfall value from unacast\_activity... (this is the column called "staying"). Use a Left join and set the join columns to "nearest\_id" and "quadbin."
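
The logic of this section (find the nearest cell center within a search radius, then pick up its footfall value) can be sketched in plain BigQuery SQL. This is a minimal illustration under stated assumptions: the store and cell coordinates, the quadbin ids and the `staying` values are all made up, and the query is not what the Distance to nearest component runs internally.

```sql
-- Hypothetical inline data: two stores and three grid-cell center points.
WITH stores AS (
  SELECT 1 AS cartodb_id, ST_GEOGPOINT(-73.9855, 40.7580) AS geom
  UNION ALL
  SELECT 2, ST_GEOGPOINT(-73.9772, 40.7527)
),
cells AS (
  -- The quadbin ids are placeholder integers, not real Quadbin indexes.
  SELECT 101 AS quadbin, ST_GEOGPOINT(-73.9850, 40.7585) AS geom, 1200 AS staying
  UNION ALL
  SELECT 102, ST_GEOGPOINT(-73.9780, 40.7530), 800
  UNION ALL
  SELECT 103, ST_GEOGPOINT(-73.9000, 40.7000), 50
),
ranked AS (
  SELECT
    s.cartodb_id,
    c.quadbin AS nearest_id,
    ST_DISTANCE(s.geom, c.geom) AS nearest_distance,  -- meters
    c.staying AS staying_joined,
    -- Rank candidate cells for each store from closest to farthest.
    ROW_NUMBER() OVER (
      PARTITION BY s.cartodb_id
      ORDER BY ST_DISTANCE(s.geom, c.geom)
    ) AS rn
  FROM stores AS s
  JOIN cells AS c
    ON ST_DWITHIN(s.geom, c.geom, 1000)  -- 1000 m search radius
)
-- Keep only the closest cell per store.
SELECT
  cartodb_id,
  nearest_id,
  nearest_distance,
  staying_joined
FROM ranked
WHERE rn = 1;
```

Note that this sketch uses an inner join, so a store with no cell center within 1000 meters would drop out of the result.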

### Section 3: Calculating distance to stations

We'll take a similar approach in this section to establish the distance to nearby stations.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FTcw00T0I5fjCqxhvOZeJ%2Facademy_scoring_workflow%20section%203.png?alt=media&#x26;token=9ea1e038-2e8a-4969-ba82-83a2b71aeaf6" alt="A screenshot of CARTO Workflows"><figcaption><p>Calculating proximity to stations</p></figcaption></figure>

1. Use the [Drop Columns](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#drop-columns) component to omit the nearest\_id, nearest\_distance and quadbin\_joined columns; as we're about to run the Distance to nearest process again, we don't want to end up with confusing duplicate column names.
2. Let's turn our attention to **osm\_pois\_usa**. Run a [Simple Filter](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#simple-filter) with the condition **subgroup\_name equal to Public transport station**.
3. Now we can run another [Distance to nearest](https://docs.carto.com/carto-user-manual/workflows/components/spatial-operations#distance-to-nearest) using these two inputs. Set the following parameters:
   1. The geometry columns should both be "geom"
   2. The ID columns should be "cartodb\_id" and "osm\_id" respectively
   3. Set the search distance this time to 2000 meters

Now we need to do something a little different. For our spatial scoring, we want stores close to stations to score highly, so we need a variable where a short distance to a station is actually assigned a high value. This is really straightforward to do!

1. Connect the results of Distance to nearest to a [Normalize](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#normalize) component, using the column "nearest\_distance." This will create a new column **nearest\_distance\_norm**, with normalized values from 0 to 1.
2. Next, use a [Create Column](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#create-column) component, calling the column station\_distance\_norm\_inv and using the code `1-nearest_distance_norm`, which inverts the normalized values so that the stores closest to a station receive the highest values.
3. Commit the results of this using [Save as Table](https://docs.carto.com/carto-user-manual/workflows/components/import-export#save-as-table).
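
To see what the normalize-and-invert steps do to the numbers, here is a small sketch in BigQuery SQL. It assumes the normalization is a min-max rescaling to the 0-1 range (the exact method used by the Normalize component is an assumption here), and the three distances are made-up values.

```sql
-- Hypothetical distances (in meters) from three stores to their nearest station.
WITH distances AS (
  SELECT 1 AS cartodb_id, 50.0 AS nearest_distance
  UNION ALL
  SELECT 2, 300.0
  UNION ALL
  SELECT 3, 550.0
),
normalized AS (
  SELECT
    cartodb_id,
    nearest_distance,
    -- Min-max scaling: (x - min) / (max - min).
    -- SAFE_DIVIDE returns NULL instead of failing if all distances are equal.
    SAFE_DIVIDE(
      nearest_distance - MIN(nearest_distance) OVER (),
      MAX(nearest_distance) OVER () - MIN(nearest_distance) OVER ()
    ) AS nearest_distance_norm
  FROM distances
)
SELECT
  cartodb_id,
  nearest_distance,
  nearest_distance_norm,
  -- Invert so that a short distance becomes a high value.
  1 - nearest_distance_norm AS station_distance_norm_inv
FROM normalized
ORDER BY cartodb_id;
```

With these sample values, the store 50 meters from a station gets an inverted value of 1, the store at 300 meters gets 0.5, and the store at 550 meters gets 0.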

The result is a table containing our retail\_stores, each of which now has a value for footfall and for proximity to a station - so now we can run our scoring!

***

## Step 2: Calculating merchant attractiveness

In this next section, we’ll create our attractiveness scores! We’ll be using the [CREATE\_SPATIAL\_SCORE](https://docs.carto.com/data-and-analysis/analytics-toolbox-for-bigquery/sql-reference/cpg#create_spatial_score) function to do this; our documentation gives a full breakdown of its parameters.

Sample code for this is below; you can run this code either in a [Call Procedure](https://docs.carto.com/carto-user-manual/workflows/components/custom#call-procedure) component in Workflows, or directly in your data warehouse console. Note you will need to replace "yourproject.yourdataset.potential\_POS\_inputs" with the path where you saved the previous table (if you can't find it, it will be at the bottom of the SQL preview window at the bottom of your workflow). You can also adjust the weights (ensuring they always add up to 1) and number of buckets in the scoring parameters section.

```
CALL `carto-un`.carto.CREATE_SPATIAL_SCORE(
   -- Select the input table (created in step 1)
   'SELECT geom, cartodb_id, staying_joined, station_distance_norm_inv FROM `yourproject.yourdataset.potential_POS_inputs`',
   -- Merchant's unique identifier variable
   'cartodb_id',
   -- Output table name
   'yourproject.yourdataset.scoring_attractiveness',
   -- Scoring parameters
   '''{
     "weights":{"staying_joined":0.7, "station_distance_norm_inv":0.3 },
     "nbuckets":5
   }'''
);

```
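
To build some intuition for how the weights influence the ranking, the sketch below computes a plain weighted sum over two variables. This is a deliberately simplified illustration, not the implementation of CREATE\_SPATIAL\_SCORE (which also handles scaling and bucketing); the column names and values are hypothetical and assumed to be already scaled to the 0-1 range.

```sql
-- Hypothetical inputs, already scaled to the 0-1 range.
WITH inputs AS (
  SELECT 1 AS cartodb_id, 0.9 AS footfall_scaled, 0.2 AS station_proximity_scaled
  UNION ALL
  SELECT 2, 0.4, 1.0
  UNION ALL
  SELECT 3, 0.1, 0.5
)
SELECT
  cartodb_id,
  -- Same weights as in the call above: 0.7 for footfall, 0.3 for station proximity.
  0.7 * footfall_scaled + 0.3 * station_proximity_scaled AS weighted_score
FROM inputs
ORDER BY weighted_score DESC;
```

With these sample values, store 1 ranks first (about 0.69) even though it is the farthest from a station, because footfall carries the larger weight. Swapping the weights would put store 2 on top.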

Let's check out the results! First, you'll need to join the results of the scoring process back to the retail\_stores table, as the geometry column is not retained in the process. You can use a Join component in Workflows or adapt the SQL below.

```
WITH
  scores AS (
  SELECT
    *
  FROM
    `yourproject.yourdataset.scoring_attractiveness`)
SELECT
  scores.*,
  input.geom
FROM
  scores
LEFT JOIN
  `carto-demo-data.demo_tables.retail_stores` input
ON
  scores.cartodb_id = input.cartodb_id
```

{% embed url="<https://clausa.app.carto.com/map/e8ca0928-9db5-4b1f-beee-ab11f3d82d02>" %}
The results!
{% endembed %}

You can see in the map that the highest scoring locations can be found in extremely busy, accessible locations around Broadway and Times Square - perfect!

***

**Want to take this one step further?** Try calculating merchant performance, which assesses how well stores perform against the expected performance for that location - check out [this tutorial](https://carto.com/blog/spatial-scoring-cpg-merchant-performance) to get started!
