Measuring merchant attractiveness and performance in CPG with spatial scores

In the CGP industry, businesses are typically influenced by various factors that determine their success. It can be challenging to assess the attractiveness, risk, or trendiness of a merchant due to the multitude of criteria involved. Therefore, consolidating diverse data sources into a unified score becomes crucial for businesses to gain a comprehensive understanding of their product's potential in different locations. By doing so, they can develop clear strategies to optimize sales, boost profit margins, and effectively compete with other products in the market.

In this example, we will show you how to create spatial scores using CARTO's Analytics Toolbox for BigQuery for two different use cases:

Understand how attractive each merchant is when it comes to selling a product. We will make use of the CREATE_SPATIAL_SCORE procedure to combine specific variables into a meaningful score and discover potential locations to bring a new beverage product to market.
Identify how well each point of sales is performing to recognize areas of improvement. By using the CREATE_SPATIAL_PERFORMANCE_SCORE procedure, we will detect if locations are performing as expected (or if they are under/over-performing) in terms of some business criteria and a performance KPI, such as sales, volume, or margin.

To derive both scores, we will consider Food & Beverage point of sales (FB-POS) from The Data Appeal Company, enriched with business and financial information, sentiment scores and footfall indicators. Particularly, we will consider the footfall, sentiment and total amount of positive reviews of each POS, and the specific early morning and early afternoon footfall indicators, as these are the hours of peak beverage consumption. Lastly, we computed the amount of horeca POS (hotels, restaurants and cafés) in a 100 meters buffer around each FB-POS, computed by adding up all the horeca POIs contained within the resolution 11-H3 cell each FB-POS belongs to (using the H3_FROMGEOGPOINT procedure) and its 2nd-degree kring (using the H3_KRING procedure) as a proxy for faster performance.

1. Derive the attractiveness score

To derive a score that measures how attractive a POS is to start selling a new beverage product, we will scale and aggregate the selected variables using some user-defined (optional) weights, which add up to 1. For this, we just need to call the CREATE_SPATIAL_SCORE procedure as follows. We need to specify:

the data we want to use to create the score,
the unique geographic identifier of each POS,
the output table name to store the results,
and some optional set up parameters. For this specific use case:
- Using the weights option, we will assign a weight of 0.25 to the variables sentiment and footfall, as we believe they are the most relevant aspects for a POS to be successful in selling beverage. Therefore, the remaining 0.5 will be equally split among the other input features.
- With the nbuckets parameter, we will categorize the merchants into 3 attractiveness levels ('Low', 'Medium' and 'High') of equal width.

CALL carto.CREATE_SPATIAL_SCORE(
    -- Select the merchant's unique identifier and the input features to be aggregated
    'SELECT pos_id, footfall, sentiment, positive_opinions_count, footfall_early_morning, footfall_early_afternoon, horeca_count_100m FROM `cartobq.docs.cpg_spatial_scoring_input`',
    -- Merchant's unique identifier variable
    'pos_id',
    -- Output table name
    '<my-project>.<my-dataset>.cpg_spatial_scoring_attractiveness',
    -- Optional parameters
    '''{
      "weights":{"sentiment":0.25, "footfall":0.25},
      "nbuckets":3
    }'''
);

The output table <my-project>.<my-dataset>.cpg_spatial_scoring_attractiveness includes the POS' locations (pos_id) ranked (ranking) based on the derived spatial attractiveness score (spatial_score) along with their corresponding attractiveness level (spatial_score_buckets). Notice that the score is provided on a 0-to-100 scale.

Additionally, a lookup table <my-project>.<my-dataset>.cpg_spatial_scoring_attractiveness_lookup is provided for the user to aid interpreting the attractiveness level. For instance, if a POS is very attractive to sell beverage, its score will lie within the 66.6 - 100 range.

Notice that this approach is also suitable to create spatial scores of any kind, such as risk or sales potential scores; what they represent depends on the input variables introduced by the user.

2. Derive the performance score

Now, we will use the same variables to run a regression model that estimates the expected performance of each FB-POS and then compare the results with the actual sales volumes. As a result, we will get a measure that allows us to detect locations above performance (over-performers) and below performance (under-performers) according to the specific business criteria that we have defined. Different criteria will yield different results.

To do this, we will call the CREATE_SPATIAL_PERFORMANCE_SCORE procedure as follows:

Apart from the input data, the unique ID of each POS and the output table name, we need to specify also the business KPI variable. Particularly, we want to evaluate the performance of points of sales (POS) selling a beverage product based on a sales volume KPI that has been simulated for this purpose.
As optional parameters,
- we want the R2 score of the regression model (r2_thr) to be 0.7 to ensure an acceptable fitting,
- and the merchants will be categorized into 3 groups of equal widths centered at zero ('Low', 'Neutral' and 'High').

CALL carto.CREATE_SPATIAL_PERFORMANCE_SCORE(
    -- Select the merchant's unique identifier and the input features to be aggregated
    'SELECT pos_id, footfall, sentiment, positive_opinions_count, footfall_early_morning, footfall_early_afternoon, horeca_count_100m, sales_volume FROM `cartobq.docs.cpg_spatial_scoring_input`',
    -- Merchant's unique identifier variable
    'pos_id',
    -- Business KPI variable
    'sales_volume',
    -- Output table name
    '<my-project>.<my-dataset>.cpg_spatial_scoring_performance',
    -- Optional parameters
    '''{
      "r2_thr":0.7,
      "nbuckets":3
    }'''
);

Again, the output table <my-project>.<my-dataset>.cpg_spatial_scoring_performance includes the POS' locations (pos_id) ranked (ranking) based on the derived performance score (spatial_score) along with their corresponding performance level (spatial_score_buckets). Notice that the score is provided as the difference between the actual volume of sales and the estimated value (positive values indicate over-performance while negative values indicate under-performance, according to the selected criteria).

Additionally, a lookup table cartobq.docs.cpg_spatial_scoring_performance_lookup is provided as before to help interpretability. A POS with a score of 0 will always belong to the intermediate group, as the discretization is centered at 0. Notice that outliers are not considered when computing the upper and lower bounds of each bucket, but they belong to the closest bucket.

3. Visualizing the results

In the following maps we can see the results of computing both scores. To get more insights, one can for instance filter the data to select those FB-POS having a high performance on the left, and those FB-POS having a low attractiveness level on the right. It is interesting to see how some of the top performers have a low attractiveness score based on the selected business criteria. Such information can be used to further analyze possible causes and identify other success criteria than can be brought to new locations.