Opening a new Pizza Hut location in Honolulu

Context

Identifying an optimal location for a new store is not always an easy task, and we often do not have enough data at our disposal to build a solid model to predict potential revenues across an entire territory. In these cases, managers rely on different business criteria in order to make a sound decision for their expansion strategy. For example, they rely on defining their target market and segmenting population groups accordingly in order to locate the store closer to where the target market lives (e.g. areas with a great presence of youngsters).

In this example, we are going to use CARTO’s Analytics Toolbox for BigQuery to explore find the best location to open a new Pizza Hut restaurant in Honolulu, Hawaii. For that, we will perform two different spatial analyses based on spatial indices (H3): Commercial Hotspots and Local Outlier Factor.

We'll be using CARTO Workflows to run the majority of this analysis by creating the below pipeline. If you're new to this part of CARTO, we recommend heading the the Creating Workflows section of the Academy to familiarize yourself.

You will need...

You will need to subscribe to two public datasets from the Data Observatory:

Nodes - United States of America from OpenStreetMap. We will use this to establish the locations of both our store and our competitor stores. Learn more about using OpenSteetMap data here.
Sociodemographics, 2018, 5yrs - United States of America (Census Block Group) from the American Community Survey. We'll use this to identify the strength of our target market.

You can adapt this analysis for any country or similar use case by replacing these inputs with demographic and POI inputs relevant for your analysis.

1 Establish an area of study

Head to the Workflows tab of the CARTO Workspace and create a new workflow with any Google BigQuery connection.
Head to Sources > Data Observatory and location each of the above datasets (Nodes and Sociodemographics) in turn, and drag them on to the canvas for later.
We will start by defining an area of interest for our study; in our case we will simply switch from the Sources to Components tab, and locate the Draw Features component. Drag it onto the canvas, select Draw Features in the dialog box on the right and draw a study area. We're drawing a polygon with a roughly 5km radius around Honolulu - but you can draw this for anywhere in the United States.

Prefer to use SQL?

SELECT ST_BUFFER(ST_GEOGPOINT(-157.852587, 21.304390), 5000) AS geom

2 Pizza Huts locations in Honolulu

Next, we will find all Pizza Hut restaurants in Honolulu using OpenStreetMaps’s Planet Nodes dataset, available through CARTO’s Data Observatory.

Drag a Spatial Filter component onto the Canvas. Set the top input as Nodes, and the bottom as your study area.
Connect this to an Unnest component, so we can access the data in the all_tags array field. Run the workflow.
With the data unnested, connect the Unnest component to a Where component, using the condition key = 'brand' AND value = 'Pizza Hut'.
Run again! The result will be all Pizza Huts in our study area.

Prefer to use SQL?

WITH honolulu_buffer AS (
    SELECT ST_BUFFER(ST_GEOGPOINT(-157.852587, 21.304390), 5000) AS geom
)

SELECT 
    tag.value AS brand, 
    d.geometry 
FROM 
    `carto-data.ac_xxxxxx.carto_openstreetmap_pointsofinterest_nodes_gbr_latlon_v1_quarterly_v1` AS d,
    -- remember to replace ac_xxxxxx with your Data Observatory connection ID 
    honolulu_buffer
CROSS JOIN UNNEST(d.all_tags) AS tag 
WHERE 
    ST_CONTAINS(honolulu_buffer.geom, d.geometry) 
    AND tag.key = 'brand' 
    AND tag.value = "Pizza Hut"

3 Polyfill area of study

For our upcoming statistical analysis, we will subdivide the area of study into H3 grid cells of resolution 10 using the H3_POLYFILL function. You can learn about the benefits of using Spatial Indexes like H3 here.

On a separate branch to the Spatial Filter we just used, connect your Draw Features component to a H3 polyfill component. Set the resolution to 10 which is a finer resolution than the default. You may wish to alter this depending on your use case and study area size.
Run!

Prefer to use SQL?

WITH 
    honolulu_buffer AS (SELECT ST_BUFFER(ST_GEOGPOINT(-157.852587, 21.304390), 5000) AS geom)

SELECT h3
    FROM unnest(`carto-un`.carto.H3_POLYFILL(
    (SELECT geom
    FROM honolulu_buffer), 10)) h3

4 Enriching the grid

Our customer is interested in looking for areas with a high density residents under 30 years old who do not have an existing Pizza Hut restaurant nearby.

To perform the demographics analysis, we are going to enrich the H3 grid of our area of interest with a set of population variables from the ACS Sociodemographics dataset at census block group level*, available for free through our Data Observatory. We will make use of the enrichment capabilities of CARTO’s Analytics Toolbox.

Drag an Enrich H3 Grid component onto the canvas.
Set the H3 polyfill as the top input - this is our target layer.
Set Sociodemographics as the bottom input - this is our source layer.
Set the aggregation variables as the 10 fields with male and female populations aged between 18 and 29. Set the aggregation type to SUM.
Run!

*Please note that this data was retired from the in June 2025. You can find similar data products from providers like the ACS by searching for publicly-available demographics data in the Data Observatory.

Prefer to use SQL?

Please note the below CALL procedure can only be undertaken from your BigQuery console; not the CARTO Builder SQL Console.

CALL `carto-un`.carto.DATAOBS_ENRICH_GRID
('h3',
R'''
WITH 
    honolulu_buffer AS (SELECT ST_BUFFER(ST_GEOGPOINT(-157.852587, 21.304390), 5000) AS geom)
SELECT h3
    FROM unnest(`carto-un`.carto.H3_POLYFILL(
    (SELECT geom
    FROM honolulu_buffer), 10)) h3   
''',
'h3',
[('female_15_to_17_eb1658f1', 'sum'), ('female_18_to_19_6d791436', 'sum'), ('female_20_f727dc', 'sum'),
('female_21_77f0174a', 'sum'), ('female_22_to_24_121a63e5', 'sum'), ('female_25_to_29_a90c21d6', 'sum'),
('female_30_to_34_50344313', 'sum'), ('male_15_to_17_8dd9d9ac', 'sum'), ('male_18_to_19_bb6956b', 'sum'),
('male_20_5264b51', 'sum'), ('male_21_72217bc7', 'sum'), ('male_22_to_24_74d5e2b8', 'sum'),
('male_25_to_29_cfc3a08b', 'sum'), ('male_30_to_34_36fbc24e', 'sum')],
NULL,
['yourproject.yourdataset.honolulu_pizza_aos_enriched'],
'carto-data.ac_xxxxxxxx');

If you get stuck in any of the steps of this process, please refer to this guide, where you will find an example notebook that performs an enrichment using Data Observatory data that you can run using your own BigQuery account.

You can see the results of this enrichment below.

5 Processing the enriched data

We are going to process our resulting enriched table to aggregate all the male and female population variables and store the result into a single column.

Connect the Enrich H3 grid component to a Create Column component. Call the new field sum_pop and input the following:

  male_18_to_19_bb6956b_sum +
  male_20_5264b51_sum +
  male_21_72217bc7_sum +
  male_22_to_24_74d5e2b8_sum +
  male_25_to_29_cfc3a08b_sum +
  female_18_to_19_6d791436_sum +
  female_20_f727dc_sum +
  female_21_77f0174a_sum +
  female_22_to_24_121a63e5_sum +
  female_25_to_29_a90c21d6_sum

6 Calculate the minimum distance to a Pizza Hut

Now your workflow should be looking something like this:

In addition to the population, Pizza Hut would like to consider for the analysis the distance to the closest existing Pizza Hut as they would like to avoid cannibalization between their own restaurants. For that, for every cell of our grid we are going to compute the distance to every Pizza Hut restaurant using the H3_DISTANCE function, and then keep the minimum value. Please note that these distances are computed as the number of H3 cells of resolution 10 that separate a pair of locations, and therefore are only an approximation.

Connect the Where component from section 2 to a H3 from Geopoint component, setting the resolution to 10.
Drag a Cross Join component onto the canvas. Connect the H3 from Geopoint component to the top input, and the Create Column component to the bottom input. Run this - you will now have a table which joins all combinations of features from both tables.
Connect this to a H3 Grid Distance component, setting the input fields as H3 and H3_joined respectively.
Now, connect this to a Group by component:
1. Set the group field as H3_joined (i.e. the H3 field from the grid covering your study area).
2. Set the aggregation fields as sum_pop_joined (AVG) and h3_distance (MIN).

The result of this will be a H3 table containing a field with the population and distance to the closest Pizza Hut.

Prefer to use SQL?

WITH 
  honolulu_buffer AS (
    SELECT ST_BUFFER(ST_GEOGPOINT(-157.852587, 21.304390), 5000) AS geom
  ),

  t1 AS (
    SELECT 
      `carto-un`.carto.H3_FROMGEOGPOINT(geometry, 10) AS h3id
    FROM 
      `cartobq.docs.honolulu_planet_nodes` d,
      UNNEST(all_tags) AS tag
    WHERE 
      ST_CONTAINS((SELECT geom FROM honolulu_buffer), geometry)
      AND tag.value IN ("Pizza Hut")
      AND tag.key = 'brand'
  ),

  t2 AS (
    SELECT * 
    FROM `yourproject.yourdataset.honolulu_pizza_aos_enriched_sum`
  )

SELECT 
  t2.h3id, 
  ANY_VALUE(t2.sum_pop) AS sum_pop, 
  MIN(`carto-un`.carto.H3_DISTANCE(t2.h3id, t1.h3id)) AS dist
FROM 
  t1
CROSS JOIN 
  t2
WHERE 
  sum_pop IS NOT NULL
GROUP BY 
  t2.h3id;

7 Find suitable locations

Now, we are going to identify areas that meet Pizza Hut requirements, i.e., locations with large populations aged 15-34 and far from existing Pizza Hut restaurants. We will do this by calculating hotspots for both of these variables.

Connect Group by to a Hotspot Analysis component. Set the following parameters:
1. Column with the unique identifier: H3_joined
2. Input variables: columns containing population and distance values
3. Weights: 0.7, 0.3. Your weights should total 1, be comma separated and written in the order of your input variables.
4. K-ring: 3.
5. Significance level: 0.01

Learn more about hotspot analysis in our guide.

Prefer to use SQL?

CALL `carto-un`.carto.COMMERCIAL_HOTSPOTS(
  'yourproject.yourtable.honolulu_pizza_aos_enriched_sum_wdist',
  NULL,
  'h3',
  'h3',
  ['sum_pop_joined_avg','h3_distance_min'],
  [0.7, 0.3],
   3,
   0.01
);

8 Competitor analysis

Finally, we are going to extract Pizza Hut’s competitors from the OpenStreetMap nodes table and compute the local outlier factor (using the LOF function) to identify those that are very close to one another and those far from the others. This will give us an additional insight to decide where it would be the ideal location to open a new restaurant. You can find a full guide to Local Outlier Factor analysis here.

Move back up the workflow to the Unnest component from section 2. Connect this to a Where component on a new branch of the workflow. Set the condition for this as key = 'amenity' AND value IN ('fast_food', 'restaurant').
At present, Local Outlier Factor is not available as a component in Workflows, so we will need to call the Analytics Toolbox function via SQL. Connect Where to a Custom SQL Select, ensuring you connect to the topmost input. Enter the following SQL code to call the function:

WITH 
lof_output 
AS(SELECT `carto-un`.carto.LOF(ARRAY_AGG(STRUCT(geoid,geom)), 5) as lof 
FROM `$a`
)
SELECT lof.* FROM lof_output, UNNEST(lof_output.lof) AS lof

9 Visual analysis

Now, let's bring all of that analysis together on a map to make a decision! You can select any component in Workflows, open the Map preview (bottom of the screen) and select Create Map to add it to a new Builder map.

However, all workflows results are temporary files which are stored for 30 days. If you want to commit a table permanently, you should use a Save as Table component to save to your lakehouse.

In the map below, we can explore the hotspots from our analysis (pink areas) which would be ideal locations for a new Pizza Hut store. Additionally, we can compare this with the distribution of competitors (orange points).

Last updated 1 month ago

Was this helpful?