Identify buildings in areas with a deficit of cell network antennas
In this tutorial, we will learn to identify areas with a deficit of cell network antennas. We will identify busy areas, i.e., areas with a lot of human activity, and then verify whether the number of antennas in these locations is enough to satisfy demand while providing a high-quality service.
This analysis will be based on three main sources:
Overture Maps: contains topographic data standardized across global administrative boundaries. We will use their Buildings dataset, made up of over 2.3 billion features.
CARTO Spatial Features: provides derived variables across a wide range of themes including demographics, points of interest, and climatology data with global coverage. We will focus on the derived human activity index, a proxy for busy areas.
OpenCelliD: an open database of cell towers located worldwide.
We will run the analysis for the city of Madrid. If you'd like to replicate it for another study area, make sure to subscribe to the Overture Maps and Spatial Features datasets, which are available globally in our Data Observatory, and to update your cell tower data accordingly (OpenCelliD data can be downloaded from the OpenCelliD website).
Sign in to CARTO at app.carto.com
Head to the Workflows tab and click on Create new workflow
Choose the CARTO Data Warehouse connection or any connection to your Google BigQuery project.
Now, let’s dive into the step-by-step process of creating a workflow to pinpoint high-traffic areas that are lacking mobile phone antennas, and discover which buildings are the best candidates for antenna installation.
Let's import the data into the canvas. First, we will load the Spatial Features dataset from the Sources left-side menu by selecting Data Observatory > CARTO > Spatial Features - Spain (H3 Resolution 8) [v3] and drag-and-drop it into the canvas. Make sure you are subscribed to this dataset (you can follow this tutorial to learn how).
Now, from the Components left-side menu, we will use the Get Table by Name component to load some data we've made publicly available in BigQuery.
First, we will load a sample of the Overture Maps buildings data, which contains all the building geometries in Madrid, by typing cartobq.docs.buildings_mad as the source table FQN.
You can also subscribe to the Overture Maps' Buildings - Global dataset, publicly available in the CARTO Data Observatory, then drag-and-drop the full source into the canvas as we previously did for Spatial Features.
Secondly, we will import the geometry of our Area of Interest (AOI), which will help focus our analysis only within Madrid. The FQN of this data is cartobq.docs.madrid_districts.
Now, we will import the cell towers data using the Import from URL component. We have filtered the OpenCelliD data to keep only the 4G mobile phone antennas we are interested in, and made the sample publicly accessible through a Google Cloud Storage bucket. Copy the following URL to import the source: https://storage.googleapis.com/data_science_public/workflow_templates/cell_towers_madrid.csv
Before we begin the analysis, we need to standardize all our data to a common geographical reference. This lets us integrate the sources seamlessly and keeps the spatial analysis consistent and the results reliable. We will use Spatial Indexes as our reference system: since the Spatial Features dataset is already in H3, we will convert the other sources to match this format. If you want to learn more about Spatial Indexes, take a look at our Spatial Indexes 101 Report!
To transform the telco data into H3, we will count the number of cell towers within each H3 cell:
Extract the H3 cell associated with each cell tower's location by connecting the cell tower data source to the H3 from GeoPoint component. Select geom as the points column and 8 as the resolution.
Use the Group by component to group by h3 and aggregate the cell tower id values using COUNT.
Rename the resulting id_count column as cell_towers using the Rename Column component.
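The three steps above can be sketched outside of Workflows as a simple group-and-count. This is only an illustration in plain Python: it assumes each tower has already been assigned an H3 cell at resolution 8, and the cell ids and tower ids below are made-up placeholders rather than real Madrid data.

```python
from collections import Counter

# Hypothetical sample: (tower id, H3 cell id at resolution 8).
# "cell_a" and "cell_b" are placeholders, not real H3 indexes.
towers = [
    (101, "cell_a"),
    (102, "cell_a"),
    (103, "cell_b"),
    (104, "cell_a"),
]


def count_towers_per_cell(rows):
    """Group by H3 cell and count tower ids (Group by + COUNT)."""
    counts = Counter(h3 for _tower_id, h3 in rows)
    # Emit the count under the name cell_towers (the Rename Column step).
    return [{"h3": h3, "cell_towers": n} for h3, n in sorted(counts.items())]


cell_tower_counts = count_towers_per_cell(towers)
```

Cells that contain no towers do not appear in this output at all, which is why the join type matters when the counts are later combined with the Area of Interest.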
Next, we will enrich the Area of Interest with all the necessary data:
Connect the AOI source to the H3 Polyfill component to generate a table with the indices of all H3 cells of resolution 8 included within the AOI geo-column geom. Use the Intersects mode.
Then, use the Join component to combine the polyfilled AOI with the Spatial Features data, using the h3 column as the key for both sources. Select Inner as the join type to keep only those H3 cells that are common to both tables. Then, remove the h3_joined column using the Drop Columns component.
Now, use another Join to combine the resulting table with the aggregated cell tower counts. Again, use the h3 columns as keys, but make sure to select the appropriate join type: we want to keep every H3 cell in Madrid and attach cell tower information where it exists. In this case, we have connected the AOI as the main table, so we will perform a Left join.
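The difference between the two join types can be sketched with plain Python dictionaries. The helper functions, cell ids, and values below are hypothetical and only mimic the behaviour described above; they are not the Join component's implementation.

```python
def inner_join(left, right, key="h3"):
    """Keep only rows whose key appears in both tables."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


def left_join(left, right, key="h3"):
    """Keep every row of the main (left) table; unmatched rows get None."""
    right_by_key = {row[key]: row for row in right}
    right_cols = {col for row in right for col in row if col != key}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        extra = {col: (match.get(col) if match else None) for col in right_cols}
        joined.append({**row, **extra})
    return joined


# Hypothetical inputs with placeholder cell ids.
aoi = [{"h3": "cell_a"}, {"h3": "cell_b"}, {"h3": "cell_c"}]
features = [
    {"h3": "cell_a", "human_activity_index": 0.9},
    {"h3": "cell_b", "human_activity_index": 0.4},
    {"h3": "cell_c", "human_activity_index": 0.7},
    {"h3": "cell_z", "human_activity_index": 0.1},  # outside the AOI
]
counts = [
    {"h3": "cell_a", "cell_towers": 3},
    {"h3": "cell_b", "cell_towers": 1},
]

enriched = left_join(inner_join(aoi, features), counts)
```

In this toy data, cell_z is dropped by the inner join, while cell_c survives the left join with an empty cell_towers value because no tower falls inside it.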
The aim of the analysis is to identify busy areas, i.e., areas with a lot of human activity, and then verify whether the number of antennas in these locations is enough to satisfy demand while providing a high-quality service. To do this, we will:
Select the variables of interest. Since we are looking for areas with high human activity and a low number of cell towers, we need to reverse the cell tower counts so that high values mean low counts. To do this, use the Create Column component to compute cell_towers_inv, a proxy for the lack of antennas, by typing a query that inverts the counts. Then use the Edit Schema component to select the variables h3, cell_towers_inv and human_activity_index.
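The tutorial's own query is not reproduced here. As one possible way to reverse the counts, the sketch below subtracts each count from the maximum count and treats cells without towers as zero. Both choices are assumptions made for illustration, and the input rows are hypothetical.

```python
def invert_counts(rows):
    """Return h3, cell_towers_inv and human_activity_index for each row.

    Assumption: the inversion is max(count) - count, with missing
    counts (None) treated as 0 towers.
    """
    counts = [row["cell_towers"] or 0 for row in rows]
    max_count = max(counts, default=0)
    return [
        {
            "h3": row["h3"],
            "cell_towers_inv": max_count - count,
            "human_activity_index": row["human_activity_index"],
        }
        for row, count in zip(rows, counts)
    ]


# Hypothetical joined rows with placeholder cell ids.
joined_rows = [
    {"h3": "cell_a", "human_activity_index": 0.9, "cell_towers": 3},
    {"h3": "cell_b", "human_activity_index": 0.4, "cell_towers": 1},
    {"h3": "cell_c", "human_activity_index": 0.7, "cell_towers": None},
]

selected = invert_counts(joined_rows)
```

With this inversion, the cell with no towers gets the highest cell_towers_inv value, so larger values consistently mean a greater lack of antennas.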
Create a spatial score that combines high human activity with the lack of antennas. Use the Composite Score Unsupervised component with the CUSTOM_WEIGHTS scoring method to combine both variables with equal weights through a weighted average. Select STANDARD_SCALER as the scaling method and a LINEAR aggregation. For more details about Composite Scores, take a look at our step-by-step tutorial!
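To make the scoring idea concrete, here is a minimal sketch of standard scaling followed by an equal-weight linear combination. It assumes the scaler subtracts the mean and divides by the population standard deviation; the component may differ in such details, and the function names and input values are hypothetical.

```python
from statistics import fmean, pstdev


def standard_scale(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean, std = fmean(values), pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(value - mean) / std for value in values]


def composite_score(columns, weights):
    """Weighted average of the standard-scaled columns, row by row."""
    total_weight = sum(weights.values())
    scaled = {name: standard_scale(values) for name, values in columns.items()}
    n_rows = len(next(iter(scaled.values())))
    return [
        sum(weights[name] * scaled[name][i] for name in scaled) / total_weight
        for i in range(n_rows)
    ]


# Hypothetical values for three cells.
columns = {
    "cell_towers_inv": [0, 2, 3],
    "human_activity_index": [0.9, 0.4, 0.7],
}
equal_weights = {"cell_towers_inv": 0.5, "human_activity_index": 0.5}

scores = composite_score(columns, equal_weights)
```

Scaling first puts both variables on a comparable footing, so the tower counts, which have a much larger range than the index, do not dominate the average.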
Compute the Getis Ord statistic to identify statistically significant spatial clusters of high values (hot spots, lack of coverage) and low values (cold spots, sufficient coverage). Use the Getis Ord component with a uniform kernel of size 1.
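For intuition, the sketch below computes the textbook Getis-Ord Gi* statistic with a uniform kernel of size 1, meaning each cell and its immediate neighbors all get weight 1. The adjacency dictionary is a hypothetical stand-in for H3 neighbor lookups, the two-sided p-value from the standard normal distribution is an assumption, and the component's own handling of edge cases may differ.

```python
from math import sqrt
from statistics import NormalDist


def getis_ord_gi_star(values, neighbors):
    """Gi* z-score and two-sided p-value per cell, uniform weights of 1.

    values: {cell_id: score}
    neighbors: {cell_id: [adjacent cell ids]}, not including the cell itself.
    """
    n = len(values)
    if n < 2:
        return {cell: {"gi": 0.0, "p_value": 1.0} for cell in values}
    mean = sum(values.values()) / n
    variance = sum(v * v for v in values.values()) / n - mean ** 2
    s = sqrt(max(variance, 0.0))
    normal = NormalDist()
    results = {}
    for cell in values:
        # Kernel of size 1: the cell plus its immediate neighbors.
        ring = [cell] + [nb for nb in neighbors.get(cell, []) if nb in values]
        w_sum = len(ring)  # with weights of 1, sum(w) equals sum(w^2)
        local_sum = sum(values[c] for c in ring)
        denom = s * sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        gi = (local_sum - mean * w_sum) / denom if denom > 0 else 0.0
        p_value = 2 * (1 - normal.cdf(abs(gi)))
        results[cell] = {"gi": gi, "p_value": p_value}
    return results


# Toy example: five placeholder cells in a chain (real H3 cells have
# up to six neighbors each).
cell_scores = {"cell_a": 10, "cell_b": 9, "cell_c": 1, "cell_d": 0, "cell_e": 0}
adjacency = {
    "cell_a": ["cell_b"],
    "cell_b": ["cell_a", "cell_c"],
    "cell_c": ["cell_b", "cell_d"],
    "cell_d": ["cell_c", "cell_e"],
    "cell_e": ["cell_d"],
}

hotspots = getis_ord_gi_star(cell_scores, adjacency)
```

Positive gi values mark cells surrounded by high scores (potential hot spots) and negative values mark cells surrounded by low scores; the p-value indicates how unlikely such a neighborhood would be under spatial randomness.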
Identify potential buildings for new antennas using the Enrich Polygons component. Notice that we need to work with geometries here, so we will first get the boundaries of the Getis Ord H3 cells using the H3 Boundary component. Enrich the buildings by aggregating the gi value with AVG and the p_value, which represents the significance of the statistic, with MAX.
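The aggregation logic of this step can be sketched as follows. The sketch assumes the geometric intersection has already been resolved into a mapping from each building to the H3 cells it touches, and it uses a plain unweighted average. The output names gi_avg and p_value_max, the building ids, and all values are hypothetical, and the component itself may weigh intersecting areas differently.

```python
from statistics import fmean


def enrich_buildings(building_cells, cell_stats):
    """Aggregate gi with AVG and p_value with MAX for each building.

    building_cells: {building_id: [ids of H3 cells the building intersects]}
    cell_stats: {cell_id: {"gi": float, "p_value": float}}
    """
    enriched = {}
    for building_id, cells in building_cells.items():
        stats = [cell_stats[cell] for cell in cells if cell in cell_stats]
        if not stats:
            continue  # building does not touch any scored cell
        enriched[building_id] = {
            "gi_avg": fmean(s["gi"] for s in stats),
            "p_value_max": max(s["p_value"] for s in stats),
        }
    return enriched


# Hypothetical inputs with placeholder ids.
cell_stats = {
    "cell_a": {"gi": 2.0, "p_value": 0.04},
    "cell_b": {"gi": 1.0, "p_value": 0.30},
}
building_cells = {
    "bldg_1": ["cell_a"],
    "bldg_2": ["cell_a", "cell_b"],
    "bldg_3": ["cell_q"],  # outside the analyzed cells
}

buildings = enrich_buildings(building_cells, cell_stats)
```

Taking the maximum p-value is the conservative choice: a building that straddles several cells only looks significant if every one of those cells is.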
To visualize the results correctly, we will use the Create Vector Tileset component to create a tileset, which allows you to process and visualize very large spatial datasets stored in BigQuery. Use 10 and 16 as the minimum and maximum zoom levels, respectively.
The following map lets you identify busy areas with a shortage of mobile phone antennas and determine the most suitable buildings for antenna placement.
We can see that the busy city center of Madrid is densely packed with cell towers, enough to satisfy demand. Locations with little human activity (like El Pardo park) also have enough network capacity to provide service. However, the outskirts of the city seem to be lacking antennas, based on the overall human activity and cell tower presence patterns in Madrid.