Creating a composite score for fire risk

In this tutorial, we'll share a low-code approach to calculating a composite score using Spatial Indexes. This approach is ideal for creating numeric indicators that combine multiple concepts. In this example, we'll be combining climate data and historic fire extents to calculate fire risk - but you can apply these concepts to a wide range of scenarios, from market suitability for your new product to accessibility scores for a service that you offer.

You will need...

  • Climate data. Fires are most likely to start and spread in areas of high temperatures and high wind. We can access this information from our Spatial Features data - a global grid containing various climate, environmental, economic and demographic data. You can subscribe to this from the Data Observatory, or access the USA version of this data in the CARTO Data Warehouse.

  • USA Counties data. This can also be subscribed to from the Data Observatory, or accessed via the CARTO Data Warehouse.

  • Historic fires data. We’ll be using the LA County Historic Fires Perimeter data to understand areas where fires have been historically prevalent. You can download this data as a GeoJSON here.

We’ll be creating the workflow below:


Step 1: Climate data

Before running our composite score analysis, we need to first filter the Spatial Features data to our area of interest (LA County). The climate data we are interested in is also reported at monthly levels, so we need to aggregate the variables to annual values.

We’ll be running this initial section of the workflow in this step.

💡 You can run the workflow at any point, or wait until the end and run it then! Only components affected by your edits will be re-executed each time you run; unchanged components reuse their previous results.

  1. Set up: First, in your CARTO Workspace, head to Workflows and select Create a workflow, using the CARTO Data Warehouse connection.

  2. In the workflow, open the Sources panel (left of the screen); under Connection you’ll see the CARTO Data Warehouse. Navigate to demo data > demo tables and locate usa_counties and derived_spatialfeatures_usa_h3res8_v1_yearly_v2. Drag both onto the canvas.

  3. Beside Sources, switch to Components. Search for and drag a Simple Filter onto the canvas, then connect the usa_counties source to it. Set the filter so that name is equal to Los Angeles.

  4. Next, connect the Simple Filter to an H3 Polyfill component, ensuring the resolution is set to 8. This will create an H3 grid across LA County, which we can use to filter the climate data to this area.

  5. Connect the H3 Polyfill output to the top input of a Join component and the Spatial Features source to the bottom input. Ensure both the main and secondary table join fields are set to h3 (this should be autodetected), and set the join type to Left. This will keep only the features from the USA-wide Spatial Features source that are also found in the H3 Polyfill output, i.e. only the cells in Los Angeles County.

  6. Now, we want to use two consecutive Create Column components to create two new fields (a SQL sketch of this branch follows the list). 💡 Please note that if you are using a data warehouse other than Google BigQuery, the SQL syntax for these calculations may need to be slightly different.

    1. Temp_avg for the average temperature: (tavg_jan_joined + tavg_feb_joined + tavg_mar_joined + tavg_apr_joined + tavg_may_joined + tavg_jun_joined + tavg_jul_joined + tavg_aug_joined + tavg_sep_joined + tavg_oct_joined + tavg_nov_joined + tavg_dec_joined) / 12

    2. Connected after the first, Wind_avg for the average wind speeds: (wind_jan_joined + wind_feb_joined + wind_mar_joined + wind_apr_joined + wind_may_joined + wind_jun_joined + wind_jul_joined + wind_aug_joined + wind_sep_joined + wind_oct_joined + wind_nov_joined + wind_dec_joined) / 12

  7. Finally, connect the second Create Column to an Edit schema component, selecting the columns h3, temp_avg and wind_avg.
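
For reference, here is a minimal BigQuery SQL sketch of what this branch of the workflow computes. It assumes a hypothetical table la_h3 holding one row per resolution-8 H3 cell covering LA County (the H3 Polyfill output) and the Spatial Features demo table with monthly tavg_* and wind_* columns; table and column names are illustrative and may differ in your environment.

```sql
-- Sketch only: join LA County H3 cells to Spatial Features and
-- aggregate the monthly climate columns to annual averages.
SELECT
  la.h3,
  (sf.tavg_jan + sf.tavg_feb + sf.tavg_mar + sf.tavg_apr
   + sf.tavg_may + sf.tavg_jun + sf.tavg_jul + sf.tavg_aug
   + sf.tavg_sep + sf.tavg_oct + sf.tavg_nov + sf.tavg_dec) / 12 AS temp_avg,
  (sf.wind_jan + sf.wind_feb + sf.wind_mar + sf.wind_apr
   + sf.wind_may + sf.wind_jun + sf.wind_jul + sf.wind_aug
   + sf.wind_sep + sf.wind_oct + sf.wind_nov + sf.wind_dec) / 12 AS wind_avg
FROM la_h3 AS la
LEFT JOIN derived_spatialfeatures_usa_h3res8_v1_yearly_v2 AS sf
  USING (h3)
```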

Next up, we'll factor historic wildfire data into our analysis.


Step 2: Historic wildfires

In this step, we'll calculate the number of historic fires which have occurred in each H3 cell.

  1. Locate the LA County Historic Fires Perimeter dataset where you’ve downloaded it and drag the file directly onto your workflow canvas. Alternatively, you can import it into your cloud data warehouse and drag it in via Sources.

  2. Like we did with the LA County boundary, use another H3 Polyfill (resolution 8) to create an H3 grid across the historic fires. Make sure you enable Keep input table columns; this will create duplicate H3 cells where multiple fire polygons overlap.

  3. Run the workflow!

  4. With a Group by component, set the Group by column to H3 and the aggregation to H3 (COUNT) to count the number of duplicate H3 cells, i.e. the number of fires which have occurred in each area.

  5. Now, drag a Join onto the canvas; connect the Group by to the bottom input and the Edit schema component from Step 1.7 to the top input. The join type should be Left and both input columns should be H3.

  6. Do you see all those null values in the h3_count_joined column? We need to turn those into zeroes, indicating that no fires occurred in those locations. Add a Create Column component and use the calculation coalesce(h3_count_joined, 0) to do this, calling the new column wildfire_count (see the SQL sketch below).
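
A SQL sketch of this step, assuming a hypothetical fires_h3 table holding one row per (fire polygon, H3 cell) pair — the H3 Polyfill output with Keep input table columns enabled — and climate_h3 as the output of Step 1; names are illustrative.

```sql
-- Sketch only: count overlapping fire polygons per cell, then left-join
-- onto the climate grid, turning nulls (no fires) into zeroes.
WITH fire_counts AS (
  SELECT h3, COUNT(*) AS h3_count
  FROM fires_h3
  GROUP BY h3
)
SELECT
  c.h3,
  c.temp_avg,
  c.wind_avg,
  COALESCE(f.h3_count, 0) AS wildfire_count
FROM climate_h3 AS c
LEFT JOIN fire_counts AS f
  USING (h3)
```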


Step 3: Creating a composite score

There are two main methods for calculating a composite score. Unsupervised scoring (which this tutorial will focus on) consists of aggregating a set of variables, scaled and weighted accordingly, whilst supervised scoring leverages a regression model to relate an outcome of interest to a set of variables and, based on the model residuals, focuses on detecting areas of under- and over-prediction. You can find out more about both methods and which to use when here, and access pre-built workflow templates here.

There are three main approaches to unsupervised scoring:

  • Principal Component Analysis (PCA): This method derives weights by maximizing the variation in the data. It is ideal when expert knowledge is lacking, the sample size is large enough, and extreme values are not outliers.

  • Entropy: This method derives weights from the entropy of the proportion of each variable. Like PCA, it is well suited to those without expert domain knowledge.

  • Custom Weights: Recommended for those with expert knowledge of their data and domain, this method allows users to customize both the scaling and aggregation functions, along with defining a set of weights, enabling a tailored approach to scoring that incorporates domain-specific insights.

We'll be using Custom Weights here.

  1. First, we need to drop all superfluous columns. With a Drop Columns component, drop all fields apart from h3, temp_avg, wind_avg and wildfire_count.

  2. Connect this to a Composite Score Unsupervised component, using the Custom Weights method, and set the following parameters:

    1. Set the weights as: temp_avg = 0.25, wind_avg = 0.25, wildfire_count = 0.5. Alternatively, choose your own weights to see how this affects the outcome!

    2. Leave the user-defined scaling as min-max and the aggregation function as linear, but change the output formatting to jenks. This will partition the results into classes by minimizing within-class variance and maximizing between-class variance. Keep the number of buckets as 5 - and run! (A SQL sketch of the scoring logic follows this list.)
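
Under the hood, Custom Weights with min-max scaling and linear aggregation boils down to a weighted sum of rescaled variables. The sketch below illustrates that idea in BigQuery SQL — it is not the component's actual implementation. risk_inputs is a hypothetical name for the Drop Columns output, and the Jenks bucketing applied by the output formatting is omitted.

```sql
-- Illustrative sketch: min-max scale each variable to [0, 1], then
-- combine with the custom weights (0.25 / 0.25 / 0.5).
WITH stats AS (
  SELECT
    MIN(temp_avg) AS temp_min, MAX(temp_avg) AS temp_max,
    MIN(wind_avg) AS wind_min, MAX(wind_avg) AS wind_max,
    MIN(wildfire_count) AS fire_min, MAX(wildfire_count) AS fire_max
  FROM risk_inputs
)
SELECT
  r.h3,
  0.25 * SAFE_DIVIDE(r.temp_avg - s.temp_min, s.temp_max - s.temp_min)
  + 0.25 * SAFE_DIVIDE(r.wind_avg - s.wind_min, s.wind_max - s.wind_min)
  + 0.50 * SAFE_DIVIDE(r.wildfire_count - s.fire_min, s.fire_max - s.fire_min)
    AS spatial_score
FROM risk_inputs AS r
CROSS JOIN stats AS s
```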

Once complete, head into the map preview and select Create map. Set the fill color of your grid to be determined by the spatial score and add some widgets to help you explore the results.

With historic fires and climate data factored into our risk score, we can begin to understand the complex concept of risk. For instance, risk is considered much higher around Malibu, the location of the famous 2018 Woolsey Fire, but low to the southeast of the county.

Check out how we’ve used a combination of widgets & interactive pop-ups to help our user interpret the map - head over to the Data visualization section to learn more about how you can do this!
