# Using crime data & spatial analysis to assess home insurance risk

<div align="left"><figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FhB2W9xXbzzo0kEuXMe3S%2Fintermediate%20banner.png?alt=media&#x26;token=4acd2cc7-c7e8-46c0-9669-6f6b73c030dd" alt="Intermediate difficulty banner" width="175"><figcaption></figcaption></figure></div>

In this tutorial, we'll be using individual crime location data to create a crime risk index. This analysis is really helpful for insurers looking to make more intelligent policy decisions - from customized pricing of premiums to tailored marketing.

### You will need...

* **Crime location data.** We are using data for Los Angeles city (data available [here](https://data.lacity.org/Public-Safety/Crime-Data-from-2020-to-Present/2nrs-mtv8)). Most local governments provide this data as open data, so you should be able to easily adapt this tutorial for your area of interest.
* **Detailed Population data.** We are using [WorldPop](https://www.worldpop.org/) 100m gridded data, which you can subscribe to via our Spatial Data Catalog.&#x20;

***

## Step 1: Sourcing & loading crime data

We’ll be basing our analysis on crime data for Los Angeles city (data available [here](https://data.lacity.org/Public-Safety/Crime-Data-from-2020-to-Present/2nrs-mtv8)).

First, let's load this data into your data warehouse. To do this, head to the [Data Explorer](https://docs.carto.com/carto-user-manual/data-explorer) tab of your **CARTO Workspace:**&#x20;

1. Select **Import data**, then follow the steps to import the table.&#x20;
2. For this dataset we are going to deselect the **Let CARTO automatically define the schema** option on Schema Preview so we can manually select the correct data types for each field. In this example, you want to be sure that latitude and longitude are defined as the type **float64**.

***

## Step 2: Creating a crime grid

Now the data is loaded into our data warehouse, we’ll be building the below workflow to convert the crime locations into a hexagonal [Spatial Index](https://go.carto.com/report-spatial-indexes-101) called [H3](https://carto.com/blog/h3-spatial-indexes-10-use-cases). This process can be used to convert any point dataset into a H3 Index.&#x20;

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FGHDyN5EiZrbMxCxXGaqY%2Fcrime%20risk%20frequency%20grid.png?alt=media&#x26;token=737b88b3-5186-435e-81b8-e253b30da93b" alt="A screenshot of CARTO Workflows"><figcaption></figcaption></figure>

1. With the data downloaded, head to the **Workflows** tab of the CARTO Workspace and select + New Workflow. Use the connection relevant to the location you loaded your data to. Select **Create Workflow**.
2. At the top left of the screen, click on the word Untitled to **rename your workflow** Crime risk.&#x20;
3. You should now be seeing a blank Workflows canvas. The first thing we need to do is load our crime data in. To the left of the window, open the **Sources** tab. Navigate through **Connection data** to the LA crime locations table you just imported, and drag it onto the canvas.&#x20;
4. If you navigate through the **Table** preview (bottom of the window) you'll notice we don't have a geometry column. Let's change that! Switch from the Sources to **Components** window, and search for **ST GeogPoint;** we'll use this to create a point geometry for each crime. Drag this component onto the canvas to the right of the crimes source.&#x20;
5. Connect the right-hand node of the crimes table to the input, left-hand node os ST GeogPoint (this may happen automatically if they're placed close together. Set the latitude and longitude columns as lat and lon respectively - and run the workflow!

At the bottom of the window, select ST GeogPoint and open the **Table** preview again. Scroll right to the end and select **Show Column Stats**.&#x20;

Notice anything weird? The minimum latitude and maximum longitude values are both 0 - which means we have a series of features which are incorrectly sitting in "null island" i.e. longitude, latitude = 0,0. These will skew our subsequent analysis, so let's remove them.

6. Back in **Components**, find **Simple Filter**. Drag this onto the canvas, connecting it to the output of ST Geogpoint. Set the filter condition to latitude does not equal 0, and run. Now let's get on with running our analysis.
7. Now, let's also filter the data to only crimes relevant to home insurance risk. Connect the Simple Filter to a **Select Distinct** component, looking at the column **crm\_cd\_desc**. You can see there are over 130 unique crime codes which we need to filter down.
8. For this filter, as we will have multiple criteria we will instead need to connect a **Where** component to the Simple Filter from step 5. In this Where component, copy and paste the following:

```sql
crm_cd_desc LIKE '%BURGULARY%' OR 
crm_cd_desc LIKE '%THEFT%' OR 
crm_cd_desc LIKE '%VANDALISM%' OR 
crm_cd_desc LIKE '%STOLEN%' OR 
crm_cd_desc LIKE '%ARSON%'
```

5. Connect your Simple Filter to a **H3 from GeoPoint** component, which we'll use to convert each crime to a hexagonal H3 grid cell. Change the resolution to 9 which is slightly more detailed than the default 8.&#x20;
6. In the final step for this section, connect the H3 from GeoPoint component to a **Group by** component. Set the column as H3 and the aggregation as H3 again, with the type COUNT. This will count all duplicate cells, turning our H3 grid into a frequency grid.&#x20;

You can now select the Group by component, open the Map preview tab and select **Create Map** to start exploring your data - here's what ours looks like! Make sure you check out the [Building Interactive Maps](https://academy.carto.com/building-interactive-maps) section of the academy for tutorials on crafting the most impactful maps!

{% embed url="<https://clausa.app.carto.com/map/37f26d11-ac04-4902-a985-75293166ed81>" %}

***

## Step 3: Contextualizing Crime Risk

In this section, we will contextualize the crime counts by calculating the number of crimes per 1,000 residents. First, we need to convert our population data into a H3 Index so we can use it in the same calculation as the crime count.

You can follow the steps in the video below to do this (also outlined below).

{% embed url="<https://youtu.be/Hnof90v2Jr4>" %}
Converting population data to a H3 Index
{% endembed %}

1. If you haven't already, head to the Data Observatory and subscribe to **Population Mosaics, 2020 - United States of America (Grid 100m)**.&#x20;
2. In your workflow, head to **Sources** > Data Observatory > WorldPop and drag the gridded population data onto your canvas. You may need to refresh your workflow if you subscribed to the dataset since you started building.&#x20;
3. Connect this to an [ST Centroid](https://docs.carto.com/carto-user-manual/workflows/components/spatial-operations#st-centroid) to convert each grid cell to a central point.
4. Now, we will use a similar approach to when we converted the crime points to a H3 index. Use  [H3 from GeoPoint](https://docs.carto.com/carto-user-manual/workflows/components/spatial-indexes#h3-from-geopoint) to convert each point geometry to a H3 index; make sure you set the resolution to 9 (the same as the crime count layer).
5. Finally, use the [Group by](https://docs.carto.com/carto-user-manual/workflows/components/aggregation#group-by) component to aggregate the index with the following parameters:
   1. Group by column: H3
   2. Aggregation column: Population, Aggregation type: Sum.

Altogether, this should look something like the below (note how we've used an annotation box to help organize or workflow - you can access these in the **Aa** button at the top of the window).&#x20;

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FfKvQdlCbbqmm4z2cALdz%2Fcrime%20risk%20population%20grid.png?alt=media&#x26;token=2a885587-fce7-4ce3-b128-c20f35c2901b" alt="A screenshot of CARTO Workflows"><figcaption></figcaption></figure>

Now you should have two inputs ready; crime counts and population. Let's bring them together!

1. Add a [Join](https://docs.carto.com/carto-user-manual/workflows/components/joins#join) component, with the Group by component from the previous step as the top (main) input, and and crime count Group by table as the bottom input. Use an **inner join** a type with the columns from both tables as h3.&#x20;
2. Finally, we can calculate the crime rate! Add a [Create Column](https://docs.carto.com/carto-user-manual/workflows/components/data-preparation#create-column) component to do this and input the below formula.

`CASE WHEN population_sum_joined = 0 then 0 ELSE h3_count/(population_sum_joined/1000) END`

3. Use a [Save as Table](https://docs.carto.com/carto-user-manual/workflows/components/input-output#save-as-table) component to commit it.&#x20;

Altogether, your workflow should be looking something like...

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FCjRPVtolnBTyoBeqbQpn%2Fcrime%20risk%20full%20workflow.png?alt=media&#x26;token=d1f5b284-8447-4df4-a075-4178f221b73a" alt="A screenshot of CARTO Workflows"><figcaption></figcaption></figure>

1. Head back to the Builder map you created earlier. Under **Sources** (bottom left), select Add Source from > Data Explorer and navigate to where you saved your table. Add it to the map!
2. Rename the layer Crime rate.
3. Let's style both the Crime count and rate layers in turn by clicking on the layer name in the **Layer** panel:
   1. Reduce the resolutions of both to 6 (as detailed as possible)
   2. Disable the strokes
   3. Change the fill colors to be determined by H3\_Count (average) for the crime count layer, and crime\_rate (average) for the crime rate layer. Pro tip - use different color schemes for both layers, so it's obvious to the user that they aren't directly comparable.&#x20;
4. At the top of the Builder window in **Map views**, turn on **Switch to a dual map view**. Open the legends for each map respectively (at the bottom right of each window) and turn the Crime rates off for the left-hand map and Crime counts off for the right-hand map (or the other way around! You basically only want to see one grid in each map).&#x20;

Check this out below! How do the two compare?

{% embed url="<https://clausa.app.carto.com/map/9e35fc68-a656-496c-b4c5-22811288a92e>" fullWidth="true" %}

***

## Next steps...

Want to take this analysis one step further? Here are some ideas for next steps:

* Calculate crime rate hotspots and outliers with our [Statistics](https://docs.carto.com/data-and-analysis/analytics-toolbox-for-bigquery/sql-reference/statistics) tools
* Assess property-level home risk by joining your results to property data, such as [Overture Maps buildings](https://carto.com/blog/overture-maps-data-now-on-the-cloud-use-it-with-carto)

{% embed url="<https://clausa.app.carto.com/map/21581186-e4bb-4bb2-b4d4-5519c6117822>" fullWidth="true" %}
Building-level crime rates
{% endembed %}

Learn more about this process in our blog [Using crime data & spatial analysis to assess home insurance risk](https://carto.com/blog/crime-data-spatial-analysis-home-insurance-risk).&#x20;


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://academy.carto.com/creating-workflows/step-by-step-tutorials/using-crime-data-and-spatial-analysis-to-assess-home-insurance-risk.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
