# Analyzing Airbnb ratings in Los Angeles

## Context <a href="#context" id="context"></a>

<div align="left"><figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FhB2W9xXbzzo0kEuXMe3S%2Fintermediate%20banner.png?alt=media&#x26;token=4acd2cc7-c7e8-46c0-9669-6f6b73c030dd" alt="Intermediate difficulty banner" width="175"><figcaption></figcaption></figure></div>

Founded in 2008, Airbnb has quickly gained global popularity among travelers. To improve the service, it is pivotal to identify the determinants of listing success and their role in drawing tourism. Users rate properties on criteria such as accuracy, communication, cleanliness, location, check-in, and value.

This tutorial aims to extract insights into Airbnb users' overall impressions by relating the overall rating score to distinct variables, while taking into account the behavior of geographical neighbors through a Geographically Weighted Regression model.

We'll also dive into the regions where location ratings significantly influence the overall score and enrich this analysis with sociodemographic data from CARTO's Data Observatory.

This tutorial will take you through the following sections:

1. [Visualizing Airbnb listings](#visualizing-airbnb-listings)
2. [Aggregating Airbnb data to a H3 grid](#aggregating-data-to-a-h3-grid)
3. [Enriching the grid with demographic data](#enriching-the-grid-with-demographic-data)
4. [Estimating the influence of variables on the score](#estimating-the-influence-of-variables-on-the-score)

***

## **Step-by-Step Guide:**

### **Visualizing Airbnb listings**

1. Access the **Maps** section from your CARTO Workspace using the navigation menu and create a **New Map**.

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/jDxGUWGJyCTd3g8YC2Yl/image.png" alt=""><figcaption></figcaption></figure>

2. Add Los Angeles Airbnb data from CARTO Data Warehouse.
   * Select the **Add source from** button at the bottom left of the page.
   * Click on the **CARTO Data Warehouse** connection.
   * Navigate through demo data > demo tables to **losangeles\_airbnb\_data** and select **Add source**.
3. Let's add some basic styling! Rename the map to `Map 1 Airbnb initial data exploration`. Then click on Layer 1 in the **Layers** panel and apply the following:
   * Name (select the three dots next to the layer name): `Airbnb listings`
   * Color: your pick!
   * Outline: white, 1px stroke
   * Radius: 3
4. Switch from Layers to Interactions at the top left of the UI. Enable interactions for the layer.
   * Select a style for the pop-up window; we'll use light.
   * From the drop-down menu, select the variable price\_num.
   * Select # to format the numbers as dollars. In the box to the right, rename the field Price per night.

You should have something that looks a little like this 👇

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FNeaM7fY6uE8LgWSva7fh%2FAirbnbs%20popup.png?alt=media&#x26;token=434d326e-5ff7-4c46-8b50-452cf7f0339e" alt="A screenshot of CARTO Builder"><figcaption></figcaption></figure>

We will now inspect how Airbnb listings are distributed across Los Angeles and aggregate the raw data to better understand how different variables vary geographically within the city.

To do that, we'll add a new data source that visualizes the Airbnb listings using an H3 grid.

***

### Aggregating data to a H3 grid

Now let's aggregate this data to a H3 [Spatial Index](https://academy.carto.com/working-with-geospatial-data/introduction-to-spatial-indexes) grid. This approach has multiple advantages:

* Ease of interpreting spatial trends on your map
* Ability to easily enrich that grid with multiple data sources
* Suitability for spatial modelling like Geographically Weighted Regression...

...all of which we'll be covering in this tutorial!

{% embed url="<https://academy.carto.com/working-with-geospatial-data/introduction-to-spatial-indexes>" %}

1. In the CARTO Workspace, head to **Workflows** and select **+ New Workflow**, using the **CARTO Data Warehouse** connection.
2. At the top left of the new workflow, rename the workflow "Airbnb analysis."
3. In the **Sources** panel (left of the window), navigate to Connection Data > demo data > demo\_tables and drag losangeles\_airbnb\_data onto the canvas.
4. Switch from Sources to **Components**, and locate **H3 from GeoPoint**. Drag this onto the canvas to the right of losangeles\_airbnb\_data and connect the two together. Set the H3 resolution to 8. This will assign each Airbnb location the H3 grid cell it falls within.
5. Back in Components, locate **Group by**. Drag this to the right of H3 from GeoPoint, connecting the two. We'll use this to create a frequency grid and aggregate the input numeric variables:
   1. Set the **Group by** field to H3.
   2. For the aggregation columns, set review\_scores\_cleanliness, review\_scores\_location, review\_scores\_value, review\_scores\_rating and price\_num to AVG. Add a final aggregation column which is H3 - COUNT (see below).

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FDVKX1Utlwv8bYLa9vcNv%2FAirbnbs%20to%20h3.png?alt=media&#x26;token=d8195259-be27-475d-8964-e89e51ec7de7" alt=""><figcaption></figcaption></figure>

6. Connect this **Group by** component to a **Rename column** component, renaming h3\_count to airbnb\_count.
7. Finally, connect the Rename column component to a **Save as Table** component, saving this to CARTO Data Warehouse > Organization > Private and calling it airbnb\_h3r8. If you haven't already, run your workflow!

<details>

<summary>Prefer to use SQL?</summary>

You can replicate this in the CARTO Builder SQL console with the following code:

```sql
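WITH h3_airbnb AS (
  -- Assign each listing to its H3 cell at resolution 8
  SELECT
    `carto-un`.carto.H3_FROMGEOGPOINT(geom, 8) AS h3,
    *
  FROM
    `carto-demo-data.demo_tables.losangeles_airbnb_data`
),

aggregated_h3 AS (
  -- Average the numeric variables and count the listings in each cell.
  -- Note: these aliases differ from the column names the workflow produces
  -- (for example price_num_avg and airbnb_count).
  SELECT
    h3,
    ROUND(AVG(price_num), 2) AS price,
    ROUND(AVG(review_scores_rating), 2) AS overall_rating,
    ROUND(AVG(review_scores_value), 2) AS value_vs_price_rating,
    ROUND(AVG(review_scores_cleanliness), 2) AS cleanliness_rating,
    ROUND(AVG(review_scores_location), 2) AS location_rating,
    COUNT(*) AS total_listings
  FROM
    h3_airbnb
  GROUP BY
    h3
)

SELECT * FROM aggregated_h3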
```

</details>
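
If it helps to see what the **Group by** step is doing conceptually, here is a minimal Python sketch. It is only an illustration: the cell ids and values are made up, and only two of the numeric columns are included to keep it short. The output column names follow the ones used in this tutorial (`price_num_avg`, `review_scores_rating_avg`, `airbnb_count`).

```python
from collections import defaultdict

# Hypothetical sample rows: (H3 cell id, price_num, review_scores_rating).
# The cell ids are made-up labels, not real H3 indexes.
listings = [
    ("cell_a", 100.0, 4.5),
    ("cell_a", 140.0, 5.0),
    ("cell_b", 80.0, 4.0),
]


def aggregate_by_cell(rows):
    """Average each numeric column per cell and count the listings."""
    grouped = defaultdict(list)
    for cell, price, rating in rows:
        grouped[cell].append((price, rating))

    result = {}
    for cell, values in grouped.items():
        n = len(values)
        result[cell] = {
            "price_num_avg": round(sum(v[0] for v in values) / n, 2),
            "review_scores_rating_avg": round(sum(v[1] for v in values) / n, 2),
            "airbnb_count": n,
        }
    return result


grid = aggregate_by_cell(listings)
print(grid["cell_a"])
```

Each cell ends up with one row, which is what makes the grid easy to style and to join with other data later on.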

Now, head back to the CARTO Builder map that we created earlier. Add the H3 aggregation table that you just created to the map (Sources > Add source from > Data Explorer > CARTO Data Warehouse > Organization > Private).

Let's style the new layer:

* Name: `H3 Airbnb aggregation`
* Order in display: 2
* Fill color: 6 steps blue-yellow ramp based on column `price_num_avg` using Quantile color scale.
* No stroke
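
Wondering what a Quantile color scale with 6 steps means in practice? The sketch below is a rough illustration using Python's standard library: it computes 5 break points so that each of the 6 color steps holds a similar number of cells. The prices are made-up values, and the interpolation method is an assumption; Builder may compute its breaks differently.

```python
import statistics

# Hypothetical average prices per H3 cell (illustrative values only).
prices = [60, 75, 80, 95, 110, 120, 150, 180, 220, 300, 450, 900]

# A 6-step quantile scale needs 5 break points; each step then holds
# roughly the same number of cells, regardless of how skewed the prices are.
breaks = statistics.quantiles(prices, n=6, method="inclusive")


def color_step(value, cut_points):
    """Return the 0-based index of the color step a value falls into."""
    return sum(value > cut for cut in cut_points)


steps = [color_step(price, breaks) for price in prices]
print(breaks)
print(steps)
```

With skewed data like prices, this keeps a handful of very expensive cells from pushing everything else into a single color.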

Do you notice how it's difficult to see the grid beneath the Airbnb point layer? Let's enable zoom-based visibility to fix that, so we only see the points as we zoom in further. Go into the layer options for Airbnb listings and set the **Visibility by zoom level** to 11-21.

You might also find the basemap more difficult to read now we have a grid layer covering it. Head to the basemaps panel (to the right of Layers) and switch to Google Maps > Positron. You'll now notice some of the labels sit on top of your grid data.

Now, let's try looking at this in 3D! At the center-top of the whole screen, switch to 3D view - then in H3 Airbnb aggregation:

* Toggle the *Height* button and style this parameter using:
  * Column: airbnb\_count (SUM)
  * Height scale: sqrt
  * Value: 50
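
Why a square-root height scale? The generic sketch below (not Builder's rendering code) compares how linear and square-root scaling treat the same made-up listing counts. The `factor` argument is a hypothetical multiplier used only for illustration.

```python
import math

# Hypothetical listing counts for a quiet cell and a very busy one.
quiet_cell, busy_cell = 1, 400


def linear_height(count, factor=1.0):
    """Height grows in direct proportion to the count."""
    return count * factor


def sqrt_height(count, factor=1.0):
    """Height grows with the square root of the count."""
    return math.sqrt(count) * factor


linear_ratio = linear_height(busy_cell) / linear_height(quiet_cell)
sqrt_ratio = sqrt_height(busy_cell) / sqrt_height(quiet_cell)
print(linear_ratio, sqrt_ratio)
```

The square-root version shrinks the gap between the busiest and quietest cells, so low-count cells stay visible next to the tallest ones.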

Inspect the map results carefully. Notice where most listings are located and where the areas with the highest prices are. Optionally, play with different variables and color ramps.

Now let's start to dig a little deeper into our data!

***

### Enriching the grid with demographic data

So far we have seen how Airbnb listing locations and their main variables are distributed across the city of Los Angeles. Next, we will enrich our visualization by adding the `CARTO Spatial Features H3 at resolution 8` dataset from the [CARTO Data Observatory](https://docs.carto.com/data-observatory/overview/getting-started/).

This dataset holds information that can be useful for exploring the influence of different factors, including variables such as the total population, the urbanity level or the presence of certain types of points of interest in different areas.

1. In the CARTO Workspace, click on ‘Data Observatory’ to browse the [Spatial Data Catalog](https://docs.carto.com/data-observatory/guides/accessing-and-browsing-the-spatial-data-catalog/) and apply these filters:

* Countries: `United States of America`
* Licenses: `Public data`
* Sources: `CARTO`

2. Select the `Spatial Features - United States of America (H3 Resolution 8)` dataset and click on **Subscribe for free**. This action will redirect us to the subscription level at the Data Explorer menu.

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/UgkjqTX4yFuwBHmdJEpY/image.png" alt="A screenshot of CARTO&#x27;s Data Observatory"><figcaption></figcaption></figure>

3. Head back into the workflow you created earlier.
4. Navigate to Sources > Data Observatory > CARTO, find the table you just subscribed to, and drag it onto the canvas, just below the final Save as Table component. Can't find it? Try refreshing your page.
5. Using a **Join** component, connect the output of Save as Table to the top input, and the output of Spatial Features to the bottom input. Set the join columns from each table to H3, and the join type to left - meaning that all features from the first input (Save as Table) will be retained. Run!
6. We now have a huge amount of contextual data to help our analysis - in fact, far more than we want! Connect the output of the join to an **Edit schema** component, selecting only the columns from your original Airbnb grid, plus population and urbanity.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2Ff2ADtTxoekUb8mwHFbR4%2FAirbnbs%20enrichment.png?alt=media&#x26;token=15d4c283-dd43-44bc-8b22-40a88b31ee4e" alt="A screenshot of CARTO Workflows"><figcaption></figcaption></figure>

From here, you can save this as a table and explore it on a map - or move on to the final stage of this tutorial.
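
To make the left join and the column selection more concrete, here is a small Python sketch of the same idea. All cell ids, values and urbanity labels are made up, and `other_variable` stands in for the many extra columns of the real dataset.

```python
# Hypothetical Airbnb grid rows keyed by H3 cell id (made-up ids).
airbnb_grid = {
    "cell_a": {"price_num_avg": 120.0, "airbnb_count": 2},
    "cell_b": {"price_num_avg": 80.0, "airbnb_count": 1},
}

# Hypothetical Spatial Features rows. "cell_c" has no Airbnb listings.
spatial_features = {
    "cell_a": {"population": 1500, "urbanity": "urban", "other_variable": 7},
    "cell_c": {"population": 300, "urbanity": "rural", "other_variable": 1},
}

# Columns kept from the second input, mirroring the Edit schema step.
KEEP_FROM_FEATURES = ("population", "urbanity")


def left_join(left, right, keep):
    """Keep every left row; add the selected right columns when a cell matches."""
    joined = {}
    for cell, row in left.items():
        match = right.get(cell, {})
        enriched = dict(row)
        for column in keep:
            enriched[column] = match.get(column)  # None when there is no match
        joined[cell] = enriched
    return joined


enriched_grid = left_join(airbnb_grid, spatial_features, KEEP_FROM_FEATURES)
print(enriched_grid)
```

Every Airbnb cell survives the join, cells that exist only in the second input are dropped, and unmatched cells simply carry empty values for the new columns.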

***

### Estimating the influence of variables on the score

Next we will apply a Geographically Weighted Regression (GWR) model using the [GWR\_GRID](https://docs.carto.com/analytics-toolbox-bigquery/sql-reference/statistics/#gwr_grid) function to our Airbnb H3 aggregated data. We’ve already seen where different variables rate higher on our previous map.

This model will allow us to extract insights into what the overall impression of Airbnb users depends on, by relating the overall rating score to different variables (specifically, the value, cleanliness and location scores).

We will also visualize where the *location* *score* variable significantly influences the ‘Overall rating’ result.&#x20;
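
If you'd like some intuition for what a GWR does before running it, the Python sketch below fits one local regression for a single focal cell. It is a simplified illustration under stated assumptions, not CARTO's implementation: it uses a single feature instead of three, the Gaussian kernel formula and the `bandwidth` parameter are assumptions, and the neighborhood values are made up.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: the weight decays smoothly with distance from the focal cell."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(neighbors, bandwidth):
    """Weighted least squares with one feature and an intercept.

    neighbors: list of (grid_distance, feature_value, target_value) tuples.
    Returns (intercept, coefficient) for the focal cell.
    """
    weights = [gaussian_weight(d, bandwidth) for d, _, _ in neighbors]
    xs = [x for _, x, _ in neighbors]
    ys = [y for _, _, y in neighbors]

    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total

    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))

    coefficient = cov / var
    intercept = y_mean - coefficient * x_mean
    return intercept, coefficient


# Hypothetical neighborhood of one focal cell:
# (rings away from the focal cell, location score, overall rating).
neighborhood = [
    (0, 4.9, 4.8),
    (1, 4.7, 4.7),
    (1, 4.5, 4.6),
    (2, 4.0, 4.5),
    (3, 3.5, 4.4),
]

intercept, coefficient = local_fit(neighborhood, bandwidth=3)
print(intercept, coefficient)
```

A GWR repeats this kind of fit for every cell in the grid, so each cell gets its own coefficients. That is why we can later map where the location score matters most.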

We will now proceed to calculate the GWR model leveraging CARTO Analytics Toolbox for BigQuery. You can do so using CARTO Workflows or your data warehouse console.

1. In your workflow, connect a **GWR** component to the Edit schema component from earlier. The parameters used in the GWR model will be as follows:

* Index column: `h3`
* Feature Variables:
  * `review_scores_value_avg`
  * `review_scores_cleanliness_avg`
  * `review_scores_location_avg`
* Target variable:
  * `review_scores_rating_avg`
* Kring Size: `3`
* Kernel function: `gaussian`
* Fit intercept: `True`

2. Finally, let's add another join to rejoin Edit Schema to the results of the GWR analysis so we have all of the contextual information in one table ready to start building our map.

Run!

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FhMT2ZXBrecgJh6am8h30%2FScreenshot%202024-07-04%20at%2016.45.45.png?alt=media&#x26;token=5b3c6226-5f0d-46f1-ac27-9abdd2a75f02" alt="A screenshot of CARTO Workflows"><figcaption></figcaption></figure>

<details>

<summary>Prefer to use SQL?</summary>

You can replicate this in your data warehouse SQL console with the following code:

```sql
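CALL `carto-un.carto`.GWR_GRID(
  'yourproject.yourdataset.yourtable',          -- input table (placeholder)
  ['review_scores_location_avg', 'review_scores_cleanliness_avg', 'review_scores_value_avg'],  -- feature columns
  'review_scores_rating_avg',                   -- target column
  'h3',                                         -- index column
  'h3',                                         -- grid type
  3,                                            -- k-ring size
  'gaussian',                                   -- kernel function
  TRUE,                                         -- fit intercept
  'yourproject.yourdataset.your_output_table'   -- output table (placeholder)
);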
```

</details>

Feel free to use another Save as Table component to materialise the result. Otherwise, it will be stored as a temporary table and deleted after 30 days.

1. In the **Maps** section of the CARTO Workspace, click on the three dots next to your original map and duplicate it, calling it `Map 2 GWR Model map`.
2. Add your GWR layer in the same way you added the previous layers, and turn off the layer `H3 Airbnb aggregation`.
3. Style the new layer (you may find it easier to turn the other layers off as you do this - you can just toggle the eye to the right of their names in the layer panel to do this):
   1. Name: `Location relevance (Model)`
   2. Layer order: `3 (the bottom)`
   3. Fill Color: 5 step diverging Colorbrewer blue-red ramp based on `review_scores_location_avg_coef_estimate`. Here, negative values depict a negative relationship between the location score and overall score, and positive values depict a positive relationship (i.e. location plays an important role in the overall rating). \
      \
      A good way of visualizing this is to begin with a **Quantile** color scale, and then switch to **Custom** and play around with the color bands until they reflect the same values moving away from a neutral band around zero (see below, where we have bands which diverge from -0.05 to 0.05).
   4. No stroke
4. In the **Legend** panel (to the right of Layers), change the Color based on text to **Location - Overall rating coefficient** so it's easier for the user to understand.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FhMT2ZXBrecgJh6am8h30%2FScreenshot%202024-07-04%20at%2016.45.45.png?alt=media&#x26;token=5b3c6226-5f0d-46f1-ac27-9abdd2a75f02" alt="A screenshot of CARTO Builder"><figcaption></figcaption></figure>
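
The custom color bands described above can be thought of as a small classification rule. Here is a hedged Python sketch of that rule: only the neutral band from -0.05 to 0.05 comes from this tutorial, while the outer edges and the band labels are hypothetical choices for illustration.

```python
import bisect

# Band edges, symmetric around zero. Only the neutral band (-0.05 to 0.05)
# comes from the tutorial; the outer edges are assumptions.
BAND_EDGES = [-0.15, -0.05, 0.05, 0.15]
BAND_LABELS = [
    "strong negative",
    "negative",
    "neutral",
    "positive",
    "strong positive",
]


def coefficient_band(value):
    """Return the label of the diverging band a coefficient falls into."""
    return BAND_LABELS[bisect.bisect_right(BAND_EDGES, value)]


for estimate in (-0.2, -0.1, 0.0, 0.1, 0.3):
    print(estimate, coefficient_band(estimate))
```

Keeping the edges symmetric around zero means the same color intensity always represents the same strength of relationship, whether it is positive or negative.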

5. In the **Basemaps** panel (to the right of Layers) change the basemap to Google Maps Roadmap basemap.

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/sx62IrykFarpI3ONItIx/tutorial10_basemap_option_roadmap.png" alt=""><figcaption></figcaption></figure>

6. Click on the **Dual map view** button at the top of the screen (next to 3D mode) to toggle the split map option.

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FCmMLsDcGSTUX5zeV0xWp%2FAirbnbs%20dual%20map%20view.png?alt=media&#x26;token=d4f34a52-d582-45f0-9b09-a1fae71b5f3a" alt="A screenshot of CARTO Builder"><figcaption></figcaption></figure>

* Left map: disable the `Location relevance (Model)` layer
* Right map: disable the `H3 Airbnb aggregation` layer

Inspect the model results in detail to understand where the location matters the most for users' overall rating score and how the location rating values are distributed.

{% hint style="info" %}
Try styling the map layers depending on other variables to better understand how different variables influence model results.
{% endhint %}

Now let's start adding some more elements to our map to help our users better navigate our analysis.

7. Head to the **Widgets** panel, to the left of the Layers panel. Add the following widgets to the map:

* **Total listings**
  * Layer: `Airbnb listings`
  * Type: `Formula`
  * Operation: `COUNT`
  * Formatting: `Integers with thousand separators`
  * Note: `Total nº of Airbnb listings in the map extent.`

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/SHFaqCEhpl62ju84PmjN/image.png" alt=""><figcaption></figcaption></figure>

* **Population near Airbnbs**
  * Layer: `H3 Airbnb aggregation`
  * Type: `Formula`
  * Operation: `SUM`
  * Formatting: `Decimal summarized (12.3K)`
  * Aggregation column: `population`
  * Notes: `Population in cells with Airbnbs`

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/gcblBIw2nxRkxztjg18L/image.png" alt=""><figcaption></figcaption></figure>

* **Urbanity**
  * Layer: `H3 Airbnb aggregation`
  * Type: `Pie`
  * Operation: `COUNT`
  * Column: `urbanity_joined_joined (MODE)`

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/zHcOi1A5a77OC0TSMCbs/image.png" alt=""><figcaption></figcaption></figure>

8. In the **Interactions** tab (to the right of Widgets), add an interaction to H3 Airbnb aggregation so users can review attributes while navigating the map. Switch from Click to **Hover** and choose the style Light. Select the attributes population\_joined\_joined (sum), urbanity\_joined\_joined (mode) and airbnb\_count\_joined. Click on the variable options (#) to choose a more appropriate format and more readable field names. Your map should now look a bit like the one below:

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2FSfv4XS5cTysTiYcRRGu3%2FH3%20Airbnb%20popups.png?alt=media&#x26;token=bf4a6b92-1d36-4f36-b2f1-56e60e1cc884" alt="A screenshot of CARTO Builder"><figcaption></figcaption></figure>

Navigate the map and observe how widget values vary depending on the viewport area. Check out specific areas by hovering over them and review pop-up attributes.
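
To see why the widget values change as you pan and zoom, here is a rough Python sketch of viewport-based aggregation. It is an illustration only: the cells, coordinates and urbanity labels are made up, each cell is reduced to a single representative point, and Builder's actual filtering logic may differ.

```python
from collections import Counter

# Hypothetical grid cells: a representative point plus the enriched attributes.
cells = [
    {"lon": -118.25, "lat": 34.05, "population": 1500, "urbanity": "urban"},
    {"lon": -118.40, "lat": 34.02, "population": 900, "urbanity": "urban"},
    {"lon": -118.80, "lat": 34.20, "population": 200, "urbanity": "rural"},
]


def widget_values(rows, west, south, east, north):
    """Recompute widget values from the rows that fall inside the viewport."""
    visible = [
        row for row in rows
        if west <= row["lon"] <= east and south <= row["lat"] <= north
    ]
    return {
        "cell_count": len(visible),
        "population_sum": sum(row["population"] for row in visible),
        "urbanity_counts": dict(Counter(row["urbanity"] for row in visible)),
    }


zoomed_in = widget_values(cells, west=-118.5, south=33.9, east=-118.2, north=34.1)
zoomed_out = widget_values(cells, west=-119.0, south=33.0, east=-118.0, north=35.0)
print(zoomed_in)
print(zoomed_out)
```

The sum plays the role of the Population near Airbnbs widget and the category counts play the role of the Urbanity pie. Both are recalculated from whatever is currently in view.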

9. Now let's add a rich description of our map so users can have more context - we'll be using [Markdown syntax](https://www.markdownguide.org/basic-syntax/). At the top right of the screen, select the "i" icon to bring up the **Map Description** tab (you can switch between this and widgets). You can copy and paste the below example or create your own.

```markdown
### Airbnb Ratings and Location Impact 🌟

![Image: Los Angeles](https://app-gallery.cartocdn.com/builder/LosAngeles.jpg)

Explore the intricate relationship between Airbnb ratings and the geographical distribution of listings in Los Angeles with our dynamic map. This map provides valuable insights into what influences user ratings and offers a comprehensive view of the city's Airbnb landscape.

**Discover User Ratings** 📊
- Analyze how Airbnb users rate listings based on key factors such as accuracy, communication, cleanliness, location, check-in, and value.
- Visualize the distribution of ratings to uncover patterns that affect overall user impressions.

**Geographic Insights** 🗺️
- Dive into Los Angeles neighborhoods and observe how specific areas impact user ratings.
- Identify regions where location ratings significantly influence the overall score, and explore what makes these neighborhoods stand out.

**Sociodemographic Data Enrichment**
- Enhance your understanding of each neighborhood with sociodemographic insights from the CARTO Data Observatory.
- Access data on total population, urbanity level, tourism presence, and more to gain a holistic view of the city's dynamics.
```

If you click on the "eye" icon, you can preview what this looks like...

<figure><img src="https://3015558743-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFEElAdsRIl9DzfMhbRlB%2Fuploads%2Fv7NVpuvBeFXmX9p7jJkD%2FH3%20Airbnb%20map%20description.png?alt=media&#x26;token=d258f050-5c71-4b2f-aeb6-caf93c102f5b" alt="A screenshot of CARTO Builder"><figcaption></figcaption></figure>

10. Finally, we can make the map public and share the link with anybody in the organization. To do so, go to **Share** in the top right corner and set the map as Public. For more details, see [Publishing and sharing maps](https://docs.carto.com/carto-user-manual/maps/publishing-and-sharing-maps).

<figure><img src="https://content.gitbook.com/content/FEElAdsRIl9DzfMhbRlB/blobs/fgF1w0p2hzsDTDdmZYos/image.png" alt=""><figcaption></figcaption></figure>

Now we are ready to share the results! 👇

{% embed url="<https://clausa.app.carto.com/map/32bad237-c0fe-4f9f-88a6-eb3b4cb6c578>" fullWidth="true" %}
