Spatial data management and analytics with QGIS and CARTO

CARTO, as part of the modern geospatial analysis stack, is a fully cloud-native platform for spatial analytics that operates on top of the leading data warehouses, ensuring your cloud data governance and that geospatial analytics are not siloed within your organization. Additionally, our platform integrates seamlessly with open-source desktop GIS tools, which excel in other data management tasks like geometry editing over georeferenced images.

While the CARTO platform excels in analytics and visualization of geospatial big data running natively on cloud data warehouse platforms, certain data management tasks are better accommodated today by desktop tools like QGIS. In this tutorial, you will learn how to effectively use CARTO and QGIS to build a scalable and modern cloud technology stack for spatial data management and analytics, breaking the GIS silo and ensuring all departments can derive insights from spatial data at scale.

Step-by-Step Guide:

In this guide, we'll walk you through:


Connect QGIS to PostgreSQL

First, set up a connection in QGIS to PostgreSQL. Depending on your cloud provider, PostgreSQL comes in different flavors. For example, Google Cloud SQL for PostgreSQL, Amazon RDS for PostgreSQL, and Azure Database for PostgreSQL.

Important: Ensure your PostgreSQL instance has the PostGIS extension to work with geospatial data. If you're new to PostgreSQL & PostGIS, we recommend getting started here.

  1. Open QGIS and create a connection to your PostgreSQL database, do to so:

    • Click on Data Source Manager

    • Select PostgreSQL

    • Click New and add your PostgreSQL credentials (e.g., Google Cloud SQL connection)

Once connected, select a specific table and click Add to add it as a new layer in QGIS. For this tutorial, we will use the planned_regions_2025 table, which contains polygon geometries, but you can use any vector layer.


Georeferencing images in QGIS

Next, we will edit geometries using QGIS against a georeferenced image. The goal is to create new geometries that display planned regions over a satellite image.

  1. Add your image as a raster layer. To do so:

    • Click on Data Source Manager.

    • Select Raster.

    • Upload your image.

    • Click Add to add the raster layer to your map.

As the image is not georeferenced, you won't be able to see it in your desired location. To check that it has been uploaded correctly, click on Zoom to Layer(s).

  1. Let's georeference the image. To access the Georeferencer tool, go to Raster > Georefencer.

  1. After clicking, a new modal opens. In the Settings option in the top bar, configure your transformation settings as below. Pay particular attention to the Target SRS; ideal you want to be georeferencing in the SRS which is native to your raster image. Once done, click Ok.

  1. In the Georeferencer modal, add again your raster image. Once done, it will be displayed in the canvas. We're set to start introducing the control points.

  1. Select Add point in the Georeferencer interface and mark your first control point. Once marked, a popup will appear as below. Here, click on Map Canvas to introduce the map coordinate for that location in the map itself.

  1. In the main map canvas, navigate to the right location and click on the map. Then, the Enter Map Coordinates modal will open, click Ok. That will create your first control point. You must create a minimum of 4 control points before being able to georeference your image.

  2. Once you have added all required control points, click Run to georeference your layer and add it to the map. Once the process is finished, you will be able to see your image georeferenced in your main map canvas.

You now have a georeferenced raster layer! A common next step is to create a polygon layer for the red line boundary found on this image.


Editing geometries in QGIS

In this section, we'll leverage our georeferenced image from the previous step to create a new entry in our PostgreSQL table with spatial and non-spatial attribution.

  1. To start, click on the layer connected to your PostgreSQL table and toggle Editing mode.

  1. Click on Add polygon in the Layer menu and digitize your planned region. Once you are done, right click and add your attribute data. To finalize click Ok and ensure your edits are saved by clicking on Save button in the Editing toolbar. .

A new row containing the geometry as well as the attribution has been added to your PostgreSQL table.


Synchronize changes from PostgreSQL to your Data Warehouse

To perform analytics and visualize the results of the data you just created in QGIS in CARTO, you have two main options:

  • Direct Connection to PostgreSQL: You can directly connect to your cloud PostgreSQL instance to analyze and create visualizations in CARTO. Learn more about this approach here.

  • Synchronize Changes to a Data Warehouse: If you're using a data warehouse like Google BigQuery, you might want to synchronize changes from PostgreSQL to leverage the data warehouse's powerful analytics. This synchronization ensures that BigQuery reflects the latest updates from PostgreSQL. This is the approach that the remainder of this tutorial will follow.

This synchronization can be done using different methods, and in this tutorial, we will explain how to achieve this using a scheduled script in your data warehouse. For this example, we will use Google BigQuery to call PostgreSQL and check for any changes in the last 5 minutes. If there have been changes, the data will be moved to BigQuery.

  1. Before creating a BigQuery script, you should ensure that you can access your PostgreSQL as an external data source in BigQuery.

  1. Now that we've created an external data source connection, we can start creating the script.

The purpose of the script is to check for PostgreSQL updates and if there are any, ensure the information is reflected in BigQuery. Below is an example script but you can customize it according to your needs:

// Google BigQuery script to check PostgreSQL updates
DECLARE has_updates BOOL DEFAULT FALSE;


-- Check for updates in the external data
SET has_updates = (
 SELECT EXISTS (
   SELECT 1
   FROM EXTERNAL_QUERY("cartodb-on-gcp-pm-team.us.cartodb-on-gcp-pm-team", "SELECT 1 FROM planned_regions_2025 WHERE updated_at > NOW() - INTERVAL '5 minutes'")
 )
);


-- If data has changed, create or replace the table
IF has_updates THEN
 CREATE OR REPLACE TABLE `cartodb-on-gcp-pm-team.amanzanares.planned_regions_2025`
 CLUSTER BY geom
 AS
 SELECT * EXCEPT(geom_wkb), ST_GEOGFROMWKB(geom_wkb) AS geom
 FROM EXTERNAL_QUERY("cartodb-on-gcp-pm-team.us.cartodb-on-gcp-pm-team", "SELECT geoid, region_name,planned_date,status,created_at,updated_at,ST_AsBinary(geom) AS geom_wkb FROM planned_regions_2025");
END IF;
```
  1. We'll now schedule this script so that is executed every 5 minutes, using BigQuery built-in functionality. To do this, open your query in the BigQuery console, click the Schedule button on the top toolbar, and configure the frequency to run every 5 minutes. Set your desired start time and time zone, name the scheduled query, and save it.

Now that your BigQuery script is schedule when there are any new entries or updates, we want to ensure that every time our PostgreSQL table changes, that information is properly reflected. To do so, we'll leverage the trigger functionality in PostgreSQL.

  1. In your PostgreSQL console, create a trigger that wil ensure that created_at and updated_at columns reflect the correct information every time there is a change in your table. Below is an example:

CREATE OR REPLACE FUNCTION set_timestamps() RETURNS TRIGGER AS $$
BEGIN
    IF (TG_OP = 'INSERT') THEN
        NEW.created_at := CURRENT_TIMESTAMP;
        NEW.updated_at := CURRENT_TIMESTAMP;
    ELSIF (TG_OP = 'UPDATE') THEN
        NEW.updated_at := CURRENT_TIMESTAMP;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_set_timestamps
BEFORE INSERT OR UPDATE ON planned_regions_2025
FOR EACH ROW
EXECUTE FUNCTION set_timestamps();

To test the process is working as expected, test it by modifying an existing row in QGIS. We've included several new geometries and updated an existing one by setting the status to To be reviewed - Test. You can see now the difference between the created_at timestamp and the updated_at.

If you await 5 minutes or manually execute the script, you'll be able to also see the same information reflected in your data warehouse table.


Extract insights from your data using CARTO Builder

Now that your data is synchronized and available in Google BigQuery, you can leverage the powerful features of the CARTO platform to create interactive and insightful dashboards. This section will focus on utilizing the synced BigQuery table for creating an interactive visualization.

  1. In CARTO Workspace, navigate to Data Explorer and locate your BigQuery table. In here you should be able to have a preview of both the data and the map. From this interface, click on Create map. This will open a new tab with Builder displaying this data source.

Builder is CARTO map-making tool that allows you to create scalable web map applications leveraging the data warehouse capabilities. Let's create our interactive dashboard.

  1. Let's give your map a name, "Toronto - Planned regions for 2025".

  2. After that, we'll rename our layer to "Planned Regions" and style them accordingly so the regions stand out on the map visualization. In our case, we'll set the Fill Color and Stroke Color to light and dark orange. Then, set the Stroke Width to 2.

  1. Let's add Toronto's census data source. To do so, follow the next steps:

    • Select the Add source from button at the bottom left on the page.

    • Select Custom Query (SQL) and then Type your own query under the CARTO Data Warehouse connection.

    • Click on the Add Source button.

The SQL Editor panel will be opened.

To add Toronto's census data source, run the query below:

SELECT * FROM `cartobq.docs.toronto_census_population` 
  1. Rename the newly added layer to "Census population" and set the Fill Color based on Total_population property. Set the Opacity for the Fill Color to 0.2 and the Opacity for the Stroke Color to 0.1.

  1. In the main layer panel, change the position of the layer to the bottom, so that Planned regions stay on top of the visualization.

  1. Now, we'll add a final dataset, the road network for Toronto, to have visibility on the major roads that are likely to be impacted by this project. To do so, add a custom SQL query and run the following query, as per previous source. This query contains a WHERE rank < 5 that will allow us to dynamically obtain just major roads in this location.

SELECT * FROM  `cartobq.docs.toronto_road_network` WHERE rank < 5
  1. Name this layer "Road network" and style the Stroke Color based on its Rank property, from dark to lighter pink. Also, set the Opacity to 0.4. Then, set the Stroke Width to 2.

We have now finished adding our sources, now let's add some functionality to our dashboard that will allow users to dynamically extract information by leveraging pop-up interactions and charts.

  1. Navigate to the Interactions section, and set the properties for each layer as below:

  • Road Network:

    • name

    • type

    • rank

  • Planned regions: All

  • Census Population:

    • Total_population

  1. Now let's include some Widgets to extract insights and allow users to filter data. To do so, navigate to the Widgets tab and include the following widgets:

  • Formula Widget:

    • Source: Census population

    • Widget name: Total Population

    • Operation: SUM(Total_population)

    • Formatting: 12.3k

  • Pie Widget:

    • Source: Planned regions

    • Widget name: Region Status

    • Operation: status

  • Category Widget 1:

    • Source: Planned regions

    • Widget name: Region Name

    • Operation: Region_name

  • Category Widget 2:

    • Source: Road network

    • Widget name: Road Network Type

    • Operation: Type

  1. Before publishing our map, we'll configure our Data Sources Freshness. The data source freshness will determine how up-to-date the data sources in the map are at its initial load. This will ensure users will be able to extract insights always as fresh as you configure this. In our case, we'll set Data Freshness to 5 minutes, aligning with our BigQuery scheduled script.

  1. Finally, we're ready to share the map with others. Let's go to the Preview mode, to ensure the map is looking as expected. To do so, click on Preview next to the Share button. A different layout appears that displays the application as if you were the end-user accessing it.

  1. Once you are happy with the dashboard, click on Share and set it to shared with your organization or publicly.

Congrats, you're done! The final results should look similar to the below:

Learn more about crafting impactful visualizations in the Building Interactive Maps section of the Academy.

Last updated