Types of location data

Raster, Vector & everything in-between

The two primary spatial data types are raster and vector - but what’s the difference?

Raster data

Raster data is represented as a grid of cells or pixels, with each cell containing a value or attribute. It is best suited to continuous phenomena such as elevation or temperature, and to imagery such as satellite photos.
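To make the grid idea concrete, here is a minimal sketch of a raster held as a plain list of rows. The elevation values, the origin coordinates and the cell size are all made-up assumptions for illustration; real raster formats store equivalent georeferencing in their own way.

```python
# A tiny 3x4 raster: each cell stores one value (hypothetical elevations in metres).
grid = [
    [120.0, 121.5, 123.0, 125.0],
    [118.0, 119.5, 122.0, 124.0],
    [115.0, 117.0, 120.5, 122.5],
]

# Georeferencing assumptions (illustrative values, not from any real dataset).
ORIGIN_X = 500000.0   # x coordinate of the grid's top-left corner
ORIGIN_Y = 4200000.0  # y coordinate of the grid's top-left corner
CELL_SIZE = 30.0      # width and height of one square cell, in map units


def cell_to_xy(row, col):
    """Return the map coordinates of the centre of cell (row, col)."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE  # rows count downward from the top
    return x, y


def xy_to_cell(x, y):
    """Return the (row, col) of the cell that contains map coordinate (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    return row, col


def value_at(x, y):
    """Look up the cell value at a map coordinate, or None outside the grid."""
    row, col = xy_to_cell(x, y)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None
```

The point of the sketch is that a raster never stores a coordinate per cell: position is implied by row, column, origin and cell size.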

Common raster file types

Common file types for raster data include:

  • GeoTIFF: a popular raster file format with embedded georeferencing.

  • JPEG, PNG & BMP: ubiquitous image files which can be georeferenced with a World or TAB file. PNG supports lossless compression and transparency, making it particularly useful for spatial visualization.

  • ASCII: stores gridded data in ASCII text format. Each cell value is represented as a text string in a structured grid format, making it easy to read and manipulate.

You may also encounter: ERDAS, NetCDF, HDF, ENVI, xyz.
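Because ASCII grids are plain text, they can be read with nothing more than string handling. The sketch below assumes the widely used Esri ASCII grid layout (six header lines: ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value, followed by rows of values); other ASCII raster layouts exist, and the sample data is invented.

```python
# Invented sample in the assumed Esri ASCII grid layout.
SAMPLE = """ncols 3
nrows 2
xllcorner 0.0
yllcorner 0.0
cellsize 10.0
NODATA_value -9999
1 2 3
4 -9999 6
"""


def parse_ascii_grid(text):
    """Parse an ASCII grid into (header dict, list of rows).

    Assumes exactly six "key value" header lines; NODATA cells become None.
    """
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)

    nodata = header["nodata_value"]
    rows = []
    for line in lines[6:]:
        rows.append(
            [None if float(v) == nodata else float(v) for v in line.split()]
        )
    return header, rows


header, rows = parse_ascii_grid(SAMPLE)
```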


Vector data

Vector data represents geographic features as discrete points, lines, and polygons. Each element is defined by its geometry and represents a discrete geographic object, such as a road, a building, or an administrative boundary. Vector data is scalable without loss of quality and can be easily modified or updated.

Vector data is useful for spatial analysis operations such as overlaying, buffering, and network analysis, facilitating advanced geospatial studies. Vector data formats are also well-suited for data editing, updates, and maintenance, making them ideal for workflows that require frequent changes.
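As a rough illustration of the geometry-based structure, the sketch below stores a point, a line and a polygon as coordinate lists and runs a few basic measurements on them. The coordinates are arbitrary planar values chosen for the example, and the functions are simple textbook routines (segment lengths, the shoelace formula, ray casting), not the method of any particular GIS library.

```python
import math

# Arbitrary planar coordinates, for illustration only.
point = (2.0, 1.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # ring, implicitly closed


def line_length(coords):
    """Sum of straight segment lengths along a line."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(pt, ring):
    """Ray-casting test: True if pt lies inside the ring."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

A point-in-polygon test like this is one of the building blocks behind overlay operations and spatial joins.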

Common vector file types

Shapefiles

Shapefiles are a format developed by Esri. They have been widely adopted across the spatial industry, but their drawbacks are causing them to lose popularity. These drawbacks include:

  1. Shareability: They consist of multiple files (.shp, .shx, .dbf, etc.) that comprise one shapefile, which can make them tricky for non-experts to share and use.

  2. Limited Attribute Capacity: Shapefiles are limited to a maximum of 255 attribute fields.

  3. Lack of Native Support for Unicode Characters: This can cause issues when working with datasets that contain non-Latin characters or multilingual attributes.

  4. Lack of Topology Information: Shapefiles do not inherently support topological relationships, such as adjacency, connectivity, or overlap between features.

  5. No Native Support for Time Dimension: Shapefiles have no field type for storing a time of day or a full timestamp.

  6. Lack of Direct Data Compression: Shapefiles do not provide built-in compression options, which can result in larger file sizes.

  7. File Size Limit: Shapefile component files are limited to 2 GB each.
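The shareability and compression points are commonly worked around by zipping the component files together. The sketch below is one way to do that with the Python standard library; the helper names are hypothetical, and the extension list is an assumption that covers common sidecar files (.prj and .cpg are included alongside the three named above), so a given shapefile may have fewer or more.

```python
import zipfile
from pathlib import Path

# Assumed set of common component extensions; not exhaustive.
SIDECAR_EXTS = {".shp", ".shx", ".dbf", ".prj", ".cpg"}


def shapefile_parts(shp_path):
    """Return the files next to shp_path that share its base name."""
    shp = Path(shp_path)
    return sorted(
        p for p in shp.parent.iterdir()
        if p.stem == shp.stem and p.suffix.lower() in SIDECAR_EXTS
    )


def bundle_shapefile(shp_path, zip_path):
    """Write all component files into one compressed zip; return their names."""
    parts = shapefile_parts(shp_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for part in parts:
            zf.write(part, arcname=part.name)
    return [p.name for p in parts]
```

For example, bundle_shapefile("data/roads.shp", "roads.zip") would collect roads.shp, roads.shx, roads.dbf and any matching .prj or .cpg file into a single archive (the paths here are placeholders).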

Other vector file types

  1. GeoJSON (Geographic JavaScript Object Notation): GeoJSON is an open standard file format based on JSON (JavaScript Object Notation). It allows for the storage and exchange of geographic data in a human-readable and machine-parseable format.

  2. KML/KMZ (Keyhole Markup Language): KML is an XML-based file format used for representing geographic data and annotations. It was originally developed for Google Earth but has since become widely supported by various GIS software. KMZ is a compressed version of KML, bundling multiple files together.

  3. GPKG (GeoPackage): GPKG is an open standard vector file format developed by the Open Geospatial Consortium (OGC). It is a SQLite database that can store multiple layers of vector data along with their attributes, styling, and metadata. GPKG is designed to be platform-independent and self-contained.

  4. FGDB (File Geodatabase): FGDB is a proprietary vector file format developed by Esri as part of the Esri Geodatabase system.

  5. GML (Geography Markup Language): GML is an XML-based file format developed by the OGC.
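To show what "human-readable and machine-parseable" means for GeoJSON in practice, here is a small sketch that builds a FeatureCollection with the standard json module and reads it back. The feature itself (a point with a name property) is an illustrative example, and feature_names is a hypothetical helper.

```python
import json

# One illustrative feature. GeoJSON coordinates are ordered [longitude, latitude].
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-3.7038, 40.4168]},
    "properties": {"name": "Madrid"},
}

collection = {"type": "FeatureCollection", "features": [feature]}

# Serialize to text (what would be written to a .geojson file) ...
text = json.dumps(collection, indent=2)

# ... and parse it back into Python dictionaries and lists.
parsed = json.loads(text)


def feature_names(fc):
    """Return the 'name' property of each feature in a FeatureCollection."""
    return [f["properties"].get("name") for f in fc["features"]]
```

Because the whole file is ordinary JSON, any tool that can read JSON can inspect or edit it, which is a large part of the format's appeal for data exchange.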


Everything in-between

Some data types sit between raster and vector, and Spatial Indexes are among the most widely used of them.

Spatial Indexes are global grids - in that sense, they are a lot like raster data. However, they render a lot like vector data; each "cell" in the grid is an individual feature which can be interrogated. They can be used for both vector-based analysis (like running intersections and spatial joins) and raster-based analysis (like slope or hotspot analysis).

But where they really excel is in their size, and in the processing and analysis speeds that follow from it. Spatial Indexes are "geolocated" through a short reference string, not a long geometry description (as vector data is). This makes them small and quick to work with. As a result, many organizations now use Spatial Indexes to enable highly performant analysis of truly big spatial data. Find out more about these in the ebook Spatial Indexes 101.
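The "reference string" idea can be sketched with quadkeys, one well-known grid scheme based on Web Mercator tiles. This is only an illustration of the general principle: the text above does not say which index type is meant, other schemes use different encodings, and the function names here are hypothetical.

```python
import math

MAX_LAT = 85.05112878  # Web Mercator latitude limit


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) Web Mercator tile containing a lon/lat at a zoom level."""
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or the latitude limits stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_quadkey(x, y, zoom):
    """Interleave the bits of x and y into a base-4 string, one digit per zoom level."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def lonlat_to_quadkey(lon, lat, zoom):
    """Return the quadkey of the cell containing a lon/lat at a zoom level."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return tile_to_quadkey(x, y, zoom)
```

Each cell is identified by a string with one character per zoom level, and a parent cell's key is a prefix of its children's keys. That is why joins and aggregations on such indexes can be done with simple string or integer comparisons instead of geometry operations.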

See also: Introduction to Spatial Indexes
