Geospatial indexing app with different backends using Spring Boot and Testcontainers
A REST API with different backend implementations to index and search geospatial data.
Disclaimer: I am no expert in the geospatial field or in the tools mentioned here. I learned and applied just enough for this proof of concept. I hope you learn something from it, just as I did while building it.
Story
As a developer, I want to index location coordinates so that I can later search for nearby locations within a certain distance from a specified point.
Acceptance Criteria
- REST API using GeoJSON as the data model.
- Easily switchable configuration for which backend to use.
If you want to go straight to the code, see the GitHub project below. Please don’t judge me on how I named the project. XD
TDD Approach
Since I am a self-proclaimed TDD practitioner, we’ll start with defining the tests first. I know unit tests are important but we will skip them for now since we are not aiming for a production-grade version.
Integration Test
We will be performing API level tests on a running instance of the application with the help of Testcontainers. We will run the same test on both implementations.
The tests are ordered to first index all geometry types, then perform location proximity tests among the indexed locations.
Architecture
The GeoJSON format will be used as the standard data model. Aside from the geospatial data, an identifier and a key are required during indexing. They are used to identify and group the location. For example, we want to index Mt. Everest and Mt. Fuji in the mountains group.
The data model for this application focuses only on geospatial data. Other properties that are not useful for geospatial functions are left to the API client. For example, the name of a mountain has no impact on calculating its proximity to other mountains, so we will not save it.
For a simpler implementation, we will only support Point, LineString, and Polygon geometric types.
For distance proximity queries, the default unit is meters for now.
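To make "within a certain distance, in meters" concrete, here is a minimal sketch of a great-circle distance check using the haversine formula. It is only an illustration: in this POC the backends do the distance math, not the application. The class and method names are hypothetical, and the mean Earth radius is an assumption of a spherical Earth model.

```java
final class Proximity {
    // Mean Earth radius in meters (assumption: the Earth is treated as a sphere).
    static final double EARTH_RADIUS_M = 6_371_008.8;

    // Great-circle distance in meters between two latitude/longitude pairs given in degrees.
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinLon * sinLon;
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    // True when the second point lies within radiusMeters of the first point.
    static boolean withinRadius(double lat1, double lon1, double lat2, double lon2, double radiusMeters) {
        return haversineMeters(lat1, lon1, lat2, lon2) <= radiusMeters;
    }
}
```

For instance, two points on the equator that are one degree of longitude apart come out roughly 111 km from each other, so a 100 km radius search around one would not return the other.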
API
We will have 2 endpoints to support the indexing and searching of locations.
- POST v1/geo-indexes/{key}: accepts geometry types in GeoJSON format.
- GET v1/geo-indexes/{key}/radius: returns the list of ids of the nearby locations within the specified radius of a given latitude and longitude.
You can easily test them via Swagger UI.
Geospatial support
In this POC, we will be trying 2 different tools with geospatial support.
Using Redis
Redis has several commands related to geospatial indexing (GEO commands) but unlike other commands these commands lack their own data type. These commands actually piggy back on the sorted set datatype. This is achieved by encoding the latitude and longitude into the score of the sorted set using the geohash algorithm.
https://redis.com/redis-best-practices/indexing-patterns/geospatial/
We will be using GEOADD and GEORADIUS.
Since Redis only allows indexing of latitude and longitude, we are going to extract all the points from the GeoJSON object and index them individually. The key on which you index the location should also be used when you are performing the location search. If you index 10 mountain locations under the mountains key, they won’t show up if you search for them under the rivers key.
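Extracting the points can be sketched as a recursive walk over the GeoJSON coordinates array. This is a minimal sketch, not the project's actual code: it assumes the JSON has already been parsed into nested lists, and the class and method names are hypothetical. GeoJSON stores each position as [longitude, latitude].

```java
import java.util.ArrayList;
import java.util.List;

final class GeoJsonPoints {
    // Flattens the "coordinates" value of a Point, LineString, or Polygon
    // into a list of {longitude, latitude} pairs, in document order.
    static List<double[]> extract(List<?> coordinates) {
        List<double[]> positions = new ArrayList<>();
        collect(coordinates, positions);
        return positions;
    }

    private static void collect(List<?> node, List<double[]> positions) {
        if (node.isEmpty()) {
            return;
        }
        if (node.get(0) instanceof Number) {
            // A list of numbers is a single position: [longitude, latitude, ...].
            double longitude = ((Number) node.get(0)).doubleValue();
            double latitude = ((Number) node.get(1)).doubleValue();
            positions.add(new double[] {longitude, latitude});
            return;
        }
        // Otherwise it is a list of positions (LineString) or a list of rings (Polygon).
        for (Object child : node) {
            collect((List<?>) child, positions);
        }
    }
}
```

A Point yields one pair, a LineString yields one pair per vertex, and a Polygon yields every vertex of every ring.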
For a Point, only 1 coordinate is indexed. For a LineString and Polygon, all coordinates are indexed. We will follow the format below when assigning the member id (string).
<identifier>:<index>
So, for example, a river with identifier NILE containing 5 points will be indexed as:
NILE:0
NILE:1
NILE:2
NILE:3
NILE:4
However, when performing a location search, if any of these coordinates is part of the result, the suffix is discarded and only the identifier is returned.
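The naming scheme and the suffix stripping can be sketched as follows. This is a minimal sketch with hypothetical class and method names, not the project's actual code. It assumes the index is always the part after the last colon, so identifiers that contain a colon themselves would still work.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MemberIds {
    static final String SEPARATOR = ":";

    // Builds the member ids "<identifier>:<index>" for a geometry with pointCount points.
    static List<String> membersFor(String identifier, int pointCount) {
        List<String> members = new ArrayList<>();
        for (int index = 0; index < pointCount; index++) {
            members.add(identifier + SEPARATOR + index);
        }
        return members;
    }

    // Strips the ":<index>" suffix from each member and removes duplicates,
    // keeping the order in which identifiers first appear.
    static Set<String> toIdentifiers(List<String> members) {
        Set<String> identifiers = new LinkedHashSet<>();
        for (String member : members) {
            int cut = member.lastIndexOf(SEPARATOR);
            identifiers.add(cut < 0 ? member : member.substring(0, cut));
        }
        return identifiers;
    }
}
```

So if a search matches both NILE:0 and NILE:3, the API client would only see NILE once.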
Libraries Used:
- Spring Data Redis
Using PostGIS
PostGIS is a spatial database extender for the PostgreSQL object-relational database. It adds support for geographic objects, allowing location queries to be run in SQL.
We will be storing the geospatial data in a geometries table with 3 columns for key, identifier, and geometry. The GeoJSON data will be stored in the geometry column.
The ST_DWithin function will be used to perform the proximity query.
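As a rough idea of what such a query could look like, here is a hypothetical native SQL statement held in a Java constant. It is a sketch under assumptions, not the project's actual query: the table and column names are guessed from the description above, and the geometry column is assumed to use SRID 4326. On plain geometry values in SRID 4326, ST_DWithin measures distance in degrees, so this sketch casts both sides to geography to get a radius in meters.

```java
final class ProximityQuery {
    // Hypothetical native SQL; table and column names are assumptions.
    // Positional parameters, in order: key, longitude, latitude, radius in meters.
    // ST_MakePoint takes (x, y), which means (longitude, latitude).
    static final String FIND_IDENTIFIERS_WITHIN_RADIUS =
            "SELECT DISTINCT g.identifier FROM geometries g "
            + "WHERE g.key = ? "
            + "AND ST_DWithin("
            + "CAST(g.geometry AS geography), "
            + "CAST(ST_SetSRID(ST_MakePoint(?, ?), 4326) AS geography), "
            + "?)";
}
```

CAST is used instead of the PostgreSQL `::` shorthand because the colon can clash with named-parameter parsing in some query layers.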
Libraries Used:
- Spring Data JPA
- Hibernate Spatial
Other tools with geospatial support
Local Development
With the help of Testcontainers, we can run a standalone instance of the application along with the required backend Docker container. This speeds up development and is also handy if you just want to try out the application.
We got this approach from this awesome blog by Sergei Egorov. Check it out!
Stateful
Specify the desired backend implementation using a profile when running the GeoIndexApplication, then configure the connection in the application-*.yaml files based on your local environment after you start the required container (Redis or PostGIS).
Stateless
To run the standalone version of the application, just run the provided <Impl>GeoIndexApplication class. This requires no additional configuration; just ensure you have Docker installed.
However, the data will be lost when the application is shut down. Verifying the data will also be challenging since the ports of the containers are randomly assigned.
Thank you for your time reading up until this point. This is my first blog so it’s both an achievement and hopefully the start of more writing in the future.
I would love to hear your feedback and suggestions for improvement, not only on the writing but on the code as well!
“For the things we have to learn before we can do them, we learn by doing them.”
― Aristotle