Technical

The data powering the Huli routing engine 🔌🚲

Huli may seem like a bit of a black box, so we decided to open the lid and tell you how we do things. All in the name of being the most trusted route creator on the market.

At a startup, it's a fact that everybody does everything to get things going. So while I'm the CEO (*in training*), I have also had to get my hands dirty on the tech side. As part of our promise to be the most trusted routing platform, we figured it would be good to explain some of what goes into making Huli work.

In this post I want to dive into a bit more technical detail on the mapping datasets and how they are instrumental to Huli and what we do.

First, it’s worth a high-level intro to how Huli works. In short, the Huli app consists of 4 components:

  1. Database: where we store all the geospatial data needed to create routes.
  2. Routing engine: holds the Huli routing algorithm and applies it to the data in the database.
  3. App: delivers everything to the user in a nice user experience - well that’s the aim anyway!
  4. API: handles all the information flowing between the app and the routing engine/database.

In this post I’ll talk briefly about the database part, and specifically the datasets we use for all the path and road information. We'll revisit the other topics in the future, as well as dive into other stuff we think is fun and useful knowledge. To build the database we use the open-source relational database PostgreSQL along with its geospatial extension, PostGIS. Both are amazing. The two datasets we use are:

  1. Path information data, which we take from OpenStreetMap, and
  2. Elevation or height data, also known as a Digital Elevation Model (DEM). Since we are currently only focussing on the UK, we are using a higher-resolution dataset provided by the European Environment Agency.

Ok, ok, wait a minute, what does that actually mean? Let's break it down.

Let’s start with OpenStreetMap (OSM). OSM is a bird’s-eye view of what is on the ground. It gives you the x and y coordinates, but it doesn’t provide you with the z coordinate, i.e. height. In the case of a mountain path, the OSM data will tell you where that mountain is, but it won’t tell you how high you have to climb to get to the top. To get that, you need the elevation data. So when our path and elevation datasets have a baby, they produce a dataset with an x, y and z coordinate (evolution eh!) and now we know how high the mountain is and how hard it is going to be to get to the top 😫
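To make that "baby" a bit more concrete, here is a rough Python sketch of the idea: take a path made of (x, y) points and look up a z value for each one. It is only an illustration, not the Huli code. The names `add_elevation` and `toy_elevation` are made up for this example, and the toy lookup stands in for a real elevation dataset.

```python
from typing import Callable, List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def add_elevation(
    path: List[Point2D],
    elevation_at: Callable[[float, float], float],
) -> List[Point3D]:
    """Return the path with a z value looked up for every (x, y) vertex."""
    return [(x, y, elevation_at(x, y)) for x, y in path]


def toy_elevation(x: float, y: float) -> float:
    """Hypothetical stand-in for a DEM lookup: the ground starts at 100 m
    and rises 1 m for every 10 m travelled east (y is ignored)."""
    return 100.0 + x / 10.0


# A flat, 2D path such as you might get from the path dataset (metres).
path_2d = [(0.0, 0.0), (50.0, 20.0), (100.0, 40.0)]

# The same path with heights attached.
path_3d = add_elevation(path_2d, toy_elevation)
print(path_3d)
```

With a real elevation dataset, `toy_elevation` would be replaced by something that reads the height from the grid of measurements.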

OpenStreetMap is an incredible resource. It’s a crowd-sourced dataset created by millions of users across the globe plotting lines by hand or uploading GPX files (tracks recorded by satellite positioning, such as GPS). Incredible! It also relies on the community to self-regulate information and ensure it is correct. This means areas with high activity are rich with information and up-to-date, whereas areas with less activity may not be as up-to-date or reliable. Many companies, big and small, build their products on top of OpenStreetMap, so it is becoming increasingly accurate and up-to-date but, as with every dataset, there are issues. At Huli, this is where we take extra care to analyse all the map data to ensure, to the best of our ability, that the data is of the highest quality. One of the challenges we face is incorrectly labelled data, where for example a path might be labelled as suitable for bikes when in actual fact it is not. We use multiple methods to check data, but it’s not 100%. That’s why feedback from the community is critical to ensuring we can build a reliable dataset together. If you want to take a shot at updating the data yourself then you can here. It’s actually quite fun, if you are into that kind of thing 🤓.

So back to Huli. After our checks we can bring the data into a geographic information system (GIS) to visualise it. In my case, I use the open-source QGIS tool, which produces this lovely pink blob made up of over 19 million lines.

A beautiful pink blob of over 19 million paths

If we zoom in on a certain area, like the Lakes, then we can see a bit more detail of what the map looks like. This is every path within a 10km radius of Ambleside in the Lake District, a total of 21,000 lines on the map:

10km radius of OpenStreetMap data in the Ambleside area, amounting to 21,000 different paths/roads
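For the curious, "within a 10km radius" boils down to a distance check. The Python sketch below shows one way to do that check using the haversine (great-circle) formula. It is an illustration only: the coordinates are rough values picked for the example, and in a PostGIS database this kind of filter would more typically be a spatial query (for instance with `ST_DWithin`).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
AMBLESIDE = (54.43, -2.96)
points = {
    "near_windermere": (54.38, -2.91),
    "near_keswick": (54.60, -3.13),
}

# Keep only the points that fall inside the 10 km circle.
within_10km = sorted(
    name
    for name, (lat, lon) in points.items()
    if haversine_km(AMBLESIDE[0], AMBLESIDE[1], lat, lon) <= 10.0
)
print(within_10km)
```

A real path is a line rather than a single point, so a proper filter would test the whole geometry, but the principle is the same.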

The elevation data we use is created from a satellite carrying a radar sensor, which scans the Earth and calculates the height of the surface below. Are there any shortcomings with this data? Not many, but accuracy can vary depending on the source. The dataset is one of the most accurate free resources available, and has a pixel resolution of 25m and vertical accuracy of +/- 7m. 

Say pixel what?… 🤯

Pixel resolution - The smaller the pixel size, the better the resolution. A 25m resolution means we get a height measurement every 25m. So if you split the whole UK into 25m squares, we would have a height measurement at every corner.

Something like this over Ambleside (zoomed in to actually see some map underneath rather than a black blob):

25m grid, showing the number of elevation data points available
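To get a feel for how many measurements a 25m grid gives you, here is a bit of back-of-the-envelope arithmetic in Python. It is a rough estimate that simply divides an area by the area of one 25m square, and it reuses the 10km radius from the Ambleside example above. The function names are made up for this sketch.

```python
import math

CELL_SIZE_M = 25.0     # pixel resolution of the elevation grid
RADIUS_M = 10_000.0    # 10 km radius, as in the Ambleside example


def cells_per_square_km(cell_size_m: float) -> float:
    """Number of grid squares that fit in one square kilometre."""
    return (1000.0 / cell_size_m) ** 2


def cells_in_circle(radius_m: float, cell_size_m: float) -> float:
    """Approximate number of grid squares inside a circle (area / cell area)."""
    return math.pi * radius_m ** 2 / cell_size_m ** 2


print(cells_per_square_km(CELL_SIZE_M))              # squares per km^2
print(round(cells_in_circle(RADIUS_M, CELL_SIZE_M)))  # squares in the circle
```

That works out to 1,600 squares per square kilometre, and roughly half a million inside the 10km circle.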

This means that if we have a path that goes through the middle of a square, then we have to average the 4 corners to estimate what the elevation or height is likely to be. This isn't foolproof and can introduce some errors that result in a slightly off elevation value.
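Here is a small Python sketch of that corner-averaging idea. The plain average is what the paragraph above describes. The second function, bilinear interpolation, is a common refinement that weights the corners by how close the point is to each one. It is included for comparison only and is not a claim about what Huli does. All names and numbers are made up for the example.

```python
def corner_average(z00: float, z10: float, z01: float, z11: float) -> float:
    """Plain average of the four corner heights of a grid square."""
    return (z00 + z10 + z01 + z11) / 4.0


def bilinear(z00: float, z10: float, z01: float, z11: float,
             fx: float, fy: float) -> float:
    """Distance-weighted estimate inside a grid square.

    z00 is the south-west corner, z10 south-east, z01 north-west, z11 north-east.
    fx and fy are the fractions (0 to 1) of the way across the square,
    measured east and north from the south-west corner.
    """
    south = z00 * (1 - fx) + z10 * fx   # interpolate along the southern edge
    north = z01 * (1 - fx) + z11 * fx   # interpolate along the northern edge
    return south * (1 - fy) + north * fy


# Hypothetical corner heights in metres for one 25 m square.
corners = (100.0, 110.0, 120.0, 130.0)

centre_avg = corner_average(*corners)
centre_bilinear = bilinear(*corners, fx=0.5, fy=0.5)
near_sw_corner = bilinear(*corners, fx=0.1, fy=0.1)
print(centre_avg, centre_bilinear, near_sw_corner)
```

In the exact middle of the square the two methods agree (115m here). Near a corner they don't: the plain average still says 115m, while the weighted estimate stays close to that corner's height (about 103m), which is one way errors like this creep in.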

Vertical accuracy - Every data point we get can be off by up to 7m. In the case of Ben Nevis ⛰, the data might say it is 7m taller or shorter, depending on whether it is in a good mood or bad mood 😁.

In terms of Huli, this means that when we predict your route elevation gain/loss there will be errors. That's why the elevation gain/loss recorded by an activity tracker like Garmin often differs from what we predict, as those devices rely on either atmospheric pressure or satellite signals, like GPS. A discussion for another day...
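To show how small height errors add up, here is a Python sketch that totals elevation gain and loss along a profile. It is a simplified illustration with made-up numbers, not the Huli calculation: a perfectly flat path, and the same path where each point is off by no more than 7m.

```python
from typing import List, Tuple


def gain_and_loss(elevations: List[float]) -> Tuple[float, float]:
    """Sum the climbs and the descents between consecutive points (metres)."""
    gain = 0.0
    loss = 0.0
    for prev, curr in zip(elevations, elevations[1:]):
        diff = curr - prev
        if diff > 0:
            gain += diff
        else:
            loss -= diff
    return gain, loss


# A perfectly flat path at 100 m.
true_profile = [100.0, 100.0, 100.0, 100.0, 100.0]

# The same path as a dataset might report it, each point within +/- 7 m.
noisy_profile = [100.0, 107.0, 93.0, 107.0, 100.0]

print(gain_and_loss(true_profile))
print(gain_and_loss(noisy_profile))
```

The flat path has no gain or loss, but the noisy version reports 21m of climbing and 21m of descending that isn't really there. This is an extreme case, and real tools typically smooth the profile to reduce the effect, but it shows why two sources rarely agree exactly.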

Strava, Komoot and all the others suffer from this too. Technically we should be more accurate as we use a higher-resolution dataset than they do, but the difference is marginal.

If you want better resolution then you have to look at paid sources or at freely available LIDAR data from governments. However, this is often overkill for user needs and isn’t necessarily available for your entire region of interest. A few sources of interest here for the UK:

  • Ordnance Survey provide a 5m dataset called OS Terrain 5.
  • The Scottish Government provide LIDAR data down to 25cm (yes, CM!) which you can find here. Other sources are available for different areas on the respective governmental websites.

So when we import our elevation data into QGIS, we can make cool little GIFs like the one below. In this case, 🟩 = low, 🟥 = high

Digital Elevation Model of the UK and Ireland. 🟩 = low, 🟥 = high

And if we take another look at our Ambleside area, we can see that of course there are a lot of mountain paths in the area!

Ambleside region, showing the paths from OSM and the Digital Elevation Model

I'm going to stop there for now, but hopefully that was enjoyable or provided some good material to send you off to sleep 😴. This was my first attempt at writing a more technically oriented blog post, so I'd love to hear your thoughts. If there is anything that you would like us to dig into specifically then feel free to email me at steve@huli.life, or the tech team at dev@huli.life - if you go this route, then it's likely going to be me or one of the other 2 part-time Huli members that reads it 😂. We're a small but ambitious team. 🤫 Don't tell Komoot.

Happy riding 🚲,

Steve