
I have a PBF file that contains the following information about a country:

  • Nodes, each with their own longitude, latitude and properties; used to store points in a 2D space.

  • Ways, each with their own properties and connected through nodes; used to store roads and boundaries.

While this file is only 80 MB in its compressed form, it's 592 MB when uncompressed and stored in a DB.

Yeah, and that's only for one country, Belgium. Imagine storing France, Germany and Italy alongside.


Let's take a single highway for example, from Antwerp through Brussels to Charleroi. This would consist of a ton of nodes to store all the turns in the highway, but do I need all these turns? I doubt it.

Let me tell you what I want to be able to do:

  • I want to view the map at different zooming levels; major cities, minor cities and street level at least.

  • I want to be able to get routing information between two points.

  • I want to be able to compute the nearest road to my GPS location.

  • I want to search for a location by means of an index in the database.

But most importantly, the database shouldn't be too big as it will be stored on a mobile device.


So, I thought about a combination of two techniques:

  • Image tiles for viewing purposes, to work around storing/processing all the individual nodes.

  • Storing the endpoints of roads for routing information, alongside information about the road.

The problem with this is that I can't compute the nearest road to my GPS location with only this information. Imagine a bend in a highway: I can't determine that I'm on the highway with just the two endpoints. I was thinking about storing intermediate nodes between endpoints, but I think that would be very costly to generate. Also, determining the endpoints of roads that meet in a T-split is most likely not that easy, as I need to figure out whether or not to store the midpoint at the top of the T.

So, viewing is easy using image tiles, but I can't find an easy way to do routing and GPS location finding. What kind of storage technique should I be looking into? I find it a bit inconvenient that an 80 MB file turns into a database of 592 MB; I want to reduce that size as much as possible...

What can I do to make this as efficient as possible, in terms of both disk and CPU? I'm targeting a WP7...

1 Answer


It seems to me that the main issue is only including nodes that add significant information about a road.

That is, without your GPS requirement, you could store only the nodes at junctions and endings (which I think you call start/end nodes), along with the weights/costs of the roads between them.
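To make the junction-only idea concrete, here is a minimal sketch of routing over such a reduced graph using Dijkstra's algorithm, which is one common choice and not something the question prescribes. The `graph` layout, the `shortest_path` helper, and the edge costs are all assumptions for illustration; the costs are made-up numbers, not real distances.

```python
import heapq

# Hypothetical junction graph: each key is a junction/ending node and each
# value maps a neighbouring junction to the cost of the road between them.
# The costs are illustrative only.
graph = {
    "Antwerp": {"Brussels": 45.0, "Ghent": 60.0},
    "Ghent": {"Antwerp": 60.0, "Brussels": 55.0},
    "Brussels": {"Antwerp": 45.0, "Ghent": 55.0, "Charleroi": 60.0},
    "Charleroi": {"Brussels": 60.0},
}


def shortest_path(graph, start, goal):
    """Return (total_cost, path) using Dijkstra's algorithm, or None."""
    queue = [(0.0, start, [start])]  # (cost so far, node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + edge_cost, neighbour, path + [neighbour])
                )
    return None  # goal is not reachable from start


result = shortest_path(graph, "Antwerp", "Charleroi")
```

Because only junctions and endings are stored, the graph stays small no matter how many bends each road has; the bends only matter for the GPS matching discussed next.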

One way I can think of approaching this is to first add all start/end nodes. This is the minimum needed, but it doesn't account for winding roads.

Then, for every road (defined as ending to junction or junction to junction) do the following:

  1. Loop through all intermediate nodes and work out the minimum distance from each node to the road as defined by the nodes included so far (initially only the start and end).
  2. If the sum of these distances is larger than (some constant threshold * number of intermediate nodes), we need to add an intermediate node and repeat from step 1. If not, exit the loop.
    • To add an intermediate node, find the node that has the largest distance from the current representation of the road and add it.
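The steps above could be sketched roughly as follows. This is only one reading of the procedure: it treats coordinates as planar (longitude/latitude would need projecting first), it counts all intermediate nodes of the road in the threshold test, and the function names `point_segment_distance`, `distance_to_polyline` and `simplify_road` are made up for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, polyline):
    """Minimum distance from p to any segment of the polyline."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(polyline, polyline[1:])
    )


def simplify_road(nodes, threshold):
    """Keep the endpoints, then add the worst-fitting node until the
    summed error drops to threshold * (number of intermediate nodes)."""
    if len(nodes) <= 2:
        return list(nodes)
    kept = {0, len(nodes) - 1}  # indices of nodes kept so far
    intermediate = range(1, len(nodes) - 1)
    while True:
        current = [nodes[i] for i in sorted(kept)]
        distances = {
            i: distance_to_polyline(nodes[i], current)
            for i in intermediate
            if i not in kept
        }
        if not distances:  # every node is already kept
            break
        if sum(distances.values()) <= threshold * len(intermediate):
            break
        kept.add(max(distances, key=distances.get))
    return [nodes[i] for i in sorted(kept)]


straight = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
bend = [(0, 0), (1, 0), (2, 2), (3, 0), (4, 0)]
simplified_straight = simplify_road(straight, 0.5)
simplified_bend = simplify_road(bend, 0.5)
```

With these sample roads, the straight one collapses to its two endpoints, while the bend keeps the corner node. A smaller threshold keeps more nodes and so gives more accurate GPS matching at the cost of a larger database.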
George Duckett