#osm-pbf #pbf #routing #osm #routes #xml

bin+lib osmgraphing

Playing around with graphs created via parsing OpenStreetMap data

27 releases (1 stable)

1.1.1 Oct 11, 2020
1.0.0 Jan 26, 2020
0.13.0 Apr 28, 2020
0.12.1 Mar 28, 2020
0.6.0 Sep 22, 2019

#7 in #route

Apache-2.0 and maybe GPL-3.0-or-later

8.5MB
13K SLoC

Rust 10K SLoC // 0.1% comments C++ 2K SLoC // 0.1% comments Python 1K SLoC // 0.2% comments Shell 38 SLoC // 0.2% comments BASH 35 SLoC // 0.2% comments

osmgraphing

Build Status nightly

Tag Crates.io Docs

Changelog Last commit

License

Balancing Saarland

This GIF shows, how the balancing improves the spread of 10,000 paths over the network of the German state Saarland. New paths from s to t are guaranteed being not worse than 25 % than the optimal path from s to t, with respect to travel-duration (so 55 min becomes under 1 h 10 min in the worst-case).

Welcome to the osmgraphing-repo! :) Goal of this repo is parsing openstreetmap-data to calculate traffic-routes and different related use-cases on it. This repo is involved in dealing with the analysis of selfish routing and learning metrics for balancing load in street-networks. See my master-thesis for more details. However, if a self-written parser-module does exist, every map-format supported by this module (e.g. own csv-like formats) can be used, which doesn't need to be a street-network.

All calculations will be optimized for a single desktop instead of a more expensive cluster. The current machine (August 2020) uses an AMD Ryzen 7 3700X 8-Core Processor and has 32 GB of RAM.

Reason for version above 1.0.0

This repository is my first project in Rust :). Some interface-decisions might be open for discussion (like the public helpers-module) and the graphbuilder has some quite long functions. The resources consume too much memory due to the graph- and routing-files (fresh clone around 20 MB). However, the code works very solid. Large design-decisions are described nicely in my master-thesis, which bases on this repository. This was a student-project, a master-thesis, a learning-project and fulfilled these purposes perfectly. Because of this, version 1.1.1 has been published in October 2020 without many further refactoring.

Please refer to LICENSE.md for details. Copyright-owner is Parga Cacheiro, Dominic. In short, this repository and generated binaries are licensed under the Apache-2.0-license as long as you are not using the cargo-feature gpl. Using this cargo-feature adds some code and binaries, which depend on code licensed under the GPL-3.0. Thus, the resulting binaries are licensed under the GPL-3.0.

Table of contents

  1. Reason for version above 1.0.0
  2. Copyright and License
  3. Table of contents
  4. Setup and usage
    1. Long story short
    2. Downloading and generating maps
    3. Editing the config
    4. Inlined metrics
    5. Requirements for large maps (e.g. countries)
    6. Contraction-Hierarchies
  5. Balancing
  6. Credits

Setup and usage

Please find instructions and info below.

Long story short

Execute the following for a (verbose) routing-example.

# Further execution-info
cargo run --release --bin osmgraphing -- --help

# Build the binary for parsing maps and do routing
# and parse isle-of-man.
cargo run --release --bin osmgraphing -- --config resources/isle_of_man_2020-03-14/osm.pbf.yaml --routing

You can download pbf-files from geofabrik and cast them to other formats. When editing the config, take resources/blueprint.yaml as guide.

To use the balancer, a clone of multi-ch-constructor-repo is used as submodule, written in c++. This osmgraphing-repo is wrapping the submodule as Rust-module, using configs for its execution-parameters. To get it run, please install the dependencies according to the workflow-file (working successfully in August 2020). Please note that the balancer bases on code, that is licensed under the GPL-3.0. Therefore, you have to enable features (via cargo).

# Update git-submodules, because they are used in the balancer
git submodule update --init --recursive

# Build also features licensed under the `GPL-3.0`.
# Build with GRAPH_DIM=6.
GRAPH_DIM=6 cargo run --release --features='gpl' --bin osmgraphing -- --config resources/isle_of_man_2020-03-14/balancing/config.yaml --balancing

# After finishing, you may visualize the data
# (the results-dir, excluding the utc-stamp, is specified in the config)
py ./scripts/balancing/visualizer --results-dir 'custom/results/<map-name>/utc_<date_and_time>

As mentioned above, you may find a detailled config-blueprint in resources/ and a balancing-example in resources/isle_of_man/. As defined in the config.yaml, the results can be found in custom/results/isle_of_man and may be visualized with the python-module in scripts/. The python-tool has a help-msg, but the balancer also prints the respective command after finishing.

Note: The reources/blueprint.yaml is correct and complete. Configs of the maps might be slightly broken due to missing or old keywords, because they are not used very often. Since this repository is used by just a very few people, these configs are still kept as helping example.

Overview of all features

The following table shows all features of this repository. The cargo-features are needed to build the respective feature. Some cargo-features are optional for the feature, meaning that the cargo-feature adds extra-functionality. You can build with cargo-features using cargo build --features='F0,F1,...' (cargo run builds implicitly).

cargo-feature Notes
'gpl' This feature is needed for every part of the code, that is licensed under the GPL-3.0. Even if you are using this cargo-feature, it doesn't force you to license data under the GPL-3.0, that has been created with the gpl-code.
'custom' This repository ships with small maps, like handmade maps or Isle-of-Man, but larger maps like the German state Saarland, parts of German states like Stuttgart-Regierungsbezirk or countires like Germany consume multiple 100 MB and more memory. Although, some tests are using these maps and configs may be useful, which is the reason for this cargo-feature. To get this feature working, simply download the maps, move them into the respective map-directory in resources/, and name them according to other map-directories.

Downloading and generating maps

Downloaded osm-data is provided in xml (osm) or binary (pbf), where nodes are related to location in latitude and longitude. Problems will be the size-limit when downloading from openstreetmap, but there are other osm data providers like geofabrik for instance. For now, without a parser-module for osm-data, only binary osm.pbf-data is supported.

For testing, some simple text-based format fmi is used. Since they are created manually for certain tasks, parsing them - generally speaking - is unstable. However, this repository has a generator, which can create such fmi-files from pbf- or other fmi-files (e.g. for different metric-order) based on a config.

A tool for creating fmi-map-files, containing graphs contracted via contraction-hierarchies, is multi-ch-constructor, which is used in the balancer and hence a submodule of this repo. Further, this repo has a wrapping binary multi-ch-constructor for the submodule, using a config as well.

Editing the config

Every possible option of a config is described in resources/blueprint.yaml. The binaries (osmgraphing, multi-ch-constructor) (binaries are in target/release after release-building) use the config for different use-cases.

Inlined metrics

Metrics are inlined using SmallVec. This improves performance and can save several GB of RAM. If your config uses less metrics than you have compiled to, you will receive a warning. Further, if the compiled number of inlined metrics is less than the number of your config's metrics, the graph can't be used efficiently and throws an error. In this case, you must change the number of inlined metrics according to your needs, and rebuild. The command cargo build simply becomes GRAPH_DIM=6 cargo build. The default is quite small (~4), but may change over time.

Requirements for large maps (e.g. countries)

In general, the requirements depend on the size of the parsed map (also same map of different dates) and your machine. Following numbers base on an 8-core-CPU and the pbf-maps from March 14th, 2020 running on archlinux with 16 GB RAM. Basing on the numbers below, without doing further detailled benchmarks, the memory-usage scales linearly with the graph-size with a growth-factor of 0.5. Hence you could expect around 2x RAM-usage for 4x graph-size (meaning 4x node- and 4x edge-count).

  • Parsing Germany.pbf (4 metrics, ~51 million nodes, ~106 million edges) needs around 14 GB of RAM at peak. After parsing, the memory-needs are much lower due to the optimized graph-structure.
  • Preprocessing Germany.pbf (including parsing) needs less than 4 minutes.
  • A routing query on Germany.pbf of distance around 600 km takes around 22 seconds with bidirectional Dijkstra, highly depending on the specific src-dst-pair (and its search-space). This could be improved by removing intermediate nodes (like b in a->b->c), but they are kept for now. Maybe, they are needed for precise/realistic traffic-simulation (e.g. visualization). An Astar is not used anymore, because its only purpose is reducing the search-space, which can be reduced much more using Contraction Hierarchies. Further, Astar has issues when it comes to multiple or custom metrics, because of the metrics' heuristics.

Small maps like Isle_of_Man.pbf (~50_000 nodes, ~107_000 edges) run on every machine and are parsed in less than a second.

The German state Baden-Württemberg.pbf (~9 million nodes, ~18 million edges) needs less than 5 GB RAM at peak and around 30 seconds to parse.

Contraction-Hierarchies

For speedup, this repository uses and supports graphs contracted via contraction-hierarchies by a submodule lesstat/multi-ch-constructor. This submodule generates contracted graphs from fmi-files of a certain format (see below). The submodule is called through a thin wrapper in this osmgraphing-repo building and calling multi-ch-constructor. This wrapper uses a config for building and execution, which makes reproducability easier. Keep this in mind when reading the following explanations. See resources/blueprint.yaml for detailled infos about configs.

First of all, the tool multi-ch needs an fmi-map-file of specific format as input. To generate such a fmi-map-file in the correct format, the binary osmgraphing can be used with a config following the defined requirements.

The ignoreds and placeholders (e.g. ch-level) in the config are important, because the multi-ch-constructor needs them. Besides that, the multi-ch-constructor uses node-indices as ids, leading to errors when the mapping node -> indices [0; n] is not surjective. Therefore, export the graph's edges using src-idx and dst-idx instead of srd-id and dst-id. A flag of the multi-ch-constructor (can be set in the config) allows the export of the respective node-ids (instead of their indices) when printing edges.

The multi-ch-tool needs 3 counts at the file-beginning: metric-count (dimension), node-count, edge-count. The osmgraphing-binary does add these counts in this order.

Before the multi-ch-tool can be used, it has to be built. For the sake of optimization, you have to set the metric-count as dimension, similar to the GRAPH_DIM in osmgraphing. Set this dimension in the config-file according to the dimension in the previously generated fmi-file (the c++-submodule allows this via cmake). See its README for more info.

Note that the multi-ch-constructor is not deterministic (March 12th, 2020). Using it does only speedup your queries, but due to a different resulting order in the priority, or rounding-errors, it could lead to different paths of same weight.

Balancing

See cargo run --features='gpl' --release --bin osmgraphing -- --help.

Credits

The project started in the mid of 2019 as a student project. This section honors the workers and helpers of this project, sorted by their last names.

Florian Barth
is the supervisor of the project since beginning and is always helping immediately with his experience and advice.

Dominic Parga Cacheiro
has been part of the project's first weeks when project-planning and learning Rust was on the scope. He continued the work until October 2020 and has improved and extended the simulation. The largest part of his extensions is the metric-generation of a given network to improve the spread of a provided set of source-target-pairs.

Jena Satkunarajan
has been part of the project's first weeks when project-planning and learning Rust was on the scope. He has implemented the first (and running) approach of the A*-algorithm.

Dependencies

~12–23MB
~318K SLoC