Specification: Mapsforge Binary Map File Format

Conceptual design

The mapsforge binary map file format is designed for map rendering on devices with limited resources like mobile phones. It allows for efficient storage of geographical information (e.g. OpenStreetMap data), fast tile-based access, and filtering of map objects by zoom level.

The map file consists of several sub-files, each storing the map objects for a different zoom interval. Zoom intervals are non-overlapping groups of consecutive zoom levels. Each zoom interval is represented by a single member of the group, the so-called base zoom level.
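
As an illustration of how a reader might pick the sub-file for a requested zoom level, here is a minimal sketch. The `ZoomInterval` type, the `find_interval` helper and the sample configuration are hypothetical and not part of the specification; the sketch only assumes that each interval is described by its base, minimal and maximal zoom level.

```python
from typing import NamedTuple, Optional, Sequence


class ZoomInterval(NamedTuple):
    """Hypothetical in-memory form of one zoom interval."""
    base_zoom: int
    min_zoom: int
    max_zoom: int


def find_interval(intervals: Sequence[ZoomInterval], zoom: int) -> Optional[ZoomInterval]:
    """Return the interval whose [min_zoom, max_zoom] range contains zoom."""
    for interval in intervals:
        if interval.min_zoom <= zoom <= interval.max_zoom:
            return interval
    return None  # zoom level not covered by any sub-file


# Illustrative configuration only; real files define their own intervals.
INTERVALS = [ZoomInterval(5, 0, 7), ZoomInterval(10, 8, 11), ZoomInterval(14, 12, 21)]
```

Because the intervals do not overlap, at most one interval can match a given zoom level.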

General remarks

  • All latitude and longitude coordinates are stored in microdegrees (degrees × 10^6).
  • Numeric fields with a fixed byte size are stored with Big Endian byte order.
  • Unsigned numeric fields with a variable byte encoding are marked with VBE-U INT and stored as follows:
    • the first (most significant) bit of each byte is used for continuation info, the other seven bits for data.
    • the value of the first bit is 1 if the following byte belongs to the field, 0 otherwise.
    • each byte holds seven bits of the numeric value, starting with the least significant ones.
  • Signed numeric fields with a variable byte encoding are marked with VBE-S INT and stored as follows:
    • the first (most significant) bit of each byte is used for continuation info, the other six (last byte) or seven (all other bytes) bits for data.
    • the value of the first bit is 1 if the following byte belongs to the field, 0 otherwise.
    • each byte holds six (last byte) or seven (all other bytes) bits of the numeric value, starting with the least significant ones.
    • the second bit of the last byte indicates the sign of the number: 0 means positive, 1 means negative.
    • the numeric value is stored as a magnitude with a separate sign bit (as opposed to two's complement).
  • All strings are stored in UTF-8 as follows:
    • the length L of the UTF-8 encoded string in bytes as VBE-U INT.
    • L bytes for the UTF-8 encoding of the string.
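
The three encodings above can be sketched as follows. This is an illustrative decoder, not the reference implementation; the function names are made up for this example, and it assumes that "first bit" means the most significant bit (mask 0x80) and "second bit" the next one (mask 0x40), in line with the masks given for the file header flags.

```python
from typing import Tuple


def decode_vbe_u(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a VBE-U INT at pos; return (value, position after the field)."""
    value = 0
    shift = 0
    while buf[pos] & 0x80:                    # continuation bit set: more bytes follow
        value |= (buf[pos] & 0x7F) << shift   # seven data bits, least significant first
        shift += 7
        pos += 1
    value |= buf[pos] << shift                # last byte: continuation bit is 0
    return value, pos + 1


def decode_vbe_s(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a VBE-S INT at pos; return (value, position after the field)."""
    value = 0
    shift = 0
    while buf[pos] & 0x80:
        value |= (buf[pos] & 0x7F) << shift
        shift += 7
        pos += 1
    value |= (buf[pos] & 0x3F) << shift       # last byte carries only six data bits
    if buf[pos] & 0x40:                       # sign bit: magnitude is negated
        value = -value
    return value, pos + 1


def decode_string(buf: bytes, pos: int = 0) -> Tuple[str, int]:
    """Decode a length-prefixed UTF-8 string; return (text, position after it)."""
    length, pos = decode_vbe_u(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length
```

Each function returns the position after the field, so calls can be chained while walking through a buffer.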

File structure

For each zoom interval a so-called sub-file is created. A sub-file consists of a tile index segment and a tile data segment. The tile index segment stores a fixed-size pointer for each tile in the tile data segment. Tiles are written to the tile data segment, and their pointers to the tile index segment, row by row and, within a row, column by column. Rows and columns are given by the tile grid that covers the rectangular bounding box. Each tile in the grid consists of a tile header with meta information, followed by its payload data (POIs and ways).

To read the data of a specific tile in the sub-file, the position of the fixed-size pointer in the index can be computed from the tile coordinates. The index entry holds the offset in the sub-file where the tile data is stored. Tile coordinates are implied by the position of the entry in the tile index, so they do not need to be stored with each tile.
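
The position of an index entry could be computed along these lines. This is a sketch under several assumptions: tile numbers follow the usual slippy-map (Web Mercator) scheme, rows run from the top of the bounding box to the bottom, and the entry and signature sizes are the ones given in the tile index tables of this specification. All function names are hypothetical.

```python
import math

INDEX_ENTRY_SIZE = 5        # bytes per tile index entry
INDEX_SIGNATURE_SIZE = 16   # index signature, only present with debug information


def lon_to_tile_x(lon_deg: float, zoom: int) -> int:
    """Slippy-map tile column for a longitude, clamped to the valid range."""
    n = 1 << zoom
    return min(max(int((lon_deg + 180.0) / 360.0 * n), 0), n - 1)


def lat_to_tile_y(lat_deg: float, zoom: int) -> int:
    """Slippy-map tile row for a latitude, clamped to the valid range."""
    n = 1 << zoom
    y = (1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n
    return min(max(int(y), 0), n - 1)


def index_entry_position(bbox_micro, base_zoom, tile_x, tile_y, debug=False):
    """Byte position of a tile's index entry, relative to the sub-file start.

    bbox_micro is (minLat, minLon, maxLat, maxLon) in microdegrees.
    """
    min_lat, min_lon, max_lat, max_lon = (v / 1_000_000 for v in bbox_micro)
    left = lon_to_tile_x(min_lon, base_zoom)
    right = lon_to_tile_x(max_lon, base_zoom)
    top = lat_to_tile_y(max_lat, base_zoom)      # northern edge has the smaller row number
    bottom = lat_to_tile_y(min_lat, base_zoom)
    if not (left <= tile_x <= right and top <= tile_y <= bottom):
        raise ValueError("tile lies outside the bounding box")
    columns = right - left + 1
    entry_number = (tile_y - top) * columns + (tile_x - left)   # row-wise order
    header = INDEX_SIGNATURE_SIZE if debug else 0
    return header + entry_number * INDEX_ENTRY_SIZE
```

The same grid computation also yields the total number of index entries: `columns * (bottom - top + 1)`.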

# meta data
file header

# sub-files
for each sub-file
    # tile index segment
    tile index header
    tile index entries

    # tile data segment
    for each tile
        tile header
        for each POI
            POI data
        for each way
            way properties
            way data

File header

bytes optional name description
20 magic byte mapsforge binary OSM
4 header size size of the file header in bytes (without magic byte) as 4-byte INT
4 file version version number of the currently used binary file format as 4-byte INT
8 file size The total size of the map file in bytes
8 date of creation date in milliseconds since 01.01.1970 as 8-byte LONG
16 bounding box geo coordinates of the bounding box in microdegrees as 4*4-byte INT, in the order minLat, minLon, maxLat, maxLon
2 tile size the tile size in pixels (e.g. 256)
variable projection defines the projection used to create this file as a string
1 flags
  • 1. bit (mask 0x80): flag for existence of debug information
  • 2. bit (mask 0x40): flag for existence of the map start position field
  • 3. bit (mask 0x20): flag for existence of the start zoom level field
  • 4. bit (mask 0x10): flag for existence of the language(s) preference field
  • 5. bit (mask 0x08): flag for existence of the comment field
  • 6. bit (mask 0x04): flag for existence of the created by field
  • 7.-8. bit (mask 0x02, 0x01): reserved for future use
8 yes map start position geo coordinate in microdegrees as 2*4-byte INT, in the order lat, lon
1 yes start zoom level zoom level of the map at first load
variable yes language(s) preference The preferred language(s) for names as defined in ISO 639-1 or ISO 639-2. This field is copied from the preferred-languages option of the map writer.
variable yes comment comment as a string
variable yes created by The name of the application which created the file as a string
variable POI tags
  • amount of tags as 2-byte SHORT
  • tag names as strings
  • tag IDs are implicitly derived from the order of tag names, starting with 0
variable way tags
  • amount of tags as 2-byte SHORT
  • tag names as strings
  • tag IDs are implicitly derived from the order of tag names, starting with 0
1 amount of zoom intervals defines the amount of zoom intervals used in this file
variable zoom interval configuration
  • for each zoom interval:
    • base zoom level as BYTE
    • minimal zoom level as BYTE
    • maximal zoom level as BYTE
    • absolute start position of the sub-file as 8-byte LONG
    • size of the sub-file as 8-byte LONG
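
The fixed-size start of the file header and the flags byte could be read as sketched below. The sketch stops before the projection string, because every later field has a variable position. It assumes that INT, LONG and the 2-byte tile size are signed values; the function names and dictionary keys are made up for this example.

```python
import struct

MAGIC = b"mapsforge binary OSM"            # 20 bytes
# header size, file version, file size, creation date, 4 x bounding box, tile size
FIXED_FIELDS = struct.Struct(">iiqq4ih")   # ">" selects Big Endian without padding


def parse_fixed_header(data: bytes) -> dict:
    """Parse the magic bytes and the fixed-size fields that follow them."""
    if data[:len(MAGIC)] != MAGIC:
        raise ValueError("magic bytes not found")
    (header_size, file_version, file_size, created_ms,
     min_lat, min_lon, max_lat, max_lon,
     tile_size) = FIXED_FIELDS.unpack_from(data, len(MAGIC))
    return {
        "header_size": header_size,
        "file_version": file_version,
        "file_size": file_size,
        "created_ms": created_ms,
        "bounding_box": (min_lat, min_lon, max_lat, max_lon),  # microdegrees
        "tile_size": tile_size,
        "next_offset": len(MAGIC) + FIXED_FIELDS.size,  # projection string starts here
    }


def parse_header_flags(flags: int) -> dict:
    """Split the flags byte using the masks listed in the table above."""
    return {
        "debug": bool(flags & 0x80),
        "start_position": bool(flags & 0x40),
        "start_zoom": bool(flags & 0x20),
        "languages": bool(flags & 0x10),
        "comment": bool(flags & 0x08),
        "created_by": bool(flags & 0x04),
    }
```

The flags decide which of the optional fields follow, so a reader has to evaluate them before reading any further.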

Tile index header

bytes optional name description
16 yes index signature If the debug bit in the file header is set:
+++IndexStart+++

Tile index entry

bytes optional name description
5 index entry
  • 1. bit (mask: 0x80 00 00 00 00): flag to indicate whether the tile is completely covered by water (e.g. a tile amidst the ocean)
  • 2.-40. bit (mask: 0x7f ff ff ff ff): 39-bit offset of the tile in the sub-file as 5-byte LONG (the optional debug information and the index size are also counted; byte order is Big Endian, i.e. most significant byte first)
    If the tile is empty, offset(tile_i) = offset(tile_{i+1})

Note: to calculate the number of tile index entries, use the formulae at [http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames] to determine how many tiles are covered by the bounding box at the base zoom level of the sub-file.
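
A single index entry can be split into its two parts with the masks from the table above. The helper name is hypothetical; the sketch is only meant to show the bit layout.

```python
from typing import Tuple

WATER_MASK = 0x8000000000    # 1. bit of the 5-byte entry
OFFSET_MASK = 0x7FFFFFFFFF   # remaining 39 bits


def decode_index_entry(entry: bytes) -> Tuple[bool, int]:
    """Return (is_water, offset) for one 5-byte tile index entry."""
    if len(entry) != 5:
        raise ValueError("a tile index entry is exactly 5 bytes long")
    raw = int.from_bytes(entry, "big")   # most significant byte first
    return bool(raw & WATER_MASK), raw & OFFSET_MASK
```

A tile can then be recognised as empty by comparing its offset with the offset of the next entry.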

Tile header

bytes optional name description
32 yes tile signature If the debug bit in the file header is set:
###TileStartX,Y### where X and Y indicate the tile coordinates of the current tile; the text is always padded to 32 bytes with whitespace
variable zoom table A table indicating the number of POIs and ways in this tile for the different zoom levels covered by the enclosing sub-file. Let Z be the number of zoom levels supported by the enclosing sub-file (e.g. 6 for a sub-file that covers levels 12-17). Then the table has Z rows and 2 columns (first column: POIs, second column: ways). Each cell in the table represents the number of POIs or ways on the specific zoom level. The table is written row-wise and values are encoded as VBE-U INT.
variable first way offset offset in bytes to the first way in this tile as VBE-U INT. The counting starts at the following byte (i.e. first way offset itself is not counted).
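
Reading the zoom table and the first way offset might look like this. It is a sketch that assumes the read position is already past the optional tile signature; `read_vbe_u` and `parse_tile_header` are illustrative names, not part of the specification.

```python
from typing import List, Tuple


def read_vbe_u(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a VBE-U INT at pos; return (value, position after the field)."""
    value = 0
    shift = 0
    while buf[pos] & 0x80:
        value |= (buf[pos] & 0x7F) << shift
        shift += 7
        pos += 1
    return value | (buf[pos] << shift), pos + 1


def parse_tile_header(buf: bytes, pos: int, zoom_levels: int):
    """Return (zoom_table, first_poi_pos, first_way_pos).

    zoom_table holds one (poi_count, way_count) pair per zoom level of the
    enclosing sub-file, in the order in which the rows are stored.
    """
    zoom_table: List[Tuple[int, int]] = []
    for _ in range(zoom_levels):
        pois, pos = read_vbe_u(buf, pos)
        ways, pos = read_vbe_u(buf, pos)
        zoom_table.append((pois, ways))
    first_way_offset, pos = read_vbe_u(buf, pos)
    # The offset is counted from the byte after the offset field itself.
    return zoom_table, pos, pos + first_way_offset
```

For a sub-file that covers zoom levels 12-17, `zoom_levels` would be 6.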

POI data

bytes optional name description
32 yes POI signature If the debug bit in the file header is set:
***POIStartX*** where X defines the OSM-ID of the POI; the text is always padded to 32 bytes with whitespace
variable position geo coordinate difference to the top-left corner of the current tile in microdegrees as 2 × VBE-S INT, in the order lat-diff, lon-diff
1 special byte
  • 1.-4. bit: layer (OSM-Tag: layer=...) + 5 (to avoid negative values)
  • 5.-8. bit: amount of tags for the POI
variable tag id for each tag of the POI:
  • tag id as VBE-U INT
  • variable tag values, stored in the data type indicated by the wildcard of the tag
1 flags
  • 1. bit: flag for existence of a POI name
  • 2. bit: flag for existence of a house number
  • 3. bit: flag for existence of an elevation
  • 4.-8. bit: reserved for future use
variable yes name name of the POI as a string
variable yes house number house number of the POI as a string
variable yes elevation elevation of the POI in meters as VBE-S INT
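
The special byte and the flags byte of a POI can be unpacked as sketched below. The sketch assumes that "1. bit" is the most significant bit, as the masks in the file header flags suggest; the function names are hypothetical.

```python
from typing import Dict, Tuple


def decode_special_byte(value: int) -> Tuple[int, int]:
    """Return (layer, tag_count) from the special byte."""
    layer = ((value & 0xF0) >> 4) - 5   # upper four bits, stored with +5 bias
    tag_count = value & 0x0F            # lower four bits
    return layer, tag_count


def decode_poi_flags(value: int) -> Dict[str, bool]:
    """Tell which optional POI fields follow the flags byte."""
    return {
        "name": bool(value & 0x80),
        "house_number": bool(value & 0x40),
        "elevation": bool(value & 0x20),
    }
```

With four bits and a bias of 5, the layer can range from -5 to 10, and a POI can carry at most 15 tags.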

Way properties

bytes optional name description
32 yes way signature If the debug bit in the file header is set:
---WayStartX--- where X defines the OSM-ID of the way; the text is always padded to 32 bytes with whitespace
variable way data size number of bytes needed to encode the current way as VBE-U INT, counted from the sub tile bitmap onwards (i.e. way signature and way data size are not counted)
2 sub tile bitmap A tile on zoom level z is made up of exactly 16 sub tiles on zoom level z+2
for each sub tile (row-wise, left to right):
  • 1 bit that represents a flag whether the way is relevant for the sub tile
Special case: coastline ways must always have all 16 bits set.
1 special byte
  • 1.-4. bit: layer (OSM-Tag: layer=...) + 5 (to avoid negative values)
  • 5.-8. bit: amount of tags for the way
variable tag id for each tag of the way:
  • tag id as VBE-U INT
  • variable tag values, stored in the data type indicated by the wildcard of the tag
1 flags
  • 1. bit: flag for existence of a way name
  • 2. bit: flag for existence of a house number
  • 3. bit: flag for existence of a reference
  • 4. bit: flag for existence of a label position
  • 5. bit: flag for existence of number of way data blocks field
    • case 0: field does not exist, number of blocks is one
    • case 1: field exists, more than one block
  • 6. bit: flag indicating encoding of way coordinate blocks
    • case 0: single delta encoding
    • case 1: double delta encoding
  • 7.-8. bit: reserved for future use
variable yes name name of the way as a string
variable yes house number house number of the way as a string
variable yes reference reference of the way as a string
variable yes label position geo coordinate difference to the first way node in microdegrees as 2 × VBE-S INT, in the order lat-diff, lon-diff
variable yes number of way data blocks The amount of following way data blocks as VBE-U INT.
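
The way flags and the sub tile bitmap could be evaluated as follows. Two assumptions are made here: "1. bit" is the most significant bit of the flags byte, and the first bit of the 2-byte bitmap belongs to the top-left sub tile, with the remaining bits following row-wise. The function names are made up for this example.

```python
from typing import Dict


def decode_way_flags(value: int) -> Dict[str, bool]:
    """Split the way flags byte into its six documented flags."""
    return {
        "name": bool(value & 0x80),
        "house_number": bool(value & 0x40),
        "reference": bool(value & 0x20),
        "label_position": bool(value & 0x10),
        "has_block_count": bool(value & 0x08),   # 0 means exactly one way data block
        "double_delta": bool(value & 0x04),      # 0 means single delta encoding
    }


def way_touches_sub_tile(bitmap: int, row: int, col: int) -> bool:
    """Check the bit for the sub tile at (row, col) in the 4 x 4 grid."""
    if not (0 <= row < 4 and 0 <= col < 4):
        raise ValueError("row and col must be in the range 0..3")
    bit_index = row * 4 + col                    # row-wise, left to right
    return bool(bitmap & (0x8000 >> bit_index))
```

A coastline way, which must have all 16 bits set, would return `True` for every sub tile.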

Way data

bytes optional name description
variable number of way coordinate blocks The amount of following way coordinate blocks as VBE-U INT. An amount larger than 1 indicates a multipolygon with the first block representing the outer way coordinates and the following blocks the inner way coordinates.
variable way coordinate block for each way coordinate block:
  • amount of way nodes of this way as VBE-U INT
  • geo coordinate difference of the first way node to the top-left corner of the current tile in microdegrees as 2 × VBE-S INT, in the order lat-diff, lon-diff
  • geo coordinates of the remaining way nodes stored as differences to the previous way node in microdegrees as 2 × VBE-S INT in the order lat-diff, lon-diff using either single or double delta encoding (see below).

Coordinates in a way data block are encoded in either 'single-delta' or 'double-delta' format according to the flag in the way properties. The encoder chooses the more efficient format on a way-by-way basis, so most maps will contain examples of both types.

For single-delta encoding the lat-diff and lon-diff values describe the offset of the node compared to its predecessor.

Let x1 be the latitude of the previous way node and x2 the latitude of the current way node.
Then the lat-diff is defined as x2 - x1. The lon-diff is defined in the same way.
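
Decoding single-delta values therefore amounts to a running sum. The sketch below works on integer microdegrees for one coordinate axis; `decode_single_delta` is an illustrative name and the sample numbers are chosen for this example only.

```python
from typing import Iterable, List


def decode_single_delta(origin_micro: int, diffs: Iterable[int]) -> List[int]:
    """Turn single-delta values into absolute microdegrees.

    origin_micro is the tile corner coordinate; the first diff is relative
    to it, every following diff is relative to the previous node.
    """
    decoded = []
    current = origin_micro
    for diff in diffs:
        current += diff          # x2 = x1 + (x2 - x1)
        decoded.append(current)
    return decoded
```

For example, `decode_single_delta(52123456, [-8286, -57, 129])` yields `[52115170, 52115113, 52115242]`.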

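For double-delta encoding the lat-diff and lon-diff values describe the change of the offset compared to the offset of the previous node. The first node is stored as a plain difference to the tile corner and does not contribute an offset, so the value of the second node is effectively a plain difference as well. The following pseudocode shows how to decode coordinates encoded in this format.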

set 'previousLat' to the latitude (in degrees) of the top-left corner of the current tile
set 'previousOffset' to zero
set 'count' to zero

while there is data to be read:
    set 'encodedValue' to the next item of data (VBE-S, in microdegrees) 
    set 'lat' to 'previousLat' + 'previousOffset' + 'encodedValue' / 1,000,000
    if 'count' is greater than zero, then
        set 'previousOffset' to 'lat' - 'previousLat'
    set 'previousLat' to 'lat'
    
    'lat' contains the decoded data
    add one to 'count'


Example of decoding double-delta encoded data:
    tile origin: 52.123456, encoded values: -8286, -57, 129, -15, -129
    decoded values: 52.11517, 52.115113, 52.115185, 52.115242, 52.11517
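
The pseudocode above translates almost line by line into the following sketch. Unlike the pseudocode it stays in integer microdegrees to avoid floating-point rounding, which is a choice made for this example; `decode_double_delta` is a hypothetical name.

```python
from typing import Iterable, List


def decode_double_delta(origin_micro: int, encoded: Iterable[int]) -> List[int]:
    """Turn double-delta values into absolute microdegrees for one axis."""
    decoded = []
    previous = origin_micro      # starts at the tile corner
    previous_offset = 0
    for count, value in enumerate(encoded):
        current = previous + previous_offset + value
        if count > 0:            # the first node does not define an offset
            previous_offset = current - previous
        previous = current
        decoded.append(current)
    return decoded


# The example from the text, expressed in microdegrees.
EXAMPLE = decode_double_delta(52123456, [-8286, -57, 129, -15, -129])
```

`EXAMPLE` evaluates to `[52115170, 52115113, 52115185, 52115242, 52115170]`, which matches the decoded values listed above once divided by 1,000,000.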

Version history

Version Date Changes
1 2010-11-21 Initial release of the specification
2 2011-01-26
  • Introduced variable byte encoding for some numeric fields to reduce the file size
  • Modified some field names and descriptions for clarification
  • Offset encoding is now used on all coordinates
3 2012-03-18
  • Ways are stored as multiple segments
  • Ways can also have a house number
  • Removed obsolete data
  • Added language preference field to the header
  • Added file size field to the header
  • Added start zoom level field to the header
  • Added created by field to the header
  • Added a flag for single and double delta encoding
  • Reordered some fields
  • Removed some data type related limitations
4 2015-11-25
  • Multilingual names storage
5 2017-12-03
  • Variable tag values storage