For this exercise, we will download Massachusetts OSM data and then load it into our PostGIS spatially enabled PostgreSQL database. The OSM data contains
roads, points of interests, building footprints, administrative boundaries, addresses, and too many other things to itemize. Note that much of the data provided in OSM for Massachusetts is provided by our very
own Massachusetts Office of Geographic Information (MassGIS) as well as contributions from people like you. Now on with the show.
Loading the OSM Planet data
- These instructions assume you already have PostGIS 1.5+ installed and a spatially enabled database called osm. You can call the database anything you want or use an existing PostGIS spatial database.
If you don't have one, create one using our Getting Started with PostGIS: An Almost Idiot's Guide
- Hstore is a key value tag column data type for PostgreSQL. It is sometimes referred to as a data type for supporting schema-less designs. It will require a bit more space to load but provides more flexibility
on how you can query your OSM data and has additional information you will not find in any of the other columns.
Installing hstore is optional but you will need it if you use the --hstore flag during load. Install Hstore in your PostgreSQL database. It is located in your PostgreSQL share/contrib/hstore.sql.
If you are running PostgreSQL 9.1 or above, you can use the new extensions system to install by running the SQL statement:
CREATE EXTENSION hstore;
- Download Massachusetts osm file from CloudMade http://downloads.cloudmade.com/americas/northern_america/united_states/massachusetts.
You want to download the file called massachusetts.osm.bz2
In order to load OpenStreetMap .OSM XML files, you will need osm2pgsql which you can find out more about at http://wiki.openstreetmap.org/wiki/Osm2pgsql.
There are compiled binaries available for many Linux variants, Windows, and Mac OSX.
If you are on windows, go here http://wiki.openstreetmap.org/wiki/Osm2pgsql#Windows_XP.
If you plan to build map tiles with the data later, we recommend the HOTOSM package which installs osm2pgsql as well as MapNik and OSMOSIS. These will be useful for generating tiles.
- If you don't see a default.style file in your package, download it from the above links. For the HOTOSM install, default.style is located in the Program Files/HOSTOSM/share folder.
Copy the default.style file into the same folder as your massachusetts.osm.bz2 file.
Note: IF you plan to setup a mapping tile server with OSM data later, check out Dane Springmeyer's Mapnik tutorials: http://www.dbsgeo.com/.
- If you install the windows HOTOSM package make sure to reboot your pc as requested to get all the path variables in your system. Next at the command line cd into the folder containing your data and
run below to load the data:
osm2pgsql massachusetts.osm.bz2 -d osm -U postgres -P 5432 -S default.style --hstore
If you want to load additional states, use the --append option switch. So for example if I wanted to load neighboring states like New Hampshire, I would download New Hampshire and then
follow with this command.
osm2pgsql new_hampshire.osm.bz2 --append -d osm -U postgres -P 5432 -S default.style --hstore
- If all goes well with your install, your screen should look something like:
osm2pgsql SVN version 0.69-21289M
Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
NOTICE: table "planet_osm_point" does not exist, skipping
NOTICE: table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE: table "planet_osm_line" does not exist, skipping
NOTICE: table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE: table "planet_osm_polygon" does not exist, skipping
NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE: table "planet_osm_roads" does not exist, skipping
NOTICE: table "planet_osm_roads_tmp" does not exist, skipping
Mid: Ram, scale=100
!! You are running this on 32bit system, so at most
!! 3GB of RAM can be used. If you encounter unexpected
!! exceptions during import, you should try running in slim
!! mode using parameter -s.
Reading in file: massachusetts.osm
Processing: Node(10082k) Way(621k) Relation(2k)
Node stats: total(10082538), max(1202366398)
Way stats: total(621446), max(104206285)
Relation stats: total(2846), max(1463423)
Writing way(621k)
Writing rel(2k)
Committing transaction for planet_osm_point
Sorting data and creating indexes for planet_osm_point
Completed planet_osm_point
Committing transaction for planet_osm_line
Sorting data and creating indexes for planet_osm_line
Completed planet_osm_line
Committing transaction for planet_osm_polygon
Sorting data and creating indexes for planet_osm_polygon
Completed planet_osm_polygon
Committing transaction for planet_osm_roads
Sorting data and creating indexes for planet_osm_roads
Completed planet_osm_roads
Spot checking the tables
If your data loaded, you should see three new tables all with a column called way that holds the PostGIS geometry and another column called tags which holds the hstore key value pairs.
The way column holds the PostGIS geometry in spherical web mercator projection or if you used the reproject switch, a different projection. NOTE that while spherical mercator is good for web mapping display, it
sucks for measuring distances, area or anything that has to do with measurement. We'll talk about that later. So in your database you should see these 3 tables:
- planet_osm_point: which contains points of interest such as restaurants, hospitals, schools, supermarkets and addresses
- planet_osm_lines: contains roads and streets
- planet_osm_polygons: contains lakes, building footprints, administrative boundaries such as towns and cities
Index your hstore column
There is some data available in Hstore that is just not available in any of the columns. Some of the more commonly used tags, you will find as columns in the data.
With that said, we will index our hstore columns with these SQL commands.
CREATE INDEX idx_planet_osm_point_tags ON planet_osm_point USING gist(tags);
CREATE INDEX idx_planet_osm_polygon_tags ON planet_osm_polygon USING gist(tags);
CREATE INDEX idx_planet_osm_line_tags ON planet_osm_line USING gist(tags);
Query the data
Now for a simple query to pull all sushi places. Sadly it seems the sushi offering is not very complete:
SELECT name, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext -- tags
FROM planet_osm_point
WHERE tags @> 'cuisine=>sushi'::hstore;
-- Result --
name | pt_lonlattext
-------------+-------------------------------
Moshi Moshi | POINT(-72.6285103 42.3202165)
Mr Sushi | POINT(-71.1553199 42.4162195)
Pull all the kinds of amenities and their sources:
This you can write one of two ways. Using the hstore tag (second query) is faster since its indexed.
SELECT DISTINCT amenity, tags->'source_url' As source
FROM planet_osm_point
WHERE amenity > ''
ORDER BY amenity;
-- about twice as fast for my MA dataset --
The ? 'amenity' is an indexable hstore operation that asks if the hstore tags has a key called 'amenity'
SELECT DISTINCT tags->'amenity' As amenity, tags->'source_url' As source
FROM planet_osm_point
WHERE tags ? 'amenity'
ORDER BY tags->'amenity';
amenity | source
----------------------------+---------------------------------------------------------------------
:
bus_station |
cafe |
campsite |
: |
cinema |
City Hall | http://mass.gov/mgis/townhalls.htm
Clinic |
college |
:
library | http://mass.gov/mgis/libraries.htm
:
Post Comments About Part 1: Loading OpenStreetMap data into PostGIS: An Almost Idiot's Guide