Printer Friendly
What is SFSQL?
One of the greatest things about Spatial Relational
Databases is that they bring GIS to a new level by allowing you to
apply the expressive SQL declarative language to the
spatial domain. With spatial relational databases, you can easily
answer questions such as what is the average household income of a
neighborhood block. What political district does X reside in. This new
animal that marries SQL with GIS is called Simple Features for SQL (SFSQL). In
essence SFSQL introduces a new set of functions and aggregate functions
to the SQL Language.
Some Common Queries
In this next section we'll go over some common queries
utilizing the data we downloaded in Part 1. Although we are using
PostGIS for this exercise, most Spatial Relational Databases such as
Oracle Spatial, ArcSDE, DB Spatial Extender, have similar syntax.
Transforming from One Coordinate System to Another
NAD 83 Meters to NAD 83 Ft
As noted in Part 1, the data we downloaded is in NAD
83 Meters MA State Plane. What if we needed the data in NAD 83 feet. To
accomplish this, we would need to transform our spatial data to the new coordinate system with the
transform function. If you look in the spatial_ref_sys table you'll
notice that srid = 2249 is the srid of the Nad 83 ft MA State Plane.
Our query would look something like this.
SELECT town, ST_Transform(geom, 2249) as
geom_nad83ft FROM towns
Getting Latitude and Longitude Centroid
In order to get the latitude and longitude of our
data, we need our coordinate reference system to be in some for of
longlat. To begin with, we first transform our data from NAD
83 Meters Massachusetts State Plane to some variant of longlat -
closest match is NAD 83 North American Datum (srid = 4269). Then
we find the centroid and then the x and y coordinates of that.
SELECT town, ST_X(ST_Centroid(ST_Transform(geom, 4269))) as longitude,
ST_Y(ST_Centroid(ST_Transform(geom, 4269))) as latitude
FROM towns
Aggregate Functions
Spatial aggregate functions are much like regular SQL aggregate functions such as AVG, SUM, COUNT in that they work with GROUP BY and HAVING predicates and collapse multiple records into a single record. If you are unfamiliar with the above terms - take a look at Summarizing data with SQL (Structured Query Language)
Extent
The extent function is an aggregate function that gives you the bounding box of a set of geometries. It is especially useful for determining the bounding rectangle of the area you are trying to map. For example if we wanted to find the bounding box of the boston area in NAD 83 feet, we would do something like this.
SELECT town, ST_Extent(ST_Transform(geom, 2249)) as the_extent
FROM towns
WHERE town = 'BOSTON'
GROUP BY town
ST_Union
The ST_Union function is an aggregate function that takes a set of geometries and unions them together to create a single geometry field. In versions prior to PostGIS 1.2, this was called geomunion and the old name was completely removed in PostGIS 2.0. For our towns data, we may have multiple records per town. To get a single geometry that represents the total region of a town, we could use the geomunion function like the example below.
select town, ST_Union(geom) as thegeom from towns group by town;
It is important to note that while the above query will give you one record per town. Our original plain vanilla of
select town, geom as thegeom from towns;
will give you multiple records per town if multiple record exist per town.
Seeing Results Visually
To get a visual sense of what all these different queries look like, you can dump out the above outputs as an ESRI shape file using the pgsql2shp tool and view it using a shape viewer such as QuantumGIS or use the query plugin tools directly to output directly from the db. .
pgsql2shp -f myshp -h myserver -u apguser -P apgpassword -g thegeom mygisdb "select town, geomunion(geom) as thegeom from towns group by town"
One caveat: the shape dumper utility can only dump out fields of type geometry. So for example to dump out a bbox type such as what is returned by the extent function, you'll need to cast the output as a geometry something like
SELECT town, ST_SetSRID(ST_Extent(ST_Transform(geom, 2249))::geometry,2249) as theextent
FROM towns
WHERE town = 'BOSTON' GROUP BY town
Distance Queries
Getting Minimal Distance using Distance function
One common question asked for of a spatial database is how far one object is from another. PostGIS like similar spatial databases has a function called distance which will tell you the minimum distance between one geometry from another.
An example of such a use is below.
The below query will tell you How far each zip in your zipcode table is from another zip 02109?. In this case we are comparing the field geom_meter which we have in NAD 83 State Plane meters. Distance is always measured in the metric of your spatial reference system.
SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(z.geom_meter,z2.geom_meter) As thedistance
FROM zipcode z, zipcode z2
WHERE z2.zipcode = '02109'
If the above geometries were polygons, then we would be getting the minimal distance from any point on each polygon to the other. For point geometries there is no distinction between a min distance and other distances.
Getting Average Distance using the Centroid function
If we are not dealing with just points, then there are all kinds of distances we could be asking for. For example If we wanted the average distance between 2 polygons, then we would want the distance between the centroids. An average distance compare would then look like.
SELECT z.zipcode As zip1, z2.zipcode As zip2, ST_Distance(ST_Centroid(z.geom_meter),ST_Centroid(z2.geom_meter)) As averagedistance
FROM zipcode z, zipcode z2
WHERE z2.zipcode = '02109'
Getting a list of objects that are within X distance from another object
An often asked question is what zipcodes are within x away from my zip?. To do that one would think that simply doing a distance < x would suffice. This would suffice except it would be slow since it is not taking advantage of PostGIS indexes and also because if the geometries you are comparing are fairly complex, the search could be exhaustive. To write the query in the most efficient way, we would include the use of the Expand function. Like so
SELECT z.zipcode
FROM zipcode z, zipcode z2
WHERE z2.zipcode = '02109' AND ST_DWithin(z2.geom_meter, z.geom,3000)
The ST_DWithin takes the bounding box of a geometry and expands it out X in all directions and then does a more exhaustive distance check on the actual geometries. So in this case we would be expanding our 02109 zip geometry out 3000 meters.
It internally uses the && operator is the interacts operator that returns true if two geometries bounding boxes intersect and false if they do not. It is a much faster especially if you have indexes on your geometries because it utilizes the bounding box index.
Post Comments About Part 2: Introduction to Spatial Queries and SFSQL with PostGIS