Bringing in Towns As Geography (Geodetic) -- Continued
In the first part we covered bringing in Mass Towns data as Planar geometry, but were stuck because
we need to transform the data to a degree based projection (in particular one listed in sys.spatial_reference_systems) to use the Geography data type,
but SQL Server 2008 has no mechanism to transform that data.
Now we shall demonstrate how to transform and import the data in a supported spatial reference system.
Transforming ESRI Shape from State Plane to WGS 84 long lat
As mentioned before SQL Server 2008 has no mechanism for doing spatial transformations, so what to do?
Using OGR to Transform the data
OGR/GDAL is a free Open Source GIS toolkit that is very useful for data loading and doing spatial transformations among other things.
Its probably the easiest to use of all the tools.
- Install it as described in our OGR2OGR Cheatsheet
- Launch the FWTools Shell from Start->All Programs->FW Tools 2..->FW Tools Shell
- CD into the directory you downloaded your data in my case
cd C:\data\gisdata
- Type the following. The below will take an ESRI shape file called TOWNSSURVEY_POLY.shp that is
of Massachusetts State Plane Meters (EPSG:26986) and transforms it to WGS 84 long lat (EPSG:4326)
ogr2ogr -f "ESRI Shapefile" -a_srs "EPSG:26986" -t_srs "EPSG:4326" towns_geodetic.shp TOWNSSURVEY_POLY.shp
- Now launch Shape2SQL.exe repeat our import step except choose the new towns_geodetic.shp, but choose Geography instead and SRID 4326. Your screen should look:
Doing Queries with Geography (Geodetic)
Now we have our data imported, Launch SQLSpatial.exe as we did before and run these queries
The below fails because Geography does not support Centroid
- get error STCentroid for type STGeography not found.
SELECT TOP 1 geom.STCentroid().STAsText() FROM towns_geodetic WHERE town = 'BOSTON'
So the first thing we learn from the above exercise, is that sometimes planar is still needed since while Geography can cover a larger region,
it is lacking many of the functions available in the regular Geometry. Refer to
SQL Server 2008 PostGIS MySQL Compare
for a compare of function/method differences between SQL Server 2008 Geography and Geometry
.
Distance Searches
One important feature that Geography (Geodetic) has over Geometry is ability to
do distances in meters using spherical coordinates and spanning large regions. In fact Isaac Kunen touches
on this a little in his blog Nearest Neighbors.
In fact doing distance queries and finding neighbors is probably the number one reason why most people will
want to use spatial queries. With this one feature, one can answer questions such as
- How many low income families are within x miles from this welfare office?
- Correlation between outbreaks of cancer and location of a nuclear oil spill taking into consideration
the full area of the oil spill?
Of course questions like these could be answered before, but are answered a bit more trivially with
a spatially enabled database and are extremely difficult to answer if you are trying to find shortest distances
between 2 objects that are not points.
Note we know the distance is in meters because the spatial_reference_systems table tells us so.
SELECT unit_of_measure from sys.spatial_reference_systems
WHERE spatial_reference_id = 4326;
Most of the spatial refence systems defined in this sys table are in meters except for a few weird ones
in Clarke's foot, Survey foot, and German metre.
Here we are going to run this in SQL Server 2008 Studio since we don't have any map data to view and we want to take advantage of SQL Server 2008 Studio show plan features.
Keep in mind just as in all OGC compliant spatial databases, the STDistance function defines the minimum distance between 2 geometries. So if you are comparing a Polygon to a polygon
then its the distance between the points on each polygon that is the closest.
Below is a slightly different query from what we used in planar and can be used equally in planar. Here we arbitrarily take the first point
that defines a polygon in Boston and ask what town POLYGON/MULTIPOLYGON geometries are within 1 mile of this point and we also want to know the exact distances
and results ordered by distance.
SELECT t.town, ref.point1.STDistance(t.geom)/0.3048 As dist_ft
FROM towns_geodetic As t
INNER JOIN (
SELECT TOP 1 geom.STPointN(1) As point1
FROM towns_geodetic WHERE town = 'BOSTON') As ref
ON ref.point1.STDistance(t.geom) < 1609.344
ORDER BY ref.point1.STDistance(t.geom) ;
town dist_ft
BOSTON 0
BOSTON 140.31019135227
BOSTON 211.728831986735
DEDHAM 2616.66222586371
DEDHAM 2616.73216967261
MILTON 3501.37051762325
Now try clicking the "Include Actual Execution Plan" (or Ctrl-M for short) and hitting the Execute
for the above query.
You should see something like this which will give you a rough idea of where your processing time is going.
Shown above is a very small fragment of the plan used. From this we learn that our query is
using a spatial index (this is good), but there is a warning on it, not so good. Usually when you see a little warning like
that, it means your planner statistics are either non-existent or out of date. If you right click on it and view details, it will tell you more about the warning.
This query is already lightning fast, so we won't worry about this minor issue. In our next part, we shall delve into larger sets of data with more sophisticated queries, where
speed will be an issue and we'll want to squeeze out every inch of speed we can.
So I shall leave you with these words of wisdom as far as Query Plans go and these apply for most databases and not just spatial queries. We'll experiment with these rules of thumb in the next section.
- Scan percentages and sniff out highest percentage costs and inspect those
- Scan for lack of indexes in plans where you would expect indexes to be used. Note just because no index is used even when you have an index is not an absolute cause for concern. Sometimes it is faster
to do a table scan than an index scan.
- Scan for warning flags such as above
Post Comments About Part 2: Getting Started With SQL Server 2008 Spatial: Reproject data and More Spatial Queries