I've been working with MySQL for probably as long as I've been databasing things, and I've done work with latitudes/longitudes and utilized the google maps API. I've done distance calculations using google's directions API. I've also plotted regions and determined (via perl) if a point was within the region. But what I haven't done is have a large enough collection of points to see a need to perform geo-spatial calculations from within the database. Recent devel work has changed that -- I cannot continue to merrily calculate these items on-the-fly against all my data points. I need to utilize the database to return a reasonable number of results for a given query that is location-aware. -- When working with a geographical database comprised of hundreds or thousands (or more!) of locations, it is obviously not possible to pull out every point just to parse it with your application code to see which points are within a variable distance of some other point !!! Luckily, it appears MySQL has already recognized this, and they've been putting geo-spatial math into the system. But from what I can tell, really useful calculations like DISTANCE won't be available until v5.6. And I don't know anything about MySQL release cycles, but I do know that I'm not running v5.6, and I need a solution now. After figuring out the parts I needed, I determined my best solution would be to cobble together the parts of MySQL that already work, and add a function to mysql to fill in the gap. I'd rather do this and wait on MySQL to deliver the goods than to learn PostGIS -- I see no compelling reason to patch in a secondary database just for geo-spatial math when I expect MySQL will support these functions in a short enough time. Something that already works in MySQL is a function to determine if a point is within a polygon. Here's an example: I defined my neighborhood (the Chatham Crescent neighborhood in Savannah). My definition of the neighborhood isn't actually the neighborhood; it is based on the definition of the Ardsley Park / Chatham Crescent National Historic District. My example will take this area (defined as a polygon using points comprised of latitude/longitude), and be able to determine if some variable point lies within it. Let's start by defining a table to hold polygons: CREATE TABLE IF NOT EXISTS `spatial_polygon` ( `id` int(5) NOT NULL AUTO_INCREMENT, `name` varchar(50) NOT NULL, `polygon` polygon NOT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM; Our next step is to put a polygon into our table. As mentioned, I'll use the AP/CC Neighborhood: INSERT INTO `spatial_polygon` ( `name` , `polygon` ) VALUES ( 'APCCNA', GEOMFROMTEXT( 'POLYGON((32.04089105585838 -81.09558724013324, 32.03990299368573 -81.09502367722654, 32.04044475451813 -81.09398240188207, 32.0396079596504 -81.09052033019061, 32.03921142209595 -81.08897965830977, 32.04077145960087 -81.08814240859529, 32.04182523829949 -81.08756114160242, 32.04508988212888 -81.08636083766568, 32.04827718779046 -81.08524011625887, 32.0525159736407 -81.10180716057883, 32.05245421405966 -81.10209796127137, 32.04355913374913 -81.10559813252672, 32.04089105585838 -81.09558724013324))' ) ); Notice the double parentheses on the POLYGON. This tripped me up -- it's because you can put a hole in your polygon -- it takes multiple arguments. Now lets define another table to contain points. I'm purposely not defining a point for the lat/lon pairs -- I'm leaving the lat/lon in the database as separate float values. Like I said, I'm cobbling together a solution for multiple needs, and I think it'll be more efficient for me to leave them as is because most of my math will be the distance calculations described after this example. CREATE TABLE IF NOT EXISTS `geo_loc` ( `address` longtext NOT NULL, `lat` float(11,8) NOT NULL, `lng` float(11,8) NOT NULL, `index` int(5) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`index`) ) ENGINE=MyISAM; And let's put in some sample points: INSERT INTO `geo_loc` (`address`, `lat`, `lng`) VALUES ('ChaseFox', 32.04095078, -81.08921051), ('Gainesville, GA', 34.29787827, -83.82406616), ('Toccoa, GA', 34.57732010, -83.33238220), ('Loris, SC', 34.05628204, -78.89030457); Great! That was the easy part! But thanks to MySQL, the next part is also pretty easy. Here's the query that'll return the `geo_loc` points that lie within the "APCCNA" polygon. The following query isn't constrained on the polygon table, but you can easily append: AND `spatial_polygon`.`name` = "APCCNA" to the where clause.SELECT `geo_loc`.`address`, `spatial_polygon`.`name` FROM `geo_loc` JOIN `spatial_polygon` WHERE CONTAINS( `spatial_polygon`.`polygon` , GEOMFROMTEXT(CONCAT('POINT(' , `geo_loc`.`lat` , ' ',`geo_loc`.`lng`,')') )) ....sure, it gets a little nasty at the end there with the CONCAT and all. But that's because I was using floats. It should be effective, though! Now it gets a bit harder because we're stepping outside the MySQL box. I read a lot about complex math like the Haversine Formula, and I found the following snippet online somewhere. I wish I could give credit, but lost the link somehow. This MySQL code defines the function that'll do our grunt work for DISTANCE calculations. DELIMITER // DROP FUNCTION IF EXISTS km_from_deg // CREATE FUNCTION km_from_deg (latA DOUBLE, longA DOUBLE, latB DOUBLE, longB DOUBLE) RETURNS DOUBLE BEGIN DECLARE tmp DOUBLE; SET tmp = COS(RADIANS(longA - longB)) * COS(RADIANS(latB))*COS(RADIANS(latA)) + SIN(RADIANS(latB))*SIN(RADIANS(latA)); IF tmp > 1 THEN set tmp = 1; ELSEIF tmp < -1 THEN set tmp = -1; END IF; return 6372 * ACOS(tmp); END; // DELIMITER ; So now let's put it to good use. Let's say that given a point, we want grab all locations within 305 kilometers from it. Here's an example using one of the same tables we defined earlier: SELECT `geo_loc`.`address`, km_from_deg(`geo_loc`.`lat`, `geo_loc`.`lng`, 32.04, -81.09) AS `d` FROM `geo_loc` WHERE km_from_deg(`geo_loc`.`lat`, `geo_loc`.`lng`, 32.04, -81.09) < 305 ORDER BY `d` ASC AWESOME! I used this great link to check the math: http://www.movable-type.co.uk/scripts/latlong.html I found this slide series interesting, and might look into the speed increase it references at some point: http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL |
MySQL >