Working on modernizing the way geographic data is stored in a Laravel/MySQL app this morning, and ended up down an internet rabbit hole of Spatial Reference Systems.
TL;DR as I understand it is that unless you know you want something else, you probably want to use WGS 84 / SRID 4326 for MySQL geographic columns.
Today's part of the mission: create the migration and make it work. I want:
- A new column with the type ‘point’
- A spatial index on that column that I can use to get locations within an area.
There are some caveats in MySQL:
- point columns cannot have a default value
- columns with spatial indexes must be not null, and have an SRID set.
So the current plan is:
1. Create the column as a nullable column
2. Set the point data using the existing latitude & longitude columns
3. Alter the table so it is no longer nullable
4. Create the spatial index
So my migration is able to add the column, but the query to set the data based on the existing latitude & longitude columns is failing with this error:
3617 Latitude -90.134513 is out of range in function st_geometryfromtext. It must be within [-90.000000, 90.000000]
My query is:
UPDATE locations SET location=ST_GeometryFromText(CONCAT('POINT(',longitude,' ',latitude,')'), 4326))
It seems that the WKT for POINT(x y) might be expecting latitude to be the first value, but all the documentation I can find for WKT suggests that longitude, (the “x” value in a cartesian earth), should be first.
It seems that I need to investigate further - I don’t want all my latitudes & longitudes reversed!
Following up on this, I found this gem in the MySQL docs:
“By default, geographic coordinates (latitude, longitude) are interpreted as in the order specified by the spatial reference system of geometry arguments.”
- https://dev.mysql.com/doc/refman/8.4/en/gis-wkt-functions.html
So it seems that the WKT for a point isn’t always POINT(x y) and I need to look up the order that WGS 84 expects.
And I found a page that says:
“Consequently CRS created with the "EPSG:4326" or "WGS84" strings use the latitude first, longitude second axis order.”
- https://gdal.org/tutorials/osr_api_tut.html
I reversed the arguments, and I still get an error:
"Query 1 ERROR at Line 8: : Latitude -92.163290 is out of range in function st_geometryfromtext. It must be within [-90.000000, 90.000000].”
Let’s go looking…
SELECT * FROM location WHERE latitude LIKE '%-92.16%';
I get two results with latitudes above 90 degrees… it seems I have a data problem. These results are imported from an external data source, it seems I need to add an additional sanity check to my data importer.
Doing a sanity check here, the two locations at 92 degrees south latitude have around 34 degrees west longitude, which I think is somewhere in the Atlantic, but south of the South Pole.
Switching to 34 degrees north 92 degrees west puts them in North America, where they are expected.
I will have to check the provenance of this data, but I believe it comes from a multi-billion dollar corporation. Everyone makes mistakes, folks!
Ha! It was me that flipped the latitude/longitude when importing data.
Time to add validation so I’ll get errors if I screw this up again in the future, instead of bad data.
And today I learned there is no WKT representation of a circle. I guess I’ll use a polygon as a bumpy circle.
Also, MySQL is giving me an error when I try to pass a strategy to ST_Buffer, so there is that.
Turns out the manual says this will happen!
“If the distance is negative or any strategy (except NULL) is specified, an ER_WRONG_ARGUMENTS error occurs.”
https://dev.mysql.com/doc/refman/8.4/en/spatial-operator-functions.html#function_st-buffer
@johnbeales Can’t you do something with ip2long and its counterpart for storage? That way number comparison and searching becomes a bit easier. Or strip out the period for storage (It’s always in position 3 or 4 anyway) and re-add it when getting it out of the db.
Or… While in the db, treat the coordinates are strings instead of integers, sometimes that helps too. Though probably not for your st_geometryfromtext function.
@arnan I’m just getting back to this now!
I don’t understand what storing as a long integer get me here, my goal is to improve the performance of finding points within a certain geographical area. Currently there is are latitude and longitude columns, (stored as DOUBLE(18,5) columns, which is way too many decimal places), and we do a Haversine calculation in SQL. Am I missing something?