I have a table called locations with ~70000 records which includes a geometry column of points and a spatial index as follows:
UPDATE locations SET geom = ST_GeomFromText(CONCAT( 'POINT(', latitude, ' ', longitude, ')' ), 4326);
ALTER TABLE locations MODIFY geom GEOMETRY NOT NULL;
ALTER TABLE locations ADD SPATIAL INDEX(geom);
I have another table called location_boundaries with ~350 records and a geometry column for country/region boundaries converted from a geojson string of polygons/multipolygons as follows:
UPDATE location_boundaries SET geom = ST_GeomFromGeoJSON(geo_json, 1, 4326);
ALTER TABLE location_boundaries MODIFY geom GEOMETRY NOT NULL;
ALTER TABLE location_boundaries ADD SPATIAL INDEX(geom);
I have a table called jobs and I have the following query to return all jobs within a particular region - in this case North West England (note the geojson boundry for this region was taken from the following site: https://nominatim.openstreetmap.org/search?q=North%20West%20England&countrycodes=gb&polygon_geojson=1&format=json
The goal of this query is to generate a paginated result set. However the query takes forever to execute (well over 40 mins before I had to cancel it). To filter through ~6000 jobs is far too slow.
SET @g = (SELECT geom
FROM location_boundaries
WHERE NAME = 'North West England');
SELECT DISTINCT jobs.*
FROM jobs
INNER JOIN locations
ON locations.id = jobs.location_id
WHERE St_contains(@g, locations.geom)
AND jobs.start_at <= now()
AND jobs.end_at >= now()
AND jobs.status_id = 4
ORDER BY Coalesce(IF(jobs.product_id = 4 AND jobs.refreshed_at <= UTC_TIMESTAMP(), jobs.refreshed_at, jobs.start_at)) DESC
, jobs.id DESC
LIMIT 25 offset 0
This is the explain:
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+--------- +---------+-------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+---------+---------+-------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | jobs | NULL | ALL | jobs_location_id_foreign,jobs_status_id_start_at_index,jobs_status_id_foreign | NULL | NULL | NULL | 5731 | 33.33 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | locations | NULL | eq_ref | PRIMARY,geom | PRIMARY | 4 | testdb.jobs.location_id | 1 | 5.00 | Using where; Distinct |
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+---------+---------+-------------------------+------+----------+----------------------------------------------+
Why arnt the spatial indexes being used? How can I optimize this query. Any help appreciated.
Strangely when switch the boundary to a different multipolygon say to country level as follows I can get the query to execute in 32 seconds which is still too slow.
SET @g = (SELECT geom
FROM location_boundaries
WHERE NAME = 'United Kingdom');
When you validate the geojson for Northwest england in qgis it is valid