In a Postgres 9.1 database, I have a table table1 with ~1.5M rows and a column label (simplified names for the sake of this question).
There is a functional trigram-index on lower(unaccent(label)) (unaccent() has been made immutable to allow its use in the index).
The following query is quite fast:
SELECT count(*) FROM table1
WHERE (lower(unaccent(label)) like lower(unaccent('%someword%')));
count
-------
1
(1 row)
Time: 394,295 ms
But the following query is slower:
SELECT count(*) FROM table1
WHERE (lower(unaccent(label)) like lower(unaccent('%someword and some more%')));
count
-------
1
(1 row)
Time: 1405,749 ms
And adding more words is even slower, even though the search is stricter.
I tried a simple trick to run a subquery for the first word and then a query with the full search string, but (sadly) the query planner saw through my machinations:
EXPLAIN ANALYZE
SELECT * FROM (
SELECT id, title, label from table1
WHERE lower(unaccent(label)) like lower(unaccent('%someword%'))
) t1
WHERE lower(unaccent(label)) like lower(unaccent('%someword and some more%'));
Bitmap Heap Scan on table1 (cost=16216.01..16220.04 rows=1 width=212) (actual time=1824.017..1824.019 rows=1 loops=1)
Recheck Cond: ((lower(unaccent((label)::text)) ~~ '%someword%'::text) AND (lower(unaccent((label)::text)) ~~ '%someword and some more%'::text))
-> Bitmap Index Scan on table1_label_hun_gin_trgm (cost=0.00..16216.01 rows=1 width=0) (actual time=1823.900..1823.900 rows=1 loops=1)
Index Cond: ((lower(unaccent((label)::text)) ~~ '%someword%'::text) AND (lower(unaccent((label)::text)) ~~ '%someword and some more%'::text))
Total runtime: 1824.064 ms
My ultimate problem is that the search string comes from a web interface which may send quite long strings and thus be quite slow and may also constitute a DOS vector.
So my questions are:
- How to speed up the query?
- Is there a way to break it into subqueries so that it is faster?
- Maybe a later version of Postgres is better? (I tried 9.4 and it does not seem faster: still the same effect. Maybe a later version?)
- Maybe a different indexing strategy is needed?