1

I have a table of products with custom properties for each product. So I have props::hstore column in my table. I want the user to be able to search products by key/value by first selecting the key and then entering the value with autocomplete feature. So I need two steps here:

  1. Get all the keys in props field. SELECT DISTINCT with skeys should work here, but I don't understand how to create index.
  2. Find most relavent values for fixed key (autocomplete feature). But seems like gin_trgm_ops can create index only on value not on value for fixed keys.

Also, maybe it can be a good idea to change the hstore with jsonb, but I don't see why it can be more perfomant.

Ximik
  • 155
  • 1
  • 9

2 Answers2

2

With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:

That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:

Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.

That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.

You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.

A btree index on an array is rarely useful. This is the rare case.

Related code examples:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
1

There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.

jjanes
  • 42,332
  • 3
  • 44
  • 54