In a columnar database, are indexes unnecessary or useless?
2 Answers
If you want to scan a subset of all data stored you need some kind of indexing structure. It does not have to be a B-tree. In a certain sense, partitioning or segment elimination is also a kind of indexing.
There is nothing intrinsically preventing a columnar database to be stored in a sorted (or roughly sorted way). This would allow seeks to happen.
That said, not all columnar databases provide the ability to scan a subset of the data stored selected by some predicate. This is an implementation issue, not a limitation of the columnar model.
- 7,390
- 5
- 33
- 58
As is mentioned in another response, the answer to your exact question is implementation-specific. Different database vendors will likely give different answers to it.
I'm not sufficiently familiar with other column-based database engines, but in DB2 with BLU acceleration you would still have indexes covering primary keys and unique constraints of column-organized tables, because these indexes are used for the enforcement of uniqueness. There are also some new data structures that, although they differ from traditional B-tree indexes, play a role in optimizing data access by allowing random rather than sequential access to them, thus being functionally similar to indexes.
On the other hand, in column-organized tables the physical organization of data is optimized towards efficient sequential access, so you probably won't need as many indexes in a columnar database as you would in a traditional row-organized database.
- 28,207
- 24
- 60
- 76