5

A Ticket has the statuses:

new
in_progress
on_hold
closed

I could make ticket.status a string (on_hold) or an unique int(2). It is indexed.

Int pros: smallest size in index

Int con: low clarity for BI and evolving schema (3 compared to closed)

String pro: clear data navigation

String con: takes more space to index, gives less performance for same RAM

I imagine a string index wont take too much space over an int index if its low cardinality. Is it premature optimization to choose integer if the field has low cardinality and is not part of a compound index?

I use Postgres with SQLAlchemy, Python ORM.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Jesvin Jose
  • 307
  • 3
  • 8

1 Answers1

5

I would definitely normalize for big tables. You could use an integer or smallint column with a FK constraint to a lookup table.

For small tables either of your ideas is ok. If in doubt, stick to the guidelines of the project.

I would use neither. Instead:

For small tables: use an enum. The documentation:

An enum value occupies four bytes on disk.

For big tables: use a "char" field (1 byte) as FK to a lookup table. A good choice for very small sets of lookup values. And each value can be a mnemonic for the referenced state:

status_id status
n new
i in_progress
o on_hold
c closed

It also depends on the complete picture. You only actually save space (and gain performance) in table and indexes if it's not lost to alignment padding anyway. More:

For character types (except for "char"), collation rules may be relevant, too. See:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633