I am but a lowly code monkey, so this may be a very basic question.
For the current project, our architect has required that all tables, without exception, must have at least these four fields:
`id` INT AUTO_INCREMENT PRIMARY KEY,
`created_at` TIMESTAMP,
`modified_at` TIMESTAMP,
`status` INT COMMENT "-1=delete using cron, 0=disabled so only admins can see it, 1=active"
This last field is the one I do not understand. To me, it flies in the face of every normal kind of database design.
It breaks SELECT: every single SELECT must also check that status is 1, for every table involved in the join, and as far as I can tell that means a full table scan on status wherever no index covers it.
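To show what I mean about joins, here is a minimal sketch. It uses Python's built-in sqlite3 instead of our real MySQL setup purely so that it is self-contained, and the `users` and `posts` tables and their contents are made up for illustration. It only shows the filtering burden, not query performance.

```python
import sqlite3

# Hypothetical tables; SQLite is used only to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        status INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        title TEXT,
        status INTEGER NOT NULL DEFAULT 1
    );
    INSERT INTO users (id, name, status) VALUES (1, 'alice', 1), (2, 'bob', 0);
    INSERT INTO posts (user_id, title, status) VALUES
        (1, 'visible', 1),
        (1, 'deleted', -1),
        (2, 'by disabled user', 1);
""")

# Every table in the join needs its own status check.
rows = conn.execute("""
    SELECT u.name, p.title
    FROM users u
    JOIN posts p ON p.user_id = u.id
    WHERE u.status = 1 AND p.status = 1
""").fetchall()

# Forgetting the check on just one table leaks a row owned by a disabled user.
leaky = conn.execute("""
    SELECT u.name, p.title
    FROM users u
    JOIN posts p ON p.user_id = u.id
    WHERE p.status = 1
""").fetchall()
```

With three or four tables in a join, that is three or four extra conditions that every query author has to remember.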
It breaks UNIQUE KEY. Any time a table has a unique key, deleting an entry prevents that entry from being recreated until cron has run, which he said could take any amount of time, even "three months".
Any time I check whether, for example, a username already exists when someone is creating a new one, I must check all rows including deleted entries rather than just active data, because the unique key would prevent creation even though no valid record uses that value.
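A minimal sketch of the unique key problem, again using Python's sqlite3 rather than MySQL so that it runs on its own. The `users` table, the `username` column and the value 'alice' are all hypothetical, and the column types are SQLite approximations of the mandated fields.

```python
import sqlite3

# Hypothetical `users` table following the mandated four-field layout.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL UNIQUE,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        modified_at TEXT DEFAULT CURRENT_TIMESTAMP,
        status INTEGER NOT NULL DEFAULT 1
    )
""")

conn.execute("INSERT INTO users (username) VALUES ('alice')")

# "Delete" alice the mandated way: flag the row and wait for cron.
conn.execute("UPDATE users SET status = -1 WHERE username = 'alice'")

# No active row uses the name any more...
active = conn.execute(
    "SELECT COUNT(*) FROM users WHERE username = 'alice' AND status = 1"
).fetchone()[0]

# ...but the unique key still sees the soft-deleted row and rejects the insert.
try:
    conn.execute("INSERT INTO users (username) VALUES ('alice')")
    recreated = True
except sqlite3.IntegrityError:
    recreated = False
```

So a check against active rows says the name is free, while the insert fails anyway until cron physically removes the old row.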
It breaks ON DELETE. When something is marked "disabled" or "deleted", all its cascadable items must have that status cascaded through code: the database can't do it for us.
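Here is a small sketch of the difference, once more in Python's sqlite3 purely for self-containment (the `PRAGMA` line is an SQLite requirement, not something MySQL needs). The `users` and `posts` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        status INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        status INTEGER NOT NULL DEFAULT 1
    );
    INSERT INTO users (id) VALUES (1), (2);
    INSERT INTO posts (user_id) VALUES (1), (2);
""")

# Soft delete: the database does nothing for the child rows, so the post
# of a "deleted" user is still marked active until our code updates it.
conn.execute("UPDATE users SET status = -1 WHERE id = 1")
orphaned_active = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE user_id = 1 AND status = 1"
).fetchone()[0]

# Real delete: the cascade removes the child row with no extra code.
conn.execute("DELETE FROM users WHERE id = 2")
remaining = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE user_id = 2"
).fetchone()[0]
```

In the soft-delete case every parent/child relationship needs hand-written code to push the status down, and that code has to be kept in sync with the schema.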
It seems to violate normalization, as it gloms a few different concerns (admin vs user visibility; garbage collection) into a single field.
It is also a criminally poor choice of field name, far too ambiguous. What does it mean?
And that's just the problems I've found so far.
The main assumption here seems to be that DELETE is "expensive" and must be optimized for. But deletions on this system will be extraordinarily rare (<1/month), and the overwhelming majority of operations will be SELECTs (several/second) or INSERTs (several/min).
While looking for other questions on this, I did find "What could be causing strange query timeouts between PHP and MySQL?", which suggests that some people use these fields for at least some tables.
But all tables? Including many-to-many link tables?
The advantages he cites are that it allows things to be easily dis/enabled, and undeleted if they were deleted by accident.
Except that it doesn't. None of the connected systems are able to handle something being temporarily disabled: either it's active, or it's deleted. The deletion of any entity is typically accompanied by API calls to notify other systems of the deletion, so the entity cannot easily be resurrected just by changing the value of a field in our own DB.
So my question is - is this a standard DB architecture decision? If so, what's it called, and is it documented anywhere, so I can wrap my head around why such an apparently-broken design pattern would ever be useful?