3

I've recently joined a project that uses a database model which is, to me at least, unconventional. Every column, except the primary key, is nullable. Instead, the constraint is enforced in the code used to access the database.

The reasoning of the developers is that it's more flexible to do so, and since the REST API exposing this data is supposed to be the single point of data access, there is no risk of corrupt data entering the database.

It's something I hadn't heard of before, and I didn't manage to find any resources championing it. I can appreciate their reasoning and I can't really poke any holes in it either, but my gut feeling says this method is absolutely wrong.

So, does it make sense to place constraints in the code instead of in the database? The database product of choice is SQL Server.

Stijn
  • 361

3 Answers3

9

Of course it's more flexible to allow everything. Arguing against this is a losing game. The point of constraints, types, information hiding etc. etc. is that we've slowly and painfully realized that total flexibility isn't good for us. We're too error-prone to assume that much responsibility, and we're too bad about judging our own abilities to make the right trade-off ourselves.

In this case, a database with nullable columns cannot guarantee a lot of the things which constitute the advantage of having a relational database in the first place. REST is a nice way of accessing things, but it simply can't deliver the kind of assurances (like atomicity, durability etc.) that people assume from using databases. Maybe all that is of no importance in your particular system, but I really doubt it.

Kilian Foth
  • 110,899
3

This line of reasoning is valid if you never ever have mistakes or bugs in the code.

You only need a single bug in the application code to cause a NULL to be saved in the database. Even if this bug is quickly discovered or fixed, the NULL remains in the database and will cause other parts of the code to fail in obscure ways, since the rest of the code is written with the expectation that this field is never NULL. You can't really get rid of the NULL again, since you don't know what the correct value was supposed to be. So now you have to rewrite all the code to handle the special case where this field is NULL.

Obviously there could also be mistakes in the schema/constraints design, but this would only lead to data corruption in the unlucky case where there is a bug in the schema and a matching bug in the application code. So you are still much better protected. In any case, application logic is typically much more complex and faster changing that the database schema, and hence more error prone.

Of course you should still validate and enforce the business constraints at the application level. I'm just pointing out the risk of only doing it at the application level and not have the database reject invalid data.

JacquesB
  • 61,955
  • 21
  • 135
  • 189
2

The reason that constraints in the database are usually the best place to start, is because of concurrency.

If your application supports more than a single user, or is a web application where you can have multiple people updating data, or a single person with multiple browser tabs open, then constraints at the application level are subject to race conditions: Multiple people could be updated the same record at the same time. Even pre-checks at the application level require milliseconds to occur, and in that small span of time two people could commit their changes at the same time, thus bypassing your constraints at the application level.

Databases have long ago solved race conditions affecting inserting, updating and deleting the same data, which they call concurrency.

Let's also not forget that data can end up in the database without going through your application. A DBA can log in and run a SQL script.

We put constraints in the database simply because we don't have to program around concurrency and race conditions in the application tier, and data can be inserted or updated without going through your application.