4

I have a stored procedure which accepts various varchar parameters. The middle tier code which calls the procedure has not been consistent in terms of the values it submits. For example, sometimes a parameter like "Transport Description" will be submitted as NULL and sometimes it will be an empty string.

In terms of the client, an empty string simply means a value was not entered.

Should I convert all the empty varchar parameters to NULL before inserting the data?

I can't decide if this falls into the realm of a best-practice to preserve the integrity of the data or a bad idea because I'm not representing what was actually submitted to the database.

I should note that all inserts and updates are tracked in an audit log.

8kb
  • 2,639
  • 2
  • 32
  • 36

4 Answers4

8

As a DBA, your sacred duty is to take whatever data was entrusted to you, and leave it as is.

A NULL in a field usually means "DATA was not entered" or "DATA is not applicable" (but can have other meanings - it is context specific), and may actually MEAN something to the business layer. An empty string may actually be what the user entered and WANTS. Of course, I am generalizing - possibly the reason you get either NULLs or '' (empty strings) is due to some sloppy coding in the business layer (let us hope not, though).

Either way, have a chat with whoever provides the Middle Tier and see what they are using NULL vs Empty String for - and then go from there. Maybe they are thinking that they always pass in an Empty String (or always NULLS) and are not aware of the randomness (in which case advise them on what you think should be the standard), or maybe NULL really is information to them

7

This is where theory meets practice: it's a big grey area

Thoughts:

  • NULL goes into the record NULL bitmap which always exists. An empty string takes 2 bytes, so NULL is more efficiently stored
  • IS NULL checks use the NULL bitmap = an optmisation
  • When indexed, a NULL adds a NULL bitmap to each index entry = larger index. Empty string won't (still 2 bytes though).
  • Consistency is king
  • (MyCol IS NULL OR MyCol = @MyCol) is now quite optimised in SQL Server

Personally, I'd use NULLs and avoid empty strings unless there is a good reason for doing so. Mainly for sanity and consistency. Bollocks to theory.

gbn
  • 70,237
  • 8
  • 167
  • 244
2

Since an empty string has no meaning of its own in your case, and you have already sufficiently communicated with the developers who provide the inconsistent data, then I think you can safely convert that string to NULL in this case.

Of course, as others pointed out, you cannot make this a general rule.

Lukas Eder
  • 842
  • 6
  • 19
0

If this is a procedure used by business in general then leave it as is communicate to the development team the need to clean up their dirty data.

If this was created to support a specific project, then talk with the project team and determine how they want the data stored. The result should be to have them clean up their data but there may be a reason they are unable to.

Chad
  • 393
  • 4
  • 8