6

Me and a couple of colleagues are having a discussion regarding the following case:

In an OrderStatus table we're keeping track of all the statuses an order goes through in time, including "Pending", "Available", "Returned", etc. There's a timestamp field in the table to keep track of when each status was added. However, there are two statuses that may be added exactly at the same time, one after the other, in which case both may have exactly the same timestamp (including milliseconds). In another part of the code, we need to check which is the last status an order was set to, and since we're ordering by the timestamp we can easily get the wrong record from the database.

The table has an IDENTITY(1,1) primary key, and my colleagues think we should order using that column, since that will always necessarily give you the latest record added to the table.

I feel like an IDENTITY column should be voided of any meaning and thus should not be used for any sorting, and would rather do a Thread.Sleep(1) to make sure the timestamps are different and we can keep sorting using that field.

What do you think is the best practice in this case?

erictrigo
  • 315

4 Answers4

5

Use numeric status values where the higher value represents the higher state. Add that to your index in descending order. Then pull back the result ordered by time stamp and the status in descending order. This way the entry with the farthest progression is always returned.

GrandmasterB
  • 39,412
1

It sounds like an event sourcing approach would be helpful here

http://martinfowler.com/eaaDev/EventSourcing.html

essentially, each time you update the object, you load it, add the event to the history and save everything.

When you save the object, you check to see that nothing else has updated the history in the meantime. If they have, you rebuild the object from the new history and reattempt adding your event

Alternatively you can lock the object to cause other events to wait for you to finish.

This is superior to things like auto-inc columns because it ensures that the events are processed in order, not just saved in an order.

In your particular case however, it seems to me that your status is a function of the events which have happened. Instead of having the status read from the last event make it a calculation based on all the events which have occured.

In this case, you don't really mind which event happened first, or if two happened at the same time.

ie

say you have 'order picked' and 'customer credit check passed'

conceptually both of these can happen at the same time, but the status might be

'not picked' - no picked event

'pending credit check' - picked event but no credit check event

'ready for delivery' - both picked AND credit checked

Ewan
  • 83,178
1

A better approach would be two keep two tables, current and history. The current table would always be the latest update. The other table would be a history table, which would always be just inserts and reads. If you want to get the latest record, query the current table. If your not allowing any dirty reads it will always be the latest record. If you need the history, query the history table and sort by the PK.

For the current table you want to keep a timestamp so if someone updates the record after your read it you can handle the conflict appropriately before making the update. For the history table you are always inserting so no check is necessary. You could handle the history inserts with a trigger off the current table if desired.

In your example:

  • ORDER (CURRENT)
  • ORDER_HISTORY (HISTORY)

That would make the logic straight forward.

Jon Raynor
  • 11,773
1

Your colleague is right. Order by the primary key. The primary key is not devoid of meaning. It identifies each row in the table. It is numeric, which implies the order in which records were created.

It doesn't fit perfectly into the terminology of your business domain, but don't overlook the simple and obvious solution for ideological reasons, especially when this solution requires little more than a sort clause in a SQL query.