21

I'm pretty sure a lot of applications, critical applications, banks and so on do this on a daily basis.

The idea behind all that is:

  • all the rows must have a history
  • all links must stay coherent
  • it should be easy to make requests to get "current" columns
  • clients who have bought obsolete things should still see what they've bought even though this product is not part of the catalogue anymore

and so on.

Here's what I want to do, and I'll explain the problems I'm facing.

All my tables will have those columns:

  • id
  • id_origin
  • date of creation
  • start date of validity
  • start end of validity

And here are the ideas for CRUD operations:

  • create = insert new row with id_origin = id, date of creation=now, start date of validity=now, end date of validity=null (= means it's the current active record)
  • update =
    • read = read all the records with end date of validity==null
    • update the "current" record end date of validity=null with end date of validity=now
    • create a new one with the new values, and end date of validity=null (= means it's the current active record)
  • delete = update the "current" record end date of validity=null with end date of validity=now

So here's my problem: with many-to-many associations. Let's take an example with values:

  • Table A (id = 1, id_origin = 1, start=now, end=null)
  • Table A_B (start=now, end=null, id_A = 1, id_B = 48)
  • Table B (id = 48, id_origin = 48, start=now, end=null)

Now I want to update table A, record id=1

  • I mark record id=1 with end=now
  • I insert a new value into table A and... damn I've lost my relation A_B unless I duplicate the relation, too... this would end to a table:

  • Table A (id = 1, id_origin = 1, start=now, end=now+8mn)

  • Table A (id = 2, id_origin = 1, start=now+8mn, end=null)
  • Table A_B (start=now, end=null, id_A = 1, id_B = 48)
  • Table A_B (start=now, end=null, id_A = 2, id_B = 48)
  • Table B (id = 48, id_origin = 48, start=now, end=null)

And... well I have another problem: the relation A_B: shall I mark (id_A = 1, id_B = 48) as obsolete or not (A - id=1 is obsolete, but not B - 48)?

How to deal with this?

I have to design this on a big scale: products, partners, and so on.

What is your experience on this? How would you do (how have you done)?

-- Edit

I've found this very interesting article, but it doesn't deal properly with "cascasding obsolescence" (= what I'm asking actually)

Olivier Pons
  • 337
  • 2
  • 10

4 Answers4

5

It's not clear to me if these requirements are for auditing purposes or just simple historical reference such as with CRM and shopping carts.

Either way, consider have an main and main_archive table for each major area where this is required. "Main" will only have current / active entries whereas "main_archive" will have a copy of everything that ever goes into main. Insert / update into main_archive can be a trigger from insert / update into main. Deletes against main_archive can then run across a longer period of time, if ever.

For the referential issues such as Cust X bought Product Y, the easiest way to solve your referential concern of cust_archive -> product_archive is to never delete entries from product_archive. Generally, churn should be much lower in that table so size shouldn't be too bad of a concern.

HTH.

2

This has some overlap with functional programming; specifically the concept of immutability.

You have one table called PRODUCT and another called PRODUCTVERSION or similar. When you change a product you don't do an update, you just insert a new PRODUCTVERSION row. To get the latest, you can index the table by version number (desc), timestamp (desc), or you can have a flag (LatestVersion).

Now if you have something that references a product, you can decide which table it points to. Does it point to the PRODUCT entity (always refers to this product) or to the PRODUCTVERSION entity (only refers to this version of the product)?

It gets complicated. What if you have pictures of the product? They have to point to the version table, because they could be changed, but in many cases, they won't and you don't want to duplicate data unnecessarily. That means you need a PICTURE table and a PRODUCTVERSIONPICTURE many-to-many relationship.

1

I've implemented all the stuff from here with 4 fields that are on all my tables:

  • id
  • date_creation
  • date_validity_start
  • date_validity_end

Each time a record has to be modified, I duplicate it, mark the duplicated record as "old" = date_validity_end=NOW() and the current one as the good one date_validity_start=NOW() and date_validity_end=NULL.

The trick is about the many to many and one to many relations: it works without touching them! It's all about the queries which are more complex: to query a record in a precise date (= not now), I have for each join, and for the main table, to add those constraints:

WHERE (
  (date_validity_start<=:dateparam AND date_validity_end IS NULL)
  OR
  (date_validity_start<=:dateparam AND date_validity_start>=:dateparam)
)

So with products and attributes (many to many relation):

SELECT p.*,a.*

FROM products p

JOIN products_attributes pa
ON pa.id_product = p.id
AND (
  (pa.date_validity_start<=:dateparam AND pa.date_validity_end IS NULL)
  OR
  (pa.date_validity_start<=:dateparam AND pa.date_validity_start>=:dateparam)
)

JOIN attributes a
ON a.id = pa.id_attribute
AND (
  (a.date_validity_start<=:dateparam AND a.date_validity_end IS NULL)
  OR
  (a.date_validity_start<=:dateparam AND a.date_validity_start>=:dateparam)
)

WHERE (
  (p.date_validity_start<=:dateparam AND p.date_validity_end IS NULL)
  OR
  (p.date_validity_start<=:dateparam AND p.date_validity_start>=:dateparam)
)
Olivier Pons
  • 337
  • 2
  • 10
0

How about this? It seems simple and pretty effective for what I have done in the past. In your "history" table, use a different PK. So, your "CustomerID" field is the PK in your Customer table, but in the "history" table, your PK is "NewCustomerID". "CustomerID" becomes just another read-only field. That leaves "CustomerID" unchanged in History and all your relationships stay intact.