
We are developing a platform for prepaid cards. It essentially holds data about cards, their balances, payments, and so on.

Up until now we have had a Card entity with a collection of Account entities, and each Account has an Amount that is updated on every deposit/withdrawal.

There is now a debate in the team: someone has told us that this design breaks Codd's 12 rules, and that updating the value on each payment is asking for trouble.

Is this really a problem?

If it is, how can we fix it?

Mithir

5 Answers


Yes, that's non-normalized, but occasionally non-normalized designs win out for performance reasons.

However, I would probably approach it a little differently, for safety reasons. (Disclaimer: I don't currently work in the financial sector, nor have I ever; I'm just throwing this out there.)

Have a table for posted balances on the cards. This would have a row inserted for each account, indicating the posted balance at the close of each period (day, week, month, or whatever is appropriate). Index this table by account number and date.

Use another table for holding pending transactions, which are inserted on the fly. At the close of each period, run a routine that adds the unposted transactions to the last closing balance of the account to calculate the new balance. Either mark the pending transactions as posted, or look at the dates to determine what's still pending.

This way, you have a means of calculating a card balance on demand, without having to sum up all the account history, and by putting the balance recalculation in a dedicated posting routine, you can ensure that the transaction safety of this recalculation is limited to a single place (and also limit security on the balance table so only the posting routine can write to it).

Then just keep as much historic data as necessitated by auditing, customer service, and performance requirements.
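The period-close design above can be sketched with sqlite3. All table names, column names, and the cents-as-integers convention here are my own illustrative choices, not anything from the original system:

```python
# Sketch of the posted-balance / pending-transaction design, using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posted_balance (           -- one row per account per close
    account_id INTEGER NOT NULL,
    period_end TEXT    NOT NULL,
    balance    INTEGER NOT NULL,        -- minor units (cents)
    PRIMARY KEY (account_id, period_end)
);
CREATE TABLE pending_txn (              -- inserted on the fly
    txn_id     INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL,
    txn_date   TEXT    NOT NULL,
    amount     INTEGER NOT NULL         -- positive deposit, negative payment
);
""")
conn.execute("INSERT INTO posted_balance VALUES (1, '2024-01-31', 10000)")
conn.executemany(
    "INSERT INTO pending_txn (account_id, txn_date, amount) VALUES (?, ?, ?)",
    [(1, "2024-02-01", -2500), (1, "2024-02-02", 1000)])

def balance_on_demand(account_id):
    """Last posted balance plus the transactions since that close."""
    period_end, posted = conn.execute(
        "SELECT period_end, balance FROM posted_balance"
        " WHERE account_id = ? ORDER BY period_end DESC LIMIT 1",
        (account_id,)).fetchone()
    (delta,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM pending_txn"
        " WHERE account_id = ? AND txn_date > ?",
        (account_id, period_end)).fetchone()
    return posted + delta

def post_period(account_id, period_end):
    """The posting routine: roll pending transactions into a new close."""
    prev_end, prev_bal = conn.execute(
        "SELECT period_end, balance FROM posted_balance"
        " WHERE account_id = ? ORDER BY period_end DESC LIMIT 1",
        (account_id,)).fetchone()
    (delta,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM pending_txn"
        " WHERE account_id = ? AND txn_date > ? AND txn_date <= ?",
        (account_id, prev_end, period_end)).fetchone()
    conn.execute("INSERT INTO posted_balance VALUES (?, ?, ?)",
                 (account_id, period_end, prev_bal + delta))

post_period(1, "2024-02-28")   # closes February at 10000 - 2500 + 1000
```

Because the balance lookup only scans transactions since the last close, it stays fast no matter how long the account history gets.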

db2

On the other hand, there is an issue that we run into frequently in accounting software. Paraphrased:

Do I really need to aggregate ten years of data to find out how much money is in the checking account?

The answer, of course, is no, you don't. There are a few approaches. One is storing the calculated value; I don't recommend it, because software bugs that produce incorrect stored values are very hard to track down.

A better way is what I call the log-snapshot-aggregate approach. In this approach, payments and uses are inserts, and we never update those rows. Periodically we aggregate the data over a period and insert a calculated snapshot record that represents the state of the account as of the time the snapshot became valid (usually some time before the present).

Now, this does not break Codd's rules, because over time the snapshots stop being purely derivable from the inserted payment/usage data: once we have working snapshots, we can purge ten-year-old detail without affecting our ability to compute current balances on demand.
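A minimal sketch of the log-snapshot-aggregate idea, again with sqlite3 (all names and amounts here are hypothetical):

```python
# Log-snapshot-aggregate: the log is insert-only, snapshots are insert-only,
# and the current balance is the latest snapshot plus entries after it.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE entry (                 -- insert-only payments/uses
    entry_id INTEGER PRIMARY KEY,
    card_id  INTEGER NOT NULL,
    ts       TEXT    NOT NULL,
    amount   INTEGER NOT NULL
);
CREATE TABLE snapshot (              -- periodic aggregates
    card_id  INTEGER NOT NULL,
    as_of    TEXT    NOT NULL,
    balance  INTEGER NOT NULL,
    PRIMARY KEY (card_id, as_of)
);
""")
db.executemany("INSERT INTO entry (card_id, ts, amount) VALUES (?, ?, ?)",
               [(1, "2024-01-05", 5000), (1, "2024-01-20", -1200),
                (1, "2024-02-03", -800)])

def take_snapshot(card_id, as_of):
    """Aggregate everything up to `as_of` into one snapshot row."""
    (bal,) = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM entry"
        " WHERE card_id = ? AND ts <= ?", (card_id, as_of)).fetchone()
    db.execute("INSERT INTO snapshot VALUES (?, ?, ?)", (card_id, as_of, bal))

def current_balance(card_id):
    """Latest snapshot plus entries after it."""
    row = db.execute("SELECT as_of, balance FROM snapshot"
                     " WHERE card_id = ? ORDER BY as_of DESC LIMIT 1",
                     (card_id,)).fetchone()
    as_of, bal = row if row else ("", 0)
    (delta,) = db.execute("SELECT COALESCE(SUM(amount), 0) FROM entry"
                          " WHERE card_id = ? AND ts > ?",
                          (card_id, as_of)).fetchone()
    return bal + delta

take_snapshot(1, "2024-01-31")                  # snapshot: 5000 - 1200
db.execute("DELETE FROM entry WHERE ts <= '2024-01-31'")   # purge old detail
```

Note that the purge at the end does not change the answer `current_balance` gives, which is exactly the property the approach is after.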

András Váczi
Chris Travers

For performance reasons, in most cases we must store the current balance; otherwise calculating it on the fly may eventually become prohibitively slow.

We do store precalculated running totals in our system. To guarantee that the numbers are always correct, we use constraints. The following solution is copied from my blog; it describes an inventory, which is essentially the same problem:

Calculating running totals is notoriously slow, whether you do it with a cursor or with a triangular join. It is very tempting to denormalize and store running totals in a column, especially if you select them frequently. However, as usual when you denormalize, you need to guarantee the integrity of the denormalized data. Fortunately, you can guarantee the integrity of running totals with constraints: as long as all your constraints are trusted, all your running totals are correct. This way you can also easily ensure that the current balance (the running total) is never negative, which can be very slow to enforce by other means. The following script demonstrates the technique.

CREATE TABLE Data.Inventory(InventoryID INT NOT NULL IDENTITY,
  ItemID INT NOT NULL,
  ChangeDate DATETIME NOT NULL,
  ChangeQty INT NOT NULL,
  TotalQty INT NOT NULL,
  PreviousChangeDate DATETIME NULL,
  PreviousTotalQty INT NULL,
  CONSTRAINT PK_Inventory PRIMARY KEY(ItemID, ChangeDate),
  CONSTRAINT UNQ_Inventory UNIQUE(ItemID, ChangeDate, TotalQty),
  CONSTRAINT UNQ_Inventory_Previous_Columns UNIQUE(ItemID, PreviousChangeDate, PreviousTotalQty),
  CONSTRAINT FK_Inventory_Self FOREIGN KEY(ItemID, PreviousChangeDate, PreviousTotalQty)
    REFERENCES Data.Inventory(ItemID, ChangeDate, TotalQty),
  CONSTRAINT CHK_Inventory_Valid_TotalQty CHECK(TotalQty >= 0 AND (TotalQty = COALESCE(PreviousTotalQty, 0) + ChangeQty)),
  CONSTRAINT CHK_Inventory_Valid_Dates_Sequence CHECK(PreviousChangeDate < ChangeDate),
  CONSTRAINT CHK_Inventory_Valid_Previous_Columns CHECK((PreviousChangeDate IS NULL AND PreviousTotalQty IS NULL)
            OR (PreviousChangeDate IS NOT NULL AND PreviousTotalQty IS NOT NULL))
);
GO
-- beginning of inventory for item 1
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
VALUES(1, '20090101', 10, 10, NULL, NULL);
-- cannot begin the inventory for the second time for the same item 1
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
VALUES(1, '20090102', 10, 10, NULL, NULL);

Msg 2627, Level 14, State 1, Line 10
Violation of UNIQUE KEY constraint 'UNQ_Inventory_Previous_Columns'. Cannot insert duplicate key in object 'Data.Inventory'.
The statement has been terminated.

-- add more
DECLARE @ChangeQty INT;
SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090103', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

SET @ChangeQty = 3;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090104', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

SET @ChangeQty = -4;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090105', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

-- try to violate chronological order

SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20081231', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

Msg 547, Level 16, State 0, Line 4
The INSERT statement conflicted with the CHECK constraint "CHK_Inventory_Valid_Dates_Sequence". The conflict occurred in database "Test", table "Data.Inventory".
The statement has been terminated.


SELECT ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;

ChangeDate              ChangeQty   TotalQty    PreviousChangeDate      PreviousTotalQty
----------------------- ----------- ----------- ----------------------- ----------------
2009-01-01 00:00:00.000 10          10          NULL                    NULL
2009-01-03 00:00:00.000 5           15          2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3           18          2009-01-03 00:00:00.000 15
2009-01-05 00:00:00.000 -4          14          2009-01-04 00:00:00.000 18


-- try to change a single row, all updates must fail
UPDATE Data.Inventory SET ChangeQty = ChangeQty + 2 WHERE InventoryID = 3;
UPDATE Data.Inventory SET TotalQty = TotalQty + 2 WHERE InventoryID = 3;
-- try to delete not the last row, all deletes must fail
DELETE FROM Data.Inventory WHERE InventoryID = 1;
DELETE FROM Data.Inventory WHERE InventoryID = 3;

-- the right way to update

DECLARE @IncreaseQty INT;
SET @IncreaseQty = 2;
UPDATE Data.Inventory SET ChangeQty = ChangeQty + CASE WHEN ItemID = 1 AND ChangeDate = '20090103' THEN @IncreaseQty ELSE 0 END,
  TotalQty = TotalQty + @IncreaseQty,
  PreviousTotalQty = PreviousTotalQty + CASE WHEN ItemID = 1 AND ChangeDate = '20090103' THEN 0 ELSE @IncreaseQty END
WHERE ItemID = 1 AND ChangeDate >= '20090103';

SELECT ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;

ChangeDate              ChangeQty   TotalQty    PreviousChangeDate      PreviousTotalQty
----------------------- ----------- ----------- ----------------------- ----------------
2009-01-01 00:00:00.000 10          10          NULL                    NULL
2009-01-03 00:00:00.000 7           17          2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3           20          2009-01-03 00:00:00.000 17
2009-01-05 00:00:00.000 -4          16          2009-01-04 00:00:00.000 20
A-K

This is a very good question.

Assuming that you have a transactions table that stores each debit/credit, there is nothing wrong with your design. In fact, I have worked with prepaid telco systems that work in exactly this way.

The main thing you need to do is ensure that you do a SELECT ... FOR UPDATE on the balance row while you INSERT the debit/credit. This guarantees the balance remains correct if something goes wrong, because the entire transaction is rolled back.

As others have pointed out, you'll also want snapshots of balances at specific points in time, so you can verify that all of the transactions in a given period sum to the period's start/end balances. Write a batch job that runs at midnight at the period end (month/week/day) to do this.
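SQLite, used in the sketch below, has no SELECT ... FOR UPDATE, so BEGIN IMMEDIATE takes the write lock up front and plays the same role; in PostgreSQL or MySQL the read inside the transaction would be SELECT balance ... FOR UPDATE. Schema and amounts are illustrative:

```python
# The point: the inserted debit/credit row and the balance update commit
# or roll back together, so the stored balance can never drift.
import sqlite3

db = sqlite3.connect(":memory:", isolation_level=None)  # manual txn control
db.executescript("""
CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE txn (txn_id INTEGER PRIMARY KEY, account_id INTEGER, amount INTEGER);
INSERT INTO account VALUES (1, 10000);
""")

def apply_debit_credit(account_id, amount):
    db.execute("BEGIN IMMEDIATE")   # lock before reading the balance
    try:
        (balance,) = db.execute(
            "SELECT balance FROM account WHERE account_id = ?",
            (account_id,)).fetchone()
        if balance + amount < 0:
            raise ValueError("insufficient funds")
        db.execute("INSERT INTO txn (account_id, amount) VALUES (?, ?)",
                   (account_id, amount))
        db.execute("UPDATE account SET balance = ? WHERE account_id = ?",
                   (balance + amount, account_id))
        db.execute("COMMIT")        # txn row and new balance land together
    except Exception:
        db.execute("ROLLBACK")      # ... and are undone together
        raise

apply_debit_credit(1, -2500)        # balance drops to 7500
try:
    apply_debit_credit(1, -999999)  # rejected; nothing is written
except ValueError:
    pass
```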

Philᵀᴹ

The balance is a computed amount based on certain business rules, so yes: you do not want to store the balance, but rather compute it from the transactions on the card, and therefore on the account.

You want to keep track of all transactions on the card anyway, for auditing and statement reporting, and possibly for feeding other systems later on.

Bottom line: compute any values that need to be computed as and when you need them.
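As a minimal illustration of computing on demand (schema and names are made up), the balance is never a stored column, just a sum over the transaction log:

```python
# Balance computed on demand as SUM over the card's transactions.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE card_txn (card_id INTEGER, amount INTEGER)")
db.executemany("INSERT INTO card_txn VALUES (?, ?)",
               [(1, 5000), (1, -1500), (1, -250), (2, 2000)])

def balance(card_id):
    """The balance is derived, never stored."""
    (total,) = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM card_txn WHERE card_id = ?",
        (card_id,)).fetchone()
    return total
```

As the other answers note, this stays viable only while the transaction history is small enough to sum quickly.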