2

A textbook I am teaching from about ASP.Net using ADO.Net in C# says:

Note that concurrency checking isn't necessary here, because a concurrency error can't occur for an insert operation.

The textbook emphasizes using optimistic concurrency (as extensively described in the Microsoft online documentation): Update and Delete statements include a Where clause that supplies all the original values, and the number of rows affected should be one. If the original row is not found, no exception is thrown; the statement simply reports zero rows affected.
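The pattern the textbook describes can be sketched outside ADO.NET. Here is a minimal Python/SQLite illustration (the table and column names are invented for the example): the UPDATE's WHERE clause repeats every original value, and the caller checks the affected-row count instead of catching an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 'Widget', 9.99)")

# Read the row and remember every original value for the concurrency check.
orig_id, orig_name, orig_price = conn.execute(
    "SELECT id, name, price FROM products WHERE id = 1").fetchone()

# Optimistic update: the WHERE clause repeats ALL the original values.
rc1 = conn.execute(
    "UPDATE products SET price = ? WHERE id = ? AND name = ? AND price = ?",
    (12.50, orig_id, orig_name, orig_price)).rowcount
# rc1 == 1: nobody touched the row since we read it.

# Simulate another user changing the row, then retry with our stale originals.
conn.execute("UPDATE products SET price = 11.00 WHERE id = 1")
rc2 = conn.execute(
    "UPDATE products SET price = ? WHERE id = ? AND name = ? AND price = ?",
    (12.50, orig_id, orig_name, orig_price)).rowcount
# rc2 == 0: no exception is thrown; the caller must check the row count.
```

No exception fires in either case; detecting the conflict is entirely the caller's responsibility, which is exactly the behaviour the question is asking about.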

But for Inserts, normally a duplicate row exception would be thrown on unique index values. So the Insert throws an exception and the Update / Delete does not. But it still amounts to "concurrent changes to the data" no matter how they might have come about. It could even be the same user attempting to add the same data again.

It seems to me that an update or delete that affects zero rows ought to throw an exception. But equally, the Insert could simply return zero rows affected. Regardless, I do not see why Inserts cannot be considered as a source of concurrency errors. Is this just the Microsoft take on these terms?

3 Answers

3

There are several statements in your question that are not correct; once clarified, they should answer your question:

But for Inserts, normally a duplicate row exception would be thrown on unique index values. So the Insert throws an exception and the Update / Delete does not. But it still amounts to "concurrent changes to the data" no matter how they might have come about

No. An insert that generates a unique key exception is unrelated to concurrent changes. You can try to insert that row one year later and it will still throw the same exception.

It seems to me that an update or delete that affects zero rows ought to throw an exception

No, why? An update or delete query can affect zero to all of the records of a table, depending on your where clause. Affecting zero rows is not an error, it's just that no rows satisfied your where clause.

But equally, the Insert could simply return zero rows affected.

No. An insert is an operation that inserts one row (no more, no less). If the operation cannot be executed, it's an error. Note that several different errors could occur (trying to insert a duplicate value into a unique key, or providing null for a non-nullable column, for example).
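Both failure modes mentioned above surface as exceptions rather than as a zero row count. A small Python/SQLite sketch (hypothetical table, standing in for the ADO.NET case) makes this concrete:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")

errors = []

# Duplicate value in a unique key: an error whether it happens now or a year later.
try:
    conn.execute("INSERT INTO users VALUES (2, 'a@example.com')")
except sqlite3.IntegrityError:
    errors.append("duplicate unique value")

# Null in a non-nullable column: also an error, and nothing to do with concurrency.
try:
    conn.execute("INSERT INTO users VALUES (3, NULL)")
except sqlite3.IntegrityError:
    errors.append("null in NOT NULL column")
```

Neither error depends on timing: the same statements fail the same way whether run a millisecond or a year after the first insert.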

Regardless, I do not see why Inserts cannot be considered as a source of concurrency errors

As I mentioned before, errors during inserts are unrelated to concurrency. You can insert a row with key 100, try to do it again an hour later, and it will fail. No concurrency is involved whatsoever.

On the other hand, to do an update or delete with an optimistic concurrency check, you read a row and submit the update or delete with the original values in the where clause. The only reason the update or delete would not modify any row is that "someone" modified or deleted that row after you read it. In other words, while you were (concurrently) doing your read/update or read/delete, someone else changed that row. That is why this pattern detects concurrency issues. If the other process modifies the data before you read it or after you write it, your update or delete will affect the expected number of rows. But if it modifies the row between your read and your write, the operation will affect zero rows.
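The read/modify/write race this answer describes can be reproduced with two database connections standing in for two users. This is a Python/SQLite sketch (connection and table names invented), not ADO.NET, but the detection mechanism is the same: the loser's WHERE clause no longer matches anything.

```python
import sqlite3

# Two connections to one shared in-memory database stand in for two sessions.
uri = "file:ocdemo?mode=memory&cache=shared"
alice = sqlite3.connect(uri, uri=True)
bob = sqlite3.connect(uri, uri=True)

alice.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
alice.execute("INSERT INTO accounts VALUES (1, 100)")
alice.commit()

# Both sessions read the same row and remember the original balance.
a_bal = alice.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
b_bal = bob.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

# Bob writes first, repeating his original value in the WHERE clause.
rc_bob = bob.execute(
    "UPDATE accounts SET balance = ? WHERE id = 1 AND balance = ?",
    (b_bal - 30, b_bal)).rowcount
bob.commit()

# Alice's optimistic update now matches nothing: the conflict is detected.
rc_alice = alice.execute(
    "UPDATE accounts SET balance = ? WHERE id = 1 AND balance = ?",
    (a_bal - 50, a_bal)).rowcount
alice.commit()
# rc_bob == 1, rc_alice == 0 -- Alice must re-read and retry, or report a conflict.
```

Had Bob committed before Alice's read, or after her write, both updates would have affected one row each; only the overlap of the two read-then-write windows produces the zero count.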

1

I don't think a return of zero rows is necessarily an exception or an error in any case of insert, update, delete or select. I've never had a need to treat it that way.

In the case of cleaning up data, I may want to update a field wherever it is null; if it turns out that every row already has a value, why is that an exception? It could be a good query or a bad query as far as my domain logic is concerned, but there's no rule violation here. In fact, everything is fine.

An insert can also affect zero rows, especially when you transfer data from one table to another or from the result of a select statement into a table (an INSERT ... SELECT). Think of the example of the null values getting updated, where I also want to log the changes in another table: now I would have to handle two exceptions even though everything worked as planned.

Most relational databases manage transactions; this is how they can be rolled back from a backup to a particular point in time. Many batch processes work in an all-or-nothing fashion, so thinking about concurrency too literally, when who knows what is actually happening behind the curtain of an RDBMS, isn't much help.

Using record locking or some other mechanism to handle concurrency at the application or domain level is a different set of issues, as others have pointed out.

JeffO
  • 36,956
0

The book may perfectly well state that concurrent inserts do not cause concurrency errors, if it assumes that the concurrent operations are performed by processes/threads that are independent of each other.

In your example with the unique key, the error is not caused by the concurrency of the insert but by the inconsistency of the data that you attempt to insert (i.e. two records with the same id, which should be unique).

This would not happen if you let some database functionality generate a unique ID, or if you used some UUID algorithm to generate one.

Ultimately, the issue could be caused by another concurrency problem, in the algorithm that generates the unique ID (but not in the insert itself). This can happen, for example, if you read the last sequential id, add 1, and assume the result is unique. Of course, without any synchronisation operation, this will not be the case, as two different processes could obtain the same result.
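The id-generation race described above is easy to reproduce even sequentially: two readers that both compute "last id + 1" before either inserts will collide. A Python/SQLite sketch (invented table, standing in for any home-grown key generator):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.execute("INSERT INTO orders VALUES (100, 'first')")

def next_id(db):
    # Naive "read the last id and add 1" -- unsafe without synchronisation.
    return db.execute("SELECT MAX(id) FROM orders").fetchone()[0] + 1

# Two processes each compute the next id before either one inserts.
id_a = next_id(conn)
id_b = next_id(conn)
# id_a == id_b == 101 -- both got the same "unique" id.

conn.execute("INSERT INTO orders VALUES (?, ?)", (id_a, 'from process A'))
duplicate = False
try:
    conn.execute("INSERT INTO orders VALUES (?, ?)", (id_b, 'from process B'))
except sqlite3.IntegrityError:
    duplicate = True  # the id-generation race, not the insert itself, is at fault
```

The insert merely reports the damage; the actual concurrency bug lives in `next_id`, which is why delegating key generation to the database (or using UUIDs) removes the problem.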

Christophe
  • 81,699