59

I have a performance-related question. Let's say I have a user with first name Michael. Take the following query:

UPDATE users
SET first_name = 'Michael'
WHERE users.id = 123

Will the query actually execute the update, even though it is being updated to the same value? If so, how do I prevent it from happening?

OneSneakyMofo
  • 693
  • 1
  • 5
  • 5

4 Answers4

62

Due to the MVCC model of Postgres, and according to the rules of SQL, an UPDATE writes a new row version for every row that is not excluded in the WHERE clause.

This does have a more or less substantial impact on performance, directly and indirectly. "Empty updates" have largely the same cost per row as any other update. They fire triggers (if present) like any other update, they have to be WAL-logged and they produce dead rows bloating the table and causing more work for VACUUM later like any other update.

Index entries and TOASTed columns where none of the involved columns are changed can stay the same, but that is true for any updated row. Related:

It's almost always a good idea to exclude such empty updates (when there is an actual chance it may happen). You did not provide a table definition in your question. We have to assume first_name can be null (unsurprising for a "first name"), hence the query has to use null-safe comparison:

UPDATE users
SET    first_name = 'Michael'
WHERE  id = 123
AND    first_name IS DISTINCT FROM 'Michael';

If first_name IS NULL before the update, a test with just first_name <> 'Michael' would evaluate to null and as such exclude the row from the update. Sneaky error. If the column is defined NOT NULL, use a simple equality check, though, that's a bit cheaper.

Related:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
6

ORM's like Ruby on Rail's offer deferred execution which mark a record as changed (or not) and then when needed or called, then submit the change to the database.

PostgreSQL is a database and not an ORM. It would have decreased performance if it took the time to check if a new value was the same as the updated value in your query.

It will therefore update the value regardless of whether it is the same as the new value or not.

Hannah Vernon
  • 70,928
  • 22
  • 177
  • 323
Thronk
  • 1,388
  • 3
  • 19
  • 39
3

You could simply add to the where clause:

UPDATE users
SET first_name = 'Michael'
WHERE users.id = 123
    AND (first_name <> 'Michael' OR first_name IS NULL);

If first_name is defined as NOT NULL, the OR first_name IS NULL part can be removed.

The condition:

(first_name <> 'Michael' OR first_name IS NULL)

can also be written more elegantly as (in Erwin's answer):

first_name IS DISTINCT FROM 'Michael'
ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306
Hannah Vernon
  • 70,928
  • 22
  • 177
  • 323
1

From a database point of view

The answer to your question is YES. The update will take place. The database does not check the previous value, it only sets the new value.

As this happen in memory (and will only be written to the datafiles after a commit is issued) the performance would not be an issue.

From an ORM perspective

Normally you will have an Object representing a single row of the database (it can be a lot more complex than that, but let's keep it simple). This object is managed in memory (at the app server level) and only the latest commited version of that object will actually make it to the database at a certain point.

That may explain the different behaviour.

Now, let's not compare a cargo ship with a 3D printer. The fact that you can send 3D printers using cargo ships doesn't mean that there may be any kind of comparison between them.

Enjoy!

I hope this clarified some concepts.

Silvarion
  • 302
  • 1
  • 5