4

I'm trying to figure out the fastest way to:

  • INSERT a row if it doesn't exist
  • REPLACE (all columns at once) the row if it exists

I thought about DELETE + INSERT, but that looks like 2 scans and if I do an INSERT + ON CONFLICT, I get a huge query where everything has to be duplicated.

Is there a generally acknowledged method to do this?

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Thomas
  • 355
  • 3
  • 10

1 Answers1

3

That's a generally acknowledged case for UPSERT (INSERT ... ON CONFLICT .. DO UPDATE ...).

And the simple case only involves a single index scan. (Aside from possible locking and writing and index updates.)

For one, race conditions under concurrent write load are typically handled smarter. See:

And it's also typically cheaper. "REPLACE (all columns at once) the row if it exists" sounds mildly suspicious. If "the row exists", at least one or more columns must be identical. On the other hand, UPDATE writes a new row version in any case (if it writes at all). That's still cheaper than DELETE + INSERT if toasted fields can be carried over, or index entries with a H.O.T. UPDATE. See:

Your "huge query where everything has to be duplicated" indicates potential for optimization or possible misunderstandings. If that might be the case, show the query you have, together with your version of Postgres, a reproducible setup, and the desired result (in a new question!).

That said, if you don't have toasted columns, and no heavy concurrent write load, and few indexes (or just one), then DELETE + INSERT is only moderately more expensive.

When replacing large parts of a table without concurrent write load, it may even be cheaper. You might VACUUM the table between DELETE and INSERT ... But that's digressing from the use case in the question.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633