6

It seems that altering a column from timestamp without time zone to timestamp with time zone converts the existing data based on the current session time zone at the time of the alter statement.

See this example, output shown after each statement

create table tztest (col1 timestamp without time zone);

set timezone = 'UTC'; insert into tztest (col1) values ('2023-02-01 10:10:10'); select col1 as t1, extract(epoch FROM col1) from tztest; -- → 2023-02-01 10:10:10 1675246210

set timezone = 'America/New_York'; alter table tztest alter column col1 type timestamp with time zone; select col1 as t2, extract(epoch FROM col1) from tztest; -- → 2023-02-01 10:10:10-05 1675264210

set timezone = 'UTC'; select col1 as t3, extract(epoch FROM col1) from tztest; -- → 2023-02-01 15:10:10+00 1675264210

The epoch value changes after the alter command.
I was expecting that PG would assume the timestamp value was UTC and not adjust it when changing the type to timestamp with time zone (giving me t2 of 05:10:10 and t3 of 10:10:10).

However, it seems PG assumes the session time zone at the time of the alter, and converts them to UTC.

Is my understanding correct, and is it expected behavior?

On a large table then the alter will update every row, which is something we are concerned about and certainly want to understand.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633

1 Answers1

4

Yes, your understanding is correct. Don't rely on the fallback (the timezone setting of the current session) without setting it yourself explicitly.
If your timestamps shall be interpreted as UTC timestamps, be explicit about it. It's best to set the session timezone for this particular case and run ALTER TABLE with the simple SET DATA TYPE. Only this route can take the shortcut without table rewrite (see below):

SET timezone = 'UTC';
ALTER TABLE tztest ALTER COLUMN col1 TYPE timestamptz;

You can also apply AT TIME ZONE in a USING clause to SET DATA TYPE:

ALTER TABLE tztest ALTER COLUMN col1 TYPE timestamptz
USING col1 AT TIME ZONE 'UTC';

The epoch value stays the same either way, because Postgres stores timestamptz as UTC timestamps internally. Postgres has to rebuild any index involving col1 in any case.

fiddle

See:

Table rewrite?

On a large table then the alter will update every row ...

Except for noted exceptions, ALTER TABLE rewrites the whole table and all indexes (taking an exclusive lock for the duration!). Changing the type to a binary coercible type (like text --> varchar) can avoid it. The manual:

As an exception, when changing the type of an existing column, if the USING clause does not change the column contents and the old type is either binary coercible to the new type or an unconstrained domain over the new type, a table rewrite is not needed.

Postgres does not realize this for the expression with AT TIME ZONE. But - like Jeff commented - the simple SET DATA TYPE with the session time zone set to UTC qualifies. The release notes for Postgres 12:

Allow ALTER TABLE ... SET DATA TYPE changing between timestamp and timestamptz to avoid a table rewrite when the session time zone is UTC (Noah Misch)

In the UTC time zone, these two data types are binary compatible.

Older versions were not as smart. Compare:

fiddle -- Postgres 11
fiddle -- Postgres 12

Also note that a rewrite is not an UPDATE, which produces dead tuples and bloats the table, but does not interfere with concurrent reads with its weaker lock. This completely rewrites the whole table in pristine condition, but takes an exclusive lock for the duration.

Related:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633