2

Today I realized a number of points where PostgreSQL (v10.3) is rather lackluster and sparse in its error messages.

The first point is illustrated by this code:

drop schema if exists X cascade;
create schema X;

create domain X.an_illegal_regex as text check ( value ~ '(' );

create table X.table_with_illegal_constraint (
  a text,
  constraint "column a must have a bogus value" check ( a::X.an_illegal_regex = a ) );

select * from X.table_with_illegal_constraint;

insert into X.table_with_illegal_constraint values
  ( 'xxx' ),
  -- ( 'xxx' ),
  ( 'foo' ),
  ( 'xyx' );

This code will throw with

psql:db/experiments/pg-error-fail-illegal-regex.sql:17: ERROR:
invalid regular expression: parentheses () not balanced

There are several problems with this error message:

FAILURE: the error is really in line 5 where a syntactically invalid RegEx is created; the fact that it is a RegEx and not a general string is obvious from the semantics of the ~ (tilde) operator at that point in time.

FAILURE: the offending RegEx is not referred to and not quoted in the error message. As such, it could be anywhere in my many, many kLOCs big DB definition. I cannot even search the RegEx with a RegEx because all I know is some parenthesis is missing, somewhere: RegExes cannot match parentheses, and PG RegExes do not have a unique syntactic marker to them.

FAILURE: before the insert statement, everything runs dandy. We could have built an entire data warehouse application on top of a table definition that can never be syntactically processed but which will only fail when someone accidentally tries to insert a line.

FAILURE: I can select from a table with a syntactically invalid definition.

The second point is related:

drop schema if exists X cascade;
create schema X;

create domain X.a_legal_regex as text check ( value ~ '^x' );

create table X.table_with_constraints (
  a text,
  constraint "column a must start with x" check ( a::X.a_legal_regex = a ),
  constraint "field b must have 3 characters" check ( character_length( a ) = 3 ) );

insert into X.table_with_constraints values
  ( 'xxx' ),
  ( 'foo' ),        /* A: violates first constraint */
  -- ( 'xxxx' ),       /* B: violates second constraint */
  ( 'xyx' );

With only line B active, this gives:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  new row for relation "table_with_constraints" violates
check constraint "field b must have 3 characters"
DETAIL:  Failing row contains (xxxx).

SUCCESS: we get the name of the relation and the name of the violated rule.

SUCCESS: the offending piece of data is quoted.

FAILURE: we don't get the full name of the relation, which is "X"."table_with_constraints". Neither do we get the name of the column that received the offending value.

Lastly, with only line A (not line B) active:

psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
ERROR:  value for domain x.a_legal_regex violates check constraint "a_legal_regex_check"

FAILURE: no reference to the affected table, column is made.

FAILURE: no reference to the offending piece of data is made.

FAILURE: no reference to the offended constraint is made ("column a must start with x").

What are the best practices or workarounds for the above shortcomings? I've been trying for several hours to figure out what causes an error message a la value for domain xxx violates check constraint "xxx_check" by rewriting table definitions, inserting data row by row and so on, to no avail. What I need is a full chain of the objects (column -> table -> constraint -> domain -> check) that are involved in the error.

Evan Carroll
  • 65,432
  • 50
  • 254
  • 507
John Frazer
  • 317
  • 1
  • 3
  • 8

0 Answers0