4

I have PostgreSQL 9.2 database and a table:

id integer,
allowed_types character varying(255)`

Sample data as below:

id  allowed_types
1   3,4,5,13,14

How I can remove 4 and 5 from allowed_types which is a comma separated varchar list?

After removing, the result should be allowed_types = 3,13,14.

There are many records on that table, each allowed_types could contain different numbers, separated by comma.

I considered string_to_array() and array_remove(), but array_remove() is not in version 9.2, yet.

Evan Carroll
  • 65,432
  • 50
  • 254
  • 507

4 Answers4

6

Proof of concept for regular expressions:

SELECT trim(regexp_replace(',' || '3,4,5,13,14' || ',', ',(4|5)(?=,)', '', 'g'), ',');

Returns:

3,13,14

Pad commas to cover corner cases at begin and end. The regular expression ',(4|5)(?=,)' explained:

, ... literal comma
(4|5) ... two branches: either 4 or 5
(?=,) ... positive lookahead: next character is a literal comma

Would need a functional trigram index on (',' || allowed_types || ',') for good performance with big tables (only slightly bigger than a regular index due to the added commas). Details:

Or, with a more sophisticated regular expression, you can work with the original column and a trigram index on just (allowed_types):

SELECT ltrim(regexp_replace('3,4,5,13,14', '((,|^)(4|5))(?=,|$)', '', 'g'), ',');

dbfiddle here

But I expect the first solution to be faster: complex regular expressions are more expensive.

The according UPDATE

UPDATE tbl
SET    allowed_types = trim(regexp_replace(',' || allowed_types || ',', ',(4|5)(?=,)', '', 'g'), ',')
WHERE  ',' || allowed_types || ',' ~ ',(4|5),';  -- here, plain comma is good

What I would really do:
Update to a current version of Postgres (pg 9.2 reaches EOL Sept 2017) and probably use a normalized 1:n design.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
3

Postgres 9.2 allows you to use string_to_array and unnest, you can use it to get the desired result.

create table foo(id int, allowed_types varchar(256));
insert into foo values (1, '3,4,5,13,14');
✓

1 rows affected
with a as
(
    select id, unnest(string_to_array(allowed_types, ',')) elem
    from foo
)
select id, string_agg(elem, ',')
from a 
where elem not in ('4','5')
group by id
id | string_agg
-: | :---------
 1 | 3,13,14   

dbfiddle here

McNets
  • 23,979
  • 11
  • 51
  • 89
2

intarray Extension

PostgreSQL ships with an extension that provides this and it's even faster than core array functionality on ints. You can use string_to_array and then intarray's - operator. Click here for the docs on intarray in 9.2

Before we install intarray.

test=# SELECT ARRAY[3,4,5,13,14] - ARRAY[4,5];
ERROR:  operator does not exist: integer[] - integer[]
LINE 1: SELECT ARRAY[3,4,5,13,14] - ARRAY[4,5];
                                  ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

Install intarray and try again

test=# CREATE EXTENSION intarray;
CREATE EXTENSION
test=# SELECT ARRAY[3,4,5,13,14] - ARRAY[4,5];
 ?column?  
-----------
 {3,13,14}
(1 row)

So from varchar.

SELECT string_to_array(allowed_types, ',')::int[] - ARRAY[4,5] AS after_operation
FROM ( VALUES ('3,4,5,13,14') ) AS t(allowed_types);

Other notes

I agree with @Erwin when he says you should probably use a normalized 1:n design. But even if you can't, it's very clear you do not want a csv in your column. Consider at least upgrading the type to int[]. This requires no change in schema and can be done very easily using

ALTER TABLE table
  ALTER COLUMN allowed_types
  SET DATA TYPE int[]
  USING string_to_array(allowed_types, ',')::int[];
Evan Carroll
  • 65,432
  • 50
  • 254
  • 507
2

You can still use two regexp in a row that will take care of everything, without needing to have a functional index:

-- Your update
-- taking out two numbers (4 and 5) from any place in the string
UPDATE
    t
SET
    allowed_type = regexp_replace(regexp_replace(allowed_type, '^4,|,4(,|$)', E'\\1', 'g'), '^5,|,5(,|$)', E'\\1', 'g')
WHERE
    allowed_type ~  '^(4|5),|,(4|5)(,|$)'
RETURNING
    allowed_type ;
| allowed_type |
| :----------- |
| 3,13,14      |

See a step-by-step approach at dbfiddle here


Assumption: your list contains numbers separated by commas, with no spaces (after or before commas, nor at beggining or end of string). The numbers need not be in any order, and they can be repeated (even if it doesn't probably make sense)

joanolo
  • 13,657
  • 8
  • 39
  • 67