0

I have a giant table (500,000)+ rows with many columns that are varchar yes/no, and I want to reduce the size the db occupies, by changing the data types of varchar (yes/no domain), to int. Then relate that to another table with only two varchar entries, 'yes'/'no'.

I have the following SQL, (Postgres)

create table validvalues
(
    value   varchar(3)
        constraint unique_value
            unique
        constraint unique_value_new
            unique,
    numeric_value smallint
);

create table person ( first_name varchar(50), last_name varchar(50), has_permission varchar(3) constraint fk_has_permission references validvalues (value) );

I want to write a script to change the datatype from varchar to int on the has_permission column, then relate that to the valid value table so that I can represent yes, no as 1,0 int and then when querying resolve 1,0 into the string yes no.

-- Update the has_permission column to change datathe  type to int and encode 'yes'/'no' to 1/0
UPDATE person AS p
SET has_permission = CASE
                        WHEN has_permission::smallint = 1 THEN 1
                        WHEN has_permission::smallint = 0 THEN 0
                        ELSE NULL -- Handle other cases if any
                    END;

-- Add 'yes' and 'no' values to the validvalues table INSERT INTO validvalues (value, numeric_value) VALUES ('yes', 1), ('no', 0);

-- Add a foreign key constraint on the has_permission column ALTER TABLE person ADD CONSTRAINT fk_has_permission FOREIGN KEY (has_permission) REFERENCES validvalues (numeric_value);

I am trying this code to do this, been getting this error.

mydb.public> INSERT INTO validvalues (value, numeric_value) VALUES
             ('yes', 1),
             ('no', 0)
[2024-03-01 00:11:58] [23505] ERROR: duplicate key value violates unique constraint "validvalues_pkey"
[2024-03-01 00:11:58] Detail: Key (value)=(yes) already exists

I have been working on this for a while so I figured I would reach out for help.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633

1 Answers1

3

I want to reduce the size the db occupies, by changing the data types of varchar (yes/no domain), to int.

That would actually increase the storage size on disk if anything. integer occupies 4 bytes and needs alignment padding. See:

The natural data type for a yes/no field is (standard SQL!) boolean, which occupies a single byte, does not require alignment padding and actually reduces your disk footprint.

I suggest:

ALTER TABLE person
  DROP CONSTRAINT fk_has_permission
, ALTER COLUMN has_permission TYPE boolean USING has_permission::boolean
;

Yes, you can just cast to boolean. 'yes' and 'no' are valid string representations of true/false. But drop the FK constraint first.

Then drop the pointless table validvalues - unless it has other uses:

DROP TABLE validvalues;

Else, at least drop one of the two redundant UNIQUE constraints unique_value and unique_value_new. Having two of those does nothing useful.

You can add a view (if you must??) to display yes/no instead of true/false:

CREATE VIEW v_person AS
SELECT first_name, last_name
     , CASE has_permission WHEN true THEN 'yes' WHEN false THEN 'no' END AS has_permission
FROM   person p;

fiddle

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633