1

I am trying to write a Postgres procedure in pgAdmin. The procedure I've written fails with different error messages depending on where I put the cursor. It even sometimes seems to succeed, but with no procedure created.

My questions:

  1. What is wrong with my code?
  2. Is there a better/simpler way to achieve the same thing?

The below code tries to iterate through a table of values (outer loop), and for each row in that table, to iterate through another table that has some missing values (inner loop). Whenever a missing value is found in the inner loop table, if a value that can be used exists in the outer table, I update the inner table with that value.

CREATE OR REPLACE PROCEDURE impute() 
LANGUAGE plpgsql 
AS $$ 
DECLARE 
    cntry_dec_means RECORD; 
    table_row RECORD; 
BEGIN 
    <<outer_loop>> 
    FOR cntry_dec_means IN 
        SELECT * FROM country_decade_means 
    LOOP 
        -- if all imputed values are null, don't do anything and move to the next row 
        IF ISNULL(cntry_dec_means.gdppc) 
            AND ISNULL(cntry_dec_means.gdp) 
            AND ISNULL(cntry_dec_means.ed_gdp) 
            AND ISNULL(cntry_dec_means.population) THEN 
                CONTINUE;
    -- get matching rows from the actual table 
    &lt;&lt;inner_loop&gt;&gt; 
    FOR table_row IN 
        SELECT * FROM inls625_impute 
        WHERE country = cntry_dec_means.country 
        AND decade = cntry_dec_means.decade 
    LOOP 
        IF (table_row.gdppc IS NULL) AND (cntry_dec_means.gdppc IS NOT NULL) THEN   
            UPDATE inls625_impute SET gdppc = cntry_dec_means.gdppc 
            WHERE inls625_impute.country = cntry_dec_means.country 
            AND inls625_impute.decade = cntry_dec_means.decade; 
    END LOOP; --&lt;&lt;inner_loop&gt;&gt; 
END LOOP; --&lt;&lt;outer_loop&gt;&gt;

END; $$;

Update

Thanks too @prasad for clarifying how to use loop labels. Unfortunately, it does not seem to work for me.

Here are the errors (in psql):

CREATE OR REPLACE PROCEDURE impute()
LANGUAGE plpgsql
AS $$"
LINE 1: $$;
        ^
INLS-623-Labs=#     table_row RECORD;
ERROR:  syntax error at or near "table_row"
LINE 1: table_row RECORD;
        ^
INLS-623-Labs=# BEGIN
INLS-623-Labs-#     <<outer_loop>>
INLS-623-Labs-#     FOR cntry_dec_means IN
INLS-623-Labs-#         SELECT * FROM country_decade_means
INLS-623-Labs-#     LOOP
INLS-623-Labs-#         -- if all imputed values are null, don't do anything and move to the next row
INLS-623-Labs-#         IF ISNULL(cntry_dec_means.gdppc)
INLS-623-Labs-#             AND ISNULL(cntry_dec_means.gdp)
INLS-623-Labs-#             AND ISNULL(cntry_dec_means.ed_gdp)
INLS-623-Labs-#             AND ISNULL(cntry_dec_means.population) THEN
INLS-623-Labs-#                 CONTINUE outer_loop;
ERROR:  syntax error at or near "<<"
LINE 2:     <<outer_loop>>
            ^
INLS-623-Labs=#
INLS-623-Labs=#         -- get matching rows from the actual table
INLS-623-Labs=#         <<inner_loop>>
INLS-623-Labs-#         FOR table_row IN
INLS-623-Labs-#             SELECT * FROM inls625_impute
INLS-623-Labs-#             WHERE country = cntry_dec_means.country
INLS-623-Labs-#             AND decade = cntry_dec_means.decade
INLS-623-Labs-#         LOOP
INLS-623-Labs-#             IF ISNULL(table_row.gdppc) AND NOT ISNULL(cntry_dec_means.gdppc) THEN
INLS-623-Labs-#                 UPDATE inls625_impute SET gdppc = cntry_dec_means.gdppc
INLS-623-Labs-#                 WHERE inls625_impute.country = cntry_dec_means.country
INLS-623-Labs-#                 AND inls625_impute.decade = cntry_dec_means.decade;
ERROR:  syntax error at or near "<<"
LINE 1: <<inner_loop>>
        ^
INLS-623-Labs=#         END LOOP inner_loop;
ERROR:  syntax error at or near "LOOP"
LINE 1: END LOOP inner_loop;
            ^
INLS-623-Labs=#     END LOOP outer_loop;
ERROR:  syntax error at or near "LOOP"
LINE 1: END LOOP outer_loop;
            ^
INLS-623-Labs=# END;
WARNING:  there is no transaction in progress
COMMIT
INLS-623-Labs=# $$;
INLS-623-Labs$#

What's more, I reproduced the example provided by @prasad and get similar errors:

CREATE OR REPLACE PROCEDURE t_proc()
LANGUAGE plpgsql
AS $$"
LINE 1: $$;
        ^
INLS-623-Labs=#     t2_rec RECORD;
ERROR:  syntax error at or near "t2_rec"
LINE 1: t2_rec RECORD;
        ^
INLS-623-Labs=# BEGIN
INLS-623-Labs-#     <<outer_loop>>
INLS-623-Labs-#     FOR t_rec IN
INLS-623-Labs-#         SELECT * FROM t
INLS-623-Labs-#     LOOP
INLS-623-Labs-#         IF t_rec.name IS NULL AND t_rec.data IS NULL THEN
INLS-623-Labs-#             CONTINUE outer_loop;
ERROR:  syntax error at or near "<<"
LINE 2:     <<outer_loop>>
            ^
INLS-623-Labs=#         END IF;
ERROR:  syntax error at or near "IF"
LINE 1: END IF;
            ^
INLS-623-Labs=#         RAISE NOTICE 't.id value: %', t_rec.id;
ERROR:  syntax error at or near "RAISE"
LINE 1: RAISE NOTICE 't.id value: %', t_rec.id;
        ^
INLS-623-Labs=#         <<inner_loop>>
INLS-623-Labs-#         FOR t2_rec IN
INLS-623-Labs-#             SELECT * FROM t2 WHERE t2.id = t_rec.id
INLS-623-Labs-#         LOOP
INLS-623-Labs-#             RAISE NOTICE 't2.id value: %', t2_rec.id;
ERROR:  syntax error at or near "<<"
LINE 1: <<inner_loop>>
        ^
INLS-623-Labs=#         END LOOP inner_loop;
ERROR:  syntax error at or near "LOOP"
LINE 1: END LOOP inner_loop;
            ^
INLS-623-Labs=#     END LOOP outer_loop;
ERROR:  syntax error at or near "LOOP"
LINE 1: END LOOP outer_loop;
            ^
INLS-623-Labs=# END;
WARNING:  there is no transaction in progress
COMMIT
INLS-623-Labs=# $$;
Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633

2 Answers2

2
  1. What is wrong with my code?

Technically, only END IF; is missing twice.

  1. Is there a better/simpler way to achieve the same thing?

Yes. A plain UPDATE statement.

UPDATE inls625_impute i
SET    gdppc = c.gdppc 
FROM   cntry_dec_means c
WHERE  i.gdppc IS NULL
AND    c.gdppc IS NOT NULL
AND    ((c.gdp, c.ed_gdp, c.population) IS NULL) IS NOT TRUE
AND    i.country = c.country 
AND    i.decade = c.decade; 

Not only is it much simpler and faster, it is also correct. Your procedure (even with fixed syntax) would be open to race conditions. Between reading data and arriving at the UPDATE eventually, concurrent transactions could interfere. You could prevent that with explicit locking. But just don't go there.

Note, in particular, this formulation:

((c.gdp, c.ed_gdp, c.population) IS NULL) IS NOT TRUE

You only want to skip the update if all of these columns are null. (Plus, c.gdppc cannot be null in any case.) See:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
0

Here is how you can code your procedure. I am using two table's t and t2 for showing the correct way of the syntax (LOOP, IF-END IF, etc.) you are trying in your code.

I am using PostgreSQL version 14 and the commands are run from psql shell.

TABLE t;
 id | name  | data
----+-------+------
  1 | first |   12
  2 |       |    9
  3 |       |

Table t has three columns and the empty values above are null values. Table t2 has just one column.

TABLE t2;
 id
----
  2
  9

The procedure:

CREATE OR REPLACE PROCEDURE t_proc() 
LANGUAGE plpgsql 
AS $$ 
DECLARE 
    t_rec RECORD;
    t2_rec RECORD; 
BEGIN 
    <<outer_loop>> 
    FOR t_rec IN 
        SELECT * FROM t 
    LOOP 
        IF t_rec.name IS NULL AND t_rec.data IS NULL THEN 
            CONTINUE outer_loop;
        END IF; 
        RAISE NOTICE 't.id value: %', t_rec.id;
        <<inner_loop>>
        FOR t2_rec IN
            SELECT * FROM t2 WHERE t2.id = t_rec.id
        LOOP
            RAISE NOTICE 't2.id value: %', t2_rec.id;
        END LOOP inner_loop; 
    END LOOP outer_loop;
END; 
$$;

Run the procedure and review the output:

CALL t_proc();
NOTICE:  t.id value: 1
NOTICE:  t.id value: 2
NOTICE:  t2.id value: 2
CALL

Further reference: https://www.postgresql.org/docs/14/plpgsql-control-structures.html

prasad_
  • 583
  • 2
  • 10