2

In my Postgres 9.2 database I've got the following SELECT, which will be used by a PL/pgSQL function:

SELECT
      id
FROM
      tablea
WHERE
      migrated = false;

I want to get 2000 records each time and perform something like the following:

CREATE or REPLACE FUNCTION migrate_data()
RETURNS integer AS
$$
declare
        row record;

BEGIN

FOR row IN EXECUTE '
        SELECT
              id
        FROM
              tablea
        WHERE
              migrated = false
'
LOOP

INSERT INTO tableb (id)
VALUES (row.id);

UPDATE tablea a SET migrated = yes WHERE a.id = row.id;

END LOOP;

RETURN num_rows; -- I want it to return the number of processed rows

END

$$ language 'plpgsql';

How can I tell the function to process 2000 records with every call? And it must start from where it left when on the next call.

Example:

select migrate_data(); -- id 0 to id 2000

select migrate_data(); -- id 2001 to id 4000

select migrate_data(); -- id 4001 to id 6000

select migrate_data(); -- id 6001 to id 8000

Etc.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Patrick B.
  • 311
  • 6
  • 12

2 Answers2

3

It will be much simpler (and faster) to use cursors for that:

begin;
declare foo cursor for
    SELECT
          id
    FROM
          tablea
    WHERE
          mig = true;
fetch forward 2000 from foo;
fetch forward 2000 from foo;
-- etc
rollback;

About declare
About fetch

Abelisto
  • 1,589
  • 1
  • 10
  • 14
3

You can run a single SQL command with data-modifying CTEs. Then get the row count with GET DIAGNOSTICS in a plpgsql function:

CREATE OR REPLACE FUNCTION migrate_data(OUT row_ct int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   WITH sel AS (
      SELECT id
      FROM   tablea
      WHERE  migrated = false
      ORDER  BY id   -- to proceed in deterministic order
      LIMIT  2000
      FOR    UPDATE  -- only for concurrent write access
      )
    , upd AS (
      UPDATE tablea a 
      SET    migrated = TRUE
      FROM   cte
      WHERE  a.id = sel.id
      )
   INSERT INTO tableb(id)
   TABLE  sel;  -- shorthand for: SELECT * FROM sel

GET DIAGNOSTICS row_ct = ROW_COUNT; -- writes to OUT param directly END $func$;

Related:

You might just use the SQL statement without the function wrapper instead. Possibly as prepared statement.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633