I wrote this script to run PostgreSQL pg_dump on multiple schemas. Since dumping them all takes a long time, I improved it by running the dumps simultaneously:

for SCHEMA in $SCH
do
while [ $(jobs | wc -l | xargs) -ge $PROC ]; do  sleep 5; done
/usr/pgsql-$PGVER/bin/pg_dump -h $PGHOST -p $PORT -U postgres -d $DB -n $SCHEMA -Fc -Z 1 2> $LOGDIR/$SCHEMA'-'$DATA'.log' | $ENCCMD $DUMPDIR/$SCHEMA'-'$DATA.bkp.enc $ENCKEY &
done
wait

So basically, this checks the number of running jobs, and if it is greater than or equal to 5 (the value of $PROC), it waits until the number of processes decreases, keeping up to 5 dumps running simultaneously until all schemas are done.

The problem is that it sometimes gets stuck in the while loop: "jobs | wc -l" keeps returning 5, yet checking the Linux processes shows that no dump is running.

1 Answer

Two things.

  1. You would probably be better off using the directory format in pg_dump, which can dump in parallel (several tables at once, one per worker job). So something like pg_dump -d $DB -Fd -j 5 -f $output_directory.
  2. A more generic answer: xargs supports exactly what you want with -P:
for SCHEMA in $SCH; do
  echo $SCHEMA
done | xargs -L 1 -P 5 bash -c "/usr/pgsql-$PGVER/bin/pg_dump -h $PGHOST -p $PORT -U postgres -d $DB -n "'$0'" -Fc -Z 1 | $ENCCMD $DUMPDIR/"'$0'"'-'$DATA.bkp.enc $ENCKEY"

Above, I used bash to wrap your pipe into the encryption command. xargs appends one input line (-L 1) to the end of the command, after bash and its code, and bash exposes that argument as $0; that's why $0 is wrapped in ' in this example, so the outer shell does not expand it first. xargs runs 5 such commands at once, and when one ends it starts the next, as long as there are lines coming in on its stdin.
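
If you want to see the $0 mechanics without touching a database, here is a minimal sketch that swaps the pg_dump pipeline for a harmless stand-in. The schema names, OUT_DIR, and the "dumped" message are all made up for illustration; only the for ... | xargs -L 1 -P ... bash -c shape is taken from the command above.

```shell
#!/usr/bin/env bash
# Sketch only: the echo below stands in for the real pg_dump | encrypt pipeline.
OUT_DIR=$(mktemp -d)            # hypothetical output directory
export OUT_DIR                  # exported so each child bash can read it
SCH="sales hr audit billing"    # hypothetical schema names
PROC=2                          # maximum number of concurrent workers

for SCHEMA in $SCH; do
  echo "$SCHEMA"
done | xargs -L 1 -P "$PROC" bash -c 'echo "dumped $0" > "$OUT_DIR/$0.out"'
# xargs appends each input line after the -c string, so the child bash sees it as $0.
# The single quotes stop the outer shell from expanding $0 and $OUT_DIR too early.
# xargs only exits once every child has finished, so no extra "wait" is needed.

ls "$OUT_DIR"
```

Here the whole -c string is single-quoted and OUT_DIR is passed through the environment, which avoids the mixed quoting of the one-liner; whether that reads better than expanding the variables in the outer shell is a matter of taste.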