-1

I have 5 tables in my database with respectively a size of 70Gb, 500Mb, 400 Mb, 110Mb and 20 Mb.

I want to create a new table that contains all columns of all tables, so I tried 2 queries, the first one is :

select into new_table as select .. from t1 , t2 , t3 , t4 ,t5 where t1.id2 = t2.id and t1.id3 = t3.id and t1.id4 = t4.id and t1.id5 = t5.id 

and the second one is :

insert into new_table select .. from t1 , t2 , t3 , t4 ,t5 where t1.id2 = t2.id and t1.id3 = t3.id and t1.id4 = t4.id and t1.id5 = t5.id 

Before executing these two queries on my big data tables, I tried both on a total 1G database, the first on took only 7s and the second one approximately 10 mn.

Now, executing the first one on my huge database, made my disk full even though I had 250Gb free space before running the query, and without finishing the query so I got the follow error :

ERROR:  could not write to temporary file: No space left on device

The second one, is taking a lot of time and consuming my free disk space slowly and, as the first one, not returning the result.

What are the difference between these two queries ? Is there a way to make the insert into non transactional so as I can follow my insert steps. And I guess Postgres uses logs (journalization) so is there a way to deactivate that in order to speed up the insertion ? or I should follow another method in order to get a desired result without filling up all disk.

Ps : No triggers, only a primary key on each table.

Islacine
  • 35
  • 7

2 Answers2

0

.. a new table that contains all columns of all tables ..

Really? That's unlikely to be useful.

select .. from t1 join t2 join t3 join t4 ...

Without "on" clauses to tell postgres how to put the tables together, those 'join's will link every row in t1 to every row in t2 and then every row of that to every row in t3 and then every row of that to every row in t4 ...

That's an awful lot of "rows".

As always, build a select statement that returns the data that you want, then just add the "insert into" bit on the front of it.

If your tables are all the same structure, I'd suspect you'll be better served by 'union [all]'s instead of 'join's but YMMV; we can't tell because you haven't given us the structure of your tables or your intended result.

Phill W.
  • 9,889
  • 1
  • 12
  • 24
0

The difference between those queries is that the first, when it fails, will result in new_table not existing, and as such, all the underlying data files associated with that table will be unlinked (rm'ed, essentially) immediately, while for the second, the table will exist even after the insert fails (since the CREATE TABLE was a separate statement) and those files will still be around, albeit full of rows that never got committed.

The total size of the table that would result would be the same; it's safe to assume your INSERT will run out of space as well, except that when that occurs, the table files full of never-committed rows will not be immediately deleted. You'll need to drop the table to make that space available to the OS again - autovacuum might do so within a short time, but it also might not, depending on a bunch of factors. Be aware that your DB might stop being functional, preventing you from doing the DROP TABLE, if your transaction log is on the same volume as the tablespace where you are performing the inserts. (whether or not this actually occurs depends on luck-of-the-draw things like whether the DB needs to allocate more space for the transaction log in order to add the transaction that performs the DROP TABLE.)

Edit: on the avoiding-the-transaction-log thing, look into UNLOGGED tables. Just be aware that doing an ALTER TABLE ... SET LOGGED will require rewriting the entire table again (or at least it did a few years ago, it's been a while since I've used UNLOGGED.)

Edit 2: As to why the 2nd form is much slower, it is because CREATE TABLE AS (and SELECT INTO) can skip WAL-logging entirely in the default configuration. See https://www.postgresql.org/docs/12/runtime-config-wal.html#GUC-WAL-LEVEL

AdamKG
  • 964
  • 5
  • 9