
I have a couple of joined tables holding industrial data:

create table v2.tag (
  tag_id integer generated always as identity,
  tag    text not null,
  primary key (tag_id),
  unique (tag)
);

create table v2.state (
  tag_id      integer not null,
  "timestamp" timestamp without time zone not null,
  value       float not null,
  primary key (tag_id, "timestamp"),
  foreign key (tag_id) references v2.tag (tag_id)
) partition by range ("timestamp");

The state table holds about 50 million rows of time-series data from the last 6 months, and I need to run a benchmark on it with various queries. The table is partitioned monthly.

The query I tried simply gets the number of data points per day and tag, something any actual TSDB can do without breaking a sweat on such a small dataset:

SELECT
    count(*) AS points, date_trunc('day', timestamp) AS timestamp, tag.tag
FROM
    v2.state
JOIN
    v2.tag USING (tag_id)
GROUP BY
    timestamp, tag
ORDER BY
    timestamp ASC;

The thing is, for some reason this query makes the DB take up almost 3 GB of RAM and returns a bunch of duplicates. Like this:

 2024-02-01 00:00:00 | /Heave         |      1
 2024-02-01 00:00:00 | /Pitch         |      1
 2024-02-01 00:00:00 | /Roll          |      1
 2024-02-01 00:00:00 | /Velocity      |      1
 2024-02-01 00:00:00 | /Heave         |      1
 ...

And so on, all within the same day. I could not even scroll to the next day; it just kept repeating these rows in the result instead of counting them up per tag like I expected. So instead of counting the number of data points per day/tag, it seems to produce one output row for each of the ~50 million rows in the database.

So something is not working in the aggregation. I would expect this query to return around 12K rows (65 tags × 30 days × 6 months ≈ 11,700), but it returns millions of rows instead, causing the Jupyter notebook I am trying to load the result into to get OOM-killed.

I tried to run this with EXPLAIN ANALYZE, but since I am a noob with Postgres, it doesn't really... explain anything:

 Sort  (cost=700769.72..700798.22 rows=11400 width=78) (actual time=80503.260..83825.211 rows=47499969 loops=1)
   Sort Key: (date_trunc('day'::text, state."timestamp"))
   Sort Method: external merge  Disk: 4703296kB
   ->  Finalize GroupAggregate  (cost=697027.86..700001.55 rows=11400 width=78) (actual time=35609.801..64328.719 rows=47499969 loops=1)
         Group Key: state."timestamp", tag.tag
         ->  Gather Merge  (cost=697027.86..699688.05 rows=22800 width=70) (actual time=35609.453..55143.276 rows=47499969 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Sort  (cost=696027.84..696056.34 rows=11400 width=70) (actual time=34526.070..42018.956 rows=15833323 loops=3)
                     Sort Key: state."timestamp", tag.tag
                     Sort Method: external merge  Disk: 1414088kB
                     Worker 0:  Sort Method: external merge  Disk: 1446832kB
                     Worker 1:  Sort Method: external merge  Disk: 1470664kB
                     ->  Partial HashAggregate  (cost=695145.67..695259.67 rows=11400 width=70) (actual time=8690.289..20138.661 rows=15833323 loops=3)
                           Group Key: state."timestamp", tag.tag
                           Batches: 1029  Memory Usage: 8241kB  Disk Usage: 1694608kB
                           Worker 0:  Batches: 901  Memory Usage: 8241kB  Disk Usage: 1727928kB
                           Worker 1:  Batches: 773  Memory Usage: 8241kB  Disk Usage: 1748528kB
                           ->  Hash Join  (cost=2.28..652834.40 rows=5641502 width=62) (actual time=138.598..4142.702 rows=15833323 loops=3)
                                 Hash Cond: (state.tag_id = tag.tag_id)
                                 ->  Parallel Append  (cost=0.00..599769.83 rows=19794743 width=12) (actual time=138.383..2665.699 rows=15833323 loops=3)
                                       ->  Parallel Seq Scan on state_y2024m04 state_4  (cost=0.00..221214.31 rows=8745831 width=12) (actual time=39.308..827.302 rows=6996457 loops=3)
                                       ->  Parallel Seq Scan on state_y2024m02 state_2  (cost=0.00..172317.34 rows=6809434 width=12) (actual time=58.866..1102.604 rows=8171318 loops=2)
                                       ->  Parallel Seq Scan on state_y2024m03 state_3  (cost=0.00..78305.04 rows=3095204 width=12) (actual time=0.766..694.493 rows=7428501 loops=1)
                                       ->  Parallel Seq Scan on state_y2024m05 state_5  (cost=0.00..28879.42 rows=1141442 width=12) (actual time=180.418..416.467 rows=2739461 loops=1)
                                       ->  Parallel Seq Scan on state_y2024m01 state_1  (cost=0.00..20.00 rows=1000 width=12) (actual time=0.000..0.001 rows=0 loops=1)
                                       ->  Parallel Seq Scan on state_y2024m06 state_6  (cost=0.00..20.00 rows=1000 width=12) (actual time=0.000..0.001 rows=0 loops=1)
                                       ->  Parallel Seq Scan on state_y2024m07 state_7  (cost=0.00..20.00 rows=1000 width=12) (actual time=0.000..0.001 rows=0 loops=1)
                                       ->  Parallel Seq Scan on state_y2024m08 state_8  (cost=0.00..20.00 rows=1000 width=12) (actual time=0.002..0.002 rows=0 loops=1)
                                 ->  Hash  (cost=1.57..1.57 rows=57 width=58) (actual time=0.149..0.268 rows=65 loops=3)
                                       Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                       ->  Seq Scan on tag  (cost=0.00..1.57 rows=57 width=58) (actual time=0.031..0.036 rows=65 loops=3)
 Planning Time: 2.447 ms
 JIT:
   Functions: 96
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 14.487 ms, Inlining 105.515 ms, Optimization 203.723 ms, Emission 143.355 ms, Total 467.081 ms
 Execution Time: 86164.911 ms

So what's wrong with my query? Why is it not aggregating?

2 Answers


Immediate issue

Your own answer doesn't pin down the actual problem. You can use "timestamp" as an alias, and "day" as well. (That doesn't mean either is a good idea.) "timestamp" and "day" are reserved words in standard SQL. Both are less restricted in Postgres, but it is still better to avoid reserved words as identifiers altogether.
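
For example, all of these run fine in Postgres (legal does not mean advisable):

SELECT now() AS "timestamp";  -- quoted identifier: always allowed
SELECT now() AS timestamp;    -- unquoted also works after AS in Postgres
SELECT current_date AS day;   -- likewise for "day"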

That's not the immediate issue, though. The immediate issue is this:

If a simple name in GROUP BY matches both an output column name and an input column name, Postgres interprets it as the input column name. In your case, GROUP BY timestamp resolves to GROUP BY state."timestamp". Not what you want. Since that hardly groups any rows, you end up with millions of output rows, bringing performance down as a side effect.

ORDER BY uses the opposite precedence: there, the output column name (the alias) wins. That's all set in the SQL standard, somewhat due to historical reasons, and reflects the sequence of events in a SELECT command.
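
A minimal demonstration, using a hypothetical table t with a timestamp column ts:

SELECT date_trunc('day', ts) AS ts, count(*)
FROM   t
GROUP  BY ts;  -- resolves to the input column t.ts: one group per raw value!

SELECT date_trunc('day', ts) AS ts, count(*)
FROM   t
GROUP  BY date_trunc('day', ts)
ORDER  BY ts;  -- resolves to the output column: the truncated alias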

Solution

Avoid output column names that collide with input column names to rule out any confusion. If such collisions are truly unavoidable (are they?), repeat the expression based on input columns in GROUP BY, or use positional references. Like:

SELECT count(*) AS points, date_trunc('day', state.timestamp) AS timestamp, tag.tag 
FROM   v2.state
JOIN   v2.tag USING (tag_id) 
GROUP  BY date_trunc('day', state.timestamp), tag.tag 
--   GROUP BY 2, 3  -- or just this for short
ORDER  BY timestamp;

This keeps your questionable alias "timestamp" and some other flaws.

Better solution

SELECT s.points, s.the_day, t.tag
FROM  (
   SELECT count(*) AS points, s.timestamp::date AS the_day, s.tag_id
   FROM   v2.state s
   GROUP  BY 2, 3
   ) s
JOIN   v2.tag t USING (tag_id) 
ORDER  BY s.the_day, t.tag;  -- this deterministic order seems more adequate

- Avoid reserved words.
- Aggregate before the join. That's typically faster.
- Use table aliases to keep the noise level down.
- Table-qualify all columns in queries where conflicts or confusion might arise.
- timestamp::date is equivalent to date_trunc('day', timestamp) for an actual timestamp data type, returns type date instead of timestamp, and is a bit more efficient - most importantly for the index below. (See the snippet right after this list.)
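
A quick, self-contained way to see the difference between the cast and date_trunc() (no tables needed):

SELECT ts::date              AS day_as_date,  -- 2024-02-01           (type date)
       date_trunc('day', ts) AS day_as_ts     -- 2024-02-01 00:00:00  (type timestamp)
FROM  (SELECT timestamp '2024-02-01 13:37:00' AS ts) x;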

Since this question is (also) about performance: if the table is mostly read-only and performance of this query is important, consider this matching, multicolumn expression index:

CREATE INDEX my_idx ON v2.state (("timestamp"::date), tag_id);

Make sure the table is VACUUM'ed enough to give you index-only scans.
I also expect a local optimum for date + integer = 8 bytes.
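
A sketch of how you might verify, assuming the index above was created as my_idx:

-- Refresh the visibility map so index-only scans become possible:
VACUUM (ANALYZE) v2.state;

-- The inner aggregate from the "better solution" should now be able
-- to use index-only scans on each partition's index:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) AS points, "timestamp"::date AS the_day, tag_id
FROM   v2.state
GROUP  BY 2, 3;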

And consider a clean name instead of state.timestamp.

Erwin Brandstetter

OK, so me being a noob with Postgres was the root cause here. For some reason I can't use timestamp as the alias for a field. When I changed the query to look like this:

SELECT
    count(*) AS points, date_trunc('day', timestamp) AS day, tag.tag
FROM
    v2.state
JOIN
    v2.tag USING (tag_id)
GROUP BY
    day, tag
ORDER BY
    day ASC;

It executed in 3.5 seconds, with the expected results. Okie dokie, then...

Megakoresh