2
PostgreSQL 13.2 (Debian 13.2-1.pgdg100+1) on x86_64-pc-linux-gnu, 
compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Hello All,

I'm a bit stuck on this one.

Working SQL

Currently I am using this query:

SELECT DATE_PART('year', created_at::timestamp) AS year,
       DATE_PART('week', created_at::timestamp) AS week_number,
       DATE_TRUNC('week', created_at::timestamp) AS week_starting,
       name,
       SUM(seconds) as total_seconds
FROM my_table
WHERE name = 'jeff'
AND created_at::date >= '2021-01-01'
AND created_at::date <= '2021-10-01'
GROUP BY year, week_number, week_starting, name
ORDER BY name;

This gives me:

| "year" |"week_number" |    "week_starting"    | "name" | "total_seconds" |
| "2021" |    "18"      | "2021-05-03 00:00:00" | "jeff" | "11"            |
| "2021" |    "19"      | "2021-05-10 00:00:00" | "jeff" | "40"            |  
| "2021" |    "23"      | "2021-06-07 00:00:00" | "jeff" | "7"             |

What I would like is a result with the gaps filled in:

| "year" |"week_number" |    "week_starting"    | "name" | "total_seconds" |
| "2021" |    "18"      | "2021-05-03 00:00:00" | "jeff" | "11"            |
| "2021" |    "19"      | "2021-05-10 00:00:00" | "jeff" | "40"            |  
| "2021" |    "20"      | "2021-05-17 00:00:00" | "jeff" | "0"             |
| "2021" |    "21"      | "2021-05-24 00:00:00" | "jeff" | "0"             |  
| "2021" |    "22"      | "2021-05-31 00:00:00" | "jeff" | "0"             |  
| "2021" |    "23"      | "2021-06-07 00:00:00" | "jeff" | "7"             |

Setup DB table

CREATE TABLE IF NOT EXISTS my_table (
    id INT GENERATED ALWAYS AS IDENTITY,
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMP,
    seconds INT,
    PRIMARY KEY(id)
);

INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-05-03 15:28', 10); INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-05-03 15:29', 1); INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-05-10 11:01', 20); INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-05-11 13:23', 20); INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-06-08 10:04', 2); INSERT INTO my_table(name, created_at, seconds) VALUES ('jeff', '2021-06-09 06:27', 5);

What I have tried so far

So there is a lot of posts around about using GENERATE_SERIES

And I can create a series of weeks starting using:

SELECT DATE_TRUNC('week', GENERATE_SERIES(
    '2021-01-01'::timestamp,
    '2021-10-01'::timestamp,
    '1 week'::interval
  )) as created_at

I have had a good look at using this and in my mind I have tried to create an "empty" weeks temporary table and then merge that into my_table to give the desired output of filling the gaps. However the below query just gives me back the data thats in my_table and does not fill the gaps. Can someone explain what am I missing here?

Working query

SELECT empty_weeks.created_at,
       DATE_PART('year', empty_weeks.created_at::timestamp) AS year,
       DATE_PART('week', empty_weeks.created_at::timestamp) AS week_number,
       DATE_TRUNC('week', empty_weeks.created_at::timestamp) AS week_starting,
       t.name,
       SUM(t.seconds) as total_seconds
  FROM (
    SELECT DATE_TRUNC('week', GENERATE_SERIES(
      '2021-01-01'::timestamp,
      '2021-10-01'::timestamp,
      '1 week'::interval
    )) as created_at,
    'jeff' as name,
    0 as seconds          
  ) empty_weeks

  LEFT JOIN my_table t
  ON DATE_TRUNC('week', t.created_at) = empty_weeks.created_at

  WHERE t.name = 'jeff'
  AND empty_weeks.created_at::date &gt;= '2021-01-01'
  AND empty_weeks.created_at::date &lt;= '2021-10-01'
  GROUP BY year, week_number, week_starting, empty_weeks.created_at, t.name
  ORDER BY t.name;

Provides

| "year" |"week_number" |    "week_starting"    | "name" | "total_seconds" |

| "2021" | "18" | "2021-05-03 00:00:00" | "jeff" | "11" | | "2021" | "19" | "2021-05-10 00:00:00" | "jeff" | "40" |
| "2021" | "23" | "2021-06-07 00:00:00" | "jeff" | "7" |

What I would like

| "year" |"week_number" |    "week_starting"    | "name" | "total_seconds" |
| "2021" |    "1"       | "2020-12-28 00:00:00" | "jeff" | "0"       
...
| "2021" |    "18"      | "2021-05-03 00:00:00" | "jeff" | "11"            |
| "2021" |    "19"      | "2021-05-10 00:00:00" | "jeff" | "40"            |  
| "2021" |    "20"      | "2021-05-17 00:00:00" | "jeff" | "0"             |
| "2021" |    "21"      | "2021-05-24 00:00:00" | "jeff" | "0"             |  
| "2021" |    "22"      | "2021-05-31 00:00:00" | "jeff" | "0"             |  
| "2021" |    "23"      | "2021-06-07 00:00:00" | "jeff" | "7"             |
...
| "2021" |    "40"      | "2021-09-27 00:00:00" | "jeff" | "0"           

Any help would be hugely appreciated but also an explanation would be great. I have had a good read up on LEFT JOIN and I'm expecting it to provide empty rows of data.

As always thanks in advance

Solution

For completeness and thanks to @mustaccio I changed the query to:

SELECT empty_weeks.created_at,
       DATE_PART('year', empty_weeks.created_at::timestamp) AS year,
       DATE_PART('week', empty_weeks.created_at::timestamp) AS week_number,
       DATE_TRUNC('week', empty_weeks.created_at::timestamp) AS week_starting,
       COALESCE(t.name, 'jeff'),
       COALESCE(SUM(t.seconds), 0) as total_seconds
  FROM (
    SELECT DATE_TRUNC('week', GENERATE_SERIES(
      '2021-01-01'::timestamp,
      '2021-10-01'::timestamp,
      '1 week'::interval
    )) as created_at      
  ) empty_weeks

  LEFT JOIN my_table t
  ON DATE_TRUNC('week', t.created_at) = empty_weeks.created_at
  AND t.name = 'jeff'
  AND empty_weeks.created_at::date &gt;= '2021-01-01'
  AND empty_weeks.created_at::date &lt;= '2021-10-01'
  GROUP BY year, week_number, week_starting, empty_weeks.created_at, t.name
  ORDER BY week_number;

ZedRed
  • 53
  • 5

0 Answers0