
My query and my data set (copied from sqlfiddle):

CREATE TABLE one (
    id int4 primary key, 
    p_id int4, 
    k_id int4
);
CREATE TABLE two(
   id int4 primary key, 
   p_id int4, 
   k_id int4,
   t_id int4,
   pos int4
);

INSERT INTO one(id, p_id, k_id) VALUES 
    (1, 1, 1), 
    (2, 1, 2), 
    (3, 1, 3), 
    (4, 1, 4);
INSERT INTO two(id, p_id, k_id, t_id, pos) VALUES 
    (1, 1, 1, 1, 1), -- t_id = 1 and pos = 1
    (2, 1, 2, 1, 2),
    (3, 1, 3, 1, 1), -- t_id = 1 and pos = 1
    (4, 1, 4, 1, 3),
    (5, 1, 1, 2, 3), -- shares p_id and k_id with row 1
    (6, 1, 2, 2, 1),
    (7, 1, 3, 2, 5), -- shares p_id and k_id with row 3
    (8, 1, 4, 2, 6);

Table one is a join table. I've added it because I might add more columns to it in the future, which will make it easier to modify this report.

What I want is to be able to filter my dataset like so:

  1. Get all the rows where pos = 1 for t_id = 1

Expected output:

p_id, k_id, stats
1,     1,    [{p_id: 1, k_id: 1, t_id: 1, pos: 1}, {p_id: 1, k_id: 1, t_id: 2, pos: 3}]
1,     3,    [{p_id: 1, k_id: 3, t_id: 1, pos: 1}, {p_id: 1, k_id: 3, t_id: 2, pos: 5}]

As you can see, I want to append to my results the data for other t_ids whose p_id and k_id match the rows that were found.

Update:

What would be the best way to remove duplicates?

Adding (9, 1, 1, 1, 20) to my data set would result in the same t_id appearing twice for one (p_id, k_id) combination.

One solution to get only unique t_ids with the smallest pos values is:

WITH uniqt AS (
  SELECT   p_id, k_id, t_id, min(pos) AS pos
  FROM     two
  GROUP BY 1, 2, 3
)

Since I am selecting 50 rows from table one (LIMIT 50) and then joining a subquery as in @ErwinBrandstetter's example, will adding this CTE slow down my query?
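
For reference, I imagine plugging the CTE into that query roughly like this, simply selecting from uniqt instead of two (untested sketch):

WITH uniqt AS (
  SELECT   p_id, k_id, t_id, min(pos) AS pos
  FROM     two
  GROUP BY 1, 2, 3
)
SELECT p_id, k_id, json_agg(sub) AS result
FROM  (
   SELECT p_id, k_id, t_id, pos
   FROM   uniqt t2      -- read from the de-duplicated CTE instead of two
   WHERE  EXISTS (
      SELECT 1
      FROM   uniqt
      WHERE  pos = 1
      AND    t_id = 1
      AND    p_id = t2.p_id
      AND    k_id = t2.k_id
      )
   ORDER BY 1,2,3,4
   ) sub
GROUP  BY p_id, k_id;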

RockNinja

2 Answers


You could use an inner join to filter for rows that share a (p_id, k_id) value with rows that have a (t_id, pos) of (1,1) (sqlfiddle):

select  base.p_id
,       base.k_id
,       json_agg(base order by base.p_id, base.k_id, base.t_id, base.pos) as stats
from    (
        select  p_id
        ,       k_id
        ,       t_id
        ,       pos
        from    two
        ) base
join    two as filter
on      filter.t_id = 1
        and filter.pos = 1
        and filter.p_id = base.p_id
        and filter.k_id = base.k_id
group by
        base.p_id
,       base.k_id;

Or an equivalent way using an exists subquery (sqlfiddle):

select  base.p_id
,       base.k_id
,       json_agg(base order by base.p_id, base.k_id, base.t_id, base.pos) as stats
from    (
        select  p_id
        ,       k_id
        ,       t_id
        ,       pos
        from    two
        ) base
where   exists
        (
        select  *
        from    two as filter
        where   filter.t_id = 1
                and filter.pos = 1
                and filter.p_id = base.p_id
                and filter.k_id = base.k_id
        )
group by
        base.p_id
,       base.k_id;

Updated to use json_agg from @ErwinBrandstetter's answer.

Andomar

It's unclear how table one is relevant at all, so I am ignoring that table.

To build the JSON value you seem to be after (according to the query in the fiddle):

SELECT p_id, k_id, json_agg(sub) AS result
FROM  (
   SELECT p_id, k_id, t_id, pos
   FROM   two t2
   WHERE  EXISTS (
      SELECT 1
      FROM   two
      WHERE  pos = 1
      AND    t_id = 1
      AND    p_id = t2.p_id
      AND    k_id = t2.k_id
      )
   ORDER BY 1,2,3,4
   ) sub
GROUP  BY p_id, k_id;

Produces the desired result exactly.

SQL Fiddle.

Explanation

  • The inner query is similar to what @Andomar already posted: an EXISTS semi-join is the key element.

  • ORDER BY the relevant columns in the subquery right away; that's cheaper than adding an ORDER BY clause to the aggregate function.

  • Use json_agg() for the final aggregate. You don't even need row_to_json(); just pass the alias sub of the derived table as the argument to json_agg().

  • In Postgres 9.4 you could use json_build_object() to get key names as desired in your JSON document. See:
    Select columns inside json_agg
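
A minimal sketch of that json_build_object() variant, assuming Postgres 9.4 or later (the key names here simply mirror the column names and can be changed as needed):

SELECT p_id, k_id
     , json_agg(json_build_object('p_id', p_id, 'k_id', k_id, 't_id', t_id, 'pos', pos)) AS result
FROM  (
   SELECT p_id, k_id, t_id, pos
   FROM   two t2
   WHERE  EXISTS (   -- same EXISTS semi-join as above
      SELECT 1
      FROM   two
      WHERE  pos = 1
      AND    t_id = 1
      AND    p_id = t2.p_id
      AND    k_id = t2.k_id
      )
   ORDER BY 1,2,3,4
   ) sub
GROUP  BY p_id, k_id;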

Erwin Brandstetter