1

I have a table that looks like this in Google Big Query:

custID  InteractionDate Purchased?  Sales
1       20150312           F         0
1       20150312           T         200
1       20150314           T         150
2       20150221           T         400
2       20150221           F         0
2       20150221           T         120

..that I need to transform to the following structure:

 custID InteractionDate Success Sales
 1     20150312          0.5    200
 1     20150314           1     150
 2     20150221         0.66    520

..where in this second table the success column is defined as

count(purchase=Yes)/(count(purchase=Yes) + count(purchase=No)),

and sales is just the sum.

I'm quite new to BQ and no-SQL languages. I simply don't know how to do this, I've searched and seen examples that suggest using partition over but I simply haven't seen any example that does just this. Thanks a lot in advance.

Paul White
  • 94,921
  • 30
  • 437
  • 687
xupv5
  • 133
  • 1
  • 4

2 Answers2

3

This would be very simple in SQL. As far as I can read the BigQuery Query reference, it supports all (GROUP BY clause, COUNT() and SUM() functions, CASE expressions):

SELECT 
    custID, InteractionDate,
    1.0 * COUNT(CASE WHEN Purchased = 'T' THEN 1 END) / COUNT(*) AS Success,
    SUM(Sales) AS Sales
FROM
    tableName
GROUP BY
    custID, InteractionDate ;
ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306
2

You can use this one:

SELECT custID,
       InteractionDate,
       IFNULL(ROUND((yes)/(yes+NO), 2),0) AS Success,
       sales
FROM
  (SELECT custID,
          InteractionDate,
          sum(if(Purchased=='T',1,0)) AS yes,
          sum(if(Purchased!='T',1,0)) AS NO,
          sum(sales) AS sales
   FROM
     (SELECT 1 AS custID,
             20150312 AS InteractionDate,
             'F' AS Purchased,
             0 AS Sales),
     (SELECT 1 AS custID,
             20150312 AS InteractionDate,
             'T' AS Purchased,
             200 AS Sales),
     (SELECT 1 AS custID,
             20150314 AS InteractionDate,
             'T' AS Purchased,
             150 AS Sales),
     (SELECT 2 AS custID,
             20150221 AS InteractionDate,
             'T' AS Purchased,
             400 AS Sales),
     (SELECT 2 AS custID,
             20150221 AS InteractionDate,
             'F' AS Purchased,
             0 AS Sales),
     (SELECT 2 AS custID,
             20150221 AS InteractionDate,
             'T' AS Purchased,
             120 AS Sales)
   GROUP BY custID,
            InteractionDate)

in a simpler way:

SELECT custID,
       InteractionDate,
       round(sum(if(Purchased=='T',1,0))/(sum(if(Purchased=='T',1,0))+sum(if(Purchased!='T',1,0))),2),
       sum(sales) AS sales
FROM TABLE
GROUP BY custID,
         InteractionDate
Pentium10
  • 121
  • 4