I have a table having n number of records in impala. I need to find out how can I divide that table into two equal halves with 50 - 50 percent of records in each.
Asked
Active
Viewed 6,621 times
1 Answers
1
I assume that that you have 3 tables with identical structure, where the first has the data while the two others are empty.
I also assume that the tables or at least table1 has a primary key.
Searching through the Impala SQL documentation, I don't find any window functions, so a ROW_NUMBER() solution is out of the question.
We could use a random function to split the table in 2 (almost equal) halves.:
-- insert half of the rows into table 2
INSERT INTO table2
SELECT *
FROM table1
WHERE rand() < 0.50 ;
-- insert the rest rows into table 3
INSERT INTO table3
SELECT t1.*
FROM table1 AS t1
LEFT JOIN table2 AS t2
ON t1.pk = t2.pk
WHERE t1.pk IS NULL ;
After that, we can drop table1.
ypercubeᵀᴹ
- 99,450
- 13
- 217
- 306