1

I have a table having n number of records in impala. I need to find out how can I divide that table into two equal halves with 50 - 50 percent of records in each.

ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306
user97537
  • 11
  • 1
  • 2

1 Answers1

1

I assume that that you have 3 tables with identical structure, where the first has the data while the two others are empty. I also assume that the tables or at least table1 has a primary key.

Searching through the Impala SQL documentation, I don't find any window functions, so a ROW_NUMBER() solution is out of the question.

We could use a random function to split the table in 2 (almost equal) halves.:

-- insert half of the rows into table 2
INSERT INTO table2
SELECT *
FROM table1
WHERE rand() < 0.50 ;

-- insert the rest rows into table 3
INSERT INTO table3
SELECT t1.*
FROM table1 AS t1
  LEFT JOIN table2 AS t2
    ON t1.pk = t2.pk
WHERE t1.pk IS NULL ;

After that, we can drop table1.

ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306