We need to import millions of rows into a fairly large table, say around 5 million rows into a table with more than 1 billion rows.
After preprocessing the import file, we will have both update operations and insert operations.
I have a couple of questions:
How will the table be affected? During this import (updates and inserts), what could happen to other processes trying to read from or write to this table?
We should definitely be using bulk operations for both the inserts and the updates, right? I know bulk insert is extremely fast compared to normal row-by-row inserts. Does SQL Server 2008 provide something similar for UPDATEs?
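To make the question concrete, this is roughly what I have in mind: bulk load the preprocessed file into a staging table, then apply everything to the big table in one set-based pass. The table names, columns, and file path below are just placeholders, not our real schema, and I'm assuming MERGE (new in SQL Server 2008) is the closest thing to a "bulk update":

    -- 1. Bulk load the preprocessed file into a staging table.
    BULK INSERT dbo.ImportStaging
    FROM 'D:\import\preprocessed_rows.dat'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

    -- 2. Apply inserts and updates against the big table in one set-based pass.
    MERGE dbo.BigTable AS target
    USING dbo.ImportStaging AS source
        ON target.UserId = source.UserId
       AND target.RowKey = source.RowKey
    WHEN MATCHED THEN
        UPDATE SET target.Value = source.Value
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (UserId, RowKey, Value)
        VALUES (source.UserId, source.RowKey, source.Value);

Is this the right general shape, or is there a faster pattern for the update side?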
Which techniques or strategies should we consider to make this process as fast as possible, while keeping the underlying table available for other processes to read and write during the import?
My ideas are:
All the rows to be imported will have the same UserId, so partitioning this table by UserId should improve performance, right? Queries for other users, which don't touch the affected partition, should not be affected... Is this correct? A rough sketch of the partitioning I'm picturing is below.
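The boundary values, filegroup, and column names here are made up for illustration; our real ranges would differ. The idea is that the import for a single UserId only ever touches one partition:

    -- Hypothetical partition function/scheme on UserId.
    CREATE PARTITION FUNCTION pfUserId (int)
    AS RANGE RIGHT FOR VALUES (1000, 2000, 3000, 4000);

    CREATE PARTITION SCHEME psUserId
    AS PARTITION pfUserId ALL TO ([PRIMARY]);

    -- Simplified version of the big table, created on the partition scheme.
    CREATE TABLE dbo.BigTable
    (
        UserId int           NOT NULL,
        RowKey bigint        NOT NULL,
        Value  nvarchar(100) NULL,
        CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (UserId, RowKey)
    ) ON psUserId (UserId);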
What about indexes? Should I be dropping them and recreating them on the affected partition? Something like the sketch below is what I mean.
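The index name and partition number here are just examples:

    -- Option A: disable a nonclustered index for the import, rebuild it afterwards.
    ALTER INDEX IX_BigTable_Value ON dbo.BigTable DISABLE;
    -- ... run the import ...
    ALTER INDEX IX_BigTable_Value ON dbo.BigTable REBUILD;

    -- Option B: leave the index in place and rebuild only the partition
    -- touched by the import once it finishes.
    ALTER INDEX IX_BigTable_Value ON dbo.BigTable REBUILD PARTITION = 3;

Is either of these worth it for a 5 million row import, or does the rebuild cost outweigh the insert/update savings?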
I will continue reading and testing on this topic. If anyone has experience with any of the questions above, or could point me to related information, that would be great!