Questions tagged [bulk]

50 questions
54
votes
2 answers

Optimizing bulk update performance in PostgreSQL

Using PG 9.1 on Ubuntu 12.04. It currently takes up to 24h for us to run a large set of UPDATE statements on a database, which are of the form: UPDATE table SET field1 = constant1, field2 = constant2, ... WHERE id = constid (We're just overwriting…
xyzzyrz
  • 661
  • 1
  • 6
  • 8
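One commonly recommended shape for this workload (a sketch only; the table and column names below are hypothetical, not taken from the question) is to stage the new constants with COPY and apply them in a single UPDATE ... FROM rather than millions of single-row statements:

    -- Stage the new values with one COPY instead of issuing many UPDATEs.
    CREATE TEMP TABLE new_vals (id int PRIMARY KEY, field1 int, field2 int);
    COPY new_vals FROM '/tmp/new_vals.csv' WITH (FORMAT csv);

    -- Apply every change in one statement so each target row is visited once.
    UPDATE tbl
    SET    field1 = new_vals.field1,
           field2 = new_vals.field2
    FROM   new_vals
    WHERE  tbl.id = new_vals.id;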
45
votes
2 answers

Most efficient way of bulk deleting rows from postgres

I'm wondering what the most efficient way would be to delete large numbers of rows from PostgreSQL; this process would be part of a recurring daily task to bulk import data (a delta of insertions + deletions) into a table. There could be…
tarnfeld
  • 639
  • 2
  • 6
  • 7
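A pattern that comes up often for recurring bulk deletes (again a hedged sketch with hypothetical names) is to stage the day's IDs and remove them with a join, so the work happens in one statement instead of per-row deletes:

    -- Stage the IDs scheduled for deletion.
    CREATE TEMP TABLE ids_to_delete (id int PRIMARY KEY);
    COPY ids_to_delete FROM '/tmp/deletions.csv' WITH (FORMAT csv);

    -- DELETE ... USING removes all matching rows in a single pass.
    DELETE FROM target_table t
    USING  ids_to_delete d
    WHERE  t.id = d.id;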
24
votes
3 answers

Why is 'LOAD DATA INFILE' faster than normal INSERT statements?

I've read an article that mentioned we can achieve 60,000 inserts per second by using the LOAD DATA INFILE statement, which reads from CSV files and inserts the data into a database. Why should it differ from normal inserts? EDIT: I reduced the…
Alireza
  • 3,676
  • 10
  • 38
  • 44
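For reference, the statement being compared looks like this (file path and table are hypothetical); the speedup comes largely from reading the whole file in one statement, with far less per-row parsing, network, and logging overhead than individual INSERTs:

    -- MySQL: bulk-load a CSV in a single statement instead of many INSERTs.
    LOAD DATA INFILE '/var/lib/mysql-files/data.csv'
    INTO TABLE my_table
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n';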
16
votes
3 answers

Create a table from a CSV file with headers

I'm seeking a way to generate a new MySQL table based solely on the contents of a specified CSV. The CSV files I'll be using have the following properties: "|"-delimited; the first row specifies the column names (headers), also "|"-delimited.…
user58602
  • 161
  • 1
  • 1
  • 3
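LOAD DATA INFILE cannot create the table by itself, so the usual answer is a two-step sketch like the one below: define a table to match the header row (column names and types here are made up), then load the pipe-delimited file while skipping that header:

    -- 1) Create the table by hand, or from a script that parses the header row.
    CREATE TABLE import_target (
        col_a VARCHAR(255),
        col_b VARCHAR(255),
        col_c VARCHAR(255)
    );

    -- 2) Load the "|"-delimited file, ignoring the header line.
    LOAD DATA INFILE '/var/lib/mysql-files/input.csv'
    INTO TABLE import_target
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES;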
14
votes
2 answers

What specifically does OracleBulkCopy do, and how can I optimize its performance?

To summarize the specifics: We need to stage approximately 5 million rows into a vendor (Oracle) database. Everything goes great for batches of 500k rows using OracleBulkCopy (ODP.NET), but when we try to scale up to 5M, the performance starts…
Aaronaught
  • 441
  • 1
  • 3
  • 11
11
votes
1 answer

Best way to re-import large amount of data with minimal downtime

I need to import about 500,000 records containing IP lookup (read-only reference) data about once a week (only three int/bigint cols). I don't really want to worry about merging the data with the existing table, I'd prefer to clear down the old and…
Mark
  • 941
  • 1
  • 8
  • 20
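One low-downtime pattern for a clear-down-and-reload like this (a sketch only; the excerpt doesn't name the platform, so PostgreSQL-style syntax and hypothetical names are assumed here) is to load a fresh copy of the table and then swap it in with quick renames:

    -- Build and load a fresh copy of the lookup table off to the side.
    CREATE TABLE ip_lookup_new (LIKE ip_lookup INCLUDING ALL);
    COPY ip_lookup_new FROM '/tmp/ip_lookup.csv' WITH (FORMAT csv);

    -- Swap the tables with two renames; readers only block for the swap.
    BEGIN;
    ALTER TABLE ip_lookup     RENAME TO ip_lookup_old;
    ALTER TABLE ip_lookup_new RENAME TO ip_lookup;
    COMMIT;

    DROP TABLE ip_lookup_old;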
9
votes
4 answers

Suggestion for Bulk Data Import

We are working on a project that requires importing data from an Excel spreadsheet on a daily basis. The data will be imported from a pre-defined template, and our current thinking is to first upload the data into a temporary table and perform cleaning operations…
kodvavi
  • 337
  • 1
  • 6
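The staging-table idea described in the question can be sketched roughly as follows (table and column names are hypothetical, and the cleaning rules are the part that would vary per template):

    -- 1) Raw staging table that mirrors the spreadsheet template.
    CREATE TABLE staging_import (
        customer_name VARCHAR(200),
        order_date    VARCHAR(50),   -- kept as text until validated
        amount        VARCHAR(50)
    );

    -- 2) After the file is bulk-loaded into staging_import, clean and convert,
    --    then move only the valid rows into the real table.
    INSERT INTO orders (customer_name, order_date, amount)
    SELECT TRIM(customer_name),
           CAST(order_date AS DATE),
           CAST(amount AS DECIMAL(12, 2))
    FROM   staging_import
    WHERE  order_date IS NOT NULL;

    -- 3) Empty the staging table for the next day's load.
    TRUNCATE TABLE staging_import;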
8
votes
2 answers

Should I disable autovacuum on a table while I do a bulk update?

I need to perform a simple update on all rows in a table. The table has 40-50 million rows. Dropping indexes and constraints during the UPDATE results in a massive performance improvement. But what about autovacuum? Can autovacuum start a VACUUM or…
jpmc26
  • 1,652
  • 3
  • 20
  • 38
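PostgreSQL does allow autovacuum to be switched off per table for the duration of the update and re-enabled afterwards; a minimal sketch with a hypothetical table:

    -- Turn autovacuum off for just this table while the bulk UPDATE runs.
    ALTER TABLE big_table SET (autovacuum_enabled = false);

    -- ... run the bulk UPDATE here ...

    -- Re-enable autovacuum and clean up the dead tuples explicitly.
    ALTER TABLE big_table SET (autovacuum_enabled = true);
    VACUUM ANALYZE big_table;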
7
votes
2 answers

Index before or after bulk load using load infile?

I have a database with over 1B rows and two columns that are indexed (in addition to the PK). Is it better to have the index pre-defined in the table before the load infile or better to index after the data has been loaded? A couple of notes…
qman777
  • 71
  • 1
  • 3
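The comparison is usually between loading into an already-indexed table and dropping the secondary indexes around the load; the second option looks roughly like this (MySQL syntax, hypothetical index and column names):

    -- Drop secondary indexes so the load only has to maintain the primary key.
    ALTER TABLE big_table DROP INDEX idx_col_a, DROP INDEX idx_col_b;

    LOAD DATA INFILE '/var/lib/mysql-files/big_file.csv'
    INTO TABLE big_table
    FIELDS TERMINATED BY ',';

    -- Rebuild the indexes in one pass over the loaded data.
    ALTER TABLE big_table ADD INDEX idx_col_a (col_a), ADD INDEX idx_col_b (col_b);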
7
votes
2 answers

Bulk Data Loading and Transaction Log

I'm currently working on a project that bulk imports data from flat files (CSV), about 18 different files, each linking to a specific table through a stored procedure. I followed the steps as advised in the Data Loading Performance Guide. The database…
Raymond
  • 229
  • 1
  • 3
  • 9
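The minimally logged path the guide describes generally requires BULK_LOGGED (or SIMPLE) recovery plus a table lock on a suitably prepared target; a hedged T-SQL sketch with hypothetical names:

    -- Minimal logging needs BULK_LOGGED (or SIMPLE) recovery ...
    ALTER DATABASE StagingDB SET RECOVERY BULK_LOGGED;

    -- ... and a table lock on the target (an empty heap is the simplest case).
    BULK INSERT dbo.StageTable
    FROM 'C:\loads\file01.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);

    -- Switch back and take a log or differential backup afterwards.
    ALTER DATABASE StagingDB SET RECOVERY FULL;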
5
votes
2 answers

Using CHECKPOINT vs GO

Typically, when one of the developers or data analysts needs to perform a very large update or deletion of data (where a truncate or truncate/insert wouldn't make sense because the data set to keep is too large) I recommend to them to do something…
Alf47
  • 981
  • 1
  • 9
  • 22
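The pattern under discussion is usually a batched delete under SIMPLE recovery, with a CHECKPOINT between batches so log space can be reused; a hedged T-SQL sketch (batch size and predicate are hypothetical):

    -- Delete in batches so each transaction, and its log usage, stays small.
    WHILE 1 = 1
    BEGIN
        DELETE TOP (50000) FROM dbo.big_table
        WHERE  created_at < '2015-01-01';

        IF @@ROWCOUNT = 0 BREAK;

        -- Under SIMPLE recovery a CHECKPOINT lets the freed log space be reused.
        CHECKPOINT;
    END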
5
votes
2 answers

postgres_fdw slower than copy + scp + copy (~12 times)

Foreign server: 9.2; local server: 9.5. The table is 10 GB, the data transfer is performed on the same network interface the foreign server uses, and no indexes are set on the data. Old way: COPY TO - 2:36, scp - 08:17, COPY FROM - 10:11. postgres_fdw: by the time the old way…
Vao Tsun
  • 1,263
  • 2
  • 13
  • 25
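For context, the "old way" being timed above is essentially a dump-and-reload with COPY on each side; a sketch (table name and file paths hypothetical):

    -- On the foreign (9.2) server: dump the table to a file.
    COPY big_table TO '/tmp/big_table.dat';

    -- ... transfer the file with scp ...

    -- On the local (9.5) server: reload it.
    COPY big_table FROM '/tmp/big_table.dat';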
5
votes
2 answers

BULK INSERT - best usage

I need to export 150 million rows of only int/bigint columns from one SQL instance to another. I am using BCP queryout to export the rows and BULK INSERT to import the rows into another table. I split the 150 million rows into 3 BCP export files, each of 50…
yrushka
  • 1,994
  • 1
  • 16
  • 22
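A hedged sketch of the import side described above, loading one of the exported files with the options that typically matter for a load this size (paths and names are hypothetical):

    -- Import one of the three 50-million-row export files.
    BULK INSERT dbo.target_table
    FROM 'D:\export\rows_part1.dat'
    WITH (
        DATAFILETYPE = 'native',   -- matches a native-mode bcp queryout export
        TABLOCK,                   -- required for a minimally logged load
        BATCHSIZE = 1000000        -- commit in chunks rather than one huge batch
    );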
4
votes
1 answer

Can SQL Server Bulk Copy cause timeouts when querying the table

In SQL Server 2008 R2, can using BulkCopy to upload thousands of rows of data cause problems for queries taking place on the same table? Specifically could it cause the queries to time out? The table being queried has millions of rows.
4
votes
3 answers

How to send+update on duplicate a big time series to SQL server

I use SQL Server 2008. I would like to send (and update on duplicate datetime index) a big time series (more than 1,000 data points, and eventually much more than that). What is the best way to do so? I have been advised to use MERGE and BULK…
RockScience
  • 398
  • 2
  • 5
  • 14
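A sketch of the MERGE-from-staging approach the question mentions, assuming the incoming points are bulk-loaded into a hypothetical staging table first and then upserted on the datetime key:

    -- Upsert the staged points on the datetime key in one MERGE.
    MERGE dbo.timeseries AS tgt
    USING dbo.timeseries_staging AS src
        ON tgt.ts = src.ts
    WHEN MATCHED THEN
        UPDATE SET tgt.value = src.value
    WHEN NOT MATCHED THEN
        INSERT (ts, value) VALUES (src.ts, src.value);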