
I'm currently importing one of Wikipedia's dump files. It is 1 GB compressed and ~7 GB uncompressed, and the import has been running for 6 hours already. The imported table in MySQL is currently 1.5 GB (data + indexes). Will the import keep going until the table reaches 7 GB?

Note: The data directory of the database is on an external HDD. Since this is only a development setup, I'm not concerned about real performance.

Andrew

4 Answers


I assume you are connected by ssh or you are sitting right at the computer. Open a separate window and run this query in mysql:

SHOW PROCESSLIST;

I was importing a 3 GB (uncompressed) database just the other day, and I believe that query showed me exactly where the import was in its progress.

In my case, the tables were imported in alphabetical order, so I knew exactly how far along it was.

Also, check out this Stack Overflow question's answers on killing the process vs. killing the query.
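If you don't want to keep re-running the statement by hand, a minimal sketch for polling the process list every 10 seconds (the root account and password here are placeholders):

# re-runs the statement every 10 seconds and refreshes the output
watch -n 10 'mysql -u root -p"YourPass" -e "SHOW FULL PROCESSLIST"'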

Buttle Butkus

I can't say about the size, but you could check the last entry (or the number of entries) in the dump and then check the current last entry in your database. Comparing the two should help you estimate how long the import will take to finish.
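For example, a rough sketch of that comparison, where dump.sql, wikidb, page, and page_id are all placeholder names for your dump and schema:

# see how far the dump goes: inspect the tail of the uncompressed dump file
tail -c 4096 dump.sql

# see how far the import has gotten (placeholder database/table/column names)
mysql -u root -p -e "SELECT MAX(page_id) AS last_id, COUNT(*) AS rows_so_far FROM wikidb.page;"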

The imported data might even end up bigger than the uncompressed 7 GB, since indexes are usually not contained in dumps but are built on insert.

As a side note, this also points to a way of speeding up the import itself: drop the indexes before importing and rebuild them afterwards. This has helped me speed things up several times; see the sketch below.
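A minimal sketch of that approach, using hypothetical table and index names (for MyISAM tables, ALTER TABLE ... DISABLE KEYS / ENABLE KEYS achieves a similar effect):

-- drop the secondary index before the bulk load (names are placeholders)
ALTER TABLE page DROP INDEX idx_page_title;

-- ... run the import ...

-- rebuild the index in a single pass afterwards
ALTER TABLE page ADD INDEX idx_page_title (page_title);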


Using pv and pipes on Linux, you can create a progress bar that at least shows how much of the dump file has been read into MySQL, which helps you gauge the overall progress of the task.


The command that worked for me is:

pv file-name.sql | mysql -u root -pPass <DataBaseName>

example:

pv steam_activity.sql | mysql -u root -pPass GAMEDATA
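Since the dump in the question is compressed, a variant worth noting (assuming a gzip-compressed file) decompresses on the fly while keeping the progress bar; the percentage then tracks compressed bytes read rather than rows inserted:

pv file-name.sql.gz | gunzip | mysql -u root -pPass <DataBaseName>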

This gives a progress bar for the SQL import that looks like this:


77.9GiB 9:20:33 [4.40MiB/s] [===========================================> ] 48% ETA 9:52:30

Please note that all the metrics above are calculated with respect to the data transferred from the SQL file to the MySQL server (i.e., from steam_activity.sql to the mysql process). pv knows nothing about the internal workings of the MySQL server engine, so it won't give you the precise amount of time left, but it provides a good approximation.

I am importing a 173 GB steam_activity.sql file into a newly created empty database named GAMEDATA. Let us break down the output progress bar:

  • 77.9GiB: the amount of data transferred so far from the 173 GB file

  • 9:20:33 (hours:minutes:seconds): the time elapsed since the process started

  • [4.40MiB/s]: the rate at which data is read from steam_activity.sql

  • [===========================================> ]: the progress bar itself

  • 48%: the percentage of the transfer completed

  • ETA 9:52:30: the estimated time remaining for the import. This is not a precise figure, since some INSERT statements take longer to execute than others, and table checks may also run and delay the import.

Original post: How can I monitor the progress of an import of a large .sql file?

If you want tips on speeding things up, please comment or post another question (I loaded that 173 GB .sql file in 17 hours), or follow this post: MySQL any way to import a huge (32 GB) SQL dump faster? Estimation is 24 Days


You can note the time you start importing the dump and, once it finishes, run a query to get the last update time of all tables in your database.

SELECT MAX(update_time) FROM information_schema.tables WHERE table_schema = 'your_database_name';

After that, just calculate it: (last_update_time - time_you_started) = time_spent.
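If you recorded the start time, MySQL can do the subtraction for you; a minimal sketch, where the start timestamp '2015-06-01 10:00:00' is a hypothetical placeholder:

-- time_spent = last table update minus the (placeholder) start time
SELECT TIMEDIFF(MAX(update_time), '2015-06-01 10:00:00') AS time_spent
FROM information_schema.tables
WHERE table_schema = 'your_database_name';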

leo