0

I'm attempting to migrate a MySQL database (10M+ records) from an EC2 instance into an RDS Instance using a script:

#!/bin/bash

mysqldump --databases my_db \
    --compress \
    --order-by-primary \
    --single-transaction \
    -u dba \
    -ppassword | mysql \
        --host=host.us-east-1.rds.amazonaws.com \
        --port=3306 \
        --max_allowed_packet=500M \
        -u dba \
        -ppassword

The script connects and begins to transfer data as expected, but then errors out after about 5 minutes with the following error:

ERROR 2006 (HY000) at line 1406: MySQL server has gone away
mysqldump: Got errno 32 on write

The script works fine for smaller databases, and for the larger database it transfers several million records before the error.

I'm not sure how to get past this issue. I'd love to get some suggestions (parameter adjustments on EC2 or RDS, how to get more detail on error etc.)

byron
  • 101
  • 1
  • 4

3 Answers3

1

The error

mysqldump: Got errno 32 on write

simply means you ran out of space.

The question is: Where is the space issue coming from ???

--order-by-primary

Using this requires creating a temp table in the folder configured in tmpdir

Bigger mysqldumps need a much larger tmpdir when using --order-by-primary

If you get rid of --order-by-primary, you can handle bigger dumps.

You just asked

what are the implications of removing --order-by-primary ?

There are improvements to the dump's creation

IMPROVEMENT #1 : Faster Dump

Every time you dump a table using --order-by-primary, a temp table must be created. That can pause the mysqldump on the source side. Such pausing will never occur anymore

IMPROVEMENT #2 : No bloated indexes

When you load a table in PRIMARY KEY order, it can generate up to 45% additonal wasted space in the indexes due to lopsided page splits in your BTREE indexes.

IMPROVEMENT #3 : AUTO_INCREMENT PRIMARY KEYs

If the majority of your tables has a single integer that is auto incremented and is the PRIMARY KEY, the table is already ordered. No need for the overkill of --order-by-primary

Concerning your next comment

I removed --order-by-primary, got the same error at the same time :(

Please make sure you have extra diskspace on the RDS Instance. Why ?

It is possible for a dump to be bigger when reloaded, especially when exacerbated by having keys ordered.

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536
1
SHOW SESSION VARIABLES LIKE '%timeout';
SHOW SESSION VARIABLES WHERE VALUE = 300;

Your 5-minute timeout was probably one of those timeouts, probably wait_timeout or maybe interactive_timeout.

Raise the value and start over.

--order-by-primary -- Note what the manual says "This is useful when dumping a MyISAM table to be loaded into an InnoDB table, but makes the dump operation take considerably longer." -- So, if you are not dumping a MyISAM table with the intention of loading it as InnoDB, that setting is probably useless. If you are doing such a migration, then the option may lead to a more compact InnoDB table.

When coming from InnoDB, the option probably has no impact (and is harmless).

Rick James
  • 80,479
  • 5
  • 52
  • 119
0

Well, after many hours of tweaking settings and retrying (thanks to everyone for the suggestions), I started looking though the raw data for anomalies. It turns out there were several records in the dataset that had a yuuuuuge block of text (caused by a bug in a release earlier this year that was incorrectly converting special characters) and the import process was failing when it encountered those records. I found them by doing a SELECT * from 'table' ORDER BY LENGTH(text_field) DESC; and reviewing the contents of text_field record by record. The first 4 records contained about 4MB of raw text, whereas the next largest had less than 150KB. I manually edited those records to remove the erroneous special characters and the Error 2006 went away.

I'm still encountering an issue about 40 minutes in to the process where the export process stops, but I think that has to do with my SSH connection timing out, which is a separate issue. I'm going to try running the script with a nohup migrate_to_rds.sh & and see if that works.


UPDATE

It worked. Export/import completed successfully.

Summary of the solution:

  1. Discovered and manually fixed 4 anomalous records that contained a huge amount of text (caused by a bug on the front-end). Those records were the root cause of the initial timeout.

  2. Ran the script as a background process using nohup to prevent SSH session timeout nohup migrate_to_rds.sh &

byron
  • 101
  • 1
  • 4