39

I have MySQL small RDS instance as part of my production system and I want to upgrade it to medium instance with provided IOPS.

As old-school DBA I'm aware about "add slave; promote to master; switch clients" method, but AWS promises to provide magic one-click upgrade path, i.e. "upgrade instance", "add provided IOPS".

Tried this on test RDS instance, downtime is too long, IMHO: about 5 min for small->medium upgrade, and 30 min (!!!) for switching to provided IOPS.

  • Is this normal behavior?
  • Is there any way to run upgrade on production RDS w/o downtime?
  • Do you recommend "stop; create a snapshot; restore from snapshot to bigger instance" way?
Erik
  • 4,833
  • 4
  • 28
  • 57

4 Answers4

41

Upgrading an instance in RDS means RDS will be physically migrating the database to a new instance, likely on a different physical host, so downtime would not be avoidable. Migrating to provisioned IOPS would likely mean your data would be migrated to a new EBS volume (and the server might be migrated to a new instance as well with this change, depending on whether, internally, machines capable of accessing EBS volumes with provisioned IOPS are physically segregated from machines that aren't, so that they can be on a different class of network hardware) so downtime would again be inevitable.

There appears to be a way to avoid this disruption: a Multi-AZ deployment, which creates an invisible and inaccessible (to you) replica in another availability zone within the region.

In the case of system upgrades like OS patching or DB Instance scaling, these operations are applied first on the standby, prior to the automatic failover. As a result, your availability impact is limited only to the time required for automatic failover to complete.

http://aws.amazon.com/rds/multi-az/

That should provide a quick and seamless migration path, though I have not had occasion to test this capability. "Modify" in the console appears to allow you to convert an instance to Multi-AZ. Presumably, this would result in brief I/O freeze as the instance is cloned, so I of course would recommend testing all of this functionality before trying it.

Alternately, RDS supports an internal mechanism that should allow you to emulate the "add slave; promote to master; switch clients" operation, and this also should allow you to achieve a near-zero-downtime conversion:

  • Create an actual RDS read replica of your database with the desired instance class
  • Wait for the replica to come online and be synched with the master
  • Modify the replica's configuration to add Provisioned IOPS
  • Wait for the replica to come online and be synched with the master
  • Verify that both systems have identical data using 3rd party tools
  • Disconnect your application from the old master
  • Verify matching binlog coordinates on master and replica to assure that all application writes have replicated
  • Split the systems with "Promote Read Replica" on the new replica in RDS
  • Connect your application to the new master

http://aws.amazon.com/about-aws/whats-new/2012/10/11/amazon-rds-mysql-rr-promotion/

Michael - sqlbot
  • 22,715
  • 2
  • 49
  • 76
6

Even with a multi-AZ environment, you will have a 60-120s outage. This was the case when I repeatedly hit our RDS instances while performing an upgrade from a PostgreSQL db.m3.medium to an db.m3.large.

Anthony Genovese
  • 2,067
  • 3
  • 22
  • 34
Wayn E
  • 61
  • 1
  • 1
6

It is also POSSIBLE to avoid any downtime during the upgrade. The way to do it is by briefly launching a new RDS from a read replica snapshot and configure it as active/active Master to Master replication. Once it's configured, you can switch application traffic one APP server at the time without any downtime. We use the approach every time AWS announces RDS maintenances to avoid downtime as well as during our scheduled maintenances.

https://workmarket.tech/zero-downtime-maintenances-on-mysql-rds-ba13b51103c2

Here are the details:

M1 - Orignal Master

R1 - Read Replica of the M1

SNAP1 - Snapshot of the R1

M2 - New Master

M2 creation sequence: M1 → R1 → SNAP1 → M2

  • Since we can’t use SUPER privilege on RDS, we don’t use mysqldump with — master_data2 option on the M1. Instead, we launch R1 to obtain the binlog position of the M1 from it. Then create a snapshot (SNAP1) from the R1 and then launch M2 from the SNAP1.

  • Create two separate RDS parameters groups with the followingt offsets to avoid PK conflcts:

    M1: auto_increment_ increment = 4 and auto_increment_offset = 1

    M2: auto_increment_ increment = 4 and auto_increment_offset = 2

  • Create replication user on M1

    GRANT EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO ‘repl’@’%’ IDENTIFIED BY PASSWORD <secret>;

1. Create R1 from M1

-- Connect to the R1 and stop replication
   CALL mysql.rds_stop_replication;
-- Obtain M1’s (!!) current binlog file and position 
        `mysql> show slave status\G
             Master_Log_File: mysql-bin.000622
             Exec_Master_Log_Pos: 9135555

2. Create SNAP1 from R1

  • Create M2 from the SNAP1 with the attributes obtained from M1

  • Assign a parameter group to M2 with a different auto_increment_ offset from M1 to avoid M/M replication key conflicts

4. Setup M/M replication

-- Configure M2 as a slave of M1
CALL mysql.rds_set_external_master ('m1.xyxy24.us-east-1.rds.amazonaws.com', 3306, 'repl', 'mypassword', 'mysql-bin.000622', 9135555, 0);
CALL mysql.rds_start_replication;
-- Connect to M2 and obtain its current binlog file and position
         mysql> show master status\G
            File: mysql-bin.004444
            Position: 6666622
-- Connect to M1 and configure it to be a slave of the M2
CALL mysql.rds_set_external_master ('m2.xyxy24.us-east-1.rds.amazonaws.com', 3306 , 'repl', 'mypassword', 'mysql-bin.004444', 6666622, 0);
CALL mysql.rds_start_replication;

5. Delete R1 and SNAP1 as they’re no longer needed

6. Upgrade M2 via AWS Console

Use the standard procedure to Modify the Instance as per your needs.

7. Perform Graceful Switchover to M2

As M/M replication is set up successfully, we are ready to proceed with DB maintenance without downtime by gracefully switching App servers one at the time.

Here are more details on how it works.

https://workmarket.tech/zero-downtime-maintenances-on-mysql-rds-ba13b51103c2

Brian
  • 103
  • 5
1

This would work, however you must make sure that the endpoints of the RDS instance is not configured in you application as a static entry. Swapping RDS will change the endpoints.

Anup Singh
  • 11
  • 1