65

First of all, I'm a developer, not a DBA or sysadmin; please be gentle :)

I'm working on an application workflow where a single user action will trigger complex changes in the database - creating hundreds of records in some tables, updating hundreds of records in others, etc. All in all, about 12 tables (out of ~100) are touched by this action. Due to the complexity, it's very hard for me to manually revert all of the changes before I can run another test. During most of my development time, I can simply insert a "ROLLBACK" statement near the end of the workflow, but when I get close to committing my changes, I need to test the real thing.
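For clarity, the rollback trick I mean is just wrapping the whole workflow in a transaction - a minimal sketch, with a made-up database name:

psql -h localhost mydb <<'SQL'
BEGIN;
-- the workflow's inserts/updates run here
ROLLBACK;  -- undoes everything since BEGIN
SQL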

I have a local copy of the production database to work with. In my case, dumping and restoring between tests is faster than writing a script to undo all the changes. It's faster, but it's still slowing me down a lot (the restore takes around 20 minutes on my ageing laptop). Is there any way I can save a snapshot of the current state of the database, and then quickly restore it?

I'm guaranteed to be the only user on the system, and I have root access. The database dump is ~100MB when tar'ed and gzip'ed. PostgreSQL version is 8.3.

Thanks in advance for any helpful ideas.

Zilk

8 Answers

50

You could use file-system level snapshots, but that is often pretty cumbersome, needs special file systems, and is not always available, especially on aging laptops. ;-)

How about you create your base state as a database, and then create a new database from it for your test run, using the CREATE DATABASE ... TEMPLATE functionality. After the test, you throw that database away. Then your speed constraint is essentially only the time to cp -R the database directory. That's about as fast as you're going to get without file system snapshot magic.
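In command form, the cycle could look something like this - a minimal sketch, assuming the pristine copy is kept in a database called base_state (the names are made up), and noting that no other sessions may be connected to the template database while it is being copied:

psql -h localhost -c "CREATE DATABASE test_run TEMPLATE base_state;"
# ... run the test against test_run ...
psql -h localhost -c "DROP DATABASE test_run;"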

Peter Eisentraut
16

Use Stellar - it's like git for databases:

Stellar allows you to quickly restore database when you are e.g. writing database migrations, switching branches or messing with SQL. PostgreSQL and MySQL (partially) are supported.
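If I remember its CLI correctly, a typical cycle looks roughly like this (check the project's README for the exact commands; the snapshot name here is made up):

pip install stellar
stellar init                  # run in the project directory; writes a stellar.yaml pointing at your database
stellar snapshot before-test  # take a fast snapshot
# ... run the destructive workflow ...
stellar restore before-test   # roll the database back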

Andriy M
5

If your database runs in VirtualBox, you can easily save and restore snapshots of both the database state and the OS itself in a few seconds (or 1-2 minutes if you really have a lot of data in the database or the OS, or very little memory allocated to the virtual machine), for free.
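As a rough sketch, the cycle with VirtualBox's command-line tool would look something like this (the VM name is made up; the VM has to be powered off or saved before a snapshot can be restored):

VBoxManage snapshot "pg-vm" take before-test
# ... run the destructive workflow inside the VM ...
VBoxManage controlvm "pg-vm" poweroff
VBoxManage snapshot "pg-vm" restore before-test
VBoxManage startvm "pg-vm" --type headless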

In your case (and most cases), it would be best to install a lightweight Linux (rather than a Windows server) in the virtual machine hosting the database, given you mention you have few resources available on your laptop.


On the production site, I use MediaTemple's snapshot backups to achieve the same result (but it's $20 per backup slot and specific to that webhosting service, so that may not fit your case).

wildpeaks
4

Following the TEMPLATE answer by Peter, these are the commands I use:

List databases:

psql -h localhost -l

Copy db:

createdb -T <db_to_be_cloned> <clone_name>

Delete db:

dropdb <db_name>

With these commands we can do snapshots:

Create a snapshot for mydb db:

createdb -h localhost -T mydb sshot-mydb-001

Restore mydb from the snapshot:

dropdb -h localhost mydb
createdb -h localhost -T sshot-mydb-001 mydb
Noam-N
3

Probably not the answer you are hoping for, but have you considered some lower level of snapshotting - LVM for example?
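For example, if the data directory lived on an LVM volume, the cycle might look roughly like this (the volume group, paths and sizes are made up, and merging a snapshot back needs a reasonably recent LVM; PostgreSQL is stopped around each step so the image is consistent):

pg_ctl -D /var/lib/pgsql/data stop
lvcreate --size 5G --snapshot --name pgdata_snap /dev/vg0/pgdata
pg_ctl -D /var/lib/pgsql/data start
# ... run the destructive workflow ...
pg_ctl -D /var/lib/pgsql/data stop
umount /var/lib/pgsql/data                # the origin must not be in use for the merge to proceed
lvconvert --merge /dev/vg0/pgdata_snap    # rolls the volume back to the snapshot state
mount /dev/vg0/pgdata /var/lib/pgsql/data
pg_ctl -D /var/lib/pgsql/data start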

Jack Douglas
2

Found this question when trying to do the same and ended up using git on the PostgreSQL data directory. Discarding the changes is as easy as:

git reset --hard
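Spelled out, that approach might look roughly like this (the data directory path is made up; the server should be stopped whenever you commit or reset, so the files are in a consistent state):

# one-time setup: turn the data directory into a git repo and record a baseline
pg_ctl -D /var/lib/postgresql/8.3/main stop
cd /var/lib/postgresql/8.3/main
git init && git add -A && git commit -m "baseline"
pg_ctl -D /var/lib/postgresql/8.3/main start

# ... run the destructive workflow ...

# throw the changes away and return to the baseline
pg_ctl -D /var/lib/postgresql/8.3/main stop
git reset --hard && git clean -fd   # clean removes files created after the baseline commit
pg_ctl -D /var/lib/postgresql/8.3/main start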
user92843
1

Though I have to say that Stellar and git reset --hard are interesting solutions, they become an issue with bigger databases and tests. I do use the VirtualBox-type solutions as well, but for bigger tests these also become a bit more "problematic", especially when you are running on bare metal.

Thus I HAVE to mention ZFS as a filesystem to consider for this in future, for the reasons below (file-system-level snapshots being what @Peter Eisentraut also mentioned):

  1. Snapshots - especially when you do replication from Prod to QA/DR, you can use that same "filesystem" for the tests:
# On a replication node, rather stop, snap, restore for a "consistent" backup ;)
# stop the QA application and its cluster, and drop the old snapshot (plus its dependent clones)
su -l -c "/usr/bin/m2ee stop" acw_qa
pg_ctlcluster ${=QA} stop --force
zfs destroy -R $SNAPSHOT
# take a fresh snapshot while the replication cluster is briefly stopped
pg_ctlcluster ${=REPLICATION} stop --force
zfs snapshot $SNAPSHOT
pg_ctlcluster ${=REPLICATION} start

# re-clone the fresh snapshot for QA and start everything up again
zfs destroy $CLONE
zfs clone -o mountpoint=$CLONEDIR $SNAPSHOT $CLONE
rm $CLONEDIR/$CLUSTER/recovery.conf
pg_ctlcluster ${=QA} start
su -l -c "/usr/bin/m2ee start" acw_qa
  2. To do a test: just before the test, stop PostgreSQL as above, zfs snapshot $SNAPSHOT, and start PostgreSQL again; then, to roll back, stop PostgreSQL and just zfs rollback $SNAPSHOT (see the sketch after this list).

  3. Compression - PostgreSQL gets a typical 3:1 compression ratio on my databases, so you can do a lot more testing ;)
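To spell out point 2, the test/rollback cycle is roughly this (a sketch using the same placeholder variables as above):

# take the baseline just before the test
pg_ctlcluster ${=QA} stop --force
zfs snapshot $SNAPSHOT
pg_ctlcluster ${=QA} start
# ... run the destructive workflow ...
# roll everything back to the baseline
pg_ctlcluster ${=QA} stop --force
zfs rollback $SNAPSHOT
pg_ctlcluster ${=QA} start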

Hvisage
0

Yet another option that could be experimented with would be to save a copy of the PostgreSQL data directory, then simply overwrite the existing directory with the copy when you want to restore it. It will require more disk space, but will definitely be faster than restoring from a backup. I'm not sure whether it would be faster than the template method, though, so it would be a good idea to run some tests first.
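A minimal sketch of that idea (the paths are made up; the server has to be stopped around both the copy and the restore so the directory is consistent):

# save a copy of the data directory as the "snapshot"
pg_ctl -D /var/lib/postgresql/8.3/main stop
cp -a /var/lib/postgresql/8.3/main /var/lib/postgresql/8.3/main.snapshot
pg_ctl -D /var/lib/postgresql/8.3/main start
# ... run the destructive workflow ...
# restore: throw away the modified directory and put the copy back
pg_ctl -D /var/lib/postgresql/8.3/main stop
rm -rf /var/lib/postgresql/8.3/main
cp -a /var/lib/postgresql/8.3/main.snapshot /var/lib/postgresql/8.3/main
pg_ctl -D /var/lib/postgresql/8.3/main start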

Haroldo_OK