7

This is a slightly loaded question in that I have already assumed that the described scenario is wrong.

A DBA is deploying an application I have written that includes an MS SQL Server 2008 database. He has asked me to take a database backup from my development machine so he can restore it to the production server, thus deploying it. This is a greenfield deployment so there is no existing data to be migrated. I was expecting to provide a DDL script, which I have diligently tested and ensured that it contains everything required. If I execute it in SSMS, the database is created in one click.

To me, using the backup facility for deployment does not seem right, but without being an expert in SQL Server I can't think of a solid reason not to do it. I would have thought, for example, that there would be some 'contamination' of the database from the development machine - perhaps the computer name, directory structure or user names stored in there somewhere. Is this the case, or is backup and restore a valid deployment technique?

4 Answers

10

Everything is wrong with using backup files as deployment. But the burden is not on the DBA to provide the DDL, it is on development. Your design and development artifacts should have been database installation and upgrade scripts (a minimal sketch appears at the end of this answer). Never change anything in the database manually; everything should be modified using your app. Rails gets this in spades with the whole migrations infrastructure, and you should try to adopt it too. I have long advocated using similar techniques, see Version Control and your Database.

First let me make the case why source-code-based deployment/upgrade is superior to binary-based deployment (.bak files or diff tools):

  • source code can be checked into source control. This alone should settle the whole argument. Source control gives you history; in the future you can look back, read the check-in notes, and understand the reasoning behind the current state.
  • source code can be quickly inspected at a glance. You just look at it and read it. Binary databases have to be attached, and reading even the most basic properties requires extensive knowledge of the metadata catalogs.
  • source code is clean. You see CREATE TABLE Foo (...), which clearly conveys the intent. A binary distribution, if you want to extract the object, showers you in a plethora of default properties. You lose the original intent.
  • source code can be peer reviewed at check in.
  • source code integrates into continuous deployment

And I also have arguments why deploying by backup is bad (very bad):

  • you have only postponed the issue. The very first update of the app will face the problem of deploying an update without losing the data. This can happen the day after deployment, when an issue is noticed in production, and you will be left facing the void: how do you modify the production DB to match the dev DB?
  • Databases are not self-contained. Deployment involves objects outside the database (logins, SQL Agent jobs, maintenance plans, etc.), none of which can be deployed with a backup.
  • You never know what you deployed. Forgotten tables left over from dev? Test data? It is very difficult to clean up a database, but keeping your source code up to date and correct is natural.
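
To make the "installation and upgrade scripts" idea concrete: every schema change becomes its own small, re-runnable script that lives in source control next to the application code. This is only a hand-rolled sketch, and the table and column names are invented for illustration:

-- 001_create_customer.sql: install script for this object (hypothetical names).
IF OBJECT_ID(N'dbo.Customer', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.Customer
    (
        CustomerID INT           NOT NULL CONSTRAINT PK_Customer PRIMARY KEY,
        Name       NVARCHAR(100) NOT NULL
    );
END;

-- 002_add_customer_email.sql: a later upgrade script; safe to re-run.
IF COL_LENGTH(N'dbo.Customer', N'Email') IS NULL
BEGIN
    ALTER TABLE dbo.Customer ADD Email NVARCHAR(256) NULL;
END;
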
Remus Rusanu
5

No, there is nothing wrong with using a backup for initial deployment, in fact I would say that this is often the safest way to do it. There isn't really any "contamination" that can happen unless you have hard-coded things like server names or database names that are different in production than in the test environment.

Note, though, that a backup/restore (much like your own DDL script, which is scoped to the database) won't bring along things like server-level logins, linked servers, SQL Agent jobs, etc.

There are some other side benefits that you get with a backup that you don't necessarily get with a DDL script, for example if you created your original table like this:

CREATE TABLE dbo.foo
(
  bar INT PRIMARY KEY,
  mort INT FOREIGN KEY REFERENCES dbo.mort(MortID),
  x TINYINT CHECK (x IN (1,2)),
  y INT NOT NULL DEFAULT 1
);

All of these constraints have system-generated names, like PK__foo__DE90ECFFA28BBAB8. When you run this same script in production, the name will be different, unless you scripted that exact table definition from the test environment. This can cause problems later if you generate drop/create/alter scripts from test and need to run them in production.
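
If you do deploy from a DDL script instead, one common way to sidestep this particular problem (not something this answer prescribes, just standard practice) is to name every constraint explicitly, so the names come out identical in every environment. The same example table with explicit names might look like this:

-- Same table as above, but with explicitly named constraints,
-- so test and production end up with identical names.
CREATE TABLE dbo.foo
(
  bar  INT     CONSTRAINT PK_foo PRIMARY KEY,
  mort INT     CONSTRAINT FK_foo_mort FOREIGN KEY REFERENCES dbo.mort(MortID),
  x    TINYINT CONSTRAINT CK_foo_x CHECK (x IN (1,2)),
  y    INT     NOT NULL CONSTRAINT DF_foo_y DEFAULT 1
);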

You will also get all of the data in lookup tables, etc. when you take a backup; with a DDL-only deployment you would have to script that data manually to get it onto production. (Though you have to be sure to clear out any test data that you don't want in production, too.)
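
If you go the script route for that reference data, the usual pattern is an idempotent insert; here is a rough sketch using an invented lookup table:

-- Hypothetical lookup table: insert reference rows only if they are missing,
-- so the script can be re-run safely in any environment.
IF NOT EXISTS (SELECT 1 FROM dbo.OrderStatus WHERE StatusID = 1)
    INSERT INTO dbo.OrderStatus (StatusID, StatusName) VALUES (1, N'Open');
IF NOT EXISTS (SELECT 1 FROM dbo.OrderStatus WHERE StatusID = 2)
    INSERT INTO dbo.OrderStatus (StatusID, StatusName) VALUES (2, N'Closed');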

And a weakness of scripting this out yourself is that you have to ensure all of the objects are created in the correct dependency order. You may have dependencies in place in test that are missing in production because the objects weren't generated in the correct order.

When it comes down to it, a backup is just cleaner. And you should be testing the database when it is deployed, so you should find any "contaminations" quite quickly, and correct them in both environments.

Once the database is initially deployed, obviously the only way to deploy changes at a later date is to script them. I've had great luck generating compare / deploy scripts using Red-Gate's SQL Compare. While Remus is absolutely right that source control is the best solution to this, in reality source control is usually going to store a CREATE TABLE script, which doesn't get you very far when you've added a column and changed another column's data type - you still need to build some kind of diff script that will apply only the changes to production, not drop and re-create the table.
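
For the scenario just described (one added column plus one changed data type), such a diff script usually boils down to a couple of ALTER statements; the table and column names below are hypothetical:

-- Hand-written diff script: apply only the changes rather than
-- dropping and re-creating the table (hypothetical names).
ALTER TABLE dbo.Orders ADD Notes NVARCHAR(500) NULL;          -- the new column
ALTER TABLE dbo.Orders ALTER COLUMN Quantity BIGINT NOT NULL; -- the changed data type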

If you have things like local lookup tables that are in other databases or could be on different servers, then instead of hard-coding those names in your code, you should use synonyms. Then you only have to ensure the synonyms are correct in each environment, rather than finding all the three/four-part names in all of your modules and updating them upon deployment. And if you have local file paths that differ between environments, use a central properties table instead of hard-coding those paths into your procedures, etc.
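
A minimal sketch of the synonym approach (the database and object names here are made up): the synonym is created once per environment, pointing wherever that environment keeps the lookup data, and all modules reference only the synonym.

-- In production this might point at a different database or linked server;
-- the code that uses it never changes. Names are hypothetical.
CREATE SYNONYM dbo.CountryLookup FOR ReferenceDB.dbo.Country;

-- Modules reference the synonym, never the three/four-part name:
SELECT CountryID, CountryName FROM dbo.CountryLookup;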

In theory you could use a backup and restore methodology later but that does not work well if the production database is already in use - it gets kind of tricky to restore a database from test and not lose any of the data that has been collected in production.

Aaron Bertrand
2

I think the answer to this is both yes and no. You said:

This is a greenfield deployment so there is no existing data to be migrated.

If this is an initial deployment then I see no issue with taking a backup of a development database to use in production. That said, you (or hopefully your DBA) would obviously have to cleanse the database and remove any security users and other gubbins that might have been restored with the database backup.

However:

As a long-term solution, no, this is not a good deployment technique: once a client has started using the database, their data will be in there, so you cannot restore your database over theirs.

So going back to the initial deploy, it's fine to restore a backup of an empty database, but this is the point at which you would start to use version control to manage updates and future deployment, thus maintaining a clean deployable version of the database.

Steps:

  1. Create your clean database (restored from a backup if need be)
  2. Version control it at this point
  3. Script any updates to the database and version control them
  4. When a new version/release is ready to deploy, execute any new scripts against the clean database to get it from version X to version Y, and then update the version you have in source control.
  5. Finally, you would execute the scripts against the client database to update it.

Any updates you make to the database would be tested locally before deployment, and updating a client database would simply be a case of running the scripts that get it from one version to another. Source control would give you the history between clean versions of the database along with the scripts that were applied.
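
One common way to implement steps 4 and 5 (purely a sketch, not tied to any particular tool, with hypothetical names) is a small version table that every upgrade script checks and then bumps, so a script can never be applied to a database at the wrong version:

-- upgrade_2_to_3.sql: only runs against a version-2 database.
-- Assumes dbo.SchemaVersion(Version INT) was created by the install script.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE Version = 2)
BEGIN
    RAISERROR(N'Database is not at version 2; aborting upgrade.', 16, 1);
    RETURN;
END;

ALTER TABLE dbo.Customer ADD IsActive BIT NOT NULL CONSTRAINT DF_Customer_IsActive DEFAULT (1);

UPDATE dbo.SchemaVersion SET Version = 3;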

Tanner
2

I say don't do it. Use a SQL script built from statements that are as standard as you can make them, taking note of which SQL Server versions that version of the script works with.

I've had problems with software that was distributed via a backup when I've needed to do a re-install of the software at a much later date.

Due to the age of the original backup image used for deployment, and with no newer one available, I had to use an old version of SQL Server, as the newer versions of SQL Server did not support the old backup format. Then, in the process of applying nearly all of the application updates in sequence, I had to upgrade SQL Server over the top of that old version, because one of the application updates required a newer SQL Server version than the old "deployment backup" image originally did.

As a DBA I found this VERY frustrating to figure out and then actually carry out. As an extra bonus, I was doing all of this during a disaster recovery, just so I could restore the current production database from backup: that restore required the latest version of the software, which would not install without first installing the original "deployment backup".