Critiquing two different approaches to delivering databases: Migrations vs state

tl;dr

Some argue that thinking about databases purely in terms of migrations provides repeatability and reliable deployments. Others argue that working declaratively, treating the database state as code, is a better development experience providing immutability, reliable testing and greater productivity. Most people have a strong opinion – not enough people discuss the relative pros and cons in a balanced way. This makes it hard for people who need to make rational choices for real projects to decide which will suit them.

This post aims to address that. Comments welcome.


The most significant choice any team must make when choosing a strategy for delivering their database:

Migrations or state?

I’m taking it as a given that using a standardised process for source control is a good thing. This will make your lives as database developers much easier and it will help you work more closely with your DBAs/ops teams as it will give you more control over your deployments. In order to do this you do need to pick either the state or migrations format to manage your database scripts. This means that whatever option you choose you will have to live with its flaws and find ways to manage them.

This post follows on from my article in NDC magazine (p16) and covers some of the same ground as my “Elephant in the Room” talk. It describes the relative pros and cons of each approach, including a technical example, and suggests which might be better suited to your projects – because often projects are better suited to one approach or the other.

Defining the problem and the two solutions

You need to update your database schema without changing/deleting the data. This means, unlike with application code, you can’t simply delete the old database and create a new one every time you deploy your changes. As well as defining how the database should look, you need to write an upgrade script.

The two approaches, at their root, are about which is your source of truth, the definition of how you want the database to look, or the scripts you produce to handle the upgrade. The former is referred to as the declarative or state-based model. The latter is referred to as the imperative or migrations-based model. There are also a myriad of other names given to the two approaches. In this post I will use the terms ‘state’ and ‘migrations’.

State: Your source of truth is how the database should be

Migrations: Your source of truth is how the database should change

Some would say state is a better development experience, providing immutability and reliable testing. Others would say migrations provides more repeatable and reliable deployments. Most have a strong opinion about which is better and find it hard to understand how the other can ever be reliable or agile. In my opinion there is not enough balanced consideration of their relative strengths. That is what I will aim to provide in this post.

Our scenario

In order to explore the relative strengths of the two approaches, let’s imagine a scenario. I’m going to start with the following table and stored procedure:

-- table_foo.sql

CREATE TABLE table_foo

(col1 NVARCHAR(max), col2 NVARCHAR(max))

-- myproc.sql

CREATE PROCEDURE myproc

AS

SELECT col1

FROM table_foo

And then I have two tasks to perform in order to implement two separate features. These tasks are given to different developers:

  1. Alter the stored procedure so that it selects data from both columns
  2. Rename table_foo to table_bar

Now let’s discuss everything that is about to go wrong. Let’s start by trying a migrations approach.

Birds

The first developer is going to write some code to define the change they need to make. It will probably look a little bit like this:

ALTER PROCEDURE myproc

AS

BEGIN

SELECT col1, col2

FROM table_foo

END

All good.

Now comes the second developer, tasked with renaming the table. It is not hard to imagine the second developer writing some code that looks like this in order to handle the dependency as well as the table rename:

EXEC sp_rename 'table_foo', 'table_bar‘

 

ALTER PROCEDURE myproc

AS

BEGIN

SELECT col1

FROM table_bar

END

Of course, this second developer should have checked that they were working off the most recent version of the database – but they didn’t.

The problem here is that the second developer just overwrote the first developer’s changes – the stored procedure no longer selects col2. With ALTER statements the last one to run always wins. This means that with a migrations-based strategy, team discipline is exceptionally important and managing the order of upgrade scripts is a very important task.

Now imagine you are using branches. Imagine the merge conflict as you have to painstakingly go through every script, line by line, to work out the order that they need to run in or if there are conflicts.

So clearly the migrations approach is hard and we should use state instead, right? Well let’s consider how a state-based approach would handle this scenario:

First, the stored procedure. Instead of writing an ALTER statement the developer will write a new definition of the stored procedure as a CREATE statement, something a bit like this:

-- myproc.sql

CREATE PROCEDURE myproc

AS

SELECT col1, col2

FROM table_foo

And the second developer will do the same thing, something a bit like this:

-- myproc.sql

CREATE PROCEDURE myproc

AS

SELECT col1

FROM table_bar

And we still have a conflict, but this time simple diff tools can help us to spot and resolve it:

diff

So in the state model it is easier to spot and handle conflicts. Great. But what about that table rename?

When using a state-based solution you will most often be using a diff tool like those provided by Redgate or Visual Studio to examine the differences and generate an upgrade script. While this is a very efficient solution for most changes, with table renames and a few other types of table refactoring they can do bad things, a bit like this:

DROP TABLE table_foo
 

CREATE TABLE table_bar

(col1 NVARCHAR(max), col2 NVARCHAR(max))
 

ALTER PROCEDURE myproc

AS

SELECT col1, col2 FROM table_bar

In this scenario, when doing a deployment, a diff tool is likely to see that there is a table in the production database called foo, which should be deleted, and a table in the source called bar, which should be created. It does not know the context of the change.

This is an area where a purely migrations-based approach is strong and a purely state-based approach is weak since developers can write scripts to define precisely how a deployment should be handled.

And here’s another problem: Drift

So we have already seen that migrations- and state-based approaches are strong and weak in different areas. Let’s consider a problem that both suffer with, but in different ways.

Do you work with that frustrating person who is always making ‘hot-fix’ changes to QA or even production databases? Perhaps it is you, because the cost of delaying the fix outweighs the cost of the risk? Either way, let’s just assume that some change has happened on the production environment and was not made in the development environment.

Drift

What would a state-based approach do? It would roll back the fix. It would make the target database look like the version in development that did not have the fix. Oops.

What would a migrations-based approach do? It would ignore the hot-fix. It is possible the deployment would fail, or (in my opinion worse) it would pass. This is bad because now you are running a version of the database in production that has never been tested. You could very easily have integration issues. What is even more damaging is that, unless you have some sort of check in place, you will never find out, until you have a problem.

So both state and migrations fall short here. The only solution is to make efforts to eradicate drift and implement monitoring systems to alert you whenever drift does occur so that it can be dealt with. Redgate offers a free tool that can help with this: DLM Dashboard

So where does that leave us?

For state-based solutions you get the benefits of a much easier development workflow. It enables larger teams to work together on more complicated databases with more complicated dependency webs to unpick. This lends itself to those monolithic legacy systems that so many of us are lumbered with. For similar reasons it also lends itself to fast development, perhaps in the early phase of a project, where breaking changes aren’t so important and the database is frequently being updated.

The caveat is that you need to really understand your deployment tooling and you probably need a way to test and review your upgrade scripts. You probably want to have a staging environment where you can give all deployment scripts a dry run. You also need to have a plan for what to do when an upgrade script generated by your software is not sufficient.

For migrations-based solutions you are giving control of the deployment process to your developers. This gives them great power and great responsibility. It works really well for databases that are simple data stores where most of the changes are table refactorings or data migrations. However it is a very inefficient way to work with stored procedures and functions etc, where there are no data migrations to worry about. It also becomes quite painful quite quickly when there are frequent database changes because the task of managing the order of scripts and reviewing all the code from vX to vY becomes a big burden. In order to manage this effectively you need to enforce rock solid processes to ensure you avoid conflicts.

In both cases you need to think hard about drift and be aware of the consequences of failing to deal with it appropriately.

Skin a cat

It is worth knowing that many of the tools you are likely to use to support you offer features that patch around the problems with their standard approach. For example, among state-based tools Visual Studio SSDT lets you use pre- and post-deployment scripts and Redgate has a ‘migrations scripts’ feature, both allowing users to hard code certain parts of the deployment process. Of the many migrations-based tools ReadyRoll has an interesting feature where you can effectively use a purely migrations approach for your tables at the same time as using a purely state approach for your functions etc. These sort of hybrid solutions do solve some of the problems, but they also bring extra complexity and will normally live mostly in one camp with only a few token gestures at the other.

So what should I use for my project?

In summary, if you are a large development team, making frequent changes, on a database that contains many stored procedures and which has a complicated dependency network, you will probably find a state based solution more successful.

If you are a small team that does not frequently edit your database, which you mostly use as a simple data dump, and as a result most of the changes are table refactorings or data migrations, you are likely to have more success with a migrations-based solution.

Of course, the chances are you won’t fit neatly into one category or the other so you’ll have to make a decision based on what mostly fits your needs and what your team feels most comfortable with. And by ‘team’ I include all the developers, testers and DBAs, as well as anyone else who plays a role delivering your database.

And of course, you may find that throughout the lifecycle of your database you decide to switch from one approach to the other. For example, when working on a greenfield project that has not yet been released, the experience of working in a state-based environment is really nice. You can work quickly making many changes and refactorings as you try different designs – and deployments are no concern just yet. Later on, when you have an established product with paying customers and you make fewer changes to the database you may find the migrations approach is more suitable. And after a while, when that exciting new tool has turned into a monolithic, complex, legacy nightmare where changing something over here breaks something over there, you might find the state-based solution becomes attractive once more.

I guess, if you take one thing from this post, please let it be the message that it is a wonderful thing to be open-minded. And to take anyone who tells you there is only one correct way to do something with a pinch of salt.

I wish you every success. Please do let me know what you have tried and what worked well – and not so well.

  11 comments for “Critiquing two different approaches to delivering databases: Migrations vs state

  1. June 19, 2015 at 10:48 am

    I agree with you about database development in “greenfield” environments: If you and your team working on v1 of your database then the ability to drop and recreate your database at any time during the development cycle makes the state-based approach ideal, since you can chop and change your schema design constantly and not have to deal with a mountain of (potentially obsolete) sequentially-ordered scripts in source control.

    In terms of the suitability of the state-based approach in the context of monolithic legacy systems, then this really depends on what you’re trying to achieve.

    If you’re in the initial stages of your CD journey and trying to get on top of your sprawling database environments, then I agree that a tool that works most effectively at source controlling your schema is very likely going to be the best option. State-based tools are exceptionally good at capturing and tracking snapshots of a schema as its design evolves, thus are a great place to start if you’re trying to understand the nature of the beast that you’re trying to tame. As an added bonus, these tools will often be able to tell you where there are problems in your schema, like a view that references a column that has long since being dropped, or if there is a data-type mismatch between the parameters of two stored procedures.

    If you’ve managed to get a handle on your Production databases, and discovered that it’s actually a giant mess, with circular dependencies between databases, triggers that update data in tables via linked servers, or reference data that differs between environments, then you may encounter problems trying to adopt Continuous Delivery using the state-based approach. Trying to replicate that same big ball of mud within an environment where improvements can be made, tested, refined and then propagated through an automated series of steps onto Production is going to be made difficult due to the need of those tools to analyse the available inputs (as defined by your source-controlled model) and make decisions about how to synchronise the source and target at deployment time. Your ability to carry this out successfully without the need for manual intervention (e.g. tweaking of synchronisation options or editing of the generated script prior to deployment) entirely hinges on a pre-defined set of rules baked into the tool.

    Ironically, an aspect of migrations that you (rightfully) highlighted a weakness of the approach — that schema drift can occur as a result of changes made directly to Production — also lends it flexibility in the way that you decide to adopt database deployment automation: you can tolerate the fact that source control doesn’t have a complete picture of your environment because it means you don’t have to bite off the entire problem at once. Instead, you can start small, for example only adopting CD for a subset of your schema objects, and then revisiting the “too hard basket” once you’ve managed to get a repeatable deployment process in place. Basically, the imperative nature of migrations give you the scope to come up with novel solutions to deployment challenges that are beyond the bounds of the declarative model.

    Of course, the strengths and weaknesses of these two approaches to solve the problem at hand often comes down to environmental and people factors, as well as the various limitations of the available tools. There is definitely nuance to this topic and our industry isn’t well served by those who would proclaim that one way works best in all scenarios.

    Thank you Alex for kicking-off this debate again in a constructive and balanced way!

    • Alex Yates
      June 20, 2015 at 2:23 pm

      Thanks Dan,

      A very fair response. You are right that there are other scenarios that I have not explicitly covered where one or the other approach may be superior and I think you are right to call this out.

      I also agree that sometimes people can find monolithic databases are just too complex for diff tools to handle. Either they have circular dependencies or dependencies on external infrastructure etc, which can complicate build tasks, or they are using deprecated or obscure features of their database platform which are not supported by tooling. In this scenario I agree that the sort of migrations style solution you suggest will provide at least some control – which is better than none.

      However, I’m not sure I’m comfortable taking the step you suggest to tolerate the fact that source control doesn’t have a complete picture of the schema. I can see that applying some rigour to your migrations and keeping those in source control is a good first step – but it is only a first step. Continuous delivery is not just about changes. It is about testing and feedback loops and experimentation. If your source control does not have the complete picture then it cannot be a source of truth – so how can you trust your tests?

      I’ve also been thinking a bit following conversations at NDC and feedback on my talk/this post. I think a lot of it comes down to how early in your pipeline you lock in the deployment script. Effectively, with both state and migrations, almost everyone will start their dev by making some changes to a database and trying to get something working in dev. By definition, that is a state based task as you are making a change to a real database (or scripts that represent one). Then you commit to source control. The thing you commit will either be a representation of the state or the migration – which is basically the point of this post. Even with a tool like ReadyRoll many people will start doing dev in the state world, but then switch to migrations before commit.

      But even when using a state based model some teams will have an automated build where one of the outputs is an upgrade script. This upgrade script is intended as a deployment artifact that can be run on production, potentially in an automated way, but not before it has been tested on various other environments. So even when source controlling the database state these people are using a migrations approach for their build artifacts, locking in deployment scripts really early in their pipeline.

      Then you have the people who remain in a state based world in their testing environments, using a diff tool to work out the deployment scripts for the ‘disposable’ environments which are not managed by DBAs. This gives them more agility to throw whatever version they need onto the test server on demand. It is only when they hand off to the DBA that the deployment script gets locked down, ready to be tested on a staging environment before being run on production. So here the switch from state to migrations happens later again, and perhaps it correlates to the switching from dev to ops responsibility?

      I suppose taking this idea to the extreme, a pure migrations approach would mean developers don’t even have their own databases to play with and a pure state model would leave the generation of the upgrade script right to the actual production deployment. Of course neither of those options are ideal so most will choose somewhere in the middle.

      Perhaps the real question is not black and white, but more a gradual scale: “How early/late do you lock down the deployment scripts?”

  2. June 20, 2015 at 11:11 am

    Wonderful blog post Alex – very well written and nicely balanced. Should be required reading for devs in my opinion. I’m a big fan of SSDT however I am intrigued to know what the Redgate offering can bring to the table since SSDT isn’t perfect by any means. Another thing to add to my list…

    • Alex Yates
      June 20, 2015 at 1:45 pm

      Thanks Graham,

      I appreciate your comments. I made an effort in the main part of the post to be as balanced as possible and not go into too much detail about any particular tools. However, since you asked about Redgate tools… 🙂

      Redgate produces a range of tools that you can use to support you through the various steps of a continuous delivery pipeline, including managing source control, running ‘builds’, creating deployment artifacts, automating tests and executing deployments. You can read more here: http://www.red-gate.com/products/dlm/

      For the deployment part Redgate uses a tool called SQL Release. For SQL Release the deployment artifact, which is called a release, includes an upgrade script which you can test in advance (so you know no magic scripting will happen when deploying to production) and it also includes drift detection by default. It’s designed as a set of PowerShell cmdlets so that it can easily plugin to release management tools like Octopus Deploy and Visual Studio Release Management. You can read more about it here: http://www.red-gate.com/products/dlm/dlm-automation-suite/sql-release

      Redgate also has some free offerings, including DLM Dashboard, which you can use to monitor for drift across your pipeline in real time. It is also well suited for auditing changes to production databases as it provides a history of every change and who executed it: http://www.red-gate.com/products/dlm/dlm-dashboard

      Let me know if you have any questions.

      Alex

  3. June 15, 2016 at 11:39 pm

    Bookmarking! This is exactly the detail I was looking for. Thank you for the detailed walkthrough, as it’s a lot of paradigm shifting concepts to work through.

  4. Robert Lucente
    September 29, 2016 at 10:58 am

    The blog assumes an either or solution in the problem statement. This is self limiting. I wrote a blog post to address this self limitation. The url for the blog post is

    http://rlucente.blogspot.com/2016/09/can-purely-state-based-database-schema.html

  5. Robert Lucente
    September 29, 2016 at 10:46 pm

    I tried the posting below yesterday but somehow it didn’t take but Alex was kind enough to place a reply on the blog post below. Not sure what happened.

    It is realized that the above blog does not mention hybrid solutions on purpose because trying to clarify the debate and explain the pros and cons of each pure approach. Unfortunately, some people might assume that it is an either or situation from the problem statement (Migrations vs state). This is self limiting. I wrote a blog post to provide information on a hybrid approach. It can be found at

    http://rlucente.blogspot.com/2016/09/can-purely-state-based-database-schema.html

    • Robert Lucente
      September 29, 2016 at 10:46 pm

      Ooops on my part. I did not realize that “Your comment is awaiting moderation”.

      • Alex Yates
        September 29, 2016 at 11:51 pm

        Oops, sorry for the delay. I saw the reply and got distracted reading your very good response, and forgot to approve your comment. Sincere apologies.

Leave a Reply

Your email address will not be published. Required fields are marked *