Introducing Java DB Migrations

UPDATE: A new version of the Java DB Migrations framework has been release, check this post for details and the project documentation.</p

Here at Carbon Five we have the luxury of working on many projects, so anything we can do to make things easier will pay off in multiplicity across new projects. One of the things that we have to deal with on every project is maintaining a database schema over time. We’ve had a manual process of capturing changes in incremental db patch scripts for a while, but it was error prone and sometimes neglected. We’ve been doing more Ruby on Rails work and found Rails Migrations easy to work with and a real time saver. We wanted something that would make our lives easier when working on Java projects in the same way Migrations improve Rails development. With that manifest in mind, Alon and I collaborated on a simple Java database migration framework.

During development, it’s a big deal because each engineer has two instances of the database, one for unit tests and another for running the application. We need an easy way to create a new, up-to-date database and update existing databases. Once a project has launched, it’s a big deal because we need a way to migrate a database teeming with important production data to the latest version without losing critical information.

High Level Requirements

  • Initiate a migration from the command-line as a Maven plugin
  • Programmatically migrate a database during application startup
  • Convention over Configuration
  • Initially support migrations written in SQL

At a high level, the migration process looks like this:

  1. Query the database (table db_version) to find the current version.
  2. Determine the latest database schema version available.
  3. If the database is out of date, run each migration in order in its own transaction, updating the db_version for each migration.

We’d identified two usage patterns, the first is more akin to the Rails Migration model in that you explicitly migrate the database via the command line. The second is automatic migration when an application starts up, before Hibernate initializes or any other data access takes place. I’ll
discuss each in turn.

Migrating using Maven

This functionality is easy to enable in a mavenized project. First you add the Carbon Five public plugin repository:

pom.xml

...
&lt;pluginRepositories&gt;
    &lt;pluginRepository&gt;
        &lt;id&gt;c5-public-repository&lt;/id&gt;
        &lt;url&gt;http://mvn.carbonfive.com/public&lt;/url&gt;
    &lt;/pluginRepository&gt;
&lt;/pluginRepositories&gt;
...

And then you configure the migration plugin:

pom.xml

...
&lt;plugin&gt;
    &lt;groupId&gt;com.carbonfive&lt;/groupId&gt;
    &lt;artifactId&gt;maven-migration-plugin&lt;/artifactId&gt;
    &lt;version&gt;0.9-SNAPSHOT&lt;/version&gt;
    &lt;configuration&gt;
        &lt;defaultEnvironment&gt;test&lt;/defaultEnvironment&gt;
        &lt;environments&gt;
            &lt;environment&gt;
                &lt;name&gt;default&lt;/name&gt;
                &lt;driver&gt;com.mysql.jdbc.Driver&lt;/driver&gt;
                &lt;username&gt;dev&lt;/username&gt;
                &lt;password&gt;dev&lt;/password&gt;
            &lt;/environment&gt;
            &lt;environment&gt;
                &lt;name&gt;test&lt;/name&gt;
                &lt;url&gt;jdbc:mysql://localhost/myapp_test&lt;/url&gt;
            &lt;/environment&gt;
            &lt;environment&gt;
                &lt;name&gt;development&lt;/name&gt;
                &lt;url&gt;jdbc:mysql://localhost/myapp_development&lt;/url&gt;
            &lt;/environment&gt;
        &lt;/environments&gt;
    &lt;/configuration&gt;
    &lt;dependencies&gt;
        &lt;dependency&gt;
        &lt;groupId&gt;mysql&lt;/groupId&gt;
        &lt;artifactId&gt;mysql-connector-java&lt;/artifactId&gt;
        &lt;version&gt;5.1.5&lt;/version&gt;
        &lt;/dependency&gt;
    &lt;/dependencies&gt;
&lt;/plugin&gt;
...

You’ll notice that we’ve got 2 environments configured. You can have as many as you need and you can specify which you want to migrate on the command line. If none are specified the default environment will be migrated. In this example we’re specifying the dependency on our JDBC driver so that the plugin has access to the code it needs to connect the database.

Lastly, you drop in your migration scripts into the src/main/resources/db/migrations directory, naming them using the pattern NNN_description.sql, where NNN is three digits indicating the script sequence. Some examples might be:

  • 001_create_users_table.sql
  • 002_add_default_users.sql
  • 003_add_lastvisit_column.sql

The description is optional and isn’t used for anything, it’s just there so that other developers can get an idea of what a script does without having to open it.

From the command line, you can run the migration plugin like this:

$ mvn migration:migrate

Note that he database must exist for the migrations to take place as we do not create missing databases (yet).

I’ve created a simple, complete sample that shows off this functionality, it’s on the C5 public subversion repository here. Check it out and then read the readme.txt at the top of the project.

Migrating from your Application

The other usage scenario is to auto-migrate during application startup. At the core of the framework, there’s an interface called MigrationManager which has two implementations: DataSourceMigrationManager and DriverManagerMigrationManager. Migration happens right after a datasource (of the javax.sql variety) is created.

Migrating from your application is as easy as instantiating one of these early in the startup cycle and invoking the migrate() method, something like this:

MigrationManager migrationManager = new
    DriverManagerMigrationManager(“com.mysql.jdbc.Driver”,
    “jdbc:mysql://localhost/myapp_test”, “dev”, “dev”);
migrationManager.migrate();

Of course this needs to happen before anything else in the application uses the database; we want to database to be updated completely before it’s used.

Spring is part of our standard development stack on our Java projects, and it’s easy to enforce these dependencies in Spring configuration. First we define a data source for the application:

&lt;bean id=&quot;dataSource&quot; class=&quot;org.springframework.jdbc.datasource.DriverManagerDataSource&quot;&gt;
    &lt;property name=&quot;driverClassName&quot; value=&quot;org.h2.Driver&quot;/&gt;
    &lt;property name=&quot;url&quot; value=&quot;jdbc:h2:file:~/.h2/migration_sample2_test&quot;/&gt;
    &lt;property name=&quot;username&quot; value=&quot;dev&quot;/&gt;
    &lt;property name=&quot;password&quot; value=&quot;dev&quot;/&gt;
&lt;/bean&gt;

And now we declare our MigrationManager instance. Note the ‘init-method’ attribute:

&lt;bean id=&quot;migrationManager&quot;
    class=&quot;com.carbonfive.db.migration.DataSourceMigrationManager&quot;
    init-method=&quot;migrate&quot;&gt;
    &lt;constructor-arg ref=&quot;dataSource&quot;/&gt;
&lt;/bean&gt;

And then you define something that’s going to use the defined DataSource. Note the ‘depends-on’ attribute:

&lt;bean id=&quot;userService&quot;
    class=&quot;com.carbonfive.migration.sample2.UserService&quot;
    depends-on=&quot;migrationManager&quot;&gt;
    &lt;constructor-arg ref=&quot;dataSource&quot;/&gt;
&lt;/bean&gt;

This is obviously a little contrived for the sake of example, but you get the point. In a typical application the thing that would depend on the datasource is a Hibernate SessionFactory.

You can visit the source code for this example on the C5 public subversion repository here.

Best Practices

Here are a few of the things we’ve learned along the way:

  • Start using migrations early. Definitely start by time there’s more than one person on a project. I usually start off letting hibernate generate my schema while I’m experimenting with things, but as soon as I’m really working on features I’ll switch over to migrations.
  • All database changes are captured as a new migration.
  • Migration scripts cannot be changed once anyone has run them and make further changes in a new migrations.

Source Code Access

The Carbon Five db-support project which contains all of this migration goodness is available on the C5 public subversion repository at https://svn.carbonfive.com/public/carbonfive/db-support/trunk. It’s a maven project and should compile and pass its tests out of the box. I encourage you to look through the code and check out the tests.

Future

If you look through the code you’ll see some of what’s in store for this project. We’ve got initial support for writing migrations in Groovy and JRuby and we’re thinking about added Java support as well. We’re looking for feedback to drive the future direction of the project, so feel free to write us and let us know what you think.

About Christian Nelson

Christian is a software developer, technical lead and agile coach. He's passionate about helping teams find creative ways to make work fun and productive. He's a partner at Carbon Five and serves as the Director of Engineering in the San Francisco office. When not slinging code or playing agile games, you can find him trekking in the Sierras and playing with his daughters.
This entry was posted in Web and tagged . Bookmark the permalink.
  • Lance

    This is really useful thanks!

    I have noticed one problem in that it doesn’t work with MS SQL Server because MS SQL Server does not allow inserts on a timestamp column. I made a simple fix in the SimpleVersionStrategy class that checks the database product name and lets MS SQL Server take care of the timestamp column by itself other wise works as normal.

    Also on the 0.9.1-SNAPSHOT release that was just posted today I get errors from Maven. It looks like either something was broken or the maven configuration changed.

    Other than that it is a great project and I really look forward to using it more.

    Lance

  • christian

    Thanks for the feedback!

    To date, I’ve been testing against PostgreSQL and MySQL, though I’d like to include MS SQL Server as a supported database. Can you send me a patch including your changes (or just a code snippet)?

    I’ve been busy making a number of changes to both the core db-migration framework and corresponding maven plugin. These changes will be included in v0.9.1 when it’s released in the near future (tomorrow likely). While 0.9.1 includes a number of improvements it’s not going to be drop-in compatible with previous releases. I recommend sticking with 0.9 and avoid the current -SNAPSHOT builds until 0.9.1 comes out and you have time to upgrade to the next version. I’m also updating the documentation on the project’s google code site.

    Cheers,
    Christian

  • Jim Alateras

    This looks great. I am interested in embedding database migration in my project and have several questions

    1. Is the framework responsible for creating the version table ?

    2. You mention above that you are going to support for writing migrators in Java. Is this still on the cards?

    3. Where do you specify the current schema version so that the migrator can determine whether the database schema needs to be migrated.

    4. Is each migrator run in its own transaction?

    cheers

  • christian

    Jim:

    There’s a more recent blog post that goes into more detail… check it out:

    http://blog.carbonfive.com/2008/09/java/java-database-migrations

    Also, there’s now a google code project page:

    http://code.google.com/p/c5-db-migration

    And some answers to your questions:

    1) Yes, by default it’s called schema_version, but it’s configurable.
    2) Not in the immediate future. Migrations can be written in Groovy or SQL though.
    3) The framework compares the schema_version table with the migrations that are available on the filesystem (or classpath) and applied all pending migrationautomatically.
    4) Yes, though this is really only useful for postgresql, the only database that runs DDL in such a way that it is rolled back if there’s an error.

  • Jim Alateras

    Christian,

    Thanks for the prompt response. Currently, i have hibernate auto create the schema. To move to database migration i need to generate the base schema and copy it in to the appropriate resource directory. Is that correct?

  • christian

    Jim:

    Exactly.

    If you’re using the maven plugin to apply migrations, then the default location is src/main/db/migrations/ (note that this changed with the 0.9.1 release). If you’re applying migrations from your application upon startup, then they need to be on the classpath (e.g. src/main/resources/db/migrations).

    Also, I release v0.9.2 last week which fixes some issues with multi-module projects and MS SQL Server.