My Innovative Solution to Software Development: Versioned Database Migration
A good software project shall have a repeatable, quick, reliable database structure change process.
This is included in the “My Innovative Solution to Test Automation and Continuous Testing” series.
In an ideal world, architects design the database well so that it won’t change much during its development. However, as we know, the reality is very different, especially in agile projects (pretty much every one of them nowadays). You do need to alter your database tables’ structure, a lot.
Since 2007, I have observed that an important step was totally missing in many companies’ so-called CI processes: Database Migration. It seems all software teams are managing the database schema manually. Given the database schema needs to evolve frequently, wouldn’t that be a big risk?
Software Teams Need Versioned Database Migration
Many years ago, I watched a presentation by Scott Ambler, the co-author of “Refactoring Databases: Evolutionary Database Design” book. Scott started by asking the audience a question: “Suppose you make a simple change to a database column, such as surname
to last_name
, how confident are you to push the updates to production? If so, raise your hand.” No hands were raised from the audience. Scott then said: “It is a simple change, isn’t it? And, we do need to perform that occasionally, don’t we?”
As soon as one freezes a design, it becomes obsolete. — Fred Brooks
Database table structure surely needs updates, and every experienced software engineer understands that. Typically, they jump on a Database management tool, such as Toad, to alter manually. However, this is not only against engineering principles, it is also error-prone. (some use tools such as Visual Diff. But it is still not good enough)
As a matter of fact, many CI solutions failed on database migrations step before hitting the testing phase. The workaround is manual database updates, Yak.
Why are many software teams still performing manual database migrations?
The short answer is that they don’t have or trust script-based solutions (the term for it is database migration, please note, and this is NOT data migration). Frankly, I was not aware of this concept after programming in Java/C# for nearly ten years until I discovered Ruby on Rails in 2006.
In the past decade, I have observed that some software engineers attempted a quick, reliable, and repeatable database migration process, but failed. There are two challenges:
How will you make database migration reliable and repeatable?
Use a proven script-based migration scheme, such as Ruby on Rails’ one. This is the main topic of this article, read on.How will you make sure that the changes will not cause data issues?
A bad code change is often not a big deal, and can usually be patched up in the next update. However, a bad database update to data, e.g. dropping a wrong column, is disastrous.
Anyway, you need an automated End-to-End regression Testing process. I know many readers would think, “Yes, automated regression testing would be nice, but we don’t have the skills, and it will take too much effort”. My answer is: “Learn and do it if you regard yourself as a software engineer”.
Check out my many other articles on test automation and continuous testing.
Database migrations in my own software projects
Here are the database migration scripts for one of my web apps: WhenWise. Please note the size of the scrollbar.
WhenWise is developed in Ruby on Rails (RoR), a wonderful web framework. Let’s have a look at a RoR database migration script: 20220729231342_add_currency_to_products.rb
, I added multiple currency support for products and services on WhenWise.
class AddCurrencyToProducts < ActiveRecord::Migration[7.0]
def change
add_column :products, :currency, :string, :default => "USD"
end
end
This is the simplest form. There are others, such as rename_column
, create_table
, drop_column
, … ,etc.
Some might think: “This is standard RoR, nothing innovative.”
Next, I will share a story on using this in a Java project successfully.
Story: Add RoR migration to a Java Project
In 2008, I joined a Java project (government department) as the tech lead. There was no CI or Automated End-to-End testing, and even unit testing was done poorly like many software projects.
The first practice I introduced was automated End-to-End testing using Watir, which was well-accepted by the team members. With the growth of the test case numbers, I researched the ‘database reset’ capability. However, the database reset was too slow for SQL Server. (The detailed story was in the article “The Simpsons’ Data Reset Pattern”). Long story short, I implemented database migration scripts to migrate two databases, MySQL and Microsoft SQL Server.
When I mentioned using Ruby on Rails database migration, I could sense the hesitation among the team. These people were Java programmers and had worked on this for more than a year. (they might fear I would change this to a Ruby on Rails project like Bruce Tate, a renowned Java guru, did it in “Beyond Java” book).
So, I decided to do it in Java way. How could I do Ruby on Rails database migration without the Ruby language? The answer came out quickly: JRuby.
The database migration script, as you saw earlier, while in Ruby syntax, is fairly readable. I could use JRuby to run these scripts.
JRuby is a mature implementation of Ruby, in Java. ThoughtWorks’ Mingle (the project management tool like Jira, but before Jira) was coded in Ruby for higher productivity, and then deployed in Java web server.
The implementation was quite simple. I just added a helper jruby.jar
to invoke the database migration like Ruby on Rails. To team members, it was just an extra Java utility library.
Very soon, a few java programmers started liking Ruby. In memory, at least two of them worked on Ruby projects later on (for others or their own).
Another story (C# project)
About ten years ago, I was lent to a sister project team to take care of testing. The development was outsourced to a medium-sized Software Consulting company (Microsoft Gold Partner). Those C# consultants had no idea of data migration. For every sprint build, they would spend quite some time updating the database schema manually. They even brought a database visual-diff tool (compared the changes of dev and local database changes). Anyway, the process was time-consuming and error-prone.
Finally, I couldn’t stand it anymore. I implemented the database migration and showed it to the lead consultant. He was surprised and rejected the change with the excuse that he “can’t introduce another language (ruby)”. The chaos remained. These C# programmers thanked me for my automated regression testing which had detected regression issues (on a daily basis) and some of which were caused by dodgy database changes.
In 2009, I thought I might be using this often in future Java projects. Therefore, I created a reusable package for data migration and uploaded it on Github. But I have never used it as I switched my day work from Programming to Automated Testing the next year. I am still coding though, in my spare time for my own apps which are mainly two types: Web apps using Ruby on Rails and Desktop apps using C++.
Related reading: