T-SQL Tuesday #90 – You are doing “Continuous Integration” wrong!

It’s still Tuesday alright… In the USA anyway. (And by the USA I mean Hawaii.)

This blog post is part of T-SQL Tuesday #90 – Shipping Database Changes, hosted by James Anderson. T-SQL Tuesday is an online blog party started by Adam Machanic and you are invited to join in. A record of every previous T-SQL Tuesday is maintained by Steve Jones.

Let’s begin. *Takes a deep breath*

Everyone does CI wrong!

(OK, perhaps not everyone, but a lot of people.)

Whenever I deliver a conference session about database continuous integration (CI), I like to start by asking the audience a question: “Who can tell me what continuous integration means?”

I almost always get responses like:

“Automated deployments!”

“Automated builds upon commit!”

Very occasionally someone will impress me with something like:

“Unit tests!” or “Automatically running my unit tests!”

Not bad answers. Have a biscuit. But you are still missing the fundamental point.

Yes, it’s easy to understand that if we do some sort of testing or deployment validation every time we commit we can discover problems more quickly and fix them more easily… that’s one of the foundations of CI – but you are missing the other.

Continuous integration means “continuously integrating” your code with the code that other people wrote. “INTEGRATING”. “CONTINUOUSLY”. I mean, that’s exactly what it says on the tin, right?

ThoughtWorks would probably struggle to argue that they invented CI, but they have certainly done an awful lot to promote it and develop the idea. This is how they define it.

“Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day.

Each check-in is then verified by an automated build, allowing teams to detect problems early.

By integrating regularly, you can detect errors quickly, and locate them more easily.”

ThoughtWorks

People remember the second and third sentences but they forget the first.

Let’s drill into that first sentence.

“integrate code into a shared repository”

I don’t care what source control system you use. If you are working on different branches you are not integrating your code. According to the true meaning of continuous integration you are only integrating your code when it is merged with master.

“several times a day”

And you must commit or merge to master several times a day. Why? Because that is roughly the amount of code that humans can hold in their small, fallible brains at one time. Can you remember the code you wrote yesterday? I can’t. And even if you can, are you confident that everyone you work with (or will work with in the future) will be able to look back at your commit/merge and fully comprehend all of the changes you made, as well as their ramifications for the rest of the code base? No. Not if you have several days’ worth of development all bundled together.

“By integrating regularly, you can detect errors quickly, and locate them more easily.”

This is only true if each change represents a small piece of work that someone can fully load into the limited RAM available in their brain in one go without pulling a face. If a change represents a merge of a sprint’s worth of work – that doesn’t sound easy to unpick to me.

If you do any of these things you are not doing CI

  • Commit less frequently than “several times a day”
  • Work on feature branches that are not merged back to master “several times a day”
  • Have multiple streams of work against the same code base that are not integrated “several times a day”
  • Integrate several times a day, but the result is code that is not releasable
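If you want to turn that checklist into something you can actually run, one option is to flag any stream of work that has gone too long without reaching master. Here is a minimal sketch in Python. It assumes you can already pull the time each branch was last merged into master out of your source control history. The function name, the four-hour threshold (my stand-in for “several times a day”) and the branch names are all made up for illustration; pick whatever fits your team.

```python
from datetime import datetime, timedelta

# Assumption: "several times a day" is approximated as "at least every 4 hours".
# The post does not prescribe a number; tune this to taste.
MAX_GAP = timedelta(hours=4)


def stale_branches(last_integrated, now, max_gap=MAX_GAP):
    """Return the names of branches not merged into master within max_gap.

    last_integrated maps a branch name to the datetime of its most recent
    merge into master. How you collect those timestamps is up to you.
    """
    return sorted(
        name
        for name, merged_at in last_integrated.items()
        if now - merged_at > max_gap
    )


if __name__ == "__main__":
    # Hypothetical data: a bug-fix merged two hours ago and a "big feature"
    # branch that has been living on its own for six months.
    now = datetime(2017, 5, 9, 17, 0)
    branches = {
        "bugfix": datetime(2017, 5, 9, 15, 0),
        "big-feature": datetime(2016, 11, 9, 9, 0),
    }
    print(stale_branches(branches, now))  # ['big-feature']
```

Anything this reports is a branch that is accumulating merge debt, however green its own builds look. It says nothing about the last bullet, though: whether the integrated code is releasable is a job for your build and tests.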

I’ve personally seen the pain when a development team has decided to build a big, complicated feature on a branch alongside regular, simpler bug-fixes and smaller/easier features. They did not want to merge it back in until it was “ready”. The team believed they were doing CI because every commit to their branch triggered a build and the builds mostly passed.

However, after six months they discovered that they had created an enormous merge job. When they discovered quite how complicated the merge was going to be, project management decided they could not afford the time it would take to merge it back, so the project was abandoned. Another six months later, they decided that they really needed that big feature after all, so they rolled up their sleeves and tried to “integrate” the code again. The poor developers lumped with that task (who had brought it on themselves, to be fair) did not enjoy that job, but they got it done. Unfortunately, in the end they realised the feature was simply not reliable enough, so it was abandoned and built from scratch anyway.

What a colossal waste of time and effort! (And money… so so much money.)

Automating your build is easy – the hard thing is…

…developing multiple features at the same time while integrating several times a day. Here are a few ideas about how to handle this challenge:

So, yeah. Sorry if I just burst your bubble – but if you aren’t integrating your code to mainline several times a day you aren’t doing continuous integration.

No-one ever said CI was easy.

*

If you enjoyed this post, why not tune in for my GroupBy session in a few weeks, “Getting CI right for SQL Server”? And the following week James Anderson, this month’s T-SQL Tuesday host, will be talking about “SQL Server and Continuous Integration”, so don’t miss that one either!

**

Cover image: You’re doing it wrong by Adam Swank shared under the Creative Commons Attribution-ShareAlike 2.0 Generic licence.
