Edit: Please note that since writing SQL Lighthouse has been renamed DLM Dashboard.
Last week, at SQL in the City London, Red Gate announced the release of a new tool, SQL Lighthouse. SQL Lighthouse is a product for monitoring the database environments in your pipeline for changes, or ‘drift’. In this post I’m not going to focus on the product’s features and benefits; instead, I’m going to review how the product came to be and some of the things we did in an effort to be ‘agile’… whatever that means. (Naturally, for the sake of simplicity, I’m cherry-picking a few memorable examples.)
‘Drift’ is where changes occur to a software system outside of the normal process – for example, when someone makes a ‘hot-fix’ directly on production (or some other environment). This can cause a problem when the business needs to deploy changes to that environment because, depending on the deployment method, there may be a conflict or the team might accidentally roll back the fix.
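As a concrete (and much simplified) illustration of what detecting drift involves, imagine comparing a baseline snapshot of each object’s definition against the current state of an environment. This is a hypothetical sketch in Python, not how SQL Lighthouse works internally; the schemas are represented as plain dictionaries for the example:

```python
import hashlib


def snapshot_hashes(schema):
    """Hash each object's DDL so snapshots are cheap to store and compare."""
    return {name: hashlib.sha256(ddl.encode()).hexdigest()
            for name, ddl in schema.items()}


def detect_drift(baseline, current):
    """Return objects added, removed, or altered since the baseline snapshot."""
    base, curr = snapshot_hashes(baseline), snapshot_hashes(current)
    return {
        "added":   sorted(set(curr) - set(base)),
        "removed": sorted(set(base) - set(curr)),
        "altered": sorted(name for name in base.keys() & curr.keys()
                          if base[name] != curr[name]),
    }


# Example: someone hot-fixed a stored procedure directly on production.
baseline = {"dbo.GetOrders": "CREATE PROC dbo.GetOrders AS SELECT OrderId FROM dbo.Orders"}
production = {"dbo.GetOrders": "CREATE PROC dbo.GetOrders AS SELECT OrderId FROM dbo.Orders -- hot-fix"}
print(detect_drift(baseline, production))  # 'dbo.GetOrders' shows up as altered
```

A real tool has to cope with permissions, data-dependent objects, and noise (e.g. index rebuilds), but the core comparison is this simple.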
The challenge of managing database drift is not new. Phil Factor wrote a SimpleTalk post about it back in 2012, and I have regularly discussed the problem with customers since I joined Red Gate in 2010. But I’m a young whippersnapper – I’m sure the problem is as old as software development itself, and it is particularly pertinent for databases because databases are persistent. There is a reason why people say the production database is a moving target. We had heard that drift was a problem from enough sources to conclude that it might be worth doing something about it.
So one of our product managers, Jonathan Hickford (who delivered the SQL in the City keynote with Steve Jones and John Theron), came up with the idea of a dashboard that displayed the current state of each of the databases in a pipeline. It would visualize any changes as they occurred. It would also allow teams to name versions of the database and track these versions as they were deployed through the pipeline. Finally, it would be totally independent of your method for deploying changes: it would work straight out of the box regardless of whether you deployed changes manually, using Red Gate products or using any other process.
This concept was born not only of regular feedback from users that they had this problem, but also of first-hand experience: in a previous life, Jon had suffered multiple failed deployments due to drift, which had caused his then employer a significant amount of grief.
Keep It Small Stupid
OK – I know KISS stands for something else – but I’m stealing it. Sue me.
The idea of the ‘Drift Dashboard’ (as it was called in the early days) seemed to be based on good foundations. We were attempting to solve a well-documented problem that our customers had told us they faced. However, there were still many unknowns. Were we solving it in the right way? Would people pay for this tool, or was there a viable alternative financial strategy to fund the project? Did enough people even recognize this problem? Would our solution meet our customers’ needs? Were we able to deliver the product fast enough at an acceptable development cost?
The worst thing we could do at this point was to put a huge team on the project and plough money into it before we were confident that we were building the right thing. We needed to validate our assumptions – did users really want this product, and could we deliver a solution that worked? – quickly and at a low cost.
We established a small team: two developers, David (blog) and Sean (twitter), a tester, Toby (twitter), half a project manager, Chris (blog/twitter), and one third of me. Why me, a technical sales guy, before the first line of code had been written? Well…
Qualify with potential users early
Our first goal was to build a prototype in two weeks. The emphasis was on shipping something fast rather than something production-ready. It should be good enough to demo, but we would NEVER show the code to anyone… ever. (We were running SQL Compare on a loop!) We would seek out twenty people who we thought would find the tool valuable, show it to them, and gather their feedback, both about whether the tool would meet their needs and whether the design made sense. Then, if the prototype was popular, we would throw it away and build a quality version from the ground up. The small dev team aced the prototype and achieved the audacious goal on time.
But what did I do? Having spent several years talking to customers who had exactly this problem, I made a few calls to people I thought might be interested. I also posted on SimpleTalk asking for participants and the response was extraordinary – if nothing else, the number and relevance of responses from that post indicated that we were thinking along the right lines. While the other people in the team got busy with code, I selected the twenty people who seemed most like our target persona out of the many responses I had received and arranged a UX session with each of them.
We didn’t have a tech author or any UX/design people so we all pitched in – which was great fun. We all learned some new skills and, with some mentoring from our resident UX guru, Marine (blog/twitter), we stumbled our way through a few UX sessions.
We recorded the thoughts of our guinea pigs on an enormous whiteboard and very quickly the same things kept coming up.
OK, that photo may be taken from a slightly favourable angle. 😛
- The concept was a good one. We were attempting to solve a real problem.
- The features we included in the prototype weren’t exactly right. Users wanted more detailed auditing; they wanted to know exactly who made the change and when. And they wanted email alerts because they didn’t want to have to keep watching the dashboard. Other features (such as scale or integration with SQL Compare/SQL Source Control) weren’t so critical for v1 for most users.
- Our UX sucked. The lack of a designer was evident and the terminology we used confused people.
- Our concern that users would not like our implementation plans for the real product was unfounded. (We were planning to use a DDL trigger, and we didn’t know if customers would like that.) Sure, we had one or two raised eyebrows, but most of the guinea pigs didn’t mind, and some had even tried to build a similar solution themselves.
- When drift occurs on a production database, DBAs always use metaphors of violence to describe how they would deal with the culprit. Interestingly, the exact approach differed between DBAs. Some would use fire, others would use sharp implements. One said he would use a cheese grater. It was never the same but it always involved pain. This led us to the conclusion that this tool might be useful for helping teams work together more productively.
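For the curious, the DDL-trigger approach we floated with our guinea pigs can be sketched roughly as below. This is an illustration only, not SQL Lighthouse’s actual code; the `dbo.SchemaChangeLog` table and trigger name are invented for the example:

```sql
-- Hypothetical log table (not part of the product).
CREATE TABLE dbo.SchemaChangeLog (
    EventType   nvarchar(128),
    ObjectName  nvarchar(256),
    LoginName   nvarchar(128),
    EventDate   datetime2 DEFAULT SYSDATETIME(),
    Command     nvarchar(max)
);
GO

-- A database-scoped DDL trigger fires on schema changes (CREATE/ALTER/DROP)
-- and exposes the details via EVENTDATA(), including who ran what, and when.
CREATE TRIGGER trg_CaptureDrift
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
    DECLARE @e xml = EVENTDATA();
    INSERT INTO dbo.SchemaChangeLog (EventType, ObjectName, LoginName, Command)
    SELECT
        @e.value('(/EVENT_INSTANCE/EventType)[1]',               'nvarchar(128)'),
        @e.value('(/EVENT_INSTANCE/ObjectName)[1]',              'nvarchar(256)'),
        @e.value('(/EVENT_INSTANCE/LoginName)[1]',               'nvarchar(128)'),
        @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)');
END;
```

You can see why eyebrows were raised – a trigger firing on every schema change on production is not something every DBA welcomes – but it is also why the approach works regardless of how changes are deployed.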
After fourteen sessions of hearing the same feedback over and over, we decided we had enough data to validate that the project was a good idea and that our approach was sensible, but that we had to rethink a few details – and add a UX person to the team.
Iterate in small chunks and continuously gather feedback
So, three weeks in, with only three and a bit people, we had the experience and lessons of building the prototype behind us and we knew that the product, if we could deliver it, might solve a real problem. We also knew that our approach could work, but that we had to fundamentally change our designs, and we were able to prioritise our backlog based on feedback from real users.
We started work on the actual product. We decided to call it SQL Lighthouse (after a similar project that had run the year before). After a month we had a very early alpha and we had the audacity to share it with our fourteen guinea pigs (from now on I’ll call them our alpha users). While we felt the code was pretty good, we knew we were sharing it with users earlier than felt comfortable. I was called on again to go through the installation with our alpha users because it wasn’t what Red Gate would call ‘ingeniously simple’ yet. It also tended to break in undignified ways. And user documentation… what documentation?
Of course, we would not be able to share such an early version of the code with just anyone. We had built up a good relationship with our alpha group and they were very supportive. They wanted us to succeed because they needed the product themselves and they were willing to help us troubleshoot the issues as and when they cropped up. We were very open with them that the product was nowhere near even BETA quality yet but they were quite keen to get involved anyway. One of them even enjoyed taking part for their own personal development as they wanted to see how other companies went about delivering software. Another took pleasure in actively trying to break the product. It was a wonderful experience to troubleshoot issues with our customers, and also to provide them with the fixes quickly. On some occasions they suggested novel solutions that we might not have considered by ourselves.
Once we had the alpha group set up with v0, we shipped v0.1 just a week later, and we continued to release every week right up until the BETA release. (This was a practice some of the team members had adopted on a previous project, and Chris, our project manager, blogged about it here.) We would have released more frequently, except that each customer had to actively install and play with each new version if we were going to get any feedback and, in the early days, the upgrade process wasn’t always easy.
By iterating relatively quickly we were able to find out fast whether our work was providing value, and we were able to spot issues early. Gradually during the alpha phase, as our confidence in SQL Lighthouse grew, so did the team: we added more developers and testers, we hired a UX person, we stole a tech author from another project, and we grabbed a marketer in preparation for the BETA release. Eventually I took a step back from the project because the complicated set-up gremlins were gradually being ironed out and I was needed in other parts of the business.
Technology is not always the best way to meet your goals
Of course, during the alpha phase we were really interested in whether or not our alpha users were actually using and getting value from the tool. With established products that have thousands of users we tend to track this data with automated feature usage reporting. This anonymised data tells us which features our users actually use and sometimes, where they get stuck.
Of course, if SQL Lighthouse took off we would want to do something similar, but in those early days the week or more required to set this up would have slowed us down, and we only had fourteen real-world users.
Instead of automated feature usage reporting, at the beginning of the alpha phase I grabbed an unused whiteboard and printed out an A4 sheet of paper for each of our alpha users with their name and job role. I split the board into three sections, ‘planning to install’, ‘attempting to install’ and ‘installed’ and I placed each user’s sheet in the appropriate section. I also found different coloured post-it pads to represent bugs, UX issues, feature requests and feedback from each user and started writing on post-its and sticking them to the appropriate sheets.
As the alpha phase progressed we added more users and removed one or two. By the end we had about thirty sheets on the whiteboard. The entire team worked directly with our alpha users to help them get up and running. Yes, we let the dev team speak directly to the alpha users and they did an awesome job of diagnosing the issues and fixing them. It was clear to us exactly how many of our alpha users were blocked because of some bug or other. It was really enjoyable to see each iterative alpha release allowing us to remove post-its with details about bugs or to move an alpha user from the ‘attempting to install’ to the ‘installed’ section. Every now and then we would have to increase the size of the ‘installed’ section to cater for the extra users – which felt great.
Apart from the obvious benefits of discovering bugs or design flaws, it was also really great to see the feedback coming in about a particular fix in the last release or a new feature that people loved. When we finally started to replace, bit by bit, our initial pig-ugly GUI with the really slick version that our new UX legend, Jonny (twitter), had designed, the feedback was really satisfying.
By working so closely with our early users we were able to build really good relationships and get awesome feedback. This feedback, apart from helping us to build a great product, made the project really rewarding. When comparing this project with other projects I’ve seen where the developers don’t get to see the difference they are making to their users, it seems clear to me why the SQL Lighthouse team was so motivated.
Minimum Viable Products and building quality in
Eric Ries (blog / twitter) talks about the concept of a ‘minimum viable product’ in detail in his book The Lean Startup. We’ve had mixed experiences with his approach, but my belief is that where we have had problems in the past it has been because stability was not called out as a requirement for a product to be ‘viable’. Given Red Gate’s brand, stable products are a requirement. If only features are called out as requirements, it is possible to find teams under an unreasonable amount of pressure to ship features at the expense of stability. We stated early on that ‘viable’ meant ‘not buggy’. I joked at the beginning of the project that David, one of the initial developers, claimed this project would have no bugs.
According to David Simner (http://t.co/r7aparBlxU) our new product will be “bug free”. You heard it here first!
— Alex Yates (@_AlexYates_) May 12, 2014
Obviously that was not entirely true. What he actually meant was that all the code would have tests and that the team would ‘stop the line’ and fix the code every time TeamCity (our CI server) reported a test failure. However, there are always cases where unexpected variables in the real world throw a spanner in the works. By shipping uncomfortably early builds to pre-warned people we were able to catch the stuff we had not expected early and, we hope, create a more stable product in the long run.
Prioritising fixes over features, however, came at a cost. We knew that we wanted to ship our BETA quickly, with both the audit and email notification features that our alpha users had told us were really important as well as the improved GUI. (This was important to us because, when we released the public BETA we wanted to be sure that people would get enough value from it that they would continue to use it and show it to their peers). Surprise, surprise, when we shared the code with real users we encountered more bugs than we expected. By focusing on stability we had to push back our estimates for releasing the required features. We hope that by fixing these issues early the BETA will be more stable, and hence more successful.
We also thought really hard about ‘technical debt’. We were fortunate enough to be working on a greenfield project and that gave us an opportunity to create a codebase we could be proud of. Project manager Chris stuck a large A3 picture on our whiteboard of a man struggling to carry a heavy sack. He told the team that it was inevitable that they would make some design decisions that would give us technical debt. He wanted the team to call out technical debt whenever they saw it: each time, they should discuss the implications of the decision and add a post-it to the sack the poor man was carrying. This very emotive poster acted as a reminder to keep the code clean, as well as a handy visualisation of how smelly the code was getting.
Well, we released last week – on time for SQL in the City.
By putting a small team on the project and getting eyes on it early we were able to validate our ideas for a relatively low cost. We were able to ramp up our investment in the product once we had more confidence it could work.
By gathering detailed feedback early and throughout we were able to work out what was and what wasn’t important and focus on getting the important work done.
By not including a UX person or tech author in the team early on we created a prototype that confused our users. We learned from that and invested in adding a UX person to the team and the result is a really attractive and simple product.
By releasing frequently and keeping in regular contact with our early users we were able to spot problems and focus on delivering real value, rather than new features that may or may not prove valuable.
By focusing on code quality early we started slower than we would have liked, but we are now in a good position to move relatively quickly.
PS. I’d like to take this opportunity to thank the team for being, plainly, awesome as well as all of our alpha users for their loyal support.
If you would like to take a look at SQL Lighthouse for yourself: