What are microservices good for, anyway?

This article is a small intermission in our series about the socio-technical aspects of microservices. I realized that I hadn’t quite covered why you would choose a microservices architecture in the first place. This is what we’ll cover today.

When an organization grows to a certain size, coordination becomes a major bottleneck. Things that used to work well with a handful of developers are no longer sufficient when there are dozens of them.

One thing that usually causes problems is the release process.

On change coordination

The more developers there are, the more changes will be made to your codebase. All these changes have to be coordinated, or they will collide with each other. Merge conflicts are among the easier collisions to handle. It gets really difficult when you have to coordinate database migrations as well.

Every time you want to get your changes deployed to production, you’ve got to push them through a deployment pipeline. If you’re the only one making changes, your changes can enter the pipeline immediately. But if someone else is currently sending their changes through the pipeline, you’ve got to wait until it is free again.

In an ideal world, we wouldn’t ever have to wait for the deployment pipeline. This is because we want to get feedback on our changes as quickly as possible. The sooner we can get our changes deployed, the earlier we will be able to learn and adapt.

In lean software development, we refer to this goal as “reducing our lead time”. The lead time is the time it takes for any change to go from conception to release. Some of the best-performing organizations can get changes deployed within a few hours.

Unfortunately, this goal can be difficult to attain in large organizations. When there are a hundred developers, all of them trying to change the same monolithic system, there will be lots of waiting. The organization will no longer be constrained by how quickly changes can be made, but by how quickly changes can be deployed.

One way to overcome this problem is batching. Instead of deploying each change individually, why not bundle a whole range of them into a single deployment? For example, you could deploy all pending changes simultaneously once the previous deployment has finished. With this scheme, no change has to wait longer than roughly the time it takes to perform a single deployment.
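To make the effect of batching on wait times concrete, here is a minimal Python sketch. It is a toy model, not a real pipeline: the function names, the fixed 30-minute deployment, and the arrival pattern of one change every 5 minutes are all assumptions I made up for illustration.

```python
def serial_waits(arrivals, deploy_minutes):
    """Wait per change when each change is deployed on its own, first come, first served."""
    waits = []
    pipeline_free_at = 0
    for arrival in sorted(arrivals):
        start = max(arrival, pipeline_free_at)
        waits.append(start - arrival)
        pipeline_free_at = start + deploy_minutes
    return waits


def batched_waits(arrivals, deploy_minutes):
    """Wait per change when all pending changes are deployed together once the pipeline is free."""
    waits = []
    pending = sorted(arrivals)
    pipeline_free_at = 0
    i = 0
    while i < len(pending):
        start = max(pending[i], pipeline_free_at)
        # Every change that has arrived by the time the pipeline starts joins this batch.
        while i < len(pending) and pending[i] <= start:
            waits.append(start - pending[i])
            i += 1
        pipeline_free_at = start + deploy_minutes
    return waits


# Hypothetical scenario: 20 changes, one arriving every 5 minutes,
# and a pipeline run that always takes 30 minutes.
arrivals = [5 * n for n in range(20)]
serial = serial_waits(arrivals, deploy_minutes=30)
batched = batched_waits(arrivals, deploy_minutes=30)

print(sum(serial) / len(serial))    # average wait in minutes, one change per deployment
print(sum(batched) / len(batched))  # average wait in minutes, batched deployments
print(max(batched))                 # longest wait in minutes, batched deployments
```

In this toy scenario, deploying changes one at a time leaves the average change waiting almost four hours, while batching keeps every wait below the duration of a single deployment.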

The trouble with batching

The approach of batching changes can help and, in fact, is quite common. But it also has a big downside: when things go wrong after a deployment, it’s harder to find the source of the problem.

Imagine that most of your database queries suddenly started to time out. Your system is failing to serve your customers and a major incident is raised. It is your job to figure out what exactly has gone wrong. You pull up the list of changes that made it into the previous release and you find changes made by 50 developers. Good luck finding out which of their changes are contributing to the incident.

If you’re especially unlucky, the problem is not caused by a single change, but by the interactions between several changes.
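A quick back-of-the-envelope calculation shows why interactions make the search so much harder. The sketch below assumes that the release from the example contains exactly 50 changes, one per developer, which is my simplification. It counts how many suspects you have if the culprit is a single change, a pair of changes, or a group of three.

```python
from math import comb

# Assumption: one change per developer in the release from the example.
changes_in_release = 50

single_suspects = changes_in_release           # one change is to blame
pair_suspects = comb(changes_in_release, 2)    # two changes interact
triple_suspects = comb(changes_in_release, 3)  # three changes interact

print(single_suspects, pair_suspects, triple_suspects)
```

With a single deployed change, all three numbers shrink to one suspect or none at all.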

This is exactly why batching is not that great after all and why we should try to avoid it. What else can we do though?

Microservices to the rescue

This is where microservices can help.

When an organization decomposes its monolithic system into a set of independently deployable services, it can overcome the change coordination problem we’ve just talked about.

The idea is quite simple: instead of releasing the entire system and all of its changes in one go, we divide the system into several services. We then assign each service to one team. Each team can now deploy its service independently of all the others, which significantly reduces the amount of coordination necessary. A change only needs to be communicated to the handful of developers on the owning team instead of the whole organization.
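Here is a rough sketch of what independent deployments do to the waiting problem from earlier. The numbers are assumptions chosen for illustration: 100 changes are queued at the same moment, they are spread evenly across 10 services, and every pipeline run takes 30 minutes regardless of the size of the system (a service pipeline would likely be faster than a monolith's, so this is pessimistic).

```python
def average_wait(queued_changes, deploy_minutes):
    """Average wait when queued changes are deployed one at a time, in order."""
    # The change at position 0 starts immediately; each later one waits
    # for all the deployments in front of it.
    waits = [position * deploy_minutes for position in range(queued_changes)]
    return sum(waits) / len(waits)


# Hypothetical organization: 100 queued changes, 10 services, 30-minute pipeline runs.
monolith_wait = average_wait(queued_changes=100, deploy_minutes=30)
per_service_wait = average_wait(queued_changes=10, deploy_minutes=30)

print(monolith_wait)     # average wait in minutes with one shared pipeline
print(per_service_wait)  # average wait in minutes with one pipeline per service
```

Under these assumptions, the average wait drops from roughly a day to a little over two hours, without bundling any changes together.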

Even better, we avoid the problems associated with batching. We can go back to deploying each change individually. When things go wrong after a deployment, it’s a lot easier to figure out what has caused the problem.

Of course, microservices are not a silver bullet. They come with their own set of challenges. But if they are applied in the right context, they can remove huge amounts of coordination overhead.

In the next article, we will talk about microservices and cognitive load.