In the first part of this blog mini-series, we examined the unexpected pitfalls of microservices and the careful consideration needed to address their costs and complexities. Despite these challenges, adopting a service architecture remains essential for many organizations to empower their product teams. How can we ensure a smooth migration, free from regret?
Join us as we continue our deep dive into the world of service architecture and unveil the secrets to a regret-free migration.
Microservices migration regrets
There are plenty of posts ranting about microservices migrations gone wrong, and to be honest, I don't know a single engineering leader who hasn't regretted rushing into microservices, splitting services too thin, or doing the wrong split.
Just to quote the last one I heard:
When people tell me that 'we are splitting our monolith', I laugh and say, 'Trust me, you will go back to a monolith.' A little-known secret is that Netflix is going back to monolithic services.
– Bruce Wang, Netflix Director of Engineering - LeadDev NY 2023
So, if people regret microservices architecture so much, what are their biggest regrets? This is my summary of what I hear:
- Our microservices are not well decoupled, and now that the services are split, it is much harder and more expensive to refactor.
- The maintenance cost of having many microservices is very high. We sliced the microservices too thin, thinking the organization would continue scaling, but it hasn't, and now teams need to maintain a large number of microservices that we barely change but that still need to be kept alive and secure.
- Infrastructure costs have increased a fair amount. People often forget that if you move from 1 monolith to 10 services, you move to (10 containers + 10 databases) * (number of environments) * (number of nodes in the cluster).
- Production support pressure on teams increases because something is much more likely to go wrong in production with 20 systems in place than with 2.
Regret-free Service Architecture, a proposal
Let's look at the diagram below, which categorises monolith and microservices architectures along an added "poor design" axis.
Let's assume you are in the bottom left corner with a Monolithic big ball of mud and want to get to a Microservices architecture. If you had to pick a middle step, would you go for a Modular monolith or a Distributed big ball of mud? I would go for the modular monolith without a doubt.
Let me walk you through what this may look like at a high level, and then I'll delve into some of the core parts in detail:
- You start with a monolith. It’s a big ball of mud, but it served you well in reaching product-market fit. You are now making millions in revenue, and your team has grown enough that you feel you want to invest in a more modularized approach that reduces time to market. Let’s assume that in this case, there are no other non-functional requirements that are likely to impact you in the next 6 months.
- You start modularizing your monolith. Assuming you have 4 teams, my suggestion would be to refactor the monolith toward 4 domains, likely one at a time. While you are at it, it's probably a good opportunity to stream-align those teams and give each a domain-specific bounded context. Focus on one large domain first; it will make it faster to define a stable interface, validate the bounded context, and refactor when (not if) you find out that it doesn't work.
- You feel confident with your bounded context. When you feel good that it is well decoupled, you can decide if you want to move to a service or not. Again, there are pros and cons either way, so you will have to decide what works for you. At this stage, this is completely under the control of a single team anyway.
- Modularize inside the bounded context. Now that the team has a clear bounded context, nothing stops them from creating additional subdomains. This will make the code easier to change, and if there is a need to spawn another team because the company is growing, some groundwork is already there.
The phases above can be parallelised in different ways. For example, the diagram below shows a team that focuses on subdomain boundaries before extracting to a separate deployable unit.
Why is the above regret-free? Here are the three main reasons:
- You can design a loosely coupled system when it is still easy and cheap.
- You keep the distributed system entropy to a minimum while enabling your teams to be independent.
- You reduce your risks because you can decouple the work of creating clear bounded contexts from the work of migrating to microservices.
How to get to regret-free service architecture heaven
A number of parts are key to the strategy highlighted above; I hope the considerations below can help you:
- One key to a modularised monolith is a tight public interface for each module that is the only way to interact with it. If you are using a statically typed language, your life is probably slightly easier. If you aren't, static analysis tools or an approach similar to the one described here are your best friends.
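As an illustration of enforcing such an interface without static typing, here is a minimal sketch of an "architecture test" that scans source code for imports bypassing a domain's public API. The domain names and the convention that public entry points live under an `api` package are assumptions for the example, not part of the approach described above.

```python
import ast

# Public entry points: the only import targets other domains may use.
DOMAINS = ("billing", "catalog")
ALLOWED_PREFIXES = ("billing.api", "catalog.api")


def forbidden_imports(source: str) -> list[str]:
    """Return imported module paths that bypass a domain's public API."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Cross-domain imports must go through the domain's api package.
            if name.split(".")[0] in DOMAINS and not name.startswith(ALLOWED_PREFIXES):
                bad.append(name)
    return bad


# Usage: run this over every source file of the *other* domains in CI.
print(forbidden_imports("from billing._invoices import Invoice"))  # -> ['billing._invoices']
print(forbidden_imports("import catalog.api"))                     # -> []
```

A dedicated tool can do the same job more thoroughly; the value is in failing the build the moment a module reaches past another module's interface.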
- There are certainly a number of ways to find the boundaries of your domains. However, the ones that analyse the behaviour of the system instead of its data structures are by far the most successful. Alberto Brandolini puts it quite well: "Data-driven splitting (turning database tables into microservices without a behavioural inspection and redesign) also backfires spectacularly: the new system turns out as coupled as its monolithic ancestor, but possibly slower, and with skyrocketing refactoring costs." In my opinion, it doesn't matter whether you use Event Storming or analyse the current system through collaboration diagrams; the key is to look at the dynamic behaviour of the system. I find this course on Team Topologies quite a useful guide on how to approach finding candidate bounded contexts.
- Often neglected in these discussions are data access and migration. The truth is that your data needs the same level of modularisation as your code. First, different modules can't access the same tables, so you will have to migrate the data. Second, you want a way to make sure that this constraint is respected. Depending on how strict you want to be, one approach is to separate the tables into different schemas and run a script as part of your test suite that validates schema access across the different modules. To reiterate, data separation is not negotiable; only how you help engineers adopt the right behaviour can be flexible.
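A minimal sketch of what such a validation could look like, assuming a hypothetical module-to-schema map and schema-qualified table names in SQL; a real version might scan migration files or captured query logs instead:

```python
import re

# Hypothetical mapping: each module owns exactly one database schema.
MODULE_SCHEMA = {"billing": "billing", "catalog": "catalog"}

# Matches schema-qualified references such as "catalog.products".
SCHEMA_REF = re.compile(r"\b([a-z_]+)\.[a-z_]+\b")


def schema_violations(module: str, sql: str) -> set[str]:
    """Return schemas referenced by `sql` that `module` is not allowed to touch."""
    allowed = MODULE_SCHEMA[module]
    return {
        schema
        for schema in SCHEMA_REF.findall(sql.lower())
        if schema in MODULE_SCHEMA.values() and schema != allowed
    }


# Usage: in the test suite, assert that no module's queries cross a boundary.
query = "SELECT * FROM catalog.products JOIN billing.invoices USING (sku)"
print(schema_violations("billing", query))  # -> {'catalog'}
```

The exact mechanism matters less than making boundary violations a test failure rather than a code-review opinion.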
- If you are not doing this already, it's a good idea to get used to stubbing your external dependencies when you run your tests. This is something you will have to do anyway when you move to independent microservices, so start now. I'm a bit bearish on contract testing because I've never seen it work well, so wherever you can, prefer strongly typed interfaces.
Over the past 10 years, I have seen microservices architectures across 5 different companies, and in each of them I sensed regret across the teams for having underestimated the impact this would have on their productivity.
I think that a service architecture is very important to the healthy growth of a company. However, I believe that many companies move to it too soon and without clear consideration about their domain design.
As mentioned above, we have had ten years of budget growth in tech, so we could (perhaps) afford inefficient adoptions of architectural solutions. I don't think we have the same luxury right now. In the spirit of "doing more with less", I hope the incremental approach described above can help organizations seeking fast flow achieve it without the massive regrets most of us have suffered.