How did we get here?
We don’t always have the luxury of starting work on a fresh system from scratch. In most of our engagements, we either take over a legacy system or end up creating a monster ourselves, unknowingly (or knowingly). But where do we go from there? Do we make peace with the monster and learn to live with it? Do we rewrite the whole thing from the ground up?
We can’t live with the monster, because someday it will surely break down. Rewriting it from the ground up ends in a big-bang release, which of course no one wants. So what do we do? Luckily there is a way out of this madness. In this post we will look over the broad steps for getting there.
The broad breakdown of the steps is:
- Step 1 - Stop Digging
- Step 2 - Decompose your domains
- Step 3 - Define your Contracts
- Step 4 - Build Cross Cutting Features
- Step 5 - Shield your system with API Gateway
- Step 6 - Write Consumer Driven Contract Tests
- Step 7 - Allocate Traffic
- Step 8 - Refactor Away
Step 1 - Stop Digging
The first step is to realize you have a problem.
Law of Holes
As the Law of Holes states: when you find yourself in a hole, stop digging!!! Don’t add extensive new features or create more mess that needs to be cleaned up.
You can’t code your way out of it
We engineers believe all the world’s problems can be solved by writing code. But that same thought process is detrimental when you are trying to get out of a hole. Every bit of code you add actually makes the hole a little deeper.
Hardest thing to do
As with any addiction, the first step is always the hardest. We all love to get back to the comfort of our keyboards and headphones and code our way to La La land. Not writing code is the hardest thing to do.
Step 2 - Decompose your domains
The next step is to understand what you are dealing with.
Understand the business better (non-technical)
As developers, our skills are interchangeable. We can write code for a financial institution today and for a healthcare company tomorrow. But we add far more value when we know the intricacies of the business. When decomposing the domains for your future system, it is super important that you understand the business better. The easiest way to do that is to get involved with the different non-technical and operational teams of the organization.
Understand the Monolith (technical)
Know your enemy. Understanding the workings of the Monolith is a very important step before you start hacking it down. This is a surgical procedure and not a demolition. It needs skillful hands and a deep understanding of the inner workings.
Use feature clustering to aggregate domains (mixed bag)
This is where all stakeholders (technical and non-technical) should come together and share what they want from the software. Group the features they describe into clusters of related behavior; each cluster is a candidate domain. In my experience, what looks like reality from a distance is not always what the people on the ground actually experience.
Step 3 - Define your Contracts
Once you know your domains, define their behavior.
Don’t write code, write specifications
Don’t start building your system right away. Follow an API-first approach, which doesn’t need any particular technology. The specification can be written using JSON Schema or a plain document.
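As a minimal sketch of what such a specification could look like, here is a hypothetical contract for an order response written as a JSON Schema document. It is held in a Python dict only so it can be checked mechanically. The field names and the tiny `conforms` helper are assumptions made for illustration; the helper checks only required fields and primitive types, far less than a real JSON Schema validator does.

```python
# Hypothetical 1.0 contract for an order response, expressed as JSON Schema.
ORDER_V1_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Order",
    "type": "object",
    "required": ["id", "status", "total_cents"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
        "total_cents": {"type": "integer"},
    },
}

# Maps the JSON Schema type names used above to Python types.
_TYPES = {"string": str, "integer": int, "object": dict}


def conforms(payload: dict, schema: dict) -> bool:
    """Check required fields and primitive types only (illustrative subset)."""
    if any(name not in payload for name in schema.get("required", [])):
        return False
    for name, rule in schema.get("properties", {}).items():
        if name in payload and not isinstance(payload[name], _TYPES[rule["type"]]):
            return False
    return True
```

The point of the sketch is that the contract is plain data: it can be reviewed by every team and agreed on before anyone picks a framework.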
Don’t discuss tech stack
Don’t get into details at this point. Remember we are deciding specifications and not the implementation.
Define how the APIs should behave: the style of IPC, sync vs. async, concurrency levels, consistency guarantees, etc.
Agree on 1.0
Agree on what version 1.0 of your API looks like, and get buy-in from the other teams. 1.0 is written in stone: once you agree on it, it can never change. This version should always be supported no matter how the underlying system changes.
Step 4 - Build Cross Cutting Features
After you have your specifications ready, invest time in building cross-cutting features upfront; they will pay dividends later.
Build common features
Build features which are domain-agnostic. These features are not tied to any specific domain and can be used across teams without creating inter-team dependencies.
Logging is one of the most important and most ignored cross-cutting features. Without logging in a distributed system, you are walking in the middle of the highway with your eyes closed. You will never know what hit you!!! Always have a robust logging system in place before you build anything. This is something which should be figured out before development, not after.
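One way to make logging usable across many services is to emit every entry as a single machine-readable line with a shared set of fields. The sketch below does this with Python’s standard `logging` module. The field names and the `request_id` attribute are assumptions for illustration, not a prescribed format.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "service": record.name,
            "message": record.getMessage(),
            # request_id is a hypothetical field supplied via `extra=`.
            "request_id": getattr(record, "request_id", None),
        })


def build_logger(service: str, stream) -> logging.Logger:
    """Create a logger for one service that writes JSON lines to `stream`."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # avoid duplicate handlers on repeated calls
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A service would then call something like `build_logger("orders", sys.stdout).info("order created", extra={"request_id": "req-123"})`, and every team’s logs can be searched by the same fields.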
If your future system consists of microservices, invest heavily in distributed tracing. Without distributed tracing it will be impossible to diagnose issues that hop across multiple services or to find performance bottlenecks. It is wiser to use an industry-standard solution out of the box (AWS X-Ray, New Relic) rather than investing time and energy in building it yourself. Build it only if none of the products meet your requirements.
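To make the idea concrete, the core of distributed tracing is that one identifier travels with a request through every hop, and each service reports how long its part took. The sketch below shows only that idea with in-process function calls; it is not how AWS X-Ray or New Relic are implemented. The `X-Trace-Id` header name, `Span`, `Collector`, and `traced` are all hypothetical names.

```python
import time
import uuid
from dataclasses import dataclass, field

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name


@dataclass
class Span:
    trace_id: str
    service: str
    duration_ms: float


@dataclass
class Collector:
    """Stands in for the tracing backend that receives spans."""
    spans: list = field(default_factory=list)


def traced(service: str, collector: Collector, headers: dict, work):
    """Run `work(headers)` as one span, reusing or creating the trace id."""
    headers = dict(headers)  # do not mutate the caller's headers
    headers.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    start = time.perf_counter()
    try:
        return work(headers)  # downstream calls receive the same trace id
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        collector.spans.append(Span(headers[TRACE_HEADER], service, elapsed_ms))
```

Grouping the collected spans by `trace_id` gives the full path of one request and shows which hop was slow.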
Change your development methodology from gut feeling to a data-driven approach. What we feel is no good in a production environment. Invest in technology which makes it easy for your developers to build monitoring dashboards. These dashboards should be easy to build (drag-and-drop widgets, SQL-like data querying capabilities) and cheap to build. A restriction such as one dashboard per user is of no use: that single dashboard gets so cramped that nothing useful comes out of it.
Continuous Integration (CI) is the most important of all the cross-cutting features. You should have an automated test infrastructure which stamps every production release. Eliminate humans from this process as much as possible. As this will be the final gate before your code hits production, having a robust testing infrastructure is super critical.
Step 5 - Shield your system with API Gateway
API Gateway is your gateway to the world. This is your armour against everything from outside your system boundary.
API Gateway is a pattern, not a product.
API Gateway Features
These are features your API Gateway should definitely have.
The API Gateway should be able to route requests based on rules or configuration. This way you can decouple the outside world from your internal systems.
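Rule-based routing can be sketched as a table that is checked top to bottom, with the legacy system as the catch-all. The paths and internal addresses below are hypothetical; in practice the table would be loaded from configuration rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    method: str
    path_prefix: str
    upstream: str  # internal address, never shown to clients


# Hypothetical routing table; first match wins.
ROUTES = [
    Route("GET", "/orders/", "http://orders.internal:8080"),
    Route("GET", "/", "http://legacy.internal:8000"),  # catch-all: the monolith
]


def resolve(method: str, path: str, routes=ROUTES):
    """Return the upstream of the first matching rule, or None."""
    for route in routes:
        if route.method == method and path.startswith(route.path_prefix):
            return route.upstream
    return None
```

Clients only ever see the public path. Which system answers it is a row in the table, so it can change without the client noticing.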
- Reduce number of client calls using API composition
- Can be single system or multiple micro-services behind the API gateway
- Each client gets their own protocol
- Internal systems can communicate using low latency protocols (example: TCP, gRPC)
- REST is no longer mandatory for all systems
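The API composition point in the list above can be sketched as follows: the client makes one call, and the gateway fans out to the internal systems and merges the answers. `fetch_order` and `fetch_shipment` are stand-ins for real network calls, and their return values are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for internal services; real ones would be network calls.
def fetch_order(order_id: str) -> dict:
    return {"id": order_id, "status": "shipped"}


def fetch_shipment(order_id: str) -> dict:
    return {"carrier": "ACME", "eta_days": 2}


def order_details(order_id: str) -> dict:
    """One client call; the gateway calls both backends in parallel and merges."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        order = pool.submit(fetch_order, order_id)
        shipment = pool.submit(fetch_shipment, order_id)
        return {"order": order.result(), "shipment": shipment.result()}
```

Without composition, a mobile client would make both calls itself over a slow network; here the two calls stay inside the system boundary.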
Implementing Edge Functions
API Gateway must support these Edge Functions
- Rate Limiting
- Request Logging
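As an example of an edge function, rate limiting is often implemented with a token bucket. The sketch below is one common approach, not the only one, and the class and parameter names are assumptions. The clock is injectable so the behavior can be tested without sleeping.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.updated = clock()

    def allow(self) -> bool:
        """Return True and spend a token if the request may proceed."""
        now = self.clock()
        refill = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would typically keep one bucket per client or API key and reject the request (for example with HTTP 429) when `allow()` returns `False`.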
Putting it all together
- Implement your contract specifications (Routing and Composition)
- Move all edge functionality to API Gateway
- Hide the legacy system behind a non-Internet-facing firewall
Benefits of API Gateway
- Legacy system no longer exposed to clients
- Any team can build its own API Gateway
- Edge functions no longer baked inside internal systems
- Refactoring and migration is opaque to the outside world
Drawbacks of API Gateway
- Another system to develop, maintain and scale
- Single point of failure and attack
- Can become another monolith if not designed correctly
API Gateway: Build or Buy
Available API Gateways
- AWS API Gateway: Partial Implementation of the features
- Zuul: Best in the industry, built and used by Netflix.
- Kong: Fully functional API Gateway
- Traefik: Similar to Kong
Building your own
If you feel adventurous enough to build your own API Gateway, it should meet these requirements.
- Request Routing
- API Composition
- Edge Functions
- Partial Failure handling
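Of these requirements, partial failure handling is the easiest to overlook. A minimal sketch, assuming a composed page with one essential backend and one optional backend: if the optional one fails or is too slow, the gateway answers with a degraded response instead of an error. The function names, the 0.5 second timeout, and the status codes are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor


def call_with_fallback(pool, fn, fallback, timeout_s: float):
    """Return fn()'s result, or `fallback` if it raises or takes too long."""
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except Exception:  # includes the timeout error raised by result()
        return fallback


def product_page(pool, get_product, get_reviews) -> dict:
    """The product is essential; reviews are optional and may be dropped."""
    product = call_with_fallback(pool, get_product, None, 0.5)
    if product is None:
        return {"status": 502}
    reviews = call_with_fallback(pool, get_reviews, [], 0.5)
    return {"status": 200, "product": product, "reviews": reviews}
```

A production gateway would usually add a circuit breaker on top, so that a backend known to be down is not called on every request.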
Step 6 - Write Consumer Driven Contract Tests
- With API Gateway in place, let consumers of your API contribute to your test suite
- Bring your own data - each suite should be idempotent (setup and teardown should be built in)
- Run with every build / change
Step 7 - Allocate Traffic
- Use Request Routing to allocate traffic
- Day 1: 100% Legacy, 0% Future
- Day X: 0% Legacy, 100% Future
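One way to move between those two end points is a percentage that is raised gradually. The sketch below hashes a client identifier into a stable bucket from 0 to 99, so a given client always lands on the same side for a given percentage, and clients already moved to the new system stay there as the percentage grows. The function name and the choice of client id as the key are assumptions.

```python
import hashlib


def pick_backend(client_id: str, future_percent: int) -> str:
    """Send roughly `future_percent`% of clients to the new system."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return "future" if bucket < future_percent else "legacy"
```

With `future_percent=0` everything goes to legacy (Day 1), and with `future_percent=100` everything goes to the future system (Day X). Rolling back is just lowering the number.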
Step 8 - Refactor Away
- Your contracts are decided
- API Gateways are firing
- Consumer-driven contract tests are in place
- Pipelines are in place to ensure contracts are not broken and new tests keep getting added
Now you can refactor without fear!!! 😃