Monolith to Microservices

Since I've been speaking at conferences, I've also been talking to engineers about their problems.

Sometimes those problems are with the monolith-to-microservices transition. A lot of text has been spent on the technical side, but less has been spent on the management layer, even though the issues I hear in the hallway and in BoF sessions are centered on people. This is me documenting them for later, and for you, my googly readers reading back in time.

Every product needs product-tracking metadata

A 20 line Go application needs metadata.

A 730Kloc PHP application needs metadata.

Both apps require the same metadata:

  • Bug-tracker presence.
  • Feature-tracker presence.
  • Product-portfolio presence.
  • Continuous Integration presence.
  • Language/Framework tracking.
  • A list of people who know enough about the product to develop it.
  • A product-manager assigned to it.
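
To make the list concrete, here is what that metadata could look like as a Go record. The struct and field names are hypothetical, not a standard; the point is that the shape is identical whether the product is 20 lines or 730Kloc:

```go
package main

import "fmt"

// ProductMetadata is an illustrative record type; field names are
// made up for this sketch. Every product carries the same shape of
// tracking data, regardless of size.
type ProductMetadata struct {
    Name           string
    BugTrackerURL  string
    FeatureTracker string
    PortfolioEntry string
    CIPipelineURL  string
    Languages      []string // e.g. ["Go"] or ["PHP", "JavaScript"]
    Maintainers    []string // people who know enough to develop it
    ProductManager string
}

// Incomplete reports which tracking fields are still missing,
// which is the question you'll be asking at product number 300.
func (p ProductMetadata) Incomplete() []string {
    var missing []string
    if p.BugTrackerURL == "" {
        missing = append(missing, "bug tracker")
    }
    if p.FeatureTracker == "" {
        missing = append(missing, "feature tracker")
    }
    if p.PortfolioEntry == "" {
        missing = append(missing, "portfolio entry")
    }
    if p.CIPipelineURL == "" {
        missing = append(missing, "CI pipeline")
    }
    if len(p.Languages) == 0 {
        missing = append(missing, "languages")
    }
    if len(p.Maintainers) == 0 {
        missing = append(missing, "maintainers")
    }
    if p.ProductManager == "" {
        missing = append(missing, "product manager")
    }
    return missing
}

func main() {
    p := ProductMetadata{Name: "billing-api", Languages: []string{"Go"}}
    fmt.Println(p.Incomplete()) // lists the six still-missing fields
}
```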

Scaling up your product-tracking from 3 products to 300 will show all the little ways you aren't ready for scale.

  • You'll find that much of this metadata is currently tacked onto the teams, and not the products. This will need to be split.
  • Language/Framework tracking may be entirely in meat-space, rather than written down. Everyone just knows which products are written in which languages. They'll need writing down.
  • Creating new CI pipelines may be an entirely bespoke process, where adding a double handful of microservice pipelines overwhelms your CI team. The CI process will need much more automation and self-service work.
  • The intra-product dependencies represented in code become inter-product dependencies represented in... what exactly?
  • 12 months into the project you may find yourself with products no one has touched in months and doesn't know anything about.
    • Sometimes you find this out right after the product falls over for some reason.
  • 18 months later when management reorganizes, allocating products to teams will be a major headache. All those people-to-product dependencies might not align with the executive vision.
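
One possible answer to the "represented in... what exactly?" question is a machine-readable dependency manifest. A minimal sketch in Go, with hypothetical product names; in real life this would live in a registry or a per-repo file rather than a source literal:

```go
package main

import "fmt"

// deps declares, for each product, which other products it calls.
// Product names are invented for this example.
var deps = map[string][]string{
    "storefront":  {"billing-api", "catalog"},
    "billing-api": {"ledger"},
    "catalog":     {},
    "ledger":      {},
    "reports":     {"ledger"},
}

// dependents inverts the graph so you can answer "who breaks if
// this product goes away?" before a reorg reassigns it.
func dependents(target string) []string {
    var out []string
    for product, ds := range deps {
        for _, d := range ds {
            if d == target {
                out = append(out, product)
            }
        }
    }
    return out
}

func main() {
    fmt.Println(dependents("ledger")) // billing-api and reports, in map order
}
```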

If you're looking to bust apart a monolith, these are the things you need to pay attention to at the management layer. The dependencies you're hoping to get out of managing are still there, just on a different layer.

Calling a function in the same codebase is not the same as calling an API

Thanks to serverless, this idea is getting more traction in general engineering, but it needs repeating. Calling a function in the same codebase has the exact same availability as the code calling it, because they're executing on the same hardware, and likely in the same process. Moving this to an API call is different in so many ways, but there are two huge ones:

  • It introduces network latency where there didn't used to be any. That 1.5ms can actually matter.
  • It introduces the possibility that the API won't be there when the calling function needs it.

You would be sad to learn how often engineers who have been working against APIs their whole careers somehow miss that microservices mean doing the same retry tricks when the API is on their own infrastructure. These tricks are documented in just about every how-to-do-microservices guide out there, but you need to actually do them. Even if it means turning a 29 line function into 130 lines, or importing yet another module.

Do not let the retries, backoffs, and circuit-breakers fall under the MVP bus. This is a bad habit to get into -- that's how you get a distributed monolith -- and getting out of it is hard.

Sometimes a group of small monoliths is better than a distributed monolith

This is a bit counter-cultural, but consider the motivations for going microservices:

  • Separate dependency domains.
  • Separate fault domains.
  • Simplify testing and deployment.

These are good things. However, if you end up with a set of microservices that all have to be deployed at once, you are getting all the bad parts of API-driven architectures along with the bad parts of a monolith. The smart engineering managers will take a risk-management approach to their project and only create microservices for the bits that will benefit in excess of the incurred risk.

What if instead you broke apart that 1.5M Loc Ruby/Rails monolith into a group of 5 smaller monoliths? Split it on functional areas, accept the API-driven hit for far fewer operations, and go back to what your team is already good at: managing monoliths.