The lifecycle of infrastructure at a standard-pattern cloudy startup

Cloud-based startups following the Silicon Valley model of growth seem to use infrastructure in a consistent pattern, one that shifts as the company moves through four stages.

Bootstrapped (no venture capital) Startup:

Money is a precious, precious thing for this kind of startup. Every penny is accounted for, and cost-saving measures like delete-fests and delaying cost-bearing updates happen a lot. Time is cheap, money is not.

In companies like these, cloud expenses will be purely compute. Why pay extra for Amazon/Microsoft to manage the database when we can run it ourselves on the same-sized instance for much less money?

Early Stage Venture-Capitalist Funded Startup:

This company has runway. Money is a concern, but a distant one thanks to VC. Money is cheap, time is not.

In companies like these, cloud expenses run the gamut of what the provider offers. Why spend time managing a database and replication chains when Amazon/Microsoft does it for you? Why bother running container frameworks when the cloud vendor has one of their own that works well enough?

  • SaaS it where possible.
  • If it carries state, PaaS it.
  • Offload as much ops-work as you can.
  • Focus on what the company is good at, and don't bother to reinvent management frameworks for state-containing services.
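The flip from "time is cheap, money is not" to "money is cheap, time is not" can be sketched as simple arithmetic. This is only an illustration: the function names, the managed-service premium, the ops hours, and the dollar figures are all made-up assumptions, not numbers from any real provider or company.

```python
def monthly_cost(instance_cost, ops_hours, hourly_value_of_time, managed_premium=0.0):
    """Cloud bill plus the value of the staff time spent on ops work.

    All inputs are hypothetical; managed_premium is the fractional markup
    assumed for a provider-managed service over a bare instance.
    """
    return instance_cost * (1 + managed_premium) + ops_hours * hourly_value_of_time


def cheaper_option(instance_cost, managed_premium, self_run_ops_hours, hourly_value_of_time):
    """Return "self-run" or "managed", whichever costs less in total."""
    self_run = monthly_cost(instance_cost, self_run_ops_hours, hourly_value_of_time)
    # Simplifying assumption: the managed service needs no ops time at all.
    managed = monthly_cost(instance_cost, 0, hourly_value_of_time, managed_premium)
    return "self-run" if self_run < managed else "managed"


# Bootstrapped: time is valued cheaply, so the provider's premium dominates.
bootstrapped = cheaper_option(instance_cost=200, managed_premium=0.5,
                              self_run_ops_hours=10, hourly_value_of_time=5)

# VC-funded: the same ten hours are valued far higher, so ops time dominates.
funded = cheaper_option(instance_cost=200, managed_premium=0.5,
                        self_run_ops_hours=10, hourly_value_of_time=100)

print(bootstrapped, funded)
```

With these invented inputs the only thing that changes between the two stages is what an hour of someone's time is worth, and that alone is enough to flip the answer.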

Mid Stage, Starting To See Scaling Issues Startup:

This company has been around a while, and their infrastructure is starting to run into scaling problems. Maybe the container framework isn't flexible enough. Or the database offering can't handle multi-region failover well, or at all. Money is always there, time is budgeted, but complexity is not cheap.

Companies that reach this stage are starting to feel the corners of the box that the cloud provider puts folk into. This is when companies start in-sourcing some of the services they previously bought from the cloud provider.

  • Implement a novel datastore that isn't offered by the provider, because the new datastore solves more problems than in-sourcing causes.
  • Implement an RDBMS replication framework too complex for the provider to support.
  • In-source container frameworks because scaling bugs in the provider's offering are that annoying.
  • Many other things.

Global Company:

Can't really be called a startup anymore, much as people would like to. Instead, they get the name Unicorn because of how rare it is for a startup to get to this stage without failing or getting eaten by another Unicorn. They're profitable (for SV values of profit), always hiring, and have been managing complexity for years.

Companies that reach this stage have enough compute going on that the question "do we need to build our own datacenters to save money?" becomes a real concern. They have a long history of cloud-provider usage, but that relationship has proven to be a bit too... training-wheels lately.

  • Keep stuff in the cloud-provider that the cloud-provider is good at (S3 buckets, for instance).
  • Put into your own infrastructure things that are central to the business, and core to the offering.
  • Maintain cloud-provider relationships for peripheral products, development work, and business-automation work (that isn't SaaSed already).
  • Open-source the frameworks and homebrew products that have been used for years internally (Spinnaker, Kafka, Kubernetes...).
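The "do we need to build our own datacenters to save money?" question is, at its simplest, a break-even calculation. A minimal sketch follows, assuming a one-time build cost and flat monthly bills. The function name and every figure are hypothetical, and a real analysis would also account for growth, staffing, hardware refresh, and financing.

```python
import math


def breakeven_months(datacenter_capex, cloud_monthly, own_monthly_opex):
    """Months until the build cost is repaid by monthly savings.

    Returns None when running your own datacenter costs as much or more
    per month than the cloud bill, so the build cost is never repaid.
    All inputs are hypothetical illustration values.
    """
    monthly_savings = cloud_monthly - own_monthly_opex
    if monthly_savings <= 0:
        return None
    return math.ceil(datacenter_capex / monthly_savings)


# Invented figures: a 12M build, a 1M/month cloud bill, 600k/month to self-run.
months = breakeven_months(12_000_000, 1_000_000, 600_000)
print(months)
```

If the result is longer than the company is willing to plan ahead, staying with the cloud provider is the cheaper answer, whatever the monthly bill looks like.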