On appropriate metrics, a tale of servers and fashion

One of the questions that SysAdmins frequently get asked is:

I have a web application based on $Platform. It needs to support $NumUsers concurrent connections. How much server do I need?
$Platform can be anything from 'php' to 'tomcat' to the incredibly unhelpful 'linux'. $NumUsers can be anything from a reasonable number to completely unreasonable numbers representing the anticipated worst-case (or maybe that's 'best case') scenario (50,000 users! Two meeeeelion Users!).

The answer they're looking for is:

Two AWS large instances.

The answer they'll get is:

It depends on the application code.
They're laboring under the misconception that $Platform and $NumUsers are the only variables in the Grand Equation of Scaling. HAHahahahaahaha.  There actually is a GEoS, but I'm getting ahead of myself.

The part of the garment industry that focuses on pants has a problem. People shopping for off-the-shelf pants want to look at only one, maybe two, numbers to determine fit. Fewer possible number combinations also means fewer different sizes the manufacturers have to make in order to cover everyone.

And yet, there are four measurements that determine how well a pair of pants fits:
  • Waist
  • Hip
  • Rise (crotch to waist)
  • Inseam

This? This is a problem. There is no way to sell off-the-shelf pants with four measurements on the tag. Well, you could, but retailers would hate it since they'd have to shelve umpty different permutations, and customers would hate it since finding the right one would take far too long. Clearly a non-starter. What to do?


The question here is, "How related are these four measurements, and is there a single one that would provide the best predictive power for the rest?"

What they did was measure a lot of people. They've done this several times over the last century, but the most recent dataset is far more multi-cultural than the ones used back in the 1950s. It actually has non-white people in it!

What they found was:

  • For men, Waist is a strong predictor of Hip
  • For women, Waist is a poor overall predictor of Hip
  • For women, there are clusters around certain ethnicities where Waist is actually a pretty good predictor of Hip

Add to this the marketing habits they've learned over several decades:

  • Men can handle two numbers on the label, so can handle a separate Inseam measurement
  • Women expect only one number on the label
  • Rise is dictated by overall fashion trends instead of individual bodies (compare 1980s pants vs today's)
  • Inseam for women is dictated equally by body measurements and fashion (you wear different pants when wearing heels)

Men's pants are easy: Waist, Inseam, done.

Women's pants... trickier.

Pants?? What about Servers!

Consider what an enterprising new clothing manufacturer faces when designing pants. How do they get the best fit for their picky customers? Like the question at the top of this article, all they know is that women's pants have a single size number on them, and come in petite/regular/tall. But what does that mean? How should they size their pants?

I'm a new clothing designer for women's ready-to-wear. What's the hip measurement for a 28" waist?
The answer they're looking for is "40 inches".

The answer they'll get is, "Who are you selling to?"

Does this sound familiar at all? It should.

The AppDev looking for server-sizing is expecting their problem to be a men's-pants kind of problem: one sizing style fits everything. In fact, it's far more complex.

So, what is this Grand Equation of Scaling I talked about above?

Take a look at how the fashion industry solved its sizing problem. They took a lot of measurements, ran quite a bit of analysis over them, watched things over the course of years, and adjusted to fit.

  • They measured things relevant to the problem (body measurements)
  • They researched customer preferences (shopping data and market research)
  • They analyzed trends in the data

With these three things done, they were able to construct a sizing regime that works pretty well for them, their retailers, and their customers.

The same bullet points apply to figuring out how much hardware (physical or virtual) is needed for a web or mobile application. The easiest metrics to quantify are $NumUsers and $NumServers, but they're just part of the overall dataset that needs to be analyzed to answer the question appropriately.

Measuring things relevant to the problem

When figuring out how many resources are required to support a given application, any of the following variables will need to be tracked (and I'm sure I'm missing some for edge-cases):

  • Number of Users
  • Number of Web Servers
  • Size of Web Servers
  • Number of Databases
  • Size of Database Servers
  • Caching tier efficiency
  • Number of concurrent accesses
  • Number of concurrent sessions

And much, much more.

These variables can be discovered in pre-deployment testing, and by monitoring production performance. Since things like user concurrency are not static values, a range needs to be assessed.
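As a minimal sketch of what "assessing a range" might look like, the snippet below boils a series of concurrency samples down to a median, a 95th percentile, and a peak. The function name `concurrency_range` and the sample numbers are hypothetical, invented purely for illustration; real samples would come from your own load tests or monitoring.

```python
import statistics


def concurrency_range(samples):
    """Summarize concurrent-session samples as median, 95th percentile, and peak."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to describe a range")
    ordered = sorted(samples)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[-1]
    return {
        "median": statistics.median(ordered),
        "p95": p95,
        "peak": ordered[-1],
    }


# Hypothetical per-minute samples of concurrent sessions (not real data).
samples = [120, 135, 150, 160, 180, 210, 240, 400, 220, 170]
print(concurrency_range(samples))
```

Sizing for the median, the 95th percentile, or the peak are three quite different answers, which is why a single $NumUsers figure is not enough on its own.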

Researching customer preferences

This will tell you how your customers expect your application to perform, when they start getting peeved at slow-downs, how they work through the application, and a bunch of other things. The easiest item to measure is the perceived-performance threshold. People using the application on mobile networks will be more forgiving of stuttering than those on traditional networks. It is through this research that you find the performance envelope you have to stay within.

Analyzing trends in the data

A pile of data means nothing unless you have someone who can make sense of it. It is through this process that the Grand Equation of Scaling is derived. Now that you have a pile of data about how your application works, and you know what performance goals you have to hit, you can start constraining your equation. This is where you get to use the higher maths you got in college, and is one of the reasons why Google likes to hire Ph.D.s in Mathematics.

Because this equation? For complex environments it can involve Calculus and Discrete Mathematics. This is why they sometimes call us Systems Engineers, not Administrators.
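To make the idea of "constraining your equation" a little more concrete, here is a rough first-order sketch. It is not the Grand Equation of Scaling for any real application; it simply applies Little's Law (requests in flight = arrival rate × time in system) to measured inputs. The function `servers_needed`, its parameters, the 30% headroom default, and every number in the example are assumptions made up for illustration.

```python
import math


def servers_needed(active_users, requests_per_user_per_sec, avg_response_sec,
                   concurrent_requests_per_server, headroom=0.3):
    """First-order web-tier estimate from measured inputs (illustrative only).

    Little's Law: requests in flight = arrival rate * average time in system.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    arrival_rate = active_users * requests_per_user_per_sec  # requests/sec
    in_flight = arrival_rate * avg_response_sec              # concurrent requests
    # Keep some capacity in reserve on every server for spikes.
    usable_per_server = concurrent_requests_per_server * (1.0 - headroom)
    return max(1, math.ceil(in_flight / usable_per_server))


# Hypothetical inputs: 50,000 active users, each making 0.2 requests/sec,
# 250 ms average response time, 200 concurrent requests per server.
print(servers_needed(50_000, 0.2, 0.25, 200))
```

Every input here is something that has to be measured first, and a real model would also need terms for the database and caching tiers. That is where the answer stops being simple arithmetic.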

With these three steps you can actually answer the "how do I build an infrastructure that can support 2 meeeeelion users" question.

The same steps can be used to figure out whether your new web app can run on shared hosting or needs more expensive dedicated hosting. It won't even involve calculus for something this small!

Either way, you do have to know what to measure. For pants, see above. For IT infrastructure, find a Systems Administrator/Engineer. We'll even wear pants if you ask.

