May 2017 Archives

The two sides of this story are:

  • Software requirements are often... mutable over time.
  • Developer work-estimates are kind of iffy.

Here is a true story about the later. Many have similar tales to tell, but this one is mind. It's about metrics, agile, and large organizations. If you're expecting this to turn into a frAgile post, you're not wrong. But it's about failure-modes.

The setup

My employer at the time decided to go all-in on agile/scrum. It took several months, but every department in the company was moved to it. As they were an analytics company, their first reflex was to try to capture as much workflow data as possible so they can do magic data-analysis things on it. Which meant purchasing seats in an Agile SaaS product so everyone could track their work in it.

The details

By fiat from on-high, story points were effectively 'days'.

Due to the size of the development organization, there were three levels of Product Managers over the Individual Contributors I counted myself a part of.

The apex Product Manager, we had two, were for the two products we sold.

Marketing was also in Agile, as was Sales.

The killer feature

Because we were an analytics company, the CEO wanted a "single pane of glass" to give a snapshot of how close we were to achieving our product goals. Gathering metrics on all of our sprint-velocities, story/epic completion percentages, and story estimates, allowed us to give him that pane of glass. It had:

  • Progress bars for how close our products were to their next major milestones.
  • How many sprints it will take to get there.
  • How many sprints it would take to get to the milestone beyond it.

Awesome!

The failure

That pane of glass was a lying piece of shit.

The dashboard we had to build was based on so many fuzzy measurements that it was a thumb in the wind approximation for how fast we were going, and in what direction. The human bias to trust numbers derived using Science! is a strong one, and they were inappropriately trusted. Which lead to pressure from On High for highly accurate estimates, as the various Product Managers realized what was going on and attempted to compensate (remove uncertainty from one of the biggest sources of it).

Anyone who has built software knows that problems come in three types:

  1. Stuff that was a lot easier than you thought
  2. Stuff that was pretty much as bad as you thought.
  3. Hell-projects of tar-pits, quicksand, ambush-yaks, and misery.

In the absence of outside pressures, story estimates usually are pitched at type 2 efforts; the honest estimate. Workplace culture introduces biases to this, urging devs to skew one way or the other. Skew 'easier', and you'll end up overshooting your estimates a lot. Skew 'harder' and your velocity will look great, but capacity planning will suffer.

This leads to an interesting back and forth! Dev-team skews harder for estimates. PM sees that team typically exceeds its capacity in productivity, so adds more capacity in later sprints. In theory equilibrium is reached between work-estimation and work-completion-rate. In reality, it means that the trustability of number is kind of low and always will be.

The irreducible complexity

See, the thing is, marketing and sales both need to know when a product will be released so they can kick off marketing campaigns and start warming up the sales funnel. Some kinds of ad-buys are done weeks or more in advance, so slipping product-ship at the last minute can throw off the whole marketing cadence. Trusting in (faulty) numbers means it may look like release will be in 8 weeks, so its safe to start baking that in.

Except those numbers aren't etched in stone. They're graven in the finest of morning dew.

As that 8 week number turns into 6, then 4, then 2, pressure to hit the mark increases. For a company selling on-prem software, you can afford to miss your delivery deadline so long as you have a hotfix/service-pack process in place to deliver stability updates quickly. You see this a lot with game-dev: the shipping installer is 8GB, but there are 2GB of day-1 patches to download before you can play. SaaS products need to work immediately on release, so all-nighters may become the norm for major features tied to marketing campaigns.

Better estimates would make this process a lot more trustable. But, there is little you can do to actually improve estimate quality.