Change-automation vs. LazyCoder

The lazyCoder is someone who sees a need to write code, but doesn't because it's too much work. This describes a lot of sysadmins, as it happens. It also describes software engineers looking at an unfamiliar language. Part of the lazy_coder is definitely a disinclination to write something in a language they're not that familiar with, and part of it is a disinclination to work.

It has been said in DevOps circles (though I can't hunt up the reference):

  "A good sysadmin can probably earn a living as a software engineer, though they choose not to."

A sentiment close to my heart, as that definitely applies to me. I have that CompSci degree (before software engineering degrees were common, CSci was the degree-of-choice for the enterprising dot-com boom programmer) that says I know how to code. And yet, when I hit the workplace I tacked as close to systems administration as I could. And I did. And like many sysadmins of my age cohort or older, I managed to avoid writing code for a very large part of my career.

I could do it as needed, as proven by a few rather complex scripts I cobbled together over that time. But I didn't go into full-time code-writing because of the side-effects on my quality of life. In my regular day-to-day life, problems came and went generally on the same day or within a couple of days of introduction. When I was heads down in front of an IDE the problem took weeks to smash, and I was angry most of the time. I didn't like being cranky that long, so I avoided long coding projects.

Problems are supposed to be resolved quickly, damnit.

Sysadmins also tend to be rather short of attention-span because there is always something new going on. Variety. It's what keeps some of us going. But being heads down in front of a wall of text? The only thing that changes is what aggravating bit of code is aggravating me right now[1]. Not variety.

So you take someone with that particular background and throw them into a modern-age scaled system. Such a system has a few characteristics:

  • It's likely cloud-based[2], so hardware engineering is no longer on the table.
  • It's likely cloud-based[2], so deploying new machines can be done from a GUI, or an API. And probably won't involve actual OS install tasks, just OS config tasks.
  • There are likely to be a lot of the same kind of machine about.

And they have a problem. This problem becomes glaringly obvious when they're told to apply one specific change to triple-digits of virtual machines. Even the laziest of LAZY_CODER will think to themselves:

  "Guh, there has got to be a better way than just doing it all by hand. There's only one of me."

If they're a Windows admin and the class of machines are all in AD as they should be, they'll cheer and reach for a Group Policy Object. All done!

But if whatever needs changing isn't really doable via GPO, or requires a reboot to apply? Then... PowerShell starts looming[3].

If they're a *nix admin, the problem will definitely involve rolling some custom scripting.

Or maybe, instead, a configuration management engine like Puppet, CFEngine, Chef or the like. Maybe the environment already has something like that, but the admin hasn't gone there since it's new to them and they didn't have time to learn the domain-specific language used by the management engine. Well, with triple digits of machines to update, learning that DSL is starting to look like a good idea.

Code-writing is getting hard to avoid, even for sysadmin hold-outs. Especially now that Microsoft is starting to Strongly Encourage systems engineers to use automation tools to manage their infrastructures.

This changing environment is forcing the lazy coder to overcome the migration threshold needed to actually bother learning a new programming language (or to better learn one they already kinda-sorta know). Sysadmins who really don't like to write code will move elsewhere, to jobs where hardware and OS install/config are still a large part of the job.

One of the key things that changes once the idea of a programmable environment starts really setting in is the workflow of applying a fix. For smaller infrastructures that do have some automation, I frequently see this cascade:

  1. Apply the fix.
  2. Automate the fix.

Figure out what you need to do, apply it to a few production systems to make sure it works, then put it into the automation once you're sure of the steps. Or worse, apply the fix everywhere by hand, and automate it so that new systems have it. However, for a fully programmable environment, this is backwards. It really should be:

  1. Automate the fix.
  2. Apply the fix.

Because you'll get a much more consistent application of the fix this way. The older way will leave a few systems with slight differences of application; maybe config-files are ordered differently, or maybe the case used in a config file is different from the others. Small differences, but they can really add up. This transition is a very good thing to have happen.
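To make the consistency point concrete, here is a minimal sketch in Python, not anything taken from Puppet or any other real tool. The function name `ensure_setting`, the `key = value` file format, and the `MaxSessions` setting are all assumptions made up for illustration. The idea is that the fix is written once as code that produces the same line no matter how the file looked beforehand, and can be run again without changing anything further.

```python
def ensure_setting(config_text: str, key: str, value: str) -> str:
    """Return config_text with exactly one canonical 'key = value' line.

    Hypothetical helper for illustration only. An existing line for the key
    (matched case-insensitively) is rewritten in place; duplicates are
    dropped; if the key is absent, the line is appended.
    """
    wanted = f"{key} = {value}"
    out = []
    seen = False
    for line in config_text.splitlines():
        stripped = line.strip()
        is_setting = "=" in stripped and not stripped.startswith("#")
        name = stripped.split("=", 1)[0].strip()
        if is_setting and name.lower() == key.lower():
            if not seen:
                out.append(wanted)  # normalise case and spacing
                seen = True
            continue  # drop any later duplicate of the same key
        out.append(line)
    if not seen:
        out.append(wanted)
    return "\n".join(out) + "\n"


# Two hosts that drifted apart through hand edits (made-up contents).
host_a = "Port = 22\nmaxsessions=5\n"
host_b = "Port = 22\n"

fixed_a = ensure_setting(host_a, "MaxSessions", "10")
fixed_b = ensure_setting(host_b, "MaxSessions", "10")
```

After the run both hosts carry the identical line, and running the fix a second time is a no-op. That is roughly the property a configuration management engine gives you for free, which is why automating first and applying second tends to win.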

The nice thing about Lazy Coders is that once they've learned the new thing they've been avoiding, they tend to stop being lazy about it. Once that DSL for Puppet has been learned, the idea of amending an existing module to fix a problem becomes something you just do. They've passed the migration threshold, and are now in a new state.

This workflow-transition is beginning to happen in my workplace, and it cheers me.

[1]: As Obi-Wan said, It all depends on your point of view. To an actual Software Engineer, this is not the same problem coming back to thwart me, it's all different problems. Variety! It's what keeps them going.
[2]: Or if you're like that, a heavily virtualized environment that may or may not belong to the company you're working for. So there just might be some hardware engineering going on, but not as much as there used to be. Sixteen big boxes with a half TB of RAM each is a much easier-to-maintain physical fleet than the old infrastructure with 80 physical boxes of mostly different spec.
[3]: Though if they're a certain kind of Windows admin who has had to reach for programming in the past, they'll reach instead for VBScript; PowerShell being too new, they haven't bothered to learn it yet.


Another advantage of moving towards a more programmatic approach to managing systems is that it breaks down the walls between ops and general engineering. If you're building and maintaining systems using DSLs, as with Puppet manifests, anyone in the engineering organisation can submit changes and improvements. With a standardised way of describing the infrastructure, it's a lot easier to learn and be involved with the deployment of your code.

A sysadmin will still have to review these changes before they get deployed (perhaps through a GitHub pull request) because they will have the bigger picture of the whole infrastructure, but it also means you no longer have to be an expert in the entire system to make small changes. It's also helpful when you're dealing with an on-call issue for an app you may not know much about. It's all just described in code!

The main problem I'm having is getting fellow sysadmins to treat Puppet DSL as actual code (creating unit tests, etc). Getting traditional sysadmins to move to a new way of doing things (especially when they haven't had experience with SCM, a language other than Perl, or CI) can be quite a challenge.