Changing cultures

It's now about a year and a half into my tenure here at newjob. They hired me as their first full-time IT hire. Previously they'd been making do with part-time consultants in the sysadmin role.

Consider, also, that this company is definitely in startup mode, so 40-hour weeks are something of a minimum. So yeah, that part-time sysadmin had his hands full. He was definitely involved in the classic sysadminly things like hardware installs, network expansions, remote access management, and random troubleshooting, though the latter was restricted to things that could wait until he was on the clock.

Now that they have a full-timer, the list has expanded. It now includes items closer to the development side of the house. Things like:

  • Automation engineering
    • System Deployment automation
    • Software Deployment automation
    • OS Update automation
  • Configuration Management (previously this was an entirely Dev thing)
  • Consulting on strategic planning for future growth
  • Performance analysis
  • Monitoring System deployment

In fact, the automation-engineering part is probably what I've spent 50% of my time on in the last 6 months and is an almost 100% writing-code task. The traditional sysadminly things I itemized above remain a small but significant part of what I do.

But what I wanted to talk about was Configuration Management.

When I got here they already had a Puppet installation. It was designed to be git-cloned into /etc/puppet on every deployed machine, and to be run exactly once when the machine was being installed. They weren't using Puppet as a Configuration Management utility; they were using it as a Deployment Automation tool. It works for that, but it definitely leaves a lot of functionality on the table.

There were many modules in there that were written to run every time Puppet was run, regardless of whether they needed to. One egregious example was a tarball that was downloaded from the Internet and then compiled. If the local compile directory was still there and hadn't been cleaned, this actually ran fairly quickly on re-applies. But once I made /tmp ephemeral across reboots? Applies now took 15 minutes regardless of what actually needed applying.

There were several more examples of this. I've spent time making it so it can be run multiple times with a minimum of redundant processing.
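To make the idea concrete, here's a minimal sketch in Python (not our actual Puppet code) of the kind of guard that keeps an expensive step from re-running. The function name, marker file name, and build callback are all hypothetical. In Puppet itself the same effect usually comes from attributes like `creates` on an `exec` resource; this just shows the logic.

```python
import tempfile
from pathlib import Path


def ensure_built(marker: Path, build) -> bool:
    """Run build() only when marker is missing; return True if work was done."""
    if marker.exists():
        return False  # already in the desired state, nothing to do
    build()
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()  # record that the build finished
    return True


if __name__ == "__main__":
    calls = []
    # A throwaway directory stands in for wherever install state is kept.
    with tempfile.TemporaryDirectory() as state_dir:
        marker = Path(state_dir) / "example-tool.installed"  # hypothetical name
        print(ensure_built(marker, lambda: calls.append("compiled")))
        print(ensure_built(marker, lambda: calls.append("compiled")))
        print(len(calls))
```

The important design choice is where the marker lives. If it sits in a directory that gets wiped on reboot, such as an ephemeral /tmp, the guard fails open and the slow path runs on every apply.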

I've also put in a Puppetmaster server, since /etc/puppet on everything is a waste of space and a challenge to keep updated.

The repo as it stands right now can be run repeatedly without side-effects. It has taken me a long time to get here. Now I'm trying to convince the rest of Development that when a package needs to be added to all of a certain class of machine, the best way to do this is NOT to go to each one and install it, but rather to add it to Puppet and then just do an Apply on all of 'em; it'll get done the same way on each, as it should.
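As a rough illustration of why declaring the package once beats installing by hand, here's a small Python sketch of the desired-state idea. The machine classes, host names, and package names are made up for the example, and the `apply` function only simulates what a real apply would do.

```python
# Hypothetical machine classes and package names, for illustration only.
DESIRED_PACKAGES = {
    "web": {"nginx", "curl"},
    "build": {"gcc", "make", "curl"},
}


def missing_packages(machine_class: str, installed: set[str]) -> list[str]:
    """Return the packages a host still needs to match its class definition."""
    return sorted(DESIRED_PACKAGES[machine_class] - installed)


def apply(machine_class: str, installed: set[str]) -> set[str]:
    """Simulate an apply: afterwards the host has everything its class declares."""
    return installed | set(missing_packages(machine_class, installed))


if __name__ == "__main__":
    # Two web hosts that have drifted apart through hand-installed packages.
    hosts = {"web1": {"nginx"}, "web2": {"curl", "htop"}}
    for name, installed in hosts.items():
        print(name, "needs", missing_packages("web", installed))
```

Adding a package to the class definition changes the answer for every host of that class at once, and a host that already has it gets an empty to-do list, so running the apply again is harmless.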

Unfortunately, somewhere along the line most of the devs stopped paying attention to Puppet. Importantly, one of the release-engineers only uses it when it shows up on checklists and is definitely of the "apply the package now, get it into the automation later" mindset. Clearly, I have more work to do.

Just one more step in my ongoing battle to minimize the harm caused by organically grown systems. It's an ongoing process, and at the end of it we'll either be a straight-up DevOps team or the change-demon will be running rampant, causing everyone grief.

1 Comment

I've run into something similar trying to bring automation to legacy departments. Keep pushing. I can't say it will get easier, but in my experience people eventually give in. The next hurdle for me is getting other people to be policy writers. I am definitely a bottleneck right now, but at least I have people reusing existing policy fairly consistently now.