From the book:
Tom's books total over 2,100 pages of advice. In this class he'll narrow all that down to 32 essential practices. Tom will blast through all 32 practices, explaining what brought him to include each one on the list, plus tips for incorporating the practice, policy, or technology into your organization. You'll find some great ideas for providing better service with less effort.
This was a very good session. It covers the Limoncelli Test, unsurprisingly. This is one of many attempts to come up with a Sysadmin version of the Joel Test (ServerFault tried). But this one seems to be going the distance. Do click on the link, as it leads right to the test. Tom has even written essays about each point to support its being there.
Take back to work: How to identify and fix your biggest problems, cross-train your team, strengthen your systems--and more!
- Improving sysadmin-user interaction
- Best practices for working together as a team
- Best practices for service operations
- Engineering for reliability
- Sustainable Enterprise fleet (desktop/laptop) management
- How to figure out what your team does right, and where it needs to improve
Some of the stuff in here is obvious if you've been in the industry for a while (use a ticket-tracking system, automated patching); other parts perhaps not so much (there are three policies that all sysadmin departments need to have defined to be effective). Some apply only to multi-person environments (pager rotations) while others are universally applicable (service monitoring).
I got a lot of goodies out of this. Some of it I had been peripherally aware of, but had never seen written up like this all in one spot before.
An Ops Doc is a kind of service documentation. Each service you offer needs an ops doc and it needs to have certain things in it:
- Overview: What it is, what it does.
- Build: How to build it, get it.
- Deploy: How to install it, configure it.
- Common Tasks: What do you commonly do with it, and what kinds of issues commonly come up + resolutions.
- Pager playbook: Document alert handling.
- Disaster Recovery: What are the DR policies for this service, and how do you run them.
- Service Level Agreements: What has been promised to whom, what are the penalties. What is it, where is it, how to deal with it.
Critical? Periodic audit. Probably by a non-technical manager, such as a Project Manager. And by "non-technical" I mean someone for whom managing people is their job, not managing technology directly.
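To make that periodic audit something a non-technical person can run, the section list above can be turned into a simple checklist script. This is only a sketch of the idea, not something Tom presented: the function name `missing_sections` is made up, and it assumes each ops doc is plain text with one "Section name: content" line per section.

```python
import re

# Section names come from the list above. The "Name: content" line format
# is an assumption about how the ops doc is written.
REQUIRED_SECTIONS = [
    "Overview",
    "Build",
    "Deploy",
    "Common Tasks",
    "Pager playbook",
    "Disaster Recovery",
    "Service Level Agreements",
]


def missing_sections(doc_text: str) -> list[str]:
    """Return the required sections that are absent or have no content."""
    found = set()
    for line in doc_text.splitlines():
        # Accept an optional leading "-" or "*" bullet before the section name.
        match = re.match(r"\s*(?:[-*]\s*)?([^:]+):\s*(\S.*)?$", line)
        if match and match.group(2):
            found.add(match.group(1).strip().lower())
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


if __name__ == "__main__":
    sample = "Overview: Mail relay.\n- Build: make package\nDeploy:\n"
    for name in missing_sections(sample):
        print(f"Missing or empty: {name}")
```

An empty heading such as "Deploy:" counts as missing, since a section with nothing in it doesn't help whoever gets paged.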
The Three Empowering Policies
There are three policies that all Sysadmins need to have defined in order to be effective. Otherwise, people will just walk up whenever and ask you to do stuff, and you'll do it whenever they ask, however they ask, since we're nice that way. This is managing by interrupt and that's not a good way to manage our time. What's more, it leads to grumbling, and reinforces the Server Troll reputation we sometimes get. The three policies:
Acceptable methods for users to ask for help
Walking up and asking may be the best way for you, but in general it isn't a good way. Having a policy that defines what are the ways that users may ask for help allows sysadmins to better budget their time, and makes them more efficient overall.
The definition of an emergency
By enshrining the definition of an emergency into policy, you prevent a localized issue pushed by a vocal person or small group of users from sucking resources away from a larger issue that affects the entire system but doesn't have a vocal advocate driving attention to it. The example Tom uses: a Code Red is something that stops production cold, and a Code Yellow is something that could lead to a Code Red if left unattended.
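Once the definition is written down, it can even live in the ticket system as a rule instead of a judgment call. Here is a rough sketch of Tom's Code Red / Code Yellow example. The `Issue` fields and the `ROUTINE` level are my own invention for illustration; a real policy would spell out its own criteria.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CODE_RED = "code red"        # stops production cold
    CODE_YELLOW = "code yellow"  # could become a Code Red if left unattended
    ROUTINE = "routine"          # assumed catch-all level, not from the class


@dataclass
class Issue:
    # Hypothetical fields; the policy decides what actually gets recorded.
    summary: str
    stops_production: bool
    could_stop_production: bool


def classify(issue: Issue) -> Severity:
    """Apply the written definition, regardless of who is asking or how loudly."""
    if issue.stops_production:
        return Severity.CODE_RED
    if issue.could_stop_production:
        return Severity.CODE_YELLOW
    return Severity.ROUTINE


if __name__ == "__main__":
    issues = [
        Issue("VP's favorite printer is jammed", False, False),
        Issue("Build farm disk at 95%", False, True),
        Issue("Order database is down", True, False),
    ]
    for issue in issues:
        print(f"{classify(issue).value}: {issue.summary}")
```

Note that nothing in `classify` looks at who filed the issue, which is the whole point of having the policy.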
The scope of service
This policy defines what is and is not covered. It is this policy that tells people that the sysadmins are not fax-repair qualified, or whether or not they make house-calls for teleworkers. This policy also defines when service is available, and what the after-hours options are.
How to convince people to make big changes
I have to give big, big thanks to Tom for this one. This section of the class was focused on how to convince manager-types or other people with the power to block IT changes that such a change is in their best interest. I've been saying for years that one of the chief skills a well qualified Systems Engineer needs is the ability to effectively speak to management. A technician doesn't need to talk to people persuasively. A technical manager needs to talk to other managers. Tom went there.
I've met many people in our field who stuck with computers because either people are scary, or they don't want to deal with the bullshit that dealing with people day in and day out requires. These are not the people that make it to Senior jobs, at least not without some help. Tom identified some effective strategies for social engineering your way to what needs doing.
A fuller treatment of this topic will be in another blog post. Heck, I've got a proto-book on this very topic in progress. So this will be briefer than it really needs to be.
- Don't make people feel wrong. Phrase changes in non-accusatory ways. Making them feel wrong gets them defensive, and MUCH less likely to agree that you are presenting the best way forward.
- Don't make people feel blamed. Explain how this change will improve everything overall. People feeling responsible for bad decisions get defensive. You don't want that.
- Invent questions that'll give THEM the idea. Social engineering. If they come up with it (subtly pushed) they're more likely to follow through.
- Don't be threatening to their authority. Authority can come in the form of direct power (they're your boss), or indirect (they have 20 years on you, and everyone listens to them before you). People don't like upstarts, and can quash your idea out of hand just because you seem like a threat. Don't be a threat.
- For big changes, break them up into smaller changes and present those. Smaller change is less scary than bigger change.
- The Statement of Undeniable Value. If you can distil your change down to simple-to-understand numbers, it can make it a LOT easier to convince people that this change is needed. Suddenly, all of that seemingly irreducible complexity is now distilled into a discrete dollars-per-unit savings.
- Some people respond to data, others respond to peer recommendation. Knowing the difference is key. Hearing that Google uses a specific product can raise that product's shine in the eyes of the decision maker who keeps saying 'no' all the time.
Also behaviorwizard.org is a nice wizard-style website to help you figure out how to persuade certain people to do things. It takes some social know-how to really get the most out of it, but if you have that it can really help you get even better.