The number one piece of password advice is:

Only memorize a single complex password; use a password manager for everything else.

Gone are the days when you could plan on memorizing complex strings of characters using shift keys, letter substitution, and all of that. The threats surrounding passwords, and the sheer number of things that require them, mean that human fragility is security's greatest enemy. The use of prosthetic memory is now required.

It could be a notebook you keep with you everywhere you go.
It could be a text file on a USB stick you carry around.
It could be a text file you keep in Dropbox and reference on all of your devices.
It could be an actual password manager like 1Password or LastPass that installs in all of your browsers.
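Whichever store you pick, the passwords that go into the prosthetic memory no longer need to be memorable at all; they can be long and random. A minimal sketch of generating one, using Python's standard `secrets` module:

```python
import secrets
import string

def generate_password(length=20):
    """Build a password from cryptographically secure random choices."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return ''.join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())  # different every run
```

Since the manager remembers it for you, length is cheap; 20+ characters costs you nothing.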

There are certain accounts that act as keys to other accounts. The first accounts you need to protect like Fort Knox are the email accounts that receive activation messages for everything else you use, since that vector can be used to gain access to those other accounts through the 'Forgotten Password' links.

ForgottenEmail.png

The second set of accounts you need to protect like Fort Knox are the identity services used by other sites so they don't have to bother with user account management: all those "Log in with Twitter/Facebook/Google/Yahoo/Wordpress" buttons you see everywhere.

LoginEverywhere.png

The problem with prosthetic memory is that, to beat out memorization, it needs to be everywhere you ever need to log into anything. Your laptop, phone, and tablet can all use the same manager, but the same isn't true of going to a friend's house and getting on their living-room machine to log into Hulu Plus real quick, since you have an account, they don't, and they have the awesome AV setup.

It's a hard problem. Your brain is always there, and it's hard to beat that for convenience. But it's time to offload that particular bit of memorization to something else; your digital life and reputation depend on it.

The different kinds of money


Joseph Kern posted this gem to Twitter yesterday.

CapEx.png

It's one of those things I never thought about, since I kind of instinctively learned what it is, but I'm sure there are those out there who don't know the difference between a Capital Expenditure and an Operational Expenditure, and what that difference means when it comes time to convince the fiduciary Powers That Be to fork over money to upgrade or install something there is a crying need for.

Capital Expenditures

In short, these are (usually) one-time payments for things you buy once:

  • Server hardware.
  • Large storage arrays.
  • Perpetual licenses.
  • HVAC units.
  • UPS systems (but not batteries, see below).

Operational Expenditures

These are things that come with an ongoing cost of some kind. Could be monthly, could be annual.

  • Your AWS bill.
  • The Power Company bill for your datacenter.
  • Salaries and benefits for staff.
  • Consumables for your hardware (UPS batteries, disk-drives)
  • Support contract costs.
  • Annual renewal licenses.

Savvy vendors have figured out a fundamental truth of budgeting:

OpEx ends up in the 'base-budget' and doesn't have to be justified every year, so is easier to sell.
CapEx has to be fought for every time you go to the well.

This is part of why perpetual licenses are going away.
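A toy illustration of that truth, with invented numbers (a $30k perpetual license carrying 20% annual maintenance versus a $12k/year subscription): the subscription looks cheaper for years, quietly settles into the base budget, and only catches up on total cost much later.

```python
# Invented numbers, purely for illustration.
def perpetual_cost(years, license_price=30_000, maintenance_rate=0.20):
    """CapEx up front, plus annual maintenance OpEx."""
    return license_price + years * license_price * maintenance_rate

def subscription_cost(years, annual_fee=12_000):
    """Pure OpEx: the same fee every year, sitting in the base budget."""
    return years * annual_fee

for years in (1, 3, 5):
    print(years, perpetual_cost(years), subscription_cost(years))
# Year 1: 36000 vs 12000; by year 5 both hit 60000.
```

The vendor gets the same money either way, but only one of these has to survive a CapEx fight.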


But you, the sysadmin with a major problem on your hands, have found a solution for it. It is expensive, which means you need to get approval before you go buy it. It is very important that you know how your organization views these two expense categories. Once you know that, you can vet solutions for their likelihood of acceptance by cost-sensitive upper management. Different companies handle things differently.

Take a scrappy, bootstrapped startup. This is a company that does not have a deep bank account, likely lives month to month on revenue, and for which a few bad months in a row can be really bad news. This is a company that is very sensitive to costs right now. Large purchases can be planned and saved for (just like you do with cars). Increases in OpEx can make a month in the black become one in the red, and we all know what happens after too many red months. For companies like these, pitch towards CapEx. A few very good months mean more cash, cash that can be spent on infrastructure upgrades.

Take a VC fueled startup. They have a large pile of money somewhere and are living off of it until they can reach profitability. Stable OpEx means calculating runway is easier, something investors and prospective employees like to know. Increased non-people CapEx means more assets to dissolve when the startup goes bust (as most do). OpEx (that AWS bill) is an easier pitch.

Take a civil-service job much like one of my old ones. This is big and plugged into the public finance system. CapEx costs over a certain line go before review (or worse, an RFC process), and really big ones may have to go before law-makers for approval. Departmental budget managers know many ways to... massage... things to get projects approved with minimal overhead. One of those ways is increasing OpEx, which becomes part of the annually approved budget. OpEx is treated differently than CapEx, and is often a lot easier to get approved... so long as costs are predictable 12 months in advance.


While the push for IPv6 at the Internet edge is definitely there, the push for internal adoption is not nearly as strong. In the absence of a screaming crisis or upper-management commands to push things along, it is human factors that will prevent such a push. I'm going to go into a few.

Nature is analog, not digital


A bit off topic, but it's been on my mind lately.

XX and XY are not the sex-absolutes you may think they are. They're the two most common bins, but they're far from the only genetic bins that humans end up in. Many, many people have been surprised when examining genes to determine "true" sex, often unhappily, and often complicatedly, as a genetic condition a test wasn't designed to handle is encountered (how do you type XXY?).

What else is there out there?

Possibly the most famous is Androgen Insensitivity Syndrome (which comes in 'complete' and 'partial' varieties), in which, due to a mutation, the hormone receptor for Testosterone either doesn't work or only partly works. Babies with C-AIS will end up with an F on their birth certificate because that's what they look like, and they'll go through a normal female puberty even though they're still producing Testosterone.

That's because the liver does this neat trick called aromatization, in which excess Testosterone is converted into Estrogen. This is why some perfectly normal teenage boys end up with gynecomastia, as all that surging Testosterone (puberty does that) causes a bit of it to convert.

Anyway, AIS girls develop in the womb along female patterns. The testes are still there, they're just not well developed. They also won't develop a uterus, since it wasn't there to begin with. Because of this, they won't menstruate but in every other way will look like any other girl (if a bit taller).

P-AIS is less definite, and is where some Intersex conditions come in to play.

I remember a scandal in the 90's when genetic testing for maleness was introduced among female Olympians, and they found two who tested male because of this. This was an extremely unpleasant surprise for them, as they'd both been competing at the world level for a while.

Next up is Klinefelter syndrome, which is an individual with an extra X chromosome to make XXY. And sometimes even more chromosomes get tacked on depending on what happened. These babies will most likely get an M on their birth-certificate, but development is where the differences begin to show. Testosterone production is reduced compared to XY males, but is still elevated compared to XX females.

In the same vein we have XXYY males. Those extra chromosomes aren't good things to have, but the condition shows up often enough that we know about it.

The thing that breaks people's brains is mosaicism, in which one person can have two different genomes. People with this can have a heart with one set and an ovary with another, or eyes with different colors. One type of Turner Syndrome involves a mosaic of -X and XY (where -X is a missing X; they're short one). Depending on what tissue you take for typing, that individual may come up as either Turner-Female or Male.

A slightly different version of this is chimerism, in which the two genomes came from two different zygotes. This can lead to fun things like true hermaphroditism if the reproductive parts of both individuals end up in the same body, and may have already allowed human parthenogenesis. As with mosaics, these individuals can sex-type differently based on which tissue you take for testing.

If you ever wanted to see what a highly complex, failure-accepting system looks like... biology. It's amazing we get anything done with all those transcription errors.

The dragon in the datacenter


Systems Administrators have a reputation, a bad one, when it comes to people skills. I saw it at WWU when problems went unreported because users were afraid we'd yell at them for being stupid. I see it every time someone speaks with passion about DevOps improving the adversarial relationship between Dev and Ops. Two different groups of people, two different problems, same root cause.

  1. People without formal training who are experiencing problems we're tasked with fixing (a.k.a. "users").
  2. Formally trained engineers trying to build/maintain a complex system (a.k.a. "dev").

Dealing with the untrained

End users are tricky people. They don't think the way we do. Because they don't know how a system works, they develop completely wrong mythologies for why things break the way they do. They share folk remedies with each other rather than calling for trained assistance. Those folk remedies can actually make things worse.

Dealing with the trained

Developers are tricky people. They're supposed to understand this stuff, but for some reason only get part of it. Or they only really see one part of the whole constellation of the problem-space and don't understand how their actions make things difficult for another part of the puzzle. It's forever frustrating because they're supposed to know better.


Cynicism: (1): The firm belief that the person telling you how to do something differently is blowing smoke up your ass because they don't know it doesn't work that way.
(2): The firm belief that a certain class of person will just never, ever, get it.


Sysadmins become jaded cynics because the end users never get any better, and explaining the same thing over and over again gets old. And it never helps. And they keep doing the same stupid things, over, and over, and over. No amount of training helps. No amount of "intuitive" walk-throughs help. No amount of video tours help. The customer support organization helps filter the blithering lunacy, but it just means the extra special stupid escalates to L3 where we live.

Customer Service is an outlook as much as it is a skill. Far too many of us lack that outlook and aren't motivated to get the skill. The 'customer' we're serving most of the time is an abstract known as "uptime", which is quantifiable and doesn't file reports with your boss when you get a bit firm with it over the phone. As an industry we're regular consumers of Customer Support in the form of our vendors and the support contracts we hold with them. We know what we like when we get through to a human:

  • They speak our language.
  • They don't get defensive when we blow steam about our frustrations with their product.
  • When we describe in detail what we think the problem is they don't dismiss our concerns and tell us how it really failed.

The jaded cynic sysadmin doesn't do any of that. We use condescending language (very probably unintentionally condescending). We respond to attacks on our systems by getting defensive. We see a chance to myth-bust and jump on it with glee, describing in detail how that failure mode actually occurred.

When users have problems they don't come to the jaded cynic sysadmin with them. This is driven through a combination of fear of being attacked, disgust that such people are allowed to keep working, and a desire to avoid assholes whenever possible.


Corrosive Cynicism: The belief that everyone around you doesn't know how it really works, and it's your job to explain why that is.


Sysadmins become jaded cynics after the developers persistently and stubbornly refuse to pick up the little quirks of the platform they're developing the application on. It gets tiring having to continually disabuse them of their assumptions about how the OS/platform works. You wish they'd talk to you sooner rather than wait until the end, when all the bad assumptions have been baked in and they have to patch around them.

This is not some ever-changing population of end users, these are your coworkers. You see them every day (or, well, at least once or twice a week at meetings). You're both supporting the same overall problem, but your focus areas are different. They're concerned with algorithmic efficiency, you're concerned with system resources and what consumption rates mean for the future. They're concerned with making this one application work, you're concerned with how that application will fit in to the whole ecosystem of apps that share the same resources.

No one understands how it all fits together but you and your fellow sysadmins. If they came to you earlier, they wouldn't have these problems.

Congratulations, you're a BOFH.

The failure-mode here is the same as it was with the end users, a lack of Customer Service skills. Only instead of an ever-changing population of stupid-doers you have a small population of the willfully ignorant. If you become hard to approach, you'll be fixing messes well after they were cheap and easy to fix. They're avoiding you because you're forever telling them 'no', and you're not exactly nice about it.


From the point of view of others

Green Dragon

  • Alignment: Lawful Evil
  • Breath Weapon: Acid Cone
  • Preferred Habitat: Forests and Datacenters

The jaded cynic sysadmin most definitely works within the system. They may even be the system, but that authority is derived from someone who let them have the keys to the kingdom. However, they're very often the last word when it comes to their systems. This makes them lawful.

The jaded cynic sysadmin never seems to care what others think. They have their own goals, and asking them for stuff doesn't seem to do anything. Bribery can work, though. This makes them evil.

The jaded cynic sysadmin is... not someone you want to piss off. And they're easy to piss off, just existing seems to be enough sometimes. When that happens you risk a verbal flaying. It's called a breath weapon for a reason.

Not a bad observation


A friend of mine recently posted some job stuff and he had a good observation:

I investigate businesses that pay employees under the table. I ensure that unemployment insurance is paid by the employers, protecting the employees and ensuring they get unemployment insurance if they get laid off (if they get paid under the table they don't get unemployment).

I have been picking up a lot of businesses who are avoiding taxes (surprisingly, or maybe not, software companies are a big issue, along with housecleaners and dog groomers/sitters/walkers).

Emphasis mine.

You know, that's an interesting point and doesn't surprise me much. He does his work in the Seattle area, which is one of the major tech-hubs. And one thing tech-startups are known for is distributed offices. Take a 10-person company with people in 6 different states and no one who has run a company like that before, and you have prime conditions for dropping the ball on unemployment reporting and payment.

So you fired the slacker living in Waukegan, Illinois. Did you report their earnings in Illinois, where they live, Wisconsin, where the shared-office they 'worked' out of was, or Washington State, where HQ is?

aaahhh.... lemme get back to you on that.

As he tells me, that can be a very expensive mistake to make depending on how long the misunderstanding was in place. Your payroll vendor may or may not know WTF they're doing with a startup-style distributed office, so don't rely solely on them. Work location and residential location are different things. You can work in Vancouver, WA but live in Portland, OR; you pay Oregon income taxes, but will earn Washington unemployment if you get laid off.

It all began with a bit of Twitter snark:


SmallLAMPStack.png

Utilities follow a progression. They begin as a small shell script that does exactly what I need it to do in this one instance. Then someone else wants to use it, so I open source it. Ten years of feature-creep pass, and then you can't use my admin suite without a database server, a web front end, and just maybe a worker-node or two. Sometimes bash just isn't enough, you know? It happens.

Anyway...

Back when Microsoft was pushing out the 2007 iteration of all of their enterprise software, they added PowerShell support to most things. This was loudly hailed by some of us, as it finally gave us easy scriptability into what had always been a black box with funny screws on it to prevent user tampering. One of the design principles they baked in was that they didn't bother building UI elements for things you'd only do a few times, or once a year.

That was a nice time to be a script-friendly Microsoft administrator, since most of the tools would give you their PowerShell equivalents on one of the Wizard pages, so you could learn by practical example a lot more easily than you could otherwise. It was a real nice way to learn some of the 'how to do a complex thing in PowerShell' bits. Of course, you still had to learn variable passing, control loops, and other basic programming stuff, but you could see right there what the one-liner was for that next -> next -> next -> finish wizard.

SmallLAMPStack-2.png

One thing a GUI gives you is a much shallower on-ramp to functionality. You don't have to spend an hour or two feeling your way around a new syntax in order to do one simple thing; you just visually assemble your bits, hit next, then finish, then done. You usually have the advantage of a documented UI explaining what each bit means, a list of fields you have to fill out, and syntax checking on those fields, all of which gives you a lot of information about what kinds of data a task requires. If it spits out a blob of scripting at the end, even better.

An IDE, tab-completion, and other such syntactic magic help scripters build what they need; but it all relies upon on-the-fly programmatic interpretation of syntax in a script-builder. It's the CLI version of a GUI, so it doesn't have the stigma of 'graphical' ("if it can't be done through bash, I won't use it," said the Linux admin).

Neat GUIs and scriptability do not need to be diametrically opposed things; ideally a system should have both. A GUI to aid discoverability and teach a bit of scripting, and scripting for site-specific custom workflows. The two interface paradigms come from different places, but as Microsoft has shown, you can definitely make one tool support the other. More tools should follow their example.

Worried about the IPv4 to IPv6 migration?

NetWare users had a similar migration when Novell finally got off of IPX and moved to native TCP/IP with the release of NetWare 5.0 in or around 1999. We've done it before. Like the IPv6 transition, it was reasons other than "because it's a good idea" that pushed for the retirement of IPX from the core network. Getting rid of old networking protocols is hard and involves a lot of legacy, so they stick around for a long, long time.

As it happens IPv6 is spookily familiar to old IPX hands, but better in pretty much every way. It's what Novell had in mind back in the 80's, but done right.

  • Dynamic network addressing that doesn't require DHCP.
  • A mechanism for whole-network announcements (SAP in IPX, various multicast methods in IPv6).
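The first bullet is IPv6's stateless address autoconfiguration (SLAAC). One classic piece of it is deriving the interface identifier from the NIC's MAC address via modified EUI-64, sketched below (modern OSes usually prefer randomized privacy addresses, but the mechanism shows how a host can self-address without DHCP):

```python
def eui64_interface_id(mac: str) -> str:
    """Derive a modified EUI-64 interface ID from a 48-bit MAC address.

    Split the MAC in half, insert 0xFFFE in the middle, and flip the
    universal/local bit (the 0x02 bit of the first octet).
    """
    octets = [int(part, 16) for part in mac.split(':')]
    octets[0] ^= 0x02  # flip the U/L bit
    full = octets[:3] + [0xFF, 0xFE] + octets[3:]
    return ':'.join(f'{full[i] << 8 | full[i + 1]:04x}' for i in range(0, 8, 2))

# A host with MAC 00:1a:2b:3c:4d:5e self-assigns the link-local address
# fe80::21a:2bff:fe3c:4d5e without ever talking to a DHCP server.
print(eui64_interface_id('00:1a:2b:3c:4d:5e'))  # 021a:2bff:fe3c:4d5e
```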

Anyway, you have a network protocol you need to eventually retire, but pretty much everything uses it. What do you do? Like the stages of grief, there is a progression at work here:

  1. Ignore it. We're using the old system just fine, it's going to work for the foreseeable future, no reason to migrate.
  2. On by default, but disabled manually. The installer asks for the new stuff, but we just turn it off as soon as the system is up. We're not migrating yet.
  3. The WAN link doesn't support the old stuff. Um, crap. Tunnel the old stuff over the new stuff for that link and otherwise... continue to not migrate.
  4. Clients go on-by-default, but disabled manually. New clients are supporting the new stuff, but we disable it manually when we push out new clients. We're not migrating.
  5. Clients get trouble related to protocol negotiation. Thanks to the tunnel there is new stuff out there and clients are finding it, but can't talk to it, which creates network delays and causes support tickets. Find ways to disable protocol discovery, and push that out to clients.
  6. Internal support says all the manual changes are harshing their workflow, and can we please migrate since everything supports it now anyway. Okay, maybe we can go dual stack now.
  7. Network team asks if they can turn off the old stuff since everything is also using the new stuff. Say no, and revise deploy guides to start disabling the old stuff on clients but keep it on servers just in case.
  8. Network team asks again since the networking vendor has issued a bulletin on this stuff. Audit servers to see if there is any oldstuff usage. Find that the only usage is between the servers themselves and some really old, extremely broken stuff. Replace the broken stuff, turn off old stuff stack on servers.
  9. Migration complete.

At WWU we finished our IPX-to-IP migration by following this road, and it took us something like 7 years to do it.

Ask yourself where you are in your IPv6 implementation. At WWU when I left we'd gotten to step 5 (but didn't have a step 3).

I've done this before, and so have most old NetWare hands. Appeals to best practices and address-space exhaustion won't work as well as you'd hope, feeling the pain of the protocol transition does. Just like we're seeing right now. Migration will happen after operational pain is felt, because people are lazy. We're going to have RFC1918 IPv4 islands hiding behind corporate firewalls for years and years to come, with full migration only happening after devices stop supporting IPv4 at all.

The IPX transition was a private-network only transition since it was never transited over the public Internet. The IPv6 transition is Internet wide, but there are local mitigations that will allow local v4 islands to function for a long, long time. I know this, since I've done it before.

This is a bit of a rehash of a post I did back in 2005, but Novell did it right when it came to handling user credentials way back in the late 80's and early 90's. The original documents have pretty much fallen off the web, but Novell chose to use a one-way RSA method (or possibly a two-way RSA method while electing not to retain the decryption key, which is much the same thing) to encipher the passwords. The certificate used in this method was generated by the tree itself at creation time, so it was unique per tree.

The authentication process looked something like this (from memory; the primary documentation is offline):

  1. Client connects to a server and says, "I want to log in a user; here is a temporary key."
  2. Server replies using the temporary key, "Sure. Here is my public key and a salt."
  3. Client says, "I want to log in bobjoe42.employees.corporate."
  4. Server replies, "Here is the public key for bobjoe42.employees.corporate"
  5. Client crypts the password with bobjoe42's certificate.
  6. Client crypts the cryptotext+salt with the server's signing key.
  7. Client says, "Here is the login for bobjoe42.employees.corporate."
  8. Server decrypts the login request to get at the cryptotext+salt of bobjoe42.employees.corporate.
  9. Server removes the salt.
  10. Server compares the submitted cryptotext to the cryptotext on bobjoe42.employees.corporate's object. It matches.
  11. Server says, "You're good."

Unfortunately, the passwords were monocased before the crypting computation.

Fortunately, they allowed really long passwords unlike many systems (ahem 1993 version of UNIX-crypt).

That said, this system does a lot of password-handling things right:

  1. Passwords are never passed in the clear over the network, only the enciphered values are transferred.
  2. Passwords are never stored in the clear.
  3. Passwords are never stored in a reversible way.
  4. Reversible keys are never transferred in the clear.
  5. The password submission process prevents replay attacks through the use of a random salt with each login transaction.
  6. The passwords themselves were stored encrypted with tree-unique crypto certificates, so the ciphertext of a password in one tree would look different than the same password in a different tree.
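Point 6 is easy to demonstrate with a stand-in. Using an HMAC with a per-tree secret (my substitute here for Novell's tree-unique certificate; the hash is an assumption, not their actual algorithm), the same password stored in two trees yields unrelated ciphertext, and the monocasing flaw noted above falls out too:

```python
import hashlib
import hmac

def store_password(tree_key: bytes, password: str) -> str:
    """Stand-in for NDS's one-way crypting: a keyed hash of the password.

    The real system used an RSA-based one-way method; HMAC-SHA256 is used
    here purely to illustrate the per-tree property.
    """
    # NetWare monocased passwords before crypting, so we do too.
    return hmac.new(tree_key, password.lower().encode(), hashlib.sha256).hexdigest()

a = store_password(b'tree-A-secret', 'CorrectHorse')
b = store_password(b'tree-B-secret', 'CorrectHorse')
print(a == b)  # False: same password, different trees, different ciphertext
print(a == store_password(b'tree-A-secret', 'CORRECTHORSE'))  # True: monocased
```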

You can get a similar system using modern web technologies:

  1. Client connects to server over SSL. A secure channel is created.
  2. Client retrieves javascript or whatever from the server describing how to encode login credentials.
  3. Client says, "I'm about to log in, give me a salt."
  4. Server returns a salt to the client.
  5. Client computes a salted hash from the user-supplied password.
  6. Client submits, "I'm logging in bobjoe42@zmail.us with this hashtext."
  7. Server compares the hashtext to the password database, finds a match.
  8. Server replies, "You're good, use this token."
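A minimal sketch of that eight-step flow (the helper names are hypothetical, and a real deployment would use a memory-hard KDF like scrypt or argon2 rather than bare SHA-256):

```python
import hashlib
import hmac
import os

# Server-side store holds only the hashed form of the password,
# never the password itself.
USERS = {'bobjoe42@zmail.us': hashlib.sha256(b'hunter2').hexdigest()}

def issue_salt() -> bytes:
    """Steps 3-4: server hands out a fresh random salt per login attempt."""
    return os.urandom(16)

def client_response(password: str, salt: bytes) -> str:
    """Step 5: client hashes the password, then salts that hash."""
    stored_form = hashlib.sha256(password.encode()).hexdigest()
    return hashlib.sha256(salt + stored_form.encode()).hexdigest()

def server_check(user: str, salt: bytes, response: str) -> bool:
    """Step 7: server recomputes from its database and compares."""
    expected = hashlib.sha256(salt + USERS[user].encode()).hexdigest()
    return hmac.compare_digest(expected, response)

salt = issue_salt()
response = client_response('hunter2', salt)
print(server_check('bobjoe42@zmail.us', salt, response))  # True

# Replaying that response against a new salt fails, which is the
# anti-replay property from point 5 of the list above.
print(server_check('bobjoe42@zmail.us', issue_salt(), response))  # False
```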

However, a lot of systems don't even bother going that complex, relying instead on the SSL channel to provide transaction security and allowing unhashed passwords to be passed over that crypted channel. That's "good enough" for a lot of things, and clearly Novell was rather paranoid back in the day.

As it happened, that method ended up being so secure they had to change their authentication system when it came time to handle systems that wanted to authenticate using non-NCP methods like, oh, CIFS or AppleTalk. Those other protocols don't have mechanisms for the sort of handshake that NCP allows, so something else had to be created, and thus the Universal Password was born. But that's kind of beyond the scope of this article.

Yep, they did it right back then. A network sniffer (a lot easier to use in the days of hubbed networks) was much less likely to yield tasty numnums. SMB wasn't so lucky.

Novell introduced NDS with NetWare 4.0 in 1993, and it is still being shipped 21 years later as part of Open Enterprise Server.

For those of you who've never run into it, NDS (Novell Directory Services, currently marketed as eDirectory) is a distributed LDAP database that also provides non-LDAP interfaces for interacting with the object store. It can scale up to very silly object counts and, thanks to Novell's long experience with distributed database management, does so with a minimum of object corruption. It just works (albeit on a proprietary system).

It didn't start off as an LDAP datastore, though. No, it began life in 1993 as the authentication database behind NetWare and had a few very revolutionary features versus what was available on the market at the time:

  • It allowed multiple servers to use the same authentication database, so you didn't have to have an account on each server if users needed to access more than one of them. This was the biggest selling point, and seems pretty basic right now.
    • NIS/NIS+ already did this and predates NDS, but was a UNIX-only system not useful for non-UNIX offices.
  • It ran the database on multiple nodes, which made it a replicated database.
  • It partitioned the database to provide improved database locality, which made it a sharded database.
  • It allowed write operations on more than one replica per shard, which made it a distributed database.
  • It had eventual convergence built into it.
  • It had robust authentication features, which I'll get into in a later post.

NDS was a replicated, distributed, sharded database with eventual convergence, written in 1993. MongoDB can do three of those (replicated, sharded, eventually consistent; it can distribute reads if needed), Cassandra does all four. This is a solvable problem, but it's a rather complex one, as Novell found out.
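The "eventual convergence" part can be sketched with the simplest possible merge rule, last-writer-wins by timestamp. This is a toy model (NDS actually tracked per-attribute modification timestamps within a more careful replica-sync protocol), but it shows the shape of the problem:

```python
def merge_replicas(*replicas):
    """Toy last-writer-wins merge.

    Each replica maps attribute -> (timestamp, value). After replicas
    exchange state, every one keeps the newest write per attribute, so
    they all converge on the same answer regardless of merge order.
    """
    merged = {}
    for replica in replicas:
        for key, (ts, value) in replica.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged

# Two replicas that took writes independently (hypothetical attributes):
a = {'bobjoe42/phone': (10, '555-1234')}
b = {'bobjoe42/phone': (12, '555-9999'), 'bobjoe42/title': (5, 'Admin')}
print(merge_replicas(a, b) == merge_replicas(b, a))  # True: order-independent
```

The hard part Novell hit isn't the merge rule, it's everything around it: clock skew between nodes, replicas that vanish for days, and writes that race within the timestamp's resolution.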

Consider the state of networking in 1993.

  • 10Mbps Ethernet was high-speed, and was probably hubbed even in the "datacenter".
  • Any enterprise of any size had very slow WAN links connecting small offices to central, so you had high latency links.
  • 16Mbps token-ring was still in frequent enough use NetWare had to support it.
    • Since TR was faster than Ethernet, it was frequently deployed in the datacenter, which necessitated TR to Ethernet bridges.
    • TR was often the edge network as well.
  • The tech industry hadn't yet converged on a single Ethernet Layer 2 framing protocol, so anything talking to Ethernet had to be able to handle up to 4 different framing standards (to the best of my limited knowledge, Cisco gear still can be configured to use any of the three losers of that contest, even though none of them has been in common usage for a long time).
  • TCP/IP was not the only data standard, NetWare used its very own IPX protocol which is not an IP protocol (more on that in a later post).

Can you imagine trying to run something like Cassandra on 10Mbps links, with some nodes on the other side of links with pings approaching 1000ms? It can certainly be done, but it sure as heck magnifies any problems in the convergence protocol.

Novell learned that too. Early versions of NDS were prone to corruption, very prone. Real-world networking conditions were so unlike the conditions the developers assumed that it was only after NDS hit production that they truly appreciated the full array of situations it had to support. From memory, it was only after NDS version 6, released on or about NetWare 4.11 Service Pack 3, that it really became stable. That took Novell over 4 years to get right.

Corruption bugs continued in NDS even into the modern era, since that's a very hard problem to stomp. The edge cases surrounding a node disappearing and reappearing with old/new/changed data, and how convergence happens then, get very nuanced very quickly. The open-source distributed database projects are dealing with that right now.

For all that it was a strong backing database for very large authentication and identity databases, NDS/eDirectory was never designed to be highly transactional. It's an LDAP database, and you use it where you'd use an LDAP database.
