Recently in linux Category

What my CompSci degree got me

| No Comments

The what use is a csci degree meme has been going around again, so I thought I'd interrogate what mine got me.

First, a few notes on my career journey:

  1. Elected not to go to grad-school. Didn't have the math for a masters or doctorate.
  2. Got a job in helpdesk, intending to get into Operations.
  3. Got promoted into sysadmin work.
  4. Did some major scripting as part of Y2K remediation, first big coding project after school.
  5. Got a new job, at WWU.
  6. Microsoft released PowerShell.
  7. Performed a few more acts of scripting. Knew I so totally wasn't a software engineer.
  8. Manage to change career tracks into Linux. Started learning Ruby as a survival mechanism.
  9. Today: I write code every day. Still don't consider myself a 'software engineer'.

Elapsed time: 20ish years.

As it happens, even though my career has been ops-focused I still got a lot out of that degree. Here are the big points.

It all began with a bit of Twitter snark:


Utilities follow a progression. They begin as a small shell script that does exactly what I need it to do in this one instance. Then someone else wants to use it, so I open source it. 10 years of feature-creep pass, and then you can't use my admin suite without a database server, a web front end, and just maybe a worker-node or two. Sometimes bash just isn't enough you know? It happens.


Back when Microsoft was just pushing out their 2007 iteration of all of their enterprise software, they added PowerShell support to  most things. This was loudly hailed by some of us, as it finally gave us easy scriptability into what had always been a black box with funny screws on it to prevent user tampering. One of the design principles they baked in was that they didn't bother building UI elements for things you'd only do a few times, or would do once a year.

That was a nice time to be a script-friendly Microsoft administrator since most of the tools would give you their PowerShell equivalents on one of the Wizard pages, so you could learn-by-practical-example a lot easier than you could otherwise. That was a real nice way to learn some of the 'how to do a complex thing in powershell' bits. Of course, you still had to learn variable passing, control loops, and other basic programming stuff but you could see right there what the one-liner was for that next -> next -> next -> finish wizard was.


One thing that a GUI gives you is a much shallower on-ramp to functionality. You don't have to spend an hour or two feeling your way around a new syntax in order to do one simple thing, you just visually assemble your bits, hit next, then finish, then done. You usually have the advantage of a documented UI explaining what each bit means, a list of fields you have to fill out, syntax checking on those fields, which gives you a lot of information about what kinds of data a task requires. If it spits out a blob of scripting at the end, even better.

An IDE, tab-completion, and other such syntactic magic help scripters build what they need; but it all relies upon on the fly programatic interpretation of syntax in a script-builder. It's the CLI version of a GUI, so doesn't have the sigma of 'graphical' ("if it can't be done through bash, I won't use it," said the Linux admin).

Neat GUIs and scriptability do not need to be diametrically opposed things, ideally a system should have both. A GUI to aid discoverability and teach a bit of scripting, and scripting for site-specific custom workflows. The two interface paradigms come from different places, but as Microsoft has shown you can definitely make one tool support the other. More things should take their example.

As I look around the industry with an eye towards further employment, I've noticed a difference of philosophy between startups and the more established players. One easy way to see this difference is on their job postings.

  • If it says RHEL and VMWare on it, they believe in support contracts.
  • If it says CentOS and OpenStack on it, they believe in community support.

For the same reason that tech startups almost never use Windows if they can get away with it, they steer clear of other technologies that come with license costs or mandatory support contracts. Why pay the extra support cost when you can get the same service by hiring extremely smart people and use products with a large peer support community? Startups run lean, and all that extra cost is... cost.

And yet some companies find that they prefer to run with that extra cost. Some, like StackExchange, don't mind the extra licensing costs of their platform (Windows) because they're experts in it and can make it do exactly what they want it to do with a minimum of friction, which means the Minimum Viable Product gets kicked out the door sooner. A quicker MVP means quicker profitability, and that can pay for the added base-cost right there.

Other companies treat support contracts like insurance: something you carry just in case, as a hedge against disaster. Once you grow to a certain size, business continuity insurance investments start making a lot more sense. Running for the brass ring of market dominance without a net makes sense, but once you've grabbed it keeping it needs investment. Backup vendors love to quote statistics on the percentage of business that fail after a major data-loss incident (it's a high percentage), and once you have a business worth protecting it's good to start protecting it.

This is part of why I'm finding that the long established companies tend to use technologies that come with support. Once you've dominated your sector, keeping that dominance means a contract to have technology experts on call 24/7 from the people who wrote it.

"We may not have to call RedHat very often, but when we do they know it'll be a weird one."

So what happens when startups turn into market dominators? All that no-support Open Source stuff is still there...

They start investing in business continuity, just the form may be different from company to company.

  • Some may make the leap from CentOS to RHEL.
  • Some may contract for 3rd party support for their OSS technologies (such as with 10gen for MongoDB).
  • Some may implement more robust backup solutions.
  • Some may extend their existing high-availability systems to handle large-scale local failures (like datacenter or availability-zone outages).
  • Some may acquire actual Business Continuity Insurance.

Investors may drive adoption of some BC investment, or may actively discourage it. I don't know, I haven't been in those board meetings and can argue both ways on it.

Which one do I prefer?

Honestly, I can work for either style. Lean OSS means a steep learning curve and a strong incentive to become a deep-dive troubleshooter of the platform, which I like to be. Insured means someone has my back if I can't figure it out myself, and I'll learn from watching them solve the problem. I'm easy that way.

This morning on the opensuse-factory mailing-list it was pointed out that the ongoing problems getting version 12.2 to a state where it can even be reliably tested was a symptom of structural problems within how OpenSUSE as a whole handles package development. Steven Kulow posted the following this morning:


It's time we realize delaying milestones is not a solution. Instead,
let's use the delay of 12.2 as a reason to challenge our current
development model and look at new ways. Rather than continue to delay
milestones, let's re-think how we work.

openSUSE has grown. We have many interdependent packages in Factory. The
problems are usually not in the packages touched, so the package updates
work. What's often missing though is the work to fix the other packages
that rely on the updated package. We need to do a better job making sure
bugs caused by updates of "random packages" generate a working system.
Very fortunately we have an increasing number of contributors that
update versions or fix bugs in packages, but lately, the end result has
been getting worse, not better. And IMO it's because we can't keep up in
the current model.

I don't remember a time during 12.2 development when we had less than
100 "red" packages in Factory. And we have packages that fail for almost
five months without anyone picking up a fix. Or packages that have
unsubmitted changes in their devel project for six months without anyone
caring to submit it (even ignoring newly introduced reminder mails).

So I would like to throw in some ideas to discuss (and you are welcome
to throw in yours as well - but please try to limit yourself to things
you have knowledge about - pretty please):

1. We need to have more people that do the integration work - this
  partly means fixing build failures and partly debugging and fixing
  bugs that have unknown origin.
  Those will get maintainer power of all of factory devel projects, so
  they can actually work on packages that current maintainers are unable
2. We should work way more in pure staging projects and less in develop
  projects. Having apache in Apache and apache modules in
  Apache:Modules and ruby and rubygems in different projects may have
  appeared like a clever plan when set up, but it's a nightmare when it
  comes to factory development - an apache or ruby update are a pure
  game of luck. The same of course applies to all libraries - they never
  can have all their dependencies in one project.
  But this needs some kind of tooling support - but I'm willing to
  invest there, so we can more easily pick "green stacks".
  My goal (a pretty radical change to now) is a no-tolerance
  strategy about packages breaking other packages.
3. As working more strictly will require more time, I would like to
  either ditch release schedules all together or release only once a
  year and then rebase Tumbleweed - as already discussed in the RC1

Let's discuss things very openly - I think we learned enough about where
the current model works and where it doesn't so we can develop a new one

Greetings, Stephan

The discussion is already going.

The current release-schedule is an 8 months schedule, which is in contrast to a certain other highly popular Linux distro that releases every 6. Before they went to the every-8 schedule, they had a "when it's good and ready" schedule. The "when it's good and ready" schedule lead to criticisms that OpenSUSE was an habitually stale distro that never had the latest shiny new toys that the other distros had; which, considering the then-increasing popularity of that certain other distro, was a real concern in the fight for relevancy.

Looks like the pain of a hard-ish deadline is now exceeding the pain of not having the new shiny as fast as that other distro.

One thing that does help, and Stephan pointed this out, is the advent of the Tumbleweed release of OpenSUSE. Tumbleweed is a continual integration release similar in style to Mint; unlike the regular release it does get regular updates to the newest-shiny, but only after it passes testing. Now that OpenSUSE has Tumbleweed, allowing the mainline release to revert to a "when it's good and ready" schedule is conceivable without threatening to throw the project out of relevance.

These are some major changes being suggested, and I'm sure that they'll keep the mailing-list busy for a couple of weeks. But in the end, it'll result in a more stable OpenSUSE release process.

Moving /tmp to a tmpfs

| 1 Comment
There is a move afoot in Linux-land to make /tmp a tmpfs file-system. For those of you who don't know what that is, tmpfs is in essence a ramdisk. OpenSUSE is considering the ramifications of this possible move. There are some good points to this move:

  • A lot of what goes on in /tmp are ephemeral files for any number of system and user processes, and by backing that with RAM instead of disk you make that go faster.
  • Since a lot of what goes on in /tmp are ephemeral files, by backing it with RAM you save writes on that swanky new SSD you have.
  • Since nothing in /tmp should be preserved across reboots, a ramdisk makes sense.

All of the above make a lot of sense for something like a desktop oriented distribution or use-case, much like the one I'm utilizing right this moment.

However, there are a couple of pretty large downsides to such a move:

  • Many programs use /tmp as a staging area for potentially-larger-than-ram files.
  • Some programs don't clean up after themselves, which leaves /tmp growing over time.

Working as I do in the eDiscovery industry where transforming files from one type to another is common event, the libraries we use to do that transformation can and do drop very big files in /tmp while they're working. All it takes is one bozo dropping a 20MB file with .txt at the end (containing a single SMTP email message with MIME'ed attachment) to generate an eight thousand page PDF file. Such a process behaves poorly when told "Out Of Space" for either RAM or '/tmp.

And when such processes crash out for some reason, they can leave behind 8GB files in /tmp.

That would not be happy on a 2GB RAM system with a tmpfs style /tmp.

In our case, we need to continue to keep /tmp on disk. Now we have to start doing this on purpose, not just trusting that it comes that way out of the tin.

Do I think this is a good idea? It has merit on high RAM workstations, but is a very poor choice for RAM constrained environments such as VPSs, and virtual machines.

Are the benefits good enough to merit coming that way out of the box? Perhaps, though I would really rather that the "server" option during installation default to disk-backed /tmp.

In the last month-ish I've had a chance to read about and use two new graphical shells that impact my life:

  • Windows 8 Metro, which I already have on my work-phone
  • Gnome 3

Before I go further I must point out a few things. First of all, as a technologist change is something I must embrace. Nay, cheer on. Attitudes like, "If it ain't broke, don't fix it," are not attitudes I share, since 'broke' is an ever moving target.

Secondly, I've lived through just enough change so far to be leery of going through more of it. This does give me some caution for change-for-change's-sake.

As you can probably guess from the lead-up, I'm not a fan of those two interfaces.

They both go in a direction that the industry as a whole is going, and I'm not fond of that area. This is entirely because I spend 8-12 hours a day earning a living using a graphical shell. "Commoditization" is a battle-cry for change right now, and that means building things for consumers.

The tablet in my backpack and the phones in my pocket are all small, touch-screen devices. Doing large-scale text-entry in any of those, such as writing blog posts, is a chore. Rapid task-switching is doable through a few screen-presses, though I don't get to nearly the window-count I do on my traditional-UI devices. They're great for swipe-and-tap games.

When it comes to how I interact with the desktop, the Windows 7 and Gnome 2 shells are not very different other than the chrome and are entirely keyboard+mouse driven. In fact, those shells are optimized for the keyboard+mouse interaction. Arm and wrist movements can be minimized, which extends the lifetime of various things in my body.

Windows 8 MetroUI brings the touch-screen metaphor to a screen that doesn't (yet) have a touch-screen. Swipe-and-tap will have to be done entirely with the mouse, which isn't a terribly natural movement (I've never been a user of mouse-gesture support in browsers). When I do get a touch-screen, I'll be forced to elevate my arm(s) from the keyboarding level to tap the screen a lot, which adds stress to muscles in my back that are already unhappy with me.

And screen-smudges. I'll either learn to ignore the grime, or I'll be wiping the screen down ever hour or two.

And then there is the "One application has the top layer" metaphor, which is a change from the "One window has the top layer" metaphor we've been living with on non-Apple platforms for a very long time now. And I hate on large screens. Apple has done this for years, which is a large part of why Gnome 3 has done it, and is likely why Microsoft is doing it for Metro.

As part of my daily job I'll have umpty Terminal windows open to various things and several Browser windows as well. I'll be reading stuff off of the browser window as reference for what I'm typing in the terminal windows. Or the RDP/VNC windows. Or the browser windows and the java-console windows. When an entire application's windows elevate to the top it can bury windows I want to read at the same time, which means that my window-placement decisions will have to take even more care than I already apply.

I do not appreciate having to do more work because the UI changed.

I may be missing something, but it appears that the Windows Metro UI has done away with the search-box for finding Start-button programs and control-panel items. If so, I object. If I'm on a desktop, I have a hardware keyboard so designing the UI to minimize keystrokes in favor of swiping is a false economy.

Gnome 3 at least has kept the search box.

In summation, it is my opinion that bringing touch-UI elements into a desktop UI is creating some bad compromises. I understand the desire to have a common UI metaphor across the full range from 4" to 30" screens, but input and interaction methodologies are different enough that some accommodation-to-form-factor needs to be taken.
Since I had some time this weekend and actually had my work laptop home, I decided to take the plunge and get OpenSUSE 12.1 onto it. I"ve been having some instability issues that trace to the kernel, so getting the newer kernel seemed like a good idea.


The upgrade failed to take for two key reasons:

  1. VMWare Workstation doesn't yet work with the 3.1 kernel.
  2. Gnome 3 and my video card don't get along in a multi-monitor environment.

The first is solvable with a community patch that patches the VMware kernel modules to work with 3.1. The down side is that every time I launch a VM I get the "This is running a newer kernel than is known to work" message, and that gets annoying. Also, rather fragile since I'd have to re-patch every time a kernel update comes down the pipe (not to mention breaking any time VMWare releases a new version).

The second is where I spent most of my trouble-shooting time. I've been hacking on X since the days when you had to hand-roll your own config files, so I do know what I'm looking at in there. What I found:

  • When X was correctly configured to handle multi-monitors the way I want, the Gnome shell won't load.
  • When the Gnome shell was told to work right, GDM won't work.

Given those two, and how badly I need more screen real-estate at work, I'm now putting 11.4 back on the laptop.

One of the tasks I have is to build some separation between the production environment and test/integration environments. As we approach that Very Big Deadline I've talked about, this is a pretty significant task. Dev processes that we've been using for a really long time now need to change in order to provide the separation we need, and that's some tricky stuff.

One of the ways I plan on accomplishing this is through the use of Environments in Puppet. This looks like a great way to allow automation while delivering different code to different environments. Nifty keen stuff.

But it requires that I re-engineer our current Puppet environment, which is a single monolithic all-or-nothing repo to accommodate this. This is fairly straight-forward, as suggested by the Puppet Docs:

If the values of any settings in puppet.conf reference the $environment variable (like modulepath = $confdir/environments/$environment/modules:$confdir/modules, for example), the agent's environment will be interpolated into them.

So if I need to deliver a different set of files to production, such as a different authorized_keys file, I can drop that into $confdir/environments/production/modules and all will be swell.

So I tried to set that up. I made the environment directory, changed the puppet.conf file to reflect that, made an empty module directory for the module that'd get the changed files (but not put the files in there yet), and ran a client against it.

It didn't work so well.

The error I was getting on the puppetmaster was err: Could not find class custommod for clunod-wk0130.sub.example.local at /etc/puppet/manifests/site.pp:84 on node clunod-wk0130.sub.example.local

I checked eight ways from Sunday for spelling mistakes, bracket errors, and other such syntax problems but could not find out why it was having trouble locating the custommod directory. I posted on ServerFault for some added help, but didn't get much beyond, "Huh, it SHOULD work that way," which wasn't that helpful.

I decided that I needed to debug where it was searching for modules since it manifestly wasn't seeing the one staring it in the face. For this I used strace, which produced an immense logfile. Grepping out the 'stat' calls as it checked the import statements, I noticed something distinctly odd.


It was attempting to read in the init.pp file for the custommod, even though there wasn't a file there. And what's more, it wasn't then re-trying under /etc/puppet/modules/custommod/manifests/init.pp. Clearly, the modulepath statement does not work like the $PATH variable in bash and every other shell I've used. In those, if it fails to find a command in (for example) /usr/local/bin, it'll then try /usr/bin, and then ~/bin until it finds the file.

The modulepath statement in puppet.conf is focused on modules, not individual files.

This is a false cognate. By having that empty module in the production environment, I was in effect telling Puppet that the Production environment doesn't have that module. Also, my grand plan to provide a different authorized_keys file in the Production environment requires me to have a complete module for the Production environment directory, not just the one or two files I want changed.

Good to know.

When there is no clean back-out

| 1 Comment
Today I ended up banging my head on a particularly truculent wall. A certain clustering solution just wasn't doing the deed. No matter how I cut, sliced, sawed or hacked it wouldn't come to life. I wanted my two headed beast! Most vexing.

Then I hit that point where I said unto myself (I hope), "!#^#^ it, I'm reformatting these two and starting from scratch."

So I did.

It worked first time.


Ok, I'll take it.

Mind, all the cutting and slicing and sawing and hacking was with me knowing full well that iterative changes like that can stack unknowingly, and I was backing out the changes when I determined that they weren't working. Clearly, my back-out methods weren't clean enough. Okay, food for thought; dirty uninstallers are hardly an unusual thing.

Of course I kicked myself for not trying that earlier, but during the course of aforementioned reanimatory activity I did learn a fair amount about this particular clustering technology. Not a complete loss. I do wish I had those 4 hours back, since we're at a part of the product deployment schedule where hours count.

Learning UEFI

| No Comments
This weekend I put together (mostly as it turned out) a new fileserver for home. The box that's currently pulling that duty is 9 years old and still working just fine.

However, 9 years old. I'm happily surprised that 40GB hard-drive is still working. Only 58,661 power-on hours, and all of 3 reallocated sectors. But still, 9 years old. There comes a time when hardware just needs to be retired before it goes bad in spectacular ways.

The system that is replacing this stalwart performer cost under $400 with shipping, which contrasts starkly with the sticker price on the old unit which was around $1400 back in the day. This system is based on the AMD E350 CPU. Underpowered, but this is a file-server and that ancient P4 is still keeping up. I have a Drobo that I'll be moving to this unit, which is where the bulk of the data I want to serve is being kept anyway.

As for the title of the post, this E350 mobo has a UEFI firmware on it. That's when things got complicated.

Other Blogs

My Other Stuff

Monthly Archives