Recently in opensuse Category

Confusion on containers

My dayjob hasn't brought me into much contact with containers, which is becoming a problem. I mean, I've been doing virtualization for a decade and a half at this point so the concepts aren't new. But how containers work these days isn't something I'm all that up on.

So when the OpenSuSE factory mailing list exploded recently with news of what the next versions of what's now Leap would look like, I paid attention. To quote a weekly status update that caused a lot of debate:

we have mostly focused this week's meeting on how the ALP Desktop is going to look like and how we can dispel fears around flatpaks and containers when it comes to the ALP desktop.

They're moving away from a traditional RPM-based platform built on top of the SUSE Linux Enterprise Server (SLES) base, and are doing it because (among other reasons) the Python shipped with SLES is an officially unsupported version. Most Leap users want a more recent Python, which the openSUSE project can't provide due to a lack of volunteer time to rewrite all the OS-related Python bits.

What they're replacing it with is something called Adaptable Linux Platform (ALP) which is built in part on the work done in OpenSUSE MicroOS. It will use Flatpak packages (or RPM wrappers around Flatpaks), which is a contentious decision let me tell you. This debate is what made me look into Flatpak and related tech.

The "hell no" side of the debate is the old-guard sysadmins from my era who have been doing vulnerability management for a long, long time and know that containers make that work problematic. Building a Configuration Management Database (CMDB) that lists all of your installed packages is a core competency for doing vulnerability management, because that's the list you consult when yet another OpenSSL vulnerability arrives and you need to see how bad it'll be this time. This work is made way harder when every container you have comes bundled with its own version of the OpenSSL libraries. We now have emerging tooling that will scan inside containers for bad packages, but that's only half the problem; the other half is convincing upstream container providers to actually patch on your vulnerability program's schedule. Many won't.
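For the curious, here is a rough sketch of what that emerging tooling looks like; the tools and the image name are my examples, not anything this post endorses:

  # Build a package inventory (SBOM) from inside an image, then match it against known CVEs.
  syft registry.example.com/some-app:1.2 -o json > sbom.json
  grype sbom:./sbom.json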

On the "well actually, that's kind of a good idea" side of the debate are the package maintainers. For a smaller project like openSUSE, package maintainers need to support packages for a few different distributions:

  • Tumbleweed, the rolling release that is kept up to date at all times
  • The current Leap release, which has base OS libraries getting older every year (and might be old enough to prevent compiling 'newest' without lots of patches)
  • Leap Micro (based on SLES Micro), which assumes a read-only filesystem, uses Flatpaks, is designed to be container-first, and isn't meant to be a desktop distribution
  • Any additional architectures beyond x86_64/amd64 they want to support (various ARM flavors are in demand, but OpenSuSE also has s390 support)

For some packages this is no big deal. For others that involve touching the kernel in any way, such as VirtualBox, each distribution can have rather different requirements. This adds to the load of supporting things, and consumes more volunteer time.

Flatpak lets you get away with making a single package that will support all the distributions. Quite the labor-saving device for a project strapped for volunteers. Of course, this leads to the usual "54 versions of the OpenSSL libraries" problem, but the distributions can all still be made. That's valuable.

This also makes it easier to sandbox desktop utilities. I say sandbox instead of containerize because the key benefit you're getting from this isn't the containers, but the sandboxing that comes with them. I've spent time in the past attempting to build an AppArmor profile for Firefox (it mostly worked, but took too much maintenance to tune). These days you can use systemd user units to apply cgroups to processes and get even more control of what they're allowed to do, including restricting them to filesystem namespaces. This isn't a bad thing.
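A minimal sketch of what those systemd sandboxing knobs look like, written as a system-service unit rather than the user units mentioned above (the service name and binary are made up):

  # /etc/systemd/system/some-tool.service (illustrative)
  [Service]
  ExecStart=/usr/bin/some-tool
  PrivateTmp=yes          # private /tmp namespace for this process
  ProtectSystem=strict    # the OS filesystem is read-only from this service's point of view
  ProtectHome=read-only   # $HOME is visible but not writable
  NoNewPrivileges=yes     # no privilege escalation via setuid binaries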

Also in the not a bad thing camp is ALP's stance that most of the filesystem will be marked read-only. This improves security, because malicious processes that break out of their sandbox will have a harder time replacing system binaries. Having partitions like /usr mounted as read-only has shown up on hardening guides for a couple of decades at this point.
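The decades-old version of the idea is a one-word change in /etc/fstab (device name assumed here), flipped writable only when it's time to patch:

  /dev/sda3  /usr  ext4  defaults,ro  1 2
  # when patching:
  mount -o remount,rw /usr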

The final thing is that the Open Build Service, which creates all the openSUSE packages, already has support for creating Flatpaks alongside traditional RPMs. This will make maintenance even easier.

What do I think of all of this?

I'm still making up my mind. I'm going to have to get over no longer having "tarball with some added scripts and metadata" style packages like I've used since the 1990s; that writing is on the wall. We'll still have some of that around, but major application packages are going to get shipped as functional filesystem images even for "base" Linux installs. It won't be all that space-efficient (a package needing modern Python will end up shipping an entire Python 3.10 interpreter inside its Flatpak on a SLES server), but that's less important in the days of 1TB microSD cards.
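You can see that space cost directly on any Flatpak-using system, since every runtime is its own copy of a platform (these paths are the usual system-install defaults, which may differ on your box):

  flatpak list --runtime             # each runtime is a separate platform copy
  du -sh /var/lib/flatpak/runtime/*  # and each one occupies real space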

Ubuntu has the same idea in their Snaps (Flatpak and Snap are highly similar in goals and features), which has been controversial all by itself. At DayJob we've had the vulnerability-management conversation regarding Snaps and our ability to introspect into them for managing CVEs, which isn't great yet, and decided not to go there at this time.

All in all, the industry is slowly shifting from treating the list of system packages as the definitive list of software installed on a server towards a far more nuanced view. NPM is famous for building vast software dependency trees, and we've had most of a decade for vulnerability scanning tooling to catch up to that method. For all that OS-side updaters were among the first to build dependency resolvers (all that metadata on .deb and .rpm packages is there for a reason), the not-developing-in-C software industry as a whole has taken this approach to whole new levels, which in turn forced vulnerability management tooling to adapt. We're about at the point where SLES Micro's approach makes a good case: traditional methods for a minimal OS and container-blobs for everything else is just about supportable given today's tooling.

The only constant in this industry is change, and this is feeling like one of those occasional sea changes.

This weekend's project was replacing the home server/router. This isn't a high-spec machine for my internal build tooling; this is pretty much the router and file/print server. Given cloud, it's more of a packet-shoveller than anything else. When I built it, I was going for low power to maximize how long the UPS would last in a power outage, and to increase the chances that I'd get my old IP addresses when the power came back.

mdadm tells me the mirror-set was created Sun Sep 25 22:25:34 2011.
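That date comes straight out of the array metadata, via something like this (assuming the mirror is /dev/md0):

  mdadm --detail /dev/md0 | grep 'Creation Time'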

I hadn't quite realized it was that old. As it happened I blogged about it the day after the build, so consider this an update. The system this one replaced was also nine years old, so I guess my replacement cycle is 9 years. I never bothered to get UEFI booting working, which in the end doomed this box. The /boot partition couldn't take another kernel update as of Friday; kernels had just grown too big!

Rather than homebrew the home server, this time I went off-the-shelf and bought a System76 Meerkat. It's a fifth the size of the old server + Drobo box and has as much storage in SSD/NVMe. Also, it's vastly quieter. The fan in the old server had picked up a buzz once in a while; it needed a therapeutic tap to knock it out of vibration. My office is so quiet now (it couldn't live in the basement because the Internet enters the house in my office). Being 9 years newer means it probably draws half the power of the old one, so UPS endurance just shot up. Whee!

Over its 9 year life:

  • Glommed onto hardware from the old MythTV setup, which was the Drobo array. The Drobo is probably 13 or 14 years old by this point.
  • Upgraded one release at a time from openSUSE 11.4 all the way to 15.2, but /boot couldn't handle the size of the latest 15.2 kernel. Otherwise it just worked: no surprises, a headless update every time.
  • Moved houses twice, no failures. Even after the move where it was powered off for three days.
  • When we left the land of Verizon FiOS five years ago, it became my internet gateway.
  • Did the work to get IPv6 working, with prefix delegation, which was trickier than I liked and still feels hacky (a minimal config sketch follows this list). But it let me do IPv6 in the house without using the ISP router.
  • Two years ago I put telegraf/influxdb/grafana on it to track internet gateway usage and a few other details, such as temperature, which showed rather nicely how big the temperature swings are in our house over the winter.
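For the prefix-delegation bit, a minimal sketch of the kind of config involved. It assumes dhcpcd as the DHCPv6 client and eth0/eth1 as the WAN/LAN interfaces; the post doesn't say what was actually used:

  # /etc/dhcpcd.conf (illustrative)
  interface eth0        # WAN: request an address and a delegated prefix
    ia_na 1
    ia_pd 2 eth1/0/64   # hand a /64 from the delegation to the LAN interface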

The hard-drives (a pair of 160GB Western Digital Blacks), per their SMART attributes (the exact query is sketched after this list):

  • 79,874 power-on hours.
  • 50 power-cycles
    • 10 of those were planned maintenance of various types, like moving, painting, and other things.
    • The rest were power-outages of various durations.
  • Zero reallocated sectors on either drive
  • Load_Cycle_Count of 5.4 million, which comes to about 68 load-cycles an hour. Clearly this is an important SMART metric.
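The attributes above come from output like this (device name assumed):

  smartctl -A /dev/sda | egrep 'Power_On_Hours|Power_Cycle_Count|Reallocated_Sector|Load_Cycle_Count'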

In a hero of the revolution moment, I managed to get enough of the DHCP state transferred between the old and new hardware that we're on the same IPv4 address and IPv6 prefix from before the migration!
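The post doesn't say how the state was carried over; one plausible way (assuming dhcpcd, and the paths vary by distro and client) is to copy the client's identity and lease files so the ISP sees the "same" machine:

  # copy the DUID and lease files from the old box before the new one's first boot
  scp oldbox:/var/lib/dhcpcd/duid /var/lib/dhcpcd/
  scp 'oldbox:/var/lib/dhcpcd/*.lease*' /var/lib/dhcpcd/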

Let's see if this one also goes nine years. Check back in 2029.

Perhaps you've seen this error:

Version mismatch with VMCI driver: expecting 11, got 10.

I get this every time I upgrade a kernel, and this is how I fix it.
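A common approach (not necessarily the author's fix) is to have VMware Workstation rebuild its kernel modules against the new kernel:

  vmware-modconfig --console --install-all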

Moving /tmp to a tmpfs

There is a move afoot in Linux-land to make /tmp a tmpfs file-system. For those of you who don't know what that is, tmpfs is in essence a ramdisk. OpenSUSE is considering the ramifications of this possible move. There are some good points to this move:

  • A lot of what goes on in /tmp are ephemeral files for any number of system and user processes, and by backing that with RAM instead of disk you make that go faster.
  • Since a lot of what goes on in /tmp are ephemeral files, by backing it with RAM you save writes on that swanky new SSD you have.
  • Since nothing in /tmp should be preserved across reboots, a ramdisk makes sense.

All of the above make a lot of sense for something like a desktop-oriented distribution or use-case, much like the one I'm using right this moment.
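In practice the move boils down to an fstab line like this (the size cap is my example, not part of the proposal):

  tmpfs  /tmp  tmpfs  size=2G,mode=1777  0 0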

However, there are a couple of pretty large downsides to such a move:

  • Many programs use /tmp as a staging area for potentially-larger-than-RAM files.
  • Some programs don't clean up after themselves, which leaves /tmp growing over time.

Working as I do in the eDiscovery industry, where transforming files from one type to another is a common event, the libraries we use to do that transformation can and do drop very big files in /tmp while they're working. All it takes is one bozo dropping a 20MB file with .txt at the end (containing a single SMTP email message with a MIME'ed attachment) to generate an eight-thousand-page PDF file. Such a process behaves poorly when told "Out Of Space" for either RAM or /tmp.

And when such processes crash out for some reason, they can leave behind 8GB files in /tmp.

That would not be happy on a 2GB RAM system with a tmpfs style /tmp.

In our case, we need to continue to keep /tmp on disk. Now we have to start doing this on purpose, not just trusting that it comes that way out of the tin.
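A sketch of "doing it on purpose" in a systemd world (the fstab device is assumed):

  systemctl mask tmp.mount          # keep systemd from mounting a tmpfs over /tmp
  # or pin /tmp to a real filesystem in /etc/fstab:
  /dev/sdb1  /tmp  ext4  defaults  0 2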

Do I think this is a good idea? It has merit on high-RAM workstations, but is a very poor choice for RAM-constrained environments such as VPSes and virtual machines.

Are the benefits good enough to merit coming that way out of the box? Perhaps, though I would really rather that the "server" option during installation default to disk-backed /tmp.

Since I had some time this weekend and actually had my work laptop home, I decided to take the plunge and get OpenSUSE 12.1 onto it. I've been having some instability issues that trace to the kernel, so getting the newer kernel seemed like a good idea.

Oops.

The upgrade failed to take for two key reasons:

  1. VMWare Workstation doesn't yet work with the 3.1 kernel.
  2. Gnome 3 and my video card don't get along in a multi-monitor environment.

The first is solvable with a community patch that patches the VMware kernel modules to work with 3.1. The downside is that every time I launch a VM I get the "This is running a newer kernel than is known to work" message, and that gets annoying. It's also rather fragile, since I'd have to re-patch every time a kernel update comes down the pipe (not to mention breaking any time VMware releases a new version).

The second is where I spent most of my trouble-shooting time. I've been hacking on X since the days when you had to hand-roll your own config files, so I do know what I'm looking at in there. What I found:

  • When X was correctly configured to handle multi-monitors the way I want, the Gnome shell wouldn't load.
  • When the Gnome shell was told to work right, GDM wouldn't work.

Given those two, and how badly I need more screen real-estate at work, I'm now putting 11.4 back on the laptop.

Learning UEFI

This weekend I put together (mostly as it turned out) a new fileserver for home. The box that's currently pulling that duty is 9 years old and still working just fine.

However, 9 years old. I'm happily surprised that 40GB hard-drive is still working. Only 58,661 power-on hours, and all of 3 reallocated sectors. But still, 9 years old. There comes a time when hardware just needs to be retired before it goes bad in spectacular ways.

The system that is replacing this stalwart performer cost under $400 with shipping, which contrasts starkly with the sticker price on the old unit which was around $1400 back in the day. This system is based on the AMD E350 CPU. Underpowered, but this is a file-server and that ancient P4 is still keeping up. I have a Drobo that I'll be moving to this unit, which is where the bulk of the data I want to serve is being kept anyway.

As for the title of the post, this E350 mobo has a UEFI firmware on it. That's when things got complicated.

OpenSUSE and Java

I was just catching up on email when I ran into this interesting tidbit:

Up to now, openSUSE users had the choice of using openJDK (GPL with classpath exceptions) or Sun/Oracle's Java. The Sun/Oracle Java was licensed under the "Distributor's License for Java (DLJ)", which allowed Linux distributors to package and redistribute Sun/Oracle Java. Recently, Oracle announced that openJDK 7 is the new official reference implementation for Java SE7. They no longer see the need for the DLJ licensed Java implementation and so have retired that license.

openSUSE chooses to proceed with distributing the GPL licensed official reference implementation, openJDK. We will no longer distribute newer versions or updates of the now proprietary licensed Sun/Oracle JDK. Existing installations of the Sun/Oracle JDK are still licensed under the now retired DLJ.

openSUSE users who wish to continue using the Sun/Oracle JDK (including new versions thereof and updates) should now download directly from http://www.oracle.com/java. For now we keep the current sun-java packages (under the DLJ license) in the Java:sun:Factory project and will not update them anymore.

I suggest to document in the openSUSE wiki how to install the Sun/Oracle JDK version from Oracle under openSUSE.

Which is to say, Oracle is killing the license that allows OpenSUSE to provide Sun/Oracle Java as a part of the repo.

As a user of Java apps in the line of my work, this deeply annoys me. What few management Java apps work well on Linux (it's always a surprise when something does) seem to work best with Oracle Java, not OpenJDK.

It probably won't be just OpenSUSE affected by this.

The Linux boot process, a chart

The Linux boot process, as a branching chart. For the most part, the choice at one level doesn't depend on the column you picked at another: LILO does not require EFI, nor does GRUB require BIOS, and EFI and BIOS support are now included in both bootloaders. Yes, there is some simplification here. EFI is very feature-rich, but hasn't been commonly exercised to its full extent. Eventually EFI could possibly replace GRUB/LILO, but that day hasn't come yet.

Pre-Boot Environment

BIOS:
  • POST
  • Read bootable media
  • Load the Master Boot Record
  • Execute the MBR

EFI:
  • POST
  • Read bootable media
  • Load the GPT table
  • Mount the EFI system-partition
  • Load EFI drivers for the system
  • Execute the boot loader from the EFI system-partition

Bootstrap Environment

GRUB (v1):
  • Stage 1 is loaded into the MBR/EFI and executed by the BIOS/EFI
  • Stage 1.5 is loaded by Stage 1, including critical drivers
  • Stage 2, in the boot filesystem, executes
  • Stage 2 loads the kernel

GRUB (v2):
  • Stage 1 is loaded into the MBR/EFI and executed by the BIOS/EFI
  • Stage 1 loads the first sector of core.img
  • core.img continues loading itself
  • core.img loads the GRUB config
  • core.img loads the kernel

(E)LILO:
  • Stage 1 is loaded into the MBR (or into the EFI system-partition, by ELILO) and executed by the BIOS/EFI
  • Stage 1 loads the first cluster of Stage 2 and executes it
  • Stage 2 loads the LILO information
  • Stage 2 loads the kernel

Kernel Load
  • The kernel uncompresses into memory
  • If configured, the kernel mounts the Initial Ramdisk (initrd), which contains the modules needed to load the rest of the OS
  • Mounts the root filesystem, loading any needed modules from the initrd
  • Swaps / from the initrd to the actual root filesystem
  • Executes the specified init process

UID 1 process

Initd:
  • Checks /etc/inittab for loading procedures
  • Runs the scripts specified by inittab
    • Mounts needed filesystems
    • Loads needed modules
    • Starts needed services based on runlevel
    • Finishes setting up userspace

Systemd:
  • Reads /etc/systemd/system.conf
  • Mounts needed filesystems
  • Loads needed modules
  • Starts services as needed

Upstart:
  • Runs startup events listed in /etc/events.d based on runlevel
  • Loads needed modules
  • Mounts needed filesystems
  • Starts needed services

Launchd:
  • Reads /etc/launchd.conf for config details
  • Reads /etc/launchd.plist for per-driver/service details

I did this because Things have Changed from the last time I really studied this. Back when I started it was BIOS, LILO, and Initd. I never did bother to wrap my head around GRUB, mostly because the automatic tools have gotten a lot better, so knowing it just to install isn't needed, and I haven't had the right kind of boot problems where I'd learn it through troubleshooting. I've also yet to run into EFI in the wild (I think...). Now that OpenSUSE is actively considering moving over to systemd, and with Ubuntu having thrown initd over the side some time ago in favor of Upstart, it's time to revisit.

I'm still fuzzy on SystemD so that's probably wrong.

OpenSUSE and rolling updates

I somehow missed this when it first came out, but there is a move afoot to create a rolling-update version of OpenSUSE. The announcement is here:

http://lists.opensuse.org/opensuse-project/2010-11/msg00206.html

In the last couple of days the repo has been created and opensuse-factory has been awash in requests to add packages to it.

What IS this thing, anyway?

It's like Factory, but with the stable branches of the packages in use. For instance, if OpenSUSE 11.4 releases and a month later Gnome 3.0 drops, 11.4 will still be on whatever version it shipped with, but OpenSUSE Tumbleweed (the name of this new rolling-updates version) will get it. The same applies to kernel versions. 11.4 will likely have 2.6.37 in it, but 2.6.38 will drop pretty soon after 11.4 releases.
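Switching a system over is (roughly) a two-liner with zypper; the repo URL is deliberately left as a placeholder here, since the announcement above has the authoritative one:

  zypper ar --refresh <tumbleweed-repo-URL> Tumbleweed
  zypper dup --from Tumbleweed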

Is this suitable for production?

Depends on how much testing you want in your packages. In order from least to most tested-stable, the SUSE versions are:

  1. Factory (breaks regularly!)
  2. Factory-Tested
  3. Tumbleweed
  4. OpenSUSE releases
  5. SLES releases (long term support contracts available)

Factory-Tested is also pretty new. It's a version of Factory that passes certain very specific tests, so you're much less likely to brick your system by using it over Factory itself. It will have bugs though, just not as many blockers.

There are some use-cases where I'd be leery of even the OpenSUSE releases, just from code-quality and support considerations, and where Tumbleweed would be nearly certain to be rejected. And yet, if your use-case needs cutting-edge (not bleeding-edge) packages, Tumbleweed is the version for you.

Right now it looks like which packages get updated in Tumbleweed will be decided by the project maintainers. For the Gnome example, they have a Devel and a Stable branch in their repo, and it will be the Stable repo that gets included in Tumbleweed. Find a bug? It'll get reported to the repo-maintainer for that package. It may get fixed quicker there, or not. Tumbleweed users will help the OpenSUSE releases be more stable by providing testing.

Personally, I'll be sticking with the Releases version on my work desktop, since I need to maintain some stability there. I just might go Tumbleweed on my home laptop, though. 

New Laptop

I've been looking for a new laptop. I finally found one; Dell had a sale on their Studio 15 line. This is a Core i5 laptop with all the niftiness I wanted. I also purchased a new 320GB hard-drive for it. Why? So I could put the Dell-supplied one, with its pre-built Win7 install, into a baggie for later use if I so chose, and so I could get openSUSE onto it without mucking about with resizing partitions and all that crapola.

I did what I could to make sure the components were Linux compatible (Intel wireless not Dell, that kind of thing), but some things just don't work out. This is a brand new laptop with a brand new processor/chipset/GPU architecture, so I planned on having at least one thing require several hours of hacking to get working. This is the price you pay for desktop-linux on brand spanking new hardware. I, at least, am willing to pay it.

And pay I am. I installed OpenSUSE pretty simply, or at least it started that way. It got to the first reboot and got a black screen of nothingness. Watching POST, it was pretty clearly a kernel video-handling problem. Some quick googling identified the problem:

OpenSUSE 11.2 uses the 2.6.31 kernel. This laptop uses an Intel 4500MHD GPU, support for which was introduced in 2.6.32 and greatly refined in 2.6.33. What's more, it uses Kernel Mode Setting with the Direct Rendering Manager, support for which was introduced in 2.6.32. All this means that 2.6.31 simply can't support this GPU at anything like reasonable speeds.

OpenSUSE 11.3 (currently in a very buggy Milestone 3 release, soon to be M4) has a 2.6.33 series kernel. But I don't want to be buggy. So...

Time to compile a kernel!

Because I've done it before (a LOT), kernel compiles do not scare me. They take time, and generally require multiple runs to get right, so you just have to plan for the time to get it right. So I booted to the OpenSUSE 11.2 Rescue System, followed these instructions, and got into my (still half-installed) file-system. I plugged it into wired ethernet because that's hella easier to set up from the command-line, and used YaST to grab the kernel-dev environment. Then I downloaded 2.6.33 from kernel.org. I had to grab the /proc/config.gz from a working x86_64 system, so I pulled the one from my 11.2 install here at work, threw it into the kernel-source directory, ran 'make oldconfig', answered a bajillion questions, and ta-da. Then: make bzImage; make modules; make modules_install; make install; mkinitrd -k vmlinuz-2.6.33 -i initrd-2.6.33; a bit of YaST GRUB work to make certain GRUB was set up right; and reboot.
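Laid out as commands, that sequence looks roughly like this (the kernel-source path is assumed, and the borrowed config has to be un-gzipped into .config, a step the paragraph glosses over):

  cd /usr/src/linux-2.6.33
  zcat config.gz > .config        # the config.gz pulled from the working x86_64 11.2 box
  make oldconfig                  # answer a bajillion questions
  make bzImage && make modules
  make modules_install && make install
  mkinitrd -k vmlinuz-2.6.33 -i initrd-2.6.33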

Most of the way there. I had to add this line to /etc/modprobe.d/99-local:

options i915 modeset=1

Which got me enough graphics to finish the openSUSE install and get me to a desktop. I don't yet know how stable it is; I haven't had time to battle-test it. I probably need updated X.org software for full stability. I did get it up long enough this morning to find out that the wireless driver needs attention; dmesg suggested it had trouble loading firmware. So that's tonight's task.

Update: Getting the wireless to work involved downloading firmware for the 6000 from here, dropping the files into /lib/firmware, and rebooting. Dead easy. Now, Suspend doesn't work for some reason. That might be intractable, though.
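For the record, the firmware step looks roughly like this; the exact .ucode filename depends on the driver version, so treat it as illustrative:

  cp iwlwifi-6000-*.ucode /lib/firmware/
  modprobe -r iwlagn && modprobe iwlagn   # or just reboot, as above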