Tuesday, November 03, 2009

To ship or not to ship

The openSUSE project is attempting a vote to determine if 11.2 is baked enough to ship it now, or if it needs to slip.

https://features.opensuse.org/308284

If you have an opinion, go ahead and vote. Or just read the comments!

Yes, there are bugs. Perhaps a lot of them. If some of these are the type to break your install, start working on it!

Labels:


Thursday, October 15, 2009

Clearly I am missing something

On the opensuse-factory list this exchange has happened several times:

Q: Installation from LiveDVD is broken. Bug?

A: LiveCD's are not installation sources.

Clearly, something has changed in the Land of Linux Installers. Enough mind-share has shifted to "I install my linux with my LiveDVD" that it has become a very common question on the factory list when it doesn't work. When I was a kid, we installed our linux from an Install CD. LiveCD's were for things like Knoppix, used for ass-saving or quick access to Linux tools that don't exist on Windows. I seem to remember a way to install Knoppix to a hard-drive, but I never did so.

When did this change? Is this something Ubuntu is doing?

Labels:


Wednesday, September 30, 2009

I have a degree in this stuff

I have a CompSci degree. This qualified me for two things:
  • A career in academics
  • A career in programming
You'll note that Systems Administration is not on that list. My degree has helped my career by getting me past the "4 year degree in a related field" requirement of jobs like mine. An MIS degree would be more appropriate, but there were very few of those back when I graduated. It has indirectly helped me in troubleshooting, as I have a much better foundation about how the internals work than your average computer mechanic.

Anyway. Every so often I stumble across something that causes me to go Ooo! ooo! over the sheer computer science of it. Yesterday I stumbled across Barrelfish, and this paper. If I weren't sick today I'd have finished it, but even as far as I've gotten into it I can see the implications of what they're trying to do.

The core concept behind the Barrelfish operating system is to assume that each computing core does not share memory and has access to some kind of message passing architecture. This has the side effect of having each computing core running its own kernel, which is why they're calling Barrelfish a 'multikernel operating system'. In essence, they're treating the insides of your computer like the distributed network that it is, and using already existing distributed computing methods to improve it. The type of multi-core we're doing now, SMP, ccNUMA, uses shared memory techniques rather than message passing, and it seems that this doesn't scale as far as message passing does once core counts go higher.

They go into a lot more detail in the paper about why this is. A big one is hetergenaity of CPU architectures out there in the marketplace, and they're not just talking just AMD vs Intel vs CUDA, this is also Core vs Core2 vs Nehalem. This heterogenaity in the marketplace makes it very hard for a traditional Operating System to be optimized for a specific platform.

A multikernel OS would use a discrete kernel for each microarcitecture. These kernels would communicate with each other using OS-standardized message passing protocols. On top of these microkernels would be created the abstraction called an Operating System upon which applications would run. Due to the modularity at the base of it, it would take much less effort to provide an optimized microkernel for a new microarcitecture.

The use of message passing is very interesting to me. Back in college, parallel computing was my main focus. I ended up not pursuing that area of study in large part because I was a strictly C student in math, parallel computing was a largely academic endeavor when I graduated, and you needed to be at least a B student in math to hack it in grad school. It still fired my imagination, and there was squee when the Pentium Pro was released and you could do 2 CPU multiprocessing.

In my Databases class, we were tasked with creating a database-like thingy in code and to write a paper on it. It was up to us what we did with it. Having just finished my Parallel Computing class, I decided to investigate distributed databases. So I exercised the PVM extensions we had on our compilers thanks to that class. I then used the six Unix machines I had access to at the time to create a 6-node distributed database. I used statically defined tables and queries since I didn't have time to build a table parser or query processor and needed to get it working so I could do some tests on how optimization of table positioning impacted performance.

Looking back on it 14 years later (eek) I can see some serious faults about my implementation. But then, I've spent the last... 12 years working with a distributed database in the form of Novell's NDS and later eDirectory. At the time I was doing this project, Novell was actively developing the first version of NDS. They had some problems with their implementation too.

My results were decidedly inconclusive. There was a noise factor in my data that I was not able to isolate and managed to drown out what differences there were between my optimized and non-optimized runs (in hindsight I needed larger tables by an order of magnitude or more). My analysis paper was largely an admission of failure. So when I got an A on the project I was confused enough I went to the professor and asked how this was possible. His response?
"Once I realized you got it working at all, that's when you earned the A. At that point the paper didn't matter."
Dude. PVM is a message passing architecture, like most distributed systems. So yes, distributed systems are my thing. And they're talking about doing this on the motherboard! How cool is that?

Both Linux and Windows are adopting more message-passing architectures in their internal structures, as they scale better on highly parallel systems. In Linux this involved reducing the use of the Big Kernel Lock in anything possible, as invoking the BKL forces the kernel into single-threaded mode and that's not a good thing with, say, 16 cores. Windows 7 involves similar improvements. As more and more cores sneak into everyday computers, this becomes more of a problem.

An operating system working without the assumption of shared memory is a very different critter. Operating state has to be replicated to each core to facilitate correct functioning, you can't rely on a common memory address to handle this. It seems that the form of this state is key to performance, and is very sensitive to microarchitecture changes. What was good on a P4, may suck a lot on a Phenom II. The use of a per-core kernel allows the optimal structure to be used on each core, with changes replicated rather than shared which improves performance. More importantly, it'll still be performant 5 years after release assuming regular per-core kernel updates.

You'd also be able to use the 1.75GB of GDDR3 on your GeForce 295 as part of the operating system if you really wanted to! And some might.

I'd burble further, but I'm sick so not thinking straight. Definitely food for thought!

Labels: , , , , ,


Wednesday, September 02, 2009

A clustered file-system, for windows?

Yesterday I ran into this:

http://blogs.msdn.com/clustering/archive/2009/03/02/9453288.aspx

On the surface it looks like NTFS behaving like OCFS. But Microsoft has a warning on this page:
In Windows Server® 2008 R2, the Cluster Shared Volumes feature included in failover clustering is only supported for use with the Hyper-V server role. The creation, reproduction, and storage of files on Cluster Shared Volumes that were not created for the Hyper-V role, including any user or application data stored under the ClusterStorage folder of the system drive on every node, are not supported and may result in unpredictable behavior, including data corruption or data loss on these shared volumes. Only files that are created for the Hyper-V role can be stored on Cluster Shared Volumes. An example of a file type that is created for the Hyper-V role is a Virtual Hard Disk (VHD) file.

Before installing any software utility that might access files stored on Cluster Shared Volumes (for example, an antivirus or backup solution), review the documentation or check with the vendor to verify that the application or utility is compatible with Cluster Shared Volumes.
So unlike OCFS2, this multi-mount NTFS is only for VM's and not for general file-serving. In theory you could use this in combination with Network Load Balancing to create a high-availability cluster with even higher uptime than failover clusters already provide. The devil is in the details though, and Microsoft aludes to them.

A file system being used for Hyper-V isn't a complex locking environment. You'll have as many locks as there are VHD files, and they won't change often. Contrast this with a file-server where you can have thousands of locks that change by the second. Additionally, unless you disable Opportunistic Locking you are at grave risk of corrupting files used by more than one user (Acess databases!) if you are using the multi-mount NTFS.

Microsoft will have to promote awareness of this type of file-system into the SMB layer before this can be used for file-sharing. SMB has its own lock layer, and this will have to coordinate the SMB layers in the other nodes for it to work right. This may never happen, we'll see.

Labels: , ,


Tuesday, September 01, 2009

Pushing a feature

One of the things I have missed when Novell went from SLE9 to SLE10 was the lack of a machine name in the title-bar for YaST. It used to look like this:

The old YaST titlebar

With that handy "@[machinename]" in it. These days it is much less informative.

The new YaST titlebar

If you're using SSH X-forwarding to manage remote servers, it is entirely possible you'll have multiple YaST windows open. How can you tell them apart? Back in the 9 days it was simple, the window told you. Since 10, the marker has gone away. This hasn't changed in 11.2 either. I would like this changed so I put in a FATE request!

If you'd also like this changed, feel free to vote up feature 306852! You'll need a novell.com login to vote (the opensuse.org site uses the same auth back end so if you have one there you have one on OpenFATE).

Thank you!

Labels: , ,


Monday, August 24, 2009

Trying KDE

In light of the announcement that openSUSE 11.2 and later will have KDE as the default desktop, I installed milestone 6 of 11.2 (barely beta quality, but much improved from earlier Milestones) and had a look at KDE for the first time since, oh, SLE9. I should further note that the Slackware servers I've run at home since college have had GNOME by preference pretty much since Gnome came available.

My first reaction? It probably matched what a lot of long-time Windows users had when they saw Vista for the first time and wanted to do something that wasn't run a web browser:
Dear God, I can't find anything, and none of my hotkeys work. THIS SUCKS!
In time I calmed down. I installed gnome, and lo it was good. I also noted certain settings that I needed on the KDE side. I switched back and explored some more. Found out where they kept the hotkeys. Found out where the sub-pixel hinting settings were hiding. This made my fingers and eyeballs happier.

Now that I've tried it for a while, in fact I'm posting from it right now, it's not that bad. Another desktop-dialect I can learn. I've gotten over that initial clueless flailing and have grasped the beginnings of the basic metaphor of KDE. I still prefer the Gnome side, but we'll see where I go in the future.

Labels: ,


Friday, August 07, 2009

Why aren't schools using open-source?

From http://blogs.techrepublic.com.com/opensource/?p=811

This article was primarily aimed at K-12, which is a much different environment than higher ed. For one, the budgets are a lot smaller per-pupil. However, some of the questions do apply to us as well.

As it happens, part of our mandate is to prepare our students for the Real World (tm). And until very recently, Real World meant MS Office. We've been installing Open Office along side MS Office on our lab images for some time, and according to our lab managers they've seen a significant increase on OO usage. I'm sure part of this is due to the big interface change Microsoft pushed with Office 2007, but this may also be reflective of a shift in mind-share on the part of our incoming students. Parallel installs just work, so long as you have the disk space and CPU power it is very easy to set up.

Our choice of lab OS image has many complexities, not the least of which is a lack of certain applications. There are certain applications, of which Adobe Photoshop is but one, that don't have Linux versions yet. Because of this, Windows will remain.

We could do something like allow dual-boot workstations, or have a certain percentage of each lab as Linux stations. Hard drive sizes are big enough these days that we could dual-boot like that and still allow local-partition disk-imaging, and it would allow the student a choice in environments they can work in. Now that we're moving to a Windows environment, that actually better enables interoperability (samba). Novell's NCP client for Linux was iffy performance-wise, and we had political issues surrounding CIFS usage.

However... one of the obstacles in this is the lack of Linux workstation experience on the part of our lab managers. Running lab workstations is a constant cat and mouse game between students trying to do what they want, malware attempting to sneak in, and the manager attempting to keep a clean environment. You really want your lab-manager to be good at defensive desktop management, and that skill-set is very operating system dependent. Thus the reluctance regarding wide deployment of Linux in our labs.

Each professor can help urge OSS usage by not mandating file formats for homework submissions. The University as a whole can help urge it through retraining ITS staff in linux management, not just literacy. Certain faculty can promote it in their own classes, which some already do. But then, we have the budget flexibility to dual stack if we really want to.

Labels: ,


Tuesday, July 21, 2009

Digesting Novell financials

It's a perennial question, "why would anyone use Novell any more?" Typically coming from people who only know Novell as "That NetWare company," or perhaps, "the company that we replaced with Exchange." These are the same people who are convinced Novell is a dying company who just doesn't know it yet.

Yeah, well. Wrong. Novell managed to turn the corner and wean themselves off of the NetWare cash-cow. Take the last quarterly statement, which you can read in full glory here. I'm going to excerpt some bits, but it'll get long. First off, their description of their market segments. I'll try to include relevant products where I know them.

We are organized into four business unit segments, which are Open Platform Solutions, Identity and Security Management, Systems and Resource Management, and Workgroup. Below is a brief update on the revenue results for the second quarter and first six months of fiscal 2009 for each of our business unit segments:



Within our Open Platform Solutions business unit segment, Linux and open source products remain an important growth business. We are using our Open Platform Solutions business segment as a platform for acquiring new customers to which we can sell our other complementary cross-platform identity and management products and services. Revenue from our Linux Platform Products category within our Open Platform Solutions business unit segment increased 25% in the second quarter of fiscal 2009 compared to the prior year period. This product revenue increase was partially offset by lower services revenue of 11%, such that total revenue from our Open Platform Solutions business unit segment increased 18% in the second quarter of fiscal 2009 compared to the prior year period.

Revenue from our Linux Platform Products category within our Open Platform Solutions business unit segment increased 24% in the first six months of fiscal 2009 compared to the prior year period. This product revenue increase was partially offset by lower services revenue of 17%, such that total revenue from our Open Platform Solutions business unit segment increased 15% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: SLES/SLED]



Our Identity and Security Management business unit segment offers products that we believe deliver a complete, integrated solution in the areas of security, compliance, and governance issues. Within this segment, revenue from our Identity, Access and Compliance Management products increased 2% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 45%, such that total revenue from our Identity and Security Management business unit segment decreased 16% in the second quarter of fiscal 2009 compared to the prior year period.

Revenue from our Identity, Access and Compliance Management products decreased 3% in the first six months of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 40%, such that total revenue from our Identity and Security Management business unit segment decreased 18% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: IDM, Sentinal, ZenNAC, ZenEndPointSecurity]



Our Systems and Resource Management business unit segment strategy is to provide a complete “desktop to data center” offering, with virtualization for both Linux and mixed-source environments. Systems and Resource Management product revenue decreased 2% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 10%, such that total revenue from our Systems and Resource Management business unit segment decreased 3% in the second quarter of fiscal 2009 compared to the prior year period. In the second quarter of fiscal 2009, total business unit segment revenue was higher by 8%, compared to the prior year period, as a result of our acquisitions of Managed Object Solutions, Inc. (“Managed Objects”) which we acquired on November 13, 2008 and PlateSpin Ltd. (“PlateSpin”) which we acquired on March 26, 2008.

Systems and Resource Management product revenue increased 3% in the first six months of fiscal 2009 compared to the prior year period. The total product revenue increase was partially offset by lower services revenue of 14% in the first six months of fiscal 2009 compared to the prior year period. Total revenue from our Systems and Resource Management business unit segment increased 1% in the first six months of fiscal 2009 compared to the prior year period. In the first six months of fiscal 2009 total business unit segment revenue was higher by 12% compared to the prior year period as a result of our Managed Objects and PlateSpin acquisitions.

[sysadmin1138: Products include: The rest of the ZEN suite, PlateSpin]



Our Workgroup business unit segment is an important source of cash flow and provides us with the potential opportunity to sell additional products and services. Our revenue from Workgroup products decreased 14% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 39%, such that total revenue from our Workgroup business unit segment decreased 17% in the second quarter of fiscal 2009 compared to the prior year period.

Our revenue from Workgroup products decreased 12% in the first six months of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 39%, such that total revenue from our Workgroup business unit segment decreased 15% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: Open Enterprise Server, GroupWise, Novell Teaming+Conferencing,

The reduction in 'services' revenue is, I believe, a reflection in a decreased willingness for companies to pay Novell for consulting services. Also, Novell has changed how they advertise their consulting services which seems to also have had an impact. That's the economy for you. The raw numbers:


Three months ended


April 30, 2009

April 30, 2008

(In thousands)


Net revenue
Gross
profit


Operating
income (loss)


Net revenue
Gross
profit


Operating
income (loss)

Open Platform Solutions


$ 44,112
$ 34,756

$ 21,451

$ 37,516
$ 26,702

$ 12,191

Identity and Security Management



38,846

27,559


18,306


46,299

24,226


12,920

Systems and Resource Management



45,354

37,522


26,562


46,769

39,356


30,503

Workgroup



87,283

73,882


65,137


105,082

87,101


77,849

Common unallocated operating costs





(3,406 )

(113,832 )



(2,186 )

(131,796 )























Total per statements of operations


$ 215,595
$ 170,313

$ 17,624

$ 235,666
$ 175,199

$ 1,667



























Six months ended


April 30, 2009

April 30, 2008

(In thousands)


Net revenue
Gross
profit


Operating
income (loss)


Net revenue
Gross
profit


Operating
income (loss)

Open Platform Solutions


$ 85,574
$ 68,525

$ 40,921

$ 74,315
$ 52,491

$ 24,059

Identity and Security Management



76,832

52,951


35,362


93,329

52,081


29,316

Systems and Resource Management



90,757

74,789


52,490


90,108

74,847


58,176

Workgroup



177,303

149,093


131,435


208,840

173,440


155,655

Common unallocated operating costs





(7,071 )

(228,940 )



(4,675 )

(257,058 )























Total per statements of operations


$ 430,466
$ 338,287

$ 31,268

$ 466,592
$ 348,184

$ 10,148

So, yes. Novell is making money, even in this economy. Not lots, but at least they're in the black. Their biggest growth area is Linux, which is making up for deficits in other areas of the company. Especially the sinking 'Workgroup' area. Once upon a time, "Workgroup," constituted over 90% of Novell revenue.
Revenue from our Workgroup segment decreased in the first six months of fiscal 2009 compared to the prior year period primarily from lower combined OES and NetWare-related revenue of $13.7 million, lower services revenue of $10.5 million and lower Collaboration product revenue of $6.3 million. Invoicing for the combined OES and NetWare-related products decreased 25% in the first six months of fiscal 2009 compared to the prior year period. Product invoicing for the Workgroup segment decreased 21% in the first six months of fiscal 2009 compared to the prior year period.
Which is to say, companies dropping OES/NetWare constituted the large majority of the losses in the Workgroup segment. Yet that loss was almost wholly made up by gains in other areas. So yes, Novell has turned the corner.

Another thing to note in the section about Linux:
The invoicing decrease in the first six months of 2009 reflects the results of the first quarter of fiscal 2009 when we did not sign any large deals, many of which have historically been fulfilled by SUSE Linux Enterprise Server (“SLES”) certificates delivered through Microsoft.
Which is pretty clear evidence that Microsoft is driving a lot of Novell's Operating System sales these days. That's quite a reversal, and a sign that Microsoft is officially more comfortable with this Linux thing.

Labels: , , , , , , , ,


Wednesday, July 08, 2009

Google and Microsoft square off

As has been hard to avoid lately, Google has announced that it's releasing an actual operating system. And for hardware you can build yourself, not just on your phone. Some think this is the battle of the titans we've been expecting for years. I'm... not so sure of that.

What the new Google OS, called Chrome to make it confusing, is under the hood is Linux. They created, "a new windowing system on top of a Linux kernel." This might be a replacement for X-Windows, or it could just be a replacement for Gnome/KDE.

To my mind, this isn't the battle of the titans some think it is. Linux on the net-top has been around for some time. Long enough for a separate distribution to gain some traction (hello Moblin). What google brings to the party that moblin does not is its name. That alone will drive up adoption, regardless of how nice the user experience ends up being.

And finally, this is a distribution aimed at the cheapest (but admittedly fastest growing) segment of the PC market: sub-$500 laptops. Yes, Chrome could further chip away at the Microsoft desktop lock-in, but so far I have yet to see anything about Chrome that could actually do something significant about that. Chrome is far more likely to chip away at the Linux market-share than it is the Windows market-share, since it shares an ecosystem with Linux.

Microsoft is not quaking in its boots about this anouncement. With the release of Android, it was pretty clear that a move like this was very likely. Microsoft itself has admitted that it needs to do better in the, "slow but cheap," hardware space. They're already trying to compete in this space. Chrome will be another salvo from Google, but it won't make a hole below the water-line.

Labels: , ,


Wednesday, June 24, 2009

Why ext4 is very interesting

The new ext4 file-system is one I'm very interested in, especially since the state of reiserfs in the Linux community's gestalt is so in flux. Unlike ext3, it does a lot of things nicer:
  • Large directories. It can now handle very large directories. One source says unlimited directory sizes, and another less reliable source says sizes up to 64,000 files. This is primo for, say, a mailer. GroupWise would be able to run on this, if one was doing that.
  • Extent-mapped files. More of a feature for file-systems that contain a lot of big files rather than tens of millions of 4k small files. This may be an optional feature. Even so, it replicates an XFS feature which makes ext4 a top contender for the MythTV and other linux-based PVR products. Even SD video can be 2GB/hour, HD can be a lot larger.
  • Delayed allocation. Somewhat controversial, but if you have very high faith in your power environment, it can make for some nice performance gains. It allows the system to hold off on block allocation until a process either finishes writing, or triggers a dirty-cache limit, which further allows a decrease in potential file-frag and helps optimize disk I/O to some extent. Also, for highly transient files, such as mail queue files, it may allow some files to never actually hit disk between write and erase, which further increases performance. On the down side, in case of sudden power loss, delayed writes are simply lost completely.
  • Persistent pre-allocation. This is a feature XFS has that allows a process to block out a defined size of disk without actually writing to it. For a PVR application, the app could pre-allocate a 1GB hunk of disk at the start of recording and be assured that it would be as contiguous as possible. Certain bittorrent clients 'preallocate' space for downloads, though some fake it by writing 4.4GB of zeros and overwriting the zeros with the true data as it comes down. This would allow such clients to truly pre-allocate space in a much more performant way.
  • Online defrag. On traditional rotational media, and especially when there isn't a big chunk of storage virtualization between the server and the disk such as that provided by a modern storage array, this is much nicer. This is something else that XFS had, and ext is now picking up. Going back to the PVR again, if that recording pre-allocated 1GB of space and the recording is 2 hours long, at 2GB/hour it'll need a total of 4 GB. In theory each 1GB chunk could be on a completely diffrent part of the drive. An online defrag allows that file to be made contiguous without file-system downtime.
    • SSD: A bad idea for SSD media, where you want to conserve your writes, and doesn't suffer a performance penalty for frag. Hopefully, the efstools used to make the filesystem will be smart enough to detect SSDs and set features appropriately.
    • Storage Arrays: Not a good idea for extensive disk arrays like the HP EVA line. These arrays virtualize storage I/O to such an extent that file fragmentation doesn't cause the same performance problems as would a simple 6-disk RAID5 array would experience.
  • Higher resolution time-stamps. This doesn't yet have much support anywhere, but it allows timestamps with nanosecond resolution. This may sound like a strange thing to put into a 'nifty features' list, but this does make me go "ooo!".

Also, as I mentioned before, it looks like openSUSE 11.2 will make ext4 the default filesystem.

Labels: , ,


Wednesday, June 17, 2009

Editing powershell scripts in VIM

Vim. The vi replacement on opensuse, has a downloadable PowerShell syntax filter.

vim with syntax coloring for powershell

The fact that powershell scripting has gotten enough traction that someone felt the need to code up a powershell syntax file for vim is very interesting. I'm perfectly willing to reap the rewards.

Labels: ,


Monday, June 15, 2009

openSUSE 11.2 milestone 2

I'm doing some testing of openSUSE. Pitching in, getting my open-source groove on. One of the new things is decidedly minor, but it makes good eye-candy. They have a new title font!

New title font

See?

Also, as of milestone 2, ext4 is the default file-system.

Labels: ,


Friday, June 12, 2009

Explaining LDAP.

The question was asked recently...

"How would you explain LDAP to a sysadmin who'd've heard of it, but not interacted with it before."

My first reaction illustrates my own biases rather well. "How could they NOT have heard of it before??" goes the rant in my head. Active Directory, choice of enterprise Windows deployments everywhere, includes both X500 and LDAP. Anyone doing unified authentication on Linux servers is using either LDAP or WinBind, which also uses LDAP. It seems that any PHP application doing authentication probably has an LDAP back end to it1. So it seems somewhat disingenuous to suppose a sysadmin who didn't know what LDAP was and could do.

But then, I remind myself, I've been playing with X500 directories since 1996 so LDAP was just another view on the same problem to me. Almost as easy as breathing. This proposed sysadmin probably has been working in a smaller shop. Perhaps with a bunch of Windows machines in a Workgroup, or a pack of Linux application servers that users don't generally log in to. It IS possible to be in IT and not run into LDAP. This is what makes this particular question an interesting challenge, since I've been doing it long enough that the definition is no longer ready to my head. Unfortunately for the person I'm about to info-dump upon, I get wordy.

LDAP.... Lightweight Directory Access Protocol. It came into existence as a way to standardize TCP access to X500 directories. X500 is a model of directory service that things like Active Directory and Novell eDirectory/NDS implemented. Since X500 was designed in the 1980's it is, perhaps, unwarrantedly complex (think ISDN), and LDAP was a way to simplify some of that complexity. Hence the 'lightweight".

LDAP, and X500 before it, are hierarchical in organization, but doesn't have to be. Objects in the database can be organized into a variety of containers, or just one big flat blob of objects. That's part of the flexibility of these systems. Container types vary between directory implementation, and can be an Organizational Unit (OU=), a DNS domain (DC=), or even a Cannonical Name (CN=), if not more. The name of an object is called the Distinguished Name (DN), and is composed of all the containers up to root. An example would be:

CN=Fred,OU=Users,DC=organization,DC=edu

This would be the object called Fred, in the 'Users' Organizational Unit, which is contained in the organization.edu domain.

Each directory has a list of classes and attributes allowable in the directory, and this is called a Schema. Objects have to belong to at least one class, and can belong to many. Belonging to a class grants the object the ability to define specific attributes, some of which are mandatory similar to Primary Keys in database tables.

Fred is a member of the User class, which itself inherits from the Top class. The Top class is the class that all other classes inherit from, as it defines the bare minimum attributes needed to define an object in the database. The User class can then define additional attributes that are distinct to the class, such as "first name", "password", and "groupMembership".

The LDAP protocol additionally defines a syntax for searching the directory for information. The return format is also defined. Lets look at the case of an authentication service such as a web-page or Linux login.

A user types in "fred" at the Login prompt of an SSH login to a linux server. The linux server then asks for a password, which the user provides. The Pam back-end then queries the LDAP server for objects of class User named "fred", and gets one, located at CN=Fred,OU=Users,DC=organization,DC=edu. It then queries the LDAP server for objects of class Group that are named CN=LinuxServerAccess,OU=Servers,DC=Organization,DC=EDU, and pulls the membership attributes. It finds that Fred is in this group, and therefore allowed to log in to that server. It then makes a third connection to the LDAP server and attempts to authenticate as Fred, with the password Fred supplied at the SSH login. Since Fred did not fat finger his password, the LDAP server allows the authenticated login. The Linux server detects the successfull login, and logs out of LDAP, finally permiting Fred to log in by way of SSH.

As I said before, these databases can be organized any which way. Some are organized based on the Organizational Chart of the organization, not with all the users in one big pile like the above example. In that case, Fred's distinguished-name could be as long as:

CN=Fred,ou=stuhelpdesk,ou=DesktopSupport,ou=InfoTechSvcs,ou=AcadAffairs,dc=organization,dc=edu

How to organize the directory is up to the implementers, and is not included in the standards.

The higher performing LDAP systems, such as the systems that can scale to 500,000 objects or higher, tend to index their databases in much the same way that relational databases do. This greatly speeds up common searches. Searching for an object's location is often one of the fastest searches an LDAP system can perform. Because of this LDAP very frequently is the back end for authentication systems.

LDAP is in many ways a speciallized identity database. If done right, on identical hardware it can easilly outperform even relational-databases in returning results.

Any questions?
Yeah, I get wordy.

1: Yes, this is wrong. MySQL contains a lot, if not most, of these PHP-application logins on the greater internet. I said I had my biases, right?

Labels: , ,


Tuesday, April 28, 2009

LinuxFest Northwest: gender imbalance

I went to LinuxFest Northwest this weekend. It was interesting! OpenSUSE, Ubuntu, Fedora, and FreeBSD were all there and passing out CDs. I learned stuff.

In one of the sessions I went to, "Participate or Die!" the question was asked of the presenter, a Fedora guy, about whether he is seeing any change in the gender imbalance at Linux events. He hemmed and hawed and said ultimately, 'not really'. I've been thinking about that myself, as I've noticed a similar thing at BrainShare.

Looking at what I've seen in amongst my friends, the women ARE out there. They're just not well represented in the ranks of the code-monkeys. Among closed source shops, I see women many places.

I have known several women involved with technical writing and user-factors. In fact, I don't know any men involved in these roles. Amongst all but the largest and well funded of open-source projects, tech-writing is largely done by the programmers themselves or by the end-user community on a Wiki. Except for the Wiki, the same can be said for interface design. As the tech-writers I know lament, programmers do a half-assed job of doc, and write for other programmers not somewhat clueless end-users. At the same time, the UI choices of some projects can be downright tragic for all but fellow code-monkeys. This is the reason the large pay-for-it closed-source software development firms employ dedicated Technical Writers with actual English degrees (you hope) to produce their doc.

I've also known some women involved with QA. In closed-source shops QA is largely done in house, and there may be a NDA-covered beta among trusted customers towards the end of the dev-cycle. In the land of small-project open-source, QA is again done by developers and maybe a small cadre of bug-finders. The fervent hope expectation is that bug-finders will occasionally submit a patch to fix the bug as well.

I also know women involved with defining the spec for software. Generally this is for internal clients, but the same applies elsewhere. These are the women who meet with internal customers to figure out what they need, and write it up as a feature list that developers can code against. These women also frequently are the liaison between the programmers and the requesting unit. In the land of open-source, the spec is typically generated in the head a programmer who has a problem that needs solving, who then goes out to solve it, and ultimately publishes the problem-resolution as an open source project in case anyone else wants that problem solved too.

All of this underlines one of the key problems of Linux that they've been trying to shed of late. For years Linux was made by coders, for coders, and it certainly looked and behaved like that. It has only been in recent years when a concerted effort has been made to try and make Linux Desktops look and feel comprehensible to tech-shy non-nerds. Ubuntu's success comes in large part due to these efforts, and Novell has spent a lot of time doing the same through user-factors studies.

Taking a step back, I see women in every step of the closed-source software development process, though they are underrepresented in the ranks of the code-monkeys. The open-source dev-process, on the other hand, is almost all about the code-monkey in all but the largest projects. Therefore it is no surprise that women are significantly absent at Linux-conventions.

I've read a lot of press about the struggles Computer Science departments have in attracting women to their programs. At the same time, Linux as an ecosystem has a hard time attracting women as active devs. Once more women start getting degreed, or not scared off in the formative teen years which is something Linux et. al. can help with, we'll see more of them among the code-slingers.

Something else that might help would be to tweak the CompSci course offerings to perhaps partner with the English departments or Business departments to produce courses aimed at the hard problem of translating user requirements into geek, and geek into manuals. Because the Software Engineering process involves more than just writing code in teams, it involves:
  • Building the spec
  • Working to the spec
  • Testing the produced code against the spec
  • Writing the How-To manual against the deliverable code
  • Delivery
This is an interdisciplinary process involving more than just programmers. The programmers need to be familiar with each step, of course, but it would be better if other people gave the same focus to those steps as the programmers have to focus on producing the code. Doing this in class might just inspire more non-programmers to participate on projects in such key areas as helping guide the UI, writing how-tos and man-pages, and creatively torturing software to make bugs squeal. And maybe even inspire some to give programming a try, some who never really looked at it before due to unfortunate cultural blinders. That would REALLY help.

Labels: ,


Wednesday, April 22, 2009

A new version of BIND

I saw on the SANS log today that the ISC is starting work on BIND10. A list of the new stuff can be found here. A couple of those items are very interesting to me. Specifically the Modularity and Clustering items.

Modularity:
...the selection of a variety of back-ends for data storage, be it the current in-memory database, a traditional SQL-based server, an embedded database engine or back-ends for specific applications such as a high performance, pre-compiled answer database.
Which makes me think of eDirectory backed DNS. Novell has had this for ages with NetWare, and from what I recall it was based on BIND. But... BIND8. BIND10 would formalize this in the linux base, which would further allow Novell to publish a more 'pure' eDir-integrated BIND.

Clustering:
run on multiple but related systems simultaneously, using a pluggable, open-source architecture to enable backbone communications between individual members of the cluster. These coordination services would enable a server farm to maintain consistency and coherence.
This is exactly what AD-integrated DNS and the DNS on NetWare has been doing for over 8 years now. Glad to see BIND catch up.

The big thing about using a database of some kind as the back-end for DNS is that you no longer have to create Secondary servers and muck about with Zone Transfers. For domains that change on a second by second basis, such as an AD DNS domain with dynamic updates enabled and thousands of computers during morning power-on, it is entirely possible for a BIND secondary-server to be missing many, many DNS updates. Microsoft has known about this issue, which is why they have their own directory-integrated DNS service.

This also shows just how creaky the NetWare DNS service really is. That's based on BIND8 code, which is now over 10 years old. Very creaky.

I'm looking forward to BIND10. It is a needed update that addresses DNS as it is done today, and would better enable BIND to handle large Active Directory domains.

Labels: , , , ,


Thursday, March 19, 2009

Evolution and Exchange 2007

The question came up today, so I googled around. Turns out the MAPI plugin for Evolution is out there and can be installed on most OpenSUSE builds.

http://download.opensuse.org/repositories/GNOME://Evolution://mapi/

However, it required Samba 4. Presumably for the RPC interface. So if you're not willing to upgrade your Samba, then you can't use it. Still, nice to see it out there!

Labels: ,


Thursday, February 26, 2009

SLES11 will be out soon.

For Novell has posted a preview release of SLES11. It says "RC4", which I suspect means we're within a month or so of release. This release is just in time for BrainShare 2009, were it actually happening. SLES11 would have been the major message of BS09.

Labels: , ,


Wednesday, February 11, 2009

High availability

64-bit OES provides some options to highly available file serving. Now that we've split the non-file services out of the main 6-node cluster, all that cluster is doing is NCP and some trivial other things. What kinds of things could we do with this should we get a pile of money to do whatever we want?

Disclaimer: Due to the budget crisis, it is very possible we will not be able to replace the cluster nodes when they turn 5 years old. It may be easier to justify eating the greatly increased support expenses. Won't know until we try and replace them. This is a pure fantasy exercise as a result.

The stats of the 6-node cluster are impressive:
  • 12 P4 cores, with an average of 3GHz per core (36GHz).
  • A total of 24GB of RAM
  • About 7TB of active data
The interesting thing is that you can get a similar server these days:
  • HP ProLiant DL580 (4 CPU sockets)
  • 4x Quad Core Xeon E7330 Processors (2.40GHz per core, 38.4GHz total)
  • 24 GB of RAM
  • The usual trimmings
  • Total cost: No more than $16,547 for us
With OES2 running in 64-bit mode, this monolithic server could handle what six 32-bit nodes are handling right now. The above is just a server that matches the stats of the existing cluster. If I were to really replace the 6 node cluster with a single device I would make a few changes to the above. Such as moving to 32GB of RAM at minimum, and using a 2-socket server instead of a 4-socket server; 8 cores should be plenty for a pure file-server this big.

A single server does have a few things to recommend it. By doing away with the virtual servers, all of the NCP volumes would be hosted on the same server. Right now each virtual-server/volume pair causes a new connection to each cluster node. Right now if I fail all the volumes to the same cluster node, that cluster node will legitimately have on the order of 15,000 concurrent connections. If I were to move all the volumes to a single server itself, the concurrent connection count would drop to only ~2500.

Doing that would also make one of the chief annoyances of the Vista Client for Novell much less annoying. Due to name cache expiration, if you don't look at Windows Explorer or that file dialog in the Vista client once every 10 minutes, it'll take a freaking-long time to open that window when you do. This is because the Vista client has to enumerate/resolve the addresses of each mapped drive. Because of our cluster, each user gets no less than 6 drive mappings to 6 different virtual servers. Since it takes Vista 30-60 seconds per NCP mapping to figure out the address (it has to try Windows resolution methods before going to Novell resolution methods, and unlike WinXP there is no way to reverse that order), this means a 3-5 minute pause before Windows Explorer opens.

By putting all of our volumes on the same server, it'd only pause 30-60 seconds. Still not great, but far better.

However, putting everything on a single server is not what you call "highly available". OES2 is a lot more stable now, but it still isn't to the legendary stability of NetWare 3. Heck, NetWare 6.5 isn't at that legendary stability either. Rebooting for patches takes everything down for minutes at a time. Not viable.

With a server this beefy it is quite doable to do a cluster-in-a-box by way of Xen. Lay a base of SLES10-Sp2 on it, run the Xen kernel, and create four VMs for NCS cluster nodes. Give each 64-bit VM 7.75GB of RAM for file-caching, and bam! Cluster-in-a-box, and highly available.

However, this is a pure fantasy solution, so chances are real good that if we had the money we would use VMWare ESX instead XEN for the VM. The advantage to that is that we don't have to keep the VM/Host kernel versions in lock-step, which reduces downtime. There would be some performance degradation, and clock skew would be a problem, but at least uptime would be good; no need to perform a CLUSTER DOWN when updating kernels.

Best case, we'd have two physical boxes so we can patch the VM host without having to take every VM down.

But I still find it quite interesting that I could theoretically buy a single server with the same horsepower as the six servers driving our cluster right now.

Labels: , , , , , , ,


Saturday, January 31, 2009

An open-source project I can really get behind

Link grabbed on Slashdot:

http://www.computerworld.com.au/article/274883/openchange_kde_bring_exchange_compatibility_linux


A project is now in the works to bring true Exchange compatibility to a non-Windows desktop. The Exchange Connector for Evolution and others of that ilk have done so using the WebDAV or RPC-over-HTTP interface. This project is different in that it is trying to emulate how Outlook interacts with the Exchange system; using RPC over SMB among other protocols.

This would bring the full power of the Exchange groupware environment to anything that can run KDE from the sounds of it. And that's really nifty. It uses Samba 4 code to a great extent, which makes sense as there is a lot of neat new things in that codebase. This would be a big step for Linux in the enterprise! I also saw mention of Novell input to the project, which is nice to see.

Labels: ,


Wednesday, January 14, 2009

NTP on NetWare

A while back I did some work setting up an ntp peer-group on a pair of SLES servers (SLES9 and SLES10). That worked pretty good, and I managed to get autokey security working, which I thought was nifty. Then my thoughts turned to the OES environment.

If/when we get off of NetWare and move to the Linux kernel, NTP becomes the only way to do timesync. So I figured I'd see how amenable NetWare's xntpd was to secure configuration. Turns out it can do it, but there are some caveats.

First of all, it seems that the NTP for NetWare is based on NTPv3, not NTPv4, which means it doesn't support autokey and only supports symmetric keys. This also means that some other items on the ntp.conf file on the sles servers couldn't be carried over.

As it happens, the following sys:/etc/ntp.conf file works pretty well:
server ntpserver1
server ntpserver2 minpoll 6 maxpoll 13
peer ntppeer1 key 1

enable auth monitor
keys sys:\etc\ntp.keys
trustedkey 1
requestkey 1

restrict default ignore
restrict 140.160.0.0 mask 255.255.0.0 nomodify nopeer
restrict 127.0.0.1
restrict [ip of ntpserver1]
restrict [ip of ntpserver2]
restrict [ip of ntppeer1]

Populating the ntp.keys file couldn't be done from NetWare directly, I had to do that on a SLES server and copy it over. But once I did that, the ntppeer1 server and the NetWare server correctly authenticated to each other.

Interestingly, when I pointed an NTPv4 linux machine at the NetWare NTP setup I got complaints on the NetWare server about the incoming timehost not having the correct key and not being able to sync time. This is interesting because this linux machine was NOT one of the specified time-hosts. When I put in the 'restrict' line above with the 'nopeer' flag on it, those messages stopped.

The above configuration was successful in enabling a peer relationship between the two timehosts. This is loosely analogous to a PRIMARY group in traditional NetWare TIMESYNC setup. Should one or both of these hosts lose connection to the non-WWU time-servers (which are in essence equivalent to REFERENCE servers in Timesync, but unlike Timesync you can have more than one), they can negotiate time between themselves. This is important, as it prevents them from going out of sync, which would have dire consequences if allowed to happen more than a few minutes.

Labels: , ,


Monday, December 08, 2008

When you can't trust tcpdump

I just spent a good chunk of today bothering the telecom people to try and figure out why one server of mine couldn't talk to any off campus NTP servers. I had two servers, one was talking just fine, the other wasn't. For proof, I had packet traces showing that the non-working server was not getting an appropriate "ntp server" return packet.

And yet, after the telecom people sniffed the border firewall connections they saw UDP/123 packets on both sides. In other words, it transited the firewall just peachy. And also our IDS/IPS. So, clearly it was getting in. But it wasn't showing up on the tcpdump output on the server.

Then about 15 minutes ago I turned off the "restrict default ignore" line on that server.

And it started syncing off campus just fine. With packets.

WHY was tcpdump not showing the packets? That's what I want to know! Somehow, the UDP packets were being dropped before tcpdump saw them. Strange.

Labels: ,


Wednesday, December 03, 2008

OES2 SP1 ships!

Full announcement.

It's out!

Labels: , , , , , ,


Tuesday, December 02, 2008

The NetWare 7 that never was

My last post generated some comments lamenting where NetWare has gone. I hear ya.

I have friends and have spoken with people at BrainShare who were closer to things than I was regarding how the next version of NetWare evolved. And to be truthful, it sounded a lot like how Microsoft moved from XP to Vista. If you'll recall, "the version of Windows after XP," was something of a moving target for many years. I recall media reports of Microsoft scrapping the whole project and starting afresh at least once.

My very first BrainShare was 2001, and that was the release party for NetWare 6. It was in 2003 when Novell bought Ximian, and bought SuSE, so it is clear when Novell probably decided to bet the house on this whole Linux thing. Yet at BS01 there was talk about NW7, or if there would be a NW6.1 version out. The rumors I remember from back then had NW7 being a progression towards a more application-friendly environment. I also remember hearing the L word around once or twice.

What we actually got was NetWare 6.5, which solidified NetWare 6 and made the core services better and more mature. What it wasn't was any more application friendly than NetWare 6 was (or even NetWare 5.1 for that matter). NetWare 6.5 released in August of 2003, the same month as the Ximian purchase. This is what tells me that Novell had decided on a path for NetWare 7, and it was green, not red. Open Enterprise Server arrived in 2005, which gives OES a solid year and a half dev-time between when SuSE was bought and when we started seeing public betas of OES. The NetWare version of OES was NetWare 6.5 SP3.

What happened to NetWare 7? It got lost on the roadmap. When NW6 came out, Novell probably had 6.5 on the roadmap as the next rev, with NW7 next down. The rumors we were hearing were very provisional, as the spot on the map held by NW7 was at least 3 years away. Sometime between BrainShare 2001 and when Novell started buying its way into the Linux world NW7 was dropped and the decision was made to port to a completely different Kernel. That decision was probably made in the summer of 2003, as the NetWare 6.5 development was entering final beta, and the task of allocating developer resources for the next full rev needed to be made.

Which brings us to today. OES2 SP1 is going to drop any day now, probably in time for Novell's quarterly earnings report. SP1 finally brings the Linux-kernel 'NetWare Services' to feature-comparable with the NetWare kernel. There are still a few things missing, like an eDirectory integrated SLP server, but now all the major points are covered. If you count it up, this has taken Novell a bit over 5 years to get to this point.

In my opinion, that's about right for an organization the size of Novell. Porting over the proprietary NetWare services to completely new kernel requires a LOT of developer attention, and Novell is a lot smaller than Microsoft. Also of note, it took Microsoft 5 years to give us Vista after XP, including the presumed nuke-and-rewrite they did. Novell got a boost in that they had already ported eDirectory to Linux, so that helped out the NCP side. But that didn't help the NSS folks, who had to figure out a way to do a NetWare-style rich metadata file-system on a kernel and driver model that expects POSIX-spartan file-systems. The problems Novell had with this were amply displayed in the performance problems reported with OES1-FCS. Samba doesn't scale to the same levels as CIFS-on-NetWare did, so that meant Novell had to create their own CIFS stack from scratch. The AFP stack on Linux is a joke, and the resurgence of Apple since 2003 meant they had to do something about that as well; by making a proprietary AFP stack. All of this represents nuke-and-rebuild-from-spec, which takes time.

Novell probably should have started the migration in 2000 instead of 2003. They already knew that Exchange 5.5 upgrades were driving a LOT of customers into Active Directory, which was triggering migrations away from NetWare. But, there are business concerns here. Novell managed to survive the fall of NetWare by diversifying their product portfolio enough that GroupWise, Zen, and Identity Management could support the company. It took until this year to return to the black, but they did it. Had they shot the NetWare cash cow two years earlier, it is entirely possible that Novell couldn't have survived the lean years.

Labels: , , , ,


Tuesday, October 28, 2008

The gift of security

Last Christmas my parents bought me a 4GB IronKey. This is a nifty little device! And really, the gift of data security is rather thoughtful. And yesterday, it finally got the new firmware that enables Linux support!

Between then and now I haven't really been able to use it at work. And since transporting files between work and home is one of the nicer features of it, it has largely sat unused. But right this moment it is mounted to my openSUSE 10.3 workstation. This beats a floppy disk for transporting pgp/gpg keys.

Labels: , , ,


Monday, October 20, 2008

Dorm printing

On my post about finally running vista patrickbuller asked:
So you have printers that students in the dorms can print to? Wow. Do you audit all those and charge the numbers of pages against the student?
The answer to that is that we make big use of AND Technology's PCounter product. When paired with their PrintStations, it makes a very nice way to put a lid on unrestricted 'free' printing in the dorms. The PrintStations also make sure that only jobs people want to pick up get printed, which saves a serious amount of paper.

PCounter is core to our student printing. We'll only move our NDPS/iPrint infrastructure over to OES2-linux when Pcounter is supported on that platform, not before. We'll keep a 2 node NetWare cluster around just for printing if we have to. Since accounting support is one of the features that's supposed to be in OES2-SP1, it is my hope that PCounter will support OES2-Linux within a year after SP1's release. But I haven't heard any specifics.

Labels: , , , , ,


Wednesday, October 15, 2008

OES2 SP1 (public beta) has been posted

The public beta of OES2 SP1 has been posted.

I believe the NDA has lifted, but I'm not 100% on that. Will check. But, some of the new stuff in SP1:
  • An AFP stack that doesn't suck. Or more specifically, an AFP stack that scales beyond 100 users and is eDirectory integrated.
  • A new CIFS stack written by Novell, so it can scale well past the Samba limit.
  • A migration toolkit in one UI, rather than a cluster of scripts.
  • A new version of iFolder
  • EDirectory integrated DNS/DHCP. But no eDir integrated SLP yet, open-source politics you know.
  • IIRC a beta of eDir 8.8 SP4.
  • The ability to put iPrint-for-Linux on NSS volumes (handy for Clustering).
  • And lots more I can't remember off the top of my head.
Go forth, and have fun. There is a beta-feedback box on the beta page I linked to above in case you find a bug and want to tell Novell about it.

One thing I think it is safe to say, is that even though it says "Beta4" on it, it's really a release-candidate. Only major bugs are getting quashed right now. UI freeze was a month or more ago, and strange, annoying behaviors may get "fixed in doc" rather than getting true fixes which will have to wait for SP2. Still report them anyway, since it'll go on the list to fix in the next SP.

Labels: , , , , ,


Thursday, September 11, 2008

Fixing DNS issues

I've noticed some slow DNS on my station for the last few weeks and finally got down to checking it out. In the wake of the cache-poisoning scare of late July, we had to upgrade our DNS servers to something a bit less scarily old. I believe this required an operating system rev. The last time this happened to me, we figured out that the DNS server in question had auto-negotiated itself to 10-HalfDuplex, and the switch thought it was 100-FullDuplex. You can imagine what that did to throughput.

I fired up wireshark and started tracking my DNS requests. A pattern soon emerged. The first entry in my resolve.conf list was taking anywhere from .5 to 5.2 seconds to resolve most queries. This is hella slow for a DNS server. Since I don't manage these machines, I let the admin who did manage 'em know about it. He couldn't find anything wrong with the DNS servers on a first glance.

Another thing I noticed when looking at the resolver requests I was passing was a lot of IPv6 requests. Almost all of them were for Active Directory related queries, as I've turned off IPv6 support in my web-browser. I still haven't quite figured out how to disable IPv6 on my openSUSE 10.3 machine here.

As it happens, said DNS admin came back in and said to look at things again. So I dropped into nslookup and started throwing queries and watching the response times in wireshark, and sure enough they were zippy again. He turned off IPv6 support on the DNS servers.

Looks like we'll probably need to have a conversation on campus about IPv6 sooner rather than later. Vista comes with it turned on by default, and happily we don't have much of that yet. But these newer linux distros all have it turned on by default.

Labels: ,


Wednesday, September 10, 2008

That darned budget

This is where I whine about not having enough money.

It has been a common complaint amongst my co-workers that WWU wants enterprise level service for a SOHO budget. Especially for the Win/Novell environments. Our Solaris stuff is tied in closely to our ERP product, SCT Banner, and that gets big budget every 5 years to replace. We really need the same kind of thing for the Win/Novell side of the house, such as this disk-array replacement project we're doing right now.

The new EVAs are being paid for by Student Tech Fee, and not out of a general budget request. This is not how these devices should be funded, since the scope of this array is much wider than just student-related features. Unfortunately, STF is the only way we could get them funded, and we desperately need the new arrays. Without the new arrays, student service would be significantly impacted over the next fiscal year.

The problem is that the EVA3000 contains between 40-45% directly student-related storage. The other 55-60% is Fac/Staff storage. And yet, the EVA3000 was paid for by STF funds in 2003. Huh.

The summer of 2007 saw a Banner Upgrade Project, when the servers that support SCT Banner were upgraded. This was a quarter million dollar project and it happens every 5 years. They also got a disk-array upgrade to a pair of StorageTek (SUN, remember) arrays, DR replicated between our building and the DR site in Bond Hall. I believe they're using Solaris-level replication rather than Array-level replication.

The disk-array upgrade we're doing now got through the President's office just before the boom went down on big expensive purchases. It languished in the Purchasing department due to summer-vacation related under-staffing. I hate to think how late it would have gone had it been subjected to the added paperwork we now have to go through for any purchase over $1000. Under no circumstances could we have done it before Fall quarter. Which would have been bad, since we were too short to deal with the expected growth of storage for Fall quarter.

Now that we're going deep into the land of VMWare ESX, centralized storage-arrays are line of business. Without the STF funded arrays, we'd be stuck with "Departmental" and "Entry-level" arrays such as the much maligned MSA1500, or building our own iSCSI SAN from component parts (a DL385, with 2x 4-channel SmartArray controller cards, 8x MSA70 drive enclosures, running NetWare or Linux as an iSCSI target, with bonded GigE ports for throughput). Which would blow chunks. As it is, we're still stuck using SATA drives for certain 'online' uses, such as a pair of volumes on our NetWare cluster that are low usage but big consumers of space. Such systems are not designed for the workloads we'd have to subject them to, and are very poor performers when doing things like LUN expansions.

The EVA is exactly what we need to do what we're already doing for high-availability computing, yet is always treated as an exceptional budget request when it comes time to do anything big with it. Since these things are hella expensive, the budgetary powers-that-be balk at approving them and like to defer them for a year or two. We asked for a replacement EVA in time for last year's academic year, but the general-budget request got denied. For this year we went, IIRC, both with general-fund and STF proposals. The general fund got denied, but STF approved it. This needs to change.

By October, every person between and Governor Gregoir will be new. My boss is retiring in October. My grandboss was replaced last year, my great grand boss also has been replaced in the last year, and the University President stepped down on September 1st. Perhaps the new people will have a broader perspective on things and might permit the budget priorities to be realigned to the point that our disk-arrays are classified as the critical line-of-business investments they are.

Labels: , , , , , , , , , , , ,


Monday, September 08, 2008

Web-servers

Looking at usage stats, the amount of data transferred by Myweb for Students has gone down somewhat from its heyday in 2006. I blame Web 2.0. Myweb is a static HTML service. We don't allow any server-side processing of any kind other than server-side includes. This is not how web-development is done anymore. This very blog is database backed, but Blogger publishes static HTML pages to represent that database, which is why I'm able to host this blog on Myweb for FacStaff.

If we were to provide a full-out hosting service for our students (and staff), I'm sure there would be a heck of a lot more uptake. A few years ago there was a push in certain Higher Ed circles to provide a, "portfolio service", which would host a student's work for a certain time after graduation so they could point employers at it as a reference. We never did that for a variety of reasons (cost being a big one), but the sentiment is still there.

If we were to provide not only full-out hosting, but actual domain-hosting for students, it could fill this need quite well. Online brand is important, and if a student can build a body of work on "$studentname.[com|org|net|biz]" it can be quite useful in hunting down employment. Several of the ResTek technicians I know have their own domains hosting their own blogs, so the demand is there.

I've never worked for a company that did web-hosting as a business item, so I've only heard horror stories of how bad it can get. First of all, we'll need a full LAMP stack server-farm to run the thing. That's money. Second, we'll need the organizational experience with the technology to prevent badly configured Wordpress or PhpBB installs from DoSing other cohosted sites from resource-exhaustion by hackers. This is a worker-hours thing.

Then we'd have to figure out the graduated problem. Once a student graduates, do we keep hosting for them? Do we charge them? Do we force them off the system after a specific time? Questions that need answers, and these are the kinds of questions that contributed to the killing of the portfolio-server idea.

Personally, I think this is something we could provide. However, someone needs to kick the money tree hard enough to shake loose the funds to make it happen. Perhaps Student Tech Fee could do it. Perhaps it could be a 'discounted' added-cost service we provide. Who knows. But we could probably do it.

Labels: , , , ,


Monday, August 18, 2008

Enabling autokey auth in NTP on SLES10

The NTP protocol permits the use of crypto to authenticate clients and servers to each other, as well as between time servers. By default, SLES10 is set up to allow the v3 method of using symmetric keys, but not the v4 method that uses public/private keys. If you want to use the v4 method, this is the tip for you.

Background

By default SLES runs NTP inside a chroot jail. This can be changed from the YaST NTP config screen if you wish. This is a more secure method of running NTP. The chroot jail's root is at /var/lib/ntp/.

Additionally, ntp runs with an AppArmor profile loaded against it for added security.

Getting NTPv4 auth to work

There are 4 steps to get this to work.

  1. Copy the .rnd file to the chroot jail
  2. Run ntp-keygen
  3. Modify the AppArmor profile for /usr/sbin/ntpd to allow read access to the new files
  4. Modify the /etc/ntp.conf file to enable v4 auth.

Copy the .rnd file to the chroot jail

By default, there should be a .rnt file at /root/.rnd. If so, copy this to /var/lib/ntp/etc/.rnd. If there is no file there, one can be generated through use of openssl.

timehost:~ # openssl rand -out /var/lib/ntp/etc/.rnd 1

Run ntp-keygen

Change-directory to /var/lib/ntp/etc, and execute the following command:

timehost:~ # ntp-keygen -T

This will drop a pair of files in the directory you run it, so running it while in /var/lib/ntp/etc saves you the step of copying them to this directory.

Modify the AppArmor profile

This is done through YaST

  1. Launch YaST
  2. Go to the "Novell AppArmor" section, and enter the "Edit Profile" tool.
  3. Select "/usr/sbin/ntpd" and click Next.
  4. Click the "Add Entry" button and select File.
  5. Browse to /var/lib/ntp/etc/.rnd and click the "Read" permissions check-box, and click OK
  6. Repeat the previous two steps to add the two files created by ntp-keygen, named "ntpkey_cert_[hostname]" and "ntpkey_host_[hostname]".
    1. Note: AppArmor behavior changes between SP1 and SP2. In SP1 you can use the link files, in SP2 you need to specify the link targets.
  7. Click Done on the main Profile Dialog
  8. Agree to reload the AppArmor profile

Modify /etc/ntp.conf

The YaST tool for NTP doesn't allow for v4 configurations, so this has to be done on the command line. Open the /etc/ntp.conf file with your editor of choice, and insert the following lines before your "server" lines:

keysdir /var/lib/ntp/etc/
crypto randfile /var/lib/ntp/etc/.rnd

Then append the word "autokey" to the server and peer lines of your choice. At this point, you should be able to restart ntpd, and it will use authentication. This is a very basic NTPv4 configuration setup, but this should set the ground up for more complex configs.

Labels: , , , ,


Thursday, August 14, 2008

Virtualization and Fileservers

There are some workloads that fit well within VM of any kind, and others that are very tricky. Fileservers are one area that are not good candidates for VM. In some cases they qualify as highly transactional. In others, the memory required to do fileserving well makes them very expensive. When you can fit 40 web-servers on a VM host, but only 4 fileservers, it makes the calculus obvious.

This is on my mind since we're running into memory problems on our NetWare cluster. We've just plain outgrown the 32-bit memory space for file-cache. NW can use memory above the 4GB line, it does have PAE support, but memory access above there is markedly slower than it is below the line. Last I heard the conventional wisdom is that 12GB is about the point where it starts earning you performance gains again. eek!

So, I'm looking forward to 64-bit memory spaces and OES2. 6GB should do us for a few years. That said, 6GB of actually-used RAM in a virtual-host means that I could fit... two of them on a VM server with 16GB of RAM.

16GB of RAM in, say, an ESX cluster is enough to host 10 other servers. Especially with memory deduplication. In the case of my hypothetical 6GB file-servers, 5.5GB of that RAM will be consumed by file-cache that will be unique to that server and thus very little gains from memory de-dup.

In the end, how well a fileserver fits in a VM environment is based on how large of a 'working set' your users have. If the working set it large enough, it can mean that you'll get small gains for virtualization. However, I realize fileserving on the scale we do it is somewhat rare, so for departmental fileservers VM can be a good-sized win. As always, know your environment.

In light of the budgetary woes we'll be having, I don't know what we'll do. Last I heard the State is projected to have a 2.7 billion deficit for the 2009-2011 (fiscal year starts July 1) budget cycle. So it may very well be possible that the only way I'll get access to 64-bit memory spaces is in an ESX context. That may mean a 6 node cluster on 3 physical hosts. And that's assuming I can get new hardware at all. If it gets bad enough I'll have to limp along until 2010 and play partitioning games to load-balance my data-loads across all 6 nodes. By 2011 all of our older hardware falls off of cheap-maintenance and we'll have to replace it, so worst-case that's when I can do my migration to 64-bit. Arg.

Labels: , , , ,


Friday, July 25, 2008

Handling eDirectory core-files on linux

If you've been getting core files generated by ndsd on your Linux servers, and want to call Novell Support about it, there are a few things you can do to maximize what Novell will get out of the files themselves. You may not get much, but these will help the people with the debug symbols figure out what's going on.

Packaging the Core


First and foremost, you already have the tool to package core files for delivery to Novell already on your system. TID3078409 describes the details of how to use 'novell-getcore.sh'. It is included on 8.7.3.x installations as well as 8.8.x installations.

Running it looks like this:
edirsrv1:~ # novell-getcore -b /var/opt/novell/eDirectory/data/dib/core.31448 /opt/novell/eDirectory/sbin/ndsd
Novell GetCore Utility 1.1.34 [Linux]
Copyright (C) 2007 Novell, Inc. All rights reserved.


[*] User specified binary that generated core: /opt/novell/eDirectory/sbin/ndsd
[*] Processing '/var/opt/novell/eDirectory/data/dib/core.31448' with GDB...
[*] PreProcessing GDB output...
[*] Parsing GDB output...
[*] Core file /var/opt/novell/eDirectory/data/dib/core.31448 is a valid Linux core
[*] Core generated by: /opt/novell/eDirectory/sbin/ndsd
[*] Obtaining names of shared libraries listed in core...
[*] Counting number of shared libraries listed in core...
[*] Total number of shared libraries listed in core: 72
[*] Corefile bundle: core_20080725_092227_linux_ndsd_edirsrv1
[*] Generating GDBINIT commands to open core remotely...
[*] Generating ./opencore.sh...
[*] Gathering package info...
[*] Creating core_20080725_092227_linux_ndsd_edirsrv1.tar...
[*] GZipping ./core_20080725_092227_linux_ndsd_edirsrv1.tar...
[*] Done. Corefile bundle is ./core_20080725_092227_linux_ndsd_edirsrv1.tar.gz


Once you have the packaged core, you can upload it to ftp.novell.com/incoming as part of your service-request.

Including More Data


If you're lucky enough to be able to cause the core file to drop on demand, or it just plain happens often enough that repetition isn't a problem, there is one more thing you can do to include better data in the core you ship to Novell. TID3113982 describes a setting you can add to the ndsd launch script (/etc/init.d/ndsd) that'll include more data. The TID describes what is being done pretty well. In essence, you're using an alternate malloc call that fails with better information than the normal one. You don't want to run with this set for very long, especially in busy environments, as it impacts performance. But if you have a repeatable core, the information it can provide is better than a 'naked' core. Setting MALLOC_CHECK_=2 is my recommendation.

Be sure to unset this once you're done troubleshooting. As I said, it can impact performance of your eDirectory server.

Labels: , , , , ,


Wednesday, July 16, 2008

Patching SLES

Last night I attempted to patch one of our OES2 servers. This particular server is an elderly beast, a P3 1GHz machine. So I wasn't expecting anything like fastness out of it. Especially with rug.

But still, it was painful!
normandy: ~#: rug lu
Waking up ZMD...
[8 minutes later]
[list of one update, libzypp]
normandy: ~#: rug update
Resolving Dependencies....
[8 minutes later]
Install this update? (y/N)
y
[12 minutes later]
Restarting ZMD...
[8 minutes later]
normandy: ~#: rug lu
[list of updates. No need to wait 8 minutes this time.]
normandy: ~#: rug update
Resolving Dependencies...
[8 minutes later]
Dependency resolution failed for bind-util and bind-libs. libdns-whatzihoozit required by bind-util is provided by bind-libs. Please fix you hoser.
[insert swearing here]
normandy: ~#: rug in bind-util bind-libs
Resolving Dependencies....
[8 minutes later]
Install these updates? (y/N)
y
[12 minutes later]
normandy: ~#: exit

As this had taken far longer than even I was expecting, I stopped. I'll finish up tonight. As this is an OES2 server, this means SLES10-SP1. I can attest that SLES10-SP2 on identical hardware is MUCH faster. I can't wait until OES2-SP1 comes out and this dinosaur can get faster patching.

Labels: , , ,


Monday, June 30, 2008

Novell Client for Linux, packaged for OpenSUSE

It has been mentioned many places, and I've done some of the mentioning, that since openSUSE is the foundation for SLED, it makes sense for Novell to distribute an NCL for openSUSE. It turns out they're working on just that. And here is the Novell beta page. I'm soooo going to try this out, since I'm running openSUSE 10.3 on my work desktop and won't be moving to openSUSE 11 until I can run the client on it (oh, wait, I can).

It should also be mentioned that Ubuntu is a very frequently requested target for another NCL, but I have reason to believe that'll never happen. First of all, any Novell Client involves closed source 3rd party licensed code, which makes it hard to port to Linux in the first place (a relic of being based on code from the days when open-source was just an ethical standpoint rather than a tangible market force). Second, Novell has proven to be rather light in developer resources in certain areas, and linux integration with non-SUSE linux distros is very minimal.

Labels: , , , , ,


Thursday, May 29, 2008

OES2 and SLES10-SP2

Per Novell:

Updating OES2

OES2 systems should NOT be updated to SLES10 SP2 at this time!
Very true. And most especially true if you're running virtualized NetWare! The paravirtualization components in NW65SP7 are designed around the version of Xen that's in SLES10-SP1, and SP2 contains a much newer version of Xen (trying to play catch-up to VMWare means a fast dev cycle, after all). So, expect problems if you do it.

Also, the OES2 install does contain some kernel packages, such as those relating to NSS.

OES2 systems need to wait until either Novell gives the all clear for SP2 deployments on OES2-fcs, or OES2-SP1 ships. OES2-SP1 is built around SLES10-Sp2.

Labels: , , , , ,


Friday, May 23, 2008

Problem with SLES10-SP2

Just this morning Novell posted a new TID:

Updates catalogs missing after updating libzypp

I've heard on the grape-vine that this particular libzypp update was put into the SLES10-SP1 channel in order to prepare for SP2's release. Those fine folk out there that have turned on Auto Updating on their SLE[S|D] boxes have very probably already been bit by it. I hope Novell gets this one fixed, and posts recovery steps, soon.

Labels: , , , ,


Wednesday, May 21, 2008

SLES10 SP2 shipped

According to Novell, SLES10 SP2 has shipped.

This means that the ongoing OES2 SP1 beta I'm a part of will be done on released code for the SLES side of it. So any bugs we find there may end up as patches on the SP2 channel.

One nice thing in the new code?

"rug refresh --clean"

This will do what I posted about a few days ago. It'll nuke the zmd database and rebuild it fresh! Niiiice! Unfortunately, a truly better version of rug won't come until "Code 11".

Labels: , , ,


Tuesday, May 06, 2008

Being annoyed by rug?

Rug/zmd in SLES10-SP1 is still a headache maker. Novell knows this, but I strongly suspect that we'll have to wait until SLES11 before we get anything improved. OpenSUSE now has zypper which works pretty good, and I think you can do it in SLES if you want, but I haven't tried.

One of the chief annoyances of rug is that the zmd.db file kept in /var/lib/zmd/zmd.db gets corrupted far too easily. And when that happens, rug can take HOURS to return anything. If it returns anything at all.

The fix for it is easy, stop zmd, delete the zmd.db file, restart zmd. Since I'm doing this fairly often, I've whipped up a bash script to do it for me.

nukezmd
#!/bin/sh
#
# For killing ZMD when it is clearly hung. An all too often occurance.
#

declare PIDZMD

# First get the PID of ZMD

printf "Getting PID... "
let PIDZMD=`rczmd showpid`
printf "$PIDZMD\n"
# Then unconditionally kill it

printf "Killing zmd hard... \n"
kill -9 $PIDZMD

# Remove the old, inconsistent database

printf "Nuking old database... \n"
rm /var/lib/zmd/zmd.db

# Restart ZMD, which will build a new, consistent database

printf "Restarting ZMD\n"
rczmd start
Simple, to the point. Works.

Labels: , , ,


Monday, May 05, 2008

Linux @ Home

My laptop at home dual-boots between openSUSE and WinXP. There are a few reasons why I don't boot the Linux side very often, some of them work related. And, what the heck, here are the two reasons.

1: Wireless driver problems
I have an intel 3945 WLAN card. It works just fine in linux, well supported. What throws it for a loop, however, are sleep and hibernate states. It can go one, two, four, maybe five cycles through sleep before it will require a reboot in order to find the home wireless again. If it doesn't lock the laptop up hard. Since my usage patterns are heavily dependent upon Sleep mode, this is a major, major disincentive to keep the Linux side booted.

I understand the 2.6.25 kernel is a lot better about this particular driver. Thus, I wait with eager anticipation the release of openSUSE 11.0. This driver is currently the ipw3945 driver, and will eventually turn into iwl3945 driver once it comes down the pipe. What little I've read about it suggests that the iwl driver is more stable through power states.

2: NetWare remote console
I use rconip for remote console to NetWare. Back when Novell first created the IP-based rconsole, they also released rconj along side ConsoleOne to provide it. As this was written in Java, it was mind bogglingly slow. This little .exe file was vastly faster, and I've come to use it extensively. Unless I get Wine working, this tool will have to stay on my Windows XP partition. It works great, and I haven't found a good linux-based replacement yet.

Time has moved on. Hardware has gotten faster, and the 'java penalty' has reduced markedly. RconJ is actually usable, but I still don't use it. Plus, it would require me to install ConsoleOne onto my laptop. It's 32-bit, so that's actually possible, but I really don't want to do that.

The Remote Console through the Novell Remote Monitor (that service out on :8009) has a nice remote-console utility, but it also requires Java. I'm still biased against java, and java-on-linux still seems fairly unstable to me. I don't trust it yet. It also doesn't scale well. When I'm service-packing, it is a LOT nicer looking to have 6 rconip windows up than 6 browser-based NRM java-consoles open. Plus, rconip will allow me access to the server console if DS is locked, something that NRM can't do and is invaluable in an emergency.

Once the wireless driver problems are fixed, I'll boot the linux side much more often. Remote-X over SSH actually makes some of my remote management a touch easier than it is in WinXP. And if I really really need to use Windows, my work XP VM is accessible over RDesktop. There are a few other non-work reasons why I don't boot Linux very often, but I'll not go into those here.

So, oddly, NetWare is partly responsible for keeping me in Windows at home. But only partly.

Labels: , , , ,


Monday, April 28, 2008

The GPL in a software-as-a-service world

Just this last weekend I went to Linuxfest Northwest, which is held here in Bellingham. This is nice! It's just a short drive.

One of the talks I went to was held by Ted Haeger, currently of Bungee Labs. The topic of the talk was one he had just posted to his blog, "Sharing Source Code In The Cloud".

One point he brought up that I hadn't heard of before is that the GPL triggers when you 'convey' the software to someone else. And that the GPL specifically excludes where the software is hosted on a server and users just use the software there, so long as the software itself never leaves the company in question. This is exactly what Google did and still does. All of their search IP was built on an OSS platform, but is still held as the crown jewels of their company; all because they haven't given the software to anyone else.

Apparently, this 'loophole' is being exploited by a LOT of new companies trying to get in on the software-as-a-service market. Such as Bungee Labs, as it happens. What effect will this have on the state of GPLed software? Hard to say, the market is still in its early days.

It makes you think.

Labels: ,


Thursday, April 17, 2008

And a gripe

2.5 hours is too freakin' long for "rug lu" to tell me which patches need application to this particular OES2 server. This needs fixing. I hope its fixed in SLES10 SP2.

Labels: , ,


Wednesday, April 09, 2008

Stupid user tricks

I had a case of this the other day. I was minding my own business, when suddenly one of my monitors starts going wonky. This is an LCD monitor, but an older one, so it isn't inconceivable that it could be going bad. How else would I explain the weird spots that were showing up on it? They looked like this:

Pretty spots

Which looks like weird hot-spots in the screen. So I started to muttering. Plus, the screen was noticeably dimmer. Futzing with the brigthness and contrast settings didn't do a thing for it either. Plus it seemed to follow no matter which window I put on the hot spots.

Then, I realized what the problem was.

Pretty stars

Compiz. Somehow, the rdesktop window that represents had been made slightly transparent, and the wall-paper was showing through. This screen shot is with the transparency fully down, you can barely make out the ConsoleOne icon in it.

So no, I didn't have a monitor going bad, I had a mouse mis-cue somewhere that caused that rdesktop window to go a bit transparent. No worries!

Aie.

Labels: ,


Thursday, March 20, 2008

BrainShare Thursday

Not a good day. My first course, "Advanced BASH," could more accurately be described as, "BASH scripting tips & tricks". I then proceeded to skip the other three sessions I had signed up for.
  • Novell Open Enterprise Server 2 Interoperability with Windows and AD. All about Domain Services for Windows and Samba. Neither of which we'll ever use. No idea why I wanted to be in this session.
  • Rapid Deployment of ZENworks Configuration Management. Other people around here have suggested that if we haven't moved yet, wait until at least SP3 before moving. If then. So, demotivated. Plus I was rather tired.
  • Configuring Samba on OES2. CIFS will do what we need, I don't need Samba. Don't need this one. Skipped.
DL236: Advanced BASH Course
BASH tips and tricks. I got a lot out of it, but the developers around me were quietly derisive.

ZEN Overview and Features
Not so much with the futures, but it did explain Novell's overall ZEN strategy. It isn't a coincidence that most of Novell's recent purchases have been for ZEN products.

TUT303: OES2 Clusters, from beginning to extremes
This was great. They had a full demo rig, and they showed quite a bit in it. Including using Novell Cluster Services to migrate Xen VM's around. They STRONGLY recommended using AutoYast to set up your cluster nodes to ensure they are simply identical except for the bits you explicitly want different (hostname, IP). And also something else I've heard before, you want one LUN for each NSS Pool. Really. Plus, the presenters were rather funny. A nice cap for the day.

And tonight, Meet the Experts!

Labels: , , , , , , ,


BrainShare Wednesday

The Wednesday keynote was, indeed, a bunch of demos. It was also mostly pointless as far as the technology I'm concerned with. Lots of GroupWise (don't care), lots and lots of PlateSpin (can't afford it), lots of Zen (not the bits I'd use).

That said, the new GroupWise WebAccess is gorgeous. I wish Exchange had their non-ActiveX pages look that good.

TUT175: RBAC: Avoiding the horror, getting past the hype
Mostly about IDM as it turned out. Only minimally interesting from an abstract viewpoint about roles in general.

TUT 277: Advanced eDirectory Configuration, new features, and tuning for performance
I learned a few things I didn't know, such as the fact that each object as an "AncestorList" attribute listing who their parent objects are. This apparently greatly speeds up searching. SP3, coming out this Summer, will have faster LDAP binds for a couple of reasons. Right now Novell is recommending 2 million objects as a reasonable maximum size for a partition for performance reasons.

And also they reiterated something I've heard before...
You know how back in the NetWare 4 days, we said to design your tree by geography at the first level, and then get to departments? Um, sorry about that. It was great back then, but for LDAP or IDM it really, really slows things down.
Yep. I took my first class for my CNA when 'Green River' was just coming out, or was just out. So I remember that.

TUT221: iPrint on Linux, what Novell Support wants you to know
A nice session from a mainline support guy about the ways people don't do iPrint on linux correctly. We're not going there until pcounter can run in linux, so this is still somewhat abstract. But, nice to know.
  • The reason that some print jobs render differently than direct-print jobs, is because of how Windows is designed. Direct-print jobs render with the 'local print provider', and iPrint jobs render with the 'network print provider'. This is a Microsoft thing, not an iPrint thing. You can duplicate it by setting up a microsoft IPP printer (assuming you're not mandating SSL like we are) and printing to the same printer with the same driver.
  • The Manager on Linux doesn't use a Broker, it uses a 'driver store'.
  • The Manager on NetWare doesn't always bind to the same broker. I didn't know that.
  • It is recommended to have only one Broker, or one driver store per tree.
  • Novell recommends using DNS rather than IP for your printer-agents, check your manager load scripts.

Labels: , , , , , , , ,


Tuesday, March 18, 2008

BrainShare Tuesday

Today started off with a bit of panic, as I hadn't set my alarm. Me being a west-coaster, 7:20 (when I woke up) is an entirely reasonable time to get up as far as my body is concerned. Only, I needed to get dressed and breakfasted before my first session at 8:30. Aie! I had to eat quick, but I got there. Didn't get a chance to check work email, though.

ATT326: Advanced Linux Troubleshooting
An ATT, therefore hard to summarize. But I learned about a few new commands I didn't know about before. Like strace. And vimdiff.

TUT130: Challenges in Storage I/O in Virtualization
Another nice one, but an emergency at work (printing down in a dorm, during finals week) distracted me heavily during the first half of it. Which resulted in the following note in my notes:
NPIV looks really nifty. Look into it.
NPIV being how you can use fibre-channel zoning to zone off VM's, rather than HBA's. Highly useful. I also learned about a neat new thing called Virtual Fabrics. Virtual Fabrics work kind of like VLANS for fabrics. You can segregate your fabrics into fabrics that share hardware but nothing else. Handy if your, say, Solaris admins don't want you mucking about with their zoning, while saving money through consolidated hardware.

TUT216: OES2 SP1 Architectural Overview
There is a LOT of new stuff in SP1.
  • It will include eDir 8.8.4 (8.8.3 will ship this summer sometime)
  • NCP and eDir will be fully 64-bit
  • OES2 SP1 will be based on SLES SP2, which will be releasing about the same time
  • AFP Support
    • AFP 3.1
    • Uses Diffie-Helman 1 for password exchange, meaning the 8-character password problem is solved.
    • Fully SMP-safe
    • Has cross-protocol locking with NCP. CIFS doesn't have cross-protocol locking yet, but IIRC, Samba does
    • Does not need LUM enabled users
  • CIFS Support
    • NTLMv1, but v2 is a possibility if enough people ask, so file those enhancement requests!!
    • CIFS is separate from Samba, therefore can not be used in conjunction with Domain Services for Windows
    • As with AFP, fully SMP safe
  • EDir 8.8.4
    • LDAP auditing enhanced
    • "newer auth protocols", but they didn't say what.
I should also mention that they're still deploying Novell Integrated Samba, which is what you'll have to use to get Domain Services For Windows. Samba still doesn't scale as far as I'd like ('only' 700-800 concurrent users), so that may be an issue for higher ed types who want high concurrency CIFS and also DSFW on the same box.

TUT211: Enhanced Protocol Support in OES2 SP1
This is the session where they went into detail about the AFP and CIFS support. They said that netatalk, the existing AFP stack on Linux, gets really slow once you go over the 20 concurrent users. Whoa! I can soooo understand why Novell felt the need to make a new one.
  • The 8 character password limit has been fixed! They now support DH1 for passing passwords.
  • The 'afptcp' daemon can use one password protocol at a time, so you can only use DH1, or one of the other three I can't remember.
  • Support for OSX 10.1 and 10.2 is scanty, and 10.5 is limited but users may not notice anyway.
  • Passwords will be case sensitive.
  • Kerberos will be in a future release
  • Performance is faster than NetWare, partly due to the ability to multi-thread
  • Can register services by way of SLP
  • Only supports NSS for the time being, the other Linux file-systems will be a future feature.
  • Can support 500 concurrent users, and 1000+ in the future. This fits our current AFP loads.
  • We can configure more about how it works than we could on NetWare, such as how many worker threads to spawn.
  • Has meaninful debug logs!
  • Has a new command, 'afpstat' that works like 'netstat' for giving a snapshot of afp connections.
And then some CIFS stuff. We can't use it for political reasons so I didn't pay attention. Sorry.

Tonight was the night formerly known as 'Sponsor Night,' but has a new name now that everyone who gets a booth is no longer a 'sponsor'. Some are sponsors, some are exhibiters. I can't keep track. Anyway, today was their party. "World of Novellcraft!" Homage to vid-gaming.

Lots of Wii, lots of Rock Band, some Halo, lots of women dressed in Renaissance Festival gear getting their pictures taken by the 90%+ male audience. I've blogged before about my ambivalence about Sponsor Night. I lasted until about 7, when I came back to the hotel.

Tomorrow I have an actual LUNCH BREAK in my schedule! Ooo! And Soul Asylum Soul Coughing Collective Soul plays the concert! I've been listening to two of their CD's for the past two months so I think I may even know a few songs by now.

Labels: , , , , , ,


Monday, February 25, 2008

First OES2

This weekend I upgraded the one replica server running OES1-Linux to OES2-Linux. It already was at eDir 8.8.2 so the only real changes were to the base OS. It went rather well. The upgrade documentation provided by Novell was just fine. Really, a simple upgrade.

It being done on a Pentium III 1.2GHz machine meant it took a while. But very little in the way of complication. The one hitch was that it changed the certificate the NLDAP server loads to the default, which I didn't catch until a certain service we wrote failed. But that was a very easy fix.

Labels: , , ,


Saturday, January 19, 2008

Good migration

At home I just migrated the linux server to new hardware. This has to be one of the easiest migrations I've ever done for that service. Now just the obsessive tweaking needs doing, all the major functions are moved.

That server is running Slackware. I'm not using SuSE at home for a couple of reasons:
  1. I've been using Slack since college
  2. Diversity is good when figuring out how to run Linux
    1. Slackware doesn't have anything approaching YaST.
    2. Getting a new service online with Slackware takes about five times longer than it does with SuSE, but at the end of it you know how it bloody well works.
  3. It's easier to crib from existing config files that way.
I've also done a major rework of the internal network, which required a small rewrite of the network start scripts to handle it correctly.

I got my first wireless access point in November of 2000. Way back then, they hadn't quite figured out all the short-cuts to cracking WEP so it required a certain amount of traffic to analyze. This was a Linksys B AP, and a Linksys wireless card. Together they had el-crappo for range (to today's standards).

With that in mind I segregated my network.

Internet <- Cisco 675 DSL -> Wired network <- Linux server -> Wireless network

Didn't have cable in our area yet back then. The Cisco handled everything I needed. Unfortunately, it was badly behaved. It had the nasty habit of ARPing through the whole dhcp range, one addr per second, continually.

At that point in time I had one wireless device. The always-on windows server was on the wired network, and the linux server configured to proxy things. So the only traffic on the wireless network was from my laptop; no ARP ARP ARP ARP ARP and no windows browse packets. In other words, it was a network that was hard to crack. Oh yeah, baby.

Fast forward a couple years. I move out here, we get cable instead of DSL.
Another year or two, and the 802.11b AP died so we moved to a G AP.
Another year, and I added a certain linux-based media server (wireless for long reasons) and my wife got a PowerBook.

The 10MB ethernet card in the back of that Linux machine (a Pentium 2 450MHz machine) was really... concerning me. Comcast is still under 10MB, but... it's the principle of the thing. It was a bloody ISA card for pete's sake.

So today I flattened the network. It's structured the same, but rather than have separate subnets I'm just using brctl to bridge the two; I like being able to easily sniff my wireless traffic. We no longer have an always-on Windows box. And WPA-PSK is a heckova lot harder to crack than WPA ever was. So, I figure it's safe. Plus, if the linux machine ever dies I only have to move one cable to get things back online.

Now the internet seems faster when browsing on the laptops. I guess that 10MB card was actually slowing things down a bit.

Labels: ,


Friday, November 30, 2007

OES2 SP1 timing

Novell just posted the third draft of their OES2 Best Practices guide. Which you can locate here. In that guide is this text:
Domain Services for Windows, which is scheduled to ship with OES 2 SP1 (currently scheduled for late 2008), will also offer some clear advantages.
"Late 2008" means they WILL NOT have SP1 out by August of 2008. This means that the upgrade of our 6 node cluster to OES will have to wait until 2009. Grrarrr!

Another 21 months of a 32-bit operating system on the single biggest storage consumer on campus. We'll have at least one hardware refresh before then for some of the nodes, and... boy I hope they have NetWare drivers for that. The very limited testing I did with NetWare-in-Xen was not encouraging from a performance stand-point. If it looks like I'll have to deploy that way for the next servers we get in the cluster, I'll have to do more real testing to characterize the performance hit (if any). The idea of a 64-bit memory space for file-caching makes me drool. Not getting it for 21 months is painful.

That said, if Novell releases the eDirectory enabled AFP server for OES2-Linux outside of the service-pack I could still make the 2008 window. That's our only dependency for SP1.

Update (09/08/08): Looks like 'late October' is the date for SP1's release. Should be in public beta before then.

Update (12/03/08): It's out!

Labels: , , , , ,


Monday, November 19, 2007

I didn't realize it was this bad.

A while back Novell held an online survey about YaST usage. They've just released results.

Right at the top, in the demographics section are the results for the 'gender' question.

Men = 97.7%, Women = 2.3%

Ow. Women are 2.3%? Jeez.

These sorts of surveys are FAR from scientific. But still, such a STRONG bias is rather disheartening. I know the BrainShare crowd is somewhere between 4:1 to 6:1 Men-to-Women (don't have exact numbers). That said, most of the women I meet there are there for either Identity Management or GroupWise. The audiences for sessions on high Linux geekery (like for Clusters or HA computing) are... very male.

Just looking at that chart makes me wince. Yeesh.

Labels: ,


Tuesday, October 09, 2007

Xen in 10.3

Because I couldn't get a good video driver working for it, I haven't spent much time in the new Xen. I believe it is Xen 3.1. And yes, it IS a lot faster on the network. Wow. It used to be painful, now it is improved quite a bit. I just patched the windows vm on my work machine, and the network transfer went really fast. Then I had to turn it off since I needed my linux desktop back.

And now I get to see how to convert a Xen disk-image into a vmware disk image. I know it can be done, but I haven't dug up the script or whatever.

Labels: ,


Friday, October 05, 2007

openSUSE 10.3 is in

No problems, but one. The "nvidia in Xen" problem is back. The prior solution isn't working yet. Unresolved symbols. This could be a work-stopper, or the thing that makes me move all the way to VMWare.

Labels: , ,


Tuesday, October 02, 2007

NCL 2.0 beta and Xen, pt 2

I now have server side and client side captures. The server side shows what's really going on. It is clear that those jumbos I talked about earlier are being disassembled in the Xen network stack. The client traces look similar to this

-> NCP to server
<- Ack
-> NCP to server
-> NCP to server
<- Ack
<- Ack
-> NCP to server
<- Ack

With variations on the order. The NCP to server packets are all jumbos. From the server side it looks a lot different. The same sequence from the server side:

-> NCP to server
-> NCP to server
-> NCP to server
<- Ack
-> NCP to server
-> NCP to server
-> NCP to server
<- Ack
-> NCP to server
-> NCP to server
-> NCP to server
<- Ack
-> NCP to server
-> NCP to server
-> NCP to server
<- Ack

What's more, the server sees a marked delay in packets between the <- Ack and the first -> NCP to server. On the client side the delays are between the -> NCP to server and the responding <- Ack. I interpret this to show a packet-disassembly delay in the Xen stack.

What I can't figure out is how are the jumbos getting on the network stack at all? The configured network interfaces (except for loopback) in the Dom0 all have MTU values of 1500. I suspect NCL throughput for higher record sizes will improve markedly if it didn't force the Xen layer to disassemble the jumbos. Overriding the MTU on an interface is something that can only be done in kernel-space (I think) which would point to the novfs kernel module.

Labels: , ,


Monday, October 01, 2007

NCL 2.0 beta and Xen

I found a good candidate for why the Novell Client for Linux 2.0-beta is so crappy in a Xen kernel. Take a look at this:

Frame 6 (4434 bytes on wire, 4434 bytes captured)

What the heck? What it should look like is this:

Frame 6 (1514 bytes on wire, 1514 bytes captured)

So I go to our network guys and ask, "Have we turned on jumbo frames anywhere?" No, we haven't. Anywhere. Which I pretty much knew, since we're not doing any iSCSI. So where the heck is that jumbo coming from? The only thing I can think of is that the sniffing layer I'm getting this at is above the layer that'd grab what actually hits the wire, and something between the sniffer and the wire is converting these jumbos to the normal 1514B ethernet MTU, and that's where my lag is coming from.

This is a case where I'd like to span a port and get a sniff of what actually hits the wire so I can compare.

Labels: , ,


Friday, September 28, 2007

Novell client for Linux, part 2

I'm doing a test with the vanilla openSUSE kernel and the results are VASTLY BETTER. A snippet:
    KB  reclen   write rewrite    read    reread
51200 256 9178 9189 9130 8898
Compare that with the earlier numbers for that record size and you can see the problem.
    KB  reclen   write rewrite    read    reread
51200 256 595 587 8551 8491
See? It has to be something in the Xen Dom0 environment.

This would be a lot easier if Wireshark would stop freezing hard when I try to save a capture.

Labels: , ,


Novell Client for Linux

I'm running the beta of it, but keep in mind I'm also running it on openSUSE, which is not what it has been specifically designed for. That said, I do have some performance numbers. Yesterday I ran a very simple IOZONE test on a 50MB file. This isn't a great test since it fits entirely inside the server's cache. But I'm not worried about that, I just want to see how fast this code can peddle.

Performance was... not good. Unfortunately, right now I can't tell if that is a side effect of running it on openSUSE 10.2, inside a Dom0 Xen domain. I want to re-run it running the standard kernel and see if the performance also doth suck.
                                   random  random
KB reclen write rewrite read reread
51200 4 3259 3218 3875 4026
51200 8 5144 4925 5404 5215
51200 16 193 196 6980 6788
51200 32 593 599 8371 8309
51200 64 609 594 8213 8437
51200 128 589 590 8399 8432
51200 256 595 587 8551 8491
51200 512 598 595 8528 8439
51200 1024 596 588 8580 8295
51200 2048 594 596 8595 8542
51200 4096 597 609 8258 8548
51200 8192 595 599 8683 8683
51200 16384 608 599 8673 8585
As you can see, performance at record sizes over 8 is not great for writes. Reads, on the other hand, are quite zippy. Still not as zippy as the Windows client, which I've seen get up to 11000 on those tests. But still, zippy. I don't know where the problem is. I did some sniffing to try and figure it out, but nothing really stuck out as a cause. I'm seeing some 160ms delays in ACKs, but they're coming from the server. I can't tell what condition of the client-side write is causing the delayed ACK. Need more testing.

Labels: , ,


Tuesday, September 25, 2007

The perils of a manual process

Yesterday I found the root cause of a rather perplexing problem. We had a user, happily for me one of the other sysadmins at WWU, who couldn't get their eDir password changed. No matter how many times he ran the identity management process, his AD PW would change, but eDir would not even though the success on the event was good.

A word of note:

We do not use Novell Identity Management. We've built our own. When Novell first came out with DirXML 1.0, we already had the foundation of what we have right now. So, when I talk about IDM, I'm actually referring to our own self-built system not Novell's IDM.

To troubleshoot, I ran many tests. The longest one was to turn on dstrace logging on the root replica server, and push changes to the object. I'd push a change, watch the logs, then parse through the log for the user's object.
  • Changing it via LDAP made a sync.
  • Changing it via the IDM did not make a sync.
  • Changing it via iManager made a sync.
  • Changing it via ConsoleOne on the IDM server made a sync
This would point to some flaw in the IDM process. This is unlikely, as the password change logic has been largely unchanged for close to 7 years. The underlying libraries have also been unchanged for close to 3 years. Very unlikely to be that. What it could be, though, is some odd-ball untrapped error.

To figure out what that is, I needed to sniff packets. PKTSCAN to the rescue. On the IDM server I turned off connections to all but the server holding the Master replicas of everything. Then on the master replica server I loaded PKTSCAN. I turned on sniffing, make the change, wait 5 seconds just to be safe, turn off the sniff, save the sniff, and load the sniff in Wireshark. The two cases I tested:
  • Change the concurrent connections attribute through IDM
  • Change the concurrent connections attribute through ConsoleOne on the IDM server
This is what showed my problem. When I did it through IDM, it was attempting to change the Concurrent Connections attribute of T=WWU. Ahem. When I did it through ConsoleOne, it was attempting to change the Concurrent Connections attribute of CN=[username].OU=Users.O=WWU. AHAH!

Looking at the details of T=WWU, I saw that it had an aux class associated with it. It was posixAccount. Thus, was I illuminated.

This particular sysadmin requested to have his account 'turned on for linux'. Which is code for having the posixAccount aux-class associated and the uid, gid, cn, and shell attributes added. This is still a manual process for us since requests are few and far between, though that is changing. It would seem that when I did it, I right-clicked on the wrong object. Whoopsie poo! Easily fixed, though.

I removed the aux-class from the tree root object, and suddenly... IDM changes started applying to the right object! Hooray! I think the IDM code was keying off of commonName rather than CN for some reason, which is why the aux-class got in the way.

Labels: , , ,


Friday, August 24, 2007

openSUSE news

They have a new news-portal site, which is nifty:

http://news.opensuse.org

They had a nice article today about the results of a recent desktop linux survey. Bucky had a nice bit of analysis about it too.

This is nice to see. As I've mentioned before, I'm using openSUSE 10.2 at work (and right now). I can't use SLED because we're not entitled to it, nor will my boss pay for it. That said, I probably could do what a co-worker is doing and run SLES10sp1 instead ;). OpenSUSE now has 10.3 beta 2 out, which I'm not going to test quite yet as I don't have a test system for it; I wish I did have one.

One of the nice things that'll be in 10.3 (or rather, not in) is they're doing away with zmd for updates, and using a libzypp based process instead. This cheers me, as I've had a lot of trouble with the zmd one. It's better than it was in 10.0/1, but still not good.

Also, of course, openSUSE code forms the basis for what'll eventually become SLED.

Labels: ,


Wednesday, July 11, 2007

Novell Client for Linux beta 2.0 release on openSUSE

I have managed to get the new beta of the NCL 2.0 working on my openSUSE 10.2 workstation. This is very nice. Some nice details can be found here in the Novell support groups. My steps were rather similar.
  1. Install the referenced RPM. I did it with an "rpm -i [rpm-name]". Use the RPM for your processor type. I'm using 64-bit and it worked just fine for me.
  2. Run the "ncl-install" from the beta download.
That was pretty much it. It isn't perfect, but it is w-a-y better than using NetStorage and WebDav for this stuff. One area of inperfection is the tray icon gets smooshed.
NWTRAY getting smooshed
See that little sliver of an icon between the magnifying glass and the vertical bar? That's the nwtray icon. It's about 2 pixels wide. If I can click on it I get the full NWTRAY menu just fine, but it's hard to hit.

The other problem is the "Novell Services" button in nautilus. When I click that button, it looks like gnome crashes. I haven't been able to find out where the dump traces are going so I don't know what's up with that. If I access the services from the 2-pixel wide NWTRAY things work just fine, though.

Throughput still sucks, though. The Windows client is still better for that. But... throughput is w-a-y better than using a WebDav connection! Progress!

Labels: , , ,


Friday, June 08, 2007

EVEN MORE NIGH!

From what I can see, it's on the Red Carpet servers now!

SUSE-Linux-Enterprise-SDK-SP1
SUSE-Linux-Enterprise-Server-SP1
SUSE-Linux-Enterprise-Desktop-SP1

I wonder if I can register against it? Hmmm....

Labels: , ,


SLES 10 SP1 is nigh! Nigh!

There are a couple of highly interesting items available for download right now from download.novell.com.

SUSE Linux Enterprise On Microsoft's Virtual Hard Drive
SUSE Linux Enterprise Server (SLES) on Microsoft's Virtual Hard Drive (VHD) is a VHD image file that contains the latest release of SUSE Linux Enterprise Server: SLES 10 SP1 with most available packages already installed. With this image you can run SLES 10 SP1 on your Microsoft Virtual Server 2005 R2.
While not interesting to me in that I don't use VHD, this is reportedly a SLES 10 SP1 build. If they've felt the need to release this, then SP1 must be out really soon.

iFolder 3.4 for SLED 10 SP1

The iFolderTM 3.4 client for SUSE® Linux Enterprise Desktop 10 SP1 enables users to share their local files through a central Novell® iFolder 3.2 server. Users can create multiple iFolders, share each of the iFolders with other users, and specify each member's access right to the iFolder data. Users can also participate in iFolders that others share with them.

The iFolder 3.4 client for SUSE Linux Enterprise Desktop 10 SP1 is available for 32-bit (i586) and 64-bit (x86_64) architectures. The client consists of three modules: iFolder, Nautilus, and Simias. Each of the compressed download files contains the three modules for the specified architecture.

To use the client, the user must also have an iFolder account on a Novell iFolder 3.2 server.

Again, we don't use iFolder, but this is the client that works with SP1. This is available for download right now.

Could we have SP1 out on Monday? Could OES2 (based on SLES10 SP1 code) be out Monday? We shall see, we shall see.

Labels: ,


Monday, April 30, 2007

Quick hits from Linuxfest Northwest

Probably the biggest news, Ted Hagger is leaving Novell. Though the message says "March 24th", he said in session on Saturday, "Well, as of Wednesday I'm no longer with Novell." Which would imply that Tuesday April 24th was his last day. March 24th was the Saturday after BrainShare.

Whoa.

He's moving to a company that's doing Web 2.0 work. As that's soooo not my field, Ted is likely dropping off of my radar. He will not be doing any more Novell Open Audio.

Also, I was inspired by a session on OpenID to do a few things differently around here. I'm not sure we'll become and OpenID provider, but it is within the realm of possibility.

I learned more about Xen virtualization, which is nifty as I need to know that stuff.

Labels: ,


Friday, February 09, 2007

NetStorage and gnome davs::

I just figured out how to get NetStorage to work with openSUSE 10.2! You have to set NetStorage to 'cookieless mode'. Now I have access to my NetWare hosted volumes from openSUSE! It'd be better if I could direct mount them, much more efficient transfer protocols, but this will do until Novell releases the next Client for Linux with SLED10 sp1.

Labels: , ,


Friday, December 22, 2006

Other client news

My earlier message about the open audio podcast also included some information about the Novell Client for Linux. It still isn't quite there yet, and Jason Williams admitted as much in the podcast. However, he did share some news about the next version of the Novell Client for Linux due out with either OES2 or SLED10 sp1.
  • NCL is build using AutoBuild, but a private AutoBuild.
  • The NCL will never be open-sourced because it has to use proprietary third party modules that Novell does not have the rights to open-source. One of the key modules are the RSA modules.
  • The new NCL will be able to auto-detect the kernel being used and will be able to automatically generate the required kernel modules (novfs and others) when kernel version change. Really nifty!
  • Introducing Gnome integration for Novell volumes. They already have the KDE version out. This will allow you to do some of the things you can do right now in Windows Explorer, only on Gnome and KDE.
Some of the key technologies behind the Novell Client on any platform were developed w-a-y back in the day. Like, NetWare 4.0 back in the day. Back when Novell still held the complete rights to Unix(tm). Open Source didn't have nearly the mind-share it does now, so building your infrastructure on proprietary third party components was perfectly OK. The key technology behind how Novell handled NetWare authentication as of the advent NDS was built around RSA-licensed cryptography.

The RSA bits ended up being just a little bit too secure for a heterogeneous environment. For a pure NetWare/NDS environment it was great, arguably the most secure thing around at the time. Since then Novell has realized that password synchronization between different authentication sources is a key technology in and of itself, which required that password encryption be reversible. Which the RSA stuff wasn't. Thus, the introduction of Simple passwords, and ultimately Universal passwords; both of which are less secure than the RSA methods, but still secure enough.

A lot of companies have moved to Universal passwords, but Novell still has to support the RSA-style login methods through the client. Therefore, the RSA-libraries have to be distributed with the client, and that makes it impossible to open-source without RSA's approval. Which isn't coming any time soon. There may be other 3rd party licensed technology in the client, but I can't remember who else may be involved.

It may be possible to create a true open-source client that just doesn't speak the proprietary protocols, though I strongly suspect that doing so will require significant reconfiguration of Open Enterprise Server. If not significant re-engineering. This is a legacy issue.

Labels: , ,


Monday, November 06, 2006

Vendor lock-in, my reality

Freedom from vendor lock-in is one of the louder themes in the criticism in this post Novell/Microsoft-deal world. Free and Open Source Software (FOSS, learned that one this week) is all about avoiding vendor lock-in. Linux comes in the wide flavors it does so that vendor lock-in doesn't have to happen.

Here is how I view vendor lock-in as a part of my job. This is coming from a system administrator who manages in a largish regional enterprise. Both this job and OldJob didn't have significant WAN links, but we did have well over 1000 users to play with.

Both my old job and my new job saw significant savings by sole-sourcing server hardware. OldJob was a Dell shop pretty completely, with a very, very few Compaq hold-outs. Here in Technical Services we're a very solid HP shop, with some Dell thrown in as we inherit systems from other departments. Both old and new jobs had some Solaris around for various things. At OldJob we didn't use OpenManage much at all. Here, we're not really using OpenView.

Both companies realized that whiteboxing, building servers from component parts and supporting things internally, was a losing idea at the scales both jobs built out at. Having a single vendor support us for server hardware eased support costs and made the internal knowledge-base required to support our hardware much less complex.

OldJob used to be a Compaq shop, but dropped it in favor of Dell due to and I quote directly, "all of that proprietary crap." The fact that Dell was a lot cheaper at the time also helped significantly. As I understand it, Technical Services was that rarity, a HP shop. The Compaq merger was viewed with some dismay in certain parts. This is why we have some HP LH3's stacked against one wall.

As for operating systems, OldJob and NewJob had different ideas. One of the reasons I left OldJob was a decision from the top of IT management to unify on a single Directory; Microsoft's. How that would interact with GroupWise, which was VERY firmly established in the enterprise and not being moved out, was a complete mystery to us techs who would eventually be asked to make it work. Accompanying this decision was the directive to start migrating the NetWare servers out in favor of Microsoft Windows. They were reducing support costs by reducing the number of OSs in the datacenter.

WWU works differently. They've had a 'best of breed' philosophy for services for some time, and NetWare is still king for File Serving performance. Solaris was king for Oracle for a long time, though that throne is in a bit of question at the moment. Linux has started creeping in as an application server a few places, so far just web-servers (the web-server that drives the time-card web front-end is Linux, with a Solaris/Oracle back end).

One of the differences between my old environment and this environment is budgetary. OldJob had a strict three year replacement cycle for servers, that they were thinking of extending to 3.5 or 4 yeaers to better handle server swaps (3-6 month stage up and migration, 3-6 month demigration and stage down). WWU plans on keeping servers in some form of production for 5 years, though the last one to two years may be as a dev server rather than mainline production. The same sort of pressures happen on the software we use on the servers, as we're more likely to use OSS than we were at my old job.

That said, when it comes to package management on the Solaris and Linux servers we rarely go outside of the vendor for packages. I know we compile Samba from source on Solaris, but then we're using an old Solaris version that doesn't do what we need out of the box very well. If it isn't on the Solaris or SLES9 repositories (we don't have any SLES10 in yet), we probably won't deploy on that.

Our OS vendors are Microsoft, Novell, and Solaris. Other departments use Gentoo, Slackware, and I'm sure some Ubuntu and Red Hat.

All that said, we do deal with, and embrace, vendor lock-in. We do it in hardware. We do it in software, as our Help Desk software only runs on a Windows/IIS stack (though can do MSSQL or Oracle as a DB source). We do it in file serving OS, as it would be a royal pain to use anything but an NCP server from Novell on NetWare (even OES-Linux presents some problems). We do it in small-office databases, which are all housed on an MS-SQL server (we can do oracle, but it is a pain and the processes are not well established or documented).

We also embrace diversity. Our HelpDesk app requires Windows/IIS, but our student email web-access app required Apache/PHP. Web-serving is perhaps our most diversified environment, as we are widely deployed in both Apache and IIS based technologies. Several main line applications are java driven, others are driven by ASP, still others use PHP.

When it comes to an OS we'll use on mainline applications, we have to have enterprise support. This means buying our OS from a firm that can provide 24x7x365.25 support. This reduces the number of firms we can purchase Linux support from. Additionally, we need some form of patch management system. NetWare is the least well supported when it comes to patch management, as it is a 100% manual process, happily it doesn't need a lot of attention. We won't do NetWare-style patch management on Linux, as that method doesn't scale. In effect we're limited to Red Hat or Novell for Linux, and as we already have a contract with Novell for NetWare support that's where we went. Furthering our lock-in with Novell.

So vendor lock in is really a fact of life here. I appreciate the fact that I can toss a Slackware server in amongst the servers and have a fair chance of making it work if I really need it to. But that'll be emergency processing, not standard procedure.

I'll leave what the Novell/Microsoft deal means in terms of the viability of Linux in the enterprise for another post.

Tags: ,

Labels: ,


This page is powered by Blogger. Isn't yours?