August 2005 Archives

Netware & SFTP

It doesn't look like OpenSSH on NetWare can do public-key authentication! Looking at the debug files, I'm not sure it knows where to find 'authorized_keys' in userland at that point in the checking. It only does the lookup for the end-user's environment after authentication is completed, when it is setting up the environment. There are log-lines that point to it checking...

31 Aug - 10:49:35[0031427875] <137> debug1: userauth-request for user riedesg service ssh-connection method publickey
31 Aug - 10:49:35[0031427875] <137> debug1: attempt 1 failures 1
31 Aug - 10:49:35[0031427875] <137> debug2: input_userauth_request: try method publickey
31 Aug - 10:49:35[0031427875] <137> debug1: test whether pkalg/pkblob are acceptable
31 Aug - 10:49:35[0031427875] <137> debug1: trying public key file /.ssh/authorized_keys
31 Aug - 10:49:35[0031427875] <137> debug1: trying public key file /.ssh/authorized_keys

But it doesn't seem to find it. Later on in the log-files, you get this:

31 Aug - 10:49:38[0031427932] <0> debug3: authorize_sftp(riedesg) calling create_identity(NULL, 'cn=riedesg<context>', <password>, NULL, XPORT_TCP, 71c4940c)
31 Aug - 10:49:38[0031427941] <0> debug3: authorize_sftp(riedesg) create_identity succeeded identity = 10001.
31 Aug - 10:49:38[0031427941] <0> debug3: authorize_sftp(riedesg) calling NXCreatePathContext(NULL, '<homedir>', NX_PNF_NKS, 10001, 71c49410)
31 Aug - 10:49:39[0031427951] <0> debug3: authorize_sftp(riedesg) NXCreatePathContext succeeded.
31 Aug - 10:49:39[0031427951] <0> debug3: authorize_sftp() setcwd succeeded for riedesg. User authenticated. rc = 0

Which is the point where it actually creates a connection to the home-directory with intent to access files. Since this step happens AFTER authentication, I must presume that it can't access said home-directory before auth. It clearly connects to the remote resource with the user's credentials, and those don't exist before authentication.

So it looks like public-key authentication can't be done with NetWare. At least, not with end-user-supplied keys. It may be possible with system-configured keys, but my ssh-fu is too weak to try that out right now.
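For what it's worth, here's the sort of quick probe I'd run from the client side to confirm the server's behavior. This is just a sketch in Python using the paramiko library, nothing NetWare-specific; the hostname and key path are placeholders, and it offers exactly one key and nothing else so the server's yes/no answer is unambiguous.

import paramiko

HOST = "netware-cluster.example.edu"   # placeholder hostname
USER = "riedesg"
KEY = "/home/riedesg/.ssh/id_rsa"      # placeholder key location

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
try:
    # Offer ONLY this key: no agent, no password, no other identity files.
    client.connect(HOST, username=USER, key_filename=KEY,
                   allow_agent=False, look_for_keys=False)
    print("public-key auth accepted")
except paramiko.AuthenticationException:
    print("public-key auth rejected -- sshd never found an authorized_keys")
finally:
    client.close()

If the connect() succeeds, the server found and honored authorized_keys; if it throws AuthenticationException, you're back to passwords.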

FTP is turned off

The last user has been talked to. The last testing has been done. We're turning off FTP to the Novell cluster as of this morning, and only allowing sftp access.

Cool tip accepted!

The Cool Solutions folk have accepted my tip for getting rid of the tilde in URLs. It got published and everything! Yay!

Fantastic future

So. What WOULD a GW7 installation look like here at WWU?

From a hardware point of view, that would require three servers.

From a software point of view, that would require three additional cluster licenses for Netware.

Our Exchange environment has seven mail stores. Those stores represent the three mail quota policies we have in place: Normal, Large, and Admin. Normal and Large have two and one store, respectively, on each Exchange server. Admin is for us admin types and the very exalted few who get Unlimited quota. Each of the Normal stores has around 900-odd users in it. The Large stores have vastly fewer than that.

If you roll the Normal and Large stores together, since GW doesn't have any sort of storage quota system, you get around 1000 users per store. Since I'm having trouble finding sizing recommendations for GW7 for users-per-POA, I'm having to fake it. I know I heard that a POA with users running in cached mode can support 10,000 users. I also know that a large percentage of our users aren't using cached mode yet (in Outlook, not GroupWise, of course); users fear change, so most haven't switched over.
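To make sure I'm not fooling myself on the sizing, here's the back-of-the-envelope arithmetic as a few lines of Python. The store counts and the ~900-users figure come from above; the two-Exchange-server count is implied by the store layout, and the 100-users-per-Large-store number is purely my guess at "vastly less."

# Store counts and ~900 users per Normal store are from the paragraphs above.
normal_stores = 4              # two Normal stores on each of two servers
large_stores = 2               # one Large store on each server
users_per_normal = 900
users_per_large = 100          # ASSUMPTION: "vastly less" -- a round guess

regular_users = normal_stores * users_per_normal + large_stores * users_per_large
poas = 4
print(regular_users, "regular users, about", regular_users // poas, "per POA")
# -> 3800 regular users, about 950 per POA

So even if the 10,000-per-POA number only holds for cached-mode users, four POAs at roughly a thousand users each leaves a lot of headroom.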

Thanks to the new features in GW7, we'd only need one MTA. GWIA, WebAccess, and the like can coexist with users in a domain. Which is good, since that's what they're doing now in Exchange.

So we'd need to configure 4 POAs under one MTA. Each of the five services would be its own clustered service, so it could live on any of the three dedicated GroupWise cluster-nodes. All five could host on one node, or not. A server with 4GB of RAM could handle all five, though I don't give any guarantees for speediness during the overnight User Maintenance and Reindexing runs.

Since the MTA would be responsible for handling all inter-POA traffic, it should reside on a node with fewer POAs on it. This will take a bit of tweaking to see how it works in the environment, but isn't very hard once we get users on the system.

Oh look, GW7 released

I see that GroupWise 7 has been released. I wonder if the migration from GW5.5 to GW6.5 is complete at OldJob? Probably not, all things considered.

After digging through the GW7 documentation, I've noticed a few things (not the full list of fixes/improvements, just ones I find interesting):
  • Still no functionality that looks like Exchange quotas. Just Expire/Reduce, same as we had in GW5 and earlier.
  • Still the same WPCSIN/WPCSOUT message flow
  • Still the same database back end
  • Much better integration with Outlook than before
  • Exchange to Groupwise migration tool actually exists, nifty
  • End-users can edit distribution lists, nifty
  • All-day events in the calendar are supported
  • Can set your default read/write views, long overdue
  • GWIA now can talk to the MTA by TCP/IP, instead of file-level access. Double nifty.
  • Configurable deferred message processing
  • All agents now support explicit binds to IP addresses, rather than the generic 0.0.0.0 bind. Very nifty if a backup network is in use.

I wonder if the Groupwise client still renders HTML content in the Internet zone? That sucked. Outlook has had the option of rendering HTML content (if told to render it at all) in the Restricted Zone for several years now. I know I submitted an enhancement request or three over the years at OldJob for just that ability. I hope it made it into this version.

However, hopes of a migration of WWU to GW are the stuff of pipe-smoke:
  • No user-quota implementation. I don't care if it 'doesn't need it'. The Quota meme is so ingrained into TPTB that if it doesn't have quota management, with built-in consequences for bashing your head into it, it doesn't have the ability to manage storage growth. No amount of automated stats runs against the GW databases and nag-mails preaching the religion of archiving will help. 3rd-party solutions that fix this just prove that GroupWise is not mature enough. Or so goes TPTB. Clearly, this is a deal breaker.
  • It doesn't use a recognized high capacity database for mail storage. T'ain't relational at the guts, cain't support lots of users. Period. Another psychological barrier.
  • No concept of 'public folders'. Public Folders are an abomination, but sadly we have them and have to support them.
  • Our current mail administrator had a bad experience with GroupWise at a previous job. Which, really, is the real kiss-of-death. Except for this one thing, he's a strong Novell supporter.
  • GroupWise seems to run best on NetWare. Sadly, it does seem that NetWare is getting less development attention. The AV support for GroupWise does not come from any of the Big Three. So the AV will have to come from an in-line appliance of some form.
There are a couple of features of GroupWise that I wistfully miss:
  • GroupWise clusters can go Active/Active. Every major Exchange downtime we've had in the last year can be attributed to failures of Exchange's cluster model and how we had to put it together. Resources permitting, GroupWise has the ability to have two services cohabitate quite happily.
  • Resources. A simple concept, but sadly missed in Exchange. Public Folders are close, but not it. What do you do for the group that wants to send mail from "admissions@wwu.edu"? A Resource would formalize that function in a way we can't do in Exchange.

More security fun in .edu-land

Two things.

FrSIRT announces a vulnerability in BackupExec Remote Agent that currently (as of this posting) has no patch. This will be a problem! Mark my words.

And next, from a SANS mailing I get:
Editor's Note (Pescatore): There has been a flood of universities acknowledging data compromises and .edu domains are one of the largest sources of computers compromised with malicious software. While the amount of attention universities pay to security has been rising in the past few years, it has mostly been to react to potential lawsuits due to illegal file sharing and the like - universities need to pay way more attention to how their own sys admins manage their own servers.
Hi, that's me. As I covered a couple of days ago, we have some challenges that corps don't have. For one, we have no firewall, just router filtering rules. And today I learned more about our security posture campus-wide.

It seems the buildings have some pretty restrictive filters on them at the router level, but our servers don't have much at all. This seems to be driven by a need to be good netizens rather than a need to prevent security intrusions. End-user systems are hideously hard to patch, spyware is rampant, and it doesn't take much to turn a WinXP machine that someone isn't paying attention to into a botnet drone.

Servers, on the other hand, are professionally managed. We pay attention to those. Security is a priority very near the top! Therefore, we don't have to be as strict (from a network point of view) with them as we do end-user systems.

Because of the firewall-free nature of the vast majority of our datacenter (more on that later), any application we buy that runs on a server has to be able to run in a hostile network. This has caused real problems. A lot of programs assume that the datacenter is a 'secure' environment and that hackers will be rattling door-knobs very infrequently. BackupExec comes to mind here. Add into that independent purchase authority, and you get departments buying off-the-shelf apps without considering their network security requirements in the context of the WWU environment.

Every single server I've had to unhack since 1/1/2005 has been due to:
  • Non-Microsoft patches that got missed (Veritas)
  • Microsoft patches that didn't get applied correctly as part of our standard update procedure. This is the classic, "the dialog said 'applied', but it really wasn't," problem.
  • Zero-day exploits (Veritas, others) where the vulnerability is not formally acknowledged by the vendor
This is the point where I say that Microsoft is no longer the bad boy of the bunch. Their patching process and built-in tools are rich enough that they're no longer the #1 vector of attack. Yes, these are all on Windows, but it isn't Windows getting hacked most of the time. It's the apps that sit on it. We have now hit the point where we expect Windows Update-like updating for all of our apps, and forget to check vendor advisories weekly.

Heck, weekly is too long! Take this new Remote Agent exploit. When the last Remote Agent vulnerability was patched in June, exploits started circulating less than 6 days after the patch was made available. We took 9 days to apply it, since it needed reboots. Too long!

We now have to have a vendor and a patch-source for each and every program installed on a server. And even that isn't enough. Take HP. They just announced several bugs in their Server Management products, but I saw the notice on Bugtraq, not in any notification from HP. They offer a wide enough variety of programs that it is difficult to determine whether the broken bits are the bits I installed on my servers or if I'm safe.

We have a Tuesday night regular downtime arranged so we can get the MS patches in. For things like the Veritas Remote Agent, we'd have to apply a patch 'out of cycle', and that's tough. It took 6 days for the last Remote Agent bug to lead to hacked servers. For something like this, where there may already be a Metasploit widget created, we need to apply ASAP after the patch releases. So a weekly patch-application interval is no longer good enough; we need to be able to do it in 24 hours.

Presuming we are even aware the patch exists in the first place. From the same e-mail:
Editor's Note (Paller): Sadly many of the people who bought BrightStor packages have no idea the vulnerability exists. Computer Associates, like other larger vendors, sold through resellers to customers who never bothered to register. Those organizations, large and small, are at extreme risk and are completely unaware of the risk.
Which is precisely the problem. Heck, we're registered with Veritas and HP, but we were not notified of the recent problems. We had to find them out for ourselves. This is why auto-patching products that come with patch-feeds charge such extortionate amounts of money. It is ALMOST worth it to pay 'em.
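In the absence of vendor notification, the poor-man's version is to poll the advisory feeds ourselves. Here's a minimal sketch in Python of what I mean; the feed URL and the watch-list of product names are placeholders, and a real setup would watch one feed per vendor plus Bugtraq.

import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.com/security-advisories.rss"      # placeholder feed
WATCHLIST = ["backup exec", "brightstor", "insight manager"]  # products we run

with urllib.request.urlopen(FEED_URL) as resp:
    tree = ET.parse(resp)

# Standard RSS 2.0 layout: <rss><channel><item><title/><link/>...</item>
for item in tree.getroot().findall("./channel/item"):
    title = (item.findtext("title") or "").strip()
    if any(word in title.lower() for word in WATCHLIST):
        print("ADVISORY:", title, item.findtext("link") or "")

Run something like that from cron every few hours and mail yourself the output, and you've at least closed the 'nobody told us' gap, if not the patch-application one.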

Really, we're like an ISP that has far more official responsibility over the machines on our network. A traditional ISP has terms of service and a pretty 'hey, whatever' attitude, and then hardens the crap out of its own internal servers. We have to run business apps in an ISP environment. And if one of our workstations gets hacked and becomes a drone that participates in a DoS, we get sued, not the owner of the PC (...which is... us... unlike an ISP).

A final case in point and then I'll sign off. We recently reviewed a point-of-sale application that an organization on campus will be using. It took about 45 seconds after the presentation began before we identified the glaring hole in their security setup. Sadly, this product was already purchased, and apparently a lot of other higher eds use it too. We just get to try to minimize the hole however we can, without actually fixing it.

Netware smugness

ftp: 24337981440 bytes sent in 1459.78Seconds 16672.35Kbytes/sec.

24 minutes, 19 seconds

That is FTPing a whonking big file up to a NetWare server, over NWFTPD. The two devices were on the same GigE switch, though on different VLANs. This over-the-network copy took about as long as copying the same-size file to a different disk partition on a Windows 2003 server. For comparison, copying to another Windows server with 100Meg Ethernet at the other end would have taken around 7 hours. If I had another Windows server on the same GigE switch with that kind of space kicking around, I'da done it just to see the speeds.
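For the curious, the arithmetic behind those numbers checks out. A quick sanity-check in Python (the only thing I'm inferring here is that NWFTPD counts 1000-byte kilobytes, since that's what reproduces its reported rate):

size_bytes = 24337981440
seconds = 1459.78

print(round(size_bytes / 1000 / seconds), "KB/sec")         # ~16672, matches the log line
print(int(seconds // 60), "min", int(seconds % 60), "sec")  # 24 min 19 sec

# What effective rate would the "around 7 hours over 100Meg" estimate imply?
print(round(size_bytes / (7 * 3600) / 1024 / 1024, 1), "MB/sec")  # ~0.9

So the seven-hour estimate corresponds to a bit under a megabyte per second of effective throughput, well below the raw 100Mbit line rate, which presumably reflects real-world Windows copy performance rather than wire speed.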

The Netware server in question took the packets no problem. CPU jumped to about 15% during the transfer, which wasn't good, but it kept up. No ECB problems, either.

Exchange issues

We are having them. We know.

Security challenges in .edu land

The Internet Storm Center from SANS has a handler diary that I read daily. For a reason. It is good stuff. Today's entry had a list of things you want to do to protect a corporate network from botnet invasion. These are basic principles, and have a lot going for them.

Unfortunately, in a .EDU environment, especially in higher ed where 'information shall be free' is rampant, these things aren't possible in a lot of cases. I'm going to take each recommendation and whine about why we can't do that. Poor us.

Centralize network egress
This we DO do. We have one pipe to the Internet, and a backup pipe in case it goes down. All of our border monitoring is done at this pipe. Such as it is.

Employ Egress filtering
We're an educational institution. God alone knows what all is going on among the faculty when they work with other faculty. We have an active Computer Science department, and that means we have weird stuff on our network all the time. By only permitting outbound traffic that we know we want, we'll block the stuff we don't know about but is critical to some class somewhere. Or something. No, we don't do Egress filtering. But we do ingress filtering! That's the same... right?

Centralize your logging
That would require having such devices. But I'm kidding. I'm not familiar with what our telecom section does for these devices, but I know they do something. I've not had to look up VPN user info, so I don't know how they do that. But I have had them look up netflow data, which they've spent some time working on to make easier to use.

Deploy Intrusion Sensors
We'd like to. But that would be spying. And we can't have that. We keep trying to put up a honeynet somewhere, but we keep lacking the time.

Establish flexible routing control
That we do have, though getting Telecom to throw in strange routes like null-routes to addresses is like moving rock uphill. Much easier to block the bad IP at the border router. Happily, we've not yet had a botnet controller on our network, just drones.

DNS - Blackholes, Poisoning and Reporting
This is a new one to me. I know our primary DNS is provided by BIND, and is managed by hand by our trio of UNIX admins. We do have dDNS in some form, but I'm again not all that familiar with how it works since it is over there in UNIX land, and I don't speak that language officially. Logging DNS queries would also be spying. And we can't have that.

A lot of the problem stems from the fact that Educational networks work much closer to the ISP model than the Corporate model. I keep advocating for a firewall around our datacenter, but we run smack into performance fears whenever it comes up. Failing a firewall, getting router filters set up would do about 90% of what we'd need done on such a device.

Usage

This is a graph of web-server sessions for the Student side of myweb:

As you can see, it does get some traffic. This graph is from the start of Spring quarter to last Friday, so it includes summer traffic. You can see how the traffic gently increases over time, before summer hits and the hit-rate goes steady.

I know from looking at the logs that CS101 and CS102 use myweb for class. Looking at the hourly stats shows that CS102 was held at 11-noon.

Comparatively speaking, this traffic is nothing special. The web-pages on Titan get more traffic than these do. Though I do expect this will change in time.

SummerStart

SummerStart was this weekend. This is the time when incoming freshmen have the chance to check out our campus and get briefed on things. A lot of them activate their IT accounts, so we need to have our infrastructure in place to handle it. This is the only weekend of the year where we have a formal on-call schedule. I'll not get into the politics of that.

But the weekend went well! I didn't get a call.

But this morning, the main site at www.wwu.edu and everything hosted on that webserver is down. Not something I can fix, since that's a Solaris thing and I'm not a Solaris kinda guy. All of our stuff is behaving this morning.

Intermapper

We had an internal presentation on Intermapper today.