Recently in exchange Category

A while back we managed to push through some new purchasing rules that required IT review of any IT technology purchases. This is needed, since end-user departments haven't the first clue what'll work with our existing infrastructure, and it helps us advise them of complications. For instance, if a product requires PHP on IIS for some reason, we really want to be able to let them know before they purchase that doing so will require a server purchase as well since we don't support that environment currently.

Unfortunately, a small number of things still slip through. Perhaps we didn't read the manuals enough. Perhaps a high enough manager expended sufficient political capital to Make It So. But complications can arise when we go to make the new thingy work.

A case in point:

For the last two weeks I've been attempting to get a certain package up and running that has email capabilities. This has to fit within our Exchange system, which is a rather common environment. What isn't so common, it seems, is our insistence on secure protocols for authentication. While Exchange 2007 is perfectly willing to support naked POP3 and even naked SMTP-Auth, we, on the other hand, are not so forgiving. We wisely have a security standard in place that says that all authentication traffic must be encrypted, and this prevents us from running POP3 and SMTP in a way that allows passwords in the clear.

This package has support for one SSLed service: POP3-SSL. We don't support POP3 since our users were forever screwing themselves thanks to the default of "Delete on retrieval" in most mailer clients, which kind of pissed them off when they got to the office the next morning and their mailbox was empty.

Thanks to the use of stunnel I was able to tunnel unencrypted IMAP to Exchange's IMAP-SSL port at least, so that channel got working.

Right now I'm trying to convince stunnel and the application to work together to get SMTP-TLS working. Sadly for me, I have to wait a couple of hours before the app attempts an SMTP check for me to see if it works.

On the 'up' side, we're charging this department by the hour to get this set up. So the labor bill on this will be fairly high.

It's the little things

| 1 Comment | No TrackBacks
My attention was drawn to something yesterday that I just hadn't registered before. Perhaps because I see it so often I didn't twig to it being special in just that place.

Here are the Received: headers of a bugzilla message I got yesterday. It's just a sample. I've bolded the header names for readability:
Received: from ExchEdge2.cms.wwu.edu (140.160.248.208) by ExchHubCA1.univ.dir.wwu.edu (140.160.248.102) with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:10 -0700
Received: from mail97-va3-R.bigfish.com (216.32.180.112) by
ExchEdge2.cms.wwu.edu (140.160.248.208) with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:09 -0700
Received: from mail97-va3 (localhost.localdomain [127.0.0.1]) by mail97-va3-R.bigfish.com (Postfix) with ESMTP id 6EFC9AA0138 for me; Tue, 15 Sep 2009 20:58:09 +0000 (UTC)
Received: by mail97-va3 (MessageSwitch) id 12530482889694_15241; Tue, 15 Sep 2009 20:58:08 +0000 (UCT)
Received: from monroe.provo.novell.com (monroe.provo.novell.com [137.65.250.171]) by mail97-va3.bigfish.com (Postfix) with ESMTP id 5F7101A58056 for me; Tue, 15 Sep 2009 20:58:07 +0000 (UTC)
Received: from soval.provo.novell.com ([137.65.250.5]) by
monroe.provo.novell.com with ESMTP; Tue, 15 Sep 2009 14:57:58 -0600
Received: from bugzilla.novell.com (localhost [127.0.0.1]) by soval.provo.novell.com (Postfix) with ESMTP id A56EECC7CE for me; Tue, 15 Sep 2009 14:57:58 -0600 (MDT)
For those who haven't read these kinds of headers before, read from the bottom up. The mail flow is:
  1. Originating server was Bugzilla.novell.com, which mailed to...
  2. soval.provo.novell.com running Postfix, who forwarded it on to Novell's outbound mailer...
  3. monroe.provo.novell.com, who attempted to send to us and sent to the server listed in our MX record...
  4. mail97-va3.bigfish.com running Postfix, who forwarded it on to another mailer on the same machine...
  5. mail97-ca3-r running something called MessageSwitch, who sent it on to the internal server we set up...
  6. exchedge2.cms.wwu.edu running Exchange 2007, who send it on to the Client Access Server...
  7. exchhubca1.univ.dir.wwu.edu for 'terminal delivery'. Actually it went on to one of the Mailbox servers, but that doesn't leave a record in the SMTP headers.
Why is this unusual? Because steps 4 and 5 are at Microsoft's Hosted ForeFront mail security service. The perceptive will notice that step 4 indicates that the server is running Postfix.

Postfix. On a Microsoft server. Hur hur hur.

Keep in mind that Microsoft purchased the ForeFront product line lock stock and barrel. If that company had been using non-MS products as part of their primary message flow, then Microsoft probably kept that up. Next versions just might move to more explicitly MS-branded servers. Or not, you never know. Microsoft has been making placating notes towards Open Source lately. They may keep it.
Remember this from a month ago? As threatened in that post I did go ahead and call Microsoft. To my great pleasure, they were able to reproduce this problem on their side. I've been getting periodic updates from them as they work through the problem. I went through a few cycles of this during the month:

MS Tech: Ahah! We have found the correct regex recipe. This is what it is.
Me: Let's try it out shall we?
MS Tech: Absolutely! Do you mind if we open up an Easy Assist session?
Me: Sure. [does so. Opens sends a few messages through, finds an edge case that the supplied regex doesn't handle]. Looks like we're not there yet in this edge case.
MS Tech: Indeed. Let me try some more things out in the lab and get back to you.

They've finally come up with a set of rules to match this text definition: "Match any X-SpamScore header with a signed integer value between 15 and 30".

Reading the KB article on this you'd think these ORed patterns would match:
^1(5|6|7|8|9)$
^2\d$
^30$
But you'd be wrong. The rule that actually works is:
(^1(5$|6$|7$|8$|9$))|(^2(\d$))|(^3(0$))
Except if ^-
Yes, that 'except if' is actually needed, even though the first rule should never match a negative value. You really need to have the $ inside the parens for the first statement, or it doesn't match right; this won't work: ^1(5|6|7|8|9)$. The same goes for the second statement with the \d$ constructor. The last statement doesn't need the 0$ in parens, but is there to match the pattern of the previous two statements of having the $ in the paren.

Riiiiiight.

In the end, regexes in Exchange 2007 Transport Rules are still broken, but they can be made to work if you pound on them enough. We will not be using them because they are broken, and when Microsoft gets around to fixing them the hack-ass recipes we cook up will probably break at that time as well. A simple value list is what we're using right now, and it works well for 16-30. It doesn't scale as well for 31+, but there does seem to be a ceiling on what X-SpamScore can be set to.
Exchange 2007 supports a limited set of Regular Expressions in its transport-rules. The Microsoft technet page describing them is here. Unfortunately, I believe I've stumbled into a bug. We've recently migrated our AntiSpam to ForeFront. And part what ForeFront does is header markup. There is a Spamminess number in the header:
X-SpamScore: 66
That ranges from deeply negative to over a hundred. With this we can structure transport-rules to handle spammy email. In theory, the following trio of regexes should catch anything with a score of 15 or higher:
^1(5|6|7|8|9)$
^(2|3|4|5|6|7|8|9)\d$
^\d\d\d$
Those of you that speak Unix regex are quirking an eyebrow at that, I know. Like I said, Microsoft didn't do the full Unix regex treatment. The "\d" flag, "matches any single numeric digit." The parenthetical portion, "Parentheses act as grouping delimiters," and, "The pipe ( | ) character performs an OR function."

Unfortunately, for reasons that do not match the documentation the above trio of regexes is returning true on this:
X-SpamScore: 5
It's the second recipe that's doing it, and it looks to be the combination of paren and \d that's the problem. For instance, the following rule:
^\d(6|7)$
returns true for any single numeric value, but returns false for "56". Where this rule:
^5(6|7)
only returns true for 56 and 57. To me this says there is some kind of interaction going on between the \d and the () constructors that's causing it to change behavior. I'll be calling Microsoft to see if this is working as designed and just not documented correctly, or a true bug.
We missed a step or something in decommissioning our Exchange2003 servers. As a result, we have a whole lot of... stuff going 'unresolvable' due to how Outlook and Exchange work. There is an attribute on users and group called LegacyExchangeDN. Several processes store this value as the DN of the object. If that object was created in Exchange 2003 (or earlier) it's set to a location that no longer exists.

The fix is to add an X500 address to the object. That way, when the resolver attempts to resolve that DN it'll turn up the real object. So how do you add an X500 address to over 5000 objects? Powershell!

$TargetList = get-distributiongroup grp.*

foreach ($target in $TargetList) {
$DN=$target.SamAccountName
$Leg=$target.LegacyExchangeDN
$Email=$target.EmailAddresses
$Has500=0

if ($leg -eq "/O=WWU/OU=WWU/cn=Recipients/cn=$DN") {
foreach ($addy in $Email) {
if ($addy -eq [Microsoft.Exchange.Data.CustomProxyAddress]("X500:" + $Leg)) {
$Has500=1
}
}
if ($Has500 -eq 0) {
$Email += [Microsoft.Exchange.Data.CustomProxyAddress]("X500:" + $Leg)
$target.EmailAddresses = $Email
set-distributiongroup -instance $target
write-host "$DN had X500 added" -foregroundcolor cyan
} else {
write-host "$DN already had X500 address" -foregroundcolor red
}
} else {
write-host "$DN has correct LegacyExchangeDN" -foregroundcolor yellow
}
}


It's customized for our environment, but it should be obvious where you'd need to change things to get it working for you. When doing users, use "get-mailbox" and "set-mailbox" instead of "get-distributiongroup" and "set-distributiongroup". It's surprisingly fast.

My contribution to the community!

ForeFront and spam

| No Comments | No TrackBacks
They have an option to set a custom X-header for indicating spam. The other options are subject-line markup and quarantine on the ForeFront servers. What they never document is what they set the header to. As it happens, if the message is spam it gets set like this:
X-WWU-JunkIt: This message appears to be spam.
Very basic. And not documented. Now that we know what it looks like we can create a Transport Rule that'll direct such mail to the junk folder in Outlook. Handy!
Back when I worked in a GroupWise shop, we'd get the occasional request from end users to see if Outlook could be used against the GroupWise back-end. Back in those days there was a MAPI plug-in for Outlook that allowed it to talk natively to GroupWise and it went rather well. Then Microsoft made some changes, Novell made some changes, and the plug-in broke. GW Admins still remember the plug-in, because it allowed a GroupWise back end to look exactly like and Exchange back end to the end users.

Through the grapevine I've heard tales of Exchange to GroupWise migrations almost coming to a halt when it came time to pry Outlook out of the hands of certain highly placed noisy executives. The MAPI plugin was very useful in quieting them down. Also, some PDA sync software (this WAS a while ago) only worked with Outlook, and using the MAPI plugin was a way to allow a sync between your PDA and GroupWise. I haven't checked if there is something equivalent in the modern GW8 era.

It seems like Google has figured this out. A plugin for Outlook that'll talk to Google Apps directly. It'll allow the die-hard Outlook users to still keep using the product they love.

Explaining email

| 1 Comment | No TrackBacks
In the wake of this I've done a lot of diagramming of email flow to various decision makers. Once upon a time, when the internet was more trusting, SMTP was a simple protocol and diagramming was very simple. Mail goes from originator, through one or more message transfer agents (or none at all and deliver directly, it was a more trusting time back then) to its final home, where it gets delivered. SMTP was designed in an era when UUCP was still in widespread use.

Umpteen years later and you can't operate a receiving mail server without some kind of anti-spam product. Also, anti-spam has been around as an industry long enough that certain metaphors are now ingrained in the user experience. Almost all products have a way to review the "junk e-mail". Almost all products just silently drop virus-laden mail without a way to 'review' it. Most products contain the ability to whitelist and blacklist things, and some even let that happen at the user level.

What this has done is made the mail-flow diagram bloody complicated. As a lot of companies are realizing the need to have egress filters in addition to the now-standard ingress filters, mail can get mysteriously blocked at nearly every step of the mail handling process. The mail-flow diagram now looks like a flow-chart since actual decisions beyond "how do I get mail to domain.com" are made at many steps of the process.

The flow-charts are bad enough, but they can be explained comprehensibly if given enough time to describe the steps. What takes even more time is when deploying a new anti-spam product and deciding how much end-user control to add (do we allow user-level white lists? Access to the spam-quarantine?), what kinds of workflow changes will happen at the helpdesk (do we allow helpdesk people access to other people's spam-quarantine? Can HD people modify the system whitelist?), or overall architectural concerns (is there a native plugin that allows access to quarantine/whitelist without having to log in to a web-page? If the service is outsourced (postini, forefront) does it provide a SSO solution or is this going to be another UID/PW to manage?).

And I haven't even gotten to any kind of email archiving.
As mentioned in the Western Front, we're finally migrating students to the new hosted Exchange system Microsoft runs. They've since changed the name from Exchange Labs to OutlookLive. It has taken us about two quarters longer than we intended to start the migration process, but it is finally under way.

Unfortunately for us, we got hit with a problem related to conflicting mail priorities. But first, a bit of background.

ATUS was getting a lot of complaints from students that the current email system (sendmail, with SquirrelMail) was getting snowed under with spam. The open-source tools we used for filtering out spam were not nearly as effective as the very expensive software in front of the Faculty/Staff Exchange system. Or much more importantly, were vastly less effective than the experience Gmail and Hotmail give. Something had to change.

That choice was either to pay between $20K and $50K for an anti-spam system that actually worked, or outsource our email for free to either Google or Microsoft. $20K.... or free. The choice was dead simple. Long story short, we picked Microsoft's offering.

Then came the problem of managing the migration. That took its own time, as the Microsoft service wasn't quite ready for the .EDU regulatory environment. We ran into FERPA related problems that required us to get legal opinions from our own staff and the Registrar relating to what constitutes published information, which required us to design systems to accommodate that. Microsoft's stuff didn't make that easy. Since then, they've rolled out new controls that ease this. Plus, as the article mentioned, we had to engineer the migration process itself.

Now we're migrating users! But there was another curveball we didn't see, but should have. The server that student email was on has been WWU's smart-host for a very long time. It also had the previously mentioned crappy anti-spam. Being the smart-host, it was the server that all of our internal mail blasts (such as campus notifications of the type Virginia Tech taught us to be aware of) relayed through. These mail blasts are deemed critical, so this smart-host was put onto the OutlookLive safe-senders list.

Did I mention that we're forwarding all mail sent to the old .cc.wwu.edu address to the new students.wwu.edu address? The perceptive just figured it out. Once a student is migrated, the spam stream heading for their now old cc.wwu.edu address gets forwarded on to OutlookLive by way of a server that bypasses the spam checker. Some students are now dealing with hundreds of spam messages in their inbox a day.

The obvious fix is to take the old mail server off of the bypass list. This can't be done because right now critical emails are being sent via the old mail server that have to deliver. The next obvious fix, turn off forwarding for students that request it, won't work either since the ERP system has all the old cc.wwu.edu addresses hard-coded in right now and the forwards are how messages from said system get to the students.wwu.edu addresses. So we geeks are now trying to set up a brand new smart-host, and are in the process of finding all the stuff that was relaying through the old server and attempting to change settings to relay through the new smart-host.

Some of these settings require service restarts of critical systems, such as Blackboard, that we don't normally do during the middle of a quarter. Some are dead simple, such as changing a single .ini entry. Still others require our developers to compile new code with the new address built in, and publish the updated code to the production web servers.

Of course, the primary sysadmin for the old mail-server was called for Federal jury-duty last week and has been in Seattle all this time. I think he comes back Monday. His grep-fu is strong enough to tell us what all relays through the old server. I don't have a login on that server so I can't try it out myself.

Changing smart-hosts is a lot of work. Once we get the key systems working through the new smart-host (Exchange 2007, as it happens), we can tell Microsoft to de-list the old mail-server from the bypass list. This hopefully will cut down the spam flow to the students to only one or two a day at most. And it will allow us to do our own authorized spamming of students through a channel that doesn't include a spam checker. Valuable!
Anandtech has a nice article up right now that compares SAS, SATA, and SSD drives in a database environment. Go read it. I'll wait.

While the bulk of the article is about how much the SSD drives blow the pants off of rotational magnetic media, the charts show how SAS performs versus SATA. As they said at the end of the article:
Our testing also shows that choosing the "cheaper but more SATA spindles" strategy only makes sense for applications that perform mostly sequential accesses. Once random access comes into play, you need two to three times more SATA drives - and there are limits to how far you can improve performance by adding spindles.
Which matches my experience. SATA is great for sequential loads, but is bottom of the pack when it comes to random I/O. In a real world example, take this MSA1500CS we have. It has SATA drives in it

If you have a single disk group with 14 1TB drives in it, this gives a theoretical maximum capacity of 12.7TB (that storage industry TB vs OS TB problem again). Since you can only have LUNs as large as 2TB due to the 32-bit signed integer problem, this would mean this disk group would have to be carved into 7 LUNs. So how do you go about getting maximum performance from this set up?

You'll have to configure a logical volume on your server such that each LUN appends to the logical volume in order, and then make sure your I/O writes (or reads) sequentially across the logical volume. Since all 7 LUNs are on the same physical disks, any out-of-order arrangement of LUN on that spanned logical volume would result in semi-random I/O and throughput would drop. Striping the logical volume just ensures that every other access requires a significant drive-arm move, and would seriously drop throughput. It is for this reason that HP doesn't recommend using SATA drives in 'online' applications.

Another thing in the article that piqued my interest is there on page 11. This is where they did a test of various data/log volume combinations between SAS and SSD. The conclusion they draw is interesting, but I want to talk about it:
Transactional logs are written in a sequential and synchronous manner. Since SAS disks are capable of delivering very respectable sequential data rates, it is not surprising that replacing the SAS "log disks" with SSDs does not boost performance at all.
This is true, to a point. If you have only one transaction log, this is very true. If you put multiple transaction logs on the same disk, though, SSD becomes the much better choice. They did not try this configuration. I would have liked to have seen a test like this one:
  • Three Data volumes running on SAS drives
  • One Log volume running on an SSD with all three database logs on it
I'm willing to bet that the performance of the above would match, if not exceed, running three separate log volumes running on SAS.

The most transactional database in my area is probably Exchange. If we were able to move the Logs to SSD's, we very possibly could improve performance of those databases significantly. I can't prove it, but I suspect we may have some performance issues in that database.

And finally, it does raise the question of file-system journals. If I were to go out and buy a high quality 16GB SSD for my work-rig, I could use that as an external journal for my SATA-based filesystems. As it is an SSD, running multiple journals on it should be no biggie. Plus, offloading the journal-writes should make the I/O on the SATA drives just a bit more sequential and should improve speeds. But would it even be perceptible? I just don't know.

Other Blogs

My Other Stuff

About this Archive

This page is an archive of entries from June 2010 listed from newest to oldest.

May 2010 is the previous archive.

July 2010 is the next archive.

Find recent content on the main index or look in the archives to find all content.