Friday, September 18, 2009

It's the little things

My attention was drawn to something yesterday that I just hadn't registered before. Perhaps because I see it so often I didn't twig to it being special in just that place.

Here are the Received: headers of a bugzilla message I got yesterday. It's just a sample. I've bolded the header names for readability:
Received: from ExchEdge2.cms.wwu.edu (140.160.248.208) by ExchHubCA1.univ.dir.wwu.edu (140.160.248.102) with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:10 -0700
Received: from mail97-va3-R.bigfish.com (216.32.180.112) by ExchEdge2.cms.wwu.edu (140.160.248.208) with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:09 -0700
Received: from mail97-va3 (localhost.localdomain [127.0.0.1]) by mail97-va3-R.bigfish.com (Postfix) with ESMTP id 6EFC9AA0138 for me; Tue, 15 Sep 2009 20:58:09 +0000 (UTC)
Received: by mail97-va3 (MessageSwitch) id 12530482889694_15241; Tue, 15 Sep 2009 20:58:08 +0000 (UCT)
Received: from monroe.provo.novell.com (monroe.provo.novell.com [137.65.250.171]) by mail97-va3.bigfish.com (Postfix) with ESMTP id 5F7101A58056 for me; Tue, 15 Sep 2009 20:58:07 +0000 (UTC)
Received: from soval.provo.novell.com ([137.65.250.5]) by monroe.provo.novell.com with ESMTP; Tue, 15 Sep 2009 14:57:58 -0600
Received: from bugzilla.novell.com (localhost [127.0.0.1]) by soval.provo.novell.com (Postfix) with ESMTP id A56EECC7CE for me; Tue, 15 Sep 2009 14:57:58 -0600 (MDT)
For those who haven't read these kinds of headers before, read from the bottom up. The mail flow is:
  1. Originating server was Bugzilla.novell.com, which mailed to...
  2. soval.provo.novell.com running Postfix, who forwarded it on to Novell's outbound mailer...
  3. monroe.provo.novell.com, who attempted to send to us and sent to the server listed in our MX record...
  4. mail97-va3.bigfish.com running Postfix, who forwarded it on to another mailer on the same machine...
  5. mail97-va3 running something called MessageSwitch, who sent it on to the internal server we set up...
  6. exchedge2.cms.wwu.edu running Exchange 2007, who sent it on to the Client Access Server...
  7. exchhubca1.univ.dir.wwu.edu for 'terminal delivery'. Actually it went on to one of the Mailbox servers, but that doesn't leave a record in the SMTP headers.
Why is this unusual? Because steps 4 and 5 are at Microsoft's Hosted ForeFront mail security service. The perceptive will notice that step 4 indicates that the server is running Postfix.

Postfix. On a Microsoft server. Hur hur hur.

Keep in mind that Microsoft purchased the ForeFront product line lock, stock, and barrel. If that company had been using non-MS products as part of their primary message flow, then Microsoft probably kept that up. Next versions just might move to more explicitly MS-branded servers. Or not, you never know. Microsoft has been making placating noises towards Open Source lately. They may keep it.
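For anyone who would rather not read Received: headers by eye, the bottom-up reading described above can be sketched in a few lines of Python. This is only an illustration: it uses three of the sample headers, the `hops` helper and the `HOP` pattern are names made up for this sketch, and the regex is a rough guess at the "from X ... by Y" shape rather than a full parser for the header syntax.

```python
import re
from email import message_from_string

# Three of the sample Received: headers, newest first, the way they
# appear at the top of a delivered message.
RAW = """\
Received: from ExchEdge2.cms.wwu.edu (140.160.248.208) by ExchHubCA1.univ.dir.wwu.edu (140.160.248.102) with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:10 -0700
Received: from soval.provo.novell.com ([137.65.250.5]) by monroe.provo.novell.com with ESMTP; Tue, 15 Sep 2009 14:57:58 -0600
Received: from bugzilla.novell.com (localhost [127.0.0.1]) by soval.provo.novell.com (Postfix) with ESMTP id A56EECC7CE for me; Tue, 15 Sep 2009 14:57:58 -0600 (MDT)
Subject: sample

body
"""

# Rough pattern: an optional "from <host>" followed later by "by <host>".
HOP = re.compile(r"(?:from\s+(?P<src>\S+).*?)?\bby\s+(?P<dst>\S+)", re.S)


def hops(raw_message):
    """Return (sender, receiver) pairs in the order the mail travelled."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    result = []
    # Each server prepends its own header, so the oldest hop is last.
    for value in reversed(received):
        match = HOP.search(value)
        if match:
            result.append((match.group("src"), match.group("dst")))
    return result


for step, (src, dst) in enumerate(hops(RAW), start=1):
    print(f"{step}. {src} -> {dst}")
```

The only real trick is the `reversed()` call: every server adds its header to the top, so the first line in the file is the last hop.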



Tuesday, September 08, 2009

Exchange Transport Rules, update

Remember this from a month ago? As threatened in that post I did go ahead and call Microsoft. To my great pleasure, they were able to reproduce this problem on their side. I've been getting periodic updates from them as they work through the problem. I went through a few cycles of this during the month:

MS Tech: Ahah! We have found the correct regex recipe. This is what it is.
Me: Let's try it out shall we?
MS Tech: Absolutely! Do you mind if we open up an Easy Assist session?
Me: Sure. [does so. Sends a few messages through, finds an edge case that the supplied regex doesn't handle]. Looks like we're not there yet in this edge case.
MS Tech: Indeed. Let me try some more things out in the lab and get back to you.

They've finally come up with a set of rules to match this text definition: "Match any X-SpamScore header with a signed integer value between 15 and 30".

Reading the KB article on this you'd think these ORed patterns would match:
^1(5|6|7|8|9)$
^2\d$
^30$
But you'd be wrong. The rule that actually works is:
(^1(5$|6$|7$|8$|9$))|(^2(\d$))|(^3(0$))
Except if ^-
Yes, that 'except if' is actually needed, even though the first rule should never match a negative value. You really need to have the $ inside the parens for the first statement, or it doesn't match right; this won't work: ^1(5|6|7|8|9)$. The same goes for the second statement with the \d$ constructor. The last statement doesn't need the 0$ in parens, but is there to match the pattern of the previous two statements of having the $ in the paren.

Riiiiiight.

In the end, regexes in Exchange 2007 Transport Rules are still broken, but they can be made to work if you pound on them enough. We will not be using them because they are broken, and when Microsoft gets around to fixing them the hack-ass recipes we cook up will probably break at that time as well. A simple value list is what we're using right now, and it works well for 16-30. It doesn't scale as well for 31+, but there does seem to be a ceiling on what X-SpamScore can be set to.
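For comparison, here is how a conventional regex engine treats the two recipes above. This is a sketch using Python's `re` module, which is not the engine Exchange uses, so it says nothing about why Exchange misbehaves; it only shows that in a standard engine the documented patterns and the workaround select exactly the same values, 15 through 30. The function names are made up for the illustration.

```python
import re

# The three ORed patterns the KB article suggests should work.
DOCUMENTED = [r"^1(5|6|7|8|9)$", r"^2\d$", r"^30$"]

# The recipe that actually works in Exchange, plus its "except if" clause.
WORKAROUND = r"(^1(5$|6$|7$|8$|9$))|(^2(\d$))|(^3(0$))"
EXCEPT_IF = r"^-"


def documented_match(value):
    """True if any of the documented patterns matches."""
    return any(re.search(pattern, value) for pattern in DOCUMENTED)


def workaround_match(value):
    """True if the workaround matches and the exception does not."""
    return bool(re.search(WORKAROUND, value)) and not re.search(EXCEPT_IF, value)


# Try every integer from -50 to 150 as a header value.
candidates = [str(n) for n in range(-50, 151)]
by_documented = {v for v in candidates if documented_match(v)}
by_workaround = {v for v in candidates if workaround_match(v)}

print(sorted(by_documented, key=int))
print(by_documented == by_workaround)
```

In other words, the extra parentheses and the 'except if' clause are pure accommodation for the Exchange implementation; they add nothing in an engine that follows the usual rules.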



Wednesday, August 05, 2009

Exchange transport-rules

Exchange 2007 supports a limited set of Regular Expressions in its transport-rules. The Microsoft TechNet page describing them is here. Unfortunately, I believe I've stumbled into a bug. We've recently migrated our AntiSpam to ForeFront. And part of what ForeFront does is header markup. There is a Spamminess number in the header:
X-SpamScore: 66
That ranges from deeply negative to over a hundred. With this we can structure transport-rules to handle spammy email. In theory, the following trio of regexes should catch anything with a score of 15 or higher:
^1(5|6|7|8|9)$
^(2|3|4|5|6|7|8|9)\d$
^\d\d\d$
Those of you that speak Unix regex are quirking an eyebrow at that, I know. Like I said, Microsoft didn't do the full Unix regex treatment. The "\d" flag, "matches any single numeric digit." The parenthetical portion, "Parentheses act as grouping delimiters," and, "The pipe ( | ) character performs an OR function."

Unfortunately, for reasons that do not match the documentation the above trio of regexes is returning true on this:
X-SpamScore: 5
It's the second recipe that's doing it, and it looks to be the combination of paren and \d that's the problem. For instance, the following rule:
^\d(6|7)$
returns true for any single numeric value, but returns false for "56". Where this rule:
^5(6|7)
only returns true for 56 and 57. To me this says there is some kind of interaction going on between the \d and the () constructors that's causing it to change behavior. I'll be calling Microsoft to see if this is working as designed and just not documented correctly, or a true bug.
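To show what I expected to happen, here is the same set of patterns run through Python's `re` module. That is an assumption-laden comparison, since Python is not the Exchange engine, but it does implement the "\d", parentheses, and pipe behavior the way the documentation describes them. The `trio_match` helper is just a name for this sketch.

```python
import re

# The trio that should catch any score of 15 or higher.
TRIO = [r"^1(5|6|7|8|9)$", r"^(2|3|4|5|6|7|8|9)\d$", r"^\d\d\d$"]


def trio_match(score):
    """True if any of the three patterns matches the header value."""
    return any(re.search(pattern, score) for pattern in TRIO)


# The trio against a few header values.
for score in ["5", "14", "15", "66", "100", "-20"]:
    print(score, trio_match(score))

# The smaller test case: a digit followed by 6 or 7.
for value in ["5", "56", "57", "58"]:
    print(value, bool(re.search(r"^\d(6|7)$", value)))
```

Here "5" is rejected by the trio, and `^\d(6|7)$` accepts "56" and "57" but not a lone "5", which is the reverse of what Exchange is doing.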



Friday, July 10, 2009

Email reputation

One of the hot new things in anti-spam technology is something that's rather old. Yes, the Realtime Blackhole List is back. Only these RBLs aren't the old-school DNS servers of yesteryear; these RBLs are maintained by the big anti-spam vendors and are completely proprietary. The new name is "IP Reputation," and that's what is showing up on the marketing glossies.

The idea is that you deploy a network of sensors (say, every anti-spam appliance you ship, or software-package installed) that relay spam/ham information back to home base. Home base then builds a profile of the behaviors of the incoming IP connections. Once certain completely proprietary thresholds are crossed, the anti-spam vendor then publishes that particular IP address's reputation to their service. The installed base then queries the reputation service on every incoming TCP connection to see how to handle that connection.

The responses vary from vendor to vendor, but include:
  • Outright blocking. Do not accept traffic from this IP address. The connection is terminated before any SMTP commands can be issued. Do not pass EHLO. Do not collect 220-ESMTP.
  • Defer. Issue a 421 error message. Smart mailers will attempt redelivery later. Bots are generally too stupid to try this and just pass on to the next address on their list.
  • Throttle. Get very slow in accepting mail. Take a long time to issue 250-Ready statuses after SMTP commands.
The nice thing about IP reputation is that it is fast and cheap. Instead of having to lexically scan every incoming email for spamminess, you can just look at the source's reputation and block a very large percentage of messages. When we turned this on for our spam product a while back, the reputation filter blocked between 90% and 95% of all messages ultimately blocked as spam. Clean email is the single most expensive mail to pass since it has to go through every single stage of the spam/ham test pipeline, and blocking things earlier in the pipeline is a good way to shed load.
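The three responses listed above can be sketched as a small policy function. Everything specific here is made up for illustration: real vendors keep their scoring and thresholds proprietary, so the 0-100 score, the cut-offs, the 20-second delay, and the wording of the 421 reply are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    BLOCK = "block"        # drop the connection before the SMTP banner
    DEFER = "defer"        # answer with a 421 and let real mailers retry
    THROTTLE = "throttle"  # accept, but respond slowly
    ACCEPT = "accept"      # pass on to the content filters


@dataclass(frozen=True)
class Verdict:
    action: Action
    smtp_reply: Optional[str]  # reply text, if one is sent at this stage
    delay_seconds: float       # artificial delay before each response


def decide(score: int) -> Verdict:
    """Map a hypothetical 0-100 'badness' score to a connection policy."""
    if score >= 90:
        return Verdict(Action.BLOCK, None, 0.0)
    if score >= 70:
        return Verdict(Action.DEFER, "421 Service not available, try again later", 0.0)
    if score >= 40:
        return Verdict(Action.THROTTLE, None, 20.0)
    return Verdict(Action.ACCEPT, None, 0.0)


for score in (95, 75, 50, 5):
    print(score, decide(score))
```

The point is that the decision needs nothing but the connecting IP's score, which is why it is so much cheaper than scanning message content.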

Not all optimizations are without side effects, and this one wasn't. The former student email server, titan, got itself 'greylisted' due to spam quantities. Around 50% of the message traffic into Exchange from this system was ultimately blocked as Spam according to the old anti-spam appliances we had (we'd routed its mail through the 'outbound' queue on those appliances so it wouldn't be subject to reputation tests, but would still scan email). As part of the migration of student email to OutlookLive.Edu, we set up forwards from the old cc.wwu.edu addresses to the new addresses. The spam-checkers on titan were of poor enough quality that enough spam got through to cause OutlookLive to start grey-listing Titan, causing mail to really back up on it.

That's not the only thing. Certain mailers managed by departments other than ITS here at WWU have managed to get themselves greylisted or outright blacklisted on these proprietary reputation lists. The one common denominator we've found is that certain specific UNIXy mailers do not apply their anti-spam processes to mail that is subjected to a .forward. At least, not without specific config telling it to scan that traffic. So if a person on one of these mailers has a .forward sending all mail into Exchange, the full spam-filled feed heads to Exchange and the reputation of that mailer gets dinged.

Which is a long way of saying that, ahem:

In this era of IP reputation, outbound spam filtering is now just as required as inbound.

Really. Go do it. It'll help prevent blacklistings, and that sucks for anyone subjected to it.



Friday, June 26, 2009

ForeFront and spam

They have an option to set a custom X-header for indicating spam. The other options are subject-line markup and quarantine on the ForeFront servers. What they never document is what they set the header to. As it happens, if the message is spam it gets set like this:
X-WWU-JunkIt: This message appears to be spam.
Very basic. And not documented. Now that we know what it looks like we can create a Transport Rule that'll direct such mail to the junk folder in Outlook. Handy!
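The Transport Rule itself lives in Exchange, but the test it performs is simple enough to sketch in Python for anyone wanting to check a saved message by hand. This assumes, as observed above, that the header is only present on messages ForeFront considers spam; the `is_marked_junk` helper and the sample messages are made up for the illustration.

```python
from email import message_from_string

# The custom header name configured for this site.
JUNK_HEADER = "X-WWU-JunkIt"


def is_marked_junk(raw_message):
    """True if the message carries the custom spam header."""
    msg = message_from_string(raw_message)
    # Header lookups in the email module are case-insensitive.
    return msg.get(JUNK_HEADER) is not None


spam = (
    "From: someone@example.com\n"
    "X-WWU-JunkIt: This message appears to be spam.\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
ham = "From: someone@example.com\nSubject: hello\n\nbody\n"

print(is_marked_junk(spam), is_marked_junk(ham))
```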



Tuesday, June 09, 2009

Email delivery problems to Comcast.net

Yesterday we got some concerned mails from one of the groups that sends mail by way of one of our web-servers. It's a somewhat critical function they do, so we paid attention to it. It seems they were getting bounce-messages from comcast.net. The bounce said that the incoming IP address did not have a reverse lookup (PTR record) and they don't talk to people like that.

This was confusing. Because we really do have a PTR record for that particular mailer. And yet, getting bounces. So one of the Webdevs calls Comcast to ask politely what the heck, and the Comcast support person walks them through a series of steps to demonstrate what went wrong. According to them, or so implied the webdev who doesn't speak SMTP as well as we do, the problem was that 'wwu.edu' does not resolve to an IP address.

There are reasons we haven't done this, and they have to do with mail delivery. Certain stupid mailers will deliver to a resolvable host before searching MX records, and if "wwu.edu" is resolvable, it'll attempt delivery to THAT instead of where it should. The server that runs 'www.wwu.edu' is the one that we'd have to point 'wwu.edu' to, and it is not a mail host. Far from it. This seemed to be a strange requirement of Comcast.

I cracked it earlier today. You see, if you take a look at the NameServer records for the "wwu.edu" domain you will find three records.

140.160.242.13
140.160.240.12
216.186.4.245

It's that last one that's the problem. For some reason, our offsite DNS didn't have that particular reverse-lookup domain replicated to it. So if Comcast used it for resolving the incoming IP, it would get 'UNKNOWN' and block the connection. If they picked one of the other two, it would resolve and delivery would continue. Tada! The Comcast error message really was true, we just didn't realize one of our DNS servers didn't have all the data it needed. Oops.
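The lesson is to ask every authoritative nameserver the same reverse-lookup question instead of trusting whichever one your resolver happens to pick. A minimal sketch, assuming `dig` is available: the Python below only builds the PTR query name and the commands to run by hand, and the mailer address is a documentation placeholder (192.0.2.25), not our real web-server.

```python
import ipaddress

# The three nameserver addresses listed above.
NAMESERVERS = ["140.160.242.13", "140.160.240.12", "216.186.4.245"]

# Placeholder address standing in for the real mailer.
MAILER_IP = "192.0.2.25"


def ptr_name(ip):
    """Return the in-addr.arpa name that a PTR lookup actually queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def dig_commands(ip, nameservers):
    """Build one reverse-lookup command per nameserver."""
    return [f"dig +short -x {ip} @{ns}" for ns in nameservers]


print(ptr_name(MAILER_IP))
for command in dig_commands(MAILER_IP, NAMESERVERS):
    print(command)
```

If one of the three commands comes back empty while the other two return a hostname, you have found the server that is missing the reverse zone.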



Tuesday, May 12, 2009

Explaining email

In the wake of this I've done a lot of diagramming of email flow to various decision makers. Once upon a time, when the internet was more trusting, SMTP was a simple protocol and diagramming was very simple. Mail goes from originator, through one or more message transfer agents (or none at all and deliver directly, it was a more trusting time back then) to its final home, where it gets delivered. SMTP was designed in an era when UUCP was still in widespread use.

Umpteen years later and you can't operate a receiving mail server without some kind of anti-spam product. Also, anti-spam has been around as an industry long enough that certain metaphors are now ingrained in the user experience. Almost all products have a way to review the "junk e-mail". Almost all products just silently drop virus-laden mail without a way to 'review' it. Most products contain the ability to whitelist and blacklist things, and some even let that happen at the user level.

What this has done is made the mail-flow diagram bloody complicated. As a lot of companies are realizing the need to have egress filters in addition to the now-standard ingress filters, mail can get mysteriously blocked at nearly every step of the mail handling process. The mail-flow diagram now looks like a flow-chart since actual decisions beyond "how do I get mail to domain.com" are made at many steps of the process.

The flow-charts are bad enough, but they can be explained comprehensibly if given enough time to describe the steps. What takes even more time is when deploying a new anti-spam product and deciding how much end-user control to add (do we allow user-level white lists? Access to the spam-quarantine?), what kinds of workflow changes will happen at the helpdesk (do we allow helpdesk people access to other people's spam-quarantine? Can HD people modify the system whitelist?), or overall architectural concerns (is there a native plugin that allows access to quarantine/whitelist without having to log in to a web-page? If the service is outsourced (postini, forefront) does it provide a SSO solution or is this going to be another UID/PW to manage?).

And I haven't even gotten to any kind of email archiving.



Thursday, April 30, 2009

Conflicting email priorities

As mentioned in the Western Front, we're finally migrating students to the new hosted Exchange system Microsoft runs. They've since changed the name from Exchange Labs to OutlookLive. It has taken us about two quarters longer than we intended to start the migration process, but it is finally under way.

Unfortunately for us, we got hit with a problem related to conflicting mail priorities. But first, a bit of background.

ATUS was getting a lot of complaints from students that the current email system (sendmail, with SquirrelMail) was getting snowed under with spam. The open-source tools we used for filtering out spam were not nearly as effective as the very expensive software in front of the Faculty/Staff Exchange system. Or much more importantly, were vastly less effective than the experience Gmail and Hotmail give. Something had to change.

That choice was either to pay between $20K and $50K for an anti-spam system that actually worked, or outsource our email for free to either Google or Microsoft. $20K.... or free. The choice was dead simple. Long story short, we picked Microsoft's offering.

Then came the problem of managing the migration. That took its own time, as the Microsoft service wasn't quite ready for the .EDU regulatory environment. We ran into FERPA related problems that required us to get legal opinions from our own staff and the Registrar relating to what constitutes published information, which required us to design systems to accommodate that. Microsoft's stuff didn't make that easy. Since then, they've rolled out new controls that ease this. Plus, as the article mentioned, we had to engineer the migration process itself.

Now we're migrating users! But there was another curveball we didn't see, but should have. The server that student email was on has been WWU's smart-host for a very long time. It also had the previously mentioned crappy anti-spam. Being the smart-host, it was the server that all of our internal mail blasts (such as campus notifications of the type Virginia Tech taught us to be aware of) relayed through. These mail blasts are deemed critical, so this smart-host was put onto the OutlookLive safe-senders list.

Did I mention that we're forwarding all mail sent to the old .cc.wwu.edu address to the new students.wwu.edu address? The perceptive just figured it out. Once a student is migrated, the spam stream heading for their now old cc.wwu.edu address gets forwarded on to OutlookLive by way of a server that bypasses the spam checker. Some students are now dealing with hundreds of spam messages in their inbox a day.

The obvious fix is to take the old mail server off of the bypass list. This can't be done because right now critical emails are being sent via the old mail server that have to deliver. The next obvious fix, turn off forwarding for students that request it, won't work either since the ERP system has all the old cc.wwu.edu addresses hard-coded in right now and the forwards are how messages from said system get to the students.wwu.edu addresses. So we geeks are now trying to set up a brand new smart-host, and are in the process of finding all the stuff that was relaying through the old server and attempting to change settings to relay through the new smart-host.

Some of these settings require service restarts of critical systems, such as Blackboard, that we don't normally do during the middle of a quarter. Some are dead simple, such as changing a single .ini entry. Still others require our developers to compile new code with the new address built in, and publish the updated code to the production web servers.

Of course, the primary sysadmin for the old mail-server was called for Federal jury-duty last week and has been in Seattle all this time. I think he comes back Monday. His grep-fu is strong enough to tell us what all relays through the old server. I don't have a login on that server so I can't try it out myself.

Changing smart-hosts is a lot of work. Once we get the key systems working through the new smart-host (Exchange 2007, as it happens), we can tell Microsoft to de-list the old mail-server from the bypass list. This hopefully will cut down the spam flow to the students to only one or two a day at most. And it will allow us to do our own authorized spamming of students through a channel that doesn't include a spam checker. Valuable!



Monday, March 16, 2009

Death of cc.wwu.edu

Part of the process of moving the students over to Exchange Labs is decommissioning the cc.wwu.edu domain for email. Students have been there for a loooong time, and once upon a time faculty/staff mail was there as well. We've since moved to wwu.edu for our fac/staff domain.

Next week we're turning off cc.wwu.edu for fac/staff. The students still over there will be moved slowly over to the hosted solution. The Fac/Staff users will be moved to Exchange, period.

This has created some heated feelings as there are professors who've published books and have "@cc.wwu.edu" printed in the books. I'm not sure how we're handling that, but... that's not my email system. Email addresses do tend to get stale after a while, and that's just a fact of the internet.

However, one of the guys in the office here was one of the very first people to get an email address at cc.wwu.edu way back in the dark and misty reaches of a more trusting internet. I don't know how long he had that address, but it very well could have been over 20 years. He's letting it go with a tear in his eye, but not a big one. He's one of the unlucky schmucks with his first name as his username, and it's in every. single. solitary. mail-list known to God and man. His @cc.wwu.edu account has been nothing but a spam trap for years now.



Wednesday, January 28, 2009

Spam stats

It has been a while. Some current spam stats from our border appliance:


Last 30 days:
  • Processed: 9,489,662
  • Spam: 4,328,334 (46%)
  • Suspected Spam: 8,895 (<1%)
  • Attacks: 221,926 (2%)
  • Zombie: 4,163,253 (44%)
  • Blocked: 1 (<1%)
  • Allowed: 10,248 (<1%)
  • Viruses: 1,265 (<1%)
  • Suspected Virus: 1,233 (<1%)
  • Worms: 2,590 (<1%)
  • Unscannable: 26,508 (<1%)
  • Malware: 0 (0%)
  • Passworded .zip file: 13 (<1%)
  • Content: 0 (0%)

That is a lot of spam. The 'Zombie' class are connections rejected due to IP reputation. Between zombies and outright spam detections, 95% of all incoming mail has been rejected over the last 30 days. Shortly after the McColo shutdown that percentage dropped a lot. We're now back to where we were before the shutdown.



The paradox of legitimate mass-mailings

Spam has been with us for over a decade now. Sad fact, but true. Also sad is the need for legitimate mass mailings. Here at WWU, these can look like budget updates from the University President, something we're all deeply interested in. Due to the cat and mouse games between spammers and the anti-spam vendors, it is an ever shifting game trying to figure out what'll get through the various spam checkers. At this moment we have three layers:

  1. Border appliance spam checker, which does IP reputation of incoming mail connections, resulting in a 93% bounce rate (as of this morning, just shy of the pre-McColo-shutdown high of 95%).
  2. Software on the Exchange servers themselves, which doesn't catch much, but does catch the few connections that leak around the border appliance.
  3. The Outlook Junk Mail filter
As it happens, it is #3 that has been causing us the most grief. In general, the rule of getting past the filters is simple: don't look like spam. In specific, this is really hard. Especially since our border appliance is good enough that most users don't see what a truly unfiltered email stream looks like.

Last week the President's office sent out a notice about a Mid Year Report to Campus. Unfortunately, they sent the mail from a mailbox that we haven't taken great pains to whitelist, and the mail was just an embedded image. In other words, they sent a mail that functionally looked exactly like the pump-n-dump stock scams of 2 years ago, and did so from a mailbox we haven't white-listed. Of course Outlook junked it, as it's the dumbest filter of the bunch.

We're not the only place with this problem. All-hands emails are something most organizations have to do at some point, as Intranet sites with posted announcements get vastly fewer eyeballs. Perhaps mandated RSS feeds in Outlook are the future of this... huh. Anyway, we're constantly working with the internal mass-mail sources to help them tune their mail to actually arrive.



Tuesday, January 20, 2009

Inept phishers

Over the weekend (Saturday, in fact) we had a phish attempted against us. As this is still a relatively new experience for us, it got the notice of the higher-ups. When I got in, I got the task of grepping logs to see if anyone replied to it.

While doing that I noticed something about the email. It had no clickable links in it, and the From: address (there was no Reply-To: address) was @wwu.edu, and was an illegal address.

In short, anyone replying to it would get a bounce message, and there was no way for the phishers to get the data they wanted.

More broadly, we've noticed a decided increase in phishing attempts against .edu looking for username/password combinations. The phishers then use that information to log in to webmail portals to send spam messages the hard way, copy-paste into new emails. This has the added benefit (for them) of coming from our legitimate mailers, from a legitimate address, and thus bypasses spam reputation checks, SPF records, and other blacklists. It doesn't have the volume of botnet-spam, but it's much more likely to get past spam-checkers. At last check, about 50% of incoming mail connections to @wwu.edu are terminated due to IP-reputation failures.



Tuesday, November 18, 2008

Spam drop-off continues

It's been a week, and they haven't replaced their lost spam capacity yet.

30 day chart showing spam drop-off

The green bars are the 'clean' messages. But look at that! Over half of our incoming spam was from the same botnet/hosting provider. Wow. But, I do expect levels to go back to normal before too long.



Wednesday, November 12, 2008

Very visible drop off in spam

Thanks to this, we've seen a significant drop-off in spam. And what better way to show it than with a graph?

Graphic showing serious, close to 50%, drop-off in spam levels

Dramatic, eh?

Let's see how long it takes 'em to find a new host and re-tool. This is but a brief respite, but it's fun to look at.



Tuesday, July 15, 2008

Your spam-checker ate my email

This is a question I get a fair amount. This is understandable, as the spam-checker is the software whose entire job is to eat email. So naturally that's the first place people think to check when mail gets sent but not received.

I also hate dealing with this kind of question. The spam appliances we use have a search feature, which is critical for figuring out if some email is being eaten incorrectly. Unfortunately, the search feature is devilishly slow. I swear, it is grepping hundreds of files, each tens of megabytes in size, and post-processing the output. It generally takes 5 minutes to answer a question every time I hit the 'search' button. And just like Google, it can take a few tries to phrase my search terms correctly to get what I want.

Right now we have a complaint that all email sent to us by a certain domain never arrives. This is false, as on the day in question we received and passed on to Exchange about 20 messages from the domain. As it happens the Edge server is having a problem with it, and that needs attention. But I had to do about 30 minutes of waiting for search results to really determine this.



Thursday, June 12, 2008

Email Hygiene

A blog over on TechRepublic talks a bit about one way to reduce spam. In short, a global white list of actual people managed by some trustable central authority. This attacks the "untrusted sender" vulnerability in SMTP. It takes it a bit farther than SPF or SenderID in that it's an actual person not just a domain.

Dooooooomed to failure. Email is global, and there simply isn't a central trustable authority of any kind. The blog post mentions the FCC, which might be good for US-based email, but certainly not good for trusting email out of China or Russia.

It wouldn't stop much in the way of spam. Such a central repository is its own version of a spammer's dream mailing list, and also represents a treasure-trove of email From: lines likely to be trusted. It would only work when used in conjunction with something like SPF or SenderID to ensure that the person who is "joe.bob@mywork.biz" only sends mail from the mywork.biz mail-servers. It also wouldn't stop "gray-mail" mail-blasts from vendors, as the Sales department folk would just put their own mail address on the From: line of their mass mailings in order to get them past the "Real person" filters.

Email hygiene is a hard problem. SMTP is the poster-protocol for a protocol designed in a far more trusting time. Both the addresses on the To: line and From: line, as well as the addresses on the RCPT TO: and MAIL FROM: lines on the envelope probably should be validated in some way. As well as the IP address(es) of the servers involved in mail delivery. SMTP doesn't do this, and there is a very thriving industry to provide just this sort of thing.



Monday, May 05, 2008

Back-scatter spam

There was a recent Slashdot post on this. We've had a fair amount of this sort of spam. And the victims are at pretty high levels of our organization, too. Last week the person who is responsible for us even having a BlackBerry Enterprise Server asked us to figure out a way to prevent these emails from being forwarded to their BlackBerry. When a spam campaign is rolling, that person can get a bounce-message every 5-15 minutes for up to 8 hours, into the wee hours of the night. And that's just the mails that get PAST our anti-spam appliance. We set up some forwarding filters, but we haven't heard back about how effective they are.

This is a hard thing to guard against. You can't use the reputation of the sender IP address, since they're all legitimate mailers being abused by the spam campaign and are returning delivery status notifications per spec. So the spam filtering has to be by content, which is a bit less effective. In one case, of the 950-odd DSNs we received for a specific person during a specific spam campaign, only 15 made it to the inbox. But that 15 was enough above what they normally saw (about 3 a day) that they complained.

Backscatter is a problem. However, our affected users have so far been sophisticated enough users of email to realize that this was more likely forgery than something wrong with their computer. So, we haven't been asked to "track down those responsible." This is a relief for us, as we've been asked that in the past when forged spams have come to the attention of higher level executives.

If it becomes a more wide-spread problem, we will be told to Do Something by the powers that be. Unfortunately, there isn't a lot that can be done. Blocking these sorts of DSNs is doable, but that's an expensive thing to manage in terms of people time. In 6-12 months we can expect the big anti-spam vendors to include options to just block DSNs uniformly, but until that time comes (and we have the budget for the added expenses) we'd have to do it through dumb keyword filters. Not a good solution. And it would also cause legitimate bounce messages to fail to arrive.
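For the curious, a well-formed bounce can be recognized by structure rather than keywords: the DSN format uses a Content-Type of multipart/report with report-type=delivery-status. The sketch below is not what our appliance does and not something we've deployed; it is just an illustration with Python's `email` module, using a made-up sample message and a made-up helper name. It has the same flaw noted above: it cannot tell a legitimate bounce from backscatter.

```python
from email import message_from_string

# A made-up, minimal bounce in the standard report layout.
SAMPLE_DSN = """\
From: MAILER-DAEMON@example.net
To: someone@example.edu
Subject: Undelivered Mail Returned to Sender
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="b1"

--b1
Content-Type: text/plain

Delivery failed.
--b1--
"""

SAMPLE_NORMAL = "From: a@example.net\nSubject: hi\n\nJust a note.\n"


def looks_like_dsn(raw_message):
    """True if the message is structured as a delivery status report."""
    msg = message_from_string(raw_message)
    if msg.get_content_type() != "multipart/report":
        return False
    report_type = msg.get_param("report-type", failobj="")
    return str(report_type).lower() == "delivery-status"


print(looks_like_dsn(SAMPLE_DSN), looks_like_dsn(SAMPLE_NORMAL))
```

Plenty of mailers send bounces that don't follow the standard layout, so a structural check like this would still miss some of them.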



Friday, April 11, 2008

On email, what comes in it

A friend recently posted the following:
80-90% of ALL email is directory harvesting attacks. 60-70% of the rest is spam or phishing. 1-5% of email is legit. Really makes you think about the invisible hand of email security, doesn't it?
Those of us on the front lines of email security (which isn't quite me, I'm more of a field commander than a front line researcher) suspected as much. And yes, most people, nay, the vast majority, don't realize exactly what the signal-to-noise ratio is for email. Or even suspect the magnitude. I suspect that the statistic of, "80% of email is crap," is well known, but I don't think people even realize that the number is closer to, "95% of email is crap."

Looking at statistics on the mail filter in front of Exchange, it looks like 5.9% of incoming messages for the last 7 days are clean. That is a LOT of messages getting dropped on the floor. This comes to just shy of 40,000 legitimate mail messages a day. For comparison, the number of mail messages coming in from Titan (the student email system, and unpublished backup MTA) has a 'clean' rate of 42.5%, or 2800ish legit messages a day.
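To put those percentages in terms of raw volume, the implied totals can be backed out with a line of arithmetic. These are rough figures only, since "just shy of 40,000" and "2800ish" are themselves approximations; the function name is made up for the sketch.

```python
def implied_total(clean_per_day, clean_rate):
    """Estimate total daily messages from the clean count and clean rate."""
    return clean_per_day / clean_rate


# Border filter in front of Exchange: about 40,000 clean at 5.9%.
border_total = implied_total(40_000, 0.059)

# Titan: about 2,800 clean at 42.5%.
titan_total = implied_total(2_800, 0.425)

print(f"border filter: roughly {border_total:,.0f} messages a day")
print(f"Titan: roughly {titan_total:,.0f} messages a day")
```

That works out to somewhere around 680,000 messages a day hitting the border filter, against roughly 6,600 a day arriving by way of Titan.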

People expect their email to be legitimate. Directory-harvesting attacks do constitute the majority of discrete emails; these are the messages you receive that have weird subjects, come from people you don't know, but don't have anything in the body. They're looking to see which addresses result in 'no person by that name here' messages and those that seemingly deliver. This is also why people unfortunate enough to have usernames or emails like "fred@" or "cindy@" have the worst spam problems in any organization.

As I've mentioned many times, we're actively considering migrating student email to one of the free email services offered by Google or Microsoft. This is because historically student email has had a budget of "free", and our current strategy is not working. It is not working because the email filters aren't robust enough to meet expectations. Couple that with the expectation of effectively unlimited mail quota (thank you Google) and student email is no longer a "free" service. We can either spend $30,000 or more on an effective commercial anti-spam product, or we can give our email to the free services in exchange for valuable demographic data.

It's very hard to argue with economics like that.

One thing that you haven't seen yet in this article is viruses. In the last 7 days, our border email filter saw that 0.108% of incoming messages contained viruses. This is a weensy bit misleading, since the filter will drop connections with bad reputations before even accepting mail and that may very well cut down the number of reported viruses. But the fact remains that viruses in email are not the threat they once were. All the action these days is on subverted and outright evil web-sites, and social engineering (a form of virus of the mind).

This is another example of how expectation and reality differ. After years of being told, and in many cases living through the after-effects of it, people know that viruses come in email. The fact that the threat is so much more based on social engineering hasn't penetrated as far, so products aimed at the consumer call themselves anti-virus when in fact most of the engineering in them was pointed at spam filtering.

Anti-virus for email is ubiquitous enough these days that it is clear that the malware authors out there don't bother with email vectors for self-propagating software any more. That's not where the money is. The threat has moved on from cleverly disguised .exe files to cunningly wrought (in their minds) emails enticing the gullible to hit a web site that will infest them through the browser. These are the emails that border filters try to keep out, and it is a fundamentally harder problem than .exe files were.

The big commercial vendors get the success rate they do for email cleaning in part because they deploy large networks of sensors all across the internet. Each device or software-install a customer turns on can potentially be a sensor. The sensors report back to the mother database, and proprietary and patented methods are used to distill out anti-spam recipes/definitions/modules for publishing to subscribed devices and software. There is nothing saying that an open-source product can't do this, but the mother-database is a big cost that someone has to pay for and is a very key part of this spam fighting strategy. Bayesian filtering only goes so far.

And yet, people expect email to just be clean. Especially at work. That is a heavy expectation to meet.



Thursday, October 25, 2007

This one leaked through

Every so often something slips by the spam filters and also catches my attention. Maybe a couple times a year, but this one needed chasing.

I got a mail on a private account with the highly suspicious subject line of "YOU HAVE WON!!!!!!!!!!!!!!"

Rightie then. Time for a text-mode reader! PINE to the rescue! I drop into header mode so it won't render anything in there. This happens fairly frequently when things leak; I like to see the header-spam to see what the spam checkers thought of it on the way through. This one was somewhat unremarkable, but one thing did stand out. It passed SPF checks.

X-RC-DBID: 046c9cac-dc1e-47d7-acbb-d595ac2651b6
X-RC-ID: 20071025215619610
X-RC-IP: 209.8.50.37
X-RC-FROM:
X-RC-RCPT:
DomainKey-Signature: a=rsa-sha1;
h=Received:From:To:Reply-To:Subject:MIME-Version:Content-Type:Message-Id:Dat
e;
b=e3NoRXbKhaqJoV3E9ofjd93PAw0NK64MJVN2M3AYWq2t0oDuGu9TJ/nbFp/UUyclm2BRKlf/0R
EJP05/UN9dia4UmNKmmCRlhsvg/ov0dAgbjRUktkKwWW32izAfrA3uczt6fFSjmAy3U76siqXxNH
/QlL/RWHQbX2i8KIAx0KA=; c=nofws; d=yousendit.com; q=dns; s=signed
Received: from localhost (unknown [209.8.50.53])
by wa-smtp-02.yousendit.com (Postfix) with ESMTP id 6FA7B3550334
for ; Thu, 25 Oct 2007 14:56:15 -0700 (PDT)
From: Victor Kundala via YouSendIt
To: xxxxxxxxxxxxxxx,
Reply-To: victor_kundala5@yahoo.co.uk
Subject: YOU HAVE WON!!!!!!!!!!!!!!
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="=_7f6931b42522c2a348a97f74dbe1dad0"
Message-Id: <20071025215615.6fa7b3550334@wa-smtp-02.yousendit.com>
Date: Thu, 25 Oct 2007 14:56:15 -0700 (PDT)

Huh. So I google up "yousendit" and find that it really is a legitimate service. The text of the email was the typical gark:

Hello from YouSendIt,

Hello from YouSendIt,

You have a file or files called Dear Winner.doc (1 file(s)) from
victor_kundala5@yahoo.co.uk waiting for download.

You can click on the following link to retrieve your File. The link will expire
in 14 Days .

Link: http://download.yousendit.com/05CE02D8475BB9F9




Do not reply to this automatically-generated email. If you have any questions,
please email us at paidsupport@yousendit.com.

-----
File too big for email? Try YouSendIt at @ysi.base.url@

YouSendIt
1919 S.Bascom Ave., 3rd Floor
Campbell, CA 95008


Really? So a little wget magic and I have the file, which I crack open with strings and I get this text:
Dear Winner
We happily announce to you today, the draw of the online UK National Lottery programme held on 20th of October 2007. Your e-mail address won you in the second category, your e-mail address attached to a ticket numbers: 4-33-34-38-39-49(bonus no.23).
You have therefore been approved to claim a total sum of
420,200 British pounds sterling. You are to contact our AFFILIATE COURIER COMPANY for delivery of your winning certificate and winning cheque.
You are to reply to this email address below: MR SOLOMON STONE INTERNATIONAL COURIER SYSTEMS EMAIL: solo_stone2004@yahoo.com Congratulations once more from all members and staffs of this programme.
Yours Truly,
Victor Kundala
It's a phish! And in homage to its 419 past, it even has a Nigerian-sounding name. Awwww.



Thursday, March 08, 2007

Spam stats!

Yummy stats! These are from the anti-spam appliance in front of Exchange, for the last 24 hours.



Summary:
Processed: 168,802
Spam: 85,166 (50%)
Suspected Spam: 544 (<1%)
Attacks: 4,837 (3%)
Blocked: 0 (0%)
Allowed: 0 (0%)
Viruses: 43 (<1%)
Suspected Virus: 31 (<1%)
Worms: 3,730 (2%)
Unscannable: 1,772 (1%)

And now, definitions:
Processed: The number of messages processed. This is unexploded, so that mail sent to 42 people still counts as just 1.
Spam: The number of Spam messages with a confidence of 90% or higher.
Suspected Spam: The number of Spam messages with a user defined confidence of (in this case) 70% or higher.
Attacks: An aggregate statistic, but in this case they're all Directory Harvest Attack messages. A directory-harvest-attack message is one of those messages sent to 20 people at a site with generated names, in an effort to see which addresses don't generate a bounce message.
Allowed/Blocked: We don't use this feature.
Viruses: Viruses that are not mass mailers.
Suspected Viruses: Heuristically detected viruses. Good for picking up permutations of common viri.
Worms: Viruses that are mass mailers.
Unscannable: Messages that are unscannable for whatever reason.
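
For anyone who wants to check the percentages against the raw counts, here is a small Python sketch that recomputes each category's share of processed messages. The counts are copied from the summary above; the variable and function names are just illustrative, and the appliance may round differently than this does.

```python
# Counts from the 24-hour summary above.
processed = 168_802
counts = {
    "Spam": 85_166,
    "Suspected Spam": 544,
    "Attacks": 4_837,
    "Viruses": 43,
    "Suspected Virus": 31,
    "Worms": 3_730,
    "Unscannable": 1_772,
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


shares = {name: share(n, processed) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name:16s} {pct:6.2f}%")
```

The three smallest categories all come out well under 1%, which is why the appliance reports them as a "less than" figure rather than a rounded zero.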

Like my boss, you may be looking at that 50% number and wondering what happened. It is commonly reported in the press that, "90% of all email is now spam," so where are the other 40% going? I looked into where the press were getting their numbers, and most of them get them from MessageLabs. They report their numbers on the Threat Watch. Today, the Spam rate is, "48.43%", so the 50% we're seeing is well within reason. Looking at their historical data the spam rate waxes and wanes on a day to day and week to week basis.



Saturday, March 03, 2007

Editorial: responses to the Slashdot thread

In this age, there is not much point in a school going halfway with an email system...either offer something reasonably close to the state-of-the-art or outsource it to someone who does. If you do neither, it won't get used. Even mandating the use of the school email doesn't work. You end up with professors collecting their students' gmail/hotmail/etc addresses at the beginning of the semester and having a TA type all those addresses into a mailing list.
-paeanblack (191171)
A good point. Our Fac/Staff side is done to corporate standards, and is pretty good. We use Exchange, and pay for some (rather good) anti-SPAM appliances. The quality of email provided to our FacStaff is state of the art. Student side is another matter. The prime mailer right now is handled by the venerable postfix, with antispam provided by other open-source products.

In both cases, though, mail quota doesn't come even remotely close to the "gmail standard". I THINK student quota is 100MB these days, and I could be quite wrong. We have students mailing (*sigh*) 10MB power-point files around, so that can get chewed up right quick. Students get POP and IMAP support, though from what I hear the SPAM problem is the main complaint, and there is some grumbling that squirrel mail isn't the best interface to use.
You give them a campus e-mail address. It's the *official* address. Delivery to that mailbox for all official college correspondence is guaranteed. THEN, if you opt to forward it off-campus to gmail or wherever, that's your own business, and you're responsible for the failings of such at your own peril.
-Dredd13 (14750)
This is what we do. The official address is the @cc.wwu.edu address. Students can then forward that mail to somewhere else if they so wish (and a lot do). We haven't accepted an off-campus 'official address' because of the inability to guarantee delivery of things like billing and assignments.
I don't understand the problem with having a universal campus-hosed e-mail service. They have servers accessible to the outside world, so why not throw in an e-mail server? If you make it simple (ie: SquirrelMail seems to be a popular campus e-mail hosting app, probably cause of it's cost and simplicity), I wouldn't think size would be an issue, as long as you set the proper quotas per e-mail/user.
-Anonymous Coward
The problem with this is funding. We use SquirrelMail. Unfortunately, the spam problem is bad enough that we need to spend money, not just admin time, to fix the problem to the end user's satisfaction. Spending money for 18,000 accounts is not cheap by any stretch. Spending on that front is largely tied to student tech fees, which students are understandably loath to increase more than they have to. I don't know what success we've had getting fees approved for things like commercial anti-SPAM products.
All students will be forced onto the system by the end of the semester, but it doesn't support POP or IMAP. Because of that limitation, the only freely available mail client it supports is Windows Live Desktop, which is only available on Windows
-Topic head
This is a problem that has been brought up. A sizable percentage of our student population has PowerBooks as their primary computer, and a Windows-only solution isn't workable. Our Computer Science department is, understandably, a den of anti-Microsoft sentiment (which is why the cs.wwu.edu domain receives mail independent of the central services). This is one of the reasons why we NEED something like POP or better yet IMAP support in whatever we go with. Web-only portals like gmail can work, but some students really like just dropping all their mail into a single mail client that has links to all of their email accounts.
I agree, switching to gmail for university email doesn't sound that bad. Especially if it would raise the storage limit from 20 MB to >2GB. I don't really care though, I almost never use my university email as I have all of my class email sent to my Yahoo/SBC account.
-assassinator42 (844848)
Before the current Windows Live vs. Google debate started there were murmurings of looking at converting to a gmail setup. We got hung up on several of the points mentioned in my previous post; no SSO, no easy account create/delete, no password sync.
My University [dailynorthwestern.com] is switching to Google. One of my concerns is that I really like my desktop clients (alpine and thunderbird) and prefer IMAP. While gmail is an excellent web-client, I don't really use my gmail account that much, because it doesn't offer IMAP & POP is both "flaky" and limiting.
-Anonymous Coward
IMAP is something of a sore point with us techs. We prefer it to POP. Neither service offers IMAP yet, which is one of the reasons we haven't leapt in with glad cries.
You're forgetting about something, though. Microsoft give huge discounts and tons of free stuff to colleges, therefore the colleges have raging boners for Microsoft.
-Anonymous Coward
Heh. Us more than most, since we're close enough to Redmond that a number of our alumni work for Microsoft and can donate software from the Company Store. That's how we paid for MS Office the last time around. The IRS has changed some rules to make that more expensive, but it is still a lot cheaper than regular alumni appeals. This is how we were able to afford to import all students into Active Directory.

However... while Microsoft is 'the cheap option' a lot of the time, recent licensing changes at Microsoft have made it much more expensive for us, and our Alumni arm-twisters. We're still wondering how we're going to pay for Exchange 2007. Vista... oof. Not going there yet. Like ALL institutions, we've factored in a certain level of money for software and Microsoft is making themselves more expensive. So, the raging boner is going flaccid.

Besides, we've been a NetWare shop for a long time. Hah!
Our boss dismissed the idea of outsourcing to Google or anybody else based SOLELY upon the fact that they reserved the right to advertise in the future to our students. We don't view our students as a commodity to be sold, so that kinda killed the whole "outsource the email" idea.
-Sorthum (123064)
Yeah, that's giving us pause too. Neither outright states that they won't advertise to students. Both admit they'll be using usage data to improve their advertising targeting in general.



Outsourcing student e-mail

I saw on Slashdot today a piece about a University migrating their student email to Windows Live.

There have been high-level discussions about doing the same here at WWU, only we're still trying to figure out if Windows Live or a Google program makes the most sense. No decision has been made, though Windows Live would integrate much better into our environment due to the presence of student accounts in Active Directory. The Google offering has better 'hearts and minds' support among us techs, but the Microsoft offering would require less work from us techs to get running.

Last I heard, neither offering supported IMAP. GMail doesn't support IMAP, so I doubt any Google offer would. No idea if Windows Live (general access) even does.

There are a number of reasons why outsourcing email is attractive, and right there at the top is SPAM. We can't afford any commercial product to do student anti-spam, as they all charge per-head and even $2/head gets pretty spendy when you have to cover 18,000 student accounts. Currently, student e-mail anti-SPAM is all open-source and I still hear that the SPAM problem is pretty bad. The most senior of our unix admins spends about half his day dealing with nothing but SPAM related problems, so outsourcing would save us that expense as well.

The number two reason is price. Both the Google offering and Microsoft offering are free. Both have promised that they won't put advertising in their web portals for active students, but the usage data may be used to tailor advertising programs targeted (elsewhere) at the high-profit college-age population. Both offerings permit the student to maintain the address after graduation, though in that case they would get advertising in their web portals.

There are a number of problems that outsourcing introduces.
  • Identity synchronization. MS is easiest, Google will require some custom code.
  • Password synchronization. Do we even want to do it? If so, how? If not, why not?
  • Account enable/disable. How do we deactivate accounts?
  • Single sign-on. Is it possible to integrate whichever we use into CAS? Can we integrate it into the WWU Portal?
  • Web interface skinning. Will they permit skinning with the WWU style, or will they force their own?
The answers to all of the above are not in yet, which is why a decision hasn't been made on which way we're going. But the decision to outsource at all is all but made at this point.

Update 1 10/13/2007
Update 2 8/1/2008



Friday, February 02, 2007

Depressing stock spams

In the course of dealing with the new antispam appliances, I've had to look at a lot of spam. Wowzers, a lot of spam. Most of them are stock scams, which jibes with the industry conventional wisdom about spam these days. On a lark, I dropped some of the symbols into CNN Money to see what suckers had fallen for the scams. Too many.

Drop 'PSUD' into your favorite stock tracker, and look at the 10 day report. I saw the mails arrive mid Monday, which is after the buy-up. Two days ago, trading volume was about 4x what it normally got. As of today, the price is still above what it was two weeks ago.

'AFML' had activity two days ago. Their chart shows a clear bump a few weeks ago where the stock was abused. The volume average is well above yesterday's volume, so this is another victim of pump-n-dump.

'QCPC', in a message from yesterday, has today's volume about 2.5x their volume average, another clear sign of pumping. They have been a victim of this scam several times over the last two months. Perhaps the scammers are trying to make back money lost?

Depressing statistics, nonetheless. Unlike other scams, this is something you can actually see happen from your desk. You only hear about victims of the Nigerian scams in the news, or if you're unlucky through the grapevine. Stocks are tracked WIDELY, so you can see them rise and fall.



Thursday, February 01, 2007

Community filters

The admin of the student email system related a tale the other day that I found interesting. They use dspam for their anti-spam needs, and it has a Bayesian filter. It also has some other features which have, as I said, interesting side-effects.

There is a local independent movie theater that sends out a newsletter. Some students have plonked the newsletter into Spam rather than unsubscribe.

The dspam system is configured so that if enough students mark a specific sender as spam, then that sender is blacklisted system wide.

You can see where this is going? I thought so. Enough students have reported this independent movie theater's newsletter as spam that the whole system now blocks it, and we're getting reports of 'false positive!'
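
To make the failure mode concrete, here is a toy Python sketch of an "enough users flagged it" blacklist. This is not dspam's actual implementation or configuration; the class, the threshold of 3, and the addresses are all hypothetical, chosen only to show how a handful of reports can block a sender for everyone.

```python
from collections import defaultdict


class CommunityBlacklist:
    """Toy model: a sender is blocked system-wide once enough distinct users flag it."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        # sender address -> set of users who marked that sender as spam
        self.reporters = defaultdict(set)

    def report_spam(self, user: str, sender: str) -> None:
        # A set means repeat reports from the same user only count once.
        self.reporters[sender].add(user)

    def is_blocked(self, sender: str) -> bool:
        return len(self.reporters.get(sender, ())) >= self.threshold


# Hypothetical threshold and addresses, for illustration only.
blacklist = CommunityBlacklist(threshold=3)
newsletter = "newsletter@theater.example"

# Three students hit "Spam" instead of unsubscribing.
for student in ("alice", "bob", "carol"):
    blacklist.report_spam(student, newsletter)

# Every other student who wanted the newsletter now loses it too.
print(blacklist.is_blocked(newsletter))
print(blacklist.is_blocked("registrar@university.example"))
```

The sketch has no notion of how many users actually want the mail, which is exactly the gap that turns a few lazy unsubscribes into a system-wide false positive.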



Thursday, January 25, 2007

SPAM!

The decision to tell the appliances to delete Spam was made yesterday. Anything coming in flagged as Spam, not Suspect Spam, will be dropped. This is 99% of the stuff flagged as spam, as 'suspect' is a really small category. This does reduce the load on the Exchange front-end servers as they have to do much less spam checking and handle a lot fewer messages. Though, as I'll show below, only a little less data.

And now, fun stats for Yesterday!

Total messages processed: 193,242
Percentage flagged as Spam: 49%
Percentage flagged as Suspect Spam: less than 1%
Virus mails: 731 messages
Top virus: Trojan.Peacomm (45% of viruses)
Top non-WWU inbound mailer: 129.41.62.246
Top spam sender: service@watermarkcu.org, 4% of spam (go phish!)

The mail flow goes something like this:

[inbound] -> BigIP -> Appliance -> BigIP -> Exchange FrontEnd -> Exchange

The BigIP is used to load-balance between the exchange front-ends for SMTP traffic. As it flows through the BigIP, I get stats on data volume over those ports.

Mail volume to Appliances: 1.7G
Mail volume to Exchange: 1.4G

So data volume isn't greatly affected by dropping 49% of incoming mail. What is affected is the number of messages being processed. The front-end servers weren't terribly loaded as it was, this just means that Outlook Web Access is more responsive than it was.
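
Here is the back-of-the-envelope arithmetic behind that, as a short Python sketch. It assumes the two BigIP volumes and the 49% spam figure cover the same period and that the difference in volume is entirely dropped Spam, neither of which the stats above strictly guarantee, so take the ratio as a rough indication only.

```python
# Figures from the post; the assumption that they line up is mine.
volume_to_appliances_gb = 1.7
volume_to_exchange_gb = 1.4
spam_message_fraction = 0.49

# Share of the data volume that was dropped.
data_dropped_fraction = (
    volume_to_appliances_gb - volume_to_exchange_gb
) / volume_to_appliances_gb

# Average size of a dropped message relative to a delivered one.
avg_dropped = data_dropped_fraction / spam_message_fraction
avg_delivered = (1 - data_dropped_fraction) / (1 - spam_message_fraction)
size_ratio = avg_dropped / avg_delivered

print(f"Data dropped: {data_dropped_fraction:.1%}")
print(f"Average spam size vs. delivered mail: {size_ratio:.2f}x")
```

Under those assumptions, roughly half the messages account for less than a fifth of the bytes, so the average spam message is a small fraction of the size of the average delivered one.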



Friday, January 12, 2007

MORE SPAM!

On days like this, I really think I should pick up this T-Shirt. I've been tempted by it for a while. Just sayin'.

That said, now that the thingy has been in place for more than 24 hours I have some interesting data to play with. Unlike previous estimates, the appliance has handled 'only' 230,000 emails in the 24-hour period defined as 9am to 9am today. This is about a fifth of previous estimates, which makes me wonder what we were counting.

What's also interesting is how few viruses have been detected. It looks like the era of the mass mailer worm is largely over. Of that 230K odd mails, only 240 viruses were found. Most of them were mass-mailers, of course, but this is not the way things were even 3 years ago. This appliance is an anti-spam appliance that also does anti-virus, not the other way around like some other appliances I can think of.



Thursday, January 11, 2007

New anti-spam appliance

The new anti-spam appliance finally has a license file, so I can start dorking around with it.

Happily, this appliance DOES catch picture-spam! YAY!

Unfortunately it also classifies the following as pic-spam:
To: <Everyone>
From: "The Bowler Family" <redacted>
Subject: In need of a serious laugh?

The Purina Diet

I was in Wal-Mart buying a large bag of Purina for my dogs and was in line to check out.

A woman behind me asked if I had a dog........ Duh!

I was feeling a bit crabby so on impulse, I told her no, I was starting The Purina Diet again, although I probably shouldn't because I'd ended up in the hospital last time, but that I'd lost 50 pounds before I awakened in an intensive care unit with tubes coming out of most of my orifices and IV's in both arms.

[...]

[attachments: "dadshirt Bkgrd.gif"]

Perhaps the spam/ham threshold was a bit low. Most pic-spam I know of is one line of text and an attached image. Which also makes it hard to differentiate between that stuff and stuff like this:
To: You
From: Me
Subject: Too damned cute

Dickens was sleeping upside down again. This time, I got a picture.

[attachment: UpsidedownHedgehog.JPG]
It's the pic-spam that is causing the powers that be to start mumbling about finding money, somewhere, anywhere, to just stop it. We've had these appliances sitting on the floor for a few months now, waiting for priorities to shift to the point where we can work with them. Now they have, and now I have.

I must say, it does a pretty good job. It scores on a 0-100 scale, which it sadly doesn't expose, and is hardcoded to toss anything that scores in the 90-100 range. And... it makes good decisions. You can tune the 'suspected spam' threshold lower than that, which is what I've been tweaking. Happily, it's in 'monitor and record' mode, so I can watch message flow without actually DOING anything with the messages; letting the antispam software actually on the Exchange boxes handle the load. This allows me to set the 'suspect' threshold to various spots and look to see what it tags.

Set it low enough, and I saw one message from a student to Financial Aid, asking about canceling a loan for the quarter, get picked up. Yep, raised the threshold a few ticks after that one. Apparently The Economist sends out bulletins, and that gets picked up around the 65 range. A group of students was chatting in e-mail about a class that got canceled yesterday (ice and snow), which got tagged due to the number of people on the To: line (also at about 65). One googlegroups message, discussing in a scholarly way a subject that appears in spam a lot, was tagged when the filter was set to 70.
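
The two-tier scheme boils down to something like the following Python sketch. This is my reading of the behavior, not the appliance's code: the function name is invented, and since the appliance doesn't expose its scores, the example score of 65 is only inferred from where messages started getting tagged.

```python
SPAM_CUTOFF = 90  # fixed: anything scoring 90-100 is treated as Spam


def classify(score: int, suspect_threshold: int) -> str:
    """Bucket a 0-100 spam score given a tunable 'suspected spam' threshold."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if not 0 <= suspect_threshold <= SPAM_CUTOFF:
        raise ValueError("suspect threshold must sit at or below the spam cutoff")
    if score >= SPAM_CUTOFF:
        return "spam"
    if score >= suspect_threshold:
        return "suspect"
    return "clean"


# A bulletin that scores around 65 (inferred, not reported by the appliance):
print(classify(65, suspect_threshold=60))  # tagged as suspect
print(classify(65, suspect_threshold=70))  # passes as clean
print(classify(95, suspect_threshold=70))  # always spam
```

Raising the suspect threshold a few ticks only changes the fate of messages in the band between the old and new settings; anything at 90 or above is unaffected.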

All in all, less than 1% of the messages tagged as SPAM are tagged 'suspect'. This thing does a good job.



Tuesday, October 31, 2006

Spam numbers

The following came out in the "Academic Technology News" yesterday:
To put the new spam filtering in perspective, consider the following: WWU errs on the side of caution to ensure that we do not filter any legitimate email; even with this 'cautionary' configuration more than 80% of all inbound email to campus is filtered out of our email system as known spam, compared to around 65% with our previous solution. In terms of numbers, that means that the staggering number of 1.3 million spam emails are filtered from your incoming mail each day.

Number of emails received: 1.6 million
Number of messages filtered: 1.3 million
Number of messages delivered: 0.3 million
There you have it. 1.6 million messages a day! Our Exchange system has around 6000 email accounts.
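
To put that on a per-mailbox footing, here is a quick Python sketch of the arithmetic. The inputs are the rounded numbers from the newsletter plus my "around 6000" account figure, so the per-account results are averages over rough figures, not measurements.

```python
# Rounded figures from the newsletter and the post.
received = 1_600_000
filtered = 1_300_000
accounts = 6_000  # "around 6000", an approximation

delivered = received - filtered
filtered_share = filtered / received
received_per_account = received / accounts
delivered_per_account = delivered / accounts

print(f"Filtered share: {filtered_share:.1%}")
print(f"Inbound per account per day: {received_per_account:.0f}")
print(f"Delivered per account per day: {delivered_per_account:.0f}")
```

That works out to a bit over 80% filtered, consistent with the newsletter's claim, and a couple of hundred inbound messages per mailbox per day before filtering.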



