Coping with a domain outage

Yesterday we were caught up in the Media Temple DNS server outage. Details about what happened are found here. We were without name resolution for two and a half hours. Paying customers couldn't get at our hosting, and our email stopped flowing.

This is not an outage I'd dealt with before, as both WWU and my earlier job self-hosted their DNS servers. WWU had one off campus, but the first two records were on campus.

This was a case of, "a lot of fire, nothing I can do about it," which is one of the hardest major events to endure as one of the prime fire-fighters. Happily, people figured out fairly quickly that this was an external event we had no control over. Until that time, the demands for status updates were pretty intense.

And unfortunately, there is not much you can do about a DNS domain outage. The 24-48 hour DNS propagation time pretty much shoots down any short-term quick fixes. I went several rounds with a couple of people here about ways to work around the problem, and there was much educating going on as a result. We did give some end users the IP addresses of the servers in question. And then the SSL-based services stopped working, of course, since the certificates are issued for our hostnames and not for bare IP addresses.

As with any major failure of a system you never had to think about before, there is discussion about moving our DNS somewhere else. We'll see if this holds up.
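For anyone having the same discussion, here is a rough sketch of how you might check whether a domain's name servers all sit with one provider. It assumes you already have the list of NS hostnames in hand (from your registrar or a lookup tool), and it guesses the provider from the last two labels of each hostname, which is crude and will misjudge names under suffixes like co.uk. The function names and the example hostnames are made up for illustration.

```python
from collections import defaultdict


def provider_of(nameserver):
    """Crude guess at the provider: the last two labels of the NS hostname.

    Assumption: this ignores multi-part public suffixes such as co.uk.
    """
    labels = nameserver.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def group_by_provider(nameservers):
    """Map each guessed provider domain to the name servers under it."""
    groups = defaultdict(list)
    for ns in nameservers:
        groups[provider_of(ns)].append(ns)
    return dict(groups)


def single_provider(nameservers):
    """True when every listed name server appears to belong to one provider."""
    return len(group_by_provider(nameservers)) == 1


# Hypothetical NS sets, not real records.
only_one = ["ns1.example-host.net.", "ns2.example-host.net."]
mixed = ["ns1.example-host.net.", "ns1.example-secondary.org."]

print(single_provider(only_one))
print(single_provider(mixed))
```

If the first check comes back true for your own domain, one provider's bad day is your bad day too.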

2 Comments

Have you considered just having another company provide secondary DNS? It is pretty cheap ($40/year), and usually pretty easy to set up.

We use dyndns for secondary DNS.
http://www.dyndns.com/services/dns/secdns/

I've been using EasyDNS.com for our clients for the last decade without any real problems. They are a pure DNS play, focused on being the best DNS services provider out there, with highly reliable, robust services. While they aren't the cheapest, you certainly get more than your money's worth with them.