Well that was ugly.
Today at 4:40pm Eastern we experienced a wholesale loss of services from Media Temple, my expensive hosting provider. That complete and total failure of redundant website hosting, FTP, email and everything else that I am virtually made of demonstrates just how awful and tender and fragile technology is on the internet, and how poor planning can come around and chomp even a good company on the behind.


Here is the official notification from Media Temple that took over an
hour to get published online. It should have also been mass mailed to
every customer but it was not. Media Temple are located in California
so that explains the time difference between me and them. “Availability
Issues” is putting it mildly:

(mt) Media Temple Operations
Availability issues on Friday, July 28th 1:00pm
At approximately 1:00pm today, (mt) Media Temple’s (LA-IDC2) data
center experienced another power issue. Unfortunately, information from
the building management at the Garland Building is sparse, vague at
best, and lacking important detail. It is being speculated that one of
the buildings back up generators has caught fire; however this
information is hearsay and can not be confirmed. (mt) Media Temple
staff has responded to this issue promptly and with the utmost urgency
by fully staffing the company and answering customer phone calls and
support tickets.

At the time of this writing, all power systems have
been restored and engineers are currently bringing customer servers
back up and repairing file systems that have become temporarily
corrupted.

This incident is clearly an unacceptable situation for our company and
our customers. (mt) Media Temple has decided to accelerate its plans to
move all customers in (IDC-LA2) to a new data center which has
undergone exhaustive testing to insure such power issues will not be a
problem in the future.

Our President and CEO, Demian Sellfors, is making himself personally
available to call customers back who wish to discuss these issues in
more detail. We encourage any customer who wishes to receive a call
back from Mr. Sellfors to please indicate so inside this ticket.
Shortly, he will be calling you back to discuss these matters with you.

I pay a lot of money for the Media Temple Dedicated-Virtual server and
when this blog and all my sites and email were still down at 6:10pm
Eastern, I called Media Temple. I was told the problem was being
addressed and my services should be restored “soon.”

“Soon” is a hard word to swallow when all your eggs are in the dropped
Media Temple basket.

We were fully back online around 7:22pm.

We’ll see if we stay alive or
if we pop on and off again as things simmer down or bubble up again.
I’m sorry for being offline so long.
If everything seems to disappear again in the future and you are
wondering what happened — or whatever — you are welcome to use my
Gmail address “dboles” to tell me what’s up or to ask me what happened.
For a moment earlier today I thought my Verizon DSL connectivity was
having internet routing issues, but now we know that wasn’t the case.
We were dead and down.
Hard.
It makes you a little sick to your stomach.

19 Comments

  1. Hi David,
    Good to see the blog back online again!
    I think California is falling apart under their heatwave.
    I haven’t been able to get into my MySpace account all day to see if anyone sent me any messages. Earlier last week, they had some power problems that caused service outages.
    It’s not a big deal because that service is just for fun, but it shows that “big computer companies” are vulnerable and aren’t as stable as we always assume they should be.
    I think the predictions we made in our heatwave article about the power grid being fragile are being proved right.

  2. Chris!
    Yeah, I think the whole world is melting!
    Pro shops like Media Temple should have backup upon backup upon backup systems in place so if something overheats or catches fire you flip a switch and push everything from a separate core until the main core gets fixed. At least that’s how I’ve seen other high-velocity web hosting companies operate.
    I’m surprised MySpace is down, too! What a wacky virtual world we weave!

  3. I noticed something was wrong when I could access the 9rules blog. I’m glad you didn’t lose anything =)
    I pay so little money every year with Jumba, and I never fail to be amazed at the customer service and server reliability. Though I can’t really talk about how reliable Media Temple is since I’ve never used them.

  4. I took a look at the features of my hosting company and note that they have redundant power, air conditioning, and fire suppression systems, as well as off-site backup.
    I remember reading the blog written by the guys running the data center in New Orleans. They survived the hurricane, looting, soldiers, mold, and everything else nature threw at them.
    It’s too bad that places that have tons of money don’t have enough backup systems to survive power outages in a specific area. Especially something like MySpace that is now the web host for brand new teen movies. I wonder if advertisers are getting worried …
    From C-Net:

    Last weekend, the 12-hour outages on MySpace sent the News Corp.-owned social-networking site’s young user base into disarray. …
    Then, on Monday morning, MySpace regained its server power and things seemed to be back to normal.
    But now: They’re baaaaa-ack! On Friday morning, the blogosphere began buzzing with the news that an unknown portion of MySpace profiles were inaccessible, displaying messages that say “Invalid Friend ID. This user has either cancelled their membership, or their account has been deleted.”
    This is, however, nothing like last weekend’s power fizzle; this mini-outage does not appear to affect all MySpace users, nor has it shut down the main site, because it seems to be limited to profiles. There is no official word from MySpace yet about what exactly the problem is, and as of around 10 a.m. Pacific time, it hasn’t ended yet. But despite its limited scope, the new outages have inspired a smaller version of the panic from over the weekend–among the MySpace demographic, “deleted” can be a really, really bad word.

  5. Hiya Yvonne!
    I went to some of the other sites I know Media Temple hosts to see if they were down, too, and they sure were.
    Then I tried to hit Media Temple’s homepage and it was unavailable! When a web host’s own site goes down for a long time you know there’s deep trouble brewing.
    You’re lucky to have a hosting service you like and that works well for you! Congrats!

  6. I agree with you, Chris! Fires and electrical problems should be expected and properly prevented when your web business is the business of the web.
    I’m sorry to hear about your continued MySpace trouble! What a disappointment! It’s interesting how the power grid isn’t getting upgraded but more and more services and companies that are hungry for power keep coming online. It’s a wonder the whole web infrastructure hasn’t melted yet.

  7. Hi David,
    I read not too long ago that over 90 people had died in the heatwave scorching California.
    From the AP:

    Fresno, California: Corpses piled up at the morgue, and aid workers went door-to-door, checking in on elderly people in hopes of keeping the death toll from California’s 12-day-old heatwave from rising.
    The number of deaths possibly connected to the heatwave climbed to 98, California coroner’s offices said on Thursday.
    In Fresno County’s morgue, the walk-in freezer was stuffed with bodies, with some piled on top of others, said Coroner Loralee Cervantes.
    With limited air conditioning, employees worked in sweltering heat as they investigated at least 22 possible heat-related deaths.

    We rely on the power grid to keep us alive.
    And, it hasn’t been working, as seen in Queens, St. Louis, and California.
    When a well-funded service like News Corp’s $580 million MySpace — a service that generates 10,593 page views per second with its 15 million daily unique logins — can be taken to its knees by power failures, we have much to fear.
    The mighty and the powerful shouldn’t be affected — but they are — right along with the weak and the common people as everyone becomes powerless.
    It’s a sure sign that we need to think about infrastructure and the costs — both in terms of human lives and business dollars lost — when we fail to think of the future.

  8. OUCH – makes my minor problems this week pale in comparison. What really infuriates me so much about this is that IF companies have STATUS pages (to me the first port of call), they are often on the same server as the ones that fail! That really ticks me off.
    I have been blaming my problems on the heat … it is literally as if the wires have melted and turned to glue.
    Very relieved to see you are back up and running, and that you have not lost previous comments etc. (Shame about the spam though.)

  9. Chris —
    Thanks for the valuable links! You can add Staten Island to the list of misery. Right after Queens returned to power after 9 days in the dark, Staten Island went brown and nearly all the way down.
    Now they’re saying the “fix” to Queens is only temporary because all the cables underground were burned out so the current “fix” has 23 miles of replacement electrical cable IN THE STREETS!
    Yes, the power to Queens is being delivered from the gutters.
    There are plans to move the power cables up to telephone poles or back down underground — but to do that — you have to take Queens offline again, block-by-block a day at a time. It’s a filthy mess on every level.
    You’re right: our needs increase but the system’s capability to fulfill our needs diminishes. We’ve lived in Jersey City for four years now and I have personally added two power-hungry computers, several fans, a new printer… but taken nothing else offline.
    We have probably doubled our power consumption here in four years. What should I turn off? My 85 watt-hungry MacBook Pro? My laser printer? My air conditioner? The needs of the immediate are predicted by future innovation.
    Modern life demands modern tools and if those needs are not met then we all begin to face the problem of ancient punishments of cold, heat and darkness.
    We’re seeing the suggestion of our future in our current (pun intended!) failures.

  10. Hi Nicola!
    You’re right! You need to have a separate server SOMEWHERE in the world you can call into service to send mass email updates and create an emergency status page. You can’t, as a web business providing vital services to people for profit, shelter yourself from one storm at a time. You must predict and move in the middle of repetitive catastrophic storms.
    There should be another update RIGHT NOW on the Media Temple site explaining in more detail what happened and what will be done to prevent a repeat of the mess, but today we still see the same message I posted here yesterday.
    Yes, it’s a shame when getting Spam again is the sign of life.
    😉

  11. Hey! Glad to see you back. You went down right around the time of my last post, so I wasn’t sure that had even gotten through. I thought maybe it was the connection, but when I got home and tried from here – nothing!
    Then *I* was down. Equipment failure. I’m back, too.
    Offline all day…

  12. Yeah! I had my response ready for you and then — BOOM! — we were dark for the next three hours. It was a painful wait. Usually if there’s a service burp it lasts for 5 or 10 minutes. Three hours is an eternity.
    You went down, too? We’re all melting!
    😀