BFI.org's Web Hosting Troubles

Submitted by admin on Tue, 2006-07-11 10:59.
Hi,

I am the web administrator, designer, and developer behind bfi.org. I wanted to post a quick message for those curious about why our site has been going down so much lately. In this post I will explain our current hosting situation and give some details about the 15+ hour outage that occurred yesterday, July 10, 2006. I will also provide some information on how you can help us make bfi.org a faster and more reliable resource.

We are currently using site5 to host bfi.org and some of our other affiliated sites such as the Sustainability Dialogues site.

Initially site5 was great: very affordable, and we had really no issues with them from October 2005 until around April 2006, when our troubles began.

Bfi.org began to experience minor issues on a weekly basis, and the Design Science Lab site, also hosted by site5 (although on a different server), started to go down once a day for most of May and June 2006. After sending email after email over the course of two weeks (!), I finally got a formal response and apology. In the response, site5 claimed that the downtime was only due to them reorganizing client accounts on their servers and that we would see drastic improvements in availability once everything was resolved. (Not much of an excuse for having a client site go down daily for almost a month, in my opinion...)


Regarding our recent outage

Our recent outage on bfi.org, which lasted 15+ hours yesterday, was due to site5 suspending our account because the email traffic from sending out our Design Science News newsletter via crontab (a system batch job that sends the email in chunks) looked suspicious to them.
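
For the technically curious, here is a minimal sketch of what "sending in chunks" means. This is not our actual script: the function names, the chunk size, and the idea of picking one batch per cron run by index are all assumptions made for illustration, and the code only splits the list rather than sending any mail.

```python
def chunk_recipients(recipients, chunk_size):
    """Split a recipient list into consecutive batches of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        recipients[i:i + chunk_size]
        for i in range(0, len(recipients), chunk_size)
    ]


def next_batch(recipients, chunk_size, run_index):
    """Return the batch for a given cron run (hypothetical scheme).

    run_index counts cron runs starting at 0; once every batch has been
    handed out, an empty list is returned so the job has nothing to send.
    """
    batches = chunk_recipients(recipients, chunk_size)
    if run_index < len(batches):
        return batches[run_index]
    return []
```

The point of spreading a mailing out like this is that each cron run sends only a small batch, so the outgoing mail volume stays low at any one moment.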

After our account went down I immediately emailed Customer Support at site5 but got no response. An hour later I was informed that our site was sending suspicious emails and that they had therefore suspended our account. I clarified the situation to them in an email but didn't hear anything back. I sent another email an hour later and still heard nothing.

Fast forward to 11pm. After four more emails and another 5 hours I still had heard nothing... I checked back the next morning and still had no response.

After another email (this time with a slightly angrier tone) to site5 at around 7am, I finally got the following response:

"I have unsuspended your account as you say these are legitamite mailings, please ensure to adjust your cronjob as you mentioned. Also, please do not open multiple tickets on the same issue or reply to tickets asking for updates, as this will actually delay our response."

OK. So maybe I panicked a little and sent too many emails, but having to wait 13 hours is kind of ridiculous!

But then it actually gets even better:

After they unsuspended our account, the site showed an internal server error message and was still not accessible to the public! As you might have guessed by now, I sent an email immediately and, as usual, got no response. After about an hour I sent another email. Again, no response until about an hour later, when I got the following:

"Greetings, Thank you for contacting the Site5 Customer Service Group! I have fixed the error. Your site is loading fine now. Let us know if you need any further assistance from us."

So after all this the site was finally usable again. A total downtime of 15 hours, 13 of which were spent just waiting for someone from site5 to get back to me.

So what can we learn from this?

For starters, it seems that BFI is going to need to start looking for a more suitable hosting provider. We need a hosting provider that can offer us a dedicated server so that we can cover all of our mailing and bandwidth needs as well as be able to offer the community a much more reliable and faster site.

Please look for a more formal announcement about these issues on our website within the next few days. The posting will contain detailed information about how you can help bfi.org become a more reliable resource.

Thanks for your continued support!
Best,
- Jochen
