Welcome to Orpington Astronomical Society.
 


Downtime Sunday 26th Feb

Started by Rick, Feb 26, 2012, 21:51:58


Rick

Apologies for the forum and website down-time today. The server on which the OAS websites are held crashed sometime early on Sunday morning. I reported the problem at about 08:10. Unfortunately it seems it wasn't quite the easiest thing to fix. Here's the system status report from the hosting provider:

Quote: The server at 91.204.209.41 has suffered two drive failures at the same time causing the server to fail.

As this is one of our older servers running RAID5 this means we're not able to replace the drives and repair, instead the server needs to be restored from backups.

As you may know we did have plans to replace some of the older hardware and this server was due in the coming weeks. Instead of replacing the drives and restoring to the old hardware we have a brand new server online with much higher specs and will be restoring to that.

We're currently prepping the new server for restore and will start the restoration process in around an hour. When this happens we will be creating a post in the "Network Issues" section of our website with progress and full details.

Apologies for any inconvenience, we're working as quickly as we can to bring everyone back online.

11:57 - The restore has been started. We'll give R1Soft a little while to get going before announcing an ETA as it tends to jump around to begin with.

12:35 - It looks like the restoration has around two and a half hours left, then there will be around 30 minutes of configuration required before booting.

14:00 - Currently reporting 1hr 23 remaining.

14:53 - 30 minutes remaining

15:30 - The restore is complete and we're reconfiguring the kernel and drive setup before fully booting the server.

19:22 - The server has booted, Litespeed configured, MySQL repairs performed, grub configured, kernel rebuilt and we're now performing the IP migrations. Unfortunately as the new server is in a different part of the datacenter the old IPs can't be re-used. New IPs will be issued shortly.

21:10 - The server has been up and running for around an hour. The DNS has propagated across our cluster but may take a few hours to fully propagate if you have it cached.

The website and forum look to have been restored to their state at some time past 10pm on Saturday. We may have lost a very few forum comments, but it shouldn't be any worse than that. If you find anything missing please let me know.
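As an aside on the RAID5 point in the provider's report: RAID5 keeps one parity block per stripe, so it can rebuild the contents of a single failed drive but not two at once. The Python sketch below is only an illustration of that idea and not how the hosting provider's array was actually laid out. The block contents, the three-data-drives-plus-parity layout and the helper name `xor_blocks` are all made up for the example, and real RAID5 rotates the parity between drives rather than keeping it on one.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Made-up stripe: three data blocks, one per drive, plus one parity block.
data = [b"ORPI", b"NGTO", b"NAST"]
parity = xor_blocks(data)

# One drive lost (the second): XOR of the survivors and the parity
# gives back the missing block exactly.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Two drives lost (second and third): the survivors only reveal the XOR
# of the two missing blocks, which many different pairs of blocks share,
# so neither block can be rebuilt and a restore from backup is needed.
lost_pair_xor = xor_blocks([data[0], parity])
assert lost_pair_xor == xor_blocks([data[1], data[2]])
```

With one drive gone the array can be repaired in place; with two gone at the same time there is simply not enough information left, which matches the provider's decision to restore from backups instead.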

Cheers,
Rick.

Carole

Thanks for letting us know, Rick.

Carole