Service outage 31st July & 1st August 2014
We are really sorry about the prolonged service outage experienced from about 20:30 on 31st July to 23:50 on 1st August 2014 (all times GMT+1). The reasons for this outage are numerous, complicated and still partly unknown. Unfortunately the outage was also completely out of our hands.
At approximately 20:30 on 31st July the internet connection between the Cre@tive technologies servers and the ISP's point of presence in Telehouse went down. All the servers themselves were online, but nothing had access to "the outside world" due to the ISP fault. (The ISP is a very large UK IT company, and many customers, including government departments, were affected.)
The ISP had engineers working on the issue from the early hours of 1st August at their datacentre in Telehouse London. Unfortunately we didn't receive any meaningful updates until 15:30, when it became apparent that the outage had been caused by multiple failures of multiple firewalls at Telehouse.
The ISP had been trying to restore their firewalls since approximately 13:00, but these restores kept failing until approximately 19:00, when the first services started to come back online. At this time, while it was possible to contact the Cre@tive technologies servers, there was no DNS resolution passing through the ISP's firewalls, meaning that individual sites and services still appeared offline to the internet.
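For anyone wanting to tell these two states apart from the outside, here is a minimal sketch in Python. It is not the check we or the ISP actually ran; the function name `diagnose`, the result labels and the example address are all made up for illustration. It assumes you already know the server's IP address, tries a direct TCP connection to it, and separately tries to resolve the hostname.

```python
import socket


def diagnose(hostname, ip, port=80, timeout=3.0,
             resolve=socket.gethostbyname,
             connect=socket.create_connection):
    """Roughly classify why a site looks offline (illustrative only).

    hostname and ip are supplied by the caller; resolve and connect can
    be swapped out so the logic can be exercised without a network.
    """
    # Step 1: can we open a TCP connection straight to the known IP?
    try:
        conn = connect((ip, port), timeout=timeout)
        conn.close()
        ip_ok = True
    except OSError:
        ip_ok = False

    # Step 2: does the hostname resolve at all?
    # socket.gaierror is a subclass of OSError, so one except covers it.
    try:
        resolve(hostname)
        dns_ok = True
    except OSError:
        dns_ok = False

    if ip_ok and dns_ok:
        return "online"
    if ip_ok:
        # Server answers on its IP but the name does not resolve:
        # the situation described above from about 19:00 onwards.
        return "dns-failure"
    # No connection to the IP at all: the earlier phase of the outage.
    return "unreachable"
```

Calling something like `diagnose("www.example.org", "192.0.2.10")` (both values are placeholders) would be expected to return "dns-failure" in the state described above, although a cached DNS answer on the machine running the check could hide the problem.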
Following more testing and some different fixes (and the power of Twitter), the ISP told us at 23:45 that DNS should be restored, which indeed proved to be the case, making the servers fully available again.
We will post more information about the causes of the outage if and when the ISP reveals them.
What can we do to stop this happening again? In short: not a lot. This was a very unusual failure, and the first and only time I've known this ISP to have such a large outage.