Resolved

This incident has now been fully resolved. We have been monitoring activity for the last 30 minutes and all systems are working well.

Thank you for your patience during this time.

The story: Rackspace is renowned for being the global leader in cloud computing, with over 7200 staff, 2400 of them cloud technicians.

So what’s happened?

The London cloud team, made up of one engineer last night (one??), pushed out a critical fix which in turn has had a knock-on effect, bringing several of their London cloud servers down. Over 15 hours later there is still only one member of staff in that team, and he is working remotely, with most servers still down.

You couldn’t write this stuff: Covid is being blamed.

The teams around him tried to help but were unable to.

We tried to make as much noise as we could but, as you can imagine, with teams working from home and it being a weekend, it has taken time.

Rackspace is a global leader, but this unfortunately reminds us that things can still go wrong.

That said, we have to reflect on our uptime and reliability over the 3+ years we have been with them. There is no question that their solution is of the highest standard and, apart from today, is miles above the rest.

We are so sorry that they have been letting us down and in turn we are letting you down. We understand how important your website is to your organisation and we will continue to be fully available to work with Rackspace over the weekend if we need to be.

Thank you once again for your ongoing patience.

Recovering

Great news, we are back live.

So Rackspace is renowned for being the global leader in cloud computing, with over 7200 staff, 2400 of them cloud technicians.

So what’s happened?

The London cloud team, made up of one engineer last night (one??), pushed out a critical fix which in turn has had a knock-on effect, bringing several of their London cloud servers down. Over 15 hours later there is still only one member of staff in that team, and he is working remotely, with most servers still down.

You couldn’t write this stuff: Covid is being blamed.

The teams around him tried to help but were unable to.

We tried to make as much noise as we could but, as you can imagine, with teams working from home and it being a weekend, it has taken time.

Rackspace is a global leader, but this unfortunately reminds us that things can still go wrong.

That said, we have to reflect on our uptime and reliability over the 3+ years we have been with them. There is no question that their solution is of the highest standard and, apart from today, is miles above the rest.

We are so sorry that they have been letting us down and in turn we are letting you down. We understand how important your website is to your organisation and we will continue to be fully available to work with Rackspace over the weekend if we need to be.

Thank you once again for your ongoing patience.

We will keep this ticket open for the next 30 minutes whilst we monitor activity to ensure everything is running smoothly.

Updated

Rackspace have now issued a workflow for us to implement to get us back onto the network. There are quite a few steps to go through and it sounds like they are not expecting an easy ride, but it’s a step forward, so we are about to try it now.

We will post a further update in 2 hours’ time, if not before.

Updated

The Rackspace Linux team has been extremely helpful, but is facing the same problem trying to get the instances to recover. The issue is around the cloud storage and API.

There is a bit more info here: https://rackspace.service-now.com/system_status?id=child_service_status&service=02b882f0db6cf200e93ff2e9af961910

However, it is still out of date and still does not reflect a solution that works.

We will continue to update this ticket, within 2 hours at most.

Updated

We can reassure you that we have full backups. We have also heard of some servers coming back online, and we have managed to connect to our own servers and see files, albeit only for a very short period of time. These changes are positive signs that things are moving.

We believe the cloud team at Rackspace are in the middle of critical planned maintenance, due to go on until this evening (more info here: https://rackspace.service-now.com/system_status ). This has clearly disrupted services, which was not planned for, and their focus is to complete the maintenance ASAP, which could be why they are not responding to us much. Other teams around them are being very responsive, but unfortunately it is the cloud team that needs to fix this.

We apologise for the inconvenience this is causing and thank you for your continued patience. We will continue to chase Rackspace to get this issue resolved as quickly as possible. This ticket will be updated again in a couple of hours, if not before.

Updated

We can unfortunately confirm that services are still out.

There is a group of servers that are not responding (we are aware of several servers across multiple Rackspace clients), but so far Rackspace have not been overly helpful or very responsive.

Of course this isn’t acceptable and is the very opposite of their normal impeccable service, but right now we are in their hands and awaiting an update on this.

We only hope that, as the day starts, a more responsive team will be in place to help get this resolved as soon as possible.

Apologies for the inconvenience, and thank you for your ongoing patience. We realise how important your website is for your business.

Updated

We still don’t have an update on this and pass on huge apologies for any inconvenience.

Very unlike Rackspace, they have confirmed there is an issue and confirmed they are investigating, but have not come back to us yet. We wonder if their onsite protocols are an issue with Covid.

As it’s clear this is something at Rackspace’s end, we will update this ticket in the morning, with the hope that it gets resolved very shortly.

Thanks for your patience.

Identified

There is a wider problem with the Rackspace network. We are in close contact with them. Their service status only talks about planned maintenance, but it looks like something has gone wrong: https://rackspace.service-now.com/system_status?id=child_service_status&service=cdfa0ef0db6cf200e93ff2e9af96197f

Investigating

We are aware of a loss of connection and are investigating now.

Began at:

Affected components
  • Cloud Servers
  • CDN
  • Cloud Network