Enterprises in Australia need to rethink their cloud backup strategies following the recent AWS outage in Sydney
In early June, Amazon Web Services’ (AWS) Sydney-based cloud was unavailable for up to 10 hours for some customers after power to the datacentre was cut during a storm.
The event demonstrated that even the world’s largest cloud computing platforms are vulnerable to periodic failure, which means enterprise cloud users must still consider business continuity planning – particularly for mission-critical applications.
The AWS Cloud is used by Australian organisations such as the Commonwealth Bank, accounting software business MYOB and ad trading platform Brandscreen, and it also hosts the popular consumer game Fruit Ninja.
While Fruit Ninja players might have been momentarily frustrated by being unable to blow up a banana, business service disruption is far more serious, and the incident served as a reminder that enterprises cannot ignore business continuity planning even if they have signed up for cloud.
Amazon’s service health website, which tracks the performance of the cloud, shows that the EC2 service in the Sydney region was down for about two hours during the storm, with knock-on effects for other Amazon cloud services, such as Redshift, Elastic Beanstalk, Storage Gateway and CloudFormation. After 10 hours, most of the issues had been resolved.
Five days after the outage, Amazon released a post-mortem of the incident, which said that the electricity substation feeding the datacentre lost power in the storm and AWS’s uninterruptible power supply failed.
Even after power was restored, a software bug in AWS’s instance management software meant that recovery was slower than predicted for customers.
AWS has apologised to customers for the inconvenience and is now overhauling its power supply infrastructure and software to reduce the chances of a similar outage happening again.
But it also noted: “For this event, customers that were running their applications across multiple availability zones in the region were able to maintain availability throughout the event. For customers that need the highest availability for their applications, we continue to recommend running applications with this architecture.”
But that would not have been an option for customers with data sovereignty concerns because AWS does not have datacentres in multiple locations in Australia, only in the Sydney area.
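The multi-zone architecture AWS recommends can be illustrated with a minimal sketch. It is a toy model, not AWS code: the zone labels, the replica names and the helper functions are all assumptions made for illustration, and it calls no cloud API. It only shows why an application spread across several availability zones can stay up when one zone loses power, while an application confined to a single zone cannot.

```python
from itertools import cycle

# Illustrative zone labels for a single region; treat them as placeholders.
ZONES = ["ap-southeast-2a", "ap-southeast-2b", "ap-southeast-2c"]


def place_replicas(replica_count, zones):
    """Assign replicas to zones round-robin so no single zone holds them all."""
    zone_cycle = cycle(zones)
    return {f"app-{i}": next(zone_cycle) for i in range(replica_count)}


def surviving_replicas(placement, failed_zone):
    """Return the replicas still running after one zone goes down."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


def is_available(placement, failed_zone, minimum_healthy=1):
    """The application stays up if enough replicas sit outside the failed zone."""
    return len(surviving_replicas(placement, failed_zone)) >= minimum_healthy


# Hypothetical deployments: three replicas in one zone versus spread over three.
single_zone = place_replicas(3, ZONES[:1])
multi_zone = place_replicas(3, ZONES)

print(is_available(single_zone, "ap-southeast-2a"))  # False
print(is_available(multi_zone, "ap-southeast-2a"))   # True
```

In this model the single-zone deployment loses every replica when its zone fails, whereas the multi-zone deployment keeps two of its three running. As Huang notes, that redundancy is not free: the extra capacity has to be paid for.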
According to Gartner research director Olive Huang, too many companies take an “ostrich approach” to cloud business continuity, expecting their cloud suppliers to take care of that side of the house. “You can have redundancy, but it costs money,” she said. “People go to the public cloud very ill-prepared.”
Huang said that although IT departments running in-house systems might have business continuity and disaster recovery plans designed around the degree of systems failure a business could tolerate, such planning was often lacking when companies bought cloud services.
The problem is compounded because cloud services are often bought not by IT, but by the business, so much less thought is put into business continuity, she said. “Only when these things happen, someone needs to clean up,” she added.