There are good reasons to move big data systems to the cloud, but doing so also poses challenges for IT teams in migrating workloads and then managing clusters and system instances.
Companies are increasingly shifting big data clusters to the cloud for more flexibility and easier scalability. But IT managers who have made the move warn that getting the clusters there isn’t easy, and that there are ongoing complications to contend with after you do.
The hurdles start with workload and data migration challenges, and they continue with a variety of management issues, according to speakers and attendees at the 2017 Strata Data Conference here. They pointed to things such as frequent system crashes and the need to carefully manage temporary clusters that are set up to run particular processing jobs and then shut down. In addition, they said some workloads aren’t a good fit for the cloud computing model, which can require integration with systems that are left running internally.
The ability to dynamically spin up and modify big data clusters as needed in the cloud makes dealing with the downsides worthwhile for Chris Mills, who leads the big data team at The Meet Group Inc., a New Hope, Pa., company that operates a set of social networking and online dating sites.
Since the company switched from an on-premises big data environment to one in the Amazon Web Services (AWS) cloud, clusters can be added or expanded “in minutes,” Mills said. That has reduced IT overhead costs and made experimental and “deep-dive” analytics applications more feasible, he added.
But moving to the cloud “is going to cost more and take longer than you planned,” Mills cautioned in a conference session. In The Meet Group’s case, that was partly due to the project team identifying potential new applications during the migration process. But unexpected issues also cropped up along the way, he said. All told, it took about six months to set up the cloud-based big data architecture, and another six months to fine-tune the environment.
Getting in tune on cloud migrations
At music streaming company Spotify, migrating thousands of processing workloads from an on-premises Hadoop cluster to a new architecture on the Google Cloud Platform created both technical and organizational challenges, said Alison Gilles, director of engineering for its data infrastructure group.
Stockholm-based Spotify couldn’t just start moving jobs to the cloud without potentially blocking others from continuing to run successfully, said Gilles, who works at the company’s U.S. headquarters in New York. Nor could its 100 or so product engineering and operations teams, which control their own workloads, stop working on projects related to the streaming service to focus on the migration effort.
To make sure processing jobs don’t get blocked, Spotify is aggressively copying data back and forth between the on-premises cluster and the cloud architecture, said Josh Baer, who is managing the data migration process. In August, the copying work amounted to 110,000 jobs in its own right.
“We’re incurring some technical debt,” Baer acknowledged during a joint presentation with Gilles. “But we thought the long-term gain was worth some short-term pain here.”
The data infrastructure unit also developed a set of open source software tools to help streamline migrations. To support “forklifting” of workloads to the cloud platform, Baer said, it built a tool for scheduling batch migration jobs to run in Docker containers via the Kubernetes orchestration engine, plus a technology that automates the setup of temporary big data clusters to handle the migration workflows. A Scala API was created as well for teams that want to rewrite their workloads as part of the migration, although Baer said the infrastructure group encourages them to move the applications first.