Rackspace Supercharges Big Data With New Hadoop and Spark Bare Metal Solutions

Rackspace made some noise yesterday at the Strata + Hadoop World conference, currently taking place in New York’s Jacob Javits Center, with an announcement that could have a major impact on the field of big data analytics. The company unveiled a new offering it’s calling the OnMetal Cloud Big Data Platform, which essentially gives customers bare metal access to Hadoop clusters with Spark via the cloud.

“This solution breaks new ground for the world of big data,” said John Engates, CTO, Rackspace. “For the first time, Hadoop and Spark can have the best of both worlds: bare metal performance with cloud agility...”

Rackspace has been offering customers the ability to leverage Hadoop via the cloud, but the previous solutions were virtualized, which can result in significant performance penalties because virtual machines share resources. With its OnMetal Cloud Big Data Platform, though, Rackspace can now offer powerful, dedicated hardware (i.e., “bare metal”) and all of the performance advantages that come with it, via the cloud. Previously, bare metal access for Hadoop required on-site installations that also needed to be maintained and managed. Rackspace’s pitch is that fully managed Hadoop and Spark hardware and software, delivered via its Managed Cloud, can save customers time and money by eliminating the need to deploy and maintain the setup on their own, while still letting them reap the benefits of dedicated hardware.

Configuring Rackspace’s OnMetal Cloud Big Data Platform is also apparently quite simple, provided the default installation options are adequate for your workload. According to Rackspace, “Data Scientists and decision makers can now launch a big data platform in three clicks.” Of course, optimizing Hadoop and Spark for particular workloads can be complex, but the initial configuration requires little more than making a few simple choices in a remote UI: region, node, and size options.

Rackspace showed off its OnMetal Cloud Big Data Platform live at Strata + Hadoop World.

To demonstrate the performance gains possible when running Hadoop and Spark on OnMetal, Rackspace ran a demo in its booth at the Strata + Hadoop World conference in which data from Twitter was pulled in and analyzed in near real time. During the demo, a 5-second snapshot of tweets using a particular hashtag was streamed in, and sentiment analysis was performed on that data. Then, a visualization was displayed on-screen that showed whether the tweets were mostly positive, mostly negative, or neutral. Basically, every 5 seconds, a fresh 5-second snapshot of sentiment on Twitter was analyzed.
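For readers curious what that kind of windowed analysis looks like in code, here is a minimal sketch in plain Python. To be clear, this is not Rackspace's demo code, and it uses neither Hadoop nor Spark. The word lists, the function names (`score`, `summarize_windows`, `overall`), and the sample tweets are all hypothetical, made up purely for illustration. It only shows the general idea of bucketing timestamped tweets into 5-second windows and labeling each window.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon, assumed for illustration only.
# A production system would use a trained sentiment model instead.
POSITIVE = {"love", "great", "fast", "good"}
NEGATIVE = {"hate", "slow", "bad", "broken"}

WINDOW_SECONDS = 5  # snapshot length described in the demo


def score(text):
    """Classify one tweet as 'positive', 'negative', or 'neutral'."""
    words = [w.strip(".,!?#").lower() for w in text.split()]
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


def summarize_windows(tweets, window=WINDOW_SECONDS):
    """Group (timestamp_seconds, text) pairs into fixed windows and tally sentiment."""
    buckets = defaultdict(Counter)
    for ts, text in tweets:
        start = int(ts // window) * window  # start of the window this tweet falls in
        buckets[start][score(text)] += 1
    return {start: dict(counts) for start, counts in sorted(buckets.items())}


def overall(counts):
    """Label a window as mostly positive, mostly negative, or neutral."""
    pos = counts.get("positive", 0)
    neg = counts.get("negative", 0)
    if pos > neg:
        return "mostly positive"
    if neg > pos:
        return "mostly negative"
    return "neutral"


if __name__ == "__main__":
    # Made-up sample data: (seconds since start, tweet text)
    sample = [
        (0.5, "I love #bigdata, it is great!"),
        (3.2, "so slow today #bigdata"),
        (4.9, "#bigdata meetup tonight"),
        (6.0, "I hate broken clusters #bigdata"),
        (9.1, "bad day for #bigdata"),
    ]
    for start, counts in summarize_windows(sample).items():
        print(f"window starting at {start}s: {counts} -> {overall(counts)}")
```

A real deployment would swap the in-memory list for a live stream and the word lists for a proper model, which is where the speed of the underlying cluster starts to matter.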

On the surface, the demo may seem relatively simple. But with previous virtualized Hadoop installations, the data would have to be brought in, partitioned properly, and analyzed before the visualization could be generated. Under ideal conditions, that process could take roughly 2 to 3 minutes, and it could stretch to 10 to 15 minutes, whereas with OnMetal it was happening in near real time. That type of speed-up is huge for any application where machine learning data is used to make mission-critical decisions on pricing or gaming analysis, among other things.

Rackspace’s OnMetal Cloud Big Data Platform employs multi-core servers with up to 128GB of RAM and fast PCI Express-based solid state storage solutions, dedicated to its customers. The performance offered by configurations like this is simply not possible with virtualized solutions.

If Rackspace’s OnMetal Cloud Big Data Platform is interesting to your enterprise, I’d like to hear from you in the comments below. The kind of performance improvements being talked about here could be game changers for some use cases; whether you agree or disagree, I want to know.
