Wednesday 30 July 2014

Inside Intel's $US740m punt on Cloudera

In March this year Intel paid $US740m for an 18 percent stake in Hadoop software company Cloudera, becoming its largest strategic shareholder. According to Cloudera founder and CTO Amr Awadallah, it's a sign that Intel believes the Cloudera approach could revolutionise the data centre and the way large organisations manage their massive databases.

Intel's investment was made at the time Cloudera raised $US160m in a round of venture funding, but it seems that Intel bought most of its shares on the market, so most of that money did not go to Cloudera itself.

That, however, does not diminish the significance of the move, which Intel describes as the largest single investment in data centre technology in its history. Intel said: "The deal will join Cloudera's leading enterprise analytic data management software powered by Apache Hadoop with the leading data centre architecture based on Intel Xeon technology. The goal is acceleration of customer adoption of big data solutions, making it easier for companies of all sizes to obtain increased business value from data by deploying open source Apache Hadoop solutions."

Awadallah likens the move to a number of other landmark initiatives over the years that have helped Intel become the dominant chipmaker. "Intel historically has been very clever in surveying their customers and seeing which new workflows are growing very quickly within data centres," he told me in an interview earlier this week. "So Intel now has a 96 percent market share in the data centre: ninety-six percent of the servers run Intel CPUs."

He identifies the earlier landmarks as the 'Wintel' alliance some 20 years ago, the alliance with Red Hat about 15 years ago, and the backing of virtualisation leader VMware about a decade ago. "Intel saw Cloudera growing very quickly and wanted to make sure that whatever they did was optimised for Cloudera," Awadallah said.

So just what is it about Cloudera and its Hadoop distribution that has merited Intel making the company the target of its largest ever investment in data centre technology?

It all centres on Cloudera's concept of the Enterprise Data Hub. According to Awadallah, a traditional data centre architecture comprises dedicated and costly storage systems and processors connected by a network. Data is by and large dedicated to each application and, where necessary, replicated to serve different applications.

Cloudera's Enterprise Data Hub is made up of low-cost, commodity 'pizza box' servers, each containing both CPU and disk, and open source software. This software manages a single pool of data and serves data up to applications as needed. For redundancy, data is replicated across multiple pizza boxes, and the management software automatically isolates any device that fails.
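The replicate-and-isolate idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cloudera's or Hadoop's actual code: the class `ToyDataHub`, its method names, the round-robin placement and the replication factor of three are all hypothetical choices made for illustration, and unlike a real cluster the sketch does not create fresh copies after a failure.

```python
import itertools


class ToyDataHub:
    """Hypothetical toy model: blocks are copied to several servers,
    and a failed server is simply excluded from future reads and writes."""

    def __init__(self, node_names, replication=3):
        self.live_nodes = set(node_names)
        self.replication = replication  # assumed number of copies per block
        self.placement = {}             # block id -> set of nodes holding a copy
        self._ring = itertools.cycle(sorted(node_names))

    def store(self, block_id):
        """Place copies of a block on distinct live nodes, round-robin."""
        wanted = min(self.replication, len(self.live_nodes))
        chosen = set()
        while len(chosen) < wanted:
            node = next(self._ring)
            if node in self.live_nodes:
                chosen.add(node)
        self.placement[block_id] = chosen

    def fail(self, node):
        """Isolate a failed node; its copies are no longer served."""
        self.live_nodes.discard(node)

    def readable_from(self, block_id):
        """Return the live nodes that can still serve this block."""
        return sorted(self.placement[block_id] & self.live_nodes)


hub = ToyDataHub(["box1", "box2", "box3", "box4"])
hub.store("blk-0")
hub.fail("box2")
print(hub.readable_from("blk-0"))
```

After `box2` fails, the block is still readable from the two surviving copies, which is the point of spreading replicas across cheap servers.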

According to this Cloudera white paper on the EDH, "An enterprise data hub (EDH) is one place to store all data, for as long as desired or required, in its original fidelity; integrated with existing infrastructure and tools; with the flexibility to run a variety of enterprise workloads—including batch processing, interactive SQL, enterprise search, and advanced analytics—together with the robust security, governance, data protection, and management that enterprises require. With an enterprise data hub, leading organisations are changing the way they think about data, transforming it from a cost to an asset."

Awadallah contrasts the EDH approach with a data warehouse built on relational database technology and claims that the costs of data storage are 30 to 100 times lower. "With relational systems you are looking at average cost of $30,000 per terabyte, $30 million for one petabyte. The cost with the Enterprise Data Hub is anywhere from $300,000 per petabyte to $1 million per petabyte. We are 30 to 100 times cheaper."
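Awadallah's "30 to 100 times" figure follows directly from the numbers he quotes. The short check below is a sketch that takes one petabyte as 1,000 terabytes, an assumption that matches his own conversion of $30,000 per terabyte to $30 million per petabyte; the variable names are illustrative only.

```python
# Figures as quoted by Awadallah, in dollars.
TB_PER_PB = 1_000                      # assumption: 1 PB = 1,000 TB

relational_cost_per_tb = 30_000
edh_cost_per_pb_low = 300_000
edh_cost_per_pb_high = 1_000_000

# Relational cost scaled up to a petabyte.
relational_cost_per_pb = relational_cost_per_tb * TB_PER_PB

# Ratio of relational cost to EDH cost at each end of the quoted range.
ratio_best = relational_cost_per_pb / edh_cost_per_pb_low
ratio_worst = relational_cost_per_pb / edh_cost_per_pb_high

print(f"Relational: ${relational_cost_per_pb:,} per PB")
print(f"EDH is {ratio_worst:.0f}x to {ratio_best:.0f}x cheaper")
```

At the $1 million end of the range the hub is 30 times cheaper; at the $300,000 end it is 100 times cheaper.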

This is only part of the story, according to Awadallah. The power of Hadoop is that it enables analysis of, and insights into, both structured and unstructured data. The lower cost means that much more data can be simultaneously available for analysis than with a data warehouse, where costs dictate that old and little-used data must be archived. This in turn enables organisations to completely re-engineer the way they operate and to extract many more valuable insights from their data.

This, Awadallah says, represents "the highest level of maturity of the hub vision," and is "when you have achieved enlightenment as an organisation, what we refer to as converged analytics.

"This is where you have a single place with all your data and your workloads all come to the data, as opposed to the data going to the workloads. This is typically a four-year journey; for some organisations it can be a ten-year journey and for some organisations it can be a one-year journey."


An organisation's path to enlightenment with the Enterprise Data Hub, Awadallah says, must pass through a hiatus where the technology moves from being the domain of IT to being the domain of users within the business. How an organisation manages that transition is yet another example of the challenges organisations face in 'becoming digital', but that's a story for another day.
