Cloudera’s recent Hadoop World 2011 event confirmed that the world of big data is getting even bigger. As I wrote of last year’s event, Hadoop, the open source large-scale data processing technology, has gone mainstream. And while 75% of the audience attended this year for the first time and so may not have realized the breadth of Hadoop’s acceptance, statistics announced in the opening keynote show widespread use of it. Mike Olson, Cloudera CEO, reported that the event was sold out, with 1,400 attendees from 580 organizations and 27 countries. In independent confirmation, our benchmark research shows that 54% of organizations are either using or evaluating Hadoop for their big-data needs.
Topics: Big Data, Datameer, MapR, Sales Performance, Social Media, Supply Chain Performance, Business Analytics, Business Intelligence, Business Performance, Cloudera, Customer & Contact Center, Financial Performance, Hortonworks, Informatica HParser, Karmasphere, NetApp, Workforce Performance, Strata+Hadoop
Cloudera is riding the wave of big data. I first learned about the company while working at Vertica, one of Cloudera’s partners. Customers that managed large amounts of structured relational data also needed to process large amounts of semistructured data such as the type found in web logs and application logs. The emerging channel of social media provided another source of data lacking the structure that would lend itself to analysis in a relational database. Other organizations needed to perform calculations and analyses that were difficult to express in SQL. Seeing this market, Cloudera recognized earlier than others an opportunity to leverage the Apache Hadoop project; it has been offering the Cloudera Distribution for Hadoop (CDH) since early 2009.
Topics: Big Data, Predictive Analytics, Sales Performance, Social Media, Supply Chain Performance, Business Analytics, Business Intelligence, Business Performance, CDH3, Cloudera, Customer & Contact Center, Information Management, Strata+Hadoop
Last week I attended the IBM Big Data Symposium at the Watson Research Center in Yorktown Heights, N.Y. The event was held in the auditorium where the recent Jeopardy shows featuring the computer called Watson took place and which still features the set used for the show – a fitting environment for IBM to put on another sort of “show” involving fast processing of lots of data. The same technology featured prominently in IBM’s big-data message, and the event was an orchestrated presentation more like a TV show than a news conference. Although it announced very little news at the event, IBM did make one very important statement: The company will not produce its own distribution of Hadoop, the open source distributed computing technology that enables organizations to process very large amounts of data quickly. Instead it will rely on and throw its weight behind the Apache Hadoop project – a stark contrast to EMC’s decision, announced earlier in the week, to create its own distribution. As an indication of IBM’s approach, Anant Jhingran, vice president and CTO for information management, commented, “We have got to avoid forking. It’s a death knell for emerging capabilities.”
Topics: Big Data, EMC, Analytics, Business Analytics, Business Intelligence, Cloud Computing, Cloudera, Customer & Contact Center, Greenplum, IBM, Information Applications, Information Management, InfoSphere, Location Intelligence, Operational Intelligence, IT Performance Management (ITPM), Strata+Hadoop
Earlier this week EMC announced it will create its own distribution for Apache Hadoop. Hadoop provides distributed computing capabilities that enable organizations to process very large amounts of data quickly. As I have written previously, the Hadoop market continues to grow and evolve. In fact, the rate of change may be accelerating. Let’s start with what EMC announced and then I’ll address what the announcement means for the market.
Topics: Big Data, EMC, Social Media, Teradata, Business Analytics, Business Collaboration, Business Intelligence, Cloud Computing, Cloudera, Customer & Contact Center, Greenplum, Information Management, Strata+Hadoop