
Teradata recently held its Partners User Group meeting (Twitter hashtag #TDPUG11) in San Diego. Analysts were briefed previously on some of the announcements, which I covered in an earlier post.

With the benefit of hindsight, I realize that Teradata is at the center of a trend in the information management market toward broader availability and acceptance of a range of data appliances. Teradata has been in what’s now called the appliance business for decades, but only in the last decade have other vendors, beginning with Netezza in 2000, recognized an opportunity to compete in the appliance segment. Reflecting on the Teradata Partners event, including the announcements and the customers I met, it occurred to me that Teradata has a unique breadth of products, which it calls a family of appliances. The label is appropriate because they share a common set of software (with the exception of the products recently acquired with Aster Data). I’m sure IBM would like to position its appliances as similarly broad, but its “family” of appliances is more like cousins than siblings. They are independent product offerings with different software, and each addresses a different portion of the market.

In this context of a family of products, the announcements surrounding the Teradata Partners meeting can be divided into two groups: enhancements to individual appliances and new features related to the whole family. Several announcements covered features of the upcoming Teradata 14, including new columnar capabilities, extended analytic capabilities and Teradata Unity to tie together various parts of the Teradata product family. Unity provides query routing and database synchronization among different Teradata instances in an organization and also distinguishes the Teradata offerings as a family. With Unity you can query all of them as a single instance. Behind the scenes it routes the query to the appropriate instance based on which system has the data and the availability of that system. Unity also maintains consistency among individual database instances, and all or portions of a database can be synchronized across different machines. Unity does not require Teradata 14 and is available now in the Americas. The workload management capabilities of Teradata Active System Management (TASM) are separate from Unity, but I hope to see integration of these two sets of capabilities in the future.
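To make the routing idea concrete, here is a minimal Python sketch of choosing an instance by data location and availability. It is only an illustration of the concept described above, not Unity's actual logic or interface; the `Instance` class, the `route` function and the table names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for one database system in the family."""
    name: str
    tables: set = field(default_factory=set)
    available: bool = True


def route(table, instances):
    """Return the first available instance that holds `table`.

    Raises LookupError when no available instance has the data.
    """
    for inst in instances:
        if inst.available and table in inst.tables:
            return inst
    raise LookupError(f"no available instance holds {table!r}")


# Two made-up systems; "sales" is assumed to be synchronized to both.
systems = [
    Instance("warehouse", {"sales", "customers"}),
    Instance("appliance", {"sales", "weblogs"}),
]

print(route("sales", systems).name)   # both hold it; the first available one is chosen
systems[0].available = False
print(route("sales", systems).name)   # the query now goes to the synchronized copy
```

A real router would presumably also weigh workload and data freshness; the sketch keeps only the two criteria named in the paragraph above.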

In conjunction with Partners, Teradata announced the 2690 Data Warehouse appliance, scheduled to be available in the first quarter of 2012. The 2690 replaces the 2650 in the product line. It adds new compression capabilities and offloads compression and decompression tasks to a co-processor. With the enhanced compression the 2690 can hold more data and consume less power while managing the same workload as its predecessor. Right next to the 2690 on the exhibit floor, NetApp was showing its Hadoop appliance, announced earlier this year, and its integration with the Teradata family of big data appliances.

Another announcement says that early next year the Aster product line will be enhanced with version 5.0, the introduction of a MapReduce appliance and an adaptor for moving data between Teradata and Aster systems. The 5.0 product includes more prebuilt analytic functions based on Aster’s SQL-MapReduce capabilities and enhances workload management for balancing SQL and MapReduce tasks. In addition, the Unity framework will be extended to incorporate the MapReduce appliance, integrating the Aster products into the analytical ecosystem and making them less of an orphan in the Teradata family.

The Aster MapReduce appliance will compete with other recently announced MapReduce appliances from EMC Greenplum and Oracle. However, the Aster version is based on its own patented SQL-MapReduce implementation rather than the open source Apache Hadoop project. SQL-MapReduce has the advantage of making MapReduce more accessible to SQL programmers and SQL-based applications. However, it is not supported by the larger community surrounding Hadoop MapReduce. It remains to be seen whether this strategy will be successful on a large scale or whether SQL-MapReduce will turn out to be the black sheep of the Teradata family. Nevertheless, Teradata has an impressive set of appliances for different purposes that earn the designation of family.
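For readers less familiar with the programming-model gap mentioned above, the following Python sketch computes the same totals twice: once as explicit map and reduce steps, and once as a plain SQL `GROUP BY`. It uses neither Aster's SQL-MapReduce syntax nor Hadoop's API; the sample data and function names are invented, and SQLite stands in for a SQL engine purely for illustration.

```python
import sqlite3
from collections import defaultdict

# Made-up sample records: (region, amount).
rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def reduce_phase(pairs):
    """Group the pairs by key and sum the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def sql_totals(records):
    """The same aggregation expressed as a SQL GROUP BY (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    result = dict(conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"))
    conn.close()
    return result


print(reduce_phase(map_phase(rows)))
print(sql_totals(rows))
```

Both calls produce the same per-region totals. The point is only that a SQL programmer can state the result declaratively, while the map and reduce steps have to be written by hand.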

Regards,

David Menninger – VP & Research Director

Earlier this week EMC announced it will create its own distribution for Apache Hadoop. Hadoop provides distributed computing capabilities that enable organizations to process very large amounts of data quickly. As I have written previously, the Hadoop market continues to grow and evolve. In fact, the rate of change may be accelerating. Let’s start with what EMC announced and then I’ll address what the announcement means for the market.

 EMC announced three new offerings, slated for the third quarter of 2011, that leverage its acquisition of Greenplum last year, ranging from an open source version to incorporation in its data warehouse appliance.

The EMC Greenplum HD Community Edition is a free, open source version of the Apache Hadoop stack comprising HDFS, MapReduce, Zookeeper, Hive and HBase. EMC extends Hadoop with fault tolerance for the Name Node and Job Tracker, both of which are well-known points of failure in standard Hadoop implementations.

The EMC Greenplum HD Enterprise Edition, interface-compatible with the Apache Hadoop stack, provides several additional features including snapshots, wide-area replication, a Network File System (NFS) interface and some management tools. EMC also claims performance two to five times that of standard packaged versions of Apache Hadoop.

The EMC Greenplum HD Data Computing Appliance integrates Apache Hadoop with the Greenplum database and computing hardware. The appliance configuration provides SQL access and analytics to Hadoop data residing on the Hadoop Distributed File System (HDFS) as external tables, eliminating the need to materialize the data in the Greenplum database.

Until now Cloudera has dominated the emerging commercial Hadoop market and faced little or no competition since it introduced the Cloudera Distribution for Hadoop (CDH). The EMC announcements are both good and bad news for Cloudera. On the one hand they suggest – you might even say validate – that Cloudera has chosen a valuable market. EMC seems to be willing to invest heavily to try to get a share of it. On the other hand, Cloudera now faces a competitor that has significant resources. For customers competition is generally a good thing, of course, as it pushes vendors to innovate and improve their products to win more business.

EMC’s approach to the market differs dramatically from IBM’s strategy. IBM announced on Twitter at its Big Data Symposium held this week that it is putting all its weight behind Apache Hadoop in the hope of avoiding the fragmentation that plagued the UNIX market for years. EMC’s Enterprise Edition promises to tackle issues well known to the Hadoop market, but EMC faces competition from others who are also tackling these issues. If lower-cost or free competitive offerings adequately address these issues, they could seriously undercut the market for EMC’s Enterprise Edition. While EMC brings more enterprise credentials to the Hadoop market than Cloudera, it has less experience with Hadoop. Multiple vendors are attempting to bring enterprise-class capabilities to Hadoop, and it’s too soon to see who will succeed. However, overall, the Hadoop market will benefit from all the attention and investment.

I find it interesting and a little ironic that prior to its acquisition by EMC, Greenplum (along with Aster Data, now part of Teradata) helped popularize MapReduce, one of Hadoop’s most commonly used components, by embedding MapReduce as part of its databases. These proprietary implementations could be credited with helping to bring Hadoop into the mainstream big-data market because they combined data warehousing with MapReduce. That combination spawned a debate in which database guru Mike Stonebraker at first dismissed MapReduce and then embraced it. The debate attracted attention, a key ingredient in building any new market. Now EMC Greenplum completes the circle by embracing Hadoop.

 To its credit, EMC aligned a dozen partners around these announcements, creating an ecosystem of third-party products and services. Concurrent, CSC, Datameer, Informatica, Jaspersoft, Karmasphere, MicroStrategy, Pentaho, SAS, SnapLogic, Talend and VMware all announced their support for the EMC products in one form or another. Most of these companies also partner with Cloudera, so this is a good move but not a coup for EMC.

The Hadoop market continues to evolve. We are now analyzing the data collected in our benchmark research on the state of the large-scale data market, now called the big data market, including Hadoop. Stay tuned for the results. It will be interesting to see where the market ends up. I expect more changes and innovation driven in part by the increased competition.

 The Hadoop market is no longer a one-elephant race.

 Regards,

 David Menninger – VP & Research Director
