There has been a spate of acquisitions in the data warehousing and business analytics market in recent months. In May 2010 SAP announced an agreement to acquire Sybase, primarily for its mobility technology, while it continued to advance its own efforts with SAP HANA and BI. In July 2010 EMC agreed to acquire data warehouse appliance vendor Greenplum. In September 2010 IBM countered by acquiring Netezza, a competitor of Greenplum. In February 2011 HP, having given up on its original HP Neoview effort, announced its acquisition of analytics vendor Vertica. Even Microsoft shipped a new release of its SQL Server database in 2010 and advanced its appliance efforts. Now, less than one month later, Teradata has announced its intent to acquire Aster Data for analytics and data management. Teradata bought an 11% stake in Aster Data in September, so its purchase of the rest of the company shouldn’t come as a complete surprise. My colleague had raised the question of whether Aster Data could be the new Teradata; now it is part of Teradata instead.
All these plays have implications for how enterprises manage and use their fast-growing stores of data. We’re living in an era of large-scale data, as I wrote recently. Founded in 1979, Teradata has been a dominant player in this market for years. Teradata was a pioneer in massively parallel processing (MPP) for database systems, which I recently described; MPP is the concept behind much of today’s analytic database market, including all the recently acquired vendors mentioned above. When I worked at Oracle in the late 1990s, Teradata was the chief competitor when pursuing 1-terabyte (TB) data warehouse opportunities. Yes, managing a single terabyte was considered a significant challenge then, one that few vendors were ready to take on. Although data volumes have grown, little else has changed since those years: Oracle now competes against even more providers, despite its recent promotion of its second-generation Oracle Exadata appliance and its Oracle 11g Release 2 database at the 2010 Oracle OpenWorld.
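To make the MPP concept concrete, here is a minimal sketch of the shared-nothing idea Teradata pioneered: rows are hash-partitioned across independent "nodes", each node aggregates only its local partition, and a coordinator merges the partial results. The function names, node count and data layout are my own inventions for illustration; a real MPP database runs the nodes in parallel on separate hardware.

```python
def partition(rows, n_nodes):
    """Assign each (key, value) row to a node by hashing its key."""
    parts = [[] for _ in range(n_nodes)]
    for key, value in rows:
        parts[hash(key) % n_nodes].append((key, value))
    return parts

def node_aggregate(rows):
    """Each node sums values per key over only its own partition."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals

def mpp_sum(rows, n_nodes=4):
    """Coordinator: fan work out to partitions, then merge partial sums."""
    merged = {}
    # Here the per-node work runs sequentially; an MPP system runs it in parallel.
    for partial in map(node_aggregate, partition(rows, n_nodes)):
        # Hash partitioning puts all rows for a key on one node, so the
        # merge is a simple union of disjoint per-node results.
        merged.update(partial)
    return merged
```

Because no node ever needs another node's data, adding nodes scales the system nearly linearly, which is why every vendor named above adopted some variant of this design.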
Of course, that landscape has long since changed. Over the last few years Aster Data established itself as a player in the data warehousing market with an MPP relational database product. It also embraced the MapReduce parallel processing technology earlier than most other data warehouse vendors. MapReduce is a key concept of the increasingly significant Apache Hadoop project, and it appears that Aster Data’s proprietary implementation of MapReduce was a significant factor in Teradata’s decision to acquire it. The tone of the joint Teradata/Aster Data briefing for analysts even suggests that Teradata is attempting an end-run around Hadoop. Aster Data has successfully developed a market niche of customers analyzing unstructured data and social networks, both activities for which one might otherwise use Hadoop. The company also had success in other segments, such as financial services, marketing services, retail and e-commerce.
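For readers new to the pattern, here is a toy word count showing the MapReduce model that Hadoop implements at cluster scale and that Aster Data embedded in SQL as SQL-MapReduce. The function names are my own; real frameworks distribute the map, shuffle and reduce phases across many machines.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: collapse each key's list of values into a final count.
    return {word: sum(counts) for word, counts in groups.items()}

def word_count(documents):
    return reduce_phase(shuffle(map_phase(documents)))
```

The appeal for unstructured-data analysis is that the map and reduce functions are arbitrary user code, so the same skeleton that counts words can parse logs or traverse social graphs, which is exactly the niche Aster Data cultivated.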
Aster Data customers should benefit from the increased resources Teradata can invest in developing the product, and Teradata’s rich heritage in this space should enhance the infrastructure supporting the Aster Data product line. For example, Teradata’s workload management tools are among the best in the industry. However, even if the association with Teradata brings some of these capabilities to the Aster Data platform, the cost advantage of Aster Data over Teradata is likely to decline over time. Integration into the Teradata organization and technology stack could also detour Aster Data from its previous path of innovation, so customers may see longer intervals between releases and perhaps a less ambitious product roadmap.
Obviously Teradata has many more customers than Aster Data and is most concerned about them. They, too, might see some negative impact on development schedules, but on the plus side they instantly gain a new source of technology that could benefit them. The question, which will remain unanswered until a roadmap is published, is how quickly Teradata customers can take advantage of the innovations Aster Data has brought to market.
Teradata officials talked about Aster Data retaining some independence in operations, in the product line and perhaps in identity, but integration is the key to making the acquisition valuable to the Teradata customer base. One likely outcome favorable to current and future Teradata customers is more support for industry-standard server hardware, which was specifically mentioned as a benefit of the acquisition. Teradata customers may also benefit from the columnar database capabilities if those capabilities are ported to the main Teradata product line.
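The columnar capability mentioned above matters because analytic queries typically aggregate one or two columns of very wide tables. A minimal sketch of the difference, using an invented table, shows why the column layout wins for scans: a row store must touch every field of every record, while a column store reads a single array (which also compresses far better than mixed-type rows).

```python
# Row store: one record per row; SUM(amount) still drags every field along.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 60.0},
]
row_total = sum(r["amount"] for r in rows)

# Column store: one array per column; the same SUM reads only one array.
columns = {
    "order_id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 75.5, 60.0],
}
col_total = sum(columns["amount"])
```

Both layouts produce the same answer; the difference is how much data must move to produce it, which is the advantage Teradata customers would inherit if the capability is ported.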
There were a couple of notable omissions from the discussion as the acquisition was announced. Both Teradata and Aster Data had partnerships with SAS and Cloudera, the Hadoop vendor. The SAS relationship provided advanced analytics including statistical analyses and data mining embedded within the database. Teradata may be looking to shift to using Aster Data’s embedded analytic capabilities. With respect to Hadoop, although both companies had publicly announced partnerships with Cloudera, neither had offered any deep integration with Hadoop. I expect such an arm’s-length relationship to continue, but I suspect the new combination will put more weight behind the embedded SQL-MR capability that Aster Data developed. Teradata may be large enough to attempt such an independent strategy, but I think it would be a mistake to alienate the entire Hadoop community that I have been researching, which in my opinion is a market not served by Teradata today.
Customers of other data warehousing companies should feel little immediate negative impact from this coupling. In fact, more competition over the analysis of unstructured data could spark those companies to enhance their product offerings. As I noted at the beginning, the market for independent products has shrunk over the last six months, so the remaining vendors could see an increase in revenues as customers who were using Aster Data or others in preference to larger companies start to consider other alternatives. However, at some point the musical chairs of large-scale database acquisitions is going to stop, and when it does, if your vendor is still reliant on venture capital (VC) funding (i.e., does not yet have positive cash flow), you could have a problem. The VCs may decide to stop funding such a company, which could force it to scale back its plans.
Probably some other shoes will drop before the market shift is over. Dell had a partnership with Aster Data that is now called into question; Dell might therefore seek an alternative partner or even acquire one of the other vendors to get into the game. Potential candidates include several vendors we have been assessing: 1010Data, Calpont, Kognitio and ParAccel. I didn’t include Infobright in this list because it doesn’t have an MPP offering, which seems to be table stakes now. This latest acquisition could also lead to another round of acquisitions based on Hadoop or NoSQL technologies. Or perhaps the game may take a step in the direction of complex event processing or predictive analytics. We’ll have to wait and see, although at this rate we may not have to wait long!
David Menninger – VP & Research Director
In the weeks leading up to and as part of its Information On Demand Conference that my colleague assessed, IBM introduced version 8.5 of InfoSphere Information Server and several related product updates. As my colleague suggested earlier, IBM has an ambitious agenda to provide comprehensive information management capabilities through a combination of product development and acquisitions. The breadth of this portfolio is impressive, and InfoSphere Information Server 8.5 makes significant strides in tying the various pieces together.
In the same way that IBM has sought to unify capabilities further up the BI stack with IBM Cognos 10, as I assessed, one of the key pieces of InfoSphere 8.5 is a new interface. InfoSphere Blueprint Director provides a broader view of the process associated with data integration projects. It also improves reusability, access to metadata, life-cycle management and collaboration. At first blush Blueprint Director might look like a data integration workflow tool, but it is really much broader, extending to the reports and analytics for which the data is being prepared. Throughout the process you can access the specifications and metadata using some of the same tools that end users access, such as the Business Glossary. It will be interesting to learn how far these blueprints extend the reusability paradigm. The key will be how fine-grained they are and how well the developers have encapsulated and exposed the end points, which could occur at any step in the process if the model is granular enough.
This release includes many under-the-cover enhancements, too. Version 8.5 includes autodiscovery of warehouse schema and some of the business glossary information from existing systems. High availability via clustering and fail-over mechanisms is available in both production and development environments. The release also provides scalability and load balancing at the domain and database tiers for development teams. Parallel processing of XML data provides another area of performance improvements.
I was impressed to learn that IBM has created a shared transaction context from change-data capture all the way through the data integration and data quality stages, so you can ensure the integrity and recoverability of data as it moves through the system. And support for third-party source-code control systems is available via the SCCS standard.
If your organization uses data in multiple languages, you can now apply data-quality rules to more locales, including Argentina, Brazil, Chile, Peru, Mexico, the Netherlands, India, Traditional Chinese and Japanese Kana.
In its data warehouse portfolio, IBM announced InfoSphere Warehouse enhancements, a new Warehouse Pack and extensions to its Smart Analytics Systems. Optim Performance Manager and IBM Mashup Center are now bundled with InfoSphere Warehouse. In addition, SAS functions can execute inside the DB2 database engine. SAS has been making efforts to partner with other data warehouse vendors to provide similar capabilities. It makes for interesting bedfellows in that IBM owns SPSS, which competes with SAS, but the collaboration is good for joint SAS/IBM customers because it minimizes the need to move data around in order to perform advanced analytics.
IBM also added a third Warehouse Pack to its offerings. Supply Chain Insight comes with a predefined data model and 20 Cognos reports designed to perform analyses across vendor performance, inventory and distribution aspects of the supply chain process. Supply chain analytics are generally well understood, as my colleague noted, so a packaged offering can automate and speed the process of implementation, but you should expect additional work to be required. IBM describes these analyses as “examples” of the types of analyses you would perform and provides customization guidelines.
Announced in April, IBM’s data warehouse appliance offering, Smart Analytics, has been extended with three new systems. Two systems based on x86 architectures have been added to the lower end of the product family, and a system using IBM’s Power technology has been added, including a version with solid-state storage. The result is a product line that is nearly as broad as Teradata’s but much less unified. It will be interesting to see how the Netezza acquisition will affect these appliance offerings, assuming the deal is completed. I would expect the Netezza products to play a prominent role in future Smart Analytics offerings, so we suggest caution with respect to purchasing the existing products.
In addition to the enhancements above, IBM updated a variety of other pieces of the stack. This is where I think IBM still faces some challenges: the product set is complicated and could use some rationalization. Several updates were announced. Initiate Master Data Service 9.5 has capabilities helpful for creating and managing trusted data with external data sources and partners. It also provides a mashup composer for operating on trusted data. Traceability Server 3.0 adds more connections to master-data sources, serial number management and performance enhancements. Data Architect 7.5.3 adds support for Cognos and InfoSphere Warehouse with autodiscovery capabilities. And Guardium Data Redaction 1.1.2 supports French, German and Spanish and can provide redaction for unstructured documents in FileNet.
Somewhat buried in these announcements was a technology preview of Hadoop-based “big data” analytics software called IBM InfoSphere BigInsights, running both on premises and in the IBM Test Development cloud. It is currently available only for development purposes, but it is worth noting because it indicates the importance of Hadoop to IBM, which has even created its own distribution of Hadoop. I expect to see other vendors embracing Hadoop as well, as indicated in my earlier blog post regarding Hadoop World.
IBM is a force to be reckoned with in information management. It competes head-to-head with Oracle and SAP and also with Informatica, which has been steadily building its portfolio through acquisitions. IBM’s recent announcements help unify its stack. In contrast with the Cognos 10 portion of the stack I assessed recently, the company seems to have done a better job of integrating with third-party technologies but still has further to go in unifying the many components it offers. That said, the integration with its offerings for master data management and data warehousing makes IBM InfoSphere Information Server 8.5 worthy of consideration.
David Menninger – VP & Research Director