David Menninger's Analyst Perspectives

SnapLogic Makes Data Quality in the Cloud a Snap

Written by David Menninger | Jul 7, 2011 8:32:12 PM

SnapLogic, a cloud-based data integration vendor, has extended its product line with new data quality capabilities. This is worth comment because SnapLogic sits at the intersection of two recent trends in information management.

One is data integration vendors broadening their portfolios with previously independent capabilities such as data quality, master data management (MDM), data governance, data virtualization and application integration. IBM with InfoSphere, Informatica and SAP have the broadest information management product lines, including all of the capabilities above. Other vendors are following suit; for example, Talend has extended its open source data integration product line with data quality, MDM and application integration, as I have noted.

The other trend is the rapid growth of cloud-based data integration services. Recognizing the market opportunity here, Informatica established a separate division, Informatica Cloud, and has invested significantly to establish its presence in this segment. Coming at the market from another angle, IBM acquired a pure-play cloud vendor, Cast Iron Systems. Boomi, another cloud pure play, was acquired by a new entrant to the data integration market, Dell Systems.

These acquisitions and investments in cloud-based data integration indicate that a significant market exists for such services. Ventana Research is studying this emerging market in a benchmark research program to gauge the influence of business data in the cloud and identify the issues related to it. Having observed cloud computing’s increasingly large influence over the IT landscape, we expect to see more entrants competing in this segment.

SnapLogic is well positioned to join the battle. In a briefing earlier this year, SnapLogic CEO Gaurav Dhillon, founder and former CEO of Informatica, described the market as exploding with new data types and new data sources. SnapLogic approaches the market expecting a proliferation of data sources and has architected its product to be an intermediary or broker between different services. SnapLogic and others can author “Snaps” that connect with these sources. To date more than 100 Snaps are available in the SnapLogic marketplace, called SnapStore.

Driven in part by customer demand but also recognizing that data integration alone is not sufficient in today’s market, in this Summer 2011 release SnapLogic has added its first set of data quality capabilities. These capabilities use Trillium’s on-demand service with a Snap that formats data to interact with the data quality capabilities that Trillium offers, such as name and address verification, de-duplication and enrichment of data with other values that may not have been part of the original data set. Our recent analytics benchmark research shows that concerns about data accuracy and time spent in data preparation affect a majority of organizations, and that two-thirds of analytic time is spent on data-related tasks.

This release also includes a CouchDB Snap and a JavaScript Object Notation (JSON) Snap to help companies integrate unstructured data. CouchDB is an open source document-oriented database. Because it stores documents in their entirety and indexes them for query purposes, it can be useful for dealing with unstructured data and documents. CouchDB is accessed via a JSON API, hence the JSON Snap to go along with the CouchDB support.
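To make the point about CouchDB's JSON API concrete, here is a minimal Python sketch of how a client might prepare a document for storage. It is not SnapLogic's Snap and does not use any SnapLogic API; the server address, database name, document ID and customer record are all hypothetical. CouchDB stores a document when it receives an HTTP PUT to /{database}/{document id} with the document as a JSON body, and the sketch only builds that request without sending it, so it runs without a CouchDB server.

```python
import json
import urllib.parse
import urllib.request


def build_couchdb_put(base_url, db, doc_id, doc):
    """Build (but do not send) an HTTP PUT request that would store `doc` in CouchDB."""
    # The whole document travels as a single JSON body.
    body = json.dumps(doc).encode("utf-8")
    # Percent-encode the document ID so it is safe to place in the URL path.
    url = f"{base_url}/{db}/{urllib.parse.quote(doc_id, safe='')}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical record and server address, for illustration only.
customer = {"name": "Ada Example", "city": "Springfield", "tags": ["lead", "2011"]}
req = build_couchdb_put("http://localhost:5984", "customers", "cust-001", customer)

print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running server. The nested list in the record shows why a document store suits loosely structured data: no table schema has to be defined before the record is saved.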

The release also includes some in-memory optimizations that avoid disk-based operations and so improve performance. And SnapLogic recently added connectors for Hadoop and for Greenplum to help manage large volumes of data.

A relatively new entrant to the data integration market, SnapLogic has identified the cloud as an area of focus. Focus is a good thing for emerging companies, but to be more than a niche vendor it will need to broaden its capabilities. Data quality is a good start, but SnapLogic will need to follow it with other important aspects of information management. We’ll soon be launching a benchmark research program to assess the relative importance of the various aspects of information management, including data quality, master data management, data governance and more. I suspect we’ll see SnapLogic adding more of these necessary capabilities over time. In the meantime, you can try the existing capabilities here.

Regards,

David Menninger – VP & Research Director