David Menninger's Analyst Perspectives

MapR Takes Road Less Traveled to Big Data

Posted by David Menninger on Jan 19, 2017 8:52:07 AM

The big data market continues to evolve, as I have written previously. Vendors are attempting to differentiate their offerings as they seek to encourage customers to pay for technology that they could potentially download for free.

MapR is one of those big data vendors. It entered the market in 2011 with a Hadoop distribution that used an alternative, POSIX-based file system that also offers HDFS API compatibility. While other Hadoop distribution vendors were chasing a volume-based business by using the Apache open source community surrounding Hadoop, MapR deliberately chose a more focused strategy of targeting enterprise capabilities and features. The MapR file system was designed to provide features that were missing in the early days of Hadoop, and it continues to form the backbone of the company’s offering today. I recently attended the company’s inaugural analyst day at its headquarters in San Jose to get an update on its progress and approach to the market.

The MapR file system (MapR-FS) provides NFS access, allowing organizations to load and share files from standard file storage systems. MapR-FS provides full read-write capabilities, which typically are not available in Hadoop implementations. Using the NFS capabilities, MapR was able to provide high availability, snapshots and replication before other distributions. The company also claims NFS offers higher performance than HDFS since NFS is supported natively in the operating system. With these capabilities MapR secured early enterprise deals with companies such as American Express and comScore, which remain customers today.
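To make the read-write point concrete, here is a minimal sketch of what NFS access implies for an application: ordinary file operations, including an in-place update in the middle of a file, with no Hadoop-specific client. It is an illustration under assumptions, not MapR's documented usage. The function name `write_then_update`, the file name `events.csv` and the directory passed in are all hypothetical, and a temporary local directory stands in for an NFS-mounted cluster path.

```python
import os
import tempfile


def write_then_update(mount_dir: str) -> str:
    """Create a file, then overwrite bytes in the middle of it in place.

    `mount_dir` is a stand-in for a hypothetical NFS mount point of the
    cluster file system; any writable directory works for this sketch.
    """
    path = os.path.join(mount_dir, "events.csv")  # hypothetical file name

    # Initial load using plain file I/O (binary mode keeps offsets exact).
    with open(path, "wb") as f:
        f.write(b"id,status\n1,pending\n")

    # Random write: reopen for read/write, seek past the header and the
    # "1," prefix, and overwrite the 7-byte status field in place.
    with open(path, "r+b") as f:
        f.seek(len(b"id,status\n1,"))
        f.write(b"shipped")

    with open(path, "rb") as f:
        return f.read().decode("ascii")


if __name__ == "__main__":
    # A temporary directory stands in for the assumed NFS mount.
    with tempfile.TemporaryDirectory() as mount_dir:
        print(write_then_update(mount_dir))
```

The in-place overwrite is the step that a write-once, append-oriented file system would not allow; on a POSIX-style mount it is just a seek followed by a write.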

In 2013, MapR added MapR-DB for NoSQL database capabilities running in the Hadoop cluster. MapR-DB uses a JSON interface for document database capabilities and provides an API for HBase applications. The high availability, snapshot and replication capabilities mentioned above are also available for MapR-DB since MapR-FS is the underlying platform for both parts of the system.

Then in 2015, the company introduced MapR Streams for processing streaming event data, and it began to call the combined products the MapR Converged Data Platform since it offered batch processing via Hadoop, operational processing via NoSQL and streaming data. The last is a critical capability for Internet of Things (IoT) applications. Our recently completed IoT and operational intelligence benchmark research shows that nearly half (46%) of those implementing IoT applications consider it essential to have low or very low latency for processing events.

Like other parts of its platform, MapR Streams supports an open source API, in this case the Kafka API. Combining open source APIs with its proprietary products enables MapR to participate in and benefit from the open source ecosystem surrounding the big data market. The MapR platform also supports Apache Spark, which I have written about, and provides a SQL interface via Apache Drill.

The strategy seems to be working. Many of MapR’s early customers have continued to use its products and increased their investments in them. This approach allows MapR to have clearly differentiated products based on open source technology. However, it also creates some challenges. The divergence from the open source versions results in a smaller community for the MapR products and may cause it to be passed over by prospects who prefer to stay closer to the Apache version of Hadoop. Nevertheless, MapR has managed to establish a position as one of the top three Hadoop distributions. It also claims to be growing revenues significantly and shared some financial metrics under nondisclosure with the analysts in attendance.

MapR offers a robust platform that covers many big data requirements that otherwise often require integrating separate products or open source projects. If you are considering a big data project, I recommend evaluating whether MapR meets your needs.

Regards,

David Menninger

SVP & Research Director

Follow Me on Twitter @dmenningerVR and Connect with me on LinkedIn.

Topics: Business Intelligence, Internet of Things, Big Data, Information Optimization, Analytics

Written by David Menninger

David is responsible for the overall research direction of data, information and analytics technologies at Ventana Research covering major areas including Analytics, Big Data, Business Intelligence and Information Management along with the additional specific research categories including Information Applications, IT Performance Management, Location Intelligence, Operational Intelligence and IoT, and Data Science. David is also responsible for examining the role of cloud computing, collaboration and mobile technologies as they affect these areas. David brings to Ventana Research over twenty-five years of experience, through which he has marketed and brought to market some of the leading-edge technologies for helping organizations analyze data to support a range of action-taking and decision-making processes. Prior to joining Ventana Research, David was the Head of Business Development & Strategy at Pivotal, a division of EMC, and VP of Marketing and Product Management at Vertica Systems, Oracle, Applix, InforSense and IRI Software. David earned his MS in Business from Bentley University and a BS in Economics from the University of Pennsylvania.