In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends. He explores the relationship between data warehouses and data lakes and shares some of Ventana Research’s findings on the subject. He also provides an assessment of the risks organizations face in working with data lakes and offers recommendations for maximizing the potential of data.
Teradata recently held its annual Partners conference, which gathers several thousand customers and partners from around the world. This was the first Partners event since Vic Lund was appointed president and CEO in May. Year on year, Teradata’s revenues are down about 5 percent, which likely prompted some changes at the company. Over the past few years Teradata made several technology acquisitions and perhaps spread its resources too thin. At the event, Lund committed the company to a focus on customers, which was a significant part of Teradata’s success in the past. This commitment was well received by customers I spoke with at the event.
Ventana Research recently published its Mobile Analytics and Business Intelligence 2016 Value Index. The Value Index provides a comprehensive evaluation of vendors and their product offerings across seven categories. In performing that analysis, I realized that this software category is at a crossroads. Once an optional capability often reserved for executives, mobile analytics is becoming a requirement for business users across organizations. The blurring of lines between work and personal lives has provoked a change from single-device BI to BI on multiple devices including smartphones and tablets as well as laptops and desktops. From a platform standpoint, the adoption of HTML5 is contesting the prevalence of native mobile applications.
It’s part of my job to cover the ecosystem of Hadoop, the open source big data technology, but sometimes it makes my head spin. If this is not your primary job, how can you possibly keep up? I hope that a discussion of what I’ve found to be most important will help those who don’t have the time and energy to devote to this wide-ranging topic.
Data virtualization is not new, but it has changed over the years. The term describes a process of combining data on the fly from multiple sources rather than copying that data into a common repository such as a data warehouse or a data lake, which I have written about. There are many reasons for an organization concerned with managing its data to consider data virtualization, most stemming from the fact that the data does not have to be copied to a new location. It could, for instance, eliminate the cost of building and maintaining a copy of one of the organization’s big data sources. Recognizing these benefits, many database and data integration companies offer data virtualization products. Denodo, one of the few independent, best-of-breed vendors in this market today, brings these capabilities to big data sources and data lakes.
Predictive analytics is a rewarding yet challenging subject. In our benchmark research on next-generation predictive analytics at least half the participants reported that predictive analytics allows them to achieve competitive advantage (57%) and create new revenue opportunities (50%). Yet even more participants said that users of predictive analytics don’t have enough skills training to produce their own analyses (79%) and don’t understand the mathematics involved (66%). (In the term “predictive analytics” I include all types of data science, not just one particular type of analysis.)
Qlik helped pioneer the visual discovery market with its QlikView product. In some respects, Qlik and its competitors also spawned the self-service trend rippling through the analytics market today. Their aim was to enable business users to perform analytics for themselves rather than to build a product with the perfect set of features for IT. After establishing success with end users the company began to address more of the concerns of IT, eventually creating a robust enterprise-grade analytics platform. This approach has worked for Qlik, driving growth that led to an initial public offering in 2010. The company now generates more than half a billion dollars in revenue annually, making it one of the largest independent analytics vendors. Based on its company and products, Qlik was rated a Hot Vendor in our 2015 Value Index on Analytics and Business Intelligence and was one of the highest ranked in usability.
It has been more than five years since James Dixon of Pentaho coined the term “data lake.” His original post suggests, “If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state.” The analogy is a simple one, but in my experience talking with many end users there is still mystery surrounding the concept. In this post I’d like to clarify what a data lake is, review the reasons an organization might consider using one and the challenges it presents, and outline some developments in software tools that support data lakes.
Topics: Big Data, Data Science, Predictive Analytics, Social Media, Business Analytics, Business Intelligence, Data Governance, Data Lake, Governance, Risk & Compliance (GRC), Information Management, Uncategorized, Strata+Hadoop
I recently attended the SAS Analyst Summit in Steamboat Springs, Colo. (Twitter hashtag #SASSB). The event offers an occasion for the company to discuss its direction and to assess its strengths and potential weaknesses. SAS is privately held, so customers and prospects cannot subject its performance to the same level of scrutiny as that of public companies, and thus events like this one provide a valuable source of additional information.
Topics: Big Data, Predictive Analytics, SAS, Analytics, Business Analytics, Business Collaboration, Business Intelligence, Business Performance, Information Applications, Information Management, Uncategorized, Visualization
Last week I attended Spark Summit East 2016 at the New York Hilton Midtown. It revealed several ways in which Spark technology might impact the big data market.