We at Ventana Research recently published our research agendas for 2018. Analytics and business intelligence are evolving and so is our research on their use across practice areas. Earlier research has shown that analytics can deliver significant value to organizations; for example, our predictive analytics research shows that 57 percent of organizations reported achieving a competitive advantage and half created new revenue opportunities with predictive analytics. Waves of investment in self-service analytics have propelled the market for analytics tools, significantly empowering line-of-business organizations to create their own analytics and set their own analytic priorities. But organizations are also beginning to recognize some of the limitations of current analytics implementations, including those of self-service. Our Data Preparation Benchmark Research reveals that fewer than half (42%) of organizations are comfortable allowing business users to work with data not prepared by IT. Our research this year will continue to explore both the successes and challenges organizations face as they use analytics and BI.
Ventana Research recently published the findings of our benchmark research on Data Preparation, which examines the practices organizations use to accomplish data preparation. We view data preparation as a sequence of steps: identifying, locating and then accessing the data; aggregating data from different sources; and enriching, transforming and cleaning it to create a single uniform data set. Using data to accomplish organizational goals requires that it be prepared for use; to do this job properly, businesses need flexible tools that enable them to enrich the context of data drawn from multiple sources and collaborate on its preparation as well as ensure security and consistency. Users of data preparation tools range from analysts to operations professionals in the lines of business to IT professionals.
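The sequence of steps described above can be sketched in code. The example below is a minimal, hypothetical illustration using pandas; the source tables, column names and enrichment rule are invented for the sketch and stand in for data drawn from separate systems.

```python
import pandas as pd

# Hypothetical source data standing in for two separate systems.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": [" Alice ", "Bob", "carol"],
    "region": ["east", "WEST", "east"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "amount": [100.0, 250.0, None, 75.0],
})

# Aggregate data from different sources into one record per customer.
totals = orders.groupby("customer_id", as_index=False)["amount"].sum()
combined = crm.merge(totals, on="customer_id", how="left")

# Clean: trim whitespace, standardize case, fill missing values.
combined["name"] = combined["name"].str.strip().str.title()
combined["region"] = combined["region"].str.lower()
combined["amount"] = combined["amount"].fillna(0.0)

# Enrich: derive a new attribute from the cleaned values.
combined["tier"] = combined["amount"].apply(
    lambda a: "high" if a >= 200 else "standard")

print(combined)
```

The result is the kind of single, uniform data set described above; in practice each of these steps also involves security, consistency and collaboration concerns that a one-off script does not address.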
I recently attended SAP TechEd in Las Vegas to hear the latest from the company regarding its analytics and business intelligence offerings as well as its data management platform. The company used the event to launch SAP Data Hub and made several other data and analytics announcements that I’ll cover below.
The Strata Data Conference is changing and it’s changing in a good way. At the recent Strata Data Conference in New York, Mike Olson, chief strategy officer at Cloudera, which co-sponsored the event, commented that at prior events we used to talk about the “Hadoop zoo animals,” meaning the various components of the Hadoop ecosystem, about which I have written previously. Following last fall’s Strata event, I observed that the conference was evolving to focus on the use of data. Advancing that evolution, this year’s event focused on a particular type of usage: artificial intelligence (AI) and machine learning. The evolution from a focus on zoo animals to a focus on business value using advanced analytics shows further maturation of the big data market.
There’s been some speculation in the market that Hadoop may be disappearing. Some of this speculation has been driven by vendors that have recently downplayed Hadoop in their marketing efforts. For example, the Strata+Hadoop World conference is now known as the Strata Data Conference. The Hadoop Summit is now known as the DataWorks Summit. In Cloudera’s S-1 filing with the SEC for its initial public offering, the term “Hadoop” appears only 14 times, while the term “machine learning” appears 83 times. So, if some of the vendors that created the market appear to be pivoting away from Hadoop, does your organization need to do something similar, or is there a role for Hadoop in your IT architecture?
Recently Hortonworks announced some significant additions to its products at the DataWorks Summit. These additions reflect the fact that the big data market continues to evolve, as I have previously written.
Natural language generation (NLG), the process of generating text or narratives from a set of data values, can make analytics accessible to a broader audience. NLG narratives can be used for a variety of purposes, but in this perspective I focus on how NLG can enhance business intelligence (BI) processes. In the case of BI, NLG can be used to explain what has happened, why it is happening and even what actions to take. The NLG narratives can be understood by a broader range of business users than the tables and charts of data that are the typical output of most BI applications and analytics tools.
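As a simple illustration of the idea, the sketch below generates a one-sentence narrative from a pair of metric values using a template. The metric name, figures and wording are hypothetical; commercial NLG products use far more sophisticated linguistic and analytical models than this.

```python
# A minimal, hypothetical sketch of template-based NLG over BI-style metrics.
def narrate(metric, current, prior):
    """Turn two data values into a plain-language sentence."""
    change = current - prior
    if change == 0:
        return f"{metric} held steady at {current:,.0f}."
    direction = "rose" if change > 0 else "fell"
    pct = abs(change) / prior * 100
    return f"{metric} {direction} {pct:.1f}% to {current:,.0f} from {prior:,.0f}."

print(narrate("Quarterly revenue", 1_250_000, 1_100_000))
```

Running this prints "Quarterly revenue rose 13.6% to 1,250,000 from 1,100,000." A sentence like this can accompany, or substitute for, the chart a BI tool would otherwise display.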
Many organizations continue to struggle with preparing data for use in operational and analytical processes. We see these issues reported in our Data and Analytics in the Cloud benchmark research, where 55 percent of organizations identify data preparation as the most time-consuming task in their analytical processes. Similarly, in our Next-Generation Predictive Analytics research, 62 percent of companies report dissatisfaction because the data they need is not readily available for access or integration. In our Big Data Integration research, 52 percent report that they spend the most time in big data integration processes reviewing data for quality and consistency. And nearly half of companies (48%) report this same issue in our Internet of Things research. We are currently conducting further research into this critical issue with our Data Preparation benchmark research.
This is my second analyst perspective based on our IoT Benchmark Research. In the first, I discussed the business focus of IoT applications and some of the challenges organizations are facing. Now I’ll share some of the findings about technologies used in IoT applications and the impact those technologies appear to have on the success of users’ projects.
This year organizations of all types are embracing machine learning as if it were going out of style, or perhaps more accurately, coming into style. A quick search on LinkedIn finds more than half a million professionals with machine learning in their job titles. Machine learning is the application of specific data science algorithms that become more accurate as the system records more outcomes and processes more data. This improvement is referred to as “learning,” hence the name. There are good reasons machine learning is growing so rapidly, but there are pitfalls to avoid as well.
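To make the idea of accuracy improving with more data concrete, here is a deliberately simple, hypothetical learner written from scratch: it estimates a decision threshold from labeled examples, and its estimate tends to improve as it records more outcomes. The task and data are synthetic, invented for this sketch.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic task: the true label is 1 whenever x > 5.
def make_data(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    return [(x, int(x > 5)) for x in xs]

# "Learning": estimate the decision threshold as the midpoint
# between the mean observed value of each class.
def fit_threshold(data):
    ones = [x for x, y in data if y == 1]
    zeros = [x for x, y in data if y == 0]
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2

def accuracy(threshold, data):
    return sum(int(x > threshold) == y for x, y in data) / len(data)

holdout = make_data(1000)
t_small = fit_threshold(make_data(20))    # few outcomes recorded
t_large = fit_threshold(make_data(5000))  # many outcomes recorded
print(t_small, accuracy(t_small, holdout))
print(t_large, accuracy(t_large, holdout))
```

Real applications use algorithms such as gradient-boosted trees or neural networks rather than a midpoint estimate, but the principle is the same: the more outcomes the system records, the closer its learned model gets to the true pattern.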