In 2017 Strata + Hadoop World was changed to the Strata Data Conference. As I pointed out in my coverage of last year’s event, the focus was largely on machine learning and artificial intelligence (AI). That theme continued this year, but my impression of the event was of a community looking to get value out of data regardless of the technology being used to manage that data. The change was subtle: The location was the same; the exhibitors were largely the same; attendance was similar this year and last. But there was no particular vendor or technology dominating the event.
Topics: Analytics, Business Intelligence, data science, Big Data, Data Integration, Data Governance, Data Preparation, Information Optimization, Machine Learning, digital technology, Machine Learning and Cognitive Computing
Ventana Research recently published the findings of our benchmark research on Data Preparation, which examines the practices organizations use to accomplish data preparation. We view data preparation as a sequence of steps: identifying, locating and then accessing the data; aggregating data from different sources; and enriching, transforming and cleaning it to create a single uniform data set. Using data to accomplish organizational goals requires that it be prepared for use; to do this job properly, businesses need flexible tools that enable them to enrich the context of data drawn from multiple sources and collaborate on its preparation as well as ensure security and consistency. Users of data preparation tools range from analysts to operations professionals in the lines of business to IT professionals.
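As a rough illustration of the sequence described above (aggregating data from different sources, then enriching, transforming and cleaning it into a single uniform data set), here is a minimal Python sketch. The record layouts, field names and the `prepare` function are hypothetical and are not drawn from the benchmark research or from any particular data preparation tool.

```python
# Hypothetical records from two sources with inconsistent field names and formats.
crm_rows = [
    {"customer_id": "001", "name": " Alice ", "region": "east"},
    {"customer_id": "002", "name": "Bob", "region": None},
]
billing_rows = [
    {"cust": "001", "amount": "120.50"},
    {"cust": "002", "amount": "80"},
    {"cust": "002", "amount": "not available"},
]


def prepare(crm, billing):
    """Combine two hypothetical sources into one uniform list of records."""
    # Aggregate and clean: total the billed amount per customer,
    # skipping values that cannot be parsed as numbers.
    totals = {}
    for row in billing:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        totals[row["cust"]] = totals.get(row["cust"], 0.0) + amount

    # Transform and enrich: normalize names and regions, then attach
    # the billing total to each customer record.
    prepared = []
    for row in crm:
        prepared.append({
            "customer_id": row["customer_id"],
            "name": row["name"].strip(),
            "region": (row["region"] or "unknown").upper(),
            "total_billed": totals.get(row["customer_id"], 0.0),
        })
    return prepared


uniform = prepare(crm_rows, billing_rows)
```

In practice each step would usually be handled by a dedicated tool with security and collaboration features; the sketch only shows how the steps fit together.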
I recently attended SAP TechEd in Las Vegas to hear the latest from the company regarding its analytics and business intelligence offerings as well as its data management platform. The company used the event to launch SAP Data Hub and made several other data and analytics announcements that I’ll cover below.
The Strata Data Conference is changing and it’s changing in a good way. At the recent Strata Data Conference in New York, Mike Olson, chief strategy officer at Cloudera, which co-sponsored the event, commented that at prior events we used to talk about the “Hadoop zoo animals,” meaning the various components of the Hadoop ecosystem of which I have written previously. Following last fall’s Strata event, I observed that the conference was evolving to focus on the use of data. Advancing that evolution, this year’s event focused on a particular type of usage: artificial intelligence (AI) and machine learning. The evolution from a focus on zoo animals to a focus on business value using advanced analytics shows further maturation of the big data market.
There’s been some speculation in the market that Hadoop may be disappearing. Some of this speculation has been driven by vendors that have recently downplayed Hadoop in their marketing efforts. For example, the Strata + Hadoop World conference is now known as the Strata Data Conference. The Hadoop Summit is now known as the DataWorks Summit. In Cloudera’s S-1 filing with the SEC for its initial public offering, the term “Hadoop” appears only 14 times, while the term “machine learning” appears 83 times. So, if some of the vendors that created the market appear to be pivoting away from Hadoop, does your organization need to do something similar, or is there a role for Hadoop in your IT architecture?
Recently Hortonworks announced some significant additions to its products at the DataWorks Summit. These additions reflect the fact that the big data market continues to evolve, as I have previously written.
Many organizations continue to struggle with preparing data for use in operational and analytical processes. We see these issues reported in our Data and Analytics in the Cloud benchmark research, where 55 percent of organizations identify data preparation as the most time-consuming task in their analytical processes. Similarly, in our Next-Generation Predictive Analytics research, 62 percent of companies report that they’re unsatisfied because data needed for access or integration is not readily available. In our Big Data Integration research, 52 percent report that in working with big data integration processes, they spend the most time reviewing data for quality and consistency. And nearly half of companies (48%) report this same issue in our Internet of Things research. We are currently conducting further research into this critical issue with our Data Preparation benchmark research.
This is my second analyst perspective based on our IoT Benchmark Research. In the first, I discussed the business focus of IoT applications and some of the challenges organizations are facing. Now I’ll share some of the findings about technologies used in IoT applications and the impact those technologies appear to have on the success of users’ projects.
This year various types of organizations are embracing machine learning like it is going out of style, or maybe it would be better to say coming into style. A little investigation on LinkedIn finds more than half a million professionals with machine learning in their job title. Machine learning is the application of specific data science algorithms that become more accurate as the system records more outcomes and processes more data. This improvement is referred to as “learning,” hence the name. There are good reasons machine learning is growing so rapidly, but there are pitfalls to avoid as well.
Informatica reintroduced itself to the world at its recent customer conference, Informatica World, in San Francisco. The company took advantage of the event to showcase its new branding in an effort to change the way customers think about the company. Informatica has been providing information services in the cloud for more than a decade. Even though cloud revenue comprises a minority of Informatica’s business, in absolute terms, the revenue is significant, and company executives want the public to recognize Informatica as a leader in cloud-based data management services for enterprises. Presenters also made notable product announcements, discussed below, including the application of machine learning to the data management process.
Topics: Analytics, Business Intelligence, data science, Big Data, Data Integration, Data Governance, Data Preparation, Information Optimization, Machine Learning, Digital Technology, Machine Learning and Cognitive Computing, Cloud Computing