Big data has become an integral part of information management. Nearly all organizations have some need to access big data sources and produce actionable information for decision-makers. Recognizing this connection, we merged these two topics when we put together our recently published research agendas for 2017. As we plan our research, we focus on current technologies and how they can be used to improve an organization’s performance. We then share those results with our readers.
Topics: Big Data, data science, Data Governance, Data Integration, Data Preparation, Information Management, Internet of Things, analytics, Machine Learning and Cognitive Computing, Machine Learning Digital Technology
The business intelligence market is bounded on one side by big data and on the other side by data preparation. That is, to maximize their performance in using information, organizations have to collect and analyze ever-increasing volumes of data while the available tools constantly evolve in the big data ecosystem, which I have written about. In our benchmark research on big data analytics, about half (51%) of organizations said they want to access big data using their existing BI tools. At the same time, as I have noted, end users are demanding self-service access to data preparation capabilities to facilitate their analyses.
Teradata recently held its annual Partners conference, which gathers several thousand customers and partners from around the world. This was the first Partners event since Vic Lund was appointed president and CEO in May. Year on year, Teradata’s revenues are down about 5 percent, which likely prompted some changes at the company. Over the past few years Teradata made several technology acquisitions and perhaps spread its resources too thin. At the event, Lund committed the company to a focus on customers, which was a significant part of Teradata’s success in the past. This commitment was well received by customers I spoke with at the event.
It’s part of my job to cover the ecosystem of Hadoop, the open source big data technology, but sometimes it makes my head spin. If this is not your primary job, how can you possibly keep up? I hope that a discussion of what I’ve found to be most important will help those who don’t have the time and energy to devote to this wide-ranging topic.
Data virtualization is not new, but it has changed over the years. The term describes a process of combining data on the fly from multiple sources rather than copying that data into a common repository such as a data warehouse or a data lake, which I have written about. There are many reasons for an organization concerned with managing its data to consider data virtualization, most stemming from the fact that the data does not have to be copied to a new location. It could, for instance, eliminate the cost of building and maintaining a copy of one of the organization’s big data sources. Recognizing these benefits, many database and data integration companies offer data virtualization products. Denodo, one of the few independent, best-of-breed vendors in this market today, brings these capabilities to big data sources and data lakes.
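The idea of combining data on the fly can be illustrated with a minimal sketch. It is not how Denodo or any particular product works; the two sources, their contents, and every function name below are hypothetical stand-ins chosen for illustration. One source is a small relational table and the other is a stream of event records, and the two are joined only at query time, with nothing copied into a shared repository.

```python
import sqlite3
from typing import Iterator


def crm_source() -> Iterator[dict]:
    """Hypothetical source 1: customer records held in a relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    for cid, name in conn.execute("SELECT id, name FROM customers ORDER BY id"):
        yield {"id": cid, "name": name}
    conn.close()


def clickstream_source() -> Iterator[dict]:
    """Hypothetical source 2: event records standing in for a big data store."""
    yield from [
        {"customer_id": 1, "page": "/pricing"},
        {"customer_id": 2, "page": "/docs"},
        {"customer_id": 1, "page": "/signup"},
    ]


def virtual_customer_activity() -> list[dict]:
    """Join both sources at query time; nothing is written to a common store."""
    # Read the smaller source into a lookup table held only in memory.
    names = {row["id"]: row["name"] for row in crm_source()}
    # Stream the larger source and attach the customer name to each event.
    return [
        {"name": names[event["customer_id"]], "page": event["page"]}
        for event in clickstream_source()
        if event["customer_id"] in names
    ]


if __name__ == "__main__":
    for row in virtual_customer_activity():
        print(row)
```

Each call to the hypothetical `virtual_customer_activity` re-reads both sources, so the result always reflects their current contents. The trade-off is that query cost is paid on every request rather than once during a copy.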
Qlik helped pioneer the visual discovery market with its QlikView product. In some respects, Qlik and its competitors also spawned the self-service trend rippling through the analytics market today. Their aim was to enable business users to perform analytics for themselves rather than to build a product with the perfect set of features for IT. After establishing success with end users, the company began to address more of the concerns of IT, eventually creating a robust enterprise-grade analytics platform. This approach has worked for Qlik, driving growth that led to an initial public offering in 2010. The company now generates more than half a billion dollars in revenue annually, making it one of the largest independent analytics vendors. Based on its company and products, Qlik was rated a Hot Vendor in our 2015 Value Index on Analytics and Business Intelligence and was one of the highest ranked in usability.
It has been more than five years since James Dixon of Pentaho coined the term “data lake.” His original post suggests, “If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state.” The analogy is a simple one, but in my experience talking with many end users there is still mystery surrounding the concept. In this post I’d like to clarify what a data lake is, review the reasons an organization might consider using one and the challenges it presents, and outline some developments in software tools that support data lakes.
Topics: Big Data, data science, Predictive Analytics, Social Media, Business Analytics, Business Intelligence, Data Governance, Data lake, Governance, Risk & Compliance (GRC), Hadoop, Information Management, Uncategorized
On Monday, March 21, Informatica, a vendor of information management software, announced Big Data Management version 10.1. My colleague Mark Smith covered the introduction of v. 10.0 late last year, along with Informatica’s expansion from data integration to broader data management. Informatica’s Big Data Management 10.1 release offers new capabilities, including support for self-service data preparation for Hadoop, a hot topic, which Informatica is calling Intelligent Data Lake. The term “data lake” describes large collections of detailed data from across an organization, often stored in Hadoop. With this release Informatica seeks to add more enterprise capabilities to data lake implementations.
I recently attended the SAS Analyst Summit in Steamboat Springs, Colo. (Twitter hashtag #SASSB). The event offers an occasion for the company to discuss its direction and to assess its strengths and potential weaknesses. SAS is privately held, so customers and prospects cannot subject its performance to the same level of scrutiny they apply to public companies, and thus events like this one provide a valuable source of additional information.
Topics: Big Data, Predictive Analytics, SAS, Analytics, Business Analytics, Business Collaboration, Business Intelligence, Business Performance, Information Applications, Information Management, Uncategorized, Visualization
Last week I attended Spark Summit East 2016 at the New York Hilton Midtown. It revealed several ways in which Spark technology might impact the big data market.