Everyone talks about data quality, as they should. Our research shows that improving the quality of information is the top benefit of data preparation activities. Data quality efforts focus on clean data. Yes, clean data is important, but so is bad data. To be more accurate, the original data as recorded by an organization’s various devices and systems is important, too.
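To make that point concrete, a common practice is to treat the original records as immutable and produce cleaned data as a derived copy. The following minimal Python sketch illustrates the idea; the field names and sentinel value are hypothetical, not drawn from the research.

```python
# Minimal sketch: keep the raw records exactly as recorded and derive a
# cleaned copy, rather than cleaning in place. All names are illustrative.
raw_readings = [
    {"device_id": "sensor-7", "temp_c": 21.4},
    {"device_id": "sensor-7", "temp_c": -999.0},  # sentinel for a failed read
    {"device_id": "sensor-7", "temp_c": 22.1},
]

def clean(readings):
    """Return a cleaned derivative; the raw input is never mutated."""
    return [dict(r) for r in readings if r["temp_c"] > -900]  # drop sentinels

cleaned_readings = clean(raw_readings)
# raw_readings is untouched, so the "bad" record stays available for audit,
# debugging and reprocessing if the cleaning rules change later.
```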
The amount of data flowing into organizations is growing exponentially, creating a need to process more data more quickly than ever before. Our Data Preparation Benchmark Research shows that accessing and preparing data continues to be the most time-consuming part of making data available for analysis. This can slow down the organizational functions that depend on the analysis results. Trying to get ahead of the backlog with incremental improvements to existing approaches and traditional technologies alone can be frustrating.
Organizations are accelerating their digital transformation and looking for innovative ways to engage with customers in this new digital era of data management. The goal is to understand how to manage the growing volume of data in real time, across all sources and platforms, and use it to inform, streamline and transform internal operations. Cloud computing adoption has gained momentum over the years, with more and more organizations using applications, data, analytics and self-service business intelligence (BI) tools on cloud infrastructure to improve efficiency. However, cloud adoption typically means living with a mix of on-premises and multiple cloud-based systems in a hybrid computing environment. The challenge is to ensure that processes, applications and data can still be integrated across cloud and on-premises systems. Our research shows that organizations still have a significant requirement for on-premises data management but also a growing requirement for cloud-based capabilities.
Ventana Research recently announced its 2021 Market Agenda for data, continuing the guidance we have offered for nearly two decades to help organizations derive optimal value and improve business outcomes.
Data is becoming more valuable and more important to organizations. At the same time, organizations have become more disciplined about the data on which they rely, ensuring it is robust, accurate and governed properly. Without data integrity, organizations cannot trust the information produced by their data processes and will be discouraged from using that data, resulting in inefficiency and reduced effectiveness.
Organizations are dealing with exponentially increasing data that ranges broadly from customer-generated information and financial transactions to edge-generated data and even operational IT server logs. A combination of complex data lake and data warehouse capabilities is required to leverage this data. Our research shows that nearly three-quarters of organizations deploy both data lakes and data warehouses, but they use a variety of approaches that can be cumbersome. A single platform that provides both capabilities would help address organizations’ requirements.
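As a rough illustration of what a single platform for both might look like, here is a small Python sketch using DuckDB as a stand-in engine that can join a curated, warehouse-style table directly against raw files in a lake. The table, file path and column names are assumptions for the example, not a reference to any particular vendor’s product.

```python
# Sketch: one engine querying warehouse-style tables and raw lake files
# together. Assumes ./lake/events.parquet exists with a customer_id column.
import duckdb

con = duckdb.connect()

# A curated, warehouse-style dimension table.
con.execute("""
    CREATE TABLE customers AS
    SELECT * FROM (VALUES (1, 'Acme'), (2, 'Globex')) t(customer_id, name)
""")

# Join it directly against raw Parquet files sitting in the data lake.
rows = con.execute("""
    SELECT c.name, COUNT(*) AS events
    FROM read_parquet('lake/events.parquet') e
    JOIN customers c USING (customer_id)
    GROUP BY c.name
""").fetchall()
print(rows)
```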
Traditional on-premises data processing solutions have led to a hugely complex and expensive set of data silos where IT spends more time managing the infrastructure than extracting value from the data. Big data architectures have attempted to solve the problem with large pools of cost-effective storage, but in doing so have often created on-premises management and administration challenges. These challenges of acquiring, installing and maintaining large clusters of computing resources gave rise to cloud-based implementations as an alternative. Public cloud is becoming the new center for data as organizations migrate from static on-premises IT architectures to global, dynamic and multi-cloud architectures.
Organizations are always looking to improve their ability to use data and AI to gain meaningful and actionable insights into their operations, services and customer needs. But unlocking value from data requires multiple analytics workloads, data science tools and machine learning algorithms to run against the same diverse data sets. Organizations still struggle with limited data visibility and insufficient insights, which often stem from causes such as analytic workloads running independently, data being spread across multiple data centers and data governance constraints. In our ongoing benchmark research project, we are researching the ways in which organizations work with big data and the challenges they face.
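To illustrate what running multiple workloads against the same data set can look like, here is a small, hypothetical Python sketch in which a BI-style aggregation and a machine learning model both run against one shared data set rather than separate copies. The columns and figures are invented for the example.

```python
# Sketch: two analytic workloads sharing one data set instead of silos.
import pandas as pd
from sklearn.linear_model import LinearRegression

# One shared data set; all columns and values are hypothetical.
df = pd.DataFrame({
    "region":   ["east", "east", "west", "west"],
    "ad_spend": [100.0, 150.0, 120.0, 180.0],
    "revenue":  [420.0, 610.0, 500.0, 700.0],
})

# Workload 1: a BI-style aggregation.
revenue_by_region = df.groupby("region")["revenue"].sum()

# Workload 2: a machine learning model fit on the very same frame.
model = LinearRegression().fit(df[["ad_spend"]], df["revenue"])
forecast = model.predict(pd.DataFrame({"ad_spend": [200.0]}))[0]

print(revenue_by_region)
print(f"predicted revenue at 200 ad spend: {forecast:.1f}")
```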
In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends. He explores the relationship between data warehouses and data lakes and shares some of Ventana Research’s findings on the subject. He also provides an assessment of the risks organizations face in working with data lakes and offers recommendations for maximizing the potential of data.
I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data. Here are some of the trends I see continuing to impact data architectures.