Data is becoming more valuable and more important to organizations. At the same time, organizations have become more disciplined about the data on which they rely to ensure it is robust, accurate and properly governed. Without data integrity, organizations cannot trust the information produced by their data processes and will be discouraged from using that data, resulting in inefficiencies and reduced effectiveness.
Organizations are dealing with exponentially increasing volumes of data, ranging from customer-generated information and financial transactions to edge-generated data and even operational IT server logs. A combination of complex data lake and data warehouse capabilities is required to leverage this data. Our research shows that nearly three-quarters of organizations deploy both data lakes and data warehouses, but they use a variety of approaches that can be cumbersome. A single platform that can provide both capabilities will help address organizations’ requirements.
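To make the single-platform idea concrete, here is a minimal sketch using DuckDB, my choice of engine rather than one named in the research, as a stand-in for a platform that applies warehouse-style SQL directly to lake-style files. The file and column names are hypothetical.

```python
# Sketch: one engine covering both the "lake" side (open files) and the
# "warehouse" side (SQL analytics). All names here are hypothetical.
import duckdb

con = duckdb.connect()

# Simulate lake storage: land raw events as a Parquet file.
con.execute("""
    COPY (
        SELECT * FROM (VALUES
            ('cust-1', 'purchase', 120.00),
            ('cust-2', 'purchase',  35.50),
            ('cust-1', 'refund',   -20.00)
        ) AS t(customer_id, event_type, amount)
    ) TO 'events.parquet' (FORMAT PARQUET)
""")

# Warehouse-style analytics run directly against the lake file,
# with no separate load step into a warehouse schema.
rows = con.execute("""
    SELECT customer_id, SUM(amount) AS net_spend
    FROM 'events.parquet'
    GROUP BY customer_id
    ORDER BY net_spend DESC
""").fetchall()

for customer_id, net_spend in rows:
    print(customer_id, net_spend)
```

The point of the sketch is the absence of a copy step: the same files serve as the cost-effective landing zone and the target of analytic queries, which is the convergence the research describes.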
Traditional on-premises data processing solutions have led to a hugely complex and expensive set of data silos where IT spends more time managing the infrastructure than extracting value from the data. Big data architectures have attempted to solve the problem with large pools of cost-effective storage, but in doing so have often created on-premises management and administration challenges. These challenges of acquiring, installing and maintaining large clusters of computing resources gave rise to cloud-based implementations as an alternative. Public cloud is becoming the new center for data as organizations migrate from static on-premises IT architectures to global, dynamic and multi-cloud architectures.
Organizations are always looking to improve their ability to use data and AI to gain meaningful, actionable insights into their operations, services and customer needs. But unlocking value from data requires multiple analytics workloads, data science tools and machine learning algorithms to run against the same diverse data sets. Organizations still struggle with limited data visibility and insufficient insights, which often stem from a variety of causes: analytic workloads running independently, data spread across multiple data centers and inconsistent data governance. In our ongoing benchmark research, we are examining the ways in which organizations work with big data and the challenges they face.
In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends. He explores the relationship between data warehouses and data lakes and shares some of Ventana Research’s findings on the subject. He also provides an assessment of the risks organizations face in working with data lakes and offers recommendations for maximizing the potential of data.
I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data. Here are some of the trends I see continuing to impact data architectures.
Ventana Research recently announced its 2020 research agenda for data, continuing the guidance we’ve offered for nearly two decades to help organizations derive optimal value and improve business outcomes. Data volumes continue to grow while data latency requirements continue to shrink. Meanwhile, virtually every organization is confronting a need for good data governance.
This year, I attended Informatica World 2019, Informatica's annual user conference. The main focus was on the cloud, with a heavy dose of AI. Within that focus, Informatica's conference emphasized capabilities across six areas (all strong areas for Informatica): data integration, data management, data quality and governance, master data management (MDM), data cataloging, and data security.
The emerging internet of things (IoT) is an extension of digital connectivity to devices and sensors in homes, businesses, vehicles and potentially almost anywhere. This innovation means that virtually any appropriately designed device can generate and transmit data about its operations, which can facilitate monitoring and a range of automatic functions. To do this, IoT requires a set of event-centered information and analytic processes that enable people to use that event information to make optimal decisions and act effectively.
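As a hedged illustration of event-centered processing, the sketch below assumes a simple stream of sensor readings: each event updates a per-sensor sliding window, and a rolling average crossing a threshold triggers an action. The device names, window size and threshold are all hypothetical.

```python
# Minimal sketch of event-centered IoT analytics: fold each event into a
# per-sensor sliding window and act when a rolling average crosses a
# threshold. All names and values are hypothetical.
from collections import deque

WINDOW_SIZE = 5         # number of recent readings kept per sensor
ALERT_THRESHOLD = 75.0  # rolling average that should trigger action

windows: dict[str, deque] = {}

def handle_event(sensor_id: str, reading: float) -> None:
    """Update the sensor's window and decide whether action is needed."""
    window = windows.setdefault(sensor_id, deque(maxlen=WINDOW_SIZE))
    window.append(reading)
    rolling_avg = sum(window) / len(window)
    if rolling_avg > ALERT_THRESHOLD:
        # In a real deployment this might open a ticket or throttle a device.
        print(f"ALERT {sensor_id}: rolling average {rolling_avg:.1f} exceeds threshold")

# Simulated event stream; in practice events would arrive from a message bus.
events = [("pump-1", 70.0), ("pump-1", 74.0), ("pump-1", 78.0),
          ("pump-1", 80.0), ("pump-1", 82.0), ("pump-2", 65.0)]
for sensor_id, reading in events:
    handle_event(sensor_id, reading)
```

The design choice worth noting is that the analytic state lives with the event handler rather than in a batch job, which is what lets the decision (the alert) happen as the event arrives rather than after the fact.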
Organizations now must store, process and use data of significantly greater volume and variety than in the past. These factors plus the velocity of data today — the unrelentingly rapid rate at which it is generated, both in enterprise systems and on the internet — add to the challenge of getting the data into a form that can be used for business tasks.