The technology industry throws around a lot of similar terms with different meanings as well as entirely different terms with similar meanings. In this post, I don’t want to debate the meanings and origins of different terms; rather, I’d like to highlight a technology weapon that you should have in your data management arsenal. We currently refer to this technology as data virtualization. Other similar terms you may have heard include data fabric, data mesh and [data] federation. I’ll briefly discuss these terms and how I see them being used, but ultimately, I’d like to share with you some research that shows why data virtualization can be valuable, regardless of what you call it.
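To make the core idea of data virtualization and federation concrete, here is a minimal, purely illustrative sketch. Real data virtualization platforms federate heterogeneous sources (warehouses, lakes, applications); in this toy version, two SQLite files stand in for two separate source systems, and a single connection queries both in place without copying the data into a central store. All file names, table names and values are hypothetical.

```python
import os
import sqlite3
import tempfile

# Two hypothetical "source systems", each modeled as its own SQLite file.
tmp = tempfile.mkdtemp()
crm_path = os.path.join(tmp, "crm.db")
orders_path = os.path.join(tmp, "orders.db")

db = sqlite3.connect(crm_path)
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customers VALUES (1, 'Acme Corp')")
db.commit()
db.close()

db = sqlite3.connect(orders_path)
db.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
db.execute("INSERT INTO orders VALUES (100, 1, 250.0)")
db.commit()
db.close()

# The "virtual" layer: one connection that joins across both sources
# where they live, rather than first consolidating them.
conn = sqlite3.connect(crm_path)
conn.execute("ATTACH DATABASE ? AS orders_db", (orders_path,))
rows = conn.execute(
    "SELECT c.name, o.total "
    "FROM customers c JOIN orders_db.orders o ON o.customer_id = c.id"
).fetchall()
print(rows)  # [('Acme Corp', 250.0)]
```

The design point is the last query: the consumer sees one logical schema and never needs to know, or care, that the data lives in two places.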
Collibra is a data governance software company that offers tools for metadata management and data cataloging. The software enables organizations to find data quickly, identify its source and ensure its integrity. Line-of-business workers can use it to create, review and update the organization's policies on different data assets. Collibra's software uses a microservice architecture and open application programming interfaces to connect to various data ecosystems. Its data intelligence cloud platform can automatically classify data from various sources such as online transaction processing databases, master repositories and Excel files without moving the data, so the information assets stay protected.
RapidMiner is a visual enterprise data science platform that includes data extraction, data mining, deep learning, artificial intelligence and machine learning (AI/ML) and predictive analytics. It can support AI/ML processes with data preparation, model validation, results visualization and model optimization. RapidMiner Studio is its visual workflow designer for the creation of predictive models. It offers more than 1,500 algorithms and functions in its library, along with templates, for common use cases including customer churn, predictive maintenance and fraud detection. It has a drag-and-drop visual interface and can connect to databases, enterprise data warehouses, data lakes, cloud storage, business applications and social media. The platform also supports push-down processing for data preparation and ETL inside databases to minimize data movement and optimize performance.
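Push-down processing in general can be illustrated with a small sketch (this is a generic example, not RapidMiner's implementation): instead of pulling every row into the client and aggregating there, the aggregation is "pushed down" into the database engine, which returns only the small result set. The table and values are hypothetical.

```python
import sqlite3

# A hypothetical sales table inside the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# Without push-down: every row crosses the wire, and the client aggregates.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount

# With push-down: the database aggregates in place and returns two rows,
# not the whole table -- less data movement, same answer.
pushed = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

print(totals == pushed)  # True
```

At three rows the difference is invisible; at billions of rows, shipping only the grouped totals instead of the raw table is the entire point.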
The annual Ventana Research Digital Innovation Awards showcase advances in the productivity and potential of business applications, as well as technology that contributes significantly to the improved processes and performance of an organization. Our goal is to recognize technology and vendors that have introduced noteworthy digital innovations to advance business and IT.
Everyone talks about data quality, as they should. Our research shows that improving the quality of information is the top benefit of data preparation activities. Data quality efforts are focused on clean data. Yes, clean data is important, but so is bad data. To be more accurate, the original data as recorded by an organization's various devices and systems is important.
The amount of data flowing into organizations is growing exponentially, creating a need to process more data more quickly than ever before. Our Data Preparation Benchmark Research shows that accessing and preparing data continues to be the most time-consuming part of making data available for analysis. This can slow down the organizational functions that depend on the analysis results. Trying to get ahead of the backlog with incremental improvements to existing approaches and traditional technologies alone can be frustrating.
Organizations are accelerating their digital transformation and looking for innovative ways to engage with customers in this new digital era of data management. The goal is to understand how to manage the growing volume of data in real time, across all sources and platforms, and use it to inform, streamline and transform internal operations. Over the years, cloud computing has gained momentum, with more and more organizations running applications, data, analytics and self-service business intelligence (BI) tools on cloud infrastructure to improve efficiency. However, cloud adoption means living with a mix of on-premises and multiple cloud-based systems in a hybrid computing environment. The challenge is to ensure that processes, applications and data can still be integrated across cloud and on-premises systems. Our research shows that organizations still have a significant requirement for on-premises data management but also have a growing requirement for cloud-based capabilities.
Ventana Research recently announced its 2021 Market Agenda for data, continuing the guidance we have offered for nearly two decades to help organizations derive optimal value and improve business outcomes.
Data is becoming more valuable and more important to organizations. At the same time, organizations have become more disciplined about the data on which they rely, to ensure it is robust, accurate and governed properly. Without data integrity, organizations cannot trust the information produced by their data processes and will be discouraged from using that data, resulting in inefficiencies and reduced effectiveness.
Organizations are dealing with exponentially increasing data that ranges broadly from customer-generated information and financial transactions to edge-generated data and even operational IT server logs. A combination of complex data lake and data warehouse capabilities is required to leverage this data. Our research shows that nearly three-quarters of organizations deploy both data lakes and data warehouses but are using a variety of approaches that can be cumbersome. A single platform that can provide both capabilities will help address organizations' requirements.