There is a fundamental flaw in information technology, or at least in the way it is most commonly delivered. Most technology systems are developed under the assumption that all people will use the system in largely the same way. Sure, some options are built in — perhaps the same action can be initiated by clicking a button, selecting a menu item or invoking a keyboard shortcut. The problem is that when every variation must be coded into the system, providing personalized software to every individual becomes impractical.
Organizations have been using data virtualization to collect and integrate data from various sources, and in different formats, to create a single source of truth without redundancy or overlap, thus improving and accelerating decision-making and giving them a competitive advantage in the market. Our research shows that data virtualization is popular in the big data world. One-quarter (27%) of participants in our Data Lake Dynamic Insights Research reported they were currently using data virtualization, and nearly half (46%) planned to adopt it in the future. Even more interesting, those using data virtualization reported higher rates of satisfaction with their data lake (79%) than those not using it (36%). Our Analytics and Data Benchmark Research shows more than one-third of organizations (37%) are using data virtualization in that context. Here, too, those using data virtualization reported higher levels of satisfaction (88%) than those that are not (66%).
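To make the idea concrete, here is a minimal sketch of what a virtualization layer does conceptually: it presents one logical record assembled at query time from live sources rather than copying data into a new store. The sources, helper names and schema below are hypothetical illustrations, not any vendor's API.

```python
# A minimal sketch of the data virtualization idea: one logical record,
# assembled at query time from live sources, with nothing copied or stored.
# All names here (the database file, the export file, the helpers) are
# hypothetical stand-ins.
import json
import sqlite3

def fetch_from_warehouse(customer_id):
    # Stand-in for a relational source; in practice this would query the warehouse.
    conn = sqlite3.connect("warehouse.db")
    row = conn.execute(
        "SELECT name, segment FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    conn.close()
    return {"name": row[0], "segment": row[1]} if row else {}

def fetch_from_crm_export(customer_id):
    # Stand-in for a semi-structured source, such as a JSON API or file export.
    with open("crm_export.json") as f:
        records = json.load(f)
    return records.get(str(customer_id), {})

def unified_customer_view(customer_id):
    """Merge the source records on demand; no duplicate copy is created."""
    record = {}
    record.update(fetch_from_warehouse(customer_id))
    record.update(fetch_from_crm_export(customer_id))
    return record
```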
I have written previously that the world of data and analytics will become more and more centered on real-time, streaming data. Data is created constantly, and increasingly it is collected the moment it is created. Technology advances now enable organizations to process and analyze information as it is being collected, responding in real time to opportunities and threats. Not all use cases require real-time analysis and response, but many do, including several that can improve customer experiences. For example, best-in-class e-commerce interactions should provide real-time updates on inventory status to avoid stock-out or back-order situations. Customer service interactions should provide real-time recommendations that minimize the time to resolution. Location-based offers should be targeted at the customer’s current location, not their location several minutes ago. Another domain where real-time analyses are critical is internet of things (IoT) applications. Use cases like predictive maintenance, for example, require timely information to prevent equipment failures, helping organizations avoid additional costs and damage.
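As a simple illustration of the predictive maintenance case, the sketch below evaluates each sensor reading the moment it arrives rather than in a later batch. The event source, window size, threshold and downstream action are all hypothetical assumptions.

```python
# Illustrative sketch of acting on data as it is collected instead of in batch.
# The threshold, window size and schedule_maintenance() action are hypothetical.
from collections import deque
from statistics import mean

WINDOW_SIZE = 20          # evaluate the last 20 readings
ALERT_THRESHOLD = 80.0    # e.g. a vibration level suggesting imminent failure

window = deque(maxlen=WINDOW_SIZE)

def on_sensor_reading(value):
    """Called for every incoming reading; reacts in real time."""
    window.append(value)
    if len(window) == WINDOW_SIZE and mean(window) > ALERT_THRESHOLD:
        schedule_maintenance()

def schedule_maintenance():
    # Hypothetical downstream action triggered before the equipment fails.
    print("Rolling average above threshold; scheduling maintenance check.")
```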
Organizations of all sizes are dealing with exponentially increasing data volumes and data sources, which create challenges such as siloed information, increased technical complexity across various systems and slow reporting of important business metrics. Migrating to the cloud does not solve the problems associated with performing analytics and business intelligence on data stored in disparate systems. Also, processing large volumes of data requires clusters of servers with hundreds or thousands of nodes, which can be difficult to administer. Our Analytics and Data Benchmark Research shows that organizations have concerns about current analytics and BI technology. Findings include difficulty integrating data with other business processes, systems that are not flexible enough to scale operations and trouble accessing data from various data sources.
TIBCO is a large, independent cloud computing and data analytics software company that offers integration, analytics, business intelligence and event processing software. It enables organizations to analyze streaming data in real time and provides the capability to automate analytics processes. It offers more than 200 connectors, more than 200 enterprise cloud computing and application adapters, and support for more than 30 NoSQL databases, relational database management systems and data warehouses.
Talend is a data integration and management software company that offers applications for cloud computing, big data integration, application integration, data quality and master data management. The platform enables personnel to work with relational databases, Apache Hadoop, Spark and NoSQL databases for cloud or on-premises jobs. Talend data integration software offers an open and scalable architecture and can be integrated with multiple data warehouses, systems and applications to provide a unified view of all data. Its code generation architecture uses a visual interface to create Java or SQL code.
Databricks is a data engineering and analytics cloud platform built on Apache Spark that processes and transforms large volumes of data and offers data exploration capabilities through machine learning models. It enables data engineers, data scientists, analysts and other workers to process big data and unify analytics through a single interface. The platform supports streaming data, SQL queries, graph processing and machine learning. It also offers a collaborative user interface — the workspace — where workers can create data pipelines in multiple languages, including Python, R, Scala and SQL, and train and prototype machine learning models.
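As a brief illustration of the kind of pipeline such a platform supports, here is a sketch using standard Apache Spark (PySpark) DataFrame and SQL APIs. The input path and column names are hypothetical, and this is generic Spark code rather than anything Databricks-specific.

```python
# A short PySpark sketch of the pattern described above: batch data loaded,
# transformed with the DataFrame API, then queried in plain SQL in the same
# session. The path and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

orders = spark.read.json("/data/orders.json")  # hypothetical input path

# DataFrame transformation: daily revenue per region.
daily = (
    orders
    .withColumn("day", F.to_date("order_ts"))
    .groupBy("region", "day")
    .agg(F.sum("amount").alias("revenue"))
)

# The same result set is also available to SQL users in the same session.
daily.createOrReplaceTempView("daily_revenue")
spark.sql(
    "SELECT region, AVG(revenue) AS avg_daily FROM daily_revenue GROUP BY region"
).show()
```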
Access to external data can provide a competitive advantage. Our research shows that more than three-quarters (77%) of participants consider external data to be an important part of their machine learning (ML) efforts. The most important external data source identified is social media, followed by demographic data from data brokers. Organizations also identified government data, market data, environmental data and location data as important external data sources. External data is not just part of ML analyses, though. Our research shows that external data sources are also a routine part of data preparation processes, with 80% of organizations incorporating one or more external data sources. A similar proportion of participants in our research (84%) include external data in their data lakes.
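As a simple illustration of folding an external source into data preparation, the sketch below enriches internal customer records with third-party demographic data using pandas. The file names, columns and join key are illustrative assumptions.

```python
# Hedged illustration of incorporating an external source during data
# preparation: internal records enriched with broker demographic data.
# File names, columns and the join key are hypothetical.
import pandas as pd

customers = pd.read_csv("customers.csv")                # internal source
demographics = pd.read_csv("broker_demographics.csv")   # external source

# Left join keeps every internal record, enriched where the broker has a match.
enriched = customers.merge(demographics, on="postal_code", how="left")
enriched.to_parquet("customers_enriched.parquet")       # feed downstream ML
```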
Sisu Data is an analytics platform for structured data that uses machine learning and statistical analysis to automatically monitor changes in data sets and surface explanations. It can prioritize facts based on their impact and provide a detailed, interpretable context to refine and support conclusions. The product features fact boards, annotations and the ability to share facts and analysis across teams. Data teams and analysts start by creating common definitions of key performance indicators, which Sisu then utilizes to automatically test thousands of hypotheses to identify differences between groups.
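Sisu's algorithms are proprietary, so the sketch below illustrates only the general technique the paragraph describes: testing many segment hypotheses against a KPI and ranking them by impact. The function, column names and size weighting are illustrative assumptions, not Sisu's actual method.

```python
# Not Sisu's algorithm; a minimal sketch of the general technique: for each
# categorical segment, compare a KPI against the overall average and rank the
# differences by impact. Column names are hypothetical.
import pandas as pd

def rank_segment_impacts(df, kpi_col, segment_cols):
    """Rank (column, value) segments by how far their KPI mean diverges."""
    overall = df[kpi_col].mean()
    facts = []
    for col in segment_cols:
        for value, group in df.groupby(col):
            diff = group[kpi_col].mean() - overall
            # Weight by group size so tiny segments do not dominate the ranking.
            facts.append((col, value, diff * len(group)))
    return sorted(facts, key=lambda f: abs(f[2]), reverse=True)

# Example: which region or device segments most move the conversion rate?
# top_facts = rank_segment_impacts(sessions, "converted", ["region", "device"])
```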
We are happy to share some insights about Amazon QuickSight drawn from our latest Value Index research, which assesses how well vendors’ offerings meet buyers’ requirements.