Organizations today hold huge volumes of data across various cloud and on-premises systems, and those volumes keep growing by the second. To derive value from this data, organizations must query it regularly and share insights with relevant teams and departments. Automating this process using natural language processing (NLP) and artificial intelligence and machine learning (AI/ML) enables line-of-business personnel to query the data faster, generate reports themselves without depending on IT, and make quick decisions. Some organizations have started using NLP in self-service analytics to quickly identify patterns and simplify data visualization. Our Analytics and Data Benchmark Research finds that about 81% of organizations expect to use natural language search for analytics to make timely and informed decisions.
Organizations today are working with multiple applications and systems, including enterprise resource planning (ERP), customer relationship management (CRM), supply chain management (SCM) and other systems, where data can easily become fragmented and siloed. As an organization adds data sources, systems and custom applications, it becomes challenging to manage the data consistently and keep data definitions up to date. This increases the need for master data management (MDM) software that can provide a single source of truth to drive accurate analytics and business operations.
Talend is a data integration and management software company that offers applications for cloud computing, big data integration, application integration, data quality and master data management. The platform enables personnel to work with relational databases, Apache Hadoop, Spark and NoSQL databases for cloud or on-premises jobs. Talend data integration software offers an open and scalable architecture and can be integrated with multiple data warehouses, systems and applications to provide a unified view of all data. Its code generation architecture uses a visual interface to create Java or SQL code.
Databricks is a data engineering and analytics cloud platform built on top of Apache Spark that processes and transforms huge volumes of data and offers data exploration capabilities through machine learning models. It enables data engineers, data scientists, analysts and other workers to process big data and unify analytics through a single interface. The platform supports streaming data, SQL queries, graph processing and machine learning. It also offers a collaborative user interface, the workspace, where workers can create data pipelines in multiple languages, including Python, R, Scala and SQL, and train and prototype machine learning models.
Access to external data can provide a competitive advantage. Our research shows that more than three-quarters (77%) of participants consider external data to be an important part of their machine learning (ML) efforts. The most important external data source identified is social media, followed by demographic data from data brokers. Organizations also identified government data, market data, environmental data and location data as important external data sources. External data is not just part of ML analyses, though. Our research shows that external data sources are also a routine part of data preparation processes, with 80% of organizations incorporating one or more external data sources. A similar proportion of participants in our research (84%) include external data in their data lakes.
Alteryx is a data analytics software company that offers data preparation and analytics tools to simplify and automate data wrangling, data cleaning and modeling processes, enabling line-of-business personnel to quickly access, manipulate, analyze and output data. The platform features tools to run a variety of analytic functions such as diagnostic, predictive, prescriptive and geospatial analytics in a unified platform, and can connect to various data warehouses, cloud applications, spreadsheets and other sources.
Sisu Data is an analytics platform for structured data that uses machine learning and statistical analysis to automatically monitor changes in data sets and surface explanations. It can prioritize facts based on their impact and provide detailed, interpretable context to refine and support conclusions. The product features fact boards, annotations and the ability to share facts and analysis across teams. Data teams and analysts start by creating common definitions of key performance indicators, which Sisu then uses to automatically test thousands of hypotheses to identify differences between groups.
RapidMiner is a visual enterprise data science platform that includes data extraction, data mining, deep learning, artificial intelligence and machine learning (AI/ML) and predictive analytics. It can support AI/ML processes with data preparation, model validation, results visualization and model optimization. RapidMiner Studio is its visual workflow designer for the creation of predictive models. The platform offers more than 1,500 algorithms and functions in its library, along with templates for common use cases including customer churn, predictive maintenance and fraud detection. It has a drag-and-drop visual interface and can connect to databases, enterprise data warehouses, data lakes, cloud storage, business applications and social media. The platform also supports push-down processing for data prep and ETL inside databases to minimize data movement and optimize performance.
Confluent Platform is a streaming platform built by the original creators of Apache Kafka. It enables organizations to organize and manage streaming data from various sources. Confluent launched its IPO in June this year and raised $828 million to further expand its business. Confluent Platform was brought to several public cloud vendor marketplaces last year as Confluent Cloud. The offering is currently available in the Azure, AWS and GCP marketplaces. Furthermore, the company strengthened its partnership with Microsoft at the beginning of this year, establishing Confluent Cloud as a fully managed Apache Kafka service directly available on Microsoft Azure. Azure customers get access to an extensive library of pre-built connectors, a unified billing model with options to use Azure committed spend on Confluent Cloud, and deeper integrations with Azure services.
The annual Ventana Research Digital Innovation Awards showcase advances in the productivity and potential of business applications, as well as technology that contributes significantly to the improved processes and performance of an organization. Our goal is to recognize technology and vendors that have introduced noteworthy digital innovations to advance business and IT.