David Menninger's Analyst Perspectives

Use External Data Platform to Improve Analytics

Posted by David Menninger on Oct 19, 2021 3:00:00 AM

Access to external data can provide a competitive advantage. Our research shows that more than three-quarters (77%) of participants consider external data to be an important part of their machine learning (ML) efforts. The most important external data source identified is social media, followed by demographic data from data brokers. Organizations also identified government data, market data, environmental data and location data as important external data sources. External data is not just part of ML analyses though. Our research shows that external data sources are also a routine part of data preparation processes, with 80% of organizations incorporating one or more external data sources. And a similar proportion of participants in our research (84%) include external data in their data lakes.

External data enriches the customer data an organization collects, enabling better segmentation of the customer base and, ultimately, a better customer experience. External data also enriches artificial intelligence and machine learning (AI/ML), enhancing feature engineering for better accuracy in the models. Driver-based planning can be improved with external data, such as economic data, that may impact operations and capital costs. Organizations can also use external data to benchmark their performance relative to their peers and competitors. External data provides the basis for market share analysis and helps determine the size of the addressable market.

Our research underscores the value of external data. When we compare the benefits reported in our Machine Learning Dynamic Insights research by organizations using external data with those reported by organizations that are not, we see a significant difference. For example, 31% of organizations using external data report they have improved customer experience versus 7% of those that do not report using external data. Similarly, 28% of those using external data report they have achieved a competitive advantage compared with 9% of those that are not. Correlation is not necessarily causation, but there is a clear correlation in the data.
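To make the size of these differences concrete, the arithmetic can be sketched in a few lines of Python. The percentages are the ones reported above; the function name and the choice to show both a percentage-point gap and a ratio are illustrative assumptions, not part of the research methodology.

```python
# Percentages reported above: (using external data, not using external data).
reported_benefits = {
    "improved customer experience": (31, 7),
    "competitive advantage": (28, 9),
}


def compare(with_external, without_external):
    """Return the percentage-point gap and the ratio between two shares."""
    gap = with_external - without_external
    ratio = with_external / without_external
    return gap, ratio


for benefit, (with_ext, without_ext) in reported_benefits.items():
    gap, ratio = compare(with_ext, without_ext)
    print(f"{benefit}: {gap} points higher, about {ratio:.1f}x as likely")
```

Neither figure says anything about causation; both simply describe how far apart the two groups are.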

Despite the value it provides, acquiring relevant external data presents several challenges. It can be expensive and hard to access: with hundreds of data providers, data aggregators and data marketplaces to choose from, finding and acquiring relevant data can be overwhelming. External data is also tedious to use, as manipulating, transforming and matching it to internal data takes significant time and effort. Finally, the data can be out of date and can pose compliance and governance risks.

Organizations can attempt to acquire and manage the external data they need, or they can work with an external data provider. An external data platform provider manages the information, keeps it current and makes it available as necessary. Ideally, the platform would also assist with analyses by automatically discovering features of the data and their potential impacts on ML models. The platform should help match and validate multiple external data sources with each other and with internal data sources. Where data needs to be loaded into internal systems, the data vendor can provide various ways to export data or analyses, such as ML models based on the data.
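As a rough illustration of what matching and validating external data against internal data involves, the following is a minimal sketch using only the Python standard library. All record values, field names (`region_code`, `median_income`) and function names are hypothetical and invented for this example; a real platform would handle far messier keys and many more sources.

```python
# Hypothetical internal records; values are made up for illustration.
internal_customers = [
    {"customer_id": 1, "region_code": " a1"},
    {"customer_id": 2, "region_code": "B2"},
    {"customer_id": 3, "region_code": "Z9"},
]

# Hypothetical external demographic data keyed by the same region code.
external_demographics = [
    {"region_code": "A1", "median_income": 50000},
    {"region_code": "B2", "median_income": 65000},
]


def normalize_key(value):
    """Normalize a join key so trivial formatting differences still match."""
    return str(value).strip().upper()


def enrich(internal, external, key):
    """Attach external attributes to internal rows; report rows with no match."""
    lookup = {normalize_key(row[key]): row for row in external}
    matched, unmatched = [], []
    for row in internal:
        external_row = lookup.get(normalize_key(row[key]))
        if external_row is None:
            unmatched.append(row)
            continue
        merged = dict(row)
        # Copy every external field except the join key itself.
        merged.update({k: v for k, v in external_row.items() if k != key})
        matched.append(merged)
    return matched, unmatched


matched, unmatched = enrich(internal_customers, external_demographics, "region_code")
match_rate = len(matched) / len(internal_customers)
print(f"matched {len(matched)} of {len(internal_customers)} rows ({match_rate:.0%})")
```

Tracking the unmatched rows and the match rate is a simple form of validation: a low match rate suggests the external source does not cover the organization's customers well, or that the keys need more cleaning before the data is used.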

Organizations that are not using external data should strongly consider how this data could enhance their financial, operational and analytical processes. As organizations utilize external data, they should address the issues associated with purchasing, storing, accessing and maintaining this data. Each organization will need to address these issues with approaches appropriate for its situation, but should ensure its strategy encourages rather than discourages the use of external data.


David Menninger

Topics: Analytics, Business Intelligence, Internet of Things, Data, Digital Technology, AI and Machine Learning, Streaming Data, Streaming Analytics

Written by David Menninger

David is responsible for the overall research direction of data, information and analytics technologies at Ventana Research, covering major areas including Analytics, Big Data, Business Intelligence and Information Management, along with additional specific research categories including Information Applications, IT Performance Management, Location Intelligence, Operational Intelligence and IoT, and Data Science. David is also responsible for examining the role of cloud computing, collaboration and mobile technologies as they affect these areas. David brings to Ventana Research over twenty-five years of experience, through which he has marketed and brought to market some of the leading-edge technologies for helping organizations analyze data to support a range of action-taking and decision-making processes. Prior to joining Ventana Research, David was the Head of Business Development & Strategy at Pivotal, a division of EMC, VP of Marketing and Product Management at Vertica Systems, and VP of Marketing and Product Management at Oracle, Applix, InforSense and IRI Software. David earned his MS in Business from Bentley University and a BS in Economics from the University of Pennsylvania.