David Menninger's Analyst Perspectives

The Untapped Potential of Data Preparation

Posted by David Menninger on Dec 29, 2017 5:53:35 AM

Ventana Research recently published the findings of our benchmark research on Data Preparation, which examines the practices organizations use to accomplish data preparation. We view data preparation as a sequence of steps: identifying, locating and then accessing the data; aggregating data from different sources; and enriching, transforming and cleaning it to create a single uniform data set. Using data to accomplish organizational goals requires that it be prepared for use; to do this job properly, businesses need flexible tools that enable them to enrich the context of data drawn from multiple sources and collaborate on its preparation as well as ensure security and consistency. Users of data preparation tools range from analysts to operations professionals in the lines of business to IT professionals.
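The sequence of steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not something drawn from the research: the two sample sources (`crm_rows`, `billing_rows`), the field names, the `TERRITORIES` lookup and the rule for what counts as a clean record are all hypothetical, and the order of the later steps is simply one reasonable choice.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]

# Hypothetical sample data standing in for two separate sources.
crm_rows = [
    {"customer_id": "C1", "name": " Alice Smith ", "region": "east"},
    {"customer_id": "C2", "name": "Bob Jones", "region": "WEST"},
    {"customer_id": "C2", "name": "Bob Jones", "region": "WEST"},  # duplicate row
]
billing_rows = [
    {"cust": "C1", "amount": "120.50"},
    {"cust": "C2", "amount": " 80 "},
    {"cust": "C3", "amount": "n/a"},  # unparseable amount, no CRM match
]

# Hypothetical reference data used to enrich each record with context.
TERRITORIES = {"EAST": "Eastern Sales", "WEST": "Western Sales"}


def aggregate(crm: List[Record], billing: List[Record]) -> List[Record]:
    """Combine both sources into one record per customer ID."""
    merged: Dict[str, Record] = {}
    for row in crm:
        merged.setdefault(row["customer_id"], {}).update(row)
    for row in billing:
        rec = merged.setdefault(row["cust"], {"customer_id": row["cust"]})
        rec["amount"] = row["amount"]
    return list(merged.values())


def parse_amount(value: Optional[str]) -> Optional[float]:
    """Return the amount as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def transform(rec: Record) -> Record:
    """Normalize field formats so every record has the same shape."""
    return {
        "customer_id": rec["customer_id"],
        "name": rec.get("name", "").strip(),
        "region": rec.get("region", "").strip().upper(),
        "amount": parse_amount(rec.get("amount")),
    }


def enrich(rec: Record) -> Record:
    """Add context from the reference lookup."""
    return {**rec, "territory": TERRITORIES.get(rec["region"], "Unknown")}


def clean(rows: List[Record]) -> List[Record]:
    """Drop records missing a name or a usable amount; sort by customer ID."""
    kept = [r for r in rows if r["name"] and r["amount"] is not None]
    return sorted(kept, key=lambda r: r["customer_id"])


def prepare(crm: List[Record], billing: List[Record]) -> List[Record]:
    """Run the steps in sequence to produce a single uniform data set."""
    rows = aggregate(crm, billing)
    rows = [enrich(transform(r)) for r in rows]
    return clean(rows)


if __name__ == "__main__":
    for record in prepare(crm_rows, billing_rows):
        print(record)
```

Keeping each step as its own small function makes it easier for business users and IT to agree on, review and change individual rules, which is where much of the friction discussed below tends to arise.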

A variety of new factors are changing the data preparation process, including the growing importance of streaming data sources flowing into big data repositories and a resulting need to apply data science techniques to derive meaning from that data. These technical factors will likely increase the need for IT professionals to be involved in preparing data. Nonetheless, the trend toward deploying tools that support self-service data preparation in the lines of business is growing. Self-service tools enable analysts to perform all or many of the data preparation tasks without the assistance of IT. These two trends can lead to conflict for organizations that want to derive maximum business value from their data as quickly as possible while still maintaining the appropriate data governance, security and consistency.

Advances in data preparation have unquestionably provided an opportunity for organizations to change the way they approach information management, but overall, organizations have not embraced these changes. Four years ago our Information Optimization Performance Index analysis found that more than half of organizations (52%) placed at the top two levels of our performance hierarchy; this most recent research places only 43 percent at those levels. This decline in performance suggests that many organizations need to improve their use of data preparation with a dedicated approach.

Two changes may be driving this decline: the growing complexity of data in both volume and variety, and a greater focus on enabling line-of-business users to work with data independently of their IT organizations. It’s worth noting, though, that lackluster performance does not necessarily reflect a lack of interest in data preparation: in this research, 88 percent of participants said that self-service data preparation is important to their organizations. Despite this high level of interest in providing self-service data preparation, the reality is that organizations have not succeeded in deploying these capabilities. (Those organizations that didn’t consider it important cite security, governance or risk issues as their main concerns.)

Drilling down into the results, data preparation tools are meeting organizations’ needs in some cases but the research suggests plenty of room for improvement. Just more than half (56%) said they consider their data preparation technologies completely or mostly adequate. A slightly higher percentage (62%) reported confidence in their organization’s ability to prepare data. However, fewer than half (44%) said they are comfortable allowing users to work with data not prepared by IT. Furthermore, many users complain that their data preparation technology is not flexible or adaptable when change is needed and IT’s top complaint is that it requires too many resources. This difference points to a broader disconnect between business units and IT: They do not always see eye to eye on data preparation issues. Nearly half (45%) of participants report that the top issue between the two groups is their differing view on access to data, with business units preferring an expansive approach and IT preferring a controlled approach.

Many organizations (45%) expect to reevaluate the way they assess and select data preparation technology in the next 12 to 18 months. When considering technology options and vendors, organizations rated usability and functionality the most important evaluation criteria. However, cost is a barrier for many organizations: Nearly six in 10 organizations (58%) cite it, making it far and away the most often reported barrier, followed by inadequate skills (35%), limited awareness (33%) and lack of resources (33%). On the other hand, issues such as latency, big data and scalability are least likely to be barriers, suggesting the obstacles are organizational rather than technical.

As you evaluate your data preparation processes, consider the findings in this benchmark research. Understand the primary use cases and the specific people and technology requirements of your organization. Create clear goals for your data preparation efforts and encourage your organization to adopt a cross-functional approach for designing and deploying data preparation tasks. These steps can help your organization realize the full potential of its data preparation efforts.


David Menninger

SVP & Research Director

Topics: Big Data, Analytics, Data Preparation


David is responsible for the overall research direction of data, information and analytics technologies at Ventana Research covering major areas including Analytics, Big Data, Business Intelligence and Information Management along with the additional specific research categories including Information Applications, IT Performance Management, Location Intelligence, Operational Intelligence and IoT, and Data Science. David is also responsible for examining the role of cloud computing, collaboration and mobile technologies as they affect these areas. David brings to Ventana Research over twenty-five years of experience, through which he has marketed and brought to market some of the leading-edge technologies for helping organizations analyze data to support a range of action-taking and decision-making processes. Prior to joining Ventana Research, David was the Head of Business Development & Strategy at Pivotal, a division of EMC, VP of Marketing and Product Management at Vertica Systems, VP of Marketing and Product Management at Oracle, Applix, InforSense and IRI Software. David earned his MS in Business from Bentley University and a BS in Economics from the University of Pennsylvania.