        David Menninger's Analyst Perspectives

        Pentaho 7.0 Brings Visualization and Data Preparation

        The business intelligence market is bounded on one side by big data and on the other side by data preparation. That is, to get the most from their information, organizations have to collect and analyze ever-increasing volumes of data while the tools available in the big data ecosystem, which I have written about, continue to evolve. In our benchmark research on big data analytics, half (51%) of organizations said they want to access big data using their existing BI tools. At the same time, as I have noted, end users are demanding self-service access to data preparation capabilities to facilitate their analyses.

        Pentaho provides open source business intelligence and data integration products that cross the divide between big data and data preparation. The company recently introduced Pentaho 7.0, which has features designed to further bridge this gap. The new version embeds the visualization capabilities of Pentaho Business Analytics in the data preparation capabilities of Pentaho Data Integration (PDI). The combination enables users to visualize data at each step in the data preparation process and uncover data quality issues earlier in the analytics process.

        The engineering work that made the new visualization features possible included a behind-the-scenes simplification and unification of Pentaho’s server architecture. PDI now interacts more directly with Pentaho Business Analytics. As a result, data engineers and data analysts doing data preparation can share data at any point with users of Business Analytics, which facilitates collaboration between different parts of the organization. The divide between business and IT can be an obstacle, and these new capabilities should help bridge it.

        Pentaho 7.0 also includes enhanced big data capabilities. Recognizing the growing popularity of Spark for big data processing and analysis, which I have commented on, the company has added Spark capabilities to PDI. It now supports Spark SQL as a data source. In addition, Spark Streaming routines and machine learning routines written in Spark ML or MLlib can be included as objects in data pipelines managed by PDI. Kerberos security capabilities have also been enhanced, including integration with Cloudera’s Sentry for more granular access to Hadoop data sets. Other big data enhancements include support for Avro, Kafka and Parquet.

        Beginning with version 6.1 of PDI, Pentaho introduced the concept of “metadata injection.” Data integration, especially big data integration, often involves many files, potentially thousands. It can be a challenge to process these files without creating many different data integration routines to deal with the variety of file types, or even with something as simple as differing file names. Metadata injection provides a way to apply a single routine to many different files by enabling PDI routines to accept parameters at various steps in the data pipeline process. PDI 7.0 adds metadata injection support to another 30 steps in the sequence of processing files.
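
        To make the idea concrete, here is a minimal sketch in plain Python of the general pattern behind metadata injection: one template routine whose file-specific details (delimiter, column mapping) are supplied as parameters at run time. It does not use PDI or any Pentaho API; the names FileMetadata, run_template, process_all and the sample file contents are hypothetical and exist only for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class FileMetadata:
    """Hypothetical per-file settings injected into the shared routine."""
    delimiter: str
    columns: dict  # maps source column name -> standardized output name


def run_template(raw_text, meta):
    """Single generic routine: parse delimited text and rename columns
    according to the injected metadata."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=meta.delimiter)
    return [
        {target: row[source] for source, target in meta.columns.items()}
        for row in reader
    ]


# Two differently shaped inputs (illustrative data, not from the article).
SOURCES = {
    "orders_eu.csv": (
        "id;amount_eur\n1;10.50\n2;7.25\n",
        FileMetadata(";", {"id": "order_id", "amount_eur": "amount"}),
    ),
    "orders_us.tsv": (
        "order\ttotal\n3\t99.00\n",
        FileMetadata("\t", {"order": "order_id", "total": "amount"}),
    ),
}


def process_all(sources):
    """Apply the same routine to every file, varying only the metadata."""
    rows = []
    for name, (text, meta) in sources.items():
        for row in run_template(text, meta):
            rows.append({"source_file": name, **row})
    return rows


if __name__ == "__main__":
    for row in process_all(SOURCES):
        print(row)
```

        The design point is that adding another file layout means adding another metadata entry, not another routine. In PDI the equivalent parameters are supplied to individual transformation steps rather than to a Python function.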

        A year ago we wrote about the prior major version of Pentaho and its relationship with its new parent Hitachi Data Systems (HDS). At the time HDS executives promised to leave Pentaho’s roadmap intact, and they appear to have done so. However, we haven’t seen as much integration of Pentaho into HDS products as expected. HDS has a major focus on IoT. Our recently completed IoT and Operational Intelligence benchmark research will be published soon. Over time I expect to see more IoT-related capabilities emerge that draw upon both Pentaho’s and HDS’s technologies.

        Within the context of IoT and otherwise, Pentaho is in a position to bring data preparation and analytics together, and thus fill a gap in the market. The pure-play data prep vendors lack analytic capabilities, and pure-play analytics vendors lack data preparation capabilities, although several have started to add them. Bringing these capabilities together will be good for users. If your organization needs to support users with both data preparation and analytic capabilities, I recommend considering the new features in Pentaho 7.0.


        David Menninger

        SVP & Research Director

        Follow Me on Twitter @dmenningerVR and Connect with me on LinkedIn.


        David Menninger
        Executive Director, Technology Research

        David Menninger leads technology software research and advisory for Ventana Research, now part of ISG. Building on over three decades of enterprise software leadership experience, he guides the team responsible for a wide range of technology-focused data and analytics topics, including AI for IT and AI-infused software.

