        David Menninger's Analyst Perspectives

        Informatica Parses the World of Hadoop

        Informatica recently introduced HParser, an expansion of its capabilities for working with Hadoop data sources. Beginning with Version 9.1, introduced earlier this year, Informatica’s flagship product has been able to access data stored in HDFS as either a source or a target for information management processes. However, it could not manipulate or transform the data within the Hadoop environment. With this announcement, Informatica starts to bring its data transformation capabilities to Hadoop.

        We recently completed benchmark research on Hadoop, the open source large-scale processing technology, and I have been writing regularly about Hadoop in this blog. Hadoop has quickly become a popular technology for storing and analyzing big data; more than half (54%) of participants in our research are either using or evaluating it. The research shows that they most often use Hadoop in conjunction with unstructured data, including application logs and event data. Before anyone can analyze this data, it must be parsed to extract the individual pieces of information recorded in each row of the log files.
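To make the parsing step concrete, here is a minimal Python sketch of what "parsing a log row" can look like. It is not taken from HParser or from our research; the Apache-style log layout, the field names and the sample line are all assumptions chosen for illustration.

```python
import re

# Assumed layout: an Apache-style "common log format" line.
# The field names below are illustrative, not from any product.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields; a "-" size means no bytes were sent.
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Hypothetical sample row.
sample = (
    '192.0.2.10 - alice [10/Oct/2011:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326'
)
record = parse_log_line(sample)
```

Even for one simple format, the expression takes some care to write and maintain, which is the kind of effort a graphical tool aims to reduce.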

        Informatica’s HParser is designed to make this process easier. Using DT Studio, Informatica’s Eclipse-based integrated development environment (IDE), organizations can use a graphical user interface to create data transformation routines that parse the information in log files and other types of data typically processed with Hadoop. Once developed, these routines are deployed to the Hadoop cluster and invoked as part of the MapReduce scripts, which enables them to use the full distributed processing and parallel execution capabilities of Hadoop. Using a graphical environment to develop these routines should make it easier and faster to create the code necessary to parse the data. As our research shows, staffing and training are the two biggest obstacles to leveraging Hadoop, so tools like HParser that can minimize the specialized skills required can be valuable to organizations deploying Hadoop.
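For comparison, the sketch below shows roughly what a hand-written parsing step in the map phase might look like, using the Hadoop Streaming convention of reading lines from standard input and writing tab-separated key/value pairs to standard output. It does not show how HParser itself is invoked; the function names and the assumed log layout (HTTP status code as the second-to-last field) are hypothetical.

```python
import sys


def map_lines(lines):
    """Yield 'status<TAB>1' for each parseable line; skip the rest."""
    for line in lines:
        parts = line.split()
        # Assumed layout: the HTTP status code is the second-to-last field.
        if len(parts) < 2 or not parts[-2].isdigit():
            continue
        yield f"{parts[-2]}\t1"


def run(stream, out):
    """Write one key/value pair per line, as a streaming mapper would."""
    for pair in map_lines(stream):
        out.write(pair + "\n")


# A streaming mapper script would end by calling: run(sys.stdin, sys.stdout)
```

Because each mapper works through its own slice of the input independently, the same parsing logic runs in parallel across the cluster. A reducer could then sum the counts for each status code.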

        Informatica is making two versions of HParser available. The community edition is free, but it’s not open source. It can be used to process log files, Omniture Web analytics data, XML documents and data in the JavaScript Object Notation (JSON) interchange format. The enterprise edition also supports a number of industry-standard data formats, including SWIFT, X12 and NACHA for the financial industry, HL7 and HIPAA for healthcare, and ASN.1 for telecommunications, as well as documents in PDF, XLS or Microsoft Word formats. For the most part, the enterprise offering is targeted at those in the Informatica user base who might be extending their efforts into Hadoop. The community edition may provide enough value for customers not currently working with Informatica to consider trying some of the company’s other products.

        We’ve seen other information management vendors take a similar approach. Earlier this year Syncsort announced a free version of its sort routines for the Hadoop market as well as an enterprise edition. HParser appears to be part of a bigger effort on the part of Informatica to embrace Hadoop. The company has been conducting a series of webcasts called Hadoop Tuesdays, one of which I participated in last week, to help educate the market about Hadoop. You may find these useful if you want to learn more about Hadoop. They are not product sales pitches but are focused on explaining the technology and its uses. In addition, Informatica will be delivering a keynote presentation at Hadoop World next week.

        We’ve seen business intelligence vendors and information management vendors alike embrace Hadoop. I expect we’ll continue to see more investment from Informatica and others as organizations work to make Hadoop a disciplined part of their IT infrastructure processes. As our research shows, integration is one of the top four issues for organizations working with Hadoop. The more that existing products can be extended to incorporate Hadoop or new products can be developed to make Hadoop easier to use, the more widespread its usage will become.

        Die-hard MapReduce programmers may not feel that they need HParser. However, enterprise IT organizations already using Informatica should find it a welcome addition in their efforts to deal with Hadoop-based data sources. You can give it a try for yourself here.


        David Menninger – VP & Research Director


        David Menninger
        Executive Director, Technology Research

        David Menninger leads technology software research and advisory for Ventana Research, now part of ISG. Building on over three decades of enterprise software leadership experience, he guides the team responsible for a wide range of technology-focused data and analytics topics, including AI for IT and AI-infused software.


