David Menninger's Analyst Perspectives

SAS Advances Big Data and Cloud Computing

Written by David Menninger | Mar 26, 2012 6:08:46 PM

I want to share my observations from the recent annual SAS analyst briefing. SAS is a huge software company with a unique culture and a history of success. Although SAS, being privately held, is not required to make the same financial disclosures as publicly held organizations, it released enough information to suggest another successful year, with more than $2.7 billion in revenue and 10 percent growth in its core analytics and data management businesses. Smaller segments showed even higher growth rates. With only selective information disclosed, it’s hard to dissect the numbers to spot specific areas of weakness, but the top-line figures suggest SAS is in good health.

One of the impressive figures SAS chooses to disclose is its investment in research and development, which at 24 percent of total revenue is a significant amount. Based on presentations at the analyst event, it appears a large amount of this investment is being directed toward big data and cloud computing. At last year’s event SAS unveiled plans for big data, and much of the focus at this year’s event was on the company’s current capabilities, which consist of high-performance computing and analytics.

SAS has three ways of processing and managing large amounts of data. Its high-performance computing (HPC) capabilities are effectively a massively parallel processing (MPP) database, albeit with rich analytic functionality. The main benefit of HPC is scalability; it allows processing of large data sets.

A variation on the HPC configuration includes pushing down operations into third-party databases for “in-database” analytics. Currently, in-database capabilities are available for Teradata and EMC Greenplum, and SAS has announced plans to integrate with other database technologies. The main benefit of in-database processing is that it minimizes the need to move data out of the database and into the SAS application, which saves time and effort.

More recently, SAS introduced a third alternative that it calls High Performance Analytics (HPA), which provides in-memory processing and can be used with either configuration. The main benefit of in-memory processing is enhanced performance.

These different configurations each have advantages and disadvantages, but having multiple alternatives can create confusion about which one to use. As a general rule of thumb, if your analysis involves a relatively small amount of data, perhaps as much as a couple of gigabytes, you can run HPA on a single node. If your system involves larger amounts of data coming from a database on multiple nodes, you will want to install HPA on each node to be able to process the information more quickly and handle internode transfers of data.
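To make that rule of thumb concrete, here is a minimal sketch in Python. It is my own illustration, not anything SAS ships: the names `Workload` and `suggest_hpa_layout` are hypothetical, and the 2 GB cutoff is an assumption standing in for "a couple of gigabytes," since no exact threshold was given.

```python
from dataclasses import dataclass

# Assumption: "a couple of gigabytes" is read as 2 GB. No exact cutoff was given.
SINGLE_NODE_LIMIT_GB = 2.0


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an analysis job."""
    data_size_gb: float   # approximate size of the data to analyze
    database_nodes: int   # nodes in the source database (1 if not distributed)


def suggest_hpa_layout(workload: Workload) -> str:
    """Return a suggested HPA layout based on the rule of thumb above."""
    # Small data sets: HPA on a single node is enough.
    if workload.data_size_gb <= SINGLE_NODE_LIMIT_GB:
        return "single-node"
    # Larger data coming from a multi-node database: install HPA on each node.
    if workload.database_nodes > 1:
        return "hpa-on-each-node"
    # Large data on a single database node is not covered by the rule of thumb.
    return "undecided"


if __name__ == "__main__":
    print(suggest_hpa_layout(Workload(data_size_gb=1.5, database_nodes=1)))
    print(suggest_hpa_layout(Workload(data_size_gb=500.0, database_nodes=16)))
```

The third branch is deliberate: the guidance covers small data and large distributed data, so the sketch declines to guess for anything in between.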

SAS also has the ability to work with data in Hadoop. Users can access information in Hadoop via Hive to interact with tables as if they were native SAS data sets. The analytic processing is done in SAS, but this approach eliminates the need to extract the data from Hadoop and put it into SAS. Users can also invoke MapReduce jobs in Hadoop from the SAS environment. To be clear, SAS does not automatically generate these jobs or assist users in creating them, but this does offer a way for SAS users to create a single process that mixes SAS and Hadoop processing.

I’d like to see SAS push down more processing into the Hadoop environment and make more of the Hadoop processing automatic. SAS plans to introduce a new capability later in the year called LASR Analytic Server that is supposed to deliver better integration with Hadoop as well as better integration with the other distributed databases SAS supports, such as EMC Greenplum and Teradata.

There were some other items to note at the event. One is a new product for end-user interactive data visualization called Visual Analytics Explorer, which is scheduled to be introduced during the first quarter of this year. For years SAS was known for having powerful analytics but lackluster user interfaces, so this came as a bit of a surprise, but the initial impression shared by many in attendance was that SAS has done a good job on the design of the user interface for this product.

In the analytics software market, many vendors have introduced products recently that provide interactive visualization. Companies such as QlikView, Tableau and Tibco Spotfire have built their businesses around interactive visualization and data exploration. Within the last year, IBM introduced Cognos Insight, MicroStrategy introduced Visual Insight, and Oracle introduced visualization capabilities in its Exalytics appliance. SAS customers will soon have the option of using an integrated product rather than a third-party product for these visualization capabilities.

Based on SAS CTO Keith Collins’s presentation, I expect to see SAS making a big investment in SaaS (software as a service, pun intended) and other cloud offerings, including platform as a service and infrastructure as a service. Collins outlined the company’s OnCloud initiative, which begins with offering some applications on demand and will roll out additional capabilities over the next two years. SAS plans full cloud support for its products, including a self-service subscription portal, developer capabilities and a marketplace for SAS and third-party cloud-based applications. It also plans to support public cloud, private cloud and hybrid configurations. Since SAS already offers its products on a subscription basis, the transition to a SaaS offering should be relatively easy from a financial perspective. This move is consistent with the market trends identified in our Business Data in the Cloud benchmark research. We also see other business intelligence vendors such as MicroStrategy and information management vendors such as Informatica adopting similarly broad commitments to cloud-based versions of their products.

Overall, SAS continues to execute well. Its customers should welcome these new developments, particularly the interactive visualization. The big-data strategy is still too SAS-centric, focused primarily on extracting information from Hadoop and other databases. I expect that the upcoming LASR Analytic Server will leverage these underlying MPP environments better. The cloud offerings will make it easier for new customers to evaluate the SAS products. I recommend you keep an eye on these developments as they come to market.

Regards,

David Menninger – VP & Research Director