
I want to share my observations from the recent annual SAS analyst briefing. SAS is a huge software company with a unique culture and a history of success. Although SAS is privately held and not required to make the same financial disclosures as publicly held organizations, it released enough information to suggest another successful year, with more than $2.7 billion in revenue and 10 percent growth in its core analytics and data management businesses. Smaller segments showed even higher growth rates. With only selective information disclosed, it’s hard to dissect the numbers to spot specific areas of weakness, but the top-line figures suggest SAS is in good health.

One of the impressive figures SAS chooses to disclose is its investment in research and development, which at 24 percent of total revenue is a significant amount. Based on presentations at the analyst event, it appears a large amount of this investment is being directed toward big data and cloud computing. At last year’s event SAS unveiled plans for big data, and much of the focus at this year’s event was on the company’s current capabilities, which consist of high-performance computing and analytics.

SAS has three ways of processing and managing large amounts of data. Its high-performance computing (HPC) capabilities are effectively a massively parallel processing (MPP) database, albeit with rich analytic functionality. The main benefit of HPC is scalability; it allows processing of large data sets.

A variation on the HPC configuration includes pushing down operations into third-party databases for “in-database” analytics. Currently, in-database capabilities are available from Teradata and EMC Greenplum, plus SAS has announced plans to integrate with other database technologies. The main benefit of in-database processing is that it minimizes the need to move data out of the database and into the SAS application, which saves time and effort.
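
To make that benefit concrete, here is a minimal sketch of the push-down idea in generic Python and SQL terms rather than SAS syntax: the aggregate is computed inside the database engine, so only a single summary value crosses the wire. The table, column and values are hypothetical.

```python
import sqlite3

# Build a tiny in-memory table so the sketch is self-contained;
# the table and values are purely hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?)",
                 [(v,) for v in (10.0, 20.0, 30.0, 40.0)])

# Extract-then-compute: every row leaves the database before any math happens.
rows = conn.execute("SELECT amount FROM transactions").fetchall()
client_side_mean = sum(r[0] for r in rows) / len(rows)

# In-database: the engine computes the aggregate and returns only the
# single summary value, avoiding the bulk data transfer.
(in_db_mean,) = conn.execute("SELECT AVG(amount) FROM transactions").fetchone()

print(client_side_mean, in_db_mean)  # both 25.0
conn.close()
```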

More recently, SAS introduced a third alternative that it calls High Performance Analytics (HPA), which provides in-memory processing and can be used with either configuration. The main benefit of in-memory processing is enhanced performance.
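
As a toy illustration of that performance benefit, again in generic Python and not SAS code, the sketch below parses a hypothetical data file once and then runs calculations over the in-memory copy rather than re-reading the file from disk on every pass.

```python
import csv
import os
import statistics
import tempfile

# Write a small hypothetical data file so the example is self-contained.
path = os.path.join(tempfile.gettempdir(), "values.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([[i % 97] for i in range(100_000)])

# Disk-bound approach: every analytic pass re-reads and re-parses the file.
def mean_from_disk():
    with open(path, newline="") as f:
        return statistics.fmean(float(row[0]) for row in csv.reader(f))

# In-memory approach: parse once, then every pass reuses the cached values.
with open(path, newline="") as f:
    values = [float(row[0]) for row in csv.reader(f)]

def mean_in_memory():
    return statistics.fmean(values)

print(mean_from_disk(), mean_in_memory())  # same answer, far less I/O per pass
```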

These different configurations each have advantages and disadvantages, but having multiple alternatives can create confusion about which one to use. As a general rule of thumb, if your analysis involves a relatively small amount of data, perhaps as much as a couple of gigabytes, you can run HPA on a single node. If your system involves larger amounts of data coming from a database on multiple nodes, you will want to install HPA on each node to be able to process the information more quickly and handle internode transfers of data.
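
The rough Python sketch below, which stands in for no particular SAS component, shows why co-locating the analytic engine with each data partition helps: every worker summarizes only its own slice, and only the small partial results move between processes before being combined.

```python
from multiprocessing import Pool

def summarize_partition(partition):
    """Return (count, sum) for one node's slice of the data."""
    return len(partition), sum(partition)

if __name__ == "__main__":
    # Hypothetical data already split across four "nodes" (partitions).
    partitions = [list(range(i, i + 250_000))
                  for i in range(0, 1_000_000, 250_000)]

    # Each worker processes its own partition; only the tiny partial
    # results travel back to be combined, not the raw data itself.
    with Pool(processes=4) as pool:
        partials = pool.map(summarize_partition, partitions)

    total_count = sum(count for count, _ in partials)
    total_sum = sum(total for _, total in partials)
    print("overall mean:", total_sum / total_count)
```

In an MPP deployment the same idea plays out across machines instead of local processes, which is why pairing the analytic engine with each database node pays off.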

SAS also has the ability to work with data in Hadoop. Users can access information in Hadoop via Hive to interact with tables as if they were native SAS data sets. The analytic processing is done in SAS, but this approach eliminates the need to extract the data from Hadoop and put it into SAS. Users can also invoke MapReduce jobs in Hadoop from the SAS environment. To be clear, SAS does not automatically generate these jobs or assist users in creating them, but this does offer a way for SAS users to create a single process that mixes SAS and Hadoop processing.
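
For readers unfamiliar with the general pattern, here is a hedged Python sketch of querying a Hive table in place and analyzing the results on the client side; it illustrates the idea rather than the SAS mechanism itself, assumes the third-party PyHive package and a reachable HiveServer2, and uses hypothetical host, port and table names.

```python
from pyhive import hive  # third-party package: PyHive

# Connect to a (hypothetical) HiveServer2 endpoint; host, port, database
# and table names are placeholders, not real infrastructure.
conn = hive.Connection(host="hadoop-edge-node", port=10000, database="default")
cursor = conn.cursor()

# Hive runs the filtering and aggregation where the data lives; only the
# small result set comes back to the client for further analysis.
cursor.execute(
    "SELECT region, COUNT(*) AS order_count FROM web_orders GROUP BY region"
)
for region, order_count in cursor.fetchall():
    print(region, order_count)

cursor.close()
conn.close()
```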

I’d like to see SAS push down more processing into the Hadoop environment and make more of the Hadoop processing automatic. SAS plans to introduce a new capability later in the year called LASR Analytic Server that is supposed to deliver better integration with Hadoop as well as better integration with the other distributed databases SAS supports, such as EMC Greenplum and Teradata.

There were some other items to note at the event. One is a new product for end-user interactive data visualization called Visual Analytics Explorer, which is scheduled to be introduced during the first quarter of this year. For years SAS was known for having powerful analytics but lackluster user interfaces, so this came as a bit of a surprise, but the initial impression shared by many in attendance was that SAS has done a good job on the design of the user interface for this product.

In the analytics software market, many vendors have introduced products recently that provide interactive visualization. Companies such as QlikView, Tableau and Tibco Spotfire have built their businesses around interactive visualization and data exploration. Within the last year, IBM introduced Cognos Insight, MicroStrategy introduced Visual Insight, and Oracle introduced visualization capabilities in its Exalytics appliance. SAS customers will soon have the option of using an integrated product rather than a third-party product for these visualization capabilities.

Based on SAS CTO Keith Collins’s presentation, I expect to see SAS making a big investment in SaaS (software as a service, pun intended) and other cloud offerings, including platform as a service and infrastructure as a service. Collins outlined the company’s OnCloud initiative, which begins with offering some applications on demand and will roll out additional capabilities over the next two years. SAS plans full cloud support for its products, including a self-service subscription portal, developer capabilities and a marketplace for SAS and third-party cloud-based applications; it also plans to support public cloud, private cloud and hybrid configurations. Since SAS already offers its products on a subscription basis, the transition to a SaaS offering should be relatively easy from a financial perspective. This move is consistent with the market trends identified in our Business Data in the Cloud benchmark research. We also see other business intelligence vendors such as MicroStrategy and information management vendors such as Informatica adopting similarly broad commitments to cloud-based versions of their products.

Overall, SAS continues to execute well. Its customers should welcome these new developments, particularly the interactive visualization. The big-data strategy is still too SAS-centric, focused primarily on extracting information from Hadoop and other databases. I expect that the upcoming LASR Analytic Server will leverage these underlying MPP environments better. The cloud offerings will make it easier for new customers to evaluate the SAS products. I recommend you keep an eye on these developments as they come to market.

Regards,

David Menninger – VP & Research Director

In various forms, business intelligence (BI) – as queries, reporting, dashboards and online analytical processing (OLAP) – is being used increasingly widely. And as basic BI capabilities spread to more organizations, innovative ones increasingly are exploring how to take advantage of the next step in the strategic use of BI: predictive analytics. The trend in Web searches for the phrase “predictive analytics” gives one indication of the rise in interest in this area. From 2004 through 2008, the number of Web searches was relatively steady. Beginning in 2009, the number of searches rose significantly and has continued to rise.

While a market for products that deliver predictive analytics has existed for many years and enterprises in a few vertical industries and specific lines of business have been willing to pay large sums to acquire those capabilities, this constitutes but a fraction of the potential user base. Predictive analytics are nowhere near the top of the list of requirements for business intelligence implementations. In our recent benchmark research on business analytics, participants from more than 2,600 organizations ranked predictive analytics 10th among technologies they use today to generate analytics; only one in eight companies use them. Planning and forecasting fared only slightly better, ranking sixth with use by 26 percent. Yet in other research, 80 percent of organizations indicated that applying predictive analytics is important or very important.

For the untapped majority, technology has now advanced to a stage at which it is feasible to harness the potential of predictive analytics, and the value of these advanced analytics is increasingly clear. Like BI, predictive analytics can take several forms, including predictive statistics, forecasting, simulation, optimization and data mining. Among lines of business, contact centers and supply chains are most interested in predictive analytics. By industry, telecommunications, medicine and financial services are among the heavier users of these analytics, but even there no more than 20 percent now employ them. Ironically, our recent research on analytics shows that finance departments are the line of business least likely to use predictive analytics, even though they could find wide applicability for them.

Predictive analytics defined broadly have a checkered history. Some software vendors, including Angoss, SAS and SPSS (now part of IBM), have long offered tools for statistics, forecasting and data mining. I was part of the team at Oracle that acquired data mining technology for the Oracle database a dozen years ago. Companies like KXEN sprang up around that time, perhaps hoping to capitalize on what appeared to be a growing market. The open source statistics project known simply as R has been around for more than a decade, and some BI vendors have embedded it into their products, as Information Builders has done with WebFOCUS RStat.

But then things quieted down for almost a decade as the large vendors focused on consolidating BI capabilities into a single platform. Around 2007, interest began to rise again, as BI vendors looked for other ways to differentiate their product offerings. Information Builders began touting its successes in predictive analytics with a customer case study of how the Richmond, Va., police department was using the technology to fight crime. In 2008 Tibco acquired Insightful and Netezza acquired NuTech, both to gain more advanced analytics. And in 2009 IBM acquired SPSS.

As well as these combinations, there are signs that more vendors see a market in supplying advanced analytics. Several startups have focused their attention on predictive analytics, including some such as Predixion with offerings that run in the cloud. Database companies have been pushing to add more advanced analytics to their products, some by forming partnerships with or acquiring advanced analytics vendors and others by building their own analytics frameworks. The trend has even been noted by mainstream media: For example, in 2009 an article in The New York Times suggested that “data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models.”

Thus the predictive analytics market seems to be coming to life just as need intensifies. The huge volumes of data processed by major websites and online retailers, to name just two types, cannot be moved easily or quickly to a separate system for more advanced analyses. The Apache Hadoop project has been extended with a project called Mahout, a machine learning library that supports large data sets. Simultaneously, data warehouse vendors have been adding more analytics to their databases. Other vendors are touting in-memory systems as the way to deliver advanced analytics on large-scale data. It is possible that this marriage of analytics with large-scale data will cause a surge in the use of predictive analytics – unless other challenges prevent more widespread use of these capabilities.

One is the inherent complexity of the analytics themselves. They employ arcane algorithms and often require knowledge of sampling techniques to avoid bias in the analyses. Developing such analytics tools involves creating models, deploying them and managing them to understand when a model has become “stale” or outdated and ought to be revised or replaced. It should be obvious that only the most technically advanced user will have (or desire) any familiarity with this complexity; to achieve broad adoption, either there must be more highly trained users (which is unlikely) or vendors must mask the complexity and make their tools easier to use.
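
To give a flavor of that lifecycle, here is a minimal sketch, assuming scikit-learn and entirely made-up data, that fits a model, scores newly arriving data, and flags the model as stale when its recent accuracy drifts below a chosen threshold.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Historical training data: two features with a simple binary outcome.
X_train = rng.normal(size=(500, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)

# New data arriving after deployment; suppose the relationship has shifted,
# so the deployed model no longer describes it well.
X_new = rng.normal(size=(200, 2))
y_new = (X_new[:, 0] - X_new[:, 1] > 0).astype(int)

recent_accuracy = accuracy_score(y_new, model.predict(X_new))
STALENESS_THRESHOLD = 0.7  # arbitrary cutoff for this illustration
if recent_accuracy < STALENESS_THRESHOLD:
    print(f"Model looks stale (accuracy {recent_accuracy:.2f}); revise or replace it.")
else:
    print(f"Model still acceptable (accuracy {recent_accuracy:.2f}).")
```

In practice the drift check would use whatever accuracy or error measure matches the business cost of a wrong prediction, which is exactly the kind of judgment most business users do not want to make themselves.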

It also is possible that a lack of understanding of the business value these analyses provide is holding back adoption. Predictive analytics have been employed to advantage in certain segments of the market; other segments can learn from those deployments, but it is up to vendors to make the benefits and examples more widely known. In an upcoming benchmark research project I’ll be investigating what is necessary to deploy these techniques more broadly and looking for the best practices from successful implementations.

Regards,

David Menninger – VP & Research Director
