This year various types of organizations are embracing machine learning like it is going out of style, or maybe it would be better to say coming into style. A little investigation on LinkedIn finds more than half a million professionals with machine learning in their job title. Machine learning is the application of specific data science algorithms that become more accurate as the system records more outcomes and processes more data. This improvement is referred to as “learning,” hence the name. There are good reasons machine learning is growing so rapidly, but there are pitfalls to avoid as well.
The fundamental force driving increased interest in machine learning is big data. As data volumes have increased, so, too, has the need to apply data science to glean full value from this data. I often say that you can’t browse a billion rows of data. Even if you could, it would not be an effective way to find all the valuable relationships in the data, and it would be nearly impossible to review streams of data in real time as the data is generated. Therefore, organizations need an algorithmic approach if they hope to keep up with all the data they are generating. Our Big Data Analytics Benchmark Research confirms that predictive analytics is the most important type of big data analytics, selected by more than three-fourths (78%) of participants.
Organizations today have the means as well as the need to process all the data they receive. The costs of collecting, storing and analyzing data have declined dramatically, due largely to parallel computing. The ability to use tens, hundreds or even thousands of commodity servers together to store and process data significantly lowers the expense of high-performance computing. With the acceptance of cloud computing, users can rent this computing power for as long or as short a period as necessary without making significant capital outlays. Our Data and Analytics in the Cloud benchmark research finds that nearly one-third (32%) of organizations reported using the cloud for big data computing, and another one-quarter (28%) said they plan to use it within 18 months.
Parallel computing has been common for nearly a decade, but I believe that Apache Spark, which I have written about previously, was one of the catalysts that recently pushed machine learning over the top. Spark brings to the table two key capabilities that enable machine-learning deployments: a scalable machine-learning library combined with real-time processing.
On another front, machine learning helps address the dearth of skilled data science resources in many organizations. Building a model still requires data science skills, but once the model is developed and deployed, it requires less maintenance than nonlearning analytical techniques. Nonlearning algorithms become less accurate over time because the underlying information changes and the model no longer reflects it. With a static algorithm, users must update the model manually. With machine learning, the model continually improves based on the data it processes. As a result, companies can focus their scarce data science resources on new application areas rather than on maintaining existing models.
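To make the contrast between a static model and a learning model concrete, here is a minimal sketch in plain Python. It is a toy illustration under stated assumptions, not how any particular product or library does it: the data stream is invented (the relationship drifts from y = 2x to y = 3x), and the names `fit_slope`, `OnlineSlope` and `make_stream` are hypothetical. The static model is fit once and left alone; the learning model adjusts its estimate after every outcome it observes.

```python
def fit_slope(xs, ys):
    """Least-squares slope for y = w * x (no intercept), fit once."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


class OnlineSlope:
    """Slope estimate updated one observation at a time (gradient step)."""

    def __init__(self, w=0.0, lr=0.05):
        self.w = w      # current slope estimate
        self.lr = lr    # step size; an assumed value chosen for this toy data

    def predict(self, x):
        return self.w * x

    def update(self, x, y):
        # Move the slope against the gradient of the squared error.
        error = self.predict(x) - y
        self.w -= self.lr * error * x


def make_stream():
    """Invented data: the true relationship shifts from y = 2x to y = 3x."""
    xs = [1.0, 2.0, 3.0] * 20
    before = [(x, 2.0 * x) for x in xs]
    after = [(x, 3.0 * x) for x in xs]
    return before, after


before, after = make_stream()

# Both models start from the same fit on the "before" data.
static_w = fit_slope([x for x, _ in before], [y for _, y in before])
online = OnlineSlope(w=static_w)

static_err = 0.0
online_err = 0.0
for x, y in after:
    static_err += abs(static_w * x - y)        # static model never changes
    online_err += abs(online.predict(x) - y)   # score before learning
    online.update(x, y)                        # learn from the outcome

print(f"static slope: {static_w:.3f}, total abs error: {static_err:.1f}")
print(f"online slope: {online.w:.3f}, total abs error: {online_err:.1f}")
```

In this sketch the static slope stays at its original value after the data changes, so its error keeps accumulating until someone refits it by hand, while the online estimate moves toward the new relationship on its own. Real deployments involve far more than a single slope, but the maintenance trade-off is the same one described above.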
Machine learning has taken another route to market as well, as software vendors apply it to make their products easier to use. Many vendors, from analytics providers to business application vendors, have incorporated machine learning into their products. Acquiring software products with such capabilities embedded is an easy way to get machine learning into an organization. Using that software requires no data science resources on the user’s part and can have direct benefits, such as improved productivity and better outcomes.
However, embedded use is not a panacea for all machine-learning needs. Each organization has unique characteristics and business processes and should seek to apply machine learning to its core activities. To help users and potential adopters understand the applications of machine learning, Ventana Research is conducting a new type of research called Dynamic Insights in which we explore its use. Please try it out to let us know what you are doing with machine learning and to receive immediate feedback and recommendations. Engage with us as we share the findings of the machine learning research and help organizations reach their full potential.
SVP & Research Director