Much of what we hear about artificial intelligence and machine learning in security products is steeped in marketing, making it hard to know what these tools actually do. Here’s a clear-eyed look at the current state of AI & ML in security.
Let’s start by dispelling the most common misconception: There is very little, if any, true artificial intelligence (AI) incorporated into enterprise security software. That the term comes up frequently has largely to do with marketing, and very little to do with the technology. Pure AI is about reproducing cognitive abilities.
That said, machine learning (ML), one of many subsets of artificial intelligence, is being baked into some security software. But even the term machine learning may be employed somewhat optimistically. Its use in security software today has more in common with the rules-based “expert systems” of the 1980s and 1990s than it does with true AI. If you’ve ever used a Bayesian spam trap and trained it with thousands of known spam emails and thousands of known good emails, you have a glimmer of how machine learning works, if not of its scale. In most cases it isn’t capable of self-training; it requires human intervention, including programming, to update its training. Security involves so many variables and so many data points that keeping that training current, and therefore effective, can be a challenge.
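The Bayesian spam trap mentioned above can be sketched in a few lines. This is a toy illustration, not any product’s implementation: real filters train on thousands of messages and far richer features, and the training phrases, function names, and word-splitting here are illustrative assumptions.

```python
from collections import Counter
import math

def train(messages):
    """Count word occurrences across a list of training messages."""
    counts = Counter()
    for msg in messages:
        counts.update(msg.lower().split())
    return counts

def spam_score(message, spam_counts, ham_counts):
    """Return a log-odds score: positive means more spam-like."""
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    score = 0.0
    for word in message.lower().split():
        # Laplace smoothing so unseen words don't zero out the score
        p_spam = (spam_counts[word] + 1) / (spam_total + 2)
        p_ham = (ham_counts[word] + 1) / (ham_total + 2)
        score += math.log(p_spam / p_ham)
    return score

# Tiny hypothetical training sets; real systems need thousands of each
spam = train(["win free money now", "free prize claim now"])
ham = train(["meeting notes attached", "see project notes"])

print(spam_score("free money", spam, ham) > 0)       # scored as spam
print(spam_score("project meeting", spam, ham) < 0)  # scored as good
```

The point of the sketch is the article’s point: the model is only as good as its training data, and it cannot update itself when the vocabulary of spam shifts.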
Machine learning, however, can be very effective when it is trained with a high volume of data from the environment in which it will be used, by people who know what they’re doing. Although complex systems are possible, machine learning works better on a targeted task or set of tasks than on a wide-ranging mission.
One of machine learning’s greatest strengths is outlier detection, which is the basis of user and entity behavior analytics (UEBA), says Chris Kissel, IDC research director, global security products. “The short definition of what UEBA does,” he adds, “is determining whether an activity emanating from or being received by a given device is anomalous.” UEBA fits naturally into many major cybersecurity defensive activities.
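The outlier detection Kissel describes can be sketched with simple statistics. The sample data, the three-standard-deviation threshold, and the function name below are illustrative assumptions, not any UEBA vendor’s actual method:

```python
import statistics

def find_outliers(baseline, observations, threshold=3.0):
    """Flag observations more than `threshold` standard deviations
    from the mean of the baseline activity."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    return [x for x in observations if abs(x - mean) > threshold * stdev]

# Hypothetical baseline: typical daily outbound connections for one device
baseline = [98, 102, 101, 99, 100, 103, 97]
today = [101, 100, 480]  # a sudden spike that merits investigation

print(find_outliers(baseline, today))  # flags only the spike: [480]
```

Production UEBA systems model many signals at once (logins, data volumes, access times) rather than a single count, but the principle is the same: learn what normal looks like for each user and device, then surface what deviates from it.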
When a machine learning system is trained thoroughly and well, in most cases you’ve defined the known good events. That lets your threat intelligence or security monitoring system focus on identifying anomalies. What happens when the system is trained by the vendor solely with its own generic data? Or when it is trained with an insufficient volume of events? Or when too many outliers go unidentified and become part of a rising din of background noise? You may wind up with the bane of enterprise threat-detection software: an endless succession of false positives. If you’re not training your machine learning system on an ongoing basis, you won’t get the real advantage that ML has to offer, and as time goes by, your system will become less effective.