Machine Learning and UEBA are here to pick up SIEM slack.
A SIEM captures a huge amount of data. Security analysts then need to review this data to find abnormalities in user behavior. In data science, this is a needle-in-a-haystack problem: it cannot be solved just by throwing a bunch of humans at it to review each log entry. Enter machine learning (ML).
Machine Learning is the game-changer
If we know what we’re looking for, then we can sit down and write a set of detection rules. For example, flag all traffic coming in from a certain IP. Or flag users who download more than XX bytes of a SharePoint data resource after hours.
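The two example rules above can be sketched in a few lines of Python. This is only an illustration of what a hand-written rule looks like, not how any particular SIEM expresses rules. The `Event` fields, the `flag_event` function, the 9:00 to 18:00 working day, and the byte threshold passed in as `max_bytes` are all hypothetical; the original leaves the threshold unspecified, so the caller has to supply it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """A simplified log record (hypothetical fields, for illustration only)."""
    user: str
    source_ip: str
    resource: str
    bytes_downloaded: int
    timestamp: datetime


def flag_event(event, blocked_ips, max_bytes, work_start=9, work_end=18):
    """Return the names of the hand-written rules that this event triggers."""
    reasons = []
    # Rule 1: flag all traffic coming in from a known IP.
    if event.source_ip in blocked_ips:
        reasons.append("blocked_ip")
    # Rule 2: flag large downloads outside assumed working hours.
    after_hours = not (work_start <= event.timestamp.hour < work_end)
    if after_hours and event.bytes_downloaded > max_bytes:
        reasons.append("after_hours_bulk_download")
    return reasons


# Arbitrary example values; the threshold here is a placeholder, not a recommendation.
event = Event("alice", "203.0.113.7", "sharepoint/finance",
              5_000_000, datetime(2024, 1, 15, 23, 30))
print(flag_event(event, blocked_ips={"203.0.113.7"}, max_bytes=1_000_000))
# prints ['blocked_ip', 'after_hours_bulk_download']
```

Every condition here had to be known in advance by whoever wrote the rule.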
But if you don’t know what to look for, or where or when to look, how can you come up with a rule to find it? How does one build a whitelist/blacklist around this? How does one harness a third party’s threat intelligence, like a SIEM, to find something when we don’t know what we’re looking for?
In the world of anomaly detection, this is an “unknown unknown”.
As the data scientists behind predictive security analytics solutions, we need to catch the anomaly whose signature we do not have. We use ML to handle this, specifically an unsupervised or semi-supervised outlier detection algorithm. This technique builds the user’s profile over a period of days or weeks, then compares the user’s behavior at detection time against that established baseline to recognize an abnormality. The baseline user profile is actually a collection of profiles; one might think of it as a profile for every attribute in the user’s data record. Hence an abnormality in any attribute of the user’s data will stand out against their established baseline. This shifts the focus from looking at individual events to monitoring overall behavior.
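To make the idea of "a profile for every attribute" concrete, here is a minimal sketch. It uses a plain per-attribute z-score as a stand-in for the outlier detection algorithm; a production system would likely use something more robust, and the article does not say which algorithm is used. The attribute names, the sample history, and the threshold of 3 standard deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Build one (mean, standard deviation) profile per attribute.

    history: list of dicts, one per day, mapping attribute name -> numeric value.
    """
    baseline = {}
    for attr in history[0]:
        values = [day[attr] for day in history]
        baseline[attr] = (mean(values), stdev(values))
    return baseline


def find_anomalies(baseline, observation, threshold=3.0):
    """Return attributes whose value deviates from the baseline by more than
    `threshold` standard deviations, mapped to the size of the deviation."""
    anomalies = {}
    for attr, (mu, sigma) in baseline.items():
        if sigma == 0:
            # No variation in the history: treat any change as a deviation.
            if observation[attr] != mu:
                anomalies[attr] = float("inf")
            continue
        z = abs(observation[attr] - mu) / sigma
        if z > threshold:
            anomalies[attr] = z
    return anomalies


# Hypothetical daily summaries for one user (made-up numbers).
history = [
    {"logins": 5, "mb_downloaded": 40, "distinct_hosts": 2},
    {"logins": 6, "mb_downloaded": 55, "distinct_hosts": 3},
    {"logins": 4, "mb_downloaded": 45, "distinct_hosts": 2},
    {"logins": 5, "mb_downloaded": 50, "distinct_hosts": 3},
    {"logins": 6, "mb_downloaded": 60, "distinct_hosts": 2},
]
today = {"logins": 5, "mb_downloaded": 900, "distinct_hosts": 3}

baseline = build_baseline(history)
print(sorted(find_anomalies(baseline, today)))
# prints ['mb_downloaded']
```

Note that no rule mentions downloads or any byte limit. The download volume stands out only because it is far from what this particular user normally does.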
UEBA solutions based on machine learning for detection
UEBA solutions based on machine learning consolidate and analyze user information. They establish normal behavior, highlight deviations from it, and finally risk-score those deviations accordingly. Frequently, solutions like UEBA escalate expected, or well understood, patterns of anomalous behavior. However, because UEBA applies advanced analytics to all available data captured in an environment, it can also detect changes in patterns, and new patterns, as they emerge: the unknown unknowns. This is not possible with a simple rules-based solution.
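As a rough illustration of the final risk-scoring step, the sketch below folds per-attribute deviation scores into a single number between 0 and 100. The weighting scheme, the cap on each deviation, and the attribute names are hypothetical choices made for this example; the article does not describe how any real UEBA product computes its scores.

```python
def risk_score(deviations, weights, cap=10.0):
    """Combine per-attribute deviations into a 0-100 risk score.

    deviations: attribute -> non-negative deviation (for example, a z-score)
    weights:    attribute -> relative importance (hypothetical values)
    cap:        deviation at which an attribute contributes its full weight
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        weights[attr] * min(deviations.get(attr, 0.0), cap) / cap
        for attr in weights
    )
    return round(100 * weighted / total_weight, 1)


# Made-up deviations for one user on one day.
deviations = {"mb_downloaded": 12.0, "logins": 0.5}
weights = {"mb_downloaded": 3, "logins": 1, "distinct_hosts": 1}
print(risk_score(deviations, weights))
```

Capping each deviation keeps one extreme attribute from pushing the score past 100, while the weights let an analyst decide which attributes matter most.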
SIEMs are only part of the solution
From a data scientist’s perspective, traditional SIEMs are only part of the solution. This is because they remain reliant on a rules-only approach, which detects only the anomalies the platform “knows about” (the known unknowns). Meanwhile, the unknown unknowns, the damaging ones, slip by. As a result, SIEMs alone cannot ensure a holistic security environment.
Written by Aruna Rajasekhar, Lead Data Scientist, Gurucul