Data is the currency of the digital age and the foundation for analytics. Its value lies in the context it provides and the timeliness of its content. Information decay, the loss of that value as data ages, is therefore an important concern for data scientists in predictive security analytics.
Security analytics often assesses the riskiness of a transaction, or of a user’s activity, to detect a threat or prevent an attack. Predictive modeling applies a set of well-known analytical techniques to the cyber security domain to score event-based transactions. These scores convey a sense of “riskiness” for an event at the transactional level but do not capture user intent. To quantify user behavior, event-based scores from various models need to be aggregated. The risk score is then a summary of the scores obtained from the individual scoring algorithms within a system.
Combining risk scores from different models can be problematic. Risk assessed by one component should not duplicate or dominate risk from other models for a given user or entity. The aggregate risk assessment must be meaningful and provide enough context to minimize false positives. In addition, some behavioral components are more important than others and must be weighted accordingly, within the broader context of the established and evolving baseline of ‘normal’ behavior for an individual and their peer group.
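One simple way to picture weighted aggregation is a normalized weighted mean of per-model scores. The sketch below is illustrative only: the model names, the weights, and the assumption that every score lies in [0, 1] are hypothetical and do not come from any particular product.

```python
def aggregate_risk(scores, weights):
    """Combine per-model scores (assumed to lie in [0, 1]) into one figure.

    Weights are normalized, so no single model can push the aggregate
    above the highest individual score it contributes.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * score for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical model names and weights, for illustration only.
example_scores = {"login_anomaly": 0.9, "data_exfil": 0.2, "peer_deviation": 0.4}
example_weights = {"login_anomaly": 2.0, "data_exfil": 1.0, "peer_deviation": 1.0}
example_aggregate = aggregate_risk(example_scores, example_weights)
```

Normalizing by the total weight keeps the aggregate on the same scale as the inputs. It does not remove duplication between correlated models; that would need an extra step, such as grouping related models before weighting.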
In cyber security, cases are opened when a user’s risk score crosses a predefined threshold. To keep the score meaningful over time, and in light of additional activity in the environment, the risk score can be decayed over time or over events. With time-based decay, however, an informed attacker who has gained access to an environment can simply wait the process out, slowing their malicious activity until their risk score has fallen below monitored thresholds. Then they strike again, under the radar. In the alternative, event-based approach, the cumulative risk decays over a configurable number of events. Low-score normal events will still bring down a high score from a prior malicious event, but because the score falls only with activity and not with idle waiting, this approach may better represent risky behavior in this domain.
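The difference between the two decay strategies can be sketched in a few lines. This is a minimal illustration under assumed formulas: a half-life for time-based decay and an exponential moving average for event-based decay. The function names, the half-life, and the window size are all hypothetical.

```python
def decay_over_time(score, elapsed_hours, half_life_hours):
    """Time-based decay: the score halves every `half_life_hours`,
    whether or not the user does anything."""
    return score * 0.5 ** (elapsed_hours / half_life_hours)


def decay_over_events(score, event_scores, window):
    """Event-based decay: each new event pulls the cumulative score
    toward that event's score, with `window` controlling how many
    events it takes for an old score to fade."""
    alpha = 1.0 / window
    for event_score in event_scores:
        score = (1.0 - alpha) * score + alpha * event_score
    return score


# An attacker who goes quiet for three days (assumed 24-hour half-life):
waited_out = decay_over_time(80.0, elapsed_hours=72, half_life_hours=24)

# The same attacker under event-based decay, with no new events:
still_flagged = decay_over_events(80.0, [], window=10)

# Ten benign events (score 0) are needed to bring the score down instead:
after_benign = decay_over_events(80.0, [0.0] * 10, window=10)
```

Under these assumptions the time-decayed score drops from 80 to 10 through waiting alone, while the event-decayed score stays at 80 until the user generates new, low-risk activity.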
The user or entity risk score is a cumulative figure that captures the overall riskiness of the user or entity. Solutions that allow an alert threshold to be set on the user or entity score deliver a wider scope of capabilities within predictive security analytics. These capabilities expand with the ability to flag large increases in the risk score and to assign a “riskiness factor” to users and entities based on their privileges. Richer, up-to-date context is a byproduct of this behavior analytics capability.
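These three capabilities (a score threshold, a flag for large increases, and a privilege-based factor) can be combined in a single check. The sketch below assumes the privilege factor is a simple multiplier on the score; the default threshold and jump size are hypothetical values chosen only for illustration.

```python
def evaluate_alerts(previous, current, privilege_factor=1.0,
                    threshold=70.0, jump=25.0):
    """Return the alert types triggered by a change in risk score.

    `privilege_factor` scales both scores, so the same activity counts
    for more when the account holds elevated privileges.
    """
    adjusted_previous = previous * privilege_factor
    adjusted_current = current * privilege_factor
    alerts = []
    if adjusted_current >= threshold:
        alerts.append("threshold")
    if adjusted_current - adjusted_previous >= jump:
        alerts.append("jump")
    return alerts


# A standard user moving from 30 to 50 triggers nothing,
# but the same movement on a privileged account (factor 1.5) does.
standard_user = evaluate_alerts(30.0, 50.0)
privileged_user = evaluate_alerts(30.0, 50.0, privilege_factor=1.5)
```

Here `standard_user` is an empty list, while `privileged_user` contains both alert types, because the scaled score reaches 75 and the scaled increase is 30.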
A risk score that includes cross-application interactions, and not just additive independent factors, provides an enhanced representation of overall risk that not all UEBA vendors offer. Monitoring user or entity activity across the various applications in a business is necessary, but it often strains available resources. With these analytics capabilities, which are fundamental to the success of any UEBA solution, data decay is further mitigated.
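To make the contrast with purely additive scoring concrete, the sketch below adds a pairwise interaction term to the sum of per-application scores. The pairwise-product form, the application names, and the `interaction_weight` parameter are assumptions made for illustration, not a description of how any UEBA product computes its score.

```python
from itertools import combinations


def cross_app_risk(app_scores, interaction_weight=0.5):
    """Sum per-application scores, then add a term that grows when a
    user is risky in several applications at once."""
    additive = sum(app_scores.values())
    interaction = sum(a * b for a, b in combinations(app_scores.values(), 2))
    return additive + interaction_weight * interaction


# Hypothetical applications: the same total additive risk scores higher
# when it is spread across two applications than when it sits in one.
spread_risk = cross_app_risk({"email": 0.5, "vpn": 0.5})
single_app_risk = cross_app_risk({"email": 1.0, "vpn": 0.0})
```

With these inputs the additive part is 1.0 in both cases, but the interaction term raises `spread_risk` to 1.125, reflecting correlated activity across applications.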
Aruna Rajasekhar and Pete Gajria, Data Scientists