The Role Of Machine Learning In Predictive Security


Leslie K. Lambert



Big data has been around for decades. But only in the past few years has it been possible for organizations to mine it intelligently, thanks to the advent of machine learning, which supports analytical processes across science, business, and a range of other settings and applications.

Not surprisingly, machine learning is playing a transformative role in predictive security, enabling organizations to develop complex risk detection and prevention capabilities. That said, not all machine learning is alike. Some approaches use manually driven, rules-based processes that deal exclusively with known bad activity.

On the other hand, mature machine learning uses automated and iterative algorithms to learn patterns in data, and probes the data for structure that may be new and previously unknown. This enables real-time link analysis to evaluate relationships or connections between data nodes such as organizations, people, transactions, and so on.
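To make "link analysis" concrete, here is a minimal sketch of the underlying idea: treat each record as a link between two nodes, then ask which nodes are connected. It uses plain graph traversal rather than a learned model, and every entity name in it is invented for illustration.

```python
from collections import defaultdict, deque

# Hypothetical records, each linking two entities (names are invented).
edges = [
    ("user:alice", "account:1001"),
    ("account:1001", "txn:9f3"),
    ("txn:9f3", "org:acme-supplier"),
    ("user:bob", "account:2002"),
]

def build_graph(edges):
    """Build an undirected adjacency map from pairs of linked nodes."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected(graph, start):
    """Return every node reachable from start, using breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

graph = build_graph(edges)
linked_to_alice = connected(graph, "user:alice")
```

A production system would score and weight these links rather than simply follow them, but the data structure is the same: people, organizations, and transactions become nodes, and shared activity becomes the edges between them.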

Preventing Social Engineering Attacks

Several real-world cases from Verizon’s 2016 Data Breach Digest: Scenarios from the Field illustrate various ways machine learning can be used to detect and prevent attacks.

In one example, a manufacturer’s designs for a new model of heavy construction equipment were stolen following a social engineering attack. The company was alerted when a primary competitor introduced a piece of equipment that looked like an exact copy of its new model.

The attackers identified the company’s chief design engineer as the person most likely to have access to the new product designs. They targeted him with a spear phishing email campaign that advertised fictitious employment opportunities and installed malware on his system, which then exfiltrated the data they wanted.

The company could have detected the malware and suspicious activity if it had examined multiple log files containing rich information about what data had been transferred, when, by whom, and to where. It didn’t do so because such work is typically very arduous and time-intensive.

In this scenario, using machine learning to analyze the data already in hand would have quickly detected several suspicious activities, including when and how the files were extracted and the IP addresses to which they were sent.
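As a rough sketch of the kind of check involved, the snippet below flags outbound transfers that go to a destination never seen before or that are far larger than the user's historical volume. The log fields, the sample values, and the three-standard-deviation threshold are all assumptions made for illustration; they do not come from the Verizon case.

```python
from statistics import mean, stdev

# Hypothetical history: bytes sent outbound per day by one user.
baseline_bytes = [120_000, 95_000, 110_000, 130_000, 105_000, 98_000, 125_000]
# Hypothetical destinations this user has sent data to before.
known_destinations = {"10.0.0.5", "10.0.0.8"}

# Hypothetical new events: (timestamp, user, destination IP, bytes sent).
events = [
    ("2016-03-01T09:12", "engineer1", "10.0.0.5", 115_000),
    ("2016-03-01T23:47", "engineer1", "203.0.113.7", 4_800_000),
]

def suspicious(events, baseline, known, threshold=3.0):
    """Flag events with a new destination or a volume far above baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    flagged = []
    for timestamp, user, dest, sent in events:
        reasons = []
        if dest not in known:
            reasons.append("new destination")
        if (sent - mu) / sigma > threshold:
            reasons.append("unusual volume")
        if reasons:
            flagged.append((timestamp, user, dest, reasons))
    return flagged

flagged = suspicious(events, baseline_bytes, known_destinations)
```

Only the late-night transfer to the unfamiliar address is flagged here, and the result records when it happened, who sent it, and where it went.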

Machine Learning Detects Abuse by Trusted Users

Account compromise attacks using “hijacked” credentials are a leading threat facing organizations.

Another Verizon example involved an organization in the midst of a merger and acquisition that was using retention contracts to prevent employee attrition.

To find out what other employees were being offered, a middle manager acquired IT administrator credentials from a colleague and friend. He used these credentials to access the company’s onsite spam filter, spy on the CEO’s incoming email, and browse sensitive files.

What makes insider threats like this one difficult to detect is context or, more accurately, the lack of it. Log files and security tool outputs arrive in mind-numbing volume, and they are typically standalone, siloed sources of data. Rarely are these rich sources of intelligence correlated with one another to build a fuller understanding of what access and activity has taken place.

Security teams need to be able to examine these access patterns and behaviors to identify suspicious relationships between multiple sets of activities, possibly taking place in different locations concurrently. This is easier said than done with existing, traditional resources and techniques.
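The correlation step itself can be sketched in a few lines. The example below merges two separately stored logs and looks for one account acting from different locations within a short window. The log names, fields, and 30-minute window are hypothetical, and this is a simple rule rather than a learned model; it only shows what joining siloed sources makes visible.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Two hypothetical, separately stored sources: (time, account, location, action).
vpn_log = [
    ("2016-03-01 10:00", "it_admin", "HQ", "login"),
]
mail_filter_log = [
    ("2016-03-01 10:05", "it_admin", "Branch", "read quarantine: ceo"),
    ("2016-03-01 15:00", "helpdesk", "HQ", "release message"),
]

def parse(rows, source):
    """Normalize raw rows from one source into a common event format."""
    return [
        {"time": datetime.strptime(t, "%Y-%m-%d %H:%M"), "account": account,
         "location": location, "action": action, "source": source}
        for t, account, location, action in rows
    ]

def concurrent_locations(events, window=timedelta(minutes=30)):
    """Find one account active in two different locations close in time."""
    hits = []
    ordered = sorted(events, key=lambda e: e["time"])
    for first, second in combinations(ordered, 2):
        if (first["account"] == second["account"]
                and first["location"] != second["location"]
                and second["time"] - first["time"] <= window):
            hits.append((first["account"], first["location"], second["location"]))
    return hits

events = parse(vpn_log, "vpn") + parse(mail_filter_log, "mail_filter")
hits = concurrent_locations(events)
```

Neither log looks unusual on its own. The administrator account appearing in two places five minutes apart only shows up once the sources are combined.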

Like the first example, this one calls for the real-time analysis capabilities of machine learning.

How to Catch a RAT

In another Verizon example, a manufacturing company experienced a breach of a shared engineering workstation in its R&D department. A phishing email led to a Remote Access Trojan (RAT) being downloaded onto the system, which enabled the attackers to escalate privileges and capture the credentials of everyone who had used the system. By the time the breach was discovered, a significant amount of information had been exfiltrated via FTP to a foreign IP address.

A RAT is malicious software (malware) that runs in the background on a computer and gives unauthorized access to hackers, enabling them to steal information or install additional malware. Hackers don’t even have to create their own RATs; these programs are available for download from dark areas of the web. Trojans have been around for two decades.

RATs usually start out as executable files that are downloaded from the internet. They are often masked as another program or added to a seemingly harmless application. Once the RAT installs, it runs in system memory and adds itself to system startup directories and registry entries. Each time the computer is started, the RAT starts as well.

RATs generate anomalous data conditions across several system resources. Machine learning algorithms can detect this activity as atypical, since it involves system services or resources that are not “normally” running.
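One simple way to express "not normally running" is to compare what starts on a machine today against what has started on it historically. The sketch below does that with a frequency count; the process names, snapshots, and 25 percent cutoff are invented for illustration, and a real deployment would learn the baseline from far more data.

```python
from collections import Counter

# Hypothetical startup entries recorded on one workstation on past days.
snapshots = [
    {"antivirus.exe", "vpnclient.exe", "updater.exe"},
    {"antivirus.exe", "vpnclient.exe"},
    {"antivirus.exe", "vpnclient.exe", "updater.exe"},
    {"antivirus.exe", "vpnclient.exe", "updater.exe"},
]
# Hypothetical startup entries observed today.
current = {"antivirus.exe", "vpnclient.exe", "svch0st.exe"}

def rare_entries(snapshots, current, min_share=0.25):
    """Return current entries seen in fewer than min_share of past snapshots."""
    counts = Counter(name for snapshot in snapshots for name in snapshot)
    total = len(snapshots)
    return sorted(name for name in current if counts[name] / total < min_share)

unexpected = rare_entries(snapshots, current)
```

The lookalike entry has never appeared in the baseline, so it stands out even though no signature for it exists.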

These examples all demonstrate the role machine learning analytics can play in moving the industry from rules-based security models that look for known bad activity to a predictive approach that analyzes behavior and context. Not only can machine learning models compare “self-versus-self” and “self-versus-peer group” access and behavior for machines and users, they can do so using historical baselines to determine anomalies with unprecedented accuracy.
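The two comparisons can be illustrated with a basic statistical stand-in. In the sketch below, a user's activity today is scored against that user's own history and against same-role peers, using a z-score (distance from the mean in standard deviations). The counts and the threshold of 3 are assumptions chosen for the example, not values from any product.

```python
from statistics import mean, stdev

def z_score(value, history):
    """How many standard deviations value sits from the mean of history."""
    sigma = stdev(history)
    return 0.0 if sigma == 0 else (value - mean(history)) / sigma

# Hypothetical daily counts of sensitive-file accesses.
own_history = [4, 6, 5, 7, 5, 6]   # this user's previous days
peer_today = [5, 7, 6, 4, 8]       # same-role colleagues, today
today = 42                         # this user, today

self_vs_self = z_score(today, own_history)
self_vs_peer = z_score(today, peer_today)

# Flag only when the behavior is unusual on both comparisons.
is_anomalous = self_vs_self > 3 and self_vs_peer > 3
```

Requiring both scores to be high is one possible design choice: it avoids flagging a user whose whole team is busy that day, and it avoids flagging a user whose normal workload simply differs from that of peers.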

Attackers have been using machine-based methods to identify and exploit vulnerabilities for years. It’s time for the good guys to enlist machine learning to protect digital assets against threats that have evolved beyond the capabilities of legacy security tools to detect on their own.

Founded in 2010, Gurucul provides Actionable Risk Intelligence powered by machine learning that can proactively detect, prevent, and deter advanced insider threats, fraud, and external threats to system accounts and devices.
