The most difficult task of a machine learning-based User and Entity Behavior Analytics (UEBA) product is to interpret and predict sentiment. This is the essence of Qualitative Analysis – to ascertain the qualitative nature of a data set.
Quantitative vs. Qualitative Analysis
Machine learning is excellent at quantitative analysis, which seeks to understand behavior through mathematical and statistical modeling. Numbers are straightforward and leave nothing to interpretation. Quantitative analysis deals with measurements, quantities, time series, numbers, frequency, and volume. It’s easy to quantify. It’s factual. It is not interpretive. We are talking zeros and ones and nothing in between.
Qualitative analysis, on the other hand, leaves pretty much everything to interpretation. This is more about using machine learning to figure out what’s right or wrong, good or bad, happy or sad, content or disillusioned. With UEBA, it’s specifically about understanding the sentiment behind words people use in emails, texts and voice messages. And that’s hard to do – really hard.
I Love My Job and Hate Ice Cream
Consider the subject line of an email that reads, “I love my job and hate ice cream.” To a person reading this, it’s clear that the writer loves their job and hates ice cream. For a data scientist trying to use machine learning to ascertain the sentiment of this statement, having the words “love” and “hate” in the same sentence can be problematic. Is this a negative sentiment? Yes, the person hates ice cream. But is the negative sentiment risky for the company? No, the person loves their job. Herein lies the dilemma – how can machine learning distinguish between the love of job and hate of ice cream? This is where the best data science really shines. And you need the best data science to conquer sentiment analysis.
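To make the dilemma concrete, here is a minimal Python sketch that attaches each sentiment word to the thing it is about, and then asks whether any negative sentiment targets work. It is a toy stand-in for real parsing, not Gurucul's implementation: the word lists, the function names and the "read until the next 'and'" rule are all assumptions invented for illustration.

```python
import re

# Hypothetical toy vocabularies, invented for this illustration only.
SENTIMENT_VERBS = {"love": 1, "hate": -1}
WORK_TERMS = {"job", "boss", "company"}
STOPWORDS = {"my", "the", "a", "an", "this"}


def target_sentiments(text):
    """Return (target, polarity) pairs, e.g. ("job", 1)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    results = []
    i = 0
    while i < len(tokens):
        if tokens[i] in SENTIMENT_VERBS:
            polarity = SENTIMENT_VERBS[tokens[i]]
            target = []
            j = i + 1
            # Collect the words that follow the verb until the next
            # conjunction or sentiment verb; treat them as the target.
            while (j < len(tokens)
                   and tokens[j] != "and"
                   and tokens[j] not in SENTIMENT_VERBS):
                if tokens[j] not in STOPWORDS:
                    target.append(tokens[j])
                j += 1
            results.append((" ".join(target), polarity))
            i = j
        else:
            i += 1
    return results


def work_related_negative(text):
    """True only if some negative sentiment is aimed at a work term."""
    return any(
        polarity < 0 and set(target.split()) & WORK_TERMS
        for target, polarity in target_sentiments(text)
    )


print(target_sentiments("I love my job and hate ice cream"))
print(work_related_negative("I love my job and hate ice cream"))
print(work_related_negative("I hate my job"))
```

With this toy rule, the hate lands on ice cream and the love lands on the job, so the subject line is not flagged as a work-related negative. Real language is far messier than a single "and", which is why the hard part is the data science.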
Qualitative Analysis is the Foundation for Sentiment Analysis
Gurucul uses sentiment analysis to predict departing users. It’s one of our best UEBA use cases and is the cornerstone of many of our Insider Threat behavior models. When it comes to predicting whether employees are going to leave the company, sentiment analysis plays a key role.
How does it work? Gurucul UEBA looks at email subject lines and builds a grammatical tree of words. It’s different from a word cloud, which is more like a cluster of words where the words used most often are shown in larger font. A grammatical tree breaks down sentences into phrases and words to determine whether a sentence is of negative or positive sentiment. The key here is context and phrases, not simply words. Words unto themselves mean nothing.
Take the word, “mad”. In itself it could be construed as a negative sentiment. But the phrase, “I’m mad about data science” is in fact a positive sentiment. In contrast, “I’m mad as heck and I’m not going to take it anymore” is negative. Context is absolutely essential in determining whether words, phrases and ultimately complete sentences are positive or negative.
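The difference between scoring words and scoring phrases can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the grammatical tree described above: the two lexicons and their scores are hypothetical, and a flat phrase lookup stands in for real parsing.

```python
import re

# Hypothetical toy lexicons; the entries and scores are assumptions.
WORD_SCORES = {"love": 1, "hate": -1, "mad": -1}
PHRASE_SCORES = {"mad about": 1, "mad as heck": -1}

# Try longer phrases first so "mad as heck" wins over a shorter match.
PHRASES = sorted(PHRASE_SCORES, key=lambda p: -len(p.split()))


def phrase_score(text):
    """Sum sentiment scores, letting a known phrase override its words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0
    i = 0
    while i < len(tokens):
        for phrase in PHRASES:
            parts = phrase.split()
            if tokens[i:i + len(parts)] == parts:
                total += PHRASE_SCORES[phrase]
                i += len(parts)
                break
        else:
            # No phrase starts here, so fall back to the single word.
            total += WORD_SCORES.get(tokens[i], 0)
            i += 1
    return total


print(phrase_score("mad"))
print(phrase_score("I'm mad about data science"))
print(phrase_score("I'm mad as heck and I'm not going to take it anymore"))
```

On its own, "mad" scores negative. Inside "mad about" the phrase entry takes over and the sentence scores positive, while "mad as heck" stays negative.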
I Hate My Job
If you hate your job, don’t send emails or text messages saying just that. You will be caught out with the right UEBA product. Gurucul UEBA can root out sentiments associated with departing users, such as employees sending emails with subject lines containing phrases like “I hate my job.”
Gurucul UEBA looks at frequency counts of words. We train our models on sets of public emails. We’ve gone to great lengths to ensure our qualitative analysis is vetted. Our machine learning models can scan sentences and reconstruct intent based on the English language. That’s why we’re able to stop departing employees from taking intellectual property. Using sentiment analysis, we predict that they are planning to leave. Then we put these employees on watch lists and implement more restrictive DLP policies so they cannot send email attachments or copy files to a USB drive. We take action proactively to keep your data safe.
Context is Critical
Never forget that context is critical to all things machine learning. Gurucul combines sentiment analysis with other behavior models to paint the full picture of departing employees. It’s not just what users are typing or saying, it’s also what they are doing. If they are sending emails with subject lines like “I hate my boss,” visiting career websites, and downloading customer lists more frequently than normal, then this is indeed behavior indicative of someone planning to quit.
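As a rough sketch of how the three signals in this example might be combined, the Python below flags a user only when all of them line up. The field names, thresholds and the all-three rule are assumptions made for illustration. They are not Gurucul's behavior models, which are not described in this post.

```python
from dataclasses import dataclass


@dataclass
class UserSignals:
    # All fields are hypothetical inputs from upstream models.
    negative_work_subjects: int  # subject lines flagged negative about work
    career_site_visits: int      # visits to career websites
    download_ratio: float        # recent downloads / the user's own baseline


def indicators(signals, download_threshold=1.5):
    """List which of the three example behaviors are present."""
    found = []
    if signals.negative_work_subjects >= 1:
        found.append("negative work sentiment")
    if signals.career_site_visits >= 1:
        found.append("career site visits")
    if signals.download_ratio > download_threshold:
        found.append("above-normal downloads")
    return found


def likely_departing(signals):
    """Flag only when sentiment AND both behaviors are present."""
    return len(indicators(signals)) == 3


grumbler = UserSignals(negative_work_subjects=2, career_site_visits=0,
                       download_ratio=1.0)
leaver = UserSignals(negative_work_subjects=2, career_site_visits=4,
                     download_ratio=3.2)

print(indicators(grumbler), likely_departing(grumbler))
print(indicators(leaver), likely_departing(leaver))
```

The point of the design is that sentiment alone does not trigger anything: the user who only complains is not flagged, while the user who complains, browses career sites and downloads far more than their own baseline is.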
Ultimately, the data is only as good as the data science. If you’re looking for a UEBA product that absolutely delivers value above and beyond quantitative analysis, look no further.
Gurucul UEBA is phenomenal. Give us a look and see for yourself why SC Media awarded Gurucul Best Behaviour Analytics / Enterprise Threat Detection in the recent SC Awards Europe 2020.