It can’t be over yet! Or can it? Our final model in the #MachineLearningMadness blog series is up next and it’s a whopper!
How does the Outlier Categorical Model work, and what does it do? This is a really powerful model. The Outlier Categorical Model uses a Bayesian layer to classify categorical data and gives the relative probability of an occurrence being an outlier based on prior observations.
The Outlier Categorical Model takes into account previously observed behavior patterns, and will automatically flag anything outside of the norm.
Let’s look at an example. Say you have a rule to detect logins after midnight, but someone logs in 1 or 2 seconds before midnight. A fixed-threshold rule misses this kind of subtle variation, but the Outlier Categorical Model picks it up. You would define a “late night” category for logins between midnight and 4:00am. The model assigns each transaction a probability between 0 and 1, which is mapped to a score between 0 and 100, where a higher score indicates a greater likelihood that the transaction is an outlier. A user who logs in at 1 second to midnight, well outside that user’s previously observed pattern, will get a very high score and be flagged as anomalous by the Outlier Categorical Model.
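To make the scoring idea concrete, here is a minimal sketch in Python. It is not Gurucul's implementation: the class name, the time-of-day buckets, and the Laplace-smoothing prior (`alpha`) are all assumptions made for illustration. It estimates each category's probability from prior observations, treats one minus that probability as the outlier probability, and maps it to a 0 to 100 score.

```python
from collections import Counter

# Hypothetical time-of-day categories; the bucket boundaries are assumptions.
CATEGORIES = ["late night", "early morning", "business hours", "evening"]


def time_bucket(hour):
    """Map an hour of the day (0-23) to one of the assumed categories."""
    if 0 <= hour < 4:
        return "late night"
    if 4 <= hour < 9:
        return "early morning"
    if 9 <= hour < 18:
        return "business hours"
    return "evening"


class CategoricalOutlierScorer:
    """Illustrative scorer: the rarer a category was in the past, the higher its score."""

    def __init__(self, categories, alpha=1.0):
        # alpha is a smoothing prior (assumed): every category starts with a
        # small pseudo-count, so a never-seen category is rare, not impossible.
        self.categories = list(categories)
        self.alpha = alpha
        self.counts = Counter()
        self.total = 0

    def fit(self, observations):
        """Count how often each category occurred in the prior observations."""
        observations = list(observations)
        self.counts.update(observations)
        self.total += len(observations)
        return self

    def outlier_probability(self, category):
        """Return 1 minus the smoothed probability of seeing this category."""
        k = len(self.categories)
        p_category = (self.counts[category] + self.alpha) / (self.total + self.alpha * k)
        return 1.0 - p_category

    def score(self, category):
        """Map the outlier probability (0 to 1) to a score from 0 to 100."""
        return round(100 * self.outlier_probability(category))


# One user's history: 98 logins in business hours, 2 in the early morning.
history = [time_bucket(10)] * 98 + [time_bucket(7)] * 2
scorer = CategoricalOutlierScorer(CATEGORIES).fit(history)

evening_score = scorer.score(time_bucket(23))  # a login at 23:59:59
usual_score = scorer.score(time_bucket(10))    # a typical mid-morning login
```

With this history, the login just before midnight scores far higher than a business-hours login, because this user has never been seen in the evening bucket. In a design like this, counting happens once in `fit`, and scoring a new event is a single lookup.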
The Outlier Categorical Model is one of many machine learning models used to detect merchant fraud.
It detects changes in transaction behavior patterns like:
The Outlier Categorical Model looks at past behavior to identify fraudulent transactions. With the Outlier Categorical Model, you perform machine learning training on a dataset. There is no need to retrain the model after initial training is completed. Prediction runs in real time on incoming data.
The merchant space offers lots of examples. Check out these three, which are all Gurucul Fraud Analytics use cases:
Example 1: Abnormal change in merchant details from different geolocations and devices used. A merchant changes merchant details by granting the account additional benefits, which is not normal behavior. For example, the merchant gives 5% cashback or a reversal of the transaction fee on all transactions multiple times in a day, or over several days. In addition, the merchant uses different devices to perform these transactions in an attempt to make them look like normal behavior.
Example 2: Rogue process or unusual process change in merchant and credit card details. This is basically an unauthorized change to a merchant's bank account and credit card details. Such changes are typically made by a customer support representative or account manager. In this case, a rogue process periodically changes the credit card and bank account information for merchants, and then changes it back a short time after certain transactions have gone through. These types of attacks are very difficult to detect without the Outlier Categorical Model.
Example 3: Same bank account added to different merchants with additional questionable behaviors. Adding the same account to different merchants is normal behavior because users can have multiple businesses using the same bank account. In this case, however, what was interesting to detect was that it happened randomly across different merchants and every time this bank account was added, it also got removed within a certain time frame. Basically, this bank account was never attached for long periods to certain merchants. This is the type of anomalous behavior that the Outlier Categorical Model detects.
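The add-then-remove pattern in Example 3 can be turned into a categorical signal with a little preprocessing. The sketch below is only an illustration under stated assumptions: the event tuple layout, the function name, and the 24-hour threshold are hypothetical, not part of the product. It finds bank accounts that were attached to a merchant and removed again within a short window.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def short_lived_attachments(events, max_duration=timedelta(hours=24)):
    """Return {account_id: set of merchant_ids} for accounts that were added
    to a merchant and removed again within max_duration.

    events: iterable of (timestamp, merchant_id, account_id, action) tuples,
    where action is "add" or "remove". This schema is hypothetical.
    """
    added_at = {}               # (merchant, account) -> time of the open "add"
    flagged = defaultdict(set)
    # Process events in time order so each "remove" pairs with the prior "add".
    for ts, merchant, account, action in sorted(events, key=lambda e: e[0]):
        key = (merchant, account)
        if action == "add":
            added_at[key] = ts
        elif action == "remove" and key in added_at:
            if ts - added_at.pop(key) <= max_duration:
                flagged[account].add(merchant)
    return dict(flagged)


events = [
    (datetime(2024, 1, 1, 9, 0), "m1", "acct-A", "add"),
    (datetime(2024, 1, 1, 11, 0), "m1", "acct-A", "remove"),
    (datetime(2024, 1, 3, 14, 0), "m2", "acct-A", "add"),
    (datetime(2024, 1, 3, 15, 30), "m2", "acct-A", "remove"),
    (datetime(2024, 1, 2, 10, 0), "m3", "acct-B", "add"),  # stays attached
]

flagged = short_lived_attachments(events)
```

Here `acct-A` is flagged for two merchants, while `acct-B`, which stays attached, is not. A label such as "short-lived on several merchants" could then be one of the categories the model scores.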
The benefit of the Outlier Categorical Model is that it can reliably detect unknown unknowns. It examines merchants' transactions by analyzing changes to a bank account, like the adding or removing of credit cards, along with other important categorical data like devices and location. It then gives a probability score for what may be a true positive when it comes to detecting fraudulent transactions.
Here’s the Outlier Categorical Model secret sauce: it analyzes transactions as a pattern instead of treating them as single events. It establishes a relationship between a sequence of events instead of looking at each event individually.
Why is this so important? It enables rich analytics that detect patterns across resources, multiple applications and/or categorical fields, instead of analyzing a single resource. This is why the Outlier Categorical Model can detect unknown unknowns. Rules don’t find deviations in patterns.
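One simple way to picture "patterns instead of single events" is to treat short runs of consecutive actions as the categories. The sketch below is an assumption-laden illustration, not the model's actual method: the function names, the action labels, and the window size of three are all hypothetical. It counts how often each run of actions appears across histories, so a rare sequence stands out even when every individual action is common.

```python
from collections import Counter


def action_ngrams(actions, n=3):
    """Slide a window of n consecutive actions over one entity's event stream."""
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]


def pattern_probabilities(histories, n=3):
    """Estimate how often each n-action pattern occurs across all histories."""
    counts = Counter()
    for actions in histories:
        counts.update(action_ngrams(actions, n))
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}


# Hypothetical action streams, one list per merchant.
histories = [
    ["login", "view", "pay"],
    ["login", "view", "pay", "logout"],
    ["login", "change_bank", "pay", "change_bank"],
]

probs = pattern_probabilities(histories)
common = probs[("login", "view", "pay")]
rare = probs[("change_bank", "pay", "change_bank")]
```

In this toy data, the change-pay-change-back sequence is rarer than the usual login-view-pay flow, even though "pay" on its own appears in every history. A rule that inspects single events would see nothing unusual.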
Want to see this machine learning model in action? Request a demo of Gurucul Risk Analytics today!