Enterprise Scale Machine Learning For Detecting Email Based Attacks

Bayan Bruss

Bayan Bruss, Senior Machine Learning Engineer, Capital One

Email is an effective means of perpetrating cyber attacks on organizations because the attack surface tends to be broad and easily exploitable. Email-based attacks can result in credential harvesting, drive-by downloads, and other malicious activities such as executive spoofing. In many organizations, attackers easily evade commercial email filters. This moves the front line of defense to a company’s employees, who are expected to report potentially malicious emails. Studies have shown that even immediately after receiving training, the average employee still misses more than 30% of phishing attempts. Furthermore, even when employees do detect a malicious email, there is no guarantee that they will report it.

To address these problems, Capital One has built an enterprise-scale detection system that uses Machine Learning to identify the highest-priority attacks that have evaded existing defenses and send them to trained analysts for validation and remediation. The validated data feeds back into the ML system, allowing it to adapt dynamically to changes in the threat landscape. The initial version of this system has been deployed for the Cybersecurity Operations Center (CSOC) on the subset of emails reported by Capital One employees. About 95% of these reported emails are harmless marketing messages, and roughly 5% are malicious, which makes it hard for analysts to spend the time needed on the more urgent cases.

The system described here classifies each email as either malicious or benign by analyzing body text, embedded URLs, sender information, and headers. It automatically generates analyst alerts in commonly used platforms such as Slack and in ticketing systems like Jira. Current model results allow analyst workload to be reduced by more than 75%.
The success of this initial use case sets Capital One up to develop an enterprise-scale platform that inspects all inbound emails, with a wide variety of Machine Learning enabled security tasks currently under development. These include automated HTML content analysis, attachment analysis, image classification, and a broad suite of URL enrichments and classifications that lead into comprehensive infrastructure analysis. Furthermore, Capital One intends to apply this broad set of capabilities across the spectrum of attack vectors as appropriate.
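
To make the four signal sources named in the abstract (body text, embedded URLs, sender information, and headers) concrete, the following is a minimal sketch of how such features might be pulled from a raw message using only the Python standard library. It is an illustration under stated assumptions, not the system described in the talk: the function names, the feature names, the URL pattern, and the sample message are all hypothetical, and a real pipeline would feed features like these into a trained classifier.

```python
import re
from email import message_from_string, policy
from email.utils import parseaddr
from urllib.parse import urlparse

# Deliberately simple URL pattern (an assumption; real systems use stricter parsing).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical sample message, used only for illustration.
SAMPLE = (
    "From: IT Support <support@example.com>\n"
    "Reply-To: attacker@example.net\n"
    "Subject: Password reset\n"
    "\n"
    "Please verify your account at http://login.example.org/verify today.\n"
)


def domain_of(address_header):
    """Return the lowercase domain of an address header, or '' if absent."""
    _, addr = parseaddr(address_header or "")
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def extract_features(raw_email):
    """Build a small, hypothetical feature dict from a raw RFC 5322 message."""
    msg = message_from_string(raw_email, policy=policy.default)

    # Body text: prefer the plain-text part, fall back to HTML.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    # Embedded URLs and the distinct hosts they point to.
    urls = URL_PATTERN.findall(body)
    url_hosts = sorted({(urlparse(u).hostname or "").lower() for u in urls})

    # Sender information and a simple header consistency check.
    sender_domain = domain_of(msg["From"])
    reply_to_domain = domain_of(msg["Reply-To"])

    return {
        "body_length": len(body),
        "num_urls": len(urls),
        "url_hosts": url_hosts,
        "sender_domain": sender_domain,
        "reply_to_mismatch": bool(reply_to_domain)
        and reply_to_domain != sender_domain,
        "num_received_headers": len(msg.get_all("Received", [])),
    }


if __name__ == "__main__":
    print(extract_features(SAMPLE))
```

The Reply-To mismatch flag is just one example of a header-derived signal; which features are actually informative would depend on the validated analyst labels that the abstract describes feeding back into the model.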

Bayan is a Senior Machine Learning Engineer within Capital One’s Center For Machine Learning. He leads a team of Machine Learning Engineers developing machine learning based cybersecurity products to strengthen Capital One’s cyber defenses and respond to a dynamic threat landscape. Bayan’s team currently focuses on building a machine learning platform for identifying phishing and other malicious emails entering Capital One. Prior to Capital One, Bayan worked as a Data Science and Big Data engineering consultant with Accenture, where he worked on a range of projects across the Federal and Financial Services industries. He is broadly interested in NLP and has conducted research at the intersection of social psychology, culture, and computational linguistics.

BibTeX

@conference {215317,
author = {Bayan Bruss},
title = {Enterprise Scale Machine Learning For Detecting Email Based Attacks},
year = {2018},
address = {Atlanta, GA},
publisher = {USENIX Association},
month = may
}
