Security Data Science at Cloud Scale

Yogesh Roy, Principal Engineer, Microsoft


Hyperscale cloud computing platforms offer a rich set of services that handle billions of transactions and generate petabyte-scale logs. A single service like Azure Active Directory in Microsoft’s cloud handles 450 billion logins per month, generating over 10 petabytes of logs annually. The Azure Security Data Science team is tasked with detecting malicious activities and behaviors in our cloud services by employing data-driven approaches to security.

In this talk, we present three classes of problems. First, we show how to detect unusual login behavior in Azure Active Directory (AAD), a foundational Azure service. At first pass this may look trivial, but at a scale of 5 billion enterprise logins per day, even a small false positive rate can cripple security analysts. We discuss our solution, which employs a combination of random walks and Markov chains and drastically reduces false positive rates. Next, we share our approach for combining detections from multiple Azure services in a graphical model, giving us the ability to detect complex multistage attacks in the cloud. Finally, we discuss approaches to determining a user risk score, addressing questions like: How does one model risk? What are the different components that contribute to the riskiness of a user? We answer these questions by sharing our experience in building an unsupervised risk score function that integrates the various detections we have.
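The abstract does not say how the random walks and Markov chains are combined, so the following is only a minimal sketch of the Markov chain half of the idea, not the team's method. It assumes each user's login history is a sequence of discrete states (here, hypothetical location labels), fits first-order transition probabilities with additive smoothing, and scores a new sequence by its average per-transition surprise. The function names, the smoothing constant, and the sample data are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def fit_transitions(sequences, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    counts = defaultdict(Counter)
    states = set()
    for seq in sequences:
        states.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    n_states = len(states)

    def prob(prev, cur):
        # One extra slot in the denominator reserves mass for unseen states.
        total = sum(counts[prev].values())
        return (counts[prev][cur] + smoothing) / (total + smoothing * (n_states + 1))

    return prob


def surprise(prob, seq):
    """Average negative log-probability per transition; higher is more unusual."""
    steps = list(zip(seq, seq[1:]))
    if not steps:
        return 0.0
    return -sum(math.log(prob(a, b)) for a, b in steps) / len(steps)


# Hypothetical login histories, one list of location labels per session.
history = [
    ["seattle", "seattle", "redmond", "seattle"],
    ["redmond", "seattle", "seattle", "redmond"],
]
prob = fit_transitions(history)
usual = surprise(prob, ["seattle", "redmond", "seattle"])
unusual = surprise(prob, ["seattle", "lagos", "seattle"])
print(f"usual={usual:.3f} unusual={unusual:.3f}")
```

In a sketch like this, an alert would fire only when the surprise score exceeds a threshold. At billions of logins per day that threshold has to be set very conservatively, which is the false positive pressure the abstract describes.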

Yogesh Roy, Principal Engineer, Microsoft

Yogesh Roy is a Principal Applied Machine Learning Manager in the Azure Security Division, working at the intersection of machine learning and cloud security and developing solutions to protect Microsoft cloud customers on services like Azure Active Directory, Azure Resource Manager, Azure Information Protection, KeyVault, etc. He has 20+ years of experience in the software industry, working in areas spanning distributed high-performance computing, cloud services and security, information retrieval, ranking and relevance, and mobile computing. He has worked on many initiatives leveraging machine learning techniques to extract value from big data. At Microsoft, he also worked on Bing search, where he was involved in delivering new search page experiences and improving the ranking and relevance of deep links in the search results.

@conference {215327,
author = {Yogesh Roy},
title = {Security Data Science at Cloud Scale},
year = {2018},
address = {Atlanta, GA},
publisher = {USENIX Association},
month = may