"You Can't Just Turn the Crank": Machine Learning for Fighting Abuse on the Consumer Web

David Freeman, Research Scientist/Engineer, Facebook


Fighting fake registrations, phishing, spam and other types of abuse on the consumer web appears at first glance to be an application tailor-made for machine learning: you have lots of data and lots of features, and you are looking for a binary response (is it an attack or not) on each request. However, building machine learning systems to address these problems in practice turns out to be anything but a textbook process. In particular, you must answer such questions as:

  • How do we obtain quality labeled data?
  • How do we keep models from "forgetting the past"?
  • How do we test new models in adversarial environments?
  • How do we stop adversaries from learning our classifiers?

In this talk I will explain how machine learning is typically used to solve abuse problems, discuss these and other challenges that arise, and describe some approaches that can be implemented to produce robust, scalable systems.

David Freeman is a research scientist/engineer at Facebook working on integrity and abuse problems. He previously led anti-abuse engineering and data science teams at LinkedIn, where he built statistical models to detect fraud and abuse and worked with the larger machine learning community at LinkedIn to build scalable modeling and scoring infrastructure. He has published numerous academic papers on aspects of computer security and recently co-authored a book on Machine Learning and Security, published by O'Reilly. He holds a Ph.D. in mathematics from UC Berkeley and did postdoctoral research in cryptography and security at CWI and Stanford University.

@conference {215319,
author = {David Freeman},
title = {"You Can{\textquoteright}t Just Turn the Crank": Machine Learning for Fighting Abuse on the Consumer Web},
year = {2018},
address = {Atlanta, GA},
publisher = {USENIX Association},
month = may