Deep Entity Classification: Abusive Account Detection for Online Social Networks

Authors: 

Teng Xu, Gerard Goossen, Huseyin Kerem Cevahir, Sara Khodeir, and Yingyezhe Jin, Facebook, Inc; Frank Li, Facebook, Inc, and Georgia Institute of Technology; Shawn Shan, Facebook, Inc, and University of Chicago; Sagar Patel and David Freeman, Facebook, Inc; Paul Pearce, Facebook, Inc, and Georgia Institute of Technology

Abstract: 

Online social networks (OSNs) attract attackers that use abusive accounts to conduct malicious activities for economic, political, and personal gain. In response, OSNs often deploy abusive account classifiers using machine learning (ML) approaches. However, a practical, effective ML-based defense requires carefully engineering features that are robust to adversarial manipulation, obtaining enough ground truth labeled data for model training, and designing a system that can scale to all active accounts on an OSN (potentially in the billions).

To address these challenges we present Deep Entity Classification (DEC), an ML framework that detects abusive accounts in OSNs that have evaded other, traditional abuse detection systems. We leverage the insight that while accounts in isolation may be difficult to classify, their embeddings in the social graph—the network structure, properties, and behaviors of themselves and those around them—are fundamentally difficult for attackers to replicate or manipulate at scale. Our system:

  • Extracts "deep features" of accounts by aggregating properties and behavioral features from their direct and indirect neighbors in the social graph.
  • Employs a "multi-stage multi-task learning" (MS-MTL) paradigm that leverages imprecise ground truth data by consuming, in separate stages, both a small number of high-precision human-labeled samples and a large amount of lower-precision automated labels. This architecture results in a single model that provides high-precision classification for multiple types of abusive accounts.
  • Scales to billions of users through various sampling and reclassification strategies that reduce system load.
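
As a rough illustration of the first point above, the sketch below aggregates per-account features over an account's direct and indirect neighbors. It is a minimal toy under stated assumptions, not the system described in the paper: the graph, the feature names (`account_age_days`, `friend_requests_sent`), the two-hop limit, and the choice of mean/max/min aggregates are all hypothetical, since the abstract does not specify them.

```python
from statistics import mean

# Hypothetical toy social graph: account -> set of direct neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b"},
}

# Hypothetical "direct" features observed on each account in isolation.
direct = {
    "a": {"account_age_days": 400, "friend_requests_sent": 5},
    "b": {"account_age_days": 10, "friend_requests_sent": 90},
    "c": {"account_age_days": 300, "friend_requests_sent": 2},
    "d": {"account_age_days": 2, "friend_requests_sent": 150},
}


def neighbors_within_hops(graph, account, hops):
    """Return every account reachable in at most `hops` steps, excluding `account`."""
    seen = {account}
    frontier = {account}
    for _ in range(hops):
        # Expand one hop outward, skipping accounts that were already visited.
        frontier = {n for node in frontier for n in graph.get(node, ())} - seen
        seen |= frontier
    return seen - {account}


def deep_features(graph, direct, account, hops=2):
    """Aggregate neighbors' direct features into features for `account`.

    The aggregates (mean, max, min) are assumptions chosen for illustration.
    """
    peers = neighbors_within_hops(graph, account, hops)
    names = sorted({name for feats in direct.values() for name in feats})
    out = {}
    for name in names:
        values = [direct[p][name] for p in peers if name in direct.get(p, {})]
        if values:
            out[f"{name}_mean"] = mean(values)
            out[f"{name}_max"] = max(values)
            out[f"{name}_min"] = min(values)
    return out


if __name__ == "__main__":
    for key, value in deep_features(graph, direct, "a").items():
        print(key, value)
```

The intuition this is meant to convey is that an attacker who controls account `a` can change `a`'s own features fairly easily, but changing the aggregated values requires influencing many neighboring accounts as well.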

DEC has been deployed at Facebook, where it continuously classifies all users and has reduced abusive accounts on the network by an estimated 27% beyond those already detected by other, traditional methods.

BibTeX
@inproceedings {263806,
author = {Teng Xu and Gerard Goossen and Huseyin Kerem Cevahir and Sara Khodeir and Yingyezhe Jin and Frank Li and Shawn Shan and Sagar Patel and David Freeman and Paul Pearce},
title = {Deep Entity Classification: Abusive Account Detection for Online Social Networks},
booktitle = {30th USENIX Security Symposium (USENIX Security 21)},
year = {2021},
isbn = {978-1-939133-24-3},
pages = {4097--4114},
url = {https://www.usenix.org/conference/usenixsecurity21/presentation/xu-teng},
publisher = {USENIX Association},
month = aug
}
