VAPD: An Anomaly Detection Model for PDF Malware Forensics with Adversarial Robustness

Side Liu, Wuhan University; Jiang Ming, Tulane University; Yilin Zhou, Jianming Fu, and Guojun Peng, Wuhan University

Malicious PDFs are a prevalent threat in the modern web security landscape, often used as attack vectors in phishing campaigns and other web application attacks. With the widespread integration of PDF readers in browsers, malicious PDFs exploit vulnerabilities in web applications and browsers, posing significant risks. Despite advances in machine learning for malware detection, existing PDF classifiers struggle with adversarial attacks, where minor modifications to malicious files evade detection and lead to serious consequences like ransomware or data breaches.

In this paper, we propose VAPD, an anomaly detection model based on reconstruction with dual forensics objectives: 1) identifying PDF malware through the reconstruction error between input and output, and 2) pinpointing anomalous regions. We strategically leverage the notion that a model exclusively trained on benign samples struggles to reconstruct malicious counterparts, thereby yielding amplified reconstruction errors. We have evaluated VAPD on multiple datasets, including real-world Advanced Persistent Threat samples, achieving an accuracy rate of 99.54% that stands out among existing anomaly detection methods. Moreover, we measure the robustness of VAPD by utilizing four adversarial attack frameworks in both feature and problem spaces. Our findings demonstrate that our model is more robust than the state-of-the-art work. Notably, we achieve this level of performance at a significantly lower training cost, equivalent to only 4.8% of that of the state-of-the-art work. Additionally, VAPD offers an advanced localization capability that outperforms signature-based tools.
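The reconstruction idea described above can be illustrated with a minimal sketch. It is not the VAPD architecture, which the abstract does not specify. As a stand-in, it assumes a one-component linear reconstructor (the top principal direction of the benign data, found by power iteration), three hypothetical per-file features, made-up sample values, and an arbitrary threshold rule of three times the largest benign error.

```python
from math import sqrt
from typing import List, Sequence, Tuple

# Hypothetical per-file features; a real detector would use a far richer input.
FEATURES = ["page_count", "stream_count", "javascript_objects"]

# Made-up benign training samples (assumption, for illustration only).
BENIGN = [
    [1.0, 2.1, 0.0],
    [2.0, 3.9, 0.1],
    [3.0, 6.0, 0.0],
    [4.0, 8.1, 0.1],
    [5.0, 9.9, 0.0],
]


def fit(benign: Sequence[Sequence[float]], iters: int = 100) -> Tuple[List[float], List[float]]:
    """Learn the benign mean and top principal direction via power iteration."""
    n, d = len(benign), len(benign[0])
    mean = [sum(row[j] for row in benign) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in benign]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)] for a in range(d)]
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def feature_errors(x: Sequence[float], mean: Sequence[float], v: Sequence[float]) -> List[float]:
    """Squared reconstruction error per feature (encode to 1-D, then decode)."""
    code = sum((xi - m) * vi for xi, m, vi in zip(x, mean, v))
    recon = [m + code * vi for m, vi in zip(mean, v)]
    return [(xi - ri) ** 2 for xi, ri in zip(x, recon)]


def anomaly_score(x: Sequence[float], mean: Sequence[float], v: Sequence[float]) -> float:
    """Total reconstruction error between input and output."""
    return sum(feature_errors(x, mean, v))


MEAN, DIRECTION = fit(BENIGN)
# Assumed threshold rule: three times the worst benign reconstruction error.
THRESHOLD = 3.0 * max(anomaly_score(x, MEAN, DIRECTION) for x in BENIGN)


def is_anomalous(x: Sequence[float]) -> bool:
    """Objective 1: flag a sample whose reconstruction error exceeds the threshold."""
    return anomaly_score(x, MEAN, DIRECTION) > THRESHOLD


def localize(x: Sequence[float]) -> str:
    """Objective 2: name the feature that contributes the largest error."""
    errors = feature_errors(x, MEAN, DIRECTION)
    return FEATURES[errors.index(max(errors))]


suspicious = [3.0, 6.0, 5.0]  # benign-looking counts plus unusual JavaScript
print(is_anomalous(suspicious), localize(suspicious))
```

Because the reconstructor only ever sees benign data, it can reproduce inputs that follow the benign pattern and fails on the component that departs from it. The same per-feature error that drives detection also indicates where the anomaly lies.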

Category: Short Presentation

BibTeX
@inproceedings {309564,
author = {Side Liu and Jiang Ming and Yilin Zhou and Jianming Fu and Guojun Peng},
title = {{VAPD}: An Anomaly Detection Model for {PDF} Malware Forensics with Adversarial Robustness},
booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
year = {2025},
isbn = {978-1-939133-52-6},
address = {Seattle, WA},
pages = {4759--4778},
url = {https://www.usenix.org/conference/usenixsecurity25/presentation/liu-side},
publisher = {USENIX Association},
month = aug
}