ALOHA: Auxiliary Loss Optimization for Hypothesis Augmentation

Ethan M. Rudd; Felipe N. Ducau; Cody Wild; Konstantin Berlin; Richard Harang

Authors:

Ethan M. Rudd, Felipe N. Ducau, Cody Wild, Konstantin Berlin, and Richard Harang, Sophos

Abstract:

Malware detection is a popular application of Machine Learning for Information Security (ML-Sec), in which an ML classifier is trained to predict whether a given file is malware or benignware. The parameters of this classifier are typically optimized so that the model's outputs over a set of input samples match the samples’ true malicious/benign (1/0) target labels as closely as possible. However, each malware sample often comes with other sources of contextual metadata beyond an aggregate malicious/benign label, including multiple labeling sources and malware type information (e.g., ransomware, trojan), which we can feed to the classifier as auxiliary prediction targets. In this work, we fit deep neural networks to multiple additional targets derived from metadata in a threat intelligence feed for Portable Executable (PE) malware and benignware, including a multi-source malicious/benign loss, a count loss on multi-source detections, and a semantic malware attribute tag loss. We find that incorporating multiple auxiliary loss terms yields a marked improvement in performance on the main detection task. We also demonstrate that these gains likely stem from a more informed neural network representation and are not due to a regularization artifact of multi-target learning. Compared to a similar model with only one target, our auxiliary loss architecture reduces the detection error rate (false negatives) by 42.6% at a false positive rate (FPR) of 10^-3 and by 53.8% at an FPR of 10^-5.
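As a rough illustration of how the auxiliary targets named in the abstract could be combined into one training objective, the sketch below sums a main malicious/benign loss with three auxiliary terms. It is not the authors' implementation: the function name `aloha_style_loss`, the dictionary keys, the weights, and the choice of a Poisson negative log-likelihood for the count term are all assumptions made for this example.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def poisson_nll(mu, count):
    """Negative log-likelihood of an observed count under a Poisson with mean mu > 0.

    Assumption: a Poisson model is used here only to illustrate a "count loss".
    """
    return mu - count * math.log(mu) + math.lgamma(count + 1)


def aloha_style_loss(pred, target, weights):
    """Weighted sum of the main loss and three auxiliary losses (hypothetical helper)."""
    main = bce(pred["malicious"], target["malicious"])
    # One malicious/benign prediction per labeling source (e.g. per vendor).
    vendors = sum(bce(p, y) for p, y in zip(pred["vendors"], target["vendors"]))
    # Predicted vs. observed number of sources that flagged the sample.
    count = poisson_nll(pred["count"], target["count"])
    # One prediction per semantic attribute tag (e.g. ransomware, trojan).
    tags = sum(bce(p, y) for p, y in zip(pred["tags"], target["tags"]))
    return (weights["main"] * main
            + weights["vendors"] * vendors
            + weights["count"] * count
            + weights["tags"] * tags)


# Made-up outputs for a single malicious sample; the values are illustrative only.
pred = {"malicious": 0.9, "vendors": [0.8, 0.7, 0.2], "count": 3.2, "tags": [0.6, 0.1]}
target = {"malicious": 1, "vendors": [1, 1, 0], "count": 3, "tags": [1, 0]}
weights = {"main": 1.0, "vendors": 0.1, "count": 0.1, "tags": 0.1}

total = aloha_style_loss(pred, target, weights)
print(f"combined loss: {total:.4f}")
```

Setting every auxiliary weight to zero reduces this objective to the single-target baseline that the abstract compares against. In a real model the predictions would come from separate output heads sharing one network body, and the weights would need tuning.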

BibTeX

@inproceedings {236304,
author = {Ethan M. Rudd and Felipe N. Ducau and Cody Wild and Konstantin Berlin and Richard Harang},
title = {{ALOHA}: Auxiliary Loss Optimization for Hypothesis Augmentation},
booktitle = {28th USENIX Security Symposium (USENIX Security 19)},
year = {2019},
isbn = {978-1-939133-06-9},
address = {Santa Clara, CA},
pages = {303--320},
url = {https://www.usenix.org/conference/usenixsecurity19/presentation/rudd},
publisher = {USENIX Association},
month = aug
}
