Generative Intrusion Detection and Prevention on Data Stream


HyungBin Seo and MyungKeun Yoon, Kookmin University


Data arrive in a stream (for example, network packets, emails, or malicious files), and ideally every item should be inspected for security threats. Current best practice is to check whether each data item contains suspicious signatures, that is, strings obtained a priori through elaborate manual analysis of previous cyberattacks. Unfortunately, unknown attacks, called zero-day attacks, cannot be detected in a timely manner this way because no signature is available yet. To tackle this problem, recent studies have presented high-speed methods that extract frequent substrings from the data stream and use them as attack signatures, because frequently occurring signatures are often related to attacks. However, more benign signatures are extracted than malicious ones, especially when no attack is under way most of the time. This causes both a tremendous number of false positives and extra human intervention to remove benign signatures. In this paper, we design a new streaming algorithm that first identifies a frequent group of signatures that appear together in data streams. By using this frequent signature group instead of frequently occurring individual signatures, the new scheme achieves high detection accuracy, mitigating the false-positive problem with only a small, fixed amount of memory and a constant number of hash operations, which no previous work has achieved. This improvement comes from a new method for summarizing similar data with a fixed amount of memory, called a minHashed virtual vector, which allows us to automatically identify a frequent group of signatures while reading each data item only once. We perform exhaustive experiments on different private and open datasets to verify both the practical effectiveness and the experimental reproducibility of the new scheme.
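The abstract names MinHash as the basis for summarizing similar data in fixed memory. As a rough illustration of that underlying idea only, the sketch below implements classic MinHash over byte n-grams. It is not the paper's minHashed virtual vector or its signature-group algorithm. The function names, the 4-byte n-gram length, the 64 hash functions, and the use of salted BLAKE2b are all assumptions made for this example.

```python
import hashlib


def shingles(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte substrings (candidate signatures) of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def minhash(items: set, num_hashes: int = 64) -> list:
    """Classic MinHash: for each salted hash function, keep the minimum hash value.

    The result has a fixed length (num_hashes) regardless of how large items is.
    """
    if not items:
        raise ValueError("cannot MinHash an empty set")
    sketch = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        sketch.append(min(
            int.from_bytes(
                hashlib.blake2b(item, digest_size=8, salt=salt).digest(), "big"
            )
            for item in items
        ))
    return sketch


def estimated_jaccard(a: list, b: list) -> float:
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two hypothetical payloads that differ in a single byte, plus unrelated bytes.
p1 = b"GET /index.php?id=1 UNION SELECT password FROM users--"
p2 = b"GET /index.php?id=2 UNION SELECT password FROM users--"
p3 = bytes(range(128, 200))

s1, s2, s3 = (minhash(shingles(p)) for p in (p1, p2, p3))
print(estimated_jaccard(s1, s2))  # high: most 4-byte substrings are shared
print(estimated_jaccard(s1, s3))  # low: no 4-byte substrings are shared
```

Because each sketch has a fixed size, two data items can be compared without storing their full contents, which is the property the abstract relies on when it refers to a small, fixed amount of memory.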


@inproceedings {291263,
author = {HyungBin Seo and MyungKeun Yoon},
title = {Generative Intrusion Detection and Prevention on Data Stream},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {4319--4335},
url = {},
publisher = {USENIX Association},
month = aug
}
