USD: NSFW Content Detection for Text-to-Image Models via Scene Graph

Yuyang Zhang; Kangjie Chen; Xudong Jiang; Jiahui Wen; Yihui Jin; Ziyou Liang; Yihao Huang; Run Wang; Lina Wang

Yuyang Zhang, Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University; Kangjie Chen, Nanyang Technological University; Xudong Jiang, Jiahui Wen, Yihui Jin, and Ziyou Liang, Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University; Yihao Huang, National University of Singapore; Run Wang and Lina Wang, Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University

In recent years, Text-to-Image (T2I) techniques have achieved remarkable success in synthesizing high-quality visual content. However, this advancement has raised significant societal concerns regarding the potential security risks, particularly the generation of unsafe images, such as those containing sexual or violent content. Previous research has primarily focused on classifying unsafe concepts based on overall image features. Extracting abstract harmful concepts directly from concrete image content has proven challenging, though, which limits the effectiveness of existing methods. Our observations reveal that harmful concepts are often embedded in entities and their relationships, particularly in the actions involving these entities. In this work, we propose USD, a novel approach for identifying unsafe scenes. For the first time, we leverage scene graph generation and classification to detect harmful attributes and relationships within images. Our method focuses on defining and detecting unsafe scenes, providing insight into how unsafe images are generated by Text-to-Image models. In three meta-scenarios, our method achieved F1 scores that were, on average, 95.52% higher than baseline approaches. Additionally, USD effectively localized unsafe portions of the image, removing 95% of harmful content while preserving 76.34% of image consistency. This pioneering study highlights the importance of investigating the intent and purpose of unsafe images to enhance the security of T2I models and ensure safer applications of this technology.

Category: Long Presentation

BibTeX

@inproceedings {309482,
author = {Yuyang Zhang and Kangjie Chen and Xudong Jiang and Jiahui Wen and Yihui Jin and Ziyou Liang and Yihao Huang and Run Wang and Lina Wang},
title = {{USD}: {NSFW} Content Detection for {Text-to-Image} Models via Scene Graph},
booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
year = {2025},
isbn = {978-1-939133-52-6},
address = {Seattle, WA},
pages = {879--895},
url = {https://www.usenix.org/conference/usenixsecurity25/presentation/zhang-yuyang},
publisher = {USENIX Association},
month = aug
}
