Knowledge Expansion and Counterfactual Interaction for Reference-Based Phishing Detection


Ruofan Liu, Shanghai Jiao Tong University and National University of Singapore; Yun Lin, Shanghai Jiao Tong University; Yifan Zhang, Penn Han Lee, and Jin Song Dong, National University of Singapore


Phishing attacks have been increasingly prevalent in recent years, significantly eroding societal trust. As a state-of-the-art defense solution, reference-based phishing detection excels in terms of accuracy, timeliness, and explainability. A reference-based solution detects phishing webpages by analyzing their domain-brand consistencies, utilizing a predefined reference list of domains and brand representations such as logos and screenshots. However, the predefined references have limitations in differentiating between legitimate webpages and those of unknown brands. This issue is particularly pronounced when new and emerging brands become targets of attacks.

In this work, we propose DynaPhish as a remedy for reference-based phishing detection, going beyond the predefined reference list. DynaPhish assumes a runtime deployment scenario and (1) actively expands a dynamic reference list, and (2) supports the detection of brandless webpages with convincing counterfactual explanations. For the former, we propose a legitimacy-validation technique for the genuineness of the added references. For the latter, we propose a counterfactual interaction technique to verify the webpage's legitimacy even without brand information. To evaluate DynaPhish, we constructed the largest dynamic phishing dataset consisting of 6344 interactable phishing webpages, to the best of our knowledge. Our experimental results demonstrate that DynaPhish significantly improves the recall of the state-of-the-art approach by 28% while maintaining a negligible cost in precision. Our controlled wild study on the emerging webpages further shows that DynaPhish significantly (1) improves the state-of-the-art by finding on average 9 times more real-world phishing webpages and (2) discovers many unconventional brands as the phishing targets. Our code is available at

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {291106,
author = {Ruofan Liu and Yun Lin and Yifan Zhang and Penn Han Lee and Jin Song Dong},
title = {Knowledge Expansion and Counterfactual Interaction for {Reference-Based} Phishing Detection},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {4139--4156},
url = {},
publisher = {USENIX Association},
month = aug

Presentation Video