USENIX Annual Technical Conference 2003, FREENIX Track

Pp. 63-76 of the Proceedings

Learning Spam: Simple Techniques for Freely-Available Software

Bart Massey, Mick Thomure, Raya Budrevich, and Scott Long, Portland State University

Abstract

The problem of automatically filtering out spam e-mail using a classifier based on machine learning methods is of great recent interest. This paper gives an introduction to machine learning methods for spam filtering, reviewing some of the relevant ideas and work in the open source community. An overview of several feature detection and machine learning techniques for spam filtering is given. The authors' freely-available implementations of these techniques are discussed. The techniques' performance on several different corpora is evaluated. Finally, some conclusions are drawn about the state of the art and about fruitful directions in spam filtering for practitioners of freely-available UNIX software.

Until June 2004, you will need your USENIX membership identification in order to access the full papers. The Proceedings are published as a collective work, © 2003 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.

Last changed: 7 Nov. 2003 jel
