SIFT -- A Tool for Wide-Area Information Dissemination

                  Tak W. Yan   Hector Garcia-Molina

                    Department of Computer Science
                         Stanford University
                          Stanford, CA 94305

                    {tyan, hector}@cs.stanford.edu

                               Abstract

The dissemination model is becoming increasingly important in wide-area
information systems. In this model, the user subscribes to an information
dissemination service by submitting profiles that describe his interests.
He then passively receives new, filtered information. The Stanford
Information Filtering Tool (SIFT) is a tool to help provide such a service. It
supports full-text filtering using well-known information retrieval models.
The SIFT filtering engine implements novel indexing techniques, capable of
processing large volumes of information against a large number of
profiles. It runs on several major Unix platforms and is freely available
to the public. In this paper we present SIFT's approach
to user interest modeling and user-server communication. We demonstrate
the processing capability of SIFT by describing a running server
that disseminates USENET News. We present an empirical study of SIFT's
performance, examining its main memory requirement and ability to scale
with information volume and user population.
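
The idea of filtering each new document against many stored profiles can be
illustrated with a small sketch. This is not SIFT's implementation, and the
abstract does not specify its indexing techniques. The sketch assumes the
simplest case: each profile is a set of keywords that must all appear in a
document, and an inverted index is built over the profiles so that one pass
over a document's words finds every matching profile. The names ProfileIndex,
subscribe, and match are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ProfileIndex:
    """Illustrative inverted index over profiles (not SIFT's actual design).

    Assumption: a profile matches a document when every one of its terms
    occurs in the document (a simple boolean AND model).
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of profiles using it
        self.sizes = {}                   # profile id -> number of distinct terms

    def subscribe(self, profile_id, terms):
        """Register a profile given as an iterable of keywords."""
        distinct = {t.lower() for t in terms}
        self.sizes[profile_id] = len(distinct)
        for term in distinct:
            self.postings[term].add(profile_id)

    def match(self, text):
        """Return the sorted ids of profiles whose terms all occur in text."""
        hits = defaultdict(int)
        # Each distinct document word is looked up once in the profile index.
        for term in set(text.lower().split()):
            for pid in self.postings.get(term, ()):
                hits[pid] += 1
        # A profile matches only if every one of its terms was seen.
        return sorted(pid for pid, n in hits.items() if n == self.sizes[pid])


index = ProfileIndex()
index.subscribe("alice", ["unix", "filtering"])
index.subscribe("bob", ["usenet"])
matched = index.match("Full-text filtering on Unix platforms")
```

In this sketch the work per document depends on the number of distinct words
in the document and the profiles that share them, rather than on the total
number of stored profiles. That property is what makes indexing the profiles
attractive when the user population is large.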
 


Download the full text of this paper in ASCII (36,664 bytes) and POSTSCRIPT (201,086 bytes) form.
