Check out the new USENIX Web site.


USENIX, The Advanced Computing Systems Association

2006 USENIX Annual Technical Conference Abstract

Pp. 375–380 of the Proceedings

Efficient Query Subscription Processing for Prospective Search Engines

Utku Irmak, Polytechnic University; Svilen Mihaylov, University of Pennsylvania; Torsten Suel, Polytechnic University; Samrat Ganguly and Rauf Izmailov, NEC Laboratories America

Abstract

Current web search engines are retrospective in that they limit users to searches against already existing pages. Prospective search engines, on the other hand, allow users to upload queries that will be applied to newly discovered pages in the future. Some examples of prospective search are the subscription features in Google News and in RSS-based blog search engines.

In this paper, we study the problem of efficiently processing large numbers of keyword query subscriptions against a stream of newly discovered documents, and propose several query processing optimizations for prospective search. Our experimental evaluation shows that these techniques can improve the throughput of a well known algorithm by more than a factor of 20, and allow matching hundreds or thousands of incoming documents per second against millions of subscription queries per node.

  • View the full text of this paper in HTML and PDF. Listen to the presentation in MP3 format.
    Click here if you have forgotten your password Until June 2007, you will need your USENIX membership identification in order to access the full papers. The Proceedings are published as a collective work, © 2006 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.

  • If you need the latest Adobe Acrobat Reader, you can download it from Adobe's site.
To become a USENIX Member, please see our Membership Information.

Last changed: 15 Sept. 2006 ch