
A Text Retrieval Package for the Unix Operating System


Liam R. E. Quin
SoftQuad Inc.
(lee@sq.com)

Abstract

This paper describes lq-text, an inverted index text retrieval package written by the author. Inverted index text retrieval provides a fast and effective way of searching large amounts of text. It works by building an index of all the natural-language words that occur in the text. The text itself remains unaltered in place or, if desired, can be compressed or archived; the index allows rapid searching even if the data files have been removed altogether.
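The general idea of an inverted index can be illustrated with a minimal sketch. This is not lq-text's implementation or file format; the names build_index and search, the sample documents, and the simple letters-only definition of a word are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of ASCII letters; a real package would
# apply its own rules for word boundaries, case, and common words.
WORD = re.compile(r"[A-Za-z]+")


def build_index(documents):
    """Map each lowercased word to a list of (document name, word position)."""
    index = defaultdict(list)
    for name, text in documents.items():
        for position, match in enumerate(WORD.finditer(text)):
            index[match.group().lower()].append((name, position))
    return index


def search(index, word):
    """Return the sorted names of the documents that contain the word."""
    return sorted({name for name, _ in index.get(word.lower(), [])})


# Hypothetical sample data.
documents = {
    "a.txt": "Inverted index text retrieval is fast",
    "b.txt": "grep scans the text of every file",
}
index = build_index(documents)
del documents  # the lookups below consult only the index, not the text

print(search(index, "text"))  # ['a.txt', 'b.txt']
print(search(index, "grep"))  # ['b.txt']
```

A lookup costs one dictionary access however much text was indexed, whereas a scanning tool such as grep must read every file on each search. Because the index records where each word occurs, a search can still be answered after the original files have been compressed, archived, or removed.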

The design and implementation of lq-text are discussed, and performance measurements are given for comparison with other text searching programs such as grep and agrep. The functionality provided is compared briefly with other packages such as glimpse and zbrowser.

The lq-text package is available in source form, has been successfully integrated into a number of other systems and products, and is in use at over 100 sites.


Download the full text of this paper in ASCII (54,410 bytes) and POSTSCRIPT (264,871 bytes) form.
