Mining Longest Repeated Subsequences to Predict World Wide Web Surfing

Jim Pitkow; Peter Pirolli

Modeling and predicting user surfing paths involves tradeoffs between model complexity and predictive accuracy. In this paper, we explore predictive modeling techniques that attempt to reduce model complexity while retaining predictive accuracy. We show that, compared to various Markov models, longest repeating subsequence models significantly reduce model size while retaining the ability to make accurate predictions. In addition, sharp increases in the overall predictive capabilities of these models are achievable through modest increases in the number of predictions made.

Jim Pitkow, Xerox PARC

Peter Pirolli, Xerox PARC

BibTeX

@inproceedings {271508,
author = {Jim Pitkow and Peter Pirolli},
title = {Mining Longest Repeated Subsequences to Predict World Wide Web Surfing},
booktitle = {Second USENIX Symposium on Internet Technologies \& Systems (USITS 99)},
year = {1999},
address = {Boulder, CO},
url = {https://www.usenix.org/conference/usits-99/mining-longest-repeated-subsequences-predict-world-wide-web-surfing},
publisher = {USENIX Association},
month = oct
}

Download

Links

Paper:

http://usenix.org/publications/library/proceedings/usits99/full_papers/pitkow/pitkow.pdf

Paper (HTML):

http://usenix.org/publications/library/proceedings/usits99/full_papers/pitkow/pitkow_html/index.html
