Next: Analysis of HTTP Latency Up: Background Previous: Background

Caching in the HTTP Protocol

The HyperText Transfer Protocol (HTTP) [4] supports caching in clients (i.e., browsers) and in intermediate servers known as proxy-caching servers. To display a page, a client without a cached copy unconditionally sends a page request to the content provider or to a proxy-caching server. A client that already has the page cached may either use the cached copy or contact another host to determine whether the page has changed. This check for page currency is done by sending an HTTP GET request with an If-Modified-Since header carrying the timestamp of the cached copy. If a proxy-caching server is used, it can answer a page request or a currency check itself, provided that it has a cached copy that is deemed usable and that the client has not specified that the cache must not be used (with a Pragma: no-cache directive, most commonly sent when a user tells a browser to reload a fresh copy of a page).
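As an illustration of the request forms described above, the following Python sketch builds the text of a page request. The function name build_page_request, its parameters, the HTTP/1.0 request line, and the example URL are assumptions made for this sketch and do not come from the paper. It also assumes that a forced reload is sent as an unconditional request.

```python
from email.utils import formatdate
from typing import Optional


def build_page_request(url: str,
                       cached_timestamp: Optional[float] = None,
                       force_reload: bool = False) -> str:
    """Return the text of an HTTP GET request for `url` (hypothetical helper).

    cached_timestamp: timestamp of the cached copy in seconds since the
        epoch, or None if the client has no cached copy.
    force_reload: True when the user asked the browser for a fresh copy.
    """
    lines = [f"GET {url} HTTP/1.0"]
    if force_reload:
        # Tell any proxy-caching server on the path not to use its cache.
        lines.append("Pragma: no-cache")
    elif cached_timestamp is not None:
        # Currency check: request the page only if it changed since then.
        http_date = formatdate(cached_timestamp, usegmt=True)
        lines.append(f"If-Modified-Since: {http_date}")
    # A blank line terminates the request headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_page_request("http://www.example.com/index.html",
                             cached_timestamp=0))
```

A client with no cached copy calls the helper with only the URL, which yields the unconditional request.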

The proxy-caching server decides whether the cached page is usable and, if so, whether the client's cached copy is current. It bases these decisions on optional information that clients can send with their requests and that HTTP servers can return with a page. Beyond the Pragma: no-cache directive mentioned above, the client may specify a bound on the age of the cached copy it is willing to accept (the Cache-control: max-age directive). Content providers can specify when the page was last modified (the Last-Modified field), whether the page can be cached (the Cache-control: no-cache field), and how long a client or proxy-caching server should cache the page (the Expires field). Dynamic data, often the output of a CGI script, is typically sent with no Last-Modified timestamp and set to expire from the cache immediately (equivalent to disabling caching).
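The decision described above can be sketched as a small function. This is a minimal sketch, not the algorithm of any particular proxy: the names CachedPage, ClientRequest, and proxy_decision, the three result strings, and the choice to treat a copy with no Expires value as usable are all assumptions made for illustration. Times are plain numbers in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedPage:
    """What the proxy remembers about a page (hypothetical structure)."""
    fetched_at: float                      # when the proxy stored the copy
    last_modified: Optional[float] = None  # Last-Modified, if the server sent one
    expires: Optional[float] = None        # Expires, if the server sent one
    cacheable: bool = True                 # False if sent with Cache-control: no-cache


@dataclass
class ClientRequest:
    """Optional caching information sent by the client (hypothetical structure)."""
    no_cache: bool = False                      # Pragma: no-cache
    max_age: Optional[float] = None             # Cache-control: max-age
    if_modified_since: Optional[float] = None   # If-Modified-Since


def proxy_decision(entry: Optional[CachedPage],
                   request: ClientRequest,
                   now: float) -> str:
    """Return "FORWARD", "NOT_MODIFIED", or "SERVE_CACHED"."""
    # The client forbade use of the cache, or there is nothing usable cached.
    if request.no_cache or entry is None or not entry.cacheable:
        return "FORWARD"
    # The copy has expired. Assumption: no Expires value means still usable.
    if entry.expires is not None and now >= entry.expires:
        return "FORWARD"
    # The copy is older than the client is willing to accept.
    if request.max_age is not None and now - entry.fetched_at > request.max_age:
        return "FORWARD"
    # Usable copy: answer a currency check if the client's copy is current.
    if (request.if_modified_since is not None
            and entry.last_modified is not None
            and entry.last_modified <= request.if_modified_since):
        return "NOT_MODIFIED"
    return "SERVE_CACHED"
```

Under this sketch, dynamic data stored with an Expires value equal to its fetch time is never served from the cache, which matches the "equivalent to disabling caching" behavior described in the text.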

Currently, if a page has no Last-Modified timestamp, checking the freshness of a cached copy requires retrieving a fresh copy from the content provider and shipping it all the way to the client browser. Similarly, any change to a page that does have a timestamp requires the proxy to obtain the file from the content provider and transmit the entire file to the client; the transmission is elided only if the page has not been modified at all.
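The all-or-nothing nature of the currency check can be made concrete with a short sketch of how a content provider might answer it. The function name respond_to_currency_check and its parameters are hypothetical; the status codes 200 (full page) and 304 (Not Modified) are the standard HTTP ones.

```python
from typing import Optional, Tuple


def respond_to_currency_check(body: bytes,
                              last_modified: Optional[float],
                              if_modified_since: Optional[float]) -> Tuple[int, bytes]:
    """Return (status code, bytes to transmit) for a currency check."""
    if (last_modified is not None
            and if_modified_since is not None
            and last_modified <= if_modified_since):
        # Unchanged page with a timestamp: the transfer is elided.
        return 304, b""
    # No timestamp, or any change at all: the entire page is sent again.
    return 200, body
```

Even a one-byte change to a large page therefore costs a full retransmission, and a page with no timestamp costs one on every check.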


Gaurav Banga
Tue Nov 12 20:47:38 EST 1996