
Basic Read Protocol

As mentioned, the replicated data content needs to support flexible query operations (reads); computing the result of such a query can be a very intensive task if it requires applying an aggregation function over the entire data content (a complex join for a database, or a grep Expression Path request in the case of a file system). Therefore, in order to decrease the workload on the master servers and improve performance, read requests are handled by the slaves.

When a client wants to perform a read, it sends the request to its assigned slave server. The slave executes the request and constructs a ``pledge'' packet which contains a copy of the request, the secure hash (SHA-1 [1]) of the result, and the latest time-stamped $ content\_version $ value received from the master. After signing this ``pledge'' packet, the slave sends it to the client, together with the result of the query.
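The slave's side of the protocol can be sketched as follows. This is an illustrative Python sketch, not the paper's implementation: the field names, the JSON packet encoding, and the use of a keyed HMAC (standing in for the slave's public-key signature) are all assumptions; only the SHA-1 result hash, the request copy, the $ content\_version $ value, and the time-stamp come from the text.

```python
import hashlib
import hmac
import json

# Stand-in for the slave's signing key; the real protocol would use a
# public-key signature rather than an HMAC (assumption for this sketch).
SLAVE_KEY = b"demo-slave-key"

def make_pledge(request: str, result: bytes,
                content_version: int, timestamp: float) -> dict:
    """Build and sign a ``pledge'' packet for a query result."""
    pledge = {
        "request": request,                               # copy of the client's request
        "result_hash": hashlib.sha1(result).hexdigest(),  # SHA-1 of the query result
        "content_version": content_version,               # latest value from the master
        "timestamp": timestamp,                           # master's time-stamp
    }
    # Sign a canonical encoding of the packet fields.
    body = json.dumps(pledge, sort_keys=True).encode()
    pledge["signature"] = hmac.new(SLAVE_KEY, body, hashlib.sha1).hexdigest()
    return pledge
```

The slave would then ship both the signed pledge and the raw query result back to the client.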

At the other end, the client first computes the secure hash of the query result and makes sure it matches the hash in the ``pledge'' packet. Then, the client verifies the slave's signature on the ``pledge'' packet. Finally, the client makes sure the time-stamp is not older than $ max\_latency $. If all these conditions are met, the client accepts the answer; otherwise, it rejects it.
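The client's three checks can be sketched as below. As with the slave-side sketch, the packet layout, JSON encoding, and HMAC signature (in place of a public-key signature) are assumptions for illustration; the three-step verification order follows the text.

```python
import hashlib
import hmac
import json

SLAVE_KEY = b"demo-slave-key"  # stand-in for the slave's key (assumption)
MAX_LATENCY = 5.0              # illustrative max_latency value, in seconds

def verify_answer(pledge: dict, result: bytes, now: float) -> bool:
    """Return True iff the result is authentic and fresh enough."""
    # 1. Recompute the secure hash of the result and compare.
    if hashlib.sha1(result).hexdigest() != pledge["result_hash"]:
        return False
    # 2. Verify the slave's signature over the unsigned packet fields.
    unsigned = {k: v for k, v in pledge.items() if k != "signature"}
    body = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(SLAVE_KEY, body, hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected, pledge["signature"]):
        return False
    # 3. Reject answers whose time-stamp is older than max_latency.
    return now - pledge["timestamp"] <= MAX_LATENCY
```

A tampered result fails step 1, a forged pledge fails step 2, and a stale time-stamp fails step 3.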

Because of the asynchronous nature of the network connection between the client and the slave, it is possible that a result that was ``fresh'' when sent by the slave becomes stale by the time it reaches the client. In such a situation, the client has to drop the answer and retry the query. By carefully selecting the value for $ max\_latency $ and the frequency with which masters send ``keep-alive'' packets, the probability of such events occurring can be reduced. However, clients with very slow or unreliable network connections may never be able to get fresh-enough responses. One possible way to accommodate such clients is to relax the consistency model and allow $ max\_latency $ to be set by the clients themselves. In this case, clients with fast network connections can set high ``freshness'' requirements, while clients with slow connections can settle for more modest expectations.
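The retry-on-stale behavior with a client-chosen $ max\_latency $ might look like the following sketch. The function names, the bounded retry count, and the callback interface are all illustrative assumptions; the text itself only requires that a stale answer be dropped and the query retried.

```python
def read_with_retry(issue_query, is_fresh, max_latency: float,
                    attempts: int = 3) -> bytes:
    """Retry a read until a sufficiently fresh answer arrives.

    issue_query() -> (pledge, result): sends the query to the slave.
    is_fresh(pledge, max_latency) -> bool: the client's freshness check.
    max_latency: chosen by this client; slow clients can pick a larger value.
    """
    for _ in range(attempts):
        pledge, result = issue_query()
        if is_fresh(pledge, max_latency):
            return result  # fresh enough for this client's requirements
        # Otherwise the answer went stale in transit: drop it and retry.
    raise TimeoutError("no sufficiently fresh response after retries")
```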

The main vulnerability of this basic read protocol is that a malicious slave can return wrong answers to client requests. To protect against such malicious slaves, we employ two techniques: probabilistic checking and auditing, which are described in the next two sections.


Popescu Bogdan
2003-06-11