We redesigned the way file download requests are served in order to exploit data sharing opportunities across concurrent transfers of files whose total storage footprint exceeds the server buffer cache. We introduce the file download problem to formalize the system design objectives, and analytically examine the tradeoffs across different resource management approaches. We specify the Circus algorithm for scheduling disk accesses and network transfers in the file download problem setting, introduce metrics for evaluating the performance of download servers, and briefly describe a prototype implementation of the Circus algorithm. We experimentally compare the performance of our system against currently used download server implementations. We find that the average file download time with Circus remains close to the minimum across different workloads, and is several times lower than with conventional implementations. Additionally, Circus more than doubles the server network throughput in several cases, and reduces the required disk bandwidth by an order of magnitude in comparison to existing systems.
Building on the flexibility that the application-layer framing principle offers for the design of distributed applications, we demonstrate dramatic improvements in performance metrics measured on both the server and the client side of a download service. It remains an interesting open question how integrating application packet boundaries into the transport protocol can simplify the development of services potentially restricted by the traditional semantics of reliable data transport, and further enhance the resource utilization and performance of the overall system. Combining out-of-order transfers with selective acknowledgments might further reduce the buffer space requirements and improve sharing. Experimentation with streaming applications or multi-source download services, and investigation of how file segmentation affects the performance benefits of data transfer reordering, are other interesting issues that deserve further research. More detailed analytical modeling of the system behavior should shed additional light on the stability of the proposed service under specific operating conditions.
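To make the notion of out-of-order transfer under application-layer framing concrete, the following minimal sketch shows how a client could reassemble a file from self-describing blocks that arrive in arbitrary order. It is an illustration only and not the Circus algorithm or its prototype: the class `BlockAssembler`, its fields, and the fixed block size are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class BlockAssembler:
    """Hypothetical client-side buffer: rebuilds a file from blocks
    that each carry their own index, so arrival order does not matter."""
    file_size: int
    block_size: int
    data: bytearray = field(init=False)
    pending: set = field(init=False)

    def __post_init__(self):
        self.data = bytearray(self.file_size)
        # Ceiling division: the last block may be shorter than block_size.
        n_blocks = (self.file_size + self.block_size - 1) // self.block_size
        self.pending = set(range(n_blocks))

    def receive(self, index, payload):
        """Store one block; duplicates and unknown indices are ignored."""
        if index not in self.pending:
            return
        start = index * self.block_size
        expected = min(self.block_size, self.file_size - start)
        if len(payload) != expected:
            raise ValueError("payload length does not match block boundary")
        self.data[start:start + expected] = payload
        self.pending.discard(index)

    def complete(self):
        """True once every block of the file has been received."""
        return not self.pending


# Usage: blocks delivered in the order 2, 0, 1 still yield the original file.
content = b"abcdefghij"
assembler = BlockAssembler(file_size=len(content), block_size=4)
for i in (2, 0, 1):
    assembler.receive(i, content[i * 4:(i + 1) * 4])
```

Because every block identifies its own position, a server is free to send whichever block it currently holds in memory to all clients that still need it, which is the kind of reordering freedom that the sharing argument above relies on.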