A major challenge in designing a download server is building a system that adapts automatically to the characteristics of different clients, without intervention from the system administrator. One basic characteristic that can vary across clients requesting the same file is the capacity of their network link to the server. In our experience, the more closely the transfer rates of two clients match, the easier it is to exploit data sharing between them. As the difference grows, it becomes much harder to share cached data efficiently between the clients.
In Figure 8, we measure the network throughput of an unmodified (sequential) server and a Circus (out-of-order) server across clients with different link capacities, when they download a single file of size 512MB. In typical ftpd implementations (including the one that we use here), each active download request spawns an extra server process with a resident memory footprint of about 1MB. Consequently, we only show T1 measurements for up to 30-40% load, which corresponds to roughly 200 concurrent clients. Beyond this point, significant paging activity makes performance measurements less meaningful.
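The relation between load and the number of concurrent T1 clients can be checked with a short calculation. This is only a sketch under stated assumptions: it treats load as the aggregate client demand divided by the server link capacity, takes 1Gb/s as 1024Mbit/s (the convention that reproduces the 51.2MByte/s figure used later in this section), and uses the nominal T1 line rate of 1.544Mbit/s. The function and constant names are illustrative and do not come from the measured system.

```python
# Assumption: load = aggregate client demand / server link capacity.
SERVER_LINK_MBIT = 1024.0  # 1 Gb/s taken as 1024 Mbit/s (assumed convention)
T1_MBIT = 1.544            # nominal T1 line rate in Mbit/s
PROCESS_MB = 1.0           # resident memory per ftpd process, as stated in the text


def clients_at_load(load, client_mbit, server_mbit=SERVER_LINK_MBIT):
    """Whole number of clients of one link rate that fit within the given load."""
    return int(load * server_mbit / client_mbit)


if __name__ == "__main__":
    for load in (0.3, 0.4):
        n = clients_at_load(load, T1_MBIT)
        print(f"{load:.0%} load: {n} T1 clients, "
              f"about {n * PROCESS_MB:.0f} MB of ftpd processes")
```

Under these assumptions, 30% load corresponds to just under 200 T1 clients, which is consistent with the figure quoted above.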
In all three cases with a single client link capacity (a-c), we observe that the out-of-order network throughput increases proportionally with the system load. In particular, when we set the system load to 40%, we expect a throughput of 51.2MByte/s (= 0.4 x 1Gb/s), which is roughly what we actually observe in cases (b) and (c). When we combine clients of different rates on the same server (case d), the measured throughput is somewhat lower, but still reaches 50MByte/s at 50% load. Quite remarkably, the sequential case only matches the out-of-order performance at 10% load in all four cases, and never exceeds 30MByte/s (on average) as the load becomes higher.
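The expected throughput quoted above follows from simple arithmetic, sketched below. The sketch assumes that 1Gb/s is counted as 1024Mbit/s, since that is the convention under which 40% load yields exactly 51.2MByte/s; the function name is illustrative only.

```python
def expected_throughput_mbyte(load, link_mbit=1024.0):
    """Ideal network throughput in MByte/s at a given fractional load.

    Assumes the server link is 1 Gb/s counted as 1024 Mbit/s.
    """
    return load * link_mbit / 8.0


if __name__ == "__main__":
    for load in (0.1, 0.4, 0.5):
        print(f"{load:.0%} load: {expected_throughput_mbyte(load):.1f} MByte/s")
```

By the same arithmetic, the ideal value at 50% load is 64MByte/s, so the 50MByte/s measured for the mixed case (d) is roughly 78% of that ideal.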
The poor performance of the sequential transfers is explained by Figure 9, where we show the throughput of the data disk. With sequential transfers, the disk is highly utilized even at low loads, regardless of the link capacity of the clients. In contrast, when we enable out-of-order transfers (a-c), the disk throughput drops to the link capacity of the fastest single client. For example, the disk throughput is about 1MByte/s with 10T transfers (b), an order of magnitude lower than in the sequential case. When we mix clients of different capacities (d), this observation still holds at low loads, with the disk throughput at about 5.6MByte/s. At higher loads, the proportion of non-sharing (independent) clients increases, raising the disk throughput accordingly. These observations are further verified in Figure 10. With out-of-order transfers, the download duration of each request remains roughly constant across system loads and is determined by the link capacity of the requesting client. With sequential transfers, in contrast, the download duration grows rapidly as a function of the system load.
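The disk throughput observed with out-of-order transfers can be approximated by an idealized model in which all clients share a single pass over the file, so the disk only needs to keep up with the fastest client. The sketch below is an illustration under that assumption, not a description of the Circus implementation. It uses the nominal T1 and T3 line rates (1.544 and 44.736Mbit/s), and the client mix is hypothetical.

```python
T1_MBIT = 1.544    # nominal T1 line rate in Mbit/s
T3_MBIT = 44.736   # nominal T3 line rate in Mbit/s


def shared_disk_throughput_mbyte(client_rates_mbit):
    """Idealized disk read rate in MByte/s when every client shares cached data.

    With perfect sharing, each block is read from disk once, at the pace
    of the fastest client; slower clients are served from the cache.
    """
    return max(client_rates_mbit) / 8.0


if __name__ == "__main__":
    # Hypothetical mix of slow and fast clients downloading the same file.
    mixed = [T1_MBIT] * 20 + [T3_MBIT] * 5
    print(f"{shared_disk_throughput_mbyte(mixed):.2f} MByte/s")
```

With a T3 client as the fastest in the mix, the model gives about 5.6MByte/s, which agrees with the low-load disk throughput reported for the mixed case (d).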