Comparing Hummingbird with UFS: single thread.

**Table 2:** Comparing Hummingbird with UFS, UFS-async, and UFS-soft when files greater than 64 KB are not cached.
file system	disk size	main memory (MB)	proxy hit rate	FS read time (ms)	FS write time (ms)	# of disk I/Os	mean disk I/O time (ms)
Hummingbird	4 GB	256	0.62	1.81	0.32	1,161,163	5.14
UFS-async	4 GB	128+128	0.64	6.13	2.54	4,807,440	4.66
UFS-soft	4 GB	128+128	0.64	5.54	21.07	10,737,300	4.97
UFS	4 GB	128+128	0.64	5.93	20.77	10,238,460	5.02
Hummingbird	4 GB	1024	0.63	1.05	0.03	552,882	5.75
UFS-async	4 GB	512+512	0.64	3.21	2.35	3,217,440	3.95
UFS-soft	4 GB	512+512	0.64	3.27	20.85	9,714,420	4.76
UFS	4 GB	512+512	0.64	3.92	18.43	9,022,200	4.63
Hummingbird	8 GB	256	0.64	2.03	0.32	1,194,494	5.64
UFS-async	8 GB	128+128	0.67	5.69	1.47	4,605,360	4.34
UFS-soft	8 GB	128+128	0.67	5.78	15.66	9,058,141	4.86
UFS	8 GB	128+128	0.67	6.17	12.83	8,313,360	4.63
Hummingbird	8 GB	1024	0.64	1.20	0.03	578,562	6.37
UFS-async	8 GB	512+512	0.67	4.05	1.34	3,561,180	4.13
UFS-soft	8 GB	512+512	0.67	3.74	15.65	8,148,721	4.62
UFS	8 GB	512+512	0.67	3.81	15.70	7,611,840	4.68

We compared Hummingbird with three versions of UFS on FreeBSD 4.1: UFS, which is UFS mounted synchronously (the default); UFS-soft, which is UFS with soft updates; and UFS-async, which is UFS mounted asynchronously, so that meta-data updates are not synchronous and the file system is not guaranteed to be recoverable after a crash. We used a version of Hummingbird with a single working thread, in which the daemons were called explicitly every 1000 log events. Table 2 presents comparisons for two disk sizes, 4 GB and 8 GB, and two memory sizes, 256 MB and 1024 MB, when files greater than 64 KB are not cached. For the UFS configurations, the memory was split evenly between the Squid cache and the file system buffer cache³.

The proxy-perceived latency in Table 2 is the FS read time (5th column). Hummingbird's smaller file system read time is due to hits in main memory, which result from grouping files in locality sets into clusters. Hummingbird's smaller file system write time (6th column) compared with UFS-async is due to cluster writes, which write multiple files to disk in a single operation. The FS write times for UFS and UFS-soft are greater than for UFS-async because of the synchronous file create operation.

The effectiveness of the clustered reads and writes and of the collocation strategy is illustrated by the number of disk I/Os. In all test configurations, Hummingbird issued substantially fewer disk I/Os than any of the UFS configurations. Note also that the number of disk I/Os in the UFS experiments is larger than the total number of requests in the log, because a single file operation can result in multiple disk I/Os. This also explains why UFS read and write operations (as seen in the FS read and write times) are slower than individual disk I/Os. The mean disk I/O time is larger in Hummingbird because each request transfers a cluster, which is larger than the mean data transfer size in UFS.
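To make the I/O comparison concrete, the following sketch takes the four Table 2 rows for the 4 GB disk and 256 MB of memory and computes how many times more disk I/Os each UFS variant issued than Hummingbird. It also estimates an aggregate disk time as the I/O count multiplied by the mean I/O time. The function names and the aggregate-time estimate are our own illustration, not part of the measurement methodology, and the product ignores any overlap between requests, so it should be read as a rough approximation only.

```python
# Rows copied from Table 2 (4 GB disk, 256 MB total memory):
# system name -> (number of disk I/Os, mean disk I/O time in ms)
TABLE2_4GB_256MB = {
    "Hummingbird": (1_161_163, 5.14),
    "UFS-async": (4_807_440, 4.66),
    "UFS-soft": (10_737_300, 4.97),
    "UFS": (10_238_460, 5.02),
}


def io_ratio(rows, system, baseline="Hummingbird"):
    """Return how many times more disk I/Os `system` issued than `baseline`."""
    return rows[system][0] / rows[baseline][0]


def total_disk_seconds(rows, system):
    """Approximate aggregate disk time (s): I/O count times mean I/O time.

    This is an illustrative estimate, not a measured quantity.
    """
    count, mean_ms = rows[system]
    return count * mean_ms / 1000.0


if __name__ == "__main__":
    for name in TABLE2_4GB_256MB:
        ratio = io_ratio(TABLE2_4GB_256MB, name)
        seconds = total_disk_seconds(TABLE2_4GB_256MB, name)
        print(f"{name:12s} {ratio:5.2f}x I/Os  ~{seconds:8.0f} s of disk time")
```

Although each Hummingbird I/O takes slightly longer on average, the much smaller I/O count dominates the estimated aggregate disk time.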

**Figure 2:** Request throughput from Table 2.

The throughput for each experiment in Table 2 is shown in Figure 2. Figure 2 shows that Hummingbird throughput is much higher than that of UFS, UFS-soft, and UFS-async on the same disk size and memory size. This is not quite a fair comparison, since the proxy hit rate is lower with Hummingbird. (We do not expect the experiment run time to increase by more than 10% when the Hummingbird policies are set so that it has a hit rate equivalent to wg-Squid.) The throughput is larger for Hummingbird because much less time is spent in disk I/O. Using throughput as the comparison metric, we see that Hummingbird is 2.3-4.0 times faster than simulated Squid running on UFS-async, 5.6-8.4 times faster than simulated Squid running on UFS-soft, and 5.4-9.4 times faster than simulated Squid running on UFS. These ranges also include the results from Table 3.
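If we assume that every experiment replays the same request log, throughput is inversely proportional to experiment run time, so a speedup can be estimated as a ratio of run times. The sketch below does this for the run times listed in Table 3. The names `RUN_TIME_S` and `speedup` are hypothetical, and the same-log assumption is ours. Run times for the Table 2 experiments are not listed in this section, so the ratios computed here cover only part of the data behind the ranges quoted above and need not match them exactly.

```python
# Experiment run times in seconds, copied from Table 3 (all files cached).
# Keyed by (disk size, total memory in MB), then by file system.
RUN_TIME_S = {
    ("4 GB", 256): {"Hummingbird": 6362, "UFS-async": 25510,
                    "UFS-soft": 59475, "UFS": 59807},
    ("4 GB", 1024): {"Hummingbird": 6030, "UFS-async": 16384,
                     "UFS-soft": 52377, "UFS": 49749},
    ("8 GB", 256): {"Hummingbird": 7923, "UFS-async": 24657,
                    "UFS-soft": 51797, "UFS": 51062},
    ("8 GB", 1024): {"Hummingbird": 6802, "UFS-async": 18240,
                     "UFS-soft": 37097, "UFS": 45044},
}


def speedup(config, system, baseline="Hummingbird"):
    """Throughput of `baseline` relative to `system` for one configuration.

    Assumes both runs processed the same number of requests, so the
    throughput ratio equals the inverse ratio of the run times.
    """
    times = RUN_TIME_S[config]
    return times[system] / times[baseline]


if __name__ == "__main__":
    for config in RUN_TIME_S:
        for system in ("UFS-async", "UFS-soft", "UFS"):
            print(config, system, f"{speedup(config, system):.2f}x")
```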

**Table 3:** Comparing Hummingbird with UFS, UFS-async, and UFS-soft when all files are cached.
file system	disk size	main memory (MB)	proxy hit rate	FS read time (ms)	FS write time (ms)	# of disk I/Os	mean disk I/O time (ms)	experiment run time (s)
Hummingbird	4 GB	256	0.60	1.68	0.39	1,349,175	4.17	6,362
UFS-async	4 GB	128+128	0.62	6.61	2.88	5,134,380	4.72	25,510
UFS-soft	4 GB	128+128	0.62	5.81	22.91	11,858,100	5.02	59,475
UFS	4 GB	128+128	0.62	6.13	22.67	11,283,060	5.05	59,807
Hummingbird	4 GB	1024	0.61	0.88	0.52	630,362	5.62	6,030
UFS-async	4 GB	512+512	0.62	3.67	2.70	3,919,620	3.98	16,384
UFS-soft	4 GB	512+512	0.62	3.51	22.89	10,868,700	4.84	52,377
UFS	4 GB	512+512	0.62	4.24	20.23	10,115,400	4.69	49,749
Hummingbird	8 GB	256	0.64	2.17	0.40	1,464,727	5.03	7,923
UFS-async	8 GB	128+128	0.66	6.42	2.30	5,017,920	4.67	24,657
UFS-soft	8 GB	128+128	0.66	6.14	19.31	10,461,841	4.98	51,797
UFS	8 GB	128+128	0.66	7.37	16.42	9,954,540	4.89	51,062
Hummingbird	8 GB	1024	0.62	1.18	0.51	695,771	6.41	6,802
UFS-async	8 GB	512+512	0.66	4.66	1.90	4,016,400	4.32	18,240
UFS-soft	8 GB	512+512	0.67	3.69	15.71	8,158,080	4.61	37,097
UFS	8 GB	512+512	0.66	4.24	18.90	8,894,760	4.82	45,044

The experiments for Table 2 assumed that files greater than 64 KB were not cached by the proxy. We obtained similar results when assuming that the proxy caches all files; see Table 3. Note that the proxy hit rate in Table 3 is generally lower than in Table 2. This is the result of the cache being "polluted" with large files, which cause some smaller files to be evicted. The end result is that there are fewer hits, which translate into less file system activity and fewer file accesses.
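The size of the hit-rate change can be read directly off the two tables. As a small illustration, the sketch below pairs the proxy hit rates from Table 2 and Table 3 for Hummingbird and UFS-async and reports the drop, if any, for each configuration. The dictionary layout and the helper name `hit_rate_drop` are our own, and UFS-async memory is keyed by its total (128+128 as 256, 512+512 as 1024).

```python
# Proxy hit rates as (Table 2, Table 3) pairs.
# Table 2: files > 64 KB not cached; Table 3: all files cached.
# Keyed by file system, then by (disk size, total memory in MB).
HIT_RATES = {
    "Hummingbird": {
        ("4 GB", 256): (0.62, 0.60),
        ("4 GB", 1024): (0.63, 0.61),
        ("8 GB", 256): (0.64, 0.64),
        ("8 GB", 1024): (0.64, 0.62),
    },
    "UFS-async": {
        ("4 GB", 256): (0.64, 0.62),
        ("4 GB", 1024): (0.64, 0.62),
        ("8 GB", 256): (0.67, 0.66),
        ("8 GB", 1024): (0.67, 0.66),
    },
}


def hit_rate_drop(system, config):
    """Hit rate in Table 2 minus hit rate in Table 3, rounded to 2 places."""
    table2, table3 = HIT_RATES[system][config]
    return round(table2 - table3, 2)


if __name__ == "__main__":
    for system, configs in HIT_RATES.items():
        for config in configs:
            print(system, config, hit_rate_drop(system, config))
```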