With the advent of high-speed LAN technologies such as Gigabit Ethernet, IP-networked storage has become increasingly common in client-server environments. The availability of 10 Gb/s Ethernet in the near future is likely to further accelerate this trend. IP-networked storage is broadly defined to be any storage technology that permits access to remote data over IP. The traditional method for networking storage over IP is to simply employ a network file system such as NFS. In this approach, the server makes a subset of its local namespace available to clients; clients access meta-data and files on the server using an RPC-based protocol (see Figure 1(a)).
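As a concrete illustration of the file-access model, the commands below sketch a typical NFS setup; the export path, host names, and mount point are hypothetical examples, not part of our testbed configuration:

```shell
# On the server: export part of the local namespace to a client.
# Hypothetical /etc/exports entry (hostname and path are examples):
#   /export/data  client.example.com(rw,sync)
exportfs -ra   # re-read /etc/exports and apply the export

# On the client: mount the exported directory. Subsequent file and
# meta-data operations on /mnt/data travel to the server as NFS RPCs.
mount -t nfs server.example.com:/export/data /mnt/data
```

Note that the file system itself runs on the server; the client sees only the file-level namespace it exports.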
In contrast to this widely used approach, an alternative approach for accessing remote data is to use an IP-based storage area networking (SAN) protocol such as iSCSI. In this approach, a remote disk exports a portion of its storage space to a client. The client handles the remote disk no differently from its local disks--it runs a local file system that reads and writes data blocks to the remote disk. Rather than accessing blocks from a local disk, the I/O operations are carried out over a network using a block access protocol (see Figure 1(b)). In the case of iSCSI, remote blocks are accessed by encapsulating SCSI commands into TCP/IP packets.
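For comparison, a block-access setup might look as follows. This is an illustrative sketch only: the target name, portal address, and device name are hypothetical, and the exact commands depend on the initiator implementation (the `iscsiadm` tool of a Linux iSCSI initiator is assumed here):

```shell
# Discover the targets exported by the remote disk and log in to one.
iscsiadm -m discovery -t sendtargets -p 192.168.1.10
iscsiadm -m node -T iqn.2004-01.com.example:disk1 -p 192.168.1.10 --login

# The remote storage now appears as a local block device (e.g., /dev/sdb).
# The client runs a *local* file system on it; block reads and writes are
# carried as SCSI commands encapsulated in TCP/IP packets.
mkfs -t ext3 /dev/sdb
mount /dev/sdb /mnt/data
```

Here, unlike in the NFS case, the file system resides entirely at the client, and only block operations cross the network.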
The two techniques for accessing remote data employ fundamentally different abstractions. Whereas a network file system accesses remote data at the granularity of files, SAN protocols access remote data at the granularity of disk blocks. We refer to these techniques as file-access and block-access protocols, respectively. Observe that, in the former approach, the file system resides at the server, whereas in the latter approach it resides at the client (see Figure 1). Consequently, the network I/O consists of file operations (file and meta-data reads and writes) for file-access protocols and block operations (block reads and writes) for block-access protocols.
Given these differences, it is not a priori clear which protocol type is better suited for IP-networked storage. In this paper, we take a first step towards addressing this question. We use NFS and iSCSI as specific instantiations of file- and block-access protocols and experimentally compare their performance. Our study specifically assumes an environment where a single client machine accesses a remote data store (i.e., there is no data sharing across machines), and we study the impact of the abstraction level and caching on the performance of the two protocols.
Using a Linux-based storage system testbed, we carefully micro-benchmark three generations of the NFS protocol--NFS versions 2, 3, and 4--as well as iSCSI. We also measure application performance on the two systems using a suite of data-intensive and meta-data-intensive benchmarks such as PostMark, TPC-C, and TPC-H. We choose Linux as our experimental platform, since it is currently the only open-source platform to implement all three versions of NFS as well as the iSCSI protocol. The choice of Linux presents some challenges, since there are known performance issues with the Linux NFS implementation, especially for asynchronous writes and server CPU overhead. We perform a detailed analysis to separate protocol behavior from the idiosyncrasies of the Linux implementations of NFS and iSCSI that we encounter during our experiments.
Broadly, our results show that, for environments in which storage is not shared across machines, iSCSI and NFS are comparable for data-intensive workloads, while the former outperforms the latter by a factor of two for meta-data intensive workloads. We identify aggressive meta-data caching and aggregation of meta-data updates in iSCSI as the primary reasons for this performance difference. We propose enhancements to NFS to extract these benefits of meta-data caching and update aggregation.
The rest of this paper is structured as follows. Section 2 provides a brief overview of NFS and iSCSI. Sections 3, 4, and 5 present our experimental comparison of NFS and iSCSI. Implications of our results are discussed in Section 6. Section 7 discusses the limitations of NFS observed in our study and proposes an enhancement. Section 8 discusses related work, and we present our conclusions in Section 9.