
Performance Evaluation

We examine the performance of measure calls invoked through: (i) the kernel file_mmap LSM hook, (ii) the kernel load_module function, and (iii) user space applications writing measure requests into /sys/security/measure.

We first examine the overhead of the file_mmap LSM security hook, which measures all executable content and dynamic libraries. This is by far the most frequently called and most performance-sensitive measurement hook. To determine its latency, we measure the latency of the mmap system call from user level, which invokes the file_mmap LSM hook. Our latency measurement (including both mapping and unmapping) considers three different cases, namely no_SHA1, SHA1, and SHA1+extend. no_SHA1 represents the case when file_mmap finds the target in the cache as clean. In the very rarely observed SHA1 case, the target file is remeasured and its SHA1 fingerprint is recalculated; however, the TPM is not extended because the fingerprint is found to be already in the cache. SHA1+extend represents the case when a brand-new file is measured and the resulting fingerprint needs to be extended into the TPM chip. This happens more often at system start or after system updates, for example. Since the goal is to measure latency, we use a test file size of 2 bytes. The implementation of the micro-benchmarks is based on the HBench framework [16]. Table 1 shows the results.

Table 1: Latency of the file_mmap LSM hook (file size 2 bytes).

  mmap type     mmap latency (stdev)   file_mmap LSM overhead
  no_SHA1       1.73 $\mu$s (0.0)      0.08 $\mu$s
  SHA1          4.21 $\mu$s (0.0)      2.56 $\mu$s
  SHA1+extend   5430 $\mu$s (1.3)      5430 $\mu$s
  reference     1.65 $\mu$s (0.0)      n/a

For reference purposes, we include the running time of an mmap system call that does not invoke the file_mmap LSM measurement hook. It is clear from the table that the overhead of the file_mmap LSM hook in the case of a clean cache hit (no_SHA1) is minimal: it takes 0.08 $\mu$s (1.73 - 1.65 $\mu$s) to run. It does little more than read the dirty-flag information from the inode of the file to be mapped. Fortunately, our experience indicates that this is the majority case, accounting for more than 99.9% of all measure calls, even on servers that tend to run for a long time.

When the file is remeasured (SHA1), the mmap system call takes about 4.21 $\mu$s, an overhead of about 2.5 $\mu$s over the reference value. This case shows the overhead of setting up the file for measurement and searching the hash table for a matching fingerprint. Note that this case does not include the cost of the fingerprinting itself, since the file size is only 2 bytes; fingerprinting performance is discussed below. The extend operation is clearly the most expensive, taking about 5 milliseconds to execute. This is understandable, because the extend operation interacts with the TPM chip as well as creating a new measurement list entry. As mentioned before, these two cases together represent less than 0.1% of all measure calls. Thus, we are confident (and our experience confirms) that the performance penalty our system imposes on the user for measuring executables will be negligible.
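The three cases above can be sketched as a decision function over the measurement cache. The following user-space sketch is illustrative only; the type and field names are assumptions, not the kernel implementation:

```c
#include <stdbool.h>

/* Hypothetical sketch of the three measurement cases
 * (no_SHA1, SHA1, SHA1+extend) for a measured file. */
enum measure_result { NO_SHA1, SHA1_ONLY, SHA1_EXTEND };

struct cache_entry {
    bool present;   /* file already has a cache entry          */
    bool dirty;     /* file was written since last measurement */
    bool fp_known;  /* recomputed fingerprint already cached   */
};

static enum measure_result measure(struct cache_entry *e)
{
    if (e->present && !e->dirty)
        return NO_SHA1;       /* clean cache hit: no hashing at all   */
    /* remeasure: recompute the SHA1 fingerprint here */
    e->dirty = false;
    if (e->present && e->fp_known)
        return SHA1_ONLY;     /* fingerprint unchanged: no TPM extend */
    e->present = true;
    e->fp_known = true;
    return SHA1_EXTEND;       /* new fingerprint: extend the TPM      */
}
```

The latency ordering in Table 1 follows directly from how much work each branch does: a flag check, a hash recomputation, or a hash plus a TPM interaction.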

Invoking a measurement from user level comprises (i) opening /sys/security/measure, (ii) writing the measure request, and (iii) closing /sys/security/measure. This method applies to measuring configuration files or interpreted script files (e.g., bash scripts or source files). As with the file_mmap LSM hook, we distinguish here, too, the three cases no_SHA1, SHA1, and SHA1+extend. The results are shown in Table 2.

Table 2: Latency of user-level measurements via sysfs (file size 2 bytes).

  Measurement via sysfs          Overhead (stdev)
  measure: no_SHA1               4.32 $\mu$s (0.0)
  measure: SHA1                  7.50 $\mu$s (0.0)
  measure: SHA1+extend           5430 $\mu$s (1.6)
  reference: open/write/close    4.32 $\mu$s (0.0)

The user-level measurement latency is 4.32 $\mu$s in the no_SHA1 case. This overhead is mostly file-system related (open, write, and close of /sys/security/measure), as the reference value in Table 2 indicates. The measurement-related overhead in the no_SHA1 case simply disappears in the context-switching and file-system overhead. Interpreting the other measurement values is straightforward.
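The open/write/close sequence can be sketched as below. The helper name is hypothetical, and the sysfs entry is taken as a parameter so the sketch can be exercised against any writable file; the real interface is /sys/security/measure, which only exists when the measurement kernel is running:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch of a user-level measure request:
 * open the sysfs entry, write the pathname of the file to
 * be measured, and close.  Returns 0 on success, -1 on error. */
static int request_measurement(const char *sysfs_entry, const char *target)
{
    int fd = open(sysfs_entry, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, target, strlen(target));
    close(fd);
    return n == (ssize_t)strlen(target) ? 0 : -1;
}
```

Since each request pays the full open/write/close cost, the no_SHA1 measurement work is hidden entirely inside that file-system overhead, matching the identical 4.32 $\mu$s reference value in Table 2.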

Kernel modules can be measured in two ways, as described in Section 5.1: by the user-level applications insmod and modprobe, or by invoking a measurement routine before the kernel module is relocated in the load_module function, which is called by the init_module system call. Measuring them via insmod or modprobe transfers kernel module measurement into the domain of user-level measurements, with the overhead described in Table 2. The latency of measuring kernel modules in the load_module kernel function is almost the same as that of measuring executable content in the file_mmap LSM measurement hook. However, because kernel modules are already in memory before they are relocated, there is no dirty-flag information; thus there are no clean hits, only the cases SHA1 and SHA1+extend. We consider kernel module loading an infrequent and less time-critical event, and we therefore recommend from a security standpoint (see Section 5.1) that modules be measured in the kernel.

Next, we present the fingerprinting performance as a function of file size. We measure the mmap system call's running time in the SHA1 case while varying the input file size. This includes the reference overhead of 1.65 $\mu$s for the pure mmap system call, as shown in Table 1. The results are shown in Table 3. When the file is large, the fingerprinting overhead can be significant: measuring a 128-Kbyte file, for example, takes about 1.5 milliseconds. The running time increases nearly linearly with file size. These latencies translate into a throughput of about 80 MBytes per second.

Table 3: Performance of the SHA1 fingerprinting operation as a function of file size.

  File Size (Bytes)   Overhead (stdev)
  2                   4.21 $\mu$s (0.0)
  512                 10.3 $\mu$s (0.0)
  1K                  16.3 $\mu$s (0.0)
  16K                 197 $\mu$s (0.1)
  128K                1550 $\mu$s (1.1)
  1M                  12700 $\mu$s (16)
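As a rough sanity check on the 80 MBytes-per-second figure, the throughput can be derived from the Table 3 latencies after subtracting the 1.65 $\mu$s pure-mmap reference cost from Table 1. The helper below is illustrative, not part of the measurement system:

```c
/* Derive SHA1 fingerprinting throughput (MB/s) from a Table 3
 * latency.  Subtracting the 1.65 us mmap reference cost from
 * Table 1 isolates the hashing time; bytes / seconds gives
 * throughput (1 MB taken as 10^6 bytes). */
static double throughput_mb_s(double bytes, double latency_us)
{
    const double mmap_reference_us = 1.65; /* reference row, Table 1 */
    double hash_us = latency_us - mmap_reference_us;
    return (bytes / (hash_us * 1e-6)) / 1e6;
}
```

Applied to the 128K and 1M rows, this yields roughly 80 MB/s in both cases, consistent with the near-linear growth of the running time.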

For in-memory kernel modules, we expected slightly better SHA1 throughput than for measuring files in the file_mmap LSM hook as described in Table 1, since files first have to be read from disk into memory. However, our measurements yielded only slightly better performance than the file_mmap case shown in Table 3. We attribute this to the Linux file cache: the measurements were repeated many times on the same file with a hot cache, so it is very likely that almost the complete file was already resident in the file cache when the measurement started. This also suggests that the throughput numbers in Table 3 should be considered optimistic for file measurements.

These experiments were run with a measurement list containing about 1000 entries on an IBM Netvista M desktop workstation with an Intel Pentium 2.4 GHz processor and 1 GByte of RAM. All non-essential services were stopped.

sailer 2004-05-18