MobiSys '05 Paper   
Reincarnating PCs with Portable SoulPads
Ramón Cáceres Casey Carter Chandra Narayanaswami Mandayam Raghunath {caceres, chandras, mtr}@us.ibm.com, casey@carter.net Authors listed in alphabetical order
Abstract

The ability to walk up to any computer, personalize it, and use it as one’s own has long been a goal of mobile computing research. We present SoulPad, a new approach based on carrying an auto-configuring operating system along with a suspended virtual machine on a small portable device. With this approach, the computer boots from the device and resumes the virtual machine, thus giving the user access to his personal environment, including previously running computations. SoulPad has minimal infrastructure requirements and is therefore applicable to a wide range of conditions, particularly in developing countries. We report our experience implementing SoulPad and using it on a variety of hardware configurations. We address challenges common to systems similar to SoulPad, and show that the SoulPad model has significant potential as a mobility solution.

1 Introduction

Today’s laptop computers give users two highly desirable features. One is the ability to suspend a computing session (e.g., running applications, open windows) and resume it later, perhaps at a different location. The other is access to their personal and familiar software environment (e.g., applications, files, preferences) wherever they are. In spite of this convenience, a major drawback of this model is that the user has to carry a fairly bulky device. In addition, though docking stations allow the user to use a larger display and attach some peripherals, the user is limited to the capabilities of the hardware integrated in the portable computer, such as the processor and memory.

Before the advent of portable computers, there were two main approaches to suspending a session in one location and resuming it at another. One method was based on process migration between the machines at the two locations [3, 17]. Another technique was to move just the user interface and graphical windows across stationary machines while continuing to run the application processes on a single machine [11, 15].
There are several solutions that store the user’s data on a central server to make it possible for a user to log in to one of several machines that are connected to the server and have a common startup environment [16]. More recent solutions to this problem have centered on the use of virtual machines. For example, in Internet Suspend/Resume (ISR) [7, 8] the user’s computation state is stored as a check-pointed virtual machine image in the network when computation is suspended, and retrieved from the network when computation is resumed at a machine that has similar base software. ISR has since explored using a portable storage device as a cache [18].
Figure 1: SoulPad architecture and use.
In this paper we present SoulPad, a portable device carrying the software stack shown in Figure 1, that allows a user to walk up to a hitherto unseen personal computer and resume a personal computing session that was suspended on another machine. The SoulPad approach exploits portable storage devices, fast local wired connections, auto-configuring operating systems and virtual machine technology, while coexisting with the widely deployed PC ecosystem. In summary, we decouple the user’s machine into a body (display, CPU, RAM, I/O) and a soul (session state, software, data, preferences). The soul is carried in a small and light portable device, the SoulPad. The soul can reincarnate on any one of a large class of x86-based personal computers with no preloaded software, and effectively convert that computer into the user’s computer. The computers on which the SoulPad can reincarnate itself are denoted EnviroPCs. We presently rely on USB 2.0 connections between the SoulPad and the EnviroPC. The EnviroPC’s CPU, memory and I/O devices are used to run the software on the SoulPad.

There are several practical advantages to our method. The first is that the SoulPad has no battery and thus the user need not worry about recharging it. The second is that no network connectivity is required to retrieve suspended state. Another advantage is that the EnviroPCs do not require any preloaded software and thus can be unmanaged. In fact, the EnviroPCs can be diskless and can be relegated to pieces of furniture that don’t require constant monitoring for viruses. Since all software running on the EnviroPC comes from the SoulPad and belongs to the user, the user does not have to trust a preinstalled operating system on the EnviroPC. Our approach also allows the user to exploit the full capabilities of the EnviroPC, for example a high-resolution display or a fast processor.
Finally, by resorting to a fast wired connection between the SoulPad and the EnviroPC, we avoid the problems associated with wireless connections between the two devices, namely device disambiguation and association, and the power consumed by wireless communication as in the Intel Personal Server [21]. We observe that these advantages are in addition to the general benefits of virtualization, such as encapsulation and easier system migration.

We believe that the SoulPad approach could change the way computers are built and used. If the software on the internal disks adopts the SoulPad stack, users will be able to easily migrate from one machine to another by simply moving the disk. For example, a business professional could insert his disk into a light and compact laptop for travel, into a larger but more powerful laptop for regular use, and even into a wearable computer with an eyeglass-mounted display if necessary. Obviously, the disk attachment interface must be compatible with the different form factors.

Our method is also particularly well suited to developing countries, where a large class of society cannot afford to buy computers and keep them connected to the Internet. Voltage fluctuations and power outages also add to the problem. Shared community PCs provide a solution in such environments. For example, many people use web-based applications from public places. However, this solution does not address the personalization and environment preservation issue. By moving to a model where users own the SoulPad and borrow or rent the EnviroPC, we can reduce their investment and offer them personalization and environment preservation across suspend and resume cycles.

Our approach has only recently become feasible. Technical advances in storage devices have made it possible to carry small disk drives that fit in a pocket and hold upwards of 60GB for around US$150 (in May 2005). Flash storage of several gigabytes already fits on a key fob.
Clearly we can expect tens of gigabytes to fit on smaller and cheaper portable and wearable devices over time. Atomic Force Microscope (AFM)-based storage technologies such as Millipede [20] can have a density of 125 GB per square inch, ten times higher than the densest magnetic storage available today. Already several portable music players such as the Apple iPod and some digital image viewers feature large drives. Interfaces like USB 2.0 provide sustained data access rates of more than 150Mbps, leading to acceptable resume and suspend times.

Compared with today’s laptop model, the disadvantages of our method include the performance degradation due to virtualization, and longer resume times. In addition, portable devices are susceptible to loss or damage, but regularly backing up the contents of the SoulPad can address this issue.

We have implemented our solution and report our results in this paper. We address suspend and resume times and how they vary with disk, processor, and interconnect speeds; runtime overheads caused by virtualization and use of an external disk; practical issues that arise due to evolution in processor architecture; and security issues.

We first present the software architecture of SoulPad, followed by our implementation and experimental results. We then discuss some of the issues we had to deal with as we moved from concept to prototype, and some challenges that remain. A number of these issues are relevant to other efforts such as ISR [7, 8] and the Stanford Collective [13] that also use virtual machine technology for mobility. Finally, we discuss some of the related work that has helped shape our solution. Throughout this paper, we use the term SoulPad to refer both to the design of our system and to any device that embodies that design.

2 Architecture

2.1 Components

We want SoulPads to work with a wide range of x86 EnviroPCs without relying on a preinstalled operating system.
We also want to allow the user to preserve session state across EnviroPCs. In order to meet these needs, the software stack on the SoulPad has the following three components:
1. A Host OS that boots on EnviroPCs and addresses hardware diversity via auto-configuration.
2. A Virtual Machine Monitor (VMM) that can suspend/resume virtual machines and supports Guest OS diversity.
3. A Virtual Machine (VM) that runs the user’s applications on a Guest OS of the user’s choosing.

While booting on an EnviroPC, the auto-configuring Host OS discovers the hardware characteristics and I/O devices of the EnviroPC, and configures itself to the hardware present by installing appropriate driver modules. Auto-configuration is a requirement for this layer since the SoulPad has to boot on an EnviroPC that it may not have seen before. This characteristic contrasts with a traditional operating system that goes through a separate install phase. Once this step is complete, the Host OS provides a known environment for the next layer, namely the Virtual Machine Monitor.

The VMM runs a virtual machine, relying on the underlying Host OS for any services that the VM requires. The VM provides an environment on which the user’s operating system and applications (also stored on the SoulPad) are run. Since the user’s computing environment runs on top of a VM, it is possible for the VMM to suspend the user’s session state and resume it later. The suspended session state is also stored on the SoulPad.

The user can suspend his session, then shut down the VMM layer and the Host OS, and walk away with his SoulPad. The user can later attach the SoulPad to a different EnviroPC, start the Host OS and the VMM layer, then load the suspended session state, resume it, and continue his session. If the user’s tasks do not require network access, the PC may be completely disconnected from the network.
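The suspend and resume sequence described above can be sketched as a toy model. This is only an illustration of the ordering of the layers, not the prototype's code; the class, field, and machine names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoulPad:
    """Toy model of the three-layer stack; every name here is illustrative."""
    session: dict = field(default_factory=dict)   # Guest OS session state
    attached_to: Optional[str] = None              # EnviroPC currently in use
    log: List[str] = field(default_factory=list)   # ordered record of steps

    def resume(self, enviro_pc: str) -> None:
        """Boot the Host OS, start the VMM, then resume the suspended VM."""
        if self.attached_to is not None:
            raise RuntimeError("already running on " + self.attached_to)
        self.attached_to = enviro_pc
        self.log.append("boot Host OS on " + enviro_pc + " (auto-configure)")
        self.log.append("start VMM")
        self.log.append("load suspended VM state and resume")

    def suspend(self) -> None:
        """Suspend the VM, then shut down the VMM and the Host OS."""
        if self.attached_to is None:
            raise RuntimeError("not attached to an EnviroPC")
        self.log.append("suspend VM (state saved on the SoulPad)")
        self.log.append("shut down VMM and Host OS")
        self.attached_to = None


pad = SoulPad()
pad.resume("office-pc")
pad.session["open_document"] = "draft.txt"   # work done inside the Guest OS
pad.suspend()
pad.resume("home-pc")                         # a different EnviroPC
print(pad.session)
```

The session dictionary travels with the device, so it is unchanged after the move to the second machine, which is the property the architecture is designed to provide.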
Figure 2: A sample of USB 2.0 portable disks used as SoulPads. Clockwise from upper left: LaCie 40GB DataBank, LaCie 60GB PocketDrive, and IBM 40GB Portable Hard Drive.

The generality of our three-level architecture allows users a choice of personal computing environments, from Windows to Linux to any other OS that can run on the VMs provided by the VMM. Users can even maintain multiple Guest OS environments on the same SoulPad, each OS running in its own VM.

2.2 Issues addressed

The issues we addressed in the course of building our SoulPad prototype are listed below.

· Performance: Working on a VM introduces some overhead when compared to working on bare hardware. Using an external disk instead of an internal disk could make the situation worse. We therefore evaluate the suspend/resume and operational performance of SoulPad.
· Security and privacy: Portable devices are prone to theft and loss. We safeguard privacy by encrypting the user data stored on a SoulPad. Moreover, since the software on EnviroPCs may not be trustworthy, we rely only on their hardware and firmware.
· Reliability: Portable devices are prone to damage and loss. We implemented a way to recover the contents of SoulPads from a backup source.
· Hardware independence: There are many hardware differences between PCs. Some of these differences are hidden by VM technology, but others are exposed to the Guest OS and its applications. We need to determine across how wide a range of PCs SoulPad will operate.

The following section describes how we addressed some of these issues with our implementation. Later sections will return to discuss these and other challenges in more detail.

3 Implementation

3.1 Overview

We used off-the-shelf USB 2.0 portable disks as SoulPad devices. Figure 2 shows some examples. These devices are much smaller and lighter than portable PCs. For example, the LaCie 40GB DataBank measures 4.4 x 2.5 x 0.6 inches and weighs 4.8 ounces.
In contrast, a latest-generation “ultraportable” notebook computer like the IBM ThinkPad X40 measures 10.5 x 8.3 x 1.06 inches and weighs 2.7 pounds. Despite their small size, these portable disks have comparable storage capacity to notebook and laptop PCs, e.g., 40-60 GB. They in fact use the same hard-disk technology. Given the popularity of portable PCs, it follows that SoulPads can satisfy the storage needs of large numbers of users. To implement the software architecture described in the previous section, we made the following choices:
1. Knoppix for the auto-configuring Host OS.
2. VMware Workstation for the VMM.
3. Windows or Linux for the Guest OS.
Knoppix [24] provides us with the zero-install and auto-configuration features that SoulPad needs from a Host OS. Knoppix is a version of GNU/Linux distributed as a single bootable CD that includes the Linux kernel and a range of applications. Knoppix enables users to get a familiar Linux desktop along with their favorite applications on almost any PC without having to install any software on the PC’s hard disk. The bootloader from the CD loads a Linux kernel and an in-memory disk image called the Initial RAM Disk. Subsequently, Knoppix scans for devices, loads the appropriate device drivers, initializes discovered network interfaces, generates an appropriate X11 configuration for the discovered display hardware, and carries out other auto-configuration steps. These steps are necessary because Knoppix does not have prior knowledge of the hardware configuration of PCs on which it boots. An in-memory filesystem is created for read-write data. All of the applications, libraries and other read-only data reside on a compressed filesystem on the CD, which is mounted using a loopback device in the kernel. The compressed filesystem approach enables Knoppix to pack almost 2 Gigabytes of data onto a single 700MB CD. All of the local session state created by the user typically resides in the in-memory filesystem and is lost when the user shuts down Knoppix. Some users combine a Knoppix CD with a small USB flash key where they store their personal files and other persistent data. We create a SoulPad disk by first installing Knoppix on a USB hard disk, using the hard-disk install script that comes with Knoppix. We also install a bootloader on the USB disk that loads the kernel and the Initial RAM Disk in the same manner as the bootloader on a Knoppix CD. We had to make a few modifications to the Initial RAM Disk and startup scripts, for example to ensure that USB-related kernel modules were loaded before trying to mount the root file system from the USB disk. 
With these changes, we were able to take the USB disk from one machine to another and boot a Knoppix environment.

While Knoppix by itself enables users to walk up to any PC and personalize it with their Linux environment, there is no easy way for users to preserve their computing state as they move from one machine to another because Knoppix needs a full reboot every time it is moved. Knoppix users are also limited to that one OS. We installed VMware Workstation [24] on top of Knoppix to support suspend/resume of user sessions as well as OS diversity. We then created virtual machines on which we installed Windows XP Professional or a Linux variant as the Guest OS.

We automated the SoulPad suspend and resume sequences so that each runs to completion after an initial user action. Users initiate suspend by selecting the VMware Workstation suspend operation on their screens. After the VM suspends, Knoppix shuts down and powers down the machine. At this point the user can disconnect the SoulPad from one PC and connect it to another. Users initiate a resume operation by powering up the new PC so that it boots from the SoulPad. The PC boots into Knoppix, which starts VMware Workstation, which resumes the Guest OS session.

On our SoulPad disks we created a 4GB partition to hold Knoppix and a 2GB partition to serve as swap space. The remaining disk space is available for sharing among VM images. For example, on a 40GB disk holding only one VM, 34GB are available for the Guest OS environment.

3.2 Encrypted virtual machine image

To protect user data if a SoulPad is misplaced or stolen, we encrypt the disk partition that holds the VM images using the AES128 block cipher. We used the publicly available loop-aes package for Linux in our implementation. The encryption key is generated by hashing a user-supplied passphrase. After the Host OS boots, it prompts the user to enter the passphrase.
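As a rough illustration of the key handling in this section, the sketch below hashes a passphrase into a 128-bit key and draws a fresh random key for each session's swap space. The choice of SHA-256, the truncation to 16 bytes, and the function names are assumptions made for the example; the text does not say which hash is used, and this does not reproduce the actual key setup of loop-aes.

```python
import hashlib
import os

MIN_PASSPHRASE_CHARS = 20   # minimum passphrase length stated in this section
AES128_KEY_BYTES = 16       # AES-128 uses a 128-bit key


def derive_image_key(passphrase: str) -> bytes:
    """Hash a passphrase into a 16-byte key (assumed: SHA-256, truncated)."""
    if len(passphrase) < MIN_PASSPHRASE_CHARS:
        raise ValueError("passphrase must be at least 20 characters long")
    digest = hashlib.sha256(passphrase.encode("utf-8")).digest()
    return digest[:AES128_KEY_BYTES]


def new_swap_key() -> bytes:
    """Random key for one session; swap contents need not survive a reboot."""
    return os.urandom(AES128_KEY_BYTES)


key = derive_image_key("a passphrase of sufficient length")
print(len(key), len(new_swap_key()))
```

Because the key is a deterministic function of the passphrase, nothing derived from it needs to be stored on the device. A wrong passphrase simply yields a different key, and the decrypted data will not look like a valid filesystem.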
If the user supplies an incorrect passphrase, the resulting hash will not correspond to the AES key and the mount operation will fail since the decrypted data will not correspond to a valid filesystem. In order to defeat brute force attacks that attempt to guess the passphrase, the loop-aes package requires the passphrase to be at least 20 characters long. For convenience, we permit users to supply this passphrase via an auxiliary USB flash key.

While the Guest OS partition is mounted, the AES key is retained in kernel memory. When the partition is unmounted, the AES key is erased from memory. It is never stored on disk.

At run time it is possible that the Host OS swaps out pages holding the user’s Guest OS state to the swap partition on the SoulPad. We also use loop-aes to encrypt the swap partition to prevent user data from appearing in plaintext form on the SoulPad. The key for the swap partition is auto-generated for each session since swap state does not have to be preserved across Host OS boot cycles.

SoulPad never writes to the internal disk on an EnviroPC. Therefore, there is no risk of leaving sensitive data on the PC’s persistent storage after disconnecting.

3.3 Networking configuration

At resume time, if the EnviroPC is connected to a network with a working DHCP server, the Host OS will obtain an IP address and establish network connectivity. The VMware Workstation virtual machines are configured to use Network Address Translation (NAT) to connect to the external network through the Host OS. Thus, the Guest OS enjoys network connectivity whenever the Host OS does.

In short, from a networking perspective, a SoulPad suspend followed by a resume is similar to suspending a laptop at one location and resuming it at another location. Many networked applications already support suspend and resume of laptops, e.g., email and instant messaging clients. They simply attempt to re-establish their network connections at resume time.
Similarly, in the case of SoulPad, such applications running inside the Guest OS re-establish their connections when they are able to do so. In some cases, the resume may happen outside an intranet and some network resources may not be reachable unless the user establishes a VPN connection into the intranet. Laptop users are already familiar with such situations and the behavior is identical while using a SoulPad.

3.4 Backups

In our enterprise environment we have configured backups from the SoulPad to Tivoli Storage Manager (TSM), a file-level networked backup service. Whenever the SoulPad is connected to a PC that has connectivity to the TSM server, we perform an incremental backup of the SoulPad. If a user loses his SoulPad, a copy of it can be re-created from the backup server. Again, this model is similar to the situation where a user loses his laptop and has to recover data from the most recent backup.

On our prototype SoulPads, we have configured incremental backups both at the Host OS and Guest OS levels. At the Host OS level any changes to Guest OS files appear as changes to the large binary files corresponding to the VMware Workstation virtual disks. Our current backup implementation does not handle minor modifications to large binary files very well, as it simply treats the file as having changed and transfers the entire file to the backup server. So, we specifically exclude these files from the files backed up at the Host OS level. Instead we rely on the incremental backups at the Guest OS level to back up the modified Guest OS files. In the future we propose to investigate better backup schemes that handle large binary files.

We do not back up the suspended virtual machine state at suspend time because this would add considerable latency to the suspend operation. This means that if the user loses the SoulPad, he also loses the latest session state and must reboot the VM after recovering SoulPad state from backup.
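The Host OS level exclusion described above can be sketched as a simple path filter. The patterns below are the file extensions VMware conventionally uses for virtual disks and suspended state; the paper does not list the patterns actually configured, so treat both the patterns and the function name as assumptions. This is not TSM configuration syntax.

```python
from fnmatch import fnmatch

# Assumed patterns: VMware virtual disks, suspended state, memory snapshots.
EXCLUDE_PATTERNS = ("*.vmdk", "*.vmss", "*.vmem")


def host_level_backup_set(paths):
    """Return the paths to back up at the Host OS level.

    Large VM image files are skipped here because a small change inside the
    Guest OS would otherwise cause the whole file to be transferred again.
    """
    return [
        p for p in paths
        if not any(fnmatch(p.lower(), pat) for pat in EXCLUDE_PATTERNS)
    ]


# Hypothetical file names, for illustration only.
candidates = [
    "/etc/fstab",
    "/vm/winxp/winxp.vmdk",
    "/vm/winxp/winxp.vmss",
    "/home/knoppix/.bashrc",
]
print(host_level_backup_set(candidates))
```

Files inside the virtual disk are still covered, because the Guest OS runs its own incremental backup and sees them as ordinary small files.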
In environments with poor infrastructure where managed network backup services are not viable, it is possible to perform local backups using LAN-connected devices such as Mirra [24], or backups to a second locally connected USB storage device.

4 Experimental Results

We confirmed the usability of SoulPad through a variety of experiments. These experiments fall into three main categories: resume and suspend latencies, application response times, and hardware independence. This section describes our methodology and results.

The SoulPad software stack used in all experiments consisted of Knoppix 3.4, VMware Workstation 4.5.1, and Windows XP Professional. In addition, VMware Tools was installed in the Guest OS of all VMs.
Table 1: Characteristics of disks used.

We used disks with varying characteristics, as shown in Table 1. The transfer rates are averages of 10 runs of the hdparm -t Linux command. This command measures how fast the drive can sustain sequential data reads, without file-system buffering effects. All transfer rates were measured on a NetVista desktop PC, using a USB 2.0 connection for all but the IDE disk. As shown, the IDE disk has close to twice the transfer rate of the fastest USB disk. There are also large differences in transfer rates among USB disks.

4.1 Resume and suspend latencies

Resume and suspend latencies are key to SoulPad’s usability. We define resume latency as the time between when the user powers up the SoulPad-EnviroPC combination, and when the VM has finished resuming, i.e., when the user can continue working. We define suspend latency as the time between when the user requests that the VM be suspended, and when the Host OS has saved modified state to the SoulPad and shut down, i.e., when the user can walk away with his SoulPad.
Table 2: Resume and suspend latencies, sorted by increasing resume time.

We designed our suspend/resume experiments to expose the effects of disk speed, interconnect speed, processor speed, and memory size. Table 2 shows averages and standard deviations calculated over at least 10 runs for a variety of disk and PC configurations. The NetVista and ThinkCentre PCs are desktop machines; the two ThinkPad models are laptops. In all these experiments, there were 256MB of memory and 16GB of disk space allocated to the virtual machine. In the interests of simplicity and space, we omit results for hardware combinations that do not expose significant additional information.

The first row of Table 2 serves as a reference point for additional observations. For the results in this row, we installed the SoulPad software stack on the internal disk of the NetVista instead of on a portable disk. It is noteworthy that external USB drives achieved resume times close to those of the internal IDE drive. For example, the average resume time on the PocketDrive connected to the same NetVista PC was only 5 seconds longer than the reference (121 vs. 116 seconds). The average suspend time on that same configuration was 16 seconds longer than the reference (26 vs. 10 seconds). Disks with lower transfer rates and rotational speeds, like the DataBank and MobileDrive, have longer resume and suspend times. Other disk characteristics not captured in Table 1, such as buffer size, also affect the resume and suspend latencies shown in Table 2.

Another observation is that physical memory size matters for SoulPad. Resume and suspend latencies are noticeably longer on the ThinkCentre PC with 512 MB less memory than the other PCs, even though the ThinkCentre has the fastest CPU. Resume time rose to nearly 3 minutes and suspend time closer to 1 minute.

Finally, the last row of Table 2 makes clear that USB 1.1 is too slow to support SoulPad.
Resume times when using USB 1.1 rise to more than 16 minutes while suspend times rise to more than 6 minutes. Our overall conclusion is that SoulPad is usable on a range of existing portable disk and PC configurations. Disk transfer rate and physical memory size have a significant effect on resume and suspend latencies, while processor speed has less of an effect (at least for the 1.7-3.0 GHz range we used in our experiments). Disk-to-PC interconnects with speeds comparable to USB 2.0 are required but increasingly standard on commercially available PCs. We proceeded to collect fine-grained timings of different stages in the SoulPad suspend and resume sequences. As a timing mechanism, we used the Time Stamp Counter (TSC) available on x86 processors. This monotonically increasing value resets to zero on each powerup, advances with each clock cycle, and can be read by a single instruction from firmware, the boot loader, kernel space, or user space. Table 3 and Table 4 contain timings captured during sample resume and suspend runs, respectively. Both runs used the DataBank disk connected over USB2.0 to the NetVista PC, as in the fourth row of Table 2. As shown in Table 3, the sample SoulPad resume operation took almost 140 seconds. Autoconfiguring the Host OS accounted for somewhat less than half of this time, or roughly 57 seconds. We will see below that there is room for reducing the latency of this stage.
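Converting Time Stamp Counter readings into stage latencies amounts to dividing cycle differences by the clock frequency. The sketch below shows that arithmetic under the assumption that the counter advances at a constant, known rate. The 2.0 GHz frequency, the stage labels, and the counter values are made up for illustration and are not measurements from the paper.

```python
def tsc_to_seconds(start_cycles: int, end_cycles: int, cpu_hz: float) -> float:
    """Elapsed seconds between two TSC readings at a constant clock rate."""
    if end_cycles < start_cycles:
        raise ValueError("TSC readings must not decrease within one power cycle")
    return (end_cycles - start_cycles) / cpu_hz


def stage_latencies(stamps, cpu_hz):
    """Map each stage label to the time elapsed since the previous reading.

    `stamps` is an ordered list of (label, tsc_value) pairs; the first pair
    is the starting reference and gets no entry of its own.
    """
    result = {}
    for (_, prev), (label, cur) in zip(stamps, stamps[1:]):
        result[label] = tsc_to_seconds(prev, cur, cpu_hz)
    return result


CPU_HZ = 2.0e9  # hypothetical 2.0 GHz processor
readings = [
    ("power-up", 0),                   # the counter resets to zero at power-up
    ("stage A done", 114_000_000_000),
    ("stage B done", 228_000_000_000),
]
print(stage_latencies(readings, CPU_HZ))
```

Because the counter starts at zero on power-up and can be read from firmware through user space, a single sequence of readings can cover the whole resume path without any clock synchronization between layers.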
Table 3: Resume stages and sample latencies.
Table 4: Suspend stages and sample latencies.

The time to load VM state from disk into memory and resume the running VM is another major contributor to total resume latency, also accounting for roughly 57 seconds in the sample run of Table 3. Techniques such as ballooning [13] can be used to reduce the latency of this stage. Ballooning zeroes unused pages of physical memory allocated to VMs. These pages would then lend themselves to more effective compression for transfer between SoulPads and PCs.

As shown in Table 4, the sample SoulPad suspend operation took under 22 seconds. The two main components of this latency are the time to stop the VM (roughly 6 seconds) and the time to save to disk the contents of the VM’s memory as well as other recently changed VM state (roughly 16 seconds). Aside from the contents of the VM memory, the amount of state saved at suspend time is relatively small because writes to the VM’s virtual disks have been propagated to the SoulPad throughout the VM’s operation.

We then explored ways to reduce the resume latency by streamlining the Knoppix autoconfiguration procedure. The results in Table 2 were obtained using a base Knoppix installation. We were able to eliminate two steps from this base case: rebuilding the mapping from library names to path names, and rebuilding kernel-module dependencies. The former is necessary only when libraries are installed or moved, and the latter is only necessary when kernel modules are added, changed, or removed. Such Host OS configuration changes will be rare on a SoulPad since the Host OS is only used as a vehicle to bring up a virtual machine. This layer of the SoulPad architecture can be tightly managed by system administrators working for enterprises or service providers.
Table 5: Impact on resume and suspend latencies of streamlining the Host OS boot sequence, then storing the VM image in an encrypted file system.

Table 5 shows the impact on resume latency of eliminating these two steps. These measurements were done on the same DataBank-NetVista combination shown in the fourth row of Table 2, yielding a baseline resume time of 141 seconds. As shown in Table 5, streamlining the Knoppix autoconfiguration stage reduced resume latency by 12 seconds, to 129 seconds total. Further optimizations of the boot sequence may be possible.

Table 5 also shows the impact of encrypting the VM image. We placed the VM image on a file system encrypted with the AES128 cipher. We then measured suspend and resume latencies on the same DataBank-NetVista combination after streamlining the Knoppix autoconfiguration stage. As shown, resume latency rose by 10 seconds but remained below the original 141 seconds, and suspend latency rose by 6 seconds but remained below 30 seconds. We conclude that using an encrypted file system is both desirable and viable.

Finally, it is useful to compare