Security '04 Paper
Collapsar: A VM-Based Architecture for Network Attack Detention Center
Xuxian Jiang, Dongyan Xu
The honeypot has emerged as an effective tool to provide insights into new attacks and current exploitation trends. Though effective, a single honeypot or multiple independently operated honeypots only provide a limited local view of network attacks. Deploying and managing a large number of coordinating honeypots in different network domains will not only provide a broader and more diverse view, but also create the potential for global network status inference, early network anomaly detection, and large-scale attack correlation. However, coordinated honeypot deployment and operation require close and consistent collaboration across participating network domains, in order to mitigate the potential security risks associated with each honeypot and the non-uniform level of security expertise in different network domains. It is challenging, yet desirable, to provide the two conflicting features of decentralized presence and uniform management in honeypot deployment and operation.
To address these challenges, this paper presents Collapsar, a virtual-machine-based architecture for network attack detention. A Collapsar center hosts and manages a large number of high-interaction virtual honeypots in a local dedicated network. These honeypots appear, to potential intruders, as typical systems in their respective production networks. Decentralized logical presence of honeypots provides a wide diverse view of network attacks, while the centralized operation enables dedicated administration and convenient event correlation, eliminating the need for honeypot experts in each production network domain. We present the design, implementation, and evaluation of a Collapsar testbed. Our experiments with several real-world attack incidences demonstrate the effectiveness and practicality of Collapsar.
Recent years have witnessed a phenomenal increase in network attack incidents. This has motivated research efforts to develop systems and testbeds for capturing, monitoring, analyzing, and, ultimately, preventing network attacks. Among the most notable approaches, the honeypot has emerged as an effective tool for observing and understanding intruders' toolkits, tactics, and motivations. By design, a honeypot treats every packet sent to or from it as suspect, enabling it to collect highly concentrated, low-noise datasets for network attack analysis.
However, honeypots are not a panacea and suffer from a number of limitations. In this paper, we focus on the following limitations of independently operated honeypots:
It is challenging, yet desirable, to accommodate two conflicting features in honeypot deployment and operation: decentralized presence and centralized management. To address these challenges, this paper presents Collapsar, a virtual machine (VM) based architecture for a network attack detention center. A Collapsar center hosts and manages a large number of honeypots in a local dedicated physical network. However, to the intruders, these honeypots appear to be in different network domains. These two seemingly conflicting features are achieved by Collapsar. On one hand, honeypots are logically present in different physical production networks, providing a more distributed diverse view of network attacks. On the other hand, the centralized physical location gives security experts the ability to locally manage honeypots and collect, analyze, and correlate attack data pertaining to multiple production networks.
There are two types of components in Collapsar: functional components and assurance modules. Functional components are integral parts of Collapsar, responsible for creating decentralized logical presence of honeypots. Through the functional components, suspicious traffic will be transparently redirected from different production networks to the Collapsar center (namely the physical detention center) where honeypots accept traffic and behave, to the intruders, like authentic hosts. Assurance modules are pluggable and are responsible for mitigating the risks associated with honeypots and collecting tamper-proof log information for attack analysis.
In summary, Collapsar has the following advantages over conventional honeypot systems: (1) distributed virtual presence, (2) centralized management, and (3) convenient attack correlation and data mining. The rest of this paper is organized as follows: Section 2 presents background information about conventional honeypots and describes the Collapsar vision and challenges. The architecture of Collapsar is presented in Section 3, while the implementation details of Collapsar are described in Section 4. Section 5 evaluates Collapsar's performance. Section 6 presents several real-world attack incidents captured by our Collapsar prototype. Related work is presented in Section 7. Finally, we conclude this paper in Section 8.
According to Lance Spitzner's definition, a honeypot is a ``security resource whose value lies in being probed, attacked, or compromised.'' The resource can be actual computer systems, scripts running emulated services, or honeytokens. This paper focuses on honeypots in the form of actual computer systems.
Honeypots can be classified based on their level of interaction with intruders. The typical classes are high-interaction, medium-interaction, and low-interaction honeypots. High-interaction honeypots allow intruders to access a full-fledged operating system with few restrictions, although, for security reasons, the surrounding environment may be restricted to confine any hazardous impact of the honeypots. This is highly valuable because new attack tools and vulnerabilities in real operating systems and applications can be brought to light. However, such value comes with high risk and increased operator responsibility. Medium-interaction honeypots involve less risk but more restrictions than high-interaction honeypots; one example is the use of jail or chroot in a UNIX environment. Still, medium-interaction honeypots provide more functionality than low-interaction honeypots, which are, on the other hand, easier to install, configure, and maintain. Low-interaction honeypots can emulate a variety of services that intruders can (only) interact with.
Another classification criterion differentiates between physical honeypots and virtual honeypots. A physical honeypot is a real machine in a network, while a virtual honeypot is a virtual machine hosted on a physical machine. For example, honeyd is an elegant and effective low-interaction virtual honeypot framework. In recent years, advances in virtual machine enabling platforms have allowed for the development and deployment of virtual honeypots. Virtual machine platforms such as VMware and User-Mode Linux (UML) enable high-fidelity emulation of physical machines, and have been increasingly adopted to host virtual honeypots.
The Collapsar vision can be traced back to the Honeyfarm concept [39] proposed by Lance Spitzner. However, to the best of our knowledge, there has been no prior realization of a Honeyfarm using high-interaction honeypots, with detailed design, implementation, and real-world experiments. Furthermore, we demonstrate that with high-interaction honeypots, the Honeyfarm vision can be more completely realized than with low-interaction honeypots or passive traffic monitors. Meanwhile, we identify new challenges associated with high-interaction honeypots in mitigating risks and containing attacks.
The development of Collapsar is more challenging than the deployment of a stand-alone decoy system. System authenticity requires honeypots to behave, from an intruder's point of view, as normal hosts in their associated network domains. From the perspective of Collapsar operators, the honeypots should be easily configured, monitored, and manipulated for system manageability. To realize a full-fledged Collapsar, the following problems need to be addressed:
This paper presents our solutions to the first problem. For the second and third problems, we have developed Collapsar components and mechanisms for the enforcement of different traffic filtering and attack curtailing policies specified by Collapsar operators and network administrators. This paper does not address any specific policy or its impact. Instead, it focuses on the architectural and functional aspects of Collapsar.
As shown in Figure 1, Collapsar comprises three main functional components: the redirector, the front-end, and the virtual honeypot (VM). These components work together to achieve authenticity-preserving traffic redirection. Collapsar also includes the following assurance modules in order to capture, contain, and analyze the activities of intruders: the logging module, the tarpitting module, and the correlation module.
In the reverse direction, the front-end accepts response traffic from the honeypots, and scrutinizes all packets with the help of assurance modules (to be described in Section 3.2) for attack stoppage. If necessary, the front-end will curtail the interaction with the intruder to prevent a compromised honeypot from attacking other hosts on the Internet. If a policy determines that continued interaction is allowed, the front-end will forward the packets back to their original redirectors which will then redirect the packets into the network, such that the packets appear to the remote intruder as originating from the target network.
The Collapsar functional components create virtual presence of honeypots. Assurance modules provide necessary facilities for attack investigation and mitigation of associated risks.
All communications related to honeypots are highly suspicious and need to be recorded. However, a traditional network intrusion detection system (NIDS) based on packet sniffing may become less effective if the attack traffic is encrypted. In fact, it has become common for intruders to communicate with compromised hosts using encryption-capable backdoors, such as trojaned sshd daemons. In order to log the details of such attacks without intruders tampering with the log, the logging module in each honeypot consists of sensors embedded in the honeypot's guest OS as well as log storage in the physical machine's host OS. As a result, log collection is invisible to the intruder and the log storage is unreachable by the intruder.
The tarpitting module mitigates the risks associated with compromised honeypots by (1) throttling out-going traffic [41], limiting the rate at which packets (for example, TCP-SYN packets) are sent or reducing the average traffic volume, and (2) scrutinizing out-going traffic based on known attack signatures, and crippling detected attacks by invalidating the malicious attack code.
The correlation module enables the detection of network-wide attack patterns, such as DDoS attacks [35], worm propagations, and hidden overlay networks such as IRC-based networks or peer-to-peer networks formed by certain worms.
In this section, we present the implementation details of Collapsar. Based on virtual machine technologies, Collapsar is able to support virtual honeypots running various operating systems.
There are two approaches to transparent traffic redirection: the router-based approach and the end-system-based approach. In the router-based approach, an intermediate router or the edge router of a network domain can be configured to activate the Generic Routing Encapsulation (GRE) [28,29] tunneling mechanism to forward honeypot traffic to the Collapsar center. This approach has the advantage of high network efficiency. However, it requires the privilege of router configuration. On the other hand, the end-system-based approach does not require access or changes to routers. Instead, it requires an application-level redirector in the target production network for forwarding packets between the intruder and the honeypot. In a fully cooperative environment such as a university campus, the router-based approach may be a more efficient option, while in an environment with multiple autonomous domains, the end-system-based approach may be adopted for easy deployment. In this paper, we describe the design and implementation of the end-system-based approach.
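To make the router-based option concrete, the sketch below (our illustration, not code from Collapsar) builds and strips the 4-byte base GRE header defined in RFC 2784, which is all that is needed to tunnel an IP packet to the detention center when no checksum, key, or sequence number is used:

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # EtherType value for an encapsulated IPv4 packet

def gre_encapsulate(inner_packet: bytes) -> bytes:
    """Prepend a minimal GRE base header (RFC 2784): 2 bytes of
    flags/version (all zero here) plus the 2-byte protocol type."""
    return struct.pack("!HH", 0, GRE_PROTO_IPV4) + inner_packet

def gre_decapsulate(packet: bytes) -> bytes:
    """Strip the 4-byte GRE base header, checking the payload type."""
    _flags_ver, proto = struct.unpack("!HH", packet[:4])
    if proto != GRE_PROTO_IPV4:
        raise ValueError("unexpected GRE payload type")
    return packet[4:]
```

In practice the router, not an end host, performs the encapsulation, and the resulting GRE packet is itself carried in an outer IP header addressed to the Collapsar front-end.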
To describe the end-system-based approach, let R denote the default router of a production network, A_r the IP address of the physical host where the redirector component runs, and A_h the IP address of the honeypot as it appears to intruders. A_r, A_h, and one interface of R, say eth0, belong to the same network. When a packet addressed to A_h arrives, router R receives it first and attempts to forward it based on its current routing table. Since A_h appears to be in the same network as eth0, R will send the packet over eth0. To deliver the packet to A_h, R needs the corresponding MAC address of A_h in its ARP cache; if the entry is absent, R broadcasts an ARP request for A_h. The redirector host receives this request and knows that no real host owns IP address A_h, so it answers with its own MAC address. Packets destined for A_h are thus delivered to the redirector host, and the redirector then forwards them to the Collapsar center. Note that one redirector can support the virtual presence of multiple honeypots in the same production network.
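The redirector's proxy-ARP trick can be illustrated with a short sketch (ours, for illustration only; the field layout follows RFC 826): the redirector host answers the router's ARP request for the honeypot's IP address with its own MAC address.

```python
import ipaddress
import struct

def build_arp_reply(redirector_mac: bytes, requester_mac: bytes,
                    honeypot_ip: str, requester_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP reply that claims the
    honeypot's IP address is reachable at the redirector's MAC address."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP)
    eth_header = requester_mac + redirector_mac + b"\x08\x06"
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6, 4,    # hardware / protocol address lengths
        2,       # opcode: reply
        redirector_mac, ipaddress.IPv4Address(honeypot_ip).packed,
        requester_mac, ipaddress.IPv4Address(requester_ip).packed,
    )
    return eth_header + arp_body
```

A real redirector would emit such a frame through a raw socket or libnet; here we only construct the bytes to show what the router receives.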
The redirector is implemented as a virtual machine running our extended version of UML. This approach adds considerable flexibility to the redirector since the VM is able to support policy-driven configuration for packet filtering and forwarding, and can be conveniently extended to support useful features such as packet logging, inspection, and in-line rewriting. The redirector has two virtual NICs: the pcap/libnet interface and the tunneling interface. The pcap/libnet interface performs the actual packet capture and injection. Captured packets will be echoed as input to the UML kernel. The redirector kernel acts as a bridge, and performs policy-driven packet inspection, filtering, and subversion. The tunneling interface tunnels the inspected packets transparently to the Collapsar center. For communication in the opposite direction, the redirector kernel's tunneling interface accepts packets from the Collapsar center and moves them into the redirector kernel itself, which will inspect, filter, and subvert the packets from the honeypots, and re-inject the inspected packets into the production network through the pcap/libnet interface.
The Collapsar front-end is similar to a transparent firewall. It dispatches incoming packets from redirectors to their respective honeypots based on the destination field in the packet header. The front-end can also be implemented using UML, which creates another point for packet logging, inspection, and filtering.
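The dispatching step can be pictured as a lookup table keyed by the destination address; the sketch below is our simplification of the front-end's behavior, not Collapsar code:

```python
class FrontEnd:
    """Minimal dispatch sketch: map a packet's destination address to the
    honeypot registered for that virtual production-network address."""

    def __init__(self):
        self._table = {}  # virtual IP address -> honeypot identifier

    def register(self, virtual_ip: str, honeypot: str) -> None:
        """Register a honeypot as the owner of a virtual IP address."""
        self._table[virtual_ip] = honeypot

    def dispatch(self, dst_ip: str):
        """Return the honeypot for dst_ip; unknown destinations are
        dropped (None), mirroring a transparent firewall's behavior."""
        return self._table.get(dst_ip)
```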
Ideally, packets should be forwarded directly to the honeypots after dispatching. However, virtualization techniques in different VM enabling platforms complicate this process. In order to accommodate various VMs (especially those using VMware), the front-end will first inject packets into the Collapsar network via an injection interface. The injected packets will then be claimed by the corresponding virtual honeypots and be moved into the VM kernels via their virtual NICs. This approach supports VMware-based VMs without any modification. However, it incurs additional overhead (as shown in Section 5). Furthermore, it causes undesirable cross-talk between honeypots which logically belong to different production networks. Such synthetic cross-talk may decrease the authenticity of Collapsar. A systematic solution to this problem requires a slight modification to the virtualization implementation, especially the NIC virtualization. Unfortunately, modifying the VM requires access to the VM's source code. With open-source VM implementations, such as UML, the injection interface of the front-end can be modified to feed packets directly into the VM (honeypot) kernels. As shown in Section 5, considerable performance improvement is achieved with this technique.
Other VM enabling platforms [22], such as Virtual PC and UMLinux, will also be supported in the future.
VMware is a commercial software system and one of the most mature and versatile VM enabling platforms. A key feature is its ability to support various commodity operating systems and to take snapshots of live virtual machine images. Support for commodity operating systems provides a more diverse view of network attacks, while image snapshot generation and restoration (without any process distortion) add considerable convenience to forensic analysis.
As mentioned in Section 4.2, the network interface virtualization of VMware is not readily compatible with the Collapsar design. More specifically, VMware creates a special vmnet, which emulates an inner bridge. A VMware-hosted virtual machine injects packets directly into the inner bridge, and receives packets from the inner bridge. A special host process is created to attach to the bridge and act as an agent forwarding packets between the local network and the inner bridge. The ability to read packets from the local network is realized by a loadable kernel module called vmnet.o, which installs a callback routine registering for all packets on a specified host NIC via the dev_add_pack routine. These packets are re-injected into the inner bridge. Meanwhile, the agent reads packets from the inner bridge and calls the dev_queue_xmit routine to inject them directly into the specified host NIC. It is possible to re-write the special host process to send/receive packets directly to/from the Collapsar front-end, avoiding the overhead of injecting and capturing packets twice - once in the front-end and once in the special host process. This solution, however, requires modifications to VMware.
UML is an open-source VM enabling platform that runs entirely in the user space of an unmodified host OS. Processes within a UML (the guest OS) are executed in the virtual machine in exactly the same way as they would be on a native Linux machine. Leveraging the capability of ptrace, a special thread is created to intercept the system calls made by any process thread in the UML, and redirect them to the guest OS kernel. Meanwhile, the host OS maintains a separate kernel space, containing any security impact caused by individual UMLs.
Taking advantage of UML being open source, we enhance UML's network virtualization implementation such that each packet from the front-end can be immediately directed to the virtual NIC of a UML-based VM. This technique not only avoids the unnecessary packet capture and re-injection, as in VMware, but also eliminates the cross-talk between honeypots in the Collapsar center.
Network sniffers such as tcpdump [8] and snort are able to record plain-text traffic, while embedded sensors inside the honeypot (VM) kernel are able to uncover an intruder's encrypted communications. In Section 6.1, we present details of several attack incidents demonstrating the power of in-kernel logging. The in-kernel logging module in VMware-based honeypots leverages an open-source project called sebek, while in-kernel logging for UML-based honeypots is performed by kernort, a kernelized snort.
Tarpitting modules are deployed in both the front-end and the redirectors. The modules perform in-line packet inspection, filtering, and rewriting. Currently, the tarpitting module is based on snort-inline, an open-source project. It can limit the number of out-going connections within a time unit (e.g., one minute) and can also compare packet contents with known attack signatures in the snort package. Once malicious code is identified, the packets are rewritten to invalidate its functionality.
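As an illustration of the connection-limiting policy, consider the sliding-window sketch below (our own simplification; snort-inline's actual mechanism differs): at most `max_conn` out-going connections are allowed per `window` seconds.

```python
from collections import deque

class ConnectionThrottle:
    """Sliding-window rate limiter: allow at most `max_conn` out-going
    connections within any `window`-second interval."""

    def __init__(self, max_conn: int = 10, window: float = 60.0):
        self.max_conn = max_conn
        self.window = window
        self.times = deque()  # timestamps of recently allowed connections

    def allow(self, now: float) -> bool:
        """Return True if a connection attempted at time `now` is allowed;
        tarpitted (False) otherwise."""
        # Expire timestamps that have fallen out of the window.
        while self.times and now - self.times[0] >= self.window:
            self.times.popleft()
        if len(self.times) < self.max_conn:
            self.times.append(now)
            return True
        return False
```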
The Collapsar center provides a convenient venue for correlation-based attack analysis, such as the detection of wide-area DDoS attacks or stepping stone attacks. The current prototype is capable of attack correlation based on simple heuristics and association rules. However, the Collapsar correlation module can be extended in the future to support more complex event correlation and data mining algorithms, enabling the detection of non-trivial attacks such as low-and-slow scanning and hidden overlay networks.
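A toy version of one such heuristic (our illustration, not the actual correlation module) flags a source that probes several logically distinct honeypots within a short time window, which resembles the scanning sweeps observed in our testbed:

```python
from collections import defaultdict

def detect_sweeps(events, window=1.0, min_honeypots=3):
    """Flag sources that hit at least `min_honeypots` distinct honeypots
    within `window` seconds. `events` is an iterable of
    (timestamp, source_ip, honeypot_id) tuples."""
    by_src = defaultdict(list)
    for ts, src, honeypot in sorted(events):
        by_src[src].append((ts, honeypot))

    sweeps = set()
    for src, hits in by_src.items():
        for i in range(len(hits)):
            # Count distinct honeypots hit within `window` of hits[i].
            seen = {hits[i][1]}
            j = i + 1
            while j < len(hits) and hits[j][0] - hits[i][0] <= window:
                seen.add(hits[j][1])
                j += 1
            if len(seen) >= min_honeypots:
                sweeps.add(src)
                break
    return sweeps
```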
To measure the virtualization-incurred overhead, we use two physical hosts (with aliases seattle and tacoma, respectively) with no background load, connected by a lightly loaded 100Mbps LAN. Seattle is a Dell PowerEdge server with a 2.6GHz Intel Xeon processor and 2GB RAM, while tacoma is a Dell desktop PC with a 1.8GHz Intel Pentium 4 processor and 768MB RAM. A VM runs on top of seattle, and measurement packets are sent from tacoma to the VM. The TCP throughput is measured by repeatedly transmitting a 100MB file under different socket buffer sizes, while the latency is measured using standard ICMP packets with different payload sizes. Three sets of experiments are performed: (1) from tacoma to a VMware-based VM in seattle, (2) from tacoma to a UML-based VM in seattle, and (3) from tacoma directly to seattle with no VM running. The results in TCP throughput and ICMP latency are shown in Figures 2(a) and 2(b), respectively. The curves ``VMware,'' ``UML,'' and ``Direct'' correspond to experiments (1), (2), and (3), respectively.
Figure 2(a) indicates that UML performs worse in TCP throughput than VMware, due to UML's user-level virtualization implementation. More specifically, UML uses a ptrace-based technique implemented at the user level and emulates an x86 machine by virtualizing system calls. On the other hand, VMware employs a binary rewriting technique implemented in the kernel, which inserts a breakpoint in place of sensitive instructions. However, both VMware and UML exhibit similar latency degradation because the (much lighter) ICMP traffic does not incur high CPU load, thereby hiding the difference between kernel-level and application-level virtualization. A more thorough and rigorous comparison between VMware and UML can be found in prior work.
We then measure the performance overhead incurred by the traffic redirection and dispatching mechanisms of Collapsar. We set up tacoma as the Collapsar front-end. In a different LAN, we deploy a redirector running on a machine with the same configuration as seattle. The two LANs are connected by a high-performance Cisco 3550 router. A machine in the same LAN as the redirector serves as the ``intruder'' machine, connecting to the VM (honeypot) running in seattle. Again, three sets of experiments are performed for TCP throughput and ICMP latency measurement: (1) from the intruder machine to a VMware-based honeypot in seattle, (2) from the intruder machine to a UML-based honeypot in seattle, and (3) from the intruder machine to the machine hosting the redirector (but without the redirector running). The results are shown in Figures 3(a) and 3(b). The curves ``VMware,'' ``UML,'' and ``Direct'' correspond to experiments (1), (2), and (3), respectively.
Contrary to the results in Figures 2(a) and 2(b), the UML-based VM achieves better TCP throughput and ICMP latency than the VMware-based VM. We believe this is due to the optimized traffic dispatching mechanism implemented for UML (Section 4.2). Another important observation from Figures 3(a) and 3(b) is that traffic redirection and dispatching in Collapsar incur a non-trivial network performance penalty (compared with the curve ``Direct''). For remote intruders (or those behind a weak link), such a penalty may be ``hidden'' by the already degraded end-to-end network performance. However, for ``nearby'' intruders, the penalty may be observable by comparing performance with that of a real host in the same network. This is a limitation of the Collapsar design. Router-based traffic redirection (Section 4.1) as well as future hardware-based virtualization technology are expected to alleviate this limitation.
In this section, we present a number of real-world network attack incidents captured by our Collapsar testbed. We also present the recorded intruder activities to demonstrate the effectiveness and practicality of Collapsar. Finally, we demonstrate the potential of Collapsar in log mining and event correlation.
The Collapsar center creates exciting opportunities to perform correlation and mining based attack analysis. The current Collapsar center hosts only 40 virtual honeypots, still far from a desirable scale for Internet-wide attack analysis. However, current Collapsar log information already demonstrates the potential of such capability. In this section, we show two simple examples.
One such example is shown in Figure 8. We note that such evidence is by no means sufficient to confirm a stepping stone case. However, with a wider range of target networks and a longer duration of log accumulation, a future Collapsar center may become capable of detecting stepping stones and tracing back to original attackers with satisfactory accuracy.
Network scanning has become a common incident, given the variety of scanning methods such as ping sweeping, port knocking, OS finger-printing, and firewalking. Figure 9 shows ICMP (ping) sweeping activity from the same source address (xx.yy.zzz.125) against three honeypots within a very short period of time (1.0 second). The honeypots are virtually present in three different production networks. Based on the payload, it is likely that a Nachi worm is performing the scan.
Collapsar is related to a number of efforts in distributed network attack monitoring, including honeyd [36], Network Telescope, Netbait, and SANS's Internet Storm Center.
Honeyd is the most comparable work with respect to support for multiple honeypots and traffic diversion. Simulating multiple virtual computer systems at the network level with different personality engines, honeyd is able to deceive network fingerprinting tools and provide arbitrary routing topologies and services for an arbitrary number of virtual systems. The most obvious difference between honeyd and Collapsar is that honeyd is a low-interaction virtual honeypot framework, while all honeypots in Collapsar are high-interaction virtual honeypots. Honeyd is more scalable than Collapsar, since every computer system in honeyd is simulated. On the other hand, with high-interaction honeypots, Collapsar is able to provide a more authentic environment for intruders to interact with and has the potential for early worm detection.
Network Telescope is an architectural framework that provides distributed presence for the detection of global-scale security incidents. Using a similar architecture, Netbait runs a set of simplified network services on each participating machine. The services log all incoming requests and federate the data to a centralized server, so that pattern matching techniques can be applied to identify well-known signatures of various worms and viruses. Network Telescope and Netbait do not involve real-time traffic diversion mechanisms, and they are not designed as interactive environments where the activities of intruders are closely monitored and recorded. The Internet Storm Center was set up by the SANS Institute in November 2000 to gather log data from participating intrusion detection sensors distributed around the world. Again, it neither presents an interactive environment to intruders, nor is it capable of real-time intruder traffic diversion.
Leveraging the power of individual honeypots, there have been significant advances in recent years in attack logging and analysis. Among the most notable are VM-based introspection, backtracker, ReVirt, and forensix. VM-based introspection is capable of inspecting internal machine states from a VM monitor. Backtracker and, similarly, forensix are able to automatically identify potential sequences of steps that could occur during an intrusion, with the help of system call recording. These results are highly effective and can be readily applied to Collapsar to improve the capability of individual virtual honeypots.
Meanwhile, it has been noted that virtual honeypots based on current VM enabling platforms may expose certain VM foot-prints. Such a deficiency could diminish the value of virtual honeypots. This situation has led to another round of the ``arms race'': methods have been proposed to minimize VM foot-printing, although such techniques remain VM-specific.
This paper was originally published in the
Proceedings of the 13th USENIX Security Symposium,
August 9-13, 2004
San Diego, CA