

AVES Waypoint Daemon

Our AVES waypoints are based on Linux PCs. Each machine can be assigned multiple waypoint IP addresses as aliases of its network interface. The AVES waypoint daemon uses the Linux IP firewall (ipfw) API to divert selected data packets to user level for manipulation; it requires Linux kernel version 2.2 or higher. To receive incoming data packets at user level, the waypoint daemon opens a raw NETLINK_FIREWALL netlink socket. Filter entries can then be added to the input firewall via the ipfw API, and the kernel can be instructed to direct matching packets to the netlink socket. After data packets are manipulated at user level, they are reinjected into the network via a raw socket with the IP header included option (IP_HDRINCL) enabled.
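The reinjection step can be illustrated with a minimal Python sketch. It is not the daemon's actual code: the helper names `ipv4_checksum`, `build_ipv4_header`, and `open_reinjection_socket` are hypothetical, the header is assumed to carry no IP options, and opening the raw socket requires root privileges on Linux.

```python
import socket
import struct

def ipv4_checksum(header: bytes) -> int:
    """Standard Internet checksum (one's complement sum of 16-bit words)."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src: str, dst: str, payload_len: int, proto: int,
                      ident: int = 0, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options) with a valid checksum."""
    ver_ihl = (4 << 4) | 5            # IPv4, header length 5 * 4 = 20 bytes
    total_len = 20 + payload_len
    header = struct.pack("!BBHHHBBH4s4s", ver_ihl, 0, total_len, ident, 0,
                         ttl, proto, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    csum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", csum) + header[12:]

def open_reinjection_socket() -> socket.socket:
    """Raw socket on which the caller supplies the IP header (needs root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
    return sock
```

A caller would then send a complete packet with something like `sock.sendto(header + payload, (dst, 0))`, where `header` comes from `build_ipv4_header`.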

We have fully implemented the delayed binding technique as described in Section 3.3.2. When there are multiple alias waypoint IP addresses on the machine, each address is treated independently by the waypoint daemon. The wait period $T_{wait}$ in our implementation is 2 seconds, which should provide sufficient time for a connection to be made. When the waypoint IP address $IP_W$ is in a wait state, the waypoint daemon filters all incoming packets with destination address $IP_W$, regardless of the source address. Packets that do not indicate a new connection are processed normally according to existing translation table entries. A new connection (indicated by a TCP SYN packet, or any non-TCP packet) to $IP_W$ is either accepted or rejected according to the algorithm shown in Figure 6. If the connection is accepted, a filter for the source and destination address pair is added to the firewall and a translation table entry is created. The packet is then processed normally. If the connection is rejected, the packet is dropped and an ICMP "destination host unreachable" message [19] is sent back to the initiator. This signals to the initiator that it needs to retry the connection. The reject period $T_{reject}$ is 3 minutes in our implementation, which we think is long enough to prompt the initiator to retry the connection without making $IP_W$ unavailable to the initiator for too long. Note that, when $IP_W$ is in a wait state, AVES SETUP messages sent to $IP_W$ are ignored for simplicity.

The rest of this section summarizes the other noteworthy features supported by the waypoint daemon.
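As a rough illustration of the new-connection test used during the wait state, the following Python sketch classifies a raw IPv4 packet. The function name `is_new_connection` is hypothetical, the packet is assumed to be unfragmented, and treating only a SYN without ACK as a new TCP connection is one reading of "TCP SYN packet", not necessarily what the daemon does. The accept/reject decision of Figure 6 is not reproduced here.

```python
IPPROTO_TCP = 6
TCP_SYN = 0x02
TCP_ACK = 0x10

def is_new_connection(packet: bytes) -> bool:
    """Return True if an IPv4 packet would count as a new connection.

    Any non-TCP packet counts as new; a TCP packet counts as new only
    if SYN is set and ACK is clear (an assumption, see lead-in).
    """
    if len(packet) < 20:
        raise ValueError("truncated IPv4 header")
    ihl = (packet[0] & 0x0F) * 4      # IPv4 header length in bytes
    proto = packet[9]                 # IP protocol field
    if proto != IPPROTO_TCP:
        return True
    if len(packet) < ihl + 14:
        raise ValueError("truncated TCP header")
    flags = packet[ihl + 13]          # TCP flags byte
    return bool(flags & TCP_SYN) and not (flags & TCP_ACK)
```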

Fragmentation & Path MTU Discovery - Because the waypoint daemon encapsulates a translated packet in an IP header and adds a 16-byte MD5 checksum, typical 1500-byte incoming Ethernet packets have to be fragmented on their way out. It turns out that Linux does not perform fragmentation for packets sent through a raw socket with the IP_HDRINCL option enabled; therefore, IP fragmentation has been implemented in the waypoint daemon. The waypoint daemon also supports path MTU discovery [16]. That is, when the "Don't Fragment" flag of an incoming IP packet is set but fragmentation is necessary, the waypoint daemon drops the packet and returns an ICMP "destination unreachable, fragmentation needed" message [19] to the initiator with the MTU field set to 1464 bytes. Finally, a consequence of IP fragmentation is that the AVES NAT gateway must be configured to reassemble all incoming fragmented packets so that the AVES NAT daemon can function properly.
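The 1464-byte figure is consistent with a 1500-byte Ethernet MTU minus a 20-byte encapsulating IP header (assuming no IP options) and the 16-byte MD5 checksum. The Python sketch below works through that arithmetic and shows one way the fragmentation decision and split could look. All names (`advertised_mtu`, `outgoing_action`, `fragment_payload`) are hypothetical, and the code is an illustration rather than the daemon's implementation.

```python
OUTER_IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
MD5_CHECKSUM = 16      # bytes, checksum added by the waypoint daemon
LINK_MTU = 1500        # bytes, typical Ethernet MTU

def advertised_mtu(link_mtu: int = LINK_MTU) -> int:
    """Largest inner packet that fits after encapsulation."""
    return link_mtu - OUTER_IP_HEADER - MD5_CHECKSUM

def outgoing_action(inner_len: int, dont_fragment: bool,
                    link_mtu: int = LINK_MTU) -> str:
    """Decide what to do with an incoming packet of inner_len bytes."""
    if inner_len <= advertised_mtu(link_mtu):
        return "send"
    # Too large: honor the Don't Fragment flag, otherwise fragment.
    return "reject_frag_needed" if dont_fragment else "fragment"

def fragment_payload(payload: bytes, mtu: int = LINK_MTU,
                     header_len: int = OUTER_IP_HEADER):
    """Split an IP payload into (offset_in_8_byte_units, more, chunk)."""
    max_data = (mtu - header_len) // 8 * 8   # offsets count 8-byte units
    if max_data <= 0:
        raise ValueError("MTU too small for any payload")
    fragments = []
    offset = 0
    while offset < len(payload):
        chunk = payload[offset:offset + max_data]
        more = offset + len(chunk) < len(payload)
        fragments.append((offset // 8, more, chunk))
        offset += len(chunk)
    return fragments
```

For example, a 1500-byte inner packet plus the checksum gives a 1516-byte payload, which `fragment_payload` splits into one 1480-byte fragment and one 36-byte fragment.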

Protocol Specific Timeouts - A translation table entry represents a session opened by an initiator and expires if there is no traffic activity for a period of time. To optimize resource usage, we use different timeout values for different protocols. The protocols the waypoint daemon recognizes are ICMP, TCP, and UDP. If an initiator is transmitting an unknown protocol or a mixed set of protocols to the responder, a default timeout value of 15 minutes is used. For ICMP, since it is mostly generated by ping or traceroute, we aggressively time out these entries after 1 minute. For UDP, the timeout value is set to 15 minutes. For TCP, the timeout value is set to 30 minutes. These choices are somewhat arbitrary, but we think they are reasonable. As a further optimization, we keep track of the TCP connections that correspond to a translation table entry, and when all of them have terminated (indicated by TCP FIN packets), the translation table entry is removed immediately without waiting for the timeout. An exception to this is when the traffic is HTTP (i.e., port 80), because popular browser software such as Netscape and Internet Explorer always caches DNS replies for 15 minutes. Thus, for HTTP, we simply use a timeout value of 20 minutes without checking for TCP FIN packets.
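The timeout policy above can be summarized in a short Python sketch. The function names and the way an entry's protocols and ports are represented are hypothetical. Treating an entry as HTTP whenever any of its tracked TCP connections uses port 80 is an assumption, since the text does not say how mixed ports are handled.

```python
ICMP, TCP, UDP = 1, 6, 17        # IP protocol numbers
MINUTE = 60                      # seconds
DEFAULT_TIMEOUT = 15 * MINUTE    # unknown or mixed protocols
HTTP_PORT = 80

def entry_timeout(protocols, tcp_ports=()):
    """Idle timeout in seconds for a translation table entry.

    protocols: set of IP protocol numbers seen on the entry.
    tcp_ports: destination ports of its TCP connections, if any.
    """
    if len(protocols) != 1:
        return DEFAULT_TIMEOUT               # mixed set of protocols
    proto = next(iter(protocols))
    if proto == ICMP:
        return 1 * MINUTE
    if proto == UDP:
        return 15 * MINUTE
    if proto == TCP:
        # HTTP entries outlive the browsers' 15-minute DNS cache.
        return 20 * MINUTE if HTTP_PORT in tcp_ports else 30 * MINUTE
    return DEFAULT_TIMEOUT                   # unknown protocol

def removable_early(protocols, tcp_ports, open_tcp_connections):
    """True if the entry can be dropped before its timeout expires."""
    return (set(protocols) == {TCP}
            and HTTP_PORT not in tcp_ports
            and open_tcp_connections == 0)
```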

