Special Workshop on Intelligence at the Network Edge 2000
RSIP: Address Sharing with End-to-End Security
Michael S. Borella
Realm Specific IP (RSIP) is a new architecture under consideration in the Internet Engineering Task Force (IETF) that can potentially alleviate some of the problems associated with partitioning of the Internet address space due to, for example, the shortage of IPv4 addresses. It is being positioned as a replacement for Network Address Translation (NAT), because, among other things, it can support end-to-end security via IPsec, which NAT cannot. This paper introduces the motivation behind RSIP, the RSIP architecture, and provides a basic overview of the RSIP protocol.
IPv4, the current version of the Internet Protocol, supports 32 bits of address space, which means that over 4 billion individually addressable hosts can be on the Internet. At the time of its inception, IPv4 designers could not have imagined the explosive growth of the Internet in the mid-to-late 1990s. As a result of this growth, as well as overly generous address allocation schemes of the past, it is becoming increasingly difficult and expensive to obtain IP addresses. Although IPv6, which has 128 bits of address space, has been approved by the Internet Engineering Task Force (IETF), there is no clear upgrade path from IPv4 to IPv6. Concerns that IPv6 may either never deploy, or may only partially deploy, are being expressed.
Thus far, network users and administrators have responded to the address shortage by deploying Network Address Translation (NAT) in boundary routers. However, this technology requires that the NAT be aware of any application run across it that transmits IP addresses or TCP/UDP port information in the packet payload. Furthermore, NAT inhibits the use of end-to-end security via IPsec. The latter, in particular, is a major disadvantage, considering the popularity of virtual private networks and the well-known need for security in e-commerce and business communications.
Currently, network administrators must choose between a complete solution that is not supported by many vendors and is incompatible with the rest of the Internet (IPv6), and a stop-gap solution that introduces many problems and prevents the deployment of some popular applications (NAT). Realm Specific IP (RSIP) has been proposed as a replacement for NAT that will have little impact on application layer protocols and will interoperate with IPsec. RSIP has the additional advantage that it can co-exist with NAT, and therefore networks using NAT can be smoothly upgraded, in part or in whole, to use RSIP.
This paper presents an introduction to RSIP, focusing on how it is different from NAT, and how it can support end-to-end IPsec. Since RSIP is still in the draft phase of its journey towards IETF standardization, we include a discussion of architectural and protocol issues that are currently being addressed and evaluated.
In this section we discuss the IP address shortage and how it is expected to become a critical issue in the near future. We then describe the operation of NAT, an understanding of which is needed in order to appreciate the alternative.
The technical media and popular press have been creating much hype about the IP address shortage. Currently, it is still possible to obtain IP addresses in most parts of the world. However, a requester typically has to justify that they actually need the number of addresses that they request, and must demonstrate use of the addresses that they are allocated. Furthermore, addresses have become a commodity of sorts, as ISPs typically charge significantly more for additional addresses beyond the default number that they give to a client.
In the near future, we expect that three trends will cause the rapid exhaustion of the remaining IPv4 address space.
These trends indicate that within 12-36 months we will begin to feel the IP address crunch in earnest, and that not long thereafter, it may become severe.
The address shortage began to be felt in the early-to-mid 1990's. NAT was developed as an intermediate, temporary solution, that would hold us over until IPv6 deployed. Obviously, IPv6 has not deployed, and even IPv6 advocates admit that unless there is a clearly defined transition phase, it may never deploy. As a result, NATs are becoming widespread, especially in the small enterprise and home networking space.
In principle, a NAT is a boundary router between two different address spaces. Typically, one is a private space of an ISP, residential network, or corporate intranet, and the other is the public space of the Internet. The hosts in the private space use private IP addresses that are unroutable from the public Internet. The NAT performs a one-to-one translation of outgoing (private-to-public) packets from each private address to a unique public IP address, and performs the converse operation for incoming (public-to-private) packets. A popular variation of NAT, Network Address and Port Translation (NAPT), maps all private addresses to one or more public addresses, differentiating amongst these hosts by local port number. Thus, a NAPT device must ensure that all port numbers used on the local side of a session are unique per public address. This requires that some port numbers chosen by a host be translated along with their address.
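The NAPT behavior described above can be illustrated with a minimal sketch. The class name `Napt`, the starting port 20000, and the address 192.0.2.1 (a documentation address) are assumptions made for illustration only; a real NAPT also tracks the transport protocol, times out idle mappings, and handles port exhaustion.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class Napt:
    """Toy NAPT table: many private endpoints share one public address."""
    public_ip: str
    next_port: int = 20000  # hypothetical start of the translated port range
    out_map: Dict[Endpoint, int] = field(default_factory=dict)
    in_map: Dict[int, Endpoint] = field(default_factory=dict)

    def translate_outgoing(self, src: Endpoint) -> Endpoint:
        """Map a private (address, port) pair to a unique public port."""
        if src not in self.out_map:
            port = self.next_port
            self.next_port += 1
            self.out_map[src] = port
            self.in_map[port] = src
        return (self.public_ip, self.out_map[src])

    def translate_incoming(self, dst_port: int) -> Endpoint:
        """Reverse lookup for a reply; raises KeyError for unknown ports."""
        return self.in_map[dst_port]


napt = Napt("192.0.2.1")
# Two hosts pick the same local port; the NAPT must keep them apart.
first = napt.translate_outgoing(("10.0.0.4", 5000))
second = napt.translate_outgoing(("10.0.0.5", 5000))
```

Because both hosts chose local port 5000, at least one of them has its port rewritten, which is exactly the translation that later forces payload inspection for protocols that carry ports in their data.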
The major drawback that NAT introduces is that if an application transmits IP addresses or ports as part of its packets' payloads, the NAT must contain an application layer gateway (ALG) in order for it to support the protocol. The classic example of this is FTP. In the FTP control stream, the client transmits the IP address and port number to which the server should open a socket. In order for FTP to work across a NAT, the NAT must examine the payload of FTP control packets, determine where the address and port information is encoded, and perform the necessary translation. In the worst case, this requires that the NAT also modify the packet length and sequence number fields in the IP and TCP headers, respectively, recompute the TCP header checksum, and maintain a running delta of the TCP sequence number for the lifetime of the connection.
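To make the ALG burden concrete, the following sketch rewrites an FTP `PORT` command, which encodes the address and port as six comma-separated decimal bytes. The function names and the example addresses are hypothetical; the sketch handles only the `PORT` command and ignores checksum recomputation and segmentation of the control stream.

```python
import re
from typing import Tuple

# FTP encodes the endpoint as h1,h2,h3,h4,p1,p2 where port = p1 * 256 + p2.
PORT_RE = re.compile(rb"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r\n$")


def rewrite_port(line: bytes, public_ip: str, public_port: int) -> Tuple[bytes, int]:
    """Return the rewritten control line and the change in its length."""
    if PORT_RE.match(line) is None:
        return line, 0  # not a PORT command: pass through untouched
    octets = public_ip.split(".")
    p1, p2 = divmod(public_port, 256)
    new_line = "PORT {},{},{},{},{},{}\r\n".format(*octets, p1, p2).encode("ascii")
    return new_line, len(new_line) - len(line)


def adjust_seq(seq: int, delta: int) -> int:
    """Apply the running delta to a later TCP sequence number (mod 2**32)."""
    return (seq + delta) % 2**32


new_line, delta = rewrite_port(b"PORT 10,0,0,4,4,1\r\n", "192.0.2.1", 20000)
```

When the public address has more digits than the private one, the line grows, so every subsequent sequence number sent by the client (and every acknowledgment returned to it) must be shifted by the accumulated delta.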
The need for a protocol-specific ALG in the NAT for each protocol that transmits address or port content is more of a deployment issue than an engineering issue. As long as a protocol payload can be decoded, read, and modified without disrupting end-to-end communications, NAT manufacturers can develop an appropriate ALG. However, deploying this ALG to an installed base of customers can prove to be trying. Since new protocols are being developed at a record pace, a NAT user must perform software or firmware upgrades on a regular and frequent basis. Despite these limitations, the utility of NATs has so far outweighed these drawbacks. Perhaps the true showstoppers for NAT, however, are applications that cryptographically prevent NATs from modifying IP packets. In general, this rules out end-to-end application of IPsec. In particular, NAT breaks all applications of the authentication header (AH), and applications of the Encapsulating Security Payload (ESP) in so-called transport mode. ESP in tunnel mode, however, is impervious to modifications of the outermost IP header. In spite of their deleterious effect on IPsec, NATs do not completely hinder the use of end-to-end data security. Security mechanisms at or above the transport layer, such as TLS or SSH, are unaffected if the applications being run do not transmit addresses or ports in their payloads.
Nevertheless, given the increasing demand for end-to-end network layer security, typically in the form of the virtual private networks that IPsec enables, NAT is seen as a critical roadblock for these services.
RSIP is an alternative to NAT that operates under the same assumptions of physical architecture and connectivity. While NAT is only defined in terms of operations on a flow of packets, RSIP is defined in terms of operations on the flow, and also a signaling association between the client host (RSIP client) and gateway router (RSIP server). The nature of the operations on the flow of packets is also quite different. NAT gateways must match incoming flows with arbitrary filters. The filters used by RSIP servers use only a very small fixed number of prefixes, which lends itself much more to efficient implementation in hardware.
Before a client can invoke the services of an RSIP server, it must first locate the RSIP server. That is, it must obtain the RSIP server's IP address. Obviously, this information can be manually configured in each client, but it is highly desirable to automate the discovery process, especially in situations in which a roaming client is visiting a ``foreign'' network.
Routers between a client and the RSIP server will not intercept and relay RSIP messages, as is common for DHCP. Since on most networks clients will obtain their local addresses with DHCP, the IETF is currently defining a DHCP option that informs clients of the address of the RSIP server. An alternative that is also under consideration is to use the Service Location Protocol (SLP). This protocol provides a framework for highly dynamic query-based discovery of services. SLP clients (user agents) can discover only those services (service agents) that satisfy certain criteria expressed by the client. Services are defined by SLP templates. There is now such a template for SLP-based discovery of RSIP servers.
The key to RSIP is for the RSIP client to prepare packets that are ready for the public network, such that no translation is necessary by the RSIP server. In order to do so, the RSIP client queries the RSIP server for the appropriate public address and port number(s) to utilize (see Section 3.3 for details), and then prepares a packet using these parameters. The RSIP client tunnels this packet over the private network to the RSIP server, which then only has to strip off the tunnel header, and forward the packet on to the public network.
An example RSIP packet flow is shown in Figure 1. An RSIP client (address 10.0.0.4) is connected to a multi-homed RSIP server by a private network. The RSIP server maintains one interface on the private network (address 10.0.0.1) and one interface on the public network (address 18.104.22.168). The figure illustrates a typical HTTP request-reply transaction between the RSIP client and a public WWW server (address 22.214.171.124). It is assumed that the RSIP server has allocated its public IP address and port 10000 to the RSIP client. When creating the HTTP request packet, the RSIP client uses port 10000 as the source port and the RSIP server's public address as the source IP address. This packet is then encapsulated in a tunnel that delivers it across the local network to the RSIP server. The RSIP server removes the tunnel header and forwards the packet on the public network. For the incoming response from the WWW server, the RSIP server performs a lookup on the destination port number. Finding port 10000 associated with address 10.0.0.4, the RSIP server forms the tunnel header and transmits the encapsulated packet to the RSIP client.
Since the RSIP client prepares the packet to appear as if it originated from the RSIP server, there is no need for an ALG. Furthermore, even if the RSIP client is using end-to-end network layer encryption with a public server, the transaction will operate properly through the RSIP server, because the RSIP server does not need to examine the payload contents. In general, the RSIP server will allocate more than one port number per RSIP client; thus, the client can utilize protocols, such as FTP, that require multiple simultaneous sessions.
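The packet flow described above can be sketched as follows. This is a simplified model, not the protocol's wire format: the `Packet` and `Tunneled` types, the method names, and the public addresses 192.0.2.1 and 198.51.100.7 (documentation addresses) are assumptions for illustration, and the tunnel is modeled abstractly as an outer source and destination wrapped around an untouched inner packet.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes = b""


@dataclass(frozen=True)
class Tunneled:
    outer_src: str
    outer_dst: str
    inner: Packet


class RsipServer:
    def __init__(self, private_ip: str, public_ip: str) -> None:
        self.private_ip = private_ip
        self.public_ip = public_ip
        self.port_owner: Dict[int, str] = {}  # public port -> client private IP

    def assign(self, client_private_ip: str, port: int) -> None:
        """Lease a public port to exactly one client."""
        if port in self.port_owner:
            raise ValueError("port already leased")
        self.port_owner[port] = client_private_ip

    def outbound(self, tunneled: Tunneled) -> Packet:
        """Strip the tunnel header; the inner packet is forwarded unmodified."""
        return tunneled.inner

    def inbound(self, packet: Packet) -> Optional[Tunneled]:
        """Demultiplex on destination port and tunnel to the owning client."""
        owner = self.port_owner.get(packet.dst_port)
        if owner is None:
            return None  # no lease for this port: drop
        return Tunneled(self.private_ip, owner, packet)


server = RsipServer(private_ip="10.0.0.1", public_ip="192.0.2.1")
server.assign("10.0.0.4", 10000)
# The client builds the packet with the leased public address and port.
request = Packet("192.0.2.1", 10000, "198.51.100.7", 80, b"GET / HTTP/1.0\r\n\r\n")
forwarded = server.outbound(Tunneled("10.0.0.4", "10.0.0.1", request))
```

Note that neither direction touches the inner packet's headers or payload, which is why no ALG is needed and why an encrypted payload passes through unharmed.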
When an RSIP client communicates on the private network, it uses its local (private) address, and is not restricted by RSIP in any way. Thus, an RSIP client must respond to ARPs for its private address, but it must not respond to ARPs for a public address that it is using.
Before an RSIP client can contact a public host, it must establish a signaling association with the RSIP server. The association can be either a TCP connection or a UDP session. It allows the RSIP server to lease address and port bindings to the client, de-allocate these bindings, and otherwise manage resources. The RSIP protocol runs in a simple request-response format. There are three major states that a client may be in: unregistered, registered, and assigned. All other states indicate that the client has transmitted a request to the server and is waiting for a response. This protocol is described in more detail in .
All clients begin in the unregistered state. It is assumed that they have acquired a private IP address, either via DHCP, static assignment, or some other mechanism. Before attempting communication with the public network, they must register with the RSIP server. This registration phase is expected to be performed as part of system initialization. Alternatively, it could be postponed until the client determines that it will, in fact, require RSIP services, and only then will the client request resources from the RSIP server. RSIP clients notify the RSIP server of their presence with a REGISTER_REQUEST message. The server replies with a REGISTER_RESPONSE message that includes a unique CLIENTID token as well as other policy. The client must include this token in all subsequent messages. Upon successful registration, the client enters the registered state. If the registration is not successful, the server will inform the client why with an ERROR_RESPONSE message. All RSIP request messages may be responded to with an ERROR_RESPONSE message, if the request is not granted.
Once registered, a client may request a public IP address and one or more ports with an ASSIGN_REQUEST message. Once an associated ASSIGN_RESPONSE is received from the server, the client enters the assigned state. Each ASSIGN_RESPONSE includes a per-client unique BINDID that identifies the bound resources. In the assigned state, the client may communicate with public hosts, request more resources with another ASSIGN_REQUEST message, free some assigned resources with a FREE_REQUEST message, or de-register with a DE-REGISTER_REQUEST message. From the assigned state, a de-registration frees all resources bound to the client.
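The three client states and the transitions between them can be summarized in a small state machine. The class and method names are hypothetical, and message encoding, transient "waiting for response" states, and error handling are omitted. One transition is an assumption not stated in the text: the sketch returns the client to the registered state once its last binding is freed.

```python
from enum import Enum
from typing import Optional, Set


class State(Enum):
    UNREGISTERED = "unregistered"
    REGISTERED = "registered"
    ASSIGNED = "assigned"


class RsipClient:
    def __init__(self) -> None:
        self.state = State.UNREGISTERED
        self.client_id: Optional[int] = None  # CLIENTID from REGISTER_RESPONSE
        self.bind_ids: Set[int] = set()       # BINDIDs from ASSIGN_RESPONSEs

    def on_register_response(self, client_id: int) -> None:
        if self.state is not State.UNREGISTERED:
            raise RuntimeError("already registered")
        self.client_id = client_id
        self.state = State.REGISTERED

    def on_assign_response(self, bind_id: int) -> None:
        if self.state is State.UNREGISTERED:
            raise RuntimeError("must register before requesting resources")
        self.bind_ids.add(bind_id)
        self.state = State.ASSIGNED

    def on_free_response(self, bind_id: int) -> None:
        self.bind_ids.discard(bind_id)
        # Assumption: with no bindings left, the client is merely registered.
        if self.state is State.ASSIGNED and not self.bind_ids:
            self.state = State.REGISTERED

    def on_deregister_response(self) -> None:
        # De-registration frees all resources bound to the client.
        self.client_id = None
        self.bind_ids.clear()
        self.state = State.UNREGISTERED
```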
Although RSIP is initially targeted at home networks and small to medium enterprises, it may also be deployed in large enterprise networks. These networks may include hundreds of subnets behind an RSIP server, and may require the RSIP client to know whether a given host is on the local or remote side of its RSIP server. To facilitate this situation, a client may transmit a QUERY_REQUEST to the server with the address of a host. The QUERY_RESPONSE from the server will indicate whether the host in question is local or not. In general, it is expected that the server will know of all subnets on the local side because it will be performing firewalling or packet filtering duties as a gateway to the public network.
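On the server side, answering a QUERY_REQUEST amounts to checking the queried address against the list of subnets known to be local. A minimal sketch, assuming the server holds that list as CIDR strings (the function name and subnet values are illustrative):

```python
import ipaddress
from typing import Iterable


def is_local(host: str, local_subnets: Iterable[str]) -> bool:
    """Return True if host falls inside any subnet on the private side."""
    addr = ipaddress.ip_address(host)
    return any(addr in ipaddress.ip_network(net) for net in local_subnets)


subnets = ["10.0.0.0/16", "10.1.0.0/16"]
local = is_local("10.1.2.3", subnets)        # host behind the RSIP server
remote = is_local("198.51.100.7", subnets)   # host on the public side
```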
If an RSIP client is required to act as a server for some application layer protocol, it must inform the RSIP server to pass to it all incoming packets to a particular IP address / port tuple. The client achieves this by transmitting a LISTEN_REQUEST to the server, and the server responding with a LISTEN_RESPONSE.
All resources acquired by an RSIP client are leased for a finite amount of time. Once the lease on a binding has expired, the RSIP server will transmit a FREE_RESPONSE message that informs the client that it may no longer use the resources associated with the binding specified. The server may transmit a FREE_RESPONSE message at any time. Likewise, an RSIP server may transmit a DE-REGISTER_RESPONSE at any time, terminating an RSIP client's registration and all of its bindings.
IPsec enables secure end-to-end communication. Packets can be encrypted, authenticated, or both. In order to use IPsec, two hosts must first establish a security association (SA), perhaps using the Internet Key Exchange (IKE) protocol. All packets protected by an SA have at least one extra header inserted between the IP header and the transport layer header. Figure 2 shows IPsec packets in transport mode. The other of two possible modes is tunnel mode, in which the AH or ESP headers are followed by another IP header. The encapsulating security payload (ESP) header, shown in Figure 2a, encrypts the entire packet payload, and optionally authenticates the entire packet payload except for part of the ESP trailer. The authentication header (AH), shown in Figure 2b, authenticates the entire packet including the immediately preceding IP header (except, of course, for the IP header fields that change per hop). Both ESP and AH may be applied to a packet, as is shown in Figure 2c.
Even though ESP does not include the preceding IP header in its cryptographic calculation, it does include the entire payload, which in transport mode includes the transport header. Commonly, the transport protocol is either TCP or UDP, in which case the transport header carries a checksum computed over a pseudo-header that contains, among other things, the source and destination address fields of the preceding IP header. Thus, indirectly, ESP in transport mode renders the IP source and destination addresses immutable, and is broken by NAT gateways.
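The dependence of the transport checksum on the IP addresses can be shown directly. The sketch below computes the standard Internet checksum over the TCP pseudo-header plus segment for IPv4; the function names, the sample segment bytes, and the addresses are illustrative assumptions. A NAT that rewrites the source address would have to update the checksum, which it cannot do when the transport header is encrypted or authenticated.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo-header + TCP segment (checksum field zeroed)."""
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,             # reserved zero byte
        6,             # protocol number for TCP
        len(segment),  # TCP length: header plus data
    )
    return internet_checksum(pseudo + segment)


segment = bytes(20) + b"hello"  # placeholder 20-byte header plus data
before_nat = tcp_checksum("10.0.0.4", "198.51.100.7", segment)
after_nat = tcp_checksum("192.0.2.1", "198.51.100.7", segment)
# The two values differ even though the segment itself is unchanged.
```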
Given that the keying material used for the encryption and authentication can only be shared between the RSIP client and its public peer, the RSIP-enabled gateway will not be able to read TCP/UDP ports when ESP is used, and will not be able to modify the ports when ESP or AH is used. Thus, IPsec generally will not work through a NAT. However, ESP in tunnel mode is impervious to NATs, in that the outer IP header is not cryptographically protected. Hence, it is possible to modify the outside header without rendering the packet useless. In spite of this, NAT gateways have no surefire way of establishing the appropriate mapping to demultiplex these packets. Recent Linux implementations use temporal association to guess what the right mappings are, based on the assumption that outgoing packets are immediately followed by incoming traffic. And from this, it is possible to guess which client will be expecting the subsequent incoming IPsec packets.
In order to support IPsec through an RSIP-enabled gateway, we need to solve two problems: (1) finding one or more fields in the packet headers that can be used to demultiplex incoming packets to RSIP clients, and (2) ensuring that these fields are unique per public IP address used for RSIP. The key to addressing these issues is that all ESP and AH headers contain a 32-bit value called the security parameter index (SPI), which is unique per SA and is always kept in the clear. Furthermore, the SPI value in packets received by a host is specified by the host during IKE negotiation. Therefore, an RSIP server can allocate mutually exclusive SPI values along with each IP address and port assignment. The RSIP client will tell its peer to use an allocated SPI value. The RSIP server, knowing the SPI values allocated to each client, will be able to route incoming packets to the proper client by examining the destination address and SPI. Note that even though port numbers do not need to be unique for routing purposes, they do need to be unique in order to avoid socket collisions when two RSIP clients using the same IP address communicate with the same server on the same destination port.
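The SPI-based scheme can be sketched as an allocator plus a lookup keyed on (destination address, SPI). The class and function names are hypothetical. The sketch reads the SPI from the first four bytes of an ESP header; for AH the SPI sits at a different offset, which is not handled here, and wraparound of the 32-bit SPI space is ignored.

```python
import itertools
import struct
from typing import Dict, Optional, Tuple


def esp_spi(esp_header: bytes) -> int:
    """The SPI is the first 32-bit big-endian word of an ESP header."""
    return struct.unpack("!I", esp_header[:4])[0]


class SpiAllocator:
    def __init__(self, first_spi: int = 256) -> None:
        # SPI values below 256 are reserved, so allocation starts at 256.
        self._counter = itertools.count(first_spi)
        self._owner: Dict[Tuple[str, int], str] = {}

    def allocate(self, public_ip: str, client_private_ip: str) -> int:
        """Hand out an SPI that no other client is using."""
        spi = next(self._counter)
        self._owner[(public_ip, spi)] = client_private_ip
        return spi

    def demux(self, dst_ip: str, spi: int) -> Optional[str]:
        """Find the client that owns an incoming IPsec packet, if any."""
        return self._owner.get((dst_ip, spi))


alloc = SpiAllocator()
spi_a = alloc.allocate("192.0.2.1", "10.0.0.4")
spi_b = alloc.allocate("192.0.2.1", "10.0.0.5")
incoming = struct.pack("!II", spi_b, 1)  # SPI followed by sequence number
owner = alloc.demux("192.0.2.1", esp_spi(incoming))
```

Because the client proposes its inbound SPI to the peer during negotiation, the server never needs the keying material; it only needs the cleartext SPI to pick the right tunnel.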
One more detail needs to be addressed before the RSIP/IPsec integration is complete. IKE implementations currently use port 500 as source and destination for all communication. If concurrent IKE negotiations are taking place between two RSIP clients and the same public host, the RSIP router will not have enough information to route the public host's packets to the proper client just by examining the headers. However, all IKE packets contain an initiator cookie in the first eight bytes of payload. The value of this token is chosen by the initiator of the IKE session; i.e., the RSIP client. Thus, initiator cookies, like ports and SPIs, can be distributed in a mutually exclusive fashion by the RSIP server. A more elegant alternative is to let IKE clients use an ephemeral source port number. An RSIP implementation would then be able to choose a locally-unique port that could be used to demultiplex incoming IKE replies.
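The cookie-based alternative can be sketched in the same way as the SPI scheme: the server hands each client a distinct eight-byte initiator cookie and later matches the first eight bytes of an incoming IKE payload against its table. The class name is hypothetical, and the use of random cookies is an assumption of this sketch rather than a requirement stated in the text.

```python
import os
from typing import Dict, Optional


class CookieTable:
    def __init__(self) -> None:
        self._owner: Dict[bytes, str] = {}  # initiator cookie -> client IP

    def allocate(self, client_private_ip: str) -> bytes:
        """Return a fresh 8-byte cookie not in use by any other client."""
        while True:
            cookie = os.urandom(8)
            # Skip the all-zero value and anything already handed out.
            if cookie != bytes(8) and cookie not in self._owner:
                self._owner[cookie] = client_private_ip
                return cookie

    def demux(self, udp_payload: bytes) -> Optional[str]:
        """Match the first eight payload bytes against allocated cookies."""
        return self._owner.get(udp_payload[:8])


table = CookieTable()
cookie = table.allocate("10.0.0.4")
reply_payload = cookie + bytes(20)  # peer's reply echoes the initiator cookie
owner = table.demux(reply_payload)
```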
The discussions of RSIP/IPsec interactions in this section are necessarily brief due to space constraints. A more complete presentation of these issues, including the RSIP signaling messages for IPsec, is found in .
RSIP presents a potentially revolutionary concept that can be deployed in an evolutionary fashion. RSIP overcomes the difficulties of NAT in a way that can co-exist with NAT. As NAT gateways are upgraded to support RSIP, legacy NAT clients can continue to use the NAT while RSIP clients take advantage of the benefits offered by RSIP. RSIP gives the network community a more scalable, usable, and secure alternative to NAT as a holdover until IPv6 is deployed.
A number of RSIP issues are currently being resolved in the IETF. While  contains a more complete list, we will address the most relevant.
Incoming IKE connections are much easier to support if the peer can initiate IKE exchanges to a port other than 500. In this case, the RSIP client would allocate that port at the RSIP server via ASSIGN_REQUEST. Alternatively, if the RSIP client is able to allocate an IP address at the RSIP server, the peer could simply initiate the IKE exchange to port 500 at that address.
If there is only one address that must be shared by the RSIP server and all its clients, and if the peer can only send to port 500, the problem is much more difficult. At any given time, the combination of address and UDP port 500 may be registered and used by only one RSIP system (including clients and server).
Solving this issue requires demultiplexing the incoming IKE connection request based on something other than the port and address combination. It may be possible to do so by first registering an identity with a new RSIP command of LISTEN_RSIP_IKE. Note that the identity could not be that of the IKE responder (the RSIP client), but that of the initiator (the peer). The reason is that IKE Phase 1 only allows the sender to include its own identity, not that of the intended recipient (both, by the way, are allowed in Phase 2). Furthermore, the identity must be in the clear in the first incoming packet for the RSIP server to be able to use it as a demultiplexor. This rules out all variants of Main Mode and Aggressive Mode with Public Key Encryption (and Revised Mode of Public Key Encryption), since these encrypt the ID payload.
The only Phase 1 variants which enable incoming IKE sessions are Aggressive Mode with signatures or with pre-shared keys. Because this scheme involves the RSIP server demultiplexing based on the identity of the IKE initiator, it is conceivable that only one RSIP client at a time may register interest in fielding requests from any given peer. Furthermore, this precludes more than one RSIP client's being available to any unspecified peer.
Once the IKE session is in place, IPsec is set up as discussed in this document, namely, by the RSIP client and the RSIP server agreeing on an incoming SPI value, which is then communicated to the peer as part of Quick Mode.
The alternate address and port combination must be discovered by the remote peer using methods such as manual configuration, or the use of KX  or SRV  records. It may even be possible for the DNS query to trigger the above mechanisms to prepare for the incoming and impending IKE session initiation. Such a mechanism would allow more than one RSIP client to be available at any given time, and would also enable each of them to respond to IKE initiations from unspecified peers. Such a DNS query, however, is not guaranteed to occur. For example, the result of the query could be cached and reused after the RSIP server is no longer listening for a given IKE peer's identity.
Due to the limitations implied by having to rely on the identity of the IKE initiator, the only practical way of supporting incoming connections is for the peer to initiate the IKE session to a port other than 500.
An RSIP server is located at the border between two disparate address spaces. In most deployment scenarios, this is the border between the global Internet and a private network. In other scenarios (for example, in business-to-business communications), the address spaces on either side of the RSIP server may have conflicting address ranges. This may happen if both address spaces use net 10.0.0.0 within their network. In this case, the QUERY_REQUEST cannot be resolved by the RSIP server by just examining an IP address. A variant of QUERY_REQUEST that uses DNS names instead of IP addresses may solve this issue.
A general solution probably implies further refinements to the protocol.
In , the future of IPv6 deployment is addressed. Three scenarios are explored, one in which IPv6 never deploys, another in which it partially deploys, and a third in which it fully deploys. It is very unlikely that IPv6 will fully deploy without some intermediate form of partial deployment. Thus, a key ingredient for the transition to IPv6 is the existence of transition technologies  that allow IPv6 to be deployed on an incremental basis, while coexisting with the legacy IPv4 infrastructure.
In situations in which IPv6 is deployed on some edge networks while backbones and other edge networks remain IPv4-only, RSIP can play a valuable role. The hosts on the IPv6 edge networks may be dual stack (i.e., they simultaneously support both IPv4 and IPv6). By placing RSIP clients in these hosts and an RSIP server on the gateway router between the IPv4 and IPv6 spaces, RSIP can allocate IPv4 addresses to the IPv4 stacks of these hosts when necessary. For example, the following communications can be supported:
This system is very similar in spirit to the Dual Stack Transition Mechanism  proposal, which uses DHCP for IPv4 address allocation. However, DSTM does not allow multiple clients to share the same IP address from the gateway machine. In other words, DSTM does not have an equivalent to the RSAP-IP  or RSIPSEC  methods in RSIP.
History has shown us that changing the core of the network to support a new protocol is very difficult. The lack of widespread RSVP, IP multicast, and IPv6 deployment attests to this premise. However, changes to the network edge occur gradually over time. Privacy concerns have led to the deployment of firewalls and host security in almost every edge network. Configuration has been eased by deployment of DHCP. Roaming is supported by Mobile IP. RSIP is another way to upgrade the edge of the network to overcome the limitations of legacy network design.
This paper was originally published in the
Proceedings of the Special Workshop on Intelligence at the Network Edge,
March 20, 2000, San Francisco, California, USA
Last changed: 11 Feb 2002 ml