Next: Related Work Up: Evaluation Previous: Application Throughput

Cornell National Lambda Rail Rings

Figure 12: Data loss as a result of disaster and wide-area link failure (Cornell NLR-Rings, 37 ms one-way delay).

In addition to our emulated setup and results, we are beginning to study, on physical hardware, systems that operate on dedicated lambda networks of the kind that might be seen in cutting-edge financial, military, and educational settings. To study these "personal" lambda networks, we have created a new testbed, the Cornell National Lambda Rail (NLR) Rings testbed, which consists of optical network paths of varying physical length that start and end at Cornell.

The Cornell NLR-Rings testbed consists of three rings: a short ring that goes from Cornell to New York City and back, a medium ring that goes to Chicago, down to Atlanta, and back, and a long ring that goes to Seattle, down to Los Angeles, and back. The one-way latencies are 7.9 ms, 37 ms, and 94 ms for the short, medium, and long rings, respectively. The underlying optical networking technology is state-of-the-art: a 10 Gbps wide-area network running on dedicated fiber optics (separate from the public Internet), created as a scientific research infrastructure by the NLR consortium [3]. Each ring includes multiple segments of optical fiber, linked by routers and repeaters. More importantly, on the medium and long rings, each network packet follows a path that never traverses the same segment twice. See NLR [3] for a map.

Though all rings in the testbed are capable of 10 Gbps end-to-end, we are currently able to operate at only hundreds of megabits per second due to network construction. Nonetheless, we are able to study the effects of disaster on dedicated wide-area lambda networks, and we hope to use more of the available bandwidth in the future.
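To give a sense of scale for these rings, the following minimal sketch computes the amount of data in flight on each ring as the product of sending rate and one-way latency. The latencies and the 10 Gbps figure come from the text above; the function names are illustrative, and the calculation assumes a sender that keeps the link fully utilized, which the testbed does not currently do.

```python
# One-way latencies (in milliseconds) reported for the three rings.
RING_LATENCY_MS = {"short": 7.9, "medium": 37.0, "long": 94.0}


def bytes_in_flight(rate_bps: float, one_way_ms: float) -> float:
    """Bytes sent but not yet delivered: rate (bits/s) x one-way delay (s) / 8.

    Assumption: the sender saturates the link at rate_bps.
    """
    return rate_bps * (one_way_ms / 1000.0) / 8.0


def in_flight_table(rate_bps: float) -> dict:
    """Map each ring name to its in-flight byte count at the given rate."""
    return {
        ring: bytes_in_flight(rate_bps, latency_ms)
        for ring, latency_ms in RING_LATENCY_MS.items()
    }


if __name__ == "__main__":
    # 10 Gbps is the rated end-to-end capacity of every ring.
    for ring, nbytes in in_flight_table(10e9).items():
        print(f"{ring:>6} ring: {nbytes / 1e6:.2f} MB in flight at 10 Gbps")
```

At the rated capacity, the medium ring holds roughly 46 MB in flight at any instant; at the hundreds of megabits per second currently achievable, the figure is proportionally smaller.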

To study the effects of disaster in this wide-area testbed, we conducted the same disaster experiment described in Section 5.3. Using a router that we control, we induced loss on the wide-area link 0.5 seconds before the primary site failed. When the primary site then failed, the wide-area link and all processes were killed. Figure 12 shows data loss during this disaster for the medium path on the Cornell NLR-Rings testbed. The x-axis shows the loss rate induced on the wide-area link (link losses are random, independent, and identically distributed), and the y-axis shows the number of messages sent and the number of unrecoverable messages. The figure illustrates two interesting results. First, local-sync lost messages even when no loss was induced on the wide-area link. A likely cause is that the wide-area testbed itself drops some packets, which prevents local-sync protocols from delivering them to the mirroring application. Local-sync+FEC and network-sync, on the other hand, did not lose messages because both can mask wide-area link loss. Second, due to the relatively low bandwidth, packets were able to transit out of the local area, so no loss occurred there, which enabled both local-sync+FEC and network-sync to mask wide-area link loss.
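The masking effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the FEC scheme used in the experiment: it models a generic erasure code in which each group of `k` data packets is sent with `r` repair packets, and the group is fully recoverable whenever at most `r` of its `k + r` packets are dropped. The values `k=8` and `r=3`, the function name, and the seed are hypothetical. Each packet is dropped independently with the same probability, matching the i.i.d. link loss described in the text.

```python
import random


def simulate_link_loss(n_groups, loss_rate, k=8, r=3, seed=0):
    """Return (data packets sent, lost without FEC, unrecoverable with FEC).

    Assumptions (hypothetical, for illustration only): a generic (k, r)
    erasure code where a group is recoverable if at most r of its k + r
    packets are dropped; drops are i.i.d. with probability loss_rate.
    """
    rng = random.Random(seed)
    lost_plain = 0
    lost_fec = 0
    for _ in range(n_groups):
        # Drop each of the k data and r repair packets independently.
        dropped = [rng.random() < loss_rate for _ in range(k + r)]
        data_dropped = sum(dropped[:k])
        # Without FEC, every dropped data packet is unrecoverable.
        lost_plain += data_dropped
        # With FEC, dropped data packets are lost only if the group has
        # more drops than the repair packets can cover.
        if sum(dropped) > r:
            lost_fec += data_dropped
    return n_groups * k, lost_plain, lost_fec


if __name__ == "__main__":
    for rate in (0.0, 0.001, 0.01, 0.05):
        sent, plain, fec = simulate_link_loss(10_000, rate)
        print(f"loss={rate:<6} sent={sent} no-FEC lost={plain} FEC lost={fec}")
```

The sketch captures only the wide-area loss component of the experiment. It does not model the primary-site failure, nor packets dropped by the testbed itself when no loss is induced.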

Hakim Weatherspoon 2009-01-14