forwarding latency


sbk 19 years ago

i want to quantify the latency in my network, as part of an effort to determine whether or not we can deploy a new application

so i've grabbed my Finisar THG box (a hardware packet sniffer with an internal clock accurate to 20ns), a stack of various switches, and some cables. i plug the two ports of the THG box into a switch, send 1,000 pings at a specified interval from one THG NIC through the switch to the other THG NIC, subtract the packet insertion time, average the resulting pile of numbers, and come up with a figure for the forwarding latency (aka decision time) of the device. see my results below. [acknowledgement: work performed by the staff of Network Protocol Specialists]
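For readers who want to reproduce the arithmetic, here is a minimal sketch of the calculation described above. It assumes each probe yields one transmit and one receive timestamp in nanoseconds, that the difference between them includes exactly one packet insertion time, and that cable propagation is negligible. The function names and the sample timestamps are made up for illustration; they are not output from the THG box.

```python
from statistics import mean


def insertion_time_ns(frame_bytes, link_bps):
    """Time to clock a frame onto the wire, in nanoseconds."""
    return frame_bytes * 8 / link_bps * 1e9


def forwarding_latency_ns(tx_ns, rx_ns, frame_bytes, link_bps):
    """Average (rx - tx) over all probes, minus one insertion time each."""
    insertion = insertion_time_ns(frame_bytes, link_bps)
    deltas = [rx - tx - insertion for tx, rx in zip(tx_ns, rx_ns)]
    return mean(deltas)


# hypothetical timestamps for two 64 byte probes on a 1 Gb/s port
tx = [0.0, 10000.0]
rx = [1217.0, 11217.0]
print(forwarding_latency_ns(tx, rx, 64, 1e9))
```

Whether the analyzer stamps the first or the last bit of a frame changes what has to be subtracted, so the "one insertion time" assumption should be checked against the capture hardware before trusting the result.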

now i want a sanity check. Cisco must perform this same test (possibly with fancier hardware, like SmartBits boxes) routinely on their gear ... where do they post these results? i've been poking around

formatting link

without success

for interest, here are my numbers:

Catalyst 4003 100BaseT ports: 3170ns
Catalyst 4003 1000BaseSX ports: 705ns
[same forwarding latency for 64 byte and for 1518 byte packets]

Cat 4503 1000BaseSX 64 byte: 3300ns
Cat 4503 1000BaseSX 1518 byte: 7120ns
[why the change in forwarding latency depending on packet size? remember, i've already subtracted packet insertion time]

Cat 6506 1000BaseSX 64 byte: 5000ns
Cat 6506 1000BaseSX 1518 byte: 7120ns

Datacomm Aggregation tap 100Mb: 320ns
In-Line Finisar 100Mb tap: 0ns
NetGear 100Mb hub: 330ns

and finally, we ran a test across our production network (which translates into two access-layer Cat 4506s, two distribution-layer Cat 6506s, one core-layer Cat 6506, plus ~500m of cabling) ... and came out with ~20us of latency, exclusive of packet insertion time. good stuff
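As a rough cross-check of the production figure, the per-device bench numbers can simply be added up along the path. This is only a sketch: it reuses the 64 byte Cat 4503 result for the Cat 4506s, assumes about 5 ns of propagation per metre of cable, and ignores load and any size-dependent internal delay. All three are assumptions, not measurements.

```python
# 64 byte bench results quoted in this post, in nanoseconds
CAT4500_NS = 3300  # measured on a Cat 4503; applied to the 4506s by assumption
CAT6500_NS = 5000  # measured on a Cat 6506
PROPAGATION_NS_PER_M = 5  # assumed typical figure for cable


def path_latency_ns(n_4500, n_6500, cable_m):
    """Sum of per-switch forwarding latency plus cable propagation."""
    switches = n_4500 * CAT4500_NS + n_6500 * CAT6500_NS
    return switches + cable_m * PROPAGATION_NS_PER_M


# two access 4506s, three 6506s (two distribution + one core), ~500m of cable
estimate_ns = path_latency_ns(2, 3, 500)
print(estimate_ns / 1000, "us")
```

Under those assumptions the sum comes to about 24us, which is in the same neighbourhood as the ~20us observed end to end.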

-where does Cisco post the numbers they have recorded?

-has anyone else performed these kinds of measurements? i'm wanting a conversation about what drives forwarding latency in different Catalyst models ... why, for instance, do different packet sizes change decision time in some models but not in others? what techniques and tools have you used to perform these measurements?

i'm wanting both a sanity check and a deeper understanding

--sk

stuart kendrick fhcrc


anybody43 19 years ago

I guess they don't publish such things for marketing reasons. They think that it is best not to.

I don't know if you will find these precise numbers; however, there are a number of organisations that do performance tests on equipment. These are often "competitive" tests where, say, a vendor pays for a test and the tester compares the vendor's kit with other kit against a test script that the vendor has generated.

The vendor of course knows what the result is going to be, but they believe they get a marketing advantage from having the tests executed by an independent body.

Scott Bradner of Harvard university used to run such tests and I recall that someone who used to (or does?) post here does too. Scott founded the Harvard Network Device Test Lab.

Serialisation delay:

Link speed   64 byte   1518 byte
10M          51us      1200us
100M         5us       120us
1,000M       0.5us     12us
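The serialisation figures above follow directly from frame size and link speed. A minimal sketch of the arithmetic, counting only the frame bytes (preamble and inter-frame gap are left out, which is an assumption about how the figures were derived):

```python
def serialisation_delay_us(frame_bytes, link_mbps):
    """Bits divided by Mbit/s gives microseconds."""
    return frame_bytes * 8 / link_mbps


for mbps in (10, 100, 1000):
    small = serialisation_delay_us(64, mbps)
    large = serialisation_delay_us(1518, mbps)
    print(f"{mbps}M: 64 byte {small:.3g}us, 1518 byte {large:.4g}us")
```

The exact values (51.2us and 1214.4us at 10M, and so on) round to the figures quoted.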

If you are able to get pre-sales support from Cisco and you are willing to sign a non-disclosure agreement you may get a verbal briefing on such numbers from a Cisco Systems Engineer.

"[why the change in forwarding latency depending on packet size?]" Well, the packet has to get transmitted around inside the box and it may be subject to serialisation delay there too. e.g. IIRC the original 6500 with 32G bus had a bus 256 bits wide.

Your latency numbers look OK but they are only a small part of the picture when you have multiple hops. The serialisation delay will nearly always dominate. In a normal data network, even if you are sending a small frame, if a full size frame hits the output buffer just before yours arrives, your small frame will be delayed by the serialisation delay of the full size frame.

With TCP, you can get multiple back to back frames sent by a single sender. TCP fairly aggressively probes the network to fill it up.

Good luck.


stephen 19 years ago

you need to think about what matters for the applications you use - they have to put up with serialisation delay, possibly at several points in the path between 2 end points.

so a 1500 byte packet is 12000 bits, or 12 uSec more delay on a 1 Gbps port.

Cat 4500s have an internal architecture with blade ASICs, link to backplane, central switch then back out.

the links from a blade to the fabric depend on the Sup type - a Sup5 has around a 6Gb/s link, so store and forward across that will add delay which is packet size dependent.
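To put a number on that size-dependent delay, here is a sketch that models each internal link as a store-and-forward stage. The 6 Gb/s rate comes from the paragraph above; treating the path as a fixed number of such crossings, and the function name itself, are assumptions made for illustration rather than a description of the actual hardware.

```python
def internal_link_delay_ns(frame_bytes, link_gbps, crossings=1):
    """Store-and-forward delay over internal links: bits / (Gbit/s) = ns."""
    return crossings * frame_bytes * 8 / link_gbps


# extra delay a 1518 byte frame sees compared with a 64 byte frame
one_crossing = internal_link_delay_ns(1518, 6) - internal_link_delay_ns(64, 6)
two_crossings = 2 * one_crossing
print(round(one_crossing), "ns for one crossing")
print(round(two_crossings), "ns for blade -> fabric -> blade")
```

One crossing accounts for roughly 1.9us of extra delay for the large frame, and two crossings for roughly 3.9us. The bench numbers earlier in the thread show a 3820ns gap between 64 byte and 1518 byte frames on the Cat 4503, so this simple model lands in the right range, though the Sup type in that chassis was not stated.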

for the Cat 6506, same as for the 4500 - you need to know which Sup / line card type to understand the underlying traffic path.

bandwidths are higher, so effect on delay for bigger packets is proportionally less.

1 thing to notice is that multiple hops is such that delays thru a single switch do not dominate - so a few nSec in the switch are probably irrelevant.

or - the packet serialisation delay dominates?

not seen much on latency in switches - the main reason is that the numbers should be low enough that store and forward dominates.

try

formatting link

i did some tests on Cat6k - although we were mainly worrying about overall thruput and what happens under congestion conditions.


sbk 19 years ago

thanx for the heads-up on why such test results may be misleading

as far as i can tell, NDTL is defunct ... if you know differently, do tell

ahh, of course ... i had forgotten about this :( hmmm, given this ... i'm curious to understand how the Catalyst 4003 achieves its invariant forwarding latency, i.e. unaffected by packet size

ahh, yes, good point thanx for the input

--sk


sbk 19 years ago

hi stephen,

ok, i buy that -- i need to understand the underlying hardware architecture in order to spin a coherent story behind the latency i'm observing

got it, serialization delay dominates on multihop paths

thanx for the pointer ... i see that eantc reports a 12us forwarding delay in the C6K for 64 byte packets, under some sort of load ... quite a bit more than the 5us i'm seeing ... but then ... i'm not testing under load. still, same ball park, and i find that reassuring

--sk


sbk 19 years ago

[...]

on a vaguely related note, i'm listening to vendors (they sell Fibre Channel switches) tell me that their gear reduces network latency by employing cut-thru switching ... this, combined with low forwarding latencies (2us) means, according to their view, that networks constructed from their gear contribute little to the total latency calculation ... serialization delay is incurred roughly once, no matter how many hops

[i think they are making assumptions here ... like ... minimal load on the network ... once multiple frames are competing for an egress port, all but one of those frames gets buffered and will incur serialization delay on egress. this view also assumes that all ports are transmitting at the same speed ... i.e. any 2Gb-to-4Gb transition immediately reverts the traffic to store-and-forward]
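The vendors' claim can be written down as a simple model. This sketch assumes an unloaded path, every link at the same speed, no propagation delay, and a cut-thru switch that adds nothing beyond its forwarding latency. The function names and the example frame size are hypothetical.

```python
def serialisation_us(frame_bytes, link_gbps):
    """Time to clock one frame onto a link, in microseconds."""
    return frame_bytes * 8 / (link_gbps * 1000)


def store_and_forward_us(switches, frame_bytes, link_gbps, fwd_us):
    """The frame is fully serialised on every link: switches + 1 times."""
    links = switches + 1
    return links * serialisation_us(frame_bytes, link_gbps) + switches * fwd_us


def cut_through_us(switches, frame_bytes, link_gbps, fwd_us):
    """Serialisation is paid roughly once, regardless of hop count."""
    return serialisation_us(frame_bytes, link_gbps) + switches * fwd_us


# hypothetical path: 3 switches, 2 Gb/s links, 2us forwarding latency each
for frame in (64, 2048):
    sf = store_and_forward_us(3, frame, 2, 2.0)
    ct = cut_through_us(3, frame, 2, 2.0)
    print(f"{frame} byte: store-and-forward {sf:.2f}us, cut-thru {ct:.2f}us")
```

The gap between the two grows with frame size and hop count, and it disappears as soon as a frame has to queue behind another one on egress or cross a speed change, which are the caveats raised above.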

as far as i can tell, cut-thru was a flash-in-the-pan in the Ethernet world ... i recall it as Kalpana's special sauce in 1994 ... phased out by the end of the 90s

i don't have a Fibre Channel analyzer, so i'm having trouble validating these claims myself

-can anyone confirm, through personal observation, that cut-thru happens in Fibre Channel gear?

-can anyone confirm 2us forwarding delays?

-other comments?

i'm not particularly concerned about this ... seems to me that disks and OSes and applications are still delivering latencies on the orders of milliseconds ... so shaving microseconds off network latency sounds inconsequential to me. however, i'm curious as to how Fibre Channel switch manufacturers could be delivering forwarding latencies which are so much lower than those of the Ethernet switch manufacturers ... i mean ... silicon is silicon, isn't it? ;) ... and i'm also curious as to how cut-thru has survived in Fibre Channel land, whereas it hasn't in Ethernet land. [of course, these claims may be sales fluff, i.e. technically not accurate]

--sk


anybody43 19 years ago

Just a quick comment for now.

The real issue with cut-through, as I understand it, is that it cannot be made to work across ports of differing speed. Well, slow-to-fast is the problem. This is, I believe, why it was abandoned by pretty much all of the mainstream network kit makers, well, Cisco anyway :-).

It may be that with Fibre Channel this issue is not present or is worked around. With Ethernet and arpa/Ethernet II encapsulation it is not known until the end of the frame arrives how long the frame is. FC may be different.

Without measuring the FC kit it is clearly hard to be sure what is going on. I don't know about FC at all really, but remember that the 6500, for example, can do a lot more than L2 forwarding of a frame. It has huge buffers to manage, does L3 forwarding, QoS, all kinds of packet re-writing, ACLs. All in hardware at line rate (let's say).

Oh, let's not forget multicast and broadcast. Seems OK to me that it takes a few us to figure out what to do.

It seems to me that you need to get your Cisco Partner to wheel in a nice SE from Cisco. That used to be the only way to get a handle on the various architectures.

Some of Cisco's on-line presentations are quite nice but may be available only to partners.

I was up to speed on the 6500 32G bus but have no real idea about the 4500 or sup720.

If you are not able to convince Cisco to get a pre-sales resource in then you may be in some difficulty.

There was years ago an academic paper on the internet that seemed to describe the 8500 architecture. I suspect that the current cross-bars are not the same but it would give you an insight into the issues that need to be considered.

I will have a look for it.
