Dropping packets


die.spam 17 years ago

I have a Cisco ME 3400 switch which is dropping a lot of packets on one of its uplinks, and customers are noticing the packet loss.

The switch has two gig uplinks. One has a combined tx/rx rate of 445Mbps and never drops a packet; the other has a combined tx/rx throughput of 607Mbps (give or take) and drops hundreds of packets a second. I think the issue may be caused by the 3400 being configured for jumbo frames while the 4507 is running an MTU of 1500.

However, looking at the traffic patterns, far more data is transmitted than received, so what is the best way to check for loops on the switch, given that the 3400 doesn't support STP on the UNI ports? I have checked the MAC table and there are no dupes. Are there any other methods for tracking loops down?

GigabitEthernet0/13 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is 0021.d7ed.e60d (bia 0021.d7ed.e60d)
  Description: xx
  MTU 9000 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 154/255, rxload 4/255
  Encapsulation ARPA, loopback not set
  Keepalive not set
  Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseLX SFP
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 8w5d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 151923763
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 16614000 bits/sec, 13486 packets/sec
  5 minute output rate 605606000 bits/sec, 118387 packets/sec
     81890216650 packets input, 16901966809462 bytes, 0 no buffer
     Received 149312500 broadcasts (148384638 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 148384638 multicast, 0 pause input
     0 input packets with dribble condition detected
     508715659366 packets output, 360874240939303 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out


bod43 17 years ago

I am not familiar with this switch but I am familiar with the related 2950 and 3550.

Regarding loops. If your design allows customers to create loops then this would seem to be a bit of a problem.

Some switches log 'suspicious' MAC address moves, i.e. they log cases where MAC addresses move from one port to another.

You could use snmp to log the stats (drops, bytes in and bytes out) for all ports. Then you might be able to see a pattern.

Note that most cisco kit these days seems to only update the snmp counters every 15 or 20 seconds.

At present your rate stats are being recorded with a time constant of 5 mins. This can be changed down to 30 secs with

conf t
int x
load-interval 30

Maybe you need to see if you can apply QoS to some of your clients?

There is no way for the 4500 to tell the 3400 that the MTU is inconsistent. If big frames are present in the network the 3400 will just transmit them and the 4500 will just drop them. So the MTU mismatch cannot be the cause of your output drops.

Ideally you want to get mrtg or something like that going, but a quick and dirty free SNMP client is getif.exe. I have recently pointed to it on this list.

It is pretty clunky and limited, but with a small number of ports such as you have it will just about get the job done. Hmmm, maybe not: the 32-bit counters will wrap pretty quickly and I am pretty sure it won't handle the 64-bit ones. It does not handle overflows very gracefully.
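To put a rough number on how quickly the counters wrap, here is a minimal sketch. It assumes the standard 32-bit octet counters (ifInOctets/ifOutOctets) and uses the 5-minute output rate from the first post; the function names are made up for illustration and are not part of any SNMP tool.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps to 0 after this many counts


def seconds_until_wrap(rate_bps, modulus=COUNTER32_MODULUS):
    """Time for an octet counter to wrap at a steady bit rate."""
    octets_per_second = rate_bps / 8
    return modulus / octets_per_second


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Counts between two polls, assuming at most one wrap in between."""
    return (current - previous) % modulus


def rate_from_polls(previous, current, interval_seconds, modulus=COUNTER32_MODULUS):
    """Average bit rate between two polls of an octet counter."""
    return counter_delta(previous, current, modulus) * 8 / interval_seconds


if __name__ == "__main__":
    output_rate = 605_606_000  # 5 minute output rate on Gi0/13, bits/sec
    print(f"32-bit octet counter wraps every {seconds_until_wrap(output_rate):.1f} s")
    # Two hypothetical polls 20 s apart that straddle a wrap:
    print(f"{rate_from_polls(4_294_967_000, 1_000, 20):.1f} bits/sec")
```

At this rate the wrap time comes out a little under a minute, so any poller using the 32-bit counters has to sample more often than that or the deltas become ambiguous. The 64-bit ifHCInOctets/ifHCOutOctets counters avoid the problem if the client can read them.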

what speed are your customer links? what model is your switch? what software is it running?


fugettaboutit 17 years ago

Check your Ethernet flow-control settings and/or for upstream congestion...your upstream switch may be causing this switch to drop frames.


die.spam 17 years ago

Thanks for the response. First, flow control is off on both switches, and I would expect the PAUSE counters to be increasing if that were the cause.

I have a decent SNMP system running already (HP Network Node Manager and HP OpenView), but it hasn't helped me narrow down any loops.

All other ports have 0 output drops, and no host flapping events.

There are 3 customer gig ports and 3 100Mb ports. They are rate limited with policy maps on the gig trunk, but I'm not sure that they are all rate limited, so my next step will be to ensure all VLANs have a rate limit specified.

But thanks for your point about the MTU size. You're correct: if the 3400 is running a larger MTU than the 4500, the drops would be on the 4500 ingress.


bod43 17 years ago

Well, if you collect fine-grained statistics from the ports, you would be able to detect loops from simultaneous changes in the traffic on different ports.

As mentioned I think that most switches update the snmp counters every 15 or 20 seconds. This is of course irritatingly infrequent for some purposes but it may be the best that you can do.

It is easy enough to determine the behaviour of your platform with snmpget or otherwise.

There are counters for pause frames on the 4500 for sure.

From memory :- sh int counters
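The idea of spotting a loop from simultaneous traffic changes could be sketched roughly as follows. This is only an illustration under stated assumptions: the rates are per-poll values already collected for every port at the same times, a "jump" means a rate at least `factor` times the previous sample, and the function name, threshold and sample data are all invented.

```python
def simultaneous_jumps(rates_by_port, factor=3.0, min_ports=2):
    """Return (sample index, ports) where several ports jump together.

    rates_by_port maps a port name to a list of per-interval rates,
    with the same number of samples (and polling times) for every port.
    """
    lengths = {len(series) for series in rates_by_port.values()}
    if len(lengths) > 1:
        raise ValueError("all ports need the same number of samples")
    n_samples = lengths.pop() if lengths else 0

    flagged = []
    for i in range(1, n_samples):
        jumped = [
            port
            for port, series in rates_by_port.items()
            if series[i - 1] > 0 and series[i] >= factor * series[i - 1]
        ]
        if len(jumped) >= min_ports:
            flagged.append((i, sorted(jumped)))
    return flagged


if __name__ == "__main__":
    # Invented sample data (Mbps per poll); the port names are hypothetical.
    samples = {
        "Gi0/1": [10, 11, 50, 52],
        "Gi0/2": [5, 5, 30, 31],
        "Gi0/13": [100, 101, 102, 103],
    }
    print(simultaneous_jumps(samples))
```

A flagged interval is only a hint that two ports may be feeding each other; with counters that update every 15 or 20 seconds, short events can easily fall between samples.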


Thrill5 17 years ago

An upstream switch isn't causing this switch to drop packets, because input flow-control on this switch is turned off. What is the rest of your config? Specifically, I would like to see what the interface configuration looks like. What is interesting is that you are getting output drops, but all of the other output error counters are zero. Do you have a QoS policy set on the interface? If you are rate-limiting the interface then this would be the source of your output drops with zero error counters.


fugettaboutit 17 years ago

Another possibility is head-of-line blocking. Since he's talking about mixed media rates (GE and FE), HOLB *could* come into play. He could try playing with the tx-ring buffers on the interfaces in question and see if there's any change.

FWIW, I am assuming that there are no other traffic policies in effect (QoS, rate-limiting/policing, etc.).


Aaron Leonard 17 years ago

~ GigabitEthernet0/13 is up, line protocol is up (connected)
~ Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseLX SFP
~ Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops:151923763
~ Queueing strategy: fifo
~ Output queue: 0/40 (size/max)
~ 5 minute input rate 16614000 bits/sec, 13486 packets/sec
~ 5 minute output rate 605606000 bits/sec, 118387 packets/sec
~ 81890216650 packets input, 16901966809462 bytes, 0 no buffer
~ 508715659366 packets output, 360874240939303 bytes, 0 underruns

a) 151923763 / 508715659366 = 0.03%

b) Consider the possibility that this switch simply has more load to offer than the link can dequeue. The fact that your 5-minute output rate is > 600Mbps would tend to confirm such a hypothesis.
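The arithmetic behind (a) and (b) can be reproduced from the counters quoted above. This is a small sketch using only figures from the interface output; the variable names are illustrative.

```python
# Figures taken from the "show interface" output for Gi0/13.
output_drops = 151_923_763
packets_output = 508_715_659_366
output_rate_bps = 605_606_000   # 5 minute output rate
link_bps = 1_000_000_000        # 1000Mb/s link
txload = 154                    # reported out of 255

drop_ratio = output_drops / packets_output
utilization = output_rate_bps / link_bps
txload_fraction = txload / 255

print(f"drops per packet sent: {drop_ratio:.4%}")
print(f"5-minute average utilization: {utilization:.1%}")
print(f"txload as a fraction of 255: {txload_fraction:.1%}")
```

The drop ratio works out to about 0.03% and the average load to about 60% of the link. An average that high over 5 minutes leaves room for shorter bursts that offer more than the link can carry, which a 40-packet output queue would not absorb.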

Aaron


die.spam 17 years ago

I would tend to agree. After applying policies on every interface I had several complaints of slow speeds, so I removed all the policies again; that has stopped the slow speed issues, and the packet loss is a bit less now. I am arranging a second link to be put in today. One question, if anyone knows from previous experience: if I combine two links into a port channel, will there be any service disruption? (Note that the customers will notice any outage over 1 second long.)

Flamer.


Thrill5 17 years ago

Yes, if you put a port into a port-channel there will be a brief outage.


brink 17 years ago

Increasing the output queue size could also help; it is still at the default of 40.


bod43 17 years ago

It is very easy to get etherchannel wrong and to have the thing flap about. It might be best to be prepared for an outage at least.

Since it appears that you do not already have a port channel, you will be creating new ports. If you have STP running on them then they will all come up blocking. This will result in a 50 second delay before traffic starts.

If you do not have STP and both ports come up but the channel is unhappy you will have two parallel links and a loop:(

Obviously you can use RSTP - of which I have no experience.

One key point is that all of the physical ports on a single end *must* have identical configuration with regard to Trunking (allowed VLANs etc) and speed or the bundle will not form, or will come down.

If it is mysteriously not working, check for errdisabled ports.

Being forced into doing this in service would **really** get my attention.
