flow/packet loss through L3 C3560, pings OK


I have a Catalyst 3560G that is doing L3 routing.  I tried to use it as
the default gateway for a web cluster, which was doing about 120 Mbps of
traffic, 5 kpps each in and out.  However, users noticed slow page loads,
broken inline images, etc.

I was able to ping all the servers from outside the 3560G with zero
packet loss in tens of thousands of 1500-byte pings.    I moved the web
cluster to a C6509 (same interface config) and the issue disappeared.

Web client experience was noticeably impacted, so if it were simple
packet loss, I think I would have seen it with ping.   It seemed as
though the issue was related either to the type of traffic (plain http)
or flow (lots of flows).

The 3560 has a pretty vanilla config; the web cluster traffic was being
routed between a "no switchport" interface and a Vlan interface.  I did
notice that the "no switchport" interface had "ip route-cache
same-interface" configured, and I'm not sure why.  Also, the 3560 is
carrying about 7k external routes, but I monitor it to make sure it
doesn't hit the limit.  I didn't see any clues in syslog.
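
For reference, this is roughly how I watch the route count (standard IOS
show commands; exact output format varies by release):

```text
! Count of routes currently in the RIB, broken down by protocol
show ip route summary
```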

Phil

Re: flow/packet loss through L3 C3560, pings OK

we had some issues with the 10/100 versions where traffic bursts
overwhelmed the buffers, especially with QoS turned on - enabling QoS
effectively cuts the buffer pool available to any one queue by 75%.

If you have several GigE-connected servers contending for a congested or
rate-limited port, this could be an issue.

there are some commands to look at the buffers - something like
show platform port-asic statistics... - you want the drop stats for any
overloaded outbound ports.
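
Something along these lines (syntax from memory, varies by platform and
IOS release; the interface name is just an example):

```text
! Per-ASIC drop counters for a suspect egress port
show platform port-asic stats drop gigabitethernet0/1

! Output drops also show up in the regular interface counters
show interfaces gigabitethernet0/1 counters errors
```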

you need "sdm prefer routing" in the config to handle that many IP routes -
if not, they overflow the hardware forwarding table and get dealt with in
software.
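
Note the template change only takes effect after a reload - something like
(as far as I recall on the 3560):

```text
configure terminal
 sdm prefer routing
 end
! the new SDM template is not active until the switch reloads
reload
! after the reload, confirm which template is actually in use
show sdm prefer
```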
--
Regards

stephen_hope@xyzworld.com - replace xyz with ntl



Re: flow/packet loss through L3 C3560, pings OK
stephen wrote:

Thanks for reminding me.  I did set that last May (it's logged), and
then power-cycled the switch, but I do not appear to have verified "show
sdm" after the power cycle.  Now I see that the switch is using
default/desktop, which could be the source of my trouble.  Weird.
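
For the record, the checks I should have done after the power cycle (to the
best of my knowledge these are the full commands on the 3560):

```text
! Confirm the active SDM template is "routing", not default/desktop
show sdm prefer

! Confirm the routes actually fit in the hardware TCAM
show platform tcam utilization
```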

Re: flow/packet loss through L3 C3560, pings OK

yes - hardware forwarding tables will fill with 1 to 2k routes.

everything that arrives after the tables fill gets forwarded in software -
so whether it is an irritation or a disaster depends on the order the routes
arrive in.

Not a fun thing to troubleshoot, but it does log an "out of space"
message - shame Cisco couldn't make it obvious what the error is about....
--
Regards

stephen_hope@xyzworld.com - replace xyz with ntl


