FWSM and dual chassis failover problem

Hi all

I have a dual 6509 core with FWSM modules configured for failover. I am troubleshooting a problem that occurs on the primary/active switch. Occasionally, approximately once a month, the switch "hangs" and as a result there is no connectivity between VLANs. No failover occurs. To solve the problem I have to cold restart one of the switches, either primary or secondary.

What could be the issue?

thx

Reply to
timizart

Could be Cisco bug ID CSCsc90277:

On a Supervisor 720, if you have unicast Reverse Path Forwarding (uRPF) configured on multiple VLANs and one of those VLANs is shut down, Layer 2 traffic to the Supervisor 720 and within a VLAN is dropped for the VLAN that is still up.

Layer 3 traffic between the VLANs that are still up also experiences connectivity problems.

interface vlan 1
 ip address 10.10.20.1 255.255.255.0
 ip verify unicast source reachable-via rx
 no ip redirects
 no ip unreachables
 ip pim sparse-mode
 ip route-cache same-interface
 ip route-cache flow
 ip cgmp

interface vlan2
 ip address 10.10.10.1 255.255.255.0
 ip verify unicast source reachable-via rx
 no ip redirects
 no ip unreachables
 ip pim sparse-mode
 ip route-cache flow
 mls rp vtp-domain U2k
 mls rp ip

If VLAN 1 is shut down, Layer 2 traffic in VLAN 2 fails an RPF check.

This is an example:

Host A ---- VLAN 2 ----- cat6500------ VLAN 1 --- Host B

If VLAN 1 is shut down, the traffic that comes from Host A in VLAN 2 to the IP address of VLAN 2 on a Catalyst 6500 fails an RPF check.

The same is true for any traffic that originates in VLAN 2 and goes to any other VLAN.

If mls rate-limiter for IP errors is configured, traffic is intermittently dropped as well based on the rate configured in the rate limiter.

--------------------------------

This bug is fixed in Cisco IOS Software Releases 12.2(18)SXF2 and 12.2(18)SXE5 and later.

--------------------------------

Workaround:

  1. Disable mls rate-limiter. The RPF check still fails but traffic does go through.

  2. Disable uRPF.

  3. Shut / no shut the VLAN interface.

  4. Issue the clear ip route command.

--------------------------------

Sincerely,

Brad Reese BradReese.Com - Cisco Jobs

Hendersonville Road, Suite 17 Asheville, North Carolina USA 28803 USA & Canada: 877-549-2680 International: 828-277-7272 Fax: 775-254-3558 AIM: R2MGrant BradReese.Com - Cisco Salary and Compensation Rates

Reply to
www.BradReese.Com

I am using Catalyst 6000 supervisor 2.

Is the problem not related to the failover errors displayed below?

Stateful Failover Logical Update Statistics
Link : failoverstateif Vlan 101

Stateful Obj    xmit      xerr  rcv     rerr
General         5284650   20    446597  0
sys cmd         320105    0     319997  24
up time         0         0     0       0
RPC services    8         0     0       0
xlate           0         0     0       0
TCP conn        18718301  0     59123   0
UDP conn        0         0     0       0
ARP tbl         4964509   0     126600  0
RIP Tbl         0         0     0       0
L2BRIDGE Tbl    0         0     0       0
Xlate_Timeout   35        0     0       0
TCP NPs         11795359  0     170947  138173
UDP NPs         15068266  0     103272  138173

Logical Update Queue Information
          Cur  Max  Total
Recv Q:   0    15   446602
Xmit Q:   0    57   5284677
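For anyone comparing these counters over time, here is a rough Python sketch that picks out the rows with nonzero xerr or rerr values. It assumes each data row is an object name followed by four integers (xmit, xerr, rcv, rerr), as in the output above; the function name rows_with_errors and the SAMPLE text are made up for illustration. It only lists the counters and says nothing about whether they explain the hang.

```python
import re

# A few rows copied from the statistics above; paste the full output to check it all.
SAMPLE = """\
Stateful Obj xmit xerr rcv rerr
General 5284650 20 446597 0
sys cmd 320105 0 319997 24
TCP conn 18718301 0 59123 0
TCP NPs 11795359 0 170947 138173
UDP NPs 15068266 0 103272 138173
"""

# Assumed row layout: object name (may contain spaces), then xmit, xerr, rcv, rerr.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<xmit>\d+)\s+(?P<xerr>\d+)\s+(?P<rcv>\d+)\s+(?P<rerr>\d+)\s*$"
)

def rows_with_errors(text):
    """Return (name, xerr, rerr) for every row whose error counters are nonzero."""
    flagged = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header, queue, and blank lines do not fit the row layout
        xerr, rerr = int(match["xerr"]), int(match["rerr"])
        if xerr or rerr:
            flagged.append((match["name"], xerr, rerr))
    return flagged

if __name__ == "__main__":
    for name, xerr, rerr in rows_with_errors(SAMPLE):
        print(f"{name}: xerr={xerr} rerr={rerr}")
```

Running it on two snapshots taken some time apart would show whether the error counters are still climbing or are left over from an earlier event.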


Reply to
timizart

Your Catalyst 6000 switch with Supervisor 2 running CatOS software hangs after running for several months.

This issue has occurred only on the Supervisor 2 running CatOS version 7.6(1) and is documented in Cisco bug ID CSCed38989.

It is resolved in Catalyst OS (CatOS) release 7.6(5) and later, so upgrading the software to that release or above should fix it.

---------------------------------

Cisco bug ID CSCed38989:

Problem/Symptom:

A certain type of symptom has been reported by some customers, only with Sup 2, running 7.6(1) code.

The symptom includes *all* of the following:

A. Unable to reach the switch via telnet/ping.

B. Unable to access the switch via snmp or other management applications.

C. Able to reach the MSFC via telnet/ping.

D. When connected via console, nothing is output, or some garbled characters are output (a repeating 'R' or '.' or some other character).

E. System status LED is normal (green) and Backplane utilisation LED at 0%. In some cases, Traffic meter LED may be at 100%.

F. System has been up for more than 7 months, approximately.

Platform/releases affected:
===========================
Only Sup2

Workaround/Fix:
===============

This caveat has been fixed in 7.6(5).

Single Supervisor:
==================

Currently the only known workaround is to schedule a maintenance window to reset the supervisor when the uptime is closer to 150 days.

The uptime can be viewed via show system or show version.
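As a rough illustration of tracking that window, the Python sketch below checks an uptime value against the 150-day figure mentioned above. It assumes the uptime has been copied by hand into a 'd,h:m:s' string (for example "140,06:30:00"); that format, the 14-day margin, and the function names parse_uptime_days and reset_due are assumptions for the example, not anything taken from the bug notes.

```python
def parse_uptime_days(uptime):
    """Convert an uptime string of the assumed form 'd,h:m:s' into fractional days."""
    days_part, clock = uptime.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days_part) + (hours * 3600 + minutes * 60 + seconds) / 86400

def reset_due(uptime, threshold_days=150, margin_days=14):
    """True once uptime is within margin_days of threshold_days (both are assumptions)."""
    return parse_uptime_days(uptime) >= threshold_days - margin_days

if __name__ == "__main__":
    for sample in ("100,05:00:00", "140,06:30:00"):
        status = "schedule a reset" if reset_due(sample) else "ok for now"
        print(f"{sample}: {status}")
```

The margin is there only so the reminder fires while there is still time to book a maintenance window; pick whatever lead time suits your change process.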

Dual Supervisor:
================

In Dual Sup scenarios, the other sup will take over in 10 minutes.

However, if required, a maintenance window can be scheduled to do the following:

  1. If HA is not enabled, enable HA using set system highavailability enable.

  2. Wait for HA to sync. show system highavailability should say 'ON' for Highavailability Operational-status.

  3. Switch to the other supervisor using switch supervisor.

---------------------------------

Hope this helps.

Brad Reese BradReese.Com - Cisco Repair

Hendersonville Road, Suite 17 Asheville, North Carolina USA 28803 USA & Canada: 877-549-2680 International: 828-277-7272 Fax: 775-254-3558 AIM: R2MGrant BradReese.Com - Cisco Power Supply Headquarters

Reply to
www.BradReese.Com
