mixing different Sup engines in a 6500

I have a 6509 with two Sup2A engines. The Sup2A in slot 1 bit the dust last night and is throwing all kinds of trace errors.

Since I've never had a Sup fail on me, I am unsure of the process. Can I simply remove the bad Sup, insert a replacement Sup2A, insert the old Sup's flash card or a new one, then poof it works?! Is there any specific command to resync the config, or reset the slot module as active?

The follow-up question: if I want to upgrade to a Sup720, is it possible to upgrade one slot at a time? That is, can I replace the bad Sup2A with a Sup720 and leave slot 2 with a Sup2A, or will that not work? Just a thought....

-John

essenz

You need to back up the config off of the system.
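A minimal sketch of that backup, assuming the switch runs IOS and a TFTP server is reachable from it (the server is an assumption, not something mentioned in this thread). On CatOS the commands are different.

```
! Run on the active supervisor (IOS). The TFTP server is an assumption;
! IOS prompts for its address and the destination filename.
copy running-config tftp:
! Also save the running config to NVRAM so the startup-config matches.
copy running-config startup-config
```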

As I recall -

Supervisor hardware must be identical.

You need to make sure that the software releases are "compatible". For most people this would be *Identical*.

If all you got was trace errors it is possible that you have suffered only a software crash.

If you have a replacement SE with identical software then you just put it in and the configs synchronise.

Clearly, if you don't have a spare chassis to play with then you are going to have some problems sorting this out without a system outage.

There is a command to switch supervisors.
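On IOS, a hedged sketch of checking the redundancy state and then forcing a switchover looks something like the following. The exact behaviour depends on the software release and redundancy mode, so check the redundancy documentation for your version before using it on a live system.

```
! Check which supervisor is active and whether the standby is ready (IOS).
show redundancy states
show module
! Force a switchover to the standby supervisor. This resets the current
! active supervisor, so only do it once the standby reports itself ready.
redundancy force-switchover
```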

For more check the "redundancy" documents.

If you have more questions then please let us know whether you have CatOS or IOS and which software version you are running.

You might have to convert a new SE from IOS to CatOS or vice versa.

I forget if the 2A has an integral MSFC or not.

bod43

Here are some more details. I will scrap the Sup720 upgrade and just deal with my Sup2 issue for now.

I'm running IOS 12.2(18)SXD7b, bootstrap ROM is 12.1(4r)E. My Sup2 has been upgraded with MSFC2 and PFC2, max memory, 1Gb

The Sup2 in slot 1 has been running non-stop for 11 months. Last night it failed, and the Sup2 in slot 2 took over. I have a console system with a buffer, so when I try to console to the bad Sup2 I see the following repeating:

Unexpected exception, CPU signal 10, PC = 0x0

-Traceback= 0 401A4DCC 401A4328 401995EC 401987C8 401995D0 401987C8 40196B88
$0 : 00000000, AT : 43380000, v0 : 43E6EA08, v1 : 00000000
a0 : 0000000A, a1 : 00000010, a2 : 43E958A4, a3 : 40DB0000
t0 : 00000038, t1 : 34018001, t2 : 34018000, t3 : FFFF00FF
t4 : 40198408, t5 : 00002008, t6 : 00000000, t7 : 69705F66
s0 : 0000000A, s1 : 43E90000, s2 : 0000000A, s3 : 43E90000
s4 : 00000010, s5 : 0000000A, s6 : 40E90000, s7 : 43E958A4
t8 : 50008650, t9 : 00000000, k0 : 43E93940, k1 : 00000038
gp : 4338CB04, sp : 40DC28C8, s8 : 03FFFFFF, ra : 401A4DCC
EPC : 00000000, ErrorEPC : BFC24CB4, SREG : 34018003
MDLO : 00000000, MDHI : 00000002, BadVaddr : 00000000
Cause 00002008 (Code 0x2): TLB (load or instruction fetch) exception

Not sure if this is a hardware problem or just a software crash.

I guess my main question is: if I get an exact replacement Sup2, what is the best procedure for replacement with minimal downtime? Both Sup2s have PCMCIA flash cards with my running config. So I assume I can take the flash card out of the bad Sup, insert it into the replacement, remove the bad Sup2 from the chassis, insert the replacement, then issue the command "hw-module module 1 reset". Is that correct?

Also, my redundancy config is:

redundancy
 mode sso
 main-cpu
  auto-sync running-config

So if I were to run "hw-module module 1 reset" right now, would that reset slot 1 Sup2 and make it active or just reset it so I can do some debugging on it?

Thanks

essenz

It might work I suppose.

I don't work with these every day so I can't really give firm procedures.

It is a bit more complex than a small router.

Your suggested process will probably not work. There is a "boot string" stored on the SE which may not allow the device to boot your image. As far as I recall, the image may be in the "bootflash" or on slot0/1. It will also, I think, just boot the first image it finds in some circumstances, e.g. no bootvar path, or maybe if the bootvar path is invalid.

You also need to consider the configuration register.

I tend to take a cautious approach to networking and I would not do this without booting the new SE in a single SE chassis to see what it was doing.

You could possibly end up, for example, with the config from the new one as the master config and that getting overwritten onto your existing SE. This might not be what you want, and is the reason that you should be sure you have a backup of the config off of the device.

sh bootvar (if I recall correctly) displays the boot string.
sh ver displays the config register.

What I think you will have to do:

- Boot the new SE as a single SE in a chassis.
- Set the boot string and config register.
- Put in your flash card with the image.
- Reboot it to test.

Done.
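On IOS those steps might look roughly like the sketch below, run on the replacement SE while it is alone in a spare chassis. The `<image-file>` name is a placeholder for whatever image is on your flash card, and 0x2102 is only the commonly used config register value, assumed here rather than taken from this thread. Compare both against what the working SE reports before changing anything.

```
! Inspect the current boot string, config register and flash contents.
show bootvar
show version
dir slot0:
! Point the boot string at the image on the PCMCIA card.
! <image-file> is a placeholder; 0x2102 is an assumed, typical value.
configure terminal
 no boot system
 boot system flash slot0:<image-file>
 config-register 0x2102
 end
copy running-config startup-config
! Confirm the new settings, then reboot to test.
show bootvar
reload
```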

bod43

I think that is what I will do. I can get a spare chassis on eBay for around $500, and use that to test the new engine.

The Cisco docs are misleading. I have IOS 12.2, but the docs for CatOS say that when you replace the standby Sup2, the active Sup will automatically sync the bootrom, flash, etc., as long as the replacement is identical. Not sure how accurate that is.

As for that trace error, has anyone seen it before? Does it look like a fatal hardware problem, or just a random software crash?

essenz

I have not seen the error before, but it may be worth looking it up if you have Cisco CCO access.

More to the point, there is no such thing as "just a software crash" on an IOS / CatOS system. Think of the disruption if your standby Sup had not taken up the slack, and think about the 3 seconds to 5 minutes when the Sup lights were out, i.e. the time the switchover might have taken, depending on the resilience mode you are running.

Stephen
