How to prevent network loop?

Question

We had a network problem that was caused by a debug command entered on one switch. After the debug command was issued, the CPU load of the switch went to 100% and the whole network went down.

The network is a star-type network with 20 Cisco 3750 switches connected to one Cisco 7609 router. The connections use gigabit fiber ports, and the ports are configured as L2 trunk ports.

MSTP is running in the network with a few instances, but it did not protect the network from the loop. My question: is there a way to protect the network from this kind of problem?

-Sami

Bjarke Andersen · Accepted Answer

Spanning-tree, plus BPDU-guard on end-tail (edge) links. That means no BPDU-guard on links that serve as backup lines, e.g. 3 core switches linked together in a circle.
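
A minimal sketch of what this could look like on a Catalyst 3750, assuming GigabitEthernet1/0/10 is an access port facing an end host (the interface name is illustrative, not from the thread). The trunk uplinks to the 7609 are left without BPDU guard, since they have to exchange BPDUs for spanning tree to work:

```
! Edge (host-facing) port; the interface name is an assumed example
interface GigabitEthernet1/0/10
 switchport mode access
 spanning-tree portfast
 spanning-tree bpduguard enable
```

With this in place, an edge port that receives a BPDU is put into the err-disabled state instead of continuing to forward.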

aquarius · Answer

I didn't quite understand your answer. There are 2x24 port gigabit interface cards in the 7609, and every 3750 has 2 gigabit trunk interfaces connected to the 7609, one of which is blocked by MSTP. Do you mean that Spanning-tree and BPDU-guard should not be configured on these blocked interfaces?

-Sami

Bod43 · Answer

The smart-ass answer is that Cisco do not recommend the use of the debug command in production networks.

Really! How could it be otherwise?

Some debugs are suitable for use in production but it is pretty much left up to the Network Engineer to evaluate each use of debug themselves.

Did you follow the debug guidelines?

no logg console   ! or logg level < debugging at least

If I thought there was the slightest risk, add:

no logg monitor   ! or logg level < debugging at least
no logg trap      ! or logg level < debugging at least

!! also don't log to snmp, whatever command that is.

So we log only to the buffer, the cheapest option:

logg buffer debb
logg buffer 50000   ! if we have enough memory

Then consider just how your debug may affect the router.

Use available tools to limit the number of log messages.
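
One way this might be done, as a sketch: rate-limit log output, restrict the debug to a single interface with conditional debugging, and switch debugging off as soon as the capture is done. The interface name and the rate of 10 messages per second are assumed values, and availability of these commands depends on the platform and IOS release:

```
! Global configuration: cap logged messages per second (10 is an assumed value),
! but let messages of severity "errors" and more severe through
logging rate-limit 10 except errors

! Privileged EXEC: only generate debug output related to one interface
debug condition interface GigabitEthernet1/0/1

! Privileged EXEC: turn all debugging off when finished
undebug all
```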

At the end of the day, if a router or switch has not got enough CPU to process spanning tree messages, then the network may well break. Debug is designed to be as good for troubleshooting as possible, which means that it HAS to be more important for the CPU to do debug than to do spanning tree. We are back full circle: debug is not supported in production.

You could manually shut down the interfaces that could cause loops.

stephen · Answer

MST will fall back to spanning tree under some conditions - usually when the 2 switch ports decide they are not both MST compatible.

"raw" spanning tree doesn't block ports with some 1 way link problems - and a 1 way loop is just as good a broadcast overload generator as the ordinary kind.

Also, if you manage to build a working loop and the load makes the CPU hit 100%, it may end up stuck in that state - although Cisco IOS/CatOS seems better than some other switch code for this.

My preference is to use L3 design to get rid of spanning tree.

If you have to put up with it for some reason, unidirectional link detection (UDLD) can get rid of the most common 1 way faults. Note - Cisco special, so not feasible if you have other manufacturers' kit.
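
As a sketch, UDLD can be enabled globally for fiber ports or per interface on a Catalyst switch. The interface name below is an assumption for illustration, and UDLD has to be enabled on both ends of the link to be useful:

```
! Global: enable UDLD in normal mode on all fiber-optic interfaces
udld enable

! Or per port, in aggressive mode, on an uplink (assumed interface name)
interface GigabitEthernet1/0/25
 udld port aggressive
```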

But it is worth remembering that spanning tree mostly works well. It actively suppresses loops, but only when all ports around the loop work in both directions, so if something goes wrong the failure is likely to cause an active loop.

aquarius · Answer

Yes, that would solve the problem, but it is not possible at the moment.

Yes, we have UDLD in use for "broken wire" cases.

And this time the cause of the loop was CPU overload. I will try loop-guard as soon as I know how and where I should configure it.

Thanks for your answer !

-Sami

Thrill5 · Answer

The cause of the CPU problem was your debug command. Please see the previous post. If you spike the CPU because a debug command floods the console with debug messages, or because the debug command you entered requires significant CPU to process, you will still bring that device down, layer 2 or layer 3 switching/routing notwithstanding.

Scott

aquarius · Answer

Yes, that's clear, but the question was whether there is a way to protect the rest of the network in case of one switch's CPU overload. I have now configured spanning-tree loopguard on all root and alternate ports, and debug commands will not be used any more...

-Sami
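
For reference, a sketch of how loop guard is typically enabled on a Catalyst switch, either globally or per port. The interface name is an assumption; the thread does not show the configuration that was actually applied:

```
! Global: enable loop guard by default on the switch
spanning-tree loopguard default

! Or per port, on an uplink toward the 7609 (assumed interface name)
interface GigabitEthernet1/0/25
 spanning-tree guard loop
```

With loop guard, a root or alternate port that stops receiving BPDUs is moved to a loop-inconsistent (blocking) state rather than transitioning to forwarding.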

stephen · Answer

There is a clear answer - but you won't like it, because it sounds like you cannot use it. The flip side is that it may be a good reason to prise some money loose to fix it......

Design out L2 loops.

Don't use spanning tree to suppress loops in normal operation - use spanning tree as a safety feature in case someone adds a loop by accident.

Because of the protocol design, every device / link within any loop has to run the protocol correctly within the timeout window to keep the loop from acting as a broadcast replicator.
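
A minimal sketch of a routed uplink on a 3750, assuming the software image supports IP routing and using made-up addressing and interface names. With both uplinks routed there is no L2 loop between the 3750 and the 7609 for spanning tree to suppress; redundancy would come from the routing protocol instead:

```
! Enable routing on the 3750
ip routing

! Uplink to the 7609 as a routed point-to-point link
! (interface name and addresses are assumed examples)
interface GigabitEthernet1/0/25
 no switchport
 ip address 10.0.1.2 255.255.255.252
```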

aquarius · Answer

You guys have confirmed for me that the best solution is to redesign the network from L2 to L3.

-Sami
