OSPF hello timer

Does anyone have experience of winding OSPF timers down from the usual 10-second hello interval? We have a stub area that we'd like to make respond more quickly to outages and this seems like the obvious way to do it. The equipment will be Cisco 6500s and Sun load balancing switches (which we've yet to check the config details of) and the links will all be GigE. We're not proposing to go below 1 second for hellos and we'd keep the usual 4x multiplier for dead-interval.
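For a rough feel of what the timer change buys, here is a small Python sketch of the failure-detection window for a given hello interval and dead-interval multiplier. It assumes the dead timer is simply restarted on every hello received, so a silent neighbour is declared down somewhere between (dead - hello) and dead seconds after it actually fails. The function name `detection_window` is made up for illustration.

```python
def detection_window(hello_s, multiplier=4):
    """Return (best_case_s, worst_case_s) for noticing a silent neighbour.

    Assumes dead-interval = multiplier * hello-interval and that the dead
    timer is restarted each time a hello arrives from the neighbour.
    """
    dead_s = hello_s * multiplier
    # Best case: the neighbour fails just before its next hello was due.
    # Worst case: the neighbour fails just after sending a hello.
    return (dead_s - hello_s, dead_s)


for hello in (10, 2, 1):
    best, worst = detection_window(hello)
    print(f"hello {hello:>2}s -> neighbour declared dead after {best}-{worst}s")
```

With the usual 4x multiplier, going from 10-second to 1-second hellos shrinks the window from 30-40 seconds to 3-4 seconds.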

All comments welcome.

Sam

Reply to
Sam Wilson

Are all of the OSPF speakers Cisco ?

What version(s) of IOS is running on the 6500 ?

Have you looked at using BFD ?

Reply to
Merv

"... and Sun load balancing switches..." I think it's the N2000 series.

12.2(18)SXF7. No we're not looking to upgrade unless we really really really have to - the last year or two have been painful and we still have stuff that is flaky (distributed etherchannel on 10GE in 12.2(18)SXF4, anyone? MAC address learning in 12.2(17d)SXB8?).

Nope. I don't think the Suns support it and given how new it seems to be I'd be more than a little chary about it anyway. I'm clearly getting conservative in my old age... :-)

Sam

Reply to
Sam Wilson

Given OSPF speakers from different vendors, the OSPF hello/dead timers are the knobs to use.

Are the GE links point-to-point or GE broadcast (i.e. more than two OSPF speakers)?

Reply to
Merv

That's what I thought. Have you (or anyone else who's listening in) any experience of sub-10-second hello timers? Any gotchas with using 1 second? 2 seconds?

Broadcast. It's a VLAN with redundant links spread across two sites. At present there'll be two Ciscos and two Suns involved.

Sam

Reply to
Sam Wilson

I have used 1-second hellos on GE interfaces, but those were point-to-point links, so a failure was usually detected as a physical link failure.

You may also want/need to look at the OSPF exponential SPF backoff feature:

router ospf 1
 timers throttle spf 1 5000 10000
end

sh ip ospf
 Routing Process "ospf 1" with ID 3.3.3.3
 Supports only single TOS(TOS0) routes
 Supports opaque LSA
 Supports Link-local Signaling (LLS)
 Supports area transit capability
 Initial SPF schedule delay 1 msecs
 Minimum hold time between two consecutive SPFs 5000 msecs
 Maximum wait time between two consecutive SPFs 10000 msecs
 Incremental-SPF disabled
 Minimum LSA interval 5 secs
 Minimum LSA arrival 1000 msecs
 LSA group pacing timer 240 secs
 Interface flood pacing timer 33 msecs
 Retransmission pacing timer 66 msecs
 Number of external LSA 0. Checksum Sum 0x0
 Number of opaque AS LSA 0. Checksum Sum 0x0
 Number of DCbitless external and opaque AS LSA 0
 Number of DoNotAge external and opaque AS LSA 0
 Number of areas in this router is 0. 0 normal 0 stub 0 nssa
 Number of areas transit capable is 0
 External flood list length 0
 IETF NSF helper support enabled
 Cisco NSF helper support enabled
 Reference bandwidth unit is 100 mbps

Reply to
Merv

Thanks. I'm trying to work out whether there are any implications for the LSA retransmit-delay - is there any point in dropping that as well?

OK, that seems to have made it into 12.2()SX (it's not in mainline 12.2). The manuals explain how to do this and what it does but not why I'd want to do it. Presumably you're reducing the intervals so that topology changes get propagated more quickly since we're detecting them more quickly, right? Even so, changing from 5ms default wait to 1ms seems a little extreme when I'm looking at changing the dead-interval to 4 seconds. Perhaps I've misunderstood.

Sam

Reply to
Sam Wilson

This needs you to change the interface default timers - on a Cat6k these give you a 5-second delay before the switch reacts to a "down" interface.

Agreed - 1 ms is pushing it, and the other timers involved mean tens of ms to react anyway.

It depends where you expect "faults" to come from. If layer 1 link-fault detection is going to kick off a Dijkstra (SPF) run, then interface timing is the key.

If the "dead" timers won't normally be what detects a problem, then you can relax them and use them as the fallback for all the unusual, weird failure modes.

There are a bunch of recommendations about this kind of tuning on CCO that may be useful.

One to watch is that the processing load of OSPF goes up as the hello timer is reduced - and scales up as you add OSPF interfaces.

So 100s of ports and 1 sec hello is a recipe for disaster....
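To put rough numbers on that scaling, here is a back-of-envelope Python sketch that counts hello packets only. It assumes each OSPF interface sends one hello per interval and receives one per neighbour per interval; it says nothing about actual CPU cost, which depends on the platform. The name `hellos_per_second` is made up for illustration.

```python
def hellos_per_second(interfaces, neighbours_per_interface, hello_s):
    """Hello packets sent plus received per second by one router.

    Simplified model: one hello sent per interface per interval, and one
    received from each neighbour on that interface per interval.
    """
    per_interval = interfaces * (1 + neighbours_per_interface)
    return per_interval / hello_s


# One broadcast VLAN with three other speakers (the setup in this thread).
print(hellos_per_second(1, 3, 10))   # default 10-second hellos
print(hellos_per_second(1, 3, 1))    # 1-second hellos

# A few hundred point-to-point OSPF interfaces, one neighbour each.
print(hellos_per_second(200, 1, 10))
print(hellos_per_second(200, 1, 1))
```

In this model the packet rate grows linearly with the interface count and inversely with the hello interval, so a single VLAN at 1-second hellos is trivial while hundreds of interfaces at 1 second is ten times the default load.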

You can go to sub-second hellos (I think you specify how many per second, and you get a 1-second dead timer by default).

Finally, Cisco supports a newer point-to-point "dead detect" protocol, BFD (Bidirectional Forwarding Detection), which may mean you don't need to be so aggressive with OSPF timers.

Reply to
stephen

The SPF exponential backoff setting of 1 ms says that for the first "failure", SPF is to be executed immediately following the loss of the OSPF neighbour adjacency. This is a GOOD THING. For the second failure, wait longer (5 seconds before executing SPF), and for the third and subsequent failures wait even longer (i.e. there is a flapping neighbour). This is a very good feature, so you may want to make use of it in other places in your network.
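As a rough illustration of that backoff, here is a simplified Python model of the waits produced by "timers throttle spf 1 5000 10000". It assumes topology changes keep arriving back to back: the first SPF runs after the start delay, the next after the hold time, and the wait then doubles until it hits the maximum. It does not model the timers resetting once the network has been quiet for a while, and `spf_waits` is a made-up name.

```python
def spf_waits(start_ms, hold_ms, max_ms, events):
    """Wait before each of `events` consecutive SPF runs, in milliseconds.

    Simplified model of exponential SPF backoff under continuous flapping:
    start delay first, then the hold time, doubling up to the maximum.
    """
    waits = []
    hold = hold_ms
    for i in range(events):
        if i == 0:
            waits.append(start_ms)           # first change: react at once
        else:
            waits.append(min(hold, max_ms))  # later changes: back off
            hold *= 2
    return waits


print(spf_waits(1, 5000, 10000, 4))  # [1, 5000, 10000, 10000]
```

So a single failure is acted on within a millisecond, while a flapping neighbour quickly gets throttled to one SPF run every 10 seconds.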

I would lower the OSPF hello timers first to see if the resulting convergence improvement is satisfactory.

If not, then also make use of the SPF exponential backoff feature.

Reply to
Merv
