Redundant links between subnets

- Ghazan Haider

Tue, Jun 20, 2006 3:02 PM

We have two subnets connected by two wireless connections. The wireless connections are unreliable.

The wireless connections have IP nodes on each end, so we have the 10.0.0.250 and 10.0.0.254 nodes on our 10.0.0.0/24 subnet and the 10.0.1.250 and 10.0.1.254 nodes on the 10.0.1.0/24 subnet.

I'd like to use the links (a) redundantly and (b) in a load-balancing fashion. We have two Cisco 1841 routers in each subnet acting as a firewall and default route. I know EtherChannel/trunking will make two layer 2 links redundant, but I don't know what to use for these layer 3 links.

I've been reading up on VRRP, HSRP and GLBP, but they make routers redundant. That would work if the wireless nodes were Cisco; alas, they're not. I read about DLSw+, but that works at layer 2 as well.

Then I've been digging around OSPF and EIGRP, which might solve the problem. But OSPF declares a link down only after missing hello packets 3 times or so. That may mean 10 seconds of total downtime before IP is routed through the other link. Our application is tolerant of up to 2-3 seconds of delay (TCP/ODBC), but will break if packets are dropped with destination host unreachable. I don't know whether in this case OSPF will resend the packet through the other link so nothing is lost. If it does, and hello packets are frequent enough, my problem is solved.

But I'm wondering whether there is other shrink-wrapped Cisco technology that just works for redundant layer 3 routes without a full-blown routing protocol. Can any expert here give me a better clue before I give OSPF/EIGRP a shot?

Yes, I've scoured the Cisco site, newsgroups and routerie.com for the past 5 days.

- Merv

Tue, Jun 20, 2006 7:45 PM

HSRP/VRRP could be enabled on the 1841s if that feature is not already enabled. This would provide you with redundant gateway support on each subnet.

The next key issue is what is the best interworking to use between the 1841s and the wireless boxes. To answer that question, more details on those wireless boxes would be required: make, model, OS, routing protocols supported, etc.

- stephen

Tue, Jun 20, 2006 9:36 PM

Default timers for OSPF use a 10 sec hello "tick", so the dead timer would give you up to a 40 sec outage.

You can change the timers down to sub-1-sec hellos; this would set the dead timer to 1 sec.
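A rough sketch of the timer arithmetic described above, in Python. It assumes the usual relationship where the dead interval is four times the hello interval on default timers, and the fast-hello case where the dead interval is pinned at 1 second and a multiplier sets how many hellos are sent within it (on IOS images that support OSPF fast hellos this is the per-interface `ip ospf dead-interval minimal hello-multiplier <n>` setting; whether a given 1841 image supports it would need checking). The function names are made up for illustration.

```python
# Back-of-the-envelope OSPF timer arithmetic. The 4x ratio and the 1 s
# pinned dead interval are assumptions taken from the discussion above.

def dead_interval(hello_s: float, multiplier: int = 4) -> float:
    """Dead interval when it is a fixed multiple of the hello interval."""
    return hello_s * multiplier


def fast_hello_interval(multiplier: int, dead_s: float = 1.0) -> float:
    """Hello spacing when the dead interval is pinned at dead_s seconds
    and `multiplier` hellos are sent inside that window."""
    return dead_s / multiplier


if __name__ == "__main__":
    print("default dead interval (s):", dead_interval(10))
    for n in (3, 4, 10, 20):
        print(f"multiplier {n}: hello every {fast_hello_interval(n):.3f} s")
```

Note that raising the multiplier shortens the gap between hellos but does not by itself shorten the 1 second dead interval.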

There are a bunch of related timers on a Cisco designed to make the interface and OSPF itself less sensitive to changes. The side effect is that, with default settings, the protocol doesn't get events like interface state changes instantly, so you will need to alter those as well to get the protocol to react quickly to outages.

If the path is lossy you may need to sort that out: a path that is up but losing a large percentage of traffic can be a much bigger problem than an outage.

> Our application is tolerant of upto 2-3 ...

FWIW I would want to actually test this. The number of times a software peddler has told me that their wonderful program needs a perfect network, and it turned out to be "less than completely accurate", is fairly high. Mind you, the opposite happens as well...

> I dont know if in this case OSPF will ...

Nope. All OSPF is doing is sorting out which are good candidate paths for packets between the routers.

If packets get lost in the intervening path, then they are gone.

I like OSPF - although the protocol is complicated, using it is fairly simple.

If you want another mechanism to detect a failed path then maybe BFD (Bidirectional Forwarding Detection) will help. It can feed into various routing protocols, but you will still need a routing protocol.

However, I have only heard of this rather than used it in anger, and only for high-end boxes such as 7600s.

- Ghazan Haider

Wed, Jun 21, 2006 1:13 PM

As a matter of fact it's non-lossy and works great until it disconnects, and then it remains disconnected for at least 5 minutes. This is where I thought OSPF would help. It doesn't 'flap', and ping times are great and predictable when it's up. The wireless system is the lanspeed 9000 from the Wave Wireless company, based on 802.11b connections, but it's a black-box packaged system giving me a web-based configuration system and IP addresses at both ends. Not too many features. I'll have to rely on Cisco on both ends to detect and fix routes.

I will. But to start, any reduction in downtime, and my not being required to reboot the links and manually switch routes, will be good. RIP is 90 seconds. OSPF can bring it down to 1 second, which might be tolerable if packets are not lost and are repeated over the next link if the first link goes down.

Well, this would be something better than what we have now. Still hoping for some obscure Cisco technology that will fix us up.

I'll read on this.

I know I was talking about load balancing over the two links as well, but using the two links to mirror the traffic, so that no packet is lost, is an option too.

I think my biggest problem is that the Cisco router cannot see when the wireless link goes down. This is only indicated when a packet is lost and it gets a destination host unreachable. OSPF's sub-second hello packets sound like the best bet.
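To make the detection problem concrete, here is a minimal Python sketch of how long a hello-based neighbour stays "up" after the path silently fails. It is a simplified model, not OSPF itself: hellos are assumed to arrive exactly on schedule starting at t = 0, and the neighbour is declared down one dead interval after the last hello that got through. The function name and the failure time are illustrative.

```python
import math


def detection_delay(fail_at: float, hello_interval: float,
                    dead_interval: float) -> float:
    """Seconds between the path failing and the neighbour being declared down.

    Hellos are assumed to arrive at t = 0, hello_interval, 2*hello_interval...
    until the path fails; the dead timer restarts on every hello received.
    """
    last_hello = math.floor(fail_at / hello_interval) * hello_interval
    return last_hello + dead_interval - fail_at


if __name__ == "__main__":
    # Same failure instant, default timers versus sub-second hellos.
    print("10 s hello / 40 s dead :", detection_delay(12.5, 10.0, 40.0))
    print("0.25 s hello / 1 s dead:", detection_delay(12.5, 0.25, 1.0))
```

In this model the delay is never more than the dead interval and never less than the dead interval minus one hello interval, which is why the dead interval, rather than the hello rate, sets the floor on downtime.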

- anybody43

Wed, Jun 21, 2006 5:11 PM

I understand that IS-IS (ISIS) is the Routing Protocol of choice for fastest failure detection and re-convergence.

It uses the same core algorithms as OSPF and is, I think, available on much of the Cisco equipment that does OSPF.

The issue is that fewer people use it, so support will be more difficult. In a simple network, though, I would certainly consider it. I read recently that IS-IS, RIP and BGP (from memory??) are the only protocols that are as yet implemented by Cisco for IPv6?

In some circumstances I understand that EIGRP can be very quick too; in particular, if load balancing is configured, the "new" path is already in use.

In general, data networks are unreliable. The level of unreliability is the issue; however, /all/ applications must cope with this unreliability.

In many cases TCP is used to provide reliable stream delivery. There are parameters that can be used to make use of more recent TCP developments.

Turn on selective ACK and timestamps.

- Merv

Wed, Jun 21, 2006 7:45 PM

What routing protocols does the WaveWireless unit support?

The website shows a product called SPEEDLink 9200; is that what you have?

Most of their products seem to only support RIP version 1 and 2.

Not a lot of sense in talking about different interior gateway protocols (IGPs) if the unit does not support any of them ...

- Ghazan Haider

Wed, Jun 21, 2006 8:07 PM

Thanks for the answers.

Yeah, I have the speedlink 9000, not the newer 9200, which is almost the same thing. It supports only IPv4. Its RIPv2 is lousy and broke things more than it fixed them. There's not much in the line of 'configuration' I can do with the wireless equipment. I know it's based on 802.11b and when it's up, it's non-lossy. It stays up with 0 lost packets for days, then it goes down for anywhere from 5 minutes to indefinitely (until a reboot). I have to use its IPv4 nodes as gateways to the other side. So all I need is something to use redundant IPv4 paths at layer 3 and above.

I didn't know EIGRP and ISIS could be faster. I'll take a look. I'll try the fast hello option in OSPF, send hellos as often as I can (20 times a second?), and hopefully downtime is 1 second or less. Since the 1841 routers will themselves be the firewall and gateway to everything, route changes should be quick and easy, as opposed to flooding new routes to other routers etc. If the hello packets are small enough, the frequent hellos should not cause trouble.

Speaking of TCP, layer 4 is where things can go wrong. If the router gets 'destination host unreachable' from the wireless node and just loses the packet and uses the good route, TCP retries should fix things. If the router forwards the destination host unreachable packet to the Windows clients, that will bring down the app. In that case I'll have to do something funky to keep such packets from the wireless nodes out of the subnet. Getting downtime to 1 second while still breaking apps is the first step. Changing downtime to mere delay and letting the app continue would be big. Using redundant unreliable connections is not a rare thing; I'm surprised by the lack of info out there for such a setup. I know dynamic routing protocols were invented just for these problems, but if the downtime can be 1 sec, TCP could work around it.
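As a rough illustration of why a 1 second outage can look like mere delay to TCP while a 5 minute one cannot, the following Python sketch walks a retransmission timer with exponential backoff through an outage. The initial timeout of 1 second and the limit of 5 retries are assumptions picked for the example; real stacks derive the timeout from measured round-trip time and use their own retry limits.

```python
from typing import Optional, Tuple


def first_retry_after_outage(outage_s: float, initial_rto_s: float = 1.0,
                             max_retries: int = 5) -> Optional[Tuple[int, float]]:
    """Return (attempt, time) of the first retransmission sent once the
    outage is over, or None if the sender gives up first.

    The original segment is assumed to be sent at t = 0, just as the outage
    starts, and each retry doubles the timeout (exponential backoff).
    """
    t = 0.0
    rto = initial_rto_s
    for attempt in range(1, max_retries + 1):
        t += rto
        if t >= outage_s:
            return attempt, t
        rto *= 2
    return None


if __name__ == "__main__":
    for outage in (1.0, 2.5, 300.0):
        print(outage, "->", first_retry_after_outage(outage))
```

Under these assumed numbers a short outage is absorbed by the first or second retry, while a multi-minute outage exhausts the retries, which is the case rerouting has to cover.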

- Merv

Thu, Jun 22, 2006 11:37 AM

Suggest you explore using EIGRP or OSPF using GRE tunnels.

EIGRP and OSPF use multicast hellos, and a GRE tunnel can provide the mechanism to get those hellos between the Cisco routers on either side of the wireless "cloud".

- Ghazan Haider

Thu, Jun 22, 2006 8:46 PM

I thought (wrongly?) that specifying a neighbor, or using a point to point link makes the hellos go as unicast.

Back to the books.


- Merv

Thu, Jun 22, 2006 8:55 PM

At least for EIGRP, using the neighbor statement has a nasty side effect.

"What does the neighbor statement in the EIGRP configuration section do?

A. The neighbor command is used in EIGRP to define a neighboring router with which to exchange routing information. Due to the current behavior of this command, EIGRP exchanges routing information with the neighbors in the form of unicast packets whenever the neighbor command is configured for an interface. EIGRP will stop processing all multicast packets coming inbound on that interface. Also, EIGRP stops sending multicast packets on that interface.

The ideal behavior of this command would be for EIGRP to start sending EIGRP packets as unicast packets to the specified neighbor, but not stop sending and receiving multicast packets on that interface. Since the command does not behave as intended, the neighbor command should be used carefully, understanding the impact of the command on the network."

- clsawyer

Thu, Jun 22, 2006 9:32 PM

I believe that normally EIGRP uses multicast, but with point-to-point the router has to queue up a packet for each tunnel.


- Ghazan Haider

Fri, Jun 23, 2006 3:46 AM

Defining the interface as NBMA or adding virtual links in OSPF will make it use unicast hello packets. I've never had to use EIGRP, but I'll check to see if its convergence is faster for us, along with ISIS.

As for the original subject, I'm sure our devices don't do IPX, IPv6, etc., but IPv4 should include multicast. Maybe it'll just work and I'll keep things default. The routers are taking an awfully long time to arrive via CDW.

I'll also disable OSPF on the other interface, so these two links are the only place the sub-second hellos are being sent. That should also keep them small enough. I've heard references to downtime being 1 second, but I wonder if increasing the multiplier (divider?) would give me 0.5 seconds of downtime or less. The links are 6 Mbit each and load balancing should give me 12 Mbit. I can sacrifice up to 2 Mbit to hellos even though I don't think it'll get that high. I'd be more worried about the 1841 routers' CPU performance since they'll be doing NAT, VPN and possibly IDS.
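A quick Python estimate of the bandwidth question raised above. The 6 Mbit link speed and the 20 hellos per second come from the thread; the 100 byte on-wire size per hello (including IP and any tunnel overhead) is an assumption for illustration and should be replaced by a measured value.

```python
def hello_overhead_bps(hellos_per_second: float, packet_bytes: int) -> float:
    """Bits per second consumed by hellos sent in one direction on one link."""
    return hellos_per_second * packet_bytes * 8


if __name__ == "__main__":
    link_bps = 6_000_000       # one wireless link, as stated in the thread
    assumed_packet_bytes = 100  # assumption, not a measured hello size
    overhead = hello_overhead_bps(20, assumed_packet_bytes)
    print(f"{overhead:.0f} bit/s, {overhead / link_bps:.2%} of the link")
```

Under that assumption the hellos stay far below the 2 Mbit budget, which is consistent with CPU load on the routers being the more likely constraint.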

- Merv

Fri, Jun 23, 2006 11:33 AM

For EIGRP, the interfaces at both ends of the link have to be addressed from the same subnet before an adjacency will be formed. So that will need to be taken into consideration given that the wireless devices are in the middle.

I think OSPF might form an adjacency but I suspect that the LSA would not get populated into the RIB (main routing table) as it will consider the advertising router as unreachable.

I would be very surprised if you can get this to work without GRE tunnels ...

- rdymek

Fri, Jun 23, 2006 6:34 PM

All previous statements about EIGRP and OSPF have been completely accurate in that you can configure them as unicast when required, you can tweak hold timers, etc. But I think we may be over-thinking this solution and the OP actually may have been on a better track.

I believe GLBP (Gateway Load Balancing Protocol) will be the best solution (all GLBP is, is HSRP with load balancing turned on). One thing I'd ask is: is it possible to turn your wireless units into bridges? If so, then you can do this flawlessly, making the wireless units irrelevant to layer 3. This will allow the entire load balancing function to happen on the Cisco routers. GLBP does round-robin load balancing (default) to start, so if you lose a link, only half the connections will even be affected at all. The other half will converge quickly as the gateway IP will become fully active on the live interface (you'd make two separate GLBP/HSRP groups, one for each subnet). Of course this only works if you can move the routing completely to the Cisco and off the wireless units by turning the wireless units into bridges.
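The "only half the connections are affected" point can be illustrated with a toy round-robin assignment in Python. This is not GLBP, only the round-robin idea behind the claim; the host and link names are hypothetical.

```python
from itertools import cycle


def assign_round_robin(clients, forwarders):
    """Hand out forwarders to clients in turn, as a round-robin balancer would."""
    return dict(zip(clients, cycle(forwarders)))


# Eight hypothetical hosts spread over two hypothetical links.
clients = [f"host{i}" for i in range(1, 9)]
assignment = assign_round_robin(clients, ["link-A", "link-B"])

# If link-A drops, only the hosts that were handed link-A notice anything.
affected = [c for c, fwd in assignment.items() if fwd == "link-A"]
print(f"{len(affected)} of {len(clients)} hosts affected:", affected)
```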

Just another way to skin a cat, but I believe all the other options would work as well provided they were configured correctly using Unicast (i.e. configuring OSPF to be on an NBMA network).

Also, a question no one has asked: why do you even have 2 subnets? Is it because the router has to have a different subnet for each interface? If that's so, you can use IP Unnumbered to have the same subnet exist on two interfaces.

Ryan

- stephen

Fri, Jun 23, 2006 7:50 PM

Agreed. It is often used inside ISPs and telcos; the MPLS core at work uses IS-IS.

However, it is a pain to use if you don't know it reasonably well, since you have to set up the OSI CLNS protocol to carry the routing packets as well as IP.

The main advantage over OSPF is sub-second "hello" times, but that is now in the Cisco OSPF enhancements, so less of an issue.

This is only an issue when you have a big routing table. OSPF calculates the new routing table locally from the internal database, so for a fast router and a low number of routes the difference is not significant.

FWIW we got OSPF convergence in a lab down to around 40 ms on a mix of 3700, 7200, 7304 and a couple of Cat6ks with Sup720s.

However, a routing protocol cannot converge until it detects a state change, and all the really fast times depend on a physical change event, such as losing incoming light on a GigE.

The OP scenario is where a "gap" opens up in a layer 2 or layer 3 path, but there isn't any physical signalling of the event, so this depends on losing some number of hello packets between routing neighbours.

So the choice comes down to: which protocols work across the link (hint: much easier if the wireless can act as a simple Ethernet or bridge; any other way you need a VPN-style arrangement to isolate the "outer" routers from the wireless IP routing); how fast the routing protocol can converge; how hard you tune the timers (remembering that slower timers mean lower router overhead and tolerate short interruptions in flow better, so faster timers mean potentially less stability); and whether you can load balance.

- Merv

Sat, Jun 24, 2006 7:09 PM

To achieve LAN gateway redundancy HSRP, VRRP or GLBP can be used. LAN redundancy protocols however will not address the WAN redundancy issue.

Given that use of a dynamic IGP is required (EIGRP, OSPF, ISIS or RIPv2), this is where you need to focus your research efforts.

Since the WaveWireless SPEEDLAN 9200 is an IP router (it is NOT a bridge), its LAN and wireless interfaces require IP addresses.

The OP should post the planned network topology. Is it:

LAN 1 --- Cisco 1841 A --- wireless router A --- wireless link 1 --- wireless router B --- Cisco 1841 B --- LAN 2

LAN 1 --- Cisco 1841 C --- wireless router C --- wireless link 2 --- wireless router D --- Cisco 1841 D --- LAN 2

Thus it would appear that each Cisco 1841 / Wave Wireless router pair will require a common subnet so they can communicate with each other, and so the use of unnumbered interfaces will not be an option.

In fact, given the wireless connectivity issue that the OP says he experiences, having an IP address on every interface in the path will be useful for troubleshooting.

So the key question, for each target IGP, is: is the IGP neighbour required to be addressed from a common subnet?

I know for sure that EIGRP has this requirement (regardless of the use of multicast or unicast hellos).

I believe there is a command for RIP to tell it to ignore source address verification.

I suspect that OSPF will form an adjacency, but will not populate the RIB.

Not sure what ISIS will do.

- Ghazan Haider

Sun, Jun 25, 2006 9:52 PM

I looked into using the wireless connections at layer 2 as the first step. It's not possible, and I'm stuck with layer 3 connectivity. Everything I read about GLBP/HSRP/VRRP showed dual routers on each side having the same virtual MAC address. This is different from my having only one router on each side. I had also been looking into various Linux and OpenBSD technologies to achieve this; CARP from OpenBSD is a lot like GLBP but can also be used to load balance redundant layer 3 paths.

If GLBP can use redundant routes (and the wireless IP nodes do not speak GLBP) with fast convergence, I'd prefer that. Otherwise it's OSPF.

Now, as another poster commented, convergence can only be as fast as the detection. Detecting that a layer 3 connection beyond the next hop has gone down takes time, which is why I think I should send the hello packets as fast as possible, even using up to 10% of our available bandwidth if that's what it takes. Simply processing hello packets should not take too much CPU on the routers. I believe the wireless nodes send SNMP traps as soon as the connection goes down. I'm not sure if I can use these as triggers for a route change on the Cisco routers, with or without dynamic routing protocols.

Yes, I'm sure I can get OSPF up and running. The wireless should also pass multicast packets along, though I haven't tested that. It shouldn't be a problem either way.

We have 2 subnets because the wireless connections are layer 3 only. With the same subnet, packets will stay on the same side destined for the other side, and the wireless nodes will be confused.
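The reasoning about why one shared subnet would not work over routed wireless links can be checked with Python's standard `ipaddress` module. The host addresses below are hypothetical; only the two /24 networks come from the thread. A host compares the destination with its own prefix: if it matches, the host tries to reach the destination directly on the local segment and never hands the packet to a gateway.

```python
import ipaddress


def next_hop_decision(src_interface: str, dst: str) -> str:
    """Decide how a host with address/prefix `src_interface` reaches `dst`."""
    iface = ipaddress.ip_interface(src_interface)
    if ipaddress.ip_address(dst) in iface.network:
        return "on-link: ARP for the destination directly"
    return "off-link: send to a gateway"


if __name__ == "__main__":
    # Two /24 subnets, as in the thread: traffic for the far side is routed.
    print(next_hop_decision("10.0.0.10/24", "10.0.1.20"))
    # One subnet stretched over both sites (a /23 here): the host would ARP
    # locally for a machine that is actually behind the wireless routers.
    print(next_hop_decision("10.0.0.10/23", "10.0.1.20"))
```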

- Ghazan Haider

Sun, Jun 25, 2006 9:59 PM

I won't mind achieving that here at all, if it is possible.

Default is 3. I'll try using 2 hello losses.

Our connection (based on 802.11b) is stable when it's up. When it goes down, it's down for at least 5 minutes. So I thought of using faster hellos, but we might lose the hello packets when the bandwidth is fully used. I'd have to use guaranteed QoS to make sure the hello packets get there.
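The trade-off between declaring the link down after 2 missed hellos versus 3 can be sketched in Python. The sketch assumes each hello is lost independently with the same probability, which is optimistic: congestion losses tend to come in bursts, so real false alarms would be more frequent. The 5% loss rate is an arbitrary example value.

```python
def false_down_probability(loss_rate: float, hellos_missed: int) -> float:
    """Chance that `hellos_missed` consecutive hellos are all lost on a link
    that is actually up, assuming independent losses."""
    return loss_rate ** hellos_missed


if __name__ == "__main__":
    loss = 0.05  # assumed hello loss rate under full load, for illustration
    for n in (2, 3, 4):
        p = false_down_probability(loss, n)
        print(f"{n} missed hellos: {p:.6f}")
```

Each extra hello that must be missed cuts the false-alarm chance by the loss rate again, at the price of slower detection, which is why prioritising the hellos with QoS matters more as the threshold drops.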

- Vincent C Jones

Wed, Jun 28, 2006 3:26 AM

Look for a solution that works on parallel VPNs. That is the closest equivalent to what you are looking for. IGPs like OSPF and EIGRP won't work natively across other routers. BGP will, but won't meet your detect and recover requirements. So you're pretty much stuck with GRE tunnels that can support OSPF or IS-IS sub-second hellos. Just watch out for the hit on path MTU.
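The path MTU hit from GRE is easy to quantify. This Python sketch assumes a 1500 byte MTU on the underlying path (the wireless units may differ), plain GRE over IPv4 with no key, sequence or checksum fields, and TCP/IPv4 headers without options.

```python
IPV4_HEADER = 20  # bytes, no IP options
GRE_HEADER = 4    # bytes, basic GRE header with no optional fields
TCP_HEADER = 20   # bytes, no TCP options


def gre_tunnel_mtu(path_mtu: int = 1500) -> int:
    """Largest inner IP packet that fits once the outer IP and GRE headers are added."""
    return path_mtu - IPV4_HEADER - GRE_HEADER


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    mtu = gre_tunnel_mtu()
    print("tunnel MTU:", mtu, "TCP MSS through tunnel:", tcp_mss(mtu))
```

Full-size 1500 byte packets from the LAN will not fit through the tunnel unfragmented, so path MTU discovery or MSS adjustment needs to work end to end.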

Good luck and have fun!

- Merv

Wed, Jun 28, 2006 10:54 AM

Glad I am not the only one who is of the opinion that GRE tunnels are pretty much the only solution given the OP's network setup.