How does CEF determine PPP peer adjacency?

How exactly does CEF determine an adjacency for a PPP peer? I know it sounds like a simple question, but...

I've had CEF disabled for a long time because I knew it did something to kill my serial PPP connection. Now I really need to enable it so voice performance will be acceptable on the LAN side and I've been trying to understand the problem. This is a 3640 with 12.4(13a), but I think I first noticed the CEF/PPP issue on an older platform with older software.

When CEF is first enabled it creates a valid adjacency for the PPP peer, but after a few minutes the adjacency becomes invalid and subsequent packets are rate-limited-punted, with the adjacency never again becoming valid. Debugging CEF gives the expected "stalled adjacency" message for each packet destined for the PPP peer.

Here is an example of the valid adjacency:

x.x.x.2/32, version 73, epoch 0, attached, connected, per-packet sharing
0 packets, 0 bytes
  via Virtual-Access1, 2 dependencies
    valid adjacency
  0 packets, 0 bytes switched through the prefix
  tmstats: external 0 packets, 0 bytes
           internal 0 packets, 0 bytes

Here it has become invalid:

x.x.x.2/32, version 73, epoch 0, attached, connected, per-packet sharing
0 packets, 0 bytes
  via Virtual-Access1, 2 dependencies
    invalid adjacency
  0 packets, 0 bytes switched through the prefix
  tmstats: external 0 packets, 0 bytes
           internal 0 packets, 0 bytes
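
To catch the moment the adjacency flips, the two outputs above can be compared mechanically. The following is only a rough sketch in Python: the function name `adjacency_state` and the regular expression are my own, and it assumes the entry text has already been captured from the router by some other means.

```python
import re

# Matches "via <interface>, ... valid adjacency" or "... invalid adjacency".
# The \b word boundary keeps "valid" from matching inside "invalid".
ADJ_RE = re.compile(r"via\s+([^\s,]+),.*?\b(valid|invalid) adjacency", re.DOTALL)


def adjacency_state(cef_output):
    """Return (interface, state) for one CEF prefix entry, or None if absent."""
    match = ADJ_RE.search(cef_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


good = "0 packets, 0 bytes via Virtual-Access1, 2 dependencies valid adjacency"
bad = "0 packets, 0 bytes via Virtual-Access1, 2 dependencies invalid adjacency"
print(adjacency_state(good))
print(adjacency_state(bad))
```

Polling this every few seconds after enabling CEF would give a timestamp for the transition, which may help when matching against a bug description.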

Here is my dialer configuration:

interface Dialer0
 ip address x.x.x.1 255.255.255.252
 ip access-group 100 in
 ip accounting output-packets
 encapsulation ppp
 dialer pool 1
 dialer remote-name xxxxxxxx-gateway
 dialer idle-timeout 0
 dialer string xxx
 dialer load-threshold 100 either
 dialer vpdn
 dialer-group 1
 no cdp enable
 ppp accm 0
 ppp pfc local request
 ppp pfc remote apply
 ppp acfc local request
 ppp acfc remote apply
 ppp chap hostname xxxxxxxx
 ppp chap password 7 XXXXXXXXXXXXXXXXXXX
 ppp multilink
 ppp multilink links maximum 2

interface Serial0/0
 bandwidth 56
 no ip address
 encapsulation ppp
 no ip mroute-cache
 dialer in-band
 dialer pool-member 1
 dialer-group 1
 pulse-time 1
 no cdp enable

I tried removing the access-group, vpdn dialer, and multilink configuration. None of this made any difference except that the last caused Serial0/0 to show up in the adjacencies instead of Virtual-Access1. I can disable CEF on the dialer interface and that "works", but it makes me uncomfortable to have the invalid adjacencies for the peer and everything that routes through it. Obviously I'm doing something stupid to break CEF's adjacency test (PPP does work with CEF, right?) but I can't tell what else it might be...

Dan Lanciani ddl@danlan.*com


I think this has something to do with the fact that you are using PFC and ACFC on the PPP interface.

This document states...

"Using ACFC and PFC can result in minor gains in effective bandwidth because they reduce the amount of framing overhead for each packet. However, using ACFC or PFC changes the alignment of the network data in the frame, which in turn can impair the switching efficiency of the packets both at the local and remote ends of the connection. For these reasons, it is generally recommended that ACFC and PFC not be enabled without carefully considering the potential results. "

I take this to mean that PFC and ACFC can break CEF, which is what has happened.

Disabling CEF on the dialer interface and having "invalid CEF adjacencies" is not something to be concerned about.

Thrill5

| I think this has something to do with the fact that you using PFC and ACFC
| on the PPP interface.

I disabled both and there was no change in behavior. It would have been disappointing if this were the problem since I waited through about five major IOS releases for the ability to suppress those useless bytes. :)

Dan Lanciani ddl@danlan.*com


One thing to remember is that the performance with ordinary "fast switching" is comparable with that of CEF. I have the idea that fast sw was a tiny bit better on some particular platform years and years ago but there is effectively no difference.

The key advantage of CEF is that there is no caching, so full performance is available right away on a cache invalidation (routing table change). For most networks this is not going to be significant. It is significant for a carrier with a 6500 in the core with thousands upon thousands of hosts passing through it, where you could imagine it taking a significant time to sort out the cache again.

int x
 ip route-cache
! turns on fast switching. This may be a "default" entry and so will not show up in the config.

I personally don't think I would worry about having some unused CEF entries hanging about, however :)

bod43

snipped-for-privacy@hotmail.co.uk (bod43) writes:
| On 23 Jan, 08:42, ddl@danlan.*com (Dan Lanciani) wrote:
| > snipped-for-privacy@somewhere.com (Thrill5) writes:
| >
| > | I think this has something to do with the fact that you using PFC and ACFC
| > | on the PPP interface.
| >
| > I disabled both and there was no change in behavior. It would have been
| > disappointing if this were the problem since I waited through about five
| > major IOS releases for the ability to suppress those useless bytes. :)
|
| One thing to remember is that the performance with
| ordinary "fast switching" is comparable with that of CEF.

I can't use ordinary fast switching because of policy routing on the Ethernet interface. I did try "ip route-cache policy" (and my map does fall within the subset that can be fast switched), but while this helped the voice problem it did not fully resolve it. With CEF on, voice is fine. With CEF off, anything that bothers the CPU severely interferes with voice.

| The key advantage of CEF is that there is no caching
| so full performance is available right away on a cache
| invalidation (routing table change). For most networks
| this is not going to be significant.
| with a 6500 in the core with thousands of thousands of

It seems to be significant if using voice ports. Granted, using voice ports is probably a mistake; Cisco voice support is less robust than I had hoped. CallerID on FXO ports is unreliable, though this may also be a matter of CPU usage. I haven't had enough time with CEF partially on to see if CallerID works better, but even a flood ping seems to sometimes be enough to prevent the FXO port from catching the CallerID information.

In any case, it would be nice to know how CEF determines adjacencies on PPP links. It isn't clear to me why it should be removing an entry for a point-to-point peer at all since there is no obvious underlying mechanism analogous to ARP to re-verify that peer.

Dan Lanciani ddl@danlan.*com


The 3640 is a really, really low-end platform with a very under-powered CPU, which is exacerbated by the fact that it has limited hardware assist for most things. A 2811 has about 5x the CPU, does just about everything in hardware, and is a very capable platform. You should consider upgrading to that platform with everything that you're trying to do. A 2811 is fairly inexpensive (compared to what you would have paid for a new 3640). Plus the 3640 was EoS a number of years ago and 12.4 will be the last supported IOS train.

Thrill5

| > In any case, it would be nice to know how CEF determines adjacencies
| > on PPP links. It isn't clear to me why it should be removing an entry
| > for a point-to-point peer at all since there is no obvious underlying
| > mechanism analogous to ARP to re-verify that peer.
| >
| The 3640 is a really, really low-end platform with a very under-powered CPU,
| which is exacerbated by the fact that it has limited hardware assist for
| most things.

That's funny. I remember when the 3640 was the shiny new multi-service platform. Of course, that's when I had a 2500 series which was really, really low-end by then even though it was a great step "up" from the previous 3000 series (which had more RAM and NVRAM but they stopped making boot ROM upgrades available) that had in turn been the cat's meow in comparison to the MGS. I guess it's all relative. :)

| A 2811 has about 5x CPU and does just about everything in
| hardware and a very capable platform. You should consider upgrading to that
| platform with everything that your trying to do.

I'm trying to run one 10Mb/s Ethernet, one 56k serial line, and one voice stream. I don't think 5x the CPU will make a difference (50x might) as long as the voice stream is being process-level-switched. I also suspect that whatever I'm doing to break CEF will still break it on a newer platform/IOS. So I'm back to finding out how I'm breaking CEF...

| A 2811 is fairly
| inexpensive (compared to what you would have paid for a new 3640.) Plus the
| 3640 was EoS a number of years ago and 12.4 will be the is the last
| supported IOS train.

Each time I've run into a voice problem on the 3640 I've tested with 12.4T(15) on a 3660 and found no difference. Voice support has been around for several major IOS releases so I doubt that everything is suddenly better in the 12.4T(>15) range which inconveniently requires higher than a 3700 series. When I get a chance I'll try to reproduce this CEF issue on 12.4T(15) but it is harder to set that up.

Dan Lanciani ddl@danlan.*com


Cisco keeps a table of "raw" router performance to answer exactly this kind of question.

The table shows the 3640 at 50k pps CEF and the 2811 at 120k, so the ratio isn't that big, but it is still pretty good considering the 2811 has much more built in and a much lower price.

Ironically, the 3640 is actually faster for process switching, which suggests that some of the 2811 "go faster stripes" are probably based on faster memory and better hardware offload of the forwarding process.

See above: the 2811 is actually worse for process switching.

Note the 56k line will be the bottleneck if that is where the traffic is going, and 3000 to 4000 pps for process switching should be plenty to fill it a couple of times over.
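
As a back-of-the-envelope check on that bottleneck claim, here is a small Python sketch. The function name is made up, the packet sizes are just illustrative choices, and PPP framing overhead is ignored, so treat the numbers as rough upper bounds.

```python
def max_packets_per_second(line_bps, packet_bytes):
    """Upper bound on packets per second that fit on a line.

    Ignores PPP framing overhead, so the real figure is slightly lower.
    """
    return line_bps / (packet_bytes * 8)


# Illustrative packet sizes in bytes (assumptions, not from the thread).
for size in (64, 576, 1500):
    pps = max_packets_per_second(56_000, size)
    print(f"{size:>5} byte packets: about {pps:.1f} pps")
```

Even with 64-byte packets the 56k line carries only on the order of a hundred packets per second, far below the 3000 to 4000 pps process-switching figure quoted above.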

However, if you are using compression etc. you may be slugging the CPU in other ways.

Stephen

I find that *very* hard to believe - you never know though. I would put a small wager on it being a typo.

I have done a bit of digging and found a few things.

Bug CSCsl92595 is a very good match. Not quite perfect, but that is not unusual.
"incomplete adjancency after 3 min for multilink virtual access"

"Symptoms: After 3 minutes of normal operation, packet loss occurs over Dialer PPP multilink (MLPPP enabled) interfaces.

Conditions: Occurs when CEF is enabled and "ip address negotiated" is configured on the interface.

Workaround: Use one of the following options:
  Permanent: disable CEF with the no ip cef command.
  Permanent: configure a static IP address on the interface.
  Temporary: Use the clear adj command to refresh all adjacencies (will last 3 minutes)

Fixed-In: 12.4(17b)M 12.4(15)T4 12.4(18a)M 12.4(19)M 12.4(19.10)M"

The new bug toolkit is a *huge* improvement, by the way.

I have written the following already and will post it but the fit of the bug is good enough for me to start there.

Dialer CEF

*Restrictions for Dialer CEF* "The Dialer CEF feature is not supported when a static route is pointing to the Dialer without specifying a next hop IP address. When using the Cisco IOS Release 12.3(11)T and higher, the ppp ipcp default route command may be used in Dialer interface configuration mode to work around this restriction. "

However, I have a static interface route and it seems to be working here - 12.4(15)T on 877.

router#sh deb
IP CEF:
  IP CEF events debugging is on

033186: Jan 24 18:13:47.985 GMT: CEF: Installed CEF auto adjacency for NVI0
033187: Jan 24 18:13:47.985 GMT: CEF: Installed CEF auto adjacency for Dialer0
033434: Jan 24 18:14:47.968 GMT: CEF: Installed CEF auto adjacency for NVI0
033435: Jan 24 18:14:47.968 GMT: CEF: Installed CEF auto adjacency for Dialer0
033680: Jan 24 18:15:47.951 GMT: CEF: Installed CEF auto adjacency for NVI0
033681: Jan 24 18:15:47.951 GMT: CEF: Installed CEF auto adjacency for Dialer0

IP Dialer0 point2point(13)
   6140628 packets, 432577746 bytes
   000109000021
!

bod43

| I have done a bit of digging and found a few things.
|
| Bug: CSCsl92595
| Is a very good match. Not quite perfect but
| that is not unusual.
| "incomplete adjancency after 3 min for multilink virtual access"
|
| "Symptoms: After 3 minutes of normal operation, packet loss
| occurs over Dialer PPP multilink (MLPPP
| enabled) interfaces.
|
| Conditions: Occurs when CEF is enabled and "ip address
| negotiated" is configured on the interface.
|
| Workaround: Use one of the following options:
| Permanent: disable CEF with the no ip cef command.
| Permanent: configure a static IP address on the interface.
| Temporary: Use the clear adj command to refresh all
| adjacencies (will last 3 minutes)

Indeed, pretty close. Unfortunately I don't use a negotiated address and removing the multilink configuration did not change anything. Clearing the adjacency also did not help. (Clearing the adjacency while it is still valid makes it invalid.)

| I have written the following already and will post it but the
| fit of the bug is good enough for me to start there.
|
| Dialer CEF
| *Restrictions for Dialer CEF*
| "The Dialer CEF feature is not supported when a static route
| is pointing to the Dialer without specifying a next hop IP
| address. When using the Cisco IOS Release 12.3(11)T and
| higher, the ppp ipcp default route command may be used in
| Dialer interface configuration mode to work around this
| restriction. "

Interesting. While I do not have a static route pointing to the Dialer without a next hop address, I do have a "set interface" in a route map which I'll bet would provoke whatever problem they are worried about. Nothing was using this map during my tests but I do need it to work. Oh, one other odd thing I noticed. Although I appear to be using dialer profiles, the CEF description said something like "legacy dialer CEF active on interface".

Anyway, it sounds like CEF just isn't well supported on PPP links. I will leave it disabled on the Dialer interface for a while. This should give me a punt adjacency so that packets headed towards the serial line will be sent up to the next level. Does this mean fast switching or are they punted all the way to process level? I assume incoming packets on the serial line can still be fast switched. Show ip cache shows lots of entries for both Dialer and Ethernet, so apparently some fast switching is occurring.

| !

Dan Lanciani

I have no inside knowledge of Cisco; however, I have apparently identified bugs correctly even with inexact matches. I imagine that no attempt will be made to identify every possible way of stimulating a bug. They will get a reproducible bug and fix it. Presumably the developers will try to fix it as well as they can to cover the widest possible case, but I doubt very much that every possible stimulus will be documented in the bug notes.

That does seem quite divergent from the bug description.

There were a couple of other similar bugs but to my eye they seemed further away. I did not check properly but I got the idea that they were all fixed in the same releases.

CSCsj12558 CSCsk92337

"Punt" by definition means to process switching, in my (quite possibly imperfect) understanding.

The switching method used is the one configured on the inbound interface, I believe, so you have to have the same switching method for all traffic inbound on any particular interface.

ip route-cache       - fast
ip route-cache cef   - cef
no ip route-cache    - process
Of course one of these is the non-display 'default'.

Regarding voice - just to mention that I have seen a call manager installation with about ten PRIs spread all over the globe. 1000 users (not big really) and all worked very well. No issues at all with a 2801 running 2 x PRIs. I have an idea that we had a 2600 (2620XL at best) with a PRI too. Of course they use DSPs for 'voice processing offload' let me call it. Never seen analogue voice cards.

In general, for performance issues:

Are you getting a lot of buffer misses? Manually configure buffers to eliminate them if you have a bit of spare memory.

Are you getting a lot of unneeded broadcasts hitting the router?
  permit udp [dhcp stuff if required]
  deny ip any host 255.255.255.255
is OK on a lot of networks.

sh proc cpu
  CPU utilization for five seconds: 1%/0%
is total/interrupt - but you knew that :)

sh proc cpu
  IP Input
is process switching load - and that :)

You may want to look at (from memory):
  scheduler allocate
  scheduler interval

If you are getting any queuing (i.e. output drops) on the interfaces that are carrying the voice traffic then you will need QoS.

If the 1 sec (or longer) CPU is getting pegged you *will* have issues with voice for sure.

Are you doing excessive logging?
  no logg console
is good practice. Each character sent causes an interrupt, I believe.
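
For anyone scripting these checks, the total/interrupt split can be pulled apart as in the sketch below. It is Python, the helper name `split_cpu` is made up, and it assumes the first line of "sh proc cpu" has already been captured as text. The process-level figure is simply total minus interrupt.

```python
import re

# First figure is total CPU, second is time spent at interrupt level.
CPU_RE = re.compile(r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%")


def split_cpu(line):
    """Split the five-second CPU figure into total, interrupt and process level."""
    match = CPU_RE.search(line)
    if match is None:
        raise ValueError("not a CPU utilization line")
    total, interrupt = int(match.group(1)), int(match.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


print(split_cpu("CPU utilization for five seconds: 1%/0%"))
```

A high "process" share alongside a busy IP Input process would be consistent with traffic being process switched rather than handled in the interrupt path.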

bod43

| > Anyway, it sounds like CEF just isn't well supported on PPP links. I
| > will leave it disabled on the Dialer interface for a while. This should
| There were a couple of other similar bugs but to my eye they seemed
| further away. I did not check properly but I got the idea that
| they were all fixed in the same releases.
|
| CSCsj12558

This one is also interesting for the dialer profile/legacy distinction. I don't know why CEF thinks my configuration is legacy. I had to switch to profiles because of some other bugs introduced in 12.4 around (13). I guess I should try a much newer 12.4 though that will mean a reduced feature set.

| > give me a punt adjacency so that packets headed towards the serial line
| > will be sent up to the next level. Does this mean fast switching or are
| > they punted all the way to process level?
| "Punt" by definition is to process switching is my
| (quite possibly imperfect) understanding.

It's confusing because in some places they say the "next slowest" method.

| Regarding voice - just to mention that I have seen a
| call manager installation with about ten PRIs spread
| all over the globe. 1000 users (not big really) and all worked
| very well. No issues at all with a 2801 running 2 x PRIs.
| I have an idea that we had a 2600 (2620XL at best)
| with a PRI too.
| Of course they use DSPs for 'voice processing offload'
| let me call it. Never seen analogue voice cards.

The analog cards have DSPs as well. The problem here appears to be not so much a total processing power issue but a dependency on not having too much latency at process level. And the whole CallerID implementation may just be poor...

| sh proc cpu
| CPU utilization for five seconds: 1%/0%
| is total/interrupt - but you knew that:)

Speaking of that:

gateway>sho proc cpu sort
CPU utilization for five seconds: 8%/0%; one minute: 6%; five minutes: 5%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
 210       11588  39818870          0  3.44%  3.41%  3.34%   0 IP SLA Mon Event
  47      119996      2681      44757  0.98%  0.11%  0.06%   0 Per-minute Jobs
  58      472404    458515       1030  0.73%  0.32%  0.24%   0 IP Input
 242        2684   4988545          0  0.57%  0.56%  0.57%   0 PPP manager
  62          32        35        914  0.49%  0.04%  0.00% 130 Virtual Exec
 243        1468   4988554          0  0.32%  0.32%  0.32%   0 PPP Events
  98         704   1596447          0  0.16%  0.14%  0.14%   0 RBSCP Background
   2         128     31947          4  0.08%  0.04%  0.06%   0 Load Meter

Why is SLA such a hog? It's doing one ping every 40 seconds. That's it. I see this on other routers as well.
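
One way to read those columns, sketched in Python: assuming uSecs is the average runtime per invocation (Runtime(ms) times 1000, divided by Invoked, truncated), the figures in the output above are reproduced exactly. The function name is made up and the interpretation is my assumption, not something taken from documentation.

```python
def avg_usecs_per_invocation(runtime_ms, invoked):
    """Average microseconds per invocation, truncated to an integer."""
    return (runtime_ms * 1000) // invoked


# (Runtime(ms), Invoked) pairs copied from the output above.
rows = {
    "IP SLA Mon Event": (11588, 39818870),
    "Per-minute Jobs": (119996, 2681),
    "IP Input": (472404, 458515),
    "Load Meter": (128, 31947),
}

for name, (runtime_ms, invoked) in rows.items():
    print(name, avg_usecs_per_invocation(runtime_ms, invoked))
```

On that reading, IP SLA Mon Event does almost nothing per call but has been invoked nearly 40 million times, far more often than any other process listed. This does not say why it is scheduled so often.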

Dan Lanciani ddl@danlan.*com


Sorry, but I've hidden the router name, but the date is genuine - it's routing between a 2Mbps serial line and 10 Mbps ethernet. It arrived with a CSC/3 (which was overkill) but inherited the CSC/4 when we dumped our AGS+s. The power is not protected, which is why the uptime isn't 10+ years.

Sam

rtr>sh clock
15:15:04.600 UTC Mon Jan 26 2009
rtr>sh hard
Cisco Internetwork Operating System Software
IOS (tm) GS Software (GS3-K-M), Version 10.3(3), RELEASE SOFTWARE (fc4)
Copyright (c) 1986-1995 by cisco Systems, Inc.
Compiled Fri 19-May-95 23:00 by nitin
Image text-base: 0x00001000, data-base: 0x004C9348

ROM: System Bootstrap, Version 5.3(2.6), SOFTWARE

rtr uptime is 85 weeks, 4 days, 22 hours, 19 minutes
System restarted by power-on at 18:03:31 BST Wed Jun 6 2007
Running default software

CSC4 (68040) processor with 16384K bytes of memory.
X.25 software, Version 2.0, NET2, BFE and GOSIP compliant.
Bridging software.
1 MCI controller (2 Ethernet, 2 Serial).
2 Ethernet/IEEE 802.3 interfaces.
2 Serial network interfaces.
32K bytes of non-volatile configuration memory.
Configuration register is 0x2102

rtr>

Sam Wilson

| > | The 3640 is a really, really low-end platform with a very under-powered
| > | CPU, which is exacerbated by the fact that it has limited hardware
| > | assist for most things.
| >
| > That's funny. I remember when the 3640 was the shiny new multi-service
| > platform. Of course, that's when I had a 2500 series which was really,
| > really low-end by then even though it was a great step "up" from the
| > previous 3000 series (which had more RAM and NVRAM but they stopped
| > making boot ROM upgrades available) that had in turn been the cat's meow
| > in comparison to the MGS. I guess it's all relative. :)
|
| Sorry, but I've hidden the router name, but the date is genuine - it's
| routing between a 2Mbps serial line and 10 Mbps ethernet. It arrived
| with a CSC/3 (which was overkill) but inherited the CSC/4 when we dumped
| our AGS+s. The power is not protected which is why the uptime is isn't
| 10+ years.

I never got beyond the CSC/2 on the multibus platforms, but they were routing multiple Ethernets. (I tossed the last MGS last year but it had been on the shelf for a while as backup.) It may be that IOS is subject to the same bloat problem that afflicts Windows such that the latest versions require far more cpu to do the same job. :(

Dan Lanciani ddl@danlan.*com

