Cisco 837-k9-64 router sometimes crashes during boot

I have a Cisco 837-k9-64 router that sometimes crashes during boot (after power-on or after a reload command). If left to run after a successful boot, it is stable and reliable.

Partial output from the crashinfo file is as follows. Any suggestions as to likely causes?

*Mar 1 11:00:24: %LINK-3-UPDOWN: Interface FastEthernet3, changed state to down
*Mar 1 11:00:24: %LINK-3-UPDOWN: Interface FastEthernet4, changed state to down
*Mar 1 11:00:24: %LINEPROTO-5-UPDOWN: Line protocol on Interface ATM0, changed state to down
*Mar 1 11:00:25: %LINEPROTO-5-UPDOWN: Line protocol on Interface NVI0, changed state to up
*Mar 1 11:00:25: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet3, changed state to down
*Mar 1 11:00:25: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet4, changed state to down
*Mar 1 11:00:44: %SYS-2-MALLOCFAIL: Memory allocation of 10260 bytes failed from 0x80488920, alignment 0
Pool: Processor Free: 25076 Cause: Memory fragmentation
Alternate Pool: None Free: 0 Cause: No Alternate pool
-Process= "Init", ipl= 0, pid= 3
-Traceback= 0x8028280C 0x803E3924 0x803EB290 0x80488924 0x8048D530 0x8048A4EC 0x8048ACB0 0x8048C26C 0x8048D160 0x804333A8 0x80433584 0x80433698 0x80433698 0x80434AB8 0x8046E8A8 0x8047D6CC

=== Start of Crashinfo Collection (11:00:46 aedt Fri Mar 1 2002) ===

For image:
Cisco IOS Software, C837 Software (C837-K9O3SY6-M), Version 12.4(6)T2, RELEASE SOFTWARE (fc1)
Technical Support:
(c) 1986-2006 by Cisco Systems, Inc.
Compiled Tue 16-May-06 17:32 by kellythw

========= Show Alignment =============================

No spurious memory references have been recorded.

Jason White

Hi Jason,

A crash can be caused by software problems, hardware problems, or both.

Important information about the crash is lost if the router is reloaded after the crash, for example by a power cycle or by issuing the reload command.

Before reloading the router, collect the output of the show tech-support and show log commands, as well as the crashinfo file.

-------------------------------------------------

To address a router crash, perform these steps:

Step 1. Find out the type of crash by issuing the show version command.

Step 2. Look for the reason the system restarted.

This sample output shows where to look for the reason the system restarted:

Router#show version
Cisco Internetwork Operating System Software
IOS (tm) RSP Software (RSP-PV-M), Version 12.0(10.6)ST, EARLY DEPLOYMENT MAINTENANCE INTERIM SOFTWARE
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled Fri 23-Jun-00 16:02 by richv
Image text-base: 0x60010908, data-base: 0x60D96000

ROM: System Bootstrap, Version 12.0(19990806:174725), DEVELOPMENT SOFTWARE
BOOTFLASH: RSP Software (RSP-BOOT-M), Version 12.0(9)S, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)

Router uptime is 20 hours, 56 minutes
System returned to ROM by error - a Software forced crash, PC 0x60287EE8
System image file is "slot0:rsp-pv-mz.120-10.6.ST"

For further information, see the Cisco document Troubleshooting Router Crashes.

For information on hardware troubleshooting, see the Hardware Troubleshooting Index Page.

Hope this helps.

Brad Reese
BradReese.Com - Cisco Network Engineer Directory
Hendersonville Road, Suite 17, Asheville, North Carolina USA 28803
USA & Canada: 877-549-2680 International: 828-277-7272 Fax: 775-254-3558 AIM: R2MGrant


You have run out of memory. How much memory do you have compared with the recommended minimum for the version of code you're running?

What features do you have configured? Could we see the config? Perhaps something in it allocates large amounts of memory.

/Jesper

Jesper Skriver

64 MB, which I think is the recommended minimum for 12.4T on the 837. The Processor pool has about 7 MB free under normal operation, so the problem occurs only during boot. I have just found other recommendations to use memory-size iomem 5 to make more memory available to the Processor pool. I can try that, turn off some of the features enabled in my config, or buy a 16 MB module.
Jason White

*Mar 1 11:00:44: %SYS-2-MALLOCFAIL: Memory allocation of 10260 bytes failed from 0x80488920

This message indicates that the process is unable to find a large enough block of contiguous memory. In the output above, the Init process attempted to get 10260 bytes from the Processor pool of memory.
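The MALLOCFAIL message in the original post already carries the key numbers: the size requested, the pool, and the free bytes in that pool. The following is a minimal Python sketch, assuming the message has been joined into a single string, that pulls those fields out and checks whether the pool had more free memory than was requested. The function name parse_mallocfail and the regular expression are illustrative only and are not part of any Cisco tool.

```python
import re

# Matches the fields quoted in the thread's %SYS-2-MALLOCFAIL message.
# Assumption: the whole message is available as one string.
MALLOCFAIL_RE = re.compile(
    r"%SYS-2-MALLOCFAIL: Memory allocation of (?P<requested>\d+) bytes failed"
    r".*?Pool: (?P<pool>\S+)\s+Free: (?P<free>\d+)"
    r"\s+Cause: (?P<cause>.+?)\s+Alternate Pool",
    re.DOTALL,
)


def parse_mallocfail(text):
    """Return the requested size, pool, free bytes and cause, or None if no match."""
    match = MALLOCFAIL_RE.search(text)
    if match is None:
        return None
    requested = int(match.group("requested"))
    free = int(match.group("free"))
    return {
        "requested": requested,
        "pool": match.group("pool"),
        "free": free,
        "cause": match.group("cause"),
        # True when the pool held more free bytes than were requested,
        # i.e. the failure is about contiguity rather than total free memory.
        "free_exceeds_request": free > requested,
    }


message = (
    "*Mar 1 11:00:44: %SYS-2-MALLOCFAIL: Memory allocation of 10260 bytes failed "
    "from 0x80488920, alignment 0 Pool: Processor Free: 25076 "
    "Cause: Memory fragmentation Alternate Pool: None Free: 0 "
    'Cause: No Alternate pool -Process= "Init", ipl= 0, pid= 3'
)
info = parse_mallocfail(message)
print(info)
```

For the message in this thread, free memory (25076 bytes) exceeds the request (10260 bytes), which is consistent with the reported cause of fragmentation: no single free block was large enough.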

Perform these steps.

Step 1. Verify that the router has enough DRAM to support the Cisco IOS Software by issuing the show version command, as shown in this example:

------------------------------------------------

Router#show version
Cisco Internetwork Operating System Software
IOS (tm) C2600 Software (C2600-I-M), Version 12.2(8)T5, RELEASE SOFTWARE (fc1)
TAC Support:
Copyright (c) 1986-2002 by Cisco Systems, Inc.
Compiled Fri 21-Jun-02 08:50 by ccai
Image text-base: 0x80008074, data-base: 0x80A2BD40

ROM: System Bootstrap, Version 12.2(7r) [cmong 7r], RELEASE SOFTWARE (fc1)

morannon uptime is 5 days, 1 hour, 27 minutes
System returned to ROM by power-on
System image file is "flash:c2600-i-mz.122-8.T5.bin"

Cisco 2620XM (MPC860P) processor (revision 0x100) with 28672K/4096K bytes of memory.
Processor board ID JAD071008B1 (1020186390)

------------------------------------------------

The router in the example above is running Cisco IOS Software release 12.2(8)T5 with the I feature set (C2600-I-M). In addition, it has 32 MB of DRAM (28672K/4096K).
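The 32 MB figure comes from adding the two numbers on the processor line: the first is processor memory and the second is I/O memory, both in kilobytes. A small Python sketch of that arithmetic follows; the function name total_dram_kb is hypothetical and the regular expression assumes the line format shown in the example.

```python
import re


def total_dram_kb(show_version_text):
    """Return (processor_kb, io_kb, total_kb) from the 'with xK/yK bytes of memory' line."""
    match = re.search(r"with (\d+)K/(\d+)K bytes of memory", show_version_text)
    if match is None:
        raise ValueError("no 'xK/yK bytes of memory' line found")
    processor_kb = int(match.group(1))
    io_kb = int(match.group(2))
    return processor_kb, io_kb, processor_kb + io_kb


line = (
    "Cisco 2620XM (MPC860P) processor (revision 0x100) "
    "with 28672K/4096K bytes of memory."
)
processor_kb, io_kb, total_kb = total_dram_kb(line)
print(total_kb // 1024, "MB")  # 28672K + 4096K = 32768K = 32 MB
```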

Step 2. Verify the minimum memory requirements for the Cisco IOS Software in use, which depend on the platform, train and feature set. To check them, paste the show version command output into the Output Interpreter tool, then follow the link provided in the Cisco IOS Image Software Advisor-IOS Image Name section of the analysis output.

Step 3. Verify that the minimum recommended amount of DRAM is installed on the router. A large routing table requires more than the minimum DRAM requirements.

For specific information on DRAM memory requirements when using Border Gateway Protocol (BGP), refer to Achieve Optimal Routing and Reduce BGP Memory Consumption.

The Cisco 2600, 3600 and 3700 series routers require a minimum amount of I/O memory to support certain interface processors. To determine the correct amount, refer to the Memory Calculator Tool.

If the amount of installed memory is less than the recommended amount, perform one of these options:

A. Add memory.
B. Go to a Cisco IOS Software version that is supported with the current amount of memory.

Step 4. Some applications have features, such as the User Tracking (UT) Discovery feature of CiscoWorks, which can result in low memory conditions unless the ip cef command is issued.

Step 5. Memory allocation failures can be caused by a memory leak bug or memory fragmentation. In this case, analyze the output of the show memory command by pasting the output into the Output Interpreter tool.

Step 6. To determine if fragmentation is occurring, compare the Largest and Free fields by issuing the show memory [summary] command.

Fragmentation is taking place if the number in the Largest field is much smaller than the number in the Free field. This is because the Largest field indicates the largest contiguous free memory block and it should normally be close to the free memory, as shown in this example:

------------------------------------------------

Router#show memory summary
            Head   Total(b)   Used(b)    Free(b)  Lowest(b)  Largest(b)
Processor  C0E48   13885880   1615712   12270168   12077808    12077808
I/O       E00000    2097152    398396    1698756    1698756     1698588

This is a brief description of the fields:

Total is the total memory allocated to the processor or I/O memory. This does not include the amount of memory taken up by the Cisco IOS Software.

Used is the amount of memory being used at the time the show memory [summary] command is issued.

Free is the amount of available free memory at the time the show memory [summary] command is issued.

Lowest is the lowest amount of memory available since the last reload.

Largest is the largest amount of free contiguous memory at the time the show memory [summary] command is issued. This should normally be close to the free memory. A small number compared to the free memory indicates fragmentation.
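The Largest versus Free comparison described above can be sketched in Python as follows. This is only an illustration: the parser assumes the column layout of the example output, the 0.5 threshold is an arbitrary assumption rather than a Cisco recommendation, and the second pool in the usage lines uses a hypothetical Largest value, since the thread reports only the Free figure (25076) at the time of the failure.

```python
def parse_memory_summary(text):
    """Parse pool rows of 'show memory summary' into a dict keyed by pool name."""
    counters = ("total", "used", "free", "lowest", "largest")
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # A pool row is: name, hex head address, then five decimal counters.
        if len(parts) == 7 and all(p.isdigit() for p in parts[2:]):
            row = {"head": parts[1]}
            for key, value in zip(counters, parts[2:]):
                row[key] = int(value)
            pools[parts[0]] = row
    return pools


def largest_to_free_ratio(pool):
    """Ratio of the largest contiguous free block to total free memory."""
    return pool["largest"] / pool["free"] if pool["free"] else 0.0


def is_fragmented(pool, threshold=0.5):
    """Assumed rule of thumb: fragmented if Largest is under `threshold` of Free."""
    return largest_to_free_ratio(pool) < threshold


sample = """\
            Head   Total(b)   Used(b)    Free(b)  Lowest(b)  Largest(b)
Processor  C0E48   13885880   1615712   12270168   12077808    12077808
I/O       E00000    2097152    398396    1698756    1698756     1698588
"""
pools = parse_memory_summary(sample)
for name, pool in pools.items():
    print(name, round(largest_to_free_ratio(pool), 3), is_fragmented(pool))

# Hypothetical pool: Free as reported in the thread, Largest invented for illustration.
hypothetical = {"free": 25076, "largest": 8000}
print(is_fragmented(hypothetical))
```

In the example output both pools have Largest close to Free, so neither is flagged. A pool where Largest is well below the size being requested would fail the allocation even though Free looks sufficient.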

Step 7. To determine if a memory leak is occurring, capture the output of the show memory [summary] command several times at regular intervals. The intervals depend on the length of time it takes for the memory allocation failures to appear. If the router begins to display the errors after four days, then one or two captures per day is sufficient to establish a pattern.

If the free memory steadily decreases, a memory leak may be occurring.
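One simple way to look for that pattern is to record the Free(b) value from each capture and check whether it falls from one capture to the next. The Python sketch below is deliberately strict (every sample must be lower than the one before) and uses invented sample values; real traffic causes normal fluctuation, so treat it as a starting point rather than a diagnosis.

```python
def free_memory_steadily_decreasing(samples, minimum_samples=3):
    """Return True if each Free(b) sample is lower than the previous one.

    `samples` is a list of Free(b) values in capture order. Fewer than
    `minimum_samples` captures is treated as not enough data.
    """
    if len(samples) < minimum_samples:
        return False
    return all(later < earlier for earlier, later in zip(samples, samples[1:]))


# Hypothetical Free(b) values captured twice a day over three days.
leaking = [12270168, 11890412, 11502200, 11120764, 10733908, 10350112]
stable = [12270168, 12268900, 12271004, 12269512, 12270020, 12270168]

print(free_memory_steadily_decreasing(leaking))
print(free_memory_steadily_decreasing(stable))
```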

A memory leak occurs when a process takes and uses memory, but does not release the memory back to the system. To determine the offending process, issue the show processes memory command and perform these steps:

Step A. To determine which process is not freeing memory back to the system, capture the show processes memory command output several times at regular intervals.

Step B. The two counters used for this capture are Freed and Holding. If the Holding counter for a process increases, but the Freed counter does not increase, that process may be the cause of the memory leak.

Step C. Once the process is identified, use the Bug Toolkit to search for a known memory leak in that process that affects the Cisco IOS Software currently installed on the router.
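The Holding and Freed comparison in Steps A and B can be sketched as a comparison of two captures. The Python below assumes the counters have already been extracted into dictionaries keyed by process name; the process names and numbers are invented for illustration and leak_suspects is a hypothetical helper, not a Cisco tool.

```python
def leak_suspects(before, after):
    """Return (process, holding_growth) pairs where Holding grew but Freed did not.

    `before` and `after` map a process name to {"freed": int, "holding": int},
    taken from two captures of 'show processes memory'.
    """
    suspects = []
    for name, later in after.items():
        earlier = before.get(name)
        if earlier is None:
            continue  # process not present in the first capture
        holding_grew = later["holding"] > earlier["holding"]
        freed_unchanged = later["freed"] <= earlier["freed"]
        if holding_grew and freed_unchanged:
            suspects.append((name, later["holding"] - earlier["holding"]))
    # Largest growth first.
    return sorted(suspects, key=lambda item: item[1], reverse=True)


# Hypothetical captures taken a day apart.
capture_1 = {
    "Init": {"freed": 5210432, "holding": 6811204},
    "IP Input": {"freed": 912384, "holding": 48220},
    "Example Proc": {"freed": 1024, "holding": 150000},
}
capture_2 = {
    "Init": {"freed": 5210432, "holding": 6811204},
    "IP Input": {"freed": 1830112, "holding": 52004},
    "Example Proc": {"freed": 1024, "holding": 910000},
}
suspects = leak_suspects(capture_1, capture_2)
print(suspects)
```

Here only "Example Proc" is reported: its Holding value grew while Freed stayed flat. "IP Input" also holds more, but its Freed counter is rising, so it is allocating and releasing memory normally.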

For more information on solving memory issues, refer to Troubleshooting Memory Problems.

Sincerely,

Brad Reese


Hi, I have seen exactly this on an 837, but an older one that I think had 48 MB of DRAM and was on 12.3 (not sure of the exact version).

I successfully worked around it with the iomem setting. Luckily (as I understand it) I/O memory is allocated statically, so you can determine whether you will have enough from a show mem on the router if it does boot.

The amount you need depends basically on the number of interfaces configured (maybe virtual as well as physical).

One other idea might be to have a look at the buffers and arrange for fewer misses and failures, if there are any. The concept is that the box is trying to do something, but if a buffer failure occurs then that action is suspended pending a fix for the buffer issue. The box then goes on to the next job, which needs more memory.

I just add permanent buffers /gradually/ until I eliminate the majority of the misses and pretty much all of the failures. Always check how much memory you have before adding buffers.

Finally, I don't think that the 837 is big enough to live on a busy network, e.g. one with a lot of broadcasts. This may be what is causing the issues. Yes, check the buffers.

It is pretty clearly a bug. I fancy a version change, later or earlier, or a TAC case.

anybody43

I issued no exception crashinfo flash to prevent the router from writing out a crashinfo file if it ran out of memory, scheduled a reload, and turned off the two computers that were connected to the router. It came back up with about 300k in the "Lowest" column of the show memory statistics display for the Processor pool.

I will also bear in mind the other suggestions that have been made, and Brad's helpful references - which led me to a document on memory management that was useful.

Jason White

It would be helpful if you posted more information, in particular how much free memory there is.

The first few lines of sh mem would be useful.

300k is very little nowadays.

Your options are:-

  1. You have the wrong amount of memory.
  2. It's a bug and you need to change the software version.
  3. It's a bug and you have to get Cisco to fix it.

You may be able to work around it by:-

  1. Iomem options.
  2. If you have sufficient free memory after initialisation, locating the offending process(es). This is pretty hard: analyse the sh proc mem output for the amounts allocated and freed, getbufs and retbufs, and if you are VERY lucky you may be able to figure out which process needs a lot of initialisation memory. If you are even luckier, you will be able to turn off that process and still get the result that you want.
anybody43
