* Kernel NMI error
@ 2003-09-16 11:38 msrinath
2003-09-16 13:25 ` Alan Cox
0 siblings, 1 reply; 6+ messages in thread
From: msrinath @ 2003-09-16 11:38 UTC (permalink / raw)
To: linux-kernel
Hello Everybody,
Can anyone help me with this?
Recently one of our servers running RedHat linux 7.2 with 2.4.7-10 SMP
kernel generated the following log and the system rebooted. This system has
2 CPUs.
Sep 16 01:34:24 cbesc ftpd[30753]: FTP LOGIN FROM 16.128.157.7 [16.128.157.7], scuser
Sep 16 01:36:48 cbesc ftpd[30753]: FTP session closed
Sep 16 01:54:30 cbesc kernel: Uhhuh. NMI received for unknown reason 35.
Sep 16 01:54:30 cbesc kernel: Dazed and confused, but trying to continue
Sep 16 01:54:30 cbesc kernel: Do you have a strange power saving mode enabled?
Sep 16 01:54:30 cbesc kernel: eth0: card reports no resources.
Sep 16 01:58:09 cbesc syslogd 1.4.1: restart.
Sep 16 01:58:09 cbesc syslog: syslogd startup succeeded
This is the first time we have faced this problem. The Ethernet card in use is
an Intel eepro100. Its details are shown below.
Sep 16 07:33:53 cbesc kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
Sep 16 07:33:53 cbesc kernel: eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
Sep 16 07:33:53 cbesc kernel: eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:A0:C9:A0:B7:71, IRQ 16.
Sep 16 07:33:53 cbesc kernel: Receiver lock-up bug exists -- enabling work-around.
Sep 16 07:33:53 cbesc kernel: Board assembly 668081-004, Physical connectors present: RJ45
Sep 16 07:33:53 cbesc kernel: Primary interface chip i82555 PHY #1.
Sep 16 07:33:54 cbesc kernel: General self-test: passed.
Sep 16 07:33:54 cbesc kernel: Serial sub-system self-test: passed.
Sep 16 07:33:54 cbesc kernel: Internal registers self-test: passed.
Sep 16 07:33:54 cbesc kernel: ROM checksum self-test: passed (0x3c15c8f1).
Sep 16 07:33:54 cbesc kernel: Receiver lock-up workaround activated.
Please let me know why this happened and whether it indicates any hardware
problem in the system.
Please CC my email address on replies, since I am not subscribed to the
list.
Thanks & Regards,
- Srinath.
--
This message has been scanned for viruses and
dangerous content by Kaspersky on bpl Server, and is
believed to be clean.
bpl www.kaspersky.com
.
* Re: Kernel NMI error
2003-09-16 11:38 Kernel NMI error msrinath
@ 2003-09-16 13:25 ` Alan Cox
2003-09-17 9:51 ` msrinath
0 siblings, 1 reply; 6+ messages in thread
From: Alan Cox @ 2003-09-16 13:25 UTC (permalink / raw)
To: msrinath; +Cc: Linux Kernel Mailing List
On Tue, 2003-09-16 at 12:38, msrinath wrote:
> Recently one of our servers running RedHat linux 7.2 with 2.4.7-10 SMP
> kernel generated the following log and the system rebooted. This system has
> 2 CPUs.
Typically an NMI is a system error. That could be a memory error, or it
could be a freak power glitch if it's only ever happened once.
If you are using a 2.4.7 kernel you really should also update to the
current errata kernel and apply the other updates.
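
As a rough illustration of why the log says "unknown reason", here is a minimal sketch of decoding the reported value. It assumes, as on x86 kernels of this era, that the number in the message is the hexadecimal byte read from I/O port 0x61, where bit 7 flags a memory parity error and bit 6 an I/O check error. The function name decode_nmi_reason is hypothetical and only for illustration.

```python
from typing import List

# Hypothetical helper (not kernel code): decode the reason byte printed in
# "NMI received for unknown reason XX".
# Assumption: XX is the hex value read from I/O port 0x61, where
# bit 7 (0x80) signals a memory parity error and bit 6 (0x40) an I/O check error.

def decode_nmi_reason(reason_hex: str) -> List[str]:
    reason = int(reason_hex, 16)
    causes = []
    if reason & 0x80:
        causes.append("memory parity error")
    if reason & 0x40:
        causes.append("I/O check error")
    if not causes:
        # Neither error bit is set, so the source cannot be identified
        # from this byte alone.
        causes.append("unknown (neither error bit set)")
    return causes

print(decode_nmi_reason("35"))
```

Under that assumption, 0x35 has neither bit set, which is consistent with the kernel being unable to name a cause. It does not rule out a memory or power problem.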
* RE: Kernel NMI error
2003-09-16 13:25 ` Alan Cox
@ 2003-09-17 9:51 ` msrinath
2003-09-17 13:41 ` Alan Cox
0 siblings, 1 reply; 6+ messages in thread
From: msrinath @ 2003-09-17 9:51 UTC (permalink / raw)
To: 'Alan Cox'; +Cc: 'Linux Kernel Mailing List'
Thanks for the reply. This is the only time this has ever happened. How can
I make out if it is a memory error? Is there any way by which I can test it?
Thanks & Regards,
- Srinath
-----Original Message-----
From: Alan Cox [mailto:alan@lxorguk.ukuu.org.uk]
Sent: 16 September 2003 18:55
To: msrinath
Cc: Linux Kernel Mailing List
Subject: Re: Kernel NMI error
On Tue, 2003-09-16 at 12:38, msrinath wrote:
> Recently one of our servers running RedHat linux 7.2 with 2.4.7-10 SMP
> kernel generated the following log and the system rebooted. This system has
> 2 CPUs.
Typically an NMI is a system error. That could be a memory error, or it
could be a freak power glitch if it's only ever happened once.
If you are using a 2.4.7 kernel you really should also update to the
current errata kernel and apply the other updates.
* RE: Kernel NMI error
2003-09-17 9:51 ` msrinath
@ 2003-09-17 13:41 ` Alan Cox
2003-09-18 3:52 ` msrinath
0 siblings, 1 reply; 6+ messages in thread
From: Alan Cox @ 2003-09-17 13:41 UTC (permalink / raw)
To: msrinath; +Cc: 'Linux Kernel Mailing List'
On Wed, 2003-09-17 at 10:51, msrinath wrote:
> Thanks for the reply. This is the only time this has ever happened. How can
> I make out if it is a memory error? Is there any way by which I can test it?
If you can schedule downtime for the machine, run memtest86 on it for a
few hours to check. If not, just see if it happens again, I guess. If it
does, then think about testing the RAM.
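
For the "see if it happens again" approach, a small script can check the syslog for repeat events. The following is a minimal sketch only: the function name find_nmi_events and the regular expression are hypothetical, and the pattern assumes log lines shaped like the ones quoted earlier in this thread.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as:
#   Sep 16 01:54:30 cbesc kernel: Uhhuh. NMI received for unknown reason 35.
# Assumption: the timestamp is the first 15 characters, in classic syslog form.
NMI_PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+kernel:"
    r".*NMI received for unknown reason (?P<reason>[0-9a-fA-F]+)"
)

def find_nmi_events(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, reason) pairs for every NMI message found."""
    events = []
    for line in lines:
        match = NMI_PATTERN.search(line)
        if match:
            events.append((match.group("stamp"), match.group("reason")))
    return events

sample = [
    "Sep 16 01:36:48 cbesc ftpd[30753]: FTP session closed",
    "Sep 16 01:54:30 cbesc kernel: Uhhuh. NMI received for unknown reason 35.",
    "Sep 16 01:54:30 cbesc kernel: Dazed and confused, but trying to continue",
]
print(find_nmi_events(sample))
```

To scan a real log, pass an open file object instead of the sample list. The log path depends on the syslog configuration (for example /var/log/messages, which is an assumption here).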
* RE: Kernel NMI error
2003-09-17 13:41 ` Alan Cox
@ 2003-09-18 3:52 ` msrinath
0 siblings, 0 replies; 6+ messages in thread
From: msrinath @ 2003-09-18 3:52 UTC (permalink / raw)
To: 'Alan Cox'; +Cc: 'Linux Kernel Mailing List'
Thanks. I will wait and watch.
- Srinath
-----Original Message-----
From: Alan Cox [mailto:alan@lxorguk.ukuu.org.uk]
Sent: 17 September 2003 19:11
To: msrinath
Cc: 'Linux Kernel Mailing List'
Subject: RE: Kernel NMI error
On Wed, 2003-09-17 at 10:51, msrinath wrote:
> Thanks for the reply. This is the only time this has ever happened. How can
> I make out if it is a memory error? Is there any way by which I can test it?
If you can schedule downtime for the machine, run memtest86 on it for a
few hours to check. If not, just see if it happens again, I guess. If it
does, then think about testing the RAM.
* Kernel NMI error
@ 2003-09-16 11:13 msrinath
0 siblings, 0 replies; 6+ messages in thread
From: msrinath @ 2003-09-16 11:13 UTC (permalink / raw)
To: linux-kernel
Thread overview: 6+ messages
2003-09-16 11:38 Kernel NMI error msrinath
2003-09-16 13:25 ` Alan Cox
2003-09-17 9:51 ` msrinath
2003-09-17 13:41 ` Alan Cox
2003-09-18 3:52 ` msrinath
-- strict thread matches above, loose matches on Subject: below --
2003-09-16 11:13 msrinath