* Asus CUV4X-D, 2.4.3 crashes at boot
@ 2001-04-01 4:15 Simon Garner
2001-04-01 5:13 ` Allen Campbell
0 siblings, 1 reply; 15+ messages in thread
From: Simon Garner @ 2001-04-01 4:15 UTC (permalink / raw)
To: linux-smp; +Cc: linux-kernel
Hi,
I've compiled kernel 2.4.3 on the following RH7 system, and I'm now getting
random crashes at boot, during IO-APIC initialisation. Random meaning that
sometimes it boots fine, other times it doesn't, and it hangs in different
places (but always around IO-APIC stuff). It almost always hangs after a
cold boot - if I do a Ctrl+Alt+Del then it will usually boot up OK.
System: Asus CUV4X-D motherboard, Dual P3 800EB.
The last thing I see on the screen when it hangs is, for example:
CPU1: Intel Pentium III (Coppermine) stepping 06
CPU has booted.
Before bogomips.
Total of 2 processors activated (3207.98 BogoMIPS).
Before bogocount - setting activated=1.
Boot done.
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... ok.
Synchronizing Arb IDs.
...TIMER: vector=49 pin1=2 pin2=0
Sometimes it gets a little further, but it's always somewhere near the
IO-APIC stuff.
When it does boot, I get:
CPU1: Intel Pentium III (Coppermine) stepping 06
CPU has booted.
Before bogomips.
Total of 2 processors activated (3207.98 BogoMIPS).
Before bogocount - setting activated=1.
Boot done.
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... ok.
Synchronizing Arb IDs.
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-5, 2-10, 2-11, 2-13, 2-19, 2-20, 2-21, 2-22, 2-23
not connected.
..TIMER: vector=49 pin1=2 pin2=0
number of MP IRQ sources: 17.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00178011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
WARNING: unexpected IO-APIC, please mail
to linux-smp@vger.kernel.org
.... register #02: 00000000
....... : arbitration: 00
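The register #01 value in the dump above can be decoded by hand. What follows is a minimal sketch, assuming the version-register layout from the Intel 82093AA IO-APIC datasheet (bits 0-7 hold the version, bits 16-23 hold the maximum redirection entry). The function name and the "other_bits" field are hypothetical and used only for illustration; this is not the kernel's own code.

```python
def decode_ioapic_version_register(value: int) -> dict:
    """Split an IO-APIC register #01 value into its documented fields.

    Assumed layout (Intel 82093AA datasheet): bits 0-7 are the version,
    bits 16-23 are the maximum redirection entry. Everything else is
    collected in "other_bits" so it can be inspected separately.
    """
    return {
        "version": value & 0xFF,
        "max_redirection_entry": (value >> 16) & 0xFF,
        "other_bits": value & 0xFF00FF00,
    }


fields = decode_ioapic_version_register(0x00178011)
print(f"IO APIC version: {fields['version']:04X}")                      # 0011
print(f"max redirection entries: {fields['max_redirection_entry']:04X}")  # 0017
print(f"bits outside both fields: {fields['other_bits']:08X}")          # 00008000
```

The decoded values agree with the log: version 0011, and a maximum redirection entry of 0x17, which is 23, hence the 24 registers reported. The value also has bit 15 set outside the two documented fields; whether that is what triggers the "unexpected IO-APIC" warning is not established in this thread.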
Full dmesg output:
http://www.expio.co.nz/~sgarner/orion/smp/dmesg.txt
My kernel .config:
http://www.expio.co.nz/~sgarner/orion/smp/config.txt
Output from lspci -xx:
http://www.expio.co.nz/~sgarner/orion/smp/lspcixx.txt
Any ideas?
Thanks in advance,
Simon Garner
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 4:15 Asus CUV4X-D, 2.4.3 crashes at boot Simon Garner
@ 2001-04-01 5:13 ` Allen Campbell
2001-04-01 9:18 ` Simon Garner
2001-04-02 22:27 ` Alan Cox
0 siblings, 2 replies; 15+ messages in thread
From: Allen Campbell @ 2001-04-01 5:13 UTC (permalink / raw)
To: Simon Garner; +Cc: linux-kernel
On Sun, Apr 01, 2001 at 04:15:38PM +1200, Simon Garner wrote:
> Hi,
>
> I've compiled kernel 2.4.3 on the following RH7 system, and I'm now getting
> random crashes at boot, during IO-APIC initialisation. Random meaning that
> sometimes it boots fine, other times it doesn't, and it hangs in different
> places (but always around IO-APIC stuff). It almost always hangs after a
> cold boot - if I do a Ctrl+Alt+Del then it will usually boot up OK.
>
> System: Asus CUV4X-D motherboard, Dual P3 800EB.
>
> The last thing I see on the screen when it hangs is, for example:
[snip]
I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
2.4.2 (debian woody). In addition, the kernel would sometimes hang
around NMI watchdog enable. At least, I think it's trying to
`enable'. The hang would occur around 50% of boot attempts. Once
booted, everything was stable. A non-SMP 2.4.2 kernel (no IO-APIC
either, sorry, didn't test that) always booted without hangs.
Strangely, (happily for me,) the boot hangs stopped with 2.4.3.
I've booted maybe 10 times (hot and cold) since I built 2.4.3 and
I've had no hangs. When I get back to the box, I'll try booting
a few dozen more times and see if I can confirm your observation.
--
Allen Campbell
allenc@campbell.cwx.net
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 5:13 ` Allen Campbell
@ 2001-04-01 9:18 ` Simon Garner
2001-04-01 9:47 ` Allen Campbell
2001-04-02 22:27 ` Alan Cox
1 sibling, 1 reply; 15+ messages in thread
From: Simon Garner @ 2001-04-01 9:18 UTC (permalink / raw)
To: Allen Campbell; +Cc: linux-kernel
From: "Allen Campbell" <lkml@campbell.cwx.net>
> I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
> 2.4.2 (debian woody). In addition, the kernel would sometimes hang
> around NMI watchdog enable. At least, I think it's trying to
> `enable'. The hang would occur around 50% of boot attempts. Once
> booted, everything was stable. A non-SMP 2.4.2 kernel (no IO-APIC
> either, sorry, didn't test that) always booted without hangs.
Yep, sounds like the same problem.
>
> Strangely, (happily for me,) the boot hangs stopped with 2.4.3.
> I've booted maybe 10 times (hot and cold) since I built 2.4.3 and
> I've had no hangs. When I get back to the box, I'll try booting
> a few dozen more times and see if I can confirm your observation.
>
Please do test it. I think you'll find the problem is still very much
present.
Cheers
Simon Garner
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 9:18 ` Simon Garner
@ 2001-04-01 9:47 ` Allen Campbell
0 siblings, 0 replies; 15+ messages in thread
From: Allen Campbell @ 2001-04-01 9:47 UTC (permalink / raw)
To: Simon Garner; +Cc: linux-kernel
On Sun, Apr 01, 2001 at 09:18:25PM +1200, Simon Garner wrote:
> From: "Allen Campbell" <lkml@campbell.cwx.net>
>
> > I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
> > 2.4.2 (debian woody). In addition, the kernel would sometimes hang
> > around NMI watchdog enable. At least, I think it's trying to
> > `enable'. The hang would occur around 50% of boot attempts. Once
> > booted, everything was stable. A non-SMP 2.4.2 kernel (no IO-APIC
> > either, sorry, didn't test that) always booted without hangs.
>
> Yep, sounds like the same problem.
>
>
> >
> > Strangely, (happily for me,) the boot hangs stopped with 2.4.3.
> > I've booted maybe 10 times (hot and cold) since I built 2.4.3 and
> > I've had no hangs. When I get back to the box, I'll try booting
> > a few dozen more times and see if I can confirm your observation.
> >
>
> Please do test it. I think you'll find the problem is still very much
> present.
Yeah, still there. Cold boot only.
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 5:13 ` Allen Campbell
2001-04-01 9:18 ` Simon Garner
@ 2001-04-02 22:27 ` Alan Cox
2001-04-02 22:40 ` Simon Garner
1 sibling, 1 reply; 15+ messages in thread
From: Alan Cox @ 2001-04-02 22:27 UTC (permalink / raw)
To: Allen Campbell; +Cc: Simon Garner, linux-kernel
> I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
> 2.4.2 (debian woody). In addition, the kernel would sometimes hang
> around NMI watchdog enable. At least, I think it's trying to
Known problem. That's one reason why -ac trees had nmi watchdog turned off.
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-02 22:27 ` Alan Cox
@ 2001-04-02 22:40 ` Simon Garner
2001-04-03 6:47 ` Allen Campbell
0 siblings, 1 reply; 15+ messages in thread
From: Simon Garner @ 2001-04-02 22:40 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-kernel
From: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
> > I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
> > 2.4.2 (debian woody). In addition, the kernel would sometimes hang
> > around NMI watchdog enable. At least, I think it's trying to
>
> Known problem. That's one reason why -ac trees had nmi watchdog turned off.
It still crashes with nmi_watchdog turned off.
Running with noapic fixes it but then the system crashes if you access the
RTC with hwclock (and probably creates a hundred other problems...).
How can I get this chipset/motherboard supported properly under Linux? I'm
happy to test patches etc. on the box. *pleading*
Cheers
Simon Garner
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-02 22:40 ` Simon Garner
@ 2001-04-03 6:47 ` Allen Campbell
2001-04-03 6:53 ` Simon Garner
0 siblings, 1 reply; 15+ messages in thread
From: Allen Campbell @ 2001-04-03 6:47 UTC (permalink / raw)
To: Simon Garner; +Cc: Alan Cox, linux-kernel
On Tue, Apr 03, 2001 at 10:40:36AM +1200, Simon Garner wrote:
> From: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
>
> > > I've seen the exact same behavior with my CUV4X-D (2x1GHz) under
> > > 2.4.2 (debian woody). In addition, the kernel would sometimes hang
> > > around NMI watchdog enable. At least, I think it's trying to
> >
> > Known problem. That's one reason why -ac trees had nmi watchdog turned off.
>
> It still crashes with nmi_watchdog turned off.
>
> Running with noapic fixes it but then the system crashes if you access the
> RTC with hwclock (and probably creates a hundred other problems...).
>
> How can I get this chipset/motherboard supported properly under Linux? I'm
> happy to test patches etc. on the box. *pleading*
Patience is likely to be effective. The chipset isn't exactly rare
being on SMP boards from Gigabyte, MSI, Tyan and Asus, and likely
others. I'm betting it will be fixed soon enough. UP and 2.2.x
kernels worked fine here if you're really desperate. OTOH, the
board is stable once you get past the boot problems... What sort
of production system needs frequent unattended boots?
Sorry about this, I just don't remember signing any paychecks for
what I know is likely to be a non-issue probably before the next
time I actually have to do something drastic, like reboot.
--
Allen Campbell | Lurking at the bottom of the
allenc@verinet.com | gravity well, getting old.
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-03 6:47 ` Allen Campbell
@ 2001-04-03 6:53 ` Simon Garner
0 siblings, 0 replies; 15+ messages in thread
From: Simon Garner @ 2001-04-03 6:53 UTC (permalink / raw)
To: Allen Campbell; +Cc: Alan Cox, linux-kernel
From: "Allen Campbell" <lkml@campbell.cwx.net>
> Patience is likely to be effective. The chipset isn't exactly rare
> being on SMP boards from Gigabyte, MSI, Tyan and Asus, and likely
> others. I'm betting it will be fixed soon enough. UP and 2.2.x
> kernels worked fine here if you're really desperate. OTOH, the
> board is stable once you get past the boot problems... What sort
> of production system needs frequent unattended boots?
>
I was planning to install the box as a colocated production webserver in 1-2
weeks' time.
I don't want to colocate a box that I cannot reboot, so I'll just have to
sit on it until it's fixed I guess.
Regards
Simon Garner
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
@ 2001-04-01 9:55 Mikael Pettersson
0 siblings, 0 replies; 15+ messages in thread
From: Mikael Pettersson @ 2001-04-01 9:55 UTC (permalink / raw)
To: linux-smp, sgarner; +Cc: linux-kernel
Simon Garner wrote:
>I've compiled kernel 2.4.3 on the following RH7 system, and I'm now getting
>random crashes at boot, during IO-APIC initialisation. Random meaning that
>sometimes it boots fine, other times it doesn't, and it hangs in different
>places (but always around IO-APIC stuff). It almost always hangs after a
>cold boot - if I do a Ctrl+Alt+Del then it will usually boot up OK.
>
>System: Asus CUV4X-D motherboard, Dual P3 800EB.
>...
>Any ideas?
Boot with "nmi_watchdog=0" as a boot parameter. Does it work now?
Some people have reported before here that the IO-APIC driven NMI
watchdog itself can cause boot-time hangs.
/Mikael
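When testing boot parameters such as "nmi_watchdog=0" or "noapic", it helps to confirm what the running kernel actually received. A minimal sketch, assuming the command line is available as a single string (on Linux it can be read from /proc/cmdline); the function name and the sample command line are hypothetical:

```python
def parse_kernel_cmdline(cmdline: str) -> dict:
    """Parse a kernel command line into a dict.

    "key=value" tokens map key -> value (as a string); bare flags such
    as "noapic" map to True. Quoted values containing spaces are not
    handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


# Hypothetical command line, as it might appear after typing
# "linux noapic nmi_watchdog=0" at the boot prompt.
sample = "auto BOOT_IMAGE=linux ro noapic nmi_watchdog=0"
params = parse_kernel_cmdline(sample)
print(params.get("nmi_watchdog"))  # 0
print(params.get("noapic", False))  # True
```

On a live system the same function can be fed the contents of /proc/cmdline. Note that a parameter written with a space instead of "=" would be split into two unrelated tokens.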
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
@ 2001-04-01 10:04 Simon Garner
2001-04-01 10:09 ` David Weinehall
0 siblings, 1 reply; 15+ messages in thread
From: Simon Garner @ 2001-04-01 10:04 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-smp
From: "Mikael Pettersson" <mikpe@csd.uu.se>
> Boot with "nmi_watchdog=0" as a boot parameter. Does it work now?
>
> Some people have reported before here that the IO-APIC driven NMI
> watchdog itself can cause boot-time hangs.
>
> /Mikael
Thanks, but I do not have watchdog support compiled into the kernel.
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 10:04 Simon Garner
@ 2001-04-01 10:09 ` David Weinehall
2001-04-01 12:51 ` Keith Owens
0 siblings, 1 reply; 15+ messages in thread
From: David Weinehall @ 2001-04-01 10:09 UTC (permalink / raw)
To: Simon Garner; +Cc: linux-kernel, linux-smp
On Sun, Apr 01, 2001 at 10:04:17PM +1200, Simon Garner wrote:
> From: "Mikael Pettersson" <mikpe@csd.uu.se>
>
> > Boot with "nmi_watchdog=0" as a boot parameter. Does it work now?
> >
> > Some people have reported before here that the IO-APIC driven NMI
> > watchdog itself can cause boot-time hangs.
> >
> > /Mikael
>
>
> Thanks, but I do not have watchdog support compiled into the kernel.
Doesn't matter. The NMI-watchdog tries to detect SMP-lockups, and is
always present. Unless you specifically disable it on boot.
/David Weinehall
_ _
// David Weinehall <tao@acc.umu.se> /> Northern lights wander \\
// Project MCA Linux hacker // Dance across the winter sky //
\> http://www.acc.umu.se/~tao/ </ Full colour fire </
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 10:09 ` David Weinehall
@ 2001-04-01 12:51 ` Keith Owens
2001-04-01 23:49 ` Simon Garner
0 siblings, 1 reply; 15+ messages in thread
From: Keith Owens @ 2001-04-01 12:51 UTC (permalink / raw)
To: David Weinehall; +Cc: Simon Garner, linux-kernel, linux-smp
On Sun, 1 Apr 2001 12:09:18 +0200,
David Weinehall <tao@acc.umu.se> wrote:
>On Sun, Apr 01, 2001 at 10:04:17PM +1200, Simon Garner wrote:
>> Thanks, but I do not have watchdog support compiled into the kernel.
>
>Doesn't matter. The NMI-watchdog tries to detect SMP-lockups, and is
>always present. Unless you specifically disable it on boot.
Not any more. In 2.4.3-ac* the default is no watchdog and it must be
specifically enabled at boot.
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-01 12:51 ` Keith Owens
@ 2001-04-01 23:49 ` Simon Garner
0 siblings, 0 replies; 15+ messages in thread
From: Simon Garner @ 2001-04-01 23:49 UTC (permalink / raw)
To: linux-kernel, linux-smp
From: "Keith Owens" <kaos@ocs.com.au>
> >Doesn't matter. The NMI-watchdog tries to detect SMP-lockups, and is
> >always present. Unless you specifically disable it on boot.
>
> Not any more. In 2.4.3-ac* the default is no watchdog and it must be
> specifically enabled at boot.
>
nmi_watchdog 0 didn't help - the above would explain why.
Any more ideas? My expensive server is basically useless because of this. :(
[parent not found: <Pine.LNX.3.96.1010401185932.6155D-100000@mandrakesoft.mandrakesoft.com>]
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
[not found] <Pine.LNX.3.96.1010401185932.6155D-100000@mandrakesoft.mandrakesoft.com>
@ 2001-04-02 0:48 ` Simon Garner
2001-04-02 2:57 ` Simon Garner
0 siblings, 1 reply; 15+ messages in thread
From: Simon Garner @ 2001-04-02 0:48 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, linux-smp
From: "Jeff Garzik" <jgarzik@mandrakesoft.com>
> (private reply, because I have lost discussion context)
>
> Have you tried booting with 'noapic'?
>
>
Thanks Jeff, this seems to fix the problem, and also fixes my problem with
the aic7xxx scsi driver ABORTing multiple times at startup (which I presumed
was unrelated).
However, the machine now crashes at "Configuring Kernel Parameters" during
rc initialisation:
Welcome to Red Hat Linux
Press 'I' for interactive startup
Mounting /proc filesystem... [ OK ]
Configuring Kernel Parameters...
This is if I type "linux noapic" at the Lilo boot prompt.
Also, what do I lose by running with noapic?
Thanks
Simon Garner
* Re: Asus CUV4X-D, 2.4.3 crashes at boot
2001-04-02 0:48 ` Simon Garner
@ 2001-04-02 2:57 ` Simon Garner
0 siblings, 0 replies; 15+ messages in thread
From: Simon Garner @ 2001-04-02 2:57 UTC (permalink / raw)
To: linux-kernel, linux-smp
Hi all,
>
> However, the machine now crashes at "Configuring Kernel Parameters" during
> rc initialisation:
>
>
> Welcome to Red Hat Linux
> Press 'I' for interactive startup
>
> Mounting /proc filesystem... [ OK ]
> Configuring Kernel Parameters...
>
>
> This is if I type "linux noapic" at the Lilo boot prompt.
>
> Also, what do I lose by running with noapic?
>
>
Just discovered the above is not quite correct - it actually says [ OK ]
after Configuring Kernel Parameters, and crashes on the next line.
Reading through /etc/rc.d/rc.sysinit, the next line is where it sets the
system clock. If I comment out the line:
/sbin/hwclock $CLOCKFLAGS
Then the system will boot OK with 'noapic'. So presumably the system RTC is
not accessed in an SMP-compatible way without APIC.
Anyway, I'm not too happy about having to run without APIC - seems more of a
workaround than a fix. I'm happy to test patches etc if anyone has any
ideas - this problem I presume affects all motherboards using the VIA 694XDP
chipset.
Thanks in advance,
Simon Garner