* apic errors and looping with 2.4, none with 2.2
@ 2004-03-23 12:49 Chris Stromsoe
2004-03-24 10:50 ` apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset) Chris Stromsoe
2004-03-25 16:39 ` apic errors and looping with 2.4, none with 2.2 Mikael Pettersson
0 siblings, 2 replies; 8+ messages in thread
From: Chris Stromsoe @ 2004-03-23 12:49 UTC (permalink / raw)
To: linux-kernel; +Cc: Marcelo Tosatti
I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
starts throwing APIC errors.
The machine is a dual-CPU PIII 933MHz system with 512MB of RAM on a
SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
LE chipset. I'm booting using lilo with append="noapic".
As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
CPUs. Varying combinations of:
Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
Mar 23 00:45:45 dahlia kernel: APIC error on CPU1: 08(08)
Mar 23 00:45:45 dahlia kernel: APIC error on CPU0: 08(08)
Mar 23 00:58:27 dahlia kernel: APIC error on CPU0: 08(01)
Mar 23 00:58:27 dahlia kernel: APIC error on CPU1: 08(02)
Mar 23 01:04:54 dahlia kernel: APIC error on CPU1: 02(02)
Mar 23 01:04:54 dahlia kernel: APIC error on CPU0: 01(02)
Mar 23 01:05:46 dahlia kernel: APIC error on CPU1: 02(08)
Mar 23 01:05:46 dahlia kernel: APIC error on CPU0: 02(08)
Mar 23 01:08:37 dahlia kernel: APIC error on CPU1: 08(02)
Mar 23 01:08:37 dahlia kernel: APIC error on CPU0: 08(02)
Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(02)
Mar 23 01:11:04 dahlia kernel: APIC error on CPU0: 02(0a)
Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(08)
Mar 23 01:25:45 dahlia kernel: APIC error on CPU1: 08(08)
Mar 23 01:25:45 dahlia kernel: APIC error on CPU0: 0a(08)
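A minimal sketch for summarizing log lines like the ones above, in case it helps to see which CPU/code combinations dominate. The names `APIC_LINE` and `tally_apic_errors` are made up for illustration; the only thing assumed about the log format is what is visible in the lines quoted here.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
APIC_LINE = re.compile(
    r"APIC error on CPU(?P<cpu>\d+): "
    r"(?P<first>[0-9a-f]{2})\((?P<second>[0-9a-f]{2})\)"
)

def tally_apic_errors(lines):
    """Count how often each (cpu, first, second) triple appears."""
    counts = Counter()
    for line in lines:
        m = APIC_LINE.search(line)
        if m:
            counts[(int(m["cpu"]), m["first"], m["second"])] += 1
    return counts

# Three lines copied from the log excerpt above.
sample = [
    "Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)",
    "Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)",
    "Mar 23 01:05:46 dahlia kernel: APIC error on CPU0: 02(08)",
]
for key, n in sorted(tally_apic_errors(sample).items()):
    print(key, n)
```

Feeding it a whole syslog file (for example `tally_apic_errors(open(path))`, with `path` pointing wherever the kernel messages are stored) would give the same summary for the full run.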
After a few hours of uptime, the box stops responding to keyboard input.
It begins printing the above messages to console over and over. I have
several other identical machines that I received in the same batch that
run 2.4 without any problems (though they do seem to require "noapic").
It runs fine with 2.2 and is running 2.2.26 right now.
The machine is not in production use and can be used to test. Any ideas
for what I should look at?
-Chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
2004-03-23 12:49 apic errors and looping with 2.4, none with 2.2 Chris Stromsoe
@ 2004-03-24 10:50 ` Chris Stromsoe
2004-03-24 21:27 ` Marcelo Tosatti
2004-03-25 16:39 ` apic errors and looping with 2.4, none with 2.2 Mikael Pettersson
1 sibling, 1 reply; 8+ messages in thread
From: Chris Stromsoe @ 2004-03-24 10:50 UTC (permalink / raw)
To: linux-kernel; +Cc: Marcelo Tosatti
I've rebooted with noapic and nolapic and the machine seemed to be stable
for a while. Then I got:
Mar 24 00:27:08 dahlia kernel: APIC error on CPU1: 00(02)
Mar 24 00:27:08 dahlia kernel: APIC error on CPU0: 00(02)
Mar 24 00:27:08 dahlia kernel: spurious APIC interrupt on CPU#0, should never happen.
Mar 24 00:27:13 dahlia kernel: APIC error on CPU1: 02(08)
Mar 24 00:27:13 dahlia kernel: APIC error on CPU0: 02(08)
Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 08(02)
Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 02(08)
Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 02(0a)
I added nosmp to the lilo append line and rebooted.
Booting with noapic, nolapic, and nosmp seems to be stable. I haven't had
anything logged in the last 2 hours. Are there known APIC or SMP problems
with ServerWorks LE chipsets or SuperMicro motherboards and 2.4? What are
the steps for troubleshooting an APIC problem?
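When cycling through combinations of these options it is easy to lose track of which ones the running kernel was actually booted with. A minimal sketch that checks a kernel command line for the three flags mentioned in this thread follows; `WORKAROUND_FLAGS` and `active_workarounds` are hypothetical names, and the fallback string is only an example, not this machine's real command line.

```python
from pathlib import Path

# The three boot options discussed in this thread.
WORKAROUND_FLAGS = ("noapic", "nolapic", "nosmp")

def active_workarounds(cmdline):
    """Return the workaround flags present in a kernel command line string."""
    tokens = cmdline.split()
    return [flag for flag in WORKAROUND_FLAGS if flag in tokens]

if __name__ == "__main__":
    path = Path("/proc/cmdline")
    # /proc/cmdline exists on Linux only; elsewhere use an example string.
    if path.exists():
        text = path.read_text()
    else:
        text = "ro root=/dev/sda1 noapic nolapic nosmp"
    print(active_workarounds(text))
```

Matching whole tokens rather than substrings keeps "noapic" from being reported when only "pci=noacpi" or a similar option is present.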
-Chris
On Tue, 23 Mar 2004, Chris Stromsoe wrote:
> I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
> starts throwing APIC errors.
>
> The machine is a dual CPU pIII 933MHz system with 512Mb ram on a
> SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
> LE chipset. I'm booting using lilo with append="noapic".
>
> As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
> CPUs. Varying combinations of:
>
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
> Mar 23 00:45:45 dahlia kernel: APIC error on CPU1: 08(08)
> Mar 23 00:45:45 dahlia kernel: APIC error on CPU0: 08(08)
> Mar 23 00:58:27 dahlia kernel: APIC error on CPU0: 08(01)
> Mar 23 00:58:27 dahlia kernel: APIC error on CPU1: 08(02)
> Mar 23 01:04:54 dahlia kernel: APIC error on CPU1: 02(02)
> Mar 23 01:04:54 dahlia kernel: APIC error on CPU0: 01(02)
> Mar 23 01:05:46 dahlia kernel: APIC error on CPU1: 02(08)
> Mar 23 01:05:46 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 23 01:08:37 dahlia kernel: APIC error on CPU1: 08(02)
> Mar 23 01:08:37 dahlia kernel: APIC error on CPU0: 08(02)
> Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(02)
> Mar 23 01:11:04 dahlia kernel: APIC error on CPU0: 02(0a)
> Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(08)
> Mar 23 01:25:45 dahlia kernel: APIC error on CPU1: 08(08)
> Mar 23 01:25:45 dahlia kernel: APIC error on CPU0: 0a(08)
>
> After a few hours of uptime, the box stops responding to keyboard input.
> It begins printing the above messages to console over and over. I have
> several other identical machines that I received in the same batch that
> run 2.4 without any problems (though they do seem to require "noapic").
>
> It runs fine with 2.2 and is running 2.2.26 right now.
>
> The machine is not in production use and can be used to test. Any ideas
> for what I should look at?
>
>
> -Chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
2004-03-24 10:50 ` apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset) Chris Stromsoe
@ 2004-03-24 21:27 ` Marcelo Tosatti
2004-03-24 21:25 ` Chris Stromsoe
[not found] ` <Pine.LNX.4.55.0403251418340.11552@jurand.ds.pg.gda.pl>
0 siblings, 2 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2004-03-24 21:27 UTC (permalink / raw)
To: Chris Stromsoe; +Cc: linux-kernel, Maciej W. Rozycki, mikpe
Chris,
All I know is that similar IO-APIC errors have been seen due to
BIOS/hardware misconfiguration.
Maybe Maciej or Mikael has a better idea of what might be happening.
On Wed, Mar 24, 2004 at 02:50:32AM -0800, Chris Stromsoe wrote:
> I've rebooted with noapic and nolapic and the machine seemed to be stable
> for a while. Then I got:
>
> Mar 24 00:27:08 dahlia kernel: APIC error on CPU1: 00(02)
> Mar 24 00:27:08 dahlia kernel: APIC error on CPU0: 00(02)
> Mar 24 00:27:08 dahlia kernel: spurious APIC interrupt on CPU#0, should never happen.
> Mar 24 00:27:13 dahlia kernel: APIC error on CPU1: 02(08)
> Mar 24 00:27:13 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 08(02)
> Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
> Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
> Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 02(0a)
>
> I added nosmp to the lilo append line and rebooted.
>
> noapic, nolapic, and nosmp seems to be stable. I haven't had anything
> logged in the last 2 hours. Are there known APIC or SMP problems with
> serverworks LE chipsets or supermicro motherboards and 2.4? What are the
> steps to troubleshooting an APIC problem?
>
>
> -Chris
>
> On Tue, 23 Mar 2004, Chris Stromsoe wrote:
>
> > I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
> > starts throwing APIC errors.
> >
> > The machine is a dual CPU pIII 933MHz system with 512Mb ram on a
> > SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
> > LE chipset. I'm booting using lilo with append="noapic".
> >
> > As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
> > CPUs. Varying combinations of:
> >
> > Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
> > Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
> > Mar 23 00:45:45 dahlia kernel: APIC error on CPU1: 08(08)
> > Mar 23 00:45:45 dahlia kernel: APIC error on CPU0: 08(08)
> > Mar 23 00:58:27 dahlia kernel: APIC error on CPU0: 08(01)
> > Mar 23 00:58:27 dahlia kernel: APIC error on CPU1: 08(02)
> > Mar 23 01:04:54 dahlia kernel: APIC error on CPU1: 02(02)
> > Mar 23 01:04:54 dahlia kernel: APIC error on CPU0: 01(02)
> > Mar 23 01:05:46 dahlia kernel: APIC error on CPU1: 02(08)
> > Mar 23 01:05:46 dahlia kernel: APIC error on CPU0: 02(08)
> > Mar 23 01:08:37 dahlia kernel: APIC error on CPU1: 08(02)
> > Mar 23 01:08:37 dahlia kernel: APIC error on CPU0: 08(02)
> > Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(02)
> > Mar 23 01:11:04 dahlia kernel: APIC error on CPU0: 02(0a)
> > Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(08)
> > Mar 23 01:25:45 dahlia kernel: APIC error on CPU1: 08(08)
> > Mar 23 01:25:45 dahlia kernel: APIC error on CPU0: 0a(08)
> >
> > After a few hours of uptime, the box stops responding to keyboard input.
> > It begins printing the above messages to console over and over. I have
> > several other identical machines that I received in the same batch that
> > run 2.4 without any problems (though they do seem to require "noapic").
> >
> > It runs fine with 2.2 and is running 2.2.26 right now.
> >
> > The machine is not in production use and can be used to test. Any ideas
> > for what I should look at?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
2004-03-24 21:27 ` Marcelo Tosatti
@ 2004-03-24 21:25 ` Chris Stromsoe
[not found] ` <Pine.LNX.4.55.0403251418340.11552@jurand.ds.pg.gda.pl>
1 sibling, 0 replies; 8+ messages in thread
From: Chris Stromsoe @ 2004-03-24 21:25 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel, Maciej W. Rozycki, mikpe
I also tested 2.6.5-rc2 with no kernel command line parameters, and the
machine locked up hard (no serial console) after about 2 minutes of
APIC errors. I'll check the BIOS in a few hours when I go to reboot the
machine.
-Chris
On Wed, 24 Mar 2004, Marcelo Tosatti wrote:
>
> Chris,
>
> The least I know is that similar IOAPIC errors have been seen due to
> BIOS/hardware misconfigurations.
>
> Maybe Maciej or Mikael have more clue of what might be happening.
>
> On Wed, Mar 24, 2004 at 02:50:32AM -0800, Chris Stromsoe wrote:
> > I've rebooted with noapic and nolapic and the machine seemed to be stable
> > for a while. Then I got:
> >
> > Mar 24 00:27:08 dahlia kernel: APIC error on CPU1: 00(02)
> > Mar 24 00:27:08 dahlia kernel: APIC error on CPU0: 00(02)
> > Mar 24 00:27:08 dahlia kernel: spurious APIC interrupt on CPU#0, should never happen.
> > Mar 24 00:27:13 dahlia kernel: APIC error on CPU1: 02(08)
> > Mar 24 00:27:13 dahlia kernel: APIC error on CPU0: 02(08)
> > Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 08(02)
> > Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
> > Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 02(08)
> > Mar 24 00:28:07 dahlia kernel: APIC error on CPU0: 08(02)
> > Mar 24 00:28:07 dahlia kernel: APIC error on CPU1: 02(0a)
> >
> > I added nosmp to the lilo append line and rebooted.
> >
> > noapic, nolapic, and nosmp seems to be stable. I haven't had anything
> > logged in the last 2 hours. Are there known APIC or SMP problems with
> > serverworks LE chipsets or supermicro motherboards and 2.4? What are the
> > steps to troubleshooting an APIC problem?
> >
> >
> > -Chris
> >
> > On Tue, 23 Mar 2004, Chris Stromsoe wrote:
> >
> > > I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
> > > starts throwing APIC errors.
> > >
> > > The machine is a dual CPU pIII 933MHz system with 512Mb ram on a
> > > SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
> > > LE chipset. I'm booting using lilo with append="noapic".
> > >
> > > As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
> > > CPUs. Varying combinations of:
> > >
> > > Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
> > > Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
> > > Mar 23 00:45:45 dahlia kernel: APIC error on CPU1: 08(08)
> > > Mar 23 00:45:45 dahlia kernel: APIC error on CPU0: 08(08)
> > > Mar 23 00:58:27 dahlia kernel: APIC error on CPU0: 08(01)
> > > Mar 23 00:58:27 dahlia kernel: APIC error on CPU1: 08(02)
> > > Mar 23 01:04:54 dahlia kernel: APIC error on CPU1: 02(02)
> > > Mar 23 01:04:54 dahlia kernel: APIC error on CPU0: 01(02)
> > > Mar 23 01:05:46 dahlia kernel: APIC error on CPU1: 02(08)
> > > Mar 23 01:05:46 dahlia kernel: APIC error on CPU0: 02(08)
> > > Mar 23 01:08:37 dahlia kernel: APIC error on CPU1: 08(02)
> > > Mar 23 01:08:37 dahlia kernel: APIC error on CPU0: 08(02)
> > > Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(02)
> > > Mar 23 01:11:04 dahlia kernel: APIC error on CPU0: 02(0a)
> > > Mar 23 01:11:04 dahlia kernel: APIC error on CPU1: 02(08)
> > > Mar 23 01:25:45 dahlia kernel: APIC error on CPU1: 08(08)
> > > Mar 23 01:25:45 dahlia kernel: APIC error on CPU0: 0a(08)
> > >
> > > After a few hours of uptime, the box stops responding to keyboard input.
> > > It begins printing the above messages to console over and over. I have
> > > several other identical machines that I received in the same batch that
> > > run 2.4 without any problems (though they do seem to require "noapic").
> > >
> > > It runs fine with 2.2 and is running 2.2.26 right now.
> > >
> > > The machine is not in production use and can be used to test. Any ideas
> > > for what I should look at?
>
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <Pine.LNX.4.55.0403251418340.11552@jurand.ds.pg.gda.pl>]
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
[not found] ` <Pine.LNX.4.55.0403251418340.11552@jurand.ds.pg.gda.pl>
@ 2004-03-26 1:09 ` Chris Stromsoe
2004-04-18 5:12 ` Chris Stromsoe
0 siblings, 1 reply; 8+ messages in thread
From: Chris Stromsoe @ 2004-03-26 1:09 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Marcelo Tosatti, linux-kernel, mikpe
On Thu, 25 Mar 2004, Maciej W. Rozycki wrote:
> On Wed, 24 Mar 2004, Marcelo Tosatti wrote:
>
> > Maybe Maciej or Mikael have more clue of what might be happening.
> >
> > On Wed, Mar 24, 2004 at 02:50:32AM -0800, Chris Stromsoe wrote:
> > > I've rebooted with noapic and nolapic and the machine seemed to be
> > > stable for a while. Then I got:
> > >
> > > Mar 24 00:27:08 dahlia kernel: APIC error on CPU1: 00(02)
> > > Mar 24 00:27:08 dahlia kernel: APIC error on CPU0: 00(02)
> > > Mar 24 00:27:08 dahlia kernel: spurious APIC interrupt on CPU#0, should never happen.
> > > Mar 24 00:27:13 dahlia kernel: APIC error on CPU1: 02(08)
> > > Mar 24 00:27:13 dahlia kernel: APIC error on CPU0: 02(08)
> [...]
> > > I added nosmp to the lilo append line and rebooted.
> > >
> > > noapic, nolapic, and nosmp seems to be stable. I haven't had
> > > anything logged in the last 2 hours. Are there known APIC or SMP
> > > problems with serverworks LE chipsets or supermicro motherboards and
> > > 2.4? What are the steps to troubleshooting an APIC problem?
>
> As long as you boot more than a single CPU, local APIC units are used
> at least to send IPIs. The error messages you see report receive
> checksum and receive acceptance errors. The latter result from the
> former, and all of them, including the spurious APIC interrupt, are
> results of signal errors (noise?) during transmission over the
> inter-APIC serial bus. This is a hardware problem. I'd start by
> checking the power supply first.
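For reference, the hex values in the log lines can be decoded bit by bit. The sketch below assumes the local APIC error status register layout described in the comment above the i386 APIC error handler in the kernel source (bit 0 send checksum, bit 1 receive checksum, bit 2 send accept, bit 3 receive accept, and so on); that table is not given anywhere in this thread, and `ESR_BITS` and `decode_esr` are names invented for the example.

```python
# Assumed bit layout of the local APIC error status register (ESR).
# Bit 4 is reserved and therefore omitted.
ESR_BITS = {
    0: "send checksum error",
    1: "receive checksum error",
    2: "send accept error",
    3: "receive accept error",
    5: "send illegal vector",
    6: "received illegal vector",
    7: "illegal register address",
}

def decode_esr(hex_value):
    """Turn a two-digit hex string from the log into a list of error names."""
    value = int(hex_value, 16)
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]

# The distinct non-zero codes that appear in the logs quoted in this thread.
for code in ("01", "02", "08", "0a"):
    print(code, decode_esr(code))
```

Under that assumed layout, 02 and 08 come out as receive checksum and receive accept errors, which is consistent with the description quoted above.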
The only way that I've been able to boot and stay up is with nosmp,
noapic, and nolapic. I'll try replacing the power supply and see if that
helps things out. It's going to take me a few days to get a replacement
-- is there anything else that I should check while I'm waiting?
-Chris
> --
> + Maciej W. Rozycki, Technical University of Gdansk, Poland +
> +--------------------------------------------------------------+
> + e-mail: macro@ds2.pg.gda.pl, PGP key available +
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
2004-03-26 1:09 ` Chris Stromsoe
@ 2004-04-18 5:12 ` Chris Stromsoe
2004-04-21 14:28 ` Maciej W. Rozycki
0 siblings, 1 reply; 8+ messages in thread
From: Chris Stromsoe @ 2004-04-18 5:12 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Marcelo Tosatti, linux-kernel, mikpe
On Thu, 25 Mar 2004, Chris Stromsoe wrote:
> On Thu, 25 Mar 2004, Maciej W. Rozycki wrote:
> > On Wed, 24 Mar 2004, Marcelo Tosatti wrote:
> >
> > > Maybe Maciej or Mikael have more clue of what might be happening.
> > >
> > > On Wed, Mar 24, 2004 at 02:50:32AM -0800, Chris Stromsoe wrote:
> > > > I've rebooted with noapic and nolapic and the machine seemed to be
> > > > stable for a while. Then I got:
> > > >
> > > > Mar 24 00:27:08 dahlia kernel: APIC error on CPU1: 00(02)
> > > > Mar 24 00:27:08 dahlia kernel: APIC error on CPU0: 00(02)
> > > > Mar 24 00:27:08 dahlia kernel: spurious APIC interrupt on CPU#0, should never happen.
> > > > Mar 24 00:27:13 dahlia kernel: APIC error on CPU1: 02(08)
> > > > Mar 24 00:27:13 dahlia kernel: APIC error on CPU0: 02(08)
> > [...]
> > > > I added nosmp to the lilo append line and rebooted.
> > > >
> > > > noapic, nolapic, and nosmp seems to be stable. I haven't had
> > > > anything logged in the last 2 hours. Are there known APIC or SMP
> > > > problems with serverworks LE chipsets or supermicro motherboards and
> > > > 2.4? What are the steps to troubleshooting an APIC problem?
> >
> > As long as you boot more than a single CPU, local APIC units are used
> > at least to send IPIs. The error messages you see report receive
> > checksum and receive acceptance errors. The latter result from the
> > former, and all of them, including the spurious APIC interrupt, are
> > results of signal errors (noise?) during transmission over the
> > inter-APIC serial bus. This is a hardware problem. I'd start by
> > checking the power supply first.
>
>
> The only way that I've been able to boot and stay up is with nosmp,
> noapic, and nolapic. I'll try replacing the power supply and see if
> that helps things out. It's going to take me a few days to get a
> replacement -- is there anything else that I should check while I'm
> waiting?
I replaced the power supply and haven't had any APIC errors for the last
week. It looks like that definitely solved the problem.
When there are APIC errors logged, is it generally a hardware issue? I
have a few other boxes that log errors unless they're booted with noapic
(but with noapic, they run fine). Is the power supply generally the first
thing to check when trying to track down the source of an APIC related
hardware problem? Pointers to URLs or other forms of documentation
appreciated.
Thanks.
-Chris
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset)
2004-04-18 5:12 ` Chris Stromsoe
@ 2004-04-21 14:28 ` Maciej W. Rozycki
0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2004-04-21 14:28 UTC (permalink / raw)
To: Chris Stromsoe; +Cc: Marcelo Tosatti, linux-kernel, mikpe
On Sat, 17 Apr 2004, Chris Stromsoe wrote:
> When there are APIC errors logged, is it generally a hardware issue? I
Yep.
> have a few other boxes that log errors unless they're booted with noapic
> (but with noapic, they run fine). Is the power supply generally the first
> thing to check when trying to track down the source of an APIC related
Experience shows that's a good thing for a start. Of course the reason
may actually be a bad motherboard design and replacing the power supply
wouldn't help then.
> hardware problem? Pointers to URLs or other forms of documentation
> appreciated.
Well, just mailing list archives, unless, of course, you are interested
in the exact details of APIC operation. In that case,
http://developer.intel.com/ would be the right place to look for
documentation.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: apic errors and looping with 2.4, none with 2.2
2004-03-23 12:49 apic errors and looping with 2.4, none with 2.2 Chris Stromsoe
2004-03-24 10:50 ` apic errors and looping with 2.4, none with 2.2 (supermicro/serverworks LE chipset) Chris Stromsoe
@ 2004-03-25 16:39 ` Mikael Pettersson
1 sibling, 0 replies; 8+ messages in thread
From: Mikael Pettersson @ 2004-03-25 16:39 UTC (permalink / raw)
To: Chris Stromsoe; +Cc: linux-kernel, Marcelo Tosatti, macro
Chris Stromsoe writes:
> I have one machine that won't run 2.4. As soon as a 2.4 kernel boots, it
> starts throwing APIC errors.
>
> The machine is a dual CPU pIII 933MHz system with 512Mb ram on a
> SuperMicro motherboard, either a P3TDLR or a 370DLR, with the ServerWorks
> LE chipset. I'm booting using lilo with append="noapic".
>
> As soon as I boot into a 2.4 kernel, I start getting APIC errors on both
> CPUs. Varying combinations of:
>
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU0: 02(08)
> Mar 23 00:40:45 dahlia kernel: APIC error on CPU1: 01(08)
...
> After a few hours of uptime, the box stops responding to keyboard input.
> It begins printing the above messages to console over and over. I have
> several other identical machines that I received in the same batch that
> run 2.4 without any problems (though they do seem to require "noapic").
>
> It runs fine with 2.2 and is running 2.2.26 right now.
Like Maciej wrote, the machine's local APIC bus corrupts messages.
The fact that your other supposedly-identical machines don't have
this problem indicates strongly that this particular box has a HW
problem. It could be a weak power-supply, inadequate cooling, a
bad CPU, or a bad motherboard.
I just checked the 2.2.25 kernel and it doesn't seem to enable
APIC_LVTERR. You're probably getting APIC bus errors with 2.2
too, you just won't see them logged anywhere.
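On kernels that do report these errors, a counter can be easier to watch than syslog. A rough sketch, assuming the running kernel prints an "ERR:" row in /proc/interrupts (i386 kernels generally do) and assuming APIC error interrupts are counted there, which is not confirmed in this thread. `read_err_count` is a hypothetical helper and the sample text is made up purely to exercise it.

```python
import re

def read_err_count(interrupts_text):
    """Return the number on the 'ERR:' row of /proc/interrupts text, or None."""
    m = re.search(r"^\s*ERR:\s+(\d+)", interrupts_text, flags=re.MULTILINE)
    return int(m.group(1)) if m else None

# Invented sample in the general shape of /proc/interrupts; not real data.
sample = """\
           CPU0       CPU1
  0:       1000       1000    XT-PIC  timer
NMI:          0          0
ERR:         17
"""
print(read_err_count(sample))
```

On a live system the text would come from `open("/proc/interrupts").read()`; comparing the value across two readings shows whether errors are still accumulating.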
(Regarding your other boxes that need "noapic": you have tried
w/o ACPI, right? acpi=off or pci=noacpi.)
/Mikael
^ permalink raw reply [flat|nested] 8+ messages in thread