From: James Cleverdon <jamesclv@us.ibm.com>
To: Chris Rankin <rankincj@yahoo.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: APIC error on SMP machine
Date: Tue, 30 Sep 2003 18:52:47 -0700
Message-ID: <200309301852.47835.jamesclv@us.ibm.com>
In-Reply-To: <3F79F8BB.2080905@yahoo.com>

On Tuesday 30 September 2003 2:42 pm, Chris Rankin wrote:
> Linux-2.4.22-SMP, 1 GB RAM, devfs, gcc-3.2.3.
>
> Hi,
>
> Today, my dual PIII (Coppermine) refused to boot, and wrote a large number
> of these messages to the serial console instead:
>
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
>
> Can anyone tell me what these might mean, please? The kernel source implies
> that it's a "Send accept error", but this doesn't help me in an "Ah, I can
> fix that!" sense.
>
> Does this APIC error just mean that the CPU is unhappy in this slot, and is
> refusing to listen to the motherboard? Or is the motherboard refusing to
> listen to the CPU?

Neither. An APIC send accept error means that a CPU tried to send an
interrupt and the target did not accept it. In this case, the target is a
CPU, either your other CPU or the same one (a CPU can send itself an
interrupt).
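
As a rough illustration of how the "04(04)" in the quoted log lines maps to "Send accept error", here is a minimal Python sketch. It assumes the bit layout listed in the comment next to the APIC error handler in the 2.4 kernel source; the names `ESR_BITS` and `decode_esr` are made up for this example and are not kernel identifiers.

```python
# Bit names for the local APIC error status value, lowest bit first.
# Assumption: this follows the comment in the 2.4 kernel's APIC error handler.
ESR_BITS = [
    "Send CS error",
    "Receive CS error",
    "Send accept error",
    "Receive accept error",
    "Reserved",
    "Send illegal vector",
    "Received illegal vector",
    "Illegal register address",
]


def decode_esr(value):
    """Return the names of all error bits set in an 8-bit error status value."""
    return [name for bit, name in enumerate(ESR_BITS) if value & (1 << bit)]


# The value printed in the log above is 0x04, i.e. only bit 2 is set.
print(decode_esr(0x04))  # prints ['Send accept error']
```

Both numbers in each log line are 04, so under this assumed layout the same single bit is set in both readings of the register.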

While there are several reasons why this can happen, the most common ones are:

1) The target CPU is "full". The local APIC on P54Cs through P3s only has two
interrupt latches per interrupt "level", which is the high nibble of the IRQ
vector number. So, if a CPU had already latched interrupt vectors 0x30 and
0x3A, it would have to reject any other 0x3X vector that was sent until it
could service one of the two latched vectors.

You can force this to happen by manually binding too many IRQs that happen to
be on the same "level" to one CPU, then causing a lot of interrupt traffic on
those devices.

In order to avoid this problem, Linux spreads the IRQs among as many vector
levels as possible. Still, the vector assignment is done before any devices
have requested interrupts. You may get unlucky and have 3 devices on one
level.

2) The interrupt cannot be delivered because something is wrong with it. This
can happen if the kernel screws up and picks "clustered" APIC mode on a
"flat" system or vice versa. A dual P3 system should be flat. Check your
dmesg log to make sure it was properly detected. (This seldom happens unless
you're doing interrupt development work in Linux.)

3) Maybe the other CPU is broken and physically cannot accept the interrupt.

Do any previous kernels boot?
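
The "full" case in reason 1 can be pictured with a small toy model. This is only a sketch of the behaviour as described above (two latches per level, level taken from the high nibble of the vector), not a model of the real hardware registers; `ToyLocalApic`, `vector_level` and `SLOTS_PER_LEVEL` are hypothetical names invented for the example.

```python
from collections import defaultdict

# Assumption taken from the description above: two latches per vector level.
SLOTS_PER_LEVEL = 2


def vector_level(vector):
    """The level is the high nibble of the 8-bit vector number."""
    return (vector >> 4) & 0xF


class ToyLocalApic:
    """Toy target CPU that can hold at most two pending vectors per level."""

    def __init__(self):
        self.latched = defaultdict(list)  # level -> list of latched vectors

    def deliver(self, vector):
        """Try to latch a vector; False stands for a rejected delivery."""
        slots = self.latched[vector_level(vector)]
        if len(slots) >= SLOTS_PER_LEVEL:
            return False
        slots.append(vector)
        return True

    def service(self, vector):
        """Free the latch held by a vector once it has been handled."""
        self.latched[vector_level(vector)].remove(vector)


apic = ToyLocalApic()
print(apic.deliver(0x30))  # True: first latch on level 3
print(apic.deliver(0x3A))  # True: second latch on level 3
print(apic.deliver(0x35))  # False: level 3 is full, sender sees a rejection
print(apic.deliver(0x41))  # True: level 4 is unaffected
apic.service(0x30)
print(apic.deliver(0x35))  # True: a latch on level 3 is free again
```

In this model a third device on the same level only gets rejected while both latches are occupied, which matches the point that heavy traffic on several same-level IRQs bound to one CPU is needed to trigger it.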
> Background:
> This machine has been misbehaving for a while. I thought I had worked
> around the problem by underclocking the FSB from 133 MHz to 100 MHz, but
> that now looks like it was just a "reprieve". I have tried running "nosmp",
> "pci=noacpi" and "noapic pci=noacpi" without success, and have resorted to
> yanking the CPU out of this slot entirely. (I suspect that the CPU is fine,
> however.) I have also restored the FSB to 133 MHz, so I am currently
> running the SMP kernel on a single 933 MHz PIII.
>
> Cheers,
> Chris
>
> -
--
James Cleverdon
IBM xSeries Linux Solutions
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot comm