From: James Cleverdon <jamesclv@us.ibm.com>
To: Chris Rankin <rankincj@yahoo.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: APIC error on SMP machine
Date: Tue, 30 Sep 2003 18:52:47 -0700
Message-ID: <200309301852.47835.jamesclv@us.ibm.com>
In-Reply-To: <3F79F8BB.2080905@yahoo.com>

On Tuesday 30 September 2003 2:42 pm, Chris Rankin wrote:
> Linux-2.4.22-SMP, 1 GB RAM, devfs, gcc-3.2.3.
>
> Hi,
>
> Today, my dual PIII (Coppermine) refused to boot, and wrote a large number
> of these messages to the serial console instead:
>
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
> APIC error on CPU1: 04(04)
>
> Can anyone tell me what these might mean, please? The kernel source implies
> that it's a "Send accept error", but this doesn't help me in an "Ah, I can
> fix that!" sense.
>
> Does this APIC error just mean that the CPU is unhappy in this slot, and is
> refusing to listen to the motherboard? Or is the motherboard refusing to
> listen to the CPU?

Neither.  A send accept error means that the local APIC sent an interrupt and
the target did not accept it.  In this case the target is a CPU: either your
other CPU or the sending CPU itself (a CPU can send itself an interrupt).
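
For reference, the 04 in those messages is a bitmask taken from the local
APIC's Error Status Register.  Below is a minimal sketch of decoding it,
assuming the P6-family bit layout.  The names esr_bit_names and
esr_first_error are hypothetical helpers made up for this illustration, not
kernel functions.

```c
#include <stddef.h>

/*
 * Error Status Register bit names, assuming the P6-family layout.
 * This table is an illustration, not copied from any kernel source.
 */
static const char *const esr_bit_names[8] = {
	"Send checksum error",      /* bit 0 */
	"Receive checksum error",   /* bit 1 */
	"Send accept error",        /* bit 2 */
	"Receive accept error",     /* bit 3 */
	"Reserved",                 /* bit 4 */
	"Send illegal vector",      /* bit 5 */
	"Received illegal vector",  /* bit 6 */
	"Illegal register address", /* bit 7 */
};

/* Return the name of the lowest set bit in esr, or "No error" if none. */
static const char *esr_first_error(unsigned int esr)
{
	size_t bit;

	for (bit = 0; bit < 8; bit++)
		if (esr & (1u << bit))
			return esr_bit_names[bit];
	return "No error";
}
```

Under that assumed layout, a value of 0x04 has only bit 2 set, which is the
send accept error.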

While there are several reasons why this can happen, the most common ones are:

1) The target CPU is "full".  The local APICs on P54Cs through P3s have only
two interrupt latches per interrupt "level", where the level is the high
nibble of the interrupt vector number.  So, if a CPU has already latched
vectors 0x30 and 0x3A, it must reject any other 0x3X vector sent to it until
it has serviced one of the two latched vectors.

You can force this to happen by manually binding too many IRQs that happen to 
be on the same "level" to one CPU, then causing a lot of interrupt traffic on 
those devices.

To avoid this problem, Linux spreads the IRQs among as many vector levels as
possible.  However, the vectors are assigned before any device has requested
an interrupt, so you may still get unlucky and end up with 3 devices on one
level.

2) The interrupt cannot be delivered because something is wrong with it.  This
can happen if the kernel screws up and picks "clustered" APIC mode on a
"flat" system, or vice versa.  A dual P3 system should be flat.  Check your
dmesg log to make sure the APIC mode was detected properly.  (This seldom
happens unless you're doing interrupt development work in Linux.)

3) Maybe the other CPU is broken and physically cannot accept the interrupt.  
Do any previous kernels boot?
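
To make the arithmetic in reason 1 concrete, here is a toy model of the
two-latches-per-level limit, together with a stride-based way of spreading
vectors across levels.  This is a sketch under stated assumptions, not the
kernel's allocator and not a faithful model of the hardware.  The constants
FIRST_DEVICE_VECTOR, FIRST_SYSTEM_VECTOR and VECTOR_STRIDE are picked for
illustration, every struct and function name is hypothetical, and reserved
vectors (such as the system call vector) are ignored.

```c
#include <stdbool.h>

#define LEVELS              16   /* level = high nibble of the vector */
#define LATCHES_PER_LEVEL   2    /* the limit described in reason 1 */
#define FIRST_DEVICE_VECTOR 0x31 /* assumed, for illustration only */
#define FIRST_SYSTEM_VECTOR 0xEF /* assumed, for illustration only */
#define VECTOR_STRIDE       8    /* assumed: two vectors per level per pass */

/* Toy local APIC: counts latched, not yet serviced vectors per level. */
struct lapic_model {
	unsigned char pending[LEVELS];
};

/* The "level" of a vector is its high nibble. */
static unsigned int vector_level(unsigned int vector)
{
	return (vector >> 4) & 0xF;
}

/* Try to latch a vector.  Returns false when the level is full, which is
 * the case the sender would see as a send accept error. */
static bool lapic_accept(struct lapic_model *apic, unsigned int vector)
{
	unsigned int level = vector_level(vector);

	if (apic->pending[level] >= LATCHES_PER_LEVEL)
		return false;
	apic->pending[level]++;
	return true;
}

/* Servicing a latched vector frees one latch on its level. */
static void lapic_service(struct lapic_model *apic, unsigned int vector)
{
	unsigned int level = vector_level(vector);

	if (apic->pending[level] > 0)
		apic->pending[level]--;
}

/* Hypothetical allocator state: next vector to hand out, and the offset
 * applied after each wrap-around. */
struct vector_allocator {
	unsigned int current;
	unsigned int offset;
};

/* Hand out vectors VECTOR_STRIDE apart, so one pass through the range
 * places at most two vectors on any level.  After a wrap, start again
 * one vector higher than the previous pass. */
static unsigned int next_vector(struct vector_allocator *a)
{
	unsigned int v = a->current;

	a->current += VECTOR_STRIDE;
	if (a->current >= FIRST_SYSTEM_VECTOR) {
		a->offset = (a->offset + 1) % VECTOR_STRIDE;
		a->current = FIRST_DEVICE_VECTOR + a->offset;
	}
	return v;
}
```

In this model, a CPU that has latched 0x30 and 0x3A rejects 0x35 until one of
the first two has been serviced, while 0x41 is still accepted because it is
on a different level.  With the assumed constants, an allocator initialized
to { FIRST_DEVICE_VECTOR, 0 } hands out 0x31, 0x39, 0x41, ... and the first
24 vectors land two per level; the 25th is 0x32, the first to share a level
with two others.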

> Background:
> This machine has been misbehaving for a while. I thought I had worked
> around the problem by underclocking the FSB from 133 MHz to 100 MHz, but
> that now looks like it was just a "reprieve". I have tried running "nosmp",
> "pci=noacpi" and "noapic pci=noacpi" without success, and have resorted to
> yanking the CPU out of this slot entirely. (I suspect that the CPU is fine,
> however.) I have also restored the FSB to 133 MHz, so I am currently
> running the SMP kernel on a single 933 MHz PIII.
>
> Cheers,
> Chris
>
> -


-- 
James Cleverdon
IBM xSeries Linux Solutions
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot comm
