From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86: adjust handling of interrupts coming in via legacy vectors
Date: Mon, 14 May 2012 15:38:45 +0100
Message-ID: <4FB118F5.4050709@citrix.com>
In-Reply-To: <4FB132C80200007800083808@nat28.tlf.novell.com>
On 14/05/12 15:28, Jan Beulich wrote:
>>>> On 14.05.12 at 15:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 14/05/12 13:55, Jan Beulich wrote:
>>>>>> On 14.05.12 at 14:39, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> The debugging code added in c/s 24707:96987c324a4f was hit a (small)
>>>> number of times (one report being
>>>> http://lists.xen.org/archives/html/xen-devel/2012-05/msg00332.html),
>>>> apparently always with a vector within the legacy range. Obviously,
>>>> besides legacy vectors not normally expected to be in use on systems
>>>> with IO-APIC(s), they should never make it to the IRQ migration logic.
>>>>
>>>> This wasn't being prevented so far: Since we don't have a one-to-one
>>>> mapping between vectors and IRQs - legacy IRQs may have two vectors
>>>> associated with them (one used in either 8259A, the other used in one
>>>> of the IO-APICs) -, vector-to-IRQ translations for legacy vectors (as
>>>> used in do_IRQ()) would yield a valid IRQ number despite the IRQ
>>>> really being handled via an IO-APIC.
>>>>
>>>> This gets changed here - disable_8259A_irq() zaps the legacy vector-to-
>>>> IRQ mapping, and enable_8259A_irq(), should it ever be called for a
>>>> particular interrupt, restores it.
>>>>
>>>> Additionally, the spurious interrupt logic in do_IRQ() gets adjusted
>>>> too: Interrupts coming in via legacy vectors obviously didn't get
>>>> reported through the IO-APIC/LAPIC pair (as we never program these
>>>> vectors into any RTE), and hence shouldn't get ack_APIC_irq() called on
>>>> them. Instead, a new function (pointer) bogus_8259A_irq() gets used to
>>>> have the 8259A driver take care of the bogus interrupt (as outside of
>>>> automatic EOI mode it may need an EOI to be issued for it to prevent
>>>> other interrupts that may legitimately go through the 8259As from
>>>> getting masked out).
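
For readers following along, the mapping change described above can be modelled roughly as below. This is a toy sketch, not the patch: the table, the model_* function names, and the 0xE0 base vector are all assumptions made for illustration, and the real code keeps per-CPU state that this ignores.

```c
#include <assert.h>

#define NR_VECTORS          256
#define NR_LEGACY_IRQS      16
#define LEGACY_FIRST_VECTOR 0xE0   /* assumed base of the legacy vector range */
#define NO_IRQ              (-1)

/* Toy vector-to-IRQ table; hypothetical stand-in for the hypervisor's own. */
static int vector_to_irq[NR_VECTORS];

/* Start with every legacy IRQ reachable through its legacy vector. */
static void model_init(void)
{
    for (int v = 0; v < NR_VECTORS; v++)
        vector_to_irq[v] = NO_IRQ;
    for (int irq = 0; irq < NR_LEGACY_IRQS; irq++)
        vector_to_irq[LEGACY_FIRST_VECTOR + irq] = irq;
}

/* Masking the line in the 8259A also zaps the legacy vector's mapping. */
static void model_disable_8259A_irq(int irq)
{
    assert(irq >= 0 && irq < NR_LEGACY_IRQS);
    vector_to_irq[LEGACY_FIRST_VECTOR + irq] = NO_IRQ;
}

/* Unmasking the line restores the mapping. */
static void model_enable_8259A_irq(int irq)
{
    assert(irq >= 0 && irq < NR_LEGACY_IRQS);
    vector_to_irq[LEGACY_FIRST_VECTOR + irq] = irq;
}

/* What the interrupt entry path would see for a given vector. */
static int model_lookup(int vector)
{
    assert(vector >= 0 && vector < NR_VECTORS);
    return vector_to_irq[vector];
}
```

With the mapping zapped, an interrupt arriving on a legacy vector whose IRQ is really routed via an IO-APIC no longer translates to a valid IRQ, so it falls into the spurious-interrupt path instead of reaching the migration logic.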
>>> Note that this patch does not make any attempt at dealing with the
>>> underlying issue that causes the bogus interrupt(s) to show up. If
>>> my analysis is right, we shouldn't see crashes anymore, but instead
>>> observe instances of spurious interrupts on legacy vectors. It would
>>> certainly be nice to have an actual proof of this (albeit I realize that
>>> this isn't readily reproducible), in order to then - if indeed behaving
>>> as expected - add debugging code to identify whether such interrupts
>>> in fact get raised by one of the 8259A-s (particularly printing the
>>> cached and physical mask register values), or whether they get
>>> introduced into the system by yet another obscure mechanism.
>>>
>>> One particular thing I'm suspicious about are the numerous aliases
>>> to the two (each) 8259A I/O ports that various chipsets have: What
>>> if some component in Dom0 accesses one of the alias ports in order
>>> to do something specific to a non-standard platform (say, probe for
>>> some special hardware interface), not realizing that it actually plays
>>> with PIC state? Linux under the same conditions wouldn't severely
>>> suffer - as it has a 1:1 vector <-> IRQ translation, it likely would
>>> merely observe an extra interrupt.
>> On the whole, the patch looks sensible, but what happens if the spurious
>> interrupt is coming in through the Local APIC? If this is the case,
>> then we still need to ACK it, even if it is a bogus PIC interrupt.
>>
>> Perhaps in irq.c, the changes should check whether the observed vector
>> has been raised in the LAPIC and ack it, and then decide whether it is
>> bogus or not.
> Should that really turn out to be the case, we're in much bigger trouble,
> as then we need an explanation how an interrupt at that vector could
> have got raised in the first place. I'd therefore like to keep the current
> change deal only with things that we know can happen.
We would be in huge trouble. As it currently stands, I am not certain
that we can be sure that this is not happening.

As a concession, perhaps a test of the LAPIC IIR, and an obvious error
to the console? It would be more useful than having Xen crash/hang
due to no longer always ack'ing the LAPIC.
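
Roughly what I have in mind, as a self-contained sketch rather than a patch. It assumes the register to test is the LAPIC in-service register (ISR), which holds one bit per vector across eight 32-bit words. The struct, the function names, and the 0xE0-0xEF legacy range are placeholders for illustration, not the identifiers in irq.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LEGACY_FIRST_VECTOR 0xE0   /* assumed legacy vector range */
#define LEGACY_LAST_VECTOR  0xEF

/* Toy model of the LAPIC ISR: 256 bits held in eight 32-bit words. */
struct lapic_model {
    uint32_t isr[8];
};

/* True if the given vector is marked in service in the modelled LAPIC. */
static bool lapic_in_service(const struct lapic_model *l, unsigned int vector)
{
    assert(vector < 256);
    return (l->isr[vector >> 5] >> (vector & 31)) & 1u;
}

enum unhandled_action {
    ACK_LAPIC,           /* ordinary vector: ack the LAPIC as before */
    ACK_LAPIC_AND_WARN,  /* legacy vector seen by the LAPIC: ack and complain */
    HAND_TO_PIC          /* legacy vector unknown to the LAPIC: 8259A's problem */
};

/* Decide what to do with an interrupt that has no handler. */
static enum unhandled_action classify_unhandled(const struct lapic_model *l,
                                                unsigned int vector)
{
    bool legacy = vector >= LEGACY_FIRST_VECTOR &&
                  vector <= LEGACY_LAST_VECTOR;

    if (!legacy)
        return ACK_LAPIC;

    if (lapic_in_service(l, vector)) {
        /* Should never happen; make it loud rather than hang later. */
        fprintf(stderr, "legacy vector %#x in service in LAPIC\n", vector);
        return ACK_LAPIC_AND_WARN;
    }

    return HAND_TO_PIC;
}
```

The point of the middle case is only that the LAPIC still gets its ack when it turns out to have delivered the vector, with a message that tells us the "can't happen" case did happen.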
>
>> Might it also be sensible to remove dom0's permissions to use the PIC
>> ports, in case it is some weird issue like that?
> That's already being done iirc. The problem is that it's non-trivial (and
> perhaps non-reliable) to determine the aliases, and hence we can't
> blindly remove more than the two real ports from Dom0's permitted
> set.
>
> Jan
Ah yes - in which case it's not feasible.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com