xen-devel.lists.xenproject.org archive mirror
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: George Dunlap <george.dunlap@eu.citrix.com>
Cc: "Keir (Xen.org)" <keir@xen.org>, Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86: fix ordering of operations in destroy_irq()
Date: Thu, 30 May 2013 18:22:19 +0100
Message-ID: <51A78ACB.7050105@citrix.com>
In-Reply-To: <51A783A3.6080703@eu.citrix.com>

On 30/05/2013 17:51, George Dunlap wrote:
> On 05/30/2013 05:42 PM, Jan Beulich wrote:
>>>>> George Dunlap <george.dunlap@eu.citrix.com> 05/30/13 6:23 PM >>>
>>> On 05/29/2013 07:58 AM, Jan Beulich wrote:
>>>> The fix for XSA-36, switching the default of vector map management to
>>>> be per-device, exposed more readily a problem with the cleanup of these
>>>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>>>> keeps the subsequently invoked clear_irq_vector() from clearing the
>>>> bits for both the in-use and a possibly still outstanding old vector.
>>>>
>>>> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
>>>> its only caller, deferring the clearing of the vector map pointer until
>>>> after clear_irq_vector().
>>>>
>>>> Once at it, also defer resetting of desc->handler until after the loop
>>>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>>>> (mostly theoretical) issue with the interaction with do_IRQ(): If we
>>>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>>>> ->ack() and ->end() with different ->handler pointers, potentially
>>>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>>>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>>>> error paths.
>>>>
>>>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>>>> held is okay because vector_lock already nests inside desc->lock (proven
>>>> by set_desc_affinity(), which takes vector_lock and gets called from
>>>> various desc->handler->ack implementations, getting invoked with
>>>> desc->lock held).
>>>>
>>>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> How big of an impact is this bug?  How many people are actually affected
>>> by it?
>> Andrew will likely be able to give you more precise info on this, but this
>> fixes a problem observed in practice. Any AMD system with IOMMU would
>> be affected.
>>
>>> It's a bit hard for me to tell from the description, but it looks like
>>> it's code motion, then some "theoretical" issues.
>> No, the description is pretty precise here: It fixes an actual issue and,
>> along the way, also a theoretical one.
>>
>>> Is the improvement this patch represents worth the potential risk of
>>> bugs at this point?
>> I think so - otherwise it would need to be backported right away after the
>> release.
> Right -- then if you could also commit this tomorrow, it will get the 
> best testing we can give it. :-)
>
> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
>
>   -George

As for the impact without this patch:

Following XSA-36, any AMD box with an IOMMU will either fail with an
ASSERT() or incorrectly program its interrupt remapping tables after
you map/unmap MSI/MSI-X IRQs a few times.

Basically, desc->arch.used_vectors steadily accumulated bits for
vectors no longer in use until an ASSERT(!test_bit ...) failed.

This is because the per-device vector table code was broken from the
start, but nobody noticed while the global vector map was the default.
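
To make the ordering problem concrete, here is a minimal standalone
sketch. It is not the Xen code: struct toy_desc, struct vmask and all
toy_* functions are hypothetical stand-ins, and the only behaviour
modelled is the assumption that the vector-clearing step can drop a bit
from the per-device map only while the descriptor still points at that
map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS 256

/* Hypothetical stand-in for a per-device vector map: one bit per vector. */
struct vmask { uint64_t bits[NR_VECTORS / 64]; };

/* Hypothetical, heavily reduced stand-in for an IRQ descriptor. */
struct toy_desc {
    int vector;                 /* vector currently in use, or -1 */
    struct vmask *used_vectors; /* per-device map this IRQ is accounted in */
};

static bool vmask_test(const struct vmask *m, int v)
{
    return (m->bits[v / 64] >> (v % 64)) & 1;
}

static void vmask_set(struct vmask *m, int v)
{
    m->bits[v / 64] |= UINT64_C(1) << (v % 64);
}

static void vmask_clear(struct vmask *m, int v)
{
    m->bits[v / 64] &= ~(UINT64_C(1) << (v % 64));
}

/* Models the vector-clearing step: without a map pointer it has no way
 * to find the bit it should clear (assumption of this sketch). */
static void toy_clear_irq_vector(struct toy_desc *d)
{
    if (d->used_vectors && d->vector >= 0)
        vmask_clear(d->used_vectors, d->vector);
    d->vector = -1;
}

/* Old ordering: the map pointer is dropped before the vector is cleared. */
static void toy_destroy_irq_buggy(struct toy_desc *d)
{
    d->used_vectors = NULL;
    toy_clear_irq_vector(d);
}

/* Patched ordering: clear the vector first, then drop the map pointer. */
static void toy_destroy_irq_fixed(struct toy_desc *d)
{
    toy_clear_irq_vector(d);
    d->used_vectors = NULL;
}

/* Map and unmap one IRQ on a device; report whether its bit stays set. */
bool toy_bit_left_stale(bool fixed)
{
    struct vmask dev_map = { { 0 } };
    struct toy_desc d = { .vector = 0x40, .used_vectors = &dev_map };

    vmask_set(&dev_map, d.vector);      /* "map": account for the vector */
    if (fixed)
        toy_destroy_irq_fixed(&d);      /* "unmap", patched ordering */
    else
        toy_destroy_irq_buggy(&d);      /* "unmap", old ordering */
    return vmask_test(&dev_map, 0x40);
}
```

With the old ordering the bit for vector 0x40 survives the unmap, so a
later assignment of the same vector to that device would find it
already marked as used. With the patched ordering the map is clean
again after the unmap.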

~Andrew


Thread overview: 9+ messages
2013-05-29  6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29  7:23 ` Jan Beulich
2013-05-29 22:17   ` Andrew Cooper
2013-05-29  7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2013-05-30 16:42   ` Jan Beulich
2013-05-30 16:51     ` George Dunlap
2013-05-30 17:22       ` Andrew Cooper [this message]
2013-05-31  6:36       ` Jan Beulich
