xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	"Keir (Xen.org)" <keir@xen.org>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86: fix ordering of operations in destroy_irq()
Date: Wed, 29 May 2013 23:17:40 +0100	[thread overview]
Message-ID: <51A67E84.9050404@citrix.com> (raw)
In-Reply-To: <51A5C91502000078000D9768@nat28.tlf.novell.com>

On 29/05/2013 08:23, Jan Beulich wrote:
>>>> On 29.05.13 at 08:58, "Jan Beulich" <JBeulich@suse.com> wrote:
>> The fix for XSA-36, switching the default of vector map management to
>> be per-device, exposed more readily a problem with the cleanup of these
>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>> keeps the subsequently invoked clear_irq_vector() from clearing the
>> bits for both the in-use and a possibly still outstanding old vector.
> This resulted from an Andrew bringing the issue up on security@,
> and there are more problems here which partly he pointed out and
> partly we came to realize during the discussion there.
>
> One issue is that the vector map management is not really taking
> effect for devices using multiple interrupts (at present MSI-X only)
> because the initial vector assignment in create_irq() can't possibly
> take into account the vector map, as that can be (and is being, in
> map_domain_pirq()) set only after that function returns.
>
> Andrew suggests to immediately re-assign the vector in
> map_domain_pirq(), but that would be cumbersome as a "normal"
> re-assignment only takes effect the next time an IRQ actually
> occurs. I therefore think that we should rather suppress the initial
> assignment of a vector, deferring it until after the vector map got
> set.
>
> We figured that this is no security relevant as the respective
> assertion is a debug build only thing (and debug build only issues
> were decided to not be handled via issuing advisories some time
> ago), and the effect this guards against (improperly set up IRTEs
> on AMD IOMMU) is only affecting the guest itself (using - until the
> multi-vector MSI series goes in - solely vector based indexing): A
> second CPU getting the same vector allocated as is in use for
> another IRQ of the same device on another CPU, this would simply
> break the IRQs raised to that guest.
>
>> Once at it, also defer resetting of desc->handler until after the loop
>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>> (mostly theoretical) issue with the intercation with do_IRQ(): If we
>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>> ->ack() and ->end() with different ->handler pointers, potentially
>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>> error paths.
> While this change also reduces the chances of an IRQ getting raised
> along with destroy_irq() getting invoked and, due to do_IRQ() losing
> the race for desc->lock, calling ack_bad_irq() due to ->handler
> already having got set to &no_irq_type, it doesn't entirely close the
> window. While the issue is cosmetic only (the IRQ raised here has no
> meaning anymore as it ie being torn down, and hence the message
> issued from ack_bad_irq() is merely a false positive), I think that this
> could be fixed by transitioning desc->arch.used to IRQ_RESERVED
> (instead of IRQ_UNUSED) in __clear_irq_vector(), and deferring both
> the resetting of desc->handler and the setting of desc->arch.used to
> IRQ_UNUSED to an RCU callback.
>
>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>> held is okay because vector_lock already nests inside desc->lock (proven
>> by set_desc_affinity(), which takes vector_lock and gets called from
>> various desc->handler->ack implementations, getting invoked with
>> desc->lock held).
> The locking around vector management appears to be a wider issue:
> Changes to IRQ descriptors generally are to be protected by holding
> desc->lock, yet all of the vector management is only guarded by
> vector_lock. The patch here addresses this for destroy_irq()'s
> invocation of clear_irq_vector(), but this apparently would need
> taking care of on all other affected code paths.
>
> Andrew - please double check I didn't forget any of the aspects we
> discussed yesterday.
>
> Jan
>

I believe that covers the issues we discovered.  I will be back in the
office properly on Friday and can see about addressing the issues not
addressed so far in this patch.

As for the patch itself,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

As for the further points.

I am not sure the RCU stuff is sufficient.  There can be an unknown
number of irqs in transit somewhere in the hardware between a
destroy_irq() running ->shutdown() and desc->handler being replaced with
&no_irq_type.

A grace time long enough to be fairly sure that any irqs-in-transit are
delivered will remove the chance of false-positives in ack_none(), at
the risk of some false negatives slipping by.  I suppose another trick
with -irq might be able to be used to distinguish between a late
irq-in-transit and a real spurious irq.

~Andrew

  reply	other threads:[~2013-05-29 22:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-29  6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29  7:23 ` Jan Beulich
2013-05-29 22:17   ` Andrew Cooper [this message]
2013-05-29  7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2013-05-30 16:42   ` Jan Beulich
2013-05-30 16:51     ` George Dunlap
2013-05-30 17:22       ` Andrew Cooper
2013-05-31  6:36       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51A67E84.9050404@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).