All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	"Keir (Xen.org)" <keir@xen.org>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH] x86: fix ordering of operations in destroy_irq()
Date: Wed, 29 May 2013 23:17:40 +0100	[thread overview]
Message-ID: <51A67E84.9050404@citrix.com> (raw)
In-Reply-To: <51A5C91502000078000D9768@nat28.tlf.novell.com>

On 29/05/2013 08:23, Jan Beulich wrote:
>>>> On 29.05.13 at 08:58, "Jan Beulich" <JBeulich@suse.com> wrote:
>> The fix for XSA-36, switching the default of vector map management to
>> be per-device, exposed more readily a problem with the cleanup of these
>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>> keeps the subsequently invoked clear_irq_vector() from clearing the
>> bits for both the in-use and a possibly still outstanding old vector.
> This resulted from an Andrew bringing the issue up on security@,
> and there are more problems here which partly he pointed out and
> partly we came to realize during the discussion there.
>
> One issue is that the vector map management is not really taking
> effect for devices using multiple interrupts (at present MSI-X only)
> because the initial vector assignment in create_irq() can't possibly
> take into account the vector map, as that can be (and is being, in
> map_domain_pirq()) set only after that function returns.
>
> Andrew suggests to immediately re-assign the vector in
> map_domain_pirq(), but that would be cumbersome as a "normal"
> re-assignment only takes effect the next time an IRQ actually
> occurs. I therefore think that we should rather suppress the initial
> assignment of a vector, deferring it until after the vector map got
> set.
>
> We figured that this is no security relevant as the respective
> assertion is a debug build only thing (and debug build only issues
> were decided to not be handled via issuing advisories some time
> ago), and the effect this guards against (improperly set up IRTEs
> on AMD IOMMU) is only affecting the guest itself (using - until the
> multi-vector MSI series goes in - solely vector based indexing): A
> second CPU getting the same vector allocated as is in use for
> another IRQ of the same device on another CPU, this would simply
> break the IRQs raised to that guest.
>
>> Once at it, also defer resetting of desc->handler until after the loop
>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>> (mostly theoretical) issue with the intercation with do_IRQ(): If we
>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>> ->ack() and ->end() with different ->handler pointers, potentially
>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>> error paths.
> While this change also reduces the chances of an IRQ getting raised
> along with destroy_irq() getting invoked and, due to do_IRQ() losing
> the race for desc->lock, calling ack_bad_irq() due to ->handler
> already having got set to &no_irq_type, it doesn't entirely close the
> window. While the issue is cosmetic only (the IRQ raised here has no
> meaning anymore as it ie being torn down, and hence the message
> issued from ack_bad_irq() is merely a false positive), I think that this
> could be fixed by transitioning desc->arch.used to IRQ_RESERVED
> (instead of IRQ_UNUSED) in __clear_irq_vector(), and deferring both
> the resetting of desc->handler and the setting of desc->arch.used to
> IRQ_UNUSED to an RCU callback.
>
>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>> held is okay because vector_lock already nests inside desc->lock (proven
>> by set_desc_affinity(), which takes vector_lock and gets called from
>> various desc->handler->ack implementations, getting invoked with
>> desc->lock held).
> The locking around vector management appears to be a wider issue:
> Changes to IRQ descriptors generally are to be protected by holding
> desc->lock, yet all of the vector management is only guarded by
> vector_lock. The patch here addresses this for destroy_irq()'s
> invocation of clear_irq_vector(), but this apparently would need
> taking care of on all other affected code paths.
>
> Andrew - please double check I didn't forget any of the aspects we
> discussed yesterday.
>
> Jan
>

I believe that covers the issues we discovered.  I will be back in the
office properly on Friday and can see about addressing the issues not
addressed so far in this patch.

As for the patch itself,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

As for the further points.

I am not sure the RCU stuff is sufficient.  There can be an unknown
number of irqs in transit somewhere in the hardware between a
destroy_irq() running ->shutdown() and desc->handler being replaced with
&no_irq_type.

A grace time long enough to be fairly sure that any irqs-in-transit are
delivered will remove the chance of false-positives in ack_none(), at
the risk of some false negatives slipping by.  I suppose another trick
with -irq might be able to be used to distinguish between a late
irq-in-transit and a real spurious irq.

~Andrew

  reply	other threads:[~2013-05-29 22:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-29  6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29  7:23 ` Jan Beulich
2013-05-29 22:17   ` Andrew Cooper [this message]
2013-05-29  7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2013-05-30 16:42   ` Jan Beulich
2013-05-30 16:51     ` George Dunlap
2013-05-30 17:22       ` Andrew Cooper
2013-05-31  6:36       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51A67E84.9050404@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.