* [PATCH] x86: fix ordering of operations in destroy_irq()
@ 2013-05-29 6:58 Jan Beulich
2013-05-29 7:23 ` Jan Beulich
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Jan Beulich @ 2013-05-29 6:58 UTC (permalink / raw)
To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser
[-- Attachment #1: Type: text/plain, Size: 3501 bytes --]
The fix for XSA-36, switching the default of vector map management to
be per-device, exposed more readily a problem with the cleanup of these
vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
keeps the subsequently invoked clear_irq_vector() from clearing the
bits for both the in-use and a possibly still outstanding old vector.
Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
its only caller, deferring the clearing of the vector map pointer until
after clear_irq_vector().
While at it, also defer resetting of desc->handler until after the loop
around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
(mostly theoretical) issue with the interaction with do_IRQ(): If we
don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
->ack() and ->end() with different ->handler pointers, potentially
leading to an IRQ remaining un-acked. The issue is mostly theoretical
because non-guest IRQs are subject to destroy_irq() only on (boot time)
error paths.
As to the changed locking: Invoking clear_irq_vector() with desc->lock
held is okay because vector_lock already nests inside desc->lock (proven
by set_desc_affinity(), which takes vector_lock and gets called from
various desc->handler->ack implementations, getting invoked with
desc->lock held).
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -197,12 +197,24 @@ int create_irq(int node)
return irq;
}
-static void dynamic_irq_cleanup(unsigned int irq)
+void destroy_irq(unsigned int irq)
{
struct irq_desc *desc = irq_to_desc(irq);
unsigned long flags;
struct irqaction *action;
+ BUG_ON(!MSI_IRQ(irq));
+
+ if ( dom0 )
+ {
+ int err = irq_deny_access(dom0, irq);
+
+ if ( err )
+ printk(XENLOG_G_ERR
+ "Could not revoke Dom0 access to IRQ%u (error %d)\n",
+ irq, err);
+ }
+
spin_lock_irqsave(&desc->lock, flags);
desc->status |= IRQ_DISABLED;
desc->status &= ~IRQ_GUEST;
@@ -210,16 +222,19 @@ static void dynamic_irq_cleanup(unsigned
action = desc->action;
desc->action = NULL;
desc->msi_desc = NULL;
- desc->handler = &no_irq_type;
- desc->arch.used_vectors = NULL;
cpumask_setall(desc->affinity);
spin_unlock_irqrestore(&desc->lock, flags);
/* Wait to make sure it's not being used on another CPU */
do { smp_mb(); } while ( desc->status & IRQ_INPROGRESS );
- if (action)
- xfree(action);
+ spin_lock_irqsave(&desc->lock, flags);
+ desc->handler = &no_irq_type;
+ clear_irq_vector(irq);
+ desc->arch.used_vectors = NULL;
+ spin_unlock_irqrestore(&desc->lock, flags);
+
+ xfree(action);
}
static void __clear_irq_vector(int irq)
@@ -286,24 +301,6 @@ void clear_irq_vector(int irq)
spin_unlock_irqrestore(&vector_lock, flags);
}
-void destroy_irq(unsigned int irq)
-{
- BUG_ON(!MSI_IRQ(irq));
-
- if ( dom0 )
- {
- int err = irq_deny_access(dom0, irq);
-
- if ( err )
- printk(XENLOG_G_ERR
- "Could not revoke Dom0 access to IRQ%u (error %d)\n",
- irq, err);
- }
-
- dynamic_irq_cleanup(irq);
- clear_irq_vector(irq);
-}
-
int irq_to_vector(int irq)
{
int vector = -1;
[-- Attachment #3: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-29 6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
@ 2013-05-29 7:23 ` Jan Beulich
2013-05-29 22:17 ` Andrew Cooper
2013-05-29 7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2013-05-29 7:23 UTC (permalink / raw)
To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser
>>> On 29.05.13 at 08:58, "Jan Beulich" <JBeulich@suse.com> wrote:
> The fix for XSA-36, switching the default of vector map management to
> be per-device, exposed more readily a problem with the cleanup of these
> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
> keeps the subsequently invoked clear_irq_vector() from clearing the
> bits for both the in-use and a possibly still outstanding old vector.
This resulted from Andrew bringing the issue up on security@,
and there are more problems here which partly he pointed out and
partly we came to realize during the discussion there.
One issue is that the vector map management is not really taking
effect for devices using multiple interrupts (at present MSI-X only)
because the initial vector assignment in create_irq() can't possibly
take into account the vector map, as that can be (and is being, in
map_domain_pirq()) set only after that function returns.
Andrew suggests immediately re-assigning the vector in
map_domain_pirq(), but that would be cumbersome as a "normal"
re-assignment only takes effect the next time an IRQ actually
occurs. I therefore think that we should rather suppress the initial
assignment of a vector, deferring it until after the vector map got
set.
We figured that this is not security relevant as the respective
assertion is a debug build only thing (and debug build only issues
were decided to not be handled via issuing advisories some time
ago), and the effect this guards against (improperly set up IRTEs
on AMD IOMMU) is only affecting the guest itself (using - until the
multi-vector MSI series goes in - solely vector based indexing): if a
second CPU gets the same vector allocated as is in use for
another IRQ of the same device on another CPU, that would simply
break the IRQs raised to that guest.
> While at it, also defer resetting of desc->handler until after the loop
> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
> (mostly theoretical) issue with the interaction with do_IRQ(): If we
> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
> ->ack() and ->end() with different ->handler pointers, potentially
> leading to an IRQ remaining un-acked. The issue is mostly theoretical
> because non-guest IRQs are subject to destroy_irq() only on (boot time)
> error paths.
While this change also reduces the chances of an IRQ getting raised
along with destroy_irq() getting invoked and, due to do_IRQ() losing
the race for desc->lock, calling ack_bad_irq() due to ->handler
already having got set to &no_irq_type, it doesn't entirely close the
window. While the issue is cosmetic only (the IRQ raised here has no
meaning anymore as it is being torn down, and hence the message
issued from ack_bad_irq() is merely a false positive), I think that this
could be fixed by transitioning desc->arch.used to IRQ_RESERVED
(instead of IRQ_UNUSED) in __clear_irq_vector(), and deferring both
the resetting of desc->handler and the setting of desc->arch.used to
IRQ_UNUSED to an RCU callback.
> As to the changed locking: Invoking clear_irq_vector() with desc->lock
> held is okay because vector_lock already nests inside desc->lock (proven
> by set_desc_affinity(), which takes vector_lock and gets called from
> various desc->handler->ack implementations, getting invoked with
> desc->lock held).
The locking around vector management appears to be a wider issue:
Changes to IRQ descriptors generally are to be protected by holding
desc->lock, yet all of the vector management is only guarded by
vector_lock. The patch here addresses this for destroy_irq()'s
invocation of clear_irq_vector(), but this apparently would need
taking care of on all other affected code paths.
Andrew - please double check I didn't forget any of the aspects we
discussed yesterday.
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-29 7:23 ` Jan Beulich
@ 2013-05-29 22:17 ` Andrew Cooper
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-05-29 22:17 UTC (permalink / raw)
To: Jan Beulich; +Cc: George Dunlap, Keir (Xen.org), xen-devel
On 29/05/2013 08:23, Jan Beulich wrote:
>>>> On 29.05.13 at 08:58, "Jan Beulich" <JBeulich@suse.com> wrote:
>> The fix for XSA-36, switching the default of vector map management to
>> be per-device, exposed more readily a problem with the cleanup of these
>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>> keeps the subsequently invoked clear_irq_vector() from clearing the
>> bits for both the in-use and a possibly still outstanding old vector.
> This resulted from Andrew bringing the issue up on security@,
> and there are more problems here which partly he pointed out and
> partly we came to realize during the discussion there.
>
> One issue is that the vector map management is not really taking
> effect for devices using multiple interrupts (at present MSI-X only)
> because the initial vector assignment in create_irq() can't possibly
> take into account the vector map, as that can be (and is being, in
> map_domain_pirq()) set only after that function returns.
>
> Andrew suggests immediately re-assigning the vector in
> map_domain_pirq(), but that would be cumbersome as a "normal"
> re-assignment only takes effect the next time an IRQ actually
> occurs. I therefore think that we should rather suppress the initial
> assignment of a vector, deferring it until after the vector map got
> set.
>
> We figured that this is not security relevant as the respective
> assertion is a debug build only thing (and debug build only issues
> were decided to not be handled via issuing advisories some time
> ago), and the effect this guards against (improperly set up IRTEs
> on AMD IOMMU) is only affecting the guest itself (using - until the
> multi-vector MSI series goes in - solely vector based indexing): A
> second CPU getting the same vector allocated as is in use for
> another IRQ of the same device on another CPU, this would simply
> break the IRQs raised to that guest.
>
>> While at it, also defer resetting of desc->handler until after the loop
>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>> (mostly theoretical) issue with the interaction with do_IRQ(): If we
>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>> ->ack() and ->end() with different ->handler pointers, potentially
>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>> error paths.
> While this change also reduces the chances of an IRQ getting raised
> along with destroy_irq() getting invoked and, due to do_IRQ() losing
> the race for desc->lock, calling ack_bad_irq() due to ->handler
> already having got set to &no_irq_type, it doesn't entirely close the
> window. While the issue is cosmetic only (the IRQ raised here has no
> meaning anymore as it is being torn down, and hence the message
> issued from ack_bad_irq() is merely a false positive), I think that this
> could be fixed by transitioning desc->arch.used to IRQ_RESERVED
> (instead of IRQ_UNUSED) in __clear_irq_vector(), and deferring both
> the resetting of desc->handler and the setting of desc->arch.used to
> IRQ_UNUSED to an RCU callback.
>
>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>> held is okay because vector_lock already nests inside desc->lock (proven
>> by set_desc_affinity(), which takes vector_lock and gets called from
>> various desc->handler->ack implementations, getting invoked with
>> desc->lock held).
> The locking around vector management appears to be a wider issue:
> Changes to IRQ descriptors generally are to be protected by holding
> desc->lock, yet all of the vector management is only guarded by
> vector_lock. The patch here addresses this for destroy_irq()'s
> invocation of clear_irq_vector(), but this apparently would need
> taking care of on all other affected code paths.
>
> Andrew - please double check I didn't forget any of the aspects we
> discussed yesterday.
>
> Jan
>
I believe that covers the issues we discovered. I will be back in the
office properly on Friday and can see about addressing the issues not
addressed so far in this patch.
As for the patch itself,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
As for the further points.
I am not sure the RCU stuff is sufficient. There can be an unknown
number of irqs in transit somewhere in the hardware between a
destroy_irq() running ->shutdown() and desc->handler being replaced with
&no_irq_type.
A grace time long enough to be fairly sure that any irqs-in-transit are
delivered will remove the chance of false-positives in ack_none(), at
the risk of some false negatives slipping by. I suppose another trick
with -irq could be used to distinguish between a late
irq-in-transit and a real spurious irq.
~Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-29 6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29 7:23 ` Jan Beulich
@ 2013-05-29 7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2 siblings, 0 replies; 9+ messages in thread
From: Keir Fraser @ 2013-05-29 7:29 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser
On 29/05/2013 07:58, "Jan Beulich" <JBeulich@suse.com> wrote:
> The fix for XSA-36, switching the default of vector map management to
> be per-device, exposed more readily a problem with the cleanup of these
> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
> keeps the subsequently invoked clear_irq_vector() from clearing the
> bits for both the in-use and a possibly still outstanding old vector.
>
> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
> its only caller, deferring the clearing of the vector map pointer until
> after clear_irq_vector().
>
> While at it, also defer resetting of desc->handler until after the loop
> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
> (mostly theoretical) issue with the interaction with do_IRQ(): If we
> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
> ->ack() and ->end() with different ->handler pointers, potentially
> leading to an IRQ remaining un-acked. The issue is mostly theoretical
> because non-guest IRQs are subject to destroy_irq() only on (boot time)
> error paths.
>
> As to the changed locking: Invoking clear_irq_vector() with desc->lock
> held is okay because vector_lock already nests inside desc->lock (proven
> by set_desc_affinity(), which takes vector_lock and gets called from
> various desc->handler->ack implementations, getting invoked with
> desc->lock held).
>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -197,12 +197,24 @@ int create_irq(int node)
> return irq;
> }
>
> -static void dynamic_irq_cleanup(unsigned int irq)
> +void destroy_irq(unsigned int irq)
> {
> struct irq_desc *desc = irq_to_desc(irq);
> unsigned long flags;
> struct irqaction *action;
>
> + BUG_ON(!MSI_IRQ(irq));
> +
> + if ( dom0 )
> + {
> + int err = irq_deny_access(dom0, irq);
> +
> + if ( err )
> + printk(XENLOG_G_ERR
> + "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> + irq, err);
> + }
> +
> spin_lock_irqsave(&desc->lock, flags);
> desc->status |= IRQ_DISABLED;
> desc->status &= ~IRQ_GUEST;
> @@ -210,16 +222,19 @@ static void dynamic_irq_cleanup(unsigned
> action = desc->action;
> desc->action = NULL;
> desc->msi_desc = NULL;
> - desc->handler = &no_irq_type;
> - desc->arch.used_vectors = NULL;
> cpumask_setall(desc->affinity);
> spin_unlock_irqrestore(&desc->lock, flags);
>
> /* Wait to make sure it's not being used on another CPU */
> do { smp_mb(); } while ( desc->status & IRQ_INPROGRESS );
>
> - if (action)
> - xfree(action);
> + spin_lock_irqsave(&desc->lock, flags);
> + desc->handler = &no_irq_type;
> + clear_irq_vector(irq);
> + desc->arch.used_vectors = NULL;
> + spin_unlock_irqrestore(&desc->lock, flags);
> +
> + xfree(action);
> }
>
> static void __clear_irq_vector(int irq)
> @@ -286,24 +301,6 @@ void clear_irq_vector(int irq)
> spin_unlock_irqrestore(&vector_lock, flags);
> }
>
> -void destroy_irq(unsigned int irq)
> -{
> - BUG_ON(!MSI_IRQ(irq));
> -
> - if ( dom0 )
> - {
> - int err = irq_deny_access(dom0, irq);
> -
> - if ( err )
> - printk(XENLOG_G_ERR
> - "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> - irq, err);
> - }
> -
> - dynamic_irq_cleanup(irq);
> - clear_irq_vector(irq);
> -}
> -
> int irq_to_vector(int irq)
> {
> int vector = -1;
>
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-29 6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29 7:23 ` Jan Beulich
2013-05-29 7:29 ` Keir Fraser
@ 2013-05-30 16:23 ` George Dunlap
2013-05-30 16:42 ` Jan Beulich
2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2013-05-30 16:23 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Keir Fraser, xen-devel
On 05/29/2013 07:58 AM, Jan Beulich wrote:
> The fix for XSA-36, switching the default of vector map management to
> be per-device, exposed more readily a problem with the cleanup of these
> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
> keeps the subsequently invoked clear_irq_vector() from clearing the
> bits for both the in-use and a possibly still outstanding old vector.
>
> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
> its only caller, deferring the clearing of the vector map pointer until
> after clear_irq_vector().
>
> While at it, also defer resetting of desc->handler until after the loop
> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
> (mostly theoretical) issue with the interaction with do_IRQ(): If we
> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
> ->ack() and ->end() with different ->handler pointers, potentially
> leading to an IRQ remaining un-acked. The issue is mostly theoretical
> because non-guest IRQs are subject to destroy_irq() only on (boot time)
> error paths.
>
> As to the changed locking: Invoking clear_irq_vector() with desc->lock
> held is okay because vector_lock already nests inside desc->lock (proven
> by set_desc_affinity(), which takes vector_lock and gets called from
> various desc->handler->ack implementations, getting invoked with
> desc->lock held).
>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
How big of an impact is this bug? How many people are actually affected
by it?
It's a bit hard for me to tell from the description, but it looks like
it's code motion, then some "theoretical" issues.
Remember our three goals:
- A bug-free release
- An awesome release
- An on-time release
Is the improvement this patch represents worth the potential risk of
bugs at this point?
-George
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-30 16:23 ` George Dunlap
@ 2013-05-30 16:42 ` Jan Beulich
2013-05-30 16:51 ` George Dunlap
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2013-05-30 16:42 UTC (permalink / raw)
To: george.dunlap; +Cc: andrew.cooper3, keir, xen-devel
>>> George Dunlap <george.dunlap@eu.citrix.com> 05/30/13 6:23 PM >>>
>On 05/29/2013 07:58 AM, Jan Beulich wrote:
>> The fix for XSA-36, switching the default of vector map management to
>> be per-device, exposed more readily a problem with the cleanup of these
>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>> keeps the subsequently invoked clear_irq_vector() from clearing the
>> bits for both the in-use and a possibly still outstanding old vector.
>>
>> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
>> its only caller, deferring the clearing of the vector map pointer until
>> after clear_irq_vector().
>>
>> While at it, also defer resetting of desc->handler until after the loop
>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>> (mostly theoretical) issue with the interaction with do_IRQ(): If we
>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>> ->ack() and ->end() with different ->handler pointers, potentially
>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>> error paths.
>>
>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>> held is okay because vector_lock already nests inside desc->lock (proven
>> by set_desc_affinity(), which takes vector_lock and gets called from
>> various desc->handler->ack implementations, getting invoked with
>> desc->lock held).
>>
>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
>How big of an impact is this bug? How many people are actually affected
>by it?
Andrew will likely be able to give you more precise info on this, but this
fixes a problem observed in practice. Any AMD system with IOMMU would
be affected.
>It's a bit hard for me to tell from the description, but it looks like
>it's code motion, then some "theoretical" issues.
No, the description is pretty precise here: It fixes an actual issue and,
along the way, also a theoretical one.
>Is the improvement this patch represents worth the potential risk of
>bugs at this point?
I think so - otherwise it would need to be backported right away after the
release.
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-30 16:42 ` Jan Beulich
@ 2013-05-30 16:51 ` George Dunlap
2013-05-30 17:22 ` Andrew Cooper
2013-05-31 6:36 ` Jan Beulich
0 siblings, 2 replies; 9+ messages in thread
From: George Dunlap @ 2013-05-30 16:51 UTC (permalink / raw)
To: Jan Beulich; +Cc: andrew.cooper3, keir, xen-devel
On 05/30/2013 05:42 PM, Jan Beulich wrote:
>>>> George Dunlap <george.dunlap@eu.citrix.com> 05/30/13 6:23 PM >>>
>> On 05/29/2013 07:58 AM, Jan Beulich wrote:
>>> The fix for XSA-36, switching the default of vector map management to
>>> be per-device, exposed more readily a problem with the cleanup of these
>>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>>> keeps the subsequently invoked clear_irq_vector() from clearing the
>>> bits for both the in-use and a possibly still outstanding old vector.
>>>
>>> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
>>> its only caller, deferring the clearing of the vector map pointer until
>>> after clear_irq_vector().
>>>
>>> While at it, also defer resetting of desc->handler until after the loop
>>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>>> (mostly theoretical) issue with the interaction with do_IRQ(): If we
>>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>>> ->ack() and ->end() with different ->handler pointers, potentially
>>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>>> error paths.
>>>
>>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>>> held is okay because vector_lock already nests inside desc->lock (proven
>>> by set_desc_affinity(), which takes vector_lock and gets called from
>>> various desc->handler->ack implementations, getting invoked with
>>> desc->lock held).
>>>
>>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> How big of an impact is this bug? How many people are actually affected
>> by it?
>
> Andrew will likely be able to give you more precise info on this, but this
> fixes a problem observed in practice. Any AMD system with IOMMU would
> be affected.
>
>> It's a bit hard for me to tell from the description, but it looks like
>> it's code motion, then some "theoretical" issues.
>
> No, the description is pretty precise here: It fixes an actual issue and,
> along the way, also a theoretical one.
>
>> Is the improvement this patch represents worth the potential risk of
>> bugs at this point?
>
> I think so - otherwise it would need to be backported right away after the
> release.
Right -- then if you could also commit this tomorrow, it will get the
best testing we can give it. :-)
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
-George
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-30 16:51 ` George Dunlap
@ 2013-05-30 17:22 ` Andrew Cooper
2013-05-31 6:36 ` Jan Beulich
1 sibling, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2013-05-30 17:22 UTC (permalink / raw)
To: George Dunlap; +Cc: Keir (Xen.org), Jan Beulich, xen-devel@lists.xen.org
On 30/05/2013 17:51, George Dunlap wrote:
> On 05/30/2013 05:42 PM, Jan Beulich wrote:
>>>>> George Dunlap <george.dunlap@eu.citrix.com> 05/30/13 6:23 PM >>>
>>> On 05/29/2013 07:58 AM, Jan Beulich wrote:
>>>> The fix for XSA-36, switching the default of vector map management to
>>>> be per-device, exposed more readily a problem with the cleanup of these
>>>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>>>> keeps the subsequently invoked clear_irq_vector() from clearing the
>>>> bits for both the in-use and a possibly still outstanding old vector.
>>>>
>>>> Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
>>>> its only caller, deferring the clearing of the vector map pointer until
>>>> after clear_irq_vector().
>>>>
>>>> While at it, also defer resetting of desc->handler until after the loop
>>>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>>>> (mostly theoretical) issue with the interaction with do_IRQ(): If we
>>>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>>>> ->ack() and ->end() with different ->handler pointers, potentially
>>>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>>>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>>>> error paths.
>>>>
>>>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>>>> held is okay because vector_lock already nests inside desc->lock (proven
>>>> by set_desc_affinity(), which takes vector_lock and gets called from
>>>> various desc->handler->ack implementations, getting invoked with
>>>> desc->lock held).
>>>>
>>>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> How big of an impact is this bug? How many people are actually affected
>>> by it?
>> Andrew will likely be able to give you more precise info on this, but this
>> fixes a problem observed in practice. Any AMD system with IOMMU would
>> be affected.
>>
>>> It's a bit hard for me to tell from the description, but it looks like
>>> it's code motion, then some "theoretical" issues.
>> No, the description is pretty precise here: It fixes an actual issue and,
>> along the way, also a theoretical one.
>>
>>> Is the improvement this patch represents worth the potential risk of
>>> bugs at this point?
>> I think so - otherwise it would need to be backported right away after the
>> release.
> Right -- then if you could also commit this tomorrow, it will get the
> best testing we can give it. :-)
>
> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> -George
As for the impact without this patch.
Following XSA-36, any AMD boxes with IOMMUs will either fail with an
ASSERT() or incorrectly program their interrupt remapping tables after
you map/unmap MSI/MSI-X irqs a few times.
Basically, the desc->arch.used_vectors steadily accumulated history
until an ASSERT(!test_bit ...) failed.
This is because the per-device vector table code was broken right from
the start, but wasn't observed because global was the default.
~Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix ordering of operations in destroy_irq()
2013-05-30 16:51 ` George Dunlap
2013-05-30 17:22 ` Andrew Cooper
@ 2013-05-31 6:36 ` Jan Beulich
1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2013-05-31 6:36 UTC (permalink / raw)
To: George Dunlap; +Cc: andrew.cooper3, keir, xen-devel
>>> On 30.05.13 at 18:51, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> Right -- then if you could also commit this tomorrow, it will get the
> best testing we can give it. :-)
Done.
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
Thread overview: 9+ messages
2013-05-29 6:58 [PATCH] x86: fix ordering of operations in destroy_irq() Jan Beulich
2013-05-29 7:23 ` Jan Beulich
2013-05-29 22:17 ` Andrew Cooper
2013-05-29 7:29 ` Keir Fraser
2013-05-30 16:23 ` George Dunlap
2013-05-30 16:42 ` Jan Beulich
2013-05-30 16:51 ` George Dunlap
2013-05-30 17:22 ` Andrew Cooper
2013-05-31 6:36 ` Jan Beulich