Xen crash on dom0 shutdown

All of lore.kernel.org
 help / color / mirror / Atom feed

* Xen crash on dom0 shutdown
@ 2008-09-23 10:34 Jan Beulich
  2008-09-23 11:27 ` Keir Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2008-09-23 10:34 UTC (permalink / raw)
  To: xen-devel

There is a BUG_ON() at xen/arch/x86/physdev.c:169 which appears to
be dependent upon guest behavior (should close event channel before
un-mapping pirq), rather than on internal hypervisor state. In 2.6.18,
this likely goes unnoticed because pci_device_shutdown() only calls all
the driver shutdown routines. In newer kernels, however, it also calls
pci_msi_shutdown() and pci_msix_shutdown(), which remove all pirq
mappings. If now (which commonly appears to be the case for storage
drivers) an MSI/MSI-X driver has no shutdown handler (or one that
doesn't do much, because on native this is not causing any problems),
the assumption in Xen is violated and the hypervisor crashes.

Simply removing the BUG_ON() seems inappropriate, though. But I'm
uncertain whether it would be reasonable to call pirq_guest_unbind()
instead of the BUG_ON(), and if so, how to properly deal with
irq_desc[vector].lock (the immediate idea would be to factor out the
locking into a wrapper function, but an alternative would be to use
recursive locks, and perhaps there are other possibilities).

Thanks for any hints, Jan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Xen crash on dom0 shutdown
  2008-09-23 10:34 Xen crash on dom0 shutdown Jan Beulich
@ 2008-09-23 11:27 ` Keir Fraser
  2008-09-24  8:59   ` [PATCH] " Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Keir Fraser @ 2008-09-23 11:27 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Jiang, Yunhong, Shan, Haitao

On 23/9/08 11:34, "Jan Beulich" <jbeulich@novell.com> wrote:

> Simply removing the BUG_ON() seems inappropriate, though. But I'm
> uncertain whether it would be reasonable to call pirq_guest_unbind()
> instead of the BUG_ON(), and if so, how to properly deal with
> irq_desc[vector].lock (the immediate idea would be to factor out the
> locking into a wrapper function, but an alternative would be to use
> recursive locks, and perhaps there are other possibilities).

Well, this hypercall doesn't do pirq_guest_unbind() on IO-APIC-routed
interrupts either, so I think the problem is wider ranging than just MSI
interrupts. Consider unmap_irq() followed by pirq_guest_unbind() later.
We'll BUG_ON(vector == 0) in the latter function. Looks a bit of a mess to
me. I would say that if there are bindings remaining we should re-direct the
event-channel to a 'no-op' pirq (e.g., -1) and indeed call
pirq_guest_unbind() or similar.

 -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-23 11:27 ` Keir Fraser
@ 2008-09-24  8:59   ` Jan Beulich
  2008-09-24  9:13     ` Keir Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2008-09-24  8:59 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Yunhong Jiang, Haitao Shan, xen-devel

>>> Keir Fraser <keir.fraser@eu.citrix.com> 23.09.08 13:27 >>>
>On 23/9/08 11:34, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>> Simply removing the BUG_ON() seems inappropriate, though. But I'm
>> uncertain whether it would be reasonable to call pirq_guest_unbind()
>> instead of the BUG_ON(), and if so, how to properly deal with
>> irq_desc[vector].lock (the immediate idea would be to factor out the
>> locking into a wrapper function, but an alternative would be to use
>> recursive locks, and perhaps there are other possibilities).
>
>Well, this hypercall doesn't do pirq_guest_unbind() on IO-APIC-routed
>interrupts either, so I think the problem is wider ranging than just MSI
>interrupts. Consider unmap_irq() followed by pirq_guest_unbind() later.
>We'll BUG_ON(vector == 0) in the latter function. Looks a bit of a mess to
>me. I would say that if there are bindings remaining we should re-direct the
>event-channel to a 'no-op' pirq (e.g., -1) and indeed call
>pirq_guest_unbind() or similar.

How about this one? It doesn't exactly follow the path you suggested,
i.e. doesn't mess with event channels, but rather marks the
irq<->vector mapping as invalid, allowing only a subsequent event
channel unbind (i.e. close) to recover from that state (which seems
better in terms of requiring proper discipline in the guest, as it
prevents re-using the irq or vector without properly cleaning up).

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: 2008-09-19/xen/arch/x86/io_apic.c
===================================================================
--- 2008-09-19.orig/xen/arch/x86/io_apic.c	2008-09-17 09:26:41.000000000 +0200
+++ 2008-09-19/xen/arch/x86/io_apic.c	2008-09-24 09:19:41.000000000 +0200
@@ -721,7 +721,6 @@ next:
 
 static struct hw_interrupt_type ioapic_level_type;
 static struct hw_interrupt_type ioapic_edge_type;
-struct hw_interrupt_type pci_msi_type;
 
 #define IOAPIC_AUTO	-1
 #define IOAPIC_EDGE	0
Index: 2008-09-19/xen/arch/x86/irq.c
===================================================================
--- 2008-09-19.orig/xen/arch/x86/irq.c	2008-09-24 09:17:33.000000000 +0200
+++ 2008-09-19/xen/arch/x86/irq.c	2008-09-23 15:26:26.000000000 +0200
@@ -343,6 +343,8 @@ static void __pirq_guest_eoi(struct doma
     int                 vector;
 
     vector = domain_irq_to_vector(d, irq);
+    if ( vector < 0 )
+        return;
     desc   = &irq_desc[vector];
     action = (irq_guest_action_t *)desc->action;
 
@@ -418,7 +420,7 @@ int pirq_acktype(struct domain *d, int i
     unsigned int vector;
 
     vector = domain_irq_to_vector(d, irq);
-    if ( vector == 0 )
+    if ( vector <= 0 )
         return ACKTYPE_NONE;
 
     desc = &irq_desc[vector];
@@ -469,7 +471,7 @@ int pirq_shared(struct domain *d, int ir
     int                 shared;
 
     vector = domain_irq_to_vector(d, irq);
-    if ( vector == 0 )
+    if ( vector <= 0 )
         return 0;
 
     desc = &irq_desc[vector];
@@ -493,7 +495,7 @@ int pirq_guest_bind(struct vcpu *v, int 
 
  retry:
     vector = domain_irq_to_vector(v->domain, irq);
-    if ( vector == 0 )
+    if ( vector <= 0 )
         return -EINVAL;
 
     desc = &irq_desc[vector];
@@ -575,20 +577,15 @@ int pirq_guest_bind(struct vcpu *v, int 
     return rc;
 }
 
-void pirq_guest_unbind(struct domain *d, int irq)
+void __pirq_guest_unbind(struct domain *d, int irq, unsigned int vector,
+                         unsigned long flags)
 {
-    unsigned int        vector;
-    irq_desc_t         *desc;
+    irq_desc_t         *desc = &irq_desc[vector];
     irq_guest_action_t *action;
     cpumask_t           cpu_eoi_map;
-    unsigned long       flags;
     int                 i;
 
-    vector = domain_irq_to_vector(d, irq);
-    desc = &irq_desc[vector];
-    BUG_ON(vector == 0);
-
-    spin_lock_irqsave(&desc->lock, flags);
+    ASSERT(spin_is_locked(&desc->lock));
 
     action = (irq_guest_action_t *)desc->action;
 
@@ -626,7 +623,7 @@ void pirq_guest_unbind(struct domain *d,
     BUG_ON(test_bit(irq, d->pirq_mask));
 
     if ( action->nr_guests != 0 )
-        goto out;
+        return;
 
     BUG_ON(action->in_flight != 0);
 
@@ -659,9 +656,26 @@ void pirq_guest_unbind(struct domain *d,
     desc->status &= ~IRQ_INPROGRESS;
     kill_timer(&irq_guest_eoi_timer[vector]);
     desc->handler->shutdown(vector);
+}
 
- out:
-    spin_unlock_irqrestore(&desc->lock, flags);    
+void pirq_guest_unbind(struct domain *d, int irq)
+{
+    int vector = domain_irq_to_vector(d, irq);
+    unsigned long flags;
+
+    if ( likely(vector > 0) )
+    {
+        spin_lock_irqsave(&irq_desc[vector].lock, flags);
+        __pirq_guest_unbind(d, irq, vector, flags);
+        spin_unlock_irqrestore(&irq_desc[vector].lock, flags);
+    }
+    else
+    {
+        BUG_ON(vector == 0);
+        spin_lock_irqsave(&d->arch.irq_lock, flags);
+        d->arch.pirq_vector[irq] = d->arch.vector_pirq[-vector] = 0;
+        spin_unlock_irqrestore(&d->arch.irq_lock, flags);
+    }
 }
 
 extern void dump_ioapic_irq_info(void);
Index: 2008-09-19/xen/arch/x86/msi.c
===================================================================
--- 2008-09-19.orig/xen/arch/x86/msi.c	2008-08-15 16:18:55.000000000 +0200
+++ 2008-09-19/xen/arch/x86/msi.c	2008-09-24 09:19:47.000000000 +0200
@@ -727,7 +727,6 @@ void pci_disable_msi(int vector)
         __pci_disable_msix(vector);
 }
 
-extern struct hw_interrupt_type pci_msi_type;
 static void msi_free_vectors(struct pci_dev* dev)
 {
     struct msi_desc *entry, *tmp;
Index: 2008-09-19/xen/arch/x86/physdev.c
===================================================================
--- 2008-09-19.orig/xen/arch/x86/physdev.c	2008-09-24 09:17:33.000000000 +0200
+++ 2008-09-19/xen/arch/x86/physdev.c	2008-09-24 09:19:57.000000000 +0200
@@ -27,8 +27,6 @@ ioapic_guest_write(
     unsigned long physbase, unsigned int reg, u32 pval);
 
 
-extern struct hw_interrupt_type pci_msi_type;
-
 static int get_free_pirq(struct domain *d, int type, int index)
 {
     int i;
@@ -150,7 +148,7 @@ static int unmap_domain_pirq(struct doma
 
     vector = d->arch.pirq_vector[pirq];
 
-    if ( !vector )
+    if ( vector <= 0 )
     {
         gdprintk(XENLOG_G_ERR, "domain %X: pirq %x not mapped still\n",
                  d->domain_id, pirq);
@@ -160,18 +158,30 @@ static int unmap_domain_pirq(struct doma
 
     desc = &irq_desc[vector];
     spin_lock_irqsave(&desc->lock, flags);
+
+    if ( desc->status & IRQ_GUEST )
+    {
+        dprintk(XENLOG_G_WARNING, "dom%d: forcing unbind of pirq %d\n",
+                d->domain_id, pirq);
+        __pirq_guest_unbind(d, pirq, vector, flags);
+        vector = -vector;
+    }
+
     if ( desc->msi_desc )
         pci_disable_msi(vector);
 
     if ( desc->handler == &pci_msi_type )
-    {
-        /* MSI is not shared, so should be released already */
-        BUG_ON(desc->status & IRQ_GUEST);
-        irq_desc[vector].handler = &no_irq_type;
-    }
+        desc->handler = &no_irq_type;
+
     spin_unlock_irqrestore(&desc->lock, flags);
 
-    d->arch.pirq_vector[pirq] = d->arch.vector_pirq[vector] = 0;
+    if ( vector > 0 )
+        d->arch.pirq_vector[pirq] = d->arch.vector_pirq[vector] = 0;
+    else
+    {
+        d->arch.pirq_vector[pirq] = vector;
+        d->arch.vector_pirq[-vector] = -pirq;
+    }
 
     ret = irq_deny_access(d, pirq);
     if ( ret )
@@ -244,7 +254,7 @@ static int physdev_map_pirq(struct physd
     }
 
     spin_lock_irqsave(&d->arch.irq_lock, flags);
-    if ( map->pirq == -1 )
+    if ( map->pirq < 0 )
     {
         if ( d->arch.vector_pirq[vector] )
         {
@@ -252,6 +262,11 @@ static int physdev_map_pirq(struct physd
                                     map->index, map->pirq,
                                     d->arch.vector_pirq[vector]);
             pirq = d->arch.vector_pirq[vector];
+            if ( pirq < 0 )
+            {
+                ret = -EBUSY;
+                goto done;
+            }
         }
         else
         {
Index: 2008-09-19/xen/include/asm-x86/irq.h
===================================================================
--- 2008-09-19.orig/xen/include/asm-x86/irq.h	2008-09-24 09:17:33.000000000 +0200
+++ 2008-09-19/xen/include/asm-x86/irq.h	2008-09-23 15:25:53.000000000 +0200
@@ -52,6 +52,10 @@ extern atomic_t irq_mis_count;
 int pirq_acktype(struct domain *d, int irq);
 int pirq_shared(struct domain *d , int irq);
 
+/* May only be called be irq_desc[vector].lock held. */
+void __pirq_guest_unbind(struct domain *d, int irq, unsigned int vector,
+                         unsigned long flags);
+
 extern int domain_irq_to_vector(struct domain *d, int irq);
 extern int domain_vector_to_irq(struct domain *d, int vector);
 #endif /* _ASM_HW_IRQ_H */
Index: 2008-09-19/xen/include/asm-x86/msi.h
===================================================================
--- 2008-09-19.orig/xen/include/asm-x86/msi.h	2008-08-15 16:18:55.000000000 +0200
+++ 2008-09-19/xen/include/asm-x86/msi.h	2008-09-24 09:21:44.000000000 +0200
@@ -106,7 +106,7 @@ struct msi_desc {
  */
 #define NR_HP_RESERVED_VECTORS 	20
 
-extern int vector_irq[NR_VECTORS];
+extern struct hw_interrupt_type pci_msi_type;
 
 /*
  * MSI-X Address Register

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-24  8:59   ` [PATCH] " Jan Beulich
@ 2008-09-24  9:13     ` Keir Fraser
  2008-09-24 11:31       ` Jiang, Yunhong
  2008-09-24 11:45       ` Keir Fraser
  0 siblings, 2 replies; 10+ messages in thread
From: Keir Fraser @ 2008-09-24  9:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Yunhong Jiang, Haitao Shan, xen-devel

On 24/9/08 09:59, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Well, this hypercall doesn't do pirq_guest_unbind() on IO-APIC-routed
>> interrupts either, so I think the problem is wider ranging than just MSI
>> interrupts. Consider unmap_irq() followed by pirq_guest_unbind() later.
>> We'll BUG_ON(vector == 0) in the latter function. Looks a bit of a mess to
>> me. I would say that if there are bindings remaining we should re-direct the
>> event-channel to a 'no-op' pirq (e.g., -1) and indeed call
>> pirq_guest_unbind() or similar.
> 
> How about this one? It doesn't exactly follow the path you suggested,
> i.e. doesn't mess with event channels, but rather marks the
> irq<->vector mapping as invalid, allowing only a subsequent event
> channel unbind (i.e. close) to recover from that state (which seems
> better in terms of requiring proper discipline in the guest, as it
> prevents re-using the irq or vector without properly cleaning up).

Yeah, this is the kind of thing I had in mind. I will work on this a bit
more (e.g., need to synchronise on d->evtchn_lock to avoid racing
EVTCHNOP_bind_pirq; also I'm afraid about leaking MSI vectors on domain
destruction). Thanks.

 -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-24  9:13     ` Keir Fraser
@ 2008-09-24 11:31       ` Jiang, Yunhong
  2008-09-24 11:45       ` Keir Fraser
  1 sibling, 0 replies; 10+ messages in thread
From: Jiang, Yunhong @ 2008-09-24 11:31 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich; +Cc: xen-devel@lists.xensource.com, Shan, Haitao

xen-devel-bounces@lists.xensource.com <> wrote:
> On 24/9/08 09:59, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> Well, this hypercall doesn't do pirq_guest_unbind() on IO-APIC-routed
>>> interrupts either, so I think the problem is wider ranging than just MSI
>>> interrupts. Consider unmap_irq() followed by pirq_guest_unbind() later.
>>> We'll BUG_ON(vector == 0) in the latter function. Looks a bit of a mess to
>>> me. I would say that if there are bindings remaining we should re-direct
>>> the event-channel to a 'no-op' pirq (e.g., -1) and indeed call
>>> pirq_guest_unbind() or similar.
>>
>> How about this one? It doesn't exactly follow the path you suggested,
>> i.e. doesn't mess with event channels, but rather marks the
>> irq<->vector mapping as invalid, allowing only a subsequent event
>> channel unbind (i.e. close) to recover from that state (which seems
>> better in terms of requiring proper discipline in the guest, as it
>> prevents re-using the irq or vector without properly cleaning up).
>
> Yeah, this is the kind of thing I had in mind. I will work on
> this a bit
> more (e.g., need to synchronise on d->evtchn_lock to avoid racing
> EVTCHNOP_bind_pirq; also I'm afraid about leaking MSI vectors on domain
> destruction). Thanks.

Sorry to notice that the vector is not managed at all :$
Currently assign_irq_vector() only checks IO_APIC_VECTOR(irq), while for AUTO_ASSIGN situation, there is no management at all.
I'm considering if we can check the irq_desc[vector]'s handler to see if the vector has assigned or not.

Also noticed following snipt in  setupOneDevice in python/xen/xend/server/pciif.py, I suspect it should have less space before it. Also maybe now it is better to be placed under pciback.
           rc = xc.physdev_map_pirq(domid = fe_domid,
                                   index = dev.irq,
                                   pirq  = dev.irq)

>
> -- Keir
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-24  9:13     ` Keir Fraser
  2008-09-24 11:31       ` Jiang, Yunhong
@ 2008-09-24 11:45       ` Keir Fraser
  2008-09-25 12:42         ` Shan, Haitao
  1 sibling, 1 reply; 10+ messages in thread
From: Keir Fraser @ 2008-09-24 11:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Yunhong Jiang, Haitao Shan, xen-devel

On 24/9/08 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> How about this one? It doesn't exactly follow the path you suggested,
>> i.e. doesn't mess with event channels, but rather marks the
>> irq<->vector mapping as invalid, allowing only a subsequent event
>> channel unbind (i.e. close) to recover from that state (which seems
>> better in terms of requiring proper discipline in the guest, as it
>> prevents re-using the irq or vector without properly cleaning up).
> 
> Yeah, this is the kind of thing I had in mind. I will work on this a bit more
> (e.g., need to synchronise on d->evtchn_lock to avoid racing
> EVTCHNOP_bind_pirq; also I'm afraid about leaking MSI vectors on domain
> destruction). Thanks.

Changeset 18539. I made the locking quite a bit stricter, by getting rid of
'irq_lock' and instead using 'evtchn_lock' (which may be better renamed now
to 'event_lock' since it isn't protecting just event channels now). Now the
pirq-to-vector mapping is protected by both evtchn_lock and irq_desc->lock.
A user of the mapping can protect themselves with either lock (whichever is
most convenient).

Some TODOs:
 * There is no management of MSI vectors. They always leak! I didn't fix
that here since it isn't a mere synchronisation race; the code just isn't
there.
 * I decided to WARN_ON(!spin_is_locked(&d->evtchn_lock)) in
pirq_guest_[un]bind(). The reason is that in any case those functions do not
expect to be re-entered -- they really want to be per-domain serialised.
Furthermore I am pretty certain that the HVM passthrough code is not
synchronising properly with changes to the pirq-to-vector mapping (it uses
domain_irq_to_vector() in many places with no care for locking) nor is it
synchronised with other users of pirq_guest_bind() --- a reasonable
semantics here would be that a domain pirq can be bound to once, either via
an event channel, or through a virtual PIC in HVM emulation context. I
therefore think that careful locking is required -- it may suffice to get
rid of (or at least make less use of) the hvm_domain.irq_lock and replace
its use with evtchn_lock (only consideration is that the latter is not
IRQ-safe). The WARN_ON() is a nice reminder of work to be done here. ;-)

Comments?

 -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-24 11:45       ` Keir Fraser
@ 2008-09-25 12:42         ` Shan, Haitao
  2008-09-25 13:24           ` Jiang, Yunhong
  2008-09-25 13:39           ` Keir Fraser
  0 siblings, 2 replies; 10+ messages in thread
From: Shan, Haitao @ 2008-09-25 12:42 UTC (permalink / raw)
  To: 'Keir Fraser', 'Jan Beulich'
  Cc: Jiang, Yunhong, 'xen-devel@lists.xensource.com'

Hi, Keir,

As I tested my patch today, I find cs18539 breaks my system. I have to revert this changeset to test MSI.
Here is the output from serial port.
(XEN) ----[ Xen-3.4-unstable  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff828c8013c2f9>] pirq_guest_unbind+0xb9/0x2f0
(XEN) RFLAGS: 0000000000010082   CONTEXT: hypervisor
(XEN) rax: ffff828c80274d80   rbx: 0000000000000000   rcx: 00000000000000d0
(XEN) rdx: ffff83007c60fe38   rsi: 00000000000000ff   rdi: ffff83007c80e080
(XEN) rbp: ffff83007c80e080   rsp: ffff83007c60fe28   r8:  0000000000000282
(XEN) r9:  ffff828c80274da4   r10: ffff828c8026e580   r11: 0000000000000206
(XEN) r12: ffff828c80274d80   r13: 00000000000000d0   r14: 00000000000000ff
(XEN) r15: 00000000000000ff   cr0: 000000008005003b   cr4: 00000000000026b0
(XEN) cr3: 000000005ea9c000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff83007c60fe28:
(XEN)    0000000000000282 ffff83007c80e080 0000000000000282 fffffffffffffffd
(XEN)    ffff83007c80e080 00000000000000ff 00000000000000d0 ffff8800204408b0
(XEN)    00000000000000ff ffff828c80149075 00000008000000ff ffffffff803908d6
(XEN)    828c801a98a0b948 c390ec90d1ffffff ffff83007c80e1f8 ffff83007c60fed8
(XEN)    ffff828c8025f900 ffff83007c80d080 000000ff804e7ff0 ffffffffffffffff
(XEN)    0000000000000000 0000000000000000 000000010000001a 00a0fb000000001a
(XEN)    00000001801fe568 ffff83007d5e6080 00000000000000ff ffff8800204408b0
(XEN)    0000000000000000 ffff8800204408b0 ffff880020440000 ffff828c801a9169

The actually call trace is do_physdev_op->pirq_guest_unbind, which definitely is in unmap_pirq hypercall for MSI interrupt. This will happen when dom0 first unbind guest pirq and then unmap that pirq. I think this little patch will fix the problem. But I am not quite sure whether this will break the intension of cs 18539.

diff -r 7a32c2325fdc xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c        Thu Sep 25 10:26:08 2008 +0100
+++ b/xen/arch/x86/irq.c        Thu Sep 25 21:12:03 2008 +0800
@@ -619,6 +619,8 @@
     }

     action = (irq_guest_action_t *)desc->action;
+       if ( !action )
+               goto out;
     vector = desc - irq_desc;

     for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )

Signed-off-by:  Shan haitao <haitao.shan@intel.com>


Best Regards
Haitao Shan

Keir Fraser wrote:
> On 24/9/08 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>>> How about this one? It doesn't exactly follow the path you
>>> suggested, i.e. doesn't mess with event channels, but rather marks
>>> the irq<->vector mapping as invalid, allowing only a subsequent
>>> event channel unbind (i.e. close) to recover from that state (which
>>> seems better in terms of requiring proper discipline in the guest,
>>> as it prevents re-using the irq or vector without properly cleaning
>>> up).
>>
>> Yeah, this is the kind of thing I had in mind. I will work on this a
>> bit more (e.g., need to synchronise on d->evtchn_lock to avoid racing
>> EVTCHNOP_bind_pirq; also I'm afraid about leaking MSI vectors on
>> domain destruction). Thanks.
>
> Changeset 18539. I made the locking quite a bit stricter, by getting
> rid of 'irq_lock' and instead using 'evtchn_lock' (which may be
> better renamed now to 'event_lock' since it isn't protecting just
> event channels now). Now the pirq-to-vector mapping is protected by
> both evtchn_lock and irq_desc->lock. A user of the mapping can
> protect themselves with either lock (whichever is most convenient).
>
> Some TODOs:
>  * There is no management of MSI vectors. They always leak! I didn't
> fix that here since it isn't a mere synchronisation race; the code
> just isn't there.
>  * I decided to WARN_ON(!spin_is_locked(&d->evtchn_lock)) in
> pirq_guest_[un]bind(). The reason is that in any case those functions
> do not expect to be re-entered -- they really want to be per-domain
> serialised. Furthermore I am pretty certain that the HVM passthrough
> code is not synchronising properly with changes to the pirq-to-vector
> mapping (it uses domain_irq_to_vector() in many places with no care
> for locking) nor is it synchronised with other users of
> pirq_guest_bind() --- a reasonable semantics here would be that a
> domain pirq can be bound to once, either via an event channel, or
> through a virtual PIC in HVM emulation context. I therefore think
> that careful locking is required -- it may suffice to get rid of (or
> at least make less use of) the hvm_domain.irq_lock and replace its
> use with evtchn_lock (only consideration is that the latter is not
> IRQ-safe). The WARN_ON() is a nice reminder of work to be done here.
> ;-)
>
> Comments?
>
>  -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-25 12:42         ` Shan, Haitao
@ 2008-09-25 13:24           ` Jiang, Yunhong
  2008-09-25 13:39           ` Keir Fraser
  1 sibling, 0 replies; 10+ messages in thread
From: Jiang, Yunhong @ 2008-09-25 13:24 UTC (permalink / raw)
  To: Shan, Haitao, 'Keir Fraser', 'Jan Beulich'
  Cc: 'xen-devel@lists.xensource.com'

After the pirq is unbind for MSI, the vector is in fact free and action is freed, but it is not cleared in pirq_to_vector mapping in domain still.

Yunhong Jiang

Shan, Haitao <> wrote:
> Hi, Keir,
>
> As I tested my patch today, I find cs18539 breaks my system. I
> have to revert this changeset to test MSI.
> Here is the output from serial port.
> (XEN) ----[ Xen-3.4-unstable  x86_64  debug=n  Not tainted ]---- (XEN) CPU:
> 2 (XEN) RIP:    e008:[<ffff828c8013c2f9>] pirq_guest_unbind+0xb9/0x2f0
> (XEN) RFLAGS: 0000000000010082   CONTEXT: hypervisor
> (XEN) rax: ffff828c80274d80   rbx: 0000000000000000   rcx: 00000000000000d0
> (XEN) rdx: ffff83007c60fe38   rsi: 00000000000000ff   rdi: ffff83007c80e080
> (XEN) rbp: ffff83007c80e080   rsp: ffff83007c60fe28   r8: 0000000000000282
> (XEN) r9:  ffff828c80274da4   r10: ffff828c8026e580   r11: 0000000000000206
> (XEN) r12: ffff828c80274d80   r13: 00000000000000d0   r14: 00000000000000ff
> (XEN) r15: 00000000000000ff   cr0: 000000008005003b   cr4: 00000000000026b0
> (XEN) cr3: 000000005ea9c000   cr2: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff83007c60fe28:
> (XEN)    0000000000000282 ffff83007c80e080 0000000000000282 fffffffffffffffd
> (XEN)    ffff83007c80e080 00000000000000ff 00000000000000d0 ffff8800204408b0
> (XEN)    00000000000000ff ffff828c80149075 00000008000000ff ffffffff803908d6
> (XEN)    828c801a98a0b948 c390ec90d1ffffff ffff83007c80e1f8 ffff83007c60fed8
> (XEN)    ffff828c8025f900 ffff83007c80d080 000000ff804e7ff0 ffffffffffffffff
> (XEN)    0000000000000000 0000000000000000 000000010000001a 00a0fb000000001a
> (XEN)    00000001801fe568 ffff83007d5e6080 00000000000000ff ffff8800204408b0
> (XEN)    0000000000000000 ffff8800204408b0 ffff880020440000 ffff828c801a9169
>
> The actually call trace is do_physdev_op->pirq_guest_unbind,
> which definitely is in unmap_pirq hypercall for MSI interrupt.
> This will happen when dom0 first unbind guest pirq and then
> unmap that pirq. I think this little patch will fix the
> problem. But I am not quite sure whether this will break the intension of
> cs 18539.
>
> diff -r 7a32c2325fdc xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c        Thu Sep 25 10:26:08 2008 +0100
> +++ b/xen/arch/x86/irq.c        Thu Sep 25 21:12:03 2008 +0800 @@ -619,6
>     +619,8 @@ }
>
>     action = (irq_guest_action_t *)desc->action;
> +       if ( !action )
> +               goto out;
>     vector = desc - irq_desc;
>
>     for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
>
> Signed-off-by:        Shan haitao <haitao.shan@intel.com>
>
>
> Best Regards
> Haitao Shan
>
> Keir Fraser wrote:
>> On 24/9/08 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>>
>>>> How about this one? It doesn't exactly follow the path you
>>>> suggested, i.e. doesn't mess with event channels, but rather marks
>>>> the irq<->vector mapping as invalid, allowing only a subsequent
>>>> event channel unbind (i.e. close) to recover from that state (which
>>>> seems better in terms of requiring proper discipline in the guest,
>>>> as it prevents re-using the irq or vector without properly cleaning
>>>> up).
>>>
>>> Yeah, this is the kind of thing I had in mind. I will work on this a
>>> bit more (e.g., need to synchronise on d->evtchn_lock to avoid racing
>>> EVTCHNOP_bind_pirq; also I'm afraid about leaking MSI vectors on
>>> domain destruction). Thanks.
>>
>> Changeset 18539. I made the locking quite a bit stricter, by getting
>> rid of 'irq_lock' and instead using 'evtchn_lock' (which may be
>> better renamed now to 'event_lock' since it isn't protecting just
>> event channels now). Now the pirq-to-vector mapping is protected by
>> both evtchn_lock and irq_desc->lock. A user of the mapping can
>> protect themselves with either lock (whichever is most convenient).
>>
>> Some TODOs:
>>  * There is no management of MSI vectors. They always leak! I didn't
>> fix that here since it isn't a mere synchronisation race; the code just
>>  isn't there. * I decided to WARN_ON(!spin_is_locked(&d->evtchn_lock)) in
>> pirq_guest_[un]bind(). The reason is that in any case those functions
>> do not expect to be re-entered -- they really want to be per-domain
>> serialised. Furthermore I am pretty certain that the HVM passthrough
>> code is not synchronising properly with changes to the pirq-to-vector
>> mapping (it uses domain_irq_to_vector() in many places with no care
>> for locking) nor is it synchronised with other users of
>> pirq_guest_bind() --- a reasonable semantics here would be that a
>> domain pirq can be bound to once, either via an event channel, or
>> through a virtual PIC in HVM emulation context. I therefore think
>> that careful locking is required -- it may suffice to get rid of (or
>> at least make less use of) the hvm_domain.irq_lock and replace its
>> use with evtchn_lock (only consideration is that the latter is not
>> IRQ-safe). The WARN_ON() is a nice reminder of work to be done here. ;-)
>>
>> Comments?
>>
>>  -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-25 12:42         ` Shan, Haitao
  2008-09-25 13:24           ` Jiang, Yunhong
@ 2008-09-25 13:39           ` Keir Fraser
  2008-09-25 14:17             ` Shan, Haitao
  1 sibling, 1 reply; 10+ messages in thread
From: Keir Fraser @ 2008-09-25 13:39 UTC (permalink / raw)
  To: Shan, Haitao, 'Jan Beulich'
  Cc: Jiang, Yunhong, 'xen-devel@lists.xensource.com'

On 25/9/08 13:42, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> The actually call trace is do_physdev_op->pirq_guest_unbind, which definitely
> is in unmap_pirq hypercall for MSI interrupt. This will happen when dom0 first
> unbind guest pirq and then unmap that pirq. I think this little patch will fix
> the problem. But I am not quite sure whether this will break the intension of
> cs 18539.

Your patch is along the write lines but should actually check for IRQ_GUEST
flag. I've applied a bigger patch as 18547. It also I think simplifies the
logic of 18539 in a number of respects, especially by splitting
pirq_guest_unbind() into two interface functions. Let me know how you get on
with it!

 -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] Re: Xen crash on dom0 shutdown
  2008-09-25 13:39           ` Keir Fraser
@ 2008-09-25 14:17             ` Shan, Haitao
  0 siblings, 0 replies; 10+ messages in thread
From: Shan, Haitao @ 2008-09-25 14:17 UTC (permalink / raw)
  To: 'Keir Fraser', 'Jan Beulich'
  Cc: Jiang, Yunhong, 'xen-devel@lists.xensource.com'

Yes. The problem is fixed now.

Best Regards
Haitao Shan

Keir Fraser wrote:
> On 25/9/08 13:42, "Shan, Haitao" <haitao.shan@intel.com> wrote:
>
>> The actually call trace is do_physdev_op->pirq_guest_unbind, which
>> definitely is in unmap_pirq hypercall for MSI interrupt. This will
>> happen when dom0 first unbind guest pirq and then unmap that pirq. I
>> think this little patch will fix the problem. But I am not quite
>> sure whether this will break the intension of cs 18539.
>
> Your patch is along the write lines but should actually check for
> IRQ_GUEST flag. I've applied a bigger patch as 18547. It also I think
> simplifies the logic of 18539 in a number of respects, especially by
> splitting pirq_guest_unbind() into two interface functions. Let me
> know how you get on with it!
>
>  -- Keir

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-09-25 14:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-23 10:34 Xen crash on dom0 shutdown Jan Beulich
2008-09-23 11:27 ` Keir Fraser
2008-09-24  8:59   ` [PATCH] " Jan Beulich
2008-09-24  9:13     ` Keir Fraser
2008-09-24 11:31       ` Jiang, Yunhong
2008-09-24 11:45       ` Keir Fraser
2008-09-25 12:42         ` Shan, Haitao
2008-09-25 13:24           ` Jiang, Yunhong
2008-09-25 13:39           ` Keir Fraser
2008-09-25 14:17             ` Shan, Haitao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.