All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] x86: properly handle MSI-X unmask operation from guests
@ 2013-11-21 10:51 Wu, Feng
  2013-11-21 11:43 ` Andrew Cooper
  0 siblings, 1 reply; 7+ messages in thread
From: Wu, Feng @ 2013-11-21 10:51 UTC (permalink / raw)
  To: Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org
  Cc: Auld, Will, Nakajima, Jun, Zhang, Xiantao

patch revision history
----------------------
v1: Initial patch to handle this issue involving changing the hypercall interface
v2:Totally handled inside hypervisor.
v3:Change some logics of handling msi-x pending unmask operations.

>From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00 2001
From: Feng Wu <feng.wu@intel.com>
Date: Wed, 13 Nov 2013 21:43:48 -0500
Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests

For a pass-through device with MSI-x capability, when guest tries
to unmask the MSI-x interrupt for the passed through device, xen
doesn't clear the mask bit for MSI-x in hardware in the following
scenario, which will cause network disconnection:

1. Guest masks the MSI-x interrupt
2. Guest updates the address and data for it
3. Guest unmasks the MSI-x interrupt (This is the problematic step)

In the step #3 above, Xen doesn't handle it well. When guest tries
to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
if it notices that address or data has been modified by guest before,
then Qemu will update Xen with the latest value of address/data by
hypercall. However, in this whole process, the MSI-X interrupt unmask
operation is missing, which means Xen doesn't clear the mask bit in
hardware for the MSI-X interrupt, so it remains disabled, that is why
it loses the network connection.

This patch fixes this issue.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/io.c        |    3 +++
 xen/arch/x86/hvm/vmsi.c      |   22 +++++++++++++++++++++-
 xen/include/asm-x86/domain.h |    7 +++++++
 xen/include/xen/pci.h        |    1 +
 4 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index deb7b92..516f0a4 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -297,6 +297,9 @@ void hvm_io_assist(ioreq_t *p)
         break;
     }

+    if (msix_post_handler(curr))
+        gdprintk(XENLOG_ERR, ": msix_post_handler error\n");
+
     if ( p->state == STATE_IOREQ_NONE )
         vcpu_end_shutdown_deferral(curr);
}
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 4826b4a..cd97a3b 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -292,8 +292,11 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
     }

     /* exit to device model if address/data has been modified */
-    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
+    if ( test_and_clear_bit(nr_entry, &entry->table_flags) ) {
+        v->arch.pending_msix_unmask.valid = 1;
+        v->arch.pending_msix_unmask.ctrl_address = address;
         goto out;
+    }

     virt = msixtbl_addr_to_virt(entry, address);
     if ( !virt )
@@ -528,3 +531,20 @@ void msixtbl_pt_cleanup(struct domain *d)
     spin_unlock(&d->arch.hvm_domain.msixtbl_list_lock);
     local_irq_restore(flags);
 }
+
+int msix_post_handler(struct vcpu *v)
+{
+    int rc;
+
+    if (v->arch.pending_msix_unmask.valid == 0)
+        return 0;
+
+    v->arch.pending_msix_unmask.valid = 0;
+
+    rc = msixtbl_write(v, v->arch.pending_msix_unmask.ctrl_address, 4, 0);
+    if (rc != X86EMUL_OKAY)
+        return -1;
+    else
+        return 0;
+}
+
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 9d39061..b3bdfa3 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -375,6 +375,11 @@ struct pv_vcpu
     spinlock_t shadow_ldt_lock;
 };

+struct pending_msix_unmask_info {
+    int valid;
+    unsigned long ctrl_address;
+};
+
 struct arch_vcpu
 {
     /*
@@ -439,6 +444,8 @@ struct arch_vcpu

     /* A secondary copy of the vcpu time info. */
     XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
+
+    struct pending_msix_unmask_info pending_msix_unmask;
 } __cacheline_aligned;

 /* Shorthands to improve code legibility. */
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index cadb525..ce8f6ff 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -147,5 +147,6 @@ struct pirq;
 int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable);
 void msixtbl_pt_unregister(struct domain *, struct pirq *);
 void msixtbl_pt_cleanup(struct domain *d);
+int msix_post_handler(struct vcpu *v);

 #endif /* __XEN_PCI_H__ */
--
1.7.1

Thanks,
Feng

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 10:51 [PATCH v3] x86: properly handle MSI-X unmask operation from guests Wu, Feng
@ 2013-11-21 11:43 ` Andrew Cooper
  2013-11-21 12:13   ` Wu, Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2013-11-21 11:43 UTC (permalink / raw)
  To: Wu, Feng
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org

On 21/11/13 10:51, Wu, Feng wrote:
> patch revision history
> ----------------------
> v1: Initial patch to handle this issue involving changing the hypercall interface
> v2:Totally handled inside hypervisor.
> v3:Change some logics of handling msi-x pending unmask operations.
>
> From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00 2001
> From: Feng Wu <feng.wu@intel.com>
> Date: Wed, 13 Nov 2013 21:43:48 -0500
> Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests
>
> For a pass-through device with MSI-x capability, when guest tries
> to unmask the MSI-x interrupt for the passed through device, xen
> doesn't clear the mask bit for MSI-x in hardware in the following
> scenario, which will cause network disconnection:
>
> 1. Guest masks the MSI-x interrupt
> 2. Guest updates the address and data for it
> 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
>
> In the step #3 above, Xen doesn't handle it well. When guest tries
> to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
> if it notices that address or data has been modified by guest before,
> then Qemu will update Xen with the latest value of address/data by
> hypercall. However, in this whole process, the MSI-X interrupt unmask
> operation is missing, which means Xen doesn't clear the mask bit in
> hardware for the MSI-X interrupt, so it remains disabled, that is why
> it loses the network connection.
>
> This patch fixes this issue.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  xen/arch/x86/hvm/io.c        |    3 +++
>  xen/arch/x86/hvm/vmsi.c      |   22 +++++++++++++++++++++-
>  xen/include/asm-x86/domain.h |    7 +++++++
>  xen/include/xen/pci.h        |    1 +
>  4 files changed, 32 insertions(+), 1 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index deb7b92..516f0a4 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -297,6 +297,9 @@ void hvm_io_assist(ioreq_t *p)
>          break;
>      }
>
> +    if (msix_post_handler(curr))

Spaces inside the brackets

> +        gdprintk(XENLOG_ERR, ": msix_post_handler error\n");
> +
>      if ( p->state == STATE_IOREQ_NONE )
>          vcpu_end_shutdown_deferral(curr);
> }
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 4826b4a..cd97a3b 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -292,8 +292,11 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>      }
>
>      /* exit to device model if address/data has been modified */
> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> +    if ( test_and_clear_bit(nr_entry, &entry->table_flags) ) {

Braces should line up

> +        v->arch.pending_msix_unmask.valid = 1;
> +        v->arch.pending_msix_unmask.ctrl_address = address;
>          goto out;
> +    }
>
>      virt = msixtbl_addr_to_virt(entry, address);
>      if ( !virt )
> @@ -528,3 +531,20 @@ void msixtbl_pt_cleanup(struct domain *d)
>      spin_unlock(&d->arch.hvm_domain.msixtbl_list_lock);
>      local_irq_restore(flags);
>  }
> +
> +int msix_post_handler(struct vcpu *v)
> +{
> +    int rc;
> +
> +    if (v->arch.pending_msix_unmask.valid == 0)

spaces

> +        return 0;
> +
> +    v->arch.pending_msix_unmask.valid = 0;
> +
> +    rc = msixtbl_write(v, v->arch.pending_msix_unmask.ctrl_address, 4, 0);
> +    if (rc != X86EMUL_OKAY)
> +        return -1;
> +    else
> +        return 0;

return rc != X86EMUL_OKAY ? -1 : 0;

> +}
> +
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 9d39061..b3bdfa3 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -375,6 +375,11 @@ struct pv_vcpu
>      spinlock_t shadow_ldt_lock;
>  };
>
> +struct pending_msix_unmask_info {
> +    int valid;
> +    unsigned long ctrl_address;

valid should be boolean, and reeordered for packing purposes.

> +};
> +
>  struct arch_vcpu
>  {
>      /*
> @@ -439,6 +444,8 @@ struct arch_vcpu
>
>      /* A secondary copy of the vcpu time info. */
>      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
> +
> +    struct pending_msix_unmask_info pending_msix_unmask;

What happens if multiple msix interrupts are masked, all updated with
addresses, then all unmasked?

~Andrew

>  } __cacheline_aligned;
>
>  /* Shorthands to improve code legibility. */
> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> index cadb525..ce8f6ff 100644
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -147,5 +147,6 @@ struct pirq;
>  int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable);
>  void msixtbl_pt_unregister(struct domain *, struct pirq *);
>  void msixtbl_pt_cleanup(struct domain *d);
> +int msix_post_handler(struct vcpu *v);
>
>  #endif /* __XEN_PCI_H__ */
> --
> 1.7.1
>
> Thanks,
> Feng
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 11:43 ` Andrew Cooper
@ 2013-11-21 12:13   ` Wu, Feng
  2013-11-21 12:21     ` Andrew Cooper
  0 siblings, 1 reply; 7+ messages in thread
From: Wu, Feng @ 2013-11-21 12:13 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Thursday, November 21, 2013 7:44 PM
> To: Wu, Feng
> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> Nakajima, Jun; Zhang, Xiantao
> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> operation from guests
> 
> On 21/11/13 10:51, Wu, Feng wrote:
> > patch revision history
> > ----------------------
> > v1: Initial patch to handle this issue involving changing the hypercall interface
> > v2:Totally handled inside hypervisor.
> > v3:Change some logics of handling msi-x pending unmask operations.
> >
> > From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00
> 2001
> > From: Feng Wu <feng.wu@intel.com>
> > Date: Wed, 13 Nov 2013 21:43:48 -0500
> > Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests
> >
> > For a pass-through device with MSI-x capability, when guest tries
> > to unmask the MSI-x interrupt for the passed through device, xen
> > doesn't clear the mask bit for MSI-x in hardware in the following
> > scenario, which will cause network disconnection:
> >
> > 1. Guest masks the MSI-x interrupt
> > 2. Guest updates the address and data for it
> > 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
> >
> > In the step #3 above, Xen doesn't handle it well. When guest tries
> > to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
> > if it notices that address or data has been modified by guest before,
> > then Qemu will update Xen with the latest value of address/data by
> > hypercall. However, in this whole process, the MSI-X interrupt unmask
> > operation is missing, which means Xen doesn't clear the mask bit in
> > hardware for the MSI-X interrupt, so it remains disabled, that is why
> > it loses the network connection.
> >
> > This patch fixes this issue.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> >  xen/arch/x86/hvm/io.c        |    3 +++
> >  xen/arch/x86/hvm/vmsi.c      |   22 +++++++++++++++++++++-
> >  xen/include/asm-x86/domain.h |    7 +++++++
> >  xen/include/xen/pci.h        |    1 +
> >  4 files changed, 32 insertions(+), 1 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> > index deb7b92..516f0a4 100644
> > --- a/xen/arch/x86/hvm/io.c
> > +++ b/xen/arch/x86/hvm/io.c
> > @@ -297,6 +297,9 @@ void hvm_io_assist(ioreq_t *p)
> >          break;
> >      }
> >
> > +    if (msix_post_handler(curr))
> 
> Spaces inside the brackets

Accept

> 
> > +        gdprintk(XENLOG_ERR, ": msix_post_handler error\n");
> > +
> >      if ( p->state == STATE_IOREQ_NONE )
> >          vcpu_end_shutdown_deferral(curr);
> > }
> > diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> > index 4826b4a..cd97a3b 100644
> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -292,8 +292,11 @@ static int msixtbl_write(struct vcpu *v, unsigned
> long address,
> >      }
> >
> >      /* exit to device model if address/data has been modified */
> > -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
> > +    if ( test_and_clear_bit(nr_entry, &entry->table_flags) ) {
> 
> Braces should line up

Accept

> 
> > +        v->arch.pending_msix_unmask.valid = 1;
> > +        v->arch.pending_msix_unmask.ctrl_address = address;
> >          goto out;
> > +    }
> >
> >      virt = msixtbl_addr_to_virt(entry, address);
> >      if ( !virt )
> > @@ -528,3 +531,20 @@ void msixtbl_pt_cleanup(struct domain *d)
> >      spin_unlock(&d->arch.hvm_domain.msixtbl_list_lock);
> >      local_irq_restore(flags);
> >  }
> > +
> > +int msix_post_handler(struct vcpu *v)
> > +{
> > +    int rc;
> > +
> > +    if (v->arch.pending_msix_unmask.valid == 0)
> 
> spaces

Accept

> 
> > +        return 0;
> > +
> > +    v->arch.pending_msix_unmask.valid = 0;
> > +
> > +    rc = msixtbl_write(v, v->arch.pending_msix_unmask.ctrl_address, 4, 0);
> > +    if (rc != X86EMUL_OKAY)
> > +        return -1;
> > +    else
> > +        return 0;
> 
> return rc != X86EMUL_OKAY ? -1 : 0;

Accept

> 
> > +}
> > +
> > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> > index 9d39061..b3bdfa3 100644
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -375,6 +375,11 @@ struct pv_vcpu
> >      spinlock_t shadow_ldt_lock;
> >  };
> >
> > +struct pending_msix_unmask_info {
> > +    int valid;
> > +    unsigned long ctrl_address;
> 
> valid should be boolean, and reeordered for packing purposes.

Accept

> 
> > +};
> > +
> >  struct arch_vcpu
> >  {
> >      /*
> > @@ -439,6 +444,8 @@ struct arch_vcpu
> >
> >      /* A secondary copy of the vcpu time info. */
> >      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
> > +
> > +    struct pending_msix_unmask_info pending_msix_unmask;
> 
> What happens if multiple msix interrupts are masked, all updated with
> addresses, then all unmasked?

In my understanding, for a specific VCPU, if there is a pending msix unmask
operation, it means that the Qemu emulation has not been completed yet,
so the guest doesn't have chance to do another msix unmask request until
the current Qemu emulation path is finished(return to the guest). So I think
msix unmask requests from the guest on one VCPU cannot happen at the
same time. Correct my if my understanding is not correct! Thanks you!

> 
> ~Andrew
> 
> >  } __cacheline_aligned;
> >
> >  /* Shorthands to improve code legibility. */
> > diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> > index cadb525..ce8f6ff 100644
> > --- a/xen/include/xen/pci.h
> > +++ b/xen/include/xen/pci.h
> > @@ -147,5 +147,6 @@ struct pirq;
> >  int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable);
> >  void msixtbl_pt_unregister(struct domain *, struct pirq *);
> >  void msixtbl_pt_cleanup(struct domain *d);
> > +int msix_post_handler(struct vcpu *v);
> >
> >  #endif /* __XEN_PCI_H__ */
> > --
> > 1.7.1
> >
> > Thanks,
> > Feng
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

Thanks,
Feng

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 12:13   ` Wu, Feng
@ 2013-11-21 12:21     ` Andrew Cooper
  2013-11-21 12:35       ` Wu, Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2013-11-21 12:21 UTC (permalink / raw)
  To: Wu, Feng
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org

On 21/11/13 12:13, Wu, Feng wrote:
>
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: Thursday, November 21, 2013 7:44 PM
>> To: Wu, Feng
>> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
>> Nakajima, Jun; Zhang, Xiantao
>> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
>> operation from guests
>>
>> On 21/11/13 10:51, Wu, Feng wrote:
>>> patch revision history
>>> ----------------------
>>> v1: Initial patch to handle this issue involving changing the hypercall interface
>>> v2:Totally handled inside hypervisor.
>>> v3:Change some logics of handling msi-x pending unmask operations.
>>>
>>> From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00
>> 2001
>>> From: Feng Wu <feng.wu@intel.com>
>>> Date: Wed, 13 Nov 2013 21:43:48 -0500
>>> Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>>
>>> For a pass-through device with MSI-x capability, when guest tries
>>> to unmask the MSI-x interrupt for the passed through device, xen
>>> doesn't clear the mask bit for MSI-x in hardware in the following
>>> scenario, which will cause network disconnection:
>>>
>>> 1. Guest masks the MSI-x interrupt
>>> 2. Guest updates the address and data for it
>>> 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
>>>
>>> In the step #3 above, Xen doesn't handle it well. When guest tries
>>> to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
>>> if it notices that address or data has been modified by guest before,
>>> then Qemu will update Xen with the latest value of address/data by
>>> hypercall. However, in this whole process, the MSI-X interrupt unmask
>>> operation is missing, which means Xen doesn't clear the mask bit in
>>> hardware for the MSI-X interrupt, so it remains disabled, that is why
>>> it loses the network connection.
>>>
>>> This patch fixes this issue.
>>>
>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>>>
>>> +};
>>> +
>>>  struct arch_vcpu
>>>  {
>>>      /*
>>> @@ -439,6 +444,8 @@ struct arch_vcpu
>>>
>>>      /* A secondary copy of the vcpu time info. */
>>>      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
>>> +
>>> +    struct pending_msix_unmask_info pending_msix_unmask;
>> What happens if multiple msix interrupts are masked, all updated with
>> addresses, then all unmasked?
> In my understanding, for a specific VCPU, if there is a pending msix unmask
> operation, it means that the Qemu emulation has not been completed yet,states
> so the guest doesn't have chance to do another msix unmask request until
> the current Qemu emulation path is finished(return to the guest). So I think
> msix unmask requests from the guest on one VCPU cannot happen at the
> same time. Correct my if my understanding is not correct! Thanks you!
>

Your patch description suggests that the problem occurs because the
address and data have changed while the MSI-X interrupt is masked. 
There is a tracking structure for a single MSI-X interrupt, which would
indicate that having two masked interrupts and updating them both cant
be correctly tracked.

Or have I misunderstood the problem?

~Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 12:21     ` Andrew Cooper
@ 2013-11-21 12:35       ` Wu, Feng
  2013-11-21 12:41         ` Andrew Cooper
  0 siblings, 1 reply; 7+ messages in thread
From: Wu, Feng @ 2013-11-21 12:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Thursday, November 21, 2013 8:22 PM
> To: Wu, Feng
> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> Nakajima, Jun; Zhang, Xiantao
> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> operation from guests
> 
> On 21/11/13 12:13, Wu, Feng wrote:
> >
> >> -----Original Message-----
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: Thursday, November 21, 2013 7:44 PM
> >> To: Wu, Feng
> >> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> >> Nakajima, Jun; Zhang, Xiantao
> >> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> >> operation from guests
> >>
> >> On 21/11/13 10:51, Wu, Feng wrote:
> >>> patch revision history
> >>> ----------------------
> >>> v1: Initial patch to handle this issue involving changing the hypercall
> interface
> >>> v2:Totally handled inside hypervisor.
> >>> v3:Change some logics of handling msi-x pending unmask operations.
> >>>
> >>> From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00
> >> 2001
> >>> From: Feng Wu <feng.wu@intel.com>
> >>> Date: Wed, 13 Nov 2013 21:43:48 -0500
> >>> Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests
> >>>
> >>> For a pass-through device with MSI-x capability, when guest tries
> >>> to unmask the MSI-x interrupt for the passed through device, xen
> >>> doesn't clear the mask bit for MSI-x in hardware in the following
> >>> scenario, which will cause network disconnection:
> >>>
> >>> 1. Guest masks the MSI-x interrupt
> >>> 2. Guest updates the address and data for it
> >>> 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
> >>>
> >>> In the step #3 above, Xen doesn't handle it well. When guest tries
> >>> to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
> >>> if it notices that address or data has been modified by guest before,
> >>> then Qemu will update Xen with the latest value of address/data by
> >>> hypercall. However, in this whole process, the MSI-X interrupt unmask
> >>> operation is missing, which means Xen doesn't clear the mask bit in
> >>> hardware for the MSI-X interrupt, so it remains disabled, that is why
> >>> it loses the network connection.
> >>>
> >>> This patch fixes this issue.
> >>>
> >>> Signed-off-by: Feng Wu <feng.wu@intel.com>
> >>>
> >>> +};
> >>> +
> >>>  struct arch_vcpu
> >>>  {
> >>>      /*
> >>> @@ -439,6 +444,8 @@ struct arch_vcpu
> >>>
> >>>      /* A secondary copy of the vcpu time info. */
> >>>      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
> >>> +
> >>> +    struct pending_msix_unmask_info pending_msix_unmask;
> >> What happens if multiple msix interrupts are masked, all updated with
> >> addresses, then all unmasked?
> > In my understanding, for a specific VCPU, if there is a pending msix unmask
> > operation, it means that the Qemu emulation has not been completed
> yet,states
> > so the guest doesn't have chance to do another msix unmask request until
> > the current Qemu emulation path is finished(return to the guest). So I think
> > msix unmask requests from the guest on one VCPU cannot happen at the
> > same time. Correct my if my understanding is not correct! Thanks you!
> >
> 
> Your patch description suggests that the problem occurs because the
> address and data have changed while the MSI-X interrupt is masked.
> There is a tracking structure for a single MSI-X interrupt, which would
> indicate that having two masked interrupts and updating them both cant
> be correctly tracked.
> 

The problem occurs when guests try to unmask the MSI-X interrupt after
updating the address and data field, because the MSI-X unmask operation
is discarded after the Qemu IO emulation path is finished. In this patch
I add a MSI-X post handler to do the unmask operation, or the MSI-X will
remain masked, hence the network is disconnected.

> Or have I misunderstood the problem?
> 
> ~Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 12:35       ` Wu, Feng
@ 2013-11-21 12:41         ` Andrew Cooper
  2013-11-21 12:45           ` Wu, Feng
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2013-11-21 12:41 UTC (permalink / raw)
  To: Wu, Feng
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org

On 21/11/13 12:35, Wu, Feng wrote:
>
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: Thursday, November 21, 2013 8:22 PM
>> To: Wu, Feng
>> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
>> Nakajima, Jun; Zhang, Xiantao
>> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
>> operation from guests
>>
>> On 21/11/13 12:13, Wu, Feng wrote:
>>>> -----Original Message-----
>>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>>>> Sent: Thursday, November 21, 2013 7:44 PM
>>>> To: Wu, Feng
>>>> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
>>>> Nakajima, Jun; Zhang, Xiantao
>>>> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
>>>> operation from guests
>>>>
>>>> On 21/11/13 10:51, Wu, Feng wrote:
>>>>> patch revision history
>>>>> ----------------------
>>>>> v1: Initial patch to handle this issue involving changing the hypercall
>> interface
>>>>> v2:Totally handled inside hypervisor.
>>>>> v3:Change some logics of handling msi-x pending unmask operations.
>>>>>
>>>>> From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17 00:00:00
>>>> 2001
>>>>> From: Feng Wu <feng.wu@intel.com>
>>>>> Date: Wed, 13 Nov 2013 21:43:48 -0500
>>>>> Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>>>>
>>>>> For a pass-through device with MSI-x capability, when guest tries
>>>>> to unmask the MSI-x interrupt for the passed through device, xen
>>>>> doesn't clear the mask bit for MSI-x in hardware in the following
>>>>> scenario, which will cause network disconnection:
>>>>>
>>>>> 1. Guest masks the MSI-x interrupt
>>>>> 2. Guest updates the address and data for it
>>>>> 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
>>>>>
>>>>> In the step #3 above, Xen doesn't handle it well. When guest tries
>>>>> to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
>>>>> if it notices that address or data has been modified by guest before,
>>>>> then Qemu will update Xen with the latest value of address/data by
>>>>> hypercall. However, in this whole process, the MSI-X interrupt unmask
>>>>> operation is missing, which means Xen doesn't clear the mask bit in
>>>>> hardware for the MSI-X interrupt, so it remains disabled, that is why
>>>>> it loses the network connection.
>>>>>
>>>>> This patch fixes this issue.
>>>>>
>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>>>>>
>>>>> +};
>>>>> +
>>>>>  struct arch_vcpu
>>>>>  {
>>>>>      /*
>>>>> @@ -439,6 +444,8 @@ struct arch_vcpu
>>>>>
>>>>>      /* A secondary copy of the vcpu time info. */
>>>>>      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
>>>>> +
>>>>> +    struct pending_msix_unmask_info pending_msix_unmask;
>>>> What happens if multiple msix interrupts are masked, all updated with
>>>> addresses, then all unmasked?
>>> In my understanding, for a specific VCPU, if there is a pending msix unmask
>>> operation, it means that the Qemu emulation has not been completed
>> yet,states
>>> so the guest doesn't have chance to do another msix unmask request until
>>> the current Qemu emulation path is finished(return to the guest). So I think
>>> msix unmask requests from the guest on one VCPU cannot happen at the
>>> same time. Correct my if my understanding is not correct! Thanks you!
>>>
>> Your patch description suggests that the problem occurs because the
>> address and data have changed while the MSI-X interrupt is masked.
>> There is a tracking structure for a single MSI-X interrupt, which would
>> indicate that having two masked interrupts and updating them both cant
>> be correctly tracked.
>>
> The problem occurs when guests try to unmask the MSI-X interrupt after
> updating the address and data field, because the MSI-X unmask operation
> is discarded after the Qemu IO emulation path is finished. In this patch
> I add a MSI-X post handler to do the unmask operation, or the MSI-X will
> remain masked, hence the network is disconnected.

Ah - so it is strictly on the emulation path of an attempt to unmask the
MSI-X interrupt that the problem occurs, rather than a combination of
effects over the time during which the guest has the MSI masked.

~Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] x86: properly handle MSI-X unmask operation from guests
  2013-11-21 12:41         ` Andrew Cooper
@ 2013-11-21 12:45           ` Wu, Feng
  0 siblings, 0 replies; 7+ messages in thread
From: Wu, Feng @ 2013-11-21 12:45 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Auld, Will, Zhang, Xiantao, Nakajima, Jun,
	Jan Beulich (JBeulich@suse.com), xen-devel@lists.xen.org



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Thursday, November 21, 2013 8:41 PM
> To: Wu, Feng
> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> Nakajima, Jun; Zhang, Xiantao
> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> operation from guests
> 
> On 21/11/13 12:35, Wu, Feng wrote:
> >
> >> -----Original Message-----
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: Thursday, November 21, 2013 8:22 PM
> >> To: Wu, Feng
> >> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> >> Nakajima, Jun; Zhang, Xiantao
> >> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> >> operation from guests
> >>
> >> On 21/11/13 12:13, Wu, Feng wrote:
> >>>> -----Original Message-----
> >>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >>>> Sent: Thursday, November 21, 2013 7:44 PM
> >>>> To: Wu, Feng
> >>>> Cc: Jan Beulich (JBeulich@suse.com); xen-devel@lists.xen.org; Auld, Will;
> >>>> Nakajima, Jun; Zhang, Xiantao
> >>>> Subject: Re: [Xen-devel] [PATCH v3] x86: properly handle MSI-X unmask
> >>>> operation from guests
> >>>>
> >>>> On 21/11/13 10:51, Wu, Feng wrote:
> >>>>> patch revision history
> >>>>> ----------------------
> >>>>> v1: Initial patch to handle this issue involving changing the hypercall
> >> interface
> >>>>> v2:Totally handled inside hypervisor.
> >>>>> v3:Change some logics of handling msi-x pending unmask operations.
> >>>>>
> >>>>> From 78ae225e6af88b0b850980fc55640d0776aeafbc Mon Sep 17
> 00:00:00
> >>>> 2001
> >>>>> From: Feng Wu <feng.wu@intel.com>
> >>>>> Date: Wed, 13 Nov 2013 21:43:48 -0500
> >>>>> Subject: [PATCH] x86: properly handle MSI-X unmask operation from
> guests
> >>>>>
> >>>>> For a pass-through device with MSI-x capability, when guest tries
> >>>>> to unmask the MSI-x interrupt for the passed through device, xen
> >>>>> doesn't clear the mask bit for MSI-x in hardware in the following
> >>>>> scenario, which will cause network disconnection:
> >>>>>
> >>>>> 1. Guest masks the MSI-x interrupt
> >>>>> 2. Guest updates the address and data for it
> >>>>> 3. Guest unmasks the MSI-x interrupt (This is the problematic step)
> >>>>>
> >>>>> In the step #3 above, Xen doesn't handle it well. When guest tries
> >>>>> to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu
> >>>>> if it notices that address or data has been modified by guest before,
> >>>>> then Qemu will update Xen with the latest value of address/data by
> >>>>> hypercall. However, in this whole process, the MSI-X interrupt unmask
> >>>>> operation is missing, which means Xen doesn't clear the mask bit in
> >>>>> hardware for the MSI-X interrupt, so it remains disabled, that is why
> >>>>> it loses the network connection.
> >>>>>
> >>>>> This patch fixes this issue.
> >>>>>
> >>>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
> >>>>>
> >>>>> +};
> >>>>> +
> >>>>>  struct arch_vcpu
> >>>>>  {
> >>>>>      /*
> >>>>> @@ -439,6 +444,8 @@ struct arch_vcpu
> >>>>>
> >>>>>      /* A secondary copy of the vcpu time info. */
> >>>>>      XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest;
> >>>>> +
> >>>>> +    struct pending_msix_unmask_info pending_msix_unmask;
> >>>> What happens if multiple msix interrupts are masked, all updated with
> >>>> addresses, then all unmasked?
> >>> In my understanding, for a specific VCPU, if there is a pending msix unmask
> >>> operation, it means that the Qemu emulation has not been completed
> >> yet,states
> >>> so the guest doesn't have chance to do another msix unmask request until
> >>> the current Qemu emulation path is finished(return to the guest). So I think
> >>> msix unmask requests from the guest on one VCPU cannot happen at the
> >>> same time. Correct my if my understanding is not correct! Thanks you!
> >>>
> >> Your patch description suggests that the problem occurs because the
> >> address and data have changed while the MSI-X interrupt is masked.
> >> There is a tracking structure for a single MSI-X interrupt, which would
> >> indicate that having two masked interrupts and updating them both cant
> >> be correctly tracked.
> >>
> > The problem occurs when guests try to unmask the MSI-X interrupt after
> > updating the address and data field, because the MSI-X unmask operation
> > is discarded after the Qemu IO emulation path is finished. In this patch
> > I add a MSI-X post handler to do the unmask operation, or the MSI-X will
> > remain masked, hence the network is disconnected.
> 
> Ah - so it is strictly on the emulation path of an attempt to unmask the
> MSI-X interrupt that the problem occurs, rather than a combination of
> effects over the time during which the guest has the MSI masked.

Yes, this issue happens only because of the guest MSI-X unmask operation is missed
in the Qemu IO emulation path.
> 
> ~Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-11-21 12:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-21 10:51 [PATCH v3] x86: properly handle MSI-X unmask operation from guests Wu, Feng
2013-11-21 11:43 ` Andrew Cooper
2013-11-21 12:13   ` Wu, Feng
2013-11-21 12:21     ` Andrew Cooper
2013-11-21 12:35       ` Wu, Feng
2013-11-21 12:41         ` Andrew Cooper
2013-11-21 12:45           ` Wu, Feng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.