* [Qemu-devel] [RFC PATCH v2 0/2] Fix MSIx lost after PE reset
@ 2014-09-01  0:53 Gavin Shan
  2014-09-01  0:53 ` [Qemu-devel] [RFC PATCH v2 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan
  2014-09-01  0:53 ` [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan
  0 siblings, 2 replies; 5+ messages in thread
From: Gavin Shan @ 2014-09-01  0:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan

These two patches fix the loss of MSIx entries after a PE reset.
Without them, the MSIx entries can't be restored after the PE reset
and EEH recovery fails in the guest (tested with a Broadcom tg3
adapter).

Note: The patchset "EEH support for guest" isn't merged yet, so these
      2 patches are based on Alex Graf's "ppc-next" branch plus the
      patchset supporting EEH for guest, which can be checked out
      from the link below:

      git@github.com:gwshan/qemu.git (branch: eeh)

Gavin Shan (2):
  VFIO: Drop vfio_container_do_ioctl()
  VFIO: Clear stale MSIx table during EEH reset

 hw/misc/vfio.c | 65 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 19 deletions(-)

-- 
1.8.3.2

* [Qemu-devel] [RFC PATCH v2 1/2] VFIO: Drop vfio_container_do_ioctl()
  2014-09-01  0:53 [Qemu-devel] [RFC PATCH v2 0/2] Fix MSIx lost after PE reset Gavin Shan
@ 2014-09-01  0:53 ` Gavin Shan
  2014-09-01  0:53 ` [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan
  1 sibling, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2014-09-01  0:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan

The patch drops vfio_container_do_ioctl() and merges its logic into
its caller, vfio_container_ioctl(), so that subsequent patches can
reuse the VFIO group found in vfio_container_ioctl().

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 hw/misc/vfio.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 7d5f1bb..1a3e7eb 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4419,8 +4419,8 @@ static void register_vfio_pci_dev_type(void)
 
 type_init(register_vfio_pci_dev_type)
 
-static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
-                                   int req, void *param)
+int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
+                         int req, void *param)
 {
     VFIOGroup *group;
     VFIOContainer *container;
@@ -4433,22 +4433,11 @@ static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
     }
 
     container = group->container;
-    if (group->container) {
-        ret = ioctl(container->fd, req, param);
-        if (ret < 0) {
-            error_report("vfio: failed to ioctl container: ret=%d, %s",
-                         ret, strerror(errno));
-        }
+    if (!container) {
+        error_report("vfio: no container for group %d\n", groupid);
+        goto out;
     }
 
-    vfio_put_group(group);
-
-    return ret;
-}
-
-int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
-                         int req, void *param)
-{
     /* We allow only certain ioctls to the container */
     switch (req) {
     case VFIO_CHECK_EXTENSION:
@@ -4458,8 +4447,16 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
     default:
         /* Return an error on unknown requests */
         error_report("vfio: unsupported ioctl %X", req);
-        return -1;
+        goto out;
     }
 
-    return vfio_container_do_ioctl(as, groupid, req, param);
+    ret = ioctl(container->fd, req, param);
+    if (ret < 0) {
+        error_report("vfio: failed to ioctl container: ret=%d, %s",
+                     ret, strerror(errno));
+    }
+
+out:
+    vfio_put_group(group);
+    return ret;
 }
-- 
1.8.3.2
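
For readers following the refactoring, the control flow that results from
this patch can be sketched in isolation: once the group reference is held,
every exit goes through a single label that drops it. The sketch below is a
self-contained model, not QEMU code. `struct group`, `get_group()`,
`put_group()`, `issue_request()` and the `REQ_*` values are hypothetical
stand-ins, and initialising `ret` to -1 is an assumption, because the
declaration of `ret` lies outside the hunks shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted VFIO group. */
struct group {
    int refs;          /* outstanding references */
    int container_fd;  /* a negative value models "no container attached" */
};

enum { REQ_ALLOWED = 1, REQ_UNKNOWN = 2 };  /* made-up request codes */

static struct group *get_group(struct group *g)
{
    assert(g != NULL);
    g->refs++;
    return g;
}

static void put_group(struct group *g)
{
    assert(g != NULL && g->refs > 0);
    g->refs--;
}

/* Stand-in for the real ioctl(); it always succeeds in this model. */
static int issue_request(int fd, int req)
{
    (void)fd;
    (void)req;
    return 0;
}

static int container_request(struct group *g, int req)
{
    int ret = -1;  /* assumed default so that early exits report failure */
    struct group *group = get_group(g);

    if (group->container_fd < 0) {
        goto out;
    }

    /* Only whitelisted requests may reach the container. */
    switch (req) {
    case REQ_ALLOWED:
        break;
    default:
        goto out;  /* unknown request: fail, but still drop the reference */
    }

    ret = issue_request(group->container_fd, req);

out:
    put_group(group);
    return ret;
}
```

Before the merge, the unsupported-request path returned before the group was
looked up, so it needed no cleanup. After the merge it shares the `out:`
label with the other failure paths, which is why `return -1` becomes
`goto out` in the diff.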

* [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset
  2014-09-01  0:53 [Qemu-devel] [RFC PATCH v2 0/2] Fix MSIx lost after PE reset Gavin Shan
  2014-09-01  0:53 ` [Qemu-devel] [RFC PATCH v2 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan
@ 2014-09-01  0:53 ` Gavin Shan
  2014-09-02 20:10   ` Alex Williamson
  1 sibling, 1 reply; 5+ messages in thread
From: Gavin Shan @ 2014-09-01  0:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan

The MSIx table of a PCI device is cleared in hardware by an EEH PE
reset. However, QEMU still holds the stale MSIx entries, which
should be cleared as well. Otherwise, we run into another
(recursive) EEH error and the PCI devices contained in the PE have
to be taken offline.

The patch clears the stale MSIx table before the EEH PE reset so that
the MSIx table can be restored properly after the reset.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 hw/misc/vfio.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 1a3e7eb..1f55051 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -2724,6 +2724,17 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
     }
 }
 
+static void vfio_disable_and_reset_interrupts(VFIODevice *vdev)
+{
+    vfio_disable_interrupts(vdev);
+
+    switch (vdev->interrupt) {
+    case VFIO_INT_MSIX:
+        msix_reset(&vdev->pdev);
+        break;
+    }
+}
+
 static int vfio_setup_msi(VFIODevice *vdev, int pos)
 {
     uint16_t ctrl;
@@ -4442,8 +4453,27 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
     switch (req) {
     case VFIO_CHECK_EXTENSION:
     case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
-    case VFIO_EEH_PE_OP:
         break;
+    case VFIO_EEH_PE_OP: {
+        VFIODevice *vdev;
+        struct vfio_eeh_pe_op *arg = (struct vfio_eeh_pe_op *)param;
+
+        switch (arg->op) {
+        case VFIO_EEH_PE_RESET_HOT:
+        case VFIO_EEH_PE_RESET_FUNDAMENTAL:
+            /*
+             * The MSIx table will be cleaned out by reset. We need to
+             * disable it so that it can be reenabled properly. Also,
+             * the cached MSIx table should be cleared as it's not
+             * reflecting the contents in hardware.
+             */
+            QLIST_FOREACH(vdev, &group->device_list, next) {
+                vfio_disable_and_reset_interrupts(vdev);
+            }
+        }
+
+        break;
+    }
     default:
         /* Return an error on unknown requests */
         error_report("vfio: unsupported ioctl %X", req);
-- 
1.8.3.2

* Re: [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset
  2014-09-01  0:53 ` [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan
@ 2014-09-02 20:10   ` Alex Williamson
  2014-09-02 23:12     ` Gavin Shan
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2014-09-02 20:10 UTC (permalink / raw)
  To: Gavin Shan; +Cc: aik, qemu-devel

On Mon, 2014-09-01 at 10:53 +1000, Gavin Shan wrote:
> The MSIx table of a PCI device is cleared in hardware by an EEH PE
> reset. However, QEMU still holds the stale MSIx entries, which
> should be cleared as well. Otherwise, we run into another
> (recursive) EEH error and the PCI devices contained in the PE have
> to be taken offline.
> 
> The patch clears the stale MSIx table before the EEH PE reset so that
> the MSIx table can be restored properly after the reset.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
>  hw/misc/vfio.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 1a3e7eb..1f55051 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2724,6 +2724,17 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
>      }
>  }
>  
> +static void vfio_disable_and_reset_interrupts(VFIODevice *vdev)
> +{
> +    vfio_disable_interrupts(vdev);
> +
> +    switch (vdev->interrupt) {
> +    case VFIO_INT_MSIX:
> +        msix_reset(&vdev->pdev);
> +        break;
> +    }

This is apparently untested because vdev->interrupt should never be set
to VFIO_INT_MSIX after vfio_disable_interrupts().  Also, you need to
update the normal reset path to call msix_reset() unless it's already
happening via another reset handler.  Thanks,

Alex

> +}
> +
>  static int vfio_setup_msi(VFIODevice *vdev, int pos)
>  {
>      uint16_t ctrl;
> @@ -4442,8 +4453,27 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
>      switch (req) {
>      case VFIO_CHECK_EXTENSION:
>      case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
> -    case VFIO_EEH_PE_OP:
>          break;
> +    case VFIO_EEH_PE_OP: {
> +        VFIODevice *vdev;
> +        struct vfio_eeh_pe_op *arg = (struct vfio_eeh_pe_op *)param;
> +
> +        switch (arg->op) {
> +        case VFIO_EEH_PE_RESET_HOT:
> +        case VFIO_EEH_PE_RESET_FUNDAMENTAL:
> +            /*
> +             * The MSIx table will be cleaned out by reset. We need to
> +             * disable it so that it can be reenabled properly. Also,
> +             * the cached MSIx table should be cleared as it's not
> +             * reflecting the contents in hardware.
> +             */
> +            QLIST_FOREACH(vdev, &group->device_list, next) {
> +                vfio_disable_and_reset_interrupts(vdev);
> +            }
> +        }
> +
> +        break;
> +    }
>      default:
>          /* Return an error on unknown requests */
>          error_report("vfio: unsupported ioctl %X", req);
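
The ordering problem raised in this review can be reproduced with a small
model. It assumes, as the review states, that disabling interrupts leaves
the recorded mode at "none", so a switch on the mode afterwards can never
take the MSIx branch. All names below (`struct dev`, `enum irq_mode`,
`disable_interrupts()`, `reset_msix_cache()`) are hypothetical stand-ins
rather than QEMU definitions, and saving the mode first is only one
possible reordering, not necessarily what a later revision does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device state discussed in the review. */
enum irq_mode { IRQ_NONE, IRQ_INTX, IRQ_MSI, IRQ_MSIX };

struct dev {
    enum irq_mode interrupt;  /* currently enabled interrupt mode */
    bool msix_cache_stale;    /* cached MSIx table no longer matches hardware */
};

/* Models the disable step: tears down and records "no interrupt mode". */
static void disable_interrupts(struct dev *d)
{
    assert(d != NULL);
    d->interrupt = IRQ_NONE;
}

/* Models clearing the cached MSIx table. */
static void reset_msix_cache(struct dev *d)
{
    assert(d != NULL);
    d->msix_cache_stale = false;
}

/* Ordering as posted: the mode is read after it has been cleared. */
static void disable_then_check(struct dev *d)
{
    disable_interrupts(d);

    switch (d->interrupt) {
    case IRQ_MSIX:  /* unreachable: the mode is already IRQ_NONE */
        reset_msix_cache(d);
        break;
    default:
        break;
    }
}

/* One possible reordering: remember the mode before disabling. */
static void save_mode_then_disable(struct dev *d)
{
    enum irq_mode was;

    assert(d != NULL);
    was = d->interrupt;
    disable_interrupts(d);

    if (was == IRQ_MSIX) {
        reset_msix_cache(d);
    }
}
```

With a device that starts in MSIx mode, `disable_then_check()` leaves the
cached table stale, while `save_mode_then_disable()` clears it. The second
review point, that the normal reset path may also need to clear the cached
table, is outside the scope of this model.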

* Re: [Qemu-devel] [RFC PATCH v2 2/2] VFIO: Clear stale MSIx table during EEH reset
  2014-09-02 20:10   ` Alex Williamson
@ 2014-09-02 23:12     ` Gavin Shan
  0 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2014-09-02 23:12 UTC (permalink / raw)
  To: Alex Williamson; +Cc: aik, Gavin Shan, qemu-devel

On Tue, Sep 02, 2014 at 02:10:42PM -0600, Alex Williamson wrote:
>On Mon, 2014-09-01 at 10:53 +1000, Gavin Shan wrote:
>> The MSIx table of a PCI device is cleared in hardware by an EEH PE
>> reset. However, QEMU still holds the stale MSIx entries, which
>> should be cleared as well. Otherwise, we run into another
>> (recursive) EEH error and the PCI devices contained in the PE have
>> to be taken offline.
>> 
>> The patch clears the stale MSIx table before the EEH PE reset so that
>> the MSIx table can be restored properly after the reset.
>> 
>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>> ---
>>  hw/misc/vfio.c | 32 +++++++++++++++++++++++++++++++-
>>  1 file changed, 31 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>> index 1a3e7eb..1f55051 100644
>> --- a/hw/misc/vfio.c
>> +++ b/hw/misc/vfio.c
>> @@ -2724,6 +2724,17 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
>>      }
>>  }
>>  
>> +static void vfio_disable_and_reset_interrupts(VFIODevice *vdev)
>> +{
>> +    vfio_disable_interrupts(vdev);
>> +
>> +    switch (vdev->interrupt) {
>> +    case VFIO_INT_MSIX:
>> +        msix_reset(&vdev->pdev);
>> +        break;
>> +    }
>
>This is apparently untested because vdev->interrupt should never be set
>to VFIO_INT_MSIX after vfio_disable_interrupts().  Also, you need to
>update the normal reset path to call msix_reset() unless it's already
>happening via another reset handler.  Thanks,
>

Yes, I didn't test this revision. I'll change the code according to
your comments and retest, then send v3 for your comments.

Thanks,
Gavin

>Alex
>
>> +}
>> +
>>  static int vfio_setup_msi(VFIODevice *vdev, int pos)
>>  {
>>      uint16_t ctrl;
>> @@ -4442,8 +4453,27 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
>>      switch (req) {
>>      case VFIO_CHECK_EXTENSION:
>>      case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
>> -    case VFIO_EEH_PE_OP:
>>          break;
>> +    case VFIO_EEH_PE_OP: {
>> +        VFIODevice *vdev;
>> +        struct vfio_eeh_pe_op *arg = (struct vfio_eeh_pe_op *)param;
>> +
>> +        switch (arg->op) {
>> +        case VFIO_EEH_PE_RESET_HOT:
>> +        case VFIO_EEH_PE_RESET_FUNDAMENTAL:
>> +            /*
>> +             * The MSIx table will be cleaned out by reset. We need to
>> +             * disable it so that it can be reenabled properly. Also,
>> +             * the cached MSIx table should be cleared as it's not
>> +             * reflecting the contents in hardware.
>> +             */
>> +            QLIST_FOREACH(vdev, &group->device_list, next) {
>> +                vfio_disable_and_reset_interrupts(vdev);
>> +            }
>> +        }
>> +
>> +        break;
>> +    }
>>      default:
>>          /* Return an error on unknown requests */
>>          error_report("vfio: unsupported ioctl %X", req);
>
>
>
