* [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset
@ 2014-08-20 9:52 Gavin Shan
  2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan
  ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Gavin Shan @ 2014-08-20 9:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan

The 2 patches fix MSIx lost after PE reset. Otherwise, the MSIx
entries can't be restored successfully after PE reset and the
EEH recovery fails on broadcom tg3 adapter (as tested) in guest.

Note: The patchset "EEH support for guest" isn't merged yet, those
      2 patches are based on Alex Graf's "ppc-next" branch + the
      patchset supporting EEH for guest, which can be checked out
      from below link:

      git@github.com:gwshan/qemu.git (branch: eeh)

Gavin Shan (2):
  VFIO: Drop vfio_container_do_ioctl()
  VFIO: Clear stale MSIx table during EEH reset

 hw/misc/vfio.c | 57 +++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 18 deletions(-)

-- 
1.8.3.2

^ permalink raw reply [flat|nested] 11+ messages in thread
* [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() 2014-08-20 9:52 [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset Gavin Shan @ 2014-08-20 9:52 ` Gavin Shan 2014-08-26 10:27 ` Alexey Kardashevskiy 2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan 2014-08-26 0:58 ` [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset Gavin Shan 2 siblings, 1 reply; 11+ messages in thread From: Gavin Shan @ 2014-08-20 9:52 UTC (permalink / raw) To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan The patch drops vfio_container_do_ioctl() and merges its logic to parent function call vfio_container_ioctl() so that the subsequent patches can reused the found VFIO group in vfio_container_ioctl(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> --- hw/misc/vfio.c | 33 +++++++++++++++------------------ 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 7d5f1bb..1a3e7eb 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4419,8 +4419,8 @@ static void register_vfio_pci_dev_type(void) type_init(register_vfio_pci_dev_type) -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) +int vfio_container_ioctl(AddressSpace *as, int32_t groupid, + int req, void *param) { VFIOGroup *group; VFIOContainer *container; @@ -4433,22 +4433,11 @@ static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, } container = group->container; - if (group->container) { - ret = ioctl(container->fd, req, param); - if (ret < 0) { - error_report("vfio: failed to ioctl container: ret=%d, %s", - ret, strerror(errno)); - } + if (!container) { + error_report("vfio: no container for group %d\n", groupid); + goto out; } - vfio_put_group(group); - - return ret; -} - -int vfio_container_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) -{ /* We allow only certain ioctls to the container */ switch (req) { case 
VFIO_CHECK_EXTENSION: @@ -4458,8 +4447,16 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, default: /* Return an error on unknown requests */ error_report("vfio: unsupported ioctl %X", req); - return -1; + goto out; } - return vfio_container_do_ioctl(as, groupid, req, param); + ret = ioctl(container->fd, req, param); + if (ret < 0) { + error_report("vfio: failed to ioctl container: ret=%d, %s", + ret, strerror(errno)); + } + +out: + vfio_put_group(group); + return ret; } -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
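Editorial sketch, not part of the posted patch: the merged function relies on a single "out:" label so that every path taken after a successful vfio_get_group() also drops the group reference. The stand-alone C below only illustrates that control flow. All mock_* names, the refs counter, and the MOCK_REQ_* request numbers are hypothetical stand-ins for this example and are not QEMU's API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for VFIOContainer/VFIOGroup; not QEMU's real types. */
typedef struct MockContainer {
    int fd;
} MockContainer;

typedef struct MockGroup {
    MockContainer *container;
    int refs;                 /* outstanding get/put references */
} MockGroup;

/* Made-up request numbers used only by this sketch. */
enum { MOCK_REQ_ALLOWED = 1, MOCK_REQ_UNKNOWN = 99 };

/* Stands in for vfio_get_group(): takes a reference when the group exists. */
static MockGroup *mock_get_group(MockGroup *group)
{
    if (group) {
        group->refs++;
    }
    return group;
}

/* Stands in for vfio_put_group(): drops the reference again. */
static void mock_put_group(MockGroup *group)
{
    group->refs--;
}

/* Stands in for ioctl(container->fd, req, param); always succeeds here. */
static int mock_ioctl(int fd, int req, void *param)
{
    (void)fd;
    (void)req;
    (void)param;
    return 0;
}

/*
 * Same shape as the merged function: look up the group, validate the
 * container, filter the request, then issue the ioctl. Every exit after
 * a successful lookup goes through "out".
 */
static int mock_container_ioctl(MockGroup *g, int req, void *param)
{
    MockGroup *group;
    MockContainer *container;
    int ret = -1;

    group = mock_get_group(g);
    if (!group) {
        return ret;           /* no reference taken, nothing to release */
    }

    container = group->container;
    if (!container) {
        goto out;
    }

    /* Only requests on the allow-list reach the ioctl. */
    switch (req) {
    case MOCK_REQ_ALLOWED:
        break;
    default:
        goto out;             /* unknown request: fail, but still put the group */
    }

    ret = mock_ioctl(container->fd, req, param);

out:
    mock_put_group(group);
    return ret;
}
```

In this sketch the reference count returns to zero on the success path, the unknown-request path, and the missing-container path alike, which is the property the "goto out" conversion is meant to preserve.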
* Re: [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() 2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan @ 2014-08-26 10:27 ` Alexey Kardashevskiy 2014-08-26 20:05 ` Alex Williamson 0 siblings, 1 reply; 11+ messages in thread From: Alexey Kardashevskiy @ 2014-08-26 10:27 UTC (permalink / raw) To: Gavin Shan, qemu-devel; +Cc: alex.williamson On 08/20/2014 07:52 PM, Gavin Shan wrote: > The patch drops vfio_container_do_ioctl() and merges its logic to > parent function call vfio_container_ioctl() so that the subsequent > patches can reused the found VFIO group in vfio_container_ioctl(). > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > --- > hw/misc/vfio.c | 33 +++++++++++++++------------------ > 1 file changed, 15 insertions(+), 18 deletions(-) > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > index 7d5f1bb..1a3e7eb 100644 > --- a/hw/misc/vfio.c > +++ b/hw/misc/vfio.c > @@ -4419,8 +4419,8 @@ static void register_vfio_pci_dev_type(void) > > type_init(register_vfio_pci_dev_type) > > -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, > - int req, void *param) > +int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > + int req, void *param) > { > VFIOGroup *group; > VFIOContainer *container; > @@ -4433,22 +4433,11 @@ static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, > } > > container = group->container; > - if (group->container) { > - ret = ioctl(container->fd, req, param); > - if (ret < 0) { > - error_report("vfio: failed to ioctl container: ret=%d, %s", > - ret, strerror(errno)); > - } > + if (!container) { > + error_report("vfio: no container for group %d\n", groupid); > + goto out; > } > > - vfio_put_group(group); > - > - return ret; > -} > - > -int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > - int req, void *param) > -{ > /* We allow only certain ioctls to the container */ > switch (req) { > case VFIO_CHECK_EXTENSION: > @@ -4458,8 
+4447,16 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > default: > /* Return an error on unknown requests */ > error_report("vfio: unsupported ioctl %X", req); > - return -1; > + goto out; > } > > - return vfio_container_do_ioctl(as, groupid, req, param); > + ret = ioctl(container->fd, req, param); > + if (ret < 0) { > + error_report("vfio: failed to ioctl container: ret=%d, %s", > + ret, strerror(errno)); > + } > + > +out: > + vfio_put_group(group); > + return ret; Nack. We specifically separated an ioctl-filtering function and actual code which does ioctl. I understand that you want to save on vfio_get_group() calls for the next EEH patch but this is not a hot path anyway. > } > -- Alexey ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() 2014-08-26 10:27 ` Alexey Kardashevskiy @ 2014-08-26 20:05 ` Alex Williamson 0 siblings, 0 replies; 11+ messages in thread From: Alex Williamson @ 2014-08-26 20:05 UTC (permalink / raw) To: Alexey Kardashevskiy; +Cc: Gavin Shan, qemu-devel On Tue, 2014-08-26 at 20:27 +1000, Alexey Kardashevskiy wrote: > On 08/20/2014 07:52 PM, Gavin Shan wrote: > > The patch drops vfio_container_do_ioctl() and merges its logic to > > parent function call vfio_container_ioctl() so that the subsequent > > patches can reused the found VFIO group in vfio_container_ioctl(). > > > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > > --- > > hw/misc/vfio.c | 33 +++++++++++++++------------------ > > 1 file changed, 15 insertions(+), 18 deletions(-) > > > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > > index 7d5f1bb..1a3e7eb 100644 > > --- a/hw/misc/vfio.c > > +++ b/hw/misc/vfio.c > > @@ -4419,8 +4419,8 @@ static void register_vfio_pci_dev_type(void) > > > > type_init(register_vfio_pci_dev_type) > > > > -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, > > - int req, void *param) > > +int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > > + int req, void *param) > > { > > VFIOGroup *group; > > VFIOContainer *container; > > @@ -4433,22 +4433,11 @@ static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, > > } > > > > container = group->container; > > - if (group->container) { > > - ret = ioctl(container->fd, req, param); > > - if (ret < 0) { > > - error_report("vfio: failed to ioctl container: ret=%d, %s", > > - ret, strerror(errno)); > > - } > > + if (!container) { > > + error_report("vfio: no container for group %d\n", groupid); > > + goto out; > > } > > > > - vfio_put_group(group); > > - > > - return ret; > > -} > > - > > -int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > > - int req, void *param) > > -{ > > /* We allow only certain ioctls to the 
container */ > > switch (req) { > > case VFIO_CHECK_EXTENSION: > > @@ -4458,8 +4447,16 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > > default: > > /* Return an error on unknown requests */ > > error_report("vfio: unsupported ioctl %X", req); > > - return -1; > > + goto out; > > } > > > > - return vfio_container_do_ioctl(as, groupid, req, param); > > + ret = ioctl(container->fd, req, param); > > + if (ret < 0) { > > + error_report("vfio: failed to ioctl container: ret=%d, %s", > > + ret, strerror(errno)); > > + } > > + > > +out: > > + vfio_put_group(group); > > + return ret; > > > Nack. We specifically separated an ioctl-filtering function and actual code > which does ioctl. I understand that you want to save on vfio_get_group() > calls for the next EEH patch but this is not a hot path anyway. I think the split was a result of a comment from me suggesting that a wrapper function might make the code flow better. As implemented, I don't think that result was ever actually realized, but the code was functional, so I didn't push it any further. I don't have any issue with recombining it. Thanks, Alex ^ permalink raw reply [flat|nested] 11+ messages in thread
* [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-20 9:52 [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset Gavin Shan 2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan @ 2014-08-20 9:52 ` Gavin Shan 2014-08-26 10:25 ` Alexey Kardashevskiy 2014-08-26 20:25 ` Alex Williamson 2014-08-26 0:58 ` [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset Gavin Shan 2 siblings, 2 replies; 11+ messages in thread From: Gavin Shan @ 2014-08-20 9:52 UTC (permalink / raw) To: qemu-devel; +Cc: aik, alex.williamson, Gavin Shan The PCI device MSIx table is cleaned out in hardware after EEH PE reset. However, we still hold the stale MSIx entries in QEMU, which should be cleared accordingly. Otherwise, we will run into another (recursive) EEH error and the PCI devices contained in the PE have to be offlined exceptionally. The patch clears stale MSIx table before EEH PE reset so that MSIx table could be restored properly after EEH PE reset. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> --- hw/misc/vfio.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 1a3e7eb..3cf7f02 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, { VFIOGroup *group; VFIOContainer *container; + VFIODevice *vdev; + struct vfio_eeh_pe_op *arg; int ret = -1; group = vfio_get_group(groupid, as); @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, switch (req) { case VFIO_CHECK_EXTENSION: case VFIO_IOMMU_SPAPR_TCE_GET_INFO: + break; case VFIO_EEH_PE_OP: + arg = (struct vfio_eeh_pe_op *)param; + switch (arg->op) { + case VFIO_EEH_PE_RESET_HOT: + case VFIO_EEH_PE_RESET_FUNDAMENTAL: + /* + * The MSIx table will be cleaned out by reset. We need + * disable it so that it can be reenabled properly. 
Also, + * the cached MSIx table should be cleared as it's not + * reflecting the contents in hardware. + */ + QLIST_FOREACH(vdev, &group->device_list, next) { + if (msix_enabled(&vdev->pdev)) { + vfio_disable_msix(vdev); + } + + msix_reset(&vdev->pdev); + } + + break; + } + break; default: /* Return an error on unknown requests */ -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan @ 2014-08-26 10:25 ` Alexey Kardashevskiy 2014-08-26 20:25 ` Alex Williamson 1 sibling, 0 replies; 11+ messages in thread From: Alexey Kardashevskiy @ 2014-08-26 10:25 UTC (permalink / raw) To: Gavin Shan, qemu-devel; +Cc: alex.williamson On 08/20/2014 07:52 PM, Gavin Shan wrote: > The PCI device MSIx table is cleaned out in hardware after EEH PE > reset. However, we still hold the stale MSIx entries in QEMU, which > should be cleared accordingly. Otherwise, we will run into another > (recursive) EEH error and the PCI devices contained in the PE have > to be offlined exceptionally. > > The patch clears stale MSIx table before EEH PE reset so that MSIx > table could be restored properly after EEH PE reset. > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > --- > hw/misc/vfio.c | 24 ++++++++++++++++++++++++ > 1 file changed, 24 insertions(+) > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > index 1a3e7eb..3cf7f02 100644 > --- a/hw/misc/vfio.c > +++ b/hw/misc/vfio.c > @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > { > VFIOGroup *group; > VFIOContainer *container; > + VFIODevice *vdev; > + struct vfio_eeh_pe_op *arg; > int ret = -1; > > group = vfio_get_group(groupid, as); > @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > switch (req) { > case VFIO_CHECK_EXTENSION: > case VFIO_IOMMU_SPAPR_TCE_GET_INFO: > + break; > case VFIO_EEH_PE_OP: > + arg = (struct vfio_eeh_pe_op *)param; > + switch (arg->op) { > + case VFIO_EEH_PE_RESET_HOT: > + case VFIO_EEH_PE_RESET_FUNDAMENTAL: > + /* > + * The MSIx table will be cleaned out by reset. We need > + * disable it so that it can be reenabled properly. Also, > + * the cached MSIx table should be cleared as it's not > + * reflecting the contents in hardware. 
> + */ > + QLIST_FOREACH(vdev, &group->device_list, next) { > + if (msix_enabled(&vdev->pdev)) { > + vfio_disable_msix(vdev); > + } > + > + msix_reset(&vdev->pdev); > + } > + > + break; > + } This has to be done in spapr_pci_vfio (but we do not have there access to vfio_disable_msix) or in the way that x86 would benefit from too but in this case yes, we need an opinion from Alex (Williamson)... > + > break; > default: > /* Return an error on unknown requests */ > -- Alexey ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan 2014-08-26 10:25 ` Alexey Kardashevskiy @ 2014-08-26 20:25 ` Alex Williamson 2014-08-27 13:15 ` Gavin Shan 1 sibling, 1 reply; 11+ messages in thread From: Alex Williamson @ 2014-08-26 20:25 UTC (permalink / raw) To: Gavin Shan; +Cc: aik, qemu-devel On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: > The PCI device MSIx table is cleaned out in hardware after EEH PE > reset. However, we still hold the stale MSIx entries in QEMU, which > should be cleared accordingly. Otherwise, we will run into another > (recursive) EEH error and the PCI devices contained in the PE have > to be offlined exceptionally. > > The patch clears stale MSIx table before EEH PE reset so that MSIx > table could be restored properly after EEH PE reset. > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > --- > hw/misc/vfio.c | 24 ++++++++++++++++++++++++ > 1 file changed, 24 insertions(+) > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > index 1a3e7eb..3cf7f02 100644 > --- a/hw/misc/vfio.c > +++ b/hw/misc/vfio.c > @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > { > VFIOGroup *group; > VFIOContainer *container; > + VFIODevice *vdev; > + struct vfio_eeh_pe_op *arg; Define these within the scope of the case since they're not used outside of it. > int ret = -1; > > group = vfio_get_group(groupid, as); > @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > switch (req) { > case VFIO_CHECK_EXTENSION: > case VFIO_IOMMU_SPAPR_TCE_GET_INFO: > + break; > case VFIO_EEH_PE_OP: > + arg = (struct vfio_eeh_pe_op *)param; > + switch (arg->op) { > + case VFIO_EEH_PE_RESET_HOT: > + case VFIO_EEH_PE_RESET_FUNDAMENTAL: > + /* > + * The MSIx table will be cleaned out by reset. We need > + * disable it so that it can be reenabled properly. 
Also, > + * the cached MSIx table should be cleared as it's not > + * reflecting the contents in hardware. > + */ > + QLIST_FOREACH(vdev, &group->device_list, next) { > + if (msix_enabled(&vdev->pdev)) { > + vfio_disable_msix(vdev); > + } > + > + msix_reset(&vdev->pdev); > + } We already have vfio_disable_interrupts(), can't we use that (blindly)? Do we really need to call msix_reset()? If so, should vfio_disable_msix() call it? > + > + break; Extraneous break > + } > + > break; > default: > /* Return an error on unknown requests */ Thanks, Alex ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-26 20:25 ` Alex Williamson @ 2014-08-27 13:15 ` Gavin Shan 2014-08-27 13:29 ` Alex Williamson 0 siblings, 1 reply; 11+ messages in thread From: Gavin Shan @ 2014-08-27 13:15 UTC (permalink / raw) To: Alex Williamson; +Cc: aik, Gavin Shan, qemu-devel On Tue, Aug 26, 2014 at 02:25:47PM -0600, Alex Williamson wrote: >On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: >> The PCI device MSIx table is cleaned out in hardware after EEH PE >> reset. However, we still hold the stale MSIx entries in QEMU, which >> should be cleared accordingly. Otherwise, we will run into another >> (recursive) EEH error and the PCI devices contained in the PE have >> to be offlined exceptionally. >> >> The patch clears stale MSIx table before EEH PE reset so that MSIx >> table could be restored properly after EEH PE reset. >> >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> >> --- >> hw/misc/vfio.c | 24 ++++++++++++++++++++++++ >> 1 file changed, 24 insertions(+) >> >> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c >> index 1a3e7eb..3cf7f02 100644 >> --- a/hw/misc/vfio.c >> +++ b/hw/misc/vfio.c >> @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, >> { >> VFIOGroup *group; >> VFIOContainer *container; >> + VFIODevice *vdev; >> + struct vfio_eeh_pe_op *arg; > >Define these within the scope of the case since they're not used outside >of it. > Yes, I'll fix. >> int ret = -1; >> >> group = vfio_get_group(groupid, as); >> @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, >> switch (req) { >> case VFIO_CHECK_EXTENSION: >> case VFIO_IOMMU_SPAPR_TCE_GET_INFO: >> + break; >> case VFIO_EEH_PE_OP: >> + arg = (struct vfio_eeh_pe_op *)param; >> + switch (arg->op) { >> + case VFIO_EEH_PE_RESET_HOT: >> + case VFIO_EEH_PE_RESET_FUNDAMENTAL: >> + /* >> + * The MSIx table will be cleaned out by reset. 
We need >> + * disable it so that it can be reenabled properly. Also, >> + * the cached MSIx table should be cleared as it's not >> + * reflecting the contents in hardware. >> + */ >> + QLIST_FOREACH(vdev, &group->device_list, next) { >> + if (msix_enabled(&vdev->pdev)) { >> + vfio_disable_msix(vdev); >> + } >> + >> + msix_reset(&vdev->pdev); >> + } > >We already have vfio_disable_interrupts(), can't we use that (blindly)? >Do we really need to call msix_reset()? If so, should >vfio_disable_msix() call it? > Yes, vfio_disable_interrupts() would be better to be used here as it can covers all types of interrupt (INTx/MSI/MSIx). vfio_disable_interrupts() needn't clear MSIx vectors. If you prefer calling msix_reset() in the function, I guess I have to add one more parameter to vfio_disable_interrupts() to indicate if we need clear MSI/MSIx vector: static void vfio_disable_interrupts(VFIODevice *vdev, bool clr_vector) >> + >> + break; > >Extraneous break > Yep, I'll remove it. >> + } >> + >> break; >> default: >> /* Return an error on unknown requests */ > >Thanks, >Alex > Thanks, Gavin ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-27 13:15 ` Gavin Shan @ 2014-08-27 13:29 ` Alex Williamson 2014-08-29 7:13 ` Gavin Shan 0 siblings, 1 reply; 11+ messages in thread From: Alex Williamson @ 2014-08-27 13:29 UTC (permalink / raw) To: Gavin Shan; +Cc: aik, qemu-devel On Wed, 2014-08-27 at 23:15 +1000, Gavin Shan wrote: > On Tue, Aug 26, 2014 at 02:25:47PM -0600, Alex Williamson wrote: > >On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: > >> The PCI device MSIx table is cleaned out in hardware after EEH PE > >> reset. However, we still hold the stale MSIx entries in QEMU, which > >> should be cleared accordingly. Otherwise, we will run into another > >> (recursive) EEH error and the PCI devices contained in the PE have > >> to be offlined exceptionally. > >> > >> The patch clears stale MSIx table before EEH PE reset so that MSIx > >> table could be restored properly after EEH PE reset. > >> > >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > >> --- > >> hw/misc/vfio.c | 24 ++++++++++++++++++++++++ > >> 1 file changed, 24 insertions(+) > >> > >> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > >> index 1a3e7eb..3cf7f02 100644 > >> --- a/hw/misc/vfio.c > >> +++ b/hw/misc/vfio.c > >> @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > >> { > >> VFIOGroup *group; > >> VFIOContainer *container; > >> + VFIODevice *vdev; > >> + struct vfio_eeh_pe_op *arg; > > > >Define these within the scope of the case since they're not used outside > >of it. > > > > Yes, I'll fix. 
> > >> int ret = -1; > >> > >> group = vfio_get_group(groupid, as); > >> @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, > >> switch (req) { > >> case VFIO_CHECK_EXTENSION: > >> case VFIO_IOMMU_SPAPR_TCE_GET_INFO: > >> + break; > >> case VFIO_EEH_PE_OP: > >> + arg = (struct vfio_eeh_pe_op *)param; > >> + switch (arg->op) { > >> + case VFIO_EEH_PE_RESET_HOT: > >> + case VFIO_EEH_PE_RESET_FUNDAMENTAL: > >> + /* > >> + * The MSIx table will be cleaned out by reset. We need > >> + * disable it so that it can be reenabled properly. Also, > >> + * the cached MSIx table should be cleared as it's not > >> + * reflecting the contents in hardware. > >> + */ > >> + QLIST_FOREACH(vdev, &group->device_list, next) { > >> + if (msix_enabled(&vdev->pdev)) { > >> + vfio_disable_msix(vdev); > >> + } > >> + > >> + msix_reset(&vdev->pdev); > >> + } > > > >We already have vfio_disable_interrupts(), can't we use that (blindly)? > >Do we really need to call msix_reset()? If so, should > >vfio_disable_msix() call it? > > > > Yes, vfio_disable_interrupts() would be better to be used here as it > can covers all types of interrupt (INTx/MSI/MSIx). > > vfio_disable_interrupts() needn't clear MSIx vectors. If you prefer > calling msix_reset() in the function, I guess I have to add one more > parameter to vfio_disable_interrupts() to indicate if we need clear > MSI/MSIx vector: > > static void vfio_disable_interrupts(VFIODevice *vdev, bool clr_vector) How about just creating a vfio_reset_interrupts() that calls msix_reset() if the device supports MSIX? We could wrap vfio_disable_interrupts() and vfio_reset_interrupts() into a vfio_disable_and_reset_interrupts(), but I'm not sure it's worth it. The reset device path should also add the interrupt reset call. Thanks, Alex > >> + > >> + break; > > > >Extraneous break > > > > Yep, I'll remove it. 
> > >> + } > >> + > >> break; > >> default: > >> /* Return an error on unknown requests */ > > > >Thanks, > >Alex > > > > Thanks, > Gavin > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset 2014-08-27 13:29 ` Alex Williamson @ 2014-08-29 7:13 ` Gavin Shan 0 siblings, 0 replies; 11+ messages in thread From: Gavin Shan @ 2014-08-29 7:13 UTC (permalink / raw) To: Alex Williamson; +Cc: aik, Gavin Shan, qemu-devel On Wed, Aug 27, 2014 at 07:29:41AM -0600, Alex Williamson wrote: >On Wed, 2014-08-27 at 23:15 +1000, Gavin Shan wrote: >> On Tue, Aug 26, 2014 at 02:25:47PM -0600, Alex Williamson wrote: >> >On Wed, 2014-08-20 at 19:52 +1000, Gavin Shan wrote: >> >> The PCI device MSIx table is cleaned out in hardware after EEH PE >> >> reset. However, we still hold the stale MSIx entries in QEMU, which >> >> should be cleared accordingly. Otherwise, we will run into another >> >> (recursive) EEH error and the PCI devices contained in the PE have >> >> to be offlined exceptionally. >> >> >> >> The patch clears stale MSIx table before EEH PE reset so that MSIx >> >> table could be restored properly after EEH PE reset. >> >> >> >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> >> >> --- >> >> hw/misc/vfio.c | 24 ++++++++++++++++++++++++ >> >> 1 file changed, 24 insertions(+) >> >> >> >> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c >> >> index 1a3e7eb..3cf7f02 100644 >> >> --- a/hw/misc/vfio.c >> >> +++ b/hw/misc/vfio.c >> >> @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, >> >> { >> >> VFIOGroup *group; >> >> VFIOContainer *container; >> >> + VFIODevice *vdev; >> >> + struct vfio_eeh_pe_op *arg; >> > >> >Define these within the scope of the case since they're not used outside >> >of it. >> > >> >> Yes, I'll fix. 
>> >> >> int ret = -1; >> >> >> >> group = vfio_get_group(groupid, as); >> >> @@ -4442,7 +4444,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, >> >> switch (req) { >> >> case VFIO_CHECK_EXTENSION: >> >> case VFIO_IOMMU_SPAPR_TCE_GET_INFO: >> >> + break; >> >> case VFIO_EEH_PE_OP: >> >> + arg = (struct vfio_eeh_pe_op *)param; >> >> + switch (arg->op) { >> >> + case VFIO_EEH_PE_RESET_HOT: >> >> + case VFIO_EEH_PE_RESET_FUNDAMENTAL: >> >> + /* >> >> + * The MSIx table will be cleaned out by reset. We need >> >> + * disable it so that it can be reenabled properly. Also, >> >> + * the cached MSIx table should be cleared as it's not >> >> + * reflecting the contents in hardware. >> >> + */ >> >> + QLIST_FOREACH(vdev, &group->device_list, next) { >> >> + if (msix_enabled(&vdev->pdev)) { >> >> + vfio_disable_msix(vdev); >> >> + } >> >> + >> >> + msix_reset(&vdev->pdev); >> >> + } >> > >> >We already have vfio_disable_interrupts(), can't we use that (blindly)? >> >Do we really need to call msix_reset()? If so, should >> >vfio_disable_msix() call it? >> > >> >> Yes, vfio_disable_interrupts() would be better to be used here as it >> can covers all types of interrupt (INTx/MSI/MSIx). >> >> vfio_disable_interrupts() needn't clear MSIx vectors. If you prefer >> calling msix_reset() in the function, I guess I have to add one more >> parameter to vfio_disable_interrupts() to indicate if we need clear >> MSI/MSIx vector: >> >> static void vfio_disable_interrupts(VFIODevice *vdev, bool clr_vector) > >How about just creating a vfio_reset_interrupts() that calls >msix_reset() if the device supports MSIX? We could wrap >vfio_disable_interrupts() and vfio_reset_interrupts() into a >vfio_disable_and_reset_interrupts(), but I'm not sure it's worth it. >The reset device path should also add the interrupt reset call. Thanks, > Thanks, Alex. It sounds a better way. I'll refactor the code and send v2 out for your comments. 
Thanks, Gavin >Alex > >> >> + >> >> + break; >> > >> >Extraneous break >> > >> >> Yep, I'll remove it. >> >> >> + } >> >> + >> >> break; >> >> default: >> >> /* Return an error on unknown requests */ >> > >> >Thanks, >> >Alex >> > >> >> Thanks, >> Gavin >> > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
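Editorial sketch, not the v2 patch (which is not shown in this thread): one way to read the suggestion above is a small helper that resets the cached MSI-X state only when the device has MSI-X, called right after interrupts are disabled for each device in the group. The C below models that with stand-alone mock types. Every mock_* name, the has_msix flag, and the zero-fill in mock_msix_reset() are assumptions made for illustration; QEMU's real msix_reset() and vfio_disable_interrupts() do more than this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MSIX_ENTRIES 4

/* Hypothetical stand-in for the emulated PCI device state. */
typedef struct MockPCIDevice {
    bool has_msix;                              /* device exposes MSI-X */
    bool msix_enabled;                          /* guest has enabled MSI-X */
    unsigned int msix_table[MOCK_MSIX_ENTRIES]; /* cached table entries */
} MockPCIDevice;

/* Hypothetical stand-in for a VFIO device on a group's device list. */
typedef struct MockVFIODevice {
    MockPCIDevice pdev;
    struct MockVFIODevice *next;
} MockVFIODevice;

/* Models the role of vfio_disable_interrupts(): stop interrupt delivery. */
static void mock_disable_interrupts(MockVFIODevice *vdev)
{
    vdev->pdev.msix_enabled = false;
}

/* Simplified model of msix_reset(): forget the cached table contents. */
static void mock_msix_reset(MockPCIDevice *pdev)
{
    memset(pdev->msix_table, 0, sizeof(pdev->msix_table));
}

/* The proposed helper: reset cached MSI-X state only if MSI-X exists. */
static void mock_reset_interrupts(MockVFIODevice *vdev)
{
    if (vdev->pdev.has_msix) {
        mock_msix_reset(&vdev->pdev);
    }
}

/*
 * What the reset cases of the EEH PE op would do before the ioctl:
 * walk every device in the group, disable, then reset.
 * Returns the number of devices visited.
 */
static int mock_prepare_pe_reset(MockVFIODevice *head)
{
    int count = 0;

    for (MockVFIODevice *vdev = head; vdev; vdev = vdev->next) {
        mock_disable_interrupts(vdev);
        mock_reset_interrupts(vdev);
        count++;
    }
    return count;
}
```

Keeping the reset in its own helper leaves the disable function's signature unchanged, which avoids the extra bool parameter floated earlier in the thread.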
* Re: [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset
  2014-08-20 9:52 [Qemu-devel] [RFC PATCH 0/2] Fix MSIx lost after PE reset Gavin Shan
  2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl() Gavin Shan
  2014-08-20 9:52 ` [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset Gavin Shan
@ 2014-08-26 0:58 ` Gavin Shan
  2 siblings, 0 replies; 11+ messages in thread
From: Gavin Shan @ 2014-08-26 0:58 UTC (permalink / raw)
  To: Gavin Shan; +Cc: aik, alex.williamson, qemu-devel

On Wed, Aug 20, 2014 at 07:52:06PM +1000, Gavin Shan wrote:
>The 2 patches fix MSIx lost after PE reset. Otherwise, the MSIx
>entries can't be restored successfully after PE reset and the
>EEH recovery fails on broadcom tg3 adapter (as tested) in guest.
>
>Note: The patchset "EEH support for guest" isn't merged yet, those
>      2 patches are based on Alex Graf's "ppc-next" branch + the
>      patchset supporting EEH for guest, which can be checked out
>      from below link:
>
>      git@github.com:gwshan/qemu.git (branch: eeh)
>
>Gavin Shan (2):
>  VFIO: Drop vfio_container_do_ioctl()
>  VFIO: Clear stale MSIx table during EEH reset
>
> hw/misc/vfio.c | 57 +++++++++++++++++++++++++++++++++++++++------------------
> 1 file changed, 39 insertions(+), 18 deletions(-)
>

Alex, any comments? :-)

Thanks,
Gavin

^ permalink raw reply [flat|nested] 11+ messages in thread