From: David Gibson <david@gibson.dropbear.id.au>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/3] VFIO: Clear stale MSIx table during EEH reset
Date: Thu, 26 Mar 2015 12:10:52 +1100	[thread overview]
Message-ID: <20150326011052.GB28039@voom.redhat.com> (raw)
In-Reply-To: <20150326005348.GA15928@shangw>

On Thu, Mar 26, 2015 at 11:53:48AM +1100, Gavin Shan wrote:
> On Tue, Mar 24, 2015 at 06:53:29AM -0600, Alex Williamson wrote:
> >On Tue, 2015-03-24 at 17:54 +1100, David Gibson wrote:
> >> On Tue, Mar 24, 2015 at 05:24:55PM +1100, Gavin Shan wrote:
> >> > On Tue, Mar 24, 2015 at 04:41:21PM +1100, David Gibson wrote:
> >> > >On Mon, Mar 23, 2015 at 04:25:10PM +1100, Gavin Shan wrote:
> >> > >> On Mon, Mar 23, 2015 at 04:06:56PM +1100, David Gibson wrote:
> >> > >> >On Fri, Mar 20, 2015 at 05:27:29PM +1100, Gavin Shan wrote:
> >> > >> >> On Fri, Mar 20, 2015 at 05:04:01PM +1100, David Gibson wrote:
> >> > >> >> >On Tue, Mar 17, 2015 at 03:31:24AM +1100, Gavin Shan wrote:
> >> > >> >> >> The PCI device MSIx table is cleaned out in hardware after EEH PE
> >> > >> >> >> reset. However, we still hold the stale MSIx entries in QEMU, which
> >> > >> >> >> should be cleared accordingly. Otherwise, we will run into another
> >> > >> >> >> (recursive) EEH error and the PCI devices contained in the PE have
> >> > >> >> >> to be offlined exceptionally.
> >> > >> >> >> 
> >> > >> >> >> The patch clears stale MSIx table before EEH PE reset so that MSIx
> >> > >> >> >> table could be restored properly after EEH PE reset.
> >> > >> >> >> 
> >> > >> >> >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> >> > >> >> >> ---
> >> > >> >> >> v2: vfio_container_eeh_event() stub for !CONFIG_PCI and separate
> >> > >> >> >>     error message for this function. Dropped vfio_put_group()
> >> > >> >> >>     on NULL group
> >> > >> >> >> ---
> >> > >> >> >>  hw/vfio/Makefile.objs  |  6 +++++-
> >> > >> >> >>  hw/vfio/common.c       |  7 +++++++
> >> > >> >> >>  hw/vfio/pci-stub.c     | 17 +++++++++++++++++
> >> > >> >> >>  hw/vfio/pci.c          | 38 ++++++++++++++++++++++++++++++++++++++
> >> > >> >> >>  include/hw/vfio/vfio.h |  2 ++
> >> > >> >> >>  5 files changed, 69 insertions(+), 1 deletion(-)
> >> > >> >> >>  create mode 100644 hw/vfio/pci-stub.c
> >> > >> >> >> 
> >> > >> >> >> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
> >> > >> >> >> index e31f30e..1b8a065 100644
> >> > >> >> >> --- a/hw/vfio/Makefile.objs
> >> > >> >> >> +++ b/hw/vfio/Makefile.objs
> >> > >> >> >> @@ -1,4 +1,8 @@
> >> > >> >> >>  ifeq ($(CONFIG_LINUX), y)
> >> > >> >> >>  obj-$(CONFIG_SOFTMMU) += common.o
> >> > >> >> >> -obj-$(CONFIG_PCI) += pci.o
> >> > >> >> >> +ifeq ($(CONFIG_PCI), y)
> >> > >> >> >> +obj-y += pci.o
> >> > >> >> >> +else
> >> > >> >> >> +obj-y += pci-stub.o
> >> > >> >> >> +endif
> >> > >> >> >>  endif
> >> > >> >> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> > >> >> >> index 148eb53..ed07814 100644
> >> > >> >> >> --- a/hw/vfio/common.c
> >> > >> >> >> +++ b/hw/vfio/common.c
> >> > >> >> >> @@ -949,7 +949,14 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> >> > >> >> >>      switch (req) {
> >> > >> >> >>      case VFIO_CHECK_EXTENSION:
> >> > >> >> >>      case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
> >> > >> >> >> +        break;
> >> > >> >> >>      case VFIO_EEH_PE_OP:
> >> > >> >> >> +        if (vfio_container_eeh_event(as, groupid, param) != 0) {
> >> > >> >> >
> >> > >> >> >I really dislike the idea of having an arbitrarily complex side effect
> >> > >> >> >from a function whose name suggests it's just a trivial wrapper
> >> > >> >> >around the ioctl().
> >> > >> >> >
> >> > >> >> 
> >> > >> >> Ok. I guess you would like putting the complexity in the callers of
> >> > >> >> vfio_container_ioctl().
> >> > >> >
> >> > >> >Well.. maybe.  I'd also be happy if helper functions were implemented
> >> > >> >which both called the ioctl() and did the other necessary pieces.
> >> > >> >They should just be called something that indicates their full
> >> > >> >function, not a name which suggests they're just an ioctl wrapper.
> >> > >> >
> >> > >> 
> >> > >> Indeed, vfio_container_ioctl() isn't indicating what the function is doing.
> >> > >> How about renaming it to vfio_container_event_and_ioctl()? I'm always bad
> >> > >> at giving a good function name :)
> >> > >
> >> > >Well, I don't think your wrapper should be multiplexed.  The multiplex
> >> > >works for the simple ioctl() wrapper, because there really is nothing
> >> > >that varies apart from the exact ioctl number called.
> >> > >
> >> > >But now that you have different operations here, I think you want
> >> > >wrappers for each one - each one will call the ioctl(), then do the
> >> > >specific extra steps necessary for that operation.  So
> >> > >vfio_container_event() will go away as well, split into various other
> >> > >functions.
> >> > >
> >> > 
> >> > It wouldn't be a good idea, if I understand your proposal correctly. Currently,
> >> > the global function vfio_container_ioctl() can be called from sPAPR platform
> >> > for any ioctl commands handled in kernel source file vfio_iommu_spapr_tce.c,
> >> > which means the function isn't called for EEH only. Other sPAPR TCE container
> >> > ioctl commands are also routed by this function. There would be lots of them if
> >> > we had one global function for each ioctl command, which just increases the cost
> >> > of maintaining the code.
> >> 
> >> I don't really follow your objection.  I'm only suggesting separate
> >> wrappers for things which require extra actions currently implemented
> >> in vfio_container_event().  Things which only need the plain ioctl()
> >> can still use the simple vfio_container_ioctl() wrapper.
> >
> >vfio_container_ioctl() also filters to a limited set of ioctls, it
> >clearly does not allow any ioctl.
> >
> 
> Ok. I think you guys expect something like the following. Note that the following
> vfio_container_eeh_ioctl() will accept a limited set of EEH operations, similar
> to what vfio_container_ioctl() does for the ioctl commands:
> 
> If you agree to have the changes, I'll put another patch on top of this one
> to replace vfio_container_ioctl() in spapr_pci_vfio.c with vfio_container_eeh_ioctl()
> for EEH cases.
> 
> int vfio_container_eeh_ioctl(AddressSpace *as, int32_t groupid,
>                              struct vfio_eeh_pe_op *op)
> {
>     switch (op->op) {
>     case VFIO_EEH_PE_RESET_HOT:
>     case VFIO_EEH_PE_RESET_FUNDAMENTAL: {
>         VFIOGroup *group;
>         VFIODevice *vbasedev;
>         VFIOPCIDevice *vdev;
> 
>         /*
>          * The MSIx table will be cleaned out by reset. We need
>          * disable it so that it can be reenabled properly. Also,
>          * the cached MSIx table should be cleared as it's not
>          * reflecting the contents in hardware.
>          */
>         group = vfio_get_group(groupid, as);
>         if (!group) {
>             error_report("vfio: group %d not found\n", groupid);
>             return -1;
>         }
> 
>         QLIST_FOREACH(vbasedev, &group->device_list, next) {
>             vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>             if (msix_enabled(&vdev->pdev)) {
>                 vfio_disable_msix(vdev);
>             }
> 
>             msix_reset(&vdev->pdev);
>         }
> 
>         vfio_put_group(group);
> 
>         break;
>     }
>     case VFIO_EEH_PE_DISABLE:
>     case VFIO_EEH_PE_ENABLE:
>     case VFIO_EEH_PE_UNFREEZE_IO:
>     case VFIO_EEH_PE_UNFREEZE_DMA:
>     case VFIO_EEH_PE_GET_STATE:
>     case VFIO_EEH_PE_RESET_DEACTIVATE:
>     case VFIO_EEH_PE_CONFIGURE:
>         break;
>     default:
>         error_report("vfio: unsupported EEH operation %X\n", op->op);
>         return -1;
>     }
> 
>     return vfio_container_ioctl(as, groupid, VFIO_EEH_PE_OP, op);
> }


No, extra operation-specific logic inside the ioctl wrapper is exactly
what I want to avoid.  Instead I want to see
vfio_container_eeh_ioctl() remain as it is now - doing nothing but
verifying the ioctl() number, then passing the arguments on to
ioctl().
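
To be concrete about what I mean by "nothing but verifying the ioctl()
number", here is a rough standalone sketch.  It is not QEMU code: the
demo_* names, the request numbers and the backend function pointer are
all made up for illustration, standing in for the real request numbers
from <linux/vfio.h> and the real ioctl() on the container fd.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in request numbers; the real ones come from <linux/vfio.h>. */
enum demo_req {
    DEMO_CHECK_EXTENSION = 1,
    DEMO_SPAPR_TCE_GET_INFO,
    DEMO_EEH_PE_OP,
    DEMO_UNRELATED,             /* deliberately not in the allowed set */
};

/* Hypothetical backend standing in for ioctl(fd, req, param). */
typedef int (*demo_ioctl_fn)(int req, void *param);

/*
 * Thin wrapper: check that the request is in the allowed set, then
 * forward the arguments untouched.  No per-operation side effects.
 */
static int demo_container_ioctl(demo_ioctl_fn backend, int req, void *param)
{
    assert(backend != NULL);

    switch (req) {
    case DEMO_CHECK_EXTENSION:
    case DEMO_SPAPR_TCE_GET_INFO:
    case DEMO_EEH_PE_OP:
        break;
    default:
        fprintf(stderr, "demo: unsupported ioctl %d\n", req);
        return -1;
    }

    return backend(req, param);
}

/* Counting backend, so a caller can see whether a request was forwarded. */
static int demo_calls;

static int demo_backend(int req, void *param)
{
    (void)req;
    (void)param;
    demo_calls++;
    return 0;
}
```

Anything that needs more than this belongs in a separately named
function that calls the wrapper, not inside it.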

What I'm expecting is then to add new functions, along the lines of:

int vfio_eeh_pe_reset(...)
{
    VFIOGroup *group;
    VFIODevice *vbasedev;
    VFIOPCIDevice *vdev;

    /*
     * The MSIx table will be cleaned out by reset. We need to
     * disable it so that it can be reenabled properly. Also,
     * the cached MSIx table should be cleared as it no longer
     * reflects the contents in hardware.
     */
    group = vfio_get_group(groupid, as);
    if (!group) {
        error_report("vfio: group %d not found", groupid);
        return -1;
    }

    QLIST_FOREACH(vbasedev, &group->device_list, next) {
        vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
        if (msix_enabled(&vdev->pdev)) {
            vfio_disable_msix(vdev);
        }

        msix_reset(&vdev->pdev);
    }

    vfio_put_group(group);

    return vfio_container_eeh_ioctl(as, groupid,
                                    VFIO_EEH_PE_RESET_FUNDAMENTAL, op);
}

If this function can build the op structure itself from sensible
arguments, then that's even better.
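
As a rough illustration of that last point, here is a standalone sketch
of a reset helper that takes a plain "fundamental or hot" argument and
fills in the op structure itself.  Everything named demo_* is
hypothetical: the struct is only a stand-in for struct vfio_eeh_pe_op,
the op codes are placeholders rather than the kernel's values, and the
two callbacks stand in for the MSIx cleanup loop and for the thin ioctl
wrapper respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct vfio_eeh_pe_op; the real one is in <linux/vfio.h>. */
struct demo_eeh_pe_op {
    uint32_t argsz;
    uint32_t flags;
    uint32_t op;
};

/* Placeholder op codes, not the kernel's values. */
enum {
    DEMO_EEH_PE_RESET_HOT = 100,
    DEMO_EEH_PE_RESET_FUNDAMENTAL = 101,
};

typedef void (*demo_quiesce_fn)(void);
typedef int (*demo_send_fn)(const struct demo_eeh_pe_op *op);

/*
 * Operation-specific helper: build the op from a simple argument, do
 * the pre-reset cleanup, then hand the op to the thin wrapper.
 */
static int demo_eeh_pe_reset(bool fundamental, demo_quiesce_fn quiesce,
                             demo_send_fn send)
{
    struct demo_eeh_pe_op op;

    assert(quiesce != NULL && send != NULL);

    memset(&op, 0, sizeof(op));
    op.argsz = sizeof(op);
    op.op = fundamental ? DEMO_EEH_PE_RESET_FUNDAMENTAL
                        : DEMO_EEH_PE_RESET_HOT;

    quiesce();          /* stands in for the MSIx disable/reset loop */
    return send(&op);   /* stands in for the plain ioctl wrapper */
}

/* Recording stubs, so the ordering and the built op can be inspected. */
static int demo_quiesced;
static int demo_quiesced_at_send = -1;
static uint32_t demo_last_op;
static uint32_t demo_last_argsz;

static void demo_quiesce(void)
{
    demo_quiesced++;
}

static int demo_send(const struct demo_eeh_pe_op *op)
{
    demo_quiesced_at_send = demo_quiesced;
    demo_last_op = op->op;
    demo_last_argsz = op->argsz;
    return 0;
}
```

The caller then never touches the op structure at all, which keeps the
layout knowledge in one place.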

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
