qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Date: Fri, 15 Dec 2017 15:09:28 +1100	[thread overview]
Message-ID: <20171215040928.GE7753@umbus.fritz.box> (raw)
In-Reply-To: <20171212090525.19a21378@t450s.home>

[-- Attachment #1: Type: text/plain, Size: 8639 bytes --]

On Tue, Dec 12, 2017 at 09:05:25AM -0700, Alex Williamson wrote:
> On Tue, 12 Dec 2017 18:01:40 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> > On 12/12/17 17:06, Alexey Kardashevskiy wrote:
> > > On 12/12/17 16:54, Alex Williamson wrote:  
> > >> On Tue, 12 Dec 2017 16:21:31 +1100
> > >> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> > >>  
> > >>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> > >>> which tells that a region with MSIX data can be mapped entirely, i.e.
> > >>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> > >>>
> > >>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> > >>> by default and "false" for pseries-2.12+ machines.
> > >>>
> > >>> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> > >>> https://www.spinics.net/lists/kvm/msg160282.html
> > >>>
> > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>> ---
> > >>>
> > >>> This is an RFC as it requires kernel headers update which is not there yet.
> > >>>
> > >>> I'd like to make it "msix-mmap" (without "no") but could not find a way
> > >>> of enabling a device property for machine versions newer than some value.
> > >>>
> > >>> I changed 2.11 machine just for the demonstration purpose.
> > >>>
> > >>>
> > >>> ---
> > >>>  hw/vfio/pci.h                 |  1 +
> > >>>  include/hw/vfio/vfio-common.h |  1 +
> > >>>  linux-headers/linux/vfio.h    |  5 +++++
> > >>>  hw/ppc/spapr.c                | 10 +++++++++-
> > >>>  hw/vfio/common.c              | 15 +++++++++++++++
> > >>>  hw/vfio/pci.c                 | 11 +++++++++++
> > >>>  6 files changed, 42 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> > >>> index a8fb3b3..53912ef 100644
> > >>> --- a/hw/vfio/pci.h
> > >>> +++ b/hw/vfio/pci.h
> > >>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
> > >>>      bool no_kvm_intx;
> > >>>      bool no_kvm_msi;
> > >>>      bool no_kvm_msix;
> > >>> +    bool msix_no_mmap;
> > >>>  } VFIOPCIDevice;
> > >>>  
> > >>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > >>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > >>> index f3a2ac9..927d600 100644
> > >>> --- a/include/hw/vfio/vfio-common.h
> > >>> +++ b/include/hw/vfio/vfio-common.h
> > >>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
> > >>>                           struct vfio_region_info **info);
> > >>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> > >>>                               uint32_t subtype, struct vfio_region_info **info);
> > >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
> > >>>  #endif
> > >>>  extern const MemoryListener vfio_prereg_listener;
> > >>>  
> > >>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> > >>> index 4e7ab4c..bce9baf 100644
> > >>> --- a/linux-headers/linux/vfio.h
> > >>> +++ b/linux-headers/linux/vfio.h
> > >>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
> > >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
> > >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
> > >>>  
> > >>> +/*
> > >>> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> > >>> + */
> > >>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> > >>> +
> > >>>  /**
> > >>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
> > >>>   *				    struct vfio_irq_info)
> > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > >>> index 9de63f0..1dfc386 100644
> > >>> --- a/hw/ppc/spapr.c
> > >>> +++ b/hw/ppc/spapr.c
> > >>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
> > >>>  /*
> > >>>   * pseries-2.11
> > >>>   */
> > >>> +#define SPAPR_COMPAT_2_11                                             \
> > >>> +    HW_COMPAT_2_10                                                    \
> > >>> +    {                                                                 \
> > >>> +        .driver = "vfio-pci",                                         \
> > >>> +        .property = "msix-no-mmap",                                   \
> > >>> +        .value    = "on",                                             \
> > >>> +    },                                                                \
> > >>> +
> > >>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
> > >>>  {
> > >>>  }
> > >>>  
> > >>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
> > >>>  {
> > >>> -    /* Defaults for the latest behaviour inherited from the base class */
> > >>> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
> > >>>  }
> > >>>  
> > >>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> > >>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > >>> index ed7717d..593514c 100644
> > >>> --- a/hw/vfio/common.c
> > >>> +++ b/hw/vfio/common.c
> > >>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> > >>>      return -ENODEV;
> > >>>  }
> > >>>  
> > >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> > >>> +{
> > >>> +        struct vfio_region_info *info = NULL;
> > >>> +        bool ret = false;
> > >>> +
> > >>> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> > >>> +            if (vfio_get_region_info_cap(info, cap_type)) {
> > >>> +                ret = true;
> > >>> +            }
> > >>> +            g_free(info);
> > >>> +        }
> > >>> +
> > >>> +        return ret;
> > >>> +}
> > >>> +
> > >>>  /*
> > >>>   * Interfaces for IBM EEH (Enhanced Error Handling)
> > >>>   */
> > >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > >>> index c977ee3..d9aeae8 100644
> > >>> --- a/hw/vfio/pci.c
> > >>> +++ b/hw/vfio/pci.c
> > >>> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
> > >>>      off_t start, end;
> > >>>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
> > >>>  
> > >>> +    if (!vdev->msix_no_mmap &&
> > >>> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> > >>> +                            vdev->msix->table_bar)) {
> > >>> +        return;
> > >>> +    }
> > >>> +
> > >>>      /*
> > >>>       * We expect to find a single mmap covering the whole BAR, anything else
> > >>>       * means it's either unsupported or already setup.
> > >>> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
> > >>>       */
> > >>>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
> > >>>  
> > >>> +    if (!vdev->msix_no_mmap) {
> > >>> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> > >>> +    }  
> > >>
> > >> No, you're conflating issues.  There's (a) can we mmap over the MSI-X
> > >> vector table and (b) do we require QEMU emulation of the MSI-X vector
> > >> table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
> > >> the kernel supports it,  
> > > 
> > > This is the default setting, or you do not want it to be a property so the
> > > user cannot shoot himself in a foot?
> 
> If the kernel allows mmap, other than debugging, why would it ever need
> to be disabled?
> 
>    
> > >> (b) should never be enabled on ppc, regardless
> > >> of (a).  Thanks,  
> > > 
> > > 
> > > The intention is to have a property - msix_no_mmap=true, except a single
> > > case - ppc-pseries, I just do not know how to enforce it for a specific
> > > machine type.  
> 
> The intention is wrong.  (a) should be done any time the kernel allows
> it.  (b) is an independent concept of disabling QEMU MSI-X emulation
> for platforms, ie. machine types, that do not require it.  (b) has
> nothing to do with the mmap'ability of the msix table area.  So far (b)
> includes only the ppc spapr machine and I don't see a reason to allow
> the user to control this.  Thanks,

I don't either, but setting properties like this seems to be the
more-or-less standard way for the machine type (and/or version) to
affect operation of other devices.

Allowing the user to shoot themselves in the foot is a side effect of
that, but seems to be one we're ok with.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2017-12-15  5:51 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-12  5:21 [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR Alexey Kardashevskiy
2017-12-12  5:45 ` Alexey Kardashevskiy
2017-12-12  5:54 ` Alex Williamson
2017-12-12  6:06   ` Alexey Kardashevskiy
2017-12-12  7:01     ` Alexey Kardashevskiy
2017-12-12 16:05       ` Alex Williamson
2017-12-15  4:09         ` David Gibson [this message]
2017-12-15  4:07 ` David Gibson
2017-12-15 16:04   ` Alex Williamson
2017-12-18  3:58     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171215040928.GE7753@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).