qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03
@ 2017-10-03 22:12 Alex Williamson
  2017-10-03 22:12 ` [Qemu-devel] [PULL 1/3] vfio/pci: Do not unwind on error Alex Williamson
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Alex Williamson @ 2017-10-03 22:12 UTC
  To: qemu-devel

The following changes since commit d147f7e815f97cb477e223586bcb80c316ae10ea:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2017-10-03 16:27:24 +0100)

are available in the git repository at:


  git://github.com/awilliam/qemu-vfio.git tags/vfio-updates-20171003.0

for you to fetch changes up to dfbee78db8fdf7bc8c151c3d29504bb47438480b:

  vfio/pci: Add NVIDIA GPUDirect Cliques support (2017-10-03 12:57:36 -0600)

----------------------------------------------------------------
VFIO updates 2017-10-03

 - NVIDIA GPUDirect Cliques experimental support (Alex Williamson)

----------------------------------------------------------------
Alex Williamson (3):
      vfio/pci: Do not unwind on error
      vfio/pci: Add virtual capabilities quirk infrastructure
      vfio/pci: Add NVIDIA GPUDirect Cliques support

 hw/vfio/pci-quirks.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c        |  17 +++++++-
 hw/vfio/pci.h        |   4 ++
 3 files changed, 133 insertions(+), 2 deletions(-)

* [Qemu-devel] [PULL 1/3] vfio/pci: Do not unwind on error
  2017-10-03 22:12 [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Alex Williamson
@ 2017-10-03 22:12 ` Alex Williamson
  2017-10-03 22:12 ` [Qemu-devel] [PULL 2/3] vfio/pci: Add virtual capabilities quirk infrastructure Alex Williamson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2017-10-03 22:12 UTC
  To: qemu-devel

If vfio_add_std_cap() fails, jumping to the out label prepends
irrelevant errors for capabilities we haven't attempted to add as we
unwind our recursive stack.  Just return the error.
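
To illustrate the effect, here is a minimal standalone sketch, not
QEMU code.  DemoError, demo_prepend() and demo_add_cap() are
hypothetical stand-ins for Error, error_prepend() and
vfio_add_std_cap(); the "unwind" flag selects between the old goto
behaviour and the new early return.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: a fixed message buffer. */
typedef struct {
    char msg[256];
} DemoError;

/* Put prefix in front of the current message, loosely like error_prepend(). */
static void demo_prepend(DemoError *err, const char *prefix)
{
    char tmp[256];

    snprintf(tmp, sizeof(tmp), "%s%s", prefix, err->msg);
    memcpy(err->msg, tmp, sizeof(tmp));
}

/*
 * Walk a chain of n capabilities, deepest first.  Adding capability
 * fail_at fails.  unwind != 0 models the old "goto out", unwind == 0
 * models the new "return ret".
 */
static int demo_add_cap(int idx, int n, int fail_at, int unwind,
                        DemoError *err)
{
    char prefix[32];
    int ret = 0;

    if (idx + 1 < n) {
        ret = demo_add_cap(idx + 1, n, fail_at, unwind, err);
        if (ret && !unwind) {
            return ret;     /* new behaviour: message left untouched */
        }
        /* old behaviour: fall through to the shared error path */
    }

    if (!ret && idx == fail_at) {
        snprintf(err->msg, sizeof(err->msg), "no space");
        ret = -1;
    }

    if (ret < 0) {
        snprintf(prefix, sizeof(prefix), "cap %d: ", idx);
        demo_prepend(err, prefix);
    }
    return ret;
}
```

With a chain of three capabilities where the last one fails, the
unwinding variant also blames capabilities 0 and 1, which were never
attempted, while the early return reports only the one that failed.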

Fixes: 7ef165b9a8d9 ("vfio/pci: Pass an error object to vfio_add_capabilities")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/vfio/pci.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 31e1edf44745..916d365dfab3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1826,7 +1826,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
     if (next) {
         ret = vfio_add_std_cap(vdev, next, errp);
         if (ret) {
-            goto out;
+            return ret;
         }
     } else {
         /* Begin the rebuild, use QEMU emulated list bits */
@@ -1862,7 +1862,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
         ret = pci_add_capability(pdev, cap_id, pos, size, errp);
         break;
     }
-out:
+
     if (ret < 0) {
         error_prepend(errp,
                       "failed to add PCI capability 0x%x[0x%x]@0x%x: ",

* [Qemu-devel] [PULL 2/3] vfio/pci: Add virtual capabilities quirk infrastructure
  2017-10-03 22:12 [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Alex Williamson
  2017-10-03 22:12 ` [Qemu-devel] [PULL 1/3] vfio/pci: Do not unwind on error Alex Williamson
@ 2017-10-03 22:12 ` Alex Williamson
  2017-10-03 22:13 ` [Qemu-devel] [PULL 3/3] vfio/pci: Add NVIDIA GPUDirect Cliques support Alex Williamson
  2017-10-05 13:43 ` [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2017-10-03 22:12 UTC
  To: qemu-devel

If the hypervisor needs to add purely virtual capabilities, give us a
hook through quirks to do that.  Note that we determine the maximum
size for a capability based on the physical device; if we insert a
virtual capability, that can change.  Therefore, if the maximum size
is smaller after virtual capabilities are added, use that.
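
As a rough sketch of why the size can shrink (standalone code, not the
QEMU implementation): assume the room available to a capability is the
distance from its offset to the next higher capability offset, or to
the end of the 256-byte config space.  The demo_* names and the
offsets in the usage note are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CONFIG_SPACE_SIZE 256  /* conventional PCI config space */

/*
 * Room available to the capability at pos: distance to the nearest
 * capability at a higher offset, or to the end of config space.
 */
static uint16_t demo_cap_max_size(const uint8_t *caps, size_t ncaps,
                                  uint8_t pos)
{
    uint16_t next = DEMO_CONFIG_SPACE_SIZE;
    size_t i;

    for (i = 0; i < ncaps; i++) {
        if (caps[i] > pos && caps[i] < next) {
            next = caps[i];
        }
    }
    return (uint16_t)(next - pos);
}

/* Clamp a previously computed size to the room that is left now. */
static uint16_t demo_scaled_size(uint16_t size, const uint8_t *caps,
                                 size_t ncaps, uint8_t pos)
{
    uint16_t max = demo_cap_max_size(caps, ncaps, pos);

    return size < max ? size : max;
}
```

For example, with physical capabilities at 0x60, 0x78 and 0xB4, the
one at 0xB4 has 0x4C bytes available.  Once a virtual capability is
placed at 0xC8, only 0x14 bytes remain, so the earlier size has to be
scaled down.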

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/vfio/pci-quirks.c |    4 ++++
 hw/vfio/pci.c        |    8 ++++++++
 hw/vfio/pci.h        |    1 +
 3 files changed, 13 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 349085ea12bc..40aaae76feb9 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1850,3 +1850,7 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
         break;
     }
 }
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
+{
+    return 0;
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 916d365dfab3..bfeaaef22d00 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1833,8 +1833,16 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
         pdev->config[PCI_CAPABILITY_LIST] = 0;
         vdev->emulated_config_bits[PCI_CAPABILITY_LIST] = 0xff;
         vdev->emulated_config_bits[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
+
+        ret = vfio_add_virt_caps(vdev, errp);
+        if (ret) {
+            return ret;
+        }
     }
 
+    /* Scale down size, esp in case virt caps were added above */
+    size = MIN(size, vfio_std_cap_max_size(pdev, pos));
+
     /* Use emulated next pointer to allow dropping caps */
     pci_set_byte(vdev->emulated_config_bits + pos + PCI_CAP_LIST_NEXT, 0xff);
 
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index a8366bb2a74a..958cee058b3b 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -160,6 +160,7 @@ void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr);
 void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr);
 void vfio_bar_quirk_finalize(VFIOPCIDevice *vdev, int nr);
 void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev);
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp);
 
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp);
 

* [Qemu-devel] [PULL 3/3] vfio/pci: Add NVIDIA GPUDirect Cliques support
  2017-10-03 22:12 [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Alex Williamson
  2017-10-03 22:12 ` [Qemu-devel] [PULL 1/3] vfio/pci: Do not unwind on error Alex Williamson
  2017-10-03 22:12 ` [Qemu-devel] [PULL 2/3] vfio/pci: Add virtual capabilities quirk infrastructure Alex Williamson
@ 2017-10-03 22:13 ` Alex Williamson
  2017-10-05 13:43 ` [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2017-10-03 22:13 UTC
  To: qemu-devel

NVIDIA has defined a specification for creating GPUDirect "cliques",
where devices with the same clique ID support direct peer-to-peer DMA.
When running on bare-metal, tools like NVIDIA's p2pBandwidthLatencyTest
(part of cuda-samples) determine which GPUs can support peer-to-peer
based on chipset and topology.  When running in a VM, these tools have
no visibility into the physical hardware support or topology.  This
option allows the user to specify hints via a vendor-defined
capability.  For instance:

  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev0.x-nv-gpudirect-clique=0'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev1.x-nv-gpudirect-clique=1'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev2.x-nv-gpudirect-clique=1'/>
  </qemu:commandline>

This enables two cliques.  The first is a singleton clique with ID 0,
for the first hostdev defined in the XML (note that since cliques
define peer-to-peer sets, a singleton clique offers no benefit).  The
subsequent two hostdevs are both added to clique ID 1, indicating
peer-to-peer is possible between these devices.

QEMU only validates that the clique ID is in range and applied to an
NVIDIA graphics device; validating that the resulting cliques are
functional is the user's responsibility.  The NVIDIA specification
allows a 4-bit clique ID, thus valid values are 0-15.
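
The range check and the resulting 8-byte capability can be sketched
standalone as follows.  This follows the byte order the patch writes
into config space (cap ID 9h, next 0h, length 8h, 'P', '2', 'P',
clique ID in bits 6:3, then zero); the demo_* names are hypothetical
and the function fills a plain buffer rather than device config
space.

```c
#include <stdint.h>

#define DEMO_CAP_ID_VNDR 0x09   /* vendor-specific capability ID */
#define DEMO_CAP_LEN     8      /* total capability length in bytes */

/*
 * Build the GPUDirect clique capability for a 4-bit clique ID.
 * Returns 0 on success, -1 if the ID is outside 0-15.
 */
static int demo_encode_clique_cap(uint8_t clique, uint8_t out[DEMO_CAP_LEN])
{
    if (clique & ~0xF) {
        return -1;              /* does not fit in 4 bits */
    }

    out[0] = DEMO_CAP_ID_VNDR;  /* cap id */
    out[1] = 0;                 /* next: last capability in the list */
    out[2] = DEMO_CAP_LEN;      /* vendor capability length */
    out[3] = 'P';               /* signature bits 7:0 */
    out[4] = '2';               /* signature bits 15:8 */
    out[5] = 'P';               /* signature bits 23:16 */
    out[6] = (uint8_t)(clique << 3);    /* id in bits 6:3, version 0 */
    out[7] = 0;                 /* reserved */
    return 0;
}
```

Clique ID 1, as used for hostdev1 and hostdev2 above, yields 0x08 in
the second-to-last byte; an ID of 16 is rejected.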

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/vfio/pci-quirks.c |  110 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c        |    5 ++
 hw/vfio/pci.h        |    3 +
 3 files changed, 118 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 40aaae76feb9..14291c2a16b2 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -14,6 +14,7 @@
 #include "qemu/error-report.h"
 #include "qemu/range.h"
 #include "qapi/error.h"
+#include "qapi/visitor.h"
 #include "hw/nvram/fw_cfg.h"
 #include "pci.h"
 #include "trace.h"
@@ -1850,7 +1851,116 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
         break;
     }
 }
+
+/*
+ * The NVIDIA GPUDirect P2P Vendor capability allows the user to specify
+ * devices as a member of a clique.  Devices within the same clique ID
+ * are capable of direct P2P.  It's the user's responsibility that this
+ * is correct.  The spec says that this may reside at any unused config
+ * offset, but reserves and recommends hypervisors place this at C8h.
+ * The spec also states that the hypervisor should place this capability
+ * at the end of the capability list, thus next is defined as 0h.
+ *
+ * +----------------+----------------+----------------+----------------+
+ * | sig 7:0 ('P')  |  vndr len (8h) |    next (0h)   |   cap id (9h)  |
+ * +----------------+----------------+----------------+----------------+
+ * | rsvd 15:7(0h),id 6:3,ver 2:0(0h)|          sig 23:8 ('P2')        |
+ * +---------------------------------+---------------------------------+
+ *
+ * https://lists.gnu.org/archive/html/qemu-devel/2017-08/pdfUda5iEpgOS.pdf
+ */
+static void get_nv_gpudirect_clique_id(Object *obj, Visitor *v,
+                                       const char *name, void *opaque,
+                                       Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+    visit_type_uint8(v, name, ptr, errp);
+}
+
+static void set_nv_gpudirect_clique_id(Object *obj, Visitor *v,
+                                       const char *name, void *opaque,
+                                       Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    uint8_t value, *ptr = qdev_get_prop_ptr(dev, prop);
+    Error *local_err = NULL;
+
+    if (dev->realized) {
+        qdev_prop_set_after_realize(dev, name, errp);
+        return;
+    }
+
+    visit_type_uint8(v, name, &value, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    if (value & ~0xF) {
+        error_setg(errp, "Property %s: valid range 0-15", name);
+        return;
+    }
+
+    *ptr = value;
+}
+
+const PropertyInfo qdev_prop_nv_gpudirect_clique = {
+    .name = "uint4",
+    .description = "NVIDIA GPUDirect Clique ID (0 - 15)",
+    .get = get_nv_gpudirect_clique_id,
+    .set = set_nv_gpudirect_clique_id,
+};
+
+static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    int ret, pos = 0xC8;
+
+    if (vdev->nv_gpudirect_clique == 0xFF) {
+        return 0;
+    }
+
+    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID)) {
+        error_setg(errp, "NVIDIA GPUDirect Clique ID: invalid device vendor");
+        return -EINVAL;
+    }
+
+    if (pci_get_byte(pdev->config + PCI_CLASS_DEVICE + 1) !=
+        PCI_BASE_CLASS_DISPLAY) {
+        error_setg(errp, "NVIDIA GPUDirect Clique ID: unsupported PCI class");
+        return -EINVAL;
+    }
+
+    ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
+    if (ret < 0) {
+        error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
+        return ret;
+    }
+
+    memset(vdev->emulated_config_bits + pos, 0xFF, 8);
+    pos += PCI_CAP_FLAGS;
+    pci_set_byte(pdev->config + pos++, 8);
+    pci_set_byte(pdev->config + pos++, 'P');
+    pci_set_byte(pdev->config + pos++, '2');
+    pci_set_byte(pdev->config + pos++, 'P');
+    pci_set_byte(pdev->config + pos++, vdev->nv_gpudirect_clique << 3);
+    pci_set_byte(pdev->config + pos, 0);
+
+    return 0;
+}
+
 int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
 {
+    int ret;
+
+    ret = vfio_add_nv_gpudirect_cap(vdev, errp);
+    if (ret) {
+        return ret;
+    }
+
     return 0;
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index bfeaaef22d00..9e86db7c3b6d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2970,6 +2970,8 @@ static void vfio_instance_init(Object *obj)
     vdev->host.bus = ~0U;
     vdev->host.slot = ~0U;
     vdev->host.function = ~0U;
+
+    vdev->nv_gpudirect_clique = 0xFF;
 }
 
 static Property vfio_pci_dev_properties[] = {
@@ -2994,6 +2996,9 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
                        sub_device_id, PCI_ANY_ID),
     DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
+    DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice,
+                                   nv_gpudirect_clique,
+                                   qdev_prop_nv_gpudirect_clique, uint8_t),
     /*
      * TODO - support passed fds... is this necessary?
      * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 958cee058b3b..502a5755b944 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -135,6 +135,7 @@ typedef struct VFIOPCIDevice {
     int32_t bootindex;
     uint32_t igd_gms;
     uint8_t pm_cap;
+    uint8_t nv_gpudirect_clique;
     bool pci_aer;
     bool req_enabled;
     bool has_flr;
@@ -162,6 +163,8 @@ void vfio_bar_quirk_finalize(VFIOPCIDevice *vdev, int nr);
 void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev);
 int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp);
 
+extern const PropertyInfo qdev_prop_nv_gpudirect_clique;
+
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp);
 
 int vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,

* Re: [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03
  2017-10-03 22:12 [Qemu-devel] [PULL 0/3] VFIO updates 2017-10-03 Alex Williamson
                   ` (2 preceding siblings ...)
  2017-10-03 22:13 ` [Qemu-devel] [PULL 3/3] vfio/pci: Add NVIDIA GPUDirect Cliques support Alex Williamson
@ 2017-10-05 13:43 ` Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2017-10-05 13:43 UTC
  To: Alex Williamson; +Cc: QEMU Developers

On 3 October 2017 at 23:12, Alex Williamson <alex.williamson@redhat.com> wrote:
> The following changes since commit d147f7e815f97cb477e223586bcb80c316ae10ea:
>
>   Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2017-10-03 16:27:24 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/awilliam/qemu-vfio.git tags/vfio-updates-20171003.0
>
> for you to fetch changes up to dfbee78db8fdf7bc8c151c3d29504bb47438480b:
>
>   vfio/pci: Add NVIDIA GPUDirect Cliques support (2017-10-03 12:57:36 -0600)
>
> ----------------------------------------------------------------
> VFIO updates 2017-10-03
>
>  - NVIDIA GPUDirect Cliques experimental support (Alex Williamson)
>
> ----------------------------------------------------------------
> Alex Williamson (3):
>       vfio/pci: Do not unwind on error
>       vfio/pci: Add virtual capabilities quirk infrastructure
>       vfio/pci: Add NVIDIA GPUDirect Cliques support
>

Applied, thanks.

-- PMM
