qemu-devel.nongnu.org archive mirror
* [PATCH 00/14] virtio-net: add support for SR-IOV emulation
@ 2023-12-02  8:00 Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 01/14] vfio: Avoid inspecting option QDict for rombar Akihiko Odaki
                   ` (14 more replies)
  0 siblings, 15 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

Introduction
------------

This series is based on the RFC series submitted by Yui Washizu[1].
See also [2] for the context.

This series enables SR-IOV emulation for virtio-net. It is useful for
testing SR-IOV support in the guest, or for exposing several vDPA devices
in a VM. vDPA devices can also provide an L2 switching feature for
offloading, though allowing the guest to configure such a feature is out
of scope.

The new SR-IOV emulation code for virtio-net actually resides in
virtio-pci since it is specific to PCI. Although it is written in a way
agnostic to the virtio device type, it is restricted to virtio-net because
it has not been validated with other device types.

User Interface
--------------

A user can configure an SR-IOV-capable virtio-net device by adding
virtio-net-pci functions to a bus. Below is a command line example:
  -netdev user,id=n -netdev user,id=o -netdev user,id=p -netdev user,id=q
  -device virtio-net-pci,addr=0x0.0x3,netdev=q,sriov-pf=f
  -device virtio-net-pci,addr=0x0.0x2,netdev=p,sriov-pf=f
  -device virtio-net-pci,addr=0x0.0x1,netdev=o,sriov-pf=f
  -device virtio-net-pci,addr=0x0.0x0,netdev=n,id=f

The VFs specify the paired PF with the "sriov-pf" property. The PF must be
added after all VFs. It is the user's responsibility to ensure that VFs
have function numbers larger than that of the PF, and that the function
numbers have a consistent stride.
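
As an illustration only, the two constraints on function numbers could be
checked as in the sketch below. The helper vf_layout_is_valid() is
hypothetical and not part of this series, and it assumes the VF function
numbers are passed in ascending order:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of this series: returns true if every VF
 * function number is larger than the PF's and consecutive VFs are spaced
 * by the same stride. vf_fns must be sorted in ascending order.
 */
static bool vf_layout_is_valid(unsigned pf_fn, const unsigned *vf_fns,
                               size_t num_vfs)
{
    unsigned stride = 0;

    assert(vf_fns != NULL || num_vfs == 0);

    for (size_t i = 0; i < num_vfs; i++) {
        /* The first VF is compared against the PF, later ones against
         * the previous VF. */
        unsigned prev = i ? vf_fns[i - 1] : pf_fn;

        /* This also rejects duplicates and unsorted input. */
        if (vf_fns[i] <= prev) {
            return false;
        }

        if (i == 1) {
            /* The distance between the first two VFs defines the stride. */
            stride = vf_fns[i] - prev;
        } else if (i > 1 && vf_fns[i] - prev != stride) {
            return false;
        }
    }

    return true;
}
```

For the command line above (PF at function 0, VFs at functions 1, 2 and
3), the check passes. A layout such as 1, 2, 4 would be rejected because
the stride is not consistent.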

Implementation Challenge
------------------------

The major problem with SR-IOV emulation is that it allows the guest to
realize and unrealize VFs at runtime, which means we cannot realize VFs at
initialization time and keep them. In this series, virtio-pci realizes VFs
at initialization time, but instead of keeping them, it extracts the VF
configurations that are necessary to initialize the PF and the device
options that will be used to realize VFs later, and then unrealizes them.

Retrieving Device Options
-------------------------

Usually device options are applied with property setters, and applied
options are bound to a particular device instance. This is problematic for
SR-IOV emulation because it recreates device instances at runtime. The
earlier RFC series[1] had no configurability because of this.

Looking at the code, I found there are currently two methods to retrieve
device options at initialization time, but both of them have downsides.

Existing Approach: DeviceState::opts
------------------------------------

One of them is to read DeviceState::opts, which holds all options except
"id", "bus", and "driver". However, this member of DeviceState is only used
by vfio to learn whether the "rombar" option of pci-device is set, and in
my opinion vfio should not do that. DeviceState::opts is untyped, and it is
the responsibility of pci-device to type the "rombar" property, but vfio
reads the untyped value in an intrusive way. Once this hacky usage is
eliminated, DeviceState::opts has no users left, and keeping it only for
the SR-IOV emulation of virtio-net is too much. As such, I determined that
DeviceState::opts should be removed.

Existing Approach: DeviceListener::hide_device()
------------------------------------------------

The other method is to use the DeviceListener::hide_device() callback. The
callback receives device options and decides *not* to realize the device
when a device is being added. virtio-net uses it to _hide_ the primary
device.

A downside of this approach is that it needs explicit registration. The
virtio-net failover implementation only registers a DeviceListener after
a virtio-net device is added, so it simply *ignores* the primary device
if it is added before the virtio-net device. It would be better to at
least generate an error message in such a situation.

Another problem of DeviceListener::hide_device() is that it is called for
all devices, although each user cares about only one device type. For
virtio-net failover, the primary device should be a pci-device. For the
SR-IOV emulation, the VF should be a virtio-pci.

Proposal: DeviceClass::hide()
-----------------------------

In this series, I propose DeviceClass::hide() as an alternative to
DeviceListener::hide_device(). A device that can be hidden implements this
function to decide whether it should be hidden. It requires no
registration, and is encapsulated in the specific devices that need it.
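
The difference can be modelled outside QEMU with a toy example. Everything
in the sketch below (the Toy* types, the class table, toy_device_add()) is
a hypothetical stand-in rather than QEMU code. It only shows the shape of
the idea: the hide decision is a per-class callback looked up from the
driver name, so no listener has to be registered beforehand and classes
without the callback are never consulted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for QEMU's types; all names here are hypothetical. */
typedef struct ToyOpts {
    const char *driver;
    const char *failover_pair_id;   /* NULL when the option is absent */
} ToyOpts;

typedef struct ToyDeviceClass ToyDeviceClass;
typedef bool (*ToyHide)(const ToyDeviceClass *dc, const ToyOpts *opts);

struct ToyDeviceClass {
    const char *name;
    ToyHide hide;   /* NULL: devices of this class are never hidden */
};

/* Only this class knows about failover_pair_id. */
static bool toy_pci_hide(const ToyDeviceClass *dc, const ToyOpts *opts)
{
    (void)dc;
    return opts->failover_pair_id != NULL;
}

static const ToyDeviceClass toy_classes[] = {
    { "toy-pci-nic", toy_pci_hide },
    { "toy-serial",  NULL },
};

static const ToyDeviceClass *toy_find_class(const char *driver)
{
    for (size_t i = 0; i < sizeof(toy_classes) / sizeof(toy_classes[0]); i++) {
        if (strcmp(toy_classes[i].name, driver) == 0) {
            return &toy_classes[i];
        }
    }
    return NULL;
}

/* Returns true if the device would be created, false if it is hidden
 * or the driver is unknown. */
static bool toy_device_add(const ToyOpts *opts)
{
    const ToyDeviceClass *dc = toy_find_class(opts->driver);

    if (!dc) {
        return false;
    }
    /* The class decides for itself; no global listener list is walked. */
    if (dc->hide && dc->hide(dc, opts)) {
        return false;
    }
    return true;
}
```

In this model a "toy-pci-nic" with failover_pair_id set is hidden, while a
"toy-serial" carrying the same option is created as usual because its
class has no hide callback.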

Summary
-------

Patch 1 changes the definition of the "rombar" property of pci-device to
eliminate the DeviceState::opts access in vfio. It is also used later to
generate an error if rombar is requested for an SR-IOV VF.
Patch 2 removes DeviceState::opts.
Patch 3 adds DeviceClass::hide().
Patches 4 and 5 use DeviceClass::hide() to implement virtio-net failover.
Patch 6 removes DeviceListener::hide_device().
Patches 7 to 11 make trivial changes for SR-IOV emulation.
Patch 12 changes the common SR-IOV emulation code to accept device options.
Patch 13 adds the SR-IOV emulation code to virtio-pci.
Patch 14 enables the SR-IOV emulation code for virtio-net.

[1] https://patchew.org/QEMU/1689731808-3009-1-git-send-email-yui.washidu@gmail.com/
[2] https://lore.kernel.org/all/5d46f455-f530-4e5e-9ae7-13a2297d4bc5@daynix.com/

Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
Akihiko Odaki (14):
      vfio: Avoid inspecting option QDict for rombar
      hw/qdev: Remove opts member
      qdev: Add DeviceClass::hide()
      hw/pci: Add pci-failover
      virtio-net: Implement pci-failover
      qdev: Remove DeviceListener::hide_device()
      hw/pci: Add hide()
      qdev: Add qdev_device_new_from_qdict()
      hw/pci: Do not add ROM BAR for SR-IOV VF
      msix: Call pcie_sriov_vf_register_bar() for SR-IOV VF
      pcie_sriov: Release VFs failed to realize
      pcie_sriov: Allow to specify VF device options
      virtio-pci: add SR-IOV capability
      virtio-net: Add SR-IOV capability

 docs/pcie_sriov.txt            |   2 +-
 include/hw/pci/pci_device.h    |  21 +++++
 include/hw/pci/pcie_sriov.h    |  13 ++-
 include/hw/qdev-core.h         |  61 +++++-------
 include/hw/virtio/virtio-net.h |   3 +-
 include/hw/virtio/virtio-pci.h |   2 +
 include/monitor/qdev.h         |   2 +
 hw/core/qdev.c                 |  19 ----
 hw/net/igb.c                   |   2 +-
 hw/net/virtio-net.c            |  24 +----
 hw/nvme/ctrl.c                 |   2 +-
 hw/pci/msix.c                  |   8 +-
 hw/pci/pci.c                   |  61 +++++++++++-
 hw/pci/pcie_sriov.c            |  71 +++++++++++---
 hw/vfio/pci.c                  |   3 +-
 hw/virtio/virtio-net-pci.c     |  15 +++
 hw/virtio/virtio-pci.c         | 208 +++++++++++++++++++++++++++++++++++++++--
 system/qdev-monitor.c          |  49 +++++++---
 18 files changed, 442 insertions(+), 124 deletions(-)
---
base-commit: 4705fc0c8511d073bee4751c3c974aab2b10a970
change-id: 20231202-sriov-9402fb262be8

Best regards,
-- 
Akihiko Odaki <akihiko.odaki@daynix.com>




* [PATCH 01/14] vfio: Avoid inspecting option QDict for rombar
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 02/14] hw/qdev: Remove opts member Akihiko Odaki
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

vfio determines whether rombar is explicitly enabled by inspecting the
option QDict. Inspecting the QDict is not nice because it is untyped and
depends on the details of the external interface.

Instead of inspecting the QDict, inspect PCIDevice::rom_bar. The default
of PCIDevice::rom_bar is changed to -1 so that an explicitly enabled
rombar can be told apart from the default. This is consistent with other
properties like addr and romsize.
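
The resulting three states of the property can be sketched in isolation as
below. This is a standalone model, not the code in the patch:
ROM_BAR_DEFAULT and both function names are made up for illustration, and
the value is passed directly instead of being read from a PCIDevice:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The property is a uint32_t, so a default of -1 is stored as UINT32_MAX. */
#define ROM_BAR_DEFAULT UINT32_MAX

/* Any non-zero value, including the default, leaves the ROM BAR enabled. */
static bool rom_bar_enabled(uint32_t rom_bar)
{
    return rom_bar != 0;
}

/* Only a non-zero value other than the default means the user asked for it. */
static bool rom_bar_explicitly_enabled(uint32_t rom_bar)
{
    return rom_bar != 0 && rom_bar != ROM_BAR_DEFAULT;
}
```

With this encoding, 0 means disabled, the default means enabled but not
requested, and any other value (for example rombar=1) means explicitly
enabled.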

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/pci/pci_device.h | 5 +++++
 hw/pci/pci.c                | 2 +-
 hw/vfio/pci.c               | 3 +--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index d3dd0f64b2..5b6436992f 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -205,6 +205,11 @@ static inline uint16_t pci_get_bdf(PCIDevice *dev)
     return PCI_BUILD_BDF(pci_bus_num(pci_get_bus(dev)), dev->devfn);
 }
 
+static inline bool pci_rom_bar_explicitly_enabled(PCIDevice *dev)
+{
+    return d->rom_bar && d->rom_bar != -1;
+}
+
 uint16_t pci_requester_id(PCIDevice *dev);
 
 /* DMA access functions */
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index c49417abb2..53c59a5b9f 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -71,7 +71,7 @@ static Property pci_props[] = {
     DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
     DEFINE_PROP_STRING("romfile", PCIDevice, romfile),
     DEFINE_PROP_UINT32("romsize", PCIDevice, romsize, -1),
-    DEFINE_PROP_UINT32("rombar",  PCIDevice, rom_bar, 1),
+    DEFINE_PROP_UINT32("rombar",  PCIDevice, rom_bar, -1),
     DEFINE_PROP_BIT("multifunction", PCIDevice, cap_present,
                     QEMU_PCI_CAP_MULTIFUNCTION_BITNR, false),
     DEFINE_PROP_BIT("x-pcie-lnksta-dllla", PCIDevice, cap_present,
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c62c02f7b6..bc29ce9194 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1008,7 +1008,6 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
 {
     uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK);
     off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
-    DeviceState *dev = DEVICE(vdev);
     char *name;
     int fd = vdev->vbasedev.fd;
 
@@ -1042,7 +1041,7 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
     }
 
     if (vfio_opt_rom_in_denylist(vdev)) {
-        if (dev->opts && qdict_haskey(dev->opts, "rombar")) {
+        if (pci_rom_bar_explicitly_enabled(&vdev->pdev)) {
             warn_report("Device at %s is known to cause system instability"
                         " issues during option rom execution",
                         vdev->vbasedev.name);

-- 
2.43.0




* [PATCH 02/14] hw/qdev: Remove opts member
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 01/14] vfio: Avoid inspecting option QDict for rombar Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 03/14] qdev: Add DeviceClass::hide() Akihiko Odaki
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

It is no longer used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/pci/pci_device.h |  2 +-
 include/hw/qdev-core.h      |  4 ----
 hw/core/qdev.c              |  1 -
 system/qdev-monitor.c       | 12 +++++++-----
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index 5b6436992f..8e287c5414 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -205,7 +205,7 @@ static inline uint16_t pci_get_bdf(PCIDevice *dev)
     return PCI_BUILD_BDF(pci_bus_num(pci_get_bus(dev)), dev->devfn);
 }
 
-static inline bool pci_rom_bar_explicitly_enabled(PCIDevice *dev)
+static inline bool pci_rom_bar_explicitly_enabled(PCIDevice *d)
 {
     return d->rom_bar && d->rom_bar != -1;
 }
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 151d968238..6befbca311 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -237,10 +237,6 @@ struct DeviceState {
      * @pending_deleted_expires_ms: optional timeout for deletion events
      */
     int64_t pending_deleted_expires_ms;
-    /**
-     * @opts: QDict of options for the device
-     */
-    QDict *opts;
     /**
      * @hotplugged: was device added after PHASE_MACHINE_READY?
      */
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 43d863b0c5..c98691a90d 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -706,7 +706,6 @@ static void device_finalize(Object *obj)
         dev->canonical_path = NULL;
     }
 
-    qobject_unref(dev->opts);
     g_free(dev->id);
 }
 
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index a13db763e5..71c00f62ee 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -625,6 +625,7 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
     char *id;
     DeviceState *dev = NULL;
     BusState *bus = NULL;
+    QDict *properties;
 
     driver = qdict_get_try_str(opts, "driver");
     if (!driver) {
@@ -705,13 +706,14 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
     }
 
     /* set properties */
-    dev->opts = qdict_clone_shallow(opts);
-    qdict_del(dev->opts, "driver");
-    qdict_del(dev->opts, "bus");
-    qdict_del(dev->opts, "id");
+    properties = qdict_clone_shallow(opts);
+    qdict_del(properties, "driver");
+    qdict_del(properties, "bus");
+    qdict_del(properties, "id");
 
-    object_set_properties_from_keyval(&dev->parent_obj, dev->opts, from_json,
+    object_set_properties_from_keyval(&dev->parent_obj, properties, from_json,
                                       errp);
+    qobject_unref(properties);
     if (*errp) {
         goto err_del_dev;
     }

-- 
2.43.0




* [PATCH 03/14] qdev: Add DeviceClass::hide()
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 01/14] vfio: Avoid inspecting option QDict for rombar Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 02/14] hw/qdev: Remove opts member Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 04/14] hw/pci: Add pci-failover Akihiko Odaki
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

DeviceClass::hide() is a better alternative to
DeviceListener::hide_device() that does not need listener registration
and is contained in specific devices that need the hiding capability.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/qdev-core.h | 33 +++++++++++++++++++++++----------
 system/qdev-monitor.c  | 11 +++++++++++
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 6befbca311..de221b6f02 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -56,16 +56,15 @@
  * Hiding a device
  * ---------------
  *
- * To hide a device, a DeviceListener function hide_device() needs to
- * be registered. It can be used to defer adding a device and
- * therefore hide it from the guest. The handler registering to this
- * DeviceListener can save the QOpts passed to it for re-using it
- * later. It must return if it wants the device to be hidden or
- * visible. When the handler function decides the device shall be
- * visible it will be added with qdev_device_add() and realized as any
- * other device. Otherwise qdev_device_add() will return early without
- * adding the device. The guest will not see a "hidden" device until
- * it was marked visible and qdev_device_add called again.
+ * To hide a device, a DeviceClass function hide() needs to be registered. It
+ * can be used to defer adding a device and therefore hide it from the guest.
+ * The handler can save the QOpts passed to it for re-using it later. It must
+ * return if it wants the device to be hidden or visible. When the handler
+ * function decides the device shall be visible it will be added with
+ * qdev_device_add() and realized as any other device. Otherwise
+ * qdev_device_add() will return early without adding the device. The guest
+ * will not see a "hidden" device until it was marked visible and
+ * qdev_device_add called again.
  *
  */
 
@@ -90,6 +89,8 @@ typedef enum DeviceCategory {
     DEVICE_CATEGORY_MAX
 } DeviceCategory;
 
+typedef bool (*DeviceHide)(DeviceClass *dc, const QDict *device_opts,
+                           bool from_json, Error **errp);
 typedef void (*DeviceRealize)(DeviceState *dev, Error **errp);
 typedef void (*DeviceUnrealize)(DeviceState *dev);
 typedef void (*DeviceReset)(DeviceState *dev);
@@ -151,6 +152,18 @@ struct DeviceClass {
     bool hotpluggable;
 
     /* callbacks */
+    /**
+     * @hide: informs qdev if a device should be visible or hidden.
+     *
+     * This callback is called upon init of the DeviceState.
+     * We can hide a failover device depending for example on the device
+     * opts.
+     *
+     * On errors, it returns false and errp is set. Device creation
+     * should fail in this case.
+     */
+    DeviceHide hide;
+
     /**
      * @reset: deprecated device reset method pointer
      *
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 71c00f62ee..639beabc5f 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -669,6 +669,17 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
         return NULL;
     }
 
+    if (dc->hide) {
+        if (dc->hide(dc, opts, from_json, errp)) {
+            if (bus && !qbus_is_hotpluggable(bus)) {
+                error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
+            }
+            return NULL;
+        } else if (*errp) {
+            return NULL;
+        }
+    }
+
     if (phase_check(PHASE_MACHINE_READY) && bus && !qbus_is_hotpluggable(bus)) {
         error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
         return NULL;

-- 
2.43.0




* [PATCH 04/14] hw/pci: Add pci-failover
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (2 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 03/14] qdev: Add DeviceClass::hide() Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 05/14] virtio-net: Implement pci-failover Akihiko Odaki
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

pci-failover allows creating a device capable of failover without
relying on DeviceListener::hide_device(), which intrudes into the
pci-device implementation from outside.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/pci/pci_device.h | 14 ++++++++++++++
 hw/pci/pci.c                | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index 8e287c5414..a7bfb192e8 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -9,6 +9,11 @@ typedef struct PCIDeviceClass PCIDeviceClass;
 DECLARE_OBJ_CHECKERS(PCIDevice, PCIDeviceClass,
                      PCI_DEVICE, TYPE_PCI_DEVICE)
 
+#define TYPE_PCI_FAILOVER "pci-failover"
+typedef struct PCIFailoverClass PCIFailoverClass;
+DECLARE_CLASS_CHECKERS(PCIFailoverClass, PCI_FAILOVER, TYPE_PCI_FAILOVER)
+#define PCI_FAILOVER(obj) INTERFACE_CHECK(PciFailover, (obj), TYPE_PCI_FAILOVER)
+
 /*
  * Implemented by devices that can be plugged on CXL buses. In the spec, this is
  * actually a "CXL Component, but we name it device to match the PCI naming.
@@ -162,6 +167,15 @@ struct PCIDevice {
     uint32_t acpi_index;
 };
 
+struct PCIFailoverClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /* public */
+    bool (* set_primary)(DeviceState *dev, const QDict *device_opts,
+                         bool from_json, Error **errp);
+};
+
 static inline int pci_intx(PCIDevice *pci_dev)
 {
     return pci_get_byte(pci_dev->config + PCI_INTERRUPT_PIN) - 1;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 53c59a5b9f..3d07246f8e 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -46,6 +46,7 @@
 #include "hw/pci/msix.h"
 #include "hw/hotplug.h"
 #include "hw/boards.h"
+#include "qapi/qmp/qdict.h"
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "pci-internal.h"
@@ -2050,6 +2051,40 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn)
     return bus->devices[devfn];
 }
 
+static bool pci_qdev_hide(DeviceClass *dc, const QDict *device_opts,
+                          bool from_json, Error **errp)
+{
+    const char *standby_id;
+    DeviceState *dev;
+    ObjectClass *class;
+    ObjectClass *interface;
+
+    if (!device_opts) {
+        return false;
+    }
+
+    if (!qdict_haskey(device_opts, "failover_pair_id")) {
+        return false;
+    }
+
+    standby_id = qdict_get_str(device_opts, "failover_pair_id");
+    dev = qdev_find_recursive(sysbus_get_default(), standby_id);
+    if (!dev) {
+        error_setg(errp, "failover pair not found");
+        return false;
+    }
+
+    class = object_get_class(OBJECT(dev));
+    interface = object_class_dynamic_cast(class, TYPE_PCI_FAILOVER);
+    if (!interface) {
+        error_setg(errp, "failover pair does not support failover");
+        return false;
+    }
+
+    return ((PCIFailoverClass *)interface)->set_primary(dev, device_opts,
+                                                        from_json, errp);
+}
+
 #define ONBOARD_INDEX_MAX (16 * 1024 - 1)
 
 static void pci_qdev_realize(DeviceState *qdev, Error **errp)
@@ -2653,6 +2688,7 @@ static void pci_device_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *k = DEVICE_CLASS(klass);
 
+    k->hide = pci_qdev_hide;
     k->realize = pci_qdev_realize;
     k->unrealize = pci_qdev_unrealize;
     k->bus_type = TYPE_PCI_BUS;
@@ -2861,6 +2897,12 @@ static const TypeInfo pci_device_type_info = {
     .class_base_init = pci_device_class_base_init,
 };
 
+static const TypeInfo pci_failover_type_info = {
+    .name = TYPE_PCI_FAILOVER,
+    .parent = TYPE_INTERFACE,
+    .class_size = sizeof(PCIFailoverClass),
+};
+
 static void pci_register_types(void)
 {
     type_register_static(&pci_bus_info);
@@ -2870,6 +2912,7 @@ static void pci_register_types(void)
     type_register_static(&cxl_interface_info);
     type_register_static(&pcie_interface_info);
     type_register_static(&pci_device_type_info);
+    type_register_static(&pci_failover_type_info);
 }
 
 type_init(pci_register_types)

-- 
2.43.0




* [PATCH 05/14] virtio-net: Implement pci-failover
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (3 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 04/14] hw/pci: Add pci-failover Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 06/14] qdev: Remove DeviceListener::hide_device() Akihiko Odaki
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

This change removes the parsing of pci-device's failover_pair_id
property from virtio-net, and lets pci-device report an error if
an unknown ID is specified for the property.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/virtio/virtio-net.h |  3 ++-
 hw/net/virtio-net.c            | 24 ++++--------------------
 hw/virtio/virtio-net-pci.c     | 14 ++++++++++++++
 3 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 55977f01f0..753cab4b32 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -218,7 +218,6 @@ struct VirtIONet {
     /* primary failover device is hidden*/
     bool failover_primary_hidden;
     bool failover;
-    DeviceListener primary_listener;
     QDict *primary_opts;
     bool primary_opts_from_json;
     Notifier migration_state;
@@ -233,6 +232,8 @@ size_t virtio_net_handle_ctrl_iov(VirtIODevice *vdev,
                                   unsigned out_num);
 void virtio_net_set_netclient_name(VirtIONet *n, const char *name,
                                    const char *type);
+bool virtio_net_set_primary(VirtIONet *n, const QDict *device_opts,
+                            bool from_json, Error **errp);
 uint64_t virtio_net_supported_guest_offloads(const VirtIONet *n);
 
 #endif
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 80c56f0cfc..7def9a1200 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3536,19 +3536,11 @@ static void virtio_net_migration_state_notifier(Notifier *notifier, void *data)
     virtio_net_handle_migration_primary(n, s);
 }
 
-static bool failover_hide_primary_device(DeviceListener *listener,
-                                         const QDict *device_opts,
-                                         bool from_json,
-                                         Error **errp)
+bool virtio_net_set_primary(VirtIONet *n, const QDict *device_opts,
+                            bool from_json, Error **errp)
 {
-    VirtIONet *n = container_of(listener, VirtIONet, primary_listener);
-    const char *standby_id;
-
-    if (!device_opts) {
-        return false;
-    }
-
-    if (!qdict_haskey(device_opts, "failover_pair_id")) {
+    if (!n->failover) {
+        error_setg(errp, "failover pair does not support failover");
         return false;
     }
 
@@ -3557,11 +3549,6 @@ static bool failover_hide_primary_device(DeviceListener *listener,
         return false;
     }
 
-    standby_id = qdict_get_str(device_opts, "failover_pair_id");
-    if (g_strcmp0(standby_id, n->netclient_name) != 0) {
-        return false;
-    }
-
     /*
      * The hide helper can be called several times for a given device.
      * Check there is only one primary for a virtio-net device but
@@ -3621,9 +3608,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
     }
 
     if (n->failover) {
-        n->primary_listener.hide_device = failover_hide_primary_device;
         qatomic_set(&n->failover_primary_hidden, true);
-        device_listener_register(&n->primary_listener);
         migration_add_notifier(&n->migration_state,
                                virtio_net_migration_state_notifier);
         n->host_features |= (1ULL << VIRTIO_NET_F_STANDBY);
@@ -3789,7 +3774,6 @@ static void virtio_net_device_unrealize(DeviceState *dev)
 
     if (n->failover) {
         qobject_unref(n->primary_opts);
-        device_listener_unregister(&n->primary_listener);
         migration_remove_notifier(&n->migration_state);
     } else {
         assert(n->primary_opts == NULL);
diff --git a/hw/virtio/virtio-net-pci.c b/hw/virtio/virtio-net-pci.c
index e03543a70a..e421cd9cea 100644
--- a/hw/virtio/virtio-net-pci.c
+++ b/hw/virtio/virtio-net-pci.c
@@ -64,10 +64,19 @@ static void virtio_net_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
     qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
 }
 
+static bool virtio_net_pci_set_primary(DeviceState *dev,
+                                       const QDict *device_opts,
+                                       bool from_json, Error **errp)
+{
+    return virtio_net_set_primary(&VIRTIO_NET_PCI(dev)->vdev, device_opts,
+                                  from_json, errp);
+}
+
 static void virtio_net_pci_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+    PCIFailoverClass *pfc = PCI_FAILOVER_CLASS(klass);
     VirtioPCIClass *vpciklass = VIRTIO_PCI_CLASS(klass);
 
     k->romfile = "efi-virtio.rom";
@@ -75,6 +84,7 @@ static void virtio_net_pci_class_init(ObjectClass *klass, void *data)
     k->device_id = PCI_DEVICE_ID_VIRTIO_NET;
     k->revision = VIRTIO_PCI_ABI_VERSION;
     k->class_id = PCI_CLASS_NETWORK_ETHERNET;
+    pfc->set_primary = virtio_net_pci_set_primary;
     set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
     device_class_set_props(dc, virtio_net_properties);
     vpciklass->realize = virtio_net_pci_realize;
@@ -98,6 +108,10 @@ static const VirtioPCIDeviceTypeInfo virtio_net_pci_info = {
     .instance_size = sizeof(VirtIONetPCI),
     .instance_init = virtio_net_pci_instance_init,
     .class_init    = virtio_net_pci_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_PCI_FAILOVER },
+        { }
+    },
 };
 
 static void virtio_net_pci_register(void)

-- 
2.43.0




* [PATCH 06/14] qdev: Remove DeviceListener::hide_device()
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (4 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 05/14] virtio-net: Implement pci-failover Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 07/14] hw/pci: Add hide() Akihiko Odaki
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

It is no longer used.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/qdev-core.h | 24 ------------------------
 hw/core/qdev.c         | 18 ------------------
 system/qdev-monitor.c  |  9 ---------
 3 files changed, 51 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index de221b6f02..9acf6f79c4 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -306,17 +306,6 @@ struct DeviceState {
 struct DeviceListener {
     void (*realize)(DeviceListener *listener, DeviceState *dev);
     void (*unrealize)(DeviceListener *listener, DeviceState *dev);
-    /*
-     * This callback is called upon init of the DeviceState and
-     * informs qdev if a device should be visible or hidden.  We can
-     * hide a failover device depending for example on the device
-     * opts.
-     *
-     * On errors, it returns false and errp is set. Device creation
-     * should fail in this case.
-     */
-    bool (*hide_device)(DeviceListener *listener, const QDict *device_opts,
-                        bool from_json, Error **errp);
     QTAILQ_ENTRY(DeviceListener) link;
 };
 
@@ -1054,19 +1043,6 @@ static inline void qbus_mark_full(BusState *bus)
 void device_listener_register(DeviceListener *listener);
 void device_listener_unregister(DeviceListener *listener);
 
-/**
- * qdev_should_hide_device() - check if device should be hidden
- *
- * @opts: options QDict
- * @from_json: true if @opts entries are typed, false for all strings
- * @errp: pointer to error object
- *
- * When a device is added via qdev_device_add() this will be called.
- *
- * Return: if the device should be added now or not.
- */
-bool qdev_should_hide_device(const QDict *opts, bool from_json, Error **errp);
-
 typedef enum MachineInitPhase {
     /* current_machine is NULL.  */
     PHASE_NO_MACHINE,
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index c98691a90d..e61a147016 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -224,24 +224,6 @@ void device_listener_unregister(DeviceListener *listener)
     QTAILQ_REMOVE(&device_listeners, listener, link);
 }
 
-bool qdev_should_hide_device(const QDict *opts, bool from_json, Error **errp)
-{
-    ERRP_GUARD();
-    DeviceListener *listener;
-
-    QTAILQ_FOREACH(listener, &device_listeners, link) {
-        if (listener->hide_device) {
-            if (listener->hide_device(listener, opts, from_json, errp)) {
-                return true;
-            } else if (*errp) {
-                return false;
-            }
-        }
-    }
-
-    return false;
-}
-
 void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
                                  int required_for_version)
 {
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 639beabc5f..42aac94b8c 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -660,15 +660,6 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
         }
     }
 
-    if (qdev_should_hide_device(opts, from_json, errp)) {
-        if (bus && !qbus_is_hotpluggable(bus)) {
-            error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
-        }
-        return NULL;
-    } else if (*errp) {
-        return NULL;
-    }
-
     if (dc->hide) {
         if (dc->hide(dc, opts, from_json, errp)) {
             if (bus && !qbus_is_hotpluggable(bus)) {

-- 
2.43.0




* [PATCH 07/14] hw/pci: Add hide()
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (5 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 06/14] qdev: Remove DeviceListener::hide_device() Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 08/14] qdev: Add qdev_device_new_from_qdict() Akihiko Odaki
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

A PCI device class can implement hide() to prevent a device from being
created and to inspect the device options it was given.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/pci/pci_device.h | 2 ++
 hw/pci/pci.c                | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index a7bfb192e8..deae29f070 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -29,6 +29,8 @@ DECLARE_CLASS_CHECKERS(PCIFailoverClass, PCI_FAILOVER, TYPE_PCI_FAILOVER)
 struct PCIDeviceClass {
     DeviceClass parent_class;
 
+    bool (*hide)(PCIDeviceClass *pc, const QDict *device_opts, bool from_json,
+                 Error **errp);
     void (*realize)(PCIDevice *dev, Error **errp);
     PCIUnregisterFunc *exit;
     PCIConfigReadFunc *config_read;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 3d07246f8e..67d8ae3f61 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2054,11 +2054,19 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn)
 static bool pci_qdev_hide(DeviceClass *dc, const QDict *device_opts,
                           bool from_json, Error **errp)
 {
+    PCIDeviceClass *pc = PCI_DEVICE_CLASS(dc);
     const char *standby_id;
     DeviceState *dev;
     ObjectClass *class;
     ObjectClass *interface;
 
+    if (pc->hide) {
+        bool hide = pc->hide(pc, device_opts, from_json, errp);
+        if (hide || *errp) {
+            return hide;
+        }
+    }
+
     if (!device_opts) {
         return false;
     }

-- 
2.43.0
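
The control flow added to pci_qdev_hide() can be sketched outside QEMU
as follows. This is only an illustration: SketchError, SketchClass and
every sketch_* function are hypothetical stand-ins, not QEMU APIs. The
subclass hook is consulted first, and its answer is final when it either
hides the device or reports an error; otherwise the generic checks run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object: NULL means "no error". */
typedef const char *SketchError;

/* Hypothetical class with an optional hook, modelled on PCIDeviceClass::hide. */
typedef struct SketchClass {
    bool (*hide)(const struct SketchClass *sc, SketchError *errp);
} SketchClass;

/* Stand-in for the generic checks that follow the hook; it never hides. */
static bool sketch_generic_hide(SketchError *errp)
{
    (void)errp;
    return false;
}

/*
 * Ask the subclass hook first. Its result is returned as-is if it hid the
 * device or set an error; otherwise fall through to the generic checks.
 */
static bool sketch_hide(const SketchClass *sc, SketchError *errp)
{
    assert(errp != NULL);

    if (sc->hide) {
        bool hide = sc->hide(sc, errp);
        if (hide || *errp) {
            return hide;
        }
    }

    return sketch_generic_hide(errp);
}

/* Example hook that hides the device without error. */
static bool sketch_hook_hides(const SketchClass *sc, SketchError *errp)
{
    (void)sc;
    (void)errp;
    return true;
}

/* Example hook that fails: it reports an error and does not hide. */
static bool sketch_hook_fails(const SketchClass *sc, SketchError *errp)
{
    (void)sc;
    *errp = "bad option";
    return false;
}
```

Returning false with an error set lets the caller tell "do not hide" apart
from "creation must fail", which matches the contract the removed
DeviceListener comment described.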




* [PATCH 08/14] qdev: Add qdev_device_new_from_qdict()
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (6 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 07/14] hw/pci: Add hide() Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 09/14] hw/pci: Do not add ROM BAR for SR-IOV VF Akihiko Odaki
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

qdev_device_new_from_qdict() creates a device from a QDict without
realizing it.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/monitor/qdev.h |  2 ++
 system/qdev-monitor.c  | 23 ++++++++++++++++++++---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 1d57bf6577..013108a7dd 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -11,6 +11,8 @@ int qdev_device_help(QemuOpts *opts);
 DeviceState *qdev_device_add(QemuOpts *opts, Error **errp);
 DeviceState *qdev_device_add_from_qdict(const QDict *opts,
                                         bool from_json, Error **errp);
+DeviceState *qdev_device_new_from_qdict(const QDict *opts, bool from_json,
+                                        BusState **busp, Error **errp);
 
 /**
  * qdev_set_id: parent the device and set its id if provided.
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 42aac94b8c..028b97f2b5 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -618,6 +618,25 @@ const char *qdev_set_id(DeviceState *dev, char *id, Error **errp)
 
 DeviceState *qdev_device_add_from_qdict(const QDict *opts,
                                         bool from_json, Error **errp)
+{
+    DeviceState *dev;
+    BusState *bus;
+
+    dev = qdev_device_new_from_qdict(opts, from_json, &bus, errp);
+    if (!dev) {
+        return NULL;
+    }
+
+    if (!qdev_realize(dev, bus, errp)) {
+        object_unparent(OBJECT(dev));
+        object_unref(OBJECT(dev));
+        return NULL;
+    }
+    return dev;
+}
+
+DeviceState *qdev_device_new_from_qdict(const QDict *opts, bool from_json,
+                                        BusState **busp, Error **errp)
 {
     ERRP_GUARD();
     DeviceClass *dc;
@@ -720,9 +739,7 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
         goto err_del_dev;
     }
 
-    if (!qdev_realize(dev, bus, errp)) {
-        goto err_del_dev;
-    }
+    *busp = bus;
     return dev;
 
 err_del_dev:

-- 
2.43.0
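
The split between creating and realizing a device can be illustrated
with a small self-contained sketch. SketchDevice and the sketch_*
functions are hypothetical and only model the shape of the API: one
function builds an unrealized object, and a convenience wrapper realizes
it and hands back NULL if realization fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device; fail_realize forces realization to fail on demand. */
typedef struct SketchDevice {
    bool realized;
    bool fail_realize;
} SketchDevice;

/* Phase 1: allocate and configure the device, but do not realize it. */
static SketchDevice *sketch_device_new(bool fail_realize)
{
    SketchDevice *dev = calloc(1, sizeof(*dev));

    if (dev) {
        dev->fail_realize = fail_realize;
    }
    return dev;
}

/* Phase 2: realize the device; returns false on failure. */
static bool sketch_device_realize(SketchDevice *dev)
{
    assert(dev != NULL);

    if (dev->fail_realize) {
        return false;
    }
    dev->realized = true;
    return true;
}

/* Wrapper: create, then realize. On failure, free the device and return NULL. */
static SketchDevice *sketch_device_add(bool fail_realize)
{
    SketchDevice *dev = sketch_device_new(fail_realize);

    if (!dev) {
        return NULL;
    }
    if (!sketch_device_realize(dev)) {
        free(dev);
        return NULL;
    }
    return dev;
}
```

A caller that needs to adjust the object before realization, as the
SR-IOV code later in this series does for VFs, would call the phase 1
function directly and realize the device itself.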




* [PATCH 09/14] hw/pci: Do not add ROM BAR for SR-IOV VF
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (7 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 08/14] qdev: Add qdev_device_new_from_qdict() Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 10/14] msix: Call pcie_sriov_vf_register_bar() " Akihiko Odaki
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

An SR-IOV VF cannot have a ROM BAR.

Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/pci/pci.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 67d8ae3f61..54d9e0f4cf 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2419,6 +2419,14 @@ static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom,
         return;
     }
 
+    if (pci_is_vf(pdev)) {
+        if (pdev->rom_bar && pdev->rom_bar != -1) {
+            error_setg(errp, "ROM BAR cannot be enabled for SR-IOV VF");
+        }
+
+        return;
+    }
+
     if (load_file || pdev->romsize == -1) {
         path = qemu_find_file(QEMU_FILE_TYPE_BIOS, pdev->romfile);
         if (path == NULL) {

-- 
2.43.0
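
The decision made by the new check can be written as a small pure
function. This is a sketch under one assumption that the diff itself does
not state: that a rom_bar value of -1 means the user did not set the
option explicitly, while 0 means it was disabled. SketchRomDecision and
sketch_rom_decision() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ROM_SKIP_OK,     /* VF: silently skip the ROM BAR */
    ROM_SKIP_ERROR,  /* VF: ROM BAR was explicitly requested, report an error */
    ROM_CONTINUE     /* not a VF: continue with the usual ROM handling */
} SketchRomDecision;

/*
 * Mirror of the condition in the patch. rom_bar == -1 is assumed to mean
 * "not set by the user"; any other non-zero value counts as a request.
 */
static SketchRomDecision sketch_rom_decision(bool is_vf, int32_t rom_bar)
{
    if (is_vf) {
        if (rom_bar && rom_bar != -1) {
            return ROM_SKIP_ERROR;
        }
        return ROM_SKIP_OK;
    }
    return ROM_CONTINUE;
}
```

In both VF cases the function returns before any ROM file is looked up,
so a VF never ends up with a ROM BAR.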




* [PATCH 10/14] msix: Call pcie_sriov_vf_register_bar() for SR-IOV VF
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (8 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 09/14] hw/pci: Do not add ROM BAR for SR-IOV VF Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 11/14] pcie_sriov: Release VFs failed to realize Akihiko Odaki
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

An SR-IOV VF needs to use pcie_sriov_vf_register_bar() instead of
pci_register_bar().

Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/pci/msix.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index ab8869d9d0..3b94ce389f 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -421,8 +421,12 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
         return ret;
     }
 
-    pci_register_bar(dev, bar_nr, PCI_BASE_ADDRESS_SPACE_MEMORY,
-                     &dev->msix_exclusive_bar);
+    if (pci_is_vf(dev)) {
+        pcie_sriov_vf_register_bar(dev, bar_nr, &dev->msix_exclusive_bar);
+    } else {
+        pci_register_bar(dev, bar_nr, PCI_BASE_ADDRESS_SPACE_MEMORY,
+                         &dev->msix_exclusive_bar);
+    }
 
     return 0;
 }

-- 
2.43.0
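
One way to picture why the VF path takes no BAR type argument is the
sketch below. It assumes, as the vf_bar_type field of struct PCIESriovPF
suggests, that the type of each VF BAR is recorded on the PF, so a VF
takes the type from there instead of from the caller. SketchPF,
SketchDev and sketch_register_bar() are hypothetical and do not model
real BAR mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGIONS 7

/* Hypothetical PF state: one recorded BAR type per VF region. */
typedef struct SketchPF {
    uint8_t vf_bar_type[SKETCH_NUM_REGIONS];
} SketchPF;

/* Hypothetical device: pf is NULL unless the device is a VF. */
typedef struct SketchDev {
    const SketchPF *pf;
    uint8_t bar_type[SKETCH_NUM_REGIONS];
} SketchDev;

/*
 * Register a BAR. A VF ignores the requested type and uses the one its PF
 * recorded for that region; any other device uses the requested type.
 */
static void sketch_register_bar(SketchDev *dev, int region, uint8_t type)
{
    assert(region >= 0 && region < SKETCH_NUM_REGIONS);

    if (dev->pf) {
        dev->bar_type[region] = dev->pf->vf_bar_type[region];
    } else {
        dev->bar_type[region] = type;
    }
}
```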




* [PATCH 11/14] pcie_sriov: Release VFs failed to realize
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (9 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 10/14] msix: Call pcie_sriov_vf_register_bar() " Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 12/14] pcie_sriov: Allow to specify VF device options Akihiko Odaki
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

Release VFs that failed to realize, just as unregister_vfs() does.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/pci/pcie_sriov.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index 5ef8950940..3ec786d341 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -153,6 +153,8 @@ static PCIDevice *register_vf(PCIDevice *pf, int devfn, const char *name,
     qdev_realize(&dev->qdev, &bus->qbus, &local_err);
     if (local_err) {
         error_report_err(local_err);
+        object_unparent(OBJECT(dev));
+        object_unref(dev);
         return NULL;
     }
 

-- 
2.43.0




* [PATCH 12/14] pcie_sriov: Allow to specify VF device options
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (10 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 11/14] pcie_sriov: Release VFs failed to realize Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 13/14] virtio-pci: add SR-IOV capability Akihiko Odaki
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

Specifying VF device options is useful for creating VFs from
conventional device emulation code that has user-configurable
options.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 docs/pcie_sriov.txt         |  2 +-
 include/hw/pci/pcie_sriov.h | 13 ++++++--
 hw/net/igb.c                |  2 +-
 hw/nvme/ctrl.c              |  2 +-
 hw/pci/pcie_sriov.c         | 73 ++++++++++++++++++++++++++++++++++++---------
 5 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
index a47aad0bfa..dc70b40ae2 100644
--- a/docs/pcie_sriov.txt
+++ b/docs/pcie_sriov.txt
@@ -52,7 +52,7 @@ setting up a BAR for a VF.
       ...
 
       /* Add and initialize the SR/IOV capability */
-      pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
+      pcie_sriov_pf_init(d, 0x200, "your_virtual_dev", NULL,
                        vf_devid, initial_vfs, total_vfs,
                        fun_offset, stride);
 
diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h
index 095fb0c9ed..aa3d81fa44 100644
--- a/include/hw/pci/pcie_sriov.h
+++ b/include/hw/pci/pcie_sriov.h
@@ -15,10 +15,16 @@
 
 #include "hw/pci/pci.h"
 
+typedef struct PCIESriovVFOpts {
+    QDict *device_opts;
+    bool from_json;
+} PCIESriovVFOpts;
+
 struct PCIESriovPF {
     uint16_t num_vfs;   /* Number of virtual functions created */
     uint8_t vf_bar_type[PCI_NUM_REGIONS];   /* Store type for each VF bar */
     const char *vfname; /* Reference to the device type used for the VFs */
+    PCIESriovVFOpts *vfopts; /* Pointer to an array of VF options */
     PCIDevice **vf;     /* Pointer to an array of num_vfs VF devices */
 };
 
@@ -28,9 +34,10 @@ struct PCIESriovVF {
 };
 
 void pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
-                        const char *vfname, uint16_t vf_dev_id,
-                        uint16_t init_vfs, uint16_t total_vfs,
-                        uint16_t vf_offset, uint16_t vf_stride);
+                        const char *vfname, PCIESriovVFOpts *vfopts,
+                        uint16_t vf_dev_id, uint16_t init_vfs,
+                        uint16_t total_vfs, uint16_t vf_offset,
+                        uint16_t vf_stride);
 void pcie_sriov_pf_exit(PCIDevice *dev);
 
 /* Set up a VF bar in the SR/IOV bar area */
diff --git a/hw/net/igb.c b/hw/net/igb.c
index 8089acfea4..8168d401cb 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -447,7 +447,7 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
 
     pcie_ari_init(pci_dev, 0x150);
 
-    pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
+    pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF, NULL,
         IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
         IGB_VF_OFFSET, IGB_VF_STRIDE);
 
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f026245d1e..91bbccb49f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8040,7 +8040,7 @@ static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset)
                                       le16_to_cpu(cap->vifrsm),
                                       NULL, NULL);
 
-    pcie_sriov_pf_init(pci_dev, offset, "nvme", vf_dev_id,
+    pcie_sriov_pf_init(pci_dev, offset, "nvme", NULL, vf_dev_id,
                        n->params.sriov_max_vfs, n->params.sriov_max_vfs,
                        NVME_VF_OFFSET, NVME_VF_STRIDE);
 
diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index 3ec786d341..4e73559dc1 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -15,8 +15,11 @@
 #include "hw/pci/pcie.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/qdev-properties.h"
+#include "monitor/qdev.h"
 #include "qemu/error-report.h"
 #include "qemu/range.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qobject.h"
 #include "qapi/error.h"
 #include "trace.h"
 
@@ -25,9 +28,10 @@ static PCIDevice *register_vf(PCIDevice *pf, int devfn,
 static void unregister_vfs(PCIDevice *dev);
 
 void pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
-                        const char *vfname, uint16_t vf_dev_id,
-                        uint16_t init_vfs, uint16_t total_vfs,
-                        uint16_t vf_offset, uint16_t vf_stride)
+                        const char *vfname, PCIESriovVFOpts *vfopts,
+                        uint16_t vf_dev_id, uint16_t init_vfs,
+                        uint16_t total_vfs, uint16_t vf_offset,
+                        uint16_t vf_stride)
 {
     uint8_t *cfg = dev->config + offset;
     uint8_t *wmask;
@@ -37,6 +41,7 @@ void pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
     dev->exp.sriov_cap = offset;
     dev->exp.sriov_pf.num_vfs = 0;
     dev->exp.sriov_pf.vfname = g_strdup(vfname);
+    dev->exp.sriov_pf.vfopts = vfopts;
     dev->exp.sriov_pf.vf = NULL;
 
     pci_set_word(cfg + PCI_SRIOV_VF_OFFSET, vf_offset);
@@ -76,6 +81,16 @@ void pcie_sriov_pf_exit(PCIDevice *dev)
     unregister_vfs(dev);
     g_free((char *)dev->exp.sriov_pf.vfname);
     dev->exp.sriov_pf.vfname = NULL;
+
+    if (dev->exp.sriov_pf.vfopts) {
+        uint8_t *cfg = dev->config + dev->exp.sriov_cap;
+
+        for (uint16_t i = 0; i < pci_get_word(cfg + PCI_SRIOV_TOTAL_VF); i++) {
+            qobject_unref(dev->exp.sriov_pf.vfopts[i].device_opts);
+        }
+
+        g_free(dev->exp.sriov_pf.vfopts);
+    }
 }
 
 void pcie_sriov_pf_init_vf_bar(PCIDevice *dev, int region_num,
@@ -144,25 +159,50 @@ void pcie_sriov_vf_register_bar(PCIDevice *dev, int region_num,
 static PCIDevice *register_vf(PCIDevice *pf, int devfn, const char *name,
                               uint16_t vf_num)
 {
-    PCIDevice *dev = pci_new(devfn, name);
-    dev->exp.sriov_vf.pf = pf;
-    dev->exp.sriov_vf.vf_number = vf_num;
-    PCIBus *bus = pci_get_bus(pf);
+    PCIDevice *pci_dev;
+    BusState *bus = qdev_get_parent_bus(DEVICE(pf));
     Error *local_err = NULL;
 
-    qdev_realize(&dev->qdev, &bus->qbus, &local_err);
+    if (pf->exp.sriov_pf.vfopts) {
+        BusState *local_bus;
+        PCIESriovVFOpts *vfopts = pf->exp.sriov_pf.vfopts + vf_num;
+        DeviceState *dev = qdev_device_new_from_qdict(vfopts->device_opts,
+                                                      vfopts->from_json,
+                                                      &local_bus, &local_err);
+        if (!dev) {
+            error_report_err(local_err);
+            return NULL;
+        }
+
+        pci_dev = PCI_DEVICE(dev);
+
+        if (bus != local_bus) {
+            error_report("unexpected SR-IOV VF parent bus");
+            goto fail;
+        }
+    } else {
+        pci_dev = pci_new(devfn, name);
+    }
+
+    pci_dev->exp.sriov_vf.pf = pf;
+    pci_dev->exp.sriov_vf.vf_number = vf_num;
+
+    qdev_realize(&pci_dev->qdev, bus, &local_err);
     if (local_err) {
         error_report_err(local_err);
-        object_unparent(OBJECT(dev));
-        object_unref(dev);
-        return NULL;
+        goto fail;
     }
 
     /* set vid/did according to sr/iov spec - they are not used */
-    pci_config_set_vendor_id(dev->config, 0xffff);
-    pci_config_set_device_id(dev->config, 0xffff);
+    pci_config_set_vendor_id(pci_dev->config, 0xffff);
+    pci_config_set_device_id(pci_dev->config, 0xffff);
 
-    return dev;
+    return pci_dev;
+
+fail:
+    object_unparent(OBJECT(pci_dev));
+    object_unref(pci_dev);
+    return NULL;
 }
 
 static void register_vfs(PCIDevice *dev)
@@ -170,6 +210,8 @@ static void register_vfs(PCIDevice *dev)
     uint16_t num_vfs;
     uint16_t i;
     uint16_t sriov_cap = dev->exp.sriov_cap;
+    uint16_t total_vfs =
+        pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF);
     uint16_t vf_offset =
         pci_get_word(dev->config + sriov_cap + PCI_SRIOV_VF_OFFSET);
     uint16_t vf_stride =
@@ -178,6 +220,9 @@ static void register_vfs(PCIDevice *dev)
 
     assert(sriov_cap > 0);
     num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
+    if (num_vfs > total_vfs) {
+        return;
+    }
 
     dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);
     assert(dev->exp.sriov_pf.vf);

-- 
2.43.0
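
The relationship between the PF, the VF offset, the VF stride and the new
NumVFs guard can be sketched as below. The formula follows the SR-IOV
rule that the n-th VF (counting from zero) sits at the PF's function
number plus the first VF offset plus n times the stride. The guard
mirrors the check added to register_vfs(); it presumably matters because
the per-VF options array is indexed by VF number and sized by TotalVFs.
sketch_vf_devfns() is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill out[] with the function numbers of num_vfs VFs.
 * Returns the number of entries written, or 0 if num_vfs exceeds
 * total_vfs, in which case out[] is left untouched.
 */
static size_t sketch_vf_devfns(uint16_t pf_devfn, uint16_t vf_offset,
                               uint16_t vf_stride, uint16_t num_vfs,
                               uint16_t total_vfs, uint16_t *out)
{
    assert(out != NULL);

    if (num_vfs > total_vfs) {
        return 0;
    }

    for (uint16_t i = 0; i < num_vfs; i++) {
        out[i] = (uint16_t)(pf_devfn + vf_offset + i * vf_stride);
    }

    return num_vfs;
}
```

With the command line from the cover letter (PF at function 0, VFs at
functions 1 to 3), the offset and stride are both 1.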




* [PATCH 13/14] virtio-pci: add SR-IOV capability
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (11 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 12/14] pcie_sriov: Allow to specify VF device options Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:00 ` [PATCH 14/14] virtio-net: Add " Akihiko Odaki
  2023-12-02  8:08 ` [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

This enables SR-IOV emulation on virtio-pci devices. It introduces the
'sriov-pf' property, which marks a device as a VF and names the PF it
is paired with. For now, a subclass must enable this feature
explicitly.

Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 include/hw/virtio/virtio-pci.h |   2 +
 hw/virtio/virtio-pci.c         | 208 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 201 insertions(+), 9 deletions(-)

diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index 5a3f182f99..0cd781ea98 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -105,6 +105,7 @@ struct VirtioPCIClass {
     PCIDeviceClass parent_class;
     DeviceRealize parent_dc_realize;
     void (*realize)(VirtIOPCIProxy *vpci_dev, Error **errp);
+    bool sriov_supported;
 };
 
 typedef struct VirtIOPCIRegion {
@@ -159,6 +160,7 @@ struct VirtIOPCIProxy {
     uint32_t gfselect;
     uint32_t guest_features[2];
     VirtIOPCIQueue vqs[VIRTIO_QUEUE_MAX];
+    GArray *sriov_vfs;
 
     VirtIOIRQFD *vector_irqfd;
     int nvqs_with_notifiers;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 205dbf24fb..3f1b3db9b7 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -26,6 +26,9 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/qdev-properties.h"
+#include "monitor/qdev.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qobject.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/log.h"
@@ -49,6 +52,18 @@
  * configuration space */
 #define VIRTIO_PCI_CONFIG_SIZE(dev)     VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))
 
+typedef struct VirtIOPCISriovVF {
+    ObjectClass *class;
+    PCIESriovVFOpts opts;
+    struct {
+        pcibus_t size;
+        uint8_t type;
+    } io_regions[PCI_NUM_REGIONS];
+    uint16_t devfn;
+} VirtIOPCISriovVF;
+
+static GHashTable *sriov_vfs;
+
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
                                VirtIOPCIProxy *dev);
 static void virtio_pci_reset(DeviceState *qdev);
@@ -1912,6 +1927,18 @@ static void virtio_pci_pre_plugged(DeviceState *d, Error **errp)
     VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
     VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
 
+    if (d->id) {
+        if (pci_is_vf(&proxy->pci_dev)) {
+            if (g_hash_table_contains(sriov_vfs, d->id)) {
+                error_setg(errp, "a function cannot be SR-IOV PF and VF at the same time");
+                return;
+            }
+        } else {
+            proxy->sriov_vfs = g_hash_table_lookup(sriov_vfs, d->id);
+            virtio_add_feature(&vdev->host_features, VIRTIO_F_SR_IOV);
+        }
+    }
+
     if (virtio_pci_modern(proxy)) {
         virtio_add_feature(&vdev->host_features, VIRTIO_F_VERSION_1);
     }
@@ -1919,10 +1946,26 @@ static void virtio_pci_pre_plugged(DeviceState *d, Error **errp)
     virtio_add_feature(&vdev->host_features, VIRTIO_F_BAD_FEATURE);
 }
 
+static gint virtio_pci_sriov_vfs_compare(gconstpointer a, gconstpointer b)
+{
+    return ((VirtIOPCISriovVF *)a)->devfn - ((VirtIOPCISriovVF *)b)->devfn;
+}
+
+static void virtio_pci_register_bar(VirtIOPCIProxy *proxy, int region_num,
+                                    uint8_t type, MemoryRegion *memory)
+{
+    if (pci_is_vf(&proxy->pci_dev)) {
+        pcie_sriov_vf_register_bar(&proxy->pci_dev, region_num, memory);
+    } else {
+        pci_register_bar(&proxy->pci_dev, region_num, type, memory);
+    }
+}
+
 /* This is called by virtio-bus just after the device is plugged. */
 static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
 {
     VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
+    VirtioPCIClass *k = VIRTIO_PCI_GET_CLASS(d);
     VirtioBusState *bus = &proxy->bus;
     bool legacy = virtio_pci_legacy(proxy);
     bool modern;
@@ -2026,18 +2069,18 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
             memory_region_init(&proxy->io_bar, OBJECT(proxy),
                                "virtio-pci-io", 0x4);
 
-            pci_register_bar(&proxy->pci_dev, proxy->modern_io_bar_idx,
-                             PCI_BASE_ADDRESS_SPACE_IO, &proxy->io_bar);
+            virtio_pci_register_bar(proxy, proxy->modern_io_bar_idx,
+                                    PCI_BASE_ADDRESS_SPACE_IO, &proxy->io_bar);
 
             virtio_pci_modern_io_region_map(proxy, &proxy->notify_pio,
                                             &notify_pio.cap);
         }
 
-        pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx,
-                         PCI_BASE_ADDRESS_SPACE_MEMORY |
-                         PCI_BASE_ADDRESS_MEM_PREFETCH |
-                         PCI_BASE_ADDRESS_MEM_TYPE_64,
-                         &proxy->modern_bar);
+        virtio_pci_register_bar(proxy, proxy->modern_mem_bar_idx,
+                                PCI_BASE_ADDRESS_SPACE_MEMORY |
+                                PCI_BASE_ADDRESS_MEM_PREFETCH |
+                                PCI_BASE_ADDRESS_MEM_TYPE_64,
+                                &proxy->modern_bar);
 
         proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap);
         cfg_mask = (void *)(proxy->pci_dev.wmask + proxy->config_cap);
@@ -2072,8 +2115,92 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
                               &virtio_pci_config_ops,
                               proxy, "virtio-pci", size);
 
-        pci_register_bar(&proxy->pci_dev, proxy->legacy_io_bar_idx,
-                         PCI_BASE_ADDRESS_SPACE_IO, &proxy->bar);
+        virtio_pci_register_bar(proxy, proxy->legacy_io_bar_idx,
+                                PCI_BASE_ADDRESS_SPACE_IO, &proxy->bar);
+    }
+
+    if (proxy->sriov_vfs) {
+        uint16_t first_devfn;
+        uint16_t stride;
+        PCIESriovVFOpts *opts;
+
+        if (!k->sriov_supported) {
+            error_setg(errp, "SR-IOV is not supported by this device type");
+            return;
+        }
+
+        if (!pci_is_express(&proxy->pci_dev)) {
+            error_setg(errp, "PCI Express is required for SR-IOV");
+            return;
+        }
+
+        g_array_sort(proxy->sriov_vfs, virtio_pci_sriov_vfs_compare);
+
+        first_devfn = g_array_index(proxy->sriov_vfs, VirtIOPCISriovVF, 0).devfn;
+        if (first_devfn <= proxy->pci_dev.devfn) {
+            error_setg(errp, "a VF function number is less than the PF function number");
+            return;
+        }
+
+        stride = proxy->sriov_vfs->len < 2 ?
+                 0 :
+                 (g_array_index(proxy->sriov_vfs, VirtIOPCISriovVF, 1).devfn -
+                  first_devfn);
+
+        for (uint16_t i = 0; i < proxy->sriov_vfs->len; i++) {
+            VirtIOPCISriovVF *vf = &g_array_index(proxy->sriov_vfs,
+                                                  VirtIOPCISriovVF,
+                                                  i);
+            if (vf->class != object_get_class(OBJECT(proxy))) {
+                error_setg(errp, "a VF and its paired PF have different types");
+                return;
+            }
+
+            for (size_t j = 0; j < PCI_NUM_REGIONS; j++) {
+                if (j == PCI_ROM_SLOT) {
+                    continue;
+                }
+
+                if (vf->io_regions[j].size != proxy->pci_dev.io_regions[j].size ||
+                    vf->io_regions[j].type != proxy->pci_dev.io_regions[j].type) {
+                    error_setg(errp, "inconsistent SR-IOV BARs");
+                    return;
+                }
+            }
+            if (vf->devfn - first_devfn != stride * i) {
+                error_setg(errp, "inconsistent SR-IOV stride");
+                return;
+            }
+        }
+
+        opts = g_new(PCIESriovVFOpts, proxy->sriov_vfs->len);
+
+        for (uint16_t i = 0; i < proxy->sriov_vfs->len; i++) {
+            opts[i] = g_array_index(proxy->sriov_vfs, VirtIOPCISriovVF, i).opts;
+            qobject_ref(opts[i].device_opts);
+        }
+
+        pcie_sriov_pf_init(&proxy->pci_dev, PCI_CONFIG_SPACE_SIZE,
+                           proxy->pci_dev.name, opts,
+                           PCI_DEVICE_ID_VIRTIO_10_BASE
+                           + virtio_bus_get_vdev_id(bus),
+                           proxy->sriov_vfs->len, proxy->sriov_vfs->len,
+                           first_devfn - proxy->pci_dev.devfn,
+                           stride);
+
+        for (int i = 0; i < PCI_NUM_REGIONS; i++) {
+            if (i == PCI_ROM_SLOT) {
+                continue;
+            }
+
+            VirtIOPCISriovVF *vf = &g_array_index(proxy->sriov_vfs,
+                                                  VirtIOPCISriovVF,
+                                                  0);
+            uint8_t type = vf->io_regions[i].type;
+            size = vf->io_regions[i].size;
+
+            pcie_sriov_pf_init_vf_bar(&proxy->pci_dev, i, type, size);
+        }
     }
 }
 
@@ -2093,9 +2220,69 @@ static void virtio_pci_device_unplugged(DeviceState *d)
         if (modern_pio) {
             virtio_pci_modern_io_region_unmap(proxy, &proxy->notify_pio);
         }
+        if (proxy->sriov_vfs) {
+            pcie_sriov_pf_exit(&proxy->pci_dev);
+        }
     }
 }
 
+static bool virtio_pci_hide(PCIDeviceClass *pc, const QDict *device_opts,
+                            bool from_json, Error **errp)
+{
+    ERRP_GUARD();
+    const char *pf;
+    GArray *array;
+    QDict *cloned_device_opts;
+    VirtIOPCISriovVF vf;
+    DeviceState *dev;
+    PCIDevice *pci_dev;
+
+    if (!device_opts) {
+        return false;
+    }
+
+    pf = qdict_get_try_str(device_opts, "sriov-pf");
+    if (!pf) {
+        return false;
+    }
+
+    cloned_device_opts = qdict_clone_shallow(device_opts);
+    qdict_del(cloned_device_opts, "sriov-pf");
+
+    dev = qdev_device_add_from_qdict(cloned_device_opts, from_json, errp);
+    if (!dev) {
+        qobject_unref(cloned_device_opts);
+        return false;
+    }
+
+    pci_dev = PCI_DEVICE(dev);
+    vf.class = object_get_class(OBJECT(dev));
+    vf.opts.device_opts = cloned_device_opts;
+    vf.opts.from_json = from_json;
+
+    for (size_t i = 0; i < PCI_NUM_REGIONS; i++) {
+        vf.io_regions[i].size = pci_dev->io_regions[i].size;
+        vf.io_regions[i].type = pci_dev->io_regions[i].type;
+    }
+
+    vf.devfn = pci_dev->devfn;
+
+    qdev_unplug(dev, errp);
+    if (*errp) {
+        qobject_unref(cloned_device_opts);
+        return false;
+    }
+
+    array = g_hash_table_lookup(sriov_vfs, pf);
+    if (!array) {
+        array = g_array_new(false, false, sizeof(VirtIOPCISriovVF));
+        g_hash_table_insert(sriov_vfs, g_strdup(pf), array);
+    }
+
+    g_array_append_val(array, vf);
+
+    return true;
+}
+
 static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
 {
     VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
@@ -2325,7 +2512,10 @@ static void virtio_pci_class_init(ObjectClass *klass, void *data)
     VirtioPCIClass *vpciklass = VIRTIO_PCI_CLASS(klass);
     ResettableClass *rc = RESETTABLE_CLASS(klass);
 
+    sriov_vfs = g_hash_table_new(g_str_hash, g_str_equal);
+
     device_class_set_props(dc, virtio_pci_properties);
+    k->hide = virtio_pci_hide;
     k->realize = virtio_pci_realize;
     k->exit = virtio_pci_exit;
     k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;

-- 
2.43.0
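
For readers following the stride validation in the hunk above (the
`vf->devfn - first_devfn != stride * i` test), here is a minimal standalone
sketch of the same relation. It is not code from this series: the helper
name `vf_stride_is_consistent`, the ascending input order, and the rule of
deriving the stride from the first two entries are all assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the series.
 *
 * devfns[] holds the function numbers of the VFs, assumed to be sorted in
 * ascending order. The stride is assumed to be the distance between the
 * first two entries (0 when there are fewer than two VFs). Returns true
 * and stores the stride when every entry satisfies
 *     devfns[i] - devfns[0] == stride * i
 * which is the relation the patch checks for each VF.
 */
static bool vf_stride_is_consistent(const uint8_t *devfns, size_t len,
                                    uint16_t *stride_out)
{
    assert(stride_out != NULL);

    /* A descending pair wraps to a large value and is rejected below. */
    uint16_t stride = len > 1 ? devfns[1] - devfns[0] : 0;

    for (size_t i = 0; i < len; i++) {
        if (devfns[i] - devfns[0] != stride * (int)i) {
            return false;
        }
    }

    *stride_out = stride;
    return true;
}
```

With the function numbers from the cover letter's example (0x1, 0x2, 0x3)
the sketch accepts the layout with a stride of 1, while a layout such as
0x1, 0x2, 0x4 is rejected.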




* [PATCH 14/14] virtio-net: Add SR-IOV capability
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (12 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 13/14] virtio-pci: add SR-IOV capability Akihiko Odaki
@ 2023-12-02  8:00 ` Akihiko Odaki
  2023-12-02  8:08 ` [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
  14 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu, Akihiko Odaki

This enables the SR-IOV capability previously added to virtio-pci for
virtio-net-pci.

Buglink: https://issues.redhat.com/browse/RHEL-1216
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/virtio/virtio-net-pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/virtio/virtio-net-pci.c b/hw/virtio/virtio-net-pci.c
index e421cd9cea..421d69e206 100644
--- a/hw/virtio/virtio-net-pci.c
+++ b/hw/virtio/virtio-net-pci.c
@@ -88,6 +88,7 @@ static void virtio_net_pci_class_init(ObjectClass *klass, void *data)
     set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
     device_class_set_props(dc, virtio_net_properties);
     vpciklass->realize = virtio_net_pci_realize;
+    vpciklass->sriov_supported = true;
 }
 
 static void virtio_net_pci_instance_init(Object *obj)

-- 
2.43.0




* Re: [PATCH 00/14] virtio-net: add support for SR-IOV emulation
  2023-12-02  8:00 [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
                   ` (13 preceding siblings ...)
  2023-12-02  8:00 ` [PATCH 14/14] virtio-net: Add " Akihiko Odaki
@ 2023-12-02  8:08 ` Akihiko Odaki
  2023-12-04  9:04   ` Yui Washizu
  14 siblings, 1 reply; 17+ messages in thread
From: Akihiko Odaki @ 2023-12-02  8:08 UTC (permalink / raw)
  To: Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
	Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Jason Wang, Sriram Yagnaraman, Keith Busch,
	Klaus Jensen
  Cc: qemu-devel, qemu-block, Yui Washizu

On 2023/12/02 17:00, Akihiko Odaki wrote:
> Introduction
> ------------
> 
> This series is based on the RFC series submitted by Yui Washizu[1].
> See also [2] for the context.
> 
> This series enables SR-IOV emulation for virtio-net. It is useful
> to test SR-IOV support on the guest, or to expose several vDPA devices in a
> VM. vDPA devices can also provide L2 switching feature for offloading
> though it is out of scope to allow the guest to configure such a feature.
> 
> The new code of SR-IOV emulation for virtio-net actually resides in
> virtio-pci since it's specific to PCI. Although it is written in a way
> agnostic to the virtio device type, it is restricted for virtio-net because
> of lack of validation.

I forgot to mark this series as RFC. It is the first version of the series
and I'm open to design changes.



* Re: [PATCH 00/14] virtio-net: add support for SR-IOV emulation
  2023-12-02  8:08 ` [PATCH 00/14] virtio-net: add support for SR-IOV emulation Akihiko Odaki
@ 2023-12-04  9:04   ` Yui Washizu
  0 siblings, 0 replies; 17+ messages in thread
From: Yui Washizu @ 2023-12-04  9:04 UTC (permalink / raw)
  To: Akihiko Odaki, Michael S. Tsirkin, Marcel Apfelbaum,
	Alex Williamson, Cédric Le Goater, Paolo Bonzini,
	Daniel P. Berrangé, Eduardo Habkost, Jason Wang,
	Sriram Yagnaraman, Keith Busch, Klaus Jensen
  Cc: qemu-devel, qemu-block


On 2023/12/02 17:08, Akihiko Odaki wrote:
> On 2023/12/02 17:00, Akihiko Odaki wrote:
>> Introduction
>> ------------
>>
>> This series is based on the RFC series submitted by Yui Washizu[1].
>> See also [2] for the context.
>>
>> This series enables SR-IOV emulation for virtio-net. It is useful
>> to test SR-IOV support on the guest, or to expose several vDPA devices in a
>> VM. vDPA devices can also provide L2 switching feature for offloading
>> though it is out of scope to allow the guest to configure such a feature.
>>
>> The new code of SR-IOV emulation for virtio-net actually resides in
>> virtio-pci since it's specific to PCI. Although it is written in a way
>> agnostic to the virtio device type, it is restricted for virtio-net because
>> of lack of validation.
>
> I forgot to prefix this as RFC. It is the first version of the series 
> and I'm open for design changes.


Thank you. I'll proceed with building and reviewing the patch content.



