* [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify
@ 2010-12-17 12:01 Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-12-17 12:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Michael S. Tsirkin

See below for the v6 changelog.

Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases, especially in multithreaded scenarios, and
virtio-net transmit throughput increases substantially.

Now that this code is in virtio-pci.c, it is possible to explicitly enable
devices for which virtio-ioeventfd should be used.  Only virtio-blk and
virtio-net are enabled at this time.
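
For reference, a condensed sketch of the mechanism (assembled from the
virtio-pci.c changes in patch 2/4; error handling omitted) - the virtqueue's
host notifier eventfd is bound to the queue's notify ioport so KVM signals
the eventfd instead of returning to the vcpu thread, and the iothread
services it:

    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);

    /* Tell KVM to signal the eventfd on writes of n to the notify ioport */
    event_notifier_init(notifier, 1);
    kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
                               proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
                               n, true);

    /* iothread side: when the eventfd fires, run the virtqueue handler */
    qemu_set_fd_handler(event_notifier_get_fd(notifier),
                        virtio_pci_host_notifier_read, NULL, vq);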

v6:
 * Default to ioeventfd=off for virtio-net

v5:
 * Fix spurious whitespace change in documentation
 * Test and clear event notifier when deassigning to catch race condition

v4:
 * Simpler start/stop ioeventfd mechanism using bool ioeventfd_started state
 * Support for migration
 * Handle deassign race condition to avoid dropping a virtqueue kick
 * Add missing kvm_enabled() check to kvm_has_many_ioeventfds()
 * Documentation updates for qdev -device with ioeventfd=on|off


* [Qemu-devel] [PATCH v6 1/4] virtio-pci: Rename bugs field to flags
  2010-12-17 12:01 [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
@ 2010-12-17 12:01 ` Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-12-17 12:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin

The VirtIOPCIProxy bugs field is currently used to enable workarounds
for older guests.  Rename it to flags so that other per-device behavior
can be tracked.

A later patch uses the flags field to remember whether ioeventfd should
be used for virtqueue host notification.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 6186142..13dd391 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -80,9 +80,8 @@
  * 12 is historical, and due to x86 page size. */
 #define VIRTIO_PCI_QUEUE_ADDR_SHIFT    12
 
-/* We can catch some guest bugs inside here so we continue supporting older
-   guests. */
-#define VIRTIO_PCI_BUG_BUS_MASTER	(1 << 0)
+/* Flags track per-device state like workarounds for quirks in older guests. */
+#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
 /* QEMU doesn't strictly need write barriers since everything runs in
  * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
@@ -95,7 +94,7 @@
 typedef struct {
     PCIDevice pci_dev;
     VirtIODevice *vdev;
-    uint32_t bugs;
+    uint32_t flags;
     uint32_t addr;
     uint32_t class_code;
     uint32_t nvectors;
@@ -159,7 +158,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
        in ready state. Then we have a buggy guest OS. */
     if ((proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
         !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
-        proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER;
+        proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
     }
     return 0;
 }
@@ -185,7 +184,7 @@ static void virtio_pci_reset(DeviceState *d)
     VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
     virtio_reset(proxy->vdev);
     msix_reset(&proxy->pci_dev);
-    proxy->bugs = 0;
+    proxy->flags = 0;
 }
 
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
@@ -235,7 +234,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
            some safety checks. */
         if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
             !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
-            proxy->bugs |= VIRTIO_PCI_BUG_BUS_MASTER;
+            proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
         }
         break;
     case VIRTIO_MSI_CONFIG_VECTOR:
@@ -403,7 +402,7 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
 
     if (PCI_COMMAND == address) {
         if (!(val & PCI_COMMAND_MASTER)) {
-            if (!(proxy->bugs & VIRTIO_PCI_BUG_BUS_MASTER)) {
+            if (!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
                 virtio_set_status(proxy->vdev,
                                   proxy->vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
             }
-- 
1.7.2.3


* [Qemu-devel] [PATCH v6 2/4] virtio-pci: Use ioeventfd for virtqueue notify
  2010-12-17 12:01 [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi
@ 2010-12-17 12:01 ` Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 3/4] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-12-17 12:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin

Virtqueue notify is currently handled synchronously in userspace virtio.  This
prevents the vcpu from executing guest code while hardware emulation code
handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar to
how vhost receives virtqueue notifies.

The result of this change is improved performance for userspace virtio devices.
Virtio-blk throughput increases, especially in multithreaded scenarios, and
virtio-net transmit throughput increases substantially.

Some virtio devices are known to have guest drivers which expect a notify to be
processed synchronously and spin waiting for completion.  Only enable ioeventfd
for virtio-blk and virtio-net for now.

Care must be taken not to interfere with vhost-net, which uses host
notifiers.  If the set_host_notifier() API is used by a device,
virtio-pci will disable virtio-ioeventfd and let the device deal with
host notifiers as it wishes.

After migration and on VM state changes (running/paused), virtio-ioeventfd
will enable/disable itself (see the sketch after the list):

 * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd
 * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd
 * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd
 * vm_change_state(running=0) -> disable virtio-ioeventfd
 * vm_change_state(running=1) -> enable virtio-ioeventfd
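
The status register write is the guest-visible trigger for the first two
transitions; a condensed sketch of that path, taken from the
virtio_ioport_write() hunk below:

    case VIRTIO_PCI_STATUS:
        if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
            virtio_pci_start_ioeventfd(proxy);  /* driver ready: kicks go to the iothread */
        } else {
            virtio_pci_stop_ioeventfd(proxy);   /* not ready: back to synchronous notify */
        }

        virtio_set_status(vdev, val & 0xFF);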

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |  190 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 hw/virtio.c     |   14 +++-
 hw/virtio.h     |    1 +
 3 files changed, 179 insertions(+), 26 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 13dd391..6b12248 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -83,6 +83,11 @@
 /* Flags track per-device state like workarounds for quirks in older guests. */
 #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
+/* Performance improves when virtqueue kick processing is decoupled from the
+ * vcpu thread using ioeventfd for some devices. */
+#define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
+#define VIRTIO_PCI_FLAG_USE_IOEVENTFD   (1 << VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT)
+
 /* QEMU doesn't strictly need write barriers since everything runs in
  * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
  * KVM or if kqemu gets SMP support.
@@ -107,6 +112,8 @@ typedef struct {
     /* Max. number of ports we can have for a the virtio-serial device */
     uint32_t max_virtserial_ports;
     virtio_net_conf net;
+    bool ioeventfd_started;
+    VMChangeStateEntry *vm_change_state_entry;
 } VirtIOPCIProxy;
 
 /* virtio device */
@@ -179,12 +186,131 @@ static int virtio_pci_load_queue(void * opaque, int n, QEMUFile *f)
     return 0;
 }
 
+static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
+                                                 int n, bool assign)
+{
+    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
+    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
+    int r;
+    if (assign) {
+        r = event_notifier_init(notifier, 1);
+        if (r < 0) {
+            return r;
+        }
+        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
+                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
+                                       n, assign);
+        if (r < 0) {
+            event_notifier_cleanup(notifier);
+        }
+    } else {
+        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
+                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
+                                       n, assign);
+        if (r < 0) {
+            return r;
+        }
+
+        /* Handle the race condition where the guest kicked and we deassigned
+         * before we got around to handling the kick.
+         */
+        if (event_notifier_test_and_clear(notifier)) {
+            virtio_queue_notify_vq(vq);
+        }
+
+        event_notifier_cleanup(notifier);
+    }
+    return r;
+}
+
+static void virtio_pci_host_notifier_read(void *opaque)
+{
+    VirtQueue *vq = opaque;
+    EventNotifier *n = virtio_queue_get_host_notifier(vq);
+    if (event_notifier_test_and_clear(n)) {
+        virtio_queue_notify_vq(vq);
+    }
+}
+
+static void virtio_pci_set_host_notifier_fd_handler(VirtIOPCIProxy *proxy,
+                                                    int n, bool assign)
+{
+    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
+    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
+    if (assign) {
+        qemu_set_fd_handler(event_notifier_get_fd(notifier),
+                            virtio_pci_host_notifier_read, NULL, vq);
+    } else {
+        qemu_set_fd_handler(event_notifier_get_fd(notifier),
+                            NULL, NULL, NULL);
+    }
+}
+
+static int virtio_pci_start_ioeventfd(VirtIOPCIProxy *proxy)
+{
+    int n, r;
+
+    if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) ||
+        proxy->ioeventfd_started) {
+        return 0;
+    }
+
+    for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+        if (!virtio_queue_get_num(proxy->vdev, n)) {
+            continue;
+        }
+
+        r = virtio_pci_set_host_notifier_internal(proxy, n, true);
+        if (r < 0) {
+            goto assign_error;
+        }
+
+        virtio_pci_set_host_notifier_fd_handler(proxy, n, true);
+    }
+    proxy->ioeventfd_started = true;
+    return 0;
+
+assign_error:
+    while (--n >= 0) {
+        if (!virtio_queue_get_num(proxy->vdev, n)) {
+            continue;
+        }
+
+        virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
+        virtio_pci_set_host_notifier_internal(proxy, n, false);
+    }
+    proxy->ioeventfd_started = false;
+    proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    return r;
+}
+
+static int virtio_pci_stop_ioeventfd(VirtIOPCIProxy *proxy)
+{
+    int n;
+
+    if (!proxy->ioeventfd_started) {
+        return 0;
+    }
+
+    for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+        if (!virtio_queue_get_num(proxy->vdev, n)) {
+            continue;
+        }
+
+        virtio_pci_set_host_notifier_fd_handler(proxy, n, false);
+        virtio_pci_set_host_notifier_internal(proxy, n, false);
+    }
+    proxy->ioeventfd_started = false;
+    return 0;
+}
+
 static void virtio_pci_reset(DeviceState *d)
 {
     VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
+    virtio_pci_stop_ioeventfd(proxy);
     virtio_reset(proxy->vdev);
     msix_reset(&proxy->pci_dev);
-    proxy->flags = 0;
+    proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
 }
 
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
@@ -209,6 +335,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case VIRTIO_PCI_QUEUE_PFN:
         pa = (target_phys_addr_t)val << VIRTIO_PCI_QUEUE_ADDR_SHIFT;
         if (pa == 0) {
+            virtio_pci_stop_ioeventfd(proxy);
             virtio_reset(proxy->vdev);
             msix_unuse_all_vectors(&proxy->pci_dev);
         }
@@ -223,6 +350,12 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         virtio_queue_notify(vdev, val);
         break;
     case VIRTIO_PCI_STATUS:
+        if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
+            virtio_pci_start_ioeventfd(proxy);
+        } else {
+            virtio_pci_stop_ioeventfd(proxy);
+        }
+
         virtio_set_status(vdev, val & 0xFF);
         if (vdev->status == 0) {
             virtio_reset(proxy->vdev);
@@ -403,6 +536,7 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
     if (PCI_COMMAND == address) {
         if (!(val & PCI_COMMAND_MASTER)) {
             if (!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
+                virtio_pci_stop_ioeventfd(proxy);
                 virtio_set_status(proxy->vdev,
                                   proxy->vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
             }
@@ -480,30 +614,27 @@ assign_error:
 static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
 {
     VirtIOPCIProxy *proxy = opaque;
-    VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
-    EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
-    int r;
-    if (assign) {
-        r = event_notifier_init(notifier, 1);
-        if (r < 0) {
-            return r;
-        }
-        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
-                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
-                                       n, assign);
-        if (r < 0) {
-            event_notifier_cleanup(notifier);
-        }
+
+    /* Stop using ioeventfd for virtqueue kick if the device starts using host
+     * notifiers.  This makes it easy to avoid stepping on each others' toes.
+     */
+    if (proxy->ioeventfd_started) {
+        virtio_pci_stop_ioeventfd(proxy);
+        proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    }
+
+    return virtio_pci_set_host_notifier_internal(proxy, n, assign);
+}
+
+static void virtio_pci_vm_change_state_handler(void *opaque, int running, int reason)
+{
+    VirtIOPCIProxy *proxy = opaque;
+
+    if (running && (proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+        virtio_pci_start_ioeventfd(proxy);
     } else {
-        r = kvm_set_ioeventfd_pio_word(event_notifier_get_fd(notifier),
-                                       proxy->addr + VIRTIO_PCI_QUEUE_NOTIFY,
-                                       n, assign);
-        if (r < 0) {
-            return r;
-        }
-        event_notifier_cleanup(notifier);
+        virtio_pci_stop_ioeventfd(proxy);
     }
-    return r;
 }
 
 static const VirtIOBindings virtio_pci_bindings = {
@@ -563,6 +694,10 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
     proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY;
     proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
     proxy->host_features = vdev->get_features(vdev, proxy->host_features);
+
+    proxy->vm_change_state_entry = qemu_add_vm_change_state_handler(
+                                        virtio_pci_vm_change_state_handler,
+                                        proxy);
 }
 
 static int virtio_blk_init_pci(PCIDevice *pci_dev)
@@ -590,6 +725,9 @@ static int virtio_blk_init_pci(PCIDevice *pci_dev)
 
 static int virtio_exit_pci(PCIDevice *pci_dev)
 {
+    VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
+
+    qemu_del_vm_change_state_handler(proxy->vm_change_state_entry);
     return msix_uninit(pci_dev);
 }
 
@@ -597,6 +735,7 @@ static int virtio_blk_exit_pci(PCIDevice *pci_dev)
 {
     VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
 
+    virtio_pci_stop_ioeventfd(proxy);
     virtio_blk_exit(proxy->vdev);
     blockdev_mark_auto_del(proxy->block.bs);
     return virtio_exit_pci(pci_dev);
@@ -658,6 +797,7 @@ static int virtio_net_exit_pci(PCIDevice *pci_dev)
 {
     VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
 
+    virtio_pci_stop_ioeventfd(proxy);
     virtio_net_exit(proxy->vdev);
     return virtio_exit_pci(pci_dev);
 }
@@ -705,6 +845,8 @@ static PCIDeviceInfo virtio_info[] = {
         .qdev.props = (Property[]) {
             DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
             DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, block),
+            DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,
+                            VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
             DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
             DEFINE_PROP_END_OF_LIST(),
@@ -717,6 +859,8 @@ static PCIDeviceInfo virtio_info[] = {
         .exit       = virtio_net_exit_pci,
         .romfile    = "pxe-virtio.bin",
         .qdev.props = (Property[]) {
+            DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,
+                            VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, false),
             DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
             DEFINE_VIRTIO_NET_FEATURES(VirtIOPCIProxy, host_features),
             DEFINE_NIC_PROPERTIES(VirtIOPCIProxy, nic),
diff --git a/hw/virtio.c b/hw/virtio.c
index 07dbf86..e40296a 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -575,11 +575,19 @@ int virtio_queue_get_num(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.num;
 }
 
+void virtio_queue_notify_vq(VirtQueue *vq)
+{
+    if (vq->vring.desc) {
+        VirtIODevice *vdev = vq->vdev;
+        trace_virtio_queue_notify(vdev, vq - vdev->vq, vq);
+        vq->handle_output(vdev, vq);
+    }
+}
+
 void virtio_queue_notify(VirtIODevice *vdev, int n)
 {
-    if (n < VIRTIO_PCI_QUEUE_MAX && vdev->vq[n].vring.desc) {
-        trace_virtio_queue_notify(vdev, n, &vdev->vq[n]);
-        vdev->vq[n].handle_output(vdev, &vdev->vq[n]);
+    if (n < VIRTIO_PCI_QUEUE_MAX) {
+        virtio_queue_notify_vq(&vdev->vq[n]);
     }
 }
 
diff --git a/hw/virtio.h b/hw/virtio.h
index 02fa312..5ae521c 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -219,5 +219,6 @@ void virtio_queue_set_last_avail_idx(VirtIODevice *vdev, int n, uint16_t idx);
 VirtQueue *virtio_get_queue(VirtIODevice *vdev, int n);
 EventNotifier *virtio_queue_get_guest_notifier(VirtQueue *vq);
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
+void virtio_queue_notify_vq(VirtQueue *vq);
 void virtio_irq(VirtQueue *vq);
 #endif
-- 
1.7.2.3


* [Qemu-devel] [PATCH v6 3/4] virtio-pci: Don't use ioeventfd on old kernels
  2010-12-17 12:01 [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 1/4] virtio-pci: Rename bugs field to flags Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 2/4] virtio-pci: Use ioeventfd for virtqueue notify Stefan Hajnoczi
@ 2010-12-17 12:01 ` Stefan Hajnoczi
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 4/4] docs: Document virtio PCI -device ioeventfd=on|off Stefan Hajnoczi
  2010-12-29 14:52 ` [Qemu-devel] [PATCH] ioeventfd: minor fixups Michael S. Tsirkin
  4 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-12-17 12:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin

There used to be a limit of 6 KVM io bus devices inside the kernel.  On
such a kernel, don't use ioeventfd for virtqueue host notification since
the limit is reached too easily.  This ensures that existing vhost-net
setups (which always use ioeventfd) have ioeventfds available so they
can continue to work.
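
The detection is a simple probe at kvm_init() time, condensed here from the
kvm-all.c hunk below (cleanup of the probe fds is omitted, and
many_ioeventfds stands in for the KVMState field):

    /* On kernels with the 6-device io bus limit the 7th registration fails,
     * so virtio-ioeventfd stays disabled and the few slots are left for
     * vhost-net. */
    int ioeventfds[7];
    int i;

    for (i = 0; i < ARRAY_SIZE(ioeventfds); i++) {
        ioeventfds[i] = eventfd(0, EFD_CLOEXEC);
        if (ioeventfds[i] < 0 ||
            kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, true) < 0) {
            break;
        }
    }
    many_ioeventfds = (i == ARRAY_SIZE(ioeventfds));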

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/virtio-pci.c |    4 ++++
 kvm-all.c       |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 kvm-stub.c      |    5 +++++
 kvm.h           |    1 +
 4 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 6b12248..b2181fc 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -690,6 +690,10 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
     pci_register_bar(&proxy->pci_dev, 0, size, PCI_BASE_ADDRESS_SPACE_IO,
                            virtio_map);
 
+    if (!kvm_has_many_ioeventfds()) {
+        proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    }
+
     virtio_bind_device(vdev, &virtio_pci_bindings, proxy);
     proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY;
     proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
diff --git a/kvm-all.c b/kvm-all.c
index cae24bb..255b6fa 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -28,6 +28,11 @@
 #include "kvm.h"
 #include "bswap.h"
 
+/* This check must be after config-host.h is included */
+#ifdef CONFIG_EVENTFD
+#include <sys/eventfd.h>
+#endif
+
 /* KVM uses PAGE_SIZE in it's definition of COALESCED_MMIO_MAX */
 #define PAGE_SIZE TARGET_PAGE_SIZE
 
@@ -72,6 +77,7 @@ struct KVMState
     int irqchip_in_kernel;
     int pit_in_kernel;
     int xsave, xcrs;
+    int many_ioeventfds;
 };
 
 static KVMState *kvm_state;
@@ -441,6 +447,39 @@ int kvm_check_extension(KVMState *s, unsigned int extension)
     return ret;
 }
 
+static int kvm_check_many_ioeventfds(void)
+{
+    /* Older kernels have a 6 device limit on the KVM io bus.  Find out so we
+     * can avoid creating too many ioeventfds.
+     */
+#ifdef CONFIG_EVENTFD
+    int ioeventfds[7];
+    int i, ret = 0;
+    for (i = 0; i < ARRAY_SIZE(ioeventfds); i++) {
+        ioeventfds[i] = eventfd(0, EFD_CLOEXEC);
+        if (ioeventfds[i] < 0) {
+            break;
+        }
+        ret = kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, true);
+        if (ret < 0) {
+            close(ioeventfds[i]);
+            break;
+        }
+    }
+
+    /* Decide whether many devices are supported or not */
+    ret = i == ARRAY_SIZE(ioeventfds);
+
+    while (i-- > 0) {
+        kvm_set_ioeventfd_pio_word(ioeventfds[i], 0, i, false);
+        close(ioeventfds[i]);
+    }
+    return ret;
+#else
+    return 0;
+#endif
+}
+
 static void kvm_set_phys_mem(target_phys_addr_t start_addr,
 			     ram_addr_t size,
 			     ram_addr_t phys_offset)
@@ -717,6 +756,8 @@ int kvm_init(int smp_cpus)
     kvm_state = s;
     cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
+    s->many_ioeventfds = kvm_check_many_ioeventfds();
+
     return 0;
 
 err:
@@ -1046,6 +1087,14 @@ int kvm_has_xcrs(void)
     return kvm_state->xcrs;
 }
 
+int kvm_has_many_ioeventfds(void)
+{
+    if (!kvm_enabled()) {
+        return 0;
+    }
+    return kvm_state->many_ioeventfds;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 5384a4b..33d4476 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -99,6 +99,11 @@ int kvm_has_robust_singlestep(void)
     return 0;
 }
 
+int kvm_has_many_ioeventfds(void)
+{
+    return 0;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 60a9b42..ce08d42 100644
--- a/kvm.h
+++ b/kvm.h
@@ -42,6 +42,7 @@ int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
+int kvm_has_many_ioeventfds(void);
 
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
-- 
1.7.2.3


* [Qemu-devel] [PATCH v6 4/4] docs: Document virtio PCI -device ioeventfd=on|off
  2010-12-17 12:01 [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 3/4] virtio-pci: Don't use ioeventfd on old kernels Stefan Hajnoczi
@ 2010-12-17 12:01 ` Stefan Hajnoczi
  2010-12-29 14:52 ` [Qemu-devel] [PATCH] ioeventfd: minor fixups Michael S. Tsirkin
  4 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2010-12-17 12:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Stefan Hajnoczi, Michael S. Tsirkin

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 docs/qdev-device-use.txt |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/docs/qdev-device-use.txt b/docs/qdev-device-use.txt
index f252c8e..f2f9b75 100644
--- a/docs/qdev-device-use.txt
+++ b/docs/qdev-device-use.txt
@@ -97,10 +97,13 @@ The -device argument differs in detail for each kind of drive:
 
 * if=virtio
 
-  -device virtio-blk-pci,drive=DRIVE-ID,class=C,vectors=V
+  -device virtio-blk-pci,drive=DRIVE-ID,class=C,vectors=V,ioeventfd=IOEVENTFD
 
   This lets you control PCI device class and MSI-X vectors.
 
+  IOEVENTFD controls whether or not ioeventfd is used for virtqueue notify.  It
+  can be set to on (default) or off.
+
   As for all PCI devices, you can add bus=PCI-BUS,addr=DEVFN to
   control the PCI device address.
 
@@ -240,6 +243,9 @@ For PCI devices, you can add bus=PCI-BUS,addr=DEVFN to control the PCI
 device address, as usual.  The old -net nic provides parameter addr
 for that, it is silently ignored when the NIC is not a PCI device.
 
+For virtio-net-pci, you can control whether or not ioeventfd is used for
+virtqueue notify by setting ioeventfd= to on or off (default).
+
 -net nic accepts vectors=V for all models, but it's silently ignored
 except for virtio-net-pci (model=virtio).  With -device, only devices
 that support it accept it.
-- 
1.7.2.3
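
For example (image and backend names are hypothetical), the documented
property can be set per device:

    -drive file=disk.img,if=none,id=disk0 \
    -device virtio-blk-pci,drive=disk0,ioeventfd=off

    -netdev tap,id=net0 \
    -device virtio-net-pci,netdev=net0,ioeventfd=on

Omitting ioeventfd= keeps the defaults described above: on for
virtio-blk-pci, off for virtio-net-pci.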


* [Qemu-devel] [PATCH] ioeventfd: minor fixups
  2010-12-17 12:01 [Qemu-devel] [PATCH v6 0/4] virtio: Use ioeventfd for virtqueue notify Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2010-12-17 12:01 ` [Qemu-devel] [PATCH v6 4/4] docs: Document virtio PCI -device ioeventfd=on|off Stefan Hajnoczi
@ 2010-12-29 14:52 ` Michael S. Tsirkin
  2011-01-03 16:38   ` [Qemu-devel] " Michael S. Tsirkin
  2011-01-04 16:57   ` [Qemu-devel] " Stefan Hajnoczi
  4 siblings, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-12-29 14:52 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel

This is on top of the ioeventfd patches, and
fixes two potential issues when ioeventfd is mixed with vhost-net:
- ioeventfd could start running then get stopped for vhost-net.
  For example, vm state change handlers run in no particular order.
  This would be hard to debug. Prevent this by always running state
  callback before ioeventfd start and after ioeventfd stop.
- Separate the flags for user-requested ioeventfd and for ioeventfd
  being overridden by vhost backend.
  This will make it possible to enable/disable vhost depending on
  guest behaviour.

I'll probably split this patch in two, and merge into the
appropriate patches in the ioeventfd series.

Compile-tested only so far, would appreciate feedback/test reports.
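
For reference, the resulting vmstate handler (condensed from the virtio-pci.c
hunk below) shows the fixed ordering - the status callback runs before
ioeventfd start and after ioeventfd stop:

    static void virtio_pci_vmstate_change(void *opaque, bool running)
    {
        VirtIOPCIProxy *proxy = opaque;

        if (running && (proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
            virtio_set_status(proxy->vdev, proxy->vdev->status); /* e.g. start vhost */
            virtio_pci_start_ioeventfd(proxy);
        } else {
            virtio_pci_stop_ioeventfd(proxy);
            virtio_set_status(proxy->vdev, proxy->vdev->status); /* e.g. stop vhost */
        }
    }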

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

diff --git a/hw/vhost.c b/hw/vhost.c
index 6082da2..1d09ed0 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -630,6 +630,17 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 {
     int i, r;
+    /* A binding must support vmstate notifiers (for migration to work) */
+    if (!vdev->binding->vmstate_change) {
+        fprintf(stderr, "binding does not support vmstate notifiers\n");
+        r = -ENOSYS;
+        goto fail;
+    }
+    if (!vdev->binding->set_guest_notifiers) {
+        fprintf(stderr, "binding does not support guest notifiers\n");
+        r = -ENOSYS;
+        goto fail;
+    }
     if (!vdev->binding->set_guest_notifiers) {
         fprintf(stderr, "binding does not support guest notifiers\n");
         r = -ENOSYS;
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ec1bf8d..ccb3e63 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -54,8 +54,6 @@ typedef struct VirtIONet
     uint8_t nouni;
     uint8_t nobcast;
     uint8_t vhost_started;
-    bool vm_running;
-    VMChangeStateEntry *vmstate;
     struct {
         int in_use;
         int first_multi;
@@ -102,7 +100,7 @@ static void virtio_net_set_config(VirtIODevice *vdev, const uint8_t *config)
 static bool virtio_net_started(VirtIONet *n, uint8_t status)
 {
     return (status & VIRTIO_CONFIG_S_DRIVER_OK) &&
-        (n->status & VIRTIO_NET_S_LINK_UP) && n->vm_running;
+        (n->status & VIRTIO_NET_S_LINK_UP) && n->vdev.vm_running;
 }
 
 static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
@@ -453,7 +451,7 @@ static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 static int virtio_net_can_receive(VLANClientState *nc)
 {
     VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
-    if (!n->vm_running) {
+    if (!n->vdev.vm_running) {
         return 0;
     }
 
@@ -708,7 +706,7 @@ static int32_t virtio_net_flush_tx(VirtIONet *n, VirtQueue *vq)
         return num_packets;
     }
 
-    assert(n->vm_running);
+    assert(n->vdev.vm_running);
 
     if (n->async_tx.elem.out_num) {
         virtio_queue_set_notification(n->tx_vq, 0);
@@ -769,7 +767,7 @@ static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
     VirtIONet *n = to_virtio_net(vdev);
 
     /* This happens when device was stopped but VCPU wasn't. */
-    if (!n->vm_running) {
+    if (!n->vdev.vm_running) {
         n->tx_waiting = 1;
         return;
     }
@@ -796,7 +794,7 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
     }
     n->tx_waiting = 1;
     /* This happens when device was stopped but VCPU wasn't. */
-    if (!n->vm_running) {
+    if (!n->vdev.vm_running) {
         return;
     }
     virtio_queue_set_notification(vq, 0);
@@ -806,7 +804,7 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
 static void virtio_net_tx_timer(void *opaque)
 {
     VirtIONet *n = opaque;
-    assert(n->vm_running);
+    assert(n->vdev.vm_running);
 
     n->tx_waiting = 0;
 
@@ -823,7 +821,7 @@ static void virtio_net_tx_bh(void *opaque)
     VirtIONet *n = opaque;
     int32_t ret;
 
-    assert(n->vm_running);
+    assert(n->vdev.vm_running);
 
     n->tx_waiting = 0;
 
@@ -988,16 +986,6 @@ static NetClientInfo net_virtio_info = {
     .link_status_changed = virtio_net_set_link_status,
 };
 
-static void virtio_net_vmstate_change(void *opaque, int running, int reason)
-{
-    VirtIONet *n = opaque;
-    n->vm_running = running;
-    /* This is called when vm is started/stopped,
-     * it will start/stop vhost backend if appropriate
-     * e.g. after migration. */
-    virtio_net_set_status(&n->vdev, n->vdev.status);
-}
-
 VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
                               virtio_net_conf *net)
 {
@@ -1052,7 +1040,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
     n->qdev = dev;
     register_savevm(dev, "virtio-net", -1, VIRTIO_NET_VM_VERSION,
                     virtio_net_save, virtio_net_load, n);
-    n->vmstate = qemu_add_vm_change_state_handler(virtio_net_vmstate_change, n);
 
     add_boot_device_path(conf->bootindex, dev, "/ethernet-phy@0");
 
@@ -1062,7 +1049,6 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
 void virtio_net_exit(VirtIODevice *vdev)
 {
     VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
-    qemu_del_vm_change_state_handler(n->vmstate);
 
     /* This will stop vhost backend if appropriate. */
     virtio_net_set_status(vdev, 0);
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index b2181fc..75b5e53 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -112,8 +112,8 @@ typedef struct {
     /* Max. number of ports we can have for a the virtio-serial device */
     uint32_t max_virtserial_ports;
     virtio_net_conf net;
+    bool ioeventfd_disabled;
     bool ioeventfd_started;
-    VMChangeStateEntry *vm_change_state_entry;
 } VirtIOPCIProxy;
 
 /* virtio device */
@@ -251,6 +251,7 @@ static int virtio_pci_start_ioeventfd(VirtIOPCIProxy *proxy)
     int n, r;
 
     if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) ||
+        proxy->ioeventfd_disabled ||
         proxy->ioeventfd_started) {
         return 0;
     }
@@ -280,7 +281,7 @@ assign_error:
         virtio_pci_set_host_notifier_internal(proxy, n, false);
     }
     proxy->ioeventfd_started = false;
-    proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+    proxy->ioeventfd_disabled = true;
     return r;
 }
 
@@ -350,13 +351,16 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         virtio_queue_notify(vdev, val);
         break;
     case VIRTIO_PCI_STATUS:
-        if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
-            virtio_pci_start_ioeventfd(proxy);
-        } else {
+        if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
             virtio_pci_stop_ioeventfd(proxy);
         }
 
         virtio_set_status(vdev, val & 0xFF);
+
+        if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
+            virtio_pci_start_ioeventfd(proxy);
+        }
+
         if (vdev->status == 0) {
             virtio_reset(proxy->vdev);
             msix_unuse_all_vectors(&proxy->pci_dev);
@@ -618,22 +622,24 @@ static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
     /* Stop using ioeventfd for virtqueue kick if the device starts using host
      * notifiers.  This makes it easy to avoid stepping on each others' toes.
      */
-    if (proxy->ioeventfd_started) {
+    proxy->ioeventfd_disabled = assign;
+    if (assign) {
         virtio_pci_stop_ioeventfd(proxy);
-        proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
     }
 
     return virtio_pci_set_host_notifier_internal(proxy, n, assign);
 }
 
-static void virtio_pci_vm_change_state_handler(void *opaque, int running, int reason)
+static void virtio_pci_vmstate_change(void *opaque, bool running)
 {
     VirtIOPCIProxy *proxy = opaque;
 
     if (running && (proxy->vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+        virtio_set_status(proxy->vdev, proxy->vdev->status);
         virtio_pci_start_ioeventfd(proxy);
     } else {
         virtio_pci_stop_ioeventfd(proxy);
+        virtio_set_status(proxy->vdev, proxy->vdev->status);
     }
 }
 
@@ -646,6 +652,7 @@ static const VirtIOBindings virtio_pci_bindings = {
     .get_features = virtio_pci_get_features,
     .set_host_notifier = virtio_pci_set_host_notifier,
     .set_guest_notifiers = virtio_pci_set_guest_notifiers,
+    .vmstate_change = virtio_pci_vmstate_change,
 };
 
 static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
@@ -699,9 +706,6 @@ static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
     proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
     proxy->host_features = vdev->get_features(vdev, proxy->host_features);
 
-    proxy->vm_change_state_entry = qemu_add_vm_change_state_handler(
-                                        virtio_pci_vm_change_state_handler,
-                                        proxy);
 }
 
 static int virtio_blk_init_pci(PCIDevice *pci_dev)
@@ -729,9 +733,6 @@ static int virtio_blk_init_pci(PCIDevice *pci_dev)
 
 static int virtio_exit_pci(PCIDevice *pci_dev)
 {
-    VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
-
-    qemu_del_vm_change_state_handler(proxy->vm_change_state_entry);
     return msix_uninit(pci_dev);
 }
 
diff --git a/hw/virtio.c b/hw/virtio.c
index e40296a..13c3373 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -751,11 +751,21 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
 
 void virtio_cleanup(VirtIODevice *vdev)
 {
+    qemu_del_vm_change_state_handler(vdev->vmstate);
     if (vdev->config)
         qemu_free(vdev->config);
     qemu_free(vdev->vq);
 }
 
+static void virtio_vmstate_change(void *opaque, int running, int reason)
+{
+    VirtIODevice *vdev = opaque;
+    vdev->vm_running = running;
+    if (vdev->binding->vmstate_change) {
+        vdev->binding->vmstate_change(vdev->binding_opaque, running);
+    }
+}
+
 VirtIODevice *virtio_common_init(const char *name, uint16_t device_id,
                                  size_t config_size, size_t struct_size)
 {
@@ -782,6 +792,8 @@ VirtIODevice *virtio_common_init(const char *name, uint16_t device_id,
     else
         vdev->config = NULL;
 
+    vdev->vmstate = qemu_add_vm_change_state_handler(virtio_vmstate_change, vdev);
+
     return vdev;
 }
 
diff --git a/hw/virtio.h b/hw/virtio.h
index 5ae521c..d8546d5 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -95,6 +95,7 @@ typedef struct {
     unsigned (*get_features)(void * opaque);
     int (*set_guest_notifiers)(void * opaque, bool assigned);
     int (*set_host_notifier)(void * opaque, int n, bool assigned);
+    void (*vmstate_change)(void * opaque, bool running);
 } VirtIOBindings;
 
 #define VIRTIO_PCI_QUEUE_MAX 64
@@ -123,6 +124,8 @@ struct VirtIODevice
     const VirtIOBindings *binding;
     void *binding_opaque;
     uint16_t device_id;
+    bool vm_running;
+    VMChangeStateEntry *vmstate;
 };
 
 static inline void virtio_set_status(VirtIODevice *vdev, uint8_t val)


* [Qemu-devel] Re: [PATCH] ioeventfd: minor fixups
  2010-12-29 14:52 ` [Qemu-devel] [PATCH] ioeventfd: minor fixups Michael S. Tsirkin
@ 2011-01-03 16:38   ` Michael S. Tsirkin
  2011-01-04 10:28     ` Stefan Hajnoczi
  2011-01-04 16:57   ` [Qemu-devel] " Stefan Hajnoczi
  1 sibling, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-01-03 16:38 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel

On Wed, Dec 29, 2010 at 04:52:46PM +0200, Michael S. Tsirkin wrote:
> This is on top of the ioeventfd patches, and
> fixes two potential issues when ioeventfd is mixed with vhost-net:
> - ioeventfd could start running then get stopped for vhost-net.
>   For example, vm state change handlers run in no particular order.
>   This would be hard to debug. Prevent this by always running state
>   callback before ioeventfd start and after ioeventfd stop.
> - Separate the flags for user-requested ioeventfd and for ioeventfd
>   being overridden by vhost backend.
>   This will make it possible to enable/disable vhost depending on
>   guest behaviour.
> 
> I'll probably split this patch in two, and merge into the
> appropriate patches in the ioeventfd series.
> 
> Compile-tested only so far, would appreciate feedback/test reports.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


Stefan - just to clarify, I'm waiting for your ack
to merge this + the ioeventfd patches.


* [Qemu-devel] Re: [PATCH] ioeventfd: minor fixups
  2011-01-03 16:38   ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-04 10:28     ` Stefan Hajnoczi
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-01-04 10:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: qemu-devel

On Mon, Jan 03, 2011 at 06:38:53PM +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 29, 2010 at 04:52:46PM +0200, Michael S. Tsirkin wrote:
> > This is on top of the ioeventfd patches, and
> > fixes two potential issues when ioeventfd is mixed with vhost-net:
> > - ioeventfd could start running then get stopped for vhost-net.
> >   For example, vm state change handlers run in no particular order.
> >   This would be hard to debug. Prevent this by always running state
> >   callback before ioeventfd start and after ioeventfd stop.
> > - Separate the flags for user-requested ioeventfd and for ioeventfd
> >   being overridden by vhost backend.
> >   This will make it possible to enable/disable vhost depending on
> >   guest behaviour.
> > 
> > I'll probably split this patch in two, and merge into the
> > appropriate patches in the ioeventfd series.
> > 
> > Compile-tested only so far, would appreciate feedback/test reports.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> 
> Stefan - just to clarify, I'm waiting for your ack
> to merge this + the ioeventfd patches.

Back from the holidays today.  Will send feedback/ack later today.

Stefan


* Re: [Qemu-devel] [PATCH] ioeventfd: minor fixups
  2010-12-29 14:52 ` [Qemu-devel] [PATCH] ioeventfd: minor fixups Michael S. Tsirkin
  2011-01-03 16:38   ` [Qemu-devel] " Michael S. Tsirkin
@ 2011-01-04 16:57   ` Stefan Hajnoczi
  2011-01-04 17:53     ` Michael S. Tsirkin
  1 sibling, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-01-04 16:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel

On Wed, Dec 29, 2010 at 2:52 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> I'll probably split this patch in two, and merge into the
> appropriate patches in the ioeventfd series.
>
> Compile-tested only so far, would appreciate feedback/test reports.

virtio-ioeventfd works as expected.

> diff --git a/hw/vhost.c b/hw/vhost.c
> index 6082da2..1d09ed0 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -630,6 +630,17 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
>  int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>  {
>     int i, r;
> +    /* A binding must support vmstate notifiers (for migration to work) */
> +    if (!vdev->binding->vmstate_change) {
> +        fprintf(stderr, "binding does not support vmstate notifiers\n");
> +        r = -ENOSYS;
> +        goto fail;
> +    }
> +    if (!vdev->binding->set_guest_notifiers) {
> +        fprintf(stderr, "binding does not support guest notifiers\n");
> +        r = -ENOSYS;
> +        goto fail;
> +    }

Merge error?  The set_guest_notifiers check is already present.

Stefan


* Re: [Qemu-devel] [PATCH] ioeventfd: minor fixups
  2011-01-04 16:57   ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-01-04 17:53     ` Michael S. Tsirkin
  0 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-01-04 17:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel

On Tue, Jan 04, 2011 at 04:57:43PM +0000, Stefan Hajnoczi wrote:
> On Wed, Dec 29, 2010 at 2:52 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > I'll probably split this patch in two, and merge into the
> > appropriate patches in the ioeventfd series.
> >
> > Compile-tested only so far, would appreciate feedback/test reports.
> 
> virtio-ioeventfd works as expected.
> 
> > diff --git a/hw/vhost.c b/hw/vhost.c
> > index 6082da2..1d09ed0 100644
> > --- a/hw/vhost.c
> > +++ b/hw/vhost.c
> > @@ -630,6 +630,17 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
> >  int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >  {
> >     int i, r;
> > +    /* A binding must support vmstate notifiers (for migration to work) */
> > +    if (!vdev->binding->vmstate_change) {
> > +        fprintf(stderr, "binding does not support vmstate notifiers\n");
> > +        r = -ENOSYS;
> > +        goto fail;
> > +    }
> > +    if (!vdev->binding->set_guest_notifiers) {
> > +        fprintf(stderr, "binding does not support guest notifiers\n");
> > +        r = -ENOSYS;
> > +        goto fail;
> > +    }
> 
> Merge error?  The set_guest_notifiers check is already present.
> 
> Stefan

OK, I need to test with/without vhost-net and then merge.

Thanks!

