qemu-devel.nongnu.org archive mirror
* [PATCH v2 00/16] block: remove aio_disable_external() API
@ 2023-04-19 17:28 Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper Stefan Hajnoczi
                   ` (15 more replies)
  0 siblings, 16 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

v2:
- Do not rely on BlockBackend request queuing, implement .drained_begin/end()
  instead in xen-block, virtio-blk, and virtio-scsi [Paolo]
- Add qdev_is_realized() API [Philippe]
- Add patch to avoid AioContext lock around blk_exp_ref/unref() [Paolo]
- Add patch to call .drained_begin/end() from main loop thread to simplify
  callback implementations

The aio_disable_external() API temporarily suspends file descriptor monitoring
in the event loop. The block layer uses this to prevent new I/O requests from
being submitted by the guest and elsewhere between bdrv_drained_begin() and
bdrv_drained_end().

While the block layer still needs to prevent new I/O requests in drained
sections, the aio_disable_external() API can be replaced with
.drained_begin/end/poll() callbacks that have been added to BdrvChildClass and
BlockDevOps.

This newer .drained_begin/end/poll() approach is attractive because it works
without being tied to a specific AioContext. The block layer is moving towards
multi-queue, which means multiple AioContexts may be processing I/O
simultaneously.

The aio_disable_external() API was always somewhat hacky. It suspends all file
descriptors that were registered with is_external=true, even if they have
nothing to do with the BlockDriverState graph nodes that are being drained.
It's better to solve a block layer problem in the block layer than with an odd
event loop API.

The approach in this patch series is to implement BlockDevOps
.drained_begin/end() callbacks that temporarily stop file descriptor handlers.
This ensures that new I/O requests are not submitted in drained sections.

The first two virtio-scsi patches were already sent as a separate series. I
included them here because they are necessary to fully remove
aio_disable_external().

Based-on: 087bc644b7634436ca9d52fe58ba9234e2bef026 (kevin/block-next)

Stefan Hajnoczi (16):
  hw/qdev: introduce qdev_is_realized() helper
  virtio-scsi: avoid race between unplug and transport event
  virtio-scsi: stop using aio_disable_external() during unplug
  block/export: only acquire AioContext once for
    vhost_user_server_stop()
  util/vhost-user-server: rename refcount to in_flight counter
  block/export: wait for vhost-user-blk requests when draining
  block/export: stop using is_external in vhost-user-blk server
  hw/xen: do not use aio_set_fd_handler(is_external=true) in
    xen_xenstore
  block: add blk_in_drain() API
  block: drain from main loop thread in bdrv_co_yield_to_drain()
  xen-block: implement BlockDevOps->drained_begin()
  hw/xen: do not set is_external=true on evtchn fds
  block/export: rewrite vduse-blk drain code
  block/export: don't require AioContext lock around blk_exp_ref/unref()
  block/fuse: do not set is_external=true on FUSE fd
  virtio: make it possible to detach host notifier from any thread

 hw/block/dataplane/xen-block.h              |   2 +
 include/block/export.h                      |   2 +
 include/hw/qdev-core.h                      |  17 ++-
 include/qemu/vhost-user-server.h            |   8 +-
 include/sysemu/block-backend-common.h       |  25 ++--
 include/sysemu/block-backend-global-state.h |   1 +
 block/block-backend.c                       |   7 ++
 block/export/export.c                       |  13 +-
 block/export/fuse.c                         |  58 ++++++++-
 block/export/vduse-blk.c                    | 128 ++++++++++++++------
 block/export/vhost-user-blk-server.c        |  73 +++++++----
 block/io.c                                  |   3 +-
 hw/block/dataplane/virtio-blk.c             |   2 +
 hw/block/dataplane/xen-block.c              |  42 +++++--
 hw/block/xen-block.c                        |  24 +++-
 hw/i386/kvm/xen_xenstore.c                  |   2 +-
 hw/scsi/scsi-bus.c                          |   6 +-
 hw/scsi/scsi-disk.c                         |   1 +
 hw/scsi/virtio-scsi-dataplane.c             |   9 ++
 hw/scsi/virtio-scsi.c                       |  21 ++--
 hw/xen/xen-bus.c                            |  11 +-
 util/vhost-user-server.c                    |  37 +++---
 22 files changed, 348 insertions(+), 144 deletions(-)

-- 
2.39.2



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-20  8:17   ` Philippe Mathieu-Daudé
  2023-04-19 17:28 ` [PATCH v2 02/16] virtio-scsi: avoid race between unplug and transport event Stefan Hajnoczi
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

Add a helper function to check whether the device is realized without
requiring the Big QEMU Lock. The next patch adds a second caller. The
goal is to avoid spreading DeviceState field accesses throughout the
code.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/hw/qdev-core.h | 17 ++++++++++++++---
 hw/scsi/scsi-bus.c     |  3 +--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index bd50ad5ee1..4d734cf35e 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -1,6 +1,7 @@
 #ifndef QDEV_CORE_H
 #define QDEV_CORE_H
 
+#include "qemu/atomic.h"
 #include "qemu/queue.h"
 #include "qemu/bitmap.h"
 #include "qemu/rcu.h"
@@ -164,9 +165,6 @@ struct NamedClockList {
 
 /**
  * DeviceState:
- * @realized: Indicates whether the device has been fully constructed.
- *            When accessed outside big qemu lock, must be accessed with
- *            qatomic_load_acquire()
  * @reset: ResettableState for the device; handled by Resettable interface.
  *
  * This structure should not be accessed directly.  We declare it here
@@ -332,6 +330,19 @@ DeviceState *qdev_new(const char *name);
  */
 DeviceState *qdev_try_new(const char *name);
 
+/**
+ * qdev_is_realized:
+ * @dev: The device to check.
+ *
+ * May be called outside big qemu lock.
+ *
+ * Returns: %true% if the device has been fully constructed, %false% otherwise.
+ */
+static inline bool qdev_is_realized(DeviceState *dev)
+{
+    return qatomic_load_acquire(&dev->realized);
+}
+
 /**
  * qdev_realize: Realize @dev.
  * @dev: device to realize
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index c97176110c..07275fb631 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -60,8 +60,7 @@ static SCSIDevice *do_scsi_device_find(SCSIBus *bus,
      * the user access the device.
      */
 
-    if (retval && !include_unrealized &&
-        !qatomic_load_acquire(&retval->qdev.realized)) {
+    if (retval && !include_unrealized && !qdev_is_realized(&retval->qdev)) {
         retval = NULL;
     }
 
-- 
2.39.2




* [PATCH v2 02/16] virtio-scsi: avoid race between unplug and transport event
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 03/16] virtio-scsi: stop using aio_disable_external() during unplug Stefan Hajnoczi
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse,
	Daniil Tatianin

Only report a transport reset event to the guest after the SCSIDevice
has been unrealized by qdev_simple_device_unplug_cb().

qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field
to false so that scsi_device_find/get() no longer see it.

scsi_target_emulate_report_luns() also needs to be updated to filter out
SCSIDevices that are unrealized.

These changes ensure that the guest driver does not see the SCSIDevice
that's being unplugged if it responds very quickly to the transport
reset event.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/scsi/scsi-bus.c    |  3 ++-
 hw/scsi/virtio-scsi.c | 18 +++++++++---------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 07275fb631..64d7311757 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -486,7 +486,8 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
             DeviceState *qdev = kid->child;
             SCSIDevice *dev = SCSI_DEVICE(qdev);
 
-            if (dev->channel == channel && dev->id == id && dev->lun != 0) {
+            if (dev->channel == channel && dev->id == id && dev->lun != 0 &&
+                qdev_is_realized(&dev->qdev)) {
                 store_lun(tmp, dev->lun);
                 g_byte_array_append(buf, tmp, 8);
                 len += 8;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 612c525d9d..000961446c 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1063,15 +1063,6 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     SCSIDevice *sd = SCSI_DEVICE(dev);
     AioContext *ctx = s->ctx ?: qemu_get_aio_context();
 
-    if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) {
-        virtio_scsi_acquire(s);
-        virtio_scsi_push_event(s, sd,
-                               VIRTIO_SCSI_T_TRANSPORT_RESET,
-                               VIRTIO_SCSI_EVT_RESET_REMOVED);
-        scsi_bus_set_ua(&s->bus, SENSE_CODE(REPORTED_LUNS_CHANGED));
-        virtio_scsi_release(s);
-    }
-
     aio_disable_external(ctx);
     qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
     aio_enable_external(ctx);
@@ -1082,6 +1073,15 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
         blk_set_aio_context(sd->conf.blk, qemu_get_aio_context(), NULL);
         virtio_scsi_release(s);
     }
+
+    if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) {
+        virtio_scsi_acquire(s);
+        virtio_scsi_push_event(s, sd,
+                               VIRTIO_SCSI_T_TRANSPORT_RESET,
+                               VIRTIO_SCSI_EVT_RESET_REMOVED);
+        scsi_bus_set_ua(&s->bus, SENSE_CODE(REPORTED_LUNS_CHANGED));
+        virtio_scsi_release(s);
+    }
 }
 
 static struct SCSIBusInfo virtio_scsi_scsi_info = {
-- 
2.39.2




* [PATCH v2 03/16] virtio-scsi: stop using aio_disable_external() during unplug
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 02/16] virtio-scsi: avoid race between unplug and transport event Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 04/16] block/export: only acquire AioContext once for vhost_user_server_stop() Stefan Hajnoczi
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse,
	Zhengui Li, Daniil Tatianin

This patch is part of an effort to remove the aio_disable_external()
API because it does not fit in a multi-queue block layer world where
many AioContexts may be submitting requests to the same disk.

The SCSI emulation code is already in good shape to stop using
aio_disable_external(). It was only used by commit 9c5aad84da1c
("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
disk") to ensure that virtio_scsi_hotunplug() works while the guest
driver is submitting I/O.

Ensure virtio_scsi_hotunplug() is safe as follows:

1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
   device_set_realized() calls qatomic_set(&dev->realized, false) so
   that future scsi_device_get() calls return NULL because they exclude
   SCSIDevices with realized=false.

   That means virtio-scsi will reject new I/O requests to this
   SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
   virtio_scsi_hotunplug() is still executing. We are protected against
   new requests!

2. Add a call to scsi_device_purge_requests() from scsi_unrealize() so
   that in-flight requests are cancelled synchronously. This ensures
   that no in-flight requests remain once qdev_simple_device_unplug_cb()
   returns.

Thanks to these two conditions we don't need aio_disable_external()
anymore.

Cc: Zhengui Li <lizhengui@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/scsi/scsi-disk.c   | 1 +
 hw/scsi/virtio-scsi.c | 3 ---
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 97c9b1c8cd..e01bd84541 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2522,6 +2522,7 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
 
 static void scsi_unrealize(SCSIDevice *dev)
 {
+    scsi_device_purge_requests(dev, SENSE_CODE(RESET));
     del_boot_device_lchs(&dev->qdev, NULL);
 }
 
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 000961446c..a02f9233ec 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1061,11 +1061,8 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev);
     VirtIOSCSI *s = VIRTIO_SCSI(vdev);
     SCSIDevice *sd = SCSI_DEVICE(dev);
-    AioContext *ctx = s->ctx ?: qemu_get_aio_context();
 
-    aio_disable_external(ctx);
     qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
-    aio_enable_external(ctx);
 
     if (s->ctx) {
         virtio_scsi_acquire(s);
-- 
2.39.2




* [PATCH v2 04/16] block/export: only acquire AioContext once for vhost_user_server_stop()
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 03/16] virtio-scsi: stop using aio_disable_external() during unplug Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter Stefan Hajnoczi
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

vhost_user_server_stop() uses AIO_WAIT_WHILE(), which requires that the
AioContext be acquired exactly once.

Since blk_exp_request_shutdown() already acquires the AioContext, it must not
be acquired again in vhost_user_server_stop().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/vhost-user-server.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 40f36ea214..5b6216069c 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -346,10 +346,9 @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
     aio_context_release(server->ctx);
 }
 
+/* server->ctx acquired by caller */
 void vhost_user_server_stop(VuServer *server)
 {
-    aio_context_acquire(server->ctx);
-
     qemu_bh_delete(server->restart_listener_bh);
     server->restart_listener_bh = NULL;
 
@@ -366,8 +365,6 @@ void vhost_user_server_stop(VuServer *server)
         AIO_WAIT_WHILE(server->ctx, server->co_trip);
     }
 
-    aio_context_release(server->ctx);
-
     if (server->listener) {
         qio_net_listener_disconnect(server->listener);
         object_unref(OBJECT(server->listener));
-- 
2.39.2




* [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 04/16] block/export: only acquire AioContext once for vhost_user_server_stop() Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-20  8:18   ` Philippe Mathieu-Daudé
  2023-04-19 17:28 ` [PATCH v2 06/16] block/export: wait for vhost-user-blk requests when draining Stefan Hajnoczi
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

The VuServer object has a refcount field and ref/unref APIs. The name is
confusing because it's actually an in-flight request counter instead of
a refcount.

Normally a refcount destroys the object upon reaching zero. The VuServer
counter is used to wake up the vhost-user coroutine when there are no
more requests.

Avoid confusion by renaming refcount and ref/unref to in_flight and inc/dec.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/vhost-user-server.h     |  6 +++---
 block/export/vhost-user-blk-server.c | 11 +++++++----
 util/vhost-user-server.c             | 14 +++++++-------
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
index 25c72433ca..bc0ac9ddb6 100644
--- a/include/qemu/vhost-user-server.h
+++ b/include/qemu/vhost-user-server.h
@@ -41,7 +41,7 @@ typedef struct {
     const VuDevIface *vu_iface;
 
     /* Protected by ctx lock */
-    unsigned int refcount;
+    unsigned int in_flight;
     bool wait_idle;
     VuDev vu_dev;
     QIOChannel *ioc; /* The I/O channel with the client */
@@ -60,8 +60,8 @@ bool vhost_user_server_start(VuServer *server,
 
 void vhost_user_server_stop(VuServer *server);
 
-void vhost_user_server_ref(VuServer *server);
-void vhost_user_server_unref(VuServer *server);
+void vhost_user_server_inc_in_flight(VuServer *server);
+void vhost_user_server_dec_in_flight(VuServer *server);
 
 void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
 void vhost_user_server_detach_aio_context(VuServer *server);
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index 3409d9e02e..e93f2ed6b4 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -49,7 +49,10 @@ static void vu_blk_req_complete(VuBlkReq *req, size_t in_len)
     free(req);
 }
 
-/* Called with server refcount increased, must decrease before returning */
+/*
+ * Called with server in_flight counter increased, must decrease before
+ * returning.
+ */
 static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
 {
     VuBlkReq *req = opaque;
@@ -67,12 +70,12 @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
                                     in_num, out_num);
     if (in_len < 0) {
         free(req);
-        vhost_user_server_unref(server);
+        vhost_user_server_dec_in_flight(server);
         return;
     }
 
     vu_blk_req_complete(req, in_len);
-    vhost_user_server_unref(server);
+    vhost_user_server_dec_in_flight(server);
 }
 
 static void vu_blk_process_vq(VuDev *vu_dev, int idx)
@@ -94,7 +97,7 @@ static void vu_blk_process_vq(VuDev *vu_dev, int idx)
         Coroutine *co =
             qemu_coroutine_create(vu_blk_virtio_process_req, req);
 
-        vhost_user_server_ref(server);
+        vhost_user_server_inc_in_flight(server);
         qemu_coroutine_enter(co);
     }
 }
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 5b6216069c..1622f8cfb3 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -75,16 +75,16 @@ static void panic_cb(VuDev *vu_dev, const char *buf)
     error_report("vu_panic: %s", buf);
 }
 
-void vhost_user_server_ref(VuServer *server)
+void vhost_user_server_inc_in_flight(VuServer *server)
 {
     assert(!server->wait_idle);
-    server->refcount++;
+    server->in_flight++;
 }
 
-void vhost_user_server_unref(VuServer *server)
+void vhost_user_server_dec_in_flight(VuServer *server)
 {
-    server->refcount--;
-    if (server->wait_idle && !server->refcount) {
+    server->in_flight--;
+    if (server->wait_idle && !server->in_flight) {
         aio_co_wake(server->co_trip);
     }
 }
@@ -192,13 +192,13 @@ static coroutine_fn void vu_client_trip(void *opaque)
         /* Keep running */
     }
 
-    if (server->refcount) {
+    if (server->in_flight) {
         /* Wait for requests to complete before we can unmap the memory */
         server->wait_idle = true;
         qemu_coroutine_yield();
         server->wait_idle = false;
     }
-    assert(server->refcount == 0);
+    assert(server->in_flight == 0);
 
     vu_deinit(vu_dev);
 
-- 
2.39.2




* [PATCH v2 06/16] block/export: wait for vhost-user-blk requests when draining
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 07/16] block/export: stop using is_external in vhost-user-blk server Stefan Hajnoczi
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

Each vhost-user-blk request runs in a coroutine. When the BlockBackend
enters a drained section we need to enter a quiescent state. Currently
any in-flight requests race with bdrv_drained_begin() because it is
unaware of vhost-user-blk requests.

When blk_co_preadv()/pwritev()/etc return, they wake the bdrv_drained_begin()
thread, but vhost-user-blk request processing has not yet finished. The request
coroutine continues executing while the main loop thread thinks it is in a
drained section.

One example where this is unsafe is for blk_set_aio_context() where
bdrv_drained_begin() is called before .aio_context_detached() and
.aio_context_attach(). If request coroutines are still running after
bdrv_drained_begin(), then the AioContext could change underneath them
and they race with new requests processed in the new AioContext. This
could lead to virtqueue corruption, for example.

(This example is theoretical, I came across this while reading the
code and have not tried to reproduce it.)

It's easy to make bdrv_drained_begin() wait for in-flight requests: add
a .drained_poll() callback that checks the VuServer's in-flight counter.
VuServer just needs an API that returns true when there are requests in
flight. The in-flight counter needs to be atomic.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/vhost-user-server.h     |  4 +++-
 block/export/vhost-user-blk-server.c | 19 +++++++++++++++++++
 util/vhost-user-server.c             | 14 ++++++++++----
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
index bc0ac9ddb6..b1c1cda886 100644
--- a/include/qemu/vhost-user-server.h
+++ b/include/qemu/vhost-user-server.h
@@ -40,8 +40,9 @@ typedef struct {
     int max_queues;
     const VuDevIface *vu_iface;
 
+    unsigned int in_flight; /* atomic */
+
     /* Protected by ctx lock */
-    unsigned int in_flight;
     bool wait_idle;
     VuDev vu_dev;
     QIOChannel *ioc; /* The I/O channel with the client */
@@ -62,6 +63,7 @@ void vhost_user_server_stop(VuServer *server);
 
 void vhost_user_server_inc_in_flight(VuServer *server);
 void vhost_user_server_dec_in_flight(VuServer *server);
+bool vhost_user_server_has_in_flight(VuServer *server);
 
 void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
 void vhost_user_server_detach_aio_context(VuServer *server);
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index e93f2ed6b4..dbf5207162 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -254,6 +254,22 @@ static void vu_blk_exp_request_shutdown(BlockExport *exp)
     vhost_user_server_stop(&vexp->vu_server);
 }
 
+/*
+ * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
+ *
+ * Called with vexp->export.ctx acquired.
+ */
+static bool vu_blk_drained_poll(void *opaque)
+{
+    VuBlkExport *vexp = opaque;
+
+    return vhost_user_server_has_in_flight(&vexp->vu_server);
+}
+
+static const BlockDevOps vu_blk_dev_ops = {
+    .drained_poll  = vu_blk_drained_poll,
+};
+
 static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
                              Error **errp)
 {
@@ -292,6 +308,7 @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
     vu_blk_initialize_config(blk_bs(exp->blk), &vexp->blkcfg,
                              logical_block_size, num_queues);
 
+    blk_set_dev_ops(exp->blk, &vu_blk_dev_ops, vexp);
     blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
                                  vexp);
 
@@ -299,6 +316,7 @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
                                  num_queues, &vu_blk_iface, errp)) {
         blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
                                         blk_aio_detach, vexp);
+        blk_set_dev_ops(exp->blk, NULL, NULL);
         g_free(vexp->handler.serial);
         return -EADDRNOTAVAIL;
     }
@@ -312,6 +330,7 @@ static void vu_blk_exp_delete(BlockExport *exp)
 
     blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
                                     vexp);
+    blk_set_dev_ops(exp->blk, NULL, NULL);
     g_free(vexp->handler.serial);
 }
 
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 1622f8cfb3..2e6b640050 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -78,17 +78,23 @@ static void panic_cb(VuDev *vu_dev, const char *buf)
 void vhost_user_server_inc_in_flight(VuServer *server)
 {
     assert(!server->wait_idle);
-    server->in_flight++;
+    qatomic_inc(&server->in_flight);
 }
 
 void vhost_user_server_dec_in_flight(VuServer *server)
 {
-    server->in_flight--;
-    if (server->wait_idle && !server->in_flight) {
-        aio_co_wake(server->co_trip);
+    if (qatomic_fetch_dec(&server->in_flight) == 1) {
+        if (server->wait_idle) {
+            aio_co_wake(server->co_trip);
+        }
     }
 }
 
+bool vhost_user_server_has_in_flight(VuServer *server)
+{
+    return qatomic_load_acquire(&server->in_flight) > 0;
+}
+
 static bool coroutine_fn
 vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
 {
-- 
2.39.2




* [PATCH v2 07/16] block/export: stop using is_external in vhost-user-blk server
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 06/16] block/export: wait for vhost-user-blk requests when draining Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore Stefan Hajnoczi
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

vhost-user activity must be suspended during bdrv_drained_begin/end().
This prevents new requests from interfering with whatever is happening
in the drained section.

Previously this was done using aio_set_fd_handler()'s is_external
argument. In a multi-queue block layer world the aio_disable_external()
API cannot be used since multiple AioContexts may be processing I/O, not
just one.

Switch to BlockDevOps->drained_begin/end() callbacks.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/export/vhost-user-blk-server.c | 43 ++++++++++++++--------------
 util/vhost-user-server.c             | 10 +++----
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
index dbf5207162..6e1bc196fb 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -207,22 +207,6 @@ static const VuDevIface vu_blk_iface = {
     .process_msg           = vu_blk_process_msg,
 };
 
-static void blk_aio_attached(AioContext *ctx, void *opaque)
-{
-    VuBlkExport *vexp = opaque;
-
-    vexp->export.ctx = ctx;
-    vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
-}
-
-static void blk_aio_detach(void *opaque)
-{
-    VuBlkExport *vexp = opaque;
-
-    vhost_user_server_detach_aio_context(&vexp->vu_server);
-    vexp->export.ctx = NULL;
-}
-
 static void
 vu_blk_initialize_config(BlockDriverState *bs,
                          struct virtio_blk_config *config,
@@ -254,6 +238,25 @@ static void vu_blk_exp_request_shutdown(BlockExport *exp)
     vhost_user_server_stop(&vexp->vu_server);
 }
 
+/* Called with vexp->export.ctx acquired */
+static void vu_blk_drained_begin(void *opaque)
+{
+    VuBlkExport *vexp = opaque;
+
+    vhost_user_server_detach_aio_context(&vexp->vu_server);
+}
+
+/* Called with vexp->export.blk AioContext acquired */
+static void vu_blk_drained_end(void *opaque)
+{
+    VuBlkExport *vexp = opaque;
+
+    /* Refresh AioContext in case it changed */
+    vexp->export.ctx = blk_get_aio_context(vexp->export.blk);
+
+    vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
+}
+
 /*
  * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
  *
@@ -267,6 +270,8 @@ static bool vu_blk_drained_poll(void *opaque)
 }
 
 static const BlockDevOps vu_blk_dev_ops = {
+    .drained_begin = vu_blk_drained_begin,
+    .drained_end   = vu_blk_drained_end,
     .drained_poll  = vu_blk_drained_poll,
 };
 
@@ -309,13 +314,9 @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
                              logical_block_size, num_queues);
 
     blk_set_dev_ops(exp->blk, &vu_blk_dev_ops, vexp);
-    blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
-                                 vexp);
 
     if (!vhost_user_server_start(&vexp->vu_server, vu_opts->addr, exp->ctx,
                                  num_queues, &vu_blk_iface, errp)) {
-        blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
-                                        blk_aio_detach, vexp);
         blk_set_dev_ops(exp->blk, NULL, NULL);
         g_free(vexp->handler.serial);
         return -EADDRNOTAVAIL;
@@ -328,8 +329,6 @@ static void vu_blk_exp_delete(BlockExport *exp)
 {
     VuBlkExport *vexp = container_of(exp, VuBlkExport, export);
 
-    blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
-                                    vexp);
     blk_set_dev_ops(exp->blk, NULL, NULL);
     g_free(vexp->handler.serial);
 }
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 2e6b640050..332aea9306 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -278,7 +278,7 @@ set_watch(VuDev *vu_dev, int fd, int vu_evt,
         vu_fd_watch->fd = fd;
         vu_fd_watch->cb = cb;
         qemu_socket_set_nonblock(fd);
-        aio_set_fd_handler(server->ioc->ctx, fd, true, kick_handler,
+        aio_set_fd_handler(server->ioc->ctx, fd, false, kick_handler,
                            NULL, NULL, NULL, vu_fd_watch);
         vu_fd_watch->vu_dev = vu_dev;
         vu_fd_watch->pvt = pvt;
@@ -299,7 +299,7 @@ static void remove_watch(VuDev *vu_dev, int fd)
     if (!vu_fd_watch) {
         return;
     }
-    aio_set_fd_handler(server->ioc->ctx, fd, true,
+    aio_set_fd_handler(server->ioc->ctx, fd, false,
                        NULL, NULL, NULL, NULL, NULL);
 
     QTAILQ_REMOVE(&server->vu_fd_watches, vu_fd_watch, next);
@@ -362,7 +362,7 @@ void vhost_user_server_stop(VuServer *server)
         VuFdWatch *vu_fd_watch;
 
         QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
-            aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
+            aio_set_fd_handler(server->ctx, vu_fd_watch->fd, false,
                                NULL, NULL, NULL, NULL, vu_fd_watch);
         }
 
@@ -403,7 +403,7 @@ void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx)
     qio_channel_attach_aio_context(server->ioc, ctx);
 
     QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
-        aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL,
+        aio_set_fd_handler(ctx, vu_fd_watch->fd, false, kick_handler, NULL,
                            NULL, NULL, vu_fd_watch);
     }
 
@@ -417,7 +417,7 @@ void vhost_user_server_detach_aio_context(VuServer *server)
         VuFdWatch *vu_fd_watch;
 
         QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) {
-            aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true,
+            aio_set_fd_handler(server->ctx, vu_fd_watch->fd, false,
                                NULL, NULL, NULL, NULL, vu_fd_watch);
         }
 
-- 
2.39.2




* [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (6 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 07/16] block/export: stop using is_external in vhost-user-blk server Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-20  7:54   ` Paul Durrant
  2023-04-19 17:28 ` [PATCH v2 09/16] block: add blk_in_drain() API Stefan Hajnoczi
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse,
	David Woodhouse

There is no need to suspend this file descriptor's activity between
aio_disable_external() and aio_enable_external(); that mechanism is
mainly used for the block layer's drain operation.

This is part of ongoing work to remove the aio_disable_external() API.

Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/i386/kvm/xen_xenstore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 900679af8a..6e81bc8791 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -133,7 +133,7 @@ static void xen_xenstore_realize(DeviceState *dev, Error **errp)
         error_setg(errp, "Xenstore evtchn port init failed");
         return;
     }
-    aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), true,
+    aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), false,
                        xen_xenstore_event, NULL, NULL, NULL, s);
 
     s->impl = xs_impl_create(xen_domid);
-- 
2.39.2




* [PATCH v2 09/16] block: add blk_in_drain() API
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (7 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 10/16] block: drain from main loop thread in bdrv_co_yield_to_drain() Stefan Hajnoczi
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

The BlockBackend quiesce_counter is greater than zero during drained
sections. Add an API to check whether the BlockBackend is in a drained
section.

The next patch will use this API.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/sysemu/block-backend-global-state.h | 1 +
 block/block-backend.c                       | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 2b6d27db7c..ac7cbd6b5e 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -78,6 +78,7 @@ void blk_activate(BlockBackend *blk, Error **errp);
 int blk_make_zero(BlockBackend *blk, BdrvRequestFlags flags);
 void blk_aio_cancel(BlockAIOCB *acb);
 int blk_commit_all(void);
+bool blk_in_drain(BlockBackend *blk);
 void blk_drain(BlockBackend *blk);
 void blk_drain_all(void);
 void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
diff --git a/block/block-backend.c b/block/block-backend.c
index 9e0f48692a..e0a1d9ec0f 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1270,6 +1270,13 @@ blk_check_byte_request(BlockBackend *blk, int64_t offset, int64_t bytes)
     return 0;
 }
 
+/* Are we currently in a drained section? */
+bool blk_in_drain(BlockBackend *blk)
+{
+    GLOBAL_STATE_CODE(); /* change to IO_OR_GS_CODE(), if necessary */
+    return qatomic_read(&blk->quiesce_counter);
+}
+
 /* To be called between exactly one pair of blk_inc/dec_in_flight() */
 static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
 {
-- 
2.39.2




* [PATCH v2 10/16] block: drain from main loop thread in bdrv_co_yield_to_drain()
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (8 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 09/16] block: add blk_in_drain() API Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 11/16] xen-block: implement BlockDevOps->drained_begin() Stefan Hajnoczi
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

For simplicity, always run BlockDevOps .drained_begin/end/poll()
callbacks in the main loop thread. This makes it easier to implement the
callbacks and avoids extra locks.

Move the function pointer declarations from the I/O Code section to the
Global State section in block-backend-common.h.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/sysemu/block-backend-common.h | 25 +++++++++++++------------
 block/io.c                            |  3 ++-
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/include/sysemu/block-backend-common.h b/include/sysemu/block-backend-common.h
index 2391679c56..780cea7305 100644
--- a/include/sysemu/block-backend-common.h
+++ b/include/sysemu/block-backend-common.h
@@ -59,6 +59,19 @@ typedef struct BlockDevOps {
      */
     bool (*is_medium_locked)(void *opaque);
 
+    /*
+     * Runs when the backend receives a drain request.
+     */
+    void (*drained_begin)(void *opaque);
+    /*
+     * Runs when the backend's last drain request ends.
+     */
+    void (*drained_end)(void *opaque);
+    /*
+     * Is the device still busy?
+     */
+    bool (*drained_poll)(void *opaque);
+
     /*
      * I/O API functions. These functions are thread-safe.
      *
@@ -76,18 +89,6 @@ typedef struct BlockDevOps {
      * Runs when the size changed (e.g. monitor command block_resize)
      */
     void (*resize_cb)(void *opaque);
-    /*
-     * Runs when the backend receives a drain request.
-     */
-    void (*drained_begin)(void *opaque);
-    /*
-     * Runs when the backend's last drain request ends.
-     */
-    void (*drained_end)(void *opaque);
-    /*
-     * Is the device still busy?
-     */
-    bool (*drained_poll)(void *opaque);
 } BlockDevOps;
 
 /*
diff --git a/block/io.c b/block/io.c
index db438c7657..6285d67546 100644
--- a/block/io.c
+++ b/block/io.c
@@ -331,7 +331,8 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
     if (ctx != co_ctx) {
         aio_context_release(ctx);
     }
-    replay_bh_schedule_oneshot_event(ctx, bdrv_co_drain_bh_cb, &data);
+    replay_bh_schedule_oneshot_event(qemu_get_aio_context(),
+                                     bdrv_co_drain_bh_cb, &data);
 
     qemu_coroutine_yield();
     /* If we are resumed from some other event (such as an aio completion or a
-- 
2.39.2




* [PATCH v2 11/16] xen-block: implement BlockDevOps->drained_begin()
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (9 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 10/16] block: drain from main loop thread in bdrv_co_yield_to_drain() Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 12/16] hw/xen: do not set is_external=true on evtchn fds Stefan Hajnoczi
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

Detach event channels during drained sections to stop I/O submission
from the ring. xen-block is no longer reliant on aio_disable_external()
after this patch. This will allow us to remove the
aio_disable_external() API once all other code that relies on it is
converted.

Extend xen_device_set_event_channel_context() to allow ctx=NULL. The
event channel still exists but the event loop does not monitor the file
descriptor. Event channel processing can resume by calling
xen_device_set_event_channel_context() with a non-NULL ctx.

Factor out xen_device_set_event_channel_context() calls in
hw/block/dataplane/xen-block.c into attach/detach helper functions.
Incidentally, these don't require the AioContext lock because
aio_set_fd_handler() is thread-safe.

It's safer to register BlockDevOps after the dataplane instance has been
created. The BlockDevOps .drained_begin/end() callbacks depend on the
dataplane instance, so move the blk_set_dev_ops() call after
xen_block_dataplane_create().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/dataplane/xen-block.h |  2 ++
 hw/block/dataplane/xen-block.c | 42 +++++++++++++++++++++++++---------
 hw/block/xen-block.c           | 24 ++++++++++++++++---
 hw/xen/xen-bus.c               |  7 ++++--
 4 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/hw/block/dataplane/xen-block.h b/hw/block/dataplane/xen-block.h
index 76dcd51c3d..7b8e9df09f 100644
--- a/hw/block/dataplane/xen-block.h
+++ b/hw/block/dataplane/xen-block.h
@@ -26,5 +26,7 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
                                unsigned int protocol,
                                Error **errp);
 void xen_block_dataplane_stop(XenBlockDataPlane *dataplane);
+void xen_block_dataplane_attach(XenBlockDataPlane *dataplane);
+void xen_block_dataplane_detach(XenBlockDataPlane *dataplane);
 
 #endif /* HW_BLOCK_DATAPLANE_XEN_BLOCK_H */
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 734da42ea7..02e0fd6115 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -663,6 +663,30 @@ void xen_block_dataplane_destroy(XenBlockDataPlane *dataplane)
     g_free(dataplane);
 }
 
+void xen_block_dataplane_detach(XenBlockDataPlane *dataplane)
+{
+    if (!dataplane || !dataplane->event_channel) {
+        return;
+    }
+
+    /* Only reason for failure is a NULL channel */
+    xen_device_set_event_channel_context(dataplane->xendev,
+                                         dataplane->event_channel,
+                                         NULL, &error_abort);
+}
+
+void xen_block_dataplane_attach(XenBlockDataPlane *dataplane)
+{
+    if (!dataplane || !dataplane->event_channel) {
+        return;
+    }
+
+    /* Only reason for failure is a NULL channel */
+    xen_device_set_event_channel_context(dataplane->xendev,
+                                         dataplane->event_channel,
+                                         dataplane->ctx, &error_abort);
+}
+
 void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
 {
     XenDevice *xendev;
@@ -673,13 +697,11 @@ void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
 
     xendev = dataplane->xendev;
 
-    aio_context_acquire(dataplane->ctx);
-    if (dataplane->event_channel) {
-        /* Only reason for failure is a NULL channel */
-        xen_device_set_event_channel_context(xendev, dataplane->event_channel,
-                                             qemu_get_aio_context(),
-                                             &error_abort);
+    if (!blk_in_drain(dataplane->blk)) {
+        xen_block_dataplane_detach(dataplane);
     }
+
+    aio_context_acquire(dataplane->ctx);
     /* Xen doesn't have multiple users for nodes, so this can't fail */
     blk_set_aio_context(dataplane->blk, qemu_get_aio_context(), &error_abort);
     aio_context_release(dataplane->ctx);
@@ -818,11 +840,9 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
     blk_set_aio_context(dataplane->blk, dataplane->ctx, NULL);
     aio_context_release(old_context);
 
-    /* Only reason for failure is a NULL channel */
-    aio_context_acquire(dataplane->ctx);
-    xen_device_set_event_channel_context(xendev, dataplane->event_channel,
-                                         dataplane->ctx, &error_abort);
-    aio_context_release(dataplane->ctx);
+    if (!blk_in_drain(dataplane->blk)) {
+        xen_block_dataplane_attach(dataplane);
+    }
 
     return;
 
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index f5a744589d..f099914831 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -189,8 +189,26 @@ static void xen_block_resize_cb(void *opaque)
     xen_device_backend_printf(xendev, "state", "%u", state);
 }
 
+/* Suspend request handling */
+static void xen_block_drained_begin(void *opaque)
+{
+    XenBlockDevice *blockdev = opaque;
+
+    xen_block_dataplane_detach(blockdev->dataplane);
+}
+
+/* Resume request handling */
+static void xen_block_drained_end(void *opaque)
+{
+    XenBlockDevice *blockdev = opaque;
+
+    xen_block_dataplane_attach(blockdev->dataplane);
+}
+
 static const BlockDevOps xen_block_dev_ops = {
-    .resize_cb = xen_block_resize_cb,
+    .resize_cb     = xen_block_resize_cb,
+    .drained_begin = xen_block_drained_begin,
+    .drained_end   = xen_block_drained_end,
 };
 
 static void xen_block_realize(XenDevice *xendev, Error **errp)
@@ -242,8 +260,6 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
         return;
     }
 
-    blk_set_dev_ops(blk, &xen_block_dev_ops, blockdev);
-
     if (conf->discard_granularity == -1) {
         conf->discard_granularity = conf->physical_block_size;
     }
@@ -277,6 +293,8 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
     blockdev->dataplane =
         xen_block_dataplane_create(xendev, blk, conf->logical_block_size,
                                    blockdev->props.iothread);
+
+    blk_set_dev_ops(blk, &xen_block_dev_ops, blockdev);
 }
 
 static void xen_block_frontend_changed(XenDevice *xendev,
diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index c59850b1de..b8f408c9ed 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -846,8 +846,11 @@ void xen_device_set_event_channel_context(XenDevice *xendev,
                            NULL, NULL, NULL, NULL, NULL);
 
     channel->ctx = ctx;
-    aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh), true,
-                       xen_device_event, NULL, xen_device_poll, NULL, channel);
+    if (ctx) {
+        aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh),
+                           true, xen_device_event, NULL, xen_device_poll, NULL,
+                           channel);
+    }
 }
 
 XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
-- 
2.39.2




* [PATCH v2 12/16] hw/xen: do not set is_external=true on evtchn fds
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (10 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 11/16] xen-block: implement BlockDevOps->drained_begin() Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 13/16] block/export: rewrite vduse-blk drain code Stefan Hajnoczi
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

is_external=true suspends fd handlers between aio_disable_external() and
aio_enable_external(). The block layer's drain operation uses this
mechanism to prevent new I/O from sneaking in between
bdrv_drained_begin() and bdrv_drained_end().

The previous commit converted the xen-block device to use BlockDevOps
.drained_begin/end() callbacks. It no longer relies on is_external=true
so it is safe to pass is_external=false.

This is part of ongoing work to remove the aio_disable_external() API.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/xen/xen-bus.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index b8f408c9ed..bf256d4da2 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -842,14 +842,14 @@ void xen_device_set_event_channel_context(XenDevice *xendev,
     }
 
     if (channel->ctx)
-        aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh), true,
+        aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh), false,
                            NULL, NULL, NULL, NULL, NULL);
 
     channel->ctx = ctx;
     if (ctx) {
         aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh),
-                           true, xen_device_event, NULL, xen_device_poll, NULL,
-                           channel);
+                           false, xen_device_event, NULL, xen_device_poll,
+                           NULL, channel);
     }
 }
 
@@ -923,7 +923,7 @@ void xen_device_unbind_event_channel(XenDevice *xendev,
 
     QLIST_REMOVE(channel, list);
 
-    aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh), true,
+    aio_set_fd_handler(channel->ctx, qemu_xen_evtchn_fd(channel->xeh), false,
                        NULL, NULL, NULL, NULL, NULL);
 
     if (qemu_xen_evtchn_unbind(channel->xeh, channel->local_port) < 0) {
-- 
2.39.2




* [PATCH v2 13/16] block/export: rewrite vduse-blk drain code
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (11 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 12/16] hw/xen: do not set is_external=true on evtchn fds Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 14/16] block/export: don't require AioContext lock around blk_exp_ref/unref() Stefan Hajnoczi
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

vduse_blk_detach_ctx() waits for in-flight requests using
AIO_WAIT_WHILE(). This is not allowed according to a comment in
bdrv_set_aio_context_commit():

  /*
   * Take the old AioContex when detaching it from bs.
   * At this point, new_context lock is already acquired, and we are now
   * also taking old_context. This is safe as long as bdrv_detach_aio_context
   * does not call AIO_POLL_WHILE().
   */

Use this opportunity to rewrite the drain code in vduse-blk:

- Use the BlockExport refcount so that vduse_blk_exp_delete() is only
  called when there are no more requests in flight.

- Implement .drained_poll() so in-flight request coroutines are stopped
  by the time .bdrv_detach_aio_context() is called.

- Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the
  .bdrv_detach_aio_context() constraint violation. It's no longer
  needed due to the previous changes.

- Always handle the VDUSE file descriptor, even in drained sections. The
  VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in
  drained sections. This ensures that the VDUSE kernel code gets a fast
  response.

- Suspend virtqueue fd handlers in .drained_begin() and resume them in
  .drained_end(). This eliminates the need for the
  aio_set_fd_handler(is_external=true) flag, which is being removed from
  QEMU.

This is a long list but splitting it into individual commits would
probably lead to git bisect failures - the changes are all related.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/export/vduse-blk.c | 132 +++++++++++++++++++++++++++------------
 1 file changed, 93 insertions(+), 39 deletions(-)

diff --git a/block/export/vduse-blk.c b/block/export/vduse-blk.c
index f7ae44e3ce..35dc8fcf45 100644
--- a/block/export/vduse-blk.c
+++ b/block/export/vduse-blk.c
@@ -31,7 +31,8 @@ typedef struct VduseBlkExport {
     VduseDev *dev;
     uint16_t num_queues;
     char *recon_file;
-    unsigned int inflight;
+    unsigned int inflight; /* atomic */
+    bool vqs_started;
 } VduseBlkExport;
 
 typedef struct VduseBlkReq {
@@ -41,13 +42,24 @@ typedef struct VduseBlkReq {
 
 static void vduse_blk_inflight_inc(VduseBlkExport *vblk_exp)
 {
-    vblk_exp->inflight++;
+    if (qatomic_fetch_inc(&vblk_exp->inflight) == 0) {
+        /* Prevent export from being deleted */
+        aio_context_acquire(vblk_exp->export.ctx);
+        blk_exp_ref(&vblk_exp->export);
+        aio_context_release(vblk_exp->export.ctx);
+    }
 }
 
 static void vduse_blk_inflight_dec(VduseBlkExport *vblk_exp)
 {
-    if (--vblk_exp->inflight == 0) {
+    if (qatomic_fetch_dec(&vblk_exp->inflight) == 1) {
+        /* Wake AIO_WAIT_WHILE() */
         aio_wait_kick();
+
+        /* Now the export can be deleted */
+        aio_context_acquire(vblk_exp->export.ctx);
+        blk_exp_unref(&vblk_exp->export);
+        aio_context_release(vblk_exp->export.ctx);
     }
 }
 
@@ -124,8 +136,12 @@ static void vduse_blk_enable_queue(VduseDev *dev, VduseVirtq *vq)
 {
     VduseBlkExport *vblk_exp = vduse_dev_get_priv(dev);
 
+    if (!vblk_exp->vqs_started) {
+        return; /* vduse_blk_drained_end() will start vqs later */
+    }
+
     aio_set_fd_handler(vblk_exp->export.ctx, vduse_queue_get_fd(vq),
-                       true, on_vduse_vq_kick, NULL, NULL, NULL, vq);
+                       false, on_vduse_vq_kick, NULL, NULL, NULL, vq);
     /* Make sure we don't miss any kick after reconnecting */
     eventfd_write(vduse_queue_get_fd(vq), 1);
 }
@@ -133,9 +149,14 @@ static void vduse_blk_enable_queue(VduseDev *dev, VduseVirtq *vq)
 static void vduse_blk_disable_queue(VduseDev *dev, VduseVirtq *vq)
 {
     VduseBlkExport *vblk_exp = vduse_dev_get_priv(dev);
+    int fd = vduse_queue_get_fd(vq);
 
-    aio_set_fd_handler(vblk_exp->export.ctx, vduse_queue_get_fd(vq),
-                       true, NULL, NULL, NULL, NULL, NULL);
+    if (fd < 0) {
+        return;
+    }
+
+    aio_set_fd_handler(vblk_exp->export.ctx, fd, false,
+                       NULL, NULL, NULL, NULL, NULL);
 }
 
 static const VduseOps vduse_blk_ops = {
@@ -152,42 +173,19 @@ static void on_vduse_dev_kick(void *opaque)
 
 static void vduse_blk_attach_ctx(VduseBlkExport *vblk_exp, AioContext *ctx)
 {
-    int i;
-
     aio_set_fd_handler(vblk_exp->export.ctx, vduse_dev_get_fd(vblk_exp->dev),
-                       true, on_vduse_dev_kick, NULL, NULL, NULL,
+                       false, on_vduse_dev_kick, NULL, NULL, NULL,
                        vblk_exp->dev);
 
-    for (i = 0; i < vblk_exp->num_queues; i++) {
-        VduseVirtq *vq = vduse_dev_get_queue(vblk_exp->dev, i);
-        int fd = vduse_queue_get_fd(vq);
-
-        if (fd < 0) {
-            continue;
-        }
-        aio_set_fd_handler(vblk_exp->export.ctx, fd, true,
-                           on_vduse_vq_kick, NULL, NULL, NULL, vq);
-    }
+    /* Virtqueues are handled by vduse_blk_drained_end() */
 }
 
 static void vduse_blk_detach_ctx(VduseBlkExport *vblk_exp)
 {
-    int i;
-
-    for (i = 0; i < vblk_exp->num_queues; i++) {
-        VduseVirtq *vq = vduse_dev_get_queue(vblk_exp->dev, i);
-        int fd = vduse_queue_get_fd(vq);
-
-        if (fd < 0) {
-            continue;
-        }
-        aio_set_fd_handler(vblk_exp->export.ctx, fd,
-                           true, NULL, NULL, NULL, NULL, NULL);
-    }
     aio_set_fd_handler(vblk_exp->export.ctx, vduse_dev_get_fd(vblk_exp->dev),
-                       true, NULL, NULL, NULL, NULL, NULL);
+                       false, NULL, NULL, NULL, NULL, NULL);
 
-    AIO_WAIT_WHILE(vblk_exp->export.ctx, vblk_exp->inflight > 0);
+    /* Virtqueues are handled by vduse_blk_drained_begin() */
 }
 
 
@@ -220,8 +218,55 @@ static void vduse_blk_resize(void *opaque)
                             (char *)&config.capacity);
 }
 
+static void vduse_blk_stop_virtqueues(VduseBlkExport *vblk_exp)
+{
+    for (uint16_t i = 0; i < vblk_exp->num_queues; i++) {
+        VduseVirtq *vq = vduse_dev_get_queue(vblk_exp->dev, i);
+        vduse_blk_disable_queue(vblk_exp->dev, vq);
+    }
+
+    vblk_exp->vqs_started = false;
+}
+
+static void vduse_blk_start_virtqueues(VduseBlkExport *vblk_exp)
+{
+    vblk_exp->vqs_started = true;
+
+    for (uint16_t i = 0; i < vblk_exp->num_queues; i++) {
+        VduseVirtq *vq = vduse_dev_get_queue(vblk_exp->dev, i);
+        vduse_blk_enable_queue(vblk_exp->dev, vq);
+    }
+}
+
+static void vduse_blk_drained_begin(void *opaque)
+{
+    BlockExport *exp = opaque;
+    VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
+
+    vduse_blk_stop_virtqueues(vblk_exp);
+}
+
+static void vduse_blk_drained_end(void *opaque)
+{
+    BlockExport *exp = opaque;
+    VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
+
+    vduse_blk_start_virtqueues(vblk_exp);
+}
+
+static bool vduse_blk_drained_poll(void *opaque)
+{
+    BlockExport *exp = opaque;
+    VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
+
+    return qatomic_read(&vblk_exp->inflight) > 0;
+}
+
 static const BlockDevOps vduse_block_ops = {
-    .resize_cb = vduse_blk_resize,
+    .resize_cb     = vduse_blk_resize,
+    .drained_begin = vduse_blk_drained_begin,
+    .drained_end   = vduse_blk_drained_end,
+    .drained_poll  = vduse_blk_drained_poll,
 };
 
 static int vduse_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
@@ -268,6 +313,7 @@ static int vduse_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
     vblk_exp->handler.serial = g_strdup(vblk_opts->serial ?: "");
     vblk_exp->handler.logical_block_size = logical_block_size;
     vblk_exp->handler.writable = opts->writable;
+    vblk_exp->vqs_started = true;
 
     config.capacity =
             cpu_to_le64(blk_getlength(exp->blk) >> VIRTIO_BLK_SECTOR_BITS);
@@ -322,14 +368,20 @@ static int vduse_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
         vduse_dev_setup_queue(vblk_exp->dev, i, queue_size);
     }
 
-    aio_set_fd_handler(exp->ctx, vduse_dev_get_fd(vblk_exp->dev), true,
+    aio_set_fd_handler(exp->ctx, vduse_dev_get_fd(vblk_exp->dev), false,
                        on_vduse_dev_kick, NULL, NULL, NULL, vblk_exp->dev);
 
     blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
                                  vblk_exp);
-
     blk_set_dev_ops(exp->blk, &vduse_block_ops, exp);
 
+    /*
+     * We handle draining ourselves using an in-flight counter and by disabling
+     * virtqueue fd handlers. Do not queue BlockBackend requests, they need to
+     * complete so the in-flight counter reaches zero.
+     */
+    blk_set_disable_request_queuing(exp->blk, true);
+
     return 0;
 err:
     vduse_dev_destroy(vblk_exp->dev);
@@ -344,6 +396,9 @@ static void vduse_blk_exp_delete(BlockExport *exp)
     VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
     int ret;
 
+    assert(qatomic_read(&vblk_exp->inflight) == 0);
+
+    vduse_blk_detach_ctx(vblk_exp);
     blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
                                     vblk_exp);
     blk_set_dev_ops(exp->blk, NULL, NULL);
@@ -355,13 +410,12 @@ static void vduse_blk_exp_delete(BlockExport *exp)
     g_free(vblk_exp->handler.serial);
 }
 
+/* Called with exp->ctx acquired */
 static void vduse_blk_exp_request_shutdown(BlockExport *exp)
 {
     VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
 
-    aio_context_acquire(vblk_exp->export.ctx);
-    vduse_blk_detach_ctx(vblk_exp);
-    aio_context_acquire(vblk_exp->export.ctx);
+    vduse_blk_stop_virtqueues(vblk_exp);
 }
 
 const BlockExportDriver blk_exp_vduse_blk = {
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 14/16] block/export: don't require AioContext lock around blk_exp_ref/unref()
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (12 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 13/16] block/export: rewrite vduse-blk drain code Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 15/16] block/fuse: do not set is_external=true on FUSE fd Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread Stefan Hajnoczi
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

The FUSE export calls blk_exp_ref/unref() without the AioContext lock.
Instead of fixing the FUSE export, adjust blk_exp_ref/unref() so they
work without the AioContext lock. This way it's less error-prone.
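The lock-free pattern adopted below can be sketched in isolation with C11
atomics; the Export type and function names here are hypothetical stand-ins
for BlockExport and QEMU's qatomic_* helpers:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for BlockExport, with QEMU's qatomic_* helpers
 * replaced by C11 atomics. */
typedef struct {
    atomic_int refcount;
    bool delete_scheduled;
} Export;

static void exp_ref(Export *exp)
{
    /* Callers already hold a reference, so no lock is needed. */
    assert(atomic_load(&exp->refcount) > 0);
    atomic_fetch_add(&exp->refcount, 1);
}

static void exp_unref(Export *exp)
{
    assert(atomic_load(&exp->refcount) > 0);
    if (atomic_fetch_sub(&exp->refcount, 1) == 1) {
        /* Exactly one thread observes the 1 -> 0 transition; it defers
         * the actual deletion (a BH in the real code). */
        exp->delete_scheduled = true;
    }
}
```

Because atomic_fetch_sub() returns the previous value, exactly one caller
sees 1 and schedules deletion, which is why no AioContext lock is required.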

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/export.h   |  2 ++
 block/export/export.c    | 13 ++++++-------
 block/export/vduse-blk.c |  4 ----
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/include/block/export.h b/include/block/export.h
index 7feb02e10d..f2fe0f8078 100644
--- a/include/block/export.h
+++ b/include/block/export.h
@@ -57,6 +57,8 @@ struct BlockExport {
      * Reference count for this block export. This includes strong references
      * both from the owner (qemu-nbd or the monitor) and clients connected to
      * the export.
+     *
+     * Use atomics to access this field.
      */
     int refcount;
 
diff --git a/block/export/export.c b/block/export/export.c
index e3fee60611..edb05c9268 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -201,11 +201,10 @@ fail:
     return NULL;
 }
 
-/* Callers must hold exp->ctx lock */
 void blk_exp_ref(BlockExport *exp)
 {
-    assert(exp->refcount > 0);
-    exp->refcount++;
+    assert(qatomic_read(&exp->refcount) > 0);
+    qatomic_inc(&exp->refcount);
 }
 
 /* Runs in the main thread */
@@ -227,11 +226,10 @@ static void blk_exp_delete_bh(void *opaque)
     aio_context_release(aio_context);
 }
 
-/* Callers must hold exp->ctx lock */
 void blk_exp_unref(BlockExport *exp)
 {
-    assert(exp->refcount > 0);
-    if (--exp->refcount == 0) {
+    assert(qatomic_read(&exp->refcount) > 0);
+    if (qatomic_fetch_dec(&exp->refcount) == 1) {
         /* Touch the block_exports list only in the main thread */
         aio_bh_schedule_oneshot(qemu_get_aio_context(), blk_exp_delete_bh,
                                 exp);
@@ -339,7 +337,8 @@ void qmp_block_export_del(const char *id,
     if (!has_mode) {
         mode = BLOCK_EXPORT_REMOVE_MODE_SAFE;
     }
-    if (mode == BLOCK_EXPORT_REMOVE_MODE_SAFE && exp->refcount > 1) {
+    if (mode == BLOCK_EXPORT_REMOVE_MODE_SAFE &&
+        qatomic_read(&exp->refcount) > 1) {
         error_setg(errp, "export '%s' still in use", exp->id);
         error_append_hint(errp, "Use mode='hard' to force client "
                           "disconnect\n");
diff --git a/block/export/vduse-blk.c b/block/export/vduse-blk.c
index 35dc8fcf45..611430afda 100644
--- a/block/export/vduse-blk.c
+++ b/block/export/vduse-blk.c
@@ -44,9 +44,7 @@ static void vduse_blk_inflight_inc(VduseBlkExport *vblk_exp)
 {
     if (qatomic_fetch_inc(&vblk_exp->inflight) == 0) {
         /* Prevent export from being deleted */
-        aio_context_acquire(vblk_exp->export.ctx);
         blk_exp_ref(&vblk_exp->export);
-        aio_context_release(vblk_exp->export.ctx);
     }
 }
 
@@ -57,9 +55,7 @@ static void vduse_blk_inflight_dec(VduseBlkExport *vblk_exp)
         aio_wait_kick();
 
         /* Now the export can be deleted */
-        aio_context_acquire(vblk_exp->export.ctx);
         blk_exp_unref(&vblk_exp->export);
-        aio_context_release(vblk_exp->export.ctx);
     }
 }
 
-- 
2.39.2




* [PATCH v2 15/16] block/fuse: do not set is_external=true on FUSE fd
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (13 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 14/16] block/export: don't require AioContext lock around blk_exp_ref/unref() Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 17:28 ` [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread Stefan Hajnoczi
  15 siblings, 0 replies; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

This is part of ongoing work to remove the aio_disable_external() API.

Use BlockDevOps .drained_begin/end/poll() instead of
aio_set_fd_handler(is_external=true).

As a side effect, the FUSE export now follows AioContext changes like the
other export types.
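The drain bookkeeping introduced below can be sketched as a tiny model;
all names are hypothetical and a plain flag stands in for aio_wait_kick()
waking AIO_WAIT_WHILE():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical miniature of the FUSE export's drain state. */
typedef struct {
    atomic_uint in_flight;
    bool kicked; /* stands in for aio_wait_kick() */
} ExportSketch;

/* .drained_poll(): draining is complete once nothing is in flight */
static bool drained_poll(ExportSketch *exp)
{
    return atomic_load(&exp->in_flight) > 0;
}

static void request_start(ExportSketch *exp)
{
    atomic_fetch_add(&exp->in_flight, 1);
}

static void request_complete(ExportSketch *exp)
{
    /* The last completing request wakes anyone waiting for the drain */
    if (atomic_fetch_sub(&exp->in_flight, 1) == 1) {
        exp->kicked = true;
    }
}
```

This is also why the patch disables BlockBackend request queuing: queued
requests would never complete, so in_flight could never reach zero and
drained_poll() would spin forever.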

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/export/fuse.c | 58 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index 06fa41079e..65a7f4d723 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -50,6 +50,7 @@ typedef struct FuseExport {
 
     struct fuse_session *fuse_session;
     struct fuse_buf fuse_buf;
+    unsigned int in_flight; /* atomic */
     bool mounted, fd_handler_set_up;
 
     char *mountpoint;
@@ -78,6 +79,42 @@ static void read_from_fuse_export(void *opaque);
 static bool is_regular_file(const char *path, Error **errp);
 
 
+static void fuse_export_drained_begin(void *opaque)
+{
+    FuseExport *exp = opaque;
+
+    aio_set_fd_handler(exp->common.ctx,
+                       fuse_session_fd(exp->fuse_session), false,
+                       NULL, NULL, NULL, NULL, NULL);
+    exp->fd_handler_set_up = false;
+}
+
+static void fuse_export_drained_end(void *opaque)
+{
+    FuseExport *exp = opaque;
+
+    /* Refresh AioContext in case it changed */
+    exp->common.ctx = blk_get_aio_context(exp->common.blk);
+
+    aio_set_fd_handler(exp->common.ctx,
+                       fuse_session_fd(exp->fuse_session), false,
+                       read_from_fuse_export, NULL, NULL, NULL, exp);
+    exp->fd_handler_set_up = true;
+}
+
+static bool fuse_export_drained_poll(void *opaque)
+{
+    FuseExport *exp = opaque;
+
+    return qatomic_read(&exp->in_flight) > 0;
+}
+
+static const BlockDevOps fuse_export_blk_dev_ops = {
+    .drained_begin = fuse_export_drained_begin,
+    .drained_end   = fuse_export_drained_end,
+    .drained_poll  = fuse_export_drained_poll,
+};
+
 static int fuse_export_create(BlockExport *blk_exp,
                               BlockExportOptions *blk_exp_args,
                               Error **errp)
@@ -101,6 +138,15 @@ static int fuse_export_create(BlockExport *blk_exp,
         }
     }
 
+    blk_set_dev_ops(exp->common.blk, &fuse_export_blk_dev_ops, exp);
+
+    /*
+     * We handle draining ourselves using an in-flight counter and by disabling
+     * the FUSE fd handler. Do not queue BlockBackend requests, they need to
+     * complete so the in-flight counter reaches zero.
+     */
+    blk_set_disable_request_queuing(exp->common.blk, true);
+
     init_exports_table();
 
     /*
@@ -224,7 +270,7 @@ static int setup_fuse_export(FuseExport *exp, const char *mountpoint,
     g_hash_table_insert(exports, g_strdup(mountpoint), NULL);
 
     aio_set_fd_handler(exp->common.ctx,
-                       fuse_session_fd(exp->fuse_session), true,
+                       fuse_session_fd(exp->fuse_session), false,
                        read_from_fuse_export, NULL, NULL, NULL, exp);
     exp->fd_handler_set_up = true;
 
@@ -246,6 +292,8 @@ static void read_from_fuse_export(void *opaque)
 
     blk_exp_ref(&exp->common);
 
+    qatomic_inc(&exp->in_flight);
+
     do {
         ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf);
     } while (ret == -EINTR);
@@ -256,6 +304,10 @@ static void read_from_fuse_export(void *opaque)
     fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf);
 
 out:
+    if (qatomic_fetch_dec(&exp->in_flight) == 1) {
+        aio_wait_kick(); /* wake AIO_WAIT_WHILE() */
+    }
+
     blk_exp_unref(&exp->common);
 }
 
@@ -268,7 +320,7 @@ static void fuse_export_shutdown(BlockExport *blk_exp)
 
         if (exp->fd_handler_set_up) {
             aio_set_fd_handler(exp->common.ctx,
-                               fuse_session_fd(exp->fuse_session), true,
+                               fuse_session_fd(exp->fuse_session), false,
                                NULL, NULL, NULL, NULL, NULL);
             exp->fd_handler_set_up = false;
         }
@@ -287,6 +339,8 @@ static void fuse_export_delete(BlockExport *blk_exp)
 {
     FuseExport *exp = container_of(blk_exp, FuseExport, common);
 
+    blk_set_dev_ops(exp->common.blk, NULL, NULL);
+
     if (exp->fuse_session) {
         if (exp->mounted) {
             fuse_session_unmount(exp->fuse_session);
-- 
2.39.2




* [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread
  2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
                   ` (14 preceding siblings ...)
  2023-04-19 17:28 ` [PATCH v2 15/16] block/fuse: do not set is_external=true on FUSE fd Stefan Hajnoczi
@ 2023-04-19 17:28 ` Stefan Hajnoczi
  2023-04-19 18:51   ` Eric Blake
  15 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-19 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Stefan Hajnoczi,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

virtio_queue_aio_detach_host_notifier() does two things:
1. It removes the fd handler from the event loop.
2. It processes the virtqueue one last time.

The first step can be performed by any thread and without taking the
AioContext lock.

The second step may need the AioContext lock (depending on the device
implementation) and runs in the thread where request processing takes
place. virtio-blk and virtio-scsi therefore call
virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
that AioContext.

Scheduling a BH is undesirable for .drained_begin() functions. The next
patch will introduce a .drained_begin() function that needs to call
virtio_queue_aio_detach_host_notifier().

Move the virtqueue processing out to the callers of
virtio_queue_aio_detach_host_notifier() so that the function can be
called from any thread. This is in preparation for the next patch.
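The two-step split can be modeled in miniature, with a flag standing in for
the fd handler registration and a counter for unprocessed kicks (all names
hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a virtqueue's host notifier state. */
typedef struct {
    atomic_bool attached; /* fd handler registered in the event loop? */
    int pending;          /* kicks received but not yet processed */
    int processed;
} VqSketch;

/* Step 1: removing the fd handler only flips state, so it is safe from
 * any thread without the AioContext lock. */
static void detach_host_notifier(VqSketch *vq)
{
    atomic_store(&vq->attached, false);
}

/* Step 2: processing the virtqueue one last time must run in the thread
 * where request processing takes place (and may need the lock), so the
 * caller invokes it separately. */
static void notifier_read(VqSketch *vq)
{
    vq->processed += vq->pending;
    vq->pending = 0;
}
```

Keeping the two steps in separate functions is what lets a .drained_begin()
callback call the detach step directly instead of scheduling a BH.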

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/dataplane/virtio-blk.c | 2 ++
 hw/scsi/virtio-scsi-dataplane.c | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index b28d81737e..bd7cc6e76b 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -286,8 +286,10 @@ static void virtio_blk_data_plane_stop_bh(void *opaque)
 
     for (i = 0; i < s->conf->num_queues; i++) {
         VirtQueue *vq = virtio_get_queue(s->vdev, i);
+        EventNotifier *host_notifier = virtio_queue_get_host_notifier(vq);
 
         virtio_queue_aio_detach_host_notifier(vq, s->ctx);
+        virtio_queue_host_notifier_read(host_notifier);
     }
 }
 
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 20bb91766e..81643445ed 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -71,12 +71,21 @@ static void virtio_scsi_dataplane_stop_bh(void *opaque)
 {
     VirtIOSCSI *s = opaque;
     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
+    EventNotifier *host_notifier;
     int i;
 
     virtio_queue_aio_detach_host_notifier(vs->ctrl_vq, s->ctx);
+    host_notifier = virtio_queue_get_host_notifier(vs->ctrl_vq);
+    virtio_queue_host_notifier_read(host_notifier);
+
     virtio_queue_aio_detach_host_notifier(vs->event_vq, s->ctx);
+    host_notifier = virtio_queue_get_host_notifier(vs->event_vq);
+    virtio_queue_host_notifier_read(host_notifier);
+
     for (i = 0; i < vs->conf.num_queues; i++) {
         virtio_queue_aio_detach_host_notifier(vs->cmd_vqs[i], s->ctx);
+        host_notifier = virtio_queue_get_host_notifier(vs->cmd_vqs[i]);
+        virtio_queue_host_notifier_read(host_notifier);
     }
 }
 
-- 
2.39.2




* Re: [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread
  2023-04-19 17:28 ` [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread Stefan Hajnoczi
@ 2023-04-19 18:51   ` Eric Blake
  2023-04-20 11:29     ` Stefan Hajnoczi
  0 siblings, 1 reply; 23+ messages in thread
From: Eric Blake @ 2023-04-19 18:51 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Stefano Stabellini, Marcel Apfelbaum, Fam Zheng,
	Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

On Wed, Apr 19, 2023 at 01:28:17PM -0400, Stefan Hajnoczi wrote:
> virtio_queue_aio_detach_host_notifier() does two things:
> 1. It removes the fd handler from the event loop.
> 2. It processes the virtqueue one last time.
> 
> The first step can be peformed by any thread and without taking the
> AioContext lock.
> 
> The second step may need the AioContext lock (depending on the device
> implementation) and runs in the thread where request processing takes
> place. virtio-blk and virtio-scsi therefore call
> virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
> AioContext
> 
> Scheduling a BH is undesirable for .drained_begin() functions. The next
> patch will introduce a .drained_begin() function that needs to call
> virtio_queue_aio_detach_host_notifier().
> 
> Move the virtqueue processing out to the callers of
> virtio_queue_aio_detach_host_notifier() so that the function can be
> called from any thread. This is in preparation for the next patch.
>

This mentions a next patch, but is 16/16 in the series.  Am I missing
something?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore
  2023-04-19 17:28 ` [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore Stefan Hajnoczi
@ 2023-04-20  7:54   ` Paul Durrant
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2023-04-20  7:54 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Julia Suvorova,
	Hanna Reitz, Daniel P. Berrangé, Paolo Bonzini, Coiby Xu,
	Ronnie Sahlberg, Eduardo Habkost, Juan Quintela,
	Michael S. Tsirkin, Stefano Garzarella, Anthony Perard,
	Kevin Wolf, Richard W.M. Jones, Richard Henderson, xen-devel,
	qemu-block, Dr. David Alan Gilbert, Philippe Mathieu-Daudé,
	Peter Lieven, eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji,
	David Woodhouse, David Woodhouse

On 19/04/2023 18:28, Stefan Hajnoczi wrote:
> There is no need to suspend activity between aio_disable_external() and
> aio_enable_external(), which is mainly used for the block layer's drain
> operation.
> 
> This is part of ongoing work to remove the aio_disable_external() API.
> 
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   hw/i386/kvm/xen_xenstore.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>




* Re: [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper
  2023-04-19 17:28 ` [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper Stefan Hajnoczi
@ 2023-04-20  8:17   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 23+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-20  8:17 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Julia Suvorova,
	Hanna Reitz, Daniel P. Berrangé, Paolo Bonzini, Coiby Xu,
	Paul Durrant, Ronnie Sahlberg, Eduardo Habkost, Juan Quintela,
	Michael S. Tsirkin, Stefano Garzarella, Anthony Perard,
	Kevin Wolf, Richard W.M. Jones, Richard Henderson, xen-devel,
	qemu-block, Dr. David Alan Gilbert, Peter Lieven, eesposit,
	Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

On 19/4/23 19:28, Stefan Hajnoczi wrote:
> Add a helper function to check whether the device is realized without
> requiring the Big QEMU Lock. The next patch adds a second caller. The
> goal is to avoid spreading DeviceState field accesses throughout the
> code.
> 
> Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   include/hw/qdev-core.h | 17 ++++++++++++++---
>   hw/scsi/scsi-bus.c     |  3 +--
>   2 files changed, 15 insertions(+), 5 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Thank you!




* Re: [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter
  2023-04-19 17:28 ` [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter Stefan Hajnoczi
@ 2023-04-20  8:18   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 23+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-20  8:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Stefano Stabellini, Marcel Apfelbaum, Fam Zheng, Julia Suvorova,
	Hanna Reitz, Daniel P. Berrangé, Paolo Bonzini, Coiby Xu,
	Paul Durrant, Ronnie Sahlberg, Eduardo Habkost, Juan Quintela,
	Michael S. Tsirkin, Stefano Garzarella, Anthony Perard,
	Kevin Wolf, Richard W.M. Jones, Richard Henderson, xen-devel,
	qemu-block, Dr. David Alan Gilbert, Peter Lieven, eesposit,
	Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

On 19/4/23 19:28, Stefan Hajnoczi wrote:
> The VuServer object has a refcount field and ref/unref APIs. The name is
> confusing because it's actually an in-flight request counter instead of
> a refcount.
> 
> Normally a refcount destroys the object upon reaching zero. The VuServer
> counter is used to wake up the vhost-user coroutine when there are no
> more requests.
> 
> Avoid confusion by renaming refcount and ref/unref to in_flight and
> inc/dec.
> 
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   include/qemu/vhost-user-server.h     |  6 +++---
>   block/export/vhost-user-blk-server.c | 11 +++++++----
>   util/vhost-user-server.c             | 14 +++++++-------
>   3 files changed, 17 insertions(+), 14 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread
  2023-04-19 18:51   ` Eric Blake
@ 2023-04-20 11:29     ` Stefan Hajnoczi
  2023-04-20 12:20       ` Juan Quintela
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Hajnoczi @ 2023-04-20 11:29 UTC (permalink / raw)
  To: Eric Blake
  Cc: Stefan Hajnoczi, qemu-devel, Stefano Stabellini, Marcel Apfelbaum,
	Fam Zheng, Julia Suvorova, Hanna Reitz, Daniel P. Berrangé,
	Paolo Bonzini, Coiby Xu, Paul Durrant, Ronnie Sahlberg,
	Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

On Wed, 19 Apr 2023 at 14:52, Eric Blake <eblake@redhat.com> wrote:
>
> On Wed, Apr 19, 2023 at 01:28:17PM -0400, Stefan Hajnoczi wrote:
> > virtio_queue_aio_detach_host_notifier() does two things:
> > 1. It removes the fd handler from the event loop.
> > 2. It processes the virtqueue one last time.
> >
> > The first step can be peformed by any thread and without taking the
> > AioContext lock.
> >
> > The second step may need the AioContext lock (depending on the device
> > implementation) and runs in the thread where request processing takes
> > place. virtio-blk and virtio-scsi therefore call
> > virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
> > AioContext
> >
> > Scheduling a BH is undesirable for .drained_begin() functions. The next
> > patch will introduce a .drained_begin() function that needs to call
> > virtio_queue_aio_detach_host_notifier().
> >
> > Move the virtqueue processing out to the callers of
> > virtio_queue_aio_detach_host_notifier() so that the function can be
> > called from any thread. This is in preparation for the next patch.
> >
>
> This mentions a next patch, but is 16/16 in the series.  Am I missing
> something?

Good thing you caught this. The patch series was truncated because I
was in the middle of git rebase -i :(.

I will send a v3 with the remaining patches.

Thanks,
Stefan



* Re: [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread
  2023-04-20 11:29     ` Stefan Hajnoczi
@ 2023-04-20 12:20       ` Juan Quintela
  0 siblings, 0 replies; 23+ messages in thread
From: Juan Quintela @ 2023-04-20 12:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Eric Blake, Stefan Hajnoczi, qemu-devel, Stefano Stabellini,
	Marcel Apfelbaum, Fam Zheng, Julia Suvorova, Hanna Reitz,
	Daniel P. Berrangé, Paolo Bonzini, Coiby Xu, Paul Durrant,
	Ronnie Sahlberg, Eduardo Habkost, Michael S. Tsirkin,
	Stefano Garzarella, Anthony Perard, Kevin Wolf,
	Richard W.M. Jones, Richard Henderson, xen-devel, qemu-block,
	Dr. David Alan Gilbert, Philippe Mathieu-Daudé, Peter Lieven,
	eesposit, Aarushi Mehta, Stefan Weil, Xie Yongji, David Woodhouse

Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Wed, 19 Apr 2023 at 14:52, Eric Blake <eblake@redhat.com> wrote:
>>
>> On Wed, Apr 19, 2023 at 01:28:17PM -0400, Stefan Hajnoczi wrote:
>> > virtio_queue_aio_detach_host_notifier() does two things:
>> > 1. It removes the fd handler from the event loop.
>> > 2. It processes the virtqueue one last time.
>> >
>> > The first step can be peformed by any thread and without taking the
>> > AioContext lock.
>> >
>> > The second step may need the AioContext lock (depending on the device
>> > implementation) and runs in the thread where request processing takes
>> > place. virtio-blk and virtio-scsi therefore call
>> > virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
>> > AioContext
>> >
>> > Scheduling a BH is undesirable for .drained_begin() functions. The next
>> > patch will introduce a .drained_begin() function that needs to call
>> > virtio_queue_aio_detach_host_notifier().
>> >
>> > Move the virtqueue processing out to the callers of
>> > virtio_queue_aio_detach_host_notifier() so that the function can be
>> > called from any thread. This is in preparation for the next patch.
>> >
>>
>> This mentions a next patch, but is 16/16 in the series.  Am I missing
>> something?
>
> Good thing you caught this. The patch series was truncated because I
> was in the middle of git rebase -i :(.
>
> I will send a v3 with the remaining patches.

I saw that it was not migration/* stuff and thought that I was done O:-)




end of thread, other threads:[~2023-04-20 12:21 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
2023-04-19 17:28 [PATCH v2 00/16] block: remove aio_disable_external() API Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 01/16] hw/qdev: introduce qdev_is_realized() helper Stefan Hajnoczi
2023-04-20  8:17   ` Philippe Mathieu-Daudé
2023-04-19 17:28 ` [PATCH v2 02/16] virtio-scsi: avoid race between unplug and transport event Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 03/16] virtio-scsi: stop using aio_disable_external() during unplug Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 04/16] block/export: only acquire AioContext once for vhost_user_server_stop() Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 05/16] util/vhost-user-server: rename refcount to in_flight counter Stefan Hajnoczi
2023-04-20  8:18   ` Philippe Mathieu-Daudé
2023-04-19 17:28 ` [PATCH v2 06/16] block/export: wait for vhost-user-blk requests when draining Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 07/16] block/export: stop using is_external in vhost-user-blk server Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 08/16] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore Stefan Hajnoczi
2023-04-20  7:54   ` Paul Durrant
2023-04-19 17:28 ` [PATCH v2 09/16] block: add blk_in_drain() API Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 10/16] block: drain from main loop thread in bdrv_co_yield_to_drain() Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 11/16] xen-block: implement BlockDevOps->drained_begin() Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 12/16] hw/xen: do not set is_external=true on evtchn fds Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 13/16] block/export: rewrite vduse-blk drain code Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 14/16] block/export: don't require AioContext lock around blk_exp_ref/unref() Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 15/16] block/fuse: do not set is_external=true on FUSE fd Stefan Hajnoczi
2023-04-19 17:28 ` [PATCH v2 16/16] virtio: make it possible to detach host notifier from any thread Stefan Hajnoczi
2023-04-19 18:51   ` Eric Blake
2023-04-20 11:29     ` Stefan Hajnoczi
2023-04-20 12:20       ` Juan Quintela
