From: "Alex Bennée" <alex.bennee@linaro.org>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, slp@redhat.com, mst@redhat.com,
marcandre.lureau@redhat.com, mathieu.poirier@linaro.org,
viresh.kumar@linaro.org, sgarzare@redhat.com,
Raphael Norwitz <raphael.norwitz@nutanix.com>,
Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>,
"open list:Block layer core" <qemu-block@nongnu.org>
Subject: Re: [PATCH v4 4/5] hw/virtio: generalise CHR_EVENT_CLOSED handling
Date: Thu, 01 Dec 2022 17:05:06 +0000 [thread overview]
Message-ID: <87v8mvaw0c.fsf@linaro.org> (raw)
In-Reply-To: <Y4fJZhRgpAH4NVVP@fedora>
Stefan Hajnoczi <stefanha@redhat.com> writes:
> [[PGP Signed Part:Undecided]]
> On Wed, Nov 30, 2022 at 11:24:38AM +0000, Alex Bennée wrote:
>> ..and use for both virtio-user-blk and virtio-user-gpio. This avoids
>> the circular close by deferring shutdown due to disconnection until a
>> later point.
>
> I thought re-entrancy was already avoided by Patch 3?
No because vu_gpio_disconnect() gets called while we are in the process
of tearing things down for the vu_gpio_stop(). As it hasn't disconnected
yet we end up doing the teardown there:
#2 0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255
#3 0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
#4 0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
#5 0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
#6 0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
#7 0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
#8 0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
#9 0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
#10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
#11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
#12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
#13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
#14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
#15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
and when we finally return the disconnect event has wiped out vhost_user.
>
>> virtio-user-blk already had this mechanism in place so
>> generalise it as a vhost-user helper function and use for both blk and
>> gpio devices.
>>
>> While we are at it we also fix up vhost-user-gpio to re-establish the
>> event handler after close down so we can reconnect later.
>
> I don't understand how the BH is supposed to help. Normally BHs are used
> to defer processing while we're in a call stack of some depth. The
> callers still require the resource we want to free, so we scheduled a
> BH.
>
> That's not the case here. The CHR_EVENT_CLOSED callback is just another
> top-level event handler, not something that is called deep in a call
> stack.
Well it is in the shutdown case because as we write to the vhost_user
backend to shut things down it dies and we immediately process the
disconnect event.
It is racy though so some times you get away with it.
> This BH approach isn't safe against nested event loops (something that
> calls aio_poll() to wait), because now the BH can be invoked deep in a
> call stack.
Well I did ask if anyone had any better suggestions. In this case I'm
mostly just copying what vhost-user-blk did.
>
> That said, if Patch 3 isn't enough and this patch fixes a problem, then
> let's keep it for 7.2. It feels like a possibly incomplete solution and
> maybe something that should be solved differently in 8.0.
>
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
>> ---
>> include/hw/virtio/vhost-user.h | 18 +++++++++
>> hw/block/vhost-user-blk.c | 41 +++-----------------
>> hw/virtio/vhost-user-gpio.c | 11 +++++-
>> hw/virtio/vhost-user.c | 71 ++++++++++++++++++++++++++++++++++
>> 4 files changed, 104 insertions(+), 37 deletions(-)
>>
>> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
>> index c6e693cd3f..191216a74f 100644
>> --- a/include/hw/virtio/vhost-user.h
>> +++ b/include/hw/virtio/vhost-user.h
>> @@ -68,4 +68,22 @@ bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
>> */
>> void vhost_user_cleanup(VhostUserState *user);
>>
>> +/**
>> + * vhost_user_async_close() - cleanup vhost-user post connection drop
>> + * @d: DeviceState for the associated device (passed to callback)
>> + * @chardev: the CharBackend associated with the connection
>> + * @vhost: the common vhost device
>> + * @cb: the user callback function to complete the clean-up
>> + *
>> + * This function is used to handle the shutdown of a vhost-user
>> + * connection to a backend. We handle this centrally to make sure we
>> + * do all the steps and handle potential races due to VM shutdowns.
>> + * Once the connection is disabled we call a backhalf to ensure
>> + */
>> +typedef void (*vu_async_close_fn)(DeviceState *cb);
>> +
>> +void vhost_user_async_close(DeviceState *d,
>> + CharBackend *chardev, struct vhost_dev *vhost,
>> + vu_async_close_fn cb);
>> +
>> #endif
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 1177064631..aff4d2b8cb 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -369,17 +369,10 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>> vhost_user_blk_stop(vdev);
>>
>> vhost_dev_cleanup(&s->dev);
>> -}
>>
>> -static void vhost_user_blk_chr_closed_bh(void *opaque)
>> -{
>> - DeviceState *dev = opaque;
>> - VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> - VHostUserBlk *s = VHOST_USER_BLK(vdev);
>> -
>> - vhost_user_blk_disconnect(dev);
>> + /* Re-instate the event handler for new connections */
>> qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
>> - NULL, opaque, NULL, true);
>> + NULL, dev, NULL, true);
>> }
>>
>> static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>> @@ -398,33 +391,9 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>> }
>> break;
>> case CHR_EVENT_CLOSED:
>> - if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>> - /*
>> - * A close event may happen during a read/write, but vhost
>> - * code assumes the vhost_dev remains setup, so delay the
>> - * stop & clear.
>> - */
>> - AioContext *ctx = qemu_get_current_aio_context();
>> -
>> - qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
>> - NULL, NULL, false);
>> - aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>> -
>> - /*
>> - * Move vhost device to the stopped state. The vhost-user device
>> - * will be clean up and disconnected in BH. This can be useful in
>> - * the vhost migration code. If disconnect was caught there is an
>> - * option for the general vhost code to get the dev state without
>> - * knowing its type (in this case vhost-user).
>> - *
>> - * FIXME: this is sketchy to be reaching into vhost_dev
>> - * now because we are forcing something that implies we
>> - * have executed vhost_dev_stop() but that won't happen
>> - * until vhost_user_blk_stop() gets called from the bh.
>> - * Really this state check should be tracked locally.
>> - */
>> - s->dev.started = false;
>> - }
>> + /* defer close until later to avoid circular close */
>> + vhost_user_async_close(dev, &s->chardev, &s->dev,
>> + vhost_user_blk_disconnect);
>> break;
>> case CHR_EVENT_BREAK:
>> case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
>> index be9be08b4c..b7b82a1099 100644
>> --- a/hw/virtio/vhost-user-gpio.c
>> +++ b/hw/virtio/vhost-user-gpio.c
>> @@ -233,6 +233,8 @@ static int vu_gpio_connect(DeviceState *dev, Error **errp)
>> return 0;
>> }
>>
>> +static void vu_gpio_event(void *opaque, QEMUChrEvent event);
>> +
>> static void vu_gpio_disconnect(DeviceState *dev)
>> {
>> VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> @@ -245,6 +247,11 @@ static void vu_gpio_disconnect(DeviceState *dev)
>>
>> vu_gpio_stop(vdev);
>> vhost_dev_cleanup(&gpio->vhost_dev);
>> +
>> + /* Re-instate the event handler for new connections */
>> + qemu_chr_fe_set_handlers(&gpio->chardev,
>> + NULL, NULL, vu_gpio_event,
>> + NULL, dev, NULL, true);
>> }
>>
>> static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>> @@ -262,7 +269,9 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>> }
>> break;
>> case CHR_EVENT_CLOSED:
>> - vu_gpio_disconnect(dev);
>> + /* defer close until later to avoid circular close */
>> + vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
>> + vu_gpio_disconnect);
>> break;
>> case CHR_EVENT_BREAK:
>> case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index abe23d4ebe..8f635844af 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -21,6 +21,7 @@
>> #include "qemu/error-report.h"
>> #include "qemu/main-loop.h"
>> #include "qemu/sockets.h"
>> +#include "sysemu/runstate.h"
>> #include "sysemu/cryptodev.h"
>> #include "migration/migration.h"
>> #include "migration/postcopy-ram.h"
>> @@ -2670,6 +2671,76 @@ void vhost_user_cleanup(VhostUserState *user)
>> user->chr = NULL;
>> }
>>
>> +
>> +typedef struct {
>> + vu_async_close_fn cb;
>> + DeviceState *dev;
>> + CharBackend *cd;
>> + struct vhost_dev *vhost;
>> +} VhostAsyncCallback;
>> +
>> +static void vhost_user_async_close_bh(void *opaque)
>> +{
>> + VhostAsyncCallback *data = opaque;
>> + struct vhost_dev *vhost = data->vhost;
>> +
>> + /*
>> + * If the vhost_dev has been cleared in the meantime there is
>> + * nothing left to do as some other path has completed the
>> + * cleanup.
>> + */
>> + if (vhost->vdev) {
>> + data->cb(data->dev);
>> + }
>> +
>> + g_free(data);
>> +}
>> +
>> +/*
>> + * We only schedule the work if the machine is running. If suspended
>> + * we want to keep all the in-flight data as is for migration
>> + * purposes.
>> + */
>> +void vhost_user_async_close(DeviceState *d,
>> + CharBackend *chardev, struct vhost_dev *vhost,
>> + vu_async_close_fn cb)
>> +{
>> + if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>> + /*
>> + * A close event may happen during a read/write, but vhost
>> + * code assumes the vhost_dev remains setup, so delay the
>> + * stop & clear.
>> + */
>> + AioContext *ctx = qemu_get_current_aio_context();
>> + VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1);
>> +
>> + /* Save data for the callback */
>> + data->cb = cb;
>> + data->dev = d;
>> + data->cd = chardev;
>> + data->vhost = vhost;
>> +
>> + /* Disable any further notifications on the chardev */
>> + qemu_chr_fe_set_handlers(chardev,
>> + NULL, NULL, NULL, NULL, NULL, NULL,
>> + false);
>> +
>> + aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data);
>> +
>> + /*
>> + * Move vhost device to the stopped state. The vhost-user device
>> + * will be clean up and disconnected in BH. This can be useful in
>> + * the vhost migration code. If disconnect was caught there is an
>> + * option for the general vhost code to get the dev state without
>> + * knowing its type (in this case vhost-user).
>> + *
>> + * Note if the vhost device is fully cleared by the time we
>> + * execute the bottom half we won't continue with the cleanup.
>> + */
>> + vhost->started = false;
>> + }
>> +}
>> +
>> static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
>> {
>> if (!virtio_has_feature(dev->protocol_features,
>> --
>> 2.34.1
>>
>
> [[End of PGP Signed Part]]
--
Alex Bennée
next prev parent reply other threads:[~2022-12-01 17:13 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-30 11:24 [PATCH for 7.2 v4 0/5] final vhost-user fixes Alex Bennée
2022-11-30 11:24 ` [PATCH v4 1/5] tests/qtests: override "force-legacy" for gpio virtio-mmio tests Alex Bennée
2022-11-30 12:15 ` Thomas Huth
2022-11-30 11:24 ` [Virtio-fs] [PATCH v4 2/5] vhost: enable vrings in vhost_dev_start() for vhost-user devices Alex Bennée
2022-11-30 11:24 ` Alex Bennée
2022-11-30 11:24 ` [PATCH v4 3/5] hw/virtio: add started_vu status field to vhost-user-gpio Alex Bennée
2022-11-30 11:24 ` [PATCH v4 4/5] hw/virtio: generalise CHR_EVENT_CLOSED handling Alex Bennée
2022-11-30 21:21 ` Stefan Hajnoczi
2022-12-01 17:05 ` Alex Bennée [this message]
2022-12-01 6:56 ` Michael S. Tsirkin
2022-11-30 11:24 ` [PATCH v4 5/5] include/hw: VM state takes precedence in virtio_device_should_start Alex Bennée
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v8mvaw0c.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=hreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=marcandre.lureau@redhat.com \
--cc=mathieu.poirier@linaro.org \
--cc=mst@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=raphael.norwitz@nutanix.com \
--cc=sgarzare@redhat.com \
--cc=slp@redhat.com \
--cc=stefanha@redhat.com \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.