qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Fiona Ebner <f.ebner@proxmox.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Fam Zheng <fam@euphon.net>
Subject: Re: [PATCH 0/2] virtio: Keep notifications disabled during drain
Date: Thu, 25 Jan 2024 13:05:00 -0500	[thread overview]
Message-ID: <20240125180500.GB36016@fedora> (raw)
In-Reply-To: <20240124173834.66320-1-hreitz@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4299 bytes --]

On Wed, Jan 24, 2024 at 06:38:28PM +0100, Hanna Czenczek wrote:
> Hi,
> 
> When registering callbacks via aio_set_event_notifier_poll(), the
> io_poll_end() callback is only invoked when polling actually ends.  If
> the notifiers are removed while in a polling section, it is not called.
> Therefore, io_poll_start() is not necessarily followed up by
> io_poll_end().
> 
> It is not entirely clear whether this is good or bad behavior.  On one
> hand, it may be unexpected to callers.  On the other, it may be
> counterproductive to call io_poll_end() when the polling section has not
> ended yet.
> 
> Right now, there is only one user of aio_set_event_notifier(), which is
> virtio_queue_aio_attach_host_notifier().  It does not expect this
> behavior, which leads to virtqueue notifiers remaining disabled if
> virtio_queue_aio_detach_host_notifier() is called while polling.  That
> can happen e.g. through virtio_scsi_drained_begin() or
> virtio_blk_drained_begin() (through virtio_blk_data_plane_detach()).
> In such a case, the virtqueue may not be processed for a while, letting
> the guest driver hang.  This can be reproduced by repeatedly
> hot-plugging and -unplugging a virtio-scsi device with a scsi-hd disk,
> because the guest will try to enumerate the virtio-scsi device while
> we’re attaching the scsi-hd disk, which causes a drain, which can cause
> the virtio-scsi virtqueue to stall as described.
> 
> Stefan has suggested ensuring we always follow up io_poll_start() by
> io_poll_end():
> 
> https://lists.nongnu.org/archive/html/qemu-block/2023-12/msg00163.html
> 
> I prefer changing the caller instead, because I don’t think we actually
> want the virtqueue notifier to be force-enabled when removing our AIO
> notifiers.  So I believe we actually only want to take care to
> force-enable it when we re-attach the AIO notifiers, and to kick
> virtqueue processing once, in case we missed any events while the AIO
> notifiers were not attached.
> 
> That is done by patch 2.  We have already discussed a prior version of
> it here:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00001.html
> 
> And compared to that, based on the discussion, there are some changes:
> 1. Used virtio_queue_notify() instead of virtio_queue_notify_vq(), as
>    suggested by Paolo, because it’s thread-safe
> 2. Moved virtio_queue_notify() into
>    virtio_queue_aio_attach_host_notifier*(), because we always want it
> 3. Dropped virtio_queue_set_notification(vq, 0) from
>    virtio_queue_aio_detach_host_notifier(): Paolo wasn’t sure whether
>    that was safe to do from any context.  We don’t really need to call
>    it anyway, so I just dropped it.
> 4. Added patch 1:
> 
> Patch 1 fixes virtio_scsi_drained_end() so it won’t attach polling
> notifiers for the event virtqueue.  That didn’t turn out to be an issue
> so far, but with patch 2, Fiona saw the virtqueue processing queue
> spinning in a loop as described in
> 38738f7dbbda90fbc161757b7f4be35b52205552 ("virtio-scsi: don't waste CPU
> polling the event virtqueue").
> 
> 
> Note that as of eaad0fe26050c227dc5dad63205835bac4912a51 ("scsi: only
> access SCSIDevice->requests from one thread") there’s a different
> problem when trying to reproduce the bug via hot-plugging and
> -unplugging a virtio-scsi device, specifically, when unplugging, qemu
> may crash with an assertion failure[1].  I don’t have a full fix for
> that yet, but in case you need a work-around for the specific case of
> virtio-scsi hot-plugging and -unplugging, you can use this patch:
> 
> https://czenczek.de/0001-DONTMERGE-Fix-crash-on-scsi-unplug.patch
> 
> 
> [1] https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00317.html
> 
> 
> Hanna Czenczek (2):
>   virtio-scsi: Attach event vq notifier with no_poll
>   virtio: Keep notifications disabled during drain
> 
>  include/block/aio.h   |  7 ++++++-
>  hw/scsi/virtio-scsi.c |  7 ++++++-
>  hw/virtio/virtio.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 2 deletions(-)

This patch series also fixes RHEL-7356.

Buglink: https://issues.redhat.com/browse/RHEL-7356.
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

      parent reply	other threads:[~2024-01-25 18:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24 17:38 [PATCH 0/2] virtio: Keep notifications disabled during drain Hanna Czenczek
2024-01-24 17:38 ` [PATCH 1/2] virtio-scsi: Attach event vq notifier with no_poll Hanna Czenczek
2024-01-24 22:00   ` Stefan Hajnoczi
2024-01-25  9:43   ` Fiona Ebner
2024-01-24 17:38 ` [PATCH 2/2] virtio: Keep notifications disabled during drain Hanna Czenczek
2024-01-25  9:43   ` Fiona Ebner
2024-01-25 18:03   ` Stefan Hajnoczi
2024-01-25 18:18     ` Hanna Czenczek
2024-01-25 18:32       ` Hanna Czenczek
2024-01-25 21:32         ` Stefan Hajnoczi
2024-01-25 18:05 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240125180500.GB36016@fedora \
    --to=stefanha@redhat.com \
    --cc=f.ebner@proxmox.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).