All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Fiona Ebner <f.ebner@proxmox.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Fam Zheng <fam@euphon.net>
Subject: Re: [PATCH 0/2] virtio: Keep notifications disabled during drain
Date: Thu, 25 Jan 2024 13:05:00 -0500	[thread overview]
Message-ID: <20240125180500.GB36016@fedora> (raw)
In-Reply-To: <20240124173834.66320-1-hreitz@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4299 bytes --]

On Wed, Jan 24, 2024 at 06:38:28PM +0100, Hanna Czenczek wrote:
> Hi,
> 
> When registering callbacks via aio_set_event_notifier_poll(), the
> io_poll_end() callback is only invoked when polling actually ends.  If
> the notifiers are removed while in a polling section, it is not called.
> Therefore, io_poll_start() is not necessarily followed up by
> io_poll_end().
> 
> It is not entirely clear whether this is good or bad behavior.  On one
> hand, it may be unexpected to callers.  On the other, it may be
> counterproductive to call io_poll_end() when the polling section has not
> ended yet.
> 
> Right now, there is only one user of aio_set_event_notifier(), which is
> virtio_queue_aio_attach_host_notifier().  It does not expect this
> behavior, which leads to virtqueue notifiers remaining disabled if
> virtio_queue_aio_detach_host_notifier() is called while polling.  That
> can happen e.g. through virtio_scsi_drained_begin() or
> virtio_blk_drained_begin() (through virtio_blk_data_plane_detach()).
> In such a case, the virtqueue may not be processed for a while, letting
> the guest driver hang.  This can be reproduced by repeatedly
> hot-plugging and -unplugging a virtio-scsi device with a scsi-hd disk,
> because the guest will try to enumerate the virtio-scsi device while
> we’re attaching the scsi-hd disk, which causes a drain, which can cause
> the virtio-scsi virtqueue to stall as described.
> 
> Stefan has suggested ensuring we always follow up io_poll_start() by
> io_poll_end():
> 
> https://lists.nongnu.org/archive/html/qemu-block/2023-12/msg00163.html
> 
> I prefer changing the caller instead, because I don’t think we actually
> want the virtqueue notifier to be force-enabled when removing our AIO
> notifiers.  So I believe we actually only want to take care to
> force-enable it when we re-attach the AIO notifiers, and to kick
> virtqueue processing once, in case we missed any events while the AIO
> notifiers were not attached.
> 
> That is done by patch 2.  We have already discussed a prior version of
> it here:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00001.html
> 
> And compared to that, based on the discussion, there are some changes:
> 1. Used virtio_queue_notify() instead of virtio_queue_notify_vq(), as
>    suggested by Paolo, because it’s thread-safe
> 2. Moved virtio_queue_notify() into
>    virtio_queue_aio_attach_host_notifier*(), because we always want it
> 3. Dropped virtio_queue_set_notification(vq, 0) from
>    virtio_queue_aio_detach_host_notifier(): Paolo wasn’t sure whether
>    that was safe to do from any context.  We don’t really need to call
>    it anyway, so I just dropped it.
> 4. Added patch 1:
> 
> Patch 1 fixes virtio_scsi_drained_end() so it won’t attach polling
> notifiers for the event virtqueue.  That didn’t turn out to be an issue
> so far, but with patch 2, Fiona saw the virtqueue processing queue
> spinning in a loop as described in
> 38738f7dbbda90fbc161757b7f4be35b52205552 ("virtio-scsi: don't waste CPU
> polling the event virtqueue").
> 
> 
> Note that as of eaad0fe26050c227dc5dad63205835bac4912a51 ("scsi: only
> access SCSIDevice->requests from one thread") there’s a different
> problem when trying to reproduce the bug via hot-plugging and
> -unplugging a virtio-scsi device, specifically, when unplugging, qemu
> may crash with an assertion failure[1].  I don’t have a full fix for
> that yet, but in case you need a work-around for the specific case of
> virtio-scsi hot-plugging and -unplugging, you can use this patch:
> 
> https://czenczek.de/0001-DONTMERGE-Fix-crash-on-scsi-unplug.patch
> 
> 
> [1] https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00317.html
> 
> 
> Hanna Czenczek (2):
>   virtio-scsi: Attach event vq notifier with no_poll
>   virtio: Keep notifications disabled during drain
> 
>  include/block/aio.h   |  7 ++++++-
>  hw/scsi/virtio-scsi.c |  7 ++++++-
>  hw/virtio/virtio.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 2 deletions(-)

This patch series also fixes RHEL-7356.

Buglink: https://issues.redhat.com/browse/RHEL-7356.
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

      parent reply	other threads:[~2024-01-25 18:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24 17:38 [PATCH 0/2] virtio: Keep notifications disabled during drain Hanna Czenczek
2024-01-24 17:38 ` [PATCH 1/2] virtio-scsi: Attach event vq notifier with no_poll Hanna Czenczek
2024-01-24 22:00   ` Stefan Hajnoczi
2024-01-25  9:43   ` Fiona Ebner
2024-01-24 17:38 ` [PATCH 2/2] virtio: Keep notifications disabled during drain Hanna Czenczek
2024-01-25  9:43   ` Fiona Ebner
2024-01-25 18:03   ` Stefan Hajnoczi
2024-01-25 18:18     ` Hanna Czenczek
2024-01-25 18:32       ` Hanna Czenczek
2024-01-25 21:32         ` Stefan Hajnoczi
2024-01-25 18:05 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240125180500.GB36016@fedora \
    --to=stefanha@redhat.com \
    --cc=f.ebner@proxmox.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.