qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Greg Kurz <groug@kaod.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH v2] block: End quiescent sections when a BDS is deleted
Date: Tue, 27 Oct 2020 12:16:53 +0100	[thread overview]
Message-ID: <20201027111653.GC4628@merkur.fritz.box> (raw)
In-Reply-To: <160346526998.272601.9045392804399803158.stgit@bahia.lan>

Am 23.10.2020 um 17:01 hat Greg Kurz geschrieben:
> If a BDS gets deleted during blk_drain_all(), it might miss a
> call to bdrv_do_drained_end(). This means missing a call to
> aio_enable_external() and the AIO context remains disabled for
> ever. This can cause a device to become irresponsive and to
> disrupt the guest execution, ie. hang, loop forever or worse.
> 
> This scenario is quite easy to encounter with virtio-scsi
> on POWER when punching multiple blockdev-create QMP commands
> while the guest is booting and it is still running the SLOF
> firmware. This happens because SLOF disables/re-enables PCI
> devices multiple times via IO/MEM/MASTER bits of PCI_COMMAND
> register after the initial probe/feature negotiation, as it
> tends to work with a single device at a time at various stages
> like probing and running block/network bootloaders without
> doing a full reset in-between. This naturally generates many
> dataplane stops and starts, and thus many drain sections that
> can race with blockdev_create_run(). In the end, SLOF bails
> out.
> 
> It is somehow reproducible on x86 but it requires to generate
> articial dataplane start/stop activity with stop/cont QMP
> commands. In this case, seabios ends up looping for ever,
> waiting for the virtio-scsi device to send a response to
> a command it never received.
> 
> Add a helper that pairs all previously called bdrv_do_drained_begin()
> with a bdrv_do_drained_end() and call it from bdrv_close().
> While at it, update the "/bdrv-drain/graph-change/drain_all"
> test in test-bdrv-drain so that it can catch the issue.
> 
> BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1874441
> Signed-off-by: Greg Kurz <groug@kaod.org>

Thanks, applied to the block branch.

Kevin



  reply	other threads:[~2020-10-27 11:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-23 15:01 [PATCH v2] block: End quiescent sections when a BDS is deleted Greg Kurz
2020-10-27 11:16 ` Kevin Wolf [this message]
2020-10-27 13:54 ` Stefan Hajnoczi
2020-10-27 15:24   ` Greg Kurz
2020-10-27 15:41     ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201027111653.GC4628@merkur.fritz.box \
    --to=kwolf@redhat.com \
    --cc=groug@kaod.org \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).