qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 0/4] block: Keep track of parent quiescing
Date: Fri, 14 Jun 2019 17:25:16 +0200	[thread overview]
Message-ID: <20190614152516.GI6042@dhcp-200-226.str.redhat.com> (raw)
In-Reply-To: <20190605161118.14544-1-mreitz@redhat.com>

Am 05.06.2019 um 18:11 hat Max Reitz geschrieben:
> We currently do not keep track of how many times a child has quiesced
> its parent.  We just guess based on the child’s quiesce_counter.  That
> keeps biting us when we try to leave drained sections or detach children
> (see e.g. commit 5cb2737e925042e).
> 
> I think we need an explicit counter to keep track of how many times a
> parent has been quiesced (patch 1).  That just makes the code more
> resilient.
> 
> Actually, no, we don’t need a counter, we need a boolean.  See patch 2
> for the explanation.
> 
> Yes, it is a bit weird to introduce a counter first (patch 1) and then
> immediately make it a boolean (patch 2).  But I believe this to be the
> most logical change set.
> 
> (“Our current model relies on counting, so adding an explicit counter
> makes sense.  It then turns out that counting is probably not the best
> idea, so why not make it a boolean?”)

Trying to summarise an IRC discussion I just had with Max:

The real root problem isn't that the recursion in bdrv_do_drained_end()
doesn't correctly deal with graph changes, but that those graph changes
happen in the first place. The one basic guiding principle in my drain
rewrite was that during the recursion (both to children and parents), no
graph changes are allowed, which means that no aio_poll() calls are
allowed either.

Of course, while I think the principle is right and greatly simplifies
the code (or actually is the only thing that gives us any hope to get
things right), I messed up the implementation because
bdrv_drain_invoke() does use BDRV_POLL_WHILE() for ending a drained
section. This is wrong, and it could still cause crashes after this
series because a recursive call could remove a node that is currently
processed somewhere up the call chain.

The fix for the observed bugs should be to make drained_end completely
symmetric to drained_begin: Just start the bdrv_drain_invoke_entry()
coroutine, do the recursion and call all the callbacks (none of which
may modify the graph) and only after all of this is done, poll once at
the top level drain. (The poll condition could be simplified to just
wait for bdrv_drain_invoke() to be completed, we don't care about other
running requests in drained_end. But this is only an optimisation.)

Despite this being a full fix, we also agreed that patch 1 is a nice
cleanup and we want to keep it even if it doesn't strictly fix a bug any
more.

Kevin


      parent reply	other threads:[~2019-06-14 15:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-05 16:11 [Qemu-devel] [PATCH 0/4] block: Keep track of parent quiescing Max Reitz
2019-06-05 16:11 ` [Qemu-devel] [PATCH 1/4] block: Introduce BdrvChild.parent_quiesce_counter Max Reitz
2019-06-05 16:11 ` [Qemu-devel] [PATCH 2/4] block: Make @parent_quiesced a bool Max Reitz
2019-06-05 16:11 ` [Qemu-devel] [PATCH 3/4] iotests: Add @has_quit to vm.shutdown() Max Reitz
2019-06-05 16:11 ` [Qemu-devel] [PATCH 4/4] iotests: Test commit with a filter on the chain Max Reitz
2019-06-14 15:25 ` Kevin Wolf [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190614152516.GI6042@dhcp-200-226.str.redhat.com \
    --to=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).