From: Kevin Wolf <kwolf@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, jcody@redhat.com,
Max Reitz <mreitz@redhat.com>,
pbonzini@redhat.com, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH RFC 0/9] block: Rewrite block drain begin/end
Date: Thu, 30 Nov 2017 11:31:35 +0100 [thread overview]
Message-ID: <20171130103135.GA4039@localhost.localdomain> (raw)
In-Reply-To: <20171130020359.GB16237@lemon>
Am 30.11.2017 um 03:03 hat Fam Zheng geschrieben:
> On Wed, 11/29 18:25, Kevin Wolf wrote:
> > Am 29.11.2017 um 15:49 hat Fam Zheng geschrieben:
> > > While we look at the fixes for 2.11, I briefly prototyped this series
> > > to see if it makes sense as a simplification of the drain API for
> > > 2.12.
> > >
> > > The idea is to let AioContext manage quiesce callbacks, then the block
> > > layer only needs to do the in-flight request waiting. This lets us get
> > > rid of the callback recursion (both up and down).
> >
> > So essentially you don't drain individual nodes any more, but whole
> > AioContexts. I have a feeeling that this would be a step in the wrong
> > direction.
> >
> > Not only would it completely bypass the path I/O requests take and
> > potentially drain a lot more than is actually necessary, but it also
> > requires that all nodes that are connected in a tree are in the same
> > AioContext.
>
> Yeah, good point. Initially I wanted to introduce a BlockGraph object
> which manages the per-graph draining, (i.e. where to register the
> drain callbacks), but I felt lazy and used AioContext.
>
> Will that make it better? BlockGraph would be a proper abstraction
> and will not limit the API to one AioContext per tree.
There is only a single graph, so this would mean going back to global
bdrv_drain_all() exclusively.
What you really mean is probably connected components in the graph, but
do we really want to manage merging and splitting object representing
connected components when a node is added or removed from the graph?
Especially when that graph change occurs in a drain callback?
You can also still easily introduce bugs where graph changes during a
drain end up in nodes not being drained, possibly drained twice, you
still access the next pointer of a deleted node or you accidentally
switch to draining a different component.
It's probably possible to get this right, but essentially you're just
switching from iterating a tree to iterating a list. You get roughly the
same set of problems that you have to consider as today, and getting it
right should be about the same difficulty.
> > And finally, I don't really think that the recursion is even a problem.
> > The problem is with graph changes made by callbacks that drain allows to
> > run. With your changes, it might be a bit easier to avoid that
> > bdrv_drain() itself gets into trouble due to graph changes, but this
> > doesn't solve the problem for any (possibly indirect) callers of
> > bdrv_drain().
>
> The recursion is the one big place that can be easily broken by graph changes,
> fixing this doesn't make the situation any worse. We could still fix the
> indirect callers by taking references or by introducing "ubiquitous coroutines".
But hiding a bug in 80% of the cases where it shows isn't enough.
I think the only real solution is to forbid graph changes until after
any critical operation has completed. I haven't tried it out in
practice, but I suppose we could use a CoMutex around them and take it
in bdrv_drained_begin/end() and all other places that can get into
trouble with graph changes.
Kevin
next prev parent reply other threads:[~2017-11-30 10:31 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-29 14:49 [Qemu-devel] [PATCH RFC 0/9] block: Rewrite block drain begin/end Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 1/9] block: Remove unused bdrv_requests_pending Fam Zheng
2017-12-01 12:35 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 2/9] aio: Add drain begin/end API to AioContext Fam Zheng
2017-11-30 16:07 ` Stefan Hajnoczi
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 3/9] blockjob: Implement AioContext drain ops Fam Zheng
2017-11-30 16:09 ` Stefan Hajnoczi
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 4/9] throttle: " Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 5/9] qed: " Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 6/9] block: Use aio_context_drained_begin in bdrv_set_aio_context Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 7/9] block: Switch to use AIO drained begin/end API Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 8/9] block: Drop old drained_{begin, end} callbacks Fam Zheng
2017-11-29 14:49 ` [Qemu-devel] [PATCH RFC 9/9] blockjob: Drop unused functions Fam Zheng
2017-11-29 17:25 ` [Qemu-devel] [PATCH RFC 0/9] block: Rewrite block drain begin/end Kevin Wolf
2017-11-30 2:03 ` Fam Zheng
2017-11-30 10:31 ` Kevin Wolf [this message]
2017-11-30 14:34 ` Fam Zheng
2017-11-30 15:10 ` Kevin Wolf
2017-11-30 16:04 ` Paolo Bonzini
2017-12-01 9:51 ` Fam Zheng
2017-12-01 12:24 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171130103135.GA4039@localhost.localdomain \
--to=kwolf@redhat.com \
--cc=famz@redhat.com \
--cc=jcody@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).