From: Max Reitz <mreitz@redhat.com>
To: qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-4.1 0/2] block: bdrv_drained_end() changes fallout
Date: Mon, 22 Jul 2019 19:18:20 +0200 [thread overview]
Message-ID: <5f89c6ba-b049-db87-23ec-0317c53e978f@redhat.com> (raw)
In-Reply-To: <20190722133054.21781-1-mreitz@redhat.com>
[-- Attachment #1.1.1: Type: text/plain, Size: 2830 bytes --]
On 22.07.19 15:30, Max Reitz wrote:
> Hi,
>
> I noted that test-bdrv-drain sometimgs hangs (very rarely, though), and
> tried to write a test that triggers the issue. I failed to do so (there
> is a good reason for that, see patch 1), but on my way I noticed that
> calling bdrv_set_aio_context_ignore() from any AioContext but the main
> one is a bad idea. Hence patch 2.
>
> Anyway, I found the problem, which is fixed by patch 1 -- I think it’s
> rather obvious. There is no dedicated test because I don’t think it’s
> possible to write one, as I explain there.
>
>
> Max Reitz (2):
> block: Dec. drained_end_counter before bdrv_wakeup
> block: Only the main loop can change AioContexts
>
> include/block/block.h | 8 +++-----
> block.c | 13 ++++++++-----
> block/io.c | 5 ++---
> 3 files changed, 13 insertions(+), 13 deletions(-)
Sorry, applied to my block branch.
“Sorry” obviously because I didn’t really give much time to review this
series. My justification is:
- rc2’s tomorrow. I know the current code is broken, so I’d rather take
the chance to have a fixed rc2 than to know it to be broken and force
an rc4 by sending this for rc3.
- Patch 1 looks really obvious and simple to me. It makes sense to
decrement the drained_end_counter exactly where .done is set to true.
- Patch 2 is not so obvious. But the only dangerous change it
introduces is an assertion that bdrv_set_aio_context_ignore() is
called from the main loop. Right now, if it is called from anywhere
but the main loop, we *will* run into assertions elsewhere:
- bdrv_drained_begin() does a BDRV_POLL_WHILE(bs, ...). This only
works from the main context or bs's context.
- bdrv_drained_end() does the same, but now bs's context has changed.
Ergo, right now (on master), you can run bdrv_set_aio_context_ignore()
only from the main loop anyway. The new assertion only makes that
fact more explicit.
Now this makes it look like before e037c09c78520, you could run
bdrv_set_aio_context_ignore() from the old context (like the comment
therein said). But that’s wrong. Even before e037c09c78520,
bdrv_drained_end() didn’t work for mixed-context trees unless you call
it from the main context: It schedules the drained_end callbacks in the
respective node's context, but then it polls them from the original
context. So if you modify e.g. test_set_aio_context() in
test-bdrv-drain to run bdrv_try_set_aio_context() from a BH in the old
context, you will see a failed assertion in bdrv_drain_invoke()'s
BDRV_POLL_WHILE. (I’ve attached a diff for use on e037c09c78520^.)
Therefore, I’m confident this series doesn’t make things worse, which is
why I’m taking it without reviews.
Max
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: test-bdrv-drain.diff --]
[-- Type: text/x-patch; name="test-bdrv-drain.diff", Size: 2000 bytes --]
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 3503ce3b69..cf1194754e 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -1497,6 +1497,19 @@ static void test_append_to_drained(void)
blk_unref(blk);
}
+typedef struct BHParams {
+ BlockDriverState *bs;
+ AioContext *target;
+ bool done;
+} BHParams;
+
+static void bh_fun(void *opaque)
+{
+ BHParams *bhp = opaque;
+ bdrv_try_set_aio_context(bhp->bs, bhp->target, &error_abort);
+ bhp->done = true;
+}
+
static void test_set_aio_context(void)
{
BlockDriverState *bs;
@@ -1504,22 +1517,38 @@ static void test_set_aio_context(void)
IOThread *b = iothread_new();
AioContext *ctx_a = iothread_get_aio_context(a);
AioContext *ctx_b = iothread_get_aio_context(b);
+ BHParams bhp;
bs = bdrv_new_open_driver(&bdrv_test, "test-node", BDRV_O_RDWR,
&error_abort);
bdrv_drained_begin(bs);
- bdrv_try_set_aio_context(bs, ctx_a, &error_abort);
+ bhp = (BHParams){ bs, ctx_a };
+ aio_bh_schedule_oneshot(qemu_get_aio_context(), bh_fun, &bhp);
+ while (!bhp.done) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
aio_context_acquire(ctx_a);
bdrv_drained_end(bs);
bdrv_drained_begin(bs);
- bdrv_try_set_aio_context(bs, ctx_b, &error_abort);
+
+ bhp = (BHParams){ bs, ctx_b };
+ aio_bh_schedule_oneshot(ctx_a, bh_fun, &bhp);
aio_context_release(ctx_a);
+ while (!bhp.done) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+
aio_context_acquire(ctx_b);
- bdrv_try_set_aio_context(bs, qemu_get_aio_context(), &error_abort);
+ bhp = (BHParams){ bs, qemu_get_aio_context() };
+ aio_bh_schedule_oneshot(ctx_b, bh_fun, &bhp);
aio_context_release(ctx_b);
+ while (!bhp.done) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+
bdrv_drained_end(bs);
bdrv_unref(bs);
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
prev parent reply other threads:[~2019-07-22 17:18 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-22 13:30 [Qemu-devel] [PATCH for-4.1 0/2] block: bdrv_drained_end() changes fallout Max Reitz
2019-07-22 13:30 ` [Qemu-devel] [PATCH for-4.1 1/2] block: Dec. drained_end_counter before bdrv_wakeup Max Reitz
2019-07-22 13:30 ` [Qemu-devel] [PATCH for-4.1 2/2] block: Only the main loop can change AioContexts Max Reitz
2019-07-23 8:52 ` Kevin Wolf
2019-07-23 9:41 ` Max Reitz
2019-07-23 10:02 ` Kevin Wolf
2019-07-23 10:21 ` Max Reitz
2019-07-23 11:06 ` Kevin Wolf
2019-07-22 17:18 ` Max Reitz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5f89c6ba-b049-db87-23ec-0317c53e978f@redhat.com \
--to=mreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).