From: Fam Zheng <famz@redhat.com>
To: kwolf@redhat.com, Sergio Lopez <slp@redhat.com>
Cc: stefanha@redhat.com, qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cb
Date: Wed, 12 Sep 2018 15:41:59 +0800 [thread overview]
Message-ID: <20180912074159.GA11164@lemon.usersys.redhat.com> (raw)
In-Reply-To: <20180905093351.21954-1-slp@redhat.com>
On Wed, 09/05 11:33, Sergio Lopez wrote:
> AIO Coroutines shouldn't by managed by an AioContext different than the
> one assigned when they are created. aio_co_enter avoids entering a
> coroutine from a different AioContext, calling aio_co_schedule instead.
>
> Scheduled coroutines are then entered by co_schedule_bh_cb using
> qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the
> current AioContext obtained with qemu_get_current_aio_context.
> Eventually, co->ctx will be set to the AioContext passed as an argument
> to qemu_aio_coroutine_enter.
>
> This means that, if an IO Thread's AioConext is being processed by the
> Main Thread (due to aio_poll being called with a BDS AioContext, as it
> happens in AIO_WAIT_WHILE among other places), the AioContext from some
> coroutines may be wrongly replaced with the one from the Main Thread.
>
> This is the root cause behind some crashes, mainly triggered by the
> drain code at block/io.c. The most common are these abort and failed
> assertion:
>
> util/async.c:aio_co_schedule
> 456 if (scheduled) {
> 457 fprintf(stderr,
> 458 "%s: Co-routine was already scheduled in '%s'\n",
> 459 __func__, scheduled);
> 460 abort();
> 461 }
>
> util/qemu-coroutine-lock.c:
> 286 assert(mutex->holder == self);
>
> But it's also known to cause random errors at different locations, and
> even SIGSEGV with broken coroutine backtraces.
>
> By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can
> pass the correct AioContext as an argument, making sure co->ctx is not
> wrongly altered.
>
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
> util/async.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/util/async.c b/util/async.c
> index 05979f8014..c10642a385 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -400,7 +400,7 @@ static void co_schedule_bh_cb(void *opaque)
>
> /* Protected by write barrier in qemu_aio_coroutine_enter */
> atomic_set(&co->scheduled, NULL);
> - qemu_coroutine_enter(co);
> + qemu_aio_coroutine_enter(ctx, co);
> aio_context_release(ctx);
> }
> }
Kevin, could you test this patch together with your next version of the drain
fix series? Since they are related, it's better if you could include it in your
series or even apply it yourself. Peter is not processing pull requests, so
scattering fixes in various trees will do no good.
Fam
next prev parent reply other threads:[~2018-09-12 7:42 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-05 9:33 [Qemu-devel] [PATCH] util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cb Sergio Lopez
2018-09-12 7:35 ` Fam Zheng
2018-09-12 7:41 ` Fam Zheng [this message]
2018-09-12 10:28 ` Kevin Wolf
2018-09-13 14:58 ` [Qemu-devel] [Qemu-block] " Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180912074159.GA11164@lemon.usersys.redhat.com \
--to=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=slp@redhat.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).