qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine
Date: Mon, 10 Apr 2017 08:56:00 +0800	[thread overview]
Message-ID: <f28a47ae-4c65-8b56-d7c0-d0298f03c3f7@redhat.com> (raw)
In-Reply-To: <20170407151445.GF4716@noname.redhat.com>



On 07/04/2017 23:14, Kevin Wolf wrote:
> One part of the reason is that
> BDRV_POLL_WHILE() drops the lock since commit c9d1a561, so just calling
> aio_context_acquire() doesn't protect you any more if there is any
> chance that a nested function calls BDRV_POLL_WHILE().

Hi guys, sorry for the late reply.

There wasn't actually much that changed in commit c9d1a561.  Everything
that could break was also broken before.

With commit c9d1a561 the logic is

	aio_context_acquire
	prepare I/O
	aio_context_release
	poll main AioContext
				aio_context_acquire
				aio_poll secondary AioContext
				complete I/O
			    <-- bdrv_wakeup
	aio_context_acquire
				aio_context_release
	Sync I/O is complete

while before it was

	aio_context_acquire
	prepare I/O
	aio_poll secondary AioContext
	complete I/O
	Sync I/O is complete

The code that can run in "aio_poll secondary AioContext" is the same in
both cases.  The difference is that, after c9d1a561, it always runs in
one thread which eliminates the need for RFifoLock's contention
callbacks (and a bunch of bdrv_drain_all deadlocks that arose from the
combination of contention callbacks and recursive locking).

The patch that introduced the bug is the one that introduced the "home
AioContext" co->ctx for coroutines, because now you have

	aio_context_acquire
	prepare I/O
	aio_context_release
	poll main AioContext
				aio_context_acquire
				aio_poll secondary AioContext
				aio_co_wake
	complete I/O
	bdrv_wakeup
	aio_context_acquire
				aio_context_release
	Sync I/O is complete

and if "complete I/O" causes anything to happen in the iothread, bad
things happen.

I think Fam's analysis is right.  This patch will hopefully be reverted
in 2.10, but right now it's the right thing to do.  However, I don't
like very much adding an argument to qemu_coroutine_enter because:

1) coroutines can be used without AioContexts (CoMutex has a dependency
on aio_co_wake, but it is hidden behind the API and if you have a single
AioContext you could just define aio_co_wake to be qemu_coroutine_enter).

2) the value that is assigned to co->ctx is not the AioContext in which
the coroutine is running.

It would be better to split aio_co_wake so that aio_co_wake is

    /* Read coroutine before co->ctx.  Matches smp_wmb in
     * qemu_coroutine_enter.
     */
    smp_read_barrier_depends();
    ctx = atomic_read(&co->ctx);
    aio_co_enter(ctx, co);

and aio_co_enter is the rest of the logic.

Then the coroutine code always runs in the right thread, which obviously
is thread-safe.

Paolo

  reply	other threads:[~2017-04-10  0:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-07  6:54 [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Fam Zheng
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 1/6] block: Fix unpaired aio_disable_external in external snapshot Fam Zheng
2017-04-07 13:23   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 2/6] block: Assert attached child node has right aio context Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 3/6] mirror: Fix aio context of mirror_top_bs Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 4/6] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
2017-04-07 12:50   ` Stefan Hajnoczi
2017-04-08  3:43     ` Fam Zheng
2017-04-10  8:06       ` Kevin Wolf
2017-04-10  8:45         ` Fam Zheng
2017-04-10  9:11           ` Kevin Wolf
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine Fam Zheng
2017-04-07  9:57   ` Fam Zheng
2017-04-07 13:26   ` Stefan Hajnoczi
2017-04-08  3:27     ` Fam Zheng
2017-04-07 14:07   ` Eric Blake
2017-04-07 15:14   ` Kevin Wolf
2017-04-10  0:56     ` Paolo Bonzini [this message]
2017-04-10  1:43       ` Fam Zheng
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 6/6] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
2017-04-07 13:28   ` Stefan Hajnoczi
2017-04-07 18:05     ` John Snow
2017-04-08  3:39       ` Fam Zheng
2017-04-07 12:45 ` [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f28a47ae-4c65-8b56-d7c0-d0298f03c3f7@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).