From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, mreitz@redhat.com, famz@redhat.com,
pbonzini@redhat.com, slp@redhat.com, jsnow@redhat.com,
qemu-devel@nongnu.org
Subject: [Qemu-devel] [PATCH v2 13/17] blockjob: Lie better in child_job_drained_poll()
Date: Thu, 13 Sep 2018 14:52:13 +0200 [thread overview]
Message-ID: <20180913125217.23173-14-kwolf@redhat.com> (raw)
In-Reply-To: <20180913125217.23173-1-kwolf@redhat.com>
Block jobs claim in .drained_poll() that they are in a quiescent state
as soon as job->deferred_to_main_loop is true. This is obviously wrong,
they still have a completion BH to run. We only get away with this
because commit 91af091f923 added an unconditional aio_poll(false) to the
drain functions, but this is bypassing the regular drain mechanisms.
However, just removing this and telling that the job is still active
doesn't work either: The completion callbacks themselves call drain
functions (directly, or indirectly with bdrv_reopen), so they would
deadlock then.
As a better lie, tell that the job is active as long as the BH is
pending, but falsely call it quiescent from the point in the BH when the
completion callback is called. At this point, nested drain calls won't
deadlock because they ignore the job, and outer drains will wait for the
job to really reach a quiescent state because the callback is already
running.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/qemu/job.h | 3 +++
blockjob.c | 2 +-
job.c | 11 ++++++++++-
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 63c60ef1ba..9e7cd1e4a0 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -76,6 +76,9 @@ typedef struct Job {
* Set to false by the job while the coroutine has yielded and may be
* re-entered by job_enter(). There may still be I/O or event loop activity
* pending. Accessed under block_job_mutex (in blockjob.c).
+ *
+ * When the job is deferred to the main loop, busy is true as long as the
+ * bottom half is still pending.
*/
bool busy;
diff --git a/blockjob.c b/blockjob.c
index 8d27e8e1ea..617d86fe93 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -164,7 +164,7 @@ static bool child_job_drained_poll(BdrvChild *c)
/* An inactive or completed job doesn't have any pending requests. Jobs
* with !job->busy are either already paused or have a pause point after
* being reentered, so no job driver code will run before they pause. */
- if (!job->busy || job_is_completed(job) || job->deferred_to_main_loop) {
+ if (!job->busy || job_is_completed(job)) {
return false;
}
diff --git a/job.c b/job.c
index fa74558ba0..00a1cd128d 100644
--- a/job.c
+++ b/job.c
@@ -857,7 +857,16 @@ static void job_exit(void *opaque)
AioContext *ctx = job->aio_context;
aio_context_acquire(ctx);
+
+ /* This is a lie, we're not quiescent, but still doing the completion
+ * callbacks. However, completion callbacks tend to involve operations that
+ * drain block nodes, and if .drained_poll still returned true, we would
+ * deadlock. */
+ job->busy = false;
+ job_event_idle(job);
+
job_completed(job);
+
aio_context_release(ctx);
}
@@ -872,8 +881,8 @@ static void coroutine_fn job_co_entry(void *opaque)
assert(job && job->driver && job->driver->run);
job_pause_point(job);
job->ret = job->driver->run(job, &job->err);
- job_event_idle(job);
job->deferred_to_main_loop = true;
+ job->busy = true;
aio_bh_schedule_oneshot(qemu_get_aio_context(), job_exit, job);
}
--
2.13.6
next prev parent reply other threads:[~2018-09-13 12:53 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-13 12:52 [Qemu-devel] [PATCH v2 00/17] Fix some jobs/drain/aio_poll related hangs Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 01/17] job: Fix missing locking due to mismerge Kevin Wolf
2018-09-13 13:56 ` Max Reitz
2018-09-13 17:38 ` John Snow
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 02/17] blockjob: Wake up BDS when job becomes idle Kevin Wolf
2018-09-13 14:31 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 03/17] aio-wait: Increase num_waiters even in home thread Kevin Wolf
2018-09-13 15:11 ` Paolo Bonzini
2018-09-13 17:21 ` Kevin Wolf
2018-09-14 15:14 ` Paolo Bonzini
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 04/17] test-bdrv-drain: Drain with block jobs in an I/O thread Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 05/17] test-blockjob: Acquire AioContext around job_cancel_sync() Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 06/17] job: Use AIO_WAIT_WHILE() in job_finish_sync() Kevin Wolf
2018-09-13 14:45 ` Max Reitz
2018-09-13 15:15 ` Paolo Bonzini
2018-09-13 17:39 ` Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 07/17] test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 08/17] block: Add missing locking in bdrv_co_drain_bh_cb() Kevin Wolf
2018-09-13 14:58 ` Max Reitz
2018-09-13 15:17 ` Paolo Bonzini
2018-09-13 17:36 ` Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 09/17] block-backend: Add .drained_poll callback Kevin Wolf
2018-09-13 15:01 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 10/17] block-backend: Fix potential double blk_delete() Kevin Wolf
2018-09-13 15:19 ` Paolo Bonzini
2018-09-13 19:50 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 11/17] block-backend: Decrease in_flight only after callback Kevin Wolf
2018-09-13 15:10 ` Paolo Bonzini
2018-09-13 16:59 ` Kevin Wolf
2018-09-14 7:47 ` Fam Zheng
2018-09-14 15:12 ` Paolo Bonzini
2018-09-14 17:14 ` Kevin Wolf
2018-09-14 17:38 ` Paolo Bonzini
2018-09-13 20:50 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 12/17] mirror: Fix potential use-after-free in active commit Kevin Wolf
2018-09-13 20:55 ` Max Reitz
2018-09-13 21:43 ` Max Reitz
2018-09-14 16:25 ` Kevin Wolf
2018-09-13 12:52 ` Kevin Wolf [this message]
2018-09-13 21:52 ` [Qemu-devel] [PATCH v2 13/17] blockjob: Lie better in child_job_drained_poll() Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 14/17] block: Remove aio_poll() in bdrv_drain_poll variants Kevin Wolf
2018-09-13 21:55 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 15/17] test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level() Kevin Wolf
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 16/17] job: Avoid deadlocks in job_completed_txn_abort() Kevin Wolf
2018-09-13 22:01 ` Max Reitz
2018-09-13 12:52 ` [Qemu-devel] [PATCH v2 17/17] test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort Kevin Wolf
2018-09-13 22:05 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180913125217.23173-14-kwolf@redhat.com \
--to=kwolf@redhat.com \
--cc=famz@redhat.com \
--cc=jsnow@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=slp@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.