From: Sergio Lopez <slp@redhat.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, stefanha@redhat.com, kwolf@redhat.com,
mreitz@redhat.com, eblake@redhat.com,
Sergio Lopez <slp@redhat.com>
Subject: [Qemu-devel] [PATCH v2] mirror: Confirm we're quiesced only if the job is paused or cancelled
Date: Thu, 7 Mar 2019 19:54:01 +0100
Message-ID: <20190307185401.41639-1-slp@redhat.com>

While child_job_drained_begin() calls job_pause(), the job doesn't
actually transition between states until it runs again and reaches a
pause point. This means bdrv_drained_begin() may return while some jobs
using the node still have 'busy == true'.

As a consequence, block_job_detach_aio_context() may get into a
deadlock, waiting for the job to be actually paused, while the coroutine
servicing the job is yielding and doesn't get the opportunity to get
scheduled again. This situation can be reproduced by issuing a
'block-commit' immediately followed by a 'device_del'.

To ensure bdrv_drained_begin() only returns once the jobs have been
paused, we change mirror_drained_poll() to only confirm it's quiesced
when the job is paused or cancelled and there aren't any in-flight
requests. A drained section initiated by the mirror/commit job itself is
exempt from the paused/cancelled requirement, so that the job doesn't
deadlock waiting for itself.
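
For illustration only, the resulting decision can be modelled outside of
QEMU. The sketch below is not the code from this patch: SketchMirrorJob,
sketch_mirror_drained_poll() and sketch_mirror_self_check() are
hypothetical names, and the struct only carries the four values the check
consults. As in the patch, returning true means "not quiesced yet, keep
polling".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state the check looks at.
 * This is NOT the real MirrorBlockJob layout. */
typedef struct SketchMirrorJob {
    bool paused;     /* models s->common.job.paused */
    bool cancelled;  /* models s->common.job.cancelled */
    bool in_drain;   /* models the new s->in_drain flag */
    int in_flight;   /* models s->in_flight */
} SketchMirrorJob;

/* Returns true while the caller of the drain must keep polling. */
static bool sketch_mirror_drained_poll(const SketchMirrorJob *s)
{
    /* A job that is neither paused nor cancelled may still issue new
     * requests, unless the drain was started by the job itself. */
    if (!s->paused && !s->cancelled && !s->in_drain) {
        return true;
    }
    /* Otherwise only requests that are already in flight matter. */
    return s->in_flight != 0;
}

/* Small usage example covering the interesting combinations. */
static void sketch_mirror_self_check(void)
{
    SketchMirrorJob running = { false, false, false, 0 };
    SketchMirrorJob paused = { true, false, false, 0 };
    SketchMirrorJob own_drain = { false, false, true, 0 };
    SketchMirrorJob paused_busy = { true, false, false, 2 };

    assert(sketch_mirror_drained_poll(&running));      /* keep polling */
    assert(!sketch_mirror_drained_poll(&paused));      /* quiesced */
    assert(!sketch_mirror_drained_poll(&own_drain));   /* quiesced */
    assert(sketch_mirror_drained_poll(&paused_busy));  /* keep polling */
}
```

The 'own_drain' case is the exception: without it, a drained section
opened by the job would poll forever, since the job cannot reach a pause
point while it is waiting for its own drain to finish.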

The other block jobs shouldn't need any changes, as the default
drained_poll() behavior is to only confirm it's quiesced if the job is
either not busy or already completed.
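
As a rough model of that default, assuming the generic path behaves as
described here and lets a job driver override the answer (which is what
mirror does with mirror_drained_poll()), the check could look like the
sketch below. SketchJob, SketchDriverPoll and the sketch_* functions are
hypothetical names for illustration, not QEMU's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified job state; not QEMU's Job struct. */
typedef struct SketchJob {
    bool busy;       /* job coroutine is running or scheduled */
    bool completed;  /* job has finished */
} SketchJob;

/* Optional per-driver override, modelled as a plain function pointer. */
typedef bool (*SketchDriverPoll)(const SketchJob *job);

/* Returns true while the drain must keep polling. */
static bool sketch_generic_drained_poll(const SketchJob *job,
                                        SketchDriverPoll driver_poll)
{
    /* A job that is not busy, or that has completed, is quiesced. */
    if (!job->busy || job->completed) {
        return false;
    }
    /* A busy job is assumed to be active unless its driver knows better. */
    if (driver_poll != NULL) {
        return driver_poll(job);
    }
    return true;
}

/* Example driver callback that always reports "quiesced". */
static bool sketch_driver_says_quiesced(const SketchJob *job)
{
    (void)job;
    return false;
}

/* Small usage example. */
static void sketch_generic_self_check(void)
{
    SketchJob idle = { false, false };
    SketchJob busy = { true, false };
    SketchJob done = { true, true };

    assert(!sketch_generic_drained_poll(&idle, NULL));
    assert(sketch_generic_drained_poll(&busy, NULL));
    assert(!sketch_generic_drained_poll(&done, NULL));
    assert(!sketch_generic_drained_poll(&busy, sketch_driver_says_quiesced));
}
```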

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
v2
- Fix typo (thanks to Eric Blake)
---
block/mirror.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/block/mirror.c b/block/mirror.c
index 726d3c27fb..1a1fb174b6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -80,6 +80,7 @@ typedef struct MirrorBlockJob {
     bool initial_zeroing_ongoing;
     int in_active_write_counter;
     bool prepared;
+    bool in_drain;
 } MirrorBlockJob;
 
 typedef struct MirrorBDSOpaque {
@@ -679,9 +680,11 @@ static int mirror_exit_common(Job *job)
 
         /* The mirror job has no requests in flight any more, but we need to
          * drain potential other users of the BDS before changing the graph. */
+        s->in_drain = true;
         bdrv_drained_begin(target_bs);
         bdrv_replace_node(to_replace, target_bs, &local_err);
         bdrv_drained_end(target_bs);
+        s->in_drain = false;
         if (local_err) {
             error_report_err(local_err);
             ret = -EPERM;
@@ -717,6 +720,7 @@ static int mirror_exit_common(Job *job)
     bs_opaque->job = NULL;
 
     bdrv_drained_end(src);
+    s->in_drain = false;
     bdrv_unref(mirror_top_bs);
     bdrv_unref(src);
 
@@ -1000,10 +1004,12 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
              */
             trace_mirror_before_drain(s, cnt);
 
+            s->in_drain = true;
             bdrv_drained_begin(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
             if (cnt > 0 || mirror_flush(s) < 0) {
                 bdrv_drained_end(bs);
+                s->in_drain = false;
                 continue;
             }
 
@@ -1051,6 +1057,7 @@ immediate_exit:
     bdrv_dirty_iter_free(s->dbi);
 
     if (need_drain) {
+        s->in_drain = true;
         bdrv_drained_begin(bs);
     }
 
@@ -1119,6 +1126,16 @@ static void coroutine_fn mirror_pause(Job *job)
 static bool mirror_drained_poll(BlockJob *job)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
+
+    /* If the job isn't paused nor cancelled, we can't be sure that it won't
+     * issue more requests. We make an exception if we've reached this point
+     * from one of our own drain sections, to avoid a deadlock waiting for
+     * ourselves.
+     */
+    if (!s->common.job.paused && !s->common.job.cancelled && !s->in_drain) {
+        return true;
+    }
+
     return !!s->in_flight;
 }
 
--
2.20.1