From: Stefan Reiter <s.reiter@proxmox.com>
To: qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, slp@redhat.com,
mreitz@redhat.com, stefanha@redhat.com, jsnow@redhat.com,
dietmar@proxmox.com
Subject: [PATCH v4 1/3] job: take each job's lock individually in job_txn_apply
Date: Wed, 1 Apr 2020 10:15:02 +0200 [thread overview]
Message-ID: <20200401081504.200017-2-s.reiter@proxmox.com> (raw)
In-Reply-To: <20200401081504.200017-1-s.reiter@proxmox.com>
All callers of job_txn_apply hold a single job's lock, but different
jobs within a transaction can have different contexts, thus we need to
lock each one individually before applying the callback function.
Similar to job_completed_txn_abort this also requires releasing the
caller's context before and reacquiring it after to avoid recursive
locks which might break AIO_WAIT_WHILE in the callback.
This also brings to light a different issue: When a callback function in
job_txn_apply moves it's job to a different AIO context, job_exit will
try to release the wrong lock (now that we re-acquire the lock
correctly, previously it would just continue with the old lock, leaving
the job unlocked for the rest of the codepath back to job_exit). Fix
this by not caching the job's context in job_exit and add a comment
about why this is done.
One test needed adapting, since it calls job_finalize directly, so it
manually needs to acquire the correct context.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
job.c | 48 ++++++++++++++++++++++++++++++++++---------
tests/test-blockjob.c | 2 ++
2 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/job.c b/job.c
index 134a07b92e..5fbaaabf78 100644
--- a/job.c
+++ b/job.c
@@ -136,17 +136,36 @@ static void job_txn_del_job(Job *job)
}
}
-static int job_txn_apply(JobTxn *txn, int fn(Job *))
+static int job_txn_apply(Job *job, int fn(Job *))
{
- Job *job, *next;
+ AioContext *inner_ctx;
+ Job *other_job, *next;
+ JobTxn *txn = job->txn;
int rc = 0;
- QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
- rc = fn(job);
+ /*
+ * Similar to job_completed_txn_abort, we take each job's lock before
+ * applying fn, but since we assume that outer_ctx is held by the caller,
+ * we need to release it here to avoid holding the lock twice - which would
+ * break AIO_WAIT_WHILE from within fn.
+ */
+ aio_context_release(job->aio_context);
+
+ QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+ inner_ctx = other_job->aio_context;
+ aio_context_acquire(inner_ctx);
+ rc = fn(other_job);
+ aio_context_release(inner_ctx);
if (rc) {
break;
}
}
+
+ /*
+ * Note that job->aio_context might have been changed by calling fn, so we
+ * can't use a local variable to cache it.
+ */
+ aio_context_acquire(job->aio_context);
return rc;
}
@@ -774,11 +793,11 @@ static void job_do_finalize(Job *job)
assert(job && job->txn);
/* prepare the transaction to complete */
- rc = job_txn_apply(job->txn, job_prepare);
+ rc = job_txn_apply(job, job_prepare);
if (rc) {
job_completed_txn_abort(job);
} else {
- job_txn_apply(job->txn, job_finalize_single);
+ job_txn_apply(job, job_finalize_single);
}
}
@@ -824,10 +843,10 @@ static void job_completed_txn_success(Job *job)
assert(other_job->ret == 0);
}
- job_txn_apply(txn, job_transition_to_pending);
+ job_txn_apply(job, job_transition_to_pending);
/* If no jobs need manual finalization, automatically do so */
- if (job_txn_apply(txn, job_needs_finalize) == 0) {
+ if (job_txn_apply(job, job_needs_finalize) == 0) {
job_do_finalize(job);
}
}
@@ -849,9 +868,10 @@ static void job_completed(Job *job)
static void job_exit(void *opaque)
{
Job *job = (Job *)opaque;
- AioContext *ctx = job->aio_context;
+ AioContext *ctx;
- aio_context_acquire(ctx);
+ job_ref(job);
+ aio_context_acquire(job->aio_context);
/* This is a lie, we're not quiescent, but still doing the completion
* callbacks. However, completion callbacks tend to involve operations that
@@ -862,6 +882,14 @@ static void job_exit(void *opaque)
job_completed(job);
+ /*
+ * Note that calling job_completed can move the job to a different
+ * aio_context, so we cannot cache from above. job_txn_apply takes care of
+ * acquiring the new lock, and we ref/unref to avoid job_completed freeing
+ * the job underneath us.
+ */
+ ctx = job->aio_context;
+ job_unref(job);
aio_context_release(ctx);
}
diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 4eeb184caf..7519847912 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -367,7 +367,9 @@ static void test_cancel_concluded(void)
aio_poll(qemu_get_aio_context(), true);
assert(job->status == JOB_STATUS_PENDING);
+ aio_context_acquire(job->aio_context);
job_finalize(job, &error_abort);
+ aio_context_release(job->aio_context);
assert(job->status == JOB_STATUS_CONCLUDED);
cancel_common(s);
--
2.26.0
next prev parent reply other threads:[~2020-04-01 8:18 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-01 8:15 [PATCH v4 0/3] Fix some AIO context locking in jobs Stefan Reiter
2020-04-01 8:15 ` Stefan Reiter [this message]
2020-04-02 12:33 ` [PATCH v4 1/3] job: take each job's lock individually in job_txn_apply Max Reitz
2020-04-02 15:04 ` Stefan Reiter
2020-04-01 8:15 ` [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync Stefan Reiter
2020-04-02 12:41 ` Max Reitz
2020-04-02 15:05 ` Stefan Reiter
2020-04-01 8:15 ` [PATCH v4 3/3] backup: don't acquire aio_context in backup_clean Stefan Reiter
2020-04-02 12:59 ` Max Reitz
2020-04-02 12:48 ` [PATCH v4 0/3] Fix some AIO context locking in jobs Kevin Wolf
2020-04-06 20:11 ` John Snow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200401081504.200017-2-s.reiter@proxmox.com \
--to=s.reiter@proxmox.com \
--cc=dietmar@proxmox.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=slp@redhat.com \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).