From: Kevin Wolf <kwolf@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: qemu-block@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
Wen Congyang <wencongyang2@huawei.com>,
Xie Changlong <xiechanglong.d@gmail.com>,
Markus Armbruster <armbru@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v6 15/18] job: detect change of aiocontext within job coroutine
Date: Fri, 3 Jun 2022 18:59:03 +0200 [thread overview]
Message-ID: <Ypo915liDsISLwuW@redhat.com> (raw)
In-Reply-To: <20220314133707.2206082-16-eesposit@redhat.com>
Am 14.03.2022 um 14:37 hat Emanuele Giuseppe Esposito geschrieben:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> We want to make sure access of job->aio_context is always done
> under either BQL or job_mutex. The problem is that using
> aio_co_enter(job->aiocontext, job->co) in job_start and job_enter_cond
> makes the coroutine immediately resume, so we can't hold the job lock.
> And caching it is not safe either, as it might change.
>
> job_start is under BQL, so it can freely read job->aiocontext, but
> job_enter_cond is not. In order to fix this, use aio_co_wake():
> the advantage is that it won't use job->aiocontext, but the
> main disadvantage is that it won't be able to detect a change of
> job AioContext.
>
> Calling bdrv_try_set_aio_context() will issue the following calls
> (simplified):
> * in terms of bdrv callbacks:
> .drained_begin -> .set_aio_context -> .drained_end
> * in terms of child_job functions:
> child_job_drained_begin -> child_job_set_aio_context -> child_job_drained_end
> * in terms of job functions:
> job_pause_locked -> job_set_aio_context -> job_resume_locked
>
> We can see that after setting the new aio_context, job_resume_locked
> calls again job_enter_cond, which then invokes aio_co_wake(). But
> while job->aiocontext has been set in job_set_aio_context,
> job->co->ctx has not changed, so the coroutine would be entering in
> the wrong aiocontext.
>
> Using aio_co_schedule in job_resume_locked() might seem as a valid
> alternative, but the problem is that the bh resuming the coroutine
> is not scheduled immediately, and if in the meanwhile another
> bdrv_try_set_aio_context() is run (see test_propagate_mirror() in
> test-block-iothread.c), we would have the first schedule in the
> wrong aiocontext, and the second set of drains won't even manage
> to schedule the coroutine, as job->busy would still be true from
> the previous job_resume_locked().
>
> The solution is to stick with aio_co_wake(), but then detect every time
> the coroutine resumes back from yielding if job->aio_context
> has changed. If so, we can reschedule it to the new context.
>
> Check for the aiocontext change in job_do_yield_locked because:
> 1) aio_co_reschedule_self requires to be in the running coroutine
> 2) since child_job_set_aio_context allows changing the aiocontext only
> while the job is paused, this is the exact place where the coroutine
> resumes, before running JobDriver's code.
>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> job.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/job.c b/job.c
> index 89c0e6bed9..10a5981748 100644
> --- a/job.c
> +++ b/job.c
> @@ -543,11 +543,12 @@ void job_enter_cond_locked(Job *job, bool(*fn)(Job *job))
> return;
> }
>
> - assert(!job->deferred_to_main_loop);
Why doesn't this assertion hold true any more?
> timer_del(&job->sleep_timer);
> job->busy = true;
> real_job_unlock();
> - aio_co_enter(job->aio_context, job->co);
> + job_unlock();
> + aio_co_wake(job->co);
> + job_lock();
> }
>
> void job_enter(Job *job)
> @@ -568,6 +569,8 @@ void job_enter(Job *job)
> */
> static void coroutine_fn job_do_yield_locked(Job *job, uint64_t ns)
> {
> + AioContext *next_aio_context;
> +
> real_job_lock();
> if (ns != -1) {
> timer_mod(&job->sleep_timer, ns);
> @@ -579,6 +582,20 @@ static void coroutine_fn job_do_yield_locked(Job *job, uint64_t ns)
> qemu_coroutine_yield();
> job_lock();
>
> + next_aio_context = job->aio_context;
> + /*
> + * Coroutine has resumed, but in the meanwhile the job AioContext
> + * might have changed via bdrv_try_set_aio_context(), so we need to move
> + * the coroutine too in the new aiocontext.
> + */
> + while (qemu_get_current_aio_context() != next_aio_context) {
> + job_unlock();
> + aio_co_reschedule_self(next_aio_context);
> + job_lock();
> + next_aio_context = job->aio_context;
> + }
> +
> +
Extra empty line.
> /* Set by job_enter_cond_locked() before re-entering the coroutine. */
> assert(job->busy);
> }
> @@ -680,7 +697,6 @@ void job_resume_locked(Job *job)
> if (job->pause_count) {
> return;
> }
> -
> /* kick only if no timer is pending */
> job_enter_cond_locked(job, job_timer_not_pending_locked);
> }
This hunk looks unrelated.
Kevin
next prev parent reply other threads:[~2022-06-03 17:01 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-14 13:36 [PATCH v6 00/18] job: replace AioContext lock with job_mutex Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 01/18] job.c: make job_mutex and job_lock/unlock() public Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 02/18] job.h: categorize fields in struct Job Emanuele Giuseppe Esposito
2022-06-03 16:00 ` Kevin Wolf
2022-06-07 13:20 ` Emanuele Giuseppe Esposito
2022-06-07 15:41 ` Paolo Bonzini
2022-06-08 7:28 ` Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 03/18] job.c: API functions not used outside should be static Emanuele Giuseppe Esposito
2022-06-09 9:16 ` Stefan Hajnoczi
2022-03-14 13:36 ` [PATCH v6 04/18] aio-wait.h: introduce AIO_WAIT_WHILE_UNLOCKED Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 05/18] job.h: add _locked duplicates for job API functions called with and without job_mutex Emanuele Giuseppe Esposito
2022-06-03 16:17 ` Kevin Wolf
2022-06-07 13:23 ` Emanuele Giuseppe Esposito
2022-06-09 9:32 ` Stefan Hajnoczi
2022-03-14 13:36 ` [PATCH v6 06/18] jobs: protect jobs with job_lock/unlock Emanuele Giuseppe Esposito
2022-06-03 16:40 ` Kevin Wolf
2022-06-07 13:17 ` Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 07/18] jobs: add job lock in find_* functions Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 08/18] jobs: use job locks also in the unit tests Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 09/18] block/mirror.c: use of job helpers in drivers to avoid TOC/TOU Emanuele Giuseppe Esposito
2022-03-14 13:36 ` [PATCH v6 10/18] jobs: rename static functions called with job_mutex held Emanuele Giuseppe Esposito
2022-06-09 9:47 ` Stefan Hajnoczi
2022-03-14 13:37 ` [PATCH v6 11/18] job.h: rename job API " Emanuele Giuseppe Esposito
2022-06-09 9:48 ` Stefan Hajnoczi
2022-03-14 13:37 ` [PATCH v6 12/18] block_job: rename block_job " Emanuele Giuseppe Esposito
2022-06-09 9:47 ` Stefan Hajnoczi
2022-03-14 13:37 ` [PATCH v6 13/18] job.h: define unlocked functions Emanuele Giuseppe Esposito
2022-06-09 9:41 ` Stefan Hajnoczi
2022-03-14 13:37 ` [PATCH v6 14/18] commit and mirror: create new nodes using bdrv_get_aio_context, and not the job aiocontext Emanuele Giuseppe Esposito
2022-03-14 13:37 ` [PATCH v6 15/18] job: detect change of aiocontext within job coroutine Emanuele Giuseppe Esposito
2022-06-03 16:59 ` Kevin Wolf [this message]
2022-06-07 13:28 ` Emanuele Giuseppe Esposito
2022-03-14 13:37 ` [PATCH v6 16/18] jobs: protect job.aio_context with BQL and job_mutex Emanuele Giuseppe Esposito
2022-03-14 13:37 ` [PATCH v6 17/18] job.c: enable job lock/unlock and remove Aiocontext locks Emanuele Giuseppe Esposito
2022-03-14 13:37 ` [PATCH v6 18/18] block_job_query: remove atomic read Emanuele Giuseppe Esposito
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ypo915liDsISLwuW@redhat.com \
--to=kwolf@redhat.com \
--cc=armbru@redhat.com \
--cc=eesposit@redhat.com \
--cc=fam@euphon.net \
--cc=hreitz@redhat.com \
--cc=jsnow@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
--cc=wencongyang2@huawei.com \
--cc=xiechanglong.d@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.