From: Kevin Wolf <kwolf@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: qemu-block@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Wen Congyang <wencongyang2@huawei.com>,
	Xie Changlong <xiechanglong.d@gmail.com>,
	Markus Armbruster <armbru@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v10 14/21] jobs: protect job.aio_context with BQL and job_mutex
Date: Fri, 5 Aug 2022 11:12:10 +0200
Message-ID: <Yuze6ldui3LtEcZm@redhat.com>
In-Reply-To: <20220725073855.76049-15-eesposit@redhat.com>

On 25.07.2022 at 09:38, Emanuele Giuseppe Esposito wrote:
> In order to make it thread safe, implement a "fake rwlock",
> where we allow reads under BQL *or* job_mutex held, but
> writes only under BQL *and* job_mutex.

Oh, so the "or BQL" part is only for job.aio_context? Okay.
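
For readers following along, the "fake rwlock" rule quoted above can be
sketched in isolation. This is only an illustration under stated
assumptions: every demo_* name and the FakeRwJob type are hypothetical
stand-ins, the BQL is modelled as a plain flag, and the "mutex held"
flag is hand-tracked bookkeeping that is only meaningful in a
single-threaded demo. None of this is QEMU's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not QEMU definitions. */
static pthread_mutex_t demo_job_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool demo_bql_held;        /* models "in the main thread, BQL held" */
static bool demo_job_mutex_held;  /* bookkeeping for the asserts only */

typedef struct FakeRwJob {
    void *aio_context;
} FakeRwJob;

static void demo_job_lock(void)
{
    pthread_mutex_lock(&demo_job_mutex);
    demo_job_mutex_held = true;
}

static void demo_job_unlock(void)
{
    demo_job_mutex_held = false;
    pthread_mutex_unlock(&demo_job_mutex);
}

/* Read side: holding either the BQL or the job mutex is enough. */
static void *demo_read_aio_context(FakeRwJob *job)
{
    assert(demo_bql_held || demo_job_mutex_held);
    return job->aio_context;
}

/* Write side: both the BQL and the job mutex must be held. */
static void demo_write_aio_context(FakeRwJob *job, void *ctx)
{
    assert(demo_bql_held && demo_job_mutex_held);
    job->aio_context = ctx;
}
```

The point of the asymmetry is that a writer excludes both kinds of
reader at once, while each reader only needs to exclude the writer.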

> The only write we have is in child_job_set_aio_ctx, which always
> happens under drain (so the job is paused).
> For this reason, introduce job_set_aio_context and make sure that
> the context is set under BQL, job_mutex and drain.
> Also make sure all other places where the aiocontext is read
> are protected.
> 
> Note: at this stage, job_{lock/unlock} and job lock guard macros
> are *nop*.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/replication.c |  6 ++++--
>  blockjob.c          |  3 ++-
>  include/qemu/job.h  | 19 ++++++++++++++++++-
>  job.c               | 12 ++++++++++++
>  4 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 55c8f894aa..2189863df1 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -148,8 +148,10 @@ static void replication_close(BlockDriverState *bs)
>      }
>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>          commit_job = &s->commit_job->job;
> -        assert(commit_job->aio_context == qemu_get_current_aio_context());
> -        job_cancel_sync(commit_job, false);
> +        WITH_JOB_LOCK_GUARD() {
> +            assert(commit_job->aio_context == qemu_get_current_aio_context());
> +            job_cancel_sync_locked(commit_job, false);
> +        }
>      }

.bdrv_close runs under the BQL, so why is this needed? Maybe a
GLOBAL_STATE_CODE() annotation would be helpful, though.

>      if (s->mode == REPLICATION_MODE_SECONDARY) {
> diff --git a/blockjob.c b/blockjob.c
> index 96fb9d9f73..9ff2727025 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -162,12 +162,13 @@ static void child_job_set_aio_ctx(BdrvChild *c, AioContext *ctx,
>          bdrv_set_aio_context_ignore(sibling->bs, ctx, ignore);
>      }
>  
> -    job->job.aio_context = ctx;
> +    job_set_aio_context(&job->job, ctx);
>  }
>  
>  static AioContext *child_job_get_parent_aio_context(BdrvChild *c)
>  {
>      BlockJob *job = c->opaque;
> +    assert(qemu_in_main_thread());

Any reason not to use GLOBAL_STATE_CODE()?
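
To make the suggestion concrete, here is a minimal sketch of what an
annotation macro of this kind can look like next to the open-coded
assertion. The DEMO_GLOBAL_STATE_CODE macro, demo_in_main_thread() and
GsJob are hypothetical stand-ins written for this example; I am assuming
the real macro wraps an equivalent main-thread check, which should be
confirmed against the QEMU headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a "running in the main thread" query. */
static bool demo_main_thread_flag;

static bool demo_in_main_thread(void)
{
    return demo_main_thread_flag;
}

/* Hypothetical annotation macro: same check, but self-documenting. */
#define DEMO_GLOBAL_STATE_CODE()            \
    do {                                    \
        assert(demo_in_main_thread());      \
    } while (0)

typedef struct GsJob {
    void *aio_context;
} GsJob;

/* Reads the field under the "global state" rule only. */
static void *demo_get_parent_aio_context(GsJob *job)
{
    DEMO_GLOBAL_STATE_CODE();
    return job->aio_context;
}
```

The runtime check is the same either way; the named macro mainly tells
the reader which API category the function belongs to.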

>      return job->job.aio_context;
>  }
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 5709e8d4a8..c144aabefc 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -77,7 +77,12 @@ typedef struct Job {
>  
>      /** Protected by AioContext lock */

I think this section comment should move down below aio_context now.

> -    /** AioContext to run the job coroutine in */
> +    /**
> +     * AioContext to run the job coroutine in.
> +     * This field can be read when holding either the BQL (so we are in
> +     * the main loop) or the job_mutex.
> +     * It can be only written when we hold *both* BQL and job_mutex.
> +     */
>      AioContext *aio_context;
>  
>      /** Reference count of the block job */
> @@ -741,4 +746,16 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp),
>  int job_finish_sync_locked(Job *job, void (*finish)(Job *, Error **errp),
>                             Error **errp);
>  
> +/**
> + * Sets the @job->aio_context.
> + * Called with job_mutex *not* held.
> + *
> + * This function must run in the main thread to protect against
> + * concurrent read in job_finish_sync_locked(),

Odd line break here in the middle of a sentence.

> + * takes the job_mutex lock to protect against the read in
> + * job_do_yield_locked(), and must be called when the coroutine
> + * is quiescent.
> + */
> +void job_set_aio_context(Job *job, AioContext *ctx);
> +
>  #endif
> diff --git a/job.c b/job.c
> index ecec66b44e..0a857b1468 100644
> --- a/job.c
> +++ b/job.c
> @@ -394,6 +394,17 @@ Job *job_get(const char *id)
>      return job_get_locked(id);
>  }
>  
> +void job_set_aio_context(Job *job, AioContext *ctx)
> +{
> +    /* protect against read in job_finish_sync_locked and job_start */
> +    assert(qemu_in_main_thread());

Same question about GLOBAL_STATE_CODE().

> +    /* protect against read in job_do_yield_locked */
> +    JOB_LOCK_GUARD();
> +    /* ensure the coroutine is quiescent while the AioContext is changed */
> +    assert(job->pause_count > 0);

job->pause_count only shows that pausing was requested. The coroutine is
only really quiescent if job->busy == false, too.

Or maybe job->paused is actually the one you want here.
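
A small sketch of the distinction I mean, using a hypothetical PauseJob
struct that only mirrors the three field names discussed here
(pause_count, busy, paused). The demo_* helpers are made up for
illustration and are not proposed API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical struct mirroring only the fields under discussion. */
typedef struct PauseJob {
    int pause_count;  /* number of outstanding pause requests */
    bool busy;        /* coroutine is currently running */
    bool paused;      /* coroutine has actually stopped at a pause point */
} PauseJob;

/* True as soon as someone asked for a pause; says nothing about progress. */
static bool demo_pause_requested(const PauseJob *job)
{
    return job->pause_count > 0;
}

/* Stricter check: a pause was requested and the coroutine is not running. */
static bool demo_is_quiescent(const PauseJob *job)
{
    return job->pause_count > 0 && !job->busy;
}
```

With pause_count == 1 and busy == true, the first predicate holds while
the second does not, which is the window the current assertion misses.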

> +    job->aio_context = ctx;
> +}
> +
>  /* Called with job_mutex *not* held. */
>  static void job_sleep_timer_cb(void *opaque)
>  {
> @@ -1376,6 +1387,7 @@ int job_finish_sync_locked(Job *job,
>  {
>      Error *local_err = NULL;
>      int ret;
> +    assert(qemu_in_main_thread());
>  
>      job_ref_locked(job);

Another GLOBAL_STATE_CODE()?

Kevin


