All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org,
	qemu-block@nongnu.org, hreitz@redhat.com,
	vsementsov@yandex-team.ru, jsnow@redhat.com,
	jean-louis@dupond.be, dionbosschieter@gmail.com
Subject: Re: [PATCH] block/mirror: disable dirty bitmap only after job init
Date: Mon, 16 Feb 2026 17:38:36 +0100	[thread overview]
Message-ID: <aZNIDFp9OaNBe0NK@redhat.com> (raw)
In-Reply-To: <811bbd59-df4e-4a60-adba-61cd2ef3016c@proxmox.com>

Am 16.02.2026 um 17:12 hat Fiona Ebner geschrieben:
> Am 16.02.26 um 4:31 PM schrieb Kevin Wolf:
> > Am 16.02.2026 um 14:01 hat Fiona Ebner geschrieben:
> >> Am 16.02.26 um 1:48 PM schrieb Fiona Ebner:
> >>> Am 16.02.26 um 11:25 AM schrieb Kevin Wolf:
> >>>> Am 12.02.2026 um 13:02 hat Fiona Ebner geschrieben:
> >>>>> Currently, the dirty bitmap is disabled too early and the following
> >>>>> bad scenario is possible:
> >>>>>
> >>>>> 1. Dirty bitmap is disabled in mirror_start_job()
> >>>>> 2. Some request are started in mirror_top_bs while s->job == NULL
> >>>>> 3. mirror_dirty_init() -> bdrv_co_is_allocated_above() runs and because
> >>>>>    the request hasn't completed yet, the block isn't allocated
> >>>>> 4. The request completes, still sees s->job == NULL and skips the
> >>>>>    bitmap, and nothing else will mark it dirty either
> >>>>>
> >>>>> One ingredient is that mirror_top_opaque->job is only set after the
> >>>>> job is fully initialized. For the rationale, see commit 32125b1460
> >>>>> ("mirror: Fix access of uninitialised fields during start").
> >>>>>
> >>>>> Disabling the dirty bitmap is safe once bdrv_mirror_top_do_write()
> >>>>> sees that the job is set, because then:
> >>>>>
> >>>>> 1. When not using MIRROR_COPY_MODE_WRITE_BLOCKING, the dirty bitmap
> >>>>>    will be set by bdrv_mirror_top_do_write().
> >>>>>
> >>>>> 2. When using MIRROR_COPY_MODE_WRITE_BLOCKING, writes will be done
> >>>>>    synchronously to the target.
> >>>>>
> >>>>> At least with virtio-blk using iothread-vq-mapping, mirror_run() and
> >>>>> bdrv_mirror_top_do_write() might be called in different threads.
> >>>>> bdrv_disable_dirty_bitmap() acquires and releases the dirty bitmap
> >>>>> mutex, so the memory is synchronized between threads.
> >>>>>
> >>>>> Many thanks to Kevin Wolf for discussing the issue and solutions with
> >>>>> me! [0]
> >>>>>
> >>>>> [0]: https://lore.kernel.org/qemu-devel/4853b0e5-8ec3-41e9-9a53-b1912b8e4449@dupond.be/T/
> >>>>>
> >>>>> Cc: qemu-stable@nongnu.org
> >>>>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/3273
> >>>>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> >>>>> ---
> >>>>>  block/mirror.c | 21 +++++++++++++++------
> >>>>>  1 file changed, 15 insertions(+), 6 deletions(-)
> >>>>>
> >>>>> diff --git a/block/mirror.c b/block/mirror.c
> >>>>> index b344182c74..eadd4501e8 100644
> >>>>> --- a/block/mirror.c
> >>>>> +++ b/block/mirror.c
> >>>>> @@ -1123,6 +1123,21 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
> >>>>>       */
> >>>>>      mirror_top_opaque->job = s;
> >>>>>  
> >>>>> +    /*
> >>>>> +     * Disabling the dirty bitmap is safe once bdrv_mirror_top_do_write() sees
> >>>>> +     * that the job is set, because then:
> >>>>> +     *
> >>>>> +     * 1. When not using MIRROR_COPY_MODE_WRITE_BLOCKING, the dirty bitmap will
> >>>>> +     * be set by bdrv_mirror_top_do_write().
> >>>>> +     *
> >>>>> +     * 2. When using MIRROR_COPY_MODE_WRITE_BLOCKING, writes will be done
> >>>>> +     * synchronously to the target.
> >>>>> +     *
> >>>>> +     * bdrv_disable_dirty_bitmap() acquires and releases the dirty bitmap mutex,
> >>>>> +     * so the memory is synchronized between threads.
> >>>>> +     */
> >>>>> +    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
> >>>>> +
> >>>>>      assert(!s->dbi);
> >>>>>      s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap);
> >>>>>      for (;;) {
> >>>>> @@ -2014,12 +2029,6 @@ static BlockJob *mirror_start_job(
> >>>>>          goto fail;
> >>>>>      }
> >>>>>  
> >>>>> -    /*
> >>>>> -     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in active
> >>>>> -     * mode.
> >>>>> -     */
> >>>>> -    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
> >>>>> -
> >>>>>      bdrv_graph_wrlock_drained();
> >>>>>      ret = block_job_add_bdrv(&s->common, "source", bs, 0,
> >>>>>                               BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
> >>>>
> >>>> The thing I meant in the other thread is if we don't need something like
> >>>> this additionally:
> >>>>
> >>>> diff --git a/block/mirror.c b/block/mirror.c
> >>>> index b344182c747..159954158ba 100644
> >>>> --- a/block/mirror.c
> >>>> +++ b/block/mirror.c
> >>>> @@ -1672,9 +1672,17 @@ bdrv_mirror_top_do_write(BlockDriverState *bs, MirrorMethod method,
> >>>>          abort();
> >>>>      }
> >>>>  
> >>>> -    if (!copy_to_target && s->job && s->job->dirty_bitmap) {
> >>>> +    if (!copy_to_target) {
> >>>>          qatomic_set(&s->job->actively_synced, false);
> >>>> -        bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
> >>>> +        if (s->job && s->job->dirty_bitmap) {
> >>>> +            bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
> >>>> +        } else {
> >>>> +            /*
> >>>> +             * Avoid race in the case that mirror_run() disables the bitmap
> >>>> +             * between here and bdrv_co_write_req_finish().
> >>>> +             */
> >>>> +            bdrv_set_dirty(bs, offset, bytes);
> >>>> +        }
> >>>>      }
> >>>>  
> >>>>      if (ret < 0) {
> >>>
> >>> You're right! It's not enough to ensure that the job is set before the
> >>> bitmap is disabled, because both could happen after the check for job here.
> >>>
> >>> Isn't this change enough by itself? After the change,
> >>> bdrv_mirror_top_do_write() ensures that:
> >>>
> >>> 1. When copy_to_target == true, the write is done synchronously to the
> >>> target, no need to set the dirty bitmap.
> >>>
> >>> 2. When copy_to_target == false, the dirty bitmap is set.
> >>>
> >>> So it doesn't really matter at which point the dirty bitmap is disabled?
> >>> Or am I missing something again?
> >>
> >> Ah right, bdrv_set_dirty() only sets it if still enabled :)
> > 
> > Yes, exactly.
> > 
> > If that's still too confusing, how about just giving mirror_top_bs
> > access to its dirty bitmap right from the start while it's drained? I
> > didn't try to reproduce the bug with it yet, though.
> 
> You beat me to it :) Was thinking about this too, inspired by your
> initial suggestion for the alternate approach to avoid relying on the
> block layer for dirty tracking, but mine ended up needlessly involved,
> because I removed the dirty_bitmap from the MirrorBlockJob object and
> had to add a helper function for getting the dirty bitmap back from the
> job, which is less than ideal :P
> > 
> > Kevin
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index b344182c747..c11aca1366c 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -99,6 +99,7 @@ typedef struct MirrorBlockJob {
> >  
> >  typedef struct MirrorBDSOpaque {
> >      MirrorBlockJob *job;
> > +    BdrvDirtyBitmap *dirty_bitmap;
> >      bool stop;
> >      bool is_commit;
> >  } MirrorBDSOpaque;
> > @@ -1672,9 +1673,9 @@ bdrv_mirror_top_do_write(BlockDriverState *bs, MirrorMethod method,
> >          abort();
> >      }
> >  
> > -    if (!copy_to_target && s->job && s->job->dirty_bitmap) {
> > +    if (!copy_to_target) {
> 
> I put an assert(s->dirty_bitmap) here, but maybe it's not worth it?

The difference is SIGABRT here vs. SIGSEGV two lines later. I don't think
it matters much, either way is fine with me.

> >          qatomic_set(&s->job->actively_synced, false);
> > -        bdrv_set_dirty_bitmap(s->job->dirty_bitmap, offset, bytes);
> > +        bdrv_set_dirty_bitmap(s->dirty_bitmap, offset, bytes);
> >      }
> >  
> >      if (ret < 0) {
> > @@ -1901,13 +1902,28 @@ static BlockJob *mirror_start_job(
> 
> I created and disabled the dirty bitmap here already before the drained
> section. It seems slightly more readable. Do you see any downside to that?

Hm... I think it's okay in practice because a drain comes after it, so
while there may be a race initially, we don't care when exactly we start
tracking as long as it's before checking the block status. But that's a
bit subtle.

We could keep creating the bitmap outside of the drain if that improves
things, but I would leave bdrv_disable_dirty_bitmap() inside the drained
section just to be very clear that there can't be a race with a request.

> >  
> >      bdrv_drained_begin(bs);
> >      ret = bdrv_append(mirror_top_bs, bs, errp);
> > -    bdrv_drained_end(bs);
> > -
> >      if (ret < 0) {
> > +        bdrv_drained_end(bs);
> >          bdrv_unref(mirror_top_bs);
> >          return NULL;
> >      }
> >  
> > +    bs_opaque->dirty_bitmap = bdrv_create_dirty_bitmap(mirror_top_bs,
> > +                                                       granularity,
> > +                                                       NULL, errp);
> > +    if (!bs_opaque->dirty_bitmap) {
> > +        bdrv_drained_end(bs);
> > +        bdrv_unref(mirror_top_bs);
> > +        return NULL;
> > +    }
> > +
> > +    /*
> > +     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in active
> > +     * mode.
> > +     */
> 
> This comment could become:
> 
> +    /*
> +     * Disabling the dirty bitmap is safe, because:
> +     *
> +     * 1. If should_copy_to_target() == true, writes will be done
> synchronously
> +     * to the target, no need to set the bitmap.
> +     *
> +     * 2. If should_copy_to_target() == false, the dirty bitmap will be
> set by
> +     * bdrv_mirror_top_do_write().
> +     */

Feels verbose to me because the implications of the existing comment
seem obvious to me, but if you think it's useful, it probably isn't as
obvious as I thought.

> > +    bdrv_disable_dirty_bitmap(bs_opaque->dirty_bitmap);
> > +    bdrv_drained_end(bs);
> > +
> >      /* Make sure that the source is not resized while the job is running */
> >      s = block_job_create(job_id, driver, NULL, mirror_top_bs,
> >                           BLK_PERM_CONSISTENT_READ,
> > @@ -2002,24 +2018,13 @@ static BlockJob *mirror_start_job(
> >      s->base_overlay = bdrv_find_overlay(bs, base);
> >      s->granularity = granularity;
> >      s->buf_size = ROUND_UP(buf_size, granularity);
> > +    s->dirty_bitmap = bs_opaque->dirty_bitmap;
> >      s->unmap = unmap;
> >      if (auto_complete) {
> >          s->should_complete = true;
> >      }
> >      bdrv_graph_rdunlock_main_loop();
> >  
> > -    s->dirty_bitmap = bdrv_create_dirty_bitmap(s->mirror_top_bs, granularity,
> > -                                               NULL, errp);
> > -    if (!s->dirty_bitmap) {
> > -        goto fail;
> > -    }
> > -
> > -    /*
> > -     * The dirty bitmap is set by bdrv_mirror_top_do_write() when not in active
> > -     * mode.
> > -     */
> > -    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
> > -
> >      bdrv_graph_wrlock_drained();
> >      ret = block_job_add_bdrv(&s->common, "source", bs, 0,
> >                               BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
> > @@ -2099,9 +2104,6 @@ fail:
> >          g_free(s->replaces);
> >          blk_unref(s->target);
> >          bs_opaque->job = NULL;
> > -        if (s->dirty_bitmap) {
> > -            bdrv_release_dirty_bitmap(s->dirty_bitmap);
> > -        }
> >          job_early_fail(&s->common.job);
> >      }
> >  
> > @@ -2115,6 +2117,7 @@ fail:
> >      bdrv_graph_wrunlock();
> >      bdrv_drained_end(bs);
> >  
> > +    bdrv_release_dirty_bitmap(bs_opaque->dirty_bitmap);
> >      bdrv_unref(mirror_top_bs);
> >  
> >      return NULL;

Kevin



  reply	other threads:[~2026-02-16 16:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-12 12:02 [PATCH] block/mirror: disable dirty bitmap only after job init Fiona Ebner
2026-02-12 12:14 ` Fiona Ebner
2026-02-16 10:24 ` Kevin Wolf
2026-02-16 12:49   ` Fiona Ebner
2026-02-16 13:01     ` Fiona Ebner
2026-02-16 15:30       ` Kevin Wolf
2026-02-16 16:12         ` Fiona Ebner
2026-02-16 16:38           ` Kevin Wolf [this message]
2026-02-17  9:14             ` Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aZNIDFp9OaNBe0NK@redhat.com \
    --to=kwolf@redhat.com \
    --cc=dionbosschieter@gmail.com \
    --cc=f.ebner@proxmox.com \
    --cc=hreitz@redhat.com \
    --cc=jean-louis@dupond.be \
    --cc=jsnow@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-stable@nongnu.org \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.