qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, sgarzare@redhat.com,
	richard.henderson@linaro.org, pbonzini@redhat.com,
	david@kernel.org, philmd@linaro.org, farosas@suse.de
Subject: Re: [PATCH v2 1/1] migration: merge fragmented clear_dirty ioctls
Date: Mon, 15 Dec 2025 11:32:04 -0500	[thread overview]
Message-ID: <aUA4BGW2Faw9CMgs@x1.local> (raw)
In-Reply-To: <20251215140611.16180-2-xuchuangxclwt@bytedance.com>

On Mon, Dec 15, 2025 at 10:06:11PM +0800, Chuang Xu wrote:
> From: xuchuangxclwt <xuchuangxclwt@bytedance.com>
> 
> When the addresses processed are not aligned, a large number of
> clear_dirty ioctl occur (e.g. a 4MB misaligned memory can generate
> 2048 clear_dirty ioctls from two different memory_listener),
> which increases the time required for bitmap_sync and makes it
> more difficult for dirty pages to converge.
> 
> Attempt to merge those fragmented clear_dirty ioctls.
> 
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> ---
>  accel/tcg/cputlb.c       |  5 +++--
>  include/system/physmem.h |  7 ++++---
>  migration/ram.c          | 26 ++++++++++++------------
>  system/memory.c          |  2 +-
>  system/physmem.c         | 44 ++++++++++++++++++++++++----------------
>  5 files changed, 48 insertions(+), 36 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index fd1606c856..c8827c8b0d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -857,8 +857,9 @@ void tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
>  void tlb_protect_code(ram_addr_t ram_addr)
>  {
>      physical_memory_test_and_clear_dirty(ram_addr & TARGET_PAGE_MASK,
> -                                             TARGET_PAGE_SIZE,
> -                                             DIRTY_MEMORY_CODE);
> +                                         TARGET_PAGE_SIZE,
> +                                         DIRTY_MEMORY_CODE,
> +                                         NULL);
>  }
>  
>  /* update the TLB so that writes in physical page 'phys_addr' are no longer
> diff --git a/include/system/physmem.h b/include/system/physmem.h
> index 879f6eae38..8eeace9d1f 100644
> --- a/include/system/physmem.h
> +++ b/include/system/physmem.h
> @@ -39,9 +39,10 @@ uint64_t physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>  
>  void physical_memory_dirty_bits_cleared(ram_addr_t start, ram_addr_t length);
>  
> -bool physical_memory_test_and_clear_dirty(ram_addr_t start,
> -                                          ram_addr_t length,
> -                                          unsigned client);
> +uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
> +                                              ram_addr_t length,
> +                                              unsigned client,
> +                                              unsigned long *dest);

Nitpick: please consider adding doc for this function now, both "dest" and
retval may need some explanations.

>  
>  DirtyBitmapSnapshot *
>  physical_memory_snapshot_and_clear_dirty(MemoryRegion *mr, hwaddr offset,
> diff --git a/migration/ram.c b/migration/ram.c
> index 29f016cb25..2d5e979211 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -942,7 +942,6 @@ static uint64_t physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>                                                    ram_addr_t start,
>                                                    ram_addr_t length)
>  {
> -    ram_addr_t addr;
>      unsigned long word = BIT_WORD((start + rb->offset) >> TARGET_PAGE_BITS);
>      uint64_t num_dirty = 0;
>      unsigned long *dest = rb->bmap;
> @@ -995,18 +994,19 @@ static uint64_t physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>          }
>      } else {
>          ram_addr_t offset = rb->offset;
> -
> -        for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> -            if (physical_memory_test_and_clear_dirty(
> -                        start + addr + offset,
> -                        TARGET_PAGE_SIZE,
> -                        DIRTY_MEMORY_MIGRATION)) {
> -                long k = (start + addr) >> TARGET_PAGE_BITS;
> -                if (!test_and_set_bit(k, dest)) {
> -                    num_dirty++;
> -                }
> -            }
> -        }
> +        unsigned long end, start_page;
> +        uint64_t mr_offset, mr_size;
> +
> +        num_dirty = physical_memory_test_and_clear_dirty(
> +                        start + offset,
> +                        length,
> +                        DIRTY_MEMORY_MIGRATION,
> +                        dest);

Thanks for doing this, I think this is better.  Though IIUC you missed a
major benefit of the current API I'm suggesting here, which is to avoid
explicit invokations of memory_region_clear_dirty_bitmap().

IIUC you can remove the "if" check at [1] below, then you can drop the five
lines here afterwards.

So physical_memory_test_and_clear_dirty() should never worry about remote
clears, and it should always be done properly.

> +        end = TARGET_PAGE_ALIGN(start + offset + length) >> TARGET_PAGE_BITS;
> +        start_page = (start + offset) >> TARGET_PAGE_BITS;
> +        mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - offset;
> +        mr_size = (end - start_page) << TARGET_PAGE_BITS;
> +        memory_region_clear_dirty_bitmap(rb->mr, mr_offset, mr_size);
>      }
>  
>      return num_dirty;
> diff --git a/system/memory.c b/system/memory.c
> index 8b84661ae3..666364392d 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2424,7 +2424,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
>  {
>      assert(mr->ram_block);
>      physical_memory_test_and_clear_dirty(
> -        memory_region_get_ram_addr(mr) + addr, size, client);
> +        memory_region_get_ram_addr(mr) + addr, size, client, NULL);
>  }
>  
>  int memory_region_get_fd(MemoryRegion *mr)
> diff --git a/system/physmem.c b/system/physmem.c
> index c9869e4049..d015eb2133 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -1090,18 +1090,19 @@ void physical_memory_set_dirty_range(ram_addr_t start, ram_addr_t length,
>  }
>  
>  /* Note: start and end must be within the same ram block.  */
> -bool physical_memory_test_and_clear_dirty(ram_addr_t start,
> +uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
>                                                ram_addr_t length,
> -                                              unsigned client)
> +                                              unsigned client,
> +                                              unsigned long *dest)
>  {
>      DirtyMemoryBlocks *blocks;
>      unsigned long end, page, start_page;
> -    bool dirty = false;
> +    uint64_t num_dirty = 0;
>      RAMBlock *ramblock;
>      uint64_t mr_offset, mr_size;
>  
>      if (length == 0) {
> -        return false;
> +        return 0;
>      }
>  
>      end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> @@ -1118,31 +1119,40 @@ bool physical_memory_test_and_clear_dirty(ram_addr_t start,
>          while (page < end) {
>              unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
>              unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
> -            unsigned long num = MIN(end - page,
> -                                    DIRTY_MEMORY_BLOCK_SIZE - offset);
>  
> -            dirty |= bitmap_test_and_clear_atomic(blocks->blocks[idx],
> -                                                  offset, num);
> -            page += num;
> +            if (bitmap_test_and_clear_atomic(blocks->blocks[idx], offset, 1)) {
> +                if (dest) {
> +                    unsigned long k = page - (ramblock->offset >> TARGET_PAGE_BITS);
> +                    if (!test_and_set_bit(k, dest)) {
> +                        num_dirty++;
> +                    }
> +                } else {
> +                    num_dirty++;
> +                }
> +            }
> +
> +            page++;
>          }
>  
> -        mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> -        mr_size = (end - start_page) << TARGET_PAGE_BITS;
> -        memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> +        if (!dest && num_dirty) {

[1]

> +            mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> +            mr_size = (end - start_page) << TARGET_PAGE_BITS;
> +            memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> +        }
>      }
>  
> -    if (dirty) {
> +    if (num_dirty) {
>          physical_memory_dirty_bits_cleared(start, length);
>      }
>  
> -    return dirty;
> +    return num_dirty;
>  }
>  
>  static void physical_memory_clear_dirty_range(ram_addr_t addr, ram_addr_t length)
>  {
> -    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_MIGRATION);
> -    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_VGA);
> -    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_CODE);
> +    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_MIGRATION, NULL);
> +    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_VGA, NULL);
> +    physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_CODE, NULL);
>  }
>  
>  DirtyBitmapSnapshot *physical_memory_snapshot_and_clear_dirty
> -- 
> 2.39.3 (Apple Git-146)
> 

-- 
Peter Xu



  reply	other threads:[~2025-12-15 16:33 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-15 14:06 [PATCH v2 0/1] migration: reduce bitmap sync time and make dirty pages converge much more easily Chuang Xu
2025-12-15 14:06 ` [PATCH v2 1/1] migration: merge fragmented clear_dirty ioctls Chuang Xu
2025-12-15 16:32   ` Peter Xu [this message]
2025-12-15 16:26 ` [PATCH v2 0/1] migration: reduce bitmap sync time and make dirty pages converge much more easily Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aUA4BGW2Faw9CMgs@x1.local \
    --to=peterx@redhat.com \
    --cc=david@kernel.org \
    --cc=farosas@suse.de \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sgarzare@redhat.com \
    --cc=xuchuangxclwt@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).