From: Peter Xu <peterx@redhat.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, sgarzare@redhat.com,
richard.henderson@linaro.org, pbonzini@redhat.com,
david@kernel.org, philmd@linaro.org, farosas@suse.de
Subject: Re: [PATCH v2 1/1] migration: merge fragmented clear_dirty ioctls
Date: Mon, 15 Dec 2025 11:32:04 -0500
Message-ID: <aUA4BGW2Faw9CMgs@x1.local>
In-Reply-To: <20251215140611.16180-2-xuchuangxclwt@bytedance.com>
On Mon, Dec 15, 2025 at 10:06:11PM +0800, Chuang Xu wrote:
> From: xuchuangxclwt <xuchuangxclwt@bytedance.com>
>
> When the addresses processed are not aligned, a large number of
> clear_dirty ioctl occur (e.g. a 4MB misaligned memory can generate
> 2048 clear_dirty ioctls from two different memory_listener),
> which increases the time required for bitmap_sync and makes it
> more difficult for dirty pages to converge.
>
> Attempt to merge those fragmented clear_dirty ioctls.
>
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> ---
> accel/tcg/cputlb.c | 5 +++--
> include/system/physmem.h | 7 ++++---
> migration/ram.c | 26 ++++++++++++------------
> system/memory.c | 2 +-
> system/physmem.c | 44 ++++++++++++++++++++++++----------------
> 5 files changed, 48 insertions(+), 36 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index fd1606c856..c8827c8b0d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -857,8 +857,9 @@ void tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
> void tlb_protect_code(ram_addr_t ram_addr)
> {
> physical_memory_test_and_clear_dirty(ram_addr & TARGET_PAGE_MASK,
> - TARGET_PAGE_SIZE,
> - DIRTY_MEMORY_CODE);
> + TARGET_PAGE_SIZE,
> + DIRTY_MEMORY_CODE,
> + NULL);
> }
>
> /* update the TLB so that writes in physical page 'phys_addr' are no longer
> diff --git a/include/system/physmem.h b/include/system/physmem.h
> index 879f6eae38..8eeace9d1f 100644
> --- a/include/system/physmem.h
> +++ b/include/system/physmem.h
> @@ -39,9 +39,10 @@ uint64_t physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
>
> void physical_memory_dirty_bits_cleared(ram_addr_t start, ram_addr_t length);
>
> -bool physical_memory_test_and_clear_dirty(ram_addr_t start,
> - ram_addr_t length,
> - unsigned client);
> +uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
> + ram_addr_t length,
> + unsigned client,
> + unsigned long *dest);
Nitpick: please consider adding documentation for this function now; both
"dest" and the return value need some explanation.
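
Something along these lines could be a starting point. It is only a sketch
based on my reading of this patch (in particular how "dest" is indexed and
what gets counted), so please double check it and reword as you see fit:

```c
/**
 * physical_memory_test_and_clear_dirty: test and clear dirty bits in a range
 *
 * @start: start address (ram_addr_t) of the range
 * @length: length of the range, in bytes
 * @client: dirty memory client, e.g. DIRTY_MEMORY_MIGRATION
 * @dest: optional bitmap indexed by target page offset within the
 *        RAMBlock; when non-NULL, each page found dirty also gets its
 *        bit set here.  Pass NULL if only the clear is wanted.
 *
 * Note: start and end must be within the same ram block.
 *
 * Returns: when @dest is NULL, the number of pages that were dirty in
 * the range; otherwise the number of bits newly set in @dest (pages
 * whose bit was already set in @dest are not counted again).
 */
uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
                                              ram_addr_t length,
                                              unsigned client,
                                              unsigned long *dest);
```
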
>
> DirtyBitmapSnapshot *
> physical_memory_snapshot_and_clear_dirty(MemoryRegion *mr, hwaddr offset,
> diff --git a/migration/ram.c b/migration/ram.c
> index 29f016cb25..2d5e979211 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -942,7 +942,6 @@ static uint64_t physical_memory_sync_dirty_bitmap(RAMBlock *rb,
> ram_addr_t start,
> ram_addr_t length)
> {
> - ram_addr_t addr;
> unsigned long word = BIT_WORD((start + rb->offset) >> TARGET_PAGE_BITS);
> uint64_t num_dirty = 0;
> unsigned long *dest = rb->bmap;
> @@ -995,18 +994,19 @@ static uint64_t physical_memory_sync_dirty_bitmap(RAMBlock *rb,
> }
> } else {
> ram_addr_t offset = rb->offset;
> -
> - for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
> - if (physical_memory_test_and_clear_dirty(
> - start + addr + offset,
> - TARGET_PAGE_SIZE,
> - DIRTY_MEMORY_MIGRATION)) {
> - long k = (start + addr) >> TARGET_PAGE_BITS;
> - if (!test_and_set_bit(k, dest)) {
> - num_dirty++;
> - }
> - }
> - }
> + unsigned long end, start_page;
> + uint64_t mr_offset, mr_size;
> +
> + num_dirty = physical_memory_test_and_clear_dirty(
> + start + offset,
> + length,
> + DIRTY_MEMORY_MIGRATION,
> + dest);
Thanks for doing this, I think this is better. Though IIUC you missed a
major benefit of the API I'm suggesting here, which is to avoid explicit
invocations of memory_region_clear_dirty_bitmap() in the callers.
IIUC you can remove the "if" check at [1] below, and then drop the five
lines right below this comment.
So callers of physical_memory_test_and_clear_dirty() should never need to
worry about remote clears; the function should always do them properly.
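
To illustrate the shape I have in mind, here is a standalone toy model, not
QEMU code. Every toy_* name is made up for this example, the bit helpers are
not atomic, toy_remote_clear() merely stands in for
memory_region_clear_dirty_bitmap(), and "dest" is indexed by the same page
number as the dirty bitmap (i.e. a block offset of zero is assumed):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Counts how often the (hypothetical) remote clear is issued. */
static unsigned toy_remote_clears;

/* Stand-in for memory_region_clear_dirty_bitmap(); only counts calls. */
static void toy_remote_clear(size_t first_page, size_t npages)
{
    (void)first_page;
    (void)npages;
    toy_remote_clears++;
}

/* Non-atomic test-and-clear; returns the previous value of the bit. */
static int toy_test_and_clear_bit(unsigned long *map, size_t nr)
{
    unsigned long mask = 1UL << (nr % TOY_BITS_PER_LONG);
    unsigned long *word = &map[nr / TOY_BITS_PER_LONG];
    int old = (*word & mask) != 0;

    *word &= ~mask;
    return old;
}

/* Non-atomic test-and-set; returns the previous value of the bit. */
static int toy_test_and_set_bit(unsigned long *map, size_t nr)
{
    unsigned long mask = 1UL << (nr % TOY_BITS_PER_LONG);
    unsigned long *word = &map[nr / TOY_BITS_PER_LONG];
    int old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/*
 * Clear dirty bits for [first_page, first_page + npages), optionally
 * merging them into dest, then issue ONE remote clear for the whole
 * range no matter whether dest was passed or anything was dirty.
 */
static uint64_t toy_test_and_clear_dirty(unsigned long *dirty,
                                         size_t first_page, size_t npages,
                                         unsigned long *dest)
{
    uint64_t num_dirty = 0;
    size_t page;

    assert(dirty != NULL);
    if (npages == 0) {
        return 0;
    }

    for (page = first_page; page < first_page + npages; page++) {
        if (!toy_test_and_clear_bit(dirty, page)) {
            continue;
        }
        /* With dest, count only bits that were not already set there. */
        if (!dest || !toy_test_and_set_bit(dest, page)) {
            num_dirty++;
        }
    }

    /* The caller never has to do this itself. */
    toy_remote_clear(first_page, npages);

    return num_dirty;
}
```

With that shape the migration caller shrinks to a single call that passes
rb->bmap as "dest", and the number of remote clears per sync depends on how
many ranges are synced rather than on how many pages they contain.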
> + end = TARGET_PAGE_ALIGN(start + offset + length) >> TARGET_PAGE_BITS;
> + start_page = (start + offset) >> TARGET_PAGE_BITS;
> + mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - offset;
> + mr_size = (end - start_page) << TARGET_PAGE_BITS;
> + memory_region_clear_dirty_bitmap(rb->mr, mr_offset, mr_size);
> }
>
> return num_dirty;
> diff --git a/system/memory.c b/system/memory.c
> index 8b84661ae3..666364392d 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2424,7 +2424,7 @@ void memory_region_reset_dirty(MemoryRegion *mr, hwaddr addr,
> {
> assert(mr->ram_block);
> physical_memory_test_and_clear_dirty(
> - memory_region_get_ram_addr(mr) + addr, size, client);
> + memory_region_get_ram_addr(mr) + addr, size, client, NULL);
> }
>
> int memory_region_get_fd(MemoryRegion *mr)
> diff --git a/system/physmem.c b/system/physmem.c
> index c9869e4049..d015eb2133 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -1090,18 +1090,19 @@ void physical_memory_set_dirty_range(ram_addr_t start, ram_addr_t length,
> }
>
> /* Note: start and end must be within the same ram block. */
> -bool physical_memory_test_and_clear_dirty(ram_addr_t start,
> +uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
> ram_addr_t length,
> - unsigned client)
> + unsigned client,
> + unsigned long *dest)
> {
> DirtyMemoryBlocks *blocks;
> unsigned long end, page, start_page;
> - bool dirty = false;
> + uint64_t num_dirty = 0;
> RAMBlock *ramblock;
> uint64_t mr_offset, mr_size;
>
> if (length == 0) {
> - return false;
> + return 0;
> }
>
> end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS;
> @@ -1118,31 +1119,40 @@ bool physical_memory_test_and_clear_dirty(ram_addr_t start,
> while (page < end) {
> unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
> unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
> - unsigned long num = MIN(end - page,
> - DIRTY_MEMORY_BLOCK_SIZE - offset);
>
> - dirty |= bitmap_test_and_clear_atomic(blocks->blocks[idx],
> - offset, num);
> - page += num;
> + if (bitmap_test_and_clear_atomic(blocks->blocks[idx], offset, 1)) {
> + if (dest) {
> + unsigned long k = page - (ramblock->offset >> TARGET_PAGE_BITS);
> + if (!test_and_set_bit(k, dest)) {
> + num_dirty++;
> + }
> + } else {
> + num_dirty++;
> + }
> + }
> +
> + page++;
> }
>
> - mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> - mr_size = (end - start_page) << TARGET_PAGE_BITS;
> - memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> + if (!dest && num_dirty) {
[1]
> + mr_offset = (ram_addr_t)(start_page << TARGET_PAGE_BITS) - ramblock->offset;
> + mr_size = (end - start_page) << TARGET_PAGE_BITS;
> + memory_region_clear_dirty_bitmap(ramblock->mr, mr_offset, mr_size);
> + }
> }
>
> - if (dirty) {
> + if (num_dirty) {
> physical_memory_dirty_bits_cleared(start, length);
> }
>
> - return dirty;
> + return num_dirty;
> }
>
> static void physical_memory_clear_dirty_range(ram_addr_t addr, ram_addr_t length)
> {
> - physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_MIGRATION);
> - physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_VGA);
> - physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_CODE);
> + physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_MIGRATION, NULL);
> + physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_VGA, NULL);
> + physical_memory_test_and_clear_dirty(addr, length, DIRTY_MEMORY_CODE, NULL);
> }
>
> DirtyBitmapSnapshot *physical_memory_snapshot_and_clear_dirty
> --
> 2.39.3 (Apple Git-146)
>
--
Peter Xu