From: Peter Xu <peterx@redhat.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, sgarzare@redhat.com,
richard.henderson@linaro.org, pbonzini@redhat.com,
david@kernel.org, philmd@linaro.org, farosas@suse.de
Subject: Re: [RFC v1 2/2] migration: merge fragmented clear_dirty ioctls
Date: Wed, 10 Dec 2025 14:45:01 -0500 [thread overview]
Message-ID: <aTnNvcvHJBvnbExC@x1.local> (raw)
In-Reply-To: <58c1e3f7-be13-4a2e-a1e6-0a8295d83345@bytedance.com>
On Wed, Dec 10, 2025 at 10:18:21PM +0800, Chuang Xu wrote:
> Hi, Peter
>
> On 10/12/2025 05:10, Peter Xu wrote:
> > On Mon, Dec 08, 2025 at 08:09:52PM +0800, Chuang Xu wrote:
> >> From: xuchuangxclwt <xuchuangxclwt@bytedance.com>
> >>
> >> When the addresses processed are not aligned, a large number of
> >> clear_dirty ioctls occur (e.g. 16MB of misaligned memory can generate
> >> 4096 clear_dirty ioctls), which increases the time required for
> >> bitmap_sync and makes it more difficult for dirty pages to converge.
> >>
> >> Attempt to merge those fragmented clear_dirty ioctls.
> > (besides separate perf results I requested as in the cover letter reply..)
> >
> > Could you add something into the commit log explaining at least one example
> > that you observe? E.g. what is the VM setup, how many ramblocks are the
> > ones not aligned?
> >
> > Have you considered setting rb->clear_bmap when it's available? It'll
> > postpone the remote clear even further, until the page is sent.
> > Logically it should be more efficient, but it may well depend on the
> > size of the unaligned ramblocks you're hitting, as clear_bmap isn't
> > PAGE_SIZE-based but can be much bigger. Some discussion on how you
> > chose the current solution would be nice.
> >
>
> On my Intel(R) Xeon(R) 6986P-C (previous tests were based on Cascade Lake),
> I added some logs. Here are some examples of unaligned memory I observed:
> size 1966080: system.flash0
> size 131072: /rom@etc/acpi/tables, isa-bios, system.flash1, pc.rom
> size 65536: cirrus_vga.rom
>
> Taking system.flash0 as an example, judging from its size, this should
> be the OVMF I'm using. This memory segment triggers clear_dirty in both
> the "kvm-memory" and "kvm-smram" memory_listeners simultaneously,
> ultimately resulting in a total of 960 kvm_ioctl calls. If a larger
> OVMF is used, this number will obviously worsen.
>
> On the machine I tested, clearing system.flash0 took a total of 49ms,
> and clearing all unaligned memory took a total of 61ms.
Thanks for the numbers. It might be more helpful if you also share the
other relevant numbers, e.g. referring to your cover letter definitions:

- total bitmap sync time (x)
- iteration transfer time (t)

That'll provide a full picture of how much time is wasted on per-page
log_clear(). You can then compare those against the clear() overhead
after your patch is applied. I think that might be a better way to
justify the patch.
In general, it looks reasonable to me.

Having a bool parameter for the old physical_memory_test_and_clear_dirty()
is fine, but it might be slightly error-prone for new callers if they set
it to false by accident.
Another way to do it: since physical_memory_test_and_clear_dirty() always
takes a range, we can pass the bitmap ("dest") into it when available, and
change the retval to be num_dirty:

(1) when dest is provided, only account a page when its "dest" bit wasn't
    set already
(2) when dest==NULL, count all dirty bits into num_dirty

Maybe that'll look better; please double check.
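
Something like the sketch below, just to illustrate the accounting (the
signature and the test_and_clear_page_dirty() helper are placeholders
rather than the current code, so treat it as a sketch only):

    /*
     * Return the number of dirty pages found in [start, start+length).
     * When "dest" is non-NULL, a page is only accounted if its bit in
     * "dest" was not already set; when "dest" is NULL, every dirty page
     * found is counted.
     */
    uint64_t physical_memory_test_and_clear_dirty(ram_addr_t start,
                                                  ram_addr_t length,
                                                  unsigned client,
                                                  unsigned long *dest)
    {
        uint64_t num_dirty = 0;
        ram_addr_t addr;

        for (addr = start; addr < start + length; addr += TARGET_PAGE_SIZE) {
            unsigned long offset = (addr - start) >> TARGET_PAGE_BITS;

            /* placeholder for the existing per-page test-and-clear logic */
            if (!test_and_clear_page_dirty(addr, client)) {
                continue;
            }
            if (dest) {
                /* (1) only account pages whose dest bit wasn't set yet */
                if (!test_and_set_bit(offset, dest)) {
                    num_dirty++;
                }
            } else {
                /* (2) no dest bitmap: every dirty bit counts */
                num_dirty++;
            }
        }
        return num_dirty;
    }

The old boolean callers can then simply check "num_dirty > 0", and the
migration path can pass its local sync bitmap as "dest" without needing
an extra bool flag.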
Looks like patch 1 will be dropped. You can repost this patch alone
whenever you feel ready, in whatever way you see fit.

Thanks,
>
> Regarding why the current solution was chosen: I'm not sure whether
> there were any special considerations in the initial design of
> clear_dirty_log for not handling unaligned memory paths. Therefore, I
> chose to keep most of the logic the same as the existing one, only
> extracting and merging the actual clear_dirty operations.
>
> Thanks.
>
--
Peter Xu