Re: [RFC v1 2/2] migration: merge fragmented clear_dirty ioctls

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, sgarzare@redhat.com,
	richard.henderson@linaro.org, pbonzini@redhat.com,
	david@kernel.org, philmd@linaro.org, farosas@suse.de
Subject: Re: [RFC v1 2/2] migration: merge fragmented clear_dirty ioctls
Date: Wed, 10 Dec 2025 14:45:01 -0500	[thread overview]
Message-ID: <aTnNvcvHJBvnbExC@x1.local> (raw)
In-Reply-To: <58c1e3f7-be13-4a2e-a1e6-0a8295d83345@bytedance.com>

On Wed, Dec 10, 2025 at 10:18:21PM +0800, Chuang Xu wrote:
> Hi, Peter
> 
> On 10/12/2025 05:10, Peter Xu wrote:
> > On Mon, Dec 08, 2025 at 08:09:52PM +0800, Chuang Xu wrote:
> >> From: xuchuangxclwt <xuchuangxclwt@bytedance.com>
> >>
> >> When the addresses processed are not aligned, a large number of
> >> clear_dirty ioctl occur (e.g. a 16MB misaligned memory can generate
> >> 4096 clear_dirty ioctls), which increases the time required for
> >> bitmap_sync and makes it more difficult for dirty pages to converge.
> >>
> >> Attempt to merge those fragmented clear_dirty ioctls.
> > (besides separate perf results I requested as in the cover letter reply..)
> >
> > Could you add something into the commit log explaining at least one example
> > that you observe?  E.g. what is the VM setup, how many ramblocks are the
> > ones not aligned?
> >
> > Have you considered setting rb->clear_bmap when it's available?  It'll
> > postpone the remote clear even further until page sent.  Logically it
> > should be more efficient, but it may depend on the size of unaligned
> > ramblocks that you're hitting indeed, as clear_bmap isn't PAGE_SIZE based
> > but it can be much bigger.  Some discussion around this would be nice on
> > how you chose the current solution.
> >
> 
> On my Intel(R) Xeon(R) 6986P-C(previous tests were based on Cascade Lake),
> I add some logs. Here are some examples of unaligned memory I observed:
> size 1966080: system.flash0
> size 131072: /rom@etc/acpi/tables, isa-bios, system.flash1, pc.rom
> size 65536: cirrus_vga.rom
> 
> Taking system.flash0 as an example, judging from its size, this should 
> be the OVMF I'm using.
> This memory segment will trigger clear_dirty in both memory_listener 
> "kvm-memory" and
> memory_listener "kvm-smram" simultaneously, ultimately resulting in a 
> total of 960 kvm_ioctl calls.
> If a larger OVMF is used, this number will obviously worsen.
> 
> On the machine I tested, clear system.flash0 took a total of 49ms,
> and clear all unaligned memory took a total of 61ms.

Thanks for the numbers.  It might be more helpful if you share the other
relevant numbers, e.g. referring to your cover letter definitions,

  - total bitmap sync time (x)
  - iteration transfer time (t)

that'll provide a full picture of how much wasted on per-page log_clear().

You can use those to compare to the clear() operation it took after your
patch applied.  I think that might be a better way to justify the patch.

In general, it looks reasonable to me.

Having a bool parameter for the old physical_memory_test_and_clear_dirty()
is fine but might be slightly error prone for new callers if they set it to
false by accident.

Another way to do it is, since physical_memory_test_and_clear_dirty()
always takes a range, we can pass the bitmap ("dest") into it when
available, then changing the retval to be num_dirty:

  (1) when dest provided, only account it when "dest" bitmap wasn't set
  already

  (2) when dest==NULL, treat all dirty bits into num_dirty

Maybe that'll look better, you can double check.

Looks like patch 1 will be dropped. You can repost this patch alone when
you feel ready whatever way you see fit.

Thanks,

> 
> Regarding why the current solution was chosen, because I'm not sure if 
> there were any
> special considerations in the initial design of clear_dirty_log for not 
> applying unaligned memory paths.
> Therefore, I chose to keep most of the logic the same as the existing one,
> only extracting and merging the actual clear_dirty operations.
> 
> Thanks.
> 

-- 
Peter Xu

next prev parent reply	other threads:[~2025-12-10 19:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-08 12:09 [RFC v1 0/2] migration: reduce bitmap sync time and make dirty pages converge much more easily Chuang Xu
2025-12-08 12:09 ` [RFC v1 1/2] vhost: eliminate duplicate dirty_bitmap sync when log shared by multiple devices Chuang Xu
2025-12-09 20:47   ` Peter Xu
2025-12-10  6:51     ` Jason Wang
2025-12-10 13:52       ` Chuang Xu
2025-12-08 12:09 ` [RFC v1 2/2] migration: merge fragmented clear_dirty ioctls Chuang Xu
2025-12-09 21:10   ` Peter Xu
2025-12-10 14:18     ` Chuang Xu
2025-12-10 19:45       ` Peter Xu [this message]
2025-12-09 21:06 ` [RFC v1 0/2] migration: reduce bitmap sync time and make dirty pages converge much more easily Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aTnNvcvHJBvnbExC@x1.local \
    --to=peterx@redhat.com \
    --cc=david@kernel.org \
    --cc=farosas@suse.de \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sgarzare@redhat.com \
    --cc=xuchuangxclwt@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.