Re: [RFC PATCH 1/5] migration: add RAM Block fields and helpers for fast snapshot load

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: Aadeshveer Singh <aadeshveer07@gmail.com>
Cc: qemu-devel@nongnu.org, farosas@suse.de, pbonzini@redhat.com,
	philmd@mailo.com, lvivier@redhat.com, ayoub@saferwall.com
Subject: Re: [RFC PATCH 1/5] migration: add RAM Block fields and helpers for fast snapshot load
Date: Wed, 24 Jun 2026 10:28:31 -0400	[thread overview]
Message-ID: <ajvpj_pxXpvGWKb4@x1.local> (raw)
In-Reply-To: <CA++cPvJGwOQpSD3zTNAmpxE2JSCZrh5K4taVuAavDMOggBLcJQ@mail.gmail.com>

On Wed, Jun 24, 2026 at 12:36:16PM +0530, Aadeshveer Singh wrote:
> On Mon, Jun 22, 2026 at 9:53 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Jun 18, 2026 at 08:50:06AM +0530, Aadeshveer Singh wrote:
> > > Add two fields per RAMBlock:
> > >
> > > - nonzeropages: Mirrors the mapped-ram bitmap for storing which pages
> > >   are present in file and which are zero.
> > > - pending_bmap: Bitmap to store internal state of which pages have been
> > >   read by some thread to ensure coordination between threads.
> > >
> > > Both fields are allocated and initialized in ram_load_setup and freed in
> > > ram_load_cleanup. nonzeropages is populated in parse_ramblock_mapped_ram
> > > eliminating the use of a temporary bitmap.
> > >
> > > Change ram_load() to load using ram_load_precopy() in case of fast
> > > snapshot load.
> > >
> > > Also add migrate_fast_snapshot_load() returning true when both
> > > postcopy-ram and mapped-ram capabilities are set.
> > >
> > > Update qemu_get_buffer_at() to not set error to make it thread safe. All
> > > the callers of qemu_get_buffer_at(), take care of error handling.
> > >
> > > Signed-off-by: Aadeshveer Singh <aadeshveer07@gmail.com>
> > > ---
> > >  include/system/ramblock.h |  8 +++++
> > >  migration/options.c       |  5 ++++
> > >  migration/options.h       |  1 +
> > >  migration/qemu-file.c     | 10 +------
> > >  migration/ram.c           | 61 ++++++++++++++++++++++++++++++++-------
> > >  5 files changed, 65 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/include/system/ramblock.h b/include/system/ramblock.h
> > > index 4435f8d55f..73275d0459 100644
> > > --- a/include/system/ramblock.h
> > > +++ b/include/system/ramblock.h
> > > @@ -60,6 +60,14 @@ struct RAMBlock {
> > >
> > >      /* Bitmap of already received pages.  Only used on destination side. */
> > >      unsigned long *receivedmap;
> > > +    /* Bitmap of zero pages. Used for fast snapshot load. */
> > > +    unsigned long *nonzeropages;
> >
> > We have file_bmap, only used on source for now.  I think we can safely
> > reuse it by caching the pointer allocated.
> Agreed will drop nonzeropages and reuse file_bmap for this in v2
> 
> >
> > > +    /*
> > > +     * Bitmap for pages that are yet to be read from disk. It is required for
> > > +     * fault thread and eager thread to keep note of which pages are currently
> > > +     * being read. Used by fast snapshot load.
> > > +     */
> > > +    unsigned long *pending_bmap;
> >
> > We have receivedmap right above, and it's always allocated on dest.  IIUC
> > we can directly use it.
> >
> > It's also already set by uffd helpers, see qemu_ufd_copy_ioctl().
> > Currently it's a bit ugly put under a "if (!ret)"..  if you want you can
> > clean it up a bit.
> >
> > There, we may want to skip page_requested or page_request_mutex operations
> > for file load because they're not necessary.
> >
> 
> I looked into reusing receivedmap here instead of creating
> pending_bmap, but doing so creates a race condition that breaks the
> postcopy blocktime calculations.
> 
> receivedmap strictly tracks pages that have been successfully placed
> in the VM. My pending_bmap tracks pages that any thread is actively
> loading (reading from disk).

Yes, I think if we want to reuse the bitmap, we need to redefine
receivedmap a little bit.

Remote postcopy needs to remember it not only to maintain the GTree for
page_requested, but also when postcopy is interrupted it needs to sync the
bitmap to source to know what exactly have landed destination.  Here
setting the bit _after_ the placement is critical.

In your case, you don't need to remember "what pages we have loaded".

So if we could redefine receivedmap in your case to be "who will place this
page, eager load thread or fault thread", then it could work.

Said that, maybe this will need some work to make it work, you can evaluate
how much needed before going too deep.  Then if you think it's still easier
to have the new bitmap, then it's still ok to consider having it.

It may also matter on how you plan to refactor qemu_ufd_copy_ioctl(),
maybe.  So you can take all these problems together in mind when
considering the final decision.

> 
> If we use receivedmap to track in-load pages, and a vCPU faults on a
> page that the eager thread is currently reading, the fault thread will
> see the bit set in receivedmap. It will then incorrectly assume the
> page is already in the VM and skip tracking the blocktime for that
> fault, even though the eager thread hasn't actually called place_page
> yet. Keeping pending_bmap separate ensures the fault thread accurately
> tracks blocktime until placement is actually complete.
> 
> Regarding the cleanup in qemu_ufd_copy_ioctl(), I agree we can skip
> the page_requested operations for the file load. I also noticed a
> potential race where if the eager thread places a page just before the
> fault thread attempts to, we might hit an assert there. I will look
> into returning safely instead of asserting to harden this, and I will
> include the cleanup in v2.

What is the assertion about?

Logically if the fault thread will atomically fetch and set the bitmap bits
(either your new bitmap, or reuse receivedmap), then it should already see
the eager thread set the bit already, so it should avoid further playing
with the page.  Vice versa.

But as long as you have clue on how to resolve it and move on, we're good!
You can mention that in the next cover letter then we can review it in the
next version directly.

Thanks,

-- 
Peter Xu

next prev parent reply	other threads:[~2026-06-24 14:28 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18  3:20 [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-18  3:20 ` [RFC PATCH 1/5] migration: add RAM Block fields and helpers for " Aadeshveer Singh
2026-06-22 16:23   ` Peter Xu
2026-06-24  7:06     ` Aadeshveer Singh
2026-06-24 14:28       ` Peter Xu [this message]
2026-06-18  3:20 ` [RFC PATCH 2/5] migration: add support for fault thread to load pages from disk Aadeshveer Singh
2026-06-22 18:32   ` Peter Xu
2026-06-24 14:36   ` Fabiano Rosas
2026-06-18  3:20 ` [RFC PATCH 3/5] migration: add eager load thread for fast snapshot load Aadeshveer Singh
2026-06-22 18:50   ` Peter Xu
2026-06-24 14:50   ` Fabiano Rosas
2026-06-18  3:20 ` [RFC PATCH 4/5] migration: write up code to run fast snapshot load in qemu_loadvm_state Aadeshveer Singh
2026-06-22 19:16   ` Peter Xu
2026-06-24 15:10   ` Fabiano Rosas
2026-06-18  3:20 ` [RFC PATCH 5/5] migration/tests: remove capability conflict test postcopy-ram+mapped-ram Aadeshveer Singh
2026-06-22 18:51   ` Peter Xu
2026-06-19 13:18 ` [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-22 19:19   ` Peter Xu
2026-06-24  7:00     ` Aadeshveer Singh
2026-06-24 14:13 ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajvpj_pxXpvGWKb4@x1.local \
    --to=peterx@redhat.com \
    --cc=aadeshveer07@gmail.com \
    --cc=ayoub@saferwall.com \
    --cc=farosas@suse.de \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@mailo.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.