All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Aadeshveer Singh <aadeshveer07@gmail.com>
Cc: qemu-devel@nongnu.org, farosas@suse.de, pbonzini@redhat.com,
	philmd@mailo.com, lvivier@redhat.com, ayoub@saferwall.com
Subject: Re: [RFC PATCH 2/5] migration: add support for fault thread to load pages from disk
Date: Mon, 22 Jun 2026 14:32:03 -0400	[thread overview]
Message-ID: <ajl_oz63TGnhMwvL@x1.local> (raw)
In-Reply-To: <20260618032010.88755-3-aadeshveer07@gmail.com>

On Thu, Jun 18, 2026 at 08:50:07AM +0530, Aadeshveer Singh wrote:
> In fast snapshot load, we would like to serve faults as soon as possible
> hence loading pages directly instead of requesting a source
> 
> Add postcopy_mapped_ram_load_page() function which serves single page
> fault by reading the snapshot file. It uses bitmap_test_and_clear_atomic
> on pending_bmap to coordinate between threads so each page is loaded
> exactly once. Non-zero pages are read using qemu_get_buffer_at into a
> temporary page (for loading page atomically), which is then placed using
> postcopy_place_page. Zero pages are placed directly using
> postcopy_place_page_zero.
> 
> Update postcopy_ram_fault_thread to call postcopy_mapped_ram_load_page
> instead of requesting source in case of fast snapshot load. to_src_file
> check is bypassed in fast snapshot load case as there is no source
> 
> Allocate another channel in postcopy_temp_pages_setup(like the preempt
> case), for both the fault thread and eager thread to load pages
> independently.
> 
> In case of failure to read required page crash the system using assert
> as disk failure is critical and VM cannot be recovered.
> 
> Signed-off-by: Aadeshveer Singh <aadeshveer07@gmail.com>
> ---
>  migration/postcopy-ram.c | 92 ++++++++++++++++++++++++++++++++--------
>  1 file changed, 75 insertions(+), 17 deletions(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index f5ef93f193..1ec20a07dd 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -949,6 +949,53 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
>                         pagesize);
>  }
>  
> +/*
> + * Load a page from RAMBlock at offset at given host address.
> + * Used by postcopy ram fault thread and eager thread in fast snapshot load
> + * case. rb_offset: Offset of page in RAMBlock haddr: Base of page where to load
> + * in page Channel: Used to identify between threads and use corresponding temp
> + * pages Returns 0 on success
> + */

Somehow this paragraph is not properly formatted with newlines.  If you
want to provide a full doc of the function, you can follow kernel-doc
format:

https://docs.kernel.org/doc-guide/kernel-doc.html

/**
 * function_name() - desc
 *
 * @arg1: desc for @arg1
 * @arg2: desc for @arg2
 * ...

> +static int postcopy_mapped_ram_load_page(MigrationIncomingState *mis,
> +                                         RAMBlock *rb, ram_addr_t rb_offset,
> +                                         uint64_t haddr, int channel)
> +{
> +    int ret = 0;
> +    unsigned long page;
> +    void *host = (void *)haddr;

Can drop this var.

> +    void *place_source = mis->postcopy_tmp_pages[channel].tmp_huge_page;
> +    size_t read;

Nit: can use reverse christmas tree.

> +
> +    page = rb_offset >> TARGET_PAGE_BITS;
> +
> +    if (bitmap_test_and_clear_atomic(rb->pending_bmap, page, 1)) {
> +        if (test_bit(page, rb->nonzeropages)) {
> +            /*
> +             * qemu_get_buffer_at uses preadv which is thread safe we do not
> +             * need different channels
> +             */

Slightly misleading when the two threads do not use the same "channel" (or
say, temp pages..).  Maybe what you wanted to emphasize is QEMU might have
more than one thread using this function to install pages.  In that case,
it can be put as:

               /*
                * This can happen concurrently, but it's thread-safe because
                * qemu_get_buffer_at() is thread-safe, and the caller will be
                * using different temporary buffers.
                */

> +            read = qemu_get_buffer_at(mis->from_src_file, place_source,
> +                                      TARGET_PAGE_SIZE,
> +                                      rb->pages_offset + rb_offset);
> +
> +            g_assert(read == TARGET_PAGE_SIZE);

Two things can be improved on errors:

- When an error can be reached with user input, logically we shouldn't
  assert(), assert() is only for program errors.  Here it's possible an
  user specified a broken image, then we should cleanly exit with an err
  code.

- Still better to report the error to the caller and only handle error at
  the very top caller after throwing the error to stderr.

I did suggest that we can assert on loading failures when we talked, but I
confess I was not clear on how to do, sorry.  Let's use assert() only if
it's a programming error.

Postcopy didn't do as good on error reporting, normally nowadays QEMU
suggests to use "bool function(..., Error **errp)" as function interface,
return false if failure hit.  You can do it with the new functions like
this, then return the Error** object to the top caller and do one
error_report() before exit().

Alone the way if you want to convert some postcopy code to start using
Error** it'll be even better.  Feel free to have a look at the comments at
the start of include/qapi/error.h on the suggested way to handle errors in
QEMU.

> +
> +            ret = postcopy_place_page(mis, host, place_source, rb);
> +            if (ret) {
> +                return ret;
> +            }
> +
> +        } else {
> +            /* zero page */

Nit, we can drop this comment.

> +            ret = postcopy_place_page_zero(mis, host, rb);
> +            if (ret) {
> +                return ret;
> +            }
> +        }
> +    }
> +    return ret;
> +}
> +
>  /*
>   * NOTE: @tid is only used when postcopy-blocktime feature is enabled, and
>   * also optional: when zero is provided, the fault accounting will be ignored.
> @@ -1320,11 +1367,11 @@ static void *postcopy_ram_fault_thread(void *opaque)
>              break;
>          }
>  

We can leave below comments as-is, then add one line here to explain:

           /*
            * Fast snapshot load doesn't support pause and recover, because
            * it's not necessary: we can fail right away when QEMU just booted
            * with nothing to lose.
            */

> -        if (!mis->to_src_file) {
> +        if (!migrate_fast_snapshot_load() && !mis->to_src_file) {
>              /*
> -             * Possibly someone tells us that the return path is
> -             * broken already using the event. We should hold until
> -             * the channel is rebuilt.
> +             * Fast snapshot load has no to src file or in other case someone
> +             * possibly tells us that the return path is broken already using
> +             * the event. We should hold until the channel is rebuilt.
>               */
>              postcopy_pause_fault_thread(mis);
>          }
> @@ -1387,18 +1434,26 @@ static void *postcopy_ram_fault_thread(void *opaque)
>                                                  qemu_ram_get_idstr(rb),
>                                                  rb_offset,
>                                                  msg.arg.pagefault.feat.ptid);
> +
> +            if (migrate_fast_snapshot_load()) {
> +                if (postcopy_mapped_ram_load_page(
> +                        mis, rb, rb_offset, msg.arg.pagefault.address, 1)) {

s/1/RAM_CHANNEL_POSTCOPY/?  I agree it's not ideal with the current names,
but it is still kind of suitable.

> +                    break;
> +                }

With above, we can error_report() here and exit() when error happens.

> +            } else {
>  retry:
> -            /*
> -             * Send the request to the source - we want to request one
> -             * of our host page sizes (which is >= TPS)
> -             */
> -            ret = postcopy_request_page(mis, rb, rb_offset,
> -                                        msg.arg.pagefault.address,
> -                                        msg.arg.pagefault.feat.ptid);
> -            if (ret) {
> -                /* May be network failure, try to wait for recovery */
> -                postcopy_pause_fault_thread(mis);
> -                goto retry;
> +                /*
> +                 * Send the request to the source - we want to request one
> +                 * of our host page sizes (which is >= TPS)
> +                 */
> +                ret = postcopy_request_page(mis, rb, rb_offset,
> +                                            msg.arg.pagefault.address,
> +                                            msg.arg.pagefault.feat.ptid);
> +                if (ret) {
> +                    /* May be network failure, try to wait for recovery */
> +                    postcopy_pause_fault_thread(mis);
> +                    goto retry;
> +                }
>              }
>          }
>  
> @@ -1471,8 +1526,11 @@ static int postcopy_temp_pages_setup(MigrationIncomingState *mis)
>      unsigned i, channels;
>      void *temp_page;
>  
> -    if (migrate_postcopy_preempt()) {
> -        /* If preemption enabled, need extra channel for urgent requests */
> +    if (migrate_postcopy_preempt() || migrate_fast_snapshot_load()) {
> +        /*
> +         * If preemption enabled or it is fast snapshot load, need extra channel
> +         * for urgent requests/faults
> +         */
>          mis->postcopy_channels = RAM_CHANNEL_MAX;
>      } else {
>          /* Both precopy/postcopy on the same channel */
> -- 
> 2.54.0
> 

-- 
Peter Xu



  reply	other threads:[~2026-06-22 18:33 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18  3:20 [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-18  3:20 ` [RFC PATCH 1/5] migration: add RAM Block fields and helpers for " Aadeshveer Singh
2026-06-22 16:23   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 2/5] migration: add support for fault thread to load pages from disk Aadeshveer Singh
2026-06-22 18:32   ` Peter Xu [this message]
2026-06-18  3:20 ` [RFC PATCH 3/5] migration: add eager load thread for fast snapshot load Aadeshveer Singh
2026-06-22 18:50   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 4/5] migration: write up code to run fast snapshot load in qemu_loadvm_state Aadeshveer Singh
2026-06-22 19:16   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 5/5] migration/tests: remove capability conflict test postcopy-ram+mapped-ram Aadeshveer Singh
2026-06-22 18:51   ` Peter Xu
2026-06-19 13:18 ` [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-22 19:19   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajl_oz63TGnhMwvL@x1.local \
    --to=peterx@redhat.com \
    --cc=aadeshveer07@gmail.com \
    --cc=ayoub@saferwall.com \
    --cc=farosas@suse.de \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@mailo.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.