All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Aadeshveer Singh <aadeshveer07@gmail.com>
Cc: qemu-devel@nongnu.org, farosas@suse.de, pbonzini@redhat.com,
	philmd@mailo.com, lvivier@redhat.com, ayoub@saferwall.com
Subject: Re: [RFC PATCH 3/5] migration: add eager load thread for fast snapshot load
Date: Mon, 22 Jun 2026 14:50:26 -0400	[thread overview]
Message-ID: <ajmD8tCPuO51wKPd@x1.local> (raw)
In-Reply-To: <20260618032010.88755-4-aadeshveer07@gmail.com>

On Thu, Jun 18, 2026 at 08:50:08AM +0530, Aadeshveer Singh wrote:
> In fast snapshot load a thread is needed for actively loading in pages
> along with the fault path so that the guest is not dependent on fault
> thread indefinitely.
> 
> Add postcopy_ram_eager_load_thread(), for the eager thread which
> iterates over all non ignored blocks calling ram_block_load_eager()
> each. ram_block_load_eager then iterates to load in all pages using
> postcopy_mapped_ram_load_page(), with a different channel, which takes
> care of not loading in pages already loaded by fault thread. On
> completion the thread schedules postcopy_ram_eager_load_bh() to destroy
> the incoming migration state and set states to completed/end.
> 
> Add postcopy_ram_eager_load_setup() to create the thread. Added joining
> logic in postcopy_incoming_cleanup().
> 
> Add tracepoints for entry and exit to eager load thread.
> 
> Signed-off-by: Aadeshveer Singh <aadeshveer07@gmail.com>
> ---
>  migration/migration.h    |  5 +++
>  migration/postcopy-ram.c | 75 ++++++++++++++++++++++++++++++++++++++++
>  migration/postcopy-ram.h |  2 ++
>  migration/trace-events   |  2 ++
>  4 files changed, 84 insertions(+)
> 
> diff --git a/migration/migration.h b/migration/migration.h
> index 841f49b215..7bb54a6584 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -42,6 +42,7 @@
>  #define  MIGRATION_THREAD_DST_FAULT         "mig/dst/fault"
>  #define  MIGRATION_THREAD_DST_LISTEN        "mig/dst/listen"
>  #define  MIGRATION_THREAD_DST_PREEMPT       "mig/dst/preempt"
> +#define  MIGRATION_THREAD_DST_EAGER         "mig/dst/eager"

The name isn't easy to digest when someone saw it the 1st time.  Maybe
"snapshot_load" / DST_SNAPSHOT_LOAD?

>  
>  struct PostcopyBlocktimeContext;
>  typedef struct ThreadPool ThreadPool;
> @@ -120,6 +121,10 @@ struct MigrationIncomingState {
>      bool           have_listen_thread;
>      QemuThread     listen_thread;
>  
> +    /* Thread to load pages eagerly in fast snapshot load case */
> +    bool have_eager_load_thread;
> +    QemuThread eager_load_thread;
> +
>      /* For the kernel to send us notifications */
>      int       userfault_fd;
>      /* To notify the fault_thread to wake, e.g., when need to quit */
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 1ec20a07dd..0ee294a381 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -2289,9 +2289,84 @@ int postcopy_incoming_cleanup(MigrationIncomingState *mis)
>          mis->have_listen_thread = false;
>      }
>  
> +    if (mis->have_eager_load_thread) {
> +        qemu_thread_join(&mis->eager_load_thread);
> +        mis->have_eager_load_thread = false;
> +    }
> +
>      if (migrate_postcopy_ram()) {
>          rc = postcopy_ram_incoming_cleanup(mis);
>      }
>  
>      return rc;
>  }
> +
> +/*
> + * Called by postcopy_ram_eager_load_thread over all blocks to load in all the
> + * pending pages of given ram block
> + */
> +static int ram_block_load_eager(RAMBlock *rb, void *opaque)
> +{
> +    MigrationIncomingState *mis = opaque;
> +    void *host = qemu_ram_get_host_addr(rb);
> +    void *target;
> +    int ret = 0;
> +
> +    for (ram_addr_t page_loc = 0; page_loc < rb->used_length;
> +         page_loc += TARGET_PAGE_SIZE) {

I think you can directly use rb->page_size here, then huge page will also
work.

> +        target = (uint8_t *)host + page_loc;
> +        ret = postcopy_mapped_ram_load_page(mis, rb, page_loc, (uint64_t)target,
> +                                            0);

RAM_CHANNEL_PRECOPY

> +        if (ret) {
> +            break;
> +        }
> +    }
> +    return ret;
> +}
> +
> +/*
> + * Bottom half for fast snapshot load, scheduled by eager load thread
> + */
> +static void postcopy_ram_eager_load_bh(void *opaque)
> +{
> +    MigrationIncomingState *mis = opaque;
> +    postcopy_state_set(POSTCOPY_INCOMING_END);
> +    migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> +                      MIGRATION_STATUS_COMPLETED);

If you follow what postcopy_listen_thread() does, this should be put in the
eager load thread when all things succeeded.

> +    migration_incoming_state_destroy();
> +}

IIUC we should reuse what postcopy does.

Say, you can have one prior patch making postcopy_listen_thread_bh() to be
what you want.  IIUC we can safely move these two lines into it:

    postcopy_state_set(POSTCOPY_INCOMING_END);

Then I believe you can reuse it.  Maybe when at it, rename it to
postcopy_complete_bh() so it detaches from "listen thread".

That also handles exit_on_error case, so IIUC that's also what you should
do: when any failure happens, set s->error, then set FAILED status, finally
kickoff this BH to handle the rest, so exit() will be done in the BH.

PS: on incoming side so far we still sometimes reuse MigrationState->error.
It's a historical "slight" misuse.. but let's stick with it so far; I think
the whole point is to make error visible in query-migrate if
exit_on_error=false.  It's another problem to solve in the future.

> +
> +/*
> + * Used by fast snapshot load to eagerly load in all pages of RAM and schedule
> + * cleanup after entire RAM is loaded
> + */
> +static void *postcopy_ram_eager_load_thread(void *opaque)
> +{
> +    MigrationIncomingState *mis = opaque;
> +
> +    trace_postcopy_ram_eager_load_thread_entry();
> +    rcu_register_thread();
> +    qemu_event_set(&mis->thread_sync_event);
> +
> +    if (foreach_not_ignored_block(ram_block_load_eager, mis)) {
> +        error_report("ram_block_load_eager failed");

We can set error to MigrationState here with migrate_error_propagate(), if
you would reuse the BH I mentioned above.

> +    }
> +
> +    migration_bh_schedule(postcopy_ram_eager_load_bh, mis);
> +
> +    rcu_unregister_thread();
> +    trace_postcopy_ram_eager_load_thread_exit();
> +    return NULL;
> +}
> +
> +/*
> + * Create thread for eager loading in fast snapshot load case
> + */
> +int postcopy_ram_eager_load_setup(MigrationIncomingState *mis)
> +{
> +    postcopy_thread_create(
> +        mis, &mis->eager_load_thread, MIGRATION_THREAD_DST_EAGER,
> +        postcopy_ram_eager_load_thread, QEMU_THREAD_JOINABLE);
> +    mis->have_eager_load_thread = true;
> +    return 0;
> +}

Then here this patch introduced these functions without using them.  Then
this patch will be almost not possible to do proper review because the
reviewer won't know how it will be used, and where.

IMHO you can squash this patch with the one that will use this function.

Thanks,

> diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
> index a080dd65a7..b3ba42e447 100644
> --- a/migration/postcopy-ram.h
> +++ b/migration/postcopy-ram.h
> @@ -202,4 +202,6 @@ void mark_postcopy_blocktime_begin(uintptr_t addr, uint32_t ptid,
>  int postcopy_incoming_setup(MigrationIncomingState *mis, Error **errp);
>  int postcopy_incoming_cleanup(MigrationIncomingState *mis);
>  
> +int postcopy_ram_eager_load_setup(MigrationIncomingState *mis);
> +
>  #endif
> diff --git a/migration/trace-events b/migration/trace-events
> index de99d976ab..38f11e1e9f 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -314,6 +314,8 @@ postcopy_blocktime_tid_cpu_map(int cpu, uint32_t tid) "cpu: %d, tid: %u"
>  postcopy_blocktime_begin(uint64_t addr, uint64_t time, int cpu, bool exists) "addr: 0x%" PRIx64 ", time: %" PRIu64 ", cpu: %d, exist: %d"
>  postcopy_blocktime_end(uint64_t addr, uint64_t time, int affected_cpu, int affected_non_cpus) "addr: 0x%" PRIx64 ", time: %" PRIu64 ", affected_cpus: %d, affected_non_cpus: %d"
>  postcopy_blocktime_end_one(int cpu, uint8_t left_faults) "cpu: %d, left_faults: %" PRIu8
> +postcopy_ram_eager_load_thread_entry(void) ""
> +postcopy_ram_eager_load_thread_exit(void) ""
>  
>  # exec.c
>  migration_exec_outgoing(const char *cmd) "cmd=%s"
> -- 
> 2.54.0
> 

-- 
Peter Xu



  reply	other threads:[~2026-06-22 18:51 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18  3:20 [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-18  3:20 ` [RFC PATCH 1/5] migration: add RAM Block fields and helpers for " Aadeshveer Singh
2026-06-22 16:23   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 2/5] migration: add support for fault thread to load pages from disk Aadeshveer Singh
2026-06-22 18:32   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 3/5] migration: add eager load thread for fast snapshot load Aadeshveer Singh
2026-06-22 18:50   ` Peter Xu [this message]
2026-06-18  3:20 ` [RFC PATCH 4/5] migration: write up code to run fast snapshot load in qemu_loadvm_state Aadeshveer Singh
2026-06-22 19:16   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 5/5] migration/tests: remove capability conflict test postcopy-ram+mapped-ram Aadeshveer Singh
2026-06-22 18:51   ` Peter Xu
2026-06-19 13:18 ` [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-22 19:19   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajmD8tCPuO51wKPd@x1.local \
    --to=peterx@redhat.com \
    --cc=aadeshveer07@gmail.com \
    --cc=ayoub@saferwall.com \
    --cc=farosas@suse.de \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@mailo.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.