* [RFC PATCH 0/5] migration: fast snapshot load
@ 2026-06-18  3:20 Aadeshveer Singh
From: Aadeshveer Singh @ 2026-06-18  3:20 UTC
  To: qemu-devel
  Cc: peterx, farosas, pbonzini, philmd, lvivier, ayoub,
	Aadeshveer Singh

This RFC implements a "fast snapshot load" mechanism to significantly
reduce the perceived resume time of a VM from a snapshot file.

Currently, resuming a VM from a snapshot file requires loading all RAM
pages into the QEMU instance before execution begins. This extension
allows the user to run the VM nearly instantly by loading only the
required device states up front and loading RAM pages lazily, by
trapping access to pages that have not yet been loaded.

Using the Linux userfaultfd syscall, a fault thread catches guest page
faults on pages that have not been loaded yet and loads the pages
required to keep the VM running. Concurrently, an eager background
thread iteratively loads all remaining pages into RAM so the guest
does not have to depend on the fault thread indefinitely.

Much of the code is reused from postcopy (for fault handling) and
precopy (for reading the mapped-ram file). The implementation revolves
around two threads: the fault thread and the eager load thread. The
fault thread, as the name suggests, catches guest page faults and
serves them using userfaultfd. The postcopy fault thread is reused,
but instead of requesting a page from the source it loads the page
directly by reading from the file. So that the guest does not depend
on the fault thread indefinitely, the eager load thread loads the
entire RAM sequentially; after iterating through all of RAM it signals
the fault thread to exit and calls cleanup.
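
The eager load pass described above can be pictured roughly as follows.
This is a simplified sketch under stated assumptions, not the code from
the series: struct snap_block, eager_load_all() and the eager_done flag
are hypothetical names, the pages are assumed to be stored contiguously
in the file, and the destination is plain memory (with a
userfaultfd-registered range the page would be placed with UFFDIO_COPY
rather than written directly).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-block state the loader needs. */
struct snap_block {
    int fd;                 /* snapshot file descriptor */
    off_t file_offset;      /* where this block's pages start in the file */
    char *host;             /* destination memory for the block */
    size_t page_size;
    size_t npages;
    atomic_bool eager_done; /* set once every page has been visited */
};

/*
 * Walk the block sequentially and read every page from the file.
 * On success, set eager_done so a fault thread polling the flag knows
 * it can exit. Returns 0 on success, -1 on a read error or short file.
 */
int eager_load_all(struct snap_block *b)
{
    for (size_t i = 0; i < b->npages; i++) {
        off_t off = b->file_offset + (off_t)(i * b->page_size);
        char *dst = b->host + i * b->page_size;
        size_t got = 0;

        /* pread() may return fewer bytes than requested; loop until full. */
        while (got < b->page_size) {
            ssize_t n = pread(b->fd, dst + got, b->page_size - got,
                              off + (off_t)got);
            if (n <= 0) {
                return -1;
            }
            got += (size_t)n;
        }
    }
    atomic_store(&b->eager_done, true);
    return 0;
}
```

Using pread() keeps the file position untouched, so a second thread can
read other pages from the same descriptor at the same time.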

To prevent a page from being loaded twice (for example, when the eager
load thread is loading it and the fault thread also tries to serve a
fault on the same page), a bitmap called pending_bmap tracks pages
that are still pending, i.e. not yet loaded and not currently being
loaded by any thread. Atomic operations on this bitmap let the two
threads coordinate and avoid such races.
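
One way such a bitmap can arbitrate between the two threads is an atomic
test-and-clear: whichever thread clears the pending bit owns the load.
The sketch below uses C11 atomics and hypothetical helper names
(pending_init(), pending_claim()); it illustrates the idea only and does
not use QEMU's own bitmap helpers.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Mark every page in [0, npages) as pending, i.e. not yet loaded. */
void pending_init(atomic_ulong *bmap, size_t npages)
{
    size_t nwords = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

    for (size_t w = 0; w < nwords; w++) {
        atomic_init(&bmap[w], 0);
    }
    for (size_t p = 0; p < npages; p++) {
        atomic_fetch_or(&bmap[p / BITS_PER_WORD],
                        1UL << (p % BITS_PER_WORD));
    }
}

/*
 * Atomically clear the pending bit for 'page'. Returns true only for
 * the single caller that saw the bit set, so exactly one thread ends
 * up loading the page; every later caller gets false and must skip it.
 */
bool pending_claim(atomic_ulong *bmap, size_t page)
{
    unsigned long mask = 1UL << (page % BITS_PER_WORD);
    unsigned long old = atomic_fetch_and(&bmap[page / BITS_PER_WORD], ~mask);

    return (old & mask) != 0;
}
```

Both the fault thread and the eager load thread would call
pending_claim() before reading a page and skip the page when it returns
false. A thread that loses the claim does not load the page itself, so
the faulting vCPU stays blocked until the winning thread has placed it.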

This series was tested with a minimal Debian 13 system and with Fedora
44 KDE; snapshots of both load successfully with no errors.

Next Steps:
- Add test coverage, both qtest and unit tests
- Add support for postcopy-blocktime
- Update documentation

Future direction:
- Add support for hugepages
- Add support for multifd
- Add support for vhost-user

Aadeshveer Singh (5):
  migration: add RAM Block fields and helpers for fast snapshot load
  migration: add support for fault thread to load pages from disk
  migration: add eager load thread for fast snapshot load
  migration: write up code to run fast snapshot load in
    qemu_loadvm_state
  migration/tests: remove capability conflict test
    postcopy-ram+mapped-ram

 include/system/ramblock.h          |   8 ++
 migration/migration.c              |  10 +-
 migration/migration.h              |   5 +
 migration/options.c                |  11 +-
 migration/options.h                |   1 +
 migration/postcopy-ram.c           | 167 ++++++++++++++++++++++++++---
 migration/postcopy-ram.h           |   2 +
 migration/qemu-file.c              |  10 +-
 migration/ram.c                    |  61 +++++++++--
 migration/savevm.c                 |  52 ++++++++-
 migration/savevm.h                 |   2 +
 migration/trace-events             |   2 +
 tests/qtest/migration/misc-tests.c |  52 ---------
 13 files changed, 283 insertions(+), 100 deletions(-)

-- 
2.54.0




Thread overview: 13+ messages
2026-06-18  3:20 [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-18  3:20 ` [RFC PATCH 1/5] migration: add RAM Block fields and helpers for " Aadeshveer Singh
2026-06-22 16:23   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 2/5] migration: add support for fault thread to load pages from disk Aadeshveer Singh
2026-06-22 18:32   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 3/5] migration: add eager load thread for fast snapshot load Aadeshveer Singh
2026-06-22 18:50   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 4/5] migration: write up code to run fast snapshot load in qemu_loadvm_state Aadeshveer Singh
2026-06-22 19:16   ` Peter Xu
2026-06-18  3:20 ` [RFC PATCH 5/5] migration/tests: remove capability conflict test postcopy-ram+mapped-ram Aadeshveer Singh
2026-06-22 18:51   ` Peter Xu
2026-06-19 13:18 ` [RFC PATCH 0/5] migration: fast snapshot load Aadeshveer Singh
2026-06-22 19:19   ` Peter Xu
