From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
	lilei@linux.vnet.ibm.com, quintela@redhat.com
Subject: [Qemu-devel] [PATCH v2 43/43] Start documenting how postcopy works.
Date: Mon, 11 Aug 2014 15:29:59 +0100
Message-ID: <1407767399-3030-44-git-send-email-dgilbert@redhat.com>
In-Reply-To: <1407767399-3030-1-git-send-email-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/migration.txt | 150 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/docs/migration.txt b/docs/migration.txt
index 0492a45..fec2d46 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -294,3 +294,153 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two-way communication; in particular the postcopy destination
+needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by fd_handler on main thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+Opening the return path generally sets the fd to be non-blocking so that a
+failed destination can't block the source; since the non-blocking mode appears
+to apply to both directions of the fd, it also alters the semantics of the forward path.
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge;
+its plus side is that there is an upper bound on the amount of migration traffic
+and the time it takes; the down side is that during the postcopy phase, a failure of
+*either* side or the network connection causes the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is automatically made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable postcopy (prior to the start of migration):
+
+migrate_set_capability x-postcopy-ram on
+
+The migration will still start in precopy mode; however, issuing:
+
+migrate_start_postcopy
+
+will now cause the transition from precopy to postcopy.
+It can be issued immediately after migration is started or any
+time later on.  Issuing it after the end of a migration is harmless.
+
+=== Postcopy states ===
+Postcopy moves through a series of states (see postcopy_ram_state)
+from ADVISE->LISTEN->RUNNING->END
+
+  Advise: Set at the start of migration if postcopy is enabled, even
+          if it hasn't passed the start-time threshold; here the destination
+          checks that its OS has the support needed for postcopy, and performs
+          setup to ensure the RAM mappings are suitable for later postcopy.
+          (Triggered by reception of POSTCOPY_RAM_ADVISE command)
+
+Precopy now carries on as normal, until the point that the source
+hits the start-time threshold and transitions to postcopy.  The source
+stops its CPUs and transmits a 'discard bitmap' indicating pages that
+have been previously sent but are now dirty again and hence are out of
+date on the destination.
+
+The migration stream now contains a 'package' containing its own chunk
+of migration stream, followed by a return to a normal stream containing
+page data.  The package (sent as CMD_PACKAGED) contains the commands to
+cycle the states on the destination, followed by all of the device
+state excluding RAM.  This lets the destination request pages from the
+source in parallel with loading device state; this is required since
+some devices (virtio) access guest memory during device initialisation.
+
+  Listen: The first command in the package, POSTCOPY_RAM_LISTEN, switches
+          the destination state to Listen, and starts a new thread
+          (the 'listen thread') which takes over the job of receiving
+          pages off the migration stream, while the main thread carries
+          on processing the package.  With this thread able to process page
+          reception, the destination now 'sensitises' the RAM to detect
+          any access to missing pages (on Linux using the 'userfault'
+          system).
+
+The package now contains all the remaining state data and the command
+to transition to the next state.
+
+  Running: POSTCOPY_RAM_RUN causes the destination to synchronise all
+          state and start the CPUs and IO devices running.  The main
+          thread now finishes processing the migration package and
+          then carries on as it would for normal precopy migration
+          (although it can't do the cleanup it would do as it
+          finishes a normal migration).
+
+Page data is sent from the source to the destination as part of a linear
+scan (like normal migration) and is received by the 'listen thread'.
+When the destination tries to use a page it hasn't got, it requests
+it from the source (down the return path) and the source sends this
+page in the same stream.  When the source has transmitted all pages
+it sends a POSTCOPY_RAM_END command to transition to the final state.
+
+  End: The listen thread can now quit and perform the cleanup of migration
+          state; the migration is now complete.
+
+=== Source side page maps ===
+The source side keeps two bitmaps during postcopy: the 'migration bitmap'
+and the 'sent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that a page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages; however, during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy. It is a bitmap that
+has a bit set whenever a page is sent to the destination; however, during
+the transition to postcopy mode it is masked against the migration bitmap
+(sentmap &= migrationbitmap) to generate a bitmap recording pages that
+have previously been sent but are now dirty again.  This masked
+sentmap is sent to the destination which discards those now dirty pages
+before starting the CPUs.
+
+Note that once in postcopy mode, the sent map is still updated; however, its
+contents are not consistent as a local view of what's been sent, since it
+only holds the masked result.
+
+=== Destination side page maps ===
+(Needs to be changed so we can update both easily - at the moment updates are done
+ with a lock)
+The destination keeps a 'requested map' and a 'received map'.
+Both maps are initially 0; as pages are received, the bits are set in the 'received map'.
+Incoming requests from the kernel cause the bit to be set in the 'requested map'.
+When a page is received that is marked as 'requested', the kernel is notified.
+If the kernel requests a page that has already been 'received', the kernel is notified
+without re-requesting.
+
+This leads to three valid page states and the transitions between them.
+page states:
+    missing (!rc,!rq)  - page not yet received or requested
+    received (rc,!rq)  - page received
+    requested (!rc,rq) - page requested but not yet received
+
+state transitions:
+      received -> missing   (only during setup/discard)
+
+      missing -> received   (normal incoming page)
+      requested -> received (incoming page previously requested)
+      missing -> requested  (userfault request)
+
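The destination-side page states and transitions listed above can be sketched in C as follows. This is only an illustration, assuming one boolean per page in each map. Every name in it (page_maps, page_fault, page_received, page_discard, the enum values) is hypothetical and not taken from the patch series. How a fault on an already requested page is handled is also an assumption, since the text does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the page-map API of the patch series. */
#define NPAGES 8

enum page_state { PAGE_MISSING, PAGE_RECEIVED, PAGE_REQUESTED };

enum fault_action {
    FAULT_SEND_REQUEST,    /* missing -> requested: ask the source for the page */
    FAULT_NOTIFY_KERNEL,   /* already received: notify kernel, no re-request */
    FAULT_ALREADY_PENDING  /* assumption: request already outstanding, do nothing */
};

struct page_maps {
    bool received[NPAGES];   /* the 'received map' (rc) */
    bool requested[NPAGES];  /* the 'requested map' (rq) */
};

/* Decode the (rc, rq) pair into one of the three valid states. */
static enum page_state page_state_get(const struct page_maps *m, size_t page)
{
    if (m->received[page]) {
        return PAGE_RECEIVED;
    }
    return m->requested[page] ? PAGE_REQUESTED : PAGE_MISSING;
}

/*
 * A page arrives on the migration stream (missing/requested -> received).
 * Returns true when the page had been requested, i.e. the kernel must be
 * notified so the faulting access can continue.
 */
static bool page_received(struct page_maps *m, size_t page)
{
    bool was_requested = m->requested[page];

    m->received[page] = true;
    m->requested[page] = false;  /* keeps the pair out of the invalid (rc,rq) state */
    return was_requested;
}

/* The kernel reports an access to a page (userfault request). */
static enum fault_action page_fault(struct page_maps *m, size_t page)
{
    if (m->received[page]) {
        return FAULT_NOTIFY_KERNEL;
    }
    if (m->requested[page]) {
        return FAULT_ALREADY_PENDING;
    }
    m->requested[page] = true;
    return FAULT_SEND_REQUEST;
}

/* received -> missing, only valid during setup/discard. */
static void page_discard(struct page_maps *m, size_t page)
{
    m->received[page] = false;
}
```

In this sketch a caller would send a page request down the return path only when page_fault() returns FAULT_SEND_REQUEST, and would wake the faulting access only when page_received() returns true.
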
-- 
1.9.3
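
The masking step in the 'Source side page maps' section of the patch (sentmap &= migrationbitmap) can be illustrated with a short C sketch. It assumes both bitmaps are stored as arrays of 64-bit words of equal length. The helper names sentmap_mask_dirty and bitmap_test are hypothetical and not taken from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask the sent map against the migration (dirty)
 * bitmap in place.  Afterwards a set bit means "sent earlier but dirty
 * again", i.e. a page the destination has to discard.
 */
static void sentmap_mask_dirty(uint64_t *sentmap,
                               const uint64_t *migration_bitmap,
                               size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        sentmap[i] &= migration_bitmap[i];
    }
}

/* Hypothetical helper: return 1 if the bit for the given page is set. */
static int bitmap_test(const uint64_t *map, size_t bit)
{
    return (int)((map[bit / 64] >> (bit % 64)) & 1);
}
```

For example, if pages 0 to 3 have been sent (0x0F) and pages 2 to 5 are dirty (0x3C), the masked sent map is 0x0C, so only pages 2 and 3 are discarded on the destination.
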
