From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
	lilei@linux.vnet.ibm.com, quintela@redhat.com
Subject: [Qemu-devel] [PATCH 46/46] Start documenting how postcopy works.
Date: Fri,  4 Jul 2014 18:41:57 +0100
Message-ID: <1404495717-4239-47-git-send-email-dgilbert@redhat.com>
In-Reply-To: <1404495717-4239-1-git-send-email-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/migration.txt | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/docs/migration.txt b/docs/migration.txt
index 0492a45..dbd5e5f 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -294,3 +294,151 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two-way communication; in particular the Postcopy destination
+needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by fd_handler on main thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+Opening the return path generally sets the fd to be non-blocking so that a
+failed destination can't block the source; since the non-blocking setting seems
+to apply to both directions, it also alters the semantics of the forward path.
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge.
+Its plus side is that there is an upper bound on the amount of migration traffic
+and the time it takes; the down side is that during the postcopy phase, a failure of
+*either* side or the network connection causes the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is automatically made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable pure postcopy:
+
+migrate_set_capability x-postcopy-ram on
+
+To add a period of precopy:
+
+migrate_set_parameter x-postcopy-start-time 500
+
+(time in ms)
+
+=== Postcopy states ===
+Postcopy moves through a series of states (see postcopy_ram_state)
+from ADVISE->LISTEN->RUNNING->END
+
+  Advise: Set at the start of migration if postcopy is enabled, even
+          if it hasn't passed the start-time threshold; here the destination
+          checks that its OS has the support needed for postcopy, and performs
+          setup to ensure the RAM mappings are suitable for later postcopy.
+          (Triggered by reception of POSTCOPY_RAM_ADVISE command)
+
+Precopy now carries on as normal, until the point that the source
+hits the start-time threshold and transitions to postcopy.  The source
+stops its CPUs and transmits a 'discard bitmap' indicating pages that
+have been previously sent but are now dirty again and hence are out of
+date on the destination.
+
+The migration stream now contains a 'package' containing its own chunk
+of migration stream, followed by a return to a normal stream containing
+page data.  The package (sent as CMD_PACKAGED) contains the commands to
+cycle the states on the destination, followed by all of the device
+state excluding RAM.  This lets the destination request pages from the
+source in parallel with loading device state; this is required since
+some devices (virtio) access guest memory during device initialisation.
+
+  Listen: The first command in the package, POSTCOPY_RAM_LISTEN, switches
+          the destination state to Listen, and starts a new thread
+          (the 'listen thread') which takes over the job of receiving
+          pages off the migration stream, while the main thread carries
+          on processing the package.  With this thread able to process page
+          reception, the destination now 'sensitises' the RAM to detect
+          any access to missing pages (on Linux using the 'userfault'
+          system).
+
+The package now contains all the remaining state data and the command
+to transition to the next state.
+
+  Running: POSTCOPY_RAM_RUN causes the destination to synchronise all
+          state and start the CPUs and IO devices running.  The main
+          thread now finishes processing the migration package and
+          then carries on as it would for normal precopy migration
+          (although it can't do the cleanup it would do as it
+          finishes a normal migration).
+
+Page data is sent from the source to the destination as part of a linear
+scan (like normal migration) and is received by the 'listen thread'.
+When the destination tries to use a page it hasn't got, it requests
+it from the source (down the return path) and the source sends this
+page in the same stream.  When the source has transmitted all pages
+it sends a POSTCOPY_RAM_END command to transition to the final state:
+
+  End: The listen thread can now quit and perform the cleanup of migration
+       state; the migration is now complete.
+
+=== Source side page maps ===
+The source side keeps two bitmaps during postcopy: the 'migration bitmap'
+and the 'sent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that a page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages; during postcopy, however, the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy. It is a bitmap that
+has a bit set whenever a page is sent to the destination; during
+the transition to postcopy mode it is masked against the migration bitmap
+(sentmap &= migrationbitmap) to generate a bitmap recording pages that
+have previously been sent but are now dirty again.  This masked
+sentmap is sent to the destination, which discards those now-dirty pages
+before starting the CPUs.
+
+Note that once in postcopy mode, the sent map is still updated; however, its
+contents are not consistent as a local view of what's been sent, since it
+only holds the masked result.
+
+=== Destination side page maps ===
+(Needs to be changed so we can update both easily - at the moment updates are done
+ with a lock)
+The destination keeps a 'requested map' and a 'received map'.
+Both maps are initially 0; as pages are received the bits are set in the 'received map'.
+Incoming requests from the kernel cause the bit to be set in the 'requested map'.
+When a page is received that is marked as 'requested' the kernel is notified.
+If the kernel requests a page that has already been 'received' the kernel is notified
+without re-requesting.
+
+This leads to three valid page states (rc = bit set in the 'received map',
+rq = bit set in the 'requested map'):
+    missing (!rc,!rq)  - page not yet received or requested
+    received (rc,!rq)  - page received
+    requested (!rc,rq) - page requested but not yet received
+
+state transitions:
+      received -> missing   (only during setup/discard)
+
+      missing -> received   (normal incoming page)
+      requested -> received (incoming page previously requested)
+      missing -> requested  (userfault request)
+
-- 
1.9.3
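
The destination page states and transitions listed under 'Destination side page maps' in the patch above can be sketched as a small state machine. This is only an illustrative sketch, not code from the series: the names page_maps, page_state, on_fault, on_receive and on_discard are hypothetical, the maps are plain bool arrays rather than real bitmaps, and the locking mentioned in the patch is omitted. Clearing the 'requested' bit on reception and suppressing a second request for an already-requested page are assumptions; the text does not spell out either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8  /* tiny fixed size, for illustration only */

/* Hypothetical names; none of these are taken from the QEMU tree. */
enum page_state { PAGE_MISSING, PAGE_REQUESTED, PAGE_RECEIVED };

struct page_maps {
    bool received[NPAGES];   /* 'received map'  (rc) */
    bool requested[NPAGES];  /* 'requested map' (rq) */
};

/* Decode the (rc, rq) pair; (rc, rq) both set is not a valid state. */
static enum page_state page_state(const struct page_maps *m, size_t i)
{
    assert(!(m->received[i] && m->requested[i]));
    if (m->received[i]) {
        return PAGE_RECEIVED;
    }
    return m->requested[i] ? PAGE_REQUESTED : PAGE_MISSING;
}

/*
 * Fault reported by the kernel for page i (missing -> requested).
 * Returns true if a request should be sent to the source.  A page that
 * is already received needs only a kernel notification, so no request
 * is sent.  Assumption: an already-requested page is not requested again.
 */
static bool on_fault(struct page_maps *m, size_t i)
{
    if (m->received[i] || m->requested[i]) {
        return false;
    }
    m->requested[i] = true;
    return true;
}

/*
 * Page i arrived from the source (missing/requested -> received).
 * Returns true if the kernel must be notified because the page had
 * been requested.  Assumption: the 'requested' bit is cleared here so
 * that the page ends up in the (rc, !rq) state.
 */
static bool on_receive(struct page_maps *m, size_t i)
{
    bool was_requested = m->requested[i];

    m->requested[i] = false;
    m->received[i] = true;
    return was_requested;
}

/* Discard during setup/discard only (received -> missing). */
static void on_discard(struct page_maps *m, size_t i)
{
    m->received[i] = false;
}
```

Starting from zeroed maps, a fault on a page followed by its reception walks missing -> requested -> received, and a later fault on the same page sends nothing further to the source.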
