qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
	lilei@linux.vnet.ibm.com, quintela@redhat.com
Subject: [Qemu-devel] [PATCH v2 43/43] Start documenting how postcopy works.
Date: Mon, 11 Aug 2014 15:29:59 +0100
Message-ID: <1407767399-3030-44-git-send-email-dgilbert@redhat.com>
In-Reply-To: <1407767399-3030-1-git-send-email-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/migration.txt | 150 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/docs/migration.txt b/docs/migration.txt
index 0492a45..fec2d46 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -294,3 +294,153 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two-way communication; in particular the Postcopy
+destination needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by fd_handler on main thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+Opening the return path generally sets the fd to be non-blocking so that a
+failed destination can't block the source.  Since the non-blocking flag seems
+to apply to both directions, this also alters the semantics of the forward path.
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge;
+its plus side is that there is an upper bound on the amount of migration traffic
+and time it takes; the down side is that during the postcopy phase, a failure of
+*either* side or the network connection causes the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is automatically made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable postcopy (prior to the start of migration):
+
+migrate_set_capability x-postcopy-ram on
+
+The migration will still start in precopy mode, however issuing:
+
+migrate_start_postcopy
+
+will now cause the transition from precopy to postcopy.
+It can be issued immediately after migration is started or any
+time later on.  Issuing it after the end of a migration is harmless.
+
+=== Postcopy states ===
+Postcopy moves through a series of states (see postcopy_ram_state)
+from ADVISE->LISTEN->RUNNING->END
+
+  Advise: Set at the start of migration if postcopy is enabled, even
+          if it hasn't passed the start-time threshold; here the destination
+          checks its OS has the support needed for postcopy, and performs
+          setup to ensure the RAM mappings are suitable for later postcopy.
+          (Triggered by reception of POSTCOPY_RAM_ADVISE command)
+
+Normal precopy now carries on as normal, until the point that the source
+hits the start-time threshold and transitions to postcopy.  The source
+stops its CPUs and transmits a 'discard bitmap' indicating pages that
+have been previously sent but are now dirty again and hence are out of
+date on the destination.
+
+The migration stream now contains a 'package' containing its own chunk
+of migration stream, followed by a return to a normal stream containing
+page data.  The package (sent as CMD_PACKAGED) contains the commands to
+cycle the states on the destination, followed by all of the device
+state excluding RAM.  This lets the destination request pages from the
+source in parallel with loading device state; this is required since
+some devices (virtio) access guest memory during device initialisation.
+
+  Listen: The first command in the package, POSTCOPY_RAM_LISTEN, switches
+          the destination state to Listen, and starts a new thread
+          (the 'listen thread') which takes over the job of receiving
+          pages off the migration stream, while the main thread carries
+          on processing the package.  With this thread able to process page
+          reception, the destination now 'sensitises' the RAM to detect
+          any access to missing pages (on Linux using the 'userfault'
+          system).
+
+The package now contains all the remaining state data and the command
+to transition to the next state.
+
+  Running: POSTCOPY_RAM_RUN causes the destination to synchronise all
+          state and start the CPUs and IO devices running.  The main
+          thread now finishes processing the migration package and
+          now carries on as it would for normal precopy migration
+          (although it can't do the cleanup it would do as it
+          finishes a normal migration).
+
+Page data is sent from the source to the destination both as part of a
+linear scan (like normal migration) and in response to page requests; in
+both cases it is received by the 'listen thread'.  When the destination
+tries to use a page it hasn't got, it requests it from the source (down
+the return path) and the source sends this page in the same stream.
+When the source has transmitted all pages it sends a POSTCOPY_RAM_END
+command to transition to the final state:
+  End: The listen thread can now quit and perform the cleanup of migration
+          state; the migration is now complete.
+
+=== Source side page maps ===
+The source side keeps two bitmaps during postcopy; 'the migration bitmap'
+and 'sent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that a page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages, however during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy. It is a bitmap that
+has a bit set whenever a page is sent to the destination, however during
+the transition to postcopy mode it is masked against the migration bitmap
+(sentmap &= migrationbitmap) to generate a bitmap recording pages that
+have previously been sent but are now dirty again.  This masked
+sentmap is sent to the destination which discards those now dirty pages
+before starting the CPUs.
+
+Note that once in postcopy mode, the sent map is still updated; however its
+contents are not consistent as a local view of what's been sent since it
+only holds the masked result.
+
+=== Destination side page maps ===
+(Needs to be changed so we can update both easily - at the moment updates are done
+ with a lock)
+The destination keeps a 'requested map' and a 'received map'.
+Both maps are initially 0; as pages are received the bits are set in 'received map'.
+Incoming requests from the kernel cause the bit to be set in the 'requested map'.
+When a page is received that is marked as 'requested' the kernel is notified.
+If the kernel requests a page that has already been 'received' the kernel is notified
+without re-requesting.
+
+This leads to three valid page states and the transitions between them:
+page states:
+    missing (!rc,!rq)  - page not yet received or requested
+    received (rc,!rq)  - page received
+    requested (!rc,rq) - page requested but not yet received
+
+state transitions:
+      received -> missing   (only during setup/discard)
+
+      missing -> received   (normal incoming page)
+      requested -> received (incoming page previously requested)
+      missing -> requested  (userfault request)
+
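The destination page states and transitions listed above can be sketched as a small state machine in C. This is only an illustration and is not part of the patch: the names PageFlags, PageState, on_userfault, on_page_received and on_discard are hypothetical and do not come from the series. It also assumes that receiving a page clears its 'requested' bit (so that (rc,rq) never occurs) and that a second fault on an already requested page does not send another request; the text above does not state either point explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-page flags mirroring the 'received' and 'requested' maps. */
typedef struct {
    bool received;   /* 'rc' bit */
    bool requested;  /* 'rq' bit */
} PageFlags;

typedef enum { PAGE_MISSING, PAGE_REQUESTED, PAGE_RECEIVED } PageState;

/* Decode the two bits into one of the three valid states. */
static PageState page_state(PageFlags f)
{
    /* (rc,rq) is treated as invalid here; this is an assumption. */
    assert(!(f.received && f.requested));
    if (f.received) {
        return PAGE_RECEIVED;
    }
    return f.requested ? PAGE_REQUESTED : PAGE_MISSING;
}

/*
 * Userfault request from the kernel.
 * Returns true if a page request should be sent to the source.
 * A page that is already received needs no request (the kernel is just
 * notified); a page already requested is not requested again (assumption).
 */
static bool on_userfault(PageFlags *f)
{
    if (f->received || f->requested) {
        return false;
    }
    f->requested = true;            /* missing -> requested */
    return true;
}

/*
 * Incoming page from the migration stream.
 * Returns true if the kernel must be notified because the page had been
 * requested.
 */
static bool on_page_received(PageFlags *f)
{
    bool notify = f->requested;
    f->received = true;             /* missing/requested -> received */
    f->requested = false;
    return notify;
}

/* Discard during setup: received -> missing. */
static void on_discard(PageFlags *f)
{
    f->received = false;
}
```

A caller would keep one such pair of bits per page (in practice two bitmaps rather than a struct per page) and hold the lock mentioned above around each of these helpers.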
-- 
1.9.3
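
The sent map masking described in the 'Source side page maps' section of the patch above (sentmap &= migrationbitmap) can be illustrated with a few lines of C. This is a minimal sketch under stated assumptions, not the code from the series: it assumes both maps are plain arrays of unsigned long with one bit per page, and the helper names sentmap_mask and page_needs_discard are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Mask the sent map against the migration (dirty) bitmap in place.
 * Afterwards a bit is set only for pages that were sent earlier and have
 * been dirtied again, i.e. pages the destination must discard.
 */
static void sentmap_mask(unsigned long *sentmap,
                         const unsigned long *migration_bitmap,
                         size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        sentmap[i] &= migration_bitmap[i];
    }
}

/* Returns 1 if the given page is marked in the (masked) map, else 0. */
static int page_needs_discard(const unsigned long *masked_map, size_t page)
{
    unsigned long word = masked_map[page / BITS_PER_WORD];
    return (int)((word >> (page % BITS_PER_WORD)) & 1UL);
}
```

For example, if pages 0, 1 and 3 have been sent and pages 1 and 2 are dirty, only page 1 remains set after masking: page 2 was never sent, so there is nothing for the destination to discard, and pages 0 and 3 are still up to date.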
