From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org, quintela@redhat.com, amit.shah@redhat.com
Cc: aarcange@redhat.com, pbonzini@redhat.com, liang.z.li@intel.com,
	luis@cs.umu.se, bharata@linux.vnet.ibm.com
Subject: [Qemu-devel] [PATCH v8 01/54] Add postcopy documentation
Date: Tue, 29 Sep 2015 09:37:25 +0100
Message-ID: <1443515898-3594-2-git-send-email-dgilbert@redhat.com>
In-Reply-To: <1443515898-3594-1-git-send-email-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
---
 docs/migration.txt | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/docs/migration.txt b/docs/migration.txt
index f6df4be..7853709 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -291,3 +291,194 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two-way communication; in particular the Postcopy
+destination needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by return-path thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge
+(or take too long to converge).  Its plus side is that there is an upper bound
+on the amount of migration traffic and time it takes; the down side is that
+during the postcopy phase, a failure of *either* side or the network connection
+causes the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable postcopy, issue this command on the monitor prior to the
+start of migration:
+
+migrate_set_capability x-postcopy-ram on
+
+The normal commands are then used to start a migration, which is still
+started in precopy mode.  Issuing:
+
+migrate_start_postcopy
+
+will now cause the transition from precopy to postcopy.
+It can be issued immediately after migration is started or any
+time later on.  Issuing it after the end of a migration is harmless.
+
+Note: During the postcopy phase, the bandwidth limit set using
+migrate_set_speed is ignored (to avoid delaying requested pages that
+the destination is waiting for).
+
+=== Postcopy device transfer ===
+
+Loading of device data may cause the device emulation to access guest RAM,
+which may trigger faults that have to be resolved by the source.  The migration
+stream therefore has to be able to respond with page data *during* the device
+load, and hence the device data has to be read from the stream completely
+before the device load begins, to free the stream up.  This is achieved by
+'packaging' the device data into a blob that's read in one go.
+
+Source behaviour
+
+Until postcopy is entered the migration stream is identical to normal
+precopy, except for the addition of a 'postcopy advise' command at
+the beginning, to tell the destination that postcopy might happen.
+When postcopy starts the source sends the page discard data and then
+forms the 'package' containing:
+
+   Command: 'postcopy listen'
+   The device state
+      A series of sections, identical to the precopy stream's device state stream
+      containing everything except postcopiable devices (i.e. RAM)
+   Command: 'postcopy run'
+
+The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
+contents are formatted in the same way as the main migration stream.
+
+During postcopy the source scans the list of dirty pages and sends them
+to the destination without being requested (in much the same way as precopy);
+however, when a page request is received from the destination, the dirty page
+scanning restarts from the requested location.  This causes requested pages
+to be sent quickly, and also causes pages directly after the requested page
+to be sent quickly in the hope that those pages are likely to be used
+by the destination soon.
+
+Destination behaviour
+
+Initially the destination looks the same as precopy, with a single thread
+reading the migration stream; the 'postcopy advise' and 'discard' commands
+are processed to change the way RAM is managed, but don't affect the stream
+processing.
+
+------------------------------------------------------------------------------
+                        1      2   3     4 5                      6   7
+main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE DEVICE RUN )
+thread                             |       |
+                                   |     (page request)
+                                   |        \___
+                                   v            \
+listen thread:                     --- page -- page -- page -- page -- page --
+
+                                   a   b        c
+------------------------------------------------------------------------------
+
+On receipt of CMD_PACKAGED (1), all the data associated with the package -
+the ( ... ) section in the diagram - is read into memory (into a
+QEMUSizedBuffer), and the main thread recurses into qemu_loadvm_state_main
+to process the contents of the package (2), which contains commands (3,6)
+and devices (4...).
+
+On receipt of 'postcopy listen' (3) (i.e. the first command in the package)
+a new thread (a) is started that takes over servicing the migration stream,
+while the main thread carries on loading the package.  The new thread loads
+normal background page data (b), but if during a device load a fault happens
+(5) the returned page (c) is loaded by the listen thread, allowing the main
+thread's device load to carry on.
+
+The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
+CPUs start running.
+At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
+and is no longer used by migration, while the listen thread carries
+on servicing page data until the end of migration.
+
+=== Postcopy states ===
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+  Advise:  Set at the start of migration if postcopy is enabled, even
+           if it hasn't had the start command; here the destination
+           checks that its OS has the support needed for postcopy, and performs
+           setup to ensure the RAM mappings are suitable for later postcopy.
+           The destination will fail early in migration at this point if the
+           required OS support is not present.
+           (Triggered by reception of POSTCOPY_ADVISE command)
+
+  Discard: Entered on receipt of the first 'discard' command; prior to
+           the first Discard being performed, hugepages are switched off
+           (using madvise) to ensure that no new huge pages are created
+           during the postcopy phase, and to cause any huge pages that
+           have discards on them to be broken.
+
+  Listen:  The first command in the package, POSTCOPY_LISTEN, switches
+           the destination state to Listen, and starts a new thread
+           (the 'listen thread') which takes over the job of receiving
+           pages off the migration stream, while the main thread carries
+           on processing the blob.  With this thread able to process page
+           reception, the destination now 'sensitises' the RAM to detect
+           any access to missing pages (on Linux using the 'userfault'
+           system).
+
+  Running: POSTCOPY_RUN causes the destination to synchronise all
+           state and start the CPUs and IO devices running.  The main
+           thread now finishes processing the migration package and
+           now carries on as it would for normal precopy migration
+           (although it can't do the cleanup it would do as it
+           finishes a normal migration).
+
+  End:     The listen thread can now quit, and perform the cleanup of migration
+           state; the migration is now complete.
+
+=== Source side page maps ===
+
+The source side keeps two bitmaps during postcopy: the 'migration bitmap'
+and the 'sent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that a page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages; however, during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy.  It is a bitmap that
+has a bit set whenever a page is sent to the destination; during the
+transition to postcopy mode it is combined with the migration bitmap
+to form a set of pages that:
+   a) Have been sent but then redirtied (which must be discarded)
+   b) Have not yet been sent - which also must be discarded to cause any
+      transparent huge pages built during precopy to be broken.
+
+Note that the contents of the sent map are sacrificed during the calculation
+of the discard set and thus aren't valid once in postcopy.  The migration
+bitmap is still valid and is used to ensure that no page is sent more than
+once.  Any request for a page that has already been sent is ignored.
+Duplicate requests such as this can happen as a page is sent at about the
+same time the destination accesses it.
+
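
Illustration (not part of the patch): the discard-set calculation described
under 'Source side page maps' can be sketched with plain integers standing in
for the two bitmaps.  This only mirrors the set logic in the text and is not
QEMU's implementation; the name discard_set and the use of Python integers as
bitmaps are assumptions made for the example.

```python
def discard_set(sent_map: int, migration_bitmap: int, npages: int) -> int:
    """Return a bitmap of pages the destination must discard.

    Bit i of each argument refers to page i.  Hypothetical helper for
    illustration only; not a QEMU function.
    """
    all_pages = (1 << npages) - 1
    # a) pages that were sent but dirtied again afterwards
    redirtied = sent_map & migration_bitmap
    # b) pages that were never sent at all
    unsent = ~sent_map & all_pages
    return redirtied | unsent


# Pages 0-3 were sent during precopy; pages 0 and 2 were then redirtied,
# and pages 4 and 5 are dirty but were never sent.
sent = 0b00001111
dirty = 0b00110101
print(format(discard_set(sent, dirty, 8), "08b"))  # prints 11110101
```

In this example only pages 1 and 3 (sent and still clean) survive on the
destination; every other page is discarded and must be sent again or
requested during postcopy.
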
-- 
2.5.0
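
Illustration (not part of the patch): the 'Source behaviour' text says the
dirty page scan restarts from a requested location, and 'Source side page
maps' says requests for already-sent pages are ignored.  A minimal simulation
of that ordering follows.  The function simulate_source, the one-page-per-step
model, and the wrap-around at the end of RAM are all assumptions for the
sketch, not details taken from QEMU.

```python
def simulate_source(dirty_pages, requests, npages):
    """Return the order in which pages would be sent.

    dirty_pages: iterable of page indexes still to send.
    requests:    dict mapping a step number to the page requested at that
                 step (hypothetical model of destination page requests).
    """
    dirty = {p for p in dirty_pages if 0 <= p < npages}
    cursor = 0
    order = []
    step = 0
    while dirty:
        requested = requests.get(step)
        # A request for a page that was already sent is ignored.
        if requested is not None and requested in dirty:
            cursor = requested
        # Find the next dirty page at or after the cursor (wrapping).
        for offset in range(npages):
            page = (cursor + offset) % npages
            if page in dirty:
                break
        dirty.discard(page)
        order.append(page)
        cursor = (page + 1) % npages
        step += 1
    return order


# Page 6 is requested at step 2; page 0 is requested at step 4, after it
# has already been sent, so that request has no effect.
print(simulate_source(range(8), {2: 6, 4: 0}, 8))
# prints [0, 1, 6, 7, 2, 3, 4, 5]
```

The requested page 6 jumps the queue and page 7 follows it immediately, which
is the "pages directly after the requested page" effect the text describes.
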

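Illustration (not part of the patch): the destination state sequence from
'Postcopy states' can be written as a small transition table.  The state names
and command names come from the text; the initial NONE state, the event
strings "discard" and "listen-thread-done", and the step helper are
assumptions for the sketch, and the table follows the strict
ADVISE->DISCARD->LISTEN->RUNNING->END order given in the text.

```python
from enum import Enum


class PostcopyState(Enum):
    NONE = "none"        # assumed starting point, not named in the text
    ADVISE = "advise"
    DISCARD = "discard"
    LISTEN = "listen"
    RUNNING = "running"
    END = "end"


# (current state, event) -> next state
TRANSITIONS = {
    (PostcopyState.NONE, "POSTCOPY_ADVISE"): PostcopyState.ADVISE,
    (PostcopyState.ADVISE, "discard"): PostcopyState.DISCARD,
    # Only the first discard changes state; later ones stay in DISCARD.
    (PostcopyState.DISCARD, "discard"): PostcopyState.DISCARD,
    (PostcopyState.DISCARD, "POSTCOPY_LISTEN"): PostcopyState.LISTEN,
    (PostcopyState.LISTEN, "POSTCOPY_RUN"): PostcopyState.RUNNING,
    (PostcopyState.RUNNING, "listen-thread-done"): PostcopyState.END,
}


def step(state, event):
    """Apply one event, rejecting anything outside the documented order."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"unexpected {event!r} in state {state.name}") from None


state = PostcopyState.NONE
for event in ["POSTCOPY_ADVISE", "discard", "discard",
              "POSTCOPY_LISTEN", "POSTCOPY_RUN", "listen-thread-done"]:
    state = step(state, event)
print(state.name)  # prints END
```
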
