All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org, quintela@redhat.com, amit.shah@redhat.com
Cc: aarcange@redhat.com, pbonzini@redhat.com, liang.z.li@intel.com,
	luis@cs.umu.se, bharata@linux.vnet.ibm.com
Subject: [Qemu-devel] [PATCH v8 01/54] Add postcopy documentation
Date: Tue, 29 Sep 2015 09:37:25 +0100	[thread overview]
Message-ID: <1443515898-3594-2-git-send-email-dgilbert@redhat.com> (raw)
In-Reply-To: <1443515898-3594-1-git-send-email-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
---
 docs/migration.txt | 191 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/docs/migration.txt b/docs/migration.txt
index f6df4be..7853709 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -291,3 +291,194 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two way communication; in particular the Postcopy
+destination needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by return-path thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge
+(or take too long to converge) its plus side is that there is an upper bound on
+the amount of migration traffic and time it takes, the down side is that during
+the postcopy phase, a failure of *either* side or the network connection causes
+the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable postcopy, issue this command on the monitor prior to the
+start of migration:
+
+migrate_set_capability x-postcopy-ram on
+
+The normal commands are then used to start a migration, which is still
+started in precopy mode.  Issuing:
+
+migrate_start_postcopy
+
+will now cause the transition from precopy to postcopy.
+It can be issued immediately after migration is started or any
+time later on.  Issuing it after the end of a migration is harmless.
+
+Note: During the postcopy phase, the bandwidth limits set using
+migrate_set_speed is ignored (to avoid delaying requested pages that
+the destination is waiting for).
+
+=== Postcopy device transfer ===
+
+Loading of device data may cause the device emulation to access guest RAM
+that may trigger faults that have to be resolved by the source, as such
+the migration stream has to be able to respond with page data *during* the
+device load, and hence the device data has to be read from the stream completely
+before the device load begins to free the stream up.  This is achieved by
+'packaging' the device data into a blob that's read in one go.
+
+Source behaviour
+
+Until postcopy is entered the migration stream is identical to normal
+precopy, except for the addition of a 'postcopy advise' command at
+the beginning, to tell the destination that postcopy might happen.
+When postcopy starts the source sends the page discard data and then
+forms the 'package' containing:
+
+   Command: 'postcopy listen'
+   The device state
+      A series of sections, identical to the precopy streams device state stream
+      containing everything except postcopiable devices (i.e. RAM)
+   Command: 'postcopy run'
+
+The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
+contents are formatted in the same way as the main migration stream.
+
+During postcopy the source scans the list of dirty pages and sends them
+to the destination without being requested (in much the same way as precopy),
+however when a page request is received from the destination, the dirty page
+scanning restarts from the requested location.  This causes requested pages
+to be sent quickly, and also causes pages directly after the requested page
+to be sent quickly in the hope that those pages are likely to be used
+by the destination soon.
+
+Destination behaviour
+
+Initially the destination looks the same as precopy, with a single thread
+reading the migration stream; the 'postcopy advise' and 'discard' commands
+are processed to change the way RAM is managed, but don't affect the stream
+processing.
+
+------------------------------------------------------------------------------
+                        1      2   3     4 5                      6   7
+main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE DEVICE RUN )
+thread                             |       |
+                                   |     (page request)
+                                   |        \___
+                                   v            \
+listen thread:                     --- page -- page -- page -- page -- page --
+
+                                   a   b        c
+------------------------------------------------------------------------------
+
+On receipt of CMD_PACKAGED (1)
+   All the data associated with the package - the ( ... ) section in the
+diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
+recurses into qemu_loadvm_state_main to process the contents of the package (2)
+which contains commands (3,6) and devices (4...)
+
+On receipt of 'postcopy listen' - 3 -(i.e. the 1st command in the package)
+a new thread (a) is started that takes over servicing the migration stream,
+while the main thread carries on loading the package.   It loads normal
+background page data (b) but if during a device load a fault happens (5) the
+returned page (c) is loaded by the listen thread allowing the main threads
+device load to carry on.
+
+The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
+CPUs start running.
+At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
+and is no longer used by migration, while the listen thread carries
+on servicing page data until the end of migration.
+
+=== Postcopy states ===
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+  Advise:  Set at the start of migration if postcopy is enabled, even
+           if it hasn't had the start command; here the destination
+           checks that its OS has the support needed for postcopy, and performs
+           setup to ensure the RAM mappings are suitable for later postcopy.
+           The destination will fail early in migration at this point if the
+           required OS support is not present.
+           (Triggered by reception of POSTCOPY_ADVISE command)
+
+  Discard: Entered on receipt of the first 'discard' command; prior to
+           the first Discard being performed, hugepages are switched off
+           (using madvise) to ensure that no new huge pages are created
+           during the postcopy phase, and to cause any huge pages that
+           have discards on them to be broken.
+
+  Listen:  The first command in the package, POSTCOPY_LISTEN, switches
+           the destination state to Listen, and starts a new thread
+           (the 'listen thread') which takes over the job of receiving
+           pages off the migration stream, while the main thread carries
+           on processing the blob.  With this thread able to process page
+           reception, the destination now 'sensitises' the RAM to detect
+           any access to missing pages (on Linux using the 'userfault'
+           system).
+
+  Running: POSTCOPY_RUN causes the destination to synchronise all
+           state and start the CPUs and IO devices running.  The main
+           thread now finishes processing the migration package and
+           now carries on as it would for normal precopy migration
+           (although it can't do the cleanup it would do as it
+           finishes a normal migration).
+
+  End:     The listen thread can now quit, and perform the cleanup of migration
+           state, the migration is now complete.
+
+=== Source side page maps ===
+
+The source side keeps two bitmaps during postcopy; 'the migration bitmap'
+and 'sent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages, however during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy. It is a bitmap that
+has a bit set whenever a page is sent to the destination, however during
+the transition to postcopy mode it is combined with the migration bitmap
+to form a set of pages that:
+   a) Have been sent but then redirtied (which must be discarded)
+   b) Have not yet been sent - which also must be discarded to cause any
+      transparent huge pages built during precopy to be broken.
+
+Note that the contents of the sentmap are sacrificed during the calculation
+of the discard set and thus aren't valid once in postcopy.  The dirtymap
+is still valid and is used to ensure that no page is sent more than once.  Any
+request for a page that has already been sent is ignored.  Duplicate requests
+such as this can happen as a page is sent at about the same time the
+destination accesses it.
+
-- 
2.5.0

  reply	other threads:[~2015-09-29  8:38 UTC|newest]

Thread overview: 119+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-29  8:37 [Qemu-devel] [PATCH v8 00/54] Postcopy implementation Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` Dr. David Alan Gilbert (git) [this message]
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 02/54] Provide runtime Target page information Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 03/54] Init page sizes in qtest Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 04/54] Move configuration section writing Dr. David Alan Gilbert (git)
2015-10-05  6:44   ` Amit Shah
2015-10-30 12:47     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 05/54] qemu_ram_block_from_host Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 06/54] Rename mis->file to from_src_file Dr. David Alan Gilbert (git)
2015-09-29 10:41   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 07/54] Add qemu_get_buffer_in_place to avoid copies some of the time Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 08/54] Add wrapper for setting blocking status on a QEMUFile Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 09/54] Add QEMU_MADV_NOHUGEPAGE Dr. David Alan Gilbert (git)
2015-10-28 10:35   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 10/54] migration/ram.c: Use RAMBlock rather than MemoryRegion Dr. David Alan Gilbert (git)
2015-10-28 10:36   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 11/54] ram_debug_dump_bitmap: Dump a migration bitmap as text Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 12/54] migrate_init: Call from savevm Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 13/54] Move dirty page search state into separate structure Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 14/54] ram_find_and_save_block: Split out the finding Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 15/54] Rename save_live_complete to save_live_complete_precopy Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 16/54] Return path: Open a return path on QEMUFile for sockets Dr. David Alan Gilbert (git)
2015-10-02 15:29   ` Daniel P. Berrange
2015-10-02 16:32     ` Dr. David Alan Gilbert
2015-10-02 17:03       ` Daniel P. Berrange
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 17/54] Return path: socket_writev_buffer: Block even on non-blocking fd's Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 18/54] Migration commands Dr. David Alan Gilbert (git)
2015-10-20 11:22   ` Juan Quintela
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 19/54] Return path: Control commands Dr. David Alan Gilbert (git)
2015-10-20 11:27   ` Juan Quintela
2015-10-26 11:42     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 20/54] Return path: Send responses from destination to source Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 21/54] Return path: Source handling of return path Dr. David Alan Gilbert (git)
2015-10-20 11:33   ` Juan Quintela
2015-10-26 12:06     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 22/54] Rework loadvm path for subloops Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 23/54] Add migration-capability boolean for postcopy-ram Dr. David Alan Gilbert (git)
2015-09-29 20:22   ` Eric Blake
2015-09-30  7:00     ` Amit Shah
2015-09-30 12:44       ` Eric Blake
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 24/54] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages Dr. David Alan Gilbert (git)
2015-10-20 11:50   ` Juan Quintela
2015-10-26 12:22     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 25/54] MIG_CMD_PACKAGED: Send a packaged chunk of migration stream Dr. David Alan Gilbert (git)
2015-10-20 13:25   ` Juan Quintela
2015-10-26 16:21     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 26/54] Modify save_live_pending for postcopy Dr. David Alan Gilbert (git)
2015-10-28 11:03   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 27/54] postcopy: OS support test Dr. David Alan Gilbert (git)
2015-10-20 13:31   ` Juan Quintela
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 28/54] migrate_start_postcopy: Command to trigger transition to postcopy Dr. David Alan Gilbert (git)
2015-09-30 16:25   ` Eric Blake
2015-09-30 16:30     ` Dr. David Alan Gilbert
2015-10-20 13:33   ` Juan Quintela
2015-10-28 11:17   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 29/54] MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration state Dr. David Alan Gilbert (git)
2015-10-20 13:35   ` Juan Quintela
2015-10-30 18:19     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 30/54] Avoid sending vmdescription during postcopy Dr. David Alan Gilbert (git)
2015-10-20 13:35   ` Juan Quintela
2015-10-28 11:19   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 31/54] Add qemu_savevm_state_complete_postcopy Dr. David Alan Gilbert (git)
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 32/54] Postcopy: Maintain sentmap and calculate discard Dr. David Alan Gilbert (git)
2015-10-21 11:17   ` Juan Quintela
2015-10-30 18:43     ` Dr. David Alan Gilbert
2015-11-02 17:31     ` Dr. David Alan Gilbert
2015-11-02 18:19     ` Dr. David Alan Gilbert
2015-11-02 20:14     ` Dr. David Alan Gilbert
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 33/54] postcopy: Incoming initialisation Dr. David Alan Gilbert (git)
2015-10-21  8:35   ` Juan Quintela
2015-11-03 17:59     ` Dr. David Alan Gilbert
2015-11-03 18:32       ` Juan Quintela
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 34/54] postcopy: ram_enable_notify to switch on userfault Dr. David Alan Gilbert (git)
2015-10-28 11:40   ` Amit Shah
2015-09-29  8:37 ` [Qemu-devel] [PATCH v8 35/54] Postcopy: Postcopy startup in migration thread Dr. David Alan Gilbert (git)
2015-10-21  8:57   ` Juan Quintela
2015-10-26 17:12     ` Dr. David Alan Gilbert
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 36/54] Split out end of migration code from migration_thread Dr. David Alan Gilbert (git)
2015-10-21  9:11   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 37/54] Postcopy: End of iteration Dr. David Alan Gilbert (git)
2015-10-21  9:16   ` Juan Quintela
2015-10-29  5:10   ` Amit Shah
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 38/54] Page request: Add MIG_RP_MSG_REQ_PAGES reverse command Dr. David Alan Gilbert (git)
2015-10-21 11:12   ` Juan Quintela
2015-10-26 16:58     ` Dr. David Alan Gilbert
2015-10-29  5:17   ` Amit Shah
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 39/54] Page request: Process incoming page request Dr. David Alan Gilbert (git)
2015-10-21 11:17   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 40/54] Page request: Consume pages off the post-copy queue Dr. David Alan Gilbert (git)
2015-10-26 16:32   ` Juan Quintela
2015-11-03 11:52     ` Dr. David Alan Gilbert
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 41/54] postcopy_ram.c: place_page and helpers Dr. David Alan Gilbert (git)
2015-10-28 10:28   ` Juan Quintela
2015-10-28 13:11     ` Dr. David Alan Gilbert
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 42/54] Postcopy: Use helpers to map pages during migration Dr. David Alan Gilbert (git)
2015-10-28 10:58   ` Juan Quintela
2015-10-30 12:59     ` Dr. David Alan Gilbert
2015-10-30 16:35     ` Dr. David Alan Gilbert
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 43/54] Don't sync dirty bitmaps in postcopy Dr. David Alan Gilbert (git)
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 44/54] Don't iterate on precopy-only devices during postcopy Dr. David Alan Gilbert (git)
2015-10-28 11:01   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 45/54] Host page!=target page: Cleanup bitmaps Dr. David Alan Gilbert (git)
2015-10-28 11:24   ` Juan Quintela
2015-11-03 17:32     ` Dr. David Alan Gilbert
2015-11-03 18:30       ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 46/54] postcopy: Check order of received target pages Dr. David Alan Gilbert (git)
2015-10-28 11:26   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 47/54] Round up RAMBlock sizes to host page sizes Dr. David Alan Gilbert (git)
2015-10-28 11:28   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 48/54] Postcopy; Handle userfault requests Dr. David Alan Gilbert (git)
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 49/54] Start up a postcopy/listener thread ready for incoming page data Dr. David Alan Gilbert (git)
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 50/54] postcopy: Wire up loadvm_postcopy_handle_ commands Dr. David Alan Gilbert (git)
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 51/54] Postcopy: Mark nohugepage before discard Dr. David Alan Gilbert (git)
2015-10-28 14:02   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 52/54] End of migration for postcopy Dr. David Alan Gilbert (git)
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 53/54] Disable mlock around incoming postcopy Dr. David Alan Gilbert (git)
2015-10-21  9:17   ` Juan Quintela
2015-09-29  8:38 ` [Qemu-devel] [PATCH v8 54/54] Inhibit ballooning during postcopy Dr. David Alan Gilbert (git)
     [not found] <1443459153-10965-1-git-send-email-dgilbert@redhat.com>
     [not found] ` <1443459153-10965-2-git-send-email-dgilbert@redhat.com>
     [not found]   ` <87oaftx1nw.fsf@neno.neno>
2015-10-20 11:54     ` [Qemu-devel] [PATCH v8 01/54] Add postcopy documentation Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1443515898-3594-2-git-send-email-dgilbert@redhat.com \
    --to=dgilbert@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=liang.z.li@intel.com \
    --cc=luis@cs.umu.se \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.