From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>,
qemu-devel@nongnu.org
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
lilei@linux.vnet.ibm.com, quintela@redhat.com,
Jiang Yunhong <yunhong.jiang@intel.com>,
Dong Eddie <eddie.dong@intel.com>,
amit.shah@redhat.com, Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH v3 03/47] Start documenting how postcopy works.
Date: Tue, 9 Sep 2014 11:34:46 +0800
Message-ID: <540E7556.4010406@cn.fujitsu.com>
In-Reply-To: <1409238244-31720-4-git-send-email-dgilbert@redhat.com>
Hi,
I've read your documentation about postcopy, and it is interesting.
It occurs to me that COLO might gain some improvements from
postcopy.
The first thing I thought of was whether we could use the back channel
to request dirty pages from the source, so that we do not need to manage
a RAM snapshot of the source. That is, on entering a COLO checkpoint,
we would just request from the source the pages that were dirtied on the
destination. I'm not sure whether this would hurt performance, but it may
be worth a try.
On 08/28/2014 11:03 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> docs/migration.txt | 188 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 188 insertions(+)
>
> diff --git a/docs/migration.txt b/docs/migration.txt
> index 0492a45..7f0fdc4 100644
> --- a/docs/migration.txt
> +++ b/docs/migration.txt
> @@ -294,3 +294,191 @@ save/send this state when we are in the middle of a pio operation
> (that is what ide_drive_pio_state_needed() checks). If DRQ_STAT is
> not enabled, the values on that fields are garbage and don't need to
> be sent.
> +
> += Return path =
> +
> +In most migration scenarios there is only a single data path that runs
> +from the source VM to the destination, typically along a single fd (although
> +possibly with another fd or similar for some fast way of throwing pages across).
> +
> +However, some uses need two way communication; in particular the Postcopy destination
> +needs to be able to request pages on demand from the source.
> +
> +For these scenarios there is a 'return path' from the destination to the source;
> +qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
> +path.
> +
> + Source side
> + Forward path - written by migration thread
> + Return path - opened by main thread, read by return-path thread
> +
> + Destination side
> + Forward path - read by main thread
> + Return path - opened by main thread, written by main thread AND postcopy
> + thread (protected by rp_mutex)
> +
> += Postcopy =
> +'Postcopy' migration is a way to deal with migrations that refuse to converge;
> +its plus side is that there is an upper bound on the amount of migration traffic
> +and time it takes, the down side is that during the postcopy phase, a failure of
> +*either* side or the network connection causes the guest to be lost.
> +
> +In postcopy the destination CPUs are started before all the memory has been
> +transferred, and accesses to pages that are yet to be transferred cause
> +a fault that's translated by QEMU into a request to the source QEMU.
> +
> +Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
> +doesn't finish in a given time the switch is automatically made to postcopy.
> +
> +=== Enabling postcopy ===
> +
> +To enable postcopy (prior to the start of migration):
> +
> +migrate_set_capability x-postcopy-ram on
> +
> +The migration will still start in precopy mode, however issuing:
> +
> +migrate_start_postcopy
> +
> +will now cause the transition from precopy to postcopy.
> +It can be issued immediately after migration is started or any
> +time later on. Issuing it after the end of a migration is harmless.
> +
> +=== Postcopy device transfer ===
> +
> +Loading of device data may cause the device emulation to access guest RAM
> +that may trigger faults that have to be resolved by the source, as such
> +the migration stream has to be able to respond with page data *during* the
> +device load, and hence the device data has to be read from the stream completely
> +before the device load begins to free the stream up. This is achieved by
> +'packaging' the device data into a blob that's read in one go.
> +
> +Source behaviour
> +
> +Until postcopy is entered the migration stream is identical to normal precopy,
> +except for the addition of a 'postcopy advise' command at the beginning to
> +let the destination know that postcopy might happen. When postcopy starts
> +the source sends the page discard data and then forms the 'package' containing:
> +
> + Command: 'postcopy ram listen'
> + The device state
> + A series of sections, identical to the precopy stream's device state stream
> + containing everything except postcopiable devices (i.e. RAM)
> + Command: 'postcopy ram run'
> +
> +The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
> +contents are formatted in the same way as the main migration stream.
> +
> +Destination behaviour
> +
> +Initially the destination looks the same as precopy, with a single thread
> +reading the migration stream; the 'postcopy advise' and 'discard' commands
> +are processed to change the way RAM is managed, but don't affect the stream
> +processing.
> +
> +------------------------------------------------------------------------------
> +                        1      2   3     4 5                      6   7
> +main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE DEVICE RUN )
> +thread                             |       |
> +                                   |     (page request)
> +                                   |        \___
> +                                   v            \
> +listen thread:                     --- page -- page -- page -- page -- page --
> +
> +                                   a    b       c
> +------------------------------------------------------------------------------
> +
> +On receipt of CMD_PACKAGED (1)
> + All the data associated with the package - the ( ... ) section in the
> +diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
> +recurses into qemu_loadvm_state_main to process the contents of the package (2)
> +which contains commands (3,6) and devices (4...)
> +
> +On receipt of 'postcopy ram listen' - 3 -(i.e. the 1st command in the package)
> +a new thread (a) is started that takes over servicing the migration stream,
> +while the main thread carries on loading the package. It loads normal
> +background page data (b) but if during a device load a fault happens (5) the
> +returned page (c) is loaded by the listen thread allowing the main thread's
> +device load to carry on.
> +
> +The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
> +CPUs start running.
> +At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
> +and is no longer used by migration, while the listen thread carries
> +on servicing page data until the end of migration.
> +
> +=== Postcopy states ===
> +
> +Postcopy moves through a series of states (see postcopy_ram_state)
> +from ADVISE->LISTEN->RUNNING->END
> +
> + Advise: Set at the start of migration if postcopy is enabled, even
> + if it hasn't had the start command; here the destination
> + checks that its OS has the support needed for postcopy, and performs
> + setup to ensure the RAM mappings are suitable for later postcopy.
> + (Triggered by reception of POSTCOPY_RAM_ADVISE command)
> +
> + Listen: The first command in the package, POSTCOPY_RAM_LISTEN, switches
> + the destination state to Listen, and starts a new thread
> + (the 'listen thread') which takes over the job of receiving
> + pages off the migration stream, while the main thread carries
> + on processing the blob. With this thread able to process page
> + reception, the destination now 'sensitises' the RAM to detect
> + any access to missing pages (on Linux using the 'userfault'
> + system).
> +
> + Running: POSTCOPY_RAM_RUN causes the destination to synchronise all
> + state and start the CPUs and IO devices running. The main
> + thread now finishes processing the migration package and
> + now carries on as it would for normal precopy migration
> + (although it can't do the cleanup it would do as it
> + finishes a normal migration).
> +
> + End: The listen thread can now quit and perform the cleanup of migration
> + state; the migration is now complete.
> +
> +=== Source side page maps ===
> +
> +The source side keeps two bitmaps during postcopy; 'the migration bitmap'
> +and 'sent map'. The 'migration bitmap' is basically the same as in
> +the precopy case, and holds a bit to indicate that page is 'dirty' -
> +i.e. needs sending. During the precopy phase this is updated as the CPU
> +dirties pages, however during postcopy the CPUs are stopped and nothing
> +should dirty anything any more.
> +
> +The 'sent map' is used for the transition to postcopy. It is a bitmap that
> +has a bit set whenever a page is sent to the destination, however during
> +the transition to postcopy mode it is masked against the migration bitmap
> +(sentmap &= migrationbitmap) to generate a bitmap recording pages that
> +have previously been sent but are now dirty again. This masked
> +sentmap is sent to the destination which discards those now dirty pages
> +before starting the CPUs.
> +
> +Note that once in postcopy mode, the sent map is still updated; however,
> +its contents are not necessarily consistent with the pages already sent
> +due to the masking with the migration bitmap.
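To check my reading of the masking step, here is a minimal sketch in Python. It is not QEMU code: the function names and the use of plain integers as bitmaps are my own assumptions for illustration only.

```python
def pages_to_discard(sentmap, migration_bitmap):
    """Return a bitmap of pages that were sent but have been dirtied again.

    Bit i set in sentmap          -> page i was sent to the destination.
    Bit i set in migration_bitmap -> page i is dirty and needs (re)sending.
    """
    # Same operation as "sentmap &= migrationbitmap" in the quoted text.
    return sentmap & migration_bitmap


def bit_indices(bitmap):
    """List the page indices whose bits are set, lowest first."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Pages 0, 1 and 3 were sent; pages 1 and 2 are dirty at the switch to postcopy.
discard = pages_to_discard(0b1011, 0b0110)
print(bit_indices(discard))
```

In this example only page 1 is both sent and dirty, so only it would be discarded on the destination; page 2 was never sent, so there is nothing to discard for it.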
> +
> +=== Destination side page maps ===
> +
> +(Needs to be changed so we can update both easily - at the moment updates are done
> + with a lock)
> +The destination keeps a 'requested map' and a 'received map'.
> +Both maps are initially 0; as pages are received the bits are set in the 'received map'.
> +Incoming requests from the kernel cause the bit to be set in the 'requested map'.
> +When a page is received that is marked as 'requested' the kernel is notified.
> +If the kernel requests a page that has already been 'received' the kernel is notified
> +without re-requesting.
> +
> +This leads to three valid page states:
> +page states:
> + missing (!rc,!rq) - page not yet received or requested
> + received (rc,!rq) - Page received
> + requested (!rc,rq) - page requested but not yet received
> +
> +state transitions:
> + received -> missing (only during setup/discard)
> +
> + missing -> received (normal incoming page)
> + requested -> received (incoming page previously requested)
> + missing -> requested (userfault request)
> +
>
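The destination page states and transitions quoted above can be written down as a small table-driven state machine. This is only a sketch of how I understand the description, not the patch's data structure; the event names ("discard", "page_arrived", "userfault") are hypothetical labels I chose for the causes given in parentheses.

```python
from enum import Enum


class PageState(Enum):
    MISSING = "missing"      # (!rc, !rq) not yet received or requested
    RECEIVED = "received"    # (rc, !rq)  page received
    REQUESTED = "requested"  # (!rc, rq)  requested but not yet received


# Transitions listed in the quoted text, keyed by (state, event).
TRANSITIONS = {
    (PageState.RECEIVED, "discard"): PageState.MISSING,         # setup/discard only
    (PageState.MISSING, "page_arrived"): PageState.RECEIVED,    # normal incoming page
    (PageState.REQUESTED, "page_arrived"): PageState.RECEIVED,  # previously requested
    (PageState.MISSING, "userfault"): PageState.REQUESTED,      # userfault request
    # From the prose: a fault on an already received page just notifies the
    # kernel without re-requesting, so the state does not change.
    (PageState.RECEIVED, "userfault"): PageState.RECEIVED,
}


def next_state(state, event):
    """Return the new page state, or raise ValueError if not listed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("no transition from %s on %r" % (state.name, event)) from None
```

For example, next_state(PageState.MISSING, "userfault") gives PageState.REQUESTED, while a "discard" on a requested page is rejected because the text does not list it.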
--
Thanks,
Yang.