From: John Snow
Date: Tue, 5 Jan 2016 15:21:31 -0500
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 05/10] block: Inactivate BDS when migration completes
Message-ID: <568C25CB.7050808@redhat.com>
In-Reply-To: <5679B5DE.5070205@redhat.com>
References: <1450802786-20893-1-git-send-email-kwolf@redhat.com> <1450802786-20893-6-git-send-email-kwolf@redhat.com> <5679B5DE.5070205@redhat.com>
To: Eric Blake, Kevin Wolf, qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, mreitz@redhat.com

On 12/22/2015 03:43 PM, Eric Blake wrote:
> On 12/22/2015 09:46 AM, Kevin Wolf wrote:
>> So far, live migration with shared storage meant that the image is in a
>> not-really-ready don't-touch-me state on the destination while the
>> source is still actively using it, but after completing the migration,
>> the image was fully opened on both sides. This is bad.
>>
>> This patch adds a block driver callback to inactivate images on the
>> source before completing the migration. Inactivation means that it goes
>> to a state as if it had just been live migrated to the qemu instance on
>> the source (i.e. BDRV_O_INCOMING is set). You're then supposed to
>> continue either on the source or on the destination, which takes
>> ownership of the image.
>>
>> A typical migration now looks like this with respect to disk images:
>>
>> 1. Destination qemu is started, the image is opened with
>>    BDRV_O_INCOMING. The image is fully opened on the source.
>>
>> 2. Migration is about to complete. The source flushes the image and
>>    inactivates it. Now both sides have the image opened with
>>    BDRV_O_INCOMING and are expecting the other side to still modify it.
>
> The name BDRV_O_INCOMING now doesn't quite match semantics on the
> source, but I don't have any better suggestions. BDRV_O_LIMITED_USE?
> BDRV_O_HANDOFF? At any rate, I fully agree with your logic of locking
> things down on the source to mark that the destination is about to take
> over write access to the file.
>

INCOMING is handy as it keeps the code simple, even if it's weird to
read. Is it worth adding the extra ifs/case statements everywhere just
to introduce a separate BDRV_O_HANDOFF?

Maybe in the future someone will use BDRV_O_INCOMING to mean something
more specific (data is actually incoming, not just in the process of
being handed off), which could cause problems. Maybe even just renaming
BDRV_O_INCOMING to BDRV_O_HANDOFF right now would accomplish the
semantics we want on both source and destination without needing two
flags.

Follow your dreams, go with what you feel.
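To make sure I'm picturing the same lifecycle you are, here it is as a
toy sketch (standalone C, not QEMU code; TOY_O_HANDOFF and the two
helper functions below are just stand-ins for BDRV_O_INCOMING or a
renamed BDRV_O_HANDOFF and whatever the real callbacks end up being):

    /* Standalone toy model of the handoff; names are stand-ins, not the
     * real QEMU API. */
    #include <stdio.h>

    #define TOY_O_HANDOFF (1 << 0)  /* hypothetical rename of BDRV_O_INCOMING */

    typedef struct ToyBDS {
        const char *name;
        int open_flags;
    } ToyBDS;

    /* Source side, just before migration completes: flush, then re-enter
     * the "don't touch me" state. */
    static void toy_inactivate(ToyBDS *bs)
    {
        /* a real implementation would flush the image here */
        bs->open_flags |= TOY_O_HANDOFF;
        printf("%s: inactivated, writing is now forbidden\n", bs->name);
    }

    /* Whichever side resumes (normally the destination) takes ownership
     * back by clearing the flag. */
    static void toy_invalidate(ToyBDS *bs)
    {
        bs->open_flags &= ~TOY_O_HANDOFF;
        printf("%s: activated, this instance owns the image\n", bs->name);
    }

    int main(void)
    {
        ToyBDS src = { "source", 0 };                  /* fully open */
        ToyBDS dst = { "destination", TOY_O_HANDOFF }; /* started with -incoming */

        toy_inactivate(&src);  /* step 2: both sides now have the flag set */
        toy_invalidate(&dst);  /* step 3: destination resumes and takes over */
        return 0;
    }

Whatever we end up calling the flag, the invariant is the same: at most
one side ever has it cleared.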
>>
>> 3. One side (the destination on success) continues and calls
>>    bdrv_invalidate_all() in order to take ownership of the image
>>    again. This removes BDRV_O_INCOMING on the resuming side; the flag
>>    remains set on the other side.
>>
>> This ensures that the same image isn't written to by both instances
>> (unless both are resumed, but then you get what you deserve). This is
>> important because .bdrv_close for non-BDRV_O_INCOMING images could
>> write to the image file, which is definitely forbidden while another
>> host is using the image.
>
> And indeed, this is a prereq to your patch that modifies the file on
> close to clear the new 'open-for-writing' flag :)
>
>>
>> Signed-off-by: Kevin Wolf
>> ---
>>  block.c                   | 34 ++++++++++++++++++++++++++++++++++
>>  include/block/block.h     |  1 +
>>  include/block/block_int.h |  1 +
>>  migration/migration.c     |  7 +++++++
>>  qmp.c                     | 12 ++++++++++++
>>  5 files changed, 55 insertions(+)
>>
>
>> @@ -1536,6 +1540,9 @@ static void migration_completion(MigrationState *s, int current_active_state,
>>      if (!ret) {
>>          ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>>          if (ret >= 0) {
>> +            ret = bdrv_inactivate_all();
>> +        }
>> +        if (ret >= 0) {
>>              qemu_file_set_rate_limit(s->file, INT64_MAX);
>
> Isn't the point of the rate limit change to allow any pending
> operations to flush without artificially slow limits? Will
> inactivating the device be too slow if rate limits are still in
> effect?
>

This sets the rate limit for the migration pipe, doesn't it? My reading
was that this removes any artificial limits for the sake of post-copy,
but we shouldn't be flushing any writes to disk at this point, so I
think this order won't interfere with anything.

> But offhand, I don't have any strong proof that a different order is
> required, so yours makes sense to me.
>
> You may want a second opinion, but I'm okay if you add:
> Reviewed-by: Eric Blake
>

Reviewed-by: John Snow
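
P.S. For anyone skimming the thread later, this is the completion
ordering I was describing above, written out as a toy sketch with
stand-in stub functions rather than the real QEMU calls:

    /* Toy paraphrase of the completion ordering discussed above; the
     * stub functions stand in for the real QEMU calls. */
    #include <stdio.h>
    #include <stdint.h>

    static int stop_guest(void)
    {
        printf("1. stop guest CPUs\n");
        return 0;
    }

    static int inactivate_images(void)
    {
        printf("2. flush and inactivate all block devices\n");
        return 0;
    }

    static void lift_rate_limit(int64_t limit)
    {
        printf("3. raise migration stream rate limit to %lld\n",
               (long long)limit);
    }

    static int send_final_state(void)
    {
        printf("4. send remaining RAM and device state\n");
        return 0;
    }

    int main(void)
    {
        int ret = stop_guest();
        if (ret >= 0) {
            /* the new step from this patch: the source gives up its
             * images before the destination is told to take over */
            ret = inactivate_images();
        }
        if (ret >= 0) {
            /* only the migration pipe is unthrottled here; no disk
             * writes should be pending anymore, so the order relative
             * to the inactivation shouldn't matter */
            lift_rate_limit(INT64_MAX);
            ret = send_final_state();
        }
        return ret < 0 ? 1 : 0;
    }

i.e. the rate limit change only affects the migration stream; any disk
flushing has already happened by the time we get there.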