Date: Thu, 17 May 2018 10:14:41 +0800
From: Fam Zheng
To: Vladimir Sementsov-Ogievskiy
Cc: Kevin Wolf, qemu-devel, qemu block, Max Reitz, John Snow,
    Dr. David Alan Gilbert, Juan Jose Quintela Carreira
Subject: Re: [Qemu-devel] Restoring bitmaps after failed/cancelled migration
Message-ID: <20180517021441.GG6731@lemon.usersys.redhat.com>

On Wed, 05/16 18:52, Vladimir Sementsov-Ogievskiy wrote:
> 16.05.2018 18:32, Kevin Wolf wrote:
> > Am 16.05.2018 um 17:10 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 16.05.2018 15:47, Kevin Wolf wrote:
> > > > Am 14.05.2018 um 12:09 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > > 14.05.2018 09:41, Fam Zheng wrote:
> > > > > > On Wed, 04/18 17:00, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > > Is it possible that the target will change the disk, and then
> > > > > > > we return control to the source? In this case the bitmaps will
> > > > > > > be invalid. So, shouldn't we drop all the bitmaps on inactivate?
> > > > > > Yes, dropping all live bitmaps upon inactivate sounds reasonable.
> > > > > > If the dst fails to start, and we want to resume the VM at the
> > > > > > src, we could (optionally?) reload the persistent bitmaps, I
> > > > > > guess.
> > > > > Reload from where? We didn't store them.
> > > > Maybe this just means that it turns out that not storing them was a
> > > > bad idea?
> > > >
> > > > What was the motivation for not storing the bitmap? The additional
> > > > downtime? Is it really that bad, though? Bitmaps should be fairly
> > > > small for the usual image sizes, and writing them out should be
> > > > quick.
> > > What are the usual ones? A bitmap of the standard 64k granularity for
> > > a 16 TB disk is ~30 MB. If we have several such bitmaps, it may be a
> > > significant downtime.
> > We could have an in-memory bitmap that tracks which parts of the
> > persistent bitmap are dirty, so that you don't have to write out the
> > whole 30 MB during the migration downtime, but can already flush most
> > of the persistent bitmap before the VM is stopped.
> >
> > Kevin
> Yes, it looks possible. But how do we control that downtime? Introduce a
> migration state, with a specific _pending function? However, it may not
> be necessary.
>
> Anyway, I think we don't need to store it.
>
> If we decide to resume the source, the bitmap is already in memory, so
> why reload it? If someone has already killed the source (which was in
> paused mode), it is inconsistent anyway, and the loss of the dirty
> bitmap is not the worst possible problem.
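The sizing above checks out, by the way: 16 TiB at 64 KiB granularity is
2^28 clusters = 2^28 bits = 32 MiB per bitmap, so ~30 MB is right.

To make the meta-bitmap idea concrete, here is a rough sketch of how I
read Kevin's suggestion. All names below are invented for illustration
(this is not the current QEMU API), and a real version would also need
to sort out locking between the guest write path and the flush:

/*
 * Sketch only: keep one "meta" bit per chunk of the persistent dirty
 * bitmap, so the live phase of migration rewrites only the chunks that
 * changed since the last flush, and the final pass after the VM stops
 * is short.
 */
#include <stdint.h>
#include <stddef.h>

#define BITS_PER_LONG  (8 * sizeof(unsigned long))
#define CHUNK_SIZE     4096               /* bytes of bitmap per flush unit */
#define CHUNK_BITS     (CHUNK_SIZE * 8)   /* data-bitmap bits per chunk */

typedef struct MetaBitmap {
    unsigned long *data;  /* the dirty bitmap itself, 1 bit per cluster */
    unsigned long *meta;  /* 1 bit per CHUNK_SIZE bytes of 'data' */
    uint64_t nb_bits;     /* number of valid bits in 'data' */
} MetaBitmap;

static void set_bit(unsigned long *map, uint64_t bit)
{
    map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

/* Guest write path: dirty the cluster and remember which chunk of the
 * persistent bitmap now differs from what is on disk. */
void meta_bitmap_set(MetaBitmap *mb, uint64_t cluster)
{
    set_bit(mb->data, cluster);
    set_bit(mb->meta, cluster / CHUNK_BITS);
}

/* Flush pass: write out only the chunks whose meta bit is set.
 * store_chunk stands in for whatever persists a piece of the bitmap in
 * the image file.  Clear the meta bit before copying, so a concurrent
 * write is not lost; it just re-dirties the chunk for the next pass. */
void meta_bitmap_flush(MetaBitmap *mb,
                       void (*store_chunk)(const void *buf,
                                           uint64_t offset, size_t len))
{
    uint64_t nb_bytes = (mb->nb_bits + 7) / 8;
    uint64_t nb_chunks = (nb_bytes + CHUNK_SIZE - 1) / CHUNK_SIZE;

    for (uint64_t c = 0; c < nb_chunks; c++) {
        if (!(mb->meta[c / BITS_PER_LONG] & (1UL << (c % BITS_PER_LONG)))) {
            continue;
        }
        mb->meta[c / BITS_PER_LONG] &= ~(1UL << (c % BITS_PER_LONG));

        size_t len = nb_bytes - c * CHUNK_SIZE;
        if (len > CHUNK_SIZE) {
            len = CHUNK_SIZE;
        }
        store_chunk((const char *)mb->data + c * CHUNK_SIZE,
                    c * CHUNK_SIZE, len);
    }
}

With something like this, the _pending hook you mention could simply
report (number of set meta bits) * CHUNK_SIZE as the remaining cost, and
migration would stop the VM once that drops below the downtime budget.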
> So, finally, it looks safe enough just to make the bitmaps on the
> source persistent again (or better, introduce another way to skip
> storing, maybe with an additional flag, so everybody will be happy,
> instead of dropping the persistent flag).

This makes some sense to me. We'll then use the current persistent flag
to indicate that the bitmap "is" a persistent one, instead of "should it
be persisted". They are apparently two different properties in the case
discussed in this thread (see the sketch in the P.S. below).

> And, after the source resumes, we have one of the following situations:
>
> 1. The disk was not changed during migration, so all is OK and we have
>    the bitmaps.
> 2. The disk was changed. The bitmaps are inconsistent. But not only the
>    bitmaps: the whole VM state is inconsistent with its disk. This case
>    is a bug in the management layer and should never happen. And
>    possibly we need some separate way to catch such cases.

Fam
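P.S. To illustrate the flag split above, a rough sketch; the names are
invented here, not the fields QEMU actually has today:

#include <stdbool.h>

/* Invented names, illustration only. */
typedef struct BitmapPersistence {
    bool persistent;  /* property: the bitmap belongs to the image file
                       * (it was loaded from it / should live in it)   */
    bool skip_store;  /* transient override: do not write the bitmap
                       * back on inactivate/close                      */
} BitmapPersistence;

/* Migration sets the override instead of clearing 'persistent'; the
 * destination receives the bitmap through the migration stream. */
void bitmap_prepare_for_migration(BitmapPersistence *p)
{
    p->skip_store = true;
}

/* Resuming the source just undoes the override, so the copy already in
 * memory stays valid and stays persistent. */
void bitmap_resume_at_source(BitmapPersistence *p)
{
    p->skip_store = false;
}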