From: Max Reitz
Date: Tue, 3 Apr 2018 18:03:43 +0200
Subject: Re: [Qemu-devel] [PATCH v4 for 2.12 0/3] fix bitmaps migration through shared storage
To: Vladimir Sementsov-Ogievskiy, qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, jsnow@redhat.com, den@openvz.org

On 2018-03-30 17:32, Vladimir Sementsov-Ogievskiy wrote:
> 30.03.2018 16:31, Vladimir Sementsov-Ogievskiy wrote:
>> 29.03.2018 18:09, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.03.2018 17:03, Max Reitz wrote:
>>>> On 2018-03-29 10:08, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 28.03.2018 17:53, Max Reitz wrote:
>>>>>> On 2018-03-27 12:11, Vladimir Sementsov-Ogievskiy wrote:
>>>> [...]
>>>>
>>>>>>> isn't it because of a lot of cat processes? will check, update the
>>>>>>> loop to
>>>>>>> i=0; while check -qcow2 169; do ((i++)); echo $i OK; killall -9 cat;
>>>>>>> done
>>>>>> Hmm...  I know I tried to kill all of the cats, but for some reason
>>>>>> that didn't really help yesterday.  Seems to help now, for 2.12.0-rc0
>>>>>> at least (that is, before this series).
>>>>> reproduced with killing... (without this series, just on master)
>>>>>
>>>>>> After the whole series, I still get a lot of failures in 169
>>>>>> (mismatching bitmap hash, mostly).
>>>>>>
>>>>>> And interestingly, if I add an abort():
>>>>>>
>>>>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>>>>> index 486f3e83b7..9204c1c0ac 100644
>>>>>> --- a/block/qcow2.c
>>>>>> +++ b/block/qcow2.c
>>>>>> @@ -1481,6 +1481,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
>>>>>>      }
>>>>>>
>>>>>>      if (bdrv_dirty_bitmap_next(bs, NULL)) {
>>>>>> +        abort();
>>>>>>          /* It's some kind of reopen with already existing dirty bitmaps. There
>>>>>>           * are no known cases where we need loading bitmaps in such situation,
>>>>>>           * so it's safer don't load them.
>>>>>>
>>>>>> Then this fires for a couple of test cases of 169 even without the
>>>>>> third patch of this series.
>>>>>>
>>>>>> I guess bdrv_dirty_bitmap_next() reacts to some bitmaps that
>>>>>> migration adds or something?  Then this would be the wrong
>>>>>> condition, because I guess we still want to load the bitmaps that
>>>>>> are in the qcow2 file.
>>>>>>
>>>>>> I'm not sure whether bdrv_has_readonly_bitmaps() is the correct
>>>>>> condition then, either, though.  Maybe let's take a step back: We
>>>>>> want to load all the bitmaps from the file exactly once, and that
>>>>>> is when it is opened the first time.  Or that's what I would have
>>>>>> thought...  Is that even correct?
>>>>>>
>>>>>> Why do we load the bitmaps when the device is inactive anyway?
>>>>>> Shouldn't we load them only once the device is activated?
>>>>> Hmm, not sure.  Maybe we don't need to.  But we need to load them
>>>>> anyway when opening read-only, and we should reopen them
>>>>> correspondingly in this case.
>>>> Yeah, well, yeah, but the current state seems just wrong.  Apparently
>>>> there are cases where a qcow2 node may have bitmaps before we try to
>>>> load them from the file, so the current condition doesn't work.
>>>>
>>>> Furthermore, it seems like the current "state machine" is too complex,
>>>> so we don't know which cases are possible anymore and what to do when.
>>>>
>>>> So the first thing we could do is add a field to the BDRVQCow2State
>>>> that tells us whether the bitmaps have been loaded already or not.
>>>> If not, we invoke qcow2_load_dirty_bitmaps() and set the value to
>>>> true.  If the value was true already and the BDS is active and R/W
>>>> now, we call qcow2_reopen_bitmaps_rw_hint().  That should solve one
>>>> problem.
>>>
>>> good idea, will do.
>>>
>>>>
>>>> The other problem of course is the question whether we should call
>>>> qcow2_load_dirty_bitmaps() at all while the drive is still inactive.
>>>> You know the migration model better than me, so I'm asking this
>>>> question to you.  We can phrase it differently: Do we need to load
>>>> the bitmaps before the drive is activated?
>>>
>>> Now I think that we don't.  At least, we don't have such cases in
>>> Virtuozzo (I hope :).
>>>
>>> Why did I doubt:
>>>
>>> 1. We have cases when we want to start the VM as inactive, to be able
>>> to export its drive as an NBD export, push some data to it and then
>>> start the VM (which involves activating).
>>> 2. We have cases when we want to start the VM stopped and operate on
>>> dirty bitmaps.
>>>
>>> If we just open all images in inactive mode until VM start, it looks
>>> like we need bitmaps in inactive mode (for 2.).  But that looks like
>>> the wrong approach anyway.
>>> At first I tried to solve (1.) by simply calling inactivate_all() when
>>> starting the VM in paused mode, but that breaks at least (2.), so
>>> finally I solved (1.) with an approach similar to "-incoming defer".
>>> So we have inactive mode in two cases:
>>>  - incoming migration
>>>  - pushing data to the VM before start
>>>
>>> and in these cases we don't need to load dirty bitmaps.
>>>
>>> There is also an inconsistency: we currently remove persistent bitmaps
>>> on inactivate, so it is inconsistent to load them in inactive mode.
>>>
>>> Ok, I'll try to respin.
>>
>> Finally, what cases do we actually have for qcow2_do_open?
>>
>> 1. INACTIVE -> ACTIVE (through invalidate_cache; we obviously should
>> load bitmaps, if we decided that we have no persistent bitmaps in
>> INACTIVE mode)
>> 2. creating a new bdrv state (first open of the image) in INACTIVE mode
>> (will not load bitmaps)
>> 3. creating a new bdrv state (first open of the image) in ACTIVE mode
>> (will load bitmaps, maybe read-only if the disk is RO)
>>
>> If these are the only three cases, it would be enough to just load
>> bitmaps if !INACTIVE and do nothing otherwise.

Maybe, but I'd prefer just adding a flag for now.  We can think about
whether we need bitmaps in inactive mode later, and even then managing
the state through a flag doesn't hurt.

>> Or do we have some of the following cases too?
>>
>> 1?. ACTIVE -> ACTIVE (through invalidate_cache, some kind of no-op; we
>> should not reload bitmaps)
>> 2?. RO -> RW (we should reopen_bitmaps_rw) (or is that possible only
>> through bdrv_reopen, which will not touch qcow2_do_open?)
>> 3?. RW -> RO (reopen_bitmaps_ro ?)
>> ? something else??

I don't really know, but I just hope having a flag saves us from having
to think about all of this too much. :-)

>>> about 169, how often is it reproducible for you?

More errors than passes, I guess.  But it always leaves cats behind, and
that's another sign that something isn't right.

>> It becomes very interesting.
>>
>> The persistent-migbitmap-online case failed on line 136 with a
>> mismatched bitmap sha.  That check is not related to migration; at that
>> line the bitmap has already been migrated successfully.  But for some
>> reason it is corrupted after stopping/starting the VM.  How that is
>> possible, I can't imagine.  It doesn't look really related to
>> migration, though...  Might it be related to the case where postcopy
>> was not finished, or something like that?  Maybe changing the
>> wait(RESUME) to something more appropriate will help.  But it is
>> strange anyway.
>>
>> The persistent-notmigbitmap-offline case failed on line 128, which is
>> "self.vm_b.event_wait("RESUME", timeout=10.0)", with a timeout.  Can we
>> skip RESUME for some reason?  Maybe just moving to the MIGRATION event
>> will help; I will check.

It would still be interesting to know where the RESUME ended up...

> Looks like moving from "RESUME" to "MIGRATION" events fixes the whole
> issue (I already have 740 successful iterations, which is good enough,
> but I will not stop the loop until at least 2000).  I'll submit a patch.
> But I don't understand why it fixes the sha256 mismatch, and why the
> sha256 check fails with "RESUME" (the check succeeds after migration and
> fails after stop/start, o_O)...

Well, if you have the time, it might be worth investigating.

Max
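[Editor's note: for readers following the thread, below is a small,
self-contained C model of the flag-based approach Max sketches above (a
field in BDRVQCow2State recording whether persistent bitmaps were already
loaded).  All struct, field, and function names here are stand-ins for
illustration only; this is not the actual QEMU code and not necessarily
what the eventual patches did.

#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for BDRVQCow2State; bitmaps_loaded is the proposed flag. */
typedef struct QcowState {
    bool bitmaps_loaded;   /* persistent bitmaps already read from the image? */
    bool inactive;         /* models BDRV_O_INACTIVE */
    bool read_only;
} QcowState;

/* Stand-ins for qcow2_load_dirty_bitmaps() / qcow2_reopen_bitmaps_rw_hint(). */
static void load_dirty_bitmaps(QcowState *s)  { (void)s; puts("load bitmaps from file"); }
static void reopen_bitmaps_rw(QcowState *s)   { (void)s; puts("switch in-memory bitmaps to RW"); }

/* Models the bitmap handling in qcow2_do_open()/invalidate_cache. */
static void open_or_invalidate(QcowState *s)
{
    if (s->inactive) {
        return;                  /* don't touch bitmaps while inactive */
    }
    if (!s->bitmaps_loaded) {
        load_dirty_bitmaps(s);   /* first activation: load exactly once */
        s->bitmaps_loaded = true;
    } else if (!s->read_only) {
        reopen_bitmaps_rw(s);    /* already loaded: only flip them to RW */
    }
}

int main(void)
{
    QcowState s = { .inactive = true, .read_only = false };
    open_or_invalidate(&s);      /* e.g. incoming migration: no-op */
    s.inactive = false;
    open_or_invalidate(&s);      /* activation: bitmaps loaded once */
    open_or_invalidate(&s);      /* later call: bitmaps only made writable */
    return 0;
}

The point of the flag is that qcow2_do_open() no longer has to infer
"first open or reopen?" from whether dirty bitmaps happen to exist on the
node, which is exactly the check the abort() above showed to be
unreliable.]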