From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45113) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1asnmF-00064X-E3 for qemu-devel@nongnu.org; Wed, 20 Apr 2016 04:46:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1asnmD-0003tS-Sd for qemu-devel@nongnu.org; Wed, 20 Apr 2016 04:46:27 -0400 References: <1460690887-32751-1-git-send-email-famz@redhat.com> <1460690887-32751-5-git-send-email-famz@redhat.com> <5712CAD6.9040707@redhat.com> <20160418013334.GC18893@ad-mail.usersys.redhat.com> <571471E5.5010907@openvz.org> <5716839C.8040808@redhat.com> From: "Denis V. Lunev" Message-ID: <571741D0.10900@openvz.org> Date: Wed, 20 Apr 2016 11:46:08 +0300 MIME-Version: 1.0 In-Reply-To: <5716839C.8040808@redhat.com> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH for-2.7 v2 04/17] block: Introduce image file locking List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Max Reitz , Fam Zheng Cc: qemu-devel@nongnu.org, Kevin Wolf , Jeff Cody , Markus Armbruster , Eric Blake , John Snow , qemu-block@nongnu.org, berrange@redhat.com, pbonzini@redhat.com On 04/19/2016 10:14 PM, Max Reitz wrote: > On 18.04.2016 07:34, Denis V. Lunev wrote: >> On 04/18/2016 04:33 AM, Fam Zheng wrote: >>> On Sun, 04/17 01:29, Max Reitz wrote: >>>> On 15.04.2016 05:27, Fam Zheng wrote: >>>>> Block drivers can implement this new operation .bdrv_lockf to >>>>> actually lock the >>>>> image in the protocol specific way. >>>>> >>>>> Signed-off-by: Fam Zheng >>>>> --- >>>>> block.c | 42 >>>>> ++++++++++++++++++++++++++++++++++++++++++ >>>>> include/block/block_int.h | 12 ++++++++++++ >>>>> 2 files changed, 54 insertions(+) >>>> I'm prepared for everyone hating this idea, but I'm just bringing it up >>>> so I can always say I did bring it up. >>>> >>>> Heads up: This will be about qcow2 locking again. >>>> >>>> Relax, though, it won't be about how much better qcow2 locking is better >>>> than protocol locking. >>>> >>>> Now that you know this feel free to drop out. >>>> >>>> This patch implements locking by just trying to lock every single BDS >>>> that is being opened. While it may fulfill its purpose, I don't think >>>> that is what we actually want. >>>> >>>> What we want is the following: qemu has a BDS graph. It is basically a >>>> forest of trees. It may be a bit more complicated (DAGs instead of >>>> trees), but let's just assume it is. >>>> >>>> What we want to protect are leaves in this tree. Every leaf basically >>>> corresponds to a physical resource such as a file or an NBD connection. >>>> Every leaf is driven by a protocol block driver. We want to protect >>>> these physical resources from concurrent access. >>>> >>>> Ideally, we can just protect the physical resource itself. This works >>>> for raw-posix, this works for gluster, this works for raw-win32, and >>>> probably some other protocols, too. But I guess it won't work for all >>>> protocols, and even if it does, it would need to be implemented. >>>> >>>> But we can protect leaves in the BDS forest by locking non-leaves also: >>>> If you lock a qcow2 node, all of its "file" subtree will be protected; >>>> normally, that's just a single leaf. >>>> >>>> Therefore, I think the ideal approach would be for each BDS tree that is >>>> to be created we try to lock all of its leaves, and if that does not >>>> work for some, we walk up the tree and try to lock inner nodes (e.g. >>>> format BDSs which then use format locking) so that the leaves are still >>>> protected even if their protocol does not support that. >>>> >>>> This could be implemented like this: Whenever a leaf BDS is created, try >>>> to lock it. If we can't, leave some information to the parent node that >>>> its child could not be locked. Then, the parent will evaluate this >>>> information and try to act upon it. This then recurses up the tree. Or, >>>> well, down the tree, considering that in most natural trees the root is >>>> at the bottom. >>>> >>>> >>>> We could just implement qcow2 locking on top of this series as it is, >>>> but this would result in qcow2 files being locked even if their files' >>>> protocol nodes have been successfully locked. That would be superfluous >>>> and we'd have all the issues with force-unlocking qcow2 files we have >>>> discussed before. >>>> >>>> >>>> So what am I saying? I think that it makes sense to consider format >>>> locking as a backup alternative to protocol locking in case the latter >>>> is not possible. I think it is possible to implement both using the same >>>> framework. >>>> >>>> I don't think we need to worry about the actual implementation of format >>>> locking now. But I do think having a framework which supports both >>>> format and protocol locking is possible and would be nice to have. >>>> >>>> Such a framework would require more effort, however, than the basically >>>> brute-force "just lock everything" method presented in this patch. Don't >>>> get me wrong, this method here works for what it's supposed to do (I >>>> haven't reviewed it yet, though), and it's very reasonable if protocol >>>> locking is all we intend to have. I'm just suggesting that maybe we do >>>> want to have more than that. >>>> >>>> >>>> All in all, I won't object if the locking framework introduced by this >>>> series is not supposed to and does not work with format locking. It can >>>> always be added later if I really like it so much, and I can definitely >>>> understand if it appears to be too much effort for basically no gain >>>> right now. >>>> >>>> As I said above, I just brought this up so I brought it up. :-) >>> I don't hate this idea, but it is not necessarily much more effort. >>> We can >>> always check the underlying file in qcow2's locking implementation, >>> can't we? >>> >>> int qcow2_lockf(BlockDriverState *bs, int cmd) >>> { >>> if ((cmd != BDRV_LOCKF_UNLOCK) && !bdrv_is_locked(bs->file)) { >>> return 0; >>> } >>> ... >>> } >>> >>> The problem with doing this generically in block layer is the >>> chicken-and-egg >>> problem: it's not safe to just have format probling code or qcow2 >>> driver to >>> read the image or even writing to the header field for opening it, >>> another >>> process could be writing to the image already. A challenge with format >>> locking is the lack of file level atomic operations (cmpxchg on the image >>> header). >>> >>> Fam >> We should not touch images! >> >> If QEMU will die, f.e. on assert or by power off, we will suffer >> a REAL pain to guess whether the lock is taken or not. >> flock() and friends is a perfect mechanics for the purpose >> as the kernel will drop corresponding locks by magic. >> >> Header changes are good for detecting of unclean stops and >> running consistency checks after that, f.e. if we have lazy >> ref-counters on. >> >> Pls do not do this inside the image. We have eaten that >> stuff in our older products and this is really BAD way to >> follow. > My suggestion was to use format locking only when protocol locking is > not available. And you can always switch it off manually. > > Also, the bulk of my argument was not to implement format locking right > now, but to keep in mind that maybe we do want to implement it later. > > Max > OK. this sounds sane and fair if we could start with the current approach and extend it later on with format locking. Den