From: "Denis V. Lunev" <den@openvz.org>
To: Fam Zheng <famz@redhat.com>, Max Reitz <mreitz@redhat.com>
Cc: qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
Jeff Cody <jcody@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Eric Blake <eblake@redhat.com>, John Snow <jsnow@redhat.com>,
qemu-block@nongnu.org, berrange@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-2.7 v2 04/17] block: Introduce image file locking
Date: Mon, 18 Apr 2016 08:34:29 +0300 [thread overview]
Message-ID: <571471E5.5010907@openvz.org> (raw)
In-Reply-To: <20160418013334.GC18893@ad-mail.usersys.redhat.com>
On 04/18/2016 04:33 AM, Fam Zheng wrote:
> On Sun, 04/17 01:29, Max Reitz wrote:
>> On 15.04.2016 05:27, Fam Zheng wrote:
>>> Block drivers can implement this new operation .bdrv_lockf to actually lock the
>>> image in the protocol specific way.
>>>
>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>> ---
>>> block.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>> include/block/block_int.h | 12 ++++++++++++
>>> 2 files changed, 54 insertions(+)
>> I'm prepared for everyone hating this idea, but I'm just bringing it up
>> so I can always say I did bring it up.
>>
>> Heads up: This will be about qcow2 locking again.
>>
>> Relax, though, it won't be about how much better qcow2 locking is better
>> than protocol locking.
>>
>> Now that you know this feel free to drop out.
>>
>> This patch implements locking by just trying to lock every single BDS
>> that is being opened. While it may fulfill its purpose, I don't think
>> that is what we actually want.
>>
>> What we want is the following: qemu has a BDS graph. It is basically a
>> forest of trees. It may be a bit more complicated (DAGs instead of
>> trees), but let's just assume it is.
>>
>> What we want to protect are leaves in this tree. Every leaf basically
>> corresponds to a physical resource such as a file or an NBD connection.
>> Every leaf is driven by a protocol block driver. We want to protect
>> these physical resources from concurrent access.
>>
>> Ideally, we can just protect the physical resource itself. This works
>> for raw-posix, this works for gluster, this works for raw-win32, and
>> probably some other protocols, too. But I guess it won't work for all
>> protocols, and even if it does, it would need to be implemented.
>>
>> But we can protect leaves in the BDS forest by locking non-leaves also:
>> If you lock a qcow2 node, all of its "file" subtree will be protected;
>> normally, that's just a single leaf.
>>
>> Therefore, I think the ideal approach would be for each BDS tree that is
>> to be created we try to lock all of its leaves, and if that does not
>> work for some, we walk up the tree and try to lock inner nodes (e.g.
>> format BDSs which then use format locking) so that the leaves are still
>> protected even if their protocol does not support that.
>>
>> This could be implemented like this: Whenever a leaf BDS is created, try
>> to lock it. If we can't, leave some information to the parent node that
>> its child could not be locked. Then, the parent will evaluate this
>> information and try to act upon it. This then recurses up the tree. Or,
>> well, down the tree, considering that in most natural trees the root is
>> at the bottom.
>>
>>
>> We could just implement qcow2 locking on top of this series as it is,
>> but this would result in qcow2 files being locked even if their files'
>> protocol nodes have been successfully locked. That would be superfluous
>> and we'd have all the issues with force-unlocking qcow2 files we have
>> discussed before.
>>
>>
>> So what am I saying? I think that it makes sense to consider format
>> locking as a backup alternative to protocol locking in case the latter
>> is not possible. I think it is possible to implement both using the same
>> framework.
>>
>> I don't think we need to worry about the actual implementation of format
>> locking now. But I do think having a framework which supports both
>> format and protocol locking is possible and would be nice to have.
>>
>> Such a framework would require more effort, however, than the basically
>> brute-force "just lock everything" method presented in this patch. Don't
>> get me wrong, this method here works for what it's supposed to do (I
>> haven't reviewed it yet, though), and it's very reasonable if protocol
>> locking is all we intend to have. I'm just suggesting that maybe we do
>> want to have more than that.
>>
>>
>> All in all, I won't object if the locking framework introduced by this
>> series is not supposed to and does not work with format locking. It can
>> always be added later if I really like it so much, and I can definitely
>> understand if it appears to be too much effort for basically no gain
>> right now.
>>
>> As I said above, I just brought this up so I brought it up. :-)
> I don't hate this idea, but it is not necessarily much more effort. We can
> always check the underlying file in qcow2's locking implementation, can't we?
>
> int qcow2_lockf(BlockDriverState *bs, int cmd)
> {
> if ((cmd != BDRV_LOCKF_UNLOCK) && !bdrv_is_locked(bs->file)) {
> return 0;
> }
> ...
> }
>
> The problem with doing this generically in block layer is the chicken-and-egg
> problem: it's not safe to just have format probling code or qcow2 driver to
> read the image or even writing to the header field for opening it, another
> process could be writing to the image already. A challenge with format
> locking is the lack of file level atomic operations (cmpxchg on the image
> header).
>
> Fam
We should not touch images!
If QEMU will die, f.e. on assert or by power off, we will suffer
a REAL pain to guess whether the lock is taken or not.
flock() and friends is a perfect mechanics for the purpose
as the kernel will drop corresponding locks by magic.
Header changes are good for detecting of unclean stops and
running consistency checks after that, f.e. if we have lazy
ref-counters on.
Pls do not do this inside the image. We have eaten that
stuff in our older products and this is really BAD way to
follow.
Den
next prev parent reply other threads:[~2016-04-18 5:34 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-15 3:27 [Qemu-devel] [PATCH for-2.7 v2 00/17] block: Lock images when opening Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 01/17] block: Add BDRV_O_NO_LOCK Fam Zheng
2016-04-16 10:47 ` Denis V. Lunev
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 02/17] qapi: Add lock-image in blockdev-add options Fam Zheng
2016-04-16 10:48 ` Denis V. Lunev
2016-04-26 8:01 ` Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 03/17] blockdev: Add and parse "lock-image" option for block devices Fam Zheng
2016-04-16 13:15 ` Denis V. Lunev
2016-04-19 13:00 ` Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 04/17] block: Introduce image file locking Fam Zheng
2016-04-16 13:22 ` Denis V. Lunev
2016-04-18 1:43 ` Fam Zheng
2016-04-16 23:29 ` Max Reitz
2016-04-18 1:33 ` Fam Zheng
2016-04-18 5:34 ` Denis V. Lunev [this message]
2016-04-19 19:14 ` Max Reitz
2016-04-20 8:46 ` Denis V. Lunev
2016-04-19 19:13 ` Max Reitz
2016-04-25 23:55 ` Laszlo Ersek
2016-04-26 0:47 ` Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 05/17] raw-posix: Implement .bdrv_lockf Fam Zheng
2016-04-16 13:29 ` Denis V. Lunev
2016-04-18 1:12 ` Fam Zheng
2016-04-18 5:30 ` Denis V. Lunev
2016-04-18 9:34 ` Daniel P. Berrange
2016-04-18 9:38 ` Denis V. Lunev
2016-04-17 19:27 ` Richard W.M. Jones
2016-04-18 1:10 ` Fam Zheng
2016-04-18 8:04 ` Richard W.M. Jones
2016-04-19 12:37 ` Fam Zheng
2016-04-19 13:07 ` Richard W.M. Jones
2016-04-19 13:19 ` Fam Zheng
2016-04-19 13:36 ` Richard W.M. Jones
2016-04-19 13:45 ` Daniel P. Berrange
2016-04-19 13:34 ` Daniel P. Berrange
2016-04-19 13:40 ` Richard W.M. Jones
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 06/17] gluster: " Fam Zheng
2016-04-15 12:24 ` [Qemu-devel] [Qemu-block] " Niels de Vos
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 07/17] rbd: Implement image locking Fam Zheng
2016-04-23 1:57 ` Jason Dillaman
2016-04-25 0:42 ` Fam Zheng
2016-04-26 15:42 ` Jason Dillaman
2016-04-27 0:20 ` Fam Zheng
2016-04-27 18:18 ` Jason Dillaman
2016-04-28 1:33 ` Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 08/17] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
2016-04-16 13:46 ` Denis V. Lunev
2016-04-19 12:59 ` Fam Zheng
2016-04-15 3:27 ` [Qemu-devel] [PATCH for-2.7 v2 09/17] qemu-img: Add "-L" option to sub commands Fam Zheng
2016-04-16 14:29 ` Denis V. Lunev
2016-04-16 14:30 ` Denis V. Lunev
2016-04-19 12:59 ` Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 10/17] qemu-img: Update documentation of "-L" option Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 11/17] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
2016-04-16 14:32 ` Denis V. Lunev
2016-04-19 12:58 ` Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 12/17] qemu-iotests: 140: Disable image lock for qemu-io access Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 13/17] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 14/17] qemu-iotests: Wait for QEMU processes before checking image in 091 Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 15/17] qemu-iotests: Disable image lock when checking test image Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 16/17] block: Turn on image locking by default Fam Zheng
2016-04-15 3:28 ` [Qemu-devel] [PATCH for-2.7 v2 17/17] qemu-iotests: Add test case 152 for image locking Fam Zheng
2016-04-16 14:33 ` [Qemu-devel] [PATCH for-2.7 v2 00/17] block: Lock images when opening Denis V. Lunev
2016-04-18 9:53 ` Daniel P. Berrange
2016-04-19 12:40 ` Fam Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=571471E5.5010907@openvz.org \
--to=den@openvz.org \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=eblake@redhat.com \
--cc=famz@redhat.com \
--cc=jcody@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).