qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Denis V. Lunev" <den@openvz.org>
To: Max Reitz <mreitz@redhat.com>, Fam Zheng <famz@redhat.com>
Cc: qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
	Jeff Cody <jcody@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Eric Blake <eblake@redhat.com>, John Snow <jsnow@redhat.com>,
	qemu-block@nongnu.org, berrange@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-2.7 v2 04/17] block: Introduce image file locking
Date: Wed, 20 Apr 2016 11:46:08 +0300	[thread overview]
Message-ID: <571741D0.10900@openvz.org> (raw)
In-Reply-To: <5716839C.8040808@redhat.com>

On 04/19/2016 10:14 PM, Max Reitz wrote:
> On 18.04.2016 07:34, Denis V. Lunev wrote:
>> On 04/18/2016 04:33 AM, Fam Zheng wrote:
>>> On Sun, 04/17 01:29, Max Reitz wrote:
>>>> On 15.04.2016 05:27, Fam Zheng wrote:
>>>>> Block drivers can implement this new operation .bdrv_lockf to
>>>>> actually lock the
>>>>> image in the protocol specific way.
>>>>>
>>>>> Signed-off-by: Fam Zheng <famz@redhat.com>
>>>>> ---
>>>>>    block.c                   | 42
>>>>> ++++++++++++++++++++++++++++++++++++++++++
>>>>>    include/block/block_int.h | 12 ++++++++++++
>>>>>    2 files changed, 54 insertions(+)
>>>> I'm prepared for everyone hating this idea, but I'm just bringing it up
>>>> so I can always say I did bring it up.
>>>>
>>>> Heads up: This will be about qcow2 locking again.
>>>>
>>>> Relax, though, it won't be about how much better qcow2 locking is better
>>>> than protocol locking.
>>>>
>>>> Now that you know this feel free to drop out.
>>>>
>>>> This patch implements locking by just trying to lock every single BDS
>>>> that is being opened. While it may fulfill its purpose, I don't think
>>>> that is what we actually want.
>>>>
>>>> What we want is the following: qemu has a BDS graph. It is basically a
>>>> forest of trees. It may be a bit more complicated (DAGs instead of
>>>> trees), but let's just assume it is.
>>>>
>>>> What we want to protect are leaves in this tree. Every leaf basically
>>>> corresponds to a physical resource such as a file or an NBD connection.
>>>> Every leaf is driven by a protocol block driver. We want to protect
>>>> these physical resources from concurrent access.
>>>>
>>>> Ideally, we can just protect the physical resource itself. This works
>>>> for raw-posix, this works for gluster, this works for raw-win32, and
>>>> probably some other protocols, too. But I guess it won't work for all
>>>> protocols, and even if it does, it would need to be implemented.
>>>>
>>>> But we can protect leaves in the BDS forest by locking non-leaves also:
>>>> If you lock a qcow2 node, all of its "file" subtree will be protected;
>>>> normally, that's just a single leaf.
>>>>
>>>> Therefore, I think the ideal approach would be for each BDS tree that is
>>>> to be created we try to lock all of its leaves, and if that does not
>>>> work for some, we walk up the tree and try to lock inner nodes (e.g.
>>>> format BDSs which then use format locking) so that the leaves are still
>>>> protected even if their protocol does not support that.
>>>>
>>>> This could be implemented like this: Whenever a leaf BDS is created, try
>>>> to lock it. If we can't, leave some information to the parent node that
>>>> its child could not be locked. Then, the parent will evaluate this
>>>> information and try to act upon it. This then recurses up the tree. Or,
>>>> well, down the tree, considering that in most natural trees the root is
>>>> at the bottom.
>>>>
>>>>
>>>> We could just implement qcow2 locking on top of this series as it is,
>>>> but this would result in qcow2 files being locked even if their files'
>>>> protocol nodes have been successfully locked. That would be superfluous
>>>> and we'd have all the issues with force-unlocking qcow2 files we have
>>>> discussed before.
>>>>
>>>>
>>>> So what am I saying? I think that it makes sense to consider format
>>>> locking as a backup alternative to protocol locking in case the latter
>>>> is not possible. I think it is possible to implement both using the same
>>>> framework.
>>>>
>>>> I don't think we need to worry about the actual implementation of format
>>>> locking now. But I do think having a framework which supports both
>>>> format and protocol locking is possible and would be nice to have.
>>>>
>>>> Such a framework would require more effort, however, than the basically
>>>> brute-force "just lock everything" method presented in this patch. Don't
>>>> get me wrong, this method here works for what it's supposed to do (I
>>>> haven't reviewed it yet, though), and it's very reasonable if protocol
>>>> locking is all we intend to have. I'm just suggesting that maybe we do
>>>> want to have more than that.
>>>>
>>>>
>>>> All in all, I won't object if the locking framework introduced by this
>>>> series is not supposed to and does not work with format locking. It can
>>>> always be added later if I really like it so much, and I can definitely
>>>> understand if it appears to be too much effort for basically no gain
>>>> right now.
>>>>
>>>> As I said above, I just brought this up so I brought it up. :-)
>>> I don't hate this idea, but it is not necessarily much more effort.
>>> We can
>>> always check the underlying file in qcow2's locking implementation,
>>> can't we?
>>>
>>>       int qcow2_lockf(BlockDriverState *bs, int cmd)
>>>       {
>>>           if ((cmd != BDRV_LOCKF_UNLOCK) && !bdrv_is_locked(bs->file)) {
>>>               return 0;
>>>           }
>>>           ...
>>>       }
>>>
>>> The problem with doing this generically in block layer is the
>>> chicken-and-egg
>>> problem: it's not safe to just have format probling code or qcow2
>>> driver to
>>> read the image or even writing to the header field for opening it,
>>> another
>>> process could be writing to the image already. A challenge with format
>>> locking is the lack of file level atomic operations (cmpxchg on the image
>>> header).
>>>
>>> Fam
>> We should not touch images!
>>
>> If QEMU will die, f.e. on assert or by power off, we will suffer
>> a REAL pain to guess whether the lock is taken or not.
>> flock() and friends is a perfect mechanics for the purpose
>> as the kernel will drop corresponding locks by magic.
>>
>> Header changes are good for detecting of unclean stops and
>> running consistency checks after that, f.e. if we have lazy
>> ref-counters on.
>>
>> Pls do not do this inside the image. We have eaten that
>> stuff in our older products and this is really BAD way to
>> follow.
> My suggestion was to use format locking only when protocol locking is
> not available. And you can always switch it off manually.
>
> Also, the bulk of my argument was not to implement format locking right
> now, but to keep in mind that maybe we do want to implement it later.
>
> Max
>
OK. this sounds sane and fair if we could start with the
current approach and extend it later on with format
locking.

Den

  reply	other threads:[~2016-04-20  8:46 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-15  3:27 [Qemu-devel] [PATCH for-2.7 v2 00/17] block: Lock images when opening Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 01/17] block: Add BDRV_O_NO_LOCK Fam Zheng
2016-04-16 10:47   ` Denis V. Lunev
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 02/17] qapi: Add lock-image in blockdev-add options Fam Zheng
2016-04-16 10:48   ` Denis V. Lunev
2016-04-26  8:01     ` Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 03/17] blockdev: Add and parse "lock-image" option for block devices Fam Zheng
2016-04-16 13:15   ` Denis V. Lunev
2016-04-19 13:00     ` Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 04/17] block: Introduce image file locking Fam Zheng
2016-04-16 13:22   ` Denis V. Lunev
2016-04-18  1:43     ` Fam Zheng
2016-04-16 23:29   ` Max Reitz
2016-04-18  1:33     ` Fam Zheng
2016-04-18  5:34       ` Denis V. Lunev
2016-04-19 19:14         ` Max Reitz
2016-04-20  8:46           ` Denis V. Lunev [this message]
2016-04-19 19:13       ` Max Reitz
2016-04-25 23:55   ` Laszlo Ersek
2016-04-26  0:47     ` Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 05/17] raw-posix: Implement .bdrv_lockf Fam Zheng
2016-04-16 13:29   ` Denis V. Lunev
2016-04-18  1:12     ` Fam Zheng
2016-04-18  5:30       ` Denis V. Lunev
2016-04-18  9:34       ` Daniel P. Berrange
2016-04-18  9:38         ` Denis V. Lunev
2016-04-17 19:27   ` Richard W.M. Jones
2016-04-18  1:10     ` Fam Zheng
2016-04-18  8:04       ` Richard W.M. Jones
2016-04-19 12:37         ` Fam Zheng
2016-04-19 13:07           ` Richard W.M. Jones
2016-04-19 13:19             ` Fam Zheng
2016-04-19 13:36               ` Richard W.M. Jones
2016-04-19 13:45                 ` Daniel P. Berrange
2016-04-19 13:34         ` Daniel P. Berrange
2016-04-19 13:40           ` Richard W.M. Jones
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 06/17] gluster: " Fam Zheng
2016-04-15 12:24   ` [Qemu-devel] [Qemu-block] " Niels de Vos
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 07/17] rbd: Implement image locking Fam Zheng
2016-04-23  1:57   ` Jason Dillaman
2016-04-25  0:42     ` Fam Zheng
2016-04-26 15:42       ` Jason Dillaman
2016-04-27  0:20         ` Fam Zheng
2016-04-27 18:18           ` Jason Dillaman
2016-04-28  1:33             ` Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 08/17] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
2016-04-16 13:46   ` Denis V. Lunev
2016-04-19 12:59     ` Fam Zheng
2016-04-15  3:27 ` [Qemu-devel] [PATCH for-2.7 v2 09/17] qemu-img: Add "-L" option to sub commands Fam Zheng
2016-04-16 14:29   ` Denis V. Lunev
2016-04-16 14:30     ` Denis V. Lunev
2016-04-19 12:59     ` Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 10/17] qemu-img: Update documentation of "-L" option Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 11/17] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
2016-04-16 14:32   ` Denis V. Lunev
2016-04-19 12:58     ` Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 12/17] qemu-iotests: 140: Disable image lock for qemu-io access Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 13/17] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 14/17] qemu-iotests: Wait for QEMU processes before checking image in 091 Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 15/17] qemu-iotests: Disable image lock when checking test image Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 16/17] block: Turn on image locking by default Fam Zheng
2016-04-15  3:28 ` [Qemu-devel] [PATCH for-2.7 v2 17/17] qemu-iotests: Add test case 152 for image locking Fam Zheng
2016-04-16 14:33 ` [Qemu-devel] [PATCH for-2.7 v2 00/17] block: Lock images when opening Denis V. Lunev
2016-04-18  9:53 ` Daniel P. Berrange
2016-04-19 12:40   ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=571741D0.10900@openvz.org \
    --to=den@openvz.org \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).