From: Kevin Wolf <kwolf@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
qemu-devel@nongnu.org, berrange@redhat.com,
John Snow <jsnow@redhat.com>,
qemu-block@nongnu.org, rjones@redhat.com,
Jeff Cody <jcody@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
stefanha@redhat.com, den@openvz.org, pbonzini@redhat.com,
eblake@redhat.com
Subject: Re: [Qemu-devel] [PATCH v8 00/36] block: Image locking series
Date: Mon, 24 Oct 2016 12:11:11 +0200 [thread overview]
Message-ID: <20161024101111.GB4374@noname.redhat.com> (raw)
In-Reply-To: <e6c3e63c-13bd-b21d-eb78-f3516c599989@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 7007 bytes --]
Am 22.10.2016 um 03:00 hat Max Reitz geschrieben:
> <parenthesis>
>
> I personally still don't like making locking a qdev property very much
> because it doesn't make sense to me*. But I remember Kevin had his
> reasons (even though I can no longer remember them) for asking for it,
> and I don't have any besides "It doesn't make sense to me". After having
> though a bit about it (= having written three paragraphs and deleted
> them again), I guess I can make some sense of it, though it seems to be
> a rather esoteric use case still; it appears to me that a guest could
> use it to signal that it's fine with some block device being shared;
> then we could use a shared lock or none at all or I don't know.
> Otherwise, we should get an exclusive lock for write access and a shared
> lock for read access.
The reason is pretty simple if you think about this question: Why do we
need user input in the first place? It's just because we don't know the
requiremens of the guest, and unfortunately the guest won't tell us
(because things like shared IDE disks simply don't exist on real
hardware). So we need ask the user, and we should do this in a layer as
close to the guest (which is the origin of the requirement) as possible.
This point is the device emulation.
Everything that isn't a guest device is under the control of qemu, so
we don't need user input there.
> (Although, fun question (that's the reason for the "I don't know"): What
> if guest A supports such a volatile block device, but guest B doesn't?
> Then imagine we run A with write access and B with read-only access. A
> will claim no lock or a shared lock, while B will probably claim a
> shared lock, too, expecting writing users to try to grab an exclusive
> lock. So suddenly both can open the image file, but B isn't actually
> prepared for that. Fun!)
Hmm, that's a good point actually.
And in fact, it reminds me how I suggested that the problem at hand is
really similar to what the new op blockers are supposed to do. And the
system I proposed for those already handles this situation just fine.
The relevant part here is how the proposal had two bit masks per user,
one describing which operatings the user needs to be able to perform
itself, and the other one for operations that other users are still
allowed perform. This means that for every operation you have four
possible states: Exclusively used; shared use; unused, but blocking
other users; unused, but allowed for other users.
If we interpret the locking flag as the operation "writes to the image",
in your example, A would have the mode "shared use" and B would have
"unused, but blocking". These two modes are in conflict.
Now, the big question is how to translate this into file locking. This
could become a little tricky. I had a few thoughts involving another
lock on byte 2, but none of them actually worked out so far, because
what we want is essentially a lock that can be shared by readers, that
can also be shared by writers, but not by readers and writers at the
same time.
> So as far as I understand it, those qdev properties should eventually be
> changeable by the guest. And if that is so, the user probably shouldn't
> touch them at all because the guest OS really knows whether it can cope
> with volatile block devices or not, and it will tell qemu if that is so.
This is an interesting thought for PV devices, but it hasn't been part
of the plan so far. I'm not sure if the guest OS block device drivers
even have this information generally, because that's more a matter of
whether a cluster file system is used on the block device or not.
> That may be the reason why the qdev property doesn't make sense to me at
> the first glance: It does make sense for the guest OS to set this
> property at that level, but it doesn't make sense for the user to do so.
> Or maybe it does, but the user is really asking for things breaking then
> ("YES, I AM SURE THAT MY GUEST WILL BE COMPLETELY FINE WITH FLOPPY DISKS
> JUST CHANGING RANDOMLY" -- but I guess we've all been at that point
> ourselves...).
>
> So after having convinced myself twice now, having the qdev property is
> probably correct.
Essentially the whole reason is that we need the information and the
guest doesn't tell us, so we need to ask someone who supposedly knows
enough about the guest - which is the user.
> </parenthesis>
>
> But that's where the flags come in again: The guest may be fine with a
> shared lock or none at all. But what about the block layer? Say you're
> using a qcow2 image; that will not like sharing at all, independently of
> what the guest things. So the block layer needs to have internal
> mechanisms to elevate the locks proposed by the guest devices to e.g.
> exclusive ones. I don't think that is something addressed by this series
> in this version. Maybe you'll still need the flags for that.
I'm not completely sure about the mechanism and whether it should
involve flags, but yes, you need something to propagate the required
locking mode down recursively (just like op blockers will need to be
done - maybe we should really use the same infrastructure and only
do file locking as its implementation of the lowest layer).
What you don't need, however, is user input. We already know that qcow2
doesn't like concurrent writes.
> Final thing: Say you have a user who's crazy. Maybe they want to debug
> something. Anyway, they have some qcow2 image attached to some guest
> device, and they want to access that image simultaneously from some
> other process; as I said, crazy, but there may be legitimate uses for
> that so we should allow it.
>
> Now first that user of course has to set the device's lock_mode to
> shared or none, thus promising qemu that the guest will be fine with
> volatile data. But of course qcow2 will override that lock mode, because
> qcow2 doesn't care about the guest: It primarily cares about its own
> metadata integrity. So at this point the user will need some way to
> override qcow2's choice, too. Therefore, I think we also need a per-node
> flag to override the locking mode.
>
> qcow2 would probably make this flag default to exclusive for its
> underlying node, and the user would then have to override that flag for
> exactly that underlying node.
>
> Therefore I think we still need a block layer property for the lock
> mode. While I now agree that we do need a qdev property, I think that
> alone won't be sufficient.
Attaching a qcow2 image to two processes simply doesn't work unless both
are read-only, so I disagree that we should allow it. Or can you really
come up with a specific reasonable use case that requires this?
Users crazy enough to want something like this patch qemu. Both of us
have patched qemu for debugging guests before and we didn't do things as
crazy as writing to the same qcow2 image from multiple processes.
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
next prev parent reply other threads:[~2016-10-24 10:11 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-30 12:09 [Qemu-devel] [PATCH v8 00/36] block: Image locking series Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 01/36] block: Add flag bits for image locking Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 02/36] qapi: Add ImageLockMode Fam Zheng
2016-10-21 20:45 ` Max Reitz
2016-10-25 5:36 ` Fam Zheng
2016-10-25 13:20 ` Max Reitz
2016-10-25 13:34 ` Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 03/36] block: Introduce image file locking Fam Zheng
2016-10-21 21:04 ` Max Reitz
2016-10-25 5:48 ` Fam Zheng
2016-10-25 13:21 ` Max Reitz
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 04/36] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
2016-10-21 21:15 ` Max Reitz
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 05/36] raw-posix: Add image locking support Fam Zheng
2016-10-21 23:40 ` Max Reitz
2016-10-25 6:31 ` Fam Zheng
2016-10-25 13:28 ` Max Reitz
2016-10-25 13:43 ` Fam Zheng
2016-10-26 14:56 ` Max Reitz
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 06/36] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 07/36] qemu-img: Add "-L" option to sub commands Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 08/36] qemu-img: Update documentation of "-L" option Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 09/36] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 10/36] block: Don't lock drive-backup target image in none mode Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 11/36] block: Add blk_lock_image Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 12/36] virtio-blk: Apply lock-mode when realize Fam Zheng
2016-10-22 0:08 ` Max Reitz
2016-10-22 0:12 ` Max Reitz
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 13/36] scsi-disk: " Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 14/36] scsi-generic: " Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 15/36] qdev: Add "lock-mode" to block device options Fam Zheng
2016-10-22 0:11 ` Max Reitz
2016-10-25 5:58 ` Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 16/36] ide: Apply lock-mode when initialize Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 17/36] nvme: " Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 18/36] usb-storage: Apply lock-mode when realize Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 19/36] pflash: Add "lock-mode" property Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 20/36] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 21/36] qemu-iotests: 091: Prepare for image lock Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 22/36] qemu-iotests: 030: Disable image locking when checking test image Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 23/36] iotests: 087: Disable image locking in cases where file is shared Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 24/36] " Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 25/36] iotests: 130: Check image info locklessly Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 26/36] iotests: Disable image locking in 085 Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 27/36] tests: Use null-co:// instead of /dev/null Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 28/36] qemu-iotests: Add test case 153 for image locking Fam Zheng
2016-09-30 12:09 ` [Qemu-devel] [PATCH v8 29/36] ahci: Use shared lock for shared storage migration Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 30/36] tests/postcopy: Use shared lock for images Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 31/36] fdc: Add lock-mode qdev properties Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 32/36] m25p80: Add 'lock-mode' property Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 33/36] nand: " Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 34/36] onenand: " Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 35/36] spapr_nvram: " Fam Zheng
2016-09-30 12:10 ` [Qemu-devel] [PATCH v8 36/36] sd: " Fam Zheng
2016-10-22 1:00 ` [Qemu-devel] [PATCH v8 00/36] block: Image locking series Max Reitz
2016-10-24 10:11 ` Kevin Wolf [this message]
2016-10-24 18:03 ` Max Reitz
2016-10-25 8:24 ` Kevin Wolf
2016-10-25 13:30 ` Max Reitz
2016-10-25 14:57 ` Kevin Wolf
2016-10-26 11:01 ` Fam Zheng
2016-10-26 15:12 ` Max Reitz
2016-10-26 15:33 ` Kevin Wolf
2016-10-26 15:34 ` Max Reitz
2016-10-27 6:25 ` Fam Zheng
2016-10-26 15:04 ` Max Reitz
2016-10-25 7:09 ` Fam Zheng
2016-10-25 8:06 ` Richard W.M. Jones
2016-10-25 9:19 ` Fam Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161024101111.GB4374@noname.redhat.com \
--to=kwolf@redhat.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=den@openvz.org \
--cc=eblake@redhat.com \
--cc=famz@redhat.com \
--cc=jcody@redhat.com \
--cc=jsnow@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=rjones@redhat.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.