From: Fam Zheng <famz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH RFC] block: Tolerate existing writers on read only BdrvChild
Date: Wed, 1 Mar 2017 20:39:13 +0800 [thread overview]
Message-ID: <20170301123913.GA26744@lemon.lan> (raw)
In-Reply-To: <20170301094917.GA4799@noname.redhat.com>
On Wed, 03/01 10:49, Kevin Wolf wrote:
> Am 01.03.2017 um 09:15 hat Fam Zheng geschrieben:
> > In an ideal world, read-write access to an image is inherently
> > exclusive, because we cannot guarantee other readers and writers a
> > consistency view of the whole image at all point. That's what the
> > current permission system does, and it is okay as long as it is entirely
> > internal. But that would change with the coming image locking. In
> > practice, both end users and our test cases use tools like qemu-img and
> > qemu-io to peek at images while guest is running.
> >
> > Relax a bit and accept other writers in this case.
> >
> > Signed-off-by: Fam Zheng <famz@redhat.com>
>
> Hm. On the one hand I think you're right that people are using things
> like this, on other hand it's also true that the result might not be
> consistent and therefore image locking is right about forbidding these
> actions.
>
> I think your patch is too permissive, it allows even launching a second
> long-running VM on the image, which will definitely see corrupted data
> sooner or later.
>
> Maybe what we can do is allow shared writers for read-only images if
> CONSISTENT_READ isn't requested. It's still not 100% correct because we
> can get inconsistent metadata and cause unexpected failure, but this is
> probably tolerable. We would then have to change the allowed actions to
> not request this permission.
>
> In qemu-img, we have these read-only users:
>
> * qemu-img check (without -r): Let's keep this blocked, it will only
> report lots of leaks and leads to invalid bug reports. I've had my
> share of them.
>
> * qemu-img compare: Not sure if this makes sense with an image that is
> in active use?
>
> * qemu-img convert source: Similarly to qemu-img compare, this doesn't
> make a whole lot of sense, with one exception: People are using -s to
> extract internal snapshots while the VM is still running, and this
> usually works because the snapshot doesn't change. However, I don't
> want to know what happens if you delete the snapshort from the qemu
> process while qemu-img convert is running... Doing 'qemu-img snapshot
> -s ...' with a running VM is more tolerated abuse than a supported
> feature.
>
> * qemu-img info: This one makes perfect sense even with a running VM
>
> * qemu-img map: Results are potentially meaningless with concurrent I/O,
> but there may be cases where it makes sense.
>
> * qemu-img snapshot -l: Somewhat similar to qemu-img convert -s, except
> that it's very quick and doesn't access actual data (could even be
> BDRV_O_NOIO, I think). Allowing this makes sense, I guess.
>
> * qemu-img rebase, safe mode, backing files: If we allowed concurrent
> writes, it wouldn't be safe.
>
> * qemu-img bench: Well... No. You don't need this on an image with an
> active VM.
>
> * qemu-img dd: Same as convert, except that there is no -s.
>
> So the list of subcommands where we want to support it is rather short.
> We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO,
> which could cover 'info' and 'snapshot -l'.
>
> That leaves us with qemu-io, 'convert -s' and 'map', all of which can be
> imagined to be useful even with a running VM, but all of which can also
> easily produce wrong results in this case. An explicit command line
> option to disable CONSISTENT_READ might be the right tool here.
I'm not sure about this because: 1) this is intrusive from a user PoV, many
scripts and upper layer tools will stop working; 2) CONSISTENT_READ is enforced
by qcow2 in its .bdrv_child_perm implementation even if blk_new_open() doesn't
ask for it, therefore such an option has to impact the whole graph; 3) this
isn't only about asking for "persistent read" perm, but more about granting
"write" in shared_perm, so it feels messy.
A perhaps more contained way is to add a BDRV_O_RELAXED_LOCK flag and use it in
those subcommands, then in the image locking code, the "no other writer" byte is
not locked if it's set. This has a simpler semantic and a more manageable scope.
Fam
next prev parent reply other threads:[~2017-03-01 12:39 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-01 8:15 [Qemu-devel] [PATCH RFC] block: Tolerate existing writers on read only BdrvChild Fam Zheng
2017-03-01 9:49 ` Kevin Wolf
2017-03-01 12:39 ` Fam Zheng [this message]
2017-03-01 15:16 ` Kevin Wolf
2017-03-01 16:10 ` Fam Zheng
2017-03-01 16:22 ` Kevin Wolf
2017-03-02 11:21 ` Fam Zheng
2017-03-02 14:23 ` Kevin Wolf
2017-03-03 1:46 ` Fam Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170301123913.GA26744@lemon.lan \
--to=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).