All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
	qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH RFC] block: Tolerate existing writers on read only BdrvChild
Date: Wed, 1 Mar 2017 20:39:13 +0800	[thread overview]
Message-ID: <20170301123913.GA26744@lemon.lan> (raw)
In-Reply-To: <20170301094917.GA4799@noname.redhat.com>

On Wed, 03/01 10:49, Kevin Wolf wrote:
> Am 01.03.2017 um 09:15 hat Fam Zheng geschrieben:
> > In an ideal world, read-write access to an image is inherently
> > exclusive, because we cannot guarantee other readers and writers a
> > consistency view of the whole image at all point. That's what the
> > current permission system does, and it is okay as long as it is entirely
> > internal. But that would change with the coming image locking. In
> > practice, both end users and our test cases use tools like qemu-img and
> > qemu-io to peek at images while guest is running.
> > 
> > Relax a bit and accept other writers in this case.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> Hm. On the one hand I think you're right that people are using things
> like this, on other hand it's also true that the result might not be
> consistent and therefore image locking is right about forbidding these
> actions.
> 
> I think your patch is too permissive, it allows even launching a second
> long-running VM on the image, which will definitely see corrupted data
> sooner or later.
> 
> Maybe what we can do is allow shared writers for read-only images if
> CONSISTENT_READ isn't requested. It's still not 100% correct because we
> can get inconsistent metadata and cause unexpected failure, but this is
> probably tolerable. We would then have to change the allowed actions to
> not request this permission.
> 
> In qemu-img, we have these read-only users:
> 
> * qemu-img check (without -r): Let's keep this blocked, it will only
>   report lots of leaks and leads to invalid bug reports. I've had my
>   share of them.
> 
> * qemu-img compare: Not sure if this makes sense with an image that is
>   in active use?
> 
> * qemu-img convert source: Similarly to qemu-img compare, this doesn't
>   make a whole lot of sense, with one exception: People are using -s to
>   extract internal snapshots while the VM is still running, and this
>   usually works because the snapshot doesn't change. However, I don't
>   want to know what happens if you delete the snapshort from the qemu
>   process while qemu-img convert is running... Doing 'qemu-img snapshot
>   -s ...' with a running VM is more tolerated abuse than a supported
>   feature.
> 
> * qemu-img info: This one makes perfect sense even with a running VM
> 
> * qemu-img map: Results are potentially meaningless with concurrent I/O,
>   but there may be cases where it makes sense.
> 
> * qemu-img snapshot -l: Somewhat similar to qemu-img convert -s, except
>   that it's very quick and doesn't access actual data (could even be
>   BDRV_O_NOIO, I think). Allowing this makes sense, I guess.
> 
> * qemu-img rebase, safe mode, backing files: If we allowed concurrent
>   writes, it wouldn't be safe.
> 
> * qemu-img bench: Well... No. You don't need this on an image with an
>   active VM.
> 
> * qemu-img dd: Same as convert, except that there is no -s.
> 
> So the list of subcommands where we want to support it is rather short.
> We can change blk_new_open() to clear CONSISTENT_READ for BDRV_O_NOIO,
> which could cover 'info' and 'snapshot -l'.
> 
> That leaves us with qemu-io, 'convert -s' and 'map', all of which can be
> imagined to be useful even with a running VM, but all of which can also
> easily produce wrong results in this case. An explicit command line
> option to disable CONSISTENT_READ might be the right tool here.

I'm not sure about this because: 1) this is intrusive from a user PoV, many
scripts and upper layer tools will stop working; 2) CONSISTENT_READ is enforced
by qcow2 in its .bdrv_child_perm implementation even if blk_new_open() doesn't
ask for it, therefore such an option has to impact the whole graph; 3) this
isn't only about asking for "persistent read" perm, but more about granting
"write" in shared_perm, so it feels messy.

A perhaps more contained way is to add a BDRV_O_RELAXED_LOCK flag and use it in
those subcommands, then in the image locking code, the "no other writer" byte is
not locked if it's set. This has a simpler semantic and a more manageable scope.

Fam

  reply	other threads:[~2017-03-01 12:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-01  8:15 [Qemu-devel] [PATCH RFC] block: Tolerate existing writers on read only BdrvChild Fam Zheng
2017-03-01  9:49 ` Kevin Wolf
2017-03-01 12:39   ` Fam Zheng [this message]
2017-03-01 15:16     ` Kevin Wolf
2017-03-01 16:10       ` Fam Zheng
2017-03-01 16:22         ` Kevin Wolf
2017-03-02 11:21           ` Fam Zheng
2017-03-02 14:23             ` Kevin Wolf
2017-03-03  1:46               ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170301123913.GA26744@lemon.lan \
    --to=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.