qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Alberto Garcia <berto@igalia.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: qcow2 preallocation and backing files
Date: Wed, 20 Nov 2019 12:27:53 +0000	[thread overview]
Message-ID: <a1575e90-3fe4-e843-518c-714db4b621bc@virtuozzo.com> (raw)
In-Reply-To: <20191120120625.GA25497@igalia.com>

20.11.2019 15:06, Alberto Garcia wrote:
> Hi,
> 
> as we discussed yesterday on IRC there's an inconsistency in the way
> qcow2 preallocation works.
> 
> Let's create an image and fill it with data:
> 
>     $ qemu-img create -f raw base.img 1M
>     $ qemu-io -f raw -c 'write -P 0xFF 0 1M' base.img
> 
> Now QEMU won't let us create a new image backed by base.img using
> preallocation:
> 
>     $ qemu-img create -f qcow2 -b base.img -o preallocation=metadata active.img
>     qemu-img: active.img: Backing file and preallocation cannot be used at the same time
> 
> The reason is that once a cluster is preallocated (i.e. it has a valid
> L2 entry pointing to a host offset) the guest won't see the contents
> of the backing file, so those options conflict with each other.
> 
> It is possible however to create an image that is smaller than
> the backing file and then resize it using preallocation. In this
> case qemu-img will happily accept any --preallocation option, with
> different results from the guest's point of view:
> 
>     # This reads as 0xFF (the data comes from base.img)
>     $ qemu-img create -f qcow2 -b base.img active.img 512K
> 
>     # The second half of the image also reads as 0xFF
>     $ qemu-img resize --preallocation=off active.img 1M
> 
>     # Here the second half reads as zeroes
>     $ qemu-img resize --preallocation=metadata active.img 1M
> 
> Apart from "qemu-img resize", the QMP block-resize command can also
> extend an image like this, although it always uses PREALLOC_MODE_OFF
> and the user cannot change that.
> 
> It does not seem right that the guest-visible data changes depending
> on the preallocation mode. This could be solved by returning an error
> when (backing_bs(blk_bs(blk)) && prealloc != PREALLOC_MODE_OFF) on
> img_resize().
> 
> The important question is however: what behavior is the right one?
> Should growing an image that was smaller than the backing file return
> zeroes, or data from the backing file? I would opt for the latter, for
> simplicity and consistency with the current behavior of block-resize,
> although it was pointed out that this could be a security problem (I'm
> not sure that I agree with that, but we can discuss it).

I'm for zeros way.

1. I'm sure that if guest after some operation may get access to that data
which it should not see, it's a security problem.

2. Seing backing file through new clusters is inconsistent with how read works:
read will return zeroes, not data from backing. Consider the following example:

      0         x     y
top: [---------------]
mid: [---------]
base:[111111111111111]

reading from [x,y] from top will return zeroes, not ones.

So, if we consider data after EOF as zeroes (not UNALLOCATED clusters), we should
not make these clusters UNALLOCATED after truncation.

3. Also, the latter way is inconsistent with discard. Discarded regions returns
zeroes, not clusters from backing. I think discard and truncate should behave
in the same safe zero way.

> 
> This also has a consequence on how preallocation should be implemented
> for images with subclusters. Extended L2 entries allow us to allocate
> a cluster but leave each one of its subclusters unallocated. That
> would allow us to have a cluster that is simultaneously allocated but
> whose data is read from the backing file. But it's up to us to decide
> if that's what we should do when resizing an image.
> 
> Berto
> 


-- 
Best regards,
Vladimir

  reply	other threads:[~2019-11-20 12:29 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20 12:06 qcow2 preallocation and backing files Alberto Garcia
2019-11-20 12:27 ` Vladimir Sementsov-Ogievskiy [this message]
2019-11-20 15:18   ` Alberto Garcia
2019-11-20 15:58     ` Vladimir Sementsov-Ogievskiy
2019-11-20 16:35       ` Alberto Garcia
2019-11-20 16:46       ` Kevin Wolf
2019-11-20 17:49         ` Vladimir Sementsov-Ogievskiy
2019-11-21  8:51       ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a1575e90-3fe4-e843-518c-714db4b621bc@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=berto@igalia.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).