Re: [Qemu-devel] [PATCH v4 2/5] qcow2-refcount: avoid eating RAM

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, den@openvz.org, mreitz@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 2/5] qcow2-refcount: avoid eating RAM
Date: Wed, 27 Feb 2019 08:43:34 -0600	[thread overview]
Message-ID: <09afd97f-0e16-82f2-df02-eccec310c4ab@redhat.com> (raw)
In-Reply-To: <20181214134240.217870-3-vsementsov@virtuozzo.com>

On 12/14/18 7:42 AM, Vladimir Sementsov-Ogievskiy wrote:
> qcow2_inc_refcounts_imrt() (through realloc_refcount_array()) can eat
> an unpredictable amount of memory on corrupted table entries, which are
> referencing regions far beyond the end of file.
> 
> Prevent this, by skipping such regions from further processing.
> 
> Interesting that iotest 138 checks exactly the behavior which we fix
> here. So, change the test appropriately.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---

> +    /* Last cluster of qcow2 image may be semi-allocated, so it's may be OK to
> +     * reference some space after file end but it should be less than one
> +     * cluster.

Are we guaranteed that this is always the case, even when incrementally
building up an image.  That is, if we get a power failure in the middle
of allocating both a new L2 table and a larger refcount table, can we
see an image which temporary references 2 or more clusters beyond the
end of the file that it was previously using?  Or are we guaranteed that
the way we update the refcount table will never see more than one
cluster beyond the end of the file for any other reference?  I guess
where I'm going with this is a question of whether we should permit a
finite overrun to account for multiple pieces of metadata all being in
flight at once, and only reject the obvious overruns that are definitely
beyond that sane limit, and whether 1 cluster is too small for that sane
limit.

> +     */
> +    if (offset + size - file_len >= s->cluster_size) {
> +        fprintf(stderr, "ERROR: counting reference for region exceeding the "
> +                "end of the file by one cluster or more: offset 0x%" PRIx64
> +                " size 0x%" PRIx64 "\n", offset, size);

Should we be using something smarter than fprintf() here?

Grammar suggestion:

ERROR: reference to a region too far beyond the end of the file:

(which is shorter than what you wrote, and also has the nice property of
allowing us to change our mind on whether it is 1 cluster, 2, 10, or
some other finite limit as our heuristic of when we consider the image
corrupt without having to reword it).

> +++ b/tests/qemu-iotests/138
> @@ -54,15 +54,13 @@ $QEMU_IO -c 'write 0 512' "$TEST_IMG" | _filter_qemu_io
>  # Put the data cluster at a multiple of 2 TB, resulting in the image apparently
>  # having a multiple of 2^32 clusters
>  # (To be more specific: It is at 32 PB)
> -poke_file "$TEST_IMG" 2048 "\x80\x80\x00\x00\x00\x00\x00\x00"
> +poke_file "$TEST_IMG" $((2048 + 8)) "\x00\x80\x00\x00\x00\x00\x00\x00"
>  
>  # An offset of 32 PB results in qemu-img check having to allocate an in-memory
> -# refcount table of 128 TB (16 bit refcounts, 512 byte clusters).
> -# This should be generally too much for any system and thus fail.
> -# What this test is checking is that the qcow2 driver actually tries to allocate
> -# such a large amount of memory (and is consequently aborting) instead of having
> -# truncated the cluster count somewhere (which would result in much less memory
> -# being allocated and then a segfault occurring).
> +# refcount table of 128 TB (16 bit refcounts, 512 byte clusters), if qemu-img
> +# don't check that referenced data cluster is far beyond the end of file.
> +# But starting from 4.0, qemu-img does this check, and instead of "Cannot
> +# allocate memory", we have an error showing that l2 entry is invalid.
>  _check_test_img

In other words, we are still gracefully invalidating the file, but now
because of a heuristic rather than a memory allocation failure.

I might word the comment differently:

An offset of 32 PB would result in qemu-img check having to allocate an
in-memory refcount table of 128 TB (16 bit refcounts, 512 byte
clusters). It is unlikely that a system has this much memory, so the
test ensures that qemu-img either gracefully fails an allocation (prior
to 4.0) or flags the image as invalid (the reference points to a cluster
too far beyond the actual file), rather than silently allocating a
truncated memory amount or dumping core.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

next prev parent reply	other threads:[~2019-02-27 14:43 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-14 13:42 [Qemu-devel] [PATCH v4 0/5] qcow2 check improvements part I Vladimir Sementsov-Ogievskiy
2018-12-14 13:42 ` [Qemu-devel] [PATCH v4 1/5] qcow2-refcount: fix check_oflag_copied Vladimir Sementsov-Ogievskiy
2019-02-27 12:02   ` Max Reitz
2018-12-14 13:42 ` [Qemu-devel] [PATCH v4 2/5] qcow2-refcount: avoid eating RAM Vladimir Sementsov-Ogievskiy
2019-02-27 12:26   ` Max Reitz
2019-02-27 14:43   ` Eric Blake [this message]
2018-12-14 13:42 ` [Qemu-devel] [PATCH v4 3/5] qcow2-refcount: check_refcounts_l2: reduce ignored overlaps Vladimir Sementsov-Ogievskiy
2018-12-14 13:42 ` [Qemu-devel] [PATCH v4 4/5] qcow2-refcount: check_refcounts_l2: don't count fixed cluster as allocated Vladimir Sementsov-Ogievskiy
2019-02-27 12:32   ` Max Reitz
2018-12-14 13:42 ` [Qemu-devel] [PATCH v4 5/5] qcow2-refcount: don't mask corruptions under internal errors Vladimir Sementsov-Ogievskiy
2019-02-27 12:42   ` Max Reitz
2019-02-27 12:47     ` Vladimir Sementsov-Ogievskiy
2019-02-27 12:48       ` Max Reitz
2019-01-10 15:28 ` [Qemu-devel] [PATCH v4 0/5] qcow2 check improvements part I Vladimir Sementsov-Ogievskiy
2019-02-27 10:07   ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=09afd97f-0e16-82f2-df02-eccec310c4ab@redhat.com \
    --to=eblake@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).