qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Eric Blake <eblake@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, qemu-block@nongnu.org, berto@igalia.com
Subject: Re: [Qemu-devel] [PATCH v5 4/5] qcow2: Don't allow overflow during cluster allocation
Date: Wed, 25 Apr 2018 16:44:12 +0200	[thread overview]
Message-ID: <3fb4c5dd-bb6f-e965-30f2-cafd0e3a2d2c@redhat.com> (raw)
In-Reply-To: <20180423223337.82366-5-eblake@redhat.com>


[-- Attachment #1.1: Type: text/plain, Size: 6954 bytes --]

On 2018-04-24 00:33, Eric Blake wrote:
> Our code was already checking that we did not attempt to
> allocate more clusters than what would fit in an INT64 (the
> physical maximimum if we can access a full off_t's worth of
> data).  But this does not catch smaller limits enforced by
> various spots in the qcow2 image description: L1 and normal
> clusters of L2 are documented as having bits 63-56 reserved
> for other purposes, capping our maximum offset at 64PB (bit
> 55 is the maximum bit set).  And for compressed images with
> 2M clusters, the cap drops the maximum offset to bit 48, or
> a maximum offset of 512TB.  If we overflow that offset, we
> would write compressed data into one place, but try to
> decompress from another, which won't work.
> 
> I don't have 512TB handy to prove whether things break if we
> compress so much data that we overflow that limit, and don't
> think that iotests can (quickly) test it either.  Test 138
> comes close (it corrupts an image into thinking something lives
> at 32PB, which is half the maximum for L1 sizing - although
> it relies on 512-byte clusters).  But that test points out
> that we will generally hit other limits first (such as running
> out of memory for the refcount table, or exceeding file system
> limits like 16TB on ext4, etc), so this is more a theoretical
> safety valve than something likely to be hit.

You don't need 512 TB, though, 36 MB is sufficient.

Here's what you do:
(1) Create a 513 TB image with cluster_size=2M,refcount_bits=1
(2) Take a hex editor and enter 16 refblocks into the reftable
(3) Fill all of those refblocks with 1s

(Funny side note: qemu-img check thinks that image is clean because it
doesn't check refcounts beyond the image end...)

I've attached a compressed test image (unsurprisingly, it compresses
really well).

Before this series:
$ ./qemu-io -c 'write -c 0 2M' test.qcow2
qcow2: Marking image as corrupt: Preventing invalid write on metadata
(overlaps with refcount block); further corruption events will be suppressed
write failed: Input/output error

Aw.

After this series:
$ ./qemu-io -c 'write -c 0 2M' test.qcow2
write failed: Input/output error

(Normal writes just work fine.)


Maybe you want to add a test still -- creating the image is rather quick
(well, you have to write 64 MB of 1s, but other than that).  The only
thing that takes a bit of time is qemu figuring out where the first free
cluster is...  That takes like 15 seconds here.

And another issue of course is...

$ ls -lhs test.qcow2
42M -rw-r--r--. 1 maxx maxx 513T 25. Apr 16:42 test.qcow2

Yeah, that.  Depends on the host file system, of course, whether that is
a real issue. O:-)

Max

> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Alberto Garcia <berto@igalia.com>
> 
> ---
> v3: use s->cluster_offset_mask instead of open-coding it [Berto],
> use MIN() to clamp offset on small cluster size, add spec patch
> first to justify clamping even on refcount allocations
> ---
>  block/qcow2.h          |  6 ++++++
>  block/qcow2-refcount.c | 21 ++++++++++++++-------
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 1df15a18aa1..a3b9faa9d53 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -41,6 +41,12 @@
>  #define QCOW_MAX_CRYPT_CLUSTERS 32
>  #define QCOW_MAX_SNAPSHOTS 65536
> 
> +/* Field widths in qcow2 mean normal cluster offsets cannot reach
> + * 64PB; depending on cluster size, compressed clusters can have a
> + * smaller limit (64PB for up to 16k clusters, then ramps down to
> + * 512TB for 2M clusters).  */
> +#define QCOW_MAX_CLUSTER_OFFSET ((1ULL << 56) - 1)
> +
>  /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
>   * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
>  #define QCOW_MAX_REFTABLE_SIZE 0x800000
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 6bfc11bb48f..fcbea3c9644 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -31,7 +31,8 @@
>  #include "qemu/bswap.h"
>  #include "qemu/cutils.h"
> 
> -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
> +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
> +                                    uint64_t max);
>  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
>                              int64_t offset, int64_t length, uint64_t addend,
>                              bool decrease, enum qcow2_discard_type type);
> @@ -362,7 +363,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
>      }
> 
>      /* Allocate the refcount block itself and mark it as used */
> -    int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
> +    int64_t new_block = alloc_clusters_noref(bs, s->cluster_size,
> +                                             QCOW_MAX_CLUSTER_OFFSET);
>      if (new_block < 0) {
>          return new_block;
>      }
> @@ -954,7 +956,8 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
> 
> 
>  /* return < 0 if error */
> -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
> +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
> +                                    uint64_t max)
>  {
>      BDRVQcow2State *s = bs->opaque;
>      uint64_t i, nb_clusters, refcount;
> @@ -979,9 +982,9 @@ retry:
>      }
> 
>      /* Make sure that all offsets in the "allocated" range are representable
> -     * in an int64_t */
> +     * in the requested max */
>      if (s->free_cluster_index > 0 &&
> -        s->free_cluster_index - 1 > (INT64_MAX >> s->cluster_bits))
> +        s->free_cluster_index - 1 > (max >> s->cluster_bits))
>      {
>          return -EFBIG;
>      }
> @@ -1001,7 +1004,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size)
> 
>      BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
>      do {
> -        offset = alloc_clusters_noref(bs, size);
> +        offset = alloc_clusters_noref(bs, size, QCOW_MAX_CLUSTER_OFFSET);
>          if (offset < 0) {
>              return offset;
>          }
> @@ -1083,7 +1086,11 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>      free_in_cluster = s->cluster_size - offset_into_cluster(s, offset);
>      do {
>          if (!offset || free_in_cluster < size) {
> -            int64_t new_cluster = alloc_clusters_noref(bs, s->cluster_size);
> +            int64_t new_cluster;
> +
> +            new_cluster = alloc_clusters_noref(bs, s->cluster_size,
> +                                               MIN(s->cluster_offset_mask,
> +                                                   QCOW_MAX_CLUSTER_OFFSET));
>              if (new_cluster < 0) {
>                  return new_cluster;
>              }
> 


[-- Attachment #1.2: test.qcow2.xz --]
[-- Type: application/x-xz, Size: 6060 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2018-04-25 14:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 22:33 [Qemu-devel] [PATCH v5 0/5] minor qcow2 compression improvements Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 1/5] qcow2: Prefer byte-based calls into bs->file Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 2/5] qcow2: Document some maximum size constraints Eric Blake
2018-04-24  9:13   ` Alberto Garcia
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 3/5] qcow2: Reduce REFT_OFFSET_MASK Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 4/5] qcow2: Don't allow overflow during cluster allocation Eric Blake
2018-04-25 14:44   ` Max Reitz [this message]
2018-04-25 18:26     ` Eric Blake
2018-04-25 20:31     ` Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 5/5] qcow2: Avoid memory over-allocation on compressed images Eric Blake
2018-04-25 15:00   ` Max Reitz
2018-04-25 18:37     ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3fb4c5dd-bb6f-e965-30f2-cafd0e3a2d2c@redhat.com \
    --to=mreitz@redhat.com \
    --cc=berto@igalia.com \
    --cc=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).