[Qemu-devel] [PULL 4/9] qcow2: Document some maximum size constraints

From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, peter.maydell@linaro.org, qemu-devel@nongnu.org
Subject: [Qemu-devel] [PULL 4/9] qcow2: Document some maximum size constraints
Date: Mon, 19 Nov 2018 15:29:39 +0100
Message-ID: <20181119142944.29061-5-kwolf@redhat.com>
In-Reply-To: <20181119142944.29061-1-kwolf@redhat.com>

From: Eric Blake <eblake@redhat.com>

Although off_t permits up to 63 bits (8 EB) of file offsets, in
practice we're going to hit other limits first.  Document some
of those limits in the qcow2 spec (some are inherent, others are
implementation choices of qemu), and describe how the choice of
cluster size can influence some of them.

While we cannot map any uncompressed virtual cluster to any
address higher than 64 PB (56 bits) (due to the current L1/L2
field encoding stopping at bit 55), a refcount table at qemu's
8M cap can still cover larger host addresses for some
combinations of large clusters and small refcount_order.  For
comparison, ext4 with 4k blocks caps files at 16 TB.
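
As a rough illustration of the refcount arithmetic above, the following
Python sketch computes how many host bytes a refcount table of a given
size can cover.  The function name and parameters are illustrative and
are not taken from the qemu sources; the sketch assumes 8-byte refcount
table entries and refcount blocks of exactly one cluster, as the qcow2
spec describes.

```python
def refcount_table_reach(cluster_bits, refcount_order,
                         max_table_bytes=8 * 1024 * 1024):
    """Host bytes covered by a refcount table of max_table_bytes.

    Hypothetical helper for illustration; the 8 MB default mirrors the
    qemu implementation limit mentioned in this patch.
    """
    cluster_size = 1 << cluster_bits
    table_entries = max_table_bytes // 8        # 64-bit table entries
    refcount_bits = 1 << refcount_order         # width of one refcount
    refcounts_per_block = cluster_size * 8 // refcount_bits
    return table_entries * refcounts_per_block * cluster_size


# Print the reach as a power of two for a few combinations.
for cluster_bits, refcount_order in [(21, 4), (9, 6), (13, 0), (21, 0)]:
    reach = refcount_table_reach(cluster_bits, refcount_order)
    print(cluster_bits, refcount_order, reach.bit_length() - 1)
```

With 2M clusters and refcount_order = 0 the result exceeds 56 bits,
which is the case the paragraph above refers to.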

Another interesting limit: for compressed clusters, the L2 layout
requires an ever-smaller maximum host offset as cluster size gets
larger, down to a 512 TB maximum with 2M clusters.  In particular,
with a cluster size of 8k or smaller, the L2 entry for a compressed
cluster could technically point beyond the 64 PB mark.  However,
with 8k clusters and refcount_order = 0, you cannot access beyond
512 TB without exceeding qemu's 8M cap on the refcount table, so it
is unlikely that any image in the wild has attempted to do so.  To
be safe, let's document that bits beyond 55 in the host offset of a
compressed cluster must be 0.
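
The width of the compressed cluster offset field follows from the
descriptor layout in the spec (x = 62 - (cluster_bits - 8)).  A minimal
Python sketch of that relationship, with hypothetical helper names and
the 56-bit cap taken from the rule this patch documents:

```python
def compressed_offset_bits(cluster_bits):
    """Number of bits (x) in the compressed host offset field."""
    return 62 - (cluster_bits - 8)


def max_compressed_host_offset(cluster_bits, usable_bits=56):
    """Exclusive upper bound on a compressed cluster's host offset.

    usable_bits=56 reflects the requirement that bits beyond 55 be 0.
    """
    return 1 << min(compressed_offset_bits(cluster_bits), usable_bits)


# Cluster sizes from 512 bytes (9 bits) to 2M (21 bits).
for cluster_bits in range(9, 22):
    print(cluster_bits,
          compressed_offset_bits(cluster_bits),
          max_compressed_host_offset(cluster_bits).bit_length() - 1)
```

For cluster_bits of 13 or less the raw field is wider than 56 bits, so
the cap applies; from 14 upward the field width itself is the limit.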

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
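Note (not part of the commit message): the L1 table figures added to
the "size" header field in the first hunk below can be checked with a
short Python sketch.  The helper name is hypothetical; it assumes
8-byte L1 and L2 entries and L2 tables of exactly one cluster, and it
ignores the separate 56-bit host offset limit.

```python
def max_populated_virtual_size(cluster_bits,
                               max_l1_bytes=32 * 1024 * 1024):
    """Virtual bytes addressable through an L1 table of max_l1_bytes.

    Illustrative only; the 32 MB default mirrors the qemu
    implementation limit documented in this patch.
    """
    cluster_size = 1 << cluster_bits
    l1_entries = max_l1_bytes // 8      # each L1 entry is 8 bytes
    l2_entries = cluster_size // 8      # one L2 table fills one cluster
    return l1_entries * l2_entries * cluster_size


# 512-byte, 64k and 2M clusters, shown as a power of two.
for cluster_bits in (9, 16, 21):
    size = max_populated_virtual_size(cluster_bits)
    print(cluster_bits, size.bit_length() - 1)
```

Each extra bit of cluster size adds two bits of reach, because both the
L2 entry count and the cluster size double.
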
 docs/interop/qcow2.txt | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 845d40a086..fb5cb47245 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -40,7 +40,18 @@ The first cluster of a qcow2 image contains the file header:
                     with larger cluster sizes.
 
          24 - 31:   size
-                    Virtual disk size in bytes
+                    Virtual disk size in bytes.
+
+                    Note: qemu has an implementation limit of 32 MB as
+                    the maximum L1 table size.  With a 2 MB cluster
+                    size, it is unable to populate a virtual cluster
+                    beyond 2 EB (61 bits); with a 512 byte cluster
+                    size, it is unable to populate a virtual size
+                    larger than 128 GB (37 bits).  Meanwhile, L1/L2
+                    table layouts limit an image to no more than 64 PB
+                    (56 bits) of populated clusters, and an image may
+                    hit other limits first (such as a file system's
+                    maximum size).
 
          32 - 35:   crypt_method
                     0 for no encryption
@@ -326,6 +337,17 @@ in the image file.
 It contains pointers to the second level structures which are called refcount
 blocks and are exactly one cluster in size.
 
+Although a large enough refcount table can reserve clusters past 64 PB
+(56 bits) (assuming the underlying protocol can even be sized that
+large), note that some qcow2 metadata such as L1/L2 tables must point
+to clusters prior to that point.
+
+Note: qemu has an implementation limit of 8 MB as the maximum refcount
+table size.  With a 2 MB cluster size and a default refcount_order of
+4, it is unable to reference host resources beyond 2 EB (61 bits); in
+the worst case, with a 512 byte cluster size and refcount_order of 6, it is
+unable to access beyond 32 GB (35 bits).
+
 Given an offset into the image file, the refcount of its cluster can be
 obtained as follows:
 
@@ -365,6 +387,16 @@ The L1 table has a variable size (stored in the header) and may use multiple
 clusters, however it must be contiguous in the image file. L2 tables are
 exactly one cluster in size.
 
+The L1 and L2 tables have implications on the maximum virtual file
+size; for a given L1 table size, a larger cluster size is required for
+the guest to have access to more space.  Furthermore, a virtual
+cluster must currently map to a host offset below 64 PB (56 bits)
+(although this limit could be relaxed by putting reserved bits into
+use).  Additionally, as cluster size increases, the maximum host
+offset for a compressed cluster is reduced (a 2M cluster size requires
+compressed clusters to reside below 512 TB (49 bits), and this limit
+cannot be relaxed without an incompatible layout change).
+
 Given an offset into the virtual disk, the offset into the image file can be
 obtained as follows:
 
@@ -427,7 +459,9 @@ Standard Cluster Descriptor:
 Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):
 
     Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
-                    cluster or sector boundary!
+                    cluster or sector boundary!  If cluster_bits is
+                    small enough that this field includes bits beyond
+                    55, those upper bits must be set to 0.
 
          x - 61:    Number of additional 512-byte sectors used for the
                     compressed data, beyond the sector containing the offset
-- 
2.19.1
