From: Eric Blake <eblake@redhat.com>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, "Denis V . Lunev" <den@openvz.org>,
Anton Nefedov <anton.nefedov@virtuozzo.com>,
qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [RFC PATCH 00/23] Add subcluster allocation to qcow2
Date: Tue, 15 Oct 2019 11:05:23 -0500 [thread overview]
Message-ID: <1d8b92eb-d530-9acb-82a6-87a04ea5c31d@redhat.com> (raw)
In-Reply-To: <cover.1571152571.git.berto@igalia.com>
On 10/15/19 10:23 AM, Alberto Garcia wrote:
> Hi,
>
> this series adds a new feature to the qcow2 on-disk format called
> "Extended L2 Entries", which allows us to do subcluster allocation.
>
> This cover letter explains the reasons behind this proposal, the
> changes to the on-disk format, test results and pending work. If you
> are curious you can also have a look at previous discussions about
> this feature:
>
> === Changes to the on-disk format ===
>
> An L2 entry is 64 bits wide, with this format (for uncompressed
> clusters):
>
> 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0
> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> **<----> <--------------------------------------------------><------->*
> Rsrved host cluster offset of data Reserved
> (6 bits) (47 bits) (8 bits)
>
> bit 63: refcount == 1 (QCOW_OFLAG_COPIED)
> bit 62: compressed = 1 (QCOW_OFLAG_COMPRESSED)
> bit 0: all zeros (QCOW_OFLAG_ZERO)
>
> If Extended L2 Entries are enabled, bit 0 becomes reserved and must be
> unset, and this 64-bit bitmap follows the entry:
>
> 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0
> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> <---------------------------------> <--------------------------------->
> subcluster reads as zeros subcluster is allocated
> (32 bits) (32 bits)
I like the grouping - you can then do a 4-byte read and comparison to 0
to see if the entire cluster reads as zeroes or is unallocated.
With 32k clusters, this results in 1k subclusters. In cluster 1 (offset
32k), which bits map where? (The obvious choices are that sub-cluster
32k maps to bit 0, 33k maps to bit 1, ...; or that sub-cluster 32k maps
to bit 31, 33k maps to bit 30, ...)
/me reads ahead
okay, in patch 5, you said you map the most significant bit to the first
cluster. That feels backwards to me; I wonder if the math is any easier
if you map sub-clusters starting from the least-significant, because
then you get:
bit = (address >> cluster_size) & 32
rather than
bit = 31 - ((address >> cluster_size) & 32)
> Some comments about the results:
>
> - The smallest allowed cluster size for an image with subclusters is
> 16 KB (in this case the subclusters size is 512 bytes), hence the
> missing values in the 4 KB and 8 KB rows.
Again reading ahead, I see that patch 5 requires a 16k minimum cluster
for using extended L2. Could we still permit clusters smaller than
that, but merely document that subclusters are always a minimum of 512
bytes and therefore for an 8k cluster we only use 16 bits (leaving the
other 16 bits zero)? But I'm also fine with the simplicity of just
stating that subclusters require at least 16k clusters.
> === To do ===
>
> A couple of things are missing from this series:
>
> - The ability to efficiently zero individual subclusters using
> qcow2_co_pwrite_zeroes(). At the moment only full clusters can be
> zeroed with this method.
>
> - Alternatively we could get rid of the individual "all zeroes" bits
> altogether and have 64 subclusters per cluster. We would still have
> the QCOW_OFLAG_ZERO bit in the standard cluster descriptor.
I think you've got more flexibility with the two bits per sub-cluster
than you would with just 1 bit and 64 subclusters, so I don't think this
direction is going to get us far.
>
> - The number of subclusters per cluster is always 32. It would be
> trivial to allow configuring this, but I don't see any use case.
Agreed.
>
> - Tests: I have a few written that I'll add in future revisions of
> this series.
>
> - handle_alloc_space() works at the subclusters level. That is, if you
> have an unallocated 2MB cluster with 64KB subclusters, no backing
> image and you write 4KB of data, QEMU won't write zeroes to the
> affected subcluster(s) and will use handle_alloc_space() instead.
> The other subclusters won't be touched and will remain unallocated.
> This behavior is consistent with how subclusters work and saves disk
> space, but offers slightly lower performance (see test results
> above). Theoretically we could offer a setting to configure this,
> but I'm not convinced that this is very useful.
>
> ===========================
>
> As usual, feedback is welcome,
Looks promising!
How do subclusters interact with external data files?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
next prev parent reply other threads:[~2019-10-15 16:06 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-15 15:23 [RFC PATCH 00/23] Add subcluster allocation to qcow2 Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 01/23] qcow2: Add calculate_l2_meta() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 02/23] qcow2: Split cluster_needs_cow() out of count_cow_clusters() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 03/23] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 04/23] qcow2: Add get_l2_entry() and set_l2_entry() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 05/23] qcow2: Document the Extended L2 Entries feature Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 06/23] qcow2: Add dummy has_subclusters() function Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 07/23] qcow2: Add subcluster-related fields to BDRVQcow2State Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 08/23] qcow2: Add offset_to_sc_index() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 09/23] qcow2: Add l2_entry_size() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 10/23] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 11/23] qcow2: Add qcow2_get_subcluster_type() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 12/23] qcow2: Handle QCOW2_CLUSTER_UNALLOCATED_SUBCLUSTER Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 13/23] qcow2: Add subcluster support to calculate_l2_meta() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 14/23] qcow2: Add subcluster support to qcow2_get_cluster_offset() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 15/23] qcow2: Add subcluster support to zero_in_l2_slice() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 16/23] qcow2: Add subcluster support to discard_in_l2_slice() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 17/23] qcow2: Add subcluster support to check_refcounts_l2() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 18/23] qcow2: Add subcluster support to expand_zero_clusters_in_l1() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 19/23] qcow2: Fix offset calculation in handle_dependencies() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 20/23] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 21/23] qcow2: Add subcluster support to handle_alloc_space() Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 22/23] qcow2: Restrict qcow2_co_pwrite_zeroes() to full clusters only Alberto Garcia
2019-10-15 15:23 ` [RFC PATCH 23/23] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Alberto Garcia
2019-10-15 15:42 ` Eric Blake
2019-10-15 16:05 ` Eric Blake [this message]
2019-10-16 7:14 ` [RFC PATCH 00/23] Add subcluster allocation to qcow2 Alberto Garcia
2019-10-23 10:39 ` Vladimir Sementsov-Ogievskiy
2019-10-26 21:26 ` Alberto Garcia
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1d8b92eb-d530-9acb-82a6-87a04ea5c31d@redhat.com \
--to=eblake@redhat.com \
--cc=anton.nefedov@virtuozzo.com \
--cc=berto@igalia.com \
--cc=den@openvz.org \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).