From: Eric Blake <eblake@redhat.com>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster_index when allocating a new refcount block
Date: Wed, 21 Mar 2018 08:08:23 -0500
Message-ID: <1809fec1-cdf6-0da4-14e8-04b01d4fc48b@redhat.com>
In-Reply-To: <w51po3xixqz.fsf@maestria.local.igalia.com>

On 03/21/2018 04:28 AM, Alberto Garcia wrote:
> On Tue 20 Mar 2018 06:54:15 PM CET, Eric Blake wrote:
>
>>> When we try to allocate new clusters we first look for available ones
>>> starting from s->free_cluster_index and once we find them we increase
>>> their reference counts. Before we get to call update_refcount() to do
>>> this last step s->free_cluster_index is already pointing to the next
>>> cluster after the ones we are trying to allocate.
>>>
>>> During update_refcount() it may happen however that we also need to
>>> allocate a new refcount block in order to store the refcounts of
>>> these new clusters
>>
>> Your changes to test 121 cover this...
>>
>>> (and to complicate things further that may also require us to grow
>>> the refcount table).
>>
>> ...but not this. Is it worth also trying to cover this case in the
>> testsuite as well?
>
> I checked and the patch doesn't really fix that scenario. There's a
> different problem that I haven't debugged completely yet, because of two
> reasons:
>
> - One difference is that when we grow the refcount table we actually
> allocate a new one, so s->free_cluster_index points to the beginning
> of the image (where the previous table was) and any holes left during
> the process are allocated after that (depending on how much data we
> write though).

Yeah, that can make the test harder to reason about, and is slightly
more sensitive to our internal algorithm - but it also explains why you
checked for index > start rather than unconditionally assigning index =
start.

>
> - This scenario is harder to reach: in order to fill a 1-cluster
> refcount table the size of the image needs to be larger than
> (cluster_size³ / refcount_bits) bytes, that's 16TB with the default
> parameters. So although it can be reproduced easily if you reduce the
> cluster size I think it's very infrequent under normal conditions.

Yes, 16TB with the default parameters, but only 2M in the best case for
a test (512-byte clusters, 64-bit refcounts), so it is still easy to
write a test for.

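For anyone who wants to check those two numbers, here is a rough sketch
of the arithmetic. The function name is made up for this mail and is
not part of QEMU; the breakdown into factors assumes 8-byte refcount
table entries and one refcount per cluster, which is how I read the
qcow2 format, and the product collapses to cluster_size³ / refcount_bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of image covered by a refcount table that fits in one cluster.
 * Hypothetical helper for illustration only, not a QEMU function.
 */
static uint64_t one_cluster_reftable_coverage(unsigned cluster_bits,
                                              unsigned refcount_bits)
{
    /* qcow2 allows cluster sizes from 512 bytes to 2M */
    assert(cluster_bits >= 9 && cluster_bits <= 21);
    assert(refcount_bits >= 1 && refcount_bits <= 64);

    uint64_t cluster_size = UINT64_C(1) << cluster_bits;

    /* Assumption: each refcount table entry is 8 bytes */
    uint64_t table_entries = cluster_size / 8;

    /* Each refcount block is one cluster full of refcounts */
    uint64_t refcounts_per_block = cluster_size * 8 / refcount_bits;

    /* Each refcount describes one cluster of the image */
    return table_entries * refcounts_per_block * cluster_size;
}
```

With cluster_bits = 16 and refcount_bits = 16 this gives 2^44 bytes
(16TB), and with cluster_bits = 9 and refcount_bits = 64 it gives 2^21
bytes (2M).
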
>
> But yes, it's a task left for the future.

Fair enough.

>
>>> + /* If the caller needs to restart the search for free clusters,
>>> + * try the same ones first to see if they're still free. */
>>> + if (ret == -EAGAIN) {
>>> + if (s->free_cluster_index > (start >> s->cluster_bits)) {
>>> + s->free_cluster_index = (start >> s->cluster_bits);
>>> + }
>>
>> Is there any harm in making this assignment unconditional, instead of
>> only doing it when free_cluster_index has grown larger than start?
>
> It can happen that it is smaller than 'start' if we were moving the
> refcount table to a new location, so we want to keep the lowest value.

Okay, then my R-b is sufficient on the patch as-is.

>
>> [And unrelated, but it might be nice to do a followup cleanup to track
>> free_cluster_offset by bytes instead of having to shift
>> free_cluster_index everywhere]
>
> I've actually just seen that we already have free_byte_offset, we use
> that for compressed clusters, so it might be possible to use that one...

free_byte_offset really IS a byte offset, as it can point mid-cluster.
But having EVERYTHING be byte-based seems like it would make the code
easier to reason about.

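As a small illustration of the difference (helper names are made up
here, not taken from the QEMU tree): a cluster index can always be
turned into a byte offset, but a byte offset that points mid-cluster
does not survive the trip back through an index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only */

/* Byte offset of the start of the cluster with the given index */
static uint64_t cluster_index_to_offset(uint64_t index, unsigned cluster_bits)
{
    assert(cluster_bits < 64);
    return index << cluster_bits;
}

/* Index of the cluster containing 'offset'; rounds down */
static uint64_t offset_to_cluster_index(uint64_t offset, unsigned cluster_bits)
{
    assert(cluster_bits < 64);
    return offset >> cluster_bits;
}

/* Position of 'offset' inside its cluster; lost when using an index */
static uint64_t offset_within_cluster(uint64_t offset, unsigned cluster_bits)
{
    assert(cluster_bits < 64);
    return offset & ((UINT64_C(1) << cluster_bits) - 1);
}
```

Tracking bytes everywhere would remove the need for the shifts at each
use, at the cost of having to keep cluster alignment in mind wherever a
whole cluster is expected.
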
>
> I'll put that in my TODO list.
>
> Berto
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org