Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding
Date: Fri, 26 Mar 2021 14:47:14 +0100	[thread overview]
Message-ID: <3eb1748e-012d-9c74-d9dc-c1a6f2644ba6@redhat.com> (raw)
In-Reply-To: <969cd321-0cc7-ddc3-8a0d-75819be3a6bf@virtuozzo.com>

On 26.03.21 12:48, Vladimir Sementsov-Ogievskiy wrote:
> 10.03.2021 18:59, Max Reitz wrote:
>> When rebuilding the refcount structures (when qemu-img check -r found
>> errors with refcount = 0, but reference count > 0), the new refcount
>> table defaults to being put at the image file end[1].  There is no good
>> reason for that except that it means we will not have to rewrite any
>> refblocks we already wrote to disk.
>>
>> Changing the code to rewrite those refblocks is not too difficult,
>> though, so let us do that.  That is beneficial for images on block
>> devices, where we cannot really write beyond the end of the image file.
>>
>> [1] Unless there is something allocated in the area pointed to by the
>>      last refblock, so we have to write that refblock.  In that case, we
>>      try to put the reftable in there.
>>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1519071
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-refcount.c | 126 ++++++++++++++++++++++-------------------
>>   1 file changed, 67 insertions(+), 59 deletions(-)
>>
>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>> index 8e649b008e..162caeeb8e 100644
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -2352,8 +2352,9 @@ static int 
>> rebuild_refcount_structure(BlockDriverState *bs,
>>                                         int64_t *nb_clusters)
>>   {
>>       BDRVQcow2State *s = bs->opaque;
>> -    int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0;
>> +    int64_t first_free_cluster = 0, reftable_offset = -1, cluster;
>>       int64_t refblock_offset, refblock_start, refblock_index;
>> +    int64_t first_cluster, end_cluster;
>>       uint32_t reftable_size = 0;
>>       uint64_t *on_disk_reftable = NULL;
>>       void *on_disk_refblock;
>> @@ -2365,8 +2366,11 @@ static int 
>> rebuild_refcount_structure(BlockDriverState *bs,
>>       qcow2_cache_empty(bs, s->refcount_block_cache);
>> +    first_cluster = 0;
>> +    end_cluster = *nb_clusters;
>> +
>>   write_refblocks:
>> -    for (; cluster < *nb_clusters; cluster++) {
>> +    for (cluster = first_cluster; cluster < end_cluster; cluster++) {
>>           if (!s->get_refcount(*refcount_table, cluster)) {
>>               continue;
>>           }
>> @@ -2374,65 +2378,68 @@ write_refblocks:
>>           refblock_index = cluster >> s->refcount_block_bits;
>>           refblock_start = refblock_index << s->refcount_block_bits;
>> -        /* Don't allocate a cluster in a refblock already written to 
>> disk */
>> -        if (first_free_cluster < refblock_start) {
>> -            first_free_cluster = refblock_start;
>> -        }
>> -        refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
>> -                                              nb_clusters, 
>> &first_free_cluster);
>> -        if (refblock_offset < 0) {
>> -            fprintf(stderr, "ERROR allocating refblock: %s\n",
>> -                    strerror(-refblock_offset));
>> -            res->check_errors++;
>> -            ret = refblock_offset;
>> -            goto fail;
>> -        }
>> -
>> -        if (reftable_size <= refblock_index) {
>> -            uint32_t old_reftable_size = reftable_size;
>> -            uint64_t *new_on_disk_reftable;
>> +        if (reftable_size > refblock_index &&
>> +            on_disk_reftable[refblock_index])
>> +        {
>> +            refblock_offset = on_disk_reftable[refblock_index];
> 
> In this branch, we assign it to ..
> 
>> +        } else {
>> +            int64_t refblock_cluster_index;
>> -            reftable_size = ROUND_UP((refblock_index + 1) * 
>> REFTABLE_ENTRY_SIZE,
>> -                                     s->cluster_size) / 
>> REFTABLE_ENTRY_SIZE;
>> -            new_on_disk_reftable = g_try_realloc(on_disk_reftable,
>> -                                                 reftable_size *
>> -                                                 REFTABLE_ENTRY_SIZE);
>> -            if (!new_on_disk_reftable) {
>> +            /* Don't allocate a cluster in a refblock already written 
>> to disk */
>> +            if (first_free_cluster < refblock_start) {
>> +                first_free_cluster = refblock_start;
>> +            }
>> +            refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
>> +                                                  nb_clusters,
>> +                                                  &first_free_cluster);
>> +            if (refblock_offset < 0) {
>> +                fprintf(stderr, "ERROR allocating refblock: %s\n",
>> +                        strerror(-refblock_offset));
>>                   res->check_errors++;
>> -                ret = -ENOMEM;
>> +                ret = refblock_offset;
>>                   goto fail;
>>               }
>> -            on_disk_reftable = new_on_disk_reftable;
>> -            memset(on_disk_reftable + old_reftable_size, 0,
>> -                   (reftable_size - old_reftable_size) * 
>> REFTABLE_ENTRY_SIZE);
>> +            refblock_cluster_index = refblock_offset / s->cluster_size;
>> +            if (refblock_cluster_index >= end_cluster) {
>> +                /*
>> +                 * We must write the refblock that holds this refblock's
>> +                 * refcount
>> +                 */
>> +                end_cluster = refblock_cluster_index + 1;
>> +            }
>> -            /* The offset we have for the reftable is now no longer 
>> valid;
>> -             * this will leak that range, but we can easily fix that 
>> by running
>> -             * a leak-fixing check after this rebuild operation */
>> -            reftable_offset = -1;
>> -        } else {
>> -            assert(on_disk_reftable);
>> -        }
>> -        on_disk_reftable[refblock_index] = refblock_offset;
>> +            if (reftable_size <= refblock_index) {
>> +                uint32_t old_reftable_size = reftable_size;
>> +                uint64_t *new_on_disk_reftable;
>> +
>> +                reftable_size =
>> +                    ROUND_UP((refblock_index + 1) * REFTABLE_ENTRY_SIZE,
>> +                             s->cluster_size) / REFTABLE_ENTRY_SIZE;
>> +                new_on_disk_reftable =
>> +                    g_try_realloc(on_disk_reftable,
>> +                                  reftable_size * REFTABLE_ENTRY_SIZE);
>> +                if (!new_on_disk_reftable) {
>> +                    res->check_errors++;
>> +                    ret = -ENOMEM;
>> +                    goto fail;
>> +                }
>> +                on_disk_reftable = new_on_disk_reftable;
>> -        /* If this is apparently the last refblock (for now), try to 
>> squeeze the
>> -         * reftable in */
>> -        if (refblock_index == (*nb_clusters - 1) >> 
>> s->refcount_block_bits &&
>> -            reftable_offset < 0)
>> -        {
>> -            uint64_t reftable_clusters = size_to_clusters(s, 
>> reftable_size *
>> -                                                          
>> REFTABLE_ENTRY_SIZE);
>> -            reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
>> -                                                  refcount_table, 
>> nb_clusters,
>> -                                                  &first_free_cluster);
>> -            if (reftable_offset < 0) {
>> -                fprintf(stderr, "ERROR allocating reftable: %s\n",
>> -                        strerror(-reftable_offset));
>> -                res->check_errors++;
>> -                ret = reftable_offset;
>> -                goto fail;
>> +                memset(on_disk_reftable + old_reftable_size, 0,
>> +                       (reftable_size - old_reftable_size) *
>> +                       REFTABLE_ENTRY_SIZE);
>> +
>> +                /*
>> +                 * The offset we have for the reftable is now no 
>> longer valid;
>> +                 * this will leak that range, but we can easily fix 
>> that by
>> +                 * running a leak-fixing check after this rebuild 
>> operation
>> +                 */
>> +                reftable_offset = -1;
>> +            } else {
>> +                assert(on_disk_reftable);
>>               }
>> +            on_disk_reftable[refblock_index] = refblock_offset;
> 
> only to write back again ?

This assignment is on a deeper level, though, isn it?  I.e. it’s inside 
the else branch from above.

>>           }
>>           ret = qcow2_pre_write_overlap_check(bs, 0, refblock_offset,
>> @@ -2459,15 +2466,12 @@ write_refblocks:
>>       }
>>       if (reftable_offset < 0) {
> 
> at this point reftable_offset is always -1 now..
> 
> Ah not. and now I am a bit close to understanding all of this logic. 
> this thing with "goto write_refblocks" is not obvious
> 
>> -        uint64_t post_refblock_start, reftable_clusters;
>> +        uint64_t reftable_clusters;
>> -        post_refblock_start = ROUND_UP(*nb_clusters, 
>> s->refcount_block_size);
>>           reftable_clusters =
>>               size_to_clusters(s, reftable_size * REFTABLE_ENTRY_SIZE);
>> -        /* Not pretty but simple */
>> -        if (first_free_cluster < post_refblock_start) {
>> -            first_free_cluster = post_refblock_start;
>> -        }
>> +
>> +        first_free_cluster = 0;
>>           reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
>>                                                 refcount_table, 
>> nb_clusters,
>>                                                 &first_free_cluster);
>> @@ -2479,6 +2483,10 @@ write_refblocks:
>>               goto fail;
>>           }
>> +        assert(offset_into_cluster(s, reftable_offset) == 0);
>> +        first_cluster = reftable_offset / s->cluster_size;
>> +        end_cluster = first_cluster + reftable_clusters;
>> +
>>           goto write_refblocks;
> 
> these three lines now looks like a function call in assembler :)
> 
>>       }
>>
> 
> You didn't ping the series (more than two week old) so, I'm not sure 
> that you are not preparing v2 now.. But I kept it "marked unred" all 
> this time, and several times tried to look at it, and postponed, because 
> I don't familiar with this place of qcow2 driver. And the function looks 
> too difficult. Now finally I think I understand most of the logic that 
> you change. Honestly, I think a bit of refactoring the 
> rebuild_refcount_structure() prior to logic change would make it a lot 
> easier to review..

Hm, yes, putting the for () loop into its own function would probably be 
a good starting put.  I’ll look into it.

Max

next prev parent reply	other threads:[~2021-03-26 13:49 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 15:59 [PATCH 0/4] qcow2: Improve refcount structure rebuilding Max Reitz
2021-03-10 15:59 ` [PATCH 1/4] " Max Reitz
2021-03-26 11:48   ` Vladimir Sementsov-Ogievskiy
2021-03-26 13:47     ` Max Reitz [this message]
2021-03-26 14:38       ` Vladimir Sementsov-Ogievskiy
2021-03-10 15:59 ` [PATCH 2/4] iotests/common.qemu: Add _cleanup_single_qemu Max Reitz
2021-03-10 15:59 ` [PATCH 3/4] iotests/common.qemu: Allow using the QSD Max Reitz
2021-03-10 15:59 ` [PATCH 4/4] iotests/108: Test new refcount rebuild algorithm Max Reitz
2021-03-10 16:07   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3eb1748e-012d-9c74-d9dc-c1a6f2644ba6@redhat.com \
    --to=mreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).