Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Max Reitz <mreitz@redhat.com>
To: Jun Li <junmuzi@126.com>
Cc: kwolf@redhat.com, famz@redhat.com, juli@redhat.com,
	qemu-devel@nongnu.org, stefanha@redhat.com,
	Jun Li <junmuzi@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
Date: Thu, 15 Jan 2015 13:47:40 -0500	[thread overview]
Message-ID: <54B80B4C.7010406@redhat.com> (raw)
In-Reply-To: <20150103104100.GA19089@localhost.localdomain>

On 2015-01-03 at 07:23, Jun Li wrote:
> On Fri, 11/21 11:56, Max Reitz wrote:
>> So, as for what I think we do need to do when shrinking (and keep in mind:
>> The offset given to qcow2_truncate() is the guest size! NOT the host image
>> size!):
>>
>> (1) Determine the first L2 table and the first entry in the table which will
>> lie beyond the new guest disk size.
> Here is not correct always. Due to the COW, using offset to calculate the
> first entry of the first L2 table will be incorrect.

Again: This is *not* about the host disk size or the host offset of some 
cluster, but about the *guest* disk size.

Let's make up an example. You have a 2 GB disk but you want to resize it 
to 1.25 GB. The cluster size is 64 kB, therefore we have 2 GB / 64 kB = 
32,768 data clusters (as long as there aren't any internal snapshots, 
which is a prerequisite for resizing qcow2 images).

Every L2 table contains 65,536 / 8 = 8,192 entries; there are thus 
32,768 / 8,192 = 4 L2 tables.

As you can see, one can directly derive the number of data clusters and 
L2 tables from the guest disk size (as long as there aren't any internal 
snapshots).

So of course we can do the same for the target disk size: 1.25 GB / 64 
kB = 20,480 data clusters; 20,480 / 8,192 = 2.5 L2 tables, therefore we 
need three L2 tables but only half of the last one (4,096 entries).

We know that every cluster references somewhere after that limit (that 
is, every entry in the fourth L2 table and every entry starting with 
index 4,096 in the third L2 table) is a data cluster with a guest offset 
somewhere beyond 1.25 GB, so we don't need it anymore.

Thus, we simply discard all those data clusters and after that we can 
discard the fourth L2 table. That's it.

If we really want to we can calculate the highest cluster host offset in 
use and truncate the image accordingly. But that's optional, see the 
last point in my "problems with this approach" list (having discarded 
the clusters should save us all the space already). Furthermore, as I'm 
saying in that list, to really solve this issue, we'd need qcow2 
defragmentation.

> What I have done for this scenario:
> (1) if the first entry is the first entry of the L2 table, so will scan "the
> previous L2 table"("the previous L2 table" location is in front of "L2 table"
> in L1 table). If the entry of previous L2 table is larger than offset, will
> discard this entry, too.
> (2) If the first entry is not the first entry of the L2 table, still to scan
> the whole L2 table to make sure no entry is beyond offset.
>
>> (2) Discard all clusters beginning from there.
>> (3) Discard all L2 tables which are then completely empty.
>> (4) Update the header size.
> For this patch current's realizion, have include above 4 steps I think.
> Current patch, also have another step 5.
> (5) truncate the file.

As I wrote above, you can do that but it shouldn't matter much because 
the discarded clusters should not use any disk space.

> Here I think we also should add discard refcount table and refcount block
> table when they are completely empty.
>
>> And that's it. We can either speed up step (2) by implementing it manually,
>> or we just use bdrv_discard() on the qcow2 BDS (in the simplest case:
>> bdrv_discard(bs, DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE), bs->total_sectors -
>> DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE));.
>>
>> We can incorporate step (3) by extending qcow2_discard_clusters() to free L2
>> tables when they are empty after discard_single_l2(). But we don't even have
>> to that now. It's an optimization we can go about later.
>>
>> So, we can do (1), (2) and (3) in a single step: Just one bdrv_discard()
>> call. But it's probably better to use qcow2_discard_clusters() instead and
>> set the full_discard parameter to true.
>>
>> So: qcow2_discard_clusters(bs, offset, bs->total_sectors - offset /
>> BDRV_SECTOR_SIZE, true);. Then update the guest disk size field in the
>> header. And we're done.
>>
>> There are four problems with this approach:
>> - qcow2_discard_clusters() might be slower than optimal. I personally don't
>> care at all.
>> - If "bs->total_sectors * BDRV_SECTOR_SIZE - offset" is greater than
>> INT_MAX, this won't work. Trivially solvable by encapsulating the
>> qcow2_discard_clusters() call in a loop which limits nb_clusters to INT_MAX
>> / BDRV_SECTOR_SIZE.
>> - The L1 table is not resized. Should not matter in practice at all.
> Yes, agree with you.
>
>> - The file is not truncated. Does not matter either (because holes in the
>> file are good enough), and we can't actually solve this problem without
>> defragmentation anyway.
>>
>> There is one advantage:
>> - It's extremely simple. It's literally below ten lines of code.
>>
>> I think the advantage far outweighs the disadvantage. But I may be wrong.
>> What do you think?
> Hi max,
>
>    Sorry for so late to reply as I am so busy recently. I think let's have an
> agreement on how to realize qcow2 shrinking first, then type code is better.

Yes, this will probably be for the best. :-)

> Another issue, as gmail can not be used in current China, I have to use this
> email to reply. :)

No problem.

Max

next prev parent reply	other threads:[~2015-01-15 18:47 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-26 15:20 [Qemu-devel] [PATCH v5 0/3] qcow2: Patch for shrinking qcow2 disk image Jun Li
2014-10-26 15:20 ` [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking Jun Li
2014-11-21 10:56   ` Max Reitz
2014-11-24 17:49     ` Eric Blake
2015-01-03 12:23     ` Jun Li
2015-01-15 18:47       ` Max Reitz [this message]
2015-01-19 13:16         ` Jun Li
2015-01-22 19:14           ` Max Reitz
2015-01-27 14:06             ` Jun Li
2014-10-26 15:20 ` [Qemu-devel] [PATCH v5 2/3] qcow2: add update refcount table realization for update_refcount Jun Li
2014-11-21 12:41   ` Max Reitz
2014-11-24 18:11     ` Eric Blake
2014-10-26 15:20 ` [Qemu-devel] [PATCH v5 3/3] qcow2: Add qemu-iotests for qcow2 shrinking Jun Li
2014-11-21 13:01   ` Max Reitz
2014-11-10  8:36 ` [Qemu-devel] [PATCH v5 0/3] qcow2: Patch for shrinking qcow2 disk image Jun Li
2014-11-10  9:17   ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54B80B4C.7010406@redhat.com \
    --to=mreitz@redhat.com \
    --cc=famz@redhat.com \
    --cc=juli@redhat.com \
    --cc=junmuzi@126.com \
    --cc=junmuzi@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).