From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51591)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XcPR9-0008VZ-Uv
	for qemu-devel@nongnu.org; Thu, 09 Oct 2014 21:56:14 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XcPR3-000145-3P
	for qemu-devel@nongnu.org; Thu, 09 Oct 2014 21:56:07 -0400
Received: from [58.251.49.30] (port=41592 helo=mail.sangfor.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhanghy@sangfor.com>) id 1XcPR0-00012J-8x
	for qemu-devel@nongnu.org; Thu, 09 Oct 2014 21:56:01 -0400
Date: Fri, 10 Oct 2014 09:54:58 +0800
From: "=?utf-8?B?WmhhbmcgSGFveXU=?=" <zhanghy@sangfor.com>
References: <201410091917519618804@sangfor.com>
Message-ID: <201410100954567266628@sangfor.com>
Mime-Version: 1.0
Content-Type: text/plain;
	charset="utf-8"
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel]
	=?utf-8?q?=5Bquestion=5D_is_it_possible_that_big-end?=
	=?utf-8?q?ian_l1_table_offset_referenced_by_other_I/O_while_updati?=
	=?utf-8?q?ng_l1_table_offset_in_qcow2=5Fupdate=5Fsnapshot=5Frefcou?=
	=?utf-8?b?bnQ/?=
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: =?utf-8?B?RXJpYyBCbGFrZQ==?= <eblake@redhat.com>, =?utf-8?B?cWVtdS1kZXZlbA==?= <qemu-devel@nongnu.org>

>> Hi,
>> I have encountered a problem: after deleting a snapshot, the qcow2 image size reported by the ls command is far larger than it should be,
>> but the virtual disk size reported by qemu-img info is fine.
>> I suspect that while the l1 table offset is being updated, another I/O job references the big-endian l1 table offset (a very large value),
>> so the file is truncated to a very large size.
>
>Not quite.  Rather, all the data that the snapshot used to occupy is
>still consuming holes in the file; the maximum offset of the file is
>still unchanged, even if the file is no longer using as many referenced
>clusters.  Recent changes have gone in to sparsify the file when
>possible (punching holes if your kernel and file system is new enough to
>support that), so that it is not consuming the amount of disk space that
>a mere ls reports.  But if what you are asking for is a way to compact
>the file back down, then you'll need to submit a patch.  The idea of
>having an online defragmenter for qcow2 files has been kicked around
>before, but it is complex enough that no one has attempted a patch yet.

Sorry, I didn't describe the problem clearly.
Consider the following code in qcow2_update_snapshot_refcount():
    /* Update L1 only if it isn't deleted anyway (addend = -1) */
    if (ret == 0 && addend >= 0 && l1_modified) {
        for (i = 0; i < l1_size; i++) {
            cpu_to_be64s(&l1_table[i]);
        }

        ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);

        for (i = 0; i < l1_size; i++) {
            be64_to_cpus(&l1_table[i]);
        }
    }
Between the cpu_to_be64s(&l1_table[i]) loop and the be64_to_cpus(&l1_table[i]) loop,
is it possible for other I/O to reference this interim l1 table, whose entries hold be64 (big-endian) l2 table offsets?
Read as a host-order value, a be64 l2 table offset may be very large, possibly hundreds of TB,
and the qcow2 file would then be truncated to a size far larger than normal.
We would see a huge qcow2 file size with ls -hl, while the size reported by qemu-img info still looks normal.
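
To illustrate how large such a misread value can get, here is a minimal standalone sketch. It is not QEMU code: swap64() and misread_offset() are names I made up for illustration, and the sketch assumes a little-endian host, where converting an entry to big-endian in place amounts to reversing its bytes. It also ignores any flag bits a real l1 entry may carry.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. On a little-endian host this
 * is the effect of converting a value to big-endian in place. */
static uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);   /* move the lowest byte of v to the top of r */
        v >>= 8;
    }
    return r;
}

/* Hypothetical helper: the value a reader gets if it takes a byte-swapped
 * table entry for a host-order file offset. */
static uint64_t misread_offset(uint64_t host_order_offset)
{
    return swap64(host_order_offset);
}
```

For example, an offset of 0x50000 (320 KiB) would be misread as 0x0000050000000000, which is 5 TiB.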

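If the possibility mentioned above exists, the rough code below may fix it:
    if (ret == 0 && addend >= 0 && l1_modified) {
        /* Convert a private copy, so l1_table itself stays in host byte order.
         * tmp_l1_table is a uint64_t * declared at the top of the function. */
        tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
        memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
        for (i = 0; i < l1_size; i++) {
            cpu_to_be64s(&tmp_l1_table[i]);
        }

        ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table, l1_size2);

        /* Memory from g_malloc0() must be released with g_free(), not free() */
        g_free(tmp_l1_table);
    }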
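
The same copy-then-convert idea can be tried outside QEMU. The following is only a sketch under my own assumptions: store_be64() and l1_table_to_disk_format() are hypothetical names, plain malloc()/free() stand in for the GLib allocators, and the write to disk is left out. The point it shows is that the in-memory table is only read, so nothing else can ever observe byte-swapped entries in it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Store v into out[0..7] most-significant byte first (big-endian),
 * independent of the host byte order. */
static void store_be64(uint8_t *out, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: build the big-endian on-disk image of a table in a
 * freshly allocated buffer. The source table is const and never modified.
 * Returns NULL on allocation failure; the caller releases it with free(). */
static uint8_t *l1_table_to_disk_format(const uint64_t *table, size_t n)
{
    assert(table != NULL || n == 0);

    /* Allocate at least one byte so an empty table still yields a valid pointer */
    uint8_t *buf = malloc(n ? n * sizeof(uint64_t) : 1);
    if (buf == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        store_be64(buf + i * sizeof(uint64_t), table[i]);
    }
    return buf;
}
```

The cost is one extra allocation and copy per l1 table write, which seems acceptable because this path only runs when the l1 table was actually modified.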

Thanks,
Zhang Haoyu