From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55574) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1axaPh-0000iJ-6d for qemu-devel@nongnu.org; Tue, 03 May 2016 09:31:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1axaPV-0004bV-76 for qemu-devel@nongnu.org; Tue, 03 May 2016 09:30:51 -0400 Received: from mail-db3on0107.outbound.protection.outlook.com ([157.55.234.107]:55968 helo=emea01-db3-obe.outbound.protection.outlook.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1axaPT-0004Wp-4H for qemu-devel@nongnu.org; Tue, 03 May 2016 09:30:45 -0400 References: <1461740925-20887-1-git-send-email-den@openvz.org> <20160502153547.GG4882@noname.redhat.com> From: "Denis V. Lunev" Message-ID: <5728A7F3.3050305@openvz.org> Date: Tue, 3 May 2016 16:30:27 +0300 MIME-Version: 1.0 In-Reply-To: <20160502153547.GG4882@noname.redhat.com> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH for 2.7 v2 1/1] qcow2: improve qcow2_co_write_zeroes() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: qemu-devel@nongnu.org, Max Reitz On 05/02/2016 06:35 PM, Kevin Wolf wrote: > Am 27.04.2016 um 09:08 hat Denis V. Lunev geschrieben: >> There is a possibility that qcow2_co_write_zeroes() will be called by >> with the partial block. This could be synthetically triggered with >> qemu-io -c "write -z 32k 4k" >> and can happen in the real life in qemu-nbd. The latter happens under >> the following conditions: >> (1) qemu-nbd is started with --detect-zeroes=on and is connected to the >> kernel NBD client >> (2) third party program opens kernel NBD device with O_DIRECT >> (3) third party program performs write operation with memory buffer >> not aligned to the page >> In this case qcow2_co_write_zeroes() is unable to perform the operation >> and mark entire cluster as zeroed and returns ENOTSUP. Thus the caller >> switches to non-optimized version and writes real zeroes to the disk. >> >> The patch creates a shortcut. If the block is full of zeroes at the moment, >> the request is extended to cover full block. User-visible situation with >> this block is not changed. Before the patch the block is filled in the >> image with real zeroes. After that patch the block is marked as zeroed >> in metadata. Thus any subsequent changes in backing store chain are not >> affected. >> >> Kewin, thank you for a cool suggestion. > s/Kewin/Kevin/ ;-) > >> Signed-off-by: Denis V. Lunev >> CC: Kevin Wolf >> CC: Max Reitz >> --- >> Changes from v1: >> - description rewritten completely >> - new approach suggested by Kevin is implemented >> >> block/qcow2.c | 24 +++++++++++++++++++----- >> 1 file changed, 19 insertions(+), 5 deletions(-) >> >> diff --git a/block/qcow2.c b/block/qcow2.c >> index 470734b..405d1da 100644 >> --- a/block/qcow2.c >> +++ b/block/qcow2.c >> @@ -2417,15 +2417,29 @@ static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs, >> int ret; >> BDRVQcow2State *s = bs->opaque; >> >> - /* Emulate misaligned zero writes */ >> - if (sector_num % s->cluster_sectors || nb_sectors % s->cluster_sectors) { >> - return -ENOTSUP; >> + int head = sector_num % s->cluster_sectors; >> + int tail = nb_sectors % s->cluster_sectors; > If you want to get the number of sectors between the end of the request > and the end of its cluster, this is not the right calculation. > > We need to count the sectors after the end of the request rather than > those before it, and we also need to consider requests where both start > and end of the request are unaligned. > >> + if (head != 0 || tail != 0) { >> + int nr; >> + BlockDriverState *file; >> + >> + sector_num -= head; >> + nb_sectors += head + tail; >> + >> + /* check the the request extended to the entire cluster is read >> + as all zeroes at the moment. If so we can mark entire cluster >> + as zeroed in the metadata */ >> + ret = bdrv_get_block_status_above(bs, NULL, sector_num, nb_sectors, >> + &nr, &file); >> + if (ret < 0 || !(ret & BDRV_BLOCK_ZERO) || sector_num == nr) { > sector_num is an offset and nr is a length, so the comparison doesn't > look very meaningful. Did you mean nb_sectors != nr? > >> + /* Emulate misaligned zero writes */ >> + return -ENOTSUP; >> + } >> } >> >> /* Whatever is left can use real zero clusters */ >> qemu_co_mutex_lock(&s->lock); >> - ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS, >> - nb_sectors); >> + ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS, nb_sectors); >> qemu_co_mutex_unlock(&s->lock); > Kevin you are perfectly correct in both places.. Will fix. Thank you very much. Den