From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55140)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Y4V02-0005TW-GO
	for qemu-devel@nongnu.org; Fri, 26 Dec 2014 08:32:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1Y4Uzy-0003cG-Gm
	for qemu-devel@nongnu.org; Fri, 26 Dec 2014 08:32:14 -0500
Received: from relay.parallels.com ([195.214.232.42]:55150)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Y4Uzy-0003cC-9z
	for qemu-devel@nongnu.org; Fri, 26 Dec 2014 08:32:10 -0500
Message-ID: <549D6351.8050904@parallels.com>
Date: Fri, 26 Dec 2014 16:32:01 +0300
From: "Denis V. Lunev"
MIME-Version: 1.0
References: <1419597313-20514-1-git-send-email-den@openvz.org>
	<1419597313-20514-2-git-send-email-den@openvz.org>
	<549D5F16.2070007@kamp.de>
In-Reply-To: <549D5F16.2070007@kamp.de>
Content-Type: text/plain; charset="ISO-8859-15"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/7] block: fix maximum length sent to
	bdrv_co_do_write_zeroes callback in bs
To: Peter Lieven , "Denis V. Lunev"
Cc: Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi

On 26/12/14 16:13, Peter Lieven wrote:
> On 26.12.2014 at 13:35, Denis V. Lunev wrote:
>> The check for maximum length was added by
>>   commit c31cb70728d2c0c8900b35a66784baa446fd5147
>>   Author: Peter Lieven
>>   Date: Thu Oct 24 12:06:58 2013 +0200
>>     block: honour BlockLimits in bdrv_co_do_write_zeroes
>>
>> but actually if driver provides .bdrv_co_write_zeroes callback, there is
>> no need to limit the size to 32 MB. Callback should provide effective
>> implementation which normally should not write any zeroes in comparable
>> amount.
>
> NACK.
>
> First there is no guarantee that bdrv_co_do_write_zeroes is a fast operation.
> This heavily depends on several circumstances that the block layer is not aware of.
> If a specific protocol knows it is very fast in writing zeroes under any circumstance
> it should provide INT_MAX in bs->bl.max_write_zeroes. It is then still allowed to
> return -ENOTSUP if the request size or alignment doesn't fit.

The idea (from my point of view) is that if .bdrv_co_write_zeroes is
specified, the cost is almost the same for any amount of zeroes written.
This is true for fallocate, from my point of view. The amount of data
actually written will be several orders of magnitude less than the
requested length, except on the slow path, which honors the 32 MB limit.

If the operation is complex to implement, it will be rate-limited
lower down, in the actual implementation.

> There are known backends e.g. anything that deals with SCSI that have a known
> limitation of the maximum number of zeroes they can write fast in a single request.
> This number MUST NOT be exceeded. The below patch would break all those backends.

Could you please point me to these backends? Actually, from my point of
view, they should set up max_write_zeroes properly themselves. This is
done at least in block/iscsi.c, and it would be the consistent way of
doing so.

>
> What issue are you trying to fix with this patch? Maybe there is a better way to fix
> it at another point in the code.
>

I am trying to minimize the number of metadata updates for a file. This
provides some speedup even on ext4, and it will provide even more
speedup on a distributed filesystem like CEPH, where file size updates
and block allocation are costly.

Regards,
     Den
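
For readers following the thread, the effect of the limit under discussion
can be sketched with a small stand-alone model. This is not the code in
block.c. The function name count_zero_requests and the constant
DEFAULT_MAX_ZERO_SECTORS are hypothetical, and the default is simply the
32 MB figure quoted above expressed in 512-byte sectors; the real default
in QEMU may differ. The sketch only counts how many driver calls a
zero-write request is split into when each call is capped at the driver's
reported limit, with 0 taken to mean "no limit reported".

```c
#include <assert.h>
#include <limits.h>   /* INT_MAX, for callers that report "no practical limit" */
#include <stdint.h>

/* Hypothetical default cap in 512-byte sectors: 65536 sectors = 32 MB,
 * the figure quoted in the thread. Not taken from the QEMU sources. */
#define DEFAULT_MAX_ZERO_SECTORS 65536

/* Count the driver calls needed to zero nb_sectors when every call is
 * capped at driver_limit sectors. A driver_limit of 0 is assumed to mean
 * "driver reported nothing", so the default cap applies. */
static int64_t count_zero_requests(int64_t nb_sectors, int driver_limit)
{
    assert(driver_limit >= 0);

    int64_t max_per_call = driver_limit ? driver_limit
                                        : DEFAULT_MAX_ZERO_SECTORS;
    int64_t calls = 0;

    while (nb_sectors > 0) {
        /* Clamp this chunk to the per-call maximum. */
        int64_t num = nb_sectors < max_per_call ? nb_sectors : max_per_call;
        nb_sectors -= num;
        calls++;
    }
    return calls;
}
```

Under these assumptions, zeroing 1 GB (2097152 sectors) takes 32 calls
with the default cap and a single call when the driver reports INT_MAX,
which is the difference in per-request overhead the two positions in this
thread are arguing about.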