Message-ID: <54CF81DA.3020003@kamp.de>
Date: Mon, 02 Feb 2015 14:55:38 +0100
From: Peter Lieven
References: <1422607337-25335-1-git-send-email-den@openvz.org> <1422607337-25335-8-git-send-email-den@openvz.org> <20150202132355.GC9478@noname.redhat.com>
In-Reply-To: <20150202132355.GC9478@noname.redhat.com>
Subject: Re: [Qemu-devel] [PATCH 7/7] block/raw-posix: set max_write_zeroes to INT_MAX for regular files
To: Kevin Wolf, "Denis V. Lunev"
Cc: Fam Zheng, qemu-devel@nongnu.org, Stefan Hajnoczi

On 02.02.2015 at 14:23, Kevin Wolf wrote:
> On 30.01.2015 at 09:42, Denis V. Lunev wrote:
>> fallocate() works fine and could handle properly with arbitrary size
>> requests. There is no sense to reduce the amount of space to fallocate.
>> The bigger is the size, the better is the performance as the amount of
>> journal updates is reduced.
>>
>> The patch changes behavior for both generic filesystem and XFS codepaths,
>> which are different in handle_aiocb_write_zeroes. The implementation
>> of fallocate and xfsctl(XFS_IOC_ZERO_RANGE) for XFS are exactly the same
>> thus the change is fine for both ways.
>>
>> Signed-off-by: Denis V. Lunev
>> Reviewed-by: Max Reitz
>> CC: Kevin Wolf
>> CC: Stefan Hajnoczi
>> CC: Peter Lieven
>> CC: Fam Zheng
>> ---
>>  block/raw-posix.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index 7b42f37..933c778 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -293,6 +293,20 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
>>      }
>>  }
>>
>> +static void raw_probe_max_write_zeroes(BlockDriverState *bs)
>> +{
>> +    BDRVRawState *s = bs->opaque;
>> +    struct stat st;
>> +
>> +    if (fstat(s->fd, &st) < 0) {
>> +        return; /* no problem, keep default value */
>> +    }
>> +    if (!S_ISREG(st.st_mode) || !s->discard_zeroes) {
>> +        return;
>> +    }
>> +    bs->bl.max_write_zeroes = INT_MAX;
>> +}
> Peter, do you remember why INT_MAX isn't actually the default? I think
> the most reasonable behaviour would be that a limitation is only used if
> a block driver requests it, and otherwise unlimited is assumed.

The default (0) actually means unlimited or undefined. We introduced
the 16 MB limit in bdrv_co_write_zeroes to create only reasonably sized
requests, because there is no guarantee that write zeroes is a fast
operation. We should set INT_MAX only if we know that a write zeroes
request of arbitrary size is always fast.

Peter
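
To illustrate the effect of the default described above, here is a minimal sketch. It is not the actual QEMU code: the helper names and the FALLBACK_MAX_WRITE_ZEROES constant are hypothetical, and sizes are counted in bytes for simplicity. It only assumes what the message states, namely that a limit of 0 means "not set by the driver" and that a 16 MB cap is applied in that case.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical fallback cap: the 16 MB figure mentioned in the message. */
#define FALLBACK_MAX_WRITE_ZEROES (16 * 1024 * 1024)

/* Per-request cap in bytes; a driver limit of 0 means "none set". */
static int effective_max_write_zeroes(int driver_limit)
{
    assert(driver_limit >= 0);
    return driver_limit > 0 ? driver_limit : FALLBACK_MAX_WRITE_ZEROES;
}

/* Count how many requests a zero-write of `bytes` would be split into. */
static int64_t count_write_zeroes_requests(int64_t bytes, int driver_limit)
{
    int64_t max = effective_max_write_zeroes(driver_limit);
    int64_t requests = 0;

    while (bytes > 0) {
        int64_t chunk = bytes < max ? bytes : max;
        /* A real block layer would issue the zero-write for `chunk` here. */
        bytes -= chunk;
        requests++;
    }
    return requests;
}
```

Under these assumptions, zeroing 1 GiB with no driver limit is split into 64 requests, while a driver that sets the limit to INT_MAX gets the whole range in a single request. That is only desirable if the driver can guarantee the single large request is fast.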