From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail104.syd.optusnet.com.au ([211.29.132.246]:38727 "EHLO mail104.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726445AbfF0WYe (ORCPT ); Thu, 27 Jun 2019 18:24:34 -0400
Date: Fri, 28 Jun 2019 08:23:24 +1000
From: Dave Chinner
Subject: Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size updates
Message-ID: <20190627222324.GH7777@dread.disaster.area>
References: <20190624055253.31183-1-hch@lst.de> <20190624055253.31183-8-hch@lst.de> <20190624161720.GQ5387@magnolia> <20190624231523.GC7777@dread.disaster.area> <20190625102507.GA1986@lst.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190625102507.GA1986@lst.de>
Sender: linux-xfs-owner@vger.kernel.org
List-ID:
List-Id: xfs
To: Christoph Hellwig
Cc: "Darrick J. Wong" , Damien Le Moal , Andreas Gruenbacher , linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Tue, Jun 25, 2019 at 12:25:07PM +0200, Christoph Hellwig wrote:
> On Tue, Jun 25, 2019 at 09:15:23AM +1000, Dave Chinner wrote:
> > > So, uh, how much of a hit do we take for having to allocate a
> > > transaction for a file size extension? Particularly since we can
> > > combine those things now?
> >
> > Unless we are out of log space, the transaction allocation and free
> > should be largely uncontended and so it's just a small amount of CPU
> > usage. i.e it's a slab allocation/free and then lockless space
> > reservation/free. If we are out of log space, then we sleep waiting
> > for space - the issue really comes down to where it is better to
> > sleep in that case....
>
> I see the general point, but we'll still have the same issue with
> unwritten extent conversion and cow completions, and I don't remember
> seeing any issue in that regard.
These are relatively rare for small file workloads - I'm really
talking about the effect of delalloc and how we've optimised
allocation during writeback to merge small, cross-file writeback
into much larger physical IOs. Neither unwritten extents nor COW
are used in these (common) cases, and if they are then the
allocation patterns prevent the cross-file IO merging in the block
layer and so we don't get the "hundred ioends for a hundred inodes
from a single physical IO completion" thundering herd problem....

> And we'd hit exactly that case
> with random writes to preallocated or COW files, i.e. the typical image
> file workload.

I do see a noticeable amount of IO completion overhead in the host
when hitting unwritten extents in VM image workloads. I'll see if I
can track the number of kworkers we're stalling in under some of
these workloads, but I think it's still largely bound by the request
queue depth of the IO stack inside the VM because there is no IO
merging in these cases.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com