From: Christoph Hellwig
Subject: Re: [PATCH 0/2] Improve writeout pattern from xfs_flush_pages()
Date: Wed, 3 Aug 2011 17:42:06 -0400
Message-ID: <20110803214206.GA20477@infradead.org>
References: <1312404545-15400-1-git-send-email-jack@suse.cz>
In-Reply-To: <1312404545-15400-1-git-send-email-jack@suse.cz>
To: Jan Kara
Cc: hch@infradead.org, xfs@oss.sgi.com, linux-fsdevel@vger.kernel.org

On Wed, Aug 03, 2011 at 10:49:03PM +0200, Jan Kara wrote:
>
>   Hi,
>
>   at one of customer's machines, I've spotted an issue that sync(1) called
> after writing a single huge file has been achieving rather low throughput. After
> debugging this with blktrace, I've found that the culprit was in flusher thread
> racing with page writeout happening from XFS sync code. The patches below helped
> that case. Although they are not a complete solution, I believe they are useful
> anyway so please consider merging them...

We currently have three calls to xfs_flush_pages with XBF_ASYNC set:

 - xfs_setattr_size
 - xfs_sync_inode_data
 - xfs_release

The first one actually is a synchronous writeout, just implemented in a
rather odd way by doing the xfs_ioend_wait right after it, so your change
is actively harmful for it.

The second is only called from xfs_flush_worker, which is the workqueue
offload when we hit ENOSPC.  I can see how this might race with the
writeback code, but the correct fix is to replace it with a call to
writeback_inodes_sb(_if_idle) once that one is fixed to do a trylock on
s_umount and thus won't deadlock.

The third one is opportunistic writeout if a file got truncated down on final release. filemap_flush probably is fine here, but there's no need for a range version. If you replace it with filemap_flush please also kill the useless wrapper while you're at it.