From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wu Fengguang Subject: Re: [PATCH 29/35] nfs: in-commit pages accounting and wait queue Date: Tue, 14 Dec 2010 23:40:26 +0800 Message-ID: <20101214154026.GA8959@localhost> References: <20101213144646.341970461@intel.com> <20101213150329.831955132@intel.com> <1292274951.8795.28.camel@heimdal.trondhjem.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Andrew Morton , Jan Kara , Christoph Hellwig , Dave Chinner , Theodore Ts'o , Chris Mason , Peter Zijlstra , Mel Gorman , Rik van Riel , KOSAKI Motohiro , Greg Thelen , Minchan Kim , linux-mm , "linux-fsdevel@vger.kernel.org" , LKML To: Trond Myklebust Return-path: Content-Disposition: inline In-Reply-To: <1292274951.8795.28.camel@heimdal.trondhjem.org> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Tue, Dec 14, 2010 at 05:15:51AM +0800, Trond Myklebust wrote: > On Mon, 2010-12-13 at 22:47 +0800, Wu Fengguang wrote: > > plain text document attachment (writeback-nfs-in-commit.patch) > > When doing 10+ concurrent dd's, I observed very bumpy commits submission > > (partly because the dd's are started at the same time, and hence reached > > 4MB to-commit pages at the same time). Basically we rely on the server > > to complete and return write/commit requests, and want both to progress > > smoothly and not consume too many pages. The write request wait queue is > > not enough as it's mainly network bounded. So add another commit request > > wait queue. Only async writes need to sleep on this queue. > > > > I'm not understanding the above reasoning. Why should we serialise > commits at the per-filesystem level (and only for non-blocking flushes > at that)? I did the commit wait queue after seeing this graph, where there is very bursty pattern of commit submission and hence completion: http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2953M-2.6.37-rc3+-2010-12-03-01/nfs-commit-1000.png leading to big fluctuations, eg. the almost straight up/straight down lines below http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2953M-2.6.37-rc3+-2010-12-03-01/vmstat-dirty-300.png http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2953M-2.6.37-rc3+-2010-12-03-01/dirty-pages.png http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2953M-2.6.37-rc3+-2010-12-03-01/dirty-pages-200.png A commit wait queue will help wipe out the "peaks". The "fixed" graph is http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-03-23/vmstat-dirty-300.png http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/nfs-100dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-03-23/dirty-pages.png Blocking flushes don't need to wait on this queue because they already throttle themselves by waiting on the inode commit lock before/after the commit. They actually should not wait on this queue, to prevent sync requests being unnecessarily blocked by async ones. Thanks, Fengguang