From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wu Fengguang
Subject: Re: [PATCH 20/45] NFS: introduce writeback wait queue
Date: Wed, 7 Oct 2009 17:07:22 +0800
Message-ID: <20091007090722.GA5646@localhost>
References: <20091007073818.318088777@intel.com>
	<20091007074903.749770316@intel.com>
	<1254905600.26976.218.camel@twins>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Andrew Morton, Theodore Tso, Christoph Hellwig, Dave Chinner,
	Chris Mason, "Li, Shaohua", Trond Myklebust,
	"jens.axboe@oracle.com", Jan Kara, Nick Piggin,
	"linux-fsdevel@vger.kernel.org"
To: Peter Zijlstra
Return-path:
Received: from mga14.intel.com ([143.182.124.37]:5014 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933106AbZJGJIl (ORCPT ); Wed, 7 Oct 2009 05:08:41 -0400
Content-Disposition: inline
In-Reply-To: <1254905600.26976.218.camel@twins>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, Oct 07, 2009 at 04:53:20PM +0800, Peter Zijlstra wrote:
> On Wed, 2009-10-07 at 15:38 +0800, Wu Fengguang wrote:
> > plain text document attachment (writeback-nfs-request-queue.patch)
> > The generic writeback routines are departing from congestion_wait()
> > in favor of get_request_wait(), i.e. waiting on the block queues.
> >
> > Introduce the missing writeback wait queue for NFS; otherwise its
> > number of writeback pages may grow out of control.
> >
> > In particular, balance_dirty_pages() will exit after it pushes
> > write_chunk pages into the PG_writeback page pool, _OR_ when the
> > background writeback work quits. The latter is new behavior: the
> > background work can quit not only (normally) after dropping below
> > the background threshold, but also when it finds _zero_ dirty pages
> > to write. The latter case lets the number of PG_writeback pages
> > grow out of control if it is not explicitly limited.
> >
> > CC: Jens Axboe
> > CC: Chris Mason
> > CC: Peter Zijlstra
> > CC: Trond Myklebust
> > Signed-off-by: Wu Fengguang
> > ---
> >
> > The wait time and network throughput vary a lot! This is a major
> > problem. It means nfs_end_page_writeback() is not called smoothly
> > over time, even when there are plenty of PG_writeback pages on the
> > client side.
>
> Could that be some ack batching from the nfs protocol/server?

Yes, possibly. Another possibility is that some work items in the
nfsiod workqueue take a long time.

> > +static void nfs_wakeup_congested(long nr, long limit,
> > +				 struct backing_dev_info *bdi,
> > +				 wait_queue_head_t *wqh)
> > +{
> > +	if (nr < 2*limit - min(limit/8, NFS_WAIT_PAGES)) {
> > +		if (test_bit(BDI_sync_congested, &bdi->state))
> > +			clear_bdi_congested(bdi, BLK_RW_SYNC);
> > +		if (waitqueue_active(&wqh[BLK_RW_SYNC])) {
> > +			smp_mb__after_clear_bit();
> > +			wake_up(&wqh[BLK_RW_SYNC]);
> > +		}
> > +	}
> > +	if (nr < limit - min(limit/8, NFS_WAIT_PAGES)) {
> > +		if (test_bit(BDI_async_congested, &bdi->state))
> > +			clear_bdi_congested(bdi, BLK_RW_ASYNC);
> > +		if (waitqueue_active(&wqh[BLK_RW_ASYNC])) {
> > +			smp_mb__after_clear_bit();
> > +			wake_up(&wqh[BLK_RW_ASYNC]);
> > +		}
> > +	}
> > +}
>
> wakeup implies a full memory barrier.

If so, is this smp_mb__after_clear_bit() line in clear_bdi_congested()
also unnecessary?

	void clear_bdi_congested(struct backing_dev_info *bdi, int sync)
	{
		//...
		clear_bit(bit, &bdi->state);
		smp_mb__after_clear_bit();
		if (waitqueue_active(wqh))
			wake_up(wqh);
	}
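To make the question concrete, here is a minimal user-space sketch of
the waker/sleeper pairing under discussion (compile with cc -pthread;
all names here, congested, nr_waiters, wait_for_uncongested,
clear_congested, are made up for illustration, none are from the
patch):

	/*
	 * Sketch only: a user-space analogy, with made-up names, of
	 *
	 *	clear_bit(); smp_mb__after_clear_bit();
	 *	if (waitqueue_active()) wake_up();
	 */
	#include <pthread.h>
	#include <stdatomic.h>

	static atomic_int congested = 1; /* ~ the BDI_*_congested bit */
	static atomic_int nr_waiters;	 /* ~ what waitqueue_active() reads */
	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t wqh = PTHREAD_COND_INITIALIZER;

	/* sleeper: announce ourselves as a waiter, then check the flag */
	static void wait_for_uncongested(void)
	{
		atomic_fetch_add(&nr_waiters, 1);
		pthread_mutex_lock(&lock);
		while (atomic_load(&congested))
			pthread_cond_wait(&wqh, &lock);
		pthread_mutex_unlock(&lock);
		atomic_fetch_sub(&nr_waiters, 1);
	}

	/* waker: mirrors the clear_bdi_congested() sequence */
	static void clear_congested(void)
	{
		atomic_store(&congested, 0);	/* ~ clear_bit() */
		/*
		 * ~ smp_mb__after_clear_bit(): redundant here, since the
		 * surrounding atomics are already seq_cst, but kept to
		 * mirror where the kernel barrier sits: after the clear
		 * and before the unlocked waiter check.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		if (atomic_load(&nr_waiters)) { /* ~ waitqueue_active() */
			pthread_mutex_lock(&lock);
			pthread_cond_broadcast(&wqh); /* ~ wake_up() */
			pthread_mutex_unlock(&lock);
		}
	}

The point of the sketch: the sleeper's increment is ordered before its
read of "congested", and the waker's clear is ordered before its read
of "nr_waiters", so at least one side must observe the other and the
wakeup cannot be lost. It does not by itself settle whether wake_up()'s
implied barrier could stand in for the explicit one, since that barrier
only runs when the waiter check succeeds.

Thanks,
Fengguang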