Date: Wed, 24 Nov 2010 20:30:23 +0800
From: Wu Fengguang
To: Peter Zijlstra
Cc: Andrew Morton, Jan Kara, Christoph Hellwig, Dave Chinner,
	"Theodore Ts'o", Chris Mason, Mel Gorman, Rik van Riel,
	KOSAKI Motohiro, linux-mm, "linux-fsdevel@vger.kernel.org", LKML
Subject: Re: [PATCH 08/13] writeback: quit throttling when bdi dirty pages dropped low
Message-ID: <20101124123023.GA10413@localhost>
References: <20101117042720.033773013@intel.com> <20101117042850.245782303@intel.com> <1290597233.2072.454.camel@laptop>
In-Reply-To: <1290597233.2072.454.camel@laptop>

On Wed, Nov 24, 2010 at 07:13:53PM +0800, Peter Zijlstra wrote:
> On Wed, 2010-11-17 at 12:27 +0800, Wu Fengguang wrote:
>
> > @@ -578,6 +579,25 @@ static void balance_dirty_pages(struct a
> >  				 bdi_stat(bdi, BDI_WRITEBACK);
> >  		}
> >
> > +		/*
> > +		 * bdi_thresh takes time to ramp up from the initial 0,
> > +		 * especially for slow devices.
> > +		 *
> > +		 * It's possible that at the moment dirty throttling starts,
> > +		 * 	bdi_dirty = nr_dirty
> > +		 * 		  = (background_thresh + dirty_thresh) / 2
> > +		 * 		  >> bdi_thresh
> > +		 * Then the task could be blocked for a dozen second to flush
> > +		 * all the exceeded (bdi_dirty - bdi_thresh) pages.
> > +		 * So offer a complementary way to break out of the loop when
> > +		 * 250ms worth of dirty pages have been cleaned during our
> > +		 * pause time.
> > +		 */
> > +		if (nr_dirty < dirty_thresh &&
> > +		    bdi_prev_dirty - bdi_dirty >
> > +		    bdi->write_bandwidth >> (PAGE_CACHE_SHIFT + 2))
> > +			break;
> > +		bdi_prev_dirty = bdi_dirty;
> > +
> >  		if (bdi_dirty >= bdi_thresh) {
> >  			pause = HZ/10;
> >  			goto pause;
>
>
> So we're testing to see if during our pause time (<=100ms) we've written
> out 250ms worth of pages (given our current bandwidth estimation),
> right?
>
> (1/4th of bandwidth in bytes/s is bytes per 0.25s)

Right.

> (and in your recent patches you've changed the bw to pages/s so I take
> it the PAGE_CACHE_SIZE will be gone from all these sites).

Yeah. Actually I did one more fix after that. The break is designed
mainly to help the single-task case. It helps less in the concurrent
dirtier cases; however, I guess long-running servers don't really care
about some boot-time lag.

For the 1-dd case, it looks better to lower the break threshold to
125ms. After all, it's not easy for the dirty pages to drop by 250ms
worth of data when you have only slept 200ms (note: the max pause time
has been doubled, mainly for servers).

-		if (nr_dirty < dirty_thresh &&
-		    bdi_prev_dirty - bdi_dirty > (long)bdi->write_bandwidth / 4)
+		if (nr_dirty <= dirty_thresh &&
+		    bdi_prev_dirty - bdi_dirty > (long)bdi->write_bandwidth / 8)
 			break;

Thanks,
Fengguang
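
For anyone following the arithmetic, here is a minimal userspace sketch
of the break test discussed above. It is not the kernel code: the
function names, the `divisor` parameter and the 4 KiB page size are
assumptions made purely for illustration. It assumes write_bandwidth is
already in pages/s, and it uses the revised `<=` comparison.

```c
#include <stdbool.h>

/* Assumed page size of 4 KiB (shift of 12); illustrative only. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Older form of the threshold: bandwidth in bytes/s shifted right by
 * (page shift + 2) gives the number of pages written in 0.25s.
 */
static unsigned long pages_per_quarter_second(unsigned long bytes_per_sec)
{
	return bytes_per_sec >> (SKETCH_PAGE_SHIFT + 2);
}

/*
 * Hypothetical model of the break condition. write_bandwidth is in
 * pages/s; divisor 4 means "250ms worth of pages", 8 means "125ms".
 */
static bool should_break(unsigned long nr_dirty, unsigned long dirty_thresh,
			 long bdi_prev_dirty, long bdi_dirty,
			 unsigned long write_bandwidth, long divisor)
{
	/* Pages cleaned on this bdi since the previous loop iteration. */
	long cleaned = bdi_prev_dirty - bdi_dirty;

	return nr_dirty <= dirty_thresh &&
	       cleaned > (long)write_bandwidth / divisor;
}
```

With a device estimated at 25600 pages/s (100 MiB/s at 4 KiB pages),
the 250ms threshold is 6400 pages and the 125ms threshold is 3200
pages, so a task that saw 4000 pages cleaned during its pause would
break out only under the lowered threshold.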