From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754318Ab0KRGvU (ORCPT ); Thu, 18 Nov 2010 01:51:20 -0500
Received: from mga09.intel.com ([134.134.136.24]:62494 "EHLO mga09.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751569Ab0KRGvT (ORCPT ); Thu, 18 Nov 2010 01:51:19 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.59,216,1288594800"; d="scan'208";a="575005893"
Date: Thu, 18 Nov 2010 14:51:12 +0800
From: Wu Fengguang
To: Andrew Morton
Cc: Jan Kara , "Li, Shaohua" , Christoph Hellwig , Dave Chinner ,
	"Theodore Ts'o" , Chris Mason , Peter Zijlstra , Mel Gorman ,
	Rik van Riel , KOSAKI Motohiro , linux-mm ,
	"linux-fsdevel@vger.kernel.org" , LKML
Subject: Re: [PATCH 06/13] writeback: bdi write bandwidth estimation
Message-ID: <20101118065111.GA8458@localhost>
References: <20101117042720.033773013@intel.com>
	<20101117042850.002299964@intel.com>
	<20101117150837.a18d56c1.akpm@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101117150837.a18d56c1.akpm@linux-foundation.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 18, 2010 at 07:08:37AM +0800, Andrew Morton wrote:
> On Wed, 17 Nov 2010 12:27:26 +0800
> Wu Fengguang wrote:
>
> > + w = min(elapsed / (HZ/100), 128UL);
>
> I did try setting HZ=10 many years ago, and the kernel blew up.
>
> I do recall hearing of people who set HZ very low, perhaps because
> their huge machines were seeing performance problems when the timer tick
> went off. Probably there's no need to do that any more.
>
> But still, we shouldn't hard-wire the (HZ >= 100) assumption if we
> don't absolutely need to, and I don't think it is absolutely needed
> here.

Fair enough. Here is the fix.
The other (HZ/10) will be addressed by another patch that increases it
to MAX_PAUSE=max(HZ/5, 1).

Thanks,
Fengguang
---
Subject: writeback: prevent divide error on tiny HZ
Date: Thu Nov 18 12:19:56 CST 2010

As suggested by Andrew and Peter:

I do recall hearing of people who set HZ very low, perhaps because their
huge machines were seeing performance problems when the timer tick went
off. Probably there's no need to do that any more.

But still, we shouldn't hard-wire the (HZ >= 100) assumption if we don't
absolutely need to, and I don't think it is absolutely needed here.

People who do cpu bring-up on very slow FPGAs also lower HZ as far as
possible.

CC: Peter Zijlstra
Signed-off-by: Wu Fengguang
---
 mm/page-writeback.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-next.orig/mm/page-writeback.c	2010-11-18 12:35:18.000000000 +0800
+++ linux-next/mm/page-writeback.c	2010-11-18 12:35:38.000000000 +0800
@@ -490,6 +490,7 @@ void bdi_update_write_bandwidth(struct b
 				unsigned long *bw_time,
 				s64 *bw_written)
 {
+	const unsigned long unit_time = max(HZ/100, 1);
 	unsigned long written;
 	unsigned long elapsed;
 	unsigned long bw;
@@ -499,7 +500,7 @@ void bdi_update_write_bandwidth(struct b
 		goto snapshot;
 
 	elapsed = jiffies - *bw_time;
-	if (elapsed < HZ/100)
+	if (elapsed < unit_time)
 		return;
 
 	/*
@@ -513,7 +514,7 @@ void bdi_update_write_bandwidth(struct b
 	written = percpu_counter_read(&bdi->bdi_stat[BDI_WRITTEN]) - *bw_written;
 	bw = (HZ * PAGE_CACHE_SIZE * written + elapsed/2) / elapsed;
-	w = min(elapsed / (HZ/100), 128UL);
+	w = min(elapsed / unit_time, 128UL);
 	bdi->write_bandwidth = (bdi->write_bandwidth * (1024-w) + bw * w) >> 10;
 	bdi->write_bandwidth_update_time = jiffies;
 snapshot:
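
For readers who want to try the arithmetic outside the kernel, here is a
minimal userspace sketch of what the patch guards against. It is not
kernel code: the helper names (unit_time_for, sample_weight,
blend_bandwidth) are made up for illustration, and HZ is passed in as a
parameter so that several tick rates can be compared, whereas in the
kernel HZ is a build-time constant. With integer division, HZ/100 is 0
for any HZ below 100, so the unpatched elapsed / (HZ/100) divides by
zero; clamping the unit to at least 1 jiffy avoids that.

```c
#include <assert.h>

/* Hypothetical helper: mirrors max(HZ/100, 1) from the patch, with hz
 * taken as a parameter instead of the kernel's compile-time HZ. */
unsigned long unit_time_for(unsigned long hz)
{
	unsigned long t = hz / 100;	/* integer division: 0 when hz < 100 */

	return t ? t : 1;		/* never let the divisor reach 0 */
}

/* Weight of the new bandwidth sample, out of 1024, capped at 128 as in
 * min(elapsed / unit_time, 128UL). elapsed is in jiffies. */
unsigned long sample_weight(unsigned long elapsed, unsigned long hz)
{
	unsigned long w;

	assert(hz > 0);			/* a zero tick rate makes no sense */
	w = elapsed / unit_time_for(hz);
	return w < 128UL ? w : 128UL;
}

/* One update step of the weighted average used for write_bandwidth:
 * (old * (1024 - w) + sample * w) >> 10. */
unsigned long blend_bandwidth(unsigned long old_bw, unsigned long bw,
			      unsigned long w)
{
	return (old_bw * (1024 - w) + bw * w) >> 10;
}
```

With HZ=10 the unit is clamped to 1 jiffy, so sample_weight(5, 10) gives
5 instead of trapping. With HZ=1000 the unit stays at 10 jiffies, so the
patch does not change behaviour there.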