public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <npiggin@suse.de>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Lin Ming <ming.m.lin@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Subject: Re: iozone regression with 2.6.29-rc6
Date: Fri, 27 Feb 2009 12:55:20 +0100
Message-ID: <20090227115520.GC21296@wotan.suse.de>
In-Reply-To: <1235728154.24401.55.camel@laptop>

On Fri, Feb 27, 2009 at 10:49:14AM +0100, Peter Zijlstra wrote:
> On Fri, 2009-02-27 at 17:13 +0800, Lin Ming wrote:
> > Bisection points to the commit below:
> > 
> > commit 1cf6e7d83bf334cc5916137862c920a97aabc018
> > Author: Nick Piggin <npiggin@suse.de>
> > Date:   Wed Feb 18 14:48:18 2009 -0800
> > 
> >     mm: task dirty accounting fix
> > 
> >     YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for
> >     cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty).
> > 
> >     Additionally, there is some inconsistency about when task_dirty_inc is
> >     called.  It is used for dirty balancing, however it even gets called for
> >     __set_page_dirty_no_writeback.
> > 
> >     So rather than increment it in a set_page_dirty wrapper, move it down to
> >     exactly where the dirty page accounting stats are incremented.
> > 
> >     Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
> >     Signed-off-by: Nick Piggin <npiggin@suse.de>
> >     Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> > 
> > 
> > The figures in parentheses below are the results with the above commit
> > reverted. For example, -10% (+2%) means:
> > iozone shows a ~10% regression with 2.6.29-rc6 compared with 2.6.29-rc5,
> > and
> > iozone shows a ~2% improvement with 2.6.29-rc6-revert-1cf6e7d compared with 2.6.29-rc5.
> > 
> > 
> >                         4P dual-core HT   2P quad-core    2P quad-core HT
> >                         tulsa             stockley        Nehalem
> >                         -------------------------------------------------
> > iozone-rewrite          -10% (+2%)        -8% (0%)        -10% (-7%)
> > iozone-rand-write       -50% (0%)         -20% (+10%)
> > iozone-read                               -13% (0%)
> > iozone-write                              -28% (-1%)
> > iozone-reread                                             -5% (-1%)
> > iozone-mmap-read                                          -7% (+2%)
> > iozone-mmap-reread                                        -7% (+2%)
> > iozone-mmap-rand-read                                     -7% (+3%)
> > iozone-mmap-rand-write                                    -5% (0%)
> 
> Ugh, that's unexpected..
> 
> So 'better' accounting leads to worse performance, which would indicate
> we throttle more.
> 
> I take it your machine has gobs of memory.
> 
> Does something like the below help any?

Shall we revert this for 2.6.29, then, and try to improve it in the next
cycle? Are we looking at several more weeks before 2.6.29, or would we
prefer not to tweak heuristics at this point?

 
> ---
> Subject: mm: bdi: tweak task dirty penalty
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Fri Feb 27 10:41:22 CET 2009
> 
> Penalizing heavy dirtiers with 1/8-th the total dirty limit might be rather
> excessive on large memory machines. Use sqrt to scale it sub-linearly.
> 
> Update the comment while we're there.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  mm/page-writeback.c |   12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/mm/page-writeback.c
> ===================================================================
> --- linux-2.6.orig/mm/page-writeback.c
> +++ linux-2.6/mm/page-writeback.c
> @@ -293,17 +293,21 @@ static inline void task_dirties_fraction
>  }
>  
>  /*
> - * scale the dirty limit
> + * Task specific dirty limit:
>   *
> - * task specific dirty limit:
> + *   dirty -= 8 * sqrt(dirty) * p_{t}
>   *
> - *   dirty -= (dirty/8) * p_{t}
> + * Penalize tasks that dirty a lot of pages by lowering their dirty limit. This
> + * avoids infrequent dirtiers from getting stuck in this other guys dirty
> + * pages.
> + *
> + * Use a sub-linear function to scale the penalty, we only need a little room.
>   */
>  static void task_dirty_limit(struct task_struct *tsk, long *pdirty)
>  {
>  	long numerator, denominator;
>  	long dirty = *pdirty;
> -	u64 inv = dirty >> 3;
> +	u64 inv = 8*int_sqrt(dirty);
>  
>  	task_dirties_fraction(tsk, &numerator, &denominator);
>  	inv *= numerator;
> 
