All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Wu Fengguang <fengguang.wu@intel.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 2/5] mm: Properly reflect task dirty limits in dirty_exceeded logic
Date: Wed, 9 Mar 2011 16:02:53 -0500	[thread overview]
Message-ID: <20110309210253.GD10346@redhat.com> (raw)
In-Reply-To: <1299623475-5512-3-git-send-email-jack@suse.cz>

On Tue, Mar 08, 2011 at 11:31:12PM +0100, Jan Kara wrote:
> We set bdi->dirty_exceeded (and thus ratelimiting code starts to
> call balance_dirty_pages() every 8 pages) when a per-bdi limit is
> exceeded or global limit is exceeded. But per-bdi limit also depends
> on the task. Thus different tasks reach the limit on that bdi at
> different levels of dirty pages. The result is that with current code
> bdi->dirty_exceeded ping-ponged between 1 and 0 depending on which task
> just got into balance_dirty_pages().
> 
> We fix the issue by clearing bdi->dirty_exceeded only when per-bdi amount
> of dirty pages drops below the threshold (7/8 * bdi_dirty_limit) where task
> limits already do not have any influence.
> 
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Christoph Hellwig <hch@infradead.org>
> CC: Dave Chinner <david@fromorbit.com>
> CC: Wu Fengguang <fengguang.wu@intel.com>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/page-writeback.c |   18 ++++++++++++++++--
>  1 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index c472c1c..f388f70 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -275,12 +275,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
>   * effectively curb the growth of dirty pages. Light dirtiers with high enough
>   * dirty threshold may never get throttled.
>   */
> +#define TASK_LIMIT_FRACTION 8
>  static unsigned long task_dirty_limit(struct task_struct *tsk,
>  				       unsigned long bdi_dirty)
>  {
>  	long numerator, denominator;
>  	unsigned long dirty = bdi_dirty;
> -	u64 inv = dirty >> 3;
> +	u64 inv = dirty / TASK_LIMIT_FRACTION;
>  
>  	task_dirties_fraction(tsk, &numerator, &denominator);
>  	inv *= numerator;
> @@ -291,6 +292,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
>  	return max(dirty, bdi_dirty/2);
>  }
>  
> +/* Minimum limit for any task */
> +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> +{
> +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> +}
> +

Hi Jan,

Should the above be called bdi_min_dirty_limit()? In essense we seem to
be setting bdi->bdi_exceeded when dirty pages on bdi cross bdi_thresh and
clear it when dirty pages on bdi are below 7/8*bdi_thresh. So there does
not seem to be any dependency on task dirty limit here hence string
"task" sounds confusing to me. In fact, would bdi_dirty_exceeded_clear_thresh()
be a better name?
 
>  /*
>   *
>   */
> @@ -484,9 +491,11 @@ static void balance_dirty_pages(struct address_space *mapping,
>  	unsigned long background_thresh;
>  	unsigned long dirty_thresh;
>  	unsigned long bdi_thresh;
> +	unsigned long min_bdi_thresh = ULONG_MAX;
>  	unsigned long pages_written = 0;
>  	unsigned long pause = 1;
>  	bool dirty_exceeded = false;
> +	bool min_dirty_exceeded = false;
>  	struct backing_dev_info *bdi = mapping->backing_dev_info;
>  
>  	for (;;) {
> @@ -513,6 +522,7 @@ static void balance_dirty_pages(struct address_space *mapping,
>  			break;
>  
>  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> +		min_bdi_thresh = task_min_dirty_limit(bdi_thresh);
>  		bdi_thresh = task_dirty_limit(current, bdi_thresh);
                ^^^^^
This patch aside, we use bdi_thresh name both for bdi threshold as well
as per task per bdi threshold. will task_bdi_thresh be a better name
here.

>  
>  		/*
> @@ -542,6 +552,9 @@ static void balance_dirty_pages(struct address_space *mapping,
>  		dirty_exceeded =
>  			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
>  			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> +		min_dirty_exceeded =
> +			(bdi_nr_reclaimable + bdi_nr_writeback > min_bdi_thresh)
> +			|| (nr_reclaimable + nr_writeback > dirty_thresh);

Would following be easier to understand.

		clear_dirty_exceeded =
			(bdi_nr_reclaimable + bdi_nr_writeback <
				dirty_exceeded_reset_thresh)
			&& (nr_reclaimable + nr_writeback < dirty_thresh);

>  
>  		if (!dirty_exceeded)
>  			break;
> @@ -579,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
>  			pause = HZ / 10;
>  	}
>  
> -	if (!dirty_exceeded && bdi->dirty_exceeded)
> +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> +	if (!min_dirty_exceeded && bdi->dirty_exceeded)
>  		bdi->dirty_exceeded = 0;

similiarly...

	if (bdi->dirty_exceeded && clear_dirty_exceeded)
		bdi->dirty_exceeded = 0;

I was confused with the term min_dirty_limit and task_min_dirty_limit()
for sometime as patch said that we are trying to move away from dependence
on task specific bdi_thres for clearing bdi->bdi_thresh. May be it is
just me...
 
Thanks
Vivek 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-03-09 21:02 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-08 22:31 [PATCH RFC 0/5] IO-less balance_dirty_pages() v2 (simple approach) Jan Kara
2011-03-08 22:31 ` Jan Kara
2011-03-08 22:31 ` [PATCH 1/5] writeback: account per-bdi accumulated written pages Jan Kara
2011-03-08 22:31   ` Jan Kara
2011-03-08 22:31 ` [PATCH 2/5] mm: Properly reflect task dirty limits in dirty_exceeded logic Jan Kara
2011-03-08 22:31   ` Jan Kara
2011-03-09 21:02   ` Vivek Goyal [this message]
2011-03-14 20:44     ` Jan Kara
2011-03-14 20:44       ` Jan Kara
2011-03-15 15:21       ` Vivek Goyal
2011-03-08 22:31 ` [PATCH 3/5] mm: Implement IO-less balance_dirty_pages() Jan Kara
2011-03-08 22:31   ` Jan Kara
2011-03-10  0:07   ` Vivek Goyal
2011-03-14 20:48     ` Jan Kara
2011-03-14 20:48       ` Jan Kara
2011-03-15 15:23       ` Vivek Goyal
2011-03-16 21:26         ` Curt Wohlgemuth
2011-03-16 22:53           ` Curt Wohlgemuth
2011-03-16 22:53             ` Curt Wohlgemuth
2011-03-16 16:53   ` Vivek Goyal
2011-03-16 19:10     ` Jan Kara
2011-03-16 19:31       ` Vivek Goyal
2011-03-16 19:58         ` Jan Kara
2011-03-16 19:58           ` Jan Kara
2011-03-16 20:22           ` Vivek Goyal
2011-03-08 22:31 ` [PATCH 4/5] mm: Remove low limit from sync_writeback_pages() Jan Kara
2011-03-08 22:31 ` [PATCH 5/5] mm: Autotune interval between distribution of page completions Jan Kara
2011-03-08 22:31   ` Jan Kara
2011-03-17 15:46 ` [PATCH RFC 0/5] IO-less balance_dirty_pages() v2 (simple approach) Curt Wohlgemuth
2011-03-17 15:46   ` Curt Wohlgemuth
2011-03-17 15:51   ` Christoph Hellwig
2011-03-17 15:51     ` Christoph Hellwig
2011-03-17 16:24     ` Curt Wohlgemuth
2011-03-17 16:24       ` Curt Wohlgemuth
2011-03-17 16:43       ` Christoph Hellwig
2011-03-17 16:43         ` Christoph Hellwig
2011-03-17 17:32   ` Jan Kara
2011-03-17 17:32     ` Jan Kara
2011-03-17 18:55     ` Curt Wohlgemuth
2011-03-17 18:55       ` Curt Wohlgemuth
2011-03-17 22:56       ` Vivek Goyal
2011-03-17 22:56         ` Vivek Goyal
2011-03-18 14:30 ` Wu Fengguang
2011-03-18 14:30   ` Wu Fengguang
2011-03-22 21:43   ` Jan Kara
2011-03-22 21:43     ` Jan Kara
2011-03-23  4:41     ` Dave Chinner
2011-03-23  4:41       ` Dave Chinner
2011-03-25 12:59       ` Wu Fengguang
2011-03-25 12:59         ` Wu Fengguang
2011-03-25 13:44     ` Wu Fengguang
2011-03-25 23:05       ` Jan Kara
2011-03-25 23:05         ` Jan Kara
2011-03-28  2:44         ` Wu Fengguang
2011-03-28  2:44           ` Wu Fengguang
2011-03-28 15:08           ` Jan Kara
2011-03-28 15:08             ` Jan Kara
2011-03-29  1:44             ` Wu Fengguang
2011-03-29  1:44               ` Wu Fengguang
2011-03-29  2:14           ` Dave Chinner
2011-03-29  2:41             ` Wu Fengguang
2011-03-29  5:59               ` Dave Chinner
2011-03-29  5:59                 ` Dave Chinner
2011-03-29  7:31                 ` Wu Fengguang
2011-03-29  7:52                   ` Wu Fengguang
2011-03-29  7:52                     ` Wu Fengguang
  -- strict thread matches above, loose matches on Subject: below --
2011-02-04  1:38 [RFC PATCH 0/5] IO-less balance dirty pages Jan Kara
2011-02-04  1:38 ` [PATCH 2/5] mm: Properly reflect task dirty limits in dirty_exceeded logic Jan Kara
2011-02-04  1:38   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110309210253.GD10346@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=fengguang.wu@intel.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.