From: Peter Zijlstra <peterz@infradead.org>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
	Richard Kennedy <richard@rsk.demon.co.uk>,
	Dave Chinner <david@fromorbit.com>,
	linux-fsdevel@vger.kernel.org,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/6] writeback: reduce calls to global_page_state in balance_dirty_pages()
Date: Tue, 03 Aug 2010 16:55:27 +0200
Message-ID: <1280847327.1923.589.camel@laptop>
In-Reply-To: <20100711021748.735126772@intel.com>

On Sun, 2010-07-11 at 10:06 +0800, Wu Fengguang wrote:
> 
> CC: Jan Kara <jack@suse.cz>

I can more or less remember this patch, and the result looks good.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>


> Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  mm/page-writeback.c |   95 ++++++++++++++----------------------------
>  1 file changed, 33 insertions(+), 62 deletions(-)
> 
> --- linux-next.orig/mm/page-writeback.c 2010-07-11 08:42:14.000000000 +0800
> +++ linux-next/mm/page-writeback.c      2010-07-11 08:44:49.000000000 +0800
> @@ -253,32 +253,6 @@ static void bdi_writeout_fraction(struct
>         }
>  }
>  
>  static inline void task_dirties_fraction(struct task_struct *tsk,
>                 long *numerator, long *denominator)
>  {
> @@ -469,7 +443,6 @@ get_dirty_limits(unsigned long *pbackgro
>                         bdi_dirty = dirty * bdi->max_ratio / 100;
>  
>                 *pbdi_dirty = bdi_dirty;
>                 task_dirty_limit(current, pbdi_dirty);
>         }
>  }
> @@ -491,7 +464,7 @@ static void balance_dirty_pages(struct a
>         unsigned long bdi_thresh;
>         unsigned long pages_written = 0;
>         unsigned long pause = 1;
> +       int dirty_exceeded;
>         struct backing_dev_info *bdi = mapping->backing_dev_info;
>  
>         for (;;) {
> @@ -510,10 +483,35 @@ static void balance_dirty_pages(struct a
>                 nr_writeback = global_page_state(NR_WRITEBACK) +
>                                global_page_state(NR_WRITEBACK_TEMP);
>  
> +               /*
> +                * In order to avoid the stacked BDI deadlock we need
> +                * to ensure we accurately count the 'dirty' pages when
> +                * the threshold is low.
> +                *
> +                * Otherwise it would be possible to get thresh+n pages
> +                * reported dirty, even though there are thresh-m pages
> +                * actually dirty; with m+n sitting in the percpu
> +                * deltas.
> +                */
> +               if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> +                       bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> +                       bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> +               } else {
> +                       bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> +                       bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> +               }
> +
> +               /*
> +                * The bdi thresh is somehow "soft" limit derived from the
> +                * global "hard" limit. The former helps to prevent heavy IO
> +                * bdi or process from holding back light ones; The latter is
> +                * the last resort safeguard.
> +                */
> +               dirty_exceeded =
> +                       (bdi_nr_reclaimable + bdi_nr_writeback >= bdi_thresh)
> +                       || (nr_reclaimable + nr_writeback >= dirty_thresh);
>  
> +               if (!dirty_exceeded)
>                         break;
>  
>                 /*
> @@ -541,34 +539,10 @@ static void balance_dirty_pages(struct a
>                 if (bdi_nr_reclaimable > bdi_thresh) {
>                         writeback_inodes_wb(&bdi->wb, &wbc);
>                         pages_written += write_chunk - wbc.nr_to_write;
>                         trace_wbc_balance_dirty_written(&wbc, bdi);
> +                       if (pages_written >= write_chunk)
> +                               break;          /* We've done our duty */
>                 }
>                 trace_wbc_balance_dirty_wait(&wbc, bdi);
>                 __set_current_state(TASK_INTERRUPTIBLE);
>                 io_schedule_timeout(pause);
> @@ -582,8 +556,7 @@ static void balance_dirty_pages(struct a
>                         pause = HZ / 10;
>         }
>  
> +       if (!dirty_exceeded && bdi->dirty_exceeded)
>                 bdi->dirty_exceeded = 0;
>  
>         if (writeback_in_progress(bdi))
> @@ -598,9 +571,7 @@ static void balance_dirty_pages(struct a
>          * background_thresh, to keep the amount of dirty memory low.
>          */
>         if ((laptop_mode && pages_written) ||
> +           (!laptop_mode && (nr_reclaimable > background_thresh)))
>                 bdi_start_background_writeback(bdi);
>  }
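
For readers following the quoted comment about percpu deltas, here is a
minimal userspace sketch of the idea, not the kernel code. It assumes a
toy batched counter: toy_read() plays the role of bdi_stat() (cheap,
approximate), toy_sum() the role of bdi_stat_sum() (exact, walks every
CPU) and toy_error() the role of bdi_stat_error(). All toy_* names,
NR_CPUS and BATCH are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */
#define BATCH   8	/* hypothetical per-CPU batch size */

/* Toy model of a batched per-CPU counter (not the kernel's percpu_counter). */
struct toy_counter {
	long global;		/* cheap, approximate value */
	long delta[NR_CPUS];	/* per-CPU deltas not yet folded in */
};

/* Add to one CPU's delta; fold it into the global count once it reaches BATCH. */
static void toy_add(struct toy_counter *c, int cpu, long n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->delta[cpu] += n;
	if (c->delta[cpu] >= BATCH || c->delta[cpu] <= -BATCH) {
		c->global += c->delta[cpu];
		c->delta[cpu] = 0;
	}
}

/* Approximate read: ignores whatever still sits in the per-CPU deltas. */
static long toy_read(const struct toy_counter *c)
{
	return c->global;
}

/* Exact read: global count plus every per-CPU delta. */
static long toy_sum(const struct toy_counter *c)
{
	long sum = c->global;

	for (int i = 0; i < NR_CPUS; i++)
		sum += c->delta[i];
	return sum;
}

/* Upper bound on how far toy_read() can be from toy_sum(). */
static long toy_error(void)
{
	return (long)NR_CPUS * BATCH;
}

/* Same shape as the patch: pay for the exact sum only when thresh is low. */
static long toy_pick(const struct toy_counter *c, long thresh)
{
	return thresh < 2 * toy_error() ? toy_sum(c) : toy_read(c);
}

/* Over the limit if either the per-bdi or the global threshold is reached. */
static bool toy_dirty_exceeded(long bdi_dirty, long bdi_thresh,
			       long global_dirty, long dirty_thresh)
{
	return bdi_dirty >= bdi_thresh || global_dirty >= dirty_thresh;
}
```

With BATCH - 1 pages added on each CPU, toy_read() still reports 0 while
toy_sum() reports 28, so a threshold of 20 would be judged wrongly from
the approximate value. Because 20 < 2 * toy_error(), toy_pick() falls
back to the exact sum in that case.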
