From: Wu Fengguang <fengguang.wu@intel.com>
To: <linux-fsdevel@vger.kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Andrea Righi <arighi@develer.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH 09/11] writeback: control dirty pause time
Date: Mon, 03 Oct 2011 21:42:37 +0800
Message-ID: <20111003134537.301789823@intel.com>
In-Reply-To: 20111003134228.090592370@intel.com

[-- Attachment #1: max-pause-adaption --]
[-- Type: text/plain, Size: 2263 bytes --]

The dirty pause time is ultimately controlled by adjusting
nr_dirtied_pause, since the following relationship holds:

	pause = pages_dirtied / task_ratelimit

Assuming

	pages_dirtied ~= nr_dirtied_pause
	task_ratelimit ~= dirty_ratelimit

We get

	nr_dirtied_pause ~= dirty_ratelimit * desired_pause
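
As a rough worked example of the estimate above, here is a userspace
sketch (not kernel code). HZ=1000, the helper names and the sample
numbers are assumptions for illustration; only the two formulas come
from this patch.

```c
#define HZ 1000	/* assumed tick rate; real kernels may use 100, 250, ... */

/*
 * Pause (in jiffies) for a task that dirtied pages_dirtied pages while
 * being limited to task_ratelimit pages per second. The "| 1" mirrors
 * the patch and avoids a division by zero.
 */
static long pause_jiffies(unsigned long pages_dirtied,
			  unsigned long task_ratelimit)
{
	return (long)((unsigned long)HZ * pages_dirtied /
		      (task_ratelimit | 1));
}

/*
 * Invert the relationship: how many pages a task should dirty between
 * pauses so that each pause lasts about desired_pause jiffies.
 */
static unsigned long target_nr_dirtied_pause(unsigned long dirty_ratelimit,
					     long desired_pause)
{
	return dirty_ratelimit * (unsigned long)desired_pause / HZ;
}
```

With dirty_ratelimit = 25600 pages/s and desired_pause = 100 jiffies,
target_nr_dirtied_pause() gives 2560 pages. Feeding that back into
pause_jiffies() with task_ratelimit equal to dirty_ratelimit gives 99
jiffies, one short of 100 because of the "| 1" and integer division.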

Here dirty_ratelimit is preferred over task_ratelimit because it's
more stable.

It's also important to limit possibly large transitional errors, which
can occur when:

- the bandwidth is changing quickly
- pages_dirtied << nr_dirtied_pause on entering the dirty exceeded area
- pages_dirtied >> nr_dirtied_pause on btrfs (to be improved by a
  separate fix, but non-trivial errors are still expected)

So we end up using the above formula inside clamp_val().
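
The clamped update can be sketched in userspace roughly as follows.
This is a simplified model of the hunk in this patch, not the kernel
code itself: clamp_ul() stands in for clamp_val(), HZ=1000 is assumed,
adapt_nr_dirtied_pause() is a hypothetical helper name, and the freerun
case (pause == 0), where the kernel calls dirty_poll_interval(), is
reduced to keeping the current value.

```c
#define HZ 1000	/* assumed tick rate */

/* Userspace stand-in for the kernel's clamp_val(). */
static unsigned long clamp_ul(unsigned long val, unsigned long lo,
			      unsigned long hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Return the next nr_dirtied_pause given the last observed pause. */
static unsigned long adapt_nr_dirtied_pause(unsigned long cur,
					    unsigned long pages_dirtied,
					    unsigned long dirty_ratelimit,
					    long pause, long max_pause)
{
	/* Aim for a pause of max_pause / 2. */
	unsigned long target = dirty_ratelimit *
			       (unsigned long)(max_pause / 2) / HZ;

	if (pause == 0)		/* freerun: not modeled in this sketch */
		return cur;

	if (pause <= max_pause / 4 && pages_dirtied >= cur)
		/* Pause too short: grow by at least 1/8, at most 4x. */
		return clamp_ul(target,
				pages_dirtied + pages_dirtied / 8,
				pages_dirtied * 4);

	if (pause >= max_pause)
		/* Pause too long: shrink by at least 1/8, at most 4x. */
		return 1 | clamp_ul(target,
				    pages_dirtied / 4,
				    pages_dirtied - pages_dirtied / 8);

	return cur;		/* pause is in the acceptable range */
}
```

For example, with max_pause = 200 jiffies, a task that dirtied 32 pages
and paused for 1 jiffy at dirty_ratelimit = 25600 pages/s has a target
of 2560 pages, but the clamp limits the step to 4 * 32 = 128 pages.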

The best test case for this code is to run 100 "dd bs=4M" tasks on
btrfs and check the resulting pause time distribution.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

--- linux-next.orig/mm/page-writeback.c	2011-10-03 17:35:57.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-10-03 17:39:27.000000000 +0800
@@ -1086,6 +1086,10 @@ static void balance_dirty_pages(struct a
 		task_ratelimit = (u64)dirty_ratelimit *
 					pos_ratio >> RATELIMIT_CALC_SHIFT;
 		pause = (HZ * pages_dirtied) / (task_ratelimit | 1);
+		if (unlikely(pause <= 0)) {
+			pause = 1; /* avoid resetting nr_dirtied_pause below */
+			break;
+		}
 		pause = min(pause, max_pause);
 
 pause:
@@ -1107,7 +1111,21 @@ pause:
 		bdi->dirty_exceeded = 0;
 
 	current->nr_dirtied = 0;
-	current->nr_dirtied_pause = dirty_poll_interval(nr_dirty, dirty_thresh);
+	if (pause == 0) { /* in freerun area */
+		current->nr_dirtied_pause =
+				dirty_poll_interval(nr_dirty, dirty_thresh);
+	} else if (pause <= max_pause / 4 &&
+		   pages_dirtied >= current->nr_dirtied_pause) {
+		current->nr_dirtied_pause = clamp_val(
+					dirty_ratelimit * (max_pause / 2) / HZ,
+					pages_dirtied + pages_dirtied / 8,
+					pages_dirtied * 4);
+	} else if (pause >= max_pause) {
+		current->nr_dirtied_pause = 1 | clamp_val(
+					dirty_ratelimit * (max_pause / 2) / HZ,
+					pages_dirtied / 4,
+					pages_dirtied - pages_dirtied / 8);
+	}
 
 	if (writeback_in_progress(bdi))
 		return;



