From: Wu Fengguang <fengguang.wu@intel.com>
To: <linux-fsdevel@vger.kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Andrea Righi <arighi@develer.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH 10/11] writeback: dirty position control - bdi reserve area
Date: Mon, 03 Oct 2011 21:42:38 +0800
Message-ID: <20111003134537.434162395@intel.com>
In-Reply-To: 20111003134228.090592370@intel.com

[-- Attachment #1: bdi-reserve-area --]
[-- Type: text/plain, Size: 1468 bytes --]

Keep a minimal pool of dirty pages for each bdi, so that the disk IO
queues won't underrun. Also gently raise a small bdi_thresh, so that it
doesn't get stuck at 0 for a lightly dirtied bdi.

This is particularly useful for JBOD setups and small-memory systems.

It may result in (pos_ratio > 1) at the setpoint and push the dirty
pages higher. This is more or less intended, because the bdi is in
danger of IO queue underrun.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

--- linux-next.orig/mm/page-writeback.c	2011-10-03 21:05:48.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-10-03 21:05:51.000000000 +0800
@@ -599,6 +599,7 @@ static unsigned long bdi_position_ratio(
 	 */
 	if (unlikely(bdi_thresh > thresh))
 		bdi_thresh = thresh;
+	bdi_thresh = max(bdi_thresh, (limit - dirty) / 8);
 	/*
 	 * scale global setpoint to bdi's:
 	 *	bdi_setpoint = setpoint * bdi_thresh / thresh
@@ -622,6 +623,20 @@ static unsigned long bdi_position_ratio(
 	} else
 		pos_ratio /= 4;
 
+	/*
+	 * bdi reserve area, safeguard against dirty pool underrun and disk idle
+	 * It may push the desired control point of global dirty pages higher
+	 * than setpoint.
+	 */
+	x_intercept = bdi_thresh / 2;
+	if (bdi_dirty < x_intercept) {
+		if (bdi_dirty > x_intercept / 8) {
+			pos_ratio *= x_intercept;
+			do_div(pos_ratio, bdi_dirty);
+		} else
+			pos_ratio *= 8;
+	}
+
 	return pos_ratio;
 }
 
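To make the effect of the two hunks concrete, here is a minimal userspace
sketch. It is not the kernel code itself. The helper names
floor_bdi_thresh() and reserve_area_boost() and the fixed-point scale
POS_RATIO_ONE are assumptions made for illustration only. The sketch also
adds an explicit guard for dirty >= limit so that it is safe to run
standalone.

```c
#include <stdint.h>

/* Assumed fixed-point scale for illustration: 1024 stands for a ratio of 1.0. */
#define POS_RATIO_ONE 1024ULL

/*
 * Hypothetical helper mirroring the first hunk: never let the per-bdi
 * threshold fall below 1/8 of the remaining global headroom.
 */
static unsigned long floor_bdi_thresh(unsigned long bdi_thresh,
                                      unsigned long limit,
                                      unsigned long dirty)
{
	/* Guard added for the sketch: avoid unsigned wrap when dirty >= limit. */
	unsigned long headroom = limit > dirty ? limit - dirty : 0;
	unsigned long floor = headroom / 8;

	return bdi_thresh > floor ? bdi_thresh : floor;
}

/*
 * Hypothetical helper mirroring the second hunk: below x_intercept the
 * ratio is scaled by x_intercept / bdi_dirty, and the boost is capped
 * at 8x once bdi_dirty drops to x_intercept / 8 or less.
 */
static uint64_t reserve_area_boost(uint64_t pos_ratio,
                                   unsigned long bdi_thresh,
                                   unsigned long bdi_dirty)
{
	unsigned long x_intercept = bdi_thresh / 2;

	if (bdi_dirty < x_intercept) {
		if (bdi_dirty > x_intercept / 8)
			/* bdi_dirty >= 1 here, so the division is safe. */
			pos_ratio = pos_ratio * x_intercept / bdi_dirty;
		else
			pos_ratio *= 8;
	}
	return pos_ratio;
}
```

With bdi_thresh = 800 pages, x_intercept is 400. A bdi holding 200 dirty
pages gets its pos_ratio doubled, and one holding 50 pages or fewer gets
the capped 8x boost. The two branches meet at bdi_dirty = x_intercept / 8,
so the curve has no jump there.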


