From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
	LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Chris Mason <chris.mason@oracle.com>, Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>, Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Subject: [PATCH 3/5] writeback: prevent sync livelock with the sync_after timestamp
Date: Thu, 29 Jul 2010 19:51:45 +0800
Message-ID: <20100729121423.471866750@intel.com>
In-Reply-To: 20100729115142.102255590@intel.com

[-- Attachment #1: writeback-sync-pending-start_time.patch --]
[-- Type: text/plain, Size: 2918 bytes --]

The start time in writeback_inodes_wb() is not very useful because it
slips at each invocation. Preferably one _constant_ time should be
used from the beginning to cover the whole sync() work.
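
To illustrate why a slipping start time is a problem, here is a toy
user-space model. It is not kernel code: the function name, the notion of
a "round" (one writeback invocation), and the rate/batch parameters are
all assumptions made up for this sketch. Inodes dirtied before sync
started are counted in "old", inodes dirtied afterwards in "fresh".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: each round writes at most @batch eligible inodes,
 * then @rate new inodes get dirtied. With a slipping cutoff, inodes
 * dirtied during earlier rounds become eligible in later rounds; with a
 * constant cutoff only the initially dirty inodes are ever eligible.
 *
 * Returns the number of rounds needed, or -1 if @max_rounds is reached.
 */
static long toy_sync_rounds(bool slipping, long initial, long rate,
			    long batch, long max_rounds)
{
	long old = initial;	/* dirtied before sync started */
	long fresh = 0;		/* dirtied after sync started */
	long round;

	assert(initial >= 0 && rate >= 0 && batch > 0 && max_rounds > 0);

	for (round = 0; round < max_rounds; round++) {
		long eligible = slipping ? old + fresh : old;
		long written, from_old;

		if (eligible == 0)
			return round;

		/* Write the oldest inodes first, up to one batch. */
		written = eligible < batch ? eligible : batch;
		from_old = written < old ? written : old;
		old -= from_old;
		fresh -= written - from_old;

		/* Meanwhile, the dirtier keeps producing new work. */
		fresh += rate;
	}
	return -1;	/* sync never caught up */
}
```

In this model, toy_sync_rounds(false, 100, 10, 10, 1000) returns 10, while
toy_sync_rounds(true, 100, 10, 10, 1000) returns -1: once the dirtier keeps
pace with the writer, only the constant cutoff bounds the work.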

Newly dirtied inodes are now filtered out at queue_io() time instead
of at b_io walk time. This is more natural: non-empty b_io/b_more_io
means "more work pending".
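
A minimal user-space sketch of queue-time filtering follows. It uses plain
arrays instead of the kernel's list_head queues, and struct toy_inode,
toy_time_before() and toy_queue_io() are hypothetical names that exist
only in this example. The point is that once the cutoff is applied while
filling io[], the walker can process io[] unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	unsigned long dirtied_when;	/* jiffies-like timestamp */
};

/*
 * Wraparound-tolerant "a is before b", in the style of the kernel's
 * time_before(). Relies on the usual two's complement conversion.
 */
static bool toy_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Move inodes dirtied before @cutoff from dirty[] to io[] (at most
 * @io_cap of them). Inodes that stay behind are compacted to the front
 * of dirty[] and *n_dirty is updated. Returns the number moved.
 */
static size_t toy_queue_io(struct toy_inode *dirty, size_t *n_dirty,
			   struct toy_inode *io, size_t io_cap,
			   unsigned long cutoff)
{
	size_t moved = 0, kept = 0, i;

	assert(dirty && n_dirty && io);

	for (i = 0; i < *n_dirty; i++) {
		if (moved < io_cap &&
		    toy_time_before(dirty[i].dirtied_when, cutoff))
			io[moved++] = dirty[i];
		else
			dirty[kept++] = dirty[i];
	}
	*n_dirty = kept;
	return moved;
}
```

With this split, an empty io[] after the walk means the sync work is done,
no matter how many newer inodes remain in dirty[].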

The timestamp is now grabbed at sync work submission time, and may be
further optimized to the initial sync() call time.

CC: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/fs-writeback.c         |   16 ++++++----------
 include/linux/writeback.h |    4 ++--
 2 files changed, 8 insertions(+), 12 deletions(-)

--- linux-next.orig/fs/fs-writeback.c	2010-07-29 17:13:49.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2010-07-29 17:13:58.000000000 +0800
@@ -228,6 +228,10 @@ static void move_expired_inodes(struct l
 	struct inode *inode;
 	int do_sb_sort = 0;
 
+	if (wbc->for_sync) {
+		expire_interval = 1;
+		older_than_this = wbc->sync_after;
+	}
 	if (wbc->for_kupdate || wbc->for_background) {
 		expire_interval = msecs_to_jiffies(dirty_expire_interval * 10);
 		older_than_this = jiffies - expire_interval;
@@ -507,12 +511,6 @@ static int writeback_sb_inodes(struct su
 			requeue_io(inode);
 			continue;
 		}
-		/*
-		 * Was this inode dirtied after sync_sb_inodes was called?
-		 * This keeps sync from extra jobs and livelock.
-		 */
-		if (inode_dirtied_after(inode, wbc->wb_start))
-			return 1;
 
 		BUG_ON(inode->i_state & I_FREEING);
 		__iget(inode);
@@ -541,10 +539,9 @@ void writeback_inodes_wb(struct bdi_writ
 {
 	int ret = 0;
 
-	wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
 
-	if (!(wbc->for_kupdate || wbc->for_background) || list_empty(&wb->b_io))
+	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
 
 	while (!list_empty(&wb->b_io)) {
@@ -571,9 +568,8 @@ static void __writeback_inodes_sb(struct
 {
 	WARN_ON(!rwsem_is_locked(&sb->s_umount));
 
-	wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
-	if (!(wbc->for_kupdate || wbc->for_background) || list_empty(&wb->b_io))
+	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
 	writeback_sb_inodes(sb, wb, wbc, true);
 	spin_unlock(&inode_lock);
--- linux-next.orig/include/linux/writeback.h	2010-07-29 17:13:18.000000000 +0800
+++ linux-next/include/linux/writeback.h	2010-07-29 17:13:58.000000000 +0800
@@ -28,8 +28,8 @@ enum writeback_sync_modes {
  */
 struct writeback_control {
 	enum writeback_sync_modes sync_mode;
-	unsigned long wb_start;         /* Time writeback_inodes_wb was
-					   called. This is needed to avoid
+	unsigned long sync_after;	/* Only sync inodes dirtied before this
+					   timestamp. This is needed to avoid
 					   extra jobs and livelock */
 	long nr_to_write;		/* Write this many pages, and decrement
 					   this for each page written */




