From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Chris Mason <chris.mason@oracle.com>, Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>, Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Subject: [PATCH 3/5] writeback: prevent sync livelock with the sync_after timestamp
Date: Thu, 29 Jul 2010 19:51:45 +0800
Message-ID: <20100729121423.471866750@intel.com>
In-Reply-To: <20100729115142.102255590@intel.com>
[-- Attachment #1: writeback-sync-pending-start_time.patch --]
[-- Type: text/plain, Size: 2918 bytes --]
The start time in writeback_inodes_wb() is not very useful because it
slips at each invocation. Preferably one _constant_ time shall be
used at the beginning to cover the whole sync() work.

The newly dirtied inodes are now guarded at the queue_io() time instead
of the b_io walk time. This is more natural: non-empty b_io/b_more_io
means "more work pending".

The timestamp is now grabbed at the sync work submission time, and may
be further optimized to the initial sync() call time.

CC: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
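As an illustration only (not part of the patch): the idea described
above can be sketched in user space as follows. The cutoff is taken
once when the sync work is submitted, and only inodes dirtied at or
before it are queued, so inodes that keep getting redirtied cannot
extend the sync forever. All names here (toy_inode, tick_after,
toy_queue_io) are hypothetical stand-ins, not kernel symbols, and the
plain array stands in for the b_dirty/b_io lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode sitting on the dirty list. */
struct toy_inode {
	unsigned long dirtied_when;	/* tick at which it was dirtied */
	int queued;			/* nonzero once queued for IO */
};

/*
 * Wraparound-safe "a is later than b", written in the spirit of the
 * kernel's time_after(); this helper itself is an assumption.
 */
int tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Queue every not-yet-queued inode that was dirtied at or before
 * sync_after. The caller grabs sync_after once, at sync submission,
 * and passes the same value on every call. Returns the number of
 * inodes newly queued by this call.
 */
size_t toy_queue_io(struct toy_inode *inodes, size_t n,
		    unsigned long sync_after)
{
	size_t i, queued = 0;

	assert(inodes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (inodes[i].queued)
			continue;
		/* Dirtied after the sync started: leave it for later. */
		if (tick_after(inodes[i].dirtied_when, sync_after))
			continue;
		inodes[i].queued = 1;
		queued++;
	}
	return queued;
}
```

With three inodes dirtied at ticks 90, 100 and 105 and a cutoff of
100, the first call queues the first two and later calls with the
same cutoff queue nothing, however often the third one is redirtied.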
 fs/fs-writeback.c         |   16 ++++++----------
 include/linux/writeback.h |    4 ++--
 2 files changed, 8 insertions(+), 12 deletions(-)

--- linux-next.orig/fs/fs-writeback.c	2010-07-29 17:13:49.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2010-07-29 17:13:58.000000000 +0800
@@ -228,6 +228,10 @@ static void move_expired_inodes(struct l
 	struct inode *inode;
 	int do_sb_sort = 0;
+	if (wbc->for_sync) {
+		expire_interval = 1;
+		older_than_this = wbc->sync_after;
+	}
 	if (wbc->for_kupdate || wbc->for_background) {
 		expire_interval = msecs_to_jiffies(dirty_expire_interval * 10);
 		older_than_this = jiffies - expire_interval;
@@ -507,12 +511,6 @@ static int writeback_sb_inodes(struct su
 			requeue_io(inode);
 			continue;
 		}
-		/*
-		 * Was this inode dirtied after sync_sb_inodes was called?
-		 * This keeps sync from extra jobs and livelock.
-		 */
-		if (inode_dirtied_after(inode, wbc->wb_start))
-			return 1;
 		BUG_ON(inode->i_state & I_FREEING);
 		__iget(inode);
@@ -541,10 +539,9 @@ void writeback_inodes_wb(struct bdi_writ
 {
 	int ret = 0;
-	wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
-	if (!(wbc->for_kupdate || wbc->for_background) || list_empty(&wb->b_io))
+	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
 	while (!list_empty(&wb->b_io)) {
@@ -571,9 +568,8 @@ static void __writeback_inodes_sb(struct
 {
 	WARN_ON(!rwsem_is_locked(&sb->s_umount));
-	wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
-	if (!(wbc->for_kupdate || wbc->for_background) || list_empty(&wb->b_io))
+	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
 	writeback_sb_inodes(sb, wb, wbc, true);
 	spin_unlock(&inode_lock);
--- linux-next.orig/include/linux/writeback.h	2010-07-29 17:13:18.000000000 +0800
+++ linux-next/include/linux/writeback.h	2010-07-29 17:13:58.000000000 +0800
@@ -28,8 +28,8 @@ enum writeback_sync_modes {
  */
 struct writeback_control {
 	enum writeback_sync_modes sync_mode;
-	unsigned long wb_start;		/* Time writeback_inodes_wb was
-					   called. This is needed to avoid
+	unsigned long sync_after;	/* Only sync inodes dirtied after this
+					   timestamp. This is needed to avoid
 					   extra jobs and livelock */
 	long nr_to_write;		/* Write this many pages, and decrement
 					   this for each page written */