All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@linux.vnet.ibm.com>, Mel Gorman <mel@csn.ul.ie>,
	Trond Myklebust <Trond.Myklebust@netapp.com>,
	Itaru Kitayama <kitayama@cl.bb4u.ne.jp>,
	Minchan Kim <minchan.kim@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 3/6] writeback: sync expired inodes first in background writeback
Date: Tue, 26 Apr 2011 13:37:06 +0800	[thread overview]
Message-ID: <20110426053706.GA17262@localhost> (raw)
In-Reply-To: <20110422211255.GB2977@quack.suse.cz>

On Sat, Apr 23, 2011 at 05:12:55AM +0800, Jan Kara wrote:
> On Fri 22-04-11 10:24:59, Wu Fengguang wrote:
> > > 2) The intention of both bdi_flush_io() and balance_dirty_pages() is to
> > > write .nr_to_write pages. So they should either do queue_io()
> > > unconditionally (I kind of like that for simplicity) or they should requeue
> > > once if they have not written enough - otherwise it could happen that they
> > > are called just at the moment when b_io contains a single inode with a few
> > > dirty pages and they end up doing almost nothing.
> > 
> > It makes much more sense to keep the policy consistent. When the
> > flusher and the throttled tasks are both actively manipulating the
> > shared lists but in different ways, how are we going to analyze the
> > resulted mixture behavior?
> > 
> > Note that bdi_flush_io() and balance_dirty_pages() both have outer
> > loops to retry writeout, so smallish b_io is not a problem at all.
>   Well, it changes how balance_dirty_pages() behaves in some corner cases
> (I'm not that much concerned about bdi_flush_io() because that is a last
> resort thing anyway). But I see your point in consistency as well.
> 
> > > 3) I guess your patch does not compile because queue_io() is static ;).
> > 
> > Yeah, good spot~ :) Here is the updated patch. I feel like moving
> > bdi_flush_io() to fs-writeback.c rather than exporting the low level
> > queue_io() (and enable others to conveniently change the queue policy!).
> > 
> > balance_dirty_pages() cannot be moved.. so I plan to submit it after
> > any IO-less merges. It's a cleanup patch after all.
> Can't we just have a wrapper in fs/fs-writeback.c that will do:
>      spin_lock(&bdi->wb.list_lock);
>      if (list_empty(&bdi->wb.b_io))
>              queue_io(&bdi->wb, &wbc);
>      writeback_inodes_wb(&bdi->wb, &wbc);
>      spin_unlock(&bdi->wb.list_lock);
> 
> And call it wherever we need? We can then also unexport
> writeback_inodes_wb() which is not really a function someone would want to
> call externally after your changes.

OK, this avoids the need to move bdi_flush_io(). Here is the updated
patch, do you see any more problems?

Thanks,
Fengguang
---
Subject: writeback: elevate queue_io() into wb_writeback()
Date: Thu Apr 21 12:06:32 CST 2011

Code refactor for more logical code layout.
No behavior change.

- remove the mis-named __writeback_inodes_sb()

- wb_writeback()/writeback_inodes_wb() will decide when to queue_io()
  before calling __writeback_inodes_wb()

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/fs-writeback.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

--- linux-next.orig/fs/fs-writeback.c	2011-04-26 13:20:17.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2011-04-26 13:30:19.000000000 +0800
@@ -570,17 +570,13 @@ static int writeback_sb_inodes(struct su
 	return 1;
 }
 
-void writeback_inodes_wb(struct bdi_writeback *wb,
-		struct writeback_control *wbc)
+static void __writeback_inodes_wb(struct bdi_writeback *wb,
+				  struct writeback_control *wbc)
 {
 	int ret = 0;
 
 	if (!wbc->wb_start)
 		wbc->wb_start = jiffies; /* livelock avoidance */
-	spin_lock(&wb->list_lock);
-
-	if (list_empty(&wb->b_io))
-		queue_io(wb, wbc);
 
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);
@@ -596,19 +592,16 @@ void writeback_inodes_wb(struct bdi_writ
 		if (ret)
 			break;
 	}
-	spin_unlock(&wb->list_lock);
 	/* Leave any unwritten inodes on b_io */
 }
 
-static void __writeback_inodes_sb(struct super_block *sb,
-		struct bdi_writeback *wb, struct writeback_control *wbc)
+void writeback_inodes_wb(struct bdi_writeback *wb,
+		struct writeback_control *wbc)
 {
-	WARN_ON(!rwsem_is_locked(&sb->s_umount));
-
 	spin_lock(&wb->list_lock);
 	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
-	writeback_sb_inodes(sb, wb, wbc, true);
+	__writeback_inodes_wb(wb, wbc);
 	spin_unlock(&wb->list_lock);
 }
 
@@ -674,7 +667,7 @@ static long wb_writeback(struct bdi_writ
 	 * The intended call sequence for WB_SYNC_ALL writeback is:
 	 *
 	 *      wb_writeback()
-	 *          __writeback_inodes_sb()     <== called only once
+	 *          writeback_sb_inodes()       <== called only once
 	 *              write_cache_pages()     <== called once for each inode
 	 *                   (quickly) tag currently dirty pages
 	 *                   (maybe slowly) sync all tagged pages
@@ -722,10 +715,14 @@ static long wb_writeback(struct bdi_writ
 
 retry:
 		trace_wbc_writeback_start(&wbc, wb->bdi);
+		spin_lock(&wb->list_lock);
+		if (list_empty(&wb->b_io))
+			queue_io(wb, &wbc);
 		if (work->sb)
-			__writeback_inodes_sb(work->sb, wb, &wbc);
+			writeback_sb_inodes(work->sb, wb, &wbc, true);
 		else
-			writeback_inodes_wb(wb, &wbc);
+			__writeback_inodes_wb(wb, &wbc);
+		spin_unlock(&wb->list_lock);
 		trace_wbc_writeback_written(&wbc, wb->bdi);
 
 		work->nr_pages -= write_chunk - wbc.nr_to_write;

WARNING: multiple messages have this Message-ID (diff)
From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@linux.vnet.ibm.com>, Mel Gorman <mel@csn.ul.ie>,
	Trond Myklebust <Trond.Myklebust@netapp.com>,
	Itaru Kitayama <kitayama@cl.bb4u.ne.jp>,
	Minchan Kim <minchan.kim@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 3/6] writeback: sync expired inodes first in background writeback
Date: Tue, 26 Apr 2011 13:37:06 +0800	[thread overview]
Message-ID: <20110426053706.GA17262@localhost> (raw)
In-Reply-To: <20110422211255.GB2977@quack.suse.cz>

On Sat, Apr 23, 2011 at 05:12:55AM +0800, Jan Kara wrote:
> On Fri 22-04-11 10:24:59, Wu Fengguang wrote:
> > > 2) The intention of both bdi_flush_io() and balance_dirty_pages() is to
> > > write .nr_to_write pages. So they should either do queue_io()
> > > unconditionally (I kind of like that for simplicity) or they should requeue
> > > once if they have not written enough - otherwise it could happen that they
> > > are called just at the moment when b_io contains a single inode with a few
> > > dirty pages and they end up doing almost nothing.
> > 
> > It makes much more sense to keep the policy consistent. When the
> > flusher and the throttled tasks are both actively manipulating the
> > shared lists but in different ways, how are we going to analyze the
> > resulted mixture behavior?
> > 
> > Note that bdi_flush_io() and balance_dirty_pages() both have outer
> > loops to retry writeout, so smallish b_io is not a problem at all.
>   Well, it changes how balance_dirty_pages() behaves in some corner cases
> (I'm not that much concerned about bdi_flush_io() because that is a last
> resort thing anyway). But I see your point in consistency as well.
> 
> > > 3) I guess your patch does not compile because queue_io() is static ;).
> > 
> > Yeah, good spot~ :) Here is the updated patch. I feel like moving
> > bdi_flush_io() to fs-writeback.c rather than exporting the low level
> > queue_io() (and enable others to conveniently change the queue policy!).
> > 
> > balance_dirty_pages() cannot be moved.. so I plan to submit it after
> > any IO-less merges. It's a cleanup patch after all.
> Can't we just have a wrapper in fs/fs-writeback.c that will do:
>      spin_lock(&bdi->wb.list_lock);
>      if (list_empty(&bdi->wb.b_io))
>              queue_io(&bdi->wb, &wbc);
>      writeback_inodes_wb(&bdi->wb, &wbc);
>      spin_unlock(&bdi->wb.list_lock);
> 
> And call it wherever we need? We can then also unexport
> writeback_inodes_wb() which is not really a function someone would want to
> call externally after your changes.

OK, this avoids the need to move bdi_flush_io(). Here is the updated
patch, do you see any more problems?

Thanks,
Fengguang
---
Subject: writeback: elevate queue_io() into wb_writeback()
Date: Thu Apr 21 12:06:32 CST 2011

Code refactor for more logical code layout.
No behavior change.

- remove the mis-named __writeback_inodes_sb()

- wb_writeback()/writeback_inodes_wb() will decide when to queue_io()
  before calling __writeback_inodes_wb()

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/fs-writeback.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

--- linux-next.orig/fs/fs-writeback.c	2011-04-26 13:20:17.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2011-04-26 13:30:19.000000000 +0800
@@ -570,17 +570,13 @@ static int writeback_sb_inodes(struct su
 	return 1;
 }
 
-void writeback_inodes_wb(struct bdi_writeback *wb,
-		struct writeback_control *wbc)
+static void __writeback_inodes_wb(struct bdi_writeback *wb,
+				  struct writeback_control *wbc)
 {
 	int ret = 0;
 
 	if (!wbc->wb_start)
 		wbc->wb_start = jiffies; /* livelock avoidance */
-	spin_lock(&wb->list_lock);
-
-	if (list_empty(&wb->b_io))
-		queue_io(wb, wbc);
 
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);
@@ -596,19 +592,16 @@ void writeback_inodes_wb(struct bdi_writ
 		if (ret)
 			break;
 	}
-	spin_unlock(&wb->list_lock);
 	/* Leave any unwritten inodes on b_io */
 }
 
-static void __writeback_inodes_sb(struct super_block *sb,
-		struct bdi_writeback *wb, struct writeback_control *wbc)
+void writeback_inodes_wb(struct bdi_writeback *wb,
+		struct writeback_control *wbc)
 {
-	WARN_ON(!rwsem_is_locked(&sb->s_umount));
-
 	spin_lock(&wb->list_lock);
 	if (list_empty(&wb->b_io))
 		queue_io(wb, wbc);
-	writeback_sb_inodes(sb, wb, wbc, true);
+	__writeback_inodes_wb(wb, wbc);
 	spin_unlock(&wb->list_lock);
 }
 
@@ -674,7 +667,7 @@ static long wb_writeback(struct bdi_writ
 	 * The intended call sequence for WB_SYNC_ALL writeback is:
 	 *
 	 *      wb_writeback()
-	 *          __writeback_inodes_sb()     <== called only once
+	 *          writeback_sb_inodes()       <== called only once
 	 *              write_cache_pages()     <== called once for each inode
 	 *                   (quickly) tag currently dirty pages
 	 *                   (maybe slowly) sync all tagged pages
@@ -722,10 +715,14 @@ static long wb_writeback(struct bdi_writ
 
 retry:
 		trace_wbc_writeback_start(&wbc, wb->bdi);
+		spin_lock(&wb->list_lock);
+		if (list_empty(&wb->b_io))
+			queue_io(wb, &wbc);
 		if (work->sb)
-			__writeback_inodes_sb(work->sb, wb, &wbc);
+			writeback_sb_inodes(work->sb, wb, &wbc, true);
 		else
-			writeback_inodes_wb(wb, &wbc);
+			__writeback_inodes_wb(wb, &wbc);
+		spin_unlock(&wb->list_lock);
 		trace_wbc_writeback_written(&wbc, wb->bdi);
 
 		work->nr_pages -= write_chunk - wbc.nr_to_write;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-04-26  5:37 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-19  3:00 [PATCH 0/6] writeback: moving expire targets for background/kupdate works Wu Fengguang
2011-04-19  3:00 ` Wu Fengguang
2011-04-19  3:00 ` Wu Fengguang
2011-04-19  3:00 ` [PATCH 1/6] writeback: pass writeback_control down to move_expired_inodes() Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00 ` [PATCH 2/6] writeback: the kupdate expire timestamp should be a moving target Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  7:02   ` Dave Chinner
2011-04-19  7:02     ` Dave Chinner
2011-04-19  7:20     ` Wu Fengguang
2011-04-19  7:20       ` Wu Fengguang
2011-04-19  9:31       ` Jan Kara
2011-04-19  9:31         ` Jan Kara
2011-04-19  3:00 ` [PATCH 3/6] writeback: sync expired inodes first in background writeback Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  7:35   ` Dave Chinner
2011-04-19  7:35     ` Dave Chinner
2011-04-19  9:57     ` Jan Kara
2011-04-19  9:57       ` Jan Kara
2011-04-19 12:56       ` Wu Fengguang
2011-04-19 13:46         ` Wu Fengguang
2011-04-19 13:46           ` Wu Fengguang
2011-04-20  1:21         ` Dave Chinner
2011-04-20  1:21           ` Dave Chinner
2011-04-20  2:53           ` Wu Fengguang
2011-04-20  2:53             ` Wu Fengguang
2011-04-21  0:45             ` Dave Chinner
2011-04-21  0:45               ` Dave Chinner
2011-04-21  2:06               ` Wu Fengguang
2011-04-21  2:06                 ` Wu Fengguang
2011-04-21  3:01                 ` Dave Chinner
2011-04-21  3:01                   ` Dave Chinner
2011-04-21  3:59                   ` Wu Fengguang
2011-04-21  3:59                     ` Wu Fengguang
2011-04-21  4:10                     ` Wu Fengguang
2011-04-21  4:10                       ` Wu Fengguang
2011-04-21  4:36                       ` Christoph Hellwig
2011-04-21  4:36                         ` Christoph Hellwig
2011-04-21  6:36                       ` Dave Chinner
2011-04-21  6:36                         ` Dave Chinner
2011-04-21 16:04                       ` Jan Kara
2011-04-21 16:04                         ` Jan Kara
2011-04-22  2:24                         ` Wu Fengguang
2011-04-22  2:24                           ` Wu Fengguang
2011-04-22 21:12                           ` Jan Kara
2011-04-22 21:12                             ` Jan Kara
2011-04-26  5:37                             ` Wu Fengguang [this message]
2011-04-26  5:37                               ` Wu Fengguang
2011-04-26 14:30                               ` Jan Kara
2011-04-26 14:30                                 ` Jan Kara
2011-04-20  7:38           ` Wu Fengguang
2011-04-20  7:38             ` Wu Fengguang
2011-04-21  1:01             ` Dave Chinner
2011-04-21  1:01               ` Dave Chinner
2011-04-21  1:47               ` Wu Fengguang
2011-04-21  1:47                 ` Wu Fengguang
2011-04-19  3:00 ` [PATCH 4/6] writeback: introduce writeback_control.inodes_cleaned Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  9:47   ` Jan Kara
2011-04-19  9:47     ` Jan Kara
2011-04-19  3:00 ` [PATCH 5/6] writeback: try more writeback as long as something was written Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19 10:20   ` Jan Kara
2011-04-19 10:20     ` Jan Kara
2011-04-19 11:16     ` Wu Fengguang
2011-04-19 11:16       ` Wu Fengguang
2011-04-19 21:10       ` Jan Kara
2011-04-19 21:10         ` Jan Kara
2011-04-20  7:50         ` Wu Fengguang
2011-04-20  7:50           ` Wu Fengguang
2011-04-20 15:22           ` Jan Kara
2011-04-20 15:22             ` Jan Kara
2011-04-21  3:33             ` Wu Fengguang
2011-04-21  4:39               ` Christoph Hellwig
2011-04-21  4:39                 ` Christoph Hellwig
2011-04-21  6:05                 ` Wu Fengguang
2011-04-21  6:05                   ` Wu Fengguang
2011-04-21 16:41                   ` Jan Kara
2011-04-21 16:41                     ` Jan Kara
2011-04-22  2:32                     ` Wu Fengguang
2011-04-22  2:32                       ` Wu Fengguang
2011-04-22 21:23                       ` Jan Kara
2011-04-22 21:23                         ` Jan Kara
2011-04-21  7:09               ` Dave Chinner
2011-04-21  7:09                 ` Dave Chinner
2011-04-21  7:14                 ` Christoph Hellwig
2011-04-21  7:14                   ` Christoph Hellwig
2011-04-21  7:52                   ` Dave Chinner
2011-04-21  7:52                     ` Dave Chinner
2011-04-21  8:00                     ` Christoph Hellwig
2011-04-21  8:00                       ` Christoph Hellwig
2011-04-19  3:00 ` [PATCH 6/6] NFS: return -EAGAIN when skipped commit in nfs_commit_unstable_pages() Wu Fengguang
2011-04-19  3:00   ` Wu Fengguang
2011-04-19  3:29   ` Trond Myklebust
2011-04-19  3:29     ` Trond Myklebust
2011-04-19  3:55     ` Wu Fengguang
2011-04-19  3:55       ` Wu Fengguang
2011-04-21  4:40   ` Christoph Hellwig
2011-04-21  4:40     ` Christoph Hellwig
2011-04-19  6:38 ` [PATCH 0/6] writeback: moving expire targets for background/kupdate works Dave Chinner
2011-04-19  6:38   ` Dave Chinner
2011-04-19  8:02   ` Wu Fengguang
2011-04-19  8:02     ` Wu Fengguang
2011-04-21  4:34 ` Christoph Hellwig
2011-04-21  4:34   ` Christoph Hellwig
2011-04-21  5:50   ` Wu Fengguang
2011-04-21  5:50     ` Wu Fengguang
2011-04-21  5:56     ` Christoph Hellwig
2011-04-21  5:56       ` Christoph Hellwig
2011-04-21  6:07       ` Wu Fengguang
2011-04-21  6:07         ` Wu Fengguang
2011-04-21  7:17         ` Christoph Hellwig
2011-04-21  7:17           ` Christoph Hellwig
2011-04-21 10:15           ` Wu Fengguang
2011-04-21 10:15             ` Wu Fengguang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110426053706.GA17262@localhost \
    --to=fengguang.wu@intel.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=kitayama@cl.bb4u.ne.jp \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=mel@linux.vnet.ibm.com \
    --cc=minchan.kim@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.