All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <jens.axboe@oracle.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	jack@suse.cz, jengelh@medozas.de, stable@kernel.org,
	gregkh@suse.de
Subject: [PATCH] writeback: Fix broken sync writeback
Date: Fri, 12 Feb 2010 10:16:09 +0100	[thread overview]
Message-ID: <20100212091609.GB1025@kernel.dk> (raw)

[-- Attachment #1: Type: text/plain, Size: 2390 bytes --]

Hi,

There's currently a writeback bug in the 2.6.32 and 2.6.33-rc kernels
that prevent proper writeback when sync(1) is being run. Instead of
flushing everything older than the sync run, it will do chunks of normal
MAX_WRITEBACK_PAGES writeback and restart over and over. This results in
very suboptimal writeback for many files, see the initial report from
Jan Engelhardt:

http://lkml.org/lkml/2010/1/22/382

This fixes it by using the passed in page writeback count, instead of
doing MAX_WRITEBACK_PAGES batches, which gets us much better performance
(Jan reports it's up from ~400KB/sec to 10MB/sec) and makes sync(1)
finish properly even when new pages are being dirted.

Thanks to Jan Kara <jack@suse.cz> for spotting this problem!

Cc: stable@kernel.org
Reported-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

---

I'm sending this out as a patch on the list instead since I'll be going
away for a week skiing very shortly, a mail+patch is easier to discuss
here.

Greg, I'm attaching a 2.6.32 based patch as well, since this one will
not apply to 2.6.32.

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 1a7c42c..55aeea9 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -749,6 +749,8 @@ static long wb_writeback(struct bdi_writeback *wb,
 	}
 
 	for (;;) {
+		long to_write = 0;
+
 		/*
 		 * Stop writeback when nr_pages has been consumed
 		 */
@@ -762,12 +764,17 @@ static long wb_writeback(struct bdi_writeback *wb,
 		if (args->for_background && !over_bground_thresh())
 			break;
 
+		if (args->sync_mode == WB_SYNC_ALL)
+			to_write = args->nr_pages;
+		if (!to_write)
+			to_write = MAX_WRITEBACK_PAGES;
+
 		wbc.more_io = 0;
-		wbc.nr_to_write = MAX_WRITEBACK_PAGES;
+		wbc.nr_to_write = to_write;
 		wbc.pages_skipped = 0;
 		writeback_inodes_wb(wb, &wbc);
-		args->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
-		wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write;
+		args->nr_pages -= to_write - wbc.nr_to_write;
+		wrote += to_write - wbc.nr_to_write;
 
 		/*
 		 * If we consumed everything, see if we have more
@@ -782,7 +789,7 @@ static long wb_writeback(struct bdi_writeback *wb,
 		/*
 		 * Did we write something? Try for more
 		 */
-		if (wbc.nr_to_write < MAX_WRITEBACK_PAGES)
+		if (wbc.nr_to_write < to_write)
 			continue;
 		/*
 		 * Nothing written. Wait for some inode to

-- 
Jens Axboe


[-- Attachment #2: writeback-fix-broken-sync-2.6.32.patch --]
[-- Type: text/x-diff, Size: 2409 bytes --]

commit 057226ca7447880e4e766a82cf32197e492ba963
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Fri Feb 12 10:14:34 2010 +0100

    writeback: Fix broken sync writeback
    
    There's currently a writeback bug in the 2.6.32 and 2.6.33-rc kernels
    that prevent proper writeback when sync(1) is being run. Instead of
    flushing everything older than the sync run, it will do chunks of normal
    MAX_WRITEBACK_PAGES writeback and restart over and over. This results in
    very suboptimal writeback for many files, see the initial report from
    Jan Engelhardt:
    
    http://lkml.org/lkml/2010/1/22/382
    
    This fixes it by using the passed in page writeback count, instead of
    doing MAX_WRITEBACK_PAGES batches, which gets us much better performance
    (Jan reports it's up from ~400KB/sec to 10MB/sec) and makes sync(1)
    finish properly even when new pages are being dirted.
    
    Thanks to Jan Kara <jack@suse.cz> for spotting this problem!
    
    Cc: stable@kernel.org
    Reported-by: Jan Engelhardt <jengelh@medozas.de>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 9d5360c..8a46c67 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -773,6 +773,8 @@ static long wb_writeback(struct bdi_writeback *wb,
 	}
 
 	for (;;) {
+		long to_write = 0;
+
 		/*
 		 * Stop writeback when nr_pages has been consumed
 		 */
@@ -786,13 +788,18 @@ static long wb_writeback(struct bdi_writeback *wb,
 		if (args->for_background && !over_bground_thresh())
 			break;
 
+		if (args->sync_mode == WB_SYNC_ALL)
+			to_write = args->nr_pages;
+		if (!to_write)
+			to_write = MAX_WRITEBACK_PAGES;
+
 		wbc.more_io = 0;
 		wbc.encountered_congestion = 0;
-		wbc.nr_to_write = MAX_WRITEBACK_PAGES;
+		wbc.nr_to_write = to_write;
 		wbc.pages_skipped = 0;
 		writeback_inodes_wb(wb, &wbc);
-		args->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
-		wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write;
+		args->nr_pages -= to_write - wbc.nr_to_write;
+		wrote += to_write - wbc.nr_to_write;
 
 		/*
 		 * If we consumed everything, see if we have more
@@ -807,7 +814,7 @@ static long wb_writeback(struct bdi_writeback *wb,
 		/*
 		 * Did we write something? Try for more
 		 */
-		if (wbc.nr_to_write < MAX_WRITEBACK_PAGES)
+		if (wbc.nr_to_write < to_write)
 			continue;
 		/*
 		 * Nothing written. Wait for some inode to

             reply	other threads:[~2010-02-12  9:16 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-12  9:16 Jens Axboe [this message]
2010-02-12 15:45 ` [PATCH] writeback: Fix broken sync writeback Linus Torvalds
2010-02-13 12:58   ` Jan Engelhardt
2010-02-15 14:49     ` Jan Kara
2010-02-15 15:41       ` Jan Engelhardt
2010-02-15 15:58         ` Jan Kara
2010-06-27 16:44         ` Jan Engelhardt
2010-10-24 23:41           ` Sync writeback still broken Jan Engelhardt
2010-10-30  0:57             ` Linus Torvalds
2010-10-30  1:16               ` Linus Torvalds
2010-10-30  1:30                 ` Linus Torvalds
2010-10-30  3:18                 ` Andrew Morton
2010-10-30 13:15                 ` Christoph Hellwig
2010-10-31 12:24             ` Jan Kara
2010-10-31 22:40               ` Jan Kara
2010-11-05 21:33                 ` Jan Kara
2010-11-05 21:34                   ` Jan Kara
2010-11-05 21:41                     ` Linus Torvalds
2010-11-05 22:03                   ` Jan Engelhardt
2010-11-07 12:57                     ` Jan Kara
2011-01-20 22:50                     ` Jan Engelhardt
2011-01-21 15:09                       ` Jan Kara
2010-02-15 14:17   ` [PATCH] writeback: Fix broken sync writeback Jan Kara
2010-02-16  0:05     ` Linus Torvalds
2010-02-16 23:00       ` Jan Kara
2010-02-16 23:34         ` Linus Torvalds
2010-02-17  0:01           ` Linus Torvalds
2010-02-17  1:33           ` Jan Kara
2010-02-17  1:57             ` Dave Chinner
2010-02-17  3:35             ` Linus Torvalds
2010-02-17  4:30               ` tytso
2010-02-17  5:16                 ` Linus Torvalds
2010-02-22 17:29                   ` Jan Kara
2010-02-22 21:01                     ` tytso
2010-02-22 22:26                       ` Jan Kara
2010-02-23  2:53                       ` Dave Chinner
2010-02-23  3:23                         ` tytso
2010-02-23  5:53                           ` Dave Chinner
2010-02-24 14:56                             ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100212091609.GB1025@kernel.dk \
    --to=jens.axboe@oracle.com \
    --cc=gregkh@suse.de \
    --cc=jack@suse.cz \
    --cc=jengelh@medozas.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.