From: Wu Fengguang <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: Jos Houtman <jos-vMeIAzyucXQ@public.gmane.org>
Cc: Nick Piggin <nickpiggin-/E1597aS9LT0CCvOHzKKcA@public.gmane.org>,
"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>,
"linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"jens.axboe-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org"
<jens.axboe-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
"akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org"
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
"hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org"
<hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Page Cache writeback too slow, SSD/noop scheduler/ext2
Date: Sun, 29 Mar 2009 10:32:38 +0800
Message-ID: <20090329023238.GA7825@localhost>
In-Reply-To: <C5F2C492.D4A8%jos-vMeIAzyucXQ@public.gmane.org>
On Sat, Mar 28, 2009 at 12:59:43AM +0800, Jos Houtman wrote:
> Hi,
>
> >>
> >> kupdate surely should just continue to keep trying to write back pages
> >> so long as there are more old pages to clean, and the queue isn't
> >> congested. That seems to be the intention anyway: MAX_WRITEBACK_PAGES
> >> is just the number to write back in a single call, but you see
> >> nr_to_write is set to the number of dirty pages in the system.
>
> And when it's congested it should just wait a little bit before continuing.
>
> >> On your system, what must be happening is more_io is not being set.
> >> The logic in fs/fs-writeback.c might be busted.
>
> I don't know about more_io, but I agree that the logic seems busted.
>
> >
> > Hi Jos,
> >
> > I prepared a debugging patch for 2.6.28. (I cannot observe writeback
> > problems on my local ext2 mount.)
>
> Thanks for the patch, but for next time: how should I apply it?
> It seems to be context aware (@@) and broke on all kernel versions I tried:
> 2.6.28/2.6.28.7/2.6.29
Do you mean that the patch applies after removing " @@.*$"?
To be safe, I created the patch with quilt as well as git, for 2.6.29.
> Because I saw the patch only a few hours ago and didn't want to block on your
> reply, I decided to patch it manually and in the process ported it to 2.6.29.
>
> As for the information the patch provided: It is most helpful.
>
> Attached you will find a list of files containing dirty pages and the count
> of their dirty pages; there is also a dmesg output where I trace the
> writeback for 40 seconds.
They helped, thank you!
> I did some testing on my own using printk's and what I saw is that for the
> inodes located on sdb1 (the database) a lot of times they would pass
> http://lxr.linux.no/linux+v2.6.29/fs/fs-writeback.c#L335
> And then redirty_tail would be called, I haven't had the time to dig deeper,
> but that is my primary suspect for the moment.
You are right. In your case, there are several big dirty files in sdb1,
and the sdb write queue is constantly (almost-)congested. The SSD write
speed is so slow that each round of sdb1 writeback begins with an
uncongested queue, which quickly fills up after some pages are written.
Hence all the inodes get redirtied, because writeback stops on
congestion while nr_to_write is still > 0.
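To make the situation above concrete, here is a toy user-space model of one
writeback round. It is only a sketch, not kernel code: struct wbc_model,
write_inode_model() and the "queue_free" slot counter are hypothetical names
invented for illustration, and the figure of 400 free queue slots used in the
usage note is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for the two writeback_control fields discussed. */
struct wbc_model {
	long nr_to_write;           /* remaining page budget for this call */
	int encountered_congestion; /* set once the device queue is full */
};

/*
 * Toy model of writing back one inode: pages go out until the inode is
 * clean, the budget is used up, or the device queue has no free slots.
 * Returns the number of dirty pages still left on the inode.
 */
static long write_inode_model(struct wbc_model *wbc, long dirty,
			      long *queue_free)
{
	assert(dirty >= 0 && *queue_free >= 0);

	while (dirty > 0 && wbc->nr_to_write > 0) {
		if (*queue_free == 0) {
			/* queue filled up before the budget was consumed */
			wbc->encountered_congestion = 1;
			break;
		}
		dirty--;
		wbc->nr_to_write--;
		(*queue_free)--;
	}
	return dirty;
}
```

Called with a budget of 1024 pages, a large dirty file and 400 free queue
slots, the model stops after 400 pages with nr_to_write = 624 and
encountered_congestion set: the slice is not used up, yet no further progress
is possible in this round.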
The following quick fix should solve the slow-writeback-on-congested-SSD
problem. However, the resulting writeback sequence is suboptimal: it syncs
and requeues each file until the queue is congested (in your case after
about 3~600 pages) instead of until MAX_WRITEBACK_PAGES=1024 pages
have been written.
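The effect of the quick fix on the requeue decision can be sketched in
user-space C as follows. This is a simplified model under stated assumptions,
not the kernel code itself: struct wbc_model and the two decide_*() helpers
are hypothetical, and the two outcomes stand for the requeue_io() and
redirty_tail() branches as I read them in the 2.6.29 source.

```c
/* Hypothetical stand-in for the two writeback_control fields involved. */
struct wbc_model {
	long nr_to_write;           /* remaining page budget for this call */
	int encountered_congestion; /* set once the device queue is full */
};

enum requeue_action {
	REQUEUE_IO,   /* queue for next turn (s_more_io) */
	REDIRTY_TAIL, /* somehow blocked: retry much later */
};

/* Decision before the fix: only a used-up slice requeues the inode. */
static enum requeue_action decide_before_patch(const struct wbc_model *wbc)
{
	return wbc->nr_to_write <= 0 ? REQUEUE_IO : REDIRTY_TAIL;
}

/* Decision with the fix: congestion is treated like a used-up slice. */
static enum requeue_action decide_after_patch(const struct wbc_model *wbc)
{
	if (wbc->nr_to_write <= 0 || wbc->encountered_congestion)
		return REQUEUE_IO;
	return REDIRTY_TAIL;
}
```

For an inode that stopped on congestion with budget left over (for example
nr_to_write = 624, encountered_congestion = 1), the first helper returns
REDIRTY_TAIL and the second returns REQUEUE_IO. An inode that is blocked
without congestion is still redirtied in both cases.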
A more complete fix would be to turn MAX_WRITEBACK_PAGES into an exact
per-file limit. It has been sitting on my todo list for quite a while...
Thanks,
Fengguang
---
fs/fs-writeback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- mm.orig/fs/fs-writeback.c
+++ mm/fs/fs-writeback.c
@@ -325,7 +325,8 @@ __sync_single_inode(struct inode *inode,
 				 * soon as the queue becomes uncongested.
 				 */
 				inode->i_state |= I_DIRTY_PAGES;
-				if (wbc->nr_to_write <= 0) {
+				if (wbc->nr_to_write <= 0 ||
+				    wbc->encountered_congestion) {
 					/*
 					 * slice used up: queue for next turn
 					 */
[-- Attachment #2: writeback-requeue-congestion-quickfix.patch --]
[-- Type: text/x-diff, Size: 486 bytes --]
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e3fe991..da5f88d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -325,7 +325,8 @@ __sync_single_inode(struct inode *inode, struct writeback_control *wbc)
 				 * soon as the queue becomes uncongested.
 				 */
 				inode->i_state |= I_DIRTY_PAGES;
-				if (wbc->nr_to_write <= 0) {
+				if (wbc->nr_to_write <= 0 ||
+				    wbc->encountered_congestion) {
 					/*
 					 * slice used up: queue for next turn
 					 */