From: Andrew Morton <akpm@osdl.org>
To: David Chinner <dgc@sgi.com>
Cc: dgc@sgi.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] Prevent large file writeback starvation
Date: Sun, 5 Feb 2006 22:22:15 -0800
Message-ID: <20060205222215.313f30a9.akpm@osdl.org>
In-Reply-To: <20060206054815.GJ43335175@melbourne.sgi.com>

David Chinner <dgc@sgi.com> wrote:
>
> > From a quick peek, this code:
> >
> >         if (wbc->for_kupdate) {
> >                 /*
> >                  * For the kupdate function we leave the inode
> >                  * at the head of sb_dirty so it will get more
> >                  * writeout as soon as the queue becomes
> >                  * uncongested.
> >                  */
> >                 inode->i_state |= I_DIRTY_PAGES;
> >                 list_move_tail(&inode->i_list, &sb->s_dirty);
> >
> >
> > isn't working right any more.
>
> If the intent is to continue writing it back until fully
> sync'd, then shouldn't we be moving that to the tail of I/O list so
> we don't have to iterate over the dirty list again before we try to
> write another chunk out?

Only if dirtied_when has expired. Until that's true I think it's right to
move on to other (potentially expired) inodes.

Your patch leaves these inodes on s_io, actually.
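
For readers following along, here is a rough userspace sketch of what
"dirtied_when has expired" could look like as a check. The names
time_after_model() and inode_expired_model() are made up for illustration,
and the comparison is only modelled on the kernel's wraparound-safe
time_after() idiom; it is not the actual fs-writeback.c code.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical model of a wraparound-safe "a is later than b" test on a
 * free-running tick counter. Relies on the usual two's-complement
 * behaviour when the unsigned difference is converted to long.
 */
static int time_after_model(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Hypothetical helper: an inode counts as expired once its dirtied_when
 * timestamp is no later than the cutoff (now - expire_interval).
 */
static int inode_expired_model(unsigned long dirtied_when,
			       unsigned long now,
			       unsigned long expire_interval)
{
	unsigned long older_than_this;

	/* The signed comparison only works for intervals below half the range. */
	assert(expire_interval <= (unsigned long)LONG_MAX);

	older_than_this = now - expire_interval;
	return !time_after_model(dirtied_when, older_than_this);
}
```

With now at tick 10000 and a 3000-tick expiry interval, an inode dirtied at
tick 6000 counts as expired, while one dirtied at tick 9000 does not and
would be skipped in favour of other inodes.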
> > >
> > > It appears that it is intended to handle congested devices. The thing
> > > is, 1024 pages on writeback is not enough to congest a single disk,
> > > let alone a RAID box 10 or 100 times faster than a single disk.
> > > Hence we're stopping writeback long before we congest the device.
> >
> > I think the comment is misleading. The writeout pass can terminate because
> > wbc->nr_to_write was satisfied, as well as for queue congestion.
>
> Exactly my point and what the patch addresses - it allows writeback on
> that inode to continue from where it left off if the device was not
> congested.

But what will it do to other inodes? Say, ones which have expired? This
inode could take many minutes to write out if it's all fragmented.

s_dirty is supposed to be kept in dirtied_when order, btw.
> > I suspect what's happened here is that someone other than pdflush has tried
> > to do some writeback and didn't set for_kupdate, so we ended up resetting
> > dirtied_when.
>
> If it's not wb_kupdate that is trying to write it back, and we have little
> memory pressure, and we completed writing the file long ago, then what behaves
> exactly like wb_kupdate for hours on end apart from wb_kupdate?

Don't know. I'm not sure that we know exactly what's going on yet.

The list_move_tail is supposed to put the inode at the *head* of s_dirty.
So it's the first one which gets encountered on the next pdflush pass.
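
A small userspace sketch of why moving to the *tail* can mean "first in
line". It assumes the writeback code picks its next inode from the ->prev
end of the list, as kernels of this era appear to do; the list helpers and
struct inode_model below are simplified stand-ins written for this
illustration, not the kernel's list.h.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical, cut-down inode carrying only what the example needs. */
struct inode_model { int ino; struct list_head i_list; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_del_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Insert e right after head: where newly dirtied inodes are assumed to go. */
static void list_add_front(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Insert e right before head, i.e. at the ->prev end. */
static void list_add_back(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Model of list_move_tail(): unlink e, then re-add it at the ->prev end. */
static void list_move_tail_model(struct list_head *e, struct list_head *head)
{
	assert(e != head);
	list_del_entry(e);
	list_add_back(e, head);
}

/* Assumed consumer: the next inode to write is the one at head->prev. */
static struct inode_model *next_for_writeback(struct list_head *s_dirty)
{
	if (s_dirty->prev == s_dirty)
		return NULL;	/* list is empty */
	return (struct inode_model *)((char *)s_dirty->prev -
				      offsetof(struct inode_model, i_list));
}
```

If inodes 1, 2 and 3 are dirtied in that order, inode 1 is picked first.
After list_move_tail_model() on inode 3, inode 3 is picked first instead,
which is the "head" behaviour described above.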
And I guess that's working OK. Except we only write 4MB of it every five
seconds. Is that the case?

If so, why would that happen? Take a look at wb_kupdate(). It's supposed
to work *continuously* on the inodes until writeback_inodes() fails to
write back enough pages. It takes this as an indication that there's no
more work to do at this time.

It'd be interesting to take a look at what's happening in wb_kupdate().
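
The loop described above can be sketched in userspace roughly as follows.
This is a simplified model, not the real wb_kupdate(): struct wbc_model,
writeback_inodes_model() and kupdate_pass_model() are hypothetical names,
the congestion wait is left out entirely, and the 1024-page chunk is taken
from the figure quoted earlier in this thread.

```c
#include <assert.h>

#define MAX_WRITEBACK_PAGES 1024	/* 4MB per chunk with 4KB pages */

/* Hypothetical stand-in for struct writeback_control. */
struct wbc_model {
	long nr_to_write;	/* remaining quota for this chunk */
};

/*
 * Hypothetical stand-in for writeback_inodes(): writes as many of the
 * currently eligible pages as the quota allows and charges the quota.
 */
static void writeback_inodes_model(struct wbc_model *wbc, long *eligible)
{
	long n = *eligible < wbc->nr_to_write ? *eligible : wbc->nr_to_write;

	*eligible -= n;
	wbc->nr_to_write -= n;
}

/*
 * One kupdate-style pass: keep issuing full chunks until a chunk comes
 * back with quota left over, which is read as "no more work right now".
 * Returns the number of pages written in this pass.
 */
static long kupdate_pass_model(long nr_to_write, long *eligible)
{
	long written = 0;

	assert(nr_to_write >= 0 && *eligible >= 0);

	while (nr_to_write > 0) {
		struct wbc_model wbc = { .nr_to_write = MAX_WRITEBACK_PAGES };

		writeback_inodes_model(&wbc, eligible);
		written += MAX_WRITEBACK_PAGES - wbc.nr_to_write;
		if (wbc.nr_to_write > 0)
			break;	/* quota not used up: nothing left to do */
		nr_to_write -= MAX_WRITEBACK_PAGES;
	}
	return written;
}
```

In this model a pass with 5000 eligible pages writes all 5000 before it
stops. A pass only stalls at 1024 pages if just one chunk's worth of pages
is eligible when it runs, which is the sort of thing worth checking for in
the real function.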
> > > Therefore, let's only move the inode back onto the dirty list if the device
> > > really is congested. Patch against 2.6.15-rc2 below.
> >
> > This'll break something else, I bet :(
>
> Wonderful. What needs testing to indicate something else hasn't broken?

Hard.

> Does anyone have any regression tests for this code?

No.