From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Chakri n <chakriin5@gmail.com>, Krzysztof Oledzki <olel@ans.pl>,
akpm@linux-foundation.org,
linux-pm <linux-pm@lists.linux-foundation.org>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)
Date: Sat, 29 Sep 2007 20:28:42 +0800
Message-ID: <391068925.28146@ustc.edu.cn>
Message-ID: <20070929122842.GA5454@mail.ustc.edu.cn>
In-Reply-To: <1191066481.18147.115.camel@lappy>

On Sat, Sep 29, 2007 at 01:48:01PM +0200, Peter Zijlstra wrote:
>
> On Sat, 2007-09-29 at 19:04 +0800, Fengguang Wu wrote:
> > On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote:
> > > Hi,
> > >
> > > In my testing, a unresponsive file system can hang all I/O in the system.
> > > This is not seen in 2.4.
> > >
> > > I started 20 threads doing I/O on a NFS share. They are just doing 4K
> > > writes in a loop.
> > >
> > > Now I stop NFS server hosting the NFS share and start a
> > > "dd" process to write a file on local EXT3 file system.
> > >
> > > # dd if=/dev/zero of=/tmp/x count=1000
> > >
> > > This process never progresses.
> >
> > Peter, do you think this patch will help?
>
> In another sub-thread:
>
> > It's works on .23-rc8-mm2 with out any problems.
> >
> > "dd" process does not hang any more.
> >
> > Thanks for all the help.
> >
> > Cheers
> > --Chakri
>
> So the per-bdi dirty patches that are in -mm already fix the problem.

That's good.

Still, this patch could be a good candidate for 2.6.22.x or even 2.6.23.

> > ===
> > writeback: avoid possible balance_dirty_pages() lockup on light-load bdi
> >
> > On a busy-writing system, a writer could be hold up infinitely on a
> > light-load device. It will be trying to sync more than enough dirty data.
> >
> > The problem case:
> >
> > 0. sda/nr_dirty >= dirty_limit;
> > sdb/nr_dirty == 0
> > 1. dd writes 32 pages on sdb
> > 2. balance_dirty_pages() blocks dd, and tries to write 6MB.
> > 3. it never gets there: there's only 128KB dirty data.
> > 4. dd may be blocked for a loooong time as long as sda is overloaded
> >
> > Fix it by returning on 'zero dirty inodes' in the current bdi.
> > (In fact there are slight differences between 'dirty inodes' and 'dirty pages'.
> > But there is no available counters for 'dirty pages'.)
> >
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
> > ---
> > mm/page-writeback.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > --- linux-2.6.22.orig/mm/page-writeback.c
> > +++ linux-2.6.22/mm/page-writeback.c
> > @@ -227,6 +227,9 @@ static void balance_dirty_pages(struct a
> >  		if (nr_reclaimable + global_page_state(NR_WRITEBACK) <=
> >  				dirty_thresh)
> >  			break;
> > +		if (list_empty(&mapping->host->i_sb->s_dirty) &&
> > +		    list_empty(&mapping->host->i_sb->s_io))
> > +			break;
> >
> >  		if (!dirty_exceeded)
> >  			dirty_exceeded = 1;
> >
>
> On the patch itself, not sure if it would have been enough. As soon as
> there is a single dirty inode on the list one would get caught in the
> same problem as before.

That should not be a problem. Normally the few new dirty inodes will
all be cleaned in one go, leaving no dirty inodes (at least for a
moment). Hmm, I guess the new 'break' should be moved to immediately
after writeback_inodes()...

> That is, if NFS_dirty+NFS_unstable+NFS_writeback > dirty_limit this
> break won't fix it.

In fact, this patch targets exactly this condition.
When NFS* < dirty_limit, Chakri won't see the lockup at all.
The problem is that there are only two 'break's in the loop, and
neither one evaluates to true for his dd command.

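To make the argument concrete, here is a rough user-space model of the
throttling loop. It is only a sketch, not kernel code: struct bdi_model,
throttle_iterations() and all of their fields and parameters are
hypothetical names made up for illustration, and the real loop's sleeping
and limit recalculation are reduced to a bounded iteration count. The
extra exit is placed after the writeback step, as suggested above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a kernel symbol. */
struct bdi_model {
	long global_dirty;	/* dirty + unstable + writeback pages, system-wide */
	long dirty_thresh;	/* global dirty limit, in pages */
	long bdi_dirty;		/* dirty pages on the writer's own device */
};

/*
 * Returns the iteration on which the throttled writer is released,
 * or -1 if it is still stuck after max_iter passes.
 */
int throttle_iterations(struct bdi_model m, long write_chunk,
			bool bail_when_clean, int max_iter)
{
	long pages_written = 0;

	assert(write_chunk > 0 && max_iter > 0);

	for (int i = 1; i <= max_iter; i++) {
		/* Exit 1: the system as a whole is under the limit. */
		if (m.global_dirty <= m.dirty_thresh)
			return i;

		/* Write back what this device has, up to the remaining quota. */
		long todo = write_chunk - pages_written;
		long done = m.bdi_dirty < todo ? m.bdi_dirty : todo;

		m.bdi_dirty -= done;
		m.global_dirty -= done;
		pages_written += done;

		/* Exit 2: the writer has cleaned its full quota. */
		if (pages_written >= write_chunk)
			return i;

		/* Proposed exit 3: nothing left to clean on this device. */
		if (bail_when_clean && m.bdi_dirty == 0)
			return i;

		/* The real loop would sleep here before retrying. */
	}
	return -1;
}
```

With the numbers from the problem case (32 dirty pages on the local
device, a quota of 1536 pages, i.e. 6MB with 4K pages, and a global count
held above the limit by the stuck NFS pages), neither of the first two
exits can fire: the global count never drops below the limit, and only 32
of the 1536 pages can ever be written. The third exit releases the writer
as soon as its own device is clean.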