From: Daniel Phillips <phillips@phunq.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Chakri n" <chakriin5@gmail.com>,
linux-pm <linux-pm@lists.linux-foundation.org>,
lkml <linux-kernel@vger.kernel.org>,
nfs@lists.sourceforge.net,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)
Date: Fri, 28 Sep 2007 17:46:43 -0700
Message-ID: <200709281746.44499.phillips@phunq.net>
In-Reply-To: <20070927235034.ae7bd73d.akpm@linux-foundation.org>
On Thursday 27 September 2007 23:50, Andrew Morton wrote:
> Actually we perhaps could address this at the VFS level in another
> way. Processes which are writing to the dead NFS server will
> eventually block in balance_dirty_pages() once they've exceeded the
> memory limits and will remain blocked until the server wakes up -
> that's the behaviour we want.
It is not necessary to restrict total dirty pages at all. Instead it is
necessary to restrict total writeout in flight. This is evident from
the fact that making progress is the one and only reason our kernel
exists, and writeout is how we make progress clearing memory. In other
words, if we guarantee the progress of writeout, we will live happily
ever after and not have to sell the farm.
The current situation has an eerily similar feeling to the VM
instability in early 2.4, which was never solved until we convinced
ourselves that the only way to deal with Moore's law as applied to
number of memory pages was to implement positive control of swapout in
the form of reverse mapping[1]. This time round, we need to add
positive control of writeout in the form of rate limiting.
I _think_ Peter is with me on this, and not only that, but between the
two of us we already have patches for most of the subsystems that need
it, and we have both been busy testing (different subsets of) these
patches to destruction for the better part of a year.
Anyway, to fix the immediate bug before the one true dirty_limit removal
patch lands (promise) I think you are on the right track by noticing
that balance_dirty_pages has to become aware of how congested the
involved block device is, since blocking a writeout process on an
underused block device is clearly a bad idea. Note how much this idea
looks like rate limiting.
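
To make the distinction concrete, here is a minimal userspace sketch of limiting writeout in flight rather than total dirty pages. It is illustrative only and is not taken from the patches mentioned above; the struct and function names (writeout_throttle, throttle_try_submit, throttle_complete) are hypothetical, and it assumes a single-threaded caller, so it omits the locking and wait queues a real kernel implementation would need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device throttle: it caps pages under writeout,
 * not the number of dirty pages. */
struct writeout_throttle {
    size_t inflight;  /* pages submitted but not yet completed */
    size_t limit;     /* maximum pages allowed in flight at once */
};

static void throttle_init(struct writeout_throttle *t, size_t limit)
{
    t->inflight = 0;
    t->limit = limit;
}

/* Try to admit `pages` of writeout. Returns false when the request
 * would exceed the limit, meaning the caller should wait for
 * completions before submitting more. */
static bool throttle_try_submit(struct writeout_throttle *t, size_t pages)
{
    /* inflight never exceeds limit, so this subtraction cannot wrap. */
    if (pages > t->limit - t->inflight)
        return false;
    t->inflight += pages;
    return true;
}

/* Called when I/O completes; returns budget to the throttle. */
static void throttle_complete(struct writeout_throttle *t, size_t pages)
{
    t->inflight -= (pages < t->inflight) ? pages : t->inflight;
}
```

In this sketch a writer is held back only when the device it is writing to already has its full budget in flight, so a stalled device does not block writers to an idle one.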
[1] We lost the scent for a number of reasons, not least because the
experimental implementation of reverse mapping at the time was buggy
for reasons entirely unrelated to the reverse mapping itself.
Regards,
Daniel