From: Dave Chinner <david@fromorbit.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Nick Piggin <npiggin@suse.de>, jim owens <jowens@hp.com>,
    linux-fsdevel@vger.kernel.org,
    Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [patch][rfc] mm: hold page lock over page_mkwrite
Date: Wed, 4 Mar 2009 15:37:39 +1100
Message-ID: <20090304043739.GM26138@disturbed>
In-Reply-To: <20090303172535.GA16993@shareable.org>

On Tue, Mar 03, 2009 at 05:25:36PM +0000, Jamie Lokier wrote:
> > > it so "we can always make forward progress". But it won't
> > > matter because once a real user drives the system off this
> > > cliff there is no difference between "hung" and "really slow
> > > progress". They are going to crash it and report a hang.
> >
> > I don't think that is the case. These are situations that
> > would be *really* rare and transient. It is not like thrashing
> > in that your working set size exceeds physical RAM, but just
> > a combination of conditions that causes an unusual spike in the
> > required memory to clean some dirty pages (eg. Dave's example
> > of several IOs requiring btree splits over several AGs). Could
> > cause a resource deadlock.
>
> Suppose the system has two pages to be written. The first must
> _reserve_ 40 pages of scratch space just in case the operation will
> need them. If the second page write is initiated concurrently with
> the first, the second must reserve another 40 pages concurrently.
>
> If 10 page writes are concurrent, that's 400 pages of scratch space
> needed in reserve...

Therein lies the problem. XFS can do this in parallel in every AG at
the same time. i.e. the reserve is per AG. The maximum number of AGs
in XFS is 2^32, and I know of filesystems out there that have
thousands of AGs in them. Hence reserving 40 pages per AG is
definitely unreasonable. ;)

Even if we look at concurrent allocations as the upper bound, I've
seen an 8p machine with several hundred concurrent allocation
transactions in progress. Even that is unreasonable if you consider
machines with 64k pages - it's hundreds of megabytes of RAM that are
mostly going to be unused.
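
To put rough numbers on that, here is a back-of-the-envelope sketch
of the reserve arithmetic. The 40-page figure comes from the quoted
example; the count of 300 is only an assumed stand-in for "several
hundred" concurrent transactions, and reserve_bytes() is a
hypothetical helper, not anything in the kernel.

```python
KIB = 1024
MIB = 1024 * KIB

# Per-operation scratch reserve, taken from the quoted example.
PAGES_PER_RESERVATION = 40


def reserve_bytes(concurrent_ops, page_size, pages_per_op=PAGES_PER_RESERVATION):
    """Bytes pinned if every concurrent operation pre-reserves its worst case."""
    return concurrent_ops * pages_per_op * page_size


# Assumption: 300 stands in for "several hundred" concurrent transactions.
for page_size in (4 * KIB, 64 * KIB):
    total = reserve_bytes(300, page_size)
    print(f"{page_size // KIB}k pages: {total / MIB:.1f} MiB reserved")
```

Under those assumptions this prints 46.9 MiB for 4k pages and
750.0 MiB for 64k pages, which is where "hundreds of megabytes"
comes from.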

Specifying a pool of pages is not a guaranteed solution, either:
someone will always exhaust it, because we can't guarantee that any
given transaction will complete before the pool is exhausted. i.e.
the mempool design as it stands can't be used.
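
A toy model of that failure mode, purely for illustration: it is not
the kernel mempool API, and run(), the pool size and the
per-transaction page counts are all made-up assumptions. Each
transaction takes one page per round and only returns its pages once
it holds everything it needs.

```python
def run(pool_pages, needs):
    """Round-robin page grabs from a fixed pool.

    needs[i] is the number of pages transaction i must hold before it can
    complete. Returns True if every transaction completes, False if all
    remaining transactions are stuck waiting on an empty pool.
    """
    held = [0] * len(needs)
    done = [False] * len(needs)
    free = pool_pages
    while not all(done):
        progressed = False
        for i, need in enumerate(needs):
            if done[i]:
                continue
            if free > 0:
                # Take one page from the pool for this transaction.
                free -= 1
                held[i] += 1
                progressed = True
            if held[i] == need:
                # Completion is the only point where pages go back.
                free += held[i]
                held[i] = 0
                done[i] = True
                progressed = True
        if not progressed:
            return False
    return True


print(run(8, [3, 3]))        # two concurrent transactions
print(run(8, [3, 3, 3, 3]))  # four concurrent transactions, same pool
```

In this model the two-transaction case completes, while the
four-transaction case stalls with every transaction holding two
pages and the pool empty, even though the pool is larger than any
single transaction's requirement.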

AFAIC, "should never allocate during writeback" is a great goal, but
it is one that we will never be able to reach without throwing
everything away and starting again. Minimising allocation is
something we can do but we can't avoid it entirely. The higher
layers need to understand this, not assert that the lower layers
must conform to an impossible constraint and break if they don't.....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com