From: Dave Chinner <david@fromorbit.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ted Ts'o <tytso@mit.edu>, David Rientjes <rientjes@google.com>,
Jens Axboe <jaxboe@fusionio.com>,
Andrew Morton <akpm@linux-foundation.org>,
Neil Brown <neilb@suse.de>, Alasdair G Kergon <agk@redhat.com>,
Chris Mason <chris.mason@oracle.com>,
Steven Whitehouse <swhiteho@redhat.com>, Jan Kara <jack@suse.cz>,
Frederic Weisbecker <fweisbec@gmail.com>,
"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
"cluster-devel@redhat.com" <cluster-devel@redhat.com>,
"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
"reiserfs-devel@vger.kernel.org" <reiserfs-devel@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [patch 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc
Date: Wed, 25 Aug 2010 23:24:17 +1000
Message-ID: <20100825132417.GQ31488@dastard>
In-Reply-To: <1282740516.2605.3644.camel@laptop>
On Wed, Aug 25, 2010 at 02:48:36PM +0200, Peter Zijlstra wrote:
> On Wed, 2010-08-25 at 07:57 -0400, Ted Ts'o wrote:
> > On Wed, Aug 25, 2010 at 01:35:32PM +0200, Peter Zijlstra wrote:
> > > On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
> > > > Part of the problem is that we have a few places in the kernel where
> > > > failure is really not an option --- or rather, if we're going to fail
> > > > while we're in the middle of doing a commit, our choices really are
> > > > (a) retry the loop in the jbd layer (which Andrew really doesn't
> > > > like), (b) keep our own private cache of free memory so we don't fail
> > > > and/or loop, (c) fail the file system and mark it read-only, or (d)
> > > > panic.
> > >
> > > d) do the allocation before you're committed to going fwd and can still
> > > fail and back out.
> >
> > Sure in some cases that can be done, but the commit has to happen at
> > some point, or we run out of journal space, at which point we're back
> > to (c) or (d).
>
> Well (b) sounds a lot saner than either of those. Simply revert to a
> state that is sub-optimal but has bounded memory use and reserve that
> memory up-front. That way you can always get out of a tight memory spot.
>
> Its what the block layer has always done to avoid the memory deadlock
> situation, it has a private stash of BIOs that is big enough to always
> service some IO, and as long as IO is happening stuff keeps moving fwd
> and we don't deadlock.
>
> Filesystems might have a slightly harder time creating such a bounded
> state because there might be more involved like journals and the like,
> but still it should be possible to create something like that (my swap
> over nfs patches created such a state for the network rx side of
> things).
Filesystems are way more complex than the block layer - the block
layer simply doesn't have to handle situations where thread X is
holding A, B and C, while thread Y needs C to complete the
transaction. Thread Y is the user of the low memory pool, but has
almost depleted it, and so even if we switch to thread X, the pool
does not have enough memory for X to complete and allow us to switch
back to Y and have it complete, freeing the memory from the pool
that it holds.
That is, the guarantee that we will always make progress simply does
not exist in filesystems, so a mempool-like concept seems to me to
be doomed from the start....
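To make the scenario above concrete, here is a rough userspace sketch in C. It is a toy model, not kernel code: the struct, its fields, the pool sizes and the function names are all invented for illustration. A task finishes only if the lock owner it waits on has finished and the reserve can cover what it still needs. On finishing it hands back everything it held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one task drawing on a shared reserve pool. */
struct task {
	int held;        /* pool units the task already holds */
	int need;        /* further pool units it needs to finish */
	int blocked_on;  /* index of the task holding a lock we need, or -1 */
	bool done;
};

/* True if every task can finish using nothing but the reserve. */
static bool pool_guarantees_progress(struct task *t, int n, int pool_free)
{
	bool advanced = true;

	while (advanced) {
		advanced = false;
		for (int i = 0; i < n; i++) {
			assert(t[i].blocked_on < n);
			if (t[i].done)
				continue;
			if (t[i].blocked_on >= 0 && !t[t[i].blocked_on].done)
				continue;	/* lock owner has not finished */
			if (t[i].need > pool_free)
				continue;	/* reserve too depleted */
			/* Takes 'need', finishes, returns 'need' plus 'held'. */
			pool_free += t[i].held;
			t[i].held = 0;
			t[i].need = 0;
			t[i].done = true;
			advanced = true;
		}
	}
	for (int i = 0; i < n; i++)
		if (!t[i].done)
			return false;
	return true;
}

/* Block-layer-like case: independent requests, one unit each. */
static bool block_layer_case(void)
{
	struct task t[] = {
		{ .held = 0, .need = 1, .blocked_on = -1 },
		{ .held = 0, .need = 1, .blocked_on = -1 },
	};
	return pool_guarantees_progress(t, 2, 1);
}

/* Filesystem-like case: Y holds most of the pool but waits on X's lock. */
static bool filesystem_case(void)
{
	struct task t[] = {
		/* X: holds lock C, needs 2 units to complete */
		{ .held = 0, .need = 2, .blocked_on = -1 },
		/* Y: holds 3 of the 4 pool units, needs C from X */
		{ .held = 3, .need = 1, .blocked_on = 0 },
	};
	return pool_guarantees_progress(t, 2, 1);
}
```

In this model block_layer_case() returns true, because each request finishes on its own and the single reserved unit keeps getting reused. filesystem_case() returns false: X cannot get its 2 units, Y cannot get the lock, and the 3 units Y holds never come back to the pool.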
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com