linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Ted Ts'o <tytso@mit.edu>
Cc: David Rientjes <rientjes@google.com>,
	Jens Axboe <jaxboe@fusionio.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Neil Brown <neilb@suse.de>, Alasdair G Kergon <agk@redhat.com>,
	Chris Mason <chris.mason@oracle.com>,
	Steven Whitehouse <swhiteho@redhat.com>, Jan Kara <jack@suse.cz>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"cluster-devel@redhat.com" <cluster-devel@redhat.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"reiserfs-devel@vger.kernel.org" <reiserfs-devel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [patch 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc
Date: Wed, 25 Aug 2010 14:48:36 +0200	[thread overview]
Message-ID: <1282740516.2605.3644.camel@laptop> (raw)
In-Reply-To: <20100825115709.GD4453@thunk.org>

On Wed, 2010-08-25 at 07:57 -0400, Ted Ts'o wrote:
> On Wed, Aug 25, 2010 at 01:35:32PM +0200, Peter Zijlstra wrote:
> > On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
> > > Part of the problem is that we have a few places in the kernel where
> > > failure is really not an option --- or rather, if we're going to fail
> > > while we're in the middle of doing a commit, our choices really are
> > > (a) retry the loop in the jbd layer (which Andrew really doesn't
> > > like), (b) keep our own private cache of free memory so we don't fail
> > > and/or loop, (c) fail the file system and mark it read-only, or (d)
> > > panic. 
> > 
> > d) do the allocation before you're committed to going fwd and can still
> > fail and back out.
> 
> Sure in some cases that can be done, but the commit has to happen at
> some point, or we run out of journal space, at which point we're back
> to (c) or (d).

Well (b) sounds a lot saner than either of those. Simply revert to a
state that is sub-optimal but has bounded memory use and reserve that
memory up-front. That way you can always get out of a tight memory spot.

Its what the block layer has always done to avoid the memory deadlock
situation, it has a private stash of BIOs that is big enough to always
service some IO, and as long as IO is happening stuff keeps moving fwd
and we don't deadlock.

Filesystems might have a slightly harder time creating such a bounded
state because there might be more involved like journals and the like,
but still it should be possible to create something like that (my swap
over nfs patches created such a state for the network rx side of
things).

Also, we cannot let our fear of crappy userspace get in the way of doing
sensible things. Your example of write(2) returning -ENOMEM is not
correct though, the syscall (and the page_mkwrite callback for mmap()s)
happens before we actually dirty the data and need to write things out,
so we can always simply wait for memory to become available to dirty.

  reply	other threads:[~2010-08-25 12:48 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-24 10:50 [patch 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc David Rientjes
2010-08-24 10:50 ` [patch 4/5] btrfs: add nofail variant of set_extent_dirty David Rientjes
2010-08-24 13:30   ` Peter Zijlstra
2010-08-24 12:15 ` [patch 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc Jan Kara
2010-08-24 13:29 ` Peter Zijlstra
2010-08-24 13:33   ` Jens Axboe
2010-08-24 20:11     ` David Rientjes
2010-08-25 11:24       ` Ted Ts'o
2010-08-25 11:35         ` Peter Zijlstra
2010-08-25 11:57           ` Ted Ts'o
2010-08-25 12:48             ` Peter Zijlstra [this message]
2010-08-25 12:52               ` Peter Zijlstra
2010-08-25 13:20                 ` Theodore Tso
2010-08-25 13:31                   ` Peter Zijlstra
2010-08-25 20:43                     ` David Rientjes
2010-08-25 20:55                       ` Peter Zijlstra
2010-08-25 21:11                         ` David Rientjes
2010-08-25 21:27                           ` Peter Zijlstra
2010-08-25 23:11                             ` David Rientjes
2010-08-26  0:19                               ` Ted Ts'o
2010-08-26  0:30                                 ` David Rientjes
     [not found]                                 ` <alpine.DEB.2.00.1008251724360.25783@chino.kir.corp.google.com>
2010-08-26  1:48                                   ` Ted Ts'o
2010-08-26  3:09                                     ` David Rientjes
2010-08-26  6:38                                     ` Dave Chinner
     [not found]                                     ` <alpine.DEB.2.00.1008251951230.7034@chino.kir.corp.google.com>
2010-08-26  7:06                                       ` Dave Chinner
2010-08-26  8:29                                       ` Peter Zijlstra
2010-08-25 13:34                   ` Peter Zijlstra
2010-08-25 13:24               ` Dave Chinner
2010-08-25 13:35                 ` Peter Zijlstra
2010-08-25 20:53                   ` Ted Ts'o
2010-08-25 20:59                     ` David Rientjes
2010-08-25 21:35                     ` Peter Zijlstra
2010-08-25 20:58                   ` David Rientjes
2010-08-25 21:11                     ` Christoph Lameter
2010-08-25 21:21                       ` Peter Zijlstra
2010-08-25 21:23                       ` David Rientjes
2010-08-25 21:35                         ` Christoph Lameter
2010-08-25 23:05                           ` David Rientjes
2010-08-26  1:30                             ` Christoph Lameter
2010-08-26  3:12                               ` David Rientjes
2010-08-26 14:16                                 ` Christoph Lameter
2010-08-26 22:31                                   ` David Rientjes
2010-08-26  0:09                   ` Dave Chinner
2010-08-25 14:13         ` Peter Zijlstra
2010-08-24 13:55   ` Dave Chinner
2010-08-24 14:03     ` Peter Zijlstra
2010-08-24 20:12     ` David Rientjes
2010-08-24 20:08   ` David Rientjes
2010-09-02  1:02 ` [patch v2 " David Rientjes
2010-09-02  1:03   ` [patch v2 4/5] btrfs: add nofail variant of set_extent_dirty David Rientjes
2010-09-02  7:59   ` [patch v2 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc Jiri Slaby
2010-09-02 14:51     ` Jan Kara
2010-09-02 21:15       ` Neil Brown
2010-09-05 23:03         ` David Rientjes
2010-09-05 23:01     ` David Rientjes
2010-09-06  9:05   ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1282740516.2605.3644.camel@laptop \
    --to=peterz@infradead.org \
    --cc=agk@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris.mason@oracle.com \
    --cc=cluster-devel@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=jaxboe@fusionio.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=reiserfs-devel@vger.kernel.org \
    --cc=rientjes@google.com \
    --cc=swhiteho@redhat.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).