From: Daniel Phillips <phillips@phunq.net>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Mike Snitzer" <snitzer@gmail.com>,
	"Christoph Lameter" <clameter@sgi.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, dkegel@google.com,
	"David Miller" <davem@davemloft.net>,
	"Nick Piggin" <npiggin@suse.de>, "Wouter Verhelst" <w@uter.be>,
	"Evgeniy Polyakov" <johnpol@2ka.mipt.ru>
Subject: Re: [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC)
Date: Tue, 18 Sep 2007 09:56:06 -0700
Message-ID: <200709180956.07772.phillips@phunq.net>
In-Reply-To: <20070918115836.1394a051@twins>

On Tuesday 18 September 2007 02:58, Peter Zijlstra wrote:
> On Mon, 17 Sep 2007 22:11:25 -0700 Daniel Phillips wrote:
> > > I've been using Avi Kivity's patch from some time ago:
> > > http://lkml.org/lkml/2004/7/26/68
> >
> > Yes.  Ddsnap includes a bit of code almost identical to that, which
> > we wrote independently.  Seems wild and crazy at first blush,
> > doesn't it? But this approach has proved robust in practice, and is
> > to my mind, obviously correct.
>
> I'm so not liking this :-(

Why don't you share your specific concerns?

> Can't we just run the user-space part as mlockall and extend netlink
> to work with PF_MEMALLOC where needed?
>
> I did something like that for iSCSI.

Not sure what you mean by "extend netlink".  We do run the user daemons
under mlockall, of course; this is one of the rules I stated earlier for
daemons running in the block IO path.  The problem is that if this
userspace daemon allocates even one page, for example in sys_open, it
can deadlock.  Running the daemon in PF_MEMALLOC mode fixes this
problem robustly, provided that the necessary audit of memory
allocation patterns and library dependencies has been done.
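
For illustration only, a minimal sketch of the mlockall rule for a
daemon in the block IO path. The function names lock_daemon_memory and
prefault_buffer are hypothetical and do not come from ddsnap. Entering
PF_MEMALLOC mode is not shown, since that depends on a kernel patch such
as the one referenced earlier in the thread.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Lock every current and future page of this process into RAM so the
 * daemon never has to fault its own pages back in during writeout.
 * Returns 0 on success, -1 on failure with errno left as set by mlockall.
 * (Hypothetical helper name, for illustration.)
 */
int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int saved = errno;  /* fprintf may clobber errno */
        fprintf(stderr, "mlockall: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}

/*
 * Allocate a working buffer at startup and touch every byte, so that no
 * page of it needs to be allocated later while servicing block IO.
 * Returns NULL if the allocation fails. (Hypothetical helper name.)
 */
void *prefault_buffer(size_t len)
{
    void *buf = malloc(len);

    if (buf != NULL)
        memset(buf, 0, len);
    return buf;
}
```

A daemon would call lock_daemon_memory() once at startup and then
prefault_buffer() for each working buffer before it starts handling
requests. This covers only the daemon's own pages; kernel-side
allocations made on its behalf, such as the sys_open case above, are
not affected by mlockall.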

I suppose you are worried that the userspace code could unexpectedly 
allocate a large amount of memory and exhaust the entire PF_MEMALLOC 
reserve?  Kernel code could do that too.  This userspace code just 
needs to be checked carefully.  Perhaps we could come up with a kernel 
debugging option to verify that a task does in fact stay within some 
bounded number of page allocs while in PF_MEMALLOC mode.
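
The debugging option suggested above could be modelled roughly as
follows. This is a userspace model of the accounting only, not kernel
code: struct memalloc_budget and both functions are hypothetical names,
and no such option is assumed to exist.

```c
#include <stdbool.h>

/*
 * Hypothetical per-task accounting for page allocations made while in
 * PF_MEMALLOC mode. Not an existing kernel structure.
 */
struct memalloc_budget {
    unsigned int limit;  /* maximum page allocations permitted */
    unsigned int used;   /* page allocations charged so far */
    bool exceeded;       /* sticky: set once the limit is passed */
};

/* Start a new accounting period with the given page limit. */
void memalloc_budget_init(struct memalloc_budget *b, unsigned int limit)
{
    b->limit = limit;
    b->used = 0;
    b->exceeded = false;
}

/*
 * Charge nr_pages against the budget. Returns true while the task is
 * still within its bound, false once it has gone over. A real debug
 * option would presumably log a warning at that point.
 */
bool memalloc_budget_charge(struct memalloc_budget *b, unsigned int nr_pages)
{
    b->used += nr_pages;
    if (b->used > b->limit)
        b->exceeded = true;
    return !b->exceeded;
}
```

The sticky flag means a single overrun stays visible for the rest of
the accounting period, which is the property an audit of a daemon's
allocation pattern would want to check.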

Regards,

Daniel
