linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>,
	Daniel Phillips <phillips@phunq.net>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, dkegel@google.com,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC)
Date: Thu, 13 Sep 2007 21:24:33 +0200	[thread overview]
Message-ID: <1189711473.5643.18.camel@lappy> (raw)
In-Reply-To: <Pine.LNX.4.64.0709131128050.9546@schroedinger.engr.sgi.com>

On Thu, 2007-09-13 at 11:32 -0700, Christoph Lameter wrote:
> On Thu, 13 Sep 2007, Peter Zijlstra wrote:
> 
> > 
> > > > Every user of memory relies on the VM, and we only get into trouble if
> > > > the VM in turn relies on one of these users. Traditionally that has only
> > > > been the block layer, and we special cased that using mempools and
> > > > PF_MEMALLOC.
> > > > 
> > > > Why do you object to me doing a similar thing for networking?
> > > 
> > > I have not seen you using mempools for the networking layer. I would not 
> > > object to such a solution. It already exists for other subsystems.
> > 
> > Dude, listen, how often do I have to say this: I cannot use mempools for
> > the network subsystem because its build on kmalloc! What I've done is
> > build a replacement for mempools - a reserve system - that does work
> > similar to mempools but also provides the flexibility of kmalloc.
> > 
> > That is all, no more, no less.
> 
> Its different since it becomes a privileged player that can suck all 
> the available memory out of the page allocator.

No, each reserve user comes with a bean-counter that will limit the
usage.

> > I'm confused by this, I've never claimed part of, or such a thing. All
> > I'm saying is that because of the circular dependency between the VM and
> > the IO subsystem used for swap (not file backed paging [*], just swap)
> > you have to do something special to avoid deadlocks.
> 
> How are dirty file backed pages different? They may also be written out 
> by the VM during reclaim.

when you have dirty file backed pages, the rest of the memory can only
consists of clean file pages and or anonymous pages - due to the dirty
limit. If you can guarantee that swap doesn't use memory (well, it does,
but its PF_MEMALLOC memory that cannot be used by others) then you can
always free memory by dropping clean pages or swapping out. And thus
make progress for file based writeback.

This is why the dirty page tracking made mmap over NFS useable.

> > > Replacing the mempools for the block layer sounds pretty good. But how do 
> > > these various subsystems that may live in different portions of the system 
> > > for various devices avoid global serialization and livelock through your 
> > > system? 
> > 
> > The reserves are spread over all kernel mapped zones, the slab allocator
> > is still per cpu, the page allocator tries to get pages from the nearest
> > node.
> 
> But it seems that you have unbounded allocations with PF_MEMALLOC now for 
> the networking case? So networking can exhaust all reserves?

No, networking will beancount all PF_MEMALLOC memory it receives, and
stop allocating once it hits it limit. It knows that when it has than
much memory outstanding its guaranteed memory will be freed soon.

> > > And how is fairness addresses? I may want to run a fileserver on 
> > > some nodes and a HPC application that relies on a fiberchannel connection 
> > > on other nodes. How do we guarantee that the HPC application is not 
> > > impacted if the network services of the fileserver flood the system with 
> > > messages and exhaust memory?
> > 
> > The network system reserves A pages, the block layer reserves B pages,
> > once they start getting pages from the reserves they go bean counting,
> > once they reach their respective limit they stop.
> 
> That sounds good.

Ok, so next time I'll post the whole series again - I know some people
found it too much - but that way you can see the bean counter.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-09-13 19:24 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-14 14:21 [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC) Christoph Lameter
2007-08-14 14:21 ` [RFC 1/3] Allow reclaim via __GFP_NOMEMALLOC reclaim Christoph Lameter
2007-08-14 14:21 ` [RFC 2/3] Use NOMEMALLOC reclaim to allow reclaim if PF_MEMALLOC is set Christoph Lameter
2007-08-14 14:21 ` [RFC 3/3] Test code for PF_MEMALLOC reclaim Christoph Lameter
2007-08-14 14:36 ` [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC) Peter Zijlstra
2007-08-14 15:29   ` Christoph Lameter
2007-08-14 19:32     ` Peter Zijlstra
2007-08-14 19:41       ` Christoph Lameter
2007-08-15 12:22 ` Nick Piggin
2007-08-15 13:12   ` Peter Zijlstra
2007-08-15 14:15     ` Andi Kleen
2007-08-15 13:55       ` Peter Zijlstra
2007-08-15 14:34         ` Andi Kleen
2007-08-15 20:32         ` Christoph Lameter
2007-08-15 20:29     ` Christoph Lameter
2007-08-16  3:29     ` Nick Piggin
2007-08-16 20:27       ` Christoph Lameter
2007-08-20  3:51       ` Peter Zijlstra
2007-08-20 19:15         ` Christoph Lameter
2007-08-21  0:32           ` Nick Piggin
2007-08-21  0:28         ` Nick Piggin
2007-08-21 15:29           ` Peter Zijlstra
2007-08-23  3:02             ` Nick Piggin
2007-09-12 22:39           ` Christoph Lameter
2007-09-05  9:20 ` Daniel Phillips
2007-09-05 10:42   ` Christoph Lameter
2007-09-05 11:42     ` Nick Piggin
2007-09-05 12:14       ` Christoph Lameter
2007-09-05 12:19         ` Nick Piggin
2007-09-10 19:29           ` Christoph Lameter
2007-09-10 19:37             ` Peter Zijlstra
2007-09-10 19:41               ` Christoph Lameter
2007-09-10 19:55                 ` Peter Zijlstra
2007-09-10 20:17                   ` Christoph Lameter
2007-09-10 20:48                     ` Peter Zijlstra
2007-09-11  7:41             ` Nick Piggin
2007-09-12 10:52         ` Peter Zijlstra
2007-09-12 22:47           ` Christoph Lameter
2007-09-13  8:19             ` Peter Zijlstra
2007-09-13 18:32               ` Christoph Lameter
2007-09-13 19:24                 ` Peter Zijlstra [this message]
2007-09-05 16:16     ` Daniel Phillips
2007-09-08  5:12       ` Mike Snitzer
2007-09-18  0:28         ` Daniel Phillips
2007-09-18  3:27           ` Mike Snitzer
     [not found]             ` <200709172211.26493.phillips@phunq.net>
2007-09-18  8:11               ` Wouter Verhelst
2007-09-18  9:58               ` Peter Zijlstra
2007-09-18 16:56                 ` Daniel Phillips
2007-09-18 19:16                   ` Peter Zijlstra
2007-09-18  9:30             ` Peter Zijlstra
2007-09-18 18:40             ` Daniel Phillips
2007-09-18 20:13               ` Mike Snitzer
2007-09-10 19:25       ` Christoph Lameter
2007-09-10 19:55         ` Peter Zijlstra
2007-09-10 20:22           ` Christoph Lameter
2007-09-10 20:48             ` Peter Zijlstra
2007-10-26 17:44               ` Pavel Machek
2007-10-26 17:55                 ` Christoph Lameter
2007-10-27 22:58                   ` Daniel Phillips
2007-10-27 23:08                 ` Daniel Phillips

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1189711473.5643.18.camel@lappy \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=davem@davemloft.net \
    --cc=dkegel@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=phillips@phunq.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).