From: Peter Zijlstra <peterz@infradead.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks
Date: Wed, 22 Aug 2007 00:09:03 +0200
Message-ID: <1187734144.5463.35.camel@lappy>
In-Reply-To: <Pine.LNX.4.64.0708211418120.3267@schroedinger.engr.sgi.com>

On Tue, 2007-08-21 at 14:29 -0700, Christoph Lameter wrote:
> On Tue, 21 Aug 2007, Peter Zijlstra wrote:
> 
> > It quickly ends up with all of memory in the laundry list and then
> > recursing into __alloc_pages which will fail to make progress and OOMs.
> 
> Hmmmm... Okay that needs to be addressed. Reserves need to be used and we 
> only should enter reclaim if that runs out (like the first patch that I 
> did).
> 
> > But aside from the numerous issues with the patch set as presented, I'm
> > not seeing the big picture; why are you doing this?
> 
> I want general improvements to reclaim to address the issues that you see 
> and other issues related to reclaim instead of the strange code that makes 
> PF_MEMALLOC allocs compete for allocations from a single slab and putting 
> logic into the kernel to decide which allocs to fail. We can reclaim after 
> all. It's just a matter of finding the right way to do this.

The latest patch I posted got rid of that global slab.

Also, all I want is for slab to honour gfp flags like page allocation
does, nothing more, nothing less.

(well, actually slightly less, since I'm only really interested in the
ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER -> ALLOC_NO_WATERMARKS transition and
not all higher ones)

I want slab to fail when a similar page alloc would fail, no magic.

Strictly speaking:

if:

 page = alloc_page(gfp);

fails but:

 obj = kmem_cache_alloc(s, gfp);

succeeds, then it's a bug.

But I don't actually need it that strict: only the ALLOC_NO_WATERMARKS
part needs to be done; ALLOC_HARDER and ALLOC_HIGH may fudge a bit.
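
To make the rule above concrete, here is a minimal userspace sketch in
C. It is a toy model, not kernel code: every toy_* name is hypothetical
and invented for illustration, and the watermark check is reduced to a
single comparison. It only shows the shape of the invariant: a slab
that hands out cached objects regardless of the caller's flags can
succeed where a page allocation with the same flags would fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel symbols. */
enum toy_alloc_flags {
	TOY_ALLOC_WMARK_MIN     = 0x1, /* must stay above the min watermark */
	TOY_ALLOC_NO_WATERMARKS = 0x2, /* may dip into the reserves */
};

struct toy_zone {
	long free_pages;
	long min_watermark;
};

struct toy_slab {
	int cached_objects; /* free objects already sitting in the cache */
};

/* Page allocator: only NO_WATERMARKS callers may go below the watermark. */
static bool toy_page_alloc_ok(const struct toy_zone *z, int flags)
{
	if (flags & TOY_ALLOC_NO_WATERMARKS)
		return z->free_pages > 0;
	return z->free_pages > z->min_watermark;
}

/* Naive slab: hands out any cached object, whatever the caller's flags. */
static bool toy_slab_alloc_naive(const struct toy_slab *s,
				 const struct toy_zone *z, int flags)
{
	return s->cached_objects > 0 || toy_page_alloc_ok(z, flags);
}

/* Strict slab: succeeds only when a page alloc with the same flags would. */
static bool toy_slab_alloc_strict(const struct toy_slab *s,
				  const struct toy_zone *z, int flags)
{
	(void)s; /* cached objects do not change the answer in this model */
	return toy_page_alloc_ok(z, flags);
}

/* The condition called a bug above: page alloc fails, slab alloc succeeds. */
static bool toy_is_bug(bool page_ok, bool slab_ok)
{
	return !page_ok && slab_ok;
}

/* Walk through one zone that is below its watermark. */
static void toy_self_check(void)
{
	struct toy_zone z = { .free_pages = 10, .min_watermark = 16 };
	struct toy_slab s = { .cached_objects = 4 };
	int normal = TOY_ALLOC_WMARK_MIN;
	int reserve = TOY_ALLOC_NO_WATERMARKS;

	/* Ordinary caller: the page allocator refuses, the naive slab does not. */
	assert(toy_is_bug(toy_page_alloc_ok(&z, normal),
			  toy_slab_alloc_naive(&s, &z, normal)));
	assert(!toy_is_bug(toy_page_alloc_ok(&z, normal),
			   toy_slab_alloc_strict(&s, &z, normal)));

	/* Reserve-entitled caller: both allocators agree, so no violation. */
	assert(!toy_is_bug(toy_page_alloc_ok(&z, reserve),
			   toy_slab_alloc_naive(&s, &z, reserve)));
}
```

Calling toy_self_check() from a main() exercises both cases. The strict
variant is deliberately the strongest form of the rule; the relaxed
form described above would only enforce the reserve
(ALLOC_NO_WATERMARKS) boundary.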

> > Anonymous pages are there to stay, and we cannot tell people how to
> > use them. So we need some free or freeable pages in order to avoid the
> > vm deadlock that arises from all memory dirty.
> 
> No one is trying to abolish Anonymous pages. Free memory is readily 
> available on demand if one calls reclaim. Your scheme introduces complex 
> negotiations over a few scraps of memory when large amounts of memory 
> would still be readily available if one would do the right thing and call 
> into reclaim.

This is the thing I contend, there need not be large amounts of memory
around. In my test prog the hot code path fits into a single page, the
rest can be anonymous.

> > 'Optimizing' this by switching to freeable pages has mainly
> > disadvantages IMHO, finding them scrambles LRU order and complexifies
> > reclaim and all that for a relatively small gain in space for clean
> > pagecache pages.
> 
> Sounds like you would like to change the way we handle memory in general 
> in the VM? Reclaim (and thus finding freeable pages) is basic to Linux 
> memory management.

Not quite, currently we have free pages in the reserves, if you want to
replace some (or all) of that by freeable pages then that is a change.

I'm just using the reserves.

> > Please, stop writing patches and write down a solid proposal of how you
> > envision the VM working in the various scenarios and why it's better than
> > the current approach.
> 
> Sorry I just got into this a short time ago and I may need a few cycles 
> to get this all straight. An approach that uses memory instead of 
> ignoring available memory is certainly better.

Sure, if and when possible. There will always be a need to fall back to
the reserves.

A bit off-topic, re that reclaim from atomic context:
Currently we try to hold spinlocks only for short periods of time so
that reclaim can be preempted. If you run all of reclaim from a
non-preemptible context you get very large preemption latencies, and if
done from interrupt context it'd also generate large interrupt
latencies.
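
A rough illustration of the latency point, as a toy C model rather than
kernel code (toy_scan and struct toy_lock_stats are hypothetical names,
and "taking the lock" is only counted, not performed): scanning in
small batches bounds the longest stretch of work done under one lock
hold, while doing the whole scan in one go makes that stretch as long
as the scan itself.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for a simulated lock; no real locking happens. */
struct toy_lock_stats {
	size_t longest_hold;  /* most items processed under one lock hold */
	size_t acquisitions;  /* how many times the lock was taken */
};

/*
 * Process nr_items, dropping the simulated lock after every `batch`
 * items. Each drop stands for a point where preemption could happen.
 */
static struct toy_lock_stats toy_scan(size_t nr_items, size_t batch)
{
	struct toy_lock_stats st = { 0, 0 };
	size_t done = 0;

	if (batch == 0)
		batch = 1; /* avoid an endless loop on a zero batch size */

	while (done < nr_items) {
		size_t held = 0;

		st.acquisitions++; /* "take the lock" */
		while (done < nr_items && held < batch) {
			done++;
			held++;
		}
		if (held > st.longest_hold)
			st.longest_hold = held;
		/* "drop the lock": a preemption point */
	}
	return st;
}
```

With 1000 items, a batch of 32 gives a longest hold of 32 items across
32 acquisitions, whereas a batch of 1000 gives a single hold covering
all 1000 items, which is the non-preemptible case.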

