From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Christoph Lameter <clameter@sgi.com>
Cc: Matt Mackall <mpm@selenic.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Thomas Graf <tgraf@suug.ch>, David Miller <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Daniel Phillips <phillips@google.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>, Paul Jackson <pj@sgi.com>,
	npiggin@suse.de
Subject: Re: [PATCH 0/5] make slab gfp fair
Date: Mon, 21 May 2007 21:33:58 +0200
Message-ID: <1179776038.5735.39.camel@lappy>
In-Reply-To: <Pine.LNX.4.64.0705210932500.25871@schroedinger.engr.sgi.com>

On Mon, 2007-05-21 at 09:45 -0700, Christoph Lameter wrote:
> On Sun, 20 May 2007, Peter Zijlstra wrote:
> 
> > I care about kernel allocations only. In particular about those that
> > have PF_MEMALLOC semantics.
> 
> Hmmmm.. I wish I was more familiar with PF_MEMALLOC. ccing Nick.
> 
> >  - set page->reserve nonzero for each page allocated with
> >    ALLOC_NO_WATERMARKS; which by the previous point implies that all
> >    available zones are below ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER
> 
> Ok that adds a new field to the page struct. I suggested a page flag in 
> slub before.

No it doesn't; it overloads page->index. It's just used as an extra
return value; it need not be persistent. Definitely not worth a page flag.
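
To illustrate the "extra return value" idea, here is a minimal user-space
sketch. It is not the patch itself: struct toy_page, toy_alloc_page() and
toy_consume_reserve_hint() are hypothetical names, and the union merely
stands in for reusing page->index while the page has no other owner.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's real layout. */
struct toy_page {
	union {
		unsigned long index;	/* ordinary meaning once the page is in use */
		int reserve;		/* transient hint, valid only right after allocation */
	};
};

/*
 * Hypothetical allocator: hands back the page and records, in the
 * overloaded field, whether it had to dip into the reserves.
 */
static struct toy_page *toy_alloc_page(struct toy_page *backing, int used_reserves)
{
	assert(backing != NULL);
	backing->reserve = used_reserves ? 1 : 0;
	return backing;
}

/*
 * The caller reads the hint once, immediately after allocation, and
 * then the field goes back to its ordinary role. Nothing persists.
 */
static int toy_consume_reserve_hint(struct toy_page *page)
{
	int reserve = page->reserve;

	page->index = 0;
	return reserve;
}
```

The point of the sketch is only that the hint lives for the short window
between the allocator returning and the caller inspecting it, so no
permanent bit is needed.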

> >  - when a page->reserve slab is allocated store it in s->reserve_slab
> >    and do not update the ->cpu_slab[] (this forces subsequent allocs to
> >    retry the allocation).
> 
> Right that should work.
>  
> > All ALLOC_NO_WATERMARKS enabled slab allocations are served from
> > ->reserve_slab, up until the point where a !page->reserve slab alloc
> > succeeds, at which point the ->reserve_slab is pushed into the partial
> > lists and ->reserve_slab set to NULL.
> 
> So the original issue is still not fixed. A slab alloc may succeed without
> watermarks if that particular allocation is restricted to a different set 
> of nodes. Then the reserve slab is dropped despite the memory scarcity on
> another set of nodes?

I can't see how. This extra ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER alloc will
first deplete all other zones. Once that starts failing no node should
still have pages accessible by any allocation context other than
PF_MEMALLOC.
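
A rough user-space model of that ordering, assuming a simplified watermark
check (halve the minimum for ALLOC_HIGH, then take off a quarter for
ALLOC_HARDER). All toy_* names and TOY_* flags are made up for this sketch;
the real page allocator is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define TOY_ALLOC_NO_WATERMARKS	0x1
#define TOY_ALLOC_HIGH		0x2
#define TOY_ALLOC_HARDER	0x4

struct toy_zone {
	long free_pages;
	long pages_min;
};

/* Simplified watermark test; an assumption, not the kernel's exact rule. */
static int toy_watermark_ok(const struct toy_zone *z, int flags)
{
	long min = z->pages_min;

	if (flags & TOY_ALLOC_NO_WATERMARKS)
		return z->free_pages > 0;
	if (flags & TOY_ALLOC_HIGH)
		min -= min / 2;
	if (flags & TOY_ALLOC_HARDER)
		min -= min / 4;
	return z->free_pages > min;
}

/* Walk every zone in order; return the index of the zone used, or -1. */
static int toy_get_page(struct toy_zone *zones, size_t n, int flags)
{
	for (size_t i = 0; i < n; i++) {
		if (toy_watermark_ok(&zones[i], flags)) {
			zones[i].free_pages--;
			return (int)i;
		}
	}
	return -1;
}

/*
 * First try all zones with the lowered watermark. Only when every zone
 * has failed may a reserve-entitled context ignore the watermarks, so
 * *reserve == 1 implies no zone passed the first attempt.
 */
static int toy_alloc(struct toy_zone *zones, size_t n,
		     int may_use_reserves, int *reserve)
{
	int zi = toy_get_page(zones, n, TOY_ALLOC_HIGH | TOY_ALLOC_HARDER);

	*reserve = 0;
	if (zi < 0 && may_use_reserves) {
		zi = toy_get_page(zones, n, TOY_ALLOC_NO_WATERMARKS);
		if (zi >= 0)
			*reserve = 1;
	}
	return zi;
}
```

In this model a page is only ever flagged as a reserve page after the
watermark-respecting pass has failed on every zone in the list, which is
the property the argument above relies on.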

> > Since only the allocation of a new slab uses the gfp zone flags, and
> > other allocations placement hints they have to be uniform over all slab
> > allocs for a given kmem_cache. Thus the s->reserve_slab/page->reserve
> > status is kmem_cache wide.
> 
> No the gfp zone flags are not uniform and placement of page allocator 
> allocs through SLUB do not always have the same allocation constraints.

They have to be, since an allocation can be served from a pre-existing
slab. Hence any page allocation must be valid for all other users.

> SLUB will check the node of the page that was allocated when the page 
> allocator returns and put the page into that nodes slab list. This varies
> depending on the allocation context.

Yes, it keeps slabs on per node lists. I'm just not seeing how this puts
hard constraints on the allocations.

As far as I can see there cannot be a hard constraint here, because
allocations from interrupt context are at best node local. And node
affine zone lists still have all zones, just ordered by locality.

> Allocations can be particular to uses of a slab in particular situations. 
> A kmalloc cache can be used to allocate from various sets of nodes in 
> different circumstances. kmalloc will allow serving a limited number of 
> objects from the wrong nodes for performance reasons but the next 
> allocation from the page allocator (or from the partial lists) will occur 
> using the current set of allowed nodes in order to ensure a rough 
> obedience to the memory policies and cpusets. kmalloc_node behaves 
> differently and will enforce using memory from a particular node.

From what I can see, it takes pretty much any page it can get once you
hit it with PF_MEMALLOC. If the page allocation doesn't use ALLOC_CPUSET
the page can come from pretty much anywhere.
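
For reference, the ->reserve_slab scheme quoted earlier in this message can
be modelled as a small state machine. This is a user-space sketch under
stated assumptions, not the SLUB code: struct toy_cache, toy_alloc_object()
and the partial_count counter are hypothetical, and per-cpu and per-node
details are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

struct toy_slab {
	int reserve;		/* slab page came from the reserves */
	int free_objects;
};

struct toy_cache {
	struct toy_slab *cpu_slab;	/* fast-path slab, usable by anyone */
	struct toy_slab *reserve_slab;	/* parked slab, reserve users only */
	int partial_count;		/* stands in for the partial lists */
};

/*
 * normal_page_available models whether a page allocation that respects
 * the watermarks would succeed right now; fresh is the slab such an
 * allocation (or a reserve allocation) would produce.
 * Returns 1 if an object was handed out, 0 if the allocation failed.
 */
static int toy_alloc_object(struct toy_cache *s, int may_use_reserves,
			    int normal_page_available, struct toy_slab *fresh)
{
	if (s->cpu_slab && s->cpu_slab->free_objects > 0) {
		s->cpu_slab->free_objects--;
		return 1;
	}

	if (normal_page_available) {
		/* A !reserve slab alloc succeeded: the pressure is over. */
		fresh->reserve = 0;
		if (s->reserve_slab) {
			s->partial_count++;	/* push onto the partial lists */
			s->reserve_slab = NULL;
		}
		s->cpu_slab = fresh;
		fresh->free_objects--;
		return 1;
	}

	if (!may_use_reserves)
		return 0;	/* must not be served from reserve memory */

	if (!s->reserve_slab) {
		/* Park the reserve slab; deliberately leave cpu_slab alone. */
		fresh->reserve = 1;
		s->reserve_slab = fresh;
	}
	if (s->reserve_slab->free_objects == 0)
		return 0;
	s->reserve_slab->free_objects--;
	return 1;
}
```

Because the reserve slab never becomes the cpu_slab in this model, every
later allocation has to retry the page allocator, and contexts without
access to the reserves keep failing until memory recovers.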


