From: Pekka Enberg <penberg@cs.helsinki.fi>
To: Xiaotian Feng <dfeng@redhat.com>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
netdev@vger.kernel.org, riel@redhat.com, cl@linux-foundation.org,
a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org,
lwang@redhat.com, akpm@linux-foundation.org, davem@davemloft.net
Subject: Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
Date: Tue, 13 Jul 2010 23:33:14 +0300
Message-ID: <AANLkTilj5GrhbRJZfSsfXP1v9cQSRlARFmxpys1vUelr@mail.gmail.com>
In-Reply-To: <20100713101747.2835.45722.sendpatchset@danny.redhat>
Hi Xiaotian!
I would actually prefer that the SLAB, SLOB, and SLUB changes were in
separate patches to make reviewing easier.
Looking at SLUB:
On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
> diff --git a/mm/slub.c b/mm/slub.c
> index 7bb7940..7a5d6dc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -27,6 +27,8 @@
> #include <linux/memory.h>
> #include <linux/math64.h>
> #include <linux/fault-inject.h>
> +#include "internal.h"
> +
>
> /*
> * Lock order:
> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
> s->ctor(object);
> }
>
> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> +static
> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
> {
> struct page *page;
> void *start;
> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> if (!page)
> goto out;
>
> + *reserve = page->reserve;
> +
> inc_slabs_node(s, page_to_nid(page), page->objects);
> page->slab = s;
> page->flags |= 1 << PG_slab;
> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> {
> void **object;
> struct page *new;
> + int reserve;
>
> /* We handle __GFP_ZERO in the caller */
> gfpflags &= ~__GFP_ZERO;
>
> + if (unlikely(c->reserve)) {
> + /*
> + * If the current slab is a reserve slab and the current
> + * allocation context does not allow access to the reserves we
> + * must force an allocation to test the current levels.
> + */
> + if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
> + goto grow_slab;
OK, so assume that:
  (1) c->reserve is set to one
  (2) GFP flags don't allow dipping into the reserves
  (3) we've managed to free enough pages so normal
      allocations are fine
  (4) the page from reserves is not yet empty
Then we will call flush_slab(), put the "emergency page" on the partial
list, and clear c->reserve. This effectively means that some other
allocation can now fetch the partial page and start to use it. Is this OK?
Who makes sure the emergency reserves are large enough for the next
out-of-memory condition where we swap over NFS?
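The sequence above can be walked through with a toy userspace model. This is only a rough sketch, not kernel code: every toy_* name, the single-entry partial list, and the four-objects-per-page size are assumptions made up for illustration, and the real __slab_alloc() and flush_slab() do considerably more. It only tries to show how an allocation without reserve access could end up taking an object from a page that came from the reserves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not kernel API. */
#define TOY_OBJS_PER_PAGE 4

struct toy_page {
	int free_objects;	/* objects still available in this page */
	bool from_reserve;	/* page was handed out under memory pressure */
};

struct toy_cache {
	struct toy_page *cpu_page;	/* stands in for c->page */
	bool cpu_reserve;		/* stands in for c->reserve */
	struct toy_page *partial;	/* one-entry "partial list" */
	bool memory_tight;		/* true while below the watermarks */
};

/* Stand-in for flush_slab(): a page with free objects goes to partial. */
static void toy_flush_slab(struct toy_cache *s)
{
	struct toy_page *page = s->cpu_page;

	if (page && page->free_objects > 0) {
		assert(s->partial == NULL);	/* toy list holds one page */
		s->partial = page;
	}
	s->cpu_page = NULL;
}

/* Stand-in for the page allocator: fails if tight and not privileged. */
static bool toy_new_slab(struct toy_cache *s, struct toy_page *page,
			 bool may_use_reserve)
{
	if (s->memory_tight && !may_use_reserve)
		return false;
	page->free_objects = TOY_OBJS_PER_PAGE;
	page->from_reserve = s->memory_tight;
	return true;
}

/* Returns the page the object was taken from, or NULL on failure. */
static struct toy_page *toy_alloc(struct toy_cache *s, bool may_use_reserve,
				  struct toy_page *fresh)
{
	/* Reserve slab and unprivileged caller: force a page allocation. */
	if (s->cpu_reserve && !may_use_reserve)
		goto grow;
	if (s->cpu_page && s->cpu_page->free_objects > 0)
		goto take;
	if (s->partial) {
		toy_flush_slab(s);	/* current page is full or absent */
		s->cpu_page = s->partial;
		s->partial = NULL;
		goto take;
	}
grow:
	if (!fresh || !toy_new_slab(s, fresh, may_use_reserve))
		return NULL;
	s->cpu_reserve = fresh->from_reserve;
	toy_flush_slab(s);
	s->cpu_page = fresh;
take:
	s->cpu_page->free_objects--;
	return s->cpu_page;
}

/* Counts objects that unprivileged callers took from a reserve page. */
static int toy_scenario(void)
{
	struct toy_cache s = { NULL, false, NULL, true };
	struct toy_page pages[2] = { { 0, false }, { 0, false } };
	struct toy_page *got;
	int leaked = 0;
	int i;

	/* (1) Under pressure, a privileged caller gets a reserve page. */
	got = toy_alloc(&s, true, &pages[0]);
	assert(got == &pages[0] && s.cpu_reserve);

	/* (3) Enough pages were freed; normal allocations work again. */
	s.memory_tight = false;

	/* (2)+(4) Unprivileged caller: the reserve page is flushed. */
	got = toy_alloc(&s, false, &pages[1]);
	assert(got == &pages[1] && !s.cpu_reserve && s.partial == &pages[0]);

	/* Further unprivileged allocations eventually hit the partial page. */
	for (i = 0; i < TOY_OBJS_PER_PAGE; i++) {
		got = toy_alloc(&s, false, NULL);
		if (got && got->from_reserve)
			leaked++;
	}
	return leaked;
}
```

In this model, calling toy_scenario() reports one object handed to a caller without reserve access from the former reserve page, which is the situation the questions above are about.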
> + }
> if (!c->page)
> goto new_slab;
>
> @@ -1623,8 +1638,8 @@ load_freelist:
> object = c->page->freelist;
> if (unlikely(!object))
> goto another_slab;
> - if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
> - goto debug;
> + if (unlikely(SLABDEBUG && PageSlubDebug(c->page) || c->reserve))
> + goto slow_path;
>
> c->freelist = get_freepointer(s, object);
> c->page->inuse = c->page->objects;
> @@ -1646,16 +1661,18 @@ new_slab:
> goto load_freelist;
> }
>
> +grow_slab:
> if (gfpflags & __GFP_WAIT)
> local_irq_enable();
>
> - new = new_slab(s, gfpflags, node);
> + new = new_slab(s, gfpflags, node, &reserve);
>
> if (gfpflags & __GFP_WAIT)
> local_irq_disable();
>
> if (new) {
> c = __this_cpu_ptr(s->cpu_slab);
> + c->reserve = reserve;
> stat(s, ALLOC_SLAB);
> if (c->page)
> flush_slab(s, c);
> @@ -1667,10 +1684,20 @@ new_slab:
> if (!(gfpflags & __GFP_NOWARN) && printk_ratelimit())
> slab_out_of_memory(s, gfpflags, node);
> return NULL;
> -debug:
> - if (!alloc_debug_processing(s, c->page, object, addr))
> +
> +slow_path:
> + if (!c->reserve && !alloc_debug_processing(s, c->page, object, addr))
> goto another_slab;
>
> + /*
> + * Avoid the slub fast path in slab_alloc() by not setting
> + * c->freelist and the fast path in slab_free() by making
> + * node_match() fail by setting c->node to -1.
> + *
> + * We use this for debug and reserve checks which need
> + * to be done for each allocation.
> + */
> +
> c->page->inuse++;
> c->page->freelist = get_freepointer(s, object);
> c->node = -1;
> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
> struct page *page;
> struct kmem_cache_node *n;
> unsigned long flags;
> + int reserve;
>
> BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
>
> - page = new_slab(kmalloc_caches, gfpflags, node);
> + page = new_slab(kmalloc_caches, gfpflags, node, &reserve);
>
> BUG_ON(!page);
> if (page_to_nid(page) != node) {
> --
> 1.7.1.1
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>