From: Xiaotian Feng <dfeng@redhat.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
netdev@vger.kernel.org, riel@redhat.com, cl@linux-foundation.org,
a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org,
lwang@redhat.com, akpm@linux-foundation.org, davem@davemloft.net
Subject: Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
Date: Thu, 15 Jul 2010 20:37:42 +0800 [thread overview]
Message-ID: <4C3F0116.2040309@redhat.com> (raw)
In-Reply-To: <AANLkTilj5GrhbRJZfSsfXP1v9cQSRlARFmxpys1vUelr@mail.gmail.com>
On 07/14/2010 04:33 AM, Pekka Enberg wrote:
> Hi Xiaotian!
>
> I would actually prefer that the SLAB, SLOB, and SLUB changes were in
> separate patches to make reviewing easier.
>
> Looking at SLUB:
>
> On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng<dfeng@redhat.com> wrote:
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 7bb7940..7a5d6dc 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -27,6 +27,8 @@
>> #include<linux/memory.h>
>> #include<linux/math64.h>
>> #include<linux/fault-inject.h>
>> +#include "internal.h"
>> +
>>
>> /*
>> * Lock order:
>> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
>> s->ctor(object);
>> }
>>
>> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> +static
>> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
>> {
>> struct page *page;
>> void *start;
>> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> if (!page)
>> goto out;
>>
>> + *reserve = page->reserve;
>> +
>> inc_slabs_node(s, page_to_nid(page), page->objects);
>> page->slab = s;
>> page->flags |= 1<< PG_slab;
>> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>> {
>> void **object;
>> struct page *new;
>> + int reserve;
>>
>> /* We handle __GFP_ZERO in the caller */
>> gfpflags&= ~__GFP_ZERO;
>>
>> + if (unlikely(c->reserve)) {
>> + /*
>> + * If the current slab is a reserve slab and the current
>> + * allocation context does not allow access to the reserves we
>> + * must force an allocation to test the current levels.
>> + */
>> + if (!(gfp_to_alloc_flags(gfpflags)& ALLOC_NO_WATERMARKS))
>> + goto grow_slab;
>
> OK, so assume that:
>
> (1) c->reserve is set to one
>
> (2) GFP flags don't allow dipping into the reserves
>
> (3) we've managed to free enough pages so normal
> allocations are fine
>
> (4) the page from reserves is not yet empty
>
> we will call flush_slab() and put the "emergency page" on partial list
> and clear c->reserve. This effectively means that now some other
> allocation can fetch the partial page and start to use it. Is this OK?
> Who makes sure the emergency reserves are large enough for the next
> out-of-memory condition where we swap over NFS?
>
Good catch. I'm just wondering if above check is necessary. For
"emergency page", we don't set c->freelist. How can we get a
reserved slab, if GPF flags don't allow dipping into reserves?
>> + }
>> if (!c->page)
>> goto new_slab;
>>
>> @@ -1623,8 +1638,8 @@ load_freelist:
>> object = c->page->freelist;
>> if (unlikely(!object))
>> goto another_slab;
>> - if (unlikely(SLABDEBUG&& PageSlubDebug(c->page)))
>> - goto debug;
>> + if (unlikely(SLABDEBUG&& PageSlubDebug(c->page) || c->reserve))
>> + goto slow_path;
>>
>> c->freelist = get_freepointer(s, object);
>> c->page->inuse = c->page->objects;
>> @@ -1646,16 +1661,18 @@ new_slab:
>> goto load_freelist;
>> }
>>
>> +grow_slab:
>> if (gfpflags& __GFP_WAIT)
>> local_irq_enable();
>>
>> - new = new_slab(s, gfpflags, node);
>> + new = new_slab(s, gfpflags, node,&reserve);
>>
>> if (gfpflags& __GFP_WAIT)
>> local_irq_disable();
>>
>> if (new) {
>> c = __this_cpu_ptr(s->cpu_slab);
>> + c->reserve = reserve;
>> stat(s, ALLOC_SLAB);
>> if (c->page)
>> flush_slab(s, c);
>> @@ -1667,10 +1684,20 @@ new_slab:
>> if (!(gfpflags& __GFP_NOWARN)&& printk_ratelimit())
>> slab_out_of_memory(s, gfpflags, node);
>> return NULL;
>> -debug:
>> - if (!alloc_debug_processing(s, c->page, object, addr))
>> +
>> +slow_path:
>> + if (!c->reserve&& !alloc_debug_processing(s, c->page, object, addr))
>> goto another_slab;
>>
>> + /*
>> + * Avoid the slub fast path in slab_alloc() by not setting
>> + * c->freelist and the fast path in slab_free() by making
>> + * node_match() fail by setting c->node to -1.
>> + *
>> + * We use this for for debug and reserve checks which need
>> + * to be done for each allocation.
>> + */
>> +
>> c->page->inuse++;
>> c->page->freelist = get_freepointer(s, object);
>> c->node = -1;
>> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
>> struct page *page;
>> struct kmem_cache_node *n;
>> unsigned long flags;
>> + int reserve;
>>
>> BUG_ON(kmalloc_caches->size< sizeof(struct kmem_cache_node));
>>
>> - page = new_slab(kmalloc_caches, gfpflags, node);
>> + page = new_slab(kmalloc_caches, gfpflags, node,&reserve);
>>
>> BUG_ON(!page);
>> if (page_to_nid(page) != node) {
>> --
>> 1.7.1.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email:<a href=mailto:"dont@kvack.org"> email@kvack.org</a>
>>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-07-15 12:38 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 02/30] Swap over network documentation Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags() Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 04/30] mm: tag reseve pages Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
2010-07-13 20:33 ` Pekka Enberg
2010-07-15 12:37 ` Xiaotian Feng [this message]
2010-08-03 1:44 ` Neil Brown
2010-07-13 10:17 ` [PATCH -mmotm 06/30] mm: kmem_alloc_estimate() Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 08/30] mm: emergency pool Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 11/30] mm: memory reserve management Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
2010-07-13 10:55 ` Mitchell Erblich
2010-07-15 11:51 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 13/30] net: packet split receive api Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 15/30] netvm: network reserve infrastructure Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 16/30] netvm: INET reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 18/30] netvm: filter emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 21/30] netvm: skb processing Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 22/30] mm: add support for non block device backed swap files Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat " Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 26/30] nfs: enable swap on NFS Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 28/30] build fix for skb_emergency_protocol Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server Xiaotian Feng
2010-07-13 12:53 ` [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Américo Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C3F0116.2040309@redhat.com \
--to=dfeng@redhat.com \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=lwang@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=penberg@cs.helsinki.fi \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).