Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages

From: Xiaotian Feng <dfeng@redhat.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org,
	netdev@vger.kernel.org, riel@redhat.com, cl@linux-foundation.org,
	a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org,
	lwang@redhat.com, akpm@linux-foundation.org, davem@davemloft.net
Subject: Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
Date: Thu, 15 Jul 2010 20:37:42 +0800
Message-ID: <4C3F0116.2040309@redhat.com>
In-Reply-To: <AANLkTilj5GrhbRJZfSsfXP1v9cQSRlARFmxpys1vUelr@mail.gmail.com>

On 07/14/2010 04:33 AM, Pekka Enberg wrote:
> Hi Xiaotian!
>
> I would actually prefer that the SLAB, SLOB, and SLUB changes were in
> separate patches to make reviewing easier.
>
> Looking at SLUB:
>
> On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 7bb7940..7a5d6dc 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -27,6 +27,8 @@
>>   #include <linux/memory.h>
>>   #include <linux/math64.h>
>>   #include <linux/fault-inject.h>
>> +#include "internal.h"
>> +
>>
>>   /*
>>   * Lock order:
>> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
>>                 s->ctor(object);
>>   }
>>
>> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> +static
>> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
>>   {
>>         struct page *page;
>>         void *start;
>> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>>         if (!page)
>>                 goto out;
>>
>> +       *reserve = page->reserve;
>> +
>>         inc_slabs_node(s, page_to_nid(page), page->objects);
>>         page->slab = s;
>>         page->flags |= 1 << PG_slab;
>> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>   {
>>         void **object;
>>         struct page *new;
>> +       int reserve;
>>
>>         /* We handle __GFP_ZERO in the caller */
>>         gfpflags &= ~__GFP_ZERO;
>>
>> +       if (unlikely(c->reserve)) {
>> +               /*
>> +                * If the current slab is a reserve slab and the current
>> +                * allocation context does not allow access to the reserves we
>> +                * must force an allocation to test the current levels.
>> +                */
>> +               if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
>> +                       goto grow_slab;
>
> OK, so assume that:
>
>    (1) c->reserve is set to one
>
>    (2) GFP flags don't allow dipping into the reserves
>
>    (3) we've managed to free enough pages so normal
>         allocations are fine
>
>    (4) the page from reserves is not yet empty
>
> we will call flush_slab() and put the "emergency page" on partial list
> and clear c->reserve. This effectively means that now some other
> allocation can fetch the partial page and start to use it. Is this OK?
> Who makes sure the emergency reserves are large enough for the next
> out-of-memory condition where we swap over NFS?
>

Good catch. I'm just wondering whether the above check is necessary. For
an "emergency page", we don't set c->freelist. How can we get a
reserved slab if the GFP flags don't allow dipping into the reserves?
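
To make the four-step scenario above easier to reason about, here is a
toy user-space model of it. This is only a sketch, not kernel code: every
model_* name, the MODEL_* constants and the simplified flush behaviour
are assumptions made for illustration, and the real __slab_alloc() has
more paths than this.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's ALLOC_NO_WATERMARKS bit. */
#define MODEL_ALLOC_NO_WATERMARKS 0x1
/* Arbitrary number of objects per slab page in this model. */
#define MODEL_OBJECTS_PER_SLAB 8

/* Toy per-CPU slab state: only the fields the scenario touches. */
struct model_cpu_slab {
	bool has_page;     /* a slab page is currently installed */
	bool reserve;      /* that page came from the emergency reserves */
	int free_objects;  /* objects still unallocated on that page */
};

/* Toy cache: counts reserve-backed pages that reached the partial list. */
struct model_cache {
	int reserve_pages_on_partial;
};

/*
 * Simplified flush: a page that still has free objects moves to the
 * partial list, and the per-CPU state forgets it was a reserve page.
 */
static void model_flush_slab(struct model_cache *s, struct model_cpu_slab *c)
{
	if (c->has_page && c->reserve && c->free_objects > 0)
		s->reserve_pages_on_partial++;
	c->has_page = false;
	c->reserve = false;
	c->free_objects = 0;
}

/*
 * Returns true if an object was handed out. normal_pages_available
 * models step (3): whether the page allocator can satisfy a request
 * without touching the reserves.
 */
static bool model_slab_alloc(struct model_cache *s, struct model_cpu_slab *c,
			     int alloc_flags, bool normal_pages_available)
{
	bool may_use_reserve = (alloc_flags & MODEL_ALLOC_NO_WATERMARKS) != 0;
	/* The check under discussion: reserve slab, unprivileged caller. */
	bool must_grow = !c->has_page || c->free_objects == 0 ||
			 (c->reserve && !may_use_reserve);

	if (must_grow) {
		if (!normal_pages_available && !may_use_reserve)
			return false;	/* still under pressure: fail */
		model_flush_slab(s, c);
		c->has_page = true;
		c->reserve = !normal_pages_available;
		c->free_objects = MODEL_OBJECTS_PER_SLAB;
	}
	c->free_objects--;
	return true;
}
```

Starting from a reserve slab with free objects, an unprivileged call with
normal_pages_available set succeeds, and reserve_pages_on_partial goes
from 0 to 1. That is the situation Pekka asks about: the partially used
emergency page is now reachable by ordinary allocations.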

>> +       }
>>         if (!c->page)
>>                 goto new_slab;
>>
>> @@ -1623,8 +1638,8 @@ load_freelist:
>>         object = c->page->freelist;
>>         if (unlikely(!object))
>>                 goto another_slab;
>> -       if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
>> -               goto debug;
>> +       if (unlikely(SLABDEBUG && PageSlubDebug(c->page) || c->reserve))
>> +               goto slow_path;
>>
>>         c->freelist = get_freepointer(s, object);
>>         c->page->inuse = c->page->objects;
>> @@ -1646,16 +1661,18 @@ new_slab:
>>                 goto load_freelist;
>>         }
>>
>> +grow_slab:
>>         if (gfpflags & __GFP_WAIT)
>>                 local_irq_enable();
>>
>> -       new = new_slab(s, gfpflags, node);
>> +       new = new_slab(s, gfpflags, node, &reserve);
>>
>>         if (gfpflags & __GFP_WAIT)
>>                 local_irq_disable();
>>
>>         if (new) {
>>                 c = __this_cpu_ptr(s->cpu_slab);
>> +               c->reserve = reserve;
>>                 stat(s, ALLOC_SLAB);
>>                 if (c->page)
>>                         flush_slab(s, c);
>> @@ -1667,10 +1684,20 @@ new_slab:
>>         if (!(gfpflags & __GFP_NOWARN) && printk_ratelimit())
>>                 slab_out_of_memory(s, gfpflags, node);
>>         return NULL;
>> -debug:
>> -       if (!alloc_debug_processing(s, c->page, object, addr))
>> +
>> +slow_path:
>> +       if (!c->reserve && !alloc_debug_processing(s, c->page, object, addr))
>>                 goto another_slab;
>>
>> +       /*
>> +        * Avoid the slub fast path in slab_alloc() by not setting
>> +        * c->freelist and the fast path in slab_free() by making
>> +        * node_match() fail by setting c->node to -1.
>> +        *
>> +        * We use this for debug and reserve checks which need
>> +        * to be done for each allocation.
>> +        */
>> +
>>         c->page->inuse++;
>>         c->page->freelist = get_freepointer(s, object);
>>         c->node = -1;
>> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
>>         struct page *page;
>>         struct kmem_cache_node *n;
>>         unsigned long flags;
>> +       int reserve;
>>
>>         BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
>>
>> -       page = new_slab(kmalloc_caches, gfpflags, node);
>> +       page = new_slab(kmalloc_caches, gfpflags, node, &reserve);
>>
>>         BUG_ON(!page);
>>         if (page_to_nid(page) != node) {
>> --
>> 1.7.1.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
>>
>
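
The comment the quoted patch adds at slow_path describes keeping a slab
off both fast paths so that every allocation is re-checked. A minimal
sketch of that idea follows. The model_* names are hypothetical, and the
two predicates are simplified assumptions about what the fast paths
test; the exact kernel conditions are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state with only the two fields the comment mentions. */
struct model_cpu_slab {
	void *freelist;  /* non-NULL lets the allocation fast path run */
	int node;        /* -1 marks the slab as slow-path only */
};

/* Assumed allocation fast-path condition: a per-CPU freelist exists. */
static bool model_alloc_fast_path(const struct model_cpu_slab *c)
{
	return c->freelist != NULL;
}

/* Assumed free fast-path condition: the slab has a valid node id. */
static bool model_free_fast_path(const struct model_cpu_slab *c)
{
	return c->node >= 0;
}

/* What the slow path does for debug and reserve slabs in this model. */
static void model_mark_slow_only(struct model_cpu_slab *c)
{
	c->freelist = NULL;  /* never publish a per-CPU freelist */
	c->node = -1;        /* make the node check fail on free */
}
```

After model_mark_slow_only(), both predicates return false, so in this
model each allocation and free has to go through the slow path, where
the reserve check can be repeated.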

Thread overview: 37+ messages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 02/30] Swap over network documentation Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags() Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 04/30] mm: tag reseve pages Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
2010-07-13 20:33   ` Pekka Enberg
2010-07-15 12:37     ` Xiaotian Feng [this message]
2010-08-03  1:44     ` Neil Brown
2010-07-13 10:17 ` [PATCH -mmotm 06/30] mm: kmem_alloc_estimate() Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 08/30] mm: emergency pool Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 11/30] mm: memory reserve management Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
2010-07-13 10:55   ` Mitchell Erblich
2010-07-15 11:51     ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 13/30] net: packet split receive api Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 15/30] netvm: network reserve infrastructure Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 16/30] netvm: INET reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 18/30] netvm: filter emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 21/30] netvm: skb processing Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 22/30] mm: add support for non block device backed swap files Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat " Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 26/30] nfs: enable swap on NFS Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 28/30] build fix for skb_emergency_protocol Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server Xiaotian Feng
2010-07-13 12:53 ` [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Américo Wang
