linux-mm.kvack.org archive mirror
From: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
To: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	David Rientjes <rientjes@google.com>,
	Han Pingtian <hanpt@linux.vnet.ibm.com>,
	Pekka Enberg <penberg@kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Paul Mackerras <paulus@samba.org>,
	Anton Blanchard <anton@samba.org>, Matt Mackall <mpm@selenic.com>,
	linuxppc-dev@lists.ozlabs.org,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>
Subject: Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
Date: Mon, 10 Feb 2014 11:13:21 -0800
Message-ID: <20140210191321.GD1558@linux.vnet.ibm.com>
In-Reply-To: <alpine.DEB.2.10.1402071245040.20246@nuc>

Hi Christoph,

On 07.02.2014 [12:51:07 -0600], Christoph Lameter wrote:
> Here is a draft of a patch to make this work with memoryless nodes.
> 
> The first thing is that we modify node_match to also match if we hit an
> empty node. In that case we simply take the current slab if it's there.
> 
> If there is no current slab then a regular allocation occurs with the
> memoryless node. The page allocator will fall back to a possible node and
> that will become the current slab. Next alloc from a memoryless node
> will then use that slab.
> 
> For that we also add some tracking of allocations on nodes that were not
> satisfied using the empty_node[] array. A successful alloc on a node
> clears that flag.
> 
> I would rather avoid the empty_node[] array since it's global and there may
> be thread specific allocation restrictions but it would be expensive to do
> an allocation attempt via the page allocator to make sure that there is
> really no page available from the page allocator.
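
For anyone following along, here is how I read the rule described above,
as a small userspace model. It is a sketch of the prose description only,
not the patch itself: struct fake_page, note_alloc_result() and
node_match_sketch() are hypothetical names, MAX_NUMNODES is an arbitrary
value, and the posted diff further down expresses the final test
differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NUMNODES 16     /* assumed value, only for this sketch */
#define NUMA_NO_NODE (-1)   /* same value the kernel uses */

/* Hypothetical stand-in for struct page: only the node id matters here. */
struct fake_page {
	int nid;
};

/* One flag per node: set when a request targeted at it was not satisfied. */
static bool empty_node[MAX_NUMNODES];

/* Record the outcome of a slab allocation attempt targeted at 'node'. */
static void note_alloc_result(int node, const struct fake_page *page)
{
	if (!page) {
		if (node != NUMA_NO_NODE)
			empty_node[node] = true;
		return;
	}
	/* A successful allocation clears the flag of the node that served it. */
	empty_node[page->nid] = false;
}

/* Return true if the current slab may serve a request for 'node'. */
static bool node_match_sketch(const struct fake_page *page, int node)
{
	if (!page)
		return false;           /* no current slab, nothing to match */
	if (node == NUMA_NO_NODE)
		return true;            /* caller does not care about the node */
	if (page->nid == node)
		return true;            /* hit the requested node */
	return empty_node[node];        /* requested node marked empty: take anything */
}
```

In this model a request for a node whose flag is set keeps reusing
whatever slab is current, and goes back to strict matching once an
allocation on that node succeeds.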

With this patch on our test system (I pulled out the numa_mem_id()
change, since you already Acked Joonsoo's), applied on top of 3.13.0 +
my kthread locality change + CONFIG_HAVE_MEMORYLESS_NODES + Joonsoo's
RFC patch 1, meminfo reports:

MemTotal:        8264704 kB
MemFree:         5924608 kB
...
Slab:            1402496 kB
SReclaimable:     102848 kB
SUnreclaim:      1299648 kB

And Anton's slabusage reports:

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                       207 MB   98.60%  100.00%
task_struct                         134 MB   97.82%  100.00%
kmalloc-8192                        117 MB  100.00%  100.00%
pgtable-2^12                        111 MB  100.00%  100.00%
pgtable-2^10                        104 MB  100.00%  100.00%

For comparison, Anton's patch applied at the same point in the series:

meminfo:

MemTotal:        8264704 kB
MemFree:         4150464 kB
...
Slab:            1590336 kB
SReclaimable:     208768 kB
SUnreclaim:      1381568 kB

slabusage:

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                       227 MB   98.63%  100.00%
kmalloc-8192                        130 MB  100.00%  100.00%
task_struct                         129 MB   97.73%  100.00%
pgtable-2^12                        112 MB  100.00%  100.00%
pgtable-2^10                        106 MB  100.00%  100.00%


Consider this patch:

Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

I was thinking about your concerns about empty_node[]. Would it make
sense to use helper functions, rather than direct access to
empty_node[], such as:

	bool is_node_empty(int nid)

	void set_node_empty(int nid, bool empty)

which we stub out if !CONFIG_HAVE_MEMORYLESS_NODES to return false and
to be a no-op, respectively?

That way only architectures that have memoryless nodes pay the penalty
of the array allocation?
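
Roughly what I have in mind, as an untested sketch that compiles in
userspace. The two #defines are stand-ins for what Kconfig would normally
provide, and MAX_NUMNODES is an arbitrary value here; only the helper
names come from the proposal above.

```c
#include <stdbool.h>

/* Stand-ins so the sketch builds outside the kernel tree (assumptions). */
#define MAX_NUMNODES 16
#define CONFIG_HAVE_MEMORYLESS_NODES 1	/* remove to get the stubs */

#ifdef CONFIG_HAVE_MEMORYLESS_NODES
/* Only configurations with memoryless nodes carry the array. */
static bool empty_node[MAX_NUMNODES];

static inline bool is_node_empty(int nid)
{
	return empty_node[nid];
}

static inline void set_node_empty(int nid, bool empty)
{
	empty_node[nid] = empty;
}
#else
/* Without memoryless nodes, no node is ever reported as empty. */
static inline bool is_node_empty(int nid)
{
	(void)nid;
	return false;
}

static inline void set_node_empty(int nid, bool empty)
{
	(void)nid;
	(void)empty;
}
#endif
```

Callers would still need to filter out NUMA_NO_NODE before calling
either helper, as new_slab() does in your patch.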

Thanks,
Nish

> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c	2014-02-03 13:19:22.896853227 -0600
> +++ linux/mm/slub.c	2014-02-07 12:44:49.311494806 -0600
> @@ -132,6 +132,8 @@ static inline bool kmem_cache_has_cpu_pa
>  #endif
>  }
> 
> +static int empty_node[MAX_NUMNODES];
> +
>  /*
>   * Issues still to be resolved:
>   *
> @@ -1405,16 +1407,22 @@ static struct page *new_slab(struct kmem
>  	void *last;
>  	void *p;
>  	int order;
> +	int alloc_node;
> 
>  	BUG_ON(flags & GFP_SLAB_BUG_MASK);
> 
>  	page = allocate_slab(s,
>  		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
> -	if (!page)
> +	if (!page) {
> +		if (node != NUMA_NO_NODE)
> +			empty_node[node] = 1;
>  		goto out;
> +	}
> 
>  	order = compound_order(page);
> -	inc_slabs_node(s, page_to_nid(page), page->objects);
> +	alloc_node = page_to_nid(page);
> +	empty_node[alloc_node] = 0;
> +	inc_slabs_node(s, alloc_node, page->objects);
>  	memcg_bind_pages(s, order);
>  	page->slab_cache = s;
>  	__SetPageSlab(page);
> @@ -1712,7 +1720,7 @@ static void *get_partial(struct kmem_cac
>  		struct kmem_cache_cpu *c)
>  {
>  	void *object;
> -	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
> +	int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;
> 
>  	object = get_partial_node(s, get_node(s, searchnode), c, flags);
>  	if (object || node != NUMA_NO_NODE)
> @@ -2107,8 +2115,25 @@ static void flush_all(struct kmem_cache
>  static inline int node_match(struct page *page, int node)
>  {
>  #ifdef CONFIG_NUMA
> -	if (!page || (node != NUMA_NO_NODE && page_to_nid(page) != node))
> +	int page_node;
> +
> +	/* No data means no match */
> +	if (!page)
>  		return 0;
> +
> +	/* Node does not matter. Therefore anything is a match */
> +	if (node == NUMA_NO_NODE)
> +		return 1;
> +
> +	/* Did we hit the requested node ? */
> +	page_node = page_to_nid(page);
> +	if (page_node == node)
> +		return 1;
> +
> +	/* If the node has available data then we can use it. Mismatch */
> +	return !empty_node[page_node];
> +
> +	/* Target node empty so just take anything */
>  #endif
>  	return 1;
>  }
> 


