All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Joe Perches <joe@perches.com>,
	Vasily Averin <vasily.averin@linux.dev>,
	Matthew WilCox <willy@infradead.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 08/15] mm/slab_common: kmalloc_node: pass large requests to page allocator
Date: Tue, 2 Aug 2022 17:59:54 +0900	[thread overview]
Message-ID: <Yujnihj5YVPP2LjA@hyeyoo> (raw)
In-Reply-To: <caca2088-957d-70d3-0548-7fae810ae5b5@suse.cz>

On Mon, Aug 01, 2022 at 04:44:22PM +0200, Vlastimil Babka wrote:
> 
> On 8/1/22 16:37, Hyeonggon Yoo wrote:
> > On Thu, Jul 28, 2022 at 06:09:27PM +0200, Vlastimil Babka wrote:
> >> On 7/12/22 15:39, Hyeonggon Yoo wrote:
> >> > Now that kmalloc_large_node() is in common code, pass large requests
> >> > to page allocator in kmalloc_node() using kmalloc_large_node().
> >> > 
> >> > One problem is that currently there is no tracepoint in
> >> > kmalloc_large_node(). Instead of simply putting tracepoint in it,
> >> > use kmalloc_large_node{,_notrace} depending on its caller to show
> >> > useful address for both inlined kmalloc_node() and
> >> > __kmalloc_node_track_caller() when large objects are allocated.
> >> > 
> >> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> >> 
> >> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >>
> > 
> > Thanks!
> > 
> >> Nit below:
> >> 
> >> > ---
> >> >  v3:
> >> > 	This patch is new in v3 and this avoids
> >> > 	missing caller in __kmalloc_large_node_track_caller()
> >> > 	when kmalloc_large_node() is called.
> >> > 
> >> >  include/linux/slab.h | 26 +++++++++++++++++++-------
> >> >  mm/slab.h            |  2 ++
> >> >  mm/slab_common.c     | 17 ++++++++++++++++-
> >> >  mm/slub.c            |  2 +-
> >> >  4 files changed, 38 insertions(+), 9 deletions(-)
> >> > 
> >> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> > index 082499306098..fd2e129fc813 100644
> >> > --- a/include/linux/slab.h
> >> > +++ b/include/linux/slab.h
> >> > @@ -571,23 +571,35 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
> >> >  	return __kmalloc(size, flags);
> >> >  }
> >> >  
> >> > +#ifndef CONFIG_SLOB
> >> >  static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
> >> >  {
> >> > -#ifndef CONFIG_SLOB
> >> > -	if (__builtin_constant_p(size) &&
> >> > -		size <= KMALLOC_MAX_CACHE_SIZE) {
> >> > -		unsigned int i = kmalloc_index(size);
> >> > +	if (__builtin_constant_p(size)) {
> >> > +		unsigned int index;
> >> >  
> >> > -		if (!i)
> >> > +		if (size > KMALLOC_MAX_CACHE_SIZE)
> >> > +			return kmalloc_large_node(size, flags, node);
> >> > +
> >> > +		index = kmalloc_index(size);
> >> > +
> >> > +		if (!index)
> >> >  			return ZERO_SIZE_PTR;
> >> >  
> >> >  		return kmem_cache_alloc_node_trace(
> >> > -				kmalloc_caches[kmalloc_type(flags)][i],
> >> > +				kmalloc_caches[kmalloc_type(flags)][index],
> >> >  						flags, node, size);
> >> >  	}
> >> > -#endif
> >> >  	return __kmalloc_node(size, flags, node);
> >> >  }
> >> > +#else
> >> > +static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
> >> > +{
> >> > +	if (__builtin_constant_p(size) && size > KMALLOC_MAX_CACHE_SIZE)
> >> > +		return kmalloc_large_node(size, flags, node);
> >> > +
> >> > +	return __kmalloc_node(size, flags, node);
> >> > +}
> >> > +#endif
> >> >  
> >> >  /**
> >> >   * kmalloc_array - allocate memory for an array.
> >> > diff --git a/mm/slab.h b/mm/slab.h
> >> > index a8d5eb1c323f..7cb51ff44f0c 100644
> >> > --- a/mm/slab.h
> >> > +++ b/mm/slab.h
> >> > @@ -273,6 +273,8 @@ void create_kmalloc_caches(slab_flags_t);
> >> >  
> >> >  /* Find the kmalloc slab corresponding for a certain size */
> >> >  struct kmem_cache *kmalloc_slab(size_t, gfp_t);
> >> > +
> >> > +void *kmalloc_large_node_notrace(size_t size, gfp_t flags, int node);
> >> >  #endif
> >> >  
> >> >  gfp_t kmalloc_fix_flags(gfp_t flags);
> >> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> >> > index 6f855587b635..dc872e0ef0fc 100644
> >> > --- a/mm/slab_common.c
> >> > +++ b/mm/slab_common.c
> >> > @@ -956,7 +956,8 @@ void *kmalloc_large(size_t size, gfp_t flags)
> >> >  }
> >> >  EXPORT_SYMBOL(kmalloc_large);
> >> >  
> >> > -void *kmalloc_large_node(size_t size, gfp_t flags, int node)
> >> > +static __always_inline
> >> 
> >> I don't think we need to inline, compiler should be able to make
> >> kmalloc_large_node(_notrace) quite efficient anyway.
> > 
> > You mean s/static __always_inline/static/g, or like this?
> > 
> > kmalloc_large_node_notrace():
> > 	fold __kmalloc_large_node_notrace() into here
> > 
> > kmalloc_large_node():
> > 	kmalloc_large_node_notrace()
> > 	trace_kmalloc()
> > 
> > I have no strong opinion.
> > 
> > IMO It's unlikely that there would be workloads that are
> > meaningfully affected by inlining or not inlining __kmalloc_large_node_notrace().
> > Just wanted to avoid adding even tiny of overhead by this series.
> 
> I tried the following quick change on top of your series and got (with
> CONFIG_SLUB):
> 
> add/remove: 2/4 grow/shrink: 0/2 up/down: 208/-588 (-380)
> Function                                     old     new   delta
> __kmalloc_large_node                           -     186    +186
> __kmalloc_large_node.cold                      -      22     +22
> kmalloc_large_node.cold                       15       -     -15
> kmalloc_large.cold                            15       -     -15
> kmalloc_large_node_notrace.cold               22       -     -22
> kmalloc_large                                252      87    -165
> kmalloc_large_node                           271      86    -185
> kmalloc_large_node_notrace                   186       -    -186
> Total: Before=49398081, After=49397701, chg -0.00%
> 

Yeah, uninlining __kmalloc_large_node saves hundreds of bytes.
And the diff below looks good to me.

By The Way, do you have opinions on inlining slab_alloc_node()?
(Looks like similar topic?)

AFAIK slab_alloc_node() is inlined in:
        kmem_cache_alloc()
        kmem_cache_alloc_node()
        kmem_cache_alloc_lru()
        kmem_cache_alloc_trace()
        kmem_cache_alloc_node_trace()
        __kmem_cache_alloc_node()

This is what I get after simply dropping __always_inline in slab_alloc_node:

add/remove: 1/1 grow/shrink: 3/6 up/down: 1911/-5275 (-3364)
Function                                     old     new   delta
slab_alloc_node                                -    1356   +1356
sysfs_slab_alias                             134     327    +193
slab_memory_callback                         528     717    +189
__kmem_cache_create                         1325    1498    +173
__slab_alloc.constprop                       135       -    -135
kmem_cache_alloc_trace                       909     196    -713
kmem_cache_alloc                             937     191    -746
kmem_cache_alloc_node_trace                 1020     200    -820
__kmem_cache_alloc_node                      862      19    -843
kmem_cache_alloc_node                       1046     189    -857
kmem_cache_alloc_lru                        1348     187   -1161
Total: Before=32011183, After=32007819, chg -0.01%

So 3.28kB is cost of eliminating function call overhead in the 
fastpath.

This is tradeoff between function call overhead and
instruction cache usage...

> 
> diff --git a/mm/slab.h b/mm/slab.h
> index ad634e02b3cb..7b8963394144 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -274,7 +274,6 @@ void create_kmalloc_caches(slab_flags_t);
>  /* Find the kmalloc slab corresponding for a certain size */
>  struct kmem_cache *kmalloc_slab(size_t, gfp_t);
>  
> -void *kmalloc_large_node_notrace(size_t size, gfp_t flags, int node);
>  void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
>  			      int node, size_t orig_size,
>  			      unsigned long caller);
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 0d6cbe9d7ad0..07a6bf1cff36 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -925,6 +925,7 @@ void free_large_kmalloc(struct folio *folio, void *object)
>  	__free_pages(folio_page(folio, 0), order);
>  }
>  
> +static void *__kmalloc_large_node(size_t size, gfp_t flags, int node);
>  static __always_inline
>  void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
>  {
> @@ -932,7 +933,7 @@ void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller
>  	void *ret;
>  
>  	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
> -		ret = kmalloc_large_node_notrace(size, flags, node);
> +		ret = __kmalloc_large_node(size, flags, node);
>  		trace_kmalloc(_RET_IP_, ret,
>  			      size, PAGE_SIZE << get_order(size),
>  			      flags, node);
> @@ -1042,8 +1043,7 @@ gfp_t kmalloc_fix_flags(gfp_t flags)
>   * know the allocation order to free the pages properly in kfree.
>   */
>  
> -static __always_inline
> -void *__kmalloc_large_node_notrace(size_t size, gfp_t flags, int node)
> +static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
>  {
>  	struct page *page;
>  	void *ptr = NULL;
> @@ -1069,7 +1069,7 @@ void *__kmalloc_large_node_notrace(size_t size, gfp_t flags, int node)
>  
>  void *kmalloc_large(size_t size, gfp_t flags)
>  {
> -	void *ret = __kmalloc_large_node_notrace(size, flags, NUMA_NO_NODE);
> +	void *ret = __kmalloc_large_node(size, flags, NUMA_NO_NODE);
>  
>  	trace_kmalloc(_RET_IP_, ret,
>  		      size, PAGE_SIZE << get_order(size),
> @@ -1078,14 +1078,9 @@ void *kmalloc_large(size_t size, gfp_t flags)
>  }
>  EXPORT_SYMBOL(kmalloc_large);
>  
> -void *kmalloc_large_node_notrace(size_t size, gfp_t flags, int node)
> -{
> -	return __kmalloc_large_node_notrace(size, flags, node);
> -}
> -
>  void *kmalloc_large_node(size_t size, gfp_t flags, int node)
>  {
> -	void *ret = __kmalloc_large_node_notrace(size, flags, node);
> +	void *ret = __kmalloc_large_node(size, flags, node);
>  
>  	trace_kmalloc(_RET_IP_, ret,
>  		      size, PAGE_SIZE << get_order(size),

-- 
Thanks,
Hyeonggon


  reply	other threads:[~2022-08-02  9:00 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-12 13:39 [PATCH v3 00/15] common kmalloc v3 Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 1/15] mm/slab: move NUMA-related code to __do_cache_alloc() Hyeonggon Yoo
2022-07-12 14:29   ` Christoph Lameter
2022-07-13  9:39     ` Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 2/15] mm/slab: cleanup slab_alloc() and slab_alloc_node() Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 03/15] mm/slab_common: remove CONFIG_NUMA ifdefs for common kmalloc functions Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 04/15] mm/slab_common: cleanup kmalloc_track_caller() Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 05/15] mm/sl[au]b: factor out __do_kmalloc_node() Hyeonggon Yoo
2022-07-28 14:45   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 06/15] mm/slab_common: fold kmalloc_order_trace() into kmalloc_large() Hyeonggon Yoo
2022-07-28 15:23   ` Vlastimil Babka
2022-08-01 13:26     ` Hyeonggon Yoo
2022-08-01 13:36       ` Vlastimil Babka
2022-08-02  2:54         ` Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 07/15] mm/slub: move kmalloc_large_node() to slab_common.c Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH v3 08/15] mm/slab_common: kmalloc_node: pass large requests to page allocator Hyeonggon Yoo
2022-07-28 16:09   ` Vlastimil Babka
2022-08-01 14:37     ` Hyeonggon Yoo
2022-08-01 14:44       ` Vlastimil Babka
2022-08-02  8:59         ` Hyeonggon Yoo [this message]
2022-08-02  9:32           ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 09/15] mm/slab_common: cleanup kmalloc_large() Hyeonggon Yoo
2022-07-28 16:13   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 10/15] mm/slab: kmalloc: pass requests larger than order-1 page to page allocator Hyeonggon Yoo
2022-07-28 16:25   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 11/15] mm/sl[au]b: introduce common alloc/free functions without tracepoint Hyeonggon Yoo
2022-07-29  9:49   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 12/15] mm/sl[au]b: generalize kmalloc subsystem Hyeonggon Yoo
2022-07-29 10:25   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH v3 13/15] mm/slab_common: unify NUMA and UMA version of tracepoints Hyeonggon Yoo
2022-07-29 10:52   ` Vlastimil Babka
2022-07-12 13:39 ` [PATCH 14/16] mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using Hyeonggon Yoo
2022-07-29 11:23   ` Vlastimil Babka
2022-08-02  9:22     ` Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH 15/16] mm/slab_common: move definition of __ksize() to mm/slab.h Hyeonggon Yoo
2022-07-29 11:47   ` Vlastimil Babka
2022-08-02  9:25     ` Hyeonggon Yoo
2022-07-12 13:39 ` [PATCH 16/16] mm/sl[au]b: check if large object is valid in __ksize() Hyeonggon Yoo
2022-07-12 15:13   ` Christoph Lameter
2022-07-13  9:25     ` Hyeonggon Yoo
2022-07-13 10:07       ` Christoph Lameter
2022-07-13 10:33         ` Marco Elver
2022-07-14  9:15           ` Christoph Lameter
2022-07-14 10:30             ` Marco Elver
2022-07-20 10:05               ` Hyeonggon Yoo
2022-07-29 11:50   ` Vlastimil Babka
2022-07-29 15:08 ` [PATCH v3 00/15] common kmalloc v3 Vlastimil Babka
2022-08-14 10:06   ` Hyeonggon Yoo
2022-08-15 12:59     ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yujnihj5YVPP2LjA@hyeyoo \
    --to=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=joe@perches.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=vasily.averin@linux.dev \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.