linux-mm.kvack.org archive mirror
* [PATCH] mm: simplify size2index conversion of __kmalloc_index
@ 2022-08-28 15:14 Dawei Li
  2022-08-29  3:11 ` Matthew Wilcox
  2022-08-29  3:18 ` Hyeonggon Yoo
  0 siblings, 2 replies; 8+ messages in thread
From: Dawei Li @ 2022-08-28 15:14 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka,
	roman.gushchin, 42.hyeyoo
  Cc: linux-mm, linux-kernel, Dawei Li

Current size2index is implemented by one to one hardcode mapping,
which can be improved by order_base_2().
Must be careful to not violate compile-time optimization rule.

Generated code for caller of kmalloc:
48 8b 3d 9f 0b 6b 01 mov    0x16b0b9f(%rip),%rdi
                            # ffffffff826d1568 <kmalloc_caches+0x48>
ba 08 01 00 00       mov    $0x108,%edx
be c0 0d 00 00       mov    $0xdc0,%esi
e8 98 d7 2e 00       callq  ffffffff8130e170 <kmem_cache_alloc_trace>

Signed-off-by: Dawei Li <set_pte_at@outlook.com>
---
 include/linux/slab.h | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 0fefdf528e0d..66452a4357c6 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -17,7 +17,7 @@
 #include <linux/types.h>
 #include <linux/workqueue.h>
 #include <linux/percpu-refcount.h>
-
+#include <linux/log2.h>
 
 /*
  * Flags to pass to kmem_cache_create().
@@ -394,31 +394,16 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
 
 	if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
 		return 1;
+
 	if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
 		return 2;
-	if (size <=          8) return 3;
-	if (size <=         16) return 4;
-	if (size <=         32) return 5;
-	if (size <=         64) return 6;
-	if (size <=        128) return 7;
-	if (size <=        256) return 8;
-	if (size <=        512) return 9;
-	if (size <=       1024) return 10;
-	if (size <=   2 * 1024) return 11;
-	if (size <=   4 * 1024) return 12;
-	if (size <=   8 * 1024) return 13;
-	if (size <=  16 * 1024) return 14;
-	if (size <=  32 * 1024) return 15;
-	if (size <=  64 * 1024) return 16;
-	if (size <= 128 * 1024) return 17;
-	if (size <= 256 * 1024) return 18;
-	if (size <= 512 * 1024) return 19;
-	if (size <= 1024 * 1024) return 20;
-	if (size <=  2 * 1024 * 1024) return 21;
-	if (size <=  4 * 1024 * 1024) return 22;
-	if (size <=  8 * 1024 * 1024) return 23;
-	if (size <=  16 * 1024 * 1024) return 24;
-	if (size <=  32 * 1024 * 1024) return 25;
+
+	if (size <= 8)
+		return 3;
+
+	/* Following compile-time optimization rule is mandatory. */
+	if (size <= 32 * 1024 * 1024)
+		return order_base_2(size);
 
 	if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant)
 		BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()");
@@ -700,7 +685,6 @@ static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t
 	return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
 }
 
-
 #ifdef CONFIG_NUMA
 extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
 					 unsigned long caller) __alloc_size(1);
-- 
2.25.1
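
A minimal user-space sketch of the equivalence this patch relies on, not kernel code. index_by_table() walks the same power-of-two thresholds as the removed if-chain, and ceil_log2() is a hypothetical stand-in that is assumed to return the same value as order_base_2() for n >= 1. The special 96- and 192-byte buckets and the size == 0 case are left out, and all function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Same thresholds as the removed if-chain: smallest i in [3, 25] with size <= 2^i. */
unsigned int index_by_table(size_t size)
{
	unsigned int i;

	for (i = 3; i <= 25; i++)
		if (size <= ((size_t)1 << i))
			return i;
	return 0; /* out of range in this sketch */
}

/* ceil(log2(n)) for n >= 1; assumed to match order_base_2() in result. */
unsigned int ceil_log2(size_t n)
{
	unsigned int bits = 0;

	while (((size_t)1 << bits) < n)
		bits++;
	return bits;
}

/* Shape of the patched code path, minus the 96/192-byte special cases. */
unsigned int index_by_log2(size_t size)
{
	if (size <= 8)
		return 3;
	if (size <= 32 * 1024 * 1024)
		return ceil_log2(size);
	return 0; /* out of range in this sketch */
}

/* Returns 1 if both mappings agree for every size in [1, limit]. */
int mappings_agree(size_t limit)
{
	size_t s;

	for (s = 1; s <= limit; s++)
		if (index_by_table(s) != index_by_log2(s))
			return 0;
	return 1;
}

/* Spot checks around a power-of-two boundary. */
void self_check(void)
{
	assert(index_by_log2(4096) == 12);
	assert(index_by_log2(4097) == 13);
}
```

This only checks that the two mappings produce the same values at run time. It says nothing about whether a given compiler folds the expression to a constant, which is the concern raised later in the thread.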




* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-28 15:14 [PATCH] mm: simplify size2index conversion of __kmalloc_index Dawei Li
@ 2022-08-29  3:11 ` Matthew Wilcox
  2022-08-29  3:36   ` Hyeonggon Yoo
  2022-08-29  3:18 ` Hyeonggon Yoo
  1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2022-08-29  3:11 UTC (permalink / raw)
  To: Dawei Li
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka,
	roman.gushchin, 42.hyeyoo, linux-mm, linux-kernel

On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
> Current size2index is implemented by one to one hardcode mapping,
> which can be improved by order_base_2().
> Must be careful to not violate compile-time optimization rule.

This patch has been NACKed before (when submitted by other people).



* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-28 15:14 [PATCH] mm: simplify size2index conversion of __kmalloc_index Dawei Li
  2022-08-29  3:11 ` Matthew Wilcox
@ 2022-08-29  3:18 ` Hyeonggon Yoo
  1 sibling, 0 replies; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-08-29  3:18 UTC (permalink / raw)
  To: Dawei Li
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka,
	roman.gushchin, linux-mm, linux-kernel

On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
> Current size2index is implemented by one to one hardcode mapping,
> which can be improved by order_base_2().
> Must be careful to not violate compile-time optimization rule.

> Generated code for caller of kmalloc:
> 48 8b 3d 9f 0b 6b 01 mov    0x16b0b9f(%rip),%rdi
>                             # ffffffff826d1568 <kmalloc_caches+0x48>
> ba 08 01 00 00       mov    $0x108,%edx
> be c0 0d 00 00       mov    $0xdc0,%esi
> e8 98 d7 2e 00       callq  ffffffff8130e170 <kmem_cache_alloc_trace>
> 
> Signed-off-by: Dawei Li <set_pte_at@outlook.com>
> ---
>  include/linux/slab.h | 34 +++++++++-------------------------
>  1 file changed, 9 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 0fefdf528e0d..66452a4357c6 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -17,7 +17,7 @@
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
>  #include <linux/percpu-refcount.h>
> -
> +#include <linux/log2.h>
>  
>  /*
>   * Flags to pass to kmem_cache_create().
> @@ -394,31 +394,16 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
>  
>  	if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
>  		return 1;
> +
>  	if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
>  		return 2;
> -	if (size <=          8) return 3;
> -	if (size <=         16) return 4;
> -	if (size <=         32) return 5;
> -	if (size <=         64) return 6;
> -	if (size <=        128) return 7;
> -	if (size <=        256) return 8;
> -	if (size <=        512) return 9;
> -	if (size <=       1024) return 10;
> -	if (size <=   2 * 1024) return 11;
> -	if (size <=   4 * 1024) return 12;
> -	if (size <=   8 * 1024) return 13;
> -	if (size <=  16 * 1024) return 14;
> -	if (size <=  32 * 1024) return 15;
> -	if (size <=  64 * 1024) return 16;
> -	if (size <= 128 * 1024) return 17;
> -	if (size <= 256 * 1024) return 18;
> -	if (size <= 512 * 1024) return 19;
> -	if (size <= 1024 * 1024) return 20;
> -	if (size <=  2 * 1024 * 1024) return 21;
> -	if (size <=  4 * 1024 * 1024) return 22;
> -	if (size <=  8 * 1024 * 1024) return 23;
> -	if (size <=  16 * 1024 * 1024) return 24;
> -	if (size <=  32 * 1024 * 1024) return 25;

It does not apply. Better to rebase it on Vlastimil's slab tree (for-next branch):
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/

> +
> +	if (size <= 8)
> +		return 3;
> +
> +	/* Following compile-time optimization rule is mandatory. */
> +	if (size <= 32 * 1024 * 1024)
> +		return order_base_2(size);

Oh, it seems order_base_2() does compile-time optimization as well.

With order_base_2(), what about using KMALLOC_MAX_CACHE_SIZE instead of 32 * 1024 * 1024?
I think that would be more robust.
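
A sketch of the bound suggested above, in user-space C and under assumptions: EXAMPLE_MAX_CACHE_SIZE is a hypothetical stand-in for KMALLOC_MAX_CACHE_SIZE with an arbitrarily chosen value of 8 KiB, and ceil_log2() is assumed to match order_base_2() for n >= 1. The 96- and 192-byte special cases are omitted, and returning 0 for an out-of-range size is this sketch's own convention.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, for illustration only. */
#define EXAMPLE_MAX_CACHE_SIZE ((size_t)8192)

/* ceil(log2(n)) for n >= 1; assumed to match order_base_2() in result. */
unsigned int ceil_log2(size_t n)
{
	unsigned int bits = 0;

	while (((size_t)1 << bits) < n)
		bits++;
	return bits;
}

/* Index lookup bounded by the cache limit instead of a literal 32 MiB. */
unsigned int index_with_cache_limit(size_t size)
{
	if (!size)
		return 0;
	if (size <= 8)
		return 3;
	if (size <= EXAMPLE_MAX_CACHE_SIZE)
		return ceil_log2(size);
	return 0; /* beyond the assumed cache limit */
}

/* Spot checks at the assumed limit. */
void self_check(void)
{
	assert(index_with_cache_limit(8192) == 13);
	assert(index_with_cache_limit(8193) == 0);
}
```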

Hmm, it would also be better to check that it works with the kfence tests (they pass a
non-constant value). Let's check whether it breaks after the rebase.

Thanks!

>  
>  	if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant)
>  		BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()");
> @@ -700,7 +685,6 @@ static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t
>  	return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
>  }
>  
> -
>  #ifdef CONFIG_NUMA
>  extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
>  					 unsigned long caller) __alloc_size(1);
> -- 
> 2.25.1
> 

-- 
Thanks,
Hyeonggon



* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-29  3:11 ` Matthew Wilcox
@ 2022-08-29  3:36   ` Hyeonggon Yoo
  2022-08-29 14:21     ` Vlastimil Babka
  0 siblings, 1 reply; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-08-29  3:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dawei Li, cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka,
	roman.gushchin, linux-mm, linux-kernel

On Mon, Aug 29, 2022 at 04:11:04AM +0100, Matthew Wilcox wrote:
> On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
> > Current size2index is implemented by one to one hardcode mapping,
> > which can be improved by order_base_2().
> > Must be careful to not violate compile-time optimization rule.
> 
> This patch has been NACKed before (when submitted by other people).


Hmm right.
https://lkml.iu.edu/hypermail/linux/kernel/1606.2/05402.html

Christoph Lameter wrote:
> On Wed, 22 Jun 2016, Yury Norov wrote:
> > There will be no fls() for constant at runtime because ilog2() calculates
> > constant values at compile-time as well. From this point of view,
> > this patch removes code duplication, as we already have compile-time
> > log() calculation in kernel, and should re-use it whenever possible.

> The reason not to use ilog there was that the constant folding did not
> work correctly with one or the other architectures/compilers. If you want
> to do this then please verify that all arches reliably do produce a
> constant there.

Can we re-evaluate this?

-- 
Thanks,
Hyeonggon



* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-29  3:36   ` Hyeonggon Yoo
@ 2022-08-29 14:21     ` Vlastimil Babka
  2022-08-29 15:37       ` Re: " pgd pte
  2022-08-30  5:51       ` Christophe Leroy
  0 siblings, 2 replies; 8+ messages in thread
From: Vlastimil Babka @ 2022-08-29 14:21 UTC (permalink / raw)
  To: Hyeonggon Yoo, Matthew Wilcox
  Cc: Dawei Li, cl, penberg, rientjes, iamjoonsoo.kim, akpm,
	roman.gushchin, linux-mm, linux-kernel

On 8/29/22 05:36, Hyeonggon Yoo wrote:
> On Mon, Aug 29, 2022 at 04:11:04AM +0100, Matthew Wilcox wrote:
>> On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
>> > Current size2index is implemented by one to one hardcode mapping,
>> > which can be improved by order_base_2().
>> > Must be careful to not violate compile-time optimization rule.
>> 
>> This patch has been NACKed before (when submitted by other people).
> 
> 
> Hmm right.
> https://lkml.iu.edu/hypermail/linux/kernel/1606.2/05402.html
> 
> Christoph Lameter wrote:
>> On Wed, 22 Jun 2016, Yury Norov wrote:
>> > There will be no fls() for constant at runtime because ilog2() calculates
>> > constant values at compile-time as well. From this point of view,
>> > this patch removes code duplication, as we already have compile-time
>> > log() calculation in kernel, and should re-use it whenever possible.
> 
>> The reason not to use ilog there was that the constant folding did not
>> work correctly with one or the other architectures/compilers. If you want
>> to do this then please verify that all arches reliably do produce a
>> constant there.
> 
> Can we re-evaluate this?

Is there a way to turn inability of compile-time calculation to a
compile-time error? (when size_is_constant=true etc). Then we could try and
see if anything breaks in -next.




* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-29 14:21     ` Vlastimil Babka
@ 2022-08-29 15:37       ` Re: " pgd pte
  2022-08-30  5:51       ` Christophe Leroy
  1 sibling, 0 replies; 8+ messages in thread
From: pgd pte @ 2022-08-29 15:37 UTC (permalink / raw)
  To: Vlastimil Babka, Hyeonggon Yoo, Matthew Wilcox
  Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	roman.gushchin@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org

Interesting, I will see what I can do about it.
Just curious, could for-next testing cover all architectures and compilers?
Thanks for all the insightful comments from you guys, that's very helpful.

________________________________________
From: Vlastimil Babka <vbabka@suse.cz>
Sent: August 29, 2022 22:21
To: Hyeonggon Yoo; Matthew Wilcox
Cc: Dawei Li; cl@linux.com; penberg@kernel.org; rientjes@google.com; iamjoonsoo.kim@lge.com; akpm@linux-foundation.org; roman.gushchin@linux.dev; linux-mm@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index

On 8/29/22 05:36, Hyeonggon Yoo wrote:
> On Mon, Aug 29, 2022 at 04:11:04AM +0100, Matthew Wilcox wrote:
>> On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
>> > Current size2index is implemented by one to one hardcode mapping,
>> > which can be improved by order_base_2().
>> > Must be careful to not violate compile-time optimization rule.
>>
>> This patch has been NACKed before (when submitted by other people).
>
>
> Hmm right.
> https://lkml.iu.edu/hypermail/linux/kernel/1606.2/05402.html
>
> Christoph Lameter wrote:
>> On Wed, 22 Jun 2016, Yury Norov wrote:
>> > There will be no fls() for constant at runtime because ilog2() calculates
>> > constant values at compile-time as well. From this point of view,
>> > this patch removes code duplication, as we already have compile-time
>> > log() calculation in kernel, and should re-use it whenever possible.
>
>> The reason not to use ilog there was that the constant folding did not
>> work correctly with one or the other architectures/compilers. If you want
>> to do this then please verify that all arches reliably do produce a
>> constant there.
>
> Can we re-evaluate this?

Is there a way to turn inability of compile-time calculation to a
compile-time error? (when size_is_constant=true etc). Then we could try and
see if anything breaks in -next.



* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-29 14:21     ` Vlastimil Babka
  2022-08-29 15:37       ` Re: " pgd pte
@ 2022-08-30  5:51       ` Christophe Leroy
  2022-08-30 13:13         ` Vlastimil Babka
  1 sibling, 1 reply; 8+ messages in thread
From: Christophe Leroy @ 2022-08-30  5:51 UTC (permalink / raw)
  To: Vlastimil Babka, Hyeonggon Yoo, Matthew Wilcox
  Cc: Dawei Li, cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	roman.gushchin@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org



On 29/08/2022 at 16:21, Vlastimil Babka wrote:
> On 8/29/22 05:36, Hyeonggon Yoo wrote:
>> On Mon, Aug 29, 2022 at 04:11:04AM +0100, Matthew Wilcox wrote:
>>> On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
>>>> Current size2index is implemented by one to one hardcode mapping,
>>>> which can be improved by order_base_2().
>>>> Must be careful to not violate compile-time optimization rule.
>>>
>>> This patch has been NACKed before (when submitted by other people).
>>
>>
>> Hmm right.
>> https://lkml.iu.edu/hypermail/linux/kernel/1606.2/05402.html
>>
>> Christoph Lameter wrote:
>>> On Wed, 22 Jun 2016, Yury Norov wrote:
>>>> There will be no fls() for constant at runtime because ilog2() calculates
>>>> constant values at compile-time as well. From this point of view,
>>>> this patch removes code duplication, as we already have compile-time
>>>> log() calculation in kernel, and should re-use it whenever possible.
>>
>>> The reason not to use ilog there was that the constant folding did not
>>> work correctly with one or the other architectures/compilers. If you want
>>> to do this then please verify that all arches reliably do produce a
>>> constant there.
>>
>> Can we re-evaluate this?
> 
> Is there a way to turn inability of compile-time calculation to a
> compile-time error? (when size_is_constant=true etc). Then we could try and
> see if anything breaks in -next.
> 
> 

The following will generate a build error if the function
constant_check() is not called with a build-time constant argument.

static void __always_inline constant_check(unsigned long val)
{
	BUILD_BUG_ON(!__builtin_constant_p(val));
}

Is that what you are looking for?
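
A hedged user-space sketch of why the result of such a check can depend on the compiler. __builtin_constant_p() is a GCC/Clang builtin, and the helper names below are hypothetical. A literal passed directly is reported as constant, a volatile read never is, and the always-inline wrapper (mirroring constant_check() above) reports a constant only if the compiler propagates the argument after inlining, which typically requires optimization to be enabled.

```c
#include <assert.h>

/* Macro form: the builtin sees the argument expression directly. */
#define IS_CONSTANT(expr) __builtin_constant_p(expr)

/*
 * Function form, as in constant_check(): the builtin only sees "val",
 * so the answer depends on inlining and constant propagation.
 */
static inline __attribute__((always_inline))
int constant_seen_in_inline(unsigned long val)
{
	return __builtin_constant_p(val);
}

/* A literal is always reported as a compile-time constant. */
int literal_is_constant(void)
{
	return IS_CONSTANT(4096UL);
}

/* A volatile read is never a compile-time constant. */
int volatile_is_constant(void)
{
	volatile unsigned long v = 4096;

	return IS_CONSTANT(v);
}

/* Result may differ between optimization levels and compilers. */
int inlined_literal_is_constant(void)
{
	return constant_seen_in_inline(4096UL);
}

/* Only the two optimizer-independent cases are asserted. */
void self_check(void)
{
	assert(literal_is_constant() == 1);
	assert(volatile_is_constant() == 0);
}
```

The value of inlined_literal_is_constant() is deliberately not asserted here, because it is exactly the part that would need to be verified per architecture and compiler.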


* Re: [PATCH] mm: simplify size2index conversion of __kmalloc_index
  2022-08-30  5:51       ` Christophe Leroy
@ 2022-08-30 13:13         ` Vlastimil Babka
  0 siblings, 0 replies; 8+ messages in thread
From: Vlastimil Babka @ 2022-08-30 13:13 UTC (permalink / raw)
  To: Christophe Leroy, Hyeonggon Yoo, Matthew Wilcox
  Cc: Dawei Li, cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	roman.gushchin@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org

On 8/30/22 07:51, Christophe Leroy wrote:
> 
> 
> On 29/08/2022 at 16:21, Vlastimil Babka wrote:
>> On 8/29/22 05:36, Hyeonggon Yoo wrote:
>>> On Mon, Aug 29, 2022 at 04:11:04AM +0100, Matthew Wilcox wrote:
>>>> On Sun, Aug 28, 2022 at 11:14:48PM +0800, Dawei Li wrote:
>>>>> Current size2index is implemented by one to one hardcode mapping,
>>>>> which can be improved by order_base_2().
>>>>> Must be careful to not violate compile-time optimization rule.
>>>>
>>>> This patch has been NACKed before (when submitted by other people).
>>>
>>>
>>> Hmm right.
>>> https://lkml.iu.edu/hypermail/linux/kernel/1606.2/05402.html
>>>
>>> Christoph Lameter wrote:
>>>> On Wed, 22 Jun 2016, Yury Norov wrote:
>>>>> There will be no fls() for constant at runtime because ilog2() calculates
>>>>> constant values at compile-time as well. From this point of view,
>>>>> this patch removes code duplication, as we already have compile-time
>>>>> log() calculation in kernel, and should re-use it whenever possible.
>>>
>>>> The reason not to use ilog there was that the constant folding did not
>>>> work correctly with one or the other architectures/compilers. If you want
>>>> to do this then please verify that all arches reliably do produce a
>>>> constant there.
>>>
>>> Can we re-evaluate this?
>>
>> Is there a way to turn inability of compile-time calculation to a
>> compile-time error? (when size_is_constant=true etc). Then we could try and
>> see if anything breaks in -next.
>>
>>
> 
> The following will generate a build error if the function
> constant_check() is not called with a buildtime constant argument.
> 
> static void __always_inline constant_check(unsigned long val)
> {
> 	BUILD_BUG_ON(!__builtin_constant_p(val));
> }
> 
> Is that what you are looking for ?
Maybe, if we can rely on these two being equivalent:
- __kmalloc_index(x) is evaluated at compile time
- __builtin_constant_p(__kmalloc_index(x)) is true

Logically such equivalency should be expected, and a quick attempt 
locally with recent gcc seems to work fine, but I guess we'll have to 
try in -next for a bit and see if anything comes out.


