* [-mm patch] make kcalloc() a static inline
@ 2005-08-08 22:38 Adrian Bunk
2005-08-09 5:09 ` Pekka J Enberg
2005-08-15 8:06 ` [-mm patch] " Denis Vlasenko
0 siblings, 2 replies; 13+ messages in thread
From: Adrian Bunk @ 2005-08-08 22:38 UTC (permalink / raw)
To: Pekka J Enberg, Andrew Morton; +Cc: linux-kernel
kcalloc() doesn't do much more than calling kzalloc(), and gcc has
better optimizing opportunities when it's inlined.
The result of this patch with a fulll kernel compile (roughly equivalent
to "make allyesconfig") shows a minimal size improvement:
text data bss dec hex filename
25864955 5891214 2012840 33769009 2034631 vmlinux-before
25864635 5891206 2012840 33768681 20344e9 vmlinux-staticinline
Signed-off-by: Adrian Bunk <bunk@stusta.de>
---
include/linux/slab.h | 15 ++++++++++++++-
mm/slab.c | 14 --------------
2 files changed, 14 insertions(+), 15 deletions(-)
--- linux-2.6.13-rc5-mm1-full/include/linux/slab.h.old 2005-08-08 12:28:32.000000000 +0200
+++ linux-2.6.13-rc5-mm1-full/include/linux/slab.h 2005-08-08 12:29:59.000000000 +0200
@@ -103,8 +103,21 @@
return __kmalloc(size, flags);
}
-extern void *kcalloc(size_t, size_t, unsigned int __nocast);
extern void *kzalloc(size_t, unsigned int __nocast);
+
+/**
+ * kcalloc - allocate memory for an array. The memory is set to zero.
+ * @n: number of elements.
+ * @size: element size.
+ * @flags: the type of memory to allocate.
+ */
+static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags)
+{
+ if (n != 0 && size > INT_MAX / n)
+ return NULL;
+ return kzalloc(n * size, flags);
+}
+
extern void kfree(const void *);
extern unsigned int ksize(const void *);
--- linux-2.6.13-rc5-mm1-full/mm/slab.c.old 2005-08-08 12:29:26.000000000 +0200
+++ linux-2.6.13-rc5-mm1-full/mm/slab.c 2005-08-08 12:29:53.000000000 +0200
@@ -3028,20 +3028,6 @@
EXPORT_SYMBOL(kzalloc);
/**
- * kcalloc - allocate memory for an array. The memory is set to zero.
- * @n: number of elements.
- * @size: element size.
- * @flags: the type of memory to allocate.
- */
-void *kcalloc(size_t n, size_t size, unsigned int __nocast flags)
-{
- if (n != 0 && size > INT_MAX / n)
- return NULL;
- return kzalloc(n * size, flags);
-}
-EXPORT_SYMBOL(kcalloc);
-
-/**
* kfree - free previously allocated memory
* @objp: pointer returned by kmalloc.
*
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: make kcalloc() a static inline 2005-08-08 22:38 [-mm patch] make kcalloc() a static inline Adrian Bunk @ 2005-08-09 5:09 ` Pekka J Enberg 2005-08-15 8:06 ` [-mm patch] " Denis Vlasenko 1 sibling, 0 replies; 13+ messages in thread From: Pekka J Enberg @ 2005-08-09 5:09 UTC (permalink / raw) To: Adrian Bunk; +Cc: Pekka J Enberg, Andrew Morton, linux-kernel Adrian Bunk writes: > kcalloc() doesn't do much more than calling kzalloc(), and gcc has > better optimizing opportunities when it's inlined. > > The result of this patch with a fulll kernel compile (roughly equivalent > to "make allyesconfig") shows a minimal size improvement: > > text data bss dec hex filename > 25864955 5891214 2012840 33769009 2034631 vmlinux-before > 25864635 5891206 2012840 33768681 20344e9 vmlinux-staticinline > > > Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Pekka ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-08 22:38 [-mm patch] make kcalloc() a static inline Adrian Bunk 2005-08-09 5:09 ` Pekka J Enberg @ 2005-08-15 8:06 ` Denis Vlasenko 2005-08-15 8:14 ` Arjan van de Ven 1 sibling, 1 reply; 13+ messages in thread From: Denis Vlasenko @ 2005-08-15 8:06 UTC (permalink / raw) To: Adrian Bunk, Pekka J Enberg, Andrew Morton; +Cc: linux-kernel On Tuesday 09 August 2005 01:38, Adrian Bunk wrote: > kcalloc() doesn't do much more than calling kzalloc(), and gcc has > better optimizing opportunities when it's inlined. > > The result of this patch with a fulll kernel compile (roughly equivalent > to "make allyesconfig") shows a minimal size improvement: > > text data bss dec hex filename > 25864955 5891214 2012840 33769009 2034631 vmlinux-before > 25864635 5891206 2012840 33768681 20344e9 vmlinux-staticinline > > > Signed-off-by: Adrian Bunk <bunk@stusta.de> > > --- > > include/linux/slab.h | 15 ++++++++++++++- > mm/slab.c | 14 -------------- > 2 files changed, 14 insertions(+), 15 deletions(-) > > --- linux-2.6.13-rc5-mm1-full/include/linux/slab.h.old 2005-08-08 12:28:32.000000000 +0200 > +++ linux-2.6.13-rc5-mm1-full/include/linux/slab.h 2005-08-08 12:29:59.000000000 +0200 > @@ -103,8 +103,21 @@ > return __kmalloc(size, flags); > } > > -extern void *kcalloc(size_t, size_t, unsigned int __nocast); > extern void *kzalloc(size_t, unsigned int __nocast); > + > +/** > + * kcalloc - allocate memory for an array. The memory is set to zero. > + * @n: number of elements. > + * @size: element size. > + * @flags: the type of memory to allocate. > + */ > +static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > +{ > + if (n != 0 && size > INT_MAX / n) > + return NULL; > + return kzalloc(n * size, flags); > +} > + > extern void kfree(const void *); > extern unsigned int ksize(const void *); > > --- linux-2.6.13-rc5-mm1-full/mm/slab.c.old 2005-08-08 12:29:26.000000000 +0200 > +++ linux-2.6.13-rc5-mm1-full/mm/slab.c 2005-08-08 12:29:53.000000000 +0200 > @@ -3028,20 +3028,6 @@ > EXPORT_SYMBOL(kzalloc); > > /** > - * kcalloc - allocate memory for an array. The memory is set to zero. > - * @n: number of elements. > - * @size: element size. > - * @flags: the type of memory to allocate. > - */ > -void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > -{ > - if (n != 0 && size > INT_MAX / n) > - return NULL; > - return kzalloc(n * size, flags); > -} > -EXPORT_SYMBOL(kcalloc); > - > -/** > * kfree - free previously allocated memory > * @objp: pointer returned by kmalloc. > * Can you conditionalize it on __builtin_constant_p(n) ? Otherwise with variable n you have 3 oprations in inlined code, one of them a division. I bet you'll save even more space with such change. -- vda ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 8:06 ` [-mm patch] " Denis Vlasenko @ 2005-08-15 8:14 ` Arjan van de Ven 2005-08-15 8:20 ` Denis Vlasenko 0 siblings, 1 reply; 13+ messages in thread From: Arjan van de Ven @ 2005-08-15 8:14 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Adrian Bunk, Pekka J Enberg, Andrew Morton, linux-kernel On Mon, 2005-08-15 at 11:06 +0300, Denis Vlasenko wrote: > > +static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > > +{ > > + if (n != 0 && size > INT_MAX / n) > > + return NULL; > > + return kzalloc(n * size, flags); > > +} > Can you conditionalize it on __builtin_constant_p(n) ? > Otherwise with variable n you have 3 oprations in inlined > code, one of them a division. you can't conditionalize the security check really. If it's constant, gcc will optimize it anyway, and if it's not constant you for sure want the check there... ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 8:14 ` Arjan van de Ven @ 2005-08-15 8:20 ` Denis Vlasenko 2005-08-15 8:28 ` Pekka J Enberg 0 siblings, 1 reply; 13+ messages in thread From: Denis Vlasenko @ 2005-08-15 8:20 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Adrian Bunk, Pekka J Enberg, Andrew Morton, linux-kernel On Monday 15 August 2005 11:14, Arjan van de Ven wrote: > On Mon, 2005-08-15 at 11:06 +0300, Denis Vlasenko wrote: > > > > +static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > > > +{ > > > + if (n != 0 && size > INT_MAX / n) > > > + return NULL; > > > + return kzalloc(n * size, flags); > > > +} > > > Can you conditionalize it on __builtin_constant_p(n) ? > > Otherwise with variable n you have 3 oprations in inlined > > code, one of them a division. > > you can't conditionalize the security check really. If it's constant, > gcc will optimize it anyway, and if it's not constant you for sure want > the check there... Sure, I don't want to disable the check. I want that check to be in _non-inlined function_. static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) { if (__builtin_constant_p(n)) { if (n != 0 && size > INT_MAX / n) return NULL; return kzalloc(n * size, flags); } return kcalloc_helper(n, size, flags); } void *kcalloc_helper(size_t n, size_t size, unsigned int __nocast flags) { if (n != 0 && size > INT_MAX / n) return NULL; return kzalloc(n * size, flags); } -- vda ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 8:20 ` Denis Vlasenko @ 2005-08-15 8:28 ` Pekka J Enberg 2005-08-15 9:33 ` Denis Vlasenko 0 siblings, 1 reply; 13+ messages in thread From: Pekka J Enberg @ 2005-08-15 8:28 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Arjan van de Ven, Adrian Bunk, Andrew Morton, linux-kernel On Mon, 15 Aug 2005, Denis Vlasenko wrote: > Sure, I don't want to disable the check. I want that check to be > in _non-inlined function_. > > static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > { > if (__builtin_constant_p(n)) { > if (n != 0 && size > INT_MAX / n) > return NULL; > return kzalloc(n * size, flags); > } > return kcalloc_helper(n, size, flags); > } > > void *kcalloc_helper(size_t n, size_t size, unsigned int __nocast flags) > { > if (n != 0 && size > INT_MAX / n) > return NULL; > return kzalloc(n * size, flags); > } That's extra complexity and code duplication. How much do we shave off kernel text of allyesconfig with this? Please note that whenever the caller does proper bounds checking, GCC can optimize the security check away. Therefore, in practice, we don't spread around the extra operations so much, I think. Pekka ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 8:28 ` Pekka J Enberg @ 2005-08-15 9:33 ` Denis Vlasenko 2005-08-15 9:41 ` Arjan van de Ven 0 siblings, 1 reply; 13+ messages in thread From: Denis Vlasenko @ 2005-08-15 9:33 UTC (permalink / raw) To: Pekka J Enberg; +Cc: Arjan van de Ven, Adrian Bunk, Andrew Morton, linux-kernel > > static inline void *kcalloc(size_t n, size_t size, unsigned int __nocast flags) > > { > > if (__builtin_constant_p(n)) { > > if (n != 0 && size > INT_MAX / n) > > return NULL; > > return kzalloc(n * size, flags); > > } > > return kcalloc_helper(n, size, flags); > > } > > > > void *kcalloc_helper(size_t n, size_t size, unsigned int __nocast flags) > > { > > if (n != 0 && size > INT_MAX / n) > > return NULL; > > return kzalloc(n * size, flags); > > } > > That's extra complexity and code duplication. How much do we shave off > kernel text of allyesconfig with this? > > Please note that whenever the caller does proper bounds checking, GCC can > optimize the security check away. Therefore, in practice, we don't spread > around the extra operations so much, I think. gcc can optimize that away with non-const n?! I don't think so. -- vda ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 9:33 ` Denis Vlasenko @ 2005-08-15 9:41 ` Arjan van de Ven 2005-08-15 12:17 ` Denis Vlasenko 0 siblings, 1 reply; 13+ messages in thread From: Arjan van de Ven @ 2005-08-15 9:41 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Pekka J Enberg, Adrian Bunk, Andrew Morton, linux-kernel On Mon, 2005-08-15 at 12:33 +0300, Denis Vlasenko wrote: > gcc can optimize that away with non-const n?! I don't think so. due to the wonders of "value range propagation" it actually can, if the check is done sufficiently before... gcc keeps track of the range a variable can have at each point, so if the check is done before, the "possible range" will be such that the divide should be optimizable. (Note: this is a relatively new feature in gcc though) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 9:41 ` Arjan van de Ven @ 2005-08-15 12:17 ` Denis Vlasenko 2005-08-15 12:25 ` Arjan van de Ven 0 siblings, 1 reply; 13+ messages in thread From: Denis Vlasenko @ 2005-08-15 12:17 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Pekka J Enberg, Adrian Bunk, Andrew Morton, linux-kernel On Monday 15 August 2005 12:41, Arjan van de Ven wrote: > On Mon, 2005-08-15 at 12:33 +0300, Denis Vlasenko wrote: > > > gcc can optimize that away with non-const n?! I don't think so. > > due to the wonders of "value range propagation" it actually can, if the > check is done sufficiently before... > > gcc keeps track of the range a variable can have at each point, so if > the check is done before, the "possible range" will be such that the > divide should be optimizable. > > (Note: this is a relatively new feature in gcc though) # gcc -v Using built-in specs. Target: i386-pc-linux-gnu Configured with: ../gcc-4.0.0.src/configure --prefix=/usr/app/gcc-4.0.0 --exec-prefix=/usr/app/gcc-4.0.0 --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/app/gcc-4.0.0/libexec --datadir=/usr/app/gcc-4.0.0/share --sysconfdir=/etc --sharedstatedir=/usr/app/gcc-4.0.0/var/com --localstatedir=/usr/app/gcc-4.0.0/var --libdir=/usr/lib --includedir=/usr/include --infodir=/usr/info --mandir=/usr/man --with-slibdir=/usr/app/gcc-4.0.0/lib --with-local-prefix=/usr/local --with-gxx-include-dir=/usr/app/gcc-4.0.0/include/g++-v3 --enable-languages=c,c++ --with-system-zlib --disable-nls --enable-threads=posix i386-pc-linux-gnu Thread model: posix gcc version 4.0.0 # cat t.c #include <stdlib.h> #include <limits.h> void *kzalloc(size_t size, unsigned int flags); static inline void *kcalloc(size_t n, size_t size, unsigned int flags) { if (n != 0 && size > INT_MAX / n) return NULL; return kzalloc(n * size, flags); } void* f(int n, int sz) { if (sz<1000) return kcalloc(n, sz, 1); return NULL; } # gcc -O2 -S -mcpu=i386 t.c `-mcpu=' is deprecated. Use `-mtune=' or '-march=' instead. # cat t.s .file "t.c" .text .p2align 2,,3 .globl f .type f, @function f: pushl %ebp movl %esp, %ebp pushl %ebx movl 8(%ebp), %ebx movl 12(%ebp), %ecx cmpl $999, %ecx jg .L2 testl %ebx, %ebx jne .L9 .L4: movl $1, 12(%ebp) imull %ebx, %ecx movl %ecx, 8(%ebp) popl %ebx leave jmp kzalloc .p2align 2,,3 .L9: movl $2147483647, %eax xorl %edx, %edx divl %ebx cmpl %eax, %ecx jbe .L4 .p2align 2,,3 .L2: xorl %eax, %eax popl %ebx leave ret .size f, .-f .ident "GCC: (GNU) 4.0.0" .section .note.GNU-stack,"",@progbits Seems like that optimization is not helping. Do you have better example? -- vda ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 12:17 ` Denis Vlasenko @ 2005-08-15 12:25 ` Arjan van de Ven 2005-08-15 13:06 ` Pekka J Enberg 0 siblings, 1 reply; 13+ messages in thread From: Arjan van de Ven @ 2005-08-15 12:25 UTC (permalink / raw) To: Denis Vlasenko; +Cc: Pekka J Enberg, Adrian Bunk, Andrew Morton, linux-kernel On Mon, 2005-08-15 at 15:17 +0300, Denis Vlasenko wrote: > Seems like that optimization is not helping. > Do you have better example? you need gcc 4.1 (eg CVS) for the value range propagation stuff. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 12:25 ` Arjan van de Ven @ 2005-08-15 13:06 ` Pekka J Enberg 2005-08-15 13:51 ` Alan Cox 2005-08-15 16:30 ` Adrian Bunk 0 siblings, 2 replies; 13+ messages in thread From: Pekka J Enberg @ 2005-08-15 13:06 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Denis Vlasenko, Adrian Bunk, Andrew Morton, linux-kernel On Mon, 2005-08-15 at 15:17 +0300, Denis Vlasenko wrote: > > Seems like that optimization is not helping. > > Do you have better example? On Mon, 15 Aug 2005, Arjan van de Ven wrote: > you need gcc 4.1 (eg CVS) for the value range propagation stuff. For Denis' example, it does not seem to help. I must admit I did not know GCC 3.x does not have this optimization. I am also bit confused as Adrian and I saw small reduction in kernel text with kcalloc() inlined. If GCC is, in fact, spreading the extra operations everywhere, shouldn't kernel text be bigger? Pekka penberg@haji ~/tmp $ ~/bin/gcc-4.1-cvs/bin/gcc -v Using built-in specs. Target: i686-pc-linux-gnu Configured with: ../gcc/configure --prefix=/home/penberg/bin/gcc-4.1-cvs --enable-languages=c Thread model: posix gcc version 4.1.0 20050815 (experimental) penberg@haji ~/tmp $ ~/bin/gcc-4.1-cvs/bin/gcc -O -S -mcpu=i386 t.c `-mcpu=' is deprecated. Use `-mtune=' or '-march=' instead. penberg@haji ~/tmp $ cat t.s .file "t.c" .text .globl f .type f, @function f: pushl %ebp movl %esp, %ebp pushl %ebx subl $4, %esp movl 12(%ebp), %eax cmpl $999, %eax jg .L2 movl %eax, %ebx movl 8(%ebp), %ecx testl %ecx, %ecx je .L4 movl $2147483647, %eax movl $0, %edx divl %ecx cmpl %eax, %ebx ja .L2 .L4: subl $8, %esp pushl $1 movl %ebx, %eax imull %ecx, %eax pushl %eax call kzalloc addl $16, %esp jmp .L6 .L2: movl $0, %eax .L6: movl -4(%ebp), %ebx leave ret .size f, .-f .ident "GCC: (GNU) 4.1.0 20050815 (experimental)" .section .note.GNU-stack,"",@progbits ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 13:06 ` Pekka J Enberg @ 2005-08-15 13:51 ` Alan Cox 2005-08-15 16:30 ` Adrian Bunk 1 sibling, 0 replies; 13+ messages in thread From: Alan Cox @ 2005-08-15 13:51 UTC (permalink / raw) To: Pekka J Enberg Cc: Arjan van de Ven, Denis Vlasenko, Adrian Bunk, Andrew Morton, linux-kernel On Llu, 2005-08-15 at 16:06 +0300, Pekka J Enberg wrote: > and I saw small reduction in kernel text with kcalloc() inlined. If GCC > is, in fact, spreading the extra operations everywhere, shouldn't kernel > text be bigger? Only if the cost of the function call in lines of code is higher than the inline code. That includes the less obvious cost for the fact that its a synchronization point so you can't do certain optimisations through a function call (the function you are calling may read/write any global/static variable or pointer target). Alan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [-mm patch] make kcalloc() a static inline 2005-08-15 13:06 ` Pekka J Enberg 2005-08-15 13:51 ` Alan Cox @ 2005-08-15 16:30 ` Adrian Bunk 1 sibling, 0 replies; 13+ messages in thread From: Adrian Bunk @ 2005-08-15 16:30 UTC (permalink / raw) To: Pekka J Enberg Cc: Arjan van de Ven, Denis Vlasenko, Andrew Morton, linux-kernel On Mon, Aug 15, 2005 at 04:06:22PM +0300, Pekka J Enberg wrote: > On Mon, 2005-08-15 at 15:17 +0300, Denis Vlasenko wrote: > > > Seems like that optimization is not helping. > > > Do you have better example? > > On Mon, 15 Aug 2005, Arjan van de Ven wrote: > > you need gcc 4.1 (eg CVS) for the value range propagation stuff. > > For Denis' example, it does not seem to help. I must admit I did not know > GCC 3.x does not have this optimization. I am also bit confused as Adrian > and I saw small reduction in kernel text with kcalloc() inlined. If GCC > is, in fact, spreading the extra operations everywhere, shouldn't kernel > text be bigger? Denis' example is a case where gcc might not produce the best code possible, but anyway his example isn't the typical in-kernel example. In most in-kernel cases the first two arguments of kcalloc() are constant at compile-time - and in these cases all gcc versions supported by the kernel are able to optimize away the checks. > Pekka >... cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2005-08-15 16:30 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-08-08 22:38 [-mm patch] make kcalloc() a static inline Adrian Bunk 2005-08-09 5:09 ` Pekka J Enberg 2005-08-15 8:06 ` [-mm patch] " Denis Vlasenko 2005-08-15 8:14 ` Arjan van de Ven 2005-08-15 8:20 ` Denis Vlasenko 2005-08-15 8:28 ` Pekka J Enberg 2005-08-15 9:33 ` Denis Vlasenko 2005-08-15 9:41 ` Arjan van de Ven 2005-08-15 12:17 ` Denis Vlasenko 2005-08-15 12:25 ` Arjan van de Ven 2005-08-15 13:06 ` Pekka J Enberg 2005-08-15 13:51 ` Alan Cox 2005-08-15 16:30 ` Adrian Bunk
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox