[PATCH] no need to check for NULL before calling kfree()

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] no need to check for NULL before calling kfree() - fs/ext2/
@ 2005-03-25 22:08 Jesper Juhl
  2005-03-25 22:29 ` linux-os
  0 siblings, 1 reply; 40+ messages in thread
From: Jesper Juhl @ 2005-03-25 22:08 UTC (permalink / raw)
  To: ext2-devel; +Cc: linux-kernel

(please keep me on CC)


kfree() handles NULL fine, to check is redundant.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>

--- linux-2.6.12-rc1-mm3-orig/fs/ext2/acl.c	2005-03-02 08:38:18.000000000 +0100
+++ linux-2.6.12-rc1-mm3/fs/ext2/acl.c	2005-03-25 22:41:07.000000000 +0100
@@ -194,8 +194,7 @@ ext2_get_acl(struct inode *inode, int ty
 		acl = NULL;
 	else
 		acl = ERR_PTR(retval);
-	if (value)
-		kfree(value);
+	kfree(value);
 
 	if (!IS_ERR(acl)) {
 		switch(type) {
@@ -262,8 +261,7 @@ ext2_set_acl(struct inode *inode, int ty
 
 	error = ext2_xattr_set(inode, name_index, "", value, size, 0);
 
-	if (value)
-		kfree(value);
+	kfree(value);
 	if (!error) {
 		switch(type) {
 			case ACL_TYPE_ACCESS:



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() - fs/ext2/
  2005-03-25 22:08 [PATCH] no need to check for NULL before calling kfree() - fs/ext2/ Jesper Juhl
@ 2005-03-25 22:29 ` linux-os
  2005-03-25 22:44   ` Jesper Juhl
                     ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: linux-os @ 2005-03-25 22:29 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: ext2-devel, linux-kernel


Isn't it expensive of CPU time to call kfree() even though the
pointer may have already been freed? I suggest that the check
for a NULL before the call is much less expensive than calling
kfree() and doing the check there. The resulting "double check"
is cheap, compared to the call.

On Fri, 25 Mar 2005, Jesper Juhl wrote:

> (please keep me on CC)
>
>
> kfree() handles NULL fine, to check is redundant.
>
> Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
>
> --- linux-2.6.12-rc1-mm3-orig/fs/ext2/acl.c	2005-03-02 08:38:18.000000000 +0100
> +++ linux-2.6.12-rc1-mm3/fs/ext2/acl.c	2005-03-25 22:41:07.000000000 +0100
> @@ -194,8 +194,7 @@ ext2_get_acl(struct inode *inode, int ty
> 		acl = NULL;
> 	else
> 		acl = ERR_PTR(retval);
> -	if (value)
> -		kfree(value);
> +	kfree(value);
>
> 	if (!IS_ERR(acl)) {
> 		switch(type) {
> @@ -262,8 +261,7 @@ ext2_set_acl(struct inode *inode, int ty
>
> 	error = ext2_xattr_set(inode, name_index, "", value, size, 0);
>
> -	if (value)
> -		kfree(value);
> +	kfree(value);
> 	if (!error) {
> 		switch(type) {
> 			case ACL_TYPE_ACCESS:
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() - fs/ext2/
  2005-03-25 22:29 ` linux-os
@ 2005-03-25 22:44   ` Jesper Juhl
  2005-03-26  7:50   ` Pekka Enberg
  2005-03-26  8:32   ` Arjan van de Ven
  2 siblings, 0 replies; 40+ messages in thread
From: Jesper Juhl @ 2005-03-25 22:44 UTC (permalink / raw)
  To: linux-os; +Cc: Jesper Juhl, ext2-devel, linux-kernel

On Fri, 25 Mar 2005, linux-os wrote:

> 
> Isn't it expensive of CPU time to call kfree() even though the
> pointer may have already been freed? I suggest that the check
> for a NULL before the call is much less expensive than calling
> kfree() and doing the check there. The resulting "double check"
> is cheap, compared to the call.
> 
I've been looking at some of the actual code gcc generates for those 
checks, and it's quite bloated. My guess is that the reduced memory 
footprint, one less branch, and the fact that kfree is probably already in 
cache (since it's called often all over the place) outweighs the cost of a 
function call - especially in the cases where the pointer is rarely NULL 
and we'll end up doing the call in any case.
And the reduced use of screen real-estate is nice as well :)

-- 
Jesper juhl



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() - fs/ext2/
  2005-03-25 22:29 ` linux-os
  2005-03-25 22:44   ` Jesper Juhl
@ 2005-03-26  7:50   ` Pekka Enberg
  2005-03-26  8:32   ` Arjan van de Ven
  2 siblings, 0 replies; 40+ messages in thread
From: Pekka Enberg @ 2005-03-26  7:50 UTC (permalink / raw)
  To: linux-os; +Cc: Jesper Juhl, ext2-devel, linux-kernel, penberg

Hi,

On Fri, 25 Mar 2005 17:29:56 -0500 (EST), linux-os
<linux-os@analogic.com> wrote:
> Isn't it expensive of CPU time to call kfree() even though the
> pointer may have already been freed? I suggest that the check
> for a NULL before the call is much less expensive than calling
> kfree() and doing the check there. The resulting "double check"
> is cheap, compared to the call.

Resource release paths are usually not performance critical. However,
if removing the redundant checks introduce a _measurable_ regressions
in terms of performance, we can make kfree() inline which will take
care of it.

                           Pekka

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() - fs/ext2/
  2005-03-25 22:29 ` linux-os
  2005-03-25 22:44   ` Jesper Juhl
  2005-03-26  7:50   ` Pekka Enberg
@ 2005-03-26  8:32   ` Arjan van de Ven
  2005-03-26 23:21     ` [PATCH] no need to check for NULL before calling kfree() -fs/ext2/ linux-os
  2 siblings, 1 reply; 40+ messages in thread
From: Arjan van de Ven @ 2005-03-26  8:32 UTC (permalink / raw)
  To: linux-os; +Cc: Jesper Juhl, ext2-devel, linux-kernel

On Fri, 2005-03-25 at 17:29 -0500, linux-os wrote:
> Isn't it expensive of CPU time to call kfree() even though the
> pointer may have already been freed? 

nope

a call instruction is effectively half a cycle or less, the branch
predictor of the cpu can predict perfectly where the next instruction is
from. The extra if() you do in front is a different matter, that can
easily cost 100 cycles+. (And those are redundant cycles because kfree
will do the if again anyway). So what you propose is to spend 100+
cycles to save half a cycle. Not a good tradeoff ;)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26  8:32   ` Arjan van de Ven
@ 2005-03-26 23:21     ` linux-os
  2005-03-26 23:34       ` Marcin Dalecki
                         ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: linux-os @ 2005-03-26 23:21 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Jesper Juhl, ext2-devel, Linux kernel

On Sat, 26 Mar 2005, Arjan van de Ven wrote:

> On Fri, 2005-03-25 at 17:29 -0500, linux-os wrote:
>> Isn't it expensive of CPU time to call kfree() even though the
>> pointer may have already been freed?
>
> nope
>
> a call instruction is effectively half a cycle or less, the branch

Wrong!

> predictor of the cpu can predict perfectly where the next instruction is
> from. The extra if() you do in front is a different matter, that can
> easily cost 100 cycles+. (And those are redundant cycles because kfree
> will do the if again anyway). So what you propose is to spend 100+
> cycles to save half a cycle. Not a good tradeoff ;)
>

Wrong!

Pure unmitigated bull-shit. I measure (with hardware devices)
the execution time of real code in modern CPUs. I do this for
a living so you don't have to stand in line for a couple of
hours to have your baggage scanned at the airport.

Always, always, a call will be more expensive than a branch
on condition. It's impossible to be otherwise. A call requires
that the return address be written to memory (the stack),
using register indirection (the stack-pointer).

If somebody said; "I think that the code will look better
and the few cycles lost will not be a consequence with modern
CPUs...", then there is a point. But coming up with this
disingenuous bullshit is something else.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:21     ` [PATCH] no need to check for NULL before calling kfree() -fs/ext2/ linux-os
@ 2005-03-26 23:34       ` Marcin Dalecki
  2005-03-27  2:00         ` Horst von Brand
  2005-03-27 22:12         ` linux-os
  2005-03-26 23:54       ` Jesper Juhl
  2005-03-27  8:45       ` Arjan van de Ven
  2 siblings, 2 replies; 40+ messages in thread
From: Marcin Dalecki @ 2005-03-26 23:34 UTC (permalink / raw)
  To: linux-os; +Cc: ext2-devel, Linux kernel, Arjan van de Ven, Jesper Juhl


On 2005-03-27, at 00:21, linux-os wrote:
>
> Always, always, a call will be more expensive than a branch
> on condition. It's impossible to be otherwise. A call requires
> that the return address be written to memory (the stack),
> using register indirection (the stack-pointer).
>
Needless to say that there are enough architectures out there, which 
don't even
have something like an explicit call as separate assembler 
instruction...


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:34       ` Marcin Dalecki
@ 2005-03-27  2:00         ` Horst von Brand
  2005-03-27  3:18           ` Marcin Dalecki
  2005-03-27 22:12         ` linux-os
  1 sibling, 1 reply; 40+ messages in thread
From: Horst von Brand @ 2005-03-27  2:00 UTC (permalink / raw)
  To: Marcin Dalecki
  Cc: linux-os, ext2-devel, Linux kernel, Arjan van de Ven, Jesper Juhl

Marcin Dalecki <martin@dalecki.de> said:
> On 2005-03-27, at 00:21, linux-os wrote:
> > Always, always, a call will be more expensive than a branch
> > on condition.

Wrong.

> >               It's impossible to be otherwise.

Many, many counterexamples say otherwise...

> >                                                A call requires
> > that the return address be written to memory (the stack),

Not necesarily right now, it can be done at leisure later on while doing
other stuff.

> > using register indirection (the stack-pointer).

So what? The stack pointer is surely special. Modern programming languages
(and programming styles) encourage many calls, so this is very heavily
optimized.

> Needless to say that there are enough architectures out there, which
> don't even have something like an explicit call as separate assembler
> instruction...

The mechanism exists somehow.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27  2:00         ` Horst von Brand
@ 2005-03-27  3:18           ` Marcin Dalecki
  0 siblings, 0 replies; 40+ messages in thread
From: Marcin Dalecki @ 2005-03-27  3:18 UTC (permalink / raw)
  To: Horst von Brand
  Cc: Arjan van de Ven, Linux kernel, linux-os, ext2-devel, Jesper Juhl

On 2005-03-27, at 04:00, Horst von Brand wrote:
>
>> Needless to say that there are enough architectures out there, which
>> don't even have something like an explicit call as separate assembler
>> instruction...
>
> The mechanism exists somehow.

Most RISC architectures are claiming a huge register set advantage over 
IA32.
However in reality it's normal that:

1. Some of the register take roles as declared by the ABI. One is stack 
one
is basis pointer and so no.
2. Only a subset of register is declared to be guaranteed to be 
preserved by
system calls.

Thus the mechanisms are simple calling conventions.

Compilers can frequently see what a subroutine does and can flatten out 
the cost
of function calls to something very much resembling just two jumps 
instead of
a single jump around a condition.

On the other hand most modern IA32 implementation (since cyrix 486) are 
very
efficient at mapping stack operations to a special cache between the 
CPU and
L1 cache. I could even imagine them to be more efficient then plain 
jumps, which
simply don't carry the same information for cache prefetch and branch 
predition.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:34       ` Marcin Dalecki
  2005-03-27  2:00         ` Horst von Brand
@ 2005-03-27 22:12         ` linux-os
  1 sibling, 0 replies; 40+ messages in thread
From: linux-os @ 2005-03-27 22:12 UTC (permalink / raw)
  To: Marcin Dalecki; +Cc: ext2-devel, Linux kernel, Arjan van de Ven, Jesper Juhl

On Sat, 26 Mar 2005, Marcin Dalecki wrote:

>
> On 2005-03-27, at 00:21, linux-os wrote:
>>
>> Always, always, a call will be more expensive than a branch
>> on condition. It's impossible to be otherwise. A call requires
>> that the return address be written to memory (the stack),
>> using register indirection (the stack-pointer).
>>
> Needless to say that there are enough architectures out there, which
> don't even
> have something like an explicit call as separate assembler
> instruction...
>

Yes, they break the 'call' into seperate expensive operations like
loading the IP address that will exist after the call into a register
storing that in a dedicated register, used as a "stack", then
branching to the called procedure with another indirection, etc.

>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:21     ` [PATCH] no need to check for NULL before calling kfree() -fs/ext2/ linux-os
  2005-03-26 23:34       ` Marcin Dalecki
@ 2005-03-26 23:54       ` Jesper Juhl
  2005-03-27  0:05         ` Lee Revell
  2005-03-27  8:45       ` Arjan van de Ven
  2 siblings, 1 reply; 40+ messages in thread
From: Jesper Juhl @ 2005-03-26 23:54 UTC (permalink / raw)
  To: linux-os; +Cc: Arjan van de Ven, Jesper Juhl, ext2-devel, Linux kernel

On Sat, 26 Mar 2005, linux-os wrote:

> On Sat, 26 Mar 2005, Arjan van de Ven wrote:
> 
> > On Fri, 2005-03-25 at 17:29 -0500, linux-os wrote:
> > > Isn't it expensive of CPU time to call kfree() even though the
> > > pointer may have already been freed?
> > 
> > nope
> > 
> > a call instruction is effectively half a cycle or less, the branch
> 
> Wrong!
> 
> > predictor of the cpu can predict perfectly where the next instruction is
> > from. The extra if() you do in front is a different matter, that can
> > easily cost 100 cycles+. (And those are redundant cycles because kfree
> > will do the if again anyway). So what you propose is to spend 100+
> > cycles to save half a cycle. Not a good tradeoff ;)
> > 
> 
> Wrong!
> 
[snip]
>
> Always, always, a call will be more expensive than a branch
> on condition. It's impossible to be otherwise. A call requires
> that the return address be written to memory (the stack),
> using register indirection (the stack-pointer).
> 
> If somebody said; "I think that the code will look better
> and the few cycles lost will not be a consequence with modern
> CPUs...", then there is a point. But coming up with this
> disingenuous bullshit is something else.
> 

I tried to create a test to see what the actual impact of this sort of 
change is, the result I reached is below (as well as the code used to 
obtain the numbers): 


Each test is run 10000000 times, and the number of jiffies spent doing the
        kfree();
or
        if (p)
                kfree(p);
is meassured. Total number of jiffies used for that for all 10000000 runs
is reported.

test 0:
        Pointer is NULL half the time, value returned by kmalloc half the
        time.
        kfree() is called on the pointer without checking for NULL first.

test 1:
        Pointer is NULL half the time, value returned by kmalloc half the
        time.
        The pointer is checked for NULL and kfree() is called on the
        pointer only if it is != NULL.

test 2:
        Pointer is NULL the majority of the time, only in 1 out of 50
        cases is it assigned a real value by kmalloc().
        kfree() is called on the pointer without checking for NULL first.

test 3:
        Pointer is NULL the majority of the time, only in 1 out of 50
        cases is it assigned a real value by kmalloc().
        The pointer is checked for NULL and kfree() is called on the
        pointer only if it is != NULL.

test 4:
        Pointer is rarely NULL - only in 1 out of 50 cases.
        kfree() is called on the pointer without checking for NULL first.

test 5:
        Pointer is rarely NULL - only in 1 out of 50 cases.
        The pointer is checked for NULL and kfree() is called on the
        pointer only if it is != NULL.


Here are the numbers from 5 runs on my box - the numbers naturally
differ a bit between each run, but they are quite similar each time :

[ 1395.059375] test 0 used up 235 kfree related jiffies
[ 1395.059385] test 1 used up 195 kfree related jiffies
[ 1395.059389] test 2 used up 66 kfree related jiffies
[ 1395.059392] test 3 used up 20 kfree related jiffies
[ 1395.059395] test 4 used up 366 kfree related jiffies
[ 1395.059398] test 5 used up 428 kfree related jiffies

[ 1412.994705] test 0 used up 231 kfree related jiffies
[ 1412.994744] test 1 used up 209 kfree related jiffies
[ 1412.994748] test 2 used up 68 kfree related jiffies
[ 1412.994751] test 3 used up 12 kfree related jiffies
[ 1412.994754] test 4 used up 362 kfree related jiffies
[ 1412.994757] test 5 used up 392 kfree related jiffies

[ 1423.734356] test 0 used up 245 kfree related jiffies
[ 1423.734366] test 1 used up 179 kfree related jiffies
[ 1423.734370] test 2 used up 78 kfree related jiffies
[ 1423.734373] test 3 used up 30 kfree related jiffies
[ 1423.734376] test 4 used up 384 kfree related jiffies
[ 1423.734379] test 5 used up 385 kfree related jiffies

[ 1434.390194] test 0 used up 242 kfree related jiffies
[ 1434.390203] test 1 used up 179 kfree related jiffies
[ 1434.390207] test 2 used up 70 kfree related jiffies
[ 1434.390210] test 3 used up 16 kfree related jiffies
[ 1434.390214] test 4 used up 365 kfree related jiffies
[ 1434.390217] test 5 used up 397 kfree related jiffies

[ 1446.529856] test 0 used up 231 kfree related jiffies
[ 1446.530046] test 1 used up 232 kfree related jiffies
[ 1446.530117] test 2 used up 79 kfree related jiffies
[ 1446.530211] test 3 used up 16 kfree related jiffies
[ 1446.530278] test 4 used up 360 kfree related jiffies
[ 1446.530362] test 5 used up 412 kfree related jiffies

The conclusions I draw from those numbers are that when NULL pointers are
rare (tests 4 & 5) then it pays off to not have the if() check. When NULL
pointers are common, then there's a small bennefit to having the if() 
check, but we are talking ~50 jiffies (or less) over 10 million runs pr 
test, which is pretty insignificant unless the code is in a very hot path.
When pointers are NULL 50% of the time there's a bennefit to the if(), but 
it's small.
So, unless the code is extremely performance critical *and* the pointer
is NULL more often than not, having the if(pointer != NULL) check before
calling kfree() is pointless and may even be degrading performance if the
pointer is most commonly != NULL.  I'd say that the general rule should
be "don't check for NULL first unless you *know* the pointer will be NULL
>50% of the time"... 
I ran these tests on a 1.4GHz AMD Athlon (T-bird), and with a HZ setting
of 1000.

Am I drawing flawed conclusions here?

If someone could check the sanity of my code used to obtain these numbers
(below), then I'd appreciate it - if the numbers are wrong, then any
conclusion is also wrong of course.


Here's the tiny module I wrote to get the numbers above : 


#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/slab.h>

#define NR_TESTS	10000000

void do_work(void *data);

DECLARE_WORK(work, do_work, NULL);

static int test_time[] = {0, 0, 0, 0, 0, 0};

void do_work(void *data)
{
	unsigned long j;
	static int what_test = 0;
	unsigned long start;
	void *tmp;

	switch (what_test) {
		case 0:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				kfree(tmp);
				test_time[0] += jiffies - start;
			}
			break;
		case 1:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[1] += jiffies - start;
			}
			break;
		case 2:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				kfree(tmp);
				test_time[2] += jiffies - start;
			}
			break;
		case 3:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[3] += jiffies - start;
			}
			break;
		case 4:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				kfree(tmp);
				test_time[4] += jiffies - start;
			}
			break;
		case 5:
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[5] += jiffies - start;
			}
			break;
		default:
			break;
	}
	printk(KERN_ALERT "test %d done.\n", what_test);
	
	if (what_test < 5)
		schedule_delayed_work(&work, 1);
	else
		printk(KERN_ALERT "All tests done...\n");

	what_test++;
}


static int kfreetest_init(void)
{
	schedule_work(&work);
	return 0;
}

static void kfreetest_exit(void)
{
	int i;

	cancel_delayed_work(&work);
	flush_scheduled_work();
	for (i = 0; i < 6; i++)
		printk(KERN_ALERT "test %d used up %d kfree related jiffies\n", i, test_time[i]);
}


module_init(kfreetest_init);
module_exit(kfreetest_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Jesper Juhl");



-- 
Jesper Juhl <juhl-lkml@dif.dk>



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:54       ` Jesper Juhl
@ 2005-03-27  0:05         ` Lee Revell
  2005-03-27 10:55           ` Jesper Juhl
  0 siblings, 1 reply; 40+ messages in thread
From: Lee Revell @ 2005-03-27  0:05 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: linux-os, Arjan van de Ven, ext2-devel, Linux kernel

On Sun, 2005-03-27 at 00:54 +0100, Jesper Juhl wrote:
> I'd say that the general rule should
> be "don't check for NULL first unless you *know* the pointer will be NULL
> >50% of the time"... 

How about running the same tests but using likely()/unlikely() for the
'1 in 50' cases?

Lee


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27  0:05         ` Lee Revell
@ 2005-03-27 10:55           ` Jesper Juhl
  2005-03-27 14:56             ` Paul Jackson
  0 siblings, 1 reply; 40+ messages in thread
From: Jesper Juhl @ 2005-03-27 10:55 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jesper Juhl, linux-os, Arjan van de Ven, ext2-devel, Linux kernel

On Sat, 26 Mar 2005, Lee Revell wrote:

> On Sun, 2005-03-27 at 00:54 +0100, Jesper Juhl wrote:
> > I'd say that the general rule should
> > be "don't check for NULL first unless you *know* the pointer will be NULL
> > >50% of the time"... 
> 
> How about running the same tests but using likely()/unlikely() for the
> '1 in 50' cases?
> 
I added likely() and unlikely() to all tests, here are the results from 3 
runs on my box : 


[ 4379.223858] starting test : 50/50 NULL pointers, kfree(p)
[ 4379.785541] Test done. This test used up 240 kfree related jiffies
[ 4379.785543] -------------------------
[ 4379.884863] starting test : 50/50 NULL pointers, if(p) kfree(p)
[ 4380.609537] Test done. This test used up 221 kfree related jiffies
[ 4380.609539] -------------------------
[ 4380.709285] starting test : 50/50 NULL pointers, if(likely(p)) kfree(p)
[ 4381.241958] Test done. This test used up 237 kfree related jiffies
[ 4381.241961] -------------------------
[ 4381.341843] starting test : 50/50 NULL pointers, if(unlikely(p)) kfree(p)
[ 4381.874492] Test done. This test used up 261 kfree related jiffies
[ 4381.874495] -------------------------
[ 4381.974396] starting test : 49 out of 50 pointers == NULL, kfree(p)
[ 4382.239784] Test done. This test used up 87 kfree related jiffies
[ 4382.239787] -------------------------
[ 4382.339138] starting test : 49 out of 50 pointers == NULL, if(p) kfree(p)
[ 4382.519165] Test done. This test used up 22 kfree related jiffies
[ 4382.519167] -------------------------
[ 4382.618944] starting test : 49 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4382.798832] Test done. This test used up 18 kfree related jiffies
[ 4382.798834] -------------------------
[ 4382.898746] starting test : 49 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4383.079062] Test done. This test used up 26 kfree related jiffies
[ 4383.079065] -------------------------
[ 4383.178549] starting test : 1 out of 50 pointers == NULL, kfree(p)
[ 4384.187707] Test done. This test used up 365 kfree related jiffies
[ 4384.187710] -------------------------
[ 4384.286769] starting test : 1 out of 50 pointers == NULL, if(p) kfree(p)
[ 4385.351731] Test done. This test used up 438 kfree related jiffies
[ 4385.351733] -------------------------
[ 4385.450951] starting test : 1 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4386.480408] Test done. This test used up 378 kfree related jiffies
[ 4386.480410] -------------------------
[ 4386.580161] starting test : 1 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4387.630779] Test done. This test used up 432 kfree related jiffies
[ 4387.630782] -------------------------


[ 4614.027356] starting test : 50/50 NULL pointers, kfree(p)
[ 4614.589174] Test done. This test used up 258 kfree related jiffies
[ 4614.589177] -------------------------
[ 4614.688741] starting test : 50/50 NULL pointers, if(p) kfree(p)
[ 4615.409793] Test done. This test used up 252 kfree related jiffies
[ 4615.409795] -------------------------
[ 4615.509165] starting test : 50/50 NULL pointers, if(likely(p)) kfree(p)
[ 4616.041816] Test done. This test used up 200 kfree related jiffies
[ 4616.041818] -------------------------
[ 4616.141720] starting test : 50/50 NULL pointers, if(unlikely(p)) kfree(p)
[ 4616.678002] Test done. This test used up 223 kfree related jiffies
[ 4616.678005] -------------------------
[ 4616.777275] starting test : 49 out of 50 pointers == NULL, kfree(p)
[ 4617.042512] Test done. This test used up 91 kfree related jiffies
[ 4617.042514] -------------------------
[ 4617.142017] starting test : 49 out of 50 pointers == NULL, if(p) kfree(p)
[ 4617.322044] Test done. This test used up 24 kfree related jiffies
[ 4617.322047] -------------------------
[ 4617.421820] starting test : 49 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4617.601710] Test done. This test used up 29 kfree related jiffies
[ 4617.601713] -------------------------
[ 4617.701625] starting test : 49 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4617.882083] Test done. This test used up 27 kfree related jiffies
[ 4617.882085] -------------------------
[ 4617.981427] starting test : 1 out of 50 pointers == NULL, kfree(p)
[ 4618.990599] Test done. This test used up 355 kfree related jiffies
[ 4618.990601] -------------------------
[ 4619.089646] starting test : 1 out of 50 pointers == NULL, if(p) kfree(p)
[ 4620.154737] Test done. This test used up 388 kfree related jiffies
[ 4620.154740] -------------------------
[ 4620.253829] starting test : 1 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4621.283279] Test done. This test used up 372 kfree related jiffies
[ 4621.283282] -------------------------
[ 4621.383035] starting test : 1 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4622.442580] Test done. This test used up 468 kfree related jiffies
[ 4622.442583] -------------------------


[ 4673.948568] starting test : 50/50 NULL pointers, kfree(p)
[ 4674.513874] Test done. This test used up 257 kfree related jiffies
[ 4674.513877] -------------------------
[ 4674.613603] starting test : 50/50 NULL pointers, if(p) kfree(p)
[ 4675.338429] Test done. This test used up 256 kfree related jiffies
[ 4675.338432] -------------------------
[ 4675.438022] starting test : 50/50 NULL pointers, if(likely(p)) kfree(p)
[ 4675.970685] Test done. This test used up 209 kfree related jiffies
[ 4675.970687] -------------------------
[ 4676.070575] starting test : 50/50 NULL pointers, if(unlikely(p)) kfree(p)
[ 4676.603217] Test done. This test used up 233 kfree related jiffies
[ 4676.603219] -------------------------
[ 4676.703132] starting test : 49 out of 50 pointers == NULL, kfree(p)
[ 4676.968502] Test done. This test used up 82 kfree related jiffies
[ 4676.968504] -------------------------
[ 4677.067877] starting test : 49 out of 50 pointers == NULL, if(p) kfree(p)
[ 4677.247945] Test done. This test used up 17 kfree related jiffies
[ 4677.247948] -------------------------
[ 4677.347675] starting test : 49 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4677.527564] Test done. This test used up 29 kfree related jiffies
[ 4677.527566] -------------------------
[ 4677.627480] starting test : 49 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4677.807935] Test done. This test used up 19 kfree related jiffies
[ 4677.807937] -------------------------
[ 4677.907282] starting test : 1 out of 50 pointers == NULL, kfree(p)
[ 4678.916434] Test done. This test used up 404 kfree related jiffies
[ 4678.916437] -------------------------
[ 4679.015503] starting test : 1 out of 50 pointers == NULL, if(p) kfree(p)
[ 4680.087547] Test done. This test used up 423 kfree related jiffies
[ 4680.087549] -------------------------
[ 4680.186681] starting test : 1 out of 50 pointers == NULL, if(likely(p)) kfree(p)
[ 4681.209136] Test done. This test used up 373 kfree related jiffies
[ 4681.209139] -------------------------
[ 4681.308891] starting test : 1 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)
[ 4682.366623] Test done. This test used up 449 kfree related jiffies
[ 4682.366625] -------------------------


And here's the source for the module that generated the above : 


#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/compiler.h>
#include <linux/workqueue.h>

#define NR_TESTS	10000000

void do_kfreetest_work(void *data);

DECLARE_WORK(kfree_work, do_kfreetest_work, NULL);

static int test_time[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

void do_kfreetest_work(void *data)
{
	unsigned long j;
	static int what_test = 0;
	unsigned long start;
	void *tmp;

	printk(KERN_ALERT "starting test : ");
	switch (what_test) {
		case 0:
			printk("50/50 NULL pointers, kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 1:
			printk("50/50 NULL pointers, if(p) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 2:
			printk("50/50 NULL pointers, if(likely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (likely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 3:
			printk("50/50 NULL pointers, if(unlikely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%2 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (unlikely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 4:
			printk("49 out of 50 pointers == NULL, kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 5:
			printk("49 out of 50 pointers == NULL, if(p) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 6:
			printk("49 out of 50 pointers == NULL, if(likely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (likely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 7:
			printk("49 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = kmalloc(1, GFP_KERNEL);
				else
					tmp = NULL;
				start = jiffies;
				if (unlikely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 8:
			printk("1 out of 50 pointers == NULL, kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 9:
			printk("1 out of 50 pointers == NULL, if(p) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				if (tmp)
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 10:
			printk("1 out of 50 pointers == NULL, if(likely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				if (likely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		case 11:
			printk("1 out of 50 pointers == NULL, if(unlikely(p)) kfree(p)\n");
			for (j = 0; j < NR_TESTS; j++) {
				if (j%50 == 0)
					tmp = NULL;
				else
					tmp = kmalloc(1, GFP_KERNEL);
				start = jiffies;
				if (unlikely(tmp))
					kfree(tmp);
				test_time[what_test] += jiffies - start;
			}
			break;
		default:
			printk(KERN_ALERT "hit default\n");
			break;
	}
	printk(KERN_ALERT "Test done. This test used up %d kfree related jiffies\n-------------------------\n", test_time[what_test]);
	
	if (what_test < 11)
		schedule_delayed_work(&kfree_work, 100);
	else
		printk(KERN_ALERT "All tests done.....\n-----------------------------------\n");

	what_test++;
}

static int kfreetest_init(void)
{
	schedule_work(&kfree_work);
	return 0;
}

static void kfreetest_exit(void)
{
	cancel_delayed_work(&kfree_work);
	flush_scheduled_work();
}

module_init(kfreetest_init);
module_exit(kfreetest_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Jesper Juhl");




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 10:55           ` Jesper Juhl
@ 2005-03-27 14:56             ` Paul Jackson
  2005-03-27 15:12               ` Jan Engelhardt
  0 siblings, 1 reply; 40+ messages in thread
From: Paul Jackson @ 2005-03-27 14:56 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: rlrevell, juhl-lkml, linux-os, arjan, ext2-devel, linux-kernel

> I added likely() and unlikely() to all tests, here are the results from 3 
> runs on my box : 

Any chance you could summarize what these results are, for those
of us too lazy to parse it all out?  The time spent by one author
to summarize in English what the numbers state can save the time of
a hundred readers each individually having to parse the numbers.

Just looking at the third run, it seems to me that "if (likely(p))
kfree(p);" beats a naked "kfree(p);" everytime, whether p is half
NULL's, or very few NULL's, or almost all NULL's.

If I'm reading this right, and if these results are valid, then we are
going about this optimization all wrong, at least if your CPU is an
AMD Athlon (T-bird). Weird.  Instead of stripping the "if (p)" test, we
should be changing it to "if (likely(p))", regardless of whether it
is very likely, or unlikely, or in between.  That is not what I would
call intuitive.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 14:56             ` Paul Jackson
@ 2005-03-27 15:12               ` Jan Engelhardt
  2005-03-27 17:40                 ` Dave Jones
  2005-03-28  4:07                 ` [PATCH] " Paul Jackson
  0 siblings, 2 replies; 40+ messages in thread
From: Jan Engelhardt @ 2005-03-27 15:12 UTC (permalink / raw)
  To: Linux Kernel Mailing List

>Just looking at the third run, it seems to me that "if (likely(p))
>kfree(p);" beats a naked "kfree(p);" everytime, whether p is half
>NULL's, or very few NULL's, or almost all NULL's.

Well, kfree inlined was already mentioned but forgotten again.
What if this was used:

inline static void kfree_WRAP(void *addr) {
    if(likely(addr != NULL)) {
        kfree_real(addr);
    }
    return;
}

And remove the NULL-test in kfree_real()? Then we would have:

  test eax, eax
  jz afterwards;
  <some more stuff for call>
  call kfree_real;
.afterwards:
  <continue execution>

The two cases then:
ptr==NULL: test-jmp
ptr!=NULL: test-call(freeit-return)

Looks like the least expensive way to me.


Jan Engelhardt
-- 
No TOFU for me, please.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 15:12               ` Jan Engelhardt
@ 2005-03-27 17:40                 ` Dave Jones
  2005-03-27 18:17                   ` Jan Engelhardt
                                     ` (6 more replies)
  2005-03-28  4:07                 ` [PATCH] " Paul Jackson
  1 sibling, 7 replies; 40+ messages in thread
From: Dave Jones @ 2005-03-27 17:40 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Linux Kernel Mailing List

On Sun, Mar 27, 2005 at 05:12:58PM +0200, Jan Engelhardt wrote:

 > Well, kfree inlined was already mentioned but forgotten again.
 > What if this was used:
 > 
 > inline static void kfree_WRAP(void *addr) {
 >     if(likely(addr != NULL)) {
 >         kfree_real(addr);
 >     }
 >     return;
 > }
 > 
 > And remove the NULL-test in kfree_real()? Then we would have:

Am I the only person who is completely fascinated by the
effort being spent here micro-optimising something thats
almost never in a path that needs optimising ?
I'd be amazed if any of this masturbation showed the tiniest
blip on a real workload, or even on a benchmark other than
one crafted specifically to test kfree in a loop.

That each occurance of this 'optimisation' also saves a handful
of bytes in generated code is it's only real benefit afaics.
Even then, if a functions cache performance is better off because
we trimmed a few bytes from the tail of a function, I'd be
completely amazed.

I guess April 1st came early this year.

		Dave

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
@ 2005-03-27 18:17                   ` Jan Engelhardt
  2005-03-27 19:25                   ` Pekka Enberg
                                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Jan Engelhardt @ 2005-03-27 18:17 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel Mailing List

>Am I the only person who is completely fascinated by the
>effort being spent here micro-optimising something thats
>almost never in a path that needs optimising ?
>I'd be amazed if any of this masturbation showed the tiniest
>blip on a real workload, or even on a benchmark other than
>one crafted specifically to test kfree in a loop.
>[...]
Was not me who started it :P

>I guess April 1st came early this year.
DST change for Europe and Australia is a day ahead- hm, does not suffice to 
get onto April 1.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
  2005-03-27 18:17                   ` Jan Engelhardt
@ 2005-03-27 19:25                   ` Pekka Enberg
  2005-03-27 22:56                   ` Jesper Juhl
                                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Pekka Enberg @ 2005-03-27 19:25 UTC (permalink / raw)
  To: Dave Jones, Jan Engelhardt, Linux Kernel Mailing List, penberg

On Sun, 27 Mar 2005 12:40:26 -0500, Dave Jones <davej@redhat.com> wrote:
> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats
> almost never in a path that needs optimising ?
> I'd be amazed if any of this masturbation showed the tiniest
> blip on a real workload, or even on a benchmark other than
> one crafted specifically to test kfree in a loop.

Indeed. The NULL checks are redundant and thus need to go. If someone
can show a measurable performance regression in the kernel for a
realistic workload, kfree() can be turned into an inline function
which will take care of it. The microbenchmarks are fun but should not
stand in the way of merging the kfree() cleanups from Jesper and
others.

                        Pekka

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
  2005-03-27 18:17                   ` Jan Engelhardt
  2005-03-27 19:25                   ` Pekka Enberg
@ 2005-03-27 22:56                   ` Jesper Juhl
  2005-03-28  4:53                     ` Paul Jackson
  2005-03-28  1:20                   ` Horst von Brand
                                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: Jesper Juhl @ 2005-03-27 22:56 UTC (permalink / raw)
  To: Dave Jones
  Cc: Jan Engelhardt, Pekka Enberg, Paul Jackson, rlrevell, linux-os,
	Arjan van de Ven, Denis Vlasenko, Jesper Juhl,
	Linux Kernel Mailing List

On Sun, 27 Mar 2005, Dave Jones wrote:

> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats
> almost never in a path that needs optimising ?
> I'd be amazed if any of this masturbation showed the tiniest
> blip on a real workload, or even on a benchmark other than
> one crafted specifically to test kfree in a loop.
> 
> That each occurance of this 'optimisation' also saves a handful
> of bytes in generated code is it's only real benefit afaics.
> Even then, if a functions cache performance is better off because
> we trimmed a few bytes from the tail of a function, I'd be
> completely amazed.
> 

I agree, it's amazing that this is treated like a big issue, it's not, and 
I never meant for it to be a matter of such debate. 

The whole thing (viewed from where I'm sitting) started when I noticed a 
few of those redundant NULL checks while reading code, thought I'd clean 
them up since they were clearly not needed and submit those patches. When 
those patches then got merged I thought "ok, so this is something that's 
actually appreciated, guess I might as well do some more when I come 
across them or maybe even seek them out and get rid of them once and for 
all"... So I started doing that and more of the patches got merged which 
(at least to me) confirmed that it was a worthwhile activity, until at 
some point voices were raised in objection.

At that point I felt I needed to explain the "why" of why I was doing it 
and try and show that it might actually be a bennefit (and I believe the 
small test I wrote shows it to be either a bennefit in the usual case or 
at worst a trivial performance hit in the not-so-common case).
What I'm trying to find out now is if there's a general consensus that 
these patches are worthwile and wanted or if they are unwanted and I'm 
wasting my time.  If the patches are wanted I don't mind doing them, but 
if they are not wanted I don't want to waste my time (nor anyone elses) on 
them.  So, if I could just get peoples comment on that "wanted vs 
not-wanted" then I could get on with either producing some patches for 
people or get on with other things and drop this... Or I guess I could 
just go on making those patches, submit them and then just leave it in the 
hands of the individual maintainers (which was more or less how I started 
out)...

-- 
Jesper Juhl

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 22:56                   ` Jesper Juhl
@ 2005-03-28  4:53                     ` Paul Jackson
  2005-03-30 18:57                       ` Jesper Juhl
  0 siblings, 1 reply; 40+ messages in thread
From: Paul Jackson @ 2005-03-28  4:53 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: davej, jengelh, penberg, rlrevell, linux-os, arjan, vda,
	juhl-lkml, linux-kernel

Jesper wrote:
> What I'm trying to find out now is if there's a general consensus that 
> these patches are worthwile and wanted or if they are unwanted and I'm 
> wasting my time. 

Thanks for your good work so far, and your good natured tolerance of
the excessively detailed discussions resulting.

I don't have a big preference either way - but a couple of questions:

 1) Do you have any wild guess of how many of these you've done so far,
    and how many potentially remain that could be done?  Tens, hundreds,
    thousands?

 2) Any idea what was going on with the numbers you posted yesterday,
    which, at least from what I saw at first glance, seemed to show
    "if (likely(p)) kfree(p);" to be a good or best choice, for all
    cases, including (p != NULL) being very unlikely?  That seemed
    like a weird result.

If the use of "likely(p)" is almost always a winner, then the changes
you've been doing, followed by a header file change:

	static inline void kfree(const void *p)
	{
		if (likely(p))
			__kfree(p);	/* __kfree(p) doesn't check for NULL p */
	}

along the lines that Jan considered a few posts ago, might be a winner.

But since this "likely(p)" result seems so bizarre, it seems unlikely that
the above kfree wrapper would be a winner.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-28  4:53                     ` Paul Jackson
@ 2005-03-30 18:57                       ` Jesper Juhl
  0 siblings, 0 replies; 40+ messages in thread
From: Jesper Juhl @ 2005-03-30 18:57 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Jesper Juhl, davej, jengelh, penberg, rlrevell, linux-os, arjan,
	vda, linux-kernel


Hi Paul,

Sorry about the late reply.


On Sun, 27 Mar 2005, Paul Jackson wrote:

> Jesper wrote:
> > What I'm trying to find out now is if there's a general consensus that 
> > these patches are worthwile and wanted or if they are unwanted and I'm 
> > wasting my time. 
> 
> Thanks for your good work so far, and your good natured tolerance of
> the excessively detailed discussions resulting.
> 
> I don't have a big preference either way - but a couple of questions:
> 
>  1) Do you have any wild guess of how many of these you've done so far,
>     and how many potentially remain that could be done?  Tens, hundreds,
>     thousands?
> 
That would indeed be a wild guess, I guess I've done somewhere inbetween 
50 and 200 so far, and I guess there's probably somewhere around 1-2000 
left... but, I'm guessing.


>  2) Any idea what was going on with the numbers you posted yesterday,
>     which, at least from what I saw at first glance, seemed to show
>     "if (likely(p)) kfree(p);" to be a good or best choice, for all
>     cases, including (p != NULL) being very unlikely?  That seemed
>     like a weird result.
> 
> If the use of "likely(p)" is almost always a winner, then the changes
> you've been doing, followed by a header file change:
> 
> 	static inline void kfree(const void *p)
> 	{
> 		if (likely(p))
> 			__kfree(p);	/* __kfree(p) doesn't check for NULL p */
> 	}
> 
> along the lines that Jan considered a few posts ago, might be a winner.
> 
> But since this "likely(p)" result seems so bizarre, it seems unlikely that
> the above kfree wrapper would be a winner.
> 
I don't think it matters much really. The way I read the numbers they seem 
to indicate that the performance impact is very small in any case, and I'm 
not even entirely sure my way of trying to measure is accurate, so I don't 
know how much should be read into those numbers... 


-- 
Jesper



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
                                     ` (2 preceding siblings ...)
  2005-03-27 22:56                   ` Jesper Juhl
@ 2005-03-28  1:20                   ` Horst von Brand
  2005-03-28  4:10                   ` Paul Jackson
                                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Horst von Brand @ 2005-03-28  1:20 UTC (permalink / raw)
  To: Dave Jones, Jan Engelhardt, Linux Kernel Mailing List

Dave Jones <davej@redhat.com> said:
> On Sun, Mar 27, 2005 at 05:12:58PM +0200, Jan Engelhardt wrote:
> 
>  > Well, kfree inlined was already mentioned but forgotten again.
>  > What if this was used:
>  > 
>  > inline static void kfree_WRAP(void *addr) {
>  >     if(likely(addr != NULL)) {
>  >         kfree_real(addr);
>  >     }
>  >     return;
>  > }
>  > 
>  > And remove the NULL-test in kfree_real()? Then we would have:

> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats
> almost never in a path that needs optimising ?
> I'd be amazed if any of this masturbation showed the tiniest
> blip on a real workload, or even on a benchmark other than
> one crafted specifically to test kfree in a loop.

Right.

> That each occurance of this 'optimisation' also saves a handful
> of bytes in generated code is it's only real benefit afaics.

No. It clears up the calls to kfree() a bit too in the source. Not really
important, sure.
.
> Even then, if a functions cache performance is better off because
> we trimmed a few bytes from the tail of a function, I'd be
> completely amazed.
> 
> I guess April 1st came early this year.

Got (at) you! ;-)
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
                                     ` (3 preceding siblings ...)
  2005-03-28  1:20                   ` Horst von Brand
@ 2005-03-28  4:10                   ` Paul Jackson
  2005-03-28 12:58                   ` Geert Uytterhoeven
  2005-03-29  2:52                   ` Lee Revell
  6 siblings, 0 replies; 40+ messages in thread
From: Paul Jackson @ 2005-03-28  4:10 UTC (permalink / raw)
  To: Dave Jones; +Cc: jengelh, linux-kernel

Dave writes:
> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats

Eh .. it's well known behaviour.  Bring two questions before a
committee, one for millions of dollars and the other for pocket change,
and watch the committee spend more time discussing the second question.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
                                     ` (4 preceding siblings ...)
  2005-03-28  4:10                   ` Paul Jackson
@ 2005-03-28 12:58                   ` Geert Uytterhoeven
  2005-03-29  2:52                   ` Lee Revell
  6 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2005-03-28 12:58 UTC (permalink / raw)
  To: Dave Jones; +Cc: Jan Engelhardt, Linux Kernel Mailing List

On Sun, 27 Mar 2005, Dave Jones wrote:
> On Sun, Mar 27, 2005 at 05:12:58PM +0200, Jan Engelhardt wrote:
>  > Well, kfree inlined was already mentioned but forgotten again.
>  > What if this was used:
>  > 
>  > inline static void kfree_WRAP(void *addr) {
>  >     if(likely(addr != NULL)) {
>  >         kfree_real(addr);
>  >     }
>  >     return;
>  > }
>  > 
>  > And remove the NULL-test in kfree_real()? Then we would have:
> 
> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats
> almost never in a path that needs optimising ?
> I'd be amazed if any of this masturbation showed the tiniest
> blip on a real workload, or even on a benchmark other than
> one crafted specifically to test kfree in a loop.

The benchmarks were started when someone noticed one of the tests was (a) not
in a cleanup path and (b) very unlikely to be true.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 17:40                 ` Dave Jones
                                     ` (5 preceding siblings ...)
  2005-03-28 12:58                   ` Geert Uytterhoeven
@ 2005-03-29  2:52                   ` Lee Revell
  2005-03-29  6:30                     ` Pekka Enberg
  6 siblings, 1 reply; 40+ messages in thread
From: Lee Revell @ 2005-03-29  2:52 UTC (permalink / raw)
  To: Dave Jones; +Cc: Jan Engelhardt, Linux Kernel Mailing List

On Sun, 2005-03-27 at 12:40 -0500, Dave Jones wrote:
> On Sun, Mar 27, 2005 at 05:12:58PM +0200, Jan Engelhardt wrote:
> 
>  > Well, kfree inlined was already mentioned but forgotten again.
>  > What if this was used:
>  > 
>  > inline static void kfree_WRAP(void *addr) {
>  >     if(likely(addr != NULL)) {
>  >         kfree_real(addr);
>  >     }
>  >     return;
>  > }
>  > 
>  > And remove the NULL-test in kfree_real()? Then we would have:
> 
> Am I the only person who is completely fascinated by the
> effort being spent here micro-optimising something thats
> almost never in a path that needs optimising ?
> I'd be amazed if any of this masturbation showed the tiniest
> blip on a real workload, or even on a benchmark other than
> one crafted specifically to test kfree in a loop.
> 

I see kfree used in several hot paths.  Check out
this /proc/latency_trace excerpt:

(T1/#147)      ksoftirqd/0     2 0 2 00000002 00000093 [0038603486826120] 0.133ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b5181> (rpc_wake_up_task+0x6c/0x80 <c02a802c>)
(T1/#148)      ksoftirqd/0     2 0 2 00000001 00000094 [0038603486826375] 0.133ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b5181> (udp_data_ready+0x1ca/0x270 <c02a5b8a>)
(T1/#149)      ksoftirqd/0     2 0 2 00000001 00000095 [0038603486826527] 0.133ms (+0.000ms): skb_free_datagram+0xb/0x30 <c0257ecb> (udp_data_ready+0x19c/0x270 <c02a5b5c>)
(T1/#150)      ksoftirqd/0     2 0 2 00000001 00000096 [0038603486826686] 0.133ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (udp_data_ready+0x19c/0x270 <c02a5b5c>)
(T1/#151)      ksoftirqd/0     2 0 2 00000001 00000097 [0038603486826924] 0.133ms (+0.000ms): sock_rfree+0x8/0x20 <c0254648> (__kfree_skb+0x6b/0xf0 <c0255c2b>)
(T1/#152)      ksoftirqd/0     2 0 2 00000001 00000098 [0038603486827082] 0.134ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)
(T1/#153)      ksoftirqd/0     2 0 2 00000001 00000099 [0038603486827189] 0.134ms (+0.000ms): skb_release_data+0xd/0xd0 <c0255acd> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#154)      ksoftirqd/0     2 0 2 00000001 0000009a [0038603486827444] 0.134ms (+0.000ms): skb_drop_fraglist+0xc/0x50 <c0255a4c> (skb_release_data+0xa3/0xd0 <c0255b63>)
(T1/#155)      ksoftirqd/0     2 0 2 00000001 0000009b [0038603486827573] 0.134ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (skb_drop_fraglist+0x35/0x50 <c0255a75>)
(T1/#156)      ksoftirqd/0     2 0 2 00000001 0000009c [0038603486827733] 0.134ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)
(T1/#157)      ksoftirqd/0     2 0 2 00000001 0000009d [0038603486827861] 0.134ms (+0.000ms): skb_release_data+0xd/0xd0 <c0255acd> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#158)      ksoftirqd/0     2 0 2 00000001 0000009e [0038603486828121] 0.134ms (+0.000ms): kfree+0x14/0x70 <c0143514> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#159)      ksoftirqd/0     2 0 2 00000001 0000009f [0038603486828671] 0.135ms (+0.000ms): kmem_cache_free+0x14/0x60 <c0143444> (kfree_skbmem+0x2a/0x30 <c0255bba>)
(T1/#160)      ksoftirqd/0     2 0 2 00000001 000000a0 [0038603486829127] 0.135ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (skb_drop_fraglist+0x35/0x50 <c0255a75>)
(T1/#161)      ksoftirqd/0     2 0 2 00000001 000000a1 [0038603486829341] 0.135ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)
(T1/#162)      ksoftirqd/0     2 0 2 00000001 000000a2 [0038603486829469] 0.135ms (+0.000ms): skb_release_data+0xd/0xd0 <c0255acd> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#163)      ksoftirqd/0     2 0 2 00000001 000000a3 [0038603486829644] 0.135ms (+0.000ms): kfree+0x14/0x70 <c0143514> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#164)      ksoftirqd/0     2 0 2 00000001 000000a4 [0038603486829944] 0.136ms (+0.000ms): kmem_cache_free+0x14/0x60 <c0143444> (kfree_skbmem+0x2a/0x30 <c0255bba>)
(T1/#165)      ksoftirqd/0     2 0 2 00000001 000000a5 [0038603486830243] 0.136ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (skb_drop_fraglist+0x35/0x50 <c0255a75>)
(T1/#166)      ksoftirqd/0     2 0 2 00000001 000000a6 [0038603486830463] 0.136ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)
(T1/#167)      ksoftirqd/0     2 0 2 00000001 000000a7 [0038603486830589] 0.136ms (+0.000ms): skb_release_data+0xd/0xd0 <c0255acd> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#168)      ksoftirqd/0     2 0 2 00000001 000000a8 [0038603486830800] 0.136ms (+0.000ms): kfree+0x14/0x70 <c0143514> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#169)      ksoftirqd/0     2 0 2 00000001 000000a9 [0038603486831083] 0.137ms (+0.000ms): kmem_cache_free+0x14/0x60 <c0143444> (kfree_skbmem+0x2a/0x30 <c0255bba>)
(T1/#170)      ksoftirqd/0     2 0 2 00000001 000000aa [0038603486831394] 0.137ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (skb_drop_fraglist+0x35/0x50 <c0255a75>)
(T1/#171)      ksoftirqd/0     2 0 2 00000001 000000ab [0038603486831608] 0.137ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)
(T1/#172)      ksoftirqd/0     2 0 2 00000001 000000ac [0038603486831736] 0.137ms (+0.000ms): skb_release_data+0xd/0xd0 <c0255acd> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#173)      ksoftirqd/0     2 0 2 00000001 000000ad [0038603486831985] 0.137ms (+0.000ms): kfree+0x14/0x70 <c0143514> (kfree_skbmem+0x19/0x30 <c0255ba9>)
(T1/#174)      ksoftirqd/0     2 0 2 00000001 000000ae [0038603486832443] 0.138ms (+0.000ms): kmem_cache_free+0x14/0x60 <c0143444> (kfree_skbmem+0x2a/0x30 <c0255bba>)
(T1/#175)      ksoftirqd/0     2 0 2 00000001 000000af [0038603486832744] 0.138ms (+0.000ms): __kfree_skb+0xe/0xf0 <c0255bce> (skb_drop_fraglist+0x35/0x50 <c0255a75>)
(T1/#176)      ksoftirqd/0     2 0 2 00000001 000000b0 [0038603486832880] 0.138ms (+0.000ms): kfree_skbmem+0xe/0x30 <c0255b9e> (__kfree_skb+0x76/0xf0 <c0255c36>)

Lee


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-29  2:52                   ` Lee Revell
@ 2005-03-29  6:30                     ` Pekka Enberg
  2005-03-29  7:06                       ` Jan Engelhardt
  0 siblings, 1 reply; 40+ messages in thread
From: Pekka Enberg @ 2005-03-29  6:30 UTC (permalink / raw)
  To: Lee Revell; +Cc: Dave Jones, Jan Engelhardt, Linux Kernel Mailing List, penberg

On Mon, 28 Mar 2005 21:52:57 -0500, Lee Revell <rlrevell@joe-job.com> wrote:
> I see kfree used in several hot paths.  Check out
> this /proc/latency_trace excerpt:

Yes, but is the pointer being free'd NULL most of the time? The
optimization does not help if you are releasing actual memory.

                           Pekka

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-29  6:30                     ` Pekka Enberg
@ 2005-03-29  7:06                       ` Jan Engelhardt
  2005-03-29  7:24                         ` Pekka J Enberg
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Engelhardt @ 2005-03-29  7:06 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Lee Revell, Dave Jones, Linux Kernel Mailing List, penberg

>> I see kfree used in several hot paths.  Check out
>> this /proc/latency_trace excerpt:
>
>Yes, but is the pointer being free'd NULL most of the time?

"[...]In general, you should prefer to use actual profile feedback for this 
(`-fprofile-arcs'), as programmers are NOTORIOUSLY BAD AT PREDICTING how 
their programs actually perform." --gcc info pages.

>The optimization does not help if you are releasing actual memory.

It does not turn the real case (releasing memory) worse, but just improves the 
unreal case (releasing NULL).



Jan Engelhardt
-- 
No TOFU for me, please.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-29  7:06                       ` Jan Engelhardt
@ 2005-03-29  7:24                         ` Pekka J Enberg
  2005-03-30  2:44                           ` Paul Jackson
  0 siblings, 1 reply; 40+ messages in thread
From: Pekka J Enberg @ 2005-03-29  7:24 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Pekka Enberg, Lee Revell, Dave Jones, Linux Kernel Mailing List

Hi, 

Jan Engelhardt writes:
> "[...]In general, you should prefer to use actual profile feedback for this 
> (`-fprofile-arcs'), as programmers are NOTORIOUSLY BAD AT PREDICTING how 
> their programs actually perform." --gcc info pages.

Indeed. 

I wrote:
> > The optimization does not help if you are releasing actual memory.

Jan Engelhardt writes:
> It does not turn the real case (releasing memory) worse, but just improves the 
> unreal case (releasing NULL).

You don't know that until you profile! Please note that it _can_ turn the 
real case worse as the generated code will be bigger (assuming we inline 
kfree() to optimize the special case). To summarize: 

 (1) The optimization only helps when the passed pointer is NULL.
 (2) Most of the time, kfree() _should_ be given a real pointer.
     Anything else but sounds quite broken.
 (3) We don't know if inlining kfree() hurts the common case.
 (4) The cleanups Jesper and others are doing are to remove the
     _redundant_ NULL checks (i.e. it is now checked twice). 

Therefore please keep merging the cleanup patches and don't inline kfree() 
unless someone can show a _globally visible_ performance regression (i.e. p% 
slowdown in XYZ benchmark). 

               Pekka 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-29  7:24                         ` Pekka J Enberg
@ 2005-03-30  2:44                           ` Paul Jackson
  2005-03-30  6:13                             ` Pekka J Enberg
  2005-03-30 19:10                             ` Jesper Juhl
  0 siblings, 2 replies; 40+ messages in thread
From: Paul Jackson @ 2005-03-30  2:44 UTC (permalink / raw)
  To: Pekka J Enberg; +Cc: jengelh, penberg, rlrevell, davej, linux-kernel

Pekka wrote:
>  (4) The cleanups Jesper and others are doing are to remove the
>      _redundant_ NULL checks (i.e. it is now checked twice). 

Even such obvious changes as removing redundant checks doesn't
seem to ensure a performance improvement.  Jesper Juhl posted
performance data for such changes in his microbenchmark a couple
of days ago.

As I posted then, I could swear that his numbers show:

> Just looking at the third run, it seems to me that "if (likely(p))
> kfree(p);" beats a naked "kfree(p);" everytime, whether p is half
> NULL's, or very few NULL's, or almost all NULL's.

Twice now I have asked Jesper to explain this strange result.

I have heard no explanation (not even a terse "you idiot ;)"),
nor anyone else comment on these numbers.

Maybe we should be following your good advice:

> You don't know that until you profile! 

instead of continuing to make these code changes.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30  2:44                           ` Paul Jackson
@ 2005-03-30  6:13                             ` Pekka J Enberg
  2005-03-30  6:16                               ` Paul Jackson
  2005-03-30  7:15                               ` P Lavin
  2005-03-30 19:10                             ` Jesper Juhl
  1 sibling, 2 replies; 40+ messages in thread
From: Pekka J Enberg @ 2005-03-30  6:13 UTC (permalink / raw)
  To: Paul Jackson; +Cc: jengelh, penberg, rlrevell, davej, linux-kernel

Hi, 

Paul Jackson writes:
> Even such obvious changes as removing redundant checks doesn't
> seem to ensure a performance improvement.  Jesper Juhl posted
> performance data for such changes in his microbenchmark a couple
> of days ago.

It is not a performance issue, it's an API issue. Please note that kfree() 
is analogous libc free() in terms of NULL checking. People are checking NULL 
twice now because they're confused whether kfree() deals it or not. 

Paul Jackson writes:
> Maybe we should be following your good advice: 
> 
> > You don't know that until you profile!  
> 
> instead of continuing to make these code changes.

I am all for profiling but it should not stop us from merging the patches 
because we can restore the generated code with the included (totally 
untested) patch. 

            Pekka 

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
 --- 

Index: 2.6/include/linux/slab.h
===================================================================
 --- 2.6.orig/include/linux/slab.h       2005-03-22 14:31:30.000000000 +0200
+++ 2.6/include/linux/slab.h    2005-03-30 09:08:13.000000000 +0300
@@ -105,8 +105,14 @@
       return __kmalloc(size, flags);
} 

+static inline void kfree(const void * p)
+{
+       if (!p)
+               return;
+       __kfree(p);
+}
+
extern void *kcalloc(size_t, size_t, int);
 -extern void kfree(const void *);
extern unsigned int ksize(const void *); 

extern int FASTCALL(kmem_cache_reap(int));
Index: 2.6/mm/slab.c
===================================================================
 --- 2.6.orig/mm/slab.c  2005-03-22 14:31:31.000000000 +0200
+++ 2.6/mm/slab.c       2005-03-30 09:08:45.000000000 +0300
@@ -2567,13 +2567,11 @@
 * Don't free memory not originally allocated by kmalloc()
 * or you will run into trouble.
 */
 -void kfree (const void *objp)
+void __kfree (const void *objp)
{
       kmem_cache_t *c;
       unsigned long flags; 

 -       if (!objp)
 -               return;
       local_irq_save(flags);
       kfree_debugcheck(objp);
       c = GET_PAGE_CACHE(virt_to_page(objp));
@@ -2581,7 +2579,7 @@
       local_irq_restore(flags);
} 

 -EXPORT_SYMBOL(kfree);
+EXPORT_SYMBOL(__kfree); 

#ifdef CONFIG_SMP
/** 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30  6:13                             ` Pekka J Enberg
@ 2005-03-30  6:16                               ` Paul Jackson
  2005-03-30  7:15                               ` P Lavin
  1 sibling, 0 replies; 40+ messages in thread
From: Paul Jackson @ 2005-03-30  6:16 UTC (permalink / raw)
  To: Pekka J Enberg; +Cc: jengelh, penberg, rlrevell, davej, linux-kernel

Pekka writes:
> It is not a performance issue, it's an API issue.
> ...
> I am all for profiling but it should not stop us from merging the patches 

Ok - sounds right.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30  6:13                             ` Pekka J Enberg
  2005-03-30  6:16                               ` Paul Jackson
@ 2005-03-30  7:15                               ` P Lavin
  2005-03-30 14:20                                 ` Jesper Juhl
  1 sibling, 1 reply; 40+ messages in thread
From: P Lavin @ 2005-03-30  7:15 UTC (permalink / raw)
  To: linux-kernel

Hi,
In my wlan driver module, i allocated some memory using kmalloc in 
interrupt context, this one failed but its not returning NULL , so i was 
proceeding further everything was going wrong... & finally the kernel 
crahed. Can any one of you tell me why this is happening ? i cannot use 
GFP_KERNEL because i'm calling this function from interrupt context & it 
may block. Any other solution for this ?? I'm concerned abt why kmalloc 
is not returning null if its not a success ??

Is it not necessary to check for NULL before calling kfree() ??
Regards,
Lavin

Pekka J Enberg wrote:

> Hi,
> Paul Jackson writes:
>
>> Even such obvious changes as removing redundant checks doesn't
>> seem to ensure a performance improvement.  Jesper Juhl posted
>> performance data for such changes in his microbenchmark a couple
>> of days ago.
>
>
> It is not a performance issue, it's an API issue. Please note that 
> kfree() is analogous libc free() in terms of NULL checking. People are 
> checking NULL twice now because they're confused whether kfree() deals 
> it or not.
> Paul Jackson writes:
>
>> Maybe we should be following your good advice:
>> > You don't know that until you profile! 
>> instead of continuing to make these code changes.
>
>
> I am all for profiling but it should not stop us from merging the 
> patches because we can restore the generated code with the included 
> (totally untested) patch.
>            Pekka
> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
> ---
> Index: 2.6/include/linux/slab.h
> ===================================================================
> --- 2.6.orig/include/linux/slab.h       2005-03-22 14:31:30.000000000 
> +0200
> +++ 2.6/include/linux/slab.h    2005-03-30 09:08:13.000000000 +0300
> @@ -105,8 +105,14 @@
>       return __kmalloc(size, flags);
> }
> +static inline void kfree(const void * p)
> +{
> +       if (!p)
> +               return;
> +       __kfree(p);
> +}
> +
> extern void *kcalloc(size_t, size_t, int);
> -extern void kfree(const void *);
> extern unsigned int ksize(const void *);
> extern int FASTCALL(kmem_cache_reap(int));
> Index: 2.6/mm/slab.c
> ===================================================================
> --- 2.6.orig/mm/slab.c  2005-03-22 14:31:31.000000000 +0200
> +++ 2.6/mm/slab.c       2005-03-30 09:08:45.000000000 +0300
> @@ -2567,13 +2567,11 @@
> * Don't free memory not originally allocated by kmalloc()
> * or you will run into trouble.
> */
> -void kfree (const void *objp)
> +void __kfree (const void *objp)
> {
>       kmem_cache_t *c;
>       unsigned long flags;
> -       if (!objp)
> -               return;
>       local_irq_save(flags);
>       kfree_debugcheck(objp);
>       c = GET_PAGE_CACHE(virt_to_page(objp));
> @@ -2581,7 +2579,7 @@
>       local_irq_restore(flags);
> }
> -EXPORT_SYMBOL(kfree);
> +EXPORT_SYMBOL(__kfree);
> #ifdef CONFIG_SMP
> /**
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30  7:15                               ` P Lavin
@ 2005-03-30 14:20                                 ` Jesper Juhl
  0 siblings, 0 replies; 40+ messages in thread
From: Jesper Juhl @ 2005-03-30 14:20 UTC (permalink / raw)
  To: P Lavin; +Cc: linux-kernel

On Wed, 30 Mar 2005, P Lavin wrote:

> Date: Wed, 30 Mar 2005 12:45:01 +0530
> From: P Lavin <lavin.p@redpinesignals.com>
> To: linux-kernel@vger.kernel.org
> Subject: Re: no need to check for NULL before calling kfree() -fs/ext2/
> 
> Hi,
> In my wlan driver module, i allocated some memory using kmalloc in interrupt
> context, this one failed but its not returning NULL , 

kmalloc() should always return NULL if the allocation failed.


>so i was proceeding
> further everything was going wrong... & finally the kernel crahed. Can any one
> of you tell me why this is happening ? i cannot use GFP_KERNEL because i'm
> calling this function from interrupt context & it may block. Any other

If you need to allocate memory from interrupt context you should be using 
GFP_ATOMIC (or, if possible, do the allocation earlier in a different 
context).


> solution for this ?? I'm concerned abt why kmalloc is not returning null if
> its not a success ??
> 
I have no explanation for that, are you sure that's really what's 
happening?


> Is it not necessary to check for NULL before calling kfree() ??

No, it is not nessesary to check for NULL before calling kfree() since 
kfree() does    

void kfree (const void *objp)
{
	... 
        if (!objp)
                return;
	...
}

So, if you pass kfree() a NULL pointer it deals with it itself, you don't 
need to check that explicitly before calling kfree() - that's redundant.


-- 
Jesper Juhl



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30  2:44                           ` Paul Jackson
  2005-03-30  6:13                             ` Pekka J Enberg
@ 2005-03-30 19:10                             ` Jesper Juhl
  2005-04-09  2:21                               ` Jesper Juhl
  1 sibling, 1 reply; 40+ messages in thread
From: Jesper Juhl @ 2005-03-30 19:10 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Pekka J Enberg, jengelh, penberg, rlrevell, davej, linux-kernel

On Tue, 29 Mar 2005, Paul Jackson wrote:

> Pekka wrote:
> >  (4) The cleanups Jesper and others are doing are to remove the
> >      _redundant_ NULL checks (i.e. it is now checked twice). 
> 
> Even such obvious changes as removing redundant checks doesn't
> seem to ensure a performance improvement.  Jesper Juhl posted
> performance data for such changes in his microbenchmark a couple
> of days ago.
> 
> As I posted then, I could swear that his numbers show:
> 
> > Just looking at the third run, it seems to me that "if (likely(p))
> > kfree(p);" beats a naked "kfree(p);" everytime, whether p is half
> > NULL's, or very few NULL's, or almost all NULL's.
> 
> Twice now I have asked Jesper to explain this strange result.
> 
I've been kept busy with other things for a while and haven't had the time 
to reply to your emails, sorry.   As I just said in another post I don't 
know how valid my numbers are, but I'll try and craft a few more tests to 
see if I can get some more solid results.

> 
> Maybe we should be following your good advice:
> 
> > You don't know that until you profile! 
> 
> instead of continuing to make these code changes.
> 
I'll gather some more numbers and post them along with any conclusions I 
believe can be drawn from them within a day or two, untill then I'll hold 
back on the patches...


-- 
Jesper Juhl



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-30 19:10                             ` Jesper Juhl
@ 2005-04-09  2:21                               ` Jesper Juhl
  0 siblings, 0 replies; 40+ messages in thread
From: Jesper Juhl @ 2005-04-09  2:21 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: Paul Jackson, Pekka J Enberg, jengelh, penberg, rlrevell, davej,
	linux-kernel

On Wed, 30 Mar 2005, Jesper Juhl wrote:

> On Tue, 29 Mar 2005, Paul Jackson wrote:
> 
> > Pekka wrote:
> > >  (4) The cleanups Jesper and others are doing are to remove the
> > >      _redundant_ NULL checks (i.e. it is now checked twice). 
> > 
> > Even such obvious changes as removing redundant checks doesn't
> > seem to ensure a performance improvement.  Jesper Juhl posted
> > performance data for such changes in his microbenchmark a couple
> > of days ago.
> > 
> > As I posted then, I could swear that his numbers show:
> > 
> > > Just looking at the third run, it seems to me that "if (likely(p))
> > > kfree(p);" beats a naked "kfree(p);" everytime, whether p is half
> > > NULL's, or very few NULL's, or almost all NULL's.
> > 
> > Twice now I have asked Jesper to explain this strange result.
> > 
> I've been kept busy with other things for a while and haven't had the time 
> to reply to your emails, sorry.   As I just said in another post I don't 
> know how valid my numbers are, but I'll try and craft a few more tests to 
> see if I can get some more solid results.
> 
> > 
> > Maybe we should be following your good advice:
> > 
> > > You don't know that until you profile! 
> > 
> > instead of continuing to make these code changes.
> > 
> I'll gather some more numbers and post them along with any conclusions I 
> believe can be drawn from them within a day or two, untill then I'll hold 
> back on the patches...
> 
Ok, I never got around to doing some more benchmarks, mainly since it 
seems that people converged on the oppinion that the kfree() cleanups are 
OK and we can fix up any regressions by inlining kfree if needed (the 
difference these changes make to performance seem to be small and in the 
noice anyway).
If anyone would /like/ me to do more benchmarks, then speak up and I will 
do them - I guess I could also build a kernel with an inline kfree() as a 
comparison.. but, unless someone speaks up I'll just carry on making these 
kfree() cleanups and not bother with benchmarks...


-- 
Jesper



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 15:12               ` Jan Engelhardt
  2005-03-27 17:40                 ` Dave Jones
@ 2005-03-28  4:07                 ` Paul Jackson
  1 sibling, 0 replies; 40+ messages in thread
From: Paul Jackson @ 2005-03-28  4:07 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linux-kernel

Jan - please don't trim the CC lists when responding on lkml.
Thanks.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-26 23:21     ` [PATCH] no need to check for NULL before calling kfree() -fs/ext2/ linux-os
  2005-03-26 23:34       ` Marcin Dalecki
  2005-03-26 23:54       ` Jesper Juhl
@ 2005-03-27  8:45       ` Arjan van de Ven
  2005-03-27 12:51         ` Denis Vlasenko
  2005-03-27 23:13         ` [PATCH] no need to check for NULL before calling kfree()-fs/ext2/ linux-os
  2 siblings, 2 replies; 40+ messages in thread
From: Arjan van de Ven @ 2005-03-27  8:45 UTC (permalink / raw)
  To: linux-os; +Cc: Jesper Juhl, ext2-devel, Linux kernel

On Sat, 2005-03-26 at 18:21 -0500, linux-os wrote:
> On Sat, 26 Mar 2005, Arjan van de Ven wrote:
> 
> > On Fri, 2005-03-25 at 17:29 -0500, linux-os wrote:
> >> Isn't it expensive of CPU time to call kfree() even though the
> >> pointer may have already been freed?
> >
> > nope
> >
> > a call instruction is effectively half a cycle or less, the branch
> 
> Wrong!

oh? a call is "push eip + a new eip" effectively. the new eip is
entirely free, the push eip takes half a cycle (or 1 full cycle but only
one of the two/three pipelines).

> 
> > predictor of the cpu can predict perfectly where the next instruction is
> > from. The extra if() you do in front is a different matter, that can
> > easily cost 100 cycles+. (And those are redundant cycles because kfree
> > will do the if again anyway). So what you propose is to spend 100+
> > cycles to save half a cycle. Not a good tradeoff ;)
> >
> 
> Wrong!

Is it wrong that the cpu can predict the target perfectly? No. Unless
you use function pointers (then it's a  whole different ballgame).

> 
> Pure unmitigated bull-shit. I measure (with hardware devices)
> the execution time of real code in modern CPUs. I do this for
> a living so you don't have to stand in line for a couple of
> hours to have your baggage scanned at the airport.

Ok I used to do this kind of performance work for a living too and
measuring it to death as well.

> Always, always, a call will be more expensive than a branch
> on condition. 

It is not on modern Out of order cpus. 

> It's impossible to be otherwise. A call requires
> that the return address be written to memory (the stack),
> using register indirection (the stack-pointer).

and it's a so common pattern that it's optimized to death. Internally a
call gets transformed to 2 uops or so, one is push eip, the other is the
jmp (which gets then just absorbed by the "what is the next eip" logic,
just as a "jmp"s are 0 cycles)

> If somebody said; "I think that the code will look better
> and the few cycles lost will not be a consequence with modern
> CPUs...", then there is a point. But coming up with this
> disingenuous bullshit is something else.

I don't have to take this from you and I don't. You're calling me a liar
with zero evidence. Lets get some facts straight
1) On a modern cpu, a miss of the branch predictor is quite expensive.
   The entire pipeline needs flushing if this happens, and on a p4 this
   will be in the order of 50 to 100 cycles at minimum.
2) absolute "jmp" is free on modern OOO cpus. Instead of taking an
   actual execution slot, all that happens is that the "what is the next
   EIP" logic gets a different value. (you can argue what happens if you
   have a sequence of jmps and that it's not free then, and I'll grant
   you that, but that corner case is not relevant here)
3) a "call" instruction gets translated into what basically is 
   "push EIP" and "jmp" uops.
4) modern processors have special logic to optimize push/pop
   instructions; for example a "push eax ; push ebx" sequence will
   execute in parallel in the same cycle even though there is a data
   dependency on esp, the cpu can perfectly predict the esp effect and
   will do so.
5) modern processors have a call/ret fifo cache they use to do branch
   prediction for the target of "ret" instructions. Unless you do
   misbalanced call/ret pairs the prediction will be perfect.

Based on this the conclusion "a function call is really cheap versus a
conditional branch" is justified imo. Now you better come with proof
about which of the 5 things above I'm totally lying to you or you better
come with an apology.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27  8:45       ` Arjan van de Ven
@ 2005-03-27 12:51         ` Denis Vlasenko
  2005-03-27 14:28           ` Arjan van de Ven
  2005-03-27 23:13         ` [PATCH] no need to check for NULL before calling kfree()-fs/ext2/ linux-os
  1 sibling, 1 reply; 40+ messages in thread
From: Denis Vlasenko @ 2005-03-27 12:51 UTC (permalink / raw)
  To: Arjan van de Ven, linux-os; +Cc: Jesper Juhl, ext2-devel, Linux kernel

> > It's impossible to be otherwise. A call requires
> > that the return address be written to memory (the stack),
> > using register indirection (the stack-pointer).
> 
> and it's a so common pattern that it's optimized to death. Internally a
> call gets transformed to 2 uops or so, one is push eip, the other is the
> jmp (which gets then just absorbed by the "what is the next eip" logic,
> just as a "jmp"s are 0 cycles)

Arjan, you overlook the fact that kfree() contains 'if(!p) return;' too.
call + test-and-branch can never be faster than test+and+branch.
Maybe on the really clever CPU it can take the same time, but not faster...
--
vda


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree() -fs/ext2/
  2005-03-27 12:51         ` Denis Vlasenko
@ 2005-03-27 14:28           ` Arjan van de Ven
  0 siblings, 0 replies; 40+ messages in thread
From: Arjan van de Ven @ 2005-03-27 14:28 UTC (permalink / raw)
  To: Denis Vlasenko; +Cc: linux-os, Jesper Juhl, ext2-devel, Linux kernel

On Sun, 2005-03-27 at 15:51 +0300, Denis Vlasenko wrote:
> > > It's impossible to be otherwise. A call requires
> > > that the return address be written to memory (the stack),
> > > using register indirection (the stack-pointer).
> > 
> > and it's a so common pattern that it's optimized to death. Internally a
> > call gets transformed to 2 uops or so, one is push eip, the other is the
> > jmp (which gets then just absorbed by the "what is the next eip" logic,
> > just as a "jmp"s are 0 cycles)
> 
> Arjan, you overlook the fact that kfree() contains 'if(!p) return;' too.
> call + test-and-branch can never be faster than test+and+branch

ok so for the non-null case you have

test-nbranch-call-test-nbranch
vs
call-test-nbranch

vs the null case where you get
test-branch
vs
call-test-branch

(I'm using nbranch here as a non-taken branch; it's also a conditional
branch and it has the same misprediction possibility)

in the non-null case with if you have *two* chances for the branch
predictor to go wrong. (and "wrong" can also mean "cold, eg unknown"
here) and always an extra "test-nbranch" sequence, which is probably a
cycle at least

the offset for that is the null-case-without-if where you have an extra
"call", which is also half to a whole cycle.

even in the null case it's dubious if there is gain, it depends on how
the branch predictor happens to feel that day ;)



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH] no need to check for NULL before calling kfree()-fs/ext2/
  2005-03-27  8:45       ` Arjan van de Ven
  2005-03-27 12:51         ` Denis Vlasenko
@ 2005-03-27 23:13         ` linux-os
  1 sibling, 0 replies; 40+ messages in thread
From: linux-os @ 2005-03-27 23:13 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Jesper Juhl, ext2-devel, Linux kernel

On Sun, 27 Mar 2005, Arjan van de Ven wrote:

> On Sat, 2005-03-26 at 18:21 -0500, linux-os wrote:
>> On Sat, 26 Mar 2005, Arjan van de Ven wrote:
>>
>>> On Fri, 2005-03-25 at 17:29 -0500, linux-os wrote:
>>>> Isn't it expensive of CPU time to call kfree() even though the
>>>> pointer may have already been freed?
>>>
>>> nope
>>>
>>> a call instruction is effectively half a cycle or less, the branch
>>
>> Wrong!
>
> oh? a call is "push eip + a new eip" effectively. the new eip is
> entirely free, the push eip takes half a cycle (or 1 full cycle but only
> one of the two/three pipelines).
>
>>
>>> predictor of the cpu can predict perfectly where the next instruction is
>>> from. The extra if() you do in front is a different matter, that can
>>> easily cost 100 cycles+. (And those are redundant cycles because kfree
>>> will do the if again anyway). So what you propose is to spend 100+
>>> cycles to save half a cycle. Not a good tradeoff ;)
>>>
>>
>> Wrong!
>
> Is it wrong that the cpu can predict the target perfectly? No. Unless
> you use function pointers (then it's a  whole different ballgame).
>
>>
>> Pure unmitigated bull-shit. I measure (with hardware devices)
>> the execution time of real code in modern CPUs. I do this for
>> a living so you don't have to stand in line for a couple of
>> hours to have your baggage scanned at the airport.
>
> Ok I used to do this kind of performance work for a living too and
> measuring it to death as well.
>
>> Always, always, a call will be more expensive than a branch
>> on condition.
>
> It is not on modern Out of order cpus.
>
>> It's impossible to be otherwise. A call requires
>> that the return address be written to memory (the stack),
>> using register indirection (the stack-pointer).
>
> and it's a so common pattern that it's optimized to death. Internally a
> call gets transformed to 2 uops or so, one is push eip, the other is the
> jmp (which gets then just absorbed by the "what is the next eip" logic,
> just as a "jmp"s are 0 cycles)
>
>> If somebody said; "I think that the code will look better
>> and the few cycles lost will not be a consequence with modern
>> CPUs...", then there is a point. But coming up with this
>> disingenuous bullshit is something else.
>
> I don't have to take this from you and I don't. You're calling me a liar
> with zero evidence. Lets get some facts straight
> 1) On a modern cpu, a miss of the branch predictor is quite expensive.
>   The entire pipeline needs flushing if this happens, and on a p4 this
>   will be in the order of 50 to 100 cycles at minimum.

No. It depends upon how the branch prediction is done. If there are
two virtual logic-units, one that has already taken the result of
the TRUE condition the other that has already taken the result of
the FALSE condition, then the IP is set to the address of one
or the other, regardless of whatever is taken. Both of these
IPs are already in the cache because the branch logic makes
sure they are before the conditional test occurs. The possible loss
of performance occurs if there is another conditional branch that
must occur before the logic-units pipelines can be refilled.

This can happen if the tests are conducted like:

 	cmpl	$NUMBER, %eax
 	jbe	1f	# First branch	ZF or CF
 	......		# Other code
 	......		# Other code
1:	jc	2f	# Immediate second branch on CF

Most compilers don't produce code like that, but they could.
The loss of performance is only the loss of the branch predictor,
for the second branch.

> 2) absolute "jmp" is free on modern OOO cpus. Instead of taking an
>   actual execution slot, all that happens is that the "what is the next
>   EIP" logic gets a different value. (you can argue what happens if you
>   have a sequence of jmps and that it's not free then, and I'll grant
>   you that, but that corner case is not relevant here)

So? I never even mentioned a jump and, falling through after
a conditional test is not a jump, it's just executing the
next instruction.

> 3) a "call" instruction gets translated into what basically is
>   "push EIP" and "jmp" uops.

No, It can't be on machines that have a "call" instruction.
This is because, in cases where you use the same stack for
access to data, one needs to maintain the coherency of the
stack.  If I call 'kfree', I do:

 	pushl	(pointer)
 	call	kfree
 	addl	$4, %esp	# Sizeof the pointer

... or

 	movl	(pointer), %eax
 	pushl	%eax
 	call	kfree
 	addl	$4, %esp

(pointer) is a memory operand, could be on the stack or in
other data. It is the pointer you want to free.

When kfree() gets called, the value passed __must__ be in its
final place on the stack, which means that the return address
must be there and the stack-pointer value adjusted to its
final resting place. This is necessary so that code will
get the correct values. On ix86, the called function will
find the first passed parameter at 0x04(%esp). The return
address will be at 0x00(%esp).

With the test for NULL, a decent compiler will produce code
like:
 		movl	(pointer), %eax
 		orl	%eax,%eax
 		jz	1f
 		pushl	%eax
 		call	kfree
 		addl	$4, %esp
 	1:

So you add a one-byte instruction and branch on condition.

> 4) modern processors have special logic to optimize push/pop
>   instructions; for example a "push eax ; push ebx" sequence will
>   execute in parallel in the same cycle even though there is a data
>   dependency on esp, the cpu can perfectly predict the esp effect and
>   will do so.

So? BTW, it's not a data-dependency, it's a size of register
dependency. The call-frame optimization can load several
register values onto the stack and adjust the stack-pointer
value only once. That's what the advertising hype is all about.

> 5) modern processors have a call/ret fifo cache they use to do branch
>   prediction for the target of "ret" instructions. Unless you do
>   misbalanced call/ret pairs the prediction will be perfect.
>

The Intel CPUs don't have return-on-condition instructions like
the old Z80 and they don't have call-on-condition. If they did
you wouldn't have to jump around the calls as shown above.

It would be nice to have a call-non-zero instruction.

> Based on this the conclusion "a function call is really cheap versus a
> conditional branch" is justified imo. Now you better come with proof
> about which of the 5 things above I'm totally lying to you or you better
> come with an apology.
>

There is a vast difference between the perception that advertising
hype produces, and the actual hardware designs. At one time, such
hype remained with the definition of megabyte. Now, they have
advertising agencies writing hype about technical buzz-words like
"branch prediction" and some engineers assume that the chip
contains a bunch of wonderful logic that will allow them to write
"pretty" or sloppy code. It just isn't so. There is much logic
that needs to be executed sequentially of else it won't work,
some needs to be coherent only occasionally. Stack operations
aren't in that category. The called procedures really need to
get the correct parameter values.

This whole advertising hype thing is a sore-point with
me because a server-box company went all the way to the
president of this company, trying to get me fired, when
I discovered that their box didn't work. Kill the messenger.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by Dictator Bush.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2005-04-09  2:18 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-25 22:08 [PATCH] no need to check for NULL before calling kfree() - fs/ext2/ Jesper Juhl
2005-03-25 22:29 ` linux-os
2005-03-25 22:44   ` Jesper Juhl
2005-03-26  7:50   ` Pekka Enberg
2005-03-26  8:32   ` Arjan van de Ven
2005-03-26 23:21     ` [PATCH] no need to check for NULL before calling kfree() -fs/ext2/ linux-os
2005-03-26 23:34       ` Marcin Dalecki
2005-03-27  2:00         ` Horst von Brand
2005-03-27  3:18           ` Marcin Dalecki
2005-03-27 22:12         ` linux-os
2005-03-26 23:54       ` Jesper Juhl
2005-03-27  0:05         ` Lee Revell
2005-03-27 10:55           ` Jesper Juhl
2005-03-27 14:56             ` Paul Jackson
2005-03-27 15:12               ` Jan Engelhardt
2005-03-27 17:40                 ` Dave Jones
2005-03-27 18:17                   ` Jan Engelhardt
2005-03-27 19:25                   ` Pekka Enberg
2005-03-27 22:56                   ` Jesper Juhl
2005-03-28  4:53                     ` Paul Jackson
2005-03-30 18:57                       ` Jesper Juhl
2005-03-28  1:20                   ` Horst von Brand
2005-03-28  4:10                   ` Paul Jackson
2005-03-28 12:58                   ` Geert Uytterhoeven
2005-03-29  2:52                   ` Lee Revell
2005-03-29  6:30                     ` Pekka Enberg
2005-03-29  7:06                       ` Jan Engelhardt
2005-03-29  7:24                         ` Pekka J Enberg
2005-03-30  2:44                           ` Paul Jackson
2005-03-30  6:13                             ` Pekka J Enberg
2005-03-30  6:16                               ` Paul Jackson
2005-03-30  7:15                               ` P Lavin
2005-03-30 14:20                                 ` Jesper Juhl
2005-03-30 19:10                             ` Jesper Juhl
2005-04-09  2:21                               ` Jesper Juhl
2005-03-28  4:07                 ` [PATCH] " Paul Jackson
2005-03-27  8:45       ` Arjan van de Ven
2005-03-27 12:51         ` Denis Vlasenko
2005-03-27 14:28           ` Arjan van de Ven
2005-03-27 23:13         ` [PATCH] no need to check for NULL before calling kfree()-fs/ext2/ linux-os

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox