question on common error-handling idiom

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* question on common error-handling idiom
@ 2004-11-02 20:08 Chris Friesen
  2004-11-02 20:58 ` Jan Engelhardt
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Chris Friesen @ 2004-11-02 20:08 UTC (permalink / raw)
  To: Linux kernel

There's something I've been wondering about for a while.  There is a lot of code 
in linux that looks something like this:

err = -ERRORCODE
if (error condition)
	goto out;

While nice to read, it would seem that it might be more efficient to do the 
following:

if (error condition) {
	err = -ERRORCODE;
	goto out;
}

Is there any particular reason why the former is preferred?  Is the compiler 
smart enough to optimize away the additional write in the non-error path?

Chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:08 question on common error-handling idiom Chris Friesen
@ 2004-11-02 20:58 ` Jan Engelhardt
  2004-11-02 21:12   ` linux-os
  2004-11-02 21:12   ` Russell Miller
  2004-11-02 21:16 ` Jesper Juhl
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Jan Engelhardt @ 2004-11-02 20:58 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux kernel

>There's something I've been wondering about for a while.  There is a lot of code
>in linux that looks something like this:
>
>err = -ERRORCODE
>if (error condition)
>	goto out;

That's because there might something as:

err = -EPERM;
if(error) { goto out; }
do something;
if(error2) { goto out; }
do something more;
if(error3) { goto out; }

Is shorter than:

if(error) { err = -EPERM; goto out; }
do something;
if(error2) { err = -EPERM; goto out; }
do something more;
if(error3) { err = -EPERM; goto out; }


>Is there any particular reason why the former is preferred?  Is the compiler

To keep it short. Because it might have been worse than just err =xxx:

if(error) {
    do this and that;
    and more;
    even more;
    more more;
    goto out;
}

Repeating that over and over is not that good. So we wrap it a little bit to do
a "staircase" deinitialization:

err = -EPERM;
if(error) { goto this_didnot_work; }
...
err = -ENOSPC;
if(error) { goto that_didnot_work; }


this_didnot_work:
  all uninitializations needed

that_didnot_work:
  all other uninit's

  return err;


So to summarize, it's done to reduce code whilst keeping the error code around
until we actually leave the function.


My € 0.02!


Jan Engelhardt
-- 
Gesellschaft für Wissenschaftliche Datenverarbeitung
Am Fassberg, 37077 Göttingen, www.gwdg.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:58 ` Jan Engelhardt
@ 2004-11-02 21:12   ` linux-os
  2004-11-03 10:45     ` Ross Kendall Axe
  2004-11-02 21:12   ` Russell Miller
  1 sibling, 1 reply; 12+ messages in thread
From: linux-os @ 2004-11-02 21:12 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Chris Friesen, Linux kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed, Size: 1916 bytes --]

On Tue, 2 Nov 2004, Jan Engelhardt wrote:

>> There's something I've been wondering about for a while.  There is a lot of code
>> in linux that looks something like this:
>>
>> err = -ERRORCODE
>> if (error condition)
>> 	goto out;
>
> That's because there might something as:
>
> err = -EPERM;
> if(error) { goto out; }
> do something;
> if(error2) { goto out; }
> do something more;
> if(error3) { goto out; }
>
> Is shorter than:
>
> if(error) { err = -EPERM; goto out; }
> do something;
> if(error2) { err = -EPERM; goto out; }
> do something more;
> if(error3) { err = -EPERM; goto out; }
>
>
>> Is there any particular reason why the former is preferred?  Is the compiler
>
> To keep it short. Because it might have been worse than just err =xxx:
>
> if(error) {
>    do this and that;
>    and more;
>    even more;
>    more more;
>    goto out;
> }
>
> Repeating that over and over is not that good. So we wrap it a little bit to do
> a "staircase" deinitialization:
>
> err = -EPERM;
> if(error) { goto this_didnot_work; }
> ...
> err = -ENOSPC;
> if(error) { goto that_didnot_work; }
>
>
> this_didnot_work:
>  all uninitializations needed
>
> that_didnot_work:
>  all other uninit's
>
>  return err;
>
>
> So to summarize, it's done to reduce code whilst keeping the error code around
> until we actually leave the function.
>
>
> My ÿÿ 0.02!
>
>
> Jan Engelhardt
> -- 
> Gesellschaft fÿÿr Wissenschaftliche Datenverarbeitung
> Am Fassberg, 37077 Gÿÿttingen, www.gwdg.de

I think it's just to get around the "uninitialized variable"
warning when the 'C' compiler doesn't know that it will
always be initialized.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.9 on an i686 machine (5537.79 BogoMips).
  Notice : All mail here is now cached for review by John Ashcroft.
                  98.36% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:58 ` Jan Engelhardt
  2004-11-02 21:12   ` linux-os
@ 2004-11-02 21:12   ` Russell Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Russell Miller @ 2004-11-02 21:12 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Chris Friesen, Linux kernel

On Tuesday 02 November 2004 14:58, Jan Engelhardt wrote:

> So to summarize, it's done to reduce code whilst keeping the error code
> around until we actually leave the function.
>
I understand what you're saying, the OP did raise a point that I think is 
worth repeating, that it's an extra instruction in all but error cases.  Is 
that extra instruction worth the tradeoff?

--Russell

>
> My € 0.02!
>
>
> Jan Engelhardt

-- 

Russell Miller - rmiller@duskglow.com - Le Mars, IA
Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs.
http://www.duskglow.com - 712-546-5886

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:08 question on common error-handling idiom Chris Friesen
  2004-11-02 20:58 ` Jan Engelhardt
@ 2004-11-02 21:16 ` Jesper Juhl
  2004-11-02 21:21   ` Oliver Neukum
  2004-11-02 21:29   ` Jan Engelhardt
  2004-11-03  8:11 ` GNicz
  2004-11-04 19:52 ` Linus Torvalds
  3 siblings, 2 replies; 12+ messages in thread
From: Jesper Juhl @ 2004-11-02 21:16 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux kernel

On Tue, 2 Nov 2004, Chris Friesen wrote:

> 
> There's something I've been wondering about for a while.  There is a lot of
> code in linux that looks something like this:
> 
> 
> err = -ERRORCODE
> if (error condition)
> 	goto out;
> 
> 
> While nice to read, it would seem that it might be more efficient to do the
> following:
> 
> if (error condition) {
> 	err = -ERRORCODE;
> 	goto out;
> }
> 
> 
> Is there any particular reason why the former is preferred?  Is the compiler
> smart enough to optimize away the additional write in the non-error path?
> 
There are some places that do

err = -SOMEERROR;
if (some_error)
	goto out;
if (some_other_error)
	goto out;
if (another_error)
	goto out;

In that case, where there are several different conditions that need 
testing, but they all need to return the same error, setting the error 
just once seems the best approach.

but for the places that do

err = -SOMEERROR;
if (condition)
	goto out;

err = -OTHERERROR;
if (condition)
	goto out;

I would tend to agree with you that moving the setting of the error inside 
the if() would make sense.

Let's see what other people think :)


--
Jesper Juhl


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 21:16 ` Jesper Juhl
@ 2004-11-02 21:21   ` Oliver Neukum
  2004-11-02 21:29   ` Jan Engelhardt
  1 sibling, 0 replies; 12+ messages in thread
From: Oliver Neukum @ 2004-11-02 21:21 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Chris Friesen, Linux kernel


> but for the places that do
> 
> err = -SOMEERROR;
> if (condition)
> 	goto out;
> 
> err = -OTHERERROR;
> if (condition)
> 	goto out;
> 
> I would tend to agree with you that moving the setting of the error inside 
> the if() would make sense.

The intention is to allow the compiler to turn the if into a simple
conditional branch on the assumption that gcc is not smart enough
to put the if's body out of line.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 21:16 ` Jesper Juhl
  2004-11-02 21:21   ` Oliver Neukum
@ 2004-11-02 21:29   ` Jan Engelhardt
  2004-11-02 21:48     ` Jesper Juhl
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Engelhardt @ 2004-11-02 21:29 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Chris Friesen, Linux kernel

>There are some places that do
>
>err = -SOMEERROR;
>if (some_error)
>	goto out;
>if (some_other_error)
>	goto out;
>if (another_error)
>	goto out;
>
>Let's see what other people think :)

err = -ESOME;
if(some_error || some_other_error || another_error) {
	goto out;
}

Best.

Jan Engelhardt
-- 
Gesellschaft für Wissenschaftliche Datenverarbeitung
Am Fassberg, 37077 Göttingen, www.gwdg.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 21:29   ` Jan Engelhardt
@ 2004-11-02 21:48     ` Jesper Juhl
  2004-11-03 16:49       ` Chris Friesen
  0 siblings, 1 reply; 12+ messages in thread
From: Jesper Juhl @ 2004-11-02 21:48 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Chris Friesen, Oliver Neukum, Linux kernel

On Tue, 2 Nov 2004, Jan Engelhardt wrote:

> >There are some places that do
> >
> >err = -SOMEERROR;
> >if (some_error)
> >	goto out;
> >if (some_other_error)
> >	goto out;
> >if (another_error)
> >	goto out;
> >
> >Let's see what other people think :)
> 
> err = -ESOME;
> if(some_error || some_other_error || another_error) {
> 	goto out;
> }
> 
> Best.
> 
Agreed, but that would potentially make something like the following (from 
fs/binfmt_elf.c::load_elf_binary) quite unreadable :

if (loc->elf_ex.e_type != ET_EXEC && loc->elf_ex.e_type != ET_DYN)
	goto out;
if (!elf_check_arch(&loc->elf_ex))
	goto out;
if (!bprm->file->f_op||!bprm->file->f_op->mmap)
	goto out;


But that's not even the most interresting case, the interresting one is 
the one that does

err = -ERR;
if (foo)
	goto out;

err = -ERROR;
if (bar)
	goto out;

where there's the potential to save an instruction by moving the err= into 
the error path.
But as Oliver Neukum pointed out gcc may not be smart enough for that to 
actually generate the best code.

Has anyone taken a look at what recent gcc's actually do with different 
variations of the constructs mentioned in this thread?


--
Jesper Juhl



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:08 question on common error-handling idiom Chris Friesen
  2004-11-02 20:58 ` Jan Engelhardt
  2004-11-02 21:16 ` Jesper Juhl
@ 2004-11-03  8:11 ` GNicz
  2004-11-04 19:52 ` Linus Torvalds
  3 siblings, 0 replies; 12+ messages in thread
From: GNicz @ 2004-11-03  8:11 UTC (permalink / raw)
  To: linux-kernel

Some code parts is using diffirent error handling.

if(err1) {
	clenup1;
	return -ERR1;
}

...

if(err2) {
	cleanup2;
	return -ERR2;
}

Shouldn't it be converted into:

err = -ERR1;
if(err1)
	goto cleanup1;

...

err = -ERR2;
if(err2)
	goto cleanup2;

I think there should be one methond, which will be used in all kernel code.

Regards, GNicz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 21:12   ` linux-os
@ 2004-11-03 10:45     ` Ross Kendall Axe
  0 siblings, 0 replies; 12+ messages in thread
From: Ross Kendall Axe @ 2004-11-03 10:45 UTC (permalink / raw)
  To: linux-os; +Cc: Jan Engelhardt, Chris Friesen, Linux kernel

[-- Attachment #1: Type: text/plain, Size: 430 bytes --]

linux-os wrote:

>>> There's something I've been wondering about for a while.  There is a 
>>> lot of code
>>> in linux that looks something like this:
>>>
>>> err = -ERRORCODE
>>> if (error condition)
>>>     goto out;
>>
> 
> 
> I think it's just to get around the "uninitialized variable"
> warning when the 'C' compiler doesn't know that it will
> always be initialized.
> 

gcc is smart enough to get this case right.

Ross


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 21:48     ` Jesper Juhl
@ 2004-11-03 16:49       ` Chris Friesen
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Friesen @ 2004-11-03 16:49 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Jan Engelhardt, Oliver Neukum, Linux kernel

Jesper Juhl wrote:

> Has anyone taken a look at what recent gcc's actually do with different 
> variations of the constructs mentioned in this thread?

I did, out of curiosity:

I used the following (admittedly simplistic) code, compiled with -O2.

int bbbb(int a)
{
	int err = -5;
	if (a == 1)
		goto out;
	err=0;
out:
	return err;
}

int cccc(int a)
{
	int err=0;
	if (a == 1) {
		err = -5;
		goto out;
	}
out:
	return err;
}


With gcc 3.2.2 for x86, both constructs generated the same code:

        pushl   %ebp
         movl    %esp, %ebp
         xorl    %eax, %eax
         cmpl    $1, 8(%ebp)
         setne   %al
         leal    -5(%eax,%eax,4), %eax
         leave
         ret

With gcc 2.96 (Mandrake) however, the standard construct generated this:


	pushl	%ebp
	movl	%esp, %ebp
	subl	$4, %esp
	movl	$-5, -4(%ebp)
	cmpl	$1, 8(%ebp)
	jne	.L3
	jmp	.L4
	.p2align 4,,7
.L3:
	movl	$0, -4(%ebp)
.L4:
	movl	-4(%ebp), %eax
	movl	%eax, %eax
	movl	%ebp, %esp
	popl	%ebp
	ret



While moving the err setting into the conditional generates the following:

	pushl	%ebp
	movl	%esp, %ebp
	subl	$4, %esp
	movl	$0, -4(%ebp)
	cmpl	$1, 8(%ebp)
	jne	.L6
	movl	$-5, -4(%ebp)
	jmp	.L7
	.p2align 4,,7
.L6:
	nop
.L7:
	movl	-4(%ebp), %eax
	movl	%eax, %eax
	movl	%ebp, %esp
	popl	%ebp
	ret


For PPC, gcc 3.3.3, the standard construct gave:

         xori 3,3,1
         addic 3,3,-1
         subfe 3,3,3
         rlwinm 3,3,0,30,28
         blr

While moving the err setting into the conditional generates the following:

         xori 3,3,1
         srawi 0,3,31
         xor 3,0,3
         subf 3,3,0
         srawi 3,3,31
         andi. 3,3,5
         addi 3,3,-5
         blr


So, it looks like the standard construct can actually generate better code in 
some cases, its almost never worse, and it's certainly nicer to read.

Chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: question on common error-handling idiom
  2004-11-02 20:08 question on common error-handling idiom Chris Friesen
                   ` (2 preceding siblings ...)
  2004-11-03  8:11 ` GNicz
@ 2004-11-04 19:52 ` Linus Torvalds
  3 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2004-11-04 19:52 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux kernel

On Tue, 2 Nov 2004, Chris Friesen wrote:
> 
> While nice to read, it would seem that it might be more efficient to do the 
> following:
> 
> if (error condition) {
> 	err = -ERRORCODE;
> 	goto out;
> }
> 
> 
> Is there any particular reason why the former is preferred?  Is the compiler 
> smart enough to optimize away the additional write in the non-error path?

Quite often, the additional write is _faster_ in the non-error path.

If it's a plain register, the obvious code generation for the above is

		..
		testl error-condition
		je no_error
		movl $-XXX,%eax
		jmp out;
	no_error:
		...

which is a lot slower (for the common non-error case) than

		movl $-XXX,%eax
		testl error-condition
		jne out

because forward branches are usually predicted not-taken when no other 
prediction exists.

You can do the same thing with

	if (unlikely(error condition)) {
		err = -ERRORCODE;
		goto out;
	}

which is hopefully even better, but the fact is, the regular

	err = -ERRORCODE;
	if (error condition)
		goto out;

is just _smaller_ and simpler and quite often more readable anyway.

It has the added advantage that it tends to stylistically match what I 
consider the proper error return behaviour, ie

	err = function_returns_errno(...)
	if (err)
		goto out;

ie it looks syntactically identical when "error condition" and the error 
value happen to be one and the same. Which is, after all, _the_ most 
common case.

So I personally tend to prefer the simple format for several reasons. 
It's small, it's consistent, and it maps well to good code generation.

The code generation part ends up being nice when something goes wrong.  
When somebody sends in an oops, I often end up having to look at the
disassembly (and no, a fancy debugger wouldn't help - I'm talking about
the disassembly of the "code" portion of the oops itself, and matching it
up with the source tree - the oops doesn't come with the whole binary),
and then having code generation match the source makes things a _lot_
easier.

Does it make sense for other projects? Dunno. But it is, as you noted, a 
common idiom in the kernel. Getting used to it just makes it even more 
readable.

			Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2004-11-04 19:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-11-02 20:08 question on common error-handling idiom Chris Friesen
2004-11-02 20:58 ` Jan Engelhardt
2004-11-02 21:12   ` linux-os
2004-11-03 10:45     ` Ross Kendall Axe
2004-11-02 21:12   ` Russell Miller
2004-11-02 21:16 ` Jesper Juhl
2004-11-02 21:21   ` Oliver Neukum
2004-11-02 21:29   ` Jan Engelhardt
2004-11-02 21:48     ` Jesper Juhl
2004-11-03 16:49       ` Chris Friesen
2004-11-03  8:11 ` GNicz
2004-11-04 19:52 ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox