* [PATCH] use unlikely() for current_kernel_time() loop
@ 2006-06-07 17:36 Andreas Dilger
2006-06-08 2:28 ` Andi Kleen
0 siblings, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2006-06-07 17:36 UTC (permalink / raw)
To: Adrian's Trivial Patches; +Cc: linux-kernel
Hello,
I just noticed this minor optimization. current_kernel_time() is called
from current_fs_time() so it is used fairly often but it doesn't use
unlikely(read_seqretry(&xtime_lock, seq)) as other users of xtime_lock do.
Also removes extra whitespace on the empty line above.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
--- ./kernel/time.c.orig 2006-05-08 15:40:43.000000000 -0600
+++ ./kernel/time.c 2006-06-07 11:24:49.000000000 -0600
@@ -424,9 +424,9 @@ inline struct timespec current_kernel_time
do {
seq = read_seqbegin(&xtime_lock);
-
+
now = xtime;
- } while (read_seqretry(&xtime_lock, seq));
+ } while (unlikely(read_seqretry(&xtime_lock, seq)));
return now;
}
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-07 17:36 [PATCH] use unlikely() for current_kernel_time() loop Andreas Dilger
@ 2006-06-08 2:28 ` Andi Kleen
2006-06-08 5:01 ` Andrew Morton
0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 2:28 UTC (permalink / raw)
To: Andreas Dilger; +Cc: linux-kernel
Andreas Dilger <adilger@clusterfs.com> writes:
> Hello,
> I just noticed this minor optimization. current_kernel_time() is called
> from current_fs_time() so it is used fairly often but it doesn't use
> unlikely(read_seqretry(&xtime_lock, seq)) as other users of xtime_lock do.
> Also removes extra whitespace on the empty line above.
It would be better to put the unlikely into the read_seqretry I guess.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 2:28 ` Andi Kleen
@ 2006-06-08 5:01 ` Andrew Morton
2006-06-08 5:39 ` Andi Kleen
0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2006-06-08 5:01 UTC (permalink / raw)
To: Andi Kleen; +Cc: adilger, linux-kernel
On 08 Jun 2006 04:28:12 +0200
Andi Kleen <ak@suse.de> wrote:
> Andreas Dilger <adilger@clusterfs.com> writes:
>
> > Hello,
> > I just noticed this minor optimization. current_kernel_time() is called
> > from current_fs_time() so it is used fairly often but it doesn't use
> > unlikely(read_seqretry(&xtime_lock, seq)) as other users of xtime_lock do.
> > Also removes extra whitespace on the empty line above.
>
> It would be better to put the unlikely into the read_seqretry I guess.
>
yup. But it'd be good to check that this actually causes the compiler to
do the right thing, rather than simply ignoring it.
I'm not sure how one would do that though. I guess compare
before-and-after assembly code, work out if "after" is better.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 5:01 ` Andrew Morton
@ 2006-06-08 5:39 ` Andi Kleen
2006-06-08 6:41 ` Brian F. G. Bidulock
0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 5:39 UTC (permalink / raw)
To: Andrew Morton; +Cc: adilger, linux-kernel
On Thursday 08 June 2006 07:01, Andrew Morton wrote:
> On 08 Jun 2006 04:28:12 +0200
> Andi Kleen <ak@suse.de> wrote:
>
> > Andreas Dilger <adilger@clusterfs.com> writes:
> >
> > > Hello,
> > > I just noticed this minor optimization. current_kernel_time() is called
> > > from current_fs_time() so it is used fairly often but it doesn't use
> > > unlikely(read_seqretry(&xtime_lock, seq)) as other users of xtime_lock do.
> > > Also removes extra whitespace on the empty line above.
> >
> > It would be better to put the unlikely into the read_seqretry I guess.
> >
>
> yup. But it'd be good to check that this actually causes the compiler to
> do the right thing, rather than simply ignoring it.
If it was put into a macro wrapper it should be safe enough.
>
> I'm not sure how one would do that though. I guess compare
> before-and-after assembly code, work out if "after" is better.
Nothing on x86-64 at least - it uses -fno-reorder-blocks by default.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 5:39 ` Andi Kleen
@ 2006-06-08 6:41 ` Brian F. G. Bidulock
2006-06-08 6:51 ` Andi Kleen
0 siblings, 1 reply; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 6:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
>
> Nothing on x86-64 at least - it uses -fno-reorder-blocks by default.
>
Why is that?
--brian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 6:41 ` Brian F. G. Bidulock
@ 2006-06-08 6:51 ` Andi Kleen
2006-06-08 7:00 ` Brian F. G. Bidulock
0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 6:51 UTC (permalink / raw)
To: bidulock; +Cc: Andrew Morton, adilger, linux-kernel
On Thursday 08 June 2006 08:41, Brian F. G. Bidulock wrote:
> Andi,
>
> On Thu, 08 Jun 2006, Andi Kleen wrote:
> >
> > Nothing on x86-64 at least - it uses -fno-reorder-blocks by default.
> >
>
> Why is that?
Originally because it made assembly too unreadable. Later it was discovered
it produces smaller code too.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 6:51 ` Andi Kleen
@ 2006-06-08 7:00 ` Brian F. G. Bidulock
2006-06-08 7:07 ` Andi Kleen
0 siblings, 1 reply; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 7:00 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
>
> Originally because it made assembly too unreadable. Later it was discovered
> it produces smaller code too.
>
Thank you for the explanation. But, this brings to mind two other questions:
Does the option not also make assembly less readable on other architectures?
If one is interested in smaller code, why not use -Os?
Also, does -fno-reorder-blocks actually defeat __builtin_expect()?
(GCC documentation doesn't really say that.)
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 7:00 ` Brian F. G. Bidulock
@ 2006-06-08 7:07 ` Andi Kleen
2006-06-08 7:43 ` Brian F. G. Bidulock
2006-06-08 8:59 ` Brian F. G. Bidulock
0 siblings, 2 replies; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 7:07 UTC (permalink / raw)
To: bidulock; +Cc: Andrew Morton, adilger, linux-kernel
On Thursday 08 June 2006 09:00, Brian F. G. Bidulock wrote:
> > Originally because it made assembly too unreadable. Later it was discovered
> > it produces smaller code too.
> >
>
> Thank you for the explanation. But, this brings to mind two other questions:
>
> Does the option not also make assembly less readable on other architectures?
Yes it does. Maybe the people who spend a lot of time debugging these
don't feel the pain anymore.
>
> If one is interested in smaller code, why not use -Os?
We already use -Os if you set the right config (but this change was actually
long before that)
It's possible that -Os includes -fno-block-reordering though.
> Also, does -fno-reorder-blocks actually defeat __builtin_expect()?
> (GCC documentation doesn't really say that.)
AFAIK gcc mostly uses the probability information for block reordering
to make the fast path fall through without jumps.
There are some more uses, but they don't impact the code very much.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 7:07 ` Andi Kleen
@ 2006-06-08 7:43 ` Brian F. G. Bidulock
2006-06-08 8:59 ` Brian F. G. Bidulock
1 sibling, 0 replies; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 7:43 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
>
> AFAIK gcc mostly uses the probability information for block reordering
> to make the fast path fall through without jumps.
After a quick try on RH EL4 gcc 3.4.4-2 it appears that
-fno-reorder-blocks indeed defeats __builtin_expect() as you
say. (Which is rather bizarre as __builtin_expect() no longer
does what one expects.) I think that I'm going to strip it out
for my externally compiled modules. Otherwise, the source code
rearrangements necessary to get the same effect will make the
source code unreadable and generate larger code, which I think
is worse than those effects on the assembler code.
Thanks again.
--brian
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 7:07 ` Andi Kleen
2006-06-08 7:43 ` Brian F. G. Bidulock
@ 2006-06-08 8:59 ` Brian F. G. Bidulock
2006-06-08 9:14 ` Andi Kleen
1 sibling, 1 reply; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 8:59 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
>
> > > Originally because it made assembly too unreadable. Later it was discovered
> > > it produces smaller code too.
> > >
Another quick check on Intel 630. w/o -fno-reorder-blocks: code increased 2% in
size; performance increased 2% per hyperthread; code still as readable with
objdump -S -d. Rather marginal one way or the other.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 8:59 ` Brian F. G. Bidulock
@ 2006-06-08 9:14 ` Andi Kleen
2006-06-08 9:36 ` Brian F. G. Bidulock
0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 9:14 UTC (permalink / raw)
To: bidulock; +Cc: Andrew Morton, adilger, linux-kernel
> performance increased 2% per hyperthread;
That would surprise me. Most likely you made some measurement error.
I guess it can be a difference if it's used in very hot loops, but then
i would expect a fallthrough path to win. Kernel code very rarely
has hot loops like this.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 9:14 ` Andi Kleen
@ 2006-06-08 9:36 ` Brian F. G. Bidulock
2006-06-08 10:20 ` Andi Kleen
0 siblings, 1 reply; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 9:36 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
>
> > performance increased 2% per hyperthread;
>
> That would surprise me. Most likely you made some measurement error.
>
Don't think there was an error. Tests performed on STREAMS external
modules optimized with profiling data.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 9:36 ` Brian F. G. Bidulock
@ 2006-06-08 10:20 ` Andi Kleen
2006-06-08 10:50 ` Brian F. G. Bidulock
0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2006-06-08 10:20 UTC (permalink / raw)
To: bidulock; +Cc: Andrew Morton, adilger, linux-kernel
On Thursday 08 June 2006 11:36, Brian F. G. Bidulock wrote:
> Andi,
>
> On Thu, 08 Jun 2006, Andi Kleen wrote:
> >
> > > performance increased 2% per hyperthread;
> >
> > That would surprise me. Most likely you made some measurement error.
> >
>
> Don't think there was an error. Tests performed on STREAMS external
> modules optimized with profiling data.
You mean with compiler profile feedback?
If that is slower then it would sound like a compiler bug.
-Andi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] use unlikely() for current_kernel_time() loop
2006-06-08 10:20 ` Andi Kleen
@ 2006-06-08 10:50 ` Brian F. G. Bidulock
0 siblings, 0 replies; 14+ messages in thread
From: Brian F. G. Bidulock @ 2006-06-08 10:50 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, adilger, linux-kernel
Andi,
On Thu, 08 Jun 2006, Andi Kleen wrote:
> >
> > Don't think there was an error. Tests performed on STREAMS external
> > modules optimized with profiling data.
>
> You mean with compiler profile feedback?
No, hand optimized from oprofile runs.
> If that is slower then it would sound like a compiler bug.
What slower? It was 2% faster. And that was only the external
kernel modules compiled without the -fno-redorder-blocks flag.
--brian
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2006-06-08 10:51 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-07 17:36 [PATCH] use unlikely() for current_kernel_time() loop Andreas Dilger
2006-06-08 2:28 ` Andi Kleen
2006-06-08 5:01 ` Andrew Morton
2006-06-08 5:39 ` Andi Kleen
2006-06-08 6:41 ` Brian F. G. Bidulock
2006-06-08 6:51 ` Andi Kleen
2006-06-08 7:00 ` Brian F. G. Bidulock
2006-06-08 7:07 ` Andi Kleen
2006-06-08 7:43 ` Brian F. G. Bidulock
2006-06-08 8:59 ` Brian F. G. Bidulock
2006-06-08 9:14 ` Andi Kleen
2006-06-08 9:36 ` Brian F. G. Bidulock
2006-06-08 10:20 ` Andi Kleen
2006-06-08 10:50 ` Brian F. G. Bidulock
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox