Re: [PATCH 1/3] Fix Unlikely(x) == y


From: Andi Kleen <andi@firstfloor.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Roel Kluin <12o3l@tiscali.nl>,
	lkml <linux-kernel@vger.kernel.org>,
	cbe-oss-dev@ozlabs.org, linuxppc-dev@ozlabs.org,
	Andi Kleen <andi@firstfloor.org>, Willy Tarreau <w@1wt.eu>,
	Arjan van de Ven <arjan@infradead.org>
Subject: Re: [PATCH 1/3] Fix Unlikely(x) == y
Date: Tue, 19 Feb 2008 10:57:02 +0100
Message-ID: <20080219095702.GA6940@one.firstfloor.org>
In-Reply-To: <200802192046.46955.nickpiggin@yahoo.com.au>


On Tue, Feb 19, 2008 at 08:46:46PM +1100, Nick Piggin wrote:
> On Tuesday 19 February 2008 20:25, Andi Kleen wrote:
> > On Tue, Feb 19, 2008 at 01:33:53PM +1100, Nick Piggin wrote:
> 
> > > I actually once measured context switching performance in the scheduler,
> > > and removing the unlikely hint for testing RT tasks IIRC gave about 5%
> > > performance drop.
> >
> > OT: what benchmarks did you use for that? I had a change some time
> > ago to the CFS scheduler to avoid unpredicted indirect calls for
> > the common case, but I wasn't able to benchmark a difference with the usual
> > suspect benchmark (lmbench). Since it increased code size by
> > a few bytes it was rejected then.
> 
> I think it was just a simple context switch benchmark, but not lmbench
> (which I found to be a bit too variable). But it was a long time ago...

Do you still have it?

I thought about writing my own but ended up being too lazy for that @)
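
For reference, a minimal sketch of what such a simple context switch
benchmark could look like. This is not Nick's benchmark (which is not
shown in this thread); it is the common pipe ping-pong approach, and the
function name pingpong_ns_per_round_trip() is made up for illustration.
A parent and a child bounce one byte over two pipes, so each round trip
forces at least two context switches when both processes share a CPU.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the average time in nanoseconds for one
 * parent -> child -> parent round trip, or a negative value on error.
 */
double pingpong_ns_per_round_trip(long rounds)
{
	int to_child[2], to_parent[2];
	struct timespec start, end;
	char byte = 'x';
	int failed = 0;
	pid_t pid;

	if (rounds <= 0 || pipe(to_child) < 0)
		return -1.0;
	if (pipe(to_parent) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1.0;
	}

	pid = fork();
	if (pid < 0) {
		close(to_child[0]);
		close(to_child[1]);
		close(to_parent[0]);
		close(to_parent[1]);
		return -1.0;
	}

	if (pid == 0) {
		/* Child: echo each byte back until the parent closes its end. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &byte, 1) == 1)
			if (write(to_parent[1], &byte, 1) != 1)
				break;
		_exit(0);
	}

	/* Parent: keep only the ends it uses. */
	close(to_child[0]);
	close(to_parent[1]);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < rounds; i++) {
		if (write(to_child[1], &byte, 1) != 1 ||
		    read(to_parent[0], &byte, 1) != 1) {
			failed = 1;
			break;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	close(to_child[1]);	/* EOF on the pipe makes the child exit */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);

	if (failed)
		return -1.0;
	return ((end.tv_sec - start.tv_sec) * 1e9 +
		(end.tv_nsec - start.tv_nsec)) / rounds;
}
```

Call it from a small main() with a large round count and print the
result. Running the binary pinned to one CPU (for example with
taskset -c 0) keeps both processes on the same runqueue; without
pinning, the numbers mostly reflect cross-CPU wakeup latency and will
vary a lot more from run to run.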

> 
> > > However, the P4's branch predictor is pretty good, and it should easily
> >
> > I think it depends on the generation. Prescott class branch
> > prediction should be much better than the earlier ones.
> 
> I was using a Nocona Xeon, which I think is a Prescott class? 

Yes.

> And don't they have much higher mispredict penalty (than older P4s)?

They do have a longer pipeline, so yes more penalty (5 or 6 stages more iirc),
but also a lot better branch predictor which makes up for that.

> 
> 
> > > Actually one thing I don't like about gcc is that I think it still emits
> > > cmovs for likely/unlikely branches,
> >
> > That's -Os.
> 
> And -O2 and -O3, on the gccs that I'm using, AFAIKS.

Well if it still happens on gcc 4.2 with P4 tuning you should
perhaps open a gcc PR. They tend to ignore these bugs mostly in
my experience, but sometimes they act on them. 
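
A small test case makes this easy to check for a given compiler. The
sketch below mirrors the kernel's usual likely()/unlikely() definitions
on top of __builtin_expect; the function clamp_low() is a made-up
example, not code from the kernel.

```c
/* Mirrors the usual kernel-style definitions. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Hypothetical test function: the assignment is annotated as rarely
 * executed, so a compiler that honors the hint could emit a
 * conditional jump around it instead of a cmov.
 */
int clamp_low(int v, int floor)
{
	if (unlikely(v < floor))
		v = floor;
	return v;
}
```

Compiling this with -S at -Os, -O2 and -O3, with and without a tuning
option such as -mtune=nocona, and searching the assembly for cmov shows
what a particular gcc version does with the annotation. The result
depends on the compiler version and target, so no specific output is
claimed here.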

> 
> 
> > > which is silly (the gcc developers
> >
> > It depends on the CPU. e.g. on K8 and P6 using CMOV if possible
> > makes sense. P4 doesn't like it though.
> 
> If the branch is completely predictable (eg. annotated), then I
> think branches should be used anyway. Even on well predicted
> branches, cmov is similar speed on microbenchmarks, but it will
> increase data hazards I think, so it will probably be worse for
> some real world situations.

At least the respective optimization manuals say CMOV should be used.
I presume they only made this recommendation after some extensive
benchmarking.

> 
> 
> > > the quite good numbers that cold CPU predictors can attain. However
> > > for really performance critical code (or really "never" executed
> > > code), then I think it is OK to have the hints and not have to rely
> > > on gcc heuristics.
> >
> > But only when the explicit hints are different from what the implicit
> > branch predictors would predict anyways. And if you look at the
> > heuristics that is not often the case...
> 
> But a likely branch will be _strongly_ predicted to be taken,
> whereas a lot of the gcc heuristics simply have slightly more or
> slightly less probability. So it's not just a question of which
> way is more likely, but also _how_ likely it is to go that way.

Yes, but a lot of the heuristics are pretty strong (>80%) and gcc will
act on them unless it has a very strong contra cue. And that should
normally not be the case.
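
To illustrate the "how likely" point: gcc releases much newer than the
ones discussed in this thread (GCC 9 and later) provide
__builtin_expect_with_probability, which lets the source state the
probability explicitly instead of only the direction. The sketch below
assumes such a compiler and falls back to plain __builtin_expect
otherwise; the macro expect_true_with() and the function
sum_nonnegative() are made-up names.

```c
/* Detect the newer builtin where the compiler can tell us about it. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
#    define HAVE_EXPECT_PROB 1
#  endif
#endif

#ifdef HAVE_EXPECT_PROB
/* p must be a compile-time constant in the range [0.0, 1.0]. */
#define expect_true_with(x, p) \
	__builtin_expect_with_probability(!!(x), 1, (p))
#else
/* Fallback: direction only, the probability argument is ignored. */
#define expect_true_with(x, p)	__builtin_expect(!!(x), 1)
#endif

/* Hypothetical example: negative entries are assumed to be very rare. */
long sum_nonnegative(const int *v, int n)
{
	long sum = 0;

	for (int i = 0; i < n; i++) {
		if (expect_true_with(v[i] >= 0, 0.99))
			sum += v[i];
	}
	return sum;
}
```

The annotation only affects code layout and optimization decisions, not
the result, so the function behaves the same with either definition of
the macro.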

-Andi
