From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754457Ab0LOOaV (ORCPT ); Wed, 15 Dec 2010 09:30:21 -0500 Received: from mx1.redhat.com ([209.132.183.28]:3599 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754433Ab0LOOaU (ORCPT ); Wed, 15 Dec 2010 09:30:20 -0500 Subject: Re: likely() vs. unlikely() From: Steven Rostedt To: Daniel Kopko Cc: linux-kernel@vger.kernel.org In-Reply-To: <294021.5068.qm@web120709.mail.ne1.yahoo.com> References: <294021.5068.qm@web120709.mail.ne1.yahoo.com> Content-Type: text/plain; charset="UTF-8" Organization: Red Hat Date: Wed, 15 Dec 2010 09:30:18 -0500 Message-ID: <1292423418.2733.220.camel@fedora> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2010-12-14 at 22:42 -0800, Daniel Kopko wrote: > Hello, Mr. Rostedt, LKML, > > I've noticed the patch series by Steven Rostedt. I am a bit of a lurker here, > but I noticed something that I could perhaps contribute to. Mr. Rostedt has > done some great work deducing exactly whether or not these clauses meet their > stated presumptions of "likeliness". However, I think there may be some cases > where accurately codifying branch biases based on literal likeliness might > produce worse performance overall. An example: > > if(X) > some_expensive_code(); //takes 500 ms Nothing in the kernel proper should ever take 500ms. > else > some_very_cheap_code(); //takes 100 us > > Now, let's say X is true 90% of the time. The literal encoding of that would be > "if(likely(X))". However, it may make much more sense to encode it *wrongly* > for the sake of cheapening the already cheap code, as the delay of the branch > misprediction may be readily "absorbed" into the more expensive code. In which > case, even with X being likely, we may want to encode it as "if(unlikely(X))". > (Also, to avoid obscuring things, please keep in mind that the bodies of the two > halves of the branch above need not actually be function calls.) Doesn't matter if they are function calls or not. > > I think that this type of thing may be most noticeable around any branches where > there is a fastpath that may be run if ideal conditions are met, but which are > met less than 50% of the time. Then that's not a fastpath. A definition of a fastpath has nothing to do with the amount of time that path takes. We call something a fastpath when it is hit 90% of the time and hit often. We want that to be as fast as possible, even if it takes 500ms compared to the 10% of 100us. If you look at the big picture (the entire running system) adding a missed branch prediction(*) to 90% of a single branch is going to be larger than having it hit the branch that is only 10% taken. Also note, I honestly believe that most of the branch annotations should be removed unless they are correct 90% of the time. But I do not remove them blindly, so it takes a bit of work for each and every change. > In such cases, the likely()/unlikely() may be > used "wrongly" to cause the branch misprediction to occur in the > already-high-latency (some_expensive_function()) case, and lower latencies in > the already-low-latency (some_very_cheap_function()) case. This would lead to > lower attainable latencies overall (by at least the cost of a branch miss which > would otherwise have been spent in the fast code), and further encourage coding > to meet the ideal conditions of the fastpath. Which is not what we call a fast path. > > So, several points: > 1) Please let me know if any of the above is outright wrong. Already stated ;-) > 2) I don't know if any such cases occur in the likely()/unlikely() patch > series. A place where it obviously DOESN'T occur would be: > http://marc.info/?l=linux-kernel&m=129229014528892&w=2 > A place where I thought it MAY occur: > http://marc.info/?l=linux-kernel&m=129228728125413&w=2 > 3) If there is overall agreement on the above, then I would also suggest that > perhaps some additional macro names would be appropriate for the > __builtin_expect() use (for cases where we want __builtin_expect(!!(X),1), but > for which it isn't truly "likely", and for cases where we want > __builtin_expect((X), 0), but for which it isn't truly "unlikely"). These would > be parallel to likely()/unlikely() and have the same implementations, but > different titles, to better document the intent of the code where they're used. > Names maybe slow_branch_path() and fast_branch_path()? > slow_branch()/fast_branch()? > 4) I'm very sorry if this winds up ill-formatted. I have a yahoo webmail > client. Open to suggestions for different free email providers on this front. Lets look at a very short path that is done all over the place: if (unlikely(mypointer == NULL)) return; This is done all over the place. And it fits your definition of a fast path. Because all it does is end the function. Where if we were to continue, the path could be much longer. But if this function is called 1000 times a second, we want all branches to be as little of a hindrance as possible. -- Steve (*) the likely() and unlike()'s do not really do anything with branch prediction of most archs. Some archs allow for a flag that can be added to a branch condition in assembly to what you expect it will take. Intel does not. But it does affect the way gcc orders code, which causes instruction cache misses. And there is a bit of default predictions that we want. A compare and branch is default to not branch. cmp x be 1: [...] 1: The cpu will be pulling in the instructions from memory and placing it into the instruction cache. Branch prediction helps when the CPU sees the branch and if there's already a prediction made, it can pull in the location of the destination of that branch, if it has not done so already. Lets say the cache size is 128 bytes, and that branch was pulled in at the beginning of the cache line. We then have 124 bytes of instructions after it even if that branch is expected to be taken. If we have the following code: if (x) r = 1; a = 2; If we consider likely(x) then we will have this: cmp x be 1: ;; if (x == 0) branch ld r, 1 1: ld a, 2 But if x is unlikely 90% of the time, we don't want to even bother jumping around it. 10% of the time it's ok to just make it slower. cmp x bne 2: ;; if (x != 0) branch 1: ld a, 1 [...] 2: ld r, 1 jmp 1b We just removed a line of code from our cache line that we don't want in there in the first place. A lot of error code is like this. We test for errors all the time, but we do not expect that error code to ever be taken, so we want it out of our cache lines as much as possible.