From: Michael Breuer <mbreuer@majjas.com>
To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Subject: Re: x86 - cpu_relax - why nop vs. pause?
Date: Sun, 07 Feb 2010 16:15:05 -0500
Message-ID: <4B6F2D59.1070508@majjas.com>
In-Reply-To: <4B6F1DAE.6020407@majjas.com>

On 02/07/2010 03:08 PM, Michael Breuer wrote:
> On 2/7/2010 1:14 PM, Mike Galbraith wrote:
> , and this got me thinking... and testing... I think there's an 
> optimization issue with gcc:
>
> First of all - a bit of background on how I got here:
>
> After reading the Intel documentation, I tried replacing rep;nop with 
> pause (in theory exactly what's shown above). The system hung on booting.
> I then tried replacing only the nop with pause (giving rep;pause) and the 
> system booted. Using the above example, the opcode becomes f3 f3 90 vs 
> f3 90 (rep;nop).
>
> Given the above compiler test case, this seemed odd, to say the least. 
> So I played a bit more with gcc. It seems that the optimizer (-O3) 
> handles the three cases differently (objdump output below).
>
> Base code for all three cases (the only change is the asm volatile line, 
> as shown for each case):
>
> static inline void pause(void)
> {
>         asm volatile("pause" ::: "memory");
> }
>
> void main(void)
> {
>     pause();
> }
>
> Case1 - asm volatile("pause" ::: "memory");
> 0000000000400480 <main>:
>   400480:    f3 90                    pause
>   400482:    c3                       retq
>   400483:    90                       nop
>
> Case2 - asm volatile("rep;nop" ::: "memory") Note: this didn't inline!
>
> 0000000000400474 <pause>:
>   400474:    55                       push   %rbp
>   400475:    48 89 e5                 mov    %rsp,%rbp
>   400478:    f3 90                    pause
>   40047a:    c9                       leaveq
>   40047b:    c3                       retq
>
> 000000000040047c <main>:
>   40047c:    55                       push   %rbp
>   40047d:    48 89 e5                 mov    %rsp,%rbp
>   400480:    e8 ef ff ff ff           callq  400474 <pause>
>   400485:    c9                       leaveq
>   400486:    c3                       retq
>   400487:    90                       nop
>   400488:    90                       nop
>   400489:    90                       nop
>   40048a:    90                       nop
>   40048b:    90                       nop
>   40048c:    90                       nop
>   40048d:    90                       nop
>   40048e:    90                       nop
>   40048f:    90                       nop
>
> Case3 - asm volatile("rep;pause" ::: "memory")
> 0000000000400480 <main>:
>   400480:    f3 f3 90                 pause
>   400483:    c3                       retq
>   400484:    90                       nop
> _______
> Note the difference between opcodes case 1 and case 3, and the mess 
> made by the compiler in case 2.
>
> As to benchmarks  - I've checked a few things, no formal or lasting 
> stuff... but striking at first glance:
>
> 1) At idle, perf top shows time spent in _raw_spin_lock dropping from 
> ~35% to ~25%.
> 2) Running a media transcode (single core - handbrakecli): frame rate 
> increased by about 5-10%.
> 3) During file-intensive operations (#2, above, or copying large files 
> - ext4 on software raid6) - latencytop shows a decrease in the time to 
> write a page to disk from about 120ms to about 90ms.
Disregard case 2: I was missing -O3 on that build. With -O3 or -O2, 
rep;nop and pause produce identical code (f3 90). The interesting case is 
rep;pause, which encodes differently (f3 f3 90) and seems more efficient.
