Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns


From: Ingo Molnar <mingo@kernel.org>
To: Dirk Gouders <dirk@gouders.net>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Namhyung Kim <namhyung@kernel.org>
Cc: Uros Bizjak <ubizjak@gmail.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linus Torvalds <torvalds@linuxfoundation.org>
Subject: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns
Date: Thu, 6 Mar 2025 11:59:20 +0100
Message-ID: <Z8mACAi4-kN4uBLz@gmail.com>
In-Reply-To: <gh8qpil9d3.fsf@gouders.net>


( I've Cc:-ed some perf gents as to the measurement artifacts observed 
  below. Full report quoted below. )

* Dirk Gouders <dirk@gouders.net> wrote:

> Hi Ingo,
> 
> my interest arises because I have just started trying to better understand
> PCL and am reading the perf manual pages.  Perhaps I should therefore
> keep my RO-bit set for a few more months, but:
> 
> > And if the benchmark is context-switching heavy, you'll want to use 
> > 'perf stat -a' option to not have PMU context switching costs, and the 
> 
> I'm sure you know what you are talking about, so I don't doubt the above
> is correct, but perhaps the manual page should also clarify -a:
> 
> -a::
> --all-cpus::
>         system-wide collection from all CPUs (default if no target is specified)
> 
> In the last example -a is combined with -C 2, which is even more confusing when
> you have just started with the manual pages.
> 
> 
> But the main reason why I thought it might be OK to toggle my RO-bit
> this once is that I tried your examples, and with the first one I get way
> higher numbers than yours.  I thought that must be because you simply
> own the faster machine (as I would have expected):
> 
> >  starship:~> perf bench sched pipe
> >  # Running 'sched/pipe' benchmark:
> >  # Executed 1000000 pipe operations between two processes
> >
> >      Total time: 6.939 [sec]
> >
> >        6.939128 usecs/op
> >          144110 ops/sec
> 
> lena:~> perf bench sched pipe
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
> 
>      Total time: 11.129 [sec]
> 
>       11.129952 usecs/op
>           89847 ops/sec
> 
> And I expected this to continue throughout the examples.
> 
> But -- to keep this short -- with the last example, my numbers are
> suddenly significantly lower than yours:
> 
> >  starship:~> taskset 0x4 perf stat -a -C 2 -e instructions --repeat 5 perf bench sched pipe
> >        5.808068 usecs/op
> >        5.843716 usecs/op
> >        5.826543 usecs/op
> >        5.801616 usecs/op
> >        5.793129 usecs/op
> >
> >  Performance counter stats for 'system wide' (5 runs):
> >
> >     32,244,691,275      instructions                                                            ( +-  0.21% )
> >
> >            5.81624 +- 0.00912 seconds time elapsed  ( +-  0.16% )
> 
> lena:~> taskset 0x4 perf stat -a -C 2 -e instructions --repeat 5 perf bench sched pipe
>        4.204444 usecs/op
>        4.169279 usecs/op
>        4.186812 usecs/op
>        4.217039 usecs/op
>        4.208538 usecs/op
> 
>  Performance counter stats for 'system wide' (5 runs):
> 
>     14,196,762,588      instructions                                                            ( +-  0.04% )
> 
>            4.20203 +- 0.00854 seconds time elapsed  ( +-  0.20% )
> 
> 
> 
> Of course, I don't want to waste anyone's time if this is such an obvious
> thing that only newbies don't understand.  So, feel free to just ignore this.
> 
> Regards
> 
> Dirk
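
As a rough cross-check of the figures quoted above, the sketch below
divides each reported instruction count by the 1,000,000 pipe operations
and compares the mean usecs/op against the reported elapsed time. It
assumes the instruction counts are per-run means for CPU 2 only (as
'-a -C 2' with '--repeat 5' suggests); the names OPS, runs and summarize
are illustrative and not part of perf. It only restates the quoted data
and does not explain the difference between the two machines.

```python
# Cross-check of the quoted 'perf stat' output. All figures are copied from
# the two reports above; OPS, runs and summarize() are illustrative names.

OPS = 1_000_000  # "Executed 1000000 pipe operations between two processes"

runs = {
    "starship": {
        "usecs_per_op": [5.808068, 5.843716, 5.826543, 5.801616, 5.793129],
        "instructions": 32_244_691_275,
        "elapsed_sec": 5.81624,
    },
    "lena": {
        "usecs_per_op": [4.204444, 4.169279, 4.186812, 4.217039, 4.208538],
        "instructions": 14_196_762_588,
        "elapsed_sec": 4.20203,
    },
}


def summarize(run):
    """Derive per-operation figures from one host's quoted report."""
    mean_usecs = sum(run["usecs_per_op"]) / len(run["usecs_per_op"])
    return {
        "mean_usecs_per_op": mean_usecs,
        # With 1,000,000 operations, usecs/op is numerically the run time in seconds.
        "implied_seconds": mean_usecs * OPS / 1e6,
        # Assumption: the counter value is the per-run mean on CPU 2.
        "instructions_per_op": run["instructions"] / OPS,
    }


summary = {host: summarize(run) for host, run in runs.items()}
ratio = (summary["starship"]["instructions_per_op"]
         / summary["lena"]["instructions_per_op"])

for host, s in summary.items():
    print(f"{host}: {s['mean_usecs_per_op']:.3f} usecs/op, "
          f"{s['instructions_per_op']:.0f} instructions/op on CPU 2")
print(f"instructions/op ratio (starship / lena): {ratio:.2f}")
```

The implied run times agree with the reported elapsed times to within a
few milliseconds on both machines, so the usecs/op lines and the 'time
elapsed' lines are consistent with each other; the per-operation
instruction counts are where the two reports diverge.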
