From: David Malcolm <dmalcolm@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ard Biesheuvel <ardb@kernel.org>,
linux-toolchains@vger.kernel.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Jason Baron <jbaron@akamai.com>,
"Steven Rostedt (VMware)" <rostedt@goodmis.org>
Subject: Re: static_branch/jump_label vs branch merging
Date: Fri, 09 Apr 2021 17:07:15 -0400
Message-ID: <5a07bde1a9fa9a056a19637399b0635252ddb303.camel@redhat.com>
In-Reply-To: <YHC0dgwhYS9hKcRT@hirez.programming.kicks-ass.net>

On Fri, 2021-04-09 at 22:09 +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 03:21:49PM -0400, David Malcolm wrote:
> > [Caveat: I'm a gcc developer, not a kernel expert]
> >
> > But it's not *quite* a global constant, or presumably you would be
> > simply using a global constant, right? As the optimizer gets smarter,
> > you don't want to have it one day decide that actually it really is
> > constant, and optimize away everything at compile-time (e.g. when LTO
> > is turned on, or whatnot).
>
> Right; as I said, the result is not a constant, but any invocation ever,
> will return the same result. Small but subtle difference :-)
>
> > I get the impression that you're resorting to assembler because you're
> > pushing beyond what the C language can express.
>
> Of course :-) I tend to always push waaaaay past what C considers sane.
> Let's say I'm firmly in the C-as-Optimizing-Assembler camp :-)

Yeah, I got that :)

> > Taking things to a slightly higher level, am I right in thinking that
> > what you're trying to achieve is a control flow construct that almost
> > always takes one of the given branches, but which can (very rarely) be
> > switched to permanently take one of the other branches, and that you
> > want the lowest possible overhead for the common case where the
> > control flow hasn't been touched yet?
>
> Correct, that's what it is. We do runtime code patching to flip the
> branch if/when needed. We've been doing this for many many years now.
>
> The issue of today is all this clever stuff defeating some simple
> optimizations.

It's certainly clever - though, if you'll forgive me, that's not always
a good thing :)

> > (and presumably little overhead for when it has been?)... and that
> > you want to be able to merge repeated such conditionals.
>
> This.. So the 'static' branches have been upstream and in use ever since
> GCC added asm-goto, it was in fact the driving force to get asm-goto
> implemented. This was 2010 according to git history.
>
> So we emit, using asm goto, either a "NOP5" or "JMP.d32" (x86 speaking),
> and a special section entry into which we encode the key address and the
> instruction address and the jump target.
>
> GCC, not knowing what the asm does, only sees the 2 edges and all is well.
>
> Then, at runtime, when we decide we want the other edge for a given key,
> we iterate our section and rewrite the code to either nop5 or jmp.d32
> with the correct jump target.
>
> > It's kind of the opposite of "volatile" - something that the user is
> > happy for the compiler to treat as not changing much, as opposed to
> > something the user is warning the compiler about changing from under
> > it. A "const-ish" value?
>
> Just so. Encoded in text, not data.
>
> > Sorry if I'm being incoherent; I'm kind of thinking aloud here.
>
> No problem, we're way outside of what is generally considered normal,
> and I did somewhat assume people were familiar with our 'dodgy'
> construct (some on this list are more than others).
>
> I hope it's all a little clearer now.

Yeah. This is actually on two mailing lists; I'm only subscribed to
linux-toolchains, which AIUI is about sharing ideas between Linux and
the toolchains.

You've built a very specific thing out of asm-goto to fulfil the tough
requirements you outlined above - as well as the nops, there's an entry
in a special section to contend with.
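
To check my understanding of the runtime side, here is a rough userspace
model of the patching step as described above. It is only a sketch: the
struct and function names are made up for illustration and are not the
kernel's, the "text" is just a byte buffer, and keys are plain integers
rather than addresses. The two x86 encodings (a 5-byte NOP and JMP rel32,
whose displacement is relative to the end of the instruction) are the only
parts taken from the real ISA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one special-section entry: the site to patch,
 * the jump target, and the key that controls the site. */
struct jump_entry_model {
    size_t code;    /* offset of the 5-byte instruction in the text buffer */
    size_t target;  /* offset of the jump target in the text buffer */
    int    key;     /* stand-in for the key address */
};

/* x86 5-byte NOP: nopl 0x0(%rax,%rax,1) */
static const uint8_t NOP5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Rewrite one site as either NOP5 or "JMP rel32". */
static void patch_entry(uint8_t *text, const struct jump_entry_model *e,
                        int enabled)
{
    uint8_t insn[5];

    if (enabled) {
        /* rel32 counts from the end of the 5-byte instruction; the cast
         * keeps the low 32 bits, which also covers backward jumps. */
        uint32_t rel = (uint32_t)(e->target - (e->code + 5));

        insn[0] = 0xe9;                     /* JMP rel32 opcode */
        insn[1] = (uint8_t)(rel & 0xff);    /* little-endian displacement */
        insn[2] = (uint8_t)((rel >> 8) & 0xff);
        insn[3] = (uint8_t)((rel >> 16) & 0xff);
        insn[4] = (uint8_t)((rel >> 24) & 0xff);
    } else {
        memcpy(insn, NOP5, sizeof insn);
    }
    memcpy(text + e->code, insn, sizeof insn);
}

/* Walk the table and flip every site that belongs to 'key'.
 * Returns the number of sites rewritten. */
static size_t patch_key(uint8_t *text, const struct jump_entry_model *tab,
                        size_t n, int key, int enabled)
{
    size_t patched = 0;

    for (size_t i = 0; i < n; i++) {
        if (tab[i].key == key) {
            patch_entry(text, &tab[i], enabled);
            patched++;
        }
    }
    return patched;
}
```

The point of the model is that each use of a key is a separate site with
its own table entry, so two adjacent tests of the same key are two
independent patch sites as far as the compiler can tell.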

How to merge these asm-goto constructs?

Doing so feels very special-case to the kernel and not something that
other GCC users would find useful.
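
To make the question concrete, here is my reading of the shape of the
problem, as a sketch only. branch_site(), do_a() and do_b() are
hypothetical names, and the asm template is deliberately empty where a
real implementation would emit the patchable NOP and the section entry, so
at run time the jump edge is never taken here. The compiler still has to
assume it might be, which is the "only sees the 2 edges" situation.

```c
#include <stdbool.h>

static int calls_a, calls_b;            /* counters to make effects visible */
static void do_a(void) { calls_a++; }
static void do_b(void) { calls_b++; }

/* Hypothetical stand-in for one static-branch site. The empty template
 * emits no instruction, so control always falls through to "return false",
 * but the compiler must keep the l_yes edge because the asm may jump. */
static inline bool branch_site(void)
{
    __asm__ goto ("" : : : : l_yes);
    return false;
l_yes:
    return true;
}

/* What the source looks like today: two opaque asm-gotos, which the
 * compiler cannot prove equivalent, so both tests are emitted. */
void unmerged(void)
{
    if (branch_site())
        do_a();
    if (branch_site())
        do_b();
}

/* What a merging optimization would effectively produce: one site
 * guarding both bodies. This is only equivalent under the assumption
 * that every invocation for a given key yields the same result. */
void merged(void)
{
    if (branch_site()) {
        do_a();
        do_b();
    }
}
```

Nothing in the asm itself tells the compiler that the two sites in
unmerged() agree, which is why getting from the first form to the second
needs knowledge from outside the asm.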

I can imagine a GCC plugin that implemented a custom optimization pass
for that - basically something that spots the asm-gotos in the gimple
IR and optimizes away duplicates by replacing them with jumps, but
having read about Linus's feelings about GCC plugins recently:
  https://lwn.net/Articles/851090/
I suspect that that isn't going to fly (and if you're going down the
route of adding an optimization pass via a plugin, there's probably a
way to do that that doesn't involve asm).

In theory, something to optimize the asm-gotos could be relatively
simple, but that said, we don't really have a GCC plugin API; all of our
internal APIs are exposed, and are liable to change from release to
release, which I know is a pain (I've managed to break one of my own
plugins with one of my own API changes at least once).

Hope this is constructive
Dave