From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH -tip 2/2] x86/hweight: Use POPCNT when available with X86_NATIVE_CPU option
Date: Sun, 30 Mar 2025 20:54:15 +0200 [thread overview]
Message-ID: <Z-mTVyOb8h1wYxvt@gmail.com> (raw)
In-Reply-To: <CAFULd4a2eOhZ4TPQRsLtN1yPYSfgqsXR_yOG+z3PedE-ZCMynw@mail.gmail.com>
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sun, Mar 30, 2025 at 11:56 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > > > So a better optimization I think would be to declare and implement
> > > > __sw_hweight32 with a different, less intrusive function call ABI
> > > > that
> > >
> > > With an external function, the ABI specifies the location of input
> > > argument and function result.
> >
> > This is all within the kernel, and __sw_hweight32() is implemented in
> > the kernel as well, entirely in assembly, and the ALTERNATIVE*() macros
> > are fully under our control as well - so we have full control over the
> > calling convention.
>
> There is a minor issue with a generic prototype in <linux/bitops.h>,
> where we declare:
>
> extern unsigned int __sw_hweight32(unsigned int w);
> extern unsigned long __sw_hweight64(__u64 w);
>
> This creates a bit of mixup, so perhaps it is better to define and use
> an x86 specific function name.
Yes, I alluded to this complication:
> > For example, we could make a version of __sw_hweight32 that is a
> > largely no-clobber function that only touches a single register, which
That version of __sw_hweight32 would be a different symbol.
> > I'm not saying it's *worth* it for POPCNTL emulation alone:
> >
> > - The code generation benefits might or might not be there. Needs to
> > be examined.
>
> Matching inputs with output will actually make the instruction
> "destructive", so the compiler will have to copy the input argument
> when it won't die in the instruction. This is not desirable.
Yeah, absolutely - it was mainly a demonstration that even
single-clobber functions are possible. (There's even zero-clobber
functions, like __fentry__)
> I think that adding a __POPCNT__ version (similar to my original
> patch) would bring the most benefit, because we could use "rm" input
> and "=r" output registers, without any constraints, enforced by
> fallback function call. This is only possible with a new
> -march=native functionality.
Yeah, -march=native might be nice for local tinkering, but it won't
reach 99.999% of Linux users - so it's immaterial to this particular
discussion.
Also, is POPCNTL the best example for this? Are there no other, more
frequently used ALTERNATIVE() patching sites with function call
alternatives that disturb the register state of important kernel
functions? (And I don't know the answer.)
Thanks,
Ingo
next prev parent reply other threads:[~2025-03-30 18:54 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-25 16:48 [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn Uros Bizjak
2025-03-25 16:48 ` [PATCH -tip 2/2] x86/hweight: Use POPCNT when available with X86_NATIVE_CPU option Uros Bizjak
2025-03-25 17:11 ` Borislav Petkov
2025-03-30 15:15 ` Uros Bizjak
2025-03-30 17:31 ` Borislav Petkov
2025-03-30 18:47 ` Ingo Molnar
2025-03-30 19:06 ` Borislav Petkov
2025-03-30 19:20 ` Ingo Molnar
2025-03-30 19:28 ` Borislav Petkov
2025-03-25 21:56 ` Ingo Molnar
2025-03-29 9:19 ` Uros Bizjak
2025-03-29 11:00 ` David Laight
2025-03-30 7:49 ` Uros Bizjak
2025-03-30 18:02 ` David Laight
2025-03-29 23:10 ` H. Peter Anvin
2025-03-30 6:54 ` Uros Bizjak
2025-03-30 9:56 ` Ingo Molnar
2025-03-30 16:07 ` Uros Bizjak
2025-03-30 18:15 ` David Laight
2025-03-30 22:44 ` H. Peter Anvin
2025-03-30 18:54 ` Ingo Molnar [this message]
2025-03-25 17:09 ` [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn Borislav Petkov
2025-03-25 17:17 ` Uros Bizjak
2025-03-25 17:44 ` Borislav Petkov
2025-03-25 21:50 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z-mTVyOb8h1wYxvt@gmail.com \
--to=mingo@kernel.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=ubizjak@gmail.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox