public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH -tip 2/2] x86/hweight: Use POPCNT when available with X86_NATIVE_CPU option
Date: Sun, 30 Mar 2025 11:56:31 +0200	[thread overview]
Message-ID: <Z-kVT4ROZJXx6kui@gmail.com> (raw)
In-Reply-To: <CAFULd4bCnnL-CBFwgAQtN9S+sUE_wikda6E+8k9632J9b62dCg@mail.gmail.com>


* Uros Bizjak <ubizjak@gmail.com> wrote:

> > So a better optimization I think would be to declare and implement 
> > __sw_hweight32 with a different, less intrusive function call ABI 
> > that
> 
> With an external function, the ABI specifies the location of input 
> argument and function result.

This is all within the kernel, and __sw_hweight32() is implemented in 
the kernel as well, entirely in assembly, and the ALTERNATIVE*() macros 
are fully under our control as well - so we have full control over the 
calling convention.

Ie. in principle there's no need for the __sw_hweight32 function 
utilized by ALTERNATIVE() to be a C-call-ABI external function with all 
its call-clobbering constraints that disturbs register state affected 
by the C-call-ABI. (RSI RSI RDX RCX R8 R9)

The calling convention used is the kernel's choice, which we can 
re-evaluate.

For example, we could make a version of __sw_hweight32 that is a 
largely no-clobber function that only touches a single register, which 
receives its input in RAX and returns the result to RAX (as usual), and 
saves/restores everything else. This pushes overhead into the uncommon 
case (__sw_hweight32 users) and reduces register pressure on the 
calling site.

I'm not saying it's *worth* it for POPCNTL emulation alone:

 - The code generation benefits might or might not be there. Needs to 
   be examined.

 - There may be some trouble with on-stack red zones used by the 
   compiler, if the compiler doesn't know that a call was done.

 - Plus rolling a different calling convention down the alternatives 
   patching macros will have some maintenance overhead side effects. 
   Possibly other usecases need to be found as well for this to be 
   worth it.

But I wanted to bust the false assumption you seem to be making about 
C-call-ABI constraints.

Thanks,

	Ingo

  parent reply	other threads:[~2025-03-30  9:56 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-25 16:48 [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn Uros Bizjak
2025-03-25 16:48 ` [PATCH -tip 2/2] x86/hweight: Use POPCNT when available with X86_NATIVE_CPU option Uros Bizjak
2025-03-25 17:11   ` Borislav Petkov
2025-03-30 15:15     ` Uros Bizjak
2025-03-30 17:31       ` Borislav Petkov
2025-03-30 18:47         ` Ingo Molnar
2025-03-30 19:06           ` Borislav Petkov
2025-03-30 19:20             ` Ingo Molnar
2025-03-30 19:28               ` Borislav Petkov
2025-03-25 21:56   ` Ingo Molnar
2025-03-29  9:19     ` Uros Bizjak
2025-03-29 11:00       ` David Laight
2025-03-30  7:49         ` Uros Bizjak
2025-03-30 18:02           ` David Laight
2025-03-29 23:10       ` H. Peter Anvin
2025-03-30  6:54         ` Uros Bizjak
2025-03-30  9:56       ` Ingo Molnar [this message]
2025-03-30 16:07         ` Uros Bizjak
2025-03-30 18:15           ` David Laight
2025-03-30 22:44             ` H. Peter Anvin
2025-03-30 18:54           ` Ingo Molnar
2025-03-25 17:09 ` [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn Borislav Petkov
2025-03-25 17:17   ` Uros Bizjak
2025-03-25 17:44     ` Borislav Petkov
2025-03-25 21:50 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-kVT4ROZJXx6kui@gmail.com \
    --to=mingo@kernel.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=ubizjak@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox