All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Arnd Bergmann" <arnd@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Juergen Gross" <jgross@suse.com>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Alexander Usyskin" <alexander.usyskin@intel.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Mateusz Jończyk" <mat.jonczyk@o2.pl>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Ard Biesheuvel" <ardb@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org
Subject: Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C
Date: Mon, 28 Apr 2025 09:14:40 +0200	[thread overview]
Message-ID: <aA8q4Ot-1zTzv_Kt@gmail.com> (raw)
In-Reply-To: <aA8oqKUaFU-0wb-D@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> And once we remove 486, I think we can do the optimization below to 
> just assume the output doesn't get clobbered by BS*L in the 
> zero-case, right?
> 
> In the text size space it's a substantial optimization on x86-32 
> defconfig:
> 
>         text	   data	       bss	     dec	    hex	filename
>   16,577,728    7598826    1744896      25921450        18b87aa vmlinux.vanilla      # CMOV+BS*L
>   16,577,908	7598838	   1744896	25921642	18b886a	vmlinux.linus_patch  # if()+BS*L
>   16,573,568	7602922	   1744896	25921386	18b876a	vmlinux.noclobber    # BS*L

And BTW, *that* is a price that all of non-486 x86-32 was paying for 
486 support...

And, just out of intellectual curiosity, I also tried to measure the 
code generation price of the +1 standards-quirk in the fls()/ffs() 
interface as well:

         text	   data	       bss	     dec	    hex	filename
   16,577,728   7598826    1744896      25921450        18b87aa vmlinux.vanilla      # CMOV+BS*L
   16,577,908	7598838	   1744896	25921642	18b886a	vmlinux.linus_patch  # if()+BS*L
   16,573,568	7602922	   1744896	25921386	18b876a	vmlinux.noclobber    # BS*L
   ..........
   16,573,552	7602922	   1744896	25921370	18b875a	vmlinux.broken       # BROKEN: 0 baseline instead of 1

... and unless I messed up the patch, it seems to have a surprisingly 
low impact - maybe because the compiler can amortize its cost by 
adjusting all dependent code mostly at build time, so the +1 doesn't 
end up being generated most of the time?

Thanks,

	Ingo

===============================>

This broken patch is broken: it intentionally breaks the ffs()/fls() 
interface in an attempt to measure the code generation effects of 
interface details.

NOT-Signed-off-by: <anyone@anywhere.anytime>
---
 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index e3e94a806656..21707696bafe 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -318,7 +318,7 @@ static __always_inline int variable_ffs(int x)
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
 
-	return r + 1;
+	return r;
 }
 
 /**
@@ -362,7 +362,7 @@ static __always_inline int fls(unsigned int x)
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
 
-	return r + 1;
+	return r;
 }
 
 /**


  reply	other threads:[~2025-04-28  7:15 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-25 14:15 [PATCH] [RFC] x86/cpu: rework instruction set selection Arnd Bergmann
2025-04-25 15:34 ` H. Peter Anvin
2025-04-25 16:13   ` Arnd Bergmann
2025-04-25 20:15     ` H. Peter Anvin
2025-04-26  9:08 ` Ingo Molnar
2025-04-26 13:17   ` H. Peter Anvin
2025-04-26 18:55     ` Ingo Molnar
2025-04-27  0:35       ` H. Peter Anvin
2025-04-26 18:58   ` Arnd Bergmann
2025-04-26 19:09     ` Ingo Molnar
2025-04-27 13:24       ` Arnd Bergmann
2025-04-27 21:17         ` H. Peter Anvin
2025-04-26 19:24     ` Linus Torvalds
2025-04-26 19:55       ` Linus Torvalds
2025-04-26 23:47         ` H. Peter Anvin
2025-04-27 10:18           ` Ingo Molnar
2025-04-27  0:02         ` H. Peter Anvin
2025-04-27 19:17         ` Andrew Cooper
2025-04-27 19:34           ` Linus Torvalds
2025-04-27 21:14             ` H. Peter Anvin
2025-04-28  6:58             ` [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C Ingo Molnar
2025-04-28  7:05               ` Ingo Molnar
2025-04-28  7:14                 ` Ingo Molnar [this message]
2025-04-28 12:30                   ` Arnd Bergmann
2025-04-28 13:41                   ` H. Peter Anvin
2025-04-28 16:23                   ` Linus Torvalds
2025-04-29 10:08                     ` Ingo Molnar
2025-04-29 14:32                       ` H. Peter Anvin
2025-04-28 16:14                 ` Linus Torvalds
2025-04-28 21:38                   ` H. Peter Anvin
2025-04-29  0:12                     ` Andrew Cooper
2025-04-29  2:00                       ` H. Peter Anvin
2025-04-29  2:22                         ` Linus Torvalds
2025-04-29  2:25                         ` Andrew Cooper
2025-04-29  3:13                           ` H. Peter Anvin
2025-04-29 14:38                             ` Andrew Cooper
2025-04-29 18:05                               ` Linus Torvalds
2025-04-29 19:13                                 ` Andrew Cooper
2025-04-29 20:12                                   ` Linus Torvalds
2025-04-29 21:23                                     ` H. Peter Anvin
2025-04-29 21:53                                       ` Linus Torvalds
2025-04-29 21:59                                         ` Andrew Cooper
2025-04-29 22:04                                           ` Linus Torvalds
2025-04-29 22:10                                             ` H. Peter Anvin
2025-04-29 22:22                                             ` Andrew Cooper
2025-04-29 22:34                                               ` Linus Torvalds
2025-04-27  9:50       ` [PATCH] [RFC] x86/cpu: rework instruction set selection Ingo Molnar
2025-04-30 21:54       ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aA8q4Ot-1zTzv_Kt@gmail.com \
    --to=mingo@kernel.org \
    --cc=alexander.usyskin@intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=arnd@kernel.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mat.jonczyk@o2.pl \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rppt@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.