public inbox for linux-crypto@vger.kernel.org
* [PATCH] x86/crc32: use builtins to improve code generation
From: Bill Wendling @ 2025-02-27  6:12 UTC (permalink / raw)
  To: Bill Wendling
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), H. Peter Anvin,
	Eric Biggers, Ard Biesheuvel, Nathan Chancellor, Nick Desaulniers,
	Justin Stitt, LKML, linux-crypto, clang-built-linux

For both GCC and Clang, the crc32 builtins generate better code than
the inline asm. GCC no longer emits unneeded "mov" instructions. Clang
drops them as well and also unrolls the loops. On i386, GCC's output is
unchanged, but Clang's code generation is vastly improved because the
builtins sidestep Clang's "rm" constraint issue.

The cycle count improved by ~0.1% for GCC and ~1% for Clang; the larger
Clang gain is expected because of the "rm" issue. With this change,
Clang's performance is better than GCC's by ~1.5%, most likely due to
loop unrolling.
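
For readers who want to try the builtins outside the kernel, here is a
minimal userspace sketch of the same loop structure. It is not the
kernel code: the names crc32c_sw(), crc32c_hw() and crc32c() are
hypothetical, the target("sse4.2") attribute stands in for the -mcrc32
flag added to the Makefile, and the bitwise software routine exists
only as a reference to compare against.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Portable bitwise CRC32C (reflected polynomial 0x82F63B78).
 * Reference only; no initial or final inversion is applied here.
 */
static uint32_t crc32c_sw(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1));
	}
	return crc;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Builtin-based variant: whole words first, then the byte tail.
 * Assumption: the compiler accepts the crc32 builtins inside a
 * function marked target("sse4.2").
 */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(uint32_t crc, const uint8_t *p, size_t len)
{
#ifdef __x86_64__
	for (; len >= 8; len -= 8, p += 8) {
		uint64_t v;

		memcpy(&v, p, sizeof(v));	/* avoids an unaligned cast */
		crc = (uint32_t)__builtin_ia32_crc32di(crc, v);
	}
#else
	for (; len >= 4; len -= 4, p += 4) {
		uint32_t v;

		memcpy(&v, p, sizeof(v));
		crc = __builtin_ia32_crc32si(crc, v);
	}
#endif
	for (; len; len--, p++)
		crc = __builtin_ia32_crc32qi(crc, *p);

	return crc;
}
#endif

/* Use the hardware path when the CPU supports it, else fall back. */
uint32_t crc32c(uint32_t crc, const uint8_t *p, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
	if (__builtin_cpu_supports("sse4.2"))
		return crc32c_hw(crc, p, len);
#endif
	return crc32c_sw(crc, p, len);
}
```

Like the kernel function, crc32c() does no inversion itself, so a
caller wanting the conventional CRC32C value passes ~0 as the seed and
inverts the result. Comparing the compiler's output for crc32c_hw()
against an inline asm version is one way to look at the "mov" and
unrolling differences described above.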

Link: https://github.com/llvm/llvm-project/issues/20571#issuecomment-2649330009
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: llvm@lists.linux.dev
Signed-off-by: Bill Wendling <morbo@google.com>
---
 arch/x86/Makefile         | 3 +++
 arch/x86/lib/crc32-glue.c | 8 ++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 5b773b34768d..241436da1473 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -114,6 +114,9 @@ else
 KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
 endif

+# Enables the use of CRC32 builtins.
+KBUILD_CFLAGS += -mcrc32
+
 ifeq ($(CONFIG_X86_32),y)
         BITS := 32
         UTS_MACHINE := i386
diff --git a/arch/x86/lib/crc32-glue.c b/arch/x86/lib/crc32-glue.c
index 2dd18a886ded..fdb94bff25f4 100644
--- a/arch/x86/lib/crc32-glue.c
+++ b/arch/x86/lib/crc32-glue.c
@@ -48,9 +48,9 @@ u32 crc32_le_arch(u32 crc, const u8 *p, size_t len)
 EXPORT_SYMBOL(crc32_le_arch);

 #ifdef CONFIG_X86_64
-#define CRC32_INST "crc32q %1, %q0"
+#define CRC32_INST __builtin_ia32_crc32di
 #else
-#define CRC32_INST "crc32l %1, %0"
+#define CRC32_INST __builtin_ia32_crc32si
 #endif

 /*
@@ -78,10 +78,10 @@ u32 crc32c_le_arch(u32 crc, const u8 *p, size_t len)

        for (num_longs = len / sizeof(unsigned long);
             num_longs != 0; num_longs--, p += sizeof(unsigned long))
-               asm(CRC32_INST : "+r" (crc) : "rm" (*(unsigned long *)p));
+               crc = CRC32_INST(crc, *(unsigned long *)p);

        for (len %= sizeof(unsigned long); len; len--, p++)
-               asm("crc32b %1, %0" : "+r" (crc) : "rm" (*p));
+               crc = __builtin_ia32_crc32qi(crc, *p);

        return crc;
 }
-- 
2.48.1.711.g2feabab25a-goog

Thread overview: 22+ messages
2025-02-27  6:12 [PATCH] x86/crc32: use builtins to improve code generation Bill Wendling
2025-02-27  6:28 ` Eric Biggers
2025-02-27  7:08   ` Bill Wendling
2025-02-28  2:08     ` Eric Biggers
2025-02-27 10:52   ` H. Peter Anvin
2025-02-27 12:17     ` Bill Wendling
2025-02-27 20:56       ` Bill Wendling
2025-02-27 16:26 ` Dave Hansen
2025-02-27 20:57   ` Bill Wendling
2025-02-27 21:03     ` Dave Hansen
2025-02-27 23:47 ` [PATCH v2] " Bill Wendling
2025-02-28 21:20   ` Eric Biggers
2025-02-28 21:29     ` Bill Wendling
2025-03-03 20:15   ` David Laight
2025-03-03 20:27     ` Bill Wendling
2025-03-03 22:42       ` David Laight
2025-03-03 23:57         ` H. Peter Anvin
2025-03-04  0:16           ` Bill Wendling
2025-03-04  0:43             ` H. Peter Anvin
2025-03-04  4:32             ` David Laight
2025-03-04 20:52               ` David Laight
2025-03-04 21:52                 ` Eric Biggers
