From: Eric Biggers <ebiggers@kernel.org>
To: Bill Wendling <morbo@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
	Justin Stitt <justinstitt@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-crypto@vger.kernel.org,
	clang-built-linux <llvm@lists.linux.dev>
Subject: Re: [PATCH] x86/crc32: use builtins to improve code generation
Date: Wed, 26 Feb 2025 22:28:59 -0800
Message-ID: <20250227062859.GA2506@sol.localdomain>
In-Reply-To: <CAGG=3QVi27WRYVxmsk9+HLpJw9ZJrpfLjU8G4exuXm-vUA-KqQ@mail.gmail.com>

On Wed, Feb 26, 2025 at 10:12:47PM -0800, Bill Wendling wrote:
> For both gcc and clang, crc32 builtins generate better code than the
> inline asm. GCC improves, removing unneeded "mov" instructions. Clang
> does the same and unrolls the loops. GCC has no changes on i386, but
> Clang's code generation is vastly improved, due to Clang's "rm"
> constraint issue.
> 
> The number of cycles improved by ~0.1% for GCC and ~1% for Clang, which
> is expected because of the "rm" issue. However, Clang's performance is
> better than GCC's by ~1.5%, most likely due to loop unrolling.
> 
> Link: https://github.com/llvm/llvm-project/issues/20571#issuecomment-2649330009
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: x86@kernel.org
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: Nathan Chancellor <nathan@kernel.org>
> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
> Cc: Justin Stitt <justinstitt@google.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-crypto@vger.kernel.org
> Cc: llvm@lists.linux.dev
> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
>  arch/x86/Makefile         | 3 +++
>  arch/x86/lib/crc32-glue.c | 8 ++++----
>  2 files changed, 7 insertions(+), 4 deletions(-)

Thanks!  A couple of concerns, though:

> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 5b773b34768d..241436da1473 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -114,6 +114,9 @@ else
>  KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
>  endif
> 
> +# Enables the use of CRC32 builtins.
> +KBUILD_CFLAGS += -mcrc32

Doesn't this technically allow the compiler to insert CRC32 instructions
anywhere in arch/x86/ without the needed runtime CPU feature check?  Normally
when using intrinsics it's necessary to limit the scope of the feature
enablement to match the runtime CPU feature check that is done, e.g. by using
the target function attribute.
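
The scoping described above can be sketched in user-space C as follows. This
is only an illustration under stated assumptions, not the patch's code: the
function names (crc32c_generic, crc32c_hw, crc32c) are hypothetical, the
"sse4.2" target string is assumed to be accepted by the compiler in use, and
__builtin_cpu_supports() stands in for the kernel's own CPU feature check.
The point is that only crc32c_hw() is compiled with the feature enabled, and
it is only reached after the runtime check.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC32C (reflected Castagnoli polynomial 0x82F63B78). */
static uint32_t crc32c_generic(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1));
	}
	return crc;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Feature enablement is limited to this one function, so the compiler may
 * emit CRC32 instructions here and nowhere else in the translation unit.
 */
static __attribute__((target("sse4.2")))
uint32_t crc32c_hw(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--)
		crc = __builtin_ia32_crc32qi(crc, *p++);
	return crc;
}
#endif

/* Dispatcher: the runtime check guards the only caller of crc32c_hw(). */
uint32_t crc32c(uint32_t crc, const uint8_t *p, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
	if (__builtin_cpu_supports("sse4.2"))
		return crc32c_hw(crc, p, len);
#endif
	return crc32c_generic(crc, p, len);
}
```

By contrast, a file-wide or directory-wide -mcrc32 gives the compiler the same
permission in every function, including ones that run before or without any
feature check.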

> diff --git a/arch/x86/lib/crc32-glue.c b/arch/x86/lib/crc32-glue.c
> index 2dd18a886ded..fdb94bff25f4 100644
> --- a/arch/x86/lib/crc32-glue.c
> +++ b/arch/x86/lib/crc32-glue.c
> @@ -48,9 +48,9 @@ u32 crc32_le_arch(u32 crc, const u8 *p, size_t len)
>  EXPORT_SYMBOL(crc32_le_arch);
> 
>  #ifdef CONFIG_X86_64
> -#define CRC32_INST "crc32q %1, %q0"
> +#define CRC32_INST __builtin_ia32_crc32di
>  #else
> -#define CRC32_INST "crc32l %1, %0"
> +#define CRC32_INST __builtin_ia32_crc32si
>  #endif

Do both gcc and clang consider these builtins to be a stable API, or do they
only guarantee the stability of _mm_crc32_*() from immintrin.h?  At least for
the rest of the SSE and AVX stuff, I thought that only the immintrin.h functions
are actually considered stable.
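
For comparison, here is a minimal user-space sketch of the same operation
written against the documented _mm_crc32_*() intrinsics instead of the
__builtin_ia32_* names. It is x86-only, the helper names (crc32c_intrin,
crc_word_t, CRC32C_WORD) are hypothetical, and it assumes a compiler that
permits including <immintrin.h> without -msse4.2 when the caller carries a
matching target attribute. Whether the kernel can include that header at all
is a separate question that this sketch does not address.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
typedef uint64_t crc_word_t;	/* crc32q consumes 8 bytes per step */
#define CRC32C_WORD(crc, w) ((uint32_t)_mm_crc32_u64((crc), (w)))
#else
typedef uint32_t crc_word_t;	/* crc32l consumes 4 bytes per step */
#define CRC32C_WORD(crc, w) _mm_crc32_u32((crc), (w))
#endif

__attribute__((target("sse4.2")))
uint32_t crc32c_intrin(uint32_t crc, const uint8_t *p, size_t len)
{
	/* Process whole words first; memcpy() avoids unaligned dereferences. */
	while (len >= sizeof(crc_word_t)) {
		crc_word_t w;

		memcpy(&w, p, sizeof(w));
		crc = CRC32C_WORD(crc, w);
		p += sizeof(w);
		len -= sizeof(w);
	}
	/* Finish the tail one byte at a time. */
	while (len--)
		crc = _mm_crc32_u8(crc, *p++);
	return crc;
}
#endif
```

A caller would still need its own runtime CPU feature check before invoking
crc32c_intrin(), exactly as with the builtin-based version.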

- Eric

Thread overview: 22+ messages
2025-02-27  6:12 [PATCH] x86/crc32: use builtins to improve code generation Bill Wendling
2025-02-27  6:28 ` Eric Biggers [this message]
2025-02-27  7:08   ` Bill Wendling
2025-02-28  2:08     ` Eric Biggers
2025-02-27 10:52   ` H. Peter Anvin
2025-02-27 12:17     ` Bill Wendling
2025-02-27 20:56       ` Bill Wendling
2025-02-27 16:26 ` Dave Hansen
2025-02-27 20:57   ` Bill Wendling
2025-02-27 21:03     ` Dave Hansen
2025-02-27 23:47 ` [PATCH v2] " Bill Wendling
2025-02-28 21:20   ` Eric Biggers
2025-02-28 21:29     ` Bill Wendling
2025-03-03 20:15   ` David Laight
2025-03-03 20:27     ` Bill Wendling
2025-03-03 22:42       ` David Laight
2025-03-03 23:57         ` H. Peter Anvin
2025-03-04  0:16           ` Bill Wendling
2025-03-04  0:43             ` H. Peter Anvin
2025-03-04  4:32             ` David Laight
2025-03-04 20:52               ` David Laight
2025-03-04 21:52                 ` Eric Biggers
