From: Eric Biggers <ebiggers@kernel.org>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Honza Fikar <j.fikar@gmail.com>,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Jason A . Donenfeld" <Jason@zx2c4.com>,
	x86@kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 09/12] lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code
Date: Fri, 29 Aug 2025 09:10:18 -0700
Message-ID: <20250829161018.GB91803@sol>
In-Reply-To: <CAMj1kXGf+0b=6kPAzzxgesaOYSJtzoL1oQyNqT2VrUkWFzwJzA@mail.gmail.com>

On Fri, Aug 29, 2025 at 06:05:42PM +0200, Ard Biesheuvel wrote:
> On Fri, 29 Aug 2025 at 17:30, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Fri, Aug 29, 2025 at 03:08:56PM +0200, Honza Fikar wrote:
> > > On Fri, Aug 29, 2025 at 2:54 PM Eric Biggers <ebiggers@kernel.org> wrote:
> > >
> > > > Currently, BLAKE2s support is always enabled ('obj-y'), since random.c
> > > > uses it.  Therefore, the arch-optimized BLAKE2s code, which exists for
> > > > ARM and x86_64, should be always enabled too.
> > >
> > > Maybe a stupid question: what about ARM64? The current NEON
> > > implementation in kernel arch/arm/crypto/blake2s-core.S seems to be just
> > > for ARM.
> > >
> 
> That code is scalar, not NEON, and is carefully tuned to make use of
> the ARM barrel shifter, which does not exist on arm64.
> 
> > > Whereas the upstream BLAKE2s NEON code supports both ARM and AArch64 (ARM64):
> > >
> > > https://github.com/BLAKE2/BLAKE2/blob/master/neon
> >
> > There's no ARM64 optimized BLAKE2s code in the Linux kernel yet.  If
> > it's useful, someone would need to contribute it.
> >
> 
> NEON is cumbersome in the kernel so this only makes sense if it is
> substantially more performant, and I'm skeptical that this is the
> case, as you pointed out yourself in
> 
> commit 5172d322d34c30fb926b29aeb5a064e1fd8a5e13
> Author: Eric Biggers <ebiggers@google.com>
> Date:   Wed Dec 23 00:09:59 2020 -0800
> 
>     crypto: arm/blake2s - add ARM scalar optimized BLAKE2s
> 
>     Add an ARM scalar optimized implementation of BLAKE2s.
> 
>     NEON isn't very useful for BLAKE2s because the BLAKE2s block size
>     is too small for NEON to help.  Each NEON instruction would depend
>     on the previous one, resulting in poor performance.
> 
> Even if NEON code might be slightly faster on some cores, the fact
> that it is sensitive to micro-architectural details makes it less
> attractive.

Yes, agreed: there isn't much opportunity for an ARM64 optimized BLAKE2s
implementation to be faster than the generic C code.

- Eric
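
For readers following the dependency argument quoted above, here is a
minimal sketch of the BLAKE2s G mixing function as specified in RFC 7693.
It is not taken from the kernel sources; the names ror32_sketch() and
blake2s_g_sketch() are hypothetical and exist only for this illustration.
Each statement consumes the result of the one before it, which is the
serial chain the quoted commit message refers to. Every step is also an
add or an XOR paired with a rotate, the pattern that 32-bit ARM can fold
into a single instruction by rotating the second operand.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits (0 < n < 32). Hypothetical helper. */
static inline uint32_t ror32_sketch(uint32_t w, unsigned int n)
{
	return (w >> n) | (w << (32 - n));
}

/*
 * BLAKE2s G function per RFC 7693 (rotation constants 16, 12, 8, 7).
 * v is the 16-word working vector; a, b, c, d index into it;
 * x and y are the two message words selected for this call.
 */
static inline void blake2s_g_sketch(uint32_t v[16], int a, int b, int c,
				    int d, uint32_t x, uint32_t y)
{
	v[a] = v[a] + v[b] + x;                  /* needs the old v[b]   */
	v[d] = ror32_sketch(v[d] ^ v[a], 16);    /* needs the new v[a]   */
	v[c] = v[c] + v[d];                      /* needs the new v[d]   */
	v[b] = ror32_sketch(v[b] ^ v[c], 12);    /* needs the new v[c]   */
	v[a] = v[a] + v[b] + y;                  /* needs the new v[b]   */
	v[d] = ror32_sketch(v[d] ^ v[a], 8);     /* and so on: each line */
	v[c] = v[c] + v[d];                      /* waits on the line    */
	v[b] = ror32_sketch(v[b] ^ v[c], 7);     /* above it             */
}
```

A round applies G to the four columns and then to the four diagonals of
the 4x4 state. The four calls within each group are independent of one
another, but the chain inside each call remains sequential.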

