From: David Laight <david.laight.linux@gmail.com>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Demian Shulhan <demyansh@gmail.com>,
linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, ardb@kernel.org
Subject: Re: [PATCH v3] lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation
Date: Sun, 29 Mar 2026 22:57:04 +0100 [thread overview]
Message-ID: <20260329225704.0eb82966@pumpkin> (raw)
In-Reply-To: <20260329203829.GA2746@quark>
On Sun, 29 Mar 2026 13:38:29 -0700
Eric Biggers <ebiggers@kernel.org> wrote:
> On Sun, Mar 29, 2026 at 07:43:38AM +0000, Demian Shulhan wrote:
> > Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON
> > Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR
> > software implementation is slow, which creates a bottleneck in NVMe and
> > other storage subsystems.
> >
> > The acceleration is implemented using C intrinsics (<arm_neon.h>) rather
> > than raw assembly for better readability and maintainability.
> >
> > Key highlights of this implementation:
> > - Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency
> > spikes on large buffers.
> > - Pre-calculates and loads fold constants via vld1q_u64() to minimize
> > register spilling.
> > - Benchmarks show the break-even point against the generic implementation
> > is around 128 bytes. The PMULL path is enabled only for len >= 128.
Final thought:
Is that allowing for the cost of kernel_fpu_begin()? - which I think only
affects the first call.
And the cost of the data-cache misses for the lookup table reads? - again
worse for the first call.
David
> >
> > Performance results (kunit crc_benchmark on Cortex-A72):
> > - Generic (len=4096): ~268 MB/s
> > - PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)
> >
> > Signed-off-by: Demian Shulhan <demyansh@gmail.com>
>
> Applied to https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/log/?h=crc-next
>
> Thanks!
>
> - Eric
>
next prev parent reply other threads:[~2026-03-29 21:57 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260317065425.2684093-1-demyansh@gmail.com>
[not found] ` <20260327060211.902077-1-demyansh@gmail.com>
2026-03-27 19:38 ` [PATCH v2] lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation Eric Biggers
2026-03-29 7:43 ` [PATCH v3] " Demian Shulhan
2026-03-29 20:38 ` Eric Biggers
2026-03-29 21:57 ` David Laight [this message]
2026-03-29 22:18 ` Eric Biggers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260329225704.0eb82966@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=ardb@kernel.org \
--cc=demyansh@gmail.com \
--cc=ebiggers@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox