From: Stephen Hemminger <stephen@networkplumber.org>
To: Daniel Gregory <code@danielg0.com>
Cc: "Stanisław Kardach" <stanislaw.kardach@gmail.com>,
dev@dpdk.org, "Punit Agrawal" <punit.agrawal@bytedance.com>,
"Liang Ma" <liangma@liangbit.com>,
"Pengcheng Wang" <wangpengcheng.pp@bytedance.com>,
"Chunsong Feng" <fengchunsong@bytedance.com>,
"Daniel Gregory" <daniel.gregory@bytedance.com>,
"Sun Yuechi" <sunyuechi@iscas.ac.cn>
Subject: Re: [PATCH v4 00/10] riscv: implement accelerated crc using zbc
Date: Sun, 22 Feb 2026 10:03:13 -0800
Message-ID: <20260222100313.5fd90e0c@phoenix.local>
In-Reply-To: <cover.1771772598.git.code@danielg0.com>
On Sun, 22 Feb 2026 16:29:54 +0100
Daniel Gregory <code@danielg0.com> wrote:
> The RISC-V Zbc extension adds instructions for carry-less multiplication
> we can use to implement CRC in hardware. This patch set contains two new
> implementations:
>
> - one in lib/hash/rte_crc_riscv64.h that uses a Barrett reduction to
> implement the four rte_hash_crc_* functions
> - one in lib/net/net_crc_zbc.c that uses repeated single-folds to reduce
> the buffer until it is small enough for a Barrett reduction to
> implement rte_crc16_ccitt_zbc_handler and rte_crc32_eth_zbc_handler
>
> My approach is largely based on Intel's "Fast CRC Computation Using
> PCLMULQDQ Instruction" white paper
> https://www.researchgate.net/publication/263424619_Fast_CRC_computation
> and a post about "Optimizing CRC32 for small payload sizes on x86"
> https://mary.rs/lab/crc32/
>
> Whether these new implementations are enabled is controlled by new
> build-time and run-time detection of the RISC-V extensions present in
> the compiler and on the target system.
>
> I have carried out some performance comparisons between the generic
> table implementations and the new hardware implementations. Listed below
> is the number of cycles it takes to compute the CRC hash for buffers of
> various sizes (as reported by rte_get_timer_cycles()). These results
> were collected on a Kendryte K230 and averaged over 20 samples:
>
> | Buffer    | CRC32-ETH (lib/net) | CRC32C (lib/hash)   |
> | Size (MB) | Table    | Hardware | Table    | Hardware |
> |-----------|----------|----------|----------|----------|
> | 1         | 155168   | 11610    | 73026    | 18385    |
> | 2         | 311203   | 22998    | 145586   | 35886    |
> | 3         | 466744   | 34370    | 218536   | 53939    |
> | 4         | 621843   | 45536    | 291574   | 71944    |
> | 5         | 777908   | 56989    | 364152   | 89706    |
> | 6         | 932736   | 68023    | 437016   | 107726   |
> | 7         | 1088756  | 79236    | 510197   | 125426   |
> | 8         | 1243794  | 90467    | 583231   | 143614   |
>
> These results suggest a speed-up of lib/net by thirteen times, and of
> lib/hash by four times.
>
> I have also run the hash_functions_autotest benchmark in dpdk_test,
> which measures the performance of the lib/hash implementation on small
> buffers, getting the following times:
>
> | Key Length | Time (ticks/op)     |
> | (bytes)    | Table    | Hardware |
> |------------|----------|----------|
> | 1          | 0.47     | 0.85     |
> | 2          | 0.57     | 0.87     |
> | 4          | 0.99     | 0.88     |
> | 8          | 1.35     | 0.88     |
> | 9          | 1.20     | 1.09     |
> | 13         | 1.76     | 1.35     |
> | 16         | 1.87     | 1.02     |
> | 32         | 2.96     | 0.98     |
> | 37         | 3.35     | 1.45     |
> | 40         | 3.49     | 1.12     |
> | 48         | 4.02     | 1.25     |
> | 64         | 5.08     | 1.54     |
>
> v4:
> - rebase on 26.03-rc1
> - RISC64 -> RISCV64 in test_hash.c (Stephen Hemminger)
> - Added section to release notes (Stephen Hemminger)
> - SPDX-License_Identifier -> SPDX-License-Identifier in
> rte_crc_riscv64.h (Stephen Hemminger)
> - Fix header guard in rte_crc_riscv64.h (Stephen Hemminger)
> - assert -> RTE_ASSERT in rte_crc_riscv64.h (Stephen Hemminger)
> - Fix copyright statement in net_crc_zbc.c (Stephen Hemminger)
> - Make crc context structs static in net_crc_zbc.c (Stephen Hemminger)
> - prefer the optimised crc when zbc present over jhash in rte_fbk_hash.c
> v3:
> - rebase on 24.07
> - replace crc with CRC in commits (check-git-log.sh)
> v2:
> - replace compile flag with build-time (riscv extension macros) and
> run-time detection (linux hwprobe syscall) (Stephen Hemminger)
> - add qemu target that supports zbc (Stanislaw Kardach)
> - fix spelling error in commit message
> - fix a bug in the net/ implementation that would cause segfaults on
> small unaligned buffers
> - refactor net/ implementation to move variable declarations to top of
> functions
> - enable the optimisation in a couple other places optimised crc is
> preferred to jhash
> - l3fwd-power
> - cuckoo-hash
>
> Daniel Gregory (10):
> config/riscv: detect presence of Zbc extension
> hash: implement CRC using riscv carryless multiply
> net: implement CRC using riscv carryless multiply
> config/riscv: add qemu crossbuild target
> examples/l3fwd: use accelerated CRC on riscv
> ipfrag: use accelerated CRC on riscv
> examples/l3fwd-power: use accelerated CRC on riscv
> hash: use accelerated CRC on riscv
> member: use accelerated CRC on riscv
> doc: implement CRC using riscv carryless multiply
>
> .mailmap | 2 +-
> MAINTAINERS | 2 +
> app/test/test_crc.c | 10 +
> app/test/test_hash.c | 7 +
> config/riscv/meson.build | 33 +++
> config/riscv/riscv64_qemu_linux_gcc | 17 ++
> .../linux_gsg/cross_build_dpdk_for_riscv.rst | 5 +
> doc/guides/rel_notes/release_26_03.rst | 8 +
> examples/l3fwd-power/main.c | 2 +-
> examples/l3fwd/l3fwd_em.c | 2 +-
> lib/eal/riscv/include/rte_cpuflags.h | 2 +
> lib/eal/riscv/rte_cpuflags.c | 112 +++++++---
> lib/hash/meson.build | 1 +
> lib/hash/rte_crc_riscv64.h | 90 ++++++++
> lib/hash/rte_cuckoo_hash.c | 3 +
> lib/hash/rte_fbk_hash.c | 3 +
> lib/hash/rte_hash_crc.c | 13 +-
> lib/hash/rte_hash_crc.h | 6 +-
> lib/ip_frag/ip_frag_internal.c | 6 +-
> lib/member/member.h | 2 +-
> lib/net/meson.build | 4 +
> lib/net/net_crc.h | 11 +
> lib/net/net_crc_zbc.c | 194 ++++++++++++++++++
> lib/net/rte_net_crc.c | 30 ++-
> lib/net/rte_net_crc.h | 3 +
> 25 files changed, 526 insertions(+), 42 deletions(-)
> create mode 100644 config/riscv/riscv64_qemu_linux_gcc
> create mode 100644 lib/hash/rte_crc_riscv64.h
> create mode 100644 lib/net/net_crc_zbc.c
>

AI patch review summary: the overall approach looks good. The hwprobe
integration is clean and the Barrett reduction math appears correct.
A few issues need addressing before this can be merged:

1. [ERROR, patch 01] 1 << n used for all 26 HWCAP mask entries
The feature table entries now store masks in a uint64_t field, but
all 26 existing RISCV_ISA_* entries still use plain '1 << n'. That
expression is evaluated as a signed int and only then widened to the
64-bit field. It happens to work for the current bits 0-25, but it is
undefined behaviour for n >= 31. All entries must use UINT64_C(1) << n:
    FEAT_DEF(RISCV_ISA_A, REG_HWCAP, UINT64_C(1) << 0)
    ...
    FEAT_DEF(RISCV_ISA_Z, REG_HWCAP, UINT64_C(1) << 25)
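
As a minimal sketch of the difference (not taken from the patch;
mask_int() and mask_u64() are hypothetical helper names, and a 32-bit
int is assumed):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: builds the mask the way the table does today.
 * The shift is evaluated in signed int, so it is only well defined for
 * bit < 31; the int result is then converted to uint64_t. */
static inline uint64_t mask_int(unsigned int bit)
{
	assert(bit < 31);
	return 1 << bit;
}

/* Hypothetical helper: the shift is evaluated in a 64-bit unsigned
 * type, so every bit position 0..63 is well defined. */
static inline uint64_t mask_u64(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << bit;
}
```

For bit 25 both helpers return 0x2000000, which is why the existing
entries work today. Only mask_u64() stays well defined for a bit such
as 40.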

2. [WARNING, patches 03-07, 09] Missing Signed-off-by from submitter address
These patches carry only:
    Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>
but are submitted from code@danielg0.com. The DCO requires a
Signed-off-by from the address used to submit the patch. Please add:
    Signed-off-by: Daniel Gregory <code@danielg0.com>
to each of these patches (as done correctly in 01, 08, and 10).

3. [WARNING, patch 02] Inverted condition in rte_hash_crc_set_alg
The new warning log fires when the caller does *not* request
CRC32_RISCV64, but the message says the opposite:
    if (!(alg & CRC32_RISCV64))
        HASH_CRC_LOG(WARNING,
            "Unsupported CRC32 algorithm requested using CRC32_RISCV64");
Either flip the condition or reword the message to match the intent
(e.g. "Falling back to CRC32_RISCV64 despite a different algorithm
being requested").

4. [WARNING, patch 03] Unaligned access in crc32_repeated_barrett_zbc
The function casts 'data' directly to uint64_t*/uint32_t*/uint16_t*
and dereferences it. It is called for tail data (after the main fold
loop and for buffers < 16 bytes) where alignment is not guaranteed.
Hardware support for unaligned access is optional on RISC-V. Use memcpy
into a local variable, or an equivalent, to be safe on all implementations.
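
A minimal sketch of the memcpy approach follows. The load_u64(),
load_u32() and load_u16() helpers are hypothetical names, not functions
from the patch or from DPDK:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read a value from a possibly unaligned address.
 * Copying into a local avoids dereferencing a misaligned pointer;
 * compilers typically turn the fixed-size memcpy into plain loads when
 * the target allows it. The value is read in host byte order. */
static inline uint64_t load_u64(const void *p)
{
	uint64_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint32_t load_u32(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint16_t load_u16(const void *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

The tail handling could then call load_u32(data) where it currently
dereferences a cast pointer; the exact call sites in the patch may
differ.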

5. [WARNING, patch 10] .mailmap replaces rather than aliases old address
The change removes the bytedance entry entirely. The correct form
maps the old address to the new canonical one:
    Daniel Gregory <code@danielg0.com> <daniel.gregory@bytedance.com>
Without this, existing commits with the bytedance address will no
longer be attributed to the canonical identity in git shortlog.

Minor: the Barrett reductions in rte_crc_riscv64.h and net_crc_zbc.c
both implicitly truncate a uint64_t result to uint32_t on return. A
brief comment explaining that this is intentional (only the lower 32
bits are the CRC remainder) would help future readers.