From: Thomas Monjalon <thomas@monjalon.net>
To: Scott Mitchell <scott.k.mitch1@gmail.com>
Cc: David Marchand <david.marchand@redhat.com>,
dev@dpdk.org, mb@smartsharesystems.com,
stephen@networkplumber.org, bruce.richardson@intel.com
Subject: Re: [PATCH v19 0/2] net: optimize __rte_raw_cksum
Date: Tue, 10 Feb 2026 12:53:21 +0100
Message-ID: <5146778.aeNJFYEL58@thomas>
In-Reply-To: <CAFn2buB6yCTZGFOfE-erpU92SRfgGqcvpFrivOLNva3g-pq5Lg@mail.gmail.com>
Here are my test results:
buildtype : debugoptimized
default_library : shared
-march=x86-64-v4 (Cascade Lake)
gcc 15.2.1
clang 21.1.6
GCC - BEFORE
Alignment  Block size  TSC cycles/block  TSC cycles/byte
Aligned            20              20.5             1.02
Unaligned          20              14.1             0.70
Aligned            21              15.8             0.75
Unaligned          21              15.8             0.75
Aligned          1500             148.2             0.10
Unaligned        1500             148.3             0.10
Aligned          1501             148.4             0.10
Unaligned        1501             148.2             0.10
GCC - AFTER
Alignment  Block size  TSC cycles/block  TSC cycles/byte
Aligned            20              20.8             1.04
Unaligned          20              15.6             0.78
Aligned            21              16.9             0.81
Unaligned          21              16.9             0.80
Aligned          1500             109.5             0.07
Unaligned        1500             111.6             0.07
Aligned          1501             111.1             0.07
Unaligned        1501             113.0             0.08
Aligned          9000             612.4             0.07
Unaligned        9000             612.6             0.07
Aligned          9001             581.5             0.06
Unaligned        9001             601.7             0.07
CLANG - BEFORE
Alignment  Block size  TSC cycles/block  TSC cycles/byte
Aligned            20              14.2             0.71
Unaligned          20               9.5             0.47
Aligned            21              11.7             0.56
Unaligned          21              11.8             0.56
Aligned          1500             610.7             0.41
Unaligned        1500             632.0             0.42
Aligned          1501             610.4             0.41
Unaligned        1501             627.6             0.42
CLANG - AFTER
Alignment  Block size  TSC cycles/block  TSC cycles/byte
Aligned            20              14.0             0.70
Unaligned          20               9.1             0.45
Aligned            21               9.7             0.46
Unaligned          21               9.6             0.46
Aligned          1500              77.9             0.05
Unaligned        1500              79.4             0.05
Aligned          1501              79.4             0.05
Unaligned        1501              80.4             0.05
Aligned          9000             447.8             0.05
Unaligned        9000             492.1             0.05
Aligned          9001             448.5             0.05
Unaligned        9001             492.6             0.05
Before your patch:
with small block sizes (20 and 21 bytes), clang is faster than GCC;
with large block sizes (1500 and 1501 bytes), GCC is faster than clang.
After your patch, clang is faster than GCC at every block size tested.
07/02/2026 02:29, Scott Mitchell:
> Thanks for testing! I included my build/host config, results on the
> main branch, and then with this patch applied below. What are your build
> flags/configuration (e.g., cpu_instruction_set, march, optimization
> level, etc.)? I wasn't able to get any Clang version (18, 19, 20) to
> vectorize on Godbolt https://godbolt.org/z/8149r7sq8, and I'm curious
> whether your config enables vectorization.
>
> #### build / host config
> User defined options
> b_lto : false
> buildtype : release
> c_args : -fno-omit-frame-pointer
> -DPACKET_QDISC_BYPASS=1 -DRTE_MEMCPY_AVX512=1
> cpu_instruction_set: cascadelake
> default_library : static
> max_lcores : 128
> optimization : 3
> $ clang --version
> clang version 18.1.8 (Red Hat, Inc. 18.1.8-3.el9)
> $ cat /etc/redhat-release
> Red Hat Enterprise Linux release 9.4 (Plow)
>
> #### main branch
> $ echo "cksum_perf_autotest" | /usr/local/bin/dpdk-test
> ### rte_raw_cksum() performance ###
> Alignment  Block size  TSC cycles/block  TSC cycles/byte
> Aligned            20              10.0             0.50
> Unaligned          20              10.1             0.50
> Aligned            21              11.1             0.53
> Unaligned          21              11.6             0.55
> Aligned           100              39.4             0.39
> Unaligned         100              67.3             0.67
> Aligned           101              43.3             0.43
> Unaligned         101              41.5             0.41
> Aligned          1500             728.2             0.49
> Unaligned        1500             805.8             0.54
> Aligned          1501             768.8             0.51
> Unaligned        1501             787.3             0.52
> Test OK
>
> #### with this patch
> $ echo "cksum_perf_autotest" | /usr/local/bin/dpdk-test
> ### rte_raw_cksum() performance ###
> Alignment  Block size  TSC cycles/block  TSC cycles/byte
> Aligned            20              12.6             0.63
> Unaligned          20              12.3             0.62
> Aligned            21              13.6             0.65
> Unaligned          21              13.6             0.65
> Aligned           100              22.7             0.23
> Unaligned         100              22.6             0.23
> Aligned           101              47.4             0.47
> Unaligned         101              23.9             0.24
> Aligned          1500              73.9             0.05
> Unaligned        1500              73.9             0.05
> Aligned          1501              95.7             0.06
> Unaligned        1501              73.9             0.05
> Aligned          9000             459.8             0.05
> Unaligned        9000             523.5             0.06
> Aligned          9001             536.7             0.06
> Unaligned        9001             507.5             0.06
> Aligned         65536            3158.4             0.05
> Unaligned       65536            3506.1             0.05
> Aligned         65537            3277.6             0.05
> Unaligned       65537            3697.6             0.06
> Test OK
>