public inbox for dev@dpdk.org
From: Thomas Monjalon <thomas@monjalon.net>
To: Scott Mitchell <scott.k.mitch1@gmail.com>
Cc: David Marchand <david.marchand@redhat.com>,
	dev@dpdk.org, mb@smartsharesystems.com,
	stephen@networkplumber.org, bruce.richardson@intel.com
Subject: Re: [PATCH v19 0/2] net: optimize __rte_raw_cksum
Date: Tue, 10 Feb 2026 12:53:21 +0100
Message-ID: <5146778.aeNJFYEL58@thomas>
In-Reply-To: <CAFn2buB6yCTZGFOfE-erpU92SRfgGqcvpFrivOLNva3g-pq5Lg@mail.gmail.com>

Here are my test results:

    buildtype             : debugoptimized
    default_library       : shared
    -march=x86-64-v4 (Cascade Lake)
    gcc 15.2.1
    clang 21.1.6

GCC - BEFORE
Alignment  Block size    TSC cycles/block  TSC cycles/byte
Aligned           20                20.5             1.02
Unaligned         20                14.1             0.70
Aligned           21                15.8             0.75
Unaligned         21                15.8             0.75
Aligned         1500               148.2             0.10
Unaligned       1500               148.3             0.10
Aligned         1501               148.4             0.10
Unaligned       1501               148.2             0.10

GCC - AFTER
Alignment  Block size    TSC cycles/block  TSC cycles/byte
Aligned           20                20.8             1.04
Unaligned         20                15.6             0.78
Aligned           21                16.9             0.81
Unaligned         21                16.9             0.80
Aligned         1500               109.5             0.07
Unaligned       1500               111.6             0.07
Aligned         1501               111.1             0.07
Unaligned       1501               113.0             0.08
Aligned         9000               612.4             0.07
Unaligned       9000               612.6             0.07
Aligned         9001               581.5             0.06
Unaligned       9001               601.7             0.07

CLANG - BEFORE
Alignment  Block size    TSC cycles/block  TSC cycles/byte
Aligned           20                14.2             0.71
Unaligned         20                 9.5             0.47
Aligned           21                11.7             0.56
Unaligned         21                11.8             0.56
Aligned         1500               610.7             0.41
Unaligned       1500               632.0             0.42
Aligned         1501               610.4             0.41
Unaligned       1501               627.6             0.42

CLANG - AFTER
Alignment  Block size    TSC cycles/block  TSC cycles/byte
Aligned           20                14.0             0.70
Unaligned         20                 9.1             0.45
Aligned           21                 9.7             0.46
Unaligned         21                 9.6             0.46
Aligned         1500                77.9             0.05
Unaligned       1500                79.4             0.05
Aligned         1501                79.4             0.05
Unaligned       1501                80.4             0.05
Aligned         9000               447.8             0.05
Unaligned       9000               492.1             0.05
Aligned         9001               448.5             0.05
Unaligned       9001               492.6             0.05

Before your patch, clang is faster than GCC at small block sizes
(20 and 21 bytes), and GCC is about 4x faster than clang at large
block sizes (1500 and 1501 bytes).
After your patch, clang is faster than GCC at every block size tested.
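
To put numbers on that comparison, here is a minimal sketch that computes
the before/after speedup from a few of the aligned rows above. The struct
and function names are made up for this illustration; only the cycle
counts come from the tables.

```c
#include <stddef.h>
#include <stdio.h>

/* One benchmark row: TSC cycles per block before and after the patch. */
struct cksum_row {
    const char *compiler;
    unsigned int block_size;   /* bytes */
    double before;             /* TSC cycles/block, before the patch */
    double after;              /* TSC cycles/block, after the patch */
};

/* Speedup factor: values above 1.0 mean the patched code is faster. */
double speedup(double before, double after)
{
    return before / after;
}

/* Cycles per byte, i.e. the last column of the tables. */
double cycles_per_byte(double cycles_per_block, unsigned int block_size)
{
    return cycles_per_block / (double)block_size;
}

/* Aligned rows copied from the GCC and clang tables above. */
const struct cksum_row rows[] = {
    { "gcc",     20,  20.5,  20.8 },
    { "gcc",   1500, 148.2, 109.5 },
    { "clang",   20,  14.2,  14.0 },
    { "clang", 1500, 610.7,  77.9 },
};

/* Print one summary line per row. */
void print_speedups(void)
{
    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
        const struct cksum_row *r = &rows[i];

        printf("%-5s %5u bytes: %6.1f -> %6.1f cycles/block (%.2fx), "
               "%.2f cycles/byte after\n",
               r->compiler, r->block_size, r->before, r->after,
               speedup(r->before, r->after),
               cycles_per_byte(r->after, r->block_size));
    }
}
```

Calling print_speedups() from a main() gives roughly 1.35x for GCC and
7.84x for clang at 1500 bytes, while the 20-byte rows stay within a few
percent of 1.0x in either direction.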


07/02/2026 02:29, Scott Mitchell:
> Thanks for testing! I included my build/host config, results on the
> main branch, and then with this patch applied below. What are your
> build flags/configuration (e.g., cpu_instruction_set, march,
> optimization level)? I wasn't able to get any Clang version (18, 19,
> 20) to vectorize on Godbolt https://godbolt.org/z/8149r7sq8, and I'm
> curious whether your config enables vectorization.
> 
> #### build / host config
>   User defined options
>     b_lto              : false
>     buildtype          : release
>     c_args             : -fno-omit-frame-pointer -DPACKET_QDISC_BYPASS=1 -DRTE_MEMCPY_AVX512=1
>     cpu_instruction_set: cascadelake
>     default_library    : static
>     max_lcores         : 128
>     optimization       : 3
> $ clang --version
> clang version 18.1.8 (Red Hat, Inc. 18.1.8-3.el9)
> $ cat /etc/redhat-release
> Red Hat Enterprise Linux release 9.4 (Plow)
> 
> #### main branch
> $ echo "cksum_perf_autotest" | /usr/local/bin/dpdk-test
> ### rte_raw_cksum() performance ###
> Alignment  Block size    TSC cycles/block  TSC cycles/byte
> Aligned           20                10.0             0.50
> Unaligned         20                10.1             0.50
> Aligned           21                11.1             0.53
> Unaligned         21                11.6             0.55
> Aligned          100                39.4             0.39
> Unaligned        100                67.3             0.67
> Aligned          101                43.3             0.43
> Unaligned        101                41.5             0.41
> Aligned         1500               728.2             0.49
> Unaligned       1500               805.8             0.54
> Aligned         1501               768.8             0.51
> Unaligned       1501               787.3             0.52
> Test OK
> 
> #### with this patch
> $ echo "cksum_perf_autotest" | /usr/local/bin/dpdk-test
> ### rte_raw_cksum() performance ###
> Alignment  Block size    TSC cycles/block  TSC cycles/byte
> Aligned           20                12.6             0.63
> Unaligned         20                12.3             0.62
> Aligned           21                13.6             0.65
> Unaligned         21                13.6             0.65
> Aligned          100                22.7             0.23
> Unaligned        100                22.6             0.23
> Aligned          101                47.4             0.47
> Unaligned        101                23.9             0.24
> Aligned         1500                73.9             0.05
> Unaligned       1500                73.9             0.05
> Aligned         1501                95.7             0.06
> Unaligned       1501                73.9             0.05
> Aligned         9000               459.8             0.05
> Unaligned       9000               523.5             0.06
> Aligned         9001               536.7             0.06
> Unaligned       9001               507.5             0.06
> Aligned        65536              3158.4             0.05
> Unaligned      65536              3506.1             0.05
> Aligned        65537              3277.6             0.05
> Unaligned      65537              3697.6             0.06
> Test OK
> 
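
On the vectorization question quoted above: for readers following the
thread, here is a minimal sketch of the kind of loop being discussed.
It is not the code from the patch. The typedef and function names are
hypothetical stand-ins, and the attributes are the GCC/clang spellings
that the series title ("__rte_may_alias and __rte_aligned") suggests.
Whether a given compiler turns this loop into vector code depends on
its version and flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an unaligned 16-bit type: may_alias lets it
 * read any object without violating strict aliasing, and aligned(1)
 * tells the compiler not to assume 2-byte alignment.
 */
typedef uint16_t unaligned_u16 __attribute__((__may_alias__, __aligned__(1)));

/*
 * Add the buffer to a running sum as 16-bit words in host byte order.
 * Assumption for this sketch: len is small enough (for example at most
 * 64 KiB with sum starting at 0) that the 32-bit accumulator cannot
 * overflow.
 */
uint32_t raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
    const unaligned_u16 *p = buf;
    const unaligned_u16 *end = p + len / 2;

    /* Simple counted loop over 16-bit words. */
    for (; p < end; p++)
        sum += *p;

    /* Odd trailing byte: pad with a zero byte to form a last word. */
    if (len & 1) {
        uint16_t last = 0;

        memcpy(&last, end, 1);
        sum += last;
    }
    return sum;
}

/* Fold the 32-bit sum into 16 bits with end-around carry. */
uint16_t cksum_fold_sketch(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

The typedef is what lets the loop read straight from a byte buffer at
any offset; without may_alias, reading through a uint16_t pointer would
need a memcpy per word to stay within the aliasing rules.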





