From: Stephen Hemminger <stephen@networkplumber.org>
To: Scott Mitchell <scott.k.mitch1@gmail.com>
Cc: "Morten Brørup" <mb@smartsharesystems.com>,
	"Konstantin Ananyev" <konstantin.ananyev@huawei.com>,
	dev@dpdk.org, "Bruce Richardson" <bruce.richardson@intel.com>,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>,
	"Vipin Varghese" <vipin.varghese@amd.com>
Subject: Re: [PATCH v5] eal/x86: optimize memcpy of small sizes
Date: Mon, 12 Jan 2026 16:39:39 -0800
Message-ID: <20260112163939.12c7cf8f@phoenix.local>
In-Reply-To: <CAFn2buBx3JNXS1pb6w=0S1HOAxmG+HVTkk=_b+1kou5MnAnqpw@mail.gmail.com>

On Mon, 12 Jan 2026 11:00:36 -0500
Scott Mitchell <scott.k.mitch1@gmail.com> wrote:

> >
> > The discussion about the optimized checksum function [1] has shown us that memcpy() sometimes prevents Clang from optimizing (loop unrolling and vectorizing) and potentially causes strict aliasing bugs with GCC, so I will work on a new patch version that keeps using the above types, instead of introducing memcpy() inside rte_memcpy().
> >
> > [1]: https://inbox.dpdk.org/dev/CAFn2buBzBLFLVN-K=u3MgBEbQ-hqbgJLVpDx3vSXVKJpa0yPNg@mail.gmail.com/
> >  
> 
> Great timing for this thread :)
> 
> My observation:
> - clang is unable to apply optimizations such as loop unrolling and
>   vectorization with RTE_PTR_[ADD,SUB] (e.g. cksum)
> - Even when clang/gcc do apply optimizations, the assembly can be non-optimal
> - direct usage of unaligned_NN_t types can cause incorrect results
>   (due to gcc bugs)
> 
> I don't think "rte_NN_alias" structs are safe on architectures that don't allow
> unaligned access, because the inner "val" needs to indicate that it may be
> used for unaligned access.
> 
> My suggestion:
> 1. Fix unaligned_NN_t types to ensure the compiler doesn't aggressively
> apply strict-aliasing optimizations that produce incorrect results
> (https://patches.dpdk.org/project/dpdk/patch/20260112120411.27314-2-scott.k.mitch1@gmail.com/).
> The intermediate rte_NN_alias structs are then unnecessary and we can use
> unaligned_NN_t directly (e.g.
> https://patches.dpdk.org/project/dpdk/patch/20260112120411.27314-3-scott.k.mitch1@gmail.com/)
> 
> 2. Improve RTE_PTR_[ADD,SUB] to be more compiler friendly
> (https://patches.dpdk.org/project/dpdk/patch/20260112154059.36879-1-scott.k.mitch1@gmail.com/)

FYI the Linux kernel avoids the memcpy silliness,
mostly by identifying the architectures where unaligned access is a non-issue.
On x86, unaligned access works fine. As I remember, it works on ARM as well.
The only place where an unaligned access can break badly is when it is an atomic operation.
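
A rough sketch of that idea, to make it concrete. This is not the kernel's
or DPDK's code: SKETCH_EFFICIENT_UNALIGNED, struct sketch_u32,
sketch_load_u32() and sketch_is_aligned() are made-up names, and the
architecture list in the #if is only an assumption for illustration. It
relies on the GCC/Clang "packed" and "may_alias" type attributes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-architecture switch (assumed list, illustration only). */
#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__)
#define SKETCH_EFFICIENT_UNALIGNED 1
#else
#define SKETCH_EFFICIENT_UNALIGNED 0
#endif

/*
 * packed:    the member has alignment 1, so the compiler may not assume
 *            the pointer is 4-byte aligned.
 * may_alias: accesses through this type are exempt from type-based
 *            (strict) alias analysis.
 */
struct sketch_u32 {
	uint32_t val;
} __attribute__((packed, may_alias));

/* Load a 32-bit value from a possibly unaligned address. */
static inline uint32_t
sketch_load_u32(const void *p)
{
#if SKETCH_EFFICIENT_UNALIGNED
	/* Direct load; no memcpy() for the optimizer to see through. */
	return ((const struct sketch_u32 *)p)->val;
#else
	/* Portable fallback for architectures not on the list above. */
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
#endif
}

/*
 * Atomic operations are the exception: check natural alignment before
 * using one. 'align' must be a power of two.
 */
static inline int
sketch_is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

Something like sketch_load_u32(buf + 1) should then return the same value
as a memcpy() of those four bytes, on either branch of the #if.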
