From: Bruce Richardson <bruce.richardson@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "Morten Brørup" <mb@smartsharesystems.com>,
	dev@dpdk.org,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>,
	"Vipin Varghese" <vipin.varghese@amd.com>
Subject: Re: [PATCH v2] eal/x86: optimize memcpy of small sizes
Date: Fri, 21 Nov 2025 17:02:17 +0000
Message-ID: <aSCbGWNQ7dr7EE7A@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <20251121085730.51f0466a@phoenix.local>

On Fri, Nov 21, 2025 at 08:57:30AM -0800, Stephen Hemminger wrote:
> On Fri, 21 Nov 2025 10:35:35 +0000
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > The implementation for copying up to 64 bytes does not depend on address
> > alignment with the size of the CPU's vector registers, so the code
> > handling this was moved from the various implementations to the common
> > function.
> > 
> > Furthermore, the function for copying less than 16 bytes was replaced with
> > a smarter implementation using fewer branches and potentially fewer
> > load/store operations.
> > This function was also extended to handle copying of up to 16 bytes,
> > instead of up to 15 bytes. This small extension reduces the code path for
> > copying two pointers.
> > 
> > These changes provide two benefits:
> > 1. The memory footprint of the copy function is reduced.
> > Previously there were two instances of the compiled code to copy up to 64
> > bytes, one in the "aligned" code path, and one in the "generic" code path.
> > Now there is only one instance, in the "common" code path.
> > 2. The performance for copying up to 64 bytes is improved.
> > The memcpy performance test shows cache-to-cache copying of up to 32 bytes
> > now typically only takes 2 cycles (4 cycles for 64 bytes) versus
> > ca. 6.5 cycles before this patch.
> > 
> > And finally, the missing implementation of rte_mov48() was added.
> > 
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> 
> As I have said before, I would rather that DPDK move away from having its
> own specialized memcpy.  How does this compare to stock inline gcc?
> The main motivation is that the glibc/gcc team does more testing across
> multiple architectures and has a community with more expertise on CPU
> special cases.

I would tend to agree. Even if we get rte_memcpy a few cycles faster, I
suspect many apps wouldn't notice the difference. However, I understand
that the virtio/vhost libraries gain from using rte_memcpy over standard
memcpy - or at least used to. Perhaps we can consider deprecating
rte_memcpy and just putting a vhost-specific memcpy in that library?

/Bruce
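
For illustration only: the patch description quoted above mentions copying
up to 16 bytes with fewer branches and potentially fewer load/store
operations. One common way to get that effect is to pick a size class and
then use two fixed-size loads and stores that may overlap. The sketch below
is not the code from the patch; copy_up_to_16() is a hypothetical name, and
it assumes the compiler turns the fixed-size memcpy() calls into plain loads
and stores, as gcc and clang typically do when optimizing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the DPDK implementation.
 * Copies n bytes (0 <= n <= 16) from src to dst.
 * Each size class loads the first and the last chunk of the source and
 * stores them to the start and the end of the destination; the two
 * chunks may overlap, so no per-byte loop or tail handling is needed.
 */
static inline void *
copy_up_to_16(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	assert(n <= 16);

	if (n >= 8) {
		/* 8..16 bytes: two 8-byte chunks. */
		uint64_t head, tail;
		memcpy(&head, s, 8);
		memcpy(&tail, s + n - 8, 8);
		memcpy(d, &head, 8);
		memcpy(d + n - 8, &tail, 8);
	} else if (n >= 4) {
		/* 4..7 bytes: two 4-byte chunks. */
		uint32_t head, tail;
		memcpy(&head, s, 4);
		memcpy(&tail, s + n - 4, 4);
		memcpy(d, &head, 4);
		memcpy(d + n - 4, &tail, 4);
	} else if (n >= 2) {
		/* 2..3 bytes: two 2-byte chunks. */
		uint16_t head, tail;
		memcpy(&head, s, 2);
		memcpy(&tail, s + n - 2, 2);
		memcpy(d, &head, 2);
		memcpy(d + n - 2, &tail, 2);
	} else if (n == 1) {
		*d = *s;
	}
	/* n == 0: nothing to copy. */
	return dst;
}
```

With this shape, a 16-byte copy (for example two pointers on a 64-bit
target) takes the first branch and needs two loads and two stores. Whether
it beats the compiler's inlined memcpy() on a given CPU would have to be
measured.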
