DPDK-dev Archive on lore.kernel.org
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Wathsala Vithanage <wathsala.vithanage@arm.com>
Subject: RE: [PATCH 2/5] ring: use GCC builtin as alternative to rte_atomic32
Date: Thu, 4 Jun 2026 15:11:25 +0000
Message-ID: <d65ce1df239445e0bb795c852e075b1a@huawei.com>
In-Reply-To: <20260602171552.686349-3-stephen@networkplumber.org>

> This patch replaces use of the deprecated rte_atomic32 code with
> GCC builtin atomic operations.
> 
> Although it would be preferable to use C11 version on all architectures,
> there is a performance loss if we do it that way:
> 
> Measured on i9-13900H, two physical cores MP/MC bulk n=128, 10 runs:
>   with C11 builtin:           5.86 cycles/elem
>   with __sync builtin:        5.36 cycles/elem  (-9.4%)
> 
> The C11 __atomic_compare_exchange_n builtin writes the actual value back
> to its expected pointer on failure. On x86 this forces GCC
> to emit extra instructions on the critical path between the CAS
> and the success-test.
> 
> __sync_bool_compare_and_swap returns a plain bool with no pointer
> writeback, allowing GCC to emit tighter code.
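
For readers following along, the difference in loop shape described above can be sketched roughly as below. This is only an illustration written against the raw GCC builtins, not the code from the patch; the helper names move_head_c11 and move_head_sync are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: advance *head by n with the C11-style builtin.
 * On failure the builtin writes the value it observed into 'old',
 * so the retry loop needs no explicit reload.
 */
static uint32_t
move_head_c11(uint32_t *head, uint32_t n)
{
	uint32_t old = __atomic_load_n(head, __ATOMIC_RELAXED);

	while (!__atomic_compare_exchange_n(head, &old, old + n,
			false, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
		;
	return old;
}

/* Hypothetical helper: the same update with the legacy __sync builtin.
 * It returns only a bool, so the loop reloads *head itself on every retry.
 */
static uint32_t
move_head_sync(uint32_t *head, uint32_t n)
{
	uint32_t old;

	do {
		old = __atomic_load_n(head, __ATOMIC_RELAXED);
	} while (!__sync_bool_compare_and_swap(head, old, old + n));
	return old;
}
```

Both helpers return the head value seen before the update. Whether the second form is actually faster depends on the compiler and target, as the measurement quoted above suggests for one x86 system.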
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  lib/ring/meson.build                          |  2 +-
>  lib/ring/rte_ring_c11_pvt.h                   |  3 +-
>  lib/ring/rte_ring_elem_pvt.h                  |  2 +-
>  ..._ring_generic_pvt.h => rte_ring_gcc_pvt.h} | 37 +++++++++++--------
>  4 files changed, 24 insertions(+), 20 deletions(-)
>  rename lib/ring/{rte_ring_generic_pvt.h => rte_ring_gcc_pvt.h} (87%)
> 
> diff --git a/lib/ring/meson.build b/lib/ring/meson.build
> index 21f2c12989..2ba160b178 100644
> --- a/lib/ring/meson.build
> +++ b/lib/ring/meson.build
> @@ -9,7 +9,7 @@ indirect_headers += files (
>          'rte_ring_elem.h',
>          'rte_ring_elem_pvt.h',
>          'rte_ring_c11_pvt.h',
> -        'rte_ring_generic_pvt.h',
> +        'rte_ring_gcc_pvt.h',
>          'rte_ring_hts.h',
>          'rte_ring_hts_elem_pvt.h',
>          'rte_ring_peek.h',
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 5afc14dec9..8358b0f21f 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -43,7 +43,6 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>  	 */
>  	rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
>  }
> -
>  /**
>   * @internal This is a helper function that moves the producer/consumer head
>   *    optimized for single threaded case
> @@ -82,7 +81,7 @@ __rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
>  	/* Single producer: only this thread writes d->head,
>  	 * so a relaxed load is sufficient.
>  	 */
> -	*old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_relaxed);
> +	*old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);

Not sure why this was changed to 'acquire' here.
It looks like just a patch-splitting mistake, no?
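
To make the point concrete, here is a minimal sketch of a single-producer head move, following the comments quoted in this hunk: the producer's own head can be loaded relaxed, while the consumer's tail needs an acquire load. It uses the raw GCC builtins rather than the rte_atomic_* wrappers, and the struct and function names (headtail, sp_move_head) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ring's head/tail pair. */
struct headtail {
	uint32_t head;
	uint32_t tail;
};

/* Reserve n slots for a single producer. Returns n on success, 0 if
 * there is not enough free space. Indexes are free-running 32-bit
 * counters, so unsigned wrap-around is expected.
 */
static uint32_t
sp_move_head(struct headtail *prod, struct headtail *cons,
	     uint32_t capacity, uint32_t n, uint32_t *old_head)
{
	uint32_t cons_tail, free_entries;

	/* Only this thread writes prod->head: relaxed is sufficient. */
	*old_head = __atomic_load_n(&prod->head, __ATOMIC_RELAXED);

	/* Acquire pairs with the consumer's release-store of its tail. */
	cons_tail = __atomic_load_n(&cons->tail, __ATOMIC_ACQUIRE);

	free_entries = capacity + cons_tail - *old_head;
	if (n > free_entries)
		return 0;

	__atomic_store_n(&prod->head, *old_head + n, __ATOMIC_RELAXED);
	return n;
}
```

In this sketch the only ordering that matters for correctness is on the load of cons->tail; strengthening the load of prod->head to acquire would not add a pairing with any other thread's store.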

> 
>  	/* Acquire pairs with the consumer's release-store of tail in __rte_ring_update_tail,
>  	 * ensuring the consumer's ring-element reads are complete before

Thread overview: 17+ messages
2026-06-02 17:07 [PATCH 0/5] ring: convert to C11 atomics where practical Stephen Hemminger
2026-06-02 17:07 ` [PATCH 1/5] ring: split single thread vs multi-thread cases Stephen Hemminger
2026-06-04 15:09   ` Konstantin Ananyev
2026-06-02 17:07 ` [PATCH 2/5] ring: use GCC builtin as alternative to rte_atomic32 Stephen Hemminger
2026-06-04 15:11   ` Konstantin Ananyev [this message]
2026-06-04 15:20     ` Stephen Hemminger
2026-06-04 15:43       ` Konstantin Ananyev
2026-06-02 17:07 ` [PATCH 3/5] ring: use C11 for update_tail Stephen Hemminger
2026-06-04 15:39   ` Konstantin Ananyev
2026-06-02 17:07 ` [PATCH 4/5] ring: drop unused arg to update_tail Stephen Hemminger
2026-06-04 15:40   ` Konstantin Ananyev
2026-06-02 17:07 ` [PATCH 5/5] ring: use C11 for single thread move head Stephen Hemminger
2026-06-04 15:41   ` Konstantin Ananyev
2026-06-04 16:32 ` [PATCH v2] ring: convert to C11 atomics where practical Stephen Hemminger
2026-06-04 16:32   ` [PATCH v2 1/3] ring: split single thread vs multi-thread cases Stephen Hemminger
2026-06-04 16:32   ` [PATCH v2 2/3] ring: use GCC builtin as alternative to rte_atomic32 Stephen Hemminger
2026-06-04 16:32   ` [PATCH v2 3/3] ring: cleanup the C11 code Stephen Hemminger
