RE: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32

From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Wathsala Vithanage <wathsala.vithanage@arm.com>
Subject: RE: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
Date: Mon, 1 Jun 2026 18:18:18 +0000
Message-ID: <b8523a4e0ee34e8b9194935574e86f7c@huawei.com>
In-Reply-To: <20260526232542.620966-4-stephen@networkplumber.org>


> Remove the RTE_USE_C11_MEM_MODEL build switch; C11 atomics are now
> the default for all platforms. Unifies __rte_ring_update_tail into
> the C11 form (atomic_store_release replaces the older rte_smp_wmb +
> plain store on the generic path) and renames rte_ring_generic_pvt.h
> to rte_ring_x86_pvt.h to reflect its new scope.
> 
> Also splits the head-move helper into separate ST and MT variants,
> removing the runtime is_st branch from the MT retry loop.
> This gives a small boost and scopes the following exception
> more tightly.
> 
> Exception: on x86 with GCC, atomic_compare_exchange on the head CAS
> regresses MP/MC contended throughput by ~20% versus the existing
> hand-written cmpxchg. As a workaround, GCC-on-x86 builds use the older
> __sync_bool_compare_and_swap builtin, which generates equivalent
> code to the original asm. Can be reverted if/when GCC gets
> fixed; a similar issue was observed in the Linux kernel.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  lib/ring/meson.build                          |   2 +-
>  lib/ring/rte_ring_c11_pvt.h                   |  75 +++--------
>  lib/ring/rte_ring_elem_pvt.h                  | 125 ++++++++++++++++--
>  ..._ring_generic_pvt.h => rte_ring_x86_pvt.h} |  61 ++-------
>  lib/ring/soring.c                             |  15 ++-
>  5 files changed, 158 insertions(+), 120 deletions(-)
>  rename lib/ring/{rte_ring_generic_pvt.h => rte_ring_x86_pvt.h} (60%)
> 
> diff --git a/lib/ring/meson.build b/lib/ring/meson.build
> index 21f2c12989..b178c963b8 100644
> --- a/lib/ring/meson.build
> +++ b/lib/ring/meson.build
> @@ -9,7 +9,7 @@ indirect_headers += files (
>          'rte_ring_elem.h',
>          'rte_ring_elem_pvt.h',
>          'rte_ring_c11_pvt.h',
> -        'rte_ring_generic_pvt.h',
> +        'rte_ring_x86_pvt.h',
>          'rte_ring_hts.h',
>          'rte_ring_hts_elem_pvt.h',
>          'rte_ring_peek.h',
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 07b6efc416..3efe011f08 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -15,35 +15,10 @@
>   * @file rte_ring_c11_pvt.h
>   * It is not recommended to include this file directly,
>   * include <rte_ring.h> instead.
> - * Contains internal helper functions for MP/SP and MC/SC ring modes.
> + * Contains internal helper functions for MP and MC ring modes.
>   * For more information please refer to <rte_ring.h>.
>   */
> 
> -/**
> - * @internal This function updates tail values.
> - */
> -static __rte_always_inline void
> -__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> -		uint32_t new_val, uint32_t single, uint32_t enqueue)
> -{
> -	RTE_SET_USED(enqueue);
> -
> -	/*
> -	 * If there are other enqueues/dequeues in progress that preceded us,
> -	 * we need to wait for them to complete
> -	 */
> -	if (!single)
> -		rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
> -			rte_memory_order_relaxed);
> -
> -	/*
> -	 * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> -	 * Ensures that memory effects by this thread on ring elements array
> -	 * is observed by a different thread of the other type.
> -	 */
> -	rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
> -}
> -
>  /**
>   * @internal This is a helper function that moves the producer/consumer head
>   *
> @@ -72,14 +47,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>   *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
>   */
>  static __rte_always_inline unsigned int
> -__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> +__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
>  		const struct rte_ring_headtail *s, uint32_t capacity,
> -		unsigned int is_st, unsigned int n,
> -		enum rte_ring_queue_behavior behavior,
> +		unsigned int n,	enum rte_ring_queue_behavior behavior,
>  		uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
>  {
> -	uint32_t stail;
> -	int success;
>  	unsigned int max = n;
> 
>  	/*
> @@ -89,8 +61,7 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>  	 * d->head.
>  	 * If not, an unsafe partial order may ensue.
>  	 */
> -	*old_head = rte_atomic_load_explicit(&d->head,
> -			rte_memory_order_acquire);
> +	*old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
>  	do {
>  		/* Reset n to the initial burst count */
>  		n = max;
> @@ -101,15 +72,14 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>  		 * ring elements array is observed by the time
>  		 * this thread observes its tail update.
>  		 */
> -		stail = rte_atomic_load_explicit(&s->tail,
> -					rte_memory_order_acquire);
> +		uint32_t stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
> 
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
>  		 * *old_head > s->tail). So 'entries' is always between 0
>  		 * and capacity (which is < size).
>  		 */
> -		*entries = (capacity + stail - *old_head);
> +		*entries = capacity + stail - *old_head;
> 
>  		/* check that we have enough room in ring */
>  		if (unlikely(n > *entries))
> @@ -120,25 +90,20 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>  			return 0;
> 
>  		*new_head = *old_head + n;
> -		if (is_st) {
> -			d->head = *new_head;
> -			success = 1;
> -		} else
> -			/* on failure, *old_head is updated */
> -			/*
> -			 * R1/A2.
> -			 * R1: Establishes a synchronizing edge with A0 of a
> -			 * different thread.
> -			 * A2: Establishes a synchronizing edge with R1 of a
> -			 * different thread to observe same value for stail
> -			 * observed by that thread on CAS failure (to retry
> -			 * with an updated *old_head).
> -			 */
> -			success = rte_atomic_compare_exchange_strong_explicit(
> -					&d->head, old_head, *new_head,
> -					rte_memory_order_release,
> -					rte_memory_order_acquire);
> -	} while (unlikely(success == 0));
> +
> +		/* on failure, *old_head is updated */
> +		/*
> +		 * R1/A2.
> +		 * R1: Establishes a synchronizing edge with A0 of a
> +		 * different thread.
> +		 * A2: Establishes a synchronizing edge with R1 of a
> +		 * different thread to observe same value for stail
> +		 * observed by that thread on CAS failure (to retry
> +		 * with an updated *old_head).
> +		 */
> +	} while (unlikely(!rte_atomic_compare_exchange_strong_explicit(
> +				  &d->head, old_head, *new_head,
> +				  rte_memory_order_release, rte_memory_order_acquire)));
>  	return n;
>  }
> 
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 6eafae121f..9d1da12a92 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -299,17 +299,108 @@ __rte_ring_dequeue_elems(struct rte_ring *r, uint32_t cons_head,
>  			cons_head & r->mask, esize, num);
>  }
> 
> -/* Between load and load. there might be cpu reorder in weak model
> - * (powerpc/arm).
> - * There are 2 choices for the users
> - * 1.use rmb() memory barrier
> - * 2.use one-direction load_acquire/store_release barrier
> - * It depends on performance test results.
> +/**
> + * @internal This function updates tail values.
>   */
> -#ifdef RTE_USE_C11_MEM_MODEL
> -#include "rte_ring_c11_pvt.h"
> +static __rte_always_inline void
> +__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> +		uint32_t new_val, uint32_t single, uint32_t enqueue)
> +{
> +	RTE_SET_USED(enqueue);
> +
> +	/*
> +	 * If there are other enqueues/dequeues in progress that preceded us,
> +	 * we need to wait for them to complete
> +	 */
> +	if (!single)
> +		rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
> +			rte_memory_order_relaxed);
> +
> +	/*
> +	 * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> +	 * Ensures that memory effects by this thread on ring elements array
> +	 * is observed by a different thread of the other type.
> +	 */
> +	rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
> +}
> +
> +/**
> + * @internal This is a helper function that moves the producer/consumer head
> + *
> + *
> + * This optimized version for single threaded case.
> + *
> + * @param d
> + *   A pointer to the headtail structure with head value to be moved
> + * @param s
> + *   A pointer to the counter-part headtail structure. Note that this
> + *   function only reads tail value from it
> + * @param capacity
> + *   Either ring capacity value (for producer), or zero (for consumer)
> + * @param n
> + *   The number of elements we want to move head value on
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Move on a fixed number of items
> + *   RTE_RING_QUEUE_VARIABLE: Move on as many items as possible
> + * @param old_head
> + *   Returns head value as it was before the move
> + * @param new_head
> + *   Returns the new head value
> + * @param entries
> + *   Returns the number of ring entries available BEFORE head was moved
> + * @return
> + *   Actual number of objects the head was moved on
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
> +		const struct rte_ring_headtail *s, uint32_t capacity,
> +		unsigned int n, enum rte_ring_queue_behavior behavior,
> +		uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
> +{
> +	uint32_t stail;
> +

I really like the idea of splitting the _st and _mt move_head into separate functions.
That makes the code much cleaner and easier to understand and maintain.
A few comments on the actual '_st' implementation below:
 
> +	/*
> +	 * A0: Establishes a synchronizing edge with R1.
> +	 * Ensure that this thread observes same values
> +	 * to stail observed by the thread that updated
> +	 * d->head.
> +	 * If not, an unsafe partial order may ensue.
> +	 */

I believe this comment is not relevant for '_st':
there is no R1 anymore for '_st' (see below),
and no thread other than this one can move the head.
So there is probably no point in using '_acquire' order here.
 
> +	*old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
> +
> +	/*
> +	 * A1: Establishes a synchronizing edge with R0.
> +	 * Ensures that other thread's memory effects on
> +	 * ring elements array is observed by the time
> +	 * this thread observes its tail update.
> +	 */
> +	stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
> +
> +	/* The subtraction is done between two unsigned 32bits value
> +	 * (the result is always modulo 32 bits even if we have
> +	 * *old_head > s->tail). So 'entries' is always between 0
> +	 * and capacity (which is < size).
> +	 */
> +	*entries = capacity + stail - *old_head;
> +
> +	/* check that we have enough room in ring */
> +	if (unlikely(n > *entries))
> +		n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> +	if (n > 0) {
> +		*new_head = *old_head + n;
> +		d->head = *new_head;

This is a bit inconsistent with the 'load' operation above:
if we use atomic_load(&d->head, ...) there, then it would be better to use
atomic_store(&d->head, ..., order_relaxed) here.
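
To illustrate the two '_st' points (no need for '_acquire' on d->head,
and a matching atomic store), here is a minimal standalone sketch in
plain C11 <stdatomic.h>. It is not the DPDK code: struct headtail,
move_head_st() and the 'fixed' flag are hypothetical stand-ins for
struct rte_ring_headtail, __rte_ring_headtail_move_head_st() and the
behavior enum, and it assumes that only the calling thread ever writes
d->head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_ring_headtail. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Single-threaded head move (sketch). Assumption: no other thread
 * writes d->head, so its load and store can both be relaxed.
 * The load of s->tail stays acquire, to pair with the release store
 * the other side performs when it publishes its tail.
 */
static inline unsigned int
move_head_st(struct headtail *d, struct headtail *s, uint32_t capacity,
		unsigned int n, int fixed,
		uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
{
	uint32_t stail;

	assert(d != NULL && s != NULL);

	*old_head = atomic_load_explicit(&d->head, memory_order_relaxed);
	stail = atomic_load_explicit(&s->tail, memory_order_acquire);

	/* unsigned arithmetic, result is modulo 2^32 */
	*entries = capacity + stail - *old_head;

	if (n > *entries)
		n = fixed ? 0 : *entries;

	if (n > 0) {
		*new_head = *old_head + n;
		/* same access style as the load above */
		atomic_store_explicit(&d->head, *new_head,
				memory_order_relaxed);
	}
	return n;
}
```

On x86 both relaxed accesses should compile to plain moves, so this is
mostly about keeping the load and the store of d->head consistent.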
 
> +	}
> +
> +	return n;
> +}
> +
> +/* There are two choices because GCC optimizer does poorly on atomic_compare_exchange */
> +#if defined(RTE_TOOLCHAIN_GCC) && defined(RTE_ARCH_X86)

If we still need to use legacy code for x86, I think we need an explicit macro
to enable C11 for x86 (RTE_RING_FORCE_C11 or so),
to make sure that the C11 version still gets tested and measured on x86.
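
Something along these lines, as a standalone sketch only.
RTE_RING_FORCE_C11 is just the name floated above and not an existing
DPDK option; the enum and the RING_HEAD_MT_IMPL macro are hypothetical
and replace the real #include lines so that the snippet compiles on its
own. The three #defines at the top stand in for what the build system
would provide on a GCC/x86 target.

```c
#include <assert.h>

/* Stand-ins for what the build would define on a GCC/x86 target. */
#define RTE_TOOLCHAIN_GCC 1
#define RTE_ARCH_X86 1
/* Hypothetical override; not an existing DPDK option. */
#define RTE_RING_FORCE_C11 1

enum ring_head_mt_impl {
	RING_HEAD_MT_X86_SYNC,	/* __sync_bool_compare_and_swap variant */
	RING_HEAD_MT_C11,	/* C11 compare-exchange variant */
};

#if defined(RTE_TOOLCHAIN_GCC) && defined(RTE_ARCH_X86) && \
	!defined(RTE_RING_FORCE_C11)
/* the real header would do: #include "rte_ring_x86_pvt.h" */
#define RING_HEAD_MT_IMPL RING_HEAD_MT_X86_SYNC
#else
/* the real header would do: #include "rte_ring_c11_pvt.h" */
#define RING_HEAD_MT_IMPL RING_HEAD_MT_C11
#endif

/* With the override set, a GCC/x86 build must end up on the C11 path. */
static_assert(RING_HEAD_MT_IMPL == RING_HEAD_MT_C11,
	"RTE_RING_FORCE_C11 should select the C11 variant");

/* Report which MT head-move variant this translation unit selected. */
static inline enum ring_head_mt_impl
ring_head_mt_impl(void)
{
	return RING_HEAD_MT_IMPL;
}
```

A CI or perf job could then build once with the override and once
without it, and compare the two variants on the same x86 box.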

> +#include "rte_ring_x86_pvt.h"
>  #else
> -#include "rte_ring_generic_pvt.h"
> +#include "rte_ring_c11_pvt.h"
>  #endif

I tried to look at the compiler output for both cases. Most of the code
looks nearly identical, but one thing I noticed:
the C11 __rte_ring_headtail_move_head_mt() uses the output
parameter 'uint32_t *old_head' directly within the CAS operation.
On x86_64 that causes gcc to generate extra instructions to
store the return value of the CAS (eax) to the 'old_head' memory location,
even when the CAS was not successful and another attempt has to be
performed. In some cases, even an extra branch can be observed:
https://godbolt.org/z/4dTrqMjYe
In contrast, the x86-specific version that uses
__sync_bool_compare_and_swap() doesn't exhibit this problem,
as __sync_bool_compare_and_swap() doesn't update 'old_head'
with the new value, and we have to re-read it explicitly on each iteration.
I tried to overcome that problem by using a local variable 'head' inside the loop,
and updating the '*old_head' value only at exit.
With that change gcc manages to avoid the extra store (and branch),
see __rte_ring_headtail_move_head_mt_c11_v2() in the link above.

Can I ask you to re-run your perf test with the patch:
https://patchwork.dpdk.org/project/dpdk/patch/20260601181509.71007-1-konstantin.ananyev@huawei.com/
applied on top of your changes and see whether it helps in terms of performance?
Alternatively, if you point me to the exact tests you are running,
I am happy to repeat them on my box.
My preference would be to avoid arch/compiler specific versions, if possible.
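
For readers without the links at hand, the local-variable idea can be
sketched roughly as below. This is a standalone illustration in plain
C11 <stdatomic.h>, not the code from the godbolt link or from the
patchwork patch: struct headtail, move_head_mt_local() and the 'fixed'
flag are hypothetical stand-ins, and whether a given compiler really
drops the extra store has to be checked in the generated code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_ring_headtail. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * MT head move (sketch): the CAS works on the local 'head', so a
 * failed attempt only refreshes a local and never writes through the
 * 'old_head' output pointer. Outputs are stored once, at exit.
 */
static inline unsigned int
move_head_mt_local(struct headtail *d, struct headtail *s,
		uint32_t capacity, unsigned int n, int fixed,
		uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
{
	const unsigned int max = n;
	uint32_t head, next = 0, avail;

	assert(d != NULL && s != NULL);

	head = atomic_load_explicit(&d->head, memory_order_acquire);
	do {
		/* reset n to the initial burst count */
		n = max;

		uint32_t stail = atomic_load_explicit(&s->tail,
				memory_order_acquire);
		avail = capacity + stail - head;

		if (n > avail)
			n = fixed ? 0 : avail;
		if (n == 0)
			break;

		next = head + n;
		/* on failure, only the local 'head' is updated */
	} while (!atomic_compare_exchange_strong_explicit(&d->head,
			&head, next,
			memory_order_release, memory_order_acquire));

	*old_head = head;
	*entries = avail;
	if (n > 0)
		*new_head = next;
	return n;
}
```

As in the existing code, *new_head is left untouched when nothing
could be moved.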

>  /**
> @@ -341,8 +432,12 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
>  		uint32_t *old_head, uint32_t *new_head,
>  		uint32_t *free_entries)
>  {
> -	return __rte_ring_headtail_move_head(&r->prod, &r->cons, r->capacity,
> -			is_sp, n, behavior, old_head, new_head, free_entries);
> +	if (is_sp)
> +		return __rte_ring_headtail_move_head_st(&r->prod, &r->cons, r->capacity,
> +				n, behavior, old_head, new_head, free_entries);
> +	else
> +		return __rte_ring_headtail_move_head_mt(&r->prod, &r->cons, r->capacity,
> +				n, behavior, old_head, new_head, free_entries);
>  }
> 
>  /**
> @@ -374,8 +469,12 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
>  		uint32_t *old_head, uint32_t *new_head,
>  		uint32_t *entries)
>  {
> -	return __rte_ring_headtail_move_head(&r->cons, &r->prod, 0,
> -			is_sc, n, behavior, old_head, new_head, entries);
> +	if (is_sc)
> +		return __rte_ring_headtail_move_head_st(&r->cons, &r->prod, 0,
> +				n, behavior, old_head, new_head, entries);
> +	else
> +		return __rte_ring_headtail_move_head_mt(&r->cons, &r->prod, 0,
> +				n, behavior, old_head, new_head, entries);
>  }
> 
>  /**
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_x86_pvt.h
> similarity index 60%
> rename from lib/ring/rte_ring_generic_pvt.h
> rename to lib/ring/rte_ring_x86_pvt.h
> index affd2d5ba7..c8de108bbd 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_x86_pvt.h
> @@ -7,39 +7,19 @@
>   * Used as BSD-3 Licensed with permission from Kip Macy.
>   */
> 
> -#ifndef _RTE_RING_GENERIC_PVT_H_
> -#define _RTE_RING_GENERIC_PVT_H_
> +#ifndef _RTE_RING_X86_PVT_H_
> +#define _RTE_RING_X86_PVT_H_
> 
>  /**
> - * @file rte_ring_generic_pvt.h
> + * @file rte_ring_x86_pvt.h
>   * It is not recommended to include this file directly,
>   * include <rte_ring.h> instead.
> - * Contains internal helper functions for MP/SP and MC/SC ring modes.
> - * For more information please refer to <rte_ring.h>.
> + *
> + * Contains internal helper functions for MP and MC ring modes.
> + * It is GCC specific to workaround poor optimizer handling of C11 atomic
> + * compare_exchange.
>   */
> 
> -/**
> - * @internal This function updates tail values.
> - */
> -static __rte_always_inline void
> -__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> -		uint32_t new_val, uint32_t single, uint32_t enqueue)
> -{
> -	if (enqueue)
> -		rte_smp_wmb();
> -	else
> -		rte_smp_rmb();
> -	/*
> -	 * If there are other enqueues/dequeues in progress that preceded us,
> -	 * we need to wait for them to complete
> -	 */
> -	if (!single)
> -		rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
> -			rte_memory_order_relaxed);
> -
> -	ht->tail = new_val;
> -}
> -
>  /**
>   * @internal This is a helper function that moves the producer/consumer head
>   *
> @@ -50,8 +30,6 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>   *   function only reads tail value from it
>   * @param capacity
>   *   Either ring capacity value (for producer), or zero (for consumer)
> - * @param is_st
> - *   Indicates whether multi-thread safe path is needed or not
>   * @param n
>   *   The number of elements we want to move head value on
>   * @param behavior
> @@ -68,14 +46,13 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>   *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
>   */
>  static __rte_always_inline unsigned int
> -__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> +__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
>  		const struct rte_ring_headtail *s, uint32_t capacity,
> -		unsigned int is_st, unsigned int n,
> +		unsigned int n,
>  		enum rte_ring_queue_behavior behavior,
>  		uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
>  {
>  	unsigned int max = n;
> -	int success;
> 
>  	do {
>  		/* Reset n to the initial burst count */
> @@ -83,18 +60,13 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> 
>  		*old_head = d->head;
> 
> -		/* add rmb barrier to avoid load/load reorder in weak
> -		 * memory model. It is noop on x86
> -		 */
> -		rte_smp_rmb();
> -
>  		/*
>  		 *  The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
>  		 * *old_head > s->tail). So 'entries' is always between 0
>  		 * and capacity (which is < size).
>  		 */
> -		*entries = (capacity + s->tail - *old_head);
> +		*entries = capacity + s->tail - *old_head;
> 
>  		/* check that we have enough room in ring */
>  		if (unlikely(n > *entries))
> @@ -105,15 +77,10 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>  			return 0;
> 
>  		*new_head = *old_head + n;
> -		if (is_st) {
> -			d->head = *new_head;
> -			success = 1;
> -		} else
> -			success = rte_atomic32_cmpset(
> -					(uint32_t *)(uintptr_t)&d->head,
> -					*old_head, *new_head);
> -	} while (unlikely(success == 0));
> +	} while (unlikely(!__sync_bool_compare_and_swap(
> +				  (uint32_t *)(uintptr_t)&d->head,
> +				  *old_head, *new_head)));
>  	return n;
>  }
> 
> -#endif /* _RTE_RING_GENERIC_PVT_H_ */
> +#endif /* _RTE_RING_X86_PVT_H_ */
> diff --git a/lib/ring/soring.c b/lib/ring/soring.c
> index 3b90521bdb..0e8bbc03c1 100644
> --- a/lib/ring/soring.c
> +++ b/lib/ring/soring.c
> @@ -135,9 +135,12 @@ __rte_soring_move_prod_head(struct rte_soring *r, uint32_t num,
> 
>  	switch (st) {
>  	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_headtail_move_head_st(&r->prod.ht, &r->cons.ht,
> +				r->capacity, num, behavior, head, next, free);
> +		break;
>  	case RTE_RING_SYNC_MT:
> -		n = __rte_ring_headtail_move_head(&r->prod.ht, &r->cons.ht,
> -			r->capacity, st, num, behavior, head, next, free);
> +		n = __rte_ring_headtail_move_head_mt(&r->prod.ht, &r->cons.ht,
> +				r->capacity, num, behavior, head, next, free);
>  		break;
>  	case RTE_RING_SYNC_MT_RTS:
>  		n = __rte_ring_rts_move_head(&r->prod.rts, &r->cons.ht,
> @@ -168,9 +171,13 @@ __rte_soring_move_cons_head(struct rte_soring *r, uint32_t stage, uint32_t num,
> 
>  	switch (st) {
>  	case RTE_RING_SYNC_ST:
> +		n = __rte_ring_headtail_move_head_st(&r->cons.ht,
> +			&r->stage[stage].ht, 0, num, behavior,
> +			head, next, avail);
> +		break;
>  	case RTE_RING_SYNC_MT:
> -		n = __rte_ring_headtail_move_head(&r->cons.ht,
> -			&r->stage[stage].ht, 0, st, num, behavior,
> +		n = __rte_ring_headtail_move_head_mt(&r->cons.ht,
> +			&r->stage[stage].ht, 0, num, behavior,
>  			head, next, avail);
>  		break;
>  	case RTE_RING_SYNC_MT_RTS:
> --
> 2.53.0
