* [RFC 0/7] prepare deprecation of rte_atomicNN_*() family
@ 2026-05-21 4:17 Stephen Hemminger
2026-05-21 4:17 ` [RFC 1/7] doc: update versions in deprecation file Stephen Hemminger
` (10 more replies)
0 siblings, 11 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The goal is to land every deprecation currently listed in the release
notes by the 26.11 ABI bump. Working back from there: any function to
be removed in 26.11 needs to be marked __rte_deprecated by 26.07, and
all in-tree users converted off it before the marker patch goes in so
CI stays clean.
This series is the preparatory work for the rte_atomicNN_*() family
under that plan. It does not yet add the __rte_deprecated marker;
that's a separate follow-up once the remaining in-tree users are
converted. Other items on the deprecation list (VXLAN_GPE,
pipeline/table/port legacy API, the MAX enum fix, regexdev, pdump,
TM locks) will follow as their own series on the same timeline.
Patch 3 is the load-bearing change: the last lib/ caller of
rte_atomic32_cmpset() is converted, clearing the way.
Patch 7 drops RTE_FORCE_INTRINSICS entirely. With the intrinsic path
always used, the asm implementations of atomics, spinlock and byteorder
become dead code. About 900 lines are deleted; this is the patch most
worth review attention. It also makes it easier to flag the
rte_atomicNN_*() functions as deprecated, since they then all live in
one place.
Patch 2 retires the rte_smp_*mb deprecation notice (open since 2021)
by reimplementing those APIs as wrappers over rte_atomic_thread_fence,
preserving the API for readability. Patches 5 and 6 convert and clean
up two driver users (bonding, nbl).
Patch 4 is a preparatory workaround for a pre-existing GCC bitfield
-Wmaybe-uninitialized false positive in net/zxdh, surfaced by the
improved compiler visibility after patch 7. Placed ahead of patch 7
to keep every commit bisectable.
Follow-on patches will mechanically convert drivers.
If driver maintainers convert their own code first, all the better.
Stephen Hemminger (7):
doc: update versions in deprecation file
eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
ring: use C11 atomic operations for MP/SP head/tail
net/zxdh: work around GCC bitfield uninit false positive
net/bonding: use stdatomic
net/nbl: remove unused rte_atomic16 field
config: use RTE_FORCE_INTRINSICS on all platforms
config/arm/meson.build | 1 -
config/loongarch/meson.build | 1 -
config/meson.build | 3 -
config/riscv/meson.build | 1 -
doc/guides/rel_notes/deprecation.rst | 12 +-
doc/guides/rel_notes/release_26_07.rst | 5 +
drivers/net/bonding/eth_bond_8023ad_private.h | 4 +-
drivers/net/bonding/rte_eth_bond_8023ad.c | 18 +-
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
drivers/net/zxdh/zxdh_msg.c | 4 +-
lib/eal/arm/include/rte_atomic_32.h | 9 -
lib/eal/arm/include/rte_atomic_64.h | 9 -
lib/eal/arm/include/rte_byteorder.h | 3 -
lib/eal/arm/include/rte_spinlock.h | 3 -
lib/eal/include/generic/rte_atomic.h | 164 ++++----------
lib/eal/include/generic/rte_byteorder.h | 2 -
lib/eal/include/generic/rte_spinlock.h | 10 -
lib/eal/loongarch/include/rte_atomic.h | 9 -
lib/eal/loongarch/include/rte_spinlock.h | 3 -
lib/eal/ppc/include/rte_atomic.h | 179 ---------------
lib/eal/ppc/include/rte_byteorder.h | 13 --
lib/eal/ppc/include/rte_spinlock.h | 26 ---
lib/eal/riscv/include/rte_atomic.h | 9 -
lib/eal/riscv/include/rte_spinlock.h | 3 -
lib/eal/x86/include/rte_atomic.h | 205 +-----------------
lib/eal/x86/include/rte_atomic_32.h | 188 ----------------
lib/eal/x86/include/rte_atomic_64.h | 157 --------------
lib/eal/x86/include/rte_byteorder.h | 49 -----
lib/eal/x86/include/rte_spinlock.h | 49 -----
lib/ring/rte_ring_generic_pvt.h | 64 ++++--
30 files changed, 118 insertions(+), 1086 deletions(-)
--
2.53.0
* [RFC 1/7] doc: update versions in deprecation file
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 4:17 ` [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
` (9 subsequent siblings)
10 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
This document still mentions the 23.11 release and needs to be
updated for the upcoming 26.11.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
doc/guides/rel_notes/deprecation.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 35c9b4e06c..346c517623 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -7,8 +7,8 @@ ABI and API Deprecation
See the guidelines document for details of the :doc:`ABI policy
<../contributing/abi_policy>`.
-With DPDK 23.11, there will be a new major ABI version: 24.
-This means that during the development of 23.11,
+With DPDK 26.11, there will be a new major ABI version: 27.
+This means that during the development of 26.11,
new items may be added to structs or enums,
even if those additions involve an ABI compatibility breakage.
--
2.53.0
* [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
2026-05-21 4:17 ` [RFC 1/7] doc: update versions in deprecation file Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 15:43 ` Wathsala Vithanage
2026-05-21 4:17 ` [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
` (8 subsequent siblings)
10 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
operation deprecation") in 2021 but nothing came of it.
Reimplement them as inline wrappers over rte_atomic_thread_fence()
and drop the deprecation notice.
The API is preserved; only the implementation changes.
Generated code is unchanged on x86 (seq_cst keeps the lock-addl
trick, release/acquire collapse to a compiler barrier under TSO).
On arm64, the release fence emits dmb ish instead of dmb ishst, while
the acquire fence still emits dmb ishld; the difference is below
measurement noise.
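
As a rough illustration of the wrapper approach described above, here is
a minimal sketch using plain C11 <stdatomic.h> rather than the DPDK
headers. The sketch_* names, the payload/ready variables and the
publish/consume helpers are hypothetical and exist only for this
example; the real wrappers call rte_atomic_thread_fence().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_smp_mb/wmb/rmb built on C11 fences. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static uint32_t payload;   /* plain data guarded by the flag below */
static atomic_uint ready;  /* 0 = nothing published, 1 = payload valid */

/* Writer: store the data, fence, then raise the flag. */
static inline void sketch_publish(uint32_t value)
{
	payload = value;
	sketch_smp_wmb();
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader: returns 1 and fills *out once the flag has been observed. */
static inline int sketch_consume(uint32_t *out)
{
	assert(out != NULL);
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	sketch_smp_rmb();
	*out = payload;
	return 1;
}
```

Calling sketch_publish(42) on one thread and polling sketch_consume()
on another is the classic flag-plus-data pattern the rte_smp_wmb() and
rte_smp_rmb() pair is typically used for.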
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
doc/guides/rel_notes/deprecation.rst | 8 --
lib/eal/arm/include/rte_atomic_32.h | 6 --
lib/eal/arm/include/rte_atomic_64.h | 6 --
lib/eal/include/generic/rte_atomic.h | 106 +++++++++++--------------
lib/eal/loongarch/include/rte_atomic.h | 6 --
lib/eal/ppc/include/rte_atomic.h | 6 --
lib/eal/riscv/include/rte_atomic.h | 6 --
lib/eal/x86/include/rte_atomic.h | 33 +++-----
8 files changed, 57 insertions(+), 120 deletions(-)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 346c517623..03b763b472 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -47,14 +47,6 @@ Deprecation Notices
operations must be used for patches that need to be merged in 20.08 onwards.
This change will not introduce any performance degradation.
-* rte_smp_*mb: These APIs provide full barrier functionality. However, many
- use cases do not require full barriers. To support such use cases, DPDK has
- adopted atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations and a new wrapper ``rte_atomic_thread_fence`` instead of
- ``__atomic_thread_fence`` must be used for patches that need to be merged in
- 20.08 onwards. This change will not introduce any performance degradation.
-
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
used by iterators, and arrays holding these values are sized with this
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 0b9a0dfa30..3809ddefb7 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -21,12 +21,6 @@ extern "C" {
#define rte_rmb() __sync_synchronize()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 181bb60929..c9b41f6212 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
-#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
-
-#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-
-#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 0a4f3f8528..4e9d230f85 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -49,69 +49,8 @@ static inline void rte_wmb(void);
* occur before the LOAD operations generated after.
*/
static inline void rte_rmb(void);
-///@}
-
-/** @name SMP Memory Barrier
- */
-///@{
-/**
- * General memory barrier between lcores
- *
- * Guarantees that the LOAD and STORE operations that precede the
- * rte_smp_mb() call are globally visible across the lcores
- * before the LOAD and STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
- */
-static inline void rte_smp_mb(void);
-/**
- * Write memory barrier between lcores
- *
- * Guarantees that the STORE operations that precede the
- * rte_smp_wmb() call are globally visible across the lcores
- * before the STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
- */
-static inline void rte_smp_wmb(void);
-
-/**
- * Read memory barrier between lcores
- *
- * Guarantees that the LOAD operations that precede the
- * rte_smp_rmb() call are globally visible across the lcores
- * before the LOAD operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
- */
-static inline void rte_smp_rmb(void);
///@}
-
/** @name I/O Memory Barrier
*/
///@{
@@ -164,6 +103,51 @@ static inline void rte_io_rmb(void);
*/
static inline void rte_atomic_thread_fence(rte_memory_order memorder);
+
+/** @name SMP Memory Barrier
+ */
+///@{
+/**
+ * General memory barrier between lcores
+ *
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_mb() call are globally visible across the lcores
+ * before the LOAD and STORE operations that follows it.
+ */
+static __rte_always_inline void
+rte_smp_mb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_seq_cst);
+}
+
+/**
+ * Write memory barrier between lcores
+ *
+ * Guarantees that the STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores
+ * before the STORE operations that follows it.
+ */
+static __rte_always_inline void
+rte_smp_wmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_release);
+}
+
+/**
+ * Read memory barrier between lcores
+ *
+ * Guarantees that the LOAD operations that precede the
+ * rte_smp_rmb() call are globally visible across the lcores
+ * before the LOAD operations that follows it.
+ */
+static __rte_always_inline void
+rte_smp_rmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+}
+
+///@}
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index c8066a4612..49e0c67020 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -22,12 +22,6 @@ extern "C" {
#define rte_rmb() rte_mb()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_mb()
-
-#define rte_smp_rmb() rte_mb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_mb()
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 10acc238f9..1da5afccbf 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("sync" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 66346ad474..dd10ad5127 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -27,12 +27,6 @@ extern "C" {
#define rte_rmb() asm volatile("fence r, r" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
#define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index e071e4234e..a850b0257c 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -23,10 +23,6 @@
#define rte_rmb() _mm_lfence()
-#define rte_smp_wmb() rte_compiler_barrier()
-
-#define rte_smp_rmb() rte_compiler_barrier()
-
#ifdef __cplusplus
extern "C" {
#endif
@@ -63,20 +59,6 @@ extern "C" {
* So below we use that technique for rte_smp_mb() implementation.
*/
-static __rte_always_inline void
-rte_smp_mb(void)
-{
-#ifdef RTE_TOOLCHAIN_MSVC
- _mm_mfence();
-#else
-#ifdef RTE_ARCH_I686
- asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
-#else
- asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
-#endif
-#endif
-}
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_compiler_barrier()
@@ -93,10 +75,19 @@ rte_smp_mb(void)
static __rte_always_inline void
rte_atomic_thread_fence(rte_memory_order memorder)
{
- if (memorder == rte_memory_order_seq_cst)
- rte_smp_mb();
- else
+ if (memorder == rte_memory_order_seq_cst) {
+#ifdef RTE_TOOLCHAIN_MSVC
+ _mm_mfence();
+#else
+#ifdef RTE_ARCH_I686
+ asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+#endif
+ } else {
__rte_atomic_thread_fence(memorder);
+ }
}
#ifdef __cplusplus
--
2.53.0
* [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
2026-05-21 4:17 ` [RFC 1/7] doc: update versions in deprecation file Stephen Hemminger
2026-05-21 4:17 ` [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 15:57 ` Wathsala Vithanage
2026-05-21 4:17 ` [RFC 4/7] net/zxdh: work around GCC bitfield uninit false positive Stephen Hemminger
` (7 subsequent siblings)
10 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Wathsala Vithanage
Last caller of rte_atomic32_cmpset() in lib/, blocking deprecation
of the rte_atomicNN_*() family.
Replace cmpset with rte_atomic_compare_exchange_weak_explicit(),
and convert head/tail loads/stores from implicit seq_cst to explicit
acquire/release. Matches the HTS/RTS pattern.
Acquire-load of d->head orders the subsequent load of s->tail (was
rte_smp_rmb()). Acquire-load of s->tail pairs with the release-store
of the counterpart tail in __rte_ring_update_tail(), which subsumes
the previous wmb/rmb barriers.
Weak CAS avoids arm64's hidden inner retry; the outer do-while already
loops. CAS orderings relaxed: no data published by the reservation.
The now-unused 'enqueue' parameter of __rte_ring_update_tail() is
kept but marked __rte_unused, so the call sites are unchanged.
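
The ordering scheme described above can be sketched with plain C11
atomics as follows. This is a simplified stand-in, not the DPDK code:
struct sketch_headtail and the sketch_* functions are hypothetical, the
single-thread (is_st) path and the 'enqueue' parameter are left out,
and a plain spin replaces rte_wait_until_equal_32().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced version of a ring head/tail pair. */
struct sketch_headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Reserve up to n slots on d, bounded by the counterpart s.
 * Producer side passes the ring capacity; consumer side passes 0.
 */
static inline uint32_t
sketch_move_head(struct sketch_headtail *d, struct sketch_headtail *s,
		 uint32_t capacity, uint32_t n,
		 uint32_t *old_head, uint32_t *new_head)
{
	const uint32_t max = n;
	uint32_t tail, entries;
	int success;

	assert(old_head != NULL && new_head != NULL);
	do {
		n = max;
		/* Acquire: ordered before the load of s->tail below. */
		*old_head = atomic_load_explicit(&d->head, memory_order_acquire);
		/* Acquire: pairs with the release store in sketch_update_tail(). */
		tail = atomic_load_explicit(&s->tail, memory_order_acquire);
		/* Unsigned wrap-around keeps this correct modulo 2^32. */
		entries = capacity + tail - *old_head;
		if (n > entries)
			n = entries;
		if (n == 0)
			return 0;
		*new_head = *old_head + n;
		/* Weak CAS, relaxed: the outer loop absorbs spurious failures. */
		success = atomic_compare_exchange_weak_explicit(&d->head,
				old_head, *new_head,
				memory_order_relaxed, memory_order_relaxed);
	} while (!success);
	return n;
}

/* Wait for earlier reservations to finish, then publish the new tail. */
static inline void
sketch_update_tail(struct sketch_headtail *ht, uint32_t old_val, uint32_t new_val)
{
	while (atomic_load_explicit(&ht->tail, memory_order_relaxed) != old_val)
		; /* spin until it is our turn */
	atomic_store_explicit(&ht->tail, new_val, memory_order_release);
}
```

A producer would call sketch_move_head(&prod, &cons, capacity, ...),
fill the reserved slots, then call sketch_update_tail(&prod, ...); a
consumer does the same with the roles swapped and capacity 0.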
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/ring/rte_ring_generic_pvt.h | 64 +++++++++++++++++++++++----------
1 file changed, 45 insertions(+), 19 deletions(-)
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index affd2d5ba7..9497f6737b 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,21 +23,25 @@
*/
static __rte_always_inline void
__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
- uint32_t new_val, uint32_t single, uint32_t enqueue)
+ uint32_t new_val, uint32_t single,
+ uint32_t enqueue __rte_unused)
{
- if (enqueue)
- rte_smp_wmb();
- else
- rte_smp_rmb();
/*
* If there are other enqueues/dequeues in progress that preceded us,
* we need to wait for them to complete
*/
if (!single)
- rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
- rte_memory_order_relaxed);
+ rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail,
+ old_val, rte_memory_order_relaxed);
- ht->tail = new_val;
+ /*
+ * Release ordering on the tail store ensures that the slot reads
+ * (dequeue) or writes (enqueue) performed by this thread are visible
+ * to the other side before the new tail value is observed.
+ * Pairs with the acquire load of the counterpart's tail in
+ * __rte_ring_headtail_move_head().
+ */
+ rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
}
/**
@@ -76,25 +80,35 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
{
unsigned int max = n;
int success;
+ uint32_t tail;
do {
/* Reset n to the initial burst count */
n = max;
- *old_head = d->head;
+ /*
+ * Acquire load: orders this load before the load of s->tail
+ * below (replaces rte_smp_rmb() in the previous version) and
+ * re-establishes ordering after a failed CAS on retry.
+ */
+ *old_head = rte_atomic_load_explicit(&d->head,
+ rte_memory_order_acquire);
- /* add rmb barrier to avoid load/load reorder in weak
- * memory model. It is noop on x86
+ /*
+ * Acquire load on the counterpart's tail pairs with the
+ * release store in __rte_ring_update_tail() on the other
+ * side, ensuring slot operations performed there are visible
+ * before the caller accesses the reserved slots.
*/
- rte_smp_rmb();
+ tail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
/*
* The subtraction is done between two unsigned 32bits value
* (the result is always modulo 32 bits even if we have
- * *old_head > s->tail). So 'entries' is always between 0
+ * *old_head > tail). So 'entries' is always between 0
* and capacity (which is < size).
*/
- *entries = (capacity + s->tail - *old_head);
+ *entries = (capacity + tail - *old_head);
/* check that we have enough room in ring */
if (unlikely(n > *entries))
@@ -106,12 +120,24 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
*new_head = *old_head + n;
if (is_st) {
- d->head = *new_head;
+ rte_atomic_store_explicit(&d->head, *new_head, rte_memory_order_relaxed);
success = 1;
- } else
- success = rte_atomic32_cmpset(
- (uint32_t *)(uintptr_t)&d->head,
- *old_head, *new_head);
+ } else {
+ /*
+ * Weak CAS: the outer do-while handles spurious
+ * failures, so we avoid the strong variant's
+ * internal retry (which on arm64 wraps the LL/SC
+ * pair in a hidden inner loop).
+ *
+ * Relaxed on both success and failure: this CAS
+ * does not publish data. Slot data visibility is
+ * provided by the acquire loads above and the
+ * release store of tail in __rte_ring_update_tail().
+ */
+ success = rte_atomic_compare_exchange_weak_explicit(
+ &d->head, old_head, *new_head,
+ rte_memory_order_relaxed, rte_memory_order_relaxed);
+ }
} while (unlikely(success == 0));
return n;
}
--
2.53.0
* [RFC 4/7] net/zxdh: work around GCC bitfield uninit false positive
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (2 preceding siblings ...)
2026-05-21 4:17 ` [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 4:17 ` [RFC 5/7] net/bonding: use stdatomic Stephen Hemminger
` (6 subsequent siblings)
10 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Junlong Wang, Ming Ran
GCC's -Wmaybe-uninitialized analysis cannot follow struct
initialization through bitfield reads. The warning is currently
masked by inline assembly elsewhere limiting analysis depth; it
surfaces once the EAL atomic and spinlock primitives switch to
compiler intrinsics.
Replace the struct initializer with an explicit memset() so the
full-width initialization is visible to the analyzer.
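
For readers unfamiliar with the pattern, a minimal sketch of the
workaround follows. The struct layout and the sketch_* names are made
up for illustration and do not match struct zxdh_bar_msg_header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header with bitfields; not the real zxdh layout. */
struct sketch_msg_header {
	unsigned int valid : 1;
	unsigned int sync  : 1;
	unsigned int rsv   : 6;
	uint16_t len;
};

static inline struct sketch_msg_header
sketch_header_init(uint16_t len)
{
	struct sketch_msg_header hdr;

	/*
	 * memset() writes every byte of the object, so the analyzer sees
	 * one full-width store instead of per-bitfield initialization.
	 */
	memset(&hdr, 0, sizeof(hdr));
	assert(hdr.rsv == 0);

	hdr.valid = 1;
	hdr.len = len;
	return hdr;
}
```

The runtime result is the same as with a `= {0}` initializer for the
named members; only the form the compiler analyzes differs.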
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85301
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110743
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/zxdh/zxdh_msg.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/zxdh/zxdh_msg.c b/drivers/net/zxdh/zxdh_msg.c
index 4b01daf37a..8f88181a3f 100644
--- a/drivers/net/zxdh/zxdh_msg.c
+++ b/drivers/net/zxdh/zxdh_msg.c
@@ -728,13 +728,15 @@ zxdh_bar_chan_sync_msg_reps_get(uint64_t subchan_addr,
int
zxdh_bar_chan_sync_msg_send(struct zxdh_pci_bar_msg *in, struct zxdh_msg_recviver_mem *result)
{
- struct zxdh_bar_msg_header msg_header = {0};
+ struct zxdh_bar_msg_header msg_header;
uint16_t seq_id = 0;
uint64_t subchan_addr = 0;
uint32_t time_out_cnt = 0;
uint16_t valid = 0;
int ret = 0;
+ memset(&msg_header, 0, sizeof(msg_header));
+
ret = zxdh_bar_chan_send_para_check(in, result);
if (ret != ZXDH_BAR_MSG_OK)
goto exit;
--
2.53.0
* [RFC 5/7] net/bonding: use stdatomic
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (3 preceding siblings ...)
2026-05-21 4:17 ` [RFC 4/7] net/zxdh: work around GCC bitfield uninit false positive Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 4:17 ` [RFC 6/7] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
` (5 subsequent siblings)
10 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Chas Williams, Min Hu (Connor)
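The old rte_atomic16 functions are being deprecated.
Replace them with rte_stdatomic operations for managing the warning
flags. Using fetch_or and exchange also avoids the CAS loops here.
The local copy in show_warnings() is widened from uint8_t to uint16_t
to match the field.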
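
A minimal sketch of the same idea with plain C11 atomics, assuming a
reduced hypothetical struct sketch_port in place of the bonding
driver's struct port:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical reduced port structure holding only the flag word. */
struct sketch_port {
	_Atomic uint16_t warnings_to_show;
};

/* OR new flags in with a single atomic read-modify-write, no CAS loop. */
static inline void
sketch_set_warning_flags(struct sketch_port *port, uint16_t flags)
{
	assert(flags != 0); /* sketch-only precondition */
	atomic_fetch_or_explicit(&port->warnings_to_show, flags,
			memory_order_relaxed);
}

/* Atomically fetch the pending flags and clear them in one step. */
static inline uint16_t
sketch_take_warnings(struct sketch_port *port)
{
	return atomic_exchange_explicit(&port->warnings_to_show, 0,
			memory_order_relaxed);
}
```

Relaxed ordering is enough in this sketch because the flag word carries
no other data that readers depend on.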
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/bonding/eth_bond_8023ad_private.h | 4 ++--
drivers/net/bonding/rte_eth_bond_8023ad.c | 18 ++++--------------
2 files changed, 6 insertions(+), 16 deletions(-)
diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
index ab7d15f81a..1756c9307d 100644
--- a/drivers/net/bonding/eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/eth_bond_8023ad_private.h
@@ -9,7 +9,7 @@
#include <rte_ether.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_flow.h>
#include "rte_eth_bond_8023ad.h"
@@ -143,7 +143,7 @@ struct port {
volatile uint64_t rx_marker_timer;
uint64_t warning_timer;
- volatile uint16_t warnings_to_show;
+ RTE_ATOMIC(uint16_t) warnings_to_show;
/** Memory pool used to allocate slow queues */
struct rte_mempool *slow_pool;
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index ba88f6d261..641aae1a67 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -171,27 +171,17 @@ timer_is_running(uint64_t *timer)
static void
set_warning_flags(struct port *port, uint16_t flags)
{
- int retval;
- uint16_t old;
- uint16_t new_flag = 0;
-
- do {
- old = port->warnings_to_show;
- new_flag = old | flags;
- retval = rte_atomic16_cmpset(&port->warnings_to_show, old, new_flag);
- } while (unlikely(retval == 0));
+ rte_atomic_fetch_or_explicit(&port->warnings_to_show, flags, rte_memory_order_relaxed);
}
static void
show_warnings(uint16_t member_id)
{
struct port *port = &bond_mode_8023ad_ports[member_id];
- uint8_t warnings;
-
- do {
- warnings = port->warnings_to_show;
- } while (rte_atomic16_cmpset(&port->warnings_to_show, warnings, 0) == 0);
+ uint16_t warnings;
+ warnings = rte_atomic_exchange_explicit(&port->warnings_to_show, 0,
+ rte_memory_order_relaxed);
if (!warnings)
return;
--
2.53.0
* [RFC 6/7] net/nbl: remove unused rte_atomic16 field
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (4 preceding siblings ...)
2026-05-21 4:17 ` [RFC 5/7] net/bonding: use stdatomic Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 4:17 ` [RFC 7/7] config: use RTE_FORCE_INTRINSICS on all platforms Stephen Hemminger
` (4 subsequent siblings)
10 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Dimon Zhao, Leon Yu, Sam Chen
The tx_current_queue was defined as rte_atomic16_t which
is deprecated. Remove it since it was never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/nbl/nbl_hw/nbl_resource.h b/drivers/net/nbl/nbl_hw/nbl_resource.h
index bf5a9461f5..f2182ba6bc 100644
--- a/drivers/net/nbl/nbl_hw/nbl_resource.h
+++ b/drivers/net/nbl/nbl_hw/nbl_resource.h
@@ -225,7 +225,6 @@ struct nbl_res_info {
u16 base_qid;
u16 lcore_max;
u16 *pf_qid_to_lcore_id;
- rte_atomic16_t tx_current_queue;
};
struct nbl_resource_mgt {
--
2.53.0
* [RFC 7/7] config: use RTE_FORCE_INTRINSICS on all platforms
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (5 preceding siblings ...)
2026-05-21 4:17 ` [RFC 6/7] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
@ 2026-05-21 4:17 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (3 subsequent siblings)
10 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 4:17 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bruce Richardson, Bibo Mao,
Sun Yuechi, David Christensen, Konstantin Ananyev
Next step is to deprecate the rte_atomicNN_*() family. Rather than
maintaining both the inline asm and intrinsic fallbacks, drop the
asm paths and use intrinsics everywhere. The RTE_FORCE_INTRINSICS
config option is removed.
This also retires the asm-based byteorder bswap and spinlock
implementations, which were guarded by the same option.
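
To give a feel for what the intrinsic-only paths look like, here is a
hedged sketch using C11 atomics and the GCC/Clang __builtin_bswap32()
builtin. The sketch_* names are hypothetical and this is not the DPDK
generic implementation, which also uses wait/pause helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct {
	atomic_int locked; /* 0 = free, 1 = held */
} sketch_spinlock_t;

static inline void
sketch_spinlock_lock(sketch_spinlock_t *sl)
{
	int exp = 0;

	/* Acquire on success so the critical section cannot move up. */
	while (!atomic_compare_exchange_weak_explicit(&sl->locked, &exp, 1,
			memory_order_acquire, memory_order_relaxed))
		exp = 0; /* CAS stored the observed value in exp; reset it */
}

static inline int
sketch_spinlock_trylock(sketch_spinlock_t *sl)
{
	int exp = 0;

	return atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1,
			memory_order_acquire, memory_order_relaxed);
}

static inline void
sketch_spinlock_unlock(sketch_spinlock_t *sl)
{
	assert(atomic_load_explicit(&sl->locked, memory_order_relaxed) == 1);
	/* Release so writes in the critical section are visible first. */
	atomic_store_explicit(&sl->locked, 0, memory_order_release);
}

/* Byte swap via compiler builtin instead of inline asm. */
static inline uint32_t
sketch_bswap32(uint32_t x)
{
	return __builtin_bswap32(x);
}
```

The compiler picks the instruction sequence per target here, which is
the property that lets the per-architecture asm variants go away.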
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
config/arm/meson.build | 1 -
config/loongarch/meson.build | 1 -
config/meson.build | 3 -
config/riscv/meson.build | 1 -
doc/guides/rel_notes/release_26_07.rst | 5 +
lib/eal/arm/include/rte_atomic_32.h | 3 -
lib/eal/arm/include/rte_atomic_64.h | 3 -
lib/eal/arm/include/rte_byteorder.h | 3 -
lib/eal/arm/include/rte_spinlock.h | 3 -
lib/eal/include/generic/rte_atomic.h | 58 -------
lib/eal/include/generic/rte_byteorder.h | 2 -
lib/eal/include/generic/rte_spinlock.h | 10 --
lib/eal/loongarch/include/rte_atomic.h | 3 -
lib/eal/loongarch/include/rte_spinlock.h | 3 -
lib/eal/ppc/include/rte_atomic.h | 173 ---------------------
lib/eal/ppc/include/rte_byteorder.h | 13 --
lib/eal/ppc/include/rte_spinlock.h | 26 ----
lib/eal/riscv/include/rte_atomic.h | 3 -
lib/eal/riscv/include/rte_spinlock.h | 3 -
lib/eal/x86/include/rte_atomic.h | 172 ---------------------
lib/eal/x86/include/rte_atomic_32.h | 188 -----------------------
lib/eal/x86/include/rte_atomic_64.h | 157 -------------------
lib/eal/x86/include/rte_byteorder.h | 49 ------
lib/eal/x86/include/rte_spinlock.h | 49 ------
24 files changed, 5 insertions(+), 927 deletions(-)
diff --git a/config/arm/meson.build b/config/arm/meson.build
index 5a9c16b9b1..08fff73599 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -837,7 +837,6 @@ socs = {
}
dpdk_conf.set('RTE_ARCH_ARM', 1)
-dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
update_flags = false
soc_flags = []
diff --git a/config/loongarch/meson.build b/config/loongarch/meson.build
index 99dabef203..1623cdb571 100644
--- a/config/loongarch/meson.build
+++ b/config/loongarch/meson.build
@@ -6,7 +6,6 @@ if not dpdk_conf.get('RTE_ARCH_64')
endif
dpdk_conf.set('RTE_ARCH', 'loongarch')
dpdk_conf.set('RTE_ARCH_LOONGARCH', 1)
-dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
machine_args_generic = [
['default', ['-march=loongarch64']],
diff --git a/config/meson.build b/config/meson.build
index 9ba7b9a338..934abf04f2 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -29,9 +29,6 @@ is_ms_compiler = is_windows and (cc.get_id() == 'msvc')
is_ms_linker = is_windows and (cc.get_id() == 'clang' or is_ms_compiler)
if is_ms_compiler
- # force the use of intrinsics the MSVC compiler (except x86)
- # does not support inline assembly
- dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
# force the use of C++11 memory model in lib/ring
dpdk_conf.set('RTE_USE_C11_MEM_MODEL', true)
diff --git a/config/riscv/meson.build b/config/riscv/meson.build
index a06429a1e2..5dba613973 100644
--- a/config/riscv/meson.build
+++ b/config/riscv/meson.build
@@ -16,7 +16,6 @@ endif
dpdk_conf.set('RTE_ARCH', 'riscv')
dpdk_conf.set('RTE_ARCH_RISCV', 1)
-dpdk_conf.set('RTE_FORCE_INTRINSICS', 1)
# common flags to all riscv builds, with lowest priority
flags_common = [
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index f012d47a4b..9378cf1d36 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -92,6 +92,11 @@ API Changes
Also, make sure to start the actual text at the margin.
=======================================================
+* **Changed to use stdatomic intrinsics on all platforms.**
+
+ The config option ``RTE_FORCE_INTRINSICS`` has been removed.
+ Architecture specific code has been replaced with stdatomic.
+
ABI Changes
-----------
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 3809ddefb7..a5ee63a2c7 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -5,9 +5,6 @@
#ifndef _RTE_ATOMIC_ARM32_H_
#define _RTE_ATOMIC_ARM32_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include "generic/rte_atomic.h"
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index c9b41f6212..01412020e7 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -6,9 +6,6 @@
#ifndef _RTE_ATOMIC_ARM64_H_
#define _RTE_ATOMIC_ARM64_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include "generic/rte_atomic.h"
#include <rte_branch_prediction.h>
diff --git a/lib/eal/arm/include/rte_byteorder.h b/lib/eal/arm/include/rte_byteorder.h
index a0aaff4a28..ffaaf726a4 100644
--- a/lib/eal/arm/include/rte_byteorder.h
+++ b/lib/eal/arm/include/rte_byteorder.h
@@ -5,9 +5,6 @@
#ifndef _RTE_BYTEORDER_ARM_H_
#define _RTE_BYTEORDER_ARM_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include <stdint.h>
#include <rte_common.h>
diff --git a/lib/eal/arm/include/rte_spinlock.h b/lib/eal/arm/include/rte_spinlock.h
index a5d01b0d21..47820e5e1a 100644
--- a/lib/eal/arm/include/rte_spinlock.h
+++ b/lib/eal/arm/include/rte_spinlock.h
@@ -5,9 +5,6 @@
#ifndef _RTE_SPINLOCK_ARM_H_
#define _RTE_SPINLOCK_ARM_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include <rte_common.h>
#include "generic/rte_spinlock.h"
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 4e9d230f85..06b6acf9eb 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -171,13 +171,11 @@ rte_smp_rmb(void)
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -197,13 +195,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
-#ifdef RTE_FORCE_INTRINSICS
static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -296,13 +292,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
static inline void
rte_atomic16_inc(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -313,13 +307,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
static inline void
rte_atomic16_dec(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
}
-#endif
/**
* Atomically add a 16-bit value to a counter and return the result.
@@ -375,13 +367,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
*/
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 16-bit counter by one and test.
@@ -396,13 +386,11 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 16-bit atomic counter.
@@ -417,12 +405,10 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 16-bit counter to 0.
@@ -456,13 +442,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -482,13 +466,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
-#ifdef RTE_FORCE_INTRINSICS
static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -581,13 +563,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
static inline void
rte_atomic32_inc(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -598,13 +578,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
static inline void
rte_atomic32_dec(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
}
-#endif
/**
* Atomically add a 32-bit value to a counter and return the result.
@@ -660,13 +638,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
*/
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 32-bit counter by one and test.
@@ -681,13 +657,11 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 32-bit atomic counter.
@@ -702,12 +676,10 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 32-bit counter to 0.
@@ -740,13 +712,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -766,13 +736,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
-#ifdef RTE_FORCE_INTRINSICS
static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -795,7 +763,6 @@ typedef struct {
static inline void
rte_atomic64_init(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -812,7 +779,6 @@ rte_atomic64_init(rte_atomic64_t *v)
}
#endif
}
-#endif
/**
* Atomically read a 64-bit counter.
@@ -825,7 +791,6 @@ rte_atomic64_init(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -844,7 +809,6 @@ rte_atomic64_read(rte_atomic64_t *v)
return tmp;
#endif
}
-#endif
/**
* Atomically set a 64-bit counter.
@@ -857,7 +821,6 @@ rte_atomic64_read(rte_atomic64_t *v)
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -874,7 +837,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
}
#endif
}
-#endif
/**
* Atomically add a 64-bit value to a counter.
@@ -887,14 +849,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically subtract a 64-bit value from a counter.
@@ -907,14 +867,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -925,13 +883,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
static inline void
rte_atomic64_inc(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -942,13 +898,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
static inline void
rte_atomic64_dec(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
}
-#endif
/**
* Add a 64-bit value to an atomic counter and return the result.
@@ -966,14 +920,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst) + inc;
}
-#endif
/**
* Subtract a 64-bit value from an atomic counter and return the result.
@@ -991,14 +943,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst) - dec;
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -1013,12 +963,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
*/
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -1033,12 +981,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
-#endif
/**
* Atomically test and set a 64-bit atomic counter.
@@ -1053,12 +999,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 64-bit counter to 0.
@@ -1068,12 +1012,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
*/
static inline void rte_atomic64_clear(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
-#endif
#endif
diff --git a/lib/eal/include/generic/rte_byteorder.h b/lib/eal/include/generic/rte_byteorder.h
index 7973d6326f..e8b5f0ab86 100644
--- a/lib/eal/include/generic/rte_byteorder.h
+++ b/lib/eal/include/generic/rte_byteorder.h
@@ -239,7 +239,6 @@ static uint64_t rte_be_to_cpu_64(rte_be64_t x);
#endif /* __DOXYGEN__ */
-#ifdef RTE_FORCE_INTRINSICS
#ifndef RTE_TOOLCHAIN_MSVC
#define rte_bswap16(x) __builtin_bswap16(x)
@@ -253,7 +252,6 @@ static uint64_t rte_be_to_cpu_64(rte_be64_t x);
#define rte_bswap64(x) _byteswap_uint64(x)
#endif
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/include/generic/rte_spinlock.h b/lib/eal/include/generic/rte_spinlock.h
index dd3d2d046c..13916f88b3 100644
--- a/lib/eal/include/generic/rte_spinlock.h
+++ b/lib/eal/include/generic/rte_spinlock.h
@@ -13,8 +13,6 @@
* This kind of lock simply waits in a loop
* repeatedly checking until the lock becomes available.
*
- * Some functions may have an architecture-specific implementation
- * if RTE_FORCE_INTRINSICS is disabled.
* The hardware transactional memory (lock elision) functions have _tm suffix
* and are implemented in architecture-specific files.
*
@@ -22,9 +20,7 @@
*/
#include <rte_lcore.h>
-#ifdef RTE_FORCE_INTRINSICS
#include <rte_common.h>
-#endif
#include <rte_debug.h>
#include <rte_lock_annotations.h>
#include <rte_pause.h>
@@ -68,7 +64,6 @@ static inline void
rte_spinlock_lock(rte_spinlock_t *sl)
__rte_acquire_capability(sl);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_spinlock_lock(rte_spinlock_t *sl)
__rte_no_thread_safety_analysis
@@ -82,7 +77,6 @@ rte_spinlock_lock(rte_spinlock_t *sl)
exp = 0;
}
}
-#endif
/**
* Release the spinlock.
@@ -94,14 +88,12 @@ static inline void
rte_spinlock_unlock(rte_spinlock_t *sl)
__rte_release_capability(sl);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_spinlock_unlock(rte_spinlock_t *sl)
__rte_no_thread_safety_analysis
{
rte_atomic_store_explicit(&sl->locked, 0, rte_memory_order_release);
}
-#endif
/**
* Try to take the lock.
@@ -116,7 +108,6 @@ static inline int
rte_spinlock_trylock(rte_spinlock_t *sl)
__rte_try_acquire_capability(true, sl);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_spinlock_trylock(rte_spinlock_t *sl)
__rte_no_thread_safety_analysis
@@ -125,7 +116,6 @@ rte_spinlock_trylock(rte_spinlock_t *sl)
return rte_atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1,
rte_memory_order_acquire, rte_memory_order_relaxed);
}
-#endif
/**
* Test if the lock is taken.
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index 49e0c67020..ed42e36843 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -5,9 +5,6 @@
#ifndef RTE_ATOMIC_LOONGARCH_H
#define RTE_ATOMIC_LOONGARCH_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include <rte_common.h>
#include "generic/rte_atomic.h"
diff --git a/lib/eal/loongarch/include/rte_spinlock.h b/lib/eal/loongarch/include/rte_spinlock.h
index 38f00f631d..bc9569b8e3 100644
--- a/lib/eal/loongarch/include/rte_spinlock.h
+++ b/lib/eal/loongarch/include/rte_spinlock.h
@@ -12,9 +12,6 @@
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
static inline int rte_tm_supported(void)
{
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 1da5afccbf..0e64db2a35 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -37,179 +37,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
}
/*------------------------- 16 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
-}
-
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/ppc/include/rte_byteorder.h b/lib/eal/ppc/include/rte_byteorder.h
index 6c11fce9dc..e1e74f83e8 100644
--- a/lib/eal/ppc/include/rte_byteorder.h
+++ b/lib/eal/ppc/include/rte_byteorder.h
@@ -49,19 +49,6 @@ static inline uint64_t rte_arch_bswap64(uint64_t _x)
((_x << 40) & (0xffULL << 48)) | ((_x << 56));
}
-#ifndef RTE_FORCE_INTRINSICS
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap16(x) : \
- rte_arch_bswap16(x)))
-
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap32(x) : \
- rte_arch_bswap32(x)))
-
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap64(x) : \
- rte_arch_bswap64(x)))
-#endif
/* Power 8 have both little endian and big endian mode
* Power 7 only support big endian
diff --git a/lib/eal/ppc/include/rte_spinlock.h b/lib/eal/ppc/include/rte_spinlock.h
index 6d242db35d..76afa52413 100644
--- a/lib/eal/ppc/include/rte_spinlock.h
+++ b/lib/eal/ppc/include/rte_spinlock.h
@@ -15,32 +15,6 @@ extern "C" {
/* Fixme: Use intrinsics to implement the spinlock on Power architecture */
-#ifndef RTE_FORCE_INTRINSICS
-
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- while (__sync_lock_test_and_set(&sl->locked, 1))
- while (sl->locked)
- rte_pause();
-}
-
-static inline void
-rte_spinlock_unlock(rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- __sync_lock_release(&sl->locked);
-}
-
-static inline int
-rte_spinlock_trylock(rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- return __sync_lock_test_and_set(&sl->locked, 1) == 0;
-}
-
-#endif
static inline int rte_tm_supported(void)
{
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index dd10ad5127..bc7d446df5 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -8,9 +8,6 @@
#ifndef RTE_ATOMIC_RISCV_H
#define RTE_ATOMIC_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include <stdint.h>
#include <rte_common.h>
diff --git a/lib/eal/riscv/include/rte_spinlock.h b/lib/eal/riscv/include/rte_spinlock.h
index 5fe4980e44..5df97ac5ca 100644
--- a/lib/eal/riscv/include/rte_spinlock.h
+++ b/lib/eal/riscv/include/rte_spinlock.h
@@ -8,9 +8,6 @@
#ifndef RTE_SPINLOCK_RISCV_H
#define RTE_SPINLOCK_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
#include <rte_common.h>
#include "generic/rte_spinlock.h"
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index a850b0257c..f4d39ce4fe 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -102,178 +102,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgw %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgw %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "incw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "decw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgl %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgl %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "incl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "decl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
index 0f25863aa5..37d139f30d 100644
--- a/lib/eal/x86/include/rte_atomic_32.h
+++ b/lib/eal/x86/include/rte_atomic_32.h
@@ -20,193 +20,5 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
- union {
- struct {
- uint32_t l32;
- uint32_t h32;
- };
- uint64_t u64;
- } _exp, _src;
-
- _exp.u64 = exp;
- _src.u64 = src;
-
-#ifndef __PIC__
- asm volatile (
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "b" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#else
- asm volatile (
- "xchgl %%ebx, %%edi;\n"
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- "xchgl %%ebx, %%edi;\n"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "D" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#endif
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
-{
- uint64_t old;
-
- do {
- old = *dest;
- } while (rte_atomic64_cmpset(dest, old, val) == 0);
-
- return old;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, 0);
- }
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- /* replace the value by itself */
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp);
- }
- return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, new_value);
- }
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-
- return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-
- return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- rte_atomic64_set(v, 0);
-}
-#endif
#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
index 0a7a2131e0..1cd12695a2 100644
--- a/lib/eal/x86/include/rte_atomic_64.h
+++ b/lib/eal/x86/include/rte_atomic_64.h
@@ -22,163 +22,6 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
-
-
- asm volatile(
- MPLOCKED
- "cmpxchgq %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgq %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- asm volatile(
- MPLOCKED
- "addq %[inc], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [inc] "ir" (inc), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- asm volatile(
- MPLOCKED
- "subq %[dec], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [dec] "ir" (dec), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "incq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "decq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int64_t prev = inc;
-
- asm volatile(
- MPLOCKED
- "xaddq %[prev], %[cnt]"
- : [prev] "+r" (prev), /* output */
- [cnt] "=m" (v->cnt)
- : "m" (v->cnt) /* input */
- );
- return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
-
- return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "decq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-#endif
/*------------------------ 128 bit atomic operations -------------------------*/
diff --git a/lib/eal/x86/include/rte_byteorder.h b/lib/eal/x86/include/rte_byteorder.h
index bcf4a02225..5a9e5f0762 100644
--- a/lib/eal/x86/include/rte_byteorder.h
+++ b/lib/eal/x86/include/rte_byteorder.h
@@ -18,55 +18,6 @@ extern "C" {
#define RTE_BYTE_ORDER RTE_LITTLE_ENDIAN
#endif
-#ifndef RTE_FORCE_INTRINSICS
-/*
- * An architecture-optimized byte swap for a 16-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap16().
- */
-static inline uint16_t rte_arch_bswap16(uint16_t _x)
-{
- uint16_t x = _x;
- asm volatile ("xchgb %b[x1],%h[x2]"
- : [x1] "=Q" (x)
- : [x2] "0" (x)
- );
- return x;
-}
-
-/*
- * An architecture-optimized byte swap for a 32-bit value.
- *
- * Do not use this function directly. The preferred function is rte_bswap32().
- */
-static inline uint32_t rte_arch_bswap32(uint32_t _x)
-{
- uint32_t x = _x;
- asm volatile ("bswap %[x]"
- : [x] "+r" (x)
- );
- return x;
-}
-
-#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap16(x) : \
- rte_arch_bswap16(x)))
-
-#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap32(x) : \
- rte_arch_bswap32(x)))
-
-#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ? \
- rte_constant_bswap64(x) : \
- rte_arch_bswap64(x)))
-
-#ifdef RTE_ARCH_I686
-#include "rte_byteorder_32.h"
-#else
-#include "rte_byteorder_64.h"
-#endif
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_spinlock.h b/lib/eal/x86/include/rte_spinlock.h
index 273bbdc39c..104da6bd78 100644
--- a/lib/eal/x86/include/rte_spinlock.h
+++ b/lib/eal/x86/include/rte_spinlock.h
@@ -19,55 +19,6 @@ extern "C" {
#define RTE_RTM_MAX_RETRIES (20)
#define RTE_XABORT_LOCK_BUSY (0xff)
-#ifndef RTE_FORCE_INTRINSICS
-static inline void
-rte_spinlock_lock(rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- int lock_val = 1;
- asm volatile (
- "1:\n"
- "xchg %[locked], %[lv]\n"
- "test %[lv], %[lv]\n"
- "jz 3f\n"
- "2:\n"
- "pause\n"
- "cmpl $0, %[locked]\n"
- "jnz 2b\n"
- "jmp 1b\n"
- "3:\n"
- : [locked] "=m" (sl->locked), [lv] "=q" (lock_val)
- : "[lv]" (lock_val)
- : "memory");
-}
-
-static inline void
-rte_spinlock_unlock (rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- int unlock_val = 0;
- asm volatile (
- "xchg %[locked], %[ulv]\n"
- : [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val)
- : "[ulv]" (unlock_val)
- : "memory");
-}
-
-static inline int
-rte_spinlock_trylock (rte_spinlock_t *sl)
- __rte_no_thread_safety_analysis
-{
- int lockval = 1;
-
- asm volatile (
- "xchg %[locked], %[lockval]"
- : [locked] "=m" (sl->locked), [lockval] "=q" (lockval)
- : "[lockval]" (lockval)
- : "memory");
-
- return lockval == 0;
-}
-#endif
extern uint8_t rte_rtm_supported;
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* Re: [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-21 4:17 ` [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-21 15:43 ` Wathsala Vithanage
0 siblings, 0 replies; 105+ messages in thread
From: Wathsala Vithanage @ 2026-05-21 15:43 UTC (permalink / raw)
To: Stephen Hemminger, dev
Cc: Bibo Mao, David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
Hi Stephen,
Suggesting minor changes to the comments on the acquire and
release fences.
> +/** @name SMP Memory Barrier
> + */
> +///@{
> +/**
> + * General memory barrier between lcores
> + *
> + * Guarantees that the LOAD and STORE operations that precede the
> + * rte_smp_mb() call are globally visible across the lcores
> + * before the LOAD and STORE operations that follows it.
> + */
> +static __rte_always_inline void
> +rte_smp_mb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_seq_cst);
> +}
> +
> +/**
> + * Write memory barrier between lcores
> + *
> + * Guarantees that the STORE operations that precede the
> + * rte_smp_wmb() call are globally visible across the lcores
> + * before the STORE operations that follows it.
> + */
> +static __rte_always_inline void
> +rte_smp_wmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_release);
> +}
Release fences order STORE | STORE and LOAD | STORE.
Therefore, the comment should be "Guarantees that the LOAD
and STORE operations that precede the rte_smp_wmb() call
are globally observed before the STORE operations that follow it."
> +
> +/**
> + * Read memory barrier between lcores
> + *
> + * Guarantees that the LOAD operations that precede the
> + * rte_smp_rmb() call are globally visible across the lcores
> + * before the LOAD operations that follows it.
> + */
> +static __rte_always_inline void
> +rte_smp_rmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_acquire);
> +}
Acquire fences order LOAD | LOAD and LOAD | STORE.
Thus, the comment should be "Guarantees that the LOAD
operations that precede the rte_smp_rmb() call observe
global state before the LOAD and STORE operations that
follow it."
--wathsala
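The ordering described above can be sketched with plain C11 fences. This is a minimal illustration, not code from the patch: `payload`, `ready`, `publish()` and `try_consume()` are hypothetical names, and `atomic_thread_fence()` from `<stdatomic.h>` stands in for rte_atomic_thread_fence(). In real use the two functions would run on different lcores.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state, not taken from the patch. */
static uint32_t payload;     /* plain data guarded by the flag */
static atomic_uint ready;    /* 0 = empty, 1 = payload published */

/*
 * Writer side: the release fence keeps the payload store (and any
 * earlier loads) from being reordered after the flag store below.
 */
static void publish(uint32_t value)
{
	payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/*
 * Reader side: the acquire fence keeps the flag load from being
 * reordered after the payload load (or any later store) below.
 */
static int try_consume(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return 1;
}
```

The release fence on the writer side and the acquire fence on the reader side form the pair; neither fence alone would be enough to make the payload visible to a reader that sees the flag set.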
^ permalink raw reply [flat|nested] 105+ messages in thread
* Re: [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail
2026-05-21 4:17 ` [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
@ 2026-05-21 15:57 ` Wathsala Vithanage
0 siblings, 0 replies; 105+ messages in thread
From: Wathsala Vithanage @ 2026-05-21 15:57 UTC (permalink / raw)
To: Stephen Hemminger, dev; +Cc: Konstantin Ananyev
Already looks good. I have one minor suggestion.
In rte_ring_c11_pvt.h (and in the MCS lock code as well), we introduced
a comment style that annotates load-acquire and store-release
operations as An and Rm, respectively. Each An comment refers to the
corresponding Rm it synchronizes with, and vice versa, while also
describing the intent of the pairing.
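As a rough illustration of that annotation style, assuming labels of the form A0 and R0 (the exact labels and wording used in rte_ring_c11_pvt.h may differ), a paired store-release and load-acquire could be commented as follows. The struct and function names are hypothetical and use C11 `<stdatomic.h>` rather than the DPDK wrappers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one side of a ring head/tail pair. */
struct demo_headtail {
	_Atomic uint32_t tail;
};

/* Called after this thread has finished its slot accesses. */
static void demo_update_tail(struct demo_headtail *ht, uint32_t new_val)
{
	/*
	 * R0: Release the slot accesses made before this store.
	 * Synchronizes with A0 in demo_read_tail(), so a thread that
	 * observes new_val also observes those slot accesses.
	 */
	atomic_store_explicit(&ht->tail, new_val, memory_order_release);
}

/* Called by the other side before it touches the slots. */
static uint32_t demo_read_tail(struct demo_headtail *ht)
{
	/*
	 * A0: Acquire the counterpart's tail.
	 * Synchronizes with R0 in demo_update_tail(), so the slot
	 * accesses that follow cannot be ordered before this load.
	 */
	return atomic_load_explicit(&ht->tail, memory_order_acquire);
}
```

Naming both ends of each pairing makes it possible to audit the ordering by searching for the label instead of re-deriving it from the code.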
--wathsala
On 5/20/26 23:17, Stephen Hemminger wrote:
> Last caller of rte_atomic32_cmpset() in lib/, blocking deprecation
> of the rte_atomicNN_*() family.
>
> Replace cmpset with rte_atomic_compare_exchange_weak_explicit(),
> and convert head/tail loads/stores from implicit seq_cst to explicit
> acquire/release. Matches the HTS/RTS pattern.
>
> Acquire-load of d->head orders the subsequent load of s->tail (was
> rte_smp_rmb()). Acquire-load of s->tail pairs with the release-store
> of the counterpart tail in __rte_ring_update_tail(), which subsumes
> the previous wmb/rmb barriers.
>
> Weak CAS avoids arm64's hidden inner retry; the outer do-while already
> loops. CAS orderings relaxed: no data published by the reservation.
>
> The now-unused 'enqueue' parameter of __rte_ring_update_tail() is
> removed; both call sites updated.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/ring/rte_ring_generic_pvt.h | 64 +++++++++++++++++++++++----------
> 1 file changed, 45 insertions(+), 19 deletions(-)
>
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index affd2d5ba7..9497f6737b 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
> @@ -23,21 +23,25 @@
> */
> static __rte_always_inline void
> __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> - uint32_t new_val, uint32_t single, uint32_t enqueue)
> + uint32_t new_val, uint32_t single,
> + uint32_t enqueue __rte_unused)
> {
> - if (enqueue)
> - rte_smp_wmb();
> - else
> - rte_smp_rmb();
> /*
> * If there are other enqueues/dequeues in progress that preceded us,
> * we need to wait for them to complete
> */
> if (!single)
> - rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
> - rte_memory_order_relaxed);
> + rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail,
> + old_val, rte_memory_order_relaxed);
>
> - ht->tail = new_val;
> + /*
> + * Release ordering on the tail store ensures that the slot reads
> + * (dequeue) or writes (enqueue) performed by this thread are visible
> + * to the other side before the new tail value is observed.
> + * Pairs with the acquire load of the counterpart's tail in
> + * __rte_ring_headtail_move_head().
> + */
> + rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
> }
>
> /**
> @@ -76,25 +80,35 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> {
> unsigned int max = n;
> int success;
> + uint32_t tail;
>
> do {
> /* Reset n to the initial burst count */
> n = max;
>
> - *old_head = d->head;
> + /*
> + * Acquire load: orders this load before the load of s->tail
> + * below (replaces rte_smp_rmb() in the previous version) and
> + * re-establishes ordering after a failed CAS on retry.
> + */
> + *old_head = rte_atomic_load_explicit(&d->head,
> + rte_memory_order_acquire);
>
> - /* add rmb barrier to avoid load/load reorder in weak
> - * memory model. It is noop on x86
> + /*
> + * Acquire load on the counterpart's tail pairs with the
> + * release store in __rte_ring_update_tail() on the other
> + * side, ensuring slot operations performed there are visible
> + * before the caller accesses the reserved slots.
> */
> - rte_smp_rmb();
> + tail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
>
> /*
> * The subtraction is done between two unsigned 32bits value
> * (the result is always modulo 32 bits even if we have
> - * *old_head > s->tail). So 'entries' is always between 0
> + * *old_head > tail). So 'entries' is always between 0
> * and capacity (which is < size).
> */
> - *entries = (capacity + s->tail - *old_head);
> + *entries = (capacity + tail - *old_head);
>
> /* check that we have enough room in ring */
> if (unlikely(n > *entries))
> @@ -106,12 +120,24 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
>
> *new_head = *old_head + n;
> if (is_st) {
> - d->head = *new_head;
> + rte_atomic_store_explicit(&d->head, *new_head, rte_memory_order_relaxed);
> success = 1;
> - } else
> - success = rte_atomic32_cmpset(
> - (uint32_t *)(uintptr_t)&d->head,
> - *old_head, *new_head);
> + } else {
> + /*
> + * Weak CAS: the outer do-while handles spurious
> + * failures, so we avoid the strong variant's
> + * internal retry (which on arm64 wraps the LL/SC
> + * pair in a hidden inner loop).
> + *
> + * Relaxed on both success and failure: this CAS
> + * does not publish data. Slot data visibility is
> + * provided by the acquire loads above and the
> + * release store of tail in __rte_ring_update_tail().
> + */
> + success = rte_atomic_compare_exchange_weak_explicit(
> + &d->head, old_head, *new_head,
> + rte_memory_order_relaxed, rte_memory_order_relaxed);
> + }
> } while (unlikely(success == 0));
> return n;
> }
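The head reservation loop in the quoted patch can be sketched in standalone C11 as below. This is a simplified illustration under stated assumptions, not the rte_ring code: `struct demo_ring` and `demo_reserve()` are hypothetical, only the multi-producer path is shown, and a short request is clamped to the free space (the "variable" behaviour) rather than rejected.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified state; not the rte_ring layout. */
struct demo_ring {
	_Atomic uint32_t prod_head;  /* next slot producers will reserve */
	_Atomic uint32_t cons_tail;  /* slots below this have been consumed */
	uint32_t capacity;
};

/* Reserve up to n slots; returns the count reserved, old head via pointer. */
static uint32_t demo_reserve(struct demo_ring *r, uint32_t n, uint32_t *old_head)
{
	const uint32_t max = n;
	uint32_t new_head, tail, free_entries;

	do {
		n = max;  /* reset to the requested burst on every retry */

		/* Acquire: orders the head load before the tail load. */
		*old_head = atomic_load_explicit(&r->prod_head,
						 memory_order_acquire);
		/* Acquire: pairs with the consumer's release store of tail. */
		tail = atomic_load_explicit(&r->cons_tail,
					    memory_order_acquire);

		/* Unsigned arithmetic wraps modulo 2^32, as in the patch. */
		free_entries = r->capacity + tail - *old_head;
		if (n > free_entries)
			n = free_entries;
		if (n == 0)
			return 0;

		new_head = *old_head + n;
		/*
		 * Weak CAS, relaxed on success and failure: a spurious
		 * failure simply takes another trip around the loop.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&r->prod_head,
			old_head, new_head,
			memory_order_relaxed, memory_order_relaxed));

	return n;
}
```

Because the loop reloads the head with acquire ordering on every iteration, the value the failed CAS writes back into `*old_head` is never relied upon.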
^ permalink raw reply [flat|nested] 105+ messages in thread
* [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (6 preceding siblings ...)
2026-05-21 4:17 ` [RFC 7/7] config: use RTE_FORCE_INTRINSICS on all platforms Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 01/11] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
` (11 more replies)
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (2 subsequent siblings)
10 siblings, 12 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The goal is to land every deprecation currently listed in the release
notes by the 26.11 ABI bump. Functions to be removed in 26.11 need
to be marked __rte_deprecated by 26.07, with all in-tree users
converted off them first so CI stays clean.
This is the first step. After this series there are no remaining
in-tree users of rte_atomic64. Expected follow-ups:
- convert remaining rte_atomic32 users (dpaa/fslmc, netvsc, vmbus,
sw_evdev, txgbe, ifc, hinic, bnx2x, vhost)
- convert remaining rte_atomic16 users (dpaa/fslmc, qman)
- mark the rte_atomicNN_*() family __rte_deprecated
- remove the legacy test_atomic.c
- remove the API itself at 26.11
Patch 1 deletes the inline-asm atomic fallbacks across arm, ppc,
loongarch, riscv, and x86 now that RTE_FORCE_INTRINSICS has been
the default everywhere for years. Largest patch by line count and
the one most worth review attention.
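To illustrate what the intrinsic form of these helpers looks like, here is a minimal sketch in plain C11. The `demo_` names are hypothetical and `<stdatomic.h>` is used in place of the DPDK wrappers; the shape follows the generic implementations kept by the patch, where the fetch operation returns the old value and the helper adds the delta to get the new one.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter type, standing in for rte_atomic64_t. */
typedef struct {
	_Atomic int64_t cnt;
} demo_atomic64_t;

/* fetch_add returns the previous value, so add inc to get the result. */
static int64_t demo_add_return(demo_atomic64_t *v, int64_t inc)
{
	return atomic_fetch_add_explicit(&v->cnt, inc,
					 memory_order_seq_cst) + inc;
}

/* Returns 1 when the decrement brings the counter to zero. */
static int demo_dec_and_test(demo_atomic64_t *v)
{
	return demo_add_return(v, -1) == 0;
}
```

The compiler selects the instruction sequence for each target, which is what makes the per-architecture asm copies redundant.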
Patch 2 retires the rte_smp_*mb deprecation notice (open since 2021)
by reimplementing those APIs as inline wrappers over
rte_atomic_thread_fence; the API is preserved for readability.
Patch 3 is the load-bearing change for lib/: the last caller of
rte_atomic32_cmpset() is converted, with explicit acquire/release
orderings matching the existing HTS/RTS ring pattern.
Driver conversions (patches 4-11) match each rte_atomic64 use to its
best fit rather than blanket seq_cst: software stats become plain
assignment (DPDK convention, torn reads accepted); CAS loops setting
a flag collapse to fetch_or or exchange; open-coded link-status CAS
in net/pfe and net/sfc moves to the existing rte_eth_linkstatus
helpers; genuine synchronization stays atomic with explicit ordering.
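As an illustration of the "CAS loop collapses to fetch_or" case, the following sketch shows the two forms side by side in plain C11. The flag name, the function names and the choice of acq_rel ordering are assumptions made for the example; the ordering each driver actually needs depends on what the flag protects.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_FLAG_BUSY UINT64_C(0x1)  /* hypothetical flag bit */

/* Before: open-coded CAS loop that retries until the bit is set. */
static uint64_t demo_set_flag_cas(_Atomic uint64_t *flags)
{
	uint64_t old = atomic_load_explicit(flags, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(flags, &old,
			old | DEMO_FLAG_BUSY,
			memory_order_acq_rel, memory_order_relaxed))
		; /* the failed CAS refreshed old; retry with it */

	return old;
}

/* After: a single read-modify-write with the same result. */
static uint64_t demo_set_flag_or(_Atomic uint64_t *flags)
{
	return atomic_fetch_or_explicit(flags, DEMO_FLAG_BUSY,
					memory_order_acq_rel);
}
```

Both functions return the previous value, so a caller can still tell whether it was the one that set the bit.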
v2 - fix clang build
- replace rte_atomic64 in more drivers
- incorporate feedback on rte_smp and ring
- drop zxdh change (only caused by intrinsics in spinlock)
Stephen Hemminger (11):
eal: use intrinsics for rte_atomic on all platforms
eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
ring: use C11 atomic operations for MP/SP head/tail
net/bonding: use stdatomic
net/nbl: remove unused rte_atomic16 field
net/ena: replace use of rte_atomicNN
net/failsafe: convert to stdatomic
net/enic: do not use deprecated rte_atomic64
net/pfe: use ethdev linkstatus helpers
net/sfc: replace rte_atomic with stdatomic
crypto/ccp: replace use of rte_atomic64 with stdatomic
doc/guides/rel_notes/deprecation.rst | 8 -
drivers/crypto/ccp/ccp_crypto.c | 11 +-
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 +-
drivers/crypto/ccp/ccp_dev.h | 4 +-
drivers/net/bonding/eth_bond_8023ad_private.h | 6 +-
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 ++-
drivers/net/ena/base/ena_plat_dpdk.h | 14 +-
drivers/net/ena/ena_ethdev.c | 21 +-
drivers/net/ena/ena_ethdev.h | 7 +-
drivers/net/enic/enic.h | 6 +-
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +-
drivers/net/enic/enic_rxtx.c | 14 +-
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 +-
drivers/net/failsafe/failsafe_ops.c | 12 +-
drivers/net/failsafe/failsafe_private.h | 29 +--
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
drivers/net/pfe/pfe_ethdev.c | 32 +--
drivers/net/sfc/sfc.c | 9 +-
drivers/net/sfc/sfc.h | 4 +-
drivers/net/sfc/sfc_port.c | 7 +-
drivers/net/sfc/sfc_stats.h | 2 +-
lib/eal/arm/include/rte_atomic_32.h | 10 -
lib/eal/arm/include/rte_atomic_64.h | 10 -
lib/eal/include/generic/rte_atomic.h | 206 +++---------------
lib/eal/loongarch/include/rte_atomic.h | 10 -
lib/eal/ppc/include/rte_atomic.h | 179 ---------------
lib/eal/riscv/include/rte_atomic.h | 10 -
lib/eal/x86/include/rte_atomic.h | 205 +----------------
lib/eal/x86/include/rte_atomic_32.h | 188 ----------------
lib/eal/x86/include/rte_atomic_64.h | 157 -------------
lib/ring/rte_ring_generic_pvt.h | 65 ++++--
34 files changed, 190 insertions(+), 1108 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 105+ messages in thread
* [RFC v2 01/11] eal: use intrinsics for rte_atomic on all platforms
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 02/11] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
` (10 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
Next step is to deprecate the rte_atomicNN_*() family. Rather than
maintaining both the inline asm and intrinsic fallbacks, drop the
asm paths and use intrinsics everywhere.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/arm/include/rte_atomic_32.h | 4 -
lib/eal/arm/include/rte_atomic_64.h | 4 -
lib/eal/include/generic/rte_atomic.h | 76 +---------
lib/eal/loongarch/include/rte_atomic.h | 4 -
lib/eal/ppc/include/rte_atomic.h | 173 -----------------------
lib/eal/riscv/include/rte_atomic.h | 4 -
lib/eal/x86/include/rte_atomic.h | 172 ----------------------
lib/eal/x86/include/rte_atomic_32.h | 188 -------------------------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------------------
9 files changed, 6 insertions(+), 776 deletions(-)
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 0b9a0dfa30..696a539fef 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -5,10 +5,6 @@
#ifndef _RTE_ATOMIC_ARM32_H_
#define _RTE_ATOMIC_ARM32_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#ifdef __cplusplus
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 181bb60929..9f790238df 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -6,10 +6,6 @@
#ifndef _RTE_ATOMIC_ARM64_H_
#define _RTE_ATOMIC_ARM64_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#include <rte_branch_prediction.h>
#include <rte_debug.h>
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 0a4f3f8528..292e52fade 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -187,13 +187,11 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -211,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* The original value at that location
*/
static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -312,13 +306,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
static inline void
rte_atomic16_inc(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -329,13 +321,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
static inline void
rte_atomic16_dec(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
}
-#endif
/**
* Atomically add a 16-bit value to a counter and return the result.
@@ -391,13 +381,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
*/
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 16-bit counter by one and test.
@@ -412,13 +400,11 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 16-bit atomic counter.
@@ -433,12 +419,10 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 16-bit counter to 0.
@@ -472,13 +456,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -496,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* The original value at that location
*/
static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -597,13 +575,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
static inline void
rte_atomic32_inc(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -614,13 +590,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
static inline void
rte_atomic32_dec(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
}
-#endif
/**
* Atomically add a 32-bit value to a counter and return the result.
@@ -676,13 +650,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
*/
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 32-bit counter by one and test.
@@ -697,13 +669,11 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 32-bit atomic counter.
@@ -718,12 +688,10 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 32-bit counter to 0.
@@ -756,13 +724,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -780,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* The original value at that location
*/
static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -811,7 +773,6 @@ typedef struct {
static inline void
rte_atomic64_init(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -828,7 +789,6 @@ rte_atomic64_init(rte_atomic64_t *v)
}
#endif
}
-#endif
/**
* Atomically read a 64-bit counter.
@@ -841,7 +801,6 @@ rte_atomic64_init(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -860,7 +819,6 @@ rte_atomic64_read(rte_atomic64_t *v)
return tmp;
#endif
}
-#endif
/**
* Atomically set a 64-bit counter.
@@ -873,7 +831,6 @@ rte_atomic64_read(rte_atomic64_t *v)
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -890,7 +847,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
}
#endif
}
-#endif
/**
* Atomically add a 64-bit value to a counter.
@@ -903,14 +859,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically subtract a 64-bit value from a counter.
@@ -923,14 +877,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -941,13 +893,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
static inline void
rte_atomic64_inc(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -958,13 +908,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
static inline void
rte_atomic64_dec(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
}
-#endif
/**
* Add a 64-bit value to an atomic counter and return the result.
@@ -982,14 +930,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst) + inc;
}
-#endif
/**
* Subtract a 64-bit value from an atomic counter and return the result.
@@ -1007,14 +953,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst) - dec;
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -1029,12 +973,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
*/
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -1049,12 +991,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
-#endif
/**
* Atomically test and set a 64-bit atomic counter.
@@ -1069,12 +1009,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 64-bit counter to 0.
@@ -1084,12 +1022,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
*/
static inline void rte_atomic64_clear(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
-#endif
#endif
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index c8066a4612..785a452c9e 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -5,10 +5,6 @@
#ifndef RTE_ATOMIC_LOONGARCH_H
#define RTE_ATOMIC_LOONGARCH_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <rte_common.h>
#include "generic/rte_atomic.h"
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 10acc238f9..64f4c3d670 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -43,179 +43,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
}
/*------------------------- 16 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
-}
-
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 66346ad474..061b175f33 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -8,10 +8,6 @@
#ifndef RTE_ATOMIC_RISCV_H
#define RTE_ATOMIC_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index e071e4234e..4f05302c9f 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -111,178 +111,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgw %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgw %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "incw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "decw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgl %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgl %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "incl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "decl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
index 0f25863aa5..37d139f30d 100644
--- a/lib/eal/x86/include/rte_atomic_32.h
+++ b/lib/eal/x86/include/rte_atomic_32.h
@@ -20,193 +20,5 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
- union {
- struct {
- uint32_t l32;
- uint32_t h32;
- };
- uint64_t u64;
- } _exp, _src;
-
- _exp.u64 = exp;
- _src.u64 = src;
-
-#ifndef __PIC__
- asm volatile (
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "b" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#else
- asm volatile (
- "xchgl %%ebx, %%edi;\n"
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- "xchgl %%ebx, %%edi;\n"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "D" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#endif
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
-{
- uint64_t old;
-
- do {
- old = *dest;
- } while (rte_atomic64_cmpset(dest, old, val) == 0);
-
- return old;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, 0);
- }
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- /* replace the value by itself */
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp);
- }
- return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, new_value);
- }
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-
- return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-
- return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- rte_atomic64_set(v, 0);
-}
-#endif
#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
index 0a7a2131e0..1cd12695a2 100644
--- a/lib/eal/x86/include/rte_atomic_64.h
+++ b/lib/eal/x86/include/rte_atomic_64.h
@@ -22,163 +22,6 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
-
-
- asm volatile(
- MPLOCKED
- "cmpxchgq %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgq %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- asm volatile(
- MPLOCKED
- "addq %[inc], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [inc] "ir" (inc), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- asm volatile(
- MPLOCKED
- "subq %[dec], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [dec] "ir" (dec), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "incq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "decq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int64_t prev = inc;
-
- asm volatile(
- MPLOCKED
- "xaddq %[prev], %[cnt]"
- : [prev] "+r" (prev), /* output */
- [cnt] "=m" (v->cnt)
- : "m" (v->cnt) /* input */
- );
- return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
-
- return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "decq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-#endif
/*------------------------ 128 bit atomic operations -------------------------*/
--
2.53.0
* [RFC v2 02/11] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
operation deprecation") in 2021, but the deprecation was never
carried through.
Reimplement them as inline wrappers over rte_atomic_thread_fence()
and drop the deprecation notice.
The API is preserved; only the implementation changes.
Generated code is unchanged on x86: seq_cst keeps the lock-addl
trick, and release/acquire collapse to a compiler barrier under TSO.
On arm64, the release fence emits dmb ish instead of dmb ishst, while
the acquire fence still emits dmb ishld; the difference is below
measurement noise.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
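Note below the cut line, so not part of the commit message: a minimal
sketch of the barrier-to-fence mapping described above. It is written
against plain C11 <stdatomic.h> rather than the DPDK headers so that it
builds standalone. The sketch_* names, payload and ready are
hypothetical, and atomic_thread_fence() only approximates
rte_atomic_thread_fence() (the x86 seq_cst case in this patch uses
lock addl instead of whatever the compiler would emit).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_smp_mb/wmb/rmb; not the DPDK definitions. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static uint32_t payload;       /* plain data handed from producer to consumer */
static _Atomic uint32_t ready; /* flag that publishes the payload */

/* Producer: write the data, fence, then raise the flag. */
static void producer_publish(uint32_t value)
{
	payload = value;
	sketch_smp_wmb(); /* earlier stores are ordered before the flag store */
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag has been observed. */
static int consumer_poll(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	sketch_smp_rmb(); /* the flag load is ordered before the payload load */
	*out = payload;
	return 1;
}
```

The release fence is stronger than a store-store barrier (it also
orders earlier loads), which is consistent with the updated
rte_smp_wmb() comment in this patch.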
doc/guides/rel_notes/deprecation.rst | 8 --
lib/eal/arm/include/rte_atomic_32.h | 6 --
lib/eal/arm/include/rte_atomic_64.h | 6 --
lib/eal/include/generic/rte_atomic.h | 130 +++++--------------------
lib/eal/loongarch/include/rte_atomic.h | 6 --
lib/eal/ppc/include/rte_atomic.h | 6 --
lib/eal/riscv/include/rte_atomic.h | 6 --
lib/eal/x86/include/rte_atomic.h | 33 +++----
8 files changed, 37 insertions(+), 164 deletions(-)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 35c9b4e06c..2190419f79 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -47,14 +47,6 @@ Deprecation Notices
operations must be used for patches that need to be merged in 20.08 onwards.
This change will not introduce any performance degradation.
-* rte_smp_*mb: These APIs provide full barrier functionality. However, many
- use cases do not require full barriers. To support such use cases, DPDK has
- adopted atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations and a new wrapper ``rte_atomic_thread_fence`` instead of
- ``__atomic_thread_fence`` must be used for patches that need to be merged in
- 20.08 onwards. This change will not introduce any performance degradation.
-
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
used by iterators, and arrays holding these values are sized with this
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 696a539fef..4115271091 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -17,12 +17,6 @@ extern "C" {
#define rte_rmb() __sync_synchronize()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 9f790238df..604e777bcd 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -20,12 +20,6 @@ extern "C" {
#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
-#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
-
-#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-
-#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 292e52fade..1b04b43cbb 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -59,55 +59,25 @@ static inline void rte_rmb(void);
*
* Guarantees that the LOAD and STORE operations that precede the
* rte_smp_mb() call are globally visible across the lcores
- * before the LOAD and STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
+ * before the LOAD and STORE operations that follow it.
*/
static inline void rte_smp_mb(void);
/**
* Write memory barrier between lcores
*
- * Guarantees that the STORE operations that precede the
- * rte_smp_wmb() call are globally visible across the lcores
- * before the STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores before
+ * any STORE operations that follow it.
*/
static inline void rte_smp_wmb(void);
/**
* Read memory barrier between lcores
*
- * Guarantees that the LOAD operations that precede the
- * rte_smp_rmb() call are globally visible across the lcores
- * before the LOAD operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that any LOAD operations that precede the rte_smp_rmb()
+ * call complete before LOAD and STORE operations that follow it
+ * become globally visible.
*/
static inline void rte_smp_rmb(void);
///@}
@@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
*/
static inline void rte_atomic_thread_fence(rte_memory_order memorder);
+static __rte_always_inline void
+rte_smp_mb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_seq_cst);
+}
+
+static __rte_always_inline void
+rte_smp_wmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_release);
+}
+
+static __rte_always_inline void
+rte_smp_rmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+}
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -184,9 +172,6 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
@@ -303,9 +288,6 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v);
-
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
@@ -318,9 +300,6 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v);
-
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
@@ -379,8 +358,6 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -398,8 +375,6 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -417,8 +392,6 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
@@ -453,9 +426,6 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
@@ -572,9 +542,6 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v);
-
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
@@ -587,9 +554,6 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v);
-
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
@@ -648,8 +612,6 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -667,8 +629,6 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -686,8 +646,6 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
@@ -721,9 +679,6 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
@@ -770,9 +725,6 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -798,9 +750,6 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -828,9 +777,6 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -856,9 +802,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
@@ -874,9 +817,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
@@ -890,9 +830,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
@@ -905,9 +842,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
@@ -927,9 +861,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
@@ -950,9 +881,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
@@ -971,8 +899,6 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
@@ -989,8 +915,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
@@ -1007,8 +931,6 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
@@ -1020,8 +942,6 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v);
-
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index 785a452c9e..a789e3ab4d 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -18,12 +18,6 @@ extern "C" {
#define rte_rmb() rte_mb()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_mb()
-
-#define rte_smp_rmb() rte_mb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_mb()
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 64f4c3d670..0e64db2a35 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("sync" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 061b175f33..04c40e4e9b 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -23,12 +23,6 @@ extern "C" {
#define rte_rmb() asm volatile("fence r, r" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
#define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index 4f05302c9f..f4d39ce4fe 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -23,10 +23,6 @@
#define rte_rmb() _mm_lfence()
-#define rte_smp_wmb() rte_compiler_barrier()
-
-#define rte_smp_rmb() rte_compiler_barrier()
-
#ifdef __cplusplus
extern "C" {
#endif
@@ -63,20 +59,6 @@ extern "C" {
* So below we use that technique for rte_smp_mb() implementation.
*/
-static __rte_always_inline void
-rte_smp_mb(void)
-{
-#ifdef RTE_TOOLCHAIN_MSVC
- _mm_mfence();
-#else
-#ifdef RTE_ARCH_I686
- asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
-#else
- asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
-#endif
-#endif
-}
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_compiler_barrier()
@@ -93,10 +75,19 @@ rte_smp_mb(void)
static __rte_always_inline void
rte_atomic_thread_fence(rte_memory_order memorder)
{
- if (memorder == rte_memory_order_seq_cst)
- rte_smp_mb();
- else
+ if (memorder == rte_memory_order_seq_cst) {
+#ifdef RTE_TOOLCHAIN_MSVC
+ _mm_mfence();
+#else
+#ifdef RTE_ARCH_I686
+ asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+#endif
+ } else {
__rte_atomic_thread_fence(memorder);
+ }
}
#ifdef __cplusplus
--
2.53.0
* [RFC v2 03/11] ring: use C11 atomic operations for MP/SP head/tail
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 01/11] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 02/11] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 04/11] net/bonding: use stdatomic Stephen Hemminger
` (8 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Wathsala Vithanage
Last caller of rte_atomic32_cmpset() in lib/, blocking deprecation
of the rte_atomicNN_*() family.
Replace cmpset with rte_atomic_compare_exchange_weak_explicit(),
and convert head/tail loads/stores from implicit seq_cst to explicit
acquire/release. Matches the HTS/RTS pattern.
Acquire-load of d->head orders the subsequent load of s->tail (was
rte_smp_rmb()). Acquire-load of s->tail pairs with the release-store
of the counterpart tail in __rte_ring_update_tail(), which subsumes
the previous wmb/rmb barriers.
Weak CAS avoids arm64's hidden inner retry; the outer do-while already
loops. CAS orderings relaxed: no data published by the reservation.
The now-unused 'enqueue' parameter of __rte_ring_update_tail() is
kept but marked __rte_unused, so call sites do not change.
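For anyone who wants to experiment with the ordering outside of DPDK, the
scheme described above can be sketched with plain C11 <stdatomic.h>. This is
only an illustration under assumptions: struct headtail, move_head() and
update_tail() are hypothetical names, the struct is not the real rte_ring
layout, and the rte_atomic_*_explicit() wrappers are replaced by their C11
counterparts.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified head/tail pair; not the real rte_ring layout. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Reserve up to n slots on 'd', bounded by the counterpart's tail in 's'.
 * Returns the number of slots reserved; *old_head receives the start index.
 */
static uint32_t
move_head(struct headtail *d, struct headtail *s, uint32_t capacity,
	  uint32_t n, uint32_t *old_head)
{
	uint32_t want = n, tail, entries, new_head;

	do {
		n = want;
		/* Acquire: keeps this load ordered before the tail load. */
		*old_head = atomic_load_explicit(&d->head, memory_order_acquire);
		/* Acquire: pairs with the release store in update_tail(). */
		tail = atomic_load_explicit(&s->tail, memory_order_acquire);
		/* Unsigned arithmetic, so the result is modulo 2^32. */
		entries = capacity + tail - *old_head;
		if (n > entries)
			n = entries;
		if (n == 0)
			return 0;
		new_head = *old_head + n;
		/* Weak CAS, relaxed: a spurious failure just loops again. */
	} while (!atomic_compare_exchange_weak_explicit(&d->head, old_head,
			new_head, memory_order_relaxed, memory_order_relaxed));
	return n;
}

/* Publish the slots [old_val, new_val) to the other side. */
static void
update_tail(struct headtail *ht, uint32_t old_val, uint32_t new_val)
{
	/* Wait for reservations that preceded ours to finish first. */
	while (atomic_load_explicit(&ht->tail, memory_order_relaxed) != old_val)
		;
	/* Release: slot accesses above become visible before the new tail. */
	atomic_store_explicit(&ht->tail, new_val, memory_order_release);
}
```

The sketch keeps the head load inside the loop, as the patch does, so a retry
goes through the acquire load again instead of relying on the value written
back by the failed relaxed CAS.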
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/ring/rte_ring_generic_pvt.h | 65 +++++++++++++++++++++++----------
1 file changed, 45 insertions(+), 20 deletions(-)
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index affd2d5ba7..84570fd5fc 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -23,21 +23,24 @@
*/
static __rte_always_inline void
__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
- uint32_t new_val, uint32_t single, uint32_t enqueue)
+ uint32_t new_val, uint32_t single,
+ uint32_t enqueue __rte_unused)
{
- if (enqueue)
- rte_smp_wmb();
- else
- rte_smp_rmb();
/*
* If there are other enqueues/dequeues in progress that preceded us,
* we need to wait for them to complete
*/
if (!single)
- rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
- rte_memory_order_relaxed);
-
- ht->tail = new_val;
+ rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail,
+ old_val, rte_memory_order_relaxed);
+ /*
+ * R0: Release store on the tail. Pairs with the acquire load of the
+ * counterpart's tail at A0 in __rte_ring_headtail_move_head() on the
+ * other side. Ensures slot operations performed by this thread (writes
+ * for enqueue, reads for dequeue) become visible before the new tail
+ * value is observed by the other side.
+ */
+ rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
}
/**
@@ -76,25 +79,35 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
{
unsigned int max = n;
int success;
+ uint32_t tail;
do {
/* Reset n to the initial burst count */
n = max;
- *old_head = d->head;
+ /*
+ * Acquire on d->head and acquire on s->tail below together prevent
+ * the two loads from being reordered (was rte_smp_rmb()) and
+ * re-establish ordering after a failed CAS on retry.
+ */
+ *old_head = rte_atomic_load_explicit(&d->head,
+ rte_memory_order_acquire);
- /* add rmb barrier to avoid load/load reorder in weak
- * memory model. It is noop on x86
+ /*
+ * A0: Acquire load on the counterpart's tail. Pairs with the
+ * release store at R0 in __rte_ring_update_tail(), ensuring slot
+ * operations on the other side are visible before this thread
+ * accesses the reserved slots.
*/
- rte_smp_rmb();
+ tail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
/*
* The subtraction is done between two unsigned 32bits value
* (the result is always modulo 32 bits even if we have
- * *old_head > s->tail). So 'entries' is always between 0
+ * *old_head > tail). So 'entries' is always between 0
* and capacity (which is < size).
*/
- *entries = (capacity + s->tail - *old_head);
+ *entries = (capacity + tail - *old_head);
/* check that we have enough room in ring */
if (unlikely(n > *entries))
@@ -106,12 +119,24 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
*new_head = *old_head + n;
if (is_st) {
- d->head = *new_head;
+ rte_atomic_store_explicit(&d->head, *new_head, rte_memory_order_relaxed);
success = 1;
- } else
- success = rte_atomic32_cmpset(
- (uint32_t *)(uintptr_t)&d->head,
- *old_head, *new_head);
+ } else {
+ /*
+ * Weak CAS: the outer do-while handles spurious
+ * failures, so we avoid the strong variant's
+ * internal retry (which on arm64 wraps the LL/SC
+ * pair in a hidden inner loop).
+ *
+ * Relaxed on both success and failure: this CAS
+ * does not publish data. Slot data visibility is
+ * provided by the acquire loads above and the
+ * release store of tail in __rte_ring_update_tail().
+ */
+ success = rte_atomic_compare_exchange_weak_explicit(
+ &d->head, old_head, *new_head,
+ rte_memory_order_relaxed, rte_memory_order_relaxed);
+ }
} while (unlikely(success == 0));
return n;
}
--
2.53.0
* [RFC v2 04/11] net/bonding: use stdatomic
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (2 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 03/11] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 05/11] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
` (7 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Chas Williams, Min Hu (Connor)
The old rte_atomic16 and rte_atomic64 functions are deprecated.
Replace with rte_stdatomic for managing warning and timer flags.
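As a rough illustration of why the cmpset retry loops can go away, here is a
minimal sketch in plain C11. It assumes a single file-scope flag word in place
of port->warnings_to_show; the function names are hypothetical and the C11
calls stand in for the rte_atomic_*_explicit() wrappers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag word standing in for port->warnings_to_show. */
static _Atomic uint16_t warnings_to_show;

/* One atomic read-modify-write replaces the load + cmpset retry loop. */
static void
set_warning_flags(uint16_t flags)
{
	atomic_fetch_or_explicit(&warnings_to_show, flags,
				 memory_order_relaxed);
}

/* Fetch and clear all pending bits in a single step. */
static uint16_t
take_warnings(void)
{
	return atomic_exchange_explicit(&warnings_to_show, 0,
					memory_order_relaxed);
}
```

Both helpers are single atomic operations, so no bit set by a concurrent
caller can be lost between the read and the write.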
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/bonding/eth_bond_8023ad_private.h | 6 ++--
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 ++++++++-----------
2 files changed, 17 insertions(+), 24 deletions(-)
diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
index ab7d15f81a..dd3cf3ed26 100644
--- a/drivers/net/bonding/eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/eth_bond_8023ad_private.h
@@ -9,7 +9,7 @@
#include <rte_ether.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_flow.h>
#include "rte_eth_bond_8023ad.h"
@@ -140,10 +140,10 @@ struct port {
/** Timer which is also used as mutex. If is 0 (not running) RX marker
* packet might be responded. Otherwise shall be dropped. It is zeroed in
* mode 4 callback function after expire. */
- volatile uint64_t rx_marker_timer;
+ RTE_ATOMIC(uint64_t) rx_marker_timer;
uint64_t warning_timer;
- volatile uint16_t warnings_to_show;
+ RTE_ATOMIC(uint16_t) warnings_to_show;
/** Memory pool used to allocate slow queues */
struct rte_mempool *slow_pool;
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index ba88f6d261..cc7e4af2b9 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -171,27 +171,17 @@ timer_is_running(uint64_t *timer)
static void
set_warning_flags(struct port *port, uint16_t flags)
{
- int retval;
- uint16_t old;
- uint16_t new_flag = 0;
-
- do {
- old = port->warnings_to_show;
- new_flag = old | flags;
- retval = rte_atomic16_cmpset(&port->warnings_to_show, old, new_flag);
- } while (unlikely(retval == 0));
+ rte_atomic_fetch_or_explicit(&port->warnings_to_show, flags, rte_memory_order_relaxed);
}
static void
show_warnings(uint16_t member_id)
{
struct port *port = &bond_mode_8023ad_ports[member_id];
- uint8_t warnings;
-
- do {
- warnings = port->warnings_to_show;
- } while (rte_atomic16_cmpset(&port->warnings_to_show, warnings, 0) == 0);
+ uint16_t warnings;
+ warnings = rte_atomic_exchange_explicit(&port->warnings_to_show, 0,
+ rte_memory_order_relaxed);
if (!warnings)
return;
@@ -1337,7 +1327,6 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
struct port *port = &bond_mode_8023ad_ports[member_id];
struct marker_header *m_hdr;
uint64_t marker_timer, old_marker_timer;
- int retval;
uint8_t wrn, subtype;
/* If packet is a marker, we send response now by reusing given packet
* and update only source MAC, destination MAC is multicast so don't
@@ -1354,17 +1343,19 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
}
/* Setup marker timer. Do it in loop in case concurrent access. */
+ old_marker_timer = rte_atomic_load_explicit(&port->rx_marker_timer,
+ rte_memory_order_relaxed);
do {
- old_marker_timer = port->rx_marker_timer;
if (!timer_is_expired(&old_marker_timer)) {
wrn = WRN_RX_MARKER_TO_FAST;
goto free_out;
}
timer_set(&marker_timer, mode4->rx_marker_timeout);
- retval = rte_atomic64_cmpset(&port->rx_marker_timer,
- old_marker_timer, marker_timer);
- } while (unlikely(retval == 0));
+
+ } while (!rte_atomic_compare_exchange_weak_explicit(&port->rx_marker_timer,
+ &old_marker_timer, marker_timer,
+ rte_memory_order_seq_cst, rte_memory_order_relaxed));
m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
rte_eth_macaddr_get(member_id, &m_hdr->eth_hdr.src_addr);
@@ -1372,7 +1363,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
if (internals->mode4.dedicated_queues.enabled == 0) {
if (rte_ring_enqueue(port->tx_ring, pkt) != 0) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
@@ -1386,7 +1378,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
&pkt, tx_count);
if (tx_count != 1) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
--
2.53.0
* [RFC v2 05/11] net/nbl: remove unused rte_atomic16 field
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (3 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 04/11] net/bonding: use stdatomic Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 06/11] net/ena: replace use of rte_atomicNN Stephen Hemminger
` (6 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Dimon Zhao, Leon Yu, Sam Chen
The tx_current_queue was defined as rte_atomic16_t which
is deprecated. Remove it since it was never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/nbl/nbl_hw/nbl_resource.h b/drivers/net/nbl/nbl_hw/nbl_resource.h
index bf5a9461f5..f2182ba6bc 100644
--- a/drivers/net/nbl/nbl_hw/nbl_resource.h
+++ b/drivers/net/nbl/nbl_hw/nbl_resource.h
@@ -225,7 +225,6 @@ struct nbl_res_info {
u16 base_qid;
u16 lcore_max;
u16 *pf_qid_to_lcore_id;
- rte_atomic16_t tx_current_queue;
};
struct nbl_resource_mgt {
--
2.53.0
* [RFC v2 06/11] net/ena: replace use of rte_atomicNN
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (4 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 05/11] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 07/11] net/failsafe: convert to stdatomic Stephen Hemminger
` (5 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Shai Brandes, Evgeny Schemeilin, Ron Beider,
Amit Bernstein, Wajeeh Atrash
Convert the legacy rte_atomicNN operations to stdatomic.
* Remove the variable ena_alloc_cnt, which is defined but not used.
It is a leftover from the previous memzone naming scheme.
* Convert the legacy rte_atomic32_t and rte_atomic32_{inc,dec,set,read}
macros to C11 stdatomic equivalents.
Memory ordering is kept at seq_cst,
matching the implicit ordering of the legacy API.
* Do not use rte_atomic for statistics.
Atomic operations are unnecessary overhead for error counts.
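The macro conversion in the second item can be tried standalone with the
sketch below. It is an approximation under assumptions: it uses C11
<stdatomic.h> directly, with _Atomic int32_t standing in for
RTE_ATOMIC(int32_t), so it does not depend on DPDK headers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for "typedef RTE_ATOMIC(int32_t) ena_atomic32_t;". */
typedef _Atomic int32_t ena_atomic32_t;

/* seq_cst throughout, matching the implicit ordering of the legacy API. */
#define ATOMIC32_INC(i32_ptr) \
	atomic_fetch_add_explicit((i32_ptr), 1, memory_order_seq_cst)
#define ATOMIC32_DEC(i32_ptr) \
	atomic_fetch_sub_explicit((i32_ptr), 1, memory_order_seq_cst)
#define ATOMIC32_SET(i32_ptr, val) \
	atomic_store_explicit((i32_ptr), (val), memory_order_seq_cst)
#define ATOMIC32_READ(i32_ptr) \
	atomic_load_explicit((i32_ptr), memory_order_seq_cst)
```

One difference to keep in mind: the fetch_add and fetch_sub forms evaluate to
the value held before the update, so ATOMIC32_INC() and ATOMIC32_DEC() are now
expressions with a result rather than void calls.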
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/ena/base/ena_plat_dpdk.h | 14 +++++++++-----
drivers/net/ena/ena_ethdev.c | 21 ++++++---------------
drivers/net/ena/ena_ethdev.h | 7 +++----
3 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index c84420de22..83b354d9da 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -40,7 +40,7 @@ typedef uint64_t dma_addr_t;
#endif
#define ENA_PRIu64 PRIu64
-#define ena_atomic32_t rte_atomic32_t
+typedef RTE_ATOMIC(int32_t) ena_atomic32_t;
#define ena_mem_handle_t const struct rte_memzone *
#define SZ_256 (256U)
@@ -267,10 +267,14 @@ ena_mem_alloc_coherent(struct rte_eth_dev_data *data, size_t size,
#define ENA_REG_READ32(bus, reg) \
__extension__ ({ (void)(bus); rte_read32_relaxed((reg)); })
-#define ATOMIC32_INC(i32_ptr) rte_atomic32_inc(i32_ptr)
-#define ATOMIC32_DEC(i32_ptr) rte_atomic32_dec(i32_ptr)
-#define ATOMIC32_SET(i32_ptr, val) rte_atomic32_set(i32_ptr, val)
-#define ATOMIC32_READ(i32_ptr) rte_atomic32_read(i32_ptr)
+#define ATOMIC32_INC(i32_ptr) \
+ rte_atomic_fetch_add_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_DEC(i32_ptr) \
+ rte_atomic_fetch_sub_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_SET(i32_ptr, val) \
+ rte_atomic_store_explicit((i32_ptr), (val), rte_memory_order_seq_cst)
+#define ATOMIC32_READ(i32_ptr) \
+ rte_atomic_load_explicit((i32_ptr), rte_memory_order_seq_cst)
#define msleep(x) rte_delay_us(x * 1000)
#define udelay(x) rte_delay_us(x)
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ea4afbc75d..e9c484456c 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -121,12 +121,6 @@ struct ena_stats {
*/
#define ENA_DEVARG_ENABLE_FRAG_BYPASS "enable_frag_bypass"
-/*
- * Each rte_memzone should have unique name.
- * To satisfy it, count number of allocation and add it to name.
- */
-rte_atomic64_t ena_alloc_cnt;
-
static const struct ena_stats ena_stats_global_strings[] = {
ENA_STAT_GLOBAL_ENTRY(wd_expired),
ENA_STAT_GLOBAL_ENTRY(dev_start),
@@ -1249,10 +1243,7 @@ static void ena_stats_restart(struct rte_eth_dev *dev)
{
struct ena_adapter *adapter = dev->data->dev_private;
- rte_atomic64_init(&adapter->drv_stats->ierrors);
- rte_atomic64_init(&adapter->drv_stats->oerrors);
- rte_atomic64_init(&adapter->drv_stats->rx_nombuf);
- adapter->drv_stats->rx_drops = 0;
+ memset(adapter->drv_stats, 0, sizeof(struct ena_driver_stats));
}
static int ena_stats_get(struct rte_eth_dev *dev,
@@ -1289,9 +1280,9 @@ static int ena_stats_get(struct rte_eth_dev *dev,
/* Driver related stats */
stats->imissed = adapter->drv_stats->rx_drops;
- stats->ierrors = rte_atomic64_read(&adapter->drv_stats->ierrors);
- stats->oerrors = rte_atomic64_read(&adapter->drv_stats->oerrors);
- stats->rx_nombuf = rte_atomic64_read(&adapter->drv_stats->rx_nombuf);
+ stats->ierrors = adapter->drv_stats->ierrors;
+ stats->oerrors = adapter->drv_stats->oerrors;
+ stats->rx_nombuf = adapter->drv_stats->rx_nombuf;
/* Queue statistics */
if (qstats) {
@@ -1887,7 +1878,7 @@ static int ena_populate_rx_queue(struct ena_ring *rxq, unsigned int count)
/* get resources for incoming packets */
rc = rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, count);
if (unlikely(rc < 0)) {
- rte_atomic64_inc(&rxq->adapter->drv_stats->rx_nombuf);
+ ++rxq->adapter->drv_stats->rx_nombuf;
++rxq->rx_stats.mbuf_alloc_fail;
PMD_RX_LOG_LINE(DEBUG, "There are not enough free buffers");
return 0;
@@ -3014,7 +3005,7 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(mbuf->ol_flags &
(RTE_MBUF_F_RX_IP_CKSUM_BAD | RTE_MBUF_F_RX_L4_CKSUM_BAD)))
- rte_atomic64_inc(&rx_ring->adapter->drv_stats->ierrors);
+ ++rx_ring->adapter->drv_stats->ierrors;
rx_pkts[completed] = mbuf;
rx_ring->rx_stats.bytes += mbuf->pkt_len;
diff --git a/drivers/net/ena/ena_ethdev.h b/drivers/net/ena/ena_ethdev.h
index 3a66d79384..b204b07767 100644
--- a/drivers/net/ena/ena_ethdev.h
+++ b/drivers/net/ena/ena_ethdev.h
@@ -6,7 +6,6 @@
#ifndef _ENA_ETHDEV_H_
#define _ENA_ETHDEV_H_
-#include <rte_atomic.h>
#include <rte_ether.h>
#include <ethdev_driver.h>
#include <ethdev_pci.h>
@@ -225,9 +224,9 @@ enum ena_adapter_state {
};
struct ena_driver_stats {
- rte_atomic64_t ierrors;
- rte_atomic64_t oerrors;
- rte_atomic64_t rx_nombuf;
+ u64 ierrors;
+ u64 oerrors;
+ u64 rx_nombuf;
u64 rx_drops;
};
--
2.53.0
* [RFC v2 07/11] net/failsafe: convert to stdatomic
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (5 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 06/11] net/ena: replace use of rte_atomicNN Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 08/11] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
` (4 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gaetan Rivet
The rte_atomic64 functions are deprecated; convert this
code to use stdatomic for the reference count. Use the memory
order implied by the P/V naming: acquire for P, release for V.
No need for initialization since refcnt is in space
allocated with rte_zmalloc().
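A minimal sketch of the P/V pairing, assuming plain C11 atomics in place of
the rte_stdatomic wrappers. The function names fs_p(), fs_v() and fs_busy()
are hypothetical; the driver itself uses the FS_ATOMIC_* macros.

```c
#include <stdatomic.h>
#include <stdint.h>

/* P: mark the queue busy. Acquire keeps later accesses after the mark. */
static uint64_t
fs_p(_Atomic uint64_t *refcnt)
{
	return atomic_exchange_explicit(refcnt, 1, memory_order_acquire);
}

/* V: mark the queue idle. Release keeps earlier accesses before the clear. */
static void
fs_v(_Atomic uint64_t *refcnt)
{
	atomic_store_explicit(refcnt, 0, memory_order_release);
}

/* Observer side: is a burst currently in flight on this queue? */
static int
fs_busy(_Atomic uint64_t *refcnt)
{
	return atomic_load_explicit(refcnt, memory_order_seq_cst) != 0;
}
```

A datapath function would call fs_p() before touching the sub-device queue
and fs_v() afterwards, while the control path polls fs_busy() before tearing
the sub-device down.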
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/failsafe/failsafe_ops.c | 12 +++++-----
drivers/net/failsafe/failsafe_private.h | 29 ++++++++++++++-----------
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
3 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index ddc8808ebe..fcb0051777 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -11,7 +11,7 @@
#endif
#include <rte_debug.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <ethdev_driver.h>
#include <rte_malloc.h>
#include <rte_flow.h>
@@ -440,14 +440,13 @@ fs_rx_queue_setup(struct rte_eth_dev *dev,
}
rxq = rte_zmalloc(NULL,
sizeof(*rxq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (rxq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&rxq->refcnt[i]);
+
rxq->qid = rx_queue_id;
rxq->socket_id = socket_id;
rxq->info.mp = mb_pool;
@@ -617,14 +616,13 @@ fs_tx_queue_setup(struct rte_eth_dev *dev,
}
txq = rte_zmalloc("ethdev TX queue",
sizeof(*txq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (txq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&txq->refcnt[i]);
+
txq->qid = tx_queue_id;
txq->socket_id = socket_id;
txq->info.conf = *tx_conf;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index babea6016e..89b06f9756 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -10,7 +10,7 @@
#include <sys/queue.h>
#include <pthread.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <dev_driver.h>
#include <ethdev_driver.h>
#include <rte_devargs.h>
@@ -75,7 +75,7 @@ struct rxq {
int event_fd;
unsigned int enable_events:1;
struct rte_eth_rxq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct txq {
@@ -83,7 +83,7 @@ struct txq {
uint16_t qid;
unsigned int socket_id;
struct rte_eth_txq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct rte_flow {
@@ -320,33 +320,36 @@ extern int failsafe_mac_from_arg;
*/
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_P(a) \
- rte_atomic64_set(&(a), 1)
+ rte_atomic_exchange_explicit(&(a), 1, rte_memory_order_acquire)
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_V(a) \
- rte_atomic64_set(&(a), 0)
+ rte_atomic_store_explicit(&(a), 0, rte_memory_order_release)
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_RX(s, i) \
- rte_atomic64_read( \
- &((struct rxq *) \
- (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct rxq *) \
+ (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
+
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_TX(s, i) \
- rte_atomic64_read( \
- &((struct txq *) \
- (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct txq *) \
+ (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
#ifdef RTE_EXEC_ENV_FREEBSD
#define FS_THREADID_TYPE void*
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..500483bda3 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -3,7 +3,7 @@
* Copyright 2017 Mellanox Technologies, Ltd
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_debug.h>
#include <rte_mbuf.h>
#include <ethdev_driver.h>
--
2.53.0
* [RFC v2 08/11] net/enic: do not use deprecated rte_atomic64
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (6 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 07/11] net/failsafe: convert to stdatomic Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 09/11] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
` (3 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, John Daley, Hyong Youb Kim, Bruce Richardson,
Konstantin Ananyev
The rte_atomic64 datatype and functions are deprecated.
This driver was only using it for error statistics where atomic
is not necessary. The DPDK PMD model is that statistics do
not have to be exact in the face of contention.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/enic/enic.h | 6 +++---
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +++++++----------
drivers/net/enic/enic_rxtx.c | 14 ++++++--------
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 ++--
5 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 87f6b35fcd..0a8d4a29ca 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -59,9 +59,9 @@
#define ENICPMD_RXQ_INTR_OFFSET 1
struct enic_soft_stats {
- rte_atomic64_t rx_nombuf;
- rte_atomic64_t rx_packet_errors;
- rte_atomic64_t tx_oversized;
+ uint64_t rx_nombuf;
+ uint64_t rx_packet_errors;
+ uint64_t tx_oversized;
};
struct enic_memzone_entry {
diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 7cff6831b9..3ce4299e81 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -9,7 +9,6 @@
#include <stdio.h>
#include <unistd.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
#include <rte_io.h>
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 2696fa77d4..fb9a5754c9 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -83,17 +83,15 @@ static void enic_log_q_error(struct enic *enic)
static void enic_clear_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_clear(&soft_stats->rx_nombuf);
- rte_atomic64_clear(&soft_stats->rx_packet_errors);
- rte_atomic64_clear(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
}
static void enic_init_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_init(&soft_stats->rx_nombuf);
- rte_atomic64_init(&soft_stats->rx_packet_errors);
- rte_atomic64_init(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
enic_clear_soft_stats(enic);
}
@@ -132,7 +130,7 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
* counted in ibytes even though truncated packets are dropped
* which can make ibytes be slightly higher than it should be.
*/
- rx_packet_errors = rte_atomic64_read(&soft_stats->rx_packet_errors);
+ rx_packet_errors = soft_stats->rx_packet_errors;
rx_truncated = rx_packet_errors - stats->rx.rx_errors;
r_stats->ipackets = stats->rx.rx_frames_ok - rx_truncated;
@@ -142,12 +140,11 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
r_stats->obytes = stats->tx.tx_bytes_ok;
r_stats->ierrors = stats->rx.rx_errors + stats->rx.rx_drop;
- r_stats->oerrors = stats->tx.tx_errors
- + rte_atomic64_read(&soft_stats->tx_oversized);
+ r_stats->oerrors = stats->tx.tx_errors + soft_stats->tx_oversized;
r_stats->imissed = stats->rx.rx_no_bufs + rx_truncated;
- r_stats->rx_nombuf = rte_atomic64_read(&soft_stats->rx_nombuf);
+ r_stats->rx_nombuf = soft_stats->rx_nombuf;
return 0;
}
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 549a153332..c87d947b93 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -112,7 +112,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
/* allocate a new mbuf */
nmb = rte_mbuf_raw_alloc(rq->mp);
if (nmb == NULL) {
- rte_atomic64_inc(&enic->soft_stats.rx_nombuf);
+ ++enic->soft_stats.rx_nombuf;
break;
}
@@ -185,7 +185,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
}
if (unlikely(packet_error)) {
rte_pktmbuf_free(first_seg);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
continue;
}
@@ -303,7 +303,7 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
cqd++;
continue;
}
@@ -505,14 +505,12 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint8_t offload_mode;
uint16_t header_len;
uint64_t tso;
- rte_atomic64_t *tx_oversized;
enic_cleanup_wq(enic, wq);
wq_desc_avail = vnic_wq_desc_avail(wq);
head_idx = wq->head_idx;
desc_count = wq->ring.desc_count;
ol_flags_mask = RTE_MBUF_F_TX_VLAN | RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK;
- tx_oversized = &enic->soft_stats.tx_oversized;
nb_pkts = RTE_MIN(nb_pkts, ENIC_TX_XMIT_MAX);
@@ -527,7 +525,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
/* drop packet if it's too big to send */
if (unlikely(!tso && pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -558,7 +556,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
if (unlikely(header_len == 0 || ((tx_pkt->tso_segsz +
header_len) > ENIC_TX_MAX_PKT_SIZE))) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -681,7 +679,7 @@ static void enqueue_simple_pkts(struct rte_mbuf **pkts,
*/
if (unlikely(p->pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
desc->length = ENIC_TX_MAX_PKT_SIZE;
- rte_atomic64_inc(&enic->soft_stats.tx_oversized);
+ ++enic->soft_stats.tx_oversized;
}
desc++;
}
diff --git a/drivers/net/enic/enic_rxtx_vec_avx2.c b/drivers/net/enic/enic_rxtx_vec_avx2.c
index 600efff270..53589ab788 100644
--- a/drivers/net/enic/enic_rxtx_vec_avx2.c
+++ b/drivers/net/enic/enic_rxtx_vec_avx2.c
@@ -81,7 +81,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
@@ -761,7 +761,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
--
2.53.0
* [RFC v2 09/11] net/pfe: use ethdev linkstatus helpers
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (7 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 08/11] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 10/11] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
` (2 subsequent siblings)
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gagandeep Singh
Rather than open coding with deprecated rte_atomic64,
use the existing ethdev helpers to get and set link status.
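The general idea behind such helpers, reading or writing a link record that
fits in 64 bits as one atomic unit, can be sketched as follows. This is not
the ethdev implementation: struct link, link_get() and link_set() are
hypothetical, and the field layout is an assumption chosen only so that the
record is exactly 8 bytes.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte link record; not the real struct rte_eth_link. */
struct link {
	uint32_t speed;
	uint16_t duplex_autoneg;
	uint16_t status;
};

_Static_assert(sizeof(struct link) == sizeof(uint64_t),
	       "link record must fit one 64-bit atomic");

/* Read the whole record with a single atomic load. */
static void
link_get(_Atomic uint64_t *shared, struct link *out)
{
	uint64_t v = atomic_load_explicit(shared, memory_order_seq_cst);

	memcpy(out, &v, sizeof(*out));
}

/* Write the whole record with a single atomic store. */
static void
link_set(_Atomic uint64_t *shared, const struct link *in)
{
	uint64_t v;

	memcpy(&v, in, sizeof(v));
	atomic_store_explicit(shared, v, memory_order_seq_cst);
}
```

Because the record is transferred in one load or store, a reader never sees a
speed from one update combined with a status from another.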
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/pfe/pfe_ethdev.c | 32 ++------------------------------
1 file changed, 2 insertions(+), 30 deletions(-)
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 1efa17539e..1b183ab1f3 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -531,34 +531,6 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements)
return NULL;
}
-static inline int
-pfe_eth_atomic_read_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = link;
- struct rte_eth_link *src = &dev->data->dev_link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
-static inline int
-pfe_eth_atomic_write_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = &dev->data->dev_link;
- struct rte_eth_link *src = link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
static int
pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
{
@@ -570,7 +542,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
memset(&old, 0, sizeof(old));
memset(&link, 0, sizeof(struct rte_eth_link));
- pfe_eth_atomic_read_link_status(dev, &old);
+ rte_eth_linkstatus_get(dev, &old);
/* Read from PFE CDEV, status of link, if file was successfully
* opened.
@@ -601,7 +573,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
link.link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
link.link_autoneg = RTE_ETH_LINK_AUTONEG;
- pfe_eth_atomic_write_link_status(dev, &link);
+ rte_eth_linkstatus_set(dev, &link);
PFE_PMD_INFO("Port (%d) link is %s", dev->data->port_id,
link.link_status ? "up" : "down");
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
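For readers following the net/pfe change: the removed helpers emulate a 64-bit store with a compare-and-set whose expected value is read non-atomically just beforehand. A minimal sketch of the exchange-based alternative, in plain C11 rather than the DPDK API, is below. The struct layout and all demo_* names are hypothetical, and this is not claimed to be how rte_eth_linkstatus_get()/rte_eth_linkstatus_set() are implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit link record; the real struct rte_eth_link differs. */
struct demo_link {
	uint32_t speed;
	uint16_t duplex;
	uint16_t status;
};

static_assert(sizeof(struct demo_link) == sizeof(uint64_t),
	      "link record must fit in one 64-bit word");

static uint64_t demo_link_pack(const struct demo_link *l)
{
	uint64_t v;

	memcpy(&v, l, sizeof(v));
	return v;
}

static struct demo_link demo_link_unpack(uint64_t v)
{
	struct demo_link l;

	memcpy(&l, &v, sizeof(l));
	return l;
}

/* Publish a new record in one atomic exchange.
 * Returns 0 if the status field changed, -1 if it did not. */
static int demo_linkstatus_set(_Atomic uint64_t *slot,
			       const struct demo_link *new_link)
{
	uint64_t old = atomic_exchange_explicit(slot, demo_link_pack(new_link),
						memory_order_seq_cst);

	return demo_link_unpack(old).status == new_link->status ? -1 : 0;
}

/* Read the whole record with a single atomic load, so it cannot be torn. */
static void demo_linkstatus_get(_Atomic uint64_t *slot, struct demo_link *out)
{
	*out = demo_link_unpack(atomic_load_explicit(slot, memory_order_seq_cst));
}
```

The point of the sketch is only that one exchange or load on the packed word replaces the read-then-cmpset pair; no retry loop or failure path is needed.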
* [RFC v2 10/11] net/sfc: replace rte_atomic with stdatomic
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (8 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 09/11] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 11/11] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
2026-05-22 14:19 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Bruce Richardson
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Andrew Rybchenko
The rte_atomicNN functions are deprecated and need to be replaced.
Use stdatomic for the restart required flag.
Use existing ethdev helper to set link status.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/sfc/sfc.c | 9 +++++----
drivers/net/sfc/sfc.h | 4 ++--
drivers/net/sfc/sfc_port.c | 7 +------
drivers/net/sfc/sfc_stats.h | 2 +-
4 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/sfc/sfc.c b/drivers/net/sfc/sfc.c
index 69747e49ae..3470f7eed6 100644
--- a/drivers/net/sfc/sfc.c
+++ b/drivers/net/sfc/sfc.c
@@ -670,8 +670,8 @@ sfc_restart_if_required(void *arg)
struct sfc_adapter *sa = arg;
/* If restart is scheduled, clear the flag and do it */
- if (rte_atomic32_cmpset((volatile uint32_t *)&sa->restart_required,
- 1, 0)) {
+ if (rte_atomic_exchange_explicit(&sa->restart_required, false,
+ rte_memory_order_seq_cst)) {
sfc_adapter_lock(sa);
if (sa->state == SFC_ETHDEV_STARTED)
(void)sfc_restart(sa);
@@ -685,7 +685,8 @@ sfc_schedule_restart(struct sfc_adapter *sa)
int rc;
/* Schedule restart alarm if it is not scheduled yet */
- if (!rte_atomic32_test_and_set(&sa->restart_required))
+ if (rte_atomic_exchange_explicit(&sa->restart_required, true,
+ rte_memory_order_seq_cst))
return;
rc = rte_eal_alarm_set(1, sfc_restart_if_required, sa);
@@ -1292,7 +1293,7 @@ sfc_probe(struct sfc_adapter *sa)
SFC_ASSERT(sfc_adapter_is_locked(sa));
sa->socket_id = rte_socket_id();
- rte_atomic32_init(&sa->restart_required);
+ sa->restart_required = false;
sfc_log_init(sa, "get family");
rc = sfc_efx_family(pci_dev, &mem_ebrp, &sa->family);
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 629578549f..515e1e708d 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -17,7 +17,7 @@
#include <ethdev_driver.h>
#include <rte_kvargs.h>
#include <rte_spinlock.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "efx.h"
@@ -239,7 +239,7 @@ struct sfc_adapter {
efx_family_t family;
efx_nic_t *nic;
rte_spinlock_t nic_lock;
- rte_atomic32_t restart_required;
+ RTE_ATOMIC(bool) restart_required;
struct sfc_efx_mcdi mcdi;
struct sfc_sriov sriov;
diff --git a/drivers/net/sfc/sfc_port.c b/drivers/net/sfc/sfc_port.c
index 33b53f7ac8..d84648d454 100644
--- a/drivers/net/sfc/sfc_port.c
+++ b/drivers/net/sfc/sfc_port.c
@@ -121,7 +121,6 @@ sfc_port_reset_mac_stats(struct sfc_adapter *sa)
static int
sfc_port_init_dev_link(struct sfc_adapter *sa)
{
- struct rte_eth_link *dev_link = &sa->eth_dev->data->dev_link;
int rc;
efx_link_mode_t link_mode;
struct rte_eth_link current_link;
@@ -132,11 +131,7 @@ sfc_port_init_dev_link(struct sfc_adapter *sa)
sfc_port_link_mode_to_info(link_mode, sa->port.phy_adv_cap,
&current_link);
-
- EFX_STATIC_ASSERT(sizeof(*dev_link) == sizeof(rte_atomic64_t));
- rte_atomic64_set((rte_atomic64_t *)dev_link,
- *(uint64_t *)&current_link);
-
+ rte_eth_linkstatus_set(sa->eth_dev, &current_link);
return 0;
}
diff --git a/drivers/net/sfc/sfc_stats.h b/drivers/net/sfc/sfc_stats.h
index 597e14dab3..eaa2afd3fe 100644
--- a/drivers/net/sfc/sfc_stats.h
+++ b/drivers/net/sfc/sfc_stats.h
@@ -12,7 +12,7 @@
#include <stdint.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "sfc_tweak.h"
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
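The restart flag handling in the net/sfc patch is a test-and-set / test-and-clear pair built on atomic exchange. A minimal sketch in plain C11 follows; the struct, the counters standing in for the alarm and restart calls, and all demo_* names are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_adapter {
	atomic_bool restart_required;
	int alarms_armed;   /* stands in for arming the restart alarm */
	int restarts_done;  /* stands in for performing the restart */
};

/* Arm the alarm only if the flag was previously clear. */
static void demo_schedule_restart(struct demo_adapter *sa)
{
	assert(sa != NULL);
	/* exchange returns the old value: true means already scheduled */
	if (atomic_exchange_explicit(&sa->restart_required, true,
				     memory_order_seq_cst))
		return;
	sa->alarms_armed++;
}

/* Clear the flag and restart only if it was previously set. */
static void demo_restart_if_required(struct demo_adapter *sa)
{
	assert(sa != NULL);
	if (atomic_exchange_explicit(&sa->restart_required, false,
				     memory_order_seq_cst))
		sa->restarts_done++;
}
```

Because exchange returns the previous value, both directions need a single atomic operation and no compare-and-set retry.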
* [RFC v2 11/11] crypto/ccp: replace use of rte_atomic64 with stdatomic
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (9 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 10/11] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
@ 2026-05-21 18:04 ` Stephen Hemminger
2026-05-22 14:19 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Bruce Richardson
11 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-21 18:04 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Sunil Uttarwar
The rte_atomicNN functions are deprecated. Replace the free
count with stdatomic.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/crypto/ccp/ccp_crypto.c | 11 +++++++----
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 ++++++----
drivers/crypto/ccp/ccp_dev.h | 4 ++--
4 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/crypto/ccp/ccp_crypto.c b/drivers/crypto/ccp/ccp_crypto.c
index 5899d83bae..1800ad41c9 100644
--- a/drivers/crypto/ccp/ccp_crypto.c
+++ b/drivers/crypto/ccp/ccp_crypto.c
@@ -2683,7 +2683,8 @@ process_ops_to_enqueue(struct ccp_qp *qp,
b_info->cmd_q = cmd_q;
b_info->lsb_buf_phys = (phys_addr_t)rte_mem_virt2iova((void *)b_info->lsb_buf);
- rte_atomic64_sub(&b_info->cmd_q->free_slots, slots_req);
+ rte_atomic_fetch_sub_explicit(&b_info->cmd_q->free_slots, slots_req,
+ rte_memory_order_seq_cst);
b_info->head_offset = (uint32_t)(cmd_q->qbase_phys_addr + cmd_q->qidx *
Q_DESC_SIZE);
@@ -2729,8 +2730,9 @@ process_ops_to_enqueue(struct ccp_qp *qp,
result = -1;
}
if (unlikely(result < 0)) {
- rte_atomic64_add(&b_info->cmd_q->free_slots,
- (slots_req - b_info->desccnt));
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots,
+ slots_req - b_info->desccnt,
+ rte_memory_order_seq_cst);
break;
}
b_info->op[i] = op[i];
@@ -2914,7 +2916,8 @@ process_ops_to_dequeue(struct ccp_qp *qp,
success:
*total_nb_ops = b_info->total_nb_ops;
nb_ops = ccp_prepare_ops(qp, op, b_info, nb_ops);
- rte_atomic64_add(&b_info->cmd_q->free_slots, b_info->desccnt);
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots, b_info->desccnt,
+ rte_memory_order_seq_cst);
b_info->desccnt = 0;
if (b_info->opcnt > 0) {
qp->b_info = b_info;
diff --git a/drivers/crypto/ccp/ccp_crypto.h b/drivers/crypto/ccp/ccp_crypto.h
index d0b417ca29..5c61b1582d 100644
--- a/drivers/crypto/ccp/ccp_crypto.h
+++ b/drivers/crypto/ccp/ccp_crypto.h
@@ -10,7 +10,7 @@
#include <stdint.h>
#include <string.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 5088d8ded6..a75816cdfc 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -47,14 +47,15 @@ ccp_allot_queue(struct rte_cryptodev *cdev, int slot_req)
priv->last_dev = dev;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots, rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
for (i = 0; i < dev->cmd_q_count; i++) {
dev->qidx++;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots,
+ rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
}
@@ -583,8 +584,9 @@ ccp_add_device(struct ccp_device *dev)
CCP_LOG_ERR("queue doesn't have lsb regions");
cmd_q->lsb = -1;
- rte_atomic64_init(&cmd_q->free_slots);
- rte_atomic64_set(&cmd_q->free_slots, (COMMANDS_PER_QUEUE - 1));
+ rte_atomic_store_explicit(&cmd_q->free_slots,
+ COMMANDS_PER_QUEUE - 1,
+ rte_memory_order_seq_cst);
/* unused slot barrier b/w H&T */
}
diff --git a/drivers/crypto/ccp/ccp_dev.h b/drivers/crypto/ccp/ccp_dev.h
index cd63830759..0d343c2426 100644
--- a/drivers/crypto/ccp/ccp_dev.h
+++ b/drivers/crypto/ccp/ccp_dev.h
@@ -11,7 +11,7 @@
#include <string.h>
#include <bus_pci_driver.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
@@ -182,7 +182,7 @@ struct __rte_cache_aligned ccp_queue {
struct ccp_device *dev;
char memz_name[RTE_MEMZONE_NAMESIZE];
- rte_atomic64_t free_slots;
+ RTE_ATOMIC(uint64_t) free_slots;
/**< available free slots updated from enq/deq calls */
/* Queue identifier */
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
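The crypto/ccp conversion keeps a shared free-slot counter that is decremented on enqueue, incremented on dequeue, and read with a relaxed load when picking a queue. A minimal plain C11 sketch of that shape is below. The capacity value and all demo_* names are hypothetical; note that with an unsigned counter, as in the patch, reserving more than is available would wrap rather than go negative, so the sketch asserts against it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_COMMANDS_PER_QUEUE 32 /* hypothetical queue depth */

struct demo_queue {
	_Atomic uint64_t free_slots;
};

static void demo_queue_init(struct demo_queue *q)
{
	/* one slot is kept unused between head and tail */
	atomic_store_explicit(&q->free_slots, DEMO_COMMANDS_PER_QUEUE - 1,
			      memory_order_seq_cst);
}

/* Advisory check only: the value may change right after the load. */
static bool demo_queue_has_room(struct demo_queue *q, uint64_t need)
{
	return atomic_load_explicit(&q->free_slots,
				    memory_order_relaxed) >= need;
}

static void demo_queue_reserve(struct demo_queue *q, uint64_t n)
{
	/* debug guard against unsigned wrap-around */
	assert(atomic_load_explicit(&q->free_slots, memory_order_relaxed) >= n);
	atomic_fetch_sub_explicit(&q->free_slots, n, memory_order_seq_cst);
}

static void demo_queue_release(struct demo_queue *q, uint64_t n)
{
	atomic_fetch_add_explicit(&q->free_slots, n, memory_order_seq_cst);
}
```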
* Re: [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (10 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 11/11] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
@ 2026-05-22 14:19 ` Bruce Richardson
2026-05-22 14:45 ` Stephen Hemminger
11 siblings, 1 reply; 105+ messages in thread
From: Bruce Richardson @ 2026-05-22 14:19 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
On Thu, May 21, 2026 at 11:04:12AM -0700, Stephen Hemminger wrote:
> The goal is to land every deprecation currently listed in the release
> notes by the 26.11 ABI bump. Functions to be removed in 26.11 need to be
> marked __rte_deprecated by 26.07, with all in-tree users converted off
> them first so CI stays clean.
>
> This is the first step. After this series there are no remaining in-tree
> users of rte_atomic64. Expected follow-ups:
>
> - convert remaining rte_atomic32 users (dpaa/fslmc, netvsc, vmbus,
>   sw_evdev, txgbe, ifc, hinic, bnx2x, vhost)
> - convert remaining rte_atomic16 users (dpaa/fslmc, qman)
> - mark the rte_atomicNN_*() family __rte_deprecated
> - remove the legacy test_atomic.c
> - remove the API itself at 26.11
>
> Patch 1 deletes the inline-asm atomic fallbacks across arm, ppc,
> loongarch, riscv, and x86 now that RTE_FORCE_INTRINSICS has been the
> default everywhere for years. Largest patch by line count and the one
> most worth review attention.
>
> Patch 2 retires the rte_smp_*mb deprecation notice (open since 2021) by
> reimplementing those APIs as inline wrappers over
> rte_atomic_thread_fence; the API is preserved for readability.
>
> Patch 3 is the load-bearing change for lib/: the last caller of
> rte_atomic32_cmpset() is converted, with explicit acquire/release
> orderings matching the existing HTS/RTS ring pattern.
>
> Driver conversions (patches 4-11) match each rte_atomic64 use to its best
> fit rather than blanket seq_cst: software stats become plain assignment
> (DPDK convention, torn reads accepted); CAS loops setting a flag collapse
> to fetch_or or exchange; open-coded link-status CAS in net/pfe and
> net/sfc moves to the existing rte_eth_linkstatus helpers; genuine
> synchronization stays atomic with explicit ordering.
>
> v2
>   - fix clang build
>   - replace rte_atomic64 in more drivers
>   - incorporate feedback on rte_smp and ring
>   - drop zxdh change (only caused by intrinsics in spinlock)
>
> Stephen Hemminger (11):
>   eal: use intrinsics for rte_atomic on all platforms
>   eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
>   ring: use C11 atomic operations for MP/SP head/tail
>   net/bonding: use stdatomic
>   net/nbl: remove unused rte_atomic16 field
>   net/ena: replace use of rte_atomicNN
>   net/failsafe: convert to stdatomic
>   net/enic: do not use deprecated rte_atomic64
>   net/pfe: use ethdev linkstatus helpers
>   net/sfc: replace rte_atomic with stdatomic
>   crypto/ccp: replace use of rte_atomic64 with stdatomic
>
I decided to test this patchset with the ring_perf_autotest (using only two
cores on same socket) to see how performance may be affected on x86 with
this change. In an initial one-off comparison with and without this
patchset for the MP/MC cases, smaller enq/deq bursts (e.g. 8/32) look
slower after this set, while larger bursts (e.g. 128/256) are slightly
faster.
I then ran two more tests with the patches applied and again without, and
got AI to analyse the set of 6 results to come up with more meaningful
conclusions after a little bit more numeric analysis. Below is some of the
summary.
While not necessarily a deal-breaker, the regressions seen are cause for
pause. We probably want to benchmark on a few other x86 (both Intel and
AMD) systems to see if this is a consistent picture.
/Bruce
---
Section-level picture (stable changes only):
Testing burst enq/deq:
10 consistent regressions, 0 consistent improvements.
Testing bulk enq/deq:
10 consistent regressions, 1 consistent improvement.
Testing using two physical cores:
mixed, but regressions outnumber improvements (5 vs 2).
Zero-copy and compression sections:
mostly inconclusive due to high variance, with one stable regression.
Empty bulk deq and single-element:
mixed small set, with isolated improvements and regressions.
Largest consistent regressions (examples):
elem MP/MC burst n=32: -39.19%
elem MP/MC bulk n=32: -37.84%
elem MP/MC two-core bulk n=8: -36.58%
elem MP/MC two-core bulk n=32: -29.46%
elem MP/MC burst n=8: -29.43%
legacy MP/MC burst n=8: -19.04%
Largest consistent improvements (examples):
elem MP/MC two-core bulk n=128: +28.16%
elem MP/MC two-core bulk n=256: +25.91%
legacy SP/SC empty bulk deq n=8: +23.41%
elem SP/SC bulk n=32: +16.05%
legacy MP/MC single: +4.65%
Bottom line:
There is a real and mostly consistent regression trend in cycle-per-element
performance after replacing rte_atomicNN usage, especially in MP/MC burst
and bulk paths. A few benchmarks improve, but they are fewer than
regressions. All-worker total-count throughput appears statistically flat
in this 3-run sample.
^ permalink raw reply [flat|nested] 105+ messages in thread
* Re: [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family
2026-05-22 14:19 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Bruce Richardson
@ 2026-05-22 14:45 ` Stephen Hemminger
0 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-22 14:45 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev
On Fri, 22 May 2026 15:19:00 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:
> >
> I decided to test this patchset with the ring_perf_autotest (using only two
> cores on same socket) to see how performance may be affected on x86 with
> this change. In an initial one-off comparison with and without this
> patchset for the MP/MC cases, smaller enq/deq bursts (e.g. 8/32) look
> slower after this set, while larger bursts (e.g. 128/256) are slightly
> faster.
>
> I then ran two more tests with the patches applied and again without, and
> got AI to analyse the set of 6 results to come up with more meaningful
> conclusions after a little bit more numeric analysis. Below is some of the
> summary.
>
> While not necessarily a deal-breaker, the regressions seen are cause for
> pause. We probably want to benchmark on a few other x86 (both Intel and
> AMD) systems to see if this is a consistent picture.
>
> /Bruce
Could you check whether the problem is the use of intrinsics on x86 or
the changes to rte_ring_pvt?
I am not convinced that deprecating these functions is a hard requirement.
This patchset is more of a what-if experiment.
The alternative is to remove the deprecation notice and just leave
well enough alone.
But some places actually benefit from the changeover because
they are using flags as locks, and using weaker memory orders should
be faster on Arm.
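To make the last point concrete, here is a minimal sketch of a flag used as a lock with acquire/release ordering instead of full sequential consistency. It is plain C11, not taken from any driver in the series, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_flag_lock {
	atomic_bool busy;
};

/* Try to take the flag. Acquire ordering keeps the critical section
 * from being reordered before the flag is observed as taken. */
static bool demo_trylock(struct demo_flag_lock *l)
{
	return !atomic_exchange_explicit(&l->busy, true, memory_order_acquire);
}

/* Drop the flag. Release ordering makes the critical section's writes
 * visible to the next thread that acquires the flag. */
static void demo_unlock(struct demo_flag_lock *l)
{
	assert(atomic_load_explicit(&l->busy, memory_order_relaxed));
	atomic_store_explicit(&l->busy, false, memory_order_release);
}
```

On x86 this is unlikely to differ much from seq_cst, since the exchange is a locked instruction either way; the expected gain is on weakly-ordered ISAs, where a release store is usually cheaper than a sequentially consistent one.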
^ permalink raw reply [flat|nested] 105+ messages in thread
* [PATCH v3 00/27] deprecate rte_atomicNN family
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (7 preceding siblings ...)
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
` (8 more replies)
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
10 siblings, 9 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomicNN_* family was flagged for deprecation in 2021 by
commit 3ec965b6de12 ("doc: update atomic operation deprecation")
but enforcement never landed and in-tree usage continued to grow.
v2 covered the EAL changes, lib/ring, and a starter set of drivers.
This series finishes the job: convert every remaining in-tree
caller to the C11-style rte_atomic_*_explicit() / RTE_ATOMIC()
API, then mark the legacy functions __rte_deprecated so future
in-tree and out-of-tree uses are caught at compile time.
Performance: ran the DPDK perf-tests suite (mempool, hash, stack,
ring, distributor, rcu_qsbr, etc.) on the full series; only
lib/ring showed a regression, addressed by the wrapper in patch 03.
Patch organisation
==================
01-02 EAL: drop the inline-asm fallback paths now that intrinsics
work on all platforms; reimplement rte_smp_*mb on top of
rte_atomic_thread_fence.
03-04 lib/ring and lib/bpf -- the last legacy callers in lib/.
05-25 Drivers and selftests, one patch per directory.
26 Suppress deprecation warnings in app/test/test_atomic.c,
which exercises the legacy API until it goes away.
27 Mark rte_atomicNN_* with __rte_deprecated and drop the
corresponding checkpatch grep; new uses are now caught
at compile time.
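As an illustration of the rte_smp_*mb item in 01-02, barrier helpers can be written as thin wrappers over a thread fence. The sketch below uses plain C11; the mapping of each barrier to a memory order (full to seq_cst, write to release, read to acquire) is an assumption made for the example, not a statement of what patch 02 contains, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumed mapping of legacy-style barriers onto C11 fences. */
static inline void demo_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void demo_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void demo_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static int demo_payload;      /* plain data guarded by the flag */
static atomic_int demo_ready; /* flag published after the data */

/* Producer: write the data, fence, then raise the flag. */
static void demo_publish(int value)
{
	demo_payload = value;
	demo_smp_wmb();
	atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Consumer: observe the flag, fence, then read the data.
 * Returns 1 and fills *out if data was available, else 0. */
static int demo_consume(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&demo_ready, memory_order_relaxed))
		return 0;
	demo_smp_rmb();
	*out = demo_payload;
	return 1;
}
```

Keeping the named helpers lets call sites stay readable while the ordering is expressed in terms the compiler understands.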
Changes since v2
================
Scope: v2 stopped at crypto/ccp (11 patches). v3 adds:
04 lib/bpf -- the bpf interpreter's atomic op macro
13-25 Remaining driver/bus/event/vdpa conversions
26-27 Test-suite warning suppression and the actual
__rte_deprecated marking
Substantive changes to patches that were in v2:
02 Also drop the rte_smp_*mb forbidden-token check from
devtools/checkpatches.sh, since the API is no longer
on a deprecation cycle.
03 lib/ring -- keep most of the original code, introduce wrapper
for the one performance sensitive CAS. This fixes the
20-30% drop in ring_perf test on x86 which was observed
when using atomic_compare_exchange_weak_explicit() with GCC.
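For context on the lib/ring item, the hot path in question is a head-reservation loop around a single compare-and-swap. The sketch below shows that loop shape in plain C11 with the CAS isolated behind a small wrapper. It does not reproduce the wrapper in patch 03: the choice of the strong variant, the orderings, and all demo_* names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: one place to pick the CAS flavour.
 * On failure *expected is refreshed with the current head. */
static inline bool demo_head_cas(_Atomic uint32_t *head, uint32_t *expected,
				 uint32_t desired)
{
	return atomic_compare_exchange_strong_explicit(head, expected, desired,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Reserve up to 'want' entries (burst semantics).
 * Returns the number reserved and stores the previous head in *old_head. */
static uint32_t demo_reserve(_Atomic uint32_t *head, uint32_t tail,
			     uint32_t capacity, uint32_t want,
			     uint32_t *old_head)
{
	uint32_t cur = atomic_load_explicit(head, memory_order_relaxed);
	uint32_t n;

	assert(old_head != NULL);
	do {
		/* unsigned arithmetic handles index wrap-around */
		uint32_t free_entries = capacity + tail - cur;

		n = want < free_entries ? want : free_entries;
		if (n == 0)
			return 0;
	} while (!demo_head_cas(head, &cur, cur + n));

	*old_head = cur;
	return n;
}
```

Isolating the CAS this way means a per-compiler or per-architecture choice can be made in one function without touching the loop.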
Feedback wanted
===============
Series is targeting 26.11 rather than the next release. The driver
conversions touch many maintainers' code and several are likely to
need cycles of review/respin; a longer review window avoids rushing
contested orderings into an earlier release.
- vmbus producer commit-order pattern (patch 17)
- the ring CAS wrapper might also improve performance for the similar
ring-buffer code in vmbus and netvsc.
- Dekker-style seq_cst handshake in net/vhost (patch 24), which
also closes a pre-existing ordering hole on weakly-ordered ISAs
- netvsc rndis_pending claim/timeout/clear cmpxchg orderings
(patch 15)
Stephen Hemminger (27):
eal: use intrinsics for rte_atomic on all platforms
eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
ring: use compare-and-swap wrapper
bpf: replace atomic op macro with typed helpers
net/bonding: use stdatomic
net/nbl: remove unused rte_atomic16 field
net/ena: replace use of rte_atomicNN
net/failsafe: convert to stdatomic
net/enic: do not use deprecated rte_atomic64
net/pfe: use ethdev linkstatus helpers
net/sfc: replace rte_atomic with stdatomic
crypto/ccp: replace use of rte_atomic64 with stdatomic
bus/dpaa: replace rte_atomic16 with stdatomic
drivers: replace rte_atomic16 with stdatomic
net/netvsc: replace rte_atomic32 with stdatomic
event/sw: convert from rte_atomic32 to stdatomic
bus/vmbus: convert from rte_atomic to stdatomic
common/dpaax: remove unused atomic macros
net/bnx2x: convert from rte_atomic32 to stdatomic
bus/fslmc: replace rte_atomic32 with stdatomic
drivers/event: replace rte_atomic32 in selftests
net/hinic: replace rte_atomic32 with stdatomic
net/txgbe: replace rte_atomic32 with stdatomic
net/vhost: use stdatomic instead of rte_atomic32
vdpa/ifc: replace rte_atomic32 with stdatomic
test/atomic: suppress deprecation warnings for legacy APIs
eal: mark rte_atomicNN as deprecated
app/test/test_atomic.c | 12 +
devtools/checkpatches.sh | 16 -
doc/guides/rel_notes/deprecation.rst | 12 +-
doc/guides/rel_notes/release_26_07.rst | 4 +
drivers/bus/dpaa/base/qbman/qman.c | 9 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 12 +-
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 +-
drivers/bus/fslmc/qbman/include/compat.h | 21 +-
drivers/bus/vmbus/private.h | 2 +-
drivers/bus/vmbus/vmbus_bufring.c | 39 ++-
drivers/common/dpaax/compat.h | 14 -
drivers/crypto/ccp/ccp_crypto.c | 11 +-
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 +-
drivers/crypto/ccp/ccp_dev.h | 4 +-
drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 26 +-
drivers/event/dpaa2/dpaa2_hw_dpcon.c | 11 +-
drivers/event/octeontx/ssovf_evdev_selftest.c | 58 ++--
drivers/event/sw/sw_evdev.c | 8 +-
drivers/event/sw/sw_evdev.h | 4 +-
drivers/event/sw/sw_evdev_worker.c | 16 +-
drivers/net/bnx2x/bnx2x.c | 6 +-
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/ecore_sp.c | 6 +-
drivers/net/bonding/eth_bond_8023ad_private.h | 6 +-
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 +-
drivers/net/ena/base/ena_plat_dpdk.h | 14 +-
drivers/net/ena/ena_ethdev.c | 21 +-
drivers/net/ena/ena_ethdev.h | 7 +-
drivers/net/enic/enic.h | 6 +-
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +-
drivers/net/enic/enic_rxtx.c | 14 +-
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 +-
drivers/net/failsafe/failsafe_ops.c | 12 +-
drivers/net/failsafe/failsafe_private.h | 29 +-
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
drivers/net/hinic/base/hinic_compat.h | 2 +-
drivers/net/hinic/base/hinic_pmd_hwdev.c | 24 +-
drivers/net/hinic/base/hinic_pmd_hwdev.h | 4 +-
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
drivers/net/netvsc/hn_rndis.c | 28 +-
drivers/net/netvsc/hn_rxtx.c | 12 +-
drivers/net/netvsc/hn_var.h | 6 +-
drivers/net/pfe/pfe_ethdev.c | 32 +-
drivers/net/sfc/sfc.c | 9 +-
drivers/net/sfc/sfc.h | 4 +-
drivers/net/sfc/sfc_port.c | 7 +-
drivers/net/sfc/sfc_stats.h | 2 +-
drivers/net/txgbe/base/txgbe_mng.c | 4 +-
drivers/net/txgbe/base/txgbe_type.h | 2 +-
drivers/net/vhost/rte_eth_vhost.c | 103 +++---
drivers/vdpa/ifc/ifcvf_vdpa.c | 37 +--
lib/bpf/bpf_exec.c | 91 ++++--
lib/eal/arm/include/rte_atomic_32.h | 10 -
lib/eal/arm/include/rte_atomic_64.h | 10 -
lib/eal/include/generic/rte_atomic.h | 305 +++++-------------
lib/eal/loongarch/include/rte_atomic.h | 10 -
lib/eal/ppc/include/rte_atomic.h | 179 ----------
lib/eal/riscv/include/rte_atomic.h | 10 -
lib/eal/x86/include/rte_atomic.h | 205 +-----------
lib/eal/x86/include/rte_atomic_32.h | 188 -----------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------
lib/ring/rte_ring_generic_pvt.h | 32 +-
66 files changed, 585 insertions(+), 1390 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 105+ messages in thread
* [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
` (7 subsequent siblings)
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Next step is to deprecate the rte_atomicNN_*() family. Rather than
maintaining both the inline asm and intrinsic fallbacks, drop the
asm paths and use intrinsics everywhere.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/arm/include/rte_atomic_32.h | 4 -
lib/eal/arm/include/rte_atomic_64.h | 4 -
lib/eal/include/generic/rte_atomic.h | 76 +---------
lib/eal/loongarch/include/rte_atomic.h | 4 -
lib/eal/ppc/include/rte_atomic.h | 173 -----------------------
lib/eal/riscv/include/rte_atomic.h | 4 -
lib/eal/x86/include/rte_atomic.h | 172 ----------------------
lib/eal/x86/include/rte_atomic_32.h | 188 -------------------------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------------------
9 files changed, 6 insertions(+), 776 deletions(-)
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 0b9a0dfa30..696a539fef 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -5,10 +5,6 @@
#ifndef _RTE_ATOMIC_ARM32_H_
#define _RTE_ATOMIC_ARM32_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#ifdef __cplusplus
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 181bb60929..9f790238df 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -6,10 +6,6 @@
#ifndef _RTE_ATOMIC_ARM64_H_
#define _RTE_ATOMIC_ARM64_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#include <rte_branch_prediction.h>
#include <rte_debug.h>
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 0a4f3f8528..292e52fade 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -187,13 +187,11 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -211,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* The original value at that location
*/
static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -312,13 +306,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
static inline void
rte_atomic16_inc(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -329,13 +321,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
static inline void
rte_atomic16_dec(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
}
-#endif
/**
* Atomically add a 16-bit value to a counter and return the result.
@@ -391,13 +381,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
*/
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 16-bit counter by one and test.
@@ -412,13 +400,11 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 16-bit atomic counter.
@@ -433,12 +419,10 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 16-bit counter to 0.
@@ -472,13 +456,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -496,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* The original value at that location
*/
static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -597,13 +575,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
static inline void
rte_atomic32_inc(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -614,13 +590,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
static inline void
rte_atomic32_dec(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
}
-#endif
/**
* Atomically add a 32-bit value to a counter and return the result.
@@ -676,13 +650,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
*/
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 32-bit counter by one and test.
@@ -697,13 +669,11 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 32-bit atomic counter.
@@ -718,12 +688,10 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 32-bit counter to 0.
@@ -756,13 +724,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -780,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* The original value at that location
*/
static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -811,7 +773,6 @@ typedef struct {
static inline void
rte_atomic64_init(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -828,7 +789,6 @@ rte_atomic64_init(rte_atomic64_t *v)
}
#endif
}
-#endif
/**
* Atomically read a 64-bit counter.
@@ -841,7 +801,6 @@ rte_atomic64_init(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -860,7 +819,6 @@ rte_atomic64_read(rte_atomic64_t *v)
return tmp;
#endif
}
-#endif
/**
* Atomically set a 64-bit counter.
@@ -873,7 +831,6 @@ rte_atomic64_read(rte_atomic64_t *v)
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -890,7 +847,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
}
#endif
}
-#endif
/**
* Atomically add a 64-bit value to a counter.
@@ -903,14 +859,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically subtract a 64-bit value from a counter.
@@ -923,14 +877,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -941,13 +893,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
static inline void
rte_atomic64_inc(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -958,13 +908,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
static inline void
rte_atomic64_dec(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
}
-#endif
/**
* Add a 64-bit value to an atomic counter and return the result.
@@ -982,14 +930,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst) + inc;
}
-#endif
/**
* Subtract a 64-bit value from an atomic counter and return the result.
@@ -1007,14 +953,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst) - dec;
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -1029,12 +973,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
*/
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -1049,12 +991,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
-#endif
/**
* Atomically test and set a 64-bit atomic counter.
@@ -1069,12 +1009,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 64-bit counter to 0.
@@ -1084,12 +1022,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
*/
static inline void rte_atomic64_clear(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
-#endif
#endif
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index c8066a4612..785a452c9e 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -5,10 +5,6 @@
#ifndef RTE_ATOMIC_LOONGARCH_H
#define RTE_ATOMIC_LOONGARCH_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <rte_common.h>
#include "generic/rte_atomic.h"
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 10acc238f9..64f4c3d670 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -43,179 +43,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
}
/*------------------------- 16 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
-}
-
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 66346ad474..061b175f33 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -8,10 +8,6 @@
#ifndef RTE_ATOMIC_RISCV_H
#define RTE_ATOMIC_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index e071e4234e..4f05302c9f 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -111,178 +111,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgw %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgw %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "incw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "decw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgl %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgl %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "incl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "decl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
index 0f25863aa5..37d139f30d 100644
--- a/lib/eal/x86/include/rte_atomic_32.h
+++ b/lib/eal/x86/include/rte_atomic_32.h
@@ -20,193 +20,5 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
- union {
- struct {
- uint32_t l32;
- uint32_t h32;
- };
- uint64_t u64;
- } _exp, _src;
-
- _exp.u64 = exp;
- _src.u64 = src;
-
-#ifndef __PIC__
- asm volatile (
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "b" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#else
- asm volatile (
- "xchgl %%ebx, %%edi;\n"
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- "xchgl %%ebx, %%edi;\n"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "D" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#endif
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
-{
- uint64_t old;
-
- do {
- old = *dest;
- } while (rte_atomic64_cmpset(dest, old, val) == 0);
-
- return old;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, 0);
- }
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- /* replace the value by itself */
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp);
- }
- return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, new_value);
- }
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-
- return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-
- return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- rte_atomic64_set(v, 0);
-}
-#endif
#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
index 0a7a2131e0..1cd12695a2 100644
--- a/lib/eal/x86/include/rte_atomic_64.h
+++ b/lib/eal/x86/include/rte_atomic_64.h
@@ -22,163 +22,6 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
-
-
- asm volatile(
- MPLOCKED
- "cmpxchgq %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgq %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- asm volatile(
- MPLOCKED
- "addq %[inc], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [inc] "ir" (inc), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- asm volatile(
- MPLOCKED
- "subq %[dec], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [dec] "ir" (dec), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "incq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "decq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int64_t prev = inc;
-
- asm volatile(
- MPLOCKED
- "xaddq %[prev], %[cnt]"
- : [prev] "+r" (prev), /* output */
- [cnt] "=m" (v->cnt)
- : "m" (v->cnt) /* input */
- );
- return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
-
- return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "decq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-#endif
/*------------------------ 128 bit atomic operations -------------------------*/
--
2.53.0
* [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
` (6 subsequent siblings)
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
operation deprecation") in 2021, but the deprecation was never
carried through.
Reimplement them as inline wrappers over rte_atomic_thread_fence()
and drop the deprecation notice.
The API is preserved; only the implementation changes.
Generated code is unchanged on x86 (seq_cst keeps the lock-addl
trick, release/acquire collapse to a compiler barrier under TSO).
On arm64, the release fence emits dmb ish where rte_smp_wmb() used
dmb ishst, while the acquire fence still emits dmb ishld;
the difference is below measurement noise.
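For reviewers who want to see the mapping in isolation, here is a
minimal sketch of the same idea using plain C11 <stdatomic.h> rather
than the DPDK wrappers. The sketch_* names and the mailbox structure
are hypothetical and exist only for illustration; they are not part of
this patch or of DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_smp_mb/wmb/rmb, built on C11 fences. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Illustrative single-slot mailbox: payload is published via a flag. */
struct sketch_mailbox {
	uint32_t payload;
	atomic_uint ready;
};

static void sketch_mailbox_init(struct sketch_mailbox *m)
{
	m->payload = 0;
	atomic_init(&m->ready, 0);
}

/* Producer: write the payload, fence, then raise the flag. */
static void sketch_publish(struct sketch_mailbox *m, uint32_t value)
{
	m->payload = value;
	sketch_smp_wmb(); /* payload store is ordered before the flag store */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out if a payload was published, else 0. */
static int sketch_consume(struct sketch_mailbox *m, uint32_t *out)
{
	assert(out != NULL);
	if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
		return 0;
	sketch_smp_rmb(); /* flag load is ordered before the payload load */
	*out = m->payload;
	return 1;
}
```

Callers keep the familiar barrier-style spelling, while the ordering
is expressed through the standard fence semantics: a release fence
before the flag store pairs with an acquire fence after the flag load.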
Drop the restriction from checkpatch, since these functions are no
longer on a deprecation cycle.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
devtools/checkpatches.sh | 8 --
doc/guides/rel_notes/deprecation.rst | 8 --
lib/eal/arm/include/rte_atomic_32.h | 6 --
lib/eal/arm/include/rte_atomic_64.h | 6 --
lib/eal/include/generic/rte_atomic.h | 130 +++++--------------------
lib/eal/loongarch/include/rte_atomic.h | 6 --
lib/eal/ppc/include/rte_atomic.h | 6 --
lib/eal/riscv/include/rte_atomic.h | 6 --
lib/eal/x86/include/rte_atomic.h | 33 +++----
9 files changed, 37 insertions(+), 172 deletions(-)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index f5dd77443f..81bb0fe4e8 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -121,14 +121,6 @@ check_forbidden_additions() { # <patch>
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
- # refrain from new additions of rte_smp_[r/w]mb()
- awk -v FOLDERS="lib drivers app examples" \
- -v EXPRESSIONS="rte_smp_(r|w)?mb\\\(" \
- -v RET_ON_FAIL=1 \
- -v MESSAGE='Using rte_smp_[r/w]mb' \
- -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
- "$1" || res=1
-
# refrain from using compiler __sync_xxx builtins
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="__sync_.*\\\(" \
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 35c9b4e06c..2190419f79 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -47,14 +47,6 @@ Deprecation Notices
operations must be used for patches that need to be merged in 20.08 onwards.
This change will not introduce any performance degradation.
-* rte_smp_*mb: These APIs provide full barrier functionality. However, many
- use cases do not require full barriers. To support such use cases, DPDK has
- adopted atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations and a new wrapper ``rte_atomic_thread_fence`` instead of
- ``__atomic_thread_fence`` must be used for patches that need to be merged in
- 20.08 onwards. This change will not introduce any performance degradation.
-
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
used by iterators, and arrays holding these values are sized with this
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 696a539fef..4115271091 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -17,12 +17,6 @@ extern "C" {
#define rte_rmb() __sync_synchronize()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 9f790238df..604e777bcd 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -20,12 +20,6 @@ extern "C" {
#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
-#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
-
-#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-
-#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 292e52fade..1b04b43cbb 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -59,55 +59,25 @@ static inline void rte_rmb(void);
*
* Guarantees that the LOAD and STORE operations that precede the
* rte_smp_mb() call are globally visible across the lcores
- * before the LOAD and STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
+ * before the LOAD and STORE operations that follow it.
*/
static inline void rte_smp_mb(void);
/**
* Write memory barrier between lcores
*
- * Guarantees that the STORE operations that precede the
- * rte_smp_wmb() call are globally visible across the lcores
- * before the STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores before
+ * any STORE operations that follow it.
*/
static inline void rte_smp_wmb(void);
/**
* Read memory barrier between lcores
*
- * Guarantees that the LOAD operations that precede the
- * rte_smp_rmb() call are globally visible across the lcores
- * before the LOAD operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that any LOAD operations that precede the rte_smp_rmb()
+ * call complete before LOAD and STORE operations that follow it
+ * become globally visible.
*/
static inline void rte_smp_rmb(void);
///@}
@@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
*/
static inline void rte_atomic_thread_fence(rte_memory_order memorder);
+static __rte_always_inline void
+rte_smp_mb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_seq_cst);
+}
+
+static __rte_always_inline void
+rte_smp_wmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_release);
+}
+
+static __rte_always_inline void
+rte_smp_rmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+}
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -184,9 +172,6 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
@@ -303,9 +288,6 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v);
-
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
@@ -318,9 +300,6 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v);
-
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
@@ -379,8 +358,6 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -398,8 +375,6 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -417,8 +392,6 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
@@ -453,9 +426,6 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
@@ -572,9 +542,6 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v);
-
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
@@ -587,9 +554,6 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v);
-
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
@@ -648,8 +612,6 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -667,8 +629,6 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -686,8 +646,6 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
@@ -721,9 +679,6 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
@@ -770,9 +725,6 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -798,9 +750,6 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -828,9 +777,6 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -856,9 +802,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
@@ -874,9 +817,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
@@ -890,9 +830,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
@@ -905,9 +842,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
@@ -927,9 +861,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
@@ -950,9 +881,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
@@ -971,8 +899,6 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
@@ -989,8 +915,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
@@ -1007,8 +931,6 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
@@ -1020,8 +942,6 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v);
-
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index 785a452c9e..a789e3ab4d 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -18,12 +18,6 @@ extern "C" {
#define rte_rmb() rte_mb()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_mb()
-
-#define rte_smp_rmb() rte_mb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_mb()
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 64f4c3d670..0e64db2a35 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("sync" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 061b175f33..04c40e4e9b 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -23,12 +23,6 @@ extern "C" {
#define rte_rmb() asm volatile("fence r, r" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
#define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index 4f05302c9f..f4d39ce4fe 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -23,10 +23,6 @@
#define rte_rmb() _mm_lfence()
-#define rte_smp_wmb() rte_compiler_barrier()
-
-#define rte_smp_rmb() rte_compiler_barrier()
-
#ifdef __cplusplus
extern "C" {
#endif
@@ -63,20 +59,6 @@ extern "C" {
* So below we use that technique for rte_smp_mb() implementation.
*/
-static __rte_always_inline void
-rte_smp_mb(void)
-{
-#ifdef RTE_TOOLCHAIN_MSVC
- _mm_mfence();
-#else
-#ifdef RTE_ARCH_I686
- asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
-#else
- asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
-#endif
-#endif
-}
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_compiler_barrier()
@@ -93,10 +75,19 @@ rte_smp_mb(void)
static __rte_always_inline void
rte_atomic_thread_fence(rte_memory_order memorder)
{
- if (memorder == rte_memory_order_seq_cst)
- rte_smp_mb();
- else
+ if (memorder == rte_memory_order_seq_cst) {
+#ifdef RTE_TOOLCHAIN_MSVC
+ _mm_mfence();
+#else
+#ifdef RTE_ARCH_I686
+ asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+#endif
+ } else {
__rte_atomic_thread_fence(memorder);
+ }
}
#ifdef __cplusplus
--
2.53.0
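For readers following the fence changes in the diff above, here is a minimal C11 sketch of SMP barriers written as thin wrappers over a thread fence. The demo_* names are hypothetical, and the mapping (write barrier to release, read barrier to acquire) is an assumption for illustration; the wrapper bodies from the patch are not shown in this part of the thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical wrappers; the memory-order mapping is an assumption. */
static inline void demo_smp_mb(void)  { atomic_thread_fence(memory_order_seq_cst); }
static inline void demo_smp_wmb(void) { atomic_thread_fence(memory_order_release); }
static inline void demo_smp_rmb(void) { atomic_thread_fence(memory_order_acquire); }

static uint32_t demo_payload;          /* plain data guarded by the flag */
static _Atomic uint32_t demo_ready;    /* publication flag */

/* Producer: write the payload, then publish the flag after a write barrier. */
static void
demo_publish(uint32_t value)
{
	demo_payload = value;
	demo_smp_wmb();
	atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag has been observed. */
static int
demo_consume(uint32_t *out)
{
	if (atomic_load_explicit(&demo_ready, memory_order_relaxed) == 0)
		return 0;
	demo_smp_rmb();
	*out = demo_payload;
	return 1;
}
```

The fence pairs with the relaxed flag access on each side, which is the usual message-passing pattern these barriers are used for.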
* [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-25 7:41 ` Konstantin Ananyev
2026-05-23 19:16 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
` (5 subsequent siblings)
8 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
rte_atomic32_cmpset() is deprecated. An initial attempt at a
direct conversion to
rte_atomic_compare_exchange_weak_explicit()
regressed contended MP/MC performance on x86 by 10-30%,
because the C11 builtin's failure-writeback semantic forces
GCC to emit extra instructions on the CAS critical path.
Add an internal __rte_ring_compare_and_swap() wrapper that calls
__sync_bool_compare_and_swap() directly, which keeps the original
instruction sequence. Add an equivalent implementation for MSVC.
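The difference between the two compare-and-swap styles can be sketched in plain C as follows. This is an illustration only, assuming GCC or Clang; the demo_* names are hypothetical and are not the DPDK functions. The first loop reloads the expected value itself on every retry, while the C11 form relies on the failed CAS writing the current value back into the expected variable.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: boolean CAS with no write-back on failure. */
static inline bool
demo_compare_and_swap(volatile uint32_t *dst, uint32_t expected, uint32_t desired)
{
	/* GCC/Clang builtin, acts as a full barrier */
	return __sync_bool_compare_and_swap(dst, expected, desired);
}

/* Retry loop with an explicit reload; returns the head value it replaced. */
static inline uint32_t
demo_move_head(volatile uint32_t *head, uint32_t n)
{
	uint32_t old_head;

	do {
		old_head = *head;	/* reload on every iteration */
	} while (!demo_compare_and_swap(head, old_head, old_head + n));

	return old_head;
}

/* C11 form: a failed CAS refreshes old_head, so no explicit reload is needed. */
static inline uint32_t
demo_move_head_c11(_Atomic uint32_t *head, uint32_t n)
{
	uint32_t old_head = atomic_load_explicit(head, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(head, &old_head, old_head + n,
			memory_order_seq_cst, memory_order_relaxed))
		;

	return old_head;
}
```

Both loops produce the same result; the sketch says nothing about the generated code or the performance numbers quoted above.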
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/ring/rte_ring_generic_pvt.h | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index affd2d5ba7..0fb972de9e 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -18,6 +18,30 @@
* For more information please refer to <rte_ring.h>.
*/
+/**
+ * @internal optimized version of compare exchange
+ *
+ * The C11 builtin's failure-writeback semantic generates worse code on x86.
+ * Unlike rte_atomic_compare_exchange_*_explicit(), this wrapper does NOT
+ * write the actual value back to a pointer on failure. Callers in a retry
+ * loop must reload the expected value explicitly on the next iteration.
+ *
+ * Full memory barrier, equivalent to rte_memory_order_seq_cst on both
+ * success and failure.
+ */
+static __rte_always_inline bool
+__rte_ring_compare_and_swap(volatile uint32_t *dst,
+ uint32_t expected, uint32_t desired)
+{
+#if defined(RTE_TOOLCHAIN_MSVC)
+ return _InterlockedCompareExchange((volatile long *)dst,
+ (long)desired, (long)expected)
+ == (long)expected;
+#else
+ return __sync_bool_compare_and_swap(dst, expected, desired);
+#endif
+}
+
/**
* @internal This function updates tail values.
*/
@@ -108,10 +132,10 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
if (is_st) {
d->head = *new_head;
success = 1;
- } else
- success = rte_atomic32_cmpset(
- (uint32_t *)(uintptr_t)&d->head,
- *old_head, *new_head);
+ } else {
+ success = __rte_ring_compare_and_swap(
+ &d->head, *old_head, *new_head);
+ }
} while (unlikely(success == 0));
return n;
}
--
2.53.0
* [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (2 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
` (4 subsequent siblings)
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The BPF_ST_ATOMIC_REG macro token-pasted the legacy rte_atomicNN_*()
API names. It also stacked three casts on the destination pointer
and reached a 'return 0' out of the macro into the caller's control
flow.
Replace it with two small static-inline helpers, bpf_atomic32() and
bpf_atomic64(), that dispatch on ins->imm internally and use the C11
atomic intrinsics directly. The destination is cast once, to a
properly __rte_atomic-qualified pointer. The helpers return a status
and the dispatch loop owns the early exit.
Use memory order seq_cst to preserve the previous behavior of
rte_atomicNN_add() / rte_atomicNN_exchange() and to match
the Linux kernel BPF interpreter for these opcodes.
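The shape of such a helper can be sketched in standalone C11 as below. The instruction layout, opcode values and demo_* names are assumptions made up for the example and do not come from the DPDK BPF headers; logging is omitted. The point is the structure: one cast on the destination, a switch on the sub-op, and a status return that leaves the early exit to the caller.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sub-op codes and instruction layout, for illustration only. */
enum { DEMO_ATOMIC_ADD = 0, DEMO_ATOMIC_XCHG = 1 };

struct demo_insn {
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Returns 1 if the sub-op was executed, 0 if it is unsupported. */
static inline int
demo_atomic32(uint64_t reg[], const struct demo_insn *ins)
{
	/* single cast from base register + offset to an atomic pointer */
	_Atomic uint32_t *dst =
		(_Atomic uint32_t *)(uintptr_t)(reg[ins->dst_reg] + ins->off);
	uint32_t val = (uint32_t)reg[ins->src_reg];

	switch (ins->imm) {
	case DEMO_ATOMIC_ADD:
		atomic_fetch_add_explicit(dst, val, memory_order_seq_cst);
		return 1;
	case DEMO_ATOMIC_XCHG:
		/* the previous memory value lands in the source register */
		reg[ins->src_reg] = atomic_exchange_explicit(dst, val,
				memory_order_seq_cst);
		return 1;
	default:
		return 0;
	}
}
```

A caller would test the return value and bail out itself, rather than having a 'return' hidden inside a macro.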
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/bpf/bpf_exec.c | 91 ++++++++++++++++++++++++++++++++++------------
1 file changed, 67 insertions(+), 24 deletions(-)
diff --git a/lib/bpf/bpf_exec.c b/lib/bpf/bpf_exec.c
index 18013753b1..b8116db191 100644
--- a/lib/bpf/bpf_exec.c
+++ b/lib/bpf/bpf_exec.c
@@ -64,28 +64,6 @@
(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
(type)(reg)[(ins)->src_reg])
-#define BPF_ST_ATOMIC_REG(reg, ins, tp) do { \
- switch (ins->imm) { \
- case BPF_ATOMIC_ADD: \
- rte_atomic##tp##_add((rte_atomic##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
- break; \
- case BPF_ATOMIC_XCHG: \
- (reg)[(ins)->src_reg] = rte_atomic##tp##_exchange((uint##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
- break; \
- default: \
- /* this should be caught by validator and never reach here */ \
- RTE_BPF_LOG_LINE(ERR, \
- "%s(%p): unsupported atomic operation at pc: %#zx;", \
- __func__, bpf, \
- (uintptr_t)(ins) - (uintptr_t)(bpf)->prm.ins); \
- return 0; \
- } \
-} while (0)
-
/* BPF_LD | BPF_ABS/BPF_IND */
#define NOP(x) (x)
@@ -105,6 +83,69 @@
reg[EBPF_REG_0] = op(p[0]); \
} while (0)
+/*
+ * Atomic ops on the BPF target memory.
+ *
+ * BPF atomic instructions encode the destination as base register +
+ * signed offset, with the value to combine taken from src_reg.
+ *
+ * Memory order: seq_cst preserves the previous behavior of
+ * rte_atomicNN_add() / rte_atomicNN_exchange() and matches what the
+ * Linux kernel BPF interpreter does for these opcodes.
+ *
+ * Returns 0 on unsupported sub-op (validator should have rejected it),
+ * 1 otherwise.
+ */
+static inline int
+bpf_atomic32(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM],
+ const struct ebpf_insn *ins)
+{
+ /* a cast is needed to make BPF memory suitable for C11 atomics */
+ uint32_t __rte_atomic *dst
+ = (uint32_t __rte_atomic *)(uintptr_t)(reg[ins->dst_reg] + ins->off);
+ uint32_t val = (uint32_t)reg[ins->src_reg];
+
+ switch (ins->imm) {
+ case BPF_ATOMIC_ADD:
+ rte_atomic_fetch_add_explicit(dst, val, rte_memory_order_seq_cst);
+ return 1;
+ case BPF_ATOMIC_XCHG:
+ reg[ins->src_reg] = rte_atomic_exchange_explicit(dst, val,
+ rte_memory_order_seq_cst);
+ return 1;
+ default:
+ RTE_BPF_LOG_LINE(ERR,
+ "%s(%p): unsupported atomic operation at pc: %#zx;",
+ __func__, bpf,
+ (uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+ return 0;
+ }
+}
+
+static inline int
+bpf_atomic64(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM],
+ const struct ebpf_insn *ins)
+{
+ uint64_t __rte_atomic *dst
+ = (uint64_t __rte_atomic *)(uintptr_t) (reg[ins->dst_reg] + ins->off);
+ uint64_t val = reg[ins->src_reg];
+
+ switch (ins->imm) {
+ case BPF_ATOMIC_ADD:
+ rte_atomic_fetch_add_explicit(dst, val, rte_memory_order_seq_cst);
+ return 1;
+ case BPF_ATOMIC_XCHG:
+ reg[ins->src_reg] = rte_atomic_exchange_explicit(dst, val,
+ rte_memory_order_seq_cst);
+ return 1;
+ default:
+ RTE_BPF_LOG_LINE(ERR,
+ "%s(%p): unsupported atomic operation at pc: %#zx;",
+ __func__, bpf,
+ (uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+ return 0;
+ }
+}
static inline void
bpf_alu_be(uint64_t reg[EBPF_REG_NUM], const struct ebpf_insn *ins)
@@ -392,10 +433,12 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
break;
/* atomic instructions */
case (BPF_STX | EBPF_ATOMIC | BPF_W):
- BPF_ST_ATOMIC_REG(reg, ins, 32);
+ if (bpf_atomic32(bpf, reg, ins) == 0)
+ return 0;
break;
case (BPF_STX | EBPF_ATOMIC | EBPF_DW):
- BPF_ST_ATOMIC_REG(reg, ins, 64);
+ if (bpf_atomic64(bpf, reg, ins) == 0)
+ return 0;
break;
/* jump instructions */
case (BPF_JMP | BPF_JA):
--
2.53.0
* [PATCH v3 05/27] net/bonding: use stdatomic
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (3 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
` (3 subsequent siblings)
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The old rte_atomic16 and rte_atomic64 functions are deprecated.
Replace them with rte_stdatomic operations for managing the warning
flags and the marker timer.
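As a rough illustration of why the compare-and-set retry loops for the warning flags can go away, the C11 sketch below sets flags with a single fetch-or and collects them with a single exchange. The demo_* names are hypothetical and the code uses plain <stdatomic.h> rather than the DPDK wrappers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical pending-warning bitmask shared between threads. */
static _Atomic uint16_t demo_warnings;

/* Set bits atomically; replaces a read / OR / cmpset retry loop. */
static inline void
demo_set_warning_flags(uint16_t flags)
{
	atomic_fetch_or_explicit(&demo_warnings, flags, memory_order_relaxed);
}

/* Fetch and clear all pending bits in one step. */
static inline uint16_t
demo_take_warnings(void)
{
	return atomic_exchange_explicit(&demo_warnings, 0, memory_order_relaxed);
}
```

Relaxed ordering is used here on the assumption that the flags only gate log output and do not publish other data.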
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/bonding/eth_bond_8023ad_private.h | 6 ++--
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 ++++++++-----------
2 files changed, 17 insertions(+), 24 deletions(-)
diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
index ab7d15f81a..dd3cf3ed26 100644
--- a/drivers/net/bonding/eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/eth_bond_8023ad_private.h
@@ -9,7 +9,7 @@
#include <rte_ether.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_flow.h>
#include "rte_eth_bond_8023ad.h"
@@ -140,10 +140,10 @@ struct port {
/** Timer which is also used as mutex. If is 0 (not running) RX marker
* packet might be responded. Otherwise shall be dropped. It is zeroed in
* mode 4 callback function after expire. */
- volatile uint64_t rx_marker_timer;
+ RTE_ATOMIC(uint64_t) rx_marker_timer;
uint64_t warning_timer;
- volatile uint16_t warnings_to_show;
+ RTE_ATOMIC(uint16_t) warnings_to_show;
/** Memory pool used to allocate slow queues */
struct rte_mempool *slow_pool;
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index ba88f6d261..cc7e4af2b9 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -171,27 +171,17 @@ timer_is_running(uint64_t *timer)
static void
set_warning_flags(struct port *port, uint16_t flags)
{
- int retval;
- uint16_t old;
- uint16_t new_flag = 0;
-
- do {
- old = port->warnings_to_show;
- new_flag = old | flags;
- retval = rte_atomic16_cmpset(&port->warnings_to_show, old, new_flag);
- } while (unlikely(retval == 0));
+ rte_atomic_fetch_or_explicit(&port->warnings_to_show, flags, rte_memory_order_relaxed);
}
static void
show_warnings(uint16_t member_id)
{
struct port *port = &bond_mode_8023ad_ports[member_id];
- uint8_t warnings;
-
- do {
- warnings = port->warnings_to_show;
- } while (rte_atomic16_cmpset(&port->warnings_to_show, warnings, 0) == 0);
+ uint16_t warnings;
+ warnings = rte_atomic_exchange_explicit(&port->warnings_to_show, 0,
+ rte_memory_order_relaxed);
if (!warnings)
return;
@@ -1337,7 +1327,6 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
struct port *port = &bond_mode_8023ad_ports[member_id];
struct marker_header *m_hdr;
uint64_t marker_timer, old_marker_timer;
- int retval;
uint8_t wrn, subtype;
/* If packet is a marker, we send response now by reusing given packet
* and update only source MAC, destination MAC is multicast so don't
@@ -1354,17 +1343,19 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
}
/* Setup marker timer. Do it in loop in case concurrent access. */
+ old_marker_timer = rte_atomic_load_explicit(&port->rx_marker_timer,
+ rte_memory_order_relaxed);
do {
- old_marker_timer = port->rx_marker_timer;
if (!timer_is_expired(&old_marker_timer)) {
wrn = WRN_RX_MARKER_TO_FAST;
goto free_out;
}
timer_set(&marker_timer, mode4->rx_marker_timeout);
- retval = rte_atomic64_cmpset(&port->rx_marker_timer,
- old_marker_timer, marker_timer);
- } while (unlikely(retval == 0));
+
+ } while (!rte_atomic_compare_exchange_weak_explicit(&port->rx_marker_timer,
+ &old_marker_timer, marker_timer,
+ rte_memory_order_seq_cst, rte_memory_order_relaxed));
m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
rte_eth_macaddr_get(member_id, &m_hdr->eth_hdr.src_addr);
@@ -1372,7 +1363,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
if (internals->mode4.dedicated_queues.enabled == 0) {
if (rte_ring_enqueue(port->tx_ring, pkt) != 0) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
@@ -1386,7 +1378,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
&pkt, tx_count);
if (tx_count != 1) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
--
2.53.0
* [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (4 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
` (2 subsequent siblings)
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The tx_current_queue field was defined as rte_atomic16_t, which
is deprecated. Remove it since it was never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/nbl/nbl_hw/nbl_resource.h b/drivers/net/nbl/nbl_hw/nbl_resource.h
index bf5a9461f5..f2182ba6bc 100644
--- a/drivers/net/nbl/nbl_hw/nbl_resource.h
+++ b/drivers/net/nbl/nbl_hw/nbl_resource.h
@@ -225,7 +225,6 @@ struct nbl_res_info {
u16 base_qid;
u16 lcore_max;
u16 *pf_qid_to_lcore_id;
- rte_atomic16_t tx_current_queue;
};
struct nbl_resource_mgt {
--
2.53.0
* [PATCH v3 07/27] net/ena: replace use of rte_atomicNN
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (5 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Convert the legacy rte_atomicNN operations to stdatomic.
* Remove the variable ena_alloc_cnt, which is defined but not used.
It is a leftover from the previous memzone naming scheme.
* Convert the legacy rte_atomic32_t and rte_atomic32_{inc,dec,set,read}
macros to C11 stdatomic equivalents.
Memory ordering is kept at seq_cst,
matching the implicit ordering of the legacy API.
* Do not use rte_atomic for statistics.
The DPDK PMD model is that statistics do not have to be exact
in the face of contention.
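The macro conversion in the second item can be sketched with plain C11 atomics as below. The DEMO_* names are hypothetical stand-ins for the driver's platform macros, and <stdatomic.h> is used in place of the DPDK wrappers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical platform-layer type and macros, all using seq_cst. */
typedef _Atomic int32_t demo_atomic32_t;

#define DEMO_ATOMIC32_INC(p) \
	atomic_fetch_add_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_DEC(p) \
	atomic_fetch_sub_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_SET(p, val) \
	atomic_store_explicit((p), (val), memory_order_seq_cst)
#define DEMO_ATOMIC32_READ(p) \
	atomic_load_explicit((p), memory_order_seq_cst)
```

One difference worth noting: the fetch-add and fetch-sub forms yield the previous value as an expression, so callers that used the old macros as statements simply discard it.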
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/ena/base/ena_plat_dpdk.h | 14 +++++++++-----
drivers/net/ena/ena_ethdev.c | 21 ++++++---------------
drivers/net/ena/ena_ethdev.h | 7 +++----
3 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index c84420de22..83b354d9da 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -40,7 +40,7 @@ typedef uint64_t dma_addr_t;
#endif
#define ENA_PRIu64 PRIu64
-#define ena_atomic32_t rte_atomic32_t
+typedef RTE_ATOMIC(int32_t) ena_atomic32_t;
#define ena_mem_handle_t const struct rte_memzone *
#define SZ_256 (256U)
@@ -267,10 +267,14 @@ ena_mem_alloc_coherent(struct rte_eth_dev_data *data, size_t size,
#define ENA_REG_READ32(bus, reg) \
__extension__ ({ (void)(bus); rte_read32_relaxed((reg)); })
-#define ATOMIC32_INC(i32_ptr) rte_atomic32_inc(i32_ptr)
-#define ATOMIC32_DEC(i32_ptr) rte_atomic32_dec(i32_ptr)
-#define ATOMIC32_SET(i32_ptr, val) rte_atomic32_set(i32_ptr, val)
-#define ATOMIC32_READ(i32_ptr) rte_atomic32_read(i32_ptr)
+#define ATOMIC32_INC(i32_ptr) \
+ rte_atomic_fetch_add_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_DEC(i32_ptr) \
+ rte_atomic_fetch_sub_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_SET(i32_ptr, val) \
+ rte_atomic_store_explicit((i32_ptr), (val), rte_memory_order_seq_cst)
+#define ATOMIC32_READ(i32_ptr) \
+ rte_atomic_load_explicit((i32_ptr), rte_memory_order_seq_cst)
#define msleep(x) rte_delay_us(x * 1000)
#define udelay(x) rte_delay_us(x)
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ea4afbc75d..e9c484456c 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -121,12 +121,6 @@ struct ena_stats {
*/
#define ENA_DEVARG_ENABLE_FRAG_BYPASS "enable_frag_bypass"
-/*
- * Each rte_memzone should have unique name.
- * To satisfy it, count number of allocation and add it to name.
- */
-rte_atomic64_t ena_alloc_cnt;
-
static const struct ena_stats ena_stats_global_strings[] = {
ENA_STAT_GLOBAL_ENTRY(wd_expired),
ENA_STAT_GLOBAL_ENTRY(dev_start),
@@ -1249,10 +1243,7 @@ static void ena_stats_restart(struct rte_eth_dev *dev)
{
struct ena_adapter *adapter = dev->data->dev_private;
- rte_atomic64_init(&adapter->drv_stats->ierrors);
- rte_atomic64_init(&adapter->drv_stats->oerrors);
- rte_atomic64_init(&adapter->drv_stats->rx_nombuf);
- adapter->drv_stats->rx_drops = 0;
+ memset(adapter->drv_stats, 0, sizeof(struct ena_driver_stats));
}
static int ena_stats_get(struct rte_eth_dev *dev,
@@ -1289,9 +1280,9 @@ static int ena_stats_get(struct rte_eth_dev *dev,
/* Driver related stats */
stats->imissed = adapter->drv_stats->rx_drops;
- stats->ierrors = rte_atomic64_read(&adapter->drv_stats->ierrors);
- stats->oerrors = rte_atomic64_read(&adapter->drv_stats->oerrors);
- stats->rx_nombuf = rte_atomic64_read(&adapter->drv_stats->rx_nombuf);
+ stats->ierrors = adapter->drv_stats->ierrors;
+ stats->oerrors = adapter->drv_stats->oerrors;
+ stats->rx_nombuf = adapter->drv_stats->rx_nombuf;
/* Queue statistics */
if (qstats) {
@@ -1887,7 +1878,7 @@ static int ena_populate_rx_queue(struct ena_ring *rxq, unsigned int count)
/* get resources for incoming packets */
rc = rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, count);
if (unlikely(rc < 0)) {
- rte_atomic64_inc(&rxq->adapter->drv_stats->rx_nombuf);
+ ++rxq->adapter->drv_stats->rx_nombuf;
++rxq->rx_stats.mbuf_alloc_fail;
PMD_RX_LOG_LINE(DEBUG, "There are not enough free buffers");
return 0;
@@ -3014,7 +3005,7 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(mbuf->ol_flags &
(RTE_MBUF_F_RX_IP_CKSUM_BAD | RTE_MBUF_F_RX_L4_CKSUM_BAD)))
- rte_atomic64_inc(&rx_ring->adapter->drv_stats->ierrors);
+ ++rx_ring->adapter->drv_stats->ierrors;
rx_pkts[completed] = mbuf;
rx_ring->rx_stats.bytes += mbuf->pkt_len;
diff --git a/drivers/net/ena/ena_ethdev.h b/drivers/net/ena/ena_ethdev.h
index 3a66d79384..b204b07767 100644
--- a/drivers/net/ena/ena_ethdev.h
+++ b/drivers/net/ena/ena_ethdev.h
@@ -6,7 +6,6 @@
#ifndef _ENA_ETHDEV_H_
#define _ENA_ETHDEV_H_
-#include <rte_atomic.h>
#include <rte_ether.h>
#include <ethdev_driver.h>
#include <ethdev_pci.h>
@@ -225,9 +224,9 @@ enum ena_adapter_state {
};
struct ena_driver_stats {
- rte_atomic64_t ierrors;
- rte_atomic64_t oerrors;
- rte_atomic64_t rx_nombuf;
+ u64 ierrors;
+ u64 oerrors;
+ u64 rx_nombuf;
u64 rx_drops;
};
--
2.53.0
* [PATCH v3 08/27] net/failsafe: convert to stdatomic
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (6 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomic64 functions are deprecated; convert this
code to use stdatomic for the reference count. Use the memory
order implied by the P/V naming.
No initialization is needed since refcnt lives in memory
allocated with rte_zmalloc().
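A minimal C11 sketch of the P/V pairing described above follows, assuming acquire on P and release on V. The DEMO_* macros and demo_queue structure are hypothetical, and a zero-initialized object stands in for memory obtained from a zeroing allocator.

```c
#include <stdatomic.h>
#include <stdint.h>

/* P marks the slot busy (acquire); V marks it free again (release). */
#define DEMO_ATOMIC_P(a) \
	atomic_exchange_explicit(&(a), 1, memory_order_acquire)
#define DEMO_ATOMIC_V(a) \
	atomic_store_explicit(&(a), 0, memory_order_release)

/* Hypothetical queue with one busy marker per sub-device. */
struct demo_queue {
	_Atomic uint64_t refcnt[2];
};

/* Brackets a burst with P/V and reports the marker value seen in between. */
static uint64_t
demo_burst(struct demo_queue *q, unsigned int sid)
{
	uint64_t busy_inside;

	(void)DEMO_ATOMIC_P(q->refcnt[sid]);
	busy_inside = atomic_load_explicit(&q->refcnt[sid], memory_order_seq_cst);
	DEMO_ATOMIC_V(q->refcnt[sid]);

	return busy_inside;
}
```

Because the markers start at zero, no separate initialization loop is required before the first P.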
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/failsafe/failsafe_ops.c | 12 +++++-----
drivers/net/failsafe/failsafe_private.h | 29 ++++++++++++++-----------
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
3 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index ddc8808ebe..fcb0051777 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -11,7 +11,7 @@
#endif
#include <rte_debug.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <ethdev_driver.h>
#include <rte_malloc.h>
#include <rte_flow.h>
@@ -440,14 +440,13 @@ fs_rx_queue_setup(struct rte_eth_dev *dev,
}
rxq = rte_zmalloc(NULL,
sizeof(*rxq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (rxq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&rxq->refcnt[i]);
+
rxq->qid = rx_queue_id;
rxq->socket_id = socket_id;
rxq->info.mp = mb_pool;
@@ -617,14 +616,13 @@ fs_tx_queue_setup(struct rte_eth_dev *dev,
}
txq = rte_zmalloc("ethdev TX queue",
sizeof(*txq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (txq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&txq->refcnt[i]);
+
txq->qid = tx_queue_id;
txq->socket_id = socket_id;
txq->info.conf = *tx_conf;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index babea6016e..89b06f9756 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -10,7 +10,7 @@
#include <sys/queue.h>
#include <pthread.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <dev_driver.h>
#include <ethdev_driver.h>
#include <rte_devargs.h>
@@ -75,7 +75,7 @@ struct rxq {
int event_fd;
unsigned int enable_events:1;
struct rte_eth_rxq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct txq {
@@ -83,7 +83,7 @@ struct txq {
uint16_t qid;
unsigned int socket_id;
struct rte_eth_txq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct rte_flow {
@@ -320,33 +320,36 @@ extern int failsafe_mac_from_arg;
*/
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_P(a) \
- rte_atomic64_set(&(a), 1)
+ rte_atomic_exchange_explicit(&(a), 1, rte_memory_order_acquire)
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_V(a) \
- rte_atomic64_set(&(a), 0)
+ rte_atomic_store_explicit(&(a), 0, rte_memory_order_release)
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_RX(s, i) \
- rte_atomic64_read( \
- &((struct rxq *) \
- (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct rxq *) \
+ (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
+
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_TX(s, i) \
- rte_atomic64_read( \
- &((struct txq *) \
- (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct txq *) \
+ (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
#ifdef RTE_EXEC_ENV_FREEBSD
#define FS_THREADID_TYPE void*
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..500483bda3 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -3,7 +3,7 @@
* Copyright 2017 Mellanox Technologies, Ltd
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_debug.h>
#include <rte_mbuf.h>
#include <ethdev_driver.h>
--
2.53.0
* [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (7 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
@ 2026-05-23 19:16 ` Stephen Hemminger
8 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:16 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomic64 datatype and functions are deprecated.
This driver used them only for error statistics, where atomic
updates are not necessary. The DPDK PMD model is that statistics
do not have to be exact in the face of contention.
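The trade-off described above can be sketched in plain C. This is only an illustration, not the driver code: `soft_stats_sketch` and its helpers are hypothetical names. A plain 64-bit counter is bumped without an atomic read-modify-write, so increments racing from several threads may occasionally be lost, which the statistics model tolerates.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a driver's software statistics block. */
struct soft_stats_sketch {
	uint64_t rx_nombuf;
	uint64_t rx_packet_errors;
	uint64_t tx_oversized;
};

/* Plain increment: cheap, but not an atomic read-modify-write. */
static inline void
stats_count_rx_error(struct soft_stats_sketch *s)
{
	++s->rx_packet_errors;
}

/* Reset every counter at once; no per-field atomic clear is needed. */
static inline void
stats_reset(struct soft_stats_sketch *s)
{
	memset(s, 0, sizeof(*s));
}
```

If exact counts were ever required, per-queue counters summed at read time would be one way to avoid both the lost updates and the cost of atomics.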
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/enic/enic.h | 6 +++---
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +++++++----------
drivers/net/enic/enic_rxtx.c | 14 ++++++--------
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 ++--
5 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 87f6b35fcd..0a8d4a29ca 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -59,9 +59,9 @@
#define ENICPMD_RXQ_INTR_OFFSET 1
struct enic_soft_stats {
- rte_atomic64_t rx_nombuf;
- rte_atomic64_t rx_packet_errors;
- rte_atomic64_t tx_oversized;
+ uint64_t rx_nombuf;
+ uint64_t rx_packet_errors;
+ uint64_t tx_oversized;
};
struct enic_memzone_entry {
diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 7cff6831b9..3ce4299e81 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -9,7 +9,6 @@
#include <stdio.h>
#include <unistd.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
#include <rte_io.h>
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 2696fa77d4..fb9a5754c9 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -83,17 +83,15 @@ static void enic_log_q_error(struct enic *enic)
static void enic_clear_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_clear(&soft_stats->rx_nombuf);
- rte_atomic64_clear(&soft_stats->rx_packet_errors);
- rte_atomic64_clear(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
}
static void enic_init_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_init(&soft_stats->rx_nombuf);
- rte_atomic64_init(&soft_stats->rx_packet_errors);
- rte_atomic64_init(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
enic_clear_soft_stats(enic);
}
@@ -132,7 +130,7 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
* counted in ibytes even though truncated packets are dropped
* which can make ibytes be slightly higher than it should be.
*/
- rx_packet_errors = rte_atomic64_read(&soft_stats->rx_packet_errors);
+ rx_packet_errors = soft_stats->rx_packet_errors;
rx_truncated = rx_packet_errors - stats->rx.rx_errors;
r_stats->ipackets = stats->rx.rx_frames_ok - rx_truncated;
@@ -142,12 +140,11 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
r_stats->obytes = stats->tx.tx_bytes_ok;
r_stats->ierrors = stats->rx.rx_errors + stats->rx.rx_drop;
- r_stats->oerrors = stats->tx.tx_errors
- + rte_atomic64_read(&soft_stats->tx_oversized);
+ r_stats->oerrors = stats->tx.tx_errors + soft_stats->tx_oversized;
r_stats->imissed = stats->rx.rx_no_bufs + rx_truncated;
- r_stats->rx_nombuf = rte_atomic64_read(&soft_stats->rx_nombuf);
+ r_stats->rx_nombuf = soft_stats->rx_nombuf;
return 0;
}
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 549a153332..c87d947b93 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -112,7 +112,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
/* allocate a new mbuf */
nmb = rte_mbuf_raw_alloc(rq->mp);
if (nmb == NULL) {
- rte_atomic64_inc(&enic->soft_stats.rx_nombuf);
+ ++enic->soft_stats.rx_nombuf;
break;
}
@@ -185,7 +185,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
}
if (unlikely(packet_error)) {
rte_pktmbuf_free(first_seg);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
continue;
}
@@ -303,7 +303,7 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
cqd++;
continue;
}
@@ -505,14 +505,12 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint8_t offload_mode;
uint16_t header_len;
uint64_t tso;
- rte_atomic64_t *tx_oversized;
enic_cleanup_wq(enic, wq);
wq_desc_avail = vnic_wq_desc_avail(wq);
head_idx = wq->head_idx;
desc_count = wq->ring.desc_count;
ol_flags_mask = RTE_MBUF_F_TX_VLAN | RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK;
- tx_oversized = &enic->soft_stats.tx_oversized;
nb_pkts = RTE_MIN(nb_pkts, ENIC_TX_XMIT_MAX);
@@ -527,7 +525,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
/* drop packet if it's too big to send */
if (unlikely(!tso && pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -558,7 +556,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
if (unlikely(header_len == 0 || ((tx_pkt->tso_segsz +
header_len) > ENIC_TX_MAX_PKT_SIZE))) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -681,7 +679,7 @@ static void enqueue_simple_pkts(struct rte_mbuf **pkts,
*/
if (unlikely(p->pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
desc->length = ENIC_TX_MAX_PKT_SIZE;
- rte_atomic64_inc(&enic->soft_stats.tx_oversized);
+ ++enic->soft_stats.tx_oversized;
}
desc++;
}
diff --git a/drivers/net/enic/enic_rxtx_vec_avx2.c b/drivers/net/enic/enic_rxtx_vec_avx2.c
index 600efff270..53589ab788 100644
--- a/drivers/net/enic/enic_rxtx_vec_avx2.c
+++ b/drivers/net/enic/enic_rxtx_vec_avx2.c
@@ -81,7 +81,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
@@ -761,7 +761,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
--
2.53.0
* [PATCH v3 00/27] deprecate rte_atomicNN family
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (8 preceding siblings ...)
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
` (26 more replies)
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
10 siblings, 27 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomicNN_* family was flagged for deprecation in 2021 by
commit 3ec965b6de12 ("doc: update atomic operation deprecation")
but enforcement never landed and in-tree usage continued to grow.
v2 covered the EAL changes, lib/ring, and a starter set of drivers.
This series finishes the job: convert every remaining in-tree
caller to the C11-style rte_atomic_*_explicit() / RTE_ATOMIC()
API, then mark the legacy functions __rte_deprecated so future
in-tree and out-of-tree uses are caught at compile time.
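As a rough picture of what such a conversion looks like, here is a minimal sketch of a reference count. It uses C11 <stdatomic.h> directly as a stand-in for the rte_atomic_*_explicit() / RTE_ATOMIC() wrappers so that it compiles on its own; `obj_sketch`, `obj_get` and `obj_put` are hypothetical names, and the memory orders shown are the usual choice for a refcount, not something taken from any patch in this series.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical object; in DPDK the field would be RTE_ATOMIC(uint32_t). */
struct obj_sketch {
	_Atomic uint32_t refcnt;
};

/* Legacy style would be rte_atomic32_inc(), which is implicitly seq_cst. */
static inline void
obj_get(struct obj_sketch *o)
{
	/* The caller already holds a reference, so no ordering is needed. */
	atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Legacy style would be rte_atomic32_dec_and_test(). */
static inline bool
obj_put(struct obj_sketch *o)
{
	/*
	 * fetch_sub returns the old value, so 1 means the last reference
	 * was dropped. acq_rel orders earlier accesses to the object
	 * before whatever teardown the caller does next.
	 */
	return atomic_fetch_sub_explicit(&o->refcnt, 1,
					 memory_order_acq_rel) == 1;
}
```

The point of the explicit API is visible here: each call site states the ordering it actually needs instead of inheriting a full barrier.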
Performance: ran the DPDK perf-tests suite (mempool, hash, stack,
ring, distributor, rcu_qsbr, etc.) on the full series; only
lib/ring showed a regression, addressed by the wrapper in patch 03.
Patch organisation
==================
01-02 EAL: drop the inline-asm fallback paths now that intrinsics
work on all platforms; reimplement rte_smp_*mb on top of
rte_atomic_thread_fence.
03-04 lib/ring and lib/bpf -- the last legacy callers in lib/.
05-25 Drivers and selftests, one patch per directory.
26 Suppress deprecation warnings in app/test/test_atomic.c,
which exercises the legacy API until it goes away.
27 Mark rte_atomicNN_* with __rte_deprecated and drop the
corresponding checkpatch grep; new uses are now caught
at compile time.
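For readers unfamiliar with the fence-based form used in patch 02, the idea can be sketched with C11 <stdatomic.h>. The mapping of each barrier to a memory order below is an assumption made for illustration and is not copied from the patch; `mailbox_sketch` and the `*_sketch` helpers are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed mapping, for illustration only. */
static inline void smp_mb_sketch(void)  { atomic_thread_fence(memory_order_seq_cst); }
static inline void smp_wmb_sketch(void) { atomic_thread_fence(memory_order_release); }
static inline void smp_rmb_sketch(void) { atomic_thread_fence(memory_order_acquire); }

/* Hypothetical one-slot mailbox: the payload is published before the flag. */
struct mailbox_sketch {
	uint32_t payload;
	_Atomic uint32_t ready;
};

static inline void
mailbox_post(struct mailbox_sketch *m, uint32_t value)
{
	m->payload = value;
	smp_wmb_sketch();	/* payload store ordered before the flag store */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

static inline int
mailbox_poll(struct mailbox_sketch *m, uint32_t *value)
{
	if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
		return 0;
	smp_rmb_sketch();	/* flag load ordered before the payload load */
	*value = m->payload;
	return 1;
}
```

Keeping the barrier-style names as thin wrappers lets existing call sites read the same while the compiler sees ordinary fences.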
Changes since v2
================
Scope: v2 stopped at crypto/ccp (11 patches). v3 adds:
04 lib/bpf -- the bpf interpreter's atomic op macro
13-25 Remaining driver/bus/event/vdpa conversions
26-27 Test-suite warning suppression and the actual
__rte_deprecated marking
Substantive changes to patches that were in v2:
02 Also drop the rte_smp_*mb forbidden-token check from
devtools/checkpatches.sh, since the API is no longer
on a deprecation cycle.
03 lib/ring -- keep most of the original code, introduce a wrapper
for the one performance-sensitive CAS. This fixes the
20-30% drop in the ring_perf test on x86 that was observed
when using atomic_compare_exchange_weak_explicit() with GCC.
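The shape of such a wrapper can be sketched as follows. This is not the code from patch 03: `ring_cas_sketch` and `ring_move_head_sketch` are hypothetical names, C11 <stdatomic.h> stands in for the DPDK wrappers, the memory orders are simplified, and the free-space check a real ring performs is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper: one place to choose how the hot CAS is emitted. */
static inline bool
ring_cas_sketch(_Atomic uint32_t *head, uint32_t *old_head, uint32_t new_head)
{
	/* Strong variant; on failure *old_head is refreshed with the current value. */
	return atomic_compare_exchange_strong_explicit(head, old_head, new_head,
			memory_order_acquire, memory_order_relaxed);
}

/* Reserve n slots by advancing head; returns the first reserved index. */
static inline uint32_t
ring_move_head_sketch(_Atomic uint32_t *head, uint32_t n)
{
	uint32_t old_head = atomic_load_explicit(head, memory_order_relaxed);

	while (!ring_cas_sketch(head, &old_head, old_head + n))
		; /* old_head was reloaded by the failed CAS; retry */

	return old_head;
}
```

Routing the CAS through a single helper means the choice between the weak form, the strong form or a compiler builtin can be changed in one place if a particular compiler generates poor code for one of them.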
Feedback wanted
===============
Series is targeting 26.11 rather than the next release. The driver
conversions touch many maintainers' code and several are likely to
need cycles of review/respin; a longer review window avoids rushing
contested orderings into an earlier release.
- vmbus producer commit-order pattern (patch 17)
- the ring CAS workaround for the GCC issue may also be needed for
the similar ring buffer code in vmbus and netvsc.
- Dekker-style seq_cst handshake in net/vhost (patch 24), which
also closes a pre-existing ordering hole on weakly-ordered ISAs
- netvsc rndis_pending claim/timeout/clear cmpxchg orderings
(patch 15)
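For reviewers weighing the Dekker-style item above, the general pattern can be sketched as follows. This is a generic illustration, not the net/vhost code from patch 24: the structure, field and function names are hypothetical, and C11 <stdatomic.h> stands in for the DPDK wrappers. Each side stores its own flag and then loads the other side's flag. Acquire/release alone would allow that store and load to be reordered, which is why both are seq_cst.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state; field names are illustrative. */
struct queue_sketch {
	_Atomic uint32_t allow_queuing;	/* written by the control path */
	_Atomic uint32_t while_queuing;	/* written by the datapath */
};

/* Datapath: announce entry, then check permission. */
static inline bool
datapath_enter(struct queue_sketch *q)
{
	atomic_store_explicit(&q->while_queuing, 1, memory_order_seq_cst);
	if (atomic_load_explicit(&q->allow_queuing, memory_order_seq_cst) == 0) {
		atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
		return false;
	}
	return true;
}

/* Datapath: signal that the burst is finished. */
static inline void
datapath_leave(struct queue_sketch *q)
{
	atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
}

/* Control path: revoke permission, then wait for the datapath to drain. */
static inline void
control_stop(struct queue_sketch *q)
{
	atomic_store_explicit(&q->allow_queuing, 0, memory_order_seq_cst);
	while (atomic_load_explicit(&q->while_queuing, memory_order_seq_cst) != 0)
		; /* spin until any in-flight burst has left */
}
```

With both store-then-load pairs seq_cst, at least one side is guaranteed to observe the other's store, so the control path cannot return while a burst that saw permission is still running.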
Stephen Hemminger (27):
eal: use intrinsics for rte_atomic on all platforms
eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
ring: use compare-and-swap wrapper
bpf: replace atomic op macro with typed helpers
net/bonding: use stdatomic
net/nbl: remove unused rte_atomic16 field
net/ena: replace use of rte_atomicNN
net/failsafe: convert to stdatomic
net/enic: do not use deprecated rte_atomic64
net/pfe: use ethdev linkstatus helpers
net/sfc: replace rte_atomic with stdatomic
crypto/ccp: replace use of rte_atomic64 with stdatomic
bus/dpaa: replace rte_atomic16 with stdatomic
drivers: replace rte_atomic16 with stdatomic
net/netvsc: replace rte_atomic32 with stdatomic
event/sw: convert from rte_atomic32 to stdatomic
bus/vmbus: convert from rte_atomic to stdatomic
common/dpaax: remove unused atomic macros
net/bnx2x: convert from rte_atomic32 to stdatomic
bus/fslmc: replace rte_atomic32 with stdatomic
drivers/event: replace rte_atomic32 in selftests
net/hinic: replace rte_atomic32 with stdatomic
net/txgbe: replace rte_atomic32 with stdatomic
net/vhost: use stdatomic instead of rte_atomic32
vdpa/ifc: replace rte_atomic32 with stdatomic
test/atomic: suppress deprecation warnings for legacy APIs
eal: mark rte_atomicNN as deprecated
app/test/test_atomic.c | 12 +
devtools/checkpatches.sh | 16 -
doc/guides/rel_notes/deprecation.rst | 12 +-
doc/guides/rel_notes/release_26_07.rst | 4 +
drivers/bus/dpaa/base/qbman/qman.c | 9 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 12 +-
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 +-
drivers/bus/fslmc/qbman/include/compat.h | 21 +-
drivers/bus/vmbus/private.h | 2 +-
drivers/bus/vmbus/vmbus_bufring.c | 39 ++-
drivers/common/dpaax/compat.h | 14 -
drivers/crypto/ccp/ccp_crypto.c | 11 +-
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 +-
drivers/crypto/ccp/ccp_dev.h | 4 +-
drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 26 +-
drivers/event/dpaa2/dpaa2_hw_dpcon.c | 11 +-
drivers/event/octeontx/ssovf_evdev_selftest.c | 58 ++--
drivers/event/sw/sw_evdev.c | 8 +-
drivers/event/sw/sw_evdev.h | 4 +-
drivers/event/sw/sw_evdev_worker.c | 16 +-
drivers/net/bnx2x/bnx2x.c | 6 +-
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/ecore_sp.c | 6 +-
drivers/net/bonding/eth_bond_8023ad_private.h | 6 +-
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 +-
drivers/net/ena/base/ena_plat_dpdk.h | 14 +-
drivers/net/ena/ena_ethdev.c | 21 +-
drivers/net/ena/ena_ethdev.h | 7 +-
drivers/net/enic/enic.h | 6 +-
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +-
drivers/net/enic/enic_rxtx.c | 14 +-
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 +-
drivers/net/failsafe/failsafe_ops.c | 12 +-
drivers/net/failsafe/failsafe_private.h | 29 +-
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
drivers/net/hinic/base/hinic_compat.h | 2 +-
drivers/net/hinic/base/hinic_pmd_hwdev.c | 24 +-
drivers/net/hinic/base/hinic_pmd_hwdev.h | 4 +-
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
drivers/net/netvsc/hn_rndis.c | 28 +-
drivers/net/netvsc/hn_rxtx.c | 12 +-
drivers/net/netvsc/hn_var.h | 6 +-
drivers/net/pfe/pfe_ethdev.c | 32 +-
drivers/net/sfc/sfc.c | 9 +-
drivers/net/sfc/sfc.h | 4 +-
drivers/net/sfc/sfc_port.c | 7 +-
drivers/net/sfc/sfc_stats.h | 2 +-
drivers/net/txgbe/base/txgbe_mng.c | 4 +-
drivers/net/txgbe/base/txgbe_type.h | 2 +-
drivers/net/vhost/rte_eth_vhost.c | 103 +++---
drivers/vdpa/ifc/ifcvf_vdpa.c | 37 +--
lib/bpf/bpf_exec.c | 91 ++++--
lib/eal/arm/include/rte_atomic_32.h | 10 -
lib/eal/arm/include/rte_atomic_64.h | 10 -
lib/eal/include/generic/rte_atomic.h | 305 +++++-------------
lib/eal/loongarch/include/rte_atomic.h | 10 -
lib/eal/ppc/include/rte_atomic.h | 179 ----------
lib/eal/riscv/include/rte_atomic.h | 10 -
lib/eal/x86/include/rte_atomic.h | 205 +-----------
lib/eal/x86/include/rte_atomic_32.h | 188 -----------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------
lib/ring/rte_ring_generic_pvt.h | 32 +-
66 files changed, 585 insertions(+), 1390 deletions(-)
--
2.53.0
* [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
` (25 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The next step is to deprecate the rte_atomicNN_*() family. Rather
than maintaining both the inline asm and the intrinsic
implementations, drop the asm paths and use intrinsics everywhere.
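To make the intrinsic form concrete, here is a minimal sketch of the legacy compare-and-set contract written without inline asm. It uses C11 <stdatomic.h> rather than the __sync builtin that the generic header uses, purely so that it is self-contained; the `*_sketch` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Same contract as the legacy cmpset: nonzero on success, *dst unchanged
 * on failure. The expected value is passed by value, so the caller never
 * sees it updated (unlike the raw C11 call).
 */
static inline int
cmpset32_sketch(_Atomic uint32_t *dst, uint32_t exp, uint32_t src)
{
	return atomic_compare_exchange_strong_explicit(dst, &exp, src,
			memory_order_seq_cst, memory_order_seq_cst);
}

/* test_and_set expressed as "swap 0 -> 1", mirroring the generic header. */
static inline int
test_and_set32_sketch(_Atomic uint32_t *flag)
{
	return cmpset32_sketch(flag, 0, 1);
}
```

The compiler picks the appropriate instruction sequence for each target, which is what lets the per-architecture asm copies go away.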
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/arm/include/rte_atomic_32.h | 4 -
lib/eal/arm/include/rte_atomic_64.h | 4 -
lib/eal/include/generic/rte_atomic.h | 76 +---------
lib/eal/loongarch/include/rte_atomic.h | 4 -
lib/eal/ppc/include/rte_atomic.h | 173 -----------------------
lib/eal/riscv/include/rte_atomic.h | 4 -
lib/eal/x86/include/rte_atomic.h | 172 ----------------------
lib/eal/x86/include/rte_atomic_32.h | 188 -------------------------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------------------
9 files changed, 6 insertions(+), 776 deletions(-)
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 0b9a0dfa30..696a539fef 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -5,10 +5,6 @@
#ifndef _RTE_ATOMIC_ARM32_H_
#define _RTE_ATOMIC_ARM32_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#ifdef __cplusplus
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 181bb60929..9f790238df 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -6,10 +6,6 @@
#ifndef _RTE_ATOMIC_ARM64_H_
#define _RTE_ATOMIC_ARM64_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#include <rte_branch_prediction.h>
#include <rte_debug.h>
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 0a4f3f8528..292e52fade 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -187,13 +187,11 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -211,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* The original value at that location
*/
static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -312,13 +306,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
static inline void
rte_atomic16_inc(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -329,13 +321,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
static inline void
rte_atomic16_dec(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
}
-#endif
/**
* Atomically add a 16-bit value to a counter and return the result.
@@ -391,13 +381,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
*/
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 16-bit counter by one and test.
@@ -412,13 +400,11 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 16-bit atomic counter.
@@ -433,12 +419,10 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 16-bit counter to 0.
@@ -472,13 +456,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -496,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* The original value at that location
*/
static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -597,13 +575,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
static inline void
rte_atomic32_inc(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -614,13 +590,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
static inline void
rte_atomic32_dec(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
}
-#endif
/**
* Atomically add a 32-bit value to a counter and return the result.
@@ -676,13 +650,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
*/
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 32-bit counter by one and test.
@@ -697,13 +669,11 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 32-bit atomic counter.
@@ -718,12 +688,10 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 32-bit counter to 0.
@@ -756,13 +724,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -780,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* The original value at that location
*/
static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -811,7 +773,6 @@ typedef struct {
static inline void
rte_atomic64_init(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -828,7 +789,6 @@ rte_atomic64_init(rte_atomic64_t *v)
}
#endif
}
-#endif
/**
* Atomically read a 64-bit counter.
@@ -841,7 +801,6 @@ rte_atomic64_init(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -860,7 +819,6 @@ rte_atomic64_read(rte_atomic64_t *v)
return tmp;
#endif
}
-#endif
/**
* Atomically set a 64-bit counter.
@@ -873,7 +831,6 @@ rte_atomic64_read(rte_atomic64_t *v)
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -890,7 +847,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
}
#endif
}
-#endif
/**
* Atomically add a 64-bit value to a counter.
@@ -903,14 +859,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically subtract a 64-bit value from a counter.
@@ -923,14 +877,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -941,13 +893,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
static inline void
rte_atomic64_inc(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -958,13 +908,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
static inline void
rte_atomic64_dec(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
}
-#endif
/**
* Add a 64-bit value to an atomic counter and return the result.
@@ -982,14 +930,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst) + inc;
}
-#endif
/**
* Subtract a 64-bit value from an atomic counter and return the result.
@@ -1007,14 +953,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst) - dec;
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -1029,12 +973,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
*/
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -1049,12 +991,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
-#endif
/**
* Atomically test and set a 64-bit atomic counter.
@@ -1069,12 +1009,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 64-bit counter to 0.
@@ -1084,12 +1022,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
*/
static inline void rte_atomic64_clear(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
-#endif
#endif
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index c8066a4612..785a452c9e 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -5,10 +5,6 @@
#ifndef RTE_ATOMIC_LOONGARCH_H
#define RTE_ATOMIC_LOONGARCH_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <rte_common.h>
#include "generic/rte_atomic.h"
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 10acc238f9..64f4c3d670 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -43,179 +43,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
}
/*------------------------- 16 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
-}
-
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 66346ad474..061b175f33 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -8,10 +8,6 @@
#ifndef RTE_ATOMIC_RISCV_H
#define RTE_ATOMIC_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index e071e4234e..4f05302c9f 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -111,178 +111,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgw %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgw %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "incw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "decw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgl %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgl %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "incl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "decl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
index 0f25863aa5..37d139f30d 100644
--- a/lib/eal/x86/include/rte_atomic_32.h
+++ b/lib/eal/x86/include/rte_atomic_32.h
@@ -20,193 +20,5 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
- union {
- struct {
- uint32_t l32;
- uint32_t h32;
- };
- uint64_t u64;
- } _exp, _src;
-
- _exp.u64 = exp;
- _src.u64 = src;
-
-#ifndef __PIC__
- asm volatile (
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "b" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#else
- asm volatile (
- "xchgl %%ebx, %%edi;\n"
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- "xchgl %%ebx, %%edi;\n"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "D" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#endif
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
-{
- uint64_t old;
-
- do {
- old = *dest;
- } while (rte_atomic64_cmpset(dest, old, val) == 0);
-
- return old;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, 0);
- }
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- /* replace the value by itself */
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp);
- }
- return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, new_value);
- }
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-
- return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-
- return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- rte_atomic64_set(v, 0);
-}
-#endif
#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
index 0a7a2131e0..1cd12695a2 100644
--- a/lib/eal/x86/include/rte_atomic_64.h
+++ b/lib/eal/x86/include/rte_atomic_64.h
@@ -22,163 +22,6 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
-
-
- asm volatile(
- MPLOCKED
- "cmpxchgq %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgq %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- asm volatile(
- MPLOCKED
- "addq %[inc], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [inc] "ir" (inc), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- asm volatile(
- MPLOCKED
- "subq %[dec], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [dec] "ir" (dec), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "incq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "decq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int64_t prev = inc;
-
- asm volatile(
- MPLOCKED
- "xaddq %[prev], %[cnt]"
- : [prev] "+r" (prev), /* output */
- [cnt] "=m" (v->cnt)
- : "m" (v->cnt) /* input */
- );
- return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
-
- return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "decq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-#endif
/*------------------------ 128 bit atomic operations -------------------------*/
--
2.53.0
* [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
` (24 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Thomas Monjalon, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
operation deprecation") in 2021 but nothing came of it.
Reimplement them as inline wrappers over rte_atomic_thread_fence()
and drop the deprecation notice.
The API is preserved; only the implementation changes.
Generated code is unchanged on x86 (seq_cst keeps the lock-addl
trick, release/acquire collapse to a compiler barrier under TSO).
On arm64, release/acquire emit dmb ish instead of dmb ishst/ishld;
the difference is below measurement noise.
Drop the restriction from checkpatches.sh, since these functions are
no longer on a deprecation cycle.
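For readers unfamiliar with the mapping, here is a minimal sketch of the
same idea in portable C11, assuming <stdatomic.h> is available. The
sketch_* names, the payload variable and the ready flag are hypothetical
and are not part of DPDK; the real wrappers call
rte_atomic_thread_fence() as shown in the diff below.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_smp_mb/wmb/rmb, built on C11 fences. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static uint32_t payload;	/* plain data guarded by the flag */
static atomic_uint ready;	/* publication flag, starts at 0 */

/* Producer: store the data, then publish it by setting the flag. */
static void sketch_publish(uint32_t value)
{
	payload = value;
	sketch_smp_wmb();	/* order the data store before the flag store */
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag is seen, else 0. */
static int sketch_consume(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	sketch_smp_rmb();	/* order the flag load before the data load */
	*out = payload;
	return 1;
}
```

The release fence pairs with the acquire fence through the relaxed flag
accesses, which is the usual pattern the smp write/read barriers are
used for.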
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
devtools/checkpatches.sh | 8 --
doc/guides/rel_notes/deprecation.rst | 8 --
lib/eal/arm/include/rte_atomic_32.h | 6 --
lib/eal/arm/include/rte_atomic_64.h | 6 --
lib/eal/include/generic/rte_atomic.h | 130 +++++--------------------
lib/eal/loongarch/include/rte_atomic.h | 6 --
lib/eal/ppc/include/rte_atomic.h | 6 --
lib/eal/riscv/include/rte_atomic.h | 6 --
lib/eal/x86/include/rte_atomic.h | 33 +++----
9 files changed, 37 insertions(+), 172 deletions(-)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index f5dd77443f..81bb0fe4e8 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -121,14 +121,6 @@ check_forbidden_additions() { # <patch>
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
- # refrain from new additions of rte_smp_[r/w]mb()
- awk -v FOLDERS="lib drivers app examples" \
- -v EXPRESSIONS="rte_smp_(r|w)?mb\\\(" \
- -v RET_ON_FAIL=1 \
- -v MESSAGE='Using rte_smp_[r/w]mb' \
- -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
- "$1" || res=1
-
# refrain from using compiler __sync_xxx builtins
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="__sync_.*\\\(" \
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 35c9b4e06c..2190419f79 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -47,14 +47,6 @@ Deprecation Notices
operations must be used for patches that need to be merged in 20.08 onwards.
This change will not introduce any performance degradation.
-* rte_smp_*mb: These APIs provide full barrier functionality. However, many
- use cases do not require full barriers. To support such use cases, DPDK has
- adopted atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations and a new wrapper ``rte_atomic_thread_fence`` instead of
- ``__atomic_thread_fence`` must be used for patches that need to be merged in
- 20.08 onwards. This change will not introduce any performance degradation.
-
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
used by iterators, and arrays holding these values are sized with this
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 696a539fef..4115271091 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -17,12 +17,6 @@ extern "C" {
#define rte_rmb() __sync_synchronize()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 9f790238df..604e777bcd 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -20,12 +20,6 @@ extern "C" {
#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
-#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
-
-#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-
-#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 292e52fade..1b04b43cbb 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -59,55 +59,25 @@ static inline void rte_rmb(void);
*
* Guarantees that the LOAD and STORE operations that precede the
* rte_smp_mb() call are globally visible across the lcores
- * before the LOAD and STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
+ * before the LOAD and STORE operations that follow it.
*/
static inline void rte_smp_mb(void);
/**
* Write memory barrier between lcores
*
- * Guarantees that the STORE operations that precede the
- * rte_smp_wmb() call are globally visible across the lcores
- * before the STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores before
+ * any STORE operations that follow it.
*/
static inline void rte_smp_wmb(void);
/**
* Read memory barrier between lcores
*
- * Guarantees that the LOAD operations that precede the
- * rte_smp_rmb() call are globally visible across the lcores
- * before the LOAD operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that any LOAD operations that precede the rte_smp_rmb()
+ * call complete before LOAD and STORE operations that follow it
+ * become globally visible.
*/
static inline void rte_smp_rmb(void);
///@}
@@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
*/
static inline void rte_atomic_thread_fence(rte_memory_order memorder);
+static __rte_always_inline void
+rte_smp_mb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_seq_cst);
+}
+
+static __rte_always_inline void
+rte_smp_wmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_release);
+}
+
+static __rte_always_inline void
+rte_smp_rmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+}
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -184,9 +172,6 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
@@ -303,9 +288,6 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v);
-
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
@@ -318,9 +300,6 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v);
-
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
@@ -379,8 +358,6 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -398,8 +375,6 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -417,8 +392,6 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
@@ -453,9 +426,6 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
@@ -572,9 +542,6 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v);
-
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
@@ -587,9 +554,6 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v);
-
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
@@ -648,8 +612,6 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -667,8 +629,6 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -686,8 +646,6 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
@@ -721,9 +679,6 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
@@ -770,9 +725,6 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -798,9 +750,6 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -828,9 +777,6 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -856,9 +802,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
@@ -874,9 +817,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
@@ -890,9 +830,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
@@ -905,9 +842,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
@@ -927,9 +861,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
@@ -950,9 +881,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
@@ -971,8 +899,6 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
@@ -989,8 +915,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
@@ -1007,8 +931,6 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
@@ -1020,8 +942,6 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v);
-
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index 785a452c9e..a789e3ab4d 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -18,12 +18,6 @@ extern "C" {
#define rte_rmb() rte_mb()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_mb()
-
-#define rte_smp_rmb() rte_mb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_mb()
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 64f4c3d670..0e64db2a35 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("sync" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 061b175f33..04c40e4e9b 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -23,12 +23,6 @@ extern "C" {
#define rte_rmb() asm volatile("fence r, r" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
#define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index 4f05302c9f..f4d39ce4fe 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -23,10 +23,6 @@
#define rte_rmb() _mm_lfence()
-#define rte_smp_wmb() rte_compiler_barrier()
-
-#define rte_smp_rmb() rte_compiler_barrier()
-
#ifdef __cplusplus
extern "C" {
#endif
@@ -63,20 +59,6 @@ extern "C" {
* So below we use that technique for rte_smp_mb() implementation.
*/
-static __rte_always_inline void
-rte_smp_mb(void)
-{
-#ifdef RTE_TOOLCHAIN_MSVC
- _mm_mfence();
-#else
-#ifdef RTE_ARCH_I686
- asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
-#else
- asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
-#endif
-#endif
-}
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_compiler_barrier()
@@ -93,10 +75,19 @@ rte_smp_mb(void)
static __rte_always_inline void
rte_atomic_thread_fence(rte_memory_order memorder)
{
- if (memorder == rte_memory_order_seq_cst)
- rte_smp_mb();
- else
+ if (memorder == rte_memory_order_seq_cst) {
+#ifdef RTE_TOOLCHAIN_MSVC
+ _mm_mfence();
+#else
+#ifdef RTE_ARCH_I686
+ asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+#endif
+ } else {
__rte_atomic_thread_fence(memorder);
+ }
}
#ifdef __cplusplus
--
2.53.0
* [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
` (23 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Wathsala Vithanage
rte_atomic32_cmpset() is deprecated. A direct conversion to
rte_atomic_compare_exchange_weak_explicit() was tried first, but it
regressed MP/MC contended performance on x86 by 10-30%, because the
C11 builtin's failure-writeback semantics force GCC to emit extra
instructions on the CAS critical path.
Add an internal __rte_ring_compare_and_swap() wrapper that calls
__sync_bool_compare_and_swap() directly, which keeps the original
instruction sequence. Add an equivalent implementation for MSVC.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
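For reviewers, a minimal sketch of the two retry-loop shapes the commit
message contrasts. It is plain C11 plus the GCC/Clang __sync builtin, not
the ring code itself; cas_no_writeback(), claim_sync() and claim_c11() are
hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the non-MSVC branch of the wrapper. */
static inline bool
cas_no_writeback(volatile uint32_t *dst, uint32_t expected, uint32_t desired)
{
	return __sync_bool_compare_and_swap(dst, expected, desired);
}

/* Claim n slots; the head is reloaded explicitly on every iteration. */
static uint32_t
claim_sync(volatile uint32_t *head, uint32_t n)
{
	uint32_t old;

	do {
		old = *head;	/* the CAS never updates old for us */
	} while (!cas_no_writeback(head, old, old + n));

	return old;
}

/* Same operation in C11; a failed CAS writes the observed value to old. */
static uint32_t
claim_c11(_Atomic uint32_t *head, uint32_t n)
{
	uint32_t old = atomic_load_explicit(head, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(head, &old, old + n,
			memory_order_seq_cst, memory_order_relaxed))
		;	/* old already holds the fresh value */

	return old;
}
```

Both loops return the head value they claimed from. The difference is only
in who refreshes the expected value after a failed attempt.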
lib/ring/rte_ring_generic_pvt.h | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
index affd2d5ba7..0fb972de9e 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_generic_pvt.h
@@ -18,6 +18,30 @@
* For more information please refer to <rte_ring.h>.
*/
+/**
+ * @internal optimized version of compare exchange
+ *
+ * The C11 builtin's failure-writeback semantic generates worse code on x86.
+ * Unlike rte_atomic_compare_exchange_*_explicit(), this wrapper does NOT
+ * write the actual value back to a pointer on failure. Callers in a retry
+ * loop must reload the expected value explicitly on the next iteration.
+ *
+ * Full memory barrier, equivalent to rte_memory_order_seq_cst on both
+ * success and failure.
+ */
+static __rte_always_inline bool
+__rte_ring_compare_and_swap(volatile uint32_t *dst,
+ uint32_t expected, uint32_t desired)
+{
+#if defined(RTE_TOOLCHAIN_MSVC)
+ return _InterlockedCompareExchange((volatile long *)dst,
+ (long)desired, (long)expected)
+ == (long)expected;
+#else
+ return __sync_bool_compare_and_swap(dst, expected, desired);
+#endif
+}
+
/**
* @internal This function updates tail values.
*/
@@ -108,10 +132,10 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
if (is_st) {
d->head = *new_head;
success = 1;
- } else
- success = rte_atomic32_cmpset(
- (uint32_t *)(uintptr_t)&d->head,
- *old_head, *new_head);
+ } else {
+ success = __rte_ring_compare_and_swap(
+ &d->head, *old_head, *new_head);
+ }
} while (unlikely(success == 0));
return n;
}
--
2.53.0
* [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (2 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-25 10:49 ` Marat Khalili
2026-05-23 19:56 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
` (22 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Marat Khalili
The BPF_ST_ATOMIC_REG macro token-pasted the legacy rte_atomicNN_*()
API names. It also stacked three casts on the destination pointer
and reached a 'return 0' out of the macro into the caller's control
flow.
Replace it with two small static-inline helpers, bpf_atomic32() and
bpf_atomic64(), that dispatch on ins->imm internally and use the C11
atomic intrinsics directly. The destination is cast once, to a
properly __rte_atomic-qualified pointer. The helpers return a status
and the dispatch loop owns the early exit.
Use memory order seq_cst to preserve the previous behavior of
rte_atomicNN_add() / rte_atomicNN_exchange() and to match
the Linux kernel BPF interpreter for these opcodes.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
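A minimal sketch of the helper shape described above, in plain C11 so it
builds outside DPDK. The enum values and demo_atomic32() are hypothetical
stand-ins, not the BPF opcode encoding; the point is only that the helper
reports a status and leaves the early exit to its caller.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sub-op codes, standing in for the BPF atomic opcodes. */
enum demo_op { DEMO_ADD, DEMO_XCHG };

/*
 * Returns 1 on success and 0 on an unsupported sub-op. Unlike a macro
 * with an embedded 'return', it cannot leave the caller's function.
 */
static inline int
demo_atomic32(_Atomic uint32_t *dst, uint64_t *src_reg, int op)
{
	uint32_t val = (uint32_t)*src_reg;

	switch (op) {
	case DEMO_ADD:
		atomic_fetch_add_explicit(dst, val, memory_order_seq_cst);
		return 1;
	case DEMO_XCHG:
		/* the old 32-bit value is zero-extended into the register */
		*src_reg = atomic_exchange_explicit(dst, val,
				memory_order_seq_cst);
		return 1;
	default:
		return 0;
	}
}
```

A caller would then write 'if (demo_atomic32(...) == 0) return 0;', which
keeps the control flow visible at the call site.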
lib/bpf/bpf_exec.c | 91 ++++++++++++++++++++++++++++++++++------------
1 file changed, 67 insertions(+), 24 deletions(-)
diff --git a/lib/bpf/bpf_exec.c b/lib/bpf/bpf_exec.c
index 18013753b1..b8116db191 100644
--- a/lib/bpf/bpf_exec.c
+++ b/lib/bpf/bpf_exec.c
@@ -64,28 +64,6 @@
(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
(type)(reg)[(ins)->src_reg])
-#define BPF_ST_ATOMIC_REG(reg, ins, tp) do { \
- switch (ins->imm) { \
- case BPF_ATOMIC_ADD: \
- rte_atomic##tp##_add((rte_atomic##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
- break; \
- case BPF_ATOMIC_XCHG: \
- (reg)[(ins)->src_reg] = rte_atomic##tp##_exchange((uint##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
- break; \
- default: \
- /* this should be caught by validator and never reach here */ \
- RTE_BPF_LOG_LINE(ERR, \
- "%s(%p): unsupported atomic operation at pc: %#zx;", \
- __func__, bpf, \
- (uintptr_t)(ins) - (uintptr_t)(bpf)->prm.ins); \
- return 0; \
- } \
-} while (0)
-
/* BPF_LD | BPF_ABS/BPF_IND */
#define NOP(x) (x)
@@ -105,6 +83,69 @@
reg[EBPF_REG_0] = op(p[0]); \
} while (0)
+/*
+ * Atomic ops on the BPF target memory.
+ *
+ * BPF atomic instructions encode the destination as base register +
+ * signed offset, with the value to combine taken from src_reg.
+ *
+ * Memory order: seq_cst preserves the previous behavior of
+ * rte_atomicNN_add() / rte_atomicNN_exchange() and matches what the
+ * Linux kernel BPF interpreter does for these opcodes.
+ *
+ * Returns 0 on unsupported sub-op (validator should have rejected it),
+ * 1 otherwise.
+ */
+static inline int
+bpf_atomic32(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM],
+ const struct ebpf_insn *ins)
+{
+ /* casts are needed to make BPF memory suitable for C11 atomics */
+ uint32_t __rte_atomic *dst
+ = (uint32_t __rte_atomic *)(uintptr_t)(reg[ins->dst_reg] + ins->off);
+ uint32_t val = (uint32_t)reg[ins->src_reg];
+
+ switch (ins->imm) {
+ case BPF_ATOMIC_ADD:
+ rte_atomic_fetch_add_explicit(dst, val, rte_memory_order_seq_cst);
+ return 1;
+ case BPF_ATOMIC_XCHG:
+ reg[ins->src_reg] = rte_atomic_exchange_explicit(dst, val,
+ rte_memory_order_seq_cst);
+ return 1;
+ default:
+ RTE_BPF_LOG_LINE(ERR,
+ "%s(%p): unsupported atomic operation at pc: %#zx;",
+ __func__, bpf,
+ (uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+ return 0;
+ }
+}
+
+static inline int
+bpf_atomic64(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM],
+ const struct ebpf_insn *ins)
+{
+ uint64_t __rte_atomic *dst
+ = (uint64_t __rte_atomic *)(uintptr_t) (reg[ins->dst_reg] + ins->off);
+ uint64_t val = reg[ins->src_reg];
+
+ switch (ins->imm) {
+ case BPF_ATOMIC_ADD:
+ rte_atomic_fetch_add_explicit(dst, val, rte_memory_order_seq_cst);
+ return 1;
+ case BPF_ATOMIC_XCHG:
+ reg[ins->src_reg] = rte_atomic_exchange_explicit(dst, val,
+ rte_memory_order_seq_cst);
+ return 1;
+ default:
+ RTE_BPF_LOG_LINE(ERR,
+ "%s(%p): unsupported atomic operation at pc: %#zx;",
+ __func__, bpf,
+ (uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+ return 0;
+ }
+}
static inline void
bpf_alu_be(uint64_t reg[EBPF_REG_NUM], const struct ebpf_insn *ins)
@@ -392,10 +433,12 @@ bpf_exec(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM])
break;
/* atomic instructions */
case (BPF_STX | EBPF_ATOMIC | BPF_W):
- BPF_ST_ATOMIC_REG(reg, ins, 32);
+ if (bpf_atomic32(bpf, reg, ins) == 0)
+ return 0;
break;
case (BPF_STX | EBPF_ATOMIC | EBPF_DW):
- BPF_ST_ATOMIC_REG(reg, ins, 64);
+ if (bpf_atomic64(bpf, reg, ins) == 0)
+ return 0;
break;
/* jump instructions */
case (BPF_JMP | BPF_JA):
--
2.53.0
* [PATCH v3 05/27] net/bonding: use stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (3 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
` (21 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Chas Williams, Min Hu (Connor)
The old rte_atomic16 and rte_atomic64 functions are deprecated.
Replace them with rte_stdatomic operations for managing the warning
flags and the RX marker timer.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
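A minimal sketch of why the two compare-and-set loops collapse into single
operations, written against plain C11 <stdatomic.h> rather than the
rte_stdatomic wrappers. demo_set_flags() and demo_take_flags() are
hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* OR new bits into the flag word; replaces a load + CAS retry loop. */
static inline void
demo_set_flags(_Atomic uint16_t *word, uint16_t flags)
{
	atomic_fetch_or_explicit(word, flags, memory_order_relaxed);
}

/* Fetch all pending bits and clear them in one atomic step. */
static inline uint16_t
demo_take_flags(_Atomic uint16_t *word)
{
	return atomic_exchange_explicit(word, 0, memory_order_relaxed);
}
```

Relaxed ordering is assumed to be enough here because the flag word only
selects which warnings to print and does not publish any other data.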
drivers/net/bonding/eth_bond_8023ad_private.h | 6 ++--
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 ++++++++-----------
2 files changed, 17 insertions(+), 24 deletions(-)
diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
index ab7d15f81a..dd3cf3ed26 100644
--- a/drivers/net/bonding/eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/eth_bond_8023ad_private.h
@@ -9,7 +9,7 @@
#include <rte_ether.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_flow.h>
#include "rte_eth_bond_8023ad.h"
@@ -140,10 +140,10 @@ struct port {
/** Timer which is also used as mutex. If is 0 (not running) RX marker
* packet might be responded. Otherwise shall be dropped. It is zeroed in
* mode 4 callback function after expire. */
- volatile uint64_t rx_marker_timer;
+ RTE_ATOMIC(uint64_t) rx_marker_timer;
uint64_t warning_timer;
- volatile uint16_t warnings_to_show;
+ RTE_ATOMIC(uint16_t) warnings_to_show;
/** Memory pool used to allocate slow queues */
struct rte_mempool *slow_pool;
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index ba88f6d261..cc7e4af2b9 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -171,27 +171,17 @@ timer_is_running(uint64_t *timer)
static void
set_warning_flags(struct port *port, uint16_t flags)
{
- int retval;
- uint16_t old;
- uint16_t new_flag = 0;
-
- do {
- old = port->warnings_to_show;
- new_flag = old | flags;
- retval = rte_atomic16_cmpset(&port->warnings_to_show, old, new_flag);
- } while (unlikely(retval == 0));
+ rte_atomic_fetch_or_explicit(&port->warnings_to_show, flags, rte_memory_order_relaxed);
}
static void
show_warnings(uint16_t member_id)
{
struct port *port = &bond_mode_8023ad_ports[member_id];
- uint8_t warnings;
-
- do {
- warnings = port->warnings_to_show;
- } while (rte_atomic16_cmpset(&port->warnings_to_show, warnings, 0) == 0);
+ uint16_t warnings;
+ warnings = rte_atomic_exchange_explicit(&port->warnings_to_show, 0,
+ rte_memory_order_relaxed);
if (!warnings)
return;
@@ -1337,7 +1327,6 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
struct port *port = &bond_mode_8023ad_ports[member_id];
struct marker_header *m_hdr;
uint64_t marker_timer, old_marker_timer;
- int retval;
uint8_t wrn, subtype;
/* If packet is a marker, we send response now by reusing given packet
* and update only source MAC, destination MAC is multicast so don't
@@ -1354,17 +1343,19 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
}
/* Setup marker timer. Do it in loop in case concurrent access. */
+ old_marker_timer = rte_atomic_load_explicit(&port->rx_marker_timer,
+ rte_memory_order_relaxed);
do {
- old_marker_timer = port->rx_marker_timer;
if (!timer_is_expired(&old_marker_timer)) {
wrn = WRN_RX_MARKER_TO_FAST;
goto free_out;
}
timer_set(&marker_timer, mode4->rx_marker_timeout);
- retval = rte_atomic64_cmpset(&port->rx_marker_timer,
- old_marker_timer, marker_timer);
- } while (unlikely(retval == 0));
+
+ } while (!rte_atomic_compare_exchange_weak_explicit(&port->rx_marker_timer,
+ &old_marker_timer, marker_timer,
+ rte_memory_order_seq_cst, rte_memory_order_relaxed));
m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
rte_eth_macaddr_get(member_id, &m_hdr->eth_hdr.src_addr);
@@ -1372,7 +1363,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
if (internals->mode4.dedicated_queues.enabled == 0) {
if (rte_ring_enqueue(port->tx_ring, pkt) != 0) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
@@ -1386,7 +1378,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
&pkt, tx_count);
if (tx_count != 1) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
--
2.53.0
* [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (4 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
` (20 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Dimon Zhao, Leon Yu, Sam Chen
The tx_current_queue field was defined as rte_atomic16_t, which
is deprecated. Remove it since it was never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/nbl/nbl_hw/nbl_resource.h b/drivers/net/nbl/nbl_hw/nbl_resource.h
index bf5a9461f5..f2182ba6bc 100644
--- a/drivers/net/nbl/nbl_hw/nbl_resource.h
+++ b/drivers/net/nbl/nbl_hw/nbl_resource.h
@@ -225,7 +225,6 @@ struct nbl_res_info {
u16 base_qid;
u16 lcore_max;
u16 *pf_qid_to_lcore_id;
- rte_atomic16_t tx_current_queue;
};
struct nbl_resource_mgt {
--
2.53.0
* [PATCH v3 07/27] net/ena: replace use of rte_atomicNN
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (5 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
` (19 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Shai Brandes, Evgeny Schemeilin, Ron Beider,
Amit Bernstein, Wajeeh Atrash
Convert the legacy rte_atomicNN operations to stdatomic.
* Remove the variable ena_alloc_cnt, which is defined but not used.
It is a leftover from the previous memzone naming scheme.
* Convert the legacy rte_atomic32_t and rte_atomic32_{inc,dec,set,read}
macros to C11 stdatomic equivalents.
Memory ordering is kept at seq_cst,
matching the implicit ordering of the legacy API.
* Do not use rte_atomic for statistics.
The DPDK PMD model is that statistics do not have to be exact
in the face of contention.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
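A minimal sketch of the macro mapping in the second bullet, using plain
C11 <stdatomic.h> so it compiles standalone. The DEMO_ names are
hypothetical stand-ins for the ATOMIC32_* platform macros.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for ena_atomic32_t. */
typedef _Atomic int32_t demo_atomic32_t;

/* Every operation keeps seq_cst, as the legacy API implied. */
#define DEMO_ATOMIC32_INC(p) \
	atomic_fetch_add_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_DEC(p) \
	atomic_fetch_sub_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_SET(p, v) \
	atomic_store_explicit((p), (v), memory_order_seq_cst)
#define DEMO_ATOMIC32_READ(p) \
	atomic_load_explicit((p), memory_order_seq_cst)
```

One visible difference from the legacy calls: the INC and DEC forms are
now expressions that yield the previous value, which callers can ignore.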
drivers/net/ena/base/ena_plat_dpdk.h | 14 +++++++++-----
drivers/net/ena/ena_ethdev.c | 21 ++++++---------------
drivers/net/ena/ena_ethdev.h | 7 +++----
3 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index c84420de22..83b354d9da 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -40,7 +40,7 @@ typedef uint64_t dma_addr_t;
#endif
#define ENA_PRIu64 PRIu64
-#define ena_atomic32_t rte_atomic32_t
+typedef RTE_ATOMIC(int32_t) ena_atomic32_t;
#define ena_mem_handle_t const struct rte_memzone *
#define SZ_256 (256U)
@@ -267,10 +267,14 @@ ena_mem_alloc_coherent(struct rte_eth_dev_data *data, size_t size,
#define ENA_REG_READ32(bus, reg) \
__extension__ ({ (void)(bus); rte_read32_relaxed((reg)); })
-#define ATOMIC32_INC(i32_ptr) rte_atomic32_inc(i32_ptr)
-#define ATOMIC32_DEC(i32_ptr) rte_atomic32_dec(i32_ptr)
-#define ATOMIC32_SET(i32_ptr, val) rte_atomic32_set(i32_ptr, val)
-#define ATOMIC32_READ(i32_ptr) rte_atomic32_read(i32_ptr)
+#define ATOMIC32_INC(i32_ptr) \
+ rte_atomic_fetch_add_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_DEC(i32_ptr) \
+ rte_atomic_fetch_sub_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_SET(i32_ptr, val) \
+ rte_atomic_store_explicit((i32_ptr), (val), rte_memory_order_seq_cst)
+#define ATOMIC32_READ(i32_ptr) \
+ rte_atomic_load_explicit((i32_ptr), rte_memory_order_seq_cst)
#define msleep(x) rte_delay_us(x * 1000)
#define udelay(x) rte_delay_us(x)
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ea4afbc75d..e9c484456c 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -121,12 +121,6 @@ struct ena_stats {
*/
#define ENA_DEVARG_ENABLE_FRAG_BYPASS "enable_frag_bypass"
-/*
- * Each rte_memzone should have unique name.
- * To satisfy it, count number of allocation and add it to name.
- */
-rte_atomic64_t ena_alloc_cnt;
-
static const struct ena_stats ena_stats_global_strings[] = {
ENA_STAT_GLOBAL_ENTRY(wd_expired),
ENA_STAT_GLOBAL_ENTRY(dev_start),
@@ -1249,10 +1243,7 @@ static void ena_stats_restart(struct rte_eth_dev *dev)
{
struct ena_adapter *adapter = dev->data->dev_private;
- rte_atomic64_init(&adapter->drv_stats->ierrors);
- rte_atomic64_init(&adapter->drv_stats->oerrors);
- rte_atomic64_init(&adapter->drv_stats->rx_nombuf);
- adapter->drv_stats->rx_drops = 0;
+ memset(adapter->drv_stats, 0, sizeof(struct ena_driver_stats));
}
static int ena_stats_get(struct rte_eth_dev *dev,
@@ -1289,9 +1280,9 @@ static int ena_stats_get(struct rte_eth_dev *dev,
/* Driver related stats */
stats->imissed = adapter->drv_stats->rx_drops;
- stats->ierrors = rte_atomic64_read(&adapter->drv_stats->ierrors);
- stats->oerrors = rte_atomic64_read(&adapter->drv_stats->oerrors);
- stats->rx_nombuf = rte_atomic64_read(&adapter->drv_stats->rx_nombuf);
+ stats->ierrors = adapter->drv_stats->ierrors;
+ stats->oerrors = adapter->drv_stats->oerrors;
+ stats->rx_nombuf = adapter->drv_stats->rx_nombuf;
/* Queue statistics */
if (qstats) {
@@ -1887,7 +1878,7 @@ static int ena_populate_rx_queue(struct ena_ring *rxq, unsigned int count)
/* get resources for incoming packets */
rc = rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, count);
if (unlikely(rc < 0)) {
- rte_atomic64_inc(&rxq->adapter->drv_stats->rx_nombuf);
+ ++rxq->adapter->drv_stats->rx_nombuf;
++rxq->rx_stats.mbuf_alloc_fail;
PMD_RX_LOG_LINE(DEBUG, "There are not enough free buffers");
return 0;
@@ -3014,7 +3005,7 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(mbuf->ol_flags &
(RTE_MBUF_F_RX_IP_CKSUM_BAD | RTE_MBUF_F_RX_L4_CKSUM_BAD)))
- rte_atomic64_inc(&rx_ring->adapter->drv_stats->ierrors);
+ ++rx_ring->adapter->drv_stats->ierrors;
rx_pkts[completed] = mbuf;
rx_ring->rx_stats.bytes += mbuf->pkt_len;
diff --git a/drivers/net/ena/ena_ethdev.h b/drivers/net/ena/ena_ethdev.h
index 3a66d79384..b204b07767 100644
--- a/drivers/net/ena/ena_ethdev.h
+++ b/drivers/net/ena/ena_ethdev.h
@@ -6,7 +6,6 @@
#ifndef _ENA_ETHDEV_H_
#define _ENA_ETHDEV_H_
-#include <rte_atomic.h>
#include <rte_ether.h>
#include <ethdev_driver.h>
#include <ethdev_pci.h>
@@ -225,9 +224,9 @@ enum ena_adapter_state {
};
struct ena_driver_stats {
- rte_atomic64_t ierrors;
- rte_atomic64_t oerrors;
- rte_atomic64_t rx_nombuf;
+ u64 ierrors;
+ u64 oerrors;
+ u64 rx_nombuf;
u64 rx_drops;
};
--
2.53.0
* [PATCH v3 08/27] net/failsafe: convert to stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (6 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
` (18 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gaetan Rivet
The rte_atomic64 functions are deprecated; convert this
code to use stdatomic for the reference count. Use the memory
order implied by the P/V naming: acquire for P, release for V.
No need for initialization since refcnt is in space
allocated with rte_zmalloc().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
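A minimal sketch of the P/V pairing, in plain C11 so it builds without
DPDK. demo_p(), demo_v() and demo_busy() are hypothetical names mirroring
FS_ATOMIC_P(), FS_ATOMIC_V() and the FS_ATOMIC_RX()/FS_ATOMIC_TX() reads.

```c
#include <stdatomic.h>
#include <stdint.h>

/* P: mark the queue busy; acquire orders later accesses after the mark. */
static inline uint64_t
demo_p(_Atomic uint64_t *refcnt)
{
	return atomic_exchange_explicit(refcnt, 1, memory_order_acquire);
}

/* V: mark the queue idle; release publishes the work done while busy. */
static inline void
demo_v(_Atomic uint64_t *refcnt)
{
	atomic_store_explicit(refcnt, 0, memory_order_release);
}

/* Control-path check; seq_cst, as in the converted read macros. */
static inline uint64_t
demo_busy(_Atomic uint64_t *refcnt)
{
	return atomic_load_explicit(refcnt, memory_order_seq_cst);
}
```

A datapath burst would call demo_p() before touching the sub-device queue
and demo_v() afterwards, while the control path polls demo_busy().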
drivers/net/failsafe/failsafe_ops.c | 12 +++++-----
drivers/net/failsafe/failsafe_private.h | 29 ++++++++++++++-----------
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
3 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index ddc8808ebe..fcb0051777 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -11,7 +11,7 @@
#endif
#include <rte_debug.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <ethdev_driver.h>
#include <rte_malloc.h>
#include <rte_flow.h>
@@ -440,14 +440,13 @@ fs_rx_queue_setup(struct rte_eth_dev *dev,
}
rxq = rte_zmalloc(NULL,
sizeof(*rxq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (rxq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&rxq->refcnt[i]);
+
rxq->qid = rx_queue_id;
rxq->socket_id = socket_id;
rxq->info.mp = mb_pool;
@@ -617,14 +616,13 @@ fs_tx_queue_setup(struct rte_eth_dev *dev,
}
txq = rte_zmalloc("ethdev TX queue",
sizeof(*txq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (txq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&txq->refcnt[i]);
+
txq->qid = tx_queue_id;
txq->socket_id = socket_id;
txq->info.conf = *tx_conf;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index babea6016e..89b06f9756 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -10,7 +10,7 @@
#include <sys/queue.h>
#include <pthread.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <dev_driver.h>
#include <ethdev_driver.h>
#include <rte_devargs.h>
@@ -75,7 +75,7 @@ struct rxq {
int event_fd;
unsigned int enable_events:1;
struct rte_eth_rxq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct txq {
@@ -83,7 +83,7 @@ struct txq {
uint16_t qid;
unsigned int socket_id;
struct rte_eth_txq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct rte_flow {
@@ -320,33 +320,36 @@ extern int failsafe_mac_from_arg;
*/
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_P(a) \
- rte_atomic64_set(&(a), 1)
+ rte_atomic_exchange_explicit(&(a), 1, rte_memory_order_acquire)
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_V(a) \
- rte_atomic64_set(&(a), 0)
+ rte_atomic_store_explicit(&(a), 0, rte_memory_order_release)
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_RX(s, i) \
- rte_atomic64_read( \
- &((struct rxq *) \
- (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct rxq *) \
+ (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
+
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_TX(s, i) \
- rte_atomic64_read( \
- &((struct txq *) \
- (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct txq *) \
+ (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
#ifdef RTE_EXEC_ENV_FREEBSD
#define FS_THREADID_TYPE void*
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..500483bda3 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -3,7 +3,7 @@
* Copyright 2017 Mellanox Technologies, Ltd
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_debug.h>
#include <rte_mbuf.h>
#include <ethdev_driver.h>
--
2.53.0
* [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (7 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
` (17 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, John Daley, Hyong Youb Kim, Bruce Richardson,
Konstantin Ananyev
The rte_atomic64 datatype and functions are deprecated.
This driver was only using them for error statistics, where atomic
operations are not necessary. The DPDK PMD model is that statistics
do not have to be exact in the face of contention.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
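A minimal sketch of the resulting counter pattern: plain 64-bit fields,
reset with memset() and bumped with a non-atomic increment. The demo_
struct and functions are hypothetical and only mirror the shape of
enic_soft_stats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the soft-stats layout after conversion. */
struct demo_soft_stats {
	uint64_t rx_nombuf;
	uint64_t rx_packet_errors;
	uint64_t tx_oversized;
};

/* Reset every counter at once; no per-field atomic init is needed. */
static inline void
demo_clear_stats(struct demo_soft_stats *s)
{
	memset(s, 0, sizeof(*s));
}

/* Datapath side: a plain increment, which may lose updates if raced. */
static inline void
demo_count_oversized(struct demo_soft_stats *s)
{
	++s->tx_oversized;
}

/* Control side: combine a hardware count with the soft counter. */
static inline uint64_t
demo_oerrors(const struct demo_soft_stats *s, uint64_t hw_tx_errors)
{
	return hw_tx_errors + s->tx_oversized;
}
```

The trade-off is the one stated in the commit message: concurrent writers
can make a counter slightly inexact, in exchange for no atomic
read-modify-write on the datapath.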
drivers/net/enic/enic.h | 6 +++---
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +++++++----------
drivers/net/enic/enic_rxtx.c | 14 ++++++--------
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 ++--
5 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 87f6b35fcd..0a8d4a29ca 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -59,9 +59,9 @@
#define ENICPMD_RXQ_INTR_OFFSET 1
struct enic_soft_stats {
- rte_atomic64_t rx_nombuf;
- rte_atomic64_t rx_packet_errors;
- rte_atomic64_t tx_oversized;
+ uint64_t rx_nombuf;
+ uint64_t rx_packet_errors;
+ uint64_t tx_oversized;
};
struct enic_memzone_entry {
diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 7cff6831b9..3ce4299e81 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -9,7 +9,6 @@
#include <stdio.h>
#include <unistd.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
#include <rte_io.h>
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 2696fa77d4..fb9a5754c9 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -83,17 +83,15 @@ static void enic_log_q_error(struct enic *enic)
static void enic_clear_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_clear(&soft_stats->rx_nombuf);
- rte_atomic64_clear(&soft_stats->rx_packet_errors);
- rte_atomic64_clear(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
}
static void enic_init_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_init(&soft_stats->rx_nombuf);
- rte_atomic64_init(&soft_stats->rx_packet_errors);
- rte_atomic64_init(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
enic_clear_soft_stats(enic);
}
@@ -132,7 +130,7 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
* counted in ibytes even though truncated packets are dropped
* which can make ibytes be slightly higher than it should be.
*/
- rx_packet_errors = rte_atomic64_read(&soft_stats->rx_packet_errors);
+ rx_packet_errors = soft_stats->rx_packet_errors;
rx_truncated = rx_packet_errors - stats->rx.rx_errors;
r_stats->ipackets = stats->rx.rx_frames_ok - rx_truncated;
@@ -142,12 +140,11 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
r_stats->obytes = stats->tx.tx_bytes_ok;
r_stats->ierrors = stats->rx.rx_errors + stats->rx.rx_drop;
- r_stats->oerrors = stats->tx.tx_errors
- + rte_atomic64_read(&soft_stats->tx_oversized);
+ r_stats->oerrors = stats->tx.tx_errors + soft_stats->tx_oversized;
r_stats->imissed = stats->rx.rx_no_bufs + rx_truncated;
- r_stats->rx_nombuf = rte_atomic64_read(&soft_stats->rx_nombuf);
+ r_stats->rx_nombuf = soft_stats->rx_nombuf;
return 0;
}
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 549a153332..c87d947b93 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -112,7 +112,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
/* allocate a new mbuf */
nmb = rte_mbuf_raw_alloc(rq->mp);
if (nmb == NULL) {
- rte_atomic64_inc(&enic->soft_stats.rx_nombuf);
+ ++enic->soft_stats.rx_nombuf;
break;
}
@@ -185,7 +185,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
}
if (unlikely(packet_error)) {
rte_pktmbuf_free(first_seg);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
continue;
}
@@ -303,7 +303,7 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
cqd++;
continue;
}
@@ -505,14 +505,12 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint8_t offload_mode;
uint16_t header_len;
uint64_t tso;
- rte_atomic64_t *tx_oversized;
enic_cleanup_wq(enic, wq);
wq_desc_avail = vnic_wq_desc_avail(wq);
head_idx = wq->head_idx;
desc_count = wq->ring.desc_count;
ol_flags_mask = RTE_MBUF_F_TX_VLAN | RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK;
- tx_oversized = &enic->soft_stats.tx_oversized;
nb_pkts = RTE_MIN(nb_pkts, ENIC_TX_XMIT_MAX);
@@ -527,7 +525,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
/* drop packet if it's too big to send */
if (unlikely(!tso && pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -558,7 +556,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
if (unlikely(header_len == 0 || ((tx_pkt->tso_segsz +
header_len) > ENIC_TX_MAX_PKT_SIZE))) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -681,7 +679,7 @@ static void enqueue_simple_pkts(struct rte_mbuf **pkts,
*/
if (unlikely(p->pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
desc->length = ENIC_TX_MAX_PKT_SIZE;
- rte_atomic64_inc(&enic->soft_stats.tx_oversized);
+ ++enic->soft_stats.tx_oversized;
}
desc++;
}
diff --git a/drivers/net/enic/enic_rxtx_vec_avx2.c b/drivers/net/enic/enic_rxtx_vec_avx2.c
index 600efff270..53589ab788 100644
--- a/drivers/net/enic/enic_rxtx_vec_avx2.c
+++ b/drivers/net/enic/enic_rxtx_vec_avx2.c
@@ -81,7 +81,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
@@ -761,7 +761,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
--
2.53.0
* [PATCH v3 10/27] net/pfe: use ethdev linkstatus helpers
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (8 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
` (16 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gagandeep Singh
Rather than open-coding link status access with the deprecated
rte_atomic64_cmpset(), use the existing ethdev helpers to get and
set link status.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
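A minimal sketch of the general idea behind such helpers: a link
descriptor that fits in 64 bits can be published and read as one atomic
word. This is an illustration in plain C11 and makes no claim about how
the ethdev helpers are implemented; struct demo_link and both functions
are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical link descriptor packed into exactly 8 bytes. */
struct demo_link {
	uint32_t speed;
	uint16_t duplex;
	uint16_t status;
};

_Static_assert(sizeof(struct demo_link) == sizeof(uint64_t),
	"link descriptor must fit one 64-bit word");

/* Publish the whole descriptor with a single atomic store. */
static inline void
demo_link_set(_Atomic uint64_t *slot, const struct demo_link *link)
{
	uint64_t raw;

	memcpy(&raw, link, sizeof(raw));
	atomic_store_explicit(slot, raw, memory_order_seq_cst);
}

/* Read a consistent snapshot with a single atomic load. */
static inline void
demo_link_get(_Atomic uint64_t *slot, struct demo_link *link)
{
	uint64_t raw = atomic_load_explicit(slot, memory_order_seq_cst);

	memcpy(link, &raw, sizeof(raw));
}
```

Compared with the removed pfe helpers, which passed the destination's own
current value as the expected value of a compare-and-set, a plain
load/store pair states the intent directly.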
drivers/net/pfe/pfe_ethdev.c | 32 ++------------------------------
1 file changed, 2 insertions(+), 30 deletions(-)
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 1efa17539e..1b183ab1f3 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -531,34 +531,6 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements)
return NULL;
}
-static inline int
-pfe_eth_atomic_read_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = link;
- struct rte_eth_link *src = &dev->data->dev_link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
-static inline int
-pfe_eth_atomic_write_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = &dev->data->dev_link;
- struct rte_eth_link *src = link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
static int
pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
{
@@ -570,7 +542,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
memset(&old, 0, sizeof(old));
memset(&link, 0, sizeof(struct rte_eth_link));
- pfe_eth_atomic_read_link_status(dev, &old);
+ rte_eth_linkstatus_get(dev, &old);
/* Read from PFE CDEV, status of link, if file was successfully
* opened.
@@ -601,7 +573,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
link.link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
link.link_autoneg = RTE_ETH_LINK_AUTONEG;
- pfe_eth_atomic_write_link_status(dev, &link);
+ rte_eth_linkstatus_set(dev, &link);
PFE_PMD_INFO("Port (%d) link is %s", dev->data->port_id,
link.link_status ? "up" : "down");
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 11/27] net/sfc: replace rte_atomic with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (9 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
` (15 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Andrew Rybchenko
The rte_atomicNN functions are deprecated and need to be replaced.
Use stdatomic for the restart required flag.
Use existing ethdev helper to set link status.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/sfc/sfc.c | 9 +++++----
drivers/net/sfc/sfc.h | 4 ++--
drivers/net/sfc/sfc_port.c | 7 +------
drivers/net/sfc/sfc_stats.h | 2 +-
4 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/sfc/sfc.c b/drivers/net/sfc/sfc.c
index 69747e49ae..3470f7eed6 100644
--- a/drivers/net/sfc/sfc.c
+++ b/drivers/net/sfc/sfc.c
@@ -670,8 +670,8 @@ sfc_restart_if_required(void *arg)
struct sfc_adapter *sa = arg;
/* If restart is scheduled, clear the flag and do it */
- if (rte_atomic32_cmpset((volatile uint32_t *)&sa->restart_required,
- 1, 0)) {
+ if (rte_atomic_exchange_explicit(&sa->restart_required, false,
+ rte_memory_order_seq_cst)) {
sfc_adapter_lock(sa);
if (sa->state == SFC_ETHDEV_STARTED)
(void)sfc_restart(sa);
@@ -685,7 +685,8 @@ sfc_schedule_restart(struct sfc_adapter *sa)
int rc;
/* Schedule restart alarm if it is not scheduled yet */
- if (!rte_atomic32_test_and_set(&sa->restart_required))
+ if (rte_atomic_exchange_explicit(&sa->restart_required, true,
+ rte_memory_order_seq_cst))
return;
rc = rte_eal_alarm_set(1, sfc_restart_if_required, sa);
@@ -1292,7 +1293,7 @@ sfc_probe(struct sfc_adapter *sa)
SFC_ASSERT(sfc_adapter_is_locked(sa));
sa->socket_id = rte_socket_id();
- rte_atomic32_init(&sa->restart_required);
+ sa->restart_required = false;
sfc_log_init(sa, "get family");
rc = sfc_efx_family(pci_dev, &mem_ebrp, &sa->family);
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 629578549f..515e1e708d 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -17,7 +17,7 @@
#include <ethdev_driver.h>
#include <rte_kvargs.h>
#include <rte_spinlock.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "efx.h"
@@ -239,7 +239,7 @@ struct sfc_adapter {
efx_family_t family;
efx_nic_t *nic;
rte_spinlock_t nic_lock;
- rte_atomic32_t restart_required;
+ RTE_ATOMIC(bool) restart_required;
struct sfc_efx_mcdi mcdi;
struct sfc_sriov sriov;
diff --git a/drivers/net/sfc/sfc_port.c b/drivers/net/sfc/sfc_port.c
index 33b53f7ac8..d84648d454 100644
--- a/drivers/net/sfc/sfc_port.c
+++ b/drivers/net/sfc/sfc_port.c
@@ -121,7 +121,6 @@ sfc_port_reset_mac_stats(struct sfc_adapter *sa)
static int
sfc_port_init_dev_link(struct sfc_adapter *sa)
{
- struct rte_eth_link *dev_link = &sa->eth_dev->data->dev_link;
int rc;
efx_link_mode_t link_mode;
struct rte_eth_link current_link;
@@ -132,11 +131,7 @@ sfc_port_init_dev_link(struct sfc_adapter *sa)
sfc_port_link_mode_to_info(link_mode, sa->port.phy_adv_cap,
&current_link);
-
- EFX_STATIC_ASSERT(sizeof(*dev_link) == sizeof(rte_atomic64_t));
- rte_atomic64_set((rte_atomic64_t *)dev_link,
- *(uint64_t *)&current_link);
-
+ rte_eth_linkstatus_set(sa->eth_dev, &current_link);
return 0;
}
diff --git a/drivers/net/sfc/sfc_stats.h b/drivers/net/sfc/sfc_stats.h
index 597e14dab3..eaa2afd3fe 100644
--- a/drivers/net/sfc/sfc_stats.h
+++ b/drivers/net/sfc/sfc_stats.h
@@ -12,7 +12,7 @@
#include <stdint.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "sfc_tweak.h"
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 12/27] crypto/ccp: replace use of rte_atomic64 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (10 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
` (14 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Sunil Uttarwar
The rte_atomicNN functions are deprecated. Replace the free
count with stdatomic.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/crypto/ccp/ccp_crypto.c | 11 +++++++----
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 ++++++----
drivers/crypto/ccp/ccp_dev.h | 4 ++--
4 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/crypto/ccp/ccp_crypto.c b/drivers/crypto/ccp/ccp_crypto.c
index 5899d83bae..1800ad41c9 100644
--- a/drivers/crypto/ccp/ccp_crypto.c
+++ b/drivers/crypto/ccp/ccp_crypto.c
@@ -2683,7 +2683,8 @@ process_ops_to_enqueue(struct ccp_qp *qp,
b_info->cmd_q = cmd_q;
b_info->lsb_buf_phys = (phys_addr_t)rte_mem_virt2iova((void *)b_info->lsb_buf);
- rte_atomic64_sub(&b_info->cmd_q->free_slots, slots_req);
+ rte_atomic_fetch_sub_explicit(&b_info->cmd_q->free_slots, slots_req,
+ rte_memory_order_seq_cst);
b_info->head_offset = (uint32_t)(cmd_q->qbase_phys_addr + cmd_q->qidx *
Q_DESC_SIZE);
@@ -2729,8 +2730,9 @@ process_ops_to_enqueue(struct ccp_qp *qp,
result = -1;
}
if (unlikely(result < 0)) {
- rte_atomic64_add(&b_info->cmd_q->free_slots,
- (slots_req - b_info->desccnt));
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots,
+ slots_req - b_info->desccnt,
+ rte_memory_order_seq_cst);
break;
}
b_info->op[i] = op[i];
@@ -2914,7 +2916,8 @@ process_ops_to_dequeue(struct ccp_qp *qp,
success:
*total_nb_ops = b_info->total_nb_ops;
nb_ops = ccp_prepare_ops(qp, op, b_info, nb_ops);
- rte_atomic64_add(&b_info->cmd_q->free_slots, b_info->desccnt);
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots, b_info->desccnt,
+ rte_memory_order_seq_cst);
b_info->desccnt = 0;
if (b_info->opcnt > 0) {
qp->b_info = b_info;
diff --git a/drivers/crypto/ccp/ccp_crypto.h b/drivers/crypto/ccp/ccp_crypto.h
index d0b417ca29..5c61b1582d 100644
--- a/drivers/crypto/ccp/ccp_crypto.h
+++ b/drivers/crypto/ccp/ccp_crypto.h
@@ -10,7 +10,7 @@
#include <stdint.h>
#include <string.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 5088d8ded6..a75816cdfc 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -47,14 +47,15 @@ ccp_allot_queue(struct rte_cryptodev *cdev, int slot_req)
priv->last_dev = dev;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots, rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
for (i = 0; i < dev->cmd_q_count; i++) {
dev->qidx++;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots,
+ rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
}
@@ -583,8 +584,9 @@ ccp_add_device(struct ccp_device *dev)
CCP_LOG_ERR("queue doesn't have lsb regions");
cmd_q->lsb = -1;
- rte_atomic64_init(&cmd_q->free_slots);
- rte_atomic64_set(&cmd_q->free_slots, (COMMANDS_PER_QUEUE - 1));
+ rte_atomic_store_explicit(&cmd_q->free_slots,
+ COMMANDS_PER_QUEUE - 1,
+ rte_memory_order_seq_cst);
/* unused slot barrier b/w H&T */
}
diff --git a/drivers/crypto/ccp/ccp_dev.h b/drivers/crypto/ccp/ccp_dev.h
index cd63830759..0d343c2426 100644
--- a/drivers/crypto/ccp/ccp_dev.h
+++ b/drivers/crypto/ccp/ccp_dev.h
@@ -11,7 +11,7 @@
#include <string.h>
#include <bus_pci_driver.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
@@ -182,7 +182,7 @@ struct __rte_cache_aligned ccp_queue {
struct ccp_device *dev;
char memz_name[RTE_MEMZONE_NAMESIZE];
- rte_atomic64_t free_slots;
+ RTE_ATOMIC(uint64_t) free_slots;
/**< available free slots updated from enq/deq calls */
/* Queue identifier */
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 13/27] bus/dpaa: replace rte_atomic16 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (11 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 14/27] drivers: " Stephen Hemminger
` (13 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
This is a simple in-use flag, which can be implemented with a
stdatomic exchange.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
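Note for readers: a minimal sketch of the exchange-based in-use flag,
written against plain C11 <stdatomic.h> so it builds without DPDK. The
names slot_used, slot_alloc and slot_free and the pool size are
hypothetical; the patch itself uses the rte_atomic_*_explicit wrappers
and rte_memory_order_* constants.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 8 /* hypothetical pool size */

/* Static storage, so every flag starts out false (free). */
static atomic_bool slot_used[MAX_SLOTS];

/* Claim the first free slot; returns its index, or -1 if all are busy. */
static int
slot_alloc(void)
{
    for (int i = 0; i < MAX_SLOTS; i++) {
        /* exchange returns the previous value: false means we won the slot */
        if (!atomic_exchange_explicit(&slot_used[i], true,
                                      memory_order_acquire))
            return i;
    }
    return -1;
}

/* Release a slot so that a later slot_alloc() can claim it again. */
static void
slot_free(int i)
{
    assert(i >= 0 && i < MAX_SLOTS);
    atomic_store_explicit(&slot_used[i], false, memory_order_release);
}
```

The exchange writes true unconditionally, which is harmless when the
flag was already set, and the acquire/release pair orders the slot's
contents between one owner's release and the next owner's claim.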
drivers/bus/dpaa/base/qbman/qman.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 5534e1846c..82a976141a 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -11,6 +11,7 @@
#include <rte_eventdev.h>
#include <rte_byteorder.h>
#include <rte_dpaa_logs.h>
+#include <rte_stdatomic.h>
#include <eal_export.h>
#include <dpaa_bits.h>
@@ -683,7 +684,7 @@ qman_init_portal(struct qman_portal *portal,
#define MAX_GLOBAL_PORTALS 8
static struct qman_portal global_portals[MAX_GLOBAL_PORTALS];
-static rte_atomic16_t global_portals_used[MAX_GLOBAL_PORTALS];
+static RTE_ATOMIC(bool) global_portals_used[MAX_GLOBAL_PORTALS];
struct qman_portal *
qman_alloc_global_portal(struct qm_portal_config *q_pcfg)
@@ -691,7 +692,8 @@ qman_alloc_global_portal(struct qm_portal_config *q_pcfg)
unsigned int i;
for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
- if (rte_atomic16_test_and_set(&global_portals_used[i])) {
+ if (!rte_atomic_exchange_explicit(&global_portals_used[i], true,
+ rte_memory_order_acquire)) {
global_portals[i].config = q_pcfg;
return &global_portals[i];
}
@@ -708,7 +710,8 @@ qman_free_global_portal(struct qman_portal *portal)
for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
if (&global_portals[i] == portal) {
- rte_atomic16_clear(&global_portals_used[i]);
+ rte_atomic_store_explicit(&global_portals_used[i], false,
+ rte_memory_order_release);
return 0;
}
}
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 14/27] drivers: replace rte_atomic16 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (12 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
` (12 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
The rte_atomicNN functions and types are deprecated.
The in_use flags and reference counts can be converted to stdatomic.
Also drop the unneeded NULL check in the loop body: TAILQ_FOREACH
terminates when the iterator becomes NULL, so var is guaranteed
non-NULL inside the loop.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
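Note for readers: a small sketch of the claim loop described above,
using plain C11 atomics and <sys/queue.h> rather than the DPDK
wrappers. struct dev, dev_alloc and dev_free are hypothetical names.
It illustrates two points: the iterator is non-NULL inside
TAILQ_FOREACH and NULL once the loop runs off the end, and the
"expected" variable is declared with the same width as the atomic
field it is compared against.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

struct dev {
    TAILQ_ENTRY(dev) next;
    _Atomic uint16_t in_use; /* 0 = free, 1 = claimed */
};
TAILQ_HEAD(dev_list, dev);

/* Claim the first free device; returns NULL if every device is in use. */
static struct dev *
dev_alloc(struct dev_list *list)
{
    struct dev *d;

    TAILQ_FOREACH(d, list, next) {
        uint16_t expected = 0; /* same width as in_use */

        /* d cannot be NULL here: the loop condition already tested it */
        if (atomic_compare_exchange_strong_explicit(&d->in_use,
                &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            break;
    }
    return d; /* NULL when the loop finished without a break */
}

/* Mark a device free again. */
static void
dev_free(struct dev *d)
{
    assert(d != NULL);
    atomic_store_explicit(&d->in_use, 0, memory_order_release);
}
```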
drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 10 +++++++---
drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 10 +++++++---
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 12 ++++++++----
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 ++++----
drivers/event/dpaa2/dpaa2_hw_dpcon.c | 11 +++++++----
5 files changed, 33 insertions(+), 18 deletions(-)
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
index 925e83e97d..d94f3965b6 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
@@ -84,7 +84,7 @@ dpaa2_create_dpbp_device(int vdev_fd __rte_unused,
}
dpbp_node->dpbp_id = dpbp_id;
- rte_atomic16_init(&dpbp_node->in_use);
+ dpbp_node->in_use = 0;
TAILQ_INSERT_TAIL(&dpbp_dev_list, dpbp_node, next);
@@ -103,7 +103,10 @@ struct dpaa2_dpbp_dev *dpaa2_alloc_dpbp_dev(void)
/* Get DPBP dev handle from list using index */
TAILQ_FOREACH(dpbp_dev, &dpbp_dev_list, next) {
- if (dpbp_dev && rte_atomic16_test_and_set(&dpbp_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpbp_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -118,7 +121,8 @@ void dpaa2_free_dpbp_dev(struct dpaa2_dpbp_dev *dpbp)
/* Match DPBP handle and mark it free */
TAILQ_FOREACH(dpbp_dev, &dpbp_dev_list, next) {
if (dpbp_dev == dpbp) {
- rte_atomic16_dec(&dpbp_dev->in_use);
+ rte_atomic_store_explicit(&dpbp_dev->in_use, 0,
+ rte_memory_order_release);
return;
}
}
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
index b546da82f6..789282085b 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
@@ -135,7 +135,7 @@ rte_dpaa2_create_dpci_device(int vdev_fd __rte_unused,
}
dpci_node->dpci_id = dpci_id;
- rte_atomic16_init(&dpci_node->in_use);
+ dpci_node->in_use = 0;
TAILQ_INSERT_TAIL(&dpci_dev_list, dpci_node, next);
@@ -159,7 +159,10 @@ struct dpaa2_dpci_dev *rte_dpaa2_alloc_dpci_dev(void)
/* Get DPCI dev handle from list using index */
TAILQ_FOREACH(dpci_dev, &dpci_dev_list, next) {
- if (dpci_dev && rte_atomic16_test_and_set(&dpci_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpci_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -174,7 +177,8 @@ void rte_dpaa2_free_dpci_dev(struct dpaa2_dpci_dev *dpci)
/* Match DPCI handle and mark it free */
TAILQ_FOREACH(dpci_dev, &dpci_dev_list, next) {
if (dpci_dev == dpci) {
- rte_atomic16_dec(&dpci_dev->in_use);
+ rte_atomic_store_explicit(&dpci_dev->in_use, 0,
+ rte_memory_order_release);
return;
}
}
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 2a9e519668..4d89915c29 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -293,7 +293,7 @@ static void dpaa2_put_qbman_swp(struct dpaa2_dpio_dev *dpio_dev)
#ifdef RTE_EVENT_DPAA2
dpaa2_dpio_intr_deinit(dpio_dev);
#endif
- rte_atomic16_clear(&dpio_dev->ref_count);
+ rte_atomic_store_explicit(&dpio_dev->ref_count, 0, rte_memory_order_release);
}
}
@@ -305,7 +305,10 @@ static struct dpaa2_dpio_dev *dpaa2_get_qbman_swp(void)
/* Get DPIO dev handle from list using index */
TAILQ_FOREACH(dpio_dev, &dpio_dev_list, next) {
- if (dpio_dev && rte_atomic16_test_and_set(&dpio_dev->ref_count))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpio_dev->ref_count, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
if (!dpio_dev) {
@@ -326,7 +329,8 @@ static struct dpaa2_dpio_dev *dpaa2_get_qbman_swp(void)
ret = dpaa2_configure_stashing(dpio_dev, cpu_id);
if (ret) {
DPAA2_BUS_ERR("dpaa2_configure_stashing failed");
- rte_atomic16_clear(&dpio_dev->ref_count);
+ rte_atomic_store_explicit(&dpio_dev->ref_count, 0,
+ rte_memory_order_release);
return NULL;
}
}
@@ -441,7 +445,7 @@ dpaa2_create_dpio_device(int vdev_fd,
dpio_dev->dpio = NULL;
dpio_dev->hw_id = object_id;
- rte_atomic16_init(&dpio_dev->ref_count);
+
/* Using single portal for all devices */
dpio_dev->mc_portal = dpaa2_get_mcp_ptr(MC_PORTAL_INDEX);
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index e625a5c035..f2298b18e5 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -112,7 +112,7 @@ struct dpaa2_dpio_dev {
TAILQ_ENTRY(dpaa2_dpio_dev) next;
/**< Pointer to Next device instance */
uint16_t index; /**< Index of a instance in the list */
- rte_atomic16_t ref_count;
+ RTE_ATOMIC(uint16_t) ref_count;
/**< How many thread contexts are sharing this.*/
uint16_t eqresp_ci;
uint16_t eqresp_pi;
@@ -141,7 +141,7 @@ struct dpaa2_dpbp_dev {
/**< Pointer to Next device instance */
struct fsl_mc_io dpbp; /** handle to DPBP portal object */
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpbp_id; /*HW ID for DPBP object */
};
@@ -257,7 +257,7 @@ struct dpaa2_dpci_dev {
/**< Pointer to Next device instance */
struct fsl_mc_io dpci; /** handle to DPCI portal object */
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpci_id; /*HW ID for DPCI object */
struct dpaa2_queue rx_queue[DPAA2_DPCI_MAX_QUEUES];
struct dpaa2_queue tx_queue[DPAA2_DPCI_MAX_QUEUES];
@@ -267,7 +267,7 @@ struct dpaa2_dpcon_dev {
TAILQ_ENTRY(dpaa2_dpcon_dev) next;
struct fsl_mc_io dpcon;
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpcon_id;
uint16_t qbman_ch_id;
uint8_t num_priorities;
diff --git a/drivers/event/dpaa2/dpaa2_hw_dpcon.c b/drivers/event/dpaa2/dpaa2_hw_dpcon.c
index ea5b0d4b85..9240534448 100644
--- a/drivers/event/dpaa2/dpaa2_hw_dpcon.c
+++ b/drivers/event/dpaa2/dpaa2_hw_dpcon.c
@@ -15,6 +15,7 @@
#include <rte_malloc.h>
#include <rte_memcpy.h>
#include <rte_string_fns.h>
+#include <rte_stdatomic.h>
#include <rte_cycles.h>
#include <rte_kvargs.h>
#include <dev_driver.h>
@@ -53,7 +54,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
int ret, dpcon_id = obj->object_id;
/* Allocate DPAA2 dpcon handle */
- dpcon_node = rte_malloc(NULL, sizeof(struct dpaa2_dpcon_dev), 0);
+ dpcon_node = rte_zmalloc(NULL, sizeof(struct dpaa2_dpcon_dev), 0);
if (!dpcon_node) {
DPAA2_EVENTDEV_ERR(
"Memory allocation failed for dpcon device");
@@ -85,7 +86,6 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
dpcon_node->qbman_ch_id = attr.qbman_ch_id;
dpcon_node->num_priorities = attr.num_priorities;
dpcon_node->dpcon_id = dpcon_id;
- rte_atomic16_init(&dpcon_node->in_use);
TAILQ_INSERT_TAIL(&dpcon_dev_list, dpcon_node, next);
@@ -98,7 +98,10 @@ struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void)
/* Get DPCON dev handle from list using index */
TAILQ_FOREACH(dpcon_dev, &dpcon_dev_list, next) {
- if (dpcon_dev && rte_atomic16_test_and_set(&dpcon_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpcon_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -112,7 +115,7 @@ void rte_dpaa2_free_dpcon_dev(struct dpaa2_dpcon_dev *dpcon)
/* Match DPCON handle and mark it free */
TAILQ_FOREACH(dpcon_dev, &dpcon_dev_list, next) {
if (dpcon_dev == dpcon) {
- rte_atomic16_dec(&dpcon_dev->in_use);
+ rte_atomic_store_explicit(&dpcon_dev->in_use, 0, rte_memory_order_release);
return;
}
}
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 15/27] net/netvsc: replace rte_atomic32 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (13 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 14/27] drivers: " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
` (11 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Long Li, Wei Hu
Change the rndis transaction id and buffer usage to use
stdatomic functions.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
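Note for readers: a sketch of the two patterns involved, in plain C11
atomics rather than the DPDK wrappers. The names req_id, pending,
next_rid, request_begin and response_complete are hypothetical. Two
details are worth seeing in isolation: fetch_add returns the value
from before the addition (the old add_return helper returned the value
after it), and a failed compare-exchange writes the observed value
back into "expected", which is what makes it usable in a log message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static _Atomic uint32_t req_id;  /* request id generator */
static _Atomic uint32_t pending; /* id of the request in flight, 0 = none */

/* Return the next non-zero request id. */
static uint32_t
next_rid(void)
{
    uint32_t rid;

    do {
        /* yields the pre-increment value; zero is skipped by the loop */
        rid = atomic_fetch_add_explicit(&req_id, 1, memory_order_seq_cst);
    } while (rid == 0);
    return rid;
}

/* Mark rid as the request in flight; fails if another one is pending. */
static bool
request_begin(uint32_t rid)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong_explicit(&pending, &expected, rid,
            memory_order_acquire, memory_order_relaxed);
}

/*
 * Clear the pending id if it matches rid. On return *seen holds the id
 * that was actually pending, whether or not it matched.
 */
static bool
response_complete(uint32_t rid, uint32_t *seen)
{
    uint32_t expected = rid;
    bool ok;

    assert(seen != NULL);
    ok = atomic_compare_exchange_strong_explicit(&pending, &expected, 0,
            memory_order_release, memory_order_relaxed);
    *seen = expected;
    return ok;
}
```

With a counter that starts at zero, the first next_rid() call in this
sketch discards 0 and returns 1, so callers still see 1, 2, 3, ...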
drivers/net/netvsc/hn_rndis.c | 28 +++++++++++++++++++---------
drivers/net/netvsc/hn_rxtx.c | 12 +++++++-----
drivers/net/netvsc/hn_var.h | 6 +++---
3 files changed, 29 insertions(+), 17 deletions(-)
diff --git a/drivers/net/netvsc/hn_rndis.c b/drivers/net/netvsc/hn_rndis.c
index 7c54eebcef..4b1d3d5539 100644
--- a/drivers/net/netvsc/hn_rndis.c
+++ b/drivers/net/netvsc/hn_rndis.c
@@ -17,7 +17,7 @@
#include <rte_string_fns.h>
#include <rte_memzone.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_alarm.h>
#include <rte_branch_prediction.h>
#include <rte_ether.h>
@@ -59,7 +59,8 @@ hn_rndis_rid(struct hn_data *hv)
uint32_t rid;
do {
- rid = rte_atomic32_add_return(&hv->rndis_req_id, 1);
+ rid = rte_atomic_fetch_add_explicit(&hv->rndis_req_id, 1,
+ rte_memory_order_seq_cst);
} while (rid == 0);
return rid;
@@ -357,12 +358,14 @@ void hn_rndis_receive_response(struct hn_data *hv,
memcpy(hv->rndis_resp, data, len);
/* make sure response copied before update */
- rte_smp_wmb();
-
- if (rte_atomic32_cmpset(&hv->rndis_pending, hdr->rid, 0) == 0) {
+ uint32_t expected = hdr->rid;
+ if (!rte_atomic_compare_exchange_strong_explicit(&hv->rndis_pending,
+ &expected, 0,
+ rte_memory_order_release,
+ rte_memory_order_relaxed)) {
PMD_DRV_LOG(NOTICE,
"received id %#x pending id %#x",
- hdr->rid, (uint32_t)hv->rndis_pending);
+ hdr->rid, expected);
}
}
@@ -388,8 +391,11 @@ static int hn_rndis_exec1(struct hn_data *hv,
return -EINVAL;
}
+ uint32_t expected = 0;
if (comp != NULL &&
- rte_atomic32_cmpset(&hv->rndis_pending, 0, rid) == 0) {
+ !rte_atomic_compare_exchange_strong_explicit(
+ &hv->rndis_pending, &expected, rid,
+ rte_memory_order_acquire, rte_memory_order_relaxed)) {
PMD_DRV_LOG(ERR,
"Request already pending");
return -EBUSY;
@@ -405,7 +411,8 @@ static int hn_rndis_exec1(struct hn_data *hv,
time_t start = time(NULL);
/* Poll primary channel until response received */
- while (hv->rndis_pending == rid) {
+ while (rte_atomic_load_explicit(&hv->rndis_pending,
+ rte_memory_order_acquire) == rid) {
if (hv->closed)
return -ENETDOWN;
@@ -413,7 +420,10 @@ static int hn_rndis_exec1(struct hn_data *hv,
PMD_DRV_LOG(ERR,
"RNDIS response timed out");
- rte_atomic32_cmpset(&hv->rndis_pending, rid, 0);
+ expected = rid;
+ rte_atomic_compare_exchange_strong_explicit(
+ &hv->rndis_pending, &expected, 0,
+ rte_memory_order_release, rte_memory_order_relaxed);
return -ETIMEDOUT;
}
diff --git a/drivers/net/netvsc/hn_rxtx.c b/drivers/net/netvsc/hn_rxtx.c
index 0d770d1b25..6f536610f2 100644
--- a/drivers/net/netvsc/hn_rxtx.c
+++ b/drivers/net/netvsc/hn_rxtx.c
@@ -17,7 +17,7 @@
#include <rte_string_fns.h>
#include <rte_memzone.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_bitmap.h>
#include <rte_branch_prediction.h>
#include <rte_ether.h>
@@ -558,7 +558,8 @@ static void hn_rx_buf_free_cb(void *buf __rte_unused, void *opaque)
struct hn_rx_queue *rxq = rxb->rxq;
struct hn_data *hv = rxq->hv;
- rte_atomic32_dec(&rxq->rxbuf_outstanding);
+ rte_atomic_fetch_sub_explicit(&rxq->rxbuf_outstanding, 1,
+ rte_memory_order_release);
hn_nvs_ack_rxbuf(hv, rxb->chan, rxb->xactid);
}
@@ -602,8 +603,8 @@ static void hn_rxpkt(struct hn_rx_queue *rxq, struct hn_rx_bufinfo *rxb,
* some space available in receive area for later packets.
*/
if (hv->rx_extmbuf_enable && dlen > hv->rx_copybreak &&
- (uint32_t)rte_atomic32_read(&rxq->rxbuf_outstanding) <
- hv->rxbuf_section_cnt / 2) {
+ rte_atomic_load_explicit(&rxq->rxbuf_outstanding,
+ rte_memory_order_relaxed) < hv->rxbuf_section_cnt / 2) {
struct rte_mbuf_ext_shared_info *shinfo;
const void *rxbuf;
rte_iova_t iova;
@@ -619,7 +620,8 @@ static void hn_rxpkt(struct hn_rx_queue *rxq, struct hn_rx_bufinfo *rxb,
/* shinfo is already set to 1 by the caller */
if (rte_mbuf_ext_refcnt_update(shinfo, 1) == 2)
- rte_atomic32_inc(&rxq->rxbuf_outstanding);
+ rte_atomic_fetch_add_explicit(&rxq->rxbuf_outstanding, 1,
+ rte_memory_order_acquire);
rte_pktmbuf_attach_extbuf(m, data, iova,
dlen + headroom, shinfo);
diff --git a/drivers/net/netvsc/hn_var.h b/drivers/net/netvsc/hn_var.h
index ef55dee28e..b0929de790 100644
--- a/drivers/net/netvsc/hn_var.h
+++ b/drivers/net/netvsc/hn_var.h
@@ -85,7 +85,7 @@ struct hn_rx_queue {
void *event_buf;
struct hn_rx_bufinfo *rxbuf_info;
- rte_atomic32_t rxbuf_outstanding;
+ RTE_ATOMIC(uint32_t) rxbuf_outstanding;
};
@@ -166,8 +166,8 @@ struct hn_data {
uint32_t rndis_agg_pkts;
uint32_t rndis_agg_align;
- volatile uint32_t rndis_pending;
- rte_atomic32_t rndis_req_id;
+ RTE_ATOMIC(uint32_t) rndis_pending;
+ RTE_ATOMIC(uint32_t) rndis_req_id;
uint8_t rndis_resp[256];
uint32_t rss_hash;
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 16/27] event/sw: convert from rte_atomic32 to stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (14 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
` (10 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Use stdatomic to keep track of inflights.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
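Note for readers: a simplified sketch of quanta-based inflight
accounting, in plain C11 atomics. struct pool, struct port and both
helper functions are hypothetical and much smaller than the real
sw_evdev structures. The shared counter is touched only when a port
takes or returns a whole quantum of credits; per-event accounting
stays in the port-local field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pool {
    _Atomic uint32_t inflights; /* credits handed out to all ports */
    uint32_t limit;             /* maximum events in the device */
    uint32_t quanta;            /* credits moved per shared update */
};

struct port {
    uint32_t credits; /* port-local, no atomics needed */
};

/* Take one quantum from the pool; false if that would exceed the limit. */
static bool
port_take_quanta(struct pool *pl, struct port *p)
{
    uint32_t cur = atomic_load_explicit(&pl->inflights,
                                        memory_order_relaxed);

    assert(pl != NULL && p != NULL);
    /* The check and the add are separate steps, so concurrent ports
     * can overshoot the limit slightly in this sketch. */
    if (cur + pl->quanta > pl->limit)
        return false;

    atomic_fetch_add_explicit(&pl->inflights, pl->quanta,
                              memory_order_acquire);
    p->credits += pl->quanta;
    return true;
}

/* Hand one quantum back once the port holds at least two of them. */
static void
port_return_quanta(struct pool *pl, struct port *p)
{
    if (p->credits >= pl->quanta * 2) {
        atomic_fetch_sub_explicit(&pl->inflights, pl->quanta,
                                  memory_order_release);
        p->credits -= pl->quanta;
    }
}
```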
drivers/event/sw/sw_evdev.c | 8 +++++---
drivers/event/sw/sw_evdev.h | 4 ++--
drivers/event/sw/sw_evdev_worker.c | 16 +++++++++++-----
3 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index 3ad82e94ac..a2f760a98d 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -153,7 +153,9 @@ sw_port_setup(struct rte_eventdev *dev, uint8_t port_id,
* the sum to no leak credits
*/
int possible_inflights = p->inflight_credits + p->inflights;
- rte_atomic32_sub(&sw->inflights, possible_inflights);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ possible_inflights,
+ rte_memory_order_release);
}
*p = (struct sw_port){0}; /* zero entire structure */
@@ -512,7 +514,7 @@ sw_dev_configure(const struct rte_eventdev *dev)
sw->qid_count = conf->nb_event_queues;
sw->port_count = conf->nb_event_ports;
sw->nb_events_limit = conf->nb_events_limit;
- rte_atomic32_set(&sw->inflights, 0);
+ sw->inflights = 0;
/* Number of chunks sized for worst-case spread of events across IQs */
num_chunks = ((SW_INFLIGHT_EVENTS_TOTAL/SW_EVS_PER_Q_CHUNK)+1) +
@@ -633,7 +635,7 @@ sw_dump(struct rte_eventdev *dev, FILE *f)
fprintf(f, "\tsched cq/qid call: %"PRIu64"\n", sw->sched_cq_qid_called);
fprintf(f, "\tsched no IQ enq: %"PRIu64"\n", sw->sched_no_iq_enqueues);
fprintf(f, "\tsched no CQ enq: %"PRIu64"\n", sw->sched_no_cq_enqueues);
- uint32_t inflights = rte_atomic32_read(&sw->inflights);
+ uint32_t inflights = rte_atomic_load_explicit(&sw->inflights, rte_memory_order_relaxed);
uint32_t credits = sw->nb_events_limit - inflights;
fprintf(f, "\tinflight %d, credits: %d\n", inflights, credits);
diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
index c159be21be..5e49b08030 100644
--- a/drivers/event/sw/sw_evdev.h
+++ b/drivers/event/sw/sw_evdev.h
@@ -8,7 +8,7 @@
#include "sw_evdev_log.h"
#include <rte_eventdev.h>
#include <eventdev_pmd_vdev.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#define SW_DEFAULT_CREDIT_QUANTA 32
#define SW_DEFAULT_SCHED_QUANTA 128
@@ -233,7 +233,7 @@ struct sw_evdev {
/* Contains all ports - load balanced and directed */
alignas(RTE_CACHE_LINE_SIZE) struct sw_port ports[SW_PORTS_MAX];
- alignas(RTE_CACHE_LINE_SIZE) rte_atomic32_t inflights;
+ alignas(RTE_CACHE_LINE_SIZE) RTE_ATOMIC(uint32_t) inflights;
/*
* max events in this instance. Cached here for performance.
diff --git a/drivers/event/sw/sw_evdev_worker.c b/drivers/event/sw/sw_evdev_worker.c
index 4215726513..0755def367 100644
--- a/drivers/event/sw/sw_evdev_worker.c
+++ b/drivers/event/sw/sw_evdev_worker.c
@@ -56,7 +56,7 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
uint8_t new_ops[PORT_ENQUEUE_MAX_BURST_SIZE];
struct sw_port *p = port;
struct sw_evdev *sw = (void *)p->sw;
- uint32_t sw_inflights = rte_atomic32_read(&sw->inflights);
+ uint32_t sw_inflights = rte_atomic_load_explicit(&sw->inflights, rte_memory_order_relaxed);
uint32_t credit_update_quanta = sw->credit_update_quanta;
int new = 0;
@@ -74,8 +74,10 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
if (sw_inflights + credit_update_quanta > sw->nb_events_limit)
return 0;
- rte_atomic32_add(&sw->inflights, credit_update_quanta);
- p->inflight_credits += (credit_update_quanta);
+ rte_atomic_fetch_add_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_acquire);
+ p->inflight_credits += credit_update_quanta;
/* If there are fewer inflight credits than new events, limit
* the number of enqueued events.
@@ -124,7 +126,9 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
/* Replenish credits if enough releases are performed */
if (p->inflight_credits >= credit_update_quanta * 2) {
- rte_atomic32_sub(&sw->inflights, credit_update_quanta);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_release);
p->inflight_credits -= credit_update_quanta;
}
@@ -150,7 +154,9 @@ sw_event_dequeue_burst(void *port, struct rte_event *ev, uint16_t num,
/* Replenish credits if enough releases are performed */
if (p->inflight_credits >= credit_update_quanta * 2) {
- rte_atomic32_sub(&sw->inflights, credit_update_quanta);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_release);
p->inflight_credits -= credit_update_quanta;
}
}
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 17/27] bus/vmbus: convert from rte_atomic to stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (15 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 18/27] common/dpaax: remove unused atomic macros Stephen Hemminger
` (9 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Long Li, Wei Hu
Replace deprecated rte_atomic32 operations in the vmbus ring buffer
producer with stdatomic equivalents, and replace the smp_wmb + CAS-spin
publish with rte_wait_until_equal_32 + release-store.
The two-cursor design is preserved: tbr->windex is the driver-private
reservation cursor that lets producers reserve slots concurrently
without a lock; vbr->windex is the host-visible commit cursor, updated
in reservation order so the host never observes windex pointing past
unwritten data. This is the lockless analogue of the spinlock-around-
single-cursor pattern used by the Linux (drivers/hv/ring_buffer.c
hv_ringbuffer_write) and FreeBSD (sys/dev/hyperv/vmbus/vmbus_br.c
vmbus_txbr_write) implementations of the same host contract.
The memory ordering mirrors __rte_ring_headtail_move_head and
__rte_ring_update_tail in lib/ring/rte_ring_c11_pvt.h: relaxed wait
for the previous producer's commit, release-store to publish. The
rte_smp_wmb before the publish is folded into the release ordering
on the store itself.
The host-shared vbr->windex remains volatile uint32_t in the packed
bufring struct; the atomic qualifier is added via cast at the access
site. The (uintptr_t) launder on the store-side cast suppresses a
spurious misaligned-atomic warning from the packed-struct attribute
(windex is 4-byte aligned in practice, at offset 0 of a page-aligned
struct).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
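The reserve-then-publish scheme described in the commit message can be sketched as follows. This is a minimal single-file illustration, not the driver code: it uses plain C11 <stdatomic.h> instead of the rte_stdatomic wrappers so it builds standalone, and every name in it (demo_ring, demo_avail, demo_ring_write, RING_SIZE) is hypothetical. The read index is held constant to keep the example short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 64u	/* hypothetical ring size in bytes */

struct demo_ring {
	_Atomic uint32_t reserve;	/* private reservation cursor */
	_Atomic uint32_t commit;	/* consumer-visible commit cursor */
	uint32_t rindex;		/* consumer read index, constant here */
};

/* Bytes that can still be written given read and write indexes. */
static uint32_t
demo_avail(uint32_t rindex, uint32_t windex)
{
	return windex >= rindex ? RING_SIZE - (windex - rindex)
				: rindex - windex;
}

/* Reserve len bytes, then publish them in reservation order. */
static bool
demo_ring_write(struct demo_ring *r, uint32_t len)
{
	uint32_t old, next;

	old = atomic_load_explicit(&r->reserve, memory_order_relaxed);
	do {
		/* Never fill the ring completely */
		if (demo_avail(r->rindex, old) <= len)
			return false;
		next = (old + len) % RING_SIZE;
		/* On failure the CAS refreshes old, so just recompute */
	} while (!atomic_compare_exchange_weak_explicit(&r->reserve,
			&old, next,
			memory_order_acquire, memory_order_relaxed));

	/* ... payload would be copied into [old, next) here ... */

	/* Wait until every earlier reservation has been published */
	while (atomic_load_explicit(&r->commit, memory_order_relaxed) != old)
		;

	/* Release orders the payload writes before the cursor update */
	atomic_store_explicit(&r->commit, next, memory_order_release);
	return true;
}

/* Single-threaded walk-through of the two cursors. */
static void
demo_ring_usage(void)
{
	struct demo_ring r = { 0 };

	assert(demo_ring_write(&r, 16));	/* reserves [0, 16) */
	assert(atomic_load(&r.commit) == 16);
	assert(!demo_ring_write(&r, 48));	/* would fill the ring */
}
```

The weak compare-exchange is acceptable in this sketch because a spurious failure only costs one more pass through the space check.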
drivers/bus/vmbus/private.h | 2 +-
drivers/bus/vmbus/vmbus_bufring.c | 39 +++++++++++++++++--------------
2 files changed, 23 insertions(+), 18 deletions(-)
diff --git a/drivers/bus/vmbus/private.h b/drivers/bus/vmbus/private.h
index 8ac6119ef2..42c4e81ac0 100644
--- a/drivers/bus/vmbus/private.h
+++ b/drivers/bus/vmbus/private.h
@@ -41,7 +41,7 @@ extern int vmbus_logtype_bus;
struct vmbus_br {
struct vmbus_bufring *vbr;
uint32_t dsize;
- uint32_t windex; /* next available location */
+ RTE_ATOMIC(uint32_t) windex; /* next available location */
};
#define UIO_NAME_MAX 64
diff --git a/drivers/bus/vmbus/vmbus_bufring.c b/drivers/bus/vmbus/vmbus_bufring.c
index fcb97287dc..624fe8b6c5 100644
--- a/drivers/bus/vmbus/vmbus_bufring.c
+++ b/drivers/bus/vmbus/vmbus_bufring.c
@@ -15,7 +15,7 @@
#include <rte_tailq.h>
#include <rte_log.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_memory.h>
#include <rte_pause.h>
#include <rte_bus_vmbus.h>
@@ -114,6 +114,7 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
uint32_t ring_size = tbr->dsize;
uint32_t old_windex, next_windex, windex, total;
uint64_t save_windex;
+ bool success;
int i;
total = 0;
@@ -121,17 +122,13 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
total += iov[i].iov_len;
total += sizeof(save_windex);
+ /* Get current free location */
+ old_windex = rte_atomic_load_explicit(&tbr->windex,
+ rte_memory_order_relaxed);
+
/* Reserve space in ring */
do {
- uint32_t avail;
-
- /* Get current free location */
- old_windex = tbr->windex;
-
- /* Prevent compiler reordering this with calculation */
- rte_compiler_barrier();
-
- avail = vmbus_br_availwrite(tbr, old_windex);
+ uint32_t avail = vmbus_br_availwrite(tbr, old_windex);
/* If not enough space in ring, then tell caller. */
if (avail <= total)
@@ -139,8 +136,13 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
next_windex = vmbus_br_idxinc(old_windex, total, ring_size);
- /* Atomic update of next write_index for other threads */
- } while (!rte_atomic32_cmpset(&tbr->windex, old_windex, next_windex));
+ /* Atomic update of next write_index for other threads
+ * Can use weak since easy to recompute and retry.
+ */
+ success = rte_atomic_compare_exchange_weak_explicit(
+ &tbr->windex, &old_windex, next_windex,
+ rte_memory_order_acquire, rte_memory_order_relaxed);
+ } while (unlikely(!success));
/* Space from old..new is now reserved */
windex = old_windex;
@@ -157,12 +159,15 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
/* The region reserved should match region used */
RTE_ASSERT(windex == next_windex);
- /* Ensure that data is available before updating host index */
- rte_smp_wmb();
+ /* Wait for previous producer to publish their windex update */
+ rte_wait_until_equal_32(&vbr->windex, old_windex, rte_memory_order_relaxed);
- /* Checkin for our reservation. wait for our turn to update host */
- while (!rte_atomic32_cmpset(&vbr->windex, old_windex, next_windex))
- rte_pause();
+ /* Publish our windex update; prior data writes ordered via release.
+ * windex is 4-byte aligned in practice (struct is page-aligned, windex
+ * at offset 0); cast launders the packed-struct alignment-1 attribute.
+ */
+ rte_atomic_store_explicit((volatile __rte_atomic uint32_t *)(uintptr_t)&vbr->windex,
+ next_windex, rte_memory_order_release);
/* If host had read all data before this, then need to signal */
*need_sig |= vmbus_txbr_need_signal(vbr, old_windex);
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 18/27] common/dpaax: remove unused atomic macros
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (16 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
` (8 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
The driver copy/pasted some macros that define an abstraction around
the now-deprecated rte_atomic32 API. They are unused; remove the dead
code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/common/dpaax/compat.h | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/common/dpaax/compat.h b/drivers/common/dpaax/compat.h
index d0635255da..580620caf0 100644
--- a/drivers/common/dpaax/compat.h
+++ b/drivers/common/dpaax/compat.h
@@ -365,20 +365,6 @@ static inline unsigned long get_zeroed_page(gfp_t __foo __rte_unused)
#define spin_lock_irqsave(x, f) spin_lock_irq(x)
#define spin_unlock_irqrestore(x, f) spin_unlock_irq(x)
-#define atomic_t rte_atomic32_t
-#define atomic_read(v) rte_atomic32_read(v)
-#define atomic_set(v, i) rte_atomic32_set(v, i)
-
-#define atomic_inc(v) rte_atomic32_add(v, 1)
-#define atomic_dec(v) rte_atomic32_sub(v, 1)
-
-#define atomic_inc_and_test(v) rte_atomic32_inc_and_test(v)
-#define atomic_dec_and_test(v) rte_atomic32_dec_and_test(v)
-
-#define atomic_inc_return(v) rte_atomic32_add_return(v, 1)
-#define atomic_dec_return(v) rte_atomic32_sub_return(v, 1)
-#define atomic_sub_and_test(i, v) (rte_atomic32_sub_return(v, i) == 0)
-
/* Interface name len*/
#define IF_NAME_MAX_LEN 16
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (17 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 18/27] common/dpaax: remove unused atomic macros Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
` (7 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Julien Aube
Replace the legacy rte_atomic32_* API on sc->scan_fp with the
equivalent rte_atomic_*_explicit C11 helpers, ahead of the
deprecation of rte_atomicNN_t and its associated wrappers.
All accesses use rte_memory_order_seq_cst, matching the semantics
of the legacy API. No functional change.
The scan_fp field is a notification flag between the slow-path
command poster (bnx2x_sp_post) and the fastpath task that reaps
ramrod completions (bnx2x_handle_fp_tq), also cleared from
ecore_state_wait on success, panic, and timeout.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/bnx2x/bnx2x.c | 6 +++---
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/ecore_sp.c | 6 +++---
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index 8790c858d5..027a0a50d5 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -1098,7 +1098,7 @@ bnx2x_sp_post(struct bnx2x_softc *sc, int command, int cid, uint32_t data_hi,
* Ask bnx2x_intr_intr() to process RAMROD
* completion whenever it gets scheduled.
*/
- rte_atomic32_set(&sc->scan_fp, 1);
+ rte_atomic_store_explicit(&sc->scan_fp, 1, rte_memory_order_seq_cst);
bnx2x_sp_prod_update(sc);
return 0;
@@ -4575,7 +4575,7 @@ static void bnx2x_handle_fp_tq(struct bnx2x_fastpath *fp)
/* update the fastpath index */
bnx2x_update_fp_sb_idx(fp);
- if (rte_atomic32_read(&sc->scan_fp) == 1) {
+ if (rte_atomic_load_explicit(&sc->scan_fp, rte_memory_order_seq_cst) == 1) {
if (bnx2x_has_rx_work(fp)) {
more_rx = bnx2x_rxeof(sc, fp);
}
@@ -4586,7 +4586,7 @@ static void bnx2x_handle_fp_tq(struct bnx2x_fastpath *fp)
return;
}
/* We have completed slow path completion, clear the flag */
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
}
bnx2x_ack_sb(sc, fp->igu_sb_id, USTORM_ID,
diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 35206b4758..c5de4b71aa 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -1043,7 +1043,7 @@ struct bnx2x_softc {
#define PERIODIC_STOP 0
#define PERIODIC_GO 1
volatile unsigned long periodic_flags;
- rte_atomic32_t scan_fp;
+ RTE_ATOMIC(uint32_t) scan_fp;
struct bnx2x_fastpath fp[MAX_RSS_CHAINS];
struct bnx2x_sp_objs sp_objs[MAX_RSS_CHAINS];
diff --git a/drivers/net/bnx2x/ecore_sp.c b/drivers/net/bnx2x/ecore_sp.c
index c6c3857778..33a40dea6e 100644
--- a/drivers/net/bnx2x/ecore_sp.c
+++ b/drivers/net/bnx2x/ecore_sp.c
@@ -299,21 +299,21 @@ static int ecore_state_wait(struct bnx2x_softc *sc, int state,
#ifdef ECORE_STOP_ON_ERROR
ECORE_MSG(sc, "exit (cnt %d)", 5000 - cnt);
#endif
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
return ECORE_SUCCESS;
}
ECORE_WAIT(sc, delay_us);
if (sc->panic) {
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
return ECORE_IO;
}
}
/* timeout! */
PMD_DRV_LOG(ERR, sc, "timeout waiting for state %d", state);
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
#ifdef ECORE_STOP_ON_ERROR
ecore_panic();
#endif
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 20/27] bus/fslmc: replace rte_atomic32 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (18 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
` (6 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
The atomic wrappers here are easily converted to stdatomic.
Drop the unused atomic_inc_return, atomic_dec_return and
atomic_sub_and_test macros.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
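Because fetch-add and fetch-sub return the value before the update, the "and test" helpers compare against the old value that makes the result zero. A small sketch of that reasoning, assuming plain C11 <stdatomic.h> rather than the rte_stdatomic wrappers; the demo_* names are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the counter is zero after the increment,
 * i.e. the previous value was UINT32_MAX and wrapped.
 */
static bool
demo_inc_and_test(_Atomic uint32_t *v)
{
	return atomic_fetch_add_explicit(v, 1, memory_order_seq_cst)
		== UINT32_MAX;
}

/* True if the counter is zero after the decrement,
 * i.e. the previous value was 1.
 */
static bool
demo_dec_and_test(_Atomic uint32_t *v)
{
	return atomic_fetch_sub_explicit(v, 1, memory_order_seq_cst) == 1;
}

/* Reference-count style usage: the last put observes zero. */
static void
demo_test_usage(void)
{
	_Atomic uint32_t refs = 2;

	assert(!demo_dec_and_test(&refs));	/* 2 -> 1 */
	assert(demo_dec_and_test(&refs));	/* 1 -> 0 */
}
```

With an unsigned counter, the old value that yields zero on increment is UINT32_MAX, which is the value a -1 operand converts to in the comparison.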
drivers/bus/fslmc/qbman/include/compat.h | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/bus/fslmc/qbman/include/compat.h b/drivers/bus/fslmc/qbman/include/compat.h
index 5a57bd8ed1..9c87f0b639 100644
--- a/drivers/bus/fslmc/qbman/include/compat.h
+++ b/drivers/bus/fslmc/qbman/include/compat.h
@@ -81,18 +81,13 @@ do { \
#define dma_wmb() rte_io_wmb()
-#define atomic_t rte_atomic32_t
-#define atomic_read(v) rte_atomic32_read(v)
-#define atomic_set(v, i) rte_atomic32_set(v, i)
-
-#define atomic_inc(v) rte_atomic32_add(v, 1)
-#define atomic_dec(v) rte_atomic32_sub(v, 1)
-
-#define atomic_inc_and_test(v) rte_atomic32_inc_and_test(v)
-#define atomic_dec_and_test(v) rte_atomic32_dec_and_test(v)
-
-#define atomic_inc_return(v) rte_atomic32_add_return(v, 1)
-#define atomic_dec_return(v) rte_atomic32_sub_return(v, 1)
-#define atomic_sub_and_test(i, v) (rte_atomic32_sub_return(v, i) == 0)
+typedef RTE_ATOMIC(uint32_t) atomic_t;
+
+#define atomic_read(v) rte_atomic_load_explicit((v), rte_memory_order_relaxed)
+#define atomic_set(v, i) rte_atomic_store_explicit((v), (i), rte_memory_order_relaxed)
+#define atomic_inc(v) ((void)rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_dec(v) ((void)rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_inc_and_test(v) (rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst) == UINT32_MAX)
+#define atomic_dec_and_test(v) (rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst) == 1)
#endif /* HEADER_COMPAT_H */
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 21/27] drivers/event: replace rte_atomic32 in selftests
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (19 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
` (5 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena, Jerin Jacob
Convert the last callers of the rte_atomicNN_*() family in these
selftests; the family is being deprecated.
Convert total_events from rte_atomic32_t to RTE_ATOMIC(uint32_t)
for the stack-local instance and __rte_atomic uint32_t * for the
pointer in test_core_param. Switch reads and updates to
rte_atomic_*_explicit().
Reads in the busy-loop checks and progress logs use relaxed: the
counter is purely a "drained yet?" signal and no data is published
through it. The fetch_sub on the dequeue path uses release in
octeontx (preserving the publish-after-mbuf-free ordering already
implied by the seq_cst sub it replaces) and relaxed in dpaa2.
The stack-local atomic_total_events is initialized by direct
assignment instead of rte_atomic32_set(), since it is written
before any worker is launched.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 26 +++++----
drivers/event/octeontx/ssovf_evdev_selftest.c | 58 ++++++++++---------
2 files changed, 46 insertions(+), 38 deletions(-)
diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
index 9d4938efe6..2c688bd194 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
@@ -2,7 +2,7 @@
* Copyright 2018-2019 NXP
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_debug.h>
@@ -49,7 +49,7 @@ struct event_attr {
};
struct test_core_param {
- rte_atomic32_t *total_events;
+ __rte_atomic uint32_t *total_events;
uint64_t dequeue_tmo_ticks;
uint8_t port;
uint8_t sched_type;
@@ -444,10 +444,10 @@ worker_multi_port_fn(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
int ret;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
@@ -455,13 +455,15 @@ worker_multi_port_fn(void *arg)
ret = validate_event(&ev);
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to validate event");
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_relaxed);
}
return 0;
}
static int
-wait_workers_to_join(int lcore, const rte_atomic32_t *count)
+wait_workers_to_join(int lcore, const __rte_atomic uint32_t *count)
{
uint64_t cycles, print_cycles;
@@ -472,15 +474,15 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
uint64_t new_cycles = rte_get_timer_cycles();
if (new_cycles - print_cycles > rte_get_timer_hz()) {
- dpaa2_evdev_dbg("\r%s: events %d", __func__,
- rte_atomic32_read(count));
+ dpaa2_evdev_dbg("\r%s: events %u", __func__,
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
print_cycles = new_cycles;
}
if (new_cycles - cycles > rte_get_timer_hz() * 10) {
dpaa2_evdev_info(
- "%s: No schedules for seconds, deadlock (%d)",
+ "%s: No schedules for seconds, deadlock (%u)",
__func__,
- rte_atomic32_read(count));
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
rte_event_dev_dump(evdev, stdout);
cycles = new_cycles;
return -1;
@@ -500,13 +502,13 @@ launch_workers_and_wait(int (*main_worker)(void *),
int w_lcore;
int ret;
struct test_core_param *param;
- rte_atomic32_t atomic_total_events;
+ RTE_ATOMIC(uint32_t) atomic_total_events;
uint64_t dequeue_tmo_ticks;
if (!nb_workers)
return 0;
- rte_atomic32_set(&atomic_total_events, total_events);
+ atomic_total_events = total_events;
RTE_BUILD_BUG_ON(NUM_PACKETS < MAX_EVENTS);
param = malloc(sizeof(struct test_core_param) * nb_workers);
diff --git a/drivers/event/octeontx/ssovf_evdev_selftest.c b/drivers/event/octeontx/ssovf_evdev_selftest.c
index b54ae126d2..500762af78 100644
--- a/drivers/event/octeontx/ssovf_evdev_selftest.c
+++ b/drivers/event/octeontx/ssovf_evdev_selftest.c
@@ -4,7 +4,7 @@
#include <stdlib.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_debug.h>
@@ -84,7 +84,7 @@ seqn_list_check(int limit)
}
struct test_core_param {
- rte_atomic32_t *total_events;
+ __rte_atomic uint32_t *total_events;
uint64_t dequeue_tmo_ticks;
uint8_t port;
uint8_t sched_type;
@@ -558,10 +558,10 @@ worker_multi_port_fn(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
int ret;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
@@ -569,13 +569,14 @@ worker_multi_port_fn(void *arg)
ret = validate_event(&ev);
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to validate event");
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+
+ rte_atomic_fetch_sub_explicit(total_events, 1, rte_memory_order_release);
}
return 0;
}
static inline int
-wait_workers_to_join(int lcore, const rte_atomic32_t *count)
+wait_workers_to_join(int lcore, const __rte_atomic uint32_t *count)
{
uint64_t cycles, print_cycles;
RTE_SET_USED(count);
@@ -585,15 +586,15 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
uint64_t new_cycles = rte_get_timer_cycles();
if (new_cycles - print_cycles > rte_get_timer_hz()) {
- ssovf_log_dbg("\r%s: events %d", __func__,
- rte_atomic32_read(count));
+ ssovf_log_dbg("\r%s: events %u", __func__,
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
print_cycles = new_cycles;
}
if (new_cycles - cycles > rte_get_timer_hz() * 10) {
ssovf_log_dbg(
- "%s: No schedules for seconds, deadlock (%d)",
+ "%s: No schedules for seconds, deadlock (%u)",
__func__,
- rte_atomic32_read(count));
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
rte_event_dev_dump(evdev, stdout);
cycles = new_cycles;
return -1;
@@ -613,13 +614,13 @@ launch_workers_and_wait(int (*main_worker)(void *),
int w_lcore;
int ret;
struct test_core_param *param;
- rte_atomic32_t atomic_total_events;
+ RTE_ATOMIC(uint32_t) atomic_total_events;
uint64_t dequeue_tmo_ticks;
if (!nb_workers)
return 0;
- rte_atomic32_set(&atomic_total_events, total_events);
+ atomic_total_events = total_events;
seqn_list_init();
param = malloc(sizeof(struct test_core_param) * nb_workers);
@@ -889,10 +890,10 @@ worker_flow_based_pipeline(void *arg)
uint16_t valid_event;
uint8_t port = param->port;
uint8_t new_sched_type = param->sched_type;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
uint64_t dequeue_tmo_ticks = param->dequeue_tmo_ticks;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1,
dequeue_tmo_ticks);
if (!valid_event)
@@ -910,7 +911,8 @@ worker_flow_based_pipeline(void *arg)
} else if (ev.sub_event_type == 1) { /* Events from stage 1*/
if (seqn_list_update(*rte_event_pmd_selftest_seqn(ev.mbuf)) == 0) {
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ssovf_log_dbg("Failed to update seqn_list");
return -1;
@@ -1044,10 +1046,10 @@ worker_group_based_pipeline(void *arg)
uint16_t valid_event;
uint8_t port = param->port;
uint8_t new_sched_type = param->sched_type;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
uint64_t dequeue_tmo_ticks = param->dequeue_tmo_ticks;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1,
dequeue_tmo_ticks);
if (!valid_event)
@@ -1065,7 +1067,8 @@ worker_group_based_pipeline(void *arg)
} else if (ev.queue_id == 1) { /* Events from stage 1(group 1)*/
if (seqn_list_update(*rte_event_pmd_selftest_seqn(ev.mbuf)) == 0) {
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ssovf_log_dbg("Failed to update seqn_list");
return -1;
@@ -1203,16 +1206,17 @@ worker_flow_based_pipeline_max_stages_rand_sched_type(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.sub_event_type == 255) { /* last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.sub_event_type++;
@@ -1278,16 +1282,17 @@ worker_queue_based_pipeline_max_stages_rand_sched_type(void *arg)
RTE_EVENT_DEV_ATTR_QUEUE_COUNT,
&queue_count), "Queue count get failed");
uint8_t nr_queues = queue_count;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.queue_id == nr_queues - 1) { /* last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.queue_id++;
@@ -1320,16 +1325,17 @@ worker_mixed_pipeline_max_stages_rand_sched_type(void *arg)
RTE_EVENT_DEV_ATTR_QUEUE_COUNT,
&queue_count), "Queue count get failed");
uint8_t nr_queues = queue_count;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.queue_id == nr_queues - 1) { /* Last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.queue_id++;
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 22/27] net/hinic: replace rte_atomic32 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (20 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 23/27] net/txgbe: " Stephen Hemminger
` (4 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Xiaoyun Wang
Convert dma_pool::inuse and hinic_os_dep::dma_alloc_cnt to
RTE_ATOMIC(uint32_t) and replace rte_atomic32_*() with the
rte_atomic_*_explicit() equivalents. The matching local variable
and log format change from int/%d to uint32_t/%u.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
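One detail worth keeping in mind when converting: the legacy rte_atomic32_add_return() yields the value after the addition, while a fetch-add yields the value before it. A sketch of a drop-in equivalent, assuming plain C11 <stdatomic.h>; demo_add_return is a hypothetical name:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Return the counter value after adding inc. */
static uint32_t
demo_add_return(_Atomic uint32_t *v, uint32_t inc)
{
	/* fetch_add returns the old value, so add inc back */
	return atomic_fetch_add_explicit(v, inc, memory_order_relaxed) + inc;
}

/* Each caller gets a distinct, 1-based sequence number. */
static void
demo_add_return_usage(void)
{
	_Atomic uint32_t cnt = 0;

	assert(demo_add_return(&cnt, 1) == 1);
	assert(demo_add_return(&cnt, 1) == 2);
}
```

Either return convention gives unique values for naming purposes; the adjustment only matters where the numbering itself has to stay unchanged.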
drivers/net/hinic/base/hinic_compat.h | 2 +-
drivers/net/hinic/base/hinic_pmd_hwdev.c | 24 ++++++++++++++----------
drivers/net/hinic/base/hinic_pmd_hwdev.h | 4 ++--
3 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/drivers/net/hinic/base/hinic_compat.h b/drivers/net/hinic/base/hinic_compat.h
index d3994c50e9..b0e42fa9bc 100644
--- a/drivers/net/hinic/base/hinic_compat.h
+++ b/drivers/net/hinic/base/hinic_compat.h
@@ -15,7 +15,7 @@
#include <rte_memzone.h>
#include <rte_memcpy.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_spinlock.h>
#include <rte_cycles.h>
#include <rte_log.h>
diff --git a/drivers/net/hinic/base/hinic_pmd_hwdev.c b/drivers/net/hinic/base/hinic_pmd_hwdev.c
index 818698dcb3..9a1b126632 100644
--- a/drivers/net/hinic/base/hinic_pmd_hwdev.c
+++ b/drivers/net/hinic/base/hinic_pmd_hwdev.c
@@ -116,7 +116,8 @@ static void *hinic_dma_mem_zalloc(struct hinic_hwdev *hwdev, size_t size,
dma_addr_t *dma_handle, unsigned int align,
unsigned int socket_id)
{
- int rc, alloc_cnt;
+ int rc;
+ uint32_t alloc_cnt;
const struct rte_memzone *mz;
char z_name[RTE_MEMZONE_NAMESIZE];
hash_sig_t sig;
@@ -125,8 +126,9 @@ static void *hinic_dma_mem_zalloc(struct hinic_hwdev *hwdev, size_t size,
if (dma_handle == NULL || 0 == size)
return NULL;
- alloc_cnt = rte_atomic32_add_return(&hwdev->os_dep.dma_alloc_cnt, 1);
- snprintf(z_name, sizeof(z_name), "%s_%d",
+ alloc_cnt = rte_atomic_fetch_add_explicit(&hwdev->os_dep.dma_alloc_cnt,
+ 1, rte_memory_order_relaxed) + 1;
+ snprintf(z_name, sizeof(z_name), "%s_%u",
hwdev->pcidev_hdl->name, alloc_cnt);
mz = rte_memzone_reserve_aligned(z_name, size, socket_id,
@@ -282,7 +284,6 @@ struct dma_pool *dma_pool_create(const char *name, void *dev,
if (!pool)
return NULL;
- rte_atomic32_set(&pool->inuse, 0);
pool->elem_size = size;
pool->align = align;
pool->boundary = boundary;
@@ -294,12 +295,15 @@ struct dma_pool *dma_pool_create(const char *name, void *dev,
void dma_pool_destroy(struct dma_pool *pool)
{
+ uint32_t inuse;
+
if (!pool)
return;
- if (rte_atomic32_read(&pool->inuse) != 0) {
- PMD_DRV_LOG(ERR, "Leak memory, dma_pool: %s, inuse_count: %d",
- pool->name, rte_atomic32_read(&pool->inuse));
+ inuse = rte_atomic_load_explicit(&pool->inuse, rte_memory_order_relaxed);
+ if (inuse != 0) {
+ PMD_DRV_LOG(ERR, "Leak memory, dma_pool: %s, inuse_count: %u",
+ pool->name, inuse);
}
rte_free(pool);
@@ -312,14 +316,14 @@ void *dma_pool_alloc(struct pci_pool *pool, dma_addr_t *dma_addr)
buf = hinic_dma_mem_zalloc(pool->hwdev, pool->elem_size, dma_addr,
(u32)pool->align, SOCKET_ID_ANY);
if (buf)
- rte_atomic32_inc(&pool->inuse);
+ rte_atomic_fetch_add_explicit(&pool->inuse, 1, rte_memory_order_relaxed);
return buf;
}
void dma_pool_free(struct pci_pool *pool, void *vaddr, dma_addr_t dma)
{
- rte_atomic32_dec(&pool->inuse);
+ rte_atomic_fetch_sub_explicit(&pool->inuse, 1, rte_memory_order_relaxed);
hinic_dma_mem_free(pool->hwdev, pool->elem_size, vaddr, dma);
}
@@ -329,7 +333,7 @@ int hinic_osdep_init(struct hinic_hwdev *hwdev)
struct rte_hash_parameters dh_params = { 0 };
struct rte_hash *paddr_hash = NULL;
- rte_atomic32_set(&hwdev->os_dep.dma_alloc_cnt, 0);
+ hwdev->os_dep.dma_alloc_cnt = 0;
rte_spinlock_init(&hwdev->os_dep.dma_hash_lock);
dh_params.name = hwdev->pcidev_hdl->name;
diff --git a/drivers/net/hinic/base/hinic_pmd_hwdev.h b/drivers/net/hinic/base/hinic_pmd_hwdev.h
index d6896b3f13..ad30ddd72e 100644
--- a/drivers/net/hinic/base/hinic_pmd_hwdev.h
+++ b/drivers/net/hinic/base/hinic_pmd_hwdev.h
@@ -18,7 +18,7 @@
/* dma pool */
struct dma_pool {
- rte_atomic32_t inuse;
+ RTE_ATOMIC(uint32_t) inuse;
size_t elem_size;
size_t align;
size_t boundary;
@@ -402,7 +402,7 @@ struct hinic_hilink_link_info {
/* dma os dependency implementation */
struct hinic_os_dep {
/* kernel dma alloc api */
- rte_atomic32_t dma_alloc_cnt;
+ RTE_ATOMIC(uint32_t) dma_alloc_cnt;
rte_spinlock_t dma_hash_lock;
struct rte_hash *dma_addr_hash;
};
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
* [PATCH v3 23/27] net/txgbe: replace rte_atomic32 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (21 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
` (3 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Jiawen Wu, Zaiyu Wang
The swfw_busy flag guarding the AML SW-FW mailbox is a one-bit lock,
so convert it to RTE_ATOMIC(bool) and replace the legacy
test-and-set / clear pair with explicit acquire-release:
rte_atomic32_test_and_set ->
rte_atomic_exchange_explicit(.., true, acquire)
rte_atomic32_clear ->
rte_atomic_store_explicit(.., false, release)
Acquire on the take pairs with release on the drop, so accesses
inside the critical section are synchronized between successive
holders. Default zero-initialization of struct txgbe_hw still
gives swfw_busy = false, so no init site needs updating.
Note: the AML lock code had a bug: rte_atomic32_test_and_set()
returns non-zero on success, the opposite of what the retry loop
expected. This patch fixes that. A separate patch for stable
has been sent upstream. (Drop this note from the commit message
when rebasing after the fix is merged.)
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
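The acquire/release pairing on a one-bit lock can be sketched as below. This is an illustration under assumptions, not the driver code: it uses plain C11 <stdatomic.h> so it builds standalone, and the demo_* names and the retry count are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Try once to take the lock.
 * exchange returns the previous value: false means it was free
 * and is now ours, true means another holder already has it.
 */
static bool
demo_try_lock(atomic_bool *busy)
{
	return !atomic_exchange_explicit(busy, true, memory_order_acquire);
}

/* Drop the lock; release pairs with the next holder's acquire. */
static void
demo_unlock(atomic_bool *busy)
{
	atomic_store_explicit(busy, false, memory_order_release);
}

/* Bounded retry; returns false if the lock stayed busy. */
static bool
demo_lock_timeout(atomic_bool *busy, unsigned int tries)
{
	while (!demo_try_lock(busy)) {
		if (--tries == 0)
			return false;
	}
	return true;
}

/* Take, contend, release, retake. */
static void
demo_lock_usage(void)
{
	atomic_bool busy = false;

	assert(demo_lock_timeout(&busy, 3));	/* free: taken at once */
	assert(!demo_lock_timeout(&busy, 3));	/* held: times out */
	demo_unlock(&busy);
	assert(demo_try_lock(&busy));		/* free again */
}
```

The loop condition spins while the previous value was true, which is the sense the retry loop needs: keep trying only while someone else holds the lock.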
drivers/net/txgbe/base/txgbe_mng.c | 4 ++--
drivers/net/txgbe/base/txgbe_type.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/txgbe/base/txgbe_mng.c b/drivers/net/txgbe/base/txgbe_mng.c
index a1974820b6..c58e1d6589 100644
--- a/drivers/net/txgbe/base/txgbe_mng.c
+++ b/drivers/net/txgbe/base/txgbe_mng.c
@@ -185,7 +185,7 @@ txgbe_host_interface_command_aml(struct txgbe_hw *hw, u32 *buffer,
}
/* try to get lock */
- while (rte_atomic32_test_and_set(&hw->swfw_busy)) {
+ while (rte_atomic_exchange_explicit(&hw->swfw_busy, true, rte_memory_order_acquire)) {
timeout--;
if (!timeout)
return TXGBE_ERR_TIMEOUT;
@@ -266,7 +266,7 @@ txgbe_host_interface_command_aml(struct txgbe_hw *hw, u32 *buffer,
/* index++, index replace txgbe_hic_hdr.checksum */
hw->swfw_index = resp->index == TXGBE_HIC_HDR_INDEX_MAX ?
0 : resp->index + 1;
- rte_atomic32_clear(&hw->swfw_busy);
+ rte_atomic_store_explicit(&hw->swfw_busy, false, rte_memory_order_release);
return err;
}
diff --git a/drivers/net/txgbe/base/txgbe_type.h b/drivers/net/txgbe/base/txgbe_type.h
index ede780321f..d3c82d51a4 100644
--- a/drivers/net/txgbe/base/txgbe_type.h
+++ b/drivers/net/txgbe/base/txgbe_type.h
@@ -880,7 +880,7 @@ struct txgbe_hw {
rte_spinlock_t phy_lock;
/*amlite: new SW-FW mbox */
u8 swfw_index;
- rte_atomic32_t swfw_busy;
+ RTE_ATOMIC(bool) swfw_busy;
u32 fec_mode;
u32 cur_fec_link;
};
--
2.53.0
* [PATCH v3 24/27] net/vhost: use stdatomic instead of rte_atomic32
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (22 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 23/27] net/txgbe: " Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
` (2 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Maxime Coquelin, Chenbo Xia
Convert allow_queuing, while_queuing, started, and dev_attached from
rte_atomic32_t to RTE_ATOMIC(uint32_t) and replace rte_atomic32_*()
with rte_atomic_*_explicit().
The data-path / control-thread handshake on allow_queuing and
while_queuing is a Dekker-style mutual-visibility pattern: each side
stores its own flag and then loads the peer's. Both legs must be
seq_cst to forbid store-load reordering; anything weaker permits both
sides to miss each other. The previous rte_atomic32_set/read compiled
to plain volatile stores/loads and provided no such ordering, so this
also closes a latent ordering hole on weakly-ordered ISAs.
The data-path exit store of while_queuing=0 is release, ordering
preceding slot accesses before the control thread observes the data
path as idle.
The flags started and dev_attached are consulted only inside
update_queuing_status, where the per-queue handshake provides the
real synchronization; their loads and stores are relaxed.
Factor the per-queue allow_queuing store and while_queuing wait into
a small update_queue() helper used by both rx and tx loops.
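The handshake described above can be sketched on its own as follows. This is an illustration under assumptions, not the driver code: it uses plain C11 <stdatomic.h> instead of the rte_atomic_*_explicit() wrappers, struct queue_flags and the three helpers are hypothetical names, and the control side returns an idle indication instead of spinning as the patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of struct vhost_queue to the two flags. */
struct queue_flags {
	_Atomic uint32_t allow_queuing; /* written by the control thread */
	_Atomic uint32_t while_queuing; /* written by the data path */
};

/*
 * Data path entry. Returns true if the queue may be used; the caller
 * must then call datapath_exit() when done.
 */
static bool
datapath_enter(struct queue_flags *q)
{
	/* Cheap early exit; a stale value is caught by the re-check below. */
	if (atomic_load_explicit(&q->allow_queuing, memory_order_relaxed) == 0)
		return false;

	/* Store own flag, then load the peer's flag: both seq_cst. */
	atomic_store_explicit(&q->while_queuing, 1, memory_order_seq_cst);
	if (atomic_load_explicit(&q->allow_queuing, memory_order_seq_cst) == 0) {
		atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
		return false;
	}
	return true;
}

/* Data path exit: release orders prior queue accesses before the flag drop. */
static void
datapath_exit(struct queue_flags *q)
{
	atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
}

/*
 * Control side: publish the new state, then check the peer's flag.
 * Returns true if the data path is idle; a real caller would poll
 * until this is the case.
 */
static bool
control_set_allow(struct queue_flags *q, uint32_t allow)
{
	atomic_store_explicit(&q->allow_queuing, allow, memory_order_seq_cst);
	return atomic_load_explicit(&q->while_queuing, memory_order_seq_cst) == 0;
}
```

With both store-then-load pairs seq_cst, at least one side observes the other's store: either the data path sees allow_queuing == 0 and backs out, or the control thread sees while_queuing == 1 and waits.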
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/vhost/rte_eth_vhost.c | 103 +++++++++++++++++++-----------
1 file changed, 65 insertions(+), 38 deletions(-)
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 05940f2461..3b1eedfe42 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -73,8 +73,8 @@ struct vhost_stats {
struct vhost_queue {
int vid;
- rte_atomic32_t allow_queuing;
- rte_atomic32_t while_queuing;
+ RTE_ATOMIC(uint32_t) allow_queuing;
+ RTE_ATOMIC(uint32_t) while_queuing;
struct pmd_internal *internal;
struct rte_mempool *mb_pool;
uint16_t port;
@@ -86,14 +86,14 @@ struct vhost_queue {
};
struct pmd_internal {
- rte_atomic32_t dev_attached;
+ RTE_ATOMIC(uint32_t) dev_attached;
char *iface_name;
uint64_t flags;
uint64_t disable_flags;
uint64_t features;
uint16_t max_queues;
int vid;
- rte_atomic32_t started;
+ RTE_ATOMIC(uint32_t) started;
bool vlan_strip;
bool rx_sw_csum;
bool tx_sw_csum;
@@ -406,12 +406,19 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
uint16_t i, nb_rx = 0;
uint16_t nb_receive = nb_bufs;
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Fast-path early exit; racy load is fine here -- if we miss a
+ * transition we get caught by the seq_cst check below.
+ */
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_relaxed) == 0))
return 0;
- rte_atomic32_set(&r->while_queuing, 1);
-
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Announce presence, then re-check. The store and the following
+ * load MUST both be seq_cst so they are totally ordered with the
+ * control thread's store-to-allow_queuing / load-of-while_queuing
+ * pair. Anything weaker permits both sides to miss each other.
+ */
+ rte_atomic_store_explicit(&r->while_queuing, 1, rte_memory_order_seq_cst);
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_seq_cst) == 0))
goto out;
/* Dequeue packets from guest TX queue */
@@ -446,7 +453,7 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
}
out:
- rte_atomic32_set(&r->while_queuing, 0);
+ rte_atomic_store_explicit(&r->while_queuing, 0, rte_memory_order_release);
return nb_rx;
}
@@ -460,12 +467,19 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
uint64_t nb_bytes = 0;
uint64_t nb_missed = 0;
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Fast-path early exit; racy load is fine here -- if we miss a
+ * transition we get caught by the seq_cst check below.
+ */
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_relaxed) == 0))
return 0;
- rte_atomic32_set(&r->while_queuing, 1);
-
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Announce presence, then re-check. The store and the following
+ * load MUST both be seq_cst so they are totally ordered with the
+ * control thread's store-to-allow_queuing / load-of-while_queuing
+ * pair. Anything weaker permits both sides to miss each other.
+ */
+ rte_atomic_store_explicit(&r->while_queuing, 1, rte_memory_order_seq_cst);
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_seq_cst) == 0))
goto out;
for (i = 0; i < nb_bufs; i++) {
@@ -515,7 +529,7 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
for (i = 0; likely(i < nb_tx); i++)
rte_pktmbuf_free(bufs[i]);
out:
- rte_atomic32_set(&r->while_queuing, 0);
+ rte_atomic_store_explicit(&r->while_queuing, 0, rte_memory_order_release);
return nb_tx;
}
@@ -744,6 +758,19 @@ eth_vhost_unconfigure_intr(struct rte_eth_dev *eth_dev)
}
}
+static inline void
+update_queue(struct vhost_queue *vq, uint32_t allow, bool wait_queuing)
+{
+ /* seq_cst: pairs with the data-path's seq_cst store of
+ * while_queuing and seq_cst load of allow_queuing. See
+ * eth_vhost_rx().
+ */
+ rte_atomic_store_explicit(&vq->allow_queuing, allow, rte_memory_order_seq_cst);
+ if (wait_queuing)
+ rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&vq->while_queuing,
+ 0, rte_memory_order_seq_cst);
+}
+
static void
update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
{
@@ -751,14 +778,18 @@ update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
struct vhost_queue *vq;
struct rte_vhost_vring_state *state;
unsigned int i;
- int allow_queuing = 1;
+ bool allow_queuing = true;
if (!dev->data->rx_queues || !dev->data->tx_queues)
return;
- if (rte_atomic32_read(&internal->started) == 0 ||
- rte_atomic32_read(&internal->dev_attached) == 0)
- allow_queuing = 0;
+ /* These are control-plane flags consulted only here;
+ * the real data-path handshake is on vq->allow_queuing below.
+ * Relaxed is sufficient.
+ */
+ if (rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed) == 0 ||
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed) == 0)
+ allow_queuing = false;
state = vring_states[dev->data->port_id];
@@ -767,24 +798,18 @@ update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
vq = dev->data->rx_queues[i];
if (vq == NULL)
continue;
- if (allow_queuing && state->cur[vq->virtqueue_id])
- rte_atomic32_set(&vq->allow_queuing, 1);
- else
- rte_atomic32_set(&vq->allow_queuing, 0);
- while (wait_queuing && rte_atomic32_read(&vq->while_queuing))
- rte_pause();
+
+ update_queue(vq, !!(allow_queuing && state->cur[vq->virtqueue_id]),
+ wait_queuing);
}
for (i = 0; i < dev->data->nb_tx_queues; i++) {
vq = dev->data->tx_queues[i];
if (vq == NULL)
continue;
- if (allow_queuing && state->cur[vq->virtqueue_id])
- rte_atomic32_set(&vq->allow_queuing, 1);
- else
- rte_atomic32_set(&vq->allow_queuing, 0);
- while (wait_queuing && rte_atomic32_read(&vq->while_queuing))
- rte_pause();
+
+ update_queue(vq, !!(allow_queuing && state->cur[vq->virtqueue_id]),
+ wait_queuing);
}
}
@@ -848,7 +873,7 @@ new_device(int vid)
}
internal->vid = vid;
- if (rte_atomic32_read(&internal->started) == 1) {
+ if (rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed) == 1) {
queue_setup(eth_dev, internal);
if (dev_conf->intr_conf.rxq)
eth_vhost_configure_intr(eth_dev);
@@ -863,7 +888,7 @@ new_device(int vid)
vhost_dev_csum_configure(eth_dev);
- rte_atomic32_set(&internal->dev_attached, 1);
+ rte_atomic_store_explicit(&internal->dev_attached, 1, rte_memory_order_relaxed);
update_queuing_status(eth_dev, false);
VHOST_LOG_LINE(INFO, "Vhost device %d created", vid);
@@ -893,7 +918,7 @@ destroy_device(int vid)
eth_dev = list->eth_dev;
internal = eth_dev->data->dev_private;
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, 0, rte_memory_order_relaxed);
update_queuing_status(eth_dev, true);
eth_vhost_unconfigure_intr(eth_dev);
@@ -1148,11 +1173,11 @@ eth_dev_start(struct rte_eth_dev *eth_dev)
}
queue_setup(eth_dev, internal);
- if (rte_atomic32_read(&internal->dev_attached) == 1 &&
+ if (rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed) == 1 &&
dev_conf->intr_conf.rxq)
eth_vhost_configure_intr(eth_dev);
- rte_atomic32_set(&internal->started, 1);
+ rte_atomic_store_explicit(&internal->started, 1, rte_memory_order_relaxed);
update_queuing_status(eth_dev, false);
for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
@@ -1170,7 +1195,7 @@ eth_dev_stop(struct rte_eth_dev *dev)
uint16_t i;
dev->data->dev_started = 0;
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, 0, rte_memory_order_relaxed);
update_queuing_status(dev, true);
for (i = 0; i < dev->data->nb_rx_queues; i++)
@@ -1471,8 +1496,10 @@ vhost_dev_priv_dump(struct rte_eth_dev *dev, FILE *f)
fprintf(f, "features: 0x%" PRIx64 "\n", internal->features);
fprintf(f, "max_queues: %u\n", internal->max_queues);
fprintf(f, "vid: %d\n", internal->vid);
- fprintf(f, "started: %d\n", rte_atomic32_read(&internal->started));
- fprintf(f, "dev_attached: %d\n", rte_atomic32_read(&internal->dev_attached));
+ fprintf(f, "started: %u\n",
+ rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed));
+ fprintf(f, "dev_attached: %u\n",
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed));
fprintf(f, "vlan_strip: %d\n", internal->vlan_strip);
fprintf(f, "rx_sw_csum: %d\n", internal->rx_sw_csum);
fprintf(f, "tx_sw_csum: %d\n", internal->tx_sw_csum);
--
2.53.0
* [PATCH v3 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (23 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Last in-tree caller of rte_atomic32_*(), blocking deprecation of the
rte_atomicNN_*() family.
Replace rte_atomic32_read/set() with rte_atomic_load_explicit() and
rte_atomic_store_explicit() on the started, dev_attached, and running
flags. Narrow them to bool (only ever hold 0/1) and group with the
existing bools to reduce padding in struct ifcvf_internal.
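To make the padding argument concrete, here is a rough sketch with two hypothetical structs that only mimic the relevant members of struct ifcvf_internal. The int32_t lock member and the trailing pointer are stand-ins chosen for the illustration, and actual sizes depend on the target ABI.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: three 32-bit flags ahead of the lock. */
struct flags_before {
	uint64_t features;
	int32_t started;
	int32_t dev_attached;
	int32_t running;
	int32_t lock;             /* stand-in for rte_spinlock_t */
	bool sw_lm;
	bool sw_fallback_running;
	void *vring;              /* stand-in for the members that follow */
};

/* Hypothetical "after" layout: flags narrowed and grouped with the bools. */
struct flags_after {
	uint64_t features;
	int32_t lock;             /* stand-in for rte_spinlock_t */
	atomic_bool started;
	atomic_bool dev_attached;
	atomic_bool running;
	bool sw_lm;
	bool sw_fallback_running;
	void *vring;              /* stand-in for the members that follow */
};

/* The regrouped layout should never be larger than the original one. */
_Static_assert(sizeof(struct flags_after) <= sizeof(struct flags_before),
	       "grouping the narrowed flags must not grow the struct");
```

Running pahole on the real object file is the authoritative way to check the layout of struct ifcvf_internal; the sketch only shows why grouping the one-byte members helps.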
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/vdpa/ifc/ifcvf_vdpa.c | 37 ++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
index f319d455ba..e5da11a2ba 100644
--- a/drivers/vdpa/ifc/ifcvf_vdpa.c
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -25,6 +25,7 @@
#include <rte_log.h>
#include <rte_kvargs.h>
#include <rte_devargs.h>
+#include <rte_stdatomic.h>
#include "base/ifcvf.h"
@@ -68,10 +69,10 @@ struct ifcvf_internal {
struct rte_vdpa_device *vdev;
uint16_t max_queues;
uint64_t features;
- rte_atomic32_t started;
- rte_atomic32_t dev_attached;
- rte_atomic32_t running;
rte_spinlock_t lock;
+ RTE_ATOMIC(bool) started;
+ RTE_ATOMIC(bool) dev_attached;
+ RTE_ATOMIC(bool) running;
bool sw_lm;
bool sw_fallback_running;
/* mediated vring for sw fallback */
@@ -712,9 +713,9 @@ update_datapath(struct ifcvf_internal *internal)
rte_spinlock_lock(&internal->lock);
- if (!rte_atomic32_read(&internal->running) &&
- (rte_atomic32_read(&internal->started) &&
- rte_atomic32_read(&internal->dev_attached))) {
+ if (!rte_atomic_load_explicit(&internal->running, rte_memory_order_seq_cst) &&
+ (rte_atomic_load_explicit(&internal->started, rte_memory_order_seq_cst) &&
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_seq_cst))) {
ret = ifcvf_dma_map(internal, true);
if (ret)
goto err;
@@ -735,10 +736,10 @@ update_datapath(struct ifcvf_internal *internal)
if (ret)
goto err;
- rte_atomic32_set(&internal->running, 1);
- } else if (rte_atomic32_read(&internal->running) &&
- (!rte_atomic32_read(&internal->started) ||
- !rte_atomic32_read(&internal->dev_attached))) {
+ rte_atomic_store_explicit(&internal->running, true, rte_memory_order_seq_cst);
+ } else if (rte_atomic_load_explicit(&internal->running, rte_memory_order_seq_cst) &&
+ (!rte_atomic_load_explicit(&internal->started, rte_memory_order_seq_cst) ||
+ !rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_seq_cst))) {
unset_intr_relay(internal);
ret = unset_notify_relay(internal);
@@ -755,7 +756,7 @@ update_datapath(struct ifcvf_internal *internal)
if (ret)
goto err;
- rte_atomic32_set(&internal->running, 0);
+ rte_atomic_store_explicit(&internal->running, false, rte_memory_order_seq_cst);
}
rte_spinlock_unlock(&internal->lock);
@@ -1058,7 +1059,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
vdpa_disable_vfio_intr(internal);
- rte_atomic32_set(&internal->running, 0);
+ rte_atomic_store_explicit(&internal->running, false, rte_memory_order_seq_cst);
ret = rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false);
if (ret && ret != -ENOTSUP)
@@ -1113,11 +1114,11 @@ ifcvf_dev_config(int vid)
internal = list->internal;
internal->vid = vid;
- rte_atomic32_set(&internal->dev_attached, 1);
+ rte_atomic_store_explicit(&internal->dev_attached, true, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath for vDPA device %s",
vdev->device->name);
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, false, rte_memory_order_seq_cst);
return -1;
}
@@ -1166,7 +1167,7 @@ ifcvf_dev_close(int vid)
internal->sw_fallback_running = false;
} else {
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, false, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath for vDPA device %s",
vdev->device->name);
@@ -1782,10 +1783,10 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
goto error;
}
- rte_atomic32_set(&internal->started, 1);
+ rte_atomic_store_explicit(&internal->started, true, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath %s", pci_dev->name);
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, false, rte_memory_order_seq_cst);
rte_vdpa_unregister_device(internal->vdev);
pthread_mutex_lock(&internal_list_lock);
TAILQ_REMOVE(&internal_list, list, next);
@@ -1819,7 +1820,7 @@ ifcvf_pci_remove(struct rte_pci_device *pci_dev)
}
internal = list->internal;
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, false, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0)
DRV_LOG(ERR, "failed to update datapath %s", pci_dev->name);
--
2.53.0
* [PATCH v3 26/27] test/atomic: suppress deprecation warnings for legacy APIs
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (24 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomicNN_* APIs are marked __rte_deprecated by the next patch
in this series. Wrap the whole file in __rte_diagnostic_push /
__rte_diagnostic_pop and a GCC pragma that ignores
-Wdeprecated-declarations.
In future, when the APIs are removed, this test collapses to just the
128-bit compare-and-swap case and the suppression goes with it.
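A minimal sketch of the suppression technique, assuming GCC or clang. It spells out the raw pragmas rather than the DPDK __rte_diagnostic_push / __rte_diagnostic_pop macros, and legacy_add() and uses_legacy() are hypothetical names standing in for a deprecated rte_atomicNN_*() function and its caller.

```c
/* Hypothetical deprecated function standing in for rte_atomicNN_*(). */
static __attribute__((deprecated)) inline int
legacy_add(int a, int b)
{
	return a + b;
}

/* Save the current diagnostic state, then silence the deprecation warning. */
_Pragma("GCC diagnostic push")
_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")

/* Calls between push and pop compile without -Wdeprecated-declarations. */
static int
uses_legacy(void)
{
	return legacy_add(2, 3);
}

/* Restore the saved state so later code is checked normally again. */
_Pragma("GCC diagnostic pop")
```

Any call to legacy_add() placed after the pop would warn again, which is why the test file keeps the pop at the very end.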
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_atomic.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index 2a4531b833..f32a1aeff4 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -100,6 +100,15 @@
* - At the end of the test, the number of corrupted tokens must be 0.
*/
+/*
+ * The rte_atomicNN_* APIs exercised below are deprecated in favour of C11 atomics.
+ * Suppress the deprecation warnings for the whole file;
+ * when the APIs are removed this test collapses to the 128-bit
+ * compare-and-swap case and the suppression goes with it.
+ */
+__rte_diagnostic_push
+_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
+
#define NUM_ATOMIC_TYPES 3
#define N_BASE 1000000u
@@ -645,4 +654,7 @@ test_atomic(void)
return 0;
}
REGISTER_FAST_TEST(atomic_autotest, NOHUGE_SKIP, ASAN_OK, test_atomic);
+
+__rte_diagnostic_pop
+
#endif /* RTE_TOOLCHAIN_MSVC */
--
2.53.0
* [PATCH v3 27/27] eal: mark rte_atomicNN as deprecated
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (25 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
@ 2026-05-23 19:56 ` Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-23 19:56 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Thomas Monjalon
The rte_atomicNN functions were announced as deprecated back in 2021,
but code was still using them. Now that all in-tree code has been
updated, the functions are marked with __rte_deprecated so that user
code will see the deprecation warning.
Remove rte_atomicNN from checkpatches since usage is now
detected at compile time.
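For application code that starts seeing the warning, a conversion might look like the sketch below. It is an assumption-laden illustration: it uses plain C11 <stdatomic.h> rather than the rte_atomic_*_explicit() wrappers, struct app_stats and its helpers are hypothetical, and the memory orders simply mirror the legacy behaviour (seq_cst read-modify-write, plain load); a weaker order may be fine but has to be judged per call site.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical application object that previously held an rte_atomic32_t. */
struct app_stats {
	_Atomic int32_t refcnt;
};

/* was: rte_atomic32_inc(&s->refcnt) */
static void
stats_get(struct app_stats *s)
{
	atomic_fetch_add_explicit(&s->refcnt, 1, memory_order_seq_cst);
}

/* was: rte_atomic32_dec_and_test(&s->refcnt); returns 1 when it hits zero */
static int
stats_put(struct app_stats *s)
{
	/* fetch_sub returns the value before the subtraction */
	return atomic_fetch_sub_explicit(&s->refcnt, 1, memory_order_seq_cst) - 1 == 0;
}

/* was: rte_atomic32_read(&s->refcnt) */
static int32_t
stats_read(struct app_stats *s)
{
	return atomic_load_explicit(&s->refcnt, memory_order_relaxed);
}
```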
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
devtools/checkpatches.sh | 8 ---
doc/guides/rel_notes/deprecation.rst | 4 +-
doc/guides/rel_notes/release_26_07.rst | 4 ++
lib/eal/include/generic/rte_atomic.h | 99 ++++++++++++++------------
4 files changed, 59 insertions(+), 56 deletions(-)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 81bb0fe4e8..a0cbbf09db 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -113,14 +113,6 @@ check_forbidden_additions() { # <patch>
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
- # refrain from new additions of 16/32/64 bits rte_atomicNN_xxx()
- awk -v FOLDERS="lib drivers app examples" \
- -v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
- -v RET_ON_FAIL=1 \
- -v MESSAGE='Using rte_atomicNN_xxx' \
- -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
- "$1" || res=1
-
# refrain from using compiler __sync_xxx builtins
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="__sync_.*\\\(" \
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2190419f79..5d9226d551 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -43,9 +43,7 @@ Deprecation Notices
* rte_atomicNN_xxx: These APIs do not take memory order parameter. This does
not allow for writing optimized code for all the CPU architectures supported
in DPDK. DPDK has adopted the atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations must be used for patches that need to be merged in 20.08 onwards.
- This change will not introduce any performance degradation.
+ https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html.
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index f012d47a4b..2c8faffee9 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -92,6 +92,10 @@ API Changes
Also, make sure to start the actual text at the margin.
=======================================================
+* atomic: Marked the ``rte_atomicNN`` functions as deprecated.
+ As previously announced, these functions were intended to be deprecated,
+ but this was not being enforced.
+
ABI Changes
-----------
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 1b04b43cbb..b660f9fc1c 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -152,6 +152,13 @@ rte_smp_rmb(void)
rte_atomic_thread_fence(rte_memory_order_acquire);
}
+/*
+ * The rte_atomicNN_* APIs defined below are deprecated in favour of C11 atomics.
+ * Suppress the deprecation warnings for the inlines to allow inter-related usage.
+ */
+__rte_diagnostic_push
+_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -172,7 +179,7 @@ rte_smp_rmb(void)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -193,7 +200,7 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* @return
* The original value at that location
*/
-static inline uint16_t
+static __rte_deprecated inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
@@ -218,7 +225,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_init(rte_atomic16_t *v)
{
v->cnt = 0;
@@ -232,7 +239,7 @@ rte_atomic16_init(rte_atomic16_t *v)
* @return
* The value of the counter.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_read(const rte_atomic16_t *v)
{
return v->cnt;
@@ -246,7 +253,7 @@ rte_atomic16_read(const rte_atomic16_t *v)
* @param new_value
* The new value for the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
{
v->cnt = new_value;
@@ -260,7 +267,7 @@ rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, inc,
@@ -275,7 +282,7 @@ rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, dec,
@@ -288,7 +295,7 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
@@ -300,7 +307,7 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
@@ -319,7 +326,7 @@ rte_atomic16_dec(rte_atomic16_t *v)
* @return
* The value of v after the addition.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, inc,
@@ -340,7 +347,7 @@ rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, dec,
@@ -358,7 +365,7 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
@@ -375,7 +382,7 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
@@ -392,7 +399,7 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
@@ -403,7 +410,7 @@ static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic16_clear(rte_atomic16_t *v)
+static __rte_deprecated inline void rte_atomic16_clear(rte_atomic16_t *v)
{
v->cnt = 0;
}
@@ -426,7 +433,7 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -447,7 +454,7 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* @return
* The original value at that location
*/
-static inline uint32_t
+static __rte_deprecated inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
@@ -472,7 +479,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_init(rte_atomic32_t *v)
{
v->cnt = 0;
@@ -486,7 +493,7 @@ rte_atomic32_init(rte_atomic32_t *v)
* @return
* The value of the counter.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_read(const rte_atomic32_t *v)
{
return v->cnt;
@@ -500,7 +507,7 @@ rte_atomic32_read(const rte_atomic32_t *v)
* @param new_value
* The new value for the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
{
v->cnt = new_value;
@@ -514,7 +521,7 @@ rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, inc,
@@ -529,7 +536,7 @@ rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, dec,
@@ -542,7 +549,7 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
@@ -554,7 +561,7 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
@@ -573,7 +580,7 @@ rte_atomic32_dec(rte_atomic32_t *v)
* @return
* The value of v after the addition.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, inc,
@@ -594,7 +601,7 @@ rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, dec,
@@ -612,7 +619,7 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
@@ -629,7 +636,7 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
@@ -646,7 +653,7 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
@@ -657,7 +664,7 @@ static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic32_clear(rte_atomic32_t *v)
+static __rte_deprecated inline void rte_atomic32_clear(rte_atomic32_t *v)
{
v->cnt = 0;
}
@@ -679,7 +686,7 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -700,7 +707,7 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* @return
* The original value at that location
*/
-static inline uint64_t
+static __rte_deprecated inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
@@ -725,7 +732,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_init(rte_atomic64_t *v)
{
#ifdef __LP64__
@@ -750,7 +757,7 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
#ifdef __LP64__
@@ -777,7 +784,7 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
#ifdef __LP64__
@@ -802,7 +809,7 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
@@ -817,7 +824,7 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
@@ -830,7 +837,7 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
@@ -842,7 +849,7 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
@@ -861,7 +868,7 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
@@ -881,7 +888,7 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
@@ -899,7 +906,7 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
@@ -915,7 +922,7 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
@@ -931,7 +938,7 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
@@ -942,13 +949,15 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
+static __rte_deprecated inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
#endif
+__rte_diagnostic_pop
+
/*------------------------ 128 bit atomic operations -------------------------*/
/**
--
2.53.0
* RE: [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-23 19:16 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
@ 2026-05-25 7:41 ` Konstantin Ananyev
2026-05-25 14:31 ` Stephen Hemminger
2026-05-25 15:35 ` Stephen Hemminger
0 siblings, 2 replies; 105+ messages in thread
From: Konstantin Ananyev @ 2026-05-25 7:41 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Hi Stephen,
> The rte_atomic32_cmpset is deprecated. Initial attempts at
> changing this with direct conversion to
> rte_atomic_compare_exchange_weak_explicit()
> regressed MP/MC contended performance on x86 by 10-30%,
> because the C11 builtin's failure-writeback semantic forces
> GCC to emit extra instructions on the CAS critical path.
>
> Add an internal __rte_ring_compare_and_swap() wrapper that calls
> __sync_bool_compare_and_swap() directly, which keeps the original
> instruction sequence. Add equivalent function for MSVC.
In fact, in rte_ring we do have 2 implementations of the same core functions:
lib/ring/rte_ring_c11_pvt.h - uses C11 atomics
lib/ring/rte_ring_generic_pvt.h - uses legacy instructions (smp_mb, etc.).
If we are going to remove these legacy instructions anyway (or reimplement them using C11 atomics),
then there is probably no point in keeping rte_ring_generic_pvt.h.
Konstantin
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/ring/rte_ring_generic_pvt.h | 32 ++++++++++++++++++++++++++++----
> 1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_generic_pvt.h
> index affd2d5ba7..0fb972de9e 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_generic_pvt.h
> @@ -18,6 +18,30 @@
> * For more information please refer to <rte_ring.h>.
> */
>
> +/**
> + * @internal optimized version of compare exchange
> + *
> + * The C11 builtin's failure-writeback semantic generates worse code on x86.
> + * Unlike rte_atomic_compare_exchange_*_explicit(), this wrapper does NOT
> + * write the actual value back to a pointer on failure. Callers in a retry
> + * loop must reload the expected value explicitly on the next iteration.
> + *
> + * Full memory barrier, equivalent to rte_memory_order_seq_cst on both
> + * success and failure.
> + */
> +static __rte_always_inline bool
> +__rte_ring_compare_and_swap(volatile uint32_t *dst,
> + uint32_t expected, uint32_t desired)
> +{
> +#if defined(RTE_TOOLCHAIN_MSVC)
> + return _InterlockedCompareExchange((volatile long *)dst,
> + (long)desired, (long)expected)
> + == (long)expected;
> +#else
> + return __sync_bool_compare_and_swap(dst, expected, desired);
> +#endif
> +}
> +
> /**
> * @internal This function updates tail values.
> */
> @@ -108,10 +132,10 @@ __rte_ring_headtail_move_head(struct
> rte_ring_headtail *d,
> if (is_st) {
> d->head = *new_head;
> success = 1;
> - } else
> - success = rte_atomic32_cmpset(
> - (uint32_t *)(uintptr_t)&d->head,
> - *old_head, *new_head);
> + } else {
> + success = __rte_ring_compare_and_swap(
> + &d->head, *old_head, *new_head);
> + }
> } while (unlikely(success == 0));
> return n;
> }
> --
> 2.53.0
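
To make the failure-writeback point above concrete, here is a minimal
standalone sketch of the two retry-loop shapes. It uses plain C11
<stdatomic.h> and the GCC/Clang __sync builtin rather than the DPDK
wrappers, and the names cas_no_writeback(), move_head_sync() and
move_head_c11() are hypothetical, not taken from lib/ring. It only
illustrates the semantic difference; it makes no claim about the
generated code or the measured regression.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: __sync-style CAS, no write-back on failure. */
static inline bool
cas_no_writeback(volatile uint32_t *dst, uint32_t expected, uint32_t desired)
{
	return __sync_bool_compare_and_swap(dst, expected, desired);
}

/* Advance *head by n; the caller reloads the expected value itself. */
static inline uint32_t
move_head_sync(volatile uint32_t *head, uint32_t n)
{
	uint32_t old_head;

	assert(head != NULL);
	do {
		old_head = *head;	/* explicit reload on every iteration */
	} while (!cas_no_writeback(head, old_head, old_head + n));

	return old_head;
}

/* Same operation with the C11 builtin: on failure the current value
 * is written back into old_head, so no separate reload is needed.
 */
static inline uint32_t
move_head_c11(_Atomic uint32_t *head, uint32_t n)
{
	uint32_t old_head;

	assert(head != NULL);
	old_head = atomic_load_explicit(head, memory_order_relaxed);
	while (!atomic_compare_exchange_weak_explicit(head, &old_head,
			old_head + n,
			memory_order_seq_cst, memory_order_relaxed))
		;	/* old_head was refreshed by the failed CAS */

	return old_head;
}
```

Both functions return the head value observed before the update. The
only difference is who refreshes the expected value after a failed
CAS: the caller in the first form, the builtin in the second.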
* RE: [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers
2026-05-23 19:56 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
@ 2026-05-25 10:49 ` Marat Khalili
0 siblings, 0 replies; 105+ messages in thread
From: Marat Khalili @ 2026-05-25 10:49 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Konstantin Ananyev, dev@dpdk.org
I absolutely welcome getting rid of macros in this file, but may I request
splitting the rte_atomicNN deprecation and the file refactoring into separate
commits? I am not convinced the refactoring was necessary for the deprecation
here; a one-line change to `rte_atomic_fetch_add_explicit((type __rte_atomic *)...)`
or the like would probably do the job.
If we do refactor, other examples should also be considered, although we can of
course do it macro by macro. But in any case we should follow some standard
template to reduce the number of ways we do things in this file, if possible.
Regardless of the process, please see some stylistic comments below.
(Technically, the code looks fine.)
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Saturday 23 May 2026 20:56
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> Marat Khalili <marat.khalili@huawei.com>
> Subject: [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers
>
> The BPF_ST_ATOMIC_REG macro token-pasted the legacy rte_atomicNN_*()
> API names. It also stacked three casts on the destination pointer
> and reached a 'return 0' out of the macro into the caller's control
> flow.
>
> Replace it with two small static-inline helpers, bpf_atomic32() and
> bpf_atomic64(), that dispatch on ins->imm internally and use the C11
> atomic intrinsics directly. The destination is cast once, to a
> properly __rte_atomic-qualified pointer. [...]
nit: Not sure the number of casts is a relevant metric, but I don't see any
improvement anyway; it is probably not worth mentioning.
// snip
> @@ -105,6 +83,69 @@
> reg[EBPF_REG_0] = op(p[0]); \
> } while (0)
>
> +/*
> + * Atomic ops on the BPF target memory.
> + *
> + * BPF atomic instructions encode the destination as base register +
> + * signed offset, with the value to combine taken from src_reg.
nit: This applies to any eBPF instruction. What is unusual about the atomic
instructions, namely that they can modify the source register, is not mentioned.
> + *
> + * Memory order: seq_cst preserves the previous behavior of
> + * rte_atomicNN_add() / rte_atomicNN_exchange() and matches what the
> + * Linux kernel BPF interpreter does for these opcodes.
Not sure we should refer in the comment to the macros we are removing from the
code; this information belongs in the commit message.
> + *
> + * Returns 0 on unsupported sub-op (validator should have rejected it),
> + * 1 otherwise.
This is an unusual convention; we typically use negative values for errors, or
booleans in some cases.
To clarify, the original `return 0` specified the return value of the
program, not an error code. For historical reasons, in the rare cases when the
VM cannot continue, the program returns zero (maybe we should reconsider it).
> + */
> +static inline int
> +bpf_atomic32(const struct rte_bpf *bpf, uint64_t reg[EBPF_REG_NUM],
> + const struct ebpf_insn *ins)
> +{
> + /* need to casts to make bpf memory suitable for C11 atomic */
Typo in "need to casts"? (Also not sure what we warn about here.)
> + uint32_t __rte_atomic *dst
> + = (uint32_t __rte_atomic *)(uintptr_t)(reg[ins->dst_reg] + ins->off);
nit: Is it `uint32_t __rte_atomic *` or `RTE_ATOMIC(uint32_t) *`? I honestly
don't know why we have both, but the latter seems more popular in the codebase.
> + uint32_t val = (uint32_t)reg[ins->src_reg];
> +
> + switch (ins->imm) {
> + case BPF_ATOMIC_ADD:
> + rte_atomic_fetch_add_explicit(dst, val, rte_memory_order_seq_cst);
> + return 1;
> + case BPF_ATOMIC_XCHG:
> + reg[ins->src_reg] = rte_atomic_exchange_explicit(dst, val,
> + rte_memory_order_seq_cst);
nit: This is not a typical style of indentation for this file.
> + return 1;
> + default:
> + RTE_BPF_LOG_LINE(ERR,
> + "%s(%p): unsupported atomic operation at pc: %#zx;",
> + __func__, bpf,
> + (uintptr_t)ins - (uintptr_t)bpf->prm.ins);
> + return 0;
This was optional defensive programming. Since other functions like
`bpf_alu_be` do not have any default label, we can arguably just remove it here
and return void (and also not accept the bpf argument). With the macro, it was
all at least transparent to the caller.
> + }
> +}
// snip the rest
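
For illustration only, the boolean-return convention mentioned above
could look roughly like the sketch below. It is written against plain
C11 <stdatomic.h>, not the DPDK or BPF headers; enum demo_atomic_op and
demo_atomic32() are hypothetical names, and the opcode values are
placeholders rather than the real BPF_ATOMIC_* encodings. As in the
patch, it assumes the target memory is suitably aligned for a 32-bit
atomic access.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder sub-ops; NOT the real BPF_ATOMIC_* values. */
enum demo_atomic_op {
	DEMO_ATOMIC_ADD,
	DEMO_ATOMIC_XCHG,
};

/*
 * Apply one atomic sub-op to a 32-bit location.
 * Returns true if the sub-op is supported, false otherwise.
 * For XCHG the previous memory value is stored back into *src,
 * mirroring how the exchange instruction updates its source register.
 */
static inline bool
demo_atomic32(void *mem, uint32_t *src, enum demo_atomic_op op)
{
	_Atomic uint32_t *dst = (_Atomic uint32_t *)mem;

	switch (op) {
	case DEMO_ATOMIC_ADD:
		atomic_fetch_add_explicit(dst, *src, memory_order_seq_cst);
		return true;
	case DEMO_ATOMIC_XCHG:
		*src = atomic_exchange_explicit(dst, *src,
				memory_order_seq_cst);
		return true;
	}

	return false;	/* unsupported sub-op */
}
```

A caller would then decide what to do on false, for example stop the
program and return zero, which keeps that policy visible at the call
site instead of hiding it inside the helper.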
* Re: [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-25 7:41 ` Konstantin Ananyev
@ 2026-05-25 14:31 ` Stephen Hemminger
2026-05-25 15:35 ` Stephen Hemminger
1 sibling, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-25 14:31 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev@dpdk.org
On Mon, 25 May 2026 07:41:13 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> In fact, in rte_ring we do have 2 implementations of the same core functions:
> lib/ring/rte_ring_c11_pvt.h - uses C11 atomics
> lib/ring/rte_ring_generic_pvt.h - uses legacy instructions (smp_mb, etc.).
> If we are going to remove these legacy instructions anyway (or reimplement them using C11 atomics),
> then there is probably no point in keeping rte_ring_generic_pvt.h.
> Konstantin
Good point, will try heading that way.
* Re: [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-25 7:41 ` Konstantin Ananyev
2026-05-25 14:31 ` Stephen Hemminger
@ 2026-05-25 15:35 ` Stephen Hemminger
2026-05-25 15:47 ` Morten Brørup
1 sibling, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-25 15:35 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev@dpdk.org
On Mon, 25 May 2026 07:41:13 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> Hi Stephen,
>
> > The rte_atomic32_cmpset is deprecated. Initial attempts at
> > changing this with direct conversion to
> > rte_atomic_compare_exchange_weak_explicit()
> > regressed MP/MC contended performance on x86 by 10-30%,
> > because the C11 builtin's failure-writeback semantic forces
> > GCC to emit extra instructions on the CAS critical path.
> >
> > Add an internal __rte_ring_compare_and_swap() wrapper that calls
> > __sync_bool_compare_and_swap() directly, which keeps the original
> > instruction sequence. Add equivalent function for MSVC.
>
> In fact, in rte_ring we do have 2 implementations of the same core functions:
> lib/ring/rte_ring_c11_pvt.h - uses C11 atomics
> lib/ring/rte_ring_generic_pvt.h - uses legacy instructions (smp_mb, etc.).
> If we are going to remove these legacy instructions anyway (or reimplement them using C11 atomics),
> then there is probably no point in keeping rte_ring_generic_pvt.h.
> Konstantin
Have been deep diving into why C11 atomics give a 20-30% performance
drop versus the atomic32 version. So far it comes down to the GCC
optimizer not doing as well with C11 as with assembly. The C11 form,
with the excessive use of always_inline, consumes more registers.
* RE: [PATCH v3 03/27] ring: use compare-and-swap wrapper
2026-05-25 15:35 ` Stephen Hemminger
@ 2026-05-25 15:47 ` Morten Brørup
0 siblings, 0 replies; 105+ messages in thread
From: Morten Brørup @ 2026-05-25 15:47 UTC (permalink / raw)
To: Stephen Hemminger, Konstantin Ananyev; +Cc: dev
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Monday, 25 May 2026 17.35
>
> On Mon, 25 May 2026 07:41:13 +0000
> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>
> > Hi Stephen,
> >
> > > The rte_atomic32_cmpset is deprecated. Initial attempts at
> > > changing this with direct conversion to
> > > rte_atomic_compare_exchange_weak_explicit()
> > > regressed MP/MC contended performance on x86 by 10-30%,
> > > because the C11 builtin's failure-writeback semantic forces
> > > GCC to emit extra instructions on the CAS critical path.
> > >
> > > Add an internal __rte_ring_compare_and_swap() wrapper that calls
> > > __sync_bool_compare_and_swap() directly, which keeps the original
> > > instruction sequence. Add equivalent function for MSVC.
> >
> > In fact, in rte_ring we do have 2 implementations of the same core functions:
> > lib/ring/rte_ring_c11_pvt.h - uses C11 atomics
> > lib/ring/rte_ring_generic_pvt.h - uses legacy instructions (smp_mb, etc.).
> > If we are going to remove these legacy instructions anyway (or reimplement them using C11 atomics),
> > then there is probably no point in keeping rte_ring_generic_pvt.h.
> > Konstantin
>
> Have been deep diving into why C11 atomics give a 20-30% performance
> drop versus the atomic32 version. So far it comes down to the GCC
> optimizer not doing as well with C11 as with assembly. The C11 form,
> with the excessive use of always_inline, consumes more registers.
Just an idea:
Perhaps adding "const" and/or "restrict" to relevant parameters will give the optimizer the information it needs?
* [PATCH v4 00/27] deprecate rte_atomicNN family
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
` (9 preceding siblings ...)
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
` (26 more replies)
10 siblings, 27 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomicNN_* family was flagged for deprecation in 2021 by
commit 3ec965b6de12 ("doc: update atomic operation deprecation")
but enforcement never landed and in-tree usage continued to grow.
This series finishes converting every remaining in-tree caller to
the C11-style rte_atomic_*_explicit() / RTE_ATOMIC() API, then
marks the legacy functions __rte_deprecated so future in-tree and
out-of-tree uses are caught at compile time.
Performance: ran the DPDK perf-tests suite (mempool, hash, stack,
ring, distributor, rcu_qsbr, etc.) on the full series; only
lib/ring showed a regression, addressed by the wrapper in patch 03.
Patch organisation
==================
01-02 EAL: drop the inline-asm fallback paths now that intrinsics
work on all platforms; reimplement rte_smp_*mb on top of
rte_atomic_thread_fence.
03-04 lib/ring and lib/bpf -- the last legacy callers in lib/.
05-25 Drivers and selftests, one patch per directory.
26 Suppress deprecation warnings in app/test/test_atomic.c,
which exercises the legacy API until it goes away.
27 Mark rte_atomicNN_* with __rte_deprecated and drop the
corresponding checkpatch grep; new uses are now caught
at compile time.
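
As a rough sketch of the mechanism behind patches 26 and 27: a function
carrying the deprecated attribute produces a warning at each call site,
and code that must keep calling it (such as a test of the legacy API)
can silence that warning locally. The example below uses the raw
GCC/Clang attribute and pragmas; demo_deprecated, legacy_add_return()
and test_legacy() are hypothetical names, and the DPDK macros that wrap
these constructs may be spelled differently.

```c
#include <stdint.h>

/* Stand-in for a deprecation marker macro (assumed, GCC/Clang only). */
#define demo_deprecated __attribute__((deprecated))

/* A legacy-style helper: any new caller gets a compile-time warning. */
static demo_deprecated inline int32_t
legacy_add_return(volatile int32_t *v, int32_t inc)
{
	return __sync_add_and_fetch(v, inc);
}

/* A test that still needs the legacy call suppresses the warning
 * only around its own code, then restores the previous settings.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
static inline int32_t
test_legacy(volatile int32_t *v)
{
	return legacy_add_return(v, 1);
}
#pragma GCC diagnostic pop
```

Outside the push/pop region, a call to legacy_add_return() would be
reported by -Wdeprecated-declarations, and would fail the build when
warnings are treated as errors.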
Changes since v3
================
- lib/ring: keep the existing C11 element-access code; the
earlier rewrite regressed ring_perf 20-30% on x86 with GCC's
handling of atomic_compare_exchange_weak_explicit(). v4
keeps the original structure and adds a wrapper for the one
performance-sensitive CAS.
- lib/bpf: keep the BPF_ST_ATOMIC_REG macro structure rather
than open-coding the converted callers; the macro body is
rewritten to use stdatomic.
- Compilation fixes across the driver conversions caught
during review (CAS expected-value types, format-string
specifiers, dpaax HWDEBUG path).
Targeting 26.11 rather than the next release. The driver
conversions touch many maintainers' code and several are likely
to need cycles of review/respin; a longer review window avoids
rushing contested orderings into an earlier release.
Feedback wanted
===============
- vmbus producer commit-order pattern (patch 17)
- the ring CAS GCC bug workaround, which might also be needed for
similar uses of ring buffers in vmbus and netvsc
- Dekker-style seq_cst handshake in net/vhost (patch 24),
which also closes a pre-existing ordering hole on
weakly-ordered ISAs
- netvsc rndis_pending claim/timeout/clear cmpxchg orderings
(patch 15)
Stephen Hemminger (27):
eal: use intrinsics for rte_atomic on all platforms
eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
ring: unify memory model on C11, remove atomic32
bpf: use C11 atomics in BPF_ST_ATOMIC_REG
net/bonding: use stdatomic
net/nbl: remove unused rte_atomic16 field
net/ena: replace use of rte_atomicNN
net/failsafe: convert to stdatomic
net/enic: do not use deprecated rte_atomic64
net/pfe: use ethdev linkstatus helpers
net/sfc: replace rte_atomic with stdatomic
crypto/ccp: replace use of rte_atomic64 with stdatomic
bus/dpaa: replace rte_atomic16 with stdatomic
drivers: replace rte_atomic16 with stdatomic
net/netvsc: replace rte_atomic32 with stdatomic
event/sw: convert from rte_atomic32 to stdatomic
bus/vmbus: convert from rte_atomic to stdatomic
common/dpaax: use stdatomic instead of rte_atomic
net/bnx2x: convert from rte_atomic32 to stdatomic
bus/fslmc: replace rte_atomic32 with stdatomic
drivers/event: replace rte_atomic32 in selftests
net/hinic: replace rte_atomic32 with stdatomic
net/txgbe: replace rte_atomic32 with stdatomic
net/vhost: use stdatomic instead of rte_atomic32
vdpa/ifc: replace rte_atomic32 with stdatomic
test/atomic: suppress deprecation warnings for legacy APIs
eal: mark rte_atomicNN as deprecated
app/test/test_atomic.c | 12 +
devtools/checkpatches.sh | 16 -
doc/guides/rel_notes/deprecation.rst | 12 +-
doc/guides/rel_notes/release_26_07.rst | 4 +
drivers/bus/dpaa/base/qbman/qman.c | 9 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 10 +-
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 12 +-
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 +-
drivers/bus/fslmc/qbman/include/compat.h | 21 +-
drivers/bus/vmbus/private.h | 2 +-
drivers/bus/vmbus/vmbus_bufring.c | 39 ++-
drivers/common/dpaax/compat.h | 21 +-
drivers/crypto/ccp/ccp_crypto.c | 11 +-
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 +-
drivers/crypto/ccp/ccp_dev.h | 4 +-
drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 26 +-
drivers/event/dpaa2/dpaa2_hw_dpcon.c | 11 +-
drivers/event/octeontx/ssovf_evdev_selftest.c | 61 ++--
drivers/event/sw/sw_evdev.c | 8 +-
drivers/event/sw/sw_evdev.h | 4 +-
drivers/event/sw/sw_evdev_worker.c | 16 +-
drivers/net/bnx2x/bnx2x.c | 6 +-
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/ecore_sp.c | 6 +-
drivers/net/bonding/eth_bond_8023ad_private.h | 6 +-
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 +-
drivers/net/ena/base/ena_plat_dpdk.h | 14 +-
drivers/net/ena/ena_ethdev.c | 21 +-
drivers/net/ena/ena_ethdev.h | 7 +-
drivers/net/enic/enic.h | 6 +-
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +-
drivers/net/enic/enic_rxtx.c | 14 +-
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 +-
drivers/net/failsafe/failsafe_ops.c | 12 +-
drivers/net/failsafe/failsafe_private.h | 29 +-
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
drivers/net/hinic/base/hinic_compat.h | 2 +-
drivers/net/hinic/base/hinic_pmd_hwdev.c | 24 +-
drivers/net/hinic/base/hinic_pmd_hwdev.h | 4 +-
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
drivers/net/netvsc/hn_rndis.c | 28 +-
drivers/net/netvsc/hn_rxtx.c | 12 +-
drivers/net/netvsc/hn_var.h | 6 +-
drivers/net/pfe/pfe_ethdev.c | 32 +-
drivers/net/sfc/sfc.c | 9 +-
drivers/net/sfc/sfc.h | 4 +-
drivers/net/sfc/sfc_port.c | 7 +-
drivers/net/sfc/sfc_stats.h | 2 +-
drivers/net/txgbe/base/txgbe_mng.c | 4 +-
drivers/net/txgbe/base/txgbe_type.h | 2 +-
drivers/net/vhost/rte_eth_vhost.c | 103 +++---
drivers/vdpa/ifc/ifcvf_vdpa.c | 37 +--
lib/bpf/bpf_exec.c | 13 +-
lib/eal/arm/include/rte_atomic_32.h | 10 -
lib/eal/arm/include/rte_atomic_64.h | 10 -
lib/eal/include/generic/rte_atomic.h | 306 +++++-------------
lib/eal/include/rte_common.h | 2 +
lib/eal/loongarch/include/rte_atomic.h | 10 -
lib/eal/ppc/include/rte_atomic.h | 179 ----------
lib/eal/riscv/include/rte_atomic.h | 10 -
lib/eal/x86/include/rte_atomic.h | 205 +-----------
lib/eal/x86/include/rte_atomic_32.h | 188 -----------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------
lib/ring/meson.build | 2 +-
lib/ring/rte_ring_c11_pvt.h | 75 ++---
lib/ring/rte_ring_elem_pvt.h | 125 ++++++-
..._ring_generic_pvt.h => rte_ring_x86_pvt.h} | 61 +---
lib/ring/soring.c | 15 +-
71 files changed, 667 insertions(+), 1489 deletions(-)
rename lib/ring/{rte_ring_generic_pvt.h => rte_ring_x86_pvt.h} (60%)
--
2.53.0
* [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-06-01 18:23 ` Konstantin Ananyev
2026-05-26 23:23 ` [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
` (25 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The next step is to deprecate the rte_atomicNN_*() family. Rather than
maintaining both the inline asm and intrinsic fallbacks, drop the
asm paths and use intrinsics everywhere.
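
For readers unfamiliar with the intrinsic form, the sketch below shows
the general shape of an increment-and-test helper written without any
inline asm. It uses plain C11 <stdatomic.h> rather than the
rte_atomic_*_explicit() wrappers, and demo_inc_and_test() is a
hypothetical name, not a function from this patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically increment *cnt and report whether the new value is zero.
 * atomic_fetch_add returns the value held before the addition, so the
 * new value is that result plus one.
 */
static inline int
demo_inc_and_test(_Atomic int32_t *cnt)
{
	return atomic_fetch_add_explicit(cnt, 1,
			memory_order_seq_cst) + 1 == 0;
}
```

The same source compiles on any architecture the compiler supports,
which is the property that makes per-architecture asm variants
unnecessary.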
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/arm/include/rte_atomic_32.h | 4 -
lib/eal/arm/include/rte_atomic_64.h | 4 -
lib/eal/include/generic/rte_atomic.h | 76 +---------
lib/eal/loongarch/include/rte_atomic.h | 4 -
lib/eal/ppc/include/rte_atomic.h | 173 -----------------------
lib/eal/riscv/include/rte_atomic.h | 4 -
lib/eal/x86/include/rte_atomic.h | 172 ----------------------
lib/eal/x86/include/rte_atomic_32.h | 188 -------------------------
lib/eal/x86/include/rte_atomic_64.h | 157 ---------------------
9 files changed, 6 insertions(+), 776 deletions(-)
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 0b9a0dfa30..696a539fef 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -5,10 +5,6 @@
#ifndef _RTE_ATOMIC_ARM32_H_
#define _RTE_ATOMIC_ARM32_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#ifdef __cplusplus
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 181bb60929..9f790238df 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -6,10 +6,6 @@
#ifndef _RTE_ATOMIC_ARM64_H_
#define _RTE_ATOMIC_ARM64_H_
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include "generic/rte_atomic.h"
#include <rte_branch_prediction.h>
#include <rte_debug.h>
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 0a4f3f8528..292e52fade 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -187,13 +187,11 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -211,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* The original value at that location
*/
static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -312,13 +306,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
static inline void
rte_atomic16_inc(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -329,13 +321,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
static inline void
rte_atomic16_dec(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
}
-#endif
/**
* Atomically add a 16-bit value to a counter and return the result.
@@ -391,13 +381,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
*/
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 16-bit counter by one and test.
@@ -412,13 +400,11 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 16-bit atomic counter.
@@ -433,12 +419,10 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
*/
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 16-bit counter to 0.
@@ -472,13 +456,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -496,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* The original value at that location
*/
static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -597,13 +575,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
static inline void
rte_atomic32_inc(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
}
-#endif
/**
* Atomically decrement a counter by one.
@@ -614,13 +590,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
static inline void
rte_atomic32_dec(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
}
-#endif
/**
* Atomically add a 32-bit value to a counter and return the result.
@@ -676,13 +650,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
*/
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
}
-#endif
/**
* Atomically decrement a 32-bit counter by one and test.
@@ -697,13 +669,11 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
}
-#endif
/**
* Atomically test and set a 32-bit atomic counter.
@@ -718,12 +688,10 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
*/
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 32-bit counter to 0.
@@ -756,13 +724,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-#ifdef RTE_FORCE_INTRINSICS
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
}
-#endif
/**
* Atomic exchange.
@@ -780,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* The original value at that location
*/
static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
-
-#ifdef RTE_FORCE_INTRINSICS
-static inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
- return rte_atomic_exchange_explicit(dst, val, rte_memory_order_seq_cst);
+ return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
+ val, rte_memory_order_seq_cst);
}
-#endif
/**
* The atomic counter structure.
@@ -811,7 +773,6 @@ typedef struct {
static inline void
rte_atomic64_init(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -828,7 +789,6 @@ rte_atomic64_init(rte_atomic64_t *v)
}
#endif
}
-#endif
/**
* Atomically read a 64-bit counter.
@@ -841,7 +801,6 @@ rte_atomic64_init(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -860,7 +819,6 @@ rte_atomic64_read(rte_atomic64_t *v)
return tmp;
#endif
}
-#endif
/**
* Atomically set a 64-bit counter.
@@ -873,7 +831,6 @@ rte_atomic64_read(rte_atomic64_t *v)
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -890,7 +847,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
}
#endif
}
-#endif
/**
* Atomically add a 64-bit value to a counter.
@@ -903,14 +859,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically subtract a 64-bit value from a counter.
@@ -923,14 +877,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst);
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -941,13 +893,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
static inline void
rte_atomic64_inc(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -958,13 +908,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
static inline void
rte_atomic64_dec(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
}
-#endif
/**
* Add a 64-bit value to an atomic counter and return the result.
@@ -982,14 +930,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
rte_memory_order_seq_cst) + inc;
}
-#endif
/**
* Subtract a 64-bit value from an atomic counter and return the result.
@@ -1007,14 +953,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-#ifdef RTE_FORCE_INTRINSICS
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
rte_memory_order_seq_cst) - dec;
}
-#endif
/**
* Atomically increment a 64-bit counter by one and test.
@@ -1029,12 +973,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
*/
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
-#endif
/**
* Atomically decrement a 64-bit counter by one and test.
@@ -1049,12 +991,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
-#endif
/**
* Atomically test and set a 64-bit atomic counter.
@@ -1069,12 +1009,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
*/
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
-#endif
/**
* Atomically set a 64-bit counter to 0.
@@ -1084,12 +1022,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
*/
static inline void rte_atomic64_clear(rte_atomic64_t *v);
-#ifdef RTE_FORCE_INTRINSICS
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
-#endif
#endif
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index c8066a4612..785a452c9e 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -5,10 +5,6 @@
#ifndef RTE_ATOMIC_LOONGARCH_H
#define RTE_ATOMIC_LOONGARCH_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <rte_common.h>
#include "generic/rte_atomic.h"
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 10acc238f9..64f4c3d670 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -43,179 +43,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
}
/*------------------------- 16 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
-}
-
-/*------------------------- 64 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
- rte_memory_order_acquire) ? 1 : 0;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
-}
-
-#endif
#ifdef __cplusplus
}
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 66346ad474..061b175f33 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -8,10 +8,6 @@
#ifndef RTE_ATOMIC_RISCV_H
#define RTE_ATOMIC_RISCV_H
-#ifndef RTE_FORCE_INTRINSICS
-# error Platform must be built with RTE_FORCE_INTRINSICS
-#endif
-
#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index e071e4234e..4f05302c9f 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -111,178 +111,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
extern "C" {
#endif
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgw %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint16_t
-rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgw %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
-{
- return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "incw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v)
-{
- asm volatile(
- MPLOCKED
- "decw %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decw %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-/*------------------------- 32 bit atomic operations -------------------------*/
-
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
-{
- uint8_t res;
-
- asm volatile(
- MPLOCKED
- "cmpxchgl %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
- return res;
-}
-
-static inline uint32_t
-rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgl %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
-{
- return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
-}
-
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "incl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v)
-{
- asm volatile(
- MPLOCKED
- "decl %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
-{
- uint8_t ret;
-
- asm volatile(MPLOCKED
- "decl %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-#endif /* !RTE_FORCE_INTRINSICS */
#ifdef __cplusplus
}
diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
index 0f25863aa5..37d139f30d 100644
--- a/lib/eal/x86/include/rte_atomic_32.h
+++ b/lib/eal/x86/include/rte_atomic_32.h
@@ -20,193 +20,5 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
- union {
- struct {
- uint32_t l32;
- uint32_t h32;
- };
- uint64_t u64;
- } _exp, _src;
-
- _exp.u64 = exp;
- _src.u64 = src;
-
-#ifndef __PIC__
- asm volatile (
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "b" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#else
- asm volatile (
- "xchgl %%ebx, %%edi;\n"
- MPLOCKED
- "cmpxchg8b (%[dst]);"
- "setz %[res];"
- "xchgl %%ebx, %%edi;\n"
- : [res] "=a" (res) /* result in eax */
- : [dst] "S" (dst), /* esi */
- "D" (_src.l32), /* ebx */
- "c" (_src.h32), /* ecx */
- "a" (_exp.l32), /* eax */
- "d" (_exp.h32) /* edx */
- : "memory" ); /* no-clobber list */
-#endif
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
-{
- uint64_t old;
-
- do {
- old = *dest;
- } while (rte_atomic64_cmpset(dest, old, val) == 0);
-
- return old;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, 0);
- }
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- /* replace the value by itself */
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp);
- }
- return tmp;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, new_value);
- }
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- rte_atomic64_add(v, 1);
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- rte_atomic64_sub(v, 1);
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp + inc);
- }
-
- return tmp + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- int success = 0;
- uint64_t tmp;
-
- while (success == 0) {
- tmp = v->cnt;
- success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
- tmp, tmp - dec);
- }
-
- return tmp - dec;
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_add_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- return rte_atomic64_sub_return(v, 1) == 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- rte_atomic64_set(v, 0);
-}
-#endif
#endif /* _RTE_ATOMIC_I686_H_ */
diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
index 0a7a2131e0..1cd12695a2 100644
--- a/lib/eal/x86/include/rte_atomic_64.h
+++ b/lib/eal/x86/include/rte_atomic_64.h
@@ -22,163 +22,6 @@
/*------------------------- 64 bit atomic operations -------------------------*/
-#ifndef RTE_FORCE_INTRINSICS
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
-{
- uint8_t res;
-
-
- asm volatile(
- MPLOCKED
- "cmpxchgq %[src], %[dst];"
- "sete %[res];"
- : [res] "=a" (res), /* output */
- [dst] "=m" (*dst)
- : [src] "r" (src), /* input */
- "a" (exp),
- "m" (*dst)
- : "memory"); /* no-clobber list */
-
- return res;
-}
-
-static inline uint64_t
-rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
-{
- asm volatile(
- MPLOCKED
- "xchgq %0, %1;"
- : "=r" (val), "=m" (*dst)
- : "0" (val), "m" (*dst)
- : "memory"); /* no-clobber list */
- return val;
-}
-
-static inline void
-rte_atomic64_init(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v)
-{
- return v->cnt;
-}
-
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
-{
- v->cnt = new_value;
-}
-
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
-{
- asm volatile(
- MPLOCKED
- "addq %[inc], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [inc] "ir" (inc), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
-{
- asm volatile(
- MPLOCKED
- "subq %[dec], %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : [dec] "ir" (dec), /* input */
- "m" (v->cnt)
- );
-}
-
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "incq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v)
-{
- asm volatile(
- MPLOCKED
- "decq %[cnt]"
- : [cnt] "=m" (v->cnt) /* output */
- : "m" (v->cnt) /* input */
- );
-}
-
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
-{
- int64_t prev = inc;
-
- asm volatile(
- MPLOCKED
- "xaddq %[prev], %[cnt]"
- : [prev] "+r" (prev), /* output */
- [cnt] "=m" (v->cnt)
- : "m" (v->cnt) /* input */
- );
- return prev + inc;
-}
-
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
-{
- return rte_atomic64_add_return(v, -dec);
-}
-
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "incq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
-
- return ret != 0;
-}
-
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
-{
- uint8_t ret;
-
- asm volatile(
- MPLOCKED
- "decq %[cnt] ; "
- "sete %[ret]"
- : [cnt] "+m" (v->cnt), /* output */
- [ret] "=qm" (ret)
- );
- return ret != 0;
-}
-
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
-{
- return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
-}
-
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
-{
- v->cnt = 0;
-}
-#endif
/*------------------------ 128 bit atomic operations -------------------------*/
--
2.53.0
* [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-06-01 18:24 ` Konstantin Ananyev
2026-05-26 23:23 ` [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32 Stephen Hemminger
` (24 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Thomas Monjalon, Wathsala Vithanage, Bibo Mao,
David Christensen, Sun Yuechi, Bruce Richardson,
Konstantin Ananyev
The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
operation deprecation") in 2021 but nothing came of it.
Reimplement them as inline wrappers over rte_atomic_thread_fence()
and drop the deprecation notice.
The API is preserved; only the implementation changes.
The wrappers provide stronger guarantees than the previous code
because C11 has no equivalent of the old rte_smp_wmb() and
rte_smp_rmb() barriers.
Generated code is unchanged on x86. On arm64, the release and
acquire fences emit dmb ish instead of dmb ishst/ishld;
the difference is below measurement noise.
Drop the checkpatch restriction on rte_smp_*mb() since these
functions are no longer on a deprecation cycle.
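For illustration only, the pairing that the rte_smp_wmb()/rte_smp_rmb()
wrappers are meant to support can be sketched in plain C11. This is not
DPDK code: smp_mb(), smp_wmb(), smp_rmb(), struct mailbox and the
mailbox_*() helpers are hypothetical names, and atomic_thread_fence()
from <stdatomic.h> stands in for rte_atomic_thread_fence().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the wrappers, written against plain C11. */
static inline void smp_mb(void)  { atomic_thread_fence(memory_order_seq_cst); }
static inline void smp_wmb(void) { atomic_thread_fence(memory_order_release); }
static inline void smp_rmb(void) { atomic_thread_fence(memory_order_acquire); }

/* Hypothetical single-slot mailbox shared between two threads. */
struct mailbox {
	uint32_t payload;        /* plain data, published by the writer */
	_Atomic uint32_t ready;  /* flag: non-zero once payload is valid */
};

/* Writer side: store the payload, then the flag, with a release fence between. */
static inline void
mailbox_publish(struct mailbox *m, uint32_t value)
{
	m->payload = value;
	smp_wmb();
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out if a payload has been published. */
static inline int
mailbox_consume(struct mailbox *m, uint32_t *out)
{
	if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
		return 0;
	smp_rmb();
	*out = m->payload;
	return 1;
}
```

The release fence orders the payload store before the flag store, and
the acquire fence orders the flag load before the payload read. That is
the ordering a write/read barrier pair is typically used for.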
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
devtools/checkpatches.sh | 8 --
doc/guides/rel_notes/deprecation.rst | 8 --
lib/eal/arm/include/rte_atomic_32.h | 6 --
lib/eal/arm/include/rte_atomic_64.h | 6 --
lib/eal/include/generic/rte_atomic.h | 130 +++++--------------------
lib/eal/loongarch/include/rte_atomic.h | 6 --
lib/eal/ppc/include/rte_atomic.h | 6 --
lib/eal/riscv/include/rte_atomic.h | 6 --
lib/eal/x86/include/rte_atomic.h | 33 +++----
9 files changed, 37 insertions(+), 172 deletions(-)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index f5dd77443f..81bb0fe4e8 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -121,14 +121,6 @@ check_forbidden_additions() { # <patch>
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
- # refrain from new additions of rte_smp_[r/w]mb()
- awk -v FOLDERS="lib drivers app examples" \
- -v EXPRESSIONS="rte_smp_(r|w)?mb\\\(" \
- -v RET_ON_FAIL=1 \
- -v MESSAGE='Using rte_smp_[r/w]mb' \
- -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
- "$1" || res=1
-
# refrain from using compiler __sync_xxx builtins
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="__sync_.*\\\(" \
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 35c9b4e06c..2190419f79 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -47,14 +47,6 @@ Deprecation Notices
operations must be used for patches that need to be merged in 20.08 onwards.
This change will not introduce any performance degradation.
-* rte_smp_*mb: These APIs provide full barrier functionality. However, many
- use cases do not require full barriers. To support such use cases, DPDK has
- adopted atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations and a new wrapper ``rte_atomic_thread_fence`` instead of
- ``__atomic_thread_fence`` must be used for patches that need to be merged in
- 20.08 onwards. This change will not introduce any performance degradation.
-
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
used by iterators, and arrays holding these values are sized with this
diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
index 696a539fef..4115271091 100644
--- a/lib/eal/arm/include/rte_atomic_32.h
+++ b/lib/eal/arm/include/rte_atomic_32.h
@@ -17,12 +17,6 @@ extern "C" {
#define rte_rmb() __sync_synchronize()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
index 9f790238df..604e777bcd 100644
--- a/lib/eal/arm/include/rte_atomic_64.h
+++ b/lib/eal/arm/include/rte_atomic_64.h
@@ -20,12 +20,6 @@ extern "C" {
#define rte_rmb() asm volatile("dmb oshld" : : : "memory")
-#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
-
-#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
-
-#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 292e52fade..1b04b43cbb 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -59,55 +59,25 @@ static inline void rte_rmb(void);
*
* Guarantees that the LOAD and STORE operations that precede the
* rte_smp_mb() call are globally visible across the lcores
- * before the LOAD and STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
+ * before the LOAD and STORE operations that follow it.
*/
static inline void rte_smp_mb(void);
/**
* Write memory barrier between lcores
*
- * Guarantees that the STORE operations that precede the
- * rte_smp_wmb() call are globally visible across the lcores
- * before the STORE operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that the LOAD and STORE operations that precede the
+ * rte_smp_wmb() call are globally visible across the lcores before
+ * any STORE operations that follow it.
*/
static inline void rte_smp_wmb(void);
/**
* Read memory barrier between lcores
*
- * Guarantees that the LOAD operations that precede the
- * rte_smp_rmb() call are globally visible across the lcores
- * before the LOAD operations that follows it.
- *
- * @note
- * This function is deprecated.
- * It provides similar synchronization primitive as atomic fence,
- * but has different syntax and memory ordering semantic. Hence
- * deprecated for the simplicity of memory ordering semantics in use.
- *
- * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
- * The fence also guarantees LOAD operations that precede the call
- * are globally visible across the lcores before the STORE operations
- * that follows it.
+ * Guarantees that any LOAD operations that precede the rte_smp_rmb()
+ * call complete before LOAD and STORE operations that follow it
+ * become globally visible.
*/
static inline void rte_smp_rmb(void);
///@}
@@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
*/
static inline void rte_atomic_thread_fence(rte_memory_order memorder);
+static __rte_always_inline void
+rte_smp_mb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_seq_cst);
+}
+
+static __rte_always_inline void
+rte_smp_wmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_release);
+}
+
+static __rte_always_inline void
+rte_smp_rmb(void)
+{
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+}
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -184,9 +172,6 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
-
static inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
@@ -303,9 +288,6 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_inc(rte_atomic16_t *v);
-
static inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
@@ -318,9 +300,6 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic16_dec(rte_atomic16_t *v);
-
static inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
@@ -379,8 +358,6 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -398,8 +375,6 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
-
static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
@@ -417,8 +392,6 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
-
static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
@@ -453,9 +426,6 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
-
static inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
@@ -572,9 +542,6 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_inc(rte_atomic32_t *v);
-
static inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
@@ -587,9 +554,6 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic32_dec(rte_atomic32_t *v);
-
static inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
@@ -648,8 +612,6 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -667,8 +629,6 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
-
static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
@@ -686,8 +646,6 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
-
static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
@@ -721,9 +679,6 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
-rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
-
static inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
@@ -770,9 +725,6 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_init(rte_atomic64_t *v);
-
static inline void
rte_atomic64_init(rte_atomic64_t *v)
{
@@ -798,9 +750,6 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
-rte_atomic64_read(rte_atomic64_t *v);
-
static inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
@@ -828,9 +777,6 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
-rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
-
static inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
@@ -856,9 +802,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
-rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
-
static inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
@@ -874,9 +817,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
-rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
-
static inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
@@ -890,9 +830,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_inc(rte_atomic64_t *v);
-
static inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
@@ -905,9 +842,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
-rte_atomic64_dec(rte_atomic64_t *v);
-
static inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
@@ -927,9 +861,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
-rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
-
static inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
@@ -950,9 +881,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
-rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
-
static inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
@@ -971,8 +899,6 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
@@ -989,8 +915,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
-
static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
@@ -1007,8 +931,6 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
-
static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
@@ -1020,8 +942,6 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v);
-
static inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
index 785a452c9e..a789e3ab4d 100644
--- a/lib/eal/loongarch/include/rte_atomic.h
+++ b/lib/eal/loongarch/include/rte_atomic.h
@@ -18,12 +18,6 @@ extern "C" {
#define rte_rmb() rte_mb()
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_mb()
-
-#define rte_smp_rmb() rte_mb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_mb()
diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
index 64f4c3d670..0e64db2a35 100644
--- a/lib/eal/ppc/include/rte_atomic.h
+++ b/lib/eal/ppc/include/rte_atomic.h
@@ -24,12 +24,6 @@ extern "C" {
#define rte_rmb() asm volatile("sync" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_wmb()
diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
index 061b175f33..04c40e4e9b 100644
--- a/lib/eal/riscv/include/rte_atomic.h
+++ b/lib/eal/riscv/include/rte_atomic.h
@@ -23,12 +23,6 @@ extern "C" {
#define rte_rmb() asm volatile("fence r, r" : : : "memory")
-#define rte_smp_mb() rte_mb()
-
-#define rte_smp_wmb() rte_wmb()
-
-#define rte_smp_rmb() rte_rmb()
-
#define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
#define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index 4f05302c9f..f4d39ce4fe 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -23,10 +23,6 @@
#define rte_rmb() _mm_lfence()
-#define rte_smp_wmb() rte_compiler_barrier()
-
-#define rte_smp_rmb() rte_compiler_barrier()
-
#ifdef __cplusplus
extern "C" {
#endif
@@ -63,20 +59,6 @@ extern "C" {
* So below we use that technique for rte_smp_mb() implementation.
*/
-static __rte_always_inline void
-rte_smp_mb(void)
-{
-#ifdef RTE_TOOLCHAIN_MSVC
- _mm_mfence();
-#else
-#ifdef RTE_ARCH_I686
- asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
-#else
- asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
-#endif
-#endif
-}
-
#define rte_io_mb() rte_mb()
#define rte_io_wmb() rte_compiler_barrier()
@@ -93,10 +75,19 @@ rte_smp_mb(void)
static __rte_always_inline void
rte_atomic_thread_fence(rte_memory_order memorder)
{
- if (memorder == rte_memory_order_seq_cst)
- rte_smp_mb();
- else
+ if (memorder == rte_memory_order_seq_cst) {
+#ifdef RTE_TOOLCHAIN_MSVC
+ _mm_mfence();
+#else
+#ifdef RTE_ARCH_I686
+ asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
+#else
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+#endif
+#endif
+ } else {
__rte_atomic_thread_fence(memorder);
+ }
}
#ifdef __cplusplus
--
2.53.0
* [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-06-01 18:18 ` Konstantin Ananyev
2026-06-01 22:07 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG Stephen Hemminger
` (23 subsequent siblings)
26 siblings, 2 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Wathsala Vithanage
Remove the RTE_USE_C11_MEM_MODEL build switch; C11 atomics are now
the default for all platforms. Unifies __rte_ring_update_tail into
the C11 form (atomic_store_release replaces the older rte_smp_wmb +
plain store on the generic path) and renames rte_ring_generic_pvt.h
to rte_ring_x86_pvt.h to reflect its new scope.
Also splits the head-move helper into separate ST and MT variants,
removing the runtime is_st branch from the MT retry loop.
This gives a small performance boost and scopes the following
exception more tightly.
Exception: on x86 with GCC, atomic_compare_exchange on the head CAS
regresses MP/MC contended throughput by ~20% compared with the existing
hand-written cmpxchg. As a workaround, GCC-on-x86 builds use the older
__sync_bool_compare_and_swap builtin, which generates code equivalent
to the original asm. This can be reverted if/when GCC is fixed;
a similar issue was observed in the Linux kernel.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
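Note for reviewers: the ST/MT split described above can be sketched in
plain C11 as follows. This is only an illustration, not the code in the
patch: struct headtail, move_head_st() and move_head_mt() are
hypothetical, simplified stand-ins for the rte_ring helpers, the "fixed"
flag stands in for RTE_RING_QUEUE_FIXED, and the relaxed store in the ST
path is an assumption standing in for the plain store used by the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_ring_headtail. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/* Single-thread variant: no retry loop and no compare-and-swap. */
static inline unsigned int
move_head_st(struct headtail *d, struct headtail *s, uint32_t capacity,
	     unsigned int n, int fixed, uint32_t *old_head, uint32_t *new_head)
{
	*old_head = atomic_load_explicit(&d->head, memory_order_acquire);
	uint32_t stail = atomic_load_explicit(&s->tail, memory_order_acquire);
	/* Unsigned arithmetic wraps modulo 2^32, so this stays in range. */
	uint32_t entries = capacity + stail - *old_head;

	if (n > entries)
		n = fixed ? 0 : entries;
	if (n > 0) {
		*new_head = *old_head + n;
		atomic_store_explicit(&d->head, *new_head, memory_order_relaxed);
	}
	return n;
}

/* Multi-thread variant: retry until the CAS on head succeeds. */
static inline unsigned int
move_head_mt(struct headtail *d, struct headtail *s, uint32_t capacity,
	     unsigned int n, int fixed, uint32_t *old_head, uint32_t *new_head)
{
	const unsigned int max = n;

	*old_head = atomic_load_explicit(&d->head, memory_order_acquire);
	do {
		n = max; /* reset to the requested burst on every retry */
		uint32_t stail = atomic_load_explicit(&s->tail, memory_order_acquire);
		uint32_t entries = capacity + stail - *old_head;

		if (n > entries)
			n = fixed ? 0 : entries;
		if (n == 0)
			return 0;
		*new_head = *old_head + n;
		/* On failure the CAS writes the observed head into *old_head. */
	} while (!atomic_compare_exchange_strong_explicit(&d->head, old_head,
			*new_head, memory_order_release, memory_order_acquire));
	return n;
}
```

With the mode decided by the caller, the MT loop body no longer carries
a per-iteration is_st test, which is the point of the split.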
lib/ring/meson.build | 2 +-
lib/ring/rte_ring_c11_pvt.h | 75 +++--------
lib/ring/rte_ring_elem_pvt.h | 125 ++++++++++++++++--
..._ring_generic_pvt.h => rte_ring_x86_pvt.h} | 61 ++-------
lib/ring/soring.c | 15 ++-
5 files changed, 158 insertions(+), 120 deletions(-)
rename lib/ring/{rte_ring_generic_pvt.h => rte_ring_x86_pvt.h} (60%)
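Note for reviewers: the two CAS forms involved in the GCC-on-x86
exception differ in how a failed exchange is reported, which is why the
x86 loop reloads d->head at the top of each iteration while the C11
loop does not. A minimal sketch, assuming GCC or Clang (the __sync
builtin is a compiler extension); cas_c11() and cas_sync() are
hypothetical wrapper names used only for this illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * C11 form: "expected" is passed by pointer. On failure the value
 * actually found in *dst is written back into *expected.
 */
static inline int
cas_c11(_Atomic uint32_t *dst, uint32_t *expected, uint32_t desired)
{
	return atomic_compare_exchange_strong_explicit(dst, expected, desired,
			memory_order_release, memory_order_acquire);
}

/*
 * Legacy builtin: "expected" is passed by value and implies a full
 * barrier. On failure the caller has to reload *dst itself.
 */
static inline int
cas_sync(uint32_t *dst, uint32_t expected, uint32_t desired)
{
	return __sync_bool_compare_and_swap(dst, expected, desired);
}
```

Nothing here measures performance; the sketch only shows the interface
difference between the two forms.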
diff --git a/lib/ring/meson.build b/lib/ring/meson.build
index 21f2c12989..b178c963b8 100644
--- a/lib/ring/meson.build
+++ b/lib/ring/meson.build
@@ -9,7 +9,7 @@ indirect_headers += files (
'rte_ring_elem.h',
'rte_ring_elem_pvt.h',
'rte_ring_c11_pvt.h',
- 'rte_ring_generic_pvt.h',
+ 'rte_ring_x86_pvt.h',
'rte_ring_hts.h',
'rte_ring_hts_elem_pvt.h',
'rte_ring_peek.h',
diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
index 07b6efc416..3efe011f08 100644
--- a/lib/ring/rte_ring_c11_pvt.h
+++ b/lib/ring/rte_ring_c11_pvt.h
@@ -15,35 +15,10 @@
* @file rte_ring_c11_pvt.h
* It is not recommended to include this file directly,
* include <rte_ring.h> instead.
- * Contains internal helper functions for MP/SP and MC/SC ring modes.
+ * Contains internal helper functions for MP and MC ring modes.
* For more information please refer to <rte_ring.h>.
*/
-/**
- * @internal This function updates tail values.
- */
-static __rte_always_inline void
-__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
- uint32_t new_val, uint32_t single, uint32_t enqueue)
-{
- RTE_SET_USED(enqueue);
-
- /*
- * If there are other enqueues/dequeues in progress that preceded us,
- * we need to wait for them to complete
- */
- if (!single)
- rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
- rte_memory_order_relaxed);
-
- /*
- * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
- * Ensures that memory effects by this thread on ring elements array
- * is observed by a different thread of the other type.
- */
- rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
-}
-
/**
* @internal This is a helper function that moves the producer/consumer head
*
@@ -72,14 +47,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
* If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
*/
static __rte_always_inline unsigned int
-__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
+__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
const struct rte_ring_headtail *s, uint32_t capacity,
- unsigned int is_st, unsigned int n,
- enum rte_ring_queue_behavior behavior,
+ unsigned int n, enum rte_ring_queue_behavior behavior,
uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
{
- uint32_t stail;
- int success;
unsigned int max = n;
/*
@@ -89,8 +61,7 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
* d->head.
* If not, an unsafe partial order may ensue.
*/
- *old_head = rte_atomic_load_explicit(&d->head,
- rte_memory_order_acquire);
+ *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
do {
/* Reset n to the initial burst count */
n = max;
@@ -101,15 +72,14 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
* ring elements array is observed by the time
* this thread observes its tail update.
*/
- stail = rte_atomic_load_explicit(&s->tail,
- rte_memory_order_acquire);
+ uint32_t stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
/* The subtraction is done between two unsigned 32bits value
* (the result is always modulo 32 bits even if we have
* *old_head > s->tail). So 'entries' is always between 0
* and capacity (which is < size).
*/
- *entries = (capacity + stail - *old_head);
+ *entries = capacity + stail - *old_head;
/* check that we have enough room in ring */
if (unlikely(n > *entries))
@@ -120,25 +90,20 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
return 0;
*new_head = *old_head + n;
- if (is_st) {
- d->head = *new_head;
- success = 1;
- } else
- /* on failure, *old_head is updated */
- /*
- * R1/A2.
- * R1: Establishes a synchronizing edge with A0 of a
- * different thread.
- * A2: Establishes a synchronizing edge with R1 of a
- * different thread to observe same value for stail
- * observed by that thread on CAS failure (to retry
- * with an updated *old_head).
- */
- success = rte_atomic_compare_exchange_strong_explicit(
- &d->head, old_head, *new_head,
- rte_memory_order_release,
- rte_memory_order_acquire);
- } while (unlikely(success == 0));
+
+ /* on failure, *old_head is updated */
+ /*
+ * R1/A2.
+ * R1: Establishes a synchronizing edge with A0 of a
+ * different thread.
+ * A2: Establishes a synchronizing edge with R1 of a
+ * different thread to observe same value for stail
+ * observed by that thread on CAS failure (to retry
+ * with an updated *old_head).
+ */
+ } while (unlikely(!rte_atomic_compare_exchange_strong_explicit(
+ &d->head, old_head, *new_head,
+ rte_memory_order_release, rte_memory_order_acquire)));
return n;
}
diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
index 6eafae121f..9d1da12a92 100644
--- a/lib/ring/rte_ring_elem_pvt.h
+++ b/lib/ring/rte_ring_elem_pvt.h
@@ -299,17 +299,108 @@ __rte_ring_dequeue_elems(struct rte_ring *r, uint32_t cons_head,
cons_head & r->mask, esize, num);
}
-/* Between load and load. there might be cpu reorder in weak model
- * (powerpc/arm).
- * There are 2 choices for the users
- * 1.use rmb() memory barrier
- * 2.use one-direction load_acquire/store_release barrier
- * It depends on performance test results.
+/**
+ * @internal This function updates tail values.
*/
-#ifdef RTE_USE_C11_MEM_MODEL
-#include "rte_ring_c11_pvt.h"
+static __rte_always_inline void
+__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
+ uint32_t new_val, uint32_t single, uint32_t enqueue)
+{
+ RTE_SET_USED(enqueue);
+
+ /*
+ * If there are other enqueues/dequeues in progress that preceded us,
+ * we need to wait for them to complete
+ */
+ if (!single)
+ rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
+ rte_memory_order_relaxed);
+
+ /*
+ * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
+ * Ensures that memory effects by this thread on ring elements array
+ * is observed by a different thread of the other type.
+ */
+ rte_atomic_store_explicit(&ht->tail, new_val, rte_memory_order_release);
+}
+
+/**
+ * @internal This is a helper function that moves the producer/consumer head
+ *
+ *
+ * This is the optimized version for the single-threaded case.
+ *
+ * @param d
+ * A pointer to the headtail structure with head value to be moved
+ * @param s
+ * A pointer to the counter-part headtail structure. Note that this
+ * function only reads tail value from it
+ * @param capacity
+ * Either ring capacity value (for producer), or zero (for consumer)
+ * @param n
+ * The number of elements we want to move head value on
+ * @param behavior
+ * RTE_RING_QUEUE_FIXED: Move on a fixed number of items
+ * RTE_RING_QUEUE_VARIABLE: Move on as many items as possible
+ * @param old_head
+ * Returns head value as it was before the move
+ * @param new_head
+ * Returns the new head value
+ * @param entries
+ * Returns the number of ring entries available BEFORE head was moved
+ * @return
+ * Actual number of objects the head was moved on
+ * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
+ */
+static __rte_always_inline unsigned int
+__rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
+ const struct rte_ring_headtail *s, uint32_t capacity,
+ unsigned int n, enum rte_ring_queue_behavior behavior,
+ uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
+{
+ uint32_t stail;
+
+ /*
+ * A0: Establishes a synchronizing edge with R1.
+ * Ensure that this thread observes same values
+ * to stail observed by the thread that updated
+ * d->head.
+ * If not, an unsafe partial order may ensue.
+ */
+ *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
+
+ /*
+ * A1: Establishes a synchronizing edge with R0.
+ * Ensures that other thread's memory effects on
+ * ring elements array is observed by the time
+ * this thread observes its tail update.
+ */
+ stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
+
+ /* The subtraction is done between two unsigned 32bits value
+ * (the result is always modulo 32 bits even if we have
+ * *old_head > s->tail). So 'entries' is always between 0
+ * and capacity (which is < size).
+ */
+ *entries = capacity + stail - *old_head;
+
+ /* check that we have enough room in ring */
+ if (unlikely(n > *entries))
+ n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
+
+ if (n > 0) {
+ *new_head = *old_head + n;
+ d->head = *new_head;
+ }
+
+ return n;
+}
+
+/* There are two choices because GCC optimizer does poorly on atomic_compare_exchange */
+#if defined(RTE_TOOLCHAIN_GCC) && defined(RTE_ARCH_X86)
+#include "rte_ring_x86_pvt.h"
#else
-#include "rte_ring_generic_pvt.h"
+#include "rte_ring_c11_pvt.h"
#endif
/**
@@ -341,8 +432,12 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
uint32_t *old_head, uint32_t *new_head,
uint32_t *free_entries)
{
- return __rte_ring_headtail_move_head(&r->prod, &r->cons, r->capacity,
- is_sp, n, behavior, old_head, new_head, free_entries);
+ if (is_sp)
+ return __rte_ring_headtail_move_head_st(&r->prod, &r->cons, r->capacity,
+ n, behavior, old_head, new_head, free_entries);
+ else
+ return __rte_ring_headtail_move_head_mt(&r->prod, &r->cons, r->capacity,
+ n, behavior, old_head, new_head, free_entries);
}
/**
@@ -374,8 +469,12 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
uint32_t *old_head, uint32_t *new_head,
uint32_t *entries)
{
- return __rte_ring_headtail_move_head(&r->cons, &r->prod, 0,
- is_sc, n, behavior, old_head, new_head, entries);
+ if (is_sc)
+ return __rte_ring_headtail_move_head_st(&r->cons, &r->prod, 0,
+ n, behavior, old_head, new_head, entries);
+ else
+ return __rte_ring_headtail_move_head_mt(&r->cons, &r->prod, 0,
+ n, behavior, old_head, new_head, entries);
}
/**
diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_x86_pvt.h
similarity index 60%
rename from lib/ring/rte_ring_generic_pvt.h
rename to lib/ring/rte_ring_x86_pvt.h
index affd2d5ba7..c8de108bbd 100644
--- a/lib/ring/rte_ring_generic_pvt.h
+++ b/lib/ring/rte_ring_x86_pvt.h
@@ -7,39 +7,19 @@
* Used as BSD-3 Licensed with permission from Kip Macy.
*/
-#ifndef _RTE_RING_GENERIC_PVT_H_
-#define _RTE_RING_GENERIC_PVT_H_
+#ifndef _RTE_RING_X86_PVT_H_
+#define _RTE_RING_X86_PVT_H_
/**
- * @file rte_ring_generic_pvt.h
+ * @file rte_ring_x86_pvt.h
* It is not recommended to include this file directly,
* include <rte_ring.h> instead.
- * Contains internal helper functions for MP/SP and MC/SC ring modes.
- * For more information please refer to <rte_ring.h>.
+ *
+ * Contains internal helper functions for MP and MC ring modes.
+ * It is GCC specific to workaround poor optimizer handling of C11 atomic
+ * compare_exchange.
*/
-/**
- * @internal This function updates tail values.
- */
-static __rte_always_inline void
-__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
- uint32_t new_val, uint32_t single, uint32_t enqueue)
-{
- if (enqueue)
- rte_smp_wmb();
- else
- rte_smp_rmb();
- /*
- * If there are other enqueues/dequeues in progress that preceded us,
- * we need to wait for them to complete
- */
- if (!single)
- rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail, old_val,
- rte_memory_order_relaxed);
-
- ht->tail = new_val;
-}
-
/**
* @internal This is a helper function that moves the producer/consumer head
*
@@ -50,8 +30,6 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
* function only reads tail value from it
* @param capacity
* Either ring capacity value (for producer), or zero (for consumer)
- * @param is_st
- * Indicates whether multi-thread safe path is needed or not
* @param n
* The number of elements we want to move head value on
* @param behavior
@@ -68,14 +46,13 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
* If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
*/
static __rte_always_inline unsigned int
-__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
+__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
const struct rte_ring_headtail *s, uint32_t capacity,
- unsigned int is_st, unsigned int n,
+ unsigned int n,
enum rte_ring_queue_behavior behavior,
uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
{
unsigned int max = n;
- int success;
do {
/* Reset n to the initial burst count */
@@ -83,18 +60,13 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
*old_head = d->head;
- /* add rmb barrier to avoid load/load reorder in weak
- * memory model. It is noop on x86
- */
- rte_smp_rmb();
-
/*
* The subtraction is done between two unsigned 32bits value
* (the result is always modulo 32 bits even if we have
* *old_head > s->tail). So 'entries' is always between 0
* and capacity (which is < size).
*/
- *entries = (capacity + s->tail - *old_head);
+ *entries = capacity + s->tail - *old_head;
/* check that we have enough room in ring */
if (unlikely(n > *entries))
@@ -105,15 +77,10 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail *d,
return 0;
*new_head = *old_head + n;
- if (is_st) {
- d->head = *new_head;
- success = 1;
- } else
- success = rte_atomic32_cmpset(
- (uint32_t *)(uintptr_t)&d->head,
- *old_head, *new_head);
- } while (unlikely(success == 0));
+ } while (unlikely(!__sync_bool_compare_and_swap(
+ (uint32_t *)(uintptr_t)&d->head,
+ *old_head, *new_head)));
return n;
}
-#endif /* _RTE_RING_GENERIC_PVT_H_ */
+#endif /* _RTE_RING_X86_PVT_H_ */
diff --git a/lib/ring/soring.c b/lib/ring/soring.c
index 3b90521bdb..0e8bbc03c1 100644
--- a/lib/ring/soring.c
+++ b/lib/ring/soring.c
@@ -135,9 +135,12 @@ __rte_soring_move_prod_head(struct rte_soring *r, uint32_t num,
switch (st) {
case RTE_RING_SYNC_ST:
+ n = __rte_ring_headtail_move_head_st(&r->prod.ht, &r->cons.ht,
+ r->capacity, num, behavior, head, next, free);
+ break;
case RTE_RING_SYNC_MT:
- n = __rte_ring_headtail_move_head(&r->prod.ht, &r->cons.ht,
- r->capacity, st, num, behavior, head, next, free);
+ n = __rte_ring_headtail_move_head_mt(&r->prod.ht, &r->cons.ht,
+ r->capacity, num, behavior, head, next, free);
break;
case RTE_RING_SYNC_MT_RTS:
n = __rte_ring_rts_move_head(&r->prod.rts, &r->cons.ht,
@@ -168,9 +171,13 @@ __rte_soring_move_cons_head(struct rte_soring *r, uint32_t stage, uint32_t num,
switch (st) {
case RTE_RING_SYNC_ST:
+ n = __rte_ring_headtail_move_head_st(&r->cons.ht,
+ &r->stage[stage].ht, 0, num, behavior,
+ head, next, avail);
+ break;
case RTE_RING_SYNC_MT:
- n = __rte_ring_headtail_move_head(&r->cons.ht,
- &r->stage[stage].ht, 0, st, num, behavior,
+ n = __rte_ring_headtail_move_head_mt(&r->cons.ht,
+ &r->stage[stage].ht, 0, num, behavior,
head, next, avail);
break;
case RTE_RING_SYNC_MT_RTS:
--
2.53.0
* [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (2 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32 Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-27 16:52 ` Marat Khalili
2026-05-26 23:23 ` [PATCH v4 05/27] net/bonding: use stdatomic Stephen Hemminger
` (22 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Konstantin Ananyev, Marat Khalili
The BPF_ST_ATOMIC_REG macro expanded to the deprecated
rte_atomicNN_add and rte_atomicNN_exchange.
Replace these with the equivalent rte_stdatomic operations.
Use memory order seq_cst to preserve the previous behavior of
rte_atomicNN_add() / rte_atomicNN_exchange() and to match
the Linux kernel BPF interpreter for these opcodes.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
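Note for reviewers: the semantics of the two opcodes can be sketched in
plain C11 as below. This is an illustration under assumptions, not the
macro from the patch: atomic_op64() and the SKETCH_ATOMIC_* constants
are hypothetical names, and only the 64-bit case is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical opcode values, for this sketch only. */
enum { SKETCH_ATOMIC_ADD, SKETCH_ATOMIC_XCHG };

/*
 * Apply an atomic add or exchange to *dst using the value held in the
 * source register. ADD discards the old memory value; XCHG returns it
 * in the source register.
 */
static inline void
atomic_op64(_Atomic uint64_t *dst, uint64_t *src_reg, int op)
{
	switch (op) {
	case SKETCH_ATOMIC_ADD:
		atomic_fetch_add_explicit(dst, *src_reg, memory_order_seq_cst);
		break;
	case SKETCH_ATOMIC_XCHG:
		*src_reg = atomic_exchange_explicit(dst, *src_reg,
				memory_order_seq_cst);
		break;
	default:
		break; /* unknown opcodes are ignored in this sketch */
	}
}
```

Both operations use seq_cst, the ordering named in the commit message.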
lib/bpf/bpf_exec.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/lib/bpf/bpf_exec.c b/lib/bpf/bpf_exec.c
index 18013753b1..ee6ec7516f 100644
--- a/lib/bpf/bpf_exec.c
+++ b/lib/bpf/bpf_exec.c
@@ -10,6 +10,7 @@
#include <rte_log.h>
#include <rte_debug.h>
#include <rte_byteorder.h>
+#include <rte_stdatomic.h>
#include "bpf_impl.h"
@@ -65,16 +66,16 @@
(type)(reg)[(ins)->src_reg])
#define BPF_ST_ATOMIC_REG(reg, ins, tp) do { \
+ RTE_ATOMIC(uint##tp##_t) *dst = (RTE_ATOMIC(uint##tp##_t) *) \
+ (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off); \
switch (ins->imm) { \
case BPF_ATOMIC_ADD: \
- rte_atomic##tp##_add((rte_atomic##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
+ rte_atomic_fetch_add_explicit(dst, \
+ (reg)[(ins)->src_reg], rte_memory_order_seq_cst); \
break; \
case BPF_ATOMIC_XCHG: \
- (reg)[(ins)->src_reg] = rte_atomic##tp##_exchange((uint##tp##_t *) \
- (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
- (reg)[(ins)->src_reg]); \
+ (reg)[(ins)->src_reg] = rte_atomic_exchange_explicit(dst, \
+ (reg)[(ins)->src_reg], rte_memory_order_seq_cst); \
break; \
default: \
/* this should be caught by validator and never reach here */ \
--
2.53.0
* [PATCH v4 05/27] net/bonding: use stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (3 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
` (21 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Chas Williams, Min Hu (Connor)
The old rte_atomic16 and rte_atomic64 functions are deprecated.
Replace them with rte_stdatomic operations for managing the warning
flags and the RX marker timer.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
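Note for reviewers: the warning-flag change replaces two hand-written
CAS retry loops with single read-modify-write operations. A minimal
C11 sketch of the same idea; set_flags() and take_flags() are
hypothetical names, not functions from the driver:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Set bits atomically; replaces a load / OR / CAS retry loop. */
static inline void
set_flags(_Atomic uint16_t *word, uint16_t flags)
{
	atomic_fetch_or_explicit(word, flags, memory_order_relaxed);
}

/* Fetch all pending bits and clear them in one atomic step. */
static inline uint16_t
take_flags(_Atomic uint16_t *word)
{
	return atomic_exchange_explicit(word, 0, memory_order_relaxed);
}
```

Relaxed ordering is used here because the flags carry no other data
that a reader would need to see.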
drivers/net/bonding/eth_bond_8023ad_private.h | 6 ++--
drivers/net/bonding/rte_eth_bond_8023ad.c | 35 ++++++++-----------
2 files changed, 17 insertions(+), 24 deletions(-)
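Note for reviewers: the marker-timer loop uses a weak compare-exchange
that refreshes the expected value on failure, so the initial load moves
out of the loop. A minimal C11 sketch of that pattern; claim_timer() is
a hypothetical helper, and its convention (the timer holds an expiry
deadline, with 0 meaning "not running") is an assumption made for this
illustration rather than the driver's exact timer logic:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to arm the timer. Returns false if it is still running,
 * true if this caller armed it with a new deadline.
 */
static inline bool
claim_timer(_Atomic uint64_t *timer, uint64_t now, uint64_t timeout)
{
	uint64_t old = atomic_load_explicit(timer, memory_order_relaxed);

	do {
		if (old > now) /* deadline still in the future */
			return false;
		/* A failed CAS stores the current value into 'old'. */
	} while (!atomic_compare_exchange_weak_explicit(timer, &old,
			now + timeout,
			memory_order_seq_cst, memory_order_relaxed));
	return true;
}
```

The weak form may fail spuriously, which is acceptable because the loop
simply re-checks and retries.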
diff --git a/drivers/net/bonding/eth_bond_8023ad_private.h b/drivers/net/bonding/eth_bond_8023ad_private.h
index ab7d15f81a..dd3cf3ed26 100644
--- a/drivers/net/bonding/eth_bond_8023ad_private.h
+++ b/drivers/net/bonding/eth_bond_8023ad_private.h
@@ -9,7 +9,7 @@
#include <rte_ether.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_flow.h>
#include "rte_eth_bond_8023ad.h"
@@ -140,10 +140,10 @@ struct port {
/** Timer which is also used as mutex. If is 0 (not running) RX marker
* packet might be responded. Otherwise shall be dropped. It is zeroed in
* mode 4 callback function after expire. */
- volatile uint64_t rx_marker_timer;
+ RTE_ATOMIC(uint64_t) rx_marker_timer;
uint64_t warning_timer;
- volatile uint16_t warnings_to_show;
+ RTE_ATOMIC(uint16_t) warnings_to_show;
/** Memory pool used to allocate slow queues */
struct rte_mempool *slow_pool;
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index ba88f6d261..cc7e4af2b9 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -171,27 +171,17 @@ timer_is_running(uint64_t *timer)
static void
set_warning_flags(struct port *port, uint16_t flags)
{
- int retval;
- uint16_t old;
- uint16_t new_flag = 0;
-
- do {
- old = port->warnings_to_show;
- new_flag = old | flags;
- retval = rte_atomic16_cmpset(&port->warnings_to_show, old, new_flag);
- } while (unlikely(retval == 0));
+ rte_atomic_fetch_or_explicit(&port->warnings_to_show, flags, rte_memory_order_relaxed);
}
static void
show_warnings(uint16_t member_id)
{
struct port *port = &bond_mode_8023ad_ports[member_id];
- uint8_t warnings;
-
- do {
- warnings = port->warnings_to_show;
- } while (rte_atomic16_cmpset(&port->warnings_to_show, warnings, 0) == 0);
+ uint16_t warnings;
+ warnings = rte_atomic_exchange_explicit(&port->warnings_to_show, 0,
+ rte_memory_order_relaxed);
if (!warnings)
return;
@@ -1337,7 +1327,6 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
struct port *port = &bond_mode_8023ad_ports[member_id];
struct marker_header *m_hdr;
uint64_t marker_timer, old_marker_timer;
- int retval;
uint8_t wrn, subtype;
/* If packet is a marker, we send response now by reusing given packet
* and update only source MAC, destination MAC is multicast so don't
@@ -1354,17 +1343,19 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
}
/* Setup marker timer. Do it in loop in case concurrent access. */
+ old_marker_timer = rte_atomic_load_explicit(&port->rx_marker_timer,
+ rte_memory_order_relaxed);
do {
- old_marker_timer = port->rx_marker_timer;
if (!timer_is_expired(&old_marker_timer)) {
wrn = WRN_RX_MARKER_TO_FAST;
goto free_out;
}
timer_set(&marker_timer, mode4->rx_marker_timeout);
- retval = rte_atomic64_cmpset(&port->rx_marker_timer,
- old_marker_timer, marker_timer);
- } while (unlikely(retval == 0));
+
+ } while (!rte_atomic_compare_exchange_weak_explicit(&port->rx_marker_timer,
+ &old_marker_timer, marker_timer,
+ rte_memory_order_seq_cst, rte_memory_order_relaxed));
m_hdr->marker.tlv_type_marker = MARKER_TLV_TYPE_RESP;
rte_eth_macaddr_get(member_id, &m_hdr->eth_hdr.src_addr);
@@ -1372,7 +1363,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
if (internals->mode4.dedicated_queues.enabled == 0) {
if (rte_ring_enqueue(port->tx_ring, pkt) != 0) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
@@ -1386,7 +1378,8 @@ bond_mode_8023ad_handle_slow_pkt(struct bond_dev_private *internals,
&pkt, tx_count);
if (tx_count != 1) {
/* reset timer */
- port->rx_marker_timer = 0;
+ rte_atomic_store_explicit(&port->rx_marker_timer, 0,
+ rte_memory_order_release);
wrn = WRN_TX_QUEUE_FULL;
goto free_out;
}
--
2.53.0
* [PATCH v4 06/27] net/nbl: remove unused rte_atomic16 field
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (4 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 05/27] net/bonding: use stdatomic Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
` (20 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Dimon Zhao, Leon Yu, Sam Chen
The tx_current_queue field was defined as rte_atomic16_t, which
is deprecated. Remove it since it was never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/nbl/nbl_hw/nbl_resource.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/nbl/nbl_hw/nbl_resource.h b/drivers/net/nbl/nbl_hw/nbl_resource.h
index bf5a9461f5..f2182ba6bc 100644
--- a/drivers/net/nbl/nbl_hw/nbl_resource.h
+++ b/drivers/net/nbl/nbl_hw/nbl_resource.h
@@ -225,7 +225,6 @@ struct nbl_res_info {
u16 base_qid;
u16 lcore_max;
u16 *pf_qid_to_lcore_id;
- rte_atomic16_t tx_current_queue;
};
struct nbl_resource_mgt {
--
2.53.0
* [PATCH v4 07/27] net/ena: replace use of rte_atomicNN
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (5 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
` (19 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, Shai Brandes, Evgeny Schemeilin, Ron Beider,
Amit Bernstein, Wajeeh Atrash
Convert the legacy rte_atomicNN operations to stdatomic.
* Remove the variable ena_alloc_cnt, which is defined but not used.
It is a leftover from the previous memzone naming scheme.
* Convert the legacy rte_atomic32_t and rte_atomic32_{inc,dec,set,read}
macros to C11 stdatomic equivalents.
Memory ordering is kept at seq_cst,
matching the implicit ordering of the legacy API.
* Do not use rte_atomic for statistics.
The DPDK PMD model is that statistics do not have to be exact
in the face of contention.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
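Note, not part of the commit: a minimal sketch of the two patterns
described above. It uses plain C11 <stdatomic.h> as a stand-in for the
rte_atomic_*_explicit() wrappers so that it compiles without DPDK; that
substitution and every demo_* / DEMO_* name are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for ena_atomic32_t (the patch uses RTE_ATOMIC(int32_t)). */
typedef _Atomic int32_t demo_atomic32_t;

/* Hypothetical analogues of the ATOMIC32_* macros, kept at seq_cst. */
#define DEMO_ATOMIC32_INC(p) \
	atomic_fetch_add_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_DEC(p) \
	atomic_fetch_sub_explicit((p), 1, memory_order_seq_cst)
#define DEMO_ATOMIC32_SET(p, val) \
	atomic_store_explicit((p), (val), memory_order_seq_cst)
#define DEMO_ATOMIC32_READ(p) \
	atomic_load_explicit((p), memory_order_seq_cst)

/* Statistics become plain integers: a lost update under contention
 * is accepted in exchange for a cheaper fast path. */
struct demo_driver_stats {
	uint64_t ierrors;
	uint64_t rx_nombuf;
};

/* Exercise the four atomic macros on one counter. */
static int32_t demo_counter_roundtrip(void)
{
	demo_atomic32_t v = 0;

	DEMO_ATOMIC32_SET(&v, 5);
	DEMO_ATOMIC32_INC(&v);
	DEMO_ATOMIC32_INC(&v);
	DEMO_ATOMIC32_DEC(&v);
	return DEMO_ATOMIC32_READ(&v);
}

/* Bump plain (non-atomic) statistics with ordinary increments. */
static uint64_t demo_stats_bump(void)
{
	struct demo_driver_stats stats = { 0 };

	++stats.rx_nombuf;
	++stats.ierrors;
	++stats.ierrors;
	return stats.ierrors * 10 + stats.rx_nombuf;
}
```

A caller could check the behaviour with
assert(demo_counter_roundtrip() == 6). One visible difference from the
legacy API: the fetch_add/fetch_sub forms return the previous value,
which the macros here simply discard.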
drivers/net/ena/base/ena_plat_dpdk.h | 14 +++++++++-----
drivers/net/ena/ena_ethdev.c | 21 ++++++---------------
drivers/net/ena/ena_ethdev.h | 7 +++----
3 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index c84420de22..83b354d9da 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -40,7 +40,7 @@ typedef uint64_t dma_addr_t;
#endif
#define ENA_PRIu64 PRIu64
-#define ena_atomic32_t rte_atomic32_t
+typedef RTE_ATOMIC(int32_t) ena_atomic32_t;
#define ena_mem_handle_t const struct rte_memzone *
#define SZ_256 (256U)
@@ -267,10 +267,14 @@ ena_mem_alloc_coherent(struct rte_eth_dev_data *data, size_t size,
#define ENA_REG_READ32(bus, reg) \
__extension__ ({ (void)(bus); rte_read32_relaxed((reg)); })
-#define ATOMIC32_INC(i32_ptr) rte_atomic32_inc(i32_ptr)
-#define ATOMIC32_DEC(i32_ptr) rte_atomic32_dec(i32_ptr)
-#define ATOMIC32_SET(i32_ptr, val) rte_atomic32_set(i32_ptr, val)
-#define ATOMIC32_READ(i32_ptr) rte_atomic32_read(i32_ptr)
+#define ATOMIC32_INC(i32_ptr) \
+ rte_atomic_fetch_add_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_DEC(i32_ptr) \
+ rte_atomic_fetch_sub_explicit((i32_ptr), 1, rte_memory_order_seq_cst)
+#define ATOMIC32_SET(i32_ptr, val) \
+ rte_atomic_store_explicit((i32_ptr), (val), rte_memory_order_seq_cst)
+#define ATOMIC32_READ(i32_ptr) \
+ rte_atomic_load_explicit((i32_ptr), rte_memory_order_seq_cst)
#define msleep(x) rte_delay_us(x * 1000)
#define udelay(x) rte_delay_us(x)
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index ea4afbc75d..e9c484456c 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -121,12 +121,6 @@ struct ena_stats {
*/
#define ENA_DEVARG_ENABLE_FRAG_BYPASS "enable_frag_bypass"
-/*
- * Each rte_memzone should have unique name.
- * To satisfy it, count number of allocation and add it to name.
- */
-rte_atomic64_t ena_alloc_cnt;
-
static const struct ena_stats ena_stats_global_strings[] = {
ENA_STAT_GLOBAL_ENTRY(wd_expired),
ENA_STAT_GLOBAL_ENTRY(dev_start),
@@ -1249,10 +1243,7 @@ static void ena_stats_restart(struct rte_eth_dev *dev)
{
struct ena_adapter *adapter = dev->data->dev_private;
- rte_atomic64_init(&adapter->drv_stats->ierrors);
- rte_atomic64_init(&adapter->drv_stats->oerrors);
- rte_atomic64_init(&adapter->drv_stats->rx_nombuf);
- adapter->drv_stats->rx_drops = 0;
+ memset(adapter->drv_stats, 0, sizeof(struct ena_driver_stats));
}
static int ena_stats_get(struct rte_eth_dev *dev,
@@ -1289,9 +1280,9 @@ static int ena_stats_get(struct rte_eth_dev *dev,
/* Driver related stats */
stats->imissed = adapter->drv_stats->rx_drops;
- stats->ierrors = rte_atomic64_read(&adapter->drv_stats->ierrors);
- stats->oerrors = rte_atomic64_read(&adapter->drv_stats->oerrors);
- stats->rx_nombuf = rte_atomic64_read(&adapter->drv_stats->rx_nombuf);
+ stats->ierrors = adapter->drv_stats->ierrors;
+ stats->oerrors = adapter->drv_stats->oerrors;
+ stats->rx_nombuf = adapter->drv_stats->rx_nombuf;
/* Queue statistics */
if (qstats) {
@@ -1887,7 +1878,7 @@ static int ena_populate_rx_queue(struct ena_ring *rxq, unsigned int count)
/* get resources for incoming packets */
rc = rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, count);
if (unlikely(rc < 0)) {
- rte_atomic64_inc(&rxq->adapter->drv_stats->rx_nombuf);
+ ++rxq->adapter->drv_stats->rx_nombuf;
++rxq->rx_stats.mbuf_alloc_fail;
PMD_RX_LOG_LINE(DEBUG, "There are not enough free buffers");
return 0;
@@ -3014,7 +3005,7 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(mbuf->ol_flags &
(RTE_MBUF_F_RX_IP_CKSUM_BAD | RTE_MBUF_F_RX_L4_CKSUM_BAD)))
- rte_atomic64_inc(&rx_ring->adapter->drv_stats->ierrors);
+ ++rx_ring->adapter->drv_stats->ierrors;
rx_pkts[completed] = mbuf;
rx_ring->rx_stats.bytes += mbuf->pkt_len;
diff --git a/drivers/net/ena/ena_ethdev.h b/drivers/net/ena/ena_ethdev.h
index 3a66d79384..b204b07767 100644
--- a/drivers/net/ena/ena_ethdev.h
+++ b/drivers/net/ena/ena_ethdev.h
@@ -6,7 +6,6 @@
#ifndef _ENA_ETHDEV_H_
#define _ENA_ETHDEV_H_
-#include <rte_atomic.h>
#include <rte_ether.h>
#include <ethdev_driver.h>
#include <ethdev_pci.h>
@@ -225,9 +224,9 @@ enum ena_adapter_state {
};
struct ena_driver_stats {
- rte_atomic64_t ierrors;
- rte_atomic64_t oerrors;
- rte_atomic64_t rx_nombuf;
+ u64 ierrors;
+ u64 oerrors;
+ u64 rx_nombuf;
u64 rx_drops;
};
--
2.53.0
* [PATCH v4 08/27] net/failsafe: convert to stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (6 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
` (18 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gaetan Rivet
The rte_atomic64 functions are deprecated; convert this
code to use stdatomic for the reference count. Use the memory
order implied by the P/V naming: acquire for P, release for V.
No need for initialization since refcnt is in space
allocated with rte_zmalloc().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
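Note, not part of the commit: a minimal sketch of the P/V ordering
described above, written against plain C11 <stdatomic.h> rather than
the rte_atomic_*_explicit() wrappers (an assumption, so it builds
without DPDK). The demo_* names are hypothetical, and the idea that
P/V bracket a burst while the control path polls the flag is my
reading of the macros, not something this patch states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef _Atomic uint64_t demo_refcnt_t;

/* P: mark the queue busy. Acquire keeps the accesses that follow
 * from being reordered before the flag is taken. */
static inline uint64_t demo_p(demo_refcnt_t *a)
{
	return atomic_exchange_explicit(a, 1, memory_order_acquire);
}

/* V: mark the queue idle. Release makes the accesses done while
 * busy visible before the flag is observed as clear. */
static inline void demo_v(demo_refcnt_t *a)
{
	atomic_store_explicit(a, 0, memory_order_release);
}

/* Observer side, analogous to FS_ATOMIC_RX()/FS_ATOMIC_TX(). */
static inline uint64_t demo_busy(demo_refcnt_t *a)
{
	return atomic_load_explicit(a, memory_order_seq_cst);
}

/* One P/V cycle: busy reads 1 inside the section, 0 after it. */
static int demo_pv_cycle(void)
{
	demo_refcnt_t refcnt = 0;
	int seen = 0;

	demo_p(&refcnt);
	seen += (int)demo_busy(&refcnt);
	demo_v(&refcnt);
	seen += (int)demo_busy(&refcnt);
	return seen;
}
```

For example, assert(demo_pv_cycle() == 1) holds because the flag is
set only between demo_p() and demo_v().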
drivers/net/failsafe/failsafe_ops.c | 12 +++++-----
drivers/net/failsafe/failsafe_private.h | 29 ++++++++++++++-----------
drivers/net/failsafe/failsafe_rxtx.c | 2 +-
3 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index ddc8808ebe..fcb0051777 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -11,7 +11,7 @@
#endif
#include <rte_debug.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <ethdev_driver.h>
#include <rte_malloc.h>
#include <rte_flow.h>
@@ -440,14 +440,13 @@ fs_rx_queue_setup(struct rte_eth_dev *dev,
}
rxq = rte_zmalloc(NULL,
sizeof(*rxq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (rxq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&rxq->refcnt[i]);
+
rxq->qid = rx_queue_id;
rxq->socket_id = socket_id;
rxq->info.mp = mb_pool;
@@ -617,14 +616,13 @@ fs_tx_queue_setup(struct rte_eth_dev *dev,
}
txq = rte_zmalloc("ethdev TX queue",
sizeof(*txq) +
- sizeof(rte_atomic64_t) * PRIV(dev)->subs_tail,
+ sizeof(uint64_t) * PRIV(dev)->subs_tail,
RTE_CACHE_LINE_SIZE);
if (txq == NULL) {
fs_unlock(dev, 0);
return -ENOMEM;
}
- FOREACH_SUBDEV(sdev, i, dev)
- rte_atomic64_init(&txq->refcnt[i]);
+
txq->qid = tx_queue_id;
txq->socket_id = socket_id;
txq->info.conf = *tx_conf;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index babea6016e..89b06f9756 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -10,7 +10,7 @@
#include <sys/queue.h>
#include <pthread.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <dev_driver.h>
#include <ethdev_driver.h>
#include <rte_devargs.h>
@@ -75,7 +75,7 @@ struct rxq {
int event_fd;
unsigned int enable_events:1;
struct rte_eth_rxq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct txq {
@@ -83,7 +83,7 @@ struct txq {
uint16_t qid;
unsigned int socket_id;
struct rte_eth_txq_info info;
- rte_atomic64_t refcnt[];
+ RTE_ATOMIC(uint64_t) refcnt[];
};
struct rte_flow {
@@ -320,33 +320,36 @@ extern int failsafe_mac_from_arg;
*/
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_P(a) \
- rte_atomic64_set(&(a), 1)
+ rte_atomic_exchange_explicit(&(a), 1, rte_memory_order_acquire)
/**
- * a: (rte_atomic64_t)
+ * a: _Atomic uint64_t
*/
#define FS_ATOMIC_V(a) \
- rte_atomic64_set(&(a), 0)
+ rte_atomic_store_explicit(&(a), 0, rte_memory_order_release)
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_RX(s, i) \
- rte_atomic64_read( \
- &((struct rxq *) \
- (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct rxq *) \
+ (fs_dev(s)->data->rx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
+
/**
* s: (struct sub_device *)
* i: uint16_t qid
*/
#define FS_ATOMIC_TX(s, i) \
- rte_atomic64_read( \
- &((struct txq *) \
- (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid])
+ rte_atomic_load_explicit( \
+ &((struct txq *) \
+ (fs_dev(s)->data->tx_queues[i]))->refcnt[(s)->sid], \
+ rte_memory_order_seq_cst)
#ifdef RTE_EXEC_ENV_FREEBSD
#define FS_THREADID_TYPE void*
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..500483bda3 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -3,7 +3,7 @@
* Copyright 2017 Mellanox Technologies, Ltd
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_debug.h>
#include <rte_mbuf.h>
#include <ethdev_driver.h>
--
2.53.0
* [PATCH v4 09/27] net/enic: do not use deprecated rte_atomic64
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (7 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
@ 2026-05-26 23:23 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
` (17 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:23 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, John Daley, Hyong Youb Kim, Bruce Richardson,
Konstantin Ananyev
The rte_atomic64 datatype and functions are deprecated.
This driver was only using it for error statistics, where atomics
are not necessary. The DPDK PMD model is that statistics do
not have to be exact in the face of contention.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/enic/enic.h | 6 +++---
drivers/net/enic/enic_compat.h | 1 -
drivers/net/enic/enic_main.c | 17 +++++++----------
drivers/net/enic/enic_rxtx.c | 14 ++++++--------
drivers/net/enic/enic_rxtx_vec_avx2.c | 4 ++--
5 files changed, 18 insertions(+), 24 deletions(-)
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 87f6b35fcd..0a8d4a29ca 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -59,9 +59,9 @@
#define ENICPMD_RXQ_INTR_OFFSET 1
struct enic_soft_stats {
- rte_atomic64_t rx_nombuf;
- rte_atomic64_t rx_packet_errors;
- rte_atomic64_t tx_oversized;
+ uint64_t rx_nombuf;
+ uint64_t rx_packet_errors;
+ uint64_t tx_oversized;
};
struct enic_memzone_entry {
diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index 7cff6831b9..3ce4299e81 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -9,7 +9,6 @@
#include <stdio.h>
#include <unistd.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include <rte_log.h>
#include <rte_io.h>
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 2696fa77d4..fb9a5754c9 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -83,17 +83,15 @@ static void enic_log_q_error(struct enic *enic)
static void enic_clear_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_clear(&soft_stats->rx_nombuf);
- rte_atomic64_clear(&soft_stats->rx_packet_errors);
- rte_atomic64_clear(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
}
static void enic_init_soft_stats(struct enic *enic)
{
struct enic_soft_stats *soft_stats = &enic->soft_stats;
- rte_atomic64_init(&soft_stats->rx_nombuf);
- rte_atomic64_init(&soft_stats->rx_packet_errors);
- rte_atomic64_init(&soft_stats->tx_oversized);
+
+ memset(soft_stats, 0, sizeof(*soft_stats));
enic_clear_soft_stats(enic);
}
@@ -132,7 +130,7 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
* counted in ibytes even though truncated packets are dropped
* which can make ibytes be slightly higher than it should be.
*/
- rx_packet_errors = rte_atomic64_read(&soft_stats->rx_packet_errors);
+ rx_packet_errors = soft_stats->rx_packet_errors;
rx_truncated = rx_packet_errors - stats->rx.rx_errors;
r_stats->ipackets = stats->rx.rx_frames_ok - rx_truncated;
@@ -142,12 +140,11 @@ int enic_dev_stats_get(struct enic *enic, struct rte_eth_stats *r_stats,
r_stats->obytes = stats->tx.tx_bytes_ok;
r_stats->ierrors = stats->rx.rx_errors + stats->rx.rx_drop;
- r_stats->oerrors = stats->tx.tx_errors
- + rte_atomic64_read(&soft_stats->tx_oversized);
+ r_stats->oerrors = stats->tx.tx_errors + soft_stats->tx_oversized;
r_stats->imissed = stats->rx.rx_no_bufs + rx_truncated;
- r_stats->rx_nombuf = rte_atomic64_read(&soft_stats->rx_nombuf);
+ r_stats->rx_nombuf = soft_stats->rx_nombuf;
return 0;
}
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 549a153332..c87d947b93 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -112,7 +112,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
/* allocate a new mbuf */
nmb = rte_mbuf_raw_alloc(rq->mp);
if (nmb == NULL) {
- rte_atomic64_inc(&enic->soft_stats.rx_nombuf);
+ ++enic->soft_stats.rx_nombuf;
break;
}
@@ -185,7 +185,7 @@ enic_recv_pkts_common(void *rx_queue, struct rte_mbuf **rx_pkts,
}
if (unlikely(packet_error)) {
rte_pktmbuf_free(first_seg);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
continue;
}
@@ -303,7 +303,7 @@ enic_noscatter_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
cqd++;
continue;
}
@@ -505,14 +505,12 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint8_t offload_mode;
uint16_t header_len;
uint64_t tso;
- rte_atomic64_t *tx_oversized;
enic_cleanup_wq(enic, wq);
wq_desc_avail = vnic_wq_desc_avail(wq);
head_idx = wq->head_idx;
desc_count = wq->ring.desc_count;
ol_flags_mask = RTE_MBUF_F_TX_VLAN | RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK;
- tx_oversized = &enic->soft_stats.tx_oversized;
nb_pkts = RTE_MIN(nb_pkts, ENIC_TX_XMIT_MAX);
@@ -527,7 +525,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
/* drop packet if it's too big to send */
if (unlikely(!tso && pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -558,7 +556,7 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
if (unlikely(header_len == 0 || ((tx_pkt->tso_segsz +
header_len) > ENIC_TX_MAX_PKT_SIZE))) {
rte_pktmbuf_free(tx_pkt);
- rte_atomic64_inc(tx_oversized);
+ ++enic->soft_stats.tx_oversized;
continue;
}
@@ -681,7 +679,7 @@ static void enqueue_simple_pkts(struct rte_mbuf **pkts,
*/
if (unlikely(p->pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
desc->length = ENIC_TX_MAX_PKT_SIZE;
- rte_atomic64_inc(&enic->soft_stats.tx_oversized);
+ ++enic->soft_stats.tx_oversized;
}
desc++;
}
diff --git a/drivers/net/enic/enic_rxtx_vec_avx2.c b/drivers/net/enic/enic_rxtx_vec_avx2.c
index 600efff270..53589ab788 100644
--- a/drivers/net/enic/enic_rxtx_vec_avx2.c
+++ b/drivers/net/enic/enic_rxtx_vec_avx2.c
@@ -81,7 +81,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
@@ -761,7 +761,7 @@ enic_noscatter_vec_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
if (unlikely(cqd->bytes_written_flags &
CQ_ENET_RQ_DESC_FLAGS_TRUNCATED)) {
rte_pktmbuf_free(*rxmb++);
- rte_atomic64_inc(&enic->soft_stats.rx_packet_errors);
+ ++enic->soft_stats.rx_packet_errors;
} else {
*rx++ = rx_one(cqd, *rxmb++, enic);
}
--
2.53.0
* [PATCH v4 10/27] net/pfe: use ethdev linkstatus helpers
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (8 preceding siblings ...)
2026-05-26 23:23 ` [PATCH v4 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
` (16 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Gagandeep Singh
Rather than open coding with deprecated rte_atomic64,
use the existing ethdev helpers to get and set link status.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
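Note, not part of the commit: a rough sketch of the idea behind the
link status helpers, namely treating an 8-byte link record as a single
64-bit atomic word. This is not the ethdev implementation; the struct
layout and all demo_* names are hypothetical, and plain C11
<stdatomic.h> is used so the example builds without DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte link record; the field layout is illustrative. */
struct demo_link {
	uint32_t speed;
	uint16_t duplex;
	uint16_t status;
};

_Static_assert(sizeof(struct demo_link) == sizeof(uint64_t),
	       "link record must fit in one 64-bit word");

/* Device-side copy; static storage, so it starts out as zero. */
static _Atomic uint64_t demo_dev_link;

/* Publish a new link record with one atomic store. */
static void demo_link_set(const struct demo_link *link)
{
	uint64_t raw;

	memcpy(&raw, link, sizeof(raw));
	atomic_store_explicit(&demo_dev_link, raw, memory_order_seq_cst);
}

/* Read a consistent snapshot with one atomic load. */
static void demo_link_get(struct demo_link *link)
{
	uint64_t raw = atomic_load_explicit(&demo_dev_link,
					    memory_order_seq_cst);

	memcpy(link, &raw, sizeof(raw));
}

/* Write a record and read it back. */
static uint32_t demo_link_roundtrip(void)
{
	struct demo_link in = { .speed = 1000, .duplex = 1, .status = 1 };
	struct demo_link out;

	demo_link_set(&in);
	demo_link_get(&out);
	return out.speed + out.status;
}
```

A quick check is assert(demo_link_roundtrip() == 1001). A plain
load/store pair cannot fail, unlike a compare-and-set whose expected
value was read non-atomically just before the call.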
drivers/net/pfe/pfe_ethdev.c | 32 ++------------------------------
1 file changed, 2 insertions(+), 30 deletions(-)
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 1efa17539e..1b183ab1f3 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -531,34 +531,6 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements)
return NULL;
}
-static inline int
-pfe_eth_atomic_read_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = link;
- struct rte_eth_link *src = &dev->data->dev_link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
-static inline int
-pfe_eth_atomic_write_link_status(struct rte_eth_dev *dev,
- struct rte_eth_link *link)
-{
- struct rte_eth_link *dst = &dev->data->dev_link;
- struct rte_eth_link *src = link;
-
- if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
- *(uint64_t *)src) == 0)
- return -1;
-
- return 0;
-}
-
static int
pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
{
@@ -570,7 +542,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
memset(&old, 0, sizeof(old));
memset(&link, 0, sizeof(struct rte_eth_link));
- pfe_eth_atomic_read_link_status(dev, &old);
+ rte_eth_linkstatus_get(dev, &old);
/* Read from PFE CDEV, status of link, if file was successfully
* opened.
@@ -601,7 +573,7 @@ pfe_eth_link_update(struct rte_eth_dev *dev, int wait_to_complete __rte_unused)
link.link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
link.link_autoneg = RTE_ETH_LINK_AUTONEG;
- pfe_eth_atomic_write_link_status(dev, &link);
+ rte_eth_linkstatus_set(dev, &link);
PFE_PMD_INFO("Port (%d) link is %s", dev->data->port_id,
link.link_status ? "up" : "down");
--
2.53.0
* [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (9 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-06-01 9:22 ` Andrew Rybchenko
2026-05-26 23:24 ` [PATCH v4 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
` (15 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Andrew Rybchenko
The rte_atomicNN functions are deprecated and need to be replaced.
Use stdatomic for the restart required flag.
Use existing ethdev helper to set link status.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
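Note, not part of the commit: a minimal sketch of how one atomic
exchange can stand in for both rte_atomic32_test_and_set() and
rte_atomic32_cmpset(..., 1, 0) on a boolean flag. It uses plain C11
<stdatomic.h> instead of the rte_atomic_*_explicit() wrappers (an
assumption, so it builds without DPDK); the demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool demo_restart_required;

/* Returns true if the caller must arm the alarm, i.e. the flag was
 * clear before this call. exchange() returns the previous value, so
 * the sense is inverted relative to test_and_set(). */
static bool demo_schedule_restart(void)
{
	return !atomic_exchange_explicit(&demo_restart_required, true,
					 memory_order_seq_cst);
}

/* Returns true if a restart was pending, and clears the flag. */
static bool demo_take_restart(void)
{
	return atomic_exchange_explicit(&demo_restart_required, false,
					memory_order_seq_cst);
}

/* Encode the results of schedule, schedule, take, take as bits 0..3. */
static int demo_restart_sequence(void)
{
	int r = 0;

	r |= demo_schedule_restart() << 0; /* flag was clear: arm */
	r |= demo_schedule_restart() << 1; /* already scheduled */
	r |= demo_take_restart() << 2;     /* pending: do the restart */
	r |= demo_take_restart() << 3;     /* nothing pending */
	return r;
}
```

With these definitions assert(demo_restart_sequence() == 5) holds:
only the first schedule and the first take report true.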
drivers/net/sfc/sfc.c | 9 +++++----
drivers/net/sfc/sfc.h | 4 ++--
drivers/net/sfc/sfc_port.c | 7 +------
drivers/net/sfc/sfc_stats.h | 2 +-
4 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/sfc/sfc.c b/drivers/net/sfc/sfc.c
index 69747e49ae..3470f7eed6 100644
--- a/drivers/net/sfc/sfc.c
+++ b/drivers/net/sfc/sfc.c
@@ -670,8 +670,8 @@ sfc_restart_if_required(void *arg)
struct sfc_adapter *sa = arg;
/* If restart is scheduled, clear the flag and do it */
- if (rte_atomic32_cmpset((volatile uint32_t *)&sa->restart_required,
- 1, 0)) {
+ if (rte_atomic_exchange_explicit(&sa->restart_required, false,
+ rte_memory_order_seq_cst)) {
sfc_adapter_lock(sa);
if (sa->state == SFC_ETHDEV_STARTED)
(void)sfc_restart(sa);
@@ -685,7 +685,8 @@ sfc_schedule_restart(struct sfc_adapter *sa)
int rc;
/* Schedule restart alarm if it is not scheduled yet */
- if (!rte_atomic32_test_and_set(&sa->restart_required))
+ if (rte_atomic_exchange_explicit(&sa->restart_required, true,
+ rte_memory_order_seq_cst))
return;
rc = rte_eal_alarm_set(1, sfc_restart_if_required, sa);
@@ -1292,7 +1293,7 @@ sfc_probe(struct sfc_adapter *sa)
SFC_ASSERT(sfc_adapter_is_locked(sa));
sa->socket_id = rte_socket_id();
- rte_atomic32_init(&sa->restart_required);
+ sa->restart_required = false;
sfc_log_init(sa, "get family");
rc = sfc_efx_family(pci_dev, &mem_ebrp, &sa->family);
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 629578549f..515e1e708d 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -17,7 +17,7 @@
#include <ethdev_driver.h>
#include <rte_kvargs.h>
#include <rte_spinlock.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "efx.h"
@@ -239,7 +239,7 @@ struct sfc_adapter {
efx_family_t family;
efx_nic_t *nic;
rte_spinlock_t nic_lock;
- rte_atomic32_t restart_required;
+ RTE_ATOMIC(bool) restart_required;
struct sfc_efx_mcdi mcdi;
struct sfc_sriov sriov;
diff --git a/drivers/net/sfc/sfc_port.c b/drivers/net/sfc/sfc_port.c
index 33b53f7ac8..d84648d454 100644
--- a/drivers/net/sfc/sfc_port.c
+++ b/drivers/net/sfc/sfc_port.c
@@ -121,7 +121,6 @@ sfc_port_reset_mac_stats(struct sfc_adapter *sa)
static int
sfc_port_init_dev_link(struct sfc_adapter *sa)
{
- struct rte_eth_link *dev_link = &sa->eth_dev->data->dev_link;
int rc;
efx_link_mode_t link_mode;
struct rte_eth_link current_link;
@@ -132,11 +131,7 @@ sfc_port_init_dev_link(struct sfc_adapter *sa)
sfc_port_link_mode_to_info(link_mode, sa->port.phy_adv_cap,
&current_link);
-
- EFX_STATIC_ASSERT(sizeof(*dev_link) == sizeof(rte_atomic64_t));
- rte_atomic64_set((rte_atomic64_t *)dev_link,
- *(uint64_t *)&current_link);
-
+ rte_eth_linkstatus_set(sa->eth_dev, &current_link);
return 0;
}
diff --git a/drivers/net/sfc/sfc_stats.h b/drivers/net/sfc/sfc_stats.h
index 597e14dab3..eaa2afd3fe 100644
--- a/drivers/net/sfc/sfc_stats.h
+++ b/drivers/net/sfc/sfc_stats.h
@@ -12,7 +12,7 @@
#include <stdint.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include "sfc_tweak.h"
--
2.53.0
* [PATCH v4 12/27] crypto/ccp: replace use of rte_atomic64 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (10 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
` (14 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Sunil Uttarwar
The rte_atomicNN functions are deprecated. Convert the free
slot counter to stdatomic.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
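Note, not part of the commit: a minimal sketch of the free slot
accounting, using plain C11 <stdatomic.h> in place of the
rte_atomic_*_explicit() wrappers (an assumption, so it builds without
DPDK). The demo_* names and the numbers are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic int64_t demo_free_slots;

/* Set the initial number of free slots for the queue. */
static void demo_queue_init(int64_t slots)
{
	atomic_store_explicit(&demo_free_slots, slots,
			      memory_order_seq_cst);
}

/* Enqueue side: take slots out of the pool. */
static void demo_reserve(int64_t n)
{
	atomic_fetch_sub_explicit(&demo_free_slots, n,
				  memory_order_seq_cst);
}

/* Dequeue or error path: hand slots back to the pool. */
static void demo_release(int64_t n)
{
	atomic_fetch_add_explicit(&demo_free_slots, n,
				  memory_order_seq_cst);
}

/* Advisory read used when picking a queue. */
static int64_t demo_available(void)
{
	return atomic_load_explicit(&demo_free_slots,
				    memory_order_relaxed);
}

/* Reserve 8, fail after using 5, then dequeue the 5 that were used. */
static int64_t demo_enqueue_dequeue(void)
{
	int64_t mid;

	demo_queue_init(31);
	demo_reserve(8);     /* 23 left */
	demo_release(8 - 5); /* error path returns the unused 3 */
	mid = demo_available();
	demo_release(5);     /* dequeue returns the used 5 */
	return mid * 100 + demo_available();
}
```

Here assert(demo_enqueue_dequeue() == 2631) holds: 26 slots are free
after the partial failure and 31 after the dequeue. The relaxed load
is only a hint, since the check and the later subtraction are separate
operations.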
drivers/crypto/ccp/ccp_crypto.c | 11 +++++++----
drivers/crypto/ccp/ccp_crypto.h | 2 +-
drivers/crypto/ccp/ccp_dev.c | 10 ++++++----
drivers/crypto/ccp/ccp_dev.h | 4 ++--
4 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/drivers/crypto/ccp/ccp_crypto.c b/drivers/crypto/ccp/ccp_crypto.c
index 5899d83bae..1800ad41c9 100644
--- a/drivers/crypto/ccp/ccp_crypto.c
+++ b/drivers/crypto/ccp/ccp_crypto.c
@@ -2683,7 +2683,8 @@ process_ops_to_enqueue(struct ccp_qp *qp,
b_info->cmd_q = cmd_q;
b_info->lsb_buf_phys = (phys_addr_t)rte_mem_virt2iova((void *)b_info->lsb_buf);
- rte_atomic64_sub(&b_info->cmd_q->free_slots, slots_req);
+ rte_atomic_fetch_sub_explicit(&b_info->cmd_q->free_slots, slots_req,
+ rte_memory_order_seq_cst);
b_info->head_offset = (uint32_t)(cmd_q->qbase_phys_addr + cmd_q->qidx *
Q_DESC_SIZE);
@@ -2729,8 +2730,9 @@ process_ops_to_enqueue(struct ccp_qp *qp,
result = -1;
}
if (unlikely(result < 0)) {
- rte_atomic64_add(&b_info->cmd_q->free_slots,
- (slots_req - b_info->desccnt));
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots,
+ slots_req - b_info->desccnt,
+ rte_memory_order_seq_cst);
break;
}
b_info->op[i] = op[i];
@@ -2914,7 +2916,8 @@ process_ops_to_dequeue(struct ccp_qp *qp,
success:
*total_nb_ops = b_info->total_nb_ops;
nb_ops = ccp_prepare_ops(qp, op, b_info, nb_ops);
- rte_atomic64_add(&b_info->cmd_q->free_slots, b_info->desccnt);
+ rte_atomic_fetch_add_explicit(&b_info->cmd_q->free_slots, b_info->desccnt,
+ rte_memory_order_seq_cst);
b_info->desccnt = 0;
if (b_info->opcnt > 0) {
qp->b_info = b_info;
diff --git a/drivers/crypto/ccp/ccp_crypto.h b/drivers/crypto/ccp/ccp_crypto.h
index d0b417ca29..5c61b1582d 100644
--- a/drivers/crypto/ccp/ccp_crypto.h
+++ b/drivers/crypto/ccp/ccp_crypto.h
@@ -10,7 +10,7 @@
#include <stdint.h>
#include <string.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 5088d8ded6..a75816cdfc 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -47,14 +47,15 @@ ccp_allot_queue(struct rte_cryptodev *cdev, int slot_req)
priv->last_dev = dev;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots, rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
for (i = 0; i < dev->cmd_q_count; i++) {
dev->qidx++;
if (dev->qidx >= dev->cmd_q_count)
dev->qidx = 0;
- ret = rte_atomic64_read(&dev->cmd_q[dev->qidx].free_slots);
+ ret = rte_atomic_load_explicit(&dev->cmd_q[dev->qidx].free_slots,
+ rte_memory_order_relaxed);
if (ret >= slot_req)
return &dev->cmd_q[dev->qidx];
}
@@ -583,8 +584,9 @@ ccp_add_device(struct ccp_device *dev)
CCP_LOG_ERR("queue doesn't have lsb regions");
cmd_q->lsb = -1;
- rte_atomic64_init(&cmd_q->free_slots);
- rte_atomic64_set(&cmd_q->free_slots, (COMMANDS_PER_QUEUE - 1));
+ rte_atomic_store_explicit(&cmd_q->free_slots,
+ COMMANDS_PER_QUEUE - 1,
+ rte_memory_order_seq_cst);
/* unused slot barrier b/w H&T */
}
diff --git a/drivers/crypto/ccp/ccp_dev.h b/drivers/crypto/ccp/ccp_dev.h
index cd63830759..8c408ac8d3 100644
--- a/drivers/crypto/ccp/ccp_dev.h
+++ b/drivers/crypto/ccp/ccp_dev.h
@@ -11,7 +11,7 @@
#include <string.h>
#include <bus_pci_driver.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_byteorder.h>
#include <rte_io.h>
#include <rte_pci.h>
@@ -182,7 +182,7 @@ struct __rte_cache_aligned ccp_queue {
struct ccp_device *dev;
char memz_name[RTE_MEMZONE_NAMESIZE];
- rte_atomic64_t free_slots;
+ RTE_ATOMIC(int64_t) free_slots;
/**< available free slots updated from enq/deq calls */
/* Queue identifier */
--
2.53.0
* [PATCH v4 13/27] bus/dpaa: replace rte_atomic16 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (11 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 14/27] drivers: " Stephen Hemminger
` (13 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
This is a simple in-use flag, which can be implemented with
stdatomic exchange logic.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
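Note, not part of the commit: a minimal sketch of an in-use flag array
claimed with an atomic exchange, using plain C11 <stdatomic.h> in
place of the rte_atomic_*_explicit() wrappers (an assumption, so it
builds without DPDK). The demo_* names and the pool size are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_MAX_PORTALS 4

static atomic_bool demo_used[DEMO_MAX_PORTALS];

/* Claim the first free slot; returns its index or -1 if none is free.
 * exchange() returns the previous value, so "was false" means the
 * caller now owns the slot. */
static int demo_alloc(void)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTALS; i++) {
		if (!atomic_exchange_explicit(&demo_used[i], true,
					      memory_order_acquire))
			return i;
	}
	return -1;
}

/* Give a slot back; release pairs with the acquire in demo_alloc(). */
static void demo_free(int i)
{
	atomic_store_explicit(&demo_used[i], false,
			      memory_order_release);
}

/* Allocate two slots, free the first, and allocate again. */
static int demo_alloc_cycle(void)
{
	int a = demo_alloc();
	int b = demo_alloc();
	int c;

	demo_free(a);
	c = demo_alloc(); /* reuses the slot that was just freed */
	demo_free(b);
	demo_free(c);
	return a * 100 + b * 10 + c;
}
```

With an empty pool, assert(demo_alloc_cycle() == 10) holds: the
indices handed out are 0, 1 and then 0 again.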
drivers/bus/dpaa/base/qbman/qman.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 5534e1846c..82a976141a 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -11,6 +11,7 @@
#include <rte_eventdev.h>
#include <rte_byteorder.h>
#include <rte_dpaa_logs.h>
+#include <rte_stdatomic.h>
#include <eal_export.h>
#include <dpaa_bits.h>
@@ -683,7 +684,7 @@ qman_init_portal(struct qman_portal *portal,
#define MAX_GLOBAL_PORTALS 8
static struct qman_portal global_portals[MAX_GLOBAL_PORTALS];
-static rte_atomic16_t global_portals_used[MAX_GLOBAL_PORTALS];
+static RTE_ATOMIC(bool) global_portals_used[MAX_GLOBAL_PORTALS];
struct qman_portal *
qman_alloc_global_portal(struct qm_portal_config *q_pcfg)
@@ -691,7 +692,8 @@ qman_alloc_global_portal(struct qm_portal_config *q_pcfg)
unsigned int i;
for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
- if (rte_atomic16_test_and_set(&global_portals_used[i])) {
+ if (!rte_atomic_exchange_explicit(&global_portals_used[i], true,
+ rte_memory_order_acquire)) {
global_portals[i].config = q_pcfg;
return &global_portals[i];
}
@@ -708,7 +710,8 @@ qman_free_global_portal(struct qman_portal *portal)
for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
if (&global_portals[i] == portal) {
- rte_atomic16_clear(&global_portals_used[i]);
+ rte_atomic_store_explicit(&global_portals_used[i], false,
+ rte_memory_order_release);
return 0;
}
}
--
2.53.0
* [PATCH v4 14/27] drivers: replace rte_atomic16 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (12 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
` (12 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
The rte_atomicNN functions and types are deprecated.
The in_use and ref_count flags can be converted to stdatomic.
Also drop the unneeded NULL check in the loop body: TAILQ_FOREACH
terminates when the iterator becomes NULL, so var is guaranteed
non-NULL inside the loop.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
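Note for readers: a rough sketch of the claim-by-compare-exchange
pattern and of why no NULL check is needed inside TAILQ_FOREACH. It
uses plain C11 <stdatomic.h> and <sys/queue.h> instead of the DPDK
wrappers; struct dev, dev_register, dev_alloc and dev_free are
hypothetical names, not the fslmc ones.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

struct dev {			/* hypothetical device node */
	TAILQ_ENTRY(dev) next;
	_Atomic uint16_t in_use;
	int id;
};

TAILQ_HEAD(dev_list, dev);
static struct dev_list dev_list = TAILQ_HEAD_INITIALIZER(dev_list);

static void dev_register(struct dev *d, int id)
{
	d->id = id;
	atomic_init(&d->in_use, 0);
	TAILQ_INSERT_TAIL(&dev_list, d, next);
}

/* Claim the first free device, or return NULL if none is free. */
static struct dev *dev_alloc(void)
{
	struct dev *d;

	TAILQ_FOREACH(d, &dev_list, next) {
		/* reset per node: a failed CAS overwrites expected */
		uint16_t expected = 0;

		if (atomic_compare_exchange_strong_explicit(&d->in_use,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed))
			break;
	}
	return d;	/* NULL only when the loop ran off the end */
}

static void dev_free(struct dev *d)
{
	atomic_store_explicit(&d->in_use, 0, memory_order_release);
}
```

The iterator is non-NULL on every pass through the body, so the only
place NULL has to be handled is after the loop.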
drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 10 +++++++---
drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 10 +++++++---
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 12 ++++++++----
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 8 ++++----
drivers/event/dpaa2/dpaa2_hw_dpcon.c | 11 +++++++----
5 files changed, 33 insertions(+), 18 deletions(-)
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
index 925e83e97d..7b08593338 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
@@ -84,7 +84,7 @@ dpaa2_create_dpbp_device(int vdev_fd __rte_unused,
}
dpbp_node->dpbp_id = dpbp_id;
- rte_atomic16_init(&dpbp_node->in_use);
+ dpbp_node->in_use = 0;
TAILQ_INSERT_TAIL(&dpbp_dev_list, dpbp_node, next);
@@ -103,7 +103,10 @@ struct dpaa2_dpbp_dev *dpaa2_alloc_dpbp_dev(void)
/* Get DPBP dev handle from list using index */
TAILQ_FOREACH(dpbp_dev, &dpbp_dev_list, next) {
- if (dpbp_dev && rte_atomic16_test_and_set(&dpbp_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpbp_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -118,7 +121,8 @@ void dpaa2_free_dpbp_dev(struct dpaa2_dpbp_dev *dpbp)
/* Match DPBP handle and mark it free */
TAILQ_FOREACH(dpbp_dev, &dpbp_dev_list, next) {
if (dpbp_dev == dpbp) {
- rte_atomic16_dec(&dpbp_dev->in_use);
+ rte_atomic_store_explicit(&dpbp_dev->in_use, 0,
+ rte_memory_order_release);
return;
}
}
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
index b546da82f6..0e36fcdcd4 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
@@ -135,7 +135,7 @@ rte_dpaa2_create_dpci_device(int vdev_fd __rte_unused,
}
dpci_node->dpci_id = dpci_id;
- rte_atomic16_init(&dpci_node->in_use);
+ dpci_node->in_use = 0;
TAILQ_INSERT_TAIL(&dpci_dev_list, dpci_node, next);
@@ -159,7 +159,10 @@ struct dpaa2_dpci_dev *rte_dpaa2_alloc_dpci_dev(void)
/* Get DPCI dev handle from list using index */
TAILQ_FOREACH(dpci_dev, &dpci_dev_list, next) {
- if (dpci_dev && rte_atomic16_test_and_set(&dpci_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpci_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -174,7 +177,8 @@ void rte_dpaa2_free_dpci_dev(struct dpaa2_dpci_dev *dpci)
/* Match DPCI handle and mark it free */
TAILQ_FOREACH(dpci_dev, &dpci_dev_list, next) {
if (dpci_dev == dpci) {
- rte_atomic16_dec(&dpci_dev->in_use);
+ rte_atomic_store_explicit(&dpci_dev->in_use, 0,
+ rte_memory_order_release);
return;
}
}
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 2a9e519668..06ddb366d8 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -293,7 +293,7 @@ static void dpaa2_put_qbman_swp(struct dpaa2_dpio_dev *dpio_dev)
#ifdef RTE_EVENT_DPAA2
dpaa2_dpio_intr_deinit(dpio_dev);
#endif
- rte_atomic16_clear(&dpio_dev->ref_count);
+ rte_atomic_store_explicit(&dpio_dev->ref_count, 0, rte_memory_order_release);
}
}
@@ -305,7 +305,10 @@ static struct dpaa2_dpio_dev *dpaa2_get_qbman_swp(void)
/* Get DPIO dev handle from list using index */
TAILQ_FOREACH(dpio_dev, &dpio_dev_list, next) {
- if (dpio_dev && rte_atomic16_test_and_set(&dpio_dev->ref_count))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpio_dev->ref_count, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
if (!dpio_dev) {
@@ -326,7 +329,8 @@ static struct dpaa2_dpio_dev *dpaa2_get_qbman_swp(void)
ret = dpaa2_configure_stashing(dpio_dev, cpu_id);
if (ret) {
DPAA2_BUS_ERR("dpaa2_configure_stashing failed");
- rte_atomic16_clear(&dpio_dev->ref_count);
+ rte_atomic_store_explicit(&dpio_dev->ref_count, 0,
+ rte_memory_order_release);
return NULL;
}
}
@@ -441,7 +445,7 @@ dpaa2_create_dpio_device(int vdev_fd,
dpio_dev->dpio = NULL;
dpio_dev->hw_id = object_id;
- rte_atomic16_init(&dpio_dev->ref_count);
+
/* Using single portal for all devices */
dpio_dev->mc_portal = dpaa2_get_mcp_ptr(MC_PORTAL_INDEX);
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index e625a5c035..f2298b18e5 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -112,7 +112,7 @@ struct dpaa2_dpio_dev {
TAILQ_ENTRY(dpaa2_dpio_dev) next;
/**< Pointer to Next device instance */
uint16_t index; /**< Index of a instance in the list */
- rte_atomic16_t ref_count;
+ RTE_ATOMIC(uint16_t) ref_count;
/**< How many thread contexts are sharing this.*/
uint16_t eqresp_ci;
uint16_t eqresp_pi;
@@ -141,7 +141,7 @@ struct dpaa2_dpbp_dev {
/**< Pointer to Next device instance */
struct fsl_mc_io dpbp; /** handle to DPBP portal object */
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpbp_id; /*HW ID for DPBP object */
};
@@ -257,7 +257,7 @@ struct dpaa2_dpci_dev {
/**< Pointer to Next device instance */
struct fsl_mc_io dpci; /** handle to DPCI portal object */
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpci_id; /*HW ID for DPCI object */
struct dpaa2_queue rx_queue[DPAA2_DPCI_MAX_QUEUES];
struct dpaa2_queue tx_queue[DPAA2_DPCI_MAX_QUEUES];
@@ -267,7 +267,7 @@ struct dpaa2_dpcon_dev {
TAILQ_ENTRY(dpaa2_dpcon_dev) next;
struct fsl_mc_io dpcon;
uint16_t token;
- rte_atomic16_t in_use;
+ RTE_ATOMIC(uint16_t) in_use;
uint32_t dpcon_id;
uint16_t qbman_ch_id;
uint8_t num_priorities;
diff --git a/drivers/event/dpaa2/dpaa2_hw_dpcon.c b/drivers/event/dpaa2/dpaa2_hw_dpcon.c
index ea5b0d4b85..4d1d55eace 100644
--- a/drivers/event/dpaa2/dpaa2_hw_dpcon.c
+++ b/drivers/event/dpaa2/dpaa2_hw_dpcon.c
@@ -15,6 +15,7 @@
#include <rte_malloc.h>
#include <rte_memcpy.h>
#include <rte_string_fns.h>
+#include <rte_stdatomic.h>
#include <rte_cycles.h>
#include <rte_kvargs.h>
#include <dev_driver.h>
@@ -53,7 +54,7 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
int ret, dpcon_id = obj->object_id;
/* Allocate DPAA2 dpcon handle */
- dpcon_node = rte_malloc(NULL, sizeof(struct dpaa2_dpcon_dev), 0);
+ dpcon_node = rte_zmalloc(NULL, sizeof(struct dpaa2_dpcon_dev), 0);
if (!dpcon_node) {
DPAA2_EVENTDEV_ERR(
"Memory allocation failed for dpcon device");
@@ -85,7 +86,6 @@ rte_dpaa2_create_dpcon_device(int dev_fd __rte_unused,
dpcon_node->qbman_ch_id = attr.qbman_ch_id;
dpcon_node->num_priorities = attr.num_priorities;
dpcon_node->dpcon_id = dpcon_id;
- rte_atomic16_init(&dpcon_node->in_use);
TAILQ_INSERT_TAIL(&dpcon_dev_list, dpcon_node, next);
@@ -98,7 +98,10 @@ struct dpaa2_dpcon_dev *rte_dpaa2_alloc_dpcon_dev(void)
/* Get DPCON dev handle from list using index */
TAILQ_FOREACH(dpcon_dev, &dpcon_dev_list, next) {
- if (dpcon_dev && rte_atomic16_test_and_set(&dpcon_dev->in_use))
+ uint16_t expected = 0;
+ if (rte_atomic_compare_exchange_strong_explicit(
+ &dpcon_dev->in_use, &expected, 1,
+ rte_memory_order_acquire, rte_memory_order_relaxed))
break;
}
@@ -112,7 +115,7 @@ void rte_dpaa2_free_dpcon_dev(struct dpaa2_dpcon_dev *dpcon)
/* Match DPCON handle and mark it free */
TAILQ_FOREACH(dpcon_dev, &dpcon_dev_list, next) {
if (dpcon_dev == dpcon) {
- rte_atomic16_dec(&dpcon_dev->in_use);
+ rte_atomic_store_explicit(&dpcon_dev->in_use, 0, rte_memory_order_release);
return;
}
}
--
2.53.0
* [PATCH v4 15/27] net/netvsc: replace rte_atomic32 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (13 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 14/27] drivers: " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-27 0:29 ` [EXTERNAL] " Long Li
2026-05-26 23:24 ` [PATCH v4 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
` (11 subsequent siblings)
26 siblings, 1 reply; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Long Li, Wei Hu
Convert the RNDIS transaction id and the receive buffer usage
counter to stdatomic functions.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
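Note for readers: a small sketch of the transaction id and pending-slot
logic, using plain C11 <stdatomic.h> in place of the DPDK wrappers. The
names req_id, pending, next_rid, claim_pending and complete_pending are
hypothetical. One detail worth keeping in mind: a C11 fetch-add yields
the value before the increment, whereas rte_atomic32_add_return yielded
the value after it. The skip-zero loop still hands out non-zero ids
either way.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t req_id;		/* hypothetical id counter */
static _Atomic uint32_t pending;	/* 0 means no request outstanding */

/* Return a non-zero request id; 0 is reserved for "nothing pending". */
static uint32_t next_rid(void)
{
	uint32_t rid;

	do {
		/* fetch_add yields the value *before* the increment */
		rid = atomic_fetch_add_explicit(&req_id, 1,
						memory_order_seq_cst);
	} while (rid == 0);

	return rid;
}

/* Mark rid as the outstanding request; fails if one is already pending. */
static bool claim_pending(uint32_t rid)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&pending, &expected,
			rid, memory_order_acquire, memory_order_relaxed);
}

/* Clear the pending slot if it still holds rid; release publishes the reply. */
static bool complete_pending(uint32_t rid)
{
	uint32_t expected = rid;

	return atomic_compare_exchange_strong_explicit(&pending, &expected,
			0, memory_order_release, memory_order_relaxed);
}
```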
drivers/net/netvsc/hn_rndis.c | 28 +++++++++++++++++++---------
drivers/net/netvsc/hn_rxtx.c | 12 +++++++-----
drivers/net/netvsc/hn_var.h | 6 +++---
3 files changed, 29 insertions(+), 17 deletions(-)
diff --git a/drivers/net/netvsc/hn_rndis.c b/drivers/net/netvsc/hn_rndis.c
index 7c54eebcef..4b1d3d5539 100644
--- a/drivers/net/netvsc/hn_rndis.c
+++ b/drivers/net/netvsc/hn_rndis.c
@@ -17,7 +17,7 @@
#include <rte_string_fns.h>
#include <rte_memzone.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_alarm.h>
#include <rte_branch_prediction.h>
#include <rte_ether.h>
@@ -59,7 +59,8 @@ hn_rndis_rid(struct hn_data *hv)
uint32_t rid;
do {
- rid = rte_atomic32_add_return(&hv->rndis_req_id, 1);
+ rid = rte_atomic_fetch_add_explicit(&hv->rndis_req_id, 1,
+ rte_memory_order_seq_cst);
} while (rid == 0);
return rid;
@@ -357,12 +358,14 @@ void hn_rndis_receive_response(struct hn_data *hv,
memcpy(hv->rndis_resp, data, len);
/* make sure response copied before update */
- rte_smp_wmb();
-
- if (rte_atomic32_cmpset(&hv->rndis_pending, hdr->rid, 0) == 0) {
+ uint32_t expected = hdr->rid;
+ if (!rte_atomic_compare_exchange_strong_explicit(&hv->rndis_pending,
+ &expected, 0,
+ rte_memory_order_release,
+ rte_memory_order_relaxed)) {
PMD_DRV_LOG(NOTICE,
"received id %#x pending id %#x",
- hdr->rid, (uint32_t)hv->rndis_pending);
+ hdr->rid, expected);
}
}
@@ -388,8 +391,11 @@ static int hn_rndis_exec1(struct hn_data *hv,
return -EINVAL;
}
+ uint32_t expected = 0;
if (comp != NULL &&
- rte_atomic32_cmpset(&hv->rndis_pending, 0, rid) == 0) {
+ !rte_atomic_compare_exchange_strong_explicit(
+ &hv->rndis_pending, &expected, rid,
+ rte_memory_order_acquire, rte_memory_order_relaxed)) {
PMD_DRV_LOG(ERR,
"Request already pending");
return -EBUSY;
@@ -405,7 +411,8 @@ static int hn_rndis_exec1(struct hn_data *hv,
time_t start = time(NULL);
/* Poll primary channel until response received */
- while (hv->rndis_pending == rid) {
+ while (rte_atomic_load_explicit(&hv->rndis_pending,
+ rte_memory_order_acquire) == rid) {
if (hv->closed)
return -ENETDOWN;
@@ -413,7 +420,10 @@ static int hn_rndis_exec1(struct hn_data *hv,
PMD_DRV_LOG(ERR,
"RNDIS response timed out");
- rte_atomic32_cmpset(&hv->rndis_pending, rid, 0);
+ expected = rid;
+ rte_atomic_compare_exchange_strong_explicit(
+ &hv->rndis_pending, &expected, 0,
+ rte_memory_order_release, rte_memory_order_relaxed);
return -ETIMEDOUT;
}
diff --git a/drivers/net/netvsc/hn_rxtx.c b/drivers/net/netvsc/hn_rxtx.c
index 0d770d1b25..6f536610f2 100644
--- a/drivers/net/netvsc/hn_rxtx.c
+++ b/drivers/net/netvsc/hn_rxtx.c
@@ -17,7 +17,7 @@
#include <rte_string_fns.h>
#include <rte_memzone.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_bitmap.h>
#include <rte_branch_prediction.h>
#include <rte_ether.h>
@@ -558,7 +558,8 @@ static void hn_rx_buf_free_cb(void *buf __rte_unused, void *opaque)
struct hn_rx_queue *rxq = rxb->rxq;
struct hn_data *hv = rxq->hv;
- rte_atomic32_dec(&rxq->rxbuf_outstanding);
+ rte_atomic_fetch_sub_explicit(&rxq->rxbuf_outstanding, 1,
+ rte_memory_order_release);
hn_nvs_ack_rxbuf(hv, rxb->chan, rxb->xactid);
}
@@ -602,8 +603,8 @@ static void hn_rxpkt(struct hn_rx_queue *rxq, struct hn_rx_bufinfo *rxb,
* some space available in receive area for later packets.
*/
if (hv->rx_extmbuf_enable && dlen > hv->rx_copybreak &&
- (uint32_t)rte_atomic32_read(&rxq->rxbuf_outstanding) <
- hv->rxbuf_section_cnt / 2) {
+ rte_atomic_load_explicit(&rxq->rxbuf_outstanding,
+ rte_memory_order_relaxed) < hv->rxbuf_section_cnt / 2) {
struct rte_mbuf_ext_shared_info *shinfo;
const void *rxbuf;
rte_iova_t iova;
@@ -619,7 +620,8 @@ static void hn_rxpkt(struct hn_rx_queue *rxq, struct hn_rx_bufinfo *rxb,
/* shinfo is already set to 1 by the caller */
if (rte_mbuf_ext_refcnt_update(shinfo, 1) == 2)
- rte_atomic32_inc(&rxq->rxbuf_outstanding);
+ rte_atomic_fetch_add_explicit(&rxq->rxbuf_outstanding, 1,
+ rte_memory_order_acquire);
rte_pktmbuf_attach_extbuf(m, data, iova,
dlen + headroom, shinfo);
diff --git a/drivers/net/netvsc/hn_var.h b/drivers/net/netvsc/hn_var.h
index 574b909c82..d7124a7df9 100644
--- a/drivers/net/netvsc/hn_var.h
+++ b/drivers/net/netvsc/hn_var.h
@@ -85,7 +85,7 @@ struct hn_rx_queue {
void *event_buf;
struct hn_rx_bufinfo *rxbuf_info;
- rte_atomic32_t rxbuf_outstanding;
+ RTE_ATOMIC(uint32_t) rxbuf_outstanding;
};
@@ -167,8 +167,8 @@ struct hn_data {
uint32_t rndis_agg_pkts;
uint32_t rndis_agg_align;
- volatile uint32_t rndis_pending;
- rte_atomic32_t rndis_req_id;
+ RTE_ATOMIC(uint32_t) rndis_pending;
+ RTE_ATOMIC(uint32_t) rndis_req_id;
uint8_t rndis_resp[256];
uint32_t rss_hash;
--
2.53.0
* [PATCH v4 16/27] event/sw: convert from rte_atomic32 to stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (14 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
` (10 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Use stdatomic to keep track of inflights.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
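Note for readers: a simplified sketch of the credit accounting that the
inflights counter supports, in plain C11 <stdatomic.h>. struct pool,
struct port and the two helpers are hypothetical and much smaller than
the real sw_evdev structures. As in the driver, the limit check is a
relaxed read followed by a separate add, so under contention it is
advisory rather than exact.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct pool {				/* hypothetical device-wide state */
	_Atomic uint32_t inflights;	/* credits handed out to ports */
	uint32_t limit;			/* maximum events in flight */
	uint32_t quanta;		/* credits moved per transfer */
};

struct port {				/* hypothetical per-port state */
	uint32_t credits;		/* port-local, no atomics needed */
};

/* Move one quanta of credits from the pool to the port, if room remains. */
static bool port_take_quanta(struct pool *sw, struct port *p)
{
	uint32_t cur = atomic_load_explicit(&sw->inflights,
					    memory_order_relaxed);

	if (cur + sw->quanta > sw->limit)
		return false;

	atomic_fetch_add_explicit(&sw->inflights, sw->quanta,
				  memory_order_acquire);
	p->credits += sw->quanta;
	return true;
}

/* Hand one quanta back once the port holds at least two. */
static void port_return_quanta(struct pool *sw, struct port *p)
{
	if (p->credits >= sw->quanta * 2) {
		atomic_fetch_sub_explicit(&sw->inflights, sw->quanta,
					  memory_order_release);
		p->credits -= sw->quanta;
	}
}
```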
drivers/event/sw/sw_evdev.c | 8 +++++---
drivers/event/sw/sw_evdev.h | 4 ++--
drivers/event/sw/sw_evdev_worker.c | 16 +++++++++++-----
3 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index 3ad82e94ac..a2f760a98d 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -153,7 +153,9 @@ sw_port_setup(struct rte_eventdev *dev, uint8_t port_id,
* the sum to no leak credits
*/
int possible_inflights = p->inflight_credits + p->inflights;
- rte_atomic32_sub(&sw->inflights, possible_inflights);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ possible_inflights,
+ rte_memory_order_release);
}
*p = (struct sw_port){0}; /* zero entire structure */
@@ -512,7 +514,7 @@ sw_dev_configure(const struct rte_eventdev *dev)
sw->qid_count = conf->nb_event_queues;
sw->port_count = conf->nb_event_ports;
sw->nb_events_limit = conf->nb_events_limit;
- rte_atomic32_set(&sw->inflights, 0);
+ sw->inflights = 0;
/* Number of chunks sized for worst-case spread of events across IQs */
num_chunks = ((SW_INFLIGHT_EVENTS_TOTAL/SW_EVS_PER_Q_CHUNK)+1) +
@@ -633,7 +635,7 @@ sw_dump(struct rte_eventdev *dev, FILE *f)
fprintf(f, "\tsched cq/qid call: %"PRIu64"\n", sw->sched_cq_qid_called);
fprintf(f, "\tsched no IQ enq: %"PRIu64"\n", sw->sched_no_iq_enqueues);
fprintf(f, "\tsched no CQ enq: %"PRIu64"\n", sw->sched_no_cq_enqueues);
- uint32_t inflights = rte_atomic32_read(&sw->inflights);
+ uint32_t inflights = rte_atomic_load_explicit(&sw->inflights, rte_memory_order_relaxed);
uint32_t credits = sw->nb_events_limit - inflights;
fprintf(f, "\tinflight %d, credits: %d\n", inflights, credits);
diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
index c159be21be..5e49b08030 100644
--- a/drivers/event/sw/sw_evdev.h
+++ b/drivers/event/sw/sw_evdev.h
@@ -8,7 +8,7 @@
#include "sw_evdev_log.h"
#include <rte_eventdev.h>
#include <eventdev_pmd_vdev.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#define SW_DEFAULT_CREDIT_QUANTA 32
#define SW_DEFAULT_SCHED_QUANTA 128
@@ -233,7 +233,7 @@ struct sw_evdev {
/* Contains all ports - load balanced and directed */
alignas(RTE_CACHE_LINE_SIZE) struct sw_port ports[SW_PORTS_MAX];
- alignas(RTE_CACHE_LINE_SIZE) rte_atomic32_t inflights;
+ alignas(RTE_CACHE_LINE_SIZE) RTE_ATOMIC(uint32_t) inflights;
/*
* max events in this instance. Cached here for performance.
diff --git a/drivers/event/sw/sw_evdev_worker.c b/drivers/event/sw/sw_evdev_worker.c
index 4215726513..0755def367 100644
--- a/drivers/event/sw/sw_evdev_worker.c
+++ b/drivers/event/sw/sw_evdev_worker.c
@@ -56,7 +56,7 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
uint8_t new_ops[PORT_ENQUEUE_MAX_BURST_SIZE];
struct sw_port *p = port;
struct sw_evdev *sw = (void *)p->sw;
- uint32_t sw_inflights = rte_atomic32_read(&sw->inflights);
+ uint32_t sw_inflights = rte_atomic_load_explicit(&sw->inflights, rte_memory_order_relaxed);
uint32_t credit_update_quanta = sw->credit_update_quanta;
int new = 0;
@@ -74,8 +74,10 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
if (sw_inflights + credit_update_quanta > sw->nb_events_limit)
return 0;
- rte_atomic32_add(&sw->inflights, credit_update_quanta);
- p->inflight_credits += (credit_update_quanta);
+ rte_atomic_fetch_add_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_acquire);
+ p->inflight_credits += credit_update_quanta;
/* If there are fewer inflight credits than new events, limit
* the number of enqueued events.
@@ -124,7 +126,9 @@ sw_event_enqueue_burst(void *port, const struct rte_event ev[], uint16_t num)
/* Replenish credits if enough releases are performed */
if (p->inflight_credits >= credit_update_quanta * 2) {
- rte_atomic32_sub(&sw->inflights, credit_update_quanta);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_release);
p->inflight_credits -= credit_update_quanta;
}
@@ -150,7 +154,9 @@ sw_event_dequeue_burst(void *port, struct rte_event *ev, uint16_t num,
/* Replenish credits if enough releases are performed */
if (p->inflight_credits >= credit_update_quanta * 2) {
- rte_atomic32_sub(&sw->inflights, credit_update_quanta);
+ rte_atomic_fetch_sub_explicit(&sw->inflights,
+ credit_update_quanta,
+ rte_memory_order_release);
p->inflight_credits -= credit_update_quanta;
}
}
--
2.53.0
* [PATCH v4 17/27] bus/vmbus: convert from rte_atomic to stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (15 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 18/27] common/dpaax: use stdatomic instead of rte_atomic Stephen Hemminger
` (9 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Long Li, Wei Hu
Replace deprecated rte_atomic32 operations in the vmbus ring buffer
producer with stdatomic equivalents, and replace the smp_wmb + CAS-spin
publish with rte_wait_until_equal_32 + release-store.
The two-cursor design is preserved: tbr->windex is the driver-private
reservation cursor that lets producers reserve slots concurrently
without a lock; vbr->windex is the host-visible commit cursor, updated
in reservation order so the host never observes windex pointing past
unwritten data. This is the lockless analogue of the spinlock-around-
single-cursor pattern used by the Linux (drivers/hv/ring_buffer.c
hv_ringbuffer_write) and FreeBSD (sys/dev/hyperv/vmbus/vmbus_br.c
vmbus_txbr_write) implementations of the same host contract.
The memory ordering mirrors __rte_ring_headtail_move_head and
__rte_ring_update_tail in lib/ring/rte_ring_c11_pvt.h: relaxed wait
for the previous producer's commit, release-store to publish. The
rte_smp_wmb before the publish is folded into the release ordering
on the store itself.
The host-shared vbr->windex remains volatile uint32_t in the packed
bufring struct; the atomic qualifier is added via cast at the access
site. The (uintptr_t) launder on the store-side cast suppresses a
spurious misaligned-atomic warning from the packed-struct attribute
(windex is 4-byte aligned in practice, at offset 0 of a page-aligned
struct).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
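Note for readers: a stripped-down sketch of the two-cursor scheme
described above (reserve with a weak CAS loop, copy, wait for earlier
producers, then publish with a release store). It uses plain C11
<stdatomic.h>; struct ring, ring_avail and ring_write are hypothetical
and omit the host signalling and the saved-index trailer of the real
vmbus ring. The busy-wait stands in for rte_wait_until_equal_32.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64u	/* hypothetical data-area size in bytes */

struct ring {
	_Atomic uint32_t reserve;	/* producer-private reservation cursor */
	_Atomic uint32_t commit;	/* consumer-visible commit cursor */
	uint32_t rindex;		/* consumer read index */
	uint8_t data[RING_SIZE];
};

/* Bytes a producer could write starting at windex. */
static uint32_t ring_avail(const struct ring *r, uint32_t windex)
{
	return windex >= r->rindex ? RING_SIZE - (windex - r->rindex)
				   : r->rindex - windex;
}

static bool ring_write(struct ring *r, const void *buf, uint32_t len)
{
	uint32_t old, next, first;

	old = atomic_load_explicit(&r->reserve, memory_order_relaxed);
	do {
		/* keep one byte free so a full ring is not mistaken for empty */
		if (ring_avail(r, old) <= len)
			return false;
		next = (old + len) % RING_SIZE;
		/* on failure the CAS reloads old and the loop recomputes */
	} while (!atomic_compare_exchange_weak_explicit(&r->reserve, &old,
			next, memory_order_acquire, memory_order_relaxed));

	/* [old, next) is ours: copy, wrapping at the end of the buffer */
	first = RING_SIZE - old;
	if (first > len)
		first = len;
	memcpy(&r->data[old], buf, first);
	memcpy(&r->data[0], (const uint8_t *)buf + first, len - first);

	/* wait until every earlier reservation has been committed */
	while (atomic_load_explicit(&r->commit, memory_order_relaxed) != old)
		;

	/* publish: the release orders the copies before the new index */
	atomic_store_explicit(&r->commit, next, memory_order_release);
	return true;
}
```

Committing in reservation order is what keeps the consumer from ever
seeing a commit cursor that points past unwritten data.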
drivers/bus/vmbus/private.h | 2 +-
drivers/bus/vmbus/vmbus_bufring.c | 39 +++++++++++++++++--------------
2 files changed, 23 insertions(+), 18 deletions(-)
diff --git a/drivers/bus/vmbus/private.h b/drivers/bus/vmbus/private.h
index 8ac6119ef2..42c4e81ac0 100644
--- a/drivers/bus/vmbus/private.h
+++ b/drivers/bus/vmbus/private.h
@@ -41,7 +41,7 @@ extern int vmbus_logtype_bus;
struct vmbus_br {
struct vmbus_bufring *vbr;
uint32_t dsize;
- uint32_t windex; /* next available location */
+ RTE_ATOMIC(uint32_t) windex; /* next available location */
};
#define UIO_NAME_MAX 64
diff --git a/drivers/bus/vmbus/vmbus_bufring.c b/drivers/bus/vmbus/vmbus_bufring.c
index fcb97287dc..624fe8b6c5 100644
--- a/drivers/bus/vmbus/vmbus_bufring.c
+++ b/drivers/bus/vmbus/vmbus_bufring.c
@@ -15,7 +15,7 @@
#include <rte_tailq.h>
#include <rte_log.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_memory.h>
#include <rte_pause.h>
#include <rte_bus_vmbus.h>
@@ -114,6 +114,7 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
uint32_t ring_size = tbr->dsize;
uint32_t old_windex, next_windex, windex, total;
uint64_t save_windex;
+ bool success;
int i;
total = 0;
@@ -121,17 +122,13 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
total += iov[i].iov_len;
total += sizeof(save_windex);
+ /* Get current free location */
+ old_windex = rte_atomic_load_explicit(&tbr->windex,
+ rte_memory_order_relaxed);
+
/* Reserve space in ring */
do {
- uint32_t avail;
-
- /* Get current free location */
- old_windex = tbr->windex;
-
- /* Prevent compiler reordering this with calculation */
- rte_compiler_barrier();
-
- avail = vmbus_br_availwrite(tbr, old_windex);
+ uint32_t avail = vmbus_br_availwrite(tbr, old_windex);
/* If not enough space in ring, then tell caller. */
if (avail <= total)
@@ -139,8 +136,13 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
next_windex = vmbus_br_idxinc(old_windex, total, ring_size);
- /* Atomic update of next write_index for other threads */
- } while (!rte_atomic32_cmpset(&tbr->windex, old_windex, next_windex));
+ /* Atomic update of next write_index for other threads.
+ * Weak CAS is fine: a failure just recomputes and retries.
+ */
+ success = rte_atomic_compare_exchange_weak_explicit(
+ &tbr->windex, &old_windex, next_windex,
+ rte_memory_order_acquire, rte_memory_order_relaxed);
+ } while (unlikely(!success));
/* Space from old..new is now reserved */
windex = old_windex;
@@ -157,12 +159,15 @@ vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
/* The region reserved should match region used */
RTE_ASSERT(windex == next_windex);
- /* Ensure that data is available before updating host index */
- rte_smp_wmb();
+ /* Wait for previous producer to publish their windex update */
+ rte_wait_until_equal_32(&vbr->windex, old_windex, rte_memory_order_relaxed);
- /* Checkin for our reservation. wait for our turn to update host */
- while (!rte_atomic32_cmpset(&vbr->windex, old_windex, next_windex))
- rte_pause();
+ /* Publish our windex update; prior data writes ordered via release.
+ * windex is 4-byte aligned in practice (struct is page-aligned, windex
+ * at offset 0); cast launders the packed-struct alignment-1 attribute.
+ */
+ rte_atomic_store_explicit((volatile __rte_atomic uint32_t *)(uintptr_t)&vbr->windex,
+ next_windex, rte_memory_order_release);
/* If host had read all data before this, then need to signal */
*need_sig |= vmbus_txbr_need_signal(vbr, old_windex);
--
2.53.0
* [PATCH v4 18/27] common/dpaax: use stdatomic instead of rte_atomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (16 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
` (8 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
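The driver debug code uses local atomic_*() wrappers built on
rte_atomic32; reimplement them on top of the DPDK
rte_atomic_*_explicit() wrappers for C11 stdatomic.
The atomic_inc_return, atomic_dec_return and atomic_sub_and_test
macros are removed.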
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
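Note for readers: a short sketch of how the *_and_test wrappers behave
once the counter is an unsigned 32-bit atomic, using plain C11
<stdatomic.h>. counter_t and the two helpers are hypothetical stand-ins
for atomic_t and the macros. The fetch operations return the old value,
so "new value is zero" becomes a test on the old value; the patch
writes the increment case as == -1, which on platforms where uint32_t
is unsigned int converts to the UINT32_MAX used here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef _Atomic uint32_t counter_t;	/* hypothetical stand-in for atomic_t */

/* Increment; true when the new value is 0 (old value was UINT32_MAX). */
static bool counter_inc_and_test(counter_t *v)
{
	return atomic_fetch_add_explicit(v, 1, memory_order_seq_cst)
		== UINT32_MAX;
}

/* Decrement; true when the new value is 0 (old value was 1). */
static bool counter_dec_and_test(counter_t *v)
{
	return atomic_fetch_sub_explicit(v, 1, memory_order_seq_cst) == 1;
}
```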
drivers/common/dpaax/compat.h | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/common/dpaax/compat.h b/drivers/common/dpaax/compat.h
index d0635255da..793616e095 100644
--- a/drivers/common/dpaax/compat.h
+++ b/drivers/common/dpaax/compat.h
@@ -365,19 +365,14 @@ static inline unsigned long get_zeroed_page(gfp_t __foo __rte_unused)
#define spin_lock_irqsave(x, f) spin_lock_irq(x)
#define spin_unlock_irqrestore(x, f) spin_unlock_irq(x)
-#define atomic_t rte_atomic32_t
-#define atomic_read(v) rte_atomic32_read(v)
-#define atomic_set(v, i) rte_atomic32_set(v, i)
-
-#define atomic_inc(v) rte_atomic32_add(v, 1)
-#define atomic_dec(v) rte_atomic32_sub(v, 1)
-
-#define atomic_inc_and_test(v) rte_atomic32_inc_and_test(v)
-#define atomic_dec_and_test(v) rte_atomic32_dec_and_test(v)
-
-#define atomic_inc_return(v) rte_atomic32_add_return(v, 1)
-#define atomic_dec_return(v) rte_atomic32_sub_return(v, 1)
-#define atomic_sub_and_test(i, v) (rte_atomic32_sub_return(v, i) == 0)
+typedef RTE_ATOMIC(uint32_t) atomic_t;
+
+#define atomic_read(v) rte_atomic_load_explicit((v), rte_memory_order_relaxed)
+#define atomic_set(v, i) rte_atomic_store_explicit((v), (i), rte_memory_order_relaxed)
+#define atomic_inc(v) ((void)rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_dec(v) ((void)rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_inc_and_test(v) (rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst) == -1)
+#define atomic_dec_and_test(v) (rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst) == 1)
/* Interface name len*/
#define IF_NAME_MAX_LEN 16
--
2.53.0
* [PATCH v4 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (17 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 18/27] common/dpaax: use stdatomic instead of rte_atomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
` (7 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Julien Aube
Replace the legacy rte_atomic32_* API on sc->scan_fp with the
equivalent rte_atomic_*_explicit C11 helpers, ahead of the
deprecation of rte_atomicNN_t and its associated wrappers.
All accesses use rte_memory_order_seq_cst, which is at least as
strong as the ordering of the legacy API. No functional change.
The scan_fp field is a notification flag between the slow-path
command poster (bnx2x_sp_post) and the fastpath task that reaps
ramrod completions (bnx2x_handle_fp_tq), also cleared from
ecore_state_wait on success, panic, and timeout.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/net/bnx2x/bnx2x.c | 6 +++---
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/ecore_sp.c | 6 +++---
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index 8790c858d5..027a0a50d5 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -1098,7 +1098,7 @@ bnx2x_sp_post(struct bnx2x_softc *sc, int command, int cid, uint32_t data_hi,
* Ask bnx2x_intr_intr() to process RAMROD
* completion whenever it gets scheduled.
*/
- rte_atomic32_set(&sc->scan_fp, 1);
+ rte_atomic_store_explicit(&sc->scan_fp, 1, rte_memory_order_seq_cst);
bnx2x_sp_prod_update(sc);
return 0;
@@ -4575,7 +4575,7 @@ static void bnx2x_handle_fp_tq(struct bnx2x_fastpath *fp)
/* update the fastpath index */
bnx2x_update_fp_sb_idx(fp);
- if (rte_atomic32_read(&sc->scan_fp) == 1) {
+ if (rte_atomic_load_explicit(&sc->scan_fp, rte_memory_order_seq_cst)) {
if (bnx2x_has_rx_work(fp)) {
more_rx = bnx2x_rxeof(sc, fp);
}
@@ -4586,7 +4586,7 @@ static void bnx2x_handle_fp_tq(struct bnx2x_fastpath *fp)
return;
}
/* We have completed slow path completion, clear the flag */
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
}
bnx2x_ack_sb(sc, fp->igu_sb_id, USTORM_ID,
diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 35206b4758..c5de4b71aa 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -1043,7 +1043,7 @@ struct bnx2x_softc {
#define PERIODIC_STOP 0
#define PERIODIC_GO 1
volatile unsigned long periodic_flags;
- rte_atomic32_t scan_fp;
+ RTE_ATOMIC(uint32_t) scan_fp;
struct bnx2x_fastpath fp[MAX_RSS_CHAINS];
struct bnx2x_sp_objs sp_objs[MAX_RSS_CHAINS];
diff --git a/drivers/net/bnx2x/ecore_sp.c b/drivers/net/bnx2x/ecore_sp.c
index c6c3857778..33a40dea6e 100644
--- a/drivers/net/bnx2x/ecore_sp.c
+++ b/drivers/net/bnx2x/ecore_sp.c
@@ -299,21 +299,21 @@ static int ecore_state_wait(struct bnx2x_softc *sc, int state,
#ifdef ECORE_STOP_ON_ERROR
ECORE_MSG(sc, "exit (cnt %d)", 5000 - cnt);
#endif
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
return ECORE_SUCCESS;
}
ECORE_WAIT(sc, delay_us);
if (sc->panic) {
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
return ECORE_IO;
}
}
/* timeout! */
PMD_DRV_LOG(ERR, sc, "timeout waiting for state %d", state);
- rte_atomic32_set(&sc->scan_fp, 0);
+ rte_atomic_store_explicit(&sc->scan_fp, 0, rte_memory_order_seq_cst);
#ifdef ECORE_STOP_ON_ERROR
ecore_panic();
#endif
--
2.53.0
* [PATCH v4 20/27] bus/fslmc: replace rte_atomic32 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (18 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
` (6 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena
The Linux-style atomic wrappers here convert easily to stdatomic.
Drop the unused macros (atomic_inc_return, atomic_dec_return and
atomic_sub_and_test).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
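The test-style wrappers depend on fetch-add and fetch-sub returning the value from before the operation. A minimal sketch of that reasoning, assuming plain C11 <stdatomic.h> in place of DPDK's rte_atomic_*_explicit() wrappers; compat_atomic_t and the compat_* helpers are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's atomic_t (C11 _Atomic, not RTE_ATOMIC). */
typedef _Atomic uint32_t compat_atomic_t;

/* fetch_add returns the value held before the add, so the counter lands on
 * zero exactly when that prior value was UINT32_MAX (-1 seen as unsigned). */
static bool compat_inc_and_test(compat_atomic_t *v)
{
	return atomic_fetch_add_explicit(v, 1, memory_order_seq_cst) == UINT32_MAX;
}

/* fetch_sub also returns the prior value; the counter lands on zero
 * exactly when that prior value was 1. */
static bool compat_dec_and_test(compat_atomic_t *v)
{
	return atomic_fetch_sub_explicit(v, 1, memory_order_seq_cst) == 1;
}

/* Walk a counter through both zero crossings and count the hits. */
static int compat_demo(void)
{
	compat_atomic_t v = 2;
	int hits = 0;

	hits += compat_dec_and_test(&v);	/* 2 -> 1: not zero */
	hits += compat_dec_and_test(&v);	/* 1 -> 0: zero */
	atomic_store_explicit(&v, UINT32_MAX, memory_order_relaxed);
	hits += compat_inc_and_test(&v);	/* wraps to 0: zero */
	assert(atomic_load_explicit(&v, memory_order_relaxed) == 0);
	return hits;
}
```

With a uint32_t counter, a comparison against -1 still works because -1 is converted to UINT32_MAX before the comparison; spelling it as UINT32_MAX in the sketch only makes that explicit.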
drivers/bus/fslmc/qbman/include/compat.h | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/bus/fslmc/qbman/include/compat.h b/drivers/bus/fslmc/qbman/include/compat.h
index 5a57bd8ed1..9c87f0b639 100644
--- a/drivers/bus/fslmc/qbman/include/compat.h
+++ b/drivers/bus/fslmc/qbman/include/compat.h
@@ -81,18 +81,13 @@ do { \
#define dma_wmb() rte_io_wmb()
-#define atomic_t rte_atomic32_t
-#define atomic_read(v) rte_atomic32_read(v)
-#define atomic_set(v, i) rte_atomic32_set(v, i)
-
-#define atomic_inc(v) rte_atomic32_add(v, 1)
-#define atomic_dec(v) rte_atomic32_sub(v, 1)
-
-#define atomic_inc_and_test(v) rte_atomic32_inc_and_test(v)
-#define atomic_dec_and_test(v) rte_atomic32_dec_and_test(v)
-
-#define atomic_inc_return(v) rte_atomic32_add_return(v, 1)
-#define atomic_dec_return(v) rte_atomic32_sub_return(v, 1)
-#define atomic_sub_and_test(i, v) (rte_atomic32_sub_return(v, i) == 0)
+typedef RTE_ATOMIC(uint32_t) atomic_t;
+
+#define atomic_read(v) rte_atomic_load_explicit((v), rte_memory_order_relaxed)
+#define atomic_set(v, i) rte_atomic_store_explicit((v), (i), rte_memory_order_relaxed)
+#define atomic_inc(v) ((void)rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_dec(v) ((void)rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst))
+#define atomic_inc_and_test(v) (rte_atomic_fetch_add_explicit((v), 1, rte_memory_order_seq_cst) == -1)
+#define atomic_dec_and_test(v) (rte_atomic_fetch_sub_explicit((v), 1, rte_memory_order_seq_cst) == 1)
#endif /* HEADER_COMPAT_H */
--
2.53.0
* [PATCH v4 21/27] drivers/event: replace rte_atomic32 in selftests
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (19 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
` (5 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Hemant Agrawal, Sachin Saxena, Jerin Jacob
Convert the last callers of the rte_atomicNN_*() family in these
selftests; the family is being deprecated.
Convert total_events from rte_atomic32_t to RTE_ATOMIC(uint32_t)
for the stack-local instance and __rte_atomic uint32_t * for the
pointer in test_core_param. Switch reads and updates to
rte_atomic_*_explicit().
Reads in the busy-loop checks and progress logs use relaxed: the
counter is purely a "drained yet?" signal and no data is published
through it. The fetch_sub on the dequeue path uses release in
octeontx (preserving the publish-after-mbuf-free ordering already
implied by the seq_cst sub it replaces) and relaxed in dpaa2.
The stack-local atomic_total_events is initialized by direct
assignment instead of rte_atomic32_set(), since it is written
before any worker is launched.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
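The counter pattern described above can be sketched as follows. This is an illustration only, assuming C11 <stdatomic.h> in place of the rte_atomic_*_explicit() wrappers and running the worker inline rather than on a separate lcore; struct worker_param, worker_drain() and drain_demo() are hypothetical names:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical analogue of test_core_param: workers share one counter. */
struct worker_param {
	_Atomic uint32_t *total_events;
};

/* Worker loop: relaxed ordering is enough because the counter only
 * answers "drained yet?" and publishes no other data. */
static uint32_t worker_drain(struct worker_param *param)
{
	_Atomic uint32_t *total_events = param->total_events;
	uint32_t handled = 0;

	while (atomic_load_explicit(total_events, memory_order_relaxed) > 0) {
		/* A real worker would dequeue, validate and free one event here. */
		atomic_fetch_sub_explicit(total_events, 1, memory_order_relaxed);
		handled++;
	}
	return handled;
}

static uint32_t drain_demo(uint32_t total)
{
	_Atomic uint32_t counter;
	struct worker_param param = { .total_events = &counter };

	/* Written before any worker runs. In C11, plain assignment to an
	 * _Atomic object is itself an atomic (seq_cst) store. */
	counter = total;

	uint32_t handled = worker_drain(&param);
	assert(atomic_load_explicit(&counter, memory_order_relaxed) == 0);
	return handled;
}
```

Calling drain_demo(5) runs the loop five times and leaves the counter at zero; a multi-threaded version would launch worker_drain() on each worker instead of calling it inline.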
drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 26 ++++----
drivers/event/octeontx/ssovf_evdev_selftest.c | 61 ++++++++++---------
2 files changed, 47 insertions(+), 40 deletions(-)
diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
index 9d4938efe6..2c688bd194 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
@@ -2,7 +2,7 @@
* Copyright 2018-2019 NXP
*/
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_debug.h>
@@ -49,7 +49,7 @@ struct event_attr {
};
struct test_core_param {
- rte_atomic32_t *total_events;
+ __rte_atomic uint32_t *total_events;
uint64_t dequeue_tmo_ticks;
uint8_t port;
uint8_t sched_type;
@@ -444,10 +444,10 @@ worker_multi_port_fn(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
int ret;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
@@ -455,13 +455,15 @@ worker_multi_port_fn(void *arg)
ret = validate_event(&ev);
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to validate event");
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_relaxed);
}
return 0;
}
static int
-wait_workers_to_join(int lcore, const rte_atomic32_t *count)
+wait_workers_to_join(int lcore, const __rte_atomic uint32_t *count)
{
uint64_t cycles, print_cycles;
@@ -472,15 +474,15 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
uint64_t new_cycles = rte_get_timer_cycles();
if (new_cycles - print_cycles > rte_get_timer_hz()) {
- dpaa2_evdev_dbg("\r%s: events %d", __func__,
- rte_atomic32_read(count));
+ dpaa2_evdev_dbg("\r%s: events %u", __func__,
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
print_cycles = new_cycles;
}
if (new_cycles - cycles > rte_get_timer_hz() * 10) {
dpaa2_evdev_info(
- "%s: No schedules for seconds, deadlock (%d)",
+ "%s: No schedules for seconds, deadlock (%u)",
__func__,
- rte_atomic32_read(count));
+ rte_atomic_load_explicit(count, rte_memory_order_relaxed));
rte_event_dev_dump(evdev, stdout);
cycles = new_cycles;
return -1;
@@ -500,13 +502,13 @@ launch_workers_and_wait(int (*main_worker)(void *),
int w_lcore;
int ret;
struct test_core_param *param;
- rte_atomic32_t atomic_total_events;
+ RTE_ATOMIC(uint32_t) atomic_total_events;
uint64_t dequeue_tmo_ticks;
if (!nb_workers)
return 0;
- rte_atomic32_set(&atomic_total_events, total_events);
+ atomic_total_events = total_events;
RTE_BUILD_BUG_ON(NUM_PACKETS < MAX_EVENTS);
param = malloc(sizeof(struct test_core_param) * nb_workers);
diff --git a/drivers/event/octeontx/ssovf_evdev_selftest.c b/drivers/event/octeontx/ssovf_evdev_selftest.c
index b54ae126d2..5eeed2b2ce 100644
--- a/drivers/event/octeontx/ssovf_evdev_selftest.c
+++ b/drivers/event/octeontx/ssovf_evdev_selftest.c
@@ -4,7 +4,7 @@
#include <stdlib.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_debug.h>
@@ -84,7 +84,7 @@ seqn_list_check(int limit)
}
struct test_core_param {
- rte_atomic32_t *total_events;
+ __rte_atomic uint32_t *total_events;
uint64_t dequeue_tmo_ticks;
uint8_t port;
uint8_t sched_type;
@@ -558,10 +558,10 @@ worker_multi_port_fn(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
int ret;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
@@ -569,13 +569,14 @@ worker_multi_port_fn(void *arg)
ret = validate_event(&ev);
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to validate event");
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+
+ rte_atomic_fetch_sub_explicit(total_events, 1, rte_memory_order_release);
}
return 0;
}
static inline int
-wait_workers_to_join(int lcore, const rte_atomic32_t *count)
+wait_workers_to_join(int lcore, const __rte_atomic uint32_t *count)
{
uint64_t cycles, print_cycles;
RTE_SET_USED(count);
@@ -583,17 +584,16 @@ wait_workers_to_join(int lcore, const rte_atomic32_t *count)
print_cycles = cycles = rte_get_timer_cycles();
while (rte_eal_get_lcore_state(lcore) != WAIT) {
uint64_t new_cycles = rte_get_timer_cycles();
+ uint32_t cur_count = rte_atomic_load_explicit(count, rte_memory_order_relaxed);
if (new_cycles - print_cycles > rte_get_timer_hz()) {
- ssovf_log_dbg("\r%s: events %d", __func__,
- rte_atomic32_read(count));
+ ssovf_log_dbg("\r%s: events %u", __func__, cur_count);
print_cycles = new_cycles;
}
if (new_cycles - cycles > rte_get_timer_hz() * 10) {
ssovf_log_dbg(
- "%s: No schedules for seconds, deadlock (%d)",
- __func__,
- rte_atomic32_read(count));
+ "%s: No schedules for seconds, deadlock (%u)",
+ __func__, cur_count);
rte_event_dev_dump(evdev, stdout);
cycles = new_cycles;
return -1;
@@ -613,13 +613,13 @@ launch_workers_and_wait(int (*main_worker)(void *),
int w_lcore;
int ret;
struct test_core_param *param;
- rte_atomic32_t atomic_total_events;
+ RTE_ATOMIC(uint32_t) atomic_total_events;
uint64_t dequeue_tmo_ticks;
if (!nb_workers)
return 0;
- rte_atomic32_set(&atomic_total_events, total_events);
+ atomic_total_events = total_events;
seqn_list_init();
param = malloc(sizeof(struct test_core_param) * nb_workers);
@@ -889,10 +889,10 @@ worker_flow_based_pipeline(void *arg)
uint16_t valid_event;
uint8_t port = param->port;
uint8_t new_sched_type = param->sched_type;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
uint64_t dequeue_tmo_ticks = param->dequeue_tmo_ticks;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1,
dequeue_tmo_ticks);
if (!valid_event)
@@ -910,7 +910,8 @@ worker_flow_based_pipeline(void *arg)
} else if (ev.sub_event_type == 1) { /* Events from stage 1*/
if (seqn_list_update(*rte_event_pmd_selftest_seqn(ev.mbuf)) == 0) {
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ssovf_log_dbg("Failed to update seqn_list");
return -1;
@@ -1044,10 +1045,10 @@ worker_group_based_pipeline(void *arg)
uint16_t valid_event;
uint8_t port = param->port;
uint8_t new_sched_type = param->sched_type;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
uint64_t dequeue_tmo_ticks = param->dequeue_tmo_ticks;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1,
dequeue_tmo_ticks);
if (!valid_event)
@@ -1065,7 +1066,8 @@ worker_group_based_pipeline(void *arg)
} else if (ev.queue_id == 1) { /* Events from stage 1(group 1)*/
if (seqn_list_update(*rte_event_pmd_selftest_seqn(ev.mbuf)) == 0) {
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ssovf_log_dbg("Failed to update seqn_list");
return -1;
@@ -1203,16 +1205,17 @@ worker_flow_based_pipeline_max_stages_rand_sched_type(void *arg)
struct rte_event ev;
uint16_t valid_event;
uint8_t port = param->port;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.sub_event_type == 255) { /* last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.sub_event_type++;
@@ -1278,16 +1281,17 @@ worker_queue_based_pipeline_max_stages_rand_sched_type(void *arg)
RTE_EVENT_DEV_ATTR_QUEUE_COUNT,
&queue_count), "Queue count get failed");
uint8_t nr_queues = queue_count;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.queue_id == nr_queues - 1) { /* last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.queue_id++;
@@ -1320,16 +1324,17 @@ worker_mixed_pipeline_max_stages_rand_sched_type(void *arg)
RTE_EVENT_DEV_ATTR_QUEUE_COUNT,
&queue_count), "Queue count get failed");
uint8_t nr_queues = queue_count;
- rte_atomic32_t *total_events = param->total_events;
+ __rte_atomic uint32_t *total_events = param->total_events;
- while (rte_atomic32_read(total_events) > 0) {
+ while (rte_atomic_load_explicit(total_events, rte_memory_order_relaxed) > 0) {
valid_event = rte_event_dequeue_burst(evdev, port, &ev, 1, 0);
if (!valid_event)
continue;
if (ev.queue_id == nr_queues - 1) { /* Last stage */
rte_pktmbuf_free(ev.mbuf);
- rte_atomic32_sub(total_events, 1);
+ rte_atomic_fetch_sub_explicit(total_events, 1,
+ rte_memory_order_release);
} else {
ev.event_type = RTE_EVENT_TYPE_CPU;
ev.queue_id++;
--
2.53.0
* [PATCH v4 22/27] net/hinic: replace rte_atomic32 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (20 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 23/27] net/txgbe: " Stephen Hemminger
` (4 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Xiaoyun Wang
Convert dma_pool::inuse and hinic_os_dep::dma_alloc_cnt to
RTE_ATOMIC(uint32_t) and replace rte_atomic32_*() with the
rte_atomic_*_explicit() equivalents. The matching local variable
and log format change from int/%d to uint32_t/%u.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
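One detail worth checking in conversions like this: rte_atomic32_add_return() returns the value after the addition, whereas a fetch-add returns the value from before it. A minimal sketch of the difference, assuming C11 <stdatomic.h> in place of the DPDK wrappers; add_return_relaxed() and add_return_demo() are hypothetical helpers:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Emulate "add and return the new value" on top of fetch_add, which
 * returns the old value: add the increment back to the result. */
static uint32_t add_return_relaxed(_Atomic uint32_t *v, uint32_t inc)
{
	return atomic_fetch_add_explicit(v, inc, memory_order_relaxed) + inc;
}

static uint32_t add_return_demo(void)
{
	_Atomic uint32_t cnt = 0;

	/* Plain fetch_add hands back the value before the add. */
	uint32_t before = atomic_fetch_add_explicit(&cnt, 1, memory_order_relaxed);
	assert(before == 0);

	/* The helper hands back the value after the add. */
	return add_return_relaxed(&cnt, 1);
}
```

When the result only needs to be unique, as with a name suffix, either form works; the two differ by exactly the increment.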
drivers/net/hinic/base/hinic_compat.h | 2 +-
drivers/net/hinic/base/hinic_pmd_hwdev.c | 24 ++++++++++++++----------
drivers/net/hinic/base/hinic_pmd_hwdev.h | 4 ++--
3 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/drivers/net/hinic/base/hinic_compat.h b/drivers/net/hinic/base/hinic_compat.h
index 707a3b92b9..c53b88b96d 100644
--- a/drivers/net/hinic/base/hinic_compat.h
+++ b/drivers/net/hinic/base/hinic_compat.h
@@ -15,7 +15,7 @@
#include <rte_memzone.h>
#include <rte_memcpy.h>
#include <rte_malloc.h>
-#include <rte_atomic.h>
+#include <rte_stdatomic.h>
#include <rte_spinlock.h>
#include <rte_cycles.h>
#include <rte_log.h>
diff --git a/drivers/net/hinic/base/hinic_pmd_hwdev.c b/drivers/net/hinic/base/hinic_pmd_hwdev.c
index 818698dcb3..9a1b126632 100644
--- a/drivers/net/hinic/base/hinic_pmd_hwdev.c
+++ b/drivers/net/hinic/base/hinic_pmd_hwdev.c
@@ -116,7 +116,8 @@ static void *hinic_dma_mem_zalloc(struct hinic_hwdev *hwdev, size_t size,
dma_addr_t *dma_handle, unsigned int align,
unsigned int socket_id)
{
- int rc, alloc_cnt;
+ int rc;
+ uint32_t alloc_cnt;
const struct rte_memzone *mz;
char z_name[RTE_MEMZONE_NAMESIZE];
hash_sig_t sig;
@@ -125,8 +126,9 @@ static void *hinic_dma_mem_zalloc(struct hinic_hwdev *hwdev, size_t size,
if (dma_handle == NULL || 0 == size)
return NULL;
- alloc_cnt = rte_atomic32_add_return(&hwdev->os_dep.dma_alloc_cnt, 1);
- snprintf(z_name, sizeof(z_name), "%s_%d",
+ alloc_cnt = rte_atomic_fetch_add_explicit(&hwdev->os_dep.dma_alloc_cnt,
+ 1, rte_memory_order_relaxed) + 1;
+ snprintf(z_name, sizeof(z_name), "%s_%u",
hwdev->pcidev_hdl->name, alloc_cnt);
mz = rte_memzone_reserve_aligned(z_name, size, socket_id,
@@ -282,7 +284,6 @@ struct dma_pool *dma_pool_create(const char *name, void *dev,
if (!pool)
return NULL;
- rte_atomic32_set(&pool->inuse, 0);
pool->elem_size = size;
pool->align = align;
pool->boundary = boundary;
@@ -294,12 +295,15 @@ struct dma_pool *dma_pool_create(const char *name, void *dev,
void dma_pool_destroy(struct dma_pool *pool)
{
+ uint32_t inuse;
+
if (!pool)
return;
- if (rte_atomic32_read(&pool->inuse) != 0) {
- PMD_DRV_LOG(ERR, "Leak memory, dma_pool: %s, inuse_count: %d",
- pool->name, rte_atomic32_read(&pool->inuse));
+ inuse = rte_atomic_load_explicit(&pool->inuse, rte_memory_order_relaxed);
+ if (inuse != 0) {
+ PMD_DRV_LOG(ERR, "Leak memory, dma_pool: %s, inuse_count: %u",
+ pool->name, inuse);
}
rte_free(pool);
@@ -312,14 +316,14 @@ void *dma_pool_alloc(struct pci_pool *pool, dma_addr_t *dma_addr)
buf = hinic_dma_mem_zalloc(pool->hwdev, pool->elem_size, dma_addr,
(u32)pool->align, SOCKET_ID_ANY);
if (buf)
- rte_atomic32_inc(&pool->inuse);
+ rte_atomic_fetch_add_explicit(&pool->inuse, 1, rte_memory_order_relaxed);
return buf;
}
void dma_pool_free(struct pci_pool *pool, void *vaddr, dma_addr_t dma)
{
- rte_atomic32_dec(&pool->inuse);
+ rte_atomic_fetch_sub_explicit(&pool->inuse, 1, rte_memory_order_relaxed);
hinic_dma_mem_free(pool->hwdev, pool->elem_size, vaddr, dma);
}
@@ -329,7 +333,7 @@ int hinic_osdep_init(struct hinic_hwdev *hwdev)
struct rte_hash_parameters dh_params = { 0 };
struct rte_hash *paddr_hash = NULL;
- rte_atomic32_set(&hwdev->os_dep.dma_alloc_cnt, 0);
+ hwdev->os_dep.dma_alloc_cnt = 0;
rte_spinlock_init(&hwdev->os_dep.dma_hash_lock);
dh_params.name = hwdev->pcidev_hdl->name;
diff --git a/drivers/net/hinic/base/hinic_pmd_hwdev.h b/drivers/net/hinic/base/hinic_pmd_hwdev.h
index d6896b3f13..ad30ddd72e 100644
--- a/drivers/net/hinic/base/hinic_pmd_hwdev.h
+++ b/drivers/net/hinic/base/hinic_pmd_hwdev.h
@@ -18,7 +18,7 @@
/* dma pool */
struct dma_pool {
- rte_atomic32_t inuse;
+ RTE_ATOMIC(uint32_t) inuse;
size_t elem_size;
size_t align;
size_t boundary;
@@ -402,7 +402,7 @@ struct hinic_hilink_link_info {
/* dma os dependency implementation */
struct hinic_os_dep {
/* kernel dma alloc api */
- rte_atomic32_t dma_alloc_cnt;
+ RTE_ATOMIC(uint32_t) dma_alloc_cnt;
rte_spinlock_t dma_hash_lock;
struct rte_hash *dma_addr_hash;
};
--
2.53.0
* [PATCH v4 23/27] net/txgbe: replace rte_atomic32 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (21 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
` (3 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Jiawen Wu, Zaiyu Wang
The swfw_busy flag guarding the AML SW-FW mailbox is a one-bit lock,
so convert it to RTE_ATOMIC(bool) and replace the legacy
test-and-set / clear pair with explicit acquire-release:
  rte_atomic32_test_and_set -> rte_atomic_exchange_explicit(.., true, acquire)
  rte_atomic32_clear        -> rte_atomic_store_explicit(.., false, release)
Acquire on the take pairs with release on the drop, so accesses
inside the critical section are synchronized between successive
holders. Default zero-initialization of struct txgbe_hw still
gives swfw_busy = false, so no init site needs updating.
Note:
The previous rte_atomic32_test_and_set return value was
inverted relative to what this code expected; this patch
incidentally corrects that. A standalone Fixes: patch is
queued in net-next.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
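The one-bit lock described above can be sketched like this. It is an illustration under stated assumptions, not the driver code: C11 <stdatomic.h> replaces the rte_atomic_*_explicit() wrappers, and struct mbox, mbox_lock(), mbox_unlock() and mbox_demo() are hypothetical names:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mailbox guarded by a one-bit lock. */
struct mbox {
	_Atomic bool busy;
};

/* exchange returns the previous value: true means someone else holds
 * the lock, so keep trying until the attempt budget runs out. */
static int mbox_lock(struct mbox *m, unsigned int timeout)
{
	while (atomic_exchange_explicit(&m->busy, true, memory_order_acquire)) {
		if (timeout <= 1)
			return -1;	/* gave up, lock not taken */
		timeout--;
	}
	return 0;			/* previous value was false: lock is ours */
}

/* The release store pairs with the acquire exchange of the next holder. */
static void mbox_unlock(struct mbox *m)
{
	assert(atomic_load_explicit(&m->busy, memory_order_relaxed));
	atomic_store_explicit(&m->busy, false, memory_order_release);
}

static int mbox_demo(void)
{
	struct mbox m = { .busy = false };
	int first = mbox_lock(&m, 3);	/* free, so the lock is taken */
	int second = mbox_lock(&m, 3);	/* still held, so this times out */

	mbox_unlock(&m);
	int third = mbox_lock(&m, 3);	/* free again after the unlock */
	mbox_unlock(&m);
	return (first == 0) + (second == -1) + (third == 0);
}
```

The loop condition reads naturally with exchange: it spins while the old value says "busy", which is the sense the surrounding code expects.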
drivers/net/txgbe/base/txgbe_mng.c | 4 ++--
drivers/net/txgbe/base/txgbe_type.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/txgbe/base/txgbe_mng.c b/drivers/net/txgbe/base/txgbe_mng.c
index a1974820b6..c58e1d6589 100644
--- a/drivers/net/txgbe/base/txgbe_mng.c
+++ b/drivers/net/txgbe/base/txgbe_mng.c
@@ -185,7 +185,7 @@ txgbe_host_interface_command_aml(struct txgbe_hw *hw, u32 *buffer,
}
/* try to get lock */
- while (rte_atomic32_test_and_set(&hw->swfw_busy)) {
+ while (rte_atomic_exchange_explicit(&hw->swfw_busy, true, rte_memory_order_acquire)) {
timeout--;
if (!timeout)
return TXGBE_ERR_TIMEOUT;
@@ -266,7 +266,7 @@ txgbe_host_interface_command_aml(struct txgbe_hw *hw, u32 *buffer,
/* index++, index replace txgbe_hic_hdr.checksum */
hw->swfw_index = resp->index == TXGBE_HIC_HDR_INDEX_MAX ?
0 : resp->index + 1;
- rte_atomic32_clear(&hw->swfw_busy);
+ rte_atomic_store_explicit(&hw->swfw_busy, false, rte_memory_order_release);
return err;
}
diff --git a/drivers/net/txgbe/base/txgbe_type.h b/drivers/net/txgbe/base/txgbe_type.h
index ede780321f..d3c82d51a4 100644
--- a/drivers/net/txgbe/base/txgbe_type.h
+++ b/drivers/net/txgbe/base/txgbe_type.h
@@ -880,7 +880,7 @@ struct txgbe_hw {
rte_spinlock_t phy_lock;
/*amlite: new SW-FW mbox */
u8 swfw_index;
- rte_atomic32_t swfw_busy;
+ RTE_ATOMIC(bool) swfw_busy;
u32 fec_mode;
u32 cur_fec_link;
};
--
2.53.0
* [PATCH v4 24/27] net/vhost: use stdatomic instead of rte_atomic32
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (22 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 23/27] net/txgbe: " Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
` (2 subsequent siblings)
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Maxime Coquelin, Chenbo Xia
Convert allow_queuing, while_queuing, started, and dev_attached from
rte_atomic32_t to RTE_ATOMIC(uint32_t) and replace rte_atomic32_*()
with rte_atomic_*_explicit().
The data-path / control-thread handshake on allow_queuing and
while_queuing is a Dekker-style mutual-visibility pattern: each side
stores its own flag and then loads the peer's. Both legs must be
seq_cst to forbid store-load reordering; anything weaker permits both
sides to miss each other. The previous rte_atomic32_set/read compiled
to plain volatile stores/loads and provided no such ordering, so this
also closes a latent ordering hole: store-load reordering is permitted
on weakly-ordered ISAs and on x86 (TSO) alike.
The data-path exit store of while_queuing=0 is release, ordering
preceding slot accesses before the control thread observes the data
path as idle.
The flags started and dev_attached are consulted only inside
update_queuing_status, where the per-queue handshake provides the
real synchronization; their loads and stores are relaxed.
Factor the per-queue allow_queuing store and while_queuing wait into
a small update_queue() helper used by both rx and tx loops.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
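The handshake described above can be sketched as below. This is a single-threaded illustration of the call sequence only, assuming C11 <stdatomic.h> in place of the rte_atomic_*_explicit() wrappers and a plain spin loop in place of rte_wait_until_equal_32(); struct queue and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical analogue of the two per-queue flags. */
struct queue {
	_Atomic uint32_t allow_queuing;	/* written by the control thread */
	_Atomic uint32_t while_queuing;	/* written by the data path */
};

/* Data path: announce presence, then re-check permission. Both legs are
 * seq_cst so the store cannot be reordered after the load. */
static bool datapath_enter(struct queue *q)
{
	/* Cheap early exit; a missed transition is caught by the re-check. */
	if (atomic_load_explicit(&q->allow_queuing, memory_order_relaxed) == 0)
		return false;

	atomic_store_explicit(&q->while_queuing, 1, memory_order_seq_cst);
	if (atomic_load_explicit(&q->allow_queuing, memory_order_seq_cst) == 0) {
		atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
		return false;
	}
	return true;
}

/* Release orders the preceding queue accesses before the idle signal. */
static void datapath_exit(struct queue *q)
{
	atomic_store_explicit(&q->while_queuing, 0, memory_order_release);
}

/* Control thread: publish the decision, then optionally wait for idle. */
static void control_update(struct queue *q, uint32_t allow, bool wait_queuing)
{
	atomic_store_explicit(&q->allow_queuing, allow, memory_order_seq_cst);
	if (wait_queuing)
		while (atomic_load_explicit(&q->while_queuing, memory_order_seq_cst) != 0)
			;	/* spin until the data path leaves */
}

static int handshake_demo(void)
{
	struct queue q = { 0 };
	int ok = 0;

	ok += !datapath_enter(&q);	/* queuing not yet allowed */
	control_update(&q, 1, false);
	ok += datapath_enter(&q);	/* allowed: presence flag is raised */
	assert(atomic_load_explicit(&q.while_queuing, memory_order_relaxed) == 1);
	datapath_exit(&q);
	control_update(&q, 0, true);	/* returns at once: data path is idle */
	ok += !datapath_enter(&q);	/* refused after the stop */
	return ok;
}
```

With two real threads, the seq_cst pair guarantees that at least one side observes the other's store: either the data path sees allow_queuing == 0 and backs out, or the control thread sees while_queuing == 1 and waits.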
drivers/net/vhost/rte_eth_vhost.c | 103 +++++++++++++++++++-----------
1 file changed, 65 insertions(+), 38 deletions(-)
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 05940f2461..3b1eedfe42 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -73,8 +73,8 @@ struct vhost_stats {
struct vhost_queue {
int vid;
- rte_atomic32_t allow_queuing;
- rte_atomic32_t while_queuing;
+ RTE_ATOMIC(uint32_t) allow_queuing;
+ RTE_ATOMIC(uint32_t) while_queuing;
struct pmd_internal *internal;
struct rte_mempool *mb_pool;
uint16_t port;
@@ -86,14 +86,14 @@ struct vhost_queue {
};
struct pmd_internal {
- rte_atomic32_t dev_attached;
+ RTE_ATOMIC(uint32_t) dev_attached;
char *iface_name;
uint64_t flags;
uint64_t disable_flags;
uint64_t features;
uint16_t max_queues;
int vid;
- rte_atomic32_t started;
+ RTE_ATOMIC(uint32_t) started;
bool vlan_strip;
bool rx_sw_csum;
bool tx_sw_csum;
@@ -406,12 +406,19 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
uint16_t i, nb_rx = 0;
uint16_t nb_receive = nb_bufs;
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Fast-path early exit; racy load is fine here -- if we miss a
+ * transition we get caught by the seq_cst check below.
+ */
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_relaxed) == 0))
return 0;
- rte_atomic32_set(&r->while_queuing, 1);
-
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Announce presence, then re-check. The store and the following
+ * load MUST both be seq_cst so they are totally ordered with the
+ * control thread's store-to-allow_queuing / load-of-while_queuing
+ * pair. Anything weaker permits both sides to miss each other.
+ */
+ rte_atomic_store_explicit(&r->while_queuing, 1, rte_memory_order_seq_cst);
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_seq_cst) == 0))
goto out;
/* Dequeue packets from guest TX queue */
@@ -446,7 +453,7 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
}
out:
- rte_atomic32_set(&r->while_queuing, 0);
+ rte_atomic_store_explicit(&r->while_queuing, 0, rte_memory_order_release);
return nb_rx;
}
@@ -460,12 +467,19 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
uint64_t nb_bytes = 0;
uint64_t nb_missed = 0;
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Fast-path early exit; racy load is fine here -- if we miss a
+ * transition we get caught by the seq_cst check below.
+ */
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_relaxed) == 0))
return 0;
- rte_atomic32_set(&r->while_queuing, 1);
-
- if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
+ /* Announce presence, then re-check. The store and the following
+ * load MUST both be seq_cst so they are totally ordered with the
+ * control thread's store-to-allow_queuing / load-of-while_queuing
+ * pair. Anything weaker permits both sides to miss each other.
+ */
+ rte_atomic_store_explicit(&r->while_queuing, 1, rte_memory_order_seq_cst);
+ if (unlikely(rte_atomic_load_explicit(&r->allow_queuing, rte_memory_order_seq_cst) == 0))
goto out;
for (i = 0; i < nb_bufs; i++) {
@@ -515,7 +529,7 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
for (i = 0; likely(i < nb_tx); i++)
rte_pktmbuf_free(bufs[i]);
out:
- rte_atomic32_set(&r->while_queuing, 0);
+ rte_atomic_store_explicit(&r->while_queuing, 0, rte_memory_order_release);
return nb_tx;
}
@@ -744,6 +758,19 @@ eth_vhost_unconfigure_intr(struct rte_eth_dev *eth_dev)
}
}
+static inline void
+update_queue(struct vhost_queue *vq, uint32_t allow, bool wait_queuing)
+{
+ /* seq_cst: pairs with the data-path's seq_cst store of
+ * while_queuing and seq_cst load of allow_queuing. See
+ * eth_vhost_rx().
+ */
+ rte_atomic_store_explicit(&vq->allow_queuing, allow, rte_memory_order_seq_cst);
+ if (wait_queuing)
+ rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&vq->while_queuing,
+ 0, rte_memory_order_seq_cst);
+}
+
static void
update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
{
@@ -751,14 +778,18 @@ update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
struct vhost_queue *vq;
struct rte_vhost_vring_state *state;
unsigned int i;
- int allow_queuing = 1;
+ bool allow_queuing = true;
if (!dev->data->rx_queues || !dev->data->tx_queues)
return;
- if (rte_atomic32_read(&internal->started) == 0 ||
- rte_atomic32_read(&internal->dev_attached) == 0)
- allow_queuing = 0;
+ /* These are control-plane flags consulted only here;
+ * the real data-path handshake is on vq->allow_queuing below.
+ * Relaxed is sufficient.
+ */
+ if (rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed) == 0 ||
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed) == 0)
+ allow_queuing = false;
state = vring_states[dev->data->port_id];
@@ -767,24 +798,18 @@ update_queuing_status(struct rte_eth_dev *dev, bool wait_queuing)
vq = dev->data->rx_queues[i];
if (vq == NULL)
continue;
- if (allow_queuing && state->cur[vq->virtqueue_id])
- rte_atomic32_set(&vq->allow_queuing, 1);
- else
- rte_atomic32_set(&vq->allow_queuing, 0);
- while (wait_queuing && rte_atomic32_read(&vq->while_queuing))
- rte_pause();
+
+ update_queue(vq, !!(allow_queuing && state->cur[vq->virtqueue_id]),
+ wait_queuing);
}
for (i = 0; i < dev->data->nb_tx_queues; i++) {
vq = dev->data->tx_queues[i];
if (vq == NULL)
continue;
- if (allow_queuing && state->cur[vq->virtqueue_id])
- rte_atomic32_set(&vq->allow_queuing, 1);
- else
- rte_atomic32_set(&vq->allow_queuing, 0);
- while (wait_queuing && rte_atomic32_read(&vq->while_queuing))
- rte_pause();
+
+ update_queue(vq, !!(allow_queuing && state->cur[vq->virtqueue_id]),
+ wait_queuing);
}
}
@@ -848,7 +873,7 @@ new_device(int vid)
}
internal->vid = vid;
- if (rte_atomic32_read(&internal->started) == 1) {
+ if (rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed) == 1) {
queue_setup(eth_dev, internal);
if (dev_conf->intr_conf.rxq)
eth_vhost_configure_intr(eth_dev);
@@ -863,7 +888,7 @@ new_device(int vid)
vhost_dev_csum_configure(eth_dev);
- rte_atomic32_set(&internal->dev_attached, 1);
+ rte_atomic_store_explicit(&internal->dev_attached, 1, rte_memory_order_relaxed);
update_queuing_status(eth_dev, false);
VHOST_LOG_LINE(INFO, "Vhost device %d created", vid);
@@ -893,7 +918,7 @@ destroy_device(int vid)
eth_dev = list->eth_dev;
internal = eth_dev->data->dev_private;
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, 0, rte_memory_order_relaxed);
update_queuing_status(eth_dev, true);
eth_vhost_unconfigure_intr(eth_dev);
@@ -1148,11 +1173,11 @@ eth_dev_start(struct rte_eth_dev *eth_dev)
}
queue_setup(eth_dev, internal);
- if (rte_atomic32_read(&internal->dev_attached) == 1 &&
+ if (rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed) == 1 &&
dev_conf->intr_conf.rxq)
eth_vhost_configure_intr(eth_dev);
- rte_atomic32_set(&internal->started, 1);
+ rte_atomic_store_explicit(&internal->started, 1, rte_memory_order_relaxed);
update_queuing_status(eth_dev, false);
for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
@@ -1170,7 +1195,7 @@ eth_dev_stop(struct rte_eth_dev *dev)
uint16_t i;
dev->data->dev_started = 0;
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, 0, rte_memory_order_relaxed);
update_queuing_status(dev, true);
for (i = 0; i < dev->data->nb_rx_queues; i++)
@@ -1471,8 +1496,10 @@ vhost_dev_priv_dump(struct rte_eth_dev *dev, FILE *f)
fprintf(f, "features: 0x%" PRIx64 "\n", internal->features);
fprintf(f, "max_queues: %u\n", internal->max_queues);
fprintf(f, "vid: %d\n", internal->vid);
- fprintf(f, "started: %d\n", rte_atomic32_read(&internal->started));
- fprintf(f, "dev_attached: %d\n", rte_atomic32_read(&internal->dev_attached));
+ fprintf(f, "started: %u\n",
+ rte_atomic_load_explicit(&internal->started, rte_memory_order_relaxed));
+ fprintf(f, "dev_attached: %u\n",
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_relaxed));
fprintf(f, "vlan_strip: %d\n", internal->vlan_strip);
fprintf(f, "rx_sw_csum: %d\n", internal->rx_sw_csum);
fprintf(f, "tx_sw_csum: %d\n", internal->tx_sw_csum);
--
2.53.0
* [PATCH v4 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (23 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
This driver is the last in-tree caller of rte_atomic32_*(), which blocks
deprecation of the rte_atomicNN_*() family.
Replace rte_atomic32_read/set() with rte_atomic_load_explicit() and
rte_atomic_store_explicit() on the started, dev_attached, and running
flags. Narrow them to bool (they only ever hold 0 or 1) and group them
with the existing bools to reduce padding in struct ifcvf_internal.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
drivers/vdpa/ifc/ifcvf_vdpa.c | 37 ++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
index f319d455ba..e5da11a2ba 100644
--- a/drivers/vdpa/ifc/ifcvf_vdpa.c
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -25,6 +25,7 @@
#include <rte_log.h>
#include <rte_kvargs.h>
#include <rte_devargs.h>
+#include <rte_stdatomic.h>
#include "base/ifcvf.h"
@@ -68,10 +69,10 @@ struct ifcvf_internal {
struct rte_vdpa_device *vdev;
uint16_t max_queues;
uint64_t features;
- rte_atomic32_t started;
- rte_atomic32_t dev_attached;
- rte_atomic32_t running;
rte_spinlock_t lock;
+ RTE_ATOMIC(bool) started;
+ RTE_ATOMIC(bool) dev_attached;
+ RTE_ATOMIC(bool) running;
bool sw_lm;
bool sw_fallback_running;
/* mediated vring for sw fallback */
@@ -712,9 +713,9 @@ update_datapath(struct ifcvf_internal *internal)
rte_spinlock_lock(&internal->lock);
- if (!rte_atomic32_read(&internal->running) &&
- (rte_atomic32_read(&internal->started) &&
- rte_atomic32_read(&internal->dev_attached))) {
+ if (!rte_atomic_load_explicit(&internal->running, rte_memory_order_seq_cst) &&
+ (rte_atomic_load_explicit(&internal->started, rte_memory_order_seq_cst) &&
+ rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_seq_cst))) {
ret = ifcvf_dma_map(internal, true);
if (ret)
goto err;
@@ -735,10 +736,10 @@ update_datapath(struct ifcvf_internal *internal)
if (ret)
goto err;
- rte_atomic32_set(&internal->running, 1);
- } else if (rte_atomic32_read(&internal->running) &&
- (!rte_atomic32_read(&internal->started) ||
- !rte_atomic32_read(&internal->dev_attached))) {
+ rte_atomic_store_explicit(&internal->running, true, rte_memory_order_seq_cst);
+ } else if (rte_atomic_load_explicit(&internal->running, rte_memory_order_seq_cst) &&
+ (!rte_atomic_load_explicit(&internal->started, rte_memory_order_seq_cst) ||
+ !rte_atomic_load_explicit(&internal->dev_attached, rte_memory_order_seq_cst))) {
unset_intr_relay(internal);
ret = unset_notify_relay(internal);
@@ -755,7 +756,7 @@ update_datapath(struct ifcvf_internal *internal)
if (ret)
goto err;
- rte_atomic32_set(&internal->running, 0);
+ rte_atomic_store_explicit(&internal->running, false, rte_memory_order_seq_cst);
}
rte_spinlock_unlock(&internal->lock);
@@ -1058,7 +1059,7 @@ ifcvf_sw_fallback_switchover(struct ifcvf_internal *internal)
vdpa_disable_vfio_intr(internal);
- rte_atomic32_set(&internal->running, 0);
+ rte_atomic_store_explicit(&internal->running, false, rte_memory_order_seq_cst);
ret = rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false);
if (ret && ret != -ENOTSUP)
@@ -1113,11 +1114,11 @@ ifcvf_dev_config(int vid)
internal = list->internal;
internal->vid = vid;
- rte_atomic32_set(&internal->dev_attached, 1);
+ rte_atomic_store_explicit(&internal->dev_attached, true, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath for vDPA device %s",
vdev->device->name);
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, false, rte_memory_order_seq_cst);
return -1;
}
@@ -1166,7 +1167,7 @@ ifcvf_dev_close(int vid)
internal->sw_fallback_running = false;
} else {
- rte_atomic32_set(&internal->dev_attached, 0);
+ rte_atomic_store_explicit(&internal->dev_attached, false, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath for vDPA device %s",
vdev->device->name);
@@ -1782,10 +1783,10 @@ ifcvf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
goto error;
}
- rte_atomic32_set(&internal->started, 1);
+ rte_atomic_store_explicit(&internal->started, true, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0) {
DRV_LOG(ERR, "failed to update datapath %s", pci_dev->name);
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, false, rte_memory_order_seq_cst);
rte_vdpa_unregister_device(internal->vdev);
pthread_mutex_lock(&internal_list_lock);
TAILQ_REMOVE(&internal_list, list, next);
@@ -1819,7 +1820,7 @@ ifcvf_pci_remove(struct rte_pci_device *pci_dev)
}
internal = list->internal;
- rte_atomic32_set(&internal->started, 0);
+ rte_atomic_store_explicit(&internal->started, false, rte_memory_order_seq_cst);
if (update_datapath(internal) < 0)
DRV_LOG(ERR, "failed to update datapath %s", pci_dev->name);
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
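The flag conversion in the vdpa/ifc patch above can be tried outside DPDK
with plain C11 <stdatomic.h>. The following is only a sketch under stated
assumptions: struct demo_internal, enum demo_action and the demo_* functions
are hypothetical stand-ins, no DMA or notifier setup is done, and the DPDK
wrappers (RTE_ATOMIC(), rte_atomic_load_explicit(), ...) are replaced by
their standard C counterparts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three flags in struct ifcvf_internal. */
struct demo_internal {
	atomic_bool started;
	atomic_bool dev_attached;
	atomic_bool running;
};

enum demo_action { DEMO_NONE, DEMO_START, DEMO_STOP };

/* Same shape as the condition in update_datapath(), minus the real work. */
static inline enum demo_action
demo_update_datapath(struct demo_internal *in)
{
	bool running = atomic_load_explicit(&in->running, memory_order_seq_cst);
	bool wanted =
		atomic_load_explicit(&in->started, memory_order_seq_cst) &&
		atomic_load_explicit(&in->dev_attached, memory_order_seq_cst);

	if (!running && wanted) {
		atomic_store_explicit(&in->running, true, memory_order_seq_cst);
		return DEMO_START;
	}
	if (running && !wanted) {
		atomic_store_explicit(&in->running, false, memory_order_seq_cst);
		return DEMO_STOP;
	}
	return DEMO_NONE;
}

/* Walks through probe -> config -> close in the order the driver would. */
static inline void
demo_flags_self_check(void)
{
	struct demo_internal in;

	atomic_init(&in.started, false);
	atomic_init(&in.dev_attached, false);
	atomic_init(&in.running, false);

	atomic_store_explicit(&in.started, true, memory_order_seq_cst);
	assert(demo_update_datapath(&in) == DEMO_NONE);   /* not attached yet */

	atomic_store_explicit(&in.dev_attached, true, memory_order_seq_cst);
	assert(demo_update_datapath(&in) == DEMO_START);
	assert(demo_update_datapath(&in) == DEMO_NONE);   /* already running */

	atomic_store_explicit(&in.dev_attached, false, memory_order_seq_cst);
	assert(demo_update_datapath(&in) == DEMO_STOP);
}
```

The sketch keeps seq_cst on every access, as the patch does; whether a weaker
ordering would be sufficient under the spinlock is a separate question that
this example does not try to answer.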
* [PATCH v4 26/27] test/atomic: suppress deprecation warnings for legacy APIs
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (24 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
The rte_atomicNN_* APIs are now marked __rte_deprecated.
Wrap the whole file in __rte_diagnostic_push / __rte_diagnostic_pop with a
GCC pragma that ignores -Wdeprecated-declarations.
In the future, when the APIs are removed, this test collapses to just the
128-bit compare-and-swap case and the suppression goes with it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_atomic.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index 2a4531b833..f32a1aeff4 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -100,6 +100,15 @@
* - At the end of the test, the number of corrupted tokens must be 0.
*/
+/*
+ * The rte_atomicNN_* APIs exercised below are deprecated in favour of C11 atomics.
+ * Suppress the deprecation warnings for the whole file;
+ * when the APIs are removed this test collapses to the 128-bit
+ * compare-and-swap case and the suppression goes with it.
+ */
+__rte_diagnostic_push
+_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
+
#define NUM_ATOMIC_TYPES 3
#define N_BASE 1000000u
@@ -645,4 +654,7 @@ test_atomic(void)
return 0;
}
REGISTER_FAST_TEST(atomic_autotest, NOHUGE_SKIP, ASAN_OK, test_atomic);
+
+__rte_diagnostic_pop
+
#endif /* RTE_TOOLCHAIN_MSVC */
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
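The push / ignore / pop pattern used in the test/atomic patch above can be
shown in isolation. This is a sketch assuming GCC or Clang; the DEMO_* macros
and demo_* functions are made-up names that only imitate what
__rte_deprecated, __rte_diagnostic_push and __rte_diagnostic_pop are assumed
to expand to on those compilers.

```c
#include <assert.h>

/* Hypothetical equivalents of the DPDK macros, for GCC/Clang only. */
#define DEMO_DEPRECATED      __attribute__((__deprecated__))
#define DEMO_DIAG_PUSH       _Pragma("GCC diagnostic push")
#define DEMO_DIAG_POP        _Pragma("GCC diagnostic pop")
#define DEMO_ALLOW_DEPRECATED \
	_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")

/* A legacy helper that new code should no longer call. */
static DEMO_DEPRECATED inline int
demo_legacy_add(int a, int b)
{
	return a + b;
}

/* Everything between push and pop may call deprecated functions quietly. */
DEMO_DIAG_PUSH
DEMO_ALLOW_DEPRECATED

static inline int
demo_exercise_legacy(void)
{
	return demo_legacy_add(2, 3);
}

DEMO_DIAG_POP

/* After the pop, a call to demo_legacy_add() here would warn again. */
static inline void
demo_diag_self_check(void)
{
	assert(demo_exercise_legacy() == 5);
}
```

Because the previous diagnostic state is restored by the pop, the suppression
stays local to the wrapped region and does not leak into code that follows.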
* [PATCH v4 27/27] eal: mark rte_atomicNN as deprecated
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
` (25 preceding siblings ...)
2026-05-26 23:24 ` [PATCH v4 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
@ 2026-05-26 23:24 ` Stephen Hemminger
26 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-26 23:24 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Thomas Monjalon
The decision to deprecate the rte_atomicNN functions was made
back in 2021, but in-tree code continued to use them.
Now that all in-tree callers have been converted to C11 stdatomic,
mark the functions with __rte_deprecated so any remaining user code
will see the deprecation warning.
Since any new use of rte_atomicNN will be caught by the compiler,
there is no longer any need to check for it in checkpatches.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
devtools/checkpatches.sh | 8 --
doc/guides/rel_notes/deprecation.rst | 4 +-
doc/guides/rel_notes/release_26_07.rst | 4 +
lib/eal/include/generic/rte_atomic.h | 100 ++++++++++++++-----------
lib/eal/include/rte_common.h | 2 +
5 files changed, 62 insertions(+), 56 deletions(-)
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 81bb0fe4e8..a0cbbf09db 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -113,14 +113,6 @@ check_forbidden_additions() { # <patch>
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
- # refrain from new additions of 16/32/64 bits rte_atomicNN_xxx()
- awk -v FOLDERS="lib drivers app examples" \
- -v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
- -v RET_ON_FAIL=1 \
- -v MESSAGE='Using rte_atomicNN_xxx' \
- -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
- "$1" || res=1
-
# refrain from using compiler __sync_xxx builtins
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="__sync_.*\\\(" \
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2190419f79..5d9226d551 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -43,9 +43,7 @@ Deprecation Notices
* rte_atomicNN_xxx: These APIs do not take memory order parameter. This does
not allow for writing optimized code for all the CPU architectures supported
in DPDK. DPDK has adopted the atomic operations from
- https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
- operations must be used for patches that need to be merged in 20.08 onwards.
- This change will not introduce any performance degradation.
+ https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html.
* lib: will fix extending some enum/define breaking the ABI. There are multiple
samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 58d782f77e..92c12c83a4 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -106,6 +106,10 @@ API Changes
Also, make sure to start the actual text at the margin.
=======================================================
+* atomic: Marked the ``rte_atomicNN`` functions as deprecated.
+ As previously announced, these functions were intended to be deprecated,
+ but this was not being enforced.
+
ABI Changes
-----------
diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
index 1b04b43cbb..1a113c3df2 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -152,6 +152,14 @@ rte_smp_rmb(void)
rte_atomic_thread_fence(rte_memory_order_acquire);
}
+
+/*
+ * The rte_atomicNN_* APIs defined below are deprecated in favour of C11 atomics.
+ * Suppress the deprecation warnings for the inlines to allow inter-related usage.
+ */
+__rte_diagnostic_push
+__rte_allow_deprecated
+
/*------------------------- 16 bit atomic operations -------------------------*/
#ifndef RTE_TOOLCHAIN_MSVC
@@ -172,7 +180,7 @@ rte_smp_rmb(void)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -193,7 +201,7 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
* @return
* The original value at that location
*/
-static inline uint16_t
+static __rte_deprecated inline uint16_t
rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t *)dst,
@@ -218,7 +226,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_init(rte_atomic16_t *v)
{
v->cnt = 0;
@@ -232,7 +240,7 @@ rte_atomic16_init(rte_atomic16_t *v)
* @return
* The value of the counter.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_read(const rte_atomic16_t *v)
{
return v->cnt;
@@ -246,7 +254,7 @@ rte_atomic16_read(const rte_atomic16_t *v)
* @param new_value
* The new value for the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
{
v->cnt = new_value;
@@ -260,7 +268,7 @@ rte_atomic16_set(rte_atomic16_t *v, int16_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, inc,
@@ -275,7 +283,7 @@ rte_atomic16_add(rte_atomic16_t *v, int16_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, dec,
@@ -288,7 +296,7 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_inc(rte_atomic16_t *v)
{
rte_atomic16_add(v, 1);
@@ -300,7 +308,7 @@ rte_atomic16_inc(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic16_dec(rte_atomic16_t *v)
{
rte_atomic16_sub(v, 1);
@@ -319,7 +327,7 @@ rte_atomic16_dec(rte_atomic16_t *v)
* @return
* The value of v after the addition.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, inc,
@@ -340,7 +348,7 @@ rte_atomic16_add_return(rte_atomic16_t *v, int16_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int16_t
+static __rte_deprecated inline int16_t
rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, dec,
@@ -358,7 +366,7 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
@@ -375,7 +383,7 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
@@ -392,7 +400,7 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+static __rte_deprecated inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
{
return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
}
@@ -403,7 +411,7 @@ static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic16_clear(rte_atomic16_t *v)
+static __rte_deprecated inline void rte_atomic16_clear(rte_atomic16_t *v)
{
v->cnt = 0;
}
@@ -426,7 +434,7 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -447,7 +455,7 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
* @return
* The original value at that location
*/
-static inline uint32_t
+static __rte_deprecated inline uint32_t
rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t *)dst,
@@ -472,7 +480,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_init(rte_atomic32_t *v)
{
v->cnt = 0;
@@ -486,7 +494,7 @@ rte_atomic32_init(rte_atomic32_t *v)
* @return
* The value of the counter.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_read(const rte_atomic32_t *v)
{
return v->cnt;
@@ -500,7 +508,7 @@ rte_atomic32_read(const rte_atomic32_t *v)
* @param new_value
* The new value for the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
{
v->cnt = new_value;
@@ -514,7 +522,7 @@ rte_atomic32_set(rte_atomic32_t *v, int32_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, inc,
@@ -529,7 +537,7 @@ rte_atomic32_add(rte_atomic32_t *v, int32_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, dec,
@@ -542,7 +550,7 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_inc(rte_atomic32_t *v)
{
rte_atomic32_add(v, 1);
@@ -554,7 +562,7 @@ rte_atomic32_inc(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic32_dec(rte_atomic32_t *v)
{
rte_atomic32_sub(v,1);
@@ -573,7 +581,7 @@ rte_atomic32_dec(rte_atomic32_t *v)
* @return
* The value of v after the addition.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, inc,
@@ -594,7 +602,7 @@ rte_atomic32_add_return(rte_atomic32_t *v, int32_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int32_t
+static __rte_deprecated inline int32_t
rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, dec,
@@ -612,7 +620,7 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
* @return
* True if the result after the increment operation is 0; false otherwise.
*/
-static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) + 1 == 0;
@@ -629,7 +637,7 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
* @return
* True if the result after the decrement operation is 0; false otherwise.
*/
-static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
rte_memory_order_seq_cst) - 1 == 0;
@@ -646,7 +654,7 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+static __rte_deprecated inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
{
return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
}
@@ -657,7 +665,7 @@ static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic32_clear(rte_atomic32_t *v)
+static __rte_deprecated inline void rte_atomic32_clear(rte_atomic32_t *v)
{
v->cnt = 0;
}
@@ -679,7 +687,7 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
* @return
* Non-zero on success; 0 on failure.
*/
-static inline int
+static __rte_deprecated inline int
rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
return __sync_bool_compare_and_swap(dst, exp, src);
@@ -700,7 +708,7 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
* @return
* The original value at that location
*/
-static inline uint64_t
+static __rte_deprecated inline uint64_t
rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
{
return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t *)dst,
@@ -725,7 +733,7 @@ typedef struct {
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_init(rte_atomic64_t *v)
{
#ifdef __LP64__
@@ -750,7 +758,7 @@ rte_atomic64_init(rte_atomic64_t *v)
* @return
* The value of the counter.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_read(rte_atomic64_t *v)
{
#ifdef __LP64__
@@ -777,7 +785,7 @@ rte_atomic64_read(rte_atomic64_t *v)
* @param new_value
* The new value of the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
{
#ifdef __LP64__
@@ -802,7 +810,7 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
* @param inc
* The value to be added to the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
{
rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
@@ -817,7 +825,7 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
* @param dec
* The value to be subtracted from the counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
{
rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
@@ -830,7 +838,7 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_inc(rte_atomic64_t *v)
{
rte_atomic64_add(v, 1);
@@ -842,7 +850,7 @@ rte_atomic64_inc(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void
+static __rte_deprecated inline void
rte_atomic64_dec(rte_atomic64_t *v)
{
rte_atomic64_sub(v, 1);
@@ -861,7 +869,7 @@ rte_atomic64_dec(rte_atomic64_t *v)
* @return
* The value of v after the addition.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
{
return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
@@ -881,7 +889,7 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
* @return
* The value of v after the subtraction.
*/
-static inline int64_t
+static __rte_deprecated inline int64_t
rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
{
return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
@@ -899,7 +907,7 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
* @return
* True if the result after the addition is 0; false otherwise.
*/
-static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
{
return rte_atomic64_add_return(v, 1) == 0;
}
@@ -915,7 +923,7 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
* @return
* True if the result after subtraction is 0; false otherwise.
*/
-static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
{
return rte_atomic64_sub_return(v, 1) == 0;
}
@@ -931,7 +939,7 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
* @return
* 0 if failed; else 1, success.
*/
-static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+static __rte_deprecated inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
{
return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
}
@@ -942,13 +950,15 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
* @param v
* A pointer to the atomic counter.
*/
-static inline void rte_atomic64_clear(rte_atomic64_t *v)
+static __rte_deprecated inline void rte_atomic64_clear(rte_atomic64_t *v)
{
rte_atomic64_set(v, 0);
}
#endif
+__rte_diagnostic_pop
+
/*------------------------ 128 bit atomic operations -------------------------*/
/**
diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
index 0a356abae2..aa56aa6a94 100644
--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -172,9 +172,11 @@ typedef uint16_t unaligned_uint16_t;
#ifdef RTE_TOOLCHAIN_MSVC
#define __rte_deprecated
#define __rte_deprecated_msg(msg)
+#define __rte_allow_deprecated
#else
#define __rte_deprecated __attribute__((__deprecated__))
#define __rte_deprecated_msg(msg) __attribute__((__deprecated__(msg)))
+#define __rte_allow_deprecated _Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
#endif
/**
--
2.53.0
^ permalink raw reply related [flat|nested] 105+ messages in thread
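For out-of-tree code that starts seeing the new warning, the read-modify-write
helpers translate fairly directly to C11 atomics. The sketch below uses plain
<stdatomic.h> rather than the DPDK wrappers, and the demo_* names are
hypothetical. It follows the bodies visible in the diff above (fetch_add + 1,
fetch_sub - 1, compare-and-swap from 0 to 1), all with seq_cst ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Like rte_atomic32_inc_and_test(): true if the counter is 0 afterwards. */
static inline bool
demo_inc_and_test(_Atomic int32_t *v)
{
	return atomic_fetch_add_explicit(v, 1, memory_order_seq_cst) + 1 == 0;
}

/* Like rte_atomic32_dec_and_test(): true if the counter is 0 afterwards. */
static inline bool
demo_dec_and_test(_Atomic int32_t *v)
{
	return atomic_fetch_sub_explicit(v, 1, memory_order_seq_cst) - 1 == 0;
}

/* Like rte_atomic32_test_and_set(): store 1 only if the value was 0. */
static inline bool
demo_test_and_set(_Atomic int32_t *v)
{
	int32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(v, &expected, 1,
			memory_order_seq_cst, memory_order_seq_cst);
}

static inline void
demo_migration_self_check(void)
{
	_Atomic int32_t v = -1;

	assert(demo_inc_and_test(&v));    /* -1 -> 0 */
	assert(demo_test_and_set(&v));    /*  0 -> 1, succeeds */
	assert(!demo_test_and_set(&v));   /* already 1, fails */
	assert(demo_dec_and_test(&v));    /*  1 -> 0 */
}
```

Converted code is free to pick a weaker memory order where that is known to
be safe; seq_cst is used here only because it matches the legacy behavior.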
* RE: [EXTERNAL] [PATCH v4 15/27] net/netvsc: replace rte_atomic32 with stdatomic
2026-05-26 23:24 ` [PATCH v4 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
@ 2026-05-27 0:29 ` Long Li
2026-05-31 16:35 ` Stephen Hemminger
0 siblings, 1 reply; 105+ messages in thread
From: Long Li @ 2026-05-27 0:29 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: Wei Hu
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, May 26, 2026 4:24 PM
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Long Li
> <longli@microsoft.com>; Wei Hu <weh@microsoft.com>
> Subject: [EXTERNAL] [PATCH v4 15/27] net/netvsc: replace rte_atomic32 with
> stdatomic
>
> Change the rndis transaction id and buffer usage to use stdatomic functions.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> drivers/net/netvsc/hn_rndis.c | 28 +++++++++++++++++++---------
> drivers/net/netvsc/hn_rxtx.c | 12 +++++++-----
> drivers/net/netvsc/hn_var.h | 6 +++---
> 3 files changed, 29 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/netvsc/hn_rndis.c b/drivers/net/netvsc/hn_rndis.c
> index 7c54eebcef..4b1d3d5539 100644
> --- a/drivers/net/netvsc/hn_rndis.c
> +++ b/drivers/net/netvsc/hn_rndis.c
> @@ -17,7 +17,7 @@
> #include <rte_string_fns.h>
> #include <rte_memzone.h>
> #include <rte_malloc.h>
> -#include <rte_atomic.h>
> +#include <rte_stdatomic.h>
> #include <rte_alarm.h>
> #include <rte_branch_prediction.h>
> #include <rte_ether.h>
> @@ -59,7 +59,8 @@ hn_rndis_rid(struct hn_data *hv)
> uint32_t rid;
>
> do {
> - rid = rte_atomic32_add_return(&hv->rndis_req_id, 1);
> + rid = rte_atomic_fetch_add_explicit(&hv->rndis_req_id, 1,
> + rte_memory_order_seq_cst);
Does rte_atomic_fetch_add_explicit() return the old value of hv->rndis_req_id? If so, this is not correct, because rte_atomic32_add_return() returned the new value.
^ permalink raw reply [flat|nested] 105+ messages in thread
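The difference raised in this review can be checked with a few lines of
standard C11. In this sketch the demo_* names are hypothetical and
atomic_fetch_add_explicit() stands in for rte_atomic_fetch_add_explicit();
the legacy behavior is taken from the rte_atomic16/32_add_return()
documentation quoted earlier in the thread ("The value of v after the
addition").

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* C11 fetch_add: returns the value held *before* the increment. */
static inline uint32_t
demo_next_old(_Atomic uint32_t *ctr)
{
	return atomic_fetch_add_explicit(ctr, 1, memory_order_seq_cst);
}

/* Legacy add_return semantics: returns the value *after* the increment. */
static inline uint32_t
demo_next_new(_Atomic uint32_t *ctr)
{
	return atomic_fetch_add_explicit(ctr, 1, memory_order_seq_cst) + 1;
}

static inline void
demo_fetch_add_self_check(void)
{
	_Atomic uint32_t a = 0;
	_Atomic uint32_t b = 0;

	assert(demo_next_old(&a) == 0);   /* old value */
	assert(demo_next_new(&b) == 1);   /* new value */

	/* Both counters advanced by exactly one. */
	assert(atomic_load_explicit(&a, memory_order_seq_cst) == 1);
	assert(atomic_load_explicit(&b, memory_order_seq_cst) == 1);
}
```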
* RE: [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG
2026-05-26 23:23 ` [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG Stephen Hemminger
@ 2026-05-27 16:52 ` Marat Khalili
0 siblings, 0 replies; 105+ messages in thread
From: Marat Khalili @ 2026-05-27 16:52 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: Konstantin Ananyev
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday 27 May 2026 00:24
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> Marat Khalili <marat.khalili@huawei.com>
> Subject: [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG
>
> The BPF_ST_ATOMIC_REG macro generated code with deprecated
> rte_atomicNN_add and rte_atomicNN_exchange.
>
> Replace this with the equivalent rte_stdatomic definitions.
> Use memory order seq_cst to preserve the previous behavior of
> rte_atomicNN_add() / rte_atomicNN_exchange() and matches
> the Linux kernel BPF interpreter for these opcodes.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/bpf/bpf_exec.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/lib/bpf/bpf_exec.c b/lib/bpf/bpf_exec.c
> index 18013753b1..ee6ec7516f 100644
> --- a/lib/bpf/bpf_exec.c
> +++ b/lib/bpf/bpf_exec.c
> @@ -10,6 +10,7 @@
> #include <rte_log.h>
> #include <rte_debug.h>
> #include <rte_byteorder.h>
> +#include <rte_stdatomic.h>
>
> #include "bpf_impl.h"
>
> @@ -65,16 +66,16 @@
> (type)(reg)[(ins)->src_reg])
>
> #define BPF_ST_ATOMIC_REG(reg, ins, tp) do { \
> + RTE_ATOMIC(uint##tp##_t) *dst = (RTE_ATOMIC(uint##tp##_t) *) \
> + (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off); \
> switch (ins->imm) { \
> case BPF_ATOMIC_ADD: \
> - rte_atomic##tp##_add((rte_atomic##tp##_t *) \
> - (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
> - (reg)[(ins)->src_reg]); \
> + rte_atomic_fetch_add_explicit(dst, \
> + (reg)[(ins)->src_reg], rte_memory_order_seq_cst); \
> break; \
> case BPF_ATOMIC_XCHG: \
> - (reg)[(ins)->src_reg] = rte_atomic##tp##_exchange((uint##tp##_t *) \
> - (uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
> - (reg)[(ins)->src_reg]); \
> + (reg)[(ins)->src_reg] = rte_atomic_exchange_explicit(dst, \
> + (reg)[(ins)->src_reg], rte_memory_order_seq_cst); \
> break; \
> default: \
> /* this should be caught by validator and never reach here */ \
> --
> 2.53.0
Reviewed-by: Marat Khalili <marat.khalili@huawei.com>
nit: in the last sentence of the commit message `s` is not needed in `matches`,
and some word like `behavior` can be added for clarity after `interpreter`.
FWIW whole patchset builds on our amd64 and arm64 machines and passes our subset of tests.
^ permalink raw reply [flat|nested] 105+ messages in thread
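A reduced form of the macro from the bpf patch above may help when reviewing
the token pasting and the register-to-pointer cast. This is a sketch, not the
lib/bpf code: DEMO_ST_ATOMIC, the DEMO_ATOMIC_* constants and the flat
addr/src arguments are hypothetical simplifications, and standard
<stdatomic.h> replaces the rte_stdatomic wrappers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { DEMO_ATOMIC_ADD, DEMO_ATOMIC_XCHG };

/*
 * addr: 64-bit "register" holding the target address
 * src:  64-bit "register" holding the operand (updated by XCHG)
 * tp:   operand width in bits, pasted into uint<tp>_t
 */
#define DEMO_ST_ATOMIC(addr, src, op, tp) do { \
	_Atomic uint##tp##_t *dst_ = \
		(_Atomic uint##tp##_t *)(uintptr_t)(addr); \
	switch (op) { \
	case DEMO_ATOMIC_ADD: \
		atomic_fetch_add_explicit(dst_, (uint##tp##_t)(src), \
			memory_order_seq_cst); \
		break; \
	case DEMO_ATOMIC_XCHG: \
		(src) = atomic_exchange_explicit(dst_, (uint##tp##_t)(src), \
			memory_order_seq_cst); \
		break; \
	default: \
		break; \
	} \
} while (0)

static inline void
demo_bpf_self_check(void)
{
	_Atomic uint32_t mem = 10;
	uint64_t addr = (uint64_t)(uintptr_t)&mem;
	uint64_t src = 5;

	DEMO_ST_ATOMIC(addr, src, DEMO_ATOMIC_ADD, 32);
	assert(atomic_load_explicit(&mem, memory_order_seq_cst) == 15);
	assert(src == 5);                 /* ADD leaves the source untouched */

	DEMO_ST_ATOMIC(addr, src, DEMO_ATOMIC_XCHG, 32);
	assert(atomic_load_explicit(&mem, memory_order_seq_cst) == 5);
	assert(src == 15);                /* XCHG returns the previous value */
}
```

Computing the destination pointer once before the switch, as the patch does,
keeps each case down to a single atomic call.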
* Re: [EXTERNAL] [PATCH v4 15/27] net/netvsc: replace rte_atomic32 with stdatomic
2026-05-27 0:29 ` [EXTERNAL] " Long Li
@ 2026-05-31 16:35 ` Stephen Hemminger
0 siblings, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-05-31 16:35 UTC (permalink / raw)
To: Long Li; +Cc: dev@dpdk.org, Wei Hu
On Wed, 27 May 2026 00:29:55 +0000
Long Li <longli@microsoft.com> wrote:
> > @@ -59,7 +59,8 @@ hn_rndis_rid(struct hn_data *hv)
> > uint32_t rid;
> >
> > do {
> > - rid = rte_atomic32_add_return(&hv->rndis_req_id, 1);
> > + rid = rte_atomic_fetch_add_explicit(&hv->rndis_req_id, 1,
> > + rte_memory_order_seq_cst);
>
> Does rte_atomic_fetch_add_explicit() return the old value of hv->rndis_req_id? If so, this is not correct, because rte_atomic32_add_return() returned the new value.
In this case it is harmless: the req_id field is only used here, and
the issued id will just sit one behind the value the old code would
have returned.
The do { } while (rid == 0) loop absorbs the case where the request id
is zero.
For the next version I will just add one; that is simpler and avoids
the need to explain.
^ permalink raw reply [flat|nested] 105+ messages in thread
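The "just add one" variant mentioned in this reply could look roughly like
the sketch below. It is not the netvsc code: demo_rndis_rid() is a
hypothetical stand-alone function on a bare counter, written with standard
<stdatomic.h>, and it only illustrates how the retry loop skips a zero id
when the 32-bit counter wraps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Returns the next non-zero request id, matching add_return semantics. */
static inline uint32_t
demo_rndis_rid(_Atomic uint32_t *req_id)
{
	uint32_t rid;

	do {
		/* fetch_add yields the old value; +1 gives the new one. */
		rid = atomic_fetch_add_explicit(req_id, 1,
				memory_order_seq_cst) + 1;
	} while (rid == 0);

	return rid;
}

static inline void
demo_rid_self_check(void)
{
	_Atomic uint32_t ctr = 0;

	assert(demo_rndis_rid(&ctr) == 1);
	assert(demo_rndis_rid(&ctr) == 2);

	/* At the wrap point the zero id is skipped and 1 is issued. */
	atomic_store_explicit(&ctr, UINT32_MAX, memory_order_seq_cst);
	assert(demo_rndis_rid(&ctr) == 1);
	assert(atomic_load_explicit(&ctr, memory_order_seq_cst) == 1);
}
```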
* Re: [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic
2026-05-26 23:24 ` [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
@ 2026-06-01 9:22 ` Andrew Rybchenko
0 siblings, 0 replies; 105+ messages in thread
From: Andrew Rybchenko @ 2026-06-01 9:22 UTC (permalink / raw)
To: Stephen Hemminger, dev
On 5/27/26 2:24 AM, Stephen Hemminger wrote:
> The rte_atomicNN functions are deprecated and need to be replaced.
> Use stdatomic for the restart required flag.
> Use existing ethdev helper to set link status.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
^ permalink raw reply [flat|nested] 105+ messages in thread
* RE: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
2026-05-26 23:23 ` [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32 Stephen Hemminger
@ 2026-06-01 18:18 ` Konstantin Ananyev
2026-06-01 21:05 ` Stephen Hemminger
2026-06-01 21:18 ` Stephen Hemminger
2026-06-01 22:07 ` Stephen Hemminger
1 sibling, 2 replies; 105+ messages in thread
From: Konstantin Ananyev @ 2026-06-01 18:18 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: Wathsala Vithanage
> Remove the RTE_USE_C11_MEM_MODEL build switch; C11 atomics are now
> the default for all platforms. Unifies __rte_ring_update_tail into
> the C11 form (atomic_store_release replaces the older rte_smp_wmb +
> plain store on the generic path) and renames rte_ring_generic_pvt.h
> to rte_ring_x86_pvt.h to reflect its new scope.
>
> Also splits the head-move helper into separate ST and MT variants,
> removing the runtime is_st branch from the MT retry loop.
> This gives a small boost and scopes the following exception
> more tightly.
>
> Exception: on x86 with GCC, atomic_compare_exchange on the head CAS
> regresses MP/MC contended throughput by ~20% compared to the existing
> hand-written cmpxchg. As a workaround, GCC-on-x86 builds use the older
> __sync_bool_compare_and_swap builtin, which generates equivalent
> code to the original asm. Can be reverted if/when GCC gets
> fixed; a similar issue was observed in the Linux kernel.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/ring/meson.build | 2 +-
> lib/ring/rte_ring_c11_pvt.h | 75 +++--------
> lib/ring/rte_ring_elem_pvt.h | 125 ++++++++++++++++--
> ..._ring_generic_pvt.h => rte_ring_x86_pvt.h} | 61 ++-------
> lib/ring/soring.c | 15 ++-
> 5 files changed, 158 insertions(+), 120 deletions(-)
> rename lib/ring/{rte_ring_generic_pvt.h => rte_ring_x86_pvt.h} (60%)
>
> diff --git a/lib/ring/meson.build b/lib/ring/meson.build
> index 21f2c12989..b178c963b8 100644
> --- a/lib/ring/meson.build
> +++ b/lib/ring/meson.build
> @@ -9,7 +9,7 @@ indirect_headers += files (
> 'rte_ring_elem.h',
> 'rte_ring_elem_pvt.h',
> 'rte_ring_c11_pvt.h',
> - 'rte_ring_generic_pvt.h',
> + 'rte_ring_x86_pvt.h',
> 'rte_ring_hts.h',
> 'rte_ring_hts_elem_pvt.h',
> 'rte_ring_peek.h',
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index 07b6efc416..3efe011f08 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -15,35 +15,10 @@
> * @file rte_ring_c11_pvt.h
> * It is not recommended to include this file directly,
> * include <rte_ring.h> instead.
> - * Contains internal helper functions for MP/SP and MC/SC ring modes.
> + * Contains internal helper functions for MP and MC ring modes.
> * For more information please refer to <rte_ring.h>.
> */
>
> -/**
> - * @internal This function updates tail values.
> - */
> -static __rte_always_inline void
> -__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> - uint32_t new_val, uint32_t single, uint32_t enqueue)
> -{
> - RTE_SET_USED(enqueue);
> -
> - /*
> - * If there are other enqueues/dequeues in progress that preceded us,
> - * we need to wait for them to complete
> - */
> - if (!single)
> - rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
> - rte_memory_order_relaxed);
> -
> - /*
> - * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> - * Ensures that memory effects by this thread on ring elements array
> - * is observed by a different thread of the other type.
> - */
> - rte_atomic_store_explicit(&ht->tail, new_val,
> rte_memory_order_release);
> -}
> -
> /**
> * @internal This is a helper function that moves the producer/consumer head
> *
> @@ -72,14 +47,11 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht,
> uint32_t old_val,
> * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
> */
> static __rte_always_inline unsigned int
> -__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> +__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
> const struct rte_ring_headtail *s, uint32_t capacity,
> - unsigned int is_st, unsigned int n,
> - enum rte_ring_queue_behavior behavior,
> + unsigned int n, enum rte_ring_queue_behavior behavior,
> uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
> {
> - uint32_t stail;
> - int success;
> unsigned int max = n;
>
> /*
> @@ -89,8 +61,7 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail
> *d,
> * d->head.
> * If not, an unsafe partial order may ensue.
> */
> - *old_head = rte_atomic_load_explicit(&d->head,
> - rte_memory_order_acquire);
> + *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
> do {
> /* Reset n to the initial burst count */
> n = max;
> @@ -101,15 +72,14 @@ __rte_ring_headtail_move_head(struct
> rte_ring_headtail *d,
> * ring elements array is observed by the time
> * this thread observes its tail update.
> */
> - stail = rte_atomic_load_explicit(&s->tail,
> - rte_memory_order_acquire);
> + uint32_t stail = rte_atomic_load_explicit(&s->tail,
> rte_memory_order_acquire);
>
> /* The subtraction is done between two unsigned 32bits value
> * (the result is always modulo 32 bits even if we have
> * *old_head > s->tail). So 'entries' is always between 0
> * and capacity (which is < size).
> */
> - *entries = (capacity + stail - *old_head);
> + *entries = capacity + stail - *old_head;
>
> /* check that we have enough room in ring */
> if (unlikely(n > *entries))
> @@ -120,25 +90,20 @@ __rte_ring_headtail_move_head(struct
> rte_ring_headtail *d,
> return 0;
>
> *new_head = *old_head + n;
> - if (is_st) {
> - d->head = *new_head;
> - success = 1;
> - } else
> - /* on failure, *old_head is updated */
> - /*
> - * R1/A2.
> - * R1: Establishes a synchronizing edge with A0 of a
> - * different thread.
> - * A2: Establishes a synchronizing edge with R1 of a
> - * different thread to observe same value for stail
> - * observed by that thread on CAS failure (to retry
> - * with an updated *old_head).
> - */
> - success =
> rte_atomic_compare_exchange_strong_explicit(
> - &d->head, old_head, *new_head,
> - rte_memory_order_release,
> - rte_memory_order_acquire);
> - } while (unlikely(success == 0));
> +
> + /* on failure, *old_head is updated */
> + /*
> + * R1/A2.
> + * R1: Establishes a synchronizing edge with A0 of a
> + * different thread.
> + * A2: Establishes a synchronizing edge with R1 of a
> + * different thread to observe same value for stail
> + * observed by that thread on CAS failure (to retry
> + * with an updated *old_head).
> + */
> + } while (unlikely(!rte_atomic_compare_exchange_strong_explicit(
> + &d->head, old_head, *new_head,
> + rte_memory_order_release,
> rte_memory_order_acquire)));
> return n;
> }
>
> diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
> index 6eafae121f..9d1da12a92 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -299,17 +299,108 @@ __rte_ring_dequeue_elems(struct rte_ring *r,
> uint32_t cons_head,
> cons_head & r->mask, esize, num);
> }
>
> -/* Between load and load. there might be cpu reorder in weak model
> - * (powerpc/arm).
> - * There are 2 choices for the users
> - * 1.use rmb() memory barrier
> - * 2.use one-direction load_acquire/store_release barrier
> - * It depends on performance test results.
> +/**
> + * @internal This function updates tail values.
> */
> -#ifdef RTE_USE_C11_MEM_MODEL
> -#include "rte_ring_c11_pvt.h"
> +static __rte_always_inline void
> +__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> + uint32_t new_val, uint32_t single, uint32_t enqueue)
> +{
> + RTE_SET_USED(enqueue);
> +
> + /*
> + * If there are other enqueues/dequeues in progress that preceded us,
> + * we need to wait for them to complete
> + */
> + if (!single)
> + rte_wait_until_equal_32((uint32_t *)(uintptr_t)&ht->tail, old_val,
> + rte_memory_order_relaxed);
> +
> + /*
> + * R0: Establishes a synchronizing edge with load-acquire of tail at A1.
> + * Ensures that memory effects by this thread on ring elements array
> + * is observed by a different thread of the other type.
> + */
> + rte_atomic_store_explicit(&ht->tail, new_val,
> rte_memory_order_release);
> +}
> +
> +/**
> + * @internal This is a helper function that moves the producer/consumer head
> + *
> + *
> + * This optimized version for single threaded case.
> + *
> + * @param d
> + * A pointer to the headtail structure with head value to be moved
> + * @param s
> + * A pointer to the counter-part headtail structure. Note that this
> + * function only reads tail value from it
> + * @param capacity
> + * Either ring capacity value (for producer), or zero (for consumer)
> + * @param n
> + * The number of elements we want to move head value on
> + * @param behavior
> + * RTE_RING_QUEUE_FIXED: Move on a fixed number of items
> + * RTE_RING_QUEUE_VARIABLE: Move on as many items as possible
> + * @param old_head
> + * Returns head value as it was before the move
> + * @param new_head
> + * Returns the new head value
> + * @param entries
> + * Returns the number of ring entries available BEFORE head was moved
> + * @return
> + * Actual number of objects the head was moved on
> + * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_headtail_move_head_st(struct rte_ring_headtail *d,
> + const struct rte_ring_headtail *s, uint32_t capacity,
> + unsigned int n, enum rte_ring_queue_behavior behavior,
> + uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
> +{
> + uint32_t stail;
> +
I really like the idea of splitting the _st and _mt move_head into separate functions.
That makes the code much cleaner and easier to understand and maintain.
A few comments on the actual '_st' implementation below:
> + /*
> + * A0: Establishes a synchronizing edge with R1.
> + * Ensure that this thread observes same values
> + * to stail observed by the thread that updated
> + * d->head.
> + * If not, an unsafe partial order may ensue.
> + */
I believe that comment is not relevant for '_st':
there is no R1 anymore for '_st' (see below),
and no thread other than this one can move the head.
So there is probably no point in using '_acquire' order here.
> + *old_head = rte_atomic_load_explicit(&d->head, rte_memory_order_acquire);
> +
> + /*
> + * A1: Establishes a synchronizing edge with R0.
> + * Ensures that other thread's memory effects on
> + * ring elements array is observed by the time
> + * this thread observes its tail update.
> + */
> + stail = rte_atomic_load_explicit(&s->tail, rte_memory_order_acquire);
> +
> + /* The subtraction is done between two unsigned 32bits value
> + * (the result is always modulo 32 bits even if we have
> + * *old_head > s->tail). So 'entries' is always between 0
> + * and capacity (which is < size).
> + */
> + *entries = capacity + stail - *old_head;
> +
> + /* check that we have enough room in ring */
> + if (unlikely(n > *entries))
> + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> + if (n > 0) {
> + *new_head = *old_head + n;
> + d->head = *new_head;
There is a bit of inconsistency with the 'load' operation above:
if we use atomic_load(&d->head, ...) there, then it would be better to use
atomic_store(&d->head, ..., order_relaxed) here.
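Taken together, the two '_st' comments above suggest something like the following. It is only a minimal sketch in plain C11 <stdatomic.h> rather than the rte_atomic_* wrappers; struct headtail, move_head_st() and the 'fixed' flag are hypothetical simplifications, not the code in the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_ring_headtail. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Single-threaded head move: only the calling thread ever writes d->head,
 * so both the load and the store of head can be relaxed. The load of the
 * opposite tail stays acquire so that it still pairs with the
 * store-release done when the other side publishes its tail.
 * Returns the number of slots claimed (0 if 'fixed' is set and n does
 * not fit).
 */
static unsigned int
move_head_st(struct headtail *d, struct headtail *s, uint32_t capacity,
	     unsigned int n, int fixed, uint32_t *old_head, uint32_t *new_head)
{
	uint32_t stail, entries;

	*old_head = atomic_load_explicit(&d->head, memory_order_relaxed);
	stail = atomic_load_explicit(&s->tail, memory_order_acquire);

	/* unsigned arithmetic wraps modulo 2^32, so this stays in range */
	entries = capacity + stail - *old_head;
	if (n > entries)
		n = fixed ? 0 : entries;

	if (n > 0) {
		*new_head = *old_head + n;
		atomic_store_explicit(&d->head, *new_head,
				      memory_order_relaxed);
	}
	return n;
}
```

Using an atomic load and an atomic store on the same field keeps the accesses consistent, while the relaxed order reflects that no other thread moves this head.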
> + }
> +
> + return n;
> +}
> +
> +/* There are two choices because GCC optimizer does poorly on atomic_compare_exchange */
> +#if defined(RTE_TOOLCHAIN_GCC) && defined(RTE_ARCH_X86)
If we still need to use legacy code for x86, I think we need an explicit macro
to enable C11 for x86 (RTE_RING_FORCE_C11 or so),
to make sure that the C11 version still gets tested and measured on x86.
> +#include "rte_ring_x86_pvt.h"
> #else
> -#include "rte_ring_generic_pvt.h"
> +#include "rte_ring_c11_pvt.h"
> #endif
I had a look at the compiler output for both cases. Most of the code
looks nearly identical, but one thing I noticed:
the C11 __rte_ring_headtail_move_head_mt() uses the output
parameter 'uint32_t *old_head' directly within the CAS operation.
On x86_64 that causes gcc to generate extra instructions to
store the return value of the CAS (eax) into the 'old_head' memory location,
even when the CAS was not successful and another attempt has to be
performed. In some cases, even an extra branch can be observed:
https://godbolt.org/z/4dTrqMjYe
In contrast, the x86-specific version that uses
__sync_bool_compare_and_swap() doesn't exhibit this problem,
as __sync_bool_compare_and_swap() doesn't update 'old_head'
with the new value, and we have to re-read it explicitly on each iteration.
I tried to overcome that problem by using a local variable 'head' inside the loop,
and updating the '*old_head' value only at exit.
With that change gcc manages to avoid the extra store (and branch),
see __rte_ring_headtail_move_head_mt_c11_v2() in the link above.
Can I ask you to re-run your perf test with the patch:
https://patchwork.dpdk.org/project/dpdk/patch/20260601181509.71007-1-konstantin.ananyev@huawei.com/
applied on top of your changes, to see whether it helps in terms of performance?
Alternatively, if you point me to the exact tests you are running,
I am happy to repeat them on my box.
My preference would be to avoid arch/compiler specific versions, if possible.
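The local-variable idea can be sketched roughly as follows, in plain C11 <stdatomic.h>. This is an illustration under assumptions, not the patch linked above: move_head_mt(), its parameters and the 'fixed' flag are hypothetical simplifications of __rte_ring_headtail_move_head_mt().

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Multi-thread head move that keeps the expected head in a local variable.
 * On CAS failure the compare-exchange refreshes the local 'head', and the
 * caller-visible *old_head / *new_head are written exactly once, after the
 * CAS has succeeded. In this simplified sketch they are left untouched
 * when 0 is returned.
 */
static unsigned int
move_head_mt(_Atomic uint32_t *d_head, _Atomic uint32_t *s_tail,
	     uint32_t capacity, unsigned int n, int fixed,
	     uint32_t *old_head, uint32_t *new_head)
{
	const unsigned int max = n;
	uint32_t head, next;

	head = atomic_load_explicit(d_head, memory_order_acquire);
	do {
		/* reset n to the initial burst count on every retry */
		n = max;

		uint32_t stail = atomic_load_explicit(s_tail,
						      memory_order_acquire);
		uint32_t entries = capacity + stail - head;

		if (n > entries)
			n = fixed ? 0 : entries;
		if (n == 0)
			return 0;

		next = head + n;
		/* on failure, the local 'head' is updated, not *old_head */
	} while (!atomic_compare_exchange_strong_explicit(d_head, &head, next,
			memory_order_release, memory_order_acquire));

	*old_head = head;
	*new_head = next;
	return n;
}
```

Whether a given compiler then keeps 'head' in a register across retries is something to confirm from its generated code and from perf numbers, as requested above.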
> /**
> @@ -341,8 +432,12 @@ __rte_ring_move_prod_head(struct rte_ring *r,
> unsigned int is_sp,
> uint32_t *old_head, uint32_t *new_head,
> uint32_t *free_entries)
> {
> - return __rte_ring_headtail_move_head(&r->prod, &r->cons, r->capacity,
> - is_sp, n, behavior, old_head, new_head, free_entries);
> + if (is_sp)
> + return __rte_ring_headtail_move_head_st(&r->prod, &r->cons,
> r->capacity,
> + n, behavior, old_head, new_head, free_entries);
> + else
> + return __rte_ring_headtail_move_head_mt(&r->prod, &r->cons,
> r->capacity,
> + n, behavior, old_head, new_head, free_entries);
> }
>
> /**
> @@ -374,8 +469,12 @@ __rte_ring_move_cons_head(struct rte_ring *r,
> unsigned int is_sc,
> uint32_t *old_head, uint32_t *new_head,
> uint32_t *entries)
> {
> - return __rte_ring_headtail_move_head(&r->cons, &r->prod, 0,
> - is_sc, n, behavior, old_head, new_head, entries);
> + if (is_sc)
> + return __rte_ring_headtail_move_head_st(&r->cons, &r->prod,
> 0,
> + n, behavior, old_head, new_head, entries);
> + else
> + return __rte_ring_headtail_move_head_mt(&r->cons, &r->prod,
> 0,
> + n, behavior, old_head, new_head, entries);
> }
>
> /**
> diff --git a/lib/ring/rte_ring_generic_pvt.h b/lib/ring/rte_ring_x86_pvt.h
> similarity index 60%
> rename from lib/ring/rte_ring_generic_pvt.h
> rename to lib/ring/rte_ring_x86_pvt.h
> index affd2d5ba7..c8de108bbd 100644
> --- a/lib/ring/rte_ring_generic_pvt.h
> +++ b/lib/ring/rte_ring_x86_pvt.h
> @@ -7,39 +7,19 @@
> * Used as BSD-3 Licensed with permission from Kip Macy.
> */
>
> -#ifndef _RTE_RING_GENERIC_PVT_H_
> -#define _RTE_RING_GENERIC_PVT_H_
> +#ifndef _RTE_RING_X86_PVT_H_
> +#define _RTE_RING_X86_PVT_H_
>
> /**
> - * @file rte_ring_generic_pvt.h
> + * @file rte_ring_x86_pvt.h
> * It is not recommended to include this file directly,
> * include <rte_ring.h> instead.
> - * Contains internal helper functions for MP/SP and MC/SC ring modes.
> - * For more information please refer to <rte_ring.h>.
> + *
> + * Contains internal helper functions for MP and MC ring modes.
> + * It is GCC specific to workaround poor optimizer handling of C11 atomic
> + * compare_exchange.
> */
>
> -/**
> - * @internal This function updates tail values.
> - */
> -static __rte_always_inline void
> -__rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> - uint32_t new_val, uint32_t single, uint32_t enqueue)
> -{
> - if (enqueue)
> - rte_smp_wmb();
> - else
> - rte_smp_rmb();
> - /*
> - * If there are other enqueues/dequeues in progress that preceded us,
> - * we need to wait for them to complete
> - */
> - if (!single)
> - rte_wait_until_equal_32((volatile uint32_t *)(uintptr_t)&ht->tail,
> old_val,
> - rte_memory_order_relaxed);
> -
> - ht->tail = new_val;
> -}
> -
> /**
> * @internal This is a helper function that moves the producer/consumer head
> *
> @@ -50,8 +30,6 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht,
> uint32_t old_val,
> * function only reads tail value from it
> * @param capacity
> * Either ring capacity value (for producer), or zero (for consumer)
> - * @param is_st
> - * Indicates whether multi-thread safe path is needed or not
> * @param n
> * The number of elements we want to move head value on
> * @param behavior
> @@ -68,14 +46,13 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht,
> uint32_t old_val,
> * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only
> */
> static __rte_always_inline unsigned int
> -__rte_ring_headtail_move_head(struct rte_ring_headtail *d,
> +__rte_ring_headtail_move_head_mt(struct rte_ring_headtail *d,
> const struct rte_ring_headtail *s, uint32_t capacity,
> - unsigned int is_st, unsigned int n,
> + unsigned int n,
> enum rte_ring_queue_behavior behavior,
> uint32_t *old_head, uint32_t *new_head, uint32_t *entries)
> {
> unsigned int max = n;
> - int success;
>
> do {
> /* Reset n to the initial burst count */
> @@ -83,18 +60,13 @@ __rte_ring_headtail_move_head(struct rte_ring_headtail
> *d,
>
> *old_head = d->head;
>
> - /* add rmb barrier to avoid load/load reorder in weak
> - * memory model. It is noop on x86
> - */
> - rte_smp_rmb();
> -
> /*
> * The subtraction is done between two unsigned 32bits value
> * (the result is always modulo 32 bits even if we have
> * *old_head > s->tail). So 'entries' is always between 0
> * and capacity (which is < size).
> */
> - *entries = (capacity + s->tail - *old_head);
> + *entries = capacity + s->tail - *old_head;
>
> /* check that we have enough room in ring */
> if (unlikely(n > *entries))
> @@ -105,15 +77,10 @@ __rte_ring_headtail_move_head(struct
> rte_ring_headtail *d,
> return 0;
>
> *new_head = *old_head + n;
> - if (is_st) {
> - d->head = *new_head;
> - success = 1;
> - } else
> - success = rte_atomic32_cmpset(
> - (uint32_t *)(uintptr_t)&d->head,
> - *old_head, *new_head);
> - } while (unlikely(success == 0));
> + } while (unlikely(!__sync_bool_compare_and_swap(
> + (uint32_t *)(uintptr_t)&d->head,
> + *old_head, *new_head)));
> return n;
> }
>
> -#endif /* _RTE_RING_GENERIC_PVT_H_ */
> +#endif /* _RTE_RING_X86_PVT_H_ */
> diff --git a/lib/ring/soring.c b/lib/ring/soring.c
> index 3b90521bdb..0e8bbc03c1 100644
> --- a/lib/ring/soring.c
> +++ b/lib/ring/soring.c
> @@ -135,9 +135,12 @@ __rte_soring_move_prod_head(struct rte_soring *r,
> uint32_t num,
>
> switch (st) {
> case RTE_RING_SYNC_ST:
> + n = __rte_ring_headtail_move_head_st(&r->prod.ht, &r->cons.ht,
> + r->capacity, num, behavior, head, next, free);
> + break;
> case RTE_RING_SYNC_MT:
> - n = __rte_ring_headtail_move_head(&r->prod.ht, &r->cons.ht,
> - r->capacity, st, num, behavior, head, next, free);
> + n = __rte_ring_headtail_move_head_mt(&r->prod.ht, &r->cons.ht,
> + r->capacity, num, behavior, head, next, free);
> break;
> case RTE_RING_SYNC_MT_RTS:
> n = __rte_ring_rts_move_head(&r->prod.rts, &r->cons.ht,
> @@ -168,9 +171,13 @@ __rte_soring_move_cons_head(struct rte_soring *r,
> uint32_t stage, uint32_t num,
>
> switch (st) {
> case RTE_RING_SYNC_ST:
> + n = __rte_ring_headtail_move_head_st(&r->cons.ht,
> + &r->stage[stage].ht, 0, num, behavior,
> + head, next, avail);
> + break;
> case RTE_RING_SYNC_MT:
> - n = __rte_ring_headtail_move_head(&r->cons.ht,
> - &r->stage[stage].ht, 0, st, num, behavior,
> + n = __rte_ring_headtail_move_head_mt(&r->cons.ht,
> + &r->stage[stage].ht, 0, num, behavior,
> head, next, avail);
> break;
> case RTE_RING_SYNC_MT_RTS:
> --
> 2.53.0
^ permalink raw reply [flat|nested] 105+ messages in thread
* RE: [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms
2026-05-26 23:23 ` [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
@ 2026-06-01 18:23 ` Konstantin Ananyev
0 siblings, 0 replies; 105+ messages in thread
From: Konstantin Ananyev @ 2026-06-01 18:23 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Cc: Wathsala Vithanage, Bibo Mao, David Christensen, Sun Yuechi,
Bruce Richardson
> Next step is to deprecate the rte_atomicNN_*() family. Rather than
> maintaining both the inline asm and intrinsic fallbacks, drop the
> asm paths and use intrinsics everywhere.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/eal/arm/include/rte_atomic_32.h | 4 -
> lib/eal/arm/include/rte_atomic_64.h | 4 -
> lib/eal/include/generic/rte_atomic.h | 76 +---------
> lib/eal/loongarch/include/rte_atomic.h | 4 -
> lib/eal/ppc/include/rte_atomic.h | 173 -----------------------
> lib/eal/riscv/include/rte_atomic.h | 4 -
> lib/eal/x86/include/rte_atomic.h | 172 ----------------------
> lib/eal/x86/include/rte_atomic_32.h | 188 -------------------------
> lib/eal/x86/include/rte_atomic_64.h | 157 ---------------------
> 9 files changed, 6 insertions(+), 776 deletions(-)
>
> diff --git a/lib/eal/arm/include/rte_atomic_32.h
> b/lib/eal/arm/include/rte_atomic_32.h
> index 0b9a0dfa30..696a539fef 100644
> --- a/lib/eal/arm/include/rte_atomic_32.h
> +++ b/lib/eal/arm/include/rte_atomic_32.h
> @@ -5,10 +5,6 @@
> #ifndef _RTE_ATOMIC_ARM32_H_
> #define _RTE_ATOMIC_ARM32_H_
>
> -#ifndef RTE_FORCE_INTRINSICS
> -# error Platform must be built with RTE_FORCE_INTRINSICS
> -#endif
> -
> #include "generic/rte_atomic.h"
>
> #ifdef __cplusplus
> diff --git a/lib/eal/arm/include/rte_atomic_64.h
> b/lib/eal/arm/include/rte_atomic_64.h
> index 181bb60929..9f790238df 100644
> --- a/lib/eal/arm/include/rte_atomic_64.h
> +++ b/lib/eal/arm/include/rte_atomic_64.h
> @@ -6,10 +6,6 @@
> #ifndef _RTE_ATOMIC_ARM64_H_
> #define _RTE_ATOMIC_ARM64_H_
>
> -#ifndef RTE_FORCE_INTRINSICS
> -# error Platform must be built with RTE_FORCE_INTRINSICS
> -#endif
> -
> #include "generic/rte_atomic.h"
> #include <rte_branch_prediction.h>
> #include <rte_debug.h>
> diff --git a/lib/eal/include/generic/rte_atomic.h
> b/lib/eal/include/generic/rte_atomic.h
> index 0a4f3f8528..292e52fade 100644
> --- a/lib/eal/include/generic/rte_atomic.h
> +++ b/lib/eal/include/generic/rte_atomic.h
> @@ -187,13 +187,11 @@ static inline void
> rte_atomic_thread_fence(rte_memory_order memorder);
> static inline int
> rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int
> rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
> {
> return __sync_bool_compare_and_swap(dst, exp, src);
> }
> -#endif
>
> /**
> * Atomic exchange.
> @@ -211,15 +209,11 @@ rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t
> exp, uint16_t src)
> * The original value at that location
> */
> static inline uint16_t
> -rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint16_t
> rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val,
> rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint16_t
> *)dst,
> + val, rte_memory_order_seq_cst);
> }
> -#endif
>
> /**
> * The atomic counter structure.
> @@ -312,13 +306,11 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
> static inline void
> rte_atomic16_inc(rte_atomic16_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic16_inc(rte_atomic16_t *v)
> {
> rte_atomic16_add(v, 1);
> }
> -#endif
>
> /**
> * Atomically decrement a counter by one.
> @@ -329,13 +321,11 @@ rte_atomic16_inc(rte_atomic16_t *v)
> static inline void
> rte_atomic16_dec(rte_atomic16_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic16_dec(rte_atomic16_t *v)
> {
> rte_atomic16_sub(v, 1);
> }
> -#endif
>
> /**
> * Atomically add a 16-bit value to a counter and return the result.
> @@ -391,13 +381,11 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t
> dec)
> */
> static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
> {
> return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
> rte_memory_order_seq_cst) + 1 == 0;
> }
> -#endif
>
> /**
> * Atomically decrement a 16-bit counter by one and test.
> @@ -412,13 +400,11 @@ static inline int
> rte_atomic16_inc_and_test(rte_atomic16_t *v)
> */
> static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
> {
> return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
> rte_memory_order_seq_cst) - 1 == 0;
> }
> -#endif
>
> /**
> * Atomically test and set a 16-bit atomic counter.
> @@ -433,12 +419,10 @@ static inline int
> rte_atomic16_dec_and_test(rte_atomic16_t *v)
> */
> static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
> {
> return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
> }
> -#endif
>
> /**
> * Atomically set a 16-bit counter to 0.
> @@ -472,13 +456,11 @@ static inline void rte_atomic16_clear(rte_atomic16_t
> *v)
> static inline int
> rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int
> rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> {
> return __sync_bool_compare_and_swap(dst, exp, src);
> }
> -#endif
>
> /**
> * Atomic exchange.
> @@ -496,15 +478,11 @@ rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t
> exp, uint32_t src)
> * The original value at that location
> */
> static inline uint32_t
> -rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint32_t
> rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val,
> rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint32_t
> *)dst,
> + val, rte_memory_order_seq_cst);
> }
> -#endif
>
> /**
> * The atomic counter structure.
> @@ -597,13 +575,11 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
> static inline void
> rte_atomic32_inc(rte_atomic32_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic32_inc(rte_atomic32_t *v)
> {
> rte_atomic32_add(v, 1);
> }
> -#endif
>
> /**
> * Atomically decrement a counter by one.
> @@ -614,13 +590,11 @@ rte_atomic32_inc(rte_atomic32_t *v)
> static inline void
> rte_atomic32_dec(rte_atomic32_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic32_dec(rte_atomic32_t *v)
> {
> rte_atomic32_sub(v,1);
> }
> -#endif
>
> /**
> * Atomically add a 32-bit value to a counter and return the result.
> @@ -676,13 +650,11 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t
> dec)
> */
> static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
> {
> return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
> rte_memory_order_seq_cst) + 1 == 0;
> }
> -#endif
>
> /**
> * Atomically decrement a 32-bit counter by one and test.
> @@ -697,13 +669,11 @@ static inline int
> rte_atomic32_inc_and_test(rte_atomic32_t *v)
> */
> static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
> {
> return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
> rte_memory_order_seq_cst) - 1 == 0;
> }
> -#endif
>
> /**
> * Atomically test and set a 32-bit atomic counter.
> @@ -718,12 +688,10 @@ static inline int
> rte_atomic32_dec_and_test(rte_atomic32_t *v)
> */
> static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
> {
> return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
> }
> -#endif
>
> /**
> * Atomically set a 32-bit counter to 0.
> @@ -756,13 +724,11 @@ static inline void rte_atomic32_clear(rte_atomic32_t
> *v)
> static inline int
> rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int
> rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> {
> return __sync_bool_compare_and_swap(dst, exp, src);
> }
> -#endif
>
> /**
> * Atomic exchange.
> @@ -780,15 +746,11 @@ rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t
> exp, uint64_t src)
> * The original value at that location
> */
> static inline uint64_t
> -rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val);
> -
> -#ifdef RTE_FORCE_INTRINSICS
> -static inline uint64_t
> rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
> {
> - return rte_atomic_exchange_explicit(dst, val,
> rte_memory_order_seq_cst);
> + return rte_atomic_exchange_explicit((volatile __rte_atomic uint64_t
> *)dst,
> + val, rte_memory_order_seq_cst);
> }
> -#endif
>
> /**
> * The atomic counter structure.
> @@ -811,7 +773,6 @@ typedef struct {
> static inline void
> rte_atomic64_init(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_init(rte_atomic64_t *v)
> {
> @@ -828,7 +789,6 @@ rte_atomic64_init(rte_atomic64_t *v)
> }
> #endif
> }
> -#endif
>
> /**
> * Atomically read a 64-bit counter.
> @@ -841,7 +801,6 @@ rte_atomic64_init(rte_atomic64_t *v)
> static inline int64_t
> rte_atomic64_read(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int64_t
> rte_atomic64_read(rte_atomic64_t *v)
> {
> @@ -860,7 +819,6 @@ rte_atomic64_read(rte_atomic64_t *v)
> return tmp;
> #endif
> }
> -#endif
>
> /**
> * Atomically set a 64-bit counter.
> @@ -873,7 +831,6 @@ rte_atomic64_read(rte_atomic64_t *v)
> static inline void
> rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> {
> @@ -890,7 +847,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> }
> #endif
> }
> -#endif
>
> /**
> * Atomically add a 64-bit value to a counter.
> @@ -903,14 +859,12 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t
> new_value)
> static inline void
> rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> {
> rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt,
> inc,
> rte_memory_order_seq_cst);
> }
> -#endif
>
> /**
> * Atomically subtract a 64-bit value from a counter.
> @@ -923,14 +877,12 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> static inline void
> rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> {
> rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt,
> dec,
> rte_memory_order_seq_cst);
> }
> -#endif
>
> /**
> * Atomically increment a 64-bit counter by one and test.
> @@ -941,13 +893,11 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> static inline void
> rte_atomic64_inc(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_inc(rte_atomic64_t *v)
> {
> rte_atomic64_add(v, 1);
> }
> -#endif
>
> /**
> * Atomically decrement a 64-bit counter by one and test.
> @@ -958,13 +908,11 @@ rte_atomic64_inc(rte_atomic64_t *v)
> static inline void
> rte_atomic64_dec(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void
> rte_atomic64_dec(rte_atomic64_t *v)
> {
> rte_atomic64_sub(v, 1);
> }
> -#endif
>
> /**
> * Add a 64-bit value to an atomic counter and return the result.
> @@ -982,14 +930,12 @@ rte_atomic64_dec(rte_atomic64_t *v)
> static inline int64_t
> rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int64_t
> rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> {
> return rte_atomic_fetch_add_explicit((volatile __rte_atomic int64_t *)&v->cnt, inc,
> rte_memory_order_seq_cst) + inc;
> }
> -#endif
>
> /**
> * Subtract a 64-bit value from an atomic counter and return the result.
> @@ -1007,14 +953,12 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> static inline int64_t
> rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int64_t
> rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> {
> return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int64_t *)&v->cnt, dec,
> rte_memory_order_seq_cst) - dec;
> }
> -#endif
>
> /**
> * Atomically increment a 64-bit counter by one and test.
> @@ -1029,12 +973,10 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> */
> static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> {
> return rte_atomic64_add_return(v, 1) == 0;
> }
> -#endif
>
> /**
> * Atomically decrement a 64-bit counter by one and test.
> @@ -1049,12 +991,10 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> */
> static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> {
> return rte_atomic64_sub_return(v, 1) == 0;
> }
> -#endif
>
> /**
> * Atomically test and set a 64-bit atomic counter.
> @@ -1069,12 +1009,10 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> */
> static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> {
> return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
> }
> -#endif
>
> /**
> * Atomically set a 64-bit counter to 0.
> @@ -1084,12 +1022,10 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> */
> static inline void rte_atomic64_clear(rte_atomic64_t *v);
>
> -#ifdef RTE_FORCE_INTRINSICS
> static inline void rte_atomic64_clear(rte_atomic64_t *v)
> {
> rte_atomic64_set(v, 0);
> }
> -#endif
>
> #endif
>
> diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
> index c8066a4612..785a452c9e 100644
> --- a/lib/eal/loongarch/include/rte_atomic.h
> +++ b/lib/eal/loongarch/include/rte_atomic.h
> @@ -5,10 +5,6 @@
> #ifndef RTE_ATOMIC_LOONGARCH_H
> #define RTE_ATOMIC_LOONGARCH_H
>
> -#ifndef RTE_FORCE_INTRINSICS
> -# error Platform must be built with RTE_FORCE_INTRINSICS
> -#endif
> -
> #include <rte_common.h>
> #include "generic/rte_atomic.h"
>
> diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
> index 10acc238f9..64f4c3d670 100644
> --- a/lib/eal/ppc/include/rte_atomic.h
> +++ b/lib/eal/ppc/include/rte_atomic.h
> @@ -43,179 +43,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
> }
>
> /*------------------------- 16 bit atomic operations -------------------------*/
> -#ifndef RTE_FORCE_INTRINSICS
> -static inline int
> -rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
> -{
> - return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
> - rte_memory_order_acquire) ? 1 : 0;
> -}
> -
> -static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
> -{
> - return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void
> -rte_atomic16_inc(rte_atomic16_t *v)
> -{
> - rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline void
> -rte_atomic16_dec(rte_atomic16_t *v)
> -{
> - rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
> -{
> - return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
> -}
> -
> -static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
> -{
> - return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
> -}
> -
> -static inline uint16_t
> -rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
> -{
> - return __atomic_exchange_2(dst, val, rte_memory_order_seq_cst);
> -}
> -
> -/*------------------------- 32 bit atomic operations -------------------------*/
> -
> -static inline int
> -rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> -{
> - return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
> - rte_memory_order_acquire) ? 1 : 0;
> -}
> -
> -static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
> -{
> - return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void
> -rte_atomic32_inc(rte_atomic32_t *v)
> -{
> - rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline void
> -rte_atomic32_dec(rte_atomic32_t *v)
> -{
> - rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
> -{
> - return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
> -}
> -
> -static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
> -{
> - return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
> -}
> -
> -static inline uint32_t
> -rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
> -{
> - return __atomic_exchange_4(dst, val, rte_memory_order_seq_cst);
> -}
> -
> -/*------------------------- 64 bit atomic operations -------------------------*/
> -
> -static inline int
> -rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> -{
> - return rte_atomic_compare_exchange_strong_explicit(dst, &exp, src, rte_memory_order_acquire,
> - rte_memory_order_acquire) ? 1 : 0;
> -}
> -
> -static inline void
> -rte_atomic64_init(rte_atomic64_t *v)
> -{
> - v->cnt = 0;
> -}
> -
> -static inline int64_t
> -rte_atomic64_read(rte_atomic64_t *v)
> -{
> - return v->cnt;
> -}
> -
> -static inline void
> -rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> -{
> - v->cnt = new_value;
> -}
> -
> -static inline void
> -rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> -{
> - rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire);
> -}
> -
> -static inline void
> -rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> -{
> - rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire);
> -}
> -
> -static inline void
> -rte_atomic64_inc(rte_atomic64_t *v)
> -{
> - rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline void
> -rte_atomic64_dec(rte_atomic64_t *v)
> -{
> - rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire);
> -}
> -
> -static inline int64_t
> -rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> -{
> - return rte_atomic_fetch_add_explicit(&v->cnt, inc, rte_memory_order_acquire) + inc;
> -}
> -
> -static inline int64_t
> -rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> -{
> - return rte_atomic_fetch_sub_explicit(&v->cnt, dec, rte_memory_order_acquire) - dec;
> -}
> -
> -static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> -{
> - return rte_atomic_fetch_add_explicit(&v->cnt, 1, rte_memory_order_acquire) + 1 == 0;
> -}
> -
> -static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> -{
> - return rte_atomic_fetch_sub_explicit(&v->cnt, 1, rte_memory_order_acquire) - 1 == 0;
> -}
> -
> -static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> -{
> - return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void rte_atomic64_clear(rte_atomic64_t *v)
> -{
> - v->cnt = 0;
> -}
> -
> -static inline uint64_t
> -rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
> -{
> - return __atomic_exchange_8(dst, val, rte_memory_order_seq_cst);
> -}
> -
> -#endif
>
> #ifdef __cplusplus
> }
> diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
> index 66346ad474..061b175f33 100644
> --- a/lib/eal/riscv/include/rte_atomic.h
> +++ b/lib/eal/riscv/include/rte_atomic.h
> @@ -8,10 +8,6 @@
> #ifndef RTE_ATOMIC_RISCV_H
> #define RTE_ATOMIC_RISCV_H
>
> -#ifndef RTE_FORCE_INTRINSICS
> -# error Platform must be built with RTE_FORCE_INTRINSICS
> -#endif
> -
> #include <stdint.h>
> #include <rte_common.h>
> #include <rte_config.h>
> diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
> index e071e4234e..4f05302c9f 100644
> --- a/lib/eal/x86/include/rte_atomic.h
> +++ b/lib/eal/x86/include/rte_atomic.h
> @@ -111,178 +111,6 @@ rte_atomic_thread_fence(rte_memory_order memorder)
> extern "C" {
> #endif
>
> -#ifndef RTE_FORCE_INTRINSICS
> -static inline int
> -rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
> -{
> - uint8_t res;
> -
> - asm volatile(
> - MPLOCKED
> - "cmpxchgw %[src], %[dst];"
> - "sete %[res];"
> - : [res] "=a" (res), /* output */
> - [dst] "=m" (*dst)
> - : [src] "r" (src), /* input */
> - "a" (exp),
> - "m" (*dst)
> - : "memory"); /* no-clobber list */
> - return res;
> -}
> -
> -static inline uint16_t
> -rte_atomic16_exchange(volatile uint16_t *dst, uint16_t val)
> -{
> - asm volatile(
> - MPLOCKED
> - "xchgw %0, %1;"
> - : "=r" (val), "=m" (*dst)
> - : "0" (val), "m" (*dst)
> - : "memory"); /* no-clobber list */
> - return val;
> -}
> -
> -static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
> -{
> - return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void
> -rte_atomic16_inc(rte_atomic16_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "incw %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline void
> -rte_atomic16_dec(rte_atomic16_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "decw %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(
> - MPLOCKED
> - "incw %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> - return ret != 0;
> -}
> -
> -static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(MPLOCKED
> - "decw %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> - return ret != 0;
> -}
> -
> -/*------------------------- 32 bit atomic operations -------------------------*/
> -
> -static inline int
> -rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> -{
> - uint8_t res;
> -
> - asm volatile(
> - MPLOCKED
> - "cmpxchgl %[src], %[dst];"
> - "sete %[res];"
> - : [res] "=a" (res), /* output */
> - [dst] "=m" (*dst)
> - : [src] "r" (src), /* input */
> - "a" (exp),
> - "m" (*dst)
> - : "memory"); /* no-clobber list */
> - return res;
> -}
> -
> -static inline uint32_t
> -rte_atomic32_exchange(volatile uint32_t *dst, uint32_t val)
> -{
> - asm volatile(
> - MPLOCKED
> - "xchgl %0, %1;"
> - : "=r" (val), "=m" (*dst)
> - : "0" (val), "m" (*dst)
> - : "memory"); /* no-clobber list */
> - return val;
> -}
> -
> -static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
> -{
> - return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void
> -rte_atomic32_inc(rte_atomic32_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "incl %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline void
> -rte_atomic32_dec(rte_atomic32_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "decl %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(
> - MPLOCKED
> - "incl %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> - return ret != 0;
> -}
> -
> -static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(MPLOCKED
> - "decl %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> - return ret != 0;
> -}
> -
> -#endif /* !RTE_FORCE_INTRINSICS */
>
> #ifdef __cplusplus
> }
> diff --git a/lib/eal/x86/include/rte_atomic_32.h b/lib/eal/x86/include/rte_atomic_32.h
> index 0f25863aa5..37d139f30d 100644
> --- a/lib/eal/x86/include/rte_atomic_32.h
> +++ b/lib/eal/x86/include/rte_atomic_32.h
> @@ -20,193 +20,5 @@
>
> /*------------------------- 64 bit atomic operations -------------------------*/
>
> -#ifndef RTE_FORCE_INTRINSICS
> -static inline int
> -rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> -{
> - uint8_t res;
> - union {
> - struct {
> - uint32_t l32;
> - uint32_t h32;
> - };
> - uint64_t u64;
> - } _exp, _src;
> -
> - _exp.u64 = exp;
> - _src.u64 = src;
> -
> -#ifndef __PIC__
> - asm volatile (
> - MPLOCKED
> - "cmpxchg8b (%[dst]);"
> - "setz %[res];"
> - : [res] "=a" (res) /* result in eax */
> - : [dst] "S" (dst), /* esi */
> - "b" (_src.l32), /* ebx */
> - "c" (_src.h32), /* ecx */
> - "a" (_exp.l32), /* eax */
> - "d" (_exp.h32) /* edx */
> - : "memory" ); /* no-clobber list */
> -#else
> - asm volatile (
> - "xchgl %%ebx, %%edi;\n"
> - MPLOCKED
> - "cmpxchg8b (%[dst]);"
> - "setz %[res];"
> - "xchgl %%ebx, %%edi;\n"
> - : [res] "=a" (res) /* result in eax */
> - : [dst] "S" (dst), /* esi */
> - "D" (_src.l32), /* ebx */
> - "c" (_src.h32), /* ecx */
> - "a" (_exp.l32), /* eax */
> - "d" (_exp.h32) /* edx */
> - : "memory" ); /* no-clobber list */
> -#endif
> -
> - return res;
> -}
> -
> -static inline uint64_t
> -rte_atomic64_exchange(volatile uint64_t *dest, uint64_t val)
> -{
> - uint64_t old;
> -
> - do {
> - old = *dest;
> - } while (rte_atomic64_cmpset(dest, old, val) == 0);
> -
> - return old;
> -}
> -
> -static inline void
> -rte_atomic64_init(rte_atomic64_t *v)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, 0);
> - }
> -}
> -
> -static inline int64_t
> -rte_atomic64_read(rte_atomic64_t *v)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - /* replace the value by itself */
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, tmp);
> - }
> - return tmp;
> -}
> -
> -static inline void
> -rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, new_value);
> - }
> -}
> -
> -static inline void
> -rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, tmp + inc);
> - }
> -}
> -
> -static inline void
> -rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, tmp - dec);
> - }
> -}
> -
> -static inline void
> -rte_atomic64_inc(rte_atomic64_t *v)
> -{
> - rte_atomic64_add(v, 1);
> -}
> -
> -static inline void
> -rte_atomic64_dec(rte_atomic64_t *v)
> -{
> - rte_atomic64_sub(v, 1);
> -}
> -
> -static inline int64_t
> -rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, tmp + inc);
> - }
> -
> - return tmp + inc;
> -}
> -
> -static inline int64_t
> -rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> -{
> - int success = 0;
> - uint64_t tmp;
> -
> - while (success == 0) {
> - tmp = v->cnt;
> - success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
> - tmp, tmp - dec);
> - }
> -
> - return tmp - dec;
> -}
> -
> -static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> -{
> - return rte_atomic64_add_return(v, 1) == 0;
> -}
> -
> -static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> -{
> - return rte_atomic64_sub_return(v, 1) == 0;
> -}
> -
> -static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> -{
> - return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void rte_atomic64_clear(rte_atomic64_t *v)
> -{
> - rte_atomic64_set(v, 0);
> -}
> -#endif
>
> #endif /* _RTE_ATOMIC_I686_H_ */
> diff --git a/lib/eal/x86/include/rte_atomic_64.h b/lib/eal/x86/include/rte_atomic_64.h
> index 0a7a2131e0..1cd12695a2 100644
> --- a/lib/eal/x86/include/rte_atomic_64.h
> +++ b/lib/eal/x86/include/rte_atomic_64.h
> @@ -22,163 +22,6 @@
>
> /*------------------------- 64 bit atomic operations -------------------------*/
>
> -#ifndef RTE_FORCE_INTRINSICS
> -static inline int
> -rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> -{
> - uint8_t res;
> -
> -
> - asm volatile(
> - MPLOCKED
> - "cmpxchgq %[src], %[dst];"
> - "sete %[res];"
> - : [res] "=a" (res), /* output */
> - [dst] "=m" (*dst)
> - : [src] "r" (src), /* input */
> - "a" (exp),
> - "m" (*dst)
> - : "memory"); /* no-clobber list */
> -
> - return res;
> -}
> -
> -static inline uint64_t
> -rte_atomic64_exchange(volatile uint64_t *dst, uint64_t val)
> -{
> - asm volatile(
> - MPLOCKED
> - "xchgq %0, %1;"
> - : "=r" (val), "=m" (*dst)
> - : "0" (val), "m" (*dst)
> - : "memory"); /* no-clobber list */
> - return val;
> -}
> -
> -static inline void
> -rte_atomic64_init(rte_atomic64_t *v)
> -{
> - v->cnt = 0;
> -}
> -
> -static inline int64_t
> -rte_atomic64_read(rte_atomic64_t *v)
> -{
> - return v->cnt;
> -}
> -
> -static inline void
> -rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> -{
> - v->cnt = new_value;
> -}
> -
> -static inline void
> -rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> -{
> - asm volatile(
> - MPLOCKED
> - "addq %[inc], %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : [inc] "ir" (inc), /* input */
> - "m" (v->cnt)
> - );
> -}
> -
> -static inline void
> -rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> -{
> - asm volatile(
> - MPLOCKED
> - "subq %[dec], %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : [dec] "ir" (dec), /* input */
> - "m" (v->cnt)
> - );
> -}
> -
> -static inline void
> -rte_atomic64_inc(rte_atomic64_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "incq %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline void
> -rte_atomic64_dec(rte_atomic64_t *v)
> -{
> - asm volatile(
> - MPLOCKED
> - "decq %[cnt]"
> - : [cnt] "=m" (v->cnt) /* output */
> - : "m" (v->cnt) /* input */
> - );
> -}
> -
> -static inline int64_t
> -rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> -{
> - int64_t prev = inc;
> -
> - asm volatile(
> - MPLOCKED
> - "xaddq %[prev], %[cnt]"
> - : [prev] "+r" (prev), /* output */
> - [cnt] "=m" (v->cnt)
> - : "m" (v->cnt) /* input */
> - );
> - return prev + inc;
> -}
> -
> -static inline int64_t
> -rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> -{
> - return rte_atomic64_add_return(v, -dec);
> -}
> -
> -static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(
> - MPLOCKED
> - "incq %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> -
> - return ret != 0;
> -}
> -
> -static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> -{
> - uint8_t ret;
> -
> - asm volatile(
> - MPLOCKED
> - "decq %[cnt] ; "
> - "sete %[ret]"
> - : [cnt] "+m" (v->cnt), /* output */
> - [ret] "=qm" (ret)
> - );
> - return ret != 0;
> -}
> -
> -static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> -{
> - return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
> -}
> -
> -static inline void rte_atomic64_clear(rte_atomic64_t *v)
> -{
> - v->cnt = 0;
> -}
> -#endif
>
> /*------------------------ 128 bit atomic operations -------------------------*/
>
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 2.53.0
^ permalink raw reply [flat|nested] 105+ messages in thread
* RE: [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence
2026-05-26 23:23 ` [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
@ 2026-06-01 18:24 ` Konstantin Ananyev
0 siblings, 0 replies; 105+ messages in thread
From: Konstantin Ananyev @ 2026-06-01 18:24 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Cc: Thomas Monjalon, Wathsala Vithanage, Bibo Mao, David Christensen,
Sun Yuechi, Bruce Richardson
> The rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb() functions were
> flagged as deprecated by commit 3ec965b6de12 ("doc: update atomic
> operation deprecation") in 2021 but nothing came of it.
>
> Reimplement them as inline wrappers over rte_atomic_thread_fence()
> and drop the deprecation notice.
> The API is preserved; only the implementation changes.
>
> The wrappers provide stronger guarantees than the previous code
> because there is no C11 equivalent to the old rte_smp_wmb().
> Generated code is unchanged on x86; on arm64,
> release/acquire emit dmb ish instead of dmb ishst/ishld;
> the difference is below measurement noise.
>
> Drop the checkpatch restriction on rte_smp_*mb since these functions
> are no longer on a deprecation cycle.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> devtools/checkpatches.sh | 8 --
> doc/guides/rel_notes/deprecation.rst | 8 --
> lib/eal/arm/include/rte_atomic_32.h | 6 --
> lib/eal/arm/include/rte_atomic_64.h | 6 --
> lib/eal/include/generic/rte_atomic.h | 130 +++++--------------------
> lib/eal/loongarch/include/rte_atomic.h | 6 --
> lib/eal/ppc/include/rte_atomic.h | 6 --
> lib/eal/riscv/include/rte_atomic.h | 6 --
> lib/eal/x86/include/rte_atomic.h | 33 +++----
> 9 files changed, 37 insertions(+), 172 deletions(-)
>
> diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> index f5dd77443f..81bb0fe4e8 100755
> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> @@ -121,14 +121,6 @@ check_forbidden_additions() { # <patch>
> -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
> "$1" || res=1
>
> - # refrain from new additions of rte_smp_[r/w]mb()
> - awk -v FOLDERS="lib drivers app examples" \
> - -v EXPRESSIONS="rte_smp_(r|w)?mb\\\(" \
> - -v RET_ON_FAIL=1 \
> - -v MESSAGE='Using rte_smp_[r/w]mb' \
> - -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
> - "$1" || res=1
> -
> # refrain from using compiler __sync_xxx builtins
> awk -v FOLDERS="lib drivers app examples" \
> -v EXPRESSIONS="__sync_.*\\\(" \
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 35c9b4e06c..2190419f79 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -47,14 +47,6 @@ Deprecation Notices
> operations must be used for patches that need to be merged in 20.08 onwards.
> This change will not introduce any performance degradation.
>
> -* rte_smp_*mb: These APIs provide full barrier functionality. However, many
> - use cases do not require full barriers. To support such use cases, DPDK has
> - adopted atomic operations from
> - https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html. These
> - operations and a new wrapper ``rte_atomic_thread_fence`` instead of
> - ``__atomic_thread_fence`` must be used for patches that need to be merged in
> - 20.08 onwards. This change will not introduce any performance degradation.
> -
> * lib: will fix extending some enum/define breaking the ABI. There are multiple
> samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
> used by iterators, and arrays holding these values are sized with this
> diff --git a/lib/eal/arm/include/rte_atomic_32.h b/lib/eal/arm/include/rte_atomic_32.h
> index 696a539fef..4115271091 100644
> --- a/lib/eal/arm/include/rte_atomic_32.h
> +++ b/lib/eal/arm/include/rte_atomic_32.h
> @@ -17,12 +17,6 @@ extern "C" {
>
> #define rte_rmb() __sync_synchronize()
>
> -#define rte_smp_mb() rte_mb()
> -
> -#define rte_smp_wmb() rte_wmb()
> -
> -#define rte_smp_rmb() rte_rmb()
> -
> #define rte_io_mb() rte_mb()
>
> #define rte_io_wmb() rte_wmb()
> diff --git a/lib/eal/arm/include/rte_atomic_64.h b/lib/eal/arm/include/rte_atomic_64.h
> index 9f790238df..604e777bcd 100644
> --- a/lib/eal/arm/include/rte_atomic_64.h
> +++ b/lib/eal/arm/include/rte_atomic_64.h
> @@ -20,12 +20,6 @@ extern "C" {
>
> #define rte_rmb() asm volatile("dmb oshld" : : : "memory")
>
> -#define rte_smp_mb() asm volatile("dmb ish" : : : "memory")
> -
> -#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
> -
> -#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")
> -
> #define rte_io_mb() rte_mb()
>
> #define rte_io_wmb() rte_wmb()
> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> index 292e52fade..1b04b43cbb 100644
> --- a/lib/eal/include/generic/rte_atomic.h
> +++ b/lib/eal/include/generic/rte_atomic.h
> @@ -59,55 +59,25 @@ static inline void rte_rmb(void);
> *
> * Guarantees that the LOAD and STORE operations that precede the
> * rte_smp_mb() call are globally visible across the lcores
> - * before the LOAD and STORE operations that follows it.
> - *
> - * @note
> - * This function is deprecated.
> - * It provides similar synchronization primitive as atomic fence,
> - * but has different syntax and memory ordering semantic. Hence
> - * deprecated for the simplicity of memory ordering semantics in use.
> - *
> - * rte_atomic_thread_fence(rte_memory_order_acq_rel) should be used instead.
> + * before the LOAD and STORE operations that follow it.
> */
> static inline void rte_smp_mb(void);
>
> /**
> * Write memory barrier between lcores
> *
> - * Guarantees that the STORE operations that precede the
> - * rte_smp_wmb() call are globally visible across the lcores
> - * before the STORE operations that follows it.
> - *
> - * @note
> - * This function is deprecated.
> - * It provides similar synchronization primitive as atomic fence,
> - * but has different syntax and memory ordering semantic. Hence
> - * deprecated for the simplicity of memory ordering semantics in use.
> - *
> - * rte_atomic_thread_fence(rte_memory_order_release) should be used instead.
> - * The fence also guarantees LOAD operations that precede the call
> - * are globally visible across the lcores before the STORE operations
> - * that follows it.
> + * Guarantees that the LOAD and STORE operations that precede the
> + * rte_smp_wmb() call are globally visible across the lcores before
> + * any STORE operations that follow it.
> */
> static inline void rte_smp_wmb(void);
>
> /**
> * Read memory barrier between lcores
> *
> - * Guarantees that the LOAD operations that precede the
> - * rte_smp_rmb() call are globally visible across the lcores
> - * before the LOAD operations that follows it.
> - *
> - * @note
> - * This function is deprecated.
> - * It provides similar synchronization primitive as atomic fence,
> - * but has different syntax and memory ordering semantic. Hence
> - * deprecated for the simplicity of memory ordering semantics in use.
> - *
> - * rte_atomic_thread_fence(rte_memory_order_acquire) should be used instead.
> - * The fence also guarantees LOAD operations that precede the call
> - * are globally visible across the lcores before the STORE operations
> - * that follows it.
> + * Guarantees that any LOAD operations that precede the rte_smp_rmb()
> + * call complete before LOAD and STORE operations that follow it
> + * become globally visible.
> */
> static inline void rte_smp_rmb(void);
> ///@}
> @@ -164,6 +134,24 @@ static inline void rte_io_rmb(void);
> */
> static inline void rte_atomic_thread_fence(rte_memory_order memorder);
>
> +static __rte_always_inline void
> +rte_smp_mb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_seq_cst);
> +}
> +
> +static __rte_always_inline void
> +rte_smp_wmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_release);
> +}
> +
> +static __rte_always_inline void
> +rte_smp_rmb(void)
> +{
> + rte_atomic_thread_fence(rte_memory_order_acquire);
> +}
> +
> /*------------------------- 16 bit atomic operations -------------------------*/
>
> #ifndef RTE_TOOLCHAIN_MSVC
> @@ -184,9 +172,6 @@ static inline void rte_atomic_thread_fence(rte_memory_order memorder);
> * @return
> * Non-zero on success; 0 on failure.
> */
> -static inline int
> -rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src);
> -
> static inline int
> rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
> {
> @@ -303,9 +288,6 @@ rte_atomic16_sub(rte_atomic16_t *v, int16_t dec)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic16_inc(rte_atomic16_t *v);
> -
> static inline void
> rte_atomic16_inc(rte_atomic16_t *v)
> {
> @@ -318,9 +300,6 @@ rte_atomic16_inc(rte_atomic16_t *v)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic16_dec(rte_atomic16_t *v);
> -
> static inline void
> rte_atomic16_dec(rte_atomic16_t *v)
> {
> @@ -379,8 +358,6 @@ rte_atomic16_sub_return(rte_atomic16_t *v, int16_t dec)
> * @return
> * True if the result after the increment operation is 0; false otherwise.
> */
> -static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v);
> -
> static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
> {
> return rte_atomic_fetch_add_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
> @@ -398,8 +375,6 @@ static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
> * @return
> * True if the result after the decrement operation is 0; false otherwise.
> */
> -static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v);
> -
> static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
> {
> return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int16_t *)&v->cnt, 1,
> @@ -417,8 +392,6 @@ static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
> * @return
> * 0 if failed; else 1, success.
> */
> -static inline int rte_atomic16_test_and_set(rte_atomic16_t *v);
> -
> static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
> {
> return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
> @@ -453,9 +426,6 @@ static inline void rte_atomic16_clear(rte_atomic16_t *v)
> * @return
> * Non-zero on success; 0 on failure.
> */
> -static inline int
> -rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src);
> -
> static inline int
> rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> {
> @@ -572,9 +542,6 @@ rte_atomic32_sub(rte_atomic32_t *v, int32_t dec)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic32_inc(rte_atomic32_t *v);
> -
> static inline void
> rte_atomic32_inc(rte_atomic32_t *v)
> {
> @@ -587,9 +554,6 @@ rte_atomic32_inc(rte_atomic32_t *v)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic32_dec(rte_atomic32_t *v);
> -
> static inline void
> rte_atomic32_dec(rte_atomic32_t *v)
> {
> @@ -648,8 +612,6 @@ rte_atomic32_sub_return(rte_atomic32_t *v, int32_t dec)
> * @return
> * True if the result after the increment operation is 0; false otherwise.
> */
> -static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v);
> -
> static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
> {
> return rte_atomic_fetch_add_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
> @@ -667,8 +629,6 @@ static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
> * @return
> * True if the result after the decrement operation is 0; false otherwise.
> */
> -static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v);
> -
> static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
> {
> return rte_atomic_fetch_sub_explicit((volatile __rte_atomic int32_t *)&v->cnt, 1,
> @@ -686,8 +646,6 @@ static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
> * @return
> * 0 if failed; else 1, success.
> */
> -static inline int rte_atomic32_test_and_set(rte_atomic32_t *v);
> -
> static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
> {
> return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
> @@ -721,9 +679,6 @@ static inline void rte_atomic32_clear(rte_atomic32_t *v)
> * @return
> * Non-zero on success; 0 on failure.
> */
> -static inline int
> -rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src);
> -
> static inline int
> rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
> {
> @@ -770,9 +725,6 @@ typedef struct {
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic64_init(rte_atomic64_t *v);
> -
> static inline void
> rte_atomic64_init(rte_atomic64_t *v)
> {
> @@ -798,9 +750,6 @@ rte_atomic64_init(rte_atomic64_t *v)
> * @return
> * The value of the counter.
> */
> -static inline int64_t
> -rte_atomic64_read(rte_atomic64_t *v);
> -
> static inline int64_t
> rte_atomic64_read(rte_atomic64_t *v)
> {
> @@ -828,9 +777,6 @@ rte_atomic64_read(rte_atomic64_t *v)
> * @param new_value
> * The new value of the counter.
> */
> -static inline void
> -rte_atomic64_set(rte_atomic64_t *v, int64_t new_value);
> -
> static inline void
> rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> {
> @@ -856,9 +802,6 @@ rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
> * @param inc
> * The value to be added to the counter.
> */
> -static inline void
> -rte_atomic64_add(rte_atomic64_t *v, int64_t inc);
> -
> static inline void
> rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> {
> @@ -874,9 +817,6 @@ rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
> * @param dec
> * The value to be subtracted from the counter.
> */
> -static inline void
> -rte_atomic64_sub(rte_atomic64_t *v, int64_t dec);
> -
> static inline void
> rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> {
> @@ -890,9 +830,6 @@ rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic64_inc(rte_atomic64_t *v);
> -
> static inline void
> rte_atomic64_inc(rte_atomic64_t *v)
> {
> @@ -905,9 +842,6 @@ rte_atomic64_inc(rte_atomic64_t *v)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void
> -rte_atomic64_dec(rte_atomic64_t *v);
> -
> static inline void
> rte_atomic64_dec(rte_atomic64_t *v)
> {
> @@ -927,9 +861,6 @@ rte_atomic64_dec(rte_atomic64_t *v)
> * @return
> * The value of v after the addition.
> */
> -static inline int64_t
> -rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc);
> -
> static inline int64_t
> rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> {
> @@ -950,9 +881,6 @@ rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
> * @return
> * The value of v after the subtraction.
> */
> -static inline int64_t
> -rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec);
> -
> static inline int64_t
> rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> {
> @@ -971,8 +899,6 @@ rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
> * @return
> * True if the result after the addition is 0; false otherwise.
> */
> -static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v);
> -
> static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> {
> return rte_atomic64_add_return(v, 1) == 0;
> @@ -989,8 +915,6 @@ static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
> * @return
> * True if the result after subtraction is 0; false otherwise.
> */
> -static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v);
> -
> static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> {
> return rte_atomic64_sub_return(v, 1) == 0;
> @@ -1007,8 +931,6 @@ static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
> * @return
> * 0 if failed; else 1, success.
> */
> -static inline int rte_atomic64_test_and_set(rte_atomic64_t *v);
> -
> static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> {
> return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
> @@ -1020,8 +942,6 @@ static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
> * @param v
> * A pointer to the atomic counter.
> */
> -static inline void rte_atomic64_clear(rte_atomic64_t *v);
> -
> static inline void rte_atomic64_clear(rte_atomic64_t *v)
> {
> rte_atomic64_set(v, 0);
> diff --git a/lib/eal/loongarch/include/rte_atomic.h b/lib/eal/loongarch/include/rte_atomic.h
> index 785a452c9e..a789e3ab4d 100644
> --- a/lib/eal/loongarch/include/rte_atomic.h
> +++ b/lib/eal/loongarch/include/rte_atomic.h
> @@ -18,12 +18,6 @@ extern "C" {
>
> #define rte_rmb() rte_mb()
>
> -#define rte_smp_mb() rte_mb()
> -
> -#define rte_smp_wmb() rte_mb()
> -
> -#define rte_smp_rmb() rte_mb()
> -
> #define rte_io_mb() rte_mb()
>
> #define rte_io_wmb() rte_mb()
> diff --git a/lib/eal/ppc/include/rte_atomic.h b/lib/eal/ppc/include/rte_atomic.h
> index 64f4c3d670..0e64db2a35 100644
> --- a/lib/eal/ppc/include/rte_atomic.h
> +++ b/lib/eal/ppc/include/rte_atomic.h
> @@ -24,12 +24,6 @@ extern "C" {
>
> #define rte_rmb() asm volatile("sync" : : : "memory")
>
> -#define rte_smp_mb() rte_mb()
> -
> -#define rte_smp_wmb() rte_wmb()
> -
> -#define rte_smp_rmb() rte_rmb()
> -
> #define rte_io_mb() rte_mb()
>
> #define rte_io_wmb() rte_wmb()
> diff --git a/lib/eal/riscv/include/rte_atomic.h b/lib/eal/riscv/include/rte_atomic.h
> index 061b175f33..04c40e4e9b 100644
> --- a/lib/eal/riscv/include/rte_atomic.h
> +++ b/lib/eal/riscv/include/rte_atomic.h
> @@ -23,12 +23,6 @@ extern "C" {
>
> #define rte_rmb() asm volatile("fence r, r" : : : "memory")
>
> -#define rte_smp_mb() rte_mb()
> -
> -#define rte_smp_wmb() rte_wmb()
> -
> -#define rte_smp_rmb() rte_rmb()
> -
> #define rte_io_mb() asm volatile("fence iorw, iorw" : : : "memory")
>
> #define rte_io_wmb() asm volatile("fence orw, ow" : : : "memory")
> diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
> index 4f05302c9f..f4d39ce4fe 100644
> --- a/lib/eal/x86/include/rte_atomic.h
> +++ b/lib/eal/x86/include/rte_atomic.h
> @@ -23,10 +23,6 @@
>
> #define rte_rmb() _mm_lfence()
>
> -#define rte_smp_wmb() rte_compiler_barrier()
> -
> -#define rte_smp_rmb() rte_compiler_barrier()
> -
> #ifdef __cplusplus
> extern "C" {
> #endif
> @@ -63,20 +59,6 @@ extern "C" {
> * So below we use that technique for rte_smp_mb() implementation.
> */
>
> -static __rte_always_inline void
> -rte_smp_mb(void)
> -{
> -#ifdef RTE_TOOLCHAIN_MSVC
> - _mm_mfence();
> -#else
> -#ifdef RTE_ARCH_I686
> - asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
> -#else
> - asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
> -#endif
> -#endif
> -}
> -
> #define rte_io_mb() rte_mb()
>
> #define rte_io_wmb() rte_compiler_barrier()
> @@ -93,10 +75,19 @@ rte_smp_mb(void)
> static __rte_always_inline void
> rte_atomic_thread_fence(rte_memory_order memorder)
> {
> - if (memorder == rte_memory_order_seq_cst)
> - rte_smp_mb();
> - else
> + if (memorder == rte_memory_order_seq_cst) {
> +#ifdef RTE_TOOLCHAIN_MSVC
> + _mm_mfence();
> +#else
> +#ifdef RTE_ARCH_I686
> + asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
> +#else
> + asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
> +#endif
> +#endif
> + } else {
> __rte_atomic_thread_fence(memorder);
> + }
> }
>
> #ifdef __cplusplus
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 2.53.0
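The hunks above drop the per-architecture rte_smp_*mb() definitions so
that one generic definition can sit on top of rte_atomic_thread_fence().
A minimal sketch of that idea in plain C11 follows. The sketch_* names are
hypothetical, and the memory order picked for each wrapper (seq_cst,
release, acquire) is an assumption made for illustration, not text copied
from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_smp_mb/wmb/rmb. The ordering chosen for
 * each wrapper is an assumption, not taken from the patch itself. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Message passing: plain payload guarded by an atomic flag. */
static uint32_t payload;
static atomic_uint ready;

static void sketch_publish(uint32_t value)
{
	payload = value;
	/* Order the payload write before the flag write. */
	sketch_smp_wmb();
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Returns 1 and fills *out if a value has been published, else 0. */
static int sketch_consume(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	/* Order the flag read before the payload read. */
	sketch_smp_rmb();
	*out = payload;
	return 1;
}
```

Callers keep the familiar barrier names, while the ordering is expressed
through the C11 fence rather than per-architecture asm.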
* Re: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
2026-06-01 18:18 ` Konstantin Ananyev
@ 2026-06-01 21:05 ` Stephen Hemminger
2026-06-01 21:18 ` Stephen Hemminger
1 sibling, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-06-01 21:05 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev@dpdk.org, Wathsala Vithanage
On Mon, 1 Jun 2026 18:18:18 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > + /* check that we have enough room in ring */
> > + if (unlikely(n > *entries))
> > + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> > +
> > + if (n > 0) {
> > + *new_head = *old_head + n;
> > + d->head = *new_head;
>
> There is a bit of inconsistency with the 'load' operation above:
> If we use atomic_load(&d->head, ...) then it would be better to use
> atomic_store(&d->head,..., order_relaxed) here.
This is the single-thread case, so I am not sure atomic_store is needed.
The old code didn't use one.
There is some confusion in the ST path.
It is used in two different contexts, SP and SC.
For move_head, it is SC that matters and the consumer moves the head;
but with multiple producers (MP) the tail is also examined to
determine space.
* Re: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
2026-06-01 18:18 ` Konstantin Ananyev
2026-06-01 21:05 ` Stephen Hemminger
@ 2026-06-01 21:18 ` Stephen Hemminger
1 sibling, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-06-01 21:18 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev@dpdk.org, Wathsala Vithanage
On Mon, 1 Jun 2026 18:18:18 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > +
> > + /* check that we have enough room in ring */
> > + if (unlikely(n > *entries))
> > + n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> > +
> > + if (n > 0) {
> > + *new_head = *old_head + n;
> > + d->head = *new_head;
>
> There is a bit of inconsistency with the 'load' operation above:
> If we use atomic_load(&d->head, ...) then it would be better to use
> atomic_store(&d->head,..., order_relaxed) here.
>
Will switch to atomic_store with relaxed ordering.
It generates the same code as a plain store on x86 and ARM,
and it keeps C11 checkers happy.
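
For reference, a minimal sketch of the single-thread head move with the
load and the store both expressed as relaxed C11 atomics. The struct and
function names are hypothetical, the layout is cut down from the real
ring, and the free-entry count is passed in by the caller instead of
being computed from the opposite tail.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, cut-down head/tail pair; not the real rte_ring layout. */
struct sketch_headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Single-thread head move: reserve up to n slots out of 'entries' free
 * ones. A non-zero 'fixed' mimics RTE_RING_QUEUE_FIXED (all or nothing).
 * Returns the number of slots actually reserved.
 */
static uint32_t
sketch_move_head_st(struct sketch_headtail *d, uint32_t n, uint32_t entries,
		    int fixed, uint32_t *old_head, uint32_t *new_head)
{
	/* Only this thread writes head, so relaxed ordering is enough. */
	*old_head = atomic_load_explicit(&d->head, memory_order_relaxed);

	/* check that we have enough room in ring */
	if (n > entries)
		n = fixed ? 0 : entries;

	if (n > 0) {
		*new_head = *old_head + n;
		/* Relaxed store: pairs with the relaxed load above. */
		atomic_store_explicit(&d->head, *new_head,
				      memory_order_relaxed);
	}
	return n;
}
```

Keeping both accesses atomic means the head field is never touched by a
mix of atomic and plain accesses.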
* Re: [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32
2026-05-26 23:23 ` [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32 Stephen Hemminger
2026-06-01 18:18 ` Konstantin Ananyev
@ 2026-06-01 22:07 ` Stephen Hemminger
1 sibling, 0 replies; 105+ messages in thread
From: Stephen Hemminger @ 2026-06-01 22:07 UTC (permalink / raw)
To: dev; +Cc: Konstantin Ananyev, Wathsala Vithanage
On Tue, 26 May 2026 16:23:53 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:
> Remove the RTE_USE_C11_MEM_MODEL build switch; C11 atomics are now
> the default for all platforms. Unifies __rte_ring_update_tail into
> the C11 form (atomic_store_release replaces the older rte_smp_wmb +
> plain store on the generic path) and renames rte_ring_generic_pvt.h
> to rte_ring_x86_pvt.h to reflect its new scope.
>
> Also splits the head-move helper into separate ST and MT variants,
> removing the runtime is_st branch from the MT retry loop.
> This gives a small boost and scopes the following exception
> more tightly.
>
> Exception: on x86 with GCC, atomic_compare_exchange on the head CAS
> regresses MP/MC contended throughput by ~20% versus the existing
> hand-written cmpxchg. As a workaround, GCC-on-x86 builds use the older
> __sync_bool_compare_and_swap builtin, which generates equivalent
> code to the original asm. This can be reverted if/when GCC is
> fixed; a similar issue was observed in the Linux kernel.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
This is a big enough change that I am going to split it out into its own series.
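
To make the two compare-and-swap flavours in the commit message concrete,
here is a rough sketch of a multi-thread head move that can be built
either way. Everything named sketch_* or SKETCH_* is hypothetical, the
GCC/Clang __atomic builtin stands in for the C11 call, the memory orders
are illustrative, and the free-space check is left out to keep the retry
loop visible.

```c
#include <stdint.h>

/*
 * Hypothetical switch: define SKETCH_USE_SYNC_CAS to take the legacy
 * builtin path. The series itself selects on compiler and architecture.
 * On failure, both paths leave the currently observed head in *expected.
 */
static inline int
sketch_head_cas(uint32_t *head, uint32_t *expected, uint32_t desired)
{
#ifdef SKETCH_USE_SYNC_CAS
	/* Legacy builtin: full barrier, does not report the observed value. */
	if (__sync_bool_compare_and_swap(head, *expected, desired))
		return 1;
	*expected = __atomic_load_n(head, __ATOMIC_RELAXED);
	return 0;
#else
	/* Strong CAS; *expected is refreshed by the builtin on failure. */
	return __atomic_compare_exchange_n(head, expected, desired, 0,
					   __ATOMIC_ACQUIRE,
					   __ATOMIC_RELAXED);
#endif
}

/*
 * Multi-thread head move: claim n slots by advancing head with a CAS
 * retry loop. Returns the new head and stores the claimed start in
 * *old_head.
 */
static uint32_t
sketch_move_head_mt(uint32_t *head, uint32_t n, uint32_t *old_head)
{
	uint32_t new_head;

	*old_head = __atomic_load_n(head, __ATOMIC_RELAXED);
	do {
		/* Recomputed each time, since a failed CAS updates *old_head. */
		new_head = *old_head + n;
	} while (!sketch_head_cas(head, old_head, new_head));

	return new_head;
}
```

Hiding the choice behind one helper keeps the retry loop identical for
both builds, so the workaround can be dropped by deleting a single
branch.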
Thread overview: 105+ messages
2026-05-21 4:17 [RFC 0/7] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
2026-05-21 4:17 ` [RFC 1/7] doc: update versions in deprecation file Stephen Hemminger
2026-05-21 4:17 ` [RFC 2/7] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
2026-05-21 15:43 ` Wathsala Vithanage
2026-05-21 4:17 ` [RFC 3/7] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
2026-05-21 15:57 ` Wathsala Vithanage
2026-05-21 4:17 ` [RFC 4/7] net/zxdh: work around GCC bitfield uninit false positive Stephen Hemminger
2026-05-21 4:17 ` [RFC 5/7] net/bonding: use stdatomic Stephen Hemminger
2026-05-21 4:17 ` [RFC 6/7] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
2026-05-21 4:17 ` [RFC 7/7] config: use RTE_FORCE_INTRINSICS on all platforms Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 01/11] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 02/11] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 03/11] ring: use C11 atomic operations for MP/SP head/tail Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 04/11] net/bonding: use stdatomic Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 05/11] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 06/11] net/ena: replace use of rte_atomicNN Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 07/11] net/failsafe: convert to stdatomic Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 08/11] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 09/11] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 10/11] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
2026-05-21 18:04 ` [RFC v2 11/11] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
2026-05-22 14:19 ` [RFC v2 00/11] prepare deprecation of rte_atomicNN_*() family Bruce Richardson
2026-05-22 14:45 ` Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
2026-05-25 7:41 ` Konstantin Ananyev
2026-05-25 14:31 ` Stephen Hemminger
2026-05-25 15:35 ` Stephen Hemminger
2026-05-25 15:47 ` Morten Brørup
2026-05-23 19:16 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
2026-05-23 19:16 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 03/27] ring: use compare-and-swap wrapper Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 04/27] bpf: replace atomic op macro with typed helpers Stephen Hemminger
2026-05-25 10:49 ` Marat Khalili
2026-05-23 19:56 ` [PATCH v3 05/27] net/bonding: use stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 14/27] drivers: " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 18/27] common/dpaax: remove unused atomic macros Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 23/27] net/txgbe: " Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
2026-05-23 19:56 ` [PATCH v3 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 00/27] deprecate rte_atomicNN family Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 01/27] eal: use intrinsics for rte_atomic on all platforms Stephen Hemminger
2026-06-01 18:23 ` Konstantin Ananyev
2026-05-26 23:23 ` [PATCH v4 02/27] eal: reimplement rte_smp_*mb with rte_atomic_thread_fence Stephen Hemminger
2026-06-01 18:24 ` Konstantin Ananyev
2026-05-26 23:23 ` [PATCH v4 03/27] ring: unify memory model on C11, remove atomic32 Stephen Hemminger
2026-06-01 18:18 ` Konstantin Ananyev
2026-06-01 21:05 ` Stephen Hemminger
2026-06-01 21:18 ` Stephen Hemminger
2026-06-01 22:07 ` Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 04/27] bpf: use C11 atomics in BPF_ST_ATOMIC_REG Stephen Hemminger
2026-05-27 16:52 ` Marat Khalili
2026-05-26 23:23 ` [PATCH v4 05/27] net/bonding: use stdatomic Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 06/27] net/nbl: remove unused rte_atomic16 field Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 07/27] net/ena: replace use of rte_atomicNN Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 08/27] net/failsafe: convert to stdatomic Stephen Hemminger
2026-05-26 23:23 ` [PATCH v4 09/27] net/enic: do not use deprecated rte_atomic64 Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 10/27] net/pfe: use ethdev linkstatus helpers Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 11/27] net/sfc: replace rte_atomic with stdatomic Stephen Hemminger
2026-06-01 9:22 ` Andrew Rybchenko
2026-05-26 23:24 ` [PATCH v4 12/27] crypto/ccp: replace use of rte_atomic64 " Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 13/27] bus/dpaa: replace rte_atomic16 " Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 14/27] drivers: " Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 15/27] net/netvsc: replace rte_atomic32 " Stephen Hemminger
2026-05-27 0:29 ` [EXTERNAL] " Long Li
2026-05-31 16:35 ` Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 16/27] event/sw: convert from rte_atomic32 to stdatomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 17/27] bus/vmbus: convert from rte_atomic " Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 18/27] common/dpaax: use stdatomic instead of rte_atomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 19/27] net/bnx2x: convert from rte_atomic32 to stdatomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 20/27] bus/fslmc: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 21/27] drivers/event: replace rte_atomic32 in selftests Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 22/27] net/hinic: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 23/27] net/txgbe: " Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 24/27] net/vhost: use stdatomic instead of rte_atomic32 Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 25/27] vdpa/ifc: replace rte_atomic32 with stdatomic Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 26/27] test/atomic: suppress deprecation warnings for legacy APIs Stephen Hemminger
2026-05-26 23:24 ` [PATCH v4 27/27] eal: mark rte_atomicNN as deprecated Stephen Hemminger