* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
@ 2026-05-20 14:57 ` Stephen Hemminger
2026-05-20 16:47 ` Stephen Hemminger
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2026-05-20 14:57 UTC (permalink / raw)
To: Michal Sieron; +Cc: dev
On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:
> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
>
> An atomic compare-exchange in a loop should eliminate the data race.
>
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---
> lib/eal/linux/eal_memalloc.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
> index a39bc31c7b..cb92fda2e8 100644
> --- a/lib/eal/linux/eal_memalloc.c
> +++ b/lib/eal/linux/eal_memalloc.c
> @@ -30,6 +30,7 @@
> #include <rte_eal.h>
> #include <rte_memory.h>
> #include <rte_cycles.h>
> +#include <rte_atomic.h>
>
> #include "eal_filesystem.h"
> #include "eal_internal_cfg.h"
> @@ -600,7 +601,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
> * that is already there, so read the old value, and write itback.
> * kernel populates the page with zeroes initially.
> */
> - *(volatile int *)addr = *(volatile int *)addr;
> + int snapshot = *(volatile int *)addr;
> + while (!rte_atomic_compare_exchange_strong((volatile int *)addr, &snapshot, snapshot))
> + ;
>
> iova = rte_mem_virt2iova(addr);
> if (iova == RTE_BAD_PHYS_ADDR) {
No, don't use a loop with compare_exchange_strong here.
It could get stuck.
Shouldn't just a relaxed load be enough to get the page in?
* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
2026-05-20 14:57 ` Stephen Hemminger
@ 2026-05-20 16:47 ` Stephen Hemminger
2026-05-20 17:07 ` Stephen Hemminger
2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
3 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2026-05-20 16:47 UTC (permalink / raw)
To: Michal Sieron; +Cc: dev
On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:
> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
>
> An atomic compare-exchange in a loop should eliminate the data race.
>
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---
Build fails. Fix and resubmit.
Looks like you did this against an older version of DPDK, before stdatomic.
FAILED: [code=1] lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o
ccache clang -Ilib/librte_eal.a.p -Ilib -I../lib -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/argparse -I../lib/argparse -Xclang -fcolor-diagnostics -fsanitize=undefined -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -include rte_config.h -Wvla -Wcast-qual -Wcomma -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wshadow -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=corei7 -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API '-DABI_VERSION="26.2"' -DRTE_EAL_PTHREAD_ATTR_SETAFFINITY_NP -DRTE_LOG_DEFAULT_LOGTYPE=lib.eal -DRTE_ANNOTATE_LOCKS -Wthread-safety -MD -MQ lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o -MF lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o.d -o lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o -c ../lib/eal/linux/eal_memalloc.c
../lib/eal/linux/eal_memalloc.c:605:10: error: implicit declaration of function 'rte_atomic_compare_exchange_strong' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
while (!rte_atomic_compare_exchange_strong((volatile int *)addr, &snapshot, snapshot))
^
../lib/eal/linux/eal_memalloc.c:605:10: note: did you mean '__atomic_compare_exchange_n'?
../lib/eal/include/generic/rte_rwlock.h:189:6: note: '__atomic_compare_exchange_n' declared here
rte_atomic_compare_exchange_weak_explicit(&rwl->cnt, &x, x + RTE_RWLOCK_WRITE,
^
../lib/eal/include/rte_stdatomic.h:151:2: note: expanded from macro 'rte_atomic_compare_exchange_weak_explicit'
__atomic_compare_exchange_n(ptr, expected, desired, 1, \
^
1 error generated.
* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
2026-05-20 14:57 ` Stephen Hemminger
2026-05-20 16:47 ` Stephen Hemminger
@ 2026-05-20 17:07 ` Stephen Hemminger
2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
3 siblings, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2026-05-20 17:07 UTC (permalink / raw)
To: Michal Sieron; +Cc: dev
On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:
> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
>
> An atomic compare-exchange in a loop should eliminate the data race.
>
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---
AI had a good suggestion when reviewing this.
Your version is still racy (on the read side).
A simple, non-racy version with no loop would be:
rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
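The suggestion above can be sketched outside DPDK with plain C11 <stdatomic.h>. This is only an illustration of the idea, not the patch itself: the helper name touch_word is hypothetical, and the standard atomic_fetch_or_explicit stands in for the rte_atomic_fetch_or_explicit wrapper. It shows that an atomic OR with zero is a single read-modify-write that returns the old value and leaves the stored value unchanged, with no retry loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK function): touch one word with a single
 * atomic read-modify-write that leaves its contents as they were.
 */
static inline int
touch_word(_Atomic int *word)
{
	assert(word != NULL);
	/* OR with 0 stores back the value already present; the call
	 * returns the value observed before the operation. */
	return atomic_fetch_or_explicit(word, 0, memory_order_relaxed);
}
```

Unlike the compare-exchange loop, there is nothing to retry here, so a concurrent writer cannot keep the caller spinning.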
* [PATCH] eal: fix data race in hugepage prefault
2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
` (2 preceding siblings ...)
2026-05-20 17:07 ` Stephen Hemminger
@ 2026-05-20 17:08 ` Stephen Hemminger
2026-06-01 10:03 ` Thomas Monjalon
2026-06-01 16:00 ` [PATCH v2] " Stephen Hemminger
3 siblings, 2 replies; 8+ messages in thread
From: Stephen Hemminger @ 2026-05-20 17:08 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, stable, Michal Sieron, Thomas Monjalon,
Anatoly Burakov, Bruce Richardson
The prefault step in alloc_seg() reads a value from the hugepage and
writes it back unchanged to force the kernel to commit the backing
page. The read and write were not atomic, which races with concurrent
access to the same physical page from a secondary process attaching
to the hugetlbfs-backed mapping during rte_eal_init().
Replace the non-atomic load+store with a single atomic fetch-or of
zero. This touches the page with an atomic read-modify-write without
changing its contents, eliminating the race while preserving the
original intent of forcing a write fault.
Fixes: 0f1631be24bd ("mem: fix page fault trigger")
Cc: stable@dpdk.org
Reported-by: Michal Sieron <michal.sieron@nokia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
.mailmap | 1 +
lib/eal/linux/eal_memalloc.c | 7 ++++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/.mailmap b/.mailmap
index 4d26d9c286..07c49eb32f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1086,6 +1086,7 @@ Michal Mazurek <maz@semihalf.com>
Michal Michalik <michal.michalik@intel.com>
Michal Nowak <michal2.nowak@intel.com>
Michal Schmidt <mschmidt@redhat.com>
+Michal Sieron <michal.sieron@nokia.com>
Michal Swiatkowski <michal.swiatkowski@intel.com>
Michal Wilczynski <michal.wilczynski@intel.com>
Michał Mirosław <michal.miroslaw@atendesoftware.pl> <mirq-linux@rere.qmqm.pl>
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index a39bc31c7b..e73a0c11a6 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -25,6 +25,7 @@
#include <linux/falloc.h>
#include <linux/mman.h> /* for hugetlb-related mmap flags */
+#include <rte_atomic.h>
#include <rte_common.h>
#include <rte_log.h>
#include <rte_eal.h>
@@ -597,10 +598,10 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
/* we need to trigger a write to the page to enforce page fault and
* ensure that page is accessible to us, but we can't overwrite value
- * that is already there, so read the old value, and write itback.
- * kernel populates the page with zeroes initially.
+ * that is already there.
+ * Use an atomic OR with zero to touch the page without changing its contents.
*/
- *(volatile int *)addr = *(volatile int *)addr;
+ (void)rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
iova = rte_mem_virt2iova(addr);
if (iova == RTE_BAD_PHYS_ADDR) {
--
2.53.0
* Re: [PATCH] eal: fix data race in hugepage prefault
2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
@ 2026-06-01 10:03 ` Thomas Monjalon
2026-06-01 16:00 ` [PATCH v2] " Stephen Hemminger
1 sibling, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2026-06-01 10:03 UTC (permalink / raw)
To: Michal Sieron, Stephen Hemminger
Cc: dev, stable, Anatoly Burakov, Bruce Richardson
20/05/2026 19:08, Stephen Hemminger:
> The prefault step in alloc_seg() reads a value from the hugepage and
> writes it back unchanged to force the kernel to commit the backing
> page. The read and write were not atomic, which races with concurrent
> access to the same physical page from a secondary process attaching
> to the hugetlbfs-backed mapping during rte_eal_init().
>
> Replace the non-atomic load+store with a single atomic fetch-or of
> zero. This touches the page with an atomic read-modify-write without
> changing its contents, eliminating the race while preserving the
> original intent of forcing a write fault.
>
> Fixes: 0f1631be24bd ("mem: fix page fault trigger")
> Cc: stable@dpdk.org
>
> Reported-by: Michal Sieron <michal.sieron@nokia.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> --- a/lib/eal/linux/eal_memalloc.c
> +++ b/lib/eal/linux/eal_memalloc.c
> - *(volatile int *)addr = *(volatile int *)addr;
> + (void)rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
There is a compilation failure:
lib/eal/linux/eal_memalloc.c:604:8: error: address argument to atomic operation must be a pointer to _Atomic type ('int *' invalid)
(void)rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
^ ~~~~~~~~~~~
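A minimal sketch of what the error above is asking for, written against plain C11 <stdatomic.h> instead of the DPDK wrappers. The helper name prefault_touch is hypothetical. The assumption is that the address is suitably aligned for a 64-bit atomic and that the toolchain accepts casting a raw pointer to an _Atomic-qualified pointer, which GCC and Clang do on common targets but which the C standard does not strictly promise.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK function): the atomic builtin needs a
 * pointer to an _Atomic-qualified type, so the raw address is cast
 * accordingly before the fetch-or.
 */
static inline void
prefault_touch(void *addr)
{
	assert(addr != NULL);
	/* Passing a plain (int *) here is what clang rejects in C11 mode. */
	(void)atomic_fetch_or_explicit((_Atomic uint64_t *)addr, 0,
			memory_order_relaxed);
}
```

In DPDK itself the qualifier would be spelled through the project's own atomic macro rather than a bare _Atomic.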
* [PATCH v2] eal: fix data race in hugepage prefault
2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
2026-06-01 10:03 ` Thomas Monjalon
@ 2026-06-01 16:00 ` Stephen Hemminger
2026-06-03 15:57 ` Thomas Monjalon
1 sibling, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2026-06-01 16:00 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, stable, Michal Sieron, Thomas Monjalon,
Anatoly Burakov, Bruce Richardson
The prefault step in alloc_seg() reads a value from the hugepage and
writes it back unchanged to force the kernel to commit the backing
page. The read and write were not atomic, which races with concurrent
access to the same physical page from a secondary process attaching
to the hugetlbfs-backed mapping during rte_eal_init().
Replace the non-atomic load+store with a single atomic fetch-or of
zero. This touches the page with an atomic read-modify-write without
changing its contents, eliminating the race while preserving the
original intent of forcing a write fault.
Fixes: 0f1631be24bd ("mem: fix page fault trigger")
Cc: stable@dpdk.org
Reported-by: Michal Sieron <michal.sieron@nokia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
.mailmap | 1 +
lib/eal/linux/eal_memalloc.c | 8 +++++---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/.mailmap b/.mailmap
index 43febb9030..3c45e365d3 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1094,6 +1094,7 @@ Michal Mazurek <maz@semihalf.com>
Michal Michalik <michal.michalik@intel.com>
Michal Nowak <michal2.nowak@intel.com>
Michal Schmidt <mschmidt@redhat.com>
+Michal Sieron <michal.sieron@nokia.com>
Michal Swiatkowski <michal.swiatkowski@intel.com>
Michal Wilczynski <michal.wilczynski@intel.com>
Michał Mirosław <michal.miroslaw@atendesoftware.pl> <mirq-linux@rere.qmqm.pl>
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index a39bc31c7b..7359a41d3f 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -25,6 +25,7 @@
#include <linux/falloc.h>
#include <linux/mman.h> /* for hugetlb-related mmap flags */
+#include <rte_atomic.h>
#include <rte_common.h>
#include <rte_log.h>
#include <rte_eal.h>
@@ -597,10 +598,11 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
/* we need to trigger a write to the page to enforce page fault and
* ensure that page is accessible to us, but we can't overwrite value
- * that is already there, so read the old value, and write itback.
- * kernel populates the page with zeroes initially.
+ * that is already there.
+ * Use an atomic OR with zero to touch the page without changing its contents.
*/
- *(volatile int *)addr = *(volatile int *)addr;
+ (void)rte_atomic_fetch_or_explicit((__rte_atomic uint64_t *)addr, 0,
+ rte_memory_order_relaxed);
iova = rte_mem_virt2iova(addr);
if (iova == RTE_BAD_PHYS_ADDR) {
--
2.53.0
* Re: [PATCH v2] eal: fix data race in hugepage prefault
2026-06-01 16:00 ` [PATCH v2] " Stephen Hemminger
@ 2026-06-03 15:57 ` Thomas Monjalon
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Monjalon @ 2026-06-03 15:57 UTC (permalink / raw)
To: Stephen Hemminger
Cc: dev, stable, Michal Sieron, Anatoly Burakov, Bruce Richardson
01/06/2026 18:00, Stephen Hemminger:
> The prefault step in alloc_seg() reads a value from the hugepage and
> writes it back unchanged to force the kernel to commit the backing
> page. The read and write were not atomic, which races with concurrent
> access to the same physical page from a secondary process attaching
> to the hugetlbfs-backed mapping during rte_eal_init().
>
> Replace the non-atomic load+store with a single atomic fetch-or of
> zero. This touches the page with an atomic read-modify-write without
> changing its contents, eliminating the race while preserving the
> original intent of forcing a write fault.
>
> Fixes: 0f1631be24bd ("mem: fix page fault trigger")
> Cc: stable@dpdk.org
>
> Reported-by: Michal Sieron <michal.sieron@nokia.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Applied, thanks.