DPDK-dev Archive on lore.kernel.org
* [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
@ 2026-05-20 12:57 Michal Sieron
  2026-05-20 14:57 ` Stephen Hemminger
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Michal Sieron @ 2026-05-20 12:57 UTC (permalink / raw)
  To: dev; +Cc: Michal Sieron

In rare cases, when a secondary process calls rte_eal_init() it can
cause a data race during page prefaulting in alloc_seg().

An atomic compare-exchange in a loop should eliminate the data race.

Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
---
 lib/eal/linux/eal_memalloc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index a39bc31c7b..cb92fda2e8 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -30,6 +30,7 @@
 #include <rte_eal.h>
 #include <rte_memory.h>
 #include <rte_cycles.h>
+#include <rte_atomic.h>
 
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
@@ -600,7 +601,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	 * that is already there, so read the old value, and write itback.
 	 * kernel populates the page with zeroes initially.
 	 */
-	*(volatile int *)addr = *(volatile int *)addr;
+	int snapshot = *(volatile int *)addr;
+	while (!rte_atomic_compare_exchange_strong((volatile int *)addr, &snapshot, snapshot))
+		;
 
 	iova = rte_mem_virt2iova(addr);
 	if (iova == RTE_BAD_PHYS_ADDR) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
  2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
@ 2026-05-20 14:57 ` Stephen Hemminger
  2026-05-20 16:47 ` Stephen Hemminger
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2026-05-20 14:57 UTC (permalink / raw)
  To: Michal Sieron; +Cc: dev

On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:

> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
> 
> An atomic compare-exchange in a loop should eliminate the data race.
> 
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---
>  lib/eal/linux/eal_memalloc.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
> index a39bc31c7b..cb92fda2e8 100644
> --- a/lib/eal/linux/eal_memalloc.c
> +++ b/lib/eal/linux/eal_memalloc.c
> @@ -30,6 +30,7 @@
>  #include <rte_eal.h>
>  #include <rte_memory.h>
>  #include <rte_cycles.h>
> +#include <rte_atomic.h>
>  
>  #include "eal_filesystem.h"
>  #include "eal_internal_cfg.h"
> @@ -600,7 +601,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
>  	 * that is already there, so read the old value, and write itback.
>  	 * kernel populates the page with zeroes initially.
>  	 */
> -	*(volatile int *)addr = *(volatile int *)addr;
> +	int snapshot = *(volatile int *)addr;
> +	while (!rte_atomic_compare_exchange_strong((volatile int *)addr, &snapshot, snapshot))
> +		;
>  
>  	iova = rte_mem_virt2iova(addr);
>  	if (iova == RTE_BAD_PHYS_ADDR) {

No, don't use a loop with compare_exchange_strong here.
It could get stuck.
Should just a relaxed load be enough to get the page in?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
  2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
  2026-05-20 14:57 ` Stephen Hemminger
@ 2026-05-20 16:47 ` Stephen Hemminger
  2026-05-20 17:07 ` Stephen Hemminger
  2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
  3 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2026-05-20 16:47 UTC (permalink / raw)
  To: Michal Sieron; +Cc: dev

On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:

> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
> 
> An atomic compare-exchange in a loop should eliminate the data race.
> 
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---

Build fails. Fix and resubmit.
Looks like you did this against an older version of DPDK, from before stdatomic.

FAILED: [code=1] lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o 
ccache clang -Ilib/librte_eal.a.p -Ilib -I../lib -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/argparse -I../lib/argparse -Xclang -fcolor-diagnostics -fsanitize=undefined -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -include rte_config.h -Wvla -Wcast-qual -Wcomma -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wshadow -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=corei7 -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API '-DABI_VERSION="26.2"' -DRTE_EAL_PTHREAD_ATTR_SETAFFINITY_NP -DRTE_LOG_DEFAULT_LOGTYPE=lib.eal -DRTE_ANNOTATE_LOCKS -Wthread-safety -MD -MQ lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o -MF lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o.d -o lib/librte_eal.a.p/eal_linux_eal_memalloc.c.o -c ../lib/eal/linux/eal_memalloc.c
../lib/eal/linux/eal_memalloc.c:605:10: error: implicit declaration of function 'rte_atomic_compare_exchange_strong' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        while (!rte_atomic_compare_exchange_strong((volatile int *)addr, &snapshot, snapshot))
                ^
../lib/eal/linux/eal_memalloc.c:605:10: note: did you mean '__atomic_compare_exchange_n'?
../lib/eal/include/generic/rte_rwlock.h:189:6: note: '__atomic_compare_exchange_n' declared here
            rte_atomic_compare_exchange_weak_explicit(&rwl->cnt, &x, x + RTE_RWLOCK_WRITE,
            ^
../lib/eal/include/rte_stdatomic.h:151:2: note: expanded from macro 'rte_atomic_compare_exchange_weak_explicit'
        __atomic_compare_exchange_n(ptr, expected, desired, 1, \
        ^
1 error generated.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] linux/mem: atomically prefault hugepages in alloc_seg
  2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
  2026-05-20 14:57 ` Stephen Hemminger
  2026-05-20 16:47 ` Stephen Hemminger
@ 2026-05-20 17:07 ` Stephen Hemminger
  2026-05-20 17:08 ` [PATCH] eal: fix data race in hugepage prefault Stephen Hemminger
  3 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2026-05-20 17:07 UTC (permalink / raw)
  To: Michal Sieron; +Cc: dev

On Wed, 20 May 2026 14:57:56 +0200
Michal Sieron <michal.sieron@nokia.com> wrote:

> In rare cases, when a secondary process calls rte_eal_init() it can
> cause a data race during page prefaulting in alloc_seg().
> 
> An atomic compare-exchange in a loop should eliminate the data race.
> 
> Signed-off-by: Michal Sieron <michal.sieron@nokia.com>
> ---

AI had a good suggestion when reviewing this.
Your version is still racy (on the read side).

A simple, non-racy version with no loop would be:

	rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
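
For readers unfamiliar with the idiom, the suggestion above can be sketched in
plain C11. This is only an illustration: it uses the standard <stdatomic.h>
call atomic_fetch_or_explicit() on the assumption that the DPDK macro behaves
the same way, and touch_word() is a hypothetical helper name that does not
exist in DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK function): touch one word with an atomic
 * read-modify-write that leaves its value unchanged.
 * Returns the value that was observed in the word.
 */
static int
touch_word(_Atomic int *word)
{
	assert(word != NULL);

	/* x | 0 == x, so the value written back is the value that was read.
	 * The read and the write are one indivisible operation, so there is
	 * no separate plain load and no retry loop.
	 */
	return atomic_fetch_or_explicit(word, 0, memory_order_relaxed);
}
```

Calling touch_word() on a word holding 0x1234 returns 0x1234 and leaves 0x1234
in place. Relaxed ordering is used here because the sketch only needs the
access itself to be atomic, not to order any other memory operations.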

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] eal: fix data race in hugepage prefault
  2026-05-20 12:57 [PATCH] linux/mem: atomically prefault hugepages in alloc_seg Michal Sieron
                   ` (2 preceding siblings ...)
  2026-05-20 17:07 ` Stephen Hemminger
@ 2026-05-20 17:08 ` Stephen Hemminger
  3 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2026-05-20 17:08 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, stable, Michal Sieron, Thomas Monjalon,
	Anatoly Burakov, Bruce Richardson

The prefault step in alloc_seg() reads a value from the hugepage and
writes it back unchanged to force the kernel to commit the backing
page. The read and write were not atomic, which races with concurrent
access to the same physical page from a secondary process attaching
to the hugetlbfs-backed mapping during rte_eal_init().

Replace the non-atomic load+store with a single atomic fetch-or of
zero. This touches the page with an atomic read-modify-write without
changing its contents, eliminating the race while preserving the
original intent of forcing a write fault.

Fixes: 0f1631be24bd ("mem: fix page fault trigger")
Cc: stable@dpdk.org

Reported-by: Michal Sieron <michal.sieron@nokia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 .mailmap                     | 1 +
 lib/eal/linux/eal_memalloc.c | 7 ++++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.mailmap b/.mailmap
index 4d26d9c286..07c49eb32f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1086,6 +1086,7 @@ Michal Mazurek <maz@semihalf.com>
 Michal Michalik <michal.michalik@intel.com>
 Michal Nowak <michal2.nowak@intel.com>
 Michal Schmidt <mschmidt@redhat.com>
+Michal Sieron <michal.sieron@nokia.com>
 Michal Swiatkowski <michal.swiatkowski@intel.com>
 Michal Wilczynski <michal.wilczynski@intel.com>
 Michał Mirosław <michal.miroslaw@atendesoftware.pl> <mirq-linux@rere.qmqm.pl>
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index a39bc31c7b..e73a0c11a6 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -25,6 +25,7 @@
 #include <linux/falloc.h>
 #include <linux/mman.h> /* for hugetlb-related mmap flags */
 
+#include <rte_atomic.h>
 #include <rte_common.h>
 #include <rte_log.h>
 #include <rte_eal.h>
@@ -597,10 +598,10 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 
 	/* we need to trigger a write to the page to enforce page fault and
 	 * ensure that page is accessible to us, but we can't overwrite value
-	 * that is already there, so read the old value, and write itback.
-	 * kernel populates the page with zeroes initially.
+	 * that is already there.
+	 * Use an atomic OR with zero to touch the page without changing its contents.
 	 */
-	*(volatile int *)addr = *(volatile int *)addr;
+	(void)rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
 
 	iova = rte_mem_virt2iova(addr);
 	if (iova == RTE_BAD_PHYS_ADDR) {
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread
