* [PATCH 0/6] test: fix sporadic failures on high core count systems
@ 2026-01-18 20:09 Stephen Hemminger
2026-01-18 20:09 ` [PATCH 1/6] test: add pause to synchronization spinloops Stephen Hemminger
` (7 more replies)
0 siblings, 8 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
This series addresses several test failures that occur sporadically on
systems with many cores (32+), particularly on AMD Zen architectures.
I think Ferruh may have addressed similar problems in earlier
releases.
The root causes fall into three categories:
1. Missing rte_pause() in synchronization spinloops (patch 1)
Tight spinloops without pause cause SMT thread starvation and
unpredictable timing behavior.
2. Fixed iteration counts that don't scale (patch 2)
The atomic test performs 1M iterations per worker regardless of
core count. With 32+ cores, contention causes timeout failures.
Bugzilla ID: 952
3. File-prefix collisions during parallel test execution (patches 5-6)
Multiple tests using the default "rte" prefix compete for the same
fbarray files, causing EAL initialization failures.
Additionally, two BPF-related fixes are included:
4. Race condition in BPF ELF loading (patch 3)
Missing fsync() before close() causes sporadic EINVAL failures.
5. Unsupported BPF instructions with newer clang (patch 4)
Clang 20+ generates JMP32 instructions that DPDK BPF doesn't support.
Bugzilla ID: 1844
Stephen Hemminger (6):
test: add pause to synchronization spinloops
test: fix timeout for atomic test on high core count systems
test: fix race condition in ELF load tests
test: fix unsupported BPF instructions in elf load test
test: add file-prefix for all fast-tests on Linux
test: fix trace_autotest_with_traces parallel execution
app/test/bpf/meson.build | 3 +-
app/test/suites/meson.build | 20 ++++++++---
app/test/test_atomic.c | 67 ++++++++++++++++++++++---------------
app/test/test_bpf.c | 5 ++-
4 files changed, 62 insertions(+), 33 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 1/6] test: add pause to synchronization spinloops
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-18 20:09 ` [PATCH 2/6] test: fix timeout for atomic test on high core count systems Stephen Hemminger
` (6 subsequent siblings)
7 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
The atomic test uses tight spinloops to synchronize worker threads
before starting each test phase. These spinloops lack rte_pause(),
which causes problems on high core count systems, particularly on AMD
Zen architectures, where:
- Tight spinloops without pause can starve SMT sibling threads
- Memory ordering and store-buffer forwarding behave differently
- Higher core counts amplify timing windows for race conditions
This manifests as sporadic test failures on systems with 32+ cores
that don't reproduce on smaller core count systems.
Add rte_pause() to all seven synchronization spinloops to allow
proper CPU resource sharing and improve memory ordering behavior.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_atomic.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index 8160a33e0e..b1a0d40ece 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -15,6 +15,7 @@
#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_lcore.h>
+#include <rte_pause.h>
#include <rte_random.h>
#include <rte_hash_crc.h>
@@ -114,7 +115,7 @@ test_atomic_usual(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++)
rte_atomic16_inc(&a16);
@@ -150,7 +151,7 @@ static int
test_atomic_tas(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_test_and_set(&a16))
rte_atomic64_inc(&count);
@@ -171,7 +172,7 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
@@ -210,7 +211,7 @@ static int
test_atomic_inc_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_inc_and_test(&a16)) {
rte_atomic64_inc(&count);
@@ -237,7 +238,7 @@ static int
test_atomic_dec_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_dec_and_test(&a16))
rte_atomic64_inc(&count);
@@ -269,7 +270,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
expected = count128;
@@ -407,7 +408,7 @@ test_atomic_exchange(__rte_unused void *arg)
/* Wait until all of the other threads have been dispatched */
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
/*
* Let the battle begin! Every thread attempts to steal the current
--
2.51.0
* [PATCH 2/6] test: fix timeout for atomic test on high core count systems
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
2026-01-18 20:09 ` [PATCH 1/6] test: add pause to synchronization spinloops Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-18 20:09 ` [PATCH 3/6] test: fix race condition in ELF load tests Stephen Hemminger
` (5 subsequent siblings)
7 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
The atomic test uses tight spinloops to synchronize worker threads
and performs a fixed 1,000,000 iterations per worker. This causes
a problem on high core count systems:
With many cores (e.g., 32), the massive contention on shared
atomic variables causes the test to exceed the 10 second timeout.
Scale iterations inversely with core count to maintain a roughly
constant test duration regardless of system size.
With 32 cores, iterations drop from 1,000,000 to 31,250 per worker,
which keeps the test well within the timeout while still providing
meaningful coverage.
Bugzilla ID: 952
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_atomic.c | 52 ++++++++++++++++++++++++++----------------
1 file changed, 32 insertions(+), 20 deletions(-)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index b1a0d40ece..ccd8e5d29b 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -10,6 +10,7 @@
#include <sys/queue.h>
#include <rte_memory.h>
+#include <rte_common.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
#include <rte_atomic.h>
@@ -101,7 +102,15 @@
#define NUM_ATOMIC_TYPES 3
-#define N 1000000
+#define N_BASE 1000000u
+#define N_MIN 10000u
+
+/*
+ * Number of iterations for each test, scaled inversely with core count.
+ * More cores means more contention which increases time per operation.
+ * Calculated once at test start to avoid repeated computation in workers.
+ */
+static unsigned int num_iterations;
static rte_atomic16_t a16;
static rte_atomic32_t a32;
@@ -112,36 +121,36 @@ static rte_atomic32_t synchro;
static int
test_atomic_usual(__rte_unused void *arg)
{
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_inc(&a16);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_dec(&a16);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_add(&a16, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_sub(&a16, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_inc(&a32);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_dec(&a32);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_add(&a32, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_sub(&a32, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_inc(&a64);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_dec(&a64);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_add(&a64, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_sub(&a64, 5);
return 0;
@@ -169,12 +178,12 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
uint32_t tmp16;
uint32_t tmp32;
uint64_t tmp64;
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
rte_atomic64_add(&count, tmp16);
@@ -274,7 +283,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
expected = count128;
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
do {
rte_int128_t desired;
@@ -401,7 +410,7 @@ get_crc8(uint8_t *message, int length)
static int
test_atomic_exchange(__rte_unused void *arg)
{
- int i;
+ unsigned int i;
test16_t nt16, ot16; /* new token, old token */
test32_t nt32, ot32;
test64_t nt64, ot64;
@@ -417,7 +426,7 @@ test_atomic_exchange(__rte_unused void *arg)
* appropriate crc32 hash for the data) then the test iteration has
* passed. If the token is invalid, increment the counter.
*/
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
/* Test 64bit Atomic Exchange */
nt64.u64 = rte_rand();
@@ -446,6 +455,9 @@ test_atomic_exchange(__rte_unused void *arg)
static int
test_atomic(void)
{
+ /* Scale iterations by number of cores to keep test duration reasonable */
+ num_iterations = RTE_MAX(N_BASE / rte_lcore_count(), N_MIN);
+
rte_atomic16_init(&a16);
rte_atomic32_init(&a32);
rte_atomic64_init(&a64);
@@ -593,7 +605,7 @@ test_atomic(void)
rte_atomic32_clear(&synchro);
iterations = count128.val[0] - count128.val[1];
- if (iterations != (uint64_t)4*N*(rte_lcore_count()-1)) {
+ if (iterations != (uint64_t)4*num_iterations*(rte_lcore_count()-1)) {
printf("128-bit compare and swap failed\n");
return -1;
}
--
2.51.0
* [PATCH 3/6] test: fix race condition in ELF load tests
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
2026-01-18 20:09 ` [PATCH 1/6] test: add pause to synchronization spinloops Stephen Hemminger
2026-01-18 20:09 ` [PATCH 2/6] test: fix timeout for atomic test on high core count systems Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-19 11:42 ` Marat Khalili
2026-01-19 18:24 ` Stephen Hemminger
2026-01-18 20:09 ` [PATCH 4/6] test: fix unsupported BPF instructions in elf load test Stephen Hemminger
` (4 subsequent siblings)
7 siblings, 2 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Konstantin Ananyev, Marat Khalili
The BPF ELF tests sporadically fail with EINVAL when loading from
the temporary file. This is a race condition where the BPF loader
reads the file before the data is fully flushed to disk.
Add fsync() before close() in create_temp_bpf_file() to ensure the
BPF object data is visible on the filesystem before attempting to
load it.
Also fix two related issues found during review:
- Add missing TEST_ASSERT for mempool creation in test_bpf_elf_tx_load
- Initialize port variable in test_bpf_elf_rx_load to avoid undefined
behavior in cleanup path if null_vdev_setup fails early
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_bpf.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index a7d56f8d86..03705075d8 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -3311,6 +3311,8 @@ create_temp_bpf_file(const uint8_t *data, size_t size, const char *name)
/* Write BPF object data */
written = write(fd, data, size);
+ if (written == (ssize_t)size)
+ fsync(fd);
close(fd);
if (written != (ssize_t)size) {
@@ -3580,6 +3582,7 @@ test_bpf_elf_tx_load(void)
mb_pool = rte_pktmbuf_pool_create("bpf_tx_test_pool", BPF_TEST_POOLSIZE,
0, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
SOCKET_ID_ANY);
+ TEST_ASSERT(mb_pool != NULL, "failed to create mempool");
ret = null_vdev_setup(null_dev, &port, mb_pool);
if (ret != 0)
@@ -3664,7 +3667,7 @@ test_bpf_elf_rx_load(void)
static const char null_dev[] = "net_null_bpf0";
struct rte_mempool *pool = NULL;
char *tmpfile = NULL;
- uint16_t port;
+ uint16_t port = UINT16_MAX;
int ret;
printf("%s start\n", __func__);
--
2.51.0
* [PATCH 4/6] test: fix unsupported BPF instructions in elf load test
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
` (2 preceding siblings ...)
2026-01-18 20:09 ` [PATCH 3/6] test: fix race condition in ELF load tests Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-19 11:43 ` Marat Khalili
2026-01-18 20:09 ` [PATCH 5/6] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
` (3 subsequent siblings)
7 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Konstantin Ananyev, Marat Khalili
The DPDK BPF library only handles the base BPF instructions.
It does not handle JMP32 instructions, which causes the bpf_elf_load
test to fail with clang 20 or later.
Bugzilla ID: 1844
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/bpf/meson.build | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test/bpf/meson.build b/app/test/bpf/meson.build
index aaecfa7018..91c1b434f8 100644
--- a/app/test/bpf/meson.build
+++ b/app/test/bpf/meson.build
@@ -24,7 +24,8 @@ if not xxd.found()
endif
# BPF compiler flags
-bpf_cflags = [ '-O2', '-target', 'bpf', '-g', '-c']
+# At present: DPDK BPF does not support v3 or later
+bpf_cflags = [ '-O2', '-target', 'bpf', '-mcpu=v2', '-g', '-c']
# Enable test in test_bpf.c
cflags += '-DTEST_BPF_ELF_LOAD'
--
2.51.0
* [PATCH 5/6] test: add file-prefix for all fast-tests on Linux
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
` (3 preceding siblings ...)
2026-01-18 20:09 ` [PATCH 4/6] test: fix unsupported BPF instructions in elf load test Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-19 13:06 ` Marat Khalili
2026-01-18 20:09 ` [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
` (2 subsequent siblings)
7 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Bruce Richardson, Morten Brørup
When running tests in parallel on systems with many cores, multiple test
processes collide on the default "rte" file-prefix, causing EAL
initialization failures:
EAL: Cannot allocate memzone list: Device or resource busy
EAL: Cannot init memzone
This occurs because all DPDK tests (including --no-huge tests) use
file-backed arrays for memzone tracking. These files are created at
/var/run/dpdk/<prefix>/fbarray_memzone and require exclusive locking
during initialization. When multiple tests run in parallel with the
same file-prefix, they compete for this lock.
The original implementation included --file-prefix for Linux to
prevent this collision. This was later removed during test
infrastructure refactoring.
Restore the --file-prefix argument for all fast-tests on Linux,
regardless of whether they use hugepages. Tests that exercise
file-prefix functionality (like eal_flags_file_prefix_autotest)
spawn child processes with their own hardcoded prefixes and use
get_current_prefix() to verify the parent's resources, so they work
correctly regardless of what prefix the parent process uses.
Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/suites/meson.build | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 1010150eee..38df1cfec2 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -85,7 +85,11 @@ foreach suite:test_suites
if nohuge
test_args += test_no_huge_args
elif not has_hugepage
- continue #skip this tests
+ continue # skip this tests
+ endif
+ if is_linux
+ # use unique file-prefix to allow parallel runs
+ test_args += ['--file-prefix=' + test_name.underscorify()]
endif
if not asan and get_option('b_sanitize').contains('address')
continue # skip this test
--
2.51.0
* [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
` (4 preceding siblings ...)
2026-01-18 20:09 ` [PATCH 5/6] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
@ 2026-01-18 20:09 ` Stephen Hemminger
2026-01-19 13:13 ` Marat Khalili
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
7 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-18 20:09 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Morten Brørup, Bruce Richardson
The trace_autotest_with_traces test was reusing the test_args array
from trace_autotest, which already contained --file-prefix=trace_autotest.
Fix by building trace_args from scratch for the _with_traces test
variant instead of appending to the existing test_args array.
Fixes: 0aeaf75df879 ("test: define unit tests suites based on test types")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/suites/meson.build | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 38df1cfec2..e62170bebf 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -106,10 +106,18 @@ foreach suite:test_suites
is_parallel : false,
suite : 'fast-tests')
if not is_windows and test_name == 'trace_autotest'
- test_args += ['--trace=.*']
- test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
+ # build separate args list to avoid duplicate --file-prefix
+ trace_args = test_no_huge_args
+ trace_args += ['--trace=.*']
+ trace_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
+ if is_linux
+ trace_args += ['--file-prefix=trace_autotest_with_traces']
+ endif
+ if get_option('default_library') == 'shared'
+ trace_args += ['-d', dpdk_drivers_build_dir]
+ endif
test(test_name + '_with_traces', dpdk_test,
- args : test_args,
+ args : trace_args,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
--
2.51.0
* RE: [PATCH 3/6] test: fix race condition in ELF load tests
2026-01-18 20:09 ` [PATCH 3/6] test: fix race condition in ELF load tests Stephen Hemminger
@ 2026-01-19 11:42 ` Marat Khalili
2026-01-20 0:03 ` Stephen Hemminger
2026-01-19 18:24 ` Stephen Hemminger
1 sibling, 1 reply; 53+ messages in thread
From: Marat Khalili @ 2026-01-19 11:42 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: stable@dpdk.org, dev@dpdk.org, Konstantin Ananyev
For the create_temp_bpf_file part I'm not 100% convinced by the explanation.
Although fsync will not make things worse, it should not be necessary unless
the infra is broken (but then we'd see other problems; and note that if we
started to call fsync we also need to call it on directory etc.). Another
possibility is write call getting interrupted before all data is written. Can
you share an example of failed job? Can we add more error handling and logging,
like check return value from close (also fsyncs if we add them)? Can we use
stdio calls instead of libc ones BTW, they will handle partial writes at least,
on top of being more portable?
Can ack the port part.
P.S. Not worth its own patch, but since you are working on it, can you also
do s/sizeof(struct rte_mbuf)/RTE_MBUF_DEFAULT_BUF_SIZE/g ? I overlooked it last
time.
* RE: [PATCH 4/6] test: fix unsupported BPF instructions in elf load test
2026-01-18 20:09 ` [PATCH 4/6] test: fix unsupported BPF instructions in elf load test Stephen Hemminger
@ 2026-01-19 11:43 ` Marat Khalili
0 siblings, 0 replies; 53+ messages in thread
From: Marat Khalili @ 2026-01-19 11:43 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: stable@dpdk.org, Konstantin Ananyev
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Sunday 18 January 2026 20:09
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; Marat Khalili <marat.khalili@huawei.com>
> Subject: [PATCH 4/6] test: fix unsupported BPF instructions in elf load test
>
> The DPDK BPF library only handles the base BPF instructions.
> It does not handle JMP32 which would cause the bpf_elf_load
> test to fail on clang 20 or later.
>
> Bugzilla ID: 1844
> Fixes: cf1e03f881af ("test/bpf: add ELF loading")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/bpf/meson.build | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/bpf/meson.build b/app/test/bpf/meson.build
> index aaecfa7018..91c1b434f8 100644
> --- a/app/test/bpf/meson.build
> +++ b/app/test/bpf/meson.build
> @@ -24,7 +24,8 @@ if not xxd.found()
> endif
>
> # BPF compiler flags
> -bpf_cflags = [ '-O2', '-target', 'bpf', '-g', '-c']
> +# At present: DPDK BPF does not support v3 or later
> +bpf_cflags = [ '-O2', '-target', 'bpf', '-mcpu=v2', '-g', '-c']
>
> # Enable test in test_bpf.c
> cflags += '-DTEST_BPF_ELF_LOAD'
> --
> 2.51.0
Acked-by: Marat Khalili <marat.khalili@huawei.com>
* RE: [PATCH 5/6] test: add file-prefix for all fast-tests on Linux
2026-01-18 20:09 ` [PATCH 5/6] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
@ 2026-01-19 13:06 ` Marat Khalili
2026-01-19 14:01 ` Bruce Richardson
0 siblings, 1 reply; 53+ messages in thread
From: Marat Khalili @ 2026-01-19 13:06 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Cc: stable@dpdk.org, Bruce Richardson, Morten Brørup
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Sunday 18 January 2026 20:09
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Bruce Richardson
> <bruce.richardson@intel.com>; Morten Brørup <mb@smartsharesystems.com>
> Subject: [PATCH 5/6] test: add file-prefix for all fast-tests on Linux
>
> When running tests in parallel on systems with many cores, multiple test
> processes collide on the default "rte" file-prefix, causing EAL
> initialization failures:
>
> EAL: Cannot allocate memzone list: Device or resource busy
> EAL: Cannot init memzone
>
> This occurs because all DPDK tests (including --no-huge tests) use
> file-backed arrays for memzone tracking. These files are created at
> /var/run/dpdk/<prefix>/fbarray_memzone and require exclusive locking
> during initialization. When multiple tests run in parallel with the
> same file-prefix, they compete for this lock.
>
> The original implementation included --file-prefix for Linux to
> prevent this collision. This was later removed during test
> infrastructure refactoring.
>
> Restore the --file-prefix argument for all fast-tests on Linux,
> regardless of whether they use hugepages. Tests that exercise
> file-prefix functionality (like eal_flags_file_prefix_autotest)
> spawn child processes with their own hardcoded prefixes and use
> get_current_prefix() to verify the parent's resources, so they work
> correctly regardless of what prefix the parent process uses.
>
> Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/suites/meson.build | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
> index 1010150eee..38df1cfec2 100644
> --- a/app/test/suites/meson.build
> +++ b/app/test/suites/meson.build
> @@ -85,7 +85,11 @@ foreach suite:test_suites
> if nohuge
> test_args += test_no_huge_args
> elif not has_hugepage
> - continue #skip this tests
> + continue # skip this tests
> + endif
> + if is_linux
> + # use unique file-prefix to allow parallel runs
> + test_args += ['--file-prefix=' + test_name.underscorify()]
> endif
> if not asan and get_option('b_sanitize').contains('address')
> continue # skip this test
> --
> 2.51.0
>
Note that in CI systems running multiple builds for different targets or
branches there will still be problems. Also, if you have to resubmit, can you
move new lines below handling of nohuge and asan.
With or without points above addressed,
Acked-by: Marat Khalili <marat.khalili@huawei.com>
* RE: [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
2026-01-18 20:09 ` [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
@ 2026-01-19 13:13 ` Marat Khalili
2026-01-20 0:07 ` Stephen Hemminger
0 siblings, 1 reply; 53+ messages in thread
From: Marat Khalili @ 2026-01-19 13:13 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Cc: stable@dpdk.org, Morten Brørup, Bruce Richardson
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Sunday 18 January 2026 20:09
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Morten Brørup
> <mb@smartsharesystems.com>; Bruce Richardson <bruce.richardson@intel.com>
> Subject: [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
>
> The trace_autotest_with_traces test was reusing the test_args array
> from trace_autotest, which already contained --file-prefix=trace_autotest.
>
> Fix by building trace_args from scratch for the _with_traces test
> variant instead of appending to the existing test_args array.
>
> Fixes: 0aeaf75df879 ("test: define unit tests suites based on test types")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/suites/meson.build | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
> index 38df1cfec2..e62170bebf 100644
> --- a/app/test/suites/meson.build
> +++ b/app/test/suites/meson.build
> @@ -106,10 +106,18 @@ foreach suite:test_suites
> is_parallel : false,
> suite : 'fast-tests')
> if not is_windows and test_name == 'trace_autotest'
> - test_args += ['--trace=.*']
> - test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
> + # build separate args list to avoid duplicate --file-prefix
> + trace_args = test_no_huge_args
> + trace_args += ['--trace=.*']
> + trace_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
> + if is_linux
> + trace_args += ['--file-prefix=trace_autotest_with_traces']
> + endif
> + if get_option('default_library') == 'shared'
> + trace_args += ['-d', dpdk_drivers_build_dir]
> + endif
> test(test_name + '_with_traces', dpdk_test,
> - args : test_args,
> + args : trace_args,
> env: ['DPDK_TEST=' + test_name],
> timeout : timeout_seconds_fast,
> is_parallel : false,
> --
> 2.51.0
>
Instead of duplicating some code and hardcoding values for trace_autotest,
which is error-prone, can we make adding file prefix the last thing we do to
test_args (perhaps doing it right in the `args : ...` expression) and re-use
the previous value and logic like we did it before?
* Re: [PATCH 5/6] test: add file-prefix for all fast-tests on Linux
2026-01-19 13:06 ` Marat Khalili
@ 2026-01-19 14:01 ` Bruce Richardson
0 siblings, 0 replies; 53+ messages in thread
From: Bruce Richardson @ 2026-01-19 14:01 UTC (permalink / raw)
To: Marat Khalili
Cc: Stephen Hemminger, dev@dpdk.org, stable@dpdk.org,
Morten Brørup
On Mon, Jan 19, 2026 at 01:06:43PM +0000, Marat Khalili wrote:
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Sunday 18 January 2026 20:09
> > To: dev@dpdk.org
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Bruce Richardson
> > <bruce.richardson@intel.com>; Morten Brørup <mb@smartsharesystems.com>
> > Subject: [PATCH 5/6] test: add file-prefix for all fast-tests on Linux
> >
> > When running tests in parallel on systems with many cores, multiple test
> > processes collide on the default "rte" file-prefix, causing EAL
> > initialization failures:
> >
> > EAL: Cannot allocate memzone list: Device or resource busy
> > EAL: Cannot init memzone
> >
> > This occurs because all DPDK tests (including --no-huge tests) use
> > file-backed arrays for memzone tracking. These files are created at
> > /var/run/dpdk/<prefix>/fbarray_memzone and require exclusive locking
> > during initialization. When multiple tests run in parallel with the
> > same file-prefix, they compete for this lock.
> >
> > The original implementation included --file-prefix for Linux to
> > prevent this collision. This was later removed during test
> > infrastructure refactoring.
> >
> > Restore the --file-prefix argument for all fast-tests on Linux,
> > regardless of whether they use hugepages. Tests that exercise
> > file-prefix functionality (like eal_flags_file_prefix_autotest)
> > spawn child processes with their own hardcoded prefixes and use
> > get_current_prefix() to verify the parent's resources, so they work
> > correctly regardless of what prefix the parent process uses.
> >
> > Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > app/test/suites/meson.build | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
> > index 1010150eee..38df1cfec2 100644
> > --- a/app/test/suites/meson.build
> > +++ b/app/test/suites/meson.build
> > @@ -85,7 +85,11 @@ foreach suite:test_suites
> > if nohuge
> > test_args += test_no_huge_args
> > elif not has_hugepage
> > - continue #skip this tests
> > + continue # skip this tests
Since you are modifying this line, s/tests/test/.
> > + endif
> > + if is_linux
> > + # use unique file-prefix to allow parallel runs
> > + test_args += ['--file-prefix=' + test_name.underscorify()]
> > endif
> > if not asan and get_option('b_sanitize').contains('address')
> > continue # skip this test
> > --
> > 2.51.0
> >
>
> Note that in CI systems running multiple builds for different targets or
> branches there will still be problems. Also, if you have to resubmit, can you
> move new lines below handling of nohuge and asan.
>
> With or without points above addressed,
>
> Acked-by: Marat Khalili <marat.khalili@huawei.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 3/6] test: fix race condition in ELF load tests
2026-01-18 20:09 ` [PATCH 3/6] test: fix race condition in ELF load tests Stephen Hemminger
2026-01-19 11:42 ` Marat Khalili
@ 2026-01-19 18:24 ` Stephen Hemminger
1 sibling, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-19 18:24 UTC (permalink / raw)
To: dev; +Cc: stable, Konstantin Ananyev, Marat Khalili
On Sun, 18 Jan 2026 12:09:10 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:
> The BPF ELF tests sporadically fail with EINVAL when loading from
> the temporary file. This is a race condition where the BPF loader
> reads the file before the data is fully flushed to disk.
>
> Add fsync() before close() in create_temp_bpf_file() to ensure the
> BPF object data is visible on the filesystem before attempting to
> load it.
>
> Also fix two related issues found during review:
> - Add missing TEST_ASSERT for mempool creation in test_bpf_elf_tx_load
> - Initialize port variable in test_bpf_elf_rx_load to avoid undefined
> behavior in cleanup path if null_vdev_setup fails early
>
> Fixes: cf1e03f881af ("test/bpf: add ELF loading")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/test_bpf.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
> index a7d56f8d86..03705075d8 100644
> --- a/app/test/test_bpf.c
> +++ b/app/test/test_bpf.c
> @@ -3311,6 +3311,8 @@ create_temp_bpf_file(const uint8_t *data, size_t size, const char *name)
>
> /* Write BPF object data */
> written = write(fd, data, size);
> + if (written == (ssize_t)size)
> + fsync(fd);
Agree, I don't think fsync is really needed.
This was a bandaid that changed the timing. The root cause of the
test failures was the overlap of parallel tests, so let me drop this bit.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 3/6] test: fix race condition in ELF load tests
2026-01-19 11:42 ` Marat Khalili
@ 2026-01-20 0:03 ` Stephen Hemminger
2026-01-20 10:30 ` Marat Khalili
0 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-20 0:03 UTC (permalink / raw)
To: Marat Khalili; +Cc: stable@dpdk.org, dev@dpdk.org, Konstantin Ananyev
On Mon, 19 Jan 2026 11:42:25 +0000
Marat Khalili <marat.khalili@huawei.com> wrote:
> P.S. Not worth its own patch, but since you are working on it, can you also
> do s/sizeof(struct rte_mbuf)/RTE_MBUF_DEFAULT_BUF_SIZE/g ? I overlooked it last
> time.
Where is that bit?
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
2026-01-19 13:13 ` Marat Khalili
@ 2026-01-20 0:07 ` Stephen Hemminger
2026-01-20 11:36 ` Marat Khalili
0 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-20 0:07 UTC (permalink / raw)
To: Marat Khalili
Cc: dev@dpdk.org, stable@dpdk.org, Morten Brørup,
Bruce Richardson
On Mon, 19 Jan 2026 13:13:40 +0000
Marat Khalili <marat.khalili@huawei.com> wrote:
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Sunday 18 January 2026 20:09
> > To: dev@dpdk.org
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Morten Brørup
> > <mb@smartsharesystems.com>; Bruce Richardson <bruce.richardson@intel.com>
> > Subject: [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
> >
> > The trace_autotest_with_traces test was reusing the test_args array
> > from trace_autotest, which already contained --file-prefix=trace_autotest.
> >
> > Fix by building trace_args from scratch for the _with_traces test
> > variant instead of appending to the existing test_args array.
> >
> > Fixes: 0aeaf75df879 ("test: define unit tests suites based on test types")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > app/test/suites/meson.build | 14 +++++++++++---
> > 1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
> > index 38df1cfec2..e62170bebf 100644
> > --- a/app/test/suites/meson.build
> > +++ b/app/test/suites/meson.build
> > @@ -106,10 +106,18 @@ foreach suite:test_suites
> > is_parallel : false,
> > suite : 'fast-tests')
> > if not is_windows and test_name == 'trace_autotest'
> > - test_args += ['--trace=.*']
> > - test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
> > + # build separate args list to avoid duplicate --file-prefix
> > + trace_args = test_no_huge_args
> > + trace_args += ['--trace=.*']
> > + trace_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
> > + if is_linux
> > + trace_args += ['--file-prefix=trace_autotest_with_traces']
> > + endif
> > + if get_option('default_library') == 'shared'
> > + trace_args += ['-d', dpdk_drivers_build_dir]
> > + endif
> > test(test_name + '_with_traces', dpdk_test,
> > - args : test_args,
> > + args : trace_args,
> > env: ['DPDK_TEST=' + test_name],
> > timeout : timeout_seconds_fast,
> > is_parallel : false,
> > --
> > 2.51.0
> >
>
> Instead of duplicating some code and hardcoding values for trace_autotest,
> which is error-prone, can we make adding file prefix the last thing we do to
> test_args (perhaps doing it right in the `args : ...` expression) and re-use
> the previous value and logic like we did it before?
Not sure what that would look like. Having special-case args for one test
looks like a kludge anyway.
^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [PATCH 3/6] test: fix race condition in ELF load tests
2026-01-20 0:03 ` Stephen Hemminger
@ 2026-01-20 10:30 ` Marat Khalili
0 siblings, 0 replies; 53+ messages in thread
From: Marat Khalili @ 2026-01-20 10:30 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: stable@dpdk.org, dev@dpdk.org, Konstantin Ananyev
> > P.S. Not worth its own patch, but since you are working on it, can you also
> > do s/sizeof(struct rte_mbuf)/RTE_MBUF_DEFAULT_BUF_SIZE/g ? I overlooked it last
> > time.
>
>
> Where is that bit?
This code in app/test/test_bpf.c is a mix of two worlds:
```c
const struct rte_bpf_prm prm = {
.prog_arg = {
.type = RTE_BPF_ARG_PTR,
.size = sizeof(struct rte_mbuf),
},
};
```
The pointer points to the mbuf buffer, but the size is provided for the mbuf
struct.
Now that I think of it, what lib/bpf/bpf_pkt.c passes to the BPF program is not
a pointer to the mbuf buffer, but a pointer to the packet data, which differs by
RTE_PKTMBUF_HEADROOM, so we should probably specify RTE_MBUF_DEFAULT_DATAROOM
in the size member for our case.
^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution
2026-01-20 0:07 ` Stephen Hemminger
@ 2026-01-20 11:36 ` Marat Khalili
0 siblings, 0 replies; 53+ messages in thread
From: Marat Khalili @ 2026-01-20 11:36 UTC (permalink / raw)
To: Stephen Hemminger
Cc: dev@dpdk.org, stable@dpdk.org, Morten Brørup,
Bruce Richardson
> Not sure what that would look like.
There are multiple ways; how about the one below.
> Having special case args for test looks like a kludge anyway.
100% agree, but it's not like we can easily fix it right now.
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 4c815ea097..84da5e6d0e 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -99,22 +99,20 @@ foreach suite:test_suites
test_args += ['-d', dpdk_drivers_build_dir]
endif
+ if test_name.endswith('_with_traces')
+ if is_windows
+ continue
+ endif
+ test_args += ['--trace=.*']
+ test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
+ endif
+
test(test_name, dpdk_test,
args : test_args,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
suite : 'fast-tests')
- if not is_windows and test_name == 'trace_autotest'
- test_args += ['--trace=.*']
- test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
- test(test_name + '_with_traces', dpdk_test,
- args : test_args,
- env: ['DPDK_TEST=' + test_name],
- timeout : timeout_seconds_fast,
- is_parallel : false,
- suite : 'fast-tests')
- endif
endforeach
endif
endforeach
diff --git a/app/test/test_trace.c b/app/test/test_trace.c
index 97dc69f68c..451d87bd16 100644
--- a/app/test/test_trace.c
+++ b/app/test/test_trace.c
@@ -255,3 +255,4 @@ test_trace(void)
#endif /* !RTE_EXEC_ENV_WINDOWS */
REGISTER_FAST_TEST(trace_autotest, NOHUGE_OK, ASAN_OK, test_trace);
+REGISTER_FAST_TEST(trace_autotest_with_traces, NOHUGE_OK, ASAN_OK, test_trace);
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 00/14] test: fix test failures on high cores
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
` (5 preceding siblings ...)
2026-01-18 20:09 ` [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 01/14] test: add pause to synchronization spinloops Stephen Hemminger
` (14 more replies)
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
7 siblings, 15 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
This series addresses several categories of test infrastructure issues:
1. Spinloop synchronization fixes (patch 1)
Tests using tight spinloops for thread synchronization cause sporadic
failures on high core count AMD Zen systems. The loops starve SMT
sibling threads and create race conditions. Adding rte_pause() to
all synchronization spinloops fixes these issues.
2. Test scaling for high core count systems (patches 2-5)
Several tests use fixed iteration counts that cause timeouts on
systems with many cores due to increased lock contention. A new
helper function test_scale_iterations() scales iterations inversely
with core count to maintain roughly constant test duration.
Affected tests:
- atomic_autotest (Bugzilla #952)
- mcslock_autotest
- stack_autotest
- timer_secondary_autotest
3. BPF test fixes (patches 6-8, 14)
- Fix missing error handling in ELF load tests
- Fix clang 20+ compatibility by restricting to BPF v2 instruction
set (Bugzilla #1844)
- Skip ELF test gracefully if null PMD is disabled
- Fix incorrect size parameter in Rx/Tx load tests
4. Parallel test execution fixes (patches 9-10)
Multiple tests colliding on the default "rte" file-prefix causes EAL
initialization failures when running tests in parallel. Restore unique
file-prefix for all fast-tests on Linux, including a separate prefix
for trace_autotest_with_traces.
5. Test skip conditions (patches 11-13)
Tests that depend on optional drivers (null PMD, eventdev) should skip
gracefully when those drivers are disabled via -Ddisable_drivers=
rather than failing.
v3 - add additional scaling for failures reported on a 96 core system
- gracefully handle the case where the null PMD is disabled in the build
Stephen Hemminger (14):
test: add pause to synchronization spinloops
test: scale atomic test based on core count
test/mcslock: scale test based on number of cores
test/stack: scale test based on number of cores
test/timer: scale test based on number of cores
test/bpf: fix error handling in ELF load tests
test/bpf: fix unsupported BPF instructions in ELF load test
test/bpf: skip ELF test if null PMD disabled
test: add file-prefix for all fast-tests on Linux
test: fix trace_autotest_with_traces parallel execution
test/eventdev: skip test if eventdev driver disabled
test/pcapng: skip test if null driver missing
test/vdev: skip test if no null PMD
test/bpf: pass correct size for Rx/Tx load tests
app/test/bpf/meson.build | 3 +-
app/test/meson.build | 4 +-
app/test/suites/meson.build | 19 +++++---
app/test/test.h | 19 ++++++++
app/test/test_atomic.c | 66 ++++++++++++++++------------
app/test/test_bpf.c | 13 ++++--
app/test/test_event_eth_tx_adapter.c | 11 +++--
app/test/test_mcslock.c | 10 +++--
app/test/test_stack.c | 8 ++--
app/test/test_threads.c | 17 +++----
app/test/test_timer_secondary.c | 14 +++---
11 files changed, 121 insertions(+), 63 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v3 01/14] test: add pause to synchronization spinloops
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 02/14] test: scale atomic test based on core count Stephen Hemminger
` (13 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Bruce Richardson
The atomic and thread tests use tight spinloops to synchronize.
These spinloops lack rte_pause() which causes problems on high core
count systems, particularly AMD Zen architectures where:
- Tight spinloops without pause can starve SMT sibling threads
- Memory ordering and store-buffer forwarding behave differently
- Higher core counts amplify timing windows for race conditions
This manifests as sporadic test failures on systems with 32+ cores
that don't reproduce on smaller core count systems.
Add rte_pause() to all seven synchronization spinloops to allow
proper CPU resource sharing and improve memory ordering behavior.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
app/test/test_atomic.c | 15 ++++++++-------
app/test/test_threads.c | 17 +++++++++--------
2 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index 8160a33e0e..b1a0d40ece 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -15,6 +15,7 @@
#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_lcore.h>
+#include <rte_pause.h>
#include <rte_random.h>
#include <rte_hash_crc.h>
@@ -114,7 +115,7 @@ test_atomic_usual(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++)
rte_atomic16_inc(&a16);
@@ -150,7 +151,7 @@ static int
test_atomic_tas(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_test_and_set(&a16))
rte_atomic64_inc(&count);
@@ -171,7 +172,7 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
@@ -210,7 +211,7 @@ static int
test_atomic_inc_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_inc_and_test(&a16)) {
rte_atomic64_inc(&count);
@@ -237,7 +238,7 @@ static int
test_atomic_dec_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_dec_and_test(&a16))
rte_atomic64_inc(&count);
@@ -269,7 +270,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
expected = count128;
@@ -407,7 +408,7 @@ test_atomic_exchange(__rte_unused void *arg)
/* Wait until all of the other threads have been dispatched */
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
/*
* Let the battle begin! Every thread attempts to steal the current
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 5cd8bd4559..e2700b4a92 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -7,6 +7,7 @@
#include <rte_thread.h>
#include <rte_debug.h>
#include <rte_stdatomic.h>
+#include <rte_pause.h>
#include "test.h"
@@ -23,7 +24,7 @@ thread_main(void *arg)
rte_atomic_store_explicit(&thread_id_ready, 1, rte_memory_order_release);
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 1)
- ;
+ rte_pause();
return 0;
}
@@ -39,7 +40,7 @@ test_thread_create_join(void)
"Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
@@ -63,7 +64,7 @@ test_thread_create_detach(void)
&thread_main_id) == 0, "Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
@@ -87,7 +88,7 @@ test_thread_priority(void)
"Failed to create thread");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
priority = RTE_THREAD_PRIORITY_NORMAL;
RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
@@ -139,7 +140,7 @@ test_thread_affinity(void)
"Failed to create thread");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(thread_id, &cpuset0) == 0,
"Failed to get thread affinity");
@@ -192,7 +193,7 @@ test_thread_attributes_affinity(void)
"Failed to create attributes affinity thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(thread_id, &cpuset1) == 0,
"Failed to get attributes thread affinity");
@@ -221,7 +222,7 @@ test_thread_attributes_priority(void)
"Failed to create attributes priority thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
"Failed to get thread priority");
@@ -245,7 +246,7 @@ test_thread_control_create_join(void)
"Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 02/14] test: scale atomic test based on core count
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 01/14] test: add pause to synchronization spinloops Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 03/14] test/mcslock: scale test based on number of cores Stephen Hemminger
` (12 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
The atomic test uses tight spinloops to synchronize worker threads
and performs a fixed 1,000,000 iterations per worker. This causes
problems on high core count systems:
With many cores (e.g., 32), the massive contention on shared
atomic variables causes the test to exceed the 10 second timeout.
Scale iterations inversely with core count to maintain roughly
constant test duration regardless of system size.
With 32 cores, iterations drop from 1,000,000 to 31,250 per worker,
which keeps the test well within the timeout while still providing
meaningful coverage.
Add a helper function to test.h so that other similar problems
can be addressed in follow-on patches.
Bugzilla ID: 952
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test.h | 19 ++++++++++++++++
app/test/test_atomic.c | 51 +++++++++++++++++++++++++-----------------
2 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/app/test/test.h b/app/test/test.h
index 10dc45f19d..1f12fc5397 100644
--- a/app/test/test.h
+++ b/app/test/test.h
@@ -12,6 +12,7 @@
#include <rte_hexdump.h>
#include <rte_common.h>
+#include <rte_lcore.h>
#include <rte_os_shim.h>
#define TEST_SUCCESS EXIT_SUCCESS
@@ -223,4 +224,22 @@ void add_test_command(struct test_command *t);
*/
#define REGISTER_ATTIC_TEST REGISTER_TEST_COMMAND
+/**
+ * Scale test iterations inversely with core count.
+ *
+ * On high core count systems, tests with per-core work can exceed
+ * timeout limits due to increased lock contention and scheduling
+ * overhead. This helper scales iterations to keep total test time
+ * roughly constant regardless of core count.
+ *
+ * @param base Base iteration count (used on single-core systems)
+ * @param min Minimum iterations (floor to ensure meaningful testing)
+ * @return Scaled iteration count
+ */
+static inline unsigned int
+test_scale_iterations(unsigned int base, unsigned int min)
+{
+ return RTE_MAX(base / rte_lcore_count(), min);
+}
+
#endif
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index b1a0d40ece..2a4531b833 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -10,6 +10,7 @@
#include <sys/queue.h>
#include <rte_memory.h>
+#include <rte_common.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
#include <rte_atomic.h>
@@ -101,7 +102,15 @@
#define NUM_ATOMIC_TYPES 3
-#define N 1000000
+#define N_BASE 1000000u
+#define N_MIN 10000u
+
+/*
+ * Number of iterations for each test, scaled inversely with core count.
+ * More cores means more contention which increases time per operation.
+ * Calculated once at test start to avoid repeated computation in workers.
+ */
+static unsigned int num_iterations;
static rte_atomic16_t a16;
static rte_atomic32_t a32;
@@ -112,36 +121,36 @@ static rte_atomic32_t synchro;
static int
test_atomic_usual(__rte_unused void *arg)
{
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_inc(&a16);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_dec(&a16);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_add(&a16, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_sub(&a16, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_inc(&a32);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_dec(&a32);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_add(&a32, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_sub(&a32, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_inc(&a64);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_dec(&a64);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_add(&a64, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_sub(&a64, 5);
return 0;
@@ -169,12 +178,12 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
uint32_t tmp16;
uint32_t tmp32;
uint64_t tmp64;
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
rte_atomic64_add(&count, tmp16);
@@ -274,7 +283,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
expected = count128;
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
do {
rte_int128_t desired;
@@ -401,7 +410,7 @@ get_crc8(uint8_t *message, int length)
static int
test_atomic_exchange(__rte_unused void *arg)
{
- int i;
+ unsigned int i;
test16_t nt16, ot16; /* new token, old token */
test32_t nt32, ot32;
test64_t nt64, ot64;
@@ -417,7 +426,7 @@ test_atomic_exchange(__rte_unused void *arg)
* appropriate crc32 hash for the data) then the test iteration has
* passed. If the token is invalid, increment the counter.
*/
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
/* Test 64bit Atomic Exchange */
nt64.u64 = rte_rand();
@@ -446,6 +455,8 @@ test_atomic_exchange(__rte_unused void *arg)
static int
test_atomic(void)
{
+ num_iterations = test_scale_iterations(N_BASE, N_MIN);
+
rte_atomic16_init(&a16);
rte_atomic32_init(&a32);
rte_atomic64_init(&a64);
@@ -593,7 +604,7 @@ test_atomic(void)
rte_atomic32_clear(&synchro);
iterations = count128.val[0] - count128.val[1];
- if (iterations != (uint64_t)4*N*(rte_lcore_count()-1)) {
+ if (iterations != (uint64_t)4*num_iterations*(rte_lcore_count()-1)) {
printf("128-bit compare and swap failed\n");
return -1;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 03/14] test/mcslock: scale test based on number of cores
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 01/14] test: add pause to synchronization spinloops Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 02/14] test: scale atomic test based on core count Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 10:41 ` Konstantin Ananyev
2026-01-22 0:50 ` [PATCH v3 04/14] test/stack: " Stephen Hemminger
` (11 subsequent siblings)
14 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Honnappa Nagarahalli, Phil Yang,
Gavin Hu
This test uses loops to synchronize but has problems on systems
with a high number of cores. Scale iterations with the number of
cores.
Fixes: 32dcb9fd2a22 ("test/mcslock: add MCS queued lock unit test")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_mcslock.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
index 245df99a5d..b182da72f1 100644
--- a/app/test/test_mcslock.c
+++ b/app/test/test_mcslock.c
@@ -42,6 +42,10 @@ RTE_ATOMIC(rte_mcslock_t *) p_ml_perf;
static unsigned int count;
+#define MAX_LOOP_BASE 1000000u
+#define MAX_LOOP_MIN 10000u
+static unsigned int max_loop;
+
static RTE_ATOMIC(uint32_t) synchro;
static int
@@ -60,8 +64,6 @@ test_mcslock_per_core(__rte_unused void *arg)
static uint64_t time_count[RTE_MAX_LCORE] = {0};
-#define MAX_LOOP 1000000
-
static int
load_loop_fn(void *func_param)
{
@@ -78,7 +80,7 @@ load_loop_fn(void *func_param)
rte_wait_until_equal_32((uint32_t *)(uintptr_t)&synchro, 1, rte_memory_order_relaxed);
begin = rte_get_timer_cycles();
- while (lcount < MAX_LOOP) {
+ while (lcount < max_loop) {
if (use_lock)
rte_mcslock_lock(&p_ml_perf, &ml_perf_me);
@@ -175,6 +177,8 @@ test_mcslock(void)
rte_mcslock_t ml_me;
rte_mcslock_t ml_try_me;
+ max_loop = test_scale_iterations(MAX_LOOP_BASE, MAX_LOOP_MIN);
+
/*
* Test mcs lock & unlock on each core
*/
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 04/14] test/stack: scale test based on number of cores
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (2 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 03/14] test/mcslock: scale test based on number of cores Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 05/14] test/timer: " Stephen Hemminger
` (10 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Gage Eads, Olivier Matz
This test uses loops to synchronize but has problems on systems
with a high number of cores. Scale iterations with the number of
cores.
Fixes: 5e2e61b99e91 ("test/stack: check stack API")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_stack.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/app/test/test_stack.c b/app/test/test_stack.c
index abc3114729..797c0a83ea 100644
--- a/app/test/test_stack.c
+++ b/app/test/test_stack.c
@@ -268,7 +268,8 @@ test_free_null(void)
return 0;
}
-#define NUM_ITERS_PER_THREAD 100000
+#define NUM_ITERS_BASE 100000u
+#define NUM_ITERS_MIN 1000u
struct test_args {
struct rte_stack *s;
@@ -280,9 +281,10 @@ static int
stack_thread_push_pop(__rte_unused void *args)
{
void *obj_table[MAX_BULK];
- int i;
+ unsigned int i, num_iters;
- for (i = 0; i < NUM_ITERS_PER_THREAD; i++) {
+ num_iters = test_scale_iterations(NUM_ITERS_BASE, NUM_ITERS_MIN);
+ for (i = 0; i < num_iters; i++) {
unsigned int num;
num = rte_rand() % MAX_BULK;
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 05/14] test/timer: scale test based on number of cores
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (3 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 04/14] test/stack: " Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 06/14] test/bpf: fix error handling in ELF load tests Stephen Hemminger
` (9 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Erik Gabriel Carrillo
On systems with many cores this test would end up taking too long
and time out. Scale the number of timers based on the number of
lcores available.
Fixes: 50247fe03fe0 ("test/timer: exercise new APIs in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_timer_secondary.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index 8bff904ed4..003782b89c 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -27,7 +27,8 @@ test_timer_secondary(void)
#include "process.h"
-#define NUM_TIMERS (1 << 20) /* ~1M timers */
+#define NUM_TIMERS_MAX (1 << 20) /* ~1M timers */
+#define NUM_TIMERS_MIN (1 << 14) /* 16K minimum */
#define NUM_LCORES_NEEDED 3
#define TEST_INFO_MZ_NAME "test_timer_info_mz"
#define MSECPERSEC 1E3
@@ -38,11 +39,12 @@ struct test_info {
unsigned int main_lcore;
unsigned int mgr_lcore;
unsigned int sec_lcore;
+ unsigned int num_timers;
uint32_t timer_data_id;
volatile int expected_count;
volatile int expired_count;
struct rte_mempool *tim_mempool;
- struct rte_timer *expired_timers[NUM_TIMERS];
+ struct rte_timer *expired_timers[NUM_TIMERS_MAX];
int expired_timers_idx;
volatile int exit_flag;
};
@@ -138,8 +140,10 @@ test_timer_secondary(void)
"test data");
test_info = mz->addr;
+ test_info->num_timers = test_scale_iterations(NUM_TIMERS_MAX, NUM_TIMERS_MIN);
+
test_info->tim_mempool = rte_mempool_create("test_timer_mp",
- NUM_TIMERS, sizeof(struct rte_timer), 0, 0,
+ test_info->num_timers, sizeof(struct rte_timer), 0, 0,
NULL, NULL, NULL, NULL, rte_socket_id(), 0);
ret = rte_timer_data_alloc(&test_info->timer_data_id);
@@ -178,14 +182,14 @@ test_timer_secondary(void)
} else if (proc_type == RTE_PROC_SECONDARY) {
uint64_t ticks, timeout_ms;
struct rte_timer *tim;
- int i;
+ unsigned int i;
mz = rte_memzone_lookup(TEST_INFO_MZ_NAME);
TEST_ASSERT_NOT_NULL(mz, "Couldn't lookup memzone for "
"test info");
test_info = mz->addr;
- for (i = 0; i < NUM_TIMERS; i++) {
+ for (i = 0; i < test_info->num_timers; i++) {
rte_mempool_get(test_info->tim_mempool, (void **)&tim);
rte_timer_init(tim);
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 06/14] test/bpf: fix error handling in ELF load tests
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (4 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 05/14] test/timer: " Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
` (8 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili, Konstantin Ananyev
Address related issues found during review:
- Add missing TEST_ASSERT for mempool creation in test_bpf_elf_tx_load
- Initialize port variable in test_bpf_elf_rx_load to avoid undefined
behavior in cleanup path if null_vdev_setup fails early
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
---
app/test/test_bpf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index a7d56f8d86..0e969f9f13 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -3580,6 +3580,7 @@ test_bpf_elf_tx_load(void)
mb_pool = rte_pktmbuf_pool_create("bpf_tx_test_pool", BPF_TEST_POOLSIZE,
0, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
SOCKET_ID_ANY);
+ TEST_ASSERT(mb_pool != NULL, "failed to create mempool");
ret = null_vdev_setup(null_dev, &port, mb_pool);
if (ret != 0)
@@ -3664,7 +3665,7 @@ test_bpf_elf_rx_load(void)
static const char null_dev[] = "net_null_bpf0";
struct rte_mempool *pool = NULL;
char *tmpfile = NULL;
- uint16_t port;
+ uint16_t port = UINT16_MAX;
int ret;
printf("%s start\n", __func__);
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (5 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 06/14] test/bpf: fix error handling in ELF load tests Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 10:33 ` Konstantin Ananyev
2026-01-22 0:50 ` [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled Stephen Hemminger
` (7 subsequent siblings)
14 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili, Konstantin Ananyev
The DPDK BPF library only handles the base BPF instructions.
It does not handle JMP32, which causes the bpf_elf_load
test to fail with clang 20 or later.
Bugzilla ID: 1844
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
---
app/test/bpf/meson.build | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test/bpf/meson.build b/app/test/bpf/meson.build
index aaecfa7018..91c1b434f8 100644
--- a/app/test/bpf/meson.build
+++ b/app/test/bpf/meson.build
@@ -24,7 +24,8 @@ if not xxd.found()
endif
# BPF compiler flags
-bpf_cflags = [ '-O2', '-target', 'bpf', '-g', '-c']
+# At present: DPDK BPF does not support v3 or later
+bpf_cflags = [ '-O2', '-target', 'bpf', '-mcpu=v2', '-g', '-c']
# Enable test in test_bpf.c
cflags += '-DTEST_BPF_ELF_LOAD'
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (6 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-23 11:56 ` Marat Khalili
2026-01-22 0:50 ` [PATCH v3 09/14] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
` (6 subsequent siblings)
14 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Konstantin Ananyev, Marat Khalili
If the null PMD is disabled, the filter load test cannot work
because it depends on that driver. Skip the test if setup fails.
Fixes: 81038845c90b ("test/bpf: add Rx and Tx filtering")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_bpf.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index 0e969f9f13..8bf783c00c 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -3441,7 +3441,11 @@ static int null_vdev_setup(const char *name, uint16_t *port, struct rte_mempool
/* Make a null device */
ret = rte_vdev_init(name, NULL);
- TEST_ASSERT(ret == 0, "rte_vdev_init(%s) failed: %d", name, ret);
+ if (ret != 0) {
+ printf("rte_vdev_init(%s) failed: %d:%s\n",
+ name, ret, strerror(-ret));
+ return -ENOTSUP;
+ }
ret = rte_eth_dev_get_port_by_name(name, port);
TEST_ASSERT(ret == 0, "failed to get port id for %s: %d", name, ret);
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 09/14] test: add file-prefix for all fast-tests on Linux
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (7 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 10/14] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
` (5 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, stable, Marat Khalili, Bruce Richardson,
Morten Brørup
When running tests in parallel on systems with many cores, multiple test
processes collide on the default "rte" file-prefix, causing EAL
initialization failures:
EAL: Cannot allocate memzone list: Device or resource busy
EAL: Cannot init memzone
This occurs because all DPDK tests (including --no-huge tests) use
file-backed arrays for memzone tracking. These files are created at
/var/run/dpdk/<prefix>/fbarray_memzone and require exclusive locking
during initialization. When multiple tests run in parallel with the
same file-prefix, they compete for this lock.
The original implementation included --file-prefix for Linux to
prevent this collision. This was later removed during test
infrastructure refactoring.
Restore the --file-prefix argument for all fast-tests on Linux,
regardless of whether they use hugepages. Tests that exercise
file-prefix functionality (like eal_flags_file_prefix_autotest)
spawn child processes with their own hardcoded prefixes and use
get_current_prefix() to verify the parent's resources, so they work
correctly regardless of what prefix the parent process uses.
Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
app/test/suites/meson.build | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 1010150eee..4c815ea097 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -85,11 +85,15 @@ foreach suite:test_suites
if nohuge
test_args += test_no_huge_args
elif not has_hugepage
- continue #skip this tests
+ continue # skip this test
endif
if not asan and get_option('b_sanitize').contains('address')
continue # skip this test
endif
+ if is_linux
+ # use unique file-prefix to allow parallel runs
+ test_args += ['--file-prefix=' + test_name.underscorify()]
+ endif
if get_option('default_library') == 'shared'
test_args += ['-d', dpdk_drivers_build_dir]
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 10/14] test: fix trace_autotest_with_traces parallel execution
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (8 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 09/14] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled Stephen Hemminger
` (4 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Bruce Richardson, Morten Brørup
The trace_autotest_with_traces test needs a unique file-prefix to avoid
collisions when running in parallel with other tests.
Rather than duplicating test argument construction, restructure to add
file-prefix as the last step. This allows reusing test_args for the
trace variant by concatenating the trace-specific arguments and a
different file-prefix at the end.
Fixes: 0aeaf75df879 ("test: define unit tests suites based on test types")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/suites/meson.build | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 4c815ea097..786c459c24 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -90,26 +90,29 @@ foreach suite:test_suites
if not asan and get_option('b_sanitize').contains('address')
continue # skip this test
endif
- if is_linux
- # use unique file-prefix to allow parallel runs
- test_args += ['--file-prefix=' + test_name.underscorify()]
- endif
-
if get_option('default_library') == 'shared'
test_args += ['-d', dpdk_drivers_build_dir]
endif
+ # use unique file-prefix to allow parallel runs
+ file_prefix = []
+ trace_prefix = []
+ if is_linux
+ file_prefix = ['--file-prefix=' + test_name.underscorify()]
+ trace_prefix = [file_prefix[0] + '_with_traces']
+ endif
+
test(test_name, dpdk_test,
- args : test_args,
+ args : test_args + file_prefix,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
suite : 'fast-tests')
if not is_windows and test_name == 'trace_autotest'
- test_args += ['--trace=.*']
- test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
+ trace_extra = ['--trace=.*',
+ '--trace-dir=@0@'.format(meson.current_build_dir())]
test(test_name + '_with_traces', dpdk_test,
- args : test_args,
+ args : test_args + trace_extra + trace_prefix,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (9 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 10/14] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 20:40 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 12/14] test/pcapng: skip test if null driver missing Stephen Hemminger
` (3 subsequent siblings)
14 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev
Cc: Stephen Hemminger, stable, Naga Harish K S V, Morten Brørup,
Bruce Richardson
If DPDK is built with -Ddisable_drivers=eventdev/* then the
test will be unable to create the needed software eventdev device.
Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_event_eth_tx_adapter.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/app/test/test_event_eth_tx_adapter.c b/app/test/test_event_eth_tx_adapter.c
index a128a4f2c2..0753634e7d 100644
--- a/app/test/test_event_eth_tx_adapter.c
+++ b/app/test/test_event_eth_tx_adapter.c
@@ -208,11 +208,14 @@ testsuite_setup(void)
TEST_ASSERT(err == 0, "Port initialization failed err %d\n", err);
if (rte_event_dev_count() == 0) {
- printf("Failed to find a valid event device,"
- " testing with event_sw0 device\n");
+ printf("Failed to find a valid event device, testing with event_sw0 device\n");
err = rte_vdev_init(vdev_name, NULL);
- TEST_ASSERT(err == 0, "vdev %s creation failed %d\n",
- vdev_name, err);
+ if (err != 0) {
+ printf("vdev %s creation failed %d: %s\n", vdev_name,
+ err, strerror(-err));
+ return TEST_SKIPPED;
+ }
+
event_dev_delete = 1;
}
return err;
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 12/14] test/pcapng: skip test if null driver missing
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (10 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 13/14] test/vdev: skip test if no null PMD Stephen Hemminger
` (2 subsequent siblings)
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, David Marchand
If the null PMD is disabled in the build via -Ddisable_drivers=null,
then do not build the pcapng test.
Fixes: 6f01a3ca5c7f ("test: fix dependency on pcapng")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/meson.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/app/test/meson.build b/app/test/meson.build
index f4d04a6e42..d03e5f7339 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -135,7 +135,7 @@ source_file_deps = {
'test_mp_secondary.c': ['hash'],
'test_net_ether.c': ['net'],
'test_net_ip6.c': ['net'],
- 'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
+ 'test_pcapng.c': ['net_null', 'ethdev', 'pcapng', 'bus_vdev'],
'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
'test_per_lcore.c': [],
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 13/14] test/vdev: skip test if no null PMD
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (11 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 12/14] test/pcapng: skip test if null driver missing Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
2026-03-05 16:39 ` [PATCH v3 00/14] test: fix test failures on high cores David Marchand
14 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Bruce Richardson, Morten Brørup
The vdev test depends on the null PMD.
Skip building the test if that driver is disabled.
Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/meson.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/app/test/meson.build b/app/test/meson.build
index d03e5f7339..ccddf1af8e 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -209,7 +209,7 @@ source_file_deps = {
'test_trace.c': [],
'test_trace_perf.c': [],
'test_trace_register.c': [],
- 'test_vdev.c': ['kvargs', 'bus_vdev'],
+ 'test_vdev.c': ['net_null', 'kvargs', 'bus_vdev'],
'test_version.c': [],
}
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (12 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 13/14] test/vdev: skip test if no null PMD Stephen Hemminger
@ 2026-01-22 0:50 ` Stephen Hemminger
2026-01-23 11:50 ` Marat Khalili
2026-03-05 16:39 ` [PATCH v3 00/14] test: fix test failures on high cores David Marchand
14 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 0:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili, Konstantin Ananyev
Use the correct size in bpf_prm to help with validation.
Fixes: 81038845c90b ("test/bpf: add Rx and Tx filtering")
Cc: stable@dpdk.org
Suggested-by: Marat Khalili <marat.khalili@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_bpf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index 8bf783c00c..bf92bc545c 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -3533,7 +3533,7 @@ static int bpf_tx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
const struct rte_bpf_prm prm = {
.prog_arg = {
.type = RTE_BPF_ARG_PTR,
- .size = sizeof(struct rte_mbuf),
+ .size = sizeof(struct dummy_net),
},
};
int ret;
@@ -3632,7 +3632,7 @@ static int bpf_rx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
const struct rte_bpf_prm prm = {
.prog_arg = {
.type = RTE_BPF_ARG_PTR,
- .size = sizeof(struct rte_mbuf),
+ .size = sizeof(struct dummy_net),
},
};
int ret;
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* RE: [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test
2026-01-22 0:50 ` [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
@ 2026-01-22 10:33 ` Konstantin Ananyev
0 siblings, 0 replies; 53+ messages in thread
From: Konstantin Ananyev @ 2026-01-22 10:33 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: stable@dpdk.org, Marat Khalili
>
> The DPDK BPF library only handles the base BPF instructions.
> It does not handle JMP32 which would cause the bpf_elf_load
> test to fail on clang 20 or later.
>
> Bugzilla ID: 1844
> Fixes: cf1e03f881af ("test/bpf: add ELF loading")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Marat Khalili <marat.khalili@huawei.com>
> ---
> app/test/bpf/meson.build | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/bpf/meson.build b/app/test/bpf/meson.build
> index aaecfa7018..91c1b434f8 100644
> --- a/app/test/bpf/meson.build
> +++ b/app/test/bpf/meson.build
> @@ -24,7 +24,8 @@ if not xxd.found()
> endif
>
> # BPF compiler flags
> -bpf_cflags = [ '-O2', '-target', 'bpf', '-g', '-c']
> +# At present: DPDK BPF does not support v3 or later
> +bpf_cflags = [ '-O2', '-target', 'bpf', '-mcpu=v2', '-g', '-c']
>
> # Enable test in test_bpf.c
> cflags += '-DTEST_BPF_ELF_LOAD'
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> 2.51.0
^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [PATCH v3 03/14] test/mcslock: scale test based on number of cores
2026-01-22 0:50 ` [PATCH v3 03/14] test/mcslock: scale test based on number of cores Stephen Hemminger
@ 2026-01-22 10:41 ` Konstantin Ananyev
2026-01-27 20:31 ` Stephen Hemminger
0 siblings, 1 reply; 53+ messages in thread
From: Konstantin Ananyev @ 2026-01-22 10:41 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org
Cc: stable@dpdk.org, Honnappa Nagarahalli, Phil Yang, Gavin Hu
> This test uses loops to synchronize but has problems on systems
> with high number of cores. Scale iterations to the number of
> cores.
>
> Fixes: 32dcb9fd2a22 ("test/mcslock: add MCS queued lock unit test")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/test_mcslock.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
> index 245df99a5d..b182da72f1 100644
> --- a/app/test/test_mcslock.c
> +++ b/app/test/test_mcslock.c
> @@ -42,6 +42,10 @@ RTE_ATOMIC(rte_mcslock_t *) p_ml_perf;
>
> static unsigned int count;
>
> +#define MAX_LOOP_BASE 1000000u
> +#define MAX_LOOP_MIN 10000u
> +static unsigned int max_loop;
> +
> static RTE_ATOMIC(uint32_t) synchro;
>
> static int
> @@ -60,8 +64,6 @@ test_mcslock_per_core(__rte_unused void *arg)
>
> static uint64_t time_count[RTE_MAX_LCORE] = {0};
>
> -#define MAX_LOOP 1000000
> -
> static int
> load_loop_fn(void *func_param)
> {
> @@ -78,7 +80,7 @@ load_loop_fn(void *func_param)
> rte_wait_until_equal_32((uint32_t *)(uintptr_t)&synchro, 1,
> rte_memory_order_relaxed);
>
> begin = rte_get_timer_cycles();
> - while (lcount < MAX_LOOP) {
> + while (lcount < max_loop) {
> if (use_lock)
> rte_mcslock_lock(&p_ml_perf, &ml_perf_me);
>
> @@ -175,6 +177,8 @@ test_mcslock(void)
> rte_mcslock_t ml_me;
> rte_mcslock_t ml_try_me;
>
> + max_loop = test_scale_iterations(MAX_LOOP_BASE, MAX_LOOP_MIN);
> +
Here, and in other similar cases, would it make sense to terminate by timeout
(i.e. number of cycles passed since start of the test)?
> /*
> * Test mcs lock & unlock on each core
> */
> --
> 2.51.0
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled
2026-01-22 0:50 ` [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled Stephen Hemminger
@ 2026-01-22 20:40 ` Stephen Hemminger
2026-01-23 9:06 ` Bruce Richardson
0 siblings, 1 reply; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-22 20:40 UTC (permalink / raw)
To: dev; +Cc: stable, Naga Harish K S V, Morten Brørup, Bruce Richardson
On Wed, 21 Jan 2026 16:50:27 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:
> If DPDK is built with -Ddisable_drivers=eventdev/* then the
> test will be unable to create the needed software eventdev device.
>
> Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Maybe more needs to be done, or this should be handled in the meson build.
Not sure.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled
2026-01-22 20:40 ` Stephen Hemminger
@ 2026-01-23 9:06 ` Bruce Richardson
0 siblings, 0 replies; 53+ messages in thread
From: Bruce Richardson @ 2026-01-23 9:06 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, stable, Naga Harish K S V, Morten Brørup
On Thu, Jan 22, 2026 at 12:40:44PM -0800, Stephen Hemminger wrote:
> On Wed, 21 Jan 2026 16:50:27 -0800
> Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> > If DPDK is built with -Ddisable_drivers=eventdev/* then the
> > test will be unable to create the needed software eventdev device.
> >
> > Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> Maybe more needs to be done or handle this in meson build.
> Not sure.
If the code changes in the tests to handle this are only a few lines, as here,
it's probably not worth doing. Also, this particular case can't be handled
in meson.build, since it has a runtime as well as a build-time dependency. For
example, a build could include the DLB2 eventdev driver, so the test would
run on systems with that hardware; but if the SW eventdev fallback is not
built, the test fails on systems without it. Build-time dependencies can't
handle that sort of situation.
/Bruce
^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests
2026-01-22 0:50 ` [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
@ 2026-01-23 11:50 ` Marat Khalili
0 siblings, 0 replies; 53+ messages in thread
From: Marat Khalili @ 2026-01-23 11:50 UTC (permalink / raw)
To: Stephen Hemminger, dev@dpdk.org; +Cc: stable@dpdk.org, Konstantin Ananyev
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday 22 January 2026 00:51
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Marat Khalili
> <marat.khalili@huawei.com>; Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Subject: [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests
>
> Use the correct size in bpf_prm to help with validation.
>
> Fixes: 81038845c90b ("test/bpf: add Rx and Tx filtering")
> Cc: stable@dpdk.org
>
> Suggested-by: Marat Khalili <marat.khalili@huawei.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/test_bpf.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
> index 8bf783c00c..bf92bc545c 100644
> --- a/app/test/test_bpf.c
> +++ b/app/test/test_bpf.c
> @@ -3533,7 +3533,7 @@ static int bpf_tx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
> const struct rte_bpf_prm prm = {
> .prog_arg = {
> .type = RTE_BPF_ARG_PTR,
> - .size = sizeof(struct rte_mbuf),
> + .size = sizeof(struct dummy_net),
> },
> };
> int ret;
> @@ -3632,7 +3632,7 @@ static int bpf_rx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
> const struct rte_bpf_prm prm = {
> .prog_arg = {
> .type = RTE_BPF_ARG_PTR,
> - .size = sizeof(struct rte_mbuf),
> + .size = sizeof(struct dummy_net),
> },
> };
> int ret;
> --
> 2.51.0
Thank you for fixing this. Not sure we actually have a VLAN tag there, but it
is less misleading now in any case.
Acked-by: Marat Khalili <marat.khalili@huawei.com>
^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled
2026-01-22 0:50 ` [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled Stephen Hemminger
@ 2026-01-23 11:56 ` Marat Khalili
0 siblings, 0 replies; 53+ messages in thread
From: Marat Khalili @ 2026-01-23 11:56 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: stable@dpdk.org, Konstantin Ananyev, dev@dpdk.org
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday 22 January 2026 00:50
> To: dev@dpdk.org
> Cc: Stephen Hemminger <stephen@networkplumber.org>; stable@dpdk.org; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; Marat Khalili <marat.khalili@huawei.com>
> Subject: [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled
>
> If null PMD is disabled, the test to load filter can not work
> because it uses that. Change to skip the test if setup fails.
>
> Fixes: 81038845c90b ("test/bpf: add Rx and Tx filtering")
> Cc: stable@dpdk.org
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> app/test/test_bpf.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
> index 0e969f9f13..8bf783c00c 100644
> --- a/app/test/test_bpf.c
> +++ b/app/test/test_bpf.c
> @@ -3441,7 +3441,11 @@ static int null_vdev_setup(const char *name, uint16_t *port, struct rte_mempool
>
> /* Make a null device */
> ret = rte_vdev_init(name, NULL);
> - TEST_ASSERT(ret == 0, "rte_vdev_init(%s) failed: %d", name, ret);
> + if (ret != 0) {
> + printf("rte_vdev_init(%s) failed: %d:%s\n",
> + name, ret, strerror(-ret));
> + return -ENOTSUP;
> + }
>
> ret = rte_eth_dev_get_port_by_name(name, port);
> TEST_ASSERT(ret == 0, "failed to get port id for %s: %d", name, ret);
> --
> 2.51.0
I know multiple approaches were considered, but this one is OK as well.
However, I would only return -ENOTSUP if we _know_ the null PMD is disabled,
not if it just failed to initialize.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 03/14] test/mcslock: scale test based on number of cores
2026-01-22 10:41 ` Konstantin Ananyev
@ 2026-01-27 20:31 ` Stephen Hemminger
0 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-01-27 20:31 UTC (permalink / raw)
To: Konstantin Ananyev
Cc: dev@dpdk.org, stable@dpdk.org, Honnappa Nagarahalli, Phil Yang,
Gavin Hu
On Thu, 22 Jan 2026 10:41:04 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > This test uses loops to synchronize but has problems on systems
> > with high number of cores. Scale iterations to the number of
> > cores.
> >
> > Fixes: 32dcb9fd2a22 ("test/mcslock: add MCS queued lock unit test")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > app/test/test_mcslock.c | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
> > index 245df99a5d..b182da72f1 100644
> > --- a/app/test/test_mcslock.c
> > +++ b/app/test/test_mcslock.c
> > @@ -42,6 +42,10 @@ RTE_ATOMIC(rte_mcslock_t *) p_ml_perf;
> >
> > static unsigned int count;
> >
> > +#define MAX_LOOP_BASE 1000000u
> > +#define MAX_LOOP_MIN 10000u
> > +static unsigned int max_loop;
> > +
> > static RTE_ATOMIC(uint32_t) synchro;
> >
> > static int
> > @@ -60,8 +64,6 @@ test_mcslock_per_core(__rte_unused void *arg)
> >
> > static uint64_t time_count[RTE_MAX_LCORE] = {0};
> >
> > -#define MAX_LOOP 1000000
> > -
> > static int
> > load_loop_fn(void *func_param)
> > {
> > @@ -78,7 +80,7 @@ load_loop_fn(void *func_param)
> > rte_wait_until_equal_32((uint32_t *)(uintptr_t)&synchro, 1,
> > rte_memory_order_relaxed);
> >
> > begin = rte_get_timer_cycles();
> > - while (lcount < MAX_LOOP) {
> > + while (lcount < max_loop) {
> > if (use_lock)
> > rte_mcslock_lock(&p_ml_perf, &ml_perf_me);
> >
> > @@ -175,6 +177,8 @@ test_mcslock(void)
> > rte_mcslock_t ml_me;
> > rte_mcslock_t ml_try_me;
> >
> > + max_loop = test_scale_iterations(MAX_LOOP_BASE, MAX_LOOP_MIN);
> > +
>
> Here, and in other similar cases, would it make sense to terminate by timeout
> (i.e. number of cycles passed since start of the test)?
>
Thought about that, but there are a couple of issues:
1. The act of reading the TSC value would change CPU behavior and reduce
the amount of contention, potentially hiding bugs.
2. If the test suddenly had much worse performance, the number of iterations
completed would be much lower, but a timeout-based test would not notice it.
Conclusion: keep the original iteration-based limits.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 00/14] test: fix test failures on high cores
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
` (13 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
@ 2026-03-05 16:39 ` David Marchand
14 siblings, 0 replies; 53+ messages in thread
From: David Marchand @ 2026-03-05 16:39 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, Thomas Monjalon
On Thu, 22 Jan 2026 at 01:54, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> This series addresses several categories of test infrastructure issues:
>
> 1. Spinloop synchronization fixes (patch 1)
>
> Tests using tight spinloops for thread synchronization cause sporadic
> failures on high core count AMD Zen systems. The loops starve SMT
> sibling threads and create race conditions. Adding rte_pause() to
> all synchronization spinloops fixes these issues.
>
> 2. Test scaling for high core count systems (patches 2-5)
>
> Several tests use fixed iteration counts that cause timeouts on
> systems with many cores due to increased lock contention. A new
> helper function test_scale_iterations() scales iterations inversely
> with core count to maintain roughly constant test duration.
>
> Affected tests:
> - atomic_autotest (Bugzilla #952)
> - mcslock_autotest
> - stack_autotest
> - timer_secondary_autotest
>
> 3. BPF test fixes (patches 6-8, 14)
>
> - Fix missing error handling in ELF load tests
> - Fix clang 20+ compatibility by restricting to BPF v2 instruction
> set (Bugzilla #1844)
> - Skip ELF test gracefully if null PMD is disabled
> - Fix incorrect size parameter in Rx/Tx load tests
>
> 4. Parallel test execution fixes (patches 9-10)
>
> Multiple tests colliding on the default "rte" file-prefix causes EAL
> initialization failures when running tests in parallel. Restore unique
> file-prefix for all fast-tests on Linux, including a separate prefix
> for trace_autotest_with_traces.
>
> 5. Test skip conditions (patches 11-13)
>
> Tests that depend on optional drivers (null PMD, eventdev) should skip
> gracefully when those drivers are disabled via -Ddisable_drivers=
> rather than failing.
>
> v4 - add additional scaling for failures reported on 96 core system
> - gracefully handle case where null PMD disabled in build
>
> Stephen Hemminger (14):
> test: add pause to synchronization spinloops
> test: scale atomic test based on core count
> test/mcslock: scale test based on number of cores
> test/stack: scale test based on number of cores
> test/timer: scale test based on number of cores
> test/bpf: fix error handling in ELF load tests
> test/bpf: fix unsupported BPF instructions in ELF load test
> test/bpf: skip ELF test if null PMD disabled
> test: add file-prefix for all fast-tests on Linux
> test: fix trace_autotest_with_traces parallel execution
> test/eventdev: skip test if eventdev driver disabled
> test/pcapng: skip test if null driver missing
> test/vdev: skip test if no null PMD
> test/bpf: pass correct size for Rx/Tx load tests
>
> app/test/bpf/meson.build | 3 +-
> app/test/meson.build | 4 +-
> app/test/suites/meson.build | 19 +++++---
> app/test/test.h | 19 ++++++++
> app/test/test_atomic.c | 66 ++++++++++++++++------------
> app/test/test_bpf.c | 13 ++++--
> app/test/test_event_eth_tx_adapter.c | 11 +++--
> app/test/test_mcslock.c | 10 +++--
> app/test/test_stack.c | 8 ++--
> app/test/test_threads.c | 17 +++----
> app/test/test_timer_secondary.c | 14 +++---
> 11 files changed, 121 insertions(+), 63 deletions(-)
I think there is some overlap with some other merged patches (like a
conflict on the meson side).
Please check and rebase this series.
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v4 00/11] test: fix test failures on high core count systems
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
` (6 preceding siblings ...)
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 01/11] test: add pause to synchronization spinloops Stephen Hemminger
` (11 more replies)
7 siblings, 12 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger
Fix several test infrastructure issues affecting high core count systems.
Add rte_pause() to synchronization spinloops that starve SMT sibling
threads on AMD Zen systems. Scale iteration counts inversely with core
count via a new test_scale_iterations() helper to prevent timeouts
from lock contention (atomic, mcslock, stack, timer_secondary tests).
Replace volatile variables with C11 atomics in timer_secondary, fix
parallel test execution by restoring unique file-prefix for fast-tests
on Linux, and fix BPF ELF load test error handling and clang 20+
compatibility.
v4 - drop patches already covered elsewhere
- replace volatile with C11 atomics in timer secondary test
- add missing mempool_get error check in timer secondary test
- move iteration scaling out of worker function in stack test
Stephen Hemminger (11):
test: add pause to synchronization spinloops
test/atomic: scale test based on core count
test/mcslock: scale test based on number of cores
test/stack: scale test based on number of cores
test/timer: scale test based on number of cores
test/timer: replace volatile with C11 atomics
test: add file-prefix for all fast-tests on Linux
test: fix trace_autotest_with_traces parallel execution
test/bpf: fix error handling in ELF load tests
test/bpf: fix unsupported BPF instructions in ELF load test
test/bpf: pass correct size for Rx/Tx load tests
app/test/bpf/meson.build | 3 +-
app/test/suites/meson.build | 19 +++++++---
app/test/test.h | 19 ++++++++++
app/test/test_atomic.c | 66 +++++++++++++++++++--------------
app/test/test_bpf.c | 7 ++--
app/test/test_mcslock.c | 10 +++--
app/test/test_stack.c | 11 ++++--
app/test/test_threads.c | 17 +++++----
app/test/test_timer_secondary.c | 45 ++++++++++++++--------
9 files changed, 131 insertions(+), 66 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v4 01/11] test: add pause to synchronization spinloops
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 02/11] test/atomic: scale test based on core count Stephen Hemminger
` (10 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Bruce Richardson
The atomic and thread tests use tight spinloops to synchronize.
These spinloops lack rte_pause(), which causes problems on high core
count systems, particularly on AMD Zen architectures, where:
- Tight spinloops without pause can starve SMT sibling threads
- Memory ordering and store-buffer forwarding behave differently
- Higher core counts amplify timing windows for race conditions
This manifests as sporadic test failures on systems with 32+ cores
that don't reproduce on smaller core count systems.
Add rte_pause() to all seven synchronization spinloops to allow
proper CPU resource sharing and improve memory ordering behavior.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
app/test/test_atomic.c | 15 ++++++++-------
app/test/test_threads.c | 17 +++++++++--------
2 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index 8160a33e0e..b1a0d40ece 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -15,6 +15,7 @@
#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_lcore.h>
+#include <rte_pause.h>
#include <rte_random.h>
#include <rte_hash_crc.h>
@@ -114,7 +115,7 @@ test_atomic_usual(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++)
rte_atomic16_inc(&a16);
@@ -150,7 +151,7 @@ static int
test_atomic_tas(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_test_and_set(&a16))
rte_atomic64_inc(&count);
@@ -171,7 +172,7 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
unsigned i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
for (i = 0; i < N; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
@@ -210,7 +211,7 @@ static int
test_atomic_inc_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_inc_and_test(&a16)) {
rte_atomic64_inc(&count);
@@ -237,7 +238,7 @@ static int
test_atomic_dec_and_test(__rte_unused void *arg)
{
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
if (rte_atomic16_dec_and_test(&a16))
rte_atomic64_inc(&count);
@@ -269,7 +270,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
expected = count128;
@@ -407,7 +408,7 @@ test_atomic_exchange(__rte_unused void *arg)
/* Wait until all of the other threads have been dispatched */
while (rte_atomic32_read(&synchro) == 0)
- ;
+ rte_pause();
/*
* Let the battle begin! Every thread attempts to steal the current
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 5cd8bd4559..e2700b4a92 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -7,6 +7,7 @@
#include <rte_thread.h>
#include <rte_debug.h>
#include <rte_stdatomic.h>
+#include <rte_pause.h>
#include "test.h"
@@ -23,7 +24,7 @@ thread_main(void *arg)
rte_atomic_store_explicit(&thread_id_ready, 1, rte_memory_order_release);
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 1)
- ;
+ rte_pause();
return 0;
}
@@ -39,7 +40,7 @@ test_thread_create_join(void)
"Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
@@ -63,7 +64,7 @@ test_thread_create_detach(void)
&thread_main_id) == 0, "Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
@@ -87,7 +88,7 @@ test_thread_priority(void)
"Failed to create thread");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
priority = RTE_THREAD_PRIORITY_NORMAL;
RTE_TEST_ASSERT(rte_thread_set_priority(thread_id, priority) == 0,
@@ -139,7 +140,7 @@ test_thread_affinity(void)
"Failed to create thread");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(thread_id, &cpuset0) == 0,
"Failed to get thread affinity");
@@ -192,7 +193,7 @@ test_thread_attributes_affinity(void)
"Failed to create attributes affinity thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(thread_id, &cpuset1) == 0,
"Failed to get attributes thread affinity");
@@ -221,7 +222,7 @@ test_thread_attributes_priority(void)
"Failed to create attributes priority thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
"Failed to get thread priority");
@@ -245,7 +246,7 @@ test_thread_control_create_join(void)
"Failed to create thread.");
while (rte_atomic_load_explicit(&thread_id_ready, rte_memory_order_acquire) == 0)
- ;
+ rte_pause();
RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
"Unexpected thread id.");
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 02/11] test/atomic: scale test based on core count
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 01/11] test: add pause to synchronization spinloops Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 03/11] test/mcslock: scale test based on number of cores Stephen Hemminger
` (9 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
The atomic test uses tight spinloops to synchronize worker threads
and performs a fixed 1,000,000 iterations per worker. On high core
count systems (e.g., 32 cores), the massive contention on shared
atomic variables causes the test to exceed the 10 second timeout.
Scale iterations inversely with core count to maintain a roughly
constant test duration regardless of system size.
With 32 cores, iterations drop from 1,000,000 to 31,250 per worker,
which keeps the test well within the timeout while still providing
meaningful coverage.
Add a helper function to test.h so that other similar problems
can be addressed in follow-on patches.
Bugzilla ID: 952
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test.h | 19 ++++++++++++++++
app/test/test_atomic.c | 51 +++++++++++++++++++++++++-----------------
2 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/app/test/test.h b/app/test/test.h
index 10dc45f19d..1f12fc5397 100644
--- a/app/test/test.h
+++ b/app/test/test.h
@@ -12,6 +12,7 @@
#include <rte_hexdump.h>
#include <rte_common.h>
+#include <rte_lcore.h>
#include <rte_os_shim.h>
#define TEST_SUCCESS EXIT_SUCCESS
@@ -223,4 +224,22 @@ void add_test_command(struct test_command *t);
*/
#define REGISTER_ATTIC_TEST REGISTER_TEST_COMMAND
+/**
+ * Scale test iterations inversely with core count.
+ *
+ * On high core count systems, tests with per-core work can exceed
+ * timeout limits due to increased lock contention and scheduling
+ * overhead. This helper scales iterations to keep total test time
+ * roughly constant regardless of core count.
+ *
+ * @param base Base iteration count (used on single-core systems)
+ * @param min Minimum iterations (floor to ensure meaningful testing)
+ * @return Scaled iteration count
+ */
+static inline unsigned int
+test_scale_iterations(unsigned int base, unsigned int min)
+{
+ return RTE_MAX(base / rte_lcore_count(), min);
+}
+
#endif
diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c
index b1a0d40ece..2a4531b833 100644
--- a/app/test/test_atomic.c
+++ b/app/test/test_atomic.c
@@ -10,6 +10,7 @@
#include <sys/queue.h>
#include <rte_memory.h>
+#include <rte_common.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
#include <rte_atomic.h>
@@ -101,7 +102,15 @@
#define NUM_ATOMIC_TYPES 3
-#define N 1000000
+#define N_BASE 1000000u
+#define N_MIN 10000u
+
+/*
+ * Number of iterations for each test, scaled inversely with core count.
+ * More cores means more contention which increases time per operation.
+ * Calculated once at test start to avoid repeated computation in workers.
+ */
+static unsigned int num_iterations;
static rte_atomic16_t a16;
static rte_atomic32_t a32;
@@ -112,36 +121,36 @@ static rte_atomic32_t synchro;
static int
test_atomic_usual(__rte_unused void *arg)
{
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_inc(&a16);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic16_dec(&a16);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_add(&a16, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic16_sub(&a16, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_inc(&a32);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic32_dec(&a32);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_add(&a32, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic32_sub(&a32, 5);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_inc(&a64);
- for (i = 0; i < N; i++)
+ for (i = 0; i < num_iterations; i++)
rte_atomic64_dec(&a64);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_add(&a64, 5);
- for (i = 0; i < (N / 5); i++)
+ for (i = 0; i < (num_iterations / 5); i++)
rte_atomic64_sub(&a64, 5);
return 0;
@@ -169,12 +178,12 @@ test_atomic_addsub_and_return(__rte_unused void *arg)
uint32_t tmp16;
uint32_t tmp32;
uint64_t tmp64;
- unsigned i;
+ unsigned int i;
while (rte_atomic32_read(&synchro) == 0)
rte_pause();
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
tmp16 = rte_atomic16_add_return(&a16, 1);
rte_atomic64_add(&count, tmp16);
@@ -274,7 +283,7 @@ test_atomic128_cmp_exchange(__rte_unused void *arg)
expected = count128;
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
do {
rte_int128_t desired;
@@ -401,7 +410,7 @@ get_crc8(uint8_t *message, int length)
static int
test_atomic_exchange(__rte_unused void *arg)
{
- int i;
+ unsigned int i;
test16_t nt16, ot16; /* new token, old token */
test32_t nt32, ot32;
test64_t nt64, ot64;
@@ -417,7 +426,7 @@ test_atomic_exchange(__rte_unused void *arg)
* appropriate crc32 hash for the data) then the test iteration has
* passed. If the token is invalid, increment the counter.
*/
- for (i = 0; i < N; i++) {
+ for (i = 0; i < num_iterations; i++) {
/* Test 64bit Atomic Exchange */
nt64.u64 = rte_rand();
@@ -446,6 +455,8 @@ test_atomic_exchange(__rte_unused void *arg)
static int
test_atomic(void)
{
+ num_iterations = test_scale_iterations(N_BASE, N_MIN);
+
rte_atomic16_init(&a16);
rte_atomic32_init(&a32);
rte_atomic64_init(&a64);
@@ -593,7 +604,7 @@ test_atomic(void)
rte_atomic32_clear(&synchro);
iterations = count128.val[0] - count128.val[1];
- if (iterations != (uint64_t)4*N*(rte_lcore_count()-1)) {
+ if (iterations != (uint64_t)4*num_iterations*(rte_lcore_count()-1)) {
printf("128-bit compare and swap failed\n");
return -1;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 03/11] test/mcslock: scale test based on number of cores
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 01/11] test: add pause to synchronization spinloops Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 02/11] test/atomic: scale test based on core count Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 04/11] test/stack: " Stephen Hemminger
` (8 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
This test uses loops to synchronize but has problems on systems
with a high number of cores. Scale the iteration count inversely
with the number of cores.
Fixes: 32dcb9fd2a22 ("test/mcslock: add MCS queued lock unit test")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_mcslock.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
index 245df99a5d..b182da72f1 100644
--- a/app/test/test_mcslock.c
+++ b/app/test/test_mcslock.c
@@ -42,6 +42,10 @@ RTE_ATOMIC(rte_mcslock_t *) p_ml_perf;
static unsigned int count;
+#define MAX_LOOP_BASE 1000000u
+#define MAX_LOOP_MIN 10000u
+static unsigned int max_loop;
+
static RTE_ATOMIC(uint32_t) synchro;
static int
@@ -60,8 +64,6 @@ test_mcslock_per_core(__rte_unused void *arg)
static uint64_t time_count[RTE_MAX_LCORE] = {0};
-#define MAX_LOOP 1000000
-
static int
load_loop_fn(void *func_param)
{
@@ -78,7 +80,7 @@ load_loop_fn(void *func_param)
rte_wait_until_equal_32((uint32_t *)(uintptr_t)&synchro, 1, rte_memory_order_relaxed);
begin = rte_get_timer_cycles();
- while (lcount < MAX_LOOP) {
+ while (lcount < max_loop) {
if (use_lock)
rte_mcslock_lock(&p_ml_perf, &ml_perf_me);
@@ -175,6 +177,8 @@ test_mcslock(void)
rte_mcslock_t ml_me;
rte_mcslock_t ml_try_me;
+ max_loop = test_scale_iterations(MAX_LOOP_BASE, MAX_LOOP_MIN);
+
/*
* Test mcs lock & unlock on each core
*/
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 04/11] test/stack: scale test based on number of cores
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (2 preceding siblings ...)
2026-03-05 17:50 ` [PATCH v4 03/11] test/mcslock: scale test based on number of cores Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 05/11] test/timer: " Stephen Hemminger
` (7 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
This test uses loops to synchronize but has problems on systems
with a high number of cores. Scale the iteration count inversely
with the number of cores.
Fixes: 5e2e61b99e91 ("test/stack: check stack API")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_stack.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/app/test/test_stack.c b/app/test/test_stack.c
index abc3114729..5517982774 100644
--- a/app/test/test_stack.c
+++ b/app/test/test_stack.c
@@ -268,10 +268,12 @@ test_free_null(void)
return 0;
}
-#define NUM_ITERS_PER_THREAD 100000
+#define NUM_ITERS_BASE 100000u
+#define NUM_ITERS_MIN 1000u
struct test_args {
struct rte_stack *s;
+ unsigned int num_iters;
};
static struct test_args thread_test_args;
@@ -280,9 +282,9 @@ static int
stack_thread_push_pop(__rte_unused void *args)
{
void *obj_table[MAX_BULK];
- int i;
+ unsigned int i;
- for (i = 0; i < NUM_ITERS_PER_THREAD; i++) {
+ for (i = 0; i < thread_test_args.num_iters; i++) {
unsigned int num;
num = rte_rand() % MAX_BULK;
@@ -308,12 +310,14 @@ test_stack_multithreaded(uint32_t flags)
{
unsigned int lcore_id;
struct rte_stack *s;
+ unsigned int iterations;
int result = 0;
if (rte_lcore_count() < 2) {
printf("Not enough cores for test_stack_multithreaded, expecting at least 2\n");
return TEST_SKIPPED;
}
+ iterations = test_scale_iterations(NUM_ITERS_BASE, NUM_ITERS_MIN);
printf("[%s():%u] Running with %u lcores\n",
__func__, __LINE__, rte_lcore_count());
@@ -325,6 +329,7 @@ test_stack_multithreaded(uint32_t flags)
return -1;
}
+ thread_test_args.num_iters = iterations;
thread_test_args.s = s;
if (rte_eal_mp_remote_launch(stack_thread_push_pop, NULL, CALL_MAIN))
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 05/11] test/timer: scale test based on number of cores
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (3 preceding siblings ...)
2026-03-05 17:50 ` [PATCH v4 04/11] test/stack: " Stephen Hemminger
@ 2026-03-05 17:50 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 06/11] test/timer: replace volatile with C11 atomics Stephen Hemminger
` (6 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:50 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
On a system with many cores this test would take too long and
time out. Scale the number of timers based on the number of
lcores available.
Also add a check that rte_mempool_get() succeeds.
Fixes: 50247fe03fe0 ("test/timer: exercise new APIs in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_timer_secondary.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index 0fc07dcbad..c80dee9c6c 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -27,7 +27,8 @@ test_timer_secondary(void)
#include "process.h"
-#define NUM_TIMERS (1 << 20) /* ~1M timers */
+#define NUM_TIMERS_MAX (1 << 20) /* ~1M timers */
+#define NUM_TIMERS_MIN (1 << 14) /* 16K minimum */
#define NUM_LCORES_NEEDED 3
#define TEST_INFO_MZ_NAME "test_timer_info_mz"
#define MSECPERSEC 1E3
@@ -38,11 +39,12 @@ struct test_info {
unsigned int main_lcore;
unsigned int mgr_lcore;
unsigned int sec_lcore;
+ unsigned int num_timers;
uint32_t timer_data_id;
volatile int expected_count;
volatile int expired_count;
struct rte_mempool *tim_mempool;
- struct rte_timer *expired_timers[NUM_TIMERS];
+ struct rte_timer *expired_timers[NUM_TIMERS_MAX];
int expired_timers_idx;
volatile int exit_flag;
};
@@ -134,8 +136,10 @@ test_timer_secondary(void)
"test data");
test_info = mz->addr;
+ test_info->num_timers = test_scale_iterations(NUM_TIMERS_MAX, NUM_TIMERS_MIN);
+
test_info->tim_mempool = rte_mempool_create("test_timer_mp",
- NUM_TIMERS, sizeof(struct rte_timer), 0, 0,
+ test_info->num_timers, sizeof(struct rte_timer), 0, 0,
NULL, NULL, NULL, NULL, rte_socket_id(), 0);
ret = rte_timer_data_alloc(&test_info->timer_data_id);
@@ -174,15 +178,16 @@ test_timer_secondary(void)
} else if (proc_type == RTE_PROC_SECONDARY) {
uint64_t ticks, timeout_ms;
struct rte_timer *tim;
- int i;
+ unsigned int i;
mz = rte_memzone_lookup(TEST_INFO_MZ_NAME);
TEST_ASSERT_NOT_NULL(mz, "Couldn't lookup memzone for "
"test info");
test_info = mz->addr;
- for (i = 0; i < NUM_TIMERS; i++) {
- rte_mempool_get(test_info->tim_mempool, (void **)&tim);
+ for (i = 0; i < test_info->num_timers; i++) {
+ ret = rte_mempool_get(test_info->tim_mempool, (void **)&tim);
+ TEST_ASSERT_SUCCESS(ret, "Couldn't get timer from mempool");
rte_timer_init(tim);
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 06/11] test/timer: replace volatile with C11 atomics
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (4 preceding siblings ...)
2026-03-05 17:50 ` [PATCH v4 05/11] test/timer: " Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 07/11] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
` (5 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
Replace the volatile variables in the test_timer_secondary shared
memory structure with RTE_ATOMIC() and rte_atomic_*_explicit()
operations.
Change expected_count and expired_count from int to unsigned int
since they are non-negative counters.
Fixes: 50247fe03fe0 ("test/timer: exercise new APIs in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/test_timer_secondary.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index c80dee9c6c..57ab39130d 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -12,7 +12,9 @@
#include <rte_timer.h>
#include <rte_cycles.h>
#include <rte_mempool.h>
+#include <rte_pause.h>
#include <rte_random.h>
+#include <rte_stdatomic.h>
#include "test.h"
@@ -41,12 +43,12 @@ struct test_info {
unsigned int sec_lcore;
unsigned int num_timers;
uint32_t timer_data_id;
- volatile int expected_count;
- volatile int expired_count;
+ RTE_ATOMIC(unsigned int) expected_count;
+ RTE_ATOMIC(unsigned int) expired_count;
struct rte_mempool *tim_mempool;
struct rte_timer *expired_timers[NUM_TIMERS_MAX];
int expired_timers_idx;
- volatile int exit_flag;
+ RTE_ATOMIC(int) exit_flag;
};
static int
@@ -76,7 +78,8 @@ handle_expired_timer(struct rte_timer *tim)
{
struct test_info *test_info = tim->arg;
- test_info->expired_count++;
+ rte_atomic_fetch_add_explicit(&test_info->expired_count, 1,
+ rte_memory_order_relaxed);
test_info->expired_timers[test_info->expired_timers_idx++] = tim;
}
@@ -88,7 +91,8 @@ timer_manage_loop(void *arg)
uint64_t prev_tsc = 0, cur_tsc, diff_tsc;
struct test_info *test_info = arg;
- while (!test_info->exit_flag) {
+ while (!rte_atomic_load_explicit(&test_info->exit_flag,
+ rte_memory_order_acquire)) {
cur_tsc = rte_rdtsc();
diff_tsc = cur_tsc - prev_tsc;
@@ -163,7 +167,8 @@ test_timer_secondary(void)
/* must set exit flag even on error case, so check ret later */
rte_delay_ms(500);
- test_info->exit_flag = 1;
+ rte_atomic_store_explicit(&test_info->exit_flag, 1,
+ rte_memory_order_release);
TEST_ASSERT_SUCCESS(ret, "Secondary process execution failed");
rte_eal_wait_lcore(*mgr_lcorep);
@@ -172,7 +177,10 @@ test_timer_secondary(void)
rte_timer_alt_dump_stats(test_info->timer_data_id, stdout);
#endif
- return test_info->expected_count == test_info->expired_count ?
+ return rte_atomic_load_explicit(&test_info->expected_count,
+ rte_memory_order_relaxed) ==
+ rte_atomic_load_explicit(&test_info->expired_count,
+ rte_memory_order_relaxed) ?
TEST_SUCCESS : TEST_FAILED;
} else if (proc_type == RTE_PROC_SECONDARY) {
@@ -202,7 +210,8 @@ test_timer_secondary(void)
if (ret < 0)
return TEST_FAILED;
- test_info->expected_count++;
+ rte_atomic_fetch_add_explicit(&test_info->expected_count,
+ 1, rte_memory_order_relaxed);
/* randomly leave timer running or stop it */
if (rte_rand() & 1)
@@ -211,7 +220,8 @@ test_timer_secondary(void)
ret = rte_timer_alt_stop(test_info->timer_data_id,
tim);
if (ret == 0) {
- test_info->expected_count--;
+ rte_atomic_fetch_sub_explicit(&test_info->expected_count,
+ 1, rte_memory_order_relaxed);
rte_mempool_put(test_info->tim_mempool,
(void *)tim);
}
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 07/11] test: add file-prefix for all fast-tests on Linux
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (5 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 06/11] test/timer: replace volatile with C11 atomics Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 08/11] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
` (4 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili, Bruce Richardson
When running tests in parallel on systems with many cores, multiple test
processes collide on the default "rte" file-prefix, causing EAL
initialization failures:
EAL: Cannot allocate memzone list: Device or resource busy
EAL: Cannot init memzone
This occurs because all DPDK tests (including --no-huge tests) use
file-backed arrays for memzone tracking. These files are created at
/var/run/dpdk/<prefix>/fbarray_memzone and require exclusive locking
during initialization. When multiple tests run in parallel with the
same file-prefix, they compete for this lock.
The original implementation included --file-prefix for Linux to
prevent this collision. This was later removed during test
infrastructure refactoring.
Restore the --file-prefix argument for all fast-tests on Linux,
regardless of whether they use hugepages. Tests that exercise
file-prefix functionality (like eal_flags_file_prefix_autotest)
spawn child processes with their own hardcoded prefixes and use
get_current_prefix() to verify the parent's resources, so they work
correctly regardless of what prefix the parent process uses.
Fixes: 50823f30f0c8 ("test: build using per-file dependencies")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
app/test/suites/meson.build | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 1010150eee..4c815ea097 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -85,11 +85,15 @@ foreach suite:test_suites
if nohuge
test_args += test_no_huge_args
elif not has_hugepage
- continue #skip this tests
+ continue # skip this test
endif
if not asan and get_option('b_sanitize').contains('address')
continue # skip this test
endif
+ if is_linux
+ # use unique file-prefix to allow parallel runs
+ test_args += ['--file-prefix=' + test_name.underscorify()]
+ endif
if get_option('default_library') == 'shared'
test_args += ['-d', dpdk_drivers_build_dir]
--
2.51.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v4 08/11] test: fix trace_autotest_with_traces parallel execution
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (6 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 07/11] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 09/11] test/bpf: fix error handling in ELF load tests Stephen Hemminger
` (3 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable
The trace_autotest_with_traces test needs a unique file-prefix to avoid
collisions when running in parallel with other tests.
Rather than duplicating test argument construction, restructure to add
file-prefix as the last step. This allows reusing test_args for the
trace variant by concatenating the trace-specific arguments and a
different file-prefix at the end.
Fixes: 0aeaf75df879 ("test: define unit tests suites based on test types")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
app/test/suites/meson.build | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/app/test/suites/meson.build b/app/test/suites/meson.build
index 4c815ea097..786c459c24 100644
--- a/app/test/suites/meson.build
+++ b/app/test/suites/meson.build
@@ -90,26 +90,29 @@ foreach suite:test_suites
if not asan and get_option('b_sanitize').contains('address')
continue # skip this test
endif
- if is_linux
- # use unique file-prefix to allow parallel runs
- test_args += ['--file-prefix=' + test_name.underscorify()]
- endif
-
if get_option('default_library') == 'shared'
test_args += ['-d', dpdk_drivers_build_dir]
endif
+ # use unique file-prefix to allow parallel runs
+ file_prefix = []
+ trace_prefix = []
+ if is_linux
+ file_prefix = ['--file-prefix=' + test_name.underscorify()]
+ trace_prefix = [file_prefix[0] + '_with_traces']
+ endif
+
test(test_name, dpdk_test,
- args : test_args,
+ args : test_args + file_prefix,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
suite : 'fast-tests')
if not is_windows and test_name == 'trace_autotest'
- test_args += ['--trace=.*']
- test_args += ['--trace-dir=@0@'.format(meson.current_build_dir())]
+ trace_extra = ['--trace=.*',
+ '--trace-dir=@0@'.format(meson.current_build_dir())]
test(test_name + '_with_traces', dpdk_test,
- args : test_args,
+ args : test_args + trace_extra + trace_prefix,
env: ['DPDK_TEST=' + test_name],
timeout : timeout_seconds_fast,
is_parallel : false,
--
2.51.0
* [PATCH v4 09/11] test/bpf: fix error handling in ELF load tests
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (7 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 08/11] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 10/11] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
` (2 subsequent siblings)
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili
Address related issues found during review:
- Add missing TEST_ASSERT for mempool creation in test_bpf_elf_tx_load
- Initialize port variable in test_bpf_elf_rx_load to avoid undefined
behavior in cleanup path if null_vdev_setup fails early
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
---
app/test/test_bpf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index 093cf5fe1d..144121ac79 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -4231,6 +4231,7 @@ test_bpf_elf_tx_load(void)
mb_pool = rte_pktmbuf_pool_create("bpf_tx_test_pool", BPF_TEST_POOLSIZE,
0, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
SOCKET_ID_ANY);
+ TEST_ASSERT(mb_pool != NULL, "failed to create mempool");
ret = null_vdev_setup(null_dev, &port, mb_pool);
if (ret != 0)
@@ -4315,7 +4316,7 @@ test_bpf_elf_rx_load(void)
static const char null_dev[] = "net_null_bpf0";
struct rte_mempool *pool = NULL;
char *tmpfile = NULL;
- uint16_t port;
+ uint16_t port = UINT16_MAX;
int ret;
printf("%s start\n", __func__);
--
2.51.0
* [PATCH v4 10/11] test/bpf: fix unsupported BPF instructions in ELF load test
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (8 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 09/11] test/bpf: fix error handling in ELF load tests Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 11/11] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
2026-03-17 13:28 ` [PATCH v4 00/11] test: fix test failures on high core count systems Thomas Monjalon
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili
The DPDK BPF library only handles the base BPF instruction set.
It does not support JMP32 instructions, which causes the bpf_elf_load
test to fail with clang 20 or later.
Bugzilla ID: 1844
Fixes: cf1e03f881af ("test/bpf: add ELF loading")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
---
app/test/bpf/meson.build | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/app/test/bpf/meson.build b/app/test/bpf/meson.build
index aaecfa7018..91c1b434f8 100644
--- a/app/test/bpf/meson.build
+++ b/app/test/bpf/meson.build
@@ -24,7 +24,8 @@ if not xxd.found()
endif
# BPF compiler flags
-bpf_cflags = [ '-O2', '-target', 'bpf', '-g', '-c']
+# At present: DPDK BPF does not support v3 or later
+bpf_cflags = [ '-O2', '-target', 'bpf', '-mcpu=v2', '-g', '-c']
# Enable test in test_bpf.c
cflags += '-DTEST_BPF_ELF_LOAD'
--
2.51.0
* [PATCH v4 11/11] test/bpf: pass correct size for Rx/Tx load tests
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (9 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 10/11] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
@ 2026-03-05 17:51 ` Stephen Hemminger
2026-03-17 13:28 ` [PATCH v4 00/11] test: fix test failures on high core count systems Thomas Monjalon
11 siblings, 0 replies; 53+ messages in thread
From: Stephen Hemminger @ 2026-03-05 17:51 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, stable, Marat Khalili
Use the correct size in bpf_prm to help with validation.
Fixes: 81038845c90b ("test/bpf: add Rx and Tx filtering")
Cc: stable@dpdk.org
Suggested-by: Marat Khalili <marat.khalili@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Marat Khalili <marat.khalili@huawei.com>
---
app/test/test_bpf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index 144121ac79..dd24722450 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -4180,7 +4180,7 @@ static int bpf_tx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
const struct rte_bpf_prm prm = {
.prog_arg = {
.type = RTE_BPF_ARG_PTR,
- .size = sizeof(struct rte_mbuf),
+ .size = sizeof(struct dummy_net),
},
};
int ret;
@@ -4279,7 +4279,7 @@ static int bpf_rx_test(uint16_t port, const char *tmpfile, struct rte_mempool *p
const struct rte_bpf_prm prm = {
.prog_arg = {
.type = RTE_BPF_ARG_PTR,
- .size = sizeof(struct rte_mbuf),
+ .size = sizeof(struct dummy_net),
},
};
int ret;
--
2.51.0
* Re: [PATCH v4 00/11] test: fix test failures on high core count systems
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
` (10 preceding siblings ...)
2026-03-05 17:51 ` [PATCH v4 11/11] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
@ 2026-03-17 13:28 ` Thomas Monjalon
11 siblings, 0 replies; 53+ messages in thread
From: Thomas Monjalon @ 2026-03-17 13:28 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
> Stephen Hemminger (11):
> test: add pause to synchronization spinloops
> test/atomic: scale test based on core count
> test/mcslock: scale test based on number of cores
> test/stack: scale test based on number of cores
> test/timer: scale test based on number of cores
> test/timer: replace volatile with C11 atomics
> test: add file-prefix for all fast-tests on Linux
> test: fix trace_autotest_with_traces parallel execution
> test/bpf: fix error handling in ELF load tests
> test/bpf: fix unsupported BPF instructions in ELF load test
> test/bpf: pass correct size for Rx/Tx load tests
Applied, thanks.
end of thread, other threads:[~2026-03-17 13:28 UTC | newest]
Thread overview: 53+ messages
2026-01-18 20:09 [PATCH 0/6] test: fix sporadic failures on high core count systems Stephen Hemminger
2026-01-18 20:09 ` [PATCH 1/6] test: add pause to synchronization spinloops Stephen Hemminger
2026-01-18 20:09 ` [PATCH 2/6] test: fix timeout for atomic test on high core count systems Stephen Hemminger
2026-01-18 20:09 ` [PATCH 3/6] test: fix race condition in ELF load tests Stephen Hemminger
2026-01-19 11:42 ` Marat Khalili
2026-01-20 0:03 ` Stephen Hemminger
2026-01-20 10:30 ` Marat Khalili
2026-01-19 18:24 ` Stephen Hemminger
2026-01-18 20:09 ` [PATCH 4/6] test: fix unsupported BPF instructions in elf load test Stephen Hemminger
2026-01-19 11:43 ` Marat Khalili
2026-01-18 20:09 ` [PATCH 5/6] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
2026-01-19 13:06 ` Marat Khalili
2026-01-19 14:01 ` Bruce Richardson
2026-01-18 20:09 ` [PATCH 6/6] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
2026-01-19 13:13 ` Marat Khalili
2026-01-20 0:07 ` Stephen Hemminger
2026-01-20 11:36 ` Marat Khalili
2026-01-22 0:50 ` [PATCH v3 00/14] test: fix test failures on high cores Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 01/14] test: add pause to synchronization spinloops Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 02/14] test: scale atomic test based on core count Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 03/14] test/mcslock: scale test based on number of cores Stephen Hemminger
2026-01-22 10:41 ` Konstantin Ananyev
2026-01-27 20:31 ` Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 04/14] test/stack: " Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 05/14] test/timer: " Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 06/14] test/bpf: fix error handling in ELF load tests Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 07/14] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
2026-01-22 10:33 ` Konstantin Ananyev
2026-01-22 0:50 ` [PATCH v3 08/14] test/bpf: skip ELF test if null PMD disabled Stephen Hemminger
2026-01-23 11:56 ` Marat Khalili
2026-01-22 0:50 ` [PATCH v3 09/14] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 10/14] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 11/14] test/eventdev: skip test if eventdev driver disabled Stephen Hemminger
2026-01-22 20:40 ` Stephen Hemminger
2026-01-23 9:06 ` Bruce Richardson
2026-01-22 0:50 ` [PATCH v3 12/14] test/pcapng: skip test if null driver missing Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 13/14] test/vdev: skip test if no null PMD Stephen Hemminger
2026-01-22 0:50 ` [PATCH v3 14/14] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
2026-01-23 11:50 ` Marat Khalili
2026-03-05 16:39 ` [PATCH v3 00/14] test: fix test failures on high cores David Marchand
2026-03-05 17:50 ` [PATCH v4 00/11] test: fix test failures on high core count systems Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 01/11] test: add pause to synchronization spinloops Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 02/11] test/atomic: scale test based on core count Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 03/11] test/mcslock: scale test based on number of cores Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 04/11] test/stack: " Stephen Hemminger
2026-03-05 17:50 ` [PATCH v4 05/11] test/timer: " Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 06/11] test/timer: replace volatile with C11 atomics Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 07/11] test: add file-prefix for all fast-tests on Linux Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 08/11] test: fix trace_autotest_with_traces parallel execution Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 09/11] test/bpf: fix error handling in ELF load tests Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 10/11] test/bpf: fix unsupported BPF instructions in ELF load test Stephen Hemminger
2026-03-05 17:51 ` [PATCH v4 11/11] test/bpf: pass correct size for Rx/Tx load tests Stephen Hemminger
2026-03-17 13:28 ` [PATCH v4 00/11] test: fix test failures on high core count systems Thomas Monjalon