From: Wander Lairson Costa <wander@redhat.com>
To: Clark Williams <williams@redhat.com>,
John Kacur <jkacur@redhat.com>,
linux-rt-users@vger.kernel.org
Cc: Juri Lelli <juri.lelli@redhat.com>,
luffyluo@tencent.com, davidlt@rivosinc.com,
Wander Lairson Costa <wander@redhat.com>
Subject: [[PATCH stalld] 00/33] Test suite hardening, correctness fixes, and BPF optimization
Date: Wed, 20 May 2026 11:00:27 -0300
Message-ID: <20260520140104.112142-1-wander@redhat.com>
The stalld functional test suite has accumulated significant
reliability and maintainability issues over time. Several tests
were silently passing due to missing assertions, shell scoping
bugs that discarded subshell results, and grep substring matches
causing false positives on multi-digit CPU systems. Massive
boilerplate duplication existed across the suite, timing-dependent
sleeps caused flakiness, and a daemon bug was discovered where
--force_fifo combined with single-threaded mode silently fell
back to adaptive mode instead of erroring out. Additionally, the
BPF queue_track backend suffered from an O(n) linear array scan
on every sched_wakeup event, consuming significant real-time
latency budget in Telco workloads.
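Two of the test-suite failure modes described above can be reproduced
in isolation. The following is a minimal sketch, not code from the
series: the log line is made up, and the real stalld log format and
test code may differ. It shows a substring grep matching CPU 12 when
looking for CPU 1, and a counter updated inside a pipeline being lost
because the loop runs in a subshell.

```shell
# Hypothetical log line; the real stalld log format may differ.
log_line="stalld: monitoring cpu 12"

# Substring match: searching for CPU 1 also matches CPU 12.
if echo "$log_line" | grep -q "cpu 1"; then
    substring_hit=yes
else
    substring_hit=no
fi

# Word match: -w requires a non-word character (or end of line)
# after the match, so "cpu 1" no longer matches "cpu 12".
if echo "$log_line" | grep -qw "cpu 1"; then
    word_hit=yes
else
    word_hit=no
fi

# Subshell scoping: in sh and bash the loop on the right side of a
# pipeline runs in a subshell, so its update to $failures is discarded.
failures=0
printf 'bad\n' | while read -r line; do
    failures=$((failures + 1))
done
lost=$failures

# Feeding the loop from a here-document keeps it in the current shell,
# so the update survives.
failures=0
while read -r line; do
    failures=$((failures + 1))
done <<EOF
bad
EOF
kept=$failures

echo "substring=$substring_hit word=$word_hit lost=$lost kept=$kept"
```

In the sketch the substring search reports a hit and the word search
does not, while the piped loop leaves the counter at 0 and the
here-document loop leaves it at 1.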
This series addresses these issues in three areas. First, the
daemon's --force_fifo validation is fixed to reject the
incompatible single-threaded mode at startup. Second, the test
suite is overhauled by introducing eight shared helpers
(test_section, cleanup_scenario, find_starved_child,
init_functional_test, assert_stalld_rejects, assert_log_contains,
assert_success, and starvation/boost assertion helpers) that
replace duplicated validation patterns across all functional
tests. The tests are then hardened with proper assertions,
transitioned to fail-fast semantics where any failure immediately
aborts the run, and timing-dependent sleeps are replaced with
event-driven synchronization. The starvation_gen helper's signal
handler is fixed for async-signal safety. Legacy C test
infrastructure, stale documentation, and unreachable code are
removed. Third, the BPF queue_track backend replaces its per-CPU
linear array with a hash map keyed by (cpu, pid) for O(1) task
lookups, reducing thread latency from 5-6us to 3us on real-time
systems.
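To illustrate the move away from fixed sleeps, here is a rough sketch
of waiting on a condition with a timeout. wait_for_log_line is a
hypothetical name and is not taken from tests/helpers/test_helpers.sh.
The sketch polls a log file, which is only one way to approximate the
event-driven waits the series describes, and it assumes a sleep(1)
that accepts fractional seconds.

```shell
# Hypothetical helper: return 0 as soon as PATTERN appears in FILE,
# or 1 once TIMEOUT seconds (default 5) have elapsed.
wait_for_log_line() {
    pattern=$1
    file=$2
    timeout=${3:-5}
    ticks=0
    limit=$((timeout * 10))     # poll in 0.1 s steps
    while [ "$ticks" -lt "$limit" ]; do
        if grep -q -- "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        sleep 0.1
        ticks=$((ticks + 1))
    done
    return 1
}

log=$(mktemp)

# Stand-in for the daemon: writes its message after a short delay.
( sleep 0.3; echo "boosted pid 1234" >> "$log" ) &
writer=$!

# The caller proceeds as soon as the line shows up instead of always
# paying for a worst-case sleep.
if wait_for_log_line "boosted pid" "$log" 5; then
    result=found
else
    result=timeout
fi

wait "$writer"
rm -f "$log"
echo "$result"
```

The wait costs only as long as the condition takes to become true,
and a missing event surfaces as an explicit timeout rather than as a
later, unrelated assertion failure.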
As a result, approximately seven tests that previously passed
silently now validate daemon behavior, the suite synchronizes on
events rather than fixed sleeps, and the BPF backend no longer scans
a per-CPU array on every task lookup. The series yields a net
reduction of roughly 9,850 lines (+874/-10,725).
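The fail-fast behavior mentioned earlier can be pictured with the
sketch below. The helper names fail and assert_log_contains appear in
this series, but the signatures, messages, and bodies shown here are
assumptions for illustration and not the actual test_helpers.sh code.

```shell
# Assumed shape of a fatal fail(): print the reason and exit, so
# nothing after a failed assertion can run.
fail() {
    echo "FAIL: $*" >&2
    exit 1
}

# Assumed signature: $1 = log file, $2 = fixed string expected in it.
assert_log_contains() {
    grep -qF -- "$2" "$1" || fail "'$2' not found in $1"
}

log=$(mktemp)
echo "starved task detected" > "$log"

# Passing assertion: execution continues past it.
assert_log_contains "$log" "starved task"
passed=yes

# Failing assertion, run in a subshell only so that this demo can
# observe the exit status instead of terminating itself.
status=0
( assert_log_contains "$log" "boosted" ) 2>/dev/null || status=$?

rm -f "$log"
echo "passed=$passed status=$status"
```

With a fatal fail(), an assertion needs no if-wrapper at the call
site: a test either reaches its end or stops at the first failed
check.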
Wander Lairson Costa (33):
stalld: Reject --force_fifo in single-threaded mode
tests: Introduce test_section() helper
tests: Introduce cleanup_scenario() helper
tests: Introduce starvation and boost asserts
tests: Introduce find_starved_child() helper
tests: Fix task exit timing in test_boost_restoration
tests: Consolidate and adopt init_functional_test()
tests: Introduce assert_stalld_rejects() helper
tests: Fix boost verification in runtime and duration tests
tests: Fix subshell swallowing test results
tests: Fix repeated log match finding same line
chore: Remove legacy test infrastructure and stale docs
tests: Add assertions to SCHED_OTHER restoration test
tests: Fix CPU selection grep substring matches
tests: Add idle CPU skipping assertion
tests: Remove redundant pkill from cleanup
tests: Introduce and adopt assert_log_contains() helper
tests: Remove weak, redundant, and assertion-free test blocks
tests: Introduce and adopt assert_success() helper
tests: Replace wait conditionals with asserts
tests: Remove if-wrappers around assert calls
tests: Abort immediately on test failure
tests: Remove dead code after making fail() fatal
tests: Introduce and adopt process helpers
tests: Extract wait_for_process_exit helper
tests: Reduce default wait timeouts
tests: Reduce starvation_gen durations
tests: Replace init sleeps in test_affinity
tests: Drop redundant sleeps in test_pidfile
tests: Remove redundant sleeps after start_stalld
tests: Reduce timing and replace sleeps with event waits
tests: Fix async-signal-unsafe handler
bpf: Replace linear task scan with hash map
.claude/CLAUDE.md | 585 ---------
.claude/agents/agent-prompt-engineer.md | 135 --
.claude/agents/c-expert.md | 53 -
.claude/agents/code-reviewer.md | 104 --
.claude/agents/get-agent-hash | 99 --
.claude/agents/git-scm-master.md | 1154 -----------------
.claude/agents/kernel-hacker.md | 231 ----
.claude/agents/plan-validator.md | 130 --
.claude/agents/project-historian.md | 285 ----
.claude/agents/project-librarian.md | 604 ---------
.claude/agents/project-manager.md | 388 ------
.claude/agents/project-scope-guardian.md | 326 -----
.claude/agents/python-expert.md | 57 -
.claude/agents/test-specialist.md | 656 ----------
.claude/agents/update-agent-hashes | 96 --
.claude/context-snapshot.json | 103 --
.claude/rules | 42 -
.gitignore | 5 +-
bpf/stalld.bpf.c | 206 ++-
src/queue_track.c | 95 +-
src/queue_track.h | 107 +-
src/utils.c | 9 +-
tests/BACKEND_USAGE.md | 269 ----
tests/CONTEXT_SNAPSHOT_2025-10-31.md | 231 ----
tests/Makefile | 29 +-
tests/README.md | 29 +-
tests/TODO.md | 727 -----------
tests/functional/test_affinity.sh | 178 +--
tests/functional/test_backend_selection.sh | 24 +-
tests/functional/test_boost_duration.sh | 164 +--
tests/functional/test_boost_period.sh | 151 +--
tests/functional/test_boost_restoration.sh | 310 +----
tests/functional/test_boost_runtime.sh | 179 +--
tests/functional/test_cpu_selection.sh | 76 +-
tests/functional/test_deadline_boosting.sh | 266 +---
tests/functional/test_fifo_boosting.sh | 269 +---
.../test_fifo_priority_starvation.sh | 197 +--
tests/functional/test_force_fifo.sh | 196 +--
tests/functional/test_foreground.sh | 65 +-
tests/functional/test_idle_detection.sh | 195 +--
tests/functional/test_log_only.sh | 30 +-
tests/functional/test_logging_destinations.sh | 38 +-
tests/functional/test_pidfile.sh | 166 +--
tests/functional/test_runqueue_parsing.sh | 417 ------
tests/functional/test_starvation_detection.sh | 281 +---
tests/functional/test_starvation_threshold.sh | 138 +-
tests/functional/test_task_merging.sh | 161 +--
tests/helpers/starvation_gen.c | 2 +-
tests/helpers/test_helpers.sh | 356 +++--
tests/legacy/README.md | 169 ---
tests/legacy/test01.c | 506 --------
tests/legacy/test01_wrapper.sh | 161 ---
tests/run_tests.sh | 149 +--
53 files changed, 874 insertions(+), 10725 deletions(-)
delete mode 100644 .claude/CLAUDE.md
delete mode 100644 .claude/agents/agent-prompt-engineer.md
delete mode 100644 .claude/agents/c-expert.md
delete mode 100644 .claude/agents/code-reviewer.md
delete mode 100755 .claude/agents/get-agent-hash
delete mode 100644 .claude/agents/git-scm-master.md
delete mode 100644 .claude/agents/kernel-hacker.md
delete mode 100644 .claude/agents/plan-validator.md
delete mode 100644 .claude/agents/project-historian.md
delete mode 100644 .claude/agents/project-librarian.md
delete mode 100644 .claude/agents/project-manager.md
delete mode 100644 .claude/agents/project-scope-guardian.md
delete mode 100644 .claude/agents/python-expert.md
delete mode 100644 .claude/agents/test-specialist.md
delete mode 100755 .claude/agents/update-agent-hashes
delete mode 100644 .claude/context-snapshot.json
delete mode 100644 .claude/rules
delete mode 100644 tests/BACKEND_USAGE.md
delete mode 100644 tests/CONTEXT_SNAPSHOT_2025-10-31.md
delete mode 100644 tests/TODO.md
delete mode 100755 tests/functional/test_runqueue_parsing.sh
delete mode 100644 tests/legacy/README.md
delete mode 100644 tests/legacy/test01.c
delete mode 100755 tests/legacy/test01_wrapper.sh
--
2.54.0