public inbox for linux-rt-users@vger.kernel.org
From: Wander Lairson Costa <wander@redhat.com>
To: williams@redhat.com, jkacur@redhat.com, juri.lelli@redhat.com,
	luffyluo@tencent.com, davidlt@rivosinc.com,
	linux-rt-users@vger.kernel.org
Cc: Wander Lairson Costa <wander@redhat.com>
Subject: [PATCH stalld 02/36] tests/helpers: Fix stalld daemon detection in start_stalld()
Date: Mon, 30 Mar 2026 16:43:25 -0300
Message-ID: <20260330194410.103953-3-wander@redhat.com>
In-Reply-To: <20260330194410.103953-1-wander@redhat.com>

The start_stalld() test helper assumed that a daemonized process always
has a parent PID (PPID) of 1 (init) or 2 (kthreadd). That assumption
does not hold on modern systemd-based Linux systems, where an orphaned
process is commonly re-parented to the per-user systemd instance
(systemd --user) rather than directly to the global init process.
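
The re-parenting behavior can be observed with a small shell sketch. It
is not part of the patch, and orphan_parent is a hypothetical name used
only for illustration. It orphans a short-lived sleep process and
prints the PID of whatever process adopted it; on many systemd-based
desktops that PID is not 1, although the result depends on the
environment.

```shell
# Illustrative sketch only; orphan_parent is a hypothetical helper.
orphan_parent() {
	local orphan ppid

	# The command substitution runs in a subshell that backgrounds sleep,
	# prints its PID, and exits, which leaves sleep orphaned. sleep's
	# output is redirected so the substitution does not wait for it.
	orphan=$(sleep 5 >/dev/null 2>&1 & echo $!)

	# Ask ps for the orphan's current parent PID.
	ppid=$(ps -o ppid= -p "$orphan" | tr -d ' ')
	echo "$ppid"
}

orphan_parent
```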

As a result, test_foreground failed on both the sched_debug and
queue_track backends because the helper could not identify the
daemonized stalld process.

Rework the daemon detection logic. Instead of checking the PPID, the
helper first waits for the background process that launched stalld
(shell_pid) to exit, which indicates that stalld has completed its
double-fork and daemonized. The helper then runs pgrep -n -x stalld to
find the newest running stalld instance, which is taken to be the
daemon. Detection therefore no longer depends on which process adopts
the orphaned daemon.
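
The detection sequence described above can be sketched as a standalone
function. This is a minimal illustration under stated assumptions, not
the helper from test_helpers.sh: find_daemon_pid and DAEMON_PID are
hypothetical names, and the process name is a parameter so the sketch
can be tried without stalld installed.

```shell
# Illustrative sketch only; find_daemon_pid and DAEMON_PID are
# hypothetical names, not part of test_helpers.sh.
find_daemon_pid() {
	local launcher_pid=$1 name=$2 max_attempts=${3:-20}
	local attempt=0

	DAEMON_PID=""
	while [ "$attempt" -lt "$max_attempts" ]; do
		# kill -0 delivers no signal; it fails once the launcher is gone
		if ! kill -0 "$launcher_pid" 2>/dev/null; then
			# -n: newest match only, -x: exact process-name match
			DAEMON_PID=$(pgrep -n -x "$name" 2>/dev/null)
			if [ -n "$DAEMON_PID" ] && kill -0 "$DAEMON_PID" 2>/dev/null; then
				return 0
			fi
		fi
		sleep 0.5
		attempt=$((attempt + 1))
	done

	# Launcher never exited, or no matching process was found
	DAEMON_PID=""
	return 1
}
```

A caller would background the launcher, pass $! as launcher_pid, and
read DAEMON_PID on success. Because pgrep -n matches by name across the
whole system, the sketch assumes that no unrelated process with the
same name starts in the meantime.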

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
---
 tests/helpers/test_helpers.sh | 45 ++++++++++-------------------------
 1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/tests/helpers/test_helpers.sh b/tests/helpers/test_helpers.sh
index c42dab3..5a384a7 100755
--- a/tests/helpers/test_helpers.sh
+++ b/tests/helpers/test_helpers.sh
@@ -321,48 +321,29 @@ start_stalld() {
 				fi
 			fi
 		else
-			# Daemon mode: parent forks, child becomes daemon (ppid=1), parent exits
-			# We need to wait for and find the DAEMON child, not the exiting parent
+			# Daemon mode: stalld double-forks, so the shell_pid will exit
+			# and the daemon will be re-parented. Wait for the shell process
+			# to exit, then find the newest stalld process.
 			while [ $attempt -lt $max_attempts ]; do
 				sleep 0.5
 
-				# Get all stalld processes
-				local pids=$(pgrep -x stalld 2>/dev/null)
-
-				for pid in $pids; do
-					if kill -0 $pid 2>/dev/null; then
-						# Check parent PID - daemon should have ppid=1 (init) or ppid=2 (kthreadd)
-						local ppid=$(ps -o ppid= -p $pid 2>/dev/null | tr -d ' ')
-
-						# For daemonized processes, wait for ppid=1 or 2
-						if [ "$ppid" = "1" ] || [ "$ppid" = "2" ]; then
-							STALLD_PID=$pid
-							# Wait a bit to ensure daemon is stable
-							sleep 0.5
-							# Verify it's still running
-							if kill -0 $pid 2>/dev/null; then
-								break 2  # Break out of both loops
-							else
-								# Daemon died, keep looking
-								STALLD_PID=""
-							fi
-						fi
+				# Check if shell_pid has exited (daemonization complete)
+				if ! kill -0 ${shell_pid} 2>/dev/null; then
+					# Shell process exited, daemon should be running
+					# Use pgrep -n to find the newest stalld process
+					STALLD_PID=$(pgrep -n -x stalld 2>/dev/null)
+					if [ -n "${STALLD_PID}" ] && kill -0 ${STALLD_PID} 2>/dev/null; then
+						break
 					fi
-				done
-
-				# If we found a daemonized process, we're done
-				if [ -n "${STALLD_PID}" ]; then
-					break
 				fi
 
 				attempt=$((attempt + 1))
 			done
 
-			# Last resort: use the shell PID if nothing else worked
+			# If we still don't have a PID, try one more time
 			if [ -z "${STALLD_PID}" ]; then
-				if kill -0 ${shell_pid} 2>/dev/null; then
-					STALLD_PID=${shell_pid}
-				fi
+				sleep 1
+				STALLD_PID=$(pgrep -n -x stalld 2>/dev/null)
 			fi
 		fi
 	fi
-- 
2.53.0

