All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Senna Tschudin <peter.senna@linux.intel.com>
To: "igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>,
	Petri Latvala <adrinael@adrinael.net>
Subject: [PATCH i-g-t v6] runner/executor: Abort when child process is killed by a signal
Date: Thu, 3 Oct 2024 15:27:37 +0200	[thread overview]
Message-ID: <999b078d-4338-4158-915e-3bc83338f4e1@linux.intel.com> (raw)
In-Reply-To: <f7319a46-d6d4-4ea5-9863-99ccd7cea3a5@linux.intel.com>

Manually killing a test process results in igt-runner silently marking the
test as incomplete. Change the behavior to abort verbosely when a test is
killed.

In order for the new behavior to work, child termination is probed on
every iteration of the while loop inside monitor_output(). Add the
bool test_timed_out to track when igt_runner is intentionally terminating
a test, and do not interfere with it.

Tested by:
 - using the --per-test-timeout flag and checking that results.json labels
   the test as a timeout.
 - manually killing a test process and checking that results.json labels
   the test as an abort with a message stating the test was killed.

 v6: - clear code comments
     - move child termination probing closer to where sigfd is handled
     - add a bool test_timed_out to ensure we will not interfere with
       igt_runner intentionally killing tests.
     - move the aborting code to outside the while loop

 v5: do not use sigdescr_np() as it seems to be a fairly new lib function
     that does not compile on older Ubuntu
 v4: improve abort code path to not interfere with igt-runner timeouts
 v3: do not interfere with igt-runner killing tests due to timeout and
     diskspace
 v2: fix race condition

Cc: Petri Latvala <adrinael@adrinael.net>
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
Signed-off-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
---
 runner/executor.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/runner/executor.c b/runner/executor.c
index ac73e1dde..69b4ed939 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -888,12 +888,15 @@ static int monitor_output(pid_t child,
 	const int interval_length = 1;
 	int wd_timeout;
 	int killed = 0; /* 0 if not killed, signal number otherwise */
+	bool child_reaped = false;
+	bool child_killed_by_signal = false;
 	struct timespec time_beg, time_now, time_last_activity, time_last_subtest, time_killed;
 	unsigned long taints = 0;
 	bool aborting = false;
 	size_t disk_usage = 0;
 	bool socket_comms_used = false; /* whether the test actually uses comms */
 	bool results_received = false; /* whether we already have test results that might need overriding if we detect an abort condition */
+	bool test_timed_out = false;
 
 	igt_gettime(&time_beg);
 	time_last_activity = time_last_subtest = time_killed = time_beg;
@@ -1233,6 +1236,14 @@ static int monitor_output(pid_t child,
 			}
 		}
 
+		/* Always check for abort conditions */
+		if (child == waitpid(child, &status, WNOHANG)) {
+			child_reaped = true;
+			if (WIFSIGNALED(status)) {
+				child_killed_by_signal = true;
+				killed = WTERMSIG(status);
+			}
+		}
 		if (sigfd >= 0 && FD_ISSET(sigfd, &set)) {
 			double time;
 
@@ -1241,7 +1252,12 @@ static int monitor_output(pid_t child,
 				errf("Error reading from signalfd: %m\n");
 				continue;
 			} else if (siginfo.ssi_signo == SIGCHLD) {
-				if (child != waitpid(child, &status, WNOHANG)) {
+				if (!child_reaped) {
+					/* Was child killed since we last checked? */
+					if (child == waitpid(child, &status, WNOHANG))
+						child_reaped = true;
+				}
+				if (!child_reaped) {
 					errf("Failed to reap child\n");
 					status = 9999;
 				} else if (WIFEXITED(status)) {
@@ -1303,7 +1319,6 @@ static int monitor_output(pid_t child,
 							fdatasync(outputs[_F_JOURNAL]);
 					}
 				}
-
 				aborting = true;
 				killed = SIGQUIT;
 				if (!kill_child(killed, child))
@@ -1447,6 +1462,7 @@ static int monitor_output(pid_t child,
 						 disk_usage);
 
 		if (timeout_reason) {
+			test_timed_out = true;
 			if (killed == SIGKILL) {
 				/* Nothing that can be done, really. Let's tell the caller we want to abort. */
 
@@ -1485,6 +1501,15 @@ static int monitor_output(pid_t child,
 		}
 	}
 
+	if (!test_timed_out && child_killed_by_signal) {
+		sprintf(buf, "Test terminated by a signal %s (%d).\n",
+				strsignal(killed), -killed);
+		errf("%s", buf);
+
+		*abortreason = strdup(buf);
+		aborting = true;
+	}
+
 	dump_dmesg(kmsgfd, outputs[_F_DMESG]);
 	if (settings->sync)
 		fdatasync(outputs[_F_DMESG]);
-- 
2.34.1


  parent reply	other threads:[~2024-10-03 13:27 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-28  6:40 [PATCH i-g-t] runner/executor.c: Detect tests killed by a signal Peter Senna Tschudin
2024-08-28  8:37 ` ✗ GitLab.Pipeline: warning for " Patchwork
2024-08-28  8:40 ` ✓ CI.xeBAT: success " Patchwork
2024-08-28  8:52 ` ✗ Fi.CI.BAT: failure " Patchwork
2024-08-28 13:15 ` ✗ CI.xeFULL: " Patchwork
2024-08-29 12:13 ` [PATCH i-g-t v2] runner/executor: Detect when child process is " Peter Senna Tschudin
2024-08-29 17:37 ` ✗ CI.xeBAT: failure for runner/executor.c: Detect tests killed by a signal (rev2) Patchwork
2024-08-30  4:28   ` Peter Senna Tschudin
2024-08-29 17:49 ` ✓ Fi.CI.BAT: success " Patchwork
2024-08-30  6:34 ` ✗ CI.xeFULL: failure " Patchwork
2024-08-30 12:39 ` [PATCH i-g-t v3] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-08-30 16:19 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev3) Patchwork
2024-08-30 16:31 ` ✓ Fi.CI.BAT: " Patchwork
2024-08-31  0:57 ` ✗ Fi.CI.IGT: failure for runner/executor.c: Detect tests killed by a signal (rev2) Patchwork
2024-08-31  4:32 ` ✓ CI.xeFULL: success for runner/executor.c: Detect tests killed by a signal (rev3) Patchwork
2024-09-01  3:32 ` ✓ Fi.CI.IGT: " Patchwork
2024-09-03  6:19 ` [PATCH i-g-t v4] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-09-03 11:44 ` ✗ Fi.CI.BUILD: failure for runner/executor.c: Detect tests killed by a signal (rev4) Patchwork
2024-09-03 12:05 ` [PATCH i-g-t v5] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-09-10 15:58   ` Kamil Konieczny
2024-09-11  7:06     ` Peter Senna Tschudin
2024-09-11 10:50       ` Kamil Konieczny
2024-09-03 12:36 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev5) Patchwork
2024-09-03 13:04 ` ✓ Fi.CI.BAT: " Patchwork
2024-09-03 14:46 ` ✗ CI.xeFULL: failure " Patchwork
2024-09-03 14:56   ` Peter Senna Tschudin
2024-09-04 14:34 ` ✓ Fi.CI.IGT: success " Patchwork
2024-09-05 10:44 ` ✗ GitLab.Pipeline: warning for runner/executor.c: Detect tests killed by a signal (rev4) Patchwork
2024-10-03 13:27 ` Peter Senna Tschudin [this message]
2024-10-09 14:46   ` [PATCH i-g-t v6] runner/executor: Abort when child process is killed by a signal Kamil Konieczny
2024-10-09 16:36   ` Kamil Konieczny
2024-10-11 15:18   ` Kamil Konieczny
2024-10-03 16:17 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev6) Patchwork
2024-10-03 16:29 ` ✓ Fi.CI.BAT: " Patchwork
2024-10-03 18:00 ` ✗ CI.xeFULL: failure " Patchwork
2024-10-04  5:30   ` Peter Senna Tschudin
2024-10-04 10:34 ` ✗ GitLab.Pipeline: warning " Patchwork
2024-10-08  3:28 ` ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=999b078d-4338-4158-915e-3bc83338f4e1@linux.intel.com \
    --to=peter.senna@linux.intel.com \
    --cc=adrinael@adrinael.net \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=kamil.konieczny@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.