Igt-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Senna Tschudin <peter.senna@linux.intel.com>
To: "igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>,
	Petri Latvala <adrinael@adrinael.net>
Subject: [PATCH i-g-t v6] runner/executor: Abort when child process is killed by a signal
Date: Thu, 3 Oct 2024 15:27:37 +0200	[thread overview]
Message-ID: <999b078d-4338-4158-915e-3bc83338f4e1@linux.intel.com> (raw)
In-Reply-To: <f7319a46-d6d4-4ea5-9863-99ccd7cea3a5@linux.intel.com>

Manually killing a test process results in igt-runner silently marking the
test as incomplete. Change the behavior to abort verbosely when a test is
killed.

In order for the new behavior to work, child termination is probed on
every iteration of the while loop inside monitor_output(). Add the
bool test_timed_out to track when igt_runner is intentionally terminating
a test, and do not interfere with it.

Tested by:
 - using the --per-test-timeout flag and checking that results.json labels
   the test as a timeout.
 - manually killing a test process and checking that results.json labels
   the test as an abort with a message stating the test was killed.

 v6: - clear code comments
     - move child termination probing closer to where sigfd is handled
     - add a bool test_timed_out to ensure we will not interfere with
       igt_runner intentionally killing tests.
     - move the aborting code to outside the while loop

 v5: do not use sigdescr_np() as it seems to be a fairly new lib function
     that does not compile on older Ubuntu
 v4: improve abort code path to not interfere with igt-runner timeouts
 v3: do not interfere with igt-runner killing tests due to timeout and
     diskspace
 v2: fix race condition

Cc: Petri Latvala <adrinael@adrinael.net>
Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
Signed-off-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
---
 runner/executor.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/runner/executor.c b/runner/executor.c
index ac73e1dde..69b4ed939 100644
--- a/runner/executor.c
+++ b/runner/executor.c
@@ -888,12 +888,15 @@ static int monitor_output(pid_t child,
 	const int interval_length = 1;
 	int wd_timeout;
 	int killed = 0; /* 0 if not killed, signal number otherwise */
+	bool child_reaped = false;
+	bool child_killed_by_signal = false;
 	struct timespec time_beg, time_now, time_last_activity, time_last_subtest, time_killed;
 	unsigned long taints = 0;
 	bool aborting = false;
 	size_t disk_usage = 0;
 	bool socket_comms_used = false; /* whether the test actually uses comms */
 	bool results_received = false; /* whether we already have test results that might need overriding if we detect an abort condition */
+	bool test_timed_out = false;
 
 	igt_gettime(&time_beg);
 	time_last_activity = time_last_subtest = time_killed = time_beg;
@@ -1233,6 +1236,14 @@ static int monitor_output(pid_t child,
 			}
 		}
 
+		/* Always check for abort conditions */
+		if (child == waitpid(child, &status, WNOHANG)) {
+			child_reaped = true;
+			if (WIFSIGNALED(status)) {
+				child_killed_by_signal = true;
+				killed = WTERMSIG(status);
+			}
+		}
 		if (sigfd >= 0 && FD_ISSET(sigfd, &set)) {
 			double time;
 
@@ -1241,7 +1252,12 @@ static int monitor_output(pid_t child,
 				errf("Error reading from signalfd: %m\n");
 				continue;
 			} else if (siginfo.ssi_signo == SIGCHLD) {
-				if (child != waitpid(child, &status, WNOHANG)) {
+				if (!child_reaped) {
+					/* Was child killed since we last checked? */
+					if (child == waitpid(child, &status, WNOHANG))
+						child_reaped = true;
+				}
+				if (!child_reaped) {
 					errf("Failed to reap child\n");
 					status = 9999;
 				} else if (WIFEXITED(status)) {
@@ -1303,7 +1319,6 @@ static int monitor_output(pid_t child,
 							fdatasync(outputs[_F_JOURNAL]);
 					}
 				}
-
 				aborting = true;
 				killed = SIGQUIT;
 				if (!kill_child(killed, child))
@@ -1447,6 +1462,7 @@ static int monitor_output(pid_t child,
 						 disk_usage);
 
 		if (timeout_reason) {
+			test_timed_out = true;
 			if (killed == SIGKILL) {
 				/* Nothing that can be done, really. Let's tell the caller we want to abort. */
 
@@ -1485,6 +1501,15 @@ static int monitor_output(pid_t child,
 		}
 	}
 
+	if (!test_timed_out && child_killed_by_signal) {
+		sprintf(buf, "Test terminated by a signal %s (%d).\n",
+				strsignal(killed), -killed);
+		errf("%s", buf);
+
+		*abortreason = strdup(buf);
+		aborting = true;
+	}
+
 	dump_dmesg(kmsgfd, outputs[_F_DMESG]);
 	if (settings->sync)
 		fdatasync(outputs[_F_DMESG]);
-- 
2.34.1


  parent reply	other threads:[~2024-10-03 13:27 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-28  6:40 [PATCH i-g-t] runner/executor.c: Detect tests killed by a signal Peter Senna Tschudin
2024-08-28  8:37 ` ✗ GitLab.Pipeline: warning for " Patchwork
2024-08-28  8:40 ` ✓ CI.xeBAT: success " Patchwork
2024-08-28  8:52 ` ✗ Fi.CI.BAT: failure " Patchwork
2024-08-28 13:15 ` ✗ CI.xeFULL: " Patchwork
2024-08-29 12:13 ` [PATCH i-g-t v2] runner/executor: Detect when child process is " Peter Senna Tschudin
2024-08-29 17:37 ` ✗ CI.xeBAT: failure for runner/executor.c: Detect tests killed by a signal (rev2) Patchwork
2024-08-30  4:28   ` Peter Senna Tschudin
2024-08-29 17:49 ` ✓ Fi.CI.BAT: success " Patchwork
2024-08-30  6:34 ` ✗ CI.xeFULL: failure " Patchwork
2024-08-30 12:39 ` [PATCH i-g-t v3] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-08-30 16:19 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev3) Patchwork
2024-08-30 16:31 ` ✓ Fi.CI.BAT: " Patchwork
2024-08-31  0:57 ` ✗ Fi.CI.IGT: failure for runner/executor.c: Detect tests killed by a signal (rev2) Patchwork
2024-08-31  4:32 ` ✓ CI.xeFULL: success for runner/executor.c: Detect tests killed by a signal (rev3) Patchwork
2024-09-01  3:32 ` ✓ Fi.CI.IGT: " Patchwork
2024-09-03  6:19 ` [PATCH i-g-t v4] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-09-03 11:44 ` ✗ Fi.CI.BUILD: failure for runner/executor.c: Detect tests killed by a signal (rev4) Patchwork
2024-09-03 12:05 ` [PATCH i-g-t v5] runner/executor: Detect when child process is killed by a signal Peter Senna Tschudin
2024-09-10 15:58   ` Kamil Konieczny
2024-09-11  7:06     ` Peter Senna Tschudin
2024-09-11 10:50       ` Kamil Konieczny
2024-09-03 12:36 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev5) Patchwork
2024-09-03 13:04 ` ✓ Fi.CI.BAT: " Patchwork
2024-09-03 14:46 ` ✗ CI.xeFULL: failure " Patchwork
2024-09-03 14:56   ` Peter Senna Tschudin
2024-09-04 14:34 ` ✓ Fi.CI.IGT: success " Patchwork
2024-09-05 10:44 ` ✗ GitLab.Pipeline: warning for runner/executor.c: Detect tests killed by a signal (rev4) Patchwork
2024-10-03 13:27 ` Peter Senna Tschudin [this message]
2024-10-09 14:46   ` [PATCH i-g-t v6] runner/executor: Abort when child process is killed by a signal Kamil Konieczny
2024-10-09 16:36   ` Kamil Konieczny
2024-10-11 15:18   ` Kamil Konieczny
2024-10-03 16:17 ` ✓ CI.xeBAT: success for runner/executor.c: Detect tests killed by a signal (rev6) Patchwork
2024-10-03 16:29 ` ✓ Fi.CI.BAT: " Patchwork
2024-10-03 18:00 ` ✗ CI.xeFULL: failure " Patchwork
2024-10-04  5:30   ` Peter Senna Tschudin
2024-10-04 10:34 ` ✗ GitLab.Pipeline: warning " Patchwork
2024-10-08  3:28 ` ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=999b078d-4338-4158-915e-3bc83338f4e1@linux.intel.com \
    --to=peter.senna@linux.intel.com \
    --cc=adrinael@adrinael.net \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=kamil.konieczny@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox