public inbox for ltp@lists.linux.it
* [LTP] [PATCH 1/2 v2] tst_test: Fail the test if a subprocess cannot be killed
@ 2018-06-27 15:22 Cyril Hrubis
  2018-06-27 15:22 ` [LTP] [PATCH 2/2] [WORK-IN-PROGRESS] lib/tst_test: Dump stack for test processes stuck in kernel Cyril Hrubis
From: Cyril Hrubis @ 2018-06-27 15:22 UTC
  To: ltp

If there are any leftover children, the main test process will likely be
killed while sleeping in wait(). That is because all child processes are
waited for, either explicitly by the test code or implicitly by the test
library.

We also send SIGKILL to the whole process group, so if one of the
children stays alive for long enough after that, it has very likely
ended up stuck in the kernel.

So if any processes are left within the process group we created once
the process group leader, i.e. the main test process, has been waited
for, we loop for a short while to give the init daemon a chance to reap
the process after it has been reparented. If that does not happen within
a few seconds, we declare the process to be stuck in the kernel.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
CC: Eric Biggers <ebiggers3@gmail.com>
---
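Note, not part of the commit message: a minimal standalone sketch of the
polling idea described above, for anyone who wants to try it outside of
the test library. The helper names backoff_total_us(), sleep_us() and
wait_for_pgroup_exit() are made up for this illustration and are not LTP
API. With a first sleep of 100 us and 16 doubling sleeps, which is my
reading of the retry limit in the loop, the sleeps add up to 6553500 us,
i.e. roughly 6.5 seconds.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/*
 * Total time slept by a doubling backoff:
 * first_us + 2 * first_us + ... + 2^(sleeps - 1) * first_us.
 * Hypothetical helper, only used to reason about the timeout budget.
 */
static unsigned long long backoff_total_us(unsigned int first_us,
					   unsigned int sleeps)
{
	assert(sleeps < 32);
	return (unsigned long long)first_us * ((1ULL << sleeps) - 1);
}

/* Sleep for 'us' microseconds, restarting if a signal interrupts us. */
static void sleep_us(unsigned long long us)
{
	struct timespec ts = {
		.tv_sec = us / 1000000,
		.tv_nsec = (us % 1000000) * 1000,
	};

	while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
		;
}

/*
 * Probe the process group with signal 0, which performs the existence
 * and permission checks without delivering anything.
 *
 * Returns 0 once the group has no members left (ESRCH), -1 if members
 * are still present after max_sleeps doubling sleeps.
 */
static int wait_for_pgroup_exit(pid_t pgid, unsigned int first_us,
				unsigned int max_sleeps)
{
	unsigned long long delay = first_us;
	unsigned int i;

	for (i = 0; i < max_sleeps; i++) {
		if (kill(-pgid, 0) == -1 && errno == ESRCH)
			return 0;

		sleep_us(delay);
		delay *= 2;
	}

	return (kill(-pgid, 0) == -1 && errno == ESRCH) ? 0 : -1;
}
```

Two deliberate differences from the loop in the patch: the sketch treats
only ESRCH as "group is gone", while the patch loops as long as kill()
returns 0, and it sleeps with nanosleep() because usleep() is allowed to
fail with EINVAL for intervals of one second or more on some systems. A
caller would do something like wait_for_pgroup_exit(test_pid, 100, 16)
after the group leader has been waited for.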
 lib/tst_test.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/lib/tst_test.c b/lib/tst_test.c
index 80808854e..329168a24 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015-2016 Cyril Hrubis <chrubis@suse.cz>
+ * Copyright (c) 2015-2018 Cyril Hrubis <chrubis@suse.cz>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -1047,6 +1047,21 @@ static int fork_testrun(void)
 	alarm(0);
 	SAFE_SIGNAL(SIGINT, SIG_DFL);
 
+	unsigned int sleep = 100;
+	unsigned int retries = 0;
+
+	while (kill(-test_pid, 0) == 0) {
+
+		usleep(sleep);
+		sleep *= 2;
+
+		if (retries++ <= 14)
+			continue;
+
+		tst_res(TFAIL, "Test process child stuck in the kernel!");
+		tst_brk(TFAIL, "Congratulations, the test likely hit a kernel bug.");
+	}
+
 	if (WIFEXITED(status) && WEXITSTATUS(status))
 		return WEXITSTATUS(status);
 
-- 
2.13.6

