From: Jan Stancek <jstancek@redhat.com>
To: ltp@lists.linux.it
Subject: [LTP] [RFC] [PATCH] tst_test: Fail the test subprocess cannot be killed
Date: Wed, 27 Jun 2018 09:21:29 -0400 (EDT)
Message-ID: <937810331.29259972.1530105689475.JavaMail.zimbra@redhat.com>
In-Reply-To: <20180627123606.27726-1-chrubis@suse.cz>
----- Original Message -----
> If there are any leftover children the main test process will likely be
> killed while sleeping in wait(). That is because all child processes are
> either waited for explicitly by the test code or implicitly by the test
> library.
>
> We also send SIGKILL to the whole process group, so if one of the
> children continues to live for long enough it very likely means that
> it has ended up stuck in the kernel.
>
> So if there are any processes left within the test's process group
> once the process group leader, i.e. the main test process, has been
> waited for, we loop for a short while to give the init daemon a chance
> to reap the process after it has been reparented. If that does not
> happen within a few seconds, we declare the process to be stuck in the
> kernel.
>
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
> CC: Eric Biggers <ebiggers3@gmail.com>
> ---
> lib/tst_test.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index 80808854e..6316ac865 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1047,6 +1047,21 @@ static int fork_testrun(void)
> 	alarm(0);
> 	SAFE_SIGNAL(SIGINT, SIG_DFL);
>
> +	unsigned int sleep = 100;
> +	unsigned int retries = 0;
> +
> +	while (kill(-test_pid, 0) == 0) {
> +
> +		usleep(sleep);
> +		sleep *= 2;
> +
> +		if (retries++ <= 14)
> +			continue;
> +
> +		tst_res(TINFO, "Test process child stuck in the kernel!");
> +		tst_brk(TFAIL, "Congratulation, likely test hit a kernel bug.");
> +	}
> +
Looks good to me.
I'm wondering whether we should also try to gather some data
that would help the person looking at the logs. For example:
collect the /proc/<pid>/stack output or trigger sysrq-t or sysrq-w.
Regards,
Jan
> 	if (WIFEXITED(status) && WEXITSTATUS(status))
> 		return WEXITSTATUS(status);
>
> --
> 2.13.6
>
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
>