From: Cyril Hrubis <chrubis@suse.cz>
To: ltp@lists.linux.it
Subject: [LTP] [PATCH 1/2 v2] tst_test: Fail the test subprocess cannot be killed
Date: Thu, 28 Jun 2018 12:20:45 +0200
Message-ID: <20180628102045.GB21866@rei>
In-Reply-To: <CAEemH2f7TQG+eSznxEnM-NgBx4gjvx=wy6JZh0mjCD_+=mbsGQ@mail.gmail.com>
Hi!
> > + unsigned int sleep = 100;
> > + unsigned int retries = 0;
> > +
> > + while (kill(-test_pid, 0) == 0) {
>
> I'm a little worried about here, image that, if a process_A(test_pid)
> exist to make function kill(-test_pid, 0) return 0 at first time, then
> we go into this while loop, but during the sleeping time process_A
> exit and system reuse the test_pid to another process_B, we will still
> keep looping and very probably make mistake to report TFAIL(with stack
> of process_B dump to ltp user in PATCH 2/2).

That is a known limitation of UNIX. In practice it's very unlikely that
the pid would be reused within a very short timeframe unless there is a
fork bomb running on the system or the system is out of pids, both of
which mean greater trouble.

Just try to run 'watch cat /proc/self/stat' and see how fast the first
number (the pid) increases. On an idle system it grows by a single-digit
number every two seconds, and even if you run a parallel compilation in
the background it takes a long time until we start to reuse recently
used pids.
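
The same observation can be made from C. This is only a rough sketch;
spawn_and_reap() is a made-up helper name and not part of the LTP library:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately, reap it and return the pid it
 * was given. Returns -1 if fork() fails.
 */
static pid_t spawn_and_reap(void)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(0);

	if (pid > 0)
		waitpid(pid, NULL, 0);

	return pid;
}
```

Calling it twice with a pause in between and subtracting the two results
shows roughly how many pids the system handed out in the meantime, as
long as the counter did not wrap around at /proc/sys/kernel/pid_max.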

I guess that we can remove the part that doubles the sleep and increase
the number of retries accordingly; that way we would be much more likely
to hit even a very short interval when the pid was not allocated.
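
Something along these lines, as a sketch only. The function name, the
parameters and the choice to treat only ESRCH as "gone" are assumptions
made for illustration, not what the patch currently does:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/*
 * Probe process group pgid up to max_retries times, sleeping a fixed
 * sleep_ms milliseconds after each unsuccessful probe.
 *
 * Returns 0 as soon as kill() reports ESRCH (no such process group),
 * -1 if the group still appears to exist after the last probe. Any
 * other result, EPERM included, is treated as "still there".
 */
static int wait_for_pgrp_exit(pid_t pgid, unsigned int sleep_ms,
			      unsigned int max_retries)
{
	struct timespec ts = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (long)(sleep_ms % 1000) * 1000000L,
	};
	unsigned int i;

	for (i = 0; i < max_retries; i++) {
		/* Signal 0 sends nothing, it only checks for existence. */
		if (kill(-pgid, 0) == -1 && errno == ESRCH)
			return 0;

		/* May return early if a signal arrives, which is harmless. */
		nanosleep(&ts, NULL);
	}

	return -1;
}
```

With a short constant interval each probe is cheap, so the total timeout
is simply sleep_ms * max_retries and a brief window in which the pid is
unallocated is less likely to be skipped over.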

We can also include various sanity checks: we may examine, to some
degree, the processes whose process group matches test_pid. We can for
instance check if the process has been reparented to init, i.e. parent
pid == 1, which happens when the parent is killed. However I would like
to avoid anything too complicated, since at the point we get to this
situation the kernel has likely been corrupted, so all bets are off; the
system is in an inconsistent state and the best action to take is to try
to inform the tester that something went wrong.
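
For illustration, the reparenting check could be sketched as below.
read_ppid() and is_reparented_to_init() are hypothetical helpers, not
existing LTP functions, and the sketch assumes a Linux /proc:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Return the parent pid of pid as reported by /proc/<pid>/stat,
 * or -1 if the file cannot be read or parsed.
 */
static pid_t read_ppid(pid_t pid)
{
	char path[64], buf[1024];
	char state, *p;
	size_t len;
	int ppid;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);

	f = fopen(path, "r");
	if (!f)
		return -1;

	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';

	/*
	 * The second field (comm) is wrapped in parentheses and may
	 * itself contain spaces or ')', so parse from the last ')'.
	 */
	p = strrchr(buf, ')');
	if (!p)
		return -1;

	/* The fields after comm are the state and the parent pid. */
	if (sscanf(p + 1, " %c %d", &state, &ppid) != 2)
		return -1;

	return ppid;
}

/* Non-zero if pid currently has pid 1 as its parent. */
static int is_reparented_to_init(pid_t pid)
{
	return read_ppid(pid) == 1;
}
```

Note that an orphan is not always adopted by pid 1: if an ancestor has
set PR_SET_CHILD_SUBREAPER, the orphan is reparented to that ancestor
instead, which is one more reason to treat such a check only as a hint.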
--
Cyril Hrubis
chrubis@suse.cz