From: Petr Vorel <pvorel@suse.cz>
To: Cyril Hrubis <chrubis@suse.cz>
Cc: ltp@lists.linux.it
Subject: Re: [LTP] [PATCH] [RFC] lib: shell: Fix timeout process races
Date: Mon, 20 Sep 2021 09:52:57 +0200
Message-ID: <YUg92YO7x1wQO/qD@pevik>
In-Reply-To: <20210917141719.5739-1-chrubis@suse.cz>
Hi Cyril,
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>
> There were actually several races in the shell library timeout handling.
> This commit fixes hopefully all of them by:
> * Reimplementing the background timer in C
+1
> * Making sure that the timer has started before we leave test setup
+1
> The rewrite of the background timer to C allows us to run all the timeout
> logic in a single process, which simplifies the whole problem greatly
> since previously we had a chain of processes that established signal
> handlers to kill its descendants, which in the end had a few races in
> it.
> The race that caused the problems is, as far as I can tell, in the way
> the shell spawns its children. I haven't checked the shell code, but I
> guess that when the shell runs a process in the background it does
> vfork() + exec(), and because signals are ignored during that operation,
> a SIGTERM that arrives at that point gets lost.
> That means that we created a race window in the shell library each time
> we started a new process. The rewrite to C simplifies the code but we
> still end up with a single place where this can happen and that is when
> we execute the tst_timeout_kill binary. This is now fixed in the shell
> library by waiting until the background process gets to a sleep state,
> which means that the process has been executed and is waiting for the
> timeout.
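For anyone following the thread, the "wait until the background process
sleeps" idea can be sketched roughly as below. This is only an
illustration under my own assumptions, not the code from the patch:
wait_for_sleep is a hypothetical helper name, it is Linux only because it
reads /proc/PID/stat, and it assumes a sleep(1) that accepts fractional
seconds.

```sh
#!/bin/sh
# Hypothetical helper (not from the patch): poll /proc/PID/stat until the
# process reports state "S" (interruptible sleep), i.e. it has been
# exec()ed and is blocked waiting. Returns 0 on success, 1 if the process
# is gone or never reached "S" within the given number of tries.
wait_for_sleep()
{
	pid="$1"
	tries="${2:-100}"

	while [ "$tries" -gt 0 ]; do
		# A missing stat file means the process no longer exists.
		stat=$(cat "/proc/$pid/stat" 2>/dev/null) || return 1

		# The state is the field right after "(comm)". Strip everything
		# up to the last ") " so a command name containing spaces
		# cannot shift the fields.
		state=${stat##*) }
		state=${state%% *}

		[ "$state" = "S" ] && return 0

		# Assumption: sleep accepts fractions (GNU coreutils does).
		sleep 0.1
		tries=$((tries - 1))
	done

	return 1
}
```

A caller would start the timer with "&", pass "$!" to the helper, and
only then continue with the test setup.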
> After these fixes I haven't been able to reproduce the hang with:
> cat > debug.sh <<EOF
> #!/bin/sh
> TST_SETUP="setup"
> TST_TESTFUNC="do_test"
> . tst_test.sh
> setup()
> {
> 	tst_brk TCONF "quit now!"
> }
> do_test()
> {
> 	tst_res TPASS "pass :)"
> }
> tst_run
> EOF
> # while true; do ./debug.sh; done
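The endless loop above needs someone watching it to notice a hang. A
bounded variant that flags the first stuck iteration could look like the
sketch below. run_until_hang is a hypothetical name of mine, and the
sketch assumes timeout(1) from GNU coreutils, which exits with status 124
when the time limit is hit.

```sh
#!/bin/sh
# Hypothetical wrapper: run a command up to RUNS times and stop at the
# first iteration that does not finish within LIMIT seconds.
# Usage: run_until_hang RUNS LIMIT COMMAND [ARGS...]
run_until_hang()
{
	runs="$1"
	limit="$2"
	shift 2

	i=1
	while [ "$i" -le "$runs" ]; do
		# The command's own exit code is ignored on purpose (the
		# reproducer exits non-zero because of TCONF); only a
		# timeout counts as a failure here.
		timeout "$limit" "$@" >/dev/null 2>&1
		if [ $? -eq 124 ]; then
			echo "iteration $i hung"
			return 1
		fi
		i=$((i + 1))
	done

	echo "no hang in $runs runs"
	return 0
}
```

For example, "run_until_hang 1000 60 ./debug.sh" would run the reproducer
a thousand times with a 60 second limit per run; both numbers are
arbitrary picks.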
I verified it's OK on both VMs that were previously affected.
After release I might write a test for tst_timeout_kill.c.
Thanks for fixing it!
Kind regards,
Petr