public inbox for ltp@lists.linux.it
From: Petr Vorel <pvorel@suse.cz>
To: Cyril Hrubis <chrubis@suse.cz>
Cc: ltp@lists.linux.it
Subject: Re: [LTP] [PATCH] [RFC] lib: shell: Fix timeout process races
Date: Mon, 20 Sep 2021 09:52:57 +0200
Message-ID: <YUg92YO7x1wQO/qD@pevik>
In-Reply-To: <20210917141719.5739-1-chrubis@suse.cz>

Hi Cyril,

Reviewed-by: Petr Vorel <pvorel@suse.cz>
Tested-by: Petr Vorel <pvorel@suse.cz>

> There were actually several races in the shell library timeout handling.

> This commit fixes hopefully all of them by:

> * Reimplementing the background timer in C
+1

> * Making sure that the timer has started before we leave test setup
+1

> The rewrite of the background timer to C allows us to run all the timeout
> logic in a single process, which simplifies the whole problem greatly,
> since previously we had a chain of processes that established signal
> handlers to kill their descendants, which in the end had a few races in
> it.

> The race that caused the problems is, as far as I can tell, in the way
> the shell spawns its children. I haven't checked the shell code, but I
> guess that when the shell runs a process in the background it does
> vfork() + exec(), and signals are ignored during that operation. If the
> SIGTERM arrives at that point it gets lost.

> That means that we created a race window in the shell library each time
> we started a new process. The rewrite to C simplifies the code but we
> still end up with a single place where this can happen and that is when
> we execute the tst_timeout_kill binary. This is now fixed in the shell
> library by waiting until the background process gets to a sleep state,
> which means that the process has been executed and is waiting for the
> timeout.
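
For readers following the thread, the "wait until the background process
gets to a sleep state" step could look roughly like the sketch below. It is
not the code from the patch: wait_for_sleep_state() is a hypothetical
helper, a plain sleep(1) stands in for tst_timeout_kill, and the polling
interval and retry count are arbitrary. It assumes a Linux /proc, where the
state letter follows the "(comm)" field in /proc/PID/stat.

```shell
#!/bin/sh
# Hypothetical helper: block until process $1 is in the sleeping state.
# Returns 0 once the state is "S", 1 if the process is gone or the
# retry budget ($2, default 100) is used up.
wait_for_sleep_state()
{
	pid=$1
	tries=${2:-100}

	while [ "$tries" -gt 0 ]; do
		# /proc/PID/stat is "pid (comm) state ..."; comm may contain
		# spaces, so strip everything up to the last ")" first.
		state=$(sed 's/.*) //' "/proc/$pid/stat" 2>/dev/null | cut -d' ' -f1)

		[ "$state" = "S" ] && return 0
		[ -z "$state" ] && return 1

		# Fractional sleep is not POSIX; fall back to one second.
		sleep 0.1 2>/dev/null || sleep 1
		tries=$((tries - 1))
	done

	return 1
}

# Stand-in for the real timer binary, started in the background.
sleep 30 &
timer_pid=$!

if wait_for_sleep_state "$timer_pid"; then
	echo "timer is sleeping"
fi

kill "$timer_pid"
```

The idea is that once the child reports "S" it has finished exec() and is
blocked in its own sleep, so a later SIGTERM should no longer fall into the
spawn window described above.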

> After these fixes I haven't been able to reproduce the hang with:

> cat > debug.sh <<EOF
> #!/bin/sh

> TST_SETUP="setup"
> TST_TESTFUNC="do_test"
> . tst_test.sh

> setup()
> {
>         tst_brk TCONF "quit now!"
> }

> do_test()
> {
>         tst_res TPASS "pass :)"
> }

> tst_run
> EOF

> # while true; do ./debug.sh; done
I verified it's OK on both VMs which were previously affected.
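
If someone wants an unattended variant of the loop above, a bounded version
with a watchdog could be sketched like this. repro_loop() is a hypothetical
helper, not part of LTP; it assumes timeout(1) from coreutils or busybox is
available, and the default iteration count and time limit are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run CMD up to COUNT times, each under a
# LIMIT-second watchdog. Returns 1 on the first run that timeout(1)
# had to stop, 0 if every run finished in time.
repro_loop()
{
	cmd=$1
	count=${2:-1000}
	limit=${3:-60}
	i=1

	while [ "$i" -le "$count" ]; do
		# $cmd is left unquoted on purpose so that it can carry arguments.
		timeout "$limit" $cmd >/dev/null 2>&1
		# timeout(1) exits with 124 when the command ran out of time.
		if [ $? -eq 124 ]; then
			echo "iteration $i hung for more than ${limit}s"
			return 1
		fi
		i=$((i + 1))
	done

	echo "no hang in $count iterations"
	return 0
}
```

It would be called as "repro_loop ./debug.sh 1000 60" from the directory
holding the reproducer, with the LTP shell library in the PATH.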

After release I might write a test for tst_timeout_kill.c.
Thanks for fixing it!

Kind regards,
Petr

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp
