From: Petr Vorel <pvorel@suse.cz>
To: ltp@lists.linux.it
Subject: [LTP] [PATCH v2] tst_test: using SIGTERM to terminate process
Date: Thu, 16 Sep 2021 01:01:28 +0200
Message-ID: <YUJ7SC/FoA8wTaf2@pevik>
In-Reply-To: <YUJ2XA7W3JPODyNC@pevik>

Hi Li, all,

> Hi Li, all,

[ Cc Cyril and Alexey ]

> > We'd better avoid using SIGINT to terminate processes, because
> > it behaves differently depending on the shell.

> > From Joerg Vehlow's test:

> >  - bash does not seem to care about SIGINT delivery to background
> >    processes, but can be blocked using trap

> >  - zsh ignores SIGINT for background processes by default, but can be
> >    allowed using trap

> >  - dash and busybox sh ignore the signal to background processes, and
> >    this cannot be changed with trap
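
A minimal sketch of the baseline behavior behind the observations above,
assuming a POSIX sh run as a script (job control off). POSIX specifies that
commands started with & then inherit SIGINT and SIGQUIT as ignored, while
SIGTERM keeps its default action. The sleep durations are arbitrary, and the
sketch does not try to reproduce the per-shell trap differences listed above.

```shell
#!/bin/sh
# Sketch only: run as a script, not typed at an interactive prompt.

sleep 5 &
pid=$!
sleep 1			# crude: let the background child finish starting up

# SIGINT is expected to be ignored by the background command.
kill -INT "$pid"
sleep 1
if kill -0 "$pid" 2>/dev/null; then
	int_result="survived"
else
	int_result="died"
fi

# SIGTERM is not ignored, so it should end the child.
kill -TERM "$pid" 2>/dev/null
wait "$pid"
term_status=$?		# 128 + signal number when killed by a signal

echo "after SIGINT:  $int_result"
echo "after SIGTERM: wait status $term_status"
```

A wait status above 128 means the child was ended by a signal; SIGTERM is
signal 15, hence 143.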

> > This patch covers the following situations:

> >  1. SIGINT (Ctrl+C) terminates the main process, which does its cleanup
> >     correctly before a timeout

> >  2. The test finishes normally and reaps the _tst_timeout_process in the
> >     background via SIGTERM (sent by _tst_cleanup_timer)

> >  3. The test times out and _tst_kill_test sends SIGTERM to terminate
> >     all processes, and the main process does its cleanup work

> >  4. The test times out but some processes are still alive after
> >     _tst_kill_test sends SIGTERM, so SIGKILL is sent to the whole group

> >  5. The test is terminated by SIGTERM unexpectedly (e.g. system shutdown
> >     or process manager) and does its cleanup work as well

> > Co-authored-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
> > Signed-off-by: Li Wang <liwang@redhat.com>
> > Reviewed-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
> ...

> > +++ b/testcases/lib/tst_test.sh
> > @@ -21,7 +21,8 @@ export TST_LIB_LOADED=1
> >  . tst_security.sh

> >  # default trap function
> > -trap "tst_brk TBROK 'test interrupted or timed out'" INT
> > +trap "tst_brk TBROK 'test interrupted'" INT
> > +trap "unset _tst_setup_timer_pid; tst_brk TBROK 'test terminated'" TERM

> FYI this commit (merged as 4a6b8a697 ("tst_test: using SIGTERM to terminate process"))
> broke net_stress_interface tests, particularly the tst_require_cmds() call (which
> calls tst_brk TCONF):

> # ./if-addr-adddel.sh -c ifconfig
> if-addr-adddel 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> if-addr-adddel 1 TINFO: add local addr 10.0.0.2/24
> if-addr-adddel 1 TINFO: add local addr fd00:1:1:1::2/64
> if-addr-adddel 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> if-addr-adddel 1 TINFO: add remote addr 10.0.0.1/24
> if-addr-adddel 1 TINFO: add remote addr fd00:1:1:1::1/64
> if-addr-adddel 1 TINFO: Network config (local -- remote):
> if-addr-adddel 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> if-addr-adddel 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> if-addr-adddel 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> if-addr-adddel 1 TINFO: timeout per run is 0h 5m 0s
> if-addr-adddel 1 TCONF: 'ifconfig' not found
> => waits till timeout
> if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
> if-addr-adddel 1 TWARN: test terminated

> Debugging shows that it hangs in wait in _tst_cleanup_timer():

> kill -TERM $_tst_setup_timer_pid 2>/dev/null
> wait $_tst_setup_timer_pid 2>/dev/null

> because kill does not kill the test.

> The problem seems to be that unset does not actually work:
> trap "unset _tst_setup_timer_pid; tst_brk TBROK 'test terminated'" TERM

> It looks to be something setup specific, because I discovered this on SLES on
> both bash and dash. Running it on current Debian testing it works on both bash
> and dash. I checked shopt output on both, but don't see anything obvious. It
> must be something else.

OK, by repeatedly running it on Debian with dash I managed to get a hang as well.

Here it does not even quit the test:

if-addr-adddel 1 TCONF: 'ifconfig' not found
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TWARN: test terminated

Maybe not only SIGINT, but even SIGTERM is not reliably delivered to a background process?
Minimal reproducer; on dash it needs a few runs to hang:

cat > debug.sh <<EOF
#!/bin/sh

TST_SETUP="setup"
TST_TESTFUNC="do_test"
. tst_test.sh

setup()
{
	tst_brk TCONF "quit now!"
}

do_test()
{
	tst_res TPASS "pass :)"
}

tst_run
EOF

# while true; do ./debug.sh; done
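
On the question of whether SIGTERM reaches a background process reliably, one
general POSIX shell rule is worth keeping in mind. It is not confirmed as the
cause of this hang. A trap is not run while the shell waits for a foreground
command; it runs only after that command finishes. If the shell is blocked in
the wait builtin instead, the trap runs immediately. The sketch below times
both cases. The function names and the 3 s / 1 s delays are made up for
illustration, and date +%s is assumed to be available (it is widespread but
not required by POSIX).

```shell
#!/bin/sh
# Sketch only: neither child is LTP code.

# Child A: has a TERM trap, but is blocked in a foreground sleep.
child_foreground() {
	trap 'exit 0' TERM
	sleep 3
	:	# keeps sleep from being the last command of the subshell
}

# Child B: same idea, but sleeps in the background and blocks in wait.
child_wait() {
	sleep 3 &
	sleep_pid=$!
	trap 'kill "$sleep_pid" 2>/dev/null; exit 0' TERM
	wait "$sleep_pid"
}

# Run the named function in the background, send it SIGTERM after 1 s,
# and print how many whole seconds passed until it exited.
time_term() {
	start=$(date +%s)
	"$1" &
	pid=$!
	sleep 1		# crude: give the child time to install its trap
	kill -TERM "$pid"
	wait "$pid"
	end=$(date +%s)
	echo $((end - start))
}

fg_secs=$(time_term child_foreground)
wait_secs=$(time_term child_wait)
echo "foreground sleep: exited after ${fg_secs}s"
echo "sleep & wait:     exited after ${wait_secs}s"
```

The first child should take about as long as its sleep, the second about as
long as the 1 s delay before the signal. A parent that does kill followed by
wait on a child of the first kind blocks for the remainder of that sleep.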

Kind regards,
Petr

> Kind regards,
> Petr

> >  _tst_do_exit()
> >  {
> > @@ -439,9 +440,9 @@ _tst_kill_test()
> >  {
> >  	local i=10

> > -	trap '' INT
> > -	tst_res TBROK "Test timeouted, sending SIGINT! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1"
> > -	kill -INT -$pid
> > +	trap '' TERM
> > +	tst_res TBROK "Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1"
> > +	kill -TERM -$pid
> >  	tst_sleep 100ms

> >  	while kill -0 $pid >/dev/null 2>&1 && [ $i -gt 0 ]; do

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp

Thread overview: 18+ messages
2021-05-19  8:58 [LTP] [PATCH v2] tst_test: using SIGTERM to terminate process Li Wang
2021-05-19  9:21 ` Joerg Vehlow
2021-05-27  4:11   ` Li Wang
2021-05-31  9:25 ` Li Wang
2021-09-15 22:40 ` Petr Vorel
2021-09-15 22:40   ` Petr Vorel
2021-09-15 23:01   ` Petr Vorel [this message]
2021-09-15 23:01     ` Petr Vorel
2021-09-17  8:50     ` Cyril Hrubis
2021-09-17  8:50       ` Cyril Hrubis
2021-09-17  9:17       ` Petr Vorel
2021-09-17  9:17         ` Petr Vorel
2021-09-17 10:18         ` Petr Vorel
2021-09-17 10:18           ` Petr Vorel
2021-09-17 10:59           ` Cyril Hrubis
2021-09-17 10:59             ` Cyril Hrubis
2021-09-17 11:03             ` Petr Vorel
2021-09-17 11:03               ` Petr Vorel
