From: Petr Vorel <pvorel@suse.cz>
To: Li Wang <liwang@redhat.com>, Martin Doucha <martin.doucha@suse.com>
Cc: LTP List <ltp@lists.linux.it>
Subject: [LTP] 72b1728674 causing regressions [ [PATCH v2] Terminate leftover subprocesses when main test process crashes]
Date: Fri, 18 Feb 2022 13:30:21 +0100
Message-ID: <Yg+RXbUTOxK56iZa@pevik>
In-Reply-To: <CAEemH2fqy3_t=-dbqE9Bx3VH6sZbNvM_bMon4zMukOh+rmw42Q@mail.gmail.com>
Hi all,
> On Fri, Feb 11, 2022 at 9:30 PM Martin Doucha <mdoucha@suse.cz> wrote:
> > On 11. 02. 22 13:55, Cyril Hrubis wrote:
> > > Hi!
> > >> --- a/lib/tst_test.c
> > >> +++ b/lib/tst_test.c
> > >> @@ -1495,6 +1495,9 @@ static int fork_testrun(void)
> > >>  		return TFAIL;
> > >>  	}
> > >> +	if (tst_test->forks_child)
> > >> +		kill(-test_pid, SIGKILL);

FYI, this broke all LTP network tests which use the netstress binary;
they now randomly fail after "tst_test.c:1499: TINFO: Killed the leftover descendant processes".

I wondered whether this is actually a kernel bug which has only now become
visible, but the behavior is the same on various kernels (SLES 5.14,
openSUSE 5.16.8, an older Debian 5.3) and on different VM setups. The firewall
is disabled, and random failures would not point to a firewall issue anyway.

I'm not sure now whether netstress.c should be altered or whether we should add
a flag to the API to skip this cleanup.
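
To illustrate the semantics of the hunk quoted above: kill() with a negative
PID signals every process that is still in that process group, so a worker
which the main test process deliberately leaves running in the background is
killed as well. Below is a minimal shell sketch of that effect, not the LTP
code itself. demo_group_kill, the FIFO and the 5 second worker are
hypothetical, and it assumes setsid(1) (util-linux or busybox) is available:

```shell
#!/bin/sh
# Hypothetical demo, not LTP code. $1: path for a temporary FIFO.
demo_group_kill()
{
	fifo="$1"
	mkfifo "$fifo" || return 1

	# "Main test process": leads a new process group, starts a
	# background worker and exits with 0 right away.
	setsid sh -c 'echo $$; (sleep 5; echo survived) & exit 0' > "$fifo" &
	job=$!

	# Keep the read end open so that EOF can be observed later.
	exec 3< "$fifo"
	read -r pgid <&3
	[ -n "$pgid" ] || return 1

	wait "$job"
	echo "main process exited with $?"

	# Rough shell equivalent of kill(-test_pid, SIGKILL)
	kill -s KILL -- "-$pgid" && echo "killed process group"

	# EOF arrives once the worker no longer holds the FIFO open.
	rest=$(cat <&3)
	exec 3<&-
	rm -f "$fifo"

	[ -z "$rest" ] && echo "worker did not survive"
}
```

The worker would print "survived" into the FIFO if it were left alone; after
the group kill the FIFO is expected to reach EOF without it.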

DEBUGGING:
The reason is hidden because the netstress output is redirected to a log file
and printed only on error.
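
For readers not familiar with that pattern, here is a rough sketch of it.
run_logged is a hypothetical helper written for illustration only; it is not
the actual tst_netload() code:

```shell
#!/bin/sh
# Hypothetical helper, not part of tst_net.sh.
# $1: log file, remaining arguments: command to run.
# All output goes to the log; the log is printed only on failure.
run_logged()
{
	log="$1"
	shift

	"$@" > "$log" 2>&1
	ret=$?

	if [ $ret -ne 0 ]; then
		cat "$log"
	fi

	return $ret
}
```

With this shape, anything a successful run logs (such as the "Killed the
leftover descendant processes" line) never reaches the terminal.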
Sometimes it's just a warning:
# ./tcp_ipsec.sh -s 100:1000:65535:R65535
...
tcp_ipsec 1 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.Qn3NINBzja'
tcp_ipsec 1 TINFO: run client 'netstress -l -H 10.0.0.1 -n 100 -N 100 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 1 TWARN: netstress failed, ret: 2
tcp_ipsec 1 TPASS: netstress passed, median time 4 ms, data: 4 5 4 4
tcp_ipsec 2 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.Qn3NINBzja'
tcp_ipsec 2 TINFO: run client 'netstress -l -H 10.0.0.1 -n 1000 -N 1000 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 2 TPASS: netstress passed, median time 6 ms, data: 6 6 4 5 6
tcp_ipsec 3 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.Qn3NINBzja'
tcp_ipsec 3 TINFO: run client 'netstress -l -H 10.0.0.1 -n 65535 -N 65535 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 3 TPASS: netstress passed, median time 9 ms, data: 11 10 9 9 9
tcp_ipsec 4 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.Qn3NINBzja'
tcp_ipsec 4 TINFO: run client 'netstress -l -H 10.0.0.1 -A 65535 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 4 TPASS: netstress passed, median time 8 ms, data: 8 8 8 9 7
tcp_ipsec 5 TINFO: AppArmor enabled, this may affect test results
tcp_ipsec 5 TINFO: it can be disabled with TST_DISABLE_APPARMOR=1 (requires super/root)
tcp_ipsec 5 TINFO: loaded AppArmor profiles: none
# ./tcp_ipsec.sh -s 100:1000:65535:R65535
...
tcp_ipsec 1 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.4I7mEMaCeK'
tcp_ipsec 1 TINFO: run client 'netstress -l -H 10.0.0.1 -n 100 -N 100 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 1 TPASS: netstress passed, median time 6 ms, data: 5 5 6 6 6
tcp_ipsec 2 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.4I7mEMaCeK'
tcp_ipsec 2 TINFO: run client 'netstress -l -H 10.0.0.1 -n 1000 -N 1000 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 2 TWARN: netstress failed, ret: 2
tcp_ipsec 2 TPASS: netstress passed, median time 5 ms, data: 4 6 5 5
tcp_ipsec 3 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.4I7mEMaCeK'
tcp_ipsec 3 TINFO: run client 'netstress -l -H 10.0.0.1 -n 65535 -N 65535 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 3 TPASS: netstress passed, median time 10 ms, data: 10 10 8 9 10
tcp_ipsec 4 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.4I7mEMaCeK'
tcp_ipsec 4 TINFO: run client 'netstress -l -H 10.0.0.1 -A 65535 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 4 TPASS: netstress passed, median time 11 ms, data: 12 11 11 11 11
tcp_ipsec 5 TINFO: AppArmor enabled, this may affect test results
tcp_ipsec 5 TINFO: it can be disabled with TST_DISABLE_APPARMOR=1 (requires super/root)
tcp_ipsec 5 TINFO: loaded AppArmor profiles: none
Sometimes it's a hard failure, where we at least see the log:
tcp_ipsec 1 TPASS: netstress passed, median time 5 ms, data: 4 7 4 8 5
tcp_ipsec 2 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.rEORDqdaS6'
tcp_ipsec 2 TINFO: run client 'netstress -l -H 10.0.0.1 -n 1000 -N 1000 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 2 TPASS: netstress passed, median time 6 ms, data: 4 6 6 4 6
tcp_ipsec 3 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.rEORDqdaS6'
tcp_ipsec 3 TINFO: run client 'netstress -l -H 10.0.0.1 -n 65535 -N 65535 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 3 TWARN: netstress failed, ret: 2
netstress.c:642: TBROK: Server closed
tst_test.c:1457: TINFO: Timeout per run is 0h 05m 00s
netstress.c:895: TINFO: connection: addr '10.0.0.1', port '33985'
netstress.c:896: TINFO: client max req: 100
netstress.c:897: TINFO: clients num: 2
netstress.c:902: TINFO: client msg size: 65535
netstress.c:903: TINFO: server msg size: 65535
netstress.c:817: TINFO: tcp_tw_reuse is already set
netstress.c:947: TINFO: TCP client is using old TCP API.
netstress.c:789: TINFO: '/proc/sys/net/ipv4/tcp_fastopen' is 1
netstress.c:476: TINFO: Running the test over IPv4
netstress.c:344: TBROK: connect(4, 10.0.0.1:33985, 16) failed: ECONNREFUSED (111)
netstress.c:344: TBROK: connect(3, 10.0.0.1:33985, 16) failed: ECONNREFUSED (111)
But with the patch below, the log shows that the server process is killed:
tcp_ipsec 1 TPASS: netstress passed, median time 5 ms, data: 6 5 5 4 5
tcp_ipsec 2 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.DId6DBCQ2W'
tcp_ipsec 2 TINFO: run client 'netstress -l -H 10.0.0.1 -n 1000 -N 1000 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times
tcp_ipsec 2 TINFO: ===== 1: remote netstress, ret: 0, cat tst_netload.log =====
tst_test.c:1457: TINFO: Timeout per run is 0h 05m 00s
netstress.c:923: TINFO: max requests '10'
netstress.c:947: TINFO: TCP server is using old TCP API.
netstress.c:789: TINFO: '/proc/sys/net/ipv4/tcp_fastopen' is 1
netstress.c:678: TINFO: assigning a name to the server socket...
netstress.c:685: TINFO: bind to port 36103
netstress.c:706: TINFO: Listen on the socket '5'
tst_test.c:1499: TINFO: Killed the leftover descendant processes
=> HERE the netstress server process is killed after TPASS
Summary:
passed 0
failed 0
broken 0
skipped 0
warnings 0
---
tcp_ipsec 2 TWARN: netstress failed, ret: 2
=> causing the TWARN for the client.
And the hard failure:
tcp_ipsec 4 TINFO: ===== 5: remote netstress, ret: 0, cat tst_netload.log =====
tst_test.c:1457: TINFO: Timeout per run is 0h 05m 00s
netstress.c:923: TINFO: max requests '10'
netstress.c:947: TINFO: TCP server is using old TCP API.
netstress.c:789: TINFO: '/proc/sys/net/ipv4/tcp_fastopen' is 1
netstress.c:678: TINFO: assigning a name to the server socket...
netstress.c:685: TINFO: bind to port 36709
netstress.c:706: TINFO: Listen on the socket '5'
tst_test.c:1499: TINFO: Killed the leftover descendant processes
Summary:
passed 0
failed 0
broken 0
skipped 0
warnings 0
---
tcp_ipsec 4 TWARN: netstress failed, ret: 2
netstress.c:642: TBROK: Server closed
tst_test.c:1457: TINFO: Timeout per run is 0h 05m 00s
netstress.c:874: TINFO: rand start seed 0xff9e
netstress.c:895: TINFO: connection: addr '10.0.0.1', port '36709'
netstress.c:896: TINFO: client max req: 100
netstress.c:897: TINFO: clients num: 2
netstress.c:900: TINFO: random msg size [5 65530]
netstress.c:817: TINFO: tcp_tw_reuse is already set
netstress.c:947: TINFO: TCP client is using old TCP API.
netstress.c:789: TINFO: '/proc/sys/net/ipv4/tcp_fastopen' is 1
netstress.c:476: TINFO: Running the test over IPv4
netstress.c:344: TBROK: connect(4, 10.0.0.1:36709, 16) failed: ECONNREFUSED (111)
netstress.c:344: TBROK: connect(3, 10.0.0.1:36709, 16) failed: ECONNREFUSED (111)
Summary:
passed 0
failed 0
broken 2
skipped 0
warnings 0
tcp_ipsec 4 TFAIL: expected 'pass' but ret: '2'
Kind regards,
Petr
+++ testcases/lib/tst_net.sh
@@ -728,6 +728,10 @@ tst_netload()
 	for i in $(seq 1 $run_cnt); do
 		tst_rhost_run -c "netstress $s_opts" > tst_netload.log 2>&1
+		tst_res_ TINFO "===== $i: remote netstress, ret: $ret, cat tst_netload.log ====="
+		cat tst_netload.log
+		printf -- "---\n\n"
+
 		if [ $? -ne 0 ]; then
 			cat tst_netload.log
 			local ttype="TFAIL"
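
A side note on the debug hunk above: once commands are inserted between
tst_rhost_run and the `[ $? -ne 0 ]` check, `$?` holds the status of the last
inserted command (printf here), not that of tst_rhost_run. That is fine for a
throwaway debug patch, but if such output were kept, the status would have to
be saved first. A small sketch, with start_server and run_with_debug as
hypothetical stand-ins:

```shell
#!/bin/sh
# start_server is a hypothetical stand-in for
# 'tst_rhost_run -c "netstress $s_opts"'; it logs one line and fails.
start_server()
{
	echo "server log line"
	return 3
}

# $1: path of the log file
run_with_debug()
{
	start_server > "$1" 2>&1
	ret=$?		# save it now, the commands below overwrite $?

	echo "===== ret: $ret, cat $1 ====="
	cat "$1"
	printf -- "---\n\n"

	if [ $ret -ne 0 ]; then
		echo "server failed, ret: $ret"
	fi

	return $ret
}
```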
--
Mailing list info: https://lists.linux.it/listinfo/ltp