* [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
@ 2025-01-16 2:01 Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Jakub Kicinski @ 2025-01-16 2:01 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
shuah, linux-kselftest, willemdebruijn.kernel
Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
cmsg_time.sh test") widened the accepted value range 8x but we still
see flakes (at a rate of around 7%).
Return XFAIL for the most timing sensitive test on slow machines.
Before:
# ./cmsg_time.sh
Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
FAIL - 1/36 cases failed
After:
# ./cmsg_time.sh
Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
OK
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: linux-kselftest@vger.kernel.org
CC: willemdebruijn.kernel@gmail.com
---
tools/testing/selftests/net/cmsg_time.sh | 35 +++++++++++++++++-------
1 file changed, 25 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/net/cmsg_time.sh b/tools/testing/selftests/net/cmsg_time.sh
index 1d7e756644bc..478af0aefa97 100755
--- a/tools/testing/selftests/net/cmsg_time.sh
+++ b/tools/testing/selftests/net/cmsg_time.sh
@@ -34,13 +34,28 @@ BAD=0
TOTAL=0
check_result() {
+ local ret=$1
+ local got=$2
+ local exp=$3
+ local case=$4
+ local xfail=$5
+ local xf=
+ local inc=
+
+ if [ "$xfail" == "xfail" ]; then
+ xf="(XFAIL)"
+ inc=0
+ else
+ inc=1
+ fi
+
((TOTAL++))
- if [ $1 -ne 0 ]; then
- echo " Case $4 returned $1, expected 0"
- ((BAD++))
+ if [ $ret -ne 0 ]; then
+ echo " Case $case returned $ret, expected 0 $xf"
+ ((BAD+=inc))
elif [ "$2" != "$3" ]; then
- echo " Case $4 returned '$2', expected '$3'"
- ((BAD++))
+ echo " Case $case returned '$got', expected '$exp' $xf"
+ ((BAD+=inc))
fi
}
@@ -66,14 +81,14 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
awk '/SND/ { if ($3 > 1000) print "OK"; }')
check_result $? "$ts" "OK" "$prot - TXTIME abs"
- [ "$KSFT_MACHINE_SLOW" = yes ] && delay=8000 || delay=1000
+ [ "$KSFT_MACHINE_SLOW" = yes ] && xfail=xfail
- ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d $delay |
+ ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d 1000 |
awk '/SND/ {snd=$3}
/SCHED/ {sch=$3}
- END { if (snd - sch > '$((delay/2))') print "OK";
- else print snd, "-", sch, "<", '$((delay/2))'; }')
- check_result $? "$ts" "OK" "$prot - TXTIME rel"
+ END { if (snd - sch > 500) print "OK";
+ else print snd, "-", sch, "<", 500; }')
+ check_result $? "$ts" "OK" "$prot - TXTIME rel" $xfail
done
done
--
2.48.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
@ 2025-01-16 11:52 ` Przemek Kitszel
2025-01-16 13:00 ` Willem de Bruijn
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Przemek Kitszel @ 2025-01-16 11:52 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, edumazet, davem, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
On 1/16/25 03:01, Jakub Kicinski wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
you have undid the 8x change by this commit (without mentioning that)
[fine for me]
>
> Return XFAIL for the most timing sensitive test on slow machines.
code change looks fine for me, and does exactly that, so:
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
@ 2025-01-16 13:00 ` Willem de Bruijn
2025-01-17 12:49 ` Petr Machata
2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2025-01-16 13:00 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
shuah, linux-kselftest, willemdebruijn.kernel
Jakub Kicinski wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
2025-01-16 13:00 ` Willem de Bruijn
@ 2025-01-17 12:49 ` Petr Machata
2025-01-17 14:53 ` Jakub Kicinski
2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 1 reply; 7+ messages in thread
From: Petr Machata @ 2025-01-17 12:49 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
Jakub Kicinski <kuba@kernel.org> writes:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: linux-kselftest@vger.kernel.org
> CC: willemdebruijn.kernel@gmail.com
> ---
> tools/testing/selftests/net/cmsg_time.sh | 35 +++++++++++++++++-------
> 1 file changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/tools/testing/selftests/net/cmsg_time.sh b/tools/testing/selftests/net/cmsg_time.sh
> index 1d7e756644bc..478af0aefa97 100755
> --- a/tools/testing/selftests/net/cmsg_time.sh
> +++ b/tools/testing/selftests/net/cmsg_time.sh
> @@ -34,13 +34,28 @@ BAD=0
> TOTAL=0
>
> check_result() {
> + local ret=$1
> + local got=$2
> + local exp=$3
> + local case=$4
> + local xfail=$5
> + local xf=
> + local inc=
> +
> + if [ "$xfail" == "xfail" ]; then
> + xf="(XFAIL)"
> + inc=0
> + else
> + inc=1
> + fi
> +
> ((TOTAL++))
> - if [ $1 -ne 0 ]; then
> - echo " Case $4 returned $1, expected 0"
> - ((BAD++))
> + if [ $ret -ne 0 ]; then
> + echo " Case $case returned $ret, expected 0 $xf"
> + ((BAD+=inc))
> elif [ "$2" != "$3" ]; then
> - echo " Case $4 returned '$2', expected '$3'"
> - ((BAD++))
> + echo " Case $case returned '$got', expected '$exp' $xf"
> + ((BAD+=inc))
> fi
> }
>
> @@ -66,14 +81,14 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
> awk '/SND/ { if ($3 > 1000) print "OK"; }')
> check_result $? "$ts" "OK" "$prot - TXTIME abs"
>
> - [ "$KSFT_MACHINE_SLOW" = yes ] && delay=8000 || delay=1000
> + [ "$KSFT_MACHINE_SLOW" = yes ] && xfail=xfail
>
> - ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d $delay |
> + ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d 1000 |
> awk '/SND/ {snd=$3}
> /SCHED/ {sch=$3}
> - END { if (snd - sch > '$((delay/2))') print "OK";
> - else print snd, "-", sch, "<", '$((delay/2))'; }')
> - check_result $? "$ts" "OK" "$prot - TXTIME rel"
> + END { if (snd - sch > 500) print "OK";
> + else print snd, "-", sch, "<", 500; }')
> + check_result $? "$ts" "OK" "$prot - TXTIME rel" $xfail
> done
> done
This logging and xfail handling duplicates lib.sh. Would a patch like
below be OK with you? The gist of it is to just use check_err, log_test
and xfail_on_slow from lib.sh to achieve what the test open-codes.
I can send it as a follow-up instead if you prefer.
--- a/tools/testing/selftests/net/cmsg_time.sh
+++ b/tools/testing/selftests/net/cmsg_time.sh
@@ -29,19 +29,21 @@ ip -netns $NS addr add $IP6 dev dummy0
# Need FQ for TXTIME
ip netns exec $NS tc qdisc replace dev dummy0 root fq
-# Test
-BAD=0
-TOTAL=0
-
check_result() {
- ((TOTAL++))
- if [ $1 -ne 0 ]; then
- echo " Case $4 returned $1, expected 0"
- ((BAD++))
- elif [ "$2" != "$3" ]; then
- echo " Case $4 returned '$2', expected '$3'"
- ((BAD++))
- fi
+ local ret=$1
+ local got=$2
+ local exp=$3
+ local case=$4
+
+ RET=0
+
+ [ $1 -eq 0 ]
+ check_err $? "Case $4 returned $1, expected 0"
+
+ [ "$2" == "$3" ]
+ check_err $? "Case $4 returned '$2', expected '$3'"
+
+ log_test "$4"
}
for i in "-4 $TGT4" "-6 $TGT6"; do
@@ -73,15 +79,8 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
/SCHED/ {sch=$3}
END { if (snd - sch > '$((delay/2))') print "OK";
else print snd, "-", sch, "<", '$((delay/2))'; }')
- check_result $? "$ts" "OK" "$prot - TXTIME rel"
+ xfail_on_slow check_result $? "$ts" "OK" "$prot - TXTIME rel"
done
done
-# Summary
-if [ $BAD -ne 0 ]; then
- echo "FAIL - $BAD/$TOTAL cases failed"
- exit 1
-else
- echo "OK"
- exit 0
-fi
+exit $EXIT_STATUS
This is much more verbose, but that's how tests tend to be. I could
change it to only log on RET != 0, but like this the custom results
block can go away and the test is overall more median.
bash-5.2# KSFT_MACHINE_SLOW=yes ./cmsg_time.sh
TEST: UDPv4 - no options [ OK ]
TEST: UDPv4 - ts cnt [ OK ]
TEST: UDPv4 - ts0 SCHED [ OK ]
TEST: UDPv4 - ts0 SND [ OK ]
TEST: UDPv4 - TXTIME abs [ OK ]
TEST: UDPv4 - TXTIME rel [XFAIL]
Case UDPv4 - TXTIME rel returned '34us - 30us < 4000', expected 'OK'
TEST: ICMPv4 - no options [ OK ]
TEST: ICMPv4 - ts cnt [ OK ]
[... snip ...]
bash-5.2# echo $?
0
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-17 12:49 ` Petr Machata
@ 2025-01-17 14:53 ` Jakub Kicinski
2025-01-17 15:16 ` Petr Machata
0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2025-01-17 14:53 UTC (permalink / raw)
To: Petr Machata
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
On Fri, 17 Jan 2025 13:49:23 +0100 Petr Machata wrote:
> This logging and xfail handling duplicates lib.sh. Would a patch like
> below be OK with you? The gist of it is to just use check_err, log_test
> and xfail_on_slow from lib.sh to achieve what the test open-codes.
Hm, maybe? I'm not going to say no if you send a patch, but IMHO
if we were to change the output I think we should make it ktap.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-17 14:53 ` Jakub Kicinski
@ 2025-01-17 15:16 ` Petr Machata
0 siblings, 0 replies; 7+ messages in thread
From: Petr Machata @ 2025-01-17 15:16 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Petr Machata, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, shuah, linux-kselftest, willemdebruijn.kernel
Jakub Kicinski <kuba@kernel.org> writes:
> On Fri, 17 Jan 2025 13:49:23 +0100 Petr Machata wrote:
>> This logging and xfail handling duplicates lib.sh. Would a patch like
>> below be OK with you? The gist of it is to just use check_err, log_test
>> and xfail_on_slow from lib.sh to achieve what the test open-codes.
>
> Hm, maybe? I'm not going to say no if you send a patch, but IMHO
> if we were to change the output I think we should make it ktap.
Hmm, at some point, we'll want to convert lib.sh to ktap as well.
Hopefully transparently so that all the existing selftests out there
just magically become ktap-compatible.
If you want to preserve the output 1:1 then I feel like that's bending
over backwards too much. We could still maybe reuse xfail_on_slow and
ask for FAIL_TO_XFAIL, but that's marked internal in lib.sh. So I
retract my proposal. Too faffy at that point.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
` (2 preceding siblings ...)
2025-01-17 12:49 ` Petr Machata
@ 2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-18 3:50 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
Hello:
This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 15 Jan 2025 18:01:05 -0800 you wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> [...]
Here is the summary with links:
- [net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
https://git.kernel.org/netdev/net-next/c/54ea680b759c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-01-18 3:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
2025-01-16 13:00 ` Willem de Bruijn
2025-01-17 12:49 ` Petr Machata
2025-01-17 14:53 ` Jakub Kicinski
2025-01-17 15:16 ` Petr Machata
2025-01-18 3:50 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).