* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
@ 2025-01-16 11:52 ` Przemek Kitszel
2025-01-16 13:00 ` Willem de Bruijn
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Przemek Kitszel @ 2025-01-16 11:52 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, edumazet, davem, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
On 1/16/25 03:01, Jakub Kicinski wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
you have undid the 8x change by this commit (without mentioning that)
[fine for me]
>
> Return XFAIL for the most timing sensitive test on slow machines.
code change looks fine for me, and does exactly that, so:
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
@ 2025-01-16 13:00 ` Willem de Bruijn
2025-01-17 12:49 ` Petr Machata
2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2025-01-16 13:00 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
shuah, linux-kselftest, willemdebruijn.kernel
Jakub Kicinski wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
2025-01-16 11:52 ` Przemek Kitszel
2025-01-16 13:00 ` Willem de Bruijn
@ 2025-01-17 12:49 ` Petr Machata
2025-01-17 14:53 ` Jakub Kicinski
2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 1 reply; 7+ messages in thread
From: Petr Machata @ 2025-01-17 12:49 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
Jakub Kicinski <kuba@kernel.org> writes:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
> FAIL - 1/36 cases failed
>
> After:
>
> # ./cmsg_time.sh
> Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
> Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
> OK
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: shuah@kernel.org
> CC: linux-kselftest@vger.kernel.org
> CC: willemdebruijn.kernel@gmail.com
> ---
> tools/testing/selftests/net/cmsg_time.sh | 35 +++++++++++++++++-------
> 1 file changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/tools/testing/selftests/net/cmsg_time.sh b/tools/testing/selftests/net/cmsg_time.sh
> index 1d7e756644bc..478af0aefa97 100755
> --- a/tools/testing/selftests/net/cmsg_time.sh
> +++ b/tools/testing/selftests/net/cmsg_time.sh
> @@ -34,13 +34,28 @@ BAD=0
> TOTAL=0
>
> check_result() {
> + local ret=$1
> + local got=$2
> + local exp=$3
> + local case=$4
> + local xfail=$5
> + local xf=
> + local inc=
> +
> + if [ "$xfail" == "xfail" ]; then
> + xf="(XFAIL)"
> + inc=0
> + else
> + inc=1
> + fi
> +
> ((TOTAL++))
> - if [ $1 -ne 0 ]; then
> - echo " Case $4 returned $1, expected 0"
> - ((BAD++))
> + if [ $ret -ne 0 ]; then
> + echo " Case $case returned $ret, expected 0 $xf"
> + ((BAD+=inc))
> elif [ "$2" != "$3" ]; then
> - echo " Case $4 returned '$2', expected '$3'"
> - ((BAD++))
> + echo " Case $case returned '$got', expected '$exp' $xf"
> + ((BAD+=inc))
> fi
> }
>
> @@ -66,14 +81,14 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
> awk '/SND/ { if ($3 > 1000) print "OK"; }')
> check_result $? "$ts" "OK" "$prot - TXTIME abs"
>
> - [ "$KSFT_MACHINE_SLOW" = yes ] && delay=8000 || delay=1000
> + [ "$KSFT_MACHINE_SLOW" = yes ] && xfail=xfail
>
> - ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d $delay |
> + ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d 1000 |
> awk '/SND/ {snd=$3}
> /SCHED/ {sch=$3}
> - END { if (snd - sch > '$((delay/2))') print "OK";
> - else print snd, "-", sch, "<", '$((delay/2))'; }')
> - check_result $? "$ts" "OK" "$prot - TXTIME rel"
> + END { if (snd - sch > 500) print "OK";
> + else print snd, "-", sch, "<", 500; }')
> + check_result $? "$ts" "OK" "$prot - TXTIME rel" $xfail
> done
> done
This logging and xfail handling duplicates lib.sh. Would a patch like
below be OK with you? The gist of it is to just use check_err, log_test
and xfail_on_slow from lib.sh to achieve what the test open-codes.
I can send it as a follow-up instead if you prefer.
--- a/tools/testing/selftests/net/cmsg_time.sh
+++ b/tools/testing/selftests/net/cmsg_time.sh
@@ -29,19 +29,21 @@ ip -netns $NS addr add $IP6 dev dummy0
# Need FQ for TXTIME
ip netns exec $NS tc qdisc replace dev dummy0 root fq
-# Test
-BAD=0
-TOTAL=0
-
check_result() {
- ((TOTAL++))
- if [ $1 -ne 0 ]; then
- echo " Case $4 returned $1, expected 0"
- ((BAD++))
- elif [ "$2" != "$3" ]; then
- echo " Case $4 returned '$2', expected '$3'"
- ((BAD++))
- fi
+ local ret=$1
+ local got=$2
+ local exp=$3
+ local case=$4
+
+ RET=0
+
+ [ $1 -eq 0 ]
+ check_err $? "Case $4 returned $1, expected 0"
+
+ [ "$2" == "$3" ]
+ check_err $? "Case $4 returned '$2', expected '$3'"
+
+ log_test "$4"
}
for i in "-4 $TGT4" "-6 $TGT6"; do
@@ -73,15 +79,8 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
/SCHED/ {sch=$3}
END { if (snd - sch > '$((delay/2))') print "OK";
else print snd, "-", sch, "<", '$((delay/2))'; }')
- check_result $? "$ts" "OK" "$prot - TXTIME rel"
+ xfail_on_slow check_result $? "$ts" "OK" "$prot - TXTIME rel"
done
done
-# Summary
-if [ $BAD -ne 0 ]; then
- echo "FAIL - $BAD/$TOTAL cases failed"
- exit 1
-else
- echo "OK"
- exit 0
-fi
+exit $EXIT_STATUS
This is much more verbose, but that's how tests tend to be. I could
change it to only log on RET != 0, but like this the custom results
block can go away and the test is overall more median.
bash-5.2# KSFT_MACHINE_SLOW=yes ./cmsg_time.sh
TEST: UDPv4 - no options [ OK ]
TEST: UDPv4 - ts cnt [ OK ]
TEST: UDPv4 - ts0 SCHED [ OK ]
TEST: UDPv4 - ts0 SND [ OK ]
TEST: UDPv4 - TXTIME abs [ OK ]
TEST: UDPv4 - TXTIME rel [XFAIL]
Case UDPv4 - TXTIME rel returned '34us - 30us < 4000', expected 'OK'
TEST: ICMPv4 - no options [ OK ]
TEST: ICMPv4 - ts cnt [ OK ]
[... snip ...]
bash-5.2# echo $?
0
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
2025-01-16 2:01 [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines Jakub Kicinski
` (2 preceding siblings ...)
2025-01-17 12:49 ` Petr Machata
@ 2025-01-18 3:50 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-18 3:50 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
linux-kselftest, willemdebruijn.kernel
Hello:
This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 15 Jan 2025 18:01:05 -0800 you wrote:
> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
> [...]
Here is the summary with links:
- [net-next] selftests: net: give up on the cmsg_time accuracy on slow machines
https://git.kernel.org/netdev/net-next/c/54ea680b759c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread