netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>,
	Hangbin Liu <liuhangbin@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [TEST] forwarding/router_bridge_lag.sh started to flake on Monday
Date: Fri, 23 Aug 2024 08:02:53 -0700
Message-ID: <20240823080253.1c11c028@kernel.org>
In-Reply-To: <87a5h3l9q1.fsf@nvidia.com>

On Fri, 23 Aug 2024 13:28:11 +0200 Petr Machata wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> 
> > Looks like forwarding/router_bridge_lag.sh has gotten a lot more flaky
> > this week. It flaked very occasionally (and in a different way) before:
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-forwarding&test=router-bridge-lag-sh&ld_cnt=250
> >
> > There doesn't seem to be any obvious commit that could have caused this.  
> 
> Hmm:
>     # 3.37 [+0.11] Error: Device is up. Set it down before adding it as a team port.
> 
> How are the tests isolated, are they each run in their own vng, or are
> instances shared? Could it be that the test that runs before this one
> neglects to take a port down?

Yes, each one has its own VM, but the VM is reused for multiple tests
serially. The "info" file shows which VM was used (thr-id identifies
the worker, vm-id identifies the VM within the worker; the worker will
restart the VM if it detects a crash).
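
Since a VM is reused serially, admin-up ports left behind by one test are
visible to the next. A minimal sketch of a pre-test check for that, not
something the CI does today: ports_left_up is a hypothetical helper that
reads the sysfs flags file of each netdev and reports those with IFF_UP
(0x1) set. The optional root argument exists only so the function can be
pointed at a fake tree.

```shell
# Hypothetical helper: print the name of every netdev that is
# administratively up. $1 defaults to /sys/class/net.
ports_left_up()
{
	local root=${1:-/sys/class/net}
	local dev flags

	for dev in "$root"/*; do
		# Skip entries without a readable flags file.
		[ -r "$dev/flags" ] || continue
		flags=$(cat "$dev/flags")
		# flags is a hex string such as 0x1003; bit 0 is IFF_UP.
		if [ $(( flags & 0x1 )) -ne 0 ]; then
			echo "${dev##*/}"
		fi
	done
}
```

On a real system this also lists lo and any management interface, so the
output would need filtering down to the ports under test.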

> In one failure case (I don't see further back or my browser would
> apparently catch fire) the predecessor was no_forwarding.sh, and indeed
> it looks like it raises the ports, but I don't see where it sets them
> back down.
> 
> Then router-bridge-lag's cleanup downs the ports, and on rerun it
> succeeds. The issue would be probabilistic, because no_forwarding does
> not always run before this test, and some tests do not care that the
> ports are up. If that's the root cause, this should fix it:
> 
> From 0baf91dc24b95ae0cadfdf5db05b74888e6a228a Mon Sep 17 00:00:00 2001
> Message-ID: <0baf91dc24b95ae0cadfdf5db05b74888e6a228a.1724413545.git.petrm@nvidia.com>
> From: Petr Machata <petrm@nvidia.com>
> Date: Fri, 23 Aug 2024 14:42:48 +0300
> Subject: [PATCH net-next mlxsw] selftests: forwarding: no_forwarding: Down
>  ports on cleanup
> To: <nbu-linux-internal@nvidia.com>
> 
> This test neglects to put ports down on cleanup. Fix it.
> 
> Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
> Signed-off-by: Petr Machata <petrm@nvidia.com>
> ---
>  tools/testing/selftests/net/forwarding/no_forwarding.sh | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/testing/selftests/net/forwarding/no_forwarding.sh b/tools/testing/selftests/net/forwarding/no_forwarding.sh
> index af3b398d13f0..9e677aa64a06 100755
> --- a/tools/testing/selftests/net/forwarding/no_forwarding.sh
> +++ b/tools/testing/selftests/net/forwarding/no_forwarding.sh
> @@ -233,6 +233,9 @@ cleanup()
>  {
>  	pre_cleanup
>  
> +	ip link set dev $swp2 down
> +	ip link set dev $swp1 down
> +
>  	h2_destroy
>  	h1_destroy
>  

no_forwarding always runs in thread 0 because it's the slowest test,
and we start from the slowest tests as a basic bin packing heuristic.
Clicking through the failures, I don't see them on thread 0.

But putting the ports down seems like a good cleanup regardless.
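
For readers unfamiliar with the slowest-first heuristic mentioned above, a
rough sketch follows. It is an illustration, not the CI's actual code:
assign_slowest_first is a hypothetical name, the input format ("test
runtime" per line) is an assumption, and so is the greedy "least loaded
thread" rule. It does show why the slowest test always ends up on thread 0.

```shell
# Hypothetical sketch: read "name runtime_seconds" lines on stdin and
# print "thread name" lines. Tests are sorted slowest first, then each
# one goes to the thread with the smallest accumulated runtime.
assign_slowest_first()
{
	local nthreads=$1

	sort -s -k2,2nr | awk -v n="$nthreads" '
	{
		best = 0
		for (i = 1; i < n; i++)
			if (load[i] + 0 < load[best] + 0)
				best = i
		load[best] += $2
		printf "%d %s\n", best, $1
	}'
}
```

With all threads empty at the start, ties resolve to the lowest index, so
the first (slowest) entry is always assigned to thread 0.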

Thread overview: 6+ messages
2024-08-22 15:37 [TEST] forwarding/router_bridge_lag.sh started to flake on Monday Jakub Kicinski
2024-08-23 11:28 ` Petr Machata
2024-08-23 15:02   ` Jakub Kicinski [this message]
2024-08-23 16:13     ` Petr Machata
2024-08-24 21:27       ` Jakub Kicinski
2024-08-25  9:01         ` Petr Machata