From: Petr Machata <petrm@nvidia.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Petr Machata <petrm@nvidia.com>,
Nikolay Aleksandrov <razor@blackwall.org>,
Hangbin Liu <liuhangbin@gmail.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [TEST] forwarding/router_bridge_lag.sh started to flake on Monday
Date: Fri, 23 Aug 2024 18:13:01 +0200 [thread overview]
Message-ID: <87ttfbi5ce.fsf@nvidia.com> (raw)
In-Reply-To: <20240823080253.1c11c028@kernel.org>
Jakub Kicinski <kuba@kernel.org> writes:
> On Fri, 23 Aug 2024 13:28:11 +0200 Petr Machata wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>>
>> > Looks like forwarding/router_bridge_lag.sh has gotten a lot more flaky
>> > this week. It flaked very occasionally (and in a different way) before:
>> >
>> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-forwarding&test=router-bridge-lag-sh&ld_cnt=250
>> >
>> > There doesn't seem to be any obvious commit that could have caused this.
>>
>> Hmm:
>> # 3.37 [+0.11] Error: Device is up. Set it down before adding it as a team port.
>>
>> How are the tests isolated, are they each run in their own vng, or are
>> instances shared? Could it be that the test that runs befor this one
>> neglects to take a port down?
>
> Yes, each one has its own VM, but the VM is reused for multiple tests
> serially. The "info" file shows which VM was use (thr-id identifies
> the worker, vm-id identifies VM within the worker, worker will restart
> the VM if it detects a crash).
OK, so my guess would be that whatever ran before the test forgot to put
the port down.
>> In one failure case (I don't see further back or my browser would
>> apparently catch fire) the predecessor was no_forwarding.sh, and indeed
>> it looks like it raises the ports, but I don't see where it sets them
>> back down.
>>
>> Then router-bridge-lag's cleanup downs the ports, and on rerun it
>> succeeds. The issue would be probabilistic, because no_forwarding does
>> not always run before this test, and some tests do not care that the
>> ports are up. If that's the root cause, this should fix it:
>>
>> From 0baf91dc24b95ae0cadfdf5db05b74888e6a228a Mon Sep 17 00:00:00 2001
>> Message-ID: <0baf91dc24b95ae0cadfdf5db05b74888e6a228a.1724413545.git.petrm@nvidia.com>
>> From: Petr Machata <petrm@nvidia.com>
>> Date: Fri, 23 Aug 2024 14:42:48 +0300
>> Subject: [PATCH net-next mlxsw] selftests: forwarding: no_forwarding: Down
>> ports on cleanup
>> To: <nbu-linux-internal@nvidia.com>
>>
>> This test neglects to put ports down on cleanup. Fix it.
>>
>> Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
>> Signed-off-by: Petr Machata <petrm@nvidia.com>
>> ---
>> tools/testing/selftests/net/forwarding/no_forwarding.sh | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/tools/testing/selftests/net/forwarding/no_forwarding.sh b/tools/testing/selftests/net/forwarding/no_forwarding.sh
>> index af3b398d13f0..9e677aa64a06 100755
>> --- a/tools/testing/selftests/net/forwarding/no_forwarding.sh
>> +++ b/tools/testing/selftests/net/forwarding/no_forwarding.sh
>> @@ -233,6 +233,9 @@ cleanup()
>> {
>> pre_cleanup
>>
>> + ip link set dev $swp2 down
>> + ip link set dev $swp1 down
>> +
>> h2_destroy
>> h1_destroy
>>
>
> no_forwarding always runs in thread 0 because it's the slowest tests
> and we try to run from the slowest as a basic bin packing heuristic.
> Clicking thru the failures I don't see them on thread 0.
Is there a way to see what ran before?
> But putting the ports down seems like a good cleanup regardless.
I'll send it as a proper patch.
next prev parent reply other threads:[~2024-08-23 16:15 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-22 15:37 [TEST] forwarding/router_bridge_lag.sh started to flake on Monday Jakub Kicinski
2024-08-23 11:28 ` Petr Machata
2024-08-23 15:02 ` Jakub Kicinski
2024-08-23 16:13 ` Petr Machata [this message]
2024-08-24 21:27 ` Jakub Kicinski
2024-08-25 9:01 ` Petr Machata
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttfbi5ce.fsf@nvidia.com \
--to=petrm@nvidia.com \
--cc=kuba@kernel.org \
--cc=liuhangbin@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=razor@blackwall.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.