* [TEST] Weird RSS state on ice
@ 2026-06-24 15:30 Jakub Kicinski
2026-06-25 7:11 ` [Intel-wired-lan] " Loktionov, Aleksandr
0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2026-06-24 15:30 UTC
To: Adrian Pielech, Przemyslaw Kitszel
Cc: netdev@vger.kernel.org, intel-wired-lan
Hi!
I noticed in the netdev CI that the ice runner fails to run the
toeplitz tests because of the RSS config.
https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-23--00-00/ice-E810-CQ2/toeplitz.py/stdout
I added some extra debug on the branch:
net.lib.ynl.pyynl.lib.ynl.NlError: Netlink error: hash field config is not symmetric 16 304: Invalid argument {'bad-attr': '.input-xfrm'}
16, 304 means GTP flow, GTP_TEID field. So we are trying to disable
symmetric RSS, but the field configuration contains TEID. The problem
is that this is an illegal configuration in the first place. We are
_disabling_ symmetric RSS, but the kernel tries to make sure that both
the before and after states are correct (because the configuration
involves multiple calls to the driver and may fail halfway through).
If the current config is illegal, net/ethtool/ won't even let us
restore it to a sane state.
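To make the "16 304" pair easier to read, here is a minimal Python sketch that decodes the field mask and applies a symmetry rule. It is an illustration under assumptions, not the kernel code: the bit values are the RXH_* flags as I understand them from include/uapi/linux/ethtool.h, and the rule (only paired IP src/dst and L4 port fields are allowed) is modeled on my reading of the check in net/ethtool/. The helper names decode_fields() and is_symmetric() are hypothetical.

```python
# Assumed RXH_* bit values (check include/uapi/linux/ethtool.h on your tree).
RXH_FIELDS = {
    "L2DA": 1 << 1,
    "VLAN": 1 << 2,
    "L3_PROTO": 1 << 3,
    "IP_SRC": 1 << 4,
    "IP_DST": 1 << 5,
    "L4_B_0_1": 1 << 6,   # first two bytes of L4 header (source port)
    "L4_B_2_3": 1 << 7,   # next two bytes of L4 header (destination port)
    "GTP_TEID": 1 << 8,
}


def decode_fields(mask):
    """Return the names of the hash fields set in mask (hypothetical helper)."""
    return [name for name, bit in RXH_FIELDS.items() if mask & bit]


def is_symmetric(mask):
    """Sketch of a symmetry rule: only src/dst pairs, and each pair complete."""
    f = RXH_FIELDS
    allowed = f["IP_SRC"] | f["IP_DST"] | f["L4_B_0_1"] | f["L4_B_2_3"]
    if mask & ~allowed:
        # Any field without a src/dst counterpart (e.g. GTP_TEID) breaks symmetry.
        return False
    if bool(mask & f["IP_SRC"]) != bool(mask & f["IP_DST"]):
        return False
    if bool(mask & f["L4_B_0_1"]) != bool(mask & f["L4_B_2_3"]):
        return False
    return True


if __name__ == "__main__":
    mask = 304  # the value reported in the netlink error above
    print(decode_fields(mask), is_symmetric(mask))
```

Under these assumptions 304 decodes to IP_SRC, IP_DST and GTP_TEID, so the TEID bit alone is enough for the config to be rejected as non-symmetric.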
So the question is how we got into this state. It does not happen
on netdev machines. And on Intel machines it happens randomly around
30% of the time.
I tried to look through the driver code and I don't see how we could
end up with such a config.
Could y'all have a look and figure out / fix this? This has been
happening for a while, but I was waiting until the merge window
to poke at it.
* RE: [Intel-wired-lan] [TEST] Weird RSS state on ice
From: Loktionov, Aleksandr @ 2026-06-25 7:11 UTC
To: Jakub Kicinski, Pielech, Adrian, Kitszel, Przemyslaw
Cc: netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Kicinski
> Sent: Wednesday, June 24, 2026 5:30 PM
> To: Pielech, Adrian <adrian.pielech@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>
> Cc: netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org
> Subject: [Intel-wired-lan] [TEST] Weird RSS state on ice
>
> Hi!
>
> I noticed in the netdev CI that the ice runner fails to run the
> toeplitz tests because of the RSS config.
>
> https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-23--00-00/ice-E810-CQ2/toeplitz.py/stdout
>
> I added some extra debug on the branch:
>
> net.lib.ynl.pyynl.lib.ynl.NlError: Netlink error: hash field config is not symmetric 16 304: Invalid argument {'bad-attr': '.input-xfrm'}
>
> 16, 304 means GTP flow, GTP_TEID field. So we are trying to disable
> symmetric RSS, but the field configuration contains TEID. The problem
> is that this is an illegal configuration in the first place. We are
> _disabling_ symmetric RSS, but the kernel tries to make sure that both
> the before and after states are correct (because the configuration
> involves multiple calls to the driver and may fail halfway through).
> If the current config is illegal, net/ethtool/ won't even let us
> restore it to a sane state.
>
> So the question is how we got into this state. It does not happen on
> netdev machines. And on Intel machines it happens randomly around 30%
> of the time.
>
> I tried to look through the driver code and I don't see how we could
> end up with such a config.
>
> Could y'all have a look and figure out / fix this? This has been
> happening for a while, but I was waiting until the merge window to
> poke at it.
Good day, Jakub
The patchset didn't help?
[PATCH iwl-next v5 2/2] ice: implement symmetric RSS hash configuration
With the best regards
Alex
* Re: [Intel-wired-lan] [TEST] Weird RSS state on ice
From: Jakub Kicinski @ 2026-06-26 2:06 UTC
To: Loktionov, Aleksandr
Cc: Pielech, Adrian, Kitszel, Przemyslaw, netdev@vger.kernel.org,
intel-wired-lan@lists.osuosl.org
On Thu, 25 Jun 2026 07:11:14 +0000 Loktionov, Aleksandr wrote:
> The patchset didn't help?
>
> [PATCH iwl-next v5 2/2] ice: implement symmetric RSS hash configuration
Not sure, it's not in tree, and lore doesn't want to point me at it
either. What I don't get is how we get into the bad state in the first
place.
Looking at other tests today I spotted that rss flow label test is also
behaving oddly. Most of the time the first case fails and the second
passes:
test "rss-flow-label-py"
group "selftests-drivers-net-hw"
result "fail"
link "https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-26--00-00/ice-E810-XXV4/rss_flow_label.py/stdout"
results
0
test "rss-flow-label-test-rss-flow-label"
result "fail"
1
test "rss-flow-label-test-rss-flow-label-6only"
result "pass"
But every now and then they skip:
ok 1 rss_flow_label.test_rss_flow_label # SKIP Device doesn't support Flow Label for UDP6
ok 2 rss_flow_label.test_rss_flow_label_6only # SKIP Device doesn't support Flow Label for UDP6
test "rss-flow-label-py"
group "selftests-drivers-net-hw"
result "skip"
link "https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-25--16-00/ice-E810-XXV4/rss_flow_label.py/stdout"
results
0
test "rss-flow-label-test-rss-flow-label"
result "skip"
1
test "rss-flow-label-test-rss-flow-label-6only"
result "skip"
The devlink info is identical so it must be that the device
is in an unclean state sometimes? Do y'all power cycle these
machines between runs?
* RE: [Intel-wired-lan] [TEST] Weird RSS state on ice
From: Loktionov, Aleksandr @ 2026-06-26 5:51 UTC
To: Jakub Kicinski, Pielech, Adrian
Cc: Kitszel, Przemyslaw, netdev@vger.kernel.org,
intel-wired-lan@lists.osuosl.org
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, June 26, 2026 4:06 AM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: Pielech, Adrian <adrian.pielech@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; netdev@vger.kernel.org;
> intel-wired-lan@lists.osuosl.org
> Subject: Re: [Intel-wired-lan] [TEST] Weird RSS state on ice
>
> On Thu, 25 Jun 2026 07:11:14 +0000 Loktionov, Aleksandr wrote:
> > The patchset didn't help?
> >
> > [PATCH iwl-next v5 2/2] ice: implement symmetric RSS hash
> > configuration
>
> Not sure, it's not in tree, and lore doesn't want to point me at it
> either. What I don't get is how we get into the bad state in the first
> place.
>
> Looking at other tests today I spotted that rss flow label test is
> also behaving oddly. Most of the time the first case fails and the
> second passes:
>
> test "rss-flow-label-py"
> group "selftests-drivers-net-hw"
> result "fail"
> link "https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-26--00-00/ice-E810-XXV4/rss_flow_label.py/stdout"
> results
> 0
> test "rss-flow-label-test-rss-flow-label"
> result "fail"
> 1
> test "rss-flow-label-test-rss-flow-label-6only"
> result "pass"
>
>
> But every now and then they skip:
>
> ok 1 rss_flow_label.test_rss_flow_label # SKIP Device doesn't support Flow Label for UDP6
> ok 2 rss_flow_label.test_rss_flow_label_6only # SKIP Device doesn't support Flow Label for UDP6
>
> test "rss-flow-label-py"
> group "selftests-drivers-net-hw"
> result "skip"
> link "https://netdev-ci-results.intel.com/ice-results/net-next-hw-2026-06-25--16-00/ice-E810-XXV4/rss_flow_label.py/stdout"
> results
> 0
> test "rss-flow-label-test-rss-flow-label"
> result "skip"
> 1
> test "rss-flow-label-test-rss-flow-label-6only"
> result "skip"
>
>
> The devlink info is identical so it must be that the device is in an
> unclean state sometimes? Do y'all power cycle these machines between
> runs?
Good day, Jakub
I heard from Adrian Pielech that we experienced infrastructure issues, but reboots helped. Please ask him about the CI infrastructure.
As for my v5 symmetric RSS fix from March 16, which worked for me: I've just resent it today, please bless it.
With the best regards
Alex