* [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
@ 2026-01-31 22:54 Jakub Kicinski
2026-01-31 22:54 ` [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count Jakub Kicinski
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-01-31 22:54 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
Willem de Bruijn, shuah, linux-kselftest
Add a test which checks that the RSS table is at least 4x the max
queue count supported by the device. The original RSS spec from
Microsoft stated that the RSS indirection table should be 2 to 8
times the CPU count, presumably assuming queue per CPU. If the
CPU count is not a power of two, however, a power-of-2 table
2x larger than queue count results in a 33% traffic imbalance.
Validate that the indirection table is at least 4x the queue
count. This lowers the imbalance to 16% which empirically
appears to be more acceptable to memcache-like workloads.
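The 33% and 16% figures above can be reproduced with a short sketch (illustrative only, not part of the patch). It assumes the table size is rounded up to a power of two and measures the relative gap between the busiest and least-busy queue:

```python
def imbalance(queues, factor):
    """Relative gap between the busiest and least-busy queue for a
    power-of-2 indirection table of at least factor * queues entries."""
    size = 1
    while size < factor * queues:  # round up to a power of two
        size *= 2
    per_queue_max = -(-size // queues)  # ceil(size / queues) entries
    per_queue_min = size // queues      # floor(size / queues) entries
    return (per_queue_max - per_queue_min) / per_queue_max

# 3 queues: a 2x table (8 entries) splits 3/3/2, a 4x table (16) splits 6/5/5
print(round(imbalance(3, 2), 2))  # 0.33 -> the 33% figure
print(round(imbalance(3, 4), 2))  # 0.17 -> roughly the 16% figure
```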
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
- no changes (see patch 2)
v1: https://lore.kernel.org/20260130192912.826454-1-kuba@kernel.org
CC: shuah@kernel.org
CC: linux-kselftest@vger.kernel.org
---
.../testing/selftests/drivers/net/hw/Makefile | 1 +
.../selftests/drivers/net/hw/rss_drv.py | 88 +++++++++++++++++++
2 files changed, 89 insertions(+)
create mode 100755 tools/testing/selftests/drivers/net/hw/rss_drv.py
diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index 9c163ba6feee..a64140333a46 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -35,6 +35,7 @@ TEST_PROGS = \
pp_alloc_fail.py \
rss_api.py \
rss_ctx.py \
+ rss_drv.py \
rss_flow_label.py \
rss_input_xfrm.py \
toeplitz.py \
diff --git a/tools/testing/selftests/drivers/net/hw/rss_drv.py b/tools/testing/selftests/drivers/net/hw/rss_drv.py
new file mode 100755
index 000000000000..2d1a33189076
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/rss_drv.py
@@ -0,0 +1,88 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Driver-related behavior tests for RSS.
+"""
+
+from lib.py import ksft_run, ksft_exit, ksft_ge
+from lib.py import ksft_variants, KsftNamedVariant, KsftSkipEx
+from lib.py import defer, ethtool
+from lib.py import EthtoolFamily, NlError
+from lib.py import NetDrvEnv
+
+
+def _is_power_of_two(n):
+ return n > 0 and (n & (n - 1)) == 0
+
+
+def _get_rss(cfg, context=0):
+ return ethtool(f"-x {cfg.ifname} context {context}", json=True)[0]
+
+
+def _test_rss_indir_size(cfg, qcnt, context=0):
+ """Test that indirection table size is at least 4x queue count."""
+ ethtool(f"-L {cfg.ifname} combined {qcnt}")
+
+ rss = _get_rss(cfg, context=context)
+ indir = rss['rss-indirection-table']
+ ksft_ge(len(indir), 4 * qcnt, "Table smaller than 4x")
+ return len(indir)
+
+
+def _maybe_create_context(cfg, create_context):
+ """ Either create a context and return its ID or return 0 for main ctx """
+ if not create_context:
+ return 0
+ try:
+ ctx = cfg.ethnl.rss_create_act({'header': {'dev-index': cfg.ifindex}})
+ ctx_id = ctx['context']
+ defer(cfg.ethnl.rss_delete_act,
+ {'header': {'dev-index': cfg.ifindex}, 'context': ctx_id})
+ except NlError:
+ raise KsftSkipEx("Device does not support additional RSS contexts")
+
+ return ctx_id
+
+
+@ksft_variants([
+ KsftNamedVariant("main", False),
+ KsftNamedVariant("ctx", True),
+])
+def indir_size_4x(cfg, create_context):
+ """
+ Test that the indirection table has at least 4 entries per queue.
+ Empirically network-heavy workloads like memcache suffer with the 33%
+ imbalance of a 2x indirection table size.
+ 4x table translates to a 16% imbalance.
+ """
+ channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+ ch_max = channels.get('combined-max', 0)
+ qcnt = channels['combined-count']
+
+ if ch_max < 3:
+ raise KsftSkipEx(f"Not enough queues for the test: max={ch_max}")
+
+ defer(ethtool, f"-L {cfg.ifname} combined {qcnt}")
+ ethtool(f"-L {cfg.ifname} combined 3")
+
+ ctx_id = _maybe_create_context(cfg, create_context)
+
+ indir_sz = _test_rss_indir_size(cfg, 3, context=ctx_id)
+
+ # Test with max queue count (max - 1 if max is a power of two)
+ test_max = ch_max - 1 if _is_power_of_two(ch_max) else ch_max
+ if test_max > 3 and indir_sz < test_max * 4:
+ _test_rss_indir_size(cfg, test_max, context=ctx_id)
+
+
+def main() -> None:
+ """ Ksft boiler plate main """
+ with NetDrvEnv(__file__) as cfg:
+ cfg.ethnl = EthtoolFamily()
+ ksft_run([indir_size_4x], args=(cfg, ))
+ ksft_exit()
+
+
+if __name__ == "__main__":
+ main()
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count
2026-01-31 22:54 [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size Jakub Kicinski
@ 2026-01-31 22:54 ` Jakub Kicinski
2026-02-01 7:56 ` Eric Dumazet
2026-02-03 1:10 ` [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size patchwork-bot+netdevbpf
2026-02-11 20:10 ` Yael Chemla
2 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2026-01-31 22:54 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski
Spell out the recommendation that the RSS table should be
4x the queue count to avoid traffic imbalance. Include minor
rephrasing and removal of the explicit 128 entry example
since a 128 entry table is inadequate on modern machines.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
- new patch
CC: edumazet@google.com
---
Documentation/networking/scaling.rst | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 99b6a61e5e31..0023afa530ec 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -38,11 +38,15 @@ that is not the focus of these techniques.
The filter used in RSS is typically a hash function over the network
and/or transport layer headers-- for example, a 4-tuple hash over
IP addresses and TCP ports of a packet. The most common hardware
-implementation of RSS uses a 128-entry indirection table where each entry
+implementation of RSS uses an indirection table where each entry
stores a queue number. The receive queue for a packet is determined
-by masking out the low order seven bits of the computed hash for the
-packet (usually a Toeplitz hash), taking this number as a key into the
-indirection table and reading the corresponding value.
+by indexing the indirection table with the low order bits of the
+computed hash for the packet (usually a Toeplitz hash).
+
+The indirection table helps even out the traffic distribution when queue
+count is not a power of two. NICs should provide an indirection table
+at least 4 times larger than the queue count. A 4x table results in ~16%
+imbalance between the queues, which is acceptable for most applications.
Some NICs support symmetric RSS hashing where, if the IP (source address,
destination address) and TCP/UDP (source port, destination port) tuples
--
2.52.0
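The table-indexing step described in the updated text can be illustrated with a small sketch (a hypothetical helper, assuming a power-of-2 table so the low-order bits of the hash can be masked directly):

```python
def rx_queue_for(pkt_hash, indir_table):
    """Pick the receive queue: mask off the low-order bits of the
    computed (e.g. Toeplitz) hash and use them to index the table."""
    assert len(indir_table) & (len(indir_table) - 1) == 0  # power-of-2 size
    return indir_table[pkt_hash & (len(indir_table) - 1)]

# An 8-entry table spreading traffic over 3 queues (the uneven 3/3/2 split)
table = [0, 1, 2, 0, 1, 2, 0, 1]
print(rx_queue_for(0x1234abcd, table))  # low 3 bits are 5 -> queue 2
```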
* Re: [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count
2026-01-31 22:54 ` [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count Jakub Kicinski
@ 2026-02-01 7:56 ` Eric Dumazet
0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2026-02-01 7:56 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: davem, netdev, pabeni, andrew+netdev, horms
On Sat, Jan 31, 2026 at 11:54 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Spell out the recommendation that the RSS table should be
> 4x the queue count to avoid traffic imbalance. Include minor
> rephrasing and removal of the explicit 128 entry example
> since a 128 entry table is inadequate on modern machines.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thanks Jakub.
Reviewed-by: Eric Dumazet <edumazet@google.com>
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-01-31 22:54 [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size Jakub Kicinski
2026-01-31 22:54 ` [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count Jakub Kicinski
@ 2026-02-03 1:10 ` patchwork-bot+netdevbpf
2026-02-11 20:10 ` Yael Chemla
2 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-02-03 1:10 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, willemb,
shuah, linux-kselftest
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Sat, 31 Jan 2026 14:54:53 -0800 you wrote:
> Add a test which checks that the RSS table is at least 4x the max
> queue count supported by the device. The original RSS spec from
> Microsoft stated that the RSS indirection table should be 2 to 8
> times the CPU count, presumably assuming queue per CPU. If the
> CPU count is not a power of two, however, a power-of-2 table
> 2x larger than queue count results in a 33% traffic imbalance.
> Validate that the indirection table is at least 4x the queue
> count. This lowers the imbalance to 16% which empirically
> appears to be more acceptable to memcache-like workloads.
>
> [...]
Here is the summary with links:
- [net-next,v2,1/2] selftests: drv-net: rss: validate min RSS table size
https://git.kernel.org/netdev/net-next/c/9e3d4dae9832
- [net-next,v2,2/2] docs: networking: mention that RSS table should be 4x the queue count
https://git.kernel.org/netdev/net-next/c/ba9c5611f088
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-01-31 22:54 [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size Jakub Kicinski
2026-01-31 22:54 ` [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count Jakub Kicinski
2026-02-03 1:10 ` [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size patchwork-bot+netdevbpf
@ 2026-02-11 20:10 ` Yael Chemla
2026-02-11 21:43 ` Jakub Kicinski
2 siblings, 1 reply; 12+ messages in thread
From: Yael Chemla @ 2026-02-11 20:10 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Willem de Bruijn,
shuah, linux-kselftest, Tariq Toukan, Gal Pressman, noren
On 01/02/2026 0:54, Jakub Kicinski wrote:
> Add a test which checks that the RSS table is at least 4x the max
> queue count supported by the device. The original RSS spec from
> Microsoft stated that the RSS indirection table should be 2 to 8
> times the CPU count, presumably assuming queue per CPU. If the
> CPU count is not a power of two, however, a power-of-2 table
> 2x larger than queue count results in a 33% traffic imbalance.
> Validate that the indirection table is at least 4x the queue
> count. This lowers the imbalance to 16% which empirically
> appears to be more acceptable to memcache-like workloads.
>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> v2:
> - no changes (see patch 2)
> v1: https://lore.kernel.org/20260130192912.826454-1-kuba@kernel.org
>
> CC: shuah@kernel.org
> CC: linux-kselftest@vger.kernel.org
> ---
> .../testing/selftests/drivers/net/hw/Makefile | 1 +
> .../selftests/drivers/net/hw/rss_drv.py | 88 +++++++++++++++++++
> 2 files changed, 89 insertions(+)
> create mode 100755 tools/testing/selftests/drivers/net/hw/rss_drv.py
>
> diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> index 9c163ba6feee..a64140333a46 100644
> --- a/tools/testing/selftests/drivers/net/hw/Makefile
> +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> @@ -35,6 +35,7 @@ TEST_PROGS = \
> pp_alloc_fail.py \
> rss_api.py \
> rss_ctx.py \
> + rss_drv.py \
> rss_flow_label.py \
> rss_input_xfrm.py \
> toeplitz.py \
> diff --git a/tools/testing/selftests/drivers/net/hw/rss_drv.py b/tools/testing/selftests/drivers/net/hw/rss_drv.py
> new file mode 100755
> index 000000000000..2d1a33189076
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/hw/rss_drv.py
Hi Jakub,
Thanks for the test addition. I wanted to raise a concern regarding the
spread factor requirement that may apply to mlx5 and potentially other
drivers as well.
The real issue arises when the hardware's maximum RQT (indirection
table) size isn't large enough to accommodate both the desired number of
channels and a spread factor of 4. RX queues/channels serve multiple
purposes beyond RSS - they're also used for XDP, AF_XDP, and direct
queue steering via ntuple filters or TC.
Artificially limiting the number of channels based solely on RSS spread
requirements would be overly restrictive for these non-RSS use cases.
In such scenarios, we'd rather have a slightly degraded spread factor
(< 4) than limit channel availability.
We'd appreciate any feedback on this approach.
Thanks,
Yael
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-11 20:10 ` Yael Chemla
@ 2026-02-11 21:43 ` Jakub Kicinski
2026-02-12 9:41 ` Tariq Toukan
0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2026-02-11 21:43 UTC (permalink / raw)
To: Yael Chemla
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On Wed, 11 Feb 2026 22:10:56 +0200 Yael Chemla wrote:
> Thanks for the test addition. I wanted to raise a concern regarding the
> spread factor requirement that may apply to mlx5 and potentially other
> drivers as well.
> The real issue arises when the hardware's maximum RQT (indirection
> table) size isn't large enough to accommodate both the desired number of
> channels and a spread factor of 4. RX queues/channels serve multiple
> purposes beyond RSS - they're also used for XDP, AF_XDP, and direct
> queue steering via ntuple filters or TC.
> Artificially limiting the number of channels based solely on RSS spread
> requirements would be overly restrictive for these non-RSS use cases.
> In such scenarios, we'd rather have a slightly degraded spread factor
> (< 4) than limit channel availability.
> We'd appreciate any feedback on this approach.
That's fine. In fact IIRC ixgbe (infamously) had more queues than
it could fit in its RSS table. So none of this is new. At the same
time if user _does_ want to use a lot of queues in the main context
fewer than 4x entries in the indir table is inadequate.
The test is based on production experience, and provides valuable
guidance to device developers.
I'm not sure what you want me to say here.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-11 21:43 ` Jakub Kicinski
@ 2026-02-12 9:41 ` Tariq Toukan
2026-02-13 1:22 ` Jakub Kicinski
0 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2026-02-12 9:41 UTC (permalink / raw)
To: Jakub Kicinski, Yael Chemla
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On 11/02/2026 23:43, Jakub Kicinski wrote:
> On Wed, 11 Feb 2026 22:10:56 +0200 Yael Chemla wrote:
>> Thanks for the test addition. I wanted to raise a concern regarding the
>> spread factor requirement that may apply to mlx5 and potentially other
>> drivers as well.
>> The real issue arises when the hardware's maximum RQT (indirection
>> table) size isn't large enough to accommodate both the desired number of
>> channels and a spread factor of 4. RX queues/channels serve multiple
>> purposes beyond RSS - they're also used for XDP, AF_XDP, and direct
>> queue steering via ntuple filters or TC.
>> Artificially limiting the number of channels based solely on RSS spread
>> requirements would be overly restrictive for these non-RSS use cases.
>> In such scenarios, we'd rather have a slightly degraded spread factor
>> (< 4) than limit channel availability.
>> We'd appreciate any feedback on this approach.
>
> That's fine. In fact IIRC ixgbe (infamously) had more queues than
> it could fit in its RSS table. So none of this is new. At the same
> time if user _does_ want to use a lot of queues in the main context
> fewer than 4x entries in the indir table is inadequate.
>
> The test is based on production experience, and provides valuable
> guidance to device developers.
>
> I'm not sure what you want me to say here.
>
No doubt that larger factors help overcome imbalance issues, and it's
fine to recommend using 4x (or even larger) factors.
The point is, when this comes with a selftest, it's less of a
recommendation/guidance anymore, it becomes kind of a requirement, an
expected behavior. Otherwise the test fails.
This ignores multiple other considerations:
1. Existing behavior: In general, mlx5e today implies 2x factor, so it
would fail this new test.
2. Device resources: In large scale (high num of channels, or high num
of netdevs on the same chip, or both), it is not obvious that increasing
the indirection table size is still desirable, or even possible. To pass
the selftest, you'll have to limit the max number of channels.
3. ch_max should win: Related to point #2. Driver should not enforce
limitations on supported ch_max just to fulfill the recommendation and
pass the test. I prefer flexibility, give the admin the control. That
means, driver would use 4x factor (or larger) whenever possible, but
would not block configurations in which the 4x factor cannot be satisfied.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-12 9:41 ` Tariq Toukan
@ 2026-02-13 1:22 ` Jakub Kicinski
2026-02-16 8:28 ` Tariq Toukan
0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2026-02-13 1:22 UTC (permalink / raw)
To: Tariq Toukan
Cc: Yael Chemla, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On Thu, 12 Feb 2026 11:41:19 +0200 Tariq Toukan wrote:
> On 11/02/2026 23:43, Jakub Kicinski wrote:
> > On Wed, 11 Feb 2026 22:10:56 +0200 Yael Chemla wrote:
> >> Thanks for the test addition. I wanted to raise a concern regarding the
> >> spread factor requirement that may apply to mlx5 and potentially other
> >> drivers as well.
> >> The real issue arises when the hardware's maximum RQT (indirection
> >> table) size isn't large enough to accommodate both the desired number of
> >> channels and a spread factor of 4. RX queues/channels serve multiple
> >> purposes beyond RSS - they're also used for XDP, AF_XDP, and direct
> >> queue steering via ntuple filters or TC.
> >> Artificially limiting the number of channels based solely on RSS spread
> >> requirements would be overly restrictive for these non-RSS use cases.
> >> In such scenarios, we'd rather have a slightly degraded spread factor
> >> (< 4) than limit channel availability.
> >> We'd appreciate any feedback on this approach.
> >
> > That's fine. In fact IIRC ixgbe (infamously) had more queues than
> > it could fit in its RSS table. So none of this is new. At the same
> > time if user _does_ want to use a lot of queues in the main context
> > fewer than 4x entries in the indir table is inadequate.
> >
> > The test is based on production experience, and provides valuable
> > guidance to device developers.
> >
> > I'm not sure what you want me to say here.
>
> No doubt that larger factors help overcome imbalance issues, and it's
> fine to recommend using 4x (or even larger) factors.
>
> The point is, when this comes with a selftest, it's less of a
> recommendation/guidance anymore, it becomes kind of a requirement, an
> expected behavior. Otherwise the test fails.
>
> This ignores multiple other considerations:
>
> 1. Existing behavior: In general, mlx5e today implies 2x factor, so it
> would fail this new test.
>
> 2. Device resources: In large scale (high num of channels, or high num
> of netdevs on the same chip, or both), it is not obvious that increasing
> the indirection table size is still desirable, or even possible. To pass
> the selftest, you'll have to limit the max number of channels.
>
> 3. ch_max should win: Related to point #2. Driver should not enforce
> limitations on supported ch_max just to fulfill the recommendation and
> pass the test. I prefer flexibility, give the admin the control. That
> means, driver would use 4x factor (or larger) whenever possible, but
> would not block configurations in which the 4x factor cannot be satisfied.
Oh I see.. I wasn't aware the CX7 has a limitation of the indirection
table size. I wrote the test because of a similar limitation in a
different NIC, but that one has been fixed.. I have limited access to
CX7 NICs, the one I tested on maxed out at 63 queues so the test has
passed.
Is it not possible to create an indirection table larger than 256
entries? 256 is not a lot, AMD Venice (to pick one) will have up
to 256 CPU cores (not threads) in a single CPU package.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-13 1:22 ` Jakub Kicinski
@ 2026-02-16 8:28 ` Tariq Toukan
2026-02-17 21:57 ` Jakub Kicinski
0 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2026-02-16 8:28 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Yael Chemla, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On 13/02/2026 3:22, Jakub Kicinski wrote:
> On Thu, 12 Feb 2026 11:41:19 +0200 Tariq Toukan wrote:
>> On 11/02/2026 23:43, Jakub Kicinski wrote:
>>> On Wed, 11 Feb 2026 22:10:56 +0200 Yael Chemla wrote:
>>>> Thanks for the test addition. I wanted to raise a concern regarding the
>>>> spread factor requirement that may apply to mlx5 and potentially other
>>>> drivers as well.
>>>> The real issue arises when the hardware's maximum RQT (indirection
>>>> table) size isn't large enough to accommodate both the desired number of
>>>> channels and a spread factor of 4. RX queues/channels serve multiple
>>>> purposes beyond RSS - they're also used for XDP, AF_XDP, and direct
>>>> queue steering via ntuple filters or TC.
>>>> Artificially limiting the number of channels based solely on RSS spread
>>>> requirements would be overly restrictive for these non-RSS use cases.
>>>> In such scenarios, we'd rather have a slightly degraded spread factor
>>>> (< 4) than limit channel availability.
>>>> We'd appreciate any feedback on this approach.
>>>
>>> That's fine. In fact IIRC ixgbe (infamously) had more queues than
>>> it could fit in its RSS table. So none of this is new. At the same
>>> time if user _does_ want to use a lot of queues in the main context
>>> fewer than 4x entries in the indir table is inadequate.
>>>
>>> The test is based on production experience, and provides valuable
>>> guidance to device developers.
>>>
>>> I'm not sure what you want me to say here.
>>
>> No doubt that larger factors help overcome imbalance issues, and it's
>> fine to recommend using 4x (or even larger) factors.
>>
>> The point is, when this comes with a selftest, it's less of a
>> recommendation/guidance anymore, it becomes kind of a requirement, an
>> expected behavior. Otherwise the test fails.
>>
>> This ignores multiple other considerations:
>>
>> 1. Existing behavior: In general, mlx5e today implies 2x factor, so it
>> would fail this new test.
>>
>> 2. Device resources: In large scale (high num of channels, or high num
>> of netdevs on the same chip, or both), it is not obvious that increasing
>> the indirection table size is still desirable, or even possible. To pass
>> the selftest, you'll have to limit the max number of channels.
>>
>> 3. ch_max should win: Related to point #2. Driver should not enforce
>> limitations on supported ch_max just to fulfill the recommendation and
>> pass the test. I prefer flexibility, give the admin the control. That
>> means, driver would use 4x factor (or larger) whenever possible, but
>> would not block configurations in which the 4x factor cannot be satisfied.
>
> Oh I see.. I wasn't aware the CX7 has a limitation of the indirection
> table size.
There is a limitation, we read it from FW.
It's usually not small, much larger than 256.
But currently it can vary according to FW decisions in scale (resource
management).
> I wrote the test because of a similar limitation in a
> different NIC, but that one has been fixed.. I have limited access to
> CX7 NICs, the one I tested on maxed out at 63 queues so the test has
> passed.
>
> Is it not possible to create an indirection table larger than 256
> entries?
It is possible, depending on the exposed FW capability.
As of today, there are high-scale configurations (many VFs for example)
where the FW exposed cap is lowered.
> 256 is not a lot, AMD Venice (to pick one) will have up
> to 256 CPU cores (not threads) in a single CPU package.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-16 8:28 ` Tariq Toukan
@ 2026-02-17 21:57 ` Jakub Kicinski
2026-02-18 8:02 ` Tariq Toukan
0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2026-02-17 21:57 UTC (permalink / raw)
To: Tariq Toukan
Cc: Yael Chemla, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On Mon, 16 Feb 2026 10:28:52 +0200 Tariq Toukan wrote:
> >> This ignores multiple other considerations:
> >>
> >> 1. Existing behavior: In general, mlx5e today implies 2x factor, so it
> >> would fail this new test.
> >>
> >> 2. Device resources: In large scale (high num of channels, or high num
> >> of netdevs on the same chip, or both), it is not obvious that increasing
> >> the indirection table size is still desirable, or even possible. To pass
> >> the selftest, you'll have to limit the max number of channels.
> >>
> >> 3. ch_max should win: Related to point #2. Driver should not enforce
> >> limitations on supported ch_max just to fulfill the recommendation and
> >> pass the test. I prefer flexibility, give the admin the control. That
> >> means, driver would use 4x factor (or larger) whenever possible, but
> >> would not block configurations in which the 4x factor cannot be satisfied.
> >
> > Oh I see.. I wasn't aware the CX7 has a limitation of the indirection
> > table size.
>
> There is a limitation, we read it from FW.
> It's usually not small, much larger than 256.
>
> But currently it can vary according to FW decisions in scale (resource
> management).
>
> > I wrote the test because of a similar limitation in a
> > different NIC, but that one has been fixed.. I have limited access to
> > CX7 NICs, the one I tested on maxed out at 63 queues so the test has
> > passed.
> >
> > Is it not possible to create an indirection table larger than 256
> > entries?
>
> It is possible, depending on the exposed FW capability.
> As of today, there are high-scale configurations (many VFs for example)
> where the FW exposed cap is lowered.
Not entirely sure what you expect the outcome of this discussion to be.
The 2x indirection table has been proven inadequate for real production
use. I'm not talking about some theory or benchmarks, actual workloads
reported machines/NICs with such table as unusable (workload starts
choking way before reaching expected machine capacity).
That said I just checked out of curiosity and the OCP NIC spec also
states:
The minimum supported indirection table size MUST be 128. The minimum
SHOULD be at least 4 times the number of supported receive queues.
so I guess the 4x isn't exactly a new recommendation.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-17 21:57 ` Jakub Kicinski
@ 2026-02-18 8:02 ` Tariq Toukan
2026-02-18 15:45 ` Jakub Kicinski
0 siblings, 1 reply; 12+ messages in thread
From: Tariq Toukan @ 2026-02-18 8:02 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Yael Chemla, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On 17/02/2026 23:57, Jakub Kicinski wrote:
> On Mon, 16 Feb 2026 10:28:52 +0200 Tariq Toukan wrote:
>>>> This ignores multiple other considerations:
>>>>
>>>> 1. Existing behavior: In general, mlx5e today implies 2x factor, so it
>>>> would fail this new test.
>>>>
>>>> 2. Device resources: In large scale (high num of channels, or high num
>>>> of netdevs on the same chip, or both), it is not obvious that increasing
>>>> the indirection table size is still desirable, or even possible. To pass
>>>> the selftest, you'll have to limit the max number of channels.
>>>>
>>>> 3. ch_max should win: Related to point #2. Driver should not enforce
>>>> limitations on supported ch_max just to fulfill the recommendation and
>>>> pass the test. I prefer flexibility, give the admin the control. That
>>>> means, driver would use 4x factor (or larger) whenever possible, but
>>>> would not block configurations in which the 4x factor cannot be satisfied.
>>>
>>> Oh I see.. I wasn't aware the CX7 has a limitation of the indirection
>>> table size.
>>
>> There is a limitation, we read it from FW.
>> It's usually not small, much larger than 256.
>>
>> But currently it can vary according to FW decisions in scale (resource
>> management).
>>
>>> I wrote the test because of a similar limitation in a
>>> different NIC, but that one has been fixed.. I have limited access to
>>> CX7 NICs, the one I tested on maxed out at 63 queues so the test has
>>> passed.
>>>
>>> Is it not possible to create an indirection table larger than 256
>>> entries?
>>
>> It is possible, depending on the exposed FW capability.
>> As of today, there are high-scale configurations (many VFs for example)
>> where the FW exposed cap is lowered.
>
> Not entirely sure what you expect the outcome of this discussion to be.
>
> The 2x indirection table has been proven inadequate for real production
> use. I'm not talking about some theory or benchmarks, actual workloads
> reported machines/NICs with such a table as unusable (workload starts
> choking way before reaching expected machine capacity).
>
> That said I just checked out of curiosity and the OCP NIC spec also
> states:
>
> The minimum supported indirection table size MUST be 128.
That should be fine. Max table size cap is always >= 256.
> The minimum
> SHOULD be at least 4 times the number of supported receive queues.
>
> so I guess the 4x isn't exactly a new recommendation.
My position is as follows:
Provide the appropriate factor *if possible* (currently it is 2x, and we
will likely increase it to 4x).
However, do not limit the maximum nch if the factor cannot be satisfied
- even if that results in the selftest failing.
* Re: [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size
2026-02-18 8:02 ` Tariq Toukan
@ 2026-02-18 15:45 ` Jakub Kicinski
0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-02-18 15:45 UTC (permalink / raw)
To: Tariq Toukan
Cc: Yael Chemla, davem, netdev, edumazet, pabeni, andrew+netdev,
horms, Willem de Bruijn, shuah, linux-kselftest, Tariq Toukan,
Gal Pressman, noren
On Wed, 18 Feb 2026 10:02:01 +0200 Tariq Toukan wrote:
> > The minimum
> > SHOULD be at least 4 times the number of supported receive queues.
> >
> > so I guess the 4x isn't exactly a new recommendation.
>
> My position is as follows:
> Provide the appropriate factor *if possible* (currently it is 2x, and we
> will likely increase it to 4x).
> However, do not limit the maximum nch if the factor cannot be satisfied
> - even if that results in the selftest failing.
Yes, that's definitely fair. We can adjust the comment in the test
to make that clear. The user may want to use the queues in a different
RSS context / zero-copy so limiting queue count would be the wrong
direction. The test is only signaling the practicality of using all queues
in the main context.
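In other words, the predicate the selftest enforces boils down to something
like the sketch below (names hypothetical; the actual rss_drv.py reads the
values from the device, e.g. via ethtool, and may be structured differently):

```python
# Illustrative only: a device "passes" when its RSS indirection table is
# at least `factor` times the max queue count, 4x per the recommendation.

def rss_table_is_adequate(table_size, max_queues, factor=4):
    """True if the indirection table covers max_queues at the given factor."""
    return table_size >= factor * max_queues

# A 256-entry table covers up to 64 queues at the recommended 4x factor,
# but not 128 queues; per the discussion above, a driver may still
# legitimately expose more queues and simply fail this check.
print(rss_table_is_adequate(256, 64))   # True
print(rss_table_is_adequate(256, 128))  # False
```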
Thread overview: 12+ messages
2026-01-31 22:54 [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size Jakub Kicinski
2026-01-31 22:54 ` [PATCH net-next v2 2/2] docs: networking: mention that RSS table should be 4x the queue count Jakub Kicinski
2026-02-01 7:56 ` Eric Dumazet
2026-02-03 1:10 ` [PATCH net-next v2 1/2] selftests: drv-net: rss: validate min RSS table size patchwork-bot+netdevbpf
2026-02-11 20:10 ` Yael Chemla
2026-02-11 21:43 ` Jakub Kicinski
2026-02-12 9:41 ` Tariq Toukan
2026-02-13 1:22 ` Jakub Kicinski
2026-02-16 8:28 ` Tariq Toukan
2026-02-17 21:57 ` Jakub Kicinski
2026-02-18 8:02 ` Tariq Toukan
2026-02-18 15:45 ` Jakub Kicinski