netdev.vger.kernel.org archive mirror
* [Question] ixgbe:Mechanism of RSS
@ 2025-01-02  3:53 Haifeng Xu
  2025-01-02  8:13 ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Haifeng Xu @ 2025-01-02  3:53 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, "David S. Miller", Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: linux-kernel, netdev, intel-wired-lan

Hi masters,

	We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
	The RSS configuration is as follows:

	RX flow hash indirection table for eno5 with 63 RX ring(s):
	0:      0     1     2     3     4     5     6     7
	8:      8     9    10    11    12    13    14    15
	16:      0     1     2     3     4     5     6     7
	24:      8     9    10    11    12    13    14    15
	32:      0     1     2     3     4     5     6     7
	40:      8     9    10    11    12    13    14    15
	48:      0     1     2     3     4     5     6     7
	56:      8     9    10    11    12    13    14    15
	64:      0     1     2     3     4     5     6     7
	72:      8     9    10    11    12    13    14    15
	80:      0     1     2     3     4     5     6     7
	88:      8     9    10    11    12    13    14    15
	96:      0     1     2     3     4     5     6     7
	104:      8     9    10    11    12    13    14    15
	112:      0     1     2     3     4     5     6     7
	120:      8     9    10    11    12    13    14    15

	The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?

	In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02  3:53 [Question] ixgbe:Mechanism of RSS Haifeng Xu
@ 2025-01-02  8:13 ` Eric Dumazet
  2025-01-02  8:43   ` Haifeng Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2025-01-02  8:13 UTC (permalink / raw)
  To: Haifeng Xu
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan

On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>
> Hi masters,
>
>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
>         The RSS configuration is as follows:
>
>         RX flow hash indirection table for eno5 with 63 RX ring(s):
>         0:      0     1     2     3     4     5     6     7
>         8:      8     9    10    11    12    13    14    15
>         16:      0     1     2     3     4     5     6     7
>         24:      8     9    10    11    12    13    14    15
>         32:      0     1     2     3     4     5     6     7
>         40:      8     9    10    11    12    13    14    15
>         48:      0     1     2     3     4     5     6     7
>         56:      8     9    10    11    12    13    14    15
>         64:      0     1     2     3     4     5     6     7
>         72:      8     9    10    11    12    13    14    15
>         80:      0     1     2     3     4     5     6     7
>         88:      8     9    10    11    12    13    14    15
>         96:      0     1     2     3     4     5     6     7
>         104:      8     9    10    11    12    13    14    15
>         112:      0     1     2     3     4     5     6     7
>         120:      8     9    10    11    12    13    14    15
>
>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
>
>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?

It really depends on which cpus are assigned to each IRQ.

Look at /proc/irq/{IRQ_NUM}/smp_affinity
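
For example, to list the affinity of every queue IRQ for the device in
one go (a rough sketch, assuming the IRQs show up in /proc/interrupts
under the interface name):

    # print "IRQ <n> -> CPUs <list>" for each eno5 queue interrupt
    for irq in $(grep eno5 /proc/interrupts | awk -F: '{print $1}'); do
        echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
    done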

Also, you can find some details in Documentation/networking/scaling.rst


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02  8:13 ` Eric Dumazet
@ 2025-01-02  8:43   ` Haifeng Xu
  2025-01-02 10:34     ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Haifeng Xu @ 2025-01-02  8:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan



On 2025/1/2 16:13, Eric Dumazet wrote:
> On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>
>> Hi masters,
>>
>>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
>>         The RSS configuration is as follows:
>>
>>         RX flow hash indirection table for eno5 with 63 RX ring(s):
>>         0:      0     1     2     3     4     5     6     7
>>         8:      8     9    10    11    12    13    14    15
>>         16:      0     1     2     3     4     5     6     7
>>         24:      8     9    10    11    12    13    14    15
>>         32:      0     1     2     3     4     5     6     7
>>         40:      8     9    10    11    12    13    14    15
>>         48:      0     1     2     3     4     5     6     7
>>         56:      8     9    10    11    12    13    14    15
>>         64:      0     1     2     3     4     5     6     7
>>         72:      8     9    10    11    12    13    14    15
>>         80:      0     1     2     3     4     5     6     7
>>         88:      8     9    10    11    12    13    14    15
>>         96:      0     1     2     3     4     5     6     7
>>         104:      8     9    10    11    12    13    14    15
>>         112:      0     1     2     3     4     5     6     7
>>         120:      8     9    10    11    12    13    14    15
>>
>>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
>>
>>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?
> 
> It really depends on which cpus are assigned to each IRQ.
> 

Hi Eric,

Each irq was assigned to a single cpu, for example:

irq	cpu

117      0
118      1

......

179      62

All cpus trigger interrupts, not only cpus 0~15.
It seems the result is inconsistent with the RSS indirection table.


Thanks!

> Look at /proc/irq/{IRQ_NUM}/smp_affinity
> 
> Also, you can find some details in Documentation/networking/scaling.rst



* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02  8:43   ` Haifeng Xu
@ 2025-01-02 10:34     ` Eric Dumazet
  2025-01-02 11:23       ` Haifeng Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2025-01-02 10:34 UTC (permalink / raw)
  To: Haifeng Xu
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan

On Thu, Jan 2, 2025 at 9:43 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>
>
>
> On 2025/1/2 16:13, Eric Dumazet wrote:
> > On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
> >>
> >> Hi masters,
> >>
> >>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
> >>         The RSS configuration is as follows:
> >>
> >>         RX flow hash indirection table for eno5 with 63 RX ring(s):
> >>         0:      0     1     2     3     4     5     6     7
> >>         8:      8     9    10    11    12    13    14    15
> >>         16:      0     1     2     3     4     5     6     7
> >>         24:      8     9    10    11    12    13    14    15
> >>         32:      0     1     2     3     4     5     6     7
> >>         40:      8     9    10    11    12    13    14    15
> >>         48:      0     1     2     3     4     5     6     7
> >>         56:      8     9    10    11    12    13    14    15
> >>         64:      0     1     2     3     4     5     6     7
> >>         72:      8     9    10    11    12    13    14    15
> >>         80:      0     1     2     3     4     5     6     7
> >>         88:      8     9    10    11    12    13    14    15
> >>         96:      0     1     2     3     4     5     6     7
> >>         104:      8     9    10    11    12    13    14    15
> >>         112:      0     1     2     3     4     5     6     7
> >>         120:      8     9    10    11    12    13    14    15
> >>
> >>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
> >>
> >>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?
> >
> > It really depends on which cpus are assigned to each IRQ.
> >
>
> Hi Eric,
>
> Each irq was assigned to a single cpu, for example:
>
> irq     cpu
>
> 117      0
> 118      1
>
> ......
>
> 179      62
>
> All cpus trigger interrupts, not only cpus 0~15.
> It seems the result is inconsistent with the RSS indirection table.
>
>

I misread your report, I thought you had 16 receive queues.

Why don't you change "ethtool -L eno5 rx 16", instead of trying to
configure RSS manually?
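
For example (a sketch; some drivers expose these as "combined" channels
rather than separate rx channels, so check ethtool -l first):

    ethtool -l eno5               # show current and maximum channel counts
    ethtool -L eno5 combined 16   # equivalent form on combined-channel drivers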


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 10:34     ` Eric Dumazet
@ 2025-01-02 11:23       ` Haifeng Xu
  2025-01-02 11:46         ` Eric Dumazet
  2025-01-02 16:01         ` Edward Cree
  0 siblings, 2 replies; 15+ messages in thread
From: Haifeng Xu @ 2025-01-02 11:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan



On 2025/1/2 18:34, Eric Dumazet wrote:
> On Thu, Jan 2, 2025 at 9:43 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>
>>
>>
>> On 2025/1/2 16:13, Eric Dumazet wrote:
>>> On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>>>
>>>> Hi masters,
>>>>
>>>>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
>>>>         The RSS configuration is as follows:
>>>>
>>>>         RX flow hash indirection table for eno5 with 63 RX ring(s):
>>>>         0:      0     1     2     3     4     5     6     7
>>>>         8:      8     9    10    11    12    13    14    15
>>>>         16:      0     1     2     3     4     5     6     7
>>>>         24:      8     9    10    11    12    13    14    15
>>>>         32:      0     1     2     3     4     5     6     7
>>>>         40:      8     9    10    11    12    13    14    15
>>>>         48:      0     1     2     3     4     5     6     7
>>>>         56:      8     9    10    11    12    13    14    15
>>>>         64:      0     1     2     3     4     5     6     7
>>>>         72:      8     9    10    11    12    13    14    15
>>>>         80:      0     1     2     3     4     5     6     7
>>>>         88:      8     9    10    11    12    13    14    15
>>>>         96:      0     1     2     3     4     5     6     7
>>>>         104:      8     9    10    11    12    13    14    15
>>>>         112:      0     1     2     3     4     5     6     7
>>>>         120:      8     9    10    11    12    13    14    15
>>>>
>>>>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
>>>>
>>>>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?
>>>
>>> It really depends on which cpus are assigned to each IRQ.
>>>
>>
>> Hi Eric,
>>
>> Each irq was assigned to a single cpu, for example:
>>
>> irq     cpu
>>
>> 117      0
>> 118      1
>>
>> ......
>>
>> 179      62
>>
>> All cpus trigger interrupts, not only cpus 0~15.
>> It seems the result is inconsistent with the RSS indirection table.
>>
>>
> 
> I misread your report, I thought you had 16 receive queues.
> 
> Why don't you change "ethtool -L eno5 rx 16", instead of trying to
> configure RSS manually?

Hi Eric,

We want to make full use of cpu resources to receive packets, so
we enabled 63 rx queues. But we found that the rate of interrupt growth
on cpus 0~15 is almost twice that of the other cpus. I don't know
whether this is related to the RSS configuration; we didn't make any
changes to the RSS configuration after the server came up.



FYI, on another server we use a Mellanox Technologies MT27800 NIC.
There, the rate of interrupt growth across cpus 0~62 shows little variation.

Its RSS configuration is as follows:

RX flow hash indirection table for ens2f0np0 with 63 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33    34    35    36    37    38    39
   40:     40    41    42    43    44    45    46    47
   48:     48    49    50    51    52    53    54    55
   56:     56    57    58    59    60    61    62     0
   64:      1     2     3     4     5     6     7     8
   72:      9    10    11    12    13    14    15    16
   80:     17    18    19    20    21    22    23    24
   88:     25    26    27    28    29    30    31    32
   96:     33    34    35    36    37    38    39    40
  104:     41    42    43    44    45    46    47    48
  112:     49    50    51    52    53    54    55    56
  120:     57    58    59    60    61    62     0     1
  128:      2     3     4     5     6     7     8     9
  136:     10    11    12    13    14    15    16    17
  144:     18    19    20    21    22    23    24    25
  152:     26    27    28    29    30    31    32    33
  160:     34    35    36    37    38    39    40    41
  168:     42    43    44    45    46    47    48    49
  176:     50    51    52    53    54    55    56    57
  184:     58    59    60    61    62     0     1     2
  192:      3     4     5     6     7     8     9    10
  200:     11    12    13    14    15    16    17    18
  208:     19    20    21    22    23    24    25    26
  216:     27    28    29    30    31    32    33    34
  224:     35    36    37    38    39    40    41    42
  232:     43    44    45    46    47    48    49    50
  240:     51    52    53    54    55    56    57    58
  248:     59    60    61    62     0     1     2     3


I am confused about why the ixgbe NIC can dispatch packets
to rx queues that are not specified in the RSS configuration.

Thanks!




* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 11:23       ` Haifeng Xu
@ 2025-01-02 11:46         ` Eric Dumazet
  2025-01-03  2:36           ` Haifeng Xu
  2025-01-02 16:01         ` Edward Cree
  1 sibling, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2025-01-02 11:46 UTC (permalink / raw)
  To: Haifeng Xu
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan

On Thu, Jan 2, 2025 at 12:23 PM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>
>
>
> On 2025/1/2 18:34, Eric Dumazet wrote:
> > On Thu, Jan 2, 2025 at 9:43 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
> >>
> >>
> >>
> >> On 2025/1/2 16:13, Eric Dumazet wrote:
> >>> On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
> >>>>
> >>>> Hi masters,
> >>>>
> >>>>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
> >>>>         The RSS configuration is as follows:
> >>>>
> >>>>         RX flow hash indirection table for eno5 with 63 RX ring(s):
> >>>>         0:      0     1     2     3     4     5     6     7
> >>>>         8:      8     9    10    11    12    13    14    15
> >>>>         16:      0     1     2     3     4     5     6     7
> >>>>         24:      8     9    10    11    12    13    14    15
> >>>>         32:      0     1     2     3     4     5     6     7
> >>>>         40:      8     9    10    11    12    13    14    15
> >>>>         48:      0     1     2     3     4     5     6     7
> >>>>         56:      8     9    10    11    12    13    14    15
> >>>>         64:      0     1     2     3     4     5     6     7
> >>>>         72:      8     9    10    11    12    13    14    15
> >>>>         80:      0     1     2     3     4     5     6     7
> >>>>         88:      8     9    10    11    12    13    14    15
> >>>>         96:      0     1     2     3     4     5     6     7
> >>>>         104:      8     9    10    11    12    13    14    15
> >>>>         112:      0     1     2     3     4     5     6     7
> >>>>         120:      8     9    10    11    12    13    14    15
> >>>>
> >>>>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
> >>>>
> >>>>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?
> >>>
> >>> It really depends on which cpus are assigned to each IRQ.
> >>>
> >>
> >> Hi Eric,
> >>
> >> Each irq was assigned to a single cpu, for example:
> >>
> >> irq     cpu
> >>
> >> 117      0
> >> 118      1
> >>
> >> ......
> >>
> >> 179      62
> >>
> >> All cpus trigger interrupts, not only cpus 0~15.
> >> It seems the result is inconsistent with the RSS indirection table.
> >>
> >>
> >
> > I misread your report, I thought you had 16 receive queues.
> >
> > Why don't you change "ethtool -L eno5 rx 16", instead of trying to
> >> configure RSS manually?
>
> Hi Eric,
>
> We want to make full use of cpu resources to receive packets, so
> we enabled 63 rx queues. But we found that the rate of interrupt growth
> on cpus 0~15 is almost twice that of the other cpus. I don't know
> whether this is related to the RSS configuration; we didn't make any
> changes to the RSS configuration after the server came up.
>
>
>
> FYI, on another server we use a Mellanox Technologies MT27800 NIC.
> There, the rate of interrupt growth across cpus 0~62 shows little variation.
>
> Its RSS configuration is as follows:
>
> RX flow hash indirection table for ens2f0np0 with 63 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33    34    35    36    37    38    39
>    40:     40    41    42    43    44    45    46    47
>    48:     48    49    50    51    52    53    54    55
>    56:     56    57    58    59    60    61    62     0
>    64:      1     2     3     4     5     6     7     8
>    72:      9    10    11    12    13    14    15    16
>    80:     17    18    19    20    21    22    23    24
>    88:     25    26    27    28    29    30    31    32
>    96:     33    34    35    36    37    38    39    40
>   104:     41    42    43    44    45    46    47    48
>   112:     49    50    51    52    53    54    55    56
>   120:     57    58    59    60    61    62     0     1
>   128:      2     3     4     5     6     7     8     9
>   136:     10    11    12    13    14    15    16    17
>   144:     18    19    20    21    22    23    24    25
>   152:     26    27    28    29    30    31    32    33
>   160:     34    35    36    37    38    39    40    41
>   168:     42    43    44    45    46    47    48    49
>   176:     50    51    52    53    54    55    56    57
>   184:     58    59    60    61    62     0     1     2
>   192:      3     4     5     6     7     8     9    10
>   200:     11    12    13    14    15    16    17    18
>   208:     19    20    21    22    23    24    25    26
>   216:     27    28    29    30    31    32    33    34
>   224:     35    36    37    38    39    40    41    42
>   232:     43    44    45    46    47    48    49    50
>   240:     51    52    53    54    55    56    57    58
>   248:     59    60    61    62     0     1     2     3
>
>
> I am confused about why the ixgbe NIC can dispatch packets
> to rx queues that are not specified in the RSS configuration.

Perhaps make sure to change the RX flow hash indirection table on the
Intel NIC then...

Maybe you changed the default configuration.

ethtool -X eno5 equal 64
Or
ethtool -X eno5 default
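
Either way, you can verify the resulting table afterwards (a sketch):

    ethtool -x eno5    # dump the RX flow hash indirection table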


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 11:23       ` Haifeng Xu
  2025-01-02 11:46         ` Eric Dumazet
@ 2025-01-02 16:01         ` Edward Cree
  2025-01-02 16:39           ` Jakub Kicinski
  2025-01-03  3:05           ` Haifeng Xu
  1 sibling, 2 replies; 15+ messages in thread
From: Edward Cree @ 2025-01-02 16:01 UTC (permalink / raw)
  To: Haifeng Xu, Eric Dumazet
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan

On 02/01/2025 11:23, Haifeng Xu wrote:
> We want to make full use of cpu resources to receive packets, so
> we enabled 63 rx queues. But we found that the rate of interrupt growth
> on cpus 0~15 is almost twice that of the other cpus.
...
> I am confused about why the ixgbe NIC can dispatch packets
> to rx queues that are not specified in the RSS configuration.

Hypothesis: it isn't doing so, RX is only happening on cpus (and
 queues) 0-15, but the other CPUs are still sending traffic and
 thus getting TX completion interrupts from their TX queues.
`ethtool -S` output has per-queue traffic stats which should
 confirm this.
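
Something like this (a sketch; exact stat names vary by driver, but on
ixgbe they should look like rx_queue_N_packets/bytes):

    ethtool -S eno5 | grep -E 'rx_queue_[0-9]+_(packets|bytes)'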

(But Eric is right that if you _want_ RX to use every CPU you
 should just change the indirection table.)


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 16:01         ` Edward Cree
@ 2025-01-02 16:39           ` Jakub Kicinski
  2025-01-03  2:37             ` Haifeng Xu
  2025-01-03  3:05           ` Haifeng Xu
  1 sibling, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-01-02 16:39 UTC (permalink / raw)
  To: Edward Cree
  Cc: Haifeng Xu, Eric Dumazet, Tony Nguyen, Przemek Kitszel,
	David S. Miller, Paolo Abeni, linux-kernel, netdev,
	intel-wired-lan

On Thu, 2 Jan 2025 16:01:18 +0000 Edward Cree wrote:
> On 02/01/2025 11:23, Haifeng Xu wrote:
> > We want to make full use of cpu resources to receive packets, so
> > we enabled 63 rx queues. But we found that the rate of interrupt growth
> > on cpus 0~15 is almost twice that of the other cpus.
> ...
> > I am confused about why the ixgbe NIC can dispatch packets
> > to rx queues that are not specified in the RSS configuration.
> 
> Hypothesis: it isn't doing so, RX is only happening on cpus (and
>  queues) 0-15, but the other CPUs are still sending traffic and
>  thus getting TX completion interrupts from their TX queues.
> `ethtool -S` output has per-queue traffic stats which should
>  confirm this.
> 
> (But Eric is right that if you _want_ RX to use every CPU you
>  should just change the indirection table.)

IIRC Niantic (the 82599) had 4-bit entries in the RSS indirection table
or some such. It wasn't possible to RSS across more than 16 queues at a
time. It's a great NIC but a bit dated at this point.
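
(With the 128-entry indirection table shown above, 4-bit entries can
name at most 2^4 = 16 distinct queues, which matches the pattern in the
ethtool -x output.)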


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 11:46         ` Eric Dumazet
@ 2025-01-03  2:36           ` Haifeng Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Haifeng Xu @ 2025-01-03  2:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan



On 2025/1/2 19:46, Eric Dumazet wrote:
> On Thu, Jan 2, 2025 at 12:23 PM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>
>>
>>
>> On 2025/1/2 18:34, Eric Dumazet wrote:
>>> On Thu, Jan 2, 2025 at 9:43 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/1/2 16:13, Eric Dumazet wrote:
>>>>> On Thu, Jan 2, 2025 at 4:53 AM Haifeng Xu <haifeng.xu@shopee.com> wrote:
>>>>>>
>>>>>> Hi masters,
>>>>>>
>>>>>>         We use the Intel Corporation 82599ES NIC in our production environment. It has 63 rx queues, and each rx queue's interrupt is processed by a single cpu.
>>>>>>         The RSS configuration is as follows:
>>>>>>
>>>>>>         RX flow hash indirection table for eno5 with 63 RX ring(s):
>>>>>>         0:      0     1     2     3     4     5     6     7
>>>>>>         8:      8     9    10    11    12    13    14    15
>>>>>>         16:      0     1     2     3     4     5     6     7
>>>>>>         24:      8     9    10    11    12    13    14    15
>>>>>>         32:      0     1     2     3     4     5     6     7
>>>>>>         40:      8     9    10    11    12    13    14    15
>>>>>>         48:      0     1     2     3     4     5     6     7
>>>>>>         56:      8     9    10    11    12    13    14    15
>>>>>>         64:      0     1     2     3     4     5     6     7
>>>>>>         72:      8     9    10    11    12    13    14    15
>>>>>>         80:      0     1     2     3     4     5     6     7
>>>>>>         88:      8     9    10    11    12    13    14    15
>>>>>>         96:      0     1     2     3     4     5     6     7
>>>>>>         104:      8     9    10    11    12    13    14    15
>>>>>>         112:      0     1     2     3     4     5     6     7
>>>>>>         120:      8     9    10    11    12    13    14    15
>>>>>>
>>>>>>         The maximum number of RSS queues is 16, so I have some questions about this. Will cpus other than 0~15 receive rx interrupts?
>>>>>>
>>>>>>         In our production environment, cpus 16~62 also receive rx interrupts. Was our RSS misconfigured?
>>>>>
>>>>> It really depends on which cpus are assigned to each IRQ.
>>>>>
>>>>
>>>> Hi Eric,
>>>>
>>>> Each irq was assigned to a single cpu, for example:
>>>>
>>>> irq     cpu
>>>>
>>>> 117      0
>>>> 118      1
>>>>
>>>> ......
>>>>
>>>> 179      62
>>>>
>>>> All cpus trigger interrupts, not only cpus 0~15.
>>>> It seems the result is inconsistent with the RSS indirection table.
>>>>
>>>>
>>>
>>> I misread your report, I thought you had 16 receive queues.
>>>
>>> Why don't you change "ethtool -L eno5 rx 16", instead of trying to
>>> configure RSS manually?
>>
>> Hi Eric,
>>
>> We want to make full use of cpu resources to receive packets, so
>> we enabled 63 rx queues. But we found that the rate of interrupt growth
>> on cpus 0~15 is almost twice that of the other cpus. I don't know
>> whether this is related to the RSS configuration; we didn't make any
>> changes to the RSS configuration after the server came up.
>>
>>
>>
>> FYI, on another server we use a Mellanox Technologies MT27800 NIC.
>> There, the rate of interrupt growth across cpus 0~62 shows little variation.
>>
>> Its RSS configuration is as follows:
>>
>> RX flow hash indirection table for ens2f0np0 with 63 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33    34    35    36    37    38    39
>>    40:     40    41    42    43    44    45    46    47
>>    48:     48    49    50    51    52    53    54    55
>>    56:     56    57    58    59    60    61    62     0
>>    64:      1     2     3     4     5     6     7     8
>>    72:      9    10    11    12    13    14    15    16
>>    80:     17    18    19    20    21    22    23    24
>>    88:     25    26    27    28    29    30    31    32
>>    96:     33    34    35    36    37    38    39    40
>>   104:     41    42    43    44    45    46    47    48
>>   112:     49    50    51    52    53    54    55    56
>>   120:     57    58    59    60    61    62     0     1
>>   128:      2     3     4     5     6     7     8     9
>>   136:     10    11    12    13    14    15    16    17
>>   144:     18    19    20    21    22    23    24    25
>>   152:     26    27    28    29    30    31    32    33
>>   160:     34    35    36    37    38    39    40    41
>>   168:     42    43    44    45    46    47    48    49
>>   176:     50    51    52    53    54    55    56    57
>>   184:     58    59    60    61    62     0     1     2
>>   192:      3     4     5     6     7     8     9    10
>>   200:     11    12    13    14    15    16    17    18
>>   208:     19    20    21    22    23    24    25    26
>>   216:     27    28    29    30    31    32    33    34
>>   224:     35    36    37    38    39    40    41    42
>>   232:     43    44    45    46    47    48    49    50
>>   240:     51    52    53    54    55    56    57    58
>>   248:     59    60    61    62     0     1     2     3
>>
>>
>> I am confused about why the ixgbe NIC can dispatch packets
>> to rx queues that are not specified in the RSS configuration.
> 
> Perhaps make sure to change the RX flow hash indirection table on the
> Intel NIC then...
> 
> Maybe you changed the default configuration.
> 
> ethtool -X eno5 equal 64


The maximum number of RSS queues supported by the Intel Corporation 82599ES NIC
is 16. When I specify a number larger than 16, it shows the message below.

"Cannot set RX flow hash configuration: Invalid argument."

> Or
> ethtool -X eno5 default

This command runs successfully, but as I said above, the table still only uses 16 queues.

RX flow hash indirection table for eno5 with 63 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:      0     1     2     3     4     5     6     7
   24:      8     9    10    11    12    13    14    15
   32:      0     1     2     3     4     5     6     7
   40:      8     9    10    11    12    13    14    15
   48:      0     1     2     3     4     5     6     7
   56:      8     9    10    11    12    13    14    15
   64:      0     1     2     3     4     5     6     7
   72:      8     9    10    11    12    13    14    15
   80:      0     1     2     3     4     5     6     7
   88:      8     9    10    11    12    13    14    15
   96:      0     1     2     3     4     5     6     7
  104:      8     9    10    11    12    13    14    15
  112:      0     1     2     3     4     5     6     7
  120:      8     9    10    11    12    13    14    15

Thanks!


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 16:39           ` Jakub Kicinski
@ 2025-01-03  2:37             ` Haifeng Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Haifeng Xu @ 2025-01-03  2:37 UTC (permalink / raw)
  To: Jakub Kicinski, Edward Cree
  Cc: Eric Dumazet, Tony Nguyen, Przemek Kitszel, David S. Miller,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan



On 2025/1/3 00:39, Jakub Kicinski wrote:
> On Thu, 2 Jan 2025 16:01:18 +0000 Edward Cree wrote:
>> On 02/01/2025 11:23, Haifeng Xu wrote:
>>> We want to make full use of cpu resources to receive packets, so
>>> we enabled 63 rx queues. But we found that the rate of interrupt growth
>>> on cpus 0~15 is almost twice that of the other cpus.
>> ...
>>> I am confused about why the ixgbe NIC can dispatch packets
>>> to rx queues that are not specified in the RSS configuration.
>>
>> Hypothesis: it isn't doing so, RX is only happening on cpus (and
>>  queues) 0-15, but the other CPUs are still sending traffic and
>>  thus getting TX completion interrupts from their TX queues.
>> `ethtool -S` output has per-queue traffic stats which should
>>  confirm this.
>>
>> (But Eric is right that if you _want_ RX to use every CPU you
>>  should just change the indirection table.)
> 
> IIRC Niantic (the 82599) had 4-bit entries in the RSS indirection table
> or some such. It wasn't possible to RSS across more than 16 queues at a
> time. It's a great NIC but a bit dated at this point.

Yes, it only has 16 RSS queues.


* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-02 16:01         ` Edward Cree
  2025-01-02 16:39           ` Jakub Kicinski
@ 2025-01-03  3:05           ` Haifeng Xu
  2025-01-07 17:16             ` Tony Nguyen
  1 sibling, 1 reply; 15+ messages in thread
From: Haifeng Xu @ 2025-01-03  3:05 UTC (permalink / raw)
  To: Edward Cree, Eric Dumazet
  Cc: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
	Paolo Abeni, linux-kernel, netdev, intel-wired-lan



On 2025/1/3 00:01, Edward Cree wrote:
> On 02/01/2025 11:23, Haifeng Xu wrote:
>> We want to make full use of cpu resources to receive packets, so
>> we enabled 63 rx queues. But we found that the rate of interrupt growth
>> on cpus 0~15 is almost twice that of the other cpus.
> ...
>> I am confused about why the ixgbe NIC can dispatch packets
>> to rx queues that are not specified in the RSS configuration.
> 
> Hypothesis: it isn't doing so, RX is only happening on cpus (and
>  queues) 0-15, but the other CPUs are still sending traffic and
>  thus getting TX completion interrupts from their TX queues.
> `ethtool -S` output has per-queue traffic stats which should
>  confirm this.
> 

I used ethtool -S to check the rx queue stats and here is the result.

According to the stats below, all cpus have received new packets.

cpu     t1(bytes)       t2(bytes)       delta(bytes)

0	154155550267550	154156433828875	883561325
1	148748566285840	148749509346247	943060407
2	148874911191685	148875798038140	886846455
3	152483460327704	152484251468998	791141294
4	147790981836915	147791775847804	794010889
5	146047892285722	146048778285682	885999960
6	142880516825921	142881213804363	696978442
7	152016735168735	152017707542774	972374039
8	146019936404393	146020739070311	802665918
9	147448522715540	147449258018186	735302646
10	145865736299432	145866601503106	865203674
11	149548527982122	149549289026453	761044331
12	146848384328236	146849303547769	919219533
13	152942139118542	152942769029253	629910711
14	150884661854828	150885556866976	895012148
15	149222733506734	149223510491115	776984381
16	34150226069524	34150375855113	149785589
17	34115700500819	34115914271025	213770206
18	33906215129998	33906448044501	232914503
19	33983812095357	33983986258546	174163189
20	34156349675011	34156565159083	215484072
21	33574293379024	33574490695725	197316701
22	33438129453422	33438297911151	168457729
23	32967454521585	32967612494711	157973126
24	33507443427266	33507604828468	161401202
25	33413275870121	33413433901940	158031819
26	33852322542796	33852527061150	204518354
27	33131162685385	33131330621474	167936089
28	33407661780251	33407823112381	161332130
29	34256799173845	34256944837757	145663912
30	33814458585183	33814623673528	165088345
31	33848638714862	33848775218038	136503176
32	18683932398308	18684069540891	137142583
33	19454524281229	19454647908293	123627064
34	19717744365436	19717900618222	156252786
35	20295086765202	20295245869666	159104464
36	20501853066588	20502000738936	147672348
37	20954631043374	20954797204375	166161001
38	21102911073326	21103062510369	151437043
39	21376404644179	21376515307288	110663109
40	20935812784743	20935983891491	171106748
41	20721278456831	20721435955715	157498884
42	21268291801465	21268425244578	133443113
43	21661413672829	21661629019091	215346262
44	21696437732484	21696568800049	131067565
45	21027869000890	21028020401214	151400324
46	21707137252644	21707293761990	156509346
47	20655623913790	20655740452889	116539099
48	32692002128477	32692138244468	136115991
49	33548445851486	33548569927672	124076186
50	33197264968787	33197448645817	183677030
51	33379544010500	33379746565576	202555076
52	33503579011721	33503722596159	143584438
53	33145734550468	33145892305819	157755351
54	33422692741858	33422844156764	151414906
55	32750945531107	32751131302251	185771144
56	33404955373530	33405157766253	202392723
57	33701185654471	33701313174725	127520254
58	33014531699810	33014700058409	168358599
59	32948906758429	32949151147605	244389176
60	33470813725985	33470993164755	179438770
61	33803771479735	33803971758441	200278706
62	33509751180818	33509926649969	175469151

Thanks!

> (But Eric is right that if you _want_ RX to use every CPU you
>  should just change the indirection table.)



* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-03  3:05           ` Haifeng Xu
@ 2025-01-07 17:16             ` Tony Nguyen
  2025-01-08  3:36               ` Haifeng Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Nguyen @ 2025-01-07 17:16 UTC (permalink / raw)
  To: Haifeng Xu, Edward Cree, Eric Dumazet, Aleksandr Loktionov,
	Kwapulinski, Piotr
  Cc: Przemek Kitszel, David S. Miller, Jakub Kicinski, Paolo Abeni,
	linux-kernel, netdev, intel-wired-lan



On 1/2/2025 7:05 PM, Haifeng Xu wrote:
> 
> 
> On 2025/1/3 00:01, Edward Cree wrote:
>> On 02/01/2025 11:23, Haifeng Xu wrote:
>>> We want to make full use of cpu resources to receive packets, so
>>> we enabled 63 rx queues. But we found that the rate of interrupt growth
>>> on cpus 0~15 is almost twice that of the other cpus.
>> ...
>>> I am confused about why the ixgbe NIC can dispatch packets
>>> to rx queues that are not specified in the RSS configuration.
>>
>> Hypothesis: it isn't doing so, RX is only happening on cpus (and
>>   queues) 0-15, but the other CPUs are still sending traffic and
>>   thus getting TX completion interrupts from their TX queues.
>> `ethtool -S` output has per-queue traffic stats which should
>>   confirm this.
>>
> 
> I used ethtool -S to check the rx queue stats and here is the result.
> 
> According to the stats below, all cpus have received new packets.

+ Alex and Piotr

What's your ntuple filter setting? If it's off, I suspect it may be the 
Flow Director ATR (Application Targeting Routing) feature which will 
utilize all queues. I believe if you turn on ntuple filters this will 
turn that feature off.
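
For example (a sketch; feature names as reported by ethtool):

    ethtool -k eno5 | grep ntuple   # check whether ntuple filters are on
    ethtool -K eno5 ntuple on       # per the above, this should disable ATR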

Thanks,
Tony

> 
> cpu     t1(bytes)       t2(bytes)       delta(bytes)
> 
> 0	154155550267550	154156433828875	883561325
> 1	148748566285840	148749509346247	943060407
> 2	148874911191685	148875798038140	886846455
> 3	152483460327704	152484251468998	791141294
> 4	147790981836915	147791775847804	794010889
> 5	146047892285722	146048778285682	885999960
> 6	142880516825921	142881213804363	696978442
> 7	152016735168735	152017707542774	972374039
> 8	146019936404393	146020739070311	802665918
> 9	147448522715540	147449258018186	735302646
> 10	145865736299432	145866601503106	865203674
> 11	149548527982122	149549289026453	761044331
> 12	146848384328236	146849303547769	919219533
> 13	152942139118542	152942769029253	629910711
> 14	150884661854828	150885556866976	895012148
> 15	149222733506734	149223510491115	776984381
> 16	34150226069524	34150375855113	149785589
> 17	34115700500819	34115914271025	213770206
> 18	33906215129998	33906448044501	232914503
> 19	33983812095357	33983986258546	174163189
> 20	34156349675011	34156565159083	215484072
> 21	33574293379024	33574490695725	197316701
> 22	33438129453422	33438297911151	168457729
> 23	32967454521585	32967612494711	157973126
> 24	33507443427266	33507604828468	161401202
> 25	33413275870121	33413433901940	158031819
> 26	33852322542796	33852527061150	204518354
> 27	33131162685385	33131330621474	167936089
> 28	33407661780251	33407823112381	161332130
> 29	34256799173845	34256944837757	145663912
> 30	33814458585183	33814623673528	165088345
> 31	33848638714862	33848775218038	136503176
> 32	18683932398308	18684069540891	137142583
> 33	19454524281229	19454647908293	123627064
> 34	19717744365436	19717900618222	156252786
> 35	20295086765202	20295245869666	159104464
> 36	20501853066588	20502000738936	147672348
> 37	20954631043374	20954797204375	166161001
> 38	21102911073326	21103062510369	151437043
> 39	21376404644179	21376515307288	110663109
> 40	20935812784743	20935983891491	171106748
> 41	20721278456831	20721435955715	157498884
> 42	21268291801465	21268425244578	133443113
> 43	21661413672829	21661629019091	215346262
> 44	21696437732484	21696568800049	131067565
> 45	21027869000890	21028020401214	151400324
> 46	21707137252644	21707293761990	156509346
> 47	20655623913790	20655740452889	116539099
> 48	32692002128477	32692138244468	136115991
> 49	33548445851486	33548569927672	124076186
> 50	33197264968787	33197448645817	183677030
> 51	33379544010500	33379746565576	202555076
> 52	33503579011721	33503722596159	143584438
> 53	33145734550468	33145892305819	157755351
> 54	33422692741858	33422844156764	151414906
> 55	32750945531107	32751131302251	185771144
> 56	33404955373530	33405157766253	202392723
> 57	33701185654471	33701313174725	127520254
> 58	33014531699810	33014700058409	168358599
> 59	32948906758429	32949151147605	244389176
> 60	33470813725985	33470993164755	179438770
> 61	33803771479735	33803971758441	200278706
> 62	33509751180818	33509926649969	175469151
> 
> Thanks!
> 
>> (But Eric is right that if you _want_ RX to use every CPU you
>>   should just change the indirection table.)
> 



* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-07 17:16             ` Tony Nguyen
@ 2025-01-08  3:36               ` Haifeng Xu
  2025-01-08 21:06                 ` Tony Nguyen
  0 siblings, 1 reply; 15+ messages in thread
From: Haifeng Xu @ 2025-01-08  3:36 UTC (permalink / raw)
  To: Tony Nguyen, Edward Cree, Eric Dumazet, Aleksandr Loktionov,
	Kwapulinski, Piotr
  Cc: Przemek Kitszel, David S. Miller, Jakub Kicinski, Paolo Abeni,
	linux-kernel, netdev, intel-wired-lan



On 2025/1/8 01:16, Tony Nguyen wrote:
> 
> 
> On 1/2/2025 7:05 PM, Haifeng Xu wrote:
>>
>>
>> On 2025/1/3 00:01, Edward Cree wrote:
>>> On 02/01/2025 11:23, Haifeng Xu wrote:
>>>> We want to make full use of cpu resources to receive packets, so
>>>> we enabled 63 rx queues. But we found that the rate of interrupt growth
>>>> on cpus 0~15 is almost twice that of the other cpus.
>>> ...
>>>> I am confused about why the ixgbe NIC can dispatch packets
>>>> to rx queues that are not specified in the RSS configuration.
>>>
>>> Hypothesis: it isn't doing so, RX is only happening on cpus (and
>>>   queues) 0-15, but the other CPUs are still sending traffic and
>>>   thus getting TX completion interrupts from their TX queues.
>>> `ethtool -S` output has per-queue traffic stats which should
>>>   confirm this.
>>>
>>
>> I used ethtool -S to check the rx queue stats and here is the result.
>>
>> According to the stats below, all cpus have received new packets.
> 
> + Alex and Piotr
> 
> What's your ntuple filter setting? If it's off, I suspect it may be the Flow Director ATR (Application Targeting Routing) feature which will utilize all queues. I believe if you turn on ntuple filters this will turn that feature off.

Yes, our ntuple filter setting is off. After turning on the ntuple filters, I compared the deltas of received packets:
only rx rings 0~15 are non-zero, the other rx rings are zero.

If we want to spread the packets across rings 0~62, how can we tune the NIC settings?
We have enabled 63 rx queues, irq affinity and rx-flow-hash, but cpus 0~15
received more packets than the others.

Thanks!



> 
> Thanks,
> Tony
> 
>>
>> cpu     t1(bytes)       t2(bytes)       delta(bytes)
>>
>> 0    154155550267550    154156433828875    883561325
>> 1    148748566285840    148749509346247    943060407
>> 2    148874911191685    148875798038140    886846455
>> 3    152483460327704    152484251468998    791141294
>> 4    147790981836915    147791775847804    794010889
>> 5    146047892285722    146048778285682    885999960
>> 6    142880516825921    142881213804363    696978442
>> 7    152016735168735    152017707542774    972374039
>> 8    146019936404393    146020739070311    802665918
>> 9    147448522715540    147449258018186    735302646
>> 10    145865736299432    145866601503106    865203674
>> 11    149548527982122    149549289026453    761044331
>> 12    146848384328236    146849303547769    919219533
>> 13    152942139118542    152942769029253    629910711
>> 14    150884661854828    150885556866976    895012148
>> 15    149222733506734    149223510491115    776984381
>> 16    34150226069524    34150375855113    149785589
>> 17    34115700500819    34115914271025    213770206
>> 18    33906215129998    33906448044501    232914503
>> 19    33983812095357    33983986258546    174163189
>> 20    34156349675011    34156565159083    215484072
>> 21    33574293379024    33574490695725    197316701
>> 22    33438129453422    33438297911151    168457729
>> 23    32967454521585    32967612494711    157973126
>> 24    33507443427266    33507604828468    161401202
>> 25    33413275870121    33413433901940    158031819
>> 26    33852322542796    33852527061150    204518354
>> 27    33131162685385    33131330621474    167936089
>> 28    33407661780251    33407823112381    161332130
>> 29    34256799173845    34256944837757    145663912
>> 30    33814458585183    33814623673528    165088345
>> 31    33848638714862    33848775218038    136503176
>> 32    18683932398308    18684069540891    137142583
>> 33    19454524281229    19454647908293    123627064
>> 34    19717744365436    19717900618222    156252786
>> 35    20295086765202    20295245869666    159104464
>> 36    20501853066588    20502000738936    147672348
>> 37    20954631043374    20954797204375    166161001
>> 38    21102911073326    21103062510369    151437043
>> 39    21376404644179    21376515307288    110663109
>> 40    20935812784743    20935983891491    171106748
>> 41    20721278456831    20721435955715    157498884
>> 42    21268291801465    21268425244578    133443113
>> 43    21661413672829    21661629019091    215346262
>> 44    21696437732484    21696568800049    131067565
>> 45    21027869000890    21028020401214    151400324
>> 46    21707137252644    21707293761990    156509346
>> 47    20655623913790    20655740452889    116539099
>> 48    32692002128477    32692138244468    136115991
>> 49    33548445851486    33548569927672    124076186
>> 50    33197264968787    33197448645817    183677030
>> 51    33379544010500    33379746565576    202555076
>> 52    33503579011721    33503722596159    143584438
>> 53    33145734550468    33145892305819    157755351
>> 54    33422692741858    33422844156764    151414906
>> 55    32750945531107    32751131302251    185771144
>> 56    33404955373530    33405157766253    202392723
>> 57    33701185654471    33701313174725    127520254
>> 58    33014531699810    33014700058409    168358599
>> 59    32948906758429    32949151147605    244389176
>> 60    33470813725985    33470993164755    179438770
>> 61    33803771479735    33803971758441    200278706
>> 62    33509751180818    33509926649969    175469151
>>
>> Thanks!
>>
>>> (But Eric is right that if you _want_ RX to use every CPU you
>>>   should just change the indirection table.)
>>
> 



* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-08  3:36               ` Haifeng Xu
@ 2025-01-08 21:06                 ` Tony Nguyen
  2025-01-09  3:26                   ` Haifeng Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Nguyen @ 2025-01-08 21:06 UTC (permalink / raw)
  To: Haifeng Xu, Edward Cree, Eric Dumazet, Aleksandr Loktionov,
	Kwapulinski, Piotr
  Cc: Przemek Kitszel, David S. Miller, Jakub Kicinski, Paolo Abeni,
	linux-kernel, netdev, intel-wired-lan



On 1/7/2025 7:36 PM, Haifeng Xu wrote:
> 
> 
> On 2025/1/8 01:16, Tony Nguyen wrote:

...

>>
>> What's your ntuple filter setting? If it's off, I suspect it may be the Flow Director ATR (Application Targeting Routing) feature which will utilize all queues. I believe if you turn on ntuple filters this will turn that feature off.
> 
> Yes, our ntuple filter setting is off. After turning on the ntuple filters, I compared the deltas of received packets:
> only rx rings 0~15 are non-zero, the other rx rings are zero.
> 
> If we want to spread the packets across rings 0~62, how can we tune the NIC settings?
> We have enabled 63 rx queues, irq affinity and rx-flow-hash, but cpus 0~15
> received more packets than the others.

As Jakub mentioned earlier, HW RSS is only supported on this device for
16 queues. ATR will steer bi-directional traffic to utilize additional
queues; however, once it's exhausted it will fall back to RSS, which is
why CPUs 0-15 are receiving more traffic than the others. I'm not aware
of a way to evenly spread the traffic beyond the 16 HW-supported RSS
queues for this device.
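
If the goal is just to spread the processing rather than the hardware
steering, RPS (described in the Documentation/networking/scaling.rst
file Eric pointed to) can fan packets from the 16 RSS queues out to
more CPUs in software; a minimal sketch, with an illustrative mask
covering cpus 0-62:

    for q in /sys/class/net/eno5/queues/rx-*; do
        echo 7fffffff,ffffffff > $q/rps_cpus   # allow softirq on cpus 0-62
    done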

Thanks,
Tony

> Thanks!




* Re: [Question] ixgbe:Mechanism of RSS
  2025-01-08 21:06                 ` Tony Nguyen
@ 2025-01-09  3:26                   ` Haifeng Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Haifeng Xu @ 2025-01-09  3:26 UTC (permalink / raw)
  To: Tony Nguyen, Edward Cree, Eric Dumazet, Aleksandr Loktionov,
	Kwapulinski, Piotr
  Cc: Przemek Kitszel, David S. Miller, Jakub Kicinski, Paolo Abeni,
	linux-kernel, netdev, intel-wired-lan



On 2025/1/9 05:06, Tony Nguyen wrote:
> 
> 
> On 1/7/2025 7:36 PM, Haifeng Xu wrote:
>>
>>
>> On 2025/1/8 01:16, Tony Nguyen wrote:
> 
> ...
> 
>>>
>>> What's your ntuple filter setting? If it's off, I suspect it may be the Flow Director ATR (Application Targeting Routing) feature which will utilize all queues. I believe if you turn on ntuple filters this will turn that feature off.
>>
>> Yes, our ntuple filter setting is off. After turning on the ntuple filters, I compared the deltas of received packets:
>> only rx rings 0~15 are non-zero, the other rx rings are zero.
>>
>> If we want to spread the packets across rings 0~62, how can we tune the NIC settings?
>> We have enabled 63 rx queues, irq affinity and rx-flow-hash, but cpus 0~15
>> received more packets than the others.
> 
> As Jakub mentioned earlier, HW RSS is only supported on this device for 16 queues. ATR will steer bi-directional traffic to utilize additional queues; however, once it's exhausted it will fall back to RSS, which is why CPUs 0-15 are receiving more traffic than the others. I'm not aware of a way to evenly spread the traffic beyond the 16 HW-supported RSS queues for this device.

Ok, thanks!

> 
> Thanks,
> Tony
> 
>> Thanks!
> 
> 


