[Intel-wired-lan] i40e card Tx resets

* [Intel-wired-lan] i40e card Tx resets
@ 2016-03-14 21:43 Sowmini Varadhan
  2016-03-15  6:12 ` [Intel-wired-lan] [E1000-devel] " zhuyj
  0 siblings, 1 reply; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-14 21:43 UTC (permalink / raw)
  To: intel-wired-lan

Hi,

I am running some DB stress tests on both i40e and ixgbe. The
stress test is rds-stress (http://linux.die.net/man/1/rds-stress),
and I can list the full set of steps and parameters that I
use to run it if that would be helpful.

My test bed is a pair of X5-2 (Haswell) servers, each with a
Niantic (X540-AT2) card and a Fortville card. The Niantic/fortville
cards are connected back-to-back, so I essentially have a 10G
connection and a 40G connection.

With everything else (kernel, RDS modules, stress test and parameters)
unchanged, I get the expected throughput on the 10G
connection, but the i40e connection hits a lot of TX
errors that produce console messages like this:

  i40e 0000:81:00.0: TX driver issue detected, PF reset issued
  i40e 0000:81:00.0 eth2: adding 68:05:ca:30:db:30 vid=0
  i40e 0000:81:00.0: TX driver issue detected, PF reset issued
  i40e 0000:81:00.0 eth2: VSI_seid 390, Hung TX queue 32, tx_pending: 82, NTC:0xeb, HWB: 0xeb, NTU: 0x13d, TAIL: 0x13d
  i40e 0000:81:00.0 eth2: VSI_seid 390, Issuing force_wb for TX queue 32, Interrupt Reg

I understand these are MDD (Malicious Driver Detection) errors, but
how can I find out what triggered them? Any hints?

The other data point is that if I disable TSO and fall back
to GSO, there are no TX errors, and throughput matches the 10G
connection (for the same set of test parameters).
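
For reference, the TSO toggle is roughly the following. This is a minimal
sketch: set_tso and the ETHTOOL override are hypothetical helpers for
illustration (the override only allows a dry run), while
"ethtool -K <dev> tso on|off" is the standard ethtool offload switch.

```shell
# Hypothetical helper: turn TSO on or off for one interface.
# ETHTOOL can be overridden (e.g. ETHTOOL=echo) to print the
# arguments instead of changing the device.
set_tso() {
    dev=$1
    state=$2    # "on" or "off"
    "${ETHTOOL:-ethtool}" -K "$dev" tso "$state"
}
```

For example, "set_tso eth2 off" before a run and "set_tso eth2 on"
afterwards; "ethtool -k eth2" shows the resulting offload settings.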

Please let me know if there is any other info that would help.
The kernel is 4.5.0-rc2. Info for the i40e card:
    # ethtool -i eth3
    driver: i40e
    version: 1.4.8-k
    firmware-version: 5.02 0x80002285 0.0.0
    bus-info: 0000:81:00.1 
       :

Thanks in advance,
--Sowmini

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-14 21:43 [Intel-wired-lan] i40e card Tx resets Sowmini Varadhan
@ 2016-03-15  6:12 ` zhuyj
  2016-03-15  8:55   ` zhuyj
  0 siblings, 1 reply; 14+ messages in thread
From: zhuyj @ 2016-03-15  6:12 UTC (permalink / raw)
  To: intel-wired-lan

Hi,

I have a similar problem. Could you run the test with pktgen?
The result may not be related to TSO.

Zhu Yanjun

On 03/15/2016 05:43 AM, Sowmini Varadhan wrote:
> Hi,
>
> I am trying out some DB stress tests on both i40e and ixgbe. The
> stress test that I use is rds-stress (http://linux.die.net/man/1/rds-stress),
> and I can list out the entire set of steps and parameters that I
> am using to run this if that info is  interesting.
>
> My test bed is a pair of X5-2 (Haswell) servers, each with a
> Niantic (X540-AT2) card and a Fortville card. The Niantic/fortville
> cards are connected back-to-back, so I essentially have a 10G
> connection and a 40G connection.
>
> Everything else (kernel, RDS modules, stress test and parameters)
> remaining the same, I get  the expected throughput on the 10G
> connection, but the i40e connection goes through a lot of TX
> errors that result in console messages like this:
>
>    i40e 0000:81:00.0: TX driver issue detected, PF reset issued
>    i40e 0000:81:00.0 eth2: adding 68:05:ca:30:db:30 vid=0
>    i40e 0000:81:00.0: TX driver issue detected, PF reset issued
>    i40e 0000:81:00.0 eth2: VSI_seid 390, Hung TX queue 32, tx_pending: 82, NTC:0xeb, HWB: 0xeb, NTU: 0x13d, TAIL: 0x13d
>    i40e 0000:81:00.0 eth2: VSI_seid 390, Issuing force_wb for TX queue 32, Interrupt Reg
>
> I understand these are "mdd errors", but how can I find out what
> triggered these errors, any hints?
>
> The other data-point here is that if I disable tso, and fall back
> to gso, there are no tx errors, and throughput matches the 10G
> connection (for the same set of test parameters).
>
> Please let me know if there is any other info that would help.
> The kernel is a 4.5.0-rc2 kernel. Info for the i40e card is
>
>      # ethtool -i eth3
>      driver: i40e
>      version: 1.4.8-k
>      firmware-version: 5.02 0x80002285 0.0.0
>      bus-info: 0000:81:00.1
>         :
>
> Thanks in advance,
> --Sowmini
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-15  6:12 ` [Intel-wired-lan] [E1000-devel] " zhuyj
@ 2016-03-15  8:55   ` zhuyj
  2016-03-15 10:54     ` Sowmini Varadhan
  0 siblings, 1 reply; 14+ messages in thread
From: zhuyj @ 2016-03-15  8:55 UTC (permalink / raw)
  To: intel-wired-lan

On 03/15/2016 02:12 PM, zhuyj wrote:
> Hi,
>
> I have the similar problem. Would you like to make tests with pktgen 
> tools?
> Maybe the test result is not related with tso.

Sorry, let me explain in more detail.
I have a similar problem. At first I thought it was related to TSO.
Then I tested with pktgen and found that the problem still occurred
whether TSO was enabled or not.

So I suggest testing with pktgen to rule out TSO.

Best Regards!
Zhu Yanjun

>
> Zhu Yanjun
>
> On 03/15/2016 05:43 AM, Sowmini Varadhan wrote:
>> Hi,
>>
>> I am trying out some DB stress tests on both i40e and ixgbe. The
>> stress test that I use is rds-stress 
>> (http://linux.die.net/man/1/rds-stress),
>> and I can list out the entire set of steps and parameters that I
>> am using to run this if that info is  interesting.
>>
>> My test bed is a pair of X5-2 (Haswell) servers, each with a
>> Niantic (X540-AT2) card and a Fortville card. The Niantic/fortville
>> cards are connected back-to-back, so I essentially have a 10G
>> connection and a 40G connection.
>>
>> Everything else (kernel, RDS modules, stress test and parameters)
>> remaining the same, I get  the expected throughput on the 10G
>> connection, but the i40e connection goes through a lot of TX
>> errors that result in console messages like this:
>>
>>    i40e 0000:81:00.0: TX driver issue detected, PF reset issued
>>    i40e 0000:81:00.0 eth2: adding 68:05:ca:30:db:30 vid=0
>>    i40e 0000:81:00.0: TX driver issue detected, PF reset issued
>>    i40e 0000:81:00.0 eth2: VSI_seid 390, Hung TX queue 32, 
>> tx_pending: 82, NTC:0xeb, HWB: 0xeb, NTU: 0x13d, TAIL: 0x13d
>>    i40e 0000:81:00.0 eth2: VSI_seid 390, Issuing force_wb for TX 
>> queue 32, Interrupt Reg
>>
>> I understand these are "mdd errors", but how can I find out what
>> triggered these errors, any hints?
>>
>> The other data-point here is that if I disable tso, and fall back
>> to gso, there are no tx errors, and throughput matches the 10G
>> connection (for the same set of test parameters).
>>
>> Please let me know if there is any other info that would help.
>> The kernel is a 4.5.0-rc2 kernel. Info for the i40e card is
>>
>>      # ethtool -i eth3
>>      driver: i40e
>>      version: 1.4.8-k
>>      firmware-version: 5.02 0x80002285 0.0.0
>>      bus-info: 0000:81:00.1
>>         :
>>
>> Thanks in advance,
>> --Sowmini
>>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-15  8:55   ` zhuyj
@ 2016-03-15 10:54     ` Sowmini Varadhan
  2016-03-16  3:19       ` zhuyj
  0 siblings, 1 reply; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-15 10:54 UTC (permalink / raw)
  To: intel-wired-lan

On (03/15/16 16:55), zhuyj wrote:
> Sorry. I explain this in details.
> I have an similar problem. At first, I think it is related with tso.
> Then I made tests with pktgen tools and found that this similar
> problem still occurred whether
> tso is enabled or not.
> 
> So I suggest to make tests with pktgen tools to exclude tso.
> 

I realize that TSO might not be the root cause (Tushar also
pointed that out) but might just be triggering the issue...

I don't think we need pktgen at this point: it's quite easy
to reproduce this on commodity Haswell servers by installing
rds-stress from the rpm below:

http://public-yum.oracle.com/repo/OracleLinux/OL6/ofed_UEK/x86_64//getPackageSource/rds-tools-2.0.7-1.12.el6.src.rpm

To run it, set up 2 nodes connected over i40e. I shall call them
"client" and "server", though both will send traffic in the test.

Start the listener:
 server# modprobe rds-tcp
 server# rds-stress -r <server addr>

Start the test:

 client# modprobe rds-tcp
 client# rds-stress -r <client addr> -s <server-addr> -q 256 -a 8192 -d16 -t16 -T30 

(all params are explained in the rds-stress man page)

If you do this on ixgbe, you will see that the column for "tx+rx K/s"
shows a steady throughput, whereas i40e numbers are bursty and low.

Also, for i40e, you will see messages about TX hangs on the console.

I think that, to find the root-cause, we need to see what is
triggering the mdd error.

Would be good if someone from Intel could provide some hints on
how to do that (or try the above tests!)

--Sowmini

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-15 10:54     ` Sowmini Varadhan
@ 2016-03-16  3:19       ` zhuyj
  2016-03-16  3:25         ` Sowmini Varadhan
  0 siblings, 1 reply; 14+ messages in thread
From: zhuyj @ 2016-03-16  3:19 UTC (permalink / raw)
  To: intel-wired-lan

On 03/15/2016 06:54 PM, Sowmini Varadhan wrote:
> On (03/15/16 16:55), zhuyj wrote:
>> Sorry. I explain this in details.
>> I have an similar problem. At first, I think it is related with tso.
>> Then I made tests with pktgen tools and found that this similar
>> problem still occurred whether
>> tso is enabled or not.
>>
>> So I suggest to make tests with pktgen tools to exclude tso.
>>
> I realize that TSO might not be the root cause (Tushar also
> pointed that out) but might just be triggering the issue...
>
> I dont think we need pktgen at this point- it's quite easy
> to reproduce this on commodity Haswell servers, and by installing
> the rds-stress from the rpm below:
>
> http://public-yum.oracle.com/repo/OracleLinux/OL6/ofed_UEK/x86_64//getPackageSource/rds-tools-2.0.7-1.12.el6.src.rpm
>
> To run it, set up 2 nodes  connected on i40e. I shall call them
> "client" and "server" though both will send traffic in the test
>
> Start the listener:
>   server# modprobe rds-tcp
>   server# rds-stress -r <server addr>
>
> Start the test:
>
>   client# modprobe rds-tcp
>   client# rds-stress -r <client addr> -s <server-addr> -q 256 -a 8192 -d16 -t16 -T30
>
> (all params are explained in the rds-stress man page)
>
> If you do this on ixgbe, you will see that the column for "tx+rx K/s"
> shows a steady throughput, whereas i40e numbers are bursty and low.
>
> Also, for i40e, you will see messages about TX hang on on the console.
>
> I think that, to find the root-cause, we need to see what is
> triggering the mdd error.
Hi,

Thanks for your reply.
Yesterday I tested with rds-tools. Unfortunately, I cannot
reproduce my problem with rds-tools.

But with pktgen I can reproduce the problem easily. I found
that if I set the packet size to 17792 or less,
the problem does not occur. But if I set the packet size above 17792, for
example 17793, the problem occurs.

As such, maybe the packet size triggers my problem. I am not sure whether
the packet size also triggers yours.
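
A threshold like this can be located by bisection rather than by hand. A
minimal sketch, assuming the failure is monotonic in packet size (every size
above the threshold fails): first_failing_size and run_ok are hypothetical
names, and run_ok here is only a stand-in that mimics the behaviour reported
above. A real predicate would run one pktgen pass at the given size and
succeed when no TX hang is logged.

```shell
# Print the smallest size in (good, bad] for which the predicate fails.
# $1 = size known to pass, $2 = size known to fail, rest = predicate command.
first_failing_size() {
    good=$1
    bad=$2
    shift 2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if "$@" "$mid"; then
            good=$mid    # still fine at mid, threshold is higher
        else
            bad=$mid     # fails at mid, threshold is at or below mid
        fi
    done
    echo "$bad"
}

# Stand-in predicate reproducing the reported 17792/17793 boundary.
run_ok() {
    [ "$1" -le 17792 ]
}

first_failing_size 64 65536 run_ok    # prints 17793
```

Bisection needs only about ten runs to cover the 64..65536 range, which
matters when each failing run ends in a PF reset.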

Best Regards!
Zhu Yanjun
>
> Would be good if someone from Intel could provide some hints on
> how to do that (or try the above tests!)
>
> --Sowmini


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-16  3:19       ` zhuyj
@ 2016-03-16  3:25         ` Sowmini Varadhan
  2016-03-16 11:46           ` zhuyj
  0 siblings, 1 reply; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-16  3:25 UTC (permalink / raw)
  To: intel-wired-lan

On (03/16/16 11:19), zhuyj wrote:
> 
> Thanks for your reply.
> Yesterday I made tests with rds-tools. It is very pity that I can
> not reproduce my problem with rds-tools.
> 
> But with pktgen tools, I can reproduce this problem easily. And I
> found that if I set the packet size to 17792 or less than this size,
> this problem would not occur. But if I set the packet size > 17792,
> for example 17793, my problem would occur.

I think it might have to do with number of sockets/cpu/cores and
the irq balancing issues.

> As such, maybe the packet size triggers my problem. I am not sure
> that the packet size will trigger your size.

In my case I did try against netperf request-response (but that is
single threaded) and iperf (but that is unidirectional, i.e.,
it is not a bidirectional request-response test)

Perhaps you could share the pktgen config (did you change the
code itself?), so that some of the i40e experts at Intel (who are also
on the to/cc lists of this mail) can try it out themselves.

I still think that the fundamental problem can be identified
by looking for what's causing the mdd event. 

It must be some type of bug, because ixgbe works fine for me,
and this seems like a regression on that performance.

--Sowmini

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-16  3:25         ` Sowmini Varadhan
@ 2016-03-16 11:46           ` zhuyj
  2016-03-16 14:36             ` Sowmini Varadhan
  0 siblings, 1 reply; 14+ messages in thread
From: zhuyj @ 2016-03-16 11:46 UTC (permalink / raw)
  To: intel-wired-lan

On 03/16/2016 11:25 AM, Sowmini Varadhan wrote:
> On (03/16/16 11:19), zhuyj wrote:
>> Thanks for your reply.
>> Yesterday I made tests with rds-tools. It is very pity that I can
>> not reproduce my problem with rds-tools.
>>
>> But with pktgen tools, I can reproduce this problem easily. And I
>> found that if I set the packet size to 17792 or less than this size,
>> this problem would not occur. But if I set the packet size > 17792,
>> for example 17793, my problem would occur.
> I think it might have to do with number of sockets/cpu/cores and
> the irq balancing issues.
>
>> As such, maybe the packet size triggers my problem. I am not sure
>> that the packet size will trigger your size.
> In my case I did try against netperf request-response (but that is
> single threaded) and iperf (but that is unidirectional, i.e.,
> it is not a bidirectional request-response test)
>
> perhaps if you share the pktgen config (did you change the
> code itself)? some of the i40e experts at intel (who are also
> on the to/cc lists of this mail) can try it out themselves?
>
> I still think that the fundamental problem can be identified
> by looking for what's causing the mdd event.
>
> It must be some type of bug, because ixgbe works fine for me,
> and this seems like a regression on that performance.
>
> --Sowmini
>
I am busy today. Tomorrow I will share the steps for the pktgen test.

Best Regards!
Zhu Yanjun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-16 11:46           ` zhuyj
@ 2016-03-16 14:36             ` Sowmini Varadhan
  2016-03-17  2:20               ` zhuyj
  0 siblings, 1 reply; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-16 14:36 UTC (permalink / raw)
  To: intel-wired-lan

On (03/16/16 19:46), zhuyj wrote:

> It is busy today. Tomorrow I will share the steps about pktgen tools.

Ok. I can give that a try on my machine. It would be good to
have some way to reproduce this as simply as possible; maybe
we can take the discussion to netdev to see if others there
have experienced this.

--Sowmini


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-16 14:36             ` Sowmini Varadhan
@ 2016-03-17  2:20               ` zhuyj
  2016-03-17  2:29                 ` zhuyj
  2016-03-17 18:56                 ` Sowmini Varadhan
  0 siblings, 2 replies; 14+ messages in thread
From: zhuyj @ 2016-03-17  2:20 UTC (permalink / raw)
  To: intel-wired-lan

On 03/16/2016 10:36 PM, Sowmini Varadhan wrote:
> On (03/16/16 19:46), zhuyj wrote:
>
>> It is busy today. Tomorrow I will share the steps about pktgen tools.
> Ok. I can give that a try on my machine. It would be good to
> have some way to reproduce this as simply as possible, maybe
> we can take the discussion to netdev to see if others there
> have experienced this.
>
> --Sowmini
>
1. modprobe pktgen (the module built by CONFIG_NET_PKTGEN)

2. Download the tar file and unpack it into any directory.
The scripts come from the kernel tree, under samples/pktgen/.

3. cd pktgen

4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number

If size is set to a large number, the defect occurs.
With a suitably smaller size, the defect does not occur.

In my tests, some igb NICs, such as the i210, work well no
matter how large the size is.
Other NICs, such as the 82580, do not work well if the size is too large.

As such, I think my problem comes from the hardware, and the large size
triggers it.

I hope this can help us all.

Zhu Yanjun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-17  2:20               ` zhuyj
@ 2016-03-17  2:29                 ` zhuyj
  2016-03-17 18:56                 ` Sowmini Varadhan
  1 sibling, 0 replies; 14+ messages in thread
From: zhuyj @ 2016-03-17  2:29 UTC (permalink / raw)
  To: intel-wired-lan

On 03/17/2016 10:20 AM, zhuyj wrote:
> On 03/16/2016 10:36 PM, Sowmini Varadhan wrote:
>> On (03/16/16 19:46), zhuyj wrote:
>>
>>> It is busy today. Tomorrow I will share the steps about pktgen tools.
>> Ok. I can give that a try on my machine. It would be good to
>> have some way to reproduce this as simply as possible, maybe
>> we can take the discussion to netdev to see if others there
>> have experienced this.
>>
>> --Sowmini
>>
> 1. modprobe NET_PKTGEN
>
> 2. download the tar file and uncompress to any directory.
> This tar file is from kernel. It is in samples/pktgen/
Sorry. The tar file is in the attachment. Please check it.

Zhu Yanjun
>
> 3. cd pktgen
>
> 4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number
>
> If size is set to a big number, the similar defect will occur.
> Adjust this size to a appropriate number, my defect will not occur.
>
> In the test, I found some types igb nic, such as i210, will work well 
> no matter the size is a big number.
> some nic, such as 82580, it will not work well if the size is too big.
>
> As such, I think my problem results from the hardware and the big size 
> triggers this problem.
>
> I hope this can help us all.
>
> Zhu Yanjun
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pktgen.tgz
Type: application/x-compressed-tar
Size: 6257 bytes
Desc: not available
URL: <http://lists.osuosl.org/pipermail/intel-wired-lan/attachments/20160317/607a4184/attachment-0001.bin>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-17  2:20               ` zhuyj
  2016-03-17  2:29                 ` zhuyj
@ 2016-03-17 18:56                 ` Sowmini Varadhan
  2016-03-17 19:28                   ` Jesse Brandeburg
  1 sibling, 1 reply; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-17 18:56 UTC (permalink / raw)
  To: intel-wired-lan

On (03/17/16 10:20), zhuyj wrote:
> 1. modprobe NET_PKTGEN
> 
> 2. download the tar file and uncompress to any directory.
> This tar file is from kernel. It is in samples/pktgen/
> 
> 3. cd pktgen
> 
> 4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number

Indeed, I see the same thing as you, and it was very easy to
reproduce. Interestingly, the problem can happen with
as few as 3 threads, at which point I see the TX hang at exactly
-s 12305.

I see:
i40e 0000:82:00.0: TX driver issue detected, PF reset issued
i40e 0000:82:00.0 eth2: VSI_seid 390, Hung TX queue 0, tx_pending: 492, NTC:0x140, HWB: 0x140, NTU: 0x12c, TAIL: 0x12c
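
As a cross-check on these numbers, tx_pending appears to be the distance from
the head write-back (HWB) to TAIL, modulo the ring size. A small sketch of
that arithmetic, assuming a ring of 512 descriptors (an assumption, though it
is the value that makes both reports in this thread add up); ring_pending is
a hypothetical helper name.

```shell
# Descriptors outstanding between head write-back and tail, allowing
# for the tail having wrapped past the end of the ring.
# $1 = HWB, $2 = TAIL, $3 = ring size (assumed 512 if omitted).
ring_pending() {
    hwb=$1
    tail=$2
    ring_size=${3:-512}
    echo $(( ($tail - $hwb + $ring_size) % $ring_size ))
}

ring_pending 0xeb 0x13d     # earlier report, queue 32: prints 82
ring_pending 0x140 0x12c    # this report, queue 0 (tail wrapped): prints 492
```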

I think the common factor in both our test cases is that we have some
kernel thread that can efficiently send packets without any context
switches. 

Has anyone here seen this before? I'll see if I can find some cycles
to figure this out; if not, maybe it's worth bringing up on netdev
to see if others have seen this, and to look for patterns.

> 
> If size is set to a big number, the similar defect will occur.
> Adjust this size to a appropriate number, my defect will not occur.
> 
> In the test, I found some types igb nic, such as i210, will work
> well no matter the size is a big number.
> some nic, such as 82580, it will not work well if the size is too big.
> 
> As such, I think my problem results from the hardware and the big
> size triggers this problem.
> 
> I hope this can help us all.
> 
> Zhu Yanjun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-17 18:56                 ` Sowmini Varadhan
@ 2016-03-17 19:28                   ` Jesse Brandeburg
  2016-03-17 19:41                     ` Sowmini Varadhan
  2016-03-18 11:08                     ` zhuyj
  0 siblings, 2 replies; 14+ messages in thread
From: Jesse Brandeburg @ 2016-03-17 19:28 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, 17 Mar 2016 14:56:14 -0400
Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote:

> On (03/17/16 10:20), zhuyj wrote:
> > 1. modprobe NET_PKTGEN
> > 
> > 2. download the tar file and uncompress to any directory.
> > This tar file is from kernel. It is in samples/pktgen/
> > 
> > 3. cd pktgen
> > 
> > 4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number
> 
> Indeed, I see the same thing as you, and it was very easy to 
> reproduce. It was very interesting that the problem can happen with
> as few as 3 threads, at which point I see the TX hang at exactly
> -s 12305 

Okay, sorry I hadn't jumped into this thread yet.

I can unequivocally tell you that the MDD Sowmini saw with
stack-based rds-stress testing is *NOT* the same as what you're seeing
while using pktgen with invalid huge skb->data buffers.

We can ask on netdev if the driver should defend against this kind of
input to hard_start_xmit (transmit routine), but the driver doesn't
check the maximum length of the skb to see if it is invalid, because
the stack can never build (only pktgen can) these invalid SKBs.

The issue is that pktgen builds skb->data as a contiguous buffer of
whatever size was requested (regardless of MTU) and then sends it
straight to the transmit routine, with no segmentation flags and no MSS set.

This causes the driver to build a transmit descriptor with an invalid
length, which the hardware then "ASSERTS" on by issuing an MDD
interrupt and freezing the misbehaving queue.

> I see:
> i40e 0000:82:00.0: TX driver issue detected, PF reset issued
> i40e 0000:82:00.0 eth2: VSI_seid 390, Hung TX queue 0, tx_pending: 492, NTC:0x140, HWB: 0x140, NTU: 0x12c, TAIL: 0x12c
> 
> I think the common factor in both our test cases is that we have some
> kernel thread that can efficiently send packets without any context
> switches. 

You've found a red herring (mistakenly connected two separate events)
so I think you can stop going down this path (pktgen).

> Has anyone here seen this before? I'll see if I can find some cycles
> to figure this out, if not, maybe its worth bringing up on netdev,
> to see if others have seen this, and to draw some patterns.

We don't need to bring it up on netdev.  We have a way to troubleshoot
MDDs that I can send to you, if you want to do the work.  Otherwise we
need to find some time to reproduce it here.

> > If size is set to a big number, the similar defect will occur.
> > Adjust this size to a appropriate number, my defect will not occur.
> > 
> > In the test, I found some types igb nic, such as i210, will work
> > well no matter the size is a big number.
> > some nic, such as 82580, it will not work well if the size is too big.

This is mostly a combination of driver implementation and how the
hardware handles a descriptor that is too large.  The driver *could*
check to make sure the skb->data is never too large, but in that same
vein, we *could* fix pktgen to never send a frame greater than MTU down
to the driver.
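
In the script that drives pktgen, such a guard could be as simple as clamping
the requested size before the run starts. A minimal sketch, not a proposed
patch: clamp_pkt_size is a hypothetical helper, and the 14-byte allowance
assumes the pktgen size counts the Ethernet header but not the CRC.

```shell
# Print the requested size, reduced if needed so that it fits in a
# single frame for the given MTU.
# $1 = requested packet size, $2 = interface MTU.
clamp_pkt_size() {
    size=$1
    max=$(( $2 + 14 ))    # MTU plus Ethernet header (assumption)
    if [ "$size" -gt "$max" ]; then
        size=$max
    fi
    echo "$size"
}
```

For example, clamp_pkt_size 12305 "$(cat /sys/class/net/eth2/mtu)" yields
1514 on an interface with the default 1500-byte MTU.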

> > 
> > As such, I think my problem results from the hardware and the big
> > size triggers this problem.
> > 
> > I hope this can help us all.

Unfortunately Zhu's problem with pktgen is not a reproducer of
Sowmini's problem.

In the case of pktgen, it is a "don't do that, because it hurts" kind of
bug. In the case of rds-stress, we need to reproduce it here and figure
out what hardware constraint the driver is violating during set up of
the transmit.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-17 19:28                   ` Jesse Brandeburg
@ 2016-03-17 19:41                     ` Sowmini Varadhan
  2016-03-18 11:08                     ` zhuyj
  1 sibling, 0 replies; 14+ messages in thread
From: Sowmini Varadhan @ 2016-03-17 19:41 UTC (permalink / raw)
  To: intel-wired-lan

On (03/17/16 12:28), Jesse Brandeburg wrote:
> We can ask on netdev if the driver should defend against this kind of
> input to hard_start_xmit (transmit routine), but the driver doesn't
> check the maximum length of the skb to see if it is invalid, because
> the stack can never build (only pktgen can) these invalid SKBs.
> 
> The issue is that pktgen builds skb->data with a contiguous buffer of
> whatever size transmit requested, (regardless of MTU) and then sends it
> straight to the transmit routine, no segmentation flags, no MSS set.

I see. After you mentioned it, I checked with ixgbe and, sure
enough, that also results in a TX hang for the pktgen test case
(whereas there were no issues with the (rds-stress, ixgbe) test).

I would surmise that pktgen is a bit of an outlier, so it is more
interesting to focus on the cases that use the regular stack.

I don't know whether DPDK can create the same issues as pktgen.

> we don't need to bring it up on netdev.  We have a way to troubleshoot
> MDDs that I can send to you, if you want to do the work.  Otherwise we
> need to have some time to reproduce here.

Yes, I can do the work, since I already have this nicely set up.
I just need some hints on how to troubleshoot the MDD.

--Sowmini


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Intel-wired-lan] [E1000-devel] i40e card Tx resets
  2016-03-17 19:28                   ` Jesse Brandeburg
  2016-03-17 19:41                     ` Sowmini Varadhan
@ 2016-03-18 11:08                     ` zhuyj
  1 sibling, 0 replies; 14+ messages in thread
From: zhuyj @ 2016-03-18 11:08 UTC (permalink / raw)
  To: intel-wired-lan

On 03/18/2016 03:28 AM, Jesse Brandeburg wrote:
> On Thu, 17 Mar 2016 14:56:14 -0400
> Sowmini Varadhan <sowmini.varadhan@oracle.com> wrote:
>
>> On (03/17/16 10:20), zhuyj wrote:
>>> 1. modprobe NET_PKTGEN
>>>
>>> 2. download the tar file and uncompress to any directory.
>>> This tar file is from kernel. It is in samples/pktgen/
>>>
>>> 3. cd pktgen
>>>
>>> 4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number
>> Indeed, I see the same thing as you, and it was very easy to
>> reproduce. It was very interesting that the problem can happen with
>> as few as 3 threads, at which point I see the TX hang at exactly
>> -s 12305
> Okay, sorry I hadn't jumped into this thread yet.
>
> I can uniquivically tell you that what Sowmini saw with the MDD with
> stack based RDS-STRESS testing is *NOT* the same as what you're seeing
> while using pktgen with invalid huge skb->data buffers.
>
> We can ask on netdev if the driver should defend against this kind of
> input to hard_start_xmit (transmit routine), but the driver doesn't
> check the maximum length of the skb to see if it is invalid, because
> the stack can never build (only pktgen can) these invalid SKBs.
>
> The issue is that pktgen builds skb->data with a contiguous buffer of
> whatever size transmit requested, (regardless of MTU) and then sends it
> straight to the transmit routine, no segmentation flags, no MSS set.
>
> This causes the driver to build a transmit descriptor with an invalid
> length, which the hardware then "ASSERTS" on by issuing an MDD
> interrupt and freezing the bad acting queue.
>
>> I see:
>> i40e 0000:82:00.0: TX driver issue detected, PF reset issued
>> i40e 0000:82:00.0 eth2: VSI_seid 390, Hung TX queue 0, tx_pending: 492, NTC:0x140, HWB: 0x140, NTU: 0x12c, TAIL: 0x12c
>>
>> I think the common factor in both our test cases is that we have some
>> kernel thread that can efficiently send packets without any context
>> switches.
> You've found a red herring (mistakenly connected two separate events)
> so I think you can stop going down this path (pktgen).
>
>> Has anyone here seen this before? I'll see if I can find some cycles
>> to figure this out, if not, maybe its worth bringing up on netdev,
>> to see if others have seen this, and to draw some patterns.
> we don't need to bring it up on netdev.  We have a way to troubleshoot
> MDDs that I can send to you, if you want to do the work.  Otherwise we
> need to have some time to reproduce here.
>
>>> If size is set to a big number, the similar defect will occur.
>>> Adjust this size to a appropriate number, my defect will not occur.
>>>
>>> In the test, I found some types igb nic, such as i210, will work
>>> well no matter the size is a big number.
>>> some nic, such as 82580, it will not work well if the size is too big.
> This is mostly a combination of driver implementation and how the
> hardware handles a descriptor that is too large.  The driver *could*
> check to make sure the skb->data is never too large, but in that same
> vein, we *could* fix pktgen to never send a frame greater than MTU down
> to the driver.
Do you mean this is not a bug in the NIC,
and that it is unnecessary to fix it?

But if a test tool sends frames the way pktgen does, how should that be handled?

Do we just suggest not running such tests?

Best Regards!
Zhu Yanjun
>
>>> As such, I think my problem results from the hardware and the big
>>> size triggers this problem.
>>>
>>> I hope this can help us all.
> Unfortunately Zhu's problem with pktgen is not a reproducer of
> Sowmini's problem.
>
> In the case of pktgen, it is a "don't do that, because it hurts" kind of
> bug. In the case of rds-stress, we need to reproduce it here and figure
> out what hardware constraint the driver is violating during set up of
> the transmit.
>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread
