* Large performance regression with 6in4 tunnel (sit) @ 2016-11-25 1:09 Stephen Rothwell 2016-11-25 2:18 ` Eli Cooper 2016-11-25 2:30 ` Eric Dumazet 0 siblings, 2 replies; 18+ messages in thread From: Stephen Rothwell @ 2016-11-25 1:09 UTC (permalink / raw) To: netdev Hi all, This is a typical user error report i.e. a not well specified one :-) I am using a 6in4 tunnel from my Linux server at home (since my ISP does not provide native IPv6) to another hosted Linux server (that has native IPv6 connectivity). The throughput for IPv6 connections has dropped from megabits per second to 10s of kilobits per second. First, I am using Debian supplied kernels, so strike one, right? Second, I don't actually remember when the problem started - it probably started when I upgraded from a v4.4 based kernel to a v4.7 based one. This server does not get rebooted very often as it runs hosted services for quite a few people (it is ozlabs.org ...). I tried creating the same tunnel to another hosted server I have access to that is running a v3.16 based kernel and the performance is fine (actually upward of 40MB/s). I noticed from a tcpdump on the hosted server that (when I fetch a large file over HTTP) the server is sending packets larger than the MTU of the tunnel. These packets don't get acked and are later resent as MTU sized packets. It will then send more large packets and repeat ... The MTU of the tunnel is set to 1280 (though leaving it unset and using the default gave the same results). The tunnel is using sit and is statically set up at both ends (though the hosted server end does not specify a remote IPv4 end point). Is there anything else I can tell you? Testing patches is a bit of a pain, unfortunately, but I was hoping that someone may remember something that may have caused this. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
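[Editor's note] For background, a static 6in4 (sit) tunnel of the kind described above is typically set up roughly as follows. This is only a sketch, not the reporter's actual configuration: the addresses are documentation placeholders, and the arithmetic shows why a standard 1500-byte link gives a default tunnel MTU of 1480 (6in4 prepends a 20-byte IPv4 header), with 1280 being the IPv6 minimum MTU used in the report.

```shell
# Placeholder endpoint addresses (RFC 5737 documentation ranges),
# not the real configuration from this thread.
LOCAL4=192.0.2.1        # local public IPv4 endpoint
REMOTE4=198.51.100.1    # remote tunnel endpoint

# 6in4 encapsulation prepends a 20-byte IPv4 header, so on a
# standard 1500-byte Ethernet link the default tunnel MTU is 1480.
LINK_MTU=1500
TUNNEL_MTU=$((LINK_MTU - 20))
echo "default tunnel mtu: $TUNNEL_MTU"   # prints: default tunnel mtu: 1480

# The actual setup needs root, so it is shown commented out:
# ip tunnel add tun6in4 mode sit local "$LOCAL4" remote "$REMOTE4" ttl 64
# ip link set tun6in4 up mtu 1280      # 1280 = IPv6 minimum MTU
# ip -6 addr add 2001:db8::2/64 dev tun6in4
# ip -6 route add default dev tun6in4
```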
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 1:09 Large performance regression with 6in4 tunnel (sit) Stephen Rothwell @ 2016-11-25 2:18 ` Eli Cooper 2016-11-25 2:45 ` Stephen Rothwell 2016-11-25 2:30 ` Eric Dumazet 1 sibling, 1 reply; 18+ messages in thread From: Eli Cooper @ 2016-11-25 2:18 UTC (permalink / raw) To: Stephen Rothwell, netdev Hi Stephen, On 2016/11/25 9:09, Stephen Rothwell wrote: > Hi all, > > This is a typical user error report i.e. a net well specified one :-) > > I am using a 6in4 tunnel from my Linux server at home (since my ISP > does not provide native IPv6) to another hosted Linus server (that has > native IPv6 connectivity). The throughput for IPv6 connections has > dropped from megabits per second to 10s of kilobits per second. > > First, I am using Debian supplied kernels, so strike one, right? > > Second, I don't actually remember when the problem started - it probably > started when I upgraded from a v4.4 based kernel to a v4.7 based one. > This server does not get rebooted very often as it runs hosted services > for quite a few people (its is ozlabs.org ...). > > I tried creating the same tunnel to another hosted server I have access > to that is running a v3.16 based kernel and the performance is fine > (actually upward of 40MB/s). > > I noticed from a tcpdump on the hosted server that (when I fetch a > large file over HTTP) the server is sending packets larger than the MTU > of the tunnel. These packets don't get acked and are later resent as > MTU sized packets. I will then send more larger packets and repeat ... Sounds like TSO/GSO packets are not properly segmented and therefore dropped. Could you first try turning off segmentation offloading for the tunnel interface? ethtool -K sit0 tso off gso off > The mtu of the tunnel is set to 1280 (though leaving it unset and using > the default gave the same results). 
The tunnel is using sit and is > statically set up at both ends (though the hosted server end does not > specify a remote ipv4 end point). > > Is there anything else I can tell you? Testing patches is a bit of a > pain, unfortunately, but I was hoping that someone may remember > something that may have caused this. Regards, Eli ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 2:18 ` Eli Cooper @ 2016-11-25 2:45 ` Stephen Rothwell 2016-11-25 3:01 ` Eric Dumazet ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Stephen Rothwell @ 2016-11-25 2:45 UTC (permalink / raw) To: Eli Cooper; +Cc: netdev Hi Eli, On Fri, 25 Nov 2016 10:18:12 +0800 Eli Cooper <elicooper@gmx.com> wrote: > > Sounds like TSO/GSO packets are not properly segmented and therefore > dropped. > > Could you first try turning off segmentation offloading for the tunnel > interface? > ethtool -K sit0 tso off gso off On Thu, 24 Nov 2016 18:30:14 -0800 Eric Dumazet <eric.dumazet@gmail.com> wrote: > > You also could try to disable TSO and see if this makes a difference > > ethtool -K sixtofour0 tso off So turning off tso brings performance up to IPv4 levels ... Thanks for that, it solves my immediate problem. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 2:45 ` Stephen Rothwell @ 2016-11-25 3:01 ` Eric Dumazet 2016-11-25 3:09 ` Stephen Rothwell 2016-11-25 4:06 ` Sven-Haegar Koch 2016-11-25 6:05 ` Eli Cooper 2 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2016-11-25 3:01 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Eli Cooper, netdev On Fri, 2016-11-25 at 13:45 +1100, Stephen Rothwell wrote: > So turning off tso brings performance up to IPv4 levels ... ok. > > Thanks for that, it solves my immediate problem. Since I do not have this problem at all on my hosts, it could be a buggy ethernet driver. Could you share what NIC card and driver you are using ? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 3:01 ` Eric Dumazet @ 2016-11-25 3:09 ` Stephen Rothwell 2016-11-25 3:54 ` Eric Dumazet 0 siblings, 1 reply; 18+ messages in thread From: Stephen Rothwell @ 2016-11-25 3:09 UTC (permalink / raw) To: Eric Dumazet; +Cc: Eli Cooper, netdev Hi Eric, On Thu, 24 Nov 2016 19:01:28 -0800 Eric Dumazet <eric.dumazet@gmail.com> wrote: > > Since I do not have this problem at all on my hosts, it could be a buggy > ethernet driver. > > Could you share what NIC card and driver you are using ? # uname -a Linux bilbo 4.7.0-1-amd64 #1 SMP Debian 4.7.8-1 (2016-10-19) x86_64 GNU/Linux # lspci | grep -i net 03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03) 04:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03) from boot dmesg: [ 7.573725] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.3.0-k [ 7.573726] igb: Copyright (c) 2007-2014 Intel Corporation. [ 7.752918] igb 0000:03:00.0: added PHC on eth0 [ 7.752925] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection [ 7.752927] igb 0000:03:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 00:1e:67:9f:d4:24 [ 7.753460] igb 0000:03:00.0: eth0: PBA No: 102100-000 [ 7.753460] igb 0000:03:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) [ 7.902433] igb 0000:04:00.0: added PHC on eth1 [ 7.902434] igb 0000:04:00.0: Intel(R) Gigabit Ethernet Network Connection [ 7.902435] igb 0000:04:00.0: eth1: (PCIe:2.5Gb/s:Width x1) 00:1e:67:9f:d4:25 [ 7.902484] igb 0000:04:00.0: eth1: PBA No: 102100-000 [ 7.902485] igb 0000:04:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) [ 19.753325] igb 0000:03:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 3:09 ` Stephen Rothwell @ 2016-11-25 3:54 ` Eric Dumazet 2016-11-25 6:12 ` Stephen Rothwell 0 siblings, 1 reply; 18+ messages in thread From: Eric Dumazet @ 2016-11-25 3:54 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Eli Cooper, netdev On Fri, 2016-11-25 at 14:09 +1100, Stephen Rothwell wrote: > Hi Eric, > > On Thu, 24 Nov 2016 19:01:28 -0800 Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > Since I do not have this problem at all on my hosts, it could be a buggy > > ethernet driver. > > > > Could you share what NIC card and driver you are using ? > > > # uname -a > Linux bilbo 4.7.0-1-amd64 #1 SMP Debian 4.7.8-1 (2016-10-19) x86_64 GNU/Linux > > # lspci | grep -i net > 03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03) > 04:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03) > > from boot dmesg: > > [ 7.573725] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.3.0-k > [ 7.573726] igb: Copyright (c) 2007-2014 Intel Corporation. > [ 7.752918] igb 0000:03:00.0: added PHC on eth0 > [ 7.752925] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection > [ 7.752927] igb 0000:03:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 00:1e:67:9f:d4:24 > [ 7.753460] igb 0000:03:00.0: eth0: PBA No: 102100-000 > [ 7.753460] igb 0000:03:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) > [ 7.902433] igb 0000:04:00.0: added PHC on eth1 > [ 7.902434] igb 0000:04:00.0: Intel(R) Gigabit Ethernet Network Connection > [ 7.902435] igb 0000:04:00.0: eth1: (PCIe:2.5Gb/s:Width x1) 00:1e:67:9f:d4:25 > [ 7.902484] igb 0000:04:00.0: eth1: PBA No: 102100-000 > [ 7.902485] igb 0000:04:00.0: Using MSI-X interrupts. 
4 rx queue(s), 4 tx queue(s) > [ 19.753325] igb 0000:03:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > Could you now report : ethtool -k eth0 Thanks ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 3:54 ` Eric Dumazet @ 2016-11-25 6:12 ` Stephen Rothwell 0 siblings, 0 replies; 18+ messages in thread From: Stephen Rothwell @ 2016-11-25 6:12 UTC (permalink / raw) To: Eric Dumazet; +Cc: Eli Cooper, netdev Hi Eric, On Thu, 24 Nov 2016 19:54:04 -0800 Eric Dumazet <eric.dumazet@gmail.com> wrote: > > Could you now report : > > ethtool -k eth0 Features for eth0: rx-checksumming: on tx-checksumming: on tx-checksum-ipv4: off [fixed] tx-checksum-ip-generic: on tx-checksum-ipv6: off [fixed] tx-checksum-fcoe-crc: off [fixed] tx-checksum-sctp: on scatter-gather: on tx-scatter-gather: on tx-scatter-gather-fraglist: off [fixed] tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: off [fixed] tx-tcp-mangleid-segmentation: off tx-tcp6-segmentation: on udp-fragmentation-offload: off [fixed] generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: off [fixed] rx-vlan-offload: on tx-vlan-offload: on ntuple-filters: off receive-hashing: on highdma: on [fixed] rx-vlan-filter: on [fixed] vlan-challenged: off [fixed] tx-lockless: off [fixed] netns-local: off [fixed] tx-gso-robust: off [fixed] tx-fcoe-segmentation: off [fixed] tx-gre-segmentation: on tx-gre-csum-segmentation: on tx-ipxip4-segmentation: on tx-ipxip6-segmentation: on tx-udp_tnl-segmentation: on tx-udp_tnl-csum-segmentation: on tx-gso-partial: on fcoe-mtu: off [fixed] tx-nocache-copy: off loopback: off [fixed] rx-fcs: off [fixed] rx-all: off tx-vlan-stag-hw-insert: off [fixed] rx-vlan-stag-hw-parse: off [fixed] rx-vlan-stag-filter: off [fixed] l2-fwd-offload: off [fixed] busy-poll: off [fixed] hw-tc-offload: off [fixed] -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 2:45 ` Stephen Rothwell 2016-11-25 3:01 ` Eric Dumazet @ 2016-11-25 4:06 ` Sven-Haegar Koch 2016-11-27 3:23 ` Stephen Rothwell 2016-11-25 6:05 ` Eli Cooper 2 siblings, 1 reply; 18+ messages in thread From: Sven-Haegar Koch @ 2016-11-25 4:06 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Eli Cooper, netdev On Fri, 25 Nov 2016, Stephen Rothwell wrote: > On Fri, 25 Nov 2016 10:18:12 +0800 Eli Cooper <elicooper@gmx.com> wrote: > > > > Sounds like TSO/GSO packets are not properly segmented and therefore > > dropped. > > > > Could you first try turning off segmentation offloading for the tunnel > > interface? > > ethtool -K sit0 tso off gso off > > On Thu, 24 Nov 2016 18:30:14 -0800 Eric Dumazet <eric.dumazet@gmail.com> > > > > You also could try to disable TSO and see if this makes a difference > > > > ethtool -K sixtofour0 tso off > > So turning off tso brings performance up to IPv4 levels ... > > Thanks for that, it solves my immediate problem. Somehow this problem description really reminds me of a report on netdev a bit ago, which the following patch fixed: commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 Author: Lance Richardson <lrichard@redhat.com> Date: Wed Nov 2 16:36:17 2016 -0400 ipv4: allow local fragmentation in ip_finish_output_gso() Some configurations (e.g. geneve interface with default MTU of 1500 over an ethernet interface with 1500 MTU) result in the transmission of packets that exceed the configured MTU. While this should be considered to be a "bad" configuration, it is still allowed and should not result in the sending of packets that exceed the configured MTU. Could this be related? I suppose it would be difficult to test this patch on this machine? c'ya sven-haegar -- Three may keep a secret, if two of them are dead. - Ben F. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 4:06 ` Sven-Haegar Koch @ 2016-11-27 3:23 ` Stephen Rothwell 2016-11-28 17:54 ` Lance Richardson 2016-11-28 21:32 ` Alexander Duyck 0 siblings, 2 replies; 18+ messages in thread From: Stephen Rothwell @ 2016-11-27 3:23 UTC (permalink / raw) To: Sven-Haegar Koch; +Cc: Eli Cooper, netdev, Eric Dumazet Hi Sven-Haegar, On Fri, 25 Nov 2016 05:06:53 +0100 (CET) Sven-Haegar Koch <haegar@sdinet.de> wrote: > > Somehow this problem description really reminds me of a report on > netdev a bit ago, which the following patch fixed: > > commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 > Author: Lance Richardson <lrichard@redhat.com> > Date: Wed Nov 2 16:36:17 2016 -0400 > > ipv4: allow local fragmentation in ip_finish_output_gso() > > Some configurations (e.g. geneve interface with default > MTU of 1500 over an ethernet interface with 1500 MTU) result > in the transmission of packets that exceed the configured MTU. > While this should be considered to be a "bad" configuration, > it is still allowed and should not result in the sending > of packets that exceed the configured MTU. > > Could this be related? > > I suppose it would be difficult to test this patch on this machine? The kernel I am running on is based on 4.7.8, so the above patch doesn't come close to applying. Most of what it is reverting was introduced in commit 359ebda25aa0 ("net/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags") in v4.8-rc1. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-27 3:23 ` Stephen Rothwell @ 2016-11-28 17:54 ` Lance Richardson 2016-11-28 19:49 ` Lance Richardson 2016-11-28 21:32 ` Alexander Duyck 1 sibling, 1 reply; 18+ messages in thread From: Lance Richardson @ 2016-11-28 17:54 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Sven-Haegar Koch, Eli Cooper, netdev, Eric Dumazet > From: "Stephen Rothwell" <sfr@canb.auug.org.au> > To: "Sven-Haegar Koch" <haegar@sdinet.de> > Cc: "Eli Cooper" <elicooper@gmx.com>, netdev@vger.kernel.org, "Eric Dumazet" <eric.dumazet@gmail.com> > Sent: Saturday, November 26, 2016 10:23:40 PM > Subject: Re: Large performance regression with 6in4 tunnel (sit) > > Hi Sven-Haegar, > > On Fri, 25 Nov 2016 05:06:53 +0100 (CET) Sven-Haegar Koch <haegar@sdinet.de> > wrote: > > > > Somehow this problem description really reminds me of a report on > > netdev a bit ago, which the following patch fixed: > > > > commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 > > Author: Lance Richardson <lrichard@redhat.com> > > Date: Wed Nov 2 16:36:17 2016 -0400 > > > > ipv4: allow local fragmentation in ip_finish_output_gso() > > > > Some configurations (e.g. geneve interface with default > > MTU of 1500 over an ethernet interface with 1500 MTU) result > > in the transmission of packets that exceed the configured MTU. > > While this should be considered to be a "bad" configuration, > > it is still allowed and should not result in the sending > > of packets that exceed the configured MTU. > > > > Could this be related? > > > > I suppose it would be difficult to test this patch on this machine? > > The kernel I am running on is based on 4.7.8, so the above patch > doesn't come close to applying. Most fo what it is reverting was > introduced in commit 359ebda25aa0 ("net/ipv4: Introduce IPSKB_FRAG_SEGS > bit to inet_skb_parm.flags") in v4.8-rc1. 
> > -- > Cheers, > Stephen Rothwell > This should be equivalent for 4.7.x: diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 4bd4921..8a253e2 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -224,8 +224,7 @@ static int ip_finish_output_gso(struct net *net, struct sock *sk, int ret = 0; /* common case: locally created skb or seglen is <= mtu */ - if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) || - skb_gso_network_seglen(skb) <= mtu) + if (skb_gso_network_seglen(skb) <= mtu) return ip_finish_output2(net, sk, skb); /* Slowpath - GSO segment length is exceeding the dst MTU. ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-28 17:54 ` Lance Richardson @ 2016-11-28 19:49 ` Lance Richardson 0 siblings, 0 replies; 18+ messages in thread From: Lance Richardson @ 2016-11-28 19:49 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Sven-Haegar Koch, Eli Cooper, netdev, Eric Dumazet > From: "Lance Richardson" <lrichard@redhat.com> > To: "Stephen Rothwell" <sfr@canb.auug.org.au> > Cc: "Sven-Haegar Koch" <haegar@sdinet.de>, "Eli Cooper" <elicooper@gmx.com>, netdev@vger.kernel.org, "Eric Dumazet" > <eric.dumazet@gmail.com> > Sent: Monday, November 28, 2016 12:54:07 PM > Subject: Re: Large performance regression with 6in4 tunnel (sit) > > > From: "Stephen Rothwell" <sfr@canb.auug.org.au> > > To: "Sven-Haegar Koch" <haegar@sdinet.de> > > Cc: "Eli Cooper" <elicooper@gmx.com>, netdev@vger.kernel.org, "Eric > > Dumazet" <eric.dumazet@gmail.com> > > Sent: Saturday, November 26, 2016 10:23:40 PM > > Subject: Re: Large performance regression with 6in4 tunnel (sit) > > > > Hi Sven-Haegar, > > > > On Fri, 25 Nov 2016 05:06:53 +0100 (CET) Sven-Haegar Koch > > <haegar@sdinet.de> > > wrote: > > > > > > Somehow this problem description really reminds me of a report on > > > netdev a bit ago, which the following patch fixed: > > > > > > commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 > > > Author: Lance Richardson <lrichard@redhat.com> > > > Date: Wed Nov 2 16:36:17 2016 -0400 > > > > > > ipv4: allow local fragmentation in ip_finish_output_gso() > > > > > > Some configurations (e.g. geneve interface with default > > > MTU of 1500 over an ethernet interface with 1500 MTU) result > > > in the transmission of packets that exceed the configured MTU. > > > While this should be considered to be a "bad" configuration, > > > it is still allowed and should not result in the sending > > > of packets that exceed the configured MTU. > > > > > > Could this be related? > > > > > > I suppose it would be difficult to test this patch on this machine? 
> > > > The kernel I am running on is based on 4.7.8, so the above patch > > doesn't come close to applying. Most fo what it is reverting was > > introduced in commit 359ebda25aa0 ("net/ipv4: Introduce IPSKB_FRAG_SEGS > > bit to inet_skb_parm.flags") in v4.8-rc1. > > > > -- > > Cheers, > > Stephen Rothwell > > > > This should be equivalent for 4.7.x: > > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 4bd4921..8a253e2 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -224,8 +224,7 @@ static int ip_finish_output_gso(struct net *net, struct > sock *sk, > int ret = 0; > > /* common case: locally created skb or seglen is <= mtu */ > - if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) || > - skb_gso_network_seglen(skb) <= mtu) > + if (skb_gso_network_seglen(skb) <= mtu) > return ip_finish_output2(net, sk, skb); > > /* Slowpath - GSO segment length is exceeding the dst MTU. > BTW, I do think this would be worth trying. For the geneve case, I measured on the order of a 10X-100X performance hit without this patch, traces were similar to what you describe (too-large gso packets were dropped, corresponding TCP segments were retransmitted later via a non-gso code path). Regards, Lance ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-27 3:23 ` Stephen Rothwell 2016-11-28 17:54 ` Lance Richardson @ 2016-11-28 21:32 ` Alexander Duyck 2016-11-28 22:38 ` Stephen Rothwell 1 sibling, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2016-11-28 21:32 UTC (permalink / raw) To: Stephen Rothwell; +Cc: Sven-Haegar Koch, Eli Cooper, Netdev, Eric Dumazet On Sat, Nov 26, 2016 at 7:23 PM, Stephen Rothwell <sfr@canb.auug.org.au> wrote: > Hi Sven-Haegar, > > On Fri, 25 Nov 2016 05:06:53 +0100 (CET) Sven-Haegar Koch <haegar@sdinet.de> wrote: >> >> Somehow this problem description really reminds me of a report on >> netdev a bit ago, which the following patch fixed: >> >> commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 >> Author: Lance Richardson <lrichard@redhat.com> >> Date: Wed Nov 2 16:36:17 2016 -0400 >> >> ipv4: allow local fragmentation in ip_finish_output_gso() >> >> Some configurations (e.g. geneve interface with default >> MTU of 1500 over an ethernet interface with 1500 MTU) result >> in the transmission of packets that exceed the configured MTU. >> While this should be considered to be a "bad" configuration, >> it is still allowed and should not result in the sending >> of packets that exceed the configured MTU. >> >> Could this be related? >> >> I suppose it would be difficult to test this patch on this machine? > > The kernel I am running on is based on 4.7.8, so the above patch > doesn't come close to applying. Most fo what it is reverting was > introduced in commit 359ebda25aa0 ("net/ipv4: Introduce IPSKB_FRAG_SEGS > bit to inet_skb_parm.flags") in v4.8-rc1. So I think I have this root caused. The problem seems to be the fact that I chose to use lco_csum when trying to cancel out the inner IP header from the checksum and it turns out that the transport offset is never updated in the case of these tunnels. 
For now a workaround is to just set tx-gso-partial to off on the interface the tunnel is running over and you should be able to pass traffic without any issues. I have a patch for igb/igbvf that should be out in the next hour or so which should address it. Thanks. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
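[Editor's note] Spelled out as commands, the workaround Alexander describes amounts to the following. The interface name is an assumption based on the ethtool output earlier in this thread (eth0); the real commands need root, so this sketch only prints the command to run:

```shell
# The underlying NIC carrying the tunnel, per earlier in the thread;
# adjust for your own system.
IFACE=eth0

# Disable GSO partial on the NIC (not on the sit device itself).
CMD="ethtool -K $IFACE tx-gso-partial off"
echo "$CMD"

# As root, run the printed command, then confirm the feature flipped:
#   ethtool -k "$IFACE" | grep tx-gso-partial   # expect "tx-gso-partial: off"
```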
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-28 21:32 ` Alexander Duyck @ 2016-11-28 22:38 ` Stephen Rothwell 0 siblings, 0 replies; 18+ messages in thread From: Stephen Rothwell @ 2016-11-28 22:38 UTC (permalink / raw) To: Alexander Duyck; +Cc: Sven-Haegar Koch, Eli Cooper, Netdev, Eric Dumazet Hi Alex, On Mon, 28 Nov 2016 13:32:21 -0800 Alexander Duyck <alexander.duyck@gmail.com> wrote: > > So I think I have this root caused. The problem seems to be the fact > that I chose to use lco_csum when trying to cancel out the inner IP > header from the checksum and it turns out that the transport offset is > never updated in the case of these tunnels. > > For now a workaround is to just set tx-gso-partial to off on the > interface the tunnel is running over and you should be able to pass > traffic without any issues. OK, so that works (even with gso and tso set to "on" on the sit interface). Thanks. > I have a patch for igb/igbvf that should be out in the next hour or so > which should address it. That will be a bit harder to test, but I will see what I can do. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 2:45 ` Stephen Rothwell 2016-11-25 3:01 ` Eric Dumazet 2016-11-25 4:06 ` Sven-Haegar Koch @ 2016-11-25 6:05 ` Eli Cooper 2016-11-27 0:54 ` Stephen Rothwell 2 siblings, 1 reply; 18+ messages in thread From: Eli Cooper @ 2016-11-25 6:05 UTC (permalink / raw) To: Stephen Rothwell; +Cc: netdev Hi Stephen, On 2016/11/25 10:45, Stephen Rothwell wrote: > Hi Eli, > > On Fri, 25 Nov 2016 10:18:12 +0800 Eli Cooper <elicooper@gmx.com> wrote: >> Sounds like TSO/GSO packets are not properly segmented and therefore >> dropped. >> >> Could you first try turning off segmentation offloading for the tunnel >> interface? >> ethtool -K sit0 tso off gso off > On Thu, 24 Nov 2016 18:30:14 -0800 Eric Dumazet <eric.dumazet@gmail.com> >> You also could try to disable TSO and see if this makes a difference >> >> ethtool -K sixtofour0 tso off > So turning off tso brings performance up to IPv4 levels ... > > Thanks for that, it solves my immediate problem. I think this is similar to the bug I fixed in commit ae148b085876 ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"). I can reproduce a similar problem by applying xfrm to sit traffic. TSO/GSO packets are dropped when IPSec is enabled, and IPv6 throughput drops to 10s of Kbps. I am not sure if this is the same issue you experienced, but I wrote a patch that fixed at least the issue I had. Could you test the patch I sent to the mailing list just now? Thanks, Eli ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 6:05 ` Eli Cooper @ 2016-11-27 0:54 ` Stephen Rothwell 2016-11-27 2:02 ` Stephen Rothwell 0 siblings, 1 reply; 18+ messages in thread From: Stephen Rothwell @ 2016-11-27 0:54 UTC (permalink / raw) To: Eli Cooper; +Cc: netdev Hi Eli, On Fri, 25 Nov 2016 14:05:04 +0800 Eli Cooper <elicooper@gmx.com> wrote: > > I think this is similar to the bug I fixed in commit ae148b085876 > ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"). > > I can reproduce a similar problem by applying xfrm to sit traffic. > TSO/GSO packets are dropped when IPSec is enabled, and IPv6 throughput > drops to 10s of Kbps. I am not sure if this is the same issue you > experienced, but I wrote a patch that fixed at least the issue I had. > > Could you test the patch I sent to the mailing list just now? Thanks for the patch! It's a bit tricky to test since the problem only occurs on a production machine (I tried reproducing in a VM, but the problem did not occur), but I will try to just rebuild the sit module and see if I can insert the modified one. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-27 0:54 ` Stephen Rothwell @ 2016-11-27 2:02 ` Stephen Rothwell 2016-11-27 16:22 ` Eli Cooper 0 siblings, 1 reply; 18+ messages in thread From: Stephen Rothwell @ 2016-11-27 2:02 UTC (permalink / raw) To: Eli Cooper; +Cc: netdev Hi Eli, On Sun, 27 Nov 2016 11:54:41 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote: > > On Fri, 25 Nov 2016 14:05:04 +0800 Eli Cooper <elicooper@gmx.com> wrote: > > > > I think this is similar to the bug I fixed in commit ae148b085876 > > ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"). > > > > I can reproduce a similar problem by applying xfrm to sit traffic. > > TSO/GSO packets are dropped when IPSec is enabled, and IPv6 throughput > > drops to 10s of Kbps. I am not sure if this is the same issue you > > experienced, but I wrote a patch that fixed at least the issue I had. > > > > Could you test the patch I sent to the mailing list just now? > > Thanks for the patch! > > Its a bit tricky to test since the problem only occurs in a production > machine (I tried reproducing in a VM, but the problem did not occur), > but I will try to just rebuild the sit module and see if I can insert > the modified one. OK, I tried your patch and unfortunately, it doesn't seem to have worked ... I still get the large packets dropped and resent smaller. -- Cheers, Stephen Rothwell ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-27 2:02 ` Stephen Rothwell @ 2016-11-27 16:22 ` Eli Cooper 0 siblings, 0 replies; 18+ messages in thread From: Eli Cooper @ 2016-11-27 16:22 UTC (permalink / raw) To: Stephen Rothwell; +Cc: netdev Hi Stephen, On 2016/11/27 10:02, Stephen Rothwell wrote: > Hi Eli, > > On Sun, 27 Nov 2016 11:54:41 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote: >> On Fri, 25 Nov 2016 14:05:04 +0800 Eli Cooper <elicooper@gmx.com> wrote: >>> I think this is similar to the bug I fixed in commit ae148b085876 >>> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"). >>> >>> I can reproduce a similar problem by applying xfrm to sit traffic. >>> TSO/GSO packets are dropped when IPSec is enabled, and IPv6 throughput >>> drops to 10s of Kbps. I am not sure if this is the same issue you >>> experienced, but I wrote a patch that fixed at least the issue I had. >>> >>> Could you test the patch I sent to the mailing list just now? >> Thanks for the patch! >> >> Its a bit tricky to test since the problem only occurs in a production >> machine (I tried reproducing in a VM, but the problem did not occur), That's probably because the ethernet NIC in your VM does not support segmentation offloading. You could, however, try reproducing it on another (real) machine with the same driver. >> but I will try to just rebuild the sit module and see if I can insert >> the modified one. > OK, I tried your patch and unfortunately, it doesn't seem to have > worked ... I still get the large packets dropped and resent smaller. > It's a shame ... In my case, large packets are dropped only when xfrm is in effect (therefore another output path is taken), and probably that's not your case. Well, on the plus side, at least you reminded me that sit device also needs to update skb's protocol. Thanks, Eli ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Large performance regression with 6in4 tunnel (sit) 2016-11-25 1:09 Large performance regression with 6in4 tunnel (sit) Stephen Rothwell 2016-11-25 2:18 ` Eli Cooper @ 2016-11-25 2:30 ` Eric Dumazet 1 sibling, 0 replies; 18+ messages in thread From: Eric Dumazet @ 2016-11-25 2:30 UTC (permalink / raw) To: Stephen Rothwell; +Cc: netdev On Fri, 2016-11-25 at 12:09 +1100, Stephen Rothwell wrote: > Hi all, > > This is a typical user error report i.e. a net well specified one :-) > > I am using a 6in4 tunnel from my Linux server at home (since my ISP > does not provide native IPv6) to another hosted Linus server (that has > native IPv6 connectivity). The throughput for IPv6 connections has > dropped from megabits per second to 10s of kilobits per second. > > First, I am using Debian supplied kernels, so strike one, right? > > Second, I don't actually remember when the problem started - it probably > started when I upgraded from a v4.4 based kernel to a v4.7 based one. > This server does not get rebooted very often as it runs hosted services > for quite a few people (its is ozlabs.org ...). > > I tried creating the same tunnel to another hosted server I have access > to that is running a v3.16 based kernel and the performance is fine > (actually upward of 40MB/s). > > I noticed from a tcpdump on the hosted server that (when I fetch a > large file over HTTP) the server is sending packets larger than the MTU > of the tunnel. These packets don't get acked and are later resent as > MTU sized packets. I will then send more larger packets and repeat ... tcpdump shows big packets because SIT supports TSO (since linux-3.13) lpaa23:~# ip -6 ro get 2002:af6:798::1 2002:af6:798::1 via fe80:: dev sixtofour0 src 2002:af6:797:: metric 1024 pref medium lpaa23:~# ./netperf -H 2002:af6:798::1 MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:798::1 () port 0 AF_INET6 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 
10^6bits/sec 87380 16384 16384 10.00 10374.64 lpaa23:~# ethtool -k sixtofour0|grep seg tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: on tx-tcp6-segmentation: on tx-tcp-psp-segmentation: off [fixed] generic-segmentation-offload: on tx-fcoe-segmentation: off [fixed] tx-gre-segmentation: off [fixed] tx-ipip-segmentation: off [fixed] tx-sit-segmentation: off [fixed] tx-udp_tnl-segmentation: off [fixed] tx-mpls-segmentation: off [fixed] tx-ggre-segmentation: off [fixed] > > The mtu of the tunnel is set to 1280 (though leaving it unset and using > the default gave the same results). The tunnel is using sit and is > statically set up at both ends (though the hosted server end does not > specify a remote ipv4 end point). > > Is there anything else I can tell you? Testing patches is a bit of a > pain, unfortunately, but I was hoping that someone may remember > something that may have caused this. You could use "perf record -a -g -e skb:kfree_skb" to see where packets are dropped on the sender. You also could try to disable TSO and see if this makes a difference ethtool -K sixtofour0 tso off ^ permalink raw reply [flat|nested] 18+ messages in thread