From: Joseph Gasparakis <joseph.gasparakis@intel.com>
To: Or Gerlitz <or.gerlitz@gmail.com>
Cc: Joseph Gasparakis <joseph.gasparakis@intel.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Jerry Chu <hkchu@google.com>, Or Gerlitz <ogerlitz@mellanox.com>,
	Eric Dumazet <edumazet@google.com>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Pravin B Shelar <pshelar@nicira.com>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>
Subject: Re: vxlan/veth performance issues on net.git + latest kernels
Date: Tue, 3 Dec 2013 16:35:39 -0800 (PST)
Message-ID: <alpine.LFD.2.03.1312031626400.7539@intel.com>
In-Reply-To: <CAJZOPZLT_3msfz_XY45GOBMWK_y8WEYBTq3rGRUijpYyXE1ddg@mail.gmail.com>



On Tue, 3 Dec 2013, Or Gerlitz wrote:

> On Wed, Dec 4, 2013 at 1:13 AM, Joseph Gasparakis
> <joseph.gasparakis@intel.com> wrote:
> >
> >
> > On Tue, 3 Dec 2013, Or Gerlitz wrote:
> >
> >> On Tue, Dec 3, 2013 at 11:11 PM, Joseph Gasparakis
> >> <joseph.gasparakis@intel.com> wrote:
> >>
> >> >>> lack of GRO : receiver seems to not be able to receive as fast as you want.
> >> >>>>      TCPOFOQueue: 3167879
> >> >>> So many packets are received out of order (because of losses)
> >>
> >> >> I see that there's no GRO also for the non-veth tests which involve
> >> >> vxlan, and over there the receiving side is capable of consuming the
> >> >> packets, do you have a rough explanation why adding veth to the chain
> >> >> is such a game changer which makes things start falling apart?
> >>
> >> > I have seen this before. Here are my findings:
> >> >
> >> > The gso_type is different if the skb comes from veth or not. From veth,
> >> > you will see the SKB_GSO_DODGY set. This breaks things as when the
> >> > skb with DODGY set moves from vxlan to the driver through
> >> > dev_hard_start_xmit, the stack drops it silently. I never got the time
> >> > to find the root cause for this, but I know it causes re-transmissions
> >> > and a big performance degradation.
> >> >
> >> > I went as far as just quickly hacking a one liner unsetting the DODGY bit
> >> > in vxlan.c and that bypassed the issue and recovered the performance
> >> > problem, but obviously this is not a real fix.
> >>
> >> thanks for the heads up, a few quick questions/clarifications --
> >>
> >> -- you are talking on drops done @ the sender side, correct? Eric was
> >> saying we have evidences that the drops happen on the receiver.
> >
> > I am *guessing* drops on the Rx are due to the drops at the Tx. See my
> > answer to your next question for more info.
> >
> >>
> >> -- without the hack you did, still packets are sent/received, so what
> >> makes the stack to drop only some of them?
> >>
> >
> > What I had seen is GSOs getting dropped on the Tx side. Basically the GSOs
> > never made it to the driver, they were broken into non GSO smaller skbs by
> > the stack. I think the stack is not handling well the GSO with the DODGY
> > > bit set, and that maybe causes the packet to be only partially emitted,
> > causing the re-transmits (and maybe the drops on your Rx end)? Of course
> > all this is speculation, the fact that I know is that as soon as I was
> > forcing the gso type I saw offloaded VXLAN encapsulated traffic at decent speeds.
> >
> >> -- why packets coming from veth would have the SKB_GSO_DODGY bit set?
> >
> > That is something I would love to know too. I am guessing this is a way
> > for the VM to say it is a non-trusted packet? And maybe all this can be
> > fixed by maybe setting something on the VM through a userspace tool that
> > > will stop the veth from setting the DODGY bit?
> >
> >>
> >> -- so where is now (say net.git or 3.12.x) this one line you commented
> >> out? I don't see in vxlan.c or in ip_tunnel_core.c / ip_tunnel.c
> >> explicit setting of SKB_GSO_DODGY
> >
> > I did not commit it, as this was just a workaround to prove to myself that
> > > the problem I was seeing was due to the gso_type, and it would actually
> > just hide the problem and not give a proper solution to it.
> >
> >>
> >> Also, I am pretty sure the problem exists also when sending/receiving
> >> guest traffic through tap/macvtap <--> vhost/virtio-net and friends, I
> >> just stuck to the veth flavour b/c it's one (== the hypervisor)
> >> network stack to debug and not two (+ the guest one).
> 
> understood, can you point to the line/area you hacked, I'd like to try it
> too and see the impact

I was printing the gso_type in vxlan_xmit_skb(), right before
iptunnel_xmit() gets called (I was focusing on UDPv4 encap only). Then I saw
the gso_type was different when a VM was involved and when it was not
(although I was transmitting exactly the same packet), and then I replaced
my printk with something like skb_shinfo(skb)->gso_type = <the gso type I had
for non-VM skb> and it all worked.

Then I looked into what was different between the two gso_types and the
only difference was that SKB_GSO_DODGY was set when Tx'ing from the VM.
I am sure I could have been more delicate with the approach, but hey, it
worked for me.
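
To make the comparison concrete, here is a minimal userspace sketch of the
same bit arithmetic. It is not the kernel change itself: the DEMO_GSO_* values
and the helper names are made up for illustration, and the real SKB_GSO_*
flags in include/linux/skbuff.h sit at bit positions that vary between kernel
versions. It only shows how XOR-ing the two observed masks isolates the one
differing flag, and how clearing just that flag is gentler than overwriting
the whole gso_type.

```c
#include <stdio.h>

/* Illustrative bit values only. The real SKB_GSO_* flags are defined in
 * include/linux/skbuff.h and their positions differ between kernel versions. */
enum {
    DEMO_GSO_TCPV4      = 1 << 0,
    DEMO_GSO_DODGY      = 1 << 1,
    DEMO_GSO_UDP_TUNNEL = 1 << 2
};

/* Return the bits that differ between two observed gso_type values. */
unsigned int gso_type_diff(unsigned int a, unsigned int b)
{
    return a ^ b;
}

/* Clear only the DODGY bit and leave every other flag untouched. */
unsigned int gso_type_clear_dodgy(unsigned int gso_type)
{
    return gso_type & ~(unsigned int)DEMO_GSO_DODGY;
}

/* Print a gso_type as a hex mask, similar to what a debug printk would show. */
void gso_type_dump(const char *label, unsigned int gso_type)
{
    printf("%s: gso_type=0x%x%s\n", label, gso_type,
           (gso_type & DEMO_GSO_DODGY) ? " (DODGY set)" : "");
}
```

With a host-originated mask of DEMO_GSO_TCPV4 | DEMO_GSO_UDP_TUNNEL and a
VM-originated mask that additionally carries DEMO_GSO_DODGY, gso_type_diff()
returns only the DODGY bit, and gso_type_clear_dodgy() on the VM mask gives
back the host mask.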

I would be curious to see if this is the same issue as mine. It seems like 
it is.

