From: Flavio Leitner <fbl@redhat.com>
To: "Du, Fan" <fan.du@intel.com>
Cc: "'Jason Wang'" <jasowang@redhat.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"davem@davemloft.net" <davem@davemloft.net>,
"fw@strlen.de" <fw@strlen.de>
Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
Date: Tue, 2 Dec 2014 13:44:26 -0200
Message-ID: <20141202154425.GA5344@t520.home>
In-Reply-To: <5A90DA2E42F8AE43BC4A093BF0678848DED92B@SHSMSX104.ccr.corp.intel.com>
On Sun, Nov 30, 2014 at 10:08:32AM +0000, Du, Fan wrote:
>
>
> >-----Original Message-----
> >From: Jason Wang [mailto:jasowang@redhat.com]
> >Sent: Friday, November 28, 2014 3:02 PM
> >To: Du, Fan
> >Cc: netdev@vger.kernel.org; davem@davemloft.net; fw@strlen.de; Du, Fan
> >Subject: Re: [PATCH net] gso: do GSO for local skb with size bigger than MTU
> >
> >
> >
> >On Fri, Nov 28, 2014 at 2:33 PM, Fan Du <fan.du@intel.com> wrote:
> >> Test scenario: two KVM guests on different hosts communicate
> >> with each other over a vxlan tunnel.
> >>
> >> All interfaces use the default MTU of 1500 bytes. From the guest's point
> >> of view, its skb gso_size can be as big as 1448 bytes. However, after the
> >> guest skb goes through vxlan encapsulation, the individual segments of a
> >> GSO packet can exceed the physical NIC MTU of 1500 and will be lost at
> >> the receiver side.
> >>
> >> So in a virtualized environment, the length of a locally created skb
> >> after encapsulation can be bigger than the underlying MTU. In such a case,
> >> it's reasonable to do GSO first, then fragment any packet bigger than
> >> the MTU if possible.
> >>
> >> +---------------+   TX      RX   +---------------+
> >> |   KVM Guest   |   -> ... ->    |   KVM Guest   |
> >> +-+-----------+-+                +-+-----------+-+
> >>   |Qemu/VirtIO|                    |Qemu/VirtIO|
> >>   +-----------+                    +-----------+
> >>         |                                |
> >>         v tap0                      tap0 v
> >>   +-----------+                    +-----------+
> >>   | ovs bridge|                    | ovs bridge|
> >>   +-----------+                    +-----------+
> >>         | vxlan                    vxlan |
> >>         v                                v
> >>   +-----------+                    +-----------+
> >>   |    NIC    |     <------>       |    NIC    |
> >>   +-----------+                    +-----------+
> >>
> >> Steps to reproduce:
> >> 1. Use the kernel's builtin openvswitch module to set up the ovs bridge.
> >> 2. Run iperf without -M; communication will get stuck.
> >
> >Is this issue specific to ovs or ipv4? Path MTU discovery should help in this case I
> >believe.
>
> The problem here is that the host stack pushes a locally created, over-sized GSO skb down to the NIC,
> and GSO is performed there without any further IP-level segmentation.
>
> The reasonable behavior is to do GSO first at the IP level; if a GSO-ed skb is bigger than the MTU && DF is set,
> then push an ICMP_DEST_UNREACH/ICMP_FRAG_NEEDED message back to the sender to adjust the MTU.
>
> Making PMTU work is another issue that I will try to address later on.
>
> >>
> >>
> >> Signed-off-by: Fan Du <fan.du@intel.com>
> >> ---
> >> net/ipv4/ip_output.c | 7 ++++---
> >> 1 files changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index
> >> bc6471d..558b5f8 100644
> >> --- a/net/ipv4/ip_output.c
> >> +++ b/net/ipv4/ip_output.c
> >> @@ -217,9 +217,10 @@ static int ip_finish_output_gso(struct sk_buff
> >> *skb)
> >> struct sk_buff *segs;
> >> int ret = 0;
> >>
> >> - /* common case: locally created skb or seglen is <= mtu */
> >> - if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
> >> - skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
> >> + /* Both locally created skb and forwarded skb could exceed
> >> + * MTU size, so make a unified rule for them all.
> >> + */
> >> + if (skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
> >> return ip_finish_output2(skb);

Are you using the kernel's vxlan device or openvswitch's vxlan device?
For the kernel's vxlan devices the MTU accounts for the header
overhead, so I believe your patch would work. However, the MTU is
not visible for ovs's vxlan devices, so that wouldn't work.

fbl
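
For readers following the numbers, here is a rough sketch of the arithmetic and of the check the quoted diff changes. It is a simplified model and not kernel code: the helper names are hypothetical, the flags and lengths are plain values rather than skb fields, the 32-byte TCP header (20 bytes plus a timestamp option) is an assumption chosen to match the 1448-byte gso_size above, and the 50-byte vxlan overhead assumes an IPv4 outer header with no VLAN tag and no IP options.

```c
#include <assert.h>
#include <stdbool.h>

/* Header sizes in bytes (assumed: IPv4 outer header, no VLAN, no options). */
enum {
	INNER_ETH_HLEN  = 14,
	VXLAN_HLEN      = 8,
	OUTER_UDP_HLEN  = 8,
	OUTER_IPV4_HLEN = 20,
	IPV4_HLEN       = 20,
	VXLAN_OVERHEAD  = INNER_ETH_HLEN + VXLAN_HLEN +
			  OUTER_UDP_HLEN + OUTER_IPV4_HLEN,
};

static_assert(VXLAN_OVERHEAD == 50, "vxlan over IPv4 adds 50 bytes");

/* Hypothetical helper: IP-level length of one segment as the guest sees it. */
static unsigned int inner_seglen(unsigned int gso_size, unsigned int tcp_hlen)
{
	return IPV4_HLEN + tcp_hlen + gso_size;
}

/* Hypothetical helper: IP-level length of the same segment after vxlan encap. */
static unsigned int outer_seglen(unsigned int inner)
{
	return inner + VXLAN_OVERHEAD;
}

/* Model of the check before the patch: a local skb always takes the fast path. */
static bool fast_path_before(bool forwarded, unsigned int seglen,
			     unsigned int mtu)
{
	return !forwarded || seglen <= mtu;
}

/* Model of the check after the patch: only the segment length matters. */
static bool fast_path_after(bool forwarded, unsigned int seglen,
			    unsigned int mtu)
{
	(void)forwarded;
	return seglen <= mtu;
}
```

Under these assumptions a 1448-byte gso_size gives a 1500-byte inner packet and a 1550-byte outer packet. The pre-patch model still takes the fast path for a locally created skb of that size, while the post-patch model does not. Whether the kernel sees the real MTU at that point is the question raised above about ovs's vxlan devices, and this sketch does not model it.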