From mboxrd@z Thu Jan  1 00:00:00 1970
From: Stephen Hemminger <stephen@networkplumber.org>
Subject: Re: [PATCH RFC 2/2] veth: propagate bridge GSO to peer
Date: Thu, 30 Nov 2017 09:10:21 -0800
Message-ID: <20171130091021.658869b0@xeon-e3>
References: <20171126181749.19288-1-sthemmin@microsoft.com>
        <20171126181749.19288-3-sthemmin@microsoft.com>
        <aac2350d-3a10-c97f-939b-a32d5e300412@gmail.com>
        <20171126230725.1fcc3b51@xeon-e3>
        <20171127201419.GA79@intel.com>
        <20171127131502.1fbfaa66@xeon-e3>
        <20171128014222.GA503@intel.com>
        <91628267-2e48-a231-7cc2-4830eb95ceef@gmail.com>
        <20171130003534.GA60@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: David Ahern <dsahern@gmail.com>, davem@davemloft.net,
        netdev@vger.kernel.org, sthemmin@microsoft.com,
        shiny.sebastian@intel.com
To: Solio Sarabia <solio.sarabia@intel.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-pl0-f48.google.com ([209.85.160.48]:42419 "EHLO
        mail-pl0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752286AbdK3RKb (ORCPT
        <rfc822;netdev@vger.kernel.org>); Thu, 30 Nov 2017 12:10:31 -0500
Received: by mail-pl0-f48.google.com with SMTP id bd8so4611326plb.9
        for <netdev@vger.kernel.org>; Thu, 30 Nov 2017 09:10:31 -0800 (PST)
In-Reply-To: <20171130003534.GA60@intel.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On Wed, 29 Nov 2017 16:35:37 -0800
Solio Sarabia <solio.sarabia@intel.com> wrote:

> On Mon, Nov 27, 2017 at 07:02:01PM -0700, David Ahern wrote:
> > On 11/27/17 6:42 PM, Solio Sarabia wrote:  
> > > Adding ioctl support for 'ip link set' would work. I'm still concerned
> > > how to enforce the upper limit to not exceed that of the lower devices.
> > >   
> Actually, giving the user control to change gso doesn't solve the issue.
> In a VM, user could simple ignore setting the gso, still hurting host
> perf. We need to enforce the lower gso on the bridge/veth.
> 
> Should this issue be fixed at hv_netvsc level? Why is the driver passing
> down gso buffer sizes greater than what synthetic interface allows.
> 
> > > Consider a system with three NICs, each reporting values in the range
> > > [60,000 - 62,780]. Users could set virtual interfaces' gso to 65,536,
> > > exceeding the limit, and having the host do sw gso (vms settings must
> > > not affect host performance.)
> > > 
> > > Looping through interfaces?  With the difference that now it'd be
> > > trigger upon user's request, not every time a veth is created (like one
> > > previous patch discussed.)
> > >   
> > 
> > You are concerned about the routed case right? One option is to have VRF
> > devices propagate gso sizes to all devices (veth, vlan, etc) enslaved to
> > it. VRF devices are Layer 3 master devices so an L3 parallel to a bridge.  
> Having the VRF device propagate the gso to its slaves is opposite of
> what we do now: get minimum of all ports and assign it to bridge
> (net/bridge/br_if.c, br_min_mtu, br_set_gso_limits.)
> 
> Would it be right to change the logic flow? If so, this this could work:
> 
> (1) bridge gets gso from lower devices upon init/setup
> (2) when new device is attached to bridge, bridge sets gso for this new
>     slave (and its peer if it's veth.)
> (3) as the code is now, there's an optimization opportunity: for every
>     new interface attached to bridge, bridge loops through all ports to
>     set gso, mtu. It's not necessary as bridge already has the minimum
>     from previous interfaces attached. Could be O(1) instead of O(n).


The problem goes back into the core GSO networking code.
Something like this is needed.

static inline bool netif_needs_gso(struct sk_buff *skb,
				   const struct net_device *dev,
				   netdev_features_t features)
{
	return skb_is_gso(skb) &&
		(!skb_gso_ok(skb, features) ||
		 unlikely(skb_shinfo(skb)->gso_segs > dev->gso_max_segs) ||  << new
		 unlikely(skb_shinfo(skb)->gso_size > dev->gso_max_size) ||  << new
		 unlikely((skb->ip_summed != CHECKSUM_PARTIAL) &&
			  (skb->ip_summed != CHECKSUM_UNNECESSARY)));
}

What that will do is split up the monster GSO packets if they ever bleed
across from one device to another through the twisty mazes of packet
processing paths.