netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: Solio Sarabia <solio.sarabia@intel.com>
Cc: netdev@vger.kernel.org, davem@davemloft.net,
	eric.dumazet@gmail.com, dsahern@gmail.com, kys@microsoft.com,
	shiny.sebastian@intel.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] veth: make veth aware of gso buffer size
Date: Sun, 26 Nov 2017 09:26:28 -0800	[thread overview]
Message-ID: <20171126092628.668dc189@xeon-e3> (raw)
In-Reply-To: <1511645212-18600-1-git-send-email-solio.sarabia@intel.com>

On Sat, 25 Nov 2017 13:26:52 -0800
Solio Sarabia <solio.sarabia@intel.com> wrote:

> GSO buffer size supported by underlying devices is not propagated to
> veth. In high-speed connections with hw TSO enabled, veth sends buffers
> bigger than lower device's maximum GSO, forcing sw TSO and increasing
> system CPU usage.
> 
> Signed-off-by: Solio Sarabia <solio.sarabia@intel.com>
> ---
> Exposing gso_max_size via sysfs is not advised [0]. This patch queries
> available interfaces get this value. Reading dev_list is O(n), since it
> can be large (e.g. hundreds of containers), only a subset of interfaces
> is inspected.  _Please_ advise pointers how to make veth aware of lower
> device's GSO value.
> 
> In a test scenario with Hyper-V, Ubuntu VM, Docker inside VM, and NTttcp
> microworkload sending 40 Gbps from one container, this fix reduces 3x
> sender host CPU overhead, since now all TSO is done on physical NIC.
> Savings in CPU cycles benefit other use cases where veth is used, and
> the GSO buffer size is properly set.
> 
> [0] https://lkml.org/lkml/2017/11/24/512
> 
>  drivers/net/veth.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index f5438d0..e255b51 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -298,6 +298,34 @@ static const struct net_device_ops veth_netdev_ops = {
>  		       NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX | \
>  		       NETIF_F_HW_VLAN_STAG_TX | NETIF_F_HW_VLAN_STAG_RX )
>  
> +static void veth_set_gso(struct net_device *dev)
> +{
> +	struct net_device *nd;
> +	unsigned int size = GSO_MAX_SIZE;
> +	u16 segs = GSO_MAX_SEGS;
> +	unsigned int count = 0;
> +	const unsigned int limit = 10;
> +
> +	/* Set default gso based on available physical/synthetic devices,
> +	 * ignore virtual interfaces, and limit looping through dev_list
> +	 * as the total number of interfaces can be large.
> +	 */
> +	read_lock(&dev_base_lock);
> +	for_each_netdev(&init_net, nd) {
> +		if (count >= limit)
> +			break;
> +		if (nd->dev.parent && nd->flags & IFF_UP) {
> +			size = min(size, nd->gso_max_size);
> +			segs = min(segs, nd->gso_max_segs);
> +		}
> +		count++;
> +	}
> +
> +	read_unlock(&dev_base_lock);
> +	netif_set_gso_max_size(dev, size);
> +	dev->gso_max_segs = size ? size - 1 : 0;
> +}

Thanks for looking for a solution. 

Looking at the first 10 devices (including those not related to veth) is not that
great a method. There maybe 100's of tunnels, and there is no guarantee of ordering
in the device list. And what about network namespaces, looking in root namespace
is suspect as well.

The locking also looks wrong. veth_setup is called with RTNL held
(from __rtnl_link_register). Therefore acquiring dev_base_lock is not necessary.

      reply	other threads:[~2017-11-26 17:26 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-25 21:26 [PATCH RFC] veth: make veth aware of gso buffer size Solio Sarabia
2017-11-26 17:26 ` Stephen Hemminger [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171126092628.668dc189@xeon-e3 \
    --to=stephen@networkplumber.org \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shiny.sebastian@intel.com \
    --cc=solio.sarabia@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).