* [PATCH 0/2] Disable forwarding of LRO skbs
@ 2008-04-30 21:48 Ben Hutchings
2008-04-30 21:51 ` [PATCH 1/2] " Ben Hutchings
` (3 more replies)
0 siblings, 4 replies; 29+ messages in thread
From: Ben Hutchings @ 2008-04-30 21:48 UTC (permalink / raw)
To: David Miller; +Cc: Kieran Mansley, Stephen Hemminger, netdev
Large Receive Offload (LRO) destroys packet headers that should be
preserved when forwarding. Currently it also triggers a BUG() or WARN()
in skb_gso_segment(). We should disable it wherever forwarding is
enabled, and discard LRO skbs with a warning if it is turned back on.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
^ permalink raw reply [flat|nested] 29+ messages in thread* Re: [PATCH 1/2] Disable forwarding of LRO skbs 2008-04-30 21:48 [PATCH 0/2] Disable forwarding of LRO skbs Ben Hutchings @ 2008-04-30 21:51 ` Ben Hutchings 2008-05-01 9:51 ` David Miller 2008-04-30 21:54 ` [PATCH 2/2] " Ben Hutchings ` (2 subsequent siblings) 3 siblings, 1 reply; 29+ messages in thread From: Ben Hutchings @ 2008-04-30 21:51 UTC (permalink / raw) To: David Miller; +Cc: Kieran Mansley, Stephen Hemminger, netdev Add skb_warn_if_lro() to test whether an skb was received with LRO and warn if so. Change br_forward(), ip_forward() and ip6_forward() to call skb_warn_if_lro() and discard the skb if it returns true. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 11fd9f2..18644cd 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1701,6 +1701,8 @@ static inline int skb_is_gso_v6(const struct sk_buff *skb) return skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6; } +extern bool skb_warn_if_lro(const struct sk_buff *skb); + static inline void skb_forward_csum(struct sk_buff *skb) { /* Unfortunately we don't support this one. Any brave souls? */ diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 4fe605f..bfd80c6 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2582,6 +2582,15 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off) return true; } +bool skb_warn_if_lro(const struct sk_buff *skb) +{ + bool ret = unlikely(skb_is_gso(skb)); + if (ret && net_ratelimit()) + pr_warning("%s: received packets cannot be forwarded" + " while LRO is enabled\n", skb->dev->name); + return ret; +} + EXPORT_SYMBOL(___pskb_trim); EXPORT_SYMBOL(__kfree_skb); EXPORT_SYMBOL(kfree_skb); @@ -2615,6 +2624,7 @@ EXPORT_SYMBOL(skb_seq_read); EXPORT_SYMBOL(skb_abort_seq_read); EXPORT_SYMBOL(skb_find_text); EXPORT_SYMBOL(skb_append_datato_frags); +EXPORT_SYMBOL(skb_warn_if_lro); EXPORT_SYMBOL_GPL(skb_to_sgvec); EXPORT_SYMBOL_GPL(skb_cow_data); diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index bdd7c35..cfee7c2 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -91,7 +91,7 @@ void br_deliver(const struct net_bridge_port *to, struct sk_buff *skb) /* called with rcu_read_lock */ void br_forward(const struct net_bridge_port *to, struct sk_buff *skb) { - if (should_deliver(to, skb)) { + if (!skb_warn_if_lro(skb) && should_deliver(to, skb)) { __br_forward(to, skb); return; } diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c index 4813c39..a6e4cfa 100644 --- a/net/ipv4/ip_forward.c +++ b/net/ipv4/ip_forward.c @@ -58,6 +58,9 @@ int ip_forward(struct sk_buff *skb) struct rtable *rt; /* Route we use */ struct ip_options * opt = &(IPCB(skb)->opt); + if (skb_warn_if_lro(skb)) + goto drop; + if (!xfrm4_policy_check(NULL, XFRM_POLICY_FWD, skb)) goto drop; @@ -85,7 +88,7 @@ int ip_forward(struct sk_buff *skb) if (opt->is_strictroute && rt->rt_dst != rt->rt_gateway) goto sr_failed; - if (unlikely(skb->len > dst_mtu(&rt->u.dst) && !skb_is_gso(skb) && + if (unlikely(skb->len > dst_mtu(&rt->u.dst) && (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) { IP_INC_STATS(IPSTATS_MIB_FRAGFAILS); icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 0af2e05..c7f4eec 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -409,6 +409,9 @@ int ip6_forward(struct sk_buff *skb) if (ipv6_devconf.forwarding == 0) goto error; + if (skb_warn_if_lro(skb)) + goto drop; + if (!xfrm6_policy_check(NULL, XFRM_POLICY_FWD, skb)) { IP6_INC_STATS(ip6_dst_idev(dst), IPSTATS_MIB_INDISCARDS); goto drop; -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/2] Disable forwarding of LRO skbs 2008-04-30 21:51 ` [PATCH 1/2] " Ben Hutchings @ 2008-05-01 9:51 ` David Miller 0 siblings, 0 replies; 29+ messages in thread From: David Miller @ 2008-05-01 9:51 UTC (permalink / raw) To: bhutchings; +Cc: kmansley, shemminger, netdev From: Ben Hutchings <bhutchings@solarflare.com> Date: Wed, 30 Apr 2008 22:51:26 +0100 > Add skb_warn_if_lro() to test whether an skb was received with LRO and > warn if so. > > Change br_forward(), ip_forward() and ip6_forward() to call > skb_warn_if_lro() and discard the skb if it returns true. This skb_is_gso() test should be inline. So, instead make this something like: include/linux/skbuff.h: extern bool __skb_warn_if_lro(const struct sk_buff *skb); static inline bool skb_warn_if_lro(const struct sk_buff *skb) { if (unlikely(skb_is_gso(skb))) return __skb_warn_if_lro(skb); return false; } net/core/skbuff.c: bool skb_warn_if_lro(const struct sk_buff *skb) { if (net_ratelimit()) pr_warning("%s: received packets cannot be forwarded" " while LRO is enabled\n", skb->dev->name); return true; } Can you make this correction and resubmit this patch? The second one looks fine to me, you don't need to resubmit that one. Thanks! ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/2] Disable forwarding of LRO skbs 2008-04-30 21:48 [PATCH 0/2] Disable forwarding of LRO skbs Ben Hutchings 2008-04-30 21:51 ` [PATCH 1/2] " Ben Hutchings @ 2008-04-30 21:54 ` Ben Hutchings 2008-04-30 21:58 ` [PATCH 0/2] " David Miller 2008-05-01 10:42 ` Herbert Xu 3 siblings, 0 replies; 29+ messages in thread From: Ben Hutchings @ 2008-04-30 21:54 UTC (permalink / raw) To: David Miller; +Cc: Kieran Mansley, Stephen Hemminger, netdev Add dev_disable_lro() function. Change br_add_if() and all functions that enable IPv4 and IPv6 forwarding to call dev_disable_lro() on the affected devices. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 7c1d446..d7e8f1f 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -866,6 +866,7 @@ extern struct net_device *__dev_get_by_name(struct net *net, const char *name); extern int dev_alloc_name(struct net_device *dev, const char *name); extern int dev_open(struct net_device *dev); extern int dev_close(struct net_device *dev); +extern void dev_disable_lro(struct net_device *dev); extern int dev_queue_xmit(struct sk_buff *skb); extern int register_netdevice(struct net_device *dev); extern void unregister_netdevice(struct net_device *dev); diff --git a/net/core/dev.c b/net/core/dev.c index e1df1ab..62cca32 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -90,6 +90,7 @@ #include <linux/if_ether.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> +#include <linux/ethtool.h> #include <linux/notifier.h> #include <linux/skbuff.h> #include <net/net_namespace.h> @@ -1108,6 +1109,29 @@ int dev_close(struct net_device *dev) } +/** + * dev_disable_lro - disable Large Receive Offload on a device + * @dev: device + * + * Disable Large Receive Offload (LRO) on a net device. Must be + * called under RTNL. This is needed if received packets may be + * forwarded to another interface. + */ +void dev_disable_lro(struct net_device *dev) +{ + if (dev->ethtool_ops && dev->ethtool_ops->get_flags && + dev->ethtool_ops->set_flags) { + u32 flags = dev->ethtool_ops->get_flags(dev); + if (flags & ETH_FLAG_LRO) { + flags &= ~ETH_FLAG_LRO; + dev->ethtool_ops->set_flags(dev, flags); + } + } + WARN_ON(dev->features & NETIF_F_LRO); +} +EXPORT_SYMBOL(dev_disable_lro); + + static int dev_boot_phase = 1; /* diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 298e0f4..1bd6631 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -387,6 +387,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev) goto err2; rcu_assign_pointer(dev->br_port, p); + dev_disable_lro(dev); dev_set_promiscuity(dev, 1); list_add_rcu(&p->list, &br->port_list); diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 6848e47..f88d395 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -171,6 +171,8 @@ static struct in_device *inetdev_init(struct net_device *dev) in_dev->dev = dev; if ((in_dev->arp_parms = neigh_parms_alloc(dev, &arp_tbl)) == NULL) goto out_kfree; + if (IPV4_DEVCONF(in_dev->cnf, FORWARDING)) + dev_disable_lro(dev); /* Reference in_dev->dev */ dev_hold(dev); /* Account for reference dev->ip_ptr (below) */ @@ -1250,6 +1252,8 @@ static void inet_forward_change(struct net *net) read_lock(&dev_base_lock); for_each_netdev(net, dev) { struct in_device *in_dev; + if (on) + dev_disable_lro(dev); rcu_read_lock(); in_dev = __in_dev_get_rcu(dev); if (in_dev) @@ -1257,8 +1261,6 @@ static void inet_forward_change(struct net *net) rcu_read_unlock(); } read_unlock(&dev_base_lock); - - rt_cache_flush(0); } static int devinet_conf_proc(ctl_table *ctl, int write, @@ -1344,10 +1346,19 @@ static int devinet_sysctl_forward(ctl_table *ctl, int write, if (write && *valp != val) { struct net *net = ctl->extra2; - if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) - inet_forward_change(net); - else if (valp != &IPV4_DEVCONF_DFLT(net, FORWARDING)) + if (valp != &IPV4_DEVCONF_DFLT(net, FORWARDING)) { + rtnl_lock(); + if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) { + inet_forward_change(net); + } else if (*valp) { + struct ipv4_devconf *cnf = ctl->extra1; + struct in_device *idev = + container_of(cnf, struct in_device, cnf); + dev_disable_lro(idev->dev); + } + rtnl_unlock(); rt_cache_flush(0); + } } return ret; diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 8a0fd40..5be6874 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -344,6 +344,8 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev) kfree(ndev); return NULL; } + if (ndev->cnf.forwarding) + dev_disable_lro(dev); /* We refer to the device */ dev_hold(dev); @@ -438,6 +440,8 @@ static void dev_forward_change(struct inet6_dev *idev) if (!idev) return; dev = idev->dev; + if (idev->cnf.forwarding) + dev_disable_lro(dev); if (dev && (dev->flags & IFF_MULTICAST)) { if (idev->cnf.forwarding) ipv6_dev_mc_inc(dev, &in6addr_linklocal_allrouters); @@ -483,12 +487,14 @@ static void addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old) if (p == &net->ipv6.devconf_dflt->forwarding) return; + rtnl_lock(); if (p == &net->ipv6.devconf_all->forwarding) { __s32 newf = net->ipv6.devconf_all->forwarding; net->ipv6.devconf_dflt->forwarding = newf; addrconf_forward_change(net, newf); } else if ((!*p) ^ (!old)) dev_forward_change((struct inet6_dev *)table->extra1); + rtnl_unlock(); if (*p) rt6_purge_dflt_routers(net); -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-04-30 21:48 [PATCH 0/2] Disable forwarding of LRO skbs Ben Hutchings 2008-04-30 21:51 ` [PATCH 1/2] " Ben Hutchings 2008-04-30 21:54 ` [PATCH 2/2] " Ben Hutchings @ 2008-04-30 21:58 ` David Miller 2008-05-01 10:19 ` Herbert Xu 2008-05-01 10:42 ` Herbert Xu 3 siblings, 1 reply; 29+ messages in thread From: David Miller @ 2008-04-30 21:58 UTC (permalink / raw) To: bhutchings; +Cc: kmansley, shemminger, netdev From: Ben Hutchings <bhutchings@solarflare.com> Date: Wed, 30 Apr 2008 22:48:46 +0100 > Large Receive Offload (LRO) destroys packet headers that should be > preserved when forwarding. Currently it also triggers a BUG() or WARN() > in skb_gso_segment(). We should disable it wherever forwarding is > enabled, and discard LRO skbs with a warning if it is turned back on. Thanks for reposting your work, I'll take a closer look at these two patches later today. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-04-30 21:58 ` [PATCH 0/2] " David Miller @ 2008-05-01 10:19 ` Herbert Xu 2008-05-01 10:34 ` David Miller 2008-05-01 11:00 ` Kieran Mansley 0 siblings, 2 replies; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:19 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, kmansley, shemminger, netdev David Miller <davem@davemloft.net> wrote: > From: Ben Hutchings <bhutchings@solarflare.com> > Date: Wed, 30 Apr 2008 22:48:46 +0100 > >> Large Receive Offload (LRO) destroys packet headers that should be >> preserved when forwarding. Currently it also triggers a BUG() or WARN() >> in skb_gso_segment(). We should disable it wherever forwarding is >> enabled, and discard LRO skbs with a warning if it is turned back on. > > Thanks for reposting your work, I'll take a closer look at these > two patches later today. This patch completely breaks forwarding of GSO packets in a virtualised environment where we explicitly want to forward them as is to reduce per-packet overhead. So if LRO is causing problems then I suggest we find a better way around that that does not penalise all forms of GSO. Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:19 ` Herbert Xu @ 2008-05-01 10:34 ` David Miller 2008-05-01 10:38 ` Herbert Xu 2008-05-01 11:00 ` Kieran Mansley 1 sibling, 1 reply; 29+ messages in thread From: David Miller @ 2008-05-01 10:34 UTC (permalink / raw) To: herbert; +Cc: bhutchings, kmansley, shemminger, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu, 01 May 2008 18:19:54 +0800 > This patch completely breaks forwarding of GSO packets in a > virtualised environment where we explicitly want to forward > them as is to reduce per-packet overhead. So if LRO is causing > problems then I suggest we find a better way around that that > does not penalise all forms of GSO. I assume you mean forwarding via a bridge? I think we'll need to add a device attribute for what you want. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:34 ` David Miller @ 2008-05-01 10:38 ` Herbert Xu 2008-05-01 10:45 ` Herbert Xu 2008-05-01 10:51 ` David Miller 0 siblings, 2 replies; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:38 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, kmansley, shemminger, netdev On Thu, May 01, 2008 at 03:34:14AM -0700, David Miller wrote: > From: Herbert Xu <herbert@gondor.apana.org.au> > Date: Thu, 01 May 2008 18:19:54 +0800 > > > This patch completely breaks forwarding of GSO packets in a > > virtualised environment where we explicitly want to forward > > them as is to reduce per-packet overhead. So if LRO is causing > > problems then I suggest we find a better way around that that > > does not penalise all forms of GSO. > > I assume you mean forwarding via a bridge? Not necessarily. For example we could be forwarding in a guest that's acting as a firewall appliance for other guests. > I think we'll need to add a device attribute for what > you want. I still don't understand why we can't forward LRO packets. It should be no different to forwarding GSO packets if it's properly constructed. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:38 ` Herbert Xu @ 2008-05-01 10:45 ` Herbert Xu 2008-05-01 10:52 ` David Miller 2008-05-01 10:51 ` David Miller 1 sibling, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:45 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, kmansley, shemminger, netdev On Thu, May 01, 2008 at 06:38:54PM +0800, Herbert Xu wrote: > > I still don't understand why we can't forward LRO packets. It > should be no different to forwarding GSO packets if it's properly > constructed. Nevermind, doing LRO on forwarded traffic is fundamentally broken because it breaks the end-to-end connection. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:45 ` Herbert Xu @ 2008-05-01 10:52 ` David Miller 2008-05-01 10:55 ` Herbert Xu 0 siblings, 1 reply; 29+ messages in thread From: David Miller @ 2008-05-01 10:52 UTC (permalink / raw) To: herbert; +Cc: bhutchings, kmansley, shemminger, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu, 1 May 2008 18:45:17 +0800 > On Thu, May 01, 2008 at 06:38:54PM +0800, Herbert Xu wrote: > > > > I still don't understand why we can't forward LRO packets. It > > should be no different to forwarding GSO packets if it's properly > > constructed. > > Nevermind, doing LRO on forwarded traffic is fundamentally broken > because it breaks the end-to-end connection. Ok, that brings us back to your virtualization issues. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:52 ` David Miller @ 2008-05-01 10:55 ` Herbert Xu 2008-05-01 11:04 ` Kieran Mansley 0 siblings, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:55 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, kmansley, shemminger, netdev On Thu, May 01, 2008 at 03:52:04AM -0700, David Miller wrote: > > Ok, that brings us back to your virtualization issues. I was talking about GSO traffic which doesn't originate from LRO. All traffic generated by guests would fall into that category. They can be forwarded by the guests and the host with no problems at all. So what we want to detect/forbid isn't the forwarding of GSO packets, but the LRO of packets which aren't destined for the local host. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:55 ` Herbert Xu @ 2008-05-01 11:04 ` Kieran Mansley 0 siblings, 0 replies; 29+ messages in thread From: Kieran Mansley @ 2008-05-01 11:04 UTC (permalink / raw) To: Herbert Xu; +Cc: David Miller, bhutchings, shemminger, netdev On Thu, 2008-05-01 at 18:55 +0800, Herbert Xu wrote: > So what we want to detect/forbid isn't the forwarding of GSO > packets, but the LRO of packets which aren't destined for the > local host. That's correct as I understand it. Kieran ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:38 ` Herbert Xu 2008-05-01 10:45 ` Herbert Xu @ 2008-05-01 10:51 ` David Miller 2008-05-01 10:53 ` Herbert Xu 1 sibling, 1 reply; 29+ messages in thread From: David Miller @ 2008-05-01 10:51 UTC (permalink / raw) To: herbert; +Cc: bhutchings, kmansley, shemminger, netdev From: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu, 1 May 2008 18:38:54 +0800 > I still don't understand why we can't forward LRO packets. It > should be no different to forwarding GSO packets if it's properly > constructed. Hmmm, you should be right. On the output path, if the device can't GSO, we'll software un-GSO it. Perhaps the issue is that it's expensive to do that, and thus it could harm routing performance in such cases. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:51 ` David Miller @ 2008-05-01 10:53 ` Herbert Xu 0 siblings, 0 replies; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:53 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, kmansley, shemminger, netdev On Thu, May 01, 2008 at 03:51:08AM -0700, David Miller wrote: > > Perhaps the issue is that it's expensive to do that, and thus > it could harm routing performance in such cases. Actually my brain was busted from reading FIPS doco :) LRO on forward is fundamentally broken because we'd be fiddling with other people's bits. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:19 ` Herbert Xu 2008-05-01 10:34 ` David Miller @ 2008-05-01 11:00 ` Kieran Mansley 2008-05-01 11:06 ` Herbert Xu 1 sibling, 1 reply; 29+ messages in thread From: Kieran Mansley @ 2008-05-01 11:00 UTC (permalink / raw) To: Herbert Xu; +Cc: David Miller, bhutchings, shemminger, netdev On Thu, 2008-05-01 at 18:19 +0800, Herbert Xu wrote: > David Miller <davem@davemloft.net> wrote: > > From: Ben Hutchings <bhutchings@solarflare.com> > > Date: Wed, 30 Apr 2008 22:48:46 +0100 > > > >> Large Receive Offload (LRO) destroys packet headers that should be > >> preserved when forwarding. Currently it also triggers a BUG() or WARN() > >> in skb_gso_segment(). We should disable it wherever forwarding is > >> enabled, and discard LRO skbs with a warning if it is turned back on. > > > > Thanks for reposting your work, I'll take a closer look at these > > two patches later today. > > This patch completely breaks forwarding of GSO packets in a > virtualised environment where we explicitly want to forward > them as is to reduce per-packet overhead. So if LRO is causing > problems then I suggest we find a better way around that that > does not penalise all forms of GSO. If I remember rightly the nub of the matter as far as that is concerned is that both LRO and GSO use the gso_size field in the SKB. Normally this overloading is fine as LRO is receive-side and GSO is send-side and so the two code paths are different and react to the gso_size field appropriately. However, when an LRO packet is forwarded by a bridge to another device it then appears on the send path and the gso_size field being set (because it is an LRO packet) means it is interpreted as being a GSO packet. To some extent this should be OK - fragmenting it should be really easy as it's already in what are likely to be appropriate sized chunks in the frag_list - but the code at the moment generates warnings and BUGs in this case as it's (understandably) not been written to cope with both LRO and GSO packets. There's no reason why it has to be that way however. Fundamentally I think it would be an improvement to have some definite way to distinguish LRO and GSO packets, regardless of whether it can cope with both. If that were available, it would be possible to warn on LRO packets without affecting the GSO packets being forwarded in virtualised environements. As an aside, the gso_size field isn't the only one that has a slightly different meaning depending on whether the SKB was generated by a stack or a device. The ip_summed field is I think another one, and would surely need special handling by the bridge when forwarding packets. If it's already doing this, there's a precedent for fixing things up there. Kieran ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:00 ` Kieran Mansley @ 2008-05-01 11:06 ` Herbert Xu 0 siblings, 0 replies; 29+ messages in thread From: Herbert Xu @ 2008-05-01 11:06 UTC (permalink / raw) To: Kieran Mansley; +Cc: David Miller, bhutchings, shemminger, netdev On Thu, May 01, 2008 at 12:00:28PM +0100, Kieran Mansley wrote: > > As an aside, the gso_size field isn't the only one that has a slightly > different meaning depending on whether the SKB was generated by a stack > or a device. The ip_summed field is I think another one, and would > surely need special handling by the bridge when forwarding packets. If > it's already doing this, there's a precedent for fixing things up there. You would've been right about ip_summed in the past :) As it stands I've made it specifically treat inbound/outbound packets in the same way so this is no longer true. As to gso_size, I think the crux of the problem is that we're doing LRO on forwarded traffic at all. Once you've done it to a packet we might as well make the stack forward it since what's done is done and forwarding it is going to be a hell lot better than dropping it. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-04-30 21:48 [PATCH 0/2] Disable forwarding of LRO skbs Ben Hutchings ` (2 preceding siblings ...) 2008-04-30 21:58 ` [PATCH 0/2] " David Miller @ 2008-05-01 10:42 ` Herbert Xu 2008-05-01 11:02 ` Ben Hutchings 3 siblings, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 10:42 UTC (permalink / raw) To: Ben Hutchings; +Cc: davem, kmansley, shemminger, netdev Ben Hutchings <bhutchings@solarflare.com> wrote: > Large Receive Offload (LRO) destroys packet headers that should be > preserved when forwarding. Currently it also triggers a BUG() or WARN() > in skb_gso_segment(). We should disable it wherever forwarding is > enabled, and discard LRO skbs with a warning if it is turned back on. I don't think forwarding GSO packets is bad per se. However, doing LRO on forwarded traffic is broken because it breaks the end-to-end connection. So I agree that we should detect this but preferrably not based on the GSO flag as LRO isn't the only source of GSO traffic and the rest of them can be forwarded just fine. Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 10:42 ` Herbert Xu @ 2008-05-01 11:02 ` Ben Hutchings 2008-05-01 11:08 ` Kieran Mansley 2008-05-01 14:06 ` Herbert Xu 0 siblings, 2 replies; 29+ messages in thread From: Ben Hutchings @ 2008-05-01 11:02 UTC (permalink / raw) To: Herbert Xu; +Cc: davem, kmansley, shemminger, netdev Herbert Xu wrote: > Ben Hutchings <bhutchings@solarflare.com> wrote: > > Large Receive Offload (LRO) destroys packet headers that should be > > preserved when forwarding. Currently it also triggers a BUG() or WARN() > > in skb_gso_segment(). We should disable it wherever forwarding is > > enabled, and discard LRO skbs with a warning if it is turned back on. > > I don't think forwarding GSO packets is bad per se. However, > doing LRO on forwarded traffic is broken because it breaks the > end-to-end connection. So I agree that we should detect this > but preferrably not based on the GSO flag as LRO isn't the only > source of GSO traffic and the rest of them can be forwarded just > fine. My understanding is that at the points I'm adding the check, we're only considering received skbs and gso_size will only be set if LRO was used. I was going to raise the virtualisation issue because Kieran has been in contact with XenSource over this and they believed LRO was not a problem in Xen. So long as the bridges in dom0 each connect a single physical interface to other domains running paravirtualised Linux, this is probably true. But I have no idea how to "whitelist" this sort of case. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:02 ` Ben Hutchings @ 2008-05-01 11:08 ` Kieran Mansley 2008-05-01 11:12 ` Herbert Xu 2008-05-01 14:06 ` Herbert Xu 1 sibling, 1 reply; 29+ messages in thread From: Kieran Mansley @ 2008-05-01 11:08 UTC (permalink / raw) To: Ben Hutchings; +Cc: Herbert Xu, davem, shemminger, netdev On Thu, 2008-05-01 at 12:02 +0100, Ben Hutchings wrote: > I was going to raise the virtualisation issue because Kieran has been > in contact with XenSource over this and they believed LRO was not a > problem in Xen. So long as the bridges in dom0 each connect a single > physical interface to other domains running paravirtualised Linux, this > is probably true. Yes, everything works fine in that case. It's only when LRO packets are forwarded to another physical interface that things go wrong. Kieran ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:08 ` Kieran Mansley @ 2008-05-01 11:12 ` Herbert Xu 2008-05-01 11:18 ` Kieran Mansley 0 siblings, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 11:12 UTC (permalink / raw) To: Kieran Mansley; +Cc: Ben Hutchings, davem, shemminger, netdev On Thu, May 01, 2008 at 12:08:48PM +0100, Kieran Mansley wrote: > > Yes, everything works fine in that case. It's only when LRO packets are > forwarded to another physical interface that things go wrong. Now this is definitely broken because it fiddles with TCP packets in the middle, but still it should *work* in as much as making the packet go out. Well except that you need to fix up the checksum so that it only contains the pseudo-header on output. That's probably what you need to address if we failed in avoiding LRO on forwarded traffic. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:12 ` Herbert Xu @ 2008-05-01 11:18 ` Kieran Mansley 2008-05-01 11:37 ` Herbert Xu 0 siblings, 1 reply; 29+ messages in thread From: Kieran Mansley @ 2008-05-01 11:18 UTC (permalink / raw) To: Herbert Xu; +Cc: Ben Hutchings, davem, shemminger, netdev On Thu, 2008-05-01 at 19:12 +0800, Herbert Xu wrote: > On Thu, May 01, 2008 at 12:08:48PM +0100, Kieran Mansley wrote: > > > > Yes, everything works fine in that case. It's only when LRO packets are > > forwarded to another physical interface that things go wrong. > > Now this is definitely broken because it fiddles with TCP packets > in the middle, but still it should *work* in as much as making the > packet go out. Well except that you need to fix up the checksum > so that it only contains the pseudo-header on output. That's > probably what you need to address if we failed in avoiding LRO > on forwarded traffic. If I remember rightly it also gets upset when it comes to split the packet because the frag_list is already in use; it expects this not to be the case for GSO but it's valid for an LRO packet to do this I think. I don't think there's anything fundamentally difficult here, it's just that the code has never been written to cope. Kieran ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:18 ` Kieran Mansley @ 2008-05-01 11:37 ` Herbert Xu 2008-05-01 12:08 ` Ben Hutchings 0 siblings, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 11:37 UTC (permalink / raw) To: Kieran Mansley; +Cc: Ben Hutchings, davem, shemminger, netdev On Thu, May 01, 2008 at 12:18:08PM +0100, Kieran Mansley wrote: > > If I remember rightly it also gets upset when it comes to split the > packet because the frag_list is already in use; it expects this not to > be the case for GSO but it's valid for an LRO packet to do this I think. > I don't think there's anything fundamentally difficult here, it's just > that the code has never been written to cope. Oh so LRO packets still preserve the original packet boundaries? That's even easier. We could just make the GSO output code detect the presence of frag_list and fragment based on that. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:37 ` Herbert Xu @ 2008-05-01 12:08 ` Ben Hutchings 2008-05-01 12:19 ` Herbert Xu 0 siblings, 1 reply; 29+ messages in thread From: Ben Hutchings @ 2008-05-01 12:08 UTC (permalink / raw) To: Herbert Xu; +Cc: Kieran Mansley, davem, shemminger, netdev Herbert Xu wrote: > On Thu, May 01, 2008 at 12:18:08PM +0100, Kieran Mansley wrote: > > > > If I remember rightly it also gets upset when it comes to split the > > packet because the frag_list is already in use; it expects this not to > > be the case for GSO but it's valid for an LRO packet to do this I think. > > I don't think there's anything fundamentally difficult here, it's just > > that the code has never been written to cope. > > Oh so LRO packets still preserve the original packet boundaries? > That's even easier. We could just make the GSO output code detect > the presence of frag_list and fragment based on that. Depending on the driver, the packets may be received into skbs which are then attached to frag_list, or into page buffers which go in the frags array. If most packets can be merged by LRO, deferring skb allocation and using the frags array is a performance win. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 12:08 ` Ben Hutchings @ 2008-05-01 12:19 ` Herbert Xu 2008-05-12 14:48 ` Ben Hutchings 0 siblings, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 12:19 UTC (permalink / raw) To: Ben Hutchings; +Cc: Kieran Mansley, davem, shemminger, netdev On Thu, May 01, 2008 at 01:08:51PM +0100, Ben Hutchings wrote: > > Depending on the driver, the packets may be received into skbs which are > then attached to frag_list, or into page buffers which go in the frags > array. If most packets can be merged by LRO, deferring skb allocation > and using the frags array is a performance win. But for each packet it's either going to be frag_list or page frags, right? So we could still fragment based on frag_list if it's present. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 12:19 ` Herbert Xu @ 2008-05-12 14:48 ` Ben Hutchings 0 siblings, 0 replies; 29+ messages in thread From: Ben Hutchings @ 2008-05-12 14:48 UTC (permalink / raw) To: Herbert Xu; +Cc: Kieran Mansley, davem, shemminger, netdev Herbert Xu wrote: > On Thu, May 01, 2008 at 01:08:51PM +0100, Ben Hutchings wrote: > > > > Depending on the driver, the packets may be received into skbs which are > > then attached to frag_list, or into page buffers which go in the frags > > array. If most packets can be merged by LRO, deferring skb allocation > > and using the frags array is a performance win. > > But for each packet it's either going to be frag_list or page > frags, right? So we could still fragment based on frag_list if > it's present. Please can you post an implementation of this. I don't want this bug to be left unfixed. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 11:02 ` Ben Hutchings 2008-05-01 11:08 ` Kieran Mansley @ 2008-05-01 14:06 ` Herbert Xu 2008-06-15 0:51 ` Ben Hutchings 1 sibling, 1 reply; 29+ messages in thread From: Herbert Xu @ 2008-05-01 14:06 UTC (permalink / raw) To: Ben Hutchings; +Cc: davem, kmansley, shemminger, netdev On Thu, May 01, 2008 at 12:02:27PM +0100, Ben Hutchings wrote: > > My understanding is that at the points I'm adding the check, we're only > considering received skbs and gso_size will only be set if LRO was used. Unfortunately that's not the case. The forwarding path can and does occur for GSO packets sent from other guests. The obvious case would be a GSO packet from a guest forwarded by the host (as opposed to bridged) to the outside world. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-05-01 14:06 ` Herbert Xu @ 2008-06-15 0:51 ` Ben Hutchings 2008-06-15 1:46 ` Ben Hutchings 0 siblings, 1 reply; 29+ messages in thread From: Ben Hutchings @ 2008-06-15 0:51 UTC (permalink / raw) To: Herbert Xu; +Cc: davem, kmansley, shemminger, netdev Herbert Xu wrote: > On Thu, May 01, 2008 at 12:02:27PM +0100, Ben Hutchings wrote: > > > > My understanding is that at the points I'm adding the check, we're only > > considering received skbs and gso_size will only be set if LRO was used. > > Unfortunately that's not the case. The forwarding path can and > does occur for GSO packets sent from other guests. The obvious > case would be a GSO packet from a guest forwarded by the host > (as opposed to bridged) to the outside world. Can we distinguish them by testing skb->ip_summed == CHECKSUM_PARTIAL? That is a requirement for using hardware segmentation offload (see netif_needs_gso()) but an skb resulting from LRO should have ip_summed set to either CHECKSUM_UNNECESSARY or CHECKSUM_NONE. Alternately we could introduce some kind of status flag to say that a device is bridging or forwarding, and have ethtool_op_set_flags() block re-enabling of LRO when that's set. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-06-15 0:51 ` Ben Hutchings @ 2008-06-15 1:46 ` Ben Hutchings 2008-06-15 3:38 ` Herbert Xu 0 siblings, 1 reply; 29+ messages in thread From: Ben Hutchings @ 2008-06-15 1:46 UTC (permalink / raw) To: Herbert Xu; +Cc: davem, kmansley, shemminger, netdev Ben Hutchings wrote: > Herbert Xu wrote: > > On Thu, May 01, 2008 at 12:02:27PM +0100, Ben Hutchings wrote: > > > > > > My understanding is that at the points I'm adding the check, we're only > > > considering received skbs and gso_size will only be set if LRO was used. > > > > Unfortunately that's not the case. The forwarding path can and > > does occur for GSO packets sent from other guests. The obvious > > case would be a GSO packet from a guest forwarded by the host > > (as opposed to bridged) to the outside world. > > Can we distinguish them by testing skb->ip_summed == CHECKSUM_PARTIAL? > That is a requirement for using hardware segmentation offload (see > netif_needs_gso()) but an skb resulting from LRO should have ip_summed > set to either CHECKSUM_UNNECESSARY or CHECKSUM_NONE. Actually, it seems like there's a cleaner test: gso_size != 0 && gso_type == 0 (if we really want GSO then the latter must be set). I'll have to try that out. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/2] Disable forwarding of LRO skbs 2008-06-15 1:46 ` Ben Hutchings @ 2008-06-15 3:38 ` Herbert Xu 0 siblings, 0 replies; 29+ messages in thread From: Herbert Xu @ 2008-06-15 3:38 UTC (permalink / raw) To: Ben Hutchings; +Cc: davem, kmansley, shemminger, netdev On Sun, Jun 15, 2008 at 02:46:58AM +0100, Ben Hutchings wrote: > > > Can we distinguish them by testing skb->ip_summed == CHECKSUM_PARTIAL? > > That is a requirement for using hardware segmentation offload (see > > netif_needs_gso()) but an skb resulting from LRO should have ip_summed > > set to either CHECKSUM_UNNECESSARY or CHECKSUM_NONE. > > Actually, it seems like there's a cleaner test: gso_size != 0 && > gso_type == 0 (if we really want GSO then the latter must be set). > I'll have to try that out. Yeah either way should work. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2008-06-15 3:38 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-04-30 21:48 [PATCH 0/2] Disable forwarding of LRO skbs Ben Hutchings 2008-04-30 21:51 ` [PATCH 1/2] " Ben Hutchings 2008-05-01 9:51 ` David Miller 2008-04-30 21:54 ` [PATCH 2/2] " Ben Hutchings 2008-04-30 21:58 ` [PATCH 0/2] " David Miller 2008-05-01 10:19 ` Herbert Xu 2008-05-01 10:34 ` David Miller 2008-05-01 10:38 ` Herbert Xu 2008-05-01 10:45 ` Herbert Xu 2008-05-01 10:52 ` David Miller 2008-05-01 10:55 ` Herbert Xu 2008-05-01 11:04 ` Kieran Mansley 2008-05-01 10:51 ` David Miller 2008-05-01 10:53 ` Herbert Xu 2008-05-01 11:00 ` Kieran Mansley 2008-05-01 11:06 ` Herbert Xu 2008-05-01 10:42 ` Herbert Xu 2008-05-01 11:02 ` Ben Hutchings 2008-05-01 11:08 ` Kieran Mansley 2008-05-01 11:12 ` Herbert Xu 2008-05-01 11:18 ` Kieran Mansley 2008-05-01 11:37 ` Herbert Xu 2008-05-01 12:08 ` Ben Hutchings 2008-05-01 12:19 ` Herbert Xu 2008-05-12 14:48 ` Ben Hutchings 2008-05-01 14:06 ` Herbert Xu 2008-06-15 0:51 ` Ben Hutchings 2008-06-15 1:46 ` Ben Hutchings 2008-06-15 3:38 ` Herbert Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).