* regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Michael S. Tsirkin @ 2013-02-06 11:43 UTC
To: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev

It seems that starting with kernel 3.3 ixgbe sets gso_size for
incoming frames. It seems that this might result in gso_size
being set even when gso_type is 0.
This in turn leads to a crash at macvtap_skb_to_vnet_hdr
drivers/net/macvtap.c:628
which has this code:

	if (skb_is_gso(skb)) {
		struct skb_shared_info *sinfo = skb_shinfo(skb);

		/* This is a hint as to how much should be linear. */
		vnet_hdr->hdr_len = skb_headlen(skb);
		vnet_hdr->gso_size = sinfo->gso_size;
		if (sinfo->gso_type & SKB_GSO_TCPV4)
			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
		else if (sinfo->gso_type & SKB_GSO_TCPV6)
			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
		else if (sinfo->gso_type & SKB_GSO_UDP)
			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
		else
			BUG();
		if (sinfo->gso_type & SKB_GSO_TCP_ECN)
			vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
	} else
		vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;


Since skb_is_gso tests gso_size.

What's the right way to handle this? Should skb_is_gso be
changed to test gso_type != 0?

-- 
MST

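The failure mode described in this message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `model_shinfo`, the `MODEL_GSO_*` bit values and `model_to_vnet()` are hypothetical stand-ins that mirror the shape of the quoted macvtap branch, assuming (as the message states) that skb_is_gso() tests gso_size alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real SKB_GSO_* constants may differ. */
enum {
	MODEL_GSO_TCPV4 = 1 << 0,
	MODEL_GSO_TCPV6 = 1 << 1,
	MODEL_GSO_UDP   = 1 << 2,
};

/* Only the two fields discussed in the thread are modelled. */
struct model_shinfo {
	uint16_t gso_size;
	uint16_t gso_type;
};

enum model_vnet_result {
	MODEL_VNET_NONE,
	MODEL_VNET_TCPV4,
	MODEL_VNET_TCPV6,
	MODEL_VNET_UDP,
	MODEL_VNET_WOULD_BUG,	/* stands in for the BUG() branch */
};

/* Assumption from the message: the check looks at gso_size only. */
static bool model_is_gso(const struct model_shinfo *si)
{
	return si->gso_size != 0;
}

/* Same branch order as the quoted macvtap code. */
static enum model_vnet_result model_to_vnet(const struct model_shinfo *si)
{
	if (!model_is_gso(si))
		return MODEL_VNET_NONE;
	if (si->gso_type & MODEL_GSO_TCPV4)
		return MODEL_VNET_TCPV4;
	if (si->gso_type & MODEL_GSO_TCPV6)
		return MODEL_VNET_TCPV6;
	if (si->gso_type & MODEL_GSO_UDP)
		return MODEL_VNET_UDP;
	/* gso_size != 0 but no known type bit: the case reported here. */
	return MODEL_VNET_WOULD_BUG;
}
```

Feeding the model a frame with a nonzero gso_size and gso_type 0 reaches the MODEL_VNET_WOULD_BUG result, while a frame with both fields zero takes the plain (non-GSO) path.
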
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Eric Dumazet @ 2013-02-06 13:07 UTC
To: Michael S. Tsirkin
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev

On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote:
> It seems that starting with kernel 3.3 ixgbe sets gso_size for
> incoming frames. It seems that this might result in gso_size
> being set even when gso_type is 0.
> This in turn leads to a crash at macvtap_skb_to_vnet_hdr
> drivers/net/macvtap.c:628
> which has this code:
>
> 	if (skb_is_gso(skb)) {
> 		struct skb_shared_info *sinfo = skb_shinfo(skb);
>
> 		/* This is a hint as to how much should be linear. */
> 		vnet_hdr->hdr_len = skb_headlen(skb);
> 		vnet_hdr->gso_size = sinfo->gso_size;
> 		if (sinfo->gso_type & SKB_GSO_TCPV4)
> 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> 		else if (sinfo->gso_type & SKB_GSO_TCPV6)
> 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> 		else if (sinfo->gso_type & SKB_GSO_UDP)
> 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
> 		else
> 			BUG();
> 		if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> 			vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> 	} else
> 		vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
>
>
> Since skb_is_gso tests gso_size.
>
> What's the right way to handle this? Should skb_is_gso be
> changed to test gso_type != 0?
>

Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the
dissection.

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Michael S. Tsirkin @ 2013-02-06 15:50 UTC
To: Eric Dumazet
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote:
> On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote:
> > It seems that starting with kernel 3.3 ixgbe sets gso_size for
> > incoming frames. It seems that this might result in gso_size
> > being set even when gso_type is 0.
> > This in turn leads to a crash at macvtap_skb_to_vnet_hdr
> > drivers/net/macvtap.c:628
> > which has this code:
> >
> > 	if (skb_is_gso(skb)) {
> > 		struct skb_shared_info *sinfo = skb_shinfo(skb);
> >
> > 		/* This is a hint as to how much should be linear. */
> > 		vnet_hdr->hdr_len = skb_headlen(skb);
> > 		vnet_hdr->gso_size = sinfo->gso_size;
> > 		if (sinfo->gso_type & SKB_GSO_TCPV4)
> > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> > 		else if (sinfo->gso_type & SKB_GSO_TCPV6)
> > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> > 		else if (sinfo->gso_type & SKB_GSO_UDP)
> > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
> > 		else
> > 			BUG();
> > 		if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> > 			vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> > 	} else
> > 		vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
> >
> >
> > Since skb_is_gso tests gso_size.
> >
> > What's the right way to handle this? Should skb_is_gso be
> > changed to test gso_type != 0?
> >
>
> Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the
> dissection.

Hmm, ixgbe_get_headlen isn't run on linear skbs though.

Also, I'm not sure I understand when should drivers set gso size
for incoming messages and what is a reasonable value.
Commit log talks about improved performance for lossy connections,
in this case, isn't this something net core should set?

I see 3 in-tree drivers that do this:

drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: skb_shinfo(skb)->gso_size = DIV_ROUND_UP((skb->le
drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c: skb_shinfo(skb)->gso_size = qlcnic_get_lr

It seems likely the same issue applies there?

-- 
MST

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Eric Dumazet @ 2013-02-06 16:15 UTC
To: Michael S. Tsirkin
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:

> Also, I'm not sure I understand when should drivers set gso size
> for incoming messages and what is a reasonable value.
> Commit log talks about improved performance for lossy connections,
> in this case, isn't this something net core should set?
>
> I see 3 in-tree drivers that do this:
>
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x

bnx2x is fine, take a look at lines 464

> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: skb_shinfo(skb)->gso_size = DIV_ROUND_UP((skb->le
> drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c: skb_shinfo(skb)->gso_size = qlcnic_get_lr
>
> It seems likely the same issue applies there?

Yes

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Michael S. Tsirkin @ 2013-02-06 17:42 UTC
To: Eric Dumazet
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, Feb 06, 2013 at 08:15:32AM -0800, Eric Dumazet wrote:
> On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:
>
> > Also, I'm not sure I understand when should drivers set gso size
> > for incoming messages and what is a reasonable value.
> > Commit log talks about improved performance for lossy connections,
> > in this case, isn't this something net core should set?
> >
> > I see 3 in-tree drivers that do this:
> >
> > drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x
>
> bnx2x is fine, take a look at lines 464

Is this what you mean?

	/* This is needed in order to enable forwarding support */
	if (frag_size) {
		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp,
					tpa_info->parsing_flags, len_on_bd);

		/* set for GRO */
		if (fp->mode == TPA_MODE_GRO)
			skb_shinfo(skb)->gso_type =
			    (GET_FLAG(tpa_info->parsing_flags,
				      PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
						PRS_FLAG_OVERETH_IPV6) ?
				SKB_GSO_TCPV6 : SKB_GSO_TCPV4;
	}


I see it sets gso_type but apparently only if mode is GRO?
Will this still break if mode is set to LRO?

> > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: skb_shinfo(skb)->gso_size = DIV_ROUND_UP((skb->le
> > drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c: skb_shinfo(skb)->gso_size = qlcnic_get_lr
> >
> > It seems likely the same issue applies there?
>
> Yes

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Eric Dumazet @ 2013-02-06 17:42 UTC
To: Michael S. Tsirkin
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, 2013-02-06 at 19:42 +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 06, 2013 at 08:15:32AM -0800, Eric Dumazet wrote:
> > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:
> >
> > > Also, I'm not sure I understand when should drivers set gso size
> > > for incoming messages and what is a reasonable value.
> > > Commit log talks about improved performance for lossy connections,
> > > in this case, isn't this something net core should set?
> > >
> > > I see 3 in-tree drivers that do this:
> > >
> > > drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x
> >
> > bnx2x is fine, take a look at lines 464
>
> Is this what you mean?
>
> 	/* This is needed in order to enable forwarding support */
> 	if (frag_size) {
> 		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp,
> 					tpa_info->parsing_flags, len_on_bd);
>
> 		/* set for GRO */
> 		if (fp->mode == TPA_MODE_GRO)
> 			skb_shinfo(skb)->gso_type =
> 			    (GET_FLAG(tpa_info->parsing_flags,
> 				      PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
> 						PRS_FLAG_OVERETH_IPV6) ?
> 				SKB_GSO_TCPV6 : SKB_GSO_TCPV4;
> 	}
>
>
> I see it sets gso_type but apparently only if mode is GRO?
> Will this still break if mode is set to LRO?
>

In net-next tree, line 464 looks like :

static void bnx2x_set_gro_params(struct sk_buff *skb, u16 parsing_flags,
				 u16 len_on_bd, unsigned int pkt_len)
{
	/* TPA aggregation won't have either IP options or TCP options
	 * other than timestamp or IPv6 extension headers.
	 */
	u16 hdrs_len = ETH_HLEN + sizeof(struct tcphdr);

	if (GET_FLAG(parsing_flags, PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
	    PRS_FLAG_OVERETH_IPV6) {
		hdrs_len += sizeof(struct ipv6hdr);
<HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
	} else {
		hdrs_len += sizeof(struct iphdr);
<HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
	}

	/* Check if there was a TCP timestamp, if there is it's will
	 * always be 12 bytes length: nop nop kind length echo val.
	 *
	 * Otherwise FW would close the aggregation.
	 */
	if (parsing_flags & PARSING_FLAGS_TIME_STAMP_EXIST_FLAG)
		hdrs_len += TPA_TSTAMP_OPT_LEN;

	skb_shinfo(skb)->gso_size = len_on_bd - hdrs_len;
...

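The header arithmetic in the quoted bnx2x_set_gro_params() can be worked through in user space. The sketch below is a model under stated assumptions, not the driver code: the `tpa_model_*` names are hypothetical, the header sizes are the usual option-free lengths (14-byte Ethernet, 20-byte TCP, 20-byte IPv4, 40-byte IPv6), and the 12-byte timestamp option length is taken from the quoted comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Option-free header sizes assumed for this model. */
enum {
	TPA_MODEL_ETH_HLEN       = 14,
	TPA_MODEL_TCP_HLEN       = 20,
	TPA_MODEL_IPV4_HLEN      = 20,
	TPA_MODEL_IPV6_HLEN      = 40,
	TPA_MODEL_TSTAMP_OPT_LEN = 12,	/* "always be 12 bytes" per the comment */
};

/* Sum of the headers that precede TCP payload in the first buffer. */
static uint16_t tpa_model_hdrs_len(bool is_ipv6, bool has_tstamp)
{
	uint16_t len = TPA_MODEL_ETH_HLEN + TPA_MODEL_TCP_HLEN;

	len += is_ipv6 ? TPA_MODEL_IPV6_HLEN : TPA_MODEL_IPV4_HLEN;
	if (has_tstamp)
		len += TPA_MODEL_TSTAMP_OPT_LEN;
	return len;
}

/* gso_size = len_on_bd - hdrs_len, as in the last quoted statement. */
static uint16_t tpa_model_gso_size(uint16_t len_on_bd, bool is_ipv6,
				   bool has_tstamp)
{
	return (uint16_t)(len_on_bd - tpa_model_hdrs_len(is_ipv6, has_tstamp));
}
```

Assuming the first buffer holds one full 1514-byte Ethernet frame, the model gives a gso_size of 1460 for IPv4 without timestamps, 1448 for IPv4 with timestamps, and 1428 for IPv6 with timestamps.
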
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Michael S. Tsirkin @ 2013-02-06 17:50 UTC
To: Eric Dumazet
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, Feb 06, 2013 at 09:42:45AM -0800, Eric Dumazet wrote:
> On Wed, 2013-02-06 at 19:42 +0200, Michael S. Tsirkin wrote:
> > On Wed, Feb 06, 2013 at 08:15:32AM -0800, Eric Dumazet wrote:
> > > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:
> > >
> > > > Also, I'm not sure I understand when should drivers set gso size
> > > > for incoming messages and what is a reasonable value.
> > > > Commit log talks about improved performance for lossy connections,
> > > > in this case, isn't this something net core should set?
> > > >
> > > > I see 3 in-tree drivers that do this:
> > > >
> > > > drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x
> > >
> > > bnx2x is fine, take a look at lines 464
> >
> > Is this what you mean?
> >
> > 	/* This is needed in order to enable forwarding support */
> > 	if (frag_size) {
> > 		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp,
> > 					tpa_info->parsing_flags, len_on_bd);
> >
> > 		/* set for GRO */
> > 		if (fp->mode == TPA_MODE_GRO)
> > 			skb_shinfo(skb)->gso_type =
> > 			    (GET_FLAG(tpa_info->parsing_flags,
> > 				      PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
> > 						PRS_FLAG_OVERETH_IPV6) ?
> > 				SKB_GSO_TCPV6 : SKB_GSO_TCPV4;
> > 	}
> >
> >
> > I see it sets gso_type but apparently only if mode is GRO?
> > Will this still break if mode is set to LRO?
> >
>
> In net-next tree, line 464 looks like :
>
> static void bnx2x_set_gro_params(struct sk_buff *skb, u16 parsing_flags,
> 				 u16 len_on_bd, unsigned int pkt_len)
> {
> 	/* TPA aggregation won't have either IP options or TCP options
> 	 * other than timestamp or IPv6 extension headers.
> 	 */
> 	u16 hdrs_len = ETH_HLEN + sizeof(struct tcphdr);
>
> 	if (GET_FLAG(parsing_flags, PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
> 	    PRS_FLAG_OVERETH_IPV6) {
> 		hdrs_len += sizeof(struct ipv6hdr);
> <HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> 	} else {
> 		hdrs_len += sizeof(struct iphdr);
> <HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> 	}
>
> 	/* Check if there was a TCP timestamp, if there is it's will
> 	 * always be 12 bytes length: nop nop kind length echo val.
> 	 *
> 	 * Otherwise FW would close the aggregation.
> 	 */
> 	if (parsing_flags & PARSING_FLAGS_TIME_STAMP_EXIST_FLAG)
> 		hdrs_len += TPA_TSTAMP_OPT_LEN;
>
> 	skb_shinfo(skb)->gso_size = len_on_bd - hdrs_len;
> ...
>

OK, it looks like intel/qlogic can just look at skb protocol and set
gso_type to TCPV4/V6. I'll try it and see if this works.

-- 
MST

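The idea of deriving gso_type from the skb protocol could look roughly like the following user-space sketch. It is not the ixgbe or qlcnic change: `sketch_gso_type_for_proto()` and the `SKETCH_*` constants are hypothetical, the EtherType is taken in host byte order for simplicity, and the model assumes the hardware only coalesces TCP segments.

```c
#include <stdint.h>

/* Standard EtherType values for IPv4 and IPv6. */
#define SKETCH_ETH_P_IP   0x0800u
#define SKETCH_ETH_P_IPV6 0x86DDu

/* Hypothetical type bits; the real SKB_GSO_* values may differ. */
enum {
	SKETCH_GSO_NONE  = 0,
	SKETCH_GSO_TCPV4 = 1 << 0,
	SKETCH_GSO_TCPV6 = 1 << 1,
};

/*
 * Pick a GSO type for a hardware-coalesced TCP frame from its EtherType.
 * Returns SKETCH_GSO_NONE for anything that is not IPv4 or IPv6, in which
 * case a driver following this idea would presumably leave gso_size unset.
 */
static uint16_t sketch_gso_type_for_proto(uint16_t ethertype)
{
	switch (ethertype) {
	case SKETCH_ETH_P_IP:
		return SKETCH_GSO_TCPV4;
	case SKETCH_ETH_P_IPV6:
		return SKETCH_GSO_TCPV6;
	default:
		return SKETCH_GSO_NONE;
	}
}
```

The point of the design is that gso_size and gso_type are then set together, so a consumer such as the macvtap code quoted earlier never sees a size without a type.
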
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Michael S. Tsirkin @ 2013-02-06 18:05 UTC
To: Eric Dumazet
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, Feb 06, 2013 at 09:42:45AM -0800, Eric Dumazet wrote:
> On Wed, 2013-02-06 at 19:42 +0200, Michael S. Tsirkin wrote:
> > On Wed, Feb 06, 2013 at 08:15:32AM -0800, Eric Dumazet wrote:
> > > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:
> > >
> > > > Also, I'm not sure I understand when should drivers set gso size
> > > > for incoming messages and what is a reasonable value.
> > > > Commit log talks about improved performance for lossy connections,
> > > > in this case, isn't this something net core should set?
> > > >
> > > > I see 3 in-tree drivers that do this:
> > > >
> > > > drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c: skb_shinfo(skb)->gso_size = bnx2x
> > >
> > > bnx2x is fine, take a look at lines 464
> >
> > Is this what you mean?
> >
> > 	/* This is needed in order to enable forwarding support */
> > 	if (frag_size) {
> > 		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp,
> > 					tpa_info->parsing_flags, len_on_bd);
> >
> > 		/* set for GRO */
> > 		if (fp->mode == TPA_MODE_GRO)
> > 			skb_shinfo(skb)->gso_type =
> > 			    (GET_FLAG(tpa_info->parsing_flags,
> > 				      PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
> > 						PRS_FLAG_OVERETH_IPV6) ?
> > 				SKB_GSO_TCPV6 : SKB_GSO_TCPV4;
> > 	}
> >
> >
> > I see it sets gso_type but apparently only if mode is GRO?
> > Will this still break if mode is set to LRO?
> >
>
> In net-next tree, line 464 looks like :
>
> static void bnx2x_set_gro_params(struct sk_buff *skb, u16 parsing_flags,
> 				 u16 len_on_bd, unsigned int pkt_len)
> {
> 	/* TPA aggregation won't have either IP options or TCP options
> 	 * other than timestamp or IPv6 extension headers.
> 	 */
> 	u16 hdrs_len = ETH_HLEN + sizeof(struct tcphdr);
>
> 	if (GET_FLAG(parsing_flags, PARSING_FLAGS_OVER_ETHERNET_PROTOCOL) ==
> 	    PRS_FLAG_OVERETH_IPV6) {
> 		hdrs_len += sizeof(struct ipv6hdr);
> <HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> 	} else {
> 		hdrs_len += sizeof(struct iphdr);
> <HERE>		skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> 	}
>
> 	/* Check if there was a TCP timestamp, if there is it's will
> 	 * always be 12 bytes length: nop nop kind length echo val.
> 	 *
> 	 * Otherwise FW would close the aggregation.
> 	 */
> 	if (parsing_flags & PARSING_FLAGS_TIME_STAMP_EXIST_FLAG)
> 		hdrs_len += TPA_TSTAMP_OPT_LEN;
>
> 	skb_shinfo(skb)->gso_size = len_on_bd - hdrs_len;
> ...

OK this means cbf1de72324a8105ddcc3d9ce9acbc613faea17e might be needed
in 3.8 and maybe -stable, otherwise macvtap crashes when LRO is set.

-- 
MST

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Eric Dumazet @ 2013-02-06 18:09 UTC
To: Michael S. Tsirkin
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, 2013-02-06 at 20:05 +0200, Michael S. Tsirkin wrote:
>
> OK this means cbf1de72324a8105ddcc3d9ce9acbc613faea17e might be needed
> in 3.8 and maybe -stable, otherwise macvtap crashes when LRO is set.
>

You may be right :

My issue was an accounting error in a rarely used path (qdisc on
ingress).

Yours is definitely more serious ;)

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Cong Wang @ 2013-02-07 3:03 UTC
To: netdev

On Wed, 06 Feb 2013 at 17:42 GMT, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> I see it sets gso_type but apparently only if mode is GRO?
> Will this still break if mode is set to LRO?

The comment inside skb_warn_if_lro() said:

	/* LRO sets gso_size but not gso_type, whereas if GSO is really
	 * wanted then gso_type will be set. */

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Eric Dumazet @ 2013-02-06 16:23 UTC
To: Michael S. Tsirkin
Cc: alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:

> Commit log talks about improved performance for lossy connections,
> in this case, isn't this something net core should set?
>

I fail to see where net core could do that.

These drivers kind of bypass GRO, so they must provide skbs that are
100% correct.

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198?
From: Ben Hutchings @ 2013-02-06 19:58 UTC
To: Michael S. Tsirkin
Cc: Eric Dumazet, alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong

On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote:
> > On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote:
> > > It seems that starting with kernel 3.3 ixgbe sets gso_size for
> > > incoming frames. It seems that this might result in gso_size
> > > being set even when gso_type is 0.
> > > This in turn leads to a crash at macvtap_skb_to_vnet_hdr
> > > drivers/net/macvtap.c:628
> > > which has this code:
> > >
> > > 	if (skb_is_gso(skb)) {
> > > 		struct skb_shared_info *sinfo = skb_shinfo(skb);
> > >
> > > 		/* This is a hint as to how much should be linear. */
> > > 		vnet_hdr->hdr_len = skb_headlen(skb);
> > > 		vnet_hdr->gso_size = sinfo->gso_size;
> > > 		if (sinfo->gso_type & SKB_GSO_TCPV4)
> > > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> > > 		else if (sinfo->gso_type & SKB_GSO_TCPV6)
> > > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> > > 		else if (sinfo->gso_type & SKB_GSO_UDP)
> > > 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
> > > 		else
> > > 			BUG();
> > > 		if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> > > 			vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> > > 	} else
> > > 		vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
> > >
> > >
> > > Since skb_is_gso tests gso_size.
> > >
> > > What's the right way to handle this? Should skb_is_gso be
> > > changed to test gso_type != 0?
> > >
> >
> > Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the
> > dissection.
>
>
> Hmm, ixgbe_get_headlen isn't run on linear skbs though.
>
> Also, I'm not sure I understand when should drivers set gso size
> for incoming messages and what is a reasonable value.
> Commit log talks about improved performance for lossy connections,
> in this case, isn't this something net core should set?
[...]

It should be set to the segment size on the wire, so TCP gets a correct
picture of packet loss. The networking core has no idea what hardware/
firmware LRO did.

I've previously raised this issue of macvlan vs LRO (which is the same
issue we previously had with IP forwarding and with bridging):
http://thread.gmane.org/gmane.linux.network/221695

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198? 2013-02-06 19:58 ` Ben Hutchings @ 2013-02-06 21:45 ` Michael S. Tsirkin 2013-02-06 21:55 ` Ben Hutchings 0 siblings, 1 reply; 16+ messages in thread From: Michael S. Tsirkin @ 2013-02-06 21:45 UTC (permalink / raw) To: Ben Hutchings Cc: Eric Dumazet, alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong On Wed, Feb 06, 2013 at 07:58:21PM +0000, Ben Hutchings wrote: > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote: > > On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote: > > > On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote: > > > > It seems that starting with kernel 3.3 ixgbe sets gso_size for > > > > incoming frames. It seems that this might result in gso_size > > > > being set even when gso_type is 0. > > > > This in turn leads to a crash at macvtap_skb_to_vnet_hdr > > > > drivers/net/macvtap.c:628 > > > > which has this code: > > > > > > > > if (skb_is_gso(skb)) { > > > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > > > > > > > /* This is a hint as to how much should be linear. */ > > > > vnet_hdr->hdr_len = skb_headlen(skb); > > > > vnet_hdr->gso_size = sinfo->gso_size; > > > > if (sinfo->gso_type & SKB_GSO_TCPV4) > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4; > > > > else if (sinfo->gso_type & SKB_GSO_TCPV6) > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6; > > > > else if (sinfo->gso_type & SKB_GSO_UDP) > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP; > > > > else > > > > BUG(); > > > > if (sinfo->gso_type & SKB_GSO_TCP_ECN) > > > > vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN; > > > > } else > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE; > > > > > > > > > > > > Since skb_is_gso tests gso_size. > > > > > > > > What's the right way to handle this? Should skb_is_gso be > > > > changed to test gso_type != 0? 
> > > > > > > > > > Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the > > > dissection. > > > > > > Hmm, ixgbe_get_headlen isn't run on linear skbs though. > > > > Also, I'm not sure I understand when should drivers set gso size > > for incoming messages and what is a reasonable value. > > Commit log talks about improved performance for lossy connections, > > in this case, isn't this something net core should set? > [...] > > It should be set to the segment size on the wire, so TCP gets a correct > picture of packet loss. The networking core has no idea what hardware/ > firmware LRO did. > > I've previously raised this issue of macvlan vs LRO (which is the same > issue we previously had with IP forwarding and with bridging): > http://thread.gmane.org/gmane.linux.network/221695 > > Ben. I see, you proposed disabling LRO the moment a macvlan is attached. If I understand correctly, the difference as compared to bridge is that bridge normally consumes all incoming packets, macvlan is often used in parallel with the underlying interface. BTW are there other issues with forwarding/bridging and LRO? If everyone sets gso_type in packets it seems we can leave LRO set even with bridging? > -- > Ben Hutchings, Staff Engineer, Solarflare > Not speaking for my employer; that's the marketing department's job. > They asked us to note that Solarflare product names are trademarked. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198? 2013-02-06 21:45 ` Michael S. Tsirkin @ 2013-02-06 21:55 ` Ben Hutchings 2013-02-06 22:26 ` Michael S. Tsirkin 0 siblings, 1 reply; 16+ messages in thread From: Ben Hutchings @ 2013-02-06 21:55 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Eric Dumazet, alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong On Wed, 2013-02-06 at 23:45 +0200, Michael S. Tsirkin wrote: > On Wed, Feb 06, 2013 at 07:58:21PM +0000, Ben Hutchings wrote: > > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote: > > > On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote: > > > > On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote: > > > > > It seems that starting with kernel 3.3 ixgbe sets gso_size for > > > > > incoming frames. It seems that this might result in gso_size > > > > > being set even when gso_type is 0. > > > > > This in turn leads to a crash at macvtap_skb_to_vnet_hdr > > > > > drivers/net/macvtap.c:628 > > > > > which has this code: > > > > > > > > > > if (skb_is_gso(skb)) { > > > > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > > > > > > > > > /* This is a hint as to how much should be linear. */ > > > > > vnet_hdr->hdr_len = skb_headlen(skb); > > > > > vnet_hdr->gso_size = sinfo->gso_size; > > > > > if (sinfo->gso_type & SKB_GSO_TCPV4) > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4; > > > > > else if (sinfo->gso_type & SKB_GSO_TCPV6) > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6; > > > > > else if (sinfo->gso_type & SKB_GSO_UDP) > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP; > > > > > else > > > > > BUG(); > > > > > if (sinfo->gso_type & SKB_GSO_TCP_ECN) > > > > > vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN; > > > > > } else > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE; > > > > > > > > > > > > > > > Since skb_is_gso tests gso_size. 
> > > > > > > > > > What's the right way to handle this? Should skb_is_gso be > > > > > changed to test gso_type != 0? > > > > > > > > > > > > > Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the > > > > dissection. > > > > > > > > > Hmm, ixgbe_get_headlen isn't run on linear skbs though. > > > > > > Also, I'm not sure I understand when should drivers set gso size > > > for incoming messages and what is a reasonable value. > > > Commit log talks about improved performance for lossy connections, > > > in this case, isn't this something net core should set? > > [...] > > > > It should be set to the segment size on the wire, so TCP gets a correct > > picture of packet loss. The networking core has no idea what hardware/ > > firmware LRO did. > > > > I've previously raised this issue of macvlan vs LRO (which is the same > > issue we previously had with IP forwarding and with bridging): > > http://thread.gmane.org/gmane.linux.network/221695 > > > > Ben. > > I see, you proposed disabling LRO the moment a macvlan is attached. > If I understand correctly, the difference as compared to bridge is that > bridge normally consumes all incoming packets, macvlan is often used in > parallel with the underlying interface. Not so different from IP forwarding, though, in that a single interface can both forward packets and deliver them locally. > BTW are there other issues with forwarding/bridging and LRO? If everyone > sets gso_type in packets it seems we can leave LRO set even with > bridging? Some implementations of LRO violate the end-to-end principle by merging segments with varying lengths and TCP timestamps. GRO is very careful to ensure that the original packets can be recovered from the skbs it produces. Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198? 2013-02-06 21:55 ` Ben Hutchings @ 2013-02-06 22:26 ` Michael S. Tsirkin 2013-02-06 23:29 ` Ben Hutchings 0 siblings, 1 reply; 16+ messages in thread From: Michael S. Tsirkin @ 2013-02-06 22:26 UTC (permalink / raw) To: Ben Hutchings Cc: Eric Dumazet, alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong On Wed, Feb 06, 2013 at 09:55:47PM +0000, Ben Hutchings wrote: > On Wed, 2013-02-06 at 23:45 +0200, Michael S. Tsirkin wrote: > > On Wed, Feb 06, 2013 at 07:58:21PM +0000, Ben Hutchings wrote: > > > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote: > > > > On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote: > > > > > On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote: > > > > > > It seems that starting with kernel 3.3 ixgbe sets gso_size for > > > > > > incoming frames. It seems that this might result in gso_size > > > > > > being set even when gso_type is 0. > > > > > > This in turn leads to a crash at macvtap_skb_to_vnet_hdr > > > > > > drivers/net/macvtap.c:628 > > > > > > which has this code: > > > > > > > > > > > > if (skb_is_gso(skb)) { > > > > > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > > > > > > > > > > > /* This is a hint as to how much should be linear. 
*/ > > > > > > vnet_hdr->hdr_len = skb_headlen(skb); > > > > > > vnet_hdr->gso_size = sinfo->gso_size; > > > > > > if (sinfo->gso_type & SKB_GSO_TCPV4) > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4; > > > > > > else if (sinfo->gso_type & SKB_GSO_TCPV6) > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6; > > > > > > else if (sinfo->gso_type & SKB_GSO_UDP) > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP; > > > > > > else > > > > > > BUG(); > > > > > > if (sinfo->gso_type & SKB_GSO_TCP_ECN) > > > > > > vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN; > > > > > > } else > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE; > > > > > > > > > > > > > > > > > > Since skb_is_gso tests gso_size. > > > > > > > > > > > > What's the right way to handle this? Should skb_is_gso be > > > > > > changed to test gso_type != 0? > > > > > > > > > > > > > > > > Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the > > > > > dissection. > > > > > > > > > > > > Hmm, ixgbe_get_headlen isn't run on linear skbs though. > > > > > > > > Also, I'm not sure I understand when should drivers set gso size > > > > for incoming messages and what is a reasonable value. > > > > Commit log talks about improved performance for lossy connections, > > > > in this case, isn't this something net core should set? > > > [...] > > > > > > It should be set to the segment size on the wire, so TCP gets a correct > > > picture of packet loss. The networking core has no idea what hardware/ > > > firmware LRO did. > > > > > > I've previously raised this issue of macvlan vs LRO (which is the same > > > issue we previously had with IP forwarding and with bridging): > > > http://thread.gmane.org/gmane.linux.network/221695 > > > > > > Ben. > > > > I see, you proposed disabling LRO the moment a macvlan is attached. 
> > If I understand correctly, the difference as compared to bridge is that > > bridge normally consumes all incoming packets, macvlan is often used in > > parallel with the underlying interface. > > Not so different from IP forwarding, though, in that a single interface > can both forward packets and deliver them locally. Hmm, for IP forwarding we don't try to disable LRO on the device, do we? What's the solution there? > > BTW are there other issues with forwarding/bridging and LRO? If everyone > > sets gso_type in packets it seems we can leave LRO set even with > > bridging? > > Some implementations of LRO violate the end-to-end principle by merging > segments with varying lengths and TCP timestamps. GRO is very careful > to ensure that the original packets can be recovered from the skbs it > produces. > > Ben. > > -- > Ben Hutchings, Staff Engineer, Solarflare > Not speaking for my employer; that's the marketing department's job. > They asked us to note that Solarflare product names are trademarked.
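To make the mismatch discussed in this thread concrete, here is a minimal, self-contained C sketch of the case where gso_size is set but gso_type is 0. It is not the kernel code: struct toy_shinfo, toy_is_gso() and toy_to_vnet_gso_type() are hypothetical stand-ins, the flag values are illustrative, and returning -1 stands in for the BUG() in the quoted macvtap snippet.

```c
/* Illustrative flag values; only the names come from the quoted code. */
enum {
    SKB_GSO_TCPV4   = 1 << 0,
    SKB_GSO_UDP     = 1 << 1,
    SKB_GSO_TCP_ECN = 1 << 3,
    SKB_GSO_TCPV6   = 1 << 4,
};

enum {
    VIRTIO_NET_HDR_GSO_NONE  = 0,
    VIRTIO_NET_HDR_GSO_TCPV4 = 1,
    VIRTIO_NET_HDR_GSO_UDP   = 3,
    VIRTIO_NET_HDR_GSO_TCPV6 = 4,
    VIRTIO_NET_HDR_GSO_ECN   = 0x80,
};

/* Hypothetical stand-in for the two skb_shared_info fields involved. */
struct toy_shinfo {
    unsigned short gso_size;
    unsigned short gso_type;
};

/* Models the behaviour described in the thread: only gso_size is tested. */
static int toy_is_gso(const struct toy_shinfo *s)
{
    return s->gso_size != 0;
}

/*
 * Same branch structure as the quoted macvtap snippet.
 * Returns the virtio GSO type, or -1 where the original would hit BUG().
 */
static int toy_to_vnet_gso_type(const struct toy_shinfo *s)
{
    int type;

    if (!toy_is_gso(s))
        return VIRTIO_NET_HDR_GSO_NONE;

    if (s->gso_type & SKB_GSO_TCPV4)
        type = VIRTIO_NET_HDR_GSO_TCPV4;
    else if (s->gso_type & SKB_GSO_TCPV6)
        type = VIRTIO_NET_HDR_GSO_TCPV6;
    else if (s->gso_type & SKB_GSO_UDP)
        type = VIRTIO_NET_HDR_GSO_UDP;
    else
        return -1; /* gso_size set, gso_type == 0: the reported crash path */

    if (s->gso_type & SKB_GSO_TCP_ECN)
        type |= VIRTIO_NET_HDR_GSO_ECN;

    return type;
}
```

With gso_size = 1448 and gso_type = 0 (the LRO case described above), toy_to_vnet_gso_type() returns -1, which is the path that would reach BUG(). Either remedy raised in the thread, testing gso_type in skb_is_gso or having the driver set gso_type, would avoid that path.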
* Re: regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198? 2013-02-06 22:26 ` Michael S. Tsirkin @ 2013-02-06 23:29 ` Ben Hutchings 0 siblings, 0 replies; 16+ messages in thread From: Ben Hutchings @ 2013-02-06 23:29 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Eric Dumazet, alexander.h.duyck, stephen.s.ko, jeffrey.t.kirsher, David Miller, netdev, sony.chacko, mchan, jitendra.kalsaria, eilong On Thu, 2013-02-07 at 00:26 +0200, Michael S. Tsirkin wrote: > On Wed, Feb 06, 2013 at 09:55:47PM +0000, Ben Hutchings wrote: > > On Wed, 2013-02-06 at 23:45 +0200, Michael S. Tsirkin wrote: > > > On Wed, Feb 06, 2013 at 07:58:21PM +0000, Ben Hutchings wrote: > > > > On Wed, 2013-02-06 at 17:50 +0200, Michael S. Tsirkin wrote: > > > > > On Wed, Feb 06, 2013 at 05:07:39AM -0800, Eric Dumazet wrote: > > > > > > On Wed, 2013-02-06 at 13:43 +0200, Michael S. Tsirkin wrote: > > > > > > > It seems that starting with kernel 3.3 ixgbe sets gso_size for > > > > > > > incoming frames. It seems that this might result in gso_size > > > > > > > being set even when gso_type is 0. > > > > > > > This in turn leads to a crash at macvtap_skb_to_vnet_hdr > > > > > > > drivers/net/macvtap.c:628 > > > > > > > which has this code: > > > > > > > > > > > > > > if (skb_is_gso(skb)) { > > > > > > > struct skb_shared_info *sinfo = skb_shinfo(skb); > > > > > > > > > > > > > > /* This is a hint as to how much should be linear. 
*/ > > > > > > > vnet_hdr->hdr_len = skb_headlen(skb); > > > > > > > vnet_hdr->gso_size = sinfo->gso_size; > > > > > > > if (sinfo->gso_type & SKB_GSO_TCPV4) > > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4; > > > > > > > else if (sinfo->gso_type & SKB_GSO_TCPV6) > > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6; > > > > > > > else if (sinfo->gso_type & SKB_GSO_UDP) > > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP; > > > > > > > else > > > > > > > BUG(); > > > > > > > if (sinfo->gso_type & SKB_GSO_TCP_ECN) > > > > > > > vnet_hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN; > > > > > > > } else > > > > > > > vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE; > > > > > > > > > > > > > > > > > > > > > Since skb_is_gso tests gso_size. > > > > > > > > > > > > > > What's the right way to handle this? Should skb_is_gso be > > > > > > > changed to test gso_type != 0? > > > > > > > > > > > > > > > > > > > Or fix ixgbe to set gso_type in ixgbe_get_headlen(), as it does all the > > > > > > dissection. > > > > > > > > > > > > > > > Hmm, ixgbe_get_headlen isn't run on linear skbs though. > > > > > > > > > > Also, I'm not sure I understand when should drivers set gso size > > > > > for incoming messages and what is a reasonable value. > > > > > Commit log talks about improved performance for lossy connections, > > > > > in this case, isn't this something net core should set? > > > > [...] > > > > > > > > It should be set to the segment size on the wire, so TCP gets a correct > > > > picture of packet loss. The networking core has no idea what hardware/ > > > > firmware LRO did. > > > > > > > > I've previously raised this issue of macvlan vs LRO (which is the same > > > > issue we previously had with IP forwarding and with bridging): > > > > http://thread.gmane.org/gmane.linux.network/221695 > > > > > > > > Ben. > > > > > > I see, you proposed disabling LRO the moment a macvlan is attached. 
> > > If I understand correctly, the difference as compared to bridge is that > > > bridge normally consumes all incoming packets, macvlan is often used in > > > parallel with the underlying interface. > > > > Not so different from IP forwarding, though, in that a single interface > > can both forward packets and deliver them locally. > > Hmm, for IP forwarding we don't try to disable LRO on the device, do we? > What's the solution there? Yes we do (or we did). Ben. > > > BTW are there other issues with forwarding/bridging and LRO? If everyone > > > sets gso_type in packets it seems we can leave LRO set even with > > > bridging? > > > > Some implementations of LRO violate the end-to-end principle by merging > > segments with varying lengths and TCP timestamps. GRO is very careful > > to ensure that the original packets can be recovered from the skbs it > > produces. > > > > Ben. > > > > -- > > Ben Hutchings, Staff Engineer, Solarflare > > Not speaking for my employer; that's the marketing department's job. > > They asked us to note that Solarflare product names are trademarked. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked.
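The point quoted above, that GRO only produces skbs from which the original packets can be recovered, can be illustrated with a toy merge rule. This is a simplified model under stated assumptions, not the kernel's GRO code: struct toy_seg, struct toy_super, toy_start(), toy_try_merge() and toy_segment_len() are hypothetical names, and the rule only checks contiguous sequence numbers, equal TCP timestamps, and equal segment lengths except for a shorter final segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* One received TCP segment (toy model: payload length and timestamp only). */
struct toy_seg {
    uint32_t seq;
    uint16_t len;
    uint32_t tsval;
};

/* Aggregated "super-packet" built from merged segments. */
struct toy_super {
    uint32_t seq;       /* sequence number of the first segment */
    uint32_t tsval;     /* timestamp shared by all merged segments */
    uint16_t gso_size;  /* on-the-wire length of every segment but the last */
    uint16_t nsegs;     /* number of merged segments */
    uint32_t total_len; /* sum of all payload lengths */
    bool closed;        /* set once a short final segment was appended */
};

static void toy_start(struct toy_super *p, const struct toy_seg *s)
{
    p->seq = s->seq;
    p->tsval = s->tsval;
    p->gso_size = s->len;
    p->nsegs = 1;
    p->total_len = s->len;
    p->closed = false;
}

/* Merge only when the original segment boundaries stay recoverable. */
static bool toy_try_merge(struct toy_super *p, const struct toy_seg *s)
{
    if (p->closed)
        return false;                       /* nothing may follow a short segment */
    if (s->seq != p->seq + p->total_len)
        return false;                       /* not contiguous */
    if (s->tsval != p->tsval)
        return false;                       /* differing TCP timestamps */
    if (s->len == 0 || s->len > p->gso_size)
        return false;                       /* empty or longer than the others */

    p->total_len += s->len;
    p->nsegs++;
    if (s->len < p->gso_size)
        p->closed = true;                   /* a short segment must be the last */
    return true;
}

/* Recover the payload length of the i-th original segment. */
static uint32_t toy_segment_len(const struct toy_super *p, unsigned int i)
{
    if (i >= p->nsegs)
        return 0;
    if (i + 1 < p->nsegs)
        return p->gso_size;
    return p->total_len - (uint32_t)(p->nsegs - 1) * p->gso_size;
}
```

Because every merged segment except the last has length gso_size and all share one timestamp, the original boundaries follow from gso_size, nsegs and total_len alone. An LRO implementation that merges segments of varying lengths or timestamps, as described above, loses exactly that information.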
Thread overview: 16+ messages (end of thread, newest message ~2013-02-07 3:04 UTC)
2013-02-06 11:43 regression caused by 1d2024f61ec14bdb0c57a97a3fe73685abc2d198? Michael S. Tsirkin
2013-02-06 13:07 ` Eric Dumazet
2013-02-06 15:50 ` Michael S. Tsirkin
2013-02-06 16:15 ` Eric Dumazet
2013-02-06 17:42 ` Michael S. Tsirkin
2013-02-06 17:42 ` Eric Dumazet
2013-02-06 17:50 ` Michael S. Tsirkin
2013-02-06 18:05 ` Michael S. Tsirkin
2013-02-06 18:09 ` Eric Dumazet
2013-02-07 3:03 ` Cong Wang
2013-02-06 16:23 ` Eric Dumazet
2013-02-06 19:58 ` Ben Hutchings
2013-02-06 21:45 ` Michael S. Tsirkin
2013-02-06 21:55 ` Ben Hutchings
2013-02-06 22:26 ` Michael S. Tsirkin
2013-02-06 23:29 ` Ben Hutchings