* Definition and usage of NETIF_F_HW_SUM? @ 2007-05-29 20:58 Stephen Hemminger 2007-05-29 21:36 ` Herbert Xu 0 siblings, 1 reply; 10+ messages in thread From: Stephen Hemminger @ 2007-05-29 20:58 UTC (permalink / raw) To: Herbert Xu; +Cc: netdev The flag NETIF_F_HW_SUM is being misused. The definition says that the device is capable of checksumming any packet. When in fact from usage it seems to imply that the device is capable of doing IPV6 as well as IPV4. Some devices like 8139too do "fake checksum offloading" because they always have to copy the packet. Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they copy. This is bogus, they should just let higher layer do checksum/copy. Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. The definition of the flag says other protocols should work, but they probably send the hardware into a state of confusion. A few devices take a offset, starting point, and insertion point. This looks like the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum offload at present, so it seems moot. IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. -- Stephen Hemminger <shemminger@linux-foundation.org> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Definition and usage of NETIF_F_HW_SUM? 2007-05-29 20:58 Definition and usage of NETIF_F_HW_SUM? Stephen Hemminger @ 2007-05-29 21:36 ` Herbert Xu 2007-05-29 21:58 ` Stephen Hemminger ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Herbert Xu @ 2007-05-29 21:36 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev, Michael Chan On Tue, May 29, 2007 at 01:58:13PM -0700, Stephen Hemminger wrote: > The flag NETIF_F_HW_SUM is being misused. The definition says that the device > is capable of checksumming any packet. When in fact from usage it seems to > imply that the device is capable of doing IPV6 as well as IPV4. That would be a problem. > Some devices like 8139too do "fake checksum offloading" because they always have to copy > the packet. > > Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they > copy. This is bogus, they should just let higher layer do checksum/copy. Actually this is OK because if they have to copy it then it's cheaper to checksum it there. Both of these should be able to support all protocols. > Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. > The definition of the flag says other protocols should work, but they probably send the > hardware into a state of confusion. I just checked e1000 and it's correct as it does use the csum_offset when doing TX offload. However, you're definitely right that bnx2 seems to be broken. > A few devices take a offset, starting point, and insertion point. This looks like > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum > offload at present, so it seems moot. I could easily whip up a patch to get GRE to use it for a start :) > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. We should definitely keep NETIF_F_HW_SUM for sane hardware such as the e1000. Unfortunately we may just have to invent IPV6_SUM for the broken ones. Ccing Michael to see if the bnx2 chip can actually do offset-based checksum offload. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Definition and usage of NETIF_F_HW_SUM? 2007-05-29 21:36 ` Herbert Xu @ 2007-05-29 21:58 ` Stephen Hemminger 2007-05-30 0:10 ` Michael Chan 2007-05-30 15:53 ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger 2 siblings, 0 replies; 10+ messages in thread From: Stephen Hemminger @ 2007-05-29 21:58 UTC (permalink / raw) To: Herbert Xu; +Cc: netdev, Michael Chan On Wed, 30 May 2007 07:36:18 +1000 Herbert Xu <herbert.xu@redhat.com> wrote: > On Tue, May 29, 2007 at 01:58:13PM -0700, Stephen Hemminger wrote: > > The flag NETIF_F_HW_SUM is being misused. The definition says that the device > > is capable of checksumming any packet. When in fact from usage it seems to > > imply that the device is capable of doing IPV6 as well as IPV4. > > That would be a problem. > > > Some devices like 8139too do "fake checksum offloading" because they always have to copy > > the packet. > > > > Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they > > copy. This is bogus, they should just let higher layer do checksum/copy. > > Actually this is OK because if they have to copy it then it's cheaper to > checksum it there. Both of these should be able to support all protocols. > > > Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. > > The definition of the flag says other protocols should work, but they probably send the > > hardware into a state of confusion. > > I just checked e1000 and it's correct as it does use the csum_offset > when doing TX offload. However, you're definitely right that bnx2 > seems to be broken. > > > A few devices take a offset, starting point, and insertion point. This looks like > > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum > > offload at present, so it seems moot. > > I could easily whip up a patch to get GRE to use it for a start :) > > > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag > > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. > > We should definitely keep NETIF_F_HW_SUM for sane hardware such as the > e1000. Unfortunately we may just have to invent IPV6_SUM for the broken > ones. > The Marvell 88e8071 does IPV4 and IPV6 checksum, earlier chips could do arbitrary checksum. Looks like when they added the TSO6 logic, they made transmit state machine more protocol dependent. -- Stephen Hemminger <shemminger@linux-foundation.org> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Definition and usage of NETIF_F_HW_SUM? 2007-05-29 21:36 ` Herbert Xu 2007-05-29 21:58 ` Stephen Hemminger @ 2007-05-30 0:10 ` Michael Chan 2007-05-29 23:45 ` Stephen Hemminger 2007-05-30 15:53 ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger 2 siblings, 1 reply; 10+ messages in thread From: Michael Chan @ 2007-05-30 0:10 UTC (permalink / raw) To: Herbert Xu; +Cc: Stephen Hemminger, netdev On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote: > > I just checked e1000 and it's correct as it does use the csum_offset > when doing TX offload. However, you're definitely right that bnx2 > seems to be broken. > > > A few devices take a offset, starting point, and insertion point. This looks like > > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum > > offload at present, so it seems moot. > > I could easily whip up a patch to get GRE to use it for a start :) > > > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag > > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. > > We should definitely keep NETIF_F_HW_SUM for sane hardware such as the > e1000. Unfortunately we may just have to invent IPV6_SUM for the broken > ones. > > Ccing Michael to see if the bnx2 chip can actually do offset-based > checksum offload. > bnx2 and tg3 cannot do offset-based checksumming because the hardware doesn't have room in the buffer descriptors to specify the offsets. So regrettably, the NETIF_F_HW_SUM flag has been misused in these drivers. A new NETIF_F_IPV6_SUM flag will be very useful for us. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Definition and usage of NETIF_F_HW_SUM? 2007-05-30 0:10 ` Michael Chan @ 2007-05-29 23:45 ` Stephen Hemminger 2007-06-04 15:35 ` Ron Mercer 0 siblings, 1 reply; 10+ messages in thread From: Stephen Hemminger @ 2007-05-29 23:45 UTC (permalink / raw) To: Michael Chan; +Cc: Herbert Xu, netdev On Tue, 29 May 2007 17:10:52 -0700 "Michael Chan" <mchan@broadcom.com> wrote: > On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote: > > > > > I just checked e1000 and it's correct as it does use the csum_offset > > when doing TX offload. However, you're definitely right that bnx2 > > seems to be broken. > > > > > A few devices take a offset, starting point, and insertion point. This looks like > > > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum > > > offload at present, so it seems moot. > > > > I could easily whip up a patch to get GRE to use it for a start :) > > > > > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag > > > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. > > > > We should definitely keep NETIF_F_HW_SUM for sane hardware such as the > > e1000. Unfortunately we may just have to invent IPV6_SUM for the broken > > ones. > > > > Ccing Michael to see if the bnx2 chip can actually do offset-based > > checksum offload. > > > > bnx2 and tg3 cannot do offset-based checksumming because the hardware > doesn't have room in the buffer descriptors to specify the offsets. So > regrettably, the NETIF_F_HW_SUM flag has been misused in these drivers. > A new NETIF_F_IPV6_SUM flag will be very useful for us. Look furthur many drivers are just plain broken and use F_HW_SUM and can't even do IPV6 properly. I'll fix The worst code award goes to: qla3xxx.c which is broken on IPV6 and goes to trouble of computing all the offsets and they are already there in skb... -- Stephen Hemminger <shemminger@linux-foundation.org> ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Definition and usage of NETIF_F_HW_SUM? 2007-05-29 23:45 ` Stephen Hemminger @ 2007-06-04 15:35 ` Ron Mercer 0 siblings, 0 replies; 10+ messages in thread From: Ron Mercer @ 2007-06-04 15:35 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev I was out of town last week and did not have a chance to respond. Yes, qla3xxx is (before Stephen's fix) broken on IPV6. I will review the changes and post a patch if necessary. > -----Original Message----- > From: netdev-owner@vger.kernel.org > [mailto:netdev-owner@vger.kernel.org] On Behalf Of Stephen Hemminger > Sent: Tuesday, May 29, 2007 4:46 PM > To: Michael Chan > Cc: Herbert Xu; netdev > Subject: Re: Definition and usage of NETIF_F_HW_SUM? > > On Tue, 29 May 2007 17:10:52 -0700 > "Michael Chan" <mchan@broadcom.com> wrote: > > > On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote: > > > > > > > > I just checked e1000 and it's correct as it does use the > csum_offset > > > when doing TX offload. However, you're definitely right that bnx2 > > > seems to be broken. > > > > > > > A few devices take a offset, starting point, and > insertion point. This looks like > > > > the correct model. But no upper layer protocols other > than IPV4/IPV6 can do checksum > > > > offload at present, so it seems moot. > > > > > > I could easily whip up a patch to get GRE to use it for a start :) > > > > > > > IMHO the correct solution would be to get rid if > NETIF_F_HW_SUM and make a new flag > > > > NETIF_F_IPV6_SUM. Devices that can checksum both could > do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM. > > > > > > We should definitely keep NETIF_F_HW_SUM for sane > hardware such as the > > > e1000. Unfortunately we may just have to invent IPV6_SUM > for the broken > > > ones. > > > > > > Ccing Michael to see if the bnx2 chip can actually do offset-based > > > checksum offload. > > > > > > > bnx2 and tg3 cannot do offset-based checksumming because > the hardware > > doesn't have room in the buffer descriptors to specify the > offsets. So > > regrettably, the NETIF_F_HW_SUM flag has been misused in > these drivers. > > A new NETIF_F_IPV6_SUM flag will be very useful for us. > > Look furthur many drivers are just plain broken and use F_HW_SUM > and can't even do IPV6 properly. I'll fix > > The worst code award goes to: qla3xxx.c > which is broken on IPV6 and goes to trouble of computing all the > offsets and they are already there in skb... > > > > -- > Stephen Hemminger <shemminger@linux-foundation.org> > - > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC] IPV6 checksum offloading in network devices 2007-05-29 21:36 ` Herbert Xu 2007-05-29 21:58 ` Stephen Hemminger 2007-05-30 0:10 ` Michael Chan @ 2007-05-30 15:53 ` Stephen Hemminger 2007-05-30 16:13 ` Patrick McHardy 2 siblings, 1 reply; 10+ messages in thread From: Stephen Hemminger @ 2007-05-30 15:53 UTC (permalink / raw) To: Herbert Xu, David S. Miller; +Cc: netdev, Michael Chan Here is a proposed change to address hardware that can do IPV6 checksum offload, but doesn't truly do generic hw checksumming. The bnx2 and tg3 are like this for some revisions, and upcoming Marvell 88e8071 is similar. --- a/drivers/net/bnx2.c 2007-05-30 08:26:18.000000000 -0700 +++ b/drivers/net/bnx2.c 2007-05-30 08:26:32.000000000 -0700 @@ -6471,10 +6471,10 @@ bnx2_init_one(struct pci_dev *pdev, cons memcpy(dev->perm_addr, bp->mac_addr, 6); bp->name = board_info[ent->driver_data].name; + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; if (CHIP_NUM(bp) == CHIP_NUM_5709) - dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG; - else - dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + dev->features |= NETIF_F_IPV6_CSUM; + #ifdef BCM_VLAN dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX; #endif --- a/drivers/net/tg3.c 2007-05-30 08:26:18.000000000 -0700 +++ b/drivers/net/tg3.c 2007-05-30 08:26:32.000000000 -0700 @@ -11959,12 +11959,11 @@ static int __devinit tg3_init_one(struct * checksumming. */ if ((tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) == 0) { + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5755 || GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5787) - dev->features |= NETIF_F_HW_CSUM; - else - dev->features |= NETIF_F_IP_CSUM; - dev->features |= NETIF_F_SG; + dev->features |= NETIF_F_IPV6_CSUM; + tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; } else tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; --- a/include/linux/netdevice.h 2007-05-30 08:26:18.000000000 -0700 +++ b/include/linux/netdevice.h 2007-05-30 08:30:20.000000000 -0700 @@ -314,9 +314,10 @@ struct net_device /* Net device features */ unsigned long features; #define NETIF_F_SG 1 /* Scatter/gather IO. */ -#define NETIF_F_IP_CSUM 2 /* Can checksum only TCP/UDP over IPv4. */ +#define NETIF_F_IP_CSUM 2 /* Can checksum TCP/UDP over IPv4. */ #define NETIF_F_NO_CSUM 4 /* Does not require checksum. F.e. loopack. */ #define NETIF_F_HW_CSUM 8 /* Can checksum all the packets. */ +#define NETIF_F_IPV6_CSUM 16 /* Can checksum TCP/UDP over IPV6 */ #define NETIF_F_HIGHDMA 32 /* Can DMA to high memory. */ #define NETIF_F_FRAGLIST 64 /* Scatter/gather IO. */ #define NETIF_F_HW_VLAN_TX 128 /* Transmit VLAN hw acceleration */ @@ -339,7 +340,8 @@ struct net_device #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6) #define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM) -#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM) +#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM \ + | NETIF_F_GEN_CSUM) struct net_device *next_sched; --- a/net/bridge/br_if.c 2007-05-30 08:26:18.000000000 -0700 +++ b/net/bridge/br_if.c 2007-05-30 08:26:32.000000000 -0700 @@ -368,10 +368,18 @@ void br_features_recompute(struct net_br list_for_each_entry(p, &br->port_list, list) { unsigned long feature = p->dev->features; + /* if device needs checksumming, downgrade to hw checksumming */ if (checksum & NETIF_F_NO_CSUM && !(feature & NETIF_F_NO_CSUM)) checksum ^= NETIF_F_NO_CSUM | NETIF_F_HW_CSUM; + + /* if device can't do all checksum, downgrade to ipv4/ipv6 */ if (checksum & NETIF_F_HW_CSUM && !(feature & NETIF_F_HW_CSUM)) - checksum ^= NETIF_F_HW_CSUM | NETIF_F_IP_CSUM; + checksum ^= NETIF_F_HW_CSUM + | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + + if (checksum & NETIF_F_IPV6_CSUM && !(feature & NETIF_F_IPV6_CSUM)) + checksum &= ~NETIF_F_IPV6_CSUM; + if (!(feature & NETIF_F_IP_CSUM)) checksum = 0; --- a/net/core/dev.c 2007-05-30 08:26:18.000000000 -0700 +++ b/net/core/dev.c 2007-05-30 08:49:54.000000000 -0700 @@ -1509,9 +1509,11 @@ int dev_queue_xmit(struct sk_buff *skb) skb_set_transport_header(skb, skb->csum_start - skb_headroom(skb)); - if (!(dev->features & NETIF_F_GEN_CSUM) && - (!(dev->features & NETIF_F_IP_CSUM) || - skb->protocol != htons(ETH_P_IP))) + if (!(dev->features & NETIF_F_GEN_CSUM) + || ((dev->features & NETIF_F_IP_CSUM) + && skb->protocol == htons(ETH_P_IP)) + || ((dev->features & NETIF_F_IPV6_CSUM) + && skb->protocol == htons(ETH_P_IPV6))) if (skb_checksum_help(skb)) goto out_kfree_skb; } @@ -3105,6 +3107,22 @@ int register_netdevice(struct net_device } } + /* Fix illegal checksum combinations */ + if ((dev->features & NETIF_F_HW_CSUM) && + (dev->features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) { + printk(KERN_NOTICE "%s: mixed HW and IP checksum settings.\n", + dev->name); + dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM); + } + + if ((dev->features & NETIF_F_NO_CSUM) && + (dev->features & (NETIF_F_HW_CSUM|NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) { + printk(KERN_NOTICE "%s: mixed no checksumming and other settings.\n", + dev->name); + dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM|NETIF_F_HW_CSUM); + } + + /* Fix illegal SG+CSUM combinations. */ if ((dev->features & NETIF_F_SG) && !(dev->features & NETIF_F_ALL_CSUM)) { --- a/net/ipv6/ipv6_sockglue.c 2007-05-29 16:44:45.000000000 -0700 +++ b/net/ipv6/ipv6_sockglue.c 2007-05-30 08:48:35.000000000 -0700 @@ -123,7 +123,7 @@ static struct sk_buff *ipv6_gso_segment( struct ipv6hdr *ipv6h; struct inet6_protocol *ops; - if (!(features & NETIF_F_HW_CSUM)) + if (!(features & (NETIF_F_HW_CSUM|NETIF_F_IPV6_CSUM)) features &= ~NETIF_F_SG; if (unlikely(skb_shinfo(skb)->gso_type & ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] IPV6 checksum offloading in network devices 2007-05-30 15:53 ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger @ 2007-05-30 16:13 ` Patrick McHardy 2007-05-30 21:00 ` Stephen Hemminger 0 siblings, 1 reply; 10+ messages in thread From: Patrick McHardy @ 2007-05-30 16:13 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Herbert Xu, David S. Miller, netdev, Michael Chan Stephen Hemminger wrote: > Here is a proposed change to address hardware that can do IPV6 checksum > offload, but doesn't truly do generic hw checksumming. The bnx2 and tg3 > are like this for some revisions, and upcoming Marvell 88e8071 is similar. > > > --- a/include/linux/netdevice.h 2007-05-30 08:26:18.000000000 -0700 > +++ b/include/linux/netdevice.h 2007-05-30 08:30:20.000000000 -0700 > @@ -314,9 +314,10 @@ struct net_device > /* Net device features */ > unsigned long features; > #define NETIF_F_SG 1 /* Scatter/gather IO. */ > -#define NETIF_F_IP_CSUM 2 /* Can checksum only TCP/UDP over IPv4. */ > +#define NETIF_F_IP_CSUM 2 /* Can checksum TCP/UDP over IPv4. */ > #define NETIF_F_NO_CSUM 4 /* Does not require checksum. F.e. loopack. */ > #define NETIF_F_HW_CSUM 8 /* Can checksum all the packets. */ > +#define NETIF_F_IPV6_CSUM 16 /* Can checksum TCP/UDP over IPV6 */ > #define NETIF_F_HIGHDMA 32 /* Can DMA to high memory. */ > #define NETIF_F_FRAGLIST 64 /* Scatter/gather IO. */ > #define NETIF_F_HW_VLAN_TX 128 /* Transmit VLAN hw acceleration */ > @@ -339,7 +340,8 @@ struct net_device > #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6) > > #define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM) > -#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM) > +#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM \ > + | NETIF_F_GEN_CSUM) This might confuse some of the existing IPv4 code that checks for NETIF_F_ALL_CSUM if we ever get a device that has NETIF_F_IPV6_CSUM but not NETIF_F_IP_CSUM. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] IPV6 checksum offloading in network devices 2007-05-30 16:13 ` Patrick McHardy @ 2007-05-30 21:00 ` Stephen Hemminger 2007-06-27 7:44 ` David Miller 0 siblings, 1 reply; 10+ messages in thread From: Stephen Hemminger @ 2007-05-30 21:00 UTC (permalink / raw) To: Patrick McHardy; +Cc: Herbert Xu, David S. Miller, netdev, Michael Chan This is better. There is still a possiblity when a device allows IPV6 and not IPV4 checksumming, that the checksum will be done in the fixup code in dev_queue_xmit. The existing model for checksum offload does not correctly handle devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag implies device can do any arbitrary protocol. This patch: * adds NETIF_F_IPV6_CSUM for those devices * fixes bnx2 and tg3 devices that need it * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO) * fixes assumptions about NETIF_F_ALL_CSUM in nat * adjusts bridge union of checksumming computation --- drivers/net/bnx2.c | 6 +++--- drivers/net/tg3.c | 7 +++---- include/linux/netdevice.h | 8 ++++++-- net/bridge/br_if.c | 10 +++++++++- net/core/dev.c | 24 +++++++++++++++++++++--- net/ipv4/af_inet.c | 3 +++ net/ipv4/ip_output.c | 2 +- net/ipv4/netfilter/nf_nat_helper.c | 4 ++-- net/ipv6/ipv6_sockglue.c | 2 +- 9 files changed, 49 insertions(+), 17 deletions(-) --- a/drivers/net/bnx2.c 2007-05-30 08:26:18.000000000 -0700 +++ b/drivers/net/bnx2.c 2007-05-30 08:26:32.000000000 -0700 @@ -6471,10 +6471,10 @@ bnx2_init_one(struct pci_dev *pdev, cons memcpy(dev->perm_addr, bp->mac_addr, 6); bp->name = board_info[ent->driver_data].name; + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; if (CHIP_NUM(bp) == CHIP_NUM_5709) - dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG; - else - dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + dev->features |= NETIF_F_IPV6_CSUM; + #ifdef BCM_VLAN dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX; #endif --- a/drivers/net/tg3.c 2007-05-30 08:26:18.000000000 -0700 +++ b/drivers/net/tg3.c 2007-05-30 08:26:32.000000000 -0700 @@ -11959,12 +11959,11 @@ static int __devinit tg3_init_one(struct * checksumming. */ if ((tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) == 0) { + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5755 || GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5787) - dev->features |= NETIF_F_HW_CSUM; - else - dev->features |= NETIF_F_IP_CSUM; - dev->features |= NETIF_F_SG; + dev->features |= NETIF_F_IPV6_CSUM; + tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; } else tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; --- a/include/linux/netdevice.h 2007-05-30 08:26:18.000000000 -0700 +++ b/include/linux/netdevice.h 2007-05-30 10:10:57.000000000 -0700 @@ -314,9 +314,10 @@ struct net_device /* Net device features */ unsigned long features; #define NETIF_F_SG 1 /* Scatter/gather IO. */ -#define NETIF_F_IP_CSUM 2 /* Can checksum only TCP/UDP over IPv4. */ +#define NETIF_F_IP_CSUM 2 /* Can checksum TCP/UDP over IPv4. */ #define NETIF_F_NO_CSUM 4 /* Does not require checksum. F.e. loopack. */ #define NETIF_F_HW_CSUM 8 /* Can checksum all the packets. */ +#define NETIF_F_IPV6_CSUM 16 /* Can checksum TCP/UDP over IPV6 */ #define NETIF_F_HIGHDMA 32 /* Can DMA to high memory. */ #define NETIF_F_FRAGLIST 64 /* Scatter/gather IO. */ #define NETIF_F_HW_VLAN_TX 128 /* Transmit VLAN hw acceleration */ @@ -338,8 +339,11 @@ struct net_device /* List of features with software fallbacks. */ #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6) + #define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM) -#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM) +#define NETIF_F_V4_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM) +#define NETIF_F_V6_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IPV6_CSUM) +#define NETIF_F_ALL_CSUM (NETIF_F_V4_CSUM | NETIF_F_V6_CSUM) struct net_device *next_sched; --- a/net/bridge/br_if.c 2007-05-30 08:26:18.000000000 -0700 +++ b/net/bridge/br_if.c 2007-05-30 10:07:22.000000000 -0700 @@ -368,10 +368,18 @@ void br_features_recompute(struct net_br list_for_each_entry(p, &br->port_list, list) { unsigned long feature = p->dev->features; + /* if device needs checksumming, downgrade to hw checksumming */ if (checksum & NETIF_F_NO_CSUM && !(feature & NETIF_F_NO_CSUM)) checksum ^= NETIF_F_NO_CSUM | NETIF_F_HW_CSUM; + + /* if device can't do all checksum, downgrade to ipv4/ipv6 */ if (checksum & NETIF_F_HW_CSUM && !(feature & NETIF_F_HW_CSUM)) - checksum ^= NETIF_F_HW_CSUM | NETIF_F_IP_CSUM; + checksum ^= NETIF_F_HW_CSUM + | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + + if (checksum & NETIF_F_IPV6_CSUM && !(feature & NETIF_F_IPV6_CSUM)) + checksum &= ~NETIF_F_IPV6_CSUM; + if (!(feature & NETIF_F_IP_CSUM)) checksum = 0; --- a/net/core/dev.c 2007-05-30 08:26:18.000000000 -0700 +++ b/net/core/dev.c 2007-05-30 10:07:31.000000000 -0700 @@ -1509,9 +1509,11 @@ int dev_queue_xmit(struct sk_buff *skb) skb_set_transport_header(skb, skb->csum_start - skb_headroom(skb)); - if (!(dev->features & NETIF_F_GEN_CSUM) && - (!(dev->features & NETIF_F_IP_CSUM) || - skb->protocol != htons(ETH_P_IP))) + if (!(dev->features & NETIF_F_GEN_CSUM) + || ((dev->features & NETIF_F_IP_CSUM) + && skb->protocol == htons(ETH_P_IP)) + || ((dev->features & NETIF_F_IPV6_CSUM) + && skb->protocol == htons(ETH_P_IPV6))) if (skb_checksum_help(skb)) goto out_kfree_skb; } @@ -3105,6 +3107,22 @@ int register_netdevice(struct net_device } } + /* Fix illegal checksum combinations */ + if ((dev->features & NETIF_F_HW_CSUM) && + (dev->features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) { + printk(KERN_NOTICE "%s: mixed HW and IP checksum settings.\n", + dev->name); + dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM); + } + + if ((dev->features & NETIF_F_NO_CSUM) && + (dev->features & (NETIF_F_HW_CSUM|NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) { + printk(KERN_NOTICE "%s: mixed no checksumming and other settings.\n", + dev->name); + dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM|NETIF_F_HW_CSUM); + } + + /* Fix illegal SG+CSUM combinations. */ if ((dev->features & NETIF_F_SG) && !(dev->features & NETIF_F_ALL_CSUM)) { --- a/net/ipv6/ipv6_sockglue.c 2007-05-29 16:44:45.000000000 -0700 +++ b/net/ipv6/ipv6_sockglue.c 2007-05-30 09:54:52.000000000 -0700 @@ -123,7 +123,7 @@ static struct sk_buff *ipv6_gso_segment( struct ipv6hdr *ipv6h; struct inet6_protocol *ops; - if (!(features & NETIF_F_HW_CSUM)) + if (!(features & NETIF_F_V6_CSUM)) features &= ~NETIF_F_SG; if (unlikely(skb_shinfo(skb)->gso_type & --- a/net/ipv4/af_inet.c 2007-05-29 16:44:45.000000000 -0700 +++ b/net/ipv4/af_inet.c 2007-05-30 09:55:11.000000000 -0700 @@ -1170,6 +1170,9 @@ static struct sk_buff *inet_gso_segment( int ihl; int id; + if (!(features & NETIF_F_V4_CSUM)) + features &= ~NETIF_F_SG; + if (unlikely(skb_shinfo(skb)->gso_type & ~(SKB_GSO_TCPV4 | SKB_GSO_UDP | --- a/net/ipv4/ip_output.c 2007-05-29 16:44:45.000000000 -0700 +++ b/net/ipv4/ip_output.c 2007-05-30 09:51:33.000000000 -0700 @@ -837,7 +837,7 @@ int ip_append_data(struct sock *sk, */ if (transhdrlen && length + fragheaderlen <= mtu && - rt->u.dst.dev->features & NETIF_F_ALL_CSUM && + rt->u.dst.dev->features & NETIF_F_V4_CSUM && !exthdrlen) csummode = CHECKSUM_PARTIAL; --- a/net/ipv4/netfilter/nf_nat_helper.c 2007-05-29 16:44:45.000000000 -0700 +++ b/net/ipv4/netfilter/nf_nat_helper.c 2007-05-30 10:01:33.000000000 -0700 @@ -178,7 +178,7 @@ nf_nat_mangle_tcp_packet(struct sk_buff datalen = (*pskb)->len - iph->ihl*4; if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) { if (!(rt->rt_flags & RTCF_LOCAL) && - (*pskb)->dev->features & NETIF_F_ALL_CSUM) { + (*pskb)->dev->features & NETIF_F_V4_CSUM) { (*pskb)->ip_summed = CHECKSUM_PARTIAL; (*pskb)->csum_start = skb_headroom(*pskb) + skb_network_offset(*pskb) + @@ -265,7 +265,7 @@ nf_nat_mangle_udp_packet(struct sk_buff if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) { if (!(rt->rt_flags & RTCF_LOCAL) && - (*pskb)->dev->features & NETIF_F_ALL_CSUM) { + (*pskb)->dev->features & NETIF_F_V4_CSUM) { (*pskb)->ip_summed = CHECKSUM_PARTIAL; (*pskb)->csum_start = skb_headroom(*pskb) + skb_network_offset(*pskb) + ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] IPV6 checksum offloading in network devices 2007-05-30 21:00 ` Stephen Hemminger @ 2007-06-27 7:44 ` David Miller 0 siblings, 0 replies; 10+ messages in thread From: David Miller @ 2007-06-27 7:44 UTC (permalink / raw) To: shemminger; +Cc: kaber, herbert.xu, netdev, mchan From: Stephen Hemminger <shemminger@linux-foundation.org> Date: Wed, 30 May 2007 14:00:34 -0700 > This is better. > There is still a possiblity when a device allows IPV6 and not IPV4 > checksumming, that the checksum will be done in the fixup code in > dev_queue_xmit. > > The existing model for checksum offload does not correctly handle > devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag > implies device can do any arbitrary protocol. > > This patch: > * adds NETIF_F_IPV6_CSUM for those devices > * fixes bnx2 and tg3 devices that need it > * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO) > * fixes assumptions about NETIF_F_ALL_CSUM in nat > * adjusts bridge union of checksumming computation I've applied this to net-2.6.23, we can back it out or rework it if there are problems. Several of these checksum feature tests are getting out of hand :-) ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2007-06-27 7:44 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-05-29 20:58 Definition and usage of NETIF_F_HW_SUM? Stephen Hemminger 2007-05-29 21:36 ` Herbert Xu 2007-05-29 21:58 ` Stephen Hemminger 2007-05-30 0:10 ` Michael Chan 2007-05-29 23:45 ` Stephen Hemminger 2007-06-04 15:35 ` Ron Mercer 2007-05-30 15:53 ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger 2007-05-30 16:13 ` Patrick McHardy 2007-05-30 21:00 ` Stephen Hemminger 2007-06-27 7:44 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).